GPA: 3.88/4.00
Relevant Coursework: Machine Learning, Data Mining, Analyzing Big Data I/II, Python and Applications to Business Analytics, Application of System Dynamics, Information Visualization, Applied Econometrics with R, Healthcare Data Analytics, Digital Marketing, Big Data Mining Technology
GPA: 3.9/4.0, First-Class Scholarship of SYSU (Top 5%)
Relevant Coursework: Advanced Mathematics, Linear Algebra, Probability and Statistics, Computational Methods, Statistical Analysis and Forecasting, Big Data Analysis and Its Application in Information Retrieval
• Created cyber incident database: Ran the ETL process, wrote Python code to scrape cyber incident data from multiple sources using the bs4 package, used a financial API to retrieve company information, and designed a SQLite database for data storage
• Cleaned up the cyber security incident database: Consolidated over 2,000 True/False columns from VCDB cyber incident records into 30 variables using Pandas
• Matched cyber incident information with company financial information: Calculated the similarity between company names in the cyber incident database and the financial information database using an NLP algorithm. Productionized the NLP model in a pipeline and used it to match over 50k records
• Built cyber catastrophe model: Supported the model-building process using machine learning and statistical methods to quantify the potential financial damage to an insured and across the (re)insurer's portfolio
• Delivered accurately targeted ads for clients’ digital marketing business: Built a random forest model to predict consumer behavior based on advertisement exposure times, past purchase behavior, gender, etc., achieving model accuracy of 83.7%
• Supported the digital transformation process for well-known FMCG companies (P&G and Unilever): Cleansed data on online delivery products in the FMCG industry using SQL, R, and Excel, and built a clustering model based on unit price, sales repurchase rate, etc. to group the products for product analysis
• Built an automated business intelligence process: Used Python to scrape 3000+ product pages of competing brands from JD and Taobao, loaded the data into a SQL database, and designed a visualization dashboard in Tableau
• Defined key e-commerce metrics, extracted relevant structured data from the database with SQL, and created tables for data visualization to support the business department's digital e-commerce operations
• Built time series models to predict sales trends and provided suggestions to the production and inventory departments. During my 5-month internship, the warehouses I was responsible for achieved 0 alerts (alerts are generated when inventory is too high or too low)
• Quantified consumer preferences using user data, customized creative marketing campaigns based on target personas, and tracked performance metrics such as reading volume and discussion volume for key clients in the automotive industry, including BMW, Mercedes-Benz, Toyota, and Honda
SQL (Basics, Joins, Aggregate Functions, Subquery, Window Functions)
Python (NumPy, Pandas, scikit-learn, Matplotlib, TensorFlow, BeautifulSoup)
R (tidyverse, ggplot2, dplyr, caret)
Machine Learning (Random Forests, NN, CNN, Logistic Regression, Lasso, PCA, K-Means Clustering)
Tableau, Hive, Excel, SPSS, SAS, Power BI, Google Analytics, MapReduce