About

I’m a master’s student at Cornell University, majoring in Information Science with a focus on Data Science.

Before Cornell, I worked for two years as a data analyst for Prof. Lau Kin-nam at the Chinese University of Hong Kong. We partnered with IBM on the Big Data Plan A100, through which I led over 10 data projects delivering data solutions to Fortune 500 companies.

I’m passionate about applying data analytics and machine learning to marketing and growth analytics, where they can solve real business problems. After two years of work experience, I truly believe machine learning and data analytics can improve both user growth and marketing efficiency. If you work in marketing or user growth analytics, don’t hesitate to reach out about opportunities to cooperate.

Email me if you’re looking to collaborate on projects. I’m also looking for full-time opportunities in data science and business intelligence.


Work Experience

  • Data Scientist (Advertising) — Sagadigits (Apr 2017 - Jul 2017)
    • Developed a data-driven advertising product in Python that clustered 200,000 Facebook posts with an LDA model and obtained 60 interest groups for each user
    • Used market basket analysis to select the target audiences most strongly associated with the advertiser’s campaign topic
    • Achievement: Click-through rate +5%, cost per click -20%, cost per fan -30% compared with the traditional method of selecting audiences based on marketing experience
  • Data Analyst (Marketing Science) — Chinese University of Hong Kong (Dec 2014 - Sep 2016)
    • Roles: Delivered data solutions for 10+ Fortune 500 companies, analyzed data to find business opportunities, built machine learning models in SPSS/Python/R, and presented actionable insights to drive business impact
    • Product Recommendations for P&G’s E-commerce Store
      1. Used Hadoop to run an ETL job on 1 TB of browsing data to generate behavior features for 500,000 members
      2. Used R to build a market basket analysis to recommend products at the right time through the right channel
      3. Achievement: Activated 46% of inactive customers; converted 10% of customers from trying samples to purchasing
    • Targeted Coupon Prediction Model for Hyatt Hotel
      1. Used SQL Server to retrieve data from a previous coupon campaign; cleaned missing values in customer profiles
      2. Built a logistic regression model in R to estimate each customer’s probability of using a coupon; generated the list of customers to send coupons to
      3. Achievement: Hit rate +40%; presented actionable insights to the management of Hyatt Hotel
  • Data Analyst Intern — EMC (Jun 2014 - Aug 2014)
    • Sales Analytics: Analyzed three months of sales and customer data and designed 17 charts to understand the business process
    • Business Intelligence: Developed a BI platform using d3.js to report the performance of pre-sales, sales and post-sales, enabling real-time data visualization

Published Book

  • Web Crawling with Python, published by China Machine Press in 2017
    • Over 10,000 copies sold across the Alibaba/Amazon/JD platforms
    • Held in the collections of over 10 university libraries
    • Wrote the book after two years of teaching myself Python and web crawling while working
  • Machine Learning on Green Card and H1B Data (Working Article) with Lutz Finger (former Data Science director at Snap Inc & LinkedIn), to be published on Forbes
    • In the course Design Data Product, our team was picked by Lutz Finger to write a Forbes article about our project; only 3 out of 20 teams got this opportunity.

Teaching

Contact me

st883@cornell.edu