Chris Dong

Data Engineer

About Me

I am passionate about building and optimizing data pipelines and developing tools to automate monotonous tasks.

I am currently on the data engineering team at Meta. Previously, I built scalable data pipelines within the jobs ecosystem as a Data Engineer at LinkedIn, and worked at startups such as Komodo Health and Nextdoor.

I received my Master of Science in Data Science (MSDS) from the University of San Francisco, where I developed a strong programming skill set for tackling business problems involving big data. I also received my bachelor's degree in Statistics from UCLA.

I grew up in the Bay Area, and I often play table tennis and go hiking in my spare time.

Work Experience

Senior Data Engineer
Meta (June 2021 - Present)

  • Refactor hundreds of data pipelines programmatically using Python and shell scripting by building an end-to-end solution that is estimated to save 500+ engineering hours as part of a company-wide initiative.
  • Implement logging, brainstorm success metrics with cross-functional (XFN) partners, build data pipelines, and create data visualizations for multiple zero-to-one (0-to-1) products.

Co-founder, Head of Data
Peanut Travel (June 2021 - Present)

  • Built a database from scratch to provide hotel quality insights for over 200K hotels globally.
  • Powered data-driven weather insights for over 10K airports globally.
  • Provided real-time analytics on flight delay statistics for non-stop flights and recommendations on choosing multi-stop flights.
  • Gathered neighborhood safety data points (crime rates, theft, women's safety, nearest hospitals / police stations, etc.) for a given hotel.
  • Deployed a metrics dashboard with near real-time updates on AWS that captures installs, engagement, attributions, etc.

Data Engineer - Data Science
LinkedIn (Nov 2019 - May 2021)

  • Created an end-to-end data pipeline to detect possible viral spam content and escalate it for manual review using Scala, SQL, and Azkaban.
  • Developed an ETL pipeline to monitor, visualize, and quantify the impact of COVID-19 spam content (fake news, promotion, etc.) and delivered key insights to the leadership team.
  • Streamlined the entire data flow on job seeker sessions to make data available 24 hours earlier, reduce the failure rate to 0%, and cut run-time in half using Spark SQL.
  • Revamped attribution of job sessions with a more robust, scalable methodology, reducing unknown job session origins by 23% and eliminating manual updates.
  • Collaborated with cross-functional teams (product, data infrastructure, and UX) to design, experiment with, and drive adoption of new features on the LinkedIn Jobs platform.

Data Science Coach
Interview Query (Oct 2020 - Present)

  • Conduct practice interviews and provide interview guidance and resume feedback.
  • Conduct in-depth training sessions on SQL fundamentals, product case studies, and core data engineering concepts.

Data Analytics Engineer
Komodo Health (Aug 2018 - Nov 2019)

  • Rebuilt the processing platform from scratch, improving data query performance by 2x through Spark optimization and a schema overhaul.
  • Scaled throughput of analytics requests by creating self-service tooling and automation to support 3x company growth as the team's second hire.
  • Streamlined workflow in Spark, Kubernetes, AWS, and Git through shell scripts.
  • Increased productivity and stability of internal tooling through unit testing and continuous integration using Python, Jenkins, and Docker.

Business Intelligence Engineer
Nextdoor (Oct 2017 - Jun 2018)

  • Prioritized potential application improvements by using natural language processing (NLP) techniques (bag-of-words, logistic regression, word embeddings, LDA) to classify free-form survey data and net promoter score responses.
  • Built and deployed an ETL pipeline (Python, Spark, Airflow) to automatically classify incoming survey data.
  • Provided insight into flagged and reported posts on the Nextdoor platform by combining multiple data sources using SQL (Redshift) and visualizing results using Tableau.
  • Collaborated with engineering team to facilitate querying, identify missing data, and improve data stability.

Data Analyst
Gymflash (Sept 2014 - Jan 2015)

  • Identified primary factors leading to user churn using SQL and R.
  • Evaluated the impact of gamification on user retention and fitness goal achievement across different user demographics.
  • Recommended target geographic regions to optimize marketing spend.

Data Scientist
Emerge Digital Group (June 2014 - Sept 2014)

  • Implemented data schema and warehouse to efficiently and compactly store data.
  • Engineered a digital influence algorithm to measure user engagement level.
  • Designed a user score bonus scheme and dynamic scoring method to mitigate the influence of outliers.

Other Projects

Feel free to click on a project title to view its source code on GitHub!

Machine Learning & Time Series

Distributed Energy Usage Forecasting

  • Chris Dong, Lingzhi Du, Feiran Ji, Zizhen Song, Yuedi Zheng, Paul Intrevado, Diane Woodbridge. "Forecasting Smart Meter Energy Usage using Distributed Systems and Machine Learning". The 16th IEEE International Conference on Smart City. 2018.
  • Python, Apache Spark, MongoDB, AWS S3, AWS EC2, AWS EMR

Forecasting Bankruptcy Rates in Canada

  • R Time Series Modeling (SARIMA, SARIMAX, VARX, Holt-Winters)

Housing Prices in Iowa

  • R (LASSO, Ridge, Elastic Net)

Data Processing & Amazon Elastic Compute Cloud

dvidr Web Application

  • Built most of the frontend and backend
  • Python, HTML, CSS, JavaScript, AWS EC2

YouTube Subtitles

  • Python, Bash, YouTube API

Analytics Ingestion System

  • Python, Flask, paramiko, Bash, crontab, screen

Twitter API Sentiment Analysis

  • Python, Flask, HTML

Miscellaneous

Interactive Visualization on BART Traffic

  • Python, R, HTML

Data Visualization

  • R (ggplot2), Python (seaborn, plotly), Tableau

Aggregation of BART Ridership Reports

  • Python, PostgreSQL

Article Recommendation System

  • Python, GloVe, Flask, HTML

Search Engine Implementation

  • Python

Image Processing

  • Python