Vincent Chiu
Data Engineer at WineDirect
-
English Native or bilingual proficiency
-
Cantonese Limited working proficiency
-
Mandarin Elementary proficiency
-
French Elementary proficiency
Bio
Experience
-
WineDirect
-
United States
-
Beverage Manufacturing
-
100 - 200 Employees
-
Data Engineer
-
Oct 2022 - Present
-
-
-
Veeva Systems
-
United States
-
Software Development
-
700 & Above Employees
-
Machine Learning Engineer
-
Jan 2022 - Oct 2022
● Designed, implemented, and deployed key components of a data lake and its associated data engineering pipeline, used for consumer-facing models, with Apache Airflow, Python and Docker.
● Implemented key components of an entity resolution system that identifies duplicates among millions of records using Java, Apache Solr and custom components.
● Created cloud infrastructure for an AI service using Amazon Web Services and Ansible.
● Built an exception handling system for critical components of an AI microservice using Spring Boot and Java.
-
-
-
RBC
-
Canada
-
Banking
-
700 & Above Employees
-
Machine Learning Engineer
-
Apr 2020 - Nov 2021
● Developed a machine learning system for analyzing data quality and finding outliers in market data using scikit-learn.
● Productionized, containerized and deployed a critical data reconciliation application using Apache Airflow and Docker.
-
-
Data Scientist
-
Jan 2020 - Apr 2020
● Generated over $500,000 in potential savings for RBC by creating a machine learning model that predicts early cash withdrawal from financial products with high accuracy, using Apache Spark and scikit-learn.
-
-
-
RBC
-
Canada
-
Banking
-
700 & Above Employees
-
Data Scientist, Amplify
-
May 2019 - Aug 2019
● Created a patent-pending market risk solution that placed in the top 3 out of over 20 teams in a 4-month-long tech demo and pitch competition. Used Apache Spark and scikit-learn to create machine learning models for modelling market risk.
-
-
-
Communications Research Centre
-
Canada
-
Research
-
1 - 100 Employees
-
Big Data and Machine Learning Researcher (Co-op)
-
May 2018 - Dec 2018
● Developed a tool for visualizing base station bandwidth usage, using Apache Spark for data preprocessing. Enabled the processing of 400x the amount of spectrum-related data. Utilized Azure SQL Data Warehouse for storage and Power BI for visualization. Enabled the Government of Canada to monitor and evaluate commercial mobile spectrum deployments and sell spectrum licenses (market size over 24.4 billion CAD).
● Created a convolutional neural network in TensorFlow that classified wireless communications signals by their modulation scheme, achieving over 86% accuracy. This classifier served as a prototype and proof of concept for a universal demodulator to greatly improve the decoding of signals with an unknown encoding scheme.
● Developed data transformation and cleaning software to enable researchers to use time-series algorithms on mission-critical spectrum data. Used Apache Spark, scikit-learn, pandas and matplotlib to impute missing values in gigabytes of time-series sensor data. Clustered using k-means and DBSCAN to isolate signals from various transmitters. Utilized linear regression to impute the missing values of a sensor from two other sensors. Provided a foundational data platform for better dynamic spectrum management.
-
-
-
EnerNOC
-
United States
-
IT Services and IT Consulting
-
100 - 200 Employees
-
Data Science Intern
-
Nov 2015 - Jun 2016
● Analyzed electricity and natural gas usage for building occupancy hours using a combination of linear regression, averaging, tree-based labeling, and primary business logic. Wrote programs in R and Python to produce insights for targeted business intelligence reports on energy efficiency.
● Created and deployed a web application that generates statistical reports for energy savings programs using the shiny package in R; consulted with other statisticians and data scientists to verify outcomes. Created a transformative tool that gives management and clients a high-level overview and breakdown of energy consumption by industry.
● Improved data integrity and accuracy by creating a web application that streamlined a manual data labeling process. Utilized the shiny package in R and a SQLite database backend. Increased the efficiency of the manual labeling process 15x, enabling the labeling of 1,000 examples per day.
-
-
Education
-
The University of British Columbia
Bachelor’s Degree, Physics with Minor in Mathematics
-
Simon Fraser University
Master of Science in Computer Science, Big Data