ZHIXIONG CHENG

Senior Data Scientist at 4Paradigm 第四范式
Contact Information
us****@****om
(386) 825-5501
Location
Beijing, China, CN
Languages
  • English
  • Chinese (Simplified)

Credentials

  • SQL for Data analyst
    Udemy

Experience

    • China
    • Software Development
    • 100 - 200 Employee
    • Senior Data Scientist
      • Jun 2021 - Present

      Responsibilities: Lead the data science team in user demand mining, research, and analysis; AI model design, development, validation, assessment, and deployment; wrote design proposals and led implementation.

      Project I: China Everbright Bank – Lending Risk Management (12 months)
      • Suspicious cash-out account detection model: Trained a machine learning model (LightGBM) on historical suspicious-account information, domain knowledge, and data analysis results, which increased the recall rate.
      • High-risk repayment account detection model: Applied isax (clustering algorithm) to detect high-risk activities and built a client-grade strategy.
      • Risk management upgrade proposal: Reviewed the current workflow and relevant IT systems in the risk management process, then wrote a proposal on data governance, modeling tools, and computing frameworks; organized resources and completed the implementation.

      Project II: Hunan Valin Steel – Steel-making Scheduling System Design
      • Addressed the problems of manual steel-making scheduling, such as low efficiency and a poor order fulfillment rate. Surveyed the current state, designed an RL (PPO) solution and environment simulator, then delivered a POC demonstration.

      Project III: ICBC Call Center – Representative Rostering Optimization
      • Manual job scheduling took a long time and often failed to reach the KPI. Collected scheduling rules and objectives, then created a heuristic optimization module.

      Project IV: Airplane Job Allocation POC
      • Collected rules and objectives for airplanes, tasks, and pilots, and built an integer programming model to optimize finish time.

      Project V: Dongxing Securities APP – Fund Recommendation POC
      • Implemented an ML model to predict the probability of fund investment in the app, based on app transaction data, user data, and fund information.

      Project VI: Bank of Communication – Money Laundering Detection
      • Created an ML model from transaction data and domain knowledge to detect money laundering activity on accounts on a daily basis.

    • China
    • Technology, Information and Internet
    • 1 - 100 Employee
    • Data Scientist
      • Jan 2021 - Jun 2021

      As a data scientist at Tantan Technology, I helped optimize the working logic of the A/B testing platform.
      Responsibilities: solution engineering, algorithm research, function development, validation, anomaly analysis.

      Module I: A/B Testing Platform KPI Optimization & A/A Grouping Acceleration
      • Background: Analyzed and fixed unusual KPI fluctuations in A/B testing; accelerated A/A pre-grouping computation to cope with the high volume of A/B tests run every day.
      • Method and results: Computed KPIs for two groups drawn from the same population under different hash-modulo, sampling, and statistical methods (non-parametric, DID, CUPED…); compared their false positive rates (FPRs) and adjusted or dropped the invalid ones. Applied simulated annealing to A/A grouping in place of random search, making it several hundred times faster.
      • Highlights: Optimized computational logic written in SQL and Python; applied multiprocessing to speed it up.

    • United States
    • Software Development
    • 1 - 100 Employee
    • Data Scientist
      • Jun 2019 - Nov 2020

      Project I: Manila International Container Terminal – Real-time Job Allocation
      Responsibilities: domain knowledge, data analysis, algorithm research; model development, validation, and deployment; presentation.
      Background: Developed a real-time planning module for the Yard Management System to solve the problems of manual planning, such as low work efficiency and throughput, high operating cost, high equipment idling time, and long response time.

      Module I: Rubber Tired Gantry (RTG) Total Working Time Prediction Model
      • Method and results: Provided accurate predictions as input parameters for the job allocation model. Analyzed the last ten years of RTG job and related data (big data) and created a GBM regressor incorporating domain information; MAE decreased markedly.
      • Highlights: Conjugate Bayesian approach for categorical variables; RNN to predict the driver's job sequence; developed in Scala/Spark and run as a microservice.

      Module II: Container Bay Sequencing Model
      • Method and results: Found the best "output" sequence for containers in a bay by formulating integer programming models solved with Google OR-Tools; reduced response time to within seconds and ran stably for months.
      • Highlights: Applied multiprocessing to speed it up; used MiniZinc to generalize the modeling language; output serves as input for the RTG model.

      Module III: Terminal Container Job Allocation Model
      • Participated in model design and development. Built with the heuristics framework OptaPlanner to improve performance. Rules and objectives came from terminal experts and data analysis; used Kafka topics to store data and connect all microservices; A/B testing showed the model reduced rehandles significantly.

      Project II: Virginia International Terminal – Container Overnight Housekeeping
      • Applied heuristic methods to containers' overnight preprocessing jobs to increase daytime work efficiency.

    • United States
    • Investment Banking
    • 1 - 100 Employee
    • Data Engineer
      • Apr 2018 - Dec 2018

      • Maintained and optimized the CPGO3 platform (FileMaker) and MySQL databases by improving indexes, rewriting queries and scripts, and changing relationships between tables, which reduced total response time.
      • Implemented feature selection techniques such as ExtraTreesClassifier and RFE on complex investment datasets (including tables of investors, deals, deal status, contacts, and the email system), and selected the top 12 features for the regression model, which reduced training time and decreased the risk of overfitting; evaluated and tuned the model using ROC curves and learning curves.
      • Applied logistic regression and ensemble methods to predict whether target investors were interested in selected deals and ranked them by probability, which saved bankers a large amount of time on their marketing campaigns.

    • Associate Research Assistant
      • Jul 2014 - Jun 2015

      1. Standardized data on salary, population, education, and employment for citizens in Fengtai;
      2. Calculated indexes such as mean value and growth rate in Microsoft Excel; visualized the data in R;
      3. Wrote statistical reports summarizing data on people's lives in Fengtai.

Education

  • Northeastern University
    Master of Science (MS), Applied Mathematics
    2016 - 2017
  • Beijing University of Technology
    Bachelor of Science (BS), Information and computing science
    2010 - 2014
