Nikhil Shukla

Data Science Engineer at DQLabs, Inc.
Contact Information
us****@****om
(386) 825-5501
Location
Bangalore Urban, Karnataka, India
Languages
  • English: Full professional proficiency
  • Hindi: Native or bilingual proficiency
  • Dutch: Limited working proficiency

Credentials

  • GitLab Certified Git Associate
    GitLab
    May, 2022
    - Nov, 2024
  • AWS Cloud Practitioner
    Amazon Web Services (AWS)
    Oct, 2021
    - Nov, 2024
  • Econometrics: Methods and Applications
    Coursera Course Certificates
    Nov, 2016
    - Nov, 2024
  • Certification in Risk Management
    Tata Consultancy Services
    Sep, 2016
    - Nov, 2024
  • Informatica Certified Designer
    Informatica
    Dec, 2015
    - Nov, 2024
  • Python Programming
    Coursera
    Apr, 2014
    - Nov, 2024
  • ChatGPT, AI Tool and API
    Udemy
    Sep, 2023
    - Nov, 2024
  • Introduction to Programming Using Python
    Python Software Foundation
    Jul, 2019
    - Nov, 2024
  • AWS Machine Learning
    Amazon Web Services (AWS)
  • TensorFlow Developer Certificate
    TensorFlow Certificate Program

Experience

    • United States
    • Software Development
    • 1 - 100 Employee
    • Data Science Engineer
      • Feb 2022 - Present

      - DQLabs is a product under Intellectyx Corporation focused on augmenting data quality
      - Involved in designing, architecting, and maintaining large enterprise solutions on the AI/ML platform, monitoring and providing transparency into data quality across systems
      - Designed a Root Cause Analysis architecture using combined boosting techniques (CatBoost, XGBoost, LightGBM) to surface the underlying features and rows driving the target labels (a sketch of the idea follows below)
      - Designed a proprietary adaptive hybrid algorithm for automated time-series anomaly detection
      - NLP for semantic discovery over large datasets, leveraging Hugging Face pre-trained models and developing an in-house NLP semantic architecture that combines BERT-family, GPT-family, and Transformer models
      - Deployed machine learning models (supervised, unsupervised, reinforcement) using MWAA and Django
      - Designed DQ-Copilot, a first-generation service bot integrated with LLM models to speed up the data quality process and save user time
      - Designed a Text-to-SQL converter using a Seq2Seq parsing model built on T5 and mT5
      - Tech stack:
        Big Data: Apache Spark, Databricks, PySpark
        Cloud Services: AWS (S3, Redshift, EC2, EMR, Glue, Athena, SageMaker, MWAA)
        Analytics: Apache Druid -> Apache Pinot, Snowflake, Snowpark
        Databases: SQL (PostgreSQL, MySQL, Denodo)
        Orchestration: Airflow (MWAA)
        Data Build Tool (Transformation): dbt
        ML/DL: TensorFlow, Keras, PyTorch, scikit-learn, Hugging Face, OpenAI
        ML Deployment: MLflow, Django, Jenkins
        Monitoring: Grafana
        Containers: Docker, EKS
      - Ad hoc: mentored Amrita University students on their research paper on NLP and time-series conformal prediction (probabilistic models); delivered multiple workshops on NLP/ML/DL across universities and online portals
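
      As a rough illustration of the boosting-based Root Cause Analysis described above (not the DQLabs implementation), the sketch below fits a LightGBM classifier on rows labelled by a data-quality check and reads the feature importances to see which columns are most associated with failures; the dataset, column names, and quality rule are invented.

```python
# Minimal sketch: surfacing likely "root cause" features for failed quality checks.
# Illustrative only; the data, columns, and rule are invented, not DQLabs internals.
import pandas as pd
from lightgbm import LGBMClassifier

# Toy dataset: each row carries a binary label marking a failed data-quality check.
df = pd.DataFrame({
    "source_system": [0, 1, 1, 0, 2, 2, 1, 0],
    "batch_hour":    [2, 3, 23, 2, 23, 23, 4, 1],
    "null_ratio":    [0.0, 0.1, 0.6, 0.0, 0.7, 0.5, 0.1, 0.0],
    "failed_check":  [0, 0, 1, 0, 1, 1, 0, 0],
})
X, y = df.drop(columns="failed_check"), df["failed_check"]

# Fit a boosted model, then rank features by importance: the top features are
# the columns that best separate failing rows from passing ones.
model = LGBMClassifier(n_estimators=50, min_child_samples=1).fit(X, y)
ranking = sorted(zip(X.columns, model.feature_importances_),
                 key=lambda t: t[1], reverse=True)
print("features most associated with failures:", ranking)

# Rows scored as most likely to fail are the candidates to inspect first.
df["fail_score"] = model.predict_proba(X)[:, 1]
print(df.sort_values("fail_score", ascending=False).head(3))
```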

    • United States
    • IT Services and IT Consulting
    • 1 - 100 Employee
    • Senior Information Technology Analyst
      • Feb 2022 - Present

    • Netherlands
    • Marketing Services
    • Full Stack Data Scientist
      • Oct 2020 - Nov 2021

      - SPOT N.L. is a subsidiary of Social Supply Group, V.O.F.
      - The startup's primary goal was to analyze millions of real-time sports data points from multiple sources and apply the most suitable machine learning and deep learning models to provide the best bookmaker odds to customers
      - Designed and implemented a real-time data pipeline that processed semi-structured data, integrating millions of raw records from multiple data sources with Kafka and PySpark and storing the processed data in Redshift (a sketch of the ingestion step follows below)
      - Built the prediction model using neural networks with Elo features to predict real-time football odds, benchmarked against 30+ bookmakers
      - Gained an end-to-end understanding of data-intensive product deployment, as the company ran both B2B and B2C business models
      - Tech stack:
        Big Data: Apache Spark, Hadoop (Hive, Pig, MapReduce, HBase), Kafka (real-time streaming)
        Cloud Services: AWS (S3, Redshift, EC2, Glue, Athena, SageMaker)
        Databases: SQL (PostgreSQL), MongoDB
        Orchestration: Airflow
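
      A minimal sketch of the Kafka-to-PySpark ingestion step described above; the broker address, topic, schema, and S3 paths are illustrative, and the load into Redshift (for example via COPY from the staged Parquet files) is omitted.

```python
# Minimal sketch of the Kafka -> Spark Structured Streaming ingestion step.
# Broker address, topic, schema, and S3 paths are illustrative; the Redshift
# load (e.g. COPY from the staged Parquet) is not shown. Requires the
# spark-sql-kafka connector package on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import (DoubleType, StringType, StructField, StructType,
                               TimestampType)

spark = SparkSession.builder.appName("odds-ingest").getOrCreate()

# Expected shape of each JSON message on the topic (assumed fields).
schema = StructType([
    StructField("match_id", StringType()),
    StructField("bookmaker", StringType()),
    StructField("home_odds", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
       .option("subscribe", "sports-odds")                # assumed topic
       .load())

# Kafka delivers raw bytes; decode the value column and parse the JSON payload.
parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
             .select(from_json(col("json"), schema).alias("r"))
             .select("r.*"))

# Stage each micro-batch as Parquet on S3 for a downstream Redshift COPY.
query = (parsed.writeStream.format("parquet")
         .option("path", "s3a://example-bucket/odds/")            # assumed bucket
         .option("checkpointLocation", "s3a://example-bucket/_chk/odds/")
         .start())
query.awaitTermination()
```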

    • Germany
    • Manufacturing
    • 700 & Above Employee
    • Digital Transformation & Business Science Intern
      • Feb 2019 - Jan 2020

      • Worked on recommendation models using machine learning algorithms for supply chain network optimization
      • Extensive use of Tableau to create new dashboards and improve existing ones
      • [POC] Predicted customer behaviour patterns using a sentiment analysis model for D2C applications (a sketch of the approach follows below)
      • Implemented a machine learning model for supply chain network optimization: a recommendation model expected to realize millions of euros in savings for the Henkel supply chain network through smarter route selection and order-optimization suggestions for both the customer and Henkel
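
      A toy sketch of the kind of sentiment-analysis POC mentioned above, using TF-IDF features and logistic regression; the reviews and labels are invented and this is not the Henkel model.

```python
# Toy sketch of a sentiment-analysis POC: TF-IDF features + logistic regression.
# The reviews and labels are invented; this is not the Henkel model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "fast delivery, great product", "arrived late and damaged",
    "excellent customer support", "packaging was poor",
    "will definitely order again", "never buying this again",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

# Vectorize the text and fit a simple classifier in one pipeline.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(reviews, labels)

# Score new feedback; in a POC this would feed D2C behaviour predictions.
print(clf.predict(["the order arrived damaged", "great product, quick delivery"]))
```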

    • India
    • IT Services and IT Consulting
    • 700 & Above Employee
    • System Engineer
      • Jan 2016 - Jul 2017

      Customer: ABN AMRO
      Project: MDL Basel Chain Reporting
      Description: The overall project objective was to deliver a process that creates data reports to comply with mandatory regulatory requirements for Basel II reporting. The purpose of the project was to meet the mandatory reporting needs concerning credit risk and the requirements for internal reporting of the ABN AMRO N-share.
      Highlights and responsibilities:
      • Mapped business requirements to ETL Informatica PowerCenter mapping logic
      • Mentored and guided junior and fresher consultants
      • Worked on Python scripts to automate the deployment of XML files in the environment (a sketch follows below)
      • Worked with the ETL PowerCenter Designer; performance tuning of components
      • Knowledge of data warehouse concepts, Unix, mainframe scheduling, and ETL Informatica PowerCenter
      • Worked on the Teradata and DB2 database systems used in the project for reporting
      • Walked through the release IPPT report with IBM and the bank's functional management (Functioneel Beheer) during release sign-off meetings
      • Built relationships with clients, technical teams, and external contractors, working towards a common goal
      • Worked in both client-managed and TCS-managed project environments
      • Good presentation and communication skills
      • Worked on different risk management and operational management scores, following agile methodology in the bank's dynamic way of working
      Tools used in the project: ETL PowerCenter v9, Unix, Teradata, IBM DB2, Mainframe, G3 tool, R, Oracle, Jira, Python, shell scripting, Linux
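
      A minimal sketch in the spirit of the Python XML-deployment scripts mentioned above; the directory paths and the validate-then-copy flow are assumptions, not the actual ABN AMRO deployment procedure.

```python
# Minimal sketch of an automated XML deployment helper: validate that each file
# parses, then copy it into the target directory. Paths are illustrative, not
# the actual ABN AMRO environment or deployment procedure.
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def deploy_xml(source_dir: str, target_dir: str) -> None:
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for xml_file in sorted(Path(source_dir).glob("*.xml")):
        try:
            ET.parse(xml_file)  # reject malformed XML before deploying it
        except ET.ParseError as exc:
            print(f"skipping {xml_file.name}: {exc}")
            continue
        shutil.copy2(xml_file, target / xml_file.name)
        print(f"deployed {xml_file.name}")


if __name__ == "__main__":
    deploy_xml("./exports", "./deploy")  # assumed directories
```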

    • Assistant System Engineer
      • Dec 2014 - Jan 2016

      Customer: ABN AMRO
      Project: MDL Basel II (CCC) & CTL [Closing The Loop]
      Description: HKU and CTL were among the first projects to use the new data warehouse development street, the so-called ADW (ABN AMRO Data Warehouse). The tooling supporting this street was set up by the DEWI project, allowing business analysts to fill in G3 spreadsheets that can be automatically migrated to PowerCenter workflows capable of executing basic ETL (Extract-Transform-Load) activities to populate the data warehouse environment.
      Responsibilities:
      • As a project member, responsible for the design and development of the project in ETL PowerCenter
      • Detailed analysis of the business requirements
      • Mapped business requirements to IT requirements
      • Designed, coded, and tested the modifications and enhancements required to meet the requirements
      • Optimized the ETL Informatica PowerCenter components
      • Delivered numerous releases in a timely and quality-conscious manner
      • Worked with business and financial analysts on an Excel-based tool using VBA, performing multiple pattern-matching analyses
      Tools used in the project: ETL PowerCenter v9.6.1, Unix, Teradata, IBM DB2 7.2, Mainframe, R, Oracle

    • India
    • Research Services
    • 700 & Above Employee
    • Summer Research Intern
      • Jul 2012 - Sep 2012

      Named Entity Recognition (NER) using a Hidden Markov Model (HMM): developed NER, a Natural Language Processing tool, using a trained HMM to recognize named entities occurring in military intelligence data (a toy sketch of the technique follows below).
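
      A toy sketch of supervised HMM-based NER, here using NLTK's HMM tagger as a stand-in for the original custom implementation; the tiny training corpus and BIO tag set are invented.

```python
# Toy sketch of supervised HMM-based NER using NLTK's HMM tagger (an assumption;
# the original tool was a custom HMM). The tiny corpus and BIO tags are invented.
from nltk.tag import hmm

# Each training sentence is a list of (token, tag) pairs.
train = [
    [("Colonel", "O"), ("Sharma", "B-PER"), ("visited", "O"), ("Leh", "B-LOC")],
    [("Units", "O"), ("moved", "O"), ("towards", "O"), ("Kargil", "B-LOC")],
    [("General", "O"), ("Rao", "B-PER"), ("briefed", "O"), ("HQ", "B-ORG")],
]

# Estimate transition and emission probabilities, then Viterbi-decode new text.
tagger = hmm.HiddenMarkovModelTrainer().train_supervised(train)
print(tagger.tag("Units moved towards Leh".split()))
```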

    • Austria
    • Food and Beverage Services
    • 700 & Above Employee
    • Student Brand Manager
      • Aug 2011 - Jun 2012

      • Facilitated the setup and flow of events (on and off campus)
      • Sponsorship of student events
      • Ensured availability of the product on and around campus
      • Assisted in large-scale marketing activities
      Notable event: Red Bull Campus Cricket 2012 (Bangalore)

Education

  • Tilburg University
    Master's degree, Data Science and Society
    2019 - 2022
  • Jheronimus Academy of Data Science
    Pre-Masters, Data Science
    2018 - 2019
  • Dr. Ambedkar Institute Of Technology
    Bachelor of Engineering (B.E.), Electrical, Electronics and Communications Engineering
    2010 - 2014
