Vinay Karingula

Data Science Specialist at NewMarket Corporation
Contact Information
us****@****om
(386) 825-5501
Location
Richmond, Virginia, United States


Experience

    • NewMarket Corporation
    • United States
    • Chemical Manufacturing
    • 100 - 200 Employees
    • Data Science Specialist
      • Mar 2023 - Present

    • United States
    • Hospitality
    • 700 & Above Employees
    • Cloud/Big Data Engineer
      • Feb 2022 - Feb 2023

      • Gathered project documentation from the outgoing team during the transition phase.
      • Worked with Azure Databricks and Airflow to maintain and troubleshoot Airflow DAGs and Databricks logs.
      • Documented the integration between Azure Databricks and Airflow.
      • Created Spark jobs to maintain data integrity on pipelines landing in ADLS from various source systems, and automated them with a DAG running on a schedule.
      • Developed PySpark scripts to set up the data pipeline.
      • Involved in the design and build of the project from scratch.
      • Implemented a Master Patient Index (MPI) and its accompanying interfaces.
      • Implemented file and web service interfaces for data exchange.
      • Assisted partner agencies with providing data to a data warehouse.
      • Analyzed data provided by partner agencies to facilitate deduplication.
      • Exported data according to outside vendor specifications.
      • Developed DAGs, tuned DAG performance, and implemented tasks.
      • Worked closely with machine learning engineers to produce the desired output for customers.
      • Maintained inbound and outbound pipelines for data transfers from various source systems to target systems.
      • Delivered proofs of concept and production implementations in iterative Agile sprints.
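
A minimal sketch of the kind of scheduled Airflow DAG described above. The DAG id, task id, ADLS path, and the check logic are hypothetical placeholders, not details from the actual project.

    # Hypothetical Airflow DAG: runs a daily data-integrity check on files
    # landed in ADLS. Paths, ids, and the check itself are illustrative.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def check_data_integrity(**context):
        # Placeholder: the real job would read the landing zone (e.g. via
        # Spark or the Azure storage SDK) and validate row counts/schemas.
        print("validating abfss://container@account.dfs.core.windows.net/landing/")

    with DAG(
        dag_id="adls_integrity_check",      # hypothetical DAG id
        start_date=datetime(2022, 2, 1),
        schedule_interval="@daily",         # run on a schedule, as described above
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        PythonOperator(
            task_id="check_integrity",      # hypothetical task id
            python_callable=check_data_integrity,
        )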

    • United States
    • Wellness and Fitness Services
    • 700 & Above Employees
      • Jun 2021 - Jan 2022

      • Worked with AWS cloud; created EMR clusters with Spark to analyze and process raw data accessed from S3 buckets.
      • Integrated the end-to-end data pipeline taking data from source systems to target data repositories, ensuring data quality and consistency at all times.
      • Delivered proofs of concept and production implementations in iterative Agile sprints.
      • Migrated data from TD to TDV.
      • Developed PySpark scripts to set up the data pipeline.
      • Developed ETL streams using Databricks.
      • Created a Spark application to load data into Athena tables.
      • Involved in the design and build of the project from scratch.
      • Migrated data from TD to S3.
      • Designed and created automated Airflow DAGs with parallel execution.
      • Developed DAGs, tuned DAG performance, and implemented tasks.
      • Environment: PySpark, Python, AWS Glue, Athena, Teradata, Airflow, S3, Databricks, IAM, SNS.
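
A minimal sketch, with hypothetical bucket names and paths, of the S3-to-Athena pattern described above: a PySpark job (for example an EMR step) reads raw data from S3 and writes partitioned Parquet, the usual layout an external Athena table is defined over.

    # Hypothetical PySpark job: raw JSON in S3 -> partitioned Parquet for Athena.
    # Bucket names, paths, and the partition column are illustrative placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("raw-to-athena").getOrCreate()

    raw = spark.read.json("s3://example-raw-bucket/events/")    # hypothetical source
    cleaned = raw.dropDuplicates().withColumn("event_date", F.to_date("event_ts"))

    (cleaned.write
        .mode("overwrite")
        .partitionBy("event_date")           # lets Athena prune partitions at query time
        .parquet("s3://example-curated-bucket/events/"))        # hypothetical target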

    • Sr. Data Engineer
      • Sep 2020 - May 2021

      • Worked as a Sr. Data Engineer with Hadoop ecosystems, Apache Spark, and AWS.
      • Designed and built robust services using streaming and batch data.
      • Key contributor in building identity services that enable sharing profiles across the organization in support of marketing and analytics.
      • Created Hive schemas using performance techniques such as partitioning and bucketing.
      • Developed analytical components using Kafka and Spark Streaming.
      • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
      • Created a Spark application to load data into Athena tables.
      • Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
      • Used JSON schemas to define table and column mappings from S3 data to Redshift.
      • Developed PySpark scripts to set up the data pipeline.
      • Worked on Spark SQL: created DataFrames by loading data from Hive tables, prepared the data, and stored it in AWS S3.
      • Collaborated with product teams, data analysts, and data scientists to design and build data-forward solutions.
      • Created Airflow jobs to orchestrate Spark workflows.
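
A minimal sketch of the Kafka-plus-Spark-Streaming pattern above, written with Structured Streaming; the brokers, topic, and sink paths are hypothetical, and the job assumes the spark-sql-kafka connector is on the classpath.

    # Hypothetical Structured Streaming job: consume a Kafka topic, land the
    # raw events in S3. Brokers, topic, and paths are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

    events = (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")   # hypothetical brokers
        .option("subscribe", "profile-events")               # hypothetical topic
        .load()
        .selectExpr("CAST(value AS STRING) AS json"))

    (events.writeStream
        .format("parquet")
        .option("path", "s3://example-bucket/stream/")            # hypothetical sink
        .option("checkpointLocation", "s3://example-bucket/chk/") # recovery bookkeeping
        .start()
        .awaitTermination())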

    • United States
    • Electric Power Generation
    • 700 & Above Employees
    • Hadoop Developer
      • Oct 2015 - Jun 2018

      • Worked as a Java/Hadoop developer responsible for everything related to the clusters.
      • Responsible for building scalable distributed data solutions in a Hadoop cluster environment with the Hortonworks distribution.
      • Developed Spark scripts using Python shell commands as per the requirements.
      • Developed Spark scripts writing custom RDDs in Scala and Python for data transformations and actions on RDDs.
      • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
      • Involved in performance tuning of Spark jobs using caching, taking full advantage of the cluster environment.
      • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
      • Worked with file formats such as Text, SequenceFile, Avro, ORC, and Parquet.
      • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
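
A minimal sketch, with hypothetical HDFS paths, of the RDD-style transformations and actions mentioned above: parse lines, filter malformed records, and aggregate with reduceByKey.

    # Hypothetical PySpark RDD job: key counting over tab-delimited text.
    # Input and output paths are illustrative placeholders.
    from pyspark import SparkContext

    sc = SparkContext(appName="rdd-transformations")

    lines = sc.textFile("hdfs:///data/events.txt")        # hypothetical input
    counts = (lines
        .map(lambda line: line.split("\t"))               # parse rows
        .filter(lambda fields: len(fields) >= 2)          # drop malformed rows
        .map(lambda fields: (fields[0], 1))
        .reduceByKey(lambda a, b: a + b))                 # transformation; evaluated lazily

    counts.saveAsTextFile("hdfs:///out/event_counts")     # action triggers the job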

Education

  • New Jersey Institute of Technology
    Master's degree, Information Science/Studies
    2018 - 2019
  • Sreenidhi Institute of Science and Technology
    Bachelor's degree, Electrical and Electronics Engineering
    2011 - 2015
