Mohit K
Senior Data Engineer at Texas Medicaid & Healthcare Partnership
Experience
- Texas Medicaid & Healthcare Partnership
- Hospitals and Health Care
- 1 - 100 Employee
- Senior Data Engineer
- Sep 2021 - Present
• Performed inventory planning and retail data processing on AWS S3 using Spark; applied transformations, business rules, and lookups per business requirements
• Developed and managed ETL jobs and data pipelines
• Integrated and scheduled Airflow DAGs for real-time data integration pipelines
• Wrote Hive queries with partitioning over S3 locations and tuned joins on Hive tables for optimized performance
• Read multiple data formats (CSV, JSON, Parquet) on HDFS using PySpark, used them as lookup tables, and applied coalesce
• Tuned and optimized Spark jobs with partitioning/bucketing and driver/executor memory management
• Optimized Hive queries across file formats such as Parquet, JSON, and CSV
• Participated in sprint planning and standup meetings under the Agile Scrum methodology
• Created external Hive tables using partitioning and bucketing
• Automated end-to-end data processing pipelines and scheduled data workflows
• Scheduled Spark jobs using Airflow on the Hadoop cluster
• Scheduled and participated in calls with the Product Owner and BSAs to gather requirements
• Used Git and Jenkins to push and deploy code
• Persisted data in S3 using Kinesis Firehose, VPC, EMR, Lambda, and CloudWatch
• Handled MongoDB persistence and maintenance
• Maintained the EAP (Enterprise Application Platform) for persistence of real-time and batch data
• Implemented tokenization using DTAAS and Previtaar to support REST-level, file-level, and field-level encryption
• Configured Kafka Connect connectors: AWS S3 sink, Salesforce, Splunk, and HDFS
• Developed PySpark applications for inventory analytics and loaded data into tables for the data science team to analyze and chart
• Built end-to-end data pipelines using Spark and stored final data into …
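For illustration, a minimal PySpark sketch of the S3 processing pattern described above; the bucket names, paths, and column names are hypothetical placeholders, not the actual TMHP pipeline.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("inventory-retail-processing").getOrCreate()

# Read retail transactions (Parquet) and a small inventory lookup (CSV) from S3.
retail = spark.read.parquet("s3://example-bucket/retail/transactions/")
lookup = spark.read.option("header", True).csv("s3://example-bucket/inventory/sku_lookup.csv")

# Apply an example business rule and enrich rows via a broadcast lookup join.
enriched = (
    retail
    .filter(F.col("quantity") > 0)                    # hypothetical rule: drop empty lines
    .join(F.broadcast(lookup), on="sku", how="left")  # lookup table is small, so broadcast it
    .withColumn("load_date", F.to_date("event_ts"))
)

# Coalesce small output files and write partitioned Parquet back to S3.
(enriched
    .coalesce(8)
    .write.mode("overwrite")
    .partitionBy("load_date")
    .parquet("s3://example-bucket/curated/retail_inventory/"))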
- The College Board
- United States
- Education Administration Programs
- 700 & Above Employee
- Big Data Engineer
- Feb 2020 - Aug 2021
• Performed analytics on AWS S3 using Spark; applied transformations and actions per business requirements
• Integrated Apache Kafka with HDFS and S3 data pipelines for real-time data
• Tuned Hive jobs through partitioning, bucketing, and optimized joins on Hive tables
• Built ETL processes using Spark and stored final data in the Snowflake data warehouse
• Designed Hive schemas and developed normalized and denormalized data models
• Implemented the data lake and was responsible for data management within it
• Developed a Ruby script to map data to the production environment
• Developed Hive queries and used Sqoop to move data from RDBMS sources to the data lake staging area
• Handled warehouse data, created external Hive tables, and wrote reusable ingestion scripts shared across the project
• Wrote shell scripts to export log files to the Hadoop cluster through an automated process
• Participated in iteration planning under the Agile methodology
• Designed both managed and external Hive tables using partitioning and bucketing
• Tuned jobs via Hive table partitioning/bucketing and driver/executor memory management
• Developed Pig scripts to relate datasets stored in the Hadoop cluster
• Performed advanced procedures such as text analytics using Spark's in-memory computing capabilities with Python and Scala
• Worked closely with the data science team to clarify requirements and created Hive tables on HDFS
• Automated data processing pipelines and scheduled data flow jobs
• Scheduled Spark jobs using Oozie workflows in the Hadoop cluster
• Actively participated in code reviews and meetings, and troubleshot and resolved technical issues
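For illustration, a rough PySpark sketch of creating and querying a partitioned external Hive table over staged HDFS data, as described above; the database, table, column, and path names are assumptions, not actual College Board schemas.

from pyspark.sql import SparkSession

# Hive support lets spark.sql create and query Hive tables directly.
spark = (SparkSession.builder
         .appName("hive-external-table")
         .enableHiveSupport()
         .getOrCreate())

# Define a partitioned external table over data staged on HDFS
# (e.g. landed there by Sqoop). Names and locations are illustrative only.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.exam_scores (
        student_id STRING,
        score      INT
    )
    PARTITIONED BY (exam_date STRING)
    STORED AS PARQUET
    LOCATION 'hdfs:///data/lake/curated/exam_scores'
""")

# Register partitions that already exist under the table location,
# then query the table as usual.
spark.sql("MSCK REPAIR TABLE analytics.exam_scores")
spark.sql("""
    SELECT exam_date, AVG(score) AS avg_score
    FROM analytics.exam_scores
    GROUP BY exam_date
""").show()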
- Virginia Credit Union
- United States
- Banking
- 400 - 500 Employee
- Data Engineer
- Aug 2017 - Jan 2020
• Designed and developed data integration/engineering workflows on big data technologies and platforms: Hadoop, Spark, MapReduce, Hive, and HBase
• Worked in an Agile methodology; actively participated in standup calls and PI planning, with work reported in Rally
• Participated in requirements gathering and prepared design documents
• Imported data into HDFS and Hive using Sqoop; created Hive tables, loaded them with data, and wrote Hive queries
• Imported data from various sources, performed transformations using Hive, and loaded data into the data lake
• Handled large datasets using partitions, Spark in-memory capabilities, broadcast variables, efficient joins, transformations, and other operations
• Processed data stored in the data lake, created external Hive tables, and developed reusable scripts to ingest and repair tables across the project
• Developed Databricks Python notebooks to join, filter, pre-aggregate, and process files stored in Azure Data Lake Storage
• Created Azure Data Factory pipelines, managed Data Factory policies, and used Blob Storage for storage and backup on Azure
• Migrated applications from internal data storage to Azure
• Tuned Hive and Spark with partitioning/bucketing of Parquet data and executor memory settings
• Developed Hive queries and used Sqoop to move data from RDBMS sources to the Hadoop staging area
• Developed data flows and processing logic using SQL (Spark SQL and DataFrames)
• Designed and developed MapReduce (Hive) programs to analyze and evaluate multiple solutions, weighing cost factors across the business as well as operational impact on flight historical data
• Participated in iteration planning under the Agile Scrum methodology
• Scheduled Spark jobs using Oozie workflows in the Hadoop cluster and generated detailed design documentation for source-to-target transformations
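For illustration, a brief Databricks-style PySpark sketch of the join/filter/pre-aggregate pattern over Azure Data Lake Storage described above; the storage account, container names, and columns are hypothetical placeholders.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("adls-preaggregation").getOrCreate()

# ADLS Gen2 paths; the storage account, containers, and columns are placeholders.
raw = "abfss://raw@examplestorage.dfs.core.windows.net"
curated = "abfss://curated@examplestorage.dfs.core.windows.net"

transactions = spark.read.parquet(f"{raw}/transactions/")
accounts = spark.read.parquet(f"{raw}/accounts/")

# Join, filter, and pre-aggregate before persisting a curated daily summary.
daily_summary = (
    transactions
    .join(F.broadcast(accounts), on="account_id", how="inner")  # small dimension table, broadcast join
    .filter(F.col("status") == "POSTED")
    .groupBy("account_id", F.to_date("posted_ts").alias("posted_date"))
    .agg(F.sum("amount").alias("total_amount"),
         F.count("*").alias("txn_count"))
)

(daily_summary
    .write.mode("overwrite")
    .partitionBy("posted_date")
    .parquet(f"{curated}/daily_account_summary/"))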
- Tulasi Technologies Pvt Ltd
- India
- IT Services and IT Consulting
- 1 - 100 Employee
- Data Engineer
- Dec 2015 - May 2017
• Improved workflow performance by tuning at the query and workflow levels using partitioning
• Enhanced the claim authorization process by developing a single module that works regardless of file type
• Participated in project meetings for requirements analysis
• Used different session partitions to improve performance without affecting the logic
• Performed peer reviews before code moved to higher environments
• Created scheduled jobs using ESP
• Participated in requirements gathering and analysis from business grooming sessions
• Performed unit testing and prepared project design documents
• Developed the ETL design and was responsible for the deliverables
• Responsible for defect tracking using ALM
- Bloom Soft Tech
- India
- IT Services and IT Consulting
- 1 - 100 Employee
- Data Analyst
- Jul 2014 - Nov 2015
• Worked on Informatica tools: Source Analyzer, Mapping Designer, Mapplet Designer, and Transformation Developer
• Created mappings using transformations such as Source Qualifier, Connected and Unconnected Lookup, Filter, Router, Update Strategy, Aggregator, Sequence Generator, Joiner, and Expression
• Prepared unit test cases for the mappings
• Developed mappings per the given mapping specifications
• Responsible for understanding dynamically changing requirements and accommodating those changes during development
• Extracted, transformed, and loaded data from source to staging and from staging to target according to business requirements
• Validated ETL code when moving it to other environments
• Implemented parameters at the mapping and workflow levels
• Scheduled sessions and batches on the Informatica server using Workflow Manager
Education
- GITAM Deemed University: Bachelor's, Computer Science