Venkata Sri P

Big Data Engineer at Berkley Medical Management Solutions (a Berkley Company)
Contact Information
Location
Frisco, Texas, United States

Experience

    • Berkley Medical Management Solutions (a Berkley Company)
    • United States
    • Insurance
    • 1 - 100 Employees
    • Big Data Engineer
      • May 2022 - Present

      • Built scalable databases supporting ETL processes using SQL and Spark.
      • Used Spark JDBC drivers to ingest data from a variety of relational databases into HDFS (see the sketch after this list).
      • Converted raw data to serialized formats such as Avro and Parquet to reduce processing time and improve transfer efficiency over the network.
      • Designed, developed, and tested Extract, Transform, Load (ETL) applications against different types of sources.
      • Applied tuning techniques such as partitioning, caching, broadcasting, and bucketing to improve Spark jobs.
      • Processed both structured and semi-structured data using Spark.
      • Implemented large Lambda architectures using Azure data platform capabilities such as Azure Data Lake, Azure Data Factory, and Azure SQL Server.
      • Migrated data from on-premises SQL Server to cloud databases using Azure Data Factory (ETL) and Azure SQL DB.
      • Worked with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (SQL DW).
      • Deployed data pipelines in Azure Data Factory using JSON definitions to process the data.
      • Analyzed data from different sources on Hadoop using Azure Data Factory, Azure Data Lake, Azure Data Lake Analytics, Hive, and Sqoop.
      • Designed end-to-end scalable architectures to solve business problems using Azure components such as Data Factory, Data Lake, and Key Vault.
      • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data between sources such as Azure SQL, Blob storage, and Azure SQL Data Warehouse, including write-back.
      • Scheduled pipelines using the Azure Databricks job scheduler and Apache Airflow.
      • Worked in a production environment, building CI/CD pipelines in Azure DevOps with stages from code checkout through deployment to target environments.
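
      Below is a minimal PySpark sketch of the ingest-and-tune pattern described above: read a table over JDBC, convert it to Parquet, and broadcast a small dimension table in a join. The connection URL, table and column names, and paths are hypothetical placeholders rather than details of this role.

      # Hypothetical sketch: Spark JDBC ingest -> Parquet -> broadcast join.
      from pyspark.sql import SparkSession
      from pyspark.sql.functions import broadcast

      spark = SparkSession.builder.appName("jdbc-ingest-sketch").getOrCreate()

      # Parallel JDBC read, partitioned on a numeric key.
      claims = (spark.read.format("jdbc")
                .option("url", "jdbc:postgresql://db-host:5432/claims_db")  # placeholder URL
                .option("dbtable", "public.claims")                         # placeholder table
                .option("user", "etl_user")
                .option("password", "<from-key-vault>")
                .option("partitionColumn", "claim_id")
                .option("lowerBound", 1)
                .option("upperBound", 1000000)
                .option("numPartitions", 8)
                .load())

      # Persist as Parquet, partitioned by date, to cut downstream read time.
      claims.write.mode("overwrite").partitionBy("claim_date").parquet("hdfs:///data/claims")

      # Broadcast the small dimension table so the join avoids a full shuffle.
      providers = spark.read.parquet("hdfs:///data/providers")
      enriched = claims.join(broadcast(providers), "provider_id")
      enriched.cache()  # cached because it is reused by later aggregations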

    • United States
    • Higher Education
    • 700 & Above Employees
    • Data Associate
      • May 2021 - Apr 2022

      • Imported and exported data between different sources and HDFS for further processing using Apache Sqoop.
      • Developed Python scripts for file validation in Databricks and automated the process with ADF (see the sketch after this list).
      • Implemented data pipelines in Azure Data Factory to extract, transform, and load data from sources such as Azure SQL, Blob storage, and Azure SQL Data Warehouse.
      • Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, and Spark SQL.
      • Loaded data into the Data Lake with ETL jobs and developed shell scripts to add dynamic partitions to Hive staging tables.
      • Used notebooks, Spark DataFrames, Spark SQL, and Python scripting to build ETL pipelines in Databricks.
      • Developed and maintained data pipelines on the Azure analytics platform using Azure Databricks, PySpark, and Python.
      • Developed Spark applications with Spark SQL in Databricks for extraction, transformation, and aggregation across multiple file formats, analyzing the data to uncover insights into customer usage patterns.
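
      Below is a minimal sketch of the Databricks file-validation step described above, of the kind an ADF pipeline might trigger. The storage paths, schema, and validation rules are assumptions for illustration.

      # Hypothetical sketch: validate a landed CSV before promoting it.
      from pyspark.sql import SparkSession
      from pyspark.sql.types import StructType, StructField, StringType, DoubleType

      # In a Databricks notebook, `spark` already exists; created here for completeness.
      spark = SparkSession.builder.appName("file-validation-sketch").getOrCreate()

      expected = StructType([
          StructField("student_id", StringType(), False),
          StructField("course", StringType(), True),
          StructField("score", DoubleType(), True),
      ])

      df = (spark.read.option("header", True).schema(expected)
            .csv("abfss://landing@account.dfs.core.windows.net/enrollments/"))  # placeholder path

      # Fail fast on empty files, null keys, or out-of-range values.
      errors = []
      if df.rdd.isEmpty():
          errors.append("file is empty")
      if df.filter(df.student_id.isNull()).count() > 0:
          errors.append("null student_id values")
      if df.filter((df.score < 0) | (df.score > 100)).count() > 0:
          errors.append("score out of range")
      if errors:
          raise ValueError("validation failed: " + "; ".join(errors))

      # Validated data lands as Parquet for downstream T-SQL / Spark SQL steps.
      df.write.mode("overwrite").parquet("abfss://curated@account.dfs.core.windows.net/enrollments/")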

    • Hadoop Developer/Data Engineer
      • Jun 2016 - Feb 2019

      • Participated in designing and developing Big Data application architecture for processing and analyzing data per business requirements and use cases.
      • Gained hands-on experience with Spark performance tuning (see the sketch after this list).
      • Worked with Azure data services such as Azure Data Factory, Azure Data Lake, and Azure Databricks.
      • Analyzed existing application architecture to suggest and implement changes that optimized solutions to problem statements.
      • Recommended new Big Data tools and solutions and developed proofs of concept (POCs) to resolve issues observed with the existing implementation approach.
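
      Below is a minimal sketch of the kind of change a Spark performance-tuning POC might compare: sizing shuffle partitions to the data and repartitioning on the aggregation key. The partition count, paths, and column names are assumptions.

      # Hypothetical sketch: tune shuffle behavior for a grouped aggregation.
      from pyspark.sql import SparkSession

      spark = (SparkSession.builder
               .appName("tuning-poc-sketch")
               .config("spark.sql.shuffle.partitions", "64")  # sized to the data, not the 200 default
               .getOrCreate())

      events = spark.read.parquet("hdfs:///data/events")  # placeholder path

      # Repartition on the grouping key so the aggregation shuffles data once.
      daily = (events.repartition(64, "user_id")
               .groupBy("user_id", "event_date")
               .count())

      daily.explain()  # compare physical plans before/after tuning in the POC
      daily.write.mode("overwrite").parquet("hdfs:///data/events_daily")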

    • Data Warehouse Developer
      • Apr 2015 - Jun 2016

      • Assisted the team with performance tuning for ETL and database processes.
      • Designed, developed, implemented, and helped validate processes.
      • Self-managed time and task priorities, as well as those of other developers on the project.
      • Worked with data providers to fill data gaps and/or adjust source-system data structures to facilitate analysis and integration with other company data.
      • Conducted ETL performance tuning, troubleshooting, support, and capacity estimation.
      • Mapped sources to targets using a variety of tools, including Business Objects Data Services/BODI (see the mapping sketch after this list).
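
      Below is a minimal Python sketch of a source-to-target mapping of the kind maintained in Business Objects Data Services, expressed in code purely for illustration. The column names, conversion rules, and file names are hypothetical.

      # Hypothetical sketch: apply a source-to-target column mapping in an ETL step.
      import csv

      # Source column -> (target column, converter). All names are placeholders.
      MAPPING = {
          "CUST_NO": ("customer_id", str.strip),
          "CUST_NM": ("customer_name", str.title),
          "ORD_AMT": ("order_amount", float),
          "ORD_DT":  ("order_date", str),  # assumed already ISO-8601 in the source
      }

      def transform_row(src_row: dict) -> dict:
          """Apply the mapping, surfacing source data gaps early."""
          out = {}
          for src_col, (tgt_col, convert) in MAPPING.items():
              if src_col not in src_row or src_row[src_col] == "":
                  raise ValueError(f"source gap: {src_col} missing or empty")
              out[tgt_col] = convert(src_row[src_col])
          return out

      with open("orders_extract.csv", newline="") as f_in, \
           open("orders_staged.csv", "w", newline="") as f_out:
          reader = csv.DictReader(f_in)
          writer = csv.DictWriter(f_out, fieldnames=[t for t, _ in MAPPING.values()])
          writer.writeheader()
          for row in reader:
              writer.writerow(transform_row(row))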

Education

  • University of Central Missouri
    Master's degree, Big Data Analytics and Information Technology
    2020 - 2022
  • Jawaharlal Nehru Technological University, Kakinada
    Bachelor's degree, Electrical, Electronics and Communications Engineering
    2011 - 2015
