Raj Vyas

Sr. Data Engineer at The Cigna Group
Contact Information
us****@****om
(386) 825-5501
Location
United States


Experience

    • The Cigna Group
    • Hospitals and Health Care
    • 700 & Above Employee
    • Sr. Data Engineer
      • Nov 2021 - Present

    • United States
    • Retail
    • 700 & Above Employee
    • Senior Data Engineer
      • Aug 2019 - Nov 2021

      • Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
      • Used AWS Redshift to extract, transform, and load data from various heterogeneous data sources and destinations.
      • Created tables, stored procedures, and extracted data using T-SQL for business users as required.
      • Performed data analysis and design; created and maintained large, complex logical and physical data models and metadata repositories using ERWIN and MB MDR.
      • Wrote shell scripts to trigger DataStage jobs. Assisted service developers in finding relevant content in existing reference models such as Access, Excel, CSV, Oracle, and flat files using connectors, tasks, and transformations provided by AWS Data Pipeline.
      • Utilized the Spark SQL API in PySpark to extract and load data and perform SQL queries.
      • Developed PySpark scripts to encrypt raw data by applying hashing algorithms to client-specified columns (see the sketch after this list).
      • Responsible for design, development, and testing of the database; developed stored procedures, views, and triggers.
      • Created Tableau reports with complex calculations and worked on ad-hoc reporting using Power BI.
      • Created a data model that correlates all the metrics and produces a valuable output. Tuned SQL queries to reduce run time by working on indexes and execution plans.
      • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, PostgreSQL, DataFrames, OpenShift, Talend, and pair RDDs.
      • Involved in integrating the Hadoop cluster with the Spark engine to perform batch and GraphX operations. Performed data preprocessing and feature engineering for further predictive analytics using Python Pandas.
      • Generated reports on predictive analytics using Python and Tableau, including visualizations of model performance and prediction results.
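
      A minimal sketch of the column-hashing step referenced above, assuming PySpark with SHA-256 via pyspark.sql.functions.sha2; the S3 paths and column names are hypothetical placeholders, not values from the original work.

      from pyspark.sql import SparkSession
      from pyspark.sql import functions as F

      spark = SparkSession.builder.appName("hash-sensitive-columns").getOrCreate()

      # Hypothetical input path; the client-specified columns would come from configuration.
      raw = spark.read.parquet("s3://example-bucket/raw/customers/")
      sensitive_cols = ["ssn", "email", "phone"]

      # Replace each specified column with its SHA-256 hash (hex string).
      hashed = raw
      for col_name in sensitive_cols:
          hashed = hashed.withColumn(col_name, F.sha2(F.col(col_name).cast("string"), 256))

      hashed.write.mode("overwrite").parquet("s3://example-bucket/curated/customers/")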

    • United States
    • Telecommunications
    • 700 & Above Employee
    • Data Engineer
      • Dec 2017 - Aug 2019

      • Processed web server logs by developing multi-hop Flume agents using the Avro sink and loaded them into MongoDB for further analysis; also extracted files from MongoDB through Flume and processed them.
      • Expert knowledge of MongoDB and NoSQL data modeling, tuning, and disaster recovery backup; used MongoDB for distributed storage and processing via CRUD operations.
      • Extracted and restructured data into MongoDB using the import and export command-line utility tools.
      • Experienced in setting up fan-out workflows in Flume to design a V-shaped architecture that takes data from many sources and ingests it into a single sink.
      • Experienced in creating, dropping, and altering tables at run time without blocking updates and queries using HBase and Hive.
      • Experienced in working with different join patterns and implemented both map-side and reduce-side joins. Wrote Flume configuration files for importing streaming log data into HBase.
      • Imported several transactional logs from web servers with Flume to ingest the data into HDFS.
      • Loaded data into HBase using bulk load and non-bulk load.
      • Worked with the continuous integration tool Jenkins and automated jar builds at end of day. Worked with Tableau, integrated Hive with Tableau Desktop reports, and published them to Tableau Server.
      • Developed MapReduce programs in Java for parsing raw data and populating staging tables.
      • Experienced in setting up the whole application stack; set up and debugged Logstash to send Apache logs to AWS Elasticsearch.
      • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data. Analyzed SQL scripts and designed the solution to implement them using Scala.
      • Used Spark SQL to load JSON data, create a SchemaRDD, and load it into Hive tables; handled structured data using Spark SQL (see the sketch after this list).
      • Implemented Spark scripts using Scala and Spark SQL to access Hive tables in Spark for faster processing of data.
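
      A minimal PySpark sketch, kept in Python for consistency with the other examples, of the JSON-to-Hive loading pattern described above (the original implementation used Scala and SchemaRDDs); the file path and table name are hypothetical.

      from pyspark.sql import SparkSession

      # Spark session with Hive support so results can be written to the metastore.
      spark = (SparkSession.builder
               .appName("json-to-hive")
               .enableHiveSupport()
               .getOrCreate())

      # Hypothetical JSON source; Spark infers the schema on read.
      events = spark.read.json("hdfs:///data/raw/web_logs/*.json")

      # Register as a temporary view and run Spark SQL over the structured data.
      events.createOrReplaceTempView("web_logs")
      status_counts = spark.sql(
          "SELECT status, COUNT(*) AS hits FROM web_logs GROUP BY status")

      # Persist the structured result as a Hive table for downstream queries.
      status_counts.write.mode("overwrite").saveAsTable("analytics.web_log_status_counts")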

    • Airlines and Aviation
    • 700 & Above Employee
    • Data Engineer
      • Mar 2016 - Dec 2017

      • Gathered data and business requirements from end users and management. Designed and built data solutions to migrate existing source data from the data warehouse to the Atlas Data Lake (big data).
      • Analyzed huge volumes of data and devised simple and complex Hive and SQL scripts to validate data flow in various applications (a validation-query sketch follows this list).
      • Performed Cognos report validation. Made use of MHUB for validating data profiling and data lineage.
      • Devised PL/SQL statements, including stored procedures, functions, triggers, views, and packages. Made use of indexing, aggregation, and materialized views to optimize query performance.
      • Created reports using Tableau, Power BI, and Cognos to perform data validation.
      • Involved in creating Tableau dashboards using stacked bars, bar graphs, scatter plots, geographical maps, Gantt charts, etc. via the Show Me functionality, and built dashboards and stories as needed using Tableau Desktop and Tableau Server.
      • Performed statistical analysis using SQL, Python, R, and Excel.
      • Worked extensively with Excel VBA macros and Microsoft Access forms.
      • Imported, cleaned, filtered, and analyzed data using tools such as SQL, Hive, and Pig.
      • Used Python and SAS to extract, transform, and load source data from transaction systems, and generated reports, insights, and key conclusions.
      • Developed storytelling dashboards in Tableau Desktop and published them to Tableau Server, allowing end users to understand the data on the fly with quick filters for on-demand information.
      • Analyzed and recommended improvements for better data consistency and efficiency.
      • Designed and developed data mapping procedures for ETL (data extraction, data analysis, and loading) to integrate data using R programming.
      • Effectively communicated plans, project status, project risks, and project metrics to the project team, and planned test strategies in accordance with project scope.
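
      A minimal sketch, in PySpark for consistency with the other examples, of the kind of validation query used to check data flow after a warehouse-to-data-lake migration; the table names and the booking_date key are hypothetical, not from the original project.

      from pyspark.sql import SparkSession

      spark = (SparkSession.builder
               .appName("migration-validation")
               .enableHiveSupport()
               .getOrCreate())

      # Compare per-day row counts between the hypothetical warehouse staging
      # table and the corresponding data lake table.
      mismatches = spark.sql("""
          SELECT s.booking_date,
                 s.row_count AS source_rows,
                 t.row_count AS lake_rows
          FROM (SELECT booking_date, COUNT(*) AS row_count
                FROM warehouse_stage.bookings GROUP BY booking_date) s
          FULL OUTER JOIN
               (SELECT booking_date, COUNT(*) AS row_count
                FROM atlas_lake.bookings GROUP BY booking_date) t
            ON s.booking_date = t.booking_date
          WHERE COALESCE(s.row_count, -1) <> COALESCE(t.row_count, -1)
      """)

      # Any rows returned indicate dates where the two systems disagree.
      mismatches.show(truncate=False)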

Education

  • Nirma University, Ahmedabad, Gujarat, India
    Bachelor's degree, Computer Science
    2008 - 2012
