Rohit Reddy

Sr. Data Engineer at Catalist
Contact Information
Location
Chicago, Illinois, United States


Experience

    • United States
    • Political Organizations
    • 1 - 100 Employee
    • Sr. Data Engineer
      • Nov 2020 - Present

      ● Evaluate, extract, and transform data for analytical purposes in a Big Data environment; involved in designing ETL processes and developing source-to-target mappings.
      ● In-depth understanding of Spark architecture, including Spark Core, Spark SQL, and DataFrames.
      ● Developed Spark applications in Python (PySpark) to transform data according to business rules.
      ● Incorporated AWS-native DevOps services to support development and deployment: Git, CodeBuild, CodeDeploy, CodePipeline, and CloudFormation/Cloud Development Kit (CDK).
      ● Performed cloud development and automation using Node.js, Python (Boto3), AWS Lambda, AWS CDK (Cloud Development Kit), and AWS SAM (Serverless Application Model).
      ● Built data pipelines in Airflow on GCP for ETL jobs using different Airflow operators.
      ● Development tools used extensively include Spark, Drill, Hive, and PostgreSQL.
      ● Used Hive queries in Spark SQL for analyzing and processing data.
      ● Worked on many ETL jobs creating aggregate tables through transformations and actions such as joins and summing amounts.
      ● Implemented a serverless architecture using API Gateway, Lambda, and DynamoDB, and deployed AWS Lambda code from Amazon S3 buckets; created a Lambda deployment function and configured it to receive events from an S3 bucket.
      ● Built data warehouse structures, creating fact, dimension, and aggregate tables through dimensional modeling with Star and Snowflake schemas.
      ● Used Spark SQL to load data into Hive tables and wrote queries to fetch data from those tables; implemented partitioning and bucketing in Hive.
      ● Experienced in performance tuning of Spark applications: setting the correct level of parallelism and tuning memory.
      ● Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcasts in Spark, and effective, efficient joins.
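The serverless pattern described above (S3 event → Lambda → DynamoDB) can be sketched minimally in pure Python. The table object is injected so the handler can be exercised without AWS; in a deployed function it would typically be a boto3 DynamoDB Table resource, and the table name and item fields here are hypothetical.

```python
def handle_s3_event(event, table):
    """Lambda-style handler: for each S3 object-created record in the
    event, write one row to the supplied DynamoDB-like table.

    `table` is injected for testability; in a real Lambda it would be
    boto3.resource("dynamodb").Table("ingest-log")  # hypothetical name
    """
    written = []
    for record in event.get("Records", []):
        s3 = record["s3"]  # standard S3 event notification shape
        item = {
            "bucket": s3["bucket"]["name"],
            "key": s3["object"]["key"],
            "size": s3["object"].get("size", 0),
        }
        table.put_item(Item=item)  # same call shape as a boto3 Table
        written.append(item["key"])
    return {"statusCode": 200, "written": written}
```

Because the table is a parameter, the handler can be unit-tested with any object exposing `put_item(Item=...)`, which is one common way to keep Lambda code testable offline.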

    • United States
    • Information Technology & Services
    • 700 & Above Employee
    • Data Engineer
      • Apr 2018 - Oct 2020

      ● Involved in migrating Hive/SQL queries into Spark transformations using Spark RDDs and PySpark.
      ● Developed weekly end-to-end product flows using Scala and deployed them on the YARN cluster; compared the performance of Spark with Hive and SQL/Teradata.
      ● Developed real-time tracking of class schedules using Node.js (Socket.IO, based on socket technology, and the Express.js framework).
      ● Developed a presentation tier using JSP, JavaScript, HTML, and CSS for manipulating, validating, and customizing error messages in the user interface.
      ● Developed internationalized, multi-tenant SaaS solutions with responsive UIs using Java, React, or AngularJS, with Node.js and CSS.
      ● Formulated procedures for integrating R programming plans with data sources and delivery systems.
      ● Performed statistical analyses in R.
      ● Configured and maintained fully automated pipelines with custom and integrated monitoring methods using Jenkins and Docker.
      ● Implemented end-to-end flows in Airflow DAGs.
      ● Developed Python scripts to generate dynamic Hive queries based on reference data.
      ● Tuned ETL jobs/procedures/scripts, SQL queries, PL/SQL, Kafka, and Spring Boot procedures to improve system performance.
      ● Coordinated with the business about anomalies/outliers observed on a day-to-day basis and identified solutions to fix them.
      ● Expert in working with Hive tables: distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
      ● Processed and loaded bounded and unbounded data from Google Pub/Sub topics into BigQuery using a cloud data platform.
      ● Created Spark clusters and configured high-concurrency clusters using Azure Databricks to speed up the preparation of high-quality data.
      ● Opened SSH tunnels to Google Dataproc to access the YARN manager and monitor Spark jobs.
      ● Developed automation scripts to migrate data from one cluster to another.
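Generating dynamic Hive queries from reference data, as mentioned above, usually reduces to validated string assembly. A minimal sketch of that idea follows; the table, column, and partition names are illustrative, not taken from any actual job.

```python
def build_hive_query(table, columns, partition_col, partition_val):
    """Generate a HiveQL SELECT for one partition of a table.

    All identifiers are assumed to come from trusted reference data;
    this sketch still does a simple safety check before assembling
    the query string.
    """
    for ident in [table, partition_col, *columns]:
        # allow only alphanumerics and underscores in identifiers
        if not ident.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {ident!r}")
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} FROM {table} "
        f"WHERE {partition_col} = '{partition_val}'"
    )
```

A script like this would typically loop over rows of a reference table, emit one query per entity, and hand the queries to Hive or Spark SQL for execution.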

    • Information Technology & Services
    • 200 - 300 Employee
    • Data Engineer
      • Oct 2016 - Mar 2018

      ● Developed data pipelines using Kafka, Sqoop, and Spark Streaming to ingest customer behavioral data and financial histories into HDFS, HBase, and Cassandra for analysis.
      ● Involved in collecting and aggregating large amounts of log data using Apache Flume and staging the data in HDFS for further analysis.
      ● Created S3 buckets, managed their policies, and utilized S3 and Glacier for storage and backup on AWS; spun up an EMR cluster to run all Spark jobs.
      ● Used Amazon Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism; worked with Terraform for configuration changes and used Terraform apply for deployment changes.
      ● Wrote test cases, analyzed results, and reported them to product teams.
      ● Responsible for developing the data pipeline that used Sqoop to extract data from an RDBMS and store it in HDFS.
      ● Installed the Oozie workflow engine to run multiple Spark, Hive, and Sqoop jobs; used Sqoop to import and export data between HDFS and RDBMS for visualization and report generation.
      ● Involved in migrating ETL processes from Oracle to Hive to test ease of data manipulation.
      ● Worked on functional, system, and regression testing activities within an agile methodology.
      ● Worked on a Python plugin for MySQL Workbench to upload CSV files.
      ● Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
      ● Developed process workflows using Apache NiFi to extract, transform, and load raw data into HDFS and then process it into Hive tables.
      ● Worked with NoSQL databases like HBase.
      ● Used security groups, network ACLs, internet gateways, and route tables to ensure a secure zone for the organization in the AWS public cloud.
      ● Created Hive tables using Apache NiFi and loaded data into the tables using Hive Query Language.
      ● Developed real-time data pipelines using Kafka and Spark Streaming.
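The Kafka → Spark Streaming ingestion described above boils down to keyed aggregation over micro-batches of records. A pure-Python stand-in for that core logic is sketched below; the (customer_id, amount) record shape is hypothetical, standing in for whatever a Kafka consumer would deliver.

```python
from collections import defaultdict

def aggregate_batch(events):
    """Mimic a Spark Streaming reduceByKey over one micro-batch:
    sum amounts per customer.

    `events` is an iterable of (customer_id, amount) pairs -- the kind
    of keyed records a Kafka topic would feed into a streaming job
    before the results are written to HBase or Cassandra.
    """
    totals = defaultdict(float)
    for customer_id, amount in events:
        totals[customer_id] += amount
    return dict(totals)
```

In the real pipeline, Spark would apply this reduction in parallel per partition and merge partial results; the sketch shows only the single-batch semantics.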

    • United Kingdom
    • Financial Services
    • 700 & Above Employee
    • Software Developer
      • Aug 2012 - Feb 2016

      ● Involved in various phases of the Software Development Life Cycle (SDLC) of the application: requirement gathering, design, analysis, and code development.
      ● Worked with an online system to design, code, unit test, build, and perform system and integration testing.
      ● Developed JSP (JavaServer Pages) pages from HTML and detailed technical design specification documents; pages included HTML, CSS, JavaScript, and Hibernate.
      ● Coded Spring controller classes for handling requests and configured the spring-servlet configuration file.
      ● Consumed web services using SOAP-based requests.
      ● Used agile strategies to provide quick, feasible solutions to the organization.
      ● Developed web applications using the Spring MVC framework and Hibernate.
      ● Wrote JUnit tests to verify the code and performed code reviews; used FindBugs to find bugs and improve code quality.
      ● Involved in writing PL/SQL queries and stored procedures.
      ● Interfaced with Spring to code the business logic for the web client layer, involving J2EE design patterns.
      ● Involved in creating custom interceptors for validation purposes.
      ● Analyzed and fixed defects in the login application.
      Environment: Core Java, J2EE, JUnit, Eclipse, DHTML, JSP, HTML/CSS, XML, SOAP, Hibernate, Spring, JavaScript, JBoss Application Server, PL/SQL, CVS, Oracle, IBM RAD, and UNIX.

Education

  • Osmania University
    Computer Science, 79%
    2008 - 2012
