Sarada Kurapati

Big Data Engineer at LendingClub
Contact Information
us****@****om
(386) 825-5501
Location
Wheeling, Illinois, United States

Experience

    • LendingClub
    • United States
    • Financial Services
    • 700 & Above Employees
    • Big Data Engineer
      • Aug 2021 - Present

      • Developed Spark applications to implement data cleansing, validation, and processing activities on large-scale datasets ingested from traditional data warehouse systems.
      • Migrated existing on-premises applications and scripts from Java to a cloud-based platform (Azure cloud storage).
      • Used PySpark to process and analyze large datasets in a distributed manner.
      • Wrote PySpark scripts to perform complex data transformations and aggregations with improved performance.
      • Optimized PySpark applications using techniques such as caching, broadcast variables, and partitioning (see the sketch after this role).
      • Managed and monitored Azure resources such as virtual machines, storage accounts, and network resources to ensure high availability and performance.
      • Integrated Azure services such as Azure Stream Analytics and Azure Machine Learning to build real-time data processing pipelines.
      • Designed and developed scalable big data solutions using Hadoop and Azure cloud technologies.
      • Implemented data processing pipelines using Apache Spark.
      • Designed and developed data pipelines using Hive, Pig, and Spark to transform and analyze large datasets for real-time processing.
      • Maintained and optimized Hadoop clusters for high availability and performance.
      • Worked with Databricks and Oozie jobs to automate and schedule Hadoop workflows, resulting in a 25% increase in operational efficiency.
      • Conducted performance tuning and capacity planning to optimize Hadoop clusters for various workloads.
      • Defined workflows in Oozie to automate Hadoop jobs and other big data applications.
      • Created Oozie coordinators to schedule and manage multiple workflows.
      • Monitored and troubleshot Oozie workflows to ensure successful completion of jobs.
      • Collaborated with data scientists and analysts to understand their data needs and developed solutions to meet those needs using Hadoop and related technologies.
      • Wrote complex Hive queries to extract data from large datasets for analysis.
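
      The optimization bullet above describes PySpark; the minimal sketch below shows the same ideas — caching a reused DataFrame, broadcasting a small lookup table, and repartitioning before the write — using Spark's Scala API, which this profile also uses. All storage paths, container names, and column names are hypothetical, not details from this role.

      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.functions.broadcast

      object CleanseAndAggregate {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("cleanse-and-aggregate")   // hypothetical app name
            .getOrCreate()

          // Hypothetical source: raw records ingested from the warehouse into Azure cloud storage
          val raw = spark.read.parquet("abfss://landing@storageacct.dfs.core.windows.net/tx/")

          // Basic cleansing/validation: drop rows missing a key and keep non-negative amounts
          val cleansed = raw
            .filter(raw("account_id").isNotNull && raw("amount") >= 0)
            .cache()                            // reused below, so cache it

          // Small reference table: broadcast it to avoid a shuffle join
          val dim = spark.read.parquet("abfss://ref@storageacct.dfs.core.windows.net/accounts/")
          val enriched = cleansed.join(broadcast(dim), Seq("account_id"))

          // Repartition on the key before writing to keep output partitions balanced
          // (assumes an ingest_date column exists in the hypothetical schema)
          enriched
            .repartition(200, enriched("account_id"))
            .write.mode("overwrite")
            .partitionBy("ingest_date")
            .parquet("abfss://curated@storageacct.dfs.core.windows.net/tx_enriched/")

          spark.stop()
        }
      }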

    • United States
    • Computer and Network Security
    • 700 & Above Employees
    • Big Data Engineer
      • Sep 2019 - Jul 2021

      • Worked with Spark to create structured data from pools of unstructured data.
      • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
      • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
      • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
      • Documented requirements, including existing code to be reimplemented using Spark, Hive, HDFS, and Elasticsearch.
      • Maintained ELK (Elasticsearch, Kibana) and wrote Spark scripts using the Scala shell.
      • Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster data processing.
      • Developed Spark Streaming applications to consume data from Kafka topics and insert the processed streams into HBase (a related sketch follows this role).
      • Provided a continuous discretized stream (DStream) of data with a high level of abstraction using Spark Structured Streaming.
      • Moved transformed data to the Spark cluster, where it was published to the application via Kafka.
      • Created a Kafka producer to connect to different external sources and bring the data to a Kafka broker.
      • Handled schema changes in the data stream using Kafka.
      • Developed new Flume agents to extract data from Kafka.
      • Connected Structured Streaming to a Kafka broker to obtain structured data by schema.
      • Analyzed and tuned the Cassandra data model for multiple internal projects and worked with analysts to model Cassandra tables from business rules and enhance/optimize existing tables.
      • Designed and deployed new ELK clusters.
      • Created log monitors and generated visual representations of logs using the ELK stack.
      • Implemented CI/CD tooling for upgrade, backup, and restore.
      • Played a key role in installing and configuring big data ecosystem tools such as Elasticsearch, Logstash, Kibana, Kafka, and Cassandra.
      • Reviewed functional and non-functional requirements on the Hortonworks Hadoop project.
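
      A minimal Structured Streaming sketch in Scala, reading a Kafka topic and writing each micro-batch out through the Spark Cassandra connector (the role also mentions HBase sinks; Cassandra is shown here since both are named above). The broker address, topic, schema, keyspace, and table names are assumptions for illustration only.

      import org.apache.spark.sql.{DataFrame, SparkSession}
      import org.apache.spark.sql.functions._
      import org.apache.spark.sql.types._

      object KafkaToCassandraStream {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("kafka-to-cassandra")
            .config("spark.cassandra.connection.host", "cassandra-host") // assumed contact point
            .getOrCreate()
          import spark.implicits._

          // Hypothetical schema for JSON payloads on the topic
          val schema = new StructType()
            .add("event_id", StringType)
            .add("user_id", StringType)
            .add("event_time", TimestampType)

          // Read the raw stream from a Kafka topic
          val raw = spark.readStream
            .format("kafka")
            .option("kafka.bootstrap.servers", "broker1:9092")
            .option("subscribe", "events")
            .load()

          // Turn the unstructured value bytes into structured columns
          val events = raw
            .select(from_json($"value".cast("string"), schema).as("e"))
            .select("e.*")

          // Write each micro-batch to Cassandra via the connector's DataFrame source
          val query = events.writeStream
            .foreachBatch { (batch: DataFrame, _: Long) =>
              batch.write
                .format("org.apache.spark.sql.cassandra")
                .options(Map("keyspace" -> "analytics", "table" -> "events")) // hypothetical
                .mode("append")
                .save()
            }
            .option("checkpointLocation", "/tmp/checkpoints/events")
            .start()

          query.awaitTermination()
        }
      }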

    • United States
    • Pharmaceutical Manufacturing
    • 700 & Above Employees
    • Big Data Engineer
      • Dec 2018 - Aug 2019

      • Ingested data from sources such as MySQL, Oracle, and CSV files.
      • Used Apache Sqoop to import and export data between HDFS and external RDBMS databases (MySQL) and CSV files.
      • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
      • Worked with Spark and Scala, mainly on Claims Invoice Ingestion Framework exploration for the transition from Hadoop/MapReduce to Spark.
      • Wrote Spark jobs to transform the data, calculating and grouping vendor payment status on HDFS and storing it in Hive tables and Kafka topics (see the sketch after this role).
      • Performed Spark transformations using DataFrames and Spark SQL.
      • Generated business reports from data stored in Hive tables for display in the dashboard.
      • Improved performance and optimized existing Hadoop algorithms with Spark using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
      • Implemented a Spark Streaming framework that processes data from Kafka and performs analytics on top of it.
      • Migrated the needed data from Oracle and MySQL into HDFS using Sqoop and imported various formats of flat files into HDFS.
      • Worked in an Agile development approach.
      • Developed a strategy for full and incremental loads using Sqoop.
      • Implemented a POC to migrate MapReduce jobs into Spark RDD transformations.
      Tools Used: Python, Teradata, Netezza, Oracle 12c, PySpark, SQL Server, UML, MS Visio, Oracle Designer, SQL Server 2012, Cassandra, Azure
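
      A short Scala sketch of the vendor-payment-status job described above: read invoice data landed on HDFS (e.g., via Sqoop), group by vendor and payment status with Spark SQL, and persist the result to a Hive table. The HDFS path, column names, and Hive database/table are hypothetical.

      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.functions._

      object VendorPaymentStatus {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("vendor-payment-status")
            .enableHiveSupport()                // allows writing results to Hive tables
            .getOrCreate()

          // Hypothetical location of invoices imported from Oracle/MySQL via Sqoop
          val invoices = spark.read
            .option("header", "true")
            .option("inferSchema", "true")
            .csv("hdfs:///data/claims/invoices/")

          // Group invoices by vendor and payment status, counting invoices and summing amounts
          val statusByVendor = invoices
            .groupBy("vendor_id", "payment_status")
            .agg(count("*").as("invoice_count"), sum("amount").as("total_amount"))

          // Persist for downstream business reports/dashboards (hypothetical database and table)
          statusByVendor.write.mode("overwrite").saveAsTable("reporting.vendor_payment_status")

          spark.stop()
        }
      }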

    • United States
    • IT Services and IT Consulting
    • 200 - 300 Employees
    • Hadoop Developer
      • Apr 2017 - Nov 2018

      • Consumed REST APIs and wrote source code for use in the Kafka program.
      • Worked on various real-time and batch processing applications using Spark/Scala, Kafka, and Cassandra.
      • Built Spark applications to perform data enrichment and transformations using Spark DataFrames with Cassandra lookups.
      • Used the DataStax Spark Cassandra Connector to extract and load data to/from Cassandra (see the sketch after this role).
      • Worked in a team to develop an ETL pipeline that extracted Parquet-serialized files from S3 and persisted them in HDFS.
      • Developed a Spark application that uses Kafka consumer and broker libraries to connect to Apache Kafka, consume data from topics, and ingest it into Cassandra.
      • Developed applications involving big data technologies such as Hadoop, Spark, MapReduce, YARN, Hive, Pig, Kafka, Oozie, Sqoop, and Hue.
      • Worked on Apache Airflow, Apache Oozie, and Azkaban.
      • Designed and implemented a data ingestion framework to load data into the data lake for analytical purposes.
      • Developed data pipelines using Hive, Pig, and MapReduce.
      • Wrote MapReduce jobs.
      • Administered clusters in the Hadoop ecosystem.
      • Installed and configured Hadoop clusters for major Hadoop distributions.
      • Designed the reporting application that uses Spark SQL to fetch data and generate reports on Hive.
      • Analyzed data using Hive and wrote user-defined functions (UDFs).
      • Used AWS services such as EC2 and S3 for small-dataset processing and storage.
      • Executed Hadoop/Spark jobs on AWS EMR using programs and data stored in S3 buckets.
      Tools Used: Spark, Scala, AWS, DBeaver, Zeppelin, SSIS, Cassandra, Workspace, C# scripting
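
      A minimal Scala sketch of the Cassandra-lookup enrichment pattern named above, using the DataStax Spark Cassandra Connector's DataFrame source: Parquet events already persisted in HDFS are joined against a Cassandra reference table and written back to the data lake. The contact point, keyspace, table, paths, and join key are assumptions.

      import org.apache.spark.sql.SparkSession

      object EnrichWithCassandra {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("enrich-with-cassandra")
            // Cassandra contact point is an assumption; set via the connector's config key
            .config("spark.cassandra.connection.host", "cassandra-host")
            .getOrCreate()

          // Parquet events previously extracted from S3 and persisted in HDFS (hypothetical path)
          val events = spark.read.parquet("hdfs:///datalake/raw/events/")

          // Lookup table read through the Spark Cassandra Connector (hypothetical keyspace/table)
          val customers = spark.read
            .format("org.apache.spark.sql.cassandra")
            .options(Map("keyspace" -> "crm", "table" -> "customers"))
            .load()

          // Enrich events with customer attributes and write back to the data lake
          events.join(customers, Seq("customer_id"))
            .write.mode("overwrite")
            .parquet("hdfs:///datalake/curated/events_enriched/")

          spark.stop()
        }
      }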

    • India
    • IT Services and IT Consulting
    • 100 - 200 Employees
    • Java Developer
      • Feb 2015 - Mar 2017

      • Implemented a multi-threaded environment and used most of the interfaces in the Collections framework, applying core Java concepts.
      • Involved in developing code using major Spring Framework concepts: Dependency Injection (DI) and Inversion of Control (IoC).
      • Used the Spring MVC framework to implement RESTful web services, reducing integration complexity and easing maintenance.
      • Used Bootstrap to create responsive web pages that display properly at different screen sizes.
      • Used Git as the version control tool to track work progress and attended daily Scrum sessions.
      • Developed interactive web pages using Angular, HTML5, CSS, and JavaScript.
      • Built REST web services with a backend server to handle requests sent from the front end.
      • Worked on stored procedures, user-defined functions, and views; implemented error handling in stored procedures and SQL objects and modified existing stored procedures.
      • Functionality included writing code in HTML, CSS, JavaScript, JSON, and Bootstrap, with a MySQL database as the backend.
      • Involved in the design and development of a user-friendly enterprise application using Java, Spring, Hibernate, web services, and Eclipse.
      • Developed and enhanced the application using Java and J2EE (Servlets, JSP, JDBC, JNDI, EJB), RESTful web services, HTML, JSON, XML, Maven, and a MySQL database.
      • Used Git for source control management, giving a large speed advantage over centralized systems that must communicate with a server.
      Tools Used: Java/J2EE, core Java, Spring, Hibernate, Git, MySQL database, Maven, RESTful web services, HTML, HTML5, CSS, JavaScript, Bootstrap, JSON, XML

Education

  • KL University
    Bachelor's degree, Electrical and Electronics Engineering
    2011 - 2015
