Kishan Pakalapati
Data Engineer at AnalytIQs Inc
Credentials
-
Fundamentals of the Databricks Lakehouse Platform
Databricks, Jul 2023 - Nov 2024
-
Microsoft Certified: Azure Data Fundamentals
Microsoft, Mar 2023 - Nov 2024
-
Microsoft Certified: Azure Data Fundamentals
Microsoft, Jun 2023 - Nov 2024
Experience
-
AnalytIQs Inc
-
United States
-
Information Technology & Services
-
1 - 100 Employees
-
Data Engineer
-
Jan 2023 - Present
As a Big Data Developer, implemented AWS Glue in combination with other AWS services to orchestrate ETL jobs that build data warehouses and data lakes. Generated output streams and created notifications to monitor job runs. Built multiple dashboards and optimized data pipelines using the AWS suite and Amazon QuickSight to provide data insights and reporting. Implemented AWS Step Functions to automate and orchestrate Amazon SageMaker tasks such as publishing data to S3, training an ML model, and deploying it for prediction. Implemented solutions for ingesting data from various sources and processing data at rest using Big Data technologies such as Hadoop, MapReduce frameworks, HBase, Hive, Oozie, Flume, Sqoop, and others. Implemented a generic, highly available ETL framework that uses Spark to bring related data from various sources into Hadoop and Cassandra. Used Amazon EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB. Integrated Apache Spark with Kafka to perform web analytics. Uploaded clickstream data from Kafka to HDFS, HBase, and Hive by integrating with Spark. Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 and ORC/Parquet/text files into Amazon Redshift.
-
Siemens Healthineers
-
Hospitals and Health Care
-
700 & Above Employees
-
Data Analyst
-
Jan 2018 - Jun 2021
Worked as part of a research and development team on healthcare data, responsible for designing, implementing, and managing data using Azure services. Collaborated with data architects and stakeholders to understand objectives and requirements. Used Azure Databricks and Azure Data Factory (ADF) to transform raw data into structured, actionable data through data cleansing and aggregation, maintaining data quality. Created and maintained optimal data pipeline architecture in Microsoft Azure using Data Factory and Azure Databricks. Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks. Developed a file format converter application to convert raw data into processable data for annotations. Employed Azure Data Lake Storage (ADLS), Azure SQL databases, and Azure Cosmos DB to set up data storage, and managed data partitioning, indexing, and optimization. Collaborated with business analysts to develop dashboards using tools such as Power BI. Collaborated with data scientists to provide data for ML algorithms. Conducted performance tuning, testing, debugging, and optimization to improve data processing.
Education
-
Governors State University
Master's degree, Health Informatics
-
Rajiv Gandhi University (A Central University)
Bachelor of Technology