Vaibhav Shenoy M
Data Engineer at Tech Active
Credentials
- Astronomer Certification for Apache Airflow Fundamentals
  Astronomer, Aug 2023 - Nov 2024
- Preparing for Google Cloud Certification: Cloud Data Engineer
  Coursera, Nov 2022 - Nov 2024
- Databricks Lakehouse Fundamentals
  Databricks, Sep 2022 - Nov 2024
- dbt Fundamentals
  dbt Labs, Aug 2023 - Nov 2024
- Databricks Certified Associate Developer for Apache Spark 3.0
  Databricks, Dec 2022 - Nov 2024
Experience
Tech Active
India, IT Services and IT Consulting, 1 - 100 Employees
- Data Engineer
  Aug 2023 - Present
Draup
United States, Software Development, 500 - 600 Employees
- Data Analyst
  Sep 2022 - Aug 2023
* Implemented automation using PySpark, reducing the manual process by 70%.
* Leveraged scalable architecture concepts such as partitioning, caching, and serialization to optimize Spark code, and used tools such as the Spark UI to fine-tune application performance and achieve significant improvements in execution time.
* Worked with a full-stack developer to identify and resolve data discrepancies and data loss, and provided critical data insights and analysis to improve overall data accuracy and reliability.
- Associate Data Analyst
  Jun 2020 - Sep 2022
* Automated several data request jobs, helping researchers access the NoSQL (MongoDB) database per client request (PySpark/SQL, Databricks/EMR).
* Created daily alert emails to monitor the counts of newly harvested data, and weekly alert emails to test for data loss between ETL-gateway prod (PySpark/SQL, Databricks/EMR).
* Scheduled several jobs using the Airflow DAG scheduler (Airflow/Git).
* Migrated jobs from Databricks to EMR.
* Took ownership of data requirements from scratch to production.
* Worked with the ETL team to develop flexible solutions (PySpark/Python/EMR).
* Derived insights from a 700M+ dataset (PySpark/SQL/Databricks/Excel).
* Strong hands-on experience designing and developing large-scale Python/Spark applications.
* In-depth knowledge of PySpark 3, RDDs, DataFrames/Datasets, S3, and CSV and Avro operations.
* Modelled data as per requirements.
Education
Imarticus
Postgraduate Diploma in Data Analytics, Data Analyst
Sahyadri College of Engineering & Management
Bachelor of Engineering - BE, Electronics and Communications Engineering