Arunava Addy
Senior Data Engineer at DataGrokr
Bio
Credentials
-
SnowPro Core Certification
Snowflake, Jun 2023 - Nov 2024
-
Google Cloud Certified Associate Cloud Engineer
Google Cloud Skills Boost, Oct 2021 - Nov 2024
-
Microsoft Certified: Azure Data Fundamentals
Microsoft, Nov 2022 - Nov 2024
Experience
-
DataGrokr
-
India
-
IT Services and IT Consulting
-
1 - 100 Employees
-
Senior Data Engineer
-
Aug 2016 - Present
Energy & Resources Industry:
- Designed, developed, and implemented fully automated data engineering pipelines to build a cloud data platform using GCP services such as Pub/Sub, Dataflow, BigQuery, Cloud Functions, and Google Kubernetes Engine.
- Built data warehouse solutions in BigQuery.
- Used the ETL tool Dell Boomi to run scheduled batch jobs that connected to the Salesforce database and fetched data, then ingested the data into BigQuery tables and a Hive database hosted on a Google Dataproc cluster.
- Ingested streaming data (500+ records per second) from Data van on-site mobile IT command centres. The data was published through Google Cloud Pub/Sub topics, then consumed and processed by Cloud Dataflow, Cloud Functions, and Kubernetes pods before final ingestion into BigQuery.
- Led the planning and execution of a migration that moved the existing data pipelines for 15 data sources from the Dell Boomi ETL and orchestration tool to a Docker-containerized framework on GCP. This saved the cost of the Dell Boomi license and overcame the shortcomings of the tool.
- Built a robust monitoring and alerting system on the Docker-containerized framework to capture code execution logs, GCP service logs, and container logs, and to visualize log summaries in Google Data Studio.
- Led the design and implementation of Azure DevOps CI/CD processes for deployment on GCP.
- Provided managerial cover for the team: project coordination, client communication, requirement gathering, delivery management, timesheet maintenance, etc.
Technology: Docker, Kubernetes, Cloud Functions, Dataflow, Cloud Storage, BigQuery, Dataproc, GKE, Cloud Composer, Pub/Sub, Container Registry, Azure DevOps, GitHub, Python, SQL, shell script.
-
-
Data Ops Engineer
-
2020 - Present
Retail Industry:
- Managed a team of 9 engineers in a Data Ops project for a retail giant in the US.
- Led L1/L2 operations support, covering error handling in the data pipelines, latency in data ingestion processes, and error handling in Snowflake tasks and Databricks ML jobs, as well as ServiceNow incidents, RITs, change requests, release management, process improvement, and performance improvement of code and pipelines.
- Led the migration of the monitoring and alerting process from email alerts to customized monitoring dashboards in Power BI. Built interactive Power BI dashboards and reports for monitoring data operations and for the BI team to gain insight into the data.
- Built cross-functional relationships with 30 engineers, 3 PMs, and 3 teams in different locations to understand current operational issues and drive them to a solution.
Technology: Snowflake, Power BI, GCP.
-
-
Data Engineer
-
2021 - 2022
Payment Processing Industry:
- Led a team of 4 engineers in a Data Engineering project for a credit card merchant company. Guided the team on architectural overview, securing the environment, client communication, timesheet maintenance, documentation, etc.
- Created PCI DSS compliance guidelines based on the sensitivity of cardholder data and the AWS services used in the project, such as Redshift, IAM, DynamoDB, and Glue DataBrew.
- Performed security best-practice checks against the PCI DSS compliance guidelines using AWS Security Hub. Set up aggregated alerts and automated remediation using built-in security policies.
- Used CloudWatch to schedule a Lambda function that reads the report from Security Hub and sends mail through SNS at a specific interval.
Technology: AWS Redshift, Glue DataBrew, IAM, S3, Security Hub.
-
-
MapR Hadoop Administrator
-
2016 - 2018
Insurance Industry:
MapR cluster administrator. I was responsible for planning, designing, and building clusters; setting up the MapR distribution and ecosystem components on the cluster nodes; provisioning users; configuring nodes; maintaining, supporting, and upgrading the big data platform in an on-premise environment; testing distribution functionality; handling data workloads and resources; high availability; monitoring services; and preparing the repository packages for the application team. Performed performance tuning on Hive and Spark. As needed by the Data Analysis team, explored the capabilities of tools such as Dremio, OvalEdge, Zeppelin, Hue, Drill, and Neo4j, and gave demos to the team.
-
-
-
Societe Generale Global Solution Centre
-
India
-
Information Technology & Services
-
700 & Above Employees
-
Unix System Administrator
-
Sep 2012 - Aug 2016
I was part of the Unix support team for the Market division at Societe Generale Global Solution Centre. Some of the tasks done on the job were as follows:
- Day-to-day work was driven by tickets raised in ticketing tools such as itrack and impulse. Worked on production servers by raising releases and acted on server alerts (unreachable hosts, AD-related alerts, swap, CPU) based on priority.
- Installed operating systems on HP ProLiant, Sun SPARC, and HP Blade hardware by creating workflows; upgraded server operating systems by backing up old data, planning the activity, and coordinating with affected teams.
- Discovered LUNs (storage space from a given storage box) on Linux and Solaris. Created and extended file systems, including concatenated and striped file systems on EXT4, ZFS, and Veritas. Created raw devices and Sybase devices on Linux and Solaris.
- Automated jobs with basic shell scripts and scheduled them with cron. Shared files through NFS and Samba.
- Installed patches, kernel upgrades, and application packages as required. Decommissioned file systems and servers. Restored data from the backup server or tape after accidental deletion.
-
-
-
Mindteck
-
India
-
IT Services and IT Consulting
-
700 & Above Employees
-
Unix and Storage Administrator
-
Dec 2010 - Aug 2012
Client: NetApp
I worked as a Unix and Storage Administrator in a NetApp data center environment. Our company tested different aspects of the NetApp filers (storage devices), such as OS components, firmware and builds, network testing, and stress testing, to name a few.
- Maintained the NIS server and hosted the NIS users' storage space on a NetApp filer (storage controller) over NFS. Created Samba shares for NIS users.
- Handled the build server and home directories. Managed the web server and hosted the builds through it.
- Resolved day-to-day Linux client issues such as network issues, OS crashes, package installations, connectivity problems, and patch updates. Maintained the DNS server and the gateway. Created and managed an IPv6 DHCP server on Linux.
- Configured NetApp filers from scratch, including console setup, boot-up, and IP address configuration. Created and managed aggregates, volumes, and FlexVols on filers. Shared and maintained storage shares over NFS and CIFS.
- Maintained a lab environment for running test cases on filers. Troubleshot connectivity issues to filers, including switch and network cabling issues.
-
-
Education
-
Visvesvaraya Technological University
Bachelor of Engineering - BE, Mechanical Engineering