Shahbaaz Ahmed

Data Engineer at Mettle
Contact Information
us****@****om
(386) 825-5501
Location
Manchester, England, United Kingdom


Experience

    • United Kingdom
    • Financial Services
    • 100 - 200 Employee
    • Data Engineer
      • Aug 2020 - Present

      Techs: GCP, BigQuery, Kafka, Akka Streams, Kafka Streams, Scala, Python, Kubernetes, Docker, Terraform, Airflow, DBT

    • London, United Kingdom
    • Data Engineer
      • Jan 2020 - Aug 2020

      Techs: Airflow, Python, Kafka, Kafka Streams, Kafka Connect, Scala, Kubernetes, Helm Charts

    • United Kingdom
    • Financial Services
    • 200 - 300 Employee
    • Data Engineer
      • Sep 2019 - Oct 2019

      Summary: Worked in the data team, with plans to build an event-driven data pipeline on GCP. Investigated Google Cloud Platform big data technologies such as Apache Beam and Dataflow, including Scio, the open-source Scala library for Apache Beam. The company then decided to change platforms to Azure, which led to investigating Azure big data technologies such as Event Hubs and Azure Databricks.

    • United Kingdom
    • Market Research
    • 1 - 100 Employee
    • Data Engineer
      • Jul 2017 - Sep 2019

      Summary: Worked in the data engineering pipeline team building data platforms and pipelines using AWS, Spark, and Scala. Built a data platform (data lake) giving analysts easy access to all the data they need to perform analyses.
      Projects:
      • Media player parser: Built the capability to parse Netflix passive data from iPhone devices. Our media feed consists of media players on several platforms; the Netflix player on iOS was particularly tricky because the data arrives in a JSON graph format. Development included writing good unit tests, building a metadata store to enrich the data feed, and calculating durations for viewing sessions. Overall this gave me more exposure to the Scala language and test-driven development.
      • Data lake: Built a new data lake on AWS S3 to centralise all data feeds and products, enabling faster product deployment and releases and making data easier for analysts to query. Migrated all products to the data lake, upgrading from Spark 1.6 to Spark 2.3; used Jenkins pipelines for continuous deployment, Bitbucket for continuous integration, and Azkaban for scheduling on AWS EMR. Migrated customer nightly reporting to the data lake, yielding large AWS cost savings and faster reporting. AWS CloudFormation was used for all resources so they could easily be replicated across environments; KMS was used for encrypting the data, and IAM roles for application permissions.
      • Event-driven streaming pipeline POC: After investigating Spark Structured Streaming for a streaming pipeline, we wanted to explore event-driven stream processing using serverless AWS services. Built an end-to-end event-driven streaming pipeline that processed new raw data in close to real time, processing it with AWS Lambda and using AWS Kinesis Firehose to convert the data to Parquet format when writing out to AWS S3. The coding language of choice was Scala.
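The viewing-session duration step described above can be sketched roughly as follows (a minimal Python illustration, not the production Scala code; the timestamp format and the 30-minute session gap are hypothetical assumptions):

```python
from datetime import datetime

def session_durations(events, gap_seconds=1800):
    """Group timestamped play events into viewing sessions and return
    each session's duration in seconds. A new session starts whenever
    the gap between consecutive events exceeds `gap_seconds`
    (hypothetical threshold)."""
    times = sorted(datetime.fromisoformat(t) for t in events)
    if not times:
        return []
    durations, start, prev = [], times[0], times[0]
    for t in times[1:]:
        if (t - prev).total_seconds() > gap_seconds:
            # gap too large: close the current session, open a new one
            durations.append((prev - start).total_seconds())
            start = t
        prev = t
    durations.append((prev - start).total_seconds())
    return durations

events = [
    "2019-01-01T20:00:00", "2019-01-01T20:10:00",  # session 1: 600 s
    "2019-01-01T22:00:00", "2019-01-01T22:20:00",  # session 2: 1200 s
]
print(session_durations(events))  # [600.0, 1200.0]
```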

    • IT Services and IT Consulting
    • 700 & Above Employee
    • Data Scientist
      • Apr 2016 - Jul 2017

      Responsibilities included performing exploratory analysis, applying machine learning models, building machine learning pipelines, using the big data platforms Hadoop and Spark, and presenting results to stakeholders, along with involvement in team technical design meetings and client engagement.
      Recent projects:
      • Predictive maintenance on aircraft engines POC: The purpose of the POC was to demonstrate to clients the value of predictive maintenance over preventive maintenance. Open aircraft datasets were used; the data was time series with sensor attributes, on which anomaly detection methods were applied to predict failure. Techs used: R, R Shiny for the dashboard.
      • Text analytics on software logs: The purpose of the project was to analyse software logs from POS retail systems, in particular the free-text message field, to find patterns. Processed unstructured data from log files and cleaned it using Python, Pandas, and NLP techniques such as Bag of Words with CountVectorizer from scikit-learn. Work was carried out in Jupyter notebooks.
      • Outlier detection of POS devices: Detected outliers in sensor time series data recorded on POS devices to find patterns in devices approaching end of life. Exploratory analysis and outlier detection were done using PySpark, connecting to the YARN Spark cluster where the data for each POS till type resides; both Zeppelin and Jupyter notebooks were used.
      Main techs: Python, R, Scala, Spark, Hadoop (HDFS & Hive), Jupyter, Zeppelin
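The bag-of-words step used in the log-analytics project can be illustrated with a minimal standard-library sketch of what scikit-learn's CountVectorizer computes (the sample log messages and the simple tokenisation rule here are hypothetical; the real work used CountVectorizer with fuller preprocessing):

```python
import re
from collections import Counter

def bag_of_words(messages):
    """Minimal bag-of-words: a sorted vocabulary plus per-message
    token counts, roughly the document-term matrix CountVectorizer
    produces (without its extra options like stop words or n-grams)."""
    tokenize = lambda m: re.findall(r"[a-z]+", m.lower())
    vocab = sorted({tok for m in messages for tok in tokenize(m)})
    rows = [[Counter(tokenize(m))[t] for t in vocab] for m in messages]
    return vocab, rows

# Hypothetical free-text log messages from a POS system.
vocab, matrix = bag_of_words([
    "printer timeout error",
    "timeout error retrying",
])
print(vocab)   # ['error', 'printer', 'retrying', 'timeout']
print(matrix)  # [[1, 1, 0, 1], [1, 0, 1, 1]]
```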

Education

  • The Manchester Metropolitan University
    MRes Advanced Computer Science, Computer Science - Artificial Intelligence
    2013 - 2015
  • The Manchester Metropolitan University
    BSc (Hons), Software Engineering
    2010 - 2013
  • Loreto College
    BTEC National Diploma, Information Technology
    2008 - 2010
  • Sale High School
    2003 - 2008
