Thabarak Khan

Hadoop Developer at Energy Future Holdings
Contact Information
Location
Irving, Texas, United States

Experience

    • United States
    • Utilities
    • 100 - 200 Employees
    • Hadoop Developer
      • Mar 2018 - Present

      • Used Spark Streaming to collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in HBase.
      • Developed Spark scripts using Scala shell commands as required.
      • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
      • Enhanced the PySpark code to replace Spark with Impyla, and installed Impyla on the edge node.
      • Implemented Spark RDD transformations and actions for business analysis.
      • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with HDFS reference tables and historical metrics.
      • Loaded and transformed large data sets from a Cassandra source through Kafka and placed them in HDFS for further processing.
      • Developed PySpark code to read data from Hive, group the fields, and generate XML files. Enhanced the PySpark code to write the generated XML files to a directory for zipping into CDAs.
      • Implemented the Cassandra connector for Spark.

    • United States
    • IT Services and IT Consulting
    • 1 - 100 Employees
    • Hadoop Developer
      • Apr 2017 - Present

      • Designed data ingestion and integration processes using Sqoop, shell scripts, and Pig, with Hive.
      • Added and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
      • Implemented the Fair Scheduler on the ResourceManager to share cluster resources among the MRv2 jobs submitted by users.
      • Worked with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
      • Investigated and performed the migration from MRv1 to MRv2.
      • Developed PySpark code to read data from Hive, group the fields, and generate XML files. Enhanced the PySpark code to write the generated XML files to a directory for zipping into CDAs.
      • Worked with big data analysts, designers, and scientists to troubleshoot MRv1/MRv2 job failures and issues with Hive, Pig, Flume, and Apache Spark.
      • Used Apache Spark for interactive data mining and data processing.
      • Used Apache Kafka as a fast, scalable, fault-tolerant system to absorb incoming load before the data is analyzed.

    • United States
    • Utilities
    • 100 - 200 Employees
    • Hadoop Developer
      • Apr 2016 - Mar 2017

      • Partitioned the raw data and the processed data by day using a one-level partitioning scheme.
      • Created external tables in Hive based on the processed data obtained from Spark.
      • Ingested secondary data from systems such as CRM, CPS, and ODS using Sqoop, and correlated this data with log files to provide a platform for data analysis.
      • Performed basic aggregations (count, average, sum, distinct, max, min) on the existing Hive tables using Impala to determine average hit rates, miss rates, bounce rates, etc.
      • Persisted the processed data in columnar databases such as HBase and provided a platform for analytics using BI tools, analytical tools such as R, and machine learning libraries such as Mahout.

    • India
    • Engineering Services
    • 700 & Above Employees
    • Hadoop Developer
      • Mar 2013 - Jan 2015

      • Developed and maintained web applications as defined by the project lead.
      • Developed the GUI using JSP, JavaScript, and CSS.
      • Used MS Visio to create business process diagrams.
      • Developed Action Servlet, Action Form, and Java Bean classes to implement business logic for the Struts framework.
      • Developed Servlets and JSPs based on the MVC pattern using the Struts Action framework.
      • Developed all tiers of the J2EE application: data objects that communicate with the database using JDBC in the database tier, business logic using EJBs in the middle tier, and Java Beans and helper classes that communicate with the presentation tier, which consists of JSPs and Servlets.
      • Used AJAX for client-side validation.
