Muhammad Haseeb
HPC Infrastructure and Performance Postdoc at National Energy Research Scientific Computing Center (NERSC)
-
English Native or bilingual proficiency
-
Urdu Native or bilingual proficiency
-
Punjabi Native or bilingual proficiency
Experience
-
National Energy Research Scientific Computing Center (NERSC)
-
United States
-
Research Services
-
1 - 100 Employee
-
HPC Infrastructure and Performance Postdoc
-
Apr 2023 - Present
1. Develop cutting-edge GPU-accelerated scientific software using new technologies in programming models (MPI, CUDA, SYCL, Kokkos, OpenMP offload, AMReX) and C++ (C++26, stdexec, parSTL).
2. Investigate and model GPU-GPU communication in HPC applications over the Perlmutter supercomputer's interconnects.
3. Develop and optimize GPU-accelerated algorithms for ECP-WarpX (DOE's flagship particle acceleration simulation code).
Skills: HPC, modern C++, STL evolution, programming models, Senders/Receivers, CUDA, Python, AMReX, build systems, performance engineering, profiling tools.
-
-
-
FIU School of Computing and Information Sciences
-
United States
-
Software Development
-
1 - 100 Employee
-
Graduate Research Assistant
-
Aug 2018 - Apr 2023
Developed parallel algorithms, data structures, and GPU kernels to scalably accelerate computational proteomics algorithms by > 40x on modern supercomputers. Skills: Modern C++, HPC, GPU Computing, Data Structures, OOP, Computational Biology
-
-
-
National Energy Research Scientific Computing Center (NERSC)
-
United States
-
Research Services
-
1 - 100 Employee
-
Application Performance Intern
-
May 2021 - Aug 2021
1. Designed and implemented DPC++/SYCL-based GPU-accelerated sequence alignment kernels of the ADEPT framework (10-30% improvement).
2. Analyzed and optimized the performance of the SYCL code versus the native implementations on Intel, NVIDIA, and AMD GPUs.
3. Contributed to bug fixing and performance optimization of the SYCL + NVIDIA 11 compiler.
4. Developed Python bindings for the ADEPT code with zero-copy support for direct use from Python.
Skills: GPU Computing, DPC++/SYCL, CUDA, Nsight, Modern C++, Optimization
-
-
-
National Energy Research Scientific Computing Center (NERSC)
-
United States
-
Research Services
-
1 - 100 Employee
-
Application Performance Intern
-
May 2020 - Aug 2020
Developed core features including dynamic instrumentation, Python instrumentation, C/PyCtesting, and integration of an HPC instrumentation framework called Timemory. Skills: Performance Analysis, Modern C++, CRTP, SFINAE, CMake, Spack, CI
-
-
-
College of Engineering and Applied Sciences at Western Michigan University
-
United States
-
Higher Education
-
1 - 100 Employee
-
Graduate Student Researcher
-
Aug 2017 - Aug 2018
Researched time- and memory-efficient indexing algorithms for database peptide search in peptide sequencing.
-
-
-
Mentor Graphics
-
United States
-
Software Development
-
700 & Above Employee
-
Senior Embedded Software Engineer
-
Dec 2016 - Aug 2017
Developed core features for Mentor Embedded Multicore Framework (MEMF) and Nucleus RTOS Kernel. Skills: Embedded Systems, Embedded C, OS
-
-
Embedded Software Engineer
-
Aug 2015 - Nov 2016
Developed core features of Mentor Embedded Multicore Framework (MEMF) and Nucleus RTOS Kernel. MEMF implements the unsupervised Multicore MultiOS software on ARM-based homogeneous and heterogeneous multicore SoCs. Skills: Embedded Systems, Embedded C, OS
-
-
Education
-
Florida International University
Doctor of Philosophy - PhD, Computer Science
University of Engineering and Technology, Lahore
Bachelor’s Degree, Electrical Engineering