Dan Berrebbi

Graduate Research Student at Carnegie Mellon University - School of Computer Science - Language Technologies Institute
Contact Information
us****@****om
(386) 825-5501
Location
Pittsburgh, Pennsylvania, United States
Languages
  • French: native or bilingual proficiency
  • English: full professional proficiency
  • Spanish: professional working proficiency

Experience

    • Graduate Research Student
      • Aug 2021 - Present

      Research on speech processing, self-supervised models, semi-supervised models, multilingual/low-resource setups, large-scale model training, domain adaptation, and efficient pre-training/fine-tuning.

      Publications in top ML/speech conferences:
        • Dan Berrebbi, Brian Yan, Shinji Watanabe. Avoid Overthinking in Self-Supervised Models for Speech Recognition. Published at ICASSP 2023.
        • Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel Lopez-Francisco, Jonathan D Amith, Shinji Watanabe. Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation. Published at Interspeech 2022.
        • Jiatong Shi, Dan Berrebbi*, William Chen*, Ho-Lam Chung*, En-Pei Hu*, Wei Ping Huang*, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe. ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. Published at Interspeech 2023 and at ASRU 2023 (challenge track).
        • Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe. ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit. Published at ACL 2023 (demos).
        • Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe. CMU’s IWSLT 2022 Dialect Speech Translation System. Published at IWSLT 2022.
        • Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu. Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. Published at ICASSP 2022.

    • United States
    • Computers and Electronics Manufacturing
    • 700 or more employees
    • Research intern
      • May 2022 - Sep 2022

      Semi-supervised algorithms for speech recognition, advised by Tatiana Likhomanenko, Ronan Collobert, Navdeep Jaitly, and Samy Bengio.

      Publications:
        • ICLR 2023: Dan Berrebbi, Ronan Collobert, Samy Bengio, Navdeep Jaitly, Tatiana Likhomanenko. Continuous Pseudo-Labeling from the Start. (https://openreview.net/pdf?id=m3twGT2bAug)
        • ICASSP 2023: Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko. More Speaking or More Speakers? (https://arxiv.org/pdf/2211.00854.pdf)

    • France
    • Software Development
    • 100 - 200 employees
    • NLP research intern
      • Mar 2021 - Aug 2021

      Contextualized translation to boost Neural Machine Translation. Machine Translation with similarity. Advised by Josep Crego. Publication at WMT 2021: Minh Quang Pham, Josep M Crego, Antoine Senellart, Dan Berrebbi, Jean Senellart. SYSTRAN @ WMT 2021: Terminology Task.

    • France
    • Research Services
    • 700 or more employees
    • Research intern
      • Aug 2020 - Jul 2021

      CEDAR team, NLP and graph embeddings. Predicted citation intent for scientific publications and established a new state of the art with a graph approach. Advised by Prof. Oana Balalau. Publication at Sci-K 2022: Dan Berrebbi*, Nicolas Huynh*, Oana Balalau. GraphCite: Citation Intent Classification in Scientific Publications via Graph Embeddings.

    • France
    • Software Development
    • 300 - 400 employees
    • NLP Research Internship
      • Jun 2020 - Aug 2020

      Natural Language Processing research intern. Supervisor: Prof. Michael Elhadad.

    • United States
    • Higher Education
    • 700 or more employees
    • Research Internship
      • Jul 2019 - Aug 2019

      Bioengineering research lab

  • Lycée de Navarre High School
    • Saint-Jean-Pied-de-Port, Aquitaine, France
    • Assistant Teacher (Maths and Physics)
      • Sep 2018 - Apr 2019

    • United States
    • Higher Education
    • 700 or more employees
    • Research Internship
      • Jul 2017 - Aug 2017

      Mechanical Engineering

Education

  • Carnegie Mellon University
    Master of Science, Artificial Intelligence - NLP
    2021 - 2023
  • École Polytechnique
    Ingénieur Polytechnicien, Computer Science and Applied Mathematics
    2018 - 2022
  • Lycée Louis-le-Grand
    MPSI / MP*, Mathematics, Physics
    2016 - 2018
