Speech Lab

Research

Research in the SpeechLab at TU Delft aims to build inclusive speech technology that can be used by everyone, irrespective of how they speak or the language they speak, with a focus on automatic speech recognition. To do so, several challenging problems need to be overcome:

  • The high diversity of languages in the world
  • The high diversity of speech types
  • Low-resource scenarios: for many languages and speech types, only (very) little data is available

In doing so, we take into account not only technological aspects but also ethical and societal aspects, and we use the best speech recognisers in existence, i.e., human listeners, as a source of inspiration. Our research thus combines speech technology with linguistics, psycholinguistics, and ethics.

In our research, we combine a range of techniques, from human listening experiments, eye-tracking, and EEG to computational modelling and deep neural networks.

The research in the SpeechLab focusses on, but is not limited to:

  • Building computational models of human speech processing using techniques from automatic speech recognition and machine learning
  • Speech technology for pathological speech
  • Bias in automatic speech recognition
  • Multi-modal speech technology
  • Building speech technology for under-resourced languages and languages without a common written language
  • Visualisations of the speech representations in deep neural networks
  • Systematic comparisons between human and machine speech processing architectures and performance
  • Using knowledge about human speech processing to help improve speech processing algorithms

A short video explaining our research can be found here.

Current Projects

  • NSF 19-10319 (RI Small): Automatic Creation of New Phone Inventories; PIs: Mark Hasegawa-Johnson (UIUC) and Najim Dehak (JHU); unfunded collaborator (July 2019 – June 2022).
  • NSF & Amazon 2147350 (FAIN, IIS): A New Paradigm for the Evaluation and Training of Inclusive Automatic Speech Recognition; PIs: Mark Hasegawa-Johnson & Zsuzsanna Fagyal (UIUC), Najim Dehak, Piotr Zelasko, & Laureano Moro-Velazquez (JHU); unfunded collaborator (Feb 2022 – Jan 2025).

The Team