Prof. Dr. Dr. Birger Kollmeier visited KovacicLab and gave a talk on Thursday, July 2nd, 2015 (room B102, 12:15 pm) on “Measuring, modelling, and improving speech recognition across languages – from machine learning to understanding and aiding the auditory system”.
Prof. Dr. Dr. Birger Kollmeier is a full professor of medical physics at the University of Oldenburg (Germany), director of the Cluster of Excellence Hearing4all, and one of the leading experts in speech and hearing research.

Abstract:
The lecture provides an insight into the activities of the Cluster of Excellence “Hearing4all”, which span from the biophysical principles of hearing impairment, through clinical applications in auditory diagnostics and rehabilitation, to assistive listening devices in daily life. A focus of the talk is degraded speech perception in cocktail-party situations, where many listeners have difficulty understanding the desired speech in a mixture of voices and noise. To assess these problems in a highly comparable way across languages, the closed-set Matrix sentence intelligibility tests (e.g., Hagerman-Test, Olsa, Dantale II and similar tests) have been developed and evaluated, and are currently available in 14 languages (Zokoll et al., 2013; Kollmeier et al., 2015). The Matrix test format is also advantageous for quantitatively modelling the sensory and cognitive aspects of speech recognition in normal-hearing and hearing-impaired listeners with a machine learning approach, i.e. automatic speech recognition (ASR). Because the closed-set format reduces the perplexity of the ASR task, a close match can be obtained between the performance of human listeners in the Matrix test and the predictions of a standard ASR system (Schaedler et al., 2015).

The ASR system uses Mel-frequency cepstral coefficients as a front end and whole-word Gaussian Mixture/Hidden Markov Models as a back end. It is trained and tested with noisy matrix sentences over a broad range of signal-to-noise ratios. The ASR-based predictions of the speech reception threshold (SRT), i.e. the signal-to-noise ratio at which 50% intelligibility is reached, show a high and significant correlation (R²=0.95, p<0.001) with the data measured across 7 “critical” noise conditions for 10 normal-hearing native German listeners. They outperform predictions based on the Speech Intelligibility Index (SII), which show no correlation with the empirical data for these noise conditions (R²=0.00, p=0.987).
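As a rough illustration of the two quantities mentioned above, the following minimal Python sketch estimates an SRT from recognition scores measured at several signal-to-noise ratios and computes R² between predicted and measured SRTs. It is not the procedure used in the cited study: the linear interpolation, the function names `srt_from_scores` and `r_squared`, and all numbers are assumptions made up for illustration (a psychometric function fit would be the more usual choice).

```python
def srt_from_scores(snrs_db, scores, target=0.5):
    """Estimate the SNR (dB) at which the recognition score reaches `target`.

    Assumption: simple linear interpolation between the two neighbouring
    measurement points; this is an illustrative simplification.
    """
    points = sorted(zip(snrs_db, scores))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("scores never cross the target level")


def r_squared(predicted, measured):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov * cov / (var_p * var_m)


# Hypothetical word recognition scores of a recogniser at four SNRs.
example_srt = srt_from_scores([-12, -9, -6, -3], [0.1, 0.3, 0.7, 0.95])

# Hypothetical predicted vs. measured SRTs (dB) for three noise conditions.
example_r2 = r_squared([-7.5, -5.0, -9.0], [-7.0, -5.5, -9.5])
```

In this sketch, one SRT would be computed per noise condition for both the model and the listeners, and `r_squared` would then summarise how well the predicted thresholds track the measured ones.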
To help listeners in cocktail-party situations, elements of an assistive listening device will be reviewed. The device fits the requirements of near-to-normal listeners (i.e., it provides benefit in noisy situations or other acoustical challenges of daily life, using the concept of acoustically “transparent” earpieces) and can, in addition, be scaled up to a complete hearing aid for more substantial hearing loss. The current prototype runs on the binaural, cable-connected master hearing aid (MHA), which includes earpieces that approach acoustic transparency. A binaural high-fidelity enhancement algorithm motivated by interaural magnification is evaluated for its benefit to normal and near-to-normal listeners.

  • Zokoll, M.A., et al. (2013). Speech-in-noise tests for multilingual hearing screening and diagnostics. Am J Audiol, 22(1), 175-78.
  • Kollmeier, B., Warzybok, A., Hochmuth, S., Zokoll, M., Uslar, V., Brand, T. & Wagener, K.C. (2015). The multilingual matrix test: principles, applications and comparison across languages – a review. Int J Audiol, online first.
  • Schaedler, M.R., Warzybok, A., Hochmuth, S. & Kollmeier, B. (2015). Matrix sentence intelligibility prediction using an automatic speech recognition system. Int J Audiol, online first.
