Publication Details

Front-End Compensation Methods for LVCSR Under Lombard Effect

BOŘIL, H.; GRÉZL, F.; HANSEN, J. Front-End Compensation Methods for LVCSR Under Lombard Effect. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. p. 1257-1260. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
Czech title
Kompenzační techniky Front-Endu pro LVCSR řeči ovlivněné Lombardovým efektem
Type
conference paper
Language
English
Authors
Bořil Hynek
Grézl František, Ing., Ph.D. (DCGM)
Hansen John
URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2011/boril_interspeech2011_221.pdf
Keywords

speech recognition, Lombard effect, UT-Scope database, bottleneck features, quantile-based cepstral distribution normalization, histogram equalization

Abstract

This paper describes front-end compensation methods for large vocabulary continuous speech recognition (LVCSR) under the Lombard effect.

Annotation

This study analyzes the impact of variations in background noise and the Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). The robustness of several front-end feature extraction strategies, combined with state-of-the-art feature distribution normalizations, is tested on neutral and Lombard speech from the UT-Scope database presented in two types of background noise at various SNR levels. An extension of a bottleneck (BN) front-end that normalizes both critical band energies (CRBE) and BN outputs is proposed and shown to perform competitively with the best MFCC-based system. A novel MFCC-based BN front-end is introduced and shown to outperform all other systems in all conditions considered (average 4.1% absolute WER reduction over the second-best system). Additionally, two phenomena are observed: (i) combining cepstral mean subtraction with the recently established RASTALP filtering significantly reduces the transient effects of RASTA band-pass filtering and increases ASR robustness to noise and LE; (ii) histogram equalization may benefit from reference distributions derived from pre-normalized rather than raw training features, and also from adopting distributions from different front-ends.

Published
2011
Pages
1257–1260
Journal
Proceedings of Interspeech, vol. 2011, no. 8, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2011
ISBN
978-1-61839-270-1
Publisher
International Speech Communication Association
Place
Florence
BibTeX
@inproceedings{BUT76449,
  author="Hynek {Bořil} and František {Grézl} and John {Hansen}",
  title="Front-End Compensation Methods for LVCSR Under Lombard Effect",
  booktitle="Proceedings of Interspeech 2011",
  year="2011",
  journal="Proceedings of Interspeech",
  volume="2011",
  number="8",
  pages="1257--1260",
  publisher="International Speech Communication Association",
  address="Florence",
  isbn="978-1-61839-270-1",
  issn="1990-9772",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/boril_interspeech2011_221.pdf"
}