Publication Details

Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data

KARAFIÁT, M.; SZŐKE, I.; ČERNOCKÝ, J. Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data. Proc. Text, Speech and Dialog 2010. Lecture Notes in Computer Science. LNAI 6231. Brno: Springer Verlag, 2010. p. 322-329. ISBN: 978-3-642-15759-2. ISSN: 0302-9743.
Czech title
Využití gradient descent optimalizace pro trénování akustických modelů z heterogenních dat
Type
conference paper
Language
English
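Authors
Martin Karafiát, Igor Szőke, Jan Černocký
URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2010/karafiat_TSD_2010_322.pdf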
Keywords

speech, acoustic models, heterogeneous data, HLDA system, gradient descent training, robustness

Abstract

This paper studies the use of heterogeneous data for training acoustic models, using gradient descent optimization to control how that data is used.

Annotation

In this paper, we study the use of heterogeneous data for training acoustic models. In initial experiments, a significant drop in accuracy was observed on the in-domain test set when the data was added without any regularization. We propose a solution that gains control over the training data by optimizing the weights of the different data sets. The final models show good performance on all test sets, which cover various speaking styles. Furthermore, we used this approach to increase performance on the main test set alone. We obtained a 0.3% absolute improvement on the basic system and 0.4% on the HLDA system, although the heterogeneous data set was quite small.
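
The idea of optimizing data-set weights can be illustrated with a minimal toy sketch. It is not the paper's training recipe: the pooled-mean "model", the squared-error loss on a development target, the finite-difference gradient, and all names and numbers (`counts`, `means`, `dev_mean`, `optimize_weights`) are assumptions chosen only to show gradient descent acting on per-data-set weights.

```python
# Toy illustration only: the statistics, loss, and update rule below are
# assumptions, not the paper's actual acoustic-model training procedure.

def weighted_mean(weights, counts, means):
    """Pool per-data-set means, scaling each set's frame count by its weight."""
    total = sum(w * n for w, n in zip(weights, counts))
    return sum(w * n * m for w, n, m in zip(weights, counts, means)) / total

def dev_loss(weights, counts, means, dev_mean):
    """Squared error between the pooled estimate and an in-domain dev target."""
    return (weighted_mean(weights, counts, means) - dev_mean) ** 2

def optimize_weights(counts, means, dev_mean, lr=0.5, steps=200, eps=1e-4):
    """Gradient descent on data-set weights using finite-difference gradients."""
    weights = [1.0] * len(counts)  # start from plain, unweighted pooling
    for _ in range(steps):
        base = dev_loss(weights, counts, means, dev_mean)
        grads = []
        for k in range(len(weights)):
            bumped = list(weights)
            bumped[k] += eps
            grads.append((dev_loss(bumped, counts, means, dev_mean) - base) / eps)
        # Keep weights positive so the pooled estimate stays well defined.
        weights = [max(1e-3, w - lr * g) for w, g in zip(weights, grads)]
    return weights

counts = [1000, 300, 200]   # hypothetical frame counts: in-domain + two others
means = [0.0, 2.0, -1.5]    # hypothetical per-set feature means
dev_mean = 0.05             # hypothetical in-domain development target

uniform = [1.0, 1.0, 1.0]
tuned = optimize_weights(counts, means, dev_mean)
loss_before = dev_loss(uniform, counts, means, dev_mean)
loss_after = dev_loss(tuned, counts, means, dev_mean)
```

In this toy setting, the data set whose statistics pull the pooled estimate away from the in-domain target is down-weighted, which mirrors the annotation's point that unweighted pooling can hurt in-domain accuracy.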

Published
2010
Pages
322–329
Journal
Lecture Notes in Computer Science, vol. 2010, no. 9, ISSN 0302-9743
Proceedings
Proc. Text, Speech and Dialog 2010
Series
LNAI 6231
ISBN
978-3-642-15759-2
Publisher
Springer Verlag
Place
Brno
BibTeX
@inproceedings{BUT34926,
  author="Martin {Karafiát} and Igor {Szőke} and Jan {Černocký}",
  title="Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data",
  booktitle="Proc. Text, Speech and Dialog 2010",
  year="2010",
  series="LNAI 6231",
  journal="Lecture Notes in Computer Science",
  volume="2010",
  number="9",
  pages="322--329",
  publisher="Springer Verlag",
  address="Brno",
  isbn="978-3-642-15759-2",
  issn="0302-9743",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2010/karafiat_TSD_2010_322.pdf"
}