Publication Details

Semi-supervised DNN training with word selection for ASR

VESELÝ, K.; BURGET, L.; ČERNOCKÝ, J. Semi-supervised DNN training with word selection for ASR. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. p. 3687-3691. ISSN: 1990-9772.
Czech title
Částečně kontrolované trénování DNN s výběrem slov pro ASR
Type
conference paper
Language
English
Authors
Karel Veselý, Lukáš Burget, Jan Černocký
URL
http://www.isca-speech.org/archive/Interspeech_2017/pdfs/1385.PDF
Keywords

semi-supervised training, DNN, word selection, granularity of confidences

Abstract

The paper addresses semi-supervised DNN training with word selection for automatic speech recognition (ASR).

Annotation

Not all questions related to the semi-supervised training of a hybrid ASR system with a DNN acoustic model have been investigated in depth. In this paper, we focus on the granularity of confidences (per-sentence, per-word, per-frame) and on how the data should be used (data selection by masks, or mini-batch SGD with confidences as weights). We then propose to re-tune the system with the manually transcribed data, both with frame CE training and with sMBR training. Our preferred semi-supervised recipe, which is both simple and efficient, is the following: we select words according to the word accuracy we obtain on the development set. This recipe, which does not rely on a grid search over the training hyperparameter, generalized well to Babel Vietnamese (transcribed 11h, untranscribed 74h), Babel Bengali (transcribed 11h, untranscribed 58h) and our custom Switchboard setup (transcribed 14h, untranscribed 95h). We obtained absolute WER improvements of 2.5% for Vietnamese, 2.3% for Bengali and 3.2% for Switchboard.
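
The two ways of using the automatically transcribed data named in the annotation (data selection by masks, and confidences as per-frame weights) can be illustrated with a minimal sketch. It is not the authors' implementation. The `Word` record, the function names, and in particular the rule that the fraction of kept words equals the development-set word accuracy are assumptions made for illustration; the annotation only states that words are selected "according to the word accuracy" on the development set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One word of an automatic transcript (hypothetical record layout)."""
    label: str
    start_frame: int   # first frame of the word, inclusive
    end_frame: int     # one past the last frame of the word, exclusive
    confidence: float  # per-word confidence in [0, 1]


def threshold_for_keep_fraction(confidences: List[float], keep_fraction: float) -> float:
    """Confidence threshold that keeps roughly `keep_fraction` of the words.

    Assumption: `keep_fraction` is set to the word accuracy measured on the
    development set, so no grid search over the threshold is needed.
    Words tied with the threshold are all kept.
    """
    if not confidences:
        raise ValueError("no confidences given")
    ranked = sorted(confidences, reverse=True)
    n_keep = max(1, min(len(ranked), round(keep_fraction * len(ranked))))
    return ranked[n_keep - 1]


def frame_mask(words: List[Word], num_frames: int, threshold: float) -> List[float]:
    """Data selection by masks: 1.0 for frames of selected words, else 0.0."""
    mask = [0.0] * num_frames
    for w in words:
        if w.confidence >= threshold:
            for t in range(w.start_frame, min(w.end_frame, num_frames)):
                mask[t] = 1.0
    return mask


def frame_weights(words: List[Word], num_frames: int) -> List[float]:
    """Confidences as weights: each frame inherits its word's confidence."""
    weights = [0.0] * num_frames
    for w in words:
        for t in range(w.start_frame, min(w.end_frame, num_frames)):
            weights[t] = w.confidence
    return weights


# Toy example: four words over eight frames, dev-set word accuracy of 50%.
words = [
    Word("a", 0, 2, 0.9),
    Word("b", 2, 5, 0.4),
    Word("c", 5, 6, 0.7),
    Word("d", 6, 8, 0.2),
]
threshold = threshold_for_keep_fraction([w.confidence for w in words], 0.5)
mask = frame_mask(words, 8, threshold)
weights = frame_weights(words, 8)
```

In either variant, the resulting per-frame values would multiply the per-frame loss (or its gradient) during mini-batch SGD, so masked frames contribute nothing and low-confidence frames contribute less.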

Published
2017
Pages
3687–3691
Journal
Proceedings of Interspeech, vol. 2017, no. 08, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2017
Publisher
International Speech Communication Association
Place
Stockholm
DOI
10.21437/Interspeech.2017-1385
UT WoS
000457505000766
EID Scopus
BibTeX
@inproceedings{BUT144493,
  author="Karel {Veselý} and Lukáš {Burget} and Jan {Černocký}",
  title="Semi-supervised DNN training with word selection for ASR",
  booktitle="Proceedings of Interspeech 2017",
  year="2017",
  journal="Proceedings of Interspeech",
  volume="2017",
  number="08",
  pages="3687--3691",
  publisher="International Speech Communication Association",
  address="Stockholm",
  doi="10.21437/Interspeech.2017-1385",
  issn="1990-9772",
  url="http://www.isca-speech.org/archive/Interspeech_2017/pdfs/1385.PDF"
}