Publication Details
Spelling-Aware Word-Based End-to-End ASR
EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J. Spelling-Aware Word-Based End-to-End ASR. IEEE SIGNAL PROCESSING LETTERS, 2022, vol. 29, no. 29, p. 1729-1733. ISSN: 1558-2361.
Czech title
End-to-End systém pro rozpoznávání řeči založený na slovech beroucí v úvahu jejich hláskování
Type
journal article
Language
English
Authors
Egorova Ekaterina, Ing., Ph.D.
Vydana Hari Krishna
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)
URL
https://ieeexplore.ieee.org/document/9833231
Keywords
end-to-end, ASR, OOV, Listen Attend and Spell architecture
Abstract
We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary, speller network, is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on LibriSpeech dataset consisting of 1000h of read speech.
Published
2022
Pages
1729–1733
Journal
IEEE SIGNAL PROCESSING LETTERS, vol. 29, no. 29, ISSN 1558-2361
DOI
10.1109/LSP.2022.3192199
UT WoS
000842088200001
EID Scopus
BibTeX
@article{BUT178877,
author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
title="Spelling-Aware Word-Based End-to-End ASR",
journal="IEEE SIGNAL PROCESSING LETTERS",
year="2022",
volume="29",
number="29",
pages="1729--1733",
doi="10.1109/LSP.2022.3192199",
issn="1558-2361",
url="https://ieeexplore.ieee.org/document/9833231"
}