Result Details
Spelling-Aware Word-Based End-to-End ASR
        EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J. Spelling-Aware Word-Based End-to-End ASR. IEEE SIGNAL PROCESSING LETTERS, 2022, vol. 29, no. 29, p. 1729-1733.  ISSN: 1558-2361.
    
                Type
            
        
                journal article
            
        
                Language
            
        
                English
            
        
            Authors
            
        
                Egorova Ekaterina, Ing., Ph.D., DCGM (FIT)
                
Vydana Hari Krishna
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
                    Abstract
            
        We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary speller network is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on the LibriSpeech dataset, consisting of 1000 h of read speech.
                Keywords
            
        end-to-end, ASR, OOV, Listen Attend and Spell architecture
                URL
            
        https://ieeexplore.ieee.org/document/9833231
                Published
            
            
                    2022
                    
                
            
                    Pages
                
            
                        1729–1733
                
            
                    Journal
                
            
                    IEEE SIGNAL PROCESSING LETTERS, vol. 29, no. 29, ISSN 1558-2361
                
            
                    DOI
                
                    10.1109/LSP.2022.3192199
                    UT WoS
                
            
                    000842088200001
                
            
                    BibTeX
                
            @article{BUT178877,
  author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
  title="Spelling-Aware Word-Based End-to-End ASR",
  journal="IEEE SIGNAL PROCESSING LETTERS",
  year="2022",
  volume="29",
  number="29",
  pages="1729--1733",
  doi="10.1109/LSP.2022.3192199",
  issn="1070-9908",
  url="https://ieeexplore.ieee.org/document/9833231"
}
                
                Projects
            
        
        
            
        
    
    
        Multi-linguality in speech technologies, MŠMT, INTER-EXCELLENCE - Podprogram INTER-ACTION, LTAIN19087, start: 2020-01-01, end: 2023-08-31, completed
                
Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals, EU, Horizon 2020, start: 2020-02-01, end: 2023-04-30, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty excelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
                Research groups
            
        
                Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)
            
        
                Departments