Publication Details
DNN derived filters for processing of modulation spectrum of speech
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Heřmanský Hynek, prof. Ing., Dr. Eng. (DCGM)
Veselý Karel, Ing., Ph.D. (DCGM)
deep neural network, convolutive layer, modulation filters, mammalian auditory processing
In this paper, the DNN paradigm was successfully used to design modulation frequency FIR filters. This technique optimizes the whole process of deriving posterior probabilities of speech sound classes (three-state phonemes).
We propose a novel approach to designing modulation frequency filters for the first-stage processing of the critical-band spectrum of speech using a deep neural network (DNN). These filters replace the conventional modulation frequency filters currently used in the state-of-the-art BUT speech recognition system and yield about a 10% relative improvement in phoneme recognition accuracy. The resulting filters are consistent with some known temporal properties of higher levels of mammalian auditory processing and suggest a more efficient scheme for pre-processing speech for ASR.
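To illustrate what filtering the modulation spectrum means in practice, the following is a minimal sketch, not the authors' implementation. It assumes the critical-band energies are stored as a (frames x bands) array and that one bank of FIR filters is shared across all bands. The function name `apply_modulation_filters`, the array sizes, and the two hand-picked filters are hypothetical placeholders; in the paper the filter coefficients are learned by the convolutive layer of the DNN rather than chosen by hand.

```python
import numpy as np


def apply_modulation_filters(band_energies, filters):
    """Filter the temporal trajectory of each critical band with each FIR filter.

    band_energies: array of shape (num_frames, num_bands)
    filters:       array of shape (num_filters, filter_len), FIR impulse responses
    returns:       array of shape (num_frames, num_bands, num_filters)
    """
    band_energies = np.asarray(band_energies, dtype=float)
    filters = np.asarray(filters, dtype=float)
    num_frames, num_bands = band_energies.shape
    num_filters, filter_len = filters.shape
    if filter_len > num_frames:
        # np.convolve(mode="same") returns max(len(a), len(v)) samples,
        # so a longer filter would change the output length.
        raise ValueError("filter must not be longer than the signal")

    out = np.empty((num_frames, num_bands, num_filters))
    for b in range(num_bands):
        for f in range(num_filters):
            # Convolve along time only; edges are implicitly zero-padded.
            out[:, b, f] = np.convolve(band_energies[:, b], filters[f], mode="same")
    return out


# Hypothetical input: 100 frames of 15 critical-band energies (random stand-in data).
rng = np.random.default_rng(0)
energies = rng.standard_normal((100, 15))

# Two placeholder 11-tap filters: a pass-through and a moving average (low-pass).
identity = np.zeros(11)
identity[5] = 1.0
smoother = np.full(11, 1.0 / 11)
bank = np.stack([identity, smoother])

filtered = apply_modulation_filters(energies, bank)
```

Each output channel `filtered[:, b, f]` is the trajectory of band `b` seen through filter `f`. A trainable version would treat `bank` as the weights of a one-dimensional convolutional layer and optimize it jointly with the rest of the network.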
@inproceedings{BUT119905,
author="Jan {Pešán} and Lukáš {Burget} and Hynek {Heřmanský} and Karel {Veselý}",
title="DNN derived filters for processing of modulation spectrum of speech",
booktitle="Proceedings of Interspeech 2015",
year="2015",
journal="Proceedings of Interspeech",
volume="2015",
number="09",
pages="1908--1911",
publisher="International Speech Communication Association",
address="Dresden",
isbn="978-1-5108-1790-6",
issn="1990-9772",
url="https://www.fit.vut.cz/research/publication/10969/"
}