Publication Details
Deriving Spectro-temporal Properties of Hearing from Speech Data
Keywords: perception, spectro-temporal, auditory, deep learning
Human hearing and human speech are intrinsically tied together, as the properties of speech almost certainly developed to be heard by human ears. Owing to this connection, data-driven systems trained to understand human speech have been shown to mimic certain properties of human hearing. In this paper, we further explore this phenomenon by measuring the spectro-temporal responses of data-derived filters in a front-end convolutional layer of a deep network trained to classify the phonemes of clean speech. The analyses show that the filters do indeed exhibit spectro-temporal responses similar to those measured in mammals, and that they additionally display a level of frequency selectivity similar to the processing pipeline assumed within the Articulation Index.
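For illustration, the snippet below is a minimal sketch of how the spectro-temporal response of a learned front-end filter can be inspected: the 2-D Fourier transform of a kernel defined over a time-frequency representation gives its joint temporal and spectral modulation response. It assumes 2-D convolutional kernels over a spectrogram-like input, which is one common setup; the function name, kernel shapes, and the random placeholder weights are illustrative and not taken from the paper.

```python
import numpy as np

def spectro_temporal_response(kernel, n_fft=64):
    """Estimate the spectro-temporal modulation response of one learned
    2-D convolutional kernel (frequency x time).

    The magnitude of the zero-padded 2-D FFT shows which spectral and
    temporal modulations the filter passes; fftshift centres the zero
    modulation point for easier plotting.
    """
    return np.fft.fftshift(np.abs(np.fft.fft2(kernel, s=(n_fft, n_fft))))

# Hypothetical usage: `conv_weights` stands in for the trained front-end
# layer's kernels, shaped (num_filters, kernel_freq, kernel_time).
conv_weights = np.random.randn(32, 9, 9)   # placeholder, not trained weights
responses = np.stack([spectro_temporal_response(k) for k in conv_weights])
print(responses.shape)                      # (32, 64, 64)
```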
@inproceedings{BUT160004,
  author    = "ONDEL YANG, L. and LI, R. and SELL, G. and HEŘMANSKÝ, H.",
  title     = "Deriving Spectro-temporal Properties of Hearing from Speech Data",
  booktitle = "Proceedings of ICASSP",
  year      = "2019",
  pages     = "411--415",
  publisher = "IEEE Signal Processing Society",
  address   = "Brighton",
  doi       = "10.1109/ICASSP.2019.8682787",
  isbn      = "978-1-5386-4658-8",
  url       = "https://ieeexplore.ieee.org/document/8682787"
}