Publication Details
Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language
conversational speech, diarization, speech recognition
As speech technology has matured, there has been a push towards systems that
can process conversational speech, reflecting the so-called "cocktail party
problem," which includes not only more challenging acoustic conditions, but also
necessitates solutions to new problems, such as identifying who spoke when and
processing multiple concurrent streams of speech. Such problems have been
approached primarily via corpora comprising business meetings and dinner parties,
overlooking the broad range of conversational dynamics and speaker demographics
that fall under the category of multi-talker speech. To this end, we introduce
the use of the Santa Barbara Corpus of Spoken American English for evaluation of
speech technology, including preparing the corpus and annotations for automatic
processing, demonstrating the failure of state-of-the-art systems to withstand
the heterogeneity of conditions, and highlighting the situations where standard
methods struggle to perform at all.
@inproceedings{BUT193741,
author="MACIEJEWSKI, M. and KLEMENT, D. and HUANG, R. and WIESNER, M. and KHUDANPUR, S.",
title="Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language",
booktitle="Proceedings of Interspeech 2024",
year="2024",
journal="Proceedings of Interspeech",
volume="2024",
number="9",
pages="2155--2160",
publisher="International Speech Communication Association",
address="Kos",
doi="10.21437/Interspeech.2024-2119",
issn="1990-9772",
url="https://www.isca-archive.org/interspeech_2024/maciejewski24_interspeech.pdf"
}