Publication Details

Multi-Channel Extension of Pre-trained Models for Speaker Verification

MOŠNER, L.; SERIZEL, R.; BURGET, L.; PLCHOT, O.; VINCENT, E.; PENG, J.; ČERNOCKÝ, J. Multi-Channel Extension of Pre-trained Models for Speaker Verification. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 2135-2139. ISSN: 1990-9772.
Czech title
Vícekanálové rozšíření předtrénovaných modelů pro ověřování mluvčího
Type
conference paper
Language
English
Authors
MOŠNER, L.; SERIZEL, R.; BURGET, L.; PLCHOT, O.; VINCENT, E.; PENG, J.; ČERNOCKÝ, J.
URL
https://www.isca-archive.org/interspeech_2024/mosner24_interspeech.pdf
Keywords

multi-channel speaker verification, pre-trained models

Abstract

In this work, we focus on designing a multi-channel speech processing system
based on large pre-trained models. These models are typically trained for
single-channel scenarios via self-supervised learning (SSL). A common approach to
using the SSL models with microphone array data is to prepend them with
a multi-channel speech enhancement stage. The downside is that spatial information can
be leveraged only by the pre-processing stage, and enhancement errors get
propagated to the SSL model. We aim to alleviate the issue by designing METRO,
a Multi-channel ExTension of pRe-trained mOdels. It interleaves per-channel
processing with cross-channel information exchange, eventually fusing channels
into one. While our approach is general, here we focus on multi-channel speaker
verification. Our experiments on the MultiSV corpus show noteworthy improvements
over the best published results on the dataset.

Published
2024
Pages
2135–2139
Journal
Proceedings of Interspeech, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2024
Publisher
International Speech Communication Association
Place
Kos
DOI
10.21437/Interspeech.2024-1260
BibTeX
@inproceedings{BUT193682,
  author="MOŠNER, L. and SERIZEL, R. and BURGET, L. and PLCHOT, O. and VINCENT, E. and PENG, J. and ČERNOCKÝ, J.",
  title="Multi-Channel Extension of Pre-trained Models for Speaker Verification",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="2135--2139",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-1260",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/mosner24_interspeech.pdf"
}