Publication Details
MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification
Mošner Ladislav (DCGM)
Plchot Oldřich, Ing., Ph.D. (DCGM)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Černocký Jan, prof. Dr. Ing. (DCGM)
Multi-channel, speaker verification, MultiSV, dataset, beamforming
Motivated by the unconsolidated data situation and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems. It can also be readily used for experiments with dereverberation, denoising, and speech enhancement. We tackle the ever-present lack of multi-channel training data by simulating data on top of clean parts of the VoxCeleb corpus. The development and evaluation trials are based on a retransmitted Voices Obscured in Complex Environmental Settings (VOiCES) corpus, which we modified to provide multi-channel trials. We publish full recipes that build the dataset, MultiSV, from public sources, and we provide results for two of our multi-channel speaker verification systems with neural network-based beamforming, driven either by predicted ideal binary masks or by the more recent Conv-TasNet.
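The published recipes define the actual simulation pipeline used to build the training data. As a rough, hedged illustration of the general approach described above (convolving clean single-channel speech with multi-channel room impulse responses and mixing in noise), the sketch below uses pyroomacoustics and soundfile. The room geometry, microphone layout, file names, SNR, and white-noise stand-in are illustrative assumptions, not the MultiSV configuration.

```python
# Minimal multi-channel far-field simulation sketch (illustrative only; the
# MultiSV recipes specify their own rooms, arrays, RIRs, and noise sources).
import numpy as np
import pyroomacoustics as pra
import soundfile as sf

FS = 16000

# Hypothetical input: a clean, mono (VoxCeleb-style) utterance at 16 kHz.
clean, fs = sf.read("clean_utterance.wav")
assert fs == FS

# Shoebox room with moderate absorption to introduce reverberation.
room = pra.ShoeBox([6.0, 5.0, 3.0], fs=FS,
                   materials=pra.Material(0.3), max_order=17)

# One speaker and a small 4-microphone linear array.
room.add_source([2.0, 3.5, 1.5], signal=clean)
mic_xyz = np.array([[3.00, 3.05, 3.10, 3.15],   # x
                    [1.50, 1.50, 1.50, 1.50],   # y
                    [1.20, 1.20, 1.20, 1.20]])  # z
room.add_microphone_array(pra.MicrophoneArray(mic_xyz, FS))

# Convolve the source with the simulated RIRs for all channels.
room.simulate()
multichannel = room.mic_array.signals            # shape: (4, n_samples)

# Mix in noise at a target SNR (white noise as a stand-in for real recordings).
snr_db = 10.0
noise = np.random.randn(*multichannel.shape)
scale = np.sqrt(np.mean(multichannel ** 2)
                / (np.mean(noise ** 2) * 10 ** (snr_db / 10)))
noisy = multichannel + scale * noise

sf.write("simulated_4ch.wav", noisy.T, FS)       # channels as columns
```

The resulting multi-channel signals can then feed a front end such as the mask- or Conv-TasNet-driven beamformers mentioned above before speaker embedding extraction.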
@inproceedings{BUT178380,
author="Ladislav {Mošner} and Oldřich {Plchot} and Lukáš {Burget} and Jan {Černocký}",
title="Multisv: Dataset for Far-Field Multi-Channel Speaker Verification",
booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
year="2022",
pages="7977--7981",
publisher="IEEE Signal Processing Society",
address="Singapore",
doi="10.1109/ICASSP43922.2022.9746833",
isbn="978-1-6654-0540-9",
url="https://ieeexplore.ieee.org/document/9746833"
}