Publication Details

Utilizing VOiCES dataset for multichannel speaker verification with beamforming

MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; ČERNOCKÝ, J. Utilizing VOiCES dataset for multichannel speaker verification with beamforming. Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Tokyo: International Speech Communication Association, 2020. p. 187-193. ISSN: 2312-2846.
Czech title
Využití datasetu VOiCES pro multikanálové ověřování řečníka se směrováním akustického paprsku
Type
conference paper
Language
English
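Authors
Ladislav Mošner, Oldřich Plchot, Johan Andréas Rohdin, Jan Černocký
URL
https://www.isca-speech.org/archive/Odyssey_2020/abstracts/80.html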
Keywords

multichannel speaker verification, application-aware beamforming

Abstract

VOiCES from a Distance Challenge 2019 aimed at the evaluation of speaker
verification (SV) systems using single-channel trials based on the Voices
Obscured in Complex Environmental Settings (VOiCES) corpus. Since it comprises
recordings of the same utterances captured simultaneously by multiple microphones
in the same environments, it is also suitable for multichannel experiments. In
this work, we design a multichannel dataset as well as development and evaluation
trials for SV inspired by the VOiCES challenge. Alternatives discarding harmful
microphones are presented as well. We assess the utilization of the created
dataset for x-vector based SV with beamforming as a front end. Standard fixed
beamforming and NN-supported beamforming using simulated data and ideal binary
masks (IBM) are compared with another variant of NN-supported beamforming that is
trained solely on the VOiCES data. The lack of data revealed by experiments with
the beamformer trained on VOiCES data was tackled by means of a variant of
SpecAugment applied to magnitude spectra. This approach led to as much as 10%
relative improvement in EER, pushing results closer to those obtained by a good beamformer
based on IBMs.
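
The abstract does not specify which SpecAugment variant was used, so the following is only a minimal sketch of the general idea: masking a random frequency band and a random time span of a magnitude spectrogram. The function name `spec_augment`, the parameters `max_freq_width` and `max_time_width`, and the choice to fill masked regions with zeros are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def spec_augment(magnitude, max_freq_width=8, max_time_width=20, rng=None):
    """Zero one random frequency band and one random time span.

    magnitude: array of shape (freq_bins, frames) with non-negative values.
    Returns a masked copy; the input array is left untouched.
    All parameter names and defaults are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    masked = np.array(magnitude, dtype=float, copy=True)
    n_freq, n_frames = masked.shape

    # Frequency mask: f consecutive bins starting at f0 (f may be 0).
    f = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f0 = int(rng.integers(0, n_freq - f + 1))
    masked[f0:f0 + f, :] = 0.0

    # Time mask: t consecutive frames starting at t0 (t may be 0).
    t = int(rng.integers(0, min(max_time_width, n_frames) + 1))
    t0 = int(rng.integers(0, n_frames - t + 1))
    masked[:, t0:t0 + t] = 0.0

    return masked


# Usage on a synthetic magnitude spectrogram (257 bins, 100 frames).
rng = np.random.default_rng(0)
spec = np.abs(rng.standard_normal((257, 100)))
augmented = spec_augment(spec, rng=rng)
```

In a training setup of this kind, masking would typically be applied on the fly to each training example so that the network sees a different corrupted view every epoch; whether the paper does exactly this is not stated in the abstract.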

Published
2020
Pages
187–193
Journal
Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland, vol. 2020, no. 11, ISSN 2312-2846
Proceedings
Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop
Conference
Odyssey 2020: The Speaker and Language Recognition Workshop, Tokyo, JP
Publisher
International Speech Communication Association
Place
Tokyo
DOI
10.21437/Odyssey.2020-27
BibTeX
@inproceedings{BUT164069,
  author="Ladislav {Mošner} and Oldřich {Plchot} and Johan Andréas {Rohdin} and Jan {Černocký}",
  title="Utilizing VOiCES dataset for multichannel speaker verification with beamforming",
  booktitle="Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop",
  year="2020",
  journal="Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland",
  volume="2020",
  number="11",
  pages="187--193",
  publisher="International Speech Communication Association",
  address="Tokyo",
  doi="10.21437/Odyssey.2020-27",
  issn="2312-2846",
  url="https://www.isca-speech.org/archive/Odyssey_2020/abstracts/80.html"
}