Publication Details

Optimizing Bayesian HMM Based X-Vector Clustering for the Second DIHARD Speech Diarization Challenge

DIEZ SÁNCHEZ, M.; BURGET, L.; LANDINI, F.; WANG, S.; ČERNOCKÝ, J. Optimizing Bayesian HMM Based X-Vector Clustering for the Second DIHARD Speech Diarization Challenge. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020. p. 6519-6523. ISBN: 978-1-5090-6631-5.
Czech title
Optimalizace bayesovského shlukování x-vektorů založených na HMM pro druhou soutěž DIHARD v diarizaci řeči
Type
conference paper
Language
English
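Authors
Mireia Diez Sánchez, Lukáš Burget, Federico Nicolás Landini, Shuai Wang, Jan Černocký
URL
https://ieeexplore.ieee.org/document/9053982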
Keywords

Speaker Diarization, Variational Bayes, HMM, x-vector, DIHARD

Abstract

This paper presents an analysis of our diarization system winning the second DIHARD speech diarization challenge, track 1. This system is based on clustering x-vector speaker embeddings extracted every 0.25 s from short segments of the input recording. In this paper, we focus on the two x-vector clustering methods employed, namely Agglomerative Hierarchical Clustering followed by a clustering based on Bayesian Hidden Markov Model (BHMM). Even though the system submitted to the challenge had further post-processing steps, we will show that using this BHMM solely is enough to achieve the best performance in the challenge. The analysis will show improvements achieved by optimizing individual processing steps, including a simple procedure to effectively perform "domain adaptation" by Probabilistic Linear Discriminant Analysis model interpolation. All experiments are performed in the DIHARD II evaluation framework.
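
As an illustration of the first clustering stage named in the abstract, the following is a minimal sketch of average-linkage Agglomerative Hierarchical Clustering over a handful of embedding vectors. It is not the authors' implementation. The cosine similarity, the stopping `threshold`, the function names `cosine` and `ahc`, and the toy two-dimensional vectors are all assumptions made for illustration; the record does not state which similarity measure or threshold the system uses, and the subsequent BHMM stage is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (an assumed stand-in metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def ahc(embeddings, threshold):
    """Average-linkage AHC: repeatedly merge the most similar pair of clusters
    until the best pairwise similarity falls below `threshold`.
    Returns one integer cluster label per embedding."""
    clusters = [[i] for i in range(len(embeddings))]
    # Precompute all pairwise similarities between individual embeddings.
    sim = [[cosine(a, b) for b in embeddings] for a in embeddings]

    def avg_sim(c1, c2):
        # Average linkage: mean similarity over all cross-cluster pairs.
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (p, q) = max(
            (avg_sim(clusters[p], clusters[q]), (p, q))
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if best < threshold:
            break  # no remaining pair is similar enough to merge
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(embeddings)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


# Toy data (hypothetical): two well-separated groups of "embeddings".
toy = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ahc(toy, threshold=0.5)
print(labels)
```

In a pipeline like the one the abstract describes, labels of this kind would serve as the starting point for the second, BHMM-based clustering step.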

Published
2020
Pages
6519–6523
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), Barcelona, ES
ISBN
978-1-5090-6631-5
Publisher
IEEE Signal Processing Society
Place
Barcelona
DOI
10.1109/ICASSP40776.2020.9053982
UT WoS
000615970406156
BibTeX
@inproceedings{BUT163963,
  author="Mireia {Diez Sánchez} and Lukáš {Burget} and Federico Nicolás {Landini} and Shuai {Wang} and Jan {Černocký}",
  title="Optimizing Bayesian {HMM} Based X-Vector Clustering for the Second {DIHARD} Speech Diarization Challenge",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2020",
  pages="6519--6523",
  publisher="IEEE Signal Processing Society",
  address="Barcelona",
  doi="10.1109/ICASSP40776.2020.9053982",
  isbn="978-1-5090-6631-5",
  url="https://ieeexplore.ieee.org/document/9053982"
}