Product Details
Bayesian HMM based x-vector clustering - VBx
Created: 2020
Landini Federico Nicolás (RG SPEECH)
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Speaker Diarization, Variational Bayes, HMM, x-vector, DIHARD
Diarization is the task of determining the number of speakers in a recording and "who speaks when". It is part of speech data mining. The software provides a full implementation of a Bayesian approach to speaker diarization that uses low-dimensional neural representations of speakers (x-vectors) extracted from individual segments. It follows the Brno University of Technology (BUT) recipe for Track 1 of the Second DIHARD Diarization Challenge, which BUT won. The pipeline consists of computing filter-bank features, computing x-vectors, performing Agglomerative Hierarchical Clustering (AHC) on the x-vectors to produce an initialization, applying a Variational Bayes HMM over the x-vectors to produce the diarization output, and scoring that output. The software is written in Python and released as open source under the Apache License.
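The Agglomerative Hierarchical Clustering initialization named in the description can be illustrated with a minimal sketch. This is not the VBx code: cosine similarity, average linkage, and a similarity threshold as the stopping rule are assumptions made here for illustration, the names `cosine`, `ahc_init`, and `threshold` are hypothetical, and the toy 3-dimensional vectors only stand in for real x-vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def ahc_init(xvectors, threshold):
    """Average-linkage AHC; returns one integer cluster label per x-vector.

    Merging stops once the highest average cosine similarity between any
    two clusters falls below `threshold` (a hypothetical tuning parameter).
    """
    n = len(xvectors)
    # Pairwise similarities between all segment-level vectors.
    sim = [[cosine(xvectors[i], xvectors[j]) for j in range(n)]
           for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per segment

    while len(clusters) > 1:
        best, pair = -2.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(sim[i][j]
                            for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if avg > best:
                    best, pair = avg, (a, b)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        a, b = pair
        clusters[a].extend(clusters[b])
        del clusters[b]

    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy stand-ins for x-vectors: three segments from each of two "speakers".
xvectors = [
    [1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [1.0, 0.0, 0.05],
    [0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.05, 1.0, 0.0],
]
labels = ahc_init(xvectors, threshold=0.5)
print(labels)
```

In a pipeline like the one described, per-segment labels of this kind would only seed the Variational Bayes HMM step, which then refines the assignment and the number of speakers.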
Modern methods for processing, analysis and visualization of multimedia and 3D data, BUT, BUT internal projects, FIT-S-20-6460, 2020-2023, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grant projects of excellence in basic research EXPRO - 2019, GX19-26934X, 2019-2023, running
Robust SPEAKER DIarization systems using Bayesian inferenCE and deep learning methods, EU, Horizon 2020, 2017-2019, running