Publication Details
Toroidal Probabilistic Spherical Discriminant Analysis
Brummer Johan Nikolaas Langenhoven, Dr.
Swart Albert du Preez
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
speaker recognition, PSDA, Von Mises-Fisher
In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring back-ends are commonly used, namely cosine scoring and PLDA. We have recently proposed PSDA, an analog to PLDA that uses Von Mises-Fisher distributions instead of Gaussians. In this paper, we present toroidal PSDA (T-PSDA). It extends PSDA with the ability to model within and between-speaker variabilities in toroidal submanifolds of the hypersphere. Like PLDA and PSDA, the model allows closed-form scoring and closed-form EM updates for training. On VoxCeleb, we find T-PSDA accuracy on par with cosine scoring, while PLDA accuracy is inferior. On NIST SRE'21 we find that T-PSDA gives large accuracy gains compared to both cosine scoring and PLDA.
@inproceedings{BUT185199,
author="Anna {Silnova} and Johan Nikolaas Langenhoven {Brummer} and Albert du Preez {Swart} and Lukáš {Burget}",
title="Toroidal Probabilistic Spherical Discriminant Analysis",
booktitle="Proceedings of ICASSP 2023",
year="2023",
pages="1--5",
publisher="IEEE Signal Processing Society",
address="Rhodes Island",
doi="10.1109/ICASSP49357.2023.10095580",
isbn="978-1-7281-6327-7",
url="https://ieeexplore.ieee.org/document/10095580"
}