Publication Details
Toroidal Probabilistic Spherical Discriminant Analysis
Brummer Johan Nikolaas Langenhoven, Dr.
Swart Albert du Preez
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
speaker recognition, PSDA, Von Mises-Fisher
In speaker recognition, where speech segments are mapped to embeddings on the unit
hypersphere, two scoring back-ends are commonly used, namely cosine scoring and
PLDA. We have recently proposed PSDA, an analog to PLDA that uses Von
Mises-Fisher distributions instead of Gaussians. In this paper, we present
toroidal PSDA (T-PSDA). It extends PSDA with the ability to model within- and
between-speaker variabilities in toroidal submanifolds of the hypersphere. Like
PLDA and PSDA, the model allows closed-form scoring and closed-form EM updates
for training. On VoxCeleb, we find T-PSDA accuracy on par with cosine scoring,
while PLDA accuracy is inferior. On NIST SRE'21, we find that T-PSDA gives large
accuracy gains compared to both cosine scoring and PLDA.
author="Anna {Silnova} and Johan Nikolaas Langenhoven {Brummer} and Albert du Preez {Swart} and Lukáš {Burget}",
title="Toroidal Probabilistic Spherical Discriminant Analysis",
booktitle="Proceedings of ICASSP 2023",
publisher="IEEE Signal Processing Society",
address="Rhodes Island",