Publication Details
Toroidal Probabilistic Spherical Discriminant Analysis
Brummer Johan Nikolaas Langenhoven, Dr.
Swart Albert du Preez
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
speaker recognition, PSDA, Von Mises-Fisher
In speaker recognition, where speech segments are mapped to
embeddings on the unit hypersphere, two scoring back-ends are
commonly used, namely cosine scoring and PLDA. We have
recently proposed PSDA, an analog to PLDA that uses Von
Mises-Fisher distributions instead of Gaussians. In this paper,
we present toroidal PSDA (T-PSDA). It extends PSDA with
the ability to model within- and between-speaker variabilities
in toroidal submanifolds of the hypersphere. Like PLDA and
PSDA, the model allows closed-form scoring and closed-form
EM updates for training. On VoxCeleb, we find T-PSDA accuracy
on par with cosine scoring, while PLDA accuracy is inferior.
On NIST SRE'21 we find that T-PSDA gives large accuracy
gains compared to both cosine scoring and PLDA.
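
For reference, the cosine-scoring baseline named in the abstract can be sketched as follows. This is a minimal illustration, assuming embeddings are plain lists of floats; the function names are hypothetical and not taken from the paper, and the PLDA, PSDA, and T-PSDA back-ends are not shown.

```python
import math

def length_normalize(x):
    """Scale an embedding to unit length, i.e. place it on the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in x]

def cosine_score(enroll, test):
    """Cosine similarity of two embeddings (hypothetical helper).

    Returns a value in [-1, 1]; a higher score suggests the two
    segments are more likely to come from the same speaker.
    """
    e = length_normalize(enroll)
    t = length_normalize(test)
    return sum(a * b for a, b in zip(e, t))
```

Because both inputs are length-normalized first, the score depends only on the angle between the embeddings, not on their magnitudes.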
@inproceedings{BUT185199,
author="Anna {Silnova} and Johan Nikolaas Langenhoven {Brummer} and Albert du Preez {Swart} and Lukáš {Burget}",
title="Toroidal Probabilistic Spherical Discriminant Analysis",
booktitle="Proceedings of ICASSP 2023",
year="2023",
pages="1--5",
publisher="IEEE Signal Processing Society",
address="Rhodes Island",
doi="10.1109/ICASSP49357.2023.10095580",
isbn="978-1-7281-6327-7",
url="https://ieeexplore.ieee.org/document/10095580"
}