Publication Details

Compact Network for Speakerbeam Target Speaker Extraction

DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; ARAKI, S.; NAKATANI, T. Compact Network for Speakerbeam Target Speaker Extraction. In Proceedings of ICASSP. Brighton: IEEE Signal Processing Society, 2019. p. 6965-6969. ISBN: 978-1-5386-4658-8.
Czech title
Kompaktní síť pro SpeakerBeam extrakci mluvčího
Type
conference paper
Language
English
Authors
DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; ARAKI, S.; NAKATANI, T.
URL
https://ieeexplore.ieee.org/document/8683087
Keywords

Target speech extraction, Neural network, Adaptation, Auxiliary feature, Speech enhancement

Abstract

Speech separation, which separates a mixture of speech signals into its individual sources, has been an active research topic for a long time and has seen recent progress with the advent of deep learning. A related problem is target speaker extraction, i.e. extracting only the speech of a target speaker from a mixture, given characteristics of that speaker's voice. We have recently proposed SpeakerBeam, a neural network-based target speaker extraction method. SpeakerBeam uses a speech extraction network that is adapted to the target speaker using auxiliary features derived from an adaptation utterance of that speaker. Initially, we implemented SpeakerBeam with a factorized adaptation layer, which consists of several parallel linear transformations combined using weights derived from the auxiliary features. The factorized layer is effective for target speech extraction, but it requires a large number of parameters. In this paper, we propose to simply scale the activations of a hidden layer of the speech extraction network with weights derived from the auxiliary features. This simpler approach reduces the number of model parameters by up to 60%, making it much more practical, while maintaining a similar level of performance. We tested our approach on simulated and real noisy and reverberant mixtures, showing the potential of SpeakerBeam for real-life applications. Moreover, we showed that the speech extraction performance of SpeakerBeam compares favorably with that of a state-of-the-art speech separation method with a similar network configuration.
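
The difference between the two adaptation layers described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the dimensions, the random weights, and the inclusion of bias terms in the factorized layer are all assumptions made for illustration, and the adaptation weights are drawn at random here rather than computed from an adaptation utterance.

```python
import numpy as np


def factorized_layer(h, alpha, weights, biases):
    """Weighted sum of parallel linear transforms: sum_i alpha[i] * (W_i @ h + b_i).

    Shapes (illustrative): h (D,), alpha (I,), weights (I, D, D), biases (I, D).
    """
    # Apply every parallel transform to h at once; result has shape (I, D).
    outputs = np.einsum("iod,d->io", weights, h) + biases
    # Combine the I outputs using the speaker-dependent weights.
    return alpha @ outputs


def scaling_layer(h, alpha):
    """Element-wise scaling of hidden activations; alpha has the same shape as h."""
    return alpha * h


# Hypothetical toy sizes, chosen only to keep the example small.
rng = np.random.default_rng(0)
hidden_dim, num_factors = 8, 3

h = rng.standard_normal(hidden_dim)  # hidden activations for one frame
weights = rng.standard_normal((num_factors, hidden_dim, hidden_dim))
biases = rng.standard_normal((num_factors, hidden_dim))

# Stand-ins for the weights an auxiliary network would derive from the
# adaptation utterance (assumption: random values instead of a real network).
alpha_factor = rng.standard_normal(num_factors)
alpha_scale = rng.standard_normal(hidden_dim)

out_factorized = factorized_layer(h, alpha_factor, weights, biases)
out_scaled = scaling_layer(h, alpha_scale)

# Trainable parameters inside the adaptation layer itself.
factorized_params = weights.size + biases.size
scaling_params = 0  # the scaling layer only multiplies by alpha

print("factorized layer parameters:", factorized_params)
print("scaling layer parameters:", scaling_params)
```

The parameter counts printed here cover only this toy layer. The reduction of up to 60% reported in the abstract refers to the complete model, which this sketch does not reproduce.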

Published
2019
Pages
6965–6969
Proceedings
Proceedings of ICASSP
ISBN
978-1-5386-4658-8
Publisher
IEEE Signal Processing Society
Place
Brighton
DOI
10.1109/ICASSP.2019.8683087
UT WoS
000482554007040
EID Scopus
BibTeX
@inproceedings{BUT160003,
  author="DELCROIX, M. and ŽMOLÍKOVÁ, K. and OCHIAI, T. and KINOSHITA, K. and ARAKI, S. and NAKATANI, T.",
  title="Compact Network for Speakerbeam Target Speaker Extraction",
  booktitle="Proceedings of ICASSP",
  year="2019",
  pages="6965--6969",
  publisher="IEEE Signal Processing Society",
  address="Brighton",
  doi="10.1109/ICASSP.2019.8683087",
  isbn="978-1-5386-4658-8",
  url="https://ieeexplore.ieee.org/document/8683087"
}