Publication Details

Developing a Speech Activity Detection System for the DARPA RATS Program

NG, T.; ZHANG, B.; NGUYEN, L.; MATSOUKAS, S.; ZHOU, X.; MESGARANI, N.; VESELÝ, K.; MATĚJKA, P. Developing a Speech Activity Detection System for the DARPA RATS Program. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772.

Czech title

Vývoj systému pro detekci řečové aktivity pro program DARPA RATS

Type

conference paper

Language

English

Authors

Ng Tim
Zhang Bing
Nguyen Long
Matsoukas Spyros
Zhou Xinhui
Mesgarani Nima
Veselý Karel, Ing., Ph.D. (DCGM)
Matějka Pavel, Ing., Ph.D.

URL

http://www.isca-speech.org/archive/interspeech_2012/i12_1969.html

Keywords

speech activity detection, noisy speech

Abstract

In this paper we present the SAD system developed by the Patrol team for the DARPA RATS phase 1 evaluation. The system achieves high accuracy on audio from noisy radio communication channels.

Annotation

This paper describes the speech activity detection (SAD) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We present two approaches to SAD, one based on Gaussian mixture models and one based on multi-layer perceptrons. We show that significant gains in SAD accuracy can be obtained by careful design of the acoustic front end, feature normalization, incorporation of long-span features via data-driven dimensionality-reducing transforms, and channel-dependent modeling. We also present a novel technique for normalizing detection scores from different systems for the purpose of system combination.

Published

2012

Pages

1–4

Journal

Proceedings of Interspeech, vol. 2012, no. 9, ISSN 1990-9772

Proceedings

Proceedings of Interspeech 2012

Conference

Interspeech Conference, Portland, US

ISBN

978-1-62276-759-5

Publisher

International Speech Communication Association

Place

Portland, Oregon

BibTeX

@inproceedings{BUT97014,
  author="Tim {Ng} and Bing {Zhang} and Long {Nguyen} and Spyros {Matsoukas} and Xinhui {Zhou} and Nima {Mesgarani} and Karel {Veselý} and Pavel {Matějka}",
  title="Developing a Speech Activity Detection System for the DARPA RATS Program",
  booktitle="Proceedings of Interspeech 2012",
  year="2012",
  journal="Proceedings of Interspeech",
  volume="2012",
  number="9",
  pages="1--4",
  publisher="International Speech Communication Association",
  address="Portland, Oregon",
  isbn="978-1-62276-759-5",
  issn="1990-9772",
  url="http://www.isca-speech.org/archive/interspeech_2012/i12_1969.html"
}