Unsupervised Feature Learning for Speech and Music Detection in Radio Broadcasts.

Publication: Chapter in book/report/conference proceedings, conference contribution, peer-reviewed

Abstract

Detecting speech and music is an elementary step in extracting information from radio broadcasts. Existing solutions either rely on general-purpose audio features, or build on features specifically engineered for the task. Interpreting spectrograms as images, we can apply unsupervised feature learning methods from computer vision instead. In this work, we show that features learned by a mean-covariance Restricted Boltzmann Machine partly resemble engineered features, but outperform three hand-crafted feature sets in speech and music detection on a large corpus of radio recordings. Our results demonstrate that unsupervised learning is a powerful alternative to knowledge engineering.
Original language: English
Title of host publication: Proceedings of the 15th Int. Conference on Digital Audio Effects (DAFx-12)
Number of pages: 8
Publication status: Published - Sep. 2012

Fields of science

  • 102 Computer Sciences
  • 102001 Artificial Intelligence
  • 102003 Image processing

JKU Focus areas

  • Computation in Informatics and Mathematics
  • TNF General
