Towards Light-weight, Real-time-capable Singing Voice Detection

Publication: Chapter in book/report/conference proceedings › Conference contribution › Peer-reviewed

Abstract

We present a study that indicates that singing voice detection – the problem of identifying those parts of a polyphonic audio recording where one or several persons sing(s) – can be realised with substantially fewer (and less expensive) features than used in current state-of-the-art methods. Essentially, we show that MFCCs alone, if appropriately optimised and used with a suitable classifier, are sufficient to achieve detection results that seem on par with the state of the art – at least as far as this can be ascertained by direct, fair comparisons to existing systems. To make this comparison, we select three relevant publications from the literature where publicly accessible training/test data were used, and where the experimental setup is described in enough detail for us to perform fair comparison experiments. The result of the experiments is that with our simple, optimised MFCC-based classifier we achieve at least comparable identification results, but with (in some cases much) less computational effort, and without any need for extensive lookahead, thus paving the way to on-line, real-time voice detection applications.
Original language: English
Title of host publication: Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013)
Editors: Alceu de Souza Britto, Fabien Gouyon, Simon Dixon
Pages: 53-58
Number of pages: 6
ISBN (electronic): 9780615900650
Publication status: Published - Oct. 2013

Fields of science

  • 102 Computer Sciences
  • 102001 Artificial Intelligence
  • 102003 Image processing

JKU focus areas

  • Computation in Informatics and Mathematics
  • TNF General
