Towards Light-weight, Real-time-capable Singing Voice Detection

  • Reinhard Sonnleitner (Speaker)

Activity: Talk or presentation, Poster presentation

Description

We present a study that indicates that singing voice detection – the problem of identifying those parts of a polyphonic audio recording where one or more people sing – can be realised with substantially fewer (and less expensive) features than are used in current state-of-the-art methods. Essentially, we show that MFCCs alone, if appropriately optimised and used with a suitable classifier, are sufficient to achieve detection results that seem on par with the state of the art – at least as far as this can be ascertained by direct, fair comparisons to existing systems. To make this comparison, we select three relevant publications from the literature where publicly accessible training/test data were used, and where the experimental setup is described in enough detail for us to perform fair comparison experiments. The result of the experiments is that with our simple, optimised MFCC-based classifier we achieve at least comparable identification results, but with less (in some cases much less) computational effort, and without any need for extensive lookahead, thus paving the way to on-line, real-time voice detection applications.
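To illustrate the kind of feature the abstract refers to, the following minimal sketch computes MFCCs for a single audio frame using only the Python standard library. It is not the authors' implementation: the function names, the Hamming window, the 20-filter mel bank, the 13 coefficients and the unnormalised DCT-II are all assumptions chosen for illustration, and the classifier is not shown. Because each frame is processed on its own, a computation like this needs no lookahead beyond the frame itself.

```python
import cmath
import math


def hz_to_mel(f):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return its one-sided power spectrum.

    A naive O(n^2) DFT keeps the sketch dependency-free; a real system
    would use an FFT.
    """
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [
        int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
        for m in mel_points
    ]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising slope
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope, peak of 1 at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return n_coeffs MFCCs for one frame (hypothetical parameter defaults)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log mel-band energies, floored to avoid log(0) on silent frames.
    log_energies = [
        math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-10))
        for row in bank
    ]
    # Unnormalised DCT-II decorrelates the log energies.
    return [
        sum(
            e * math.cos(math.pi * c * (j + 0.5) / n_filters)
            for j, e in enumerate(log_energies)
        )
        for c in range(n_coeffs)
    ]


# Usage: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
frame = [math.sin(2.0 * math.pi * 440.0 * i / 8000.0) for i in range(256)]
coeffs = mfcc_frame(frame, 8000)
```

In a detection setting, a vector like `coeffs` would be computed for every frame and passed to a trained classifier that labels the frame as vocal or non-vocal; which classifier and which MFCC settings work best is exactly what the study investigates.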
Period: 06 Nov 2013
Event title: Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013)
Event type: Conference
Location: Brazil

Fields of science

  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Engineering and Natural Sciences (in general)