Improved Musical Onset Detection with Convolutional Neural Networks

Jan Schlüter, Sebastian Böck

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

Musical onset detection is one of the most elementary tasks in music analysis, but still only solved imperfectly for polyphonic music signals. Interpreted as a computer vision problem in spectrograms, Convolutional Neural Networks (CNNs) seem to be an ideal fit. On a dataset of about 100 minutes of music with 26k annotated onsets, we show that CNNs outperform the previous state-of-the-art while requiring less manual preprocessing. Investigating their inner workings, we find two key advantages over hand-designed methods: Using separate detectors for percussive and harmonic onsets, and combining results from many minor variations of the same scheme. The results suggest that even for well-understood signal processing tasks, machine learning can be superior to knowledge engineering.
Original language: English
Title of host publication: Proceedings of the 39th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Number of pages: 6
Publication status: Published - May 2014

Fields of science

  • 202002 Audiovisual media
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102015 Information systems

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Engineering and Natural Sciences (in general)
