Metrical Analysis of Musical Audio Using Probabilistic Models

Florian Krebs

Research output: Thesis (doctoral thesis)

Abstract

Due to the exploding amount of available music in recent years, media collections can no longer be managed manually, which makes automatic audio analysis crucial for content-based search, organisation, and processing of data. This thesis focuses on the automatic extraction of a metrical grid, determined by beats, downbeats, and time signature, from a piece of music. I propose several algorithms to tackle this problem, all comprising three stages: First, (low-level) features are extracted from the audio signal. Second, an acoustic model maps these features to probabilities in the music domain. Third, a probabilistic sequence model finds the most probable sequence of labels under the model assumptions. This thesis provides contributions to the second and third stages. I (i) explore acoustic models based on machine learning methods, and (ii) develop models and algorithms for efficient probabilistic inference for both online and offline scenarios. Further, I design applications such as an automatic drummer which listens to and accompanies a musician in a live setting. The most recent algorithms developed in this thesis exhibit state-of-the-art performance and clearly demonstrate the superiority of systems incorporating machine learning over hand-designed systems, which were prevalent at the time of starting this thesis. All algorithms developed in this thesis are publicly available as open-source software. I also publish beat and downbeat annotations for the Ballroom dataset to foster further research in this area.
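
The third stage described in the abstract (finding the most probable label sequence) can be illustrated with a minimal sketch. It is not the model developed in the thesis: it assumes a toy hidden Markov model whose states are frame positions within a beat of fixed, known length, and it decodes the most probable state path with the Viterbi algorithm. The function names `viterbi` and `track_beats`, the `period` parameter, and the activation values are hypothetical and chosen only for illustration; the activations stand in for the per-frame beat probabilities an acoustic model (stage two) might output.

```python
import math

NEG_INF = float("-inf")


def viterbi(log_init, log_trans, log_obs):
    """Return the most probable state path of an HMM (log domain).

    log_init[j]     : log P(first state = j)
    log_trans[i][j] : log P(next state = j | current state = i)
    log_obs[t][j]   : log P(observation at frame t | state = j)
    """
    n_states = len(log_init)
    delta = [log_init[j] + log_obs[0][j] for j in range(n_states)]
    backptr = []
    for t in range(1, len(log_obs)):
        new_delta, ptr = [], []
        for j in range(n_states):
            # Best predecessor of state j at frame t.
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_trans[i][j])
            new_delta.append(delta[best_i] + log_trans[best_i][j]
                             + log_obs[t][j])
            ptr.append(best_i)
        delta = new_delta
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def track_beats(activations, period):
    """Toy beat tracker: assumes a fixed beat length of `period` frames.

    State k means "k frames after the last beat"; state 0 is a beat frame.
    """
    eps = 1e-6
    log_init = [math.log(1.0 / period)] * period  # uniform initial phase
    # Deterministic cyclic transitions: k -> (k + 1) mod period.
    log_trans = [[NEG_INF] * period for _ in range(period)]
    for i in range(period):
        log_trans[i][(i + 1) % period] = 0.0
    log_obs = []
    for a in activations:
        a = min(max(a, eps), 1.0 - eps)  # avoid log(0)
        # Beat state explains high activations, other states low ones.
        log_obs.append([math.log(a)] + [math.log(1.0 - a)] * (period - 1))
    path = viterbi(log_init, log_trans, log_obs)
    return [t for t, s in enumerate(path) if s == 0]


# Hypothetical activations with peaks every 4 frames; the peak at
# frame 6 is weak, but the sequence model still places a beat there.
activations = [0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.9, 0.1]
print(track_beats(activations, period=4))  # [2, 6, 10]
```

Because the transitions in this sketch are deterministic, decoding reduces to picking the best beat phase. A model that also tracks tempo, downbeats, or time signature would need a larger state space and non-trivial transition probabilities, which this example deliberately leaves out.
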
Original language: English
Publication status: Published - Dec 2016

Fields of science

  • 202002 Audiovisual media
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102015 Information systems

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Engineering and Natural Sciences (in general)
