Towards Robust and Truly Large-Scale Audio-Sheet Music Retrieval

Luis Carvalho, Gerhard Widmer

Research output: Chapter in Book/Report/Conference proceedingConference proceedingspeer-review

Abstract

A range of applications of multi-modal music information retrieval is centred around the problem of connecting large collections of sheet music (images) to corresponding audio recordings, that is, identifying pairs of audio and score excerpts that refer to the same musical content. One of the typical and most recent approaches to this task employs cross-modal deep learning architectures to learn joint embedding spaces that link the two distinct modalities - audio and sheet music images. While there has been steady improvement on this front over the past years, a number of open problems still prevent large-scale employment of this methodology. In this article we attempt to provide an insightful examination of the current developments on audio-sheet music retrieval via deep learning methods. We first identify a set of main challenges on the road towards robust and large-scale cross-modal music retrieval in real scenarios. We then highlight the steps we have taken so far to address some of these challenges, documenting step-by-step improvement along several dimensions. We conclude by analysing the remaining challenges and present ideas for solving these, in order to pave the way to a unified and robust methodology for cross-modal music retrieval.
Original languageEnglish
Title of host publicationProceedings of the 6th International Conference on Multimedia Information Processing and Retrieval (MIPR)
Pages31-36
Number of pages6
ISBN (Electronic)9798350307818
DOIs
Publication statusPublished - 2023

Publication series

NameProceedings - 2023 IEEE 6th International Conference on Multimedia Information Processing and Retrieval, MIPR 2023

Fields of science

  • 202002 Audiovisual media
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102015 Information systems

JKU Focus areas

  • Digital Transformation

Cite this