On Evaluation of Inter- and Intra-Rater Agreement in Music Recommendation

Arthur Flexer, T. Lallai, K. Rasi

Research output: Contribution to journalArticlepeer-review

Abstract

Our work is concerned with the subjective perception of music similarity in the context of music recommendation. We present two user studies to explore inter- and intra-rater agreement in quantification of general similarity between pieces of recommended music. Contrary to previous efforts, our test participants are of more uniform age and share a comparable musical background to lower variation within the participant group. The first study uses carefully curated song material from five distinct genres while the second uses songs from a single genre only, with almost all songs in both studies previously unknown to test participants. Repeating the listening tests with a two week lag shows that intra-rater agreement is higher than inter-rater agreement for both studies. Agreement for the single genre study is lower since genre of songs seems a major factor in judging similarity between songs. Mood of raters at test-time is found to have an influence on intra-rater agreement. We discuss the impacts of our results on evaluation of music recommenders and question the validity of experiments on general music similarity.
Original languageEnglish
Pages (from-to)182-194
Number of pages13
JournalTransactions of the International Society of Music Information Retrieval
Volume4
Issue number1
DOIs
Publication statusPublished - Nov 2021

Fields of science

  • 202002 Audiovisual media
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102015 Information systems

JKU Focus areas

  • Digital Transformation

Cite this