The Million Musical Tweet Dataset: What We Can Learn From Microblogs.

David Hauger, Markus Schedl, A. Kosir, Marko Tkalcic

Research output: Chapter in Book/Report/Conference proceeding > Conference proceedings > peer-review

Abstract

Microblogs and social media applications are continuously growing in reach and importance. Users of Twitter, currently the most popular microblogging platform, create more than a billion posts (called tweets) every week. Among the many types of information shared, some people post their music listening behavior, which is why Twitter has become interesting for the Music Information Retrieval (MIR) community. Depending on the device and personal settings, some users provide geographic coordinates for their microposts. Having continuously crawled and analyzed tweets for more than 500 days (17 months), we can now present the “Million Musical Tweet Dataset” (MMTD), the biggest publicly available source of microblog-based music listening histories that includes geographic, temporal, and other contextual information. This additional information sets the MMTD apart from other datasets providing music listening histories. We introduce the dataset, give basic statistics about its composition, and show how it allows new contextual music listening patterns to be detected by performing a comprehensive statistical investigation of the correlation between music taste and day of the week, hour of day, and country.
Original language: English
Title of host publication: Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013)
Number of pages: 6
Publication status: Published - 2013

Fields of science

  • 202002 Audiovisual media
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102015 Information systems

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Engineering and Natural Sciences (in general)
