Abstract
We explore frame-level audio feature learning for chord
recognition using artificial neural networks. We argue
that chroma vectors potentially hold enough
information to model the harmonic content of audio for chord
recognition, but that standard chroma extractors compute
features that are too noisy. This leads us to propose a learned
chroma feature extractor based on artificial neural networks.
It is trained to compute chroma features that encode
harmonic information important for chord recognition,
while being robust to irrelevant interference. We
achieve this by feeding the network an audio spectrum with
context instead of a single frame as input. This way, the
network can learn to selectively compensate for noise and resolve
harmonic ambiguities.
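
As a rough illustration of what "an audio spectrum with context" can look like as network input, the following sketch stacks each spectrogram frame with its neighbouring frames into one flat vector. It is not the authors' code: the function name `stack_context`, the context size, and the zero padding at the edges are all assumptions made for this example.

```python
import numpy as np


def stack_context(spec, context=7):
    """Build one flattened context window per spectrogram frame.

    spec:    array of shape (n_frames, n_bins), e.g. a magnitude spectrogram
    context: number of frames taken on each side (hypothetical value)

    Returns an array of shape (n_frames, (2 * context + 1) * n_bins).
    """
    n_frames, n_bins = spec.shape
    # Zero-pad along time so that edge frames also get a full window.
    # Zero padding is an assumption; other padding schemes are possible.
    padded = np.pad(spec, ((context, context), (0, 0)), mode="constant")
    width = 2 * context + 1
    # Window t covers original frames t - context .. t + context.
    windows = np.stack([padded[t:t + width] for t in range(n_frames)])
    return windows.reshape(n_frames, width * n_bins)
```

Each row of the result could then serve as the input for one frame, so that a model sees the surrounding frames as well as the centre frame when predicting the chroma vector for that frame.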
We compare the resulting features to hand-crafted ones
by using a simple linear frame-wise classifier for chord
recognition on various data sets. The results show that the
learned feature extractor produces superior chroma vectors
for chord recognition.
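
To make the evaluation idea concrete, a frame-wise linear classifier on 12-dimensional chroma vectors could be sketched as below. The abstract only says "simple linear frame-wise classifier"; the choice of softmax (multinomial logistic) regression trained by plain gradient descent, the function names, and the hyperparameters are assumptions for illustration, not the setup used in the paper.

```python
import numpy as np


def train_softmax(X, y, n_classes, lr=0.5, steps=300):
    """Fit a linear softmax classifier with batch gradient descent.

    X: array (n_frames, n_features), e.g. one chroma vector per frame
    y: integer chord labels of shape (n_frames,)
    lr, steps: hypothetical hyperparameters chosen for this sketch
    """
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(steps):
        logits = X @ W + b
        # Subtract the row maximum for numerical stability.
        logits = logits - logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy with respect to the logits.
        G = (P - Y) / n
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b


def predict(X, W, b):
    """Predict one chord class per frame, independently of other frames."""
    return np.argmax(X @ W + b, axis=1)
```

Because such a classifier is linear and sees one frame at a time, differences in recognition accuracy between feature types largely reflect the quality of the features rather than the power of the classifier.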
Original language | English |
---|---|
Title of host publication | Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR) |
Number of pages | 7 |
Publication status | Published - 2016 |
Fields of science
- 202002 Audiovisual media
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102015 Information systems
JKU Focus areas
- Computation in Informatics and Mathematics
- Engineering and Natural Sciences (in general)