Abstract
This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS). The fundamental idea of DDS is to exploit a class of powerful deep invertible density estimators called normalizing flows, to model the dictionary in a linear decomposition method such as NMF, effectively creating a bijection between the space of dictionary elements and the associated probability space, allowing a differentiable search through the dictionary space, guided by the estimated densities. As the initial formulation was a proof of concept with some practical limitations, we will present several steps towards making it scalable, hoping to improve both the computational complexity of the method and its signal decomposition capabilities. As a testbed for experimental evaluation, we choose the task of frame-level piano transcription, where the signal is to be decomposed into sources whose activity is attributed to individual piano notes. To highlight the impact of improved non-linear modelling of sources, we compare variants of our method to a linear overcomplete NMF baseline. Experimental results will show that even in the absence of additional constraints, our models produce increasingly sparse and precise decompositions, according to two pertinent evaluation measures.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 24th International Congress on Acoustics (ICA 2022) |
| Number of pages | 8 |
| Publication status | Published - Oct 2022 |
Fields of science
- 202002 Audiovisual media
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102015 Information systems
JKU Focus areas
- Digital Transformation