Abstract
In this report, we detail the CP-JKU submissions to the DCASE-2019 challenge Task 1 (acoustic scene classification) and Task 2 (audio tagging with noisy labels and minimal supervision). In all of our submissions, we use fully convolutional deep neural network architectures that are regularized with Receptive Field (RF) adjustments. We adjust the RF of variants of ResNet and DenseNet architectures to best fit the various audio processing tasks that use spectrogram features as input. Additionally, we propose novel CNN layers such as Frequency-Aware CNNs, and new noise compensation techniques such as Adaptive Weighting for Learning from Noisy Labels to cope with the complexities of each task. We prepared all of our submissions without the use of any external data. Our focus in this year’s submissions is to provide the best-performing single-model submission, using our proposed approaches.
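To make the notion of a receptive field adjustment concrete, the following minimal sketch computes the receptive field of a stack of convolution and pooling layers along one spectrogram axis, using the standard recurrence (the receptive field grows by `(kernel - 1)` times the accumulated stride at each layer). The function name `receptive_field` and both layer configurations are hypothetical illustrations and are not taken from the report. Replacing some 3x3 kernels with 1x1 kernels is shown only as one possible way to shrink the RF, not necessarily the method the authors used.

```python
def receptive_field(layers):
    """Receptive field along one axis for a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride) tuples, ordered from the
    input towards the output. Dilation is assumed to be 1 throughout.
    """
    rf = 1    # a single input bin sees only itself
    jump = 1  # distance in input bins between adjacent units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks (not from the report): two 3x3 convs, a 2x2 pooling, two more convs.
wide = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1)]
# Same depth, but every second conv uses a 1x1 kernel, which leaves the RF unchanged.
narrow = [(3, 1), (1, 1), (2, 2), (3, 1), (1, 1)]

print(receptive_field(wide))    # 14
print(receptive_field(narrow))  # 8
```

In this sketch both stacks have the same number of layers, yet the second one covers a smaller span of the input spectrogram, which illustrates how kernel sizes can limit the RF independently of network depth.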
Original language | English |
---|---|
Place of Publication | New York |
Publisher | Detection and Classification of Acoustic Scenes and Events 2019 Challenge |
Number of pages | 5 |
Publication status | Published - 2019 |
Fields of science
- 202002 Audiovisual media
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102015 Information systems
JKU Focus areas
- Digital Transformation