Acoustic scene classification with reject option based on resnets

Bernhard Lehner, Khaled Koutini, Christopher Schwarzlmuller, Thomas Gallien, Gerhard Widmer

Research output: Working paper and reports > Research report

Abstract

This technical report describes the submissions of the SAL/CP JKU team to Task 1 - Subtask C (classification on data that includes classes not encountered in the training data) of the DCASE 2019 challenge. Our method uses a ResNet variant adapted to spectrogram input in the context of Acoustic Scene Classification (ASC). The reject option is based on the logit values of the same networks. We do not use any of the provided external data sets, and we perform data augmentation only with the mixup technique [1]. The resulting system achieves classification accuracies of up to around 60% on the public Kaggle leaderboard, an improvement of around 14 percentage points over the official DCASE 2019 baseline.
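
The two techniques named in the abstract, mixup augmentation and a logit-based reject option, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the logits are turned into a reject decision, so thresholding the largest logit is an assumption made here, and the function names `mixup_batch` and `predict_with_reject`, the `alpha` value, the `threshold` value, and the `unknown_label` convention are all hypothetical.

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    alpha is the Beta-distribution parameter of mixup; 0.2 is an
    illustrative value, not one taken from the report.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # one mixing coefficient per batch
    perm = rng.permutation(len(x))        # partner index for every example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix


def predict_with_reject(logits, threshold, unknown_label=-1):
    """Return the arg-max class, or unknown_label when the top logit is low.

    Assumption: an input is rejected when its largest logit falls below
    a fixed threshold. The report may use a different rule.
    """
    logits = np.asarray(logits, dtype=float)
    best_class = logits.argmax(axis=1)
    best_logit = logits.max(axis=1)
    return np.where(best_logit >= threshold, best_class, unknown_label)


# Toy usage: 4 "spectrograms" of shape (2, 3) with 3 known classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 2, 3))
y = np.eye(3)[[0, 1, 2, 0]]
x_mix, y_mix = mixup_batch(x, y, rng=rng)

example_logits = [[4.0, 0.5, -1.0],   # confident: top logit above threshold
                  [0.2, 0.1, 0.3]]    # uncertain: top logit below threshold
decisions = predict_with_reject(example_logits, threshold=1.0)
```

In this sketch the second toy input is mapped to `unknown_label` because none of its logits reaches the threshold. In practice such a threshold would have to be tuned on held-out data.
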
Original language: English
Number of pages: 5
Publication status: Published - 2019

Fields of science

  • 202002 Audiovisual media
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102015 Information systems

JKU Focus areas

  • Digital Transformation
