Improving the Fusion of Outbreak Detection Methods with Supervised Learning

Moritz Kulessa, Eneldo Loza Mencía, Johannes Fürnkranz

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

Epidemiologists use a variety of statistical algorithms for the early detection of outbreaks. The practical usefulness of such methods depends heavily on the trade-off between the detection rate of outbreaks and the chance of raising a false alarm. Recent research has shown that the use of machine learning for the fusion of multiple statistical algorithms improves outbreak detection. Instead of relying only on the binary outputs (alarm or no alarm) of the statistical algorithms, we propose to make use of their p-values for training a fusion classifier. In addition, we show that adding contextual features and adapting the labeling of an epidemic period may further improve performance. For comparison and evaluation, a new measure is introduced which captures the performance of an outbreak detection method with respect to a low rate of false alarms more precisely than the measures used in previous work. We have performed experiments on synthetic data to evaluate our proposed approach and the adaptations in a controlled setting, and we used the cases of Salmonella and Campylobacter reported across Germany from 2001 to 2018 to evaluate it on real data. The experimental results show a substantial improvement on the synthetic data when p-values are used for learning. The results on real data are less clear. Inconsistencies in the data appearing under real conditions make it more challenging for the learning approach to identify valuable patterns for outbreak detection.
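The contrast between binary alarms and p-values as inputs to a fusion classifier can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two Poisson-baseline detectors, the logistic-regression fusion model, the simulated counts, and every function name and parameter value below are assumptions made for the example.

```python
import math
import random


def poisson_sf(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu), i.e. the p-value of seeing k or more cases."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def detector_p_values(history, count):
    """Two hypothetical detectors: Poisson tests against a long and a short baseline."""
    long_mu = sum(history) / len(history)
    recent = history[-4:]
    short_mu = sum(recent) / len(recent)
    return [poisson_sf(count, long_mu), poisson_sf(count, short_mu)]


def binarize(p_values, alpha=0.05):
    """Collapse p-values to alarm (1.0) / no alarm (0.0) at significance level alpha."""
    return [1.0 if p < alpha else 0.0 for p in p_values]


def sample_poisson(rng, mu):
    """Draw one Poisson(mu) sample with Knuth's multiplication method."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit a logistic-regression fusion model by stochastic gradient descent.

    Returns the weights [bias, w_1, ..., w_d].
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            w[0] -= lr * err
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * err * xj
    return w


# Simulated weekly counts (assumed setup): baseline rate 5, outbreaks add 4 cases on average.
rng = random.Random(0)
X_p, X_bin, y = [], [], []
for _ in range(300):
    history = [sample_poisson(rng, 5.0) for _ in range(12)]
    outbreak = rng.random() < 0.2
    count = sample_poisson(rng, 9.0 if outbreak else 5.0)
    p = detector_p_values(history, count)
    X_p.append(p)              # graded evidence from each detector
    X_bin.append(binarize(p))  # alarm / no alarm only
    y.append(1.0 if outbreak else 0.0)

w_p = train_logistic(X_p, y)      # fusion model trained on p-values
w_bin = train_logistic(X_bin, y)  # fusion model trained on binary alarms
print("weights on p-values:    ", w_p)
print("weights on binary alarms:", w_bin)
```

With binary inputs, two weeks whose p-values are 0.049 and 0.0001 look identical to the fusion model, while the p-value features keep that difference available for learning. Whether this helps on a given data set is an empirical question that this sketch does not answer.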
Original language: English
Title of host publication: Proceedings of the 16th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB-19)
Place of Publication: Bergamo, Italy
Publisher: Springer-Verlag
Pages: 55-66
Number of pages: 12
Publication status: Published - 2020

Fields of science

  • 303007 Epidemiology
  • 102015 Information systems
  • 102019 Machine learning
  • 102033 Data mining

JKU Focus areas

  • Digital Transformation
