Self-Adaptive and Local Strategies for a Smooth Treatment of Drifts in Data Streams

Ammar Shaker, Edwin Lughofer

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a new concept for handling drifts in data streams during on-line, evolving modeling processes in a regression context. Drifts require specific attention in evolving modeling methods because they usually change the underlying data distribution, making previously learned model parameters and structure outdated. Our approach comprises three new stages for appropriate drift handling: (1) drifts are not only detected but also quantified with a new, extended version of the Page-Hinkley test; (2) an adaptive forgetting factor changes over time and steers the degree of forgetting according to the current drift intensity in the data stream; (3) local forgetting factors address different local regions of the feature space with different forgetting intensities. The latter is achieved by using a fuzzy model architecture within stream learning, whose structural components (fuzzy rules) provide a local partitioning of the feature space and ensure smooth transitions of the drift handling topology between neighboring regions. Additionally, our approach provides an early drift recognition variant that relies on divergence measures; these indicate the degree of divergence in local parts of the feature space separately, before the global model error starts to rise significantly. It can thus be seen as an attempt at drift prevention on the global model level. The new approach is successfully evaluated and compared with fixed forgetting and no forgetting on high-dimensional real-world data streams that include different types of drifts.
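To make the first two stages more concrete, the following is a minimal sketch and not the authors' method. It implements the standard Page-Hinkley test on a stream of model errors, not the extended version proposed in the paper. It then maps the test statistic to a forgetting factor through a purely hypothetical linear rule. The names `PageHinkley` and `forgetting_factor`, and the parameters `delta`, `threshold` and `lam_min`, are assumptions chosen for illustration.

```python
class PageHinkley:
    """Standard Page-Hinkley test for an upward shift in the mean of a signal."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.n = 0                  # number of samples seen so far
        self.mean = 0.0             # running mean of the signal
        self.cum = 0.0              # cumulative deviation from the running mean
        self.cum_min = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one sample (e.g. an absolute model error); return the PH statistic."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min

    def drift_detected(self):
        """True once the statistic exceeds the alarm threshold."""
        return (self.cum - self.cum_min) > self.threshold


def forgetting_factor(ph_stat, threshold, lam_min=0.9):
    """Hypothetical mapping from drift intensity to a forgetting factor.

    Returns 1.0 (no forgetting) while the statistic is at or below the
    threshold and decreases linearly to lam_min as the statistic grows
    to twice the threshold. This rule is an illustrative assumption only.
    """
    intensity = min(max((ph_stat - threshold) / threshold, 0.0), 1.0)
    return 1.0 - intensity * (1.0 - lam_min)


if __name__ == "__main__":
    detector = PageHinkley()
    # Synthetic error stream: small errors, then an abrupt rise (a drift).
    stream = [0.1] * 200 + [1.0] * 50
    for t, err in enumerate(stream):
        ph = detector.update(err)
        lam = forgetting_factor(ph, detector.threshold)
        if detector.drift_detected():
            print(f"drift flagged at sample {t}, forgetting factor {lam:.3f}")
            break
```

A local variant in the spirit of stage (3) would keep one such forgetting factor per fuzzy rule instead of a single global value; that extension is not shown here.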
Original language: English
Pages (from-to): 239-257
Number of pages: 19
Journal: Evolving Systems
Volume: 5
Issue number: 4
Publication status: Published - 2014

Fields of science

  • 101 Mathematics
  • 101013 Mathematical logic
  • 101024 Probability theory
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102019 Machine learning
  • 603109 Logic
  • 202027 Mechatronics

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Mechatronics and Information Processing
  • Nano-, Bio- and Polymer-Systems: From Structure to Function
