A Dynamic Split-and-Merge Approach for Evolving Cluster Models

Edwin Lughofer

Research output: Contribution to journal › Article › peer-review

Abstract

This paper describes new dynamic split-and-merge operations for evolving cluster models, which are learned incrementally and expanded on-the-fly from data streams. These operations are necessary to resolve the effects of cluster fusion and cluster delamination, which may appear over time in data stream learning. We propose two new criteria for cluster merging: a touching and a homogeneity criterion for two ellipsoidal clusters. The splitting criterion for an updated cluster applies a 2-means algorithm to its sub-samples and compares the quality of the split cluster with that of the original cluster by using a penalized Bayesian information criterion; the cluster partition of higher quality is retained for the next incremental update cycle. This new approach is evaluated on two-dimensional and high-dimensional streaming clustering data sets, in which feature ranges are extended and clusters evolve over time, and on two large streams of classification data, each containing around 500K samples. The results show that the new split-and-merge approach (a) produces more reliable cluster partitions than conventional evolving clustering techniques and (b) reduces impurity and entropy of cluster partitions evolved on the classification data sets.
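
The splitting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact form of the penalized Bayesian information criterion, so the code below assumes a standard BIC on diagonal-covariance Gaussians with a hypothetical `penalty_weight` factor, and it runs in batch mode on a stored sample rather than incrementally. All function names (`two_means`, `gaussian_loglik`, `bic`, `should_split`) are illustrative.

```python
import numpy as np


def two_means(X, n_iter=20, seed=0):
    """Plain 2-means (Lloyd's algorithm); returns a 0/1 label per sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=2, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Distance of every sample to both centers, shape (n, 2).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels


def gaussian_loglik(X):
    """Log-likelihood of X under one diagonal Gaussian fitted by maximum likelihood."""
    n, d = X.shape
    var = X.var(axis=0) + 1e-9  # small constant guards against log(0)
    # At the ML estimate the standardized squared residuals sum to n * d.
    return -0.5 * n * (d * np.log(2.0 * np.pi) + np.log(var).sum() + d)


def bic(loglik, n_params, n, penalty_weight=1.0):
    """BIC, lower is better; penalty_weight is an assumed tuning factor."""
    return -2.0 * loglik + penalty_weight * n_params * np.log(n)


def should_split(X, penalty_weight=1.0):
    """Return True if the 2-means split scores a lower BIC than the single cluster."""
    n, d = X.shape
    labels = two_means(X)
    parts = [X[labels == k] for k in range(2)]
    if min(len(p) for p in parts) < 2:
        return False  # degenerate split, keep the original cluster
    # One cluster: d means + d variances.
    bic_one = bic(gaussian_loglik(X), 2 * d, n, penalty_weight)
    # Two clusters: hard-assignment likelihood including the mixing proportions.
    ll_two = sum(gaussian_loglik(p) + len(p) * np.log(len(p) / n) for p in parts)
    # Two clusters: 2 * (d means + d variances) + 1 mixing weight.
    bic_two = bic(ll_two, 4 * d + 1, n, penalty_weight)
    return bool(bic_two < bic_one)
```

Calling `should_split(samples)` on the sub-samples of an updated cluster then indicates which partition to retain under these assumptions. Raising `penalty_weight` makes splitting more conservative.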
Original language: English
Pages (from-to): 135-151
Number of pages: 17
Journal: Evolving Systems
Volume: 3
Issue number: 3
Publication status: Published - 2012

Fields of science

  • 101001 Algebra
  • 101 Mathematics
  • 102 Computer Sciences
  • 101013 Mathematical logic
  • 101020 Technical mathematics
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 202027 Mechatronics
  • 101019 Stochastics
  • 211913 Quality assurance

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Mechatronics and Information Processing
  • Nano-, Bio- and Polymer-Systems: From Structure to Function
