Abstract
Clustering is a computationally intensive data mining task whose runtime grows with database size, so it is difficult to apply efficiently within the process of Knowledge Discovery in Databases (KDD). This work discusses anticipatory clustering, an approach that mitigates this problem by first preparing all data with a clustering method, independently of any particular application. In a second step, a data mining method uses the prepared data for a specific analysis. Because the preparation is generic, analyses can be run repeatedly with modified parameters, and results are obtained faster than with non-aggregated data.
This work introduces the clustering method EMAD (expectation maximization with aggregated data), which was developed for the second step of anticipatory clustering. To this end, the expectation maximization clustering method was adapted so that it can be applied to aggregated data. Experimental results confirm that EMAD scales well to large databases.
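The idea of running expectation maximization on aggregated rather than raw data can be illustrated with a minimal sketch. This is not the EMAD algorithm from the thesis. It assumes that each aggregate stores only a count, a linear sum, and a sum of squares for one-dimensional data, and that all points in an aggregate share one responsibility vector evaluated at the aggregate's mean. The names `Aggregate`, `summarize`, and `em_on_aggregates` are hypothetical and chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Aggregate:
    """Summary of a group of 1-D points (assumed form of the prepared data)."""
    n: int      # number of points in the group
    ls: float   # linear sum of the points
    ss: float   # sum of squared points


def summarize(points):
    """Stand-in for the preparation step: condense raw points into one Aggregate."""
    return Aggregate(len(points), sum(points), sum(x * x for x in points))


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_on_aggregates(aggs, initial_means, iterations=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture from aggregates only; raw points are never read."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    total = sum(a.n for a in aggs)

    for _ in range(iterations):
        # E-step: one responsibility vector per aggregate, evaluated at its mean.
        resp = []
        for a in aggs:
            centre = a.ls / a.n
            dens = [weights[c] * gaussian_pdf(centre, means[c], variances[c])
                    for c in range(k)]
            norm = sum(dens)
            # Fall back to uniform responsibilities if all densities underflow.
            resp.append([d / norm for d in dens] if norm > 0.0 else [1.0 / k] * k)

        # M-step: weighted updates expressed through n, ls and ss alone.
        for c in range(k):
            n_c = sum(r[c] * a.n for r, a in zip(resp, aggs))
            if n_c <= 0.0:
                continue  # leave an empty component unchanged
            mu = sum(r[c] * a.ls for r, a in zip(resp, aggs)) / n_c
            var = sum(r[c] * (a.ss - 2.0 * mu * a.ls + a.n * mu * mu)
                      for r, a in zip(resp, aggs)) / n_c
            means[c], variances[c], weights[c] = mu, max(var, min_var), n_c / total

    return weights, means, variances


if __name__ == "__main__":
    aggregates = [
        summarize([-0.5, 0.0, 0.5]),
        summarize([-0.2, 0.2]),
        summarize([9.5, 10.0, 10.5]),
        summarize([9.8, 10.2]),
    ]
    print(em_on_aggregates(aggregates, initial_means=[1.0, 9.0]))
```

In this sketch the cost of one iteration depends on the number of aggregates rather than the number of raw points, which is the property that makes repeated analyses with modified parameters cheap once the data has been prepared.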
Original language | German (Austria) |
---|---|
Supervisors/Reviewers | |
Publication status | Published - Sept 2006 |
Fields of science
- 102 Computer Sciences
- 102015 Information systems
Projects
- 1 Finished
- Anticipatory Data Mining
Goller, M. (Researcher) & Schrefl, M. (PI)
01.01.2004 → 01.08.2006
Project: Other › PhD thesis project