Partial Pre-Post Code Tree: A Memory-Efficient Tree Structure for Conjunctive Rule Mining

Van Quoc Phuong Huynh*, Florian Beck*, Johannes Fürnkranz*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference proceedingspeer-review

Abstract

State-of-the-art rule mining algorithms rely on summarizing the training set into efficient data structures which allow to quickly answer arbitrary conjunctive queries about the data. The key limitation of such techniques is their memory consumption. Pre-post code trees (PPC-trees) which are the basis of several efficient association and classification rule mining algorithms, are only constructed as an intermediate representation and subsequently converted into a much more efficient N-lists structure. In this paper, we introduce partial pre-post code trees (P3C-trees), which are based on the idea that partial trees are iteratively constructed, and immediately converted into N-lists. This tight integration of these phases allows to avoid the memory bottleneck of a full PPC-tree construction, and thus enables these algorithms to tackle the memory scalability problem posed by large-scale datasets. Our experiments with big datasets confirm that the memory used by P3C-tree is orders of magnitude smaller than the memory consumed by PPC-tree, and the generated N-lists are also more effective than alternative structures such as Tidset or Diffset. Moreover, the N-list construction can also be considerably sped up with the P3C-tree structure.
Original languageEnglish
Title of host publicationKDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
EditorsYizhou Sun, Flavio Chierichetti, Hady W. Lauw, Claudia Perlich, Wee Hyong Tok, Andrew Tomkins
Place of PublicationToronto, ON, Canada
PublisherACM
Pages565-576
Number of pages12
Volume1
ISBN (Electronic)979-8-4007-1245-6
DOIs
Publication statusPublished - 20 Jul 2025

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume1
ISSN (Print)2154-817X

Fields of science

  • 102001 Artificial intelligence
  • 102032 Computational intelligence
  • 102010 Database systems
  • 102035 Data science
  • 102033 Data mining
  • 102 Computer Sciences
  • 102019 Machine learning
  • 102028 Knowledge engineering
  • 102025 Distributed systems

JKU Focus areas

  • Digital Transformation

Cite this