e-ISA: An Incremental Lower Bound Approach for Efficiently Finding Approximate Nearest Neighbor of Complex Vague Queries

Khanh Tran Dang, Josef Küng, Roland Wagner

Research output: Chapter in Book/Report/Conference proceedingConference proceedingspeer-review

Abstract

In our context, a complex vague query means a multifeature nearest neighbor query. Answering such queries requires the system to search on some feature spaces individually and then combine the searching results to find the final answers. The feature spaces are commonly multidimensional spaces and may consist of a vast amount of data. Therefore searching costs, including IO-cost and CPU-cost, are prohibitively expensive for complex vague queries. For only such a single-feature space, to alleviate the costs, problem of answering nearest neighbor and approximate nearest neighbor queries has been proposed and quite well-addressed in the literature. A data object P is called a (1+|Å)- approximate nearest neighbor of a given query object Q with |Å>0 if for all other data objects P!?: dist(P, Q) !Ü (1+|Å)dist(P!?, Q), in which dist(X, Y) represents the distance between objects X and Y. In this paper, however, we introduce an approach for finding (1+|Å)- approximate nearest neighbor(s) of complex vague queries, which must deal with the problem on multiple feature spaces. This approach is based on a novel, efficient and general algorithm called ISA-Incremental hyper-Sphere Approach, which has just recently been introduced for solving nearest neighbor problem in the Vague Query System (VQS). To the best of our knowledge, the work presented in this paper is one of a few vanguard solutions for dealing with problem of answering approximate multi-feature nearest neighbor queries. The experimental results with both uniformly distributed and real data sets will prove the efficiency of the proposed approach.
Original languageEnglish
Title of host publicationProceedings of the 4th International Conference on Information Integration and Web-based Applications and Services - iiWAS 2002
Number of pages10
Publication statusPublished - Sept 2002

Fields of science

  • 102001 Artificial intelligence
  • 102006 Computer supported cooperative work (CSCW)
  • 102010 Database systems
  • 102014 Information design
  • 102015 Information systems
  • 102016 IT security
  • 102028 Knowledge engineering
  • 102019 Machine learning
  • 102022 Software development
  • 102025 Distributed systems
  • 502007 E-commerce
  • 505002 Data protection
  • 506002 E-government
  • 509018 Knowledge management
  • 202007 Computer integrated manufacturing (CIM)
  • 102033 Data mining
  • 102035 Data science

Cite this