An Efficient Similarity Search in Large Data Collections with MapReduce, in Future Data and Security Engineering, Proceedings of the First International Conference, FDSE 2014, Ho Chi Minh City, Vietnam, Nov

Trong Nhan Phan, Josef Küng, Tran Khanh Dang

Research output: Chapter in Book/Report/Conference proceeding, Conference proceedings, peer-review

Abstract

The era of big data calls for innovations that improve similarity search. Ever-growing volumes of data strain both the processing capacity and the performance of existing information systems. To address these scalability challenges, we propose an efficient similarity search scheme for large data collections with MapReduce. In addition, we apply the proposed scheme to widespread similarity search cases, including pairwise similarity, search by example, range query, and k-Nearest Neighbor query. Moreover, collaborative strategic refinements are used to eliminate unnecessary computations and speed up the whole process. Finally, we evaluate our methods experimentally, alongside a previous work, on real large datasets, and the results verify how well they perform.
Original language: English
Title of host publication: Future Data and Security Engineering, Proceedings of the First International Conference, FDSE 2014, Ho Chi Minh City, Vietnam, Nov.
Place of Publication: Berlin, Heidelberg
Publisher: Springer
Pages: 44-57
Number of pages: 14
Volume: 8860
Publication status: Published - Nov 2014

Publication series

Name: Lecture Notes in Computer Science (LNCS)

Fields of science

  • 202007 Computer integrated manufacturing (CIM)
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102006 Computer supported cooperative work (CSCW)
  • 102010 Database systems
  • 102014 Information design
  • 102015 Information systems
  • 102016 IT security
  • 102022 Software development
  • 102025 Distributed systems
  • 502007 E-commerce
  • 505002 Data protection
  • 506002 E-government
  • 509018 Knowledge management

JKU Focus areas

  • Computation in Informatics and Mathematics
