The Homepage of Nearest Neighbors and Similarity Search

The Homepage of Nearest Neighbors and Similarity Search
Maintained by Yury Lifshits

Update: this page is frozen. Please visit the successor page by Arnoldo Muller

Intro Tutorial Bibliography Researchers Links Open Problems Yury's work on NN

To preprocess a database of N objects so that given a query object,
one can effectively determine its nearest neighbors in database

The purpose of this page is to collect links, people, ideas, keywords, papers, slides, code and data sets on nearest neighbors in a single place.

Name of the problem: nearest neighbors, k nearest neighbors (kNN, k-NN), nearset neighbor search, proximity search, similarity search, approximate nearest neighbors (ANN), range queries, maximal intersection queries, post-office problem, partial match, best match file searching, best match retrieval, sequence nearest neighbors (SNN).

Solution concepts: locality-sensitive hashing (LSH), low-distortion embeddings, k-d trees, kd-trees, metric trees, M-trees, R*-trees, vp-trees, vantage point trees, vantage point forest, multi-vantage point tree, bisector trees, Orchard's algorithm, random projections, fixed queries tree, Voronoi tree, BBD-tree, min-wise independent permutations, Burkhard-Keller tree, generalized hyperplane tree, geometric near-neighbor access tree (GNAT), approximating eliminating search algorithm (AESA), inverted index, spatial approximation tree (SAT).

Applications: k-nearest neighbor classification algorithm, image similarity identification, audio similarity identification, fingerprint search, audio/video compression (MPEG), optical character recognition, coding theory, function approximation, recommendation systems, near-duplicate detection, targeting on-line ads, distributional similarity computation, spelling correction, nearest neighbor interpolation.

Related keywords: all-nearest-neighbors problem, indexing methods, spatial index, Voronoi diagram, spatial access methods (SAM), multidimensional access methods, closest pair, indexing algorithm, intrinsic dimension, Johnson-Lindenstrauss lemma, Johnson-Lindenstrauss transform, large-scale algorithms, scalability, dimensionality reduction, high dimensions, curse of dimensionality, high-dimensional spaces, cell probe model, metric spaces, Euclidean space, brunch-and-bound search, divide and conquer, massive data sets, metric embeddings, cell probe complexity, spatial data structures, Euclidean k-median problem, point sampling.

Call for promotion and feedback
Find a mistake? Send me email to yury@yury.name
Something is missed? Give me a suggestion: yury@yury.name
Have an open problem? Publish it on this page! Copyrights are reserved to you.
Like it? Put a hyperlink on your page!

Thanks for visiting this page!

To preprocess a database of N objects so that given a query object, one can effectively determine its nearest neighbors in database

To preprocess a database of N objects so that given a query object,
one can effectively determine its nearest neighbors in database