Module 3: Hashing
  Lecture 11: Locality Sensitive Hashing and Grid File
 
Locality sensitive hashing (LSH)
  • To support range and kNN queries
  • Builds on the idea of randomized algorithms
 
  • Algorithms that make random choices
 
  • Monte Carlo: Probabilistic result with bounded error, produced in bounded time; accuracy improves with repeated runs
 
  • Las Vegas: Correct and deterministic result but varying time
  • Goal in LSH: To find a hashing function that is approximately distance-preserving (within some tolerance)
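  • A family of hash functions $H$ is $(r_1, r_2, p_1, p_2)$-sensitive (with $r_1 < r_2$ and $p_1 > p_2$) if for any two points $x, y$ and a function $h$ drawn at random from $H$,

  • If $d(x, y) \le r_1$, then $\Pr[h(x) = h(y)] \ge p_1$

  • If $d(x, y) \ge r_2$, then $\Pr[h(x) = h(y)] \le p_2$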
  • Can also be viewed as a dimensionality reduction technique
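
The sensitivity condition can be checked empirically with a small sketch. The hash family used here, h(x) = floor((a . x + b) / w) with a Gaussian random vector a and a random offset b, is one common choice for Euclidean distance and is an assumption, not something the lecture specifies. The names make_hash and collision_rate and the parameters dim, w and trials are hypothetical and chosen only for illustration. The sketch estimates the collision probability for a near pair and a far pair of points; the near pair should collide far more often.

```python
import math
import random


def make_hash(dim, w, rng):
    """Draw one hash h(x) = floor((a . x + b) / w) from the assumed family."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0.0, w)                        # random offset in [0, w]

    def h(x):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((projection + b) / w)    # bucket index along the line

    return h


def collision_rate(x, y, dim, w, trials, rng):
    """Estimate Pr[h(x) = h(y)] over independent random draws of h."""
    hits = 0
    for _ in range(trials):
        h = make_hash(dim, w, rng)
        if h(x) == h(y):
            hits += 1
    return hits / trials


rng = random.Random(0)              # fixed seed so the run is repeatable
dim, w = 8, 4.0                     # illustrative dimension and bucket width

x = [0.0] * dim
near = [0.5] + [0.0] * (dim - 1)    # Euclidean distance 0.5 from x
far = [20.0] + [0.0] * (dim - 1)    # Euclidean distance 20 from x

p_near = collision_rate(x, near, dim, w, 2000, rng)
p_far = collision_rate(x, far, dim, w, 2000, rng)

print("estimated Pr[collision], near pair:", p_near)
print("estimated Pr[collision], far pair: ", p_far)
```

Each hash maps a dim-dimensional point to a single integer, which is the sense in which LSH acts as a dimensionality reduction. In practice several such hashes are usually concatenated and several tables are used, so that near points share a bucket in at least one table with high probability while far points rarely do.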