A Python program that uses LSH (locality-sensitive hashing) to search a CSV file and retrieve the filenames whose words are similar to the user's input.

julialwang/docuSearch

docuSearch: An LSH project

This Python program implements LSH (locality-sensitive hashing) from scratch to search for and retrieve filenames whose titles are similar to the user's input. Results are ordered by similarity, highest first, and the program makes a best-effort attempt to surface the most recognizable matches.
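To illustrate the general idea, here is a minimal sketch of one common way to build LSH for title search: MinHash signatures over character shingles, grouped into bands. This is an assumption about the technique, not the repository's actual code; the names `shingles`, `minhash_signature`, and `LSHIndex`, and the parameters `k`, `num_hashes`, and `bands`, are hypothetical.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of lowercase character k-grams in text (assumed tokenization)."""
    s = text.lower()
    if len(s) < k:
        return {s}
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def _seeded_hash(seed, token):
    # Deterministic seeded hash; Python's built-in hash() varies between runs.
    digest = hashlib.md5(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(text, num_hashes=32):
    """One minimum per seeded hash function, taken over the text's shingles."""
    grams = shingles(text)
    return [min(_seeded_hash(seed, g) for g in grams) for seed in range(num_hashes)]


def jaccard(a, b):
    """Exact Jaccard similarity of the two texts' shingle sets."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


class LSHIndex:
    """Hypothetical banded MinHash index over a list of titles."""

    def __init__(self, num_hashes=32, bands=16):
        if num_hashes % bands != 0:
            raise ValueError("num_hashes must be divisible by bands")
        self.num_hashes = num_hashes
        self.bands = bands
        self.rows = num_hashes // bands
        self.buckets = defaultdict(set)  # (band number, band values) -> title indices
        self.titles = []

    def _band_keys(self, signature):
        for b in range(self.bands):
            yield (b, tuple(signature[b * self.rows:(b + 1) * self.rows]))

    def add(self, title):
        idx = len(self.titles)
        self.titles.append(title)
        for key in self._band_keys(minhash_signature(title, self.num_hashes)):
            self.buckets[key].add(idx)

    def query(self, text, top_k=5):
        # Candidates are titles sharing at least one band bucket with the query.
        candidates = set()
        for key in self._band_keys(minhash_signature(text, self.num_hashes)):
            candidates |= self.buckets.get(key, set())
        # Rank candidates by exact Jaccard similarity, highest first.
        ranked = sorted(candidates, key=lambda i: jaccard(text, self.titles[i]), reverse=True)
        return [self.titles[i] for i in ranked[:top_k]]
```

Only titles that land in the same bucket as the query in at least one band are scored, which is what saves work relative to comparing against every title. Fewer rows per band makes near matches more likely to collide, at the cost of more candidates to rank.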

After starting the program, make sure that the default document, or whichever .csv file of titles you want to search, is imported properly (each title must be on its own line, separated by a \n newline). Then type a keyword, phrase, or complete title to browse similar titles in the .csv file.
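As a rough illustration of the expected input format, the sketch below reads a file with one title per line. The function names `parse_titles` and `load_titles` are hypothetical and not taken from the repository; skipping blank lines and trimming whitespace are assumptions.

```python
from pathlib import Path


def parse_titles(text):
    """Split newline-separated text into titles, dropping blank lines (assumed behavior)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_titles(path):
    """Read a file of newline-separated titles from disk."""
    return parse_titles(Path(path).read_text(encoding="utf-8"))
```

Because titles are separated only by newlines, a title that itself contains a comma is kept whole rather than split into columns.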

The repository also contains a non-randomized brute-force method that can be timed for comparison with the optimized LSH algorithm, as well as matplotlib scripts that plot the timing results.
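A brute-force baseline of this kind might look like the sketch below, which scores every title against the query and times the call. It assumes word-level Jaccard similarity as the metric; the names `word_jaccard`, `brute_force_search`, and `time_call` are hypothetical and not taken from the repository.

```python
import time


def word_jaccard(a, b):
    """Jaccard similarity of the two texts' lowercase word sets (assumed metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def brute_force_search(query, titles, top_k=5):
    """Compare the query against every title; no hashing, no randomness."""
    scored = [(word_jaccard(query, t), t) for t in titles]
    scored = [pair for pair in scored if pair[0] > 0]  # drop titles with no shared words
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]


def time_call(fn, *args, repeats=5):
    """Return the best wall-clock time in seconds over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best
```

Timing both search methods with `time_call` over increasing numbers of titles gives the data points for a timing plot, for example with matplotlib's `plt.plot`.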
