RPatternJoin - String Similarity Joins for Hamming and Levenshtein Distances
This project is a tool for edit similarity joins of words
(a.k.a. all-pairs similarity search) under small edit distance
constraints (< 3, i.e. at most 2). It supports Levenshtein and
Hamming distances and words over any alphabet. The software was
originally developed for joining amino-acid/nucleotide
sequences from adaptive immune repertoires, where the number of
words is relatively large (10^5 to 10^6) and the average word
length is relatively small (10 to 100 characters).
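
To make the task concrete, here is a minimal Python sketch of what an edit similarity join computes: every pair of words whose distance is at most a given cutoff. This is a naive all-pairs baseline written only to illustrate the definition. It is not the algorithm or the interface of this package, and the names `hamming`, `levenshtein`, and `naive_join` are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatched positions; None if lengths differ (undefined)."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def naive_join(words, cutoff, metric="levenshtein"):
    """Return index pairs (i, j), i < j, with distance <= cutoff.

    Hypothetical helper: compares every pair, so it is quadratic in the
    number of words and only suitable for small inputs.
    """
    dist = hamming if metric == "hamming" else levenshtein
    pairs = []
    for i, j in combinations(range(len(words)), 2):
        d = dist(words[i], words[j])
        if d is not None and d <= cutoff:
            pairs.append((i, j))
    return pairs


words = ["CASSL", "CASSF", "CASS", "CAWSL", "GGGGG"]
print(naive_join(words, 1, "levenshtein"))  # [(0, 1), (0, 2), (0, 3), (1, 2)]
print(naive_join(words, 1, "hamming"))      # [(0, 1), (0, 3)]
```

With 10^5 to 10^6 words, this pairwise approach would have to examine roughly 5x10^9 to 5x10^11 pairs, which is why a dedicated join tool is useful at that scale.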