[gsoc 2013] Approximate string matching

12 Apr 2013

Hi, I'm Jan Strnad, currently a Master's student at CTU in Prague, Czech
Republic. This summer I would like to participate in GSoC, and I believe
that approximate string matching is a good fit for me, since I have a
strong background in this area.

More specifically, I would like to implement approximate string
matching algorithm(s) based on NFA simulation. I would like to support
several distance measures: Hamming distance [1], Levenshtein distance
[2], Damerau-Levenshtein distance [3], Delta distance, Gamma distance and
finally (delta, gamma) distance. Sorry, I found no explanation of the last
three on the Web, but I can provide one if you are interested.
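
To make the first two distances concrete, here is a minimal sketch of
how they could be computed for two whole strings. The function names
hamming_distance and levenshtein_distance are only illustrative, not an
existing or proposed Boost interface, and the sketch assumes plain
std::string input compared byte by byte.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

// Hamming distance: number of positions at which two strings of equal
// length differ. Hypothetical helper, for illustration only.
std::size_t hamming_distance(const std::string& a, const std::string& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("Hamming distance needs equal lengths");
    std::size_t d = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) ++d;
    return d;
}

// Levenshtein distance: minimum number of single-character insertions,
// deletions and substitutions turning a into b. Uses one row of the
// classic dynamic programming table, so memory is O(|b|).
std::size_t levenshtein_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> row(b.size() + 1);
    std::iota(row.begin(), row.end(), std::size_t{0});  // D[0][j] = j
    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diag = row[0];  // holds D[i-1][j-1]
        row[0] = i;                 // D[i][0] = i
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t up = row[j];  // D[i-1][j]
            const std::size_t subst = diag + (a[i - 1] != b[j - 1] ? 1 : 0);
            row[j] = std::min({subst, up + 1, row[j - 1] + 1});
            diag = up;
        }
    }
    return row[b.size()];
}
```

For example, hamming_distance("karolin", "kathrin") and
levenshtein_distance("kitten", "sitting") both evaluate to 3.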

For the NFA implementation, I would probably choose a dynamic
programming approach, but I'm also considering the shift-or algorithm.
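
As a rough illustration of the two simulation strategies, the sketch
below searches a text for all end positions where the pattern matches
with at most k Levenshtein errors. search_dp follows the column-wise
dynamic programming scheme usually attributed to Sellers;
search_bitparallel is a bit-parallel variant in the style of Wu and
Manber, written in shift-and form (set bits mean active states), which is
the bitwise complement of shift-or. Both function names are
hypothetical, and the bit-parallel version assumes a pattern of 1 to 64
bytes so that one 64-bit word holds each error level.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Dynamic programming simulation: col[i] is the smallest edit distance
// between pattern[0..i) and any substring of the text ending at j.
std::vector<std::size_t> search_dp(const std::string& pattern,
                                   const std::string& text, std::size_t k) {
    const std::size_t m = pattern.size();
    std::vector<std::size_t> col(m + 1), hits;
    for (std::size_t i = 0; i <= m; ++i) col[i] = i;
    for (std::size_t j = 0; j < text.size(); ++j) {
        std::size_t diag = col[0];  // old col[i-1]; col[0] stays 0
        for (std::size_t i = 1; i <= m; ++i) {
            const std::size_t old = col[i];
            const std::size_t subst = diag + (pattern[i - 1] != text[j] ? 1 : 0);
            col[i] = std::min({subst, old + 1, col[i - 1] + 1});
            diag = old;
        }
        if (col[m] <= k) hits.push_back(j);  // a match ends at position j
    }
    return hits;
}

// Bit-parallel simulation: bit i of r[d] is set when pattern[0..i]
// matches a substring ending at the current position with <= d errors.
std::vector<std::size_t> search_bitparallel(const std::string& pattern,
                                            const std::string& text,
                                            std::size_t k) {
    const std::size_t m = pattern.size();
    if (m == 0 || m > 64)
        throw std::invalid_argument("pattern length must be 1..64");
    std::uint64_t masks[256] = {};  // masks[c]: positions where pattern has c
    for (std::size_t i = 0; i < m; ++i)
        masks[static_cast<unsigned char>(pattern[i])] |= std::uint64_t{1} << i;
    std::vector<std::uint64_t> r(k + 1);
    for (std::size_t d = 0; d <= k; ++d)  // d deletions reach prefix length d
        r[d] = d >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << d) - 1;
    const std::uint64_t accept = std::uint64_t{1} << (m - 1);
    std::vector<std::size_t> hits;
    for (std::size_t j = 0; j < text.size(); ++j) {
        const std::uint64_t b = masks[static_cast<unsigned char>(text[j])];
        std::uint64_t prev_old = r[0];  // r[d-1] before this text character
        r[0] = ((r[0] << 1) | 1) & b;
        for (std::size_t d = 1; d <= k; ++d) {
            const std::uint64_t old = r[d];
            r[d] = (((old << 1) | 1) & b)    // matching character
                 | prev_old                  // extra character in the text
                 | ((prev_old << 1) | 1)     // substituted character
                 | ((r[d - 1] << 1) | 1);    // pattern character skipped
            prev_old = old;
        }
        if (r[k] & accept) hits.push_back(j);
    }
    return hits;
}
```

Both functions report the same 0-based end positions; for the pattern
"survey" in the text "surgery" with k = 2 these are 4, 5 and 6. The
dynamic programming version costs O(mn) time for any pattern length,
while the bit-parallel version costs O(kn) word operations but is
limited by the machine word size in this simple form.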

Can I ask you for any thoughts or feedback regarding this topic?

Best regards,
Jan Strnad

[1] http://en.wikipedia.org/wiki/Hamming_distance
[2] http://en.wikipedia.org/wiki/Levenstein_distance
[3] http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
