Previous |  Up |  Next

Article

Keywords:
string and sequence matching; nondeterministic finite automata
Summary:
searching for all occurrences of a pattern (string or sequence) in some text, where the pattern can occur with some limited number of errors given by edit distance. Several methods were designed for the approximate string matching that simulate nondeterministic finite automata (NFA) constructed for this problem. This paper presents reduced NFAs for the approximate string matching usable in case, when we are interested only in occurrences having edit distance less than or equal to a given integer, but we are not interested in exact edit distance of each found occurrence. Then an algorithm based on the dynamic programming that simulates these reduced NFAs is presented. It is also presented how to use this algorithm for the approximate sequence matching.
References:
[1] Baeza-Yates R. A., Gonnet G. H.: A new approach to text searching. Comm. ACM 35 (1992), 10, 74–82 DOI 10.1145/135239.135243
[2] Galil Z., Park K.: An improved algorithm for approximate string matching. In: Proceedings of the 16th International Colloquium on Automata, Languages and Programming (G. Ausiello, M. Dezani-Ciancaglini, and S. Ronchi Della Rocca, eds., Lecture Notes in Computer Science 372), Springer–Verlag, Berlin, Stresa 1989, pp. 394–404 MR 1037064 | Zbl 0683.68034
[3] Holub J.: Reduced nondeterministic finite automata for approximate string matching. In: Proceedings of the Prague Stringologic Club Workshop’96 (J. Holub, ed.), Czech Technical University, Prague 1996, pp. 19–27. Collaborative Report DC–96–10
[4] Holub J.: Simulation of NFA in approximate string and sequence matching. In: Proceedings of the Prague Stringology Club Workshop’97 (J. Holub, ed.), Czech Technical University, Prague 1997, pp. 39–46. Collaborative Report DC–97–03
[5] Melichar B.: String matching with $k$ differences by finite automata. In: Proceedings of the 13th International Conference on Pattern Recognition, volume II, IEEE Computer Society Press, Vienna 1996, pp. 256–260
[6] Sellers P. H.: The theory and computation of evolutionary distances: Pattern recognition. J. Algorithms 1 (1980), 4, 359–373 DOI 10.1016/0196-6774(80)90016-4 | MR 0604870 | Zbl 0454.68110
[7] Ukkonen E.: Finding approximate patterns in strings. J. Algorithms 6 (1985), 1–3, 132–137 DOI 10.1016/0196-6774(85)90023-9 | MR 0780855 | Zbl 0566.68072
[8] Wu S., Manber U.: Fast text searching allowing errors. Comm. ACM 35 (1992), 10, 83–91 DOI 10.1145/135239.135244
Partner of
EuDML logo