Previous |  Up |  Next

Article

Keywords:
string matching algorithm; efficiency
Summary:
In this paper we present experimental results for string matching algorithms which have a competitive theoretical worst case run time complexity. Of these algorithms a few are already famous for their speed in practice, such as the Boyer–Moore and its derivatives. We chose to evaluate the algorithms by counting the number of comparisons made and by timing how long they took to complete a given search. Using the experimental results we were able to introduce a new string matching algorithm and compared it with the existing algorithms by experimentation. These experimental results clearly show that the new algorithm is more efficient than the existing algorithms for our chosen data sets. Using the chosen data sets over 1,500,000 separate tests were conducted to determine the most efficient algorithm.
References:
[1] Apostolico A., Giancarlo R.: The Boyer–Moore–Galil string strategies revisited. SIAM J. Comput. 15 (1986), 1, 98–105 DOI 10.1137/0215007 | MR 0822195
[2] Baeza–Yates R. A.: Improved string searching. Software – Practice and Experience 19 (1989), 3, 257–271 DOI 10.1002/spe.4380190305
[3] Boyer R. S., Moore J. S.: A fast string searching algorithm. Comm. ACM 23 (1977), 5, 1075–1091 Zbl 1219.68165
[4] Charras C., Lecroq T.: Exact string matching, available at: http:--www</b> dir.univ-rouen.fr/ lecroq/string.ps (1998)
[5] Charras C., Lecroq T.: Exact string matching animation in JAVA available at: http:--www</b> dir.univ-rouen.fr/ charras/string/ (1998)
[6] Crochemore M., Czumaj A., Gasieniec L., Jarominek L., Lecroq T., Plandowski, W., Rytter W.: Speeding up two string matching algorithms. Algorithmica 12 (1994), 4, 247–267 DOI 10.1007/BF01185427 | MR 1289482 | Zbl 0942.68574
[7] Crochemore M., Lecroq T.: Tight bounds on the complexity of the Apostolico–Giancarlo algorithm. Inform. Process. Lett. 63 (1997), 4, 195–203 DOI 10.1016/S0020-0190(97)00107-5 | MR 1477304
[8] Crochemore M., Rytter W.: Text Algorithms. Oxford University Press 1994 MR 1307378 | Zbl 1078.68151
[9] Hancart C.: Analyse exacte et en moyenne d’algorithmes de recherche d’un motif dans un texte. These de doctorat de l’Universite de Paris 7, 1993
[10] Horspool R. N.: Practical fast searching in strings. Software – Practice and Experience 10 (1980), 6, 501–506 DOI 10.1002/spe.4380100608
[11] Hume A., Sunday D.: Fast string searching. Software – Practice and Experience 21 (1991), 11, 1221–1248 DOI 10.1002/spe.4380211105
[12] Knuth D. E., Jr. J. H. Morris, Pratt V. R.: Fast pattern matching in strings. SIAM J. of Comput. 6 (1980), 1, 323–350 MR 0451916
[13] Liu Z., Du X., Ishi N.: An improved adaptive string searching algorithm. Software – Practice and Experience 28 (1998), 2, 191–198 DOI 10.1002/(SICI)1097-024X(199802)28:2<191::AID-SPE149>3.0.CO;2-2
[14] Melichar B.: Approximate string matching by finite automata. In: Computer Analysis of Images and Patterns (Lecture Notes in Computer Scince 970), Springer–Verlag, Berlin 1995, pp. 342–349
[15] Raita T.: Tuning the Boyer–Moore–Horspool string searching algorithm. Software – Practice and Experience 22 (1992), 10, 879–884 DOI 10.1002/spe.4380221006
[16] Smith P. D.: Experiments with a very fast substring search algorithm. Software – Practice and Experience 21 (1991), 10, 1065–1074 DOI 10.1002/spe.4380211006
[17] Smit G. V. de: A comparison of three string matching algorithms. Software – Practice and Experience 12 (1982), 57–66 DOI 10.1002/spe.4380120106 | Zbl 0466.68050
[18] Sunday D. M.: A very fast substring search algorithm. Comm. ACM 33 (1990), 8, 132–142
[19] Rytter W.: A correct preprocessing algorithm for Boyer–Moore string searching. SIAM J. Comput. 9 (1980), 509–512 DOI 10.1137/0209037 | MR 0584507 | Zbl 0446.68049
[20] Zhu R. F., Takaoka T.: On improving the average case of the Boyer–Moore string matching algorithm. J. Inform. Process. 10 (1987), 3, 173–177 Zbl 0641.68134
Partner of
EuDML logo