The finite automata approaches in stringology

Jan Holub

Kybernetika (2012)

  • Volume: 48, Issue: 3, page 386-401
  • ISSN: 0023-5954

Abstract

top
We present an overview of four approaches of the finite automata use in stringology: deterministic finite automaton, deterministic simulation of nondeterministic finite automaton, finite automaton as a model of computation, and compositions of finite automata solutions. We also show how the finite automata can process strings build over more complex alphabet than just single symbols (degenerate symbols, strings, variables).

How to cite

top

Holub, Jan. "The finite automata approaches in stringology." Kybernetika 48.3 (2012): 386-401. <http://eudml.org/doc/246462>.

@article{Holub2012,
abstract = {We present an overview of four approaches of the finite automata use in stringology: deterministic finite automaton, deterministic simulation of nondeterministic finite automaton, finite automaton as a model of computation, and compositions of finite automata solutions. We also show how the finite automata can process strings build over more complex alphabet than just single symbols (degenerate symbols, strings, variables).},
author = {Holub, Jan},
journal = {Kybernetika},
keywords = {exact pattern matching; approximate pattern matching; finite automata; dynamic programming; bitwise parallelism; suffix automaton; border array; degenerate symbol; exact pattern matching; approximate pattern matching; finite automata; dynamic programming; bitwise parallelism; suffix automaton; border array; degenerate symbol},
language = {eng},
number = {3},
pages = {386-401},
publisher = {Institute of Information Theory and Automation AS CR},
title = {The finite automata approaches in stringology},
url = {http://eudml.org/doc/246462},
volume = {48},
year = {2012},
}

TY - JOUR
AU - Holub, Jan
TI - The finite automata approaches in stringology
JO - Kybernetika
PY - 2012
PB - Institute of Information Theory and Automation AS CR
VL - 48
IS - 3
SP - 386
EP - 401
AB - We present an overview of four approaches of the finite automata use in stringology: deterministic finite automaton, deterministic simulation of nondeterministic finite automaton, finite automaton as a model of computation, and compositions of finite automata solutions. We also show how the finite automata can process strings build over more complex alphabet than just single symbols (degenerate symbols, strings, variables).
LA - eng
KW - exact pattern matching; approximate pattern matching; finite automata; dynamic programming; bitwise parallelism; suffix automaton; border array; degenerate symbol; exact pattern matching; approximate pattern matching; finite automata; dynamic programming; bitwise parallelism; suffix automaton; border array; degenerate symbol
UR - http://eudml.org/doc/246462
ER -

References

top
  1. Abrahamson, K., 10.1137/0216067, SIAM J. Comput. 16 (1987), 6, 1039–1051. Zbl0646.68079MR0917040DOI10.1137/0216067
  2. Aho, A. V., Corasick, M. J., 10.1145/360825.360855, Commun. ACM 18 (1975), 6, 333–340. Zbl0301.68048MR0371172DOI10.1145/360825.360855
  3. Allauzen, C., Crochemore, M., Raffinot, M., Factor oracle: A new structure for pattern matching, In: SOFSEM'99, Theory and Practice of Informatics (J. Pavelka, G. Tel, and M. Bartošek, eds.), Milovy 1999, Lect. Notes Comput. Sci. 1725 (1999), Springer-Verlag, Berlin, pp. 295–310. Zbl0964.68078MR1784520
  4. Antoniou, P., Holub, J., Iliopoulos, C. S., Melichar, B., Peterlongo, P., Finding common motifs with gaps using finite automata, In: Implementation and Application of Automata (O. H. Ibarra and H.-C. Yen, eds.), Lect. Notes Comput. Sci. 4094 (2006), Springer-Verlag, Heidelberg, pp. 69–77. Zbl1160.68683MR2296448
  5. Baeza-Yates, R. A., Gonnet, G. H., 10.1145/135239.135243, Commun. ACM 35 (1992), 10, 74–82. DOI10.1145/135239.135243
  6. Baker, B. S., 10.1137/S0097539793246707, SIAM J. Comput. 26 (1997), 5, 1343–1362. Zbl0885.68085MR1471985DOI10.1137/S0097539793246707
  7. Blumer, A., Blumer, J., Ehrenfeucht, A., Haussler, D., Chen, M. T., Seiferas, J., 10.1016/0304-3975(85)90157-4, Theor. Comput. Sci. 40 (1985), 1, 31–44. Zbl0574.68070MR0828515DOI10.1016/0304-3975(85)90157-4
  8. Blumer, A., Blumer, J., Ehrenfeucht, A., Haussler, D., McConnel, R., 10.1145/28869.28873, J. Assoc. Comput. Mach. 34 (1987), 3, 578–595. MR0904194DOI10.1145/28869.28873
  9. Boyer, R. S., Moore, J. S., 10.1145/359842.359859, Commun. ACM 20 (1977), 10, 762–772. Zbl1219.68165DOI10.1145/359842.359859
  10. Cambouropoulos, E., Crochemore, M., Iliopoulos, C. S., Mouchard, L., Pinzon, Y. J., Algorithms for computing approximate repetitions in musical sequences, In: Proc. Tenth Australian Workshop on Combinatorial Algorithms (R. Raman and J. Simpson, eds.), AWOCA'99, School of Computing, Curtin University of Technology, Perth 1999, pp. 129–144. Zbl1008.68043
  11. Commentz-Walter, B., A string matching algorithm fast on the average, In: Proc. 6th International Colloquium on Automata, Languages and Programming (H. A. Maurer, ed.), Graz 1979, Lect. Notes Comput. Sci. 71 (1979), Springer-Verlag, Berlin, pp. 118–132. Zbl0407.68092
  12. Crochemore, M., 10.1016/0304-3975(86)90041-1, Theor. Comput. Sci. 45 (1986), 1, 63–86. Zbl0615.68053MR0865967DOI10.1016/0304-3975(86)90041-1
  13. Damerau, F., 10.1145/363958.363994, Commun. ACM 7 (1964), 3, 171–176. DOI10.1145/363958.363994
  14. Dömölki, B., An algorithm for syntactical analysis, Comput. Linguistics 3 (1964), 29–46, 1964. 
  15. Fischer, M. J., Paterson, M., String matching and other products, In: Proc. SIAM-AMS Complexity of Computation (R. M. Karp, ed.), Providence 1974, pp. 113–125. Zbl0301.68027MR0400782
  16. Fredriksson, K., Mozgovoy, M., 10.1016/j.ipl.2006.06.009, Inf. Process. Lett. 100 (2006), 3, 91–96. Zbl1185.68282MR2254198DOI10.1016/j.ipl.2006.06.009
  17. Galil, Z., Open problems in stringology, In: Combinatorial Algorithms on Words (A. Apostolico and Z. Galil, eds.), NATO ASI Series F 12 (1985), Springer-Verlag, Berlin, pp. 1–8. Zbl0607.68054MR0815328
  18. Hamming, R. W., Error detecting and error correcting codes, The Bell System Techn. J. 29 (1950), 2, 147–160. MR0035935
  19. Holub, J., Bit parallelism-NFA simulation, In: Implementation and Application of Automata (B. W. Watson and D. Wood, eds.), Lect. Notes in Comput. Sci. 2494 (2002), Springer-Verlag, Heidelberg, pp. 149–160. Zbl1015.68524
  20. Holub, J., Finite automata in pattern matching, In: Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications (M. Elloumi and A. Y. Zomaya, eds.), John Wiley and Sons, 2011, pp. 51–71. 
  21. Holub, J., Kadlec, T., NFA simulation using deterministic state cache, In: London Algorithmics 2008: Theory and Practice (J. Chan, J. W. Daykin, and M. S. Rahman, eds.), College Publications 2009, pp. 152–166. 
  22. Holub, J., Melichar, B., Implementation of nondeterministic finite automata for approximate pattern matching, In: Proc. 3rd International Workshop on Implementing Automata, Rouen 1998, pp. 74–81. 
  23. Holub, J., Melichar, B., 10.1016/S0304-3975(00)00064-5, Theor. Comput. Sci. 249 (2000), 2, 30–311. Zbl0949.68088MR1798315DOI10.1016/S0304-3975(00)00064-5
  24. Holub, J., Smyth, W. F., Algorithms on indeterminate strings, In: Proc. 14th Australasian Workshop On Combinatorial Algorithms (M. Miller and K. Park, eds.), Seoul National University, Seoul 2003, pp. 36–45. 
  25. Holub, J., Smyth, W. F., Wang, S., Hybrid pattern-matching algorithms on indeterminate strings, In: London Stringology Day and London Algorithmic Workshop 2006 (J. Daykin, M. Mohamed, and K. Steinhoefel, eds.) King's College London Series Texts in Algorithmics 2007, pp. 115–133. 
  26. Holub, J., Smyth, W. F., Wang, S., 10.1016/j.jda.2006.10.003, J. Discret. Algorithms 6 (2008), 1, 37–50. Zbl1162.68808MR2397082DOI10.1016/j.jda.2006.10.003
  27. Holub, J., Špiller, P., Practical experiments with NFA simulation, In: Proc. Eindhoven FASTAR Days 2004 (L. Cleophas and B. W. Watson, eds.), TU Eindhoven 2004, The Finite Automata Approaches in Stringology 15, pp. 73–95. 
  28. Hopcroft, J. E., Ullman, J. D., Introduction to automata, languages and computations, Addison-Wesley, Reading 1979. MR0645539
  29. Huffman, D. A., 10.1016/0016-0032(54)90574-8, J. Franklin Inst. 257 (1954), 161–190, 275–303. Zbl0166.27201MR0062648DOI10.1016/0016-0032(54)90574-8
  30. Huffman, D. A., A method for the construction of minimum-redundancy codes, Proc. Inst. Radio Engrs 40 (1952), 9, 1098–1101. Zbl0137.13605
  31. Iliopoulos, C. S., Jayasekera, I., Melichar, B., Šupol, J., Weighted degenerated approximate pattern matching, In: Proc. 1st International Conference on Language and Automata Theory and Applications, Tarragona 2007. 
  32. Kleene, S. C., Representation of events in nerve nets and finite automata, Automata Studies (1956), 3–42. MR0077478
  33. Knuth, D. E., Morris, J. H., Jr, Pratt, V. R., 10.1137/0206024, SIAM J. Comput. 6 (1977), 2, 323–350. Zbl0372.68005MR0451916DOI10.1137/0206024
  34. Kurtz, S., 10.1002/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O, Softw. Pract. Exp. 29 (1999), 13, 1149–1171. DOI10.1002/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O
  35. Lahoda, J., Melichar, B., Pattern matching in text coded by finite translation automaton, In: Proc. 7th International Multiconference Information Society (M. Heričko et al., eds.), Institut Jožef Stefan, Ljubljana 2004, pp. 212–214. 
  36. Lahoda, J., Melichar, B., General pattern matching on regular collage system, In: Proc. Prague Stringology Conference'05 (J. Holub and M. Šimánek, eds.), Czech Technical University in Prague 2005, pp. 153–162. 
  37. Levenshtein, V. I., Binary codes capable of correcting deletions, insertions and reversals, Sov. Phys. Dokl. 6 (1966), 707–710. MR0189928
  38. McCulloch, W. S., Pitts, W., 10.1007/BF02478259, Bull. Math. Biophys. 5 (1943), 115–133. Zbl0063.03860MR0010388DOI10.1007/BF02478259
  39. Mealy, G. H., 10.1002/j.1538-7305.1955.tb03788.x, Bell System Techn. J. 34 (1955), 5, 1045–1079. MR0073450DOI10.1002/j.1538-7305.1955.tb03788.x
  40. Melichar, B., Repetitions in text and finite automata, In: Proc. Eindhoven FASTAR Days 2004 (L. Cleophas and B. W. Watson, eds.), TU Eindhoven 2004, pp. 1–46. 
  41. Melichar, B., Holub, J., 6D classification of pattern matching problems, In: Proceedings of the Prague Stringology Club Workshop'97 (J. Holub, ed.), Czech Technical University in Prague, 1997, pp. 24–32. Collaborative Report DC-97-03. 
  42. Melichar, B., Holub, J., Polcar, T., Text searching algorithms. Vol. I: Forward string matching, Textbook for course Text Searching Algorithms, 2005. 
  43. Moore, E. F., Gedanken experiments on sequential machines, Automata Studies (1965), 129–153. MR0078059
  44. Morris, J. H., Jr, Pratt, V. R., A Linear Pattern-Matching Algorithm, Report 40, University of California, Berkeley 1970. 
  45. Navarro, G., Raffinot, M., A bit-parallel approach to suffix automata: Fast extended string matching, In: Proc. 9th Annual Symposium on Combinatorial Pattern Matching (M. Farach-Colton, ed.), Lect. Notes in Comput. Sci. 1448, Piscataway, NJ, 1998. Springer-Verlag, Berlin, pp. 14–33. MR1672618
  46. Rabin, M. O., Scott, D., 10.1147/rd.32.0114, IBM J. Res. Dev. 3 (1959) 114–125. Zbl0158.25404MR0103795DOI10.1147/rd.32.0114
  47. Sellers, P. H., 10.1016/0196-6774(80)90016-4, J. Algorithms 1 (1980), 4, 359–373. Zbl0454.68110MR0604870DOI10.1016/0196-6774(80)90016-4
  48. Shyamasundar, R. K., A Simple String Matching Algorithm, Technical Report, Tata Institute of Fundamental Research 1976. 
  49. M. Šimůnek, Melichar, B., Borders and finite automata, In: Implementation and Application of Automata (O. H. Ibarra and H.-C. Yen, eds.), Lecture Notes in Comput. Sci. 4094, Springer-Verlag, Heidelberg 2006, pp. 58–68. Zbl1142.68416MR2296447
  50. Ukkonen, E., 10.1016/0196-6774(85)90023-9, J. Algorithms 6 (1985), 1–3, 132–137. Zbl0566.68072MR0780855DOI10.1016/0196-6774(85)90023-9
  51. Voráček, M., Vagner, L., Rahman, S., Iliopoulos, C. S., The constrained longest common subsequence problem for degenerate strings, In: Implementation and Application of Automata (J. Holub and J. Žďárek, eds.), Lecture Notes in Comput. Sci. 4783, Springer-Verlag, Heidelberg 2007, pp. 309–311. Zbl1137.68621
  52. Wu, S., Manber, U., 10.1145/135239.135244, Commun. ACM 35 (1992), 10, 83–91. DOI10.1145/135239.135244
  53. Ziv, J., Lempel, A., 10.1109/TIT.1978.1055934, IEEE Trans. Inf. Theory 24 (1978), 530–536. MR0507465DOI10.1109/TIT.1978.1055934

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.