A sparse dynamic programming algorithm for alignment with non-overlapping inversions

Alair Pereira Do Lago; Ilya Muchnik; Casimir Kulikowski

RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications (2005)

  • Volume: 39, Issue: 1, page 175-190
  • ISSN: 0988-3754

Abstract

top
Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, thus some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many important results in the last decade, the complexity of the problem of alignment with inversions is still unknown. In 1992, Schöniger and Waterman proposed the simplification hypothesis that the inversions do not overlap. They also presented an O ( n 6 ) exact solution for the alignment with non-overlapping inversions problem and introduced a heuristic for it that brings the average case complexity down. (In this work, n is the maximal length of both sequences that are aligned.) The present paper gives two exact algorithms for the simplified problem. We give a quite simple dynamic program with O ( n 4 ) -time and O ( n 2 ) -space complexity for alignments with non-overlapping inversions and exhibit a sparse and exact implementation version of this procedure that uses much less resources for some applications with real data.

How to cite

top

Do Lago, Alair Pereira, Muchnik, Ilya, and Kulikowski, Casimir. "A sparse dynamic programming algorithm for alignment with non-overlapping inversions." RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications 39.1 (2005): 175-190. <http://eudml.org/doc/245917>.

@article{DoLago2005,
abstract = {Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, thus some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many important results in the last decade, the complexity of the problem of alignment with inversions is still unknown. In 1992, Schöniger and Waterman proposed the simplification hypothesis that the inversions do not overlap. They also presented an $O(n^6)$ exact solution for the alignment with non-overlapping inversions problem and introduced a heuristic for it that brings the average case complexity down. (In this work, $n$ is the maximal length of both sequences that are aligned.) The present paper gives two exact algorithms for the simplified problem. We give a quite simple dynamic program with $O(n^4)$-time and $O(n^2)$-space complexity for alignments with non-overlapping inversions and exhibit a sparse and exact implementation version of this procedure that uses much less resources for some applications with real data.},
author = {Do Lago, Alair Pereira, Muchnik, Ilya, Kulikowski, Casimir},
journal = {RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications},
language = {eng},
number = {1},
pages = {175-190},
publisher = {EDP-Sciences},
title = {A sparse dynamic programming algorithm for alignment with non-overlapping inversions},
url = {http://eudml.org/doc/245917},
volume = {39},
year = {2005},
}

TY - JOUR
AU - Do Lago, Alair Pereira
AU - Muchnik, Ilya
AU - Kulikowski, Casimir
TI - A sparse dynamic programming algorithm for alignment with non-overlapping inversions
JO - RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications
PY - 2005
PB - EDP-Sciences
VL - 39
IS - 1
SP - 175
EP - 190
AB - Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, thus some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many important results in the last decade, the complexity of the problem of alignment with inversions is still unknown. In 1992, Schöniger and Waterman proposed the simplification hypothesis that the inversions do not overlap. They also presented an $O(n^6)$ exact solution for the alignment with non-overlapping inversions problem and introduced a heuristic for it that brings the average case complexity down. (In this work, $n$ is the maximal length of both sequences that are aligned.) The present paper gives two exact algorithms for the simplified problem. We give a quite simple dynamic program with $O(n^4)$-time and $O(n^2)$-space complexity for alignments with non-overlapping inversions and exhibit a sparse and exact implementation version of this procedure that uses much less resources for some applications with real data.
LA - eng
UR - http://eudml.org/doc/245917
ER -

References

top
  1. [1] D.A. Bader, B.M.E. Moret and M. Yan, A linear-time algorithm for computing inversion distance between signed permutations with an experimental study, in Algorithms and data structures (Providence, RI, 2001), Lect. Notes Comput. Sci. 2125 (2001) 365–376. Zbl0997.68527
  2. [2] A. Caprara, Sorting permutations by reversals and Eulerian cycle decompositions. SIAM J. Discrete Math. 12 (1999) 91–110 (electronic). Zbl0916.68074
  3. [3] D.A. Christie, A 3 / 2 -approximation algorithm for sorting by reversals, in Proc. of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, CA, 1998). ACM, New York (1998) 244–252. Zbl0930.68044
  4. [4] A.P. do Lago, C.A. Kulikowski, E. Linton, J. Messing and I. Muchnik, Comparative genomics: simultaneous identification of conserved regions and their rearrangements through global optimization, in The Second University of Sao Paulo/Rutgers University Biotechnology Conference, Rutgers University Inn and Conference Center, New Brunswick, NJ (August 2001). 
  5. [5] A.P. do Lago, A sparse dynamic programming algorithm for alignment with inversions. Workshop on Combinatorics, Algorithms, and Applications (2003). 
  6. [6] A.P. do Lago, I. Muchnik and C. Kulikowski, An o ( n 4 ) algorithm for alignment with non-overlapping inversions, in Second Brazilian Workshop on Bioinformatics, WOB 2003, Macaé, RJ, Brazil (2003) http://www.ime.usp.br/ alair/woob03.pdf 
  7. [7] N. El-Mabrouk, Genome rearrangement by reversals and insertions/deletions of contiguous segments, in Combinatorial pattern matching (Montreal, QC, 2000). Lect. Notes Comput. Sci. 1848 (2000) 222–234. Zbl0964.92030
  8. [8] N. El-Mabrouk, Sorting signed permutations by reversals and insertions/deletions of contiguous segments. J. Discrete Algorithms 1 (2000) 105–121. 
  9. [9] N. El-Mabrouk, Reconstructing an ancestral genome using minimum segments duplications and reversals. J. Comput. Syst. Sci. 65 (2002) 442–464. Special issue on computational biology (2002). Zbl1058.68529
  10. [10] D. Eppstein, Z. Galil, R. Giancarlo and G.F. Italiano, Sparse dynamic programming. I. Linear cost functions. J. Assoc. Comput. Mach. 39 (1992) 519–545. Zbl0807.90120
  11. [11] D. Eppstein, Z. Galil, R. Giancarlo and G.F. Italiano, Sparse dynamic programming. II. Convex and concave cost functions. J. Assoc. Comput. Mach. 39 (1992) 546–567. Zbl0816.90130
  12. [12] Y. Gao, J. Wu, R. Niewiadomski1, Y. Wang, Z.-Z. Chen and G. Lin, A space efficient algorithm for sequence alignment with inversions, in Computing and Combinatorics, 9th Annual International Conference, COCOON 2003. Lect. Notes Comput. Sci. 2697 57–67 (2003). Zbl1276.92087
  13. [13] S. Hannenhalli and P.A. Pevzner, Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals, in ACM Symposium on Theory of Computing, Association for Computing Machinery (1995) 178–189. Zbl0978.68530
  14. [14] S. Hannenhalli and P.A. Pevzner, Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals. J. ACM 46 (1999) 1–27. Zbl1064.92510
  15. [15] J.W. Hunt and T.G. Szymanski, A fast algorithm for computing longest common subsequences. Comm. ACM 20 (1997) 350–353. Zbl0354.68078
  16. [16] H. Kaplan, R. Shamir and R.E. Tarjan, Faster and simpler algorithm for sorting signed permutations by reversals, in Proc. of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (New Orleans, LA, 1997), ACM, New York (1997) 344–351. Zbl1321.68234
  17. [17] H. Kaplan, R. Shamir and R.E. Tarjan, A faster and simpler algorithm for sorting signed permutations by reversals. SIAM J. Comput. 29 (2000) 880–892 (electronic). Zbl1112.68405
  18. [18] J. Kececioglu and D. Sankoff, Exact and approximation algorithms for sorting by reversals, with application to genome rearrangement. Algorithmica 13 (1995) 180–210. Zbl0831.92014
  19. [19] I. Muchnik, A. Pereira do Lago, V. Llaca, E. Linton, C.A. Kulikowski and J. Messing, Assignment-like optimization on bipartite graphs with ordered nodes as an approach to the analysis of comparative genomic data. DIMACS Workshop on Whole Genome Comparison (2001) http://dimacs.rutgers.edu/Workshops/WholeGenome/ 
  20. [20] M. Schöniger and M.S. Waterman, A local algorithm for DNA sequence alignment with inversions. Bull. Math. Biology 54 (Jul. 1992) 521–536. Zbl0745.92020
  21. [21] I. Simon, Sequence comparison: some theory and some practice, in Electronic Dictionaries and Automata in Computational Linguistics, edited by M. Gross and D. Perrin. Springer-Verlag. Berlin, Lect. Notes Comput. Sci. 377 (1989) 79–92. 
  22. [22] R. Wagner, On the complexity of the extended string-to-string correction problem, in Seventh ACM Symposium on the Theory of Computation. Association for Computing Machinery (1975). Zbl0357.68047MR445910
  23. [23] Waterman and Eggert, A new algorithm for best subsequence alignments with application to tRNA-rRNA comparisons. J. Molecular Biology 197 (1987) 723–728. 

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.