Some remarks on evaluating the quality of the multiple sequence alignment based on the BALiBASE benchmark

Jacek Błażewicz; Piotr Formanowicz; Paweł Wojciechowski

International Journal of Applied Mathematics and Computer Science (2009)

  • Volume: 19, Issue: 4, page 675-678
  • ISSN: 1641-876X

Abstract

BAliBASE is one of the most widely used benchmarks for multiple sequence alignment programs. The accuracy of alignment methods is measured by bali_score, an application provided together with the database. The standard accuracy measures are the Sum of Pairs (SP) and the Total Column (TC). We have found that, for non-core block columns, results calculated by bali_score are different from those obtained on the basis of the formal definitions of the measures. We do not claim that one of these measures is better than the other, but they are definitely different. Such a situation can be the source of confusion when alignments obtained using various methods are compared. Therefore, we propose a new nomenclature for the measures of the quality of multiple sequence alignments to distinguish which one was actually calculated. Moreover, we have found that the occurrence of a gap in some column in the first sequence of the reference alignment causes column discarding.
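
As an illustration of the two measures named in the abstract, the following is a minimal sketch of one common reading of the SP and TC definitions, applied to every column of the reference alignment. It is not the bali_score implementation and does not reproduce its handling of core blocks or of gaps in the first reference sequence. The function names (`column_indices`, `aligned_pairs`, `sp_score`, `tc_score`) and the toy alignments are hypothetical, and alignments are assumed to be lists of equal-length gapped strings using `-` as the gap symbol.

```python
from itertools import combinations


def column_indices(alignment, gap="-"):
    """Turn gapped rows into columns of residue indices (None marks a gap)."""
    counters = [0] * len(alignment)
    columns = []
    for col in zip(*alignment):
        entry = []
        for row, symbol in enumerate(col):
            if symbol == gap:
                entry.append(None)
            else:
                entry.append(counters[row])
                counters[row] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Set of residue pairs ((seq_a, idx_a), (seq_b, idx_b)) sharing a column."""
    pairs = set()
    for col in columns:
        residues = [(row, idx) for row, idx in enumerate(col) if idx is not None]
        pairs.update(combinations(residues, 2))
    return pairs


def sp_score(test, reference):
    """Assumed SP reading: fraction of reference residue pairs found in the test."""
    ref_pairs = aligned_pairs(column_indices(reference))
    test_pairs = aligned_pairs(column_indices(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs)


def tc_score(test, reference):
    """Assumed TC reading: fraction of reference columns reproduced exactly."""
    ref_cols = [c for c in column_indices(reference)
                if sum(i is not None for i in c) >= 2]
    test_cols = set(column_indices(test))
    return sum(c in test_cols for c in ref_cols) / len(ref_cols)


# Toy data (hypothetical): the same three sequences aligned in two ways.
reference = ["AC-G", "ACTG", "A-TG"]
test = ["ACG-", "ACTG", "A-TG"]
print(sp_score(test, reference), tc_score(test, reference))
```

Scoring only a subset of columns (for example, core blocks) or discarding columns under some rule changes both the numerator and the denominator of these ratios, which is why two tools can report different values under the same measure name.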

How to cite

Jacek Błażewicz, Piotr Formanowicz, and Paweł Wojciechowski. "Some remarks on evaluating the quality of the multiple sequence alignment based on the BALiBASE benchmark." International Journal of Applied Mathematics and Computer Science 19.4 (2009): 675-678. <http://eudml.org/doc/207965>.

@article{JacekBłażewicz2009,
abstract = {BAliBASE is one of the most widely used benchmarks for multiple sequence alignment programs. The accuracy of alignment methods is measured by bali_score, an application provided together with the database. The standard accuracy measures are the Sum of Pairs (SP) and the Total Column (TC). We have found that, for non-core block columns, results calculated by bali_score are different from those obtained on the basis of the formal definitions of the measures. We do not claim that one of these measures is better than the other, but they are definitely different. Such a situation can be the source of confusion when alignments obtained using various methods are compared. Therefore, we propose a new nomenclature for the measures of the quality of multiple sequence alignments to distinguish which one was actually calculated. Moreover, we have found that the occurrence of a gap in some column in the first sequence of the reference alignment causes column discarding.},
author = {Jacek Błażewicz and Piotr Formanowicz and Paweł Wojciechowski},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {multiple sequence alignment; reference alignment; alignment accuracy},
language = {eng},
number = {4},
pages = {675-678},
title = {Some remarks on evaluating the quality of the multiple sequence alignment based on the BALiBASE benchmark},
url = {http://eudml.org/doc/207965},
volume = {19},
year = {2009},
}

TY - JOUR
AU - Jacek Błażewicz
AU - Piotr Formanowicz
AU - Paweł Wojciechowski
TI - Some remarks on evaluating the quality of the multiple sequence alignment based on the BALiBASE benchmark
JO - International Journal of Applied Mathematics and Computer Science
PY - 2009
VL - 19
IS - 4
SP - 675
EP - 678
AB - BAliBASE is one of the most widely used benchmarks for multiple sequence alignment programs. The accuracy of alignment methods is measured by bali_score, an application provided together with the database. The standard accuracy measures are the Sum of Pairs (SP) and the Total Column (TC). We have found that, for non-core block columns, results calculated by bali_score are different from those obtained on the basis of the formal definitions of the measures. We do not claim that one of these measures is better than the other, but they are definitely different. Such a situation can be the source of confusion when alignments obtained using various methods are compared. Therefore, we propose a new nomenclature for the measures of the quality of multiple sequence alignments to distinguish which one was actually calculated. Moreover, we have found that the occurrence of a gap in some column in the first sequence of the reference alignment causes column discarding.
LA - eng
KW - multiple sequence alignment; reference alignment; alignment accuracy
UR - http://eudml.org/doc/207965
ER -

