Abstract
One of the most central methods in bioinformatics is the alignment of two protein or DNA sequences. However, large‐scale benchmarks examining the quality of these alignments are still scarce. In contrast, several recent large‐scale studies of the capacity of different methods to identify related sequences have led to new insights into the performance of fold recognition methods. To increase our understanding of fold recognition methods, we present a large‐scale benchmark of alignment quality. We compare alignments from several different alignment methods, including sequence alignments, hidden Markov models, PSI‐BLAST, CLUSTALW, and threading methods. For most methods, the alignment quality increases significantly at about 20% sequence identity. The difference in alignment quality between methods is quite small; the main difference lies in the exact position of the sharp rise in alignment quality, which occurs around 15–20% sequence identity. The alignments are improved by using structural information. In general, the best alignments are obtained by methods that use predicted secondary structure information and sequence profiles obtained from PSI‐BLAST. One interesting observation is that different methods produce the best alignment for different pairs. This finding implies that if a method existed that could select the best alignment method for each pair, a significant improvement in alignment quality could be gained. Proteins 2002;46:330–339. © 2002 Wiley‐Liss, Inc.
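The abstract reports alignment quality as a function of sequence identity but does not define either quantity. As a rough illustration only, the Python sketch below computes percent identity over the gap-free columns of a pairwise alignment, and scores a test alignment by the fraction of residue pairs it shares with a reference alignment. Both definitions, the function names, and the `-` gap character are assumptions made for this example and are not taken from the paper.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: identity is normalised by the number of gap-free columns;
    other normalisations (e.g. by the shorter sequence length) are also common.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def aligned_pairs(aligned_a: str, aligned_b: str, gap: str = "-") -> set:
    """Set of (i, j) residue-index pairs that share an alignment column."""
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(aligned_a, aligned_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def fraction_correct(test: tuple, reference: tuple) -> float:
    """Fraction of reference residue pairs reproduced by the test alignment.

    Assumption: the reference would typically be a structure-based alignment.
    """
    ref = aligned_pairs(*reference)
    if not ref:
        return 0.0
    return len(aligned_pairs(*test) & ref) / len(ref)


if __name__ == "__main__":
    print(percent_identity("ACD-F", "ACE-F"))
    print(fraction_correct(("AC-D", "A-CD"), ("ACD", "ACD")))
```

With per-pair scores of this kind for each method, the abstract's closing observation amounts to comparing the maximum score over methods for each pair with the score of any single method.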
Dates
Type | When |
---|---|
Created | Aug. 25, 2002, 6:16 p.m. |
Deposited | Oct. 15, 2023, 2:13 p.m. |
Indexed | June 5, 2025, 12:24 p.m. |
Issued | Jan. 8, 2002 |
Published | Jan. 8, 2002 |
Published Online | Jan. 8, 2002 |
Published Print | Feb. 15, 2002 |
@article{Elofsson_2002, title={A study on protein sequence alignment quality}, volume={46}, ISSN={1097-0134}, url={http://dx.doi.org/10.1002/prot.10043}, DOI={10.1002/prot.10043}, number={3}, journal={Proteins: Structure, Function, and Bioinformatics}, publisher={Wiley}, author={Elofsson, Arne}, year={2002}, month=jan, pages={330–339} }