Abstract
Comparative structure models are available for two orders of magnitude more protein sequences than are experimentally determined structures. These models, however, suffer from two limitations that experimentally determined structures do not: They frequently contain significant errors, and their accuracy cannot be readily assessed. We have addressed the latter limitation by developing a protocol optimized specifically for predicting the Cα root‐mean‐squared deviation (RMSD) and native overlap (NO3.5Å) errors of a model in the absence of its native structure. In contrast to most traditional assessment scores that merely predict that one model is more accurate than others, this approach quantifies the error in an absolute sense, thus helping to determine whether or not the model is suitable for intended applications. The assessment relies on a model‐specific scoring function constructed by a support vector machine. This regression optimizes the weights of up to nine features, including various sequence similarity measures and statistical potentials, extracted from a tailored training set of models unique to the model being assessed: If possible, we use similarly sized models with the same fold; otherwise, we use similarly sized models with the same secondary structure composition. This protocol predicts the RMSD and NO3.5Å errors for a diverse set of 580,317 comparative models of 6174 sequences with correlation coefficients (r) of 0.84 and 0.86, respectively, to the actual errors. This scoring function achieves the best correlation compared to 13 other tested assessment criteria that achieved correlations ranging from 0.35 to 0.71.
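The two error measures named in the abstract, and the correlation coefficient used to evaluate their prediction, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the model and native Cα coordinates have already been optimally superposed and are paired one to one, and it treats NO3.5Å as the fraction of Cα atoms lying within 3.5 Å of their native counterparts (the inclusive `<=` comparison is an assumption). The function names `ca_rmsd`, `native_overlap`, and `pearson_r` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def _ca_distances(model: Sequence[Coord], native: Sequence[Coord]) -> List[float]:
    """Per-residue distances between paired C-alpha atoms.

    Assumption: both coordinate sets are already superposed and aligned
    residue by residue; no fitting is performed here.
    """
    if len(model) != len(native) or len(model) == 0:
        raise ValueError("model and native need the same non-zero number of atoms")
    return [math.dist(m, n) for m, n in zip(model, native)]


def ca_rmsd(model: Sequence[Coord], native: Sequence[Coord]) -> float:
    """Root-mean-squared deviation over the paired C-alpha atoms."""
    d = _ca_distances(model, native)
    return math.sqrt(sum(x * x for x in d) / len(d))


def native_overlap(model: Sequence[Coord], native: Sequence[Coord],
                   cutoff: float = 3.5) -> float:
    """Fraction of C-alpha atoms within `cutoff` angstroms of the native position.

    Assumption: the cutoff is inclusive.
    """
    d = _ca_distances(model, native)
    return sum(1 for x in d if x <= cutoff) / len(d)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between predicted and actual errors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy data: four residues, only the last one is displaced (by 4 angstroms).
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 4.0)]

rmsd = ca_rmsd(model, native)            # 2.0 for this toy example
overlap = native_overlap(model, native)  # 0.75 for this toy example
```

In the protocol described above, these actual errors are known only for training models; the support vector regression predicts them for a new model, and `pearson_r` corresponds to the reported r between predicted and actual values.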
Dates
Type | When |
---|---|
Created | Oct. 1, 2008, 10:01 p.m. |
Deposited | Sept. 28, 2023, 9:51 a.m. |
Indexed | Aug. 26, 2025, 3:04 a.m. |
Issued | Nov. 1, 2008 |
Published | Nov. 1, 2008 |
Published Online | Jan. 2, 2009 |
Published Print | Nov. 1, 2008 |
@article{Eramian_2008,
  title={How well can the accuracy of comparative protein structure models be predicted?},
  volume={17},
  ISSN={1469-896X},
  url={http://dx.doi.org/10.1110/ps.036061.108},
  DOI={10.1110/ps.036061.108},
  number={11},
  journal={Protein Science},
  publisher={Wiley},
  author={Eramian, David and Eswar, Narayanan and Shen, Min‐Yi and Sali, Andrej},
  year={2008},
  month=nov,
  pages={1881–1893}
}