Abstract
Currently, the prediction of three‐dimensional (3D) protein structure from sequence alone is an exceedingly difficult task. As an intermediate step, a much simpler task has been pursued extensively: predicting 1D strings of secondary structure. Here, we present an analysis of another 1D projection from 3D structure: the relative solvent accessibility of each residue. We show that solvent accessibility is less conserved in 3D homologues than is secondary structure, and hence is predicted less accurately from automatic homology modeling; the correlation coefficient of relative solvent accessibility between 3D homologues is only 0.77, and the average accuracy of predictions based on sequence alignments is only 0.68. The latter number provides an effective upper limit on the accuracy of predicting accessibility from sequence when homology modeling is not possible. We introduce a neural network system that predicts relative solvent accessibility (projected onto ten discrete states) using evolutionary profiles of amino acid substitutions derived from multiple sequence alignments. Evaluated in a cross‐validation test on 238 unique proteins, the correlation between predicted and observed relative accessibility is 0.54. Interpreted in terms of a three‐state (buried, intermediate, exposed) description of relative accessibility, the fraction of correctly predicted residue states is about 58%. In absolute terms this accuracy appears poor, but given the relatively low conservation of accessibility in 3D families, the network system is not far from its likely optimal performance. The most reliably predicted fraction of the residues (50%) is predicted as accurately as by automatic homology modeling. Prediction is best for buried residues, e.g., 86% of the completely buried sites are correctly predicted as having 0% relative accessibility. © 1994 Wiley‐Liss, Inc.
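The abstract reports two kinds of score: a correlation coefficient between predicted and observed relative accessibility, and the fraction of residues assigned to the correct one of three states. The following is a minimal sketch of how such scores could be computed, not the authors' evaluation code. The cut-offs `BURIED_MAX` and `EXPOSED_MIN`, the function names, and the toy values are all assumptions made for illustration; the abstract does not state the thresholds.

```python
from math import sqrt

# Assumed cut-offs in percent relative accessibility (not given in the abstract).
BURIED_MAX = 9.0    # below this value a residue is called buried
EXPOSED_MIN = 36.0  # at or above this value a residue is called exposed


def three_state(rel_acc):
    """Map a relative accessibility (0-100 %) to 'b', 'i' or 'e'."""
    if rel_acc < BURIED_MAX:
        return "b"
    if rel_acc < EXPOSED_MIN:
        return "i"
    return "e"


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def three_state_accuracy(predicted, observed):
    """Fraction of residues whose predicted and observed states agree."""
    hits = sum(three_state(p) == three_state(o)
               for p, o in zip(predicted, observed))
    return hits / len(observed)


# Toy per-residue values in percent, invented purely for illustration.
observed = [0.0, 5.0, 20.0, 50.0, 80.0]
predicted = [0.0, 12.0, 25.0, 40.0, 30.0]

print("correlation:", pearson(predicted, observed))
print("three-state accuracy:", three_state_accuracy(predicted, observed))
```

With these toy values, three of the five residues fall into the same state under the assumed cut-offs. Moving the cut-offs changes the three-state accuracy but leaves the correlation unchanged, which is one reason to report both figures.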
Dates
Type | When |
---|---|
Created | May 28, 2005, 9:37 p.m. |
Deposited | Oct. 7, 2023, 9:59 p.m. |
Indexed | Aug. 7, 2025, 4:54 a.m. |
Issued | Nov. 1, 1994 |
Published | Nov. 1, 1994 |
Published Online | Feb. 3, 2004 |
Published Print | Nov. 1, 1994 |
@article{Rost_1994, title={Conservation and prediction of solvent accessibility in protein families}, volume={20}, ISSN={1097-0134}, url={http://dx.doi.org/10.1002/prot.340200303}, DOI={10.1002/prot.340200303}, number={3}, journal={Proteins: Structure, Function, and Bioinformatics}, publisher={Wiley}, author={Rost, Burkhard and Sander, Chris}, year={1994}, month=nov, pages={216–226} }