Abstract
Modeling the inherent flexibility of the protein backbone as part of computational protein design is necessary to capture the behavior of real proteins and is a prerequisite for the accurate exploration of protein sequence space. We present the results of a broad exploration of sequence space, with backbone flexibility, through a novel approach: large‐scale protein design to structural ensembles. A distributed computing architecture has allowed us to generate hundreds of thousands of diverse sequences for a set of 253 naturally occurring proteins, allowing exciting insights into the nature of protein sequence space. Designing to a structural ensemble produces a much greater diversity of sequences than previous studies have reported, and homology searches using profiles derived from the designed sequences against the Protein Data Bank show that the relevance and quality of the sequences are not diminished. The designed sequences have greater overall diversity than corresponding natural sequence alignments, and no direct correlations are seen between the diversity of natural sequence alignments and the diversity of the corresponding designed sequences. For structures in the same fold, the sequence entropies of the designed sequences cluster together tightly. This tight clustering of sequence entropies within a fold and the separation of sequence entropy distributions for different folds suggest that the diversity of designed sequences is primarily determined by a structure's overall fold, and that the designability principle postulated from studies of simple models holds in real proteins. This has important implications for experimental protein design and engineering, as well as providing insight into protein evolution.
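The abstract compares sequence diversity through sequence entropy but does not say how that entropy was computed. As a minimal sketch, assuming the per-position Shannon entropy of a gap-free set of equal-length sequences, averaged over positions, one way to obtain such a number is shown below. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the residues observed at one position."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mean_sequence_entropy(sequences):
    """Average per-position entropy over a set of equal-length sequences.

    Assumption: sequences are already aligned and contain no gaps.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must all have the same length")
    # zip(*sequences) yields one tuple of residues per position.
    return sum(column_entropy(col) for col in zip(*sequences)) / length


# Hypothetical toy data: positions 1 and 4 are fixed, positions 2 and 3 vary.
toy_designed = ["ACDA", "ACEA", "AGDA", "AGEA"]
print(mean_sequence_entropy(toy_designed))
```

Under this assumed definition, a fully conserved position contributes zero and a position split evenly between two residues contributes one bit, so a higher average indicates a more diverse set of sequences.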
References
54
Referenced
87
Dates
Type | When |
---|---|
Created | Nov. 19, 2002, 7:14 p.m. |
Deposited | Oct. 9, 2023, 10:41 a.m. |
Indexed | July 26, 2025, 5:04 a.m. |
Issued | Dec. 1, 2002 |
Published | Dec. 1, 2002 |
Published Online | April 13, 2009 |
Published Print | Dec. 1, 2002 |
@article{Larson_2002, title={Thoroughly sampling sequence space: Large‐scale protein design of structural ensembles}, volume={11}, ISSN={1469-896X}, url={http://dx.doi.org/10.1110/ps.0203902}, DOI={10.1110/ps.0203902}, number={12}, journal={Protein Science}, publisher={Wiley}, author={Larson, Stefan M. and England, Jeremy L. and Desjarlais, John R. and Pande, Vijay S.}, year={2002}, month=dec, pages={2804–2813} }