Abstract
Motivation: A sizeable fraction of eukaryotic proteins contain intrinsically disordered regions (IDRs), which act in unfolded states or by undergoing transitions between structured and unstructured conformations. Over time, sequence-based classifiers of IDRs have become fairly accurate, and a major challenge now is linking IDRs to their biological roles from the molecular to the systems level.

Results: We describe DISOPRED3, which extends its predecessor with new modules to predict IDRs and protein-binding sites within them. Based on recent CASP evaluation results, DISOPRED3 can be regarded as state of the art in the identification of IDRs, and our self-assessment shows that it significantly improves over DISOPRED2 because its predictions are more specific across the board and more sensitive to IDRs longer than 20 amino acids. Predicted IDRs are annotated as protein binding through a novel SVM-based classifier, which uses profile data and additional sequence-derived features. Based on benchmarking experiments with full cross-validation, we show that this predictor generates precise assignments of disordered protein-binding regions and that it compares well with other publicly available tools.

Availability and implementation: http://bioinf.cs.ucl.ac.uk/disopred

Contact: d.t.jones@ucl.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
Dates
Type | When |
---|---|
Created | Nov. 13, 2014, 12:02 a.m. |
Deposited | Feb. 1, 2023, 7:30 p.m. |
Indexed | Aug. 12, 2025, 6:16 p.m. |
Issued | Nov. 12, 2014 |
Published | Nov. 12, 2014 |
Published Online | Nov. 12, 2014 |
Published Print | March 15, 2015 |
@article{Jones_2014,
  title={DISOPRED3: precise disordered region predictions with annotated protein-binding activity},
  volume={31},
  ISSN={1367-4803},
  url={http://dx.doi.org/10.1093/bioinformatics/btu744},
  DOI={10.1093/bioinformatics/btu744},
  number={6},
  journal={Bioinformatics},
  publisher={Oxford University Press (OUP)},
  author={Jones, David T. and Cozzetto, Domenico},
  year={2014},
  month=nov,
  pages={857--863}
}