Abstract
Motivation: Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, which accumulate the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, until now, the computation of shape probabilities has evaluated all shapes simultaneously, at a computational cost that is exponential in the length of the sequence.

Results: We devise an approach called RapidShapes that computes the shapes above a specified probability threshold T by generating a list of promising shapes and constructing a specialized folding program for each shape to compute its share of Boltzmann probability. This aims at a heuristic improvement of runtime while still computing exact probability values.

Conclusion: Evaluating this approach and several substrategies, we find that only a small proportion of shapes actually have to be computed. For an RNA sequence of length 400, this leads, depending on the threshold, to a 10- to 138-fold speed-up compared with the previous complete method. Thus, probabilistic shape analysis has become feasible in medium-scale applications, such as the screening of RNA transcripts in a bacterial genome.

Availability: RapidShapes is available via http://bibiserv.cebitec.uni-bielefeld.de/rnashapes

Contact: robert@techfak.uni-bielefeld.de

Supplementary information: Supplementary data are available at Bioinformatics online.
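To make the notion of a shape probability concrete, the following minimal Python sketch accumulates Boltzmann weights per shape over a tiny, explicitly enumerated ensemble and then keeps the shapes above a threshold T. It is an illustration under stated assumptions, not the RapidShapes method: RapidShapes avoids enumeration by running a specialized folding program per shape. The function names, the toy structures, and their energies are hypothetical, and `shape_level5` is a simplified reading of the most abstract shape level (helix nesting only, unpaired bases ignored).

```python
import math
from collections import defaultdict

GAS_CONSTANT = 0.0019872   # kcal / (mol * K)
TEMPERATURE = 310.15       # K, i.e. 37 degrees Celsius
RT = GAS_CONSTANT * TEMPERATURE


def shape_level5(dot_bracket):
    """Map a dot-bracket structure to a simplified, most-abstract shape string.

    Assumption: unpaired bases are ignored, and a base pair whose only
    content is one directly enclosed pair is merged into that inner helix.
    """
    s = dot_bracket.replace(".", "")
    stack, partner = [], {}
    for i, c in enumerate(s):
        if c == "(":
            stack.append(i)
        else:
            partner[stack.pop()] = i
    kept = []
    for i, j in partner.items():
        # The pair (i, j) is redundant if it directly encloses exactly one pair.
        if partner.get(i + 1) != j - 1:
            kept += [(i, "["), (j, "]")]
    shape = "".join(ch for _, ch in sorted(kept))
    return shape or "_"    # "_" denotes the unfolded (open) chain


def shape_probabilities(ensemble):
    """Sum Boltzmann probabilities of all structures sharing a shape.

    `ensemble` is a list of (dot_bracket, free_energy_kcal_per_mol) pairs.
    """
    weights = defaultdict(float)
    for structure, energy in ensemble:
        weights[shape_level5(structure)] += math.exp(-energy / RT)
    z = sum(weights.values())          # partition function of the toy ensemble
    return {shape: w / z for shape, w in weights.items()}


def shapes_above(probabilities, threshold):
    """Keep only shapes whose probability is at least `threshold`."""
    return {s: p for s, p in probabilities.items() if p >= threshold}


# Hypothetical toy ensemble; the energies are made up for illustration.
toy_ensemble = [
    ("((((...)))).....", -3.0),
    ("(((.....))).....", -2.0),
    ("((...))..((...))", -1.5),
    ("................", 0.0),
]
probs = shape_probabilities(toy_ensemble)
frequent = shapes_above(probs, 0.1)
```

Because the two hairpin structures fall into the same shape, their weights add up, which is why a shape can be probable even when no single structure in it dominates.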
Dates
Type | When |
---|---|
Created | Jan. 15, 2010, 8:14 p.m. |
Deposited | Jan. 25, 2023, 6:03 a.m. |
Indexed | May 23, 2024, 5:57 a.m. |
Issued | Jan. 14, 2010 |
Published | Jan. 14, 2010 |
Published Online | Jan. 14, 2010 |
Published Print | March 1, 2010 |
@article{Janssen_2010, title={Faster computation of exact RNA shape probabilities}, volume={26}, ISSN={1367-4803}, url={http://dx.doi.org/10.1093/bioinformatics/btq014}, DOI={10.1093/bioinformatics/btq014}, number={5}, journal={Bioinformatics}, publisher={Oxford University Press (OUP)}, author={Janssen, Stefan and Giegerich, Robert}, year={2010}, month=jan, pages={632–639} }