Abstract
Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences.
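The linear dependence on K can be made concrete with a small sketch. Assuming that each sequence is aligned to the trained model independently by finding its most likely state path, the cost is one dynamic-programming pass per sequence. The code below is a generic Viterbi decoder on a hypothetical two-state toy model (the states `X` and `Y`, the alphabet `a`/`b`, and all probabilities are illustrative assumptions); it is not the model architecture or the training algorithm from the paper.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for one observation sequence."""
    # score[s] = best log-probability of any state path ending in s
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one dict of back-pointers per position after the first
    for symbol in obs[1:]:
        prev = score
        score, pointers = {}, {}
        for s in states:
            # Best predecessor of s at this position
            best_prev = max(states, key=lambda p: prev[p] + log_trans[p][s])
            score[s] = prev[best_prev] + log_trans[best_prev][s] + log_emit[s][symbol]
            pointers[s] = best_prev
        back.append(pointers)
    # Trace back from the best final state
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path

# Hypothetical toy model (all numbers are illustrative assumptions)
states = ["X", "Y"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {
    "X": {"X": math.log(0.8), "Y": math.log(0.2)},
    "Y": {"X": math.log(0.2), "Y": math.log(0.8)},
}
log_emit = {
    "X": {"a": math.log(0.9), "b": math.log(0.1)},
    "Y": {"a": math.log(0.1), "b": math.log(0.9)},
}

# "Aligning" K sequences is K independent decoding passes
sequences = ["aabb", "abbb", "aaab"]
paths = [viterbi(seq, states, log_start, log_trans, log_emit)[1] for seq in sequences]
```

This dense version costs on the order of N·S^2 operations for a sequence of length N and S states. If the model has on the order of N states with a bounded number of transitions into each state, one pass costs roughly O(N^2), and K passes give the O(KN^2) total stated in the abstract.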
Dates
Type | When |
---|---|
Created | May 31, 2006, 9:03 a.m. |
Deposited | April 13, 2022, 2:02 p.m. |
Indexed | Sept. 3, 2025, 6:02 a.m. |
Issued | Feb. 1, 1994 |
Published | Feb. 1, 1994 |
Published Online | Feb. 1, 1994 |
Published Print | Feb. 1, 1994 |
@article{Baldi_1994,
  title={Hidden Markov models of biological primary sequence information.},
  volume={91},
  ISSN={1091-6490},
  url={http://dx.doi.org/10.1073/pnas.91.3.1059},
  DOI={10.1073/pnas.91.3.1059},
  number={3},
  journal={Proceedings of the National Academy of Sciences},
  publisher={Proceedings of the National Academy of Sciences},
  author={Baldi, P and Chauvin, Y and Hunkapiller, T and McClure, M A},
  year={1994},
  month=feb,
  pages={1059--1063}
}