Crossref journal-article
Wiley
Protein Science (311)
Abstract

A new method is presented for identifying distantly related homologous proteins that are unrecognizable by conventional sequence comparison methods. The method combines information about functionally conserved sequence patterns with information about structure context. This information is encoded in stochastic discrete state‐space models (DSMs) that comprise a new family of hidden Markov models. The new models are called sequence‐pattern‐embedded DSMs (pDSMs). This method can identify distantly related protein family members with a high sensitivity and specificity. The method is illustrated with trypsin‐like serine proteases and globins. The strategy for building pDSMs is presented. The method has been validated using carefully constructed positive and negative control sets. In addition to the ability to recognize remote homologs, pDSM sequence analysis predicts secondary structures with higher sensitivity, specificity, and Q3 accuracy than DSM analysis, which omits information about conserved sequence patterns. The identification of trypsin‐like serine proteases in new genomes is discussed.
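The abstract compares pDSM and DSM analysis by sensitivity, specificity, and Q3 accuracy of secondary-structure prediction. As a rough illustration of what these numbers measure, here is a minimal sketch. It assumes three-state labels (H helix, E strand, C coil) and the usual binary-classification definitions of sensitivity and specificity; the paper's exact definitions are not given in this record, and the function names and example strings are hypothetical.

```python
from typing import Tuple


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label (H, E, C) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def sensitivity_specificity(predicted: str, observed: str, state: str) -> Tuple[float, float]:
    """Per-state sensitivity TP/(TP+FN) and specificity TN/(TN+FP).

    Assumption: standard binary-classification definitions, treating
    `state` as the positive class and the other two states as negative.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    pairs = list(zip(predicted, observed))
    tp = sum(p == state and o == state for p, o in pairs)
    fn = sum(p != state and o == state for p, o in pairs)
    tn = sum(p != state and o != state for p, o in pairs)
    fp = sum(p == state and o != state for p, o in pairs)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical 11-residue example; the prediction misses one helical residue.
observed = "CHHHHCCEEEC"
predicted = "CHHHCCCEEEC"
q3 = q3_accuracy(predicted, observed)
helix_sens, helix_spec = sensitivity_specificity(predicted, observed, "H")
```

In this toy case 10 of 11 residues match, so Q3 is 10/11, and the helix state has sensitivity 0.75 and specificity 1.0.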

Bibliography

Yu, L., White, J. V., & Smith, T. F. (1998). A homology identification method that combines protein sequence and structure information. Protein Science, 7(12), 2499–2510. https://doi.org/10.1002/pro.5560071203

Authors 3
  1. Lihua Yu (first)
  2. James V. White (additional)
  3. Temple F. Smith (additional)
References 53 Referenced 19
  1. 10.1002/pro.5560050703
  2. 10.1093/protein/9.7.591
  3. 10.1038/369072a0
  4. 10.1016/S0022-2836(05)80360-2
  5. 10.1093/nar/25.17.3389
  6. Bairoch A. 1991. PROSITE: A dictionary of sites and patterns in proteins. Nucleic Acids Res 19 Suppl:2241–2245. (10.1093/nar/19.suppl.2241)
  7. 10.1093/nar/22.17.3626
  8. 10.1093/nar/22.17.3626
  9. 10.1016/0022-2836(87)90521-3
  10. 10.1093/nar/22.17.3441
  11. 10.1016/S0022-2836(77)80200-3
  12. 10.1093/emboj/17.20.6061
  13. 10.1111/j.1432-1033.1987.tb13566.x
  14. 10.1126/science.273.5278.1058
  15. 10.1126/science.2471267
  16. 10.1006/jmbi.1996.0874
  17. 10.1126/science.7280687
  18. 10.1016/S0959-440X(96)80056-X
  19. 10.1016/0022-2836(78)90297-8
  20. 10.1016/S0959-440X(96)80058-3
  21. 10.1006/jmbi.1996.0569
  22. 10.1002/prot.340070404
  23. 10.1073/pnas.84.13.4355
  24. 10.1042/bj1010229
  25. 10.1093/nar/19.23.6565
  26. 10.1016/S0968-0004(00)89071-4
  27. 10.1126/science.273.5275.595
  28. 10.1002/(SICI)1097-0134(199705)28:1<72::AID-PROT7>3.0.CO;2-L
  29. 10.1016/S0959-440X(97)80024-3
  30. 10.1002/bip.360221211
  31. 10.1002/(SICI)1097-0134(1997)1+<134::AID-PROT18>3.0.CO;2-P
  32. 10.1006/jmbi.1994.1104
  33. 10.1111/j.1365-2958.1995.tb02309.x
  34. 10.1006/jmbi.1996.0053
  35. 10.1145/32206.32207
  36. 10.1016/0022-2836(80)90373-3
  37. 10.1128/jb.171.3.1574-1584.1989
  38. Mewes HW. 1997. Overview of the yeast genome. Nature. (10.1038/387s007)
  39. 10.1006/jmbi.1996.0506
  40. 10.1016/S0022-2836(05)80134-2
  41. 10.1093/nar/25.9.1665
  42. Pearson WR. 1997. Identifying distantly related protein sequences. CABIOS 13:325.
  43. 10.1109/5.18626
  44. 10.1128/jb.172.2.1019-1023.1990
  45. Sibbald PR. 1990. Scrutineer: A computer program that flexibly seeks and describes motifs and profiles in protein sequence databases. CABIOS 6:279.
  46. 10.1002/(SICI)1097-0134(199707)28:3<405::AID-PROT10>3.0.CO;2-L
  47. 10.1016/0959-440X(95)80101-4
  48. Stroud RM. 1975. Proteases and biological control. p 13.
  49. 10.1016/S1569-2558(08)60483-X
  50. 10.1002/pro.5560020302
  51. 10.1016/0022-2836(86)90308-6
  52. White JV. 1988. Bayesian analysis of time series and dynamic models. p 255.
  53. 10.1016/0025-5564(94)90004-3
Dates
Type When
Created 16 years, 6 months ago (Feb. 10, 2009, 1:16 a.m.)
Deposited 1 year, 10 months ago (Oct. 28, 2023, 3:47 p.m.)
Indexed 1 year, 6 months ago (Feb. 1, 2024, 1:20 p.m.)
Issued 26 years, 8 months ago (Dec. 1, 1998)
Published 26 years, 8 months ago (Dec. 1, 1998)
Published Online 16 years, 7 months ago (Dec. 31, 2008)
Published Print 26 years, 8 months ago (Dec. 1, 1998)
Funders 0

None

@article{Yu_1998,
  title={A homology identification method that combines protein sequence and structure information},
  volume={7},
  ISSN={1469-896X},
  url={http://dx.doi.org/10.1002/pro.5560071203},
  DOI={10.1002/pro.5560071203},
  number={12},
  journal={Protein Science},
  publisher={Wiley},
  author={Yu, Lihua and White, James V. and Smith, Temple F.},
  year={1998},
  month=dec,
  pages={2499--2510}
}