Abstract
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
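The idea of training an SVM on labeled expression profiles under different similarity functions can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy profiles, the two kernels and their parameters, the helper names (`make_toy_profiles`, `train_svm`, `decision`), and the training routine, which is a simple dual coordinate ascent for a soft-margin SVM without an explicit bias term rather than the authors' implementation.

```python
import math
import random

# Hypothetical prototype profiles over six conditions; not data from the paper.
RISING = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
FALLING = [-v for v in RISING]


def poly_kernel(x, z, degree=2):
    """Polynomial similarity: (x . z + 1) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** degree


def rbf_kernel(x, z, width=1.0):
    """Radial basis similarity: exp(-||x - z||^2 / (2 * width^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def make_toy_profiles(n_per_class=10, noise=0.1, seed=0):
    """Noisy copies of the two prototypes, labeled +1 (rising) and -1 (falling)."""
    rng = random.Random(seed)
    X, y = [], []
    for proto, label in ((RISING, 1), (FALLING, -1)):
        for _ in range(n_per_class):
            X.append([v + rng.gauss(0.0, noise) for v in proto])
            y.append(label)
    return X, y


def train_svm(X, y, kernel, C=10.0, epochs=200):
    """Dual coordinate ascent on the bias-free soft-margin SVM dual.

    Maximizes sum(alpha) - 0.5 * sum_ij alpha_i alpha_j y_i y_j K_ij
    subject to 0 <= alpha_i <= C, one coordinate at a time.
    """
    n = len(X)
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Newton step along coordinate i, then clip to the box [0, C].
            step = (1.0 - y[i] * f_i) / K[i][i]
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha


def decision(x, X, y, alpha, kernel):
    """Signed score for a new profile; only points with alpha > 0 contribute."""
    return sum(a * yi * kernel(xi, x)
               for a, yi, xi in zip(alpha, y, X) if a > 0.0)


X, y = make_toy_profiles()
for name, kernel in (("polynomial", poly_kernel), ("rbf", rbf_kernel)):
    alpha = train_svm(X, y, kernel)
    n_sv = sum(1 for a in alpha if a > 0.0)
    score = decision(RISING, X, y, alpha, kernel)
    print(f"{name}: {n_sv} support vectors, rising-prototype score {score:.3f}")
```

A positive score marks a profile as resembling the labeled class. Training points whose coefficient ends at zero drop out of the decision function, which is one way to read the sparseness the abstract mentions. Swapping the `kernel` argument is the only change needed to compare similarity functions.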
Dates
Type | When |
---|---|
Created | July 26, 2002, 10:35 a.m. |
Deposited | Jan. 3, 2024, 11:50 p.m. |
Indexed | Aug. 28, 2025, 7:58 a.m. |
Issued | Jan. 4, 2000 |
Published | Jan. 4, 2000 |
Published Online | Jan. 4, 2000 |
Published Print | Jan. 4, 2000 |
@article{Brown_2000,
  title={Knowledge-based analysis of microarray gene expression data by using support vector machines},
  volume={97},
  ISSN={1091-6490},
  url={http://dx.doi.org/10.1073/pnas.97.1.262},
  DOI={10.1073/pnas.97.1.262},
  number={1},
  journal={Proceedings of the National Academy of Sciences},
  publisher={Proceedings of the National Academy of Sciences},
  author={Brown, Michael P. S. and Grundy, William Noble and Lin, David and Cristianini, Nello and Sugnet, Charles Walsh and Furey, Terrence S. and Ares, Manuel and Haussler, David},
  year={2000},
  month=jan,
  pages={262--267}
}