Abstract
As a step toward understanding the complex differences between normal and cancer cells in humans, gene expression patterns were examined in gastrointestinal tumors. More than 300,000 transcripts derived from at least 45,000 different genes were analyzed. Although extensive similarity was noted between the expression profiles, more than 500 transcripts that were expressed at significantly different levels in normal and neoplastic cells were identified. These data provide insight into the extent of expression differences underlying malignancy and reveal genes that may prove useful as diagnostic or prognostic markers.
References
47
Referenced
1,052
- M. D. Adams et al., Nature 377 (suppl. 28), 3 (1995); (10.1542/peds.96.2.377a)
10.1126/science.270.5235.467
10.1038/ng1296-457
- Gress T. M., et al., Oncogene 13, 1819 (1996);
- D. J. Lockhart et al., Nature Biotechnol. 14, 1675 (1996); M. Schena et al., Proc. Natl. Acad. Sci. U.S.A. 93, 10614 (1996).
10.1126/science.270.5235.484
10.1016/S0092-8674(00)81845-0
- To minimize individual variation, approximately equal numbers of tags (30,000) were derived from two different patients for each tissue. For primary tumors (two colorectal (CR) carcinomas and two pancreatic adenocarcinomas), RNA was isolated from portions of tumors judged by histopathology to contain 60 to 90% tumor cells. The cells grown in vitro were derived from CR (SW837 and Caco2) and pancreatic (ASPC-1 and PL45) cancer cell lines. CR epithelial cells were isolated from sections of normal colon mucosa from two patients with the use of EDTA as described [Nakamura S., Kino I., Baba S., Gut 34, 1240 (1993); (10.1136/gut.34.9.1240)]. Histopathology confirmed that the isolated cells were >90% epithelial. Isolation of polyadenylate RNA and SAGE were performed as described (2). SAGE data were analyzed with SAGE software and GenBank Release 94 as described (2).
- A total of 69,393 different SAGE tags were identified among the 303,706 tags analyzed. A small fraction of these different tags was likely due to sequencing errors. SAGE analysis of yeast (2), for which the entire genomic sequence is known, demonstrated a sequencing error rate of ∼0.7%, translating to a SAGE tag error rate of 6.8% (1 − 0.993^10). Because these sequencing mistakes are essentially random, they do not substantially affect the analysis, although they could artificially inflate the number of different genes identified. Therefore, to be conservative, we reduced our estimate of different genes identified by this maximum tag error rate (that is, 6.8% of 303,706 total tags). The number of different tags derived from the same gene because of alternative splicing was assumed to be negligible.
- Abundance can be determined simply by dividing the observed number of tags for a given transcript by the total number of tags obtained. An estimate of about 300,000 transcripts per cell was used to convert the abundances to copies per cell [Hastie N. D., Bishop J. O., Cell 9, 761 (1976) (10.1016/0092-8674(76)90139-2)].
- Bishop J. O., Morton J. G., Rosbash M., Richardson M., Nature 250, 199 (1974); (10.1038/250199a0); B. Lewin, Gene Expression (Wiley, New York, 1980), vol. 2.
- Computer simulations indicated that analysis of 300,000 tags would yield a 92% chance of detecting a tag for a transcript whose expression, on average, was at least three copies per cell among the tissues examined, assuming 300,000 transcripts per cell.
- To minimize the number of assumptions and to account for the large number of comparisons being made, we used Monte Carlo analysis to determine statistical significance. The null hypothesis was that the level, kind, and distribution of transcripts were the same for cancer and normal cells. For each transcript, we performed 100,000 simulations to determine the relative likelihood due to chance alone (p-chance) of obtaining a difference in expression equal to or greater than the observed difference, given the null hypothesis. We converted this likelihood to an absolute probability value by simulating 40 experiments in which a representative number of transcripts (27,993 transcripts in each experiment) were identified and compared. We derived the distribution of transcripts used for these simulations from the average level of expression observed in the original samples. We then compared the distribution of the p-chance scores obtained in the 40 simulated experiments (false positives) with those obtained experimentally. On the basis of this comparison, a maximum value of 0.0005 was chosen for p-chance. This yielded a false-positive rate that was no higher than 0.01 for the least significant p-chance value below the cutoff.
- Two hundred simulations, assuming an abundance of 0.0001 in one sample and 0.0006 in a second sample, revealed a significant difference [P < 0.01 (8)] 95% of the time.
- This analysis revealed 208 transcripts that were significantly decreased in CR cancer cell lines as compared with normal colon cells and 228 transcripts that were increased. Venn diagrams and tables illustrating the relation between the in vivo and in vitro differences are available through the Internet at http:// ∼molgen-g/home.htm.
- It is not possible to obtain pancreatic duct epithelium, from which pancreatic carcinomas arise, in sufficient quantities to perform SAGE. It is therefore not possible to determine whether these transcripts were derived from genes that were highly expressed only in pancreatic cancers or that were also expressed in pancreatic duct cells.
- Total RNA isolation and Northern blot analysis were performed as described [10.1016/0092-8674(93)90500-P].
- A. H. Owens, D. S. Coffey, S. B. Baylin, Eds., Tumor Cell Heterogeneity: Origins and Implications (Academic Press, New York, 1982).
- Northern blot analyses were done on 45 of the 337 differentially expressed transcripts with tentative database matches. In three cases, the transcript was not differentially expressed as predicted by SAGE, and for the purposes of this calculation these cases were presumed to represent incorrect database matches.
- Rubin D. C., Ong D. E., Gordon J. I., Proc. Natl. Acad. Sci. U.S.A. 86, 1278 (1989); (10.1073/pnas.86.4.1278)
- Okubo K., Yoshii J., Yokouchi H., Kameyama M., Matsubara K., DNA Res. 1, 37 (1994). (10.1093/dnares/1.1.37)
- Moll R., et al., Differentiation 53, 75 (1993). (10.1111/j.1432-0436.1993.tb00648.x)
- J. Sowden, S. Leigh, I. Talbot, J. Delhanty, Y. Edwards, ibid., p. 67.
- de Sauvage F. J., et al., Proc. Natl. Acad. Sci. U.S.A. 89, 9089 (1992). (10.1073/pnas.89.19.9089)
- Wiegand R. C., et al., FEBS Lett. 311, 150 (1992). (10.1016/0014-5793(92)81387-2)
- Tricoli J. V., et al., Cancer Res. 46, 6169 (1986);
- Lambert S., Vivario J., Boniver J., Gol-Winkler R., Int. J. Cancer 46, 405 (1990). (10.1002/ijc.2910460313)
- Chan W. Y., et al., Biochemistry 28, 1033 (1989). (10.1021/bi00429a017)
- Hayes J. D., Pulford D. J., Crit. Rev. Biochem. Mol. Biol. 30, 445 (1995). (10.3109/10409239509083491)
- Barnard G. F., et al., Cancer Res. 52, 3067 (1992);
- P. J. Chiao, D. M. Shin, P. G. Sacks, W. K. Hong, M. A. Tainsky, Mol. Carcinogen 5, 219 (1992); (10.1002/mc.2940050309)
- Kondoh N., Schweinfest C. W., Henderson K. W., Papas T. S., Cancer Res. 52, 791 (1992);
- Barnard G. F., et al., ibid. 53, 4048 (1993);
- Denis M. G., et al., Int. J. Cancer 55, 275 (1993); (10.1002/ijc.2910550218)
- Frigerio J. M., et al., Hum. Mol. Genet. 4, 37 (1995). (10.1093/hmg/4.1.37)
- Schweinfest C. W., Henderson K. W., Suster S., Kondoh N., Papas T. S., Proc. Natl. Acad. Sci. U.S.A. 90, 4166 (1993). (10.1073/pnas.90.9.4166)
- Tanaka M., et al., Cancer Res. 55, 3228 (1995);
- Medina D., Kittrell F. S., Oborn C. J., Schwartz M., ibid. 53, 668 (1993).
- Miller A. D., Curran T., Verma I. M., Cell 36, 51 (1984); (10.1016/0092-8674(84)90073-4)
10.1073/pnas.86.23.9193
- In the case of normal and neoplastic colon cancer tissue, 548 differentially expressed transcripts were identified among the 36,125 different transcripts.
- We thank K. Polyak and P. J. Morin for providing colon cancer cell lines; G. M. Nadasdy for providing pancreatic primary tumors; and J. Floyd, C. R. Robinson, and Y. Beazer-Barclay for technical assistance. Supported by the Clayton Fund and by NIH grants GM07309, CA57345, and CA62924. B.V. is an investigator of the Howard Hughes Medical Institute.
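The error-rate and abundance arithmetic described in the notes above can be checked with a short Python sketch. The constant and function names are illustrative and are not part of the original analysis; the numeric values are the ones quoted in the notes. With these figures the conservative estimate comes to roughly 48,700 different genes, in line with the "at least 45,000" stated in the abstract.

```python
# Figures quoted in the notes; all names below are hypothetical.
SEQ_ERROR_RATE = 0.007          # ~0.7% sequencing error rate (yeast SAGE data)
ERROR_EXPONENT = 10             # exponent used in the note: 1 - 0.993^10
TOTAL_TAGS = 303_706            # tags analyzed
DIFFERENT_TAGS = 69_393         # distinct tags observed
TRANSCRIPTS_PER_CELL = 300_000  # estimate used to convert abundance to copies


def tag_error_rate(per_base_error=SEQ_ERROR_RATE, exponent=ERROR_EXPONENT):
    """Chance that a tag carries at least one sequencing error."""
    return 1.0 - (1.0 - per_base_error) ** exponent


def conservative_gene_estimate(different_tags=DIFFERENT_TAGS,
                               total_tags=TOTAL_TAGS):
    """Distinct tags minus the maximum number of tags attributable to error."""
    # The rate is rounded to 6.8%, as quoted in the note.
    max_error_tags = round(tag_error_rate(), 3) * total_tags
    return different_tags - max_error_tags


def copies_per_cell(tag_count, total_tags=TOTAL_TAGS,
                    transcripts_per_cell=TRANSCRIPTS_PER_CELL):
    """Abundance (tag fraction) scaled by the assumed transcripts per cell."""
    abundance = tag_count / total_tags
    return abundance * transcripts_per_cell


if __name__ == "__main__":
    print(f"tag error rate: {tag_error_rate():.3f}")
    print(f"conservative gene estimate: {conservative_gene_estimate():.0f}")
    print(f"copies per cell for 30 tags: {copies_per_cell(30):.1f}")
```

Rounding the error rate before multiplying mirrors the "6.8% of 303,706" wording; using the unrounded rate changes the estimate only slightly.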
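The per-transcript Monte Carlo step described in the notes (the p-chance score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that tag counts in each sample follow a Poisson distribution around a pooled abundance under the null hypothesis, and it covers only the comparison of a single transcript between two samples. The later conversion to an absolute probability using 40 simulated experiments is not reproduced. The names `poisson_draw` and `p_chance` are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """One Poisson variate by Knuth's method; adequate for small means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def p_chance(count_a, total_a, count_b, total_b, n_sims=100_000, seed=0):
    """Fraction of null simulations whose abundance difference is at least
    as large as the observed one (assumption: Poisson tag sampling)."""
    rng = random.Random(seed)
    observed = abs(count_a / total_a - count_b / total_b)
    # Null hypothesis: both samples share one pooled abundance.
    pooled = (count_a + count_b) / (total_a + total_b)
    mean_a, mean_b = pooled * total_a, pooled * total_b
    hits = 0
    for _ in range(n_sims):
        sim_a = poisson_draw(rng, mean_a)
        sim_b = poisson_draw(rng, mean_b)
        if abs(sim_a / total_a - sim_b / total_b) >= observed:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Abundances of 0.0001 and 0.0006 in two samples of 30,000 tags each.
    print(p_chance(3, 30_000, 18, 30_000, n_sims=20_000))
```

At the full 100,000 simulations per transcript quoted in the notes, this pure-Python loop is slow across tens of thousands of transcripts; a vectorized sampler would be the practical choice.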
Dates
Type | When |
---|---|
Created | July 27, 2002, 1:50 a.m. |
Deposited | Jan. 12, 2024, 5:31 p.m. |
Indexed | Aug. 19, 2025, 6:33 a.m. |
Issued | May 23, 1997 |
Published | May 23, 1997 |
Published Print | May 23, 1997 |
@article{Zhang_1997, title={Gene Expression Profiles in Normal and Cancer Cells}, volume={276}, ISSN={1095-9203}, url={http://dx.doi.org/10.1126/science.276.5316.1268}, DOI={10.1126/science.276.5316.1268}, number={5316}, journal={Science}, publisher={American Association for the Advancement of Science (AAAS)}, author={Zhang, Lin and Zhou, Wei and Velculescu, Victor E. and Kern, Scott E. and Hruban, Ralph H. and Hamilton, Stanley R. and Vogelstein, Bert and Kinzler, Kenneth W.}, year={1997}, month=may, pages={1268–1272} }