Abstract
Background
Microarray studies in cancer compare expression levels between two or more sample groups on thousands of genes. Data analysis follows a population-level approach (e.g., comparison of sample means) to identify differentially expressed genes. This leads to the discovery of 'population-level' markers, i.e., genes with the expression patterns A > B and B > A. We introduce the PPST test, which identifies genes where a significantly large subset of cases exhibits expression values beyond upper and lower thresholds observed in the control samples.
Results
The test identifies A > B and B > A pattern genes that are missed by population-level approaches such as the t-test. It also identifies many genes that exhibit both significant overexpression and significant underexpression in statistically significantly large subsets of cancer patients (ABA pattern genes). These patterns tend to show distributions that are unique to individual genes, and are aptly visualized in a 'gene expression pattern grid'. The low degree of among-gene correlation in these genes suggests unique underlying genomic pathologies and a high degree of unique tumor-specific differential expression. We compare the PPST and the ABA test to the parametric and nonparametric t-test by analyzing two independently published data sets from studies of progression in astrocytoma.
Conclusions
The PPST test yielded findings similar to those of the nonparametric t-test, with higher self-consistency. These tests and the gene expression pattern grid may be useful for identifying therapeutic targets and diagnostic or prognostic markers that are present only in subsets of cancer patients, and they provide a more complete portrait of differential expression in cancer.
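The idea of testing for a significantly large subset of cases beyond control-derived thresholds can be illustrated with a short sketch for one gene. This is not the authors' implementation: the function name `subset_exceedance_test`, the use of control percentiles as the upper and lower thresholds, and the permutation of group labels to obtain p-values are all assumptions made for illustration.

```python
import numpy as np


def subset_exceedance_test(controls, cases, upper_pct=95.0, lower_pct=5.0,
                           n_perm=2000, seed=0):
    """Hypothetical sketch: count cases beyond control thresholds for one gene.

    Assumptions (not taken from the paper): thresholds are percentiles of the
    control values, and significance is judged by permuting group labels.
    """
    controls = np.asarray(controls, dtype=float)
    cases = np.asarray(cases, dtype=float)
    rng = np.random.default_rng(seed)

    def counts(ctrl, case):
        # Thresholds come from the (pseudo-)control group only.
        hi = np.percentile(ctrl, upper_pct)
        lo = np.percentile(ctrl, lower_pct)
        return int(np.sum(case > hi)), int(np.sum(case < lo))

    obs_up, obs_down = counts(controls, cases)

    pooled = np.concatenate([controls, cases])
    n_ctrl = controls.size
    ge_up = 0
    ge_down = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        up, down = counts(shuffled[:n_ctrl], shuffled[n_ctrl:])
        ge_up += int(up >= obs_up)
        ge_down += int(down >= obs_down)

    # Add-one correction keeps the permutation p-values strictly positive.
    return {
        "n_up": obs_up,
        "n_down": obs_down,
        "p_up": (ge_up + 1) / (n_perm + 1),
        "p_down": (ge_down + 1) / (n_perm + 1),
    }


if __name__ == "__main__":
    # Toy gene: half the cases look like controls, a quarter are strongly
    # overexpressed and a quarter strongly underexpressed.
    controls = np.linspace(-1.0, 1.0, 20)
    cases = np.array([0.0] * 10 + [10.0] * 5 + [-10.0] * 5)
    print(subset_exceedance_test(controls, cases))
```

In the toy data the case mean equals the control mean, so a comparison of sample means would see no difference, while the exceedance counts in both directions are large. Under these assumptions, a gene with small p-values in both directions would correspond to what the abstract calls an ABA pattern.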
References
-
DeRisi JL, Iyer VR, Brown PO: Exploring the metabolic and genetic control of gene expression on a genomic scale.
Science 1997, 278: 680–686. 10.1126/science.278.5338.680
(
10.1126/science.278.5338.680
) / Science by JL DeRisi (1997) -
Baldi P, Long AD: A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes.
Bioinformatics 2001, 17: 509–519. 10.1093/bioinformatics/17.6.509
(
10.1093/bioinformatics/17.6.509
) / Bioinformatics by P Baldi (2001) -
Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data.
J Comput Biol 2000, 7: 819–837. 10.1089/10665270050514954
(
10.1089/10665270050514954
) / J Comput Biol by MK Kerr (2000) -
Wolfinger RD, Gibson G, Wolfinger ED, Bennett L, Hamadeh H, Bushel P, Afshari C, Paules RS: Assessing gene significance from cDNA microarray expression data via mixed models.
J Comput Biol 2001, 8: 625–637. 10.1089/106652701753307520
(
10.1089/106652701753307520
) / J Comput Biol by RD Wolfinger (2001) -
Black MA, Doerge RW: Calculation of the minimum number of replicate spots required for detection of significant gene expression fold change in microarray experiments.
Bioinformatics 2002, 18: 1609–1616. 10.1093/bioinformatics/18.12.1609
(
10.1093/bioinformatics/18.12.1609
) / Bioinformatics by MA Black (2002) -
Ideker T, Thorsson V, Siegel AF, Hood LE: Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data.
J Comput Biol 2000, 7: 805–817. 10.1089/10665270050514945
(
10.1089/10665270050514945
) / J Comput Biol by T Ideker (2000) -
Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale AL, Brown PO, Botstein D: Molecular portraits of human breast tumours.
Nature 2000, 406: 747–752. 10.1038/35021093
(
10.1038/35021093
) / Nature by CM Perou (2000) -
Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang CH, Angelo M, Ladd C, Reich M, Latulippe E, Mesirov JP, Poggio T, Gerald W, Loda M, Lander ES, Golub TR: Multiclass cancer diagnosis using tumor gene expression signatures.
Proc Natl Acad Sci USA 2001, 98: 15149–15154. 10.1073/pnas.211566398
(
10.1073/pnas.211566398
) / Proc Natl Acad Sci USA by S Ramaswamy (2001) -
Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Eystein Lonning P, Borresen-Dale AL: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.
Proc Natl Acad Sci USA 2001, 98: 10869–10874. 10.1073/pnas.191367098
(
10.1073/pnas.191367098
) / Proc Natl Acad Sci USA by T Sorlie (2001) -
Alizadeh AA, Ross DT, Perou CM, van de Rijn M: Towards a novel classification of human malignancies based on gene expression patterns.
J Pathol 2001, 195: 41–52. 10.1002/path.889
(
10.1002/path.889
) / J Pathol by AA Alizadeh (2001) -
Alizadeh AA, Eisen MB, Davis RE, et al.: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling.
Nature 2000, 403: 503–511. 10.1038/35000501
(
10.1038/35000501
) / Nature by AA Alizadeh (2000) -
Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Science 1999, 286: 531–537. 10.1126/science.286.5439.531
(
10.1126/science.286.5439.531
) / Science by TR Golub (1999) -
Welford SM, Gregg J, Chen E, Garrison D, Sorensen PH, Denny CT, Nelson SF: Detection of differentially expressed genes in primary tumor tissues using representational differences analysis coupled to microarray hybridization.
Nucleic Acids Res 1998, 26: 3059–3065. 10.1093/nar/26.12.3059
(
10.1093/nar/26.12.3059
) / Nucleic Acids Res by SM Welford (1998) -
Hedenfalk I, Duggan D, Chen Y, Radmacher M, Bittner M, Simon R, Meltzer P, Gusterson B, Esteller M, Kallioniemi OP, Wilfond B, Borg A, Trent J: Gene-expression profiles in hereditary breast cancer.
N Engl J Med 2001, 344: 539–548. 10.1056/NEJM200102223440801
(
10.1056/NEJM200102223440801
) / N Engl J Med by I Hedenfalk (2001) -
Bittner M, Meltzer P, Chen Y, Jiang Y, Seftor E, Hendrix M, Radmacher M, Simon R, Yakhini Z, Ben-Dor A, Sampas N, Dougherty E, Wang E, Marincola F, Gooden C, Lueders J, Glatfelter A, Pollock P, Carpten J, Gillanders E, Leja D, Dietrich K, Beaudry C, Berens M, Alberts D, Sondak V: Molecular classification of cutaneous malignant melanoma by gene expression profiling.
Nature 2000, 406: 536–540. 10.1038/35020115
(
10.1038/35020115
) / Nature by M Bittner (2000) -
Welsh JB, Zarrinkar PP, Sapinoso LM, Kern SG, Behling CA, Monk BJ, Lockhart DJ, Burger RA, Hampton GM: Analysis of gene expression profiles in normal and neoplastic ovarian tissue samples identifies candidate molecular markers of epithelial ovarian cancer.
Proc Natl Acad Sci USA 2001, 98: 1176–1181. 10.1073/pnas.98.3.1176
(
10.1073/pnas.98.3.1176
) / Proc Natl Acad Sci USA by JB Welsh (2001) -
Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.
Proc Natl Acad Sci USA 1999, 96: 6745–6750. 10.1073/pnas.96.12.6745
(
10.1073/pnas.96.12.6745
) / Proc Natl Acad Sci USA by U Alon (1999) -
De Vos J, Thykjaer T, Tarte K, Ensslen M, Raynaud P, Requirand G, Pellet F, Pantesco V, Reme T, Jourdan M, Rossi JF, Orntoft T, Klein B: Comparison of gene expression profiling between malignant and normal plasma cells with oligonucleotide arrays.
Oncogene 2002, 21: 6848–6857. 10.1038/sj.onc.1205868
(
10.1038/sj.onc.1205868
) / Oncogene by J De Vos (2002) -
Garber ME, Troyanskaya OG, Schluens K, Petersen S, Thaesler Z, Pacyna-Gengelbach M, van de Rijn M, Rosen GD, Perou CM, Whyte RI, Altman RB, Brown PO, Botstein D, Petersen I: Diversity of gene expression in adenocarcinoma of the lung.
Proc Natl Acad Sci USA 2001, 98: 13784–13789. 10.1073/pnas.241500798
(
10.1073/pnas.241500798
) / Proc Natl Acad Sci USA by ME Garber (2001) -
Tan ZJ, Hu XG, Cao GS, Tang Y: Analysis of gene expression profile of pancreatic carcinoma using cDNA microarray.
World J Gastroenterol 2003, 9: 818–823.
(
10.3748/wjg.v9.i4.818
) / World J Gastroenterol by ZJ Tan (2003) -
Bushel PR, Hamadeh HK, Bennett L, Green J, Ableson A, Misener S, Afshari CA, Paules RS: Computational selection of distinct class- and subclass-specific gene expression signatures.
J Biomed Inform 2002, 35: 160–170. 10.1016/S1532-0464(02)00525-7
(
10.1016/S1532-0464(02)00525-7
) / J Biomed Inform by PR Bushel (2002) -
Cui X, Churchill GA: Statistical tests for differential expression in cDNA microarray experiments.
Genome Biol 2003, 4: 210. 10.1186/gb-2003-4-4-210
(
10.1186/gb-2003-4-4-210
) / Genome Biol by X Cui (2003) -
Thomas JG, Olson JM, Tapscott SJ, Zhao LP: An efficient and robust statistical modeling approach to discover differentially expressed genes using genomic expression profiles.
Genome Res 2001, 11: 1227–1236. 10.1101/gr.165101
(
10.1101/gr.165101
) / Genome Res by JG Thomas (2001) -
Draghici S, Kulaeva O, Hoff B, Petrov A, Shams S, Tainsky MA: Noise sampling method: an ANOVA approach allowing robust selection of differentially regulated genes measured by DNA microarrays.
Bioinformatics 2003, 19: 1348–1359. 10.1093/bioinformatics/btg165
(
10.1093/bioinformatics/btg165
) / Bioinformatics by S Draghici (2003) -
Welford SM, Gregg J, Chen E, Garrison D, Sorensen PH, Denny CT, Nelson SF: Detection of differentially expressed genes in primary tumor tissues using representational differences analysis coupled to microarray hybridization.
Nucleic Acids Res 1998, 26: 3059–3065. 10.1093/nar/26.12.3059
(
10.1093/nar/26.12.3059
) / Nucleic Acids Res by SM Welford (1998) -
Yang IV, Chen E, Hasseman JP, Liang W, Frank BC, Wang S, Sharov V, Saeed AI, White J, Li J, Lee NH, Yeatman TJ, Quackenbush J: Within the fold: assessing differential expression measures and reproducibility in microarray assays.
Genome Biol 2002, 24: 3–62.
/ Genome Biol by IV Yang (2002)
-
Ideker T, Thorsson V, Siegel AF, Hood LE: Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data.
J Comput Biol 2000, 7: 805–817. 10.1089/10665270050514945
(
10.1089/10665270050514945
) / J Comput Biol by T Ideker (2000) -
Baldi P, Long AD: A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes.
Bioinformatics 2001, 17: 509–519. 10.1093/bioinformatics/17.6.509
(
10.1093/bioinformatics/17.6.509
) / Bioinformatics by P Baldi (2001) -
Broet P, Richardson S, Radvanyi F: Bayesian hierarchical model for identifying changes in gene expression from microarray experiments.
J Comput Biol 2002, 9: 671–683. 10.1089/106652702760277381
(
10.1089/106652702760277381
) / J Comput Biol by P Broet (2002) -
Domingos P, Pazzani M: On the optimality of the simple Bayesian classifier under zero-one loss.
Machine Learning 1997, 29: 103–130. 10.1023/A:1007413511361
(
10.1023/A:1007413511361
) / Machine Learning by P Domingos (1997) -
Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian networks to analyze expression data.
J Comput Biol 2000, 7: 601–620. 10.1089/106652700750050961
(
10.1089/106652700750050961
) / J Comput Biol by N Friedman (2000) -
Ibrahim JG, Chen MH, Gray RJ: Bayesian models for gene expression with DNA microarray data.
Journal of the American Statistical Association 2002, 97: 88–99. 10.1198/016214502753479257
(
10.1198/016214502753479257
) / Journal of the American Statistical Association by JG Ibrahim (2002) -
Kendziorski CM, Newton MA, Lan H, Gould MN: On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles.
Statistics in Medicine 2003, 22: 3899–3914. 10.1002/sim.1548
(
10.1002/sim.1548
) / Statistics in Medicine by CM Kendziorski (2003) -
Lee KE, Sha N, Dougherty ER, Vannucci M, Mallick BK: Gene selection: a Bayesian variable selection approach.
Bioinformatics 2003, 19: 90–97. 10.1093/bioinformatics/19.1.90
(
10.1093/bioinformatics/19.1.90
) / Bioinformatics by KE Lee (2003) -
Townsend JP, Hartl DL: Bayesian analysis of gene expression levels: statistical quantification of relative mRNA level across multiple strains or treatments.
Genome Biol 2002, 3: RESEARCH0071. 10.1186/gb-2002-3-12-research0071
(
10.1186/gb-2002-3-12-research0071
) / Genome Biol by JP Townsend (2002) -
Theilhaber J, Bushnell S, Jackson A, Fuchs R: Bayesian estimation of fold-changes in the analysis of gene expression: the PFOLD algorithm.
J Comput Biol 2001, 8: 585–614. 10.1089/106652701753307502
(
10.1089/106652701753307502
) / J Comput Biol by J Theilhaber (2001) -
Li Y, Campbell C, Tipping M: Bayesian automatic relevance determination algorithms for classifying gene expression data.
Bioinformatics 2002, 18: 1332–1339. 10.1093/bioinformatics/18.10.1332
(
10.1093/bioinformatics/18.10.1332
) / Bioinformatics by Y Li (2002) -
Pan W: On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression.
Bioinformatics 2003, 19: 1333–1340. 10.1093/bioinformatics/btg167
(
10.1093/bioinformatics/btg167
) / Bioinformatics by W Pan (2003) -
Huang X, Pan W: Comparing three methods for variance estimation with duplicated high density oligonucleotide arrays.
Funct Integr Genomics 2002, 2: 126–133. 10.1007/s10142-002-0066-2
(
10.1007/s10142-002-0066-2
) / Funct Integr Genomics by X Huang (2002) -
Park PJ, Pagano M, Bonetti M: A nonparametric scoring algorithm for identifying informative genes from microarray data.
Pac Symp Biocomput 2001, 52–63.
/ Pac Symp Biocomput by PJ Park (2001)
-
Troyanskaya OG, Garber ME, Brown PO, Botstein D, Altman RB: Nonparametric methods for identifying differentially expressed genes in microarray data.
Bioinformatics 2002, 18: 1454–1461. 10.1093/bioinformatics/18.11.1454
(
10.1093/bioinformatics/18.11.1454
) / Bioinformatics by OG Troyanskaya (2002) -
Li C, Wong WH: Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection.
Proc Natl Acad Sci USA 2001, 98: 31–36. 10.1073/pnas.011404098
(
10.1073/pnas.98.1.31
) / Proc Natl Acad Sci USA by C Li (2001) -
Efron B, Tibshirani R: Empirical Bayes methods and false discovery rates for microarrays.
Genet Epidemiol 2002, 23: 70–86. 10.1002/gepi.1124
(
10.1002/gepi.1124
) / Genet Epidemiol by B Efron (2002) -
Storey J: A direct approach to false discovery rates.
J Roy Stat Soc Ser B 2002, 64: 479–498. 10.1111/1467-9868.00346
(
10.1111/1467-9868.00346
) / J Roy Stat Soc Ser B by J Storey (2002) -
Reiner A, Yekutieli D, Benjamini Y: Identifying differentially expressed genes using false discovery rate controlling procedures.
Bioinformatics 2003, 19: 368–375. 10.1093/bioinformatics/btf877
(
10.1093/bioinformatics/btf877
) / Bioinformatics by A Reiner (2003) -
Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response.
Proc Natl Acad Sci USA 2001, 98: 5116–5121. 10.1073/pnas.091062498
(
10.1073/pnas.091062498
) / Proc Natl Acad Sci USA by VG Tusher (2001) -
Bhattacharya S, Long D, Lyons-Weiler J: Overcoming confounded controls in the analysis of gene expression data from microarray experiments.
Applied Bioinformatics 2004, 2: 197–208. We have previously determined that 5 samples in the Alon et al. colon cancer data set [17] were epithelial-like normal using unsupervised bootstrap cluster analysis and removed the remaining muscle-like normals from this analysis.
/ Applied Bioinformatics by S Bhattacharya (2004)
- For 72 additional studies of gene expression patterns in cancer, see the University of Pittsburgh Cancer Gene Expression Data Link Database [http://bioinformatics.upmc.edu/Help/UPITTGED.html]
-
Knudson AG: Mutation and cancer: statistical study of retinoblastoma.
Proc Natl Acad Sci USA 1971, 68: 820–823.
(
10.1073/pnas.68.4.820
) / Proc Natl Acad Sci USA by AG Knudson (1971) -
Hanahan D, Weinberg RA: The hallmarks of cancer.
Cell 2000, 100: 57–70. 10.1016/S0092-8674(00)81683-9
(
10.1016/S0092-8674(00)81683-9
) / Cell by D Hanahan (2000) -
Patel S, Lyons-Weiler J: caGEDA: A web application for the integrated analysis of global gene expression patterns in cancer.
Applied Bioinformatics 2004, 3: 49–62.
(
10.2165/00822942-200403010-00007
) / Applied Bioinformatics by S Patel (2004) -
Khatua S, Peterson KM, Brown KM, Lawlor C, Santi MR, LaFleur B, Dressman D, Stephan DA, MacDonald TJ: Overexpression of the EGFR/FKBP12/HIF-2alpha pathway identified in childhood astrocytomas by angiogenesis gene profiling.
Cancer Res 2003, 63: 1865–1870.
/ Cancer Res by S Khatua (2003)
-
van den Boom J, Wolter M, Kuick R, Misek DE, Youkilis AS, Wechsler DS, Sommer C, Reifenberger G, Hanash SM: Characterization of gene expression profiles associated with glioma progression using oligonucleotide-based microarray analysis and real-time reverse transcription-polymerase chain reaction.
Am J Pathol 2003, 163: 1033–1043.
(
10.1016/S0002-9440(10)63463-3
) / Am J Pathol by J van den Boom (2003)
@article{Lyons_Weiler_2004, title={Tests for finding complex patterns of differential expression in cancers: towards individualized medicine}, volume={5}, ISSN={1471-2105}, url={http://dx.doi.org/10.1186/1471-2105-5-110}, DOI={10.1186/1471-2105-5-110}, number={1}, journal={BMC Bioinformatics}, publisher={Springer Science and Business Media LLC}, author={Lyons-Weiler, James and Patel, Satish and Becich, Michael J and Godfrey, Tony E}, year={2004}, month=aug }