Abstract
As more genomes are sequenced, the identification and characterization of the causes of heritable variation within a species will be increasingly important. It is demonstrated that allelic variation in any two isolates of a species can be scanned, mapped, and scored directly and efficiently without allele-specific polymerase chain reaction, without creating new strains or constructs, and without knowing the specific nature of the variation. A total of 3714 biallelic markers, spaced about every 3.5 kilobases, were identified by analyzing the patterns obtained when total genomic DNA from two different strains of yeast was hybridized to high-density oligonucleotide arrays. The markers were then used to simultaneously map a multidrug-resistance locus and four other loci with high resolution (11 to 64 kilobases).
Authors
11
- Elizabeth A. Winzeler (first)
- Dan R. Richards (additional)
- Andrew R. Conway (additional)
- Alan L. Goldstein (additional)
- Sue Kalman (additional)
- Michael J. McCullough (additional)
- John H. McCusker (additional)
- David A. Stevens (additional)
- Lisa Wodicka (additional)
- David J. Lockhart (additional)
- Ronald W. Davis (additional)
References
39
Referenced
307
10.1038/nbt0198-33
10.1126/science.278.5343.1580
10.1038/nbt1297-1359
- The probes for the entire yeast genome are synthesized in a spatially addressable fashion with a combination of photolithography and solid-phase chemistry [
10.1073/pnas.91.11.5022
10.1126/science.1990438
- ] on a series of five 1.64-cm² arrays. Each expression array contains more than 65 000 synthesis features, with each physical feature containing more than 10⁷ copies of the specific oligonucleotide probe covalently attached to a glass surface. Excluding the rDNA and CUP1 repeats, the largest gap is 41 325 bases wide, at position 510 000 on chr XII.
10.1073/pnas.80.1.278
10.1126/science.274.5287.610
10.1038/ng0593-11
- The completed S. cerevisiae genome sequence is from strain S288c, and 88% of the S288c genome is derived from EM93, which was isolated from a rotting fig near Merced, California, in 1938 [
10.1093/genetics/113.1.35
- ]. S96 is isogenic with S288c but is unable to undergo mating type switching (ho), is able to mate with YJM789, and contains a lesion in the lys5 gene that can be easily scored in crosses. YJM789 is isogenic with YJM145, a segregant of a clinical isolate of S. cerevisiae (27). YJM145 has been characterized genetically, and the ultimate source of its parent (human lung) differs substantially from that of S288c in that the strains were isolated from different environments, at different times, and in different geographic locations. Theoretically, any two yeast strains could be used.
- A library of YJM789 genomic DNA was constructed in an M13 sequencing vector. The sequence was determined for 696 clones as previously described [
10.1038/387s078
- ]. The sequences were called by means of the phred base caller software (see chimera.biotech.washington.edu/UWGC/tools/phred.htm), which produces a quality measurement for each base [−10 × log10(probability of an error)]. A total of 122 258 bases were sequenced with greater than 99% confidence by this quality measurement. The YJM789 sequences were compared with the S288c sequence with the cross_match program (see chimera.biotech.washington.edu/UWGC/tools/phrap.htm). Discrepancies between the YJM789 and S288c sequences were then classified by quality and assigned to coding and noncoding regions with the phred base caller. In most cases, because only a single trace was available and no alignments were performed, regions of the traces that did not show high quality were excluded from the analysis. When a high-quality sequence (>99.7% accuracy) was used, 466 cases of allelic variation were observed, with a frequency of about one every 160 bases.
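The quality measurement quoted in this note, −10 × log10(probability of an error), can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of phred; the sketch only converts between error probabilities and quality values for the two thresholds mentioned (99% and 99.7%).

```python
import math

def phred_quality(p_error: float) -> float:
    """Quality value Q = -10 * log10(probability of an error)."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in (0, 1]")
    return -10.0 * math.log10(p_error)

def error_probability(quality: float) -> float:
    """Inverse mapping: probability of an error for a given quality value."""
    return 10.0 ** (-quality / 10.0)

# Thresholds from the note, written as error probabilities.
q_99 = phred_quality(0.01)    # 99% confidence
q_997 = phred_quality(0.003)  # 99.7% accuracy
print(f"99% confidence  -> Q = {q_99:.1f}")
print(f"99.7% accuracy -> Q = {q_997:.1f}")
```

Under this definition, the 99% cutoff corresponds to a quality value of 20 and the 99.7% cutoff to a value slightly above 25.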
- Of the 466 cases of allelic variation in sequences with greater than 99.7% accuracy, 288 were from coding regions (61%). Of the estimated 13.2 Mb of the yeast genome, 8.637 Mb (65%) is annotated as coding sequence by the Saccharomyces Genome Database.
- Yeast genomic DNA (10 μg, purified on a Qiagen column) was digested with 0.15 U of deoxyribonuclease I (DNase I) [Gibco-BRL, polymerase chain reaction (PCR) grade] in 1× One-Phor-All buffer (Pharmacia) containing 1.5 mM CoCl₂ for 5 min at 37°C. After heat inactivation of the DNase I, the DNA fragments were end-labeled in the same buffer by the addition of 25 U of terminal transferase (Boehringer Mannheim) and 1 nmol of biotin-N6-dideoxyadenosine triphosphate (NEN) for 1 hour at 37°C. The entire sample was hybridized to the array in a 200-μl volume as previously described (3).
- Grids were aligned to the scanned images by the known feature dimensions of the array. The hybridization intensities for each of the elements in the grid were determined by the seventy-fifth percentile method in the Affymetrix GeneChip software package.
- An adjusted array hybridization intensity value (I) was determined for each hybridization (20 altogether) as the mean of the log[perfect match (PM)] signals of all features that showed minimal variation across all hybridizations (the nonmarkers, determined recursively as described below). Then, for each feature on the array, a linear regression of log(PM) on I for all hybridizations was determined by the least squares method, first under the null hypothesis that the S96 and YJM789 samples had the same response and then under the alternative hypothesis that the S96 samples had a greater signal than the YJM789 samples. The models were compared with the F test, and the same-signal model was rejected in favor of a marker with 99% confidence. This software is available upon request to D. Richards.
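The model comparison in this note can be sketched in Python as a nested least-squares fit. This is not the authors' software. It assumes the alternative model adds a strain-specific intercept to a shared slope (the note does not give the parameterization), and the function name and all data values are invented for illustration.

```python
def f_statistic(intensity, log_pm, is_s96):
    """Compare a shared line with a shared-slope, strain-specific-intercept fit.

    intensity: adjusted array intensity I, one value per hybridization
    log_pm:    log(PM) signal of one feature in each hybridization
    is_s96:    True where the sample was S96, False where it was YJM789
    Returns (F, offset); offset is the fitted S96 minus YJM789 intercept.
    """
    n = len(intensity)
    mean = lambda v: sum(v) / len(v)

    # Null model: log_pm = a + b * I, same response for both strains.
    mx, my = mean(intensity), mean(log_pm)
    sxx = sum((x - mx) ** 2 for x in intensity)
    sxy = sum((x - mx) * (y - my) for x, y in zip(intensity, log_pm))
    b0 = sxy / sxx
    rss0 = sum((y - my - b0 * (x - mx)) ** 2 for x, y in zip(intensity, log_pm))

    # Alternative model (assumed form): common slope, one intercept per strain.
    # The least-squares slope is the pooled within-strain slope.
    centered, intercept_terms = [], {}
    for strain in (True, False):
        xs = [x for x, s in zip(intensity, is_s96) if bool(s) == strain]
        ys = [y for y, s in zip(log_pm, is_s96) if bool(s) == strain]
        gx, gy = mean(xs), mean(ys)
        intercept_terms[strain] = (gx, gy)
        centered += [(x - gx, y - gy) for x, y in zip(xs, ys)]
    b1 = sum(cx * cy for cx, cy in centered) / sum(cx * cx for cx, _ in centered)
    rss1 = sum((cy - b1 * cx) ** 2 for cx, cy in centered)

    offset = ((intercept_terms[True][1] - b1 * intercept_terms[True][0])
              - (intercept_terms[False][1] - b1 * intercept_terms[False][0]))
    # One extra parameter in the alternative; n - 3 residual degrees of freedom.
    f = (rss0 - rss1) / (rss1 / (n - 3))
    return f, offset

# Invented data: 20 hybridizations, alternating strains, small fixed "noise".
ks = range(20)
strain = [k % 2 == 0 for k in ks]
I = [7.0 + 0.1 * k for k in ks]
noise = [0.05 * ((2 * k) % 5 - 2) for k in ks]
marker = [1.0 + 0.9 * x + e + (1.0 if s else 0.0) for x, e, s in zip(I, noise, strain)]
nonmarker = [1.0 + 0.9 * x + e for x, e in zip(I, noise)]

f_marker, offset_marker = f_statistic(I, marker, strain)
f_nonmarker, _ = f_statistic(I, nonmarker, strain)
print(f_marker, offset_marker, f_nonmarker)
```

The F statistic alone is two-sided. Because the note's alternative is directional, a feature would be called a marker only if F exceeds the chosen critical value (about 8.40 for 1 and 17 degrees of freedom at the 1% level, from standard F tables) and the fitted offset is positive.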
- Gaps were often found near regions with low probe coverage, for example, near repeated elements in the genome or regions of low open reading frame (ORF) density. However, in some cases probe coverage was adequate, suggesting that the gap might be due to a high amount of sequence conservation or to the region having a recent common origin for the two strains.
- The value p is computed as P(S96)/[P(S96) + P(YJM789)], where P(X) is the probability (from the t distribution) that a marker has genotype X, based on the observed PM hybridization signal of the feature, the expected signal (given the array hybridization intensity), and the estimated variance from the regression for the marker.
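As a rough illustration of the ratio p = P(S96)/[P(S96) + P(YJM789)], the following Python sketch evaluates a Student's t density at the standardized residual of the observed signal under each genotype. Using the density is an assumption, since the note says only "from the t distribution". The function names, degrees of freedom, and numeric inputs are likewise hypothetical.

```python
import math

def t_density(t: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1.0 + t * t / df) ** (-(df + 1) / 2)

def genotype_p(observed, expected_s96, expected_yjm789, sd, df):
    """p = P(S96) / [P(S96) + P(YJM789)] for one marker in one hybridization.

    observed:        observed log(PM) signal of the feature
    expected_s96:    signal predicted by the regression for an S96 genotype
    expected_yjm789: signal predicted by the regression for a YJM789 genotype
    sd:              residual standard deviation estimated from the regression
    df:              residual degrees of freedom (assumed input)
    """
    p_s96 = t_density((observed - expected_s96) / sd, df)
    p_yjm = t_density((observed - expected_yjm789) / sd, df)
    return p_s96 / (p_s96 + p_yjm)

# Hypothetical values: a signal sitting on the S96 prediction, and one midway.
print(genotype_p(9.0, 9.0, 8.0, 0.2, 17))  # close to 1: looks like S96
print(genotype_p(8.5, 9.0, 8.0, 0.2, 17))  # 0.5: uninformative
```

Values of p near 1 favor the S96 allele, values near 0 favor YJM789, and values near 0.5 carry little information about the genotype.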
10.1038/387s067
- T. Petes, R. Malone, L. Symington, in The Molecular and Cellular Biology of the Yeast Saccharomyces, J. Broach, J. Pringle, E. Jones, Eds. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1991), vol. I, pp. 407–521.
10.1093/genetics/93.1.51
- E. Alani, L. Cao, N. Kleckner, ibid. 116, 541 (1987).
- Yeast strains were routinely grown in yeast extract, peptone, and dextrose (YEPD) medium; sporulation medium and defined medium for scoring auxotrophs were prepared as previously described [F. Sherman, G. Fink, C. Lawrence, Methods in Yeast Genetics: Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1974)]. Segregants were complementation tested to distinguish lys2 from lys5. Cycloheximide sensitivity was scored by inability to grow on YEPD plates containing cycloheximide (0.5 μg/ml). All segregants from the 99 tetrads were phenotyped for lys2, lys5, and cyh. lys5 and cyh segregated 2:2 for 99 tetrads, and lys2 segregated 2:2 for 98 of the 99. Selected segregants were scored for MAT and ho. The ho and ho::hisG alleles were distinguished by checking the size of PCR products on a gel, and MAT was determined by mating and complementation.
- The loci could have been mapped with any segregant as long as the genotype was known; however, segregants with similar genotypes were chosen to simplify the analysis.
- The probability of an interval segregating 10 to 0 randomly (a false positive) was estimated to be about 40% for each outcome. No false positives were observed with 10 segregants, and therefore no additional hybridizations were performed. This conservative estimate of probability, which does not take into account recombination hotspots or interference, was calculated by dividing the genome size (12 Mb) by the average interval (29 kb for 10 segregants, with 1 cM = 2.9 kb for yeast) and then multiplying this number by the probability of N events having the same outcome, (1/2)^N, with N = 10. In general, up to 13 segregants (or more if the trait is non-Mendelian) may need to be examined to have a 95% probability of identifying a single region as responsible for a trait.
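The arithmetic behind the 40% estimate can be reproduced with a short Python sketch. The function name is hypothetical. For the 13-segregant case the 29-kb interval is held fixed, which is a simplifying assumption, because the note quotes that interval only for 10 segregants.

```python
def expected_false_positives(n_segregants: int,
                             genome_kb: float = 12_000.0,
                             interval_kb: float = 29.0) -> float:
    """Number of intervals in the genome times (1/2)^N, as described in the note."""
    n_intervals = genome_kb / interval_kb  # about 414 intervals of 29 kb in 12 Mb
    return n_intervals * 0.5 ** n_segregants

fp_10 = expected_false_positives(10)
fp_13 = expected_false_positives(13)  # interval held at 29 kb (assumption)
print(f"N=10: {fp_10:.3f}  N=13: {fp_13:.3f}")
# prints "N=10: 0.404  N=13: 0.051"
```

With 10 segregants the result is about 0.40, matching the quoted 40%. With 13 segregants it falls to about 0.05, consistent with the roughly 95% probability of identifying a single region.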
- The breakpoints were recursively added to each chromosome on the basis of the p values. The probabilities of breakpoints at every pair of markers were tested against the probability of no breakpoint. The breakpoints that maximized this likelihood were accepted if the logarithmic likelihood ratio was greater than 30. This procedure was repeated for each new subinterval created by a breakpoint to 500-bp resolution.
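One way to read the recursive procedure in this note is sketched below in Python. It is an interpretation, not the authors' algorithm. The sketch treats each marker's p as an independent probability of the S96 genotype. Within a region it tests every pair of marker positions as the two ends of a segment whose genotype differs from its flanks, accepts the best pair if the natural-log likelihood ratio exceeds 30, and recurses into the resulting subintervals. The function name, the clipping constant, and the choice of natural logarithm are assumptions, and the 500-bp resolution step is not modeled.

```python
import math

def find_breakpoints(p, threshold=30.0, eps=1e-9):
    """Return sorted indices k with a genotype switch between markers k-1 and k.

    p: per-marker probability of the S96 genotype, in chromosome order.
    """
    clipped = [min(max(x, eps), 1.0 - eps) for x in p]  # avoid log(0)
    cum_s96, cum_yjm = [0.0], [0.0]                     # prefix sums of log-likelihoods
    for x in clipped:
        cum_s96.append(cum_s96[-1] + math.log(x))
        cum_yjm.append(cum_yjm[-1] + math.log(1.0 - x))

    def loglik(lo, hi):
        """Log-likelihood of markers lo..hi-1 being all S96, or all YJM789."""
        return cum_s96[hi] - cum_s96[lo], cum_yjm[hi] - cum_yjm[lo]

    breakpoints = []

    def split(lo, hi):
        if hi - lo < 2:
            return
        s_all, y_all = loglik(lo, hi)
        no_break = max(s_all, y_all)
        best_llr, best = -math.inf, None
        for i in range(lo, hi):
            for j in range(i + 1, hi + 1):
                if i == lo and j == hi:
                    continue  # identical to the no-breakpoint model
                s_in, y_in = loglik(i, j)
                # Segment [i, j) carries one genotype, the flanks the other.
                with_break = max(s_all - s_in + y_in, y_all - y_in + s_in)
                if with_break - no_break > best_llr:
                    best_llr, best = with_break - no_break, (i, j)
        if best_llr <= threshold:
            return
        i, j = best
        breakpoints.extend(k for k in (i, j) if lo < k < hi)
        split(lo, i)
        split(i, j)
        split(j, hi)

    split(0, len(p))
    return sorted(breakpoints)

# Hypothetical chromosome: 20 S96-like, 20 YJM789-like, then 20 S96-like markers.
print(find_breakpoints([0.999] * 20 + [0.001] * 20 + [0.999] * 20))  # [20, 40]
```

A single discordant marker does not reach the threshold in this sketch, so isolated noisy features are not reported as crossovers.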
- M. Boehnke, Am. J. Hum. Genet. 55, 379 (1994).
10.1016/S0021-9258(17)42155-7
- E. A. Winzeler et al., unpublished data.
10.1093/genetics/136.4.1261
10.1126/science.280.5366.1077
10.1101/gr.7.10.996
10.1073/pnas.86.8.2766
10.1038/nbt1296-1675
- We thank N. Risch and D. Siegmund for helpful advice. E.A.W. is supported by the John Wasmuth Fellowship in Genomic Analysis (HG00185-01). D.R.R. is a Howard Hughes Medical Institute predoctoral fellow. M.J.M. is supported by a Commonwealth AIDS Research Grants Committee of Australia postdoctoral overseas fellowship. This work was funded by NIH grant 1R01 HG01633.
Dates
Type | When |
---|---|
Created | 23 years, 1 month ago (July 27, 2002, 5:50 a.m.) |
Deposited | 1 year, 7 months ago (Jan. 12, 2024, 10:47 p.m.) |
Indexed | 3 weeks, 6 days ago (Aug. 5, 2025, 8:55 a.m.) |
Issued | 27 years ago (Aug. 21, 1998) |
Published | 27 years ago (Aug. 21, 1998) |
Published Print | 27 years ago (Aug. 21, 1998) |
@article{Winzeler_1998, title={Direct Allelic Variation Scanning of the Yeast Genome}, volume={281}, ISSN={1095-9203}, url={http://dx.doi.org/10.1126/science.281.5380.1194}, DOI={10.1126/science.281.5380.1194}, number={5380}, journal={Science}, publisher={American Association for the Advancement of Science (AAAS)}, author={Winzeler, Elizabeth A. and Richards, Dan R. and Conway, Andrew R. and Goldstein, Alan L. and Kalman, Sue and McCullough, Michael J. and McCusker, John H. and Stevens, David A. and Wodicka, Lisa and Lockhart, David J. and Davis, Ronald W.}, year={1998}, month=aug, pages={1194–1197} }