GLOSSARY

1C is used to designate the nuclear DNA content of a gamete. Basically, an organism's 1C DNA content is the smallest quantity of nuclear DNA that can be used to define that organism. Somatic cells in G1 of interphase have a 2C DNA content, and G2 cells possess a 4C DNA content.

Alpha-complementation is the most common form of insertional inactivation. In alpha-complementation, the vector molecule contains the regulatory and coding regions for the first 146 amino acids of the β-galactosidase (lacZ) gene. A polycloning site has been engineered into the coding sequence without disrupting the activity of the gene product (the amino-terminus of β-galactosidase). During ligation, vector molecules are cut at a particular sequence in the polycloning site and incubated with pieces of insert DNA cut with the same restriction enzyme. In some instances, insert DNA is successfully ligated into a vector molecule causing disruption of the lacZ coding sequences on the vector. However, some vectors will not be cut by the restriction endonuclease or will have their ends religated without incorporation of an insert. After ligation, the resulting DNA is used to transform a competent E. coli strain in which only the region of the lacZ gene coding for the carboxy-terminal portion of β-galactosidase is present. In cells that contain a vector without an insert, the amino-terminal β-galactosidase subunit will interact with the carboxy-terminal β-galactosidase subunit to form an active β-galactosidase enzyme (the inducer IPTG is often necessary for transcription). β-galactosidase catalyzes the transformation of the clear substrate X-gal into a blue precipitate, giving colonies with an active β-galactosidase protein a distinctive blue color. In contrast, clones containing a polycloning site into which a piece of exogenous DNA has been inserted (i.e., recombinant clones) will not produce a functional amino-terminal subunit, and will consequently appear white in color. It is the recombinant clones that are of interest, and hence they are preferentially selected from agar plates for further analysis (Sambrook et al. 1989).

BAC-end sequencing is a powerful technique used in association with chromosome walking to construct contigs. It is rooted in the principles of Sanger dideoxy DNA sequencing and DNA amplification (Rosenblum et al. 1997). In brief, (a) BAC DNA is isolated from clones using a miniprep procedure, (b) vector-based primers, Taq polymerase, and a nucleotide cocktail containing dye-labeled ddNTPs are added to each miniprep, (c) the mixtures are heat-denatured, cooled to allow primer hybridization, and warmed to a temperature that allows DNA polymerization, (d) each reaction is electrophoretically resolved on an acrylamide sequencing gel, and (e) the gel is analyzed using an automated gel analysis system (see Boysen et al. 1997; Rosenblum et al. 1997; Kelly et al. 1999 for reviews). If the BAC DNA concentration in the minipreps is particularly low, several thermocycle runs can be used to produce more sequencing substrates (Liu and Whittier 1995).

The sequences generated by BAC-end sequencing represent regions of insert DNA adjacent to insert sites ("BAC ends"). Once BAC end sequences for a particular clone are obtained, probes based on that clone's end sequences can be used to screen the BAC library and find clones that overlap the starting clone (i.e., chromosome walking). When applied on a large scale, probing libraries with BAC end sequences can lead to relatively rapid construction of physical maps.

Chromosome walking: In chromosome walking, the end sequences of a "starter" clone(s), typically associated with an EST or RFLP marker, are used to probe colony blots/grids. DNA fingerprints of positive clones are compared to the fingerprint of the starter clone, and those exhibiting a minimal amount of overlap with the starter are grouped into a contig with that clone.

Contig: A contig is a set of clones containing partially overlapping pieces of insert DNA that collectively represent an uninterrupted stretch of genomic DNA. Contigs are constructed using physical mapping techniques.

Chimeras are clones that possess two or more noncontiguous DNA inserts. Chimeras can result from insertion of more than one noncontiguous DNA fragment into a single vector molecule, recombination between inserts in two different vectors, and/or inclusion of two or more recombinant molecules (vectors with inserts) in a single host cell (Green et al. 1991; Shizuya et al. 1992).

Concatemers are DNA molecules composed of a vector/genome repeated in tandem (e.g., several BAC vector molecules ligated together) (Lewin 1997).

DNA fingerprinting is a means of analyzing the similarity between several DNA samples based upon the presence or absence of specific restriction sites within their sequences. In DNA fingerprinting, two or more DNA samples (e.g., BAC clones) are digested with the same set of restriction enzymes. The digested DNA samples are run on a gel and blotted onto nitrocellulose (Southern blotting). Blots are hybridized with labeled probe sequences, and similarities/differences between the hybridization patterns for the DNA samples are noted.

In BAC-based physical mapping, DNA fingerprints of BAC clones can be compared. Those clones that have considerable overlap in their fingerprint patterns can be grouped together into contigs (Marek and Shoemaker 1997; Marra et al. 1997).
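As a rough illustration of how fingerprints might be compared, each clone's fingerprint can be treated as a list of band sizes and scored by the fraction of bands that two clones share. The sketch below is a deliberate simplification: the function name, the size tolerance, and the example band sizes are all hypothetical, and real contig-building software relies on statistical cutoffs rather than a raw fraction.

```python
def shared_band_fraction(bands_a, bands_b, tolerance=0):
    """Fraction of the smaller fingerprint's bands that are matched in the other.

    bands_a, bands_b: iterables of band sizes (e.g., in base pairs).
    tolerance: maximum size difference for two bands to count as the same.
    """
    remaining = sorted(bands_b)
    matched = 0
    for band in sorted(bands_a):
        # Greedily pair each band with the first unused band within tolerance.
        for i, other in enumerate(remaining):
            if abs(band - other) <= tolerance:
                matched += 1
                del remaining[i]
                break
    smaller = min(len(list(bands_a)), len(list(bands_b)))
    return matched / smaller if smaller else 0.0


# Hypothetical fingerprints for two BAC clones (band sizes in bp).
clone_1 = [1200, 2500, 4100, 6000]
clone_2 = [2500, 4100, 6000, 8800]
overlap = shared_band_fraction(clone_1, clone_2)
```

Clones whose score exceeds a chosen threshold would be candidates for placement in the same contig; the threshold itself would have to be set empirically.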

False positive clones do not contain inserts from the experimental organism, yet exhibit the phenotype of recombinants based upon their growth pattern and colony color on selective media (e.g., white color, chloramphenicol-resistance). False positives generally do contain a vector molecule. However, the marker gene has been inactivated by an event other than insertion of a large DNA fragment into the polycloning site.

F factors are naturally occurring episomes (i.e., DNA elements that can exist as circular plasmids or can be integrated into the bacterial chromosome) found in some bacterial strains. In its free plasmid (circular) form, the typical F factor is approximately 100 kb and is maintained at a level of one copy per bacterial genome. When inserted into the bacterial genome, F factors are replicated along with the bacterial genome. Integrated F factors can be present in more than one copy per cell. Bacteria that possess an F factor (F-positive bacteria) can conjugate with strains that do not contain an F factor (F-negative bacteria). During conjugation (which is mediated by F factor gene products), an F-positive bacterium containing a free plasmid transfers a copy of the F plasmid to an F-negative bacterium. If the F factor is integrated into the F-positive bacterium's genome, the F factor and part or all of the donor's chromosomal DNA may be transferred into the F-negative bacterium (Willetts and Skurry 1987; Lewin 1997).

Although BACs are derived from F factors, only a few of the F factor genes have been preserved in BACs. Genes involved in conjugation and regions involved in insertion of the F factor into the bacterial genome have been eliminated. The endogenous F factor genes left in BACs serve to prevent more than one BAC/F factor from inhabiting a cell and ensure proper replication and segregation of the BAC into daughter cells (see FIGURE 1.1).

Fluorescence in situ hybridization (FISH) is a technique in which hapten-labeled DNA probes are hybridized to chromosomes that have been spread on glass microscope slides. Antibodies or other affinity reagents conjugated to fluorochromes are used to detect (directly or indirectly) sites of hybridization (Peterson et al. 1999).

Genome coverage is the combined base pair length of all the inserts in a genomic library divided by the 1C genome content of the organism for which the library was made. The level of genome coverage for a particular library can be determined using the following formula:

W = NI/G

Where W = coverage, N = total number of clones in the library, I = mean length in base pairs of DNA inserts, and G = the 1C genome size (in base pairs) of the organism from which the library was made.

For example, suppose that a BAC library was constructed for soybean (Glycine max). The 1C DNA content for soybean is approximately 1.115 x 10^9 base pairs. If the library contained 100,000 clones with an average insert size of 120,000 bp, the library coverage would be

W = (100,000 clones x 120,000 bp)/(1.115 x 10^9 bp)

W = 10.8

To put it another way, the library would contain 10.8 genomes' worth of soybean DNA, or 10.8 times (10.8X) the amount of nuclear DNA in a soybean gamete.
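The soybean calculation can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name genome_coverage and its parameter names are not part of the original text.

```python
def genome_coverage(n_clones, mean_insert_bp, genome_bp):
    """Return library coverage W = N * I / G (all lengths in base pairs)."""
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_clones * mean_insert_bp / genome_bp


# Soybean example from the text: 100,000 clones, 120,000 bp mean insert,
# and a 1C genome of about 1.115 x 10^9 bp.
soybean_w = genome_coverage(100_000, 120_000, 1.115e9)
print(f"W = {soybean_w:.1f}X")  # prints "W = 10.8X"
```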

With 3X coverage, the chance of finding a particular genomic sequence in a library is approximately 95%. Increasing coverage to 5X improves the chances that a library is truly representative (includes all of the sequences within the genome of interest) to 99%. Naturally, increasing the genome coverage above 5X affords even higher confidence levels (Paterson 1996).
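One common way to relate coverage to percentages like these is a Poisson approximation, in which the chance that a given sequence is absent from a library of coverage W is about e^-W. The sketch below assumes that clones are drawn at random from the genome; it is not necessarily the calculation used in the cited source, and the function names are hypothetical.

```python
import math


def prob_sequence_present(coverage):
    """Approximate chance that a given sequence is in a library of this coverage."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


def coverage_needed(probability):
    """Approximate coverage required to reach the given chance of representation."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in [0, 1)")
    return -math.log(1.0 - probability)


for w in (3, 5, 10):
    print(f"{w}X coverage: {prob_sequence_present(w):.1%}")
```

Under this approximation, 3X and 5X coverage correspond to roughly 95% and 99%, in line with the figures quoted above.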

Insertional inactivation can be used to differentiate recombinant clones from non-recombinants. Most plasmid vectors (including most BACs) contain a reporter gene into which a polycloning site has been engineered. Ligation of a piece of exogenous (insert) DNA into the polycloning site of a vector results in disruption of the reporter gene whereas a vector molecule that either was not cut by the restriction enzyme or that has had its termini ligated back together contains an intact reporter gene. After transformation, bacteria are plated onto nutritive agar. In recombinant clones (clones in which the reporter gene has been disrupted by an insert), a functional version of the reporter gene protein (reporter protein) will not be produced. In clones containing a plasmid without an insert (non-recombinants), the reporter protein will be expressed. The reporter protein is typically an enzyme (or part of an enzyme complex) that catalyzes a colorimetric reaction using a component in the selective media as a substrate. Consequently, recombinant and non-recombinant clones can be differentiated based on colony color (see alpha-complementation).

Insert rearrangements include deletions, transpositions, and inversions. Insert rearrangements, especially in clones containing tandemly repeated DNA sequences, are relatively frequent in yeast artificial chromosome systems (Neil et al. 1990).

Map-based cloning is the use of physical mapping and molecular mapping to isolate a gene(s) involved in a particular phenotype. Basically, molecular mapping is used to determine where on the molecular map the gene is located. Once the two markers that most closely flank the gene have been determined, physical mapping is used to isolate a contig containing the DNA between the two markers. This contig, which presumably contains the gene(s) of interest, can be further evaluated.

Master copy is a term used with regard to ordered libraries. In brief, the plates produced by transferring bacteria directly from colonies on agar plates into microtiter wells constitute the original or "master" copy of the library. Once the bacteria in the master copy plates have been allowed to propagate overnight, copies of the library can be made using the master copy as a template (see CHAPTER 17).

Polycloning site: A polycloning site is a relatively short region within a vector into which several restriction sites have been engineered for the purpose of DNA cloning. These restriction sites are not found elsewhere on the vector molecule. In many instances, a polycloning site is engineered into a reporter gene allowing insertional inactivation. In order for insertional inactivation to work, addition of the polycloning site must not prevent proper transcription of the reporter gene or significantly alter the activity of the reporter gene product.

Multiplex screening is a colony hybridization strategy for efficiently screening ordered libraries with multiple radioisotope- or fluorochrome-labeled probes. Briefly, probes of interest are labeled and arranged in a series of rows and columns in a microtiter plate. Probes from an entire row are pooled and used to screen a set of library grids. Likewise, probes from one column are pooled and used to probe an identical set of grids. Hybridization patterns on both sets of grids are recorded and compared. If a particular clone is recognized by both the pooled row and pooled column probes, that clone most likely contains a DNA sequence complementary to the probe found at the intersection of the row and column on the microtiter plate. Computer analysis of the hybridization patterns of all pooled column and row combinations allows clones to be assigned to probes using a minimum number of hybridizations (Cai et al. 1998).
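The row-and-column logic described above can be sketched in Python as a set intersection. The data layout here (dictionaries mapping each pooled row or column to the set of clone addresses it lit up) and the function name are assumptions made for illustration, not the procedure of Cai et al. (1998).

```python
def assign_clones(row_hits, col_hits):
    """Map each (row, column) probe position to the clones hit by both pools.

    row_hits: dict of row label -> set of clone addresses hit by that row pool.
    col_hits: dict of column label -> set of clone addresses hit by that column pool.
    """
    assignments = {}
    for row, row_clones in row_hits.items():
        for col, col_clones in col_hits.items():
            common = row_clones & col_clones
            if common:
                assignments[(row, col)] = common
    return assignments


# Hypothetical hybridization results; clone addresses are "plate:well" strings.
row_hits = {"A": {"131:G13", "12:B4"}, "B": {"7:C7"}}
col_hits = {1: {"131:G13"}, 2: {"12:B4", "7:C7"}}
probe_to_clones = assign_clones(row_hits, col_hits)
# Clone 131:G13 is assigned to the probe at row A, column 1, and so on.
```

A clone that appears in more than one row pool and more than one column pool is assigned to several positions by this simple intersection, so such cases would need a follow-up hybridization to resolve.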

Ordered libraries: In an ordered BAC library, bacteria from positive colonies (i.e., colonies presumably containing insert DNA) are picked from agar trays and placed into freezing medium in individual wells of microtiter plates (one clone per well). Letters along the Y-axis and numbers along the X-axis of each plate provide each well with a specific alphanumerical designation (e.g., well G13). Additionally, the microtiter plates in a library are numbered consecutively. Consequently, any particular clone in the library possesses its own unique address (e.g., plate 131, well G13).

Segregation of individual clones into separate wells coupled with automation allows complete libraries to be gridded onto filters in a highly specific manner, i.e., each clone is gridded onto a filter based on its address in the library. If a probe hybridizes to a specific spot(s) on a grid, the relative location of the spot can be used to determine the exact location of the clone within the library.

Ordered libraries save valuable time and resources by increasing the efficiency and speed of library screening (see Choi and Wing 1999 for review).
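The plate-and-well addressing described above can be illustrated with a short Python sketch that converts the order in which a clone was picked into its library address. It assumes 384-well plates (16 lettered rows by 24 numbered columns) filled row by row; the plate format, the filling order, and the function name are assumptions for the example, since the text does not specify them.

```python
import string


def clone_address(pick_index, rows=16, cols=24):
    """Return (plate number, well label) for a zero-based pick index."""
    if pick_index < 0:
        raise ValueError("pick_index must be non-negative")
    plate, offset = divmod(pick_index, rows * cols)
    row, col = divmod(offset, cols)
    # Plates and columns are numbered from 1; rows are lettered from A.
    return plate + 1, f"{string.ascii_uppercase[row]}{col + 1}"


plate, well = clone_address(50076)
print(f"plate {plate}, well {well}")  # prints "plate 131, well G13"
```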

Physical mapping: Physical mapping is the grouping of clones into contigs using physical mapping techniques. The goal of most physical mapping projects is to assemble contigs that encompass entire chromosomes/genomes as a prelude to genome sequencing (e.g., Mozo et al. 1999).

Physical mapping techniques are any techniques used in contig construction. The most common BAC physical mapping techniques include chromosome walking, BAC-end sequencing, STS-based mapping, map-based cloning, and DNA fingerprinting.

Quality designations are placed on the labels of microtiter plates as an indicator of how "far removed" a library copy is from the original or master copy. The master copy itself has the "highest quality" denoted on labels by the letter "Q" for "quality" and the Roman numeral "I" (i.e., QI). Libraries prepared using the master copy as a template have the second highest quality designation, i.e., "QII". Copies made from QII plates are designated QIII, etc.

Radiation hybrid mapping: In radiation hybrid mapping, cells from a species of interest (donor species) are exposed to radiation of sufficient intensity to cause chromosome fragmentation. The irradiated cells are then fused with cultured cells from a second species (host species). The host cell line often contains a mutation that prevents its growth on selective media. Over time, the hybrid cells lose most of the chromosome fragments from the donor species. However, one or two chromosomal fragments from the donor species may become stably transmitted and expressed in some of the fusion products. If the hybrid cells are placed in selective media, only those cells in which the mutated host gene has been complemented by a DNA fragment from the donor will survive. Viable hybrids can then be tested for the presence of other markers from the donor species. The likelihood that a donor-specific marker linked to the gene complementing the host's mutation will be found in viable hybrid cells is inversely related to the physical distance between the marker and the gene. Consequently, comparison of the transmission of donor-specific markers with the complementing phenotype can be used to determine gene order and estimate distances between markers (Goss and Harris 1975).
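The idea that markers closer to the complementing gene are retained more often can be sketched as a simple ranking by retention frequency. This is an illustrative simplification with hypothetical names and data; it ranks markers by how often they appear in viable hybrids and does not compute statistical map distances.

```python
def retention_frequencies(viable_hybrids):
    """Fraction of viable hybrid lines in which each donor marker was detected.

    viable_hybrids: list of sets, one per hybrid line, of donor markers found.
    """
    if not viable_hybrids:
        return {}
    counts = {}
    for markers in viable_hybrids:
        for marker in markers:
            counts[marker] = counts.get(marker, 0) + 1
    return {m: n / len(viable_hybrids) for m, n in counts.items()}


def rank_by_proximity(viable_hybrids):
    """Markers sorted from most to least often co-retained with the selected gene."""
    freqs = retention_frequencies(viable_hybrids)
    return sorted(freqs, key=lambda m: (-freqs[m], m))


# Hypothetical marker typing of four viable hybrid lines.
hybrids = [{"m1", "m2"}, {"m1"}, {"m1", "m2", "m3"}, {"m1"}]
order = rank_by_proximity(hybrids)  # ["m1", "m2", "m3"]
```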

Secondary compounds are chemicals not required in the normal metabolic and developmental pathways common to plants (or at least large sub-groups of plants). Examples of secondary compounds are latex, polyphenols (e.g., tannins), and alkaloids (e.g., caffeine, cocaine, nicotine, quinine). Some secondary compounds are apparently involved in plant defense while others have no known function (Salisbury and Ross 1992). Secondary compounds often make isolation of nucleic acids, organelles, and proteins difficult (Loomis 1974; Peterson et al. 1997).

STS-based mapping is a physical mapping technique rooted in the principles of PCR. In general, primer pairs are designed from cDNAs and/or genomic regions known to be single-copy in nature. In the presence of labeled nucleotides, PCR is performed using a particular primer pair(s) and a set of BAC templates (e.g., minipreps from a BAC library). Those BACs that function as templates for a particular primer pair are grouped into a contig. If different primer pairs produce amplification products from the same BAC template, those primers represent loci that are physically close to one another. The amplification product of a primer pair is called an STS (sequence-tagged site) marker. Those amplification products generated using primers derived from cDNAs often are called ESTs (expressed sequence tags) because they presumably correspond to expressed genes.
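The grouping logic described above can be sketched in Python by recording which BACs gave a product with each primer pair and then looking for primer pairs that share a BAC. The data structure, function names, and marker and clone labels below are hypothetical and serve only to illustrate the reasoning.

```python
from itertools import combinations


def sts_by_bac(pcr_results):
    """Invert primer-pair results into BAC -> set of STS markers it amplified.

    pcr_results: dict of STS name -> set of BAC clones that gave a product.
    """
    by_bac = {}
    for sts, bacs in pcr_results.items():
        for bac in bacs:
            by_bac.setdefault(bac, set()).add(sts)
    return by_bac


def linked_sts_pairs(pcr_results):
    """Pairs of STS markers amplified from at least one common BAC."""
    pairs = set()
    for markers in sts_by_bac(pcr_results).values():
        pairs.update(combinations(sorted(markers), 2))
    return pairs


# Hypothetical PCR results for three primer pairs.
results = {
    "sts1": {"bacA", "bacB"},
    "sts2": {"bacB", "bacC"},
    "sts3": {"bacD"},
}
close_loci = linked_sts_pairs(results)  # {("sts1", "sts2")}
```

Here sts1 and sts2 both amplify from bacB, so they would be treated as physically close, while sts3 remains unlinked.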

Yeast artificial chromosomes (YACs) are linear DNA vectors equipped with the essential elements of yeast chromosomes (Hieter et al. 1990; Burke and Olsen 1991). When introduced into living yeast cells, YACs are replicated and segregated along with the standard yeast chromosome complement. During meiosis a YAC will pair and synapse with "homologous" YACs if any are present (Shero et al. 1991).

