Sunday, April 12, 2009

Targeted Gene Disruption in Transgenic Mice

The specific experimental inactivation of a gene
affords an opportunity to study its normal function
by comparing the normal state with the inactive
state. This will yield information about the role
of a particular gene in development or other
functions that would not be available otherwise.
In the approach described here, a normal
gene is replaced by a mutant allele by disrupting
the normal gene (targeted gene disruption).
The effects can be studied in different embryonic
stages of mouse development and after
birth. Subsequently the results can be compared
to the effects of mutations in homologous
human genes as seen in human genetic diseases.
The method requires the use of mouse
embryonic stem (ES) cells. ES cells are
pluripotent, i.e., they are capable of giving
rise to different kinds of cells but not to an entire
organism. These cells can be grown in culture
through many generations and yet retain
the potential to be integrated into a mouse blastocyst.
Here they participate in embryonic
development, allowing the production of mice that are
homozygous for a mutation introduced into a
specific gene (knockout mice).

Transgenic mouse derived from ES cells with a disrupted gene

Embryonic stem (ES) cells from a mouse blastocyst
are isolated (1) after 3.5 days of gestation
(of a total of 19.5 days) and transferred to a cell
culture (2). Here they will grow on a layer of irradiated
cells that are themselves unable to
grow (feeder layer). Target DNA (see B) from a
mouse homozygous for a marker coat color, e.g.,
dominant black, is added to the ES cell culture.
Very few cells or perhaps just one may take up
target DNA by recombination with the homologous
gene in the ES cell (3). This disrupts
the normal gene. These recombinant ES cells
can be grown in a selective culture medium
(nonrecombinant cells will not grow, see B). Recombinant
ES cells containing a copy of the disrupted
gene are injected into a recipient mouse
blastocyst (4). These cells will be integrated into
the early mouse embryo (5). The blastocyst
partly containing recombinant ES cells is transferred
into a pseudopregnant mouse (6). After
birth (6), mice derived from normal and recombinant
cells (chimeric mice) can easily be identified
by spots of black coat color (7). When
adult chimeric mice are mated to normal mice
homozygous for another coat color allele (e.g.,
white, 8), the birth of black progeny indicates
that the targeted gene is present in the germ
line (9). The mating of such heterozygous mice
(not shown) will produce mice that are homozygous
for the disrupted gene.
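
The final mating step can be sketched as a simple one-gene cross. This is a minimal illustration in Python, assuming ordinary Mendelian segregation and equal viability of all genotypes (which may not hold if the disrupted gene is essential); the allele labels "+" and "-" and the function name cross are chosen here for illustration only.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Return genotype frequencies among the offspring of a one-gene cross.

    Each parent is a two-character string, one character per allele.
    """
    offspring = Counter()
    # Every allele of one parent can pair with every allele of the other.
    for allele_a, allele_b in product(parent_a, parent_b):
        genotype = "".join(sorted((allele_a, allele_b)))
        offspring[genotype] += 1
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


# "+" = normal allele, "-" = disrupted allele (illustrative labels).
ratios = cross("+-", "+-")
for genotype, share in sorted(ratios.items()):
    print(genotype, share)
```

Under these assumptions, one quarter of the offspring of two heterozygous mice carry the disrupted gene on both chromosomes.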

Double selection for ES cells containing the disrupted gene

The isolation of mouse ES cells with a gene-targeted
disruption requires both a positive and a negative
selection. A bacterial gene conferring resistance
to neomycin (neoR) is inserted (2) into
DNA cloned from the target gene (1). DNA containing
the thymidine kinase gene (tk+) from
herpes simplex virus is then also added to the
vector outside the region of homology. These
two selectable modifications (neoR and tk+) are
part of the replacement vector (3). The vector is
introduced into ES cells, which are grown in culture.

ES cells

All ES cells that do not take up the vector
(the majority) are sensitive to neomycin and
will not grow in a medium containing the antibiotic
(positive selection, not shown). Cells that
take up the vector (about 1%) do so either by
nonhomologous insertion at random sites (4) or
by homologous recombination at the target site
(5). These cells can be distinguished when
grown in a selective medium containing ganciclovir,
a nucleotide analogue toxic to cells containing
tk+. Only cells containing the gene-targeted
insertion (through homologous recombination)
will grow because they do not contain
tk+ (negative selection).
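
The logic of the double selection can be written out as a small truth table. The Python sketch below is a simplification, assuming that a random insertion always carries both neoR and tk+ and that homologous recombination always loses tk+; the function and variable names are illustrative and not part of any established protocol.

```python
def survives_double_selection(has_neoR, has_tk):
    """Return True if a cell grows under both selection steps."""
    survives_positive = has_neoR      # neomycin kills cells lacking neoR
    survives_negative = not has_tk    # the nucleotide analogue kills tk+ cells
    return survives_positive and survives_negative


# (has_neoR, has_tk) for the three outcomes described in the text.
CELL_TYPES = {
    "no vector taken up": (False, False),
    "nonhomologous (random) insertion": (True, True),
    "homologous recombination": (True, False),
}

survivors = [
    name
    for name, (has_neoR, has_tk) in CELL_TYPES.items()
    if survives_double_selection(has_neoR, has_tk)
]
print(survivors)
```

Only the cells with the gene-targeted insertion pass both steps, which is the point of placing tk+ outside the region of homology.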

Genomics, the Study of the Organization of Genomes

A genome contains all biological information
required for life and/or reproduction.
The term genomics for the study of genomes
was introduced in 1987 by V. A. McKusick and F.
H. Ruddle at the suggestion of T. H. Roderick of
the Jackson Laboratory, Bar Harbor, Maine, USA.
The term genomics extends beyond genetics.
While the latter mainly deals with heredity and
its mechanism and consequences, the term
genomics encompasses many aspects relating
to molecular and cell biology: the different
types of genomic maps; nucleic acid sequencing;
assembly, storage, and management of
data; gene identification; functional analysis
(functional genomics), evolution of genomes;
and other interdisciplinary areas relating to the
wide variety of genomes in different organisms.
A eukaryotic genome, contained in the chromosomes,
is many times larger than a prokaryotic
genome. A prokaryotic genome consists of a
circular chromosome with compactly arranged
genes.

Important insights

Important insights about the functions, evolutionary
relationships, antibiotic resistance, and
other metabolic aspects required for the
development of new strategies for therapy are
gained from sequencing the genomes of microorganisms.

The genome of a small bacteriophage

The genome of a bacteriophage usually consists
of double-stranded DNA, although some phage
genomes consist of single-stranded DNA or of
RNA. The size of the genome of phages ranges
from 1.6 kb to over 150 kb, representing anywhere
from a few to over 200 genes. One of the
smallest phages is ΦX174. F. Sanger and coworkers
sequenced the genome completely and
found that several of the ten genes of ΦX174
overlap.
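
Genes can overlap because one stretch of DNA can be read in more than one reading frame. The short Python sketch below illustrates this with a made-up sequence (it is not taken from the phage genome); the helper name codons is likewise only illustrative.

```python
def codons(seq, frame):
    """Split seq into complete codons, starting at offset frame (0, 1, or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


# Made-up example sequence, chosen so that two frames are easy to compare.
SEQ = "GATGAGTCAAGTTACTGA"

frame0 = codons(SEQ, 0)  # one gene could be read in this frame
frame1 = codons(SEQ, 1)  # an overlapping gene could use this frame

print(frame0)
print(frame1)
```

In this toy sequence the second frame starts with ATG while the first frame ends with the stop codon TGA, so the same nucleotides contribute to two entirely different series of codons.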

Overlapping genes in ΦX174

The genes A and B, B and C, and D and E partially
overlap. In these overlapping regions of the
genome, the same DNA sequence is read in
different reading frames by different genes. Gene E begins
with the start codon ATG, the first two

Genome of Escherichia coli

E. coli is an important microorganism. It
colonizes the lower intestines of mammals including
man in a symbiotic relationship. Pathogenic
strains of E. coli cause gastrointestinal,
urinary, pulmonary, and nervous system infections
in humans. The E. coli genome has
4639221 bp. A total of 2657 protein-coding
genes with known function (62% of all genes)
and 1632 genes (38%) without known function
have been identified. The simplified figure
shows eight genes (A–H), the origin of replication
(ORI), and the genes for DNA polymerase
and methionine. Four operons are shown: the
operons for lactose consisting of three genes,
for galactose with four genes, for tryptophan
with five genes, and for histidine with nine
genes. About a fourth of all genes in E. coli are
organized into 75 different operons. Most genes
of the E. coli genome are present as a single
copy; only the genes for ribosomal RNA (rRNA)
are present in multiple copies. As a result, the
bacteria can double their protein content every
20 minutes during cell division.
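
The gene counts quoted above can be checked with a few lines of arithmetic. This Python sketch uses only the numbers given in the text; the average spacing is a rough figure that assumes genes are spread evenly over the genome, and the variable names are illustrative.

```python
GENOME_BP = 4_639_221      # genome size in base pairs, from the text
KNOWN_FUNCTION = 2657      # protein-coding genes with known function
UNKNOWN_FUNCTION = 1632    # genes without known function

total_genes = KNOWN_FUNCTION + UNKNOWN_FUNCTION
share_known = 100 * KNOWN_FUNCTION / total_genes
share_unknown = 100 * UNKNOWN_FUNCTION / total_genes

# Rough spacing: genome length divided by the number of genes.
bp_per_gene = GENOME_BP / total_genes

print(total_genes)
print(round(share_known), round(share_unknown))
print(round(bp_per_gene))
```

The two counts add up to the 4289 protein-coding genes reported for the complete sequence, and the rounded shares agree with the 62% and 38% given in the text.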

The Complete Sequence of the Escherichia coli Genome

The report of the complete sequence of the
genome of the E. coli K-12 strain in 1997
(Blattner et al., 1997) with a full map of the
4289 protein-coding genes of the 4 639 221-base-pair
genome is presented here as an
example of one of many sequenced microorganism
genomes.

Overall structure and comparison with other genome sequences

The figure shows a small section of about 80 kb
of the entire genome from the original publication.
Base pair numbers 3310000 to approximately 3345000
are shown in the first row (top), and 3339000
to 4025000 in the second row. The top double
line shows color-coded genes of E. coli encoding
a protein on either of the two strands of the DNA
double helix. Six other completed genomes are
shown for comparison.

Average distance between genes

The average distance between genes is 118 bp.
The protein-coding genes (87.8% of the
genome) can be assigned to 22 functional
groups (see gene function color code at the
bottom of the figure). Among these are 45 genes
with recognized regulatory functions (1.05% of
the total); 243 genes for energy metabolism
(5.67%); 115 genes for DNA replication, recombination,
and repair (2.68%); 255 genes for transcription,
RNA synthesis, and metabolism
(5.94%); 182 genes for translation (4.24%); 131
genes for amino acid biosynthesis and metabolism
(3.06%); and 58 genes for nucleotide biosynthesis
and metabolism (1.35%).
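
The percentages quoted for the functional groups can be compared with the gene counts. The Python sketch below assumes that each percentage is the share of the 4289 protein-coding genes mentioned earlier; that denominator is an assumption made here, so small differences in the last digit are to be expected. The dictionary keys are shortened labels.

```python
TOTAL_GENES = 4289  # assumed denominator (protein-coding genes)

# group label: (number of genes, percentage quoted in the text)
GROUPS = {
    "regulatory functions": (45, 1.05),
    "energy metabolism": (243, 5.67),
    "DNA replication, recombination, repair": (115, 2.68),
    "transcription, RNA synthesis, metabolism": (255, 5.94),
    "translation": (182, 4.24),
    "amino acid biosynthesis and metabolism": (131, 3.06),
    "nucleotide biosynthesis and metabolism": (58, 1.35),
}

# Recompute each share and record how far it lies from the quoted value.
deviations = {}
for name, (count, quoted) in GROUPS.items():
    computed = 100 * count / TOTAL_GENES
    deviations[name] = abs(computed - quoted)
    print(f"{name}: {computed:.2f}% (quoted {quoted}%)")

genes_listed = sum(count for count, _ in GROUPS.values())
print(genes_listed)
```

The seven groups listed here account for 1029 genes, so they cover only part of the 22 functional groups.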

...........

How many genes are required for a microorganism?
The smallest known cellular genome is
that of Mycoplasma genitalium, with only 480
protein-coding genes and 37 RNA genes in 580
kb of DNA. However, not all of these genes are
essential. Hutchison et al. (1999) determined by
global transposon mutagenesis that 1354 of
2209 insertions into genes were not lethal to
the organism. The results suggested that 265–
350 of the 480 genes are essential under laboratory
conditions. The limited capacity for metabolism
in M. genitalium is compensated for by a
greater dependence on the transport of
molecules from the extracellular environment
into the cell, mainly by ABC transporters. An
ABC transporter is a heterotrimeric transport
system made up of a specific ligand-binding
subunit, a permease, and an ATP-binding protein.
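
The figures for M. genitalium can be turned into proportions with simple arithmetic. This Python sketch uses only the numbers quoted above; the genome size is taken as a round 580 000 bp, which is an approximation, and the variable names are illustrative.

```python
PROTEIN_GENES = 480
RNA_GENES = 37
GENOME_BP = 580_000              # "580 kb", rounded
ESSENTIAL_RANGE = (265, 350)     # estimated essential genes (low, high)
NONLETHAL_INSERTIONS = 1354
TOTAL_INSERTIONS = 2209

# Share of protein-coding genes estimated to be essential (low, high).
essential_share = tuple(100 * n / PROTEIN_GENES for n in ESSENTIAL_RANGE)

# Share of transposon insertions into genes that were not lethal.
nonlethal_share = 100 * NONLETHAL_INSERTIONS / TOTAL_INSERTIONS

# Rough average genome length per gene, protein-coding and RNA genes combined.
bp_per_gene = GENOME_BP / (PROTEIN_GENES + RNA_GENES)

print([round(s, 1) for s in essential_share])
print(round(nonlethal_share, 1))
print(round(bp_per_gene))
```

On these numbers, somewhat more than half and up to about three quarters of the protein-coding genes appear to be essential under laboratory conditions.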

Total of protein families

The total of protein families required for an organism
is called a proteome. Its study is called
proteomics. Since many genes are required to
encode the different proteins of a given metabolic
or signal pathway, genes of related functions
are grouped into families. Their number is
considerably smaller than the total number of
genes. The CAI (Codon Adaptation Index) reflects
the preferred codon usage of an organism. (The
figure is adapted from a small part of the
complete map

Genome of a Plasmid from a Multiresistant Corynebacterium

Plasmids are double-stranded circular DNA
molecules that occur in bacteria but are separate from the
bacterial chromosome. They are self-replicating
and occur in a symbiotic or parasitic relationship
with the host cell. The number of plasmids
per bacterial cell varies from a few to thousands.
Their sizes range from a few thousand
base pairs to more than 100 kb. Plasmids usually
confer a benefit to the host cell, often because
they contain genes encoding enzymes
that inactivate antibiotics. Drug-resistant
plasmids pose a major threat to successful antibiotic
therapy. Since many plasmids also contain
transfer genes encoding proteins that form
a macromolecular tube, or pilus, through which
a copy of plasmid DNA can be transferred to
other bacteria, antibiotic resistance can spread
very rapidly. The following example provides
new insights into the origin and evolution of a
multiresistant plasmid composed of DNA segments
derived from bacteria of very different
origins (soil bacteria and plant, animal, and
human pathogens).

The multiresistant plasmid pTP10

A large, 51409 bp plasmid from the multiresistant
Gram-positive Corynebacterium striatum
M82B contains genes encoding proteins that
render its host bacteria resistant to 16 antimicrobial
agents from six structural classes
(Tauch et al., 2000). This is the largest plasmid
to have been sequenced to date. It contains DNA
segments from a plasmid-encoded erythromycin
(Em, shown inside the circular genome diagram)
resistance region from the human pathogen
Corynebacterium diphtheriae, a chromosomal
DNA region from Mycobacterium tuberculosis
containing tetracycline (Tc) and oxacillin
resistance, a plasmid-encoded chloramphenicol
(Cm) resistance region from the soil bacterium
Corynebacterium glutamicum, and a
plasmid-encoded aminoglycoside resistance to
kanamycin (Km), neomycin, lividomycin, paromomycin,
and ribostamycin from the fish
pathogen Pasteurella piscicida. In addition, the
plasmid contains five transposons and four insertion
sequences (IS1249, IS1513, IS1250, and
IS26) at eight different sites. Altogether eight
genetically distinct DNA segments of different

Genetic map of plasmid pTP10

Plasmid pTP10 from the Gram-positive opportunistic
human pathogen Corynebacterium striatum
M82B has 47 open reading frames (ORFs).
They can be assigned to eight different DNA
segments forming a contiguous array of subdivided
stretches in the linear representation
shown.

Segment I

Segment I (shown in green) consists of five ORFs
comprising the composite resistance transposon
Tn5432. The insertion sequences IS1249b
(ORF1) and IS1249a (ORF5) flank the erythromycin
resistance gene region ermCX (ORF3) and
ermLP (ORF4). An identical copy of IS1249 occurs
in ORF29 (segment VIII). ORF3, the central
region of Tn5432, encodes a 23S rRNA methyltransferase
preceded by a short leader peptide
probably involved in the regulation of erythromycin-inducible
translational attenuation.
This region is virtually identical to the antibiotic
resistance gene region (erythromycin, clindamycin)
of plasmid pNG2 from C. diphtheriae
S601.

Segment II

Segment II (ORFs 6–14), located downstream of
Tn5432, contains the tetracycline resistance
genes tetA (ORF6) and tetB (ORF7). This segment
(ORFs 6–14) is very similar to ATP-binding cassette
(ABC) transporters identified in a mycobacterium
(M. smegmatis) chromosome. The
tandemly arranged genes tetA and tetB also mediate
resistance to the β-lactam antibiotic oxacillin,
although oxacillin is structurally and functionally
unrelated to tetracycline. Presumably this
results from TetAB protein heterodimerization
and subsequent export of the antibiotics out of
the bacterial cell. Similarly, the other segments
have been delineated.