Title: Next Generation Genome Sequencing: Towards Personalized Medicine
File Size: 7.5 MB
Total Pages: 280
Document Text Contents
Page 1


Genome Sequencing

Edited by
Michal Janitz

Page 2

Related Titles

Dehmer, M., Emmert-Streib, F. (eds.)

Analysis of Microarray Data


ISBN: 978-3-527-31822-3

Helms, V.

Principles of Computational Cell Biology

ISBN: 978-3-527-31555-1

Knudsen, S.

Cancer Diagnostics with DNA Microarrays


ISBN: 978-0-471-78407-4

Sensen, C. W. (ed.)

Handbook of Genome Research
Genomics, Proteomics, Metabolomics, Bioinformatics, Ethical and Legal Issues


ISBN: 978-3-527-31348-8

Page 140

background, which is achieved by electrostatic repulsion of fluorescent impurities by
the multilayer charged polymer [13]. The low fluorescence background of the surface makes
it possible to detect single fluorescent dye molecules tagged along the DNA backbone.

Individual fluorescent labels on the DNA backbone are imaged using multicolor
TIRF microscopy, a technique capable of localizing single fluorescent dye molecules
with nanometer-scale accuracy [14]. The TIRFM system was based on an Olympus
IX-71 microscope with a custom-modified Olympus TIRFM Fiber Illuminator and a
100-SAPO objective. The current system is capable of detecting three colors. The DNA
backbone is stained with YOYO-1, which is excited by using 488-nm wide-field
excitation from a mercury lamp. Cy3 (green) and Cy5 (red) fluorophores are used
to label the sequence motifs or SNP polymorphic sites and are excited by using 543-nm
and 628-nm helium–neon lasers, respectively. The spatial localization of individual dye
molecules at the polymorphic sites or sequence motif sites is based on centroid
analysis [15]. The centroid analysis strategy relies on the observation that a fluorescent
molecule forms a diffraction-limited image of width λ/2, but the center of the distribution
(which under appropriate conditions corresponds to the position of the dye)
can be localized to arbitrarily high precision by fitting to a two-dimensional elliptical
Gaussian point spread function (PSF), if a sufficient number of photons were collected.
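
The idea behind centroid analysis can be illustrated with a minimal sketch. It is not the authors' IDL software: instead of a full least-squares fit to an elliptical Gaussian PSF, it uses the simpler background-subtracted, intensity-weighted centroid, and it runs on a synthetic, noise-free spot. The function names, grid size, spot widths, and intensities are all assumptions chosen for illustration.

```python
import math

def gaussian_spot(width, height, x0, y0, sigma_x, sigma_y, amplitude, background):
    """Synthesize a noise-free elliptical Gaussian spot sampled at pixel centers."""
    return [
        [
            background
            + amplitude
            * math.exp(
                -((x - x0) ** 2 / (2 * sigma_x ** 2) + (y - y0) ** 2 / (2 * sigma_y ** 2))
            )
            for x in range(width)
        ]
        for y in range(height)
    ]

def centroid(image, background):
    """Background-subtracted, intensity-weighted centroid in pixel units.

    This is a stand-in for the Gaussian PSF fit described in the text,
    not the fit itself.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            weight = max(value - background, 0.0)
            total += weight
            sum_x += weight * x
            sum_y += weight * y
    if total == 0.0:
        raise ValueError("no signal above background")
    return sum_x / total, sum_y / total

# Hypothetical spot: true center at (7.3, 6.8) on a 15 x 15 pixel window.
spot = gaussian_spot(15, 15, x0=7.3, y0=6.8, sigma_x=1.3, sigma_y=1.6,
                     amplitude=500.0, background=20.0)
cx, cy = centroid(spot, background=20.0)
print(f"estimated center: ({cx:.3f}, {cy:.3f})")
```

The estimate lands well inside one pixel even though the spot itself spans several pixels. With real, noisy data the achievable precision depends on the photon count: in the shot-noise-limited case it improves roughly as the PSF width divided by the square root of the number of collected photons.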

Custom-written software in IDL (Research Systems, Inc., Boulder, CO, USA) was
used for image analysis. The software can extract from the image both the DNA
molecules, based on the intensity of the DNA backbone staining dye (YOYO-1), and
the single fluorescent dye labels. The extracted images are then merged and the
individual dye labels are superimposed onto the DNA backbone. Sequence motif maps
are then constructed and the haplotypes inferred after localizing the dye labels
along the DNA backbone.
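
One way to picture the step of localizing a label along the backbone is to project the label's fitted position onto the traced backbone and measure the distance from one end. The sketch below is an illustration under stated assumptions, not the IDL software: the backbone is assumed to be available as a polyline of pixel coordinates, and `arc_position`, `to_base_pairs`, the pixel size, and the stretch value are hypothetical names and numbers.

```python
import math

def arc_position(backbone, label):
    """Project a label onto a polyline backbone.

    Returns (distance along the backbone to the closest point,
             perpendicular offset of the label from the backbone),
    both in pixels.
    """
    best_d2, best_arc = math.inf, 0.0
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(backbone, backbone[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicate points
        # Fractional position of the projection on this segment, clamped to [0, 1].
        t = ((label[0] - x0) * dx + (label[1] - y0) * dy) / seg_len ** 2
        t = min(1.0, max(0.0, t))
        px, py = x0 + t * dx, y0 + t * dy
        d2 = (label[0] - px) ** 2 + (label[1] - py) ** 2
        if d2 < best_d2:
            best_d2, best_arc = d2, travelled + t * seg_len
        travelled += seg_len
    return best_arc, math.sqrt(best_d2)

def to_base_pairs(arc_px, pixel_nm, nm_per_bp):
    """Convert a distance in pixels to base pairs (both scale factors are assumptions)."""
    return arc_px * pixel_nm / nm_per_bp

# Hypothetical L-shaped backbone and two label positions, in pixels.
backbone = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
for label in [(4.0, 1.0), (11.0, 5.0)]:
    arc, offset = arc_position(backbone, label)
    print(label, "->", arc, "px along backbone, offset", offset, "px")
```

A label with a large offset from the backbone would be a candidate for rejection as a stray signal; the threshold for that decision is not given in the text and would have to be chosen empirically.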

Single DNA Molecule Mapping

DNA mapping is an important analytical tool in genomic sequence assembly,
medical diagnostics, and pathogen identification [16–18]. The current strategy for
DNA mapping is based on sizing DNA fragments generated by enzymatic digestion
of genomic DNA with restriction endonucleases. More recently, several linear DNA
mapping techniques have been developed with the goal of mapping DNA in its
native state. The DNA molecular combing and optical mapping techniques interrogate
multiple sequence sites on single DNA molecules deposited on a glass surface
and have been used to detect disease-related mutations and map several microbial
genomes [6, 8, 19]. A direct linear analysis (DLA) technique, in which a long dsDNA
molecule was tagged at specific sequence sites with fluorescent dyes and stretched
into linear form as it flowed through a microfluidic channel, has also been developed
for DNA mapping [9]. These techniques not only provide the locations of restriction or
fluorescent labeling sites, but also preserve the order of those sites within the
DNA molecule.

Our DNA mapping method is based on direct imaging of individual DNA
molecules and localization of multiple sequence motifs on these molecules.

120 | 10 A Single DNA Molecule Barcoding Method with Applications in DNA Mapping

Page 141

Individual genomic DNA molecules are labeled with fluorescent dyes at specific
sequence motifs. The sequence-specific labeling starts with introducing nicks in
dsDNA at specific sequence motifs recognized by nicking endonucleases, which
cleave only one strand of a dsDNA substrate [20]. DNA polymerase then incorporates
fluorescent dye terminators at these nicking sites (Figure 10.1a). Currently, there are
six commercially available nicking endonucleases (New England Biolabs) with
recognition sequence motifs ranging from three to seven bases long.
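
The expected label positions for a given nicking endonuclease can be predicted in silico by scanning both strands of the target sequence for the recognition motif. The sketch below is illustrative only: the motif assigned to Nb.BbvC I (CCTCAGC) is an assumption that should be checked against the supplier's documentation, the toy sequence is invented, and the function names are hypothetical.

```python
# Assumed recognition sequence for Nb.BbvC I; confirm against the supplier's data sheet.
NB_BBVCI_MOTIF = "CCTCAGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_all(sequence, pattern):
    """0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        start = sequence.find(pattern, start + 1)
    return positions

def motif_sites(sequence, motif):
    """Sorted (position, strand) pairs; '-' marks a match on the reverse strand."""
    sequence = sequence.upper()
    motif = motif.upper()
    sites = [(pos, "+") for pos in find_all(sequence, motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid double counting palindromic motifs
        sites += [(pos, "-") for pos in find_all(sequence, rc)]
    return sorted(sites)

# Invented toy sequence with one site on each strand.
toy = "AAACCTCAGCTTTGGGGCTGAGGAAA"
print(motif_sites(toy, NB_BBVCI_MOTIF))
```

Positions are reported as 0-based offsets of the motif start on the forward strand, so a map built this way can be compared directly with label distances measured from one end of the molecule.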

The labeled DNA molecules are stretched into linear form and imaged on a
modified glass surface. The distribution of the sequence motifs recognized by the
nicking endonuclease can be established with great accuracy. With this approach, we
constructed sequence motif maps of the lambda phage, a strain of human adenovirus,
and several strains of human rhinoviruses. Because of the simplicity of this mapping
strategy (single DNA molecule analysis, high accuracy, and potential for high
throughput), it will likely find applications in DNA mapping, medical diagnostics,
and especially in rapid identification of microbial pathogens.

Sequence Motif Maps of Lambda DNA

Lambda DNA is used as a model system to construct a sequence motif map.
Figure 10.2a shows the distribution of its seven nicking endonuclease Nb.BbvC I
recognition sites. The solid black line represents the backbone of the lambda DNA
and the black arrows indicate the positions of the predicted Nb.BbvC I sites. Two
images were taken and superimposed to produce a composite picture of the DNA
molecules. Figure 10.2b is a false-color two-channel composite image showing the
stretched DNA contours (YOYO in blue) and labeled sites (Tamra-ddUTP in green).
Three DNA molecules are nearly fully stretched (A, B, and C) with contour lengths of
19.8, 19.5, and 16.9 µm, respectively. Although the data suggest that DNA molecules
A and B are overstretched at 0.41 nm/bp, compared to the solution conformation of
0.34 nm/bp, this may be due to the effect of YOYO staining [21]. The rest of the DNA
molecules are either broken or folded back onto themselves, giving lengths much
shorter than predicted. There are also occasional Tamra dye signals (green) not
associated with the DNA backbone. These are most likely the result of either
fluorescent impurities on the coverslip or free Tamra-ddUTP. DNA fragments A
and B in Figure 10.2b have four Tamra labels (green) along the DNA backbone, and
DNA fragment C has three green labels. The signal for two of the green labels (red
arrows) is much stronger and occupies more pixels than that corresponding to a
single fluorescent dye, indicating that several green labels have clustered together
and cannot be resolved because of the diffraction limit of the instrument. The two
clusters most likely correspond to the predicted sites 2, 3, and 4 and sites 5 and 6, as
they are separated by no more than 1000 bp. Accordingly, the seven Nb.BbvC I sites of
lambda DNA are collapsed to four resolvable sites, with the middle two signals
stronger than the outer two signals. The distances between the labels were calculated
with respect to the DNA backbone starting from the top right end of the DNA
backbone. The positions of four green labels on DNA molecule A starting from the

10.3 Single DNA Molecule Mapping | 121
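
The stretch values and the collapse of seven sites into four signals discussed on this page can be reproduced with a little arithmetic. The sketch below assumes a lambda genome length of 48,502 bp, a commonly cited value that the text does not state. The site positions are invented purely to show the grouping rule and are not the real Nb.BbvC I coordinates; the function names are hypothetical.

```python
LAMBDA_GENOME_BP = 48_502  # assumed genome length, not given in the text

def stretch_nm_per_bp(contour_um, length_bp=LAMBDA_GENOME_BP):
    """Average rise per base pair for a fully imaged molecule."""
    return contour_um * 1000.0 / length_bp

def group_sites(positions_bp, max_gap_bp=1000):
    """Merge neighboring sites whose spacing is at most max_gap_bp."""
    clusters = []
    for pos in sorted(positions_bp):
        if clusters and pos - clusters[-1][-1] <= max_gap_bp:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters

# Contour lengths reported for molecules A, B, and C (in micrometers).
for name, contour in [("A", 19.8), ("B", 19.5), ("C", 16.9)]:
    print(name, round(stretch_nm_per_bp(contour), 3), "nm/bp")

# Invented site positions (bp), chosen only to illustrate the grouping rule.
hypothetical_sites = [8000, 12700, 13300, 14100, 31000, 31800, 41000]
clusters = group_sites(hypothetical_sites)
print([len(c) for c in clusters])
```

Under these assumptions the three molecules come out at roughly 0.41, 0.40, and 0.35 nm/bp, and the seven invented sites merge into four groups of one, three, two, and one label, which is the same pattern as the two strong middle signals flanked by two weaker ones.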

Page 279

polymerase chain reaction (PCR) 10, 30, 31, 45, 58, 60, 109
– amplicon 44, 141
– amplification process 31, 43, 63, 108
– amplified templates 33
– based ultradeep sequencing 47
– chamber 156
– inhibiting molecules 156
– primers 34, 44, 124, 125, 176
– primer sites 64
– products 8, 143
– reagents 58
– sequencing analysis 184
polymer networks 156
polymorphic alleles tagged 125
– direct haplotype determination 127
– localization of 125
primer extension reactions 147
protein–DNA interaction 39, 201, 203, 204, 205, 209, 212
– mapping of 201
– sites 212
protein–nucleic acid interactions 21, 26
protein–protein interactions 203
prototype technique, see plus and minus method

Q-PCR, see Northern blot

rapid amplification of cDNA ends (RACE) reactions 230
RDBMS model 86
reaction chamber setup 137–139
real-time DNA sequencing 97, 100
real-time PCR method 62
repressor element-1 silencing transcription factor (REST) 208
restriction enzyme 235
– BsmFI 235
– DpnII 235
– NlaIII 235
– Sau3A 235
reverse transcription 124
rhinovirus genomes 123
ribosomal RNA band 231
RNA-degrading RNases 231
RNA-induced silencing complex (RISC) 217
RNA splicing patterns 130
RNA virus 124

SABE, see serial analysis of binding elements
SAGE, see serial analysis of gene expression

Sanger DNA sequencing 5, 6, 10, 50, 91, 133, 153, 156
– approaches 133
– basics of 3
– capillary electrophoresis 133
– dideoxy-based tag sequencing 203, 204, 205
– instruments 36
– limitation and opportunities 7
– method 4, 10, 90, 117, 153
– principle of 4
– metagenomic libraries 183
SeqID Hit method 82
SeqMan genome assembler (SMGA) 91
sequence assembly tools 21
sequence-based expression profiling 226
sequence-based techniques 40
sequence searching, strategies 80
sequence-specific probes 119
sequencing-by-synthesis (SBS) 133, 206
sequencing factories 5
sequencing library 63
sequencing technology 3, 15, 69, 109
– ultrahigh-throughput 69
serial analysis of binding elements (SABE)

serial analysis of gene expression (SAGE) 113, 230
– library 235
– protocol 231
– technique 204, 237
short-read sequencing technology 211
shotgun cloning 10
signal-to-noise (S/N) ratio 146
silica sol-gel monolith, see photopolymerized

single fluorescent dye molecules 125
single-molecule sequencing technologies

single-nucleotide polymorphisms (SNPs) 3, 8, 72, 89
single-pair fluorescent resonance energy transfer (spFRET) 133
SMART technology (Clontech) 221
SMGA assembly projects 91
Smith–Waterman alignment algorithm 210
sodium dodecyl sulfate (SDS) 159
Solexa sequencing 233–235
SOLiD system 29, 30, 35–38, 40
– applications 35
– library generation 30–31
– overview of 29
– performance of 29, 33
– technology of 29, 39
SSAHA-based methods 84

Index | 259

Page 280

tag-based sequencing 21, 36, 167
– platforms 167
tag-based transcriptome analysis methods 229, 238
tag-based transcriptome profiles 235
Tamra dye signals 121
total internal reflectance fluorescence (TIRF)
– microscopy 118
– system 120
transcriptional elements 202
transcription factor-binding elements 39
transcription factor binding sites (TFBS) 167, 168, 169, 173, 177
– genome mapping 167, 168
transcription regulatory circuits 173
transcription regulatory networks 179
transcription start sites (TSS) 168
transcriptome 23, 229
– high-resolution map 24
transcriptome analysis methods 23, 37, 113, 170, 229, 230
– DNA microarrays 230
– serial analysis of gene expression (SAGE)

transcriptome profiling 238
transcriptome sequencing 19
transient entanglement coupling (TEC)

transmission electron microscopes (TEMs)
– analysis 104
– based sequencing 109
– image 104
– instrument 107
– sequencing technology 110
– substrate 106, 107
– technology 111
– visualization 107
T7 exonuclease 148

ultradeep sequencing 47
ultrahigh-speed DNA sequencing 97

viral genomes 123
VisiGen's core technology 101
– approach 100

Western blotting methods 39
whole genome shotgun (WGS) 5, 6

xenon lamp 34

yeast telomeric heterochromatin 204

Z-labeled nucleotides (Z-dNTPs) 107
Z-modified nucleotides (Z-dNTPs) 104, 105,

ZS genetics (ZSG) 103, 111
Z-substituted DNA molecules 103
Z-tagged nucleotides 108

260 | Index
