
Title: Genome structure and function part 1
Description: Fully typed and clear (colour-coded) concise notes on the zoology and biochem second year module C12SFG at the University of Nottingham, but should cover relevant topics for other courses, modules and unis. Covers: DNA structure and organisation DNA replication with E. coli Contents and organisation of genomes Bacterial transcription Eukaryotic transcription RNA processing Transcription factors and their DNA binding forms Control of transcription with E. coli Control of gene expression in eukaryotes See part 2 for: Gene cloning DNA libraries PCR DNA sequencing Translation in eukaryotes Manipulating gene expression (miRNA, siRNA, shRNA, long non-coding RNA)



GENETICS 2018
DNA structure and organisation ✓✓
At the end of this lecture you should be able to:



Describe the essential structural parameters of the DNA double helix
o DNA is right handed – the right zig is over the left zag
o The strands have opposite polarity and run 5’ to 3’
o It has minor and major grooves – different atoms of the bases can be accessed in each groove, and most of the sequence specific binding information is in the major groove
...

The helix has approximately 10
...
3o rotation per base (axial twist)
...

Remember when drawing DNA that it is right
10
...
Can form due to
alternating purine-pyrimidine sequences such as GCGCGC
...

There are various forms of displacement of base pairs, with the Z axis being the axis of
the helix
...

Do not confuse propeller twist (between bases) with the axial twist of the helix (34
...
This is how
transcription factors recognise specific sequences and control gene expression
...

It is important to remember that all parameters are on average – the width and depth of
grooves, DNA form etc are all distorted by the sequence context
...
The DNA is naturally slightly
negative, and the binding proteins are normally positive
...

Sequence specific interactions are by a network of hydrogen
bond donors and acceptors, between the bases and the
amino acid side chains
...

DNA can be bent before proteins bind
...
There are lots of online modelling tools which work from structure and residues
...
It is also
unusual in that it binds in the minor groove
...
Architectural proteins
bind and bend DNA
...
They are also important in
eukaryotes
...
L = twist + writhe (L is the linking number)
Supercoiling is introduced by adding or subtracting extra turns
Twist and writhe are often simplified by leaving out the innate number of twists and showing the DNA as parallel lines
Twist – can be positive (added in the same direction as the DNA turns) or negative (‘undoes’ one of the DNA turns)
...

Writhe – twists turn into writhe, where the loop coils on itself (supercoiling). Writhe will be negative if the twists were negative (forming a right handed supercoil), and vice versa
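The bookkeeping behind L = twist + writhe can be sketched in a few lines of Python. This is only an illustration: the 10.4 bp per turn figure for relaxed DNA, the 5200 bp plasmid and the idea of "superhelical density" are assumptions added here, not values taken from the lecture.

```python
# Sketch of linking-number bookkeeping for a closed circular DNA.
# ASSUMPTION: relaxed B-DNA has about 10.4 bp per helical turn.
BP_PER_TURN = 10.4


def relaxed_linking_number(n_bp: int) -> float:
    """Number of turns in the relaxed molecule (all twist, no writhe)."""
    return n_bp / BP_PER_TURN


def linking_number(twist: float, writhe: float) -> float:
    """L = twist + writhe."""
    return twist + writhe


def superhelical_density(lk: float, lk0: float) -> float:
    """Fractional change in linking number; negative means underwound."""
    return (lk - lk0) / lk0


# Hypothetical 5200 bp plasmid with 25 turns removed.
lk0 = relaxed_linking_number(5200)      # about 500 turns
lk = lk0 - 25                           # negative supercoiling
twist = lk0                             # suppose the twist stays at its relaxed value
writhe = lk - twist                     # the deficit then shows up as negative writhe
sigma = superhelical_density(lk, lk0)   # about -0.05
```

Because L cannot change without cutting a strand, any turns removed have to appear as a change in twist, in writhe, or in a mixture of the two.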
...
4bp per turn

(Figure: right (R) handed and left (L) handed coiling.)
Although topology usually refers to
closed circular DNA, linear DNA can also
be supercoiled if it is anchored at both
ends
...
Most
supercoiling takes the form of
plectonemic or toroidal writhe nodes,
however toroidal nodes in naked DNA
need a protein core to wrap around
...
This
helps condense the DNA (eg for chromosome condensation)
A few bacteria have reverse gyrase that adds positive supercoils (or twists) to the DNA
...
When nucleosomes are removed, toroidal (or negative)
writhes will be formed


DNA replication with E
...
The DNA polymerase moves along the template strand from its 3’ end to its 5’ end (adding nucleotides to the new strand in the 5’ to 3’ direction)
Both antiparallel strands are replicated (so both act as template strands)
Nucleosides are nucleotides which lack the phosphate
group
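The directionality described above can be shown with a short Python sketch. The function name and the example sequence are made up for illustration; the only biology assumed is standard Watson-Crick pairing.

```python
# Sketch: the template is read 3'->5' while the new strand grows 5'->3'.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesise_new_strand(template_5_to_3: str) -> str:
    """Return the new strand, written 5'->3'.

    The template is given 5'->3', so it is walked in reverse
    (that is, read from its 3' end towards its 5' end).
    """
    new_strand = []
    for base in reversed(template_5_to_3):   # polymerase moves 3'->5' on the template
        new_strand.append(COMPLEMENT[base])  # each nucleotide joins the 3' end of the new strand
    return "".join(new_strand)


new = synthesise_new_strand("ATGC")
```

Running the function on its own output gives back the original template, which is another way of seeing that both antiparallel strands can act as templates.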
...
This is isoenergetic (no ATP
used)

(Figure: new nucleotides are added at the 3’ end of the growing strand.)


Describe the direction and initiation of DNA replication, and polymerisation

To start replication the strands need to be separated, and an RNA primer needs to be synthesised, from which DNA synthesis can start (from its 3’ end)
DNA polymerases have to have a 3’ end to add nucleotides to
...

DnaA polymerises along the length of the DNA, with accessory
proteins (like IHF) that help bend the DNA
...

Torsional stress and ATP hydrolysis gives the energy to separate
the strands, and DnaB helicase (a doughnut shaped protein) is
loaded (by DnaC) onto the strands to prepare the way for the
polymerase by opening the replication fork
...

Single-stranded DNA binding proteins (SSB) stop stem-loop structures from
forming
...
This is achieved by polymerase III being a dimer – this means
that the lagging strand has to loop around so that synthesis is still 5’ to 3’
• In eukaryotes the leading and lagging strands are replicated by different polymerases (a heterodimer, not a homodimer like the bacterial one)
The β sliding clamp
• This is a ring shaped protein which encircles the DNA and holds the polymerase on, so
that it does not fall off
...
With this system replication can go half way
around the genome without the need to restart
...

• In bacteria helicase is loaded around the lagging strand (needs to be loaded as circular
– clamp also has to be loaded but can fit dsDNA) and cannot fall off
...

o This does not include τ (tau) which acts as a scaffold which other components are mounted onto (like the DNA polymerase dimer)
o The clamp loader is also not shown, which assembles and loads new sliding clamps at the start of each Okazaki fragment
...
The helicase is
a hexameric ring which encircles the leading strand in eukaryotes but the lagging strand in
prokaryotes
...

Most of the components have different names in eukaryotes – eg the β clamp is PCNA
(proliferating cell nuclear antigen) in eukaryotes
Higher eukaryotic replication is much more complicated, with 218 proteins in the
replication complex
Eukaryotic replication is much slower (60bp/s as opposed to 1000bp/s), but has multiple
origins of replication (which are not fixed, unlike bacteria)
...
coli genomes are only 4Mbp (4 million base pairs), whereas a human’s is 3000Mbp
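A quick calculation with the figures quoted in these notes shows why eukaryotes need multiple origins. The sketch below assumes two forks moving in opposite directions from each origin and evenly spaced origins; the figure of 1000 origins is an arbitrary illustrative number, not one from the lecture.

```python
# Sketch: rough replication times from the rates and genome sizes in the notes.
def replication_time_s(genome_bp: int, fork_rate_bp_s: float, n_origins: int = 1) -> float:
    """Time to copy the genome, ASSUMING evenly spaced bidirectional origins."""
    forks = 2 * n_origins  # two forks leave each origin
    return genome_bp / (forks * fork_rate_bp_s)


ecoli_s = replication_time_s(4_000_000, 1000)                       # 2000 s, about 33 minutes
human_one_origin_s = replication_time_s(3_000_000_000, 60)          # 25,000,000 s, about 289 days
human_many_origins_s = replication_time_s(3_000_000_000, 60, 1000)  # 25,000 s, about 7 hours
```

With a single origin the slower eukaryotic fork would take most of a year to copy a human-sized genome, so spreading the work over many origins is what makes the timing workable.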
...
Correspondingly E
...

Viral and prokaryotic genomes are relatively simple and mostly code for proteins
...

Therefore, for mitosis and meiosis the chromosomes are super condensed
...
Multicellular organisms need 10,000 – 20,000 genes, requiring about 100-200Mbp
of coding DNA
...




The C-value gives the size of the haploid genome
Organisms may have a larger variation in their genome because the environment or the
organism can tolerate it, which is not necessarily true for complex multicellular organisms
As a general rule, more complex organisms have more repeats
...
Therefore some
single cell organisms have tiny genomes (with amoeba as an exception)
Low coding densities indicate a lack of environmental or sexual selection

Describe genome features (nucleosomes, centromeres, telomeres…)



Regions of DNA wind around histones to form chromatin
...

Bacteria do not have centromeres,
and we’re still not sure how they
segregate
...

S
...
pombe has a ‘regional’ centromere of 40-100Kb which
contains blocks of repeated units
Higher eukaryotes have regional centromeres of 400-5000Kb, which can be larger than the entire E
...
The lagging strand is more
of an issue as there is always a last bit with no primer which is lost
...
Only about 1000 copies of this are present, and they are absent in the
somatic cells, leading to senescence

Describe the different types of simple DNA repeats

Most of non-coding DNA is repeated sequences, which can be classified into families:
o Tandem repeats – can be a tandem cluster →→ or inverted →
...





Mini satellites are simple repeats of 10-60bp (which can form clusters)
repeated 5-50 times
...

• Micro satellites can be 1-6bp long repeated 5-50 times
...
It is very unlikely
that someone would have the same repeats as another at all loci
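As a rough illustration of what a micro satellite looks like in sequence data, the Python sketch below scans a string for a 1-6bp unit repeated at least 5 times in tandem. The function name and test sequence are invented for the example, and the pattern sets no upper limit on the number of copies.

```python
import re

# A unit of 1-6 bases (shortest unit tried first) followed by at least
# 4 more copies of itself, giving 5 or more tandem copies in total.
MICROSATELLITE = re.compile(r"([ACGT]{1,6}?)\1{4,}")


def find_microsatellites(seq: str):
    """Return (start, repeat_unit, copy_number) for each tandem repeat found."""
    hits = []
    for match in MICROSATELLITE.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


hits = find_microsatellites("GGT" + "CA" * 6 + "TTG")
```

Comparing the copy number at the same locus between two people, and then repeating that over many loci, is the idea behind using these repeats to tell individuals apart.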
...
They are used by viruses to insert their genetic
material into the host genomes
...
The repeated sequences are
dispersed and nonadjacent
...

• Transposons are DNA sequences that can move within the genome and increase in
number, creating or negating mutations and changing the genome size and content
...

They have played a large part in the genomes of most higher organisms
...
The retrotransposons are more common in higher organisms
• Discovered by Barbara McClintock (awarded the Nobel Prize for this work in 1983)
...
She also showed that telomeres do not act like breaks, and described the
nucleolus
...
Histone genes are often
highly repeated (whether clustered or scattered) as they are in high demand
...
Or they can
have multiple (200 plus) pseudogenes which look something like the protein-coding gene but are non-functional
Pseudogenes on the same chromosome near the functional gene may have frame shifts,
nonsense mutations, or lack promoter signals
...
These are probably generated by reverse
transcription of mRNA and reintegration into a new chromosome (possibly following
retroviral infection at some point)
...
Normally in genes encoding the RNA components of
protein manufacture – ribosomes and tRNAs
...
Eg histone genes
...

The two copies of the gene are often adjacent to one another immediately following
duplication, so it is thought that the duplication frequently results from inexact repair of
double-strand chromosome breaks
...
Therefore most
duplications are followed by loss-of-function mutations in one or the other gene
...



An alternative fate for gene duplications is for both copies to remain functional, while
diverging in their sequence and pattern of expression and taking on different roles
...


Bacterial transcription ✓✓
At the end of this lecture you should be able to:



Detail enzymes involved in transcription



Transcription is catalysed by a DNA-dependent RNA polymerase
Synthesis starts de novo – does not require a primer, occurs 5’ → 3’, and only uses one
strand of the DNA
One PPi (pyrophosphate) is released for each nucleotide added/polymerised, leaving a
free 3’OH group to which further nucleotides (NTPs) are added
...

DNA lacks a 2’ OH group which makes it more stable than RNA
In eukaryotes, protein coding genes are transcribed into mRNA by RNA pol II, ribosomal
DNA genes into rRNA by RNA pol I, and tRNA genes and 5S rDNA genes into tRNA and 5S rRNA by RNA pol III
...

The core complex is made up of two α
subunits, a β subunit and a β’ (beta prime)
subunit
...

The polymerase active site is between the beta subunits
...
This forms a holoenzyme (whole
enzyme)
...

Using different sigma factors changes the genes transcribed
...


Describe the process of transcription and sigma factor binding and removal

The RNA polymerase binds to DNA and moves along it
...
This is called an open complex formation
...

Sigma70 is the standard sigma factor, and recognises the -35 and -10 (Pribnow box, the bacterial counterpart of the eukaryotic TATA box)
consensus sequences
...

RNA polymerase interacts with the promoter via the sigma factor
...
100%
matches are not found in DNA
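How close a real promoter comes to the consensus can be expressed as a simple match count. The sketch below assumes the commonly quoted sigma70 consensus hexamers TTGACA (-35) and TATAAT (-10); the example promoter is made up, and real promoter strength also depends on spacing and surrounding sequence, which this ignores.

```python
# ASSUMPTION: textbook sigma70 consensus hexamers.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def matches(site: str, consensus: str) -> int:
    """Count positions where the site agrees with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must be the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus))


def promoter_score(minus35: str, minus10: str) -> float:
    """Fraction of the 12 consensus positions that match."""
    total = matches(minus35, CONSENSUS_35) + matches(minus10, CONSENSUS_10)
    return total / 12


# Hypothetical promoter with 5/6 matches at -35 and 4/6 at -10.
score = promoter_score("TTTACA", "TATGTT")
```

A score of 1.0 would be a perfect consensus promoter, which is the case the notes say is not found in real DNA.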
...

o There are four binding stages before transcription is initiated:
Complex 1: Non specific DNA binding – to scan for promoters
Complex 2: RPc1 – -35 interactions
Complex 3: RPc2 (closed complex) – -35 and -10 interactions
ISOMERISATION – DNA changes shape to form the open complex
Complex 4: RPo (open complex) – strands of DNA separate from -12 to +3
o DNA is bent at this stage as some promoters have intrinsic bends – isomerisation relieves the stress
o Strands of DNA separate in the open complex, giving the RNA polymerase (RP) access to the template strand to synthesise RNA
...
Some promoters are sensitive to the
level of supercoiling
...
Having
‘escaped’ the promoter, the RNA pol can begin elongation
...

This can all be shown on gels, with lots of short aborted transcripts, and then mostly only full
length chains
...
coli there are two methods of transcription termination:
Rho-independent – the normal form in bacteria
...
The stem-loop binds to the
polymerase complex and stalls the elongation, leading to
termination
...

• The high concentration of bases to the polymerase
causes it to change shape, releasing the transcript
...
It chases and displaces the RNA
polymerase, terminating transcription
...
It also has a low affinity for RNA which is being translated, so binds
to the 3’ UTR of RNA
...
Only happens in polycistronic (multi-gene transcript) mRNA, and affects all later
genes
...
Both enzymes are very
similar to RNA polymerase II, with several of the core
enzyme subunits identical in all three eukaryotic RNA
polymerases
...

TBP is a common feature across all three, to initiate
transcription
...

RNA polymerase I binds to a promoter containing a core
promoter element and an upstream control element (UCE)
...

Genes transcribed by RNA polymerase III have an unusual
promoter structure: some of the key promoter elements are
located downstream of the transcription start site
...

Promoter recognition by RNA polymerase III is mediated by TBP – in TFIIIB - mirroring
promoter recognition by polymerases I and II
...

Chemical modifications consist
mostly of base and sugar methylation,
and pseudouridine formation (‘fake’ uridine,
no one is sure why)




Describe the basal transcription factors needed at all RNA pol II promoters



RNA Pol II doesn’t have anything like sigma factors,
which can recognize the promoter and unwind the DNA
double helix
...
The RNA Pol II is associated with six
general transcription factors, designated as TFIIA, TFIIB,
TFIID, TFIIE, TFIIF and TFIIH, where "TF" stands for
"transcription factor" and "II" for the RNA Pol II
...
Some proteins are found in
more than one complex, which is important for
regulation and flexibility (combinatorial regulation)
Some proteins have enzymatic activity, eg TFIIH has a
kinase and two helicases which help Pol II initiate
transcription and escape the promoter
...

Recognition of the TATA box by TFIID is followed by an
orchestrated series of steps with various binding and
removal of transcription factors, ending in the elongation phase
The diagram is simplified, missing many TAFs (TATA binding
protein associated factor), but shows that there is a specific
temporal progression towards a higher order complex

Discuss post- and co-transcriptional RNA processing events

Processing can be co-transcriptional or post transcriptional
...

Reminder that bacteria don’t process transcripts
mRNA processing has 4 parts:
• Cap formation – adds a 7-methylguanosine to the 5’ end of
the primary transcript, soon after initiation
...

Bacteria have something similar: a triphosphate on
their 5’ end
...
Interactions between
them produce an RNA loop on either side of the cleavage
site
...
POST
Splicing – involves two trans-esterification
reactions
...
Intron and
exon boundaries are defined by consensus sequence
...
These promote productive interaction with the nuclear pore, and
the mRNA is actively exported out through nuclear pore complexes
...
Many processing factors are carried on the
phosphorylated CTD (C-terminal domain of RNA pol II) next to the RNA transcript exit site –
ready to attach to the new RNA sequence
...
The
CTD of bacterial RNA pol can also interact with the CAP protein (the cAMP receptor), which is
loosely similar to RNA pol II
...
Intronless genes include α-interferon, ubiquitin, histones, human
cytochrome c, and many yeast genes (fewer
introns in S
...
p)
Alternative splicing allows the control of
expression
...



Transcription factors and their DNA-binding domains ✓✓
At the end of this lecture you should be able to:



Describe the five major DNA-binding domains in transcription factors

Superclass 1:

Basic Domains

Leucine zipper factors (bZIP), helix-loop-helix factors (bHLH), helix-loop-helix / leucine zipper, NF-1, RF-X, bHSH



α helix of TF fits into the major groove of the DNA
...




These two use slightly different scaffolds to hold the recognition helices in the
correct position relative to the DNA binding site
• Eg
...

This is the most common type, and is a small, fast evolving domain
Knuckle formation provides two ligands (a ligand is an ion/molecule attached to a metal atom) – a common combination
The zinc acts to stabilise the scaffold, not to interact with DNA
As the zinc fingers are modular, the H bonding of each new finger adds a little more specificity, until it can recognise
unique sequences
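A back-of-envelope Python sketch shows how many modular fingers it might take before a binding site is expected to be unique in a genome. It assumes each finger reads about 3bp and that the genome is a random sequence read on one strand only; both are simplifications added here, and the genome sizes are the round figures used in these notes.

```python
import math

BP_PER_FINGER = 3  # ASSUMPTION: roughly 3 bp recognised per zinc finger


def min_site_length(genome_bp: int) -> int:
    """Smallest site length n (in bp) with 4**n >= genome size."""
    n = 0
    while 4 ** n < genome_bp:
        n += 1
    return n


def fingers_needed(genome_bp: int) -> int:
    """Fingers needed to cover a site of that minimum length."""
    return math.ceil(min_site_length(genome_bp) / BP_PER_FINGER)


human_site = min_site_length(3_000_000_000)    # 16 bp
human_fingers = fingers_needed(3_000_000_000)  # 6 fingers
ecoli_fingers = fingers_needed(4_000_000)      # 4 fingers
```

The point is only the scaling: every extra finger multiplies the number of distinguishable sites by about 4 cubed, so a handful of fingers is enough in principle.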
...
TFIIIA (below) controls transcription of the 5S rRNA gene
...


Superclass 3:

Helix-turn-helix (HTH)

Homeo domain, paired box, fork head / winged helix, heat shock factors, tryptophan clusters, TEA domain




Found in prokaryotes (eg trp repressor) and eukaryotes (eg homeodomain)
Two α helices held at a fixed angle – the carboxy terminal helix is the recognition
helix and docks in the major groove
• An alpha helix is a good fit in the major groove of DNA, where the sequence specific binding sites are available (H bonds)
• HTH proteins often bind as symmetrical dimers, binding one turn apart
• Binding as a dimer increases specificity and affinity of binding – if one lets go, the other prevents the protein from diffusing away
...

• Famous example is the Hox proteins for body plan
• The eukaryotic homeodomain is similar to the bacterial HTH, but has 3 helices
...

• Eg
...
NF-κB p65 homodimer (for oncogenes)
Superclass 5:
Other transcription factors


Copper fist proteins, HMGI(Y), Pocket domain, E1A-like factors, AP2/EREBP-related factors






HMG domains are eukaryotic architectural proteins that bind DNA with little or no
sequence specificity
...

Normally DNA is bent towards the protein, not away
Eg
...
coli ✓✓
At the end of this lecture you should be able to:



Describe the role of different
sigma factors in controlling gene
expression

Cells don’t express all of their
genetic information all of the
time
...
Genes are regulated by
the availability of alternative sigma factors which recognise different promoters (giving
different transcripts)
o These alternatives may be encoded by an invading virus
o In the host cell a change in conditions (eg low resources) triggers sigma factors to switch on
the relevant suite of genes
o Phage T7 encodes its own RNA polymerase which recognises its own promoters (an
extreme version of having your own sigma factors)
o Repressible system – anabolic (synthesis), eg
...
Tryptophan) has to bind to the apo-repressor
in order to change the repressor shape so that it can bind and
repress the gene (stopping synthesis)
o Inducible system – catabolic (breakdown), eg
...
Lactose or IPTG) has to bind the
repressor so that it changes shape and releases the DNA
o Both ligands which bind the repressors are called effectors
o Both above examples use sigma70
o The tryptophan operon (repressible) is a
common example
...
The bound transcription factor (and corepressor)
prevents the RNA polymerase from binding
...

The lac repressor only represses the lactose operon in the
absence of its effector
...
This
means that if the repressor dissociates it is ready to transcribe straight
away
...
It aids the recruitment of RNA
polymerase in the absence of glucose (cell prefers glucose to lactose, so only uses
a low level of lactose in the presence of glucose)
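The combined effect of the lac repressor and CAP can be written as a small truth table. The sketch below is a simplified Python model of the logic described in these notes; the labels "off", "low" and "high" are illustrative, not measured expression levels.

```python
# Sketch of lac operon logic as described in the notes (simplified).
def lac_operon_expression(lactose: bool, glucose: bool) -> str:
    """Return an illustrative expression level for the lac operon."""
    repressor_on_operator = not lactose  # repressor only lets go when its effector is bound
    cap_recruiting = not glucose         # CAP aids polymerase recruitment without glucose
    if repressor_on_operator:
        return "off"
    return "high" if cap_recruiting else "low"


table = {
    (lactose, glucose): lac_operon_expression(lactose, glucose)
    for lactose in (False, True)
    for glucose in (False, True)
}
```

Only the combination of lactose present and glucose absent gives full expression, which matches the cell preferring glucose when both sugars are available.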
...

In reality it’s a little more complex, because there are three operator sites
...
These dimers can block the CAP protein from binding




Describe examples of non-consensus sigma70 promoters (mercury response)



Sigma70 promoters are always active (unless repressed), however in the mercury
response, the promoter is inactive because it differs from the consensus
sequence
...
It can affect the bendability of
the DNA between -35 and -10 if it is out of phase, and help separate the
strands for open complex formation
...
It controls the
genes needed for when nitrogen is in short
supply
...
When a polymerase
binds to the perfect sigma54 promoter sequence, it cannot form an open complex until an
activator binds (whereas in sigma70 if it binds it transcribes)




In the nitrogen starvation response, the activator is NtrC
...
This
means that it can interact with a site on the polymerase and use ATP hydrolysis to
promote a shape change, allowing transcription initiation
...
If it is far enough away it can
loop due to Brownian (random) motion
...
Any longer and the DNA can bend (although IHF may still be required)

Discuss the mechanisms of action in activator proteins



Is cooperativity enough to explain activation by activator proteins?
In the case of CAP yes, experiments show that any extra bond made by the polymerase
aids recruitment (harder to dissociate, increases the concentration so more likely to bind)
– CAP itself has no specific action to aid recruitment
With the sigma54 nitrogen response and sigma70 mercury response no – additional
signals are transmitted by the activator proteins
With sigma54, the NtrC transmits a conformational change to the RNA polymerase
With sigma70, the MerR causes a conformational change to the DNA
This gives three distinct methods of activation:
1 Increased reactant concentration (via tethering)
2 Conformational change to the polymerase
3 Conformational change to the promoter DNA

Describe the λ repressor as a dual-function activator/repressor

The bacteriophage lambda repressor is an example of a dual
activator and repressor protein
...

The repressor binding at OR1 and OR2 stimulates its own
production
...
If there is too much repressor, it binds at the
lower affinity OR3, inhibiting more repressor production (in a negative feedback loop)
Lambda cI repressor binding is cooperative at OR1 and OR2, and also at
the left and right operator regions
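The dual role of the lambda repressor can be summarised as a lookup on which operator sites are occupied. This Python sketch only encodes the two statements in these notes (OR1 plus OR2 stimulates repressor production, OR3 inhibits it); the "basal" label for every other case is an assumption made for the example.

```python
# Sketch: lambda repressor acting on its own production (simplified).
def repressor_production(occupied_sites: set) -> str:
    """Illustrative state of repressor synthesis for a set of occupied OR sites."""
    if "OR3" in occupied_sites:
        return "inhibited"    # excess repressor reaches the low affinity site
    if {"OR1", "OR2"} <= occupied_sites:
        return "stimulated"   # repressor activates its own production
    return "basal"            # ASSUMPTION: label for all remaining cases


low_repressor = repressor_production({"OR1", "OR2"})
high_repressor = repressor_production({"OR1", "OR2", "OR3"})
```

Written this way the negative feedback loop is easy to see: more repressor eventually fills OR3, which switches the state from stimulated to inhibited.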
...
Either it forms two stem-loop structures (like
A), the second of which causes premature termination, or (in B)
termination can be prevented if a protein binds to the mRNA at site 1,
allowing 2 and 3 to pair
...
The attenuator sequence has four (leader sequence) domains, where 3 can bind with 2 or 4
...
If there’s plenty then the ribosome can translate it, reaching the polymerase faster, allowing 3 and 4 to form a terminator
...

In this example the ribosome is a negative regulator – if it catches
up to RNA polymerase in time it gets in the way of the pairing
between 2 and 3, allowing 3 and 4 to pair and form a
transcriptional terminator
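The attenuation decision described above comes down to which pair of leader regions forms. The Python sketch below is a minimal model of that choice; the function name and return format are invented for the example, and the real outcome depends on timing rather than a simple yes or no.

```python
# Sketch of the attenuator decision (simplified to a yes/no input).
def attenuator_outcome(ribosome_keeps_up: bool):
    """Return (regions_paired, transcription_continues)."""
    if ribosome_keeps_up:
        # The ribosome gets in the way of 2:3 pairing, so 3 pairs with 4.
        return ("3-4", False)  # 3-4 is a terminator: transcription stops
    # A slow ribosome leaves region 2 free to pair with 3.
    return ("2-3", True)       # no terminator forms: transcription carries on


plenty = attenuator_outcome(ribosome_keeps_up=True)
starved = attenuator_outcome(ribosome_keeps_up=False)
```

This is why the ribosome counts as a negative regulator here: the faster it translates the leader, the more likely transcription is to terminate early.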
...
Eg
...
The nut site only binds the N protein once it has been transcribed into RNA
(N is an RNA binding protein)
...

N is an anti-terminator, which slowly increases in
concentration, meaning more and more of the late
early genes can be transcribed
Viral proteins are either early or late proteins
depending on when they are expressed
...
The early
genes initiate replication of the genome and expression of the late genes
...
RNA polymerase II has a much more
complex array of factors because it transcribes different sets of protein-coding genes in different cells
at different times
...
They were discovered in D
...
melanogaster it forms a fly’s eye (not the donor spp
...

Eukaryotic transcription factors often have separate DNA binding and transcriptional
activating domains
...
The DNA binding
regions are often basic (net positive charge) whilst
the activation domains are often acidic (net negative
charge)




Understand that transcription factors are modular (control of Gal1 in yeast)



GAL1 is positively regulated by galactose, and negatively
regulated by glucose
...
This in turn binds Gal80 in the absence of
galactose, which represses transcription
...
This makes this system
more similar to recruitment and tethering by CAP than
activation by NtrC and MerR
Other proteins are recruited at induction and repression –
mostly histone modifiers, and nucleosome remodelling machines
Higher-order complexes may contain even more additional
components like the example on the right (don’t learn)

Describe the role of histone modifications involved in gene
expression control

Chemical modifications of histone tails control many properties, like
condensation, accessibility, and the processivity of transcription
Transcription alters the pattern of histone modification, and the
modifications alter the transcribability of the chromatin
...
This forms the concept of the histone code
...

The primary signals must come from protein-DNA interactions, which are then relayed to the
DNA by signalling cascades
...



Discuss chromatin condensation and chromatin remodelling in gene expression
o Chromatin remodelling machines are used for transcription (these include
histone modifiers, and nucleosome remodelling machines (NMM))
o Histone octamers are eight-protein complexes found at the centre
of nucleosomes, with two copies of each of the four main histones
o SWI/SNF is an NMM which binds to chromatin and uses ATP to
mobilise nucleosomes
...

This can activate or repress genes
o Chromatin condensation can make DNA completely inaccessible to
transcription factors, regulating gene expression
...
It
keeps the gene amounts in male and female mammals equal
...
This shows that they can
permeate to some degree
o Gene activation is accompanied by chromatin decondensation, and
vice versa
...
The bands visibly activate (puff up)
...



