Homework 2009
Assigned Sept. 3, due Sept. 10
1. Problem 1.2 from “Genetics and Molecular Biology”.
First, determine the concentration of H+ ions inside of a cell at pH 7:
pH = -log[H+]
7 = -log[H+]
10^-7 M = [H+]
Then, we can estimate the volume of a typical bacterial cell to be ~1 cubic micrometer. This is equal to 1 femtoliter (10^-15 L).
(10^-7 mol/L)*(10^-15 L) = 10^-22 moles
(6.0221X10^23 ions/mole)*(10^-22 moles) = ~60 ions
That’s not very many ions, especially when you consider that a bacterial cell has ~250,000 proteins (from Goodsell paper) that are pH sensitive!
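The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `ions_in_cell` and the default volume of 1 femtoliter are illustrative choices, not part of the textbook problem.

```python
AVOGADRO = 6.0221e23  # ions per mole, the value used in the answer above


def ions_in_cell(ph, volume_liters=1e-15):
    """Number of free H+ ions in a cell of the given volume at the given pH."""
    concentration = 10 ** (-ph)            # mol/L, from pH = -log10[H+]
    moles = concentration * volume_liters  # mol of H+ in the cell
    return moles * AVOGADRO


print(round(ions_in_cell(7.0)))  # roughly 60 ions
```

Changing the pH by one unit changes the count tenfold, which shows how few ions separate one pH value from the next in a volume this small.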
2. What is the osmotic pressure in a typical bacterial cell? The cell wall of an E. coli bacterium is one huge covalently connected molecule. Although not part of the problem, you might want to think about how such a molecule grows and also how close the force of osmotic pressure comes to rupturing the cell wall. (Note that evolution is very unlikely to have evolved a cell wall that is orders of magnitude stronger than it needs to be.)
Osmotic pressure (P) in a bacterial cell is proportional to the molarity (M) inside of the cell, the temperature (T) that the cell is exposed to, and the gas constant (R):
P = MRT
We know R (.0821 L-atm/mol-K) and T (310 K = 37 C), so we need to determine M.
A typical bacterial cell has an intracellular salt concentration of 0.15 M. From the Goodsell paper, we know that a bacterial cell has about 250,000 proteins. This is 4.15X10^-19 moles, and 4.15X10^-4 M in concentration (assuming a volume of 1 femtoliter). You can see that the protein concentration is small relative to the salt concentration. You can take into account the nucleic acid, sugar, and lipid contributions toward the total cellular molarity, but for simplicity let’s ignore them because it seems that the salt concentration is the major component.
P = (.15 M + .000415 M)*(.0821 L-atm/mol-K)*(310 K)
P = ~4 atm
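As a check, P = MRT can be evaluated directly from the protein count and the cell volume. The sketch below uses the numbers quoted in this answer (0.15 M salt, 250,000 proteins, 1 femtoliter, 310 K); the function names are illustrative. Computed this way, the protein molarity comes out near 4 x 10^-4 M and the pressure close to 4 atm.

```python
AVOGADRO = 6.0221e23  # molecules per mole
R = 0.0821            # gas constant, L-atm/(mol-K)


def molarity_from_count(n_molecules, volume_liters):
    """Convert a number of molecules in a given volume to mol/L."""
    return n_molecules / AVOGADRO / volume_liters


def osmotic_pressure_atm(molarity, temp_kelvin=310.0):
    """Osmotic pressure from P = MRT, in atmospheres."""
    return molarity * R * temp_kelvin


protein_M = molarity_from_count(250_000, 1e-15)  # proteins in a 1 fL cell
total_M = 0.15 + protein_M                       # salt plus protein
pressure = osmotic_pressure_atm(total_M)

print(f"protein molarity: {protein_M:.2e} M")
print(f"osmotic pressure: {pressure:.1f} atm")
```

The salt term dominates: dropping the protein contribution entirely changes the result by well under one percent.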
Assigned Sept. 8, due Sept. 15
1. Problem 2.6 from “Genetics and Molecular Biology”.
Deleting 5 base pairs will reduce the twist by about 0.5 turns, and thus increase the writhe by 0.5. This will cause the supercoiling pattern to run slightly further on a gel compared to the original DNA. Using this principle, you could make a series of deletions, each one base pair longer than the last, and compare the supercoiling pattern of each to the original DNA. When there is no difference between the migration pattern of the original DNA and the experimental DNA, the deletion has removed one full helical turn, so the number of base pairs deleted from the experimental DNA is equal to the helical repeat (base pairs per turn).
2. Problem 2.18.
Correction: The image on the left is correct. Apparently there are many inaccurate representations of DNA floating around the web and in textbooks. However, you can see that the left image is correct by looking at an atomic resolution structure of DNA:

Because the question was graded incorrectly, everyone will receive full
credit for the problem.
Assigned Sept. 10, due Sept. 17
1. As promised in lecture, the problem is to draw several of the DNA-Polymerase structures at a DNA replication fork over the course of the synthesis of one Okazaki fragment and the beginning of the next.
For a good representation of this, see the following website. An important feature to note is that the polymerases on the leading and lagging strands are tethered together via the helicase and primase. This causes the lagging strand to loop out like the slide of a trombone:
http://www.mcb.harvard.edu/Losick/images/TromboneFINALd.swf
2. What are several of the obstacles that interfere with
the progression of a replisome and how are these obstacles
surmounted? This will require a bit of a literature search (I
hope you are motivated to use Google to find out a little bit more
about a few of the topics that are mentioned in each of the
lectures. This will correct misimpressions and help tie things
together. In this case, I am explicitly asking you to find the
answer in the literature, which will require a little searching.)
The papers you find likely will be quite technical and difficult to
read, but you should be able to extract the essential mechanism nature
uses to solve the problem.
Two of the major obstacles encountered by DNAP are DNA lesions and other proteins (like RNAP). When DNAP encounters a lesion, it “skips” over the lesion; the clamp loads past the lesion, and a new RNA primer is laid down by primase so that DNA synthesis can continue past the lesion. When DNAP encounters RNAP, RNAP falls off of the DNA so that DNAP can continue synthesis. The mRNA that was being transcribed is retained and used as a new RNA primer.
Assigned Sept. 15, due Sept. 22
1. Find a recent
publication that describes the use of RNA polymerase and then trace
back through the inevitable chain of linked papers until you find a
description of the assay that was used to monitor steps of the
protein's purification. Briefly describe the assay.
There were a lot of different answers to this, and
the methods differed depending on the year of the publication. One
of the first reports of purifying RNAP was from 1965, where polymerase
was purified from bovine lymphosarcoma tissue by fractionating the
cellular lysate throughout the course of several ammonium sulfate
precipitation steps. The activity of the enzyme was
measured by following the incorporation of radiolabeled rNTPs into
polymers. (Furth
JJ and Ho P. 1965. The enzymatic synthesis of ribonucleic acid in
animal tissue. Journal of Biological Chemistry, 240(6) 2602-7).
2. Problem 4.11
There are two RNAP binding sites at this promoter, one that is strong and one that is weak. Protein A can bind to the strong site. In Protocol 1, A binds to the strong binding site, allowing RNAP to bind to the weaker site and initiate transcription at the maximal rate. In Protocol 2, RNAP binds to the strong site and has trouble leaving the promoter. Eventually, when RNAP leaves the promoter, A will bind at the strong site and prevent RNAP from binding there again. This explains the slow increase in transcriptional rate in the second experiment. To confirm this, one could do ChIP on RNAP to determine if it’s bound to two different places on the DNA in the presence and absence of protein A.
Assigned Sept. 17, due Sept. 24
1. Suppose the gene for the sigma-70 were fused to the gene for
the beta subunit of RNAP such that a single fused protein product is
synthesized in vivo and that the linker region in the fusion protein is
sufficiently long as to allow the sigma-70 portion to function normally
in the core RNAP. What are the likely physiological consequences
of such a fusion?
RNAP will only associate with sigma-70, which activates housekeeping genes. If the cell undergoes stress due to heat shock or nitrogen starvation, for example, it will not be able to turn on the genes that respond to that stress, because the sigma factors that regulate these processes will not be able to associate with the core polymerase. In addition, sigma usually dissociates from polymerase after transcript elongation begins, and transcript stalling has been associated with sigma failing to dissociate at the proper time. Fusing sigma to the core polymerase may increase the rate of transcript stalling because sigma-70 will tend to stay associated with the core polymerase more often than if the proteins were not fused.
2. Suppose that you hypothesize that the speed of transcription
of RNAP decreases as the length of the transcript grows, and therefore,
the elongation rate of an RNAP molecule is the lowest on a gene
just before the polymerase terminates transcription. What experiment
would you do to test this idea? The simpler and easier the
experiment, the better.
A simple experiment would be to use an in vitro transcriptional assay and determine the rate of RNA synthesis using radiolabeled rNTPs, much the same way that you can determine the rate of DNA synthesis as described on September 10 in lecture. You can measure the rate of elongation on genes that vary in length (as a control, the DNA sequence of the longer genes can be repeats of the sequence of the shorter genes). If the rate of transcription of the long genes is slower than the rate for the short genes, then the result supports your hypothesis.
Assigned Sept. 22, due Sept. 29
1. Problem 5.6
By simultaneously adding radiolabeled uridine and rifamycin, RNA molecules that are currently being transcribed during the time of the addition will be labeled. No new RNA synthesis will occur. You could then harvest the cells at timepoints and run the RNA on a gel. Over time, you will see a decrease in the radioactivity for the RNA bands on the gel. The time it takes for half of the radiolabeled RNA to disappear is the half-life of the RNA.
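One way to extract the half-life from such a time course is to fit a straight line to the logarithm of the band intensity against time. The sketch below assumes the RNA decays exponentially (first-order decay), which the problem does not state, and the band intensities are made-up numbers chosen only to illustrate the calculation.

```python
import math


def half_life(times, intensities):
    """Least-squares fit of ln(intensity) vs. time; returns ln(2) / decay rate."""
    logs = [math.log(y) for y in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
             / sum((t - mean_t) ** 2 for t in times))
    return -math.log(2) / slope  # slope is negative for a decaying signal


# Hypothetical band intensities (arbitrary units) at 0, 2, 4, 6, 8 minutes.
times = [0, 2, 4, 6, 8]
intensities = [1000, 500, 250, 125, 62.5]

print(half_life(times, intensities))  # minutes
```

Fitting all of the timepoints at once is less sensitive to noise in any single lane than reading the 50% point off the gel by eye.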
2. Problem 5.8
Virusoids are essentially small, circular pieces of single-stranded RNA that require a helper virus in order to replicate. Virusoids use a “rolling circle” method of replication with the help of an RNA-dependent RNA polymerase provided by the helper virus. This method of replication results in a long strand of RNA with head-to-tail repeats of the virusoid genome. To complete the lifecycle, the virusoid must self-cleave the individual copies of its genome apart, and the RNA must then be able to self-ligate to reform the original circular structure. The virusoid must do these things in order to be able to reproduce itself and properly package itself within the helper virus.
Assigned Sept. 24, due Oct. 1
1. How are proteins labelled with biotin such that the biotin can
still be bound by avidin?
Several conditions must be met for a biotin-labeled protein to remain able to bind avidin. First, the labeled amino acids must be on the surface of the protein. Second, the protein must be attached to biotin at the carboxylic acid end of the biotin, because the ring structure is the part of biotin that interacts with avidin. Finally, the biotin ring structure must be able to reach the binding site on avidin, which is tetrameric and can bind up to four biotin molecules. For this to happen, the ring structure must be sufficiently far away from the protein surface. You can achieve this by linking biotin to long amino acid side chains, like that of lysine, or by lengthening the carbon chain that connects the ring structure to the carboxylic acid in biotin.
2. Suppose DNA of length equal to that of the human chromosome
possessed a random sequence. How long must a "test" sequence be to have
a 50% chance of being present in the long random DNA? What is the
relevance of this question to gene silencing?
The average chromosome is about 160 Mbp long. At any one position, a nucleotide of the test sequence has a ¼ chance of matching the chromosome, so a test sequence of n bases matches at a given site with probability 1/4^n, and the expected number of matches in the chromosome is (160000000)/(4^n). You can set this equal to 0.5:
0.5 = (160000000)/(4^n)
Solving for n gives ~14 bases in order for there to be a ~50% chance of annealing. For RNA silencing, cells need to use oligonucleotides that are longer than this in order to reduce the chance of silencing off-target genes.
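The equation can be solved numerically, and the result compared with a slightly more careful estimate. The sketch below first solves 0.5 = N/4^n for n, then computes the probability of at least one match under a Poisson approximation (independent sites, a single strand, every base equally likely). That approximation is an added assumption, not part of the original answer.

```python
import math

GENOME_BP = 160_000_000  # chromosome length used in the answer above


def length_for_expected_matches(expected, genome_bp=GENOME_BP):
    """Solve expected = genome_bp / 4**n for n."""
    return math.log(genome_bp / expected, 4)


def prob_present(n, genome_bp=GENOME_BP):
    """Poisson estimate of P(at least one match) for a test sequence of n bases."""
    return 1.0 - math.exp(-genome_bp / 4 ** n)


print(length_for_expected_matches(0.5))
for n in (13, 14, 15):
    print(n, round(prob_present(n), 2))
```

Both approaches put the 50% point at about 14 bases, and the probability falls off quickly with each added base.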
Assigned Sept. 29, due Oct. 8
1. You are to learn how to download and examine a protein
structure from the Protein Data Bank. If you do not already have
suitable software running on your computer, I suggest either VMD or
PyMol (search with Google, download, and install). Tutorials are
provided at the sites, but you can expect to spend a few hours learning
how to use either program. Download the file 2arc.pdb from the
protein databank. By examination of the structure, predict the
consequences of the double mutation changing residues 151 and 161 to
lysine.
The protein in 2arc is a dimer, with a coiled coil structure that
dimerizes the protein. Residues 151 and 161 are both leucine,
which is relatively small and hydrophobic. Their side chains
point inward towards the dimerization interface, and residue 151 on one
monomer lies in close proximity to residue 161 on the other
monomer. Changing both 151 and 161 to lysine will introduce long,
positively charged side chains into the dimerization interface.
There will be charge repulsion between these residues, weakening the
ability of the protein to dimerize.
2. Last week the university handed out bottles of hand
sterilizing solution. Below is a picture of one such bottle
(labels removed for clarity). What is interesting about this?
You know from experience that bubbles in honey of the size shown in the
picture rise to the top in perhaps 10 seconds. The bubbles in the
picture have been in place for minutes at least, and since the bottles
have been lying around the university for a week now, we know the
bubbles have stayed in place for at least a week. You also know
that honey is much too viscous to be dispensed with a hand pump, and
anything much more viscous could not even be gotten out of the
bottle. So, what is going on? There seems to be a discrepancy
between the likely viscosity as indicated by the bubbles that don't
rise and the fact that the solution is to be dispensed. Next, when you
look more closely, you see that there are no larger bubbles. Why
not? These observations should induce you to take a bottle and play
with it a little. What you find is that the "liquid" is like a
solid up to a certain point. When the force on it exceeds a
certain value, the solid "breaks down" and becomes almost like
water. If you shake a bottle, you can form bigger bubbles.
These rise to the top and disappear in a second or two, but the smaller
bubbles remain stuck in place forever. If you try to pump some of
the stuff out of the bottle, it readily pumps like water. In your
hand, it regels and doesn't run off. Presumably, these strange
viscosity properties make it easier to use the stuff. This
problem was assigned to illustrate something about being a
scientist. You want to be super observant. You will never
know when something strange will lead to an unexpected and important
discovery, but you have to notice something strange first. It may
be subtle. Be observant, curious, and critical.

Assigned Oct. 1, due Oct. 13
1. By finding and reading the relevant papers briefly describe
how nature can build proteins that utilize additional amino acids
beyond the canonical 20.
Nature can do this through the use of special tRNAs and codons that are
read in a context-dependent manner. For example, selenocysteine
is an amino acid whose codon is UGA, normally a stop codon. When
the UGA is located upstream of a SECIS element in the mRNA, which forms
a distinct secondary structure, selenocysteine is incorporated.
In the absence of selenium the UGA is read as a stop codon.
Pyrrolysine is another example, and its incorporation into proteins
works by a similar mechanism.
2. Problem 7.8.
An explanation for this is that the mutation lies in the anticodon of
the corresponding tRNA. If a nucleotide is added or deleted from
the anticodon of the tRNA, then the mRNA would be read in frame by the
translational machinery. You could sequence the tRNA to confirm
that this is true.
Assigned Oct. 8, due Oct. 15
1. Problem 8.6.
A single nucleotide change could inactivate an entire chromosome if the
mutation occurs in the X chromosome and X inactivation occurs
aberrantly on both X chromosomes in a female. It is thought that
X inactivation occurs when a non-protein coding RNA, Xist, binds to an
X chromosome. In theory, aberrant inactivation of both X
chromosomes could occur if there is a single mutation in Xist that
allows it to bind tightly to the chromosome, or if the mutation is in
the Xist promoter such that Xist is overexpressed.
2. Problem 8.14.
The markers should be placed in the following order: 25413
You'll note that the numbers don't agree perfectly with one
another. This could suggest that recombination is not directly
correlated with the length of DNA separating the markers; for example,
perhaps there are certain DNA sequences that promote or reduce the
likelihood of recombination.
Assigned Oct. 13, due Oct. 20
1. Problem 9.12, and in addition, correct the associated figure,
which is a little bit wrong.
You can conclude that this restriction enzyme is a blunt cutter, and is
unusual in that it cuts the DNA 3 base pairs away from its binding
site. The binding sequence for this enzyme is GCGCGC. The
problem with the figure is that there should be faint bands of DNA in
all lanes, even in the lanes that reveal the binding sequence, because
of the way that the enzyme cleaves the DNA.
Most people said that the problem with the figure is that the lanes
should read G, A+G, C, C+T for Maxam-Gilbert sequencing. I gave
credit for this because it is true, but this is irrelevant to the
problem.
2. Problem 9.15.
From cells that contain the methylase gene you could ligate DNA
fragments into a cloning plasmid carrying the ampicillin resistance
gene. Transform into cells, selecting for ampicillin resistance.
Then prepare plasmid DNA from the transformed cells and digest with the
purified restriction enzyme. Plasmids surviving the digestion must have
been appropriately methylated by the methylase which they themselves
carry, and hence are resistant to the restriction enzyme. It's a good
idea to overexpress the methylase first because otherwise the
restriction enzyme will cleave the chromosomal DNA and kill the cell.
Assigned Oct. 15, due Oct. 22
1. Look up and then describe with a few drawings how the
QuikChange mutagenesis procedure works. Despite appearances, why
is it not PCR?
There is a good description of how QuikChange mutagenesis works in the
Stratagene manual, which is here:
http://www.stratagene.com/manuals/200518.pdf
The product of one cycle of QuikChange mutagenesis is a nicked
plasmid. Because of this nick, the product cannot be amplified,
so all amplification occurs off of the original template plasmid
instead of on the template and product, like in PCR. Thus,
QuikChange mutagenesis results in a linear increase in product over
time, whereas in PCR the product amplification is exponential.
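The difference between linear and exponential accumulation is easy to see numerically. The sketch below assumes perfect efficiency in every cycle and a single starting template; the cycle numbers are arbitrary illustrations, not values from the Stratagene protocol.

```python
def linear_product(cycles, templates=1.0):
    """Product made when only the original template can be copied (QuikChange)."""
    return templates * cycles


def exponential_product(cycles, templates=1.0):
    """Copies present when every product also serves as a template (ideal PCR)."""
    return templates * 2 ** cycles


for n in (1, 5, 10, 18):
    print(n, linear_product(n), exponential_product(n))
```

After 18 cycles the linear reaction has made 18 product strands per template, while ideal PCR would have produced more than 260,000 copies.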
2. Suppose you wanted to perform DNaseI footprinting to determine
the binding site of a DNA binding protein, but instead of radioactively
labeling the DNA and doing gel electrophoresis and autoradiography, you
wanted to make use of a capillary electrophoresis machine that normally
is used for DNA sequencing. How would you do it?
This would be very similar to doing gel electrophoresis and
autoradiography. You could use a primer that is labeled on its 5'
end with a fluorophore and amplify your sequence of interest.
Then, you could digest the DNA with DNaseI plus and minus your protein
of interest, wash off the protein, and do capillary electrophoresis of
your two DNA samples using the fluorophore as a signal. You can
compare your two samples, plus a chromatogram of your DNA's sequencing
reaction, to determine the binding site for the protein.
Assigned Oct. 20, due Oct. 27
1. Many Europeans are strongly opposed to genetically modified,
GM, foods. What are they objecting to, and does the objection
have any scientific merit?
There are a lot of answers to this question. One objection is
that GM foods might have a genetic advantage and thus would be bad for
the environment because they could outcompete WT plants for
resources. Another objection is that the consumption of GM foods
could harm humans because these foods have "unnatural"
properties. Yet another objection is that genetically modifying
organisms is playing God and is thus unethical. These are
reasonable causes for concern, but so far science has yet to
substantiate these claims.
2. To identify specific residue-base interactions between a
DNA-binding protein and its target DNA sequence, a residue thought to
be in direct contact with the DNA was mutated to alanine. How can
you identify the base that the residue in the wild type protein
contacts? Note that standard chemical protection experiments will
not work because protection over the full binding site occurs when the
protein binds.
There are two ways of doing this. One method is called missing
contact probing, and involves lightly depurinating/depyrimidinating the
DNA such that about one base per DNA duplex is altered.
Alternatively, one could systematically mutate every base in the
binding site, one at a time. Then, after the DNA is altered or
mutated, one could run gel shift assays to determine the affinity of WT
and mutant protein for the altered DNA. Bases that are in direct
contact with the wild type and mutant protein will show a reduction in
the DNA binding affinity of each protein. However, the base that
contacts the mutated residue will show a reduced binding affinity in
the wild type protein, but there will be no further reduction in
affinity in the alanine mutant.
Assigned Oct. 22, due Oct. 29
1. It is statistics time. We could do some dorky problem
like "If you sequence a genome to a depth of 5x, what is the
probability that a particular sequence is not sequenced at all?"
Instead, we will do something different. Look up and learn what
variance of a probability distribution measures. In the case of a
Poisson distribution, it can be shown that the variance equals the
average. Using the previously mentioned property, tell how to determine
how many photons per pixel detector element in a digital camera it
takes to produce a signal, say 50% gray. (This may strike you as pretty neat.
For example, the same principle can be used to determine the charge of
an electron without ever making measurements on single electrons.)
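You can take a picture of something gray, like a gray wall. Then,
using an image processing program, you can read out the intensity of
the signal at each pixel and calculate the mean and the variance for
your data set. For the photon counts themselves, the variance is equal
to the mean. The pixel values, however, are photon counts multiplied by
an unknown scale factor (the gain of the camera), so the mean scales
with the gain while the variance scales with the gain squared. The gain
cancels in the ratio (mean)^2/variance, which is therefore the number of
photons per pixel it takes to produce a 50% gray signal.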
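The estimate can be tried out on simulated data. In the sketch below, the photon number (100) and the gain (1.28 gray levels per photon) are hypothetical values chosen for the simulation, and the detector is assumed to add no noise of its own. The estimator only sees the scaled pixel values, never the photon counts.

```python
import math
import random


def poisson(mean, rng):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def photons_per_pixel(pixel_values):
    """Estimate the mean photon count as mean^2 / variance of the pixel values."""
    n = len(pixel_values)
    mean = sum(pixel_values) / n
    variance = sum((v - mean) ** 2 for v in pixel_values) / (n - 1)
    return mean ** 2 / variance


rng = random.Random(0)
TRUE_PHOTONS = 100  # hypothetical photons per pixel at 50% gray
GAIN = 1.28         # hypothetical gray levels per photon
pixels = [GAIN * poisson(TRUE_PHOTONS, rng) for _ in range(10000)]

estimate = photons_per_pixel(pixels)
print(estimate)  # should land near TRUE_PHOTONS
```

In a real camera, read noise and pixel-to-pixel sensitivity differences add variance that does not come from photon statistics, so the simple ratio tends to underestimate the photon number.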
2. In pyrosequencing, a next generation sequencing method, a DNA
fragment is PCR amplified in a small droplet that contains the PCR
reagents and a small bead. How are the amplified DNA fragments attached
to the bead and what trick is used to keep both ends from being attached?
You can amplify the DNA using a set of PCR primers where only one of
the primers is biotinylated. Then, your ssDNA will attach to a
streptavidin-coated bead in only one orientation.
This works if you already know the sequence of part of the DNA because
you know which primer to use. If you wish to sequence something
larger, like a chromosome, without knowing the sequence beforehand you
need to use something a little more complicated. First, shear
your DNA of interest such that the DNA has blunt ends. Then, you
can ligate on a ~40 bp double-stranded DNA fragment that contains a
known sequence to which you can bind sequencing primers. This
DNA, call it an adapter DNA, has biotin on its 5' end, allowing it to
attach to a streptavidin bead. Additionally, the 5' end has a
small overhang, whereas the 3' end is blunt. The DNA will ligate
to the adapter only on the blunt 3' end of the adapter, giving you
directionality.
Assigned Oct. 27, due Nov. 3.
1. How does the number of sites in the human genome at which a
polymerase is bound and ready to transcribe compare with the number of
genes which is determined by analysis of the genome sequence?
The paper from the Lis lab (assigned reading) says that roughly 30% of
genes have a stalled polymerase in a region proximal to the
promoter. This doesn't give the total number of genes at which a
polymerase resides at any given instance, but this is still a lot of
genes (there are roughly 25,000 genes in the human genome).
2. What is your best hypothesis as to the reason for the
existence of the oppositely oriented polymerases at eukaryotic
promoters?
The Lis paper also goes into detail for this. Note that this
question is not asking about genes that can be transcribed in both
directions; that does happen, but it is a different
issue. Rather, this question refers to the finding in the Lis
paper that some polymerases transcribe a short ways in the opposite
direction of the gene, producing small RNAs that don't code for protein.
It has been hypothesized that these small RNAs can be involved in RNA
interference. Also, a transcribing polymerase in one direction
leaves a wake of supercoiling in the opposite direction, and this
supercoiling could help a correctly positioned polymerase form the open
complex. Or perhaps this type of action helps transcriptional
regulators become deposited at the promoter. No one knows for
certain (yet!).
Assigned Oct. 29, due Nov. 5.
1. Use the sequence of any tRNA and use a dot matrix approach to
determine all possible Watson-Crick base pairs. The cloverleaf
base pairing should be present. Is it the secondary structure
with the maximum number of base pairs? Comment.
Most people generated dot matrices that found the cloverleaf base
pairing. It does seem that this produces the maximum number of
base pairs.
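A dot matrix of this kind takes only a few lines to generate. The sketch below marks every position where two bases can form a Watson-Crick pair and then counts consecutive pairs along an anti-diagonal, which is how a stem shows up in the matrix. The sequence is a short hypothetical hairpin, not a real tRNA, and the function names are illustrative.

```python
# Watson-Crick pairs only (no G-U wobble), as the problem specifies.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def dot_matrix(seq):
    """matrix[i][j] is True when base i can pair with base j."""
    return [[(a, b) in PAIRS for b in seq] for a in seq]


def stem_length(matrix, i, j):
    """Count consecutive pairs (i, j), (i+1, j-1), ... along an anti-diagonal."""
    length = 0
    while i < j and matrix[i][j]:
        length += 1
        i += 1
        j -= 1
    return length


seq = "GGGAAAACCC"  # hypothetical hairpin, not a real tRNA
matrix = dot_matrix(seq)
for row in matrix:
    print("".join("*" if hit else "." for hit in row))
print(stem_length(matrix, 0, len(seq) - 1))
```

For a real tRNA, the four cloverleaf stems appear as four runs of dots on anti-diagonals. Comparing them against the other runs in the matrix is one way to approach the question of whether the cloverleaf maximizes the number of base pairs.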
2. The namesake problem. Determine that protein in the
known databases which is most closely related to you (as defined by
taking the amino acid sequence defined by your name and place of birth
deleted of the letters not corresponding to the single amino acid
abbreviations). Use blastp at the NCBI web site.
Most people found sequences that matched the sequence of their
protein. This shows that you have to be really careful when
deciding whether or not two sequences are homologous!
Assigned Nov. 3, due Nov. 12.
1. AraC loops to repress the pBAD promoter with the binding of
one subunit to a half-site called O2 and one subunit to the half-site
I1. Upon the addition of arabinose, the subunit at I1 remains
there, and the subunit that was bound to O2 shifts and now binds to the
I2 half-site. The I1 and I2 half sites constitute what is called
the I site, and when bound at I1-I2 AraC activates transcription from
pBAD. The three half-sites, I1, I2, and O2 are similar, but not
identical, and in fact, AraC binds to I1 with an affinity (relative) of
1000, to O2 with an affinity of 100, to I2 with an affinity of 10, and
to random sequences with an affinity of 1. Predict the
consequences on expression of pBAD of replacing the I2 sequence with
the I1 sequence.
If the I2 site is mutated to I1, such that I1I1 is present at the
promoter instead of I1I2, then AraC is much more likely to bind to
adjacent sites (and not loop the DNA) than it normally would.
This means that pBAD expression will occur even in the absence of
arabinose.
2. Suppose you isolated mutants that express pBAD even though
arabinose or molecules of structure similar to arabinose are not added
(technical term is constitutive). Where in the components of the
regulation-transcription machinery could such mutations lie, and for
each different location you propose, also tell how the mutation brings
about constitutivity.
One way to produce a constitutive phenotype is to mutate I2 to I1, as
in the above problem. You could delete or mutate the O2 site such
that looping is prevented.
You could also make mutations in the AraC protein that cause
constitutivity. If you mutate or delete the N-terminal arm, then
the protein is unable to form the repressive conformation and will be
unable to loop the DNA, and by default this will lead to
transcriptional activation. You could also mutate regions in the
DNA binding domain that interact with the N-terminal arm, which would
also prevent the formation of the repressive state.
Assigned Nov. 5, due Nov. 17.
1. What are the consequences to regulation of the trp operon in E. coli if the ribosome is a little
tardy in binding to and starting translation of the leader
peptide? What is a simple way Nature could minimize this problem?
2. How do you knock out a gene in mice?
Assigned Nov. 12, due Nov. 19.
1. Problem 16.3.
2. Problem 16.5
Assigned Nov. 17, due Nov. 24.
1. Ultimately, what is it that transmits the information to a
developing mouse embryo of which way is right and which way is left?
2. Design a genetic switching circuit that is off until IPTG is
added, and once IPTG has been added, the circuit stays on whether or
not IPTG remains present.
Assigned Nov. 19, due Dec. 1.
(Last homework to be assigned in the course!)
1. The following represents the orientations of the four
GFP-family proteins mentioned in class. Let a black triangle
represent a site of action of a site specific recombinase in which
recombination between two triangles oriented in the same direction
deletes the intervening DNA, leaving just one triangle, and
recombination between two oppositely oriented triangles inverts the
intervening DNA. What arrangement of triangle(s) at sites 1, 2, and 3
will allow the following behavior: Genes A and B are deleted and
then the unit of C and D inverts repeatedly until the recombinase
disappears, at which time its orientation is stuck. Or, Genes C
and D are deleted and the unit of A and B inverts. In either
case, once the first deletion occurs, no additional deletion is
possible.
