Homework 2009

Assigned Sept. 3, due Sept. 10

1.  Problem 1.2 from “Genetics and Molecular Biology”.


First, determine the concentration of H+ ions inside of a cell at pH 7:

pH = -log[H+]

7 = -log[H+]

10^-7 M = [H+]

 

Then, we can estimate the volume of a typical bacterial cell to be ~ 1 cubic micrometer.  This is equal to 1 femtoliter (10^-15 L).

 

(10^-7 mol/L)*(10^-15 L) = 10^-22 moles

 

(6.0221X10^23 ions/mole)*(10^-22 moles) = ~60 ions

 

That’s not very many ions, especially when you consider that a bacterial cell has ~250,000 proteins (from Goodsell paper) that are pH sensitive!
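
The arithmetic above can be checked with a short script.  This is only a sketch under the same assumptions as the answer (pH 7, a cell volume of 1 femtoliter); the function name free_protons is made up for illustration.

```python
# Number of free H+ ions in a cell, from pH and cell volume.
AVOGADRO = 6.022e23  # ions per mole

def free_protons(ph: float, volume_liters: float) -> float:
    """Return the expected number of free H+ ions in the given volume."""
    molarity = 10.0 ** (-ph)          # [H+] in mol/L, from pH = -log10[H+]
    moles = molarity * volume_liters  # moles of H+ in the cell
    return moles * AVOGADRO

# 1 cubic micrometer = 1 femtoliter = 1e-15 L (the volume assumed above)
ions = free_protons(ph=7.0, volume_liters=1e-15)
print(f"{ions:.0f} ions")  # about 60
```

Changing the pH by one unit changes the count tenfold, so at pH 6 the same cell would hold roughly 600 free H+ ions.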


2.  What is the osmotic pressure in a typical bacterial cell?  The cell wall of an E. coli bacterium is one huge covalently connected molecule.  Although not part of the problem, you might want to think about how such a molecule grows and also how close the force of osmotic pressure comes to rupturing the cell wall.  (Note that evolution is very unlikely to have evolved a cell wall that is orders of magnitude stronger than it needs to be.)

Osmotic pressure (P) in a bacterial cell is proportional to the molarity (M) inside of the cell, the temperature (T) that the cell is exposed to, and the gas constant (R):

 

P = MRT

 

We know R (.0821 L-atm/mol-K) and T (310 K = 37 C), so we need to determine M.

 

A typical bacterial cell has an intracellular salt concentration of 0.15 M.  From the Goodsell paper, we know that a bacterial cell has about 250,000 proteins.  This is 4.15X10^-19 moles, or 4.15X10^-4 M in concentration (assuming a volume of 1 femtoliter).  You can see that the protein concentration is small relative to the salt concentration.  You can take into account the nucleic acid, sugar, and lipid contributions toward the total cellular molarity, but for simplicity let’s ignore them because it seems that the salt concentration is the major component.

 

P = (.15 M + .000415 M)*(.0821 L-atm/mol-K)*(310K)

 

P = ~4 atm
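
As a check, the calculation can be redone directly from the stated inputs (250,000 proteins in 1 femtoliter, 0.15 M salt, 310 K).  The sketch below is not a full model of the cell: it uses P = MRT with only the salt and protein terms, and the helper names are made up for illustration.  With these inputs the protein term comes to about 4X10^-4 M and the pressure to roughly 4 atm.

```python
AVOGADRO = 6.022e23   # molecules per mole
R = 0.0821            # gas constant, L-atm/(mol-K)

def molarity_from_count(n_molecules: float, volume_liters: float) -> float:
    """Convert a number of molecules in a given volume to mol/L."""
    return n_molecules / AVOGADRO / volume_liters

def osmotic_pressure_atm(total_molarity: float, temperature_k: float) -> float:
    """Osmotic pressure from P = MRT, in atmospheres."""
    return total_molarity * R * temperature_k

protein_molarity = molarity_from_count(250_000, 1e-15)   # 1 fL cell
pressure = osmotic_pressure_atm(0.15 + protein_molarity, 310.0)
print(f"protein: {protein_molarity:.2e} M, pressure: {pressure:.1f} atm")
```

The salt term dominates: dropping the protein term entirely changes the result by well under 1%.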



Assigned Sept. 8, due Sept. 15

1.  Problem 2.6 from “Genetics and Molecular Biology”.


Deleting 5 base pairs will reduce the twist by about 0.5 turn (roughly half of one helical repeat), and thus increase the writhe by about 0.5.  This will cause the supercoiled DNA to run slightly farther on a gel compared to the original DNA.  Using this principle, you could make successive single base pair deletions in the DNA and compare the supercoiling pattern to that of the original DNA.  When there is no difference between the migration rate of the original DNA and the experimental DNA, the number of base pairs deleted from the experimental DNA is equal to the helical repeat (the number of base pairs per turn).



2.  Problem 2.18.

Correction:  The image on the left is correct.  Apparently there are many inaccurate representations of DNA floating around the web and in textbooks.  However, you can see that the left image is correct by looking at an atomic resolution structure of DNA:

[Image: atomic resolution structure of DNA]

Because the question was graded incorrectly, everyone will receive full credit for the problem.

Assigned Sept. 10, due Sept. 17

1. As promised in lecture, the problem is to draw several of the DNA-Polymerase structures at a DNA replication fork over the course of the synthesis of one Okazaki fragment and the beginning of the next.


For a good representation of this, see the following website.  An important feature to note is that the polymerases on the leading and lagging strands are tethered together via the helicase and primase.  This causes the lagging strand to loop out like the slide of a trombone:

http://www.mcb.harvard.edu/Losick/images/TromboneFINALd.swf



2. What are several of the obstacles that interfere with the progression of a replisome and how are these obstacles surmounted?  This will require a bit of a literature search (I hope you are motivated to use Google to find out a little bit more about a few of the topics that are mentioned in each of the lectures.  This will correct misimpressions and help tie things together.  In this case, I am explicitly asking you to find the answer in the literature, which will require a little searching.)  The papers you find likely will be quite technical and difficult to read, but you should be able to extract the essential mechanism nature uses to solve the problem.

Two of the major obstacles encountered by DNAP are DNA lesions and other proteins (like RNAP).  When DNAP encounters a lesion, it “skips” over the lesion; the clamp loads past the lesion, and a new RNA primer is laid down by primase so that DNA synthesis can continue past the lesion.  When DNAP encounters RNAP, RNAP falls off of the DNA so that DNAP can continue synthesis.  The mRNA that was being transcribed is retained and used as a new RNA primer.



Assigned Sept. 15, due Sept. 22

1. Find a recent publication that describes the use of RNA polymerase and then trace back through the inevitable chain of linked papers until you find a description of the assay that was used to monitor steps of the protein's purification. Briefly describe the assay.

There were a lot of different answers to this, and the methods differed depending on the year of the publication.  One of the first reports of purifying RNAP was from 1965, where polymerase was purified from bovine lymphosarcoma tissue by fractionating the cellular lysate throughout the course of several ammonium sulfate precipitation steps.  The activity of the enzyme was measured by following the incorporation of radiolabeled rNTPs into polymers.  (Furth JJ and Ho P. 1965. The enzymatic synthesis of ribonucleic acid in animal tissue. Journal of Biological Chemistry, 240(6) 2602-7).



2.  Problem 4.11

There are two RNAP binding sites at this promoter, one that is strong and one that is weak.  Protein A can bind to the strong site.  In Protocol 1, A binds to the strong binding site, allowing RNAP to bind to the weaker site and initiate transcription at the maximal rate.  In Protocol 2, RNAP binds to the strong site and has trouble leaving the promoter.  Eventually, when RNAP leaves the promoter, A will bind at the strong site and prevent RNAP from binding there again.  This explains the slow increase in transcriptional rate in the second experiment.  To confirm this, one could do ChIP on RNAP to determine if it’s bound to two different places on the DNA in the presence and absence of protein A.



Assigned Sept. 17, due Sept. 24
1.  Suppose the gene for the sigma-70 were fused to the gene for the beta subunit of RNAP such that a single fused protein product is synthesized in vivo and that the linker region in the fusion protein is sufficiently long as to allow the sigma-70 portion to function normally in the core RNAP.  What are the likely physiological consequences of such a fusion?

RNAP will only associate with sigma 70, which activates housekeeping genes.  If the cell undergoes stress due to heat shocking or nitrogen starvation, for example, it will not be able to turn those genes on because the sigma factors that regulate these processes will not be able to associate with the core polymerase.  In addition to this, sigma usually dissociates from polymerase after transcript elongation begins, and transcript stalling has been associated with sigma failing to dissociate at the proper time.  Fusing sigma to the core polymerase may increase the rate of transcript stalling because sigma-70 will tend to stay associated with the core polymerase more often than if the proteins were not fused.



2.  Suppose that you hypothesize that the speed of transcription of RNAP decreases as the length of the transcript grows, and therefore, the elongation rate of an RNAP molecule is the lowest on a gene just before the polymerase terminates transcription.  What experiment would you do to test this idea?  The simpler and easier the experiment, the better.

A simple experiment would be to use an in vitro transcriptional assay and determine the rate of RNA synthesis using radiolabeled rNTPs, much the same way that you can determine the rate of DNA synthesis as described on September 10 in lecture.  You can measure the rate of elongation on genes that vary in length (as a control, the DNA sequence of the longer genes can be repeats of the sequence of the shorter genes).  If the rate of transcription of the long genes is slower than the rate for the short genes, then the result supports your hypothesis.

Assigned Sept. 22, due Sept. 29
1.  Problem 5.6

By simultaneously adding radiolabeled uridine and rifamycin, RNA molecules that are currently being transcribed during the time of the addition will be labeled.  No new RNA synthesis will occur.  You could then harvest the cells at timepoints and run the RNA on a gel.  Over time, you will see a decrease in the radioactivity for the RNA bands on the gel.  The time it takes for half of the radiolabeled RNA to disappear is the half-life of the RNA.
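
One way to turn the gel data into a half-life is to fit a straight line to the logarithm of the band intensity versus time.  The sketch below assumes simple first-order (exponential) decay, which the answer above does not state explicitly, and the intensities are hypothetical numbers chosen for illustration, not data from the problem.

```python
import math

def half_life(times, signals):
    """Least-squares fit of ln(signal) = ln(S0) - k*t; returns ln(2)/k."""
    logs = [math.log(s) for s in signals]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx            # equals -k for exponential decay
    return math.log(2) / -slope

# Hypothetical band intensities (arbitrary units) at 0, 1, 2, and 4 minutes
times = [0.0, 1.0, 2.0, 4.0]
signals = [1000.0, 707.0, 500.0, 250.0]
print(f"half-life ~ {half_life(times, signals):.2f} min")  # close to 2 min
```

Fitting all timepoints rather than reading off the single point where the signal has halved makes the estimate less sensitive to noise in any one band.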


2.  Problem 5.8

Virusoids are essentially small, circular pieces of single-stranded RNA that require a helper virus in order to replicate.  Virusoids use a “rolling circle” method of replication with the help of an RNA-dependent RNA polymerase provided by the helper virus.  This method of replication results in a long strand of RNA with head-to-tail repeats of the virusoid genome.  To complete the lifecycle, the virusoid must self-splice the individual copies of its genome apart, and the RNA must then be able to self-ligate to reform the original circular structure.  The virusoid must do these things in order to be able to reproduce itself and properly package itself within the helper virus.


Assigned Sept. 24, due Oct. 1
1.  How are proteins labelled with biotin such that the biotin can still be bound by avidin?

Several conditions must be met for biotin on a labeled protein to remain accessible to avidin.  First, the labeled amino acids must be on the surface of the protein.  Second, the protein must be attached to biotin at the carboxylic acid end of the biotin, because the ring structure is the part of biotin that interacts with avidin.  Finally, the biotin ring structure must sit far enough from the protein surface to interact with avidin, which is tetrameric and can bind up to four biotin molecules.  You can achieve this by linking biotin to long amino acid side chains such as that of lysine, or by lengthening the carbon chain that connects the ring structure to the carboxylic acid in biotin.



2.  Suppose DNA of length equal to that of the human chromosome possessed a random sequence. How long must a "test" sequence be to have a 50% chance of being present in the long random DNA?  What is the relevance of this question to gene silencing?

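The average chromosome is about 160 Mbp long.  At any particular position in the chromosome, a given nucleotide of the test sequence has a ¼ chance of matching, so a test sequence of n bases matches at a given position with probability 1/4^n.  Setting the expected number of matches along the chromosome to 0.5, you can set up an equation like this: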

 

0.5 = (160000000)/(4^n)

 

Solving for n gives ~14 bases in order for there to be a ~50% chance of annealing.  For RNA silencing, cells need to use oligonucleotides that are longer than this in order to reduce the chance of silencing off-target genes.



Assigned Sept. 29, due Oct. 8
1.  You are to learn how to download and examine a protein structure from the Protein Data Bank.  If you do not already have suitable software running on your computer, I suggest either VMD or PyMol (search with Google, download, and install).  Tutorials are provided at the sites, but you can expect to spend a few hours learning how to use either program.  Download the file 2arc.pdb from the protein databank.  By examination of the structure, predict the consequences of the double mutation changing residues 151 and 161 to lysine.

The protein in 2arc is a dimer, with a coiled coil structure that dimerizes the protein.  Residues 151 and 161 are both leucine, which is relatively small and hydrophobic.  Their side chains point inward towards the dimerization interface, and residue 151 on one monomer lies in close proximity to residue 161 on the other monomer.  Changing both 151 and 161 to lysine will introduce long, positively charged side chains into the dimerization interface.  There will be charge repulsion between these residues, weakening the ability of the protein to dimerize.

2.  Last week the university handed out bottles of hand sterilizing solution.  Below is a picture of one such bottle (labels removed for clarity).  What is interesting about this?

You know from experience that bubbles in honey of the size shown in the picture rise to the top in perhaps 10 seconds.  The bubbles in the picture have been in place for minutes at least, and since the bottles have been lying around the university for a week now, we know the bubbles have stayed in place for at least a week.  You also know that honey is much too viscous to be dispensed with a hand pump, and anything much more viscous could not even be gotten out of the bottle.  So, what is going on? There seems to be a discrepancy between the likely viscosity as indicated by the bubbles that don't rise and the fact that the solution is to be dispensed. Next, when you look more closely, you see that there are no larger bubbles.  Why not? These observations should induce you to take a bottle and play with it a little.  What you find is that the "liquid" is like a solid up to a certain point.  When the force on it exceeds a certain value, the solid "breaks down" and becomes almost like water.  If you shake a bottle, you can form bigger bubbles.  These rise to the top and disappear in a second or two, but the smaller bubbles remain stuck in place forever.  If you try to pump some of the stuff out of the bottle, it readily pumps like water.  In your hand, it regels and doesn't run off.  Presumably, these strange viscosity properties make it easier to use the stuff.  This problem was assigned to illustrate something about being a scientist.  You want to be super observant.  You will never know when something strange will lead to an unexpected and important discovery, but you have to notice something strange first.  It may be subtle.  Be observant, curious, and critical.
[Image: bottle of sterilizing solution]

Assigned Oct. 1, due Oct. 13
1.  By finding and reading the relevant papers briefly describe how nature can build proteins that utilize additional amino acids beyond the canonical 20.

Nature can do this through the use of special tRNAs and codons that are read in a context-dependent manner.  For example, selenocysteine is an amino acid whose codon is UGA, normally a stop codon.  When the UGA is located upstream of a SECIS element in the mRNA, which forms a distinct secondary structure, selenocysteine is incorporated.  In the absence of selenium the UGA is read as a stop codon.  Pyrrolysine is another example, and its incorporation into proteins works by a similar mechanism.

2.  Problem 7.8.

An explanation for this is that the mutation lies in the anticodon of the corresponding tRNA.  If a nucleotide is added or deleted from the anticodon of the tRNA, then the mRNA would be read in frame by the translational machinery.  You could sequence the tRNA to confirm that this is true.

Assigned Oct. 8, due Oct. 15
1.  Problem 8.6.

A single nucleotide change could inactivate an entire chromosome if the mutation occurs in the X chromosome and X inactivation occurs aberrantly on both X chromosomes in a female.  It is thought that X inactivation occurs when a non-protein coding RNA, Xist, binds to an X chromosome.  In theory, aberrant inactivation of both X chromosomes could occur if there is a single mutation in Xist that allows it to bind tightly to the chromosome, or if the mutation is in the Xist promoter such that Xist is overexpressed.

2.  Problem 8.14.

The markers should be placed in the following order:  25413

You'll note that the numbers don't agree perfectly with one another.  This could suggest that recombination is not directly correlated with the length of DNA separating the markers; for example, perhaps there are certain DNA sequences that promote or reduce the likelihood of recombination.

Assigned Oct. 13, due Oct. 20
1.  Problem 9.12, and in addition, correct the associated figure, which is a little bit wrong.

You can conclude that this restriction enzyme is a blunt cutter, and is unusual in that it cuts the DNA 3 base pairs away from its binding site.  The binding sequence for this enzyme is GCGCGC.  The problem with the figure is that there should be faint bands of DNA in all lanes, even in the lanes that reveal the binding sequence, because of the way that the enzyme cleaves the DNA.

Most people said that the problem with the figure is that the lanes should read G, A+G, C, C+T for Maxam-Gilbert sequencing.  I gave credit for this because it is true, but this is irrelevant to the problem.

2.  Problem 9.15.

From cells that contain the methylase gene you could ligate DNA fragments into a cloning plasmid carrying the ampicillin resistance gene.  Transform into cells, selecting for ampicillin resistance.  Then prepare plasmid DNA from the transformed cells and digest with the purified restriction enzyme.  Plasmids surviving the digestion must have been appropriately methylated by the methylase which they themselves carry, and hence are resistant to the restriction enzyme.  It's a good idea to overexpress the methylase first because otherwise the restriction enzyme will cleave the chromosomal DNA and kill the cell.

Assigned Oct. 15, due Oct. 22
1.  Look up and then describe with a few drawings how the QuikChange mutagenesis procedure works.  Despite appearances, why is it not PCR?

There is a good description of how QuikChange mutagenesis works in the Stratagene manual, which is here: http://www.stratagene.com/manuals/200518.pdf

The product of one cycle of QuikChange mutagenesis is a nicked plasmid.  Because of this nick, the product cannot be amplified, so all amplification occurs off of the original template plasmid instead of on the template and product, like in PCR.  Thus, QuikChange mutagenesis results in a linear increase in product over time, whereas in PCR the product amplification is exponential.
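
The difference between linear and exponential amplification is easy to see with numbers.  The sketch below is an idealized comparison that assumes every cycle works with 100% efficiency; the function names and the cycle counts are chosen only for illustration.

```python
def quikchange_copies(template: int, cycles: int) -> int:
    """Only the original template is copied each cycle: linear growth."""
    return template * cycles

def pcr_copies(template: int, cycles: int) -> int:
    """Template and products are both copied each cycle: ideal doubling."""
    return template * 2 ** cycles

for cycles in (1, 5, 18):
    print(cycles, quikchange_copies(1, cycles), pcr_copies(1, cycles))
```

After 18 cycles one template molecule yields 18 mutant products in the linear case, compared with 262144 molecules under ideal doubling, which is why QuikChange starts with far more template than a PCR does.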

2.  Suppose you wanted to perform DNaseI footprinting to determine the binding site of a DNA binding protein, but instead of radioactively labeling the DNA and doing gel electrophoresis and autoradiography, you wanted to make use of a capillary electrophoresis machine that normally is used for DNA sequencing.   How would you do it?

This would be very similar to doing gel electrophoresis and autoradiography.  You could use a primer that is labeled on its 5' end with a fluorophore and amplify your sequence of interest.  Then, you could digest the DNA with DNaseI plus and minus your protein of interest, wash off the protein, and do capillary electrophoresis of your two DNA samples using the fluorophore as a signal.  You can compare your two samples, plus a chromatogram of your DNA's sequencing reaction, to determine the binding site for the protein.

Assigned Oct. 20, due Oct. 27
1.  Many Europeans are strongly opposed to genetically modified, GM, foods.  What are they objecting to, and does the objection have any scientific merit?

There are a lot of answers to this question.  One objection is that GM plants might have a genetic advantage and thus would be bad for the environment because they could outcompete WT plants for resources.  Another objection is that the consumption of GM foods could harm humans because these foods have "unnatural" properties.  Yet another objection is that genetically modifying organisms is playing God and is thus unethical.  These are reasonable causes for concern, but so far science has yet to substantiate these claims.

2.  To identify specific residue-base interactions between a DNA-binding protein and its target DNA sequence, a residue thought to be in direct contact with the DNA was mutated to alanine.  How can you identify the base that the residue in the wild type protein contacts?  Note that standard chemical protection experiments will not work because protection over the full binding site occurs when the protein binds.

There are two ways of doing this.  One method is called missing contact probing, and involves lightly depurinating/depyrimidinating the DNA such that about one base per DNA duplex is altered.  Alternatively, one could systematically mutate every base in the binding site, one at a time.  Then, after the DNA is altered or mutated, one could run gel shift assays to determine the affinity of WT and mutant protein for the altered DNA.  Bases that are in direct contact with both the wild type and mutant protein will show a reduction in the DNA binding affinity of each protein.  However, the base that contacts the mutated residue will show a reduced binding affinity for the wild type protein, but there will be no further reduction in affinity for the alanine mutant.

Assigned Oct. 22, due Oct. 29
1.  It is statistics time.  We could do some dorky problem like "If you sequence a genome to a depth of 5x, what is the probability that a particular sequence is not sequenced at all?"  Instead, we will do something different.  Look up and learn what variance of a probability distribution measures.  In the case of a Poisson distribution, it can be shown that the variance equals the average. Using the previously mentioned property, tell how to determine how many photons per pixel detector element in a digital camera it takes to produce a signal, say 50% gray.
(This may strike you as pretty neat.  For example, the same principle can be used to determine the charge of an electron without ever making measurements on single electrons.)

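You can take a picture of something uniformly gray, like an evenly lit gray wall.  Then, using an image processing program, you can read out the signal at each pixel and calculate the mean and the variance across pixels.  If the pixels recorded raw photon counts, the variance would equal the mean, and the mean would be the answer.  Because the camera reports the counts multiplied by an unknown gain factor, the mean scales with the gain and the variance with the gain squared, so the number of photons per pixel at 50% gray is the mean squared divided by the variance.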
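
The idea can be tried on simulated data.  The sketch below assumes that each pixel value is a Poisson photon count multiplied by a constant gain, with no other noise source; under that assumption the mean is gain × N and the variance is gain² × N, so mean²/variance recovers N.  The photon count, gain, and number of pixels are hypothetical values chosen for illustration.

```python
import math
import random

def poisson_sample(mean: float, rng: random.Random) -> int:
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def photons_per_pixel(pixel_values) -> float:
    """Estimate the mean photon count as mean**2 / variance of the pixels."""
    n = len(pixel_values)
    mean = sum(pixel_values) / n
    variance = sum((v - mean) ** 2 for v in pixel_values) / (n - 1)
    return mean ** 2 / variance

rng = random.Random(0)
true_photons, gain = 100, 1.28   # hypothetical: 100 photons -> gray level 128
pixels = [gain * poisson_sample(true_photons, rng) for _ in range(10_000)]
print(f"estimated photons per pixel: {photons_per_pixel(pixels):.0f}")
```

The estimate never uses the gain, which is the point: the photon number comes out of the statistics of the image alone.  A real camera adds read noise and fixed pattern differences between pixels, so a measurement on actual images would need to account for those.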

2.  In pyrosequencing, a next generation sequencing method, a DNA fragment is PCR amplified in a small droplet that contains the PCR reagents and a small bead.  How are the amplified DNA fragments attached to the bead and what trick is used to keep both ends from being attached?

You can amplify the DNA using a set of PCR primers where only one of the primers is biotinylated.  Then, your ssDNA will attach to a streptavidin-coated bead in only one orientation.

This works if you already know the sequence of part of the DNA because you know which primer to use.  If you wish to sequence something larger, like a chromosome, without knowing the sequence beforehand you need to use something a little more complicated.  First, shear your DNA of interest such that the DNA has blunt ends.  Then, you can ligate on a ~40 bp double-stranded DNA fragment that contains a known sequence to which you can bind sequencing primers.  This DNA, call it an adapter DNA, has biotin on its 5' end, allowing it to attach to a streptavidin bead.  Additionally, the 5' end has a small overhang, whereas the 3' end is blunt.  The DNA will ligate to the adapter only on the blunt 3' end of the adapter, giving you directionality.

Assigned Oct. 27, due Nov. 3.
1.  How does the number of sites in the human genome at which a polymerase is bound and ready to transcribe compare with the number of genes which is determined by analysis of the genome sequence?

The paper from the Lis lab (assigned reading) says that roughly 30% of genes have a stalled polymerase in a region proximal to the promoter.  This doesn't give the total number of genes at which a polymerase resides at any given instant, but this is still a lot of genes (there are roughly 25,000 genes in the human genome, so this is on the order of 7,500 genes).

2.  What is your best hypothesis as to the reason for the existence of the oppositely oriented polymerases at eukaryotic promoters?

The Lis paper also goes into detail on this.  Note that this question is not asking about genes that can be transcribed in both directions; that does happen, but it is a different issue.  Rather, this question refers to the finding in the Lis paper that some polymerases transcribe a short way in the direction opposite to the gene, producing small RNAs that don't code for protein.

It has been hypothesized that these small RNAs can be involved in RNA interference.  Also, a transcribing polymerase in one direction leaves a wake of supercoiling in the opposite direction, and this supercoiling could help a correctly positioned polymerase form the open complex.  Or perhaps this type of action helps transcriptional regulators become deposited at the promoter.  No one knows for certain (yet!).

Assigned Oct. 29, due Nov. 5.
1. Use the sequence of any tRNA and use a dot matrix approach to determine all possible Watson-Crick base pairs.  The cloverleaf base pairing should be present.  Is it the secondary structure with the maximum number of base pairs?  Comment.

Most people generated dot matrices that found the cloverleaf base pairing.  It does seem that this produces the maximum number of base pairs.
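
For anyone who wants to generate the dot matrix by program, here is a minimal sketch.  It marks every pair of positions that can form a Watson-Crick pair (no G-U wobble pairs) and then looks for antiparallel runs of consecutive pairs, which are the candidate stems.  The sequence used is a hypothetical toy hairpin, not a real tRNA, and the function names are made up for illustration.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def dot_matrix(seq):
    """Return all (i, j) with i < j whose bases form a Watson-Crick pair."""
    return {(i, j)
            for i in range(len(seq))
            for j in range(i + 1, len(seq))
            if (seq[i], seq[j]) in PAIRS}

def stems(seq, min_len=3):
    """Find antiparallel runs (i, j), (i+1, j-1), ... of at least min_len pairs."""
    dots = dot_matrix(seq)
    found = []
    for (i, j) in sorted(dots):
        if (i - 1, j + 1) in dots:
            continue  # not the outermost pair of its run
        length = 0
        while (i + length, j - length) in dots:
            length += 1
        if length >= min_len:
            found.append((i, j, length))
    return found

toy = "GGGAAAACCC"  # hypothetical hairpin, not a real tRNA
print(len(dot_matrix(toy)), stems(toy))
```

Each stem is reported as (position of the 5' base, position of its 3' partner, number of stacked pairs), counting from 0.  On a real tRNA sequence, the four cloverleaf stems should appear among the runs, along with other runs that compete with them, which is what makes the question about the maximum number of base pairs worth a comment.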

2.  The namesake problem.  Determine that protein in the known databases which is most closely related to you (as defined by taking the amino acid sequence defined by your name and place of birth deleted of the letters not corresponding to the single amino acid abbreviations).  Use blastp at the NCBI web site.

Most people found sequences that matched the sequence of their protein.  This shows that you have to be really careful when deciding whether or not two sequences are homologous!

Assigned Nov. 3, due Nov. 12.
1.  AraC loops to repress the pBAD promoter with the binding of one subunit to a half-site called O2 and one subunit to the half-site I1.  Upon the addition of arabinose, the subunit at I1 remains there, and the subunit that was bound to O2 shifts and now binds to the I2 half-site.  The I1 and I2 half sites constitute what is called the I site, and when bound at I1-I2 AraC activates transcription from pBAD. The three half-sites, I1, I2, and O2 are similar, but not identical, and in fact, AraC binds to I1 with an affinity (relative) of 1000, to O2 with an affinity of 100, to I2 with an affinity of 10, and to random sequences with an affinity of 1.  Predict the consequences on expression of pBAD of replacing the I2 sequence with the I1 sequence.

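If the I2 site is mutated to I1, such that I1I1 is present at the promoter instead of I1I2, then AraC is much more likely to bind to adjacent sites (and not loop the DNA) than it normally would.  Normally the second subunit prefers O2 (relative affinity 100) over I2 (10), which favors the loop; with I1 in place of I2, the adjacent half-site (1000) is favored over O2 (100).  This means that pBAD expression will occur even in the absence of arabinose.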

2.  Suppose you isolated mutants that express pBAD even though arabinose or molecules of structure similar to arabinose are not added (technical term is constitutive).  Where in the components of the regulation-transcription machinery could such mutations lie, and for each different location you propose, also tell how the mutation brings about constitutivity.

One way to produce a constitutive phenotype is to mutate I2 to I1, as in the above problem.  You could delete or mutate the O2 site such that looping is prevented. 

You could also make mutations in the AraC protein that cause constitutivity.  If you mutate or delete the N-terminal arm, then the protein is unable to form the repressive conformation and will be unable to loop the DNA, and by default this will lead to transcriptional activation.  You could also mutate regions in the DNA binding domain that interact with the N-terminal arm, which would also prevent the formation of the repressive state.

Assigned Nov. 5, due Nov. 17.
1.  What are the consequences to regulation of the trp operon in E. coli if the ribosome is a little tardy in binding to and starting translation of the leader peptide?  What is a simple way Nature could minimize this problem?
2.  How do you knock out a gene in mice?

Assigned Nov. 12, due Nov. 19.
1.  Problem 16.3.
2.  Problem 16.5

Assigned Nov. 17, due Nov. 24.
1.  Ultimately, what is it that transmits the information to a developing mouse embryo of which way is right and which way is left?
2.  Design a genetic switching circuit that is off until IPTG is added, and once IPTG has been added, the circuit stays on whether or not IPTG remains present.

Assigned Nov. 19, due Dec. 1. (Last homework to be assigned in the course!)
1.  The following represents the orientations of the four GFP-family proteins mentioned in class.  Let a black triangle represent a site of action of a site specific recombinase in which recombination between two triangles oriented in the same direction deletes the intervening DNA, leaving just one triangle, and recombination between two oppositely oriented triangles inverts the intervening DNA.  What arrangement of triangle(s) at sites 1, 2, and 3 will allow the following behavior:  Genes A and B are deleted and then the unit of C and D inverts repeatedly until the recombinase disappears, at which time its orientation is stuck.  Or, Genes C and D are deleted and the unit of A and B inverts.  In either case, once the first deletion occurs, no additional deletion is possible.
[Image: Sites A, B, C, and D]