Transcription is a fundamental cellular process: RNA polymerases "transcribe" the genetic information on DNA into RNA strands. All cells have RNA polymerases (RNAP).
The RNA polymerases increase in complexity as you go from viruses (for example, T7 RNA polymerase is made up of a single protein), to bacterial systems (one RNA polymerase made up of the proteins beta, beta', 2 x alpha, omega and the sigma factor), and finally to eukaryotic systems (three RNA polymerases - Pol I, Pol II, and Pol III, each with ten or more subunits).
While the RNA polymerases have become increasingly complex as life evolved, their overall structure (as evidenced by crystallographic structures of bacterial RNA polymerase and Pol II) shows remarkable similarity. There is also sequence similarity between the bacterial polymerase protein subunits and the proteins that make up the eukaryotic polymerases.
Transcription of any gene usually involves three distinct stages:
First, the RNA polymerase has to find the start site of a gene. The "holoenzyme" form of the polymerase does this by looking for the "promoter" site that exists just upstream of the gene start site. This process is termed "transcription initiation". This is followed by opening up (melting) of the duplex DNA to form an "open complex".
This is followed by a rapid change into the "elongation" phase of transcription, where the "core polymerase" part of the RNA polymerase rapidly transcribes an RNA strand that is complementary to the "template" strand of the DNA. The change into the elongation phase usually occurs after a few bases of RNA have been transcribed (typically about 8-9 bases of RNA in bacterial systems, which form an RNA-DNA hybrid with the template strand), and involves a "clamping down" on the DNA to prevent the polymerase from falling off the DNA.
The final stage of the transcription of a gene is "termination", after the stop codon of the gene. The process of termination usually involves sequences where the polymerase slows down or stalls, and the polymerase-RNA-DNA complex dissociates (proteins such as rho and NusA are often involved in bacterial systems).
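The base-pairing rule behind the elongation step can be sketched in a few lines of Python. This is only an illustration of complementarity, not a model of the polymerase: the function names and the short example sequence are hypothetical, and the template strand is assumed to be written 3'->5' so that the RNA reads 5'->3' from left to right.

```python
# Each DNA template base pairs with one RNA base (U replaces T in RNA).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3to5.upper())

def coding_strand(template_3to5: str) -> str:
    """Return the coding (non-template) DNA strand, written 5'->3'.

    It has the same sequence as the RNA, with T in place of U.
    """
    return transcribe(template_3to5).replace("U", "T")

template = "TACGGCATT"           # hypothetical template strand, written 3'->5'
print(transcribe(template))      # AUGCCGUAA
print(coding_strand(template))   # ATGCCGTAA
```

Note that the RNA matches the coding strand, not the template strand, which is why gene sequences are conventionally written as the coding strand.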
Genes have to be transcribed to mRNAs before they can be translated into proteins; more or less mRNA from a particular gene equals more or less of the protein encoded by the gene. Transcription is, thus, an important point in the control of gene "expression". Most genes are controlled transcriptionally, usually by regulation of the level of transcriptional initiation. For example, if a gene has a strong promoter, it will be more highly expressed when compared to another gene with a weak promoter site. Similarly, if a regulatory protein can bind the promoter site of a gene (and prevent transcription initiation), then it can turn off the expression of that gene.
Transcriptional control of genetic expression is vital for cellular functions, and many diseases and cancers are the result of defects in the transcriptional control of essential genes.
Polymerization reaction; nucleotides are added in the 5' -> 3' direction; template & coding (non-template) strands; promoter; upstream; downstream; open vs closed complex; binary vs ternary complex; typical RNA polymerization rate (~40 nts/second at 37°C in bacteria - close to the translation rate of ~15 aa/second).
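The comparison between the two rates is easier to see once both are expressed in nucleotides per second. The sketch below does that arithmetic using the figures quoted above; the 1000-nt gene length is a made-up example, and the function names are hypothetical.

```python
TRANSCRIPTION_RATE_NT_PER_S = 40.0   # from the notes (bacteria, 37 °C)
TRANSLATION_RATE_AA_PER_S = 15.0     # from the notes
NT_PER_CODON = 3                     # one amino acid is encoded by three nucleotides

def ribosome_speed_nt_per_s(aa_per_s: float = TRANSLATION_RATE_AA_PER_S) -> float:
    """Convert a translation rate in amino acids/s to nucleotides of mRNA read per second."""
    return aa_per_s * NT_PER_CODON

def transcription_time_s(gene_length_nt: float,
                         rate: float = TRANSCRIPTION_RATE_NT_PER_S) -> float:
    """Time to transcribe a gene at a constant rate (ignores pausing and initiation)."""
    return gene_length_nt / rate

print(ribosome_speed_nt_per_s())    # 45.0 nt/s, close to the ~40 nt/s of the polymerase
print(transcription_time_s(1000))   # 25.0 s for a hypothetical 1000-nt gene
```

The near match (about 45 vs 40 nt/second) is what the notes mean by the two rates being "close".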
Transcription initiation in bacteria (prokaryotes) involves sigma factors. The sigma factor combines with the core RNA polymerase to form a holoenzyme that is competent for promoter binding. The core RNA polymerase, by itself, cannot bind the promoter sites that signal the starts of genes.
The sigma factor can be thought of as the specificity factor in the RNA polymerase. Each bacterium has several different sigma factors that recognize slightly different promoter sequences. Predominant among these is sigma70 (70 kDa in E. coli; encoded by the rpoD gene), which initiates the transcription of most genes in exponentially growing cells. There are two general classes of sigma factors - the sigma70 class and the sigma54 class:
The sigma 70 class of sigma factors share extensive sequence homology, and bind to two conserved sequences upstream of the gene start site (the -35 box and the -10 box). Each of these sigma factors recognizes slight variations in these conserved sequence boxes. Sigma70 binds promoter DNA as a part of the holoenzyme, and binds DNA very poorly in absence of the core enzyme.
The sigma54 class of sigma factors (54 kDa in E. coli; encoded by the rpoN gene) controls a much smaller set of genes than sigma70. It recognizes different conserved sequences (the -12 and -24 boxes). Unlike sigma70, sigma54 can bind DNA even in the absence of the core RNA polymerase. It lacks, however, the ability to melt promoter DNA on its own - for this it needs to interact with the core RNAP as well as with activator proteins that bind further upstream of the promoter site.
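Recognition of the -35 and -10 boxes by sigma70 can be illustrated with a simple consensus scan. The sketch below is a toy, not a real promoter predictor. It assumes the textbook E. coli consensus sequences TTGACA (-35) and TATAAT (-10) and a 16-18 bp spacer between them, none of which are stated in these notes, and it scores a site by counting matching bases. The function names and test sequence are hypothetical.

```python
MINUS_35 = "TTGACA"        # assumed textbook sigma70 -35 consensus
MINUS_10 = "TATAAT"        # assumed textbook sigma70 -10 consensus
SPACERS = range(16, 19)    # assumed allowed spacer lengths: 16, 17 or 18 bp
BOX = 6                    # both boxes are six bases long

def matches(site: str, consensus: str) -> int:
    """Number of positions at which a site agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))

def best_promoter(seq: str):
    """Return (score, start_of_-35_box, spacer) for the best-scoring pair of boxes.

    The score ranges from 0 to 12; returns None if the sequence is too short.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - BOX + 1):
        for spacer in SPACERS:
            j = i + BOX + spacer              # start of the candidate -10 box
            if j + BOX > len(seq):
                continue
            score = matches(seq[i:i + BOX], MINUS_35) + matches(seq[j:j + BOX], MINUS_10)
            if best is None or score > best[0]:
                best = (score, i, spacer)
    return best

# Hypothetical sequence with perfect boxes separated by 17 bp.
strong = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
print(best_promoter(strong))   # (12, 2, 17)
```

A promoter whose boxes deviate from the consensus scores lower, which is one crude way to picture the "strong" versus "weak" promoters mentioned earlier.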
In eukaryotic systems, transcription initiation is very different. There are no sigma factors. Instead, the central protein in forming the "pre-initiation complex" (PIC) is the TATA binding protein (TBP), which binds to the TATA box, a conserved sequence just upstream of the initiation region. TBP is itself part of the multisubunit TFIID complex, and a large number of other general transcription factors such as TFIIB, TFIIE, TFIIF, TFIIH (TF stands for transcription factor; II stands for Pol II; there are similar factors for Pol I and Pol III) and others assemble around it to form the PIC. This PIC then recruits the RNA polymerase (Pol I, II, or III in eukaryotic cells) to initiate transcription (see figure in the handout for a schematic of this process). The PIC often remains at the promoter site, and is then available to initiate another round of transcription.
TBP is a universal transcription factor, and is seen in all eukaryotes and archaea. It sharply bends DNA at the TATA box.
Once initiation is complete and the open complex forms, the RNA polymerase begins to read the template strand and add corresponding RNA nucleotides. This process is not always efficient (check out "abortive initiation" in the text), and the polymerase may make several passes at this. After a long enough RNA-DNA hybrid is made, the polymerase clears the promoter region and moves rapidly downstream. This is preceded by a large conformational change in the polymerase core enzyme, as it clamps down on the DNA and becomes quite processive.
While transcription elongation is quite rapid, the polymerase does not transcribe all sequences with equal efficiency. The elongation rate is not uniform, and there can be pausing or stalling. Elongation factors (GreA/GreB in bacterial systems; TFIIS in eukaryotes) act to help the polymerase along by stimulating backtracking and cleavage of the newly formed RNA (from the 3' end).
Other factors are involved in the elongation cycle - for example, in eukaryotic Pol II there are the elongins and ELL proteins that increase the elongation rate, as well as factors to remodel chromatin (see handout and your textbook).
There are multiple modes of transcription activation. These usually involve different protein factors (activators) that bind DNA sequences (enhancer sequences) in and around the promoter site. All of them act to increase the affinity (increased "dwell time") of the initiation complex or RNA polymerase holoenzyme for the promoter, thus enhancing the chance that a productive open complex will form and transcription will initiate.
When enhancer elements combine with poor promoter sequences, activators can modulate the activity of a gene by several hundred or thousand fold. For example, if a gene has a poor -35 or -10 box in its promoter site, it will be poorly expressed - an activator protein can, in such a case, enhance the activity of the gene by several orders of magnitude, by recruiting the RNA polymerase (or other initiation factors) to the promoter site.
Repression, on the other hand, works by having proteins (repressors) that sit on or close to the promoter regions of the DNA, preventing RNA polymerase or initiator/activator proteins from starting transcription initiation (thereby "turning off" the gene). See the classic lac repressor in the textbook.
Activators and repressors are usually DNA binding proteins. These proteins have common DNA binding motifs such as Zn-fingers, helix-turn-helix, etc. (see textbook for common DNA binding motifs). Activators also have regions that interact with different domains of the RNA polymerase (parts of alpha, sigma, etc.).
In addition to protein activators, there are also DNA sequences that can directly interact with RNA polymerase components. For example, the alpha subunit in the bacterial polymerase has two domains - the alpha NTD (N-terminal domain) and the alpha CTD (C-terminal domain), joined by a flexible linker region. The alpha CTD can bind DNA sequences (UP elements) upstream of the promoter region, enhancing the dwell time of the polymerase on the promoter.
You should be familiar with the mechanism of control of the lac and gal promoters, and the function and role of the CAP (catabolite activator protein) - see chapter 10 in your textbook and also figure in handout.
Termination in bacterial systems can be broadly classified as "rho independent" and "rho dependent".
In eukaryotic cells, transcription termination involves cleavage of the elongating RNA chain by specific endonucleases which recognize particular sequences (AAUAAA) in the newly formed RNA. Once this happens, the RNA elongation complex is destabilized, and falls off the DNA. It is then available to attach to another nearby PIC, and start transcribing again.
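Locating the AAUAAA signal in a transcript is a plain substring search, sketched below. The function name and the example RNA are hypothetical, and the sketch only reports where the signal occurs. It does not model where the endonuclease actually cuts.

```python
import re

POLYA_SIGNAL = "AAUAAA"   # recognition sequence named in the notes

def find_polya_signals(rna: str) -> list[int]:
    """Return the 0-based start positions of every AAUAAA in an RNA sequence.

    DNA-style input is accepted: T is converted to U before searching.
    A lookahead is used so that overlapping occurrences are also reported.
    """
    rna = rna.upper().replace("T", "U")
    return [m.start() for m in re.finditer(f"(?={POLYA_SIGNAL})", rna)]

print(find_polya_signals("GGCAAUAAAGCUUUG"))   # [3]
```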
Termination provides another point at which transcription can be controlled. Many factors act by decreasing (anti-termination) or enhancing termination rates at certain RNA sequences, thereby controlling the expression rates of downstream genes. One example of such a factor is TRAP (trp RNA-binding attenuation protein) that binds a specific RNA sequence in the leader segment of nascent RNA during transcription of tryptophan biosynthetic genes (see handout). Several other examples of anti-termination are in your textbook.
See chapter 20 in your textbook.
Promoter regions in eukaryotic genes are more complex than prokaryotic promoters. The core promoter is usually made up of a TATA box around the -31 to -24 region, along with other motifs such as BRE (the TFIIB recognition element, which is upstream of the TATA box). The transcription start site is usually an INR box (initiator) located at -2 to +4. Not all of these elements are required. TBP binds the TATA box, though there are some genes that do not have a TATA box (TATA-less promoters).
While the core promoter is required for assembly of the basal transcription initiation complex (or PIC), eukaryotic genes have a variety of regulatory elements (enhancers, UAS, response elements for steroids, etc.) that can extend several hundred or thousand bps upstream of the transcription start site.
See your textbook - chapter 21 for a detailed description of several types of regulatory proteins that bind these sequences upstream of the core promoter.
5' Capping - mRNAs made by Pol II are capped on the 5' end with a 7-methylguanosine cap (m7GpppNpNp....) which protects the RNA from digestion by 5' exonucleases. This cap is added by an RNA guanylyltransferase enzyme associated with the phosphorylated CTD of Pol II. The CTD tail of Pol II is a long unstructured domain at the C-terminal end of subunit RPB1 consisting of multiple repeats (26-52) of a seven amino-acid sequence (YSPTSPS). This domain is hyperphosphorylated by TFIIH as Pol II goes from the initiation state to the elongation state, and is the site of attachment for many accessory factors and enzymes.
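The repeat structure of the CTD tail can be made concrete by counting copies of the heptad in a protein sequence. The sketch below uses a made-up, idealized tail built from exact repeats. Real CTD sequences also contain imperfect repeats, which this exact-match count deliberately ignores. The function name is hypothetical.

```python
HEPTAD = "YSPTSPS"   # consensus seven-residue repeat from the notes

def count_consensus_heptads(ctd: str) -> int:
    """Count exact, non-overlapping copies of the consensus heptad.

    Degenerate repeats (those differing from YSPTSPS at any position) are not counted.
    """
    return ctd.upper().count(HEPTAD)

# Idealized, hypothetical tail with 26 perfect repeats (the low end of the 26-52 range).
ideal_tail = HEPTAD * 26
print(len(ideal_tail))                        # 182 residues
print(count_consensus_heptads(ideal_tail))    # 26
```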
At the other end (3') of mRNAs, a long poly A tail is added by another enzyme, Poly A polymerase, that also associates with the CTD tail of the elongating Pol II. The Poly A tail is added after transcription termination.
An important activity associated with Pol II (both during initiation and elongation) has to do with chromatin remodeling. This involves a number of co-activators that have HAT (histone acetyltransferase) activities. These include the p300/CBP and PCAF factors (see your textbook for details). Other factors involved in chromatin remodeling include several HMTs (histone methyltransferases) associated with the elongating Pol II (see part of review paper in handout).
While there is increasing complexity in RNA polymerases from phages to eukaryotes, there are many similarities at the sequence and structural levels.
At the simplest level, these polymerases are metalloenzymes with a Mg2+ at the active site. They have specialized channels for the two DNA strands as well as the nascent RNA. One important feature of the change from the initiation state to the elongation state is the change from an open form into a closed processive form of the enzyme that clamps tightly on the template DNA.
There is sequence similarity between the different subunits of the prokaryotic RNA polymerase and the eukaryotic RNA polymerases. For example, in yeast Pol II the two largest subunits (RPB1 and RPB2) are homologous to the two largest subunits of bacterial RNA polymerases (beta' and beta). Similarly, the yeast RPB3 and RPB11 subunits are similar to the alpha subunit in prokaryotic RNAP. The crystal structures of a bacterial core RNAP (from Thermus aquaticus, Taq) and the yeast core Pol II show an astonishing level of conservation in the architecture of these enzymes.
Recently determined crystal structures of the bacterial RNAP holoenzyme are allowing us to understand how the different sigma factor domains associate with the RNAP core enzyme during the various steps of transcription initiation.
(NOT required for the exam! Parts of some of these papers are included in your handout)
Antson AA, Dodson EJ, Dodson G, Greaves RB, Chen X, Gollnick P. (1999) Structure of the trp RNA-binding attenuation protein, TRAP, bound to RNA. Nature, 401, 235-242.
Armache KJ, Kettenberger H, Cramer P. (2003) Architecture of initiation-competent 12-subunit RNA polymerase II. Proc. Natl. Acad. Sci. USA, 100, 6964-6968.
Cramer P, Bushnell DA, Kornberg RD. (2001) Structural basis of transcription: RNA polymerase II at 2.8 angstrom resolution. Science, 292, 1863-1876.
Gerber M, Shilatifard A. (2003) Transcriptional elongation by RNA polymerase II and histone methylation. J. Biol. Chem., 278, 26303-26306.
Muller-Hill B. (1998) Some repressors of bacterial transcription. Current Opinion in Microbiology, 1, 145-151.
Murakami KS, Masuda S, Campbell EA, Muzzin O, Darst SA. (2002) Structural basis of transcription initiation: an RNA polymerase holoenzyme-DNA complex. Science, 296, 1285-1290.
Murakami KS, Darst SA. (2003) Bacterial RNA polymerases: the wholo story. Current Opinion in Structural Biology, 13, 31-39.
Nikolov DB, Burley SK. (1997) RNA polymerase II transcription initiation: a structural view. Proc. Natl. Acad. Sci. USA, 94, 15-22.
Rhodius VA, Busby SJ. (1998) Positive activation of gene expression. Current Opinion in Microbiology, 1, 152-159.
Zhang G, Campbell EA, Minakhin L, Richter C, Severinov K, Darst SA. (1999) Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 Å resolution. Cell, 98, 811-824.