Overview of Gene Expression
Central dogma: To express a gene, DNA is transcribed into RNA, which is then translated into a polypeptide (a chain of amino acids that makes up all or part of a protein).
Transcription is the process by which DNA is used as a template to make mRNA.
DNA is a double-helix molecule made up of 2 antiparallel strands; its structure looks like a twisted ladder.
- The “sides” of the ladder:
- Made up of alternating deoxyribose (a 5-carbon sugar) and phosphate molecules
- Phosphodiester bonds link the 3’ carbon on 1 sugar to the 5’ carbon on the next.
- The “rungs” of the ladder are made of paired nitrogen-containing molecules called nitrogenous bases, frequently referred to simply as “bases.” (A nucleotide is a base together with its sugar and phosphate.)
- DNA base pairs:
- Guanine (G), cytosine (C), adenine (A), and thymine (T)
- G pairs with C (and vice versa) via 3 hydrogen bonds.
- A pairs with T (and vice versa) via 2 hydrogen bonds.
- The bases along 1 strand can be “read” as a string of letters (e.g., GTATCGA).
- This string of letters is the “code,” or instruction manual, that is ultimately used to create proteins.
- DNA strands:
- Because of the way the sugars are oriented, 1 strand goes in a 5’ → 3’ direction while the other goes in a 3’ → 5’ direction.
- Coding strand: the strand that contains the primary genetic code
- Template strand:
- The strand opposite the coding strand: contains the bases complementary to those of the coding strand
- This is the strand that is read during transcription.
- The DNA helix is asymmetrical as it rotates.
- This rotation creates major and minor grooves between coils.
- The major groove is wide enough that many regulatory proteins can bind directly to the DNA through this space.
- DNA is negatively charged (because of the phosphate molecules).
- RNA: a single-stranded molecule made up of alternating ribose (a 5-carbon sugar) and phosphate molecules
- Each ribose is bound to 1 of the RNA bases:
- Guanine (G), cytosine (C), adenine (A), and uracil (U)
- Note that RNA uses uracil instead of thymine: A pairs with U (and vice versa) via 2 hydrogen bonds.
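The pairing rules above can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: the function names (`template_strand`, `reverse_complement`, `hydrogen_bonds`) are hypothetical, and the example sequence is the GTATCGA string used earlier.

```python
# Watson-Crick pairing rules for DNA and the hydrogen bonds in each pair.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # A-T: 2 bonds, G-C: 3 bonds


def template_strand(coding_5to3):
    """Return the template strand written 3'->5', so that it lines up
    base-for-base under the coding strand written 5'->3'."""
    return "".join(DNA_PAIR[base] for base in coding_5to3.upper())


def reverse_complement(seq_5to3):
    """Return the complementary strand written in the usual 5'->3' direction."""
    return template_strand(seq_5to3)[::-1]


def hydrogen_bonds(seq):
    """Count the hydrogen bonds that hold a duplex with this strand together."""
    return sum(H_BONDS[base] for base in seq.upper())


coding = "GTATCGA"  # example string from the text, read 5'->3'
print(template_strand(coding))     # CATAGCT (read 3'->5')
print(reverse_complement(coding))  # TCGATAC (read 5'->3')
print(hydrogen_bonds(coding))      # 17
```

The same strand is written in 2 ways here because the strands are antiparallel: reading the template 3’ → 5’ lines it up under the coding strand, while reversing it gives the conventional 5’ → 3’ notation.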
Types of RNA:
- mRNA:
- Created during transcription from the template strand of the DNA in the nucleus
- Moves into the cytosol for translation into polypeptides by ribosomes
- rRNA: a component of the ribosome complexes that are responsible for protein synthesis
- tRNA: carries amino acids to the ribosome, where each tRNA base-pairs with the mRNA, lining up the amino acids that will bond to form the polypeptide
Initiation of Transcription
Transcription begins at a region known as the promoter. An enzyme called RNA polymerase reads the DNA template strand and creates the mRNA. Additional proteins, known as transcription factors, are required for the RNA polymerase to bind to the promoter sequence in eukaryotes.
Promoters are DNA regions, often AT-rich, that signal the starting point for transcription:
- Usually just upstream from the target gene
- The binding site for the RNA polymerase
- Requires multiple transcription factors in eukaryotes
- Requires only sigma factor in prokaryotes
- Allow the RNA polymerase to determine which strand is the coding strand and which is the template strand based on the orientation of the sequence
- TATA box: a common promoter element
- A–T bonds are weaker than G–C bonds.
- The A–T-rich regions separate more easily, allowing access to the template strand.
- Mutations in the promoter can alter transcription, most often decreasing it.
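The link between A–T content and strand separation can be illustrated by counting hydrogen bonds. The sketch below assumes a simplified TATA-box motif (`TATAAA`) and a made-up upstream fragment; real promoters tolerate variation, and the function names are hypothetical.

```python
# Hydrogen bonds contributed by each base pair (A-T: 2, G-C: 3).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

# Simplified TATA-box motif (assumption: real promoters vary around this).
TATA_MOTIF = "TATAAA"


def at_fraction(seq):
    """Fraction of positions that are A or T."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def hydrogen_bonds(seq):
    """Total hydrogen bonds holding this stretch of duplex together."""
    return sum(H_BONDS[base] for base in seq.upper())


def find_motif(seq, motif=TATA_MOTIF):
    """Return every 0-based position where the motif starts."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


# Hypothetical coding-strand fragment upstream of a gene, written 5'->3'.
upstream = "GGCCGCTATAAAAGGCGCGC"
start = find_motif(upstream)[0]
box = upstream[start:start + len(TATA_MOTIF)]

print(find_motif(upstream))      # [6]
print(at_fraction(box))          # 1.0
print(hydrogen_bonds(box))       # 12
print(hydrogen_bonds("GCGCGC"))  # 18
```

A 6-base-pair TATA box is held together by 12 hydrogen bonds, whereas a G–C stretch of the same length has 18, which is why the A–T-rich region separates more easily.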
RNA polymerases are enzymes that read the template strand of the DNA and create a corresponding mRNA strand. They are made up of multiple subunits.
- Prokaryotes:
- Have only 1 type of RNA polymerase
- Require only a single protein, known as sigma factor, to bind to the promoter sequence
- Eukaryotes:
- Have 3 types of RNA polymerase:
- RNA polymerase I (pol I) synthesizes rRNA.
- RNA polymerase II (pol II) synthesizes mRNA.
- RNA polymerase III (pol III) synthesizes tRNA.
- Multiple transcription factors are required to bind to the DNA at the promoter sequence (RNA pol II cannot bind to DNA on its own).
Transcription factors (TFs) are proteins that bind to the promoter region and are required for RNA pol II to bind to the DNA in eukaryotes.
- Each TF helps to regulate gene expression.
- Transcription factor TFIID:
- Contains TATA-box binding protein (TBP)
- Among the most important TFs required to assemble the initiation complex
- Initiation complex: the complex of transcription factors and RNA pol II at the promoter sequence
- Once the initiation complex is assembled on the promoter, transcription can begin.
Elongation of Transcription
After the initiation complex is assembled at the promoter, transcription elongation can begin. This is the phase during which mRNA is created.
- Occurs within the transcription bubble
- After initiation, additional elongation factors assemble:
- Additional proteins that help to “push” the RNA pol II along
- Additional sites of transcriptional regulation
- Matching nucleotides are brought into the RNA polymerase:
- Brought in as nucleoside triphosphates: ATP, UTP, GTP, CTP
- These nucleotides “bring their own energy with them.”
- The enzyme builds a new mRNA strand by creating phosphodiester bonds between these nucleotides.
- RNA pol II reads the DNA template from 3’ to 5’ → produces mRNA from 5’ to 3’
- RNA is synthesized according to the rules of base pairing, in which purines pair with pyrimidines:
- Adenine (purine) ↔ uracil (pyrimidine)
- Guanine (purine) ↔ cytosine (pyrimidine)
- A temporary DNA–RNA hybrid helix forms.
- RNA pol II continues until the transcription machinery encounters a terminator sequence in the DNA.
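Elongation can be modeled as a base-by-base lookup. The following is a minimal sketch under simplifying assumptions (no promoter, no terminator, no proofreading); the `transcribe` function and the `RNA_PAIR` table are illustrative names, and the template is the strand complementary to the GTATCGA example used earlier.

```python
# Template DNA base -> RNA base added opposite it (note A pairs with U).
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Read the template strand 3'->5' and build the mRNA 5'->3'.

    Each step mimics the polymerase adding 1 complementary
    ribonucleotide to the 3' end of the growing chain.
    """
    mrna = []
    for dna_base in template_3to5.upper():
        mrna.append(RNA_PAIR[dna_base])
    return "".join(mrna)


template = "CATAGCT"  # written 3'->5'; pairs with the coding strand GTATCGA
mrna = transcribe(template)
print(mrna)  # GUAUCGA

# The mRNA matches the coding strand, with U in place of T.
print(mrna == "GTATCGA".replace("T", "U"))  # True
```

The final check shows why the non-template strand is called the coding strand: the mRNA carries the same sequence, with uracil substituted for thymine.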
Termination of Transcription
Factor-independent termination occurs when the transcription machinery reaches a termination sequence.
- First comes a GC-rich palindrome:
- Causes the newly produced RNA to form a base pair with itself, creating a hairpin structure
- The hairpin structure begins to destabilize the DNA–RNA complex.
- Next come 4 or more uracils in a row:
- U–A bonds are weaker than G–C bonds
- These bonds are unable to hold the RNA on the DNA → mRNA falls off
Factor-dependent termination relies on a protein called Rho:
- Rho protein binds to the tail of the new RNA
- Using energy from ATP hydrolysis, the Rho protein “climbs” the tail faster than the RNA polymerase is moving and “catches up” to the RNA polymerase at the correct time.
- Causes the dissociation of RNA and RNA polymerase from the DNA template
- Can occur in addition to termination caused by terminator sequence
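The factor-independent signal (a GC-rich palindrome followed by a run of uracils) can be searched for in an RNA string. The sketch below is a toy scanner, not a validated terminator predictor: the stem length, loop range, GC threshold, and example sequence are all assumptions chosen for illustration.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Sequence that would base-pair with seq to form a hairpin stem."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_terminators(rna, stem_len=6, loop_range=(3, 8), min_u=4, min_gc=0.6):
    """Return (stem_start, stem_end) for every GC-rich hairpin that is
    immediately followed by at least min_u uracils."""
    rna = rna.upper()
    hits = []
    for start in range(len(rna) - stem_len + 1):
        left = rna[start:start + stem_len]
        if gc_fraction(left) < min_gc:
            continue  # stem is not GC-rich enough
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop_len
            right_end = right_start + stem_len
            right = rna[right_start:right_end]
            if len(right) < stem_len:
                break  # ran off the end of the transcript
            if right != reverse_complement_rna(left):
                continue  # the 2 arms cannot pair into a stem
            if rna[right_end:right_end + min_u] == "U" * min_u:
                hits.append((start, right_end))
    return hits


# Hypothetical transcript: stem GCCGCC, loop UUCG, stem GGCGGC, then 6 uracils.
transcript = "AUGGCU" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "AA"
print(find_terminators(transcript))  # [(6, 22)]
```

Requiring both features mirrors the 2-step mechanism: the hairpin destabilizes the DNA–RNA complex, and the weak U–A pairs that follow let the RNA fall off.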
After mRNA is synthesized in eukaryotes, it is modified to prevent immediate degradation. These modifications include splicing, capping, and polyadenylation.
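During splicing, noncoding sequences are removed from the mRNA:
- Noncoding introns are spliced out by spliceosomes (enzymatic ribonucleoprotein complexes), and the remaining exons are joined together.
- Multiple different proteins can be made from a single gene through differential (alternative) splicing.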
During capping, a methylated guanosine (m7G) is added to the 5’ end of the mRNA:
- Prevents the mRNA from binding to other RNA chains
- Protects the mRNA from degradation
- Promotes translocation of the mRNA from the nucleus to the cytoplasm
- Facilitates binding of the mRNA to the ribosome to initiate translation
During polyadenylation, a tail of adenine nucleotides is added to the 3’ end of the mRNA:
- Referred to as the “polyA tail”
- Stabilizes the mRNA
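The 3 modifications can be pictured as string operations on a toy pre-mRNA. This is an illustrative sketch only: the sequences, the exon coordinates, the `m7G-` text label for the cap, and the short default tail length are all assumptions made for the example.

```python
def splice(pre_mrna, exons):
    """Join the chosen exons, given as (start, end) positions, and
    discard everything between them (the introns)."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def process(pre_mrna, exons, tail_length=10):
    """Splice, then add a 5' cap label and a 3' polyA tail."""
    mature = splice(pre_mrna, exons)
    return "m7G-" + mature + "A" * tail_length


# Hypothetical pre-mRNA: 3 exons separated by 2 introns.
exon1, intron1 = "AUGGCC", "GUAAGUCAG"
exon2, intron2 = "UUCGGA", "GUCCCAAG"
exon3 = "UGCUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

e1, e2, e3 = (0, 6), (15, 21), (29, 35)

# Keeping all 3 exons versus skipping exon 2 gives 2 different messages.
print(splice(pre_mrna, [e1, e2, e3]))  # AUGGCCUUCGGAUGCUAA
print(splice(pre_mrna, [e1, e3]))      # AUGGCCUGCUAA
print(process(pre_mrna, [e1, e3]))     # m7G-AUGGCCUGCUAAAAAAAAAAAA
```

Choosing a different set of exons from the same pre-mRNA is the idea behind differential splicing: 1 gene, more than 1 possible mature mRNA.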
Clinical Relevance
- Death-cap mushroom poisoning: These mushrooms contain a toxin called α-amanitin, which inhibits the function of RNA polymerase II. Poisoning with α-amanitin can be fatal.
- Nucleoside analogs: compete with natural nucleosides and cause termination of a growing nucleic acid chain when incorporated by a polymerase. Nucleoside analogs are used in HIV treatment (e.g., azidothymidine) and chemotherapy.
- Transcription regulation: There are thousands of transcription factors, cofactors, and chromatin regulators involved in transcription regulation. There are many disorders associated with abnormal regulation of transcription, including cancer, autoimmune disease, neurologic disorders, cardiovascular disease, and obesity, to name a few.