BMB 407/507 - Section on Protein Stability and Folding
Warning: This is an informal summary of some of the material covered in class. These notes are NOT a substitute for attending class or reading the review papers listed at the bottom.
Protein Stability
Proteins are usually only marginally stable
- Protein stability is intimately connected with protein folding - proteins have to be folded into their final active state (and maintain it) to be stable.
- D -> F or U -> F (Denatured or Unfolded state to folded state; D is not the same as U, and there can be a variety of different D or U states). D can be a different state depending on the denaturing conditions; Dphys is often a denatured physiological state of a protein (usually a molten globule state, with extensive residual secondary structure). U is sometimes defined as the protein at a very high chemical denaturant concentration.
- Delta(G) = Delta(H) - T Delta(S)
- Free energy of folding is the balance of enthalpic and entropic contributions. Each of these is large, but they nearly cancel, so the resulting delta(G) is a small number (-5 to -15 kcal/mol is typical for many proteins).
- This delta(G) is comparable to the energy of a few hydrogen bonds, even though a protein may have hundreds of H bonds. Proteins are thus only marginally stable.
- The favorable enthalpic gains (H bonds, salt bridges, electrostatics, etc.) during the folding of a protein are offset by the substantial loss of entropy as the protein goes from an extended chain into a compact structure.
- This loss of entropy, called Delta(S conf) or conformational entropy, is quite large (of the order of R ln N per residue, where N is the number of conformational states; the corresponding free energy cost, T Delta(S conf), is approx 1.67 kcal/mol/residue at 25 degrees C)
- There is also a gain of entropy on folding from the hydrophobic effect (from the burial of hydrophobic groups)
- For a typical 100 aa protein being folded:
- Loss of conformational entropy: +167 kcal/mol
- Hydrophobic effect: -95 kcal/mol
- H-bonding and other enthalpic effects: -83 kcal/mol
- Total delta G of folding: -11 kcal/mol
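The bookkeeping in the list above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the three contributions are the illustrative values quoted in these notes, and the variable names are made up for the example.

```python
# Illustrative free-energy contributions (kcal/mol) for folding a 100 aa protein,
# taken from the list above. Positive values oppose folding.
contributions = {
    "loss of conformational entropy": +167.0,
    "hydrophobic effect": -95.0,
    "H-bonding and other enthalpic effects": -83.0,
}

# Net free energy of folding is the sum of the opposing terms.
delta_g_folding = sum(contributions.values())

# Combined size of the terms that nearly cancel, for comparison with the net value.
total_magnitude = sum(abs(v) for v in contributions.values())

print(f"Net delta G of folding: {delta_g_folding:+.0f} kcal/mol")
print(f"Net value as a fraction of the summed magnitudes: "
      f"{abs(delta_g_folding) / total_magnitude:.1%}")
```

The net value is only a few percent of the terms being added up, which is the sense in which proteins are "marginally stable": a small error or change in any one term can change the sign of the total.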
Forces driving protein folding (and hence stability)
- Hydrophobic effect
- Major effect in protein stability - "driving force"
- Non-polar side chains (leucine, isoleucine, phenylalanine, tryptophan (max hydrophobic effect) ....)
- Water cannot form H bonds with these side chains; thus water molecules form ice-like "cages" around these side chains, resulting in a loss of entropy if these side chains are solvent exposed.
- Clustering of these side chains reduces this entropic loss (fewer water molecules in an ice-like state). These side chains thus prefer to cluster or get buried into the interior of the protein in "hydrophobic cores".
- This can be energetically quite favorable; the burial of a single -CH2 group has a delta G of about -1 kcal/mol (close to the energy of an H bond).
- H-bonds
- Within the protein backbone, these are usually between N-H and O=C (amide hydrogen and the carbonyl oxygen; ~70% of H-bonds in a globular protein)
- Delta G of -1 to -5 kcal/mol
Conditions that can denature proteins
- Thermal denaturation
- As temperature increases, the entropic term (T delta S) favoring unfolding grows and overcomes the enthalpic forces holding the protein together. There is also breakdown of the "ice cages" around non-polar side chains, weakening the hydrophobic effect and favoring the unfolded state.
- Since proteins are only marginally stable, even a small temperature increase can destabilize proteins. This is the reason that point mutations can often yield "temperature sensitive" proteins that denature with an increase of just a few degrees in temperature.
- Proteins in thermophilic organisms are often seen to include more H-bonds and salt bridges (more delta H), as well as greater hydrophobic interactions.
- Cold Denaturation
- Some proteins denature in cold temperatures as well
- At lower temperature, the hydrophobic effect weakens - bulk water becomes more ordered and the hydration of non-polar groups (ice cage formation) is not as unfavorable as at moderate temperatures.
- Delta H also decreases at lower temperatures.
- These changes favor the unfolded state for some proteins.
- Cold adapted proteins often show fewer stabilizing interactions (reduced aromatic interactions; fewer salt-bridges; lower number of prolines & arginines for more flexibility; ....).
- There is a trade-off between stability and flexibility in most proteins. Most enzymes require some flexibility (for activity), while also needing structural rigidity (for stability).
- Proteins are stable and active in only a narrow range of temperatures; thus, even though life can exist between -50 and 110 degrees C, most species are active only in a much narrower band of temperatures (usually in tens of degrees C).
- Chemical denaturants
- e.g. urea, guanidinium chloride.
- chemical denaturants often act by increasing solubility of hydrophobic side chains in water, thus decreasing the hydrophobic effect.
- Urea can also form H-bonds with the protein backbone, and aromatic sidechains.
- The effect is usually linearly proportional to the concentration of the denaturant (the free energy of unfolding falls roughly linearly as denaturant is added).
- High or low pH
- Proteins have buried groups with highly perturbed pKa's.
- High or low pH affects protonation levels, and electrostatic charges between groups.
- Destabilization of the protein is often due to increased electrostatic repulsion.
- High pressure
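For chemical denaturants, the linear dependence on concentration noted above is commonly written as delta G(unfolding) = delta G(water) - m [denaturant], and combined with a two-state model to give a denaturation curve. The sketch below assumes that form. The values used for delta G in water (5 kcal/mol) and for the m-value (1.25 kcal/mol per M) are hypothetical, chosen only for illustration, and the function name is made up for this example.

```python
import math

R = 1.987e-3  # gas constant, kcal/(mol K)

def fraction_unfolded(denaturant_molar, dg_water=5.0, m_value=1.25, temp_k=298.15):
    """Two-state fraction unfolded at a given denaturant concentration.

    Assumes delta G of unfolding falls linearly with denaturant concentration:
        dG_unfold = dg_water - m_value * [denaturant]   (kcal/mol)
    dg_water and m_value are hypothetical illustration values.
    """
    dg_unfold = dg_water - m_value * denaturant_molar
    k_unfold = math.exp(-dg_unfold / (R * temp_k))  # equilibrium constant [U]/[F]
    return k_unfold / (1.0 + k_unfold)

# Print a coarse denaturation curve from 0 to 8 M denaturant.
for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"{conc:4.1f} M denaturant: fraction unfolded = {fraction_unfolded(conc):.4f}")
```

With these assumed parameters the midpoint of the transition (half folded, half unfolded) falls where delta G of unfolding is zero, at 5.0 / 1.25 = 4 M. The transition is sharp because the equilibrium constant depends exponentially on delta G.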
Experimental study of protein stability
- Use of heat capacity (Cp - heat capacity at constant pressure - energy to heat 1 mole of material by 1 degree C)
- Denatured state is characterized by a high Cp (needed to melt the icebergs around exposed non-polar side chains)
- folded proteins are characterized by a low Cp
- Delta Cp of unfolding is of the order of 12 cal/degree/mol per residue
- Cp usually peaks at Tm, the "melting temperature" of the protein.
- Differential scanning calorimetry can be used to monitor Cp changes as a protein is denatured. A van't Hoff plot can be used to derive delta H of unfolding (see chapter 17 in textbook for details - fig. 17.1)
- Similar measurements can be done for solvent denaturation (see figs 17.2, 17.3).
- Protein folding can often be followed experimentally by techniques such as fluorescence, CD, or NMR (also see section below).
- Fluorescence measurements monitor the state of aromatic sidechains within the protein (often tryptophans, which when excited at 280 nm, emit at around 320-350 nm depending on their environment). As a protein unfolds and the aromatic sidechains are exposed to solvent, the emission shifts to longer wavelengths and the fluorescence intensity changes.
- Circular Dichroism (CD) measures the unequal absorption of left and right handed circularly polarized light by optically active protein molecules. Folded proteins have a characteristic CD spectrum which changes as the protein unfolds.
- Use of point mutations to study effect on protein stability and the transition states in protein folding - Phi-value analysis (see Chapter 18, section E in your textbook).
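The heat-capacity change on unfolding is what makes protein stability curve with temperature, which is why both heat and cold denaturation are possible. Here is a minimal sketch, assuming the standard two-state expression with a temperature-independent Delta Cp (the modified Gibbs-Helmholtz equation). Tm, Delta H at Tm, and the 100-residue chain length are hypothetical values; Delta Cp uses the 12 cal/degree/mol per residue figure quoted above.

```python
import math

def dg_unfolding(temp_k, tm_k=333.15, dh_m=80.0, dcp=1.2):
    """Two-state free energy of unfolding (kcal/mol) at temperature temp_k.

    tm_k : melting temperature in K (hypothetical value, 60 degrees C)
    dh_m : enthalpy of unfolding at Tm in kcal/mol (hypothetical value)
    dcp  : heat-capacity change on unfolding in kcal/(mol K); 1.2 corresponds
           to 100 residues at 12 cal/(mol K) per residue
    Assumes dcp does not change with temperature.
    """
    return (dh_m * (1.0 - temp_k / tm_k)
            + dcp * ((temp_k - tm_k) - temp_k * math.log(temp_k / tm_k)))

# Positive values mean the folded state is favored at that temperature.
for temp_c in range(-60, 81, 20):
    temp_k = temp_c + 273.15
    print(f"{temp_c:4d} C: delta G of unfolding = {dg_unfolding(temp_k):6.2f} kcal/mol")
```

In this model delta G of unfolding is zero at Tm, positive over a band of intermediate temperatures, and negative again at sufficiently low temperature. For many proteins, as for these assumed parameters, the predicted cold-denaturation point lies below the freezing point of water.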
Protein Folding
Why do proteins fold
- The folded state is more stable (has a lower free energy, G) than the unfolded state.
- While delta G of folding is negative, most proteins are only marginally stable - (delta G of only a few kcal/mol).
- One reason for the marginal stability is that both the folded and unfolded states form a large number of H-bonds & other non-ionic interactions; gain of H-bonds as protein folds is often offset by the loss of H-bonds between the protein side-chains and the solvent.
- Another reason is that delta G = delta H - T delta S (enthalpy - entropy terms). So gain in enthalpic contributions is offset by loss of entropy as the protein folds.
How do proteins fold (in-vitro)
- Protein folding is spontaneous. [Anfinsen's principle (1973) - all information necessary to specify the native 3D fold of a protein is contained in its amino acid sequence].
- Reversible for small proteins, under ideal conditions.
- Protein folding is non-random & highly cooperative [Levinthal's paradox (1968) - proteins fold on a time scale of milliseconds to seconds, even though a systematic search of all possible conformations in even a small protein (10^100 conformations for a 100 aa protein) would take an astronomically long time].
- There is no single model or specific pathway for folding - protein folding is better described in terms of multi-dimensional energy landscapes or folding funnels, with different pathways possible depending on the sequence, conditions, etc.
- Folding can generally be described by a two state transition model for small (<100 aa) proteins: U -> F.
- For small proteins, a general pattern of rapid nucleation and hydrophobic collapse is seen (formation of a molten globule state), followed by a slower compaction into the native state [referred to also as a "nucleation-condensation" reaction].
- The unfolded state (U) is generally "expanded & loose" with most of the local secondary structural features (helices, beta sheets, loops, etc.) in place, but with only a few long-range interactions.
- The folded state (F) is more "compact and ordered", and has a large number of long range interactions between different parts of the protein chain, apart from the local secondary structure.
- For larger proteins, there are often multiple structural domains which each fold by mechanisms similar to that for smaller proteins. Once these fold, the different domains reshuffle slightly to form the final native structure.
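The scale of Levinthal's paradox is easy to check. The sketch below takes the 10^100 conformations quoted above and assumes, purely for illustration, that the chain can sample one conformation every 10^-13 seconds (roughly the timescale of a bond vibration). The age of the universe is taken as about 13.8 billion years.

```python
# Exhaustive-search estimate behind Levinthal's paradox.
n_conformations = 10 ** 100          # figure quoted above for a 100 aa protein
samples_per_second = 10 ** 13        # assumed sampling rate (one per 0.1 ps)

# Exact integer arithmetic: time to visit every conformation once.
search_time_s = n_conformations // samples_per_second

seconds_per_year = 365.25 * 24 * 3600
age_of_universe_years = 1.38e10      # approximate

search_time_years = search_time_s / seconds_per_year
print(f"Exhaustive search: about {search_time_years:.1e} years")
print(f"That is about {search_time_years / age_of_universe_years:.1e} "
      f"times the age of the universe")
```

Even with this generous sampling rate the search takes about 3 x 10^79 years, so a protein that folds in milliseconds to seconds cannot be searching conformations at random.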
How do proteins fold in-vivo ?
- While most of the principles described above also apply in-vivo, protein folding within cells is complicated by two major factors:
- Proteins are synthesized in a sequential manner at a rate much slower (4-20 aa/second) than typical folding rates (< 1 second). So the N terminal of the protein may start to fold before the C terminal end has even been synthesized by the ribosome.
- Cells are very crowded with all kinds of macromolecules (typical densities in E. coli are about 340 mg/ml of protein), leading to increased possibilities for the nascent polypeptide chain to interact with other molecules and hydrophobic surfaces. This can lead to aggregation or misfolding.
- Folding of over half of the proteins in cells is assisted by molecular chaperones.
- "Molecular chaperones are proteins whose role is to mediate the folding of certain other polypeptides and, in some instances, their assembly into oligomeric structures, but which are not components of these final structures" (definition from RJ Ellis, 2000).
- Chaperones assist or mediate folding and assembly (non-covalent) of proteins, and also inhibit "off-pathway" folds. They increase the efficiency of protein folding within the cell.
- Chaperone assisted protein folding appears to be a universal mechanism in all cells that "enables the crowded state of the cellular interior to be compatible with life" (RJ Ellis, 2000).
- Over 20 families of chaperone molecules have been identified.
- While most chaperones are proteins, there are hints that ribosomal RNA and some phospholipids may also play a chaperone type function.
- Chaperones are often stress or heat shock factors
- Another class of chaperones is the "steric chaperones" - where the chaperones provide essential steric information during folding. An example of this is proteins with "prosequences" - an N-terminal part that is required for correct folding, but is subsequently cleaved off to get the functional protein (e.g. prosubtilisin).
- The best understood families of chaperones are the HSP70 and HSP60 family of chaperones [HSP stands for Heat Shock Protein]. The HSP70 family proteins bind short exposed hydrophobic stretches on unfolded proteins, and assist in folding by preventing aggregation. The HSP60 family (also called the chaperonins), form a "folding cage" with a large central cavity which provides a protective environment for other proteins to fold.
- See Figure 5 & description in the Radford review paper for more details.
- See pages 603-609, Chapter 19 of your textbook, for a detailed description of protein folding by chaperonins.
- Apart from chaperones, other enzymes are also involved in proper folding of some proteins within the cell. For example, protein disulfide isomerases (PDI) are involved in the proper formation of some disulfide bonds. Another set of such enzymes are the peptidyl prolyl cis-trans isomerases (PPI).
- PDI's - Protein disulfide isomerases catalyse formation of disulfide bonds (-S-S-)
- disulfide bonds act like "staples" in a protein structure
- These can often form in a complex pathway - e.g. BPTI (Bovine pancreatic trypsin inhibitor)
- kinetics of disulfide bond formation are largely independent and much slower than conformational folding kinetics
- disulfide bond formation usually requires an oxidative environment (periplasm in bacteria, ER in eukaryotes), while the cytoplasm is usually a reducing environment
- PPI's - Peptidyl prolyl cis-trans isomerases
- These are ubiquitous enzymes - isomerases or rotamases that catalyse the cis-trans isomerization without breaking bonds
- In proteins most peptidyl bonds are trans (omega = torsion along the C-N bond = 180 degrees) and this conformation is heavily favored in both denatured and folded forms. However, in extended chains, the peptide bond preceding a proline can be either in trans or cis forms, with the trans form only slightly more favored than the cis form. In folded proteins, on the other hand, only about 7% of all prolyl-peptide bonds are cis.
- Isomerization about the prolyl peptide bond from the cis to trans conformation is very slow - t1/2 of 10-100 seconds with activation energy of almost 20 kcal/mol. PPIs speed this up to a timescale of a second or less. This isomerization is often the rate limiting step in protein folding.
- Three unrelated families of PPI's - cyclophilins, FK506 binding proteins (FKBP), and Parvulins. The Trigger Factor in bacteria is a 48 kDa ribosome associated PPIase with a FKBP type catalytic domain
- The endoplasmic reticulum (ER) is an important site for protein folding in eukaryotes. About 1/3 of all proteins in eukaryotes fold within the ER, especially all secretory and membrane proteins. The ER is especially rich in chaperones, such as BiP which belongs to the Hsp70 family.
- Protein folding quality control - unfolded proteins cause a response called UPR (unfolded protein response). This uses three mechanisms to enhance overall level of protein folding:
- Attenuation of translation levels in the cell
- Induction or transcriptional upregulation of chaperones
- degradation of malfolded proteins - degradasome
- Suggested reading:
- Mori, K. (2000) Tripartite management of unfolded proteins in the endoplasmic reticulum. Cell, 101, 451-454.
- Ellis, R. J. (2000) Introduction. Seminars in Cell & Developmental Biology, 11, 1-5.
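As a rough check on the prolyl isomerization numbers quoted in the PPI notes above, the barrier height and the half-life can be related through transition-state theory. This sketch assumes the simple Eyring expression k = (kB T / h) exp(-barrier / RT) with a transmission coefficient of 1, and treats the quoted activation energy of about 20 kcal/mol as the activation free energy. The function name is made up for this example.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
R = 1.987e-3          # gas constant, kcal/(mol K)

def eyring_half_life(barrier_kcal, temp_k=298.15):
    """Half-life (s) of a first-order step with the given activation free energy.

    Uses the Eyring equation with a transmission coefficient of 1 (an assumption).
    """
    rate = (K_B * temp_k / H) * math.exp(-barrier_kcal / (R * temp_k))
    return math.log(2) / rate

for barrier in (17.5, 19.0, 20.0, 21.0):
    print(f"barrier {barrier:4.1f} kcal/mol: t1/2 = {eyring_half_life(barrier):8.2f} s")
```

A 20 kcal/mol barrier gives a half-life of roughly 50 seconds at 25 degrees C, inside the 10-100 second range quoted above. In this simple model, lowering the barrier by about 2.5 kcal/mol is enough to bring the half-life below a second, which is the kind of acceleration attributed to PPIs.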
How can we measure or observe proteins folding
- While the underlying molecular events in protein folding occur in the nanosecond or faster time scale, global folding of typical proteins occurs in the millisecond - second time scale.
- Protein folding/unfolding is usually initiated in experiments by a temperature jump (thermal denaturation) or changes in the chemical environment [pH jump, chemical denaturation or unfolding by addition of urea or guanidinium chloride (GdmCl)].
- Spectroscopic techniques are often used to follow protein folding. Measurement of changes in intrinsic fluorescence (typically of tryptophan residues in the protein) or circular dichroism (CD) are commonly used. These techniques can measure changes in the millisecond time scale using stopped-flow type instruments.
- NMR can be used to follow changes in the environment of individual side-chains in the millisecond to second time scale.
- See Table I in the Radford review paper for a list of many other techniques.
Protein Dynamics and Simulation
- Apart from proper folding and rigidity for stability, proteins need flexibility for their function
- Protein dynamics is important to understand motions in proteins, and the pathway of protein folding
- X-ray crystallographic structures include a column with the B or thermal factor, apart from the X,Y,Z coordinates of each atom. While these are often just measures of disorder in the crystal, they can provide hints about regions of a protein that may be flexible
- NMR based relaxation methods are an excellent approach to probe dynamics in proteins
- Simulations of molecular dynamics (MD) are often necessary to understand the fast steps in protein folding (especially for following transitions faster than the resolution of experimental techniques)
- MD works by numerically solving the Newtonian equations of motion, and following the motion of each atom in a protein
- Time steps in these simulations have to be in the order of femtoseconds or picoseconds making it very computationally expensive to look at even proteins that fold very fast (microseconds to milliseconds).
- There are also problems with accuracy of potential functions - constants and parameters in the equations that describe the interactions between different atoms in a protein.
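To make the idea of numerically integrating Newton's equations concrete, here is a toy sketch: a single particle on a harmonic spring, advanced with the velocity Verlet scheme, which is one commonly used MD integrator. The mass, force constant, and time step are arbitrary dimensionless values chosen for illustration. A real MD code applies the same kind of update to every atom, using a full molecular force field.

```python
def force(x, k=1.0):
    """Harmonic restoring force, F = -k x (stand-in for a real force field)."""
    return -k * x

def velocity_verlet(x, v, dt, n_steps, mass=1.0):
    """Advance position x and velocity v by n_steps of size dt."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass      # half-step velocity update
        x = x + dt * v_half                   # full-step position update
        f = force(x)                          # force at the new position
        v = v_half + 0.5 * dt * f / mass      # complete the velocity update
    return x, v

def total_energy(x, v, k=1.0, mass=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, dt=0.01, n_steps=10_000)
print(f"energy before: {total_energy(x0, v0):.6f}, after: {total_energy(x1, v1):.6f}")

# Cost of reaching folding timescales with a femtosecond time step.
steps_for_one_millisecond = round(1e-3 / 1e-15)
print(f"steps needed to simulate 1 ms at 1 fs per step: {steps_for_one_millisecond:.0e}")
```

The total energy stays close to its starting value when the time step is small compared with the period of the fastest motion. That requirement is what forces femtosecond steps in real simulations, and it is why reaching even one millisecond takes about 10^12 force evaluations over all atoms.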
Coupling between protein folding and binding
Optional: Based on Dyson and Wright (2002) Current Opinion in Structural Biology, 12, 54-60
- A significant fraction of protein in cells may be unstructured, especially in eukaryotes
- Drosophila: about 17% of proteins predicted to be wholly denatured
- Four eukaryotic genomes: about 30% of proteins expected to have disordered segments of 50 aa or more
- These "unstructured" proteins may fold only on finding their binding targets. Some examples may include:
- ribosomal proteins
- transcriptional factors
- proteins involved in complexes
- inducible "snap-lock" folding activated by DNA binding
- "folding process for any protein can be thought of as a 'binding reaction' that involves binding of distant parts of the polypeptide chain, as tertiary structure is formed. However in some cases, there is a requirement for external factors as well ...."
Why is any of this important?
- Correct folding is required for proteins to function. Aberrant folding of proteins is involved in many diseases. For example:
- Cystic fibrosis - A deletion mutation (Phe 508) in a chloride channel protein (CFTR) leads to improper folding & reduced Cl- conductance in diseased cells
- Amyloid diseases (such as Alzheimer's, prion diseases, mad-cow, etc.) involve protein misfolding or defective processing, leading to aggregation & formation of insoluble plaques.
- Understanding the mechanism of protein folding will allow better ab-initio prediction of protein 3D structures from their amino acid sequences.
Required Reading
Radford, S. E. (2000) Protein folding: progress made and promises ahead. Trends in Biochemical Sciences (TIBS), 25, 611-618.
Fersht, A. R. & Daggett, V. (2002) Protein folding and unfolding at atomic resolution. Cell, 108, 573-582.
Mayor U., Guydosh N. R., Johnson C. M., Grossmann J. G., Sato S., Jas G. S., Freund S. M., Alonso D. O., Daggett V. & Fersht A. R. (2003) The complete folding pathway of a protein from nanoseconds to microseconds. Nature, 421, 863-867.
Reading List from Your Textbook (Fersht)
Chapter 17
- pp 508-513 (section A1-A2)
- pp 516-517
- pp 519-521
- Skim through section D, especially pages 532-537
Chapter 18
- pp 540-543 (section A, intro to section B)
- pp 557-563 (section D, E, Transition state Phi analysis)
Chapter 19
- pp 573-576 (section A; section B)
- pp 583-587
- pp 591-593 (section E, intro to section F)
- pp 603-609 (section I - chaperones)
Optional Reading
Daggett, V. & Fersht, A. R. (2003) Is there a unifying mechanism for protein folding? Trends in Biochemical Sciences (TIBS), 28, 18-25.
Baker, D. (2000) A surprising simplicity to protein folding. Nature, 405, 39-42.