Kinetic Proofreading

How biology spends energy to beat informational entropy.

May 02, 2021

Fundamentally, kinetic proofreading is a specific, mathematically describable way to spend energy to beat informational decay. It is one of the basic ideas that pops up over and over in biology, and understanding it will help you understand the natural world better. The idea was first published in J. J. Hopfield, Kinetic Proofreading: A New Mechanism for Reducing Errors in Biosynthetic Processes Requiring High Specificity, PNAS 1974.

The Specificity Problem

A big problem in biological systems is that a lot of potential substrates look alike. A necessary step ahead of protein synthesis is attaching amino acids to transfer RNA, or tRNA. But the enzyme that does this, the tRNA synthetase, has to discriminate between twenty different alternatives, some of which are very similar. For example, look at the structures of phenylalanine vs. tyrosine or serine vs. threonine.

Since the reaction happens down near the alpha carbon where the molecules are identical, it’s hard to prevent a synthetase that binds the wrong substrate from carrying out the wrong reaction. And yet somehow, all the synthetases have an error rate (which we’ll call f) significantly lower than one in a thousand.

What does an f < 0.001 mean from a kinetics viewpoint? Let’s look at the example of serine tRNA charging using standard Michaelis–Menten kinetics. First the enzyme (E) must bind the substrate (S for serine or T for threonine) with dissociation constants of KdS and KdT, respectively. Then the reaction happens at some rate m to produce either the correct product P or the incorrect product P’. Since the reactive groups of the substrates look identical, m is the same for serine and threonine. All the specificity lies in the binding constants; the enzymatic function cannot differentiate between substrates.

If we’re at steady state, with equal concentrations of S and T, the error rate f is simply the ratio KdS/KdT. We can expand upon this by connecting the binding energy to the dissociation constant with ∆G = RT ln(Kd), which gives a formula that connects the energy of binding to the error rate:

f = KdS/KdT = exp(−(∆Gt − ∆Gs)/RT)

If f = 1/1000, then ∆Gt−∆Gs = RT ln(1000), which is about 17 kJ/mol at 25 °C. This is a lot of energy, especially for discriminating between substrates that differ by a single methyl group. These energy demands get even more extreme when you look at polymerases, some of which have error rates of one in hundreds of millions of reactions. This implies a ∆G difference of roughly 45-50 kJ/mol between binding different bases, which is just ridiculous.
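The arithmetic behind these numbers can be checked with a few lines of Python. This is a minimal sketch that assumes the relation ∆Gt−∆Gs = −RT ln(f) from the text and a temperature of 25 °C; the function name and the default temperature are choices made here for illustration, not part of the original post.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def binding_energy_gap(error_rate, temperature_k=298.15):
    """Return the binding energy difference (kJ/mol) needed for an error rate f.

    Uses f = exp(-ddG / RT), i.e. ddG = -RT ln(f), assuming all of the
    discrimination comes from a single binding step.
    """
    return -R * temperature_k * math.log(error_rate) / 1000.0


# Error rates discussed in the text: a synthetase and a high-fidelity polymerase.
for f in (1e-3, 1e-8):
    print(f"f = {f:g}: {binding_energy_gap(f):.1f} kJ/mol")
```

Changing temperature_k to 310.15 (body temperature) shifts the results by only a few percent, so the conclusion does not depend on the exact temperature assumed.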

Breaking down the pathway

Of course, we’ve simplified this system dramatically to use Michaelis–Menten kinetics. What do we need to add back in to understand the real-world specificity? Well, there’s a known intermediate step where the amino acid is conjugated with AMP (which we’ll denote S*), yielding the following reaction pathway:

E + S ⇌ E-S → E-S* → E-P

This doesn’t obviously help: we’ve just expanded out the final arrow towards the product. The same ratio of incorrect to correct substrate is present in both the E-S state and the E-S* state, and it carries through to E-P.

However, it’s known that the adenylated amino acid can unbind without proceeding to the final reaction step, and there are high-activity enzymes whose sole purpose is to remove the AMP from these byproducts, recycling the unbound intermediates back to substrate. Adding this mechanism makes the pathway look like this:

E + S ⇌ E-S → E-S* → E-P, with a side exit E-S* ⇌ E + S* (dissociation constant Kds2) followed by S* → S

The incorrect substrate T also goes through the same process, and will have a Kdt2 of its own. Since those Kd’s are still dissociation constants for the substrate, it is likely that Kdt2 will be larger than Kds2, meaning more of T than S will exit at that second binding step. And since the pool of E-S* is already biased towards the correct product by the ratio f, allowing that pool to undergo another unbinding step biases it again, as if we were repeating the initial binding step. To be more precise, if Kdt2/Kds2 = Kdt/Kds, then the above reaction pathway has an error rate of f squared. If the ratios aren’t equal, the overall error rate is the product of the two ratios, which corresponds to the sum of the two ∆G differences of binding.

By adding an extra step to the enzymatic function, we have squared the specificity or halved the energy requirement for a given specificity.
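The squaring can be written out as a short worked equation. This is a sketch in the post's own symbols; the labels f_1 and f_2 for the error ratios of the first and second unbinding steps, and the second-step energies ΔG_{S2} and ΔG_{T2}, are names introduced here for illustration.

```latex
\begin{align*}
f_1 &= \frac{K_{dS}}{K_{dT}} = e^{-(\Delta G_T - \Delta G_S)/RT}, \qquad
f_2 = \frac{K_{dS2}}{K_{dT2}} = e^{-(\Delta G_{T2} - \Delta G_{S2})/RT} \\
f_{\text{total}} &= f_1 f_2
  = e^{-\left[(\Delta G_T - \Delta G_S) + (\Delta G_{T2} - \Delta G_{S2})\right]/RT} \\
f_1 = f_2 = f \;&\Longrightarrow\; f_{\text{total}} = f^2
\end{align*}
```

Read the other way around, a target error rate of f squared needs only half the binding energy difference at each of the two steps.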

However, there are a couple of very important details before we can present a generalized version of kinetic proofreading.

Any S* or T* binding to the enzyme bypasses proofreading. Therefore, the conversion of S* or T* back to S and T must be extremely rapid. This ensures that the only way to make E-S* is by E binding to S.
The conversion step E-S -> E-S* needs to be irreversible or at least heavily forced. This separates the two proofreading steps, guaranteeing that the substrate bias gained by the first binding step can be improved upon by the second binding step. Otherwise, more of the enriched E-S* pool would revert to E-S than E-T* would revert to E-T, eroding the bias gained in the first step. The only way to get an irreversible step is to spend energy, so the conversion step needs to be an energy-expending step.

From here, we can state a generalized version of kinetic proofreading for substrate A of enzyme E to make product Z by way of intermediates B, C….

Any generalization of this pathway should carry aspects of kinetic proofreading.

Each additional proofreading step adds another exit point, and thus another power of f. With two intermediates and three levels of proofreading, the error rate would be f cubed.
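The power law can be illustrated with a small kinetic sketch in Python. It is not taken from Hopfield's paper; it assumes that each stage offers a bound substrate a choice between an irreversible forward step (same rate for right and wrong substrate) and unbinding (faster for the wrong substrate), that released intermediates never rebind, and that both substrates have equal concentrations and on-rates. All rate values and function names are hypothetical.

```python
def pass_probability(k_forward, k_off):
    """Chance that a bound substrate moves forward before it unbinds."""
    return k_forward / (k_forward + k_off)


def error_rate(k_forward, k_off_right, k_off_wrong, n_stages):
    """Ratio of wrong to right product after n_stages sequential exit points.

    Assumes equal substrate concentrations and on-rates, irreversible
    forward steps, and no rebinding of released intermediates.
    """
    p_right = pass_probability(k_forward, k_off_right) ** n_stages
    p_wrong = pass_probability(k_forward, k_off_wrong) ** n_stages
    return p_wrong / p_right


# Hypothetical rates: the wrong substrate unbinds 1000 times faster,
# and the forward step is slow compared with both off-rates.
for n in (1, 2, 3):
    f_n = error_rate(k_forward=1e-3, k_off_right=1.0, k_off_wrong=1e3, n_stages=n)
    print(f"{n} stage(s): error rate {f_n:.3e}")
```

With a slow forward step the error rate approaches f, f squared, and f cubed for one, two, and three stages. If k_forward is set far above the off-rates, the error rate climbs toward 1, because nothing gets a chance to unbind before it is committed.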

Implications and Examples

Now that we have a general description, let’s take a step back and think about this for a second. You can square your specificity by spending energy to add a step to your reaction, but only if you allow the modified substrate to potentially unbind afterwards, thereby wasting that energy. These reactions beat the kinetic and thermodynamic limits on substrate specificity by paying to force an extra proofreading step. They improve faithful information transfer and reduce entropy at the cost of extra energy. It makes a beautiful kind of sense that the energy to make the conversion step unidirectional is required in order to increase the specificity of the enzyme, and that the energy is often lost by the dissociation of both correct and incorrect substrates.

Finally, I want to talk about how general this principle is. I’ve already discussed it for tRNA synthetases, which are responsible for the first step of accurate protein synthesis. It also occurs in the next step of protein synthesis, where charged tRNAs bind to mRNAs in the ribosome to generate a new protein. The key to this is a universally conserved protein called EF-Tu, which binds tRNAs and is required for tRNAs to bind to the ribosome. But EF-Tu must cleave a GTP and dissociate before the peptide chain is extended, and the tRNA can unbind after EF-Tu departure but before the peptide is extended. The overall schema looks familiar:

The more you look, the more commonly you see kinetic proofreading in cases where accuracy matters. It’s no surprise to see DNA replication on the list, though polymerases also have other, polymerase-specific ways to increase specificity. The first energy-expending step attaches the new nucleotide (here dAMP) to the growing strand. However, it can then be cleaved off and dissociate in the proofreading step, and the progression of the polymerase to the next nucleotide is the final step.

dAMP-DNA denotes the new nucleotide after it has become part of the full strand of DNA.

Kinetic proofreading is also an important principle in eukaryotic signaling cascades, especially in the immune system to determine true-positive antigen binding and avoid off-target immune function. Both DNA damage repair mechanisms and homology-directed recombination use kinetic proofreading to ensure they’re attaching the correct pieces of DNA. The principle of putting in an exit point after an energy-expending step to increase specificity is present in myriad and various biological systems.

This work is a prime example of the last century of biology, where advances were often made by sitting down and thinking up a single beautiful experiment. J.J. Hopfield’s paper, cited above, lays out kinetic proofreading in a longer and more detailed fashion than this post and doesn’t contain a single experiment. However, it prompted several experimental papers in subsequent years that simply and beautifully demonstrated kinetic proofreading in real systems.

Sequence Structure Function
