A demonstration of Scientific Mastery
Reasoning through why prokaryotes are different from eukaryotes
In my second year of graduate school, I had to study for my qualifying exams. Quals are the moment when you prove that you know the field you’ve chosen to study. That means reading a ton of papers and memorizing important facts and the experiments that proved them. I read about sixty papers, and found it daunting to try to remember all of the important bits.
But about halfway through, I read a paper that showcased such incredible scientific reasoning that it changed how I thought about science. It’s a 2013 BMC Biology interview with Julie Theriot, a cell biologist at Stanford. The piece is titled “Why are bacteria different from eukaryotes?”, and it lays out a beautiful argument that weaves together dozens of separate pieces of evidence to claim that the filament-binding set of P-loop NTPase proteins is ultimately responsible for eukaryotic structural complexity. Along the way, it describes the different organizational philosophies of bacteria & eukaryotes, and also simply explains the underlying reasoning of biological filament organization. These are deep whys that explain why the universe is the way it is.
Working through this paper changed how I think about science. Following Dr. Theriot as she assembled a broader theory taught me what scientific mastery really is. Beyond simply accumulating facts, she built up a framework solid enough to reason toward truths larger than any single experiment. The next level of research is about gathering data across multiple lines of evidence to work towards deeper truths, and this paper is one of the purest examples of that.
Probing for deeper truths
The main way bacteria differ from eukaryotes is that they’re small and lack specialized organelles. But why? The textbook answer is that bacteria lack a cytoskeleton,1 so they can’t spatially organize. Eukaryotes, like the cells that make up your body, have actin, tubulin,2 and the whole machinery of cellular organization that lets them manage and divide big cells and spatially organize their organelles. Bacteria don’t. That’s why they’re small and simple, while eukaryotes like us get to be structurally complicated.
The problem is that this isn’t really true. In the 1990s we discovered that FtsZ, a bacterial cell division protein, is a structural homolog3 of tubulin. Then, in the early 2000s, we discovered that MreB, which governs bacterial cell shape, is a close homolog of actin. The sequences aren’t very similar, but when you solve the crystal structures and lay them on top of each other, they’re nearly superimposable. Bacteria have a cytoskeleton, but they don’t use it to spatially organize in the same way that eukaryotes do.
“Why not?” is Theriot’s primary question. If bacteria have the necessary molecular machinery to be complicated, why don’t they do anything more interesting with it?
Two Regimes of Cytoskeletal Organization
To answer that question, let’s look at what bacteria and eukaryotes use their cytoskeletons for. Bacteria make simple, elegant structures that assemble spontaneously without any special regulatory machinery (Type A). Eukaryotes build more complex, directional structures that require either localized nucleation4 or filament-motor proteins5 to organize the filaments (Type B).
Figure 2 from the cited paper. The blue circles are nucleators.
Theriot’s central hypothesis is that bacteria are confined to Type A structures because they lack those two specific molecular inventions: regulated filament nucleators which let you form an organized cytoskeleton, and linear stepping motors on filaments that let you do stuff with them.
But why? Why don’t bacteria have complex, organized cytoskeletons?
This is where things get complicated, but also brilliant. Theriot keeps drilling down, applying dozens of experimental results from decades of research to reach a real theory: bacteria and eukaryotes appear to have adopted fundamentally different strategies of spatial organization very early in evolutionary history, and that split depended on the evolution of one specific family of filament-binding proteins. It’s quite possible that was the critical adaptation that split the tree of life.
#1: Self-assembly
The first piece of evidence is counterintuitive for anybody familiar with classic biochemistry. The classical theory of protein polymerization, developed in the 1950s and 60s, says that nucleation is the rate-limiting step in filament formation, and that is exactly the step eukaryotic cells use to control their cytoskeletons. They control where a filament forms by directing the localization of special filament-nucleating proteins.
But that is emphatically not true for bacterial filaments like FtsZ & MreB, which nucleate very quickly on their own. Bacteria don’t control filaments through nucleation; they control where they’re stable, because that makes your cytoskeleton faster and more responsive, which is a better strategy for smaller cells that divide quickly. The MinD/MinE system is the clearest example of this: MinE restricts MinD to the ends of a cell, where MinD recruits the inhibitor MinC, which destabilizes FtsZ. This restricts FtsZ ring formation to the cell midpoint, and so that’s where cell division happens. If you disrupt MinD, then cells will divide in random places.6
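The logic of positioning by stability can be sketched in a few lines of Python. This is a toy model under loud assumptions, not anything taken from Theriot’s paper: the time-averaged level of the Min inhibitor is assumed to be highest at both poles and to decay exponentially toward midcell, and the function names and the `decay` parameter are hypothetical. The ring simply ends up wherever the destabilizing signal is weakest.

```python
import math


def inhibitor_profile(x, cell_length=1.0, decay=0.15):
    """Time-averaged inhibitor level at position x along the cell.

    Assumption: the inhibitor is concentrated at both poles (x = 0 and
    x = cell_length) and falls off exponentially toward midcell. The
    functional form and the `decay` length are illustrative only.
    """
    return math.exp(-x / decay) + math.exp(-(cell_length - x) / decay)


def ring_position(cell_length=1.0, n_points=101, decay=0.15):
    """Return the sampled position where the inhibitor is lowest.

    Filaments are allowed to nucleate anywhere; the ring persists where
    it is least destabilized.
    """
    xs = [cell_length * i / (n_points - 1) for i in range(n_points)]
    return min(xs, key=lambda x: inhibitor_profile(x, cell_length, decay))


if __name__ == "__main__":
    print(ring_position())      # midpoint of a unit-length cell
    print(ring_position(2.0))   # midpoint again after the cell has grown
```

Nothing in this sketch tells the filament where to start. The midpoint falls out of where the filament is allowed to persist, and it tracks the cell length automatically as the cell grows.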
These two methods are radically different. Bacteria are doing the equivalent of zoning their development, while eukaryotes have a state-owned enterprise do all of their building.
#2: Nucleators
But evolving a regulated nucleator isn’t hard, so long as you’ve already got the filament! The two main eukaryotic nucleators are Arp2/3 for actin and the γ-tubulin ring complex for microtubules, and they’re both just specialized copies of the structural subunit gene, duplicated and cut in half to seed filament growth.7 They’re simply paralogs of the filament protein, and paralogs are easy to make.8 Bacteria do a lot of gene duplication and have plenty of filament paralogs. They could evolve nucleators if they wanted to, so why don’t they?
In fact, they have a couple of times. Certain bacterial pathogens, such as Chlamydia and Vibrio cholerae, secrete factors that nucleate eukaryotic actin, hijacking the host cytoskeleton to help them infect human cells. Bacteria can do it, they just don’t do it for their own cytoskeletons. Almost every well-characterized bacterial cytoskeleton ever described is regulated by stability, not nucleation.
The conclusion here is that these are fundamentally different paradigms of organization. If you regulate by stabilization, then you cannot regulate by nucleation, and it seems bacteria almost universally prefer regulation by stabilization, because even if it can’t coordinate as much complexity, it’s faster and more responsive.
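The kinetic contrast behind “faster and more responsive” can be illustrated with a minimal nucleation-elongation model in the spirit of the classical theory. Everything here is an assumption for illustration: the rate constants, the nucleus size of three monomers, the arbitrary units, and the `seeds` parameter, which stands in for preformed filament ends supplied by a nucleator. It is a sketch, not a fit to any real filament.

```python
def half_time(k_nuc, k_elong=1.0, total=1.0, nucleus_size=3,
              seeds=0.0, dt=0.01, t_max=1000.0):
    """Time for half of the monomer pool to polymerize (simple Euler steps).

    Hypothetical parameters, arbitrary units:
      k_nuc   - spontaneous nucleation rate constant
      k_elong - elongation rate constant per filament end
      seeds   - filament ends supplied up front, e.g. by a nucleator
    Returns None if the half-way point is not reached before t_max.
    """
    filaments = seeds   # number concentration of growing filaments
    polymer = 0.0       # monomer mass already incorporated into filaments
    t = 0.0
    while t < t_max:
        free = total - polymer
        # New filaments appear at a rate that depends steeply on free monomer.
        filaments += k_nuc * free ** nucleus_size * dt
        # Existing filaments grow by adding free monomer to their ends.
        polymer += k_elong * free * filaments * dt
        t += dt
        if polymer >= total / 2:
            return t
    return None


if __name__ == "__main__":
    print(half_time(k_nuc=1.0))              # self-nucleating: short lag
    print(half_time(k_nuc=1e-4))             # nucleation-limited: long lag
    print(half_time(k_nuc=1e-4, seeds=0.1))  # same filament, seeded by a nucleator
```

In this sketch the nucleation-limited filament barely assembles on its own, so whoever controls the seeds controls where filaments appear. The self-nucleating filament assembles everywhere quickly, which leaves stability as the only thing left to regulate.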
#3: Motors
The third part of the argument goes a bit deeper into evolutionary history.
Linear stepping motors, like myosin walking on actin and kinesin walking on microtubules, are what make Type B structures useful. A cluster of motors walking toward the same end of polarized filaments9 lets you sort things spatially. Add two asters with cross-linkers, and you get a mitotic spindle.10
Bacteria have motors. The bacterial flagellar motor11 is one of the most complex molecular machines known. But bacteria lack linear stepping motors that operate on their cytoskeletal filaments. Once again, why?
The answer is cool. Myosin and kinesin both belong to the P-loop NTPase superfamily,12 a large protein family that couples nucleotide hydrolysis to mechanical work. But only eukaryotes have filament-binding versions of these proteins, and that specific adaptation appears to have arisen once, in the eukaryotic lineage. Bacteria and archaea have no processive motors that step along cytoskeletal filaments. Furthermore, the same clade gave rise to the Rho GTPases, Rab GTPases, Ran, and Arf,13 essentially the entire regulatory signaling apparatus we associate with eukaryotic cellular complexity.
The motors and the signaling machinery share a single evolutionary origin, a divergence that cascaded into almost everything that makes a eukaryote a eukaryote.
#4: The Chromosome Knows Where It Is
The fourth argument is the most speculative. If bacteria give up structural organization for a faster and simpler cytoskeleton, how do they do spatial organization?
In bacteria, transcription and translation are coupled, since the ribosome starts translating an mRNA while it’s still being synthesized from the DNA. This means the physical location of a gene on the chromosome is also where its protein is made. The chromosome sits in the cytoplasm in a well-defined, oriented arrangement,14 and it effectively functions as a spatial coordinate system. If a bacterium needs to target a protein somewhere, it already has a way to do that: just put the gene in the right place, and that’s where the protein gets made.
If you wrap that chromosome in a membrane, all of that spatial information disappears. Exported mRNAs diffuse around and get translated wherever they end up. The built-in coordinate system is gone, and you have to rebuild spatial information from scratch. That is exactly what Type B cytoskeletal structures do.
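The difference between the two regimes can be made concrete with a small sketch. The assumptions are deliberately crude and are mine, not the paper’s: the cell is a line from 0 to 1, each gene’s position on the chromosome maps directly onto a position in the cell, and an exported mRNA is translated at a uniformly random location. The function names are hypothetical.

```python
import random


def coupled_positions(gene_positions):
    """Coupled transcription-translation: each protein is made where its gene sits."""
    return list(gene_positions)


def uncoupled_positions(gene_positions, rng):
    """Exported mRNA diffuses freely: translation happens at a random spot
    that is independent of where the gene is (a crude assumption)."""
    return [rng.random() for _ in gene_positions]


def mean_offset(gene_positions, protein_positions):
    """Average distance between a gene and the place its protein is made."""
    pairs = list(zip(gene_positions, protein_positions))
    return sum(abs(g - p) for g, p in pairs) / len(pairs)


if __name__ == "__main__":
    genes = [i / 999 for i in range(1000)]  # genes spread evenly along the cell
    rng = random.Random(0)                  # fixed seed for reproducibility
    print(mean_offset(genes, coupled_positions(genes)))
    print(mean_offset(genes, uncoupled_positions(genes, rng)))
```

With coupling, the offset is zero by construction, so gene order carries spatial information for free. Without it, the offset is a sizeable fraction of the cell length, and that information has to be rebuilt some other way.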
So you have these two solutions to spatial organization. Bacteria use their genome to organize the cell, and then regulate where their cytoskeleton is stable, while eukaryotes regulate nucleation of their cytoskeleton, then use it to organize everything.
Bringing it all together
The grand answer is that the nucleus and eukaryotic cytoskeletal complexity are almost certainly cause and consequence of each other. Bacteria have a built-in spatial organizer in their chromosome, so they regulate by stabilization. They’re stuck with that framework because switching strategies would require reinventing nearly everything. Besides, it’s a simpler and faster strategy that works great for small, quickly dividing cells because the kinetics of self-assembling filaments are so much faster than those of nucleator-driven filaments.
Eukaryotes went the other direction. They wrapped the chromosome in a membrane, losing the built-in spatial organizer but gaining a new layer of gene regulation by separating transcription from translation. The P-loop NTPase family made it possible to reorganize with a cytoskeleton, and slower filament assembly isn’t a problem when your cytoskeleton is semi-permanent. Once the ancestral eukaryote had motors and nucleators, bigger cells became possible, which demanded better organization in turn. That feedback loop was probably pretty fast, with more organelles and structural complexity evolving very quickly in the last common eukaryotic ancestor before it diversified. That’s why all existing eukaryotes are already quite complex and share so many organelles with each other.
This paper builds up the case for this theory without running a single new experiment. Theriot is pulling together decades of results from different labs and finding the thread that runs through them all.
What It Taught a Fledgling Scientist
Theriot’s paper showed me that science is about reasoning from facts on a grander scale than I’d previously seen done. Doing experiments and learning things is certainly necessary; indeed, her argument only works because she knows the literature deeply enough to see patterns across multiple lines of inquiry. That allows her to bring in more and more data until she’s answered the question of why.
You learn things so that you can assemble them into an argument. This might seem obvious, but it was a ray of light for an early grad student studying for quals. Individual papers hold nuggets of truth whose ultimate purpose is to be smelted together into something grand, and reading this paper felt like being a beginner watching a master demonstrate the heights of the art.
I learned some cool stuff from this paper. The biggest thing wasn’t the classification of filament structures or the P-loop NTPase story, though those are great. The real lesson was a model of how to think: an example of a framework of knowledge deep enough to make real contact with the world and explain deep mysteries from simple facts, like the ultimate reason why prokaryotes are different from eukaryotes.
The cytoskeleton is the network of protein filaments inside a cell that provides structural support and spatial organization. Think of it as the cell's skeleton and musculature combined. In eukaryotes, the main players are actin filaments and microtubules, made from actin and tubulin, respectively. I’m ignoring intermediate filaments for this article.
Actin & tubulin are the two main structural proteins of the eukaryotic cytoskeleton. Actin filaments are flexible and used for cell movement, shape, and division. Tubulin microtubules are hollow, rigid tubes used for chromosome segregation and intracellular transport. Both are found in essentially every eukaryote on earth, but bacteria were thought to lack these kinds of proteins until relatively recently.
A structural homolog is a protein with almost the same three-dimensional fold as another, usually because they share a common evolutionary ancestor. Two proteins can be structural homologs even when their sequences have diverged significantly over billions of years, as is the case here!
Nucleation is the process by which a new protein filament starts forming. Monomers floating in solution usually don’t stick together spontaneously; there's an energetic barrier to forming the initial seed that other monomers can then grow from. Controlling where nucleation happens means controlling where filaments appear. It’s like using a seed to grow salt crystals or rock candy in a specific place.
Filament motor proteins use chemical energy to move along a filament. Myosin walks along actin; kinesin walks along microtubules. They're how cells move cargo, segregate chromosomes, and generate force. You’ve probably seen the famous YouTube video of this.
The Min system is so robust that you can port it into mammalian cells, and it still works. Rajasekaran et al. (Cell, 2023) transplanted MinDE into mammalian cells and used its self-organizing oscillations to pattern various behaviors. A neat inversion: the bacterial machinery for controlling stability in a small cell becomes a way to manufacture spatial information inside a large eukaryotic one, the very thing eukaryotes otherwise had to evolve a cytoskeleton to do.
Arp2/3 nucleates actin filaments by mimicking the end of an existing filament and giving free monomers something to add onto. The γ-tubulin ring complex does the same for microtubules. Both are, structurally speaking, modified copies of the filament protein itself that specialize in seeding growth rather than building the filament.
A paralog is a gene that arose by duplication within the same organism, and is the most common source of new genes. When a cell copies a gene and keeps both versions, the two copies can diverge and specialize. About 70% of human proteins are paralogs of another one.
Polarized filaments are filaments with directionality. Actin and microtubules are asymmetric, and the motors use this. Myosin and kinesin always walk toward the same end, so they move cargo directionally rather than randomly diffusing. Without polarity, you can’t build a motor that goes anywhere in particular.
An aster is a star-shaped array of microtubules radiating outward from a central nucleating point. Two asters connected by cross-linking proteins form a spindle, the structure that physically pulls chromosomes apart during mitotic and meiotic cell division. The spindle can self-center within a cell because growing microtubule ends push against the cell boundary, finding the midpoint through pure mechanics. It’s a wonderful example of simple underlying interactions leading to complex emergent behaviors.
Bacteria move by spinning a “flagellum,” basically a long propeller. It’s driven by a rotary motor embedded in the cell membrane, powered not by ATP directly but by the flow of protons across the membrane. It can spin at tens of thousands of RPM and reverse direction almost instantly. The motor involves around 40 different proteins assembled into an extraordinarily precise nanoscale machine. Intelligent design advocates have famously claimed it's too complex to have evolved and is therefore proof of a divine designer.
The P-loop NTPase superfamily is a large family of proteins that share a structural motif for binding and hydrolyzing ATP or GTP. The "P-loop" grabs the phosphate groups of the nucleotide. Bacteria have lots of these, but only eukaryotes have filament-binding versions. For more information, see Leipe DD, Wolf YI, Koonin EV, Aravind L (2002), "Classification and evolution of P-loop GTPases and related ATPases," J Mol Biol 317(1):41–72.
Rho, Rab, Ran, and Arf are families of proteins that switch between active and inactive states by binding GTP or GDP. They do quite different things. Rho GTPases (including RhoA, Rac, and Cdc42) are primarily about actin organization and govern cell shape, polarity, and movement by controlling where and how actin networks assemble. Rab GTPases are trafficking regulators: there are over 60 of them in humans, and each marks a specific membrane compartment, acting like an address label that ensures vesicles reach the right destination. Ran governs nuclear transport, maintaining a GTP/GDP gradient across the nuclear envelope that drives and regulates the directional import and export of proteins and RNA. Arf GTPases regulate vesicle budding, particularly at the Golgi, where they recruit the coat proteins that physically shape and pinch off transport vesicles. So the family covers cytoskeletal organization, membrane identity, nuclear-cytoplasmic communication, and vesicle formation, which is essentially the full toolkit for running a spatially complex cell. The fact that all of these trace back to the same P-loop NTPase ancestor as the cytoskeletal motors myosin and kinesin is what makes the phylogenetic argument so striking.
Originally argued in Viollier, P.H. et al. (2004) PNAS, this hasn’t aged quite as well as the rest of the paper and is true to different extents in different organisms. In C. crescentus, the chromosome is tethered and arranged linearly along the cell, with genomic position mapping onto physical position (Viollier, P.H. et al. (2004) PNAS), and RNA-processing machinery clusters at the chromosomal loci of the genes it serves (Bayas, C.A. et al. (2018) PNAS). In B. subtilis, the comE mRNA is actively localized to the cell poles where its protein is needed (dos Santos et al. 2012). And in E. coli, transcribed-and-translated loci are pulled toward the nucleoid periphery (Yang, S. et al. (2019) Nat. Comms), while modeling predicts excluded volume segregates most ribosome-loaded mRNAs to the poles, away from their genes (Castellana, M., Li, S.H.-J. & Wingreen, N.S. (2016) PNAS). But E. coli doesn’t seem to do as much subcellular genome organization as C. crescentus.


