1. The Promise of Assembly Theory
Assembly Theory (AT) is currently the hottest thing going in theoretical biology, if not in science as a whole. Its chief proponent, origin-of-life researcher Leroy Cronin, advertises it as the long-awaited materialistic resolution to the problem of complexity, and not just complexity in biology, but also in physics and cosmology — in short, complexity everywhere.
The journal Nature in 2023 propelled the theory’s standing in the scientific community with the publication of Cronin et al.’s “Assembly Theory Explains and Quantifies Selection and Evolution.” In regard to biology, Cronin sees Assembly Theory as explaining life’s origin and subsequent evolution while also decisively refuting intelligent design (ID).
Here’s a flavor of the enthusiasm surrounding Assembly Theory, as reported by the Santa Fe Institute. Consider, for instance, physicist Sara Walker, who is a Cronin collaborator, as she sings the theory’s praises:
Assembly theory provides a completely new lens for looking at physics, chemistry, and biology as different perspectives of the same underlying reality. With this theory, we can start to close the gap between reductionist physics and Darwinian evolution — it's a major step toward a fundamental theory unifying inert and living matter.
Cronin takes Walker’s paean further, touting Assembly Theory as revolutionizing all of the exact sciences:
Assembly theory provides an entirely new way to look at the matter that makes up our world, as defined not just by immutable particles but by the memory needed to build objects through selection over time. With further work, this approach has the potential to transform fields from cosmology to computer science. It represents a new frontier at the intersection of physics, chemistry, biology, and information theory.
Evolutionary biologist Michael Lachmann is by comparison more measured, though by any objective standard he is still hyperbolic:
Life is defined by an evolutionary process. To find life, we should find evolution, and luckily the process of construction of objects and molecules with and without evolution is totally different. Evolution will discover an assembly plan and then build the same object again and again, or reuse it in more complex objects. Once evolution is involved, we are dealing with the dynamics of assembly plans and not of particles.
To listen to Cronin describe Assembly Theory to YouTube’s Prof. Dave, one would think that all the splendor of scientific achievement has come to a head in this theory. All the things that could be desired for a paradigm-shifting scientific revolution are, according to Cronin, evident in Assembly Theory:
Effective Tool for Empirical Measurement: Cronin emphasizes that Assembly Theory allows for the direct measurement of the assembly index from molecules, enabling the empirical quantification of complexity.
Fruitful Experimental Framework: The theory is said to provide a framework for inspiring new experiments, particularly in understanding the processes that drive the assembly of complex structures.
Broad Explanatory Power: Assembly Theory is described as explaining how selection processes in the universe lead to the formation of complex biological and non-biological systems, making it a comprehensive theory for studying complexity, both theoretically and concretely, and in particular for detecting life.
Beyond Mere Algorithms: While algorithms may handle data and processing, Assembly Theory is said to provide a theoretical basis for why and how complex structures emerge and evolve, integrating principles of selection and causation that are not addressed by algorithmic information theory alone.
Potential for New Discoveries: Cronin suggests that the theory has already led to significant discoveries, as in quantifying selection and understanding prebiotic chemistry, and will continue to uncover new insights into the mechanisms of complexity and evolution.
2. The Reality of Assembly Theory
When I first encountered Assembly Theory, I was intrigued by its promise to provide new insights into how complex systems might form and be understood. Although I’m enough of a Platonist and Aristotelian to think that nature is more than an assembly of parts, I recognize that assembly plays a critical role in the development of complex systems. This applies to both human-made artifacts and to living forms, regardless of whether their ultimate cause is intelligent or unintelligent.
But when I read the Assembly Theory literature and studied the details of the theory, I was frankly shocked to see how devoid it is of substance and insight. The problem is not that it’s wrong but that it is so simplistic and limited in scope that it doesn’t matter what it gets right. Its inadequacies in biology were especially hard to square with the hype surrounding it.
Anyone who has worked in information and complexity theory will find the techniques that Assembly Theory uses familiar, even if not exactly the same as the theory it most readily evokes, namely, algorithmic information theory. The mathematics that Assembly Theory uses is legitimate as far as it goes. It’s just that it doesn’t go nearly as far as its proponents claim. And by that I don’t just mean that it makes good progress, taking us half the distance we need to go even if not fully able to take us past the finish line. I mean something more like that it promises to get us from New York to LA, and it only gets us as far as Newark.
The rhetoric surrounding Assembly Theory has been over the top and triumphalist. Yet the reality of Assembly Theory falls so far short of its promise that a vigorous rebuttal is warranted. I was tempted to write such a rebuttal as a full-fledged scientific review article for the journal Bio-Complexity. But even though Assembly Theory is a formal mathematical theory, what the mathematics in it is doing can be explained quite easily. Moreover, such an explanation makes clear even to the educated lay reader that Assembly Theory cannot be up to the job that Leroy Cronin and his colleagues are claiming it can do.
The main fault of Assembly Theory is this: Its model for the assembly of complex systems is so simple and stylized that it cannot reasonably map onto the complex systems in our experience, be these biological or artifactual. And if it can’t handle these, forget about the complex systems of physics and cosmology, such as the structure of galaxies. To see this, we need to examine the nuts and bolts of the theory, which we do in the next three sections.
3. The Origin of Assembly Theory
To understand why Assembly Theory’s model for the assembly of complex systems can’t bear the weight that Cronin and his colleagues want to put on it, let’s return to the origin of Assembly Theory when it went under a different name and when its aspirations were more modest. In 2017, Cronin and colleagues published in the Philosophical Transactions of the Royal Society (A) a paper titled “A Probabilistic Framework for Identifying Biosignatures Using Pathway Complexity.”
The point of that paper was to lay out a measure that could distinguish between living and non-living assemblages, with artifacts produced by living things also included under living assemblages. The idea behind this pathway complexity measure was that assemblages composed of elemental constituents can be built up recursively into ever more complex assemblages. The word recursive here refers to a technique used in computer science to solve a problem by having a function repeatedly call on itself to solve simpler subproblems.
To take a very simple example of pathway complexity, consider sequences of bits. The elemental constituents here are 0 and 1. The pathway complexity for a sequence of bits is then the minimum number of steps needed to produce the given sequence, where each step may add not just elemental constituents (in this case 0 and 1) but also intermediary subsequences produced along the way. The recursivity of this scheme consists in the ability to reuse subsequences already produced as though they were elemental constituents, adding them to a growing list of constituent items that can be used and reused in producing a final assemblage.
Consider, for instance, the sequence of 16 bits 0000000000000000. This sequence can, according to the scheme just described, be produced in 4 steps:
0,1 | 00
0,1,00 | 0000
0,1,00,0000 | 00000000
0,1,00,0000,00000000 | 0000000000000000
In this way of representing the pathway complexity of 0000000000000000, what’s to the left of the vertical bar (i.e., |) is the growing list of items that can be used and reused to produce assemblages, and what’s on the right are the actual assemblages produced. In each case, what appears to the right of the vertical bar goes to the left as part of the growing list of constituents available for the next step along the pathway.
In this example, the 16 bits 0000000000000000 are produced in 4 steps. It’s no accident that log2(16) = 4 (i.e., the logarithm to the base 2 of 16 is 4). In general, for any assemblage with n elementary constituents (allowing repetitions), the pathway complexity under Cronin’s scheme can be no smaller than log2(n), rounded up to the nearest whole number when n is not a power of 2, and this minimum is attained by maximally repetitive assemblages like the one just given.
Conversely, it’s clear that the pathway complexity of an assemblage of n elementary constituents can be at most n–1. That’s because elemental constituents can always be added one at a time. Consider, for instance, the sequence of 8 bits 01000110. This sequence could be built as follows:
0,1 | 01
0,1,01 | 010
0,1,01,010 | 0100
0,1,01,010,0100 | 01000
0,1,01,010,0100,01000 | 010001
0,1,01,010,0100,01000,010001 | 0100011
0,1,01,010,0100,01000,010001,0100011 | 01000110
The number n here is 8, and there are n–1 = 8–1 = 7 steps in the path to this 8-bit sequence.
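To make these two constructions concrete, here is a minimal Python sketch (my own illustration, not code from Cronin et al., with function names invented for the purpose) that replays them. It prints each step in the same pool-and-product format used above and returns the step count: 4 steps for the 16-bit run of zeros, matching log2(16), and 7 steps for 01000110, matching n–1.

```python
def doubling_path(target: str) -> int:
    """Build a run of a single repeated bit (length a power of 2) by always
    joining the largest constituent to itself: the log2(n) strategy."""
    pool, current, steps = ["0", "1"], target[0], 0
    while current != target:
        produced = current + current          # join the largest piece to itself
        steps += 1
        print(",".join(pool), "|", produced)
        pool.append(produced)
        current = produced
    return steps

def incremental_path(target: str) -> int:
    """Build any bit sequence by appending one elemental constituent at a
    time: the n - 1 strategy."""
    pool, current, steps = ["0", "1"], target[0], 0
    while current != target:
        produced = current + target[len(current)]   # append the next bit
        steps += 1
        print(",".join(pool), "|", produced)
        pool.append(produced)
        current = produced
    return steps

print(doubling_path("0" * 16), "steps")       # reproduces the 4-step pathway above
print(incremental_path("01000110"), "steps")  # reproduces the 7-step pathway above
```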
One complicating factor in Cronin’s pathway complexity measure is that intermediate steps can be accomplished independently, so that the assemblage being built does not need to grow strictly incrementally, as in the two previous examples. For instance, the 8-bit sequence just considered (i.e., 01000110) could also have been formed as follows:
1. 0,1 | 01
2. 0,1,01 | 00
3. 0,1,01,00 | 0100 [from 1. and 2.]
4. 0,1,01,00,0100 | 01000
5. 0,1,01,00,0100,01000 | 11
6. 0,1,01,00,0100,01000,11 | 0100011 [from 4. and 5.]
7. 0,1,01,00,0100,01000,11,0100011 | 01000110
The path here to the target sequence 01000110 is different from the one given just before, but it also comes to n–1 = 8–1 = 7 steps.
For a given assemblage A, let’s denote by |A| the total number of elemental constituents in A, including repetitions. In that case, the pathway complexity of A, denoted by a, is defined as the minimum number of steps needed to produce A. As we saw in the case of bitstrings, a is always between log2(|A|) and |A|–1. This result holds in general.
According to Cronin, the biologically interesting assemblages with respect to this pathway complexity measure are those sitting in a sweet spot between log2(|A|) and |A|–1 that hugs neither of these extremes. Thus, if a is close to log2(|A|), that suggests, like the sequence of repeated bits 0000000000000000, the assemblage is too simple to be biologically significant. On the other hand, if a is close to |A|–1, that suggests the assemblage is too complicated/random to be biologically significant.
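For short bit sequences, the minimum number of steps (the quantity a) can be found by brute force. The following sketch is my own illustration, not code from the Assembly Theory literature; it models a step as the concatenation of any two items already in the pool, restricts intermediates to substrings of the target (in a shortest path every constructed piece ends up as a contiguous block of the final sequence), and uses iterative deepening to find the minimum. For the examples above it confirms that a falls between log2(|A|), rounded up, and |A|–1; note that the 7-step path shown earlier illustrates the upper bound, and the search may well find a shorter path for the same string.

```python
import math
from itertools import product

def assembly_index(target: str) -> int:
    """Minimum number of join (concatenation) steps needed to build `target`
    from its single-character constituents, with every intermediate product
    reusable in later joins."""
    subs = {target[i:j] for i in range(len(target))
            for j in range(i + 1, len(target) + 1)}
    start = frozenset(target)          # elemental constituents appearing in the target
    failed = {}                        # pool -> a step budget already shown to fail

    def reachable(pool: frozenset, steps_left: int) -> bool:
        if target in pool:
            return True
        if steps_left == 0 or failed.get(pool, -1) >= steps_left:
            return False
        # Each step can at most double the length of the longest constituent.
        if max(len(x) for x in pool) * 2 ** steps_left < len(target):
            return False
        for x, y in product(pool, repeat=2):
            joined = x + y
            if joined in subs and joined not in pool:
                if reachable(pool | {joined}, steps_left - 1):
                    return True
        failed[pool] = max(failed.get(pool, -1), steps_left)
        return False

    steps = 0
    while not reachable(start, steps):     # iterative deepening on the step budget
        steps += 1
    return steps

for s in ["0000000000000000", "01000110"]:
    n = len(s)
    print(s, "a =", assembly_index(s),
          "bounds:", math.ceil(math.log2(n)), "to", n - 1)
```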
This way of thinking about pathway complexity seems reasonable, evoking other measures of complexity that have been used to assess biological complexity, not least specified complexity. Even apart from my account of specified complexity as given in the second edition of The Design Inference and summarized here and here, when Leslie Orgel, the originator of this concept, introduced it in his 1973 book The Origins of Life (whose focus on astrobiology parallels Cronin’s), he considered three types of situations:
It is possible to make a more fundamental distinction between living and nonliving things by examining their molecular structure and molecular behavior. In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple, well-specified structures because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures which are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. (p. 189)
Orgel elaborated on this intuitive way of understanding different forms of complexity by drawing on information theory:
These vague ideas can be made more precise by introducing the idea of information. Roughly speaking, the information content of a structure is the minimum number of instructions needed to specify the structure. One can see intuitively that many instructions are needed to specify a complex structure. On the other hand, a simple repeating structure can be specified in rather few instructions. (p. 190)
Orgel’s calculation to identify the “minimum number of instructions to specify a complex structure” parallels Cronin’s calculation to identify the pathway complexity of a complex structure. Specifically, in his 2017 paper (p. 3), Cronin defined pathway complexity as the number of steps for “the shortest pathway to assemble a given object by allowing the object to be dissected into a set of basic building units and rebuilding the object using those units.” The measures differ in details, but the underlying rationale is the same.
4. From Pathway Complexity to Assembly Index
Cronin might have left matters where they were in his 2017 paper, using pathway complexity to identify biosignatures. In that case, his complexity measure would have been one more in the stock of complexity measures used by information theorists and theoretical biologists to serve as mathematical/probabilistic indicia for distinguishing life from non-life. Interesting, perhaps. Significant, perhaps. But nothing groundbreaking.
Given Cronin’s proposal to treat pathway complexity as a biosignature, the next order of business would have been a reliability analysis. Such an analysis would determine what measured values of pathway complexity reliably correlate with life and how prone the measure is to false positives (seems like life but is non-life) and false negatives (seems like non-life but is life). Such an analysis has yet to be performed. If it could be performed, it might find that certain values of pathway complexity are reliably correlated with life and thus identify true biosignatures. That would be interesting. But without such an analysis, we lack adequate independent evidence for pathway complexity to be diagnostic of biosignatures.
Considered by itself, irrespective of how its values are interpreted, pathway complexity performs work that in statistics is characterized as data reduction. Most real-world assemblages are complicated, whether they be living or non-living. Pathway complexity abstracts certain features from those assemblages and assigns them a number. In the 2017 article, this number was called pathway complexity. In Cronin’s 2023 article in Nature titled “Assembly Theory Explains and Quantifies Selection and Evolution,” it was redubbed an assembly index.
This change in terminology is not innocent. The idea of pathway complexity, in line with statistical data reduction, suggests that this measure may omit quite a bit of the reality it is trying to measure. Consider, for instance, a measure of pathway complexity not for assemblages of items composed of elemental constituents, but for travelers driving by car. One such measure could simply count the number of right and left turns that the traveler needs to make. Such a measure of pathway complexity would capture something of the complexity of the route taken by the traveler. But it would miss other forms of complexity, such as needing to navigate hairpin turns on West Virginia mountain roads, to say nothing of the sheer distance that must be traveled. Any valid measure of pathway complexity, whether for chemists using Assembly Theory or for travelers by car, will capture some aspects of the effort and cost of traversing a path but also omit other aspects.
Yet to redub pathway complexity as an assembly index suggests that this measure is fully capturing what it takes to produce a given assemblage. To take another example from statistics, given a set of n real-valued data points drawn from a normal distribution with known variance, the sample mean (i.e., the sum of the data points divided by n) captures everything in those data regarding the population mean. Or as statisticians would say, the sample mean is a sufficient statistic for the population mean. The data reduction inherent in calculating the mean is extensive. It requires keeping track of a lot less than those original n data points. Yet for the purposes of the statistical analysis, nothing of consequence is lost by going with the mean.
The same cannot be said for the assembly index. I’ll detail its failure to adequately represent the assembly of complex items in the next section. But for the remainder of this section, I’ll simply show that Cronin and his colleagues are overselling the assembly index, ascribing to it powers that it does not possess. Essential to Cronin’s project is what he calls the assembly equation, which in the 2023 Nature article (p. 323) he characterizes as follows (note that I use ^ for exponentiation, _ for subscripts, * for multiplication, and / for division):
We define assembly as the total amount of selection necessary to produce an ensemble of observed objects, quantified using equation (1):
A = Sum(i=1 to N, (e^a_i)*(n_i – 1)/N_T) (1)
where A is the assembly of the ensemble, a_i is the assembly index of object i, n_i is its copy number, N is the total number of unique objects, e is Euler’s number and N_T is the total number of objects in the ensemble. Normalizing by the number of objects in the ensemble allows assembly to be compared between ensembles with different numbers of objects.
In the previous section, we considered a given complex item A having a pathway complexity a. In the assembly equation, the focus is not on an individual complex item but on an ensemble of such items. We should therefore now think of an ensemble of N distinct items A_i for i running from 1 to N, with A now being the entire ensemble of the items A_i, which can be represented set theoretically as A = {A_1, A_2, …, A_N}. Note that each A_i represents a unique item type, each being repeated with n_i copies (≥ 1). Moreover, the total number of items in the ensemble A, including all copies, is then N_T = Sum(i=1 to N, n_i).
Certain things become immediately evident from the assembly equation given in the above quote. If each item type occurs only once, then n_i – 1 = 0 for all i, entailing that A = 0. So in this equation, if A is going to be strictly greater than 0, items of the same type need to be repeated. Moreover, A will grow large to the degree that items are repeated and the assembly index for those items is large. That’s because each term (e^a_i)*(n_i – 1)/N_T increases exponentially with the assembly index a_i (the index is given still greater weight because it is exponentiated via e^a_i) and is scaled by the weighting factor (n_i – 1)/N_T, which is large when the item type A_i is heavily repeated relative to the total number of items N_T. In effect, the assembly equation provides a normalized composite of assembly indexes, weighted by the number of items of the same type answering to each index.
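As a sanity check on this reading of equation (1), here is a small sketch (my own, with invented assembly indexes and copy numbers used purely for illustration) that computes A for toy ensembles directly from the published formula.

```python
import math

def ensemble_assembly(objects):
    """Equation (1): A = sum over unique object types of e^(a_i) * (n_i - 1) / N_T,
    where a_i is the type's assembly index, n_i its copy number, and N_T the
    total number of objects in the ensemble, all copies included."""
    n_total = sum(n for _, n in objects)
    return sum(math.exp(a) * (n - 1) / n_total for a, n in objects)

# Each pair is (assembly index a_i, copy number n_i); the numbers are made up.
print(ensemble_assembly([(5, 1), (8, 1), (12, 1)]))     # every type occurs once, so A = 0
print(ensemble_assembly([(5, 10), (8, 10), (12, 10)]))  # repetition makes A positive
print(ensemble_assembly([(5, 10), (8, 10), (12, 30)]))  # repeating the high-index type dominates A
```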
In plain English, for the value of the assembly equation to become large, there must be items with large assembly index and these must be repeated often. This characterization of the assembly equation, however, immediately raises certain troubling questions:
1. How was the ensemble of items that forms the basis for this equation identified in the first place? Arbitrariness in the choice of ensemble seems baked into the assembly equation. Cronin will refer to the assembly equation as “capturing the degree of causation required to produce a given ensemble of objects.” But that just commits the fallacy of turning a problem into its own solution. The choice of ensemble is itself left unexplained, and the assembly equation is simply presupposed as giving causal/explanatory insights rather than being shown on independent evidential grounds to do so (compare point 4. below).
2. Where in this equation is the bridge to biology or to other exact sciences? Where is the actual empirical science? The formalism as such is purely mathematical. Presupposed in this equation is that the assembly indexes calculated in it adequately capture how complicated it is to build the real physical items in question. Yet Cronin and his colleagues have nowhere established any actual bridges here. They may think they have, but as we’ll see in the next section, the pathway complexity measure on which their program depends is deeply problematic when interpreted as an assembly index. It cannot adequately capture the actual complexities that arise in the building of things.
3. What do sheer numbers of repetitions of item types have to do with selection and evolution? If some organism has a selective advantage, it will (tend to) replicate. But the scale of replication will differ vastly between organisms (compare mammals to insects to bacteria). Even if the ensemble in question is judiciously chosen, why think it has anything to do with natural selection? The assembly equation applies equally to technological selection, where ideas and items get replicated for functional reasons answering to human purposes (compare TRIZ, also known as the Theory of Inventive Problem Solving). And it applies to economic selection, where ideas and items get replicated because consumers en masse buy the items (as in Harvard business professor Carliss Baldwin’s work on design rules and the power of modularity—more on this in the appendix to this paper).
4. Finally, given the title of Cronin’s 2023 Nature article “Assembly Theory Explains and Quantifies Selection and Evolution,” in what sense does his so-called theory do explanatory work? In fact, there is no theory here in the traditional sense of the term, namely, in the sense of a conceptual framework that offers a causal account (explanation) of the thing to be explained. Indeed, what is the thing to be explained here? Presumably, it is particular items with high but not too high assembly index (pathway complexity) that are oft repeated in some ensemble. Yes, the assembly equation puts a number on this. But why should we think this number explains anything? It does provide a quantitative description, but simply assigning numbers to things does not an explanation make. I can assign direct mileage distances between towns (as the crow flies), but that won’t say anything about how difficult it will be to navigate the roads between those towns, or even whether there are any roads between them.
5. Assembly Theory’s Anisomorphism Problem
Anisomorphism is a term used differently in biology and mathematics. In biology, it refers to differences in anatomical structures of two species that perform similar functions but have evolved independently. For instance, the wings of birds and insects are anisomorphic because, despite serving the same function (flight), they are structurally different and evolved through different evolutionary pathways.
My interest in the term anisomorphism, however, is mathematical. In mathematics, it describes a lack of structural similarity between two mathematical objects. In contrast to isomorphism, where two structures can be mapped onto each other in a way that preserves their operations and relations, anisomorphism indicates that no such mapping is possible.
To appreciate the extent of Assembly Theory’s failure requires understanding the anisomorphism problem (understood mathematically) confronting its pathway complexity measure (redubbed assembly index). This measure, as a numerical description of the complexity of multipart items built from elemental constituents, applies legitimately in only a few limited and artificial cases. But in the vast majority of the cases where we would want to understand the assembly of real-world items, it fails utterly.
If pathway complexity is going to capture the step-by-step formational complexity of items under assembly, then there must be an isomorphism mapping pathway complexity to the complex item being assembled. In particular, given an abstract description of elemental constituents and a shortest path consistent with pathway complexity, it should be possible to describe the actual formation of the complex item on which these mathematical tools of assembly theory are being used.
Simply put, fill in the elemental constituents and a shortest path on the mathematical side, and it should be possible to build the item in question on the reality side. As with CAD/CAM or a blueprint, the mathematics of Assembly Theory should characterize how a complex item can be built without omitting any essential feature of the item. And yet that is exactly what Assembly Theory doesn’t do, except in the very limited circumstance where items can be characterized by undirected graphs.
As it is, Assembly Theory can’t even properly handle bitstrings, treating a given bitstring and the same bitstring in reverse order as identical. Worse yet for Assembly Theory, the construction of assemblages composed of Lego pieces cannot be properly accounted for in terms of pathway complexity/assembly indexes. This is ironic because Lego should, if anything, provide a proof of concept for Assembly Theory. Indeed, the announcement by the Santa Fe Institute on October 10, 2023 that Assembly Theory is poised to revolutionize biology and science as a whole had, as its featured image, a Lego assemblage (for copyright reasons, I’m not reproducing it in this paper, but readers can view it here).
The chief problem with using pathway complexity (aka assembly index) to characterize assembly is that its mathematical formalism is too poor and weak to handle the job. To see this, we need to understand that pathway complexity is a complexity measure applied to undirected graphs and entails a particular formational history of such graphs. The reference here to graphs is in the mathematical sense of graph theory. From the vantage of graph theory, here is how items are assembled. This is the key example in the 2017 Cronin et al. paper, appearing on p. 5. For copyright reasons I’ve redone it, omitting extraneous repetitions of the constituents:
[Figure: a redrawn version of the assembly pathway from Cronin et al. 2017, p. 5, in which circles and squares are joined step by step into a six-item assemblage.]
Let’s walk through this example. We start with two elemental constituents, a dark circle and a light square. Because the assemblages formed in Assembly Theory are undirected graphs, the first step is to take two of the elemental constituents and connect them, forming an undirected graph. Thus we might have taken two circles or two squares or, as we did, a circle and square, and then drawn a line to connect them. Because these are undirected graphs, it doesn’t matter whether we put the circle on the left and the square on the right or the other way around. As we note with the equality sign in this figure, reversing the order gives an identical undirected graph. For all undirected graphs care, the circle could be in Timbuktu and the square in Tripoli.
Once the circle and square are connected, this two-item assemblage gets added to our (growing list of) constituents. Now at step 2, we have the elemental constituents (the circle and the square) as well as a non-elemental constituent (the circle connected to the square). At step 2, we decide simply to connect the circle to another instance of the circle. That takes us to step 3, where the circle connected to another circle is now added to the constituents.
At the next step, step 3, the circle connected to the square is duplicated, and now the circle of one copy is connected to the square of the other, and the square of one to the circle of the other. Notice that at each step, two items are drawn from the constituents formed so far, and they are joined by connecting at least one elemental item of one to an elemental item of the other. No elemental item from one constituent is connected to two elemental items of another (though the theory of undirected graphs would allow that).
We now add the assemblage formed at step 3 to our growing list of constituents, and then in step 4, we join the constituent consisting of two squares and two circles to the assemblage consisting of two circles by connecting a square of one to a circle of the other. Note that this assemblage formed at step 4 doesn’t care which square of the 4-item subassemblage is connected to which circle of the 2-item subassemblage. They’re all equivalent.
Simply eyeballing this process of assembly makes clear how limited it is. Items are connected without any attention to order. Thus at step 2, we could have formed the assemblage
[Figure: a chain of circle, square, circle, square]
But because this is an undirected graph, it is equivalent to
[Figure: the same chain read in reverse order: square, circle, square, circle]
If we now think of circles as 0 and squares as 1, then the first of these would look like 0101 and the second like 1010. But that’s just a matter of appearance. In fact, these two undirected graphs would be equivalent. As a consequence, Assembly Theory cannot adequately model bitstrings or the formation of words in general. For instance, Cronin et al. use Roman letters as elemental constituents to characterize the formation of words, as in the 2017 paper (p. 4), where from the elemental constituents B, A, and N, the word “BANANA” is assembled. In fact, this string of letters is, within Assembly Theory, equivalent to “ANANAB.”
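This equivalence is easy to confirm computationally. The sketch below is my own illustration using the networkx library: it encodes a word the way the undirected-graph picture does, as letter-labeled nodes joined by undirected edges, and checks that a word and its reversal come out as isomorphic graphs.

```python
import networkx as nx
from networkx.algorithms.isomorphism import categorical_node_match

def word_as_undirected_graph(word: str) -> nx.Graph:
    """Represent a word as an undirected path graph whose nodes carry the letters."""
    g = nx.Graph()
    for i, ch in enumerate(word):
        g.add_node(i, label=ch)
    for i in range(len(word) - 1):
        g.add_edge(i, i + 1)          # adjacency only; no left-to-right order
    return g

same_label = categorical_node_match("label", None)
for a, b in [("0101", "1010"), ("BANANA", "ANANAB")]:
    ga, gb = word_as_undirected_graph(a), word_as_undirected_graph(b)
    print(a, "vs", b, "isomorphic:", nx.is_isomorphic(ga, gb, node_match=same_label))
```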
This example clearly underscores an anisomorphism problem in Assembly Theory between its representation of words via undirected graphs and the way we ordinarily represent words. Now it may seem that the problem here is minor, as in we simply need to keep track of whether we’re reading a string of characters front to back or back to front. But the problem is much more far-reaching. To see this, let’s turn to Lego.
A standard Lego brick consists of several key parts:
Studs: The small, cylindrical protrusions on the top of the brick. Studs allow bricks to interlock with other bricks.
Tubes: Hollow cylinders on the underside of the brick that align with the studs of other bricks. They provide stability and allow for a secure connection between pieces.
Top Surface: The flat surface of the brick where the studs are located. This is the side that is visible when the brick is connected to others.
Bottom Surface: The underside of the brick that contains the tubes. This part interlocks with the studs of other bricks.
Walls: The vertical sides of the brick that connect the top and bottom surfaces. These walls provide the structural shape of the brick.
Edges: The corners of the brick where the walls meet. In traditional bricks, these edges are sharp and at right angles.
Suppose now that we take two standard 8-stud Lego bricks, one red and the other green, treating these as elemental constituents in a growing assemblage, as urged by Assembly Theory. Assembly Theory will simply indicate a joining of two such bricks, in this case a red and a green one. Anybody who has worked with Lego bricks realizes that there are many ways to join these two bricks. Let’s imagine that the red brick is at the bottom and the green brick at the top. We could have the green brick latch onto one, two, three, four, six, or eight studs, and in some cases we can do this in multiple non-equivalent ways. Thus we could have the green brick cover the two red studs at one short end of the red brick, with the green brick oriented either parallel or perpendicular to the red brick. Or we could have the green brick cover two adjacent red studs anywhere along the long side of the red brick. Moreover, for any configuration that puts the green brick atop the red brick, there would be a geometrically equivalent configuration putting the red brick atop the green brick.
I’m counting dozens of possibilities here. In fact, there are infinitely many, because attaching the bricks at a single corner stud (corner to corner) allows infinitely many angles between the two bricks (a single stud can’t lock a Lego piece into place but lets it swivel). Just two 8-stud Lego bricks therefore allow for infinitely many configurations. And yet, because it reconceives all assembly in terms of undirected graphs, Assembly Theory must treat all these possibilities as equivalent. Moreover, whatever configuration of red and green 8-stud Lego bricks one chooses, adding another brick to the assembly process will only further highlight the anisomorphism here, which is to say the lack of correspondence between the actual configuration of Lego bricks and the undirected graph on the Assembly Theory side that’s supposed to represent it.
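To put a rough number on this, here is a sketch, my own back-of-the-envelope enumeration rather than anything from the Assembly Theory literature, that counts the axis-aligned ways one 2x4 brick can sit on top of another so that at least one stud engages. It ignores symmetry reduction and the continuum of swivel angles available at single-stud attachments, so it understates the true variety; even so, the count runs to dozens of placements, every one of which collapses to the same single edge in an undirected-graph representation.

```python
from itertools import product

def studs_engaged(dx, dy, top_w, top_d, bottom_w=4, bottom_d=2):
    """Number of studs of a 4x2 bottom brick covered by a top brick of footprint
    top_w x top_d whose corner stud is offset by (dx, dy) on the stud grid."""
    overlap_x = max(0, min(bottom_w, dx + top_w) - max(0, dx))
    overlap_y = max(0, min(bottom_d, dy + top_d) - max(0, dy))
    return overlap_x * overlap_y

placements = []
for top_w, top_d in [(4, 2), (2, 4)]:   # same orientation, or rotated 90 degrees
    for dx, dy in product(range(-top_w + 1, 4), range(-top_d + 1, 2)):
        studs = studs_engaged(dx, dy, top_w, top_d)
        if studs >= 1:
            placements.append((top_w, top_d, dx, dy, studs))

print(len(placements), "axis-aligned placements of one 2x4 brick on another")
print("possible numbers of engaged studs:", sorted({p[4] for p in placements}))
```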
6. If the Theory Can’t Handle Lego …
It should by now be clear that if Assembly Theory cannot handle Lego bricks, then there’s virtually nothing it can handle. But let me press this point to banish any doubt. Assembly Theory would work if elemental constituents always locked into place in one and only one way and if composite constituents (built along the way in AT’s assembly process) always connected only via their elemental constituents (with their unique lock fit). In that case, the undirected graph approach inherent in Assembly Theory would map isomorphically onto the thing actually assembled.
But what in the real world fits that bill? What’s completely absent from Assembly Theory is that interfaces between constituents in an assemblage can, and in practice do, allow for multiple degrees of freedom. Cronin is a chemist working on the origin of life, so he can appreciate this point. Consider the bonds between amino acids in proteins. L-amino acids, which appear in living forms, can connect to each other in various ways, through both peptide bonds and non-peptide interactions.
A peptide bond forms between the carboxyl group of one amino acid and the amino group of another, creating the backbone of polypeptide chains in proteins. But additionally, amino acids can form disulfide bonds between cysteine residues, ionic bonds between oppositely charged side chains, hydrogen bonds, hydrophobic interactions among nonpolar side chains, and van der Waals forces due to transient dipole moments. Given only two random L-amino acids in a non-enzymatic environment, non-peptide bonds are more likely to form spontaneously than peptide bonds because they don’t require the high activation energy or specific catalytic conditions necessary for peptide bond formation.
This interface variability for the bonds between amino acids means that Assembly Theory, which requires single-lock constituents to adequately model complexity, cannot account for the formation of proteins. Similar considerations apply to nucleic acid chains, carbohydrates, and lipids. Consequently, Assembly Theory cannot handle the basic building blocks of life. Assembly Theory is therefore irrelevant to the origin of life as well as to its subsequent molecular evolution.
A deeper theoretical issue also confronts Assembly Theory. Let’s imagine that, in the context of biology, instead of the basic building blocks being amino acids and other molecules thought to inhabit a prebiotic soup, we decided to take as our basic building blocks the atoms of the periodic table. In other words, we take these to be our elemental constituents. Even if Assembly Theory could account for how these atoms build up to the basic molecules of life, it would still come to a dead end in trying to explain life.
The problem is that when atoms combine to form molecules, novel diverse interfaces emerge between the molecules, implying multiple degrees of freedom for the connections between molecules, which in turn invalidates Assembly Theory’s relevance here because it has no resources for handling interface variability. Assembly Theory attempts to build assemblages item by item, assuming the relations among those items will take care of themselves. But with interface variability, those relations can’t take care of themselves because the theory is formulated in terms of undirected graphs, which allow for no interface variability.
This interface variability problem also manifests itself in Assembly Theory’s inability to deal with nested hierarchies. At each step in the assembly of an item, Assembly Theory requires connecting an elemental constituent to an elemental constituent. But in a nested hierarchy, composite constituents at various hierarchical levels can be connected across those levels without going back to the bare elemental level that forms the basis of the hierarchy. For instance, amino acids bond to other amino acids on a molecular level and not by matching up atom by atom on the purely atomic level, as required by Assembly Theory (assuming atoms are taken to be the elemental constituents).
At this point in the argument, the ardent defender of Assembly Theory might want to say that even though Assembly Theory loses a lot of information about a complex item it is trying to explain, it is still providing some insight into that item. Granted, for any complex item, there can be a path of undirected graphs leading to a final graph that in some loose sense represents the item. But that hardly seems impressive.
To see just how unimpressive this is, here’s a two-step process by which a red, a green, and a blue 8-stud Lego brick, together taken as elemental constituents, can be configured to form a 3-part Lego brick item:
RED—GREEN [2-part item]
RED—GREEN—BLUE [3-part item]
What insight does this assembly process provide into any actual configuration of three 8-stud colored Lego bricks? In fact, it provides no insight at all. We know nothing here about how the studs of one brick fit into the tubes of another. All specificity about the actual Lego configuration is lost.
Ultimately, Assembly Theory’s problem is that it passes itself off as a theory of assembly when in fact at best it offers a crude redundancy measure. At its heart, it keeps track of repetitions of constituents (elemental and composite) in a step-by-step binary process of augmentation. In other words, at each step, Assembly Theory brings together a pair of constituents previously identified or formed; and it then expedites the process of producing a final item by recursively taking advantage of repetitions among the growing list of constituents.
Based on an undirected graph model, however, Assembly Theory cannot make sense of interface variability or nested hierarchies, nor can it provide faithful representations of item configurations, to say nothing of their actual assembly. The paths for which Assembly Theory calculates pathway complexity (aka an assembly index) at best represent a coarse organizational grouping of elemental constituents (certainly not enough to reconstruct an actual item based on the Assembly Theory path). The path is therefore not an actual path of construction but merely a sequential way of characterizing the organization of the complex item in question, and usually with a massive loss of detail unless the item conforms to the structure of an undirected graph, which it almost never does.
There are many more problems with Assembly Theory. It is purely augmentative, though in practice much assembly is also subtractive, where things like scaffolds need to be removed to bring about the final assembly. For instance, in human fetal development, the separation of digits on the hand occurs through apoptosis, where cells in the regions between the developing fingers undergo programmed cell death, ensuring the digits are properly sculpted and separated. Nothing like this is present in, or can be present in, Assembly Theory.
In sum, even though further criticisms against Assembly Theory can be made, the train of argument to this point in this paper suffices to show that Assembly Theory stands defeated on its own terms.
Appendix: A Modular Theory of Assembly
I might have ended this paper with the last section, thereby offering a pure critique of Assembly Theory with no alternative to it. Nonetheless, despite its fatal defects, Assembly Theory does raise the prospect of what a successful theory of assembly might look like. As it is, one promising approach to such a theory exists in the work of Harvard Business School professors Carliss Baldwin and Kim Clark. I’ll briefly summarize their work, and then relate it to Cronin’s Assembly Theory.
In their book Design Rules: The Power of Modularity (MIT Press, 2000), Baldwin and Clark identify six modular operators that play a crucial role in the design and evolution of complex systems. These modular operators are:
Splitting: Dividing a system or component into smaller, more manageable modules. This allows for specialization and independent development of different parts of the system.
Substituting: Replacing one module with another that performs the same function. This enables flexibility and the ability to upgrade or change parts of the system without affecting the whole.
Augmenting: Adding new modules to an existing system to enhance its capabilities or performance. This allows for the incremental improvement of the system.
Excluding: Removing a module from the system. This can simplify the system or eliminate unnecessary components.
Inverting: Changing the relationship between modules, often by switching the roles of components or altering the direction of dependencies. This can lead to more efficient designs or new functionalities.
Porting: Transferring a module from one system to another. This allows for the reuse of existing components in different contexts or environments, promoting efficiency and consistency.
These modular operators provide a framework for understanding how complex systems can be designed, managed, and evolved over time. By systematically applying these operators, designers and engineers can create more flexible, adaptable, and efficient systems. Baldwin and Clark’s work falls under design theory, and though concerned with human design from conception to assembly to economic impact, they draw inspiration from biology and especially from John Holland’s work on complex adaptive systems. Here is an elaboration of these six operators, explaining and giving a concrete example for each.
Splitting involves dividing a complex system into smaller, more manageable modules. This process enables different parts of the system to be developed, maintained, and updated independently. By isolating specific functions or components, splitting can enhance specialization, as different teams or experts can focus on perfecting individual modules. This modularization also allows for parallel development processes, reducing the overall time needed to bring a product to market. Furthermore, splitting can improve system robustness by ensuring that issues in one module do not necessarily affect the entire system.
Example: Consider the development of a modern smartphone. The smartphone can be split into several distinct modules such as the display, battery, camera, processor, and software. Each of these modules can be developed independently by different teams or even different companies. For instance, a specialized team can work on the camera module, continuously improving its quality and features without interfering with the development of the processor or software. This division not only speeds up the development process but also ensures that advancements in one area (like a new camera technology) can be integrated into the system without necessitating a complete redesign of the phone.
Substituting refers to the ability to replace one module with another that performs the same function, thereby enhancing the system's flexibility. This operator allows for upgrades and changes without disrupting the entire system. Substitution is crucial for maintaining the relevance and efficiency of a system, as it enables the integration of improved or updated components over time. It also allows for customization, where different versions of a module can be swapped to meet varying user needs or preferences.
Example: In a desktop computer, the graphics card is a module that can be easily substituted. Users can replace an existing graphics card with a newer, more powerful one to enhance the computer's performance, particularly for tasks like gaming or graphic design. This substitution does not require changes to other parts of the computer, such as the motherboard, CPU, or power supply, provided they are compatible. The ability to substitute the graphics card allows users to keep their systems up-to-date with the latest technology without needing to purchase a completely new computer.
Augmenting involves adding new modules to an existing system to expand its capabilities or improve its performance. This operator allows for incremental enhancements, making it possible to introduce new features or functions without redesigning the entire system. Augmenting can be particularly useful for extending the lifespan of a product by continuously updating it with new technologies or capabilities.
Example: A smart home system can be augmented by adding new devices such as smart lights, thermostats, and security cameras. Initially, a user might start with a basic system that includes a smart speaker for voice control. Over time, they can augment this system by adding smart bulbs, a smart thermostat, and a smart doorbell. Each new device integrates with the existing system, enhancing its functionality and providing the user with a more comprehensive smart home experience without needing to replace the original components.
Excluding is the process of removing a module from a system, which can simplify the system or eliminate unnecessary components. This operator is useful for streamlining and optimizing systems by getting rid of redundant or obsolete modules. Excluding can lead to more efficient operation, reduced costs, and easier maintenance.
Example: In software development, excluding can be seen in the process of refactoring code to remove deprecated functions or modules. For example, a software application might have an old payment processing module that is no longer used because the application has migrated to a new, more secure payment gateway. By excluding the outdated module, the developers can reduce the complexity of the codebase, improve the system's performance, and minimize the risk of security vulnerabilities associated with the old payment module.
Inverting changes the relationship between modules, often by switching roles or altering the direction of dependencies. This operator can lead to more efficient designs or new functionalities by rethinking how modules interact with each other. Inverting can help uncover hidden efficiencies or opportunities for innovation within a system.
Example: In a traditional client-server architecture, clients request services from a central server. However, inverting this relationship can lead to the creation of peer-to-peer (P2P) networks. In a P2P network, each node can act as both a client and a server, sharing resources directly with other nodes. This inversion reduces the dependency on a central server, potentially improving the system's scalability and resilience. An example of this is file-sharing networks where users download and upload files directly from and to each other, rather than from a single central server.
Porting involves transferring a module from one system to another, allowing for the reuse of existing components in different contexts or environments. This operator promotes efficiency and consistency by leveraging proven solutions across multiple applications. Porting can save development time and costs, as well as ensure that successful components are utilized to their full potential.
Example: In the world of video games, porting is common when a game is transferred from one platform to another, such as from a console to a PC. For instance, a popular game developed for PlayStation might be ported to work on Windows PCs. This process involves adapting the game code to run on a different operating system and hardware configuration while maintaining the core gameplay experience. By porting the game, the developers can reach a broader audience without having to create a new game from scratch for each platform.
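To see how much more these operators involve than mere augmentation, here is a minimal sketch, my own toy example rather than code from Baldwin and Clark, in which a camera module sits behind an explicit interface: splitting shows up as the separate module, substituting as the ability to swap one implementation for another without touching the rest of the system, and augmenting as adding a new module alongside the existing ones.

```python
from typing import Protocol

class CameraModule(Protocol):
    """The design rule (interface) any camera module must satisfy."""
    def capture(self) -> str: ...

class BasicCamera:
    def capture(self) -> str:
        return "8 MP photo"

class ProCamera:                           # a substitute obeying the same design rule
    def capture(self) -> str:
        return "48 MP photo with HDR"

class Phone:
    """The rest of the system depends only on the interface, not on module internals."""
    def __init__(self, camera: CameraModule):
        self.camera = camera
        self.extras: list[str] = []        # room for augmenting with new modules later

    def take_photo(self) -> str:
        return self.camera.capture()

phone = Phone(BasicCamera())
print(phone.take_photo())                  # uses the original module
phone.camera = ProCamera()                 # substituting: swap the module, nothing else changes
print(phone.take_photo())
phone.extras.append("night-mode lens")     # augmenting: extend the system without redesign
print(phone.extras)
```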
By comparison with Baldwin and Clark's analysis of complex systems in terms of modularity, Assembly Theory is crude and simplistic. Assembly Theory essentially only allows for augmentation among the modular operators, and even the items being assembled in Assembly Theory are not full-fledged modules but merely aggregates/assemblages.
Modules for Baldwin and Clark are not just aggregates but well-defined components with clear interfaces and interactions, being functional, swappable, and portable. Within their theory, modularity offers a comprehensive framework for managing complexity, fostering innovation, and achieving economic efficiencies. For a more robust and dynamic analysis of complex systems than is possible within Assembly Theory, Baldwin and Clark’s modularity theory offers a superior framework.