A couple of months ago, I was in Cambridge for the Evolution Evolving conference. It was a lot of fun, and it was nice to catch up with some familiar faces and meet some new ones. My favourite talk was Karen Kovaka's "Fighting about frequency". It was an extremely well-delivered talk on the philosophy of science, and it engaged with a topic that has been very important to discussions of my own recent work, although in my case on a much smaller scale than the general phenomenon that Kovaka was concerned with.
Let me first set up my own teacup, before discussing the more general storm.
This question has come up during review, presentations, and emails (most recently from Jianzhi Zhang’s reading group). I’ve spent some time addressing it in the paper. But it is not a question with a clear answer. So unsurprisingly, my comments have not been clear. Hence, I want to use this post to add some clarity.
Let’s look at the general storm that Kovaka has identified: debates over frequency. Or debates over how typical certain effects are.
Whenever two or more interesting mechanisms or phenomena that are at odds with each other arise in biology, a debate emerges: how frequent is X? Are most cases Y, and X is just an edge case? Do most systems follow Y? Does X matter more? And other similar questions are asked.
Scientists then aggregate themselves into communities that bifurcate around the answers.
Sometimes it feels like these questions could be resolved empirically. But from Kovaka’s study, this is seldom the case. Heated frequency debates are not resolved empirically. In fact, they usually just peter out and fade.
Let’s look at this in the context of my work on computational complexity.
I introduce the notion of easy landscapes, where computation is not the limiting constraint, and populations can quickly find local fitness peaks. These describe how biologists have mostly thought about static fitness landscapes. In contrast, I also define hard landscapes, where computation is a limiting constraint and thus populations cannot find a local fitness peak in polynomial time.
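To make the 'easy' case concrete, here is a minimal sketch (my own toy example, not the NK-model from the paper) of a greedy fittest-mutant adaptive walk on an uncorrelated 'house of cards' landscape. On an easy landscape, a walk like this reaches a local peak after a small number of steps:

```python
import itertools
import random

def random_fitness(n, seed=0):
    """Assign an i.i.d. uniform fitness to every length-n genotype.
    (A toy 'house of cards' landscape -- not the NK-model.)"""
    rng = random.Random(seed)
    return {g: rng.random() for g in itertools.product((0, 1), repeat=n)}

def adaptive_walk(fitness, start):
    """Greedy fittest-mutant walk: repeatedly move to the best one-bit
    neighbour until no neighbour is fitter. Returns (peak, steps taken)."""
    g, steps = start, 0
    while True:
        neighbours = [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]
        best = max(neighbours, key=fitness.__getitem__)
        if fitness[best] <= fitness[g]:
            return g, steps
        g, steps = best, steps + 1

n = 10
f = random_fitness(n)
peak, steps = adaptive_walk(f, (0,) * n)
print(f"reached a local peak in {steps} steps")
```

The hard landscapes in the paper are exactly the ones where no variant of this local search (or any polynomial-time process) can reach a peak quickly.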
To establish this, I carry over several techniques from computer science. Most relevant to this case: worst case analysis.
A big difficulty arises in introducing radically different tools from theoretical computer science into biology. They require a thorough defence that I was hoping to delay until after I could attribute some success to the tools. But a careful reviewer 2 noticed this sleight-of-hand and asked me to mount the defence right away. A defence of worst-case analysis. I've known since at least 2014 that I'd have to provide compelling arguments. So before the paper was published, I had already mounted a partial defence in the text, and more carefully in appendix D.3.
I’d encourage you to read these if you’re interested, dear reader. But I’ll try to discuss the same content in a slightly different way here.
I don’t reason about randomly generated fitness landscapes or the corresponding probability distributions over fitness landscapes. As such, I show the existence of hard fitness landscapes but I cannot reason about the likelihood of such instances. This is not a bug — it’s a feature. I don’t think it makes sense to talk about random fitness landscapes.
As reviewer 2 noted (and as I incorporated into the final text), this is a cultural difference between theoretical computer science and statistical physics. Since statistical physics provides the currently dominant toolset in evolutionary biology, I have an uphill struggle. But I think that cstheory is in the right here.
Fitness landscapes are huge objects. Hyper-astronomical objects. And although we’ve made some measurements of tiny landscapes or local snapshots of more realistic landscapes, it is conceptually impossible to measure a full fitness landscape exhaustively. They are simply too big.
If we can't measure even one particular object, how is it reasonable to define a distribution that generates these objects? How would we ever test the reasonableness of this generating distribution?
More importantly, fitness landscapes are particular. A zebra is on a particular fitness landscape that exists due to various physical and biological laws. There isn’t some process that has generated random landscapes and some species ended up on some and some on others.
But these are ontological arguments. Let’s make a pragmatic one.
When people discuss classical fitness landscape results, they often talk about logical properties like the degree of epistasis and size of the landscape — but they seldom explicitly discuss (or change) the sampling distribution. They speak as if the assumed generating distribution is not an assumption, but just ignorance.
But this isn’t the case for such high dimensional objects. In these cases, randomness gives structure. And that structure is highly dependent on the sampling distribution.
In the case of easy vs hard landscapes, I expect hardness results to be extremely sensitive to sampling distributions. I believe this since similar results exist for similar models, although I haven't proved them yet for the NK-model. In particular, I expect that for samples from uniform distributions (even when properly defined to avoid the simple span arguments we can make against current distributions), hard landscapes will be rare. But if we sample from the inverse Kolmogorov distribution (i.e. landscapes with short descriptions are more likely than landscapes with long descriptions — like Occam's razor) then my asymptotic hardness results will carry over: hard landscapes will be common.
Yet both the uniform distribution and Occam's razor can be defended as reasonable. So what should we make of this sensitivity of the model? We can't use this to answer how frequent hard landscapes are, but maybe that wasn't the right question.
Kovaka gives us a way forward. She points out that the talk of 'frequency' is actually not important in general biological fights over frequency. This wording is just a frame and, if taken literally, it seldom contributes to actual advancement of the field. What matters instead is that in the process of arguing about how typical or rare particular phenomena or mechanisms are, biologists learn the logical limits and boundaries of these phenomena much better. It is these logical characterizations and interconnections that form the lasting contribution of frequency debates.
So what does this mean in the context of fitness landscapes? It means that we should do the next step that cstheory points us to: parametrized complexity. Figure out what logical features of (the descriptions of) fitness landscapes can guarantee easy ones and which can’t.
I am currently working on this. And I hope it ends up with more helpful questions than “how common are hard landscapes?”
If you want a visual intuition for just how unpredictable chaotic dynamics can be then the go-to toy model is the double pendulum. There are lots of great simulations (and some physical implementations) of the double pendulum online. Recently, /u/abraxasknister posted such a simulation on the /r/physics subreddit and quickly attracted a lot of attention.
In their simulation, /u/abraxasknister has a fixed center (black dot) that the first mass (red dot) is attached to (by an invisible rigid massless bar). The second mass (blue dot) is then attached to the first mass (also by an invisible rigid massless bar). They then release these two masses from rest at some initial height and watch what happens.
The resulting dynamics are at right.
It is certainly unpredictable and complicated. Chaotic? Most importantly, it is obviously wrong.
But because the double pendulum is a famous chaotic system, some people did not want to acknowledge that there is an obvious mistake. They wanted to hide behind chaos: they claimed that for a complex system, we cannot possibly have intuitions about how the system should behave.
In this post, I want to discuss the error of hiding behind chaos, and how the distinction between microdynamics and global properties lets us catch /u/abraxasknister’s mistake.
A number of people on Reddit noticed the error with /u/abraxasknister’s simulation right away. But the interesting part for me was how other people then jumped in to argue that the correctors could not possibly know what they were talking about.
That’s possible, but also double pendulums involve chaos theory. It’s likely this is just a frictionless simulation.
These detractors were trying to hide behind complexity. They thought that unpredictable microdynamics meant that nothing about the system is knowable. Of course, they were wrong.
But their error is an interesting one. This seems like an unfortunately common misuse of chaos in some corners of complexology. We say that some system (say the economy) is complex. Thus it is unknowable. Thus, people offering linear theories (say economists) cannot possibly know what they are talking about. They cannot possibly be right.
Have you encountered variants of this argument, dear reader?
You can’t just slap the word chaos on something and expect the conservation of energy to no longer apply
So let us use the conservation of energy to explain why the simulation is wrong.
From the initial conditions, we can get an estimate of the system's energy. This is particularly easy in this case since the masses start at rest at some height — thus all energy is potential energy. From this — due to the time-invariance of the Hamiltonian specifying the double pendulum — we know by Noether's theorem that this initial energy will be conserved. In this particular case, this means that we cannot ever have both of the masses above their initial position at the same time. If that happened then (just) the potential energy of this configuration would be strictly higher than the total initial energy. Since we see both of the masses simultaneously above their initial position in the gif, we can conclude that there is an error in /u/abraxasknister's simulation.
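This argument is easy to turn into a debugging check. Below is a hedged sketch of such a check (my own parametrization: angles measured from the downward vertical, unit masses and bar lengths assumed, release from the horizontal) that would flag frames like the ones in the gif:

```python
import math

def total_energy(th1, th2, w1, w2, m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.81):
    """Kinetic + potential energy of an ideal double pendulum
    (angles measured from the downward vertical, w = angular velocity)."""
    v1_sq = (l1 * w1) ** 2
    v2_sq = v1_sq + (l2 * w2) ** 2 + 2 * l1 * l2 * w1 * w2 * math.cos(th1 - th2)
    y1 = -l1 * math.cos(th1)
    y2 = y1 - l2 * math.cos(th2)
    return 0.5 * m1 * v1_sq + 0.5 * m2 * v2_sq + m1 * g * y1 + m2 * g * y2

# Released from rest: all energy is potential, so the initial energy is an
# upper bound that no later frame of a correct simulation can exceed.
E0 = total_energy(math.pi / 2, math.pi / 2, 0.0, 0.0)

def looks_buggy(th1, th2, w1, w2, tol=1e-6):
    """Flag any simulated state whose energy exceeds the initial energy."""
    return total_energy(th1, th2, w1, w2) > E0 + tol
```

A state with both masses above the release height fails this check even at rest, since its potential energy alone already exceeds the initial total energy.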
Based on this violation of energy conservation, many theories were discussed for what the error in the simulation might have been. And the possibility of energy-pumping from finite step size was a particularly exciting candidate. A ‘bug’ (that can be a ‘feature’) that I’ll discuss another day in the context of replicator dynamics.
The actual main mistake turned out to be much less exciting: a typo in the code. A psi instead of a phi in one equation.
I’m sure that all of us that have coded simulations can relate. If only we always had something as nice as the conservation of energy to help us debug.
I feel like TheEGG has been a bit monotone in the sort of theoretical computer science that I’ve been writing about recently. In part, this has been due to time constraints and the pressure of the weekly posting schedule (it has now been over a year with a post every calendar week); and in part due to my mind being too fixated on algorithmic biology.
But I did promise to write about span programs — a technique used to reason about query complexity. So in this post, I want to shift gears to quantum computing and discuss span programs. I doubt this is useful to thinking about evolution, but it never hurts to discuss a cool linear-algebraic representation of functions.
I started writing this post for the CSTheory Community Blog. Unfortunately, that blog is largely defunct. So, after 6 years, I decided to post on TheEGG instead.
Please humour me, dear reader.
Span programs are a linear-algebraic representation of functions. They originated in the work of Karchmer and Wigderson [KW93] on classical complexity, and were introduced into quantum computing by Reichardt and Spalek [RS08] in the context of formula evaluation. A span program consists of a target vector $t$ in a vector space $V$, and a collection of subspaces $V_{i,b} \subseteq V$ for $i \in [n]$ and $b \in \{0,1\}$. For an input $x \in \{0,1\}^n$, if $f_P(x) = 1$ then the target vector can be expressed as a linear combination of vectors in $\bigcup_{i \in [n]} V_{i,x_i}$. For the classical complexity measures on span programs (size) the particular choice of bases for the $V_{i,b}$ does not matter, but for the quantum witness size it is important to fix the set of "input vectors" that span each subspace.
A span program $P$ consists of a "target" vector $t$ in a finite-dimensional inner-product space $V$ over $\mathbb{C}$, together with "input" vectors $v_j \in V$ for $j \in I$. Here the index set $I$ is a disjoint union:

$I = I_{\mathrm{free}} \cup \bigcup_{i \in [n],\, b \in \{0,1\}} I_{i,b}$

$P$ corresponds to a function $f_P : \{0,1\}^n \to \{0,1\}$, defined by:

$f_P(x) = 1 \iff t \in \mathrm{span}\{v_j : j \in I_{\mathrm{free}} \cup \bigcup_{i \in [n]} I_{i,x_i}\}$
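As a concrete illustration, here is a small numpy sketch (my own toy example, not from the span program literature) that evaluates $f_P(x)$ by testing whether the input vectors available on input $x$ span the target, using a strict span program for the 2-bit OR function:

```python
import numpy as np

def spans(vectors, target, tol=1e-9):
    """True iff `target` lies in the span of the list `vectors`."""
    if not vectors:
        return np.allclose(target, 0, atol=tol)
    A = np.column_stack(vectors)
    w = np.linalg.lstsq(A, target, rcond=None)[0]
    return np.allclose(A @ w, target, atol=tol)

def eval_span_program(target, free, labelled, x):
    """Evaluate f_P(x): collect the free vectors plus the vectors
    labelled (i, x_i), and test whether they span the target."""
    available = list(free)
    for i, bit in enumerate(x):
        available += labelled.get((i, bit), [])
    return int(spans(available, target))

# A strict span program for OR on 2 bits: target [1], and a vector [1]
# available only when x_i = 1.
t = np.array([1.0])
P = {(0, 1): [np.array([1.0])], (1, 1): [np.array([1.0])]}
print([eval_span_program(t, [], P, x)
       for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # → [0, 1, 1, 1]
```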
A span program is called strict if $I_{\mathrm{free}} = \emptyset$. In general, we can assume span programs are strict; a non-empty $I_{\mathrm{free}}$ is only useful for optimizing some algorithmic considerations. In the classical literature only strict span programs were considered [KW93,Gal01,GP03]. In fact, the classical literature considers even more restrictive programs such as monotone span programs [Gal01,GP03]. A span program is monotone if for all $i \in [n]$ we have $I_{i,0} = \emptyset$. For every monotone function there exists a monotone span program representing it and vice-versa. These programs also correspond to linear secret-sharing schemes, but as of 2011 had not yet been studied from the quantum perspective (have they been since, dear reader?). Unlike monotone circuits, monotone span programs are believed to be less restrictive with respect to the natural classical notion of span program complexity.
The classical notion of complexity for span programs is called size. The size of a span program is the number of input vectors $|I|$, and the size of a function is then the minimum size over all span programs that represent the function [KW93]. For the correspondence between span programs and quantum query complexity, however, we have to consider a different measure of complexity known as witness size [RS08].
Consider a span program $P$. Let $A = \sum_{j \in I} v_j e_j^\dagger$ be the matrix whose columns are the input vectors. For each input $x \in \{0,1\}^n$, let $I(x) = I_{\mathrm{free}} \cup \bigcup_{i \in [n]} I_{i,x_i}$ and $A(x) = \sum_{j \in I(x)} v_j e_j^\dagger$, the restriction of $A$ to the input vectors available on $x$.

If $f_P(x) = 1$, then there is a witness $w$ satisfying $A(x) w = t$. Let $\mathrm{wsize}(P,x)$ be the minimum norm of any such witness:

$\mathrm{wsize}(P,x) = \min_{w \,:\, A(x) w = t} \|w\|^2$

If $f_P(x) = 0$, then there is a witness $w'$ satisfying $\langle t, w' \rangle = 1$ and $A(x)^\dagger w' = 0$. Let

$\mathrm{wsize}(P,x) = \min_{w' \,:\, \langle t, w'\rangle = 1,\; A(x)^\dagger w' = 0} \|A^\dagger w'\|^2$

Let the witness size of $P$ be $\mathrm{wsize}(P) = \max_x \mathrm{wsize}(P,x)$.
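For a positive input, the minimum-norm witness can be computed directly with the pseudoinverse. A small numpy sketch (my own illustration; the function name is hypothetical):

```python
import numpy as np

def positive_witness_size(available_vectors, target, tol=1e-9):
    """Minimum squared norm of w with A(x) w = target, where the columns
    of A(x) are the input vectors available on input x. Returns None if
    the target is not in their span (i.e. f_P(x) = 0)."""
    A = np.column_stack(available_vectors)
    w = np.linalg.pinv(A) @ target  # minimum-norm solution
    if not np.allclose(A @ w, target, atol=tol):
        return None
    return float(w @ w)

# Toy check: target [1] with two copies of [1] available -- the
# minimum-norm witness splits the weight, w = (1/2, 1/2), size 1/2.
print(positive_witness_size([np.array([1.0]), np.array([1.0])],
                            np.array([1.0])))  # ≈ 0.5
```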
Note that there is a certain imprecision in how we specified a span program. In particular, if we replace the target vector $t$ by $\alpha t$ ($\alpha \neq 0$), then we change the witness size by a factor of $|\alpha|^2$ if $f_P(x) = 1$ or $1/|\alpha|^2$ if $f_P(x) = 0$. Thus we might as well have defined the witness size as:

$\mathrm{wsize}(P) = \sqrt{\left(\max_{x \in f_P^{-1}(1)} \mathrm{wsize}(P,x)\right)\left(\max_{x \in f_P^{-1}(0)} \mathrm{wsize}(P,x)\right)}$
However, we will see this is unnecessary since we can transform any span program into a canonical span program:
A span program is canonical if $I_{\mathrm{free}} = \emptyset$, the target vector is $t = \sum_{x \in f_P^{-1}(0)} e_x$, and for all $x \in f_P^{-1}(0)$ and $j \in \bigcup_{i \in [n]} I_{i,x_i}$, $\langle e_x, v_j \rangle = 0$.
Using classical techniques [KW93] we can show that this does not increase our complexity measures:
A span program $P$ can be converted to a canonical span program $\hat{P}$ that computes the same function, with $\mathrm{size}(\hat{P}) \leq \mathrm{size}(P)$ and $\mathrm{wsize}(\hat{P},x) \leq \mathrm{wsize}(P,x)$. For all $x$ with $f_P(x) = 0$, $e_x$ itself is an optimal witness.
This simplifies the definition of witness size, and we can write down an optimization problem to solve for the smallest witness size of a function:

$\mathrm{wsize}(f) = \min_{\{v_{xj}\}} \max_{x \in \{0,1\}^n} \sum_{j \in [n]} \|v_{xj}\|^2 \quad \text{subject to} \quad \sum_{j : x_j \neq y_j} \langle v_{xj}, v_{yj} \rangle = 1 \;\text{ for all } x, y \text{ with } f(x) \neq f(y)$

Notice the similarity of the above equation and the dual of the adversary method. The similarity is no coincidence: the above optimization problem is the dual of the negative adversary method, and so $\mathrm{wsize}(f) = \mathrm{ADV}^{\pm}(f)$:
For a proof, I direct the interested reader to Ref.[Rei09].
[Gal01] Anna Gal. A characterization of span program size and improved lower bounds for monotone span programs. Computational Complexity, 10:277-296, 2001.
[GP03] Anna Gal and Pavel Pudlak. A note on monotone complexity and the rank of matrices. Information Processing Letters, 87:321-326, 2003.
[KW93] Mauricio Karchmer and Avi Wigderson. On span programs. In Proc. of 8th IEEE Symp. Structure of Complexity Theory, pages 102-111, 1993.
[Rei09] Ben W. Reichardt. Span programs and quantum query complexity: The general adversary bound is nearly tight for every boolean function. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 544-551. IEEE, 2009, arXiv:0904.2759v1.
[RS08] Ben W. Reichardt and Robert Spalek. Span-program based quantum algorithm for evaluating formulas. In Proc. 40th ACM STOC, pages 103-112, 2008, arXiv:0710.2630.
As Aaron Roth wrote on Twitter — and as I bet with my career: “Rigorously understanding evolution as a computational process will be one of the most important problems in theoretical biology in the next century. The basics of evolution are many students’ first exposure to “computational thinking” — but we need to finish the thought!”
If you didn’t get a chance to attend, maybe the title and abstract will get you reading further:
Algorithmic Biology: Evolution is an algorithm; let us analyze it like one.
Evolutionary biology and theoretical computer science are fundamentally interconnected. In the work of Charles Darwin and Alfred Russel Wallace, we can see the emergence of concepts that theoretical computer scientists would later hold as central to their discipline. Ideas like asymptotic analysis, the role of algorithms in nature, distributed computation, and analogy from man-made to natural control processes. By recognizing evolution as an algorithm, we can continue to apply the mathematical tools of computer science to solve biological puzzles – to build an algorithmic biology.
One of these puzzles is open-ended evolution: why do populations continue to adapt instead of getting stuck at local fitness optima? Or alternatively: what constraint prevents evolution from finding a local fitness peak? Many solutions have been proposed to this puzzle, with most being proximal – i.e. depending on the details of the particular population structure. But computational complexity provides an ultimate constraint on evolution. I will discuss this constraint, and the positive aspects of the resultant perpetual maladaptive disequilibrium. In particular, I will explain how we can use this to understand both on-going long-term evolution experiments in bacteria; and the evolution of costly learning and cooperation in populations of complex organisms like humans.
Unsurprisingly, I've written about all these topics already on TheEGG, and so my overview of the talk will involve a lot of links back to previous posts. In this way, this post can serve as an analytic linkdex on algorithmic biology.
One of my students, Joe Gardner, invited me to give this talk. Together with Ben Slater, he was excited about my recent paper on computational complexity as an ultimate constraint on evolution. They thought that other students would also be interested, and that this could be a good way to bring the Computational Society and Biological Society together for a joint event.
I was more than happy to participate.
But given the technical nature of some of my work, I wanted to focus on a broad overview of algorithmic biology and evolution. And only briefly touch on my specific work at the end. The talk ended up having four main parts:
Introduction: the spectrum from computational to algorithmic
Natural Selection: Struggle for Existence as the constraint that powers evolution
Machines and Nature: from Steam Engines to Computers
Algorithmic Biology: Computational Complexity as the constraint that powers open-ended evolution
In terms of the actual content, the talk followed the ideas sketched in the following posts from TheEGG:
Most people are familiar with a particular form of interaction between computer science and biology — what can be broadly classified as computational biology. These are areas like bioinformatics, agent-based modeling of evolution, and maybe even extending to topics like genetic programming and genetic algorithms. But this isn't the only aspect of the boundary between computer science and biology. These are just practical skills from computer science applied to the outputs of biology. A complementary approach is the use of mathematical techniques from computer science to build the conceptual grounding of biology. This was the aspect that I wanted to focus the talk on.
For this, I needed to give some history of evolution:
If we look at the origins of evolution by natural selection, we can find a strong motivating factor from technology. One of the technological achievements of Darwin’s time was rapid improvements in animal husbandry and agriculture. In the above post, I make the case that this technology was an essential part of the inspiration for Darwin’s foundational insights. Darwin looked at selection implemented by Robert Bakewell and asked if instead it can be implemented by a non-human agent — an abstract agent like the struggle for existence.
In doing this, Darwin was being an early algorithmic biologist. He was recognizing the importance of multiple-realizability and the fact that algorithms can be implemented in a distributed manner. In viewing the struggle for existence as the implementing agent for natural selection, Darwin was also using asymptotic analysis: seeing the qualitative difference in growth rate between the constant or polynomially increasing abundance of resources versus the exponential growth of populations.
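The asymptotic gap Darwin relied on is easy to see in a toy calculation (the numbers here are arbitrary):

```python
# A population doubling each generation (exponential growth) versus
# resources growing by a fixed increment (at most polynomial growth).
population, resources = 2, 1000
generation = 0
while population <= resources:
    population *= 2   # exponential growth
    resources += 100  # linear growth
    generation += 1
print(generation)  # → 10
```

However large the head start we give the resources, the exponentially growing population overtakes them after a number of generations only logarithmic in that head start: the struggle for existence is asymptotically unavoidable.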
To make this process of evolution easier to think about and model, in the century after Darwin, Wright developed the fitness landscape metaphor. The discrete aspect of the metaphor is especially useful if we want to incorporate the discrete elements of Mendelian genetics — something that was foreign to Darwin, but is an essential part of our current biological thought. But fitness landscapes also raise an issue: why aren’t all species just stuck at local fitness peaks? Why do we see any evolution happening at all?
Darwin and Wallace were geologists, so they overcame the local peak problem by having the environment constantly change. For them, the world is constantly changing at the geological level and thus geological change gets reflected in the biological world. This is almost certainly a great explanation for a naturalist, but an experimentalist can throw a wrench in this reasoning. Richard Lenski has done this by evolving E. coli for (now) 70,000 generations in a static environment. They still haven't found a fitness peak. Lenski and colleagues see open-ended evolution.
But before explaining open-ended evolution, it is important to understand the ideas of multiple realizability and abstraction. This can be illustrated by turning to the other big new technology of Darwin’s time: steam engines.
If today someone came to you with plans for a perpetual motion machine, you would know that those plans are wrong without even examining them. This is due to the foundational role that thermodynamics has achieved. We don't need to know the details of the proposed machine to know that it would have to violate the laws of physics as we know them to achieve its result.
Computational complexity can achieve a similar foundational reach. Just like we don't need to think of thermodynamics as being about steam engines, we don't need to see computational complexity as being about computers. Any system can be viewed and analyzed as an algorithm — including evolution. And thus computational constraints can be applied to evolution.
Most importantly, this means that we should analyze evolution as an algorithm.
And if we can't analyze the algorithm, then we shouldn't attribute arbitrary properties to it that 'feel right', seem obvious from one special case, or that we want to be true. That's not good practice when we're programming, and it's not good practice when we're doing science, either.
In evolutionary biology, constraints are what prevent evolution from finding peaks (or exits in the maze metaphor) in fitness landscapes. These constraints can be caused by various factors. By analogy to the distinction between algorithms vs problems, I divide the constraints into two types: proximal vs ultimate.
Most of the constraints on evolution familiar to biologists are proximal. They are due to features of the particular population, like population or developmental structure, trait co-variances, standing variation, etc. In contrast, computational complexity is an ultimate constraint: it applies to any population on a given family of landscapes.
From this background, I could describe my recent results:
This involves thinking about the different kinds of epistasis and corresponding landscapes from smooth to semismooth to rugged. How we can get progressively harder landscapes if we have more freedom on the kind of epistasis that occurs. And how this resultant perpetual maladaptive disequilibrium can be transformed into positive results like solutions to the Baldwin effect for costly learning, or making permanent cooperation due to the Hankshaw effect.
For now, I think that the best summary of these results is still my 25 tweet thread on this paper. But maybe I should consider writing a more linkable post in the future.
Today we might discuss artificial selection or domestication (or even evolutionary oncology) as applying the principles of natural selection to achieve human goals. This is only because we now take Darwin’s work as given. At the time that he was writing, however, Darwin actually had to make his argument in the other direction. Darwin’s argument proceeds from looking at the selection algorithms used by humans and then abstracting it to focus only on the algorithm and not the agent carrying out the algorithm. Having made this abstraction, he can implement the breeder by the distributed struggle for existence and thus get natural selection.
The inspiration is clearly from the technological to the theoretical. But there is a problem with my story.
Why wasn't Aristotle, or any other ancient philosopher, inspired by the agriculture and animal husbandry of their day to arrive at the same theory as Darwin?
The ancients didn’t arrive at the same view because it wasn’t the domestication of the first agricultural revolution that inspired Darwin. It was something much more contemporary to him. Darwin was inspired by the British agricultural revolution of the 18th and early 19th century.
In this post, I want to sketch this connection between the technological development of the Georgian era and the theoretical breakthroughs in natural science in the subsequent Victorian era. As before, I’ll focus on evolution and algorithm.
What was the British agricultural revolution? Was it even seen as a revolution in its own time or recognized only in hindsight? For this, we can turn to Victorian texts. We can see a description of the revolution directly in Darwin’s writings. For example, in his October 1857 letter to Asa Gray, Darwin writes: “[s]election has been methodically followed in Europe for only the last half century” (the emphasis is original).
Bakewell built a mechanistic approach to agriculture. Not the replacement of farm workers by machines, but the methodical scientific approach to agriculture. The introduction of methodical inductive empiricism.
In particular, Bakewell is most remembered for the Dishley system — known today as line breeding or breeding in-and-in. Prior to Bakewell, livestock of both sexes was kept together in fields. This resulted in natural assortment between the livestock and did not easily produce systematic traits in the offspring — to the casual onlooker, the traits in the offspring of these populations would be diverse and seemingly random. Bakewell separated the sexes and only allowed deliberate, specific mating. This allowed him to more easily and rapidly select for desired traits in his livestock.
During the 35 years from Robert Bakewell inheriting his father’s farm in 1760 to his own death in 1795, he developed several new breeds of livestock including new kinds of sheep, cattle, and horses.
It was apparent to any observer that these were different variations on the species. For example, they produced more wool, gained weight more quickly, and were easier to work with than prior livestock. During Bakewell's lifetime, the average weight of bulls at auction is reported to have doubled.
The Dishley system — i.e. Bakewell's algorithm — clearly produced new varieties. The puzzle was now an algorithmic one: was a human breeder required to implement this algorithm, or was this always taking place even without human intervention?
Thus, in recognizing (artificial) selection, Darwin was not extracting an implicit algorithm from a long-held human practice. Rather, he was taking an explicit algorithm advocated and practiced by his contemporaries. In On the Origin of Species, Darwin explicitly acknowledges Bakewell's demonstration of variation under domestication, and even discusses the branching of Bakewell's variations under the breeding of different farmers (Buckley vs. Burgess).
Darwin's contribution to Bakewell's algorithm was abstracting it: recognizing that the agent that implements the algorithm is irrelevant. We don't need to have Robert Bakewell or another agriculturalist do the selecting. Instead, we can have a distributed agent like the struggle for existence. It is this algorithmic abstraction that allowed Darwin to revolutionize how we think about nature. But it was the latest technology of his day that led him there. Darwin took a human algorithm and asked if it could also explain nature.
Bakewell’s contribution to the technology of agriculture and influence on the future of evolutionary theory extends beyond breeding. He also established experimental plots on his farm to test different manure and irrigation methods. This practice was part of the inspiration for John Bennet Lawes’ establishment of the Rothamsted Experimental Station in 1843 for carrying out long-term agricultural experiments. Their 1856 Park Grass Experiment is still ongoing. But the station is perhaps best known for its theoretical contribution to evolutionary biology during the 14-year tenure (1919–1933) of Ronald Fisher. While at Rothamsted, Fisher developed the statistics and population genetics of the modern evolutionary synthesis to make sense of the data from these ongoing evolutionary experiments.
And the inspiration on evolution from technology was not limited to agriculture. Steam engines — the other new tech of the day — also make an appearance in the first publication of natural selection in 1858. In his section, Alfred Russel Wallace writes that the “action of [natural selection] is exactly like that of the centrifugal governor of the steam engine, which checks and corrects any irregularities almost before they become evident.” An analogy to another recent technology; this one introduced into common usage by James Watt in 1788.
It is easy to imagine history as going from idea to technology. But I think this is often an anachronism. Rather, technology can lead foundational ideas. The tools we build to understand and develop technology can often form the basis for the abstractions and analogies that create new theoretical, philosophical, and scientific ideas.
Today, we should look at the computer — the big technology from the last 50 years — not as just a practical tool. We need to recognize the principles and mathematics underlying algorithms and computation not as just technological aids but as means to fundamentally transform our understanding of nature. And as with Darwin and Wallace, I propose that we focus that transformation on our understanding of biology. Maybe the computing revolution will give us algorithmic biology as the next development in our understanding of the world around us.
One of the things that the Department of Integrated Mathematical Oncology at the Moffitt Cancer Center does very well is create an atmosphere that combines mathematics and experiment in cancer research. Fellow TheEGG blogger Robert Vander Velde is one of the new generation of cancer researchers who are combining mathematics and experiment. Since I left Tampa, I've had less opportunity to keep up with the work at the IMO, but occasionally I catch up on Slack.
A couple of years ago, Robert had a computer science question. One at the data analysis and visualization stage of the relationship between computer science and cancer. Given that I haven’t posted code on TheEGG in a long time, I thought I’d share some visualizations I wrote to address Robert’s question.
But you can also combine both approaches. And do time-lapse microscopy of the colonies as they form.
A couple of years ago, Robert Vander Velde and Andriy Marusyk were working on experiments that use colony forming units (CFUs) as a measure of populations. However, they wanted to dig deeper into the heterogeneous dynamics of CFUs by tracking the formation process through time-lapsed microscopy. Robert asked me if I could help out with a bit of the computer vision, so I wrote a Python script for them to identify and track individual colonies through time. I thought that the code might be useful to others — or me in the future — so I wanted to write a quick post explaining my approach.
This post ended up trapped in the drafts box of TheEGG for a while, but I thought now is as good a time as any to share it. I don’t know where Robert’s work on this has gone since, or if the space-time visualizations I developed were of any use. Maybe he can fill us in in the comments or with a new guest post.
So let’s just get started with the code.
Of course, we first need to import the main packages: numpy, pyplot, and cv2.
import numpy as np
from matplotlib import pyplot as plt
import cv2
The first two are standard packages, the last one — OpenCV — takes a little bit more work to install.
Now we do two main tasks at once: we load all the images and create something I want to call a ‘space-time map’. A space-time map is an image that uses the colour of a pixel to represent the number of time points at which that pixel is part of a colony. This is just the first name that occurred to me; if you’ve seen this visualisation before, dear reader, and know its name, then please let me know.
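A minimal sketch of what I mean (my own reconstruction, with small synthetic arrays standing in for the thresholded microscopy frames):

```python
import numpy as np

def space_time_map(img_all):
    """Sum of binary occupancy masks: each pixel's value is the number
    of time points at which it is part of some colony."""
    stm = np.zeros(img_all[0].shape, dtype=int)
    for img in img_all:
        stm += (img > 0).astype(int)
    return stm

# Tiny synthetic stack: a 'colony' growing outward from one pixel.
frames = [np.zeros((4, 4)) for _ in range(3)]
frames[0][1, 1] = 255
frames[1][1, 1] = frames[1][1, 2] = 255
frames[2][1, 1] = frames[2][1, 2] = frames[2][2, 1] = 255

stm = space_time_map(frames)
# Older parts of the colony accumulate higher counts than newer parts.
```

Plotting `stm` with `plt.imshow` and a sequential colour map then gives the space-time map: the longest-occupied pixels stand out most.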
More importantly, the image total_colonies now has each non-background pixel labeled by its colony number, so counting the number of pixels in each colony at each time point becomes as straightforward as applying a mask:
#use the total image (properly thresholded) as the permanent numbers
#for the colonies; get future colony numbers from them
colony_sizes = np.zeros((num_colonies + 1,num_imgs), dtype=int)
#Note that colony_sizes[0,:] will contain the amount of empty space
img_num = 0
for img in img_all:
    #label colonies by their numbers (for up to 255 colonies):
    labeled_img = np.minimum(total_colonies,img)
    #get the colonies that appear and their sizes
    colonies, sizes = np.unique(labeled_img, return_counts = True)
    colony_sizes[colonies,img_num] = sizes
    img_num += 1

#plot the population dynamics of every colony
for colony in range(1,num_colonies + 1):
    plt.plot(colony_sizes[colony,:])
Unfortunately, there are a number of colonies that ‘blink’ in and out of existence. This is not a manifestation of reality, but probably an artefact of the image processing software used to produce the initial threshold images and of the sensitivity of the microscope. As such, it can be helpful to clean up the time series, focus only on the colonies that didn’t go extinct during the experiment, and look at their population dynamics.
#let's clean up by eliminating colonies that go extinct at some point
colony_lifetimes = np.sum(colony_sizes > 0, axis = 1)
#index 0 is the background, so drop it from the survivors
surviving_colonies = np.where(colony_lifetimes == num_imgs)[0][1:]

#plot the population dynamics of the surviving colonies
for colony in surviving_colonies:
    plt.plot(colony_sizes[colony,:])
But the figure that this produces is still difficult to make sense of. I won’t even bother reproducing it here, given how many different lines there are going in different directions at different growth rates.
What we really care about is higher level properties of these colonies like their growth rate, so let’s infer those with the help of scipy:
#among those that don't go extinct, let's calculate the growth rates
from scipy.stats import mstats

growth_rates = np.zeros(num_colonies + 1)
growth_rates_low = np.zeros(num_colonies + 1)
growth_rates_high = np.zeros(num_colonies + 1)

for colony in surviving_colonies:
    #Theil-Sen estimate of the log-size slope, with 95% confidence bounds
    growth_rates[colony], _, growth_rates_low[colony], growth_rates_high[colony] = \
        mstats.theilslopes(np.log(colony_sizes[colony,:]))

plt.errorbar(np.arange(num_colonies + 1),growth_rates,
    yerr = [growth_rates - growth_rates_low, growth_rates_high - growth_rates],fmt='ko')
This yields an easier-to-read colony growth rate plot, with 95% confidence intervals.
Above, we have a fitness measure for each colony, so we can look not only at the number of colony forming units but also at differences in how the colonies formed. I still find it hard to make sense of this particular plot, but looking explicitly at the inter-colony heterogeneity does seem like a good exercise. Definitely better than just summarising it as a single variance. Especially since I know from experience that sometimes a variance can hide an interesting discovery.
How would you, dear reader, extend these visualizations? Or is there a good use that you can think of putting them to? After all, visualizations are one of the most important parts of science. I hope this code helps a little. At least as inspiration, or an example of how easy it is to get things done with Python.
This weekend, Oliver Schneider — an old high-school friend — is visiting me in the UK. He is a computer scientist working on human-computer interaction and was recently appointed as an assistant professor at the Department of Management Sciences, University of Waterloo. Back in high-school, Oliver and I would occasionally sneak out of class and head to the University of Saskatchewan to play Counter-Strike in the campus internet cafe. Now, Oliver builds haptic interfaces that represent virtual worlds physically so vividly that a blind person can play a first-person shooter like Counter-Strike. Take a look:
DualPanto: A Haptic Device that Enables Blind Users to Continuously Interact with Virtual Worlds - YouTube
Now, dear reader, can you draw a connecting link between this and the algorithmic biology that I typically blog about on TheEGG?
I would not be able to find such a link. And that is what makes computer science so wonderful. It is an extremely broad discipline that encompasses many areas. I might be reading a paper on evolutionary biology or fixed-point theorems, while Oliver reads a paper on i/o-psychology or how to cut 150 micron-thick glass. Yet we still bring a computational flavour to the fields that we interface with.
A few years ago, Karp (2011) wrote a nice piece about the myriad ways in which computer science can interact with other disciplines. He was coming at it from a theorist’s perspective — one that is compatible with TheEGG but maybe not as much with Oliver’s work — and the bias shows. But I think that the stages he identified in the relationship between computer science and other fields are still enlightening.
In this post, I want to share how Xu & Tu (2011) summarize Karp’s (2011) four phases of the relationship between computer science and other fields: (1) numerical analysis, (2) computational science, (3) e-Science, and the (4) algorithmic lens.
The first stage is the numerical analysis of X. This is solving the equations that already exist in field X but are too big for pen-and-paper. A classic example of this for physics comes from the Manhattan Project: ENIAC processing for sixty days straight on the engineering calculations required to build the hydrogen bomb.
The second stage is the computational science of X. Often this is abbreviated as just computational X. This is when we move from just automating the solution of equations to the sort of work we wouldn’t even consider on paper: simulating and visualizing the objects of X. In the case of physics, it can at times be hard to draw the line between numerical analysis and computational science, but in less mathematized fields it is usually much clearer. In biology, for example, it would be running agent-based simulations of evolution. From the visualization side, it might involve all the algorithms associated with bioinformatics.
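As a toy illustration of this stage (my own minimal example, not from the original post): an agent-based Moran process, the kind of simulation of evolution that we wouldn’t do on paper:

```python
import random

def moran_step(pop, fitness, rng):
    """One step of the Moran process: pick a reproducer in proportion
    to fitness, and have it replace a uniformly random individual."""
    weights = [fitness[x] for x in pop]
    reproducer = rng.choices(pop, weights=weights)[0]
    pop[rng.randrange(len(pop))] = reproducer

# Two types, with mutant 'B' slightly fitter than resident 'A'.
fitness = {'A': 1.0, 'B': 1.1}
pop = ['A'] * 19 + ['B']
rng = random.Random(42)

# Iterate until one type fixes (absorption happens with probability 1).
while len(set(pop)) > 1:
    moran_step(pop, fitness, rng)
```

Watching which type fixes, and how often over many runs, is a simulated observation rather than a solved equation — exactly the shift this stage describes.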
The third stage is e-Science of X. This is the name that I am most ambivalent about. This is when we manage extensive experimental data and methods for collaboration over the internet. A biological example might be something like Folding@home, or the various queryable gene or disease databases. In physics, this might involve historic examples like Tim Berners-Lee developing hypertext to help physicists collaborate on high-energy physics projects and inadvertently giving birth to the world wide web. A more recent example might be all the engineering and computer science involved in getting data out of the Large Hadron Collider or in synchronizing the various observatories around the world for the black hole image. More broadly it might involve books like Michael Nielsen’s (2011) Reinventing Discovery.
But all three of these stages view computer science primarily as a service provider to field X. As a means to do better the things that field X would do anyway. They don’t fundamentally change the basic objects and theories of X. The computational science stage might shift emphasis: for example, towards token-based rather than type-based models in evolutionary biology. And the discoveries that these resources facilitate can change the field. But the computer science in stages one through three is seldom the direct cause of the change itself.
This is why I can’t agree when Markowetz (2017) writes that all biology is computational biology, for example. All biology might (or should) use computational biology. But this service role in itself does not place computation at the heart of biology. For that we need algorithmic biology.
The fourth stage is the algorithmic lens on X. Computing as a universal way of thinking. This is when we recognize that theories and other theoretical objects are themselves specifications of objects, and all physical processes can themselves be viewed as computations. Once this link is made, theoretical computer science becomes part of the field itself. Its theorems and perspectives become part of the bedrock of the field. This is what a theoretical computer scientist interested in a natural science field X aspires to.
But these stages are very theory focused. They put theory on a pedestal at the top. And, in general, this is an unreasonable view. Do you, dear reader, have some suggestions for better categories? Ones that aren’t linear or hierarchical. Ones that don’t ‘end at theory’?
And if Karp, Xu & Tu’s four stages are reasonable: what stage is your favourite field at? Is this good or bad?
Karp, R. M. (2011). Understanding science through the computational lens. Journal of Computer Science and Technology, 26(4), 569-577.
Markowetz, F. (2017). All biology is computational biology. PLoS Biology, 15(3): e2002050.
Nielsen, M. (2011). Reinventing discovery: the new era of networked science. Princeton University Press.
Xu, Z. W., & Tu, D. D. (2011). Three new concepts of future computer science. Journal of Computer Science and Technology, 26(4), 616-624.
In 2015 and 2016, as part of my new year reflections on the year prior, I wrote a post about the ‘year in books’. The first was about philosophy, psychology and political economy, and it was an unreasonably long and sprawling post. The second time, I decided to divide it into several posts, but only wrote the first one, on cancer: Neanderthals to the National Cancer Act to now. In this post, I want to return to two of the books that were supposed to be in the second post for that year: Harry G. Frankfurt’s On Bullshit and On Truth.
Reading these two books in 2015 might have been an unfortunate premonition of the post-2016 world. And I wonder if a lot of people have picked up Frankfurt’s essays since. But with a shortage of thoughts for this week, I thought it’s better late than never to share my impressions.
In this post I want to briefly summarize my reading of Frankfurt’s position. And then I’ll focus on a particular shortcoming: I don’t think Frankfurt focuses enough on how and for what Truth is used in practice. From the perspective of their relationship to investigation and inquiry, Truth and Bullshit start to seem much less distinct than Frankfurt makes them. And both start to look like a negative force — although in the case of Truth: sometimes a necessary negative.
First, I am not sure if these two works should really count as books; they are basically 20 page essays reformatted with big font, wide margins, and small pages to make cute booklets. However, since I picked them up at Barnes & Noble as books, I thought that I would classify them as such. The former was originally published as an essay in 1986 and after its repackaging as a book it reached #1 on the New York Times bestseller list. This motivated the latter as a follow up.
Frankfurt observes that our life is full of bullshit, and sets out to provide an analysis and definition of the phenomenon. He summarizes his finding at the start of the second book: “bullshitters, although they represent themselves as being engaged simply in conveying information, are not engaged in that enterprise at all.” In this deception, they have a commonality with liars, but “[w]hat they care about primarily … is whether what they say is effective in accomplishing this manipulation. Correspondingly, they are more or less indifferent to whether what they say is true or whether it is false.” This indifference is not shared by the liar, who must keep an eye on the truth in order to mislead you. As such, Frankfurt believes that the bullshitter is more dangerous to society than the liar. He closes the first book with a strong denouncement of (what he considers to be) the postmodern bend:
One who is concerned to report or to conceal the facts assumes that there are indeed facts that are in some way both determinate and knowable. … Someone who ceases to believe in the possibility of identifying certain statements as true and others as false can have only two alternatives. The first is to desist both from efforts to tell the truth and from efforts to deceive. … refraining from making any assertions about the facts. The second alternative is to continue making assertions that purport to describe the way things are, but that cannot be anything except bullshit.
In the first book, Frankfurt holds the importance of truth as self-evident and leaves it to the second book to answer the questions:
Is truth something that in fact we do — and should — especially care about? Or is the love of truth, as professed by so many distinguished thinkers and writers, itself merely another example of bullshit?
He avoids pinning down exactly what he means by truth, suggesting that the common sense notion — by which, at my most generous reading, I assume he means something like Sellars’ manifest image — will do. Unsurprisingly, he doesn’t only see truth as important but follows Spinoza to the conclusion that anybody who values their life must also (maybe unknowingly) love truth. Frankfurt extends the importance of truth from the individual to society with the historical claim that:
Civilizations have never gotten along healthily, and cannot get along healthily, without large quantities of reliable factual information. They cannot flourish if they are beset with troublesome infections of mistaken beliefs. To establish and to sustain an advanced culture, we need to avoid being debilitated either by error or by ignorance.
The above statement is certainly effective in manipulating me to believe in the value of truth. However, it is also sufficiently vague as to make it impossible to test whether what Frankfurt says is true or whether it is false. Certainly the adaptive nature of positive illusions or our work on religion and the social interface theory might hint toward falsehood. But a sufficiently slippery definition of truth can hint toward truth instead.
The real issue is that Frankfurt presents a straw-man of people who deflate or question capital-T ‘Truth’ as an organizing principle. The whole point of pragmatic approaches to the question is to eliminate Truth as a category in favour of that which lets us avoid error and provides flourishing. As such, they can agree with Frankfurt’s claim above without attributing it to ‘Truth’. In fact, they might point to very useful and cohesion-enhancing beliefs that would not be Truth for Frankfurt.
If we are to think about Truth then I think we need to think about how Truth is used in practice. In the real world.
From my experience, it isn’t static Truth that enables advances or lets us escape error and ignorance. Rather, it is dynamic Investigation. Truth’s job, instead, is to end investigation and inquiry. To say “this case is done, let’s move on”.
Sometimes this is an important thing to do. Not everything needs to be debated. Not everything needs to be investigated. And not everything needs to be questioned. There have to be priorities. And in this regard Truth can be useful.
I think this also lets us better understand bullshit. One of the practical uses of bullshit is usually the same as the practical use of Truth: to stop investigation and inquiry. Except that using Truth as our stopping point requires some due diligence, and some wondering about whether the point in question is a reasonable place to stop. It sometimes even gives us a means to resume investigation later. Bullshit lets us avoid all this.
But both end investigation.
A tempting dissimilarity between Truth and Bullshit’s relationship to Investigation might be their role in motivating investigation. A common position for Truth, and one that Frankfurt takes throughout, is that a desire for Truth can motivate us to investigate. So from my anti-Frankfurt perspective: even if Truth itself is a — at times desirable and necessary — negative, its motivational role is a positive.
But I don’t think this is that different from Bullshit. At least from the garden-hose of misinformation kind of bullshit. From the merchants of doubt kind of bullshit. One of the safety mechanisms built into our notion of Truth is that if we get two conflicting ‘truths’ then we should restart investigation to resolve the contradiction. This is what bullshit can capitalize on if instead of stopping investigation, it wants to start it. By throwing enough disinformation at us, it becomes difficult to know what to believe. This can prompt us to investigate. However, since we are so conditioned on truth and mostly bad at actually carrying out investigations, this often ends up with us just arbitrarily picking the most comfortable — or most repeated or easily accessible — set of propositions as our static set.
In the end, I don’t think the line between Bullshit and Truth is nearly as clear cut as Frankfurt makes it. In particular, if we focus on the uses to which we put both concepts. And without focusing on this practical aspect, I think that Frankfurt fails to engage with the more interesting challenges to capital-T ‘Truth’.
But these are my recollections from a pair of books I read 4 years ago. So I might have forgotten some of the nuance of Frankfurt’s position. I am eager to hear from you, dear reader, on the points that I missed.
Back in September 2017, Sandy Anderson was tweeting about the mathematical oncology revolution. To which Noel Aherne replied with a thorny observation that “we have been curing cancers for decades with radiation without a full understanding of all the mechanisms”.
This led to a wide-ranging discussion and clarification of what is meant by terms like mechanism. I had meant to blog about these conversations when they were happening, but the post fell through the cracks and into the long to-write list.
And not just in cancer. Although my starting example will focus on VEGF and cancer.
I want to focus on a particular point that came up in my discussion with Paul Macklin: what is the difference between coarse-graining and abstraction? In the process, I will argue that if we want to build mechanistic models, we should aim not at explaining new unknown effects but rather focus on effects where we already have great predictive power from simple effective models.
Since Paul and I often have useful disagreements on twitter, hopefully writing about it on TheEGG will also prove useful.
I think that there is a difference in kind between abstraction and coarse-graining.
Both allow for multiple realizability, but in different ways.
Let’s start with Paul Macklin’s example of a “sufficiently general but descriptive CAS”. In such a setting, we might talk about a general “[a]ngiogenesis promoter instead of specifically VEGF-A 165”. Many different specific molecules might act as an angiogenesis promoter, and might do this at different strengths, and we are not committing ourselves to any one of them. Thus, we have multiple-realizability. And for Paul, this would be an abstraction.
But I disagree. For me, this is a coarse-graining because the fundamental mechanism is pinned down and assumed. All that is left to vary is a real parameter (or a few) specifying strength, but no real complexity is abstracted over. Replacing VEGF-A 165 by an unspecified angiogenesis promoter does not make our life as modellers significantly simpler. In fact, it might make our life harder since instead of worrying about a single parameter setting that captures VEGF-A 165, we have to worry about a family or range of parameters.
With only coarse-graining available to us, Paul is right that we can’t describe a complex adaptive system without using a complex adaptive system.
Abstraction, however, allows us to define new effective ontologies where the effective objects are not related to their CAS reductive counterparts in simple terms. The act of measurement itself can hide computation; the effective object measured might not be computationally easy to determine from a reductive theory. For me, it is only in such cases where our new objects hide complexity that we can say we’ve abstracted over that complexity.
Let’s return to the VEGF-A 165 example. And here I am — as is often the case — flying by the seat of my pants, since I know nothing about VEGF-A 165 apart from it being a signal protein important for the formation of blood vessels. But this knowledge itself is a kind of abstraction.
Since VEGF-A 165 is a protein, we could study its molecular structure and then do the computational chemistry to simulate it. This would be a lot of work, but it would be of very little use if the relevance of VEGF to our model is only in how it affects the formation of blood vessels. In this case, we might only worry about some higher-level property like binding efficiency or, even more operational, a translation function from VEGF abundance to rate of blood vessel recruitment. If we can measure these higher-level properties experimentally then we can let nature do the abstracting for us, and just take the relevant output.
This would make our life easier (assuming such a measurement could be done and yielded reliable results).
But it is important how we interpret this measurement. Because the measurement abstracted not only over the details of the chemistry of VEGF-A but probably countless other factors like the geometry of the space in which the molecule is diffusing, the abundance of typically associated molecules, etc. And since — in this hypothetical example — we don’t have a good way to separate this overdetermination, we should not attribute the resultant higher-level property to VEGF-A 165 but to an appropriately defined higher-level object. Much as we wouldn’t assign a temperature to a ‘representative atom’ but instead define it as a higher-level property of ensembles.
Another example of such empirical abstraction that I frequently return to is the idea of effective games. Here, the abstract object (the effective game) rolls into its measurement the complex and difficult-to-calculate properties of the reductive object (the reductive game, spatial structure, etc.). In some ways, we lose this detail, but we gain a theory that is analyzable and understandable. This is what Peter Jeavons and I wrote about for Rockne’s roadmap. From this understanding, we can then roll back our abstraction and measure more specific things about, say, the spatial structure and thus build a slightly more complex effective theory. We can invert the usual direction of EGT.
But these more complex effective theories will then have a simpler theory to recapitulate at a higher level of abstraction — instead of just having to match some data.
In this way, I propose that we should focus more on working top-down. Building simple (preferably linear) effective theories that are reliable. And only after we are confident in them, should we try to build more reductive theories that when measured in the right way recapitulate the higher-order theories in full.
This is the exact opposite of Paul’s approach. Paul says we should start with a complex reductive model, and if we want a more workable theory then we (quasi-)linearize it.
But I think that we have to start with (quasi-)linear theories and only when they are well coupled to experiment, should we move toward more detail. Only once the simple theory has done all it can for us, should we move to a more reductive one.
I think that we can see successful examples of this throughout the history of science. We knew animal husbandry and selectively improved our crops before we knew about evolution. We could do genetics rather well before we knew about DNA. Or if we want to turn to physics: we could predict eclipses and the positions of planets in the sky before we knew about the inverse-square law of gravitation.
Note that this doesn’t mean that we shouldn’t build progressively more and more reductive theories. It just tells us where we should prioritize. I think that the current urge for many in mathematical oncology is to prioritize building reductive mechanistic theories of effects that aren’t well understood and don’t have any good existing theory to predict them. Instead, we should look at fields where good simple effective theories exist and then aim to build (more) reductive theories that fully recapitulate the effective theory and give us something further: a why.
So let’s return to Noel Aherne’s opening comment: “we have been curing cancers for decades with radiation without a full understanding of all the mechanisms”. Or maybe we can look at his work on how simple linear models of drug-induced cardiac toxicity outperform or do nearly as well “as a multi-model approach using three large-scale models consisting of 100s of differential equations combined with machine learning approach”.
This means that we have, either implicitly or explicitly, a good effective theory. So this is exactly where we should start building reductive theories to recapitulate our higher-order knowledge. We should expect these reductive theories to be worse at prediction (at least at first), but for that price, we’ll buy some answers to the question: why?
My biology writing focuses heavily on fitness landscapes and evolutionary games. On the surface, these might seem fundamentally different from each other, with their only common feature being that they are both about evolution. But there are many ways that we can interconnect these two approaches.
The most popular connection is to view these models as two different extremes in terms of time-scale.
When we are looking at evolution on short time-scales, we are primarily interested in which of a limited number of extant variants will take over the population, or how they’ll co-exist. We can take the effort to model the interactions of the different types with each other, and we summarize these interactions as games.
But when we zoom out to longer and longer timescales, the importance of these short term dynamics diminish. And we start to worry about how new types arise and take over the population. At this timescale, the details of the type interactions are not as important and we can just focus on the first-order: fitness. What starts to matter is how fitness of nearby mutants compares to each other, so that we can reason about long-term evolutionary trajectories. We summarize this as fitness landscapes.
From this perspective, the fitness landscapes are the more foundational concept. Games are the details that only matter in the short term.
In this post, I will take this exploration of fitness landscapes a little further and finally connect to games. Nothing profound will be said, but maybe it will give another look at a well-known object.
This was mostly because I feel that fitness landscapes have had less impact than games as concrete models in oncology.
Fitness landscapes conceptualize fitness as a single scalar value — a number. A scalar can only express cell-autonomous effects, where fitness is inherent to the properties of a single cell. But cancer displays important non-cell-autonomous effects that allow fitness to depend on a cell’s micro-environmental context, including the frequency of other cell types. And this is certainly not limited to cancer. Microenvironmental context and the abundance of other organisms almost always matter to the fitness of a particular type.
To accommodate this non-cell-autonomous (or more generally, non-type-autonomous) fitness, EGT views the types as strategies and models fitness as a function which depends on the abundance of strategies in the population.
This is the other way that we can connect fitness landscapes and games:
Fitness landscapes map types to a fitness scalar. Games map types to a fitness function.
Since any scalar can be represented as a constant function, this perspective makes games the more general and foundational perspective. At least in appearance, although often not in practice.
As is often the case, greater expressiveness comes at a price. In the case of the greater expressiveness of games, this price is a loss of analysis techniques. For example, when dealing with fitness landscapes, we can often consider the strong-selection weak-mutation limit. In this regime, we imagine that mutations are so rare that the population remains monomorphic except for a brief time during a selective sweep. This allows us as modellers to replace a population by a single point in the landscape.
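As an illustration of this regime (a toy sketch of my own, not from the original post; for simplicity, fitter neighbours are chosen uniformly rather than in proportion to fixation probability), the whole population reduces to an adaptive walk of a single point:

```python
import random

def sswm_walk(genotype, fitness, rng=random.Random(0)):
    """Strong-selection weak-mutation dynamics as an adaptive walk:
    the monomorphic population sits at a single genotype and jumps to
    a fitter one-mutant neighbour until it reaches a local peak."""
    genotype = list(genotype)
    while True:
        fitter = []
        for i in range(len(genotype)):
            neighbour = genotype.copy()
            neighbour[i] = 1 - neighbour[i]  # flip one locus
            if fitness(neighbour) > fitness(genotype):
                fitter.append(neighbour)
        if not fitter:
            return tuple(genotype)  # local peak: no fitter neighbours
        genotype = rng.choice(fitter)

# On a toy additive landscape (fitness = number of 1s), the walk
# climbs from the all-zeros genotype to the unique all-ones peak.
peak = sswm_walk((0, 0, 0, 0), fitness=sum)
```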
But suppose that we did want to work with a huge combinatorially structured space of strategies with frequency dependent fitness. A game landscape. Could we get started?
In the case of scalar fitness, it becomes useful to think about a fitness graph: with each edge between nearby mutants oriented from the lower-fitness to the higher-fitness type. This results in just two kinds of edges: a directed edge from one type to the other, or a neutral undirected edge (for equal fitness). More importantly, these edges have a nice global property: their directed graph is acyclic.
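The orientation rule is simple enough to sketch in a few lines (my own toy example, with an additive landscape standing in for a real fitness function); acyclicity follows because every directed edge strictly increases fitness:

```python
from itertools import product

def fitness_graph(fitness, n):
    """Orient each edge between one-mutant neighbours of binary
    genotypes from the lower-fitness to the higher-fitness type.
    Neutral (equal-fitness) edges are left undirected and omitted."""
    edges = []
    for g in product([0, 1], repeat=n):
        for i in range(n):
            if g[i] == 0:  # consider each undirected edge once
                h = g[:i] + (1,) + g[i + 1:]
                if fitness(g) < fitness(h):
                    edges.append((g, h))
                elif fitness(g) > fitness(h):
                    edges.append((h, g))
    return edges

# On a toy additive landscape, every directed edge increases fitness,
# so any directed path increases fitness and no cycle is possible.
edges = fitness_graph(sum, 3)
```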
Can we get something similar with game landscapes? The obvious first step is to limit ourselves to linear games and consider what weak-mutation dynamics might look like. Now our edge count increases to four: we can still have the two old kinds plus two new ones. But even with the old directed edges, the acyclic property disappears: just consider the rock-paper-scissors game. The new kinds get us even more. Let’s look at the third and fourth.
The third edge type arises in games like Stag-Hunt where a repulsive fixed point exists: this results in a fitness graph edge that points against either direction. This stops a population from moving across the edge, even in the random drift limit.
The fourth edge type is the most interesting. It arises in games like Hawk-Dove where an attractive fixed point exists. This results in a fitness graph edge that points inwards. It moves the population to a fixed point where both types co-exist and thus even in the strong-selection weak-mutation limit creates a polymorphic population. This can be difficult to deal with.
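All four edge kinds can be read off a 2x2 payoff matrix from the two invasion conditions. Here is a minimal sketch (my own illustration; `edge_type` is a hypothetical helper, and the payoff values are the textbook Hawk-Dove and Stag Hunt games):

```python
def edge_type(a, b, c, d):
    """Classify the fitness-graph edge between strategies A and B for
    the 2x2 payoff matrix [[a, b], [c, d]] (row player's payoffs).
    A invades a B-monomorphic population iff b > d;
    B invades an A-monomorphic population iff c > a."""
    A_invades = b > d
    B_invades = c > a
    if A_invades and not B_invades:
        return 'toward A'
    if B_invades and not A_invades:
        return 'toward B'
    if A_invades and B_invades:
        return 'inward'   # attractive fixed point, Hawk-Dove-like
    return 'outward'      # repulsive fixed point, Stag-Hunt-like

# Hawk-Dove with V = 2, C = 3: each type invades the other.
hawk_dove = edge_type((2 - 3) / 2, 2, 0, 2 / 2)
# Stag Hunt: neither type can invade the other.
stag_hunt = edge_type(3, 0, 2, 1)
```

The ‘inward’ case is the troublesome one for weak-mutation analysis: the population stalls at a polymorphic fixed point instead of a vertex of the landscape.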
But the most non-obvious part of game landscapes is how to represent them. Traditional fitness landscapes are exponentially large objects, so we don’t simply store a long list of scalar fitness values. Instead, we use a compact representation like the NK-model that tells us how to quickly compute a fitness from a genotype. We would need something similar for game landscapes: a mapping from genotype to fitness function. Here two new difficulties arise. First, we need to output a function, not a scalar, so doing something simple like adding together fitness components (as the NK-model does) doesn’t work anymore. Second, what is the domain of the resulting fitness function? The naive answer is that it is all the genotypes: so an exponential domain. This means that our compact mapping from genotype to fitness function has to output a compact mapping of its own (and not just an array of linear coefficients for each potential interaction partner)!
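For the scalar case, a minimal sketch of the NK-model’s compact representation (my own toy implementation, with lazily drawn fitness components) looks like this; the game-landscape challenge would be to return a compact function instead of the final scalar:

```python
import random

def make_nk_landscape(N, K, seed=1):
    """A compact NK-model representation: the fitness of a genotype is
    the average of N components, where component i depends on locus i
    and K other (randomly chosen) loci. Components are drawn lazily,
    so we never store all 2^N fitness values."""
    rng = random.Random(seed)
    neighbours = [rng.sample([j for j in range(N) if j != i], K)
                  for i in range(N)]
    tables = [{} for _ in range(N)]

    def fitness(genotype):
        total = 0.0
        for i in range(N):
            key = (genotype[i],) + tuple(genotype[j] for j in neighbours[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()
            total += tables[i][key]
        return total / N

    return fitness

f = make_nk_landscape(N=8, K=2)
g = (0, 1, 0, 0, 1, 1, 0, 1)
# Deterministic once drawn, and always an average of values in [0, 1).
```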
Do you have any suggestions on how to handle these difficulties, dear reader?
I think that these challenges of game landscapes can be addressed in interesting ways. In ways that differ from the standard use of fitness landscapes or the replicator-mutator equation. But I’ll tackle thoughts on this in a future post.