In this blog you can read about many things here: from math to physics to earth science and biology, computer science and the technologies of today and tomorrow—but in general, centered around the theme of what scientists, engineers and programmers can do to help save a planet in crisis.
Jonathan Lorand is a math grad student at the University of Zurich working on symplectic and Poisson geometry with Alberto Cattaneo. Fabrizio Genovese is a grad student in computer science at the University of Oxford, working with Bob Coecke and Dan Marsden on categorical quantum mechanics, quantum field theory and the like.
Brendan was my student, so it’s nice to see newer students writing a clear summary of some of his thesis work, namely this paper:
• Brendan Fong, Decorated cospans, Theory and Applications of Categories30 (2015), 1096–1120.
I wrote a summary of it myself, so I won’t repeat it here:
What’s especially interesting to me is that both Jonathan and Fabrizio know some mathematical physics, and they’re part of a group who will be working with me on some problems as part of the Applied Category Theory 2018 school! Brendan and Blake Pollard and I used symplectic geometry and decorated cospans to study the black-boxing of electrical circuits and Markov processes… maybe we should try to go further with that project!
Graham’s number is famous for being the largest number to have ever shown up in a proof. The true story is more complicated, as I discovered by asking Graham. But here’s a smaller but still respectably large number that showed up in knot theory:
It’s 2 to the 2 to the 2 to the 2… where we go on for 101,000,000 times. It appears in a 2011 paper by Coward and Lackenby. It shows up in their upper bound on how many steps it can take to wiggle around one picture of a link until you get another picture of the same link.
This upper bound is ridiculously large. But because this upper bound is computable, it follows that we can decide, in a finite amount of time, whether two pictures show the same link or not. We know when to give up. This had previously been unknown!
A link is a collection of circle embedded in 3-dimensional Euclidean space. We count two links as ‘the same’, or ‘ambient isotopic’, if we can carry one to another by a smooth motion where no circle ever crosses another. (This can be made more precise.) We can draw links in the plane:
and we can get between any two diagrams of the same link by distorting the plane but also doing a sequence of ‘Reidemeister moves’. There are 3 kinds of Reidemeister moves, shown above and also here:
Coward and Lackenby found an upper bound on how many Reidemeister moves it takes to get between two diagrams of the same link. Let n be the total number of crossings in both diagrams. Then we need at most 2 to the 2 to the 2 to the 2 to the 2… Reidemeister moves, where the number of 2’s in this tower is cn, where c = 101,000,000.
It’s fun to look at the paper and see how they get such a terrible upper bound. I’m sure they could have done much better with a bit of work, but that wasn’t the point. All they wanted was a computable upper bound.
Subsequently, Lackenby proved a polynomial upper bound on how many Reidemeister moves it takes to reduce a diagram of the unknot to a circle, like this:
If the original diagram has n crossings, he proved it takes at most (236n11 Reidemeister moves. Because this is a polynomial, it follows that recognizing whether a knot diagram is a diagram of the unknot is in NP. As far as I know, it remains an open question whether this problem is in P.
It may be almost done. So, it would be great if people here could take a look and comment on it! It’s a cool mix of probability theory and double categories. I’ve posted a similar but non-isomorphic article on the n-Category Café, where people know a lot about double categories. But maybe some of you here know more about Markov processes!
‘Coarse-graining’ is a standard method of extracting a simple Markov process from a more complicated one by identifying states. We extend coarse-graining to open Markov processes. An ‘open’ Markov process is one where probability can flow in or out of certain states called ‘inputs’ and ‘outputs’. One can build up an ordinary Markov process from smaller open pieces in two basic ways:
• composition, where we identify the outputs of one open Markov process with the inputs of another,
• tensoring, where we set two open Markov processes side by side.
A while back, Brendan Fong, Blake Pollard and I showed that these constructions make open Markov processes into the morphisms of a symmetric monoidal category:
Here Kenny and I go further by constructing a symmetric monoidal double category where the 2-morphisms include ways of coarse-graining open Markov processes. We also extend the previously defined ‘black-boxing’ functor from the category of open Markov processes to this double category.
But before you dive into the paper, let me explain all this stuff a bit more….
Very roughly speaking, a ‘Markov process’ is a stochastic model describing a sequence of transitions between states in which the probability of a transition depends only on the current state. But the only Markov processes talk about are continuous-time Markov processes with a finite set of states. These can be drawn as labeled graphs:
where the number labeling each edge describes the probability per time of making a transition from one state to another.
An ‘open’ Markov process is a generalization in which probability can also flow in or out of certain states designated as ‘inputs’ and outputs’:
Open Markov processes can be seen as morphisms in a category, since we can compose two open Markov processes by identifying the outputs of the first with the inputs of the second. Composition lets us build a Markov process from smaller open parts—or conversely, analyze the behavior of a Markov process in terms of its parts.
In this paper, Kenny extend the study of open Markov processes to include coarse-graining. ‘Coarse-graining’ is a widely studied method of simplifying a Markov process by mapping its set of states onto some smaller set in a manner that respects the dynamics. Here we introduce coarse-graining for open Markov processes. And we show how to extend this notion to the case of maps that are not surjective, obtaining a general concept of morphism between open Markov processes.
Since open Markov processes are already morphisms in a category, it is natural to treat morphisms between them as morphisms between morphisms, or ‘2-morphisms’. We can do this using double categories!
Double categories were first introduced by Ehresmann around 1963. Since then, they’ve used in topology and other branches of pure math—but more recently they’ve been used to study open dynamical systems and open discrete-time Markov chains. So, it should not be surprising that they are also useful for open Markov processes..
A 2-morphism in a double category looks like this:
While a mere category has only objects and morphisms, here we have a few more types of things. We call and ‘objects’, and ‘vertical 1-morphisms’, and ‘horizontal 1-cells’, and a ‘2-morphism’. We can compose vertical 1-morphisms to get new vertical 1-morphisms and compose horizontal 1-cells to get new horizontal 1-cells. We can compose the 2-morphisms in two ways: horizontally by setting squares side by side, and vertically by setting one on top of the other. The ‘interchange law’ relates vertical and horizontal composition of 2-morphisms.
In a ‘strict’ double category all these forms of composition are associative. In a ‘pseudo’ double category, horizontal 1-cells compose in a weakly associative manner: that is, the associative law holds only up to an invertible 2-morphism, the ‘associator’, which obeys a coherence law. All this is just a sketch; for details on strict and pseudo double categories try the paper by Grandis and Paré.
Kenny and I construct a double category with:
finite sets as objects,
maps between finite sets as vertical 1-morphisms,
open Markov processes as horizontal 1-cells,
morphisms between open Markov processes as 2-morphisms.
I won’t give the definition of item 4 here; you gotta read our paper for that! Composition of open Markov processes is only weakly associative, so is a pseudo double category.
This is how our paper goes. In Section 2 we define open Markov processes and steady state solutions of the open master equation. In Section 3 we introduce coarse-graining first for Markov processes and then open Markov processes. In Section 4 we construct the double category described above. We prove this is a symmetric monoidal double category in the sense defined by Mike Shulman. This captures the fact that we can not only compose open Markov processes but also ‘tensor’ them by setting them side by side.
For example, if we compose this open Markov process:
with the one I showed you before:
we get this open Markov process:
But if we tensor them, we get this:
As compared with an ordinary Markov process, the key new feature of an open Markov process is that probability can flow in or out. To describe this we need a generalization of the usual master equation for Markov processes, called the ‘open master equation’.
This is something that Brendan, Blake and I came up with earlier. In this equation, the probabilities at input and output states are arbitrary specified functions of time, while the probabilities at other states obey the usual master equation. As a result, the probabilities are not necessarily normalized. We interpret this by saying probability can flow either in or out at both the input and the output states.
If we fix constant probabilities at the inputs and outputs, there typically exist solutions of the open master equation with these boundary conditions that are constant as a function of time. These are called ‘steady states’. Often these are nonequilibrium steady states, meaning that there is a nonzero net flow of probabilities at the inputs and outputs. For example, probability can flow through an open Markov process at a constant rate in a nonequilibrium steady state. It’s like a bathtub where water is flowing in from the faucet, and flowing out of the drain, but the level of the water isn’t changing.
Brendan, Blake and I studied the relation between probabilities and flows at the inputs and outputs that holds in steady state. We called the process of extracting this relation from an open Markov process ‘black-boxing’, since it gives a way to forget the internal workings of an open system and remember only its externally observable behavior. We showed that black-boxing is compatible with composition and tensoring. In other words, we showed that black-boxing is a symmetric monoidal functor.
In Section 5 of our new paper, Kenny and I show that black-boxing is compatible with morphisms between open Markov processes. To make this idea precise, we prove that black-boxing gives a map from the double category to another double category, called , which has:
finite-dimensional real vector spaces as objects,
linear maps as vertical 1-morphisms from to ,
linear relations as horizontal 1-cells from to ,
obeying as 2-morphisms.
Here a ‘linear relation’ from a vector space to a vector space is a linear subspace . Linear relations can be composed in the usual way we compose relations. The double category becomes symmetric monoidal using direct sum as the tensor product, but unlike it is a strict double category: that is, composition of linear relations is associative.
Our main result, Theorem 5.5, says that black-boxing gives a symmetric monoidal double functor
As you’ll see if you check out our paper, there’s a lot of nontrivial content hidden in this short statement! The proof requires a lot of linear algebra and also a reasonable amount of category theory. For example, we needed this fact: if you’ve got a commutative cube in the category of finite sets:
and the top and bottom faces are pushouts, and the two left-most faces are pullbacks, and the two left-most arrows on the bottom face are monic, then the two right-most faces are pullbacks. I think it’s cool that this is relevant to Markov processes!
Finally, in Section 6 we state a conjecture. First we use a technique invented by Mike Shulman to construct symmetric monoidal bicategories and from the symmetric monoidal double categories and . We conjecture that our black-boxing double functor determines a functor between these symmetric monoidal bicategories. This has got to be true. However, double categories seem to be a simpler framework for coarse-graining open Markov processes.
Finally, let me talk a bit about some related work. As I already mentioned, Brendan, Blake and I constructed a symmetric monoidal category where the morphisms are open Markov processes. However, we formalized such Markov processes in a slightly different way than Kenny and I do now. We defined a Markov process to be one of the pictures I’ve been showing you: a directed multigraph where each edge is assigned a positive number called its ‘rate constant’. In other words, we defined it to be a diagram
where is a finite set of vertices or ‘states’, is a finite set of edges or ‘transitions’ between states, the functions give the source and target of each edge, and $r : E \to (0,\infty)$ gives the rate constant for each transition. We explained how from this data one can extract a matrix of real numbers called the ‘Hamiltonian’ of the Markov process, with two properties that are familiar in this game:
• if ,
• for all .
A matrix with these properties is called ‘infinitesimal stochastic’, since these conditions are equivalent to being stochastic for all .
In our new paper, Kenny and I skip the directed multigraphs and work directly with the Hamiltonians! In other words, we define a Markov process to be a finite set together with an infinitesimal stochastic matrix . This allows us to work more directly with the Hamiltonian and the all-important ‘master equation’
which describes the evolution of a time-dependent probability distribution
Clerc, Humphrey and Panangaden have constructed a bicategory with finite sets as objects, ‘open discrete labeled Markov processes’ as morphisms, and ‘simulations’ as 2-morphisms. The use the word ‘open’ in a pretty similar way to me. But their open discrete labeled Markov processes are also equipped with a set of ‘actions’ which represent interactions between the Markov process and the environment, such as an outside entity acting on a stochastic system. A ‘simulation’ is then a function between the state spaces that map the inputs, outputs and set of actions of one open discrete labeled Markov process to the inputs, outputs and set of actions of another.
Another compositional framework for Markov processes was discussed by de Francesco Albasini, Sabadini and Walters. They constructed an algebra of ‘Markov automata’. A Markov automaton is a family of matrices with non-negative real coefficients that is indexed by elements of a binary product of sets, where one set represents a set of ‘signals on the left interface’ of the Markov automata and the other set analogously for the right interface.
So, double categories are gradually invading the theory of Markov processes… as part of the bigger trend toward applied category theory. They’re natural things; scientists should use them.
Let me try to explain it in a simplified way. I think all cool math should be known more widely than it is. Getting this to happen requires a lot of explanations at different levels.
The Peano axioms are a nice set of axioms describing the natural numbers. But thanks to Gödel’s incompleteness theorem, these axioms can’t completely nail down the structure of the natural numbers. So, there are lots of different ‘models’ of Peano arithmetic.
These are often called ‘nonstandard’ models. If you take a model of Peano arithmetic—say, your favorite ‘standard’ model —you can get other models by throwing in extra natural numbers, larger than all the standard ones. These nonstandard models can be countable or uncountable. For more, try this:
Starting with any of these models you can define integers in the usual way (as differences of natural numbers), and then rational numbers (as ratios of integers). So, there are lots of nonstandard versions of the rational numbers. Any one of these will be a field: you can add, subtract, multiply and divide your nonstandard rationals, in ways that obey all the usual basic rules.
Now for the cool part: if your nonstandard model of the natural numbers is small enough, your field of nonstandard rational numbers can be found somewhere in the standard field of complex numbers!
In other words, your nonstandard rationals are a subfield of the usual complex numbers: a subset that’s closed under addition, subtraction, multiplication and division by things that aren’t zero.
This is counterintuitive at first, because we tend to think of nonstandard models of Peano arithmetic as spooky and elusive things, while we tend to think of the complex numbers as well-understood.
However, the field of complex numbers is actually very large, and it has room for many spooky and elusive things inside it. This is well-known to experts, and we’re just seeing more evidence of that.
I said that all this works if your nonstandard model of the natural numbers is small enough. But what is “small enough”? Just the obvious thing: your nonstandard model needs to have a cardinality smaller than that of the complex numbers. So if it’s countable, that’s definitely small enough.
This fact was recently noticed by Alfred Dolich at a pub after a logic seminar at the City University of New York. The proof is very easy if you know this result: any field of characteristic zero whose cardinality is less than or equal to that of the continuum is isomorphic to some subfield of the complex numbers. So, unsurprisingly, it turned out to have been repeatedly discovered before.
And the result I just mentioned follows from this: any two algebraically closed fields of characteristic zero that have the same uncountable cardinality must be isomorphic. So, say someone hands you a field F of characteristic zero whose cardinality is smaller than that of the continuum. You can take its algebraic closure by throwing in roots to all polynomials, and its cardinality won’t get bigger. Then you can throw in even more elements, if necessary, to get a field whose cardinality is that of the continuum. The resulting field must be isomorphic to the complex numbers. So, F is isomorphic to a subfield of the complex numbers.
To round this off, I should say a bit about why nonstandard models of Peano arithmetic are considered spooky and elusive. Tennenbaum’s theorem says that for any countable non-standard model of Peano arithmetic there is no way to code the elements of the model as standard natural numbers such that either the addition or multiplication operation of the model is a computable function on the codes.
We can, however, say some things about what these countable nonstandard models are like as ordered sets. They can be linearly ordered in a way compatible with addition and multiplication. And then they consist of one copy of the standard natural numbers, followed by a lot of copies of the standard integers, which are packed together in a dense way: that is, for any two distinct copies, there’s another distinct copy between them. Furthermore, for any of these copies, there’s another copy before it, and another after it.
I should also say what’s good about algebraically closed fields of characteristic zero: they are uncountably categorical. In other words, any two models of the axioms for an algebraically closed field with the same cardinality must be isomorphic. (This is not true for the countable models: it’s easy to find different countable algebraically closed fields of characteristic zero. They are not spooky and elusive.)
So, any algebraically closed field whose cardinality is that of the continuum is isomorphic to the complex numbers!
For more on the logic of complex numbers, written at about the same level as this, try this post of mine:
Jules Hedges is a postdoc in the computer science department at Oxford who is applying category theory to game theory and economics. Daniel Cicala is a grad student working with me on a compositional approach to graph rewriting, which is about stuff like this:
The picture at top shows a ‘rewrite rule’ where the operation of multiplying by 2 is replaced by the operation of adding something to itself. We can then apply this rule to the larger graph at bottom left to get the graph at bottom right. The things being rewritten here aren’t bare graphs: they’re labelled in various ways. So, we need to understand what generalizations of graphs this idea applies to… and in the end, it seems they apply to objects in any adhesive category—or for example, any topos.
Together with my student Kenny Courser, Daniel has been investigating bicategories where the morphisms are ‘open objects’ (for example open graphs, meaning graphs with input and output vertices) and the 2-morphisms are rewrites:
• Daniel Cicala, Spans of cospans, Theory and Applications of Categories 33 (2018), 131–147.
Abstract. We discuss the notion of a span of cospans and define, for them, horizonal and vertical composition. These compositions satisfy the interchange law if working in a topos C and if the span legs are monic. A bicategory is then constructed from C-objects, C-cospans, and doubly monic spans of C-cospans. The primary motivation for this construction is an application to graph rewriting.
Abstract. For a topos T, there is a bicategory MonicSp(Csp(T)) whose objects are those of T, morphisms are cospans in T, and 2-morphisms are isomorphism classes of monic spans of cospans in T. Using a result of Shulman, we prove that MonicSp(Csp(T)) is symmetric monoidal, and moreover, that it is compact closed in the sense of Stay. We provide an application which illustrates how to encode double pushout rewrite rules as 2-morphisms inside a compact closed sub-bicategory of MonicSp(Csp(Graph)).
This stuff sounds abstract but it’s an important part of my network theory program! Recently Daniel Cicala has noticed that some of the bicategories he’s getting are ‘cartesian bicategories’ in the sense of this paper:
A global crash in insect populations has found its way to Australia, with entomologists across the country reporting lower than average numbers of wild insects.
University of Sydney entomologist Dr. Cameron Webb said researchers around the world widely acknowledge that insect populations are in decline, but are at a loss to determine the cause.
“On one hand it might be the widespread use of insecticides, on the other hand it might be urbanisation and the fact that we’re eliminating some of the plants where it’s really critical that these insects complete their development,” Dr Webb said.
“Add in to the mix climate change and sea level rise and it’s incredibly difficult to predict exactly what it is. It’s left me dumbfounded.”
Entomologist and owner of the Australian Insect Farm, near Innisfail in far north Queensland, Jack Hasenpusch is usually able to collect swarms of wild insects at this time of year.
“I’ve been wondering for the last few years why some of the insects have been dropping off and put it down to lack of rainfall,” Mr. Hasenpusch said.
“This year has really taken the cake with the lack of insects, it’s left me dumbfounded, I can’t figure out what’s going on.”
Mr Hasenpusch said entomologists he had spoken to from Sydney, Brisbane, Perth and even as far away as New Caledonia and Italy all had similar stories.
The Australian Butterfly Sanctuary in Kuranda, west of Cairns, has had difficulty breeding the far north’s iconic Ulysses butterfly for more than two years.
“We’ve had [the problem] checked by scientists, the University of Queensland was involved, Biosecurity Queensland was involved but so far we haven’t found anything unusual in the bodies [of caterpillars] that didn’t survive,” said breeding laboratory supervisor Tina Kupke.
“We’ve had some short successes but always failed in the second generation.”
Ms. Lupke said the problem was not confined to far north Queensland, or even Australia. “Some of our pupae go overseas from some of our breeders here and they’ve all had the same problem,” she said. “And the Melbourne Zoo has been trying for quite a while with the same problems.”
Limited lifecycle prefaces population plummet
Dr. Webb, who primarily researches mosquitoes, said numbers were also in decline across New South Wales this year, which was indicative of the situation in other insect populations.
“We’ve had a really strange summer; it’s been very dry, sometimes it’s been brutally hot but sometimes it’s been cooler than average,” he said.
“Mosquito populations, much like a lot of other insects, rely on the combination of water, humidity and temperature to complete their lifecycle. When you mix around any one of those three components you can really change the local population dynamics.”
All this reminds me of a much more detailed study showing a dramatic insect population decline in Germany over a much longer time period:
Now, a new set of long-term data is coming to light, this time from a dedicated group of mostly amateur entomologists who have tracked insect abundance at more than 100 nature reserves in western Europe since the 1980s.
Over that time the group, the Krefeld Entomological Society, has seen the yearly insect catches fluctuate, as expected. But in 2013 they spotted something alarming. When they returned to one of their earliest trapping sites from 1989, the total mass of their catch had fallen by nearly 80%. Perhaps it was a particularly bad year, they thought, so they set up the traps again in 2014. The numbers were just as low. Through more direct comparisons, the group—which had preserved thousands of samples over 3 decades—found dramatic declines across more than a dozen other sites.
It also mentions a similar phenomenon in Scotland:
Since 1968, scientists at Rothamsted Research, an agricultural research center in Harpenden, U.K., have operated a system of suction traps—12-meter-long suction tubes pointing skyward. Set up in fields to monitor agricultural pests, the traps capture all manner of insects that happen to fly over them; they are “effectively upside-down Hoovers running 24/7, continually sampling the air for migrating insects,” says James Bell, who heads the Rothamsted Insect Survey.
Between 1970 and 2002, the biomass caught in the traps in southern England did not decline significantly. Catches in southern Scotland, however, declined by more than two-thirds during the same period. Bell notes that overall numbers in Scotland were much higher at the start of the study. “It might be that much of the [insect] abundance in southern England had already been lost” by 1970, he says, after the dramatic postwar changes in agriculture and land use.
“Higher algebra” has become important throughout mathematics, physics, and mathematical physics, and this conference will bring together leading experts in higher algebra and its mathematical physics applications. In physics, the term “algebra” is used quite broadly: any time you can take two operators or fields, multiply them, and write the answer in some standard form, a physicist will be happy to call this an “algebra”. “Higher algebra” is characterized by the appearance of a hierarchy of multilinear operations (e.g. A-infinity and L-infinity algebras). These structures can be higher categorical in nature (e.g. derived categories, cohomology theories), and can involve mixtures of operations and co-operations (Hopf algebras, Frobenius algebras, etc.). Some of these notions are purely algebraic (e.g. algebra objects in a category), while others are quite geometric (e.g. shifted symplectic structures).
An early manifestation of higher algebra in high-energy physics was supersymmetry. Supersymmetry makes quantum field theory richer and thus more complicated, but at the same time many aspects become more tractable and many problems become exactly solvable. Since then, higher algebra has made numerous appearances in mathematical physics, both high- and low-energy._
Participation is limited. Some financial support is available for early-career mathematicians. For more information and to apply, please visit the conference website of the institute closer to you:
If you have any questions, please write to email@example.com.
One of the organizers, Aaron Mazel-Gee, told me:
We are also interested in spreading the idea of double conferences more generally: we’re hoping that our own event’s success inspires other academic communities to organize their own double conferences. We’re hoping to eventually compile a sort of handbook to streamline the process for others, so that they can learn from our own experiences regarding the various unique challenges that organizing such an event poses. Anyways, all of this is just to say that I would be happy for you to publicize this event anywhere that it might reach these broader audiences.
So, if you’re interested in having a double conference, please contact the organizers of this one for tips on how to do it! I’m sure they’ll have better advice after they’ve actually done it. I’ve found that the technical details really matter for these things: it can be very frustrating when they don’t work correctly. Avoiding such problems requires testing everything ahead of time—under conditions that exactly match what you’re planning to do!
But we’ve done a lot more, and my blog articles are having trouble keeping up! So I’d like to sketch out the big picture as it stands today.
If I had to summarize, I’d say we’ve developed a formalism for step-by-step compositional design and tasking, using commitment networks. But this takes a while to explain.
Here’s a very simple example of a commitment network:
It has four nodes, which represent agents: a port, a helicopter, a UAV (an unmanned aerial vehicle, or drone) and a target. The edges between these notes describe relationships between these agents. Some of these relationships are ‘commitments’. For example, the edges labelled ‘SIR’ say that one agent should ‘search, intervene and rescue’ the other.
Our framework for dealing with commitment networks has some special features. It uses operads, but this isn’t really saying much. An ‘operad’ is a bunch of ways of sticking things together. An ‘algebra’ of the operad gives a particular collection of these things, and says what we get when we stick them together. These concepts are extremely general, so there’s a huge diversity of operads, each with a huge diversity of algebras. To say one is using operads to solve a problem is a bit like saying one is using math. What matters more is the specific kind of operad one is using, and how one is using it.
For our work, we needed to develop a new class of operads called network operads, which are described here:
• John Baez, John Foley, Joseph Moeller and Blake Pollard, Network models.
In this paper we mainly discuss communication networks. Subsequently we’ve been working on a special class of network operads that describe how to build commitment networks.
Here are some of key ideas:
• Using network operads we can build bigger networks from smaller ones by overlaying them. David Spivak’s operad of wiring diagrams only let us ‘wire together’ smaller networks to form bigger ones:
Here networks X1, X2 and X3 are being wired together to form Y.
Network operads also let us wire together networks, but in addition they let us take one network:
and overlay another:
to create a larger network:
This is a new methodology for designing systems. We’re all used to building systems by wiring together subsystems: anyone who has a home stereo system has done this. But overlaying systems lets us do more. For example, we can take two plans of action involving the same collection of agents, and overlay them to get a new plan. We’ve all done this informally: you tell a bunch of people to do things… and then tell the same people, or an overlapping set of people, to do some other things. But we all know the problems that can arise. A mathematically principled approach can avoid some of these problems.
• The nodes of our networks represent agents of various types. The edges represent various relationships between agents. For example, they can represent communication channels. But more interestingly, they can represent commitments. For example, we can have an edge from A to B saying that agent A has to go rescue agent B. We call this kind of network a commitment network.
• By overlaying commitment networks, we can not only build systems out of smaller pieces but also build complicated plans by overlaying smaller pieces of plans. Since ‘tasking’ means telling a system what to do, we call this compositional tasking.
• If one isn’t careful, overlaying commitment networks can produce conflicts. Suppose we have a network with an edge saying that agent A has to rescue agent B. On top of this we overlay a network with an edge saying that A has to rescue agent C. If A can’t do both of these tasks at once, what should A do? There are various choices. We need to build a specific choice into the framework, so we can freely overlay commitment networks and get a well-defined result that doesn’t overburden the agents involved. We call this automatic deconflicting.
• Our approach to automatic deconflicting uses an idea developed by the famous category theorist Bill Lawvere: graphic monoids. I’ll explain these later, along with some of their side-benefits.
• Networks operads should let us do step-by-step compositional tasking. In other words, they should let us partially automate the process of tasking networks of agents, both
1) compositionally: tasking smaller networks and then sticking them together, e.g. by overlaying them, to get larger networks,
2) in a step-by-step way, starting at a low level of detail and then increasing the amount of detail.
To do this we need not just operads but their algebras.
• Remember, a network operad is a bunch of ways to stick together networks of some kind, e.g. by overlaying them. An algebra of this operad specifies a particular collection of networks of this kind, and says what we actually get when we stick them together.
So, a network operad can have one algebra in which things are described in a bare-bones, simplified way, and another algebra in which things are described in more detail. Indeed it will typically have many algebras, corresponding to many levels of detail, but for simplicity let’s just think about two.
When we have a ‘less detailed’ algebra and a ‘more detailed’ algebra they will typically be related by a map
which ‘forgets the extra details’. This sort of map is called a homomorphism of algebras. We give examples in our paper Network models.
But what we usually want to do, when designing a system, is not forget extra detail, but rather add extra detail to a rough specification. There is not always a systematic way to do this. If there is, then we may have a homomorphism
going back the other way. This lets us automate the process of filling in the details. But we can’t usually count on being able to do this. So, often we may have to start with an element of and search for an element of that is mapped to it by And typically we want this element to be optimal, or at least ‘good enough’, according to some chosen criteria. Expressing this idea formally helps us figure out how to automate the search. John Foley, in particular, has been working on this.
That’s an overview of our ideas.
Next, for the mathematically inclined, I want to give a few more details on one of the new elements not mentioned in our Network models paper: ‘graphic monoids’.
A monoid is a set M with an binary product that is associative and has an identity element 1:
Thanks to the associative law we don’t need to write parentheses when we multiply a string of elements in a monoid.
A graphic monoid is a monoid in which the graphic identity
holds for all
To understand this, let us think of the elements of the monoid as “commitments”. The product means “first committing to do then committing to do ”. The graphic identity says that if we first commit to do , then , and then again, it’s the same as first committing to do and then Committing to do again doesn’t change anything!
In particular, in any graphic monoid we have
so making the same commitment twice is the same as making it once. Mathematically we say every element of a graphic monoid is idempotent:
A commutative monoid obeying this law automatically obeys the graphic identity, since then
But for a noncommutative monoid, the graphic identity is stronger than . It says that after committing to no matter what intervening commitments one might have made, committing to again has no further effect. In other words: the intervening commitments did not undo the original commitment, so making the original commitment a second time has no effect! This captures the idea of how promises should behave.
In our paper Network models we explain how the ‘overlay’ operation makes the collection of networks involving a given set of agents into a monoid. In a commitment network this monoid is required to be a graphic monoid. Joseph Moeller is writing a paper that shows how to construct a large class of commitment networks. We will follow this with a paper illustrating how to use these in compositional tasking.
For now, let me just mention a side-benefit. In any graphic monoid we can define a relation by
In the context of commitment networks, means that starting from we can reach by making some further commitment : that is, for some . So, as we ‘task’ a collection of agents by giving them more and more commitments, we move up in this partial order.
I’m looking forward to this workshop organized by Spencer Breiner at the National Institute of Standards and Technology:
• Applied Category Theory: Bridging Theory & Practice, March 15–16, 2018, NIST, Gaithersburg, Maryland, USA.
It’s by invitation only, but it’s part of a movement I’m excited about, so I can’t resist mentioning its existence. Here’s the idea:
What: The Information Technology Laboratory at NIST is pleased to announce a workshop on Applied Category Theory to be held at NIST’s Gaithersburg, Maryland campus on March 15 & 16, 2018. The meeting will focus on practical avenues for introducing methods from category theory into real-world applications, with an emphasis on examples rather than theorems.
Who: The workshop aims to bring together two distinct groups. First, category theorists interested in pursuing applications outside of the usual mathematical fields. Second, domain experts and research managers from industry, government, science and engineering who have in mind potential domain applications for categorical methods.
Intended Outcomes: A proposed landscape of potential CT applications and the infrastructure needed to realize them, together with a 5-10 year roadmap for developing the field of applied category theory. This should include perspectives from industry, academia and government as well as major milestones, potential funding sources, avenues for technology transfer and necessary improvements in tool support and methodology. Exploratory collaborations between category theorists and domain experts. We will ask that each group come prepared to meet the other side. Mathematicians should be prepared with concrete examples that demonstrate practical applications of CT in an intuitive way. Domain experts should bring to the table specific problems to which they can devote time and/or funding as well as some reasons about why they think CT might be relevant to this application.
John Baez (University of California at Riverside) and John Foley (Metron Scientific Solutions).
Bob Coecke (University of Oxford).
Dusko Pavlovic (University of Hawaii).
Some other participants include Arquimedes Canedo (Siemens at Princeton), Brendan Fong (MIT), Steve Huntsman (BAE Systems), John Paschkewitz (DARPA), Blake Pollard (NIST), Alberto Speranzon (Honeywell Aerospace), David Spivak (MIT), Eswaran Subrahmanian (Carnegie Mellon), Jamie Vicary (Birmingham and Oxford), and Ryan Wisnesky (Categorical Informatics). But this list is heavily biased toward people I already know, and I’m looking forward to meeting others too!
I hope we make a lot of progress—and I hope to tell you what we did!
This was written by my grad student Jade Master along with Cory Griffith, an undergrad at Stanford. Jade is currently working with me on the semantics of open Petri nets.
What’s the basic idea of this linguistics and category theory stuff? I don’t know much about this, but I can say a bit.
Since category theory is great for understanding the semantics of programming languages, it makes sense to try it for human languages, even though they’re much harder. The first serious attempt I know was by Jim Lambek, who introduced pregroup grammars in 1958:
In this article he hid the connection to category theory. But when you start diagramming sentences or phrases using his grammar, as below, you get planar string diagrams as shown above. So it’s not surprising—if you’re in the know—that he’s secretly using monoidal categories where every object has a right dual and, separately, a left dual.
This fact is just barely mentioned in the Wikipedia article:
This stuff is hugely fun, so I’m wondering why I never looked into it before! When I talked to Lambek, who is sadly no longer with us, it was mainly about his theories relating particle physics to quaternions.
Recently Mehrnoosh Sadrzadeh and Bob Coecke have taken up Lambek’s ideas, relating them to the category of finite-dimensional vector spaces. Choosing a monoidal functor from a pregroup grammar to this category allows one to study linguistics using linear algebra! This simplifies things, perhaps a bit too much—but it makes it easy to do massive computations, which is very popular in this age of “big data” and machine learning.
It also sets up a weird analogy between linguistics and quantum mechanics, which I’m a bit suspicious of. While the category of finite-dimensional vector spaces with its usual tensor product is monoidal, and has duals, it’s symmetric, so the difference between writing a word to the left of another and writing it to the right of another gets washed out! I think instead of using vector spaces one should use modules of some noncommutative Hopf algebra, or something like that. Hmm… I should talk to those folks.
To discuss this, please visit The n-Category Café, since there’s a nice conversation going on there and I don’t want to split it. There has also been a conversation on Google+, and I’ll quote some of it here, so you don’t have to run all over the internet.
Noam Zeilberger wrote:
You might have been simplifying things for the post, but a small comment anyways: what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”. Lambek grammars actually correspond to monoidal biclosed categories in disguise (i.e., based on left/right division rather than left/right duals), and may also be considered without a unit (as in his original paper). (I only have a passing familiarity with this stuff, though, and am not very clear on the difference in linguistic expressivity between grammars based on division vs grammars based on duals.)
Noam Zeilberger wrote:
If you haven’t seen it before, you might also like Lambek’s followup paper “On the calculus of syntactic types”, which generalized his original calculus by dropping associativity (so that sentences are viewed as trees rather than strings). Here are the first few paragraphs from the introduction:
…and here is a bit near the end of the 1961 paper, where he made explicit how derivations in the (original) associative calculus can be interpreted as morphisms of a monoidal biclosed category:
John Baez wrote:
Noam Zeilberger wrote: “what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”.”
Can you say what the difference is? I wasn’t simplifying things on purpose; I just don’t know this stuff. I think monoidal biclosed categories are great, and if someone wants to demand that the left or right duals be inverses, or that the category be a poset, I can live with that too…. though if I ever learned more linguistics, I might ask why those additional assumptions are reasonable. (Right now I have no idea how reasonable the whole approach is to begin with!)
Thanks for the links! I will read them in my enormous amounts of spare time. :-)
Noam Zeilberger wrote:
As I said it’s not clear to me what the linguistic motivations are, but the way I understand the difference between the original “Lambek” grammars and (later introduced by Lambek) pregroup grammars is that it is precisely analogous to the difference between a monoidal category with left/right residuals and a monoidal category with left/right duals. Lambek’s 1958 paper was building off the idea of “categorial grammar” introduced earlier by Ajdukiewicz and Bar-Hillel, where the basic way of combining types was left division A\B and right division B/A (with no product).
Noam Zeilberger wrote:
At least one seeming advantage of the original approach (without duals) is that it permits interpretations of the “semantics” of sentences/derivations in cartesian closed categories. So it’s in harmony with the approach of “Montague semantics” (mentioned by Richard Williamson over at the n-Cafe) where the meanings of natural language expressions are interpreted using lambda calculus. What I understand is that this is one of the reasons Lambek grammar started to become more popular in the 80s, following a paper by Van Benthem where he observed that such such lambda terms denoting the meanings of expressions could be computed via “homomorphism” from syntactic derivations in Lambek grammar.
Jason Nichols wrote:
John Baez, as someone with a minimal understanding of set theory, lambda calculus, and information theory, what would you recommend as background reading to try to understand this stuff?
It’s really interesting, and looks relevant to work I do with NLP and even abstract syntax trees, but I reading the papers and wiki pages, I feel like there’s a pretty big gap to cross between where I am, and where I’d need to be to begin to understand this stuff.
John Baez wrote:
Jason Nichols: I suggest trying to read some of Lambek’s early papers, like this one:
(If you have access to the version at the American Mathematical Monthly, it’s better typeset than this free version.) I don’t think you need to understand category theory to follow them, at least not this first one. At least for starters, knowing category theory mainly makes it clear that the structures he’s trying to use are not arbitrary, but “mathematically natural”. I guess that as the subject develops further, people take more advantage of the category theory and it becomes more important to know it. But anyway, I recommend Lambek’s papers!
Borislav Iordanov wrote:
Lambek was an amazing teacher, I was lucky to have him in my ungrad. There is a small and very approachable book on his pregroups treatment that he wrote shortly before he passed away: “From Word to Sentence: a computational algebraic approach to grammar”. It’s plain algebra and very fun. Sadly looks like out of print on Amazon, but if you can find it, well worth it.
Andreas Geisler wrote:
One immediate concern for me here is that this seems (don’t have the expertise to be sure) to repeat a very old mistake of linguistics, long abandoned :
Words do not have atomic meanings. They are not a part of some 1:1 lookup table.
The most likely scenario right now is that our brains store meaning as a continuously accumulating set of connections that ultimately are impacted by every instance of a form we’ve ever heard/seen.
So, you shall know a word by all the company you’ve ever seen it in.
Andreas Geisler wrote:
John Baez I am a linguist by training, you’re welcome to borrow my brain if you want. You just have to figure out the words to use to get my brain to index what you need, as I don’t know the category theory stuff at all.
It’s a question of interpretation. I am also a translator, so i might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.
John Baez wrote:
Andreas Geisler wrote: “I might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.”
No, it wouldn’t. Alas, I don’t really have time to tackle linguistics myself. Mehrnoosh Sadrzadeh is seriously working on category theory and linguistics. She’s one of the people leading a team of students at this Applied Category Theory 2018 school. She’s the one who assigned this paper by Lambek, which 2 students blogged about. So she would be the one to talk to.
So, you shall know a word by all the company you’ve ever seen it in.