Three serious books lie open before me. I had a variety of reasons for checking them out of the library, although they’re all related in one way or another to current goings on here at the Simons Institute:
Donald O. Hebb’s The Organization of Behavior: A Neuropsychological Theory. This is the book that introduced a fundamental hypothesis about learning and memory, captured in the slogan “Neurons that fire together get wired together.”
Norbert Wiener’s Cybernetics: or, Control and Communication in the Animal and the Machine, an eccentric and wide-ranging masterpiece with a crucial chapter on “Computing Machines and the Nervous System.”
Claude Shannon’s The Mathematical Theory of Communication, the foundational document of information theory. (Shannon’s part of this work had appeared a year earlier in the Bell System Technical Journal; the book version includes an interpretive essay by Warren Weaver.)
When I got the three volumes home, I made a surprising discovery: They were all published at roughly the same time, in 1948 and 1949. What are the odds of that? Perhaps it means nothing—just the long arm of coincidence reaching out to tap me on the shoulder. On the other hand, maybe there was something in the air circa 1950, something that made the period unusually fertile for studies of information, communication, and computation in brains and machines.
I have done a little digging in library catalogues and Wikipedia, as well as in my own files, looking for other titles that might belong on this list of distinguished midcentury milestones.
It turns out that George Kingsley Zipf’s Human Behavior and the Principle of Least Effort was also published in 1949. (This is the one about the curious power-law distribution seen in rankings of word frequencies, city sizes, and so on.)
Gilbert Ryle’s The Concept of Mind is another 1949 title, though I’ve never read it. Also from 1949: Nicholas Metropolis and Stanislaw Ulam published the first open account of the Monte Carlo method.
Drifting forward into 1950, we find another cluster of notables. There is John Nash’s one-page paper introducing what we now call the Nash equilibrium. Elsewhere in game theory, 1950 was the debut year for prisoner’s dilemma, although Merrill Flood’s paper describing it did not appear until two years later. Richard Hamming published “Error Detecting and Error Correcting Codes” in 1950. (It’s another paper from the Bell System Technical Journal.) Finally, there’s Alan M. Turing’s famous essay on “Computing Machinery and Intelligence.”
Does the density of high-octane publications really make 1948–50 an exceptional season of intellectual history? I can’t offer any solid statistical support for that notion. In the first place, my criteria for inclusion on the list are way too vague. (“Subjects I find interesting” may be closest to the truth.) In the second place, I can’t offer any evidence that other intervals were not equally productive. As a matter of fact, in my bibliographic rummaging I came across a nexus of brilliance five years earlier:
Warren S. McCulloch and Walter H. Pitts, “A logical calculus of the ideas immanent in nervous activity,” 1943.
John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior, 1944.
Erwin Schrödinger, What Is Life? The Physical Aspect of the Living Cell, 1944.
Vannevar Bush, “As We May Think,” 1945.
John von Neumann, “First Draft of a Report on the EDVAC,” 1945.
I acknowledge a further reason for caution when I cite 1949 as a year of special distinction. It’s my year, the year of my birth.
In 1994 a document called the QED Manifesto made the rounds of certain mathematical mailing lists and Usenet groups.
QED is the very tentative title of a project to build a computer system that effectively represents all important mathematical knowledge and techniques. The QED system will conform to the highest standards of mathematical rigor, including the use of strict formality in the internal representation of knowledge and the use of mechanical methods to check proofs of the correctness of all entries in the system.
The ambitions of the QED project—and its eventual failure—were front and center in a talk by Thomas Hales (University of Pittsburgh) on Formal Abstracts in Mathematics. Hales is proposing another such undertaking: A comprehensive database of theorems and other mathematical propositions, along with the axioms, assumptions, and definitions on which the theorems depend, all represented in a formal notation readable by both humans and machines. Unlike QED, however, these “formal abstracts” would not include proofs of the theorems. Excluding proofs is a huge retreat from the aims of the QED group, but Hales argues that it’s necessary to make the project feasible with current technology.
Hales has plenty of experience in this field. In 1998 he announced a proof of the Kepler conjecture—the assertion that the grocer’s stack of oranges embodies the densest possible arrangement of equal-size spheres in three-dimensional space. Hales’s proof was long and complex, so much so that it stymied the efforts of journal referees to fully check it. Hales and 21 collaborators then spent a dozen years constructing a formal, computer-mediated verification of the proof.
What’s the use of a database of mathematical assertions if it doesn’t include proofs? Hales held out several potential benefits, two of which I found particularly appealing. First, the database could answer global questions about the mathematical literature; one could ask, “How many theorems depend on the Riemann hypothesis?” Second, the formal abstracts would capture the meaning of mathematical statements, not just their surface form. A search for all mentions of the equation \(x^m - y^n = 1\) would find instances that use symbols other than \(x, y, m, n,\) or that take slightly different forms, such as \(x^m - 1 = y^n\).
Hales’s formal abstracts sound intriguing, but I have to confess to a certain level of disappointment and bafflement. All around us, triumphant machines are conquering one domain after another—chess, go, poker, Jeopardy, the driver’s seat. But not proofs, apparently.
Sperner’s Lemma
Am I the last person in the whole republic of numbers to learn that Sperner’s lemma is a discrete version of the Brouwer fixed-point theorem? Francis Su and John Stillwell clued me in.
The lemma—first stated in 1928 by the German mathematician Emanuel Sperner—seems rather narrow and specialized, but it turns up everywhere. It concerns a triangle whose vertices are assigned three distinct colors:
Divide the triangle into smaller triangles, constrained by two rules. First, no edge or segment of an edge can be part of more than two triangles. Second, if a vertex of a new triangle lies on an edge whose end points have colors i and j, then the new vertex must be colored either i or j. (A vertex that’s not on an edge can be colored arbitrarily.)
The lemma states that at least one interior triangle must have a full complement of red, green, and blue vertices. Actually, the lemma’s claim is slightly stronger: The number of trichromatic inner triangles must be odd. In the augmented diagram below, adding a single new red vertex has created two more RGB triangles, for a total of three.
Su gave a quick proof of the lemma. Consider the set of all edge segments that have one red and one green endpoint. On the exterior boundary of the large triangle, such segments can appear only along the red-green edge, and there must be an odd number of them. Now draw a path that enters the large triangle from the outside, that crosses only red-green segments, and that crosses each such segment at most once.
One possible fate of this RG path is to enter through one red-green segment and exit through another. But since the number of red-green segments on the boundary is odd, there must be at least one path that enters the large triangle and never exits. The only way it can become trapped is to enter a red-green-blue triangle. (There’s nothing special about red-green segments, so this argument also holds for red-blue and blue-green segments.)
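The parity claim is easy to test numerically. What follows is only a sketch, not Su's argument: it assumes the simplest possible triangulation (a regular grid with n segments per side) and colors 0, 1, 2 standing in for red, green, and blue. The function names `sperner_coloring` and `count_trichromatic` are invented for this illustration.

```python
import random

def sperner_coloring(n, rng):
    """Randomly color the vertices of a triangle cut into an n-per-side grid.

    Vertices are indexed by (i, j) with i, j >= 0 and i + j <= n, so the
    barycentric coordinates are (i, j, n - i - j). A vertex may take
    color c only if its c-th coordinate is nonzero. That forces the three
    corner colors, restricts a vertex on an outer edge to the colors of
    that edge's endpoints, and leaves interior vertices free.
    """
    colors = {}
    for i in range(n + 1):
        for j in range(n + 1 - i):
            coords = (i, j, n - i - j)
            allowed = [c for c in range(3) if coords[c] > 0]
            colors[(i, j)] = rng.choice(allowed)
    return colors

def count_trichromatic(n, colors):
    """Count the small triangles whose three vertices show all three colors."""
    count = 0
    for i in range(n):
        for j in range(n - i):
            # Triangle pointing "up" from the grid cell at (i, j).
            triangles = [[(i, j), (i + 1, j), (i, j + 1)]]
            # Triangle pointing "down", present unless the cell touches the far edge.
            if i + j <= n - 2:
                triangles.append([(i + 1, j), (i, j + 1), (i + 1, j + 1)])
            for tri in triangles:
                if {colors[v] for v in tri} == {0, 1, 2}:
                    count += 1
    return count

rng = random.Random(2017)
counts = [count_trichromatic(n, sperner_coloring(n, rng)) for n in range(1, 8)]
print(counts)  # every entry should be odd
```

A random experiment proves nothing, of course, but it is reassuring to see that no coloring obeying the boundary rule ever yields an even count.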
So much for Sperner’s lemma. What do these nested triangles have to do with the Brouwer fixed-point theorem? That theorem operates in a continuous domain, which seems remote from the discrete network of Sperner’s triangulated triangle.
As the story goes (I can’t vouch for its provenance), L. E. J. Brouwer formulated his theorem at the breakfast table. Stirring his coffee, he noticed that there always seemed to be at least one stationary point on the surface of the moving liquid. He was able to prove this fact not just for the interior of a coffee cup but for any bounded, closed, and convex region, and not just for circular motion but for any continuous function that maps points within such a region to points in the same region. For each such function \(f\), there is a point \(p\) such that \(f(p) = p\).
Brouwer’s fixed-point theorem was a landmark in the development of topology, and yet Brouwer himself later renounced the theorem—or at least his proof of it, because the proof was nonconstructive: It gave no procedure for finding or identifying the fixed point. John Stillwell argues that a proof based on Sperner’s lemma comes as close as possible to a constructive proof, though it would still have left Brouwer unsatisfied.
The proof relies on the same kind of paths represented by yellow arrows in the diagram above. At least one such path comes to an end inside a tri-colored triangle, which Sperner’s lemma shows must exist in any properly colored triangulated network. If we continue subdividing the triangles under the Sperner rules, and proceed to the limit where the edge lengths go to zero, then the path ends at a single, stationary point. (It’s the “proceed to the limit” step that Brouwer would not have liked.)
The Muffin Man
You have five muffins to share among three students; let’s call the students April, May, and June. One solution is to give each student one whole muffin, then divide the remaining two muffins into pieces of size one-third and two-thirds. Then the portions are divvied up as follows:
This allotment is quantitatively fair, in that each student receives five-thirds of a muffin, but June complains that her two small pieces are less appetizing than the others’ larger ones. She feels she’s been given leftover crumbs. Hence the division is not envy-free.
There are surely many ways of addressing this complaint. You might cut all the muffins into pieces of size one-third, and give each student five equal pieces. Or you might give each student a muffin and a half, then eat the leftover half yourself. These are practical and sensible strategies, but they are not what Bill Gasarch was seeking when he gave a talk on the problem Saturday afternoon. Gasarch asked a specific question: What is the maximum size of the minimum piece? Can we do better than one-third?
The answer is yes. Here is a division that cuts one muffin in half and divides each of the other four muffins into portions of size seven-twelfths and five-twelfths. April and May each get \(\frac{1}{2} + \frac{7}{12} + \frac{7}{12}\); June gets \(4 \times \frac{5}{12}\).
Five-twelfths is larger than one-third, and thus should seem less crumby. Indeed, Gasarch and his colleagues have proved five-twelfths is the best result possible: It is the maximum of the minimum. (Nevertheless, I worry that June may still be unhappy. Her portion is cut up into four pieces, whereas the others get three pieces each; furthermore, all of June’s pieces are smaller than April’s and May’s. Again, however, these concerns lie outside the scope of the mathematical problem.)
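The arithmetic of the two divisions can be checked with exact fractions. This is a small sketch; the helper names `is_fair` and `smallest_piece` are made up for the purpose, and the assignment of pieces to names in the first division is inferred from June's complaint about her two small pieces.

```python
from fractions import Fraction

def is_fair(portions, muffins):
    """True if all the muffin is handed out and every student gets an equal share."""
    share = Fraction(muffins, len(portions))
    total = sum(sum(pieces) for pieces in portions.values())
    return total == muffins and all(sum(p) == share for p in portions.values())

def smallest_piece(portions):
    """Size of the smallest piece anyone receives."""
    return min(piece for pieces in portions.values() for piece in pieces)

third = Fraction(1, 3)

# First attempt: three whole muffins, two muffins cut into 1/3 and 2/3.
first_try = {
    "April": [Fraction(1), 2 * third],
    "May": [Fraction(1), 2 * third],
    "June": [Fraction(1), third, third],
}

# Improved division: one muffin halved, four muffins cut into 7/12 and 5/12.
better = {
    "April": [Fraction(1, 2), Fraction(7, 12), Fraction(7, 12)],
    "May": [Fraction(1, 2), Fraction(7, 12), Fraction(7, 12)],
    "June": [Fraction(5, 12)] * 4,
}

for name, division in [("first try", first_try), ("better", better)]:
    print(name, is_fair(division, 5), smallest_piece(division))
```

Both divisions pass the fairness test; the smallest piece is 1/3 in the first and 5/12 in the second. The check confirms only that the division is valid, not that 5/12 is optimal, which is the hard part of the proof.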
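A key observation is that the smallest piece can never be larger than one-half. (Whole muffins alone cannot give each of three students five-thirds, so at least one muffin must be cut, and a muffin cut into two or more pieces has a piece no bigger than one-half.) This is thunderously obvious once you know it, but I failed to see it when I first started thinking about the problem.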
Fair-division problems have a long history (going back at least as far as the Talmud), and cake-cutting versions have been proliferating for decades. A 1961 article by L. E. Dubins and E. H. Spanier (American Mathematical Monthly 68:1–17) inspired much further work. There are even connections with Sperner’s lemma. Nevertheless, the genre is not exhausted yet; the muffin problem seems to be a new wrinkle. Gasarch and six co-authors (three of them high school students) have prepared a 166-page manuscript describing a year’s worth of labor on the problem, with optimal results for all instances with up to six students (and any number of muffins), as well as upper and lower bounds on solutions to larger instances, and various conjectures on open problems.
Long-time readers of bit-player may remember that Gasarch has been mentioned here before. Back in 2009 he offered (and eventually paid) \(\$17^2\) for a four-coloring of a 17-by-17 lattice such that no four lattice points forming a rectangle all have the same color. That problem attracted considerable attention both here and on Gasarch’s own Computational Complexity blog (conducted jointly with Lance Fortnow).
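Finding a rectangle-free coloring is hard; checking one is easy. Here is a minimal sketch of a checker, assuming the rectangles in question have sides parallel to the lattice axes and that a coloring is given as a list of rows of color numbers. The function name `monochromatic_rectangle` and the toy cyclic colorings are my own illustration, not the prize-winning 17-by-17 grid.

```python
from itertools import combinations

def monochromatic_rectangle(grid):
    """Return (row1, row2, col1, col2) for a rectangle whose four corners
    share a color, or None if the coloring has no such rectangle."""
    n_cols = len(grid[0])
    for r1, r2 in combinations(range(len(grid)), 2):
        seen = {}  # color -> a column where rows r1 and r2 both have that color
        for c in range(n_cols):
            if grid[r1][c] == grid[r2][c]:
                color = grid[r1][c]
                if color in seen:
                    return (r1, r2, seen[color], c)
                seen[color] = c
    return None

def cyclic_coloring(size, n_colors=4):
    """Toy coloring: the cell at (r, c) gets color (r + c) mod n_colors."""
    return [[(r + c) % n_colors for c in range(size)] for r in range(size)]

print(monochromatic_rectangle(cyclic_coloring(4)))  # None: no two rows agree anywhere
print(monochromatic_rectangle(cyclic_coloring(5)))  # rows 0 and 4 are identical
```

The cyclic pattern works on a 4-by-4 grid but fails as soon as a fifth row repeats the first, which hints at why the 17-by-17 case took real ingenuity.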
Twenty years ago, Kimberly-Clark, the Kleenex company, introduced a line of toilet paper embossed with the kite-and-dart aperiodic tiling discovered by Roger Penrose. When I first heard about this, I thought: How clever. Because the pattern never repeats, the creases in successive layers of a roll would never line up over any extended region, and so the sheets would be less likely to stick together.
Sir Roger Penrose had a different response. Apparently he believes the pattern is subject to copyright protection, and he also managed to get a patent issued in 1979, although that would have expired about the time of the toilet paper scandal. Penrose assigned his rights to a British company called Pentaplex Ltd. An article in the Times of London quoted a representative of Pentaplex:
So often we read of very large companies riding roughshod over small businesses or individuals, but when it comes to the population of Great Britain being invited by a multinational [company] to wipe their bottoms on what appears to be the work of a knight of the realm without his permission, then a last stand must be made.
Sir Roger sued. I haven’t been able to find a documented account of how the legal action was resolved, but it seems Kimberly-Clark quickly withdrew the product.
Some years ago I was given a small sample of the infamous Penrose toilet paper. It came to me from Phil and Phylis Morrison; a note from Phylis indicates that they acquired it from Marion Walter. Now I would like to pass this treasure on to a new custodian. The specimen is unused though not pristine, roughly a foot long, and accompanied by a photocopy of the abovementioned Times news item. In the photograph below I have boosted the contrast to make the raised ridges more visible; in real life the pattern is subtle.
Are you interested in artifacts with unusual symmetries? Would you like to add this object to your collection? Send a note with a U.S. mailing address to brian@bit-player.org. If I get multiple requests, I’ll figure out some Solomonic procedure for choosing the recipient(s). If there are no takers, I guess I’ll use it for its intended purpose.
I must also note that my hypothesis about the special non-nesting property of the embossed paper is totally bogus. In the first place, a roll of toilet paper is an Archimedean spiral, so that the circumference increases from one layer to the next; even a perfectly regular pattern will come into coincidence with itself only when the circumference equals an integer multiple of the pattern period. Second, the texture imprinted on the toilet paper is surely not a real aperiodic tiling. The manufacturing process would have involved passing the sheet between a pair of steel crimping cylinders bearing the incised network of kites and darts. Those cylinders are necessarily of finite diameter, and so the pattern must in fact repeat. If Kimberly-Clark had contested the lawsuit, they might have used that point in their defense.
My first glimpse of the World Wide Web came in 1993 on a visit to Fermilab, the physics playground near Chicago. Tom Nash, head of the computing division, showed me a screenful of text with a few highlighted phrases. When he selected one of the phrases, the screen went blank for a moment, and then another page of text appeared. We had just followed a hyperlink. I asked Tom what the system was good for, and he said it was great for sharing software documentation. I was so unimpressed I failed even to mention this new tool in the article I was writing about scientific computing at Fermilab.
A year later, after the Mosaic browser came on the scene, my eyes were opened. I wrote a gushing article on the marvels of the WWW.
There have long been protocols for transferring various kinds of information over the Internet, but the Web offers the first seamless interface to the entire network . . . The Web promotes the illusion that all resources are at your fingertips; the universe of information is inside the little box that sits on your desk.
I was still missing half the story. Yes, the web (which has since lost its capital W) opened up an amazing portal onto humanity’s accumulated storehouse of knowledge. But it did something else as well: It empowered all of us to put our own stories and ideas before the public. Economic and technological barriers were swept away; we could all become creators as well as consumers. Perhaps for the first time since Gutenberg, public communication became a reasonably symmetrical, two-way social process.
The miracle of the web is not just that the technology exists, but that it’s accessible to much of the world’s population. The entire software infrastructure is freely available, including the HTTP protocol that started it all, the languages for markup, styling, and scripting (HTML, CSS, JavaScript), server software (Apache, Nginx), content-management systems such as WordPress, and also editors, debuggers, and other development tools. Thanks to this community effort, I get to have my own little broadcasting station, my personal media empire.
But can it last?
In the U.S., the immediate threat to the web is the repeal of net-neutrality regulations. Under the new rules (or non-rules), Internet service providers will be allowed to set up toll booths and roadblocks, fast lanes and slow lanes. They will be able to expedite content from favored sources (perhaps their own affiliates) and impede or block other kinds of traffic. They could charge consumers extra fees for access to some sites, or collect back-channel payments from publishers who want preferential treatment. For a glimpse of what might be in store, a New York Times article looks at some recent developments in Europe. (The European Union has its own net-neutrality law, but apparently it’s not being consistently enforced.)
The loss of net neutrality has elicited much wringing of hands and gnashing of teeth. I’m as annoyed as the next netizen. But I also think it’s important to keep in mind that the web (along with the internet more generally) has always lived at the edge of the precipice. Losing net neutrality will further erode the foundations, but it is not the only threat, and probably not the worst one.
Need I point out that the internet lost its innocence a long time ago? In the early years, when the network was entirely funded by the federal government, most commercial activity was forbidden. That began to change circa 1990, when crosslinks with private-enterprise networks were put in place, and the general public found ways to get online through dial-up links. The broadening of access did not please everyone. Internet insiders recoiled at the onslaught of clueless newbies (like me); commercial network operators such as CompuServe and America Online feared that their customers would be lured away by a heavily subsidized competitor. Both sides were right about the outcome.
As late as 1994, hucksterism on the internet was still a social transgression if not a legal one. Advertising, in particular, was punished by vigorous and vocal vigilante action. But the cause was already lost. The insular, nerdy community of internet adepts was soon overwhelmed by the dot-com boom. Advertising, of course, is now the engine that drives most of the largest websites.
Commerce also intruded at a deeper level in the stack of internet technologies. When the internet first became inter—a network of networks—bits moved freely from one system to another through an arrangement called peering, in which no money changed hands. By the late 1990s, however, peering was reserved for true peers—for networks of roughly the same size. Smaller carriers, such as local ISPs, had to pay to connect to the network backbone. These pay-to-play arrangements were never affected by network neutrality rules.
A patch panel in a “meet me room” allows independent network carriers to exchange streams of bits. Some of the data transfers are peering arrangements, made without payment, but others are cash transactions. The meet-me room is at the Summer Street internet switching center in Boston.
Express lanes and tolls are also not a novelty on the internet. Netflix, for example, pays to place disk farms full of videos at strategic internet nodes around the world, reducing both transit time and network congestion. And Google has built its own private data highways, laying thousands of miles of fiber optic cable to bypass the major backbone carriers. If you’re not Netflix or Google, and you can’t quite afford to build your own global distribution system, you can hire a content delivery network (CDN) such as Akamai or Cloudflare to do it for you. What you get for your money: speedier delivery, caching of static content near the destination, and some protection against malicious traffic. Again the network neutrality rules do not apply to CDNs, even when they are owned and run by companies that also act as telecommunications carriers and ISPs, such as AT&T.
In pointing out that there’s already a lot of money grubbing in the temple of the internet, I don’t mean to suggest that the repeal of net neutrality doesn’t matter or won’t make a difference. It’s a stupid decision. As a consumer, I dread the prospect of buying internet service the way one buys bundles of cable TV channels. As a creator of websites, I fear losing affordable access to readers. As a citizen, I denounce the reckless endangerment of a valuable civic asset. This is nothing but muddy boots trampling a cultural treasure.
Still and all, it could be worse. Most likely it will be. Here are three developments that make me uneasy about the future of the web.
Dominance. In round numbers, the web has something like a billion sites and four billion users, an extraordinarily close match of producers to consumers. For any other modern medium (television stations and their viewers, newspapers and their readers) the ratio is surely orders of magnitude larger. Yet the ratio for the web is also misleading. Three fourths of those billion web sites have no content and no audience (they are “parked” domain names), and almost all the rest are tiny. Meanwhile, Facebook gets the attention of roughly half of the four billion web users. Google and Facebook together, along with their subsidiaries such as YouTube, account for 70 percent of all internet traffic. The wealth distribution of the web is even more skewed than that of the world economy.
It’s not just the scale of the few large sites that I find intimidating. Facebook in particular seems eager not just to dominate the web but to supplant it. They make an offer to the consumer: We’ll give you a better internet, a curated experience; we’ll show you what you want to see and filter out the crap. And they make an offer to the publisher and advertiser: This is where the people are. If you want to reach them, buy a ticket and join the party.
If everyone follows the same trail to the same few destinations, net neutrality is meaningless.
Fragmentation. The web is built on open standards and a philosophy of sharing and cooperation. If I put up a public website, anyone can visit without asking my permission; they can use whatever software they please when they read my pages; they can publish links to what I’ve written, which any other web user can then follow. This crosslinked body of literature is now being shattered by the rise of apps. Facebook and Twitter and Google and other large internet properties would really prefer that you visit them not on the open web but via their own proprietary software. And no wonder: They can hold you captive in an environment where you can’t wander away to other sites; they can prevent you from blocking advertising or otherwise fiddling with what they feed you; and they can gather more information about you than they could from a generic web browser. The trouble is, when every website requires its own app, there’s no longer a web, just a sheaf of disconnected threads.
This battle seems to be lost already on mobile platforms.
Suppression. All of the challenges to the future of the web that I have mentioned so far are driven by the mere pursuit of money. Far scarier are forms of manipulation and discrimination based on noneconomic motives.
Governments have ultimate control over virtually all communications media—radio and TV, newspapers, books, movies, the telephone system, the postal service, and certainly the internet. Nations that we like to think of as enlightened have not hesitated to use that power to shape public discourse or to suppress unpopular or inconvenient opinions, particularly in times of stress. With internet technology, surveillance and censorship are far easier and more efficient than they ever were with earlier media. A number of countries (most notoriously China) have taken full advantage of those capabilities. Others could follow their example. Controls might be introduced overtly through legislation or imposed surreptitiously through hacking or by coercing service providers.
Still another avenue of suppression is inciting popular sentiment—burning down websites with tiki torches. I can’t say I’m sorry to see the Nazi site Daily Stormer hounded from the web by public outcry; no one, it seems, will register their domain name or host their content. Historically, however, this kind of intimidation has weighed most heavily on the other end of the political spectrum. It is the labor movement, racial and ethnic and religious minorities, socialists and communists and anarchists, feminists, and the LGBT community who have most often had their speech suppressed. Considering who wields power in Washington just now, a crackdown on “fake news” on the internet is hardly an outlandish possibility.
In spite of all these forebodings, I remain strangely optimistic about the web’s prospects for survival. The internet is a resilient structure, not just in its technological underpinnings but also in its social organization. Over the past 20 years, for many of us, the net has wormed its way into every aspect of daily life. It’s too big to fail now. Even if some basement command center in the White House had a big red switch that shuts down the whole network, no one would dare to throw it.
My erstwhile employer, mentor, and dearest friend was Dennis Flanagan, who edited Scientific American for 37 years. He is the larger of the two aquatic specimens in the photograph below.
One of the quirks of life with Dennis was that he didn’t hear well, as a result of childhood ear infections. In an unpublished memoir he lists his deafness as a major influence on his path through life. It was a hardship in school, because he missed much of what his teachers were saying. On the other hand, it kept him out of the military in World War II.
Later in life, hearing aids helped considerably, but only on one side. When we went to lunch, I learned to sit to his right, so that I could speak to the better ear. When we took someone out to lunch, the guest got the favored chair. In our monthly editorial meetings, however, he turned his deaf ear to Gerard Piel, the magazine’s co-founder and publisher. (They didn’t always get along.) In Dennis’s last years, after both of us had left the magazine, we would take long walks through Lower Manhattan, with stops in coffee shops and sojourns on park benches, and again I made sure I was the right-hand man. Dennis died in 2005. I miss him all the time.
Although I was always aware of Dennis’s hearing impairment, I never had an inkling of what his asymmetric sensory experience might feel like from inside his head. Now I have a chance to find out. A few days ago I had a sudden failure of hearing in my left ear. At the time I had no idea what was happening, so I can’t reconstruct an exact chronology, but I think the ear went from normal function to zilch in a matter of seconds or minutes. It was like somebody pulled the plug.
I have since learned that this is a rare phenomenon (5 to 20 cases per 100,000 population) but well-known to the medical community. It has a name: Sudden Sensorineural Hearing Loss. It is a malfunction of the cochlea, the inner-ear transducer between mechanical vibration and neural activity. An audiological exam confirmed that my eardrum and the delicate linkage of tiny bones in the middle ear are functioning normally, but the signal is not getting through to the brain. In most cases of SSHL, the cause is never identified. I’m under treatment, and there’s a decent chance that at least some level of hearing will be restored.
I don’t often write about matters this personal, and I’m not doing so now to whine about my fate or to elicit sympathy. I want to record what I’m going through because I find it fascinating as well as distressing. A great deal of what we know about the human brain comes from accidents and malfunctions, and now I’m learning some interesting lessons at first hand.
The obvious first-order effect of losing an ear is cutting in half the amplitude of the received acoustic signal. This is perhaps the least disruptive aspect of the impairment, and the easiest to mitigate.
The second major effect is more disturbing: trouble locating the source of a sound. Binaural hearing is key to localization. For low-pitched sounds, with wavelengths greater than the diameter of the head, the brain detects the phase difference between waves reaching the two ears. The phase measurement can yield an angular resolution of just a few degrees. At higher frequencies and shorter wavelengths, the head effectively blocks sound, and so there is a large intensity difference between the two ears, which provides another localizing cue. This mechanism is somewhat less accurate, but you can home in on a source by turning your head to null the intensity difference.
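To put rough numbers on the two regimes, here is a back-of-the-envelope sketch. Everything in it is an assumption made for illustration: a head diameter of 0.18 meter, a sound speed of 343 meters per second, and a crude straight-line model in which the extra path to the far ear is the diameter times the sine of the source angle. Real heads diffract sound, so the true figures differ somewhat.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second in room-temperature air (assumed)
HEAD_DIAMETER = 0.18    # meters, a rough adult value (assumed)

def crossover_frequency(diameter=HEAD_DIAMETER, speed=SPEED_OF_SOUND):
    """Frequency (Hz) whose wavelength equals the head diameter."""
    return speed / diameter

def interaural_delay(angle_deg, diameter=HEAD_DIAMETER, speed=SPEED_OF_SOUND):
    """Arrival-time difference (seconds) between the ears for a distant source
    at angle_deg from straight ahead, using the straight-line path model."""
    return diameter * math.sin(math.radians(angle_deg)) / speed

def phase_difference(freq_hz, angle_deg):
    """Phase difference (degrees) between the two ears at a given frequency."""
    return 360.0 * freq_hz * interaural_delay(angle_deg)

print(f"crossover near {crossover_frequency():.0f} Hz")
print(f"maximum delay about {interaural_delay(90) * 1e6:.0f} microseconds")
print(f"phase shift at 500 Hz, 5 degrees off center: {phase_difference(500, 5):.1f} degrees")
```

Under these assumptions the changeover from phase cues to intensity cues falls a little below 2,000 hertz, and the largest delay the brain has to work with is about half a millisecond.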
With just one ear, both kinds of directional guidance are lacking. This did not come as a surprise to me, but I had never thought about what it would be like to perceive nonlocalized sounds. You might imagine it would be like switching the audio system from stereophonic to monaural. In that case, you lose the illusion that the strings are on the left side of the stage and the brasses on the right; the whole orchestra is all mixed up in front of you. Nevertheless, in your head you are still localizing the sounds; they are all coming from the speakers across the room. Having one ear is not like that; it’s not just life in mono.
In my present state I can’t identify the sources of many sounds, but they don’t come from nowhere. Some of them come from everywhere. The drone of the refrigerator surrounds me; I hear it radiating from all four walls and the floor and ceiling; it’s as if I’m somehow inside the sound. And one night there was a repetitive thrub-a-dub that puzzled me so much I had to get out of bed and go searching for the cause. The search was essentially a random one: I determined it was not the heating system, and nothing in the kitchen or bathroom. Finally I discovered that the noise was rain pouring off the roof into the gutters and downspouts.
The failures of localization are most disturbing when the apparent source is not vague or unknown but rather quite definite—and wrong! My phone rings, and I reach out to my right to pick it up, but in fact it’s in my shirt pocket. While driving the other day, I heard the whoosh of a car that seemed to be passing me on the right, along the shoulder of the road. I almost veered left to make room. If I had done so, I would have run into the overtaking vehicle, which was of course actually on my left. (Urgent priority: Learn to ignore deceptive directional cues.)
In the first hour or so after this whole episode began, I did not recognize it as a loss of hearing; what I noticed instead was a distracting barrage of echoes. I was chatting with three other people in a room that has always seemed acoustically normal, but words were coming at me from all directions like high-velocity ping-pong balls. The echoes have faded a little in the days since, but I still hear double in some situations. And, interestingly, the echo often seems to be coming from the nonfunctioning ear. I have a hypothesis about what’s going on. Echoes are real, after all; sounds really do bounce off walls, so that the ears receive multiple instances of a sound separated by millisecond delays. Normally, we don’t perceive those echoes. The ears must be sensing them, but some circuitry in the brain is suppressing the perception. (Telephone systems have such circuitry too.) Based on my experience, I suspect that the suppression mechanism depends on the presence of signals from both ears.
Similar to echo suppression is noise suppression. I find I have lost the benefit of the “cocktail party effect,” whereby we select a single voice to attend to and filter out the background chatter. The truth is, I was never very good at that trick, but I’m notably worse now. A possibly related development is that I have the illusion of enhanced hearing acuity for some kinds of noise. The sound of water running from a faucet carries all through the house now. And the sound of my own chewing can be thunderous. In the past, perhaps the binaural screening process was turning down the gain on such commonplace distractions.
Even though no sounds of the outside world are reaching me from the left side of my head, that doesn’t mean the ear is silent. It seems to emit a steady hiss, which I’m told is common in this condition. Occasionally, in a very quiet room, I also hear faint chimes of pure sine tones. Do any of these signals actually originate in the affected cochlea, or are they phantoms that the brain merely attributes to that source?
The most curious interior noise is one that I’ve taken to calling the motor. In the still of the night, if I turn my head a certain way, I hear a putt-putt-putt with the rhythm of a sputtering lawn-mower engine, though very faint and voiceless. The intriguing thing is, the sound is altered by my breathing. If I hold my breath for a few seconds, the putt-putting slows and sometimes stops entirely. Then when I take a breath, the motor revs up again. Could this response indicate sensitivity to oxygen levels in the blood reaching my head? I like to imagine that the source of the noise is a single lonely neuron in the cochlea, bravely tapping out its spike train—the last little drummer boy in my left ear. But I wouldn’t be surprised to learn it comes from somewhere higher up in the auditory pathway.
One of the first manuscripts I edited at Scientific American (published in October 1973) was an article by the polymath Gerald Oster.
Ordinary beat tones are elementary physics: Whenever two waves combine and interfere, they create a new wave whose frequency is equal to the difference between the two original frequencies. In the case of sound waves at frequencies a few hertz apart, we perceive the beat tone as a throbbing modulation of the sound intensity. Oster asked what happens when the waves are not allowed to combine and interfere but instead are presented separately to the two ears. In certain frequency ranges it turns out that most people still hear the beats; evidently they are generated by some interference process within the auditory networks of the brain. Oster suggested that a likely site is the superior olivary nucleus. There are two of these bodies arrayed symmetrically just to the left and right of the midline in the back of the brain. They both receive signals from both ears.
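The "elementary physics" can be spelled out with a standard trigonometric identity. As a sketch, assume two pure tones of equal amplitude at frequencies \(f_1\) and \(f_2\) (symbols introduced here only for illustration):

\[\cos(2\pi f_1 t) + \cos(2\pi f_2 t) = 2\cos\bigl(\pi (f_1 - f_2)\, t\bigr)\,\cos\bigl(\pi (f_1 + f_2)\, t\bigr).\]

The second factor is a tone at the average of the two frequencies. The first factor is a slowly varying envelope whose magnitude peaks \(|f_1 - f_2|\) times per second, which is the throbbing we perceive as the beat.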
Whatever the mechanism generating the binaural beats, it has to be happening somewhere inside the head. It’s a dramatic reminder that perception is not a passive process. We don’t really see and hear the world; we fabricate a model of it based on the sensations we receive—or fail to receive.
I’m hopeful that this little experiment of nature going on inside my cranium will soon end, but if it turns out to be a permanent condition, I’ll cope. As it happens, my listening skills will be put to the test over the next several months, as I’m going to be spending a lot of time in lecture halls. There’s the annual Joint Mathematics Meeting coming up in early January, then I’m spending the rest of the spring semester at the Simons Institute for the Theory of Computing in Berkeley. Lots of talks to attend. You’ll find me in the front of the room, to the left of the speaker.
My years with Dennis Flanagan offer much comfort when I consider the prospect of being half-deaf. His deficit was more severe than mine, and he put up with it from childhood. It never held him back—not from creating one of the world’s great magazines, not from leading several organizations, not from traveling the world, not from spearing a 40-pound bass while free diving in Great South Bay.
One worry I face is music—will I ever be able to enjoy it again?—but Dennis’s example again offers encouragement. We shared a great fondness for Schubert. I can’t know exactly what Dennis was hearing when we listened to a performance of the Trout Quintet together, but he got as much pleasure out of it as I did. And in his sixties he went beyond appreciation to performance. He had wanted to learn the cello, but a musician friend advised him to take up the brass instrument of the same register. He did so, and promptly learned to play a Bach suite for unaccompanied cello on the slide trombone.
Today, I’m told, is Rational Approximation Day. It’s 22/7 (for those who write dates in little-endian format), which differs from π by about 0.04 percent. (The big-endians among us are welcome to approximate 1/π.)
Given the present state of life in America, what we really need is an Approximation to Rationality Day, but that may have to wait for 20/1/21. In the meantime, let us merrily fiddle with numbers, searching for ratios of integers that brazenly invade the personal space of famous irrationals.
When I was a teenager, somebody told me about the number 355/113, which is an exceptionally good approximation to π. The exact value is
\[\frac{355}{113} = 3.14159292035\ldots,\]
correct through the first six digits after the decimal point. In other words, it differs from the true value by less than one-millionth. I was intrigued, and so I set out to find an even better approximation. My search was necessarily a pencil-and-paper affair, since I had no access to any electronic or even mechanical aids to computation. The spiral-bound notebook in which I made my calculations has not survived, and I remember nothing about the outcome of the effort.
A dozen years later I acquired some computing machinery: a Hewlett-Packard programmable calculator, called the HP-41C. Here is the main loop of an HP-41C program that searches for good rational approximations. Note the date at the top of the printout (written in middle-endian format). Apparently I was finishing up this program just before Approximation Day in 1981.
What’s that you say? You’re not fluent in the 30-year-old Hewlett-Packard dialect of reverse Polish notation? All right, here’s a program that does roughly the same thing, written in an oh-so-modern language, Julia.
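function approximate(T, dmax)
    d = 1
    leastError = T                        # initial bound; the first error beats it when T > 1
    while d <= dmax && leastError > 0
        n = round(Int, d * T)             # nearest-integer numerator for this denominator
        err = abs(T - n/d) / T            # relative error of n/d
        merit = 1 / ((n + d)^2 * err)     # the 1981 figure of merit
        if err < leastError               # report only new record-low errors
            println("$n/$d = $(n/d) error = $err merit = $merit")
            leastError = err
        end
        d += 1
    end
end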
The algorithm is a naive, sequential search for fractions \(n/d\) that approximate the target number \(T\). For each value of \(d\), you need to consider only one value of \(n\), namely the integer nearest to \(d \times T\). (What happens if \(d \times T\) falls halfway between two integers? That can’t happen if \(T\) is irrational.) Thus you can begin with \(d = 1\) and continue up to a specified largest denominator \(d = dmax\). The accuracy of the approximation is measured by the error term \(|T - n/d| / T\). Whenever a value of \(n/d\) yields a new minimum error, the program prints a line of results. (This version of the algorithm works correctly only for \(T \gt 1\), but it can readily be adapted to \(T \lt 1\).)
The HP-41C has a numerical precision of 10 decimal digits, and so the closest possible approximation to π is 3.141592654. Back in 1981 I ran the program until it found a fraction equal to this value—a perfect approximation, from the program’s point of view. According to a note on the printout, that took 13 hours. The Julia program above, running on a laptop, completes the same computation in about three milliseconds. You’re welcome to take a scroll through the results, below. (The numbers are not digit-for-digit identical to those generated by the HP-41C because Julia calculates with higher precision, about 16 decimal digits.)
The error values in the middle column of the table above shrink steadily as you read from the top of the list to the bottom. Each successive approximation is more accurate than all those above it. Does that also mean each successive approximation is better than those above it? I would say no. Any reasonable notion of “better” in this context has to take into account the size of the numerator and the denominator.
If you want an approximation of \(\pi\) accurate to seven digits, I can give you one off the top of my head: \(3141593/1000000\). But the numbers making up that ratio are themselves seven digits long. What makes \(355/113\) impressive is that it achieves seven-digit accuracy with only three digits in the numerator and the denominator. Accordingly, I would argue that a “better” approximation is one that minimizes both error and size. The rightmost column of the table, filled with numbers labeled “merit,” is meant to quantify this intuition.
When I wrote that program in 1981, I chose a strange formula for merit, one that now baffles me:
\[\frac{1}{(n + d)^2 * err}.\]
Adding the numerator and denominator and then squaring the sum is an operation that makes no sense, although the formula as a whole does have the correct qualitative behavior, favoring both smaller errors and smaller values of \(n\) and \(d\). In trying to reconstruct what I had in mind nearly four decades ago, my best guess is that I was trying to capture a geometric insight, and I flubbed it when translating math into code. On this assumption, the correct figure of merit would be:
\[\frac{1}{\sqrt{n^2 + d^2} * err}.\]
To see where this formula comes from, consider a two-dimensional lattice of integers, with a ray of slope \(\pi\) drawn from the origin and going on to infinite distance.
Because the line’s slope is irrational, it will never pass through any point of the integer lattice, but it will have many near misses. The near-miss points, with coordinates interpreted as numerator and denominator, are the accurate approximations to \(\pi\). The diagram suggests a measure of the merit based on distances. An approximation gets better when we minimize the distance of the lattice point from the origin as well as the vertical distance from the point to the \(\pi\) line. That’s the meaning of the formula with \(\sqrt{n^2 + d^2}\) in the denominator.
Another approach to defining merit simply counts digits. The merit is the ratio of the number of correctly predicted digits in the irrational target \(T\) to the number of digits in the denominator. A problem with this scheme is that it’s rather coarse. For example, \(13/4\) and \(16/5\) both have single-digit denominators and they each get one digit of \(\pi\) correct, but \(16/5\) actually has a smaller error.
To smooth out the digit-counting criterion, and distinguish between values that differ in magnitude but have the same number of digits, we can take logarithms of the numbers. Let merit equal: \(-\log(err) / \log(d)\). (The \(\log(err)\) term is negated because the error is always less than \(1\) and so its logarithm is negative.)
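The three criteria discussed so far can be put side by side in a few lines of Julia. This is only a sketch: the function names (`relerr`, `merit_1981`, `merit_geometric`, `merit_log`) are invented here for illustration, and the choice of base-10 logarithms is an assumption that does not affect the ratio.

```julia
# Relative error of the fraction n/d as an approximation to the target T.
relerr(T, n, d) = abs(T - n / d) / T

# The formula used in the 1981 program.
merit_1981(T, n, d) = 1 / ((n + d)^2 * relerr(T, n, d))

# The geometric version: distance from the origin to the lattice point (d, n).
merit_geometric(T, n, d) = 1 / (hypot(n, d) * relerr(T, n, d))

# The logarithmic version, a smoothed form of digit counting.
merit_log(T, n, d) = -log10(relerr(T, n, d)) / log10(d)

# Print all three figures for a handful of approximations to π.
for (n, d) in [(13, 4), (16, 5), (22, 7), (355, 113), (3141593, 1000000)]
    println("$n/$d  ", merit_1981(π, n, d), "  ",
            merit_geometric(π, n, d), "  ", merit_log(π, n, d))
end
```

Note that `merit_log` divides by zero when \(d = 1\), so it is meaningful only for denominators of 2 or more.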
Here’s a comparison of the three merit criteria for some selected approximations to \(\pi\):
It started with a brief story in the New York Times about Luke Robitaille, a 13-year-old student from Euless, Texas, who won the Raytheon Mathcounts National Competition by correctly answering the following question:
In a barn, 100 chicks sit peacefully in a circle. Suddenly, each chick randomly pecks the chick immediately to its left or right. What is the expected number of unpecked chicks?
Robitaille took less than a second to buzz in with the correct answer, according to the Times.
The next day, Jordan Ellenberg tweeted a followup problem:
Since I don’t have to squeeze this story into 140 characters, I’ll fill in some details of Ellenberg’s question, as I understand it. Where the original problem called for a single round of synchronized random pecking, we now have multiple rounds. During a round, each chick randomly turns either left or right and pecks one of its neighbors. However, once a chick has been pecked, it will never peck again, even if it continues to receive pecks. When two adjacent chicks peck each other in the same round, they both drop out of the pecking game for all future rounds. If an unpecked chick winds up sitting between two pecked neighbors, it can never be pecked and will therefore keep on pecking forever. The question is, what proportion of the flock will survive to become invulnerable peckers?
Spoilers below, so now’s the time to work out the answers for yourself. While you’re busy with that, I’m going to say a few words about chickens, and about the rhetoric and semiotics of mathematical “word problems.”
My only direct knowledge of poultry comes from boyhood visits to my Aunt Noretta’s farm in southern New Jersey. That’s not much of a claim to expertise, but for what it’s worth I never saw her chickens sit in a circle, and they didn’t peck randomly. (They had a pecking order!) Furthermore, nothing I observed in their social interactions resembled the turn-the-other-cheek behavior of the chickens described in this problem. Why does a pecked chick never peck again? This is a bigger riddle than the quantitative question we are asked to address. Has the chick suddenly discovered the wisdom and power of nonviolence? I can think of another explanation, but it’s not for the squeamish: Maybe pecked chicks don’t peck back because pecks are lethal.
I know it’s silly to demand narrative realism in a story like this one. Mathematical word problems belong to a genre where no one expects verisimilitude. They are set in a world where knaves always lie and knights always speak the truth, where shipwrecked sailors obsess about the divisibility properties of a pile of coconuts, where people don’t know the color of the hat on their own head. Even the laws of physics yield to mathematical necessity: A fly shuttling between oncoming locomotives instantaneously reverses direction. Those chicks sitting in a circle are not fluffy bundles of yellow plumage; they are mathematical abstractions. They have coordinates and state variables rather than feathers.
I’m okay with abstraction; by all means, let us strip away extraneous detail. Nevertheless, isn’t the point of word problems to connect the mathematics to some aspect of familiar experience? Consider the ancient and famous river-crossing problem, where the fox must not be left alone with the chicken, which must not be left alone with the bag of corn. These constraints are easy to understand when you know something about the dietary preferences of foxes and chickens. That kind of intuitive boost is not to be found in the pecking problem. On the contrary, a little knowledge of avian behavior actually makes the problem more perplexing.
But no matter. Onward! Have you come up with your answers?
The single-round problem from the Mathcounts Competition yields to the oldest trick in the probability book. A chick remains unpecked only if both of its neighbors turn away and peck in the other direction. On both the left and the right, the probability of escaping a peck is \(\frac{1}{2}\), and the two events are independent, so the probability of staying unpecked on both sides is \(\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}\). This argument applies identically to all the birds in the circle, so you can expect 25 percent of the chicks to come through unscathed.
Do you agree with this analysis? I came up with it pretty quickly when I read the Times article (though not nearly fast enough to beat Luke Robitaille to the buzzer). But then I began to have doubts. Is it strictly true that a chick’s left and right neighbors are totally independent? After all, they are connected by a chain of other chicks. Perhaps some influence can propagate around the circle, creating a correlation between left and right and altering the probability of survival.
Time for an experiment: Write the program, run the simulation. Set up a ring of 100 unpecked chickens and allow a single round of random simultaneous pecking. Repeat many times and calculate the mean number of unpecked birds remaining. (Some quick notation: Let \(N\) be the number of chicks in the ring and \(S\) be the number that survive unpecked. I’ll use \(\bar{S}\) for the mean value of \(S\) averaged over \(R\) repetitions of the experiment.) My results:
| \(R\) | \(\bar{S}\) |
| --- | --- |
| 100 | 24.79 |
| 10,000 | 24.9881 |
| 1,000,000 | 25.000274 |
| 100,000,000 | 24.99991518 |
As expected, the mean is quite close to 25 survivors. Furthermore, each time the sample size increases by a factor of 100, the accuracy of the approximation improves about tenfold. This pattern conforms to a statistical rule of thumb: the error in a sample mean shrinks in inverse proportion to the square root of the sample size. Thus the slight departures from \(\bar{S} = 25\) appear to be innocent random noise, not some systematic bias.
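For readers who want to repeat the experiment, here is a minimal sketch of such a simulation. It is not the program that produced the numbers above; the function names `peck_once` and `mean_survivors` and the use of Julia's default random-number generator are assumptions made for illustration.

```julia
using Random

# One round of simultaneous pecking on a ring of N chicks.
# Returns the number of chicks left unpecked.
function peck_once(N; rng = Random.default_rng())
    pecked = falses(N)
    for i in 1:N
        # Each chick pecks its left (i - 1) or right (i + 1) neighbor, wrapping around the ring.
        target = rand(rng, Bool) ? mod1(i - 1, N) : mod1(i + 1, N)
        pecked[target] = true
    end
    return N - count(pecked)
end

# Mean number of survivors over R independent repetitions.
mean_survivors(N, R; rng = Random.default_rng()) =
    sum(peck_once(N; rng = rng) for _ in 1:R) / R

println(mean_survivors(100, 10_000))
```

Because the pecks are random, each run prints a slightly different value.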
So that settles it, right?
Well, the simulation looks pretty convincing for the specific case of \(N = 100\) chicks, but the result might differ for other values of N. In particular, perhaps there’s some finite-size effect that becomes apparent only when N is small. Consider a “circle” of just two chicks. In this situation the left neighbor and the right neighbor are one and the same chicken! No matter what random choices are made, the two chicks immediately peck each other, and the proportion of survivors is not 25 percent but zero.
The next-larger “circle” consists of three chicks arranged in a triangle. The two neighbors of a chick are distinct, but they are also neighbors of each other. What happens when the three chickens are set loose on one another? The system has \( 2^3 = 8\) possible pecking patterns, and we can easily examine all of them. In the diagram, the arrows indicate where the chicks choose to direct their pecking.
In two cases, where all the chicks peck left or all peck right, there are no survivors. In every other instance exactly one chick remains unpecked. Aggregating the eight patterns, we find six unpecked chicks out of 24 total chicks, for a proportion of \(\bar{S} = \frac{1}{4}\). Thus it appears the finite-size anomaly afflicts only the two-chick version of the problem.
But wait! There’s another possible confounding factor. Can we be sure of seeing the same outcome for both even and odd numbers of chicks? For any odd value of N there is just one way to annihilate all the chicks in a single round: They must all peck in the same direction. For even N, however, another pattern also leads to immediate extinction: Adjacent chicks can pair up, knocking each other out. Won’t this extra pathway slightly alter the overall probability of survival?
Let’s see what happens with N = 4. Now there are \(2^4 = 16\) possible outcomes:
As expected, four patterns leave no survivors at all. On the other hand, there are also four patterns that leave two chicks unpecked rather than just one. Miraculously, the extra losses and the extra gains balance exactly. In all we have 16 survivors out of 64 chicks, so the ratio is again \(\bar{S} = \frac{1}{4}\).
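The hand counts for small rings can be checked by brute force. The following sketch enumerates every pecking pattern for a given \(N\); the function name `total_survivors` and the encoding of patterns as bits of an integer are choices made here, not part of the original analysis.

```julia
# Exhaustively enumerate all 2^N single-round pecking patterns on a ring of N chicks.
# Bit i-1 of `pattern` encodes chick i's choice: 0 = peck left, 1 = peck right.
# Returns the number of unpecked chicks summed over all patterns.
function total_survivors(N)
    total = 0
    for pattern in 0:(2^N - 1)
        pecked = falses(N)
        for i in 1:N
            right = (pattern >> (i - 1)) & 1 == 1
            pecked[right ? mod1(i + 1, N) : mod1(i - 1, N)] = true
        end
        total += N - count(pecked)
    end
    return total
end

for N in 2:8
    println("N = $N: ", total_survivors(N), " survivors among ", N * 2^N, " chicks")
end
```

The running time doubles with each added chick, so this approach is practical only for small rings.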
After that long and twisty detour through the combinatorics of chicken pecking, we are right back where we started. The probability of surviving unpecked after a single round of pecking is \(\frac{1}{4}\) for any \(N \gt 2\). All of my fretting about finite-size effects and odd-even disparities was a waste of time. So why have I inflicted it on you? Well, although those worries turn out to be unfounded, they are not farfetched. Making just a small change to the pecking protocol leads to a different outcome. Let the pecking be sequential rather than simultaneous. Some designated chick initiates the sequence of pecks, and then the birds take turns, proceeding clockwise around the circle. When a chick’s turn comes, if it has already been pecked, it does nothing. If it is unpecked, it pecks either its left or its right neighbor, choosing randomly. The round ends when every chick has had a turn.
For \(N = 2\) it’s easy to see that the first chick to peck always survives and the other chick always dies, for a survival rate of \(\frac{1}{2}\). With a little more pencil-and-paper chicken scratching, you can establish that the 50 percent survival rate also holds for \(N = 3\). Looking at very large values of \(N\), computer experiments indicate that the survival fraction again approaches \(\frac{1}{2}\) as N goes to infinity. Between these extremes, however, there’s some funny business:
At \(N = 4\) the survivor rate dips below 0.47. (The exact probability is \(\frac{30}{64} = 0.46875\).) This is a minimum. But as the rate recovers back toward 0.5, there is some telltale wiggling in the curve that reveals an odd-even bias: The survival probability is depressed further for even N than for odd N. This is just the kind of behavior I was looking for (but not finding) in the original Mathcounts version of the problem.
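For small rings the sequential protocol can also be evaluated exactly rather than by sampling. The sketch below averages over both choices at every turn, using rational arithmetic; the function name `sequential_expected` is hypothetical, and it assumes the chicks are numbered 1 through \(N\) in turn order with chick 1 going first.

```julia
# Exact expected number of survivors under the sequential protocol.
# Chicks 1, 2, ..., N take turns in order; a chick that has already been
# pecked skips its turn, otherwise it pecks left or right with probability 1/2.
function sequential_expected(N)
    # Recursively average over both choices of the chick whose turn it is.
    function visit(i, pecked)
        i > N && return (N - count(pecked)) // 1
        pecked[i] && return visit(i + 1, pecked)
        total = 0 // 1
        for target in (mod1(i - 1, N), mod1(i + 1, N))
            next = copy(pecked)
            next[target] = true
            total += visit(i + 1, next) // 2
        end
        return total
    end
    return visit(1, falses(N))
end

for N in 2:12
    println("N = $N: survival fraction = ", float(sequential_expected(N) / N))
end
```

The recursion visits up to \(2^N\) branches, so it is suitable only for modest values of \(N\); larger rings call for simulation.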
Let us now take up Ellenberg’s problem of iterated pecking (using the simultaneous rather than the sequential protocol). We already know that after the first round we can expect to find about one-fourth of the chicks still unpecked. Clearly, the unpecked fraction cannot increase after multiple rounds. Thus in the final state the expected surviving fraction \(\bar{S}\) must lie somewhere between zero and \(\frac{1}{4}\).
It’s helpful to look at a typical configuration of pecked (●) and unpecked (○) chicks after a single round of synchronized pecking:
(You’ll have to use your imagination to connect the left and right ends of this array and thereby form a ring.) Notice that there are long strings of pecked chicks, but the unpecked chicks appear in only two configurations. They are either singletons (●○●) or pairs (●○○●). The cause of this pattern is not hard to understand. After a round of pecking, a group of three consecutive unpecked chicks (●○○○●) is impossible. The middle chick must have pecked either left or right, and so it cannot have two unpecked neighbors.
These constraints simplify the analysis of subsequent rounds. The singletons are essentially immortal and unchangeable: The unpecked chick in the middle can never be pecked, and the pecked neighbors can never be unpecked. For the pairs, there are four possible fates, corresponding to the four ways the two active chicks could choose to peck:
In any one round, all four of these events have the same probability, namely \(\frac{1}{4}\). The first three result states are terminal, in the sense that further rounds of pecking will leave them unchanged. In the fourth case we are left with an adjacent pair again, which will therefore face the same set of choices in the next round. Eventually, as the number of rounds goes to infinity, the fourth case must yield one of the other outcomes, and thus in the long run we can consider the fourth case to have probability zero and each of the other three cases to have probability \(\frac{1}{3}\).
And now it’s time to bring all these contingent events together and work out a chicken’s long-term probability of survival. The diagram below presents the scheme. In the first round of pecking, three-fourths of the chicks are eliminated immediately. Of the remaining one-fourth, half are singletons, which survive indefinitely. The other surviving chicks are members of pairs, with another pecking chick as either a right neighbor or a left neighbor.
The lower part of the diagram summarizes the effect of all subsequent rounds, which are assumed to continue until all pairs have been either annihilated or reduced to singletons. (I call this pecking to completion.) For each pathway that leads to a surviving singleton, the probability is the product of the individual probabilities encountered along that pathway. There are three such pathways, with probabilities \(\frac{1}{8}, \frac{1}{48}\), and \(\frac{1}{48}\), for a sum of \(\frac{1}{6}\).
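Written out as arithmetic, the sum might be read as follows. The labels are an interpretation of the three pathways: a chick survives the first round with probability \(\frac{1}{4}\); it is then a singleton with probability \(\frac{1}{2}\), or a pair member with its partner on a given side with probability \(\frac{1}{4}\); and a pair member outlives its partner with probability \(\frac{1}{3}\).

\[\underbrace{\frac{1}{4}\cdot\frac{1}{2}}_{\text{singleton}} + \underbrace{\frac{1}{4}\cdot\frac{1}{4}\cdot\frac{1}{3}}_{\text{pair, partner on right}} + \underbrace{\frac{1}{4}\cdot\frac{1}{4}\cdot\frac{1}{3}}_{\text{pair, partner on left}} = \frac{1}{8} + \frac{1}{48} + \frac{1}{48} = \frac{1}{6}.\]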
I have to confess that I did not come up with this analysis—or with the correct answer—on my first try. I was able to work it out only after I had run a simulation and thus knew what I was looking for. Even then I had trouble with double counting.
Here are the simulation results:
| \(R\) | \(\bar{S}\) |
| --- | --- |
| 100 | 16.53 |
| 10,000 | 16.6835 |
| 1,000,000 | 16.664404 |
| 100,000,000 | 16.66701664 |
Again note that accuracy seems to improve as the square root of the sample size, although the variance here is larger than in the single-round experiment.
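A simulation of pecking to completion needs only a small change to the single-round version: pecked chicks sit out, and rounds repeat until no two unpecked chicks are adjacent. The sketch below is one way to write it, not the program behind the table; the names `peck_to_completion` and `mean_final` are assumptions.

```julia
using Random

# Peck to completion on a ring of N chicks: repeat rounds of simultaneous
# pecking, with only unpecked chicks pecking, until no two adjacent chicks
# are both unpecked. Returns the final number of unpecked chicks.
function peck_to_completion(N; rng = Random.default_rng())
    pecked = falses(N)
    while true
        newly = copy(pecked)
        for i in 1:N
            pecked[i] && continue      # a pecked chick never pecks again
            target = rand(rng, Bool) ? mod1(i - 1, N) : mod1(i + 1, N)
            newly[target] = true       # pecks land simultaneously at the end of the round
        end
        pecked = newly
        # Stop once every unpecked chick is flanked by pecked neighbors.
        any(!pecked[i] && !pecked[mod1(i + 1, N)] for i in 1:N) || break
    end
    return N - count(pecked)
end

# Mean number of final survivors over R independent repetitions.
mean_final(N, R; rng = Random.default_rng()) =
    sum(peck_to_completion(N; rng = rng) for _ in 1:R) / R

println(mean_final(100, 10_000))
```

The loop terminates with probability 1, since each surviving pair has a three-in-four chance of resolving in any given round.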
What about finite-size effects? In circles with only two or three members, the fate of the chicks is fully decided after a single round of pecking: \(\bar{S}\) is 0 and \(\frac{1}{4}\) respectively. Thus these smallest rings escape the \(\frac{1}{6}\) rule, but it appears that circles of all larger sizes converge to \(\frac{1}{6}\). There’s no evidence of even-odd discrepancies.
Another approach to understanding the iterated chicken-pecking problem is through the theory of Markov chains. For a ring of \(N\) chicks we list all \(2^N\) states of the flock and assign a probability to each transition between states. Consider a ring of four chicks, which has 16 states. Symmetries allow us to consolidate some sets of states, and other states can be ignored because they are unreachable from the starting state of four unpecked chicks (○○○○).
Only the four states in the red box need to be retained in the model. The transitions between them are recorded in a directed graph, where each arrow is labeled with the corresponding probability. Note that the starting state has only outgoing arrows; there is no way to re-enter the state once you leave. The states ●●●○ and ●●●● are absorbing: The only outgoing arrow leads directly back to the same state; thus, once you reach one of those states, you never escape it.
The essential information from the directed graph can be captured in a \(4 \times 4\) matrix, where the rows and columns are labeled with the four states, and the matrix entries represent the probability of a transition from the row state to the column state. The entries in each row sum to 1, as they must if they are to represent probabilities.
The pattern of zero entries in the transition matrix implies that certain states can’t be reached from other states, even by an indirect route. For this reason the Markov model is said to be irregular. That’s a bit awkward, because regular Markov models are easier to analyze and understand. In a regular model, when you take successive powers of the transition matrix, it converges to a steady state, where all the rows are identical and every column consists of a single, repeated value. This fixed point reveals the system’s long-term probability distribution. An irregular Markov model may not even have a stable limiting distribution, but this one does, and it seems to offer some insight. Every ring of four chickens must wind up in one of the two absorbing states. With probability two-thirds that terminal state will be ●●●○ and with probability one-third ●●●●. This result is consistent with the finding that one-sixth of the chickens survive unpecked.
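The long-run behavior can be checked numerically by raising a transition matrix to a high power. The matrix below is a reconstruction from the rules of the game and the four-chick pattern counts given earlier, not a copy of the original figure; the state ordering and the variable names `P` and `limit` are assumptions of this sketch.

```julia
using LinearAlgebra

# Transition matrix for the four-chick ring (rows = from, columns = to).
# Assumed state order: ○○○○ (start), ●●○○ (adjacent pair), ●●●○ (one survivor), ●●●● (none).
# Row 1: of the 16 first-round patterns, 4 leave a pair, 8 leave one survivor, 4 leave none.
# Row 2: a pair persists (1/4), reduces to one survivor (1/2), or is wiped out (1/4).
# Rows 3 and 4: absorbing states.
P = [0 1/4 1/2 1/4
     0 1/4 1/2 1/4
     0 0   1   0
     0 0   0   1]

@assert all(sum(P, dims = 2) .≈ 1)   # every row is a probability distribution

limit = P^60                          # a high power approximates the limiting matrix
println(limit[1, :])                  # long-run distribution from the starting state
println("expected survivors: ", limit[1, 3])
```

The first row of `limit` gives the probabilities of ending in each state when starting from four unpecked chicks. Since the only surviving state has exactly one unpecked chick, the expected number of survivors equals the probability of landing there.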
So, finally, that wraps it up, right? Both the contest problem and Ellenberg’s iterative extension asked for the expected number of surviving chickens, and we have supplied the answers: for a circle of \(N\) chickens, the expected number of survivors \(\bar{S}\) is \(\frac{N}{4}\) after a single round of pecking and \(\frac{N}{6}\) upon pecking to completion. Ironically, though, the expected value of a probabilistic process doesn’t necessarily tell you what to expect. Consider a simpler problem: When you flip a fair coin 100 times, how many heads do you expect to see? The obvious answer is 50, and it’s correct in the sense that no other number has a higher likelihood of correctly predicting the outcome of the experiment. However, the probability of seeing exactly 50 heads is only about 0.08, and thus some other number will turn up more than 90 percent of the time.
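The coin-flip figure comes straight from the binomial distribution:

\[\Pr(\text{exactly 50 heads}) = \binom{100}{50}\, 2^{-100} \approx 0.0796.\]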
Instead of looking only at the expected value, let’s examine the range of possible \(S\) values in the pecking game. We’ve already established that zero survivors is a possible outcome, so that forms a lower bound. What is the upper bound—the maximum number of survivors? In the single-round process, every chick pecks, and so after that round every chick must have at least one pecked neighbor. On the basis of this fact I claim that the surviving population can never be greater than \(\frac{N}{2}\). (Do you agree? It took me a while to persuade myself it’s true.)
If \(S\) can never be greater than \(\frac{N}{2}\), the next question is whether it can ever attain that bound. And if we can have equal numbers of pecked (●) and unpecked (○) chicks, how are they arranged in the ring? It’s tempting to propose the following configuration:
●○●○●○●○●○●○
This is a stable state: The unpecked chicks can never be pecked, so no further changes are possible. And the fraction of survivors is \(\frac{1}{2}\). But there’s a problem with this pattern: It cannot be reached from the starting state. Look at any of the black pecked chicks and ask yourself: Which of its neighbors did it peck? Neither of them, evidently, since they are both unpecked. But that’s not possible, given that every chicken must peck in the first round.
Although the alternating black and white arrangement is ruled out, we’re on the right track. There’s another configuration that also leaves one-half of the chicks unpecked after a single round, and that pattern is achievable from the starting state:
●●○○●●○○●●○○
When you join the ends to form a ring, every chick, whether pecked or not, has one pecked neighbor. It turns out this is the only way, after allowing for some obvious symmetries, to reach 50 percent survivorship. (Strictly speaking, 50 percent is attainable only when \(N\) is divisible by 4, but the maximum value of \(S\) is never less than \(\frac{N-2}{2}\).)
When the pecking continues to completion, the upper bound of \(S = \frac{N}{2}\) is no longer reachable. Suppose we tried to maintain \(\frac{N}{2}\) over multiple rounds of pecking. Clearly we would have to start in the first round with the maximal-survivor state ●●○○●●○○●●○○. However, at least half of the unpecked chicks in this configuration must succumb in subsequent rounds, leaving no more than \(\frac{N}{4}\) survivors.
Does this argument mean that \(S = \frac{N}{4}\) is the greatest possible after pecking to completion? No, it doesn’t. There’s another pattern where one of every three chicks survives:
●●○●●○●●○●●○
This configuration is reachable in a single round and stable indefinitely, since none of the pecking chicks has any pecking neighbors. No other arrangement has a higher density of survivors once the pecking process goes to completion.
To summarize: After one round of pecking the number of surviving chicks must lie somewhere between zero and \(\frac{N}{2}\), and the expected number \(\bar{S}\) is right in the middle at \(\frac{N}{4}\). After all further rounds of pecking are completed, the count of unpecked chicks is between zero and \(\frac{N}{3}\), with the expected value again in the middle, at \(\bar{S} = \frac{N}{6}\).
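The one-round figures are easy to check by simulation. The sketch below assumes each chick independently pecks left or right with probability one-half; the names `survivors_one_round` and `mean_survivors`, and the fixed seed, are illustrative choices. It covers only the single-round game, since the rules for later rounds are not restated here.

```julia
using Random

# Simulate one round on a ring of n chicks and return the number left unpecked.
function survivors_one_round(n::Int, rng::AbstractRNG)
    pecked = falses(n)
    for i in 1:n
        step = rand(rng, Bool) ? 1 : -1      # peck right or left at random
        pecked[mod1(i + step, n)] = true
    end
    return n - count(pecked)
end

# Average the survivor count over many independent trials.
function mean_survivors(n::Int, trials::Int; seed = 1)
    rng = MersenneTwister(seed)
    total = 0
    for _ in 1:trials
        total += survivors_one_round(n, rng)
    end
    return total / trials
end

println(mean_survivors(100, 100_000))   # should land close to N/4 = 25
```

A chick escapes only if its left neighbor pecks left and its right neighbor pecks right, which happens with probability \(\frac{1}{4}\); the simulated mean should agree with that.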
“How many chickens survive?” is a question that seems to call for a numeric answer, but in truth the most informative response is not a number at all; it is a distribution:
Each curve records the results of a million experiments with a ring of 100 chicks, giving the frequency of each possible value of \(S\). As expected, the one-round distribution has a peak at 25 survivors, and the iterated curve peaks at 17 (the closest integer to..
James Tanton tosses off number theory problems the way John D. Rockefeller handed out dimes. I wrote about one of Tanton’s problems back in January. Then a few weeks ago this tweet about factorials and squares snagged my attention, and it hasn’t let go:
With pencil and paper it’s easy to show that \(6!\) doesn’t work. The factorial of \(6\) is \(1 \times 2 \times 3 \times 4 \times 5 \times 6 = 720\); adding \(1\) brings us to \(721\), which is not a square. (It factors as \(7 \times 103\).) On the other hand, \(7!\) is \(5040\), and adding \(1\) yields \(5041\), which is equal to \(71^2\). This makes for a very cute equation:
\[7! + 1 = 71^2.\]
Continuing on, you can establish that \(8! + 1\), \(9! +1\) and \(10! + 1\) are not square numbers. But to extend the search much further, we need mechanized assistance. Here’s a Julia function that does the obvious thing, generating successive factorials and checking each one to see if it is \(1\) less than a perfect square:
function search_fac_sqr(maxn)
    fac = big(1)                   # bigints needed for n > 20
    for n in 1:maxn
        fac *= n                   # incremental factorial
        r = isqrt(fac + 1)         # floor of sqrt
        if r * r == fac + 1
            println(n, "! + 1 = ", r, "^2 = ", r^2)
        end
    end
    println("That's all folks!")
end
With this tool in hand, let’s check out \(n! + 1\) for all \(n\) between \(1\) and \(100\). Here’s what the program reports:
4! + 1 = 5^2 = 25
5! + 1 = 11^2 = 121
7! + 1 = 71^2 = 5041
That's all folks!
Those are the three cases we’ve already discovered with pencil and paper, and no more are listed. In other words, among all values of \(n! + 1\) up to \(n = 100\), only \(n = 4\), \(n = 5\), and \(n = 7\) yield squares. When I continued the search up to \(n = 1{,}000\), I got exactly the same result: no more squares. Likewise \(n = 10{,}000\) and \(n = 100{,}000\). Allow me to mention that the factorial of \(100{,}000\) is a rather large number, with \(456{,}574\) decimal digits. At this point in the search, I began to grow weary; furthermore, I began to lose hope. When \(99{,}993\) successive values of \(n\) fail to produce a single square, it’s hard to sustain faith that success might be just around the corner. Nevertheless, I persisted. I got as far as \(n = 500{,}000\), whose factorial has \(2{,}632{,}341\) decimal digits. Not one more perfect square in the whole lot.
What can we learn from this evidence, or lack of evidence? Are 4, 5, and 7 the only values of \(n\) for which \(n!\) lies \(1\) short of a perfect square? Or are there more such cases somewhere out there along the number line, maybe just beyond my reach, waiting to be found? Could there be infinitely many? If so, where are they? If not, why not?
To my taste, the most satisfying way to resolve these questions would be to find some number-theoretical principle ensuring that \(n! + 1 \ne m^2\) for \(n \gt 7\). I have not discovered any such principle, but in a dreamy sort of way I can imagine what a proof might look like. Suppose we eliminate the “\(+1\)” part of the formula, and search for integers such that \(n! = m^2\). It turns out there is just one solution to this equation, with \(n = m = 1\). You needn’t bother lathering up your laptop in the quest for larger examples; there’s a simple proof they don’t exist. In any square number, all the prime factors must be present an even number of times, as in \(36 = 2 \times 2 \times 3\times 3\). In a factorial, at least one prime factor—the largest one—always appears just once. (If you’re not sure why, check out Bertrand’s postulate/Chebyshev’s theorem.)
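The parity argument can be checked numerically with Legendre's formula, which gives the exponent of a prime \(p\) in \(n!\) as \(\lfloor n/p \rfloor + \lfloor n/p^2 \rfloor + \cdots\). The following is a small sketch; the helper names are made up for this example, and the trial-division primality test is a simple stand-in suitable only for small inputs.

```julia
# Trial-division primality test (fine for the small numbers used here).
is_prime(k) = k > 1 && all(k % d != 0 for d in 2:isqrt(k))

# Legendre's formula: exponent of the prime p in the factorization of n!.
function prime_exponent_in_factorial(n, p)
    e = 0
    q = p
    while q <= n
        e += n ÷ q
        q *= p
    end
    return e
end

# Largest prime not exceeding n (requires n >= 2).
largest_prime_upto(n) = maximum(filter(is_prime, 2:n))

for n in (2, 10, 100)
    p = largest_prime_upto(n)
    println("n = ", n, ": largest prime ", p,
            " appears ", prime_exponent_in_factorial(n, p), " time(s) in n!")
end
```

Because Bertrand's postulate puts a prime between \(n/2\) and \(n\), the largest prime \(p \le n\) satisfies \(2p \gt n\), so it divides exactly one of the factors \(1, 2, \ldots, n\).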
Of course when we put the “\(+1\)” back into the formula, this whole line of reasoning falls to pieces. In general, the factorization of \(n!\) and of \(n! + 1\) are totally different. But maybe there’s some other property of \(n! + 1\) that conflicts with squareness. It might have something to do with congruence classes, or quadratic residues. From the definition of a factorial, we know that \(n!\) is divisible by all positive integers less than or equal to \(n\), which means that \(n! + 1\) cannot be divisible by any of those numbers (except \(1\)). This observation rules out certain kinds of squares, namely those that have small primes in their factorization. But for all \(n \gt 4\) the square root of \(n!\) greatly exceeds \(n\), so there’s plenty of room for larger factors, as in the case of \(7! + 1 = 71^2\).
Here’s another avenue that might be worth exploring. The decimal representation of any large factorial ends with a string of \(0\)s, formed as the products of \(5\)s and \(2\)s among the factors of the number. Thus \(n! + 1\) must look like
\[XXXXX \ldots XXXXX00000 \ldots 00001,\]
where \(X\) represents any decimal digit, and the trailing sequence of \(0\)s now ends with a single terminal \(1\). Can we figure out a way to prove that a number of this form is never a square? Well, if the final digit were anything other than \(0, 1, 4, 5, 6,\) or \(9\), the proof would be easy, but lots of squares end in \(\ldots 01\), such as \(10{,}201 = 101^2\) and \(62{,}001 = 249^2\). If there’s some algebraic argument along these lines showing that \(n! + 1\) can’t be a square, it will have to be something subtler.
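A quick enumeration shows why the last-digit test offers no help here. This is only an illustrative sketch; the variable names and the search limit of 300 are arbitrary.

```julia
# Final decimal digits that a perfect square can have.
last_digits = sort(unique(d^2 % 10 for d in 0:9))
println(last_digits)

# Integers up to 300 whose squares end in the two digits 01.
ends_in_01 = [m for m in 1:300 if m^2 % 100 == 1]
println(ends_in_01)
```

Squares ending in \(01\) turn up at regular intervals, so a terminal \(\ldots 01\) by itself cannot rule out squareness.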
All of the above is make-believe mathematics. I have stirred up some ingredients that look like they might make a tasty confection, but I have no idea how to bake the cake. Perhaps someone else will supply the recipe. In the meantime, I want to entertain an alternative hypothesis: that nothing prevents \(n! + 1\) from being a square except improbability.
The pattern observed in the \(n! + 1 = m^2\) problem—a few matches among the smallest elements of the sequences, and then nothing more for many thousands of terms—is not unique to factorials and squares. Other pairs of sequences exhibit similar behavior. For example, I have tried matching factorials with triangular numbers. The triangulars, beginning \(1, 3, 6, 10, 15, 21, \ldots\), are defined by the formula \(T(m) = m(m + 1)/2\). If we look for factorials that are also triangular, we get \(1! = T(1) = 1\), then \(3! = T(3) = 6\), and finally \(5! = T(15) = 120\). No more examples appear through \(n = 100{,}000\).
What about factorials that are \(1\) less than a triangular, satisfying the equation \(n! + 1 = T(m)\)? I know of only one case: \(2! + 1 = 3\). Broadening the search a little, I found that \(n! + 4\) is triangular for \(n \in \{2, 3, 4\}\), again with no more hits up to \(100{,}000\).
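The triangular searches can reuse the square test, because \(x = T(m)\) for some integer \(m\) exactly when \(8x + 1\) is a perfect square. Here is a sketch along the lines of the earlier search function; the name `factorial_triangulars` and its `offset` keyword are inventions for this example, and the limit of 200 is far smaller than the searches described in the text.

```julia
is_square(x) = isqrt(x)^2 == x
is_triangular(x) = is_square(8x + 1)     # x = m(m + 1)/2  <=>  8x + 1 = (2m + 1)^2

# Return every n <= maxn for which n! + offset is a triangular number.
function factorial_triangulars(maxn; offset = 0)
    hits = Int[]
    fac = big(1)
    for n in 1:maxn
        fac *= n                          # incremental factorial
        is_triangular(fac + offset) && push!(hits, n)
    end
    return hits
end

println(factorial_triangulars(200))               # n! itself triangular
println(factorial_triangulars(200; offset = 1))   # n! + 1 triangular
println(factorial_triangulars(200; offset = 4))   # n! + 4 triangular
```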
For another experiment we can bring back the square numbers and swap out the factorials, replacing them with the ever-popular Fibonacci sequence, \(1, 1, 2, 3, 5, 8, 13, \ldots\), defined by the recurrence \(F(n) = F(n - 1) + F(n - 2)\), with \(F(1) = F(2) = 1\). It’s been known since the 1960s that \(1\) and \(144\) are the only positive integers that are both Fibonacci numbers and perfect squares. Looking for Fibonacci numbers that are \(1\) less than a square, I found that \(F(4) + 1 = 4\) and \(F(6) + 1 = 9\), with no other instances up to \(F(500{,}000)\).
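The Fibonacci experiment needs only a running pair of values in place of the running factorial. The sketch below uses a hypothetical function name, `fib_plus_k_squares`, and a modest default range; it is not the program used for the search to \(F(500{,}000)\).

```julia
is_square(x) = isqrt(x)^2 == x

# Return every n <= maxn for which F(n) + k is a perfect square,
# using the convention F(1) = F(2) = 1.
function fib_plus_k_squares(maxn; k = 1)
    hits = Int[]
    a, b = big(1), big(1)                 # F(n) and F(n + 1), starting at n = 1
    for n in 1:maxn
        is_square(a + k) && push!(hits, n)
        a, b = b, a + b                   # advance the recurrence
    end
    return hits
end

println(fib_plus_k_squares(300; k = 0))   # indices of square Fibonacci numbers
println(fib_plus_k_squares(300; k = 1))   # indices where F(n) + 1 is a square
```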
We can do the same sort of thing with the Catalan numbers, \(1, 1, 2, 5, 14, 42, 132 \ldots\), another sequence with a huge fan club. I find no squares other than \(1\) among the Catalan numbers up to \(n = 100{,}000\); I don’t know if anyone has proved that none exist. A search for cases where \(C(n) + 1 = m^2\) also comes up empty, but there are a few low-lying matches for \(C(n) + k = m^2\) for \(k \in \{2, 3, 4\}\).
Finding similar behavior in all of these diverse sequences changes the complexion of the problem, in my view. If we discover some obscure, special property of \(n! + 1\) that explains why it never lands on a square (for large values of \(n\)), do we then have to invent another mechanism for Fibonacci numbers and still another for Catalan numbers? Isn’t it more plausible that some single, generic cause lies behind all the observations?
But the cause can’t be too generic. It’s not the case that you can take any two numeric sequences and expect to see the same kind of pattern in their intersections. Consider the factorials and the prime numbers. By the very nature of a factorial, none of them except 2! = 2 can possibly be prime, but there’s no obvious reason that \(n! + 1\) can’t be a prime. And, indeed, for \(n \le 100\) nine values of \(n! + 1\) are prime. Extending the search to \(n \le 1000\) turns up another seven. Here is the full set of known numbers for which \(n! + 1\) is prime:
They get rare as \(n\) increases, but there’s no hint of a sharp cutoff, as there is in the other cases explored above. Does the sequence continue indefinitely? That seems a reasonable conjecture. (For more on this sequence, including references, see Chris K. Caldwell’s factorial prime page.)
My question is this: Can we understand these curious patterns in terms of mere chance coincidence? The values of \(n! + 1\) form an infinite sequence of integers spread over the number line, dense near the origin but becoming extremely sparse as \(n\) increases. The values of \(m^2\) form another infinite sequence, again with diminishing density, although the dropoff is not as steep. Maybe factorials bump into squares among the smallest integers because there just aren’t enough of those integers to go around, and some of them have to do double duty. But in the vast open spaces out in the farther reaches of the number line, a factorial can wander around for years—maybe forever—and not meet a square.
Let me try to state this idea more precisely. Since \(n!\) cannot be a square, we know that it must lie somewhere between two square numbers; the arrangement on the number line is \((m - 1)^2 \lt n! \lt m^2\). The distance between the end points of this interval is \(m^2 - (m - 1)^2 = 2m - 1\). Now choose a number \(k\) at random from the interval, and ask whether \(n! + k = m^2\). Exactly one value of \(k\) must satisfy this condition, and so the probability of success is \(1/(2m - 1)\), or roughly \(1 / (2 \sqrt{n!})\). Because \(\sqrt{n!}\) increases very rapidly, this probability takes a nosedive toward zero as \(n\) increases. It is represented by the red curve in the graph below. Note that by \(n = 100\) the red curve has already reached \(10^{-80}\).
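For concreteness, the probability can be computed exactly from the integer gap rather than the smooth approximation. In this sketch the function name `hit_probability` is an assumption; it uses the fact that \(m - 1\) is the integer square root of \(n!\), so the gap width \(2m - 1\) equals `2 * isqrt(n!) + 1`.

```julia
# Chance that a k drawn uniformly from the gap between (m - 1)^2 and m^2
# satisfies n! + k = m^2, namely 1 / (2m - 1).
function hit_probability(n::Integer)
    r = isqrt(factorial(big(n)))   # r = m - 1, computed exactly with BigInt
    return 1 / (2r + 1)            # a BigFloat, since r is a BigInt
end

for n in (4, 7, 20, 100)
    println("n = ", n, ": ", Float64(hit_probability(n)))
end
```

For \(n = 7\) the gap runs from \(70^2\) to \(71^2\) and has width \(141\), so the probability is \(1/141\).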
The green curve gives the probability of a collision between Fibonacci numbers and squares; the shape is similar, though it dives off the precipice a little later. The Fibonacci-square curve approximates a negative exponential: The probability is roughly \(1 / (2 \sqrt{F(n)})\), which is proportional to \(\phi^{-n/2}\), where \(\phi = (\sqrt{5} + 1) / 2 \approx 1.618\). The factorial-square curve is even steeper because the factorial function is superexponential: \(n!\) grows faster than \(c^n\) for any fixed \(c\).
The blue curve, recording the probability of coincidences between factorials and primes, has a very different shape. In the neighborhood of \(n!\) the average distance between consecutive primes is approximately \(\log n!\), which grows just a little faster than \(n\) itself and very much slower than \(n!\). The probability of collision between factorials and primes is roughly \(1 / \log n!\). The continuous blue curve corresponds to this smooth approximation. The blue dots sprinkled near that line give the probability based on actual distances between consecutive primes.
What to make of those curves? Is it legitimate to apply probability theory to these totally deterministic sequences of numbers? I’m not quite sure. Before confronting the question directly, I’d like to retreat a few steps and look at a simpler model where probability is clearly entitled to a seat at the table.
Let us borrow one of Jacob Bernoulli’s famous urns, which have room to hold an infinite number of ping pong balls. Start with one black ball and one white ball in the urn, then reach in and take a ball at random. Clearly, the probability of choosing black is \(1/2\). Put the chosen ball back in the urn, and also add another white ball. Now there are three balls and only one is black, so the probability of drawing black is \(1/3\). Add a fourth ball, and the probability of black falls to \(1/4\). Continuing in this way, the probability of black on the \(n\)th draw must be \(\frac{1}{n + 1}\).
If we go on with this protocol forever (always choosing a ball at random, putting it back, and adding an extra white ball), what is the probability of eventually seeing the black ball at least once? It’s easier to answer the complement of this question, calculating the probability of never seeing the black ball. This is the infinite product \(\frac{1}{2} \times \frac{2}{3} \times\frac{3}{4} \times\frac{4}{5} \ldots\), or:
\[\prod_{n = 1}^{\infty} \frac{n}{n + 1}.\]
The product goes to zero as \(n\) goes to infinity. In other words, in an endless series of trials, the probability of never drawing black is \(0\), which means the probability of seeing black at least once must be \(1\). (“Probability \(1\)” is not exactly the same thing as “certain,” but it’s mighty close.)
Now let’s try a different experiment. Again start with one black ball and one white ball, but after the first draw-and-replace cycle add two white balls, then four white balls, and so on, so that the total number of balls in the urn at stage \(n\) is \(2^n\); throughout the process all of the balls but one are white. Now the probability of never seeing the black ball is \(\frac{1}{2} \times \frac{3}{4} \times\frac{7}{8} \times\frac{15}{16} \ldots\), or:
\[\prod_{n = 1}^{\infty} \frac{2^n - 1}{2^n}.\]
This product does not go to zero, no matter how large \(n\) becomes. Neither does it go to \(1\). The product converges to a constant with the approximate value \(0.288788095\). Strange, isn’t it? Even in an infinite series of draws from the urn, you can’t be sure whether the black ball will turn up or not.
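Both urn products are easy to evaluate as partial products. The following sketch uses hypothetical function names and ordinary floating-point arithmetic; the cutoffs are arbitrary.

```julia
# Urn with n + 1 balls at stage n: the partial product telescopes to 1/(N + 1).
harmonic_urn(N) = prod(n / (n + 1) for n in 1:N)

# Urn with 2^n balls at stage n: the partial products settle down quickly.
doubling_urn(N) = prod(1 - 2.0^(-n) for n in 1:N)

println(harmonic_urn(1000))   # about 1/1001, still shrinking toward zero
println(doubling_urn(60))     # about 0.2888, already stable to many digits
```

The first product keeps falling as \(N\) grows, while the second stops changing once \(2^{-n}\) drops below the floating-point resolution.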
These two urn experiments do not correspond directly to any of the sequence coincidence problems described above; they simply illustrate a range of possible outcomes. But we can rig up an urn process that mimics the probabilistic treatment of the factorials-and-squares problem. At the \(n\)th stage, the urn holds \(1 + 2 \lfloor \sqrt{n!} \rfloor\) balls (the gap width \(2m - 1\) from above, or roughly \(1 + 2 \sqrt{n!}\)), only one of which is black. The probability of never seeing the black ball, even in an infinite series of trials, is
\[\prod_{n = 1}^{\infty} \left(1 - \frac{1}{1 + 2 \lfloor \sqrt{n!} \rfloor}\right).\]
This expression converges to a value of approximately \(0.2921426977\). It follows that the probability of seeing black at least once is \(1 - 0.2921426977\), or \(0.7078573023\). (No, that number is not \(1/\sqrt{2}\), although it’s close.)
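The quoted value can be approximated with a short loop. This sketch rests on two assumptions: that the product starts at \(n = 1\), and that the ball count at each stage is the exact integer gap `2 * isqrt(n!) + 1` rather than the smooth quantity \(1 + 2 \sqrt{n!}\), which gives a noticeably different number. The function name and keyword arguments are illustrative.

```julia
# Probability of never drawing the black ball in the factorial urn,
# truncated after stage `stop` (later factors differ from 1 by a negligible amount).
function never_black(; start = 1, stop = 60)
    fac = factorial(big(start - 1))
    p = 1.0
    for n in start:stop
        fac *= n                          # n! kept exactly as a BigInt
        balls = 2 * isqrt(fac) + 1        # integer gap between neighboring squares
        p *= 1 - 1 / Float64(balls)
    end
    return p
end

println(never_black())       # close to the 0.2921... quoted above
println(1 - never_black())   # chance of seeing black at least once
```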
An urn process resembling the factorials-and-primes problem gives a somewhat different result. Here the number of balls in the urn at stage \(n\) is \(\log n!\), again with just one black ball. The infinite product governing the cumulative probability is
\[\prod_{n = 2}^{\infty} \left(1 - \frac{1}{\log n!}\right).\]
On numerical evidence this expression seems to dwindle away to zero as \(n\) goes to infinity (although I’m not \(100\) percent sure of that). If it does go to \(0\), then the complementary probability that the black ball will eventually appear must be \(1\).
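A short argument, offered as a sketch rather than a polished proof, supports the numerical evidence. It uses the bound \(\log n! \le n \log n\), the inequality \(1 - x \le e^{-x}\), and the divergence of \(\sum 1/(n \log n)\). The starting index \(n = 3\) is chosen here so that every factor lies strictly between \(0\) and \(1\); dropping a finite number of leading factors does not affect whether the limit is zero.

\[\prod_{n = 3}^{N} \left(1 - \frac{1}{\log n!}\right) \le \exp\left(-\sum_{n = 3}^{N} \frac{1}{\log n!}\right) \le \exp\left(-\sum_{n = 3}^{N} \frac{1}{n \log n}\right) \longrightarrow 0 \quad \text{as } N \to \infty.\]

The sum in the last exponent grows like \(\log \log N\), which is why the product dwindles so slowly in numerical experiments.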
Some of these results leave me feeling befuddled, and even a little grumpy. Call me old-fashioned, but I always thought that rolling the dice infinitely many times ought to be enough to settle beyond doubt whether a pattern appears or not. In the harsh light of eternity, I would have said, everything is either forbidden or mandatory; as \(n\) goes to infinity, probability goes to \(0\) or it goes to \(1\). But apparently that’s not so. In the factorial urn model the probability of never seeing a black ball is neither \(0\) nor \(1\) but lies somewhere in the neighborhood of \(0.2921426977\). What does that mean, exactly? How am I supposed to verify the number, or even check its first few digits? Running an infinite series of trials is not enough; you need to collect a statistically significant sample of infinite experiments. For an exact result, try an infinite series of infinite experiments. Sigh.
The urn model corresponds in a natural way to the randomized version of the factorial-square problem, where we look at \(n! + k = m^2\) and choose \(k\) at random from an appropriate range of values. But what about the original problem of \(n! + 1 = m^2\)? In this case there’s no random variable, and hence there’s no point in running multiple trials for each value of \(n\). The system is deterministic. For each \(n\) the factorial of \(n\) has a definite value, and either it is or it isn’t adjacent to a perfect square. There’s no maybe.
Nevertheless, there might be a way to sneak probabilities in through the back door. To do so we have to assume that factorials and squares form a kind of ergodic system, where observing one chain of events for a long period is equivalent to watching many shorter chains. Suppose that factorials and squares are uncorrelated in their positions on the number line—that when a factorial lands between two squares, its distance from the larger square can be treated as a random variable, with every possible distance being equally likely. If this assumption holds, then instead of looking at one value of \(n!\) and trying many random values of \(k\), we can adopt a single value of \(k\) (namely \(k = 1\)) and look at \(n!\) for many values of \(n\).
Is the ergodic assumption defensible? Not entirely. Some distances between \(n!\) and \(m^2\) are known to be more likely than others, and indeed some distances are impossible. However, the empirical evidence suggests that the deviations must be slight. The histogram below shows the distribution of distances between a factorial and the next larger square for the first \(100{,}000\) values of \(n!\). The distances have all been normalized to the range \((0, 1)\) and classified in \(100\) bins. There is no obvious sign of bias. Calculating the mean and standard deviation of the same \(100{,}000\) relative distances yields values within \(1\) percent of those expected for a uniform random distribution. (The expected values are \(\mu = 1/2\) and \(\sigma = 1/\sqrt{12} \approx 0.289\); the variance is \(1/12\).)
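A scaled-down version of this experiment might look like the sketch below. The normalization used here (distance to the next larger square divided by the gap width) is an assumption about how the histogram was built, the function name is hypothetical, and the sample of 2,000 factorials is much smaller than the one described above.

```julia
using Statistics

# Relative position of each n! within the gap between neighboring squares.
function relative_distances(maxn)
    d = Float64[]
    fac = big(1)
    for n in 1:maxn
        fac *= n
        r = isqrt(fac)
        r^2 == fac && continue                # skip a factorial that is itself a square (1! = 1)
        gap = 2r + 1                          # (r + 1)^2 - r^2
        push!(d, Float64(((r + 1)^2 - fac) / gap))
    end
    return d
end

d = relative_distances(2000)
println("mean = ", mean(d), ", std = ", std(d))
println("uniform: mean = 0.5, std = ", 1 / sqrt(12))
```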
If this probabilistic approach can be taken seriously, I can make some quantitative statements about the prospects for ever finding a large factorial adjacent to a perfect square. As mentioned above, the overall probability that one or more values of \(n! + 1\) are equal to squares is about \(0.7078573023\). Thus we should not be too surprised that three such cases are already known, namely the examples with \(n = 4, 5,\) and \(7\). Now we can apply the same method to calculate the probability of finding at least one more case with \(n \gt 7\). Let’s make the question more general: “Whether or not I have seen any squares among the first \(C\) values of \(n! + 1\), what are the chances I’ll see any thereafter?” To answer this question, we can just remove the first \(C\) elements from the infinite product:
\[1 - \prod_{n = C + 1}^{\infty} \left(1 - \frac{1}{1 + 2 \lfloor \sqrt{n!} \rfloor}\right).\]
For \(C = 7\), the answer is about \(0.0037\). For \(C = 100\), it’s about \(5.7 \times 10^{-80}\). We are sliding down the steep slope of the red curve.
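Tail probabilities this small need a little care in floating point, because subtracting a product very close to \(1\) from \(1\) loses all the digits. One workaround, sketched here, is to sum logarithms with `log1p` and recover the answer with `expm1`. The function name `chance_after`, the assumption that the product starts at \(n = C + 1\), and the use of integer gaps `2 * isqrt(n!) + 1` are choices made for this illustration.

```julia
# Probability of at least one hit among n = C + 1, C + 2, ..., under the urn model.
# Only the first `terms` stages are included; later ones are vanishingly small.
function chance_after(C; terms = 50)
    fac = factorial(big(C))
    logsum = 0.0
    for n in (C + 1):(C + terms)
        fac *= n
        p = 1 / Float64(2 * isqrt(fac) + 1)   # chance of a hit at stage n
        logsum += log1p(-p)                   # log of (1 - p) without cancellation
    end
    return -expm1(logsum)                     # 1 - exp(logsum), computed stably
end

println(chance_after(7))
```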
As a practical matter, further searching for another factorial-square couple does not look like a promising way to spend time and CPU cycles. The probability of success soon falls into the realm of ridiculously small numbers like \(10^{-1{,}000{,}000}\). And yet, from the mathematical point of view, the probability never vanishes. Removing a finite number of terms from the front of an infinite product cannot change its convergence properties. If the original product converged to a nonzero value, then so will the truncated version. Thus we have wandered into the canyon of maximal frustration, where there’s no realistic hope of finding the prize, but the probabilities tell us it still might exist.
I am going to close this shambling essay by considering one more example—a cautionary one. Suppose we apply probabilistic reasoning to the search for a cube that is \(1\) less than a square. If we were looking for exact matches between cubes and squares, we’d find plenty of them: They are the sixth powers: \(1, 64, 729, \ldots\). But integer solutions to the equation \(n^3 + 1 = m^2\) are not so abundant. One low-lying example is easy to find: \(2^3 + 1 = 3^2\), but after 8 and 9 where can we expect to see the next consecutive cube and square?
The probabilistic approach suggests there might be reason for optimism. Compared with factorials and Fibonaccis, cubes grow quite slowly; the rate is polynomial rather than exponential or superexponential. As a result, the probability of finding a cube at a given distance from a square falls off much less steeply than it does for \(n!\) or \(F(n)\). In the graph below, \(P(n^3 + k = m^2)\) is the orange curve.
Note that the orange curve lies just below the blue one, which represents the probability that \(n!\) lies near a prime. The proximity of the two curves suggests that the two problems—factorials adjacent to primes, cubes adjacent to squares—might belong to the same class. We already know that factorial primes do seem to go on and on, perhaps endlessly. The analogy leads to a surmise: Maybe cube-square coincidences are also unbounded. If we keep looking, we’ll find lots more besides \(8\) and \(9\).
The surmise is utterly wrong. The problem has a long history. In 1844 Eugène Catalan conjectured that \(8\) and \(9\) are the only consecutive perfect powers among the integers; the conjecture was finally proved in 2004 by Preda Mihăilescu. For the special case of squares and cubes, Euler had already settled the matter in the 18th century. Thus, probabilities are beside the point.
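A brute-force search agrees with the theorem as far as it goes. This sketch uses ordinary 64-bit integers, which are adequate for \(n\) up to a million (so \(n^3\) stays below \(2^{63}\)); the function name is made up for the example.

```julia
# Find all n <= maxn for which n^3 + 1 is a perfect square, returned as (n, m) pairs.
function cube_square_neighbors(maxn)
    hits = Tuple{Int,Int}[]
    for n in 1:maxn
        target = n^3 + 1
        m = isqrt(target)                 # exact integer square root
        m^2 == target && push!(hits, (n, m))
    end
    return hits
end

println(cube_square_neighbors(1_000_000))
```

The only pair reported is \((2, 3)\), corresponding to \(2^3 + 1 = 3^2\).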
All of the questions considered here belong to the category of Diophantine analysis—the study of equations whose solutions are required to be integers. It is a field notorious for problems that are easy to state but hard to solve. Catalan’s conjecture is one of the most famous examples, along with Fermat’s Last Theorem. When Diophantine problems are ultimately resolved, the proofs tend to be non-elementary, drawing on sophisticated tools from distant realms of mathematics—algebraic geometry in the proof of Fermat’s Last Theorem by Andrew Wiles and Richard Taylor, cyclotomic fields in Mihăilescu’s proof of the Catalan conjecture. As far as I know, probability theory has not played a central role in any such proof.
When I started wrestling..