Motivating the Rules of the Game for Adversarial Example Research is one of the most level-headed things I’ve read on AI safety/security in a while. It’s 25 pages, which is long for a machine learning paper — but it’s worth it. My brief take-away from the paper, which I totally support:

Adversarial example research has been framed in two ways:

  1. an experimental method for pure research which helps us better understand neural network architectures and their learned representations
  2. a practical method for securing machine learning models against attacks from adversaries in the wild.

Adversarial examples are the least of our problems in the latter practical framing. We ought to either (1) re-cast adversarial example work as a pure research problem, or (2) build better “rules of the game” which actually motivate popular adversarial defense methods as sufficient security solutions.

Here are some more extracts that I think summarize the push of the paper (emphasis mine):

We argue that adversarial example defense papers have, to date, mostly considered abstract, toy games that do not relate to any specific security concern (1).

Much of the adversarial perturbation research arose based on observations that even small perturbations can cause significant mistakes in deep learning models, with no security motivation attached. … Goodfellow et al. intended $l_p$ adversarial examples to be a toy problem where evaluation would be easy, with the hope that the solution to this toy problem would generalize to other problems. … Because solutions to this metric have not generalized to other settings, it is important to now find other, better more realistic ways of evaluating classifiers in the adversarial [security] setting (20).

Exploring robustness to a whitebox adversary [i.e. $l_p$-norm attacks] should not come at the cost of ignoring defenses against high-likelihood, simplistic attacks such as applying random transformations or supplying the most difficult test cases. … Work primarily motivated by security should first build a better understanding of the attacker action space (23).

An appealing alternative for the machine learning community would be to recenter defenses against restricted adversarial perturbations as machine learning contributions and not security contributions (25).

To have the largest impact, we should both recast future adversarial example research as a contribution to core machine learning functionality and develop new abstractions that capture realistic threat models (25).

Some other notes:

  1. The authors correctly point out that “content-preserving” perturbations are difficult to identify. $l_p$-norm is just a proxy (and a poor one at that!) for this criterion. If we try to formalize this briefly, it seems like a content-preserving perturbation $\delta_{O,T}(x)$ on an input $x$ for some task $T$ is one which does not push $x$ out of some perceptual equivalence class according to a system-external observer $O$ who knows $T$.

    If that’s right, then concretely defining $\delta$ for any domain requires that we construct the relevant perceptual equivalence classes for $O$ on $T$. Is this any easier than reverse-engineering the representations that $O$ uses to solve $T$ in the first place? If not, then posing the “correct” perturbation mechanism is just as difficult as learning the “correct” predictive model in the first place.

  2. I think the definition of “adversarial example” begins to fall apart as we expand its scope. See e.g. this quote:

    for many problem settings, the existence of non-zero test error implies the existence of adversarial examples for sufficiently powerful attacker models (17).

    This is true for a maximally broad notion of “adversarial example,” which just means “an example that the system gets wrong.” If we expand the definition that way, the line between a robust system (in the security sense) and a well-generalizing model begins to get fuzzy.
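
To make the first note above concrete, here is a minimal sketch of why an $l_p$ norm is a poor proxy for content preservation. Everything in it is a hypothetical illustration: the 8-pixel “image,” the helper names `lp_norm` and `difference`, and the two perturbations are assumptions chosen for simplicity, not taken from the paper.

```python
def lp_norm(delta, p):
    """l_p norm of a flat list of numbers; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(d) for d in delta)
    return sum(abs(d) ** p for d in delta) ** (1.0 / p)


def difference(x, y):
    """Element-wise perturbation that turns x into y."""
    return [b - a for a, b in zip(x, y)]


# Hypothetical 1-D "image": a single bright bar on a dark background.
x = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

# (a) Faint additive noise on every pixel.
noisy = [v + 0.05 for v in x]

# (b) The same bar shifted right by one pixel. An observer would
# plausibly still call this "the same content."
shifted = [x[-1]] + x[:-1]

noise_size = lp_norm(difference(x, noisy), float("inf"))
shift_size = lp_norm(difference(x, shifted), float("inf"))

# The shift is far "larger" than the noise under the l_inf norm,
# even though it arguably preserves content just as well.
print(noise_size, shift_size)
```

Under these assumptions, a one-pixel translation lands far outside any small $l_\infty$ ball around $x$, while the noise stays well inside it. A norm-bounded threat model would therefore count the translation as out of scope, regardless of what the observer $O$ would say.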


tl;dr: I question the assumption that technical solutions to mid-term safety problems will be relevant to the long-horizon problems of AI safety. This assumption fails to account for a potential paradigmatic change in technology between now and the date at which these long-horizon problems will become pressing. I present a historical example of paradigmatic change and suggest that the same is possible for AI, and argue that our bets on the importance of present-day safety work ought to incorporate our beliefs over the strength of the current paradigm.

I’m back from a brief workshop on technical issues in AI safety, organized by the Open Philanthropy Project. The workshop brought together the new class of AI Fellows with researchers from industry labs, nonprofits, and academia to discuss actionable issues in AI safety.

Discussions at the workshop have changed and augmented my views on AI safety in fundamental ways. Most importantly, they have revealed to me several critical conceptual issues at the foundation of AI safety research, involving work with both medium time horizons (e.g. adversarial attacks, interpretability) and much longer horizons (e.g. aligning the incentives of superintelligent AIs to match our own values). I believe that these are blocking issues for safety research: I don’t know how to value the various sorts of safety work until I arrive at satisfying answers to these questions. Over the next months, I’ll formalize these questions in separate single-authored and co-authored blog posts.

This post addresses the first of these critical conceptual issues. This issue is the least technical – and possibly the least deep-cutting – of those which I want to raise. Because it touches on one of the most common safety arguments, though, I thought it’d be best to publish this one first.


AI safety is a very diverse field, encompassing work targeted at vastly different time horizons. I identify three in this post:

This post is mainly concerned with the value of mid-term work.

Mid-term AI safety

Many mid-term researchers assume that their work is well aligned with solving longer-horizon safety risks — that is, that technical solutions to mid-term problems will also help us make progress on the most concerning long-horizon risk scenarios. Paul Christiano has made statements about the likely alignment of mid-term and long-term issues — see, for example, his 2016 article on prosaic AI:

It now seems possible that we could build “prosaic” AGI, which can replicate human behavior but doesn’t involve qualitatively new ideas about “how intelligence works” … If we build prosaic superhuman AGI, it seems most likely that it will be trained by reinforcement learning … But we don’t have any shovel-ready approach to training an RL system to autonomously pursue our values.

To illustrate how this can go wrong, imagine using RL to implement a decentralized autonomous organization (DAO) which maximizes its profit. If we had very powerful RL systems, such a DAO might be able to outcompete human organizations at a wide range of tasks — producing and selling cheaper widgets, but also influencing government policy, extorting/manipulating other actors, and so on.

This sort of argument is used to motivate mid-term technical work on controlling AI systems, aligning their values with our own, and so on. In particular, this argument is used to motivate technical work in small-scale synthetic scenarios which connect to these long-term concerns. Leike et al. (2017) propose minimal environments for checking the safety of reinforcement learning agents, for example, and justify the work as follows:

While these [proposed grid-world safety environments] are highly abstract and not always intuitive, their simplicity has two advantages: it makes the learning problem very simple and it limits confounding factors in experiments. Such simple environments could also be considered as minimal safety checks: an algorithm that fails to behave safely in such simple environments is also unlikely to behave safely in real-world, safety-critical environments where it is much more complicated to test. Despite the simplicity of the environments, we have selected these challenges with the safety of very powerful artificial agents (such as artificial general intelligence) in mind.

These arguments gesture at the successes of modern machine learning technology—especially reinforcement learning—and recognize (correctly!) that we don’t yet have good procedures for ensuring that these systems behave the way that we want them to when they are deployed in the wild. We need to have safety procedures in place, they claim, far before more powerful longer-horizon systems arrive that can do much more harm in the real world. This argument rests on the assumption that our technical solutions to mid-term problems will be relevant at the long-horizon date when such systems arrive.

This post questions that assumption. I claim that this assumption fails to account for a potential paradigmatic change in our engineered AI systems between now and the date at which these long-horizon problems become pressing.2 A substantial paradigmatic change — which could entail a major change in the way we engineer AI systems, or the way AI is used and deployed by corporations and end users — may make irrelevant any mid-term work done now which aims to solve those long-horizon problems.

I’ll make the argument by historical analogy, and circle back to the issue of AI safety at the end of this post.

Paradigmatic change: an example

At the end of the 19th century, some of the largest cities in the world relied on horses as a central mode of transportation. A city horse was tasked with driving everything from the private hansom cab (a Sherlock Holmes favorite) to the double-decker horsebus, which could tow dozens of passengers.

1890s New York City, for example, housed over a hundred thousand horses for transporting freight and humans. While this massive transportation industry helped to continue an era of explosive city growth, it also posed some serious novel logistical problems. Many of those horses were housed directly in urban stables, taking up valuable city space. Rats and other city rodents flocked to the urban granaries established to support these stables.

But the most threatening problem posed by this industry by far was the waste. The massive horse population produced a similarly massive daily output of excrement and urine. Because the average city horse survived fewer than three years of work, horse carcasses would commonly be found abandoned in the streets.3

This sort of waste had the potential to doom New York and similar cities to an eventual crisis of public health. On dry days, piles of horse excrement left in the streets would turn to dust and pollute the air. Rainstorms and melting snow would precipitate floods of horse poop, meeting the especially unlucky residents of ground-floor apartments. In all climates, flies flocked to the waste and helped to spread typhoid fever.

Enterprising Americans were quick to respond to the problem—or, rather, to the business opportunities posed by the problem. “Crossing sweepers” paved the way through the muck for the classiest of street-crossers. Urban factories cropped up to process horse carcasses, producing glue, gelatin, and fertilizer. Services carted away as much horse poop as possible to pre-designated “manure blocks,” in order to keep at least part of the city presentable.

Horse waste posed a major looming public health risk for the 19th-century city. I assume there were two clear roads forward here:

  1. Reduce the horse population. With cities around the world booming in population, banning or restricting horse-based transportation would stall a city’s growth. Not a viable option, then.
  2. Eliminate the waste. New technical solutions would need to be future-proof, robust even in the face of a continuously growing horse population. While the technical solutions of the day only mitigated some of the worst effects of the waste, this would have seemed like the only viable technical solution to pursue.

I certainly would have voted for #2 as an 1890s technologist or urban planner. But neither of these solutions ended up saving New York City, London, and friends from their smelly 19th-century fates. What saved them?

The automobile, of course. The internal combustion engine offered a fast, efficient, and increasingly cheap alternative to the horse. Urban horses were replaced somewhat slowly, only as market pressures forced horse owners to close up shop. By the final decade of the 19th century, most major cities had switched from horse-pulled streetcars to electrified trolleys. Over the following decades, increasingly economical engines replaced horses in buses, cabs, and personal vehicles. Automobiles introduced a novel technological paradigm, leading to entirely new manufacturing methods, service jobs, and — most importantly — safety issues.

The introduction of the automobile dealt a final blow to the previous transportation paradigm, and rendered irrelevant the safety issues it had imposed on modern cities: automobiles did not leave excrement, urine, or horse carcasses in the streets. Automobiles introduced entirely new safety issues, no doubt, which still trouble us today: car exhaust pollutes our atmosphere, and drunk car drivers do far more damage to pedestrians than a drunk hansom driver ever could. But it’s critical to note for our purposes that technologists of the horse era could not have foreseen such safety problems, let alone developed solutions to them.

Potential paradigmatic changes in AI

Modern machine learning and AI are likewise built within a relatively fixed paradigm, which specifies how systems are constructed and used. I want to suggest that substantial changes to the present paradigm might invalidate the assumed alignment between mid-term AI safety work and the longer-term goals. But first, I’ll identify the relevant features of the paradigm that contains modern ML work.

What are some assumptions of the current paradigm which might change in the future? Any answer is bound to look silly in hindsight. In any case, here are a few candidate concepts which are currently central to machine learning, and to AI safety by association. I’ve heard all of these concepts questioned in conversations with reasonable machine learning researchers. Many of these assumptions have even been questioned/subverted in published papers. In short, I don’t think any of these concepts are set in stone.

  1. the train/test regime — the notion that a system is “trained” offline and then “deployed” to the real world4
  2. reinforcement learning; (discrete-time) MDPs
  3. stationarity as a default assumption; IID data sampling as a default assumption
  4. RL agents with discrete action spaces
  5. RL agents with actions whose effects are pre-specified by the system’s designer
  6. gradient-based learning / local parameter search5
  7. parametric models6
  8. the notion of discrete “tasks” or “objectives” that systems optimize
  9. (heresy!) probabilistic inference as a framework for learning and inference

I believe that, while many of the above axiomatic elements of modern machine learning seem foundational and unshakable, most are likely to be obsolete within decades. Before you disagree with that last sentence, think of what futures a horse-drawn cab driver or an 1890s urban planner would have predicted. Consider also what sort of futures that expert systems developers and Lisp machine engineers from past decades of AI research would have sketched. (Would they have mentioned MDPs?)7

You may not agree that all or most of the above concepts are about to be subverted any time soon. If you do agree that any foundational axiom $A$ has the chance of disappearing, though, it is imperative that 1) your safety questions are still relevant, and 2) your technical solutions are successful both in a world where $A$ holds and in one where $\neg A$ holds.

Consequences of paradigmatic change

The argument I am suggesting here is different from the standard “technical gap” argument.8 I am instead pointing out a paradigmatic gap: the technical solutions we develop now may be fatally attached to the current technological paradigm. Let $T_S$ be the future time at which long-horizon AI safety risks – say, prosaic AGI or superintelligence – become a reality. Here are two consequences of granting this as a possibility:

  1. Our current technological paradigm may mislead us to consider safety problems that won’t be at all relevant at $T_S$, due to paradigmatic change.

    Excrement evacuation seemed like a pressing issue in the late 19th century; the problem is entirely irrelevant in the present-day automobile paradigm. We instead deal with an entirely different set of modern automobile safety issues.

    The task of scalable reward specification likewise appears critically important to the mid-term and long-term AI safety crowds. Such a problem is only relevant, however, if many of the paradigmatic axioms from the previous section hold (at least #2–5).

  2. Technical solutions developed now may be irrelevant at $T_S$. Even if today’s pressing safety issues overlap with those at $T_S$ (i.e., #1 above doesn’t hold), it’s possible that our technical solutions will still be fatally tied to elements of the current paradigm.

    Pedestrians and riders alike faced collision risks in the horse era — runaway horses might kick and run over people in their way. But the technical solutions to horse collision look nothing like those which save lives today (for example, airbags and stop lights).

There’s room to disagree on this question of a paradigmatic gap. But it certainly needs to be part of the AI safety discussion: our bets on the importance of present-day technical safety work ought to incorporate our beliefs over the strength of the current paradigm. Here are some concrete questions worth debating once we’ve granted the possibility of paradigmatic change:

  • How much are different risks and research directions actually tied to the current paradigm?9 (How can we get more specific about this “fatal attachment?”)
  • Do our current paradigm-bets look good, or should we be looking to diversify across possible paradigm changes or weaken the connection to the current paradigm?
    • What does “diversify” mean here? Would it entail doing more or less work under the framing of AI safety?
    • We need to arrive at a consensus on the pessimistic meta-induction argument here (see footnote #8). Are we justified in assuming the current paradigm (or any candidate future paradigm) is the right one in which to do mid-term safety work? Can empirical evidence help here? How can we get more concrete, in any case, about our uncertainty about the strength of a technological paradigm?
  • Are there ways to do research that’s likely to survive a paradigm shift?10 (What are the safety problems which are likely to survive a paradigm shift?)

Future posts will address some of the above questions in detail. For now, I look forward to the community’s response!


Here I’ll respond to various comments from reviewers which I couldn’t fit nicely into the above narrative.

  • Aditi and Alex suggested that AI safety work might be what actually brings about the paradigmatic change I’m talking about. Under this view, the safety objective motivates novel work which otherwise would have come more slowly (or not at all). I think that’s possible for some sorts of AI safety research — for example, the quest to build systems which are robust to open-ended / real-world adversarial attacks (stop sign graffiti) might end up motivating substantial paradigm changes. This is a possibility worth considering. My current belief is that many of these sorts of safety research could be just as well branded as “doing machine learning better” or “better specifying the task.” In other words, the “safety” framing adds nothing new. At best, it’s distracting; at worst, it gives AI safety an undeserved poor reputation. (As Michael suggested: I’d rather say “I’m working on X because it makes AI more robust / ethical / fair” than “I’m working on X because it will help stave off an existential threat to the human race.”) This is a very compressed argument, and I’ll expand it in a future post in this series.
  • Michael, Jacob, Max, and Paul suggested that mid- and long-term AI safety research might transfer across paradigm shifts. This is certainly true for the most philosophical parts of AI safety research. I am not convinced it applies in more mid-term work. I’m not certain about the answer here, but I am certain that this is a live question and ought to play an important role in debates over AI timelines.
  • Jacob, Tomer, and Daniel pointed out the possible link to Kuhnian paradigm shifts. See footnote #2 for a response. In a future post, I intend to address the separate danger of failing to acknowledge dependence on the current scientific paradigm (i.e., on our present notion of “what intelligence is”).

This post benefited from many discussions at the Open Philanthropy AI safety workshop, as well as from reviews from colleagues across the world. Thanks to Paul Christiano, Daniel Dewey, Roger Levy, Alex Lew, Jessy Lin, João Loula, Chris Maddison, Maxinder Kanwal, Michael Littman, Thomas Schatz, Amir Soltani, Jacob Steinhardt, Tomer Ullman, and all of the AI Fellows for enlightening discussions and feedback on earlier drafts of this post.

  1. I prefer to separate these practical issues under the name “machine learning security,” which has a more down-to-earth ring compared to “AI safety.” 

  2. I don’t intend to refer to Kuhnian paradigm shifts by using this term. Kuhn makes the strong claim that shifts between scientific paradigms (which establish standards of measurement, theory evaluation, common concepts, etc.) render theories incommensurable. I am referring to a much simpler sort of technological paradigm (the toolsets and procedures we use to reach our engineering targets). This post is only concerned with the latter sort of paradigmatic change. 

  3. From Greene (2008), cited in this Quora answer

  4. see e.g. online learning / lifelong learning 

  5. see e.g...


Jack Gallant’s group published a Nature paper several years back which caused quite a buzz. It presented interactive “semantic maps” spanning the human cortex, mapping out how words of different semantic categories were represented in different places. From the abstract:

Our results suggest that most areas1 within the [brain’s] semantic system represent information about specific semantic domains, or groups of related concepts, and our atlas [an interactive web application] shows which domains are represented in each area. This study demonstrates that data-driven methods – commonplace in studies of human neuroanatomy and functional connectivity – provide a powerful and efficient means for mapping functional representations in the brain.

The paper is worth a read, but is unfortunately behind a paywall. The group also produced the video below, which gives a brief introduction to the methods and results.

The brain dictionary - YouTube

In extremely abbreviated form, here’s what happened: the authors of the paper put people in a functional magnetic resonance imaging machine and took snapshots of their brain activity while they listened to podcasts. They tracked the exact moments at which each subject heard each word in a podcast recording, yielding a large dataset mapping individual words to the brain responses of subjects who heard those words.

They combined this dataset with several fancy computational models to produce maps of semantic selectivity, charting which parts of the brain respond especially strongly to which sorts of words. You can see the video for examples, or try out their online 3D brain viewer yourself.

This systems neuroscience paper managed to reach people all the way out in the AI community, as it seemed to promise a comprehensive account of actual neural word representations.2 There has since been plenty of criticism of the paper on multiple levels – in experimental design, in modeling choices, in scientific value, and so forth. In this post, I’ll raise a simple philosophical issue with the claims of the paper. That issue has to do with the central concept of “representation.” This paper’s claims to representation bring us to what I think is one of the most important open questions in the philosophy of cognitive science and neuroscience.

This post is intended to serve as a non-philosopher-friendly introduction to the problem of neural representation. Rather than advancing any new theory in this post, I’ll just chart out the problem and end with some links to further reading.

The essential argument

The authors repeatedly allude to “(functional) representations” of words in the brain. This term is often bandied about in systems neuroscience, but it is much more philosophically troubling than you might think at first glance. Let’s spell out the high-level logic of the paper:

  1. We play subjects some podcasts and make sure they pay attention.
  2. At the same time, we record physical traces of their brain activity.3
  3. After we have collected our dataset matching words spoken in the podcasts to brain activity, we build a mathematical model relating the two. We find that we can predict the brain activity of a subject (in particular regions) based on the words that they heard at that moment.4
  4. When we can predict the brain activity of a region with reasonable accuracy based on the identity of the word being heard alone, we can say that the region serves to represent that word.
  5. Our derived semantic map shows how the brain represents words from different semantic domains in different areas.

Let’s step back and put on our philosopher-hats here.

Things bumping around

What we actually observe in this experiment are two different types of physical events. First, a word is played through a pair of headphones, vibrating the air around the ears of the subject in a particular way. Next, we see some neurons firing in the brain, spewing out neurotransmitters and demanding extra nutrients to replenish their strength.5 We find that there is some regular relationship between the way the air vibrates (that is, the particular words a subject hears) and the way particular populations of neurons respond.

Let’s make an even higher-level gloss of the core logic in this spirit:

  1. We make some atoms bump around in pattern \( A \) near the subject’s ears.
  2. We watch how some atoms bump around at the same time in a nearby area (the subject’s brain). Call this pattern \( B(A) \). Note that \( B \) is a function of \( A \) – we group the atom-bumps in the brain according to the particular patterns \( A \) presented to the subject.
  3. We build a mathematical model of how the ear-atom-bumping \( A \) relates to the brain-atom-bumping \( B(A) \).
  4. When our model accurately predicts the bumping \( B(A) \) given the bumping \( A \), we say that \( B(A) \) represents some aspect of \( A \).
  5. The brain activity pattern \( B(A) \) represents the ear-bumping pattern \( A \).

At this level of abstraction—a level which might sound a little silly, but which preserves the essential moves of the argument—we might be able to draw out a strange logical leap. Point #4 takes a correlation between different bumping-patterns \( A \) and \( B(A) \) and concludes that \( B(A) \) represents \( A \).

Correlation as representation

That notion of representation captures the relevant relation in the paper. But it also captures quite a bit more – namely, any pair of physical events \( A \), \( B(A) \) for which some aspect of \( B(A) \) correlates with some aspect of \( A \). Here’s a random list of pairs of physical events or states which satisfy this requirement:

  • The length of a tree’s shadow (\( B(A) \)) and the time of day (\( A \))
  • My car’s engine temperature (\( B(A) \)) and the position of the key in my car’s ignition (\( A \))
  • The volume of a crowd in a restaurant (\( B(A) \)) and the number of eggs broken in the last hour of that restaurant’s kitchen (\( A \))

In none of the above cases would we say that the atom/molecule/photon-bumps \( B(A) \) represent an aspect of \( A \). So why do we make the claim so confidently when it comes to brains? Our model of the brain as an information-processor needs this notion of representation to be rather strong – to not also include random physical relationships between shadows and time, or volumes and egg-cracking.6
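
The worry above can be sketched in a few lines of code. Assume, hypothetically, that we operationalize “representation” as in step #4: \( B(A) \) represents \( A \) whenever the two are strongly correlated. The function names, the 0.8 threshold, and the toy sun-elevation numbers below are all assumptions for illustration, not anything from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def naively_represents(a, b, threshold=0.8):
    """Hypothetical criterion: B 'represents' A if |correlation| >= threshold."""
    return abs(pearson(a, b)) >= threshold


# A: time of day (afternoon hours). Toy assumption: the sun's elevation
# drops linearly from 60 to 10 degrees between 13:00 and 18:00.
hours = [13, 14, 15, 16, 17, 18]
elevations_deg = [60, 50, 40, 30, 20, 10]

# B(A): shadow length of a 10 m tree, from basic trigonometry.
tree_height_m = 10.0
shadows_m = [tree_height_m / math.tan(math.radians(e)) for e in elevations_deg]

# The tree's shadow passes the same test that the brain region passed.
print(naively_represents(hours, shadows_m))
```

With these toy numbers the criterion is satisfied: the shadow “represents” the time of day by exactly the standard applied to the brain data. Nothing in a purely correlational test distinguishes the two cases, which is the gap the rest of this post is concerned with.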

The quest

We could just declare by fiat, of course, that the relationships between the brain and the outside world are the ones we are interested in explaining. But as scientists we are interested in developing explanations that are maximally observer-independent. The facts we discover – that region \( X \) of the brain exhibiting a pattern \( B(A) \) represents some aspect \( A \) of the outside world – ought to be true whether or not any scientist cares to investigate it. Our desired notion of representation should emerge naturally from a description of how \( B(A) \) and \( A \) relate, without selecting the silly cases from above. For this reason, people generally think of this theoretical program as a quest for naturalistic representation.

M.C. Escher — Hand with Reflecting Sphere.

A first response: Sure, the details of \( B(A) \) can be used to infer the details of \( A \) in all of these cases, including the case of the Nature paper. The difference between the Nature paper and the silly examples given above is that the correlation between \( B(A) \) and \( A \) is relevant or important in some sense. We’re capturing some actual mechanistic relationship in the case of the brain, whereas the other examples simply pick on chance correlations.

A counter: I don’t see a principled difference between your “mechanistic relationships” and your “chance correlations.” There are certainly mechanistic explanations which link the length of a tree’s shadow and the time of day, or any of the other pairs given above. Why privilege the neural relationship with the label of “mechanism?”

Our answer to that question can’t fall back on claims about the brain being a more “interesting” or “relevant” system of study in any respect. We need to find a naturalistic account of why the brain as a data-processor is any different than those (admittedly silly) examples above.

This, then, is the critical problem of representation in the brain: we need to find some way to assert that the brain is doing something substantial in responding to its inputs, over and above the way a tree or a car engine “respond” to their “inputs.” (Why do we need scare-quotes in the second case, but not in the first?)

Future posts on this blog will characterize some of the most popular responses to this conceptual issue. In particular, I’ll explore notions of representation which require an account of how content is used or consumed. For now, though, I’ll link to some relevant writing:

  • From neuroscientists: deCharms & Zador (2000), Parker & Newsome (1998) – more sophisticated operational definitions of neural representation.
  • From philosophers:
    • Ramsey (2003) – difficult, but very exciting, attack on the idea of neural representation.
    • Egan (2013), see also video here – argues that talk of representational content is simply a useful “gloss” on actual theories. (Directed at mental representation, but applies just as well to neural representation.)
  1. Here “area” means a particular region of the cortex of the human brain. 

  2. This is absolutely not the first paper on how words are represented neurally – see e.g. Martin (2007). It may be unique as of 2016, though, in its breadth and its spread into the AI community. The first author of the paper presented this work, for example, at the NIPS conference in the winter of 2016. 

  3. In this particular case, those traces consist of changes in blood flow to different regions of the brain, detected by a machine with an enormous magnet surrounding the person’s head. For more, check out the Wikipedia article on functional magnetic resonance imaging (fMRI).

  4. Technical note: “at that moment” is not exactly correct, since fMRI data only tracks the pooled responses of samples of neurons over the span of several seconds. 

  5. Another hedge: what we actually observe is the flow of oxygenated and deoxygenated blood around the brain. I’ll stop making these technical hedges now; the neuroscientists can grant me a bit of loose language, and the non-neuroscientists nerdy enough to read these footnotes are hopefully motivated by this point to go read about the details of fMRI.

  6. M.H. points out that this naïve notion of neural representation also fails to pick out cases we would call proper representation. Consider entertaining an arbitrary thought, which (presumably) activates neural populations in such a way that we’d say those activations represent the thought. It’s not possible in this case to point out any stimulus or action correlated with that neural activity, since the actual represented content of the thought is unobservable to the scientist. 


A friend remarked recently that the majority of the recent posts on this blog were rather “academic.” Well, I’m an academic, aren’t I? I didn’t see any problem with this label at the time.

But it turns out that academics are people, too—and, as genuine people, might benefit from exploring outside the ivory tower every once in a while. In celebration of my own human-ness, then, this week’s post has zero intellectual content.

I was sitting earlier tonight at Cambridge Zen Center in a weekly community meeting. It’s a nice, humble get-together where Zen Center members and a few dozen people from the community come to talk about meditation and Buddhism.

Each meeting begins with a five-minute meditation. We had an unusually large crowd tonight, and the room was packed as we settled in for our sit. A clap from the leader signaled the start of the five minutes, and the room fell silent.

But that silence tonight was by no means the absence of sound. I’ve been going to these meetings for about 9 months, but somehow never noticed within the stillness all this noise.

A man behind me pushed air back and forth slowly over tensed vocal cords, singing a high-frequency static like that of a distant sea. A girl to my side breathed quickly, occasionally voicing little falsetto squeaks. In front of me, a man exhaled in quick bursts, like a horse just after a gallop. Beneath these solos swayed a textured chorus of ins and outs, ins and outs.

The symphony at the Zen Center cued a memory of a quiet forest, with the wind filtering through the leaves of the trees: in and out, in and out.

The forest behind Dhamma Suttama. Montebello, Québec. August 2017

It was a unique moment. After five precious minutes, we separated ourselves from our branched brethren and began to talk.

This post shall have no conclusion attempting to induce any general lessons from the above story. Instead, without a whiff of conceptual analysis or other “academic” hullabaloo, it will simply end.

foldl - 9M ago

—as I walked down a quiet side-street in Cambridge, not far from Central Square. I was glued to my phone and couldn’t make out so many details without looking up, but I could see that it was middle-sized and black, facing me and angled to the north-east.

I could tell this was a dog not only from its shape, but also from that primitive thwang that dogs trigger in my bones. I’m not afraid of dogs – I’ve spent most of my life around them – but I’m still wary around arbitrary canines on the street, leashed or not.

I felt that thwang as I registered the dog’s basic features. Black, medium size – maybe a black labrador. I raised my head, ready to step out of the way, smile at the owner, follow the basic program. But there was no dog in front of me.

What was in front of me was not a black labrador, but a commuter bike locked to a slightly oblique street sign. The bike had a thin black seat and narrow road tires, with a rusty pannier rack framing its back wheel. Its handlebars – drop bars, taped black – were angled away from me. No recognizable dog-features in sight, let alone an actual dog.

How could my own experience of the world be so wrong?

Am I pathological? I don’t think so. I’ve been noticing more of these experiences over the past few months. Sights, sounds, and sensations occasionally reveal themselves to be little fibs: reasonable, but ultimately inaccurate, pictures of what is actually out there in the real world.

There are at least two pictures of perception that such fib-experiences might suggest. Both views suggest that sensations and beliefs combine to produce our visual experience—this much is uncontroversial. They differ, though, on how much credit is assigned to each of those sources.

In one picture, my brain takes in an abundant amount of detail about the visual world at all times. On top of that abundant stream of information, some higher-level system sprinkles on the conceptual details: that cube is a cardboard box, that wiggling object is dangerous, and so on. Serious mistakes in those higher-level attributions – like the dog-percept presented above – can temporarily paint over my sensory inputs and cause me to see things that aren’t there.

A second picture suggests that the sensory information reaching my brain at any moment is actually quite sparse. On top of this sparse stream, most of the work of perception is performed at higher levels, with the mind “filling in” all of the gaps in my sensory data on the basis of beliefs and expectations. In this view, it’s not that the mind overwrites sensory data that is already there — rather, the mind is continuously tasked with filling in perceptual information not present in the original sensory data.

To further develop these two pictures, I’ll turn to some details on the human eye.

The human retina contains two major types of light-sensitive cells:1 rods and cones. Rods are responsible for vision in low light, and are not sensitive to color. Cones function well only in high-light situations, and uniquely support color vision.

It turns out that these two types of cells are distributed unequally in the retina. Cones cluster around the area of the retina which maps to the center of our visual field (the fovea), while rods dominate everywhere else.

Spatial distribution of rods and cones in the human retina. From Cmglee on Wikipedia.

This spatial distribution suggests that, at any moment, the majority of the color information my retina receives only picks out points in the very center of my visual field.2

This is one case, then, in which the brain seems to receive rather sparse sensory information. That’s puzzling, because it doesn’t seem to map onto my experience. I certainly don’t think that my color vision is limited to the very center of my visual field—I really hope yours isn’t, either.3

How is it that I perceive the world as fully colored, if my sensory machinery cannot possibly yield such an image? If that underlying hardware is yielding only a sparse picture of the real world, why does color feel so abundant in my visual experience?

Balas & Sinha (2007) present a simple experiment which will help us better draw out this sparse view. Their results offer behavioral evidence that some higher-level mental module actively fills in color information, turning what is originally a rather sparse sensory representation into an abundant visual experience.

(Unfortunately, the paper is not open-access, and the publisher demands hundreds of dollars for the rights to reproduce the figures. So I’ll do my best to summarize the procedure and results here.)

The authors prepared modified images of natural scenes like the one in the figure below. They took full-color images and imposed a randomly sized color mask, such that a circle in the center of the image remained in color while the rest of the image appeared in grayscale.

Partially-colored chimera image like those used in Balas & Sinha (2007).
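For readers who want to build a similar stimulus themselves, here is a minimal sketch of the masking step. It is not the authors' code: the function name `make_chimera`, the pixel-based `radius` (the paper reports mask sizes in degrees of visual angle), and the luminance weights are all assumptions made for illustration.

```python
def make_chimera(pixels, radius):
    """Keep a central disc in color; convert everything outside it to gray.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    `radius` is the mask radius in pixels (a hypothetical parameter).
    """
    height, width = len(pixels), len(pixels[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2  # image center
    out = []
    for y, row in enumerate(pixels):
        new_row = []
        for x, (r, g, b) in enumerate(row):
            if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                # Inside the circular mask: keep the original color.
                new_row.append((r, g, b))
            else:
                # Outside the mask: replace with a luminance value.
                # BT.601 weights are an illustrative choice, not from the paper.
                lum = 0.299 * r + 0.587 * g + 0.114 * b
                new_row.append((lum, lum, lum))
        out.append(new_row)
    return out
```

Drawing `radius` at random for each image (for example with `random.uniform`) would mimic the "randomly sized" mask described above.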

These “chimera” images were rapidly presented to adult subjects, and the subjects were asked after each presentation to report whether the image they had just seen was in grayscale, full color, or a mix of the two.

The crucial metric of interest here is the rate of color “false alarms”: how often subjects perceive an image with only its center in color as a fully colored picture. These false alarms would be evidence of the brain filling in color percepts.
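To make the metric concrete, here is a small sketch of how a false-alarm rate could be computed from a list of trials. The data layout (pairs of stimulus and response labels) and the function name are my own assumptions, not the format used in the paper.

```python
def false_alarm_rate(trials):
    """Fraction of chimera trials that the subject reported as full color.

    `trials` is a list of (stimulus, response) pairs, where each label is
    one of "gray", "color", or "mixed" (a hypothetical encoding).
    """
    # Only the partially colored ("mixed") stimuli can produce a false alarm.
    chimera_responses = [resp for stim, resp in trials if stim == "mixed"]
    if not chimera_responses:
        raise ValueError("no chimera trials to score")
    false_alarms = sum(resp == "color" for resp in chimera_responses)
    return false_alarms / len(chimera_responses)


# Invented example data, for illustration only.
example_trials = [
    ("mixed", "color"),   # false alarm
    ("mixed", "mixed"),   # correct report
    ("color", "color"),   # not a chimera trial, ignored
    ("mixed", "color"),   # false alarm
    ("gray", "gray"),     # not a chimera trial, ignored
    ("mixed", "mixed"),   # correct report
]
rate = false_alarm_rate(example_trials)
```

In practice one would compute this rate separately for each mask size, since the interesting question is how the rate changes as the colored region grows.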

What would we expect to find? We know that the actual sensory data is rather sparse — recall that the majority of color-sensitive photoreceptors cluster in the fovea, in the center of the visual field. We might guess, then, that if the region of the image perceived by this color-sensitive area is appropriately colored, then the brain would serve to fill in the rest of the percept.

This is what Balas & Sinha find in their main result: even when nontrivial portions of the image are presented in grayscale, people are likely to report that the entire image is in color. For example, when the color mask covers 17.5 degrees of the visual field, subjects report that the entire image is colored almost 40% of the time. These false alarm rates reach 60% as the size of the color mask increases.

There’s much more to the paper: the authors present further experiments attempting to work out the source of the information used to fill in color. For our purposes, though, this headline result is already interesting.

We have evidence from both directions for the sparse view, then:

  • At the neural level we can see that the hardware to support color vision is clustered around a small central area of the retina, yielding rather sparse information about color in the rest of the visual field.
  • At the behavioral level we see that people often perceive these partially-colored images as fully colored.

It seems, then, that higher-level mechanisms are doing quite a bit of work in the brain to “fill in” the sparse information which manages to filter through the retina.

Why am I writing about this? I think the “filling in” metaphor is a useful tool for the mental toolbox.4 While this sort of phenomenon shows up again and again in psychology, I feel like I’ve only just begun to internalize it — to start to actually see footprints of the process in my own experience.

It’s likely due only in small part to my intellectual understanding of the process. It’s more likely, I think, that regular meditation and introspection is what is actually helping me see my own experience more clearly.

In any case, it’s quite the thrilling ride. I am catching my mind for the regular fibber that it is, as it paints pretty pictures over messy and sparse arrays of input from my sensory hardware. Happy hallucinating!

  1. Fun fact. There is actually a third type with quite a long name: intrinsically photosensitive retinal ganglion cells. These cells (a ~1% minority in the retina) help regulate circadian rhythms and contribute to melatonin production/suppression. They were first hypothesized after scientists discovered that supposedly blind mice were still able to respond to changes in their visual environment. 

  2. This is not exactly correct, of course. We rapidly and subconsciously microsaccade, even when we feel we are fixating our eyes on one position in our visual field. It’s possible that these microsaccades function in part to gather information about colors and forms in our periphery. I don’t pretend to cover all my bases as a vision scientist here – I only hope to get the broad strokes of this argument correct. 

  3. I also don’t think that my peripheral vision is especially acute in low-light conditions. 

  4. Psychologists and cognitive scientists might be reminded of terms like “top-down processing” and “predictive processing.” I’m not sure this metaphor adds anything on top of those, but it does sound quite a bit more intuitive. Anyway, the point of this post is to share some fun facts and ideas, not to present a novel metaphor. 


Abi See recently hosted a debate between Yann LeCun (Facebook/NYU) and Chris Manning (Stanford) on the importance of linguistic structure and innate priors in systems for natural language understanding. The video, along with a nice write-up from Abi herself, is available here.

I used to have strong opinions on the use of linguistic structure in NLP systems.1 I’m no longer so passionate about that debate, but I still found something interesting in this particular discussion. Yann made a striking remark near the end of the debate (beginning in the video at 59:54):

Language is sort of an epiphenomenon [of human intelligence] … it’s not that complicated … There is a lot to intelligence that has absolutely nothing to do with language, and that’s where we should attack things first. … [Language] is number 300 in the list of 500 problems that we need to face.2

Wow. Those words come from the man who regularly claimed a few years ago (circa 2014) that language was the “next frontier” of artificial intelligence.3

Four years is quite a long time in the world of AI — enough to legitimately change your mind in the light of evidence (or lack thereof). Recall that NLP in 2014 was awash in the first exciting results in neural machine translation.4 LeCun rode the wave, too: in 2015 he and one of his students put up a rather ambitiously titled preprint called “Text Understanding from Scratch.” (No, they didn’t solve “text understanding.”)

Yann seems to have had a change of heart since those brave words. I think the natural language processing community as a whole has begun to brush off the deep learning hype, too.

One can hope.

  1. Heck, it was a central motivation for my research at that time. I suppose that was a natural consequence of being directly advised by Chris. :) 

  2. The “we” in this quote is ambiguous. I’d guess from context that he was referring to Facebook AI, but he could have also meant to refer to the larger AI research community. 

  3. I recall this distinctive phrasing from several public talks, but we also have some text records. Source 1, Yann’s response to a 2014 AMA: “The next frontier [sic] for deep learning are language understanding, video, and control/planning.” Source 2, quoted in a 2015 article from Cade Metz: “Yann LeCun … calls natural language processing ‘the next frontier.’” 

  4. See e.g. Kalchbrenner & Blunsom (2013); Sutskever, Vinyals, & Le (2014); and Bahdanau, Cho, & Bengio (2014).

The meditation hall at Dhamma Suttama in Montebello, Québec. August 2017

Several weeks ago I sat a 10-day Vipassana meditation retreat at the Dhamma Suttama in Montebello, Québec. In an explicitly secular setting, a team of volunteer teachers demonstrated to around 100 of us students the basics of a meditation technique recovered directly from the teachings of the Buddha.1 These 10 days were excruciating – painful both physically and mentally. But out of the pain and maddening boredom emerged an inner stillness and peace I had never felt before.

I could write a lot more about my personal experience at the retreat. I could document how days of sitting in silence brought about such heightened sensory perception that I felt I had acquired superpowers. I could recall my dreams during those 10 days, more vivid and bizarre than I had ever experienced. Unsurprisingly, there is already plenty of writing like this all over the Internet. And I suggest you read none of it.

Meditation is fundamentally a solitary practice. The mission, after all, is to discover a new way to see the world through your own eyes. This can’t be accomplished by reading blog posts or books, or by discussing your practice with teachers or fellow students. You can only find your own way.

I’ve been reading other people’s accounts of Vipassana retreats after returning from my own, and frankly, a lot of this content underserves the experience. It’s not that there is a dearth of skilled writers discussing the topic. It’s simply that extended meditation yields effects that really can’t be conveyed through language, no matter the skill of the writer.

I can sympathize if that last sentence sounds a little loosey-goosey to you. My analytical-philosopher-mind of early 2017 would have thought the same. But try to think of meditation as a method of practicing a skill – the skill of seeing clearly into your own, first-person experience.2 It’s a skill just like improvisational jazz, or cooking, or racecar driving. If you ask a master of any of these skills to describe what it’s like to engage their mastery, you might be able to get a detailed description of the sights, sounds, tastes, smells, and thoughts involved in the experience. But hearing such a description is not the same thing as having these feelings first-hand, no matter how lengthy or beautifully worded.

The special thing about meditation as skill practice, then, is that the skill we are practicing is necessary for grasping the original lessons of the Buddha. It’s quite fun to study Buddhist notions of suffering, impermanence, and “not-self” at an intellectual level. But we can’t fully grasp these concepts until we directly experience their effects — until we sit down and listen.

So go sit down and listen for yourself! Retreats organized by the international Dhamma organization are free of charge.

But am I prepared?

Good question. I was anxiously asking this question as my own retreat approached. By this point, I had a daily meditation practice but had never participated in any retreat longer than several hours.

You are not prepared if you’re like I was. But this is no reason at all to despair. Even those students returning for their second or third retreat feel they are not physically or mentally prepared for the experience. I suppose the only way to really be prepared for this sort of experience is to achieve enlightenment. Unless you think you can accomplish that before your trip, I think you ought to settle for less.

That being said, I would like to end the post with some actual advice on how to be ready for your own retreat.

Physical preparation

First: resign yourself to the fact that you are not—and will not be—physically prepared. Experienced meditators and newbies alike complain of extreme physical pain during the one-hour sits over the ten days. You are no different.

Having said that, here are some basic items I think it’d be wise to follow in order to minimize your risk of injury.

  1. Research proper meditation posture. There are lots of ways to sit in meditation, and you need to find one (or two, or more) which work well for your body. Whichever position you pick, make sure that it is safe! Your best bet is probably to visit a local meditation group and rely on the teacher’s advice here. An unsafe posture can do real damage to your muscles, bones, and nerves. Be safe!
  2. Sit! Establish a daily meditation practice. You can sit for 5 minutes or 50 minutes at first – just make sure you keep it up every day. As your retreat approaches, begin increasing your daily sit time. You should be comfortable sitting in some position – whichever works for you – for at least 20 or 30 minutes.

Mental preparation

Again: resign yourself to the fact that you are not—and will not be—mentally prepared. But you can do your best with the following:

  1. Remove stressors. Try to leave for your retreat on good terms with your family, friends, and colleagues. Finish big work projects or life projects. Pay off your bills.
  2. Keep a journal. Write about your daily experience and think about what you want to get out of your meditation practice.
  3. Share your journey. Tell your friends and family about your plans and let them interrogate you. Some will be surprised, some won’t understand at all, and some might want to come with you. You can use your social network to help work out for yourself what the retreat is for in the first place.

It’s up to you to begin the exploration. Good luck!

A wooden bridge in the forest near the Dhamma Suttama meditation center in Montebello, Québec. August 2017

  1. Wikipedia has fairly good coverage on the Vipassana movement and its relation to Buddhism. We were given rather orthodox lessons in the tradition of Theravāda Buddhism, with lectures drawing almost entirely on content from the Pāli Canon.

  2. In Pali, vipassana actually means something like “seeing into [your own experience].” Those interested can read an entire article on the meaning of vipassana.

foldl - 9M ago
  1. I like plants. Plants are nice. But are they conscious? Do they think? Follow this very beginner-friendly series on the Brains Blog: Can plants remember? Perceive? Represent? Think?
  2. How is cognitive function distributed across the human brain? A meta-study applies a simple entropy measure over distributions of activations for brain regions across tasks. The findings: both cortical and subcortical regions vary greatly in their task diversity. An assortativity measure also varies greatly across cortex; some regions are functionally cohesive, while others seem to yield a patchwork of different functions.
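
As a rough illustration of what an entropy-based diversity measure over task activations can look like, here is a minimal sketch. I am assuming a normalized Shannon entropy; the exact measure used in the meta-study may differ, and the function name and input format are hypothetical.

```python
import math


def task_diversity(activation_counts):
    """Normalized Shannon entropy of one region's activations across tasks.

    `activation_counts[i]` is how often the region was active in task
    category i (a hypothetical input format). Returns a value in [0, 1]:
    0 means the region responds to a single task, 1 means it responds
    equally to all of them.
    """
    total = sum(activation_counts)
    if total <= 0 or len(activation_counts) < 2:
        raise ValueError("need at least two tasks and one activation")
    # Convert counts to a probability distribution, skipping empty tasks
    # (their contribution to the entropy is zero).
    probs = [count / total for count in activation_counts if count > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the maximum possible entropy for this number of tasks.
    return entropy / math.log(len(activation_counts))
```

A region with counts like `[10, 0, 0, 0]` scores zero (fully specialized), while `[5, 5, 5, 5]` scores one (maximally diverse).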

It’s been quiet here for a while! I have a new series of posts coming up soon, in which I’ll try to specify some desiderata for agents which understand language. That word understand gets thrown around a lot in my circles, with too little critical thought about its meaning.

But enough foreshadowing—for now, I wanted to just share some philosophizing thoughts which have been rolling around in my head for the past few days.

Language is defined by its use. The meaning of the words we use can be derived from the way that they are deployed and the way that people react to them in conversation.

But it’s certainly necessary that this language bottom out in uses that are nonlinguistic. Language doesn’t happen in a vacuum: we deploy it as a tool in all sorts of situations in order to get things done. Take the standard Wittgensteinian language game example:1

The language is meant to serve for communication between a builder A and an assistant B. A is building with building-stones: there are blocks, pillars, slabs and beams. B has to pass the stones, and that in the order in which A needs them. For this purpose they use a language consisting of the words “block”, “pillar”, “slab”, “beam”. A calls them out;—B brings the stone which he has learnt to bring at such-and-such a call.——Conceive this as a complete primitive language.

I think Wittgenstein—and many others—would readily agree that this game is not just a game of words but a game of words causing things and of other things causing words. We can’t fully define the meaning of a word like “slab” without referring to the physical actions of A and B. In this way, linguistic meaning has to bottom out at some point in nonlinguistic facts.2

Bridging principles

Call the above argument a defense of a bridging principle. Generally speaking, a bridging principle is some statement which links entities from a domain or mode A to a domain or mode B, and thereby gives items in B some new sort of meaning. In the case above, we have that nonlinguistic things (grounded objects, physical actions, nonlinguistic cognitive states, etc.) exist in a domain A, and link to words, sentences, etc. in domain B, thereby giving them their meaning.

I wanted to make this post simply to point out that this search for bridging principles is by no means unique to language. There are at least three parallels within philosophy that I can think of off the top of my head:

  1. A central open question in moral philosophy asks whether normative/evaluative statements (“you should be politically involved,” “it is bad to kill people,” …; domain B) bottom out in non-normative statements (physical states of our brain, etc.; domain A). Some people believe that this non-normative domain is the only thing that we can actually use to make our normative statements meaningful. Concretely, the bridge here is from non-normative to normative statements.
  2. In epistemology, we ask whether our (inferential) justified beliefs (“I see a rock over there,” “I am in pain,” …; domain B) might bottom out in things that are not beliefs at all (the perceptual experience of seeing a rock, the sense of pain; domain A). Concretely, the bridge here is from nondoxastic experiences to justified beliefs.
  3. In philosophy of mind, theories of intentional representation attempt to explain how our items of mental content (thinking of the color blue, wanting pizza; domain B) represent things in the real world (blue things, the state of wanting pizza; domain A). These theories explain how our representations bottom out in the real world by some direct causal chain, normative conditions, etc. Concretely, the bridge here is from real-world things to representations of those things.

The case of linguistic meaning is certainly very close to #3, though I’m not yet sure how to unify the two (or if they can be unified).

I’m not sure what to do next with this information. In any case, I find it pleasing to recognize that a pile of nominally separate disciplines are actually all engaged in rather similar activities at a high level.

  1. Philosophical Investigations, §2 

  2. John Searle calls this system of nonlinguistic facts a cognitive “Background.” Where we locate the Background — whether in the brain or in the real world — is not very relevant for the purposes of this post. 

foldl - 9M ago
  1. Whither Speech Recognition?
  2. Is the mind just an accident of the universe? A very brief introduction to panpsychism.
  3. Three ways you might still be racist. Eric Schwitzgebel spells out how eliminating consistent overt racist behavior is not the same thing as not being a racist. I think it’s very good to clearly designate these other less obvious moral violations. As a result, we’ll be forced to recognize our own “moral mediocrity” (Eric’s phrase). The unacceptable alternative is to pretend we are morally perfect and not leave any room to keep learning. This is the default state of the non-critical modern liberal.