As the idea and importance of displaying uncertainty are not going away any time soon, I wanted to document my small contributions to this area with the hopes that it could spark inspiration for future (uncertain) work down the road.
Axis of Uncertainty
For any topic, it can be useful to define a loose taxonomy around it - to facilitate the exploration of its ‘space’. The space won’t be perfect, but it helps expand from existing work to possible future work (plus it makes you seem more of an expert!).
For uncertainty visualization, we started conjuring a space along two axes: the possible methods that could be used to visualize the uncertainty, and the data type of the “uncertain” data we are trying to visualize.
Now, visualization possibilities are what I typically think of first when I think about visualizing uncertainty. Nathan Yau has a great post on visualizing uncertainty where he outlines a number of different options, all of which (in my opinion) deal directly with encoding choices.
If I were to rename his headers a bit, the possible options I see him suggesting include:
These categories are a bit generic, and are probably not comprehensive, but provide one way to slice up the uncertainty visualization space.
The other dimension we might split on is around different types of data that exist and thus can have uncertainty associated with them. Here I’m talking about statistical data types as opposed to data types of a particular programming language. Briefly, the main data types I like to think about are:
Nominal (aka categorical - the values are discrete, with no inherent order)
Ordinal (where the values are discrete, but order matters)
Quantitative (aka continuous or real-valued)
To this basic list, we can add a few more types that I believe warrant separating:
Temporal (dates, hours, minutes - time is special!)
Spatial (or geographical, e.g. lat & lon values)
An Uncertain Taxonomy
With the possible values set for these two axes, we get a grid of possibilities to explore. And everyone knows that when there’s a grid, there’s a taxonomy! This table provides a tiny framework by which we can start to categorize existing uncertainty visualizations. It also provides a bit of inspiration for where one might explore to create novel uncertainty visuals.
This grid is super coarse in its capabilities to catalog. It’s my thought that each of the above “visualization methods” could be expanded further, either based on particular encoding options or perhaps by which visual metaphors are used, which would form a sort of taxonomy cube!
Our ultimate hope with our experimentation was to explore this entire cube, but for now I will just provide some examples from a tiny corner of the taxonomy.
(Also, I think there is a lot more to say about visual metaphors in data vis, but that is a topic for another time).
Uncertainty and Simulation
The area I really wanted to explore more was how animation could be used to visualize the notion of simulating possible outcomes from an uncertain future. As the reactions to the NYT Election Needle indicated, animation can be very motivating when displaying uncertainty.
Fortunately, there is some prior work in this area. Jessica Hullman et al. have developed Hypothetical Outcome Plots (HOPs), which I see fitting very nicely in the grid for using animation to visualize uncertainty around quantitative values. You can find a nice write-up on the UW Interactive Data Lab Blog.
Here are some basic recreations of the HOP plot idea where I’ve added a stop/start button, but left out any scale or annotation.
A slow HOP plot:
A fast HOP plot:
As an exercise, we experimented with showing a history of simulations using onion skinning, which I deemed the “Onion HOP Plot”:
A slow Onion HOP plot:
A fast Onion HOP plot:
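Mechanically, the HOP idea boils down to repeated sampling: each animation tick draws one hypothetical outcome from the estimated distribution and renders only that draw. Here is a minimal sketch of that loop - the Box-Muller sampler is just one illustrative choice, not necessarily what the original authors used, and `render`, `estimateMean`, and `estimateSd` are hypothetical names:

```javascript
// One animation tick of a HOP: draw a single hypothetical outcome.
// The Box-Muller transform turns two uniform draws into a normal draw.
function normalSample(mean, sd, rand = Math.random) {
  const u = 1 - rand(); // shift away from 0 to avoid log(0)
  const v = rand();
  return mean + sd * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Each tick, re-render the chart showing only this single draw:
// setInterval(() => render(normalSample(estimateMean, estimateSd)), 400);
```

The interval length is exactly the slow-vs-fast knob shown in the examples above.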
So, using this existing work as guidance, we started thinking about how animation could be used to display uncertainty for these other data types. I’m excited to start filling in two additional little boxes in the grid: animation for categorical uncertainty, and animation for geographic uncertainty.
Let’s start with the simpler of the two: using animation to visualize uncertainty around categorical data.
Consider some categorical data that can each be one of a few possible values. Say also that these possible values can be represented as icons. Here is our universe of possible iconic values:
The kicker is that for any particular data point, we aren’t 100% certain which value it should be classified as. And so for each data point we have probabilities associated with a subset of these categories.
The specifics of the data are hopefully somewhat generic - and not all that important to the visualization method. Perhaps this is an automatic labelling of a set of images by a computer algorithm, or some other classification problem with a finite number of possible categories.
Our Morphing Icons idea is a pretty simple one: take those probabilities associated with each possible category and show each icon for a length of time based on that probability. So, if we are 90% certain an item is of type A and 10% it is of type B, then show type A’s icon 90% of the time, and type B’s 10%.
Here are examples using 2 and 3 possible icons:
Morphing Icon with 2 types:
Morphing Icon with 3 types:
The morphing icon on the left is showing 50% A and 50% C probabilities. The one on the right is showing 70% A, 20% D, and 10% E.
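The timing rule is simple enough to sketch in a few lines of JavaScript - the names here are illustrative, not from our actual implementation:

```javascript
// Convert per-category probabilities into display durations for one
// loop of the morphing icon animation.
function iconSchedule(probs, loopMs = 2000) {
  return Object.entries(probs).map(([icon, p]) => ({ icon, ms: p * loopMs }));
}

iconSchedule({ A: 0.5, C: 0.5 });
// → [{ icon: 'A', ms: 1000 }, { icon: 'C', ms: 1000 }]
```

A renderer would then just cycle through the schedule, cross-fading from one icon to the next.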
Here are a few other examples, just for fun:
Of course there are all sorts of caveats and limitations to this approach to displaying categorical uncertainty. It breaks down if there are too many possible categorical values, or if there are no acceptable icon representations. It’s also unclear how long each loop of the animation should take.
Also, while this visual certainly uses animation, it is not really a simulation of the possible data. A real simulation would draw results based on the associated probabilities and thus flash back and forth between categories rapidly, based on which one was selected each iteration.
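For contrast, the "real simulation" variant might look like the sketch below: instead of fixed durations, each iteration picks a category at random according to its probability (the `rand` parameter is injectable only so the behavior is deterministic for testing):

```javascript
// Draw one category according to its probability (a weighted draw).
function sampleCategory(probs, rand = Math.random) {
  let r = rand();
  for (const [icon, p] of Object.entries(probs)) {
    r -= p;
    if (r < 0) return icon;
  }
  // guard against floating point round-off when probabilities sum to ~1
  return Object.keys(probs).pop();
}

// Each tick shows whichever category was drawn, so the icon flickers in
// proportion to the probabilities rather than on a fixed schedule.
```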
All that being said, I still enjoyed this representation. One could imagine a toggleable ‘detail’ view that showed the probabilities explicitly when needed.
The other prototype I wanted to show also uses animation to show uncertainty, but this time for spatial data.
Here we have a very similar problem. Say we have events that happen somewhere in a particular region of the world. We want to display them, but don’t know exactly where they happened. For whatever reason, we only know the location of each event at some resolution.
For example, the data might place one group of events at a particular city, while for another group we might only know the county or state they occurred in. These resolution levels thus indicate the amount of uncertainty associated with each event location.
One of the approaches we experimented with for tackling this type of uncertainty with animation was what we called Wandering Dots. The idea, again, is nothing mind-blowing, and has plenty of drawbacks, but is still interesting (at least to me).
With Wandering Dots, each event is represented as a dot (genius!). Each dot is animated to indicate the resolution of the data. Dots wander far (have a larger radius) for events that we are more uncertain about.
Wandering Dots with a large radius:
Wandering Dots with a small radius:
The wandering is random within the constraints of their assigned radius. But the dots maintain a fluid trajectory, instead of hopping all over the place, to allow the following of an individual event, if need be.
Wandering Dots with many points:
Wandering Dots with few points:
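A single wandering step can be sketched like this - the dot keeps a heading that drifts a little each frame (which gives the fluid motion), and is steered straight back toward its home position whenever a step would take it past its radius. All names are illustrative, not from our actual implementation:

```javascript
// Advance one wandering dot by one frame. `dot` holds its current
// position (x, y), its home position (cx, cy), and its heading.
function stepDot(dot, radius, drift = 0.3, rand = Math.random) {
  dot.heading += (rand() - 0.5) * drift; // gentle heading drift
  let x = dot.x + Math.cos(dot.heading);
  let y = dot.y + Math.sin(dot.heading);
  if (Math.hypot(x - dot.cx, y - dot.cy) > radius) {
    // too far out: turn back toward the home position instead
    dot.heading = Math.atan2(dot.cy - dot.y, dot.cx - dot.x);
    x = dot.x + Math.cos(dot.heading);
    y = dot.y + Math.sin(dot.heading);
  }
  dot.x = x;
  dot.y = y;
  return dot;
}
```

Because only the heading changes each frame, the path stays smooth; the radius is what encodes the uncertainty.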
Briefly, we also explored the idea of using a solid plane to represent this uncertainty, called Wandering Hulls.
Wandering Hull with large radius:
Wandering Hull with small radius:
The idea was that we could perhaps use a choropleth approach with these hulls. However, the lack of access to individual points, as well as worries around occlusion, made us abandon this idea from further exploration.
Bounded Wandering Dots
Back with our dots: oftentimes the level of resolution for an uncertain spatial data point corresponds with a particular geographic region - like a county or state.
Peter Beshai made a clever variant, the Bounded Wandering Dot which constrains the wandering to a specific shape.
Wandering Dots bounded by a square:
Wandering Dots bounded by a map region:
Pretty cool right?
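The bounding logic itself is a small addition to the wandering step: a rejection test against the shape. Here's a sketch using an axis-aligned box standing in for an arbitrary region - for a real map shape you could swap the box test for a point-in-polygon test such as d3.geoContains (the names are illustrative, not Peter's actual code):

```javascript
// One step of a bounded wandering dot: propose a small random move,
// and keep it only if it stays inside the bounding region.
function stepBounded(dot, bounds, step = 1, rand = Math.random) {
  const angle = rand() * 2 * Math.PI;
  const x = dot.x + Math.cos(angle) * step;
  const y = dot.y + Math.sin(angle) * step;
  const inside =
    x >= bounds.x0 && x <= bounds.x1 && y >= bounds.y0 && y <= bounds.y1;
  if (inside) {
    dot.x = x;
    dot.y = y;
  }
  return dot; // otherwise the dot simply stays put this frame
}
```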
An Uncomfortable Analysis
As part of our research, some of these prototypes were cleaned up and incorporated with other visuals into a dashboard-style tool for a small group of analysts to try out for a set of decision making activities.
Unfortunately we didn’t perform a controlled academic study around these representations, but we did get some initial impressions and feedback from these users.
One of the main pieces of feedback that has stuck with me is that the shifting visual representation made them feel a bit uncomfortable. But uncomfortable in a good way!
With more traditional uncertainty visualizations it can be easy to see the level of uncertainty, but ultimately ignore it, choosing the average over the error-bars, for example. While simpler aggregates are easy additions to these visuals, the animation provides perhaps a more visceral experience of the uncertainty that is harder to ignore.
Filling in the Cube
These quick examples are almost certainly not the “best” ways to represent uncertainty in these data types, but they are a start.
With more research in this area, I’m excited about the expansion and improvement to our informal taxonomy, as well as how other areas of the cube will be filled in with more ways to show the unknown. There are a lot of visual encodings and associated visual metaphors that can, and should be experimented with in this area. I’m looking forward to experimenting more in the future - and seeing how others tackle this uncertain issue!
Most of these more common map types focus on a particular variable that is displayed. But what if you have multiple variables that you would like to present on a map at the same time?
Here is my attempt to collect examples of multivariate maps I’ve found and organize them into a loose categorization. Follow along, or dive into the references, to spur on your own investigations and inspirations!
Before we begin, certainly you’ve heard by now that, even for geo-related data, a map is not always the right answer. With this collection, I am just trying to enumerate the various methods that have been attempted, without too much judgement as to whether it is a ‘good’ or ‘bad’ encoding. Ok? Ok!
With the interactivity available to the modern map maker, it is not surprising that extending into the third dimension is a popular way to encode data.
The above uses color and 3D height to encode natural gas and electric efficiencies of various neighborhoods in Chicago. It doesn’t provide freeform rotation, but does allow you to rotate to different cardinal directions, which helps with the occlusion. This tool also provides a detailed census block view of the data after clicking a neighborhood.
This was created by Doug McCune. It is not really multivariate, but I always really loved the style where he retains the basemap visual but uses hillshading to show geo-data in a very organic way. It seems like a technique that should have caught on more.
Taking the idea from exact shapes toward less precise icons are CartoDB’s Data Mountains. These maps use color and “mountain” size to encode multiple variables. The idea reminds me very much of geo-based Joyplots, like this great “Joymap” from Andrew Mollica showing the population density of Wisconsin:
The idea of using color alone to represent multiple pieces of data may seem strange, but it can happen! Let’s take a look at a few examples.
Originally created in 2009 by Shawn Allen while he was at Stamen, this artistic piece no doubt influenced the trivariate choropleth we just looked at. With the city-level data in the dot map, you can see more interesting patterns (if you are familiar with San Francisco).
If one of the variables you are visualizing is categorical in nature, it is possible to show a multitude of maps, one for each category. That is what we find in the map below.
Hopefully this was a fun romp through the strange possibilities of multivariate map displays. Come back to this page for potential inspiration or jumping off points the next time someone demands a map for your complex data.
Of course I’m not the only one who likes collecting, nor the first to ponder multivariate map encodings. For more, check out the great Axis Maps Thematic Cartography Guide which includes a multivariate section.
For OpenVis Conf 2016, we had the wonderful opportunity of offering workshops to conference goers.
In collaboration with my amazingly talented coworker, Yannick Assogba, we created a brand new, full day, text analysis and visualization workshop. It was a lot of hard work, but it was all worth it. For one magical day, we overcame internet difficulties and hurdled bad IT policy to provide an experience that deeply impacted my teaching approach and was universally enjoyed by our 30+ attendees.
We started with a series of Jupyter notebooks that familiarized everyone with the basics of working in an iterative notebook environment. We covered the basics of notebook use and of python - in case folks were unfamiliar.
Then we dove into text analysis, starting at the sentence level and working our way up to corpora. We looked at how to turn raw text into data, and a number of different metrics useful for understanding and visualizing the text.
Attendees each copied these notebooks to their machines, and used them as starting points for hands-on exercises throughout the morning.
After learning ‘enough to be dangerous’ about text analysis methods, we jumped over to the visualization side of things. Yannick and I gave a short intro to D3 and Processing.js before diving into an exploration of text visualization examples. The content for this presentation was organized based on visual encoding strategies of the text visualization methods.
The day ended with a short collaborative hackathon. Small groups were formed based on common interests and we all got to work building our own text visualizations. The results were impressive! Even after a full day of workshopping, the amount of interaction and the progress made by the teams was far more than I expected.
It was a great day, and I believe from the reviews and the interactions I’ve had since then that folks really enjoyed it.
regl is a technology meant to simplify building awesome things with WebGL. Recently, Mikola Lysenko, one of the creators of regl, gave a great talk at OpenVis Conf that got me wanting to spend more time with WebGL and regl in particular - to see how it could be applied to my data visualization work.
With this tutorial, I hope to share my brief learnings on this wonderfully mystical technology and remove some of that magic. Specifically, how do we make the jump from the interesting but not so applicable intro triangle example:
To using regl for some sort of data visualization? At the end of this tutorial, hopefully you (and I) will find out!
We will start with the triangle. Try to understand WebGL and regl in the context of this example, then work to modify it to expand our understanding. We will cover some basics of WebGL and shaders, setting up your development environment to work with regl, and then explore a subset of regl’s functionality and capabilities.
The final result will be a data stream visualization:
This falls short of the amazing regl visualizations Peter Beshai has produced and discussed recently, but hopefully this tutorial can serve as a stepping stone towards this greatness.
And if you want to just skip to the results, I’ve made a bunch of blocks to illustrate the concepts covered:
First, it might be useful to step back a bit and talk at a high level about WebGL and regl and what they are good for. If you know all this, feel free to skip ahead.
But it is also a crazy confusing API for folks just getting started who are not familiar with these types of systems. WebGL is a graphics pipeline that includes a number of steps used to get an image rendered to the screen. The API is very low-level, and it can take a lot of work just to get something on the screen.
What is regl?
The regl framework simplifies the task of developing WebGL code by providing a more functional interface. It takes cues from the React world, providing a consistent and familiar way to pass state and properties into the WebGL code.
So, you still have to write some WebGL, but the parts that you are responsible for writing are simpler, isolated, and have a defined signature.
Throwing Shade with Shaders
The WebGL code you still need to write are known as shaders. Shaders are functions written in a special C-like graphics language called GLSL or OpenGL Shading Language (OpenGL being the standard WebGL is based on).
There are different types of shaders, specifically two types:
Each type is responsible for configuring a different portion of the WebGL rendering pipeline. You need to implement both to make a complete WebGL program.
A Vertex shader is given each point (vertex) of the thing that is being rendered (triangle, sphere, rabbit, dataset) and its job is to determine the position of each of these vertices in WebGL space. If we ponder this idea from a data visualization / D3 perspective, this is kind of like implementing the most specific d3.scale ever. Each vertex is a data point, and the shader is a scale that maps each input point to a location in space and time for your specific visual.
A Fragment shader deals with the visual display of the things rendered in the WebGL pipeline to the screen. Specifically, they need to set the color for each pixel that is being displayed. (Why is it called a fragment shader and not a pixel shader? Good Question!).
As an aside, shaders are called shaders because fragment shaders are used to control lighting and other special effects when using GLSL in game development.
We won’t go into the details of GLSL in this tutorial, but hopefully the simple shaders we use aren’t too confusing. I’d suggest reading a bit of The Book of Shaders if you haven’t seen GLSL at all before - it provides a nice smooth introduction (though it focuses solely on fragment shaders).
As an aside, The Book of Shaders also has a great shader editor you should check out that includes all sorts of nice features. You can learn more about it here.
Here are some other resources that I shamelessly borrowed from, and might cover these concepts more elegantly than me:
Before we put the pedal to the metal and implement these concepts in our very own regl program, let’s take a moment to set up an environment that helps facilitate an exploration of these new technologies in a way that doesn’t incite us to throw our computers into the ocean out of frustration.
My solution to reducing frustration with new technologies is typically to supplement existing tools I like with new features. To this end, I’m going to suggest some Atom plugins to use that could make working with GLSL code easier for you.
But there are many other approaches! Feel free to ignore these recommendations and skip ahead, if you have a different methodology for WebGL development.
Also, I’ve included Blocks for each of the steps in the tutorial - which work without any additional setup - so if you don’t want to set up your coding environment now, you could just start forking those!
Atom Packages for GLSL Fun
Here are the Atom packages I would recommend using as we get started. Each can be installed via Atom’s “Preferences” menu.
First, grab the language-glsl package for some nice syntax highlighting of our GLSL code. Initially our GLSL code will be written inline as strings, but eventually we will write this code in .glsl files, so this package will come in handy then.
Next, you might be interested in autocomplete-glsl, which gives you handy autocompletion of GLSL types and functions. It even provides links to the documentation for each function!
Finally, I never leave home without a linter - and linter-glsl provides nice inline warnings and errors in your code so you can catch them early and (hopefully) avoid hours of glaring angrily at your screen just because you forgot a ‘.’ somewhere (it might still happen though!).
To get the linter working, you need to install glslangValidator - which if you are on a Mac and use homebrew you can do easily:
brew install glslang
A Baseline for good regling
Ok, after far too much yawning, let’s get to some code. Here we will add the necessary JS packages to our development environment - so again skip ahead if you are just working with the Blocks for now. Most of this section is a rehash of the lovely from nothing to something in WebGL blog post by Ricky Reusser - so feel free to use the original source.
You can just hit enter to select the defaults for the project if you like - or tweak them as necessary. This command adds a package.json file to our new regl-start directory. We will use this file to manage the npm packages we will use.
Some Nice-to-have Packages
We will install a few packages to get things up and running quickly. The commands to run inside your project are:
Next we create a regl draw command by calling regl(). As the docs state, draw commands are the main point of the library. They allow for packaging shader code, as well as state and input parameters into a single reusable command.
Minimally, a regl draw command takes an object with a few parameters:
frag which specifies the fragment shader code (written in GLSL).
vert which specifies the vertex shader code (again in GLSL).
count which indicates the number of vertices to draw.
In this example, both shaders are written as one big string (we will see how to improve this setup later). This draw command also provides more parameters: attributes and uniforms which we will look at below.
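To make the discussion concrete, here is a hedged reconstruction of that draw command, modeled on regl's canonical triangle example - treat the exact shader strings and vertex values as illustrative:

```javascript
// The object we hand to regl() to build a reusable draw command.
const drawConfig = {
  // fragment shader: color every pixel of the triangle with `color`
  frag: `
    precision mediump float;
    uniform vec4 color;
    void main() {
      gl_FragColor = color;
    }`,

  // vertex shader: place each vertex directly at its input position
  vert: `
    precision mediump float;
    attribute vec2 position;
    void main() {
      gl_Position = vec4(position, 0, 1);
    }`,

  attributes: {
    position: [[-1, 0], [0, -1], [1, 1]] // one entry per vertex
  },

  uniforms: {
    color: [1, 0, 0, 1] // opaque red, as an RGBA vec4
  },

  count: 3 // three vertices → one triangle
};

// In the browser: const drawTriangle = regl(drawConfig); drawTriangle();
```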
Now let’s recap the purpose of these two shaders, and look a bit at how they work.
The vertex shader needs to accomplish its goal of positioning each vertex. It is called once for each vertex and needs to set the gl_Position global during each call.
The fragment shader needs to set the color for each fragment. It does this by setting the gl_FragColor global each time it is called.
No matter what else happens in these shaders, these two variables (gl_Position and gl_FragColor) are what need to be set.
Also note the general structure of a shader. You start with the declaration of variables and then use these variables in your shader’s main() function. The precision mediump float; line sets the precision of floating point values.
We run the draw command by calling it on the last line:
And a triangle is born!
We can see some of the interactions between regl and shader code - but not everything is immediately clear. For example, we see color listed in the uniforms section of the regl command, and then we see uniform vec4 color specified and used in the fragment shader - but what is a uniform?
Let’s talk more about the different variable types in shaders, then come back to see how we work with these in regl.
The Many Shader Variable Types
In shader land, there are three types of variables you can declare. They are all confusing, but I like the explanation provided by html5rocks, so I’ll try to summarize here:
Uniforms are accessible in both the vertex and fragment shader. They are ‘uniform’ because they are constant during each frame being rendered. But (as we will see below), their value can change between frames.
Attributes are only provided to the vertex shader. There is a value for each vertex being displayed. The individual value is provided to the vertex shader when dealing with that specific vertex.
Varyings are variables declared in the vertex shader that can be shared with the fragment shader. We won’t use this in the rest of the tutorial, but it’s good to know they exist!
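Side by side, the three declaration flavors look like this in a vertex shader (a sketch, held in a JS string the way our inline shaders are):

```javascript
// All three variable types declared in one (hypothetical) vertex shader.
const vertSketch = `
  precision mediump float;
  uniform vec4 color;       // same value for every vertex in a frame
  attribute vec2 position;  // a distinct value per vertex
  varying vec2 uv;          // set here, readable in the fragment shader
  void main() {
    uv = position;
    gl_Position = vec4(position, 0, 1);
  }`;
```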
regl and Variables
So finally we are getting to what makes regl interesting and worth trying out: how it organizes what you pass in to your shader functions.
Now that we know that uniform and attribute are variable types, I bet you can guess what the uniforms and attributes parameters of the draw command object are for, right? They allow us to specify the values of variables that are accessible in our shaders!
Let’s break it down a bit more.
Our uniforms parameter looks like this:
color: [1, 0, 0, 1]
This indicates that there will be a color uniform available to the vert and frag shaders. This color variable is a vec4 - a vector of length 4 - as seen in the declaration in the fragment shader, so it is declared here as an array with 4 values.
(Check out the Book of Shaders Color Chapter to learn more about how colors are defined in GLSL).
We can see we define one attribute, position, that is an array with 3 values. As we have our count parameter set to 3, the vertex shader code in vert will be run 3 times, each time with the position attribute set to the corresponding value from the position array inside our attributes.
Note that the coordinate system is a bit different than what you might expect. The x and y scales go from -1.0 to 1.0 as shown in this handy diagram from MDN’s WebGL Concepts.
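If you're used to pixel coordinates from canvas or SVG work, a tiny conversion helper makes the mapping explicit:

```javascript
// Map a pixel coordinate (origin top-left, y pointing down) into WebGL
// clip space (origin at center, both axes from -1 to 1, y pointing up).
function toClipSpace(px, py, width, height) {
  const x = (px / width) * 2 - 1;
  const y = -((py / height) * 2 - 1);
  return [x, y];
}

toClipSpace(0, 0, 800, 600); // top-left pixel → [-1, 1]
```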
So that’s how this triangle gets drawn. Kind of cool, right?
But wait there’s more!
Currently in this example, the input uniforms and attributes are all static values. But it doesn’t have to be this way!
To see some of these benefits, let’s convert our triangle from a frame that is displayed just once, to a visual that is displaying over and over through time.
A Triangle In Time
Currently our triangle is rendered in one shot. It is displayed and then it is done. In order to see the benefits of dynamic data, we want to render this triangle over and over again so we can pass in different values each render.
Looping can be done in many ways, but regl provides regl.frame for just such a purpose. This allows us to call our drawTriangle draw command inside a requestAnimationFrame().
In the Console, you should see our tick count. And it should be increasing!
(I would remove that console.log after verifying that it is indeed looping - cause it clogs up the Console and slows things down).
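Wrapped up, the loop might look like the sketch below - it assumes the `regl` instance and `drawTriangle` command from earlier are in scope, and throttles the tick log so the Console stays usable:

```javascript
// Start the per-frame render loop. regl.frame calls our callback once
// per animation frame, passing in the context (tick, time, ...).
function startLoop(regl, drawTriangle) {
  return regl.frame((context) => {
    if (context.tick % 60 === 0) console.log(context.tick); // throttled log
    regl.clear({ color: [0, 0, 0, 1], depth: 1 });          // wipe the frame
    drawTriangle();
  });
}
```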
The context variable is something that regl populates with a few values. Let’s return back to regl and talk about the ways it allows inputs to the draw commands.
regl Inputs: Context, Props, and this
So we have learned that a core feature of regl is handling inputs to our shaders. We know that shader variables come in 3 flavors: uniforms, attributes, and varyings. And while the triangle example we have been working with deals with static versions of these inputs, I indicated that dynamic inputs were also possible. So let’s find out how!
Context: Context variables are state accessible between child draw commands (which we won’t look at here). It is also the place where regl keeps a number of variables you might find useful - including tick and time.
Props: Props are the most common way to pass in data to a regl draw command. You can pass them in as an input object when you call the draw command.
this: The this variable can be used to save state within a regl command. We won’t look more into this here - but something to keep in mind.
If you are familiar with React, as the regl documentation states, you can use this knowledge to better understand these input options.
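To wire a prop into the draw command, we would declare the uniform with regl.prop and then pass a fresh value on every call - for example, a color that pulses over time (the helper below is a hypothetical example, testable on its own):

```javascript
// Inside the draw command config we would declare:
//   uniforms: { color: regl.prop('color') }
// and then compute a new prop value each frame:
function pulsingColor(time) {
  const t = (Math.sin(time) + 1) / 2; // oscillates between 0 and 1
  return [t, 0, 1 - t, 1];            // fades between blue and red
}

// regl.frame(({ time }) => drawTriangle({ color: pulsingColor(time) }));
```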