Distributions in d3

Something that often comes up is visualizing data in three dimensions. I don't mean 3D printing; I mean something more like heatmaps. With a heatmap, you can vary two variables (the x and y dimensions) and see their effect on a third variable, usually encoded as color. You can also add texture and size to increase the number of dimensions, but let's not go overboard. (Here's one of my favorite heatmaps, by the way.)

When performing a typical analysis, we might make some heatmaps in Excel and be done with it. Here is a heatmap in Excel:

 Awesome Excel heatmap.

Notice how fun and exciting it is. Notice all the interactivity. Also notice the sarcasm. It's as painful as this graphic is boring.

Working in tech, I wondered if I could improve on our Excel iteration using every data-nerd-cum-visualizer's favorite new tool: d3. I started with this static example (which was very helpful in getting started) and made some tweaks, namely:

  1. Changed the color scheme
  2. Added tooltips and hover styles for interactivity and feedback
  3. Allowed the user to generate random numbers with a click, coloring the tiles accordingly
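
The tweaks above boil down to generating a value per tile, mapping it to a color, and redrawing on click. Here is a minimal sketch of the first two steps in plain JavaScript (no d3, so it runs anywhere). The names randomGrid and valueToColor are hypothetical, and the white-to-blue ramp is an assumption for illustration, not the color scheme used in the actual visual.

```javascript
// Hypothetical helper: build a rows x cols grid of values from any
// zero-argument sampler function.
function randomGrid(rows, cols, sampler) {
  var grid = [];
  for (var i = 0; i < rows; i++) {
    var row = [];
    for (var j = 0; j < cols; j++) {
      row.push(sampler());
    }
    grid.push(row);
  }
  return grid;
}

// Hypothetical helper: linearly map a value in [min, max] (assumes min < max)
// to a color between white (low) and solid blue (high).
// Values outside the range are clamped to the nearest end.
function valueToColor(value, min, max) {
  var t = (value - min) / (max - min);
  t = Math.max(0, Math.min(1, t));
  var r = Math.round(255 * (1 - t));
  var g = Math.round(255 * (1 - t));
  return "rgb(" + r + "," + g + ",255)";
}

// Fill a 3 x 4 grid with uniform values and turn each into a color string.
var grid = randomGrid(3, 4, Math.random);
var colors = grid.map(function (row) {
  return row.map(function (v) { return valueToColor(v, 0, 1); });
});
console.log(colors);
```

In d3 itself, a linear scale with a color range can take the place of valueToColor, and the click handler would simply regenerate the grid and re-apply the fill.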

However, I didn't want those random numbers to be uniform, i.e. some random number between 0 and 1. Rather, I wanted to let the user choose which distribution to sample from. In the process of remaking this visual, I stumbled upon the fact that JavaScript doesn't make this easy: the built-in Math.random() only samples uniformly from [0, 1), and there is nothing built in for non-uniform distributions.

JavaScript is a web scripting language, not one built for statistical analysis. So I decided to write my own distribution-generating functions.

Below is code to sample from a Normal distribution using the polar form of the Box-Muller transform (also known as the Marsaglia polar method):

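function box_muller(mu, sigma) {
  var x, y, r, co;

  // Rejection step: draw points uniformly in the square [-1, 1) x [-1, 1)
  // until one falls strictly inside the unit circle and is not the origin
  // (r === 0 would make the log and the division below blow up).
  do {
    x = 2 * Math.random() - 1;
    y = 2 * Math.random() - 1;
    r = x * x + y * y;
  } while (r >= 1 || r === 0);

  // Scale factor that turns (x, y) into a pair of standard Normal values.
  co = Math.sqrt(-2 * Math.log(r) / r);

  // Only x * co is used; the second value of the pair (y * co) is discarded.
  return x * co * sigma + mu;
}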

That's not so bad, huh?
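
A quick way to convince yourself that a sampler like this behaves is to draw a lot of values and compare the sample mean and standard deviation with what you asked for. The helper below is a rough sketch; sampleStats is a hypothetical name, and the demo runs on Math.random so the block stands on its own.

```javascript
// Hypothetical helper: draw n values from a zero-argument sampler and
// return their sample mean and (population-style) standard deviation.
function sampleStats(sampler, n) {
  var sum = 0, sumSq = 0;
  for (var i = 0; i < n; i++) {
    var v = sampler();
    sum += v;
    sumSq += v * v;
  }
  var mean = sum / n;
  // Guard against tiny negative values caused by floating-point rounding.
  var variance = Math.max(0, sumSq / n - mean * mean);
  return { mean: mean, sd: Math.sqrt(variance) };
}

// Demonstrated on Math.random, whose true mean is 0.5 and whose true
// standard deviation is sqrt(1/12), roughly 0.289.
var stats = sampleStats(Math.random, 100000);
console.log(stats.mean.toFixed(3), stats.sd.toFixed(3));
```

To check the Normal sampler instead, pass function () { return box_muller(10, 2); } as the sampler; the mean and standard deviation should come out close to 10 and 2.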

I included some code to generate a few other distributions as well. As expected, the distribution of the tile colors varies depending on the distribution you click on, so you can get a sense of what distinguishes a Normal from an Exponential by clicking around.
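
For a flavor of what another sampler can look like, here is a sketch of an Exponential generator using inverse transform sampling. This is an illustration under my own assumptions, not necessarily the code behind the visual; the function name exponential and the rate parameter lambda are hypothetical.

```javascript
// Hypothetical sampler: Exponential distribution with rate lambda (> 0),
// drawn by inverse transform sampling.
function exponential(lambda) {
  // Math.random() returns a value in [0, 1), so 1 - u lies in (0, 1]
  // and Math.log never receives 0.
  var u = Math.random();
  return -Math.log(1 - u) / lambda;
}

// Average many draws; the result should be close to 1 / lambda.
var total = 0, count = 100000;
for (var k = 0; k < count; k++) {
  total += exponential(2);
}
console.log((total / count).toFixed(2));
```

Unlike the Normal, every draw here is non-negative and most values pile up near zero, which is why the Exponential tiles skew toward one end of the color scale.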

Oh yeah, here's the resulting heatmap:


Feel free to peruse the code here.

That's all for now!
