5 Crucial Concepts for Learning d3.js and How to Understand Them

By Adam Janes

You may have already heard of d3.js, the dazzling JavaScript library that lets you create beautiful charts and graphics with just a few lines of code. You might have seen some of the fantastic examples of D3 in action, or you may have heard that the New York Times uses it to create its interactive visual stories.

If you’ve ever tried to dip your feet into the world of D3, then you’ll already be familiar with its famously steep learning curve.

You just don’t get to build things right out of the box with D3.

With its confusing method chains, alien syntax, and black-box functions that seem to work by magic, D3 can quickly seem like more hassle than it’s worth. But fear not, because D3 gets much easier once you understand just a few key concepts.

I want to take you through a simple tutorial, explaining 5 of the most common areas of confusion that beginners face when starting out with D3.

We’re going to create a dynamic scatter plot, which updates every second between two different sets of data:

Take a moment to appreciate the little details here. Check out how smoothly these dots slide across the screen. Look at how they fade gently in and out of view. Behold the calm sway of our axes between their different values.

These are actually some of the easiest features to implement in D3. Once you get through the initial struggle of figuring out the basic building blocks of the library, adding in this kind of thing is a piece of cake.

Before we get ahead of ourselves, let’s talk about what D3 actually is.

D3 stands for Data-Driven Documents.

The data can be absolutely anything, which is part of what makes D3 so powerful. Most of the time in D3, you’ll want to read this data in from a file, but for this example we’ll just be using two arrays stored as variables:

    var data0 = [
        { gpa: 3.42, height: 138 },
        { gpa: 3.54, height: 153 },
        { gpa: 3.14, height: 148 },
        { gpa: 2.76, height: 164 },
        { gpa: 2.95, height: 162 },
        { gpa: 3.36, height: 143 }
    ];

    var data1 = [
        { gpa: 3.15, height: 157 },
        { gpa: 3.12, height: 175 },
        { gpa: 3.67, height: 167 },
        { gpa: 3.85, height: 149 },
        { gpa: 2.32, height: 165 },
        { gpa: 3.01, height: 171 },
        { gpa: 3.54, height: 168 },
        { gpa: 2.89, height: 180 },
        { gpa: 3.75, height: 153 }
    ];

The documents part of D3 refers to the Document Object Model (DOM). D3 is all about moving elements around on the page, based on what the data is saying. Specifically, we’re working with special shape elements called SVGs.

Crucial Concept #1 — Working with SVGs

So here we come to the first challenging concept that every D3 newbie has to deal with. You immediately need to get a good grasp on a special type of markup which you might not have seen before.

Here’s what SVG markup might look like:

    <svg width="400" height="60">
        <rect x="0" y="0" width="50" height="50" fill="green"></rect>
        <circle cx="90" cy="25" r="25" fill="red"></circle>
        <ellipse cx="145" cy="25" rx="15" ry="25" fill="grey"></ellipse>
        <line x1="185" y1="5" x2="230" y2="40" stroke="blue" stroke-width="5"></line>
        <text x="260" y="25" font-size="20px" fill="orange">Hello World</text>
    </svg>

If we place this snippet into an HTML document, then our browser will interpret it like this:

Basically, each of these SVGs has a set of attributes which our browser uses to place these shapes on the screen. A few things to know about SVGs:

  • There’s a distinction between the SVG canvas (drawn with the <svg> tags) and the SVG shapes themselves.
  • There’s a fairly unintuitive coordinate system that you’ll need to understand, since the (0, 0) point of an SVG grid is at the top-left, rather than the bottom-left.
  • You might come across some pretty weird behavior if you don’t understand what’s going on under the hood.
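To make the coordinate system point concrete, here is a minimal sketch in plain JavaScript (no D3 needed). The helper name toSvgY and the 500px canvas height are assumptions chosen for illustration; the helper converts a distance measured up from the bottom of the canvas into the y value that SVG expects.

```javascript
// Hypothetical helper: convert a y value measured from the bottom of the
// canvas (as on a typical chart) into SVG's top-left based y coordinate.
function toSvgY(yFromBottom, canvasHeight) {
    return canvasHeight - yFromBottom;
}

var canvasHeight = 500; // assumed canvas height in pixels

console.log(toSvgY(0, canvasHeight));   // 500: bottom edge of the canvas
console.log(toSvgY(500, canvasHeight)); // 0: top edge of the canvas
console.log(toSvgY(100, canvasHeight)); // 400: 100px above the bottom edge
```

This flip is the reason a shape with a small y value shows up near the top of the page rather than the bottom.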

It can be tempting to gloss over this subject, opting instead to dive head-first into the titillating business of laying down some D3 code right away, but things will seem a lot clearer later on if you know how these shapes are working.

Resources for understanding SVGs…

  • A guide to SVGs for absolute beginners — Rob Levin
  • An SVG primer for D3 — Scott Murray

As a first step to building our scatter plot, we’ll want to add a small circle SVG for each item of data that we want to display. We add SVGs in D3 like this:

    d3.select("#canvas")
        .append("circle")
            .attr("cx", 50)
            .attr("cy", 50)
            .attr("r", 5)
            .attr("fill", "grey");

Writing d3.select("#canvas") here is analogous to writing $("#canvas") in jQuery, as it grabs hold of the element with the ID of "canvas". d3.select goes one step further, adding a few special methods to this selection that we’ll be using later on.

We’re using the selection’s append method to add a circle SVG to that element, and we’re setting each of the circle’s attributes with the attr method.

Since we want to add a circle for every item in our array, you might think that we’d want to use a for loop:

    for (var i = 0; i < data0.length; i++) {
        d3.select("#canvas")
            .append("circle")
                .attr("cx", data0[i].gpa)
                .attr("cy", data0[i].height)
                .attr("r", 5)
                .attr("fill", "grey");
    }

However, since this is D3, we’ll be doing something slightly more complicated, and slightly more powerful…

Crucial Concept #2 — Data Binding

The next hurdle that every new D3 developer needs to overcome is the D3 data join. D3 has its own special way of binding data to our SVGs.

Here’s how we add a circle for every item in our array with D3:

    var circles = d3.select("#canvas").selectAll("circle")
        .data(data0);

    circles.enter().append("circle")
        .attr("cx", function(d, i){ return 25 + (50 * i); })
        .attr("cy", function(d, i){ return 25 + (50 * i); })
        .attr("r", 5)
        .attr("fill", "grey");

For a developer who is just starting off with D3, this can seem confusing. Actually, for many seasoned developers with years of experience in D3, this can still seem confusing…

You would think that calling selectAll("circle") on a page devoid of circles would return a selection of nothing. We’re then calling the data() method on this selection of nothing, passing in our array. We have a mysterious call to the enter() method, and then we have a similar setup as before.

This block of code adds a circle for each item in our array, allowing us to set our attributes with anonymous functions. The first argument to these functions gives us access to the item in our data that we’re looking at, and the second argument gives us the item’s index in our array.
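If the two arguments feel abstract, this plain-JavaScript sketch may help. The computeAttrs helper is hypothetical and not part of D3; it only mimics the way an attribute function gets called once per item, receiving the item (d) and its index (i).

```javascript
// Hypothetical stand-in for what happens with an accessor function:
// call it once per bound datum, passing the datum (d) and its index (i).
function computeAttrs(data, accessor) {
    return data.map(function(d, i){ return accessor(d, i); });
}

var sample = [
    { gpa: 3.42, height: 138 },
    { gpa: 3.54, height: 153 },
    { gpa: 3.14, height: 148 }
];

// Same accessor as in the snippet above: space the circles out by index.
var cxValues = computeAttrs(sample, function(d, i){ return 25 + (50 * i); });
console.log(cxValues); // [ 25, 75, 125 ]

// An accessor can also read values from the datum itself.
var gpas = computeAttrs(sample, function(d){ return d.gpa; });
console.log(gpas); // [ 3.42, 3.54, 3.14 ]
```

The second call is the shape we’ll want for a scatter plot, where positions come from the data rather than from the index.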

Creating a “data join” like this marks the first step to doing something useful with our data, so it’s an important step to understand. This strange syntax can be daunting when you first encounter it, but it’s a handy tool to know how to use.

Resources for understanding data binding in D3:

  • A beginner’s guide to data binding — SitePoint
  • Thinking with joins — Mike Bostock
  • Let’s make a grid with D3.js — Chuck Grimmett

Once we run the code that we’ve written so far, we end up with something that looks like this:

We attached the right number of circles to the screen and spaced them out a little, but what we have so far isn’t particularly helpful. For a scatter plot, the coordinates of these circles should correspond to two different values.

The GPA and height values that we have in our arrays aren’t much use to us at the moment. Our GPA values range from 2.32 to 3.85, and our height values range from 138 to 180. When positioning our circles, we want to work with x-values between 0 and 800 (the width of our SVG), and y-values between 0 and 500 (the height of our SVG).

We’ll need to apply some kind of transformation to our raw data, to convert these values into a format that we can use.

In D3, we do this by using scales.

Crucial Concept #3 — Scales

Here comes our next major challenge to picking up D3.

Scales are confusing to talk about when you’re first getting started. They need to be set with a domain and a range, which can be pretty easy to confuse. The domain represents the interval that our input values will run between, and the range represents the interval that our output values will run between.

A scale is a function in D3 that will take in a value as an input, and spit out a different value as an output. In this example, we’ll need an x-scale that converts a GPA to a pixel value, and a y-scale that converts a person’s height to a pixel value, so that we can use our data to set the attributes of our circles.

Here’s a diagram to show you what our x-scale should be doing:

We need to initialize our domain and range with some minimum and maximum values. We’re saying that a value of 3.54 should translate to a pixel value of 800, and a GPA of 2.76 should translate to a pixel value of 0. So, if we pass in a value of 3.15 to our scale, then the output would be 400, since 3.15 is half way between the min and max of our domain.
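Here’s a hand-rolled version of that arithmetic in plain JavaScript. The linearScale function is my own illustration of the idea behind a linear scale, not D3’s actual implementation, and the name xSketch is made up for this example.

```javascript
// Hand-rolled linear scale: a sketch of the idea behind d3.scaleLinear(),
// not D3's actual implementation.
function linearScale(domain, range) {
    return function(value) {
        // How far along the domain the value sits, as a fraction (0 to 1)
        var t = (value - domain[0]) / (domain[1] - domain[0]);
        // The same fraction applied to the range
        return range[0] + t * (range[1] - range[0]);
    };
}

var xSketch = linearScale([2.76, 3.54], [0, 800]);

console.log(xSketch(2.76)); // 0
console.log(xSketch(3.54)); // 800
console.log(xSketch(3.15)); // approximately 400 (floating-point rounding applies)
```

The domain goes in, the range comes out, and everything in between is placed proportionally.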

In this example, we’re using a linear scale, meaning that values should be scaled proportionally between the two extremes that we’re looking at. However, there are a few different types of scales that you’ll want to get your head around.

  • If you’re working with data that increases exponentially over time, then you might want to use a logarithmic scale.
  • If you’re working with date values, then you’ll use a time scale.
  • If you want to assign colors between different categories, you can use an ordinal scale.
  • If you’re spacing out rectangles in a bar chart, then you’ll use a band scale.

For each of these scales, the syntax is slightly different, but it’ll still follow the same general format as our linear scale.
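As one example of how a different scale type changes the mapping, here is a rough sketch of a logarithmic scale in plain JavaScript. The logScale function, the population name, and the domain and range values are all assumptions made up for illustration; this is a simplification of the idea, not how D3 implements it.

```javascript
// Hand-rolled logarithmic scale: a simplified sketch of the idea behind
// d3.scaleLog(), not D3's actual implementation.
function logScale(domain, range) {
    var logMin = Math.log(domain[0]);
    var logMax = Math.log(domain[1]);
    return function(value) {
        // Fraction of the way along the domain, measured in log space
        var t = (Math.log(value) - logMin) / (logMax - logMin);
        return range[0] + t * (range[1] - range[0]);
    };
}

var population = logScale([1, 1000], [0, 300]);

// Each tenfold increase in the input moves the output by roughly the same
// number of pixels.
[1, 10, 100, 1000].forEach(function(v){
    console.log(v, Math.round(population(v)));
});
```

The general format is the same as before: a domain, a range, and a function that maps one onto the other.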

Resources for understanding scales in D3…

  • An introduction to linear scales in D3 — Ben Clikinbeard
  • A walkthrough of the different types of scales — D3 in depth
  • The entry for scales in the D3 API reference

So now, we can add in two linear scales to use for our x and y axes.

    var x = d3.scaleLinear()
        .domain([d3.min(data0, function(d){ return d.gpa; }) / 1.05,
            d3.max(data0, function(d){ return d.gpa; }) * 1.05])
        .range([0, 800]);

    var y = d3.scaleLinear()
        .domain([d3.min(data0, function(d){ return d.height; }) / 1.05,
            d3.max(data0, function(d){ return d.height; }) * 1.05])
        .range([500, 0]);

Each of our scales will take in a value somewhere between the minimum and maximum of each variable in our data, and spit out a pixel value that we can use for our SVGs. I’m using the d3.min() and d3.max() functions here so that D3 will automatically adjust if our dataset changes. I’m also giving our domains a buffer of roughly 5% both ways, so that all of our dots will fit on the screen.

We’re also reversing the range values for our y-scale, since the smallest value in our domain should spit out an output of 500px (the bottom of our chart in the SVG coordinate system, where y increases downwards).
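To see the buffer and the reversed range at work without D3, here is a small sketch in plain JavaScript. The helpers minOf, maxOf, and ySketch and the three sample heights are stand-ins that I’ve made up for illustration; they are not D3 functions.

```javascript
// Plain-JavaScript stand-ins for d3.min() / d3.max() with an accessor.
function minOf(data, accessor) {
    return Math.min.apply(null, data.map(accessor));
}
function maxOf(data, accessor) {
    return Math.max.apply(null, data.map(accessor));
}

var sampleHeights = [{ height: 138 }, { height: 164 }, { height: 153 }];
var getHeight = function(d){ return d.height; };

// Pad the domain on both sides, as in the scales above.
var yDomain = [minOf(sampleHeights, getHeight) / 1.05,
               maxOf(sampleHeights, getHeight) * 1.05];
var yRange = [500, 0]; // reversed: small values map towards the bottom

function ySketch(value) {
    var t = (value - yDomain[0]) / (yDomain[1] - yDomain[0]);
    return yRange[0] + t * (yRange[1] - yRange[0]);
}

console.log(yDomain);      // lower bound a little under 138, upper a little over 164
console.log(ySketch(138)); // less than 500: the dot sits above the bottom edge
console.log(ySketch(164)); // greater than 0: the dot sits below the top edge
```

Because of the padding, neither the shortest nor the tallest person ends up glued to the edge of the chart.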

Next, we can make a few edits to our code from earlier, so that the values for our circles come from our scales.

    var circles = d3.select("#canvas").selectAll("circle")
        .data(data0);

    circles.enter()
        .append("circle")
            .attr("cx", function(d){ return x(d.gpa); })
            .attr("cy", function(d){ return y(d.height); })
            .attr("r", 5)
            .attr("fill", "grey");

At this point, we have something that looks like a real visualization!

The next step is to add in some axes, so that we can tell what these dots are meant to represent. We can do this by using D3’s axis generator functions, but we’ll soon run into some problems…

Crucial Concept #4 — Margins and Axes

D3’s axis generators work by attaching an axis onto whichever element they’re called on. The problem is that, if we try attaching axes straight onto our SVG canvas, then we’ll end up with something like this:

Our first problem is that the axes are always positioned at the top-left hand corner of the grid. That’s fine for our y-axis in this case, but it’s not okay for our x-axis, which we want to place at the bottom.

Another issue here is that, since our axes are sticking out over the edge of our SVG canvas, our axis tick marks don’t show up for our y-axis.

We can fix this by making use of a few SVG groups — invisible elements for adding structure to our pages.

In D3, we need to get used to the “margin convention” that all of our projects should follow:

The idea is that we want to give ourselves a buffer around the edge of our visualization area, giving us some space for our axes to live. We need to set some width, height, and margin variables at the top of our file, allowing us to simulate this effect:

    var svg = d3.select("#canvas");
    var margin = {top: 10, right: 10, bottom: 50, left: 50};
    var width = +svg.attr("width") - margin.left - margin.right;
    var height = +svg.attr("height") - margin.top - margin.bottom;

    var g = svg.append("g")
        .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

We now need to use these width and height variables to set the range for our scales, and we’ll be attaching our circles onto this g variable, which represents our main visualization area.
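The arithmetic behind the margin convention can be checked in plain JavaScript. This sketch assumes the 800 by 500 SVG mentioned earlier; the variable names svgWidth, svgHeight, marginSketch, plotWidth, and plotHeight are made up for illustration.

```javascript
// Assumed outer SVG size, matching the 800 x 500 canvas mentioned earlier.
var svgWidth = 800;
var svgHeight = 500;
var marginSketch = { top: 10, right: 10, bottom: 50, left: 50 };

// Inner drawing area left over once the margins are subtracted.
var plotWidth = svgWidth - marginSketch.left - marginSketch.right;
var plotHeight = svgHeight - marginSketch.top - marginSketch.bottom;

console.log(plotWidth, plotHeight); // 740 440

// The transform string used to shift the group into place.
var translate = "translate(" + marginSketch.left + "," + marginSketch.top + ")";
console.log(translate); // translate(50,10)
```

With these numbers, the x-scale would run from 0 to 740 and the y-scale from 440 to 0, leaving room on the left and bottom for the axes.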

If we also attach our axes to SVG groups, then we can shift them into the right position using the transform attribute that comes with the group element. Here’s the code we’ll be using to add our axes onto our graph:

    // Axes
    var xAxisCall = d3.axisBottom(x);
    var xAxis = g.append("g")
        .attr("class", "x-axis")
        .attr("transform", "translate(" + 0 + "," + height + ")")
        .call(xAxisCall);

    var yAxisCall = d3.axisLeft(y);
    var yAxis = g.append("g")
        .attr("class", "y-axis")
        .call(yAxisCall);

    // Labels
    xAxis.append("text")
        .attr("class", "axis-title")
        .attr("transform", "translate(" + width + ", 0)")
        .attr("y", -6)
        .text("Grade Point Average");

    yAxis.append("text")
        .attr("class", "axis-title")
        .attr("transform", "rotate(-90)")
        .attr("y", 16)
        .text("Height / Centimeters");

I’m also adding some text SVGs as labels, which will tell us what each of the axes is showing.

The margin convention can seem a little random for newcomers to D3, and there are a wide range of methods that we can use to edit how our tick marks should look.

Resources for understanding margins and axes in D3…

  • A walkthrough of our margin convention code — Mike Bostock
  • A guide to axis generators in D3 — TutorialsTeacher
  • The D3 API reference entry on axes

Now that we can see what our chart is showing, I want to take it to the next level by adding in an update to our data. To do this, we’ll use the D3 interval method to run some code continuously:

    var flag = true;

    // Run this code every second...
    d3.interval(function(){
        // Flick between our two data arrays
        var data = flag ? data0 : data1;

        // Update our chart with new data
        update(data);

        // Update our flag variable
        flag = !flag;
    }, 1000);

Every 1000ms, this function is going to execute an update function, changing the data that we’re using between our two different arrays.

We need to make a few edits to our code to get everything to update like we want it to:

    // Scales
    var x = d3.scaleLinear()
        .range([0, width]);
    var y = d3.scaleLinear()
        .range([height, 0]);

    // Axes
    var xAxisCall = d3.axisBottom(x);
    var xAxis = g.append("g")
        .attr("class", "x-axis")
        .attr("transform", "translate(" + 0 + "," + height + ")");
    var yAxisCall = d3.axisLeft(y);
    var yAxis = g.append("g")
        .attr("class", "y-axis");

    // Labels
    xAxis.append("text")
        .attr("class", "axis-title")
        .attr("transform", "translate(" + width + ", 0)")
        .attr("y", -6)
        .text("Grade Point Average");
    yAxis.append("text")
        .attr("class", "axis-title")
        .attr("transform", "rotate(-90)")
        .attr("y", 16)
        .text("Height / Centimeters");

    var flag = true;

    // Run this code every second...
    d3.interval(function(){
        // Flick between our two data arrays
        var data = flag ? data0 : data1;

        // Update our chart with new data
        update(data);

        // Update our flag variable
        flag = !flag;
    }, 1000);

    // Run for the first time
    update(data0);

    function update(data){
        // Update our scales
        x.domain([d3.min(data, function(d){ return d.gpa; }) / 1.05,
            d3.max(data, function(d){ return d.gpa; }) * 1.05]);
        y.domain([d3.min(data, function(d){ return d.height; }) / 1.05,
            d3.max(data, function(d){ return d.height; }) * 1.05]);

        // Update our axes
        xAxis.call(xAxisCall);
        yAxis.call(yAxisCall);

        // Update our circles
        var circles = g.selectAll("circle")
            .data(data);

        circles.exit().remove();

        circles
            .attr("cx", function(d){ return x(d.gpa); })
            .attr("cy", function(d){ return y(d.height); });

        circles.enter()
            .append("circle")
                .attr("cx", function(d){ return x(d.gpa); })
                .attr("cy", function(d){ return y(d.height); })
                .attr("r", 5)
                .attr("fill", "grey");
    }

We’re setting our scale domains inside our update function, so that they adjust to the data that we’re working with. We’re then calling our axis generators here too, which will update them accordingly. We then have a confusing block of code, which handles how we want our circles to update.

Crucial Concept #5 — The General Update Pattern

The general update pattern is used in pretty much every visualization that you’ll want to build with D3. It defines the behavior of elements in our data that should enter, update, or exit the screen. As a beginner, all of this code can seem a little overwhelming.

Let’s take a closer look at what each of these lines is doing.

First, we’re binding our new array of data to our D3 selection:

    // JOIN new data with old elements.
    var circles = g.selectAll("circle")
        .data(data);

Next, this block of code will remove all the dots that no longer exist in our new array of data:

    // EXIT old elements not present in new data.
    circles.exit().remove();

Here, we’re updating the position of all the dots on the screen that still exist in our new data array.

    // UPDATE old elements present in new data.
    circles
        .attr("cx", function(d){ return x(d.gpa); })
        .attr("cy", function(d){ return y(d.height); });

Finally, we’re adding a dot for every item in our new data array that doesn’t have a corresponding circle on the screen.

    // ENTER new elements present in new data.
    circles.enter().append("circle")
        .attr("cx", function(d){ return x(d.gpa); })
        .attr("cy", function(d){ return y(d.height); })
        .attr("r", 5)
        .attr("fill", "grey");

The tricky thing about understanding the general update pattern is figuring out exactly what selectAll(), enter(), and exit() are doing. D3 works by using a set of “virtual selectors”, which we can use to keep track of which elements need to be updated.
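One way to build intuition for these three groups is to compute them by hand. When no key function is passed to data(), D3 matches data to elements by index, and the sketch below imitates that in plain JavaScript. The joinByIndex function and the letter arrays are hypothetical; this is a simplification, not D3’s implementation.

```javascript
// Simplified sketch of a join by index. "elements" stands in for the circles
// already on screen; this is not D3's implementation.
function joinByIndex(elements, data) {
    var shared = Math.min(elements.length, data.length);
    return {
        update: data.slice(0, shared),  // data that already has an element
        enter: data.slice(shared),      // data with no element yet
        exit: elements.slice(shared)    // elements with no data left
    };
}

var six = ["a", "b", "c", "d", "e", "f"];
var nine = six.concat(["g", "h", "i"]);

// Going from 6 circles to 9 data points: 6 update, 3 enter, 0 exit.
var growing = joinByIndex(six, nine);
console.log(growing.update.length, growing.enter.length, growing.exit.length); // 6 3 0

// Going from 9 circles back to 6 data points: 6 update, 0 enter, 3 exit.
var shrinking = joinByIndex(nine, six);
console.log(shrinking.update.length, shrinking.enter.length, shrinking.exit.length); // 6 0 3
```

This mirrors what happens in our chart as it flicks between the six-item and nine-item arrays: three circles enter on one tick and three exit on the next.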

Although you can get away with having only a surface understanding of the update pattern with many charts that you’d want to create, the whole library becomes a lot clearer once you can figure out what each of these selectors are doing.

Resources for understanding the general update pattern in D3…

  • A walkthrough of the general update pattern — Quinton Louis Aiken
  • An interactive exploration of the general update pattern — Chris Given

Once we’ve added in our updates, here’s what our chart looks like:

Our visualization is now flicking between the two arrays of data that we want to display. I’m going to add one more final flourish to make our graph look a little neater.

We can add in some beautiful transitions by making use of the superb D3 transition suite. First, we’re defining a transition variable at the top of our update function, which is spreading each of our transitions out over a duration of 750ms.

    // Standard transition for our visualization
    var t = d3.transition().duration(750);

Any attributes that we set before calling the transition method on a D3 selection will be set straight away, and any attributes that we set after this transition method will be applied gradually.
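To picture what "applied gradually" means, here is a rough sketch in plain JavaScript of a single attribute being blended from a start value to an end value. The valueAt function is hypothetical, and it assumes simple linear blending; D3 also applies an easing curve, which this sketch leaves out.

```javascript
// Sketch of what a transition does to one attribute: blend from a start
// value to an end value as time passes. Linear blending is assumed here for
// simplicity; this is not D3's implementation.
function valueAt(start, end, elapsedMs, durationMs) {
    var t = Math.min(1, Math.max(0, elapsedMs / durationMs)); // clamp to 0..1
    return start + (end - start) * t;
}

// A circle's cy moving from 400 to 100 over 750ms.
console.log(valueAt(400, 100, 0, 750));   // 400
console.log(valueAt(400, 100, 375, 750)); // 250
console.log(valueAt(400, 100, 750, 750)); // 100
console.log(valueAt(400, 100, 900, 750)); // 100 (clamped after the end)
```

The attribute set before the transition plays the role of the start value, and the attribute set after it plays the role of the end value.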

We can add transitions to our axes like this:

    // Update our axes
    xAxis.transition(t).call(xAxisCall);
    yAxis.transition(t).call(yAxisCall);

And we can add transitions to our circles like this:

    // Update our circles
    var circles = g.selectAll("circle")
        .data(data);

    circles.exit().transition(t)
        .attr("fill-opacity", 0.1)
        .attr("cy", y(0))
        .remove();

    circles.transition(t)
        .attr("cx", function(d){ return x(d.gpa); })
        .attr("cy", function(d){ return y(d.height); });

    circles.enter().append("circle")
        .attr("cx", function(d){ return x(d.gpa); })
        .attr("cy", y(0))
        .attr("r", 5)
        .attr("fill", "grey")
        .attr("fill-opacity", 0.1)
        .transition(t)
            .attr("fill-opacity", 1)
            .attr("cy", function(d){ return y(d.height); });

We’re transitioning between a fill-opacity of 0.1 and 1 to make our dots gently fade in and out of existence, and we’re smoothly shifting the updating circles to their new positions.

So there we have it. We now have a beautiful scatter plot which is updating between different sources of data. You can find the finished product of all this code on my GitHub page here.

Although mastering the concepts in this article might seem like a huge step to take just to get started with D3, the code gets easier and easier to understand with practice.

You’ll soon find that the same key concepts underpin every D3 visualization, and that once you know how one visualization works in D3, you can quickly learn to build almost anything that you can imagine.

Check out the examples on bl.ocks.org and blockbuilder.org to see some ready-made implementations of so many interesting projects. Like D3 itself, all of this code is open source, meaning that you can copy any of this code onto your local machine, and use it for your own projects.

An easy way to get started with D3…

If you’re looking for the fastest and easiest way to learn D3, then I teach a course on Udemy which offers a comprehensive introduction to the library. The course includes:

  • 7 hours of quality video content.
  • A step-by-step introduction to the foundational concepts in D3, covering all of the topics covered in this article and more.
  • Four awesome class projects to practice the skills that you’re learning with real-world data.
  • A strong emphasis on data visualization design, helping you to create custom visualizations for your own data.
  • Walkthroughs of 12 of the most commonly used visualizations, teaching you how to understand and adapt pre-written community code for your own purposes.
  • An introduction to an object-orientated approach for creating complex web apps, where multiple visualizations on the page are updating at once.

You can get the course at a discounted price of only $20.99 by signing up through this link here.

About Adam Janes

Adam first fell in love with D3.js as an Economics and Computer Science student at Harvard University. He now works as a data visualization engineer, helping companies around the world find the best ways to display their data. He also teaches an online course on Udemy, which offers students a comprehensive introduction to D3 through 93 video lectures.
