Mapping with D3.js - v3.0
Intro
D3 is a powerful data visualization library written by Mike Bostock that helps connect data to graphical elements, and then apply data-driven transformations to those elements. The basic idea is that when the data is bound to graphics, you can produce more portable graphics and much more dynamic visualization with less effort.
So why use D3 for maps? Maps are fundamentally graphical objects based on data, and D3 has built-in support for map projections and transformations. The OpenStreetMap editor iD uses D3 to render its SVG graphics, so that’s a pretty good endorsement for D3 mapping!
So why not use another library like Leaflet.js? The short answer is that D3 will be advantageous when you really want to customize interactivity and dynamic visualization. The tradeoff is in ease-of-creation: D3 will take more time to customize the map to what you want. That said, there’s really no reason you can’t use both D3 and Leaflet together! Here is a great tutorial example using D3 to create dynamic overlays on a Leaflet map.
What can I create with D3?
Check out these links for some examples of D3 visualizations:
The full gallery of D3 visualizations
Geographic Projections: animated, and draggable
The famous “Wealth of Nations” Viz
Earthquake Energy Release Visualizations using D3/Leaflet (by Ryan)
D3 - Data Driven Documents
D3 stands for Data Driven Documents. We will unpack this title in three parts.
Data
D3 has straightforward functions to grab data from a variety of sources including XMLHttpRequests, text files, JSON blobs, HTML document fragments, XML document fragments, comma-separated values (CSV) files, and tab-separated values (TSV) files. Part of the tremendous power of D3 is that it can take data from a variety of sources, merge different data sources, and then join data elements to the visual elements that represent the data.
Driven
Driven is one of the defining characteristics of D3: the graphical elements are defined by the data. In other words, each circle, line, or polygon also carries the data that defines it. Desktop GIS software works the same way while you’re working on your map, but when you export the map, the vector-based features lose the data that defines them. If you export a raster image, those attributes are converted to color values and the data is detached completely.
That type of thing doesn’t happen in D3. Not only does your data define the elements in your graphic, the data is also bound (joined) to the elements in your document. A circle isn’t just a circle element with an x,y and radius, it’s also the data that originated the element in the first place. This characteristic of D3 allows data to drive your visualization, not only upon creation, but throughout its life cycle.
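To make the idea of bound data concrete, here is a minimal sketch in plain JavaScript, with no D3 and no browser. The `bindData` and `resize` helpers and the sample values are hypothetical, invented for this illustration. D3 itself keeps the bound datum on each DOM element (in a property named `__data__`) and passes it to the callbacks you give to methods like `.attr()`.

```javascript
// Hypothetical helper: create one plain "circle" object per datum and keep
// a reference to the datum on the object, much as D3 keeps it on the element.
function bindData(data) {
  return data.map(function (datum, index) {
    return { tag: 'circle', cx: 40 * (index + 1), cy: 60, r: datum.value, datum: datum };
  });
}

// Because each circle still carries its datum, it can be restyled later
// without going back to the original array.
function resize(circles, radiusFor) {
  circles.forEach(function (circle) {
    circle.r = radiusFor(circle.datum);
  });
  return circles;
}

// Hypothetical sample data.
var circles = bindData([{ name: 'a', value: 10 }, { name: 'b', value: 20 }]);
resize(circles, function (datum) { return datum.value / 2; });
console.log(circles.map(function (c) { return c.datum.name + ': r=' + c.r; }));
```

The second call is the point of the sketch: the radius is recomputed from the data long after the circles were created, which is what "data-driven throughout the life cycle" means in practice.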
Documents
At its core, D3 takes your information and transforms it into a visual output. That output is usually a Scalable Vector Graphic (SVG) that lives in an HTML document. SVG is a file format that encodes vector data for use in a wide array of applications, and it is used widely to display all kinds of data. If you’ve ever exported a map from desktop GIS and styled it in a graphics program, chances are your data was stored as SVG at some stage of the process.
SVGs are human readable, which works well for us because we aren’t computers. This is an SVG in code:
<svg width="400" height="120">
<circle cx="40" cy="60" r="10"></circle>
<circle cx="80" cy="60" r="10"></circle>
<circle cx="120" cy="60" r="10"></circle>
</svg>
And this is the rendered version of that code:
SVGs work similarly to HTML pages, where tags represent objects that can have objects nested within them: each circle is an element nested within the SVG. Each circle specifies the coordinates of its center (cx, cy) and its radius (r), so the SVG is just a set of instructions defining the geometry of each object, where to put it, and how to style it in the SVG coordinate space.
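Since an SVG is just text, the three-circle markup shown earlier can be generated from data. The sketch below does this with plain string building instead of D3, only to show the data-to-markup idea; `circlesToSvg` is a hypothetical helper name, and the fixed `cy` and `r` values are copied from the example.

```javascript
// Build the three-circle SVG markup from an array of x positions.
function circlesToSvg(xPositions) {
  var circles = xPositions.map(function (cx) {
    return '<circle cx="' + cx + '" cy="60" r="10"></circle>';
  });
  return '<svg width="400" height="120">\n' + circles.join('\n') + '\n</svg>';
}

var svgMarkup = circlesToSvg([40, 80, 120]);
console.log(svgMarkup);
```

Changing the array changes the drawing, which is the same relationship D3 maintains between data and elements, with the DOM taking the place of the string.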
It’s also worth noting that D3 has the ability to select, write, and edit any element on the HTML DOM, and any of the SVG shape elements like rectangles and lines. Later we’ll learn to use D3 to create `<path>` elements to draw complex country boundaries on our map.
Tutorial Time!
What do I need for this tutorial?
- A text editor (such as Notepad++, Brackets, or Sublime Text).
- You will also need a local web server. Here are two good options:
  - Here’s a Chrome app that’s lightweight and works great. Upon opening the app, it will prompt you to choose a folder to serve files from.
  - MAMP is a free Apache web server for macOS and Windows. (On macOS, you’ll put your files in `/Applications/MAMP/htdocs/`, and usually access your files via http://localhost:8888/)
- Clone this repo for the starter files.
Tips
- The learning curve can be pretty steep. Stay positive. Ask lots of questions.
- Start simple, and add complexity piece by piece.
- Refer to documentation and tutorials.
- Cannibalize code wherever/whenever you can. D3 has great examples and most of the code is freely accessible.
- In this tutorial, SOLUTIONS ARE PROVIDED AT THE END OF EVERY STEP under headings like this:
Challenge Answer
You found the answer!
What map are we making?
Today, we’ll build a choropleth of some local health data from the Institute for Health Metrics and Evaluation. Specifically, we’ll be building a colored map showing Mortality Rates due to opioid use disorders. The data is grouped by 2010 Census Tract Boundaries, so that’s the geographic grouping we’ll be working with.
We’re basically building just the choropleth part of this Leaflet-based map, so you can compare your results to this one.
The steps will be:
- Data Wrangling
  - Most of the data wrangling is done for you, but we’ll look at the data to figure out how to match up our data file to the census tracts on our map
- Convert our shapefile to TopoJSON
  - TopoJSON is a compressed vector format for the web. We’ll use Mapshaper to manipulate the shapefile to fit our needs
- Use D3 to build a legend
  - a crash course learning experience in D3
  - learn about D3 scales
- Use D3 to build the choropleth map
1) Data Wrangling
Mortality Data: data about dead people
The data for this tutorial comes from the Institute for Health Metrics and Evaluation - Global Health Data Exchange, and consists of mortality rates due to opioid use from 1990-2014 (units are: Deaths Per 100,000 people). The raw data file (`/public_webserver_files/data/IHME_KING_COUNTY_WA_MORTALITY_1990_2014_OPIOID_USE_DISORDERS_Y2017M09D05`) contains 119,400 rows and includes data for multiple years, for males, females, and both sexes, as well as for multiple age groups (All Ages, and Age Standardized).
I’ve filtered the data down to just the following subset using this Observable notebook**:
- 2014 (the most recent year with suitably accurate data)
- age standardized (all ages, weighted by the number of people in each age group)
- both sexes
- death rates (not Years of Life Lost due to opioid deaths, which is also contained in the raw data)
**Observable is a brand new tool, written by Mike Bostock, the creator of D3, that is basically the JavaScript/D3 version of a Jupyter notebook.
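The filtering described above can be sketched in plain JavaScript. This is not the code from the Observable notebook. Only `location_id` and `val` are field names that appear in this tutorial; the other column names (`year_id`, `sex`, `age_name`, `measure_name`), their values, and every row below are assumptions made up for illustration, so check them against the header of the real file.

```javascript
// Hypothetical rows; only location_id and val are named in this tutorial.
var rawRows = [
  { location_id: '1001', year_id: '2014', sex: 'Both', age_name: 'Age-standardized', measure_name: 'Deaths', val: '7.2' },
  { location_id: '1001', year_id: '2013', sex: 'Both', age_name: 'Age-standardized', measure_name: 'Deaths', val: '6.9' },
  { location_id: '1001', year_id: '2014', sex: 'Male', age_name: 'Age-standardized', measure_name: 'Deaths', val: '9.8' },
  { location_id: '1002', year_id: '2014', sex: 'Both', age_name: 'All Ages', measure_name: 'Deaths', val: '5.1' },
  { location_id: '1002', year_id: '2014', sex: 'Both', age_name: 'Age-standardized', measure_name: 'YLLs', val: '210.4' },
  { location_id: '1002', year_id: '2014', sex: 'Both', age_name: 'Age-standardized', measure_name: 'Deaths', val: '4.4' }
];

// Keep only 2014, age standardized, both sexes, death rates.
// The values are strings because CSV parsers return strings by default.
function keepRow(row) {
  return row.year_id === '2014' &&
    row.age_name === 'Age-standardized' &&
    row.sex === 'Both' &&
    row.measure_name === 'Deaths';
}

var filteredRows = rawRows.filter(keepRow);
console.log(filteredRows.length + ' of ' + rawRows.length + ' rows kept');
```

After filtering, each `location_id` appears once, which is what makes the later lookup from census tract to mortality value unambiguous.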
Geographic Data Wrangling
The shapefile I’m using in this tutorial comes from the King County GIS data portal. We’re using the 2010 Census Tracts Shapefile.
The main issue with the geographic data is that the census tracts are indexed by census tract number (`TRACT_FLT`), whereas the mortality data is indexed by IHME’s `location_id`. We’ll take care of that issue using Mapshaper.
2) Convert Shapefile to TopoJSON and JOIN `location_id` to the topology
The goal of this section is to do several things to the shapefile to make it web ready:
- Remove the extra shapefile cruft (delete unused data fields)
- Add the IHME-specific `location_id` to each of the census tract boundaries. This is so we can quickly ‘match up’ the IHME Mortality data to the right census tract.
- Make sure our data is in longitude, latitude (or any un-projected coordinate system)
- Do a bit of simplification and renaming of files
What’s a TopoJSON?
TopoJSON is an extension of GeoJSON that encodes topology: a boundary shared by neighboring shapes is stored once, as an arc, and each geometry refers to its arcs by index. Because shared borders are not duplicated, the files are very small and can be quickly transmitted over the web.
Think of TopoJSON as a zipped GeoJSON. The general technique in web development is to send the TopoJSON ‘over the wire’, then unzip it to GeoJSON on the client side (in the user’s browser) using a very small library such as topojson-client.
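To show what the ‘unzipping’ involves, here is a minimal hand-rolled sketch that decodes a single arc of a quantized TopoJSON file: each point is stored as an offset from the previous one, and the topology’s `transform` converts the running totals back to coordinates. In a real project you would let a library do this for every arc and geometry; `decodeArc` is a hypothetical helper and `tinyTopology` is made-up data.

```javascript
// Decode one delta-encoded TopoJSON arc into absolute [x, y] positions.
function decodeArc(topology, arcIndex) {
  var scale = topology.transform.scale;
  var translate = topology.transform.translate;
  var x = 0;
  var y = 0;
  return topology.arcs[arcIndex].map(function (delta) {
    // Each stored point is an offset from the previous point.
    x += delta[0];
    y += delta[1];
    return [x * scale[0] + translate[0], y * scale[1] + translate[1]];
  });
}

// Made-up topology with a single three-point arc.
var tinyTopology = {
  type: 'Topology',
  transform: { scale: [0.5, 0.5], translate: [-122, 47] },
  objects: {},
  arcs: [[[0, 0], [2, 0], [0, 2]]]
};

var decodedArc = decodeArc(tinyTopology, 0);
console.log(JSON.stringify(decodedArc));
```

Small integer offsets take fewer characters than full longitude and latitude pairs, which is the second reason (after shared arcs) that TopoJSON files are compact.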
Use Mapshaper to figure out what attributes we have
Mapshaper is an amazing command-line tool for reshaping map data, and has an amazing web-based tool to get started. This challenge will show you the basics.
- Go to Mapshaper.org
- Find the census tract shapefile in your repo: `/mapshaper_files/census_tract_shapefiles/tracts10_shore.shp`.
- Drag ALL of the `tracts10_shore.*` files into the Mapshaper import window. (You really only need `tracts10_shore.shp` and `tracts10_shore.dbf` to see the shapes and their attributes, but the `.prj` file, if present, tells Mapshaper the source coordinate system, which it needs for the reprojection step later.)
- Notice that you can zoom, pan etc. Click on the ‘i’ button on the top right.
- Mouse over your shapefile to get a tooltip that shows the attributes for each census tract.
Notice that there are a bunch of extra data fields (GEO_ID_TRT, FEATURE_ID, etc) we don’t really want. We could either delete these using QGIS, or we could figure out how to let mapshaper do it for us.
Challenge 2a: Remove those extra data fields!
That extra data is nice, but all we need are the census tract numbers, which we’ll use later. Let’s delete all fields EXCEPT the `TRACT_FLT` field, which represents the tract id as a decimal value.
- Click the “Console” button. You’ll get a prompt to type “tips” for examples. Try it!
- Try typing `help` or `help <command name>` to find commands. You can also try the Mapshaper command reference for a web interface.
- Figure out which command you need to remove ALL except the `TRACT_FLT` field.
Challenge 2a Answer
1. Type ```filter-fields 'TRACT_FLT'``` into the Mapshaper console.
2. Now use the info button and mouse over each tract to be sure that only the `TRACT_FLT` field is still there.
Challenge 2b: Join Census Tract Number `TRACT_FLT` to IHME `location_id`
IHME data is indexed by IHME’s own `location_id`, so ideally our TopoJSON file will need to have `location_id` instead of `TRACT_FLT`. Let’s make that happen in Mapshaper.
- Find `/mapshaper_files/IHME_location_id_TO_tract_id.csv` and open it in a text editor. This file contains a ‘mapping’ of census tract `tract_id` to IHME `location_id`.
- We’ll need to have the `location_id` field attached to our TopoJSON so we can link the IHME data to each census tract. Let’s JOIN that data using Mapshaper.
- DRAG the `IHME_location_id_TO_tract_id.csv` into the Mapshaper window and drop it into the same window with your King County Shapefile. You should get an `import` prompt, and when you click `import`, you should see a big grid of rectangles.
- Click on the filename `IHME_location_id_TO_tract_id` at the top of the page and make the map layer the ‘target layer’ again.
- Now comes the hard part. In the `Console` on the left, type `help join` and figure out what the command will be to join the `IHME_location_id_TO_tract_id.csv` data to the shapefile. You’re basically trying to match the keys `TRACT_FLT` and `tract_id`, which hold the same values in the shapefile and in the CSV file. This is a tough one, so if you get stuck just check out the answer below.
- Convert to TopoJSON format!
Challenge 2b Answer
Try the following command: ```join IHME_location_id_TO_tract_id.csv keys=TRACT_FLT,tract_id```
You should be able to toggle the info button and mouse over the map to make sure you see `location_id` and `location_name` in the data fields.
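If it helps to see what a key-based join does, here is a rough plain JavaScript sketch of the same idea: for each tract, look up the CSV row whose `tract_id` matches the tract’s `TRACT_FLT` and copy its fields over. This is not how Mapshaper is implemented, and the tract numbers, ids, and names below are made up for illustration.

```javascript
// Hypothetical sample values; the real files hold one row per census tract.
var tracts = [
  { properties: { TRACT_FLT: 1 } },
  { properties: { TRACT_FLT: 2.01 } },
  { properties: { TRACT_FLT: 3 } }
];
var csvRows = [
  { tract_id: '1', location_id: '9001', location_name: 'Tract 1' },
  { tract_id: '2.01', location_id: '9002', location_name: 'Tract 2.01' }
];

// Index the CSV rows by tract_id, coerced to a number so '2.01' matches 2.01.
var rowByTractId = new Map(csvRows.map(function (row) {
  return [+row.tract_id, row];
}));

// Copy location_id and location_name onto each tract that has a match.
tracts.forEach(function (tract) {
  var match = rowByTractId.get(tract.properties.TRACT_FLT);
  if (match) {
    tract.properties.location_id = +match.location_id;
    tract.properties.location_name = match.location_name;
  }
});

console.log(JSON.stringify(tracts.map(function (t) { return t.properties; })));
```

The third tract has no matching row, so it keeps only `TRACT_FLT`. Unmatched tracts like this are the reason the coloring step later needs a fallback color.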
REMOVE `IHME_location_id_TO_tract_id` from the list of layers
- Click on the layers tab and close out the `IHME_location_id_TO_tract_id` layer. We won’t need it any more now that the `location_id` has been joined to the shapefile.
Change the projection to WGS84
- D3 likes latitude and longitude, not the projected coordinate system that this shapefile is in.
- Type `projections` in the console to see all of the projections you can use.
- Then type `proj wgs84` into the console to change the projection.
Simplify the layer
- Simplifying geometries is one thing that Mapshaper does really well. This can be done either via the ‘Simplify’ tab, or via the command line.
- To use the command line: `simplify 0.1 keep-shapes` will retain 10% of the removable vertices, and `keep-shapes` ensures that no polygons are removed in the simplification process.
Rename the layer
- So easy. Just type `rename-layers census_tracts` into the console. When there are multiple layers, separate the new names with commas, in the order the layers appear in the dropdown.
Export the finished product!
- Just click the `Export` button in the top right, select `TopoJSON` and save your file!
- A copy of the results of this file is already in our public web files at: `public_webserver_files/data/king_county_census_tracts.json` (click to see the map on GitHub’s Leaflet renderer)
ADVANCED: How to build your own bash script to do this
* Download and install [Node.js](https://nodejs.org/en/)
* Install mapshaper globally on your computer (-g flag)
```
npm install -g mapshaper
```
* Add the bash shebang line `#!/usr/bin/env bash` at the top of your script
* Start the command with `mapshaper -i`
* You'll just need to add some padding where we compute the new bar transformation
```javascript
...
.attr('transform', function (rectangleHeight, index) {
  var padding = 20;
  return 'translate(' + ((rectangleWidth + padding) * index) + ',' + 0 + ')';
});
```
##### Open `public_webserver_files/02-scales-from-data-SOLUTION.html` for the solution.
```javascript
.attr('fill', function (datum) {
  /** CHALLENGE SOLUTION: just call colorScale() on the datum value from the bound colorBin data */
  return colorScale(datum);
});
```
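For intuition about what a call like `colorScale(datum)` does, here is a hand-rolled stand-in for a quantize-style scale: it splits a numeric domain into equal-width bins and returns one color per bin. It is an illustrative sketch, not D3’s implementation and not necessarily the scale type used in the solution file. `makeQuantizeScale`, the domain, and the color list are all assumptions.

```javascript
// Hand-rolled stand-in for a quantize-style color scale (not D3's own code).
function makeQuantizeScale(domain, colors) {
  var min = domain[0];
  var max = domain[1];
  return function (value) {
    var fraction = (value - min) / (max - min);
    var index = Math.floor(fraction * colors.length);
    // Clamp so values at or beyond the ends of the domain still get a color.
    index = Math.max(0, Math.min(colors.length - 1, index));
    return colors[index];
  };
}

// Hypothetical domain and colors, chosen only for this illustration.
var colorScale = makeQuantizeScale([0, 10], ['#fee5d9', '#fcae91', '#fb6a4a', '#de2d26', '#a50f15']);
console.log(colorScale(0), colorScale(5), colorScale(10));
```

With five colors over a domain of 0 to 10, each bin is two units wide, so 5 falls in the third bin and 10 is clamped into the last one.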
##### Open `public_webserver_files/05-color-census-tracts-SOLUTION.html` for the solution
```javascript
.attr('fill', function (geoDatum) {
  /** find the mortality datum that has the same location_id as this geo path */
  var mortalityDatumObject = mortalityDataArray.find(function (mortalityDatum) {
    return +mortalityDatum.location_id === geoDatum.properties.location_id;
  });
  /** check for missing object, return black if we didn't find the right mortality datum */
  if (mortalityDatumObject === undefined) {
    return 'black';
  }
  /** get the mortality value and convert it from a string to a number */
  var mortalityValue = +mortalityDatumObject.val;
  /** return the color scale that corresponds */
  return colorScale(mortalityValue);
});
```
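The solution above searches the whole mortality array once per census tract. One possible variation, sketched here in plain JavaScript, is to build a lookup table once and reuse it for every tract. `mortalityByLocationId` and `fillForTract` are hypothetical names, and the sample data and the simple two-color `colorScale` are stand-ins so the snippet runs on its own.

```javascript
// Hypothetical stand-ins for the tutorial's data and color scale.
var mortalityDataArray = [
  { location_id: '9001', val: '7.2' },
  { location_id: '9002', val: '4.4' }
];
function colorScale(value) {
  return value > 5 ? 'darkred' : 'pink';
}

// Build the lookup once: numeric location_id -> numeric mortality value.
var mortalityByLocationId = new Map(mortalityDataArray.map(function (mortalityDatum) {
  return [+mortalityDatum.location_id, +mortalityDatum.val];
}));

// Same contract as the fill callback above: black when no datum matches.
function fillForTract(geoDatum) {
  var mortalityValue = mortalityByLocationId.get(geoDatum.properties.location_id);
  if (mortalityValue === undefined) {
    return 'black';
  }
  return colorScale(mortalityValue);
}

console.log(fillForTract({ properties: { location_id: 9001 } }));
```

In the D3 chain, a function like this could be passed directly as `.attr('fill', fillForTract)`. For a few hundred census tracts the difference is small, so the original `find` version is perfectly reasonable too.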
Optional Challenge Answer
Computing Colors Challenge Answer
Computing Colors Challenge Answer