2397 words
~12min read

Leaflet Zip Code Map Part 1

September 24th 2014
8K views

This is the first part of a two part series about how to set up leaflet maps to show zipcode areas with some extra data coming along for the ride allowing us to color the zip code areas based on demographics or any other reporting data. I did this using an ASP.NET MVC site and Web API with a Sql Server database. A database is required for this approach but there’s nothing stopping you from reimplementing this in your backend architecture of choice.

  • leaflet js zipcode map

This part will focus on getting the zipcode data into your db and your backend set up to serve it out in JSON, ready for leaflet maps to receive it.

1. Get the census shape data#

First up is getting the basic boundary data for all the zip codes in the US. Luckily our handy census bureau is kind enough to provide this. Head on over to data.gov and get the line shapefile. Inside that zip file should be a ~800mb .shp file.

2. Shrink it up with QGIS#

The level of detail is really way more than we need for leaflet, so to spare everyone slow loading times we’re going to simplify some of the geometry in the shape file. QGIS is a free tool that can open up the .shp file and simplify it. To do that, add the layer by going to the menu Vector > Geometry > Simplify Geometry. I used the simplification of 0.0003 to get the file down from the ~816m down to ~96mb. Have the shrinker save it to a new file for the next step. If you’ve never seen the whole US zip code map before it’s quite the sight!.

3. Convert the shape into GeoJSON#

Leaflet needs all this shape data in a standard JSON format so to get it from .shp to GeoJSON we need FWTools. Thanks to the GIS Collective for their tutorial on this step. Once you have the FWTools installed go to their bin directory and run the ogr2ogr command to convert the file: C:\Program Files (x86)\FWTools2.4.7\bin>ogr2ogr.exe -f "geoJSON" "C:\new-path\simp.js" "C:\old-path\simp3.shp"

With the JSON overhead the file is now sitting at around 175mb. If you’re feeling like a mean web developer you could make your clients download that file for their map now ;) However, I was feeling generous so let’s get the data into a DB so we only serve up what they really need to see.

4. Create your table#

CREATE TABLE dbo.zipcode_geojson
(
    zip NVARCHAR (5) NOT NULL
    , geoJSON VARCHAR (max) NOT NULL
    , latitude DECIMAL (18, 10) NOT NULL
    , longitude DECIMAL (18, 10) NOT NULL
    , point GEOGRAPHY NULL
    , CONSTRAINT PK_zipcode_geojson PRIMARY KEY (zip)
)

5. Get the data into your DB#

Now we need to import our GeoJSON data into the database but also pick up where each zip code is at. If you crack open the .js file now you can see that each row after the first few lines has some meta data about what zip code it is and its latitude and longitude. Since this will be a one time deal I got hacky and wrote a simple LinqPad script to loop through the lines and put it in the DB. Here it is for convienience but it requires you to chop off everything but the contents of the features array inside the .js file.

void Main()
{
    var path = "C:\\new-path\simp.js";

    int counter = 0;
    int commaCount = 0;
    string line;
    var newgeos = new List<Zipcode_geojson>();

    // Read the file and display it line by line.
    System.IO.StreamReader file = new System.IO.StreamReader(path);
    while((line = file.ReadLine()) != null)
    {
        //Console.WriteLine (line);
        counter++;
        if(line == ","){
            commaCount++;
        }else{
            var zipPos = line.IndexOf("GEOID10");
            var zipEndPos = line.IndexOf('"', zipPos + 9);
            string zip = line.Substring(zipPos + 10, zipEndPos - zipPos - 3);
            zip = zip.Replace("\"","");

            var latStrPos = line.IndexOf("INTPTLAT10");
            var endLatStrPos = line.IndexOf('"', latStrPos + 13);

            decimal lat = Convert.ToDecimal(line.Substring(latStrPos + 14, endLatStrPos - latStrPos - 3));

            var longStrPos = line.IndexOf("INTPTLON10");
            var endLongStrPos = line.IndexOf('"', longStrPos + 13);

            decimal longitude = Convert.ToDecimal(line.Substring(longStrPos + 14, endLongStrPos - longStrPos - 3));

            newgeos.Add(new Zipcode_geojson(){ Zip = zip, GeoJSON = line, Latitude = lat, Longitude = longitude});

            Console.WriteLine(string.Format("{0} {1},{2} len {3}", zip, lat,longitude, line.Length));
        }
    }

    file.Close();

    Zipcode_geojsons.InsertAllOnSubmit(newgeos);
    SubmitChanges();

    Console.WriteLine(string.Format("{0} lines, {1} commas", counter, commaCount));
}

Gotta love those hard coded string positions. If I wanted to waste any more brain cycles on this little bit I’d run it through a proper parser and then import but the quick and dirty way was faster here.

6. Update your point column#

If you were paying close attention you would have noticed there’s a point column in the table we created. This is a nifty feature of Sql Server that allows us to store a proper latitude longitude point in a single column which can be queried against in a much more elegant way as we’ll see later. Pre storing this in a column instead of creating it in the query is muuuuch faster. Promise. UPDATE zipcode_geojson SET point = geography::Point(latitude, longitude, 4326)

Optionally, you can now update the point column to be non-nullable if you want to crack your DBA whip.

7. Get some demographic data#

This is also optional, but makes the results much more interesting. I opted to get a cheap zipcode demographic database from easydbs.com for $30. It comes with a decent amount of data about income/age/population which is just what I was looking for. Here’s their schema for reference that I’m working off of.

CREATE TABLE dbo.zipcode_demographics(
  zip VARCHAR (5) NOT NULL
  , population INT NOT NULL
  , race_pct_white DECIMAL (4, 1) NOT NULL
  , race_pct_black DECIMAL (4, 1) NOT NULL
  , race_pct_asian DECIMAL (4, 1) NOT NULL
  , race_pct_native DECIMAL (4, 1) NOT NULL
  , race_pct_other DECIMAL (4, 1) NOT NULL
  , sex_pct_male DECIMAL (4, 1) NOT NULL
  , sex_pct_female DECIMAL (4, 1) NOT NULL
  , age_pct_0_19 DECIMAL (4, 1) NOT NULL
  , age_pct_20_39 DECIMAL (4, 1) NOT NULL
  , age_pct_40_59 DECIMAL (4, 1) NOT NULL
  , age_pct_60_79 DECIMAL (4, 1) NOT NULL
  , age_pct_80_over DECIMAL (4, 1) NOT NULL
  , age_median DECIMAL (4, 1) NOT NULL
  , households INT NOT NULL
  , home_pct_own DECIMAL (4, 1) NOT NULL
  , home_pct_rent DECIMAL (4, 1) NOT NULL
  , income_median DECIMAL (9, 2) NOT NULL
  , PRIMARY KEY (zip)
)

8. Serve it up#

This is the most complicated bit to get the JSON data out of the database, join it together with with the demographic data, and serve it back out as JSON. Quickly. To do this I’ve got some help from Dapper to get the data from Sql Server into objects, and C# Async to fetch any other reporting data at the same time as the zipcode data since that can also take a bit of time.

Let’s start with the geoJSON model that queries the zipcode database we worked so hard to get.

/* ZipCodeGeoJSON.cs */

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using Dapper;
using System.Data.SqlClient;
using System.Web.Helpers;

namespace Models
{
    public static class ZipCodeGeoJSON
    {
        /// <summary>
        /// Returns all geo json data within radius meters of the point
        /// </summary>
        public static List<zipCombined> GetZipsInCircle(decimal latitude, decimal longitude, decimal radius)
        {
            var result = new List<zipCombined>();
            using (var con = new SqlConnection(connectString))
            {
                var quarryText = @"
                SELECT
                gj.zip,
                gj.latitude,
                gj.longitude,
                gj.geoJSON,
                d.population,
                d.age_median as medianAge,
                d.income_median as medianIncome,
                d.home_pct_own,
                d.home_pct_rent,
                d.age_pct_0_19
                age_pct_0_19,
                age_pct_20_39,
                age_pct_40_59,
                age_pct_60_79,
                age_pct_80_over
                FROM (
                    SELECT zip, latitude, longitude, geoJSON,
                    point.STDistance(geography::Point(@center_latitude, @center_longitude, 4326)) AS distance
                    from zipcode_geojson
                ) gj
                INNER JOIN zipcode_demographics d on gj.zip = d.zip
                WHERE gj.distance < @radius
                ";

                var quarry = con.Query<zipCombined>(quarryText,
                    new {center_latitude = latitude, center_longitude = longitude, radius = radius }
                );

                result.AddRange(quarry);
            }

            return result;
        }

        /// <summary>
        /// Returns all geo json data within bounds
        /// </summary>
        public static List<zipCombined> GetZipsInRectangle(
            decimal NELat,
            decimal NELng,
            decimal SWLat,
            decimal SWLng
        )
        {
            var result = new List<zipCombined>();
            using (var con = new SqlConnection(dbConnectString))
            {
                var quarryText = @"
                    DECLARE @area geography = GEOGRAPHY :: STGeomFromText('polygon(({3} {0}, {3} {2}, {1} {2}, {1} {0}, {3} {0}))', 4326)

                    SELECT
                        gj.zip,
                        gj.latitude,
                        gj.longitude,
                        gj.geoJSON,
                        d.population,
                        d.age_median as medianAge,
                        d.income_median as medianIncome,
                        d.home_pct_own,
                        d.home_pct_rent,
                        d.age_pct_0_19
                        age_pct_0_19,
                        age_pct_20_39,
                        age_pct_40_59,
                        age_pct_60_79,
                        age_pct_80_over
                    FROM zipcode_geojson gj
                    INNER JOIN zipcode_demographics d on gj.zip = d.zip
                    WHERE @area.STIntersects(point) = 1
                ";

                quarryText = string.Format(quarryText, NELat, NELng, SWLat, SWLng);

                var quarry = con.Query<zipCombined>(quarryText,
                    new {NELat = NELat, NELng = NELng, SWLat = SWLat, SWLng = SWLng}
                );

                result.AddRange(quarry);
            }

            return result;
        }

        public static geoFeature selectGeoJSON(zipCombined zip)
        {
            var json = zip.geoJSON;
            var jsonObj = Json.Decode(json);

            var newGeo = new geoFeature()
            {
                type = "Feature",
                properties = new geoProperties(){
                    zip = jsonObj.properties.ZCTA5CE10,
                    name = jsonObj.properties.ZCTA5CE10,
                    lat = Convert.ToDecimal(jsonObj.properties.INTPTLAT10),
                    lon = Convert.ToDecimal(jsonObj.properties.INTPTLON10),
                    //lArea = Convert.ToDecimal(jsonObj.properties.ALAND10),
                    //wArea = Convert.ToDecimal(jsonObj.properties.AWATER10),
                    pop = zip.population,
                    mAge = zip.medianAge,
                    mInc = zip.medianIncome,
                    home_pct_own = zip.home_pct_own,
                    home_pct_rent = zip.home_pct_rent,
                    age_pct_0_19 = zip.age_pct_0_19,
                    age_pct_20_39 = zip.age_pct_20_39,
                    age_pct_40_59 = zip.age_pct_40_59,
                    age_pct_60_79 = zip.age_pct_60_79,
                    age_pct_80_over = zip.age_pct_80_over
                },
                geometry = new geoGeometry()
                {
                    type = jsonObj.geometry.type,
                    coordinates = jsonObj.geometry.coordinates
                }
            };

            return newGeo;
        }
    }

    public class zipCombined
    {
        public string zip { get; set; }
        public decimal latitude { get; set; }
        public decimal longitude { get; set; }
        public string geoJSON { get; set; }
        public int population { get; set; }
        public decimal medianAge { get; set; }
        public decimal medianIncome { get; set; }

        public decimal home_pct_own { get; set; }
        public decimal home_pct_rent { get; set; }

        //age percentages
        public decimal age_pct_0_19 { get; set; }
        public decimal age_pct_20_39 { get; set; }
        public decimal age_pct_40_59 { get; set; }
        public decimal age_pct_60_79 { get; set; }
        public decimal age_pct_80_over { get; set; }
    }

    public class geoJSON
    {
        public string type { get; set; }
        public List<geoFeature> features { get; set; }
    }

    public class geoFeature
    {
        public string type { get; set; }
        public geoProperties properties { get; set; }
        public geoGeometry geometry { get; set; }
    }

    public class geoProperties
    {
        public string name { get; set; }
        public string zip { get; set; }
        //public decimal lArea { get; set; }
        //public decimal wArea { get; set; }
        public decimal rev { get; set; } //revenue
        public decimal qty { get; set; } //quantity
        public decimal lat { get; set; }
        public decimal lon { get; set; }
        public int pop { get; set; } //population
        public decimal mAge { get; set; } //median age
        public decimal mInc { get; set; } //median income

        public decimal home_pct_own { get; set; }
        public decimal home_pct_rent { get; set; }

        //age percentages
        public decimal age_pct_0_19 { get; set; }
        public decimal age_pct_20_39 { get; set; }
        public decimal age_pct_40_59 { get; set; }
        public decimal age_pct_60_79 { get; set; }
        public decimal age_pct_80_over { get; set; }
    }

    public class geoGeometry
    {
        public string type { get; set; }
        public dynamic coordinates { get; set; }
    }
}

I started with the circle approach of fetching zips based on the center point of the map and some radius based on zoom until I realized I could get the lat/long corners of the leaflet map. A future improvement would be to expand the query viewport slightly to capture the zips who’s center is outside the boundaries but still has area inside the viewport.

To explain the model a bit, the zipCombined class is used for the query result we get out of sql server that has the demographic data joined in. The rest of the geoX classes are the format of GeoJSON strongly typed out (save for the geoGeometry coordinates which are complicated and don’t need to be strongly typed for what we need to do).

Onto the controller:

/* MapController.cs */

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web;
using System.Web.Http;
using Models;

namespace API
{
    public class MapController
    {
        public async Task<object> Get(
            decimal latitude,
            decimal longitude,
            decimal NELat,
            decimal NELng,
            decimal SWLat,
            decimal SWLng,
            bool extraFilterParametersYouShouldPass
        )
        {
            //fetch zip path data
            var geoJSONAsync = GetZipsAsync(NELat, NELng, SWLat, SWLng);

            //get all reporting data filtered by the inputs
            var occupancyListAsync = GetReportingDataAsync(extraFilterParametersYouShouldPass);

            await Task.WhenAll(geoJSONAsync, occupancyListAsync);

            var geoJSON = geoJSONAsync.Result;
            var occupancyList = occupancyListAsync.Result;


            if (occupancyList.Count == 0)
            {
                return new { };
            }

            var zipsReturned = geoJSON.Select(x => x.zip).Distinct().ToList();

            var groupedZipSales = occupancyList
                .GroupBy(x => x.zip)
                .Where(x => zipsReturned.Contains(x.Key))
                .Select(x => new
                {
                    zip = x.Key,
                    rev = x.Sum(s => s.revenue),
                    qty = x.Sum(s => s.qty)
                })
                .ToList();

            geoJSON result = new Models.geoJSON()
            {
                type = "FeatureCollection",
                features = geoJSON
                    .Select(x => ZipCodeGeoJSON.selectGeoJSON(x))
                    .ToList()
            };

            //fill in reporting data to geo result
            foreach (var groupedSale in groupedZipSales)
            {
                var g = result.features.Where(x => x.properties.zip == groupedSale.zip).FirstOrDefault();
                if (g != null)
                {
                    g.properties.rev = groupedSale.rev;
                    g.properties.qty = groupedSale.qty;
                }
            }

            return new
            {
                summary = new
                {
                    minRev = groupedZipSales.Min(s => s.rev),
                    maxRev = groupedZipSales.Max(s => s.rev),
                },
                geoData = result
            };

        }

        public async Task<List<zipCombined>> GetZipsAsync(
            decimal NELat,
            decimal NELng,
            decimal SWLat,
            decimal SWLng
        )
        {
            return await Task.Run(() =>
                {
                    return ZipCodeGeoJSON.GetZipsInRectangle(NELat, NELng, SWLat, SWLng);
                }
            );
        }

        //This is all optional and specific to your need
        //But this is how you could fetch other reporting data you want to join in and visualize
        public async Task<List<ReportingClass>> GetReportingDataAsync
        (
            bool extraParams
        )
        {
            var httpContext = HttpContext.Current;
            return await Task.Run(() =>
                {
                    return ManifestMinutes.GetSessionItems(extraParams, httpContext: httpContext)
                        .ToList();
                }
            );

        }
    }
}

The basic idea is to fetch the zip data and any other reporting data simultaneously, wait for both to finish, join up the results, and then pop it on down to the client. I also was originally saving the client from calculating the min and max revenue but ended up doing it on the client side anyways. You might also notice notice I’m using a closure with the httpContext in GetReportingDataAsync. This has to be passed as a parameter to an async function if you want to use it for caching.

This just about wraps it up for the backend part of getting some wonderful leaflet maps on your page with zip code data. One final thought about serving up this massive stack of JSON zip code data, make sure you have gzip enabled on your server. The responses for a decent size area were upwards of 5mb without compression on and less than 500kb once I got it set up.

Check out Leaflet Zip Code Map Part 2 for how to set up the front end.

Want the inside scoop?

Sign up and be the first to see new posts

No spam, just the inside scoop and $10 off any photo print!