Tableau doesn't recognize Dutch postcodes!
How to use geocoding API in Python
The Tableau community is always inspiring and there isn’t a week without an outstanding visualization.
This post was the most recent one which made me post again after more than 1 year!
After traveling Scotland in July and buying lots of whisky, I wanted to buy a Scotland distillery map for my wall. Then, I figured I'd just make one in @tableau. Then I decided to make one for Kentucky, Ireland, and Japan as well! pic.twitter.com/C8gaLuBlQy
— Ken Flerlage (@flerlagekr) August 29, 2022
I loved how data and art came together, and I wanted to do something similar for the Netherlands. But what is more popular here than whisky? Beer of course! 🍻
The idea was to visualize active breweries. I found a website - Nederlandse Biercultuur - which has all the info I needed, listing the brewery, the date when it opened and the address. It even specifies whether a brewery is production brewery e.g. it has its own boilers and sells beer under its own brand name. However, the data cannot be downloaded and even though I reached out to them, I haven’t received an answer. 😞 So with some manual work, I had to collect data on my own.
But this is not the point of this post! After collecting the data, I was ready to start vizzing in Tableau Public (of course design was already waiting in Figma). This is when the second obsticle came… Tableau’s built-in geocoding (OpenStreetMap) doesn’t recognize Dutch postcodes!
While ZIP code is a 4-digit number, Dutch postcodes consist of four digits followed by two uppercase letters (example: 1025PD). The first two digits stand for the city and region, the second two digits and the two letters indicate the range of housenumbers usually on the same street. Which means, that postcode is better than using the street name which can be prone to misspelling.
My dataset has 464 breweries from what 62 were excluded due having the same 4 digits. So I looked for a solution to generate longitude and latitude for the postcodes via API. Here comes my solution, which I hope could come handy for others too.
Initial data structure:
Brewery | Province | Year Open | Address |
---|---|---|---|
Bierbrouwerij Maallust | Drenthe | 2011 | Hoofdweg 140, 9341 BL Veenhuizen |
Bierbrouwerij Kasparus | Flevoland | 1994 | Karwijhof 15, 8308 AJ Nagele |
Brouwerij ‘t IJ | Noord-Holland | 1985 | Funenkade 7, 1018 AL Amsterdam |
For the data transformation I used Jupyter Notebook. 🐍
Packages
1 | import pandas as pd |
If you don’t have any of the packages installed, run install before:
1 | !pip install requests |
Data transformation
If you look at the example, you’ll see that Address includes the street name and number separated by comma and space from postcode and city:
Funenkade 7, 1018 AL Amsterdam.
For the API calls, we’ll need two columns, postcode and streetnumber.
1 | df = pd.read_excel("folder/subfolder/NL_provinces.xlsx") |
API request
With the API request, we’ll want to retrieve the latitude and longitude of a specified postcode and street number. I’ve used the API of postcode.nl. Unfortunately, this API is not free, and after ~500 requests it will throw an error that you reached the limit and your IP will be blocked.
There is another API for post.nl. I’ve requested access, so I might update this post later on.
1 | #create lists for latitude and longitude |
Save table
1 | #add lat and lon lists as columns to the df |
Viz in Tableau Public
Find the interactive version here. Proost! 🍻😊