How to use geocoding API in Python

The Tableau community is always inspiring and there isn’t a week without an outstanding visualization.

This post was the most recent one which made me post again after more than 1 year!

I loved how data and art came together, and I wanted to do something similar for the Netherlands. But what is more popular here than whisky? Beer of course! 🍻

The idea was to visualize active breweries. I found a website - Nederlandse Biercultuur - which has all the info I needed, listing the brewery, the date when it opened and the address. It even specifies whether a brewery is production brewery e.g. it has its own boilers and sells beer under its own brand name. However, the data cannot be downloaded and even though I reached out to them, I haven’t received an answer. 😞 So with some manual work, I had to collect data on my own.

But this is not the point of this post! After collecting the data, I was ready to start vizzing in Tableau Public (of course design was already waiting in Figma). This is when the second obsticle came… Tableau’s built-in geocoding (OpenStreetMap) doesn’t recognize Dutch postcodes!

While ZIP code is a 4-digit number, Dutch postcodes consist of four digits followed by two uppercase letters (example: 1025PD). The first two digits stand for the city and region, the second two digits and the two letters indicate the range of housenumbers usually on the same street. Which means, that postcode is better than using the street name which can be prone to misspelling.

My dataset has 464 breweries from what 62 were excluded due having the same 4 digits. So I looked for a solution to generate longitude and latitude for the postcodes via API. Here comes my solution, which I hope could come handy for others too.

Initial data structure:

Brewery Province Year Open Address
Bierbrouwerij Maallust Drenthe 2011 Hoofdweg 140, 9341 BL Veenhuizen
Bierbrouwerij Kasparus Flevoland 1994 Karwijhof 15, 8308 AJ Nagele
Brouwerij ‘t IJ Noord-Holland 1985 Funenkade 7, 1018 AL Amsterdam

For the data transformation I used Jupyter Notebook. 🐍

Packages

1
2
import pandas as pd
import requests

If you don’t have any of the packages installed, run install before:

1
!pip install requests

Data transformation

If you look at the example, you’ll see that Address includes the street name and number separated by comma and space from postcode and city:
Funenkade 7, 1018 AL Amsterdam.

For the API calls, we’ll need two columns, postcode and streetnumber.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
df = pd.read_excel("folder/subfolder/NL_provinces.xlsx")

#split at comma, and take the second part
df["postcode"] = df["Address"].str.split(",").str[1:].str.join("")

#extract the digits followed by at least 1 letter
df["postcode"] = df["postcode"].str.extract('(\d+ [A-Za-z].)')

#remove the space between digits and letters
df["postcode"] = df["postcode"].str.replace(" ", "")

#split at comma, and take the first part
df["streetnumber"] = df["Address"].str.split(",").str[0].str.join("")

#extract the digits
df["streetnumber"] = df["streetnumber"].str.extract('(\d+)')

API request

With the API request, we’ll want to retrieve the latitude and longitude of a specified postcode and street number. I’ve used the API of postcode.nl. Unfortunately, this API is not free, and after ~500 requests it will throw an error that you reached the limit and your IP will be blocked.

There is another API for post.nl. I’ve requested access, so I might update this post later on.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#create lists for latitude and longitude
lat = []
lon = []

#iterate over the table rows
for index, row in df.iterrows():
try:

#send a request for each row based on postcode and streetnumber
postcode = str(row["postcode"])
streetnumber = str(row["streetnumber"])

req = "http://api.postcodedata.nl/v1/postcode/?postcode="+postcode+"&streetnumber="+streetnumber+"&ref=domeinnaam.nl&type=json"

response = requests.get(req)

#append the response to the lists
lat.append(response.json()["details"][0]["lat"])
lon.append(response.json()["details"][0]["lon"])

except:

#it might happen that your postcode is incorrect
#in that case append None to keep equal length as the table
lat.append(None)
lon.append(None)

Save table

1
2
3
4
5
6
#add lat and lon lists as columns to the df
df["lat"] = lat
df["lon"] = lon

#finally export the table
df.to_csv("folder/subfolder/output.csv")

Viz in Tableau Public

Find the interactive version here. Proost! 🍻😊

The Netherlands active breweries in 2022