Summary
Overview
To meet our goal of making the UN data easier to access and easier to analyse, we envision a tool in two parts:
- A graph database consisting of as many datasets as we can load, built in TigerGraph – the world’s best graph database!
- A simple, web-based user interface that lets people with no knowledge of graphs or coding interact with, filter, download, and perform basic analysis on the UN data stored in our graph database.
Having the UN data in a graph database will allow sophisticated users to perform deep graph analytics on the datasets; we see similarity analysis in particular as a key area of graph investigation.
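To give a flavour of what similarity analysis could mean here, the sketch below compares two countries by the cosine similarity of their metric values. It is only an illustration in plain Python, not GSQL and not our TigerGraph implementation; the function name, the metric names and all the numbers are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two countries over the metrics both of them report.

    `a` and `b` map a metric name to a value (hypothetical structure).
    Returns 0.0 when there is no overlap or one side is all zeros.
    """
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up values purely for illustration (not UN data).
country_a = {"metric_x": 1.0, "metric_y": 2.0, "metric_z": 3.0}
country_b = {"metric_x": 2.0, "metric_y": 4.0, "metric_z": 6.0}
country_c = {"metric_x": 3.0, "metric_y": 0.0}

sim_ab = cosine_similarity(country_a, country_b)  # same proportions, so close to 1
sim_ac = cosine_similarity(country_a, country_c)  # compared on metric_x and metric_y only
```

In practice, metrics measured on very different scales would probably need normalising before a comparison like this is meaningful.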
Having a user-friendly web front end will let any user from any field see and interact with the data in a friendlier and more powerful way than the raw UN data website. In particular, we hope the two-metric scatter plot will be a simple, powerful tool that lets non-technical users instantly visualise their data of interest (see below).
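The data preparation behind a two-metric scatter plot can be sketched roughly as follows: take two metrics for the same year and keep one point per country that has a value for both. This is a simplified illustration, not our front-end code; the function name, country labels and numbers are hypothetical, and it assumes countries lacking either value are simply left out.

```python
def scatter_points(metric_x, metric_y):
    """Pair two metrics by country, returning (country, x, y) tuples.

    Each argument maps a country name to that metric's value for one year
    (hypothetical structure). Countries missing either value are skipped.
    """
    shared = sorted(metric_x.keys() & metric_y.keys())
    return [(country, metric_x[country], metric_y[country]) for country in shared]


# Placeholder values for a single year (not real UN figures).
vaccination_rate = {"Country A": 90.0, "Country B": 75.0, "Country C": 60.0}
gdp = {"Country A": 1200.0, "Country B": 800.0, "Country D": 500.0}

points = scatter_points(vaccination_rate, gdp)
for country, x, y in points:
    print(f"{country}: x={x}, y={y}")
```

Each tuple becomes one dot on the plot, with the first metric on the x-axis and the second on the y-axis.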
We knew before we started our build that the UN data was going to be hard to work with, but we didn’t realise how hard. Each dataset had its own quirks, its own challenges and its own pitfalls – from missing years, to strange aggregations, to collapsed dimensions, to countries that no longer exist! But with some serious wrestling and wrangling we managed to get ourselves the most amazing set of data fully loaded, connected and available in TigerGraph:
- Total Vertices: 1,457,406
- Years of data: 73
- Countries: 259
- Metric types: 729
- Individual data points: 1,456,063
- Total Edges: 15,441,320
We succeeded beyond what we thought was possible; just a small sample from our 729 metrics includes:
- Cause of death
- Crop yields
- Movement of refugees
- Homicide
- Pollution
- Tourism
- Childhood obesity
- Vaccination rates
- GDP