Data visualisation is one of those fantastic ideas where data is presented in a format that allows us to see what is going on. From a good visualisation we can draw conclusions about a data set easily. We can interpret the numbers without the effort of doing any number crunching.
Interestingly, if you talk to a data analyst, you will find visualisation design is not simple. You will be told that it depends on the data and what you want to find out. The contradiction for the “user” is that they expect the visualisation to help them understand. Only after seeing a visualisation will they have an idea of what they might want to find out.
If you do not understand the purpose of a visualisation, it simply becomes “eye candy”: colourful pictures that suggest something might be interesting.
We see this all too often with the graphs and graphics used in reporting and analysing COVID data. One visualisation example that got me really thinking is that presented and developed by (Three graphs …)
It is interesting to see graphs that shift the perspective: time is no longer an axis. However, the x-axis and y-axis are intrinsically inter-related, because the x value is the rate of change of the y value. As a result we get curious graphs using the co-ordinate space in fascinating ways.
However, the dependence between x and y means that when x is negative the line always descends, and when it is positive it climbs. It is not easy to know how to read this. It means each data source spirals upwards and downwards either side of the y-axis.
What to make of this visualisation? It seems to contradict the idea that it helps us understand the data. The average number of deaths a day is not really a comparable scale across countries with differing populations, so the relative line positions are not that useful. The spiral patterns of each line are intrinsic to the data presentation approach. However, the similarity of shape might be something of interest, and that is easy to see. But what to make of one spiral that is squashed when seen next to another? Common dates cannot be easily related. There do seem to be “events”, such as when a spiral turns and starts again. But those events are not labelled on the graph, so we don’t know whether they show a COVID restriction policy decision or a new variant appearing.
In fact to add to my confusion, what look like “events” might not be because of the poor representation of time. Imagine the y value stayed constant for a week before growing, and another stayed constant for nine weeks before growing. Both cases would appear the same!
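The loss of time can be checked with a small sketch. It assumes the x value is simply the change since the previous day (the original graphs may well use a smoothed or averaged rate), and the helper names `phase_points` and `flat_then_grow`, along with the level and step values, are made up purely for illustration.

```python
def phase_points(values):
    """Return (x, y) pairs: x is the change since the previous sample
    (an assumed definition of 'rate of change'), y is the current value."""
    return [(curr - prev, curr) for prev, curr in zip(values, values[1:])]


def flat_then_grow(flat_days, grow_days=14, level=100.0, step=5.0):
    """Hypothetical series: constant for flat_days, then steady growth."""
    flat = [level] * (flat_days + 1)
    grow = [level + step * d for d in range(1, grow_days + 1)]
    return flat + grow


one_week = phase_points(flat_then_grow(7))
nine_weeks = phase_points(flat_then_grow(63))

# The two series have very different lengths in time...
print(len(one_week), len(nine_weeks))    # 21 77
# ...but they land on exactly the same set of points on the graph.
print(set(one_week) == set(nine_weeks))  # True
```

The extra eight weeks of waiting are all drawn on top of the same single point, which is why the two cases cannot be told apart by eye.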
One simple way of checking if you understand what is going on is to see if you can work the visualisation with some simple “known” data. In the graph style above: an unchanging value will be a dot with x at zero and some constant y value (the blue dot below). A consistently growing value will be a vertical line with a positive x (the green line below). A consistently dropping value will be a vertical line with a negative x (the red line below). A cyclic value such as temperature through 24 hours will be an ellipse (see the grey ellipse below).
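For anyone who wants to try these “known” cases themselves, here is a minimal sketch. It assumes x is the difference between consecutive samples, which is only one possible way of estimating a rate of change, and the series values and the `phase_points` helper are invented for the example.

```python
import math


def phase_points(values):
    """Return (x, y) pairs: x is the change since the previous sample
    (an assumed definition of 'rate of change'), y is the current value."""
    return [(curr - prev, curr) for prev, curr in zip(values, values[1:])]


# Four hypothetical "known" series.
constant = [10.0] * 8                          # unchanging value
growing = [10.0 + 2.0 * t for t in range(8)]   # steady growth
dropping = [30.0 - 2.0 * t for t in range(8)]  # steady decline
# Temperature-like cycle sampled hourly over two days.
cyclic = [15.0 + 5.0 * math.sin(2 * math.pi * h / 24) for h in range(49)]

for name, series in [("constant", constant), ("growing", growing),
                     ("dropping", dropping), ("cyclic", cyclic)]:
    xs = [x for x, _ in phase_points(series)]
    ys = [y for _, y in phase_points(series)]
    print(f"{name:9s} x from {min(xs):6.2f} to {max(xs):6.2f}, "
          f"y from {min(ys):6.2f} to {max(ys):6.2f}")
```

The constant series collapses to one point at x = 0, the growing and dropping series each keep a single x value while y moves (a vertical line), and the cyclic series swings between negative and positive x while retracing the same closed loop on the second day.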
These simple examples help us grasp what is being visualised. But they also show the challenges: a horizontal line has no real meaning! Even with these examples in hand, it is still hard to see what the particular visualisation of real COVID data is telling the viewer without sophisticated interpretative skills.