According to data released by Britain’s Office for National Statistics and a recent BBC article, Brits’ drinking habits are changing. Are the British falling out of love with booze? The data and graphs involved were the topic for Makeover Monday this week.
Thoughts on original charts
The data set provided focussed on the first section and chart in the article. It’s a good, simple bar chart clearly showing the changes in certain categories between 2005 and 2016. What we can’t see is the pattern in the intervening years. We also cannot see the parts that make up the whole in any given year; we can see the proportion drinking at least once during the week before the survey, and the proportion that do not drink at all, but there is no mention of those that drink but did not during the week in question. Finally, the response of drinking on 5 or more days of the week is presumably a subset of those drinking at least once during the week. The bar chart does not represent these sets fully.
An area chart seemed to be the best way to represent parts of a whole over time. I started out with the overall results before breaking the same graph down by gender and then gender and age bracket. Large labels containing headlines outline the changes at a glance:
The viz is available on Tableau Public.
The wine glass, male and female icons were sourced from www.flaticon.com/authors/simpleicon.
The biggest challenge was one I almost didn’t spot! Initially I misinterpreted one of the source data fields, which resulted in the graphs being wrong. The clue was that my numbers were adding up to more than 100% in some cases. Discussion on Twitter about the figures not necessarily adding up to 100%, and comments in the ONS source about 95% accuracy, meant that I attributed my mistake to the source data. The issue kept nagging at me though, and I’m glad I went back and checked as I’d double counted those that drank on 5 or more days!
The way the data was structured meant that I needed to reshape it to create the four categories that made up the whole. A pretty simple exercise within Tableau. I created calculated fields to grab the proportion when the question was relevant. For example:
IF [Question] = 'Drank alcohol in the last week' THEN [Proportion] END
Then I could SUM these up with [Question] excluded from the view and stack the measures. [Aside: I actually made my life harder than it needed to be here and used LOD calculations excluding the question dimension. This was unnecessary given that I didn't need the question dimension in the view in the end].
Calculating the missing categories was just a matter of subtracting the proportions we did have from 1:
1 - ([Drinks on 1 or more days] + [Does not drink at all])
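As a rough illustration of the same arithmetic outside Tableau, here is a minimal Python sketch. The proportions are made-up placeholder values rather than ONS figures, and the function name and category labels are hypothetical. It assumes the four parts of the whole are: does not drink at all, drinks but not in the last week, drank on 1 to 4 days, and drank on 5 or more days. It also shows why stacking the 5 or more days response on top of the drank-in-the-last-week response pushes the total past 100%.

```python
def split_into_parts(drank_last_week, drank_5_plus_days, does_not_drink):
    """Turn three survey proportions into four non-overlapping parts of a whole.

    Assumes drank_5_plus_days is a subset of drank_last_week.
    """
    if drank_5_plus_days > drank_last_week:
        raise ValueError("5+ days should be a subset of drank in the last week")
    return {
        "does not drink at all": does_not_drink,
        # Whatever is left once last-week drinkers and non-drinkers are removed.
        "drinks, but not last week": 1 - (drank_last_week + does_not_drink),
        # Remove the 5+ days subset so it is not counted twice.
        "drank on 1 to 4 days": drank_last_week - drank_5_plus_days,
        "drank on 5 or more days": drank_5_plus_days,
    }


# Placeholder proportions for illustration only (not real survey results).
parts = split_into_parts(
    drank_last_week=0.60, drank_5_plus_days=0.10, does_not_drink=0.20
)
total = sum(parts.values())

# The mistake: stacking the subset on top of its parent set as well.
double_counted_total = total + 0.10

print(parts)
print(f"total: {total:.2f}, double counted: {double_counted_total:.2f}")
```

A total noticeably above 1 is the same clue that gave the mistake away in the chart.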
There was quite a bit of fiddling with the label content and font sizes to get them to fit nicely into the space available. They’re applied to the last point on the (synchronised dual axis) lines that highlight those that drank at least once during the week. If you download the workbook you’ll spot a space and full stop character at the end of each line of the label. These act as padding so that the label is not flush with the edge of the area chart. The full stop is needed as trailing spaces on their own are ignored. To hide it, the full stop is given the same colour as the chunk of the area chart that the label sits over. Not a robust option if the area chart components can vary more dramatically over time, but okay in this context.
I’m pretty happy with my made over chart. I think it represents the part of the story I’ve focussed on clearly and cleanly. I’m less happy with the mistake I made and the unnecessary LOD calculations – but in some ways that’s reinforced a good habit of going back and checking things if they seem wrong or overly complex!