Makeover Monday, 2017 #27-30

I was on holiday for weeks 27 to 30 so caught up with these four later in the year…

Week 27: Tourism in Berlin and Brandenburg

Week 27 was a makeover of a filled map showing visitor stats for Berlin and Brandenburg. I’ve retained the filled maps but focussed on an angle that interested me; whilst Berlin ranks top for both number of visitors and total number of nights in all years, when you look at average nights per visitor it is one of the regional districts that usually tops the list. Check out the interactive version here on Tableau Public where you can look at different years, and explore how the figures differ for domestic or international tourists.

Week 28: Tour de France

I was cycling in France around the time of the Tour. Not quite at the same pace, even if we did pass a few signs! I can highly recommend a barge and bike holiday in Burgundy if you want to immerse yourself in the French countryside. Anyway here is a quick catch-up based on the week 28 data set; is the Tour de France getting easier, or the riders better? Available on Tableau Public here.

Week 29: White House Salaries

Is the Trump administration payroll top heavy? Well there is a bump, but actually I’m not so sure! Makeover below. Also available on Tableau Public here.

Week 30: How thirsty is our food?

That’s a wrap! My final makeover for the year completed on the 29th December.

Statista put together a nice summary of how much water is used when producing various foods, using data from the UNESCO Institute for Water Education. The Statista version is a pretty clear bar chart which has been jazzed up into an infographic style. I recall that there were lots of great submissions from this week but didn’t remind myself of these before attempting my own makeover. Goal one was to show the breakdown by water type and goal two was to visualise the rankings - e.g. which food type uses most green water, which uses most grey. A bump chart eluded me with the blended data set I used, but I quite like the alternative I came up with anyway! Next to the ranking there is a simple bar chart for each water type so we can see the absolute amount of water used (otherwise the ranking could be a little misleading). Both charts are ordered by total water consumption and text boxes added for title, overview and column headings.

I’m intrigued to see that pulses require the most grey water to dilute generated waste water sufficiently. A shame as I do like a good bean surprise for dinner ;o)

The makeover is also available on Tableau Public.

Makeover Monday, 2017 #26

It’s the half way point for Makeover Monday in 2017: 26 charts selected by Andy and Eva, 26 makeovers submitted and 26 blog posts written. It’s been a tough challenge for me to produce a visualisation every week and tougher still to write about each one. However, both of those aspects have been enjoyable, and the practice and reflection has really helped me get more out of Tableau in my job. I’ve had a few submissions selected in the weekly round up and one was selected in a #VizForSocialGood project for use by the Inter-American Development Bank. The way the Makeover Monday project works this year with the weekly wrap up lessons and the community input has made a huge difference to me. If you’re reading this, chances are you’ve helped me so thanks!

The chart in week 26 explored German car production and exports. Nice clear charts. Not much narrative and for me the hover over was a bit much – possibly this was due to the size and imagery used, but it just seemed to get in the way for me.

After exploring the data and seeing other submissions the dip from the global financial crisis and resulting recession was a clear story. What was also interesting was that the percentage of vehicles exported also dropped in step with the reduction in production. Does this mean that Germans weathered the storm a little better than those they export vehicles to? There is also an intriguing spike in export percentage for Trucks late 2012 / early 2013. With the exploration done it looked like I’d have two charts and some narrative. Great I thought – I can pop this in panels coloured to match the German flag. More on this design choice below, but here’s what it ended up like:

You can also check it out on Tableau Public - although there’s very little interaction, just a bit of hover over.

I quite liked the design choice to mimic a German Flag, but I don’t think it will suit or appeal to everyone. The background colours are obviously quite bright and may not work for those that need a low contrast visual. There has been some interesting discussion amongst the community this week about design and data; when does design detract from the ability to see the stories in the data? I suspect I’ve fallen into the trap of emphasising a design element over the data. If time allowed I’d have a go at toning those colours down, or reversing them so that the colours come out in foreground elements instead.

Makeover Monday, 2017 #25

More Exasol-based data in Makeover Monday week 25. 200 Million+ ozone air quality readings from the EPA and a goal to make over a multi-year air quality tile plot on the EPA website. I spent way too much time exploring the data so only had time for a quick makeover in the end and this short blog post. Check out some of the other participants’ efforts on Twitter or wait for Andy and Eva’s weekly summary to see some really cool visualisations.

Makeover Monday, 2017 #24

Week 24 of Makeover Monday and a fascinating data set of art work in the Tate Collection. Nominally we’re making over charts from an article by Florian Kräutli such as this one:

Thoughts on the original charts

The chart highlighted to Makeover Monday participants was a pie chart showing the proportion of works in the collection by Turner versus all other artists. It works well to illustrate a key point about the data and explain why the author excluded Turner’s works from a number of their beautiful visualisations; when Turner’s works are included they skew the data to the point where other artists can’t be seen. The Tate Collection includes the Turner Bequest of roughly 30,000 works of art. Many of these art works are unfinished or preparatory sketches – e.g. each page in a sketch book is counted as a separate work. Angie Chen’s submission explains this nicely and is well worth checking out along with the original article.

Makeover

For the makeover I wanted to take on the challenge of showing Turner and other artists in the same viz, without skewing the data. I recalled various art timelines from school days and a Gantt chart seemed like a good way to achieve this. The Gantt chart would show the range of years for each artist’s work, and when the artists were ordered by start year it would have the feel of a timeline. I then experimented with overlaying semi-transparent marks for the actual art works, the aim being to have a denser / more packed overlapping set of marks for Turner than other artists:

The timeline is also available on Tableau Public here, where you get the option to hover over and click a link to see an example piece of art.

Challenges

There were still too many artists to show so I filtered the list to exclude art not attributed to a year and to focus on the top 25 artists which helped to keep the timeline concise. Highlighting the top 5 also helps the casual reader. Some fiddling around was needed to get labels formatted nicely when including the highlighting. I’ve just employed the usual trick of having two calculations returning a value or empty string in opposite circumstances.
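The label trick mentioned above can be sketched outside Tableau. This is only an illustration of the idea in Python, not the actual calculated fields from the workbook: the function name, the `rank` input and the `top_n` cut-off are all assumptions made for the example. The point is that exactly one of the two labels is ever non-empty, so each can be formatted differently.

```python
def split_label(artist, rank, top_n=5):
    """Return (highlight_label, normal_label) for one artist.

    Hypothetical helper: mirrors the idea of two calculations that
    return a value or an empty string in opposite circumstances.
    """
    is_top = rank <= top_n
    highlight_label = artist if is_top else ""   # shown in the highlight format
    normal_label = "" if is_top else artist      # shown in the plain format
    return highlight_label, normal_label


# Exactly one of the two labels carries the text for any given artist.
for name, rank in [("Artist A", 1), ("Artist B", 12)]:
    print(name, split_label(name, rank))
```

Placing both labels side by side then gives the appearance of a single label whose formatting changes with the highlight.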

Formatting the Gantt bars was probably the biggest challenge. A Gantt bar will stretch to a point and not beyond, whereas if you plot a shape at that same point it is centred on the point and extends past it by half of its size. I wanted to achieve a look where the Gantt bar was simply a box around the collection of points, so to start with I ended up with the start and end points spilling out of the box – definitely not the look I was after.

To get around the marks spilling out of the boxes I created additional calculations that extended the ends of the boxes by enough years to fix the formatting. Is this a fudge? Is it a bad thing to do? To an extent yes, because it misrepresents the data! But in its defence most viewers won’t get (or need?) an accurate idea of the specific start and end year on a first glance anyway, and the actual years are included in the hover over tooltip.
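As a rough sketch of that fudge, here is the arithmetic in Python rather than in Tableau calculations. The function name and the `pad_years` value are assumptions for illustration; the actual number of years used in the workbook isn't stated. The bar is widened at both ends while the true years are kept separately for the tooltip.

```python
def padded_bar(years, pad_years=2):
    """Compute a widened Gantt bar for one artist.

    `years` holds the year of each art work. `pad_years` is a
    hypothetical padding chosen so that the marks sit inside the box.
    """
    start, end = min(years), max(years)
    return {
        "bar_start": start - pad_years,                # drawn start, earlier than the data
        "bar_length": (end - start) + 2 * pad_years,   # drawn length, padded both ends
        "tooltip_start": start,                        # true first year, for the tooltip
        "tooltip_end": end,                            # true last year, for the tooltip
    }


print(padded_bar([1800, 1805, 1810]))
```

Keeping the unpadded years alongside the padded ones is what lets the tooltip stay accurate even though the drawn box is not.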

A few participants hit issues using the URLs included in the data set to pull in images of the art work. The conclusion seemed to be that the Tate site didn’t allow its content to be iframed. I didn’t try to tackle this and instead just provided a link to click through to some examples.

Conclusion

I think I’ve achieved my goal with the viz. If time allowed I’d work on the option to view the actual pieces of art and more details about them. I briefly toyed with producing ASCII versions of the art work for inclusion in my tooltips; hover over a mark to see the piece of art …. kind of! Could have been a good excuse to create a web data connector maybe. I also wondered whether I could have nested a spark line or histogram within the Gantt chart bars. No shortage of ideas with this week’s data!

Makeover Monday, 2017 #23

Makeover Monday was live from TCOT in London this week and it was amazing to see some of the output produced in just one hour! The challenge was to redo an already great graph from FiveThirtyEight on US National Park popularity:

Thoughts on the original chart

I like the original chart – actually the whole article is really interesting and the charts engaged me throughout. The chart focussed on above shows the ranking nicely, but not the actual number of visitors over time. So we don’t know how much more popular Great Smoky Mountains has been than Rocky Mountain or Yosemite. It’s hard to follow some of the threads without interactivity – although the colour coding of some of them certainly helps. You also can’t tell which states the parks in question are located in.

Makeover

I wanted the story to develop from the overall figures (a little like the opening section of the article), via a state-by-state picture, to the detailed data for specific parks. A tile map seemed like a great way to cover the state-by-state picture, but it was hard to get a sense of the numbers so I’ve included the latest recreational visitor count in each tile where applicable:

The viz is also available on Tableau Public where you get a little extra hover over functionality.

Credits and challenges

Thanks to Matt Chambers’ blog post and Brittany Fong’s template for the US tile map approach – well worth a read if you haven’t already.

There were a few challenges over and above the base tile map:

Data blending limitations. I used the Makeover Monday data set exactly as is, so it contains data at a park level which is more detailed than the tile map template. Also the filtered park level data didn’t contain data for all states. A blend in Tableau is like a left outer join - you get all records from the primary data source and any records that link to it in the secondary data source. So from this perspective it seemed best to use the tile map template as the primary. But blends work best when the most detailed data set is the primary data set – I couldn’t get the cases where there were multiple parks per state to display properly; the dreaded * issue. Uh oh. Catch 22? To resolve this I cheated a bit – the parks data actually had data for each state when not filtered to a type of “National Park”. So I removed the filter and added a calculation to just SUM up recreational visitors when the type was “National Park”. Perhaps I should have found a better way to do this – e.g. ensure everything I was displaying was aggregated at state level, like MIN([State]), or just create a suitable data set outside of Tableau?
Adding summary info to each tile. To achieve the summary section at the bottom of each tile I forced the value axis to stretch to a negative value, and then added a fake data point on the last year midway towards this negative value. That data point is plotted as a blank shape and given the label that I wanted. A little bit of fiddling to display a different label (with the same text) for states with no parks so that I could give these a different colour and it was job done. What frustrated me was that I couldn’t put the state abbreviation in the top left whilst having the visitor number in the bottom right. A lack of a guaranteed data point for the earliest year in each state with national parks prevented me from doing the top left label – nothing to hook a fake data point on to.
Getting a different grey background for each part of the tile. I used good old reference bands here. The downside is that you get some spurious info about the reference lines when hovering for tool tips.
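
The blending workaround from the first challenge can be sketched in plain Python. This is an analogy rather than how Tableau implements blends: the tile list, the park rows and the visitor numbers below are all made up for the example, and the function name is hypothetical. It shows the two ideas together: every tile from the primary source is kept (the left outer join behaviour), and visitors are only summed when the type is “National Park” instead of filtering the rows out first.

```python
# Hypothetical sample data: tile template states and park-level rows.
TILES = ["CA", "NV", "KS"]
PARKS = [
    {"state": "CA", "type": "National Park", "visitors": 100},
    {"state": "CA", "type": "National Park", "visitors": 50},
    {"state": "CA", "type": "National Monument", "visitors": 7},
    {"state": "KS", "type": "National Historic Site", "visitors": 3},
]


def blend_tiles(tiles, parks):
    """Aggregate parks to state level, then left-join onto the tiles."""
    totals = {}
    for row in parks:
        state = row["state"]
        # Register the state even if none of its rows are national parks.
        totals.setdefault(state, 0)
        # Conditional sum: only national parks contribute visitors.
        if row["type"] == "National Park":
            totals[state] += row["visitors"]
    # Left outer join: every tile appears; unmatched tiles get None.
    return {state: totals.get(state) for state in tiles}


print(blend_tiles(TILES, PARKS))
```

Because the park rows are aggregated to one value per state before being matched to the tiles, there is never more than one value per tile, which is the situation that otherwise leads to the * issue.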

Conclusion

I felt that the viz petered out a little. If time allowed I’d like to have experimented with making each state tile act as a filter so that the reader could view the detailed park data for whichever state they wanted. Other than that some graphical content to tie in with the subject of national parks would be good. Overall though I think my makeover achieved my goals and it was a good chance to try out a US tile map … and work through some data blending challenges!

Makeover Monday, 2017 #22

Just a quick write up this week. Eva selected a map from Knoema showing what proportion of each country’s population had internet access. I quite like the interactive map, but it suffers from some problems common to filled maps. Eva and Andy have talked about the use of maps a few times in their weekly write ups so this week I thought I’d explore the issue in a bit more detail.

Makeover

The original visualisation had a headline focussing on the top 5 countries for internet access. The top 5 in 2015 includes three relatively small countries making it a great angle to focus on for what I wanted to do. Rather than just write about the problems with a filled map I wanted to illustrate the issue. Here is the end result – hopefully it speaks for itself?

The interactive viz is also available on Tableau Public.

Final thoughts

I’m not saying filled maps are bad. I’ve used them previously to good effect. I also like the extra context that the maps add to my makeover; the end result is arguably more engaging than the bar chart alone. In addition there have been some really nice map based makeovers this week that serve to highlight some key themes. But map-based visualisations aren’t without their problems, and there are aspects of this week’s data and story that illustrate some of those problems.

Makeover Monday, 2017 #21

According to data released by Britain’s Office for National Statistics, and a recent BBC article, Brits’ drinking habits are changing. Are the British falling out of love with booze? The data and graphs involved were the topic for Makeover Monday this week.

Thoughts on original charts

The data set provided focussed on the first section and chart in the article. It’s a good simple bar chart clearly showing the changes in certain categories between 2005 and 2016. What we can’t see is the pattern in the intervening years. We also cannot see the parts that make up the whole in any given year; we can see the proportion drinking at least once during the week before the survey, and the proportion that do not drink at all, but there is no mention of those that drink but not during the week in question. Finally the response of drinking on 5 or more days of the week is presumably a subset of those drinking at least once during the week. The bar chart does not represent these sets fully.

The makeover

An area chart seemed to be the best way to represent parts of a whole over time. I started out with the overall results before breaking the same graph down by gender and then gender and age bracket. Large labels containing headlines outline the changes at a glance:

The viz is available on Tableau Public.

Credits

The wine glass, male and female icons were sourced from www.flaticon.com/authors/simpleicon.

Challenges

The biggest challenge was one I almost didn’t spot! Initially I misinterpreted one of the source data fields, which resulted in the graphs being wrong. The clue was that my numbers were adding up to more than 100% in some cases. Discussion on Twitter about the figures not necessarily adding up to 100% and comments in the ONS source about 95% accuracy meant that I attributed my mistake to the source data. The issue kept nagging at me though and I’m glad I went back and checked as I’d double counted those that drank on 5 or more days!

The way the data was structured meant that I needed to reshape it to create the four categories that made up the whole. A pretty simple exercise within Tableau. I created calculated fields to grab the proportion when the question was relevant. For example:

IF [Question] = 'Drank alcohol in the last week' THEN [Proportion] END

Then I could SUM these up with [Question] excluded from the view and stack the measures. [Aside: I actually made my life harder than it needed to be here and used LOD calculations excluding the question dimension. This was unnecessary given that I didn't need the question dimension in the view in the end].

Calculating the missing categories was just a matter of subtracting the proportions we did have from 1:

 1 - ([Drinks on 1 or more days] + [Does not drink at all])
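
To make the arithmetic concrete, here is a small Python sketch of splitting the survey figures into four parts that add up to the whole. It assumes, as discussed above, that those drinking on 5 or more days are a subset of those who drank at least once in the week. The function name, category labels and percentages are invented for the example and are not ONS figures; whole-number percentages are used to keep the sums exact.

```python
def four_categories(drank_last_week, drank_5_plus_days, does_not_drink):
    """Split whole-number percentages into four non-overlapping parts.

    Assumes the 5+ days group is a subset of those who drank last week,
    so it is subtracted rather than added (adding it double counts).
    """
    return {
        "Drank on 5 or more days": drank_5_plus_days,
        "Drank on 1 to 4 days": drank_last_week - drank_5_plus_days,
        "Drinks, but not last week": 100 - (drank_last_week + does_not_drink),
        "Does not drink at all": does_not_drink,
    }


# Hypothetical percentages, purely for illustration.
parts = four_categories(drank_last_week=60, drank_5_plus_days=10, does_not_drink=20)
print(parts, sum(parts.values()))
```

Checking that the parts sum to 100 is the same sanity check that exposed the double counting in the first place.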

There was quite a bit of fiddling with the label content and font sizes to get them to fit nicely into the space available. They’re applied to the last point on the (synchronised dual axis) lines that highlight those that drank at least once during the week. If you download the workbook you’ll spot a space and full stop character at the end of each line of the label. These act as padding so that the label is not flush with the edge of the area chart. The full stop is needed as spaces on their own are ignored. To hide it, the full stop is given the same colour as the chunk of the area chart that the label sits over. Not a robust option if the area chart components can vary more dramatically over time but okay in this context.

Conclusion

I’m pretty happy with my made over chart. I think it represents the part of the story I’ve focussed on clearly and cleanly. I’m less happy with the mistake I made and the unnecessary LOD calculations – but in some ways that’s reinforced a good habit of going back and checking things if they seem wrong or overly complex!

Makeover Monday, 2017 #20

For week 20 Makeover Monday is collaborating with #VizForSocialGood and Inter-American Development Bank to look at youth employment trends in Latin America and the Caribbean. Great data set, great cause and a great opportunity for our data visualisations to make a difference. For some reason I also felt an increased sense of responsibility to understand and accurately represent the data!

Thoughts on the original charts

The original article walks the reader through the overall headline figures, explaining the various categories and ending with a look at the sectors that young people are employed in. I spent a lot of time trying to understand how the various categories (Ninis, Nininis, unemployed but studying, informally employed, etc) added up to the total number of 15-24 year olds. So much so that in the end this seemed like a good angle to visualise. If I was struggling to make sense of the categories then there was a good chance others were too, and so explaining that graphically would be valuable.

The makeover

Design wise I wanted to bring in key numbers from the original article as headlines, but present and compare the proportions graphically. Pie charts were an option for the graphical component (given a limited number of parts to the whole), but waffle charts seemed to be a better fit for the flow of the visualisation:

The visualisation is also available on Tableau Public, where you can choose a country to drill into.

Credits

A quick nod to Andy Kriebel and his very helpful blog post and video on producing waffle charts in Tableau. This was my first attempt to create a waffle chart and Andy’s video was invaluable!

Challenges

The first challenge was understanding the categories! You’ll note from the final waffle chart that I’m not quite there yet. If you hover over the grey section on the left (in the interactive version) you’ll see that I’ve labelled it “unknown”. I’m guessing that this category has to be those 15-24 year olds who are studying or training and are not seeking work. It’d be great to hear other people’s thoughts on whether this is correct, and whether I’ve accurately represented the categories.

Challenge two was a bit more prosaic. I built the headline components of the dashboard as worksheets in their own right. Each of these headline worksheets had a single text label incorporating the various numbers with text. What I hadn’t remembered until I finished was that I couldn’t apply a filter from one worksheet to another worksheet with a different primary data source. Rats! I had to go back and start these again, pulling in a dummy value from the waffle chart grid data source so that each worksheet had the same primary data source. Was there a better way to do this? If so I’d love to hear about it.

The final challenge was colouring the text in the headlines to avoid the need for a colour legend underneath each waffle chart. I could improve this aspect because the colours develop as the categories are expanded and consequently some colours are technically given two meanings.

Final thoughts

What else might I change? A waffle chart isn’t always as accurate as a pie chart (unless you can show enough squares!) so the eagle eyed will notice some rounding issues – e.g. two squares shaded for the 2.5 million unemployed who are studying out of 100 million young people. It would probably have been better to have more than 100 squares in each waffle to allow for more accuracy. Adding the percentage into the hover over would help here.
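
The rounding issue is easy to see with a quick calculation. This is a Python sketch of the arithmetic only, not the Tableau logic behind the chart; the function name is hypothetical and the 400-square grid is just one possible alternative size. One caveat: Python’s built-in `round` rounds exact halves to the nearest even number, so 2.5 becomes 2 here, whereas other tools may round it up to 3.

```python
def waffle_squares(part, whole, n_squares=100):
    """Return (squares shaded, percentage those squares represent)."""
    # Multiply before dividing so whole-number inputs stay exact.
    squares = round(part * n_squares / whole)
    shown_pct = squares * 100 / n_squares
    return squares, shown_pct


# 2.5 million out of 100 million young people is 2.5%.
print(waffle_squares(2_500_000, 100_000_000, n_squares=100))
print(waffle_squares(2_500_000, 100_000_000, n_squares=400))
```

With 100 squares the chart can only show whole percentages, so 2.5% is drawn as 2%. With 400 squares each square is worth 0.25%, and 2.5% is represented exactly by 10 squares.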

Part of me thinks that a concluding paragraph would be useful, but I wasn’t confident adding this with the unanswered question of the unknowns. Nevertheless I’ve learnt a lot about issues in employment for young people in Latin America, used a new chart type and hopefully contributed something valuable to the overall conversation and understanding.

Makeover Monday, 2017 #19

We’re redoing a list based on Dutch car registration data this week for #MakeoverMonday. The list doesn’t really do the data set justice, but I don’t speak Dutch so haven’t dug into the rest of the story! The actual figures were hard to reproduce and seemed like a niche part of the data, so I looked for a different story and decided to focus on the most popular makes of car. Headline figures give the reader some context as to how many registrations there were in 2015 and 2016, as well as the general growth rate. The slope chart then shows registrations for the top 5 makes and how these have changed:

The visualisation is also available on Tableau Public.

It was great to try a slope chart this week. Also a quick shout out / credit to Charlie Hutcheson and Pooja Gandhi re the dashed lines in the top section of the viz – thanks for the write up on this technique.

Makeover Monday, 2017 #18

A look at Sydney ferry patronage for week 18 of Makeover Monday based on Transport for New South Wales Open Data.

Thoughts on the original chart

The chart being made over is actually a series of Tableau dashboards within a story (set of tabs). I like the way I can work through the story from an overview of the data to some summary charts and then down to some detail. The card type dashboard interested me. The key story that jumped out to me was that around 70% of trips were made using an Adult Opal Card. I don’t think we need to see this proportion visually per month and then again per line. Perhaps other angles from the data could have been visually represented too? Nice dashboard though and I enjoyed clicking through to the map for some context.

My makeover

I wanted to try getting the breakdown by ferry line, card type and month onto one viz so targeted an iPad portrait layout. First up I tried a heat map. This was okay but not quite what I was looking for.

A line or bar chart of trips per month, with a panel per ferry line and card type worked well but there were too many card types to show nicely! Also the trips using certain card types (e.g. employee) were negligible compared to the main types. Grouping the types together allowed me to fit everything in. School and Concession (concession being for tertiary students) seemed to bundle in with Child / Youth quite nicely, and single trips could be bundled in with other outliers.

The viz ended up looking like this:

This allows me to see at a glance the top lines and card types, plus I get a brief idea of trends over time from the monthly bars (with labels on the first and max values). In some ways it’s working a little bit like a heat map if you consider whitespace a lack of heat in comparison to the space taken up by the bars. The bottom right part is a little empty. Still, that does tell a story.

The viz is also available on Tableau Public.

A note on image reuse

My original design concept had a different image at the top of each column; I wanted to use images of the main Opal card types. My thoughts were that this would add useful context for those users who were familiar with Sydney ferries or similar networks. It would be clear at a glance that column one related to the ferry lines and the subsequent columns to the various card types.

Unfortunately whilst the data is open access under a Creative Commons licence, the logos and trademarks are not. I interpreted this to mean that I couldn’t use the Opal card type images, because they included trademarked logos. I followed up with the relevant department (who were very helpful) just to be sure and had those thoughts confirmed. No drama – it was useful experience to research this angle and I’ve still been able to use coloured rectangles to give my viz some context. In fact if I were redoing the viz I’d probably use that space for some summary numbers whilst retaining the colour for context. I’d also revisit my column headings to make the groupings clearer.