Makeover Monday, 2017 #17

This week Tableau themselves chose the data: a celebration of the importance of data skills, based on LinkedIn’s annual Top Skills reports. The Tableau example includes a bump chart and a box and whisker plot, whereas the LinkedIn version is a simple list per country in a slide share style:

LinkedInSkillsReport




Thoughts on the original charts

The LinkedIn version is simple and clear. You can see which skills are ranked highest and how that ranking has changed since last year. However it is hard to compare the countries to each other or to the overall global ranking (you have to flick back and forth in the slide deck). Also you can’t see how rankings have changed from years prior to 2015.

The Tableau version addresses some of these issues. It’s still a little hard to compare countries to the overall global ranking. The bump chart shows the change in ranking across multiple years, but personally I find bump charts a little hard to follow particularly if there’s no labelling or colour legend.

 

Is there such a thing as spark dots?

My first idea was to build a set of spark lines showing the current and historical rankings for each skill. Unfortunately this didn’t look great as some skills don’t have rankings in some years. The spark lines looked a bit disjointed and in some cases were just spark dots, which presumably isn’t a real thing!

 

So, what could I add to fix this?

Hang on, why did I want to add something when the data was just a ranked list? Instead I decided to step right back and try a simple styled list. I’d still want to show change in rank from 2015, but would leave out comparisons to 2014. This would still present a challenge as I’d need to balance the emphasis on the skill versus the information on ranking changes, and I’d need to deal with cases where there was no change in ranking or the skill was new to the rankings. I’d also have the challenge of comparing different countries.

Now to be honest I passed a bit on this second challenge, deciding that with limited space (and audience attention) it was better to just provide a comparison to the overall global ranking and the option to choose the country. So I wasn’t going to address all of the points I noted with the originals, but I was happy that I’d be honouring the nature of the source data.
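To make the edge cases concrete, here is a minimal Python sketch of the rank-change logic described above (moved up, moved down, no change, or new to the rankings). It is not the Tableau calculation from the workbook; the function name and the label wording are made up for illustration.

```python
from typing import Optional


def rank_change_label(rank_2016: int, rank_2015: Optional[int]) -> str:
    """Describe how a skill's rank moved between 2015 and 2016.

    Lower rank numbers are better, so a positive difference means the
    skill has climbed the list.
    """
    if rank_2015 is None:
        return "new"  # the skill had no ranking in 2015
    moved = rank_2015 - rank_2016
    if moved > 0:
        return f"up {moved}"
    if moved < 0:
        return f"down {-moved}"
    return "no change"
```

Handling the missing 2015 rank first keeps the arithmetic away from empty values, which is the same reason the "new" case needs its own label in the viz.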

Here is the finished viz:

LinkedInSkillsMakeover




The core of the viz is a pretty basic matrix of text by rank and country. Check out the version on Tableau Public if you want to dig into anything in particular. You may note that time could be spent making the workbook more robust as I have hardcoded 2015 and 2016 as the years to use, instead of using LOD or table calcs to find the latest and immediately previous year. There were two particular tricks I used this week that seemed to be worth elaborating on though:

 

Showing just the selected country and the global list

To handle this I didn’t just want to use the existing country field as a filter because I’d be relying on the user to always have one country plus “Global” selected. Instead I created a parameter with all countries except global (tip: you can use the “add from field” option when creating a parameter to save typing all of the options in). I could then add a workbook filter on a calculated field to only include global and the selected country:

Step3


I also created a calculated field for “country order” so that global always appeared in the right hand column. Otherwise sort order would be alphabetic and sometimes global would appear in the left hand column.
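The filter and the sort order can be sketched in a few lines of Python. This is only an illustration of the logic, not the Tableau calculated fields themselves; the names `keep_row` and `country_order` and the sample rows are assumptions made up for the example.

```python
GLOBAL = "Global"


def keep_row(country: str, selected_country: str) -> bool:
    """Filter: keep only the global list and the country chosen in the parameter."""
    return country == GLOBAL or country == selected_country


def country_order(country: str) -> int:
    """Sort key: the selected country sorts first (left), Global last (right)."""
    return 1 if country == GLOBAL else 0


# Made-up rows, for illustration only.
rows = [
    {"country": "Global", "rank": 1},
    {"country": "Australia", "rank": 1},
    {"country": "Brazil", "rank": 1},
]

visible = sorted(
    (r for r in rows if keep_row(r["country"], "Brazil")),
    key=lambda r: country_order(r["country"]),
)
columns = [r["country"] for r in visible]
```

Because the sort key depends only on whether the row is global, the column order stays the same whichever country is chosen, rather than flipping when a country sorts alphabetically after "Global".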

 

Colouring the text in EACH column differently

This was quite fiddly. I’d already created calculations to show an up arrow, down arrow and relevant text for the supporting label about each skill. I then realised that I wanted the selected country to stand out more than the global list. To do this I ended up creating two calculations, one for global which would return NULL when country was not global and one for the selected country which would return NULL in the opposite situation. The idea here is that only one of the calculated fields ever returns a value so I could give each a different colour safe in the knowledge that only one of them would be displayed in any one cell. Note the lack of an ELSE block in the calculation below which ensures NULL is returned. I could have used ELSE “” given that the calculation is used in label text of course, which may have been more maintainable?

Step2



Step1


If you’re wondering what those strange symbols are in the label definition above, it’s just that I’ve selected the Wingdings 3 font for the calculated fields that return an up or down arrow.
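The "only one of the two fields ever has a value" idea is easier to see in code. Below is a minimal Python sketch, with `None` standing in for Tableau's NULL. The function names are hypothetical and this is not the calculation shown in the screenshots.

```python
from typing import Optional

GLOBAL = "Global"


def global_label(country: str, skill: str) -> Optional[str]:
    """Label for the global column; None (NULL) in every other column."""
    if country == GLOBAL:
        return skill
    return None  # mirrors leaving out the ELSE block


def country_label(country: str, skill: str) -> Optional[str]:
    """Label for the selected country's column; None in the global column."""
    if country != GLOBAL:
        return skill
    return None


def cell_text(country: str, skill: str) -> str:
    """Join both labels; exactly one is present, so each can carry its own colour."""
    parts = [global_label(country, skill), country_label(country, skill)]
    return "".join(p for p in parts if p is not None)
```

Returning an empty string instead of `None` would give the same visible result in a text label, which is the trade-off mentioned above.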

 

Job done!

Another great week for honing Tableau skills. Nothing I’ve done this week is rocket science by any means, but as someone still learning the product, and about data visualisation as a whole, it feels like I’m making progress. And of course it’s pretty cool to see that some of the skills used are in demand!

Makeover Monday, 2017 #16

Week 16 and it’s back to some big data on EXASOL thanks to Eva and Johannes. This time it’s over 750 million rows of GP practice prescribing data for the UK from 2010 – 2017. The nominal challenge was to make over some of the charts in a House of Commons research briefing based on the data.

orig




Like most of the Makeover Monday community I didn’t find the graphs particularly exciting or engaging, but I did find the research briefing interesting and it covered some of the key questions I had in mind about prescriptions as a topic, such as what was happening with antibiotic use.

Also like others I didn’t simply try to recreate the graphs in the briefing. Actually that’s not entirely true, I do tend to start these makeovers by recreating at least one of the graphs or data tables. Recreating something is a good way to check that I’m getting the right numbers and helps me to orientate myself to the data. For my actual makeover though I wanted to dive into some of the detail like many others. One of the early submissions by Adam Crahen identified the growth in prescriptions of Apixaban. I knew a bit about anticoagulants, so this felt like a good category to drill into:

DASH




The first step was to identify the main anticoagulants being prescribed. I picked these four as an obvious story seemed to emerge; the move away from the traditionally prescribed Warfarin to three new novel oral anticoagulants. A little insider knowledge made me include Dabigatran as it doesn’t really stand out from the crowd (the remaining grey lines) unless you look closely.

Next I sourced a reusable image relating to the vascular system. Now at this point I got pretty excited. I thought that I could blend the arteries, etc. from the image into the lines in the graphs. So I had the timeline running down the viz and was trying to resize things to line up nicely. End result? It looked terrible! Back to the drawing board, and I’m glad that I went.

Instead I decided to use the image as a backdrop to some context (using the title and the number of anticoagulant items prescribed compared to the overall number of items prescribed) and for aspects of the colour palette. A much cleaner result I think, whilst still retaining some visual impact.

The final step was to source detail for the labels on the left (the story) and edit this down to fit. Reducing the text down to what I ended up with was probably the second hardest part of the process! If I were revisiting this visualisation I’d tie the story into the headline a little more by specifically mentioning heart attacks in the section on conditions. I would also try to pull in some antiplatelet prescription data as aspirin has a part to play in the story.

In terms of new things tried in Tableau, this week I went with an entirely floating layout for the dashboard to see if that helped with alignment variations between devices and Tableau Public. The approach of floating everything seems to have helped a bit, although I maybe should have split the story points into individual text objects so that they would better align to the aspects of the graphs that they referred to?

On a final note it was great to play with EXASOL again – a pleasure exploring and working with this large data set remotely.

My makeover is also available on Tableau Public.

Makeover Monday, 2017 #15

Uh oh! Making over a 2009 graph by Andy Kriebel this week. No pressure. The graph looks at the correlation between gold and oil prices with the premise that they’re closely related. You can see Andy’s own makeover here. Another of my favourites is one by Mike Cisneros, which you can read about here. I love the story telling element Mike achieves; something I was aiming for myself.

I started off looking at each price as a percentage of its average over the period (because an ounce of gold is much more expensive than a barrel of crude oil). This allowed me to produce an area chart showing which commodity was most “expensive” compared to its average price at each point during the period:

Dashboard 1
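The normalisation behind that chart is simple enough to sketch in Python. This is an illustration under assumptions rather than the workbook's calculation: the function name is hypothetical and the prices are made-up numbers, not the Makeover Monday data.

```python
from statistics import mean
from typing import List


def percent_of_average(prices: List[float]) -> List[float]:
    """Express each price as a percentage of the series' own average."""
    avg = mean(prices)
    return [100.0 * p / avg for p in prices]


# Made-up prices, for illustration only.
gold = [400.0, 800.0, 1200.0]  # dollars per ounce
oil = [20.0, 60.0, 100.0]      # dollars per barrel

gold_pct = percent_of_average(gold)
oil_pct = percent_of_average(oil)

# Which commodity is further above its own average at each point?
more_expensive = [
    "gold" if g > o else "oil" if o > g else "equal"
    for g, o in zip(gold_pct, oil_pct)
]
```

Dividing by each series' own average puts both commodities on the same scale, so the comparison is about relative expense rather than dollars.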




I didn’t love this as I couldn’t quite work out what I was trying to tell the viewer, so moved on!

Next up I considered scatter plots by decade with the aim of highlighting outlying points that weren’t fitting the correlation. My thinking was to display one scatter plot per decade – ideally in a tile like view. I probably should have pursued this idea as I think it would have allowed me to address a couple of concerns with the original chart. Instead I developed the idea of a tile per decade, focussing on four key fluctuations in the gold oil price ratio:

gold oil ratio 2
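For reference, the ratio plotted in each tile is just the gold price divided by the oil price, i.e. how many barrels of oil an ounce of gold would buy. Here is a minimal Python sketch of the ratio and the per-decade grouping. The helper names and the sample observations are made up for illustration and are not taken from the data set.

```python
from typing import Dict, List, Tuple


def gold_oil_ratio(gold_per_ounce: float, oil_per_barrel: float) -> float:
    """Barrels of oil that one ounce of gold would buy."""
    if oil_per_barrel <= 0:
        raise ValueError("oil price must be positive")
    return gold_per_ounce / oil_per_barrel


def decade(year: int) -> str:
    """Label a year with its decade, e.g. 1986 becomes '1980s'."""
    return f"{year // 10 * 10}s"


# Made-up (year, gold price, oil price) observations, for illustration only.
observations = [(1986, 360.0, 15.0), (2008, 870.0, 100.0), (2016, 1250.0, 50.0)]

# One bucket of (year, ratio) points per decade, i.e. one per tile.
tiles: Dict[str, List[Tuple[int, float]]] = {}
for year, gold, oil in observations:
    tiles.setdefault(decade(year), []).append((year, gold_oil_ratio(gold, oil)))
```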




I really like the way I can make each tile focus in on a key fluctuation. The tile look was pretty straightforward to achieve. I created four separate charts – one for each decade – and set the background colour for each chart panel and axis. I appreciate that I’m not addressing every change in the ratio of gold to oil price, and perhaps haven’t even picked the most important ones, but it does allow a simple story to emerge for the reader. My story points focus on oil prices and were sourced from Wikipedia (1980s oil glut; 2000s energy crisis; 2010s oil glut). It would have been nice to add a parameter allowing the reader to choose what sort of story they wanted to read – one skewed to oil, gold or a mix. Unfortunately time didn’t allow.

In hindsight I think I may have added too much to each tile. The mini spark lines for prices added a bit of context for each story point, but had to be layered on as floating charts, which was fiddly. They could also be misread to indicate that the gold and oil prices were quite close at certain points because I haven’t synchronised the axes. I’ve tried to address that issue with a note, but in doing so I’ve added more to an already crowded first tile. Learning point: once I start adding I often end up adding more; better to take away! In my defence I did do some pruning. Initially there was a large number at the top right of each tile showing the closing ratio for each decade. I removed that number as it seemed better, and more space effective, to label the key line points I was referring to in the story points.

The story points are added as floating labels. The positioning of these varies depending on what screen resolution I’m using – you’ll see that the Tableau Public version doesn’t look quite the same. It would have been nice to make the viz more robust by using labelled points instead of the floating text.

All in all though a good opportunity to focus on story telling, and learn about oil and gold prices!

Makeover Monday, 2017 #14

The Guardian recently reported on a PwC report into the risk of UK job losses from breakthroughs in robotics and artificial intelligence. Whilst technological unemployment isn’t a new concept, and the report and article certainly do comment on job creation too, the percentages at risk in some sectors are quite striking.

The graphic in the article appears to be clean and easy to read at a glance. However, I found it a little hard to interpret the “job automation at potential high risk” percentage, particularly as it is not immediately comparable to the other percentage shown and the definition of high risk is itself based on a percentage. To be fair, the article does go on to clarify the value, but it did initially confuse me.

 

Can we automate our own job to do a makeover?

Before tackling a makeover in Tableau I thought it would be interesting to see if the act of doing a data visualisation could actually be automated, e.g. by IBM Watson Analytics. Signing up for a 30-day IBM Watson trial was quick and easy after which I could upload the source data spreadsheet and have it prepared for analysis. Once prepared I could either ask a structured question of the data set or choose from some recommended forms of analysis for the measures and dimensions identified. I didn’t spend long on this, and whilst a visualisation could indeed be produced with little input from me, it certainly had some flaws. For example the total/all dimension value was included which skews the chart.

Here is the viz that was produced:

UK jobs at risk of automation - IBM Watson




Doing the makeover manually – much better?

Moving on to the Tableau makeover, I really liked a submission by Tamara Gross and have borrowed some of the styling from that. My makeover follows below. I’ve converted the percentage of jobs at risk into a percentage of total UK jobs so that it is more comparable with the overall share of employment – we don’t lose the actual proportion at risk as that is apparent from the stacked bars. However we gain a set of percentages that sum up to the overall headline of 30%. The viz is also available on Tableau Public.

DASH
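The conversion described above is a single multiplication per industry: the industry's share of UK employment times the proportion of that industry's jobs at high risk. A minimal Python sketch follows. The function name is hypothetical and the two industries use made-up figures, not the PwC numbers.

```python
def at_risk_share_of_all_jobs(employment_share_pct: float, at_risk_pct: float) -> float:
    """Jobs at high risk in an industry, as a percentage of ALL UK jobs.

    employment_share_pct: the industry's share of total UK employment (%).
    at_risk_pct: share of that industry's own jobs at high risk (%).
    """
    return employment_share_pct * at_risk_pct / 100.0


# Made-up figures for two hypothetical industries, for illustration only.
industries = {
    "Industry A": (15.0, 40.0),
    "Industry B": (10.0, 20.0),
}

shares = {
    name: at_risk_share_of_all_jobs(emp, risk)
    for name, (emp, risk) in industries.items()
}
total_at_risk = sum(shares.values())
```

Because every value is now a percentage of the same total, the per-industry figures can be added up, which is what lets them reconcile with a single headline number.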




But surely we can automate some of the data visualisation process?

Continuing the theme of whether the data visualisation process can be automated I took a look at the Narrative Science Narratives for Tableau Google Chrome extension. This extension is intended to give you an automated explanation of a Tableau visualisation, in “natural, conversational language everyone can understand”. How much time do you spend crafting a headline and narrative for your charts or dashboards? Could this extension really automate that hard work? I took it for a test drive with my makeover.

First up you need to open the link above in Chrome, click “Try it Now” and install the extension. You then get a leaf icon in the top right of your Chrome tool bar. Click the icon and you should be prompted to register. Once registered, if you visit one of your visualisations in Tableau Public and click the icon again you’ll be prompted to select the dimensions and measures that you want to write about:

step01




This bit took some fiddling! I couldn’t get the narrative to work without including “measure names” in the dimensions list despite not wanting to write about measure names. Perhaps this hiccup is a reflection of undue complexity in my visualisation though. For simplicity I removed all measures except the percentage share of UK jobs at risk measure.

Once you click next you get a draft narrative and can then tweak various settings, like specifying the word “industry” rather than “entity”, whether the measure is a number or percentage and whether higher values are good, bad or neutral. After changing these settings you can go back to the narrative which will have refreshed.

I selected the least verbose narrative, which is not what I’d have ended up with had I written it myself!

step04




I’m reasonably happy with the end result. I wouldn’t use it as is, but parts of it would certainly be a great starting point for a narrative. Here’s the start of the text produced:

“This analysis measures Percentage share of jobs at high risk by Industry and by Measure Names.

Average Percentage share of jobs at high risk is 4.53% across all 20 industries. Wholesale and retail trade stood out with a very high Percentage share of jobs at high risk value. The individual industries discussed in detail are selected based on the total Percentage share of jobs at high risk across all subcategories.

For Domestic personnel and self-subsistence:

Average Percentage share of jobs at high risk is 0.02% across all three subcategories (representing 0.08% of the total Percentage share of jobs at high risk across all industries).”

I’ve emphasised the aspects I’d keep which nicely highlight a couple of the industries that many in the Makeover Monday community identified to talk about.

Want to find out more about natural language and Tableau? Check out this Tableau blog post if you haven’t already.

 

Wrapping up

In a week where the topic was automation of jobs due to breakthroughs in robotics and AI it was really interesting to try to automate aspects of my own job using AI / analytics tools. I’ve clearly just scratched the surface in terms of what I’ve tried, but my impression is that tools like those discussed can certainly augment the job of data analysis and visualisation. However, there will still be a reliance on human intuition and design skills in this particular field for some time.

Makeover Monday, 2017 #13

I found week 13 of Makeover Monday quite hard. The visualisation in question was an infographic purporting to show the perceived reasons for success by different social strata. The subtitle of the graphic seemed quite inflammatory and so I was keen to trace the data back to source to see what grounds the author had for their claims. Unfortunately, as others found too, it was pretty hard to find the actual Russian survey results used. This University of Illinois blog post has a great critique of the visualisation and is well worth a read.

For my makeover I’ve focussed on simply redoing the visualisation as I wasn’t able to sense check things like sample size or what constituted the social strata. A caveat is included. I wanted to highlight that the less well off and better off were usually at opposite ends of the range for most of the success factors that were offered in the survey.


DASH


The viz is also available on Tableau Public.

Makeover Monday, 2017 #10

This week a look at the top 500 gamer channels on YouTube based on this list on socialblade.com. What intrigued me about this week was the immediately interesting disparity between the video views, channel subscribers and the Social Blade ranking (score) and rating (grade). The more influential channels according to Social Blade are not necessarily those with more views or more subscribers. The list itself is pretty handy but not visually engaging and there’s no real story telling or summary.

For my makeover I wanted to highlight the difference between the Social Blade ranking / rating and raw views / subscribers. I also wanted to practise a few tricks raised by Rody Zakovich on making your dashboard pop and from Andy and Eva on the Makeover Monday blog for weeks 8 and 9. I think this is an improvement on the list – and on my previous makeovers – but there’s still more I’d like to have nailed if time had allowed.

 

Top 500 YouTube Gamer Channels.

The visualisation is also available on Tableau Public.

Makeover Monday, 2017 #9

Makeover Monday week 9 and Andy Kriebel challenged the Tableau community to improve on two graphs of his American Express card expenditure. The graphs are very clean and appear to offer some drill down functionality to view transactions. There doesn’t seem to be an intermediary level of detail, e.g. sub-categories (although that may exist and we’re just not seeing it here). Also the use of a 2016 average may not be as useful as a median given that there were a couple of one-off large expenses.
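To show why a median can be more representative than an average when there are one-off large expenses, here is a small Python sketch. The monthly figures are made up for illustration and are not from the card data.

```python
from statistics import mean, median

# Made-up monthly totals with a single one-off large expense.
monthly_spend = [100.0, 120.0, 110.0, 90.0, 1080.0]

average_spend = mean(monthly_spend)   # pulled upwards by the outlier
typical_spend = median(monthly_spend) # the middle value, unaffected by it
```

In this example the average lands well above four of the five months, whereas the median sits in the middle of the ordinary months.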

For my makeover I’ve added bars to the existing line chart, showing the number of transactions per month. This gives a little extra context (the large amount in December isn’t due to an increase in transactions for example). I’ve then tried a calendar style heat map showing when each transaction occurred.

DASH1


The user can choose an aspect of the heat map to highlight. In the example above the logic is picking certain categories of non-UK expenditure as most likely to be overseas travel (e.g. a Hilton in the US as opposed to Amazon in the US; caveat: this is obviously based on some assumptions and may be inaccurate). The user can also focus on certain categories, including highlighting large / largest transactions within each category. I’m not convinced this is a good way to explore the data as the labelling is quite hard to accommodate:

DASH2
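The overseas travel highlight boils down to a two-part test: the transaction is outside the UK, and it falls in a category that suggests travel. A rough Python sketch follows. The category names and the function name are assumptions made up for this example, and they will not match the workbook exactly.

```python
# Hypothetical category names; the real data set's categories may differ.
TRAVEL_CATEGORIES = {"Lodging", "Airline", "Restaurant"}


def likely_overseas_travel(country: str, category: str, home: str = "UK") -> bool:
    """Flag non-home spending in categories that usually imply being abroad."""
    return country != home and category in TRAVEL_CATEGORIES
```

With these assumptions a hotel stay in the US is flagged, while an online purchase billed in the US is not, which is the Hilton versus Amazon distinction described above.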


You can try the interactive version here on Tableau Public.

Or check out some of the other community contributions - there are some really cool visualisations as usual.

Makeover Monday, 2017 #8

Week 8 is all about potatoes. European Union potato sector stats to be precise. Eurostat provide a very detailed analysis of the EU potato sector on their website:

Eurostat potato sector stats page.

The article is very detailed and does take some time to read, but the key points are in bold so it can be scanned to get an idea of the main stories. The use of tables provides a lot of background data but again the emphasis is on taking the time to digest the data. For my makeover I’ve tried to focus on three key points – most of the production is in a small number of countries; Germany is the biggest producer; but France achieves the best price. The viz can also be seen on Tableau Public.

EU Potato Production & Prices