LOD equivalent of LOOKUP (part 2)

In a previous post I walked through a LOD (level of detail) based alternative to the LOOKUP table calculation. In that example I was looking at sales last month, and in a recent Tableau Forums question someone was asking if it could be extended to determine average sales from the previous three months.

The answer was “yes” and you can check the whole thread on the Forums, including other options. The main difference from my previous post is that we needed to extend the group numbers from two to four, so that any given month falls into four “higher level groups” that we can target with a LOD. We then chose which LOD to use based on the index (_i).
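Outside Tableau, the extended idea can be sketched in a few lines of Python (the sales figures here are invented for illustration, and this is my own sketch of the logic rather than the exact Forum calcs): with four offset groupings of four consecutive months, every month is the last member of exactly one group, and subtracting the current month from that group's sum leaves the previous three months.

```python
# Sketch of the "four offset groups" idea, not the exact Tableau calcs.
sales = [10, 20, 30, 40, 50, 60, 70, 80]  # sales for months _i = 0..7

def prev3_avg(i):
    """Average sales of the three months before month i."""
    if i < 3:
        return None  # not enough history
    # Choose the offset k that makes month i the LAST member of its block of four.
    k = (3 - i) % 4
    group = (i + k) // 4  # the "higher level group" a FIXED LOD would target
    members = [m for m in range(len(sales)) if (m + k) // 4 == group]
    block_sum = sum(sales[m] for m in members)  # SUM over the fixed group
    return (block_sum - sales[i]) / 3           # drop the current month, average the rest

result = [prev3_avg(i) for i in range(len(sales))]
print(result)  # [None, None, None, 20.0, 30.0, 40.0, 50.0, 60.0]
```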

LOD-lookup

Top M within each top N of categories

I recently helped with a Tableau community forums question where the user needed to:

  • filter to the top N categories based on a measure,
  • using a dense rank (so that categories with the same value had the same rank),
  • but list no more than M categories within each rank (based on another criterion)

This screenshot illustrates the requirement where we want to see the top 5 categories based on sales, where categories with the same sales have the same rank. But we only want to see the top 2 within each rank, based on lowest cost:

Illustration of top X within Top N (dense rank), showing items excluded



For example, Vegetable, Fruit and Cereal are all ranked joint top on sales, but only Vegetable and Fruit should be shown as they are the lowest two based on cost.

My first attempt to help was a complex solution involving PREVIOUS_VALUE (I’m a sucker for this function!) but on reflection I ended up providing an option based on more usual table calculations (RANK_…) and a level of detail expression. I’ll go through this option, and why it was necessary, over the rest of this blog post. And I might even share the PREVIOUS_VALUE one in a separate post for those interested!

First up the data set I used to test the solution looked like this:

topn-2


You’ll note that there is some work to do to aggregate up to the category level.

Filtering to the top-N categories on sales is relatively straightforward. I use RANK_DENSE because I want categories with the same sales to be ranked the same, and to have no gaps in the ranking.

Rank – Sales =

RANK_DENSE(SUM([Sales]),'desc')
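As a rough Python analogue of what RANK_DENSE(SUM([Sales]),'desc') produces (category names and values invented, loosely following the screenshot above): equal sums share a rank, and the ranks have no gaps.

```python
# Dense rank: ties share a rank, and no rank numbers are skipped.
category_sales = {"Vegetable": 90, "Fruit": 90, "Cereal": 90, "Dairy": 70, "Meat": 60}

distinct = sorted(set(category_sales.values()), reverse=True)  # 'desc'
dense_rank = {cat: distinct.index(v) + 1 for cat, v in category_sales.items()}
print(dense_rank)  # the three 90s all rank 1, Dairy ranks 2, Meat ranks 3
```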

I can then drag that onto filters and filter to up to 5:

topn-3


Things get a bit more complicated next when I want the lowest two on cost within each ranking. Your first thought might be to create another ranking based on cost that is partitioned by (restarts for every) ranking on sales. However, when you get into those settings within the table calculation you’ll notice that you can’t use the existing table calculation. “Rank – Sales” is not in the list!

topn-4


In fact no measures will show up in that partitioning list, yet we need to partition by the sum of sales or the ranking based on sum of sales. This is where the level of detail expression comes in. We can create a sum of sales that can be turned into a dimension.

Category Sales =

{FIXED [Category]: SUM([Sales])}

NB: Once you’ve created this, right click it and convert to dimension. Also, if you need other filters on your view, remember that a FIXED level of detail expression is evaluated before the dimension filters on the view. To get around that you can either use context filters (which apply prior to the level of detail expression being evaluated), or you can add the other dimensions you’re filtering on into the list of dimensions you’re fixing by (before the colon in the calc above).
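Conceptually, the FIXED expression materialises one sales total per category and repeats it on every row, which is why it can then behave as a dimension. A tiny Python sketch with invented rows:

```python
# What {FIXED [Category]: SUM([Sales])} effectively adds to each row.
rows = [("Fruit", "Apples", 40), ("Fruit", "Pears", 50), ("Dairy", "Milk", 70)]

totals = {}
for cat, _, sale in rows:
    totals[cat] = totals.get(cat, 0) + sale

# Each row now carries its category's total, like a regular (dimension) column.
with_lod = [(cat, item, sale, totals[cat]) for cat, item, sale in rows]
print(with_lod)
```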

Now that we have a dimension of category sales we can create another ranking table calculation that is partitioned by that, so that we get a ranking on cost that restarts for each group of categories based on sales.

Rank – Cost =

// Results are computed along Category Sales, Category.
// The calculation restarts for every Category Sales
RANK_UNIQUE(SUM([Cost]),'asc')
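In Python terms (invented data, and my own sketch of the logic rather than the actual workbook), the partitioned unique rank and the top-M filter look something like this:

```python
# RANK_UNIQUE(SUM([Cost]),'asc') partitioned by the fixed Category Sales value:
# the cost ranking restarts for each group of categories with equal sales.
from collections import defaultdict

rows = [  # (category, sales, cost)
    ("Vegetable", 90, 10), ("Fruit", 90, 12), ("Cereal", 90, 15),
    ("Dairy", 70, 8), ("Meat", 60, 20),
]

groups = defaultdict(list)
for cat, sales, cost in rows:
    groups[sales].append((cat, cost))

rank_cost = {}
for sales, members in groups.items():
    for i, (cat, _) in enumerate(sorted(members, key=lambda m: m[1])):
        rank_cost[cat] = i + 1  # unique rank, restarting per sales group

M = 2
kept = [cat for cat, r in rank_cost.items() if r <= M]
print(kept)  # Cereal drops out: it has the third-lowest cost in the joint-top group
```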

topn-5

And from here we can construct a view that lets us filter to the top N categories on sales, and within each ranking the top M based on cost:

topn-6


Job done! Do you have a better or simpler approach? If so I’d love to hear about it!

Tableau tile and hex maps

If you’ve ever built a filled map in Tableau where some of the areas are too small to stand out or label effectively, then you’ve probably come across hex maps as a solution. A hex map means that each area is the same size, regardless of the actual relative sizes.

For example, note how the Nelson (NSN) region in this filled map of New Zealand is relatively small compared to other regions, and hence hard to see and label:

Filled map of regions of New Zealand


Whereas in this hex map version Nelson (NSN) is the same size as the other regions:

Hexmap regions of New Zealand


In extreme cases the relative size and colour of the border compared to the area can even skew people’s perception of the colour. Also if you want to go a step further and show how data has changed over time (or some other dimension) per area, then you may have heard of tile maps as an option:

Tile map regions of New Zealand


But how do you build these in Tableau?

Tile Map Template

I tend to start with a tile map design as it’s easier to visualise in a grid! I find the map I want to reproduce and then plot the areas on a grid in Excel:

Excel grid laying out regions of NZ


Note: Because you’re scaling up small areas and reducing to tiles it won’t be possible to faithfully reproduce the geography. What you’re aiming for is a rough approximation that still looks like the overall geographical area you’re representing, whilst allowing you to show each area at the same size. For example, in New Zealand the West Coast (13 in the grid above) should stretch up further, and I compromise on whether Tasman (10 in the grid above) is south or west of Nelson (11).

Regions in NZs upper South Island


Once I’m happy with my grid I create a sheet that has the list of areas (regions in this example) along with the coordinates from the grid:

Excel list of regions and tile grid coords


Using this data I can plot the tiles in Tableau. Dragging X to Columns and Y to rows, I get a mark/area per region. And it is laid out as per the design in Excel:

Tableau view showing basic NZ tile map


Tile Map Example

Linking the tile map data set to another data set allows me to produce a set of small multiple charts arranged as per the tile map design. Let’s take a look!

First up define the data source:

Tableau data source for example NZ tile map with data


Here I’m relating a data set that has NZ population by region and year, to the list of regions and their coordinates. I’ve ended up using a relationship calculation as part of the join on region name to replace “Wanganui” with “Whanganui”, as the two data sets use different spellings.

From here building the view is like the template, dragging X onto Columns and Y onto rows. But this time I also drag Year onto Columns and the measure I want to plot onto Rows (in this case the % change in population since an index year I’ve chosen). This gives me a separate chart of % change by year for each region, in its grid location (Southland in 1, 8 for example):

Tableau view showing NZ tile map with data over time


Note: in this example I’ve also created a fake measure typed straight onto Rows – AVG(0.5) – on a dual synchronized axis. I’m using this with a blank custom shape as a mark that I can attach a region code label to for the first year on each tile. There is also a reference band from -0.5 to the window max of my fake measure (-50% to +50%) on this axis to provide a light grey background where we have data.

When I hide each header, apply some formatting (blue background, grid lines turned off), and include on a dashboard the result is:

Tableau dashboard showing NZ tile map regional population change over time


Hex Map Template & Example

A hex map is on one hand simpler, and on the other more complex. It’s simpler – in my opinion – because it doesn’t suit including a chart within each hexagon (unlike the tile map). But it’s more complex in that the hexagons aren’t on a simple grid. Let’s take Nelson (NSN), Marlborough (MBH) and Tasman (TAS) in the example below:

Hexmap for NZs upper South Island highlighting overlaps


MBH is to the east of NSN and so is +1 across on the X axis.

And in my grid system it’s also +1 down on the Y axis, but it actually starts halfway down NSN and a little indented, so that the hexagons tessellate.

TAS is completely below NSN and so is +2 down in my grid system.

I work this out by trial and error, adapting the tile map coordinates to suit a hex map, ending up with a list of regions with X and Y coordinates again:

Excel listing NZ regions and hexmap coords
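If you'd rather not do it entirely by hand, one common starting point (my assumption, not necessarily how the coordinates above were derived) is to stagger the tile grid: indent every second row by half a hexagon and squeeze the rows together so the shapes tessellate.

```python
# Derive staggered hex-style coordinates from tile-style (column, row) positions.
def hex_coords(col, row):
    x = col + 0.5 * (row % 2)  # indent odd rows by half a hexagon
    y = -row * 0.75            # rows overlap vertically so hexagons interlock
    return x, y

# Illustrative positions for the upper South Island regions discussed above.
for name, col, row in [("NSN", 0, 0), ("MBH", 1, 1), ("TAS", 0, 2)]:
    print(name, hex_coords(col, row))
```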


Creating the view is also very similar – dragging X to Columns and Y to rows, hiding headers for the X and Y pills, formatting the colour and removing things like grid lines. But this time I use a custom hexagon shape. For the example below I’ve also used a data source similar to the tile map example above where I relate the hexmap coordinates Excel sheet to the regional data I want to show.

Tableau view showing build of NZ regional hexmap


Note: in the screenshot above I’ve sized my window and adjusted the shape size in Tableau so that the hexagons are nicely aligned (same space on each side). This is quite fiddly, and you’ll have to play around to get it right. Including the map on a dashboard is a good way to retain control over the sizing. An alternative is to draw polygons, where you have much more control over spacing, but that is outside the scope of this blog!

The finished result on a dashboard:

Tableau dashboard showing NZ regional population hexmap


Feel free to use the templates

You’re very welcome to use these templates in your own work if they’re helpful.

The workbook is available on Tableau Public (templates and examples):

https://public.tableau.com/app/profile/steve7374/viz/NZTileandHexMaps/CONTENTS

And the Excel grid data on the Tableau Forums:

https://community.tableau.com/s/news/a0A8b00002HfQecEAF/new-zealand-tile-and-hex-map-templates

I’ll be following this post up with more APAC tile and hex maps. If you’ve got any requests let me know. I’d also love to explore a way to generate the tile and hex grids from the actual region coordinates or shapes … but that is a much more complex problem!

LOD equivalent of LOOKUP

The LOOKUP table calculation in Tableau is really handy when you want to show or use a value from a previous row in the view. For example if you are showing sales per month and need to use the sales figure from the previous month to calculate month-on-month growth.

Sales and sales previous month


In the example above our calculation for “Sales last period” is:

LOOKUP(SUM([Sales]),-1)

LOOKUP is great when the level of detail you want to look back through is in the view (e.g. order month is in the view if you want to look up sales for the previous order month). But things get tricky when that isn’t the case. And people often hit constraints when they want to use the result of a table calculation like LOOKUP in another calculation / aggregation. It’s not unusual at this point for people to ask if they can use a level of detail (LOD) calculation for looking up the value from a previous month, year, or category. The usual answer is “you can’t use a LOD for that”. But can you? Spoiler alert: yes!

The trouble with LODs

LOD calculations allow you to break out of the level of detail that your calculation is working on. So if your calculation is working at the month level you can break out to the quarter level, to ask questions like how does this month compare to the quarter. In the example below we could use a LOD to calculate how much the sales for 2022-06 differ from the average for the quarter that month is in, breaking out to a higher level grouping.

Sales by month and quarter highlighting a month and quarter


But if you’re after previous month there is no single dimension or higher level grouping that will tell you that; each month has a different previous month (unlike the case above where three months share the same common quarter). The previous month for 2022-06 is 2022-05, but 2022-06 is itself the previous month of 2022-07. Any one month is in two groups when it comes to determining previous months.

Possibly as you read that last sentence an idea will be forming! Let’s pursue that idea that there are two groups in play…

A LOD for LOOKUP(…,-1)

If we put each month into two groups, offset from each other by a month, then we can calculate two previous values: one will be right for odd-numbered months in the sequence, the other for even-numbered months. Let’s take a look at this in action:

LOD version of LOOKUP, example with monthly sales


Firstly we need a contiguous sequence number for each row (in our case each month). With months that is easy, as we can use a DATEDIFF to calculate the number of months since the first month in the data set.

_i = 

DATEDIFF(
 'month',
 {MIN(DATETRUNC('month',[Order Date]))},
 DATETRUNC('month',[Order Date])
 )
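In plain Python the same sequence number is just the month difference from the earliest month (sample dates invented); ignoring the day of the month mirrors the DATETRUNC calls:

```python
from datetime import date

orders = [date(2022, 1, 15), date(2022, 2, 28), date(2022, 3, 2)]
first = min(orders)

def month_index(d, start):
    # Months between d and start, ignoring the day (like DATETRUNC to 'month').
    return (d.year - start.year) * 12 + (d.month - start.month)

print([month_index(d, first) for d in orders])  # [0, 1, 2]
```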

Using that sequence number we can create the two offset groupings using integer division.

_iGroupA =

DIV([_i]-1,2)

_iGroupB = 

DIV([_i],2)

This gives us a grouping for each combination of two months:

Each grouping of two months


February will be in Group B when we want its previous month, and in Group A when it is the previous month.
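As a quick sanity check, the two groupings can be tabulated in a couple of lines of Python (floor division stands in for Tableau's DIV; the very first month's Group A value may differ between the two, which doesn't matter because that month has no previous month anyway):

```python
# (_i, _iGroupA, _iGroupB) for the first six months.
pairs = [(i, (i - 1) // 2, i // 2) for i in range(6)]
for p in pairs:
    print(p)
# Group B pairs months (0,1), (2,3), (4,5); Group A pairs months (1,2), (3,4), ...
```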

As we now have a higher level grouping that we can target in a LOD, we create two LOD calculations to get the other value in the group…

_iPrevA =

SUM({FIXED [_iGroupA]: SUM([Sales])})
- 
SUM([Sales])

_iPrevB = 

SUM({FIXED [_iGroupB]: SUM([Sales])})
- 
SUM([Sales])

These calculations get the sum of sales for each grouping of two months, and then subtract the sales for the month we’re in. Hence we end up with the sales for the other month in the grouping. Now we just need to work out which of these LODs to use for each month. To do that we switch between them…

_iSwitch =

([_i]-1) % 2

The modulus operator here gives us the remainder of dividing by 2 and hence a number that flicks back and forth between 0 and 1. So…

Sales last month (LOD) =

IF MIN([_iSwitch])=1
THEN [_iPrevA]
ELSE [_iPrevB]
END
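Pulling all the pieces together, the whole trick can be simulated in a short Python sketch (invented sales figures; Python's floor division and modulo stand in for Tableau's DIV and %, and I return None for the first month, which has no previous month):

```python
sales = [100, 120, 90, 140, 110]  # sales for months _i = 0..4

def sales_last_month(i):
    if i == 0:
        return None  # no previous month
    if (i - 1) % 2 == 1:           # _iSwitch = 1 -> use Group A
        members = [m for m in range(len(sales)) if (m - 1) // 2 == (i - 1) // 2]
    else:                          # _iSwitch = 0 -> use Group B
        members = [m for m in range(len(sales)) if m // 2 == i // 2]
    # Sum over the two-month group, then subtract the current month.
    return sum(sales[m] for m in members) - sales[i]

result = [sales_last_month(i) for i in range(len(sales))]
print(result)  # [None, 100, 120, 90, 140]
```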

You can check out the workbook for this on my Tableau Public profile: LOD equivalent of LOOKUP.

I’ve used this approach for a number of gnarly Tableau Forum questions and I think I’ve seen similar crop up once or twice before, but it’s not a particularly well-known approach. Often people will fall back on solutions like self joins – and to be fair, shaping your data to avoid having to jump through hoops like the above is not a bad idea. But if you do need it, I hope that you find it useful!

What and Why skills vs How skills

John Cutler asked a great question on Twitter: how do we describe less visible skills like qualitative research in comparison to technical skills like software development?

Initially I was intrigued by the parallels with translation vs interpretation in linguistics. I can see similarities between a software developer translating requirements into code, and a design researcher interpreting customer interviews to help produce the right software requirements.

I also liked a response by Tiffany Chang suggesting that one skill is more concrete and the other more abstract. That resonated with human centered design approaches for me. And the idea of not jumping straight from problem to solution:

HCD abstract concrete understand create quadrants



(ref: https://ssir.org/articles/entry/human_centered_systems_minded_design)

So in some ways perhaps we are really talking about the difference between skills to understand “what” (we need to do), and “why” (we should do it), versus skills in “how” we do it? And unfortunately perhaps it is natural for many people to see and appreciate “how” skills above “what” and “why” skills. When you’re cold, fire lighting skills are more immediately appreciable than skills to uncover that we need to move the tribe to a different valley.

Depth vs breadth

In a recent sprint review we were asked how our findings, which were based on a relatively small number of customer conversations, could be meaningful. Were they statistically significant?

I’d got used to our stakeholders being familiar with the background to qualitative research and how we don’t try to quantify it as such. And that the selection approach / recruitment matrix means that we can have confidence in the insights. However staff had come and gone, so it was a good reminder to address the common concern that a survey would have been better and more statistically significant.

To address the concern we touched on the drawbacks of surveys in terms of being sufficiently well designed, depth of understanding and subjectivity. Example: rate our service from 1-10, with a follow-up question of why did you pick that rating? If a customer answers “efficient service”, what does efficient really mean? Obviously quick and easy, right? Well, maybe not. I’ve had cases in more in-depth conversations where I asked “Can you tell me a bit more about what efficient meant to you?” and the answer was not what most of us would expect! A survey can give you breadth but not necessarily depth of understanding. And it’s unlikely to tell you why efficiency is important to customer X.

We also talked about some of the science behind qualitative research: getting your recruitment matrix right, and the concept of saturation and diminishing returns after speaking to a relatively small number of customers. We felt we were close to that point of diminishing returns. And interestingly we’d already identified and highlighted the key drawback of our selection approach in this situation, which was that we were gaining understanding of customer experience across a reasonably narrow time frame – very much “point in time” insights.

Still it is hard for some to trust the insights from what seems like a small number of conversations. My final persuasion is to ask stakeholders to think about some data we have – e.g. the number of times a customer has logged into a help portal. It’s tempting to say that those logging in most are the ones that need the most help and others are doing great without the need for help. But are they really? Maybe they’re struggling on, tapping a colleague on the shoulder and asking for their help instead. The data (in this case) cannot tell you that. A conversation can.

Hone your skills with Makeover Monday

I don’t usually get to attend Tableau User Groups. We don’t (yet?) have one down in the depths of New Zealand’s south island, and it’s a long drive to the nearest one in Christchurch. But with New Zealand and much of the world in some form of lock down, Tableau has encouraged and supported virtual user group meetings. So I was excited to dial into this week’s virtual New Zealand Tableau User Group meeting jointly arranged by Alex, Thabata, Jeff and Paul from the Auckland, Christchurch and Wellington groups. The icing on the cake was being invited to speak about my experience with Makeover Monday!

The topic of my 30-minute slot was Hone your skills with Makeover Monday. For those after the tips and checklist I mentioned, please read on. Or you can watch the recording of the whole session on YouTube, including the other great speakers:

  • How to do Tableau in lock-down! – Alex Waleczek
  • Hiring: Score yourself a unicorn – Sarah Burnett
  • Set It Up: When to use Set vs Parameter Actions – Heidi Kalbe

Here is the summary of the 13 tips and the example checklist I covered in my presentation.

Tips

  1. First up, my last tip: please, please, please don’t be discouraged from participating by some of the brilliant submissions you’ll see from others. Everyone has to start somewhere. And there’s nothing wrong with a quick and simple makeover. Often a simple bar chart is just what the data and story deserves.
  2. Do read the article; it’s tempting to save time by not reading it. But there is often useful context and background that you can dig into. Spending some time understanding context will usually pay you back as an analyst.
  3. Remember the purpose of Makeover Monday. Take the time to ask yourself what works well with the original chart and what could be improved. Doing so will help you focus on what you want to make over. Sometimes the improvements might only need to be minor – e.g. better use of colour. Sometimes you may be doing a completely different chart. Invest the time that suits you.
  4. Do dig into any nuances in the data. E.g. does the data start and end partway through a year, which would impact seasonal comparisons? Or are there fields that need to be transformed or pivoted to make analysis easier? Have you understood an unusual outlier, or some peaks and troughs?
  5. Once into visual exploration I like to build up working sheets as I explore various angles. That way I can come back to points I want to focus on and refine later. As I refine these, ideas for a dashboard and sequencing start to emerge. My best advice here is to watch one of Andy Kriebel’s live Makeover Mondays. You’ll get to see where he spends his time, and how he goes about exploring data and creating a better data viz.
  6. If you plan to blog about each makeover you’ll find that it takes some time. It can help to keep notes as you review the original chart and explore the data. That way you can structure them into a blog post at the end.
  7. Have a checklist to go through before publishing. Some people keep a written checklist to remind themselves of key things. Otherwise it’s easy to forget about tool tips or spell checking! I’ve included an example checklist below.
  8. Do share your work – take the plunge! It’s a good way to engage and get feedback, which is a crucial part of improving your data visualisation and story telling skills.
  9. Try not to get too disheartened if you get no feedback on Twitter, or even unexpected feedback. It’s very difficult to deliver feedback in a way that suits everyone on Twitter.
  10. If you want feedback register for the weekly viz review webinar. Remember to only use the #MMVizReview hashtag if you will register for and attend the webinar; otherwise it makes it harder for the organisers to prepare.
  11. Do work through and incorporate any feedback that you’re given in the viz review webinar. It helps you to reinforce the learnings, and it shows Eva and Charlie that their input is worthwhile!
  12. Don’t be discouraged if you don’t get selected in the weekly favourites. You’re one of a thousand people participating (as of May 2020) and Charlie and Eva can’t practicably see and recall every Makeover Monday tweet! Remember why you decided to take part and ensure that you’re getting what you want out of it – e.g. after two months look back at how much you’ve improved.
  13. If you benefit try to pay it forward in the future. I’ll leave that up to you, but it could be helping new Tableau users on the Tableau Forums (hint: many people find that trying to pass on their knowledge is a wonderful way to gain deeper knowledge themselves), or it could be getting involved in your local user group. Maybe you’ll take the time to encourage new Makeover Monday participants in your area!

Checklist

Here is the example checklist I provided – over time you’ll find the things that help you to check that you’ve got a great makeover before you submit:

  • Right chart type
  • Improved what you set out to improve
  • Remember your audience (e.g. mobile)
  • Clear title & annotations
  • If your title is a question is it answered?
  • Consistent fonts, tool tips, etc.
  • Consistent use of colour (helps the story)
  • Simple is good (remove till you can’t)
  • Spell check and read back through it
  • Source and image credits

Those tips are things that have worked for me, or stuck out to me. You can find much more information on the Makeover Monday website, including how to buy the book, which covers a whole heap of data visualisation advice.

Finally as I said in the presentation, I look forward to seeing more NZ user group members in the #MakeoverMonday feed soon. Please do reach out if you need some encouragement!

PREVIOUS_VALUE in Tableau

Late last year I started to actively help out on the Tableau Forums. What a great decision! I’d forgotten how much fun it could be to (1) pick up a discrete challenge; (2) help others out; and (3) learn so much more in the process.

One of the questions I recently chipped in on was about circular references in a sequential calculation. The background to the question is really interesting and I ended up spending a few hours digging into epidemiological models, but that’s a different story! Whilst trying to help I took a fresh look at the PREVIOUS_VALUE function in Tableau. I have to admit, prior to this I had thought that PREVIOUS_VALUE(x) was just the same as LOOKUP(x,-1) … turns out that isn’t the case!

LOOKUP(x,-1) will return the previous value of the expression, x, in the “window” of results being presented in the view. So if you’re presenting a monthly sum of [order value], LOOKUP(SUM([order value]),-1) can give you the sum of order value from the prior month*. Handy eh!

PREVIOUS_VALUE(x) is actually saying get the result of this entire calculation from the previous result in the window. And whilst LOOKUP returns NULL on the first result, PREVIOUS_VALUE will return the passed in expression, x, as the first result.

Your initial thoughts might be that these aren’t much different, but the real power is when you want to do more with your calculation. Let’s say you have:

yA = 2 * LOOKUP(x,-1)

yB = 2 * PREVIOUS_VALUE(x)

In the case of the LOOKUP above you’re always just looking up the last value of x and then multiplying by 2. With PREVIOUS_VALUE you’re starting with x and multiplying by 2, but then each subsequent call is actually looking up the last value of the overall expression – i.e. including the previous multiplication by 2 – then multiplying by 2 again.

So the first yB = 2 * x.

The second yB = 2 * (2 * x) … and so on.

That’s pretty powerful!
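The difference is easy to see in a little Python simulation of the two window behaviours described above (my own sketch of the semantics, with a made-up window of values):

```python
xs = [1, 2, 3, 4]  # the expression x down the window

# yA = 2 * LOOKUP(x, -1): doubles the PRIOR x; NULL (None) on the first row.
yA = [None] + [2 * x for x in xs[:-1]]

# yB = 2 * PREVIOUS_VALUE(x): recursive -- each result feeds into the next;
# on the first row PREVIOUS_VALUE(x) returns x itself.
yB = []
for x in xs:
    prev = yB[-1] if yB else x
    yB.append(2 * prev)

print(yA)  # [None, 2, 4, 6]
print(yB)  # [2, 4, 8, 16]
```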

The Tableau expression help gives a couple of great examples. The first is to calculate a running sum, and of course you could use RUNNING_SUM for that instead. The second example is pretty cool though – a running product!

PREVIOUS_VALUE_help


I built this viz on Tableau Public to help illustrate the difference in the example I used above. Feel free to download it and have a play.

PREVIOUS_VALUE


Edit:

During subsequent discussions on Twitter Jim Dehner (Tableau Ambassador and top forum contributor) pointed out some additional differences with PREVIOUS_VALUE. I’m yet to dig into these but leave them here in case they help you … “previous value will return the single previous value and can not be chained or used in certain table calculations like running sums – Lookup is a table calculation that moves from the current ‘cell’ location up, down, left or right by any increment – it can be nested”. I also need to credit Jim for reminding me about the handy example in the Tableau calculation wizard / function help that I included above.

Footnotes:

* with table calculations you get to define how they traverse the window. If you’re displaying a table of months, where each month is a row in your view (i.e. month is on the rows shelf), and you want to LOOKUP the value from the prior month, then you could compute the lookup using “table down”. That way the window progresses down the list of months, and you’re looking up the value from the row above. Table across is often used when you’re calculating a cumulative total on a line graph with time on the horizontal axis (columns). You can define more complex ways to compute the order of the window. And you can specify when to restart the calculation, e.g. to get a separate monthly running total per year.

Makeover Monday, 2019 #26

An interesting and deceptively simple data set on alcohol consumption by country for 2019 week 26.

I like the simplicity of the table of data and the factors affecting the top 25 that are discussed in the article. The chart itself would be better as bars not columns in my opinion, allowing the country names to be laid out for easier reading. As Eva noted in her submission, showing liters of pure alcohol consumed per capita per year isn’t that easy to relate to. Digging into the definitions for standard drinks / units I was surprised to find that there is quite a range, and that some countries still don’t define a standard drink. I decided to focus on that aspect for my makeover.

Interactive version on Tableau Public: here.

DASH

Makeover Monday, 2019 #3

Andy Kriebel selected a data set about US workers paid at/below the minimum wage for those choosing to participate in week 3, 2019.

The original viz highlights some of the regional differences for 2017 by showing the data geographically. I like that I can see regional differences, but I found myself wanting to see the trend over time (as it’s available in the data set) to see if the geographical trends are part of an ongoing story.

So for my makeover I’ve kept things pretty simple and separated the different regions and sub-regions. Adding the overall line for the US and differentiating values above / below this in different colours helps to tell the story. A state highlighter allows users to focus in on one state if they want to – this is quick built-in functionality in Tableau (right-click a dimension and set as highlighter). I spent a lot of time in the depths of SQL Server geography queries for last week’s makeover, so it was refreshing to step back to simple built-in Tableau functionality for week 3!

Interactive viz: here on Tableau Public.

Static image:

US-MIN-WAGE