Pages

Tuesday, 22 December 2020

Revisiting the QlikView Cookbook - Using OMIT

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on Blocking user access to a field using OMIT.

Picking this one was spurred by a message last week from a good friend who reminded me of a blog post that I wrote on the topic back in 2011!

Privacy of fields have been around longer than GDPR or and other legislation. In the Cookbook, I imagined a scenario of a HR application that contained salary and absence information that we might want to restrict access to. I am sure that we can all imagine many other fields that we might want to restrict - including names!

Anyway, here is the code setup to replicate the recipe in Qlik Sense Business:

Section Access;

Access:
LOAD * INLINE [
  ACCESS, USER.EMAIL, OMITGROUP
  ADMIN, stephen.redmond@notmymail.fake, 
  USER, user1@notmymail.fake,
  USER, user2@notmymail.fake, SALARYONLY
  USER, user3@notmymail.fake, SALARYANDABSENCE
  USER, user4@notmymail.fake, ABSENCEONLY
];

OmitGroups:
LOAD * INLINE [
  OMITGROUP, OMIT
  SALARYONLY, Salary
  SALARYANDABSENCE, Salary
  SALARYANDABSENCE, Absence
  ABSENCEONLY, Absence
];

Section Application;

Employees:
LOAD * INLINE [
  EmpID, Name, Salary, Absence
  1, Joe Bloggs, 50000, 3
  2, Jane Doe, 45000, 5
  3, Fred Frank, 30000, 0
  4, Jeri Jublek, 19000, 12
];

The blank values for the ADMIN and user1 here are deliberate. They could really be anything - blank, "NONE", "MISSING", etc. - anything that would not connect these users to the OmitGroups. The effect here is that these users will not have any OMIT entries so will see all the fields - as in the image below:


For the users with the SALARYONLY restriction, they cannot even see the Salary field. It will appear as if the field does not even exist in the data model:


For those with SALARYANDABSENCE, they will be missing both fields:


In our post-GDPR world, OMIT is is feature of Qlik that you should definitely consider when you have PII or sensitive fields involved. Make sure to be explicit about the users who can and can't see these data.



As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Monday, 14 December 2020

Revisiting the QlikView Cookbook - Rounding Time to Quarter Hour segments

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on Consolidating a date-time value into quarter hourly segments.

The premise behind this recipe is one that is relatively straightforward, we would like to round down Time Stamp data to a less unique value. The best reason for doing this is to simply remove the uniqueness of the data - Time Stamps could represent millions of values in a day, because they can be sub-second, when there is really only 86400 seconds in one day, 1440 minutes and 96 quarters. The less unique your data, the better that Qlik can store it. Of course, you need to bear in mind the business need for the presentation of the values - if they don't need minutes and seconds, don't store them!

The simple load statement for this recipe is:

Load
  TimeCounter,
  TimeStamp(TimeStamp) as Date,
  WeekDay(TimeStamp) As WeekDay,
  TimeStamp(Round(TimeStamp, (1/(24))), 'M/D/YY HH:mm') As Hour,
  TimeStamp(Round(TimeStamp, (1/(24*4))), 'M/D/YY HH:mm')
  As QuarterHour;
Load
  Today() + (Rand()) As TimeStamp,
  1 as TimeCounter
AutoGenerate(1000);


This does give us a result which may be good for our purposes. However, we need to bear in mind that this is a mathematical rounding, and that may not match business rules that we have to abide by. In this example, we can see that the 06:00:00 hour covers time slots from 05:30:00 to 06:29:59. The 06:15:00 quarter covers times slots from 06:07:31 to 06:22:29. These ranges may not be what a business user would expect - would you expect 5:30am to appear in the 6am group?

If you want your 06:00 to be 06:00:00 to 06:59:59, then this might work: 

TimeStamp#(Text(TimeStamp(TimeStamp, 'YYYYMMDD HH'))&':00','YYYYMMDD HH:mm') As Hour


As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Wednesday, 9 December 2020

Revisiting the QlikView Cookbook - Dynamic part-to-whole Pie Chart

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on Creating a dynamic part-to-whole pie chart.

Now, I know that pie charts are not everyone's first choice, but for part-to-whole comparisons they work just as well, if not better, than other alternatives. (If interested in more on this, read my IEEE VIS 2019 paper, Visual cues in estimation of part-to-whole comparison here).

Probably most will agree that this style is not the best:


The large number of segments here tempts the user away from part-to-whole into part-to-part comparisons. There are argument that say this would be equally as useful as a bar equivalent, but I won't make them here.

To setup an example in Qlik Sense Business, I loaded the following data:

LOAD * INLINE [
Country, Sales
USA, 53453
Mexico, 21317
Canada, 14545
UK, 12333
France, 21333
Germany, 23123
Japan, 44342
China, 34234
Australia, 12345
];

The setup of the dimension is quite important. Instead of using the normal field as a dimension, I use a calculation:

=if(
  Aggr(Sum(1), Country)>0,
  'Selected: $(=Concat(DISTINCT Country, ', '))',
  'Others: $(=Concat({<Country=E({$})>} DISTINCT Country, ',
  '))'
)

It is using Aggr to group together Countries that are selected separately from non-selected. We then need to setup the Measure expression to ignore selections on Country:

Sum({<Country=>} Sales)

Now, when we make a selection on Country, the pie chart won't drill to that selection, it will show us that selection versus the other Countries:


It is true part-to-whole!



As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Sunday, 6 December 2020

Revisiting the QlikView Cookbook - Using TOTAL

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on using TOTAL to calculate the percentage of total and subtotal.

I like this recipe because it really helps to understand the power of the TOTAL function, which has been a core part of Qlik for a long time, pre-dating Set Analysis for use in vertical calculations.

We start off with a simple set of Sales data:

LOAD * INLINE [
Country, City, Sales
USA, San Diego, 23712
USA, Dallas, 49888
USA, New York, 71621
USA, Boston, 55033
UK, London, 13444
UK, Birmingham, 31212
UK, Manchester, 39712
Germany, Berlin, 49100
Germany, Frankfurt, 59102
Germany, Munich, 41568
Japan, Tokyo, 41412
Japan, Yokohama, 55554
Japan, Osaka, 12345
];

Then we need to create a Pivot Table with Country and City as Row Dimensions and the following 3 measures:

LabelExpression
Sales $Sum(Sales)
Total Sales %Sum(Sales)/Sum(TOTAL Sales)
% Sales by CountrySum(Sales)/Sum(TOTAL<country> Sales)

Set the display format of the two percentage measures to an appropriate % format. Turn Show totals on for the Country dimension.

When the pivot table first displays, the Country dimension will need to be expanded to show the cities.


Hopefully it is straightforward to see what has happened here. Total by itself tells the function to ignore the dimensions in the chart. Therefore the Total Sales % calculation takes each individual sales total and divides it by the overall total. So, for Berlin, that is 49100 / 543703 = 9%.

When we include one, or several, dimension within the angle bracket after the TOTAL, we tell Qlik to ignore all dimensions in the chart, except for the ones lists. So when we include Country in the angle brackets, it will give us the Sub-Total for that country. Again, for Berlin, that is 49100 / 149770 = 32.8%.



As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Sunday, 29 November 2020

Revisiting the QlikView Cookbook - Labels on a Line Chart

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on Replacing the legend in a line chart with labels on each line.


The idea here is that the legend on a chart can be somewhat removed from the data and not always easy to match the colours to the lines, especially for those with a visual impairment such as colour blindness.

The trick here is to use a small hack with the Dual function to only make labels printable for the last value of the line. Here is a quick run-through of the method - which isn't a million miles away from the QlikView method presented in the book.

Start with some data:

CrossTable(Country,Sales)
LOAD * INLINE [
Date, USA, UK, Japan, Germany
2013-01-01, 123, 100, 80, 40
2013-02-01, 134, 111, 75, 50
2013-03-01, 155, 95, 70, 60
2013-04-01, 165, 85, 88, 50
2013-05-01, 154, 125, 90, 70
2013-06-01, 133, 110, 75, 99
];

Now, we can quickly create a simple line chart by Date and Country as in the image above, with a measure of Sum(Sales).

Now, if we edit the measure to this:

Dual(
  If(Date=Max(total Date), Country, ''),
  Sum(Sales)
)

So here we are checking if the current date matches the maximum date. If it does, return the name of the Country. If not, return a blank string.

Before we change anything else, we should change the measure's Number formatting option to be Measure expression, otherwise Qlik Sense will override our Dual and apply a default numeric formatting.

Now, down in the Appearance section, under Presentation, we turn on the option to Show data points. Set the Value labels option to Auto.

Just below this, in Colors and legend, turn off the Show legend option - we don't need it!

If all has gone to plan, we should have a line chart that is something like this:


This will not be the perfect solution in every circumstance. If you have very many lines then it just won't look great. But it is something worth playing with, and certainly a hack worth knowing about.



As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Making Sense of Themes

Episode 14 of the Qlik Virtual Meetup Scotland (QVMS) had some very interesting speakers. Old friend and fellow Luminary, Barry Harmsen, from Bitmetric, was talking about his sensetheme.com site, so I thought that I might delve a little into the idea of themes and explain why I think it might be a good idea for you to explore them.

For some historical context, the idea of Themes is not new to Qlik. Anyone who has used QlikView will know to what level of detail you can specify colours, various fonts, borders, shadows, etc. on various objects and then use the Theme Maker function to export those settings into an XLM file for use in other documents.

When Qlik Sense was first released in 2014, it didn't have too much in the way of ability to change things. You got the plain white background and the default set of fonts. You could change the odd colour, but not much else.

The Qlik Sense "Classic" theme

Many folks grumbled at this and Qlik listened. Starting from Qlik Sense (on-premise) February 2018 release, custom themes could be used to style an app. This feature has subsequently been rolled out on SaaS offerings.

Out of box on the current version of Qlik Sense Business, there are four themes available. Sense Focus is the default, but you can also choose Sense Breeze, Sense Horizon, or the "Classic".

Sense Focus - the default theme

Sense Breeze theme
Sense Horizon theme

A theme is simply a JSON file with a collection of settings. If you are comfortable working in JSON and with CSS properties, then a good place to start is the Getting started building custom themes page in the Qlik documentation. If you are less comfortable, then Bitmetric's Sense Theme tool is another excellent choice.

For free, you can download a load of different examples, from Classic QlikView to Tableau! Also for free, with some limitations, you can create your own theme file and use their friendly UI to help you setup the colours.

For a small monthly fee, you can unlock more features and get rid of those pesky pop-up ads. I can see many businesses taking this option as they will want to apply individual theme elements to their Qlik Sense applications, and this will become a valuable tool for their designers to work with.

Time to dive into themes?


As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Monday, 23 November 2020

Revisiting the QlikView Cookbook - Boxplot

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on Creating a box plot chart for a simple data set.

The Boxplot in Qlik Sense is a little more advanced than the one that I originally wrote about in QlikView. It has an element of sophistication around it which may actually initially confuse a user, especially one who has been used to other statistics tools. When I use a Boxplot in QlikView, I generally expect to add one dimension, and then expect that the values will be presented for each value in that dimension. I can, of course, get quite sophisticated with my calculations, and there is even a Wizard to help me get started with creating the necessary AGGR expressions I may need.

The Qlik Sense Boxplot will actually assume that I want to do the calculations using similar AGGR expressions and, if I don't, I need to make some manual edits. It is probably easier if you let me show you what I mean with an example. First, some sample data - only slightly more complex than in the original recipe:

Load 
    'USA' as Country,
    Ceil(Rand()*12) as Month,
    Round(Rand()*50, 0.01) as Value
AutoGenerate(100);
Load 
    'UK' as Country,
    Ceil(Rand()*12) as Month,
    Round(Rand()*20, 0.01) as Value
AutoGenerate(100);

After loading, if I drop a Boxplot onto my sheet, it offers me the usual Add dimension and Add measure options. If I try to recreate the original QlikView Cookbook chart, I might try with Country and Value - actually you will note that, as usual, it forces you to add an aggregation.


Ok, so what happens next? Well, you might be surprised to see only 1 skinny box plotted!


So what happened? Why did it not split by Country as a QlikView user might have expected? Turning to the help file, you might find this description of adding the first dimension:

    This is the inner dimension, which defines a box.

Eh? What box does it define? Well, we have added a dimension and an expression, so this is the box it defines:


So, actually, there are only 2 values being represented by the Boxplot. The plot isn't looking at all the individual values (which can't be bigger than 50 because of the way we have created them), it is representing the statistics of all the calculated values - in this case only 970.77 and 2529.77.

Let's rejig slightly. I am going to replace the Box dimension with Month and set the X-axis dimension to be Country. The result looks something like this:


In this case, the "box" is now a table of the Sum of Value by Month, and there is a different "box" for each Country. For example, the UK "box" may look like this:


The Boxplot calculations are then calculated across the 12 values in each "box". Simples!

But what are the calculations? Well, if we look into the Box plot element properties, we can see that it defaults to some presets from which we can choose Standard (Tukey), Percentile-based or Standard deviation. If we turn off the Use-presets option, we can now see what set of calculations are being used for each element in the Boxplot. In fact, this also allows us to put in our own calculations! We could, for example, use this option to go back to a version of the original QlikView Boxplot.

If add a new Boxplot on the screen and then go straight to the properties and add Country for both the Box and X-axis dimension. For the Y-axis measure, I am just going to add the value 0 - it doesn't matter too much what value you use, though it does need to return a valid calculation or the chart won't work.

Now, when I go down to the Box plot element section and deselect the Use-presets option, I can specify new expression values:

Property
Expression
First whisker
Min(Value)
Box start
Fractile(Value, 0.25)
Center line
Fractile(Value, 0.5)
Box end
Fractile(Value, 0.75)
Last whisker
Max(Value)



Now we have something that is similar to the original!


As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Saturday, 21 November 2020

Revisiting the QlikView Cookbook - Tooltips

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this first post, I'll look at the very first recipe - Creating custom pop-up labels on a bar chart.

In QlikView, it was necessary to manually put together a text expression to join multiple values, with their functions to ensure correct formatting, together and ensure values appear on separate rows using line feeds (using the chr function) - something like this:

= Country 
& chr(10) 
& 'Sales : ' & Num(Sum(Sales), '$(MoneyFormat)') 
& chr(10) 
& 'Costs : ' & Num(Sum(Costs), '$(MoneyFormat)') 
& chr(10) 
& 'Margin : ' & Num(Sum(Sales)-Sum(Costs), '$(MoneyFormat)') 
& chr(10) 
& 'Margin % : ' & Num(1-(Sum(Costs)/Sum(Sales)), '0.0%')

In Qlik Sense, it is a little easier. 😊

In most of the major chart types there is a sub-property in the Appearance property called Tooltip.



In here we can see the following options:


OptionDescription
Hide basic rowsThis either shows or hides the default Tooltip values. Useful to turn off if you are doing something custom.
DescriptionAn expression that defines a piece of text to present within the tooltip. Could be used to present useful information for users such as how these metrics compare with others.
MeasuresOne or several measures to display. Each measure has typical sub-properties for measures such as label and numeric format.
ImagesAn image to display in the Tooltip - either a static one from the Media libaray, or from a Url (which could be dynamic!)

So, to recreate the manually constructed text from QlikView, we just need to turn off the Hide basic rows option, leave the Description blank, and add four Measures with appropriate formats. We might end up with a result like this:




As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn

Wednesday, 29 April 2020

The most important form of work

If you haven't been following the new series of Qlik Virtual Meetups, they are worth tuning into and there is probably one in your region. Being virtual, it doesn't matter if they are in your region or not!

The recent edition of the Qlik Virtual Meetup Scotland featured an old friend in Treehive Strategy's Donald Farmer (it also featured the ever excellent Michael Tarallo talking about Qlik Sense April 2020!). In an excellent presentation, Donald mentioned two particular quotes that resonated with me and some of my own ideas around data visualisation and data literacy.

The first was from (the excellent!) Michael Lewis's book, Losers: The Road to Everyplace but the Whitehouse. The quote was:

"an explanation is where the mind comes to rest"

Donald uses the quote in the context of an analyst looking at data to what they feel is a conclusion. He feels, and I agree, that it is not enough to say that you have just followed the data. You stop being critical. Donald also paraphrases Deming: "with data, your just another person with data, and an opinion". This is because we are all fundamentally human beings with many, many biases. Where our mind comes to rest we feel comfortable.

For me, an interesting example of this occurred just yesterday with a chart published in this article: Three charts that show where the coronavirus death rate is heading. The chart, which at initial look looks like a spiral mess, with time, becomes clear and quite interesting. It is a great example of a chart that you need to spend some time with to appreciate, but the effort is rewarded.

I shared my opinion on Twitter, and some of the other discussion is interesting. However most of the discussion stopped at, "ugh!" - people were not prepared to give the chart the benefit of the doubt and invest any time. "Ugh!", was their explanation. That is where their minds rested.

The second quote was from a Harvard Business Review article titled What's So New About the New Economy written by Alan M. Webber:

"In the new economy, conversations are the most important form of work."

Donald expressed that Data literacy is about communicating with data. It is not just about a few people understanding data, it is about raising the level in society.

Conversations are enormously important in raising all the boats. Some data visualizations are merely about getting to rest - getting users to a position where they have the explanation that they want. Others are designed to try to get users beyond that point. But it is hard to get a body-at-rest moving. Conversations can help us get over that inertia. Conversations about data. Conversations over data.

Will you join the conversation?


As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook

Friday, 17 April 2020

How Many Segments? And other stories

Following on from my recent presentation at the first ever Qlik Virtual Meetup Scotland (link opens the meeting recording), there were a few questions arising. I thought it would be useful to answer them in a blog post.

Scott
Pies
I like a pie chart but how many segments before it becomes difficult to judge a value?
Paul
Pies
Would you not just group smaller segments into OTHER?
Steve
Pies / Bars
I think the long tail on a bar chart, with the Sense minichart, is very obvious - where lots of tiny pie wedges do not

Let's start with the "how many segments" question and hopefully the thread will move through the answering of the others. First, consider this typical pie chart, produced in Qlik Sense:


A typical pie chart with many segments

As with the majority of visualization tools, Qlik Sense has defaulted to the sensible option of sorting the segments in size order. This immediately allows us to see that the USA is bigger than Germany which is bigger than Austria. I can roughly estimate the values of each (as shown in my research) and I can quickly tell that the three markets make up just under half of the whole.

The very simple answer to the question of "how many segments" is, how many makes sense to meet the business requirements.

On the question of grouping smaller segments, I can modify the original chart as below, and still answer those same business questions:

Pie chart showing the top 3 segments versus all others

In this situation, I can still see answer that business question and have reduced the number of segments on display. I would argue that I can answer that business question equally well with either chart, although the first may actually deliver me additional insights, and that is the critical thing about either chart - that they answer that business question.

What can sometimes be a problem, however, is that with interactive tools such as Qlik Sense the user could drill down to just those 3 countries:

Pie chart after user has drilled to 3 countries 

Now the user is no longer able to see the part-to-whole of these countries' market share. Instead, they are looking at the part-to-whole of just these countries. This may be OK! It depends on the business question that the user wants to answer. If it is a problem, you can use Set Analysis in Qlik Sense to do something about it - similar to one of my previous pie posts.

Let's consider some alternatives (alt-pie charts!), starting with the simple bar chart.

Typical bar chart showing a measure versus a categorical value

Again, we typically order these charts by value, so we can still quickly see that the US is larger than Germany and both are larger than Austria. If our purpose here is to compare one country versus others, then this is the perfect chart to use. Even given that, it is not really so easy to compare the US vs. Poland or even Belgium, but interactivity can help with this. It is, however, a quite a lot more difficult to see how much of the total market is made up by the top 3 countries. That would be especially more difficult if there were a longer tail of smaller values that you might have to scroll. That is why I prefer the pie chart if the business question is a part-to-whole one. We can, of course, clump the other countries into "Others" in the bar chart, but it is still not easy to see the part-to-whole:


Bar chart with "Others" bar

The recommended choice for part-to-whole coming from anti-pie advocates, is the horizontal bar:


Examples of horizontal bars as an alternative to pie charts

As my research has shown, the horizontal bar does not always work as well as a pie chart. It is a valid option, and one that you could consider, but it should definitely not be the default.

To summarize, if you are asking the question "how many segments", then you may be asking the wrong question. Always remember Redmond's Rules:

  • Use the right visual encodings (and pie charts are a valid choice!)
  • Add labels and annotations to provide context to the user
  • SFW! Make sure that you are answering the business question

The last rule can be hard, because sometimes you don't actually know the questions that the users want to answer! In those circumstances, following a Design Thinking methodology will probably get you where you want to be.


As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook

Wednesday, 1 April 2020

Exponential data and logarithmic scales


I have to admit that when I first saw some of the recent data visualisations from the likes of the Financial Times and the New York Times, I wasn't an immediate fan. That is because they were using a logarithmic scale which distorts the data. My feeling was that they should be using a population based metric to compare different territories (XX per 100,000 is common).

Comparison of exponential data shown on a normal scale and on a logarithmic scale
A regular scale will have regular increments in the "Y" axis so if one point is twice as high as another, you can tell that it is twice the value. A logarithmic scale grows exponentially - generally log 2, so doubling on each equal size of increase (though the presentation usually rounds grid lines to 10s). If a point is twice as high as another, the higher value is the original value squared (e.g. 4 -> 16, 8 -> 64). It can be difficult for people to interpret, especially if they are not mathematical. In fact, I would suggest that it is almost impossible for most users to quickly tell the accurate difference in magnitude between different points - merely that one point is greater or lesser than another.

There is a general situation where it is useful to use a log scale, and that is where there is some skew in the data. For example, where there is a mix of some very high and many lower values - such as with exponentially growing data. In that situation, the scale of the higher values can obscure the lower values.

Ten US States growth shown on a normal scale. The higher value in one state hides detail in the other states. The dashed grey lines show example exponential growth patterns.
As an example, consider the chart above which shows growth patterns in several US states. All have a exponential type growth, but the higher values in New York makes it difficult to see the direction in detail of the smaller values. The scale needs to accommodate the high New York values, but most of the "action" in this chart it at the smaller values.

Comparison of ten US states using a logarithmic scale. The trajectory lines are straightened and it is easier to see the trajectory of the states with lower values.
When the same data is presented on a logarithmic chart, all of the lines are straightened and we get a much better view of the trajectory of each state.
I can now clearly see that Michigan's trajectory appears to be heading in a slightly worse direction than New York's. I am not concerning myself with how much farther ahead on the trajectory New York is, only the direction that they are both travelling and hence making mental forecasts about Michigan's future.

Bar chart with a logarithmic scale - don't do this kids! The log scale removes the comparative power of the bar chart.
BTW, I am good with using log scales like this for lines, but don't do it for bar charts! The effect of the logarithmic scale is to remove the power the the bar chart has of aiding our understanding of the difference in magnitudes. These differences are encoded by the length of the bar, a log scale will distort it. Don't do it!


Qlik Luminary, Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook