FlowingData, one of the accounts that I follow on Twitter, blogged today that the visualization below, published by the Wall Street Journal, was a good use of pie charts.
Unfortunately, I have to completely disagree!
I am not against pie charts in a part-to-whole comparison (see QlikTips: Defending Pie Charts) but using them for this type of comparison is not good.
There are two things being encoded by these pie charts:
- percentage of shares being sold in the IPO - represented in the traditional pie segment fashion.
- volume of shareholding - represented by the size of the circle.
The first is OK in one single pie chart showing one part-to-whole comparison. However, the second is a very poor way for us to compare two values - especially when they are not side by side. Taken together, it is almost impossible to get any real insight from this chart.
As an example, how easy do you find it to compare the 50% slice in Tiger Global's pie with the 6% slice in Mark Zuckerberg's pie? You can't - it just isn't possible. The reality is that the 6% represents a much bigger share volume than the 50%.
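To make the arithmetic concrete, here is a minimal Python sketch. Only the 50% and 6% slices come from the chart; the holdings are made-up round numbers chosen for illustration, not the figures from the WSJ graphic.

```python
# Hypothetical holdings (number of shares) - illustrative only,
# NOT the real figures from the WSJ graphic.
holdings = {
    "Tiger Global": 50_000_000,
    "Mark Zuckerberg": 500_000_000,
}

# Percentage of each holding sold in the IPO (the pie slices in the chart).
percent_sold = {
    "Tiger Global": 50,
    "Mark Zuckerberg": 6,
}


def shares_sold(name):
    """Absolute number of shares sold: holding multiplied by the slice."""
    return holdings[name] * percent_sold[name] // 100


for name in holdings:
    print(f"{name}: {percent_sold[name]}% slice = {shares_sold(name):,} shares")
```

With these assumed holdings, the 6% slice is worth more shares than the 50% slice, which is exactly the comparison that the pie segments alone cannot show.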
I would contend that a bar chart is almost always the best medium to represent this type of information. It is so much easier to interpret and see the values.
Stephen Redmond (@stephencredmond) is CTO of CapricornVentis, a QlikView Elite Partner
Saturday, 19 May 2012
Tuesday, 15 May 2012
Good Geographic Charting
I am not generally a fan of geographic mapping. I have seen very few examples of it done well.
The problem is that the geography is often not proportional to the value being measured. Take this example from The New York Times:
This is not an atypical image that one would see in this area. A color is applied to each of the states to indicate a measure. In this case, it is current Democrat (blue) versus Republican (red) support.
By the looks of this map, the Republicans are doing very well. But there is a distortion: many sparsely populated states, which therefore have few electoral college votes (e.g. Montana - 3, Wyoming - 3, Idaho - 4, South Dakota - 3, North Dakota - 3), have a large land mass. On the other hand, we have states with small land masses (e.g. New Jersey - 14, Massachusetts - 11, Maryland - 10, Connecticut - 7) that have much denser populations and hence more electoral college votes. Texas (38 votes) has a much bigger area than most other states, but it is actually California (55 votes) that has the most votes. Hawaii (4) is far smaller in area than Alaska (3), but it has one more vote.
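A quick tally of the examples above shows the size of the distortion. This short Python sketch uses only the vote counts quoted in the paragraph; the grouping into two dictionaries is just for illustration.

```python
# Electoral college votes quoted in the paragraph above.
large_sparse = {"Montana": 3, "Wyoming": 3, "Idaho": 4,
                "South Dakota": 3, "North Dakota": 3}
small_dense = {"New Jersey": 14, "Massachusetts": 11,
               "Maryland": 10, "Connecticut": 7}

sparse_total = sum(large_sparse.values())
dense_total = sum(small_dense.values())

print(f"{len(large_sparse)} large, sparse states: {sparse_total} votes")
print(f"{len(small_dense)} small, dense states: {dense_total} votes")
```

The five large states carry 16 votes between them, while the four small ones carry 42, yet the five dominate the map.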
The New York Times have actually taken a much better approach in their main Electoral Map.
This time they have changed the size of each state to represent the actual number of electoral college votes that it has. Now, the map shows a much smaller swathe of red through the middle and better reflects the reality that (currently) the Democrats have a slight lead.
I really like this because it is a more accurate heat map while still retaining the geographic context.
I thought about how this might apply to Europe. We are used to seeing a map like this (from Google Maps):
But the land area occupied by a country doesn't always reflect its population size. I knocked up this visualization in QlikView:
I haven't applied any color coding to this yet but it does give you an idea of how the populations are sitting. For example, Sweden has a much bigger land area than Germany (about 450k km² versus about 360k) but has a much lower population (9.3m v 81.7m). Iceland changes from a large island to a small speck in the ocean.
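As a rough illustration of the idea, here is a Python sketch that works out how much each country's outline would need to be stretched or shrunk so that its drawn area is proportional to its population. It uses only the approximate figures quoted above. The function names and the choice of the combined density as the reference are my own assumptions, and this is not how the QlikView chart was built.

```python
import math

# Approximate figures quoted in the text: (land area in km², population).
countries = {
    "Sweden": (450_000, 9_300_000),
    "Germany": (360_000, 81_700_000),
}


def density(area_km2, population):
    """People per square kilometre."""
    return population / area_km2


def linear_scale(area_km2, population, reference_density):
    """Factor to apply to a country's width and height so that its
    drawn area becomes proportional to its population.
    Area grows with the square of a linear factor, hence the sqrt."""
    return math.sqrt(density(area_km2, population) / reference_density)


# Reference: overall density of all the countries on the map (an assumption).
total_area = sum(area for area, _ in countries.values())
total_pop = sum(pop for _, pop in countries.values())
reference = total_pop / total_area

for name, (area, pop) in countries.items():
    print(f"{name}: {density(area, pop):.0f} people/km², "
          f"scale outline by {linear_scale(area, pop, reference):.2f}")
```

Sweden comes out at roughly 21 people per km² and Germany at roughly 227, so Sweden's outline shrinks and Germany's grows.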
As with all charts, it is important to make sure that context is not skewed. This method of geographic charting helps maintain the contexts.
Monday, 14 May 2012
Brushing Heatmaps
I picked up a fun visualization from Matt Stiles on thedailyviz.com called "How common is your birthday?"
Some of the commentary - especially from Andy Kirk - made me think a bit about how we use heatmaps. So, I fired up QlikView, grabbed the data from The New York Times, and opened up a link to Color Brewer.
The map in question shows us the rank of each of the days of the year as regards number of births - rather than the actual numbers. Here is my representation using a QlikView pivot chart:
The code for the color is:
ColorMix1(Rank/366, RGB(0,68,27), White())
I believe that there is a fundamental flaw in representing this many points (366 of them) in a heat map where the value is the rank rather than the actual value. The flaw is that the difference in color between two blocks tells us nothing about the difference in magnitude between them.
With this number of blocks, I can see that there is an obvious pattern of darker colors in July-September, but I find it very difficult to pick out, among the sea of darker colors, the ones that represent the actual top ranked days.
Feeling that this might be something that people might want to do, I thought about how I might do it in QlikView and came up with this variant:
Here, I am using a second color range to represent the top 10 (all in September) and the bottom 10 (with February 29th obviously being the lowest).
Essentially, this is an example of brushing, but applied to a heatmap.
The code for the mixed block is:
if(Rank <= 10, ColorMix1(Rank/20, RGB(12,15,124), White()),
if(Rank > 356, ColorMix1((367-Rank)/20, RGB(179,0,0), White()),
ColorMix1(Rank/366, RGB(0,68,27), White())))
It would be a fairly straightforward matter to give the user an interactive facility to turn on/off the Top 10 or Bottom 10.
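For readers without QlikView to hand, the same logic can be sketched in Python, including the on/off switches. This assumes that ColorMix1 blends linearly in RGB, from the first color at 0 to the second color at 1. The function and parameter names (`color_mix`, `block_color`, `brush_top`, `brush_bottom`) are hypothetical, not part of any QlikView API.

```python
WHITE = (255, 255, 255)
GREEN = (0, 68, 27)    # main range
BLUE = (12, 15, 124)   # brush for the top-ranked days
RED = (179, 0, 0)      # brush for the bottom-ranked days


def color_mix(t, color_zero, color_one):
    """Linear RGB blend: t=0 gives color_zero, t=1 gives color_one.
    Assumed to approximate what ColorMix1 does."""
    t = max(0.0, min(1.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(color_zero, color_one))


def block_color(rank, brush_top=True, brush_bottom=True, n=10, total=366):
    """Color for one day, given its rank (1 = most common birthday)."""
    if brush_top and rank <= n:
        return color_mix(rank / (2 * n), BLUE, WHITE)
    if brush_bottom and rank > total - n:
        return color_mix((total + 1 - rank) / (2 * n), RED, WHITE)
    return color_mix(rank / total, GREEN, WHITE)


print(block_color(1))                   # brushed blue
print(block_color(366))                 # brushed red
print(block_color(1, brush_top=False))  # brush off: falls back to green
```

In QlikView itself, the two flags would most likely be driven by variables that the user toggles from a button or input box.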
Tuesday, 8 May 2012
If you don't know where you are going, then you'll probably get there
It is one of the main sets of KPIs that a lot of our customers look for in any BI implementation. They want to see performance versus the same period last year. Or perhaps the year to date. Or maybe a moving annual total.
Mr. and Mrs. Doe leave their house to go for a drive to somewhere. After about an hour, Mr. Doe asks Mrs. Doe how far they have come since they left home. She confirms that they have successfully traveled approximately 100km from their point of origin. Satisfied, Mr. Doe continues on.
After about another hour, he asks again how far they have gone. Again, Mrs. Doe confirms that they have traveled approximately 100km from the last checkpoint. They continue on their way, each time checking that they have traveled the appropriate distance since the last checkpoint. After 4 hours, Mr. Doe is surprised to find that they have arrived back in their home town, but on the wrong side of the tracks, and their gas tank is running low!
Look at the image above. It is a KPI showing that our YTD is well down on the previous year. There must be wailing and gnashing of teeth at the next board meeting. However, let us consider what this is measuring.
Unless you have an extremely stable business, your sales in any one period will actually be influenced by multiple different factors - many of which you have had no control over. It is, effectively, a random number. If I was in the Energy business in the UK, this chart might reflect my business - but is that because of the weather? January 2011 was far milder than January 2010 so my sales will have been down. In 2010, they would have been up and everyone would have been smiling - but it was because of the weather, not anything that I had control over. If I was a retailer, Easter 2010 was in early April so I might have had a seasonal spike in sales in late March. Easter 2011 was late in April so the spike might not have kicked into the QTD figures. Hence sad faces on the shop floor.
What if the figures are up on last year? Does that mean that I need to give everyone an extra holiday to celebrate? Not necessarily. Again, lots of different things might be affecting the numbers. Perhaps you launched a new product. Maybe you hired a whole load of sales people in a new territory and the stock is flying off the shelves. Your business this year will be so different from your business last year that comparing the two is comparing apples and oranges.
There is a better way.
Mr. and Mrs. Roe leave their house in Ellsworth, WI, planning to take a drive to Lake Wisconsin - about 4 hours away. Before they leave, they plan their journey on the map and mark out where they should be at each hour. After the first hour of the journey, Mr. Roe asks Mrs. Roe how far they are from their first marker. She tells him that they are a little short of it. Mr. Roe is a careful driver and has been driving a bit under the speed limit, so he gives it a little more gas.
After about another hour, he checks in again about how far they are from the next target. This time they are a little ahead. He knows that he could take his foot off the gas a little but decides that he wouldn't mind getting there a little earlier. He lets Mrs. Roe know this and she readjusts the markers. They end up reaching their destination ahead of time and have a great time at the lake.
A business should not be relying on random events to judge how it is doing. If I am a retailer, I will know when the major holidays are going to occur. I will look at last year's figures, apply some thought as to where I can see growth, apply some mathematics, and come up with a sales forecast for the year - a forecast that should reflect the strategic direction of the company. If things change (like a really good summer!), I can change the forecast to reflect them. If I am in the energy business, I will be constantly looking at long range weather forecasts and modifying the sales forecast.
So, unless your year-on-year business is extremely stable, comparing one set of effectively random numbers against another set isn't really going to tell you much about your business. Comparing them against a well thought out and planned set of numbers is going to tell you exactly where you are going and then the KPI is going to let you know if you need to intervene and change things - exactly what a KPI should do.
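A small Python sketch shows how differently the two comparisons can read for the same actuals. All of the figures here are invented for illustration (think of an energy supplier in a mild winter), and `ytd_variance` is a hypothetical helper rather than anything from a BI tool.

```python
# Hypothetical monthly sales - invented for illustration only.
last_year = [120, 110, 130, 125]   # a cold winter: high sales
forecast = [100, 105, 115, 120]    # plan already allows for a milder winter
actual = [102, 104, 118, 123]      # what actually happened this year


def ytd_variance(actual, baseline):
    """Year-to-date actuals relative to a baseline, as a fraction."""
    actual_total = sum(actual)
    baseline_total = sum(baseline)
    return (actual_total - baseline_total) / baseline_total


print(f"YTD vs last year: {ytd_variance(actual, last_year):+.1%}")
print(f"YTD vs forecast:  {ytd_variance(actual, forecast):+.1%}")
```

Against last year the business looks to be in trouble (about 8% down); against the plan it is slightly ahead (about 2% up), which is the more useful signal for deciding whether to intervene.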
If you don't know where you are going, then you'll probably get there.