Comments archive

The pages of this comments archive list all public comments on the latest version of the IBCS® Standards in chronological order.


Avatar

I had the same discussion with Jürgen exactly one year ago. It’s nice that it pops up again!

From my point of view, outliers for absolute values or deviations might not be recommended, but they are sometimes very useful and should therefore not be labeled “do not use!”.

Why?

Assume you have data-driven, auto-generated reports (daily, monthly). In that case, I would like to clarify some assumptions made by Rolf, which may lead to different conclusions:

– Wrong values can be corrected right away: This is mostly not true. (We are still in an auto-generated report: you are an employee, you don’t have admin rights on the database, and your analytical system will be overwritten with ERP values again tonight.)

– Wrong values can be marked as such: This is not always true, because sometimes it is unknown whether a value is wrong in the first place. Sometimes wrong values are even expected, but not always predictable by an algorithm (e.g. sensor values, wrong calculations, manual typing errors by an agency in Australia).

– Big differences are scaled with outliers (USA/Estonia sample): With outliers we do NOT scale differently. We mark values that are beyond a certain limit.

– Small values are more important than big values: Depending on the question, this is not inherently true. If I need to control the entities with smaller values, they are very important FOR ME. Still, I need to be aware of the second-priority values.

I have a simple example:

Imagine you are an HR controller at a big company who receives a monthly salary report by employee as a table, generated by an SAP service and automatically sent to you by e-mail. You don’t know the values of the next report yet.

Your job is to control the salaries. In practice, it is important for you to see who earns how much in absolute values. Maybe you even want to see the deviation from last year in absolute values.

Your company has 500 employees, and 10 of them earn so much, and even more compared to last year, that you really need to be aware of them, BUT you don’t have the influence or power to control these people (at the top of the company). So, which values are important to you? Maybe the ones below the limit of 200 kEUR! Who am I to tell you what is important for doing your job?
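
The idea of marking values beyond a limit without rescaling them can be sketched in a few lines of Python. This is only an illustration: the function name `mark_outliers` and the sample salaries are assumptions made up for this sketch, and only the 200 kEUR limit comes from the example above.

```python
def mark_outliers(values, limit):
    """Return (name, value, is_outlier) rows.

    Values are not rescaled; a value is only flagged when its
    absolute amount exceeds the limit.
    """
    return [(name, value, abs(value) > limit) for name, value in values.items()]


# Hypothetical salaries in kEUR; 200 kEUR is the limit from the example.
salaries = {"A": 62, "B": 85, "C": 480, "D": 140}
rows = mark_outliers(salaries, limit=200)

for name, value, is_outlier in rows:
    flag = "outlier" if is_outlier else ""
    print(f"{name}: {value} kEUR {flag}".rstrip())
```

The flagged rows stay in the report, so the reader remains aware of them while the values below the limit keep a readable scale.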

I can imagine more examples, and even have one in practice.

Avatar

Dear Rolf, I’d like to come back to the question of what’s “not important for business” regarding charts (I agree with you that tables are somehow the “lender of last resort” in this matter).

Here you write: “Charts on the opposite are always biased. This is valid when using charts in our analytical work and for our presentations. It is the selection of data and the selection of the chart type which makes the difference.” Can we (or IBCS) really predict, and hence define, what is important and what is not for all situations out there? I think not. Hence we should allow outliers for absolute values too. But outliers for absolute values shouldn’t be applied “by chance” just because there is not enough space. As with every other aspect of a chart, a specific value should be declared an outlier “on purpose” by the chart designer (and only as a last option, if other options are not feasible) and clearly marked as such. Ideally there is also an annotation that explains to the audience why the outlier is necessary. In interactive dashboards you could also add an option to switch between a view with and a view without outliers for absolute values.

Jürgen Faisst

I think the reason for this controversial discussion is that Alex, Lars, Raphael, and Tilo talk about interactive analytic systems (where you never know what numbers will pop up), whereas Rolf comes from the design of static reports. So let’s talk about interactive analytic systems:

I agree that all the situations can occur as described. However, following Raphael, I would not cut bars with absolute values exceeding a threshold automatically and by default. The default for absolute values should be “proper scaling”. Instead, I would propose an interactive functionality that allows the user to change the proper scaling by manually adding a threshold in order to better analyze the smaller values visually. However, this interactive functionality has more the character of “zooming into small values” than of “marking outliers”. So I would suggest proposing such a functionality as part of CH 4.5 “Use magnifying glasses” and using a different visualization for the truncated bars in this case. Then we can stick to using outlier indicators only to indicate big relative variances of small values.

Avatar

Dear Alexander and Lars –

thank you for your comments on “Use outlier indicators if necessary”.

The core idea of CH 4.4 is the sentence “If an outlier is not important for business … then it is not appropriate to scale the whole chart to this outlier”.

To my understanding, two situations are not important for business:
A) Outliers because of wrong big values
B) Outliers of relative variances because of big absolute variances and their small reference values

A) Wrong big values should be corrected right away – if we know about them or suspect them. But wrong data are no real outliers and should not be marked as such.

Wrong values might stem from wrong data collection, wrong calculations, or other reasons that lead to useless or impractical values.

B) Big relative variances can be found in practice quite often: In many cases, these big relative variances are not important. Relative variances of 100 percent and more easily occur if the reference value is small (and therefore of little importance).
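
A small worked example may make point B) more concrete. The sketch below is only an illustration with made-up numbers, and the function name `relative_variance` is an assumption. It shows that the same absolute variance produces a very different relative variance depending on the size of the reference value.

```python
def relative_variance(actual, reference):
    """Return the variance of actual vs. reference in percent of the reference."""
    if reference == 0:
        raise ValueError("relative variance is undefined for a reference of 0")
    return (actual - reference) / abs(reference) * 100


# Hypothetical numbers: the same absolute variance of +10 in both cases.
big = relative_variance(actual=1010, reference=1000)   # large reference value
small = relative_variance(actual=12, reference=2)      # small reference value
print(big, small)
```

With the large reference value the variance is about 1 percent, while with the small one it is about 500 percent, although both differ by the same absolute amount.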

To my understanding, other extremely big values should not be marked as outliers.

I think that the examples which you mention are typical practical situations: We have bigger values and we have smaller values – and we have to visualize this in a proper way. And I think that these examples are no outlier issues in a narrow sense.

In practice, which are the most difficult scaling problems?

x) Big differences: Big and small sales; big sales and small profits; small shares of the total; etc. Big differences are only a challenge for us if the measures have the same unit. We have no problem comparing different measures with different units.
One very important dashboard problem belongs here: Big differences because of drill-down functionality.

y) Small differences (which are important to analyze): Capacity utilization of 99.1% and 99.2%; headcount development from 1,012 to 1,014 employees; etc.

For these problems IBCS offers the following solutions:

BIG DIFFERENCES
Big differences must be scaled correctly even if we make extreme comparisons like these:

a) Sales of USA and of Estonia (which is only 1% of USA) and
b) Absolute sales and absolute profit (which is only 1% of sales)

It would be absolutely wrong and misleading to cut axes or use other means of distortion or manipulation: Here, Estonia is much smaller than USA, and profit is much smaller than sales…

Now, in addition, we want to see more about the development, the structure, or other details of these small values:

What can we do here?

1 – Indicators: We can use magnifying glasses (CH 4.5) or scaling indicators (CH 4.3) to show small values in more detail. One special form of magnifying glass would be to show the properly scaled data plus a specially highlighted data series with the smaller values in a much bigger scale – e.g. presented in colored lines or bars.

2 – Indexing: We can index all values – or only the series of small values – to a certain point of time. These indexed values can replace the original values or we add them to the original values.

3 – Non-linear comparisons: We can use “creative solutions” (CH 2.2 top): Using area comparisons, we can compare greater value spans than we can with linear comparisons. Using volume comparisons, we can e.g. perfectly visualize 1,000 compared to 1: a cube of 10*10*10 represents 1,000, a cube of 1*1*1 represents 1. These “creative solutions” of CH 2.2 could be used in specific presentation slides but they are difficult or impossible to use in an automatic “dashboard environment”.

4 – Tables: Do not forget tables! Yes, this can be a perfect solution to present a) the sales values of USA and Estonia as well as b) big sales values and small profits. It is much easier to understand a table with the values “Sales 1,213; 1,256” and “profits 5; 6” than most charts with the same values…
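
Two of the suggestions above, indexing and volume comparisons, come down to simple arithmetic. The following sketch is only an illustration: the function names `index_to_base` and `cube_side` and all sample values are assumptions, not part of the standards.

```python
def index_to_base(values, base_position=0, base=100.0):
    """Index a series so that the value at base_position equals base."""
    reference = values[base_position]
    if reference == 0:
        raise ValueError("cannot index to a reference value of 0")
    return [value / reference * base for value in values]


def cube_side(value):
    """Side length of a cube whose volume represents value."""
    return value ** (1 / 3)


# Hypothetical sales of a big and a small country over three periods.
usa = [1200, 1260, 1320]
estonia = [12, 15, 18]

print(index_to_base(usa))       # both series now start at 100
print(index_to_base(estonia))
print(round(cube_side(1000), 6), round(cube_side(1), 6))
```

After indexing, both series can share one axis even though their absolute values differ by a factor of 100. The cube sides differ only by a factor of 10 for values that differ by a factor of 1,000, which is why volume comparisons can span such wide ranges.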

SMALL DIFFERENCES
In many cases, small differences of large values are even more challenging to visualize properly than big differences. Typical situations are:
c) Capacity utilization of 99.1% and 99.2%
d) Headcount development from 1,012 to 1,014 employees

Again, it would be absolutely wrong and misleading to cut axes or use other means of distortion or manipulation: The changes in utilization are very small, and the changes in headcount are small as well.
Again, we want to see more about the development, the structure, or other details of these small changes: What can we do?

Well, the same 4 suggestions from above might help.

ALTERNATIVE ANALYSES
Sometimes it helps to change the type of analytical setup. We must look for alternative solutions which are better suited to convey our analytical idea, e.g.:

5 – Axis start: Instead of truncating the axes when presenting small differences of c) and d), the axis starts at the first (or last) value and we show the (small) absolute changes to these values only. Whichever scale we use here, it should of course be the same in all other visualizations of this type.

6 – References: In all of these difficult cases we can look for references (plan values, averages, benchmarks, etc.) and present our data in relation to these references.

7 – New analyses: We change the structure of our data or add data. Regarding a): Comparing Estonia with every single state of the USA might lead to better insights. In general, breaking down data to lower levels often helps to understand things much better, and causes fewer scaling problems. Regarding b): Adding the data series “profit as % of sales” might lead to the intended insight and will cause no scaling problems.
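
Suggestions 5 and 7 can be illustrated with a short sketch. It is only an example under stated assumptions: the function names are invented for this illustration, the middle headcount value and the sales and profit figures are made up, and only the headcount range comes from case d).

```python
def changes_from_first(values):
    """Absolute change of every value relative to the first value."""
    first = values[0]
    return [value - first for value in values]


def share_in_percent(part, total):
    """Express part as a percentage of total, e.g. profit as % of sales."""
    return [p / t * 100 for p, t in zip(part, total)]


# Headcount range from case d); the middle value is a hypothetical addition.
headcount = [1012, 1013, 1014]
print(changes_from_first(headcount))     # [0, 1, 2]

# Hypothetical sales and profit values for case b).
sales = [1200, 1250]
profit = [12, 15]
print(share_in_percent(profit, sales))
```

In both cases the derived series has a small value range of its own, so it can be charted without truncating an axis.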

PRACTICAL PROBLEMS
We face several severe practical problems if we want to follow these suggestions above automatically, if our software tools must find “optimal” solutions e.g. for dashboards and drill-down techniques(!). But I am confident that good programmers will be able to find these solutions – provided that we have good theoretical approaches.

CH 4.2 “Size charts to given data” has not often been applied in practical solutions for difficult scaling issues. In many dashboards, we have several charts with the same measures and the same unit: Some of them show small values, others show big values (if I understand Lars correctly, this is one of the challenges he meets in practice). Here future software tools will have to find “optimal” chart sizes and arrangements to maximize their common scale and minimize unused chart space.
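
One very simple way to think about “Size charts to given data” is sketched below. It assumes that the charts are stacked vertically and share a fixed total height, which is a simplification chosen for this illustration. The function name `size_charts` and the sample values are hypothetical, and a real layout algorithm would have to consider far more constraints.

```python
def size_charts(chart_maxima, total_height_px):
    """Give every chart a height proportional to its largest value.

    All charts then share one scale (pixels per unit), and the stacked
    charts together fill exactly total_height_px.
    """
    total_units = sum(chart_maxima.values())
    if total_units <= 0:
        raise ValueError("chart maxima must sum to a positive number")
    px_per_unit = total_height_px / total_units
    heights = {name: maximum * px_per_unit for name, maximum in chart_maxima.items()}
    return px_per_unit, heights


# Hypothetical dashboard: three charts with the same measure and unit.
maxima = {"Region A": 900, "Region B": 90, "Region C": 10}
px_per_unit, heights = size_charts(maxima, total_height_px=500)
print(px_per_unit, heights)
```

Because every chart gets only the space its data needs, the common scale is as large as the total height allows and no chart wastes space on an empty value range.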

CONCLUSION
Visualizing things means comparing things. And we should try to find better solutions than cutting or truncating axes or manipulating visualization elements. In most cases, this does not show the truth, this leads to wrong interpretations, and this does not support good decision making.

Tables: If there is no other way out: Use them!
Outliers: We use them for relative variances only!
Semantics of scales: We still need a concept that lets us see right away “It’s in thousand”, “It’s in millions”, “It’s in percent”, etc. (Again: This is very important to solve the drill-down issue…)

Avatar

Hi Ulrich,

Thank you for your feedback. Your example shows one usage scenario I had in mind too: using the width for differentiating basic measures.

I absolutely agree with the “True view” premise, but there are many situations where none of the three options (winding bars, scaling indicators or normalization) is suitable, and I don’t see a suitable solution in the standards right now.

For example, consider a dashboard showing turnover and revenue for a business with a really low margin (for example stock exchanges), where the users want to see the absolute values. Currently we can only use the scaling indicator to scale the graphs and/or to highlight the usage of different units. But in my opinion, the scaling indicator only works well if both graphs can be placed side by side, which is not always possible. Winding bars can be used for outliers but don’t support this scenario, and normalization is not what the users want to see here. Using the same scale/unit will result in a useless graph for revenue. Therefore, enabling the usage of different units, with some sort of visual support for it, would be really helpful to increase the practicality of the rules.

I like the idea of having different bar widths, but maybe there are alternative solutions…(?)

Best regards,
Michael

Avatar

Imagine an interactive dashboard consisting of four charts with abs. deviations for ACT-BUD values. All charts are scaled equally to ensure proper comprehension.

If there is a zero or very small BUD value, then the deviation/variance is very high, which results in a strange visualization if it is not shown as an outlier: lots of real estate is wasted, as all other charts have to correspond to the pixel/unit ratio of this outlier variance. The alternative is a floating deviation axis, which is clutter …

I agree with Alex; our graphomate charts show an outlier if there is not enough space to show the complete ACT-BUD value at the pixel/unit ratio of all base charts.
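
The rule described here (draw every bar at the shared pixel/unit ratio and show an outlier only when the bar does not fit) could look roughly like the sketch below. It is not the actual graphomate implementation; the function name `bar_length` and all numbers are assumptions for illustration.

```python
def bar_length(value, px_per_unit, available_px):
    """Return (length_px, is_outlier) for one deviation bar.

    The bar is drawn at the shared scale; if it would not fit into the
    available space, it is cut at available_px and flagged as an outlier.
    """
    full_length = abs(value) * px_per_unit
    if full_length > available_px:
        return available_px, True
    return full_length, False


# Hypothetical ACT-BUD deviations and a shared scale of 2 px per unit.
deviations = [5, -12, 400]
bars = [bar_length(d, px_per_unit=2, available_px=100) for d in deviations]
print(bars)
```

Only the third deviation exceeds the available 100 pixels, so it alone is cut and flagged, and the other bars keep the common scale.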

Avatar

I’m pretty sure that there are certain scenarios where it could be useful to have outliers not only within relative deviations but also within absolute deviations or even regular charts (see also comments in UN 5.3), as they can also become unreadable if one measure is too high.

Examples for outliers would be:
– Vacation pay or sick pay in a chart for labor costs per month
– Additional payments for heating, power and water in a chart for operating costs per month.

Avatar

Hello Michael,

The category width and aggregation level problem not only arises when we deal with one indicator differentiated by regions, business units, etc., but also when dealing with KPIs where we have very wide data ranges, as shown in the chart below.



As we can see, category and bar width apply not only to the time dimension, but are also used for differentiating basic measures, calculated measures, ratios, etc. That’s why it could be difficult to use category width for visualizing aggregation hierarchies.

As one of the most important premises of the IBCS standards is the “True view”, even extreme values have to fit within the chart. Following this leads to using “winding columns/bars” (CH 2.2). The other way is applying scaling indicators (CH 4). A third way could be “normalizing” values to a base “100”, which is sometimes used in bullet charts and especially for comparing values (see http://www.graphomate.com/neues-zu-den-graphomate-bullet-graphs/).

Best regards,
Ulrich

Avatar

Hi Rolf, hi all,

I’m facing a similar issue in my daily business and would appreciate a solution that provides a visual differentiation between multiple aggregation levels or units used on a single report. One example is a dashboard showing turnover by different categories, where a meaningful scale can only be applied by using million EUR for the first graph and thousand EUR for a second graph. The same issue may occur when comparing turnover and revenue for a business with a really low margin (e.g. stock exchange turnover vs. trading fee).

Today the category and bar width is only used to differentiate between the time dimensions. I suggest extending the usage of the widths to a more generic definition: The width can be used to identify the level of aggregation. The higher the level, the wider the category segments. Here, aggregation can be the aggregation over time, the aggregation along a hierarchy (e.g. product hierarchy) or along a unit (e.g. metric system).

For the time dimension it still works because time is a standard aggregation hierarchy. But with the more generic definition it could also be used on the Y-axes to differentiate between category aggregation levels (drill from highest product group down to single item) or to indicate the usage of different units (for example millions for turnover and thousands for revenue in two graphs on a single report).

A more generic definition could also enhance the visualization according to UN 5.2 scaling indicators.

One additional thought: Today, the unit is typically mentioned once in the report title. If different units are used on a single report, we might also think about a standard visualization/position to show each graph’s unit.
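
To illustrate what a per-graph unit could mean in a tool, here is a minimal sketch that picks a unit for each chart from its largest value and returns a label to show next to that chart. The prefixes “k” and “m”, the function names and the sample values are assumptions for this illustration only, not a proposal for the standard notation.

```python
UNITS = [(1_000_000, "m"), (1_000, "k"), (1, "")]  # assumed prefixes


def pick_unit(max_value):
    """Pick the largest unit that keeps the biggest value at 1 or above."""
    for factor, prefix in UNITS:
        if abs(max_value) >= factor:
            return factor, prefix
    return 1, ""


def chart_unit_label(values, currency="EUR"):
    """Return the scaled values and a unit label such as 'mEUR' for one chart."""
    factor, prefix = pick_unit(max(abs(v) for v in values))
    return [v / factor for v in values], f"{prefix}{currency}"


turnover = [12_500_000, 14_000_000]   # hypothetical turnover in EUR
fees = [31_000, 36_500]               # hypothetical trading fees in EUR
print(chart_unit_label(turnover))
print(chart_unit_label(fees))
```

The two charts end up with different labels (millions for turnover, thousands for fees), which is exactly the situation where a standard position for the unit would help the reader.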

What do you think?

Best regards, Michael

Avatar

Thanks for starting this legends discussion – especially for interactive applications.

I once had a clear opinion regarding legends: there is no need for them.
But our clients convinced me that there is a need for legends (sometimes).
So we built a free component to meet these expectations 😉

First, I would like to distinguish between legends for data series and legends for categories.

If you show different measures (data series) in several charts on one screen, it’s a good idea to label these measures directly at the chart elements. If there is not enough space on the dashboard, then the chart title could be used. On the other hand, labeling all category elements directly could end in a mess if there are too many or too small chart elements, which is often the case with stacked bar/column charts.

As we suppress labels for small elements, our clients asked for a legend component: their dashboard addressees wanted to know what they are looking at.
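
The interplay of suppressed labels and a legend can be sketched as follows. This is only an illustration of the idea, not the component mentioned above: the function name `split_labels`, the minimum label height of 12 pixels and the sample stack are assumptions.

```python
def split_labels(segments, px_per_unit, min_label_px=12):
    """Decide per stacked segment whether it is labeled directly.

    Returns (direct, legend): names labeled inside the chart, and names
    that go to the legend, both in the original stacking order.
    """
    direct, legend = [], []
    for name, value in segments:
        if value * px_per_unit >= min_label_px:
            direct.append(name)
        else:
            legend.append(name)
    return direct, legend


# Hypothetical stacked column, listed from bottom to top.
stack = [("Product A", 120), ("Product B", 8), ("Product C", 45), ("Product D", 3)]
direct, legend = split_labels(stack, px_per_unit=0.5)
print(direct, legend)
```

Segments that are too small for a direct label end up in the legend, in the same order as in the stack, so the audience can still find out what they are looking at.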

Of course small multiples are a good alternative but they need much more space.

And yes, the lowest stack at the base line is the only series which could be read easily against the overall value.

Here is our idea: When you click on one stack series, this series is displayed at the baseline of the chart, and it’s always labeled.
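
In terms of data, this interaction is just a reordering of the stack. A minimal sketch, with the function name `move_to_base` and the series names as assumptions:

```python
def move_to_base(stack_order, clicked):
    """Return a new stacking order with the clicked series at the bottom.

    The remaining series keep their relative order.
    """
    if clicked not in stack_order:
        raise ValueError(f"unknown series: {clicked}")
    return [clicked] + [name for name in stack_order if name != clicked]


order = ["Product A", "Product B", "Product C"]  # bottom to top, hypothetical
print(move_to_base(order, "Product C"))
```

The clicked series then sits on the baseline, where its length can be read against the axis like an ordinary bar.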

But I also agree with Markus on placing a legend slightly offset, in the same fixed(!) order as the stacked values.

Second, my thoughts on an interactive legend on mouse-over.

As Raphael pointed out, most BI tools show the value and even the category label in a so-called tooltip when hovering over a chart element. This helps to understand what is shown, even in a quite detailed manner. But it doesn’t help to get a quick overview.

If you are not able to distinguish the formats for actual, budget, forecast or other values, for example, you need a legend. Legends consume real estate, and they are one more thing to explain to a dashboard user, so I suggest placing a small question mark or an “i” for information as an icon in the dashboard. When the user hovers over this icon, a pop-up opens and presents all necessary information on how to use the dashboard and interpret the data.

One last thought: Why not use legends only temporarily? If dashboards are designed consistently and are well structured, I don’t think there is a need to show legends all the time. 🙂
