Wednesday January 9th, 2013 by Björn Balazs
It seems to be an easy truth: Too much detail in icons confuses users. So we wondered whether we could find any evidence for this truth in the data of our large-scale test of the LibreOffice Icons.
First we did an expert rating and sorted the icons into two groups, depending on the level of detail in the icon.
Group 1 – Low Detail Icons
[Images: low-detail icons]
Group 2 – High Detail Icons
[Images: high-detail icons]
Some Statistics
Next we plotted the ratings that the icons in each group achieved. The higher the value, the better the icon (read about the methodology of testing icons):
[Box plot: icon ratings by group]
It looks like the two groups are actually different, but just looking is not enough to prove it; some calculations are needed. So we ran a Mann-Whitney U test, and the value Z = 4.015 shows that the two groups differ at a significance level of p < 0.001.
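For readers who want to see what such a test computes, here is a minimal sketch in plain Python. It is not the calculation we ran: the function name `mann_whitney_z` and the sample ratings are made up for illustration, and the sketch uses the simple normal approximation without tie or continuity correction, so a statistics package may give slightly different numbers.

```python
import math


def mann_whitney_z(a, b):
    """Mann-Whitney U for two independent samples (hypothetical helper).

    Returns (U for sample a, z score, two-sided p value) using average
    ranks for ties and the plain normal approximation.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((value, group) for group, sample in enumerate((a, b))
                    for value in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied values share the average.
        average_rank = (i + 1 + j) / 2
        rank_sum_a += average_rank * sum(1 for k in range(i, j)
                                         if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal curve
    return u_a, z, p


# Made-up ratings, NOT the study data.
low_detail = [9.1, 8.7, 9.5, 8.9, 9.8]
high_detail = [7.6, 8.2, 6.1, 9.0, 5.4, 7.0]
u, z, p = mann_whitney_z(low_detail, high_detail)
print(f"U = {u:.1f}, z = {z:.3f}, p = {p:.4f}")
```

Note that the test only says the two groups of ratings differ; it does not say why they differ.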
What do we learn from this?
Whenever you create icons, try to be as iconic as possible – or to phrase it differently: try to remove as much detail from your icons as possible. It is likely that the quality of your icons will benefit.
More Information
As always, you can find the raw results of the icon test in the LibreOffice Project on UserWeave. You are warmly invited to make up your own mind and to discuss the results with us.
17 Comments
-
Luke Wolf
Comment from Thursday January 10th, 2013 at 01:01 AM
I feel the need to point out here that your study is deeply flawed as almost none of the “low-detail” icons have an analogue in your “high-detail” iconset.
For a study of this type to actually hold any value you cannot just compare any set of icons to any other set of icons, they must be of a type. So a set of Save Icons can be compared in a single dataset but you can’t compare a “Low-Detail” Save Icon with a “High-Detail” Print Icon. It’s not that such a study isn’t a valid thing to test but your testing methodology was all wrong.
-
Heiko Tietze (User Prompt)
Comment from Thursday January 10th, 2013 at 01:03 AM
Of course it’s worth discussing the classification. But the table icon is just a simple grid with horizontal and vertical lines. Spelling consists of a colored check mark and text that is underlined or not.
The issue of too much detail becomes obvious for printer icons, for instance. Have a look at eye vs. flash or magnifier vs. feeder. These icons are not clear enough and received worse ratings, i.e. they are mixed up with other functions.
-
Luke Wolf
Comment from Thursday January 10th, 2013 at 01:21 AM
Heiko,
The thing is, though, whether the results are valid or not, the testing methodology was wrong, because what was done was the equivalent of doing a study of favorite fruits and asking “Would you like an Apple, an Orange, or a bar of chocolate?”. Yeah, you can get results out of the apple and orange results, but that chocolate bar there makes the entire study invalid.
Ultimately the problem with this study is that it’s not comparing “low-detail” vs “high-detail”; it’s comparing a random assortment of icons from set 1 and an almost completely different assortment of icons from set 2, and seeing which ones people recognize. This matters particularly because most of these “high-detail” icons are actually rather obscure.
You can’t actually even draw any real results from this other than that commonly used symbolism is easily caught onto, but going beyond that you start losing people. Which is common sense anyway.
-
Andy
Comment from Thursday January 10th, 2013 at 03:28 AM
> For a study of this type to actually hold any value you cannot just compare any set of icons to any other set of icons, they must be of a type. So a set of Save Icons can be compared in a single dataset but you can’t compare a “Low-Detail” Save Icon with a “High-Detail” Print Icon. It’s not that such a study isn’t a valid thing to test but your testing methodology was all wrong.
@Luke Wolf – While I agree with your overall conclusions, your process for getting there is flawed. You can compare various random low-detail icons to various random high-detail icons. The difference comes into the degrees of freedom and the strength of the resulting correlations.
I agree that setting up a survey with low- and high-res icons that look different but represent the same thing would be an improvement, but it would be possible to design a valid study without it.
The problem here is that the two sets of icons are also different from one another in their level of abstract complexity. The high-res icons are clearly representing more abstract concepts than the low-res icons.
The statistics don’t lie. The observed differences are real. The problem lies in the interpretation. Correlation does not imply causation. In this case I suspect the difference in abstract complexity plays at least as much of a role as the style of the icons themselves.
An alternative explanation could be that designers tend to design more complex icons when trying to create a representation (especially a skeuomorphic representation) of an abstract idea. And these icons are harder to understand. Note that I’m not blaming the complexity of the drawing any more than the complexity of the abstract idea. There isn’t enough information here to do that.
Although, I wonder if it could be possible to use the Tango icons to get at this. As I recall the Tango icons tend to be more detailed than the Oxygen Icons. That could be interesting although I would still caution against drawing too strong of a conclusion from the results either way.
-
Comment from Thursday January 10th, 2013 at 08:44 PM
Andy, thanks for your comment. I think it sums up pretty well the overall criticism or scepticism of the comments while keeping a tone that I feel is worth the time to reply to.
To do a study in a methodologically correct way, you need a set of hypotheses before you even design the study. This has not happened in this case. When we created the study we did not think about detail in icons. We simply wanted to benchmark the existing LibreOffice Icons. So I do agree with your criticism, even though I would not go as far as saying the study is worthless.
I have tried to be reserved with drawing direct conclusions within the article (ok, headline and teaser are a bit brave) – but seeing the replies here, obviously not reserved enough ;). Take it as: Designers often tell me that too much detail destroys an icon – can the existing data refute this view? It cannot.
So it is true: A better study design (like you and others have outlined) should follow these calculations. I hereby happily invite anyone to conduct such a study and will do my best to support it!
Let us work together to understand icons better!
-
-
TheBlackCat
Comment from Thursday January 10th, 2013 at 07:48 AM
Another issue is that the high-res icons cover a much wider range of “goodnesses”, a range that includes the range of the low-detail icons. This indicates that high-detail icons can be just as good as low-detail icons, given some other, unknown condition is satisfied.
-
Heiko Tietze (User Prompt)
Comment from Thursday January 10th, 2013 at 10:10 AM
> An alternative explanation could be that designers tend to design more complex icons when trying to create a representation (especially a skeuomorphic representation) of an abstract idea. And, these icons are harder to understand. Note that I’m not blaming the complexity of the drawing any more than the complexity of the abstract idea.
Fully agree. That’s what we want to say with ‘level of detail’.
And you are right with the methodological criticism. It oversimplifies our rather dry results concerning printer icons:
Print directly = 10.0/8.2 (Tango/Oxygen)
Page preview = 7.6/7.6 (Tango/Oxygen)
The small magnifier (along with the strange function label) complicates identification.
Therefore the conclusion should be clear: Try not to design icons that are too complex. Which does not mean that all complex icons are worthless, of course (@TheBlackCat).
-
Dirk
Comment from Thursday January 10th, 2013 at 11:05 AM
To me the most striking difference between the two groups is the difference in variability in your response variable. Even if you take the different sizes of the two groups into account (20 low, 28 high detail), it seems from the graph that the high detail group is more variable.
I would suggest replotting the data. For this, keep the separation into the low and high detail group, but instead of a summarizing graph like a boxplot, plot the raw data and use the icons themselves as plotting symbols (you can add x-jitter or lower the alpha for the icons to compensate for overplotting). Maybe you will see a pattern that could explain why some of the high detail icons score so badly and others so well.
-
mutlu
Comment from Thursday January 10th, 2013 at 11:09 AM
This is so flawed a “study” that its value is nonexistent. I am sorry, if I offend you, but this is a serious problem, if your results are taken seriously. You should let people do such studies who actually know what they are doing.
First of all, the distinction between low-detail and high-detail icons is highly inconsistent. Second, those in the first category point to rather well-known, well-entrenched and often-used actions many will easily recognize, while the latter are less often used. Third, the latter category also includes icons such as the “auto-spellcheck” and the “spelling and grammar” icons that are very close to one another both in meaning and in iconic representation. Thus, an allegedly low-detail icon like “save” will score higher than either of the spell-related icons due to the inherent ambiguity. Concluding that this has to do with the level of detail is not the way to go.
Instead, you would need two sets of icons that represent the same actions, one with more and one with less detail. Both icon sets would have to be equally known or unknown to your audience. From that, you could draw conclusions. From this, NOT.
I do appreciate your work, but please don’t encourage the F/OSS community to invest resources into design that are misdirected based on well-meaning but simply wrongly executed studies.
Thank you.
-
Aaron Seigo
Comment from Thursday January 10th, 2013 at 04:25 PM
Hoo boy.. I came to the comments section to ask some questions about how low versus high detail was determined; about how dimensions such as familiarity of both the icon and the represented action were sorted from the results; etc. Seems several others beat me to it (though I was going to ask questions rather than simply declare the study flawed/invalid).

It seems there may be real issues and I agree with mutlu’s sentiment of risking effort wasted by uncertain / inaccurate results.
I would also caution against attempting icons for certain things in the first place. For instance the print preview icon is probably impossible to make highly effective, and so I’d push that question back one step: do we even want print preview in our menus and on our toolbars? Should we simply go to “Print” and have the preview presented there? (Google Chrome, for instance, follows this method.) Such an approach resolves the issue entirely by re-positioning it.
Knowing which actions to target for repositioning does rely on good icon effectivity data though (at least in part).
As for the final weighing of results, it is not only effectivity that needs to be factored in. Visual appeal should also come into it, imho. A completely functional but visually unappealing UI may gain in usability but actually decrease the number of users.
As our goal as developers is not “increase the usability score”, but rather “increase number of users and their individual satisfaction”, of which “increase the usability score” is a significant aspect, this data set, even if perfectly arrived at, needs to be weighed alongside additional data derived on the same icons with a separate survey.
All that said, thanks for continuing to push forward the amount of data we have to work on. I hope the next iteration of this process will produce even more useful results!

-
Comment from Thursday January 10th, 2013 at 09:11 PM
Aaron, I absolutely agree that icons should not be trifled with. This is the reason why I try to push this topic. Next to aspects of the design itself, we have to question the general use of icons as you argue, and also consider internationalization, cultural aspects and so on. Even if this study does not answer all questions (which does anyhow?) – it might help us to phrase questions more precisely in future.
What I do not understand from your and mutlu’s comment: How could this article lead to the risk of wasted effort? I am not calling to action to redo all icons or something like that. I simply state:
> Whenever you create icons, try to be as iconic as possible – or to phrase it differently: try to remove as much detail from your icons as possible. It is likely that the quality of your icons will benefit.
I hope everyone will agree on that, independent of methodological questions concerning this study.

BTW: I think that the goal of Usability is to ‘increase the number of users and their individual satisfaction’. But whenever you want to test something you have to abstract somewhere. So yes: There is more to icons than the values we assess in this study.
-
-
Astron
Comment from Thursday January 10th, 2013 at 06:25 PM
Aaron Seigo wrote this:
> For instance the print preview icon is
> probably impossible to make highly
> effective, and so I’d push that question
> back one step: do we even want print preview
> in our menus and on our toolbars? Should we
> simple go to “Print” and have the preview
> presented there?
If only it were that simple… in LibreOffice, we have two different previews that show things that look like printed pieces of paper: there is a small print preview in our print dialogue (yay! too small, but a good start). The toolbar icon in question, though, is, if you ever bothered to read the tooltip, leading you to a function called the _Page_ Preview.
What would be the difference?, you ask, understandably. The difference is that the Page Preview does not take into account what kind of paper sizes the chosen printer can/wants to actually print to.
We had a longish discussion about this on our UX list … and some people even defended this “feature.” So, right now, it still is as it was.
-
Mohamed-Ikbel Boulabiar
Comment from Thursday January 17th, 2013 at 02:16 AM
Why do you still use all these old icons?
There are freely available ones from The Noun Project which are simpler than the ones presented here.
http://thenounproject.com/
-
Comment from Thursday January 17th, 2013 at 11:08 AM
Because we investigate the icons that are in use in LibreOffice. We will investigate the Noun icons in different studies.
-
-
Oren
Comment from Friday January 18th, 2013 at 09:32 PM
Hey, an R box-and-whiskers plot

I’m not sure the Mann-Whitney test can be used on this dataset. This test assumes that the two groups are independent, which doesn’t seem to be the case here since both icon types were ranked by the same people.
-
ari9999
Comment from Saturday January 19th, 2013 at 05:42 AM
Icons and logos share some key characteristics and objectives. The best logos (including those that win design awards) tend to use highly abstracted symbols, i.e., images stripped of detail.
Regardless of whether the LibreOffice study is scientifically valid — the methodology may indeed be flawed — pure intuition and personal reflection suggest the underlying idea may be valid: simpler graphics often communicate faster and more effectively, and are easier to recognize and recall.
You can test this yourself at free do-it-yourself logo sites like LogoGarden.com.


Nikita Skovoroda
Comment from Wednesday January 9th, 2013 at 11:56 PM
Why is http://user-prompt.com/wp-content/uploads/table.png called low detail, while http://user-prompt.com/wp-content/uploads/spellcheck1.png and http://user-prompt.com/wp-content/uploads/spelling.png are called high detail?
Are you sure that they were split into the two groups «low detail» / «high detail» by someone who was not aware of the survey results and hadn’t seen any statistics? If not, the conclusions you make have no actual value, because the person who did the split can rely on the survey results without even realizing it.