Data Visualization (2)

This blog post is my second on ‘Data Visualization’ and is based on quotes and notes that I’ve made around Edward Tufte’s The Visual Display of Quantitative Information. The following quote from the book is an excellent way of describing the simplicity of data graphics (or visualizations) whilst recognising how useful they can be.

Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful.

In his book, Tufte dedicates individual chapters to certain aspects of graphics:

Graphical Excellence

Tufte discusses achieving ‘graphical excellence’ by adhering to, or adopting, a series of key tenets. Obviously some of these tenets are more easily achieved than others, but by aiming to adhere to these principles, hopefully the use of data graphics in this project will be a good example of creating quality visualizations.

Firstly, data graphics should:

  • Show the data
  • Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production or something else
  • Avoid distorting what the data have to say
  • Present many numbers in a small space
  • Make large datasets coherent
  • Encourage the eye to compare different pieces of data
  • Reveal the data at several levels of detail, from a broad overview to a fine structure
  • Serve a reasonably clear purpose: description, exploration, tabulation or decoration
  • Be closely integrated with the statistical and verbal descriptions of a data set

These principles lay a simple groundwork for constructing data graphics: show the data, don’t twist what the data shows, make it easy to look at and don’t let fancy technology and pretty baubles get in the way of the true purpose of the graphic: to convey information to the viewer.

Tufte goes on to describe ‘graphical excellence’ as being: the well-designed presentation of interesting data; complex ideas communicated with clarity, precision and efficiency; giving the viewer the greatest number of ideas in the shortest time, with the least ink, in the smallest space; and being multivariate – showing more than one variable.

In terms of this project, the relevance of graphical excellence is simple. The process of selecting and applying to university is a complex one, so we should be presenting information clearly and efficiently. We shouldn’t be making applicants remember details for a long time while they search for information, and we should be showing them as much information as they require, in as short a time as is reasonable and effective. As mentioned in my previous blog post, the human brain is far better at taking in a lot of information at a single time than it is at remembering multiple pieces of information over a longer period.

Graphical Integrity

Obviously we have to be truthful and not misleading when representing this data. Tufte lays out a few simple steps for ensuring this integrity:

The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented.

Basically, don’t distort the numbers being shown by playing around with how they’re visualized. If value B is twice that of value A, the graphical representation of B should be twice the size of the representation of A.
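
As a rough illustration of this principle, here is a minimal Python sketch that compares the change shown in a graphic with the change in the underlying numbers. Tufte’s book describes a measure along these lines, which he calls the ‘Lie Factor’; the function names, parameters and example numbers below are my own assumptions for illustration, not anything taken from the book or from this project’s code.

```python
def relative_change(first: float, second: float) -> float:
    """Size of the change between two values, relative to the first."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(data_a: float, data_b: float,
               drawn_a: float, drawn_b: float) -> float:
    """Ratio of the change shown in the graphic to the change in the data.

    A result of 1.0 means the drawn sizes are proportional to the values.
    All names here are hypothetical, chosen for this sketch only.
    """
    data_change = relative_change(data_a, data_b)
    if data_change == 0:
        raise ValueError("the data values must differ")
    return relative_change(drawn_a, drawn_b) / data_change


# Value B (20) is twice value A (10).
# Bars drawn 50px and 100px high keep that proportion.
honest = lie_factor(10, 20, 50, 100)
# Bars drawn 50px and 200px high exaggerate the difference.
distorted = lie_factor(10, 20, 50, 200)

print(honest, distorted)
```

A result close to 1 suggests the graphic is representing the numbers faithfully; the further the result drifts from 1, the more the graphic overstates or understates the difference.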

Clear, detailed, and thorough labelling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data.

Where appropriate, particularly in this project, it may be necessary to label and/or provide a textual description of some of the data being presented in order to avoid ambiguity. This will be particularly important in datasets such as the KIS, where the source of the data, or how a particular figure has been derived, can differ from one instance to another.

Show data variation, not design variation.

When presenting various datasets that are similar, the design should remain consistent so that any variation in the view shown to the viewer is due to variations in the data being presented, rather than the method of presentation.
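
One way this could look in practice is sketched below: every chart is built from the same style settings and the same axis range, so the only thing that differs between charts is the data. This is only a sketch under my own assumptions; the style dictionary, the function names and the course figures are all made up for illustration and are not part of any real charting library or of the project’s data.

```python
# Hypothetical style settings, shared by every chart.
SHARED_STYLE = {"chart_type": "bar", "colour": "#4c72b0"}


def shared_axis_range(datasets: dict) -> tuple:
    """Return one (low, high) range that covers every dataset."""
    all_values = [v for values in datasets.values() for v in values]
    return min(all_values), max(all_values)


def chart_specs(datasets: dict) -> dict:
    """Build one chart description per dataset, varying only the data."""
    axis_range = shared_axis_range(datasets)
    return {
        name: {**SHARED_STYLE, "y_range": axis_range, "values": list(values)}
        for name, values in datasets.items()
    }


# Made-up figures for two similar datasets.
example = {
    "Course A": [60, 75, 90],
    "Course B": [40, 55, 70],
}
specs = chart_specs(example)
for name, spec in specs.items():
    print(name, spec["y_range"], spec["values"])
```

Because both charts share one axis range and one style, a taller bar in one chart really does mean a larger number than a shorter bar in the other.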

‘Data Ink’ and Graphical Redesign

Tufte outlines five core principles, in terms of ‘data ink’ – the amount of ink (or pixels) used to visualize data.

  • Above all else show the data
  • Maximise the data-ink ratio
  • Erase non-data-ink
  • Erase redundant data-ink
  • Revise and edit

The first point is self-explanatory – the purpose of the graphic is to show the data. The data-ink ratio is the share of a graphic’s ink (or pixels) that is used to show the data itself – the aim is to make that share as large as is reasonable. Where possible, the amount of ink/pixels used for non-data should be reduced, i.e. pointless gridlines, needless embellishments etc. Further to this, one should erase data-ink that is redundant and serves no real purpose, and then revise and edit the graphic in order to make better use of data-ink.
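
To make the ratio concrete, Tufte defines it as the data-ink divided by the total ink used to print the graphic. The short Python sketch below works this through for a chart before and after a tidy-up. The pixel counts are invented purely for illustration, and the function name is my own.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of the graphic's ink (or pixels) that shows data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Invented pixel counts: the data itself always takes 1200 pixels.
# Before: heavy gridlines, borders and decoration add 3600 more pixels.
before = data_ink_ratio(data_ink=1200, total_ink=4800)
# After: most of the non-data ink is erased, leaving 800 pixels of
# axes and labels around the same 1200 pixels of data.
after = data_ink_ratio(data_ink=1200, total_ink=2000)

print(before, after)
```

The data has not changed between the two versions; only the non-data ink has been removed, which is what pushes the ratio up.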

In terms of how applicable these concepts are to this project, I think there has to be a happy medium that can be maintained. Whilst removing surplus ‘ink’ is useful for preventing the over-embellishment of data visualizations, being overzealous could result in a similar phenomenon – making the graphic harder to read because there’s almost nothing there to look at.

Chartjunk

This term is exactly what it sounds like: filling a graphic with junk for the sake of doing so. A quote from Tufte’s book says it all:

Occasionally designers seem to seek credit merely for possessing a new technology, rather than using it to make better designs. … at least a few computer graphics only evoke the response ‘Isn’t it remarkable that my computer can be programmed to draw like that?’ instead of ‘My, what interesting data.’

 

These notes form just a tiny portion of what Tufte discusses in this particular book, but they are the points most applicable to this particular project. As I bring this post to a conclusion, I leave you with one last quote:

What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult, that is – the revelation of the complex.