Making Lasagne, Not Spaghetti

The title of this blog post is taken from Noah Iliinsky’s master’s thesis and relates to designing complex diagrams that contain multiple elements, are multi-layered and well ordered, as opposed to ‘spaghetti diagrams’ that “have undefined axes, little or no order, and are a hodgepodge of similar elements”. The thesis highlights key concepts to consider when designing data visualizations and suggests a design process for creating better-defined, more meaningful and more useful data visualizations. This blog post is a quick overview of some of those concepts, as a list of things to consider when designing diagrams / data visualizations. How these concepts directly apply to my work on ON Course will be outlined in a future blog post.

Firstly, how do we define what constitutes a complex diagram? Dictionary definitions of complexity don’t really help in this particular context: “The state or quality of being intricate or complicated”. The thesis suggests three criteria that may contribute towards a diagram being considered complex.

  1. At least four different types of information to present
    Why four different types? This number was arrived at because the vast majority of visualizations encode three or fewer data types, whereas far fewer encode four or more.
  2. Large amounts of information
    This concept is fairly self-explanatory: if you have to present an inordinate amount of data, then whichever method of presentation you choose, it’s going to be a complex diagram.
  3. Presenting qualitative information
    Presenting qualitative information leads towards complex diagrams because it lacks the standard metaphors / representations / scales that often exist for quantitative information. As a result, the designer / author has to create the representative metaphor themselves, in such a way that the target audience can understand the data being shown to them.
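
As a rough sketch of my own (it is not taken from the thesis), the three criteria above could be written as a simple checklist in Python. The names DiagramSpec and complexity_criteria are hypothetical, and the cut-off used for "large amounts of information" is an arbitrary assumption, since no number is given for that criterion.

```python
from dataclasses import dataclass

# Assumption: an arbitrary cut-off for "large amounts of information".
# The criteria above do not give a number for this.
LARGE_ITEM_THRESHOLD = 200


@dataclass
class DiagramSpec:
    """Hypothetical description of a planned diagram."""
    data_types: int        # distinct types of information to encode
    item_count: int        # rough number of data items to show
    has_qualitative: bool  # True if any qualitative information is shown


def complexity_criteria(spec: DiagramSpec) -> list[str]:
    """Return which of the three criteria the planned diagram meets."""
    met = []
    if spec.data_types >= 4:
        met.append("four or more types of information")
    if spec.item_count >= LARGE_ITEM_THRESHOLD:
        met.append("large amount of information")
    if spec.has_qualitative:
        met.append("qualitative information")
    return met


if __name__ == "__main__":
    planned = DiagramSpec(data_types=5, item_count=40, has_qualitative=True)
    for criterion in complexity_criteria(planned):
        print("Meets criterion:", criterion)
```

A diagram that meets any one of these may already count as complex, so the useful part is seeing which criteria apply rather than computing a score.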

Why be concerned with how complex a diagram is going to be? More data types to present in the visualization means that more distinct encoding methods are required. “Once the most obvious encoding has been used, the author must venture into more subtle encodings to make their points. Clearly, the more characteristics there are to be represented, the more difficult this becomes.”

Five Fundamental Principles

Iliinsky suggests that five fundamental principles should be considered when designing and creating data visualizations. By creating good complex diagrams, “knowledge can be easily extracted from the diagram because of good choices made by the authors”. These five principles are: ‘Different goals require different methods’; ‘Audience brings context with them’; ‘The principle of information availability’; ‘The principle of semantic distance’ and ‘The principle of informative changes’.

Different goals require different methods

This first principle relates to the reasons and goals for designing a particular diagram. Determining the goals of a diagram has a direct impact on the methods used to create the visualization. This is not to say that diagrams with very different goals will never be made using quite similar methods. It is important, however, to realise that there is no panacea in data visualization: no single method of visualizing data will meet every goal.

Audience brings context with them

This principle highlights the need to consider and understand the target audience when designing data visualizations. As data is being encoded, it is important to make sure that the encodings being selected (colour, spacing, grouping, labeling, positioning etc.) are all understandable by the audience.

“Common sense would dictate that, in virtually any medium, the audience must be considered when designing any communication. However, the degree to which the audience influences the interpretation of a communication is often underestimated. The needs, background and biases of the audience are too-frequently ignored, discounted, or erroneously assumed to be similar to that of the document designer.”

Just because the designer “knows” that ‘x’ means ‘y’, doesn’t mean that the audience will hold the same view!

The principle of information availability

This principle is similar to a concept I’ve blogged about from one of Tufte’s books. The information being presented to the audience needs to be easily accessible and available. “…no matter what, the operating moral premise of information design should be that our readers are alert and caring.  They may be busy, eager to get on with it, but they are not stupid.” (Tufte)

A diagram with a good level of information availability will result in the audience being able to learn quickly and easily from the diagram. The audience shouldn’t have to spend 10 minutes working out what the information being shown to them actually is or what it might show.

The principle of semantic distance

This principle relates to the meaning that is given or inferred from the spacing between elements within a diagram. Elements of a diagram that are closer to each other will be perceived as being conceptually closer than elements that have been placed further away in the diagram.

The principle of informative changes

“Readers … expect any change in a pattern to mean something, … when something changes, there is – or should be – new information”

This is an important concept to keep in mind when designing diagrams, be it a standalone diagram or a series of related diagrams. If there is any visual difference between elements in the same diagram, or the same element in a series of diagrams, then it is quite likely that the audience will perceive them as being different elements. If no data has changed, then the appearance and presentation of the elements should remain consistent. Conversely, if something has changed, make sure it looks like it has!
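
One way I could imagine enforcing this in code, as a minimal sketch rather than anything from the thesis, is to share a single lookup of encodings across every diagram in a series, so that the same element always gets the same colour. The EncodingRegistry class, the category names and the colour values below are all hypothetical.

```python
# Hypothetical palette; any list of distinct colours would do.
PALETTE = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a"]


class EncodingRegistry:
    """Hand out one fixed colour per category, reused across diagrams."""

    def __init__(self, palette):
        self._palette = list(palette)
        self._assigned = {}

    def colour_for(self, category: str) -> str:
        # A category seen before keeps its colour, so nothing changes
        # visually unless the underlying data has changed.
        if category not in self._assigned:
            if len(self._assigned) >= len(self._palette):
                raise ValueError("palette exhausted: add more colours")
            self._assigned[category] = self._palette[len(self._assigned)]
        return self._assigned[category]


if __name__ == "__main__":
    registry = EncodingRegistry(PALETTE)
    # Two diagrams in a series that share the "lectures" element.
    diagram_one = {name: registry.colour_for(name) for name in ["lectures", "labs"]}
    diagram_two = {name: registry.colour_for(name) for name in ["seminars", "lectures"]}
    print(diagram_one["lectures"] == diagram_two["lectures"])
```

Raising an error when the palette runs out is a deliberate choice in this sketch: silently reusing a colour would make two different elements look the same, which is the opposite problem.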

The principles outlined above have all been summarised from Noah Iliinsky’s master’s thesis, which can be downloaded from complexdiagrams.com. I recommend reading it, as it goes into a lot more detail than I’ve outlined in this post and contains a lot of other material that I haven’t even mentioned.

My next blog post will be about how I apply these concepts to creating visualizations as part of the ON Course project.