Categories and Content in Information Design

The growing trend of visualizing data in design is not isolated from changing conditions in contemporary culture; the information age has produced a surplus of data that must be placed in contexts that allow it to be understood quickly and at large scale. As approaches to these visualizations change at an accelerating pace, it can become difficult to understand what makes an information graphic successful. Edward Tufte recognized the advancement of information graphics ahead of the curve in 1983, concluding in The Visual Display of Quantitative Information that “Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency” (2001, pg. 13). Tufte’s criterion for graphic excellence measures success by the relationship between the form of the artifact and the content it deals with. If success depends on the structure of a graphic supporting the content in this way, can understanding how humans process and categorize information graphics enable designers to make better artifacts?

Eleanor Rosch’s chapter “Principles of Categorization” establishes the psychological principles humans rely on when forming categories. Because these principles underlie the way humans process categories before constructing new ones, they must influence contemporary categorizations of information graphics. It is also important to keep in mind that existing human categories are cultural and not arbitrarily floating in society (1978, pg. 27). Many other categories may have a role in shaping perceptions of what the categories of information graphics are. To better understand the relationship between the principles of categorization and how categories are constructed, examples that qualify as basic-level categories will be defined and investigated, because Rosch’s studies conclude that these categories are the most inclusive and have the greatest cue validity (1978, pg. 31).

Defining Categories
Using Tufte’s statement as a guideline, it is clear that the success of these graphics depends explicitly on the information they are trying to communicate. Yet the categories of information graphics are formed through visual qualities found across a range of graphics, not through content. Can reasons be found for why this is so, or is it only coincidental? The 2008 book Data Flow: Visualising Information in Graphic Design sorts new approaches to visualizing information into six categories: Dataspheres, Datanets, Datascapes, Datalogy, Datanoids, and Datablocks. These categories are organized by formal similarities that make each one easily distinguishable. Dataspheres are built from circles, Datanets form links between nodes, Datascapes resemble topography but add dimension, Datalogy references human perceptual experiences, Datanoids have human characteristics, and Datablocks are built from rectangles. These categories are not unlike those that are well established in information graphics, such as line graphs or pie charts; they are simply newer approaches that have gathered attention from designers. They are more than groupings made for the sake of organization or pleasure. There are clear foundational reasons why the categories exist as they do and for how they relate to their superordinate category (information graphics).

Rosch uses two general principles to explain why categories are formed: cognitive economy and perceived world structure. Cognitive economy suggests that humans use category systems to gain a “great deal of information about the environment while conserving finite resources as much as possible” (Rosch, 1978, pg. 28). Categories for information graphics demonstrate cognitive economy by reducing many slightly different formal approaches to categories that are descriptive enough to be understood as a whole while remaining efficient to access.

Perceived world structure concerns how objects are seen to resemble other objects in ways that lend themselves to categorization. Rosch defines this principle as a process where “the material objects of the world are perceived to possess high correlational structure” (1978, pg. 29). Rosch uses the example that wings co-occur with feathers more than with fur in the perceived world. In categorized information graphics, the correlational structure is present as shapes or the relationships between shapes. Rosch’s studies on basic-level objects confirm that similarity in shape is an aspect of the meaning of a class of objects (1978, pg. 34). Recognizing shapes as part of the meaning of a class is critical to understanding how categories are structured for information graphics. The shape gives insight into a process that acts as an entry point into accessing the information. Although information graphics exist to show information efficiently while exposing complexity, the content is not the first component processed.

The way that these categories relate to the two principles also reinforces that they are basic-level categories. If someone is shown an example that would be considered a bar graph and asked what category it belongs to, Rosch’s studies indicate they would most likely categorize it as a bar graph instead of a superordinate category such as information graphics. Basic-level categories are best suited for quick recognition with maximum detail. They make the most of cognitive economy and use perceived world structure to form correlations between the members of the category.

At first glance, categories may seem based on nothing but some kind of desire for organization. Yet the reasoning behind them is extensive and meaningful for understanding both how they are formed and, in a larger context, how the world is arranged through human thought. This raises a critical question about how we perceive information graphics: If shape is the basis for categorization and the first component processed, is content secondary to the form that the content exists in?

Evaluating Categories
The categories from Data Flow can be found in many recent artifacts including the 2009 Feltron Annual Report (fig. 1) and the New York Times information graphic Where the Candidates Won (fig. 2). The Feltron Annual Report is a self-initiated project by Nicholas Felton tracking mostly social aspects of his daily life throughout one year. The New York Times’ Where the Candidates Won visualization from September 2009 charts the results of the Afghan presidential election recount after fraud accusations. Both Felton’s record of encounters with friends throughout New York City and the election results visualizations are clear examples of what Data Flow describes as Datascapes. This categorization “switches between topography and topology,” and spatially arranges data that “at once imposes flow, direction, context, and order” (Klanten, 2008, pg. 98).

Fig. 1 / Selection from "The 2009 Feltron Annual Report" by Nicholas Felton.

Fig. 2 / The New York Times, "Where the Candidates Won" from the October 16, 2009 story "Setting the State for the Recount" by Shan Carter, Matthew Ericson, and Archie Tse.

Both of these information graphics can quickly be recognized as Datascapes (assuming familiarity with the terminology) and also employ similar methods of construction to communicate the information. Where Felton tracks encounters with people he knows, the New York Times displays the margin of victory each candidate had in their winning districts.

To investigate how these two artifacts may be understood in relation to each other, it is necessary to look at the order in which different levels of categories are observed. According to two studies referenced by Rosch, “objects may be first seen or recognized as members of their basic category, and that only with the aid of additional processing can they be identified as members of their superordinate or subordinate category” (1978, pg. 35). Because it has been established that Datascapes function as a basic category, it is reasonable to assume that both artifacts are identified as members of this category before any other category level. The additional processing is critical to understanding what the visualizations are about. Looking only at form, it is clear that the spaces depicted are different and that Where the Candidates Won likely charts more variables. When the supplemental language is considered, much more is revealed: the context for the information supplies the reason the information graphic was produced in the first place.

The Feltron Annual Report uses a Datascape to display the locations of encounters with family, co-workers, and acquaintances in New York. This publication is displayed online, publicized, and shown in many contexts that reach an audience beyond those who have a personal connection to the information. This broader audience cannot understand Felton’s relationship to the people included in the report, the reasoning behind the locations of encounters in New York, or why he charts himself as happiest between April and May. To an audience that has no ties to the information, the content can be read, but it can only be processed at a certain level. Perhaps this publication uses categorized data representations not as a way of showing information for an audience to understand, but as a means of making a particular statement about our lives: that seeing ourselves through rigorous data is possible and may be here in the near future.

Distinguishing between understanding the actual information and getting a bigger picture from a series of information is critical in this context. The presentation is clear and concise, and it appears to be accurate while conveying a kind of complexity. But for someone who isn’t Nicholas Felton, what is this complexity? It appears to sit at a different level than the information graphics themselves and to be pushed towards particular conceptual thoughts. Rosch’s notion of perceived world structure accounts for the complexity in viewing this image; the correlational structure puts it in a clear category, but once the content is addressed, processing the image beyond its visual correlational structure is difficult.

In a much different context, Where the Candidates Won is embedded under a brief paragraph that gives general background for the information graphic. Each district in Afghanistan carries an overlay that indicates the winning candidate’s margin of victory there, giving a general overview of how the election breaks down. The use of topography in this graphic situates each district’s outcome in its particular location, allowing the viewer (who is most likely American, given that the article is in the New York Times) to see geographic patterns in the results. This layer reveals a great deal of information quickly to a viewer interested in the distribution of the results. The graphic also breaks down the results into more categories, such as the votes for Mr. Karzai in each district. The information is complex, easy to access, and clearly related to the content. Part of the success of this efficiency is due to information that acts as a precursor to the image.

After initial judgments in categorization are made, the conventions of the class make entry into the visualization easier and faster; however, as one delves further into the information, the categorization becomes less relevant and the viewer must engage with the content itself. If the information graphic does not display clear content, the graphic itself cannot succeed in communicating the content that structures the graphic.

The difficulty in understanding information beyond membership in a formal category at initial processing presents an opportunity for design to explore: Can visual elements in information graphics help contextualize the information at earlier stages to reduce the amount of additional processing necessary in understanding the information? Both examples, like many information graphics, use minimal visual content that only displays the results of the data. Any area around the information graphic that is not descriptive text is left white and untreated. These Datascapes have a visual landscape that is arid, sterilized, and decontextualized. Where the Candidates Won is successfully contextualized in the story through accompanying text, which must be fully processed before the information can be understood. Fortunately, the established hierarchy between bodies of text and image suggests significance to the written content because of scale. The Feltron Report provides much less context for a reader, using small typography to label locations and larger typography to establish the general subject of the page, Distribution (visible on the left header of Fig. 1). If the little text provided to clarify the subject matter of the report were absent, how would any viewer have a clue that this graphic is about meeting up with people throughout the year?

Visually foregrounding the context of information at an earlier stage in cognitive processing is certainly a large and daunting task. But in an age where the frequency of these kinds of graphics is growing exponentially, won’t the situation only get worse if it continues to be treated with the objective sterilization present today? Without visual cues that correlate to other categories, an information graphic means nothing but an aesthetic approach until content is exposed. After all, isn’t the whole movement of visualizing information about the significance of the content itself? The significance of the content is either hidden within itself and unexplained, or it is embedded in text whose cues are limited by the constraints of the written word.

For the future of visualizing information, there is no doubt formal categories are necessary for recognizing and understanding how to read a range of approaches to data. But what about the other half, the foundation that makes the superordinate category of information graphics possible in the first place? Perhaps a new set of categories will be constructed in the future, a system that recognizes the possibility of not only sorting the graphical qualities of information by the formal structure of the information but also by visual cues that help situate the content of graphic information. If these kinds of structures emerge, they will be recognized quickly: they will be the first experiences for audiences to get a glimpse of the purpose of the graphic representation before needing to process a body of text.

Klanten, Robert. (2008). Data Flow: Visualising Information in Graphic Design. Berlin: Gestalten.

Rosch, E., Lloyd, B., and Social Science Research Council. (1978). Cognition and Categorization. Hillsdale, NJ: L. Erlbaum Associates.

Tufte, Edward. (2001). The Visual Display of Quantitative Information, 2nd Ed. Cheshire, CT: Graphics Press.