Seeing Action: A Visual Analysis of World of Warcraft

Amid the flashes and the thrills of a videogame, something profound happens: a player comes to the game and together, they collaboratively create an experience (Holmes 2004). While this is true of any media to varying degrees (indeed, all phenomena are ultimately “experienced” and therefore interpreted, a co-creation of event and experience), a videogame is a unique text, for in a videogame the player is a necessary actor through which the game “happens.” Most analyses of a videogame must confront this fundamental circumstance: that the player and the game interact. Certainly, the notion of interactivity is unsettled and contested (see Gee, 2010; Turkle, 1995; Aarseth, 1997; Juul, 2005, and Wilson, 2004 for an example of the variety of interpretations). For the purpose of this analysis, I define interactivity as the condition through which the player controls certain events within the game world, and that this world informs the choices made by the gamer. So, a player uses an interface (actually, several—physical as well as conceptual) to influence the outcome of the afforded design of the game; and, depending on how previous actions affect the game world, the player then uses this interface to make additional choices, and the cycle repeats.

What are these interfaces, and how do they make the game possible? As noted, they are both physical interfaces—that of the keyboard and mouse or other controller, even the body itself—and conceptual—icons, buttons, cursors, and more complex visual representations as well as aural information and kinesthetic feedback. These interfaces perform distinct but intertwined functions in order to do the game; using the mouse to move the cursor to click a button to take an action is a chain of interrelated functions that constitute game play, at least on a mechanistic level. Game play requires both the physical and conceptual interfaces; however, this analysis focuses on the latter category and considers the manifestations of the on-screen interface to describe how players make sense of the world in which their actions take place.

This interpretation of how a videogame works relies on several important assumptions, which need a brief explication. It should be clear from this working definition that a videogame is a dynamic space; change occurs within the gamespace, and the “text” is in a state of constant flux. Gameplay is both an instantaneous “in-the-moment” experience as well as the collection of all the previous “states” of play that lead to that particular instantiation of the game. However, though the text of a videogame is always in motion, it is not infinitely variable; there are limitations on what actions a player may take and how those are represented.

This, of course, implies that videogames are designed texts and are created with particular affordances and limitations as a vehicle for the player to co-create and experience the game. There are instances of emergent rather than designed games, though few, which I will ignore for the purposes of this analysis. Further, because a game is designed as an experiential space, the designer creates the text with the player’s actual performance in mind; that is, the designer makes choices that help the player actually play the game. Designers may withhold “complete” tool sets from users in order to provide some level of challenge (and, therefore, purpose) in the game; indeed, most of the “best” games are arguably ones in which the actions, information, and tools available are “pleasantly frustrating” (Gee, 2006). Nevertheless, the designer intends that the game be played, and therefore creates a space in which a player can learn how to navigate within the world and perform the actions necessary to progress in the game.

Here, then, is another key element of games—that of progression. While varied in its implementation (as we shall see), progression refers to the notion that players work towards a goal, and that the game operates as a channel through which that work occurs; gameplay is a function of the change of states of the player through their interaction with the game. This is an important consideration for two reasons; first, it implies that the player must transition from novice to master of the discourse of the game in order to make progress (Gee, 2010); second—and more important for this analysis—it also implies that the designers must create a system through which players learn this discourse in order to master it. More specifically, game designers utilize the interface as a resource to provide information for the player to make meaningful and then use to take action.

What are the resources that designers can use to provide this information? To answer, it is necessary first to consider exactly what the information conveyed actually is. Generally speaking, it includes player status (health, condition, etc.), configuration of the gameworld, the relationship between the player and other actors within the space, and the actions that a player can take in a particular situation. What information is actually presented —and, more importantly, how—varies from game to game; the world of Pac-Man is in many ways far different than the world of Tetris, which is different than the world of Super Mario Bros., which is different than the world of Grand Theft Auto, and so on. There are, of course, certain conventions within and between games that designers can utilize to streamline the interface by providing familiar or standardized resources for meaning-making, including things like health bars, cursors, maps, and point-of-view, among others.

So, we can refine the question even further: what resources can designers use to support this progressively dynamic interface through which players interact with the game? Further, how do these structures support the progressive nature of gameplay, both in terms of “moving” through the game and in terms of players’ progressive understanding of the game as they shift from basic performance to high-level performance? Using the massively-multiplayer online game World of Warcraft as a lens to focus this analysis, I contend that the designers of this particular game, Blizzard Entertainment, Inc. (hereafter referred to simply as Blizzard), use two primary methods: the orientational structures of the interface itself and the interface’s just-in-time/on-demand nature. Through these structures, the designers teach a player what information is salient, and when, so that the player can take action and “do” the game; they also provide spaces for players to learn to master the game itself.

Theoretical framework

This analysis borrows heavily from the theories outlined in Gunther Kress and Theo van Leeuwen’s Reading images: The grammar of visual design (1996) and subsequent writings. In particular, I have adapted their work on the composition of an image to discuss the player’s relationship to their avatar, to the world around them, and to their ability to act. Kress and van Leeuwen define composition as “a third element: the composition of the whole, the way in which the representational and interactive elements are made to relate to each other, the way they are integrated into a meaningful whole” (1996, p. 181). There are also significant “representational” and “interactive” elements that the designers use to create the interface of World of Warcraft, but to understand how the interface functions in such a dynamic space requires considering these parts collectively as well as individually.

Importantly, I have made these adaptations somewhat liberally to describe this dynamic space; some features of Kress and van Leeuwen’s theories fit more aptly than others. Describing a video game such as World of Warcraft is an exercise somewhat akin to describing the shape of boiling water; you can say that it is roughly cylindrical in shape (depending on the shape of the pot it is boiling in), but it becomes far more difficult to describe the exact topography since it is constantly in motion. Nevertheless, it is possible to give a general sense of the shape and composition of the object, so it is in this spirit that I have proceeded with this analysis.

Among the elements that align more closely with the study of a videogame, point-of-view, framing and salience are perhaps the most critical. For Kress and van Leeuwen, point-of-view describes the position of the viewer in relation to the actors and objects within an image. Meaning for the viewer arises in part from how they are situated in relation to the “subjects” of the image; viewed from above, the viewer assumes more power in relation to the “object” of their gaze, while the opposite holds true as well. Similarly, the distance at which this object is viewed helps determine the social relation between the viewer and the subject; an extreme close-up of a subject implies a close, socially-intimate relationship, while a character seen from far away is detached and remains a stranger to the viewer. Point-of-view is a key socially relational tool.

Framing is a relational tool as well, though it concerns less the viewer’s relation to the subjects of the image than the viewer’s relation to the information conveyed. Framing refers both to visible frames (such as a box around an object, or even the borders of the image itself) and to invisible or implied frames (including objects aligned with each other). I also include vectors between objects in my definition of implied frames, though Kress and van Leeuwen consider vectors more as interpersonal rather than compositional attributes. I contend that vectors between elements can create a frame that links the objects, and this linkage is itself a kind of frame in the sense that Kress and van Leeuwen describe in Chapter 6 (1996). Further, I consider Kress and van Leeuwen’s notions of “given and new” and “real and ideal” (1996, pp. 186-193) a particular kind of framing technique; while they consider the “given and new” and “real and ideal” as part of the “information value” of an image, I extend framing to include this informational value on a somewhat literal level, as part of the framed image placed in a particular spot. For Kress and van Leeuwen, the left/right and top/bottom orientation of the information in an image (and therefore, how it is framed within the confines of the image) provides a kind of internalized narrative where the information on the left of the image is “presented as something the viewer already knows, as a familiar and agreed-upon point of departure for the message” (p. 187), while information on the right of the image is “presented as something which is not yet known, or perhaps not yet agreed upon by the viewer, hence as something to which the viewer must pay special attention” (p. 187). Similarly, top/bottom orientation provides information about the objects within an image and their status in relation to the viewer of the image.

The third of Kress and van Leeuwen’s elements critical to this study is salience, or to what degree particular information is important in the image. Size, contrast, and location within the frame of the image all play a part in enhancing or minimizing the importance of a particular object; a large, centralized, strongly contrasted object appears more “noticeable”—and therefore more “important”—than a small object in a corner of the image that blends into its surroundings. Salience is also an informationally relational tool in a similar sense to framing; it helps the viewer discover what information might be important by highlighting it and drawing attention to it.

Finally, it should be noted that while I align myself to a social semiotic view (in particular as described by Kress, 2010 and Jewitt and Oyama, 2001, among others) in which the resources used in the creation and interpretation of a text are contested and negotiable and meaning-making is a multiparty endeavor, for the purpose of this analysis I have limited my focus primarily to the structural elements of an image; that is, the means through which designers attempt to convey meaning, not the meaning itself. It is important to consider “what” an icon represents in and of itself, or how a viewer might interpret the representation, but this analysis centers more around the tools used than the meanings made. This is not to suggest that I avoid the meaning of specific resources exactly; rather, I treat these “meanings” on a somewhat functional and generalized level. Not all icons mean the same thing, for example, and each conveys specific (and potentially unique) information. This study, however, does not cover these individual meanings beyond recognizing that they are also capable of being misinterpreted or ignored completely by a particular viewer. Further, I operate under the assumption that many of these tools are conventionalized within the genre of videogames and that users have a general understanding of these conventions, if not specifically within World of Warcraft.

Of course, all of the techniques described so far can be manipulated for artistic or narrative purposes, but in general these principles create a structure of composition that helps convey meaning. Together, these compositional elements create what I call orientational structures, through which the designer of a text helps to “orient” a viewer to the information the designer attempts to convey and provides tools through which the viewer can co-construct the meaning of the image.


To analyze World of Warcraft, I have selected several representative screenshots of the game which provide a cross-section of different features. In particular, I will focus on two images (Figures 1 and 2) which provide a strong sense of the dynamic nature of the gamespace. I have attempted to choose views and situations that players commonly encounter rather than looking for images that represent the extreme ends of a player’s experience; however, the very shape and function of the game suggests that any images will show variation, sometimes profoundly. So, for instance, Figure 1 is the first screen a new player would encounter after creating a character, while Figure 2 is an example of a high-level character in combat with a group of other players. Certainly, not all players progress to end-game content and may not face this exact situation (though they will likely encounter something similar even at lower levels); however, all players start essentially at Figure 1 (with some variation based on the type of character they create). Since this study assumes players progress through the game towards higher-level content, Figure 2 can be considered representative of the “average” player of the game.

Figure 1. New character view

The initial view of a new character

Figure 2. High-level character view

The view of a high-level character in combat as part of a group of other players

World of Warcraft also offers a considerable breadth of content and possible scenarios, from fighting enemies to buying and selling goods, communicating with other players, and a host of other in-game activities. Importantly, these activities often occur through changes in the “standard” user interface (via things like dialogue boxes, skill “windows,” maps, and others). Figure 1 represents what might be considered the “standard” view when the player is at rest; the elements in this image (with some exceptions) are always present, excluding these special circumstances. Similarly, Figure 2 is a “standard” view while in combat and grouped with other players, and the elements in place here are normally present while in this state. I have included several examples of these changes to the “standard” user interfaces to refine and complicate these normal views, but Figures 1 and 2 can safely be assumed to represent the “normal” images players encounter when starting the game and playing it at a high level.

Further, it should be noted that the interface of World of Warcraft is customizable and extensible; players can manipulate the interface to move information and components to different places, hide information altogether, or add to it via designer-included choices or third-party extensions. For example, by default the health bars of players are in the upper left corner of the screen; the player can move this box to anywhere else within the screen using tools provided by Blizzard; many players choose to replace the default health bars with a third-party add-on that offers greater cosmetic and informational options. This study includes only default settings included in the game (with some unintended variations such as numerical text on the health bars—turned off by default—and a few third-party add-ons which do not greatly affect the overall study and can effectively be ignored beyond mentioning them). And, because the game can be played on a variety of computer systems with different screen sizes and resolutions, it is important to recognize that the exact position of elements—even by default—varies per user. These differences are noticeable (more “open space” around elements and distance between them), though the overall position of elements remains nearly the same (action bars at the bottom of the screen, map in the upper right corner, etc.). I have used a consistent aspect ratio for all screenshots here which is near the middle of the options available to players (specifically, I used a resolution of 1280×900—the size of a mid-range laptop screen), though resolutions can range from 800×600 up to 1920×1080 and beyond.

It is again necessary to recognize that World of Warcraft is a “real-time” game in that events happen immediately, and the screen is constantly animated not only by user input but by circumstances in the environment as well. The game is organized around what Kress and van Leeuwen refer to as a temporally dominant integration code, where events happen over time, though the screen is also organized around a spatial integration code. The view of the user is constantly in flux and in motion; using screenshots for a visual analysis is a necessary constraint of the nature of this study, so when discussing what a player sees in terms of a particular screenshot, it must be assumed that the screen is nevertheless changing. Again, the images selected are useable “standards” for different situations (particularly at rest and in combat) but are still devoid of perhaps the most important visual characteristic of the game, that of motion.

Orientational structures

To organize the specific orientational structures World of Warcraft employs, it is necessary to differentiate what kind of information is being conveyed, and how, before bringing it together to understand how a player uses this information to play the game. In particular, how players are oriented to their in-game representation (the avatar), to the world around them, to their abilities to act, and to other “feedback data” each provides a particular example of the structures at work. Taken together, these structures support the other key function of the game, which is covered in the later portion of the analysis.

Player character

In World of Warcraft, the player is represented through an avatar, a kind of “puppet” that the player controls in order to navigate the world and execute their desired actions. While the relationship between the player and the avatar is complex, in many ways the player “becomes” the avatar and vice versa; what happens to the avatar affects the choices a player makes, and those choices are then executed primarily (though not exclusively) through the avatar. How a player sees “themselves” in the game, then, is a critical component of their ability to act meaningfully with the game world.

One of the clearest ways of visualizing the player-in-space is through point-of-view. Some games exclusively use a “through-the-eyes” or first-person view; the player does not see themselves at all, but rather takes a position of looking at the world as if they themselves were in it. This genre is represented best by “first-person shooter” or FPS games such as Call of Duty, Halo, or Metroid Prime, though other genres, such as racing games and role-playing games, use first-person views as well. Other games provide a third-person view exclusively; games like Pac-Man, Gears of War, and the Madden football franchise put the player over the shoulder or looking down at their in-game representation. Still other games allow users to change their point-of-view; World of Warcraft falls into this latter category.

Figure 3 represents the default camera view of the game. Figures 4-9 represent other possible camera angles that a player can use, from an extremely far distance (Figure 5) to a close-up of their character’s face (Figure 6) to a top-down view (Figure 7) to a first-person view (Figure 9). To change the camera angle, players can either cycle through preset camera distances and angles provided by Blizzard or use their mouse to manually change the distance and angle. Players can rotate 360 degrees horizontally around their character, and through 180 degrees over the top of their character, down to ground level; they can also zoom from first-person out to about 50 yards away from the character. Importantly, however, as we shall see, the camera always remains centered around the avatar. The camera, therefore, is a free-floating, user-controlled tool that occupies a half-spherical space of about 100 yards in diameter centered around the player’s character.
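The geometry of this camera can be made concrete with a short sketch. What follows is an illustration only, not Blizzard’s implementation: the class name, field names, and default values are hypothetical, and the 50-yard limit is simply the approximate figure given above. The sketch describes the camera by a horizontal angle around the avatar, an elevation above ground level, and a distance, and clamps each value so that the camera can never leave the half-sphere centered on the avatar.

```python
import math
from dataclasses import dataclass

# Assumed from the essay's "about 50 yards"; not an official game constant.
MAX_DISTANCE_YARDS = 50.0


@dataclass
class OrbitCamera:
    """Hypothetical camera that always orbits the avatar."""
    yaw_deg: float = 0.0      # horizontal angle around the avatar
    pitch_deg: float = 20.0   # elevation: 0 = ground level, 90 = top-down
    distance: float = 15.0    # 0 = first-person view

    def clamped(self) -> "OrbitCamera":
        """Return a copy with every value forced into the allowed range."""
        return OrbitCamera(
            yaw_deg=self.yaw_deg % 360.0,
            pitch_deg=min(max(self.pitch_deg, 0.0), 90.0),
            distance=min(max(self.distance, 0.0), MAX_DISTANCE_YARDS),
        )

    def position(self, avatar=(0.0, 0.0, 0.0)):
        """Camera position in world space; z is height above the avatar."""
        cam = self.clamped()
        yaw = math.radians(cam.yaw_deg)
        pitch = math.radians(cam.pitch_deg)
        ground = cam.distance * math.cos(pitch)  # horizontal offset
        ax, ay, az = avatar
        return (
            ax + ground * math.cos(yaw),
            ay + ground * math.sin(yaw),
            az + cam.distance * math.sin(pitch),
        )
```

In this sketch, a full 360 degrees of yaw combined with 0 to 90 degrees of elevation sweeps the same half-sphere as the arc over the top of the character described above, and a maximum distance of 50 yards gives the 100-yard diameter. Because the position is always computed as an offset from the avatar, the avatar stays at the center of the view wherever the camera moves.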

Two key considerations take place within the player’s relation to their avatar. First, they remain both detached and connected to their player characters at all times (very few players use the first-person view as it is impractical for most tasks in the game which depend on high spatial awareness, so third person is the dominant view). Second, their character is always at the center of all actions that they witness in-game (again, discounting special circumstances like trade skills windows and cinematic “cut-scenes”). Both features play a significant role in orienting the player to themselves in the game and in learning to take action.

Kress and van Leeuwen provide a tool based on the work of Edward Hall in order to gauge social distance in images, which I adapt and expand for World of Warcraft. According to Kress and van Leeuwen, there are several boundaries with which we describe our social distance from another person (or, in some cases, object); these distances in turn help define our relationship and level of intimacy with the subject, and therefore our ability to analyze, empathize, and identify with it. These boundaries range from close personal distance (which implies a high degree of intimacy) to far personal distance (which decreases in intimacy, but is still at a personally intimate level) through close social distance (as in that of acquaintances or business relationships) to far social distance (that of “inspection” or observation) ending finally at public distance (the distance at which we interact with strangers; a detached relationship).

Interestingly, the camera in World of Warcraft represents these boundaries rather well, if considered somewhat liberally; the default view (Figure 3) stands at a somewhat far personal distance, where the player can see their character as the most important and largest figure, and can assume an intimate-enough relationship to transfer their identification to the avatar; Figure 5, on the other hand, stands at the far end of the spectrum and is a kind of public distance, where a player can focus much more on the surroundings at the expense of seeing “themselves” in the space. Importantly, Kress and van Leeuwen argue that “the relation between the human participants represented in images and the viewer is once again an imaginary relation. People are portrayed as though they are friends, or as though they are strangers” (1996, p. 132, emphasis in original). So any kind of relationship assumed between the player and the avatar is an imaginary one, but one supported by the social distance at which the player views the avatar. To Kress and van Leeuwen’s scale we can add first-person view, where the player actually assumes the view of their avatar; what this does to intimacy and social identification is subject to some debate, certainly, but I suggest that it might provide an even greater conceptual link between the player and their in-game representation and promote a higher degree of identification. Again, though players do not often use the first-person view, it becomes more useful for taking scenic screenshots or getting an “on-the-ground” view of the world around the player.

However, the player’s identification with the avatar also depends on the avatar’s position within the frame of the screen as well as the distance of the camera. That is, even zoomed out to the maximum distance, players are still centered on their character, so any exploration of the surrounding space is related directly to its relationship to the avatar itself. This centralization of the avatar enhances the identification potential of social distance by tying the world to the character regardless of the distance of the camera. In this way, point-of-view and position of the avatar are a linked structure that connects the player to the avatar through which they act, both conceptually (the player sees the world in relation to the avatar) and literally (they cannot not see their avatar, barring some kind of obstruction, which I will ignore here).

But the centralization of the character plays another role as well. By tying the view of the avatar to the center of the screen and relating the rest of the world and other information to the avatar, the player learns to navigate the world through that character and to pay close attention to it; it becomes salient in that all action flows through the relationship of the viewer and avatar, and all other information becomes important in relation to that centered view. As Kress and van Leeuwen argue, “[f]or something to be presented as Centre means that it is presented as the nucleus of the information on which all the other elements are in some sense subservient” (1996, p. 206). As the in-game representation of the player, it is important to ensure that the player pays attention to the avatar; Blizzard utilizes a structure that puts the character at the heart of their every encounter with the world in order to emphasize its relative importance to the player.

Game world

But the avatar is only one way of relating to the game world itself; there is also the surrounding space in which the player acts (the environment) as well as other players, objects, and actors with which they interact. These objects and places not only “flesh out” the world but provide a way to orient the player to their potential actions and to understand those actions and how they will in turn affect the player. That is, not only are they the things a player uses and the space they use it in, these elements provide a way of actually understanding those places and objects by positioning them in relation to the player.

The world itself is quite literally the terrain that the player navigates, from the plains and hillsides of the world to the structures and buildings that populate it. It provides a “place” for the player to perform their actions. Because World of Warcraft presents a stylized-but-realistic world which follows many of the same rules as our everyday world (characters generally obey the same laws of gravity, for example, and are humanoid in appearance), it is important that the world is manifest in a “naturalized” way. By this, I mean that players see the world as if they were really in it: walls are solid and cannot be looked through; the ground similarly is “impenetrable;” and the world follows similar aesthetic norms like houses and castles and even conventionalized “fantasy” aesthetics like dragons and demons, which are unreal as everyday lived experiences but nevertheless real in the imaginative sense. World of Warcraft simply utilizes these conventions to help “ground” the player in the world in which they act.

More literally, the world players navigate surrounds them completely in the figurative sense (they are “in” the world at all times) as well as a literal sense (they are always seen “in” the world in that it is ever present in the frame of the screen). Figure 10 provides a good example: here, a player is navigating a major city, walking along a paved street alongside a canal on one side and a building on the other. The world fills the frame, and serves as a kind of “background” on which the player acts, as well as for other informational elements of the interface. The world is foundational in several ways; as a backdrop for action, it is a critical way to orient the player to where they are acting, which also influences how they act (they won’t try walking through a brick wall, for example, or off the edge of a cliff); but it is also foundational in a structural sense in that it provides the boundaries of the viewed world and the space on which all other information is projected. The world is thus both important (it’s the context for action) and sublimated (it’s a background). In terms of salience, then, the world can be both important (the player must pay attention to where they are acting in order to decide how to act) as well as less-important (much of the world can be “ignored,” such as the bricks in the wall or the buildings in the distance and the player can still function effectively).

The player also has access to a map of the world (Figure 11, which is actually one part of a much larger map accessible to the player; for more on maps, refer to Figures 23 – 26). This map follows many of the conventions of “real” maps (it is north-oriented, it is two-dimensional, it uses symbols to represent things like roads and buildings), but most importantly, it serves as an alternative way to conceptualize the space the player inhabits. The player can open the map interface and see their current position within a particular zone; they can also zoom out to see how those zones are related to each other, as well as zoom into other zones which the player doesn’t currently occupy. In this way, the player is oriented to the world not in a “literal” sense (the world in which they find themselves immediately) but in a more abstract way (their place within the lay of the land, so to speak; where they sit in the space of the world). While this is not the primary form of orientation in the world, it is worth noting that players have access to multiple conceptual avenues to understand where they are.

The game world, however, isn’t limited to just the territory of the game; it also includes objects and players with which the player interacts. Because World of Warcraft is a multiplayer game, the player exists within a world filled with other players playing synchronously (Figure 12, a typical scene in a capital city with dozens of players going about their business). And because the space is a fully realized “world,” it is populated by objects for the player to use (Figure 13, a portal to another zone that the player can click to use). Here, it is difficult to discuss Kress and van Leeuwen’s notions of social distance to other actors, since it can change with a simple mouse click; however, it is possible for the player to stand near another player and zoom in via first-person view on the other player’s face and body (viewing them from the close personal distance). In everyday practice, however, most other players remain closer to the far social distance, where their full bodies and space around them are visible (Figure 13 shows a good range of distances between actors; some are nearby while others are more distant). Indeed, the bodies of the characters are not in general the most distinguishing feature of another player (though they might be the most noticeable); instead, players have “nametags” above their heads that identify them and their affiliation to a guild (a formal group of players who work cooperatively to meet particular goals). Because of the limited set of cosmetic options available to a character (both the body size and shape and the gear/clothing they can wear), many players appear to be identical; it is only the player names and affiliations that distinguish them. Nevertheless, these objects and players also help orient the player to the world by showing them what they can act on and who else might be involved. In this way, the world is more cohesively presented as a “real” space that the player can act within.


These actions take several forms: they are locomotive (the player moves from one place to another), they are interactive (clicking on a portal, as in Figure 13), and they are more abstract (casting a spell or dealing damage). Interestingly, the last of these forms the core of actual gameplay; that is, the “point” of World of Warcraft is to fight enemies and other players in battle and to overcome these challenges, and the actions involved in battle primarily fall into that abstract category. Figure 1 shows the initial screen a player encounters when first creating a character; along the bottom of the screen appears a row of buttons containing several icons, and these icons are the “action potentials” a player can take. In the case of Figure 1, the character can swing their sword or axe (the first and second icon, respectively), cast a damage-reducing spell (the isolated icon near the middle of the row), or access “functional” components of the interface like the character info pane, quest log and others (the right segment of the action bar). As players progress in the game, they gain more abilities, and the interface provides more space for additional icons (and, therefore, ways the player can act). Figure 14 represents a high-level character out of combat; here, the single row of action-potentials used by the new player has expanded to include a second full horizontal row and portions of a third row, and two vertical rows of action buttons along the right margin of the screen. Even this view does not include all the ways a player can act; there are several additional rows of icons they can scroll through within the bottommost action bar to access further actions.

A player is presented with a large number of ways with which they can act in the world; however, it is important to note that these actions are contextual and cannot be used at all times—damage-dealing abilities like spells and attacks cannot be cast outside of combat (generally, though there are some exceptions), while some actions such as fishing or trade skills cannot be used while in combat. So even though the action-potentials appear at all times on the screen, they are not useable at all times. This notion will become more important in subsequent sections of this analysis, but here it is enough to acknowledge that the icons are ever-present but not ever-useable.

Also worth noting is that the interactive object for the player is split; it is primarily the avatar through which the player moves and acts on the world (the vehicle), but the player also uses a cursor to click on objects within the interface (what I call a “task related interface object,” borrowing from van Vugt et al. [2007]). This cursor changes shape depending on the potential action available, from using the in-game mail system to selling goods. Figure 15 shows several of the different permutations of the hand icon. This division between a vehicle that carries the player and a task related interface object that performs specific actions sets up an interesting dynamic for the player. On the one hand, it separates them from fully embodying the avatar, since they can manipulate the world via the cursor; on the other, it serves to tie interface objects such as the action bars to the avatar, since clicking on an action button makes the avatar perform that specific task. The division of body from the action selection device (the cursor) is primarily a way to link the interface with the world itself.

The position of the action abilities also helps the user focus on them; by being placed in the lower portion of the screen, they too serve a foundational role, for they are at the heart of all the things that a player does in the game (with previously noted exceptions, here ignored). Because they are central to the actual performance of the game (that is, in the fighting of enemies), these actions appear in a portion of the space that is literally the base of the screen. Indeed, the placement of the action bar at the bottom of the screen is significant when considering Kress and van Leeuwen’s notions of “real and ideal”; for them, the lower portion of an image contains information of a practical and specific nature, while the top portion shows the “idealized or generalized essence of information” (193). The “real” information of the lower portion of an image is what the viewer works with, while the upper portion is what they aspire to. While this fit is not exact in terms of World of Warcraft, it is nevertheless demonstrated by the placement of these interface elements: the actions available to the player, the “tools” to work on the world, are at the bottom of the screen and represent the “real” world actions they can take, while the upper portion of the screen shows status-level information like player health (which I cover in greater depth in the next section). These status-level elements of the interface represent what the player ultimately wants to accomplish (maintain their own health, deplete their enemies’ health), while the action-potentials represent the specific means to do so.


The interface, however, also includes other elements beyond the action-potentials represented by the icons of the action bar, including (as noted above) player and target health, but also a mini-map in the upper right corner, a chat window in the lower left portion of the screen, party-member health, real-time information like damage taken and dealt, and other information. Collectively, this data serves as additional orientation for the player by providing necessary information based on what they are doing. Figure 16 provides a good collection of these interface elements, which I have labeled. It is important to note that in this image, a great number of elements of the interface appear simultaneously because the player is in combat: target status, party status, and combat text all appear. Compare this to Figure 14, where the player is at rest; many of these elements are not present. Figure 14 represents a good example of the “basic” user interface, the elements of the interface that appear at all times regardless of status. Figure 16, on the other hand, demonstrates the dynamic nature of the interface by including additional information that changes based on the state of the player.

Some of this additional information appears automatically: combat text, for example, appears whenever the player is engaged in fighting an enemy. Some of this information is available at the player’s discretion and accessible through a keyboard button or other interface button; Figures 17 and 18 represent these “on-demand” elements. Figure 17 is a player’s inventory, the bags in which a player can place equipment, food, and other objects for later use, while Figure 18 presents the character info pane, which contains more abstract concepts like player statistics (strength, agility, spirit, etc.). Again, these concepts are conventionalized and borrowed from games such as Dungeons and Dragons, so World of Warcraft does not explicitly state what these statistics mean but assumes the player understands them, at least generally. The player has access to both things with which to work (objects in their inventory) and the concepts through which they use these objects (their relative strength or their health pool, and therefore their ability to effectively wield a sword, for example).

Structurally, the interface elements are aligned within the frame of the screen in particular ways. As noted above, the action bars occupy the bottom portion of the screen, and can extend up vertically along the right margin, while status-level information (player health, statistical bonuses, and their place in the world via the mini-map) occupies the top margin of the screen and can extend down vertically along the left column of the screen. These interface elements therefore occupy the margins around the centralized avatar, and create a frame around the player which contains pertinent information. Combat text occurs within the space around the player on top of the game world and complicates this relationship, but for the sake of this analysis I consider it part of the frame around the character; the frame just has somewhat fuzzy boundaries. Nevertheless, this frame reinforces the centralizing features of the avatar by again tying all of the information around the avatar. Figures 19 and 20 describe the general shape of these frames.

The information within the frames takes on various levels of salience for the player. Again, since these elements are tied to the avatar by being placed around it, they can be assumed to be important and related to the status of the character. Since many of the elements of the interface are always present, they can again be assumed to be worth paying attention to, since they likely provide information necessary to the player. These elements contrast to varying degrees with the background of the game world; first, though the game world is in motion, these elements remain in place (the player looks through them as if they were an overlay on the camera lens); secondly, they have distinct borders around them which separate them from the background (the health bar, for example, features a heavy border around it and includes a portrait of the player). Because they are separated from the background, the elements stand out and again can be assumed to contain valuable information for doing the game. Finally, since they represent concrete examples of abstract concepts (player health is represented by a “health bar,” while actions like casting a lightning spell have distinct icons associated with them), the juxtaposition highlights the physical manifestation of these abstract abilities and statuses. That is, they can stand out to the player since they are an uncommon representation of particular information; this “uncommonness” helps to emphasize the information and make it more meaningful to the player.

Orientation and progression

So far, I have discussed the structures in place to help orient a player to the game world, and some of the ways that these structures emphasize certain information as important to the player. But what about one of the primary assumptions described at the beginning, that of “progression” through the game? Here, I mean progression both as moving from point to point (following the narrative of the game, for example, or gaining skills as the player gains levels) and in terms of the player’s performance as they learn to master the space and their actions within it. How do these structures support this progression of the player’s interaction with the game?

The answer lies in several features of the interface which correspond to these multiple definitions of progression; first, as players gain new abilities and experience more of the world, the interface changes to provide more information to them; secondly—and perhaps most importantly—as players experience the world and interact with it, they learn what information is important and when in order to act more effectively. These dual features require further explication, but together they provide tools for the player to learn to do the game.


Capacities for the player include both action-potentials and conceptual information about the world and about themselves. As players progress in the game, they gain levels and abilities. Every other level or so, a player will have access to a new attack or healing ability; at level 1, they begin with only a handful of abilities (Figure 1, for example), while at higher levels, players can have several dozen abilities (Figure 14). In a very literal sense, then, their ability to act has progressed from a few to numerous ways to act within the world. Similarly, there is a progression in the amount and type of things the player can interact with. Figures 21 and 22 show the initial inventory pane and a high-level character’s inventory. As the player gains levels and progresses through the game, they gain more space to store objects and encounter a greater variety of types of objects (from equipment to trade skill materials like herbs and leather to quest-related objects and others). So not only has their ability to act increased, but the types of things they interact with and the spaces for those things have expanded as the player levels.

Another example of capacity progression is found in the map. Figures 23 – 26 show the map changing as the player explores a particular zone. In Figure 23, the player has not moved around in the zone, so the map displays only a rough outline of the area. Figure 24 shows a small part of the zone “filled out” with more detailed information; the player has walked around within that part of the zone, so the map shows more details. By Figure 26, the player has explored the entire zone, so the map has been “completed.” The player now has access to information like the location of a major city, roadways between settlements, and other geographic data. It is through the actual exploration of the space that the player gains this access; and as they progress, their knowledge base expands accordingly.

This notion of progression also occurs within the character information pane (Figures 27 – 29). The initial view a player has of their character information (Figure 27) shows a small portrait of the avatar and icons representing the gear currently equipped. When first starting a character, the player has only a few of the gear “slots” filled, but as they gain levels and better gear, they fill these slots (Figure 29). These changes in gear also alter the appearance of the avatar. Further, players can turn on more detailed information about the character, including statistics and other conceptual information (Figure 28). Here, a player can quite literally witness their character progress as these stats increase.

This notion of progression, however, is more nuanced than I have so far acknowledged. In describing the player as gaining levels and abilities, I assume the position of the player; that is, I am referring to how the player progresses in the game. As they level, they gain abilities and objects to act through. However, from Blizzard’s point of view, progression works almost in opposition to the player’s perspective. Blizzard designs the game around the high-level abilities and content, and then must pare it down for new players. They create all the abilities and subsequently limit these abilities to certain thresholds (character level primarily, but also actual performance, as in the case of the map). Rather than giving players new abilities, Blizzard eases the restrictions on a player as they level. This observation is somewhat tangential, but it is nevertheless important to consider when thinking about the designer’s ability to provide meaningful structures through which the player learns the game.


These meaningful structures, then, are put in place not only to support the player’s current actions, but to encourage them to learn how to use the interface in order to progress towards higher content. To do so, players must understand what information is important to them at a given time for a certain situation. We have already seen some ways in which information is made salient to the player, from the centralization of their avatar to the framing of the status information around them. However, salience also depends on this developing notion of progression, in that players learn what information is important according to when they encounter it in their gameplay. That is, the orientational structures of the interface provide clues to the salience of particular information only through the player’s evolving experience with it. To demonstrate this requires a bit of narrative.

A player is engaged in combat with an enemy as part of a group of players, a rather common occurrence in World of Warcraft. The screen they see resembles Figure 30, with their character, other players and objects in the world, status information and action-potentials all simultaneously displayed. Because the situation is familiar and they have performed similar actions hundreds of times before, their focus likely resembles Figure 31; they pay close attention to their avatar to see if they are taking damage and where they are in the space, and they are looking at their abilities in the lower left corner of the screen to know which one is available to them at the moment. The rest of the interface “fades” out of their focus as the information it provides does not matter to the player much at the given moment. It is available to them (it looks like Figure 30, after all), but they don’t pay attention to it.

However, in the course of the fight, something goes wrong; maybe the enemy takes longer to die than the player expects, or they themselves start taking increased damage. The group leader starts issuing commands through the chat box for the players to disengage and fall back. The player’s focus shifts from their avatar to the chat box (Figure 32); they need to know the commands given in order to successfully operate as part of the group. After reading the order to abandon the fight and run to safety, the player’s focus might again shift up to the status-information of their own health and the health of their party members (Figure 33). Now, they need to know how long they can expect to live and whether any of their group can help them survive the escape to safety. Again, all the information is present at all times; only the player’s focus has shifted to call upon information that matters to them in a given situation.

How does this shifting focus occur? How do players know when to seek out pertinent information? In short, how do players know what information is salient at any given time? Partly, the structures already described (centralization, framing, etc.) support the players’ learning when to call on information. The other part of the answer lies in the “just-in-time” and “on-demand” nature of the interface. That is, players are provided with important and timely information when they need it, and they can call upon that information to make it meaningful. Further, this process is refined through continual use of the interface and its progressive nature.

By just-in-time, I mean that information is available to a player when they need it. In combat, for example, text appears displaying the damage they are taking and dealing. This information is not available out of combat, since the player doesn’t need to know that they are not doing damage. The game displays that particular information only when it might be useful to a player; in leaving it off the screen at other times, the game is not withholding anything the player needs. Similarly, the game does not present much unnecessary or superfluous information (within certain constraints, as we shall see shortly). The interface is designed by Blizzard to provide information to the user when they need it in order to act.

However, just because the information is available to a player does not necessarily mean the player actually makes sense of it. Confronted with Figure 2, for example, a new player would likely be overwhelmed by the sheer amount and variety of information and not be able to actually play the game. Instead, the information presented must be available “on-demand” to the player. By this, I mean both that it is literally available (they can access it when they want to, such as by opening their inventory or character pane) and that it must be accessible to them when it matters. The latter is by far the more important feature, and perhaps at the very heart of gameplay, for understanding how to operate within the affordances and limitations of the game world is playing the game. In other words, knowing when to call upon the resources available to the player in order to act constitutes game play itself.

How players gain this understanding of when to call upon given information is supported both by the structures in place within the interface (the placement of objects, the framing of the screen, and so on) and by players’ continuous experience with that interface. As players progress through the game, they encounter variations of the same basic mechanics (fighting an enemy, for example). These variations are “practice sessions” for higher-level content; at level 1, players might fight a single enemy by themselves, and during the fight the basic combat interface is visible (including health, combat text, orientation to the enemy, etc.). The game is “tuned” to be more forgiving to new players, so that their starting health is usually much higher than that of the first enemies they encounter; they can notice their health bar being depleted but with enough time to compensate and kill the enemy or run away. Similarly, they are given a few basic abilities to use in the fight, and can learn where to go to access those abilities. As they progress in level, these encounters become more complicated, until they reach high-level content like that shown in Figure 2. By this time, they have encountered the various elements of the interface enough times to know when to call on each of them, much like the description of the player’s shifting focus above. Through playing the game and using the interface repeatedly, players learn how and when to access the elements that are most important for their continued action within the game world.

Conclusion and further study

In essence, then, the designers of World of Warcraft (Blizzard) utilize the orientational structures of the visual interface in order to teach players what information is salient and when, so that the player can take actions and do the game. These structures include the player’s relation to their in-game representation and to others within the world itself, as well as the way information is framed around the player to connect status-information and action-potentials to the character (and thus, the player’s ability to act). But these structures also rely on the player’s engagement with the elements of the interface over time, and on the progression of both the player’s capacities and their meta-level knowledge of the interface in order to perform at a high level. This high-level performance is the ultimate goal of Blizzard, and the structures they employ support a player’s progression towards this goal.

This analysis is focused at the textual level itself; that is, it is centered around the screen that Blizzard designs and the player uses. Further refinement of this analysis is possible at all three levels: by interviewing the designers themselves about their intentions and assumptions, we can reinforce or refute some of the claims made in this analysis, particularly in the analysis of the framing and informational value of interface elements (including the real and ideal and the social relations between the player and the avatar and others); we can also look more closely at a broader range of images or consider the screen in motion in order to provide a fuller description of the text; and we can interview players to see what they find meaningful and how they come to that understanding. The last of these might be the most useful study, as understanding how players make sense of their actions constitutes the culmination of the design and presentation of a game. Players are meant to play the game; knowing how they come to this play capacity is central to the study of games and to the continued evolution of game play itself.


Aarseth, E. (1997). Cybertext: Perspectives on ergodic literature. Baltimore, MD: Johns Hopkins UP.

Gee, J.P. (2006). Learning by design: Good video games as learning machines. In P. Messaris and L. Humphreys (Eds.) Digital media: Transformations in human communication (pp. 173-186). New York, NY: Peter Lang Publishing.

Gee, J. P. (2010). Good video games and good learning: Collected essays on video games, learning and literacy. New York, NY: Peter Lang Publishing.

Holmes, J. (2004). Playing with my (non)self: Grand Theft Auto at the juncture of rhetoric. Unpublished paper.

Jewitt, C. and R. Oyama. (2001). Visual meaning: A social semiotic approach. In T. van Leeuwen and C. Jewitt (Eds.) Handbook of visual analysis (pp. 134-156). Thousand Oaks, CA: SAGE Publications.

Juul, J. (2005). Half-real: Video games between real rules and fictional worlds. Cambridge, MA: MIT Press.

Kress, G. and T. van Leeuwen. (1996). Reading images: The visual grammar of design. New York, NY: Routledge.

Turkle, S. (1995). Life on the screen: Identity in the age of the internet. New York, NY: Touchstone.

van Vugt, H. C., Konijn, E. A., Hoorn, J. F., Keur, I. and A. Eliens. (2007). Realism is not all! User engagement with task-related interface characters. Interacting with Computers (pp. 267-280).

Wilson, L. (2004). Interactivity or interpassivity: A question of agency in digital play.
