Silvan Tomkins and Innate Affects
Silvan Tomkins' (1963) model, recently developed in clinical and neurophysiological terms by Nathanson (1992) and Schore (1993), takes as its premise that drives and affects are different biological systems. Drives are “a group of mechanisms that provide information about ‘time, of place, and of response—where and when to do what when the body does not know how to otherwise help itself'” (p. 110). For example, when certain nutrients drop below a certain point, a ‘problem' is identified in the throat and stomach, and the organism is motivated to gather food towards the mouth. It is, then, primarily a “transport system, taking material in and out of the body, [and so] it must impose its specific temporal rhythms” (p. 111). Computer games may provide a distraction from such unmet biological needs as hunger, thirst, respiration and excretion and even sexual drives; furthermore, unmet biological needs may augment the degree or kind of intensity involved during gameplay. Nonetheless, in an urbanised, Western social context, we can assume that these needs are rarely so seriously unattended to that they significantly determine the experience of play. Computer games are only marginally motivated by biological drives in that it is after basic physiological needs are met that leisure activities such as computer gameplay are (usually) undertaken.
In understanding the motivation for gameplay, then, it is more useful to pursue Tomkins' argument that drives are not effective in themselves but require amplification from affects. Indeed, for Tomkins, affects are not merely sufficient motivators in themselves: “It is the affects rather than the drives which are the primary human motives” (p. 111). In identifying the nature of affects, and differentiating them, he argues that the face is the primary site of action for the affect system and that there are nine innate affects identified with distinct facial responses, three positive and six negative:
The positive affects are as follows: first, interest or excitement, in which we observe that the eyebrows are down and the stare tracking an object or fixed on it; second, enjoyment or joy, the smiling response; third, surprise or startle, with eyebrows raised and eyes blinking.
The negative affects are the following: first, distress or anguish, the crying response; second, fear or terror, in which the eyes may be frozen open in a fixed stare or moving away from the dreaded object to the side, the skin pale, cold, sweating, and trembling, and the hair erect; third, shame or humiliation, with the eyes and head lowered; fourth, dissmell, with the upper lip raised; fifth, disgust, with the lower lip lowered and protruded; sixth, anger or rage, with a frown, clenched jaw and red face. (Tomkins, 1987, p. 139)
With the exception of the food-centred and drive-based mechanisms of dissmell and disgust, each of these innate affects is given two names, marking a moderate and an extreme form of the affect. Of these nine, the basic six are interest-excitement, enjoyment-joy, surprise-startle, fear-terror, distress-anguish, and anger-rage. The affects of disgust and dissmell are defensive auxiliary affects of the drives for hunger, thirst and oxygen, and are of less importance in analysing the computer as a medium. Shame-humiliation is an auxiliary affect produced through separation from interest and enjoyment and is relevant in understanding processes of game mastery and frustration.
For Tomkins, affects must be sufficiently general if, for example, infants are to respond to an unlearned stimulus and communicate this response to others; at the same time, affects must be capable of being triggered by learned stimuli. That is, the above affects are not tied to a particular stimulus, as in a fear of being burnt. Rather, Tomkins accounts for affect activation in terms of the density of neural firing, or stimulation, over time, identifying three classes of affect activators: stimulation increase, stimulation level, and stimulation decrease. On this basis he argues that:
the human being is equipped for affective arousal for every major contingency. If internal or external sources of neural firing suddenly increase, s/he will startle or become afraid, or become interested, depending on the suddenness of the increase in stimulation. If internal or external sources of neural firing reach and maintain a high, constant level of stimulation, which deviates in excess of an optimal level of neural firing, s/he will respond with anger or distress, depending on the level of stimulation. If internal or external sources of neural firing suddenly decrease, s/he will laugh or smile with enjoyment, depending on the suddenness of the decrease in stimulation. (1987, p. 140)
These affects are clearest in infants but are more complexly mediated as individuals develop. Nonetheless, “‘meaning' operates through the very general profiles of acceleration, deceleration, or level of neural firing as these are produced by either cognitive, memorial, perceptual or motor responses” (p. 142); that is, no matter how the individual develops, learned responses can activate affects “only through the general neural profiles” (1987, p. 141).
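Tomkins' three classes of activator can be read as a simple decision rule over the level and the rate of change of stimulation. The following is only a minimal sketch of that reading: the function names, the arbitrary units, and every threshold are hypothetical, and the ordering of startle, fear and interest by steepness of increase simply follows the order in which the passage above lists them.

```python
# Hypothetical thresholds in arbitrary units; none of these values
# comes from Tomkins. They only make the three activator classes concrete.
STARTLE_RATE = 3.0    # very sudden increase in stimulation
FEAR_RATE = 2.0       # less sudden increase
INTEREST_RATE = 0.5   # moderate increase
DECREASE_RATE = -0.5  # decrease in stimulation
OPTIMAL_LEVEL = 1.0   # sustained level above this is "in excess of optimal"
ANGER_LEVEL = 2.0     # sustained level above this is assumed to cue anger


def innate_affect(level: float, rate: float) -> str:
    """Map a stimulation level and its rate of change to an affect label."""
    # Class 1: stimulation increase, graded by suddenness.
    if rate >= STARTLE_RATE:
        return "surprise-startle"
    if rate >= FEAR_RATE:
        return "fear-terror"
    if rate >= INTEREST_RATE:
        return "interest-excitement"
    # Class 3: stimulation decrease.
    if rate <= DECREASE_RATE:
        return "enjoyment-joy"
    # Class 2: roughly constant stimulation, graded by level.
    if level >= ANGER_LEVEL:
        return "anger-rage"
    if level > OPTIMAL_LEVEL:
        return "distress-anguish"
    return "none"


def affects_over_time(samples: list[float], dt: float = 1.0) -> list[str]:
    """Label each step of a sampled stimulation trace."""
    return [
        innate_affect(cur, (cur - prev) / dt)
        for prev, cur in zip(samples, samples[1:])
    ]


if __name__ == "__main__":
    # Quiet, sudden spike, sustained peak, sudden relief.
    print(affects_over_time([0.5, 0.5, 4.0, 4.0, 1.0]))
```

On this reading, a learned stimulus does not need its own rule: it activates an affect only by producing one of these general profiles of increase, level or decrease.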
For Tomkins, affects bridge biology and psychology: meaning-free shifts in brain physiology “give rise to the kind of qualitative experience we call a feeling” (Nathanson, 1994, p. 115). These:
evolved to amplify or make urgent and important stimuli that conform to any of the six categories of stimulus increase, stimulus level or stimulus decrease by making something happen on the face, something that we learn to interpret because it is a compelling program that, from infancy to senescence, literally takes over the body (pp. 115-6).
For something to be the subject of one's attention, it must be brought into focus by affect; indeed, nothing can be conscious unless it is amplified by an affect. In this respect, there are nine innate affects: nine ways of paying attention to stimuli or of being motivated.
While an innate affective response may require a comparison between two stimuli, and hence a minimal pause while the individual retrieves information from short or long term memory, it does not involve further cognition. Evaluation is limited to “immediately observable affordances” (Tan, 1997), that is, those forms of action which are obviously associated with the object or situation, and which finds its most basic form in the fight or flight response (see Gibson, 1979). Affective responses are “analogical amplifiers” of a stimulus (Nathanson, 1994, p. 11) in the sense that they are immediate and reflect the character of the stimulus (Tan, 1997, p. 47), that is, the increase in neural firing will partially reflect the intensity of the stimulus. Furthermore, since the biological response to an increase in stimulus is an increase in neural firing, this may constitute the stimulus—or be a sufficient precondition—for a subsequent increase in neural firing, motivating subsequent affects or reinforcing the existing affect. For example, excitement makes one more excited, but an excess of excitement may produce distress, and this excess will be a significant precondition for the enjoyment felt when the immediate stimulus is abated. In short, the analogical character of affects means that they can be self-perpetuating.
Secondary appraisal involves greater cognitive elaboration and may give rise to “an emotional significance that is not directly evident from the situation itself” (Tan, 1997, p. 47). That is, if innate affects constitute, or are a central component of, primary appraisal, then emotions proper (and Tomkins' more detailed analysis of affect) require a secondary appraisal, such that, at a certain point, the elaborated affective sequences Tomkins describes gain the character of the emotions as elaborated in Tan's account of Frijda's model. Primary appraisal, the physiological mechanisms of response that cue any emotive response, is, then, affective in nature, but during the development of an individual affects are combined and contextualised in increasingly complex but formulaic ways, and previous appraisals will allow us to be primed for stimuli. Consequently, while innate affects have a biological character, learned affects, or emotions, have a biographical character (Nathanson, 1994, p. 116).
While Tan develops an elaborated account of the way in which the feature film is an emotion machine, we can make a far less complex argument that many computer games function as affect machines. Computer games often stimulate players so frequently and intensely that their cognitive systems are unable to evaluate the stimulus—there is no time for a developed emotional response, much less a sensorimotor response, before the next stimulus appears—such that affective responses are dominant. This is not to say that there is no cognitive activity or no true emotional responses, as if computer games are only affect machines; it is merely to say that some kinds of games may primarily, or during certain sequences, function as affect machines, and that more developed cognitive or emotional responses are auxiliary to the production and regulation of affect.
Physiological Thresholds of Affective Intensity: Limits and Adaptability of the Cognitive and Motor System
Computer games may be seen as retarding secondary appraisal by providing stimulus measurably beyond players' optimal level of processing, or beyond their capability to react according to the immediately identified affordances; in short, they force players to perform at, or beyond the limits of, their physiological ability and therefore are often predisposed towards the production of surprise-startle, anger-rage and distress-anguish. While as yet no empirical research on computer games addresses this assertion, a sufficient theoretical justification and departure point for making it is Loftus and Loftus' (1984) account of the cognitive system during gameplay. We can summarise the cognitive systems by reference to two examples: Xenon 2 and Vampire: the Masquerade.
First, human senses are continually bombarded with sensory stimuli which are “initially placed into a sensory memory. One sensory memory corresponds to each sensory modality, thus there are five sensory memories in all” (p. 45). Sensory memory is extensive (it may record most or all sensory stimuli) but it does not retain information for long; visual information, for example, is only retained for about a quarter of a second, or 250 milliseconds. This information is then either transferred to short- and long-term memory, or lost. The process whereby this information is filtered is referred to as (“selective”) “attention,” and it determines the extent to which the contents of sensory memory pass into short- or long-term memory. While the mind can be aware of multiple events, it can only focus on one event at a time.
Shifts of attention are often accompanied by physical actions, the most significant being eye movements. Sudden eye movements from one place to another are called “saccades,” and during these movements no information is passed from the eye to the mind. Information only arrives in the periods between saccades, and these periods are referred to as “fixations.” Saccades are brief, lasting between 10 and 80 milliseconds, whereas each fixation lasts considerably longer, between 150 and 400 milliseconds (Grob, 1994, p. 59). This physiological limitation may create problems during gameplay if attention is not adequately directed and action onscreen is too fast. However, shifts of attention may occur without eye movement, and these mental shifts of attention are somewhat faster, taking about a twentieth of a second (50 milliseconds). Saccades are essential to normal vision in that the fovea, at the centre of the retina, perceives visual images in fine detail, but occupies only a small proportion of the retina. Peripheral vision, by contrast, is comparatively unfocused, though changes in peripheral vision may be detected in sufficient detail to cue the individual to change focus. That is, saccades may “purposely explore the environment” to compensate for this limitation (Bordwell, 1985, p. 32).
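The timing figures above imply a rough ceiling on how many distinct screen locations a player can inspect per second. The arithmetic can be sketched as follows, using only the 10 to 80 millisecond saccade range and the 150 to 400 millisecond fixation range cited above; the function names are hypothetical, and the assumption that each inspected location costs exactly one saccade plus one fixation is a deliberate simplification.

```python
# Ranges taken from the figures cited in the text (milliseconds).
SACCADE_MS = (10, 80)     # shortest and longest saccade
FIXATION_MS = (150, 400)  # shortest and longest fixation


def fixations_per_second(saccade_ms: int, fixation_ms: int) -> float:
    """Fixations completed per second if every fixation follows one saccade."""
    return 1000.0 / (saccade_ms + fixation_ms)


def targets_inspected(window_ms: int, saccade_ms: int, fixation_ms: int) -> int:
    """Whole saccade-plus-fixation cycles that fit into a time window.

    Simplifying assumption: one cycle per inspected screen location.
    """
    return window_ms // (saccade_ms + fixation_ms)


if __name__ == "__main__":
    best = fixations_per_second(SACCADE_MS[0], FIXATION_MS[0])   # 6.25
    worst = fixations_per_second(SACCADE_MS[1], FIXATION_MS[1])  # about 2.08
    print(f"fixations per second: {worst:.2f} to {best:.2f}")
    print(targets_inspected(1000, SACCADE_MS[0], FIXATION_MS[0]))  # 6
    print(targets_inspected(1000, SACCADE_MS[1], FIXATION_MS[1]))  # 2
```

Under these assumptions a player can fixate on somewhere between two and six locations per second, so a screen that introduces relevant objects faster than this must either be handled through peripheral cues and mental shifts of attention or will exceed what can be attended to.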
While all the stimuli from the interface enter sensory memory, players can only attend to and respond to one set of stimuli at a time, and if attention is not directed to the appropriate part of the screen the player may not respond to the appropriate sensory cue. In Xenon 2 and similar top-down scrolling games, the player vigilantly scans the scrolling screen for incoming aliens which may come from the top of the screen or the side of the screen (SOAO). The intensity of this scanning is relatively continuous (FOI), though there may be some sections of a level with slightly fewer enemies than others, and patterns may be learned and anticipated (PLA), reducing reaction time. The visual resemblance between the passing landscape and some alien forms (for example, the orange worms that extend from the passing walls resemble the cones that stud the rocks) means that the player must also actively scan the entire environment in anticipation of relevant sensory information. This scanning is confined to that part of the screen that is presently displayed (ARASO), but it is under the pressure of time (TP) because the screen keeps scrolling by. Nonetheless, the intensity of this scanning varies depending upon the number of enemies and bullets at any point in time. The consequence of failure (CF) is the damaging or destruction of the ship, which is catastrophic (game ends) if the player is on his/her last life.
In Vampire, by contrast, the player must scan the onscreen view in the top two thirds of the screen, but for most of the time gameplay functions in a navigational mode which involves scanning for such objects as doors, keys and quest items which may be necessary to advance through the game, or scanning for enemies, the appearance of which prompts a shift into a high-stimulus phase. During navigation this scanning is unpressured (TP) and the consequence of failure (CF) is almost nil, since the player can navigate the game-world at his/her leisure. During combat, this scanning is highly pressured (TP), and involves more urgent scanning beyond the immediate perspective (ARASO). The consequence of failing (CF) to attend to the right stimulus varies, but it is generally minimised because a single attack against a character is not usually fatal, and the player has time to change the focus of their attention, for example, by turning to attack a hidden monster or the monster that the player realises is inflicting the most damage. The frequency of these high-intensity sequences (FOI) is relatively low, and since opponents can usually be seen from a distance, the player has time to respond or retreat, such that the sequence can be prepared for or avoided. In a sense, then, the player not only scans the screen for the triggers of a high-intensity sequence but also cognitively anticipates, or is primed for, them, in the hope that, through preparation, the intensity can be minimised.
Memory is another physiological limitation that computer games exploit. Short-term memory is usually identified with consciousness, and is referred to as “working memory,” but it has a small capacity (seven plus or minus two items) and does not retain information for long. Different memory strategies, such as rehearsal, forced association and mnemonics, may be utilised to keep items in short-term memory or facilitate their passing into readily accessible long-term memory. Information in short-term memory is usually passed to long-term memory, which is virtually unlimited, and may endure indefinitely. Nonetheless, information may eventually be forgotten or be difficult to access from long-term memory without the appropriate cue, depending upon how it is originally stored. When a familiar symbol is presented to an individual, it takes about a tenth of a second to retrieve or recognise the symbol from long-term memory.
Reaction time depends upon the degree of expectancy, in that anticipation of an event can speed the response, and is therefore related to the access of short- or long-term memory. Unlike events in the real world, the regulated nature of computer games means that players can rely upon a certain degree of predictability in their responses, and thereby develop learned reflexes. However, excessive anticipation may give rise to an affective intensity in which the presence of stimulus may in itself prompt action, and the player will not wait long enough to determine the object s/he is responding to. That is, expectation may shift between: on the one hand, an all-or-nothing principle in which any stimulus is a sufficient condition for the player to respond in the manner that is anticipated as being relevant; and, on the other hand, a caution in which the player waits until s/he has definitively recruited the identifying information from long-term memory, and (re-)verified its contextual relevance, before responding. Both of these extremes may prove disastrous in terms of gameplay.
For example, arcade-style games like Xenon 2 frequently maximise the number of objects of various types, and the speed of their appearance, so as to panic the player into inappropriate responses without time for proper appraisal. As noted above, there is sometimes a visual resemblance between various landscape objects, monsters and power-ups, which means that if the player's selective attention is not intense enough, they may recall and act upon the wrong memory. If the player sees an orange rock on the passing walls they will continue flying as they were; if they suddenly re-appraise the orange rock and recognise that it is a biting orange worm they will be cued to slow down or speed up to avoid it, but they may move in the wrong direction. Conversely, if the player mistakes a power-up for an enemy, they may veer away from it and thereby commit themselves to a movement which they cannot recover from and so lose the power-up. Consequently, access to stored memory and reaction time are time-pressured and have a high consequence of failure.
In Vampire, memory access and reaction time are minimally time-pressured and the consequence of failure is similarly low in most navigation and combat sequences. However, in some combat sequences, especially fighting more powerful monsters, the player must constantly identify each monster type, remember which character inflicts the most damage to that monster, and often have made proper preparations prior to engaging in combat with the monster. For example, if a player forgets which door leads to the Cappadocian who has stolen the Book of Nod in the second quest of the game, s/he will not cast the appropriate protection and summoning spells which may be required to defeat him. Consequently, reaction time involves a highly pre-emptive element: the player is expected to recognise the future high-stimulus state, then pause and prepare with reference to long-term memory. Indeed, while games like Xenon 2 require players to engage in more conscious and complex use of their short- and long-term memory (compensating for the limits of short-term memory, remembering the sequence of enemies as they appear), other games like Vampire take over the role of memory by providing ready-to-hand records: the statistics window, inventory and discipline screens, item descriptions, records of quests completed and present goals, and maps of the level.
While every game has its own distinctive strategic deployment of opponents and power ups and so on, it is useful, as a provisional and general point of comparison, to provide a taxonomy of how/where different genres produce affective intensities in excess of optimal levels. Prior to empirical research, we can consider some preliminary categorisation, which might signal how games utilise or exploit processes of selective attention, short and long-term memory access, reaction time, and sensorimotor skills. Even without testing, it is possible to describe the general intensity of most intense sequences of games, such as those related to combat or hazardous exploration, when the player is at a mid-range state of health/remaining lives. To this end it is useful to identify such categories as:
Agent relative attention shifts onscreen (ARASO): gaps between foci of attention, e.g. from the agent controlled by the player to the rest of the screen, or just close proximity to the agent controlled by the player.
Agent relative shifts between screens (ARASB): the extent to which the player must keep scanning a playing field larger than the display screen, e.g. player scrolling about the map in Warcraft 2.
Statistics relative shifts onscreen (SASO): frequency of and gaps between action and secondary representations of action onscreen, e.g. health-bars and radars.
Statistics relative shifts between screens (SASB): gaps between action and secondary representations on other screens, e.g. inventory windows that must be opened.
Type of representation (TOR): the routine reading of iconic sign-functions which motivate non-verbal memory, or symbolic sign-functions which motivate verbal memory.
Game as memory/mnemonic (GAM): extent to which game provides memory cues for the player (or simply recalls information for the player), minimising the player's necessity of long-term recall, e.g. tips, hints, maps and quest logs.
Rate of appearance (ROA): frequency at which significant new stimuli, e.g. new enemy ships, appear to confront the player.
Frequency of intensity (FOI): frequency with which stimulus-intense sequences, principally dangerous combat sequences or hazardous navigation, appear in the game, as opposed to, e.g., non-threatening navigation or narrative sequences.
Potential for learned anticipation (PLA): degree to which the player can learn to predict the sequence of stimulus, e.g. the patterns of oncoming ships.
Procedural complexity (PC): complexity of sensorimotor sequences, e.g. jumping across a series of moving platforms; this will be partly determined by the complexity of the interface, e.g. pressing four keys in a timed sequence vs. pressing the same key repeatedly as fast as possible.
Cognitive Vigilance (CV): the degree/frequency of cognitive attention required to complete a sensorimotor sequence.
Overall Affective Stimulus (OAS): hypothetical extent to which the game cues high-level affective responses relative to physiological limitations of human players.
Table 1. Generalised categories of computer game structure/goals vs physiological demands
On the basis of this kind of appraisal one might derive an index of affective intensity for a particular game event, sequence, or (far less reliably, because of the variety of sequences in many games and the variable nature of gameplay), an entire game. Of course, empirical research is required to clarify significant units that influence these kinds of match or mismatch between the demands of games and players' physiological thresholds, but even the observations above suggest that many computer games exploit the physiological limits of players in processing stimulus. In Tomkins' terms, if affect is governed by sudden increases, decreases, or levelling, of stimulus, then action-oriented computer games have a tendency towards excessive and rapid-fire increases and the sustaining of stimulus at a peak level beyond players' optimal levels of response. That players frequently exhibit affective markers of distress or rage—that is, facial expressions and body postures—is sufficient to conclude that computer games cue affective states of interest-startlement and distress-rage in excess of the everyday affective range. That is, many action-oriented computer games produce an excess of stimulus without time for the secondary appraisal characteristic of proper emotional responses. Since primary appraisal can be mapped, in part, onto Silvan Tomkins' (1963) innate affects, we can restate Tan's argument that the traditional feature film is an “emotion machine” by arguing that computer games often function as “affect machines.”
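One way such an index might be derived from the categories listed earlier is sketched below. Everything in it is an assumption made for illustration: the 0 to 1 rating scale, the equal weighting, the decision to treat GAM and PLA as categories that relieve rather than add to demand, and the omission of TOR (a type rather than a degree) and of OAS (the quantity being estimated). The ratings in the usage example are placeholders, not measurements of any game.

```python
# Categories rated on a hypothetical 0.0 to 1.0 scale (1.0 = most demanding).
DEMAND_CATEGORIES = ("ARASO", "ARASB", "SASO", "SASB", "ROA", "FOI", "PC", "CV")
# Assumption: higher ratings here lower the demand on the player.
RELIEF_CATEGORIES = ("GAM", "PLA")


def affective_intensity_index(ratings: dict[str, float]) -> float:
    """Equal-weight mean of demand ratings, with relief ratings inverted."""
    total = 0.0
    for name in DEMAND_CATEGORIES + RELIEF_CATEGORIES:
        value = ratings[name]  # a missing category raises KeyError
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        # Relief categories count against intensity, so invert them.
        total += (1.0 - value) if name in RELIEF_CATEGORIES else value
    return total / (len(DEMAND_CATEGORIES) + len(RELIEF_CATEGORIES))


if __name__ == "__main__":
    # Placeholder ratings for an imagined high-pressure sequence.
    sequence = {
        "ARASO": 0.9, "ARASB": 0.1, "SASO": 0.4, "SASB": 0.0,
        "ROA": 0.9, "FOI": 0.8, "PC": 0.5, "CV": 0.7,
        "GAM": 0.1, "PLA": 0.6,
    }
    print(round(affective_intensity_index(sequence), 2))
```

An equal-weight mean is the simplest possible choice; empirical work of the kind called for above would be needed to decide which categories matter most and whether they combine additively at all.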
Such observations are directly related to the aesthetic which the terms “altered state,” “immersion,” and “kinaesthesia” attempt to describe: the desirable sense of vertiginous involvement in which things are just about to go out of control, such that play is a form of “crisis management” with the ongoing possibility of failure or virtual death. Adventure games may rarely function as “affect machines” in this respect: games like Monkey Island 3 or Myst provide ample space for secondary appraisal and therefore the mitigation of primary affects and the cultivation of more complex and long-term emotional states. However, even less action-oriented computer games, such as Vampire: the Masquerade, have sequences during which a high affective intensity is produced.
The Functionality of the Interface: Non-verbal and Verbal Memory
Physiological and cognitive accounts of visual processing, or more specifically, how images and words are stored in and retrieved from long term memory (LTM) (GroB, 1994; Laird, 1993; Strothote & Strothote, 1997), provide a perspective on the role played by signs at the interface in regulating affect. There are, of course, many accounts of the principles governing image comprehension to design effective computer interfaces (GroB, 1994), but the complexities of visual processing—notably, what proportion of vision is processed physiologically and what propertion is processed cognitively (Johnson-Laird, 1993, p. 60)—are still being debated. Nonetheless, we can generally accept that while initial perception of a familiar or simple image may be fairly automatic, pictures which engage our interest because they are of ecological value or are difficult to interpret require greater cognitive processing and greater recruitment from long term memory (Kunen, Green and Waterman, 1979).
According to the dual-coding theory of memory proposed by Paivio, 1971 (see Sadoski, Paivio, & Goetz, 1991):
memory consists of two separate and distinct mental representations, or codes—one verbal and nonerbal. The verbal system is language-like in that it specializes in linguistic activities associated with words, sentences and so on. Although the nonverbal system includes memory for all nonverbal phenomenon, including such things as emotional reacxtions, this system is most easily thought of as a code for images and other “picture-like” representations (although it would be inaccurate to think of this as pictures stored in the head). (Rieber, 1994, p. 111)
Verbal information (“logogens”) are stored as discrete elements, mirroring the structure of language, while non-verbal information (“imagens”) are stored in a more continuous fashion, with a gestalt “all-in-oneness” quality. Whereas verbal stimuli directly activate verbal memory codes, visual stimuli activate visual memory codes. The distinction between the two forms of memory is understood to be analogous to that between digital and analogue information. Furthermore, processing in the verbal system is supposedly sequential or linear, while processing in the non-verbal system is supposed to be parallel or synchronous, in that one can “scan” a mental image in a way we cannot “scan” our memory of a sequences of words. Kobayashi (1986) argues that remembering something in both systems increases our chance of recall, but argues that while pictures may be stored both visually and verbally, words are less likely to be stored visually. It is evident, of course, that we do not remember an entire text in non-verbal form, but assimilate it at a higher-level of organisation (we do not recall an entire sjuzhet , only an abbreviated fabula ), but we may have a visual memory of a text we have not cognitively assimilated, such that we can access parts of a text in both linear and synchronous fashion.
It is generally accepted that visual memory is better than verbal memory, and that in this respect the retention, retrieval and transfer of memory may be seen as having systems analogous to “iconic” and “symbolic” sign-functions; or, rather, iconic sign-functions more readily recruit non-verbal memory, whereas symbolic sign-functions more readily recruit verbal memory. The extent to which these orient the individual towards, procedural, bottom-up and top-down processing is then, not generalisable: it would depend upon the context. The primary significance of iconic and symbolic sign-functions, then, pertains to the extent to which the representation of something facilitates the speed of search of relevant information (amount of STM and LTM), recognition of key features (labelling and grouping), and the ability to draw inferences about the perceived (especially in terms of relations between the key features) (Larkin & Simon, 1987). While this may have a minimal significance in the regulation of affect, in the sense that quick access to and interpretation of the interface allows the player to better process the flow of information, the significance of sign-functions in computer games like FFX is dependant upon—and more evident in terms of—cognitive modes, or styles, and the ways these modes or styles are culturally coded.
In FFX , a player in the middle of a combat sequence may absently recognise an image of Tidus on the screen as an icon, as realistically resembling (and perhaps being perceived, and identified with, as) an actual person in terms of textures, proportions and the movement. However, the player may look closer to see observe or determine the individual pixels, polygon count and texture mappings, such that Tidus is perceived as a symbol, a representation of a fictional person. The player may then attend to the indexical relationship between the icon/symbol of Tidus and the statistics in the lower right hand corner—name, hit points, and health points—which are indexical of Tidus' combat status. These statistics are only meaningful in relation to (part of the same system of difference as) the statistics of other characters and the options available to the player. Indeed, when the player accesses the menu options of Attack, Summon, Special or Item (and the options in the subordinate menus), so that the character performs a particular combat action, the image and statistics of Tidus are indexical of a particular action , or, more broadly, they are subordinate to a strategy , which is itself only meaningful in the context of the entire differential field that constitutes the game's semiotic system.
While Poole (2002) makes this argument in relation to computer games in general, a qualification needs to be made. Poole observes that each game has its mechanics, its rules, and that these are symbolically organised, so that, even in the case of iconic characters like Lara Craft: “The ‘realistic' skin hides a semiotic cyborg” (p. 192). Furthermore, a “good” game will offer a complex symbolic system, in which the player's “imagination” is induced, in the sense that it offers “the dynamic challenge of being able to predict how one's actions will affect the system, and therefore what course of action is optimal” (p. 185), with the player acting as if “the rules of the semiotic system presented, and not the rules of the real world” (p. 185) apply. The importance of this point is that Poole does not clarify the distinction between of sign-type rather than sign-function, and this extends to a lack of distinction between the sign-functions coded by the program—the “semiotic engine” (p. 203)—and the interpretation or allocation of sign-functions by the player. I account for this in the next chapter by synthesising Aarseth's (1997) notion of “ergodic” texts (which similarly emphasises the sign-types or sign functions of the “machine-text”) with Barthes' notion of levels of signification. The subsequent chapters address the semiotic complexity of FFX alongside the emotional response induced in relation to it in more detail, but we can certainly accept Poole's conclusion that the appeal of a computer game lies in not just increased iconism (associated with “identification,” see chapter seven) nor increased symbolic complexity: “what matters in modern gameplay terms is the interaction of all three types of sign” (p. 202).
As Poole (2002) observes, different modes of representation allow for economical ways of reducing the vast quantities of game information into a readily accessible form. In some cases, an “iconic” sign may more readily convey this information, but as often as not the most efficient way of conveying game information is a mix of iconic, indexical and symbolic sign-functions. Ultimately, though, this efficiency will depend upon the player's capacity for storing, recruiting and processing verbal and non-verbal memory. The interface becomes a “tool” of the player as they try to keep the flow of information within an optimal level: the player learns the most economical strategy for scanning which signs at which time during gameplay in order to regulate their choices. A game with an inadequate interface may mean that the affective intensity readily extends beyond the player's capacity to self-regulate because the player is aware that, in a sense, the playing field is unequal, or unfair: s/he has to self-regulate more frequently to compensate for the fact that s/he is continually frustrated by the interface. This may, through cognitive dissonance, lead to increased investment in the game, or to an increased sense of wasted effort when the player finally abandons the game.
From Game as Catharsis of Negative Affect to Real-Time Experience and Activity of Affect Regulation
The obvious but misleading inference from this is that the experience of gameplay, or at least of a particular kind of gameplay aesthetic, is characterised by the production of negative affects: the sudden rapid-fire increases of stimulus that produce not just interest-excitement and surprise but also startlement and fear-terror; and the maintenance of a constant high level of stimulus that peaks beyond an optimal range and thereby produces distress-anguish and anger-rage. These responses may be accounted for physiologically as a pronounced activation of the sympathetic nervous system beyond the individual's ability to regulate, prompting the player to an action tendency which finds its most extreme behavioural elaboration in the self-preserving fight-or-flight reflex. This reflex or affective motivation may, of course, be highly attenuated, mediated or repressed during play. That is, instead of leading to a high level of arousal beyond play, and the subsequent seeking of persons or objects towards which to act aggressively, it may be confined to the player's relationship to the game, and range from a simple attunement of the intensity of play through to verbal and/or physical attacks against the game or machine. We might observe that these affects cannot be regulated within the diegesis of the game, given the limited affordances, and that this readily leads to a displacement of attention to non-diegetic elements.
However, the negative affect produced during gameplay is hardly desirable in itself (except when players are engaging in “dysfunctional” gameplay); what is desired, rather, is the pressured development or performance of procedural schema and cognitive strategies that allow the player to regulate these affects, or, more specifically, the relief (and therefore enjoyment-joy) produced when the player teeters upon and avoids a catastrophic level of negative affect. By extension, player motivation may be seen as determined largely by the way players deal with catastrophic negative affect. What is especially significant here, however, is that the moments that common sense suggests produce distress or anger (moments of hyper-stimulation, such as those of failure or frustration) are often moments when entirely different systems of affect and emotion come into play: the auxiliary affects of shame and pride.