This is a draft of a column I wrote for the ACM’s interactions magazine. It will appear late 2011.
Mildly irritated. Frustrated. Somewhat annoyed. Profoundly agitated.
This was the trajectory of feeling I experienced this afternoon as I ascended and descended a customer service phone tree in the hopes of reaching a human who could (or would) answer my question. As I reached a flushed state of agitation, I had a meta-moment. I was catapulted out of my immediate experience into a view of myself from a distance. You are seeing red. Your reaction is unreasonable. Stop. Get a grip. Take a deep breath. Hang up the phone.
As a result of this incident, in the longstanding tradition of armchair philosophy I started to ponder: What are feeling, affect, emotion? Was I really being unreasonable having an emotional reaction, by feeling agitated? What are reasonable and unreasonable in this context? Would others have been swept away with agitation and simply shouted at the person who finally answered? Would I have felt better if I had done that? Aside from hiring more people, what, if anything, could the company have done to prevent this ascension to annoyance? What was clear to me was that a tinny rendition of Antonio Vivaldi’s Four Seasons was not cutting it in the keep-Elizabeth-calm department.
For decades, human-computer interaction professionals have been thinking about cognition and how to present choices in ways that are intuitively obvious, and we’ve worried about how to measure emotions like frustration for almost as long. We know that perceived options change and that people make different choices among the same set of options depending on their emotional state.
Well, I pondered, perhaps we can detect subtle cues about someone’s state and, based on that, present more copacetic options and different interaction experiences. It is postulated that people have visceral, pre-cognitive positive or negative reactions to things. Perhaps someone’s state can be detected before the person themselves realizes how they feel? Perhaps there could be sensors in my phone handset that detect my emotional state and initiate cheer-up, calm-down or fall-in-love sequences? Evidence suggests that visceral responses aren’t just in response to an artifact or situation; they can be modulated by social factors. “Affective priming” studies suggest that others’ emotional states can affect our decision making even if we are unaware of it: people take more risks after being subliminally exposed to smiling faces than to frowning ones. If the service I was calling had subliminally played laughter down the phone line, just maybe my ascension to agitation would have been prevented.
As I idly entertained myself with these technospeculative reveries, I came across a discussion of innovations in sensor technologies which can reliably tell us what people are feeling and accurately map their emotional state. Frankly, I had cast aside my reveries as overly deterministic and somewhat creepy, the stuff of potions and hexes, and way too much of a wicked design/engineering problem. Such a scenario, I reasoned, requires accurate detection, a good model of emotion and mood and their impact on thought and action, and an understanding of how the person’s reactions and their surrounding context are going to interfere with any initiated interactive sequence. I decided to turn to research to see if there exist grounds for these optimistic, techno-detection-of-emotion media narratives.
The plethora of disciplines interested in emotion (for example, biology, linguistics, psychology, sociology, cultural studies, anthropology, design, literary studies, performing arts…) suggests that many have pondered: What is the relationship between emotion, thought and action? How can we detect and measure emotion? How can we assess the influence of emotion on thinking and action? Some conclude that emotions are an annoying flaw in an otherwise perfect reasoning system. Others assert that emotion/feeling and reason are intertwined, and that it is futile to assert that emotion-free cognition is even possible. In 1994, António Damásio argued from neuroscientific evidence that emotions play a critical role in cognition, provide the scaffolding for the construction of social cognition, and underlie all human consciousness. Armed with Damásio’s perspective, I delved deeper into theories on emotion, hoping to find more evidence as to the nature and measurement of emotional influence on human behavior. The theories can be crudely bucketed into three types:
• naturalistic theories, which maintain that emotions are products of natural processes which are independent of social norms and conscious interpretation; they result from hormones, neuro-muscular feedback from facial expressions, and genetic mechanisms,
• interactionist theories, which acknowledge culture’s importance yet suggest that enduring and universal biological mechanisms provide for core emotional features (social aspects are cast as derivative, contingent, and variable),
• social constructionist theories, which maintain that emotions depend on a social consciousness concerning when, where, and what to feel as well as when, where, and how to act.
There are more or less extreme versions of each; the more extreme flaunt clear ideological stances. For example, naturalistic theorists like Robert Zajonc associate emotions with spontaneity, creativity, sociability and passion; on this view, emotions are more authentic than, and are preferred over, trained, ‘cold’ reason. A whole research agenda exists to illustrate that we process emotional cues (preferences) separately from making reason-driven decisions (inferences). Much of the interactionist perspective is rooted in the work of Charles Darwin, the father of evolutionary theory, whose view on emotions is that they are gestures that hark back to basic human behaviours: love is the vestige of copulation, angry teeth-baring a vestige of biting. Returning to my own telephonic experience, I confess I did not bare my teeth at the handset. However, had I done so, my Darwinian ancestral counterpart would simply have bitten it.
Other researchers focus a great deal on responses in the sympathetic nervous system; elevated activity is known as arousal, and arousal is deemed positive or negative. Reading articles in this tradition, I have to say there seems to be a conflation of body arousal with emotion and feeling. In my view, arousal tells you something happened, but not how the individual felt, what emotion they experienced, nor how those around them behaved in response to the same incident; these factors are likely to have a major influence on the emotion as felt and whether it persists or not. Moving from detection of usually non-visible sympathetic nervous system arousal to visually available body reactions, Paul Ekman’s work on emotion and facial expression is perhaps the most famous work from the interactionist camp. Ekman suggests there are a few core or “basic”, universal, biologically generated, emotion-revealing facial expressions; these reflect anger, disgust, fear, sadness, surprise and joy. Ekman acknowledges that most of the elicitors and expressions (“display rules”) of emotions are socially learned and can thus vary.
Ekman’s allusion to ‘variance’ is also interesting, of course. People modulate their behaviors quite substantially depending on social context: raising children is largely about trying to instill a framework for socially appropriate emotional displays given different social contexts. Modification of spontaneous physical expressions is part of enculturation, and cultures vary. Anthropologists like DeWight Middleton propose that different cultures have ‘emotional styles’, which guide not only what we consider to be reasonable feelings for any situation but also how to enact that experienced feeling. These ways of enacting feeling are part of what are called the ‘techniques of the body’, a concept introduced by Marcel Mauss in the 1930s (and published in English translation in 1973) to describe highly developed body actions that embody aspects of a given culture: how one walks, eats and emotes are learned through explicit instruction, through subtle approval/disapproval cues and through postural mimicry, and they can reflect one’s gender and class. Our physical expression of emotion, including our facial expression, is modulated by the company we are in, a problem for systems that claim to detect anything but the most extreme of emotional displays with any degree of universal reliability.
Mapping the physiological to the psychological to the social and cultural is important; however, to replace biological determinism with cultural determinism is not satisfactory either. Social constructionists assert that there are two kinds or classes of human emotions; one class is universal, has analogues in animals and can be seen in human infants (e.g., joy, sadness, fear), and the other is adult human emotions, which are culturally variable. Through enculturation, emotions lose any ‘natural’ or ‘spontaneous’ quality because they are mediated by culturally located social consciousness. Arlie Russell Hochschild’s work is a good illustration of this. Her position is that society and culture provide “feeling rules” and prototypes of feeling that focus on particular kinds of emotion. In the West we feel grief at funerals but we’re supposed to look happy at parties no matter how dull they are; of course, we are perfectly able to embrace other ways of behaving at funerals and parties should we choose to do so. In addition to modeling appropriate postures and behaviors as discussed above, people verbally communicate culturally sanctioned or appropriate reactions by saying things like “You must be delighted”, “Don’t fret over that” or “You should be absolutely incensed” when others narrate incidents. Adult emotions, Hochschild says, involve personal appraisal of things like attachment (to things, situations, people and outcomes), agency and approval/disapproval. Such appraisal involves other entities: for example, envy is a name for noting an attachment to or desire for something, and noting that another has it; sadness focuses on something or someone liked/loved that is not available. Hochschild addresses a range of complex emotional states including sadness and grief, nostalgia, depression, frustration, anger, fear, indignation, disgust, contempt, guilt, anguish, love, envy, jealousy, shame and anxiety.
Familiar to most adults, these emotions do not map easily to reductive concepts like arousal or positive/negative affect. Rather, they require that we consider the personal, social and cultural locatedness of feeling and emotion. Hochschild also introduces the very useful concept of “emotional labor”: the effort to try to actually feel the “right” feeling for a situation, and to try to induce the “right” feeling in others. Appraisal and emotional labor are not just about the current situation or the past; they also involve imagined futures. Personally, my breaking point in the phone tree debacle was when I realized I could not be relied upon to engage in the emotional labor of masking my irritation. As a result, my concern was that I’d say something I’d regret when a person finally came on the phone. Imagining the future self being embarrassed is a great way to restrain the inner jackass.
Walter Benjamin in Illuminations wrote “In every case the storyteller is a man who has counsel for his readers.” It’s true. My main message is not, however, ‘avoid phone trees at all costs’; that’s my secondary message. Rather, as content and communication tools proliferate and we design people’s experiences with and through those tools, it behooves us to know why and how people react emotionally and express, perceive and communicate their feelings, not just verbally but also pre- or para-verbally. We need to be sensitive to norms in different cultures, and to what happens when we design across them. And, while automated detection of arousal and affect is very promising for connecting with people who have little insight into their own feelings or cannot communicate their emotions easily, let’s please be wary of reducing the entirety of human emotional experience to simplistic, deterministic, electrochemical models.
Cycling back, to solve the phone-tree problem, my solution would not be to save up for an fMRI scanner, nor to measure my galvanic skin response, nor to have a webcam trained on my face to detect my expression. Without claiming any universality but knowing myself pretty well, I’d hook something up to listen to my voice, my intonation. Following the old saying, “It’s not what you said, it’s how you said it”, tone of voice includes the timbre, rhythm, loudness, breathiness and hoarseness of how something is uttered. Personally, I seem to follow the general rules: softer tones and pitches are associated with friendliness, higher tones and pitches signal upbeat and happy, and clipped and louder tones signal irritation. Suffice it to say that when I finally hung up the phone, I was not intoning in a kittenesque fashion, nor in a light, cheerful tone. I was gearing up to bark.
In their 1980 book Metaphors We Live By, George Lakoff and Mark Johnson suggest the body is directly reflected in the metaphors we use. They suggest that “our concept of anger is embodied via the autonomic nervous system and that the conceptual metaphors and metonymy used in understanding anger are by no means arbitrary: instead they are motivated by a physiology”. Feeling anger is “seeing red” because it reflects bodily reactions: when people get angry, as a result of soaring cortisol levels, they heat up and their faces often flush red.
 See Don Norman’s Emotional Design for a lovely exposition on this.
In these studies, people who are asked to make more complex financial decisions (for example, whether to gamble $1 for a 50% chance of winning $2.50) are primed with subliminal happy faces or sad faces. Those shown happy faces were more likely to choose the investment than people primed with angry faces.
“‘Wicked problem’ is a phrase originally used in social planning to describe a problem that is difficult or impossible to solve because of incomplete, contradictory, and changing requirements that are often difficult to recognize. Moreover, because of complex interdependencies, the effort to solve one aspect of a wicked problem may reveal or create other problems.” See http://en.wikipedia.org/wiki/Wicked_problem
Antonio Damasio (1994), Descartes’ Error: Emotion, Reason, and the Human Brain.
In his 1980 American Psychologist paper, “Feeling and Thinking: Preferences Need No Inferences”, Zajonc asserts “affect dominates social interaction, and it is the major currency in which social intercourse is transacted.”
 Neither Darwin nor Ekman nor their followers, I note, have managed to give us any advice on emotion recognition in the world of cosmetic modification. The American Society for Aesthetic Plastic Surgery reports over 2 million procedures in 2010 using Botulinum Toxin Type A (the active ingredient of products like Botox) to reduce facial lines. Eyebrow raising and frowning, clear signals of interest and affect, are not what they used to be when this research thread was first hatched.
DeWight Middleton, “Emotional Style: The Cultural Ordering of Emotions”, Ethos, 1989, American Anthropological Association.
I also note that, to date, the most successful biofeedback systems present motivated people with data that allow them to annotate and moderate their own behaviours; they don’t presume to mimic the careful collaborative choreography of emotions as we might achieve with a sensitive human friend. See research into the Quantified Self.
Arlie Russell Hochschild (1983), The Managed Heart: The Commercialization of Human Feeling. Berkeley: University of California Press.
 See also http://en.wikipedia.org/wiki/Arlie_Russell_Hochschild.
 If you are going to use one, refer to gethuman.com first; it’ll tell you what to do to get to a human operator in the most expeditious of ways.
I note that well-publicised affect-detecting companies like Affectiva (whose technology is based on research conducted in Rosalind Picard’s Affective Computing Lab at MIT) do not use voice, but rather focus on the reactions of the sympathetic nervous system and facial expressions. This is, in large part, because their technologies were originally designed for people for whom speech is difficult, e.g., people with autism, or babies. See http://www.affectiva.com/q-sensor/#videos.