MIT Media Lab
Chapter 3: Reconsidering Social Interaction for the Digital Realm
[images currently missing]
The intricate processes that comprise all social interaction are embedded in the underlying assumptions that can be made about the environment in which the interaction occurs. People learn to read and make use of the contextual information presented to them in the physical world. Yet, when they go online they inaccurately assume that experiences can be translated.
The architecture of the digital realm fundamentally conditions potential social interactions. Although designers and theoreticians have emphasized the metaphors that translate the physical to the digital, these metaphors are often inaccurate, if not deceptive. Architectural and spatial metaphors span the writings on cyberspace, suggesting that most aspects of the digital landscape can be compared directly to a physical replica. This metaphor is taken up in the spatial language that we use to discuss digital environments - chatrooms, websites, message boards, and portals all exist in the realm of cyberspace. Even the words that researchers use to separate the physical from the digital imply space: world, landscape, and environment. Yet, while these notions are sold for ease of comparison, they imply a set of architectural assumptions that are not applicable online. Thus, they mislead people into believing that they should act in a comparable manner and will receive the appropriate feedback.
Metaphors are one of the more effective means for people to build new conceptual models (Lakoff & Johnson 1980). This linguistic tool allows people to translate their mental assumptions from an understood concept to a new idea. Metaphors make the new concepts seem intuitive by relying on previously understood ones. Of course, this is only successful when the assumptions can be accurately translated. In the case of the digital realm, translating physical expectations to the digital world is problematic. In physical rooms, people expect a certain level of privacy and control over their words because their experiences have indicated that social interactions are ephemeral and the average interaction remains in the context in which it was presented. Online, information is archived by default; thus, what is said in one room might not be as fleeting and immobile as the speaker believes. This immediately creates a tension between the expectation that an individual has and the reality of the architecture.
Although experienced users understand that the metaphors do not map directly, the architecture gives off an entirely different impression. Harrison & Dourish (1996) argue that the difference is that of space versus place. When the architecture implies that the virtual place is located in a spatial metaphor resembling the physical one, the architecture is deceptive. Metaphors therefore do not necessarily need to be retired, but they must be supplemented with mechanisms for architectural awareness.
Without this awareness, being taken out of context can be quite disconcerting. In order to address this, designers should convey the social norms through the architecture. They should simultaneously inform users of the underlying differences while providing the tools for people to more comfortably interact online.
This chapter presents some of the underlying differences between the physical and the digital, focusing on those that impact social behavior. I focus on two main architectural differences that impact social interaction - the power of architecture and the lack of embodiment. In looking at these, i am interested in the ways in which they impact one's ability to derive context and the other social cues necessary for communication.
Underlying differences in architecture
The architecture of the Internet is code (Lessig 1999), which is composed of digital bits. Over seven years ago, Negroponte (1995) proselytized the notion that bits were not the same as atoms and thus must be treated differently. Shortly afterward, William Mitchell (1995) constructed an early critique of how the architectural differences would impact social interaction. Yet, even with such awareness, designers failed to inform users of these differences.
In the world of bits, many tasks are trivial when compared to their physical counterpart. Copying data is a core function of code; transporting bits over wires takes moments; altering data, images and text requires little effort and leaves few traces. Digital information is easily stored, manipulated, sorted and copied. Thus, most data that has passed through the Internet exists in many different forms on all of the systems through which it passed. While a typical conversation leaves nothing more than impressions in people's minds, online conversations are often recorded because of the nature of their passage. Whether they exist in email or on Usenet, this data is frequently archived, sorted and searchable.
Although it may seem advantageous to have historical archives of social interactions, these archives take the interactions out of the situational context in which they were located. For example, by using a search engine to access Usenet, people are able to glimpse at messages removed from the conversational thread. Even with the complete archive, one is reading a historical document of a conversation without being aware of the temporal aspect of the situation. As such, archived data presents a different image to a viewer who is accessing it out of the context in which it was created.
Digital archives allow for situational context to collapse with ease. Just as people can access the information without the full context, they can search for information which, when presented, suggests that two different bits of information are related. For example, by searching for an individual's name, a user can acquire a glimpse at the individual's digital presentation across many different situations without seeing any of this in context. In effect, digital tools place massive amounts of detail at one's fingertips, thereby enabling anyone to have immediate access to all libraries, public records and other such data. While advantageous for those seeking information, this provides new challenges for those producing sociable data. Although the web is inherently public, people have a notion that they are only performing to a given context at a given time. Additionally, they are accustomed to having control over the data that they provide to strangers. Thus, people must learn to adjust their presentation with the understanding that search engines can collapse any data at any period of time.
In the physical world, the public space still has boundaries; people are not performing for the entire world, across all time. They are performing in a particular environment and draw from the contextual cues of that environment. Online, when an individual performs for a particular chatroom, they make certain assumptions about who has access to their presentation. When these interactions are recorded, the conversation can be repositioned into a different context. Although recording is an inherent attribute of shared bits, the digital design does not inform the users as they have come to expect offline. Thus, people are still startled when public presentations reappear elsewhere. The history of Usenet provides a clear example of the social impact of collapsed contexts.
Usenet: an example of destroyed context
In the 1980s, most people who had access to the Internet were either associated with universities or corporations. Many of these people regularly participated in conversations on Usenet, an asynchronous threaded messaging system that was available to everyone. Usenet was divided into topical groups, which represented many of the interests of these people and thus spanned an extensive range of topics. Yet, while there was diversity of interest, there was still an assumed homogeneity to the participants; it was not until 1992 that an AOL user posted to Usenet (Google 2001). Posters often knew each other and were equally familiar with the digital terrain.
Posters knew that they were posting to public forums and that anyone who had access could read their posts. Perhaps a little bit of hindsight makes it seem obvious that the Internet could one day be comprised of most people and that those posts would be permanently archived and reassembled with search engines. And perhaps those posters should have had that foresight, but many of them did not. People posted messages with a particular thread and group in mind, having a full understanding of who tended to post to that forum. They generally assumed that most readers had some vague interest in the topic at hand, but that their message was always read with the other messages and the thread for context. People often expected that their messages would last for a few months, as they routinely saw old messages fade away from their server. Posters had a sense of interpersonal and situational contexts, derived in part by assuming that it was like any group meeting space, where some people were vocal and others remained anonymous in the background.
Yet, as time marched on, the masses jumped on the digital bandwagon and started to participate in all of its forums. Usenet grew rapidly; new groups were added; old inhabitants left; and the culture of the groups changed over time. In 1995, DejaNews was introduced as a searchable archive-based interface to Usenet; in 2001, Google acquired DejaNews and expanded the archive to 20 years' worth of posts.
Suddenly, with a few keystrokes, millions of grouped postings could be condensed into those that pertained to a given keyword. Perhaps ideal for searching for answers to questions, this interface quickly removes any of the original context in which the post was created. While the date and links to the thread are included beneath an excerpt of the message, the interface allows you to automatically browse these messages out of temporal or group order. Although messages were created within a particular context, it is not necessary to know anything about that space to browse the messages. Nothing distinguishes the posts of one group from those of another, one time from another, or one individual from another.
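The flattening described here can be illustrated with a minimal sketch. It is not a model of how DejaNews or Google actually index Usenet; the `Post` record, the sample archive, and both function names are hypothetical and the data is invented. It only shows how a keyword match returns message bodies stripped of the group, thread, and date that originally framed them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """A hypothetical archived message with its original context attached."""
    group: str
    thread: str
    author: str
    year: int
    body: str


# Invented sample data, for illustration only.
ARCHIVE = [
    Post("rec.pets.cats", "kitten won't eat", "alice@example.edu", 1989,
         "Try warming the food a little."),
    Post("alt.flame", "re: you are all wrong", "alice@example.edu", 1991,
         "This is the dumbest food argument yet."),
    Post("rec.food.cooking", "bread help", "bob@example.com", 1994,
         "My food processor overworks the dough."),
]


def keyword_search(archive, keyword):
    """Return only the matching message bodies, dropping all context."""
    keyword = keyword.lower()
    return [post.body for post in archive if keyword in post.body.lower()]


def keyword_search_with_context(archive, keyword):
    """Return full records, ordered by group, thread, and year."""
    keyword = keyword.lower()
    hits = [post for post in archive if keyword in post.body.lower()]
    return sorted(hits, key=lambda p: (p.group, p.thread, p.year))


flat_results = keyword_search(ARCHIVE, "food")
contextual_results = keyword_search_with_context(ARCHIVE, "food")
```

In `flat_results`, a remark made in alt.flame sits beside advice from rec.pets.cats with nothing to tell them apart; `contextual_results` keeps the group, thread, and year that a reader would need in order to interpret each message.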
Without knowing the context and history of a given newsgroup or individual, or the social norms of a given time period, messages can be easily misinterpreted. If a search for an individual shows postings from rec.pets.cats and alt.flame, and the searcher is not aware that angry postings are expected in the latter, the poster might easily be perceived in a negative light. Without knowing the context of the space, people do not know how to assess the specific social norms separate from a general view of normative behavior. Even a date-based search for my advisor, Judith Donath, suggests that the two most related groups to her are rec.arts.books and rec.autos.antiques; without knowing the group or the information being discussed, one might easily misinterpret what these "related groups" mean.
Usenet highlights the contextual problems associated with digital data. Although users post-1995 were not told about DejaNews, many were aware that it existed. By being aware, users were able to adjust their presentations to accommodate for the potential collapsing of contexts due to the change in architecture. Prior to that, many users lacked the assumed foresight; they did not anticipate these conditions. The architecture made archiving possible, but posters did not predict that their messages would continue to persist and impact their interactions years later. Although almost everyone concedes that posts were public, the notion of public in the physical realm does not mean persistent across all space and all time. When a twelve-year-old states an opinion to a group of strangers in a public park, it is not assumed that this will be quoted out of context in a job interview fifteen years later. Likewise, it is not certain that society should require that level of accountability for past statements; even the credit bureau forgives an individual after seven years.
As massive quantities of Usenet data are aggregated, it is not surprising that researchers analyze it. While most of the analysis results in academic papers, Microsoft's Netscan (Smith 2001) provides a tool for users to see the resulting statistics about a given newsgroup, a given conversation, or a given person. While this data helps users gain perspective about the various groups and people, it can also be socially problematic. The statistics about groups are not put in a given context. If a group has 50 active members, is it more like 50 people in a football stadium or 50 people in a bedroom? Without having to know anything about the context in which posts originated, one can explore statistics on anyone's posting habits. What does it mean when someone posts messages to which there are no responses? Does this mean that the person is quite knowledgeable and is answering a question or that everyone would prefer to ignore this individual? Usenet comprises lots of different types of social norms. As discussed in the next section, presenting statistical data can be problematic.
Digital architecture provides different social cues
While Google provides a mechanism for collapsing contexts in Usenet, it also provides a means for people to instantly access extensive information about others throughout the world. This tool has both advantages and disadvantages. On one hand, having access to data about others informs the curious individual, as is noted by those who scour the Internet for personal information on potential dates (Schoeneman 2001; Rosen 2000: 199). Search engines allow people to sift through data to get a glimpse at someone of interest in order to evaluate potential connections. At the same time, this information can be misleading or inaccurate, thereby misinforming the individual. Perhaps the data is from an untrustworthy source or does not represent the individual in the current situation. Or perhaps the data reveals information about someone with an identical name. When people acquire information online, they are not aware of the validity of their sources. Even if extensive, accurate information about an individual were to exist, users are not likely to read it all. With a limited sample, impressions may be inappropriately distorted.
Not only is the reader disadvantaged by not having the tools to properly evaluate the information, but also the subject lacks the ability to control their representation. As information is archived, it is also difficult for a subject to correct inaccuracies, let alone adjust potential impressions. With such data available, it is difficult to resolve old issues and one must be prepared to justify their past continuously. Such records are problematic, as they require people to "live their lives knowing that the details might be captured by a big magnifying glass in the sky" (Lee 2002).
In the archives of the digital world, the records of heated flame wars and other digital mistakes remain persistent. For some, this is a source of anxiety, shame and embarrassment. In the midst of my research, i received an email from someone who wanted to know if i had any solutions for purging old data:
I had a rather bad public battle and due to being outnumbered by a bunch of jerks, I was made to look VERY bad many years ago and these same individuals feel the need each year to rehash the past and keep this wound open and painful to me, and I have no way of getting these "crap" purged. (Anonymous 2002)
Past posts are consistently part of a user's digital present in ways not comparable to the physical domain. Slander and gossip are archived, but the subject has no recourse for adjusting this data. In such incidents, people feel misrepresented and powerless.
Not only must one accommodate for their historical presentations, but they must also be prepared to deal with the quantitative data that is produced to represent them. For example, sites such as eBay can tell you about a user's reputation through a set of numbers. This simplification might make sense when you are evaluating a user's reputation in one particular context (i.e. as a capable seller), but if reputation scores are calculated across different behavioral contexts, as is proposed by Microsoft Research (Smith & Fiore 2001), this could have tremendous social consequences. Using author profiles to evaluate someone's reputation, as one number based on 21 years of Usenet data, can be quite confounding. How do these reputations accommodate for context? Is one's post to alt.flame acceptable if it is not a flame? What value does verbosity have in evaluating an individual's worth? Do people have any say in how these statistics are used to represent them?
When a system presents reputation data, it alters the social structure. Thus, the design of such systems must be handled delicately. Researchers at AT&T accidentally discovered this problem when they equipped a chat bot with the ability to tell people various social statistics (Isbell et al. 2000). Although intended as a friendly feature for people to understand their own statistical behavior in relating to others, people quickly used it as a method for seeing how valuable they were in their friends' social network, which created tremendous social tension. Numerical representations rarely convey the nuanced details of a situation, leaving room for abuse and misrepresentation that is more destructive than helpful.
Although the intention is to provide meaningful feedback, this is only helpful when it is representative and accurate. Inaccuracies come not only from mistakes but also from those who abuse the system. As these systems impact those that they represent, it is important that methods of recourse exist whereby users can challenge the results. The United States government recognized this need and drafted the Code of Fair Information Practices, which mandates transparency of governmental data with an explicit recourse protocol (Garfinkel 2000: 7). Limiting an individual's ability to control their representation is problematic. With identity theft on the rise, systems that emphasize scores for privileges, but provide no accountability, are open to harmful abuse.
As these problems are inherently architectural, users have two choices: either learn to manage with these systems, or demand that designers adjust the systems to meet the needs of the user. Although i argue primarily for the latter, it is also essential for users to be aware of the current structure and act accordingly. In order to encourage awareness, system designers should provide behavioral and systematic feedback that conveys the norms of the environment. This is important, as system interfaces affect not only a user's ability to derive context, but also their ability to present their identity.
The value of embodiment
Communicative performance typically utilizes the subtle nature of one's body. People know how to utilize their bodies to convey nuanced details and attitudes, and to otherwise affect the tone of any verbal cue. Through experience and mental models, people know how to read those subtle cues and evaluate people's bodies. Yet, online people must operate through a different medium. They project their ideas through the computer interface and perceive the output that the computer provides. Social interactions are limited by what people can convey and perceive in the mediated space. In current systems, both the performer and viewer have limited channels for expression and perception. Thus, much is lost in digital conversations; attempts to convey intention can be frustrating.
The spatial qualities of digital environments are devoid of meaning or functionality. If there is any decoration in the space, it is in the form of digital wallpaper or images that are supposed to mimic physical objects, such as graphical beer glasses. These items have no use in the digital environment; people cannot actually drink from a glass online. Additionally, the decorations are not tied to any fundamental aspects of the space, regulated by market forces or usage. A digital Van Gogh has no value. Digital decorations represent what the space wants to convey, not necessarily what it is. While these decorations are not particularly helpful, most online spaces lack even that level of spatial cues, relying simply on the digital equivalent of a piece of paper as the interface for interaction. None of these environments are affected by previous usage; history is told in logs, not through the effects on the space. Yet, in the physical world, the marks on the floor, the scratches on the table and the aging of the wallpaper convey subtle details that people evaluate in assessing a space. Online, everything always appears untouched. Unlike the physical realm, digital environments show no information about temporality, do not change over time through interaction, and do not communicate their history.
Online, we are unable to see much of the interpersonal context cues - how many people, common characteristics of the people, fashion statements, gender, age, activities, etc. Yet, by quickly glancing at a physical crowd, one can easily determine these as well as what the social norms are, and how many people are abiding by various sets of rules. Crowds online are invisible. Body language cues and facial expressions are missing. What remains is a set of textual descriptions and expressions, with perhaps a graphical representation of oneself or a collage-like homepage that indicates manually articulated aspects of one's presentation. In order to detect crowds, people try to make meaning out of the download speed of websites or the tickers on websites to indicate visitors (Xiong & Brittain 1999). Online, people are given limited signals, and those are often inaccurate or inadequate for people to properly develop their mental models.
Recognizing conventional and assessment signals
People rely on the signals that others exude in order to assess their identity presentation. Yet, for those signals to be meaningful, observers must be able to determine the validity or relevance of a given signal. Assessment signals, which are costly to possess, are quite reliable (e.g. a person with large muscles can be reliably perceived to be strong). Conversely, people can present conventional signals with little effort, but they are far less reliable (e.g. wearing a T-shirt that says one is strong is far less meaningful than possessing large muscles) (Zahavi 1997). The signals that people present must be evaluated for both their meaning and their reliability, for if someone is to challenge a signal, it is important to understand how reliable that information is.
Because of their reliability, assessment signals are far more desirable for the presenter and the reader. Yet, they are far more costly to possess and maintain. Online, people present themselves primarily through text. Physical signals, such as one's strength, must be converted to textual statements, thereby converting assessment signals into conventional ones. Yet, just as the reliability of the signal is decreased, so is the likelihood of harm when challenging the signal. Different forms of assessment signals evolve online, such as an email address at a prestigious domain or certain types of public archives. Both online and off, assessment signals require time and complexity to develop and present. Yet, online, conventional signals typically evolve from the documentation of assessment signals being challenged, rather than just existing as an end result. Because of the amount of time necessary to evolve assessment signals online, people constantly interact with conventional signals, which must be challenged or accepted despite the low level of reliability. As a result, deception runs rampant, as people are too likely to trust the signals that they are given, particularly those that refer to the body (sex, age, race, etc.), which are rarely challenged offline (Donath 1999).
While text does provide some information about one's identity, it is not nearly as rich as the detailed information that one conveys through body and fashion. Online, minimal information can often be harmful, as coarse data requires that people interpolate from missing information in order to build their mental models (boyd 2001). This approach is particularly problematic because people are not likely to reevaluate their initial impressions (Aronson 1995). When engaging with another in social environments, people construct a mental image of that other person, even if the only information that they may have is data such as 21 years old, white, female with blonde hair. Should their mental image resemble Britney Spears, they are most likely going to be wrong, resulting in an uncomfortable dilemma for both parties. As people read one's performance in relation to their mental image, conversation subtleties may be inaccurately perceived. Such is the case when people inaccurately assume someone's sex (O'Brien 1999).
Embodiment provides both social cues as well as a mechanism for people to properly present themselves; by not providing this information, the digital world fails individuals. This results in a slew of peculiar interactions, fundamentally due to a failure to properly communicate.
Regaining context through account maintenance
Inadvertently, users have formulated new behaviors for managing context online. As data is primarily collapsed through one's name or email address, people create multiple accounts and associate particular accounts with particular contexts. The most obvious example of this is the separation between work and personal email addresses. By managing multiple accounts, people are able to regain some control and privacy. In doing so, they are also formulating a new paradigm for conceptualizing context - localization.
Maintaining multiple personas online satisfies many goals for the digital individual. In the early days of MUDs and MOOs, people regularly explored their identity by playing with different online personas. Because people chose to fragment their social identity, digital researchers such as Sherry Turkle (1995) and Sandy Stone (1998) saw this play as indicative of a postmodern, fragmented self. Yet, the play in which people engaged simply gave them the ability to reflect on, experiment with, and process their own identity. Fragmented social presentations online provide even greater flexibility for the multi-faceted individual, as they allow people to walk through common spaces presenting different aspects of themselves rather than being required to maintain one persona per space, as is necessary offline (Reid 1998: 37). While role-playing is fascinating, it is only one of the motivations behind maintaining multiple accounts.
Seeking privacy or segregation of lives, people maintain multiple accounts that represent different facets of their internal identity. In the realm of Usenet, this allows the user to use one account to discuss topics related to programming and one to talk about recreational interests. As an alternative to anonymity, this allows people to build reputations and friendships while only revealing particular aspects of their identity. So long as people maintain a strict boundary between accounts (i.e. not providing one's name or other identifying information), this provides a barrier when archives aggregate across or allow access to data through individual identification.
By maintaining multiple accounts, users associate context locally. In other words, rather than adjusting one's presentation according to the situation or current population, one can maintain an account that represents a specific facet and present oneself through that. In doing so, people take their internal facets and create external representations for them. Thus, faces function directly from externalized facets, or accounts, rather than through the individual themselves. When reading for situational and interpersonal context information, people assess which facet should be associated with that interaction and use it exclusively. As one moves from one ephemeral context to another, one simply switches accounts or facets. Thus, when one logs into one's work email, one knows that one is presenting the work face uniformly through this account.
In doing so, people have started a new paradigm of social interaction online. Although this may initially appear peculiar, multiple email addresses/handles fill a desired void of the digital realm - the ability to manage the given context. They have minimized the collapsed contexts by maintaining the contexts locally; thus, what is aggregated is done so across a particular facet instead of a particular individual. Of course, people maintain a varying number of accounts and they differ as to how strictly they segregate their different facets. People's consciousness of this behavior is often dependent on how much they feel it is necessary to maintain segregated facets.
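The difference between aggregating across a facet and aggregating across an individual can be shown with a small sketch. Everything in it is hypothetical: the account names, the `OWNER` mapping, and the `aggregate` helper are invented for illustration, and the mapping stands in for whatever identifying information might link two accounts to one person.

```python
from collections import defaultdict

# Invented sample data: (account, newsgroup) pairs for one person
# who keeps a work account and a personal account.
POSTS = [
    ("dev@work.example", "comp.lang.c"),
    ("dev@work.example", "comp.unix.programmer"),
    ("hobby@home.example", "rec.pets.cats"),
    ("hobby@home.example", "alt.flame"),
]

# Hypothetical link between accounts and their owner. An archive only
# has this if the owner reveals it or a single sign-on system imposes it.
OWNER = {
    "dev@work.example": "pat",
    "hobby@home.example": "pat",
}


def aggregate(posts, key):
    """Group the newsgroups of all posts under whatever key(account) returns."""
    profile = defaultdict(set)
    for account, group in posts:
        profile[key(account)].add(group)
    return dict(profile)


# Aggregating per account keeps each facet's context local.
by_account = aggregate(POSTS, key=lambda account: account)

# Aggregating per person collapses every facet into one profile.
by_person = aggregate(POSTS, key=lambda account: OWNER[account])
```

`by_account` yields two separate profiles, one per facet, while `by_person` merges programming groups and alt.flame into a single profile for "pat". The segregation holds only as long as nothing like `OWNER` is available to the aggregator.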
While such control mechanisms work as a substitute for the failure of digital context, they are only temporary bandages for a larger problem. These separations will be collapsed in the near future: accidentally or maliciously by those who want to reveal people online, through new technological advancements, or systematically by initiatives such as Microsoft's Passport. Managing separate facets is neither convenient nor intuitive; thus, only those with the greatest need put forth the effort to segregate their facets.
As is discussed in more detail in Chapter 5, Passport encourages users to maintain only one account. It is in the market's best interest that a user be unable to present facets, for marketing purposes as well as control. Thus, people's motivation to start regaining context in a unique way suggests the importance of such behavior in digital interactions.
With information fundamentally missing, people are trying to find new ways to make sense of their interactions and regain awareness and control over their presentation. As people will inevitably adjust to the architectures that they are given, the goal is not to eliminate the possibilities that are afforded by the underlying potential of digital environments. Instead, designers must recognize what users are trying to do and provide them with the tools that will make it easier.
First, people need self-awareness. They need to understand their representation and role in digital interactions. While others see their presentations and have immense data about them, people are not often aware of the traces that they leave behind. Without this awareness, control seems impossible. Thus, people react to the problems without having an idea of how to stop them from occurring.
Awareness is necessary at both an individual and group level. People must be aware of the group as a whole, what the norms are and how other people are behaving. They must be aware of reactions as well as presentation, people as well as the virtual space. They must be aware of the contextual information that surrounds them. Without this awareness, people act in a disinhibited way, suggesting that increased awareness will result in increased self-regulation (Joinson 1998: 51).
Awareness is the first step for people to be able to manage their presentation and identity online. In addition, they need management tools to properly organize and present themselves. As they are not able to present their bodies, they need tools that will allow them to represent their digital equivalent, often facets of their identity as opposed to their whole being. By managing their facets, these tools should allow users to present faces as they see fit.
Awareness and management provide feedback that makes an environment more socially comfortable, as they provide information that people use to present themselves. Such information also provides users with some of what they need to self-regulate. By enhancing digital environments with desired channels for feedback and control, designers can empower users and create the environment for more fluid social interactions. Thus, the remainder of this thesis focuses on what is necessary to provide such awareness and identity management.