Faceted Id/entity: Managing representation in a digital world

danah boyd
MIT Media Lab
Master's Thesis

Chapter 4: Self-Awareness in Social Interactions

Awareness empowers individuals, as it gives them the ability to understand their position in a given system and use that knowledge to operate more effectively. In social interactions, people want to be aware of their own presentation, of what is appropriate in the given context, and of how others perceive them. In the physical world, this awareness comes relatively easily, as people know how to derive meaning from the information conveyed by their bodies and those around them. In daily interactions, people are aware of their presentation: they know what they are wearing, they have a sense of their facial expressions, and they can easily comprehend the reactions presented by others. Yet online, people produce immense quantities of data about their identity and behavior without an awareness of what that data is, let alone what it represents. People do not have the tools to be aware of their presentation online. Likewise, they are unable to gain access to the implicit data produced by others. Yet, these two components are essential for interpersonal contextual awareness.

Context awareness is a fundamental concern of the ubiquitous computing community, as awareness is necessary for interaction. Yet, much of the research in this area focuses on revealing environmental factors that the system can sense, including functional qualities of the space and quantitative interpersonal information such as presence. As exemplified by Anind Dey's work (2000), context-awareness in ubiquitous computing focuses on revealing external activities to the user. Although environmental awareness is essential, it is also necessary to have self-awareness. Users must not only be aware of the environment, but also of themselves within it.

In this chapter, i begin by discussing the data that people produce online and then highlight current approaches to social awareness by addressing various systems. After critiquing different approaches to awareness, i prescribe a tool that attempts to provide awareness as a user interacts with various systems.

Considering the data that individuals produce

As i have already discussed, digital data is inherently archivable. This means that systems are able to track anyone's digital habits, including what websites they visit, who they email or instant message (IM) and when, what forms they fill out, and when they are online. Any data sent over the network, whether intentionally or unintentionally, can be archived and used to help represent an individual's behavior. Some data is produced intentionally, such as the messages that someone writes in an IM window. Other data is archived by servers without explicit consent from users, such as the footprints that one leaves when exploring websites. While messages are freeform in their structure, people also relay structured data such as the profiling information required by many websites. Whenever they go online, people produce immense amounts of data about themselves without even realizing it (Behr 2002):

External logs (web, IM): login time, duration, files accessed, referring website, people contacted
Personal ISP logs: time/date/duration, tracked web contact access, email messages
Profile data: age, sex, address, email, occupation, industry, income
Affiliations: website/email domain
Personal website content: links, interests, bio, photos
References to an individual on external websites
Message content: email, chat, IM, SMS, Usenet/bboard posts, journals/blogs
Sharable data: MP3s and other files
Social networks: IM, email, chat, Usenet/bboard
Presence data: IM buddy lists, Outlook calendars
Shopping habits, browsing habits, recommendations
Reputation as buyer, seller, advisor
Archives of data over time: conversations, websites

Unlike the data that one typically produces in the physical world, all of this data is stored and can be accessed with relative ease. Currently, this data is not centrally located; each server logs an individual's behavior on that machine alone. Unless a user only uses one machine, a complete set of data is not maintained locally either. Since most users are not aware of the unintentional data that they produce, they are unlikely to store the aggregated data.

The aggregation of this data is quite powerful in helping construct a complete image of an individual, as marketing companies and corporations such as Microsoft (Passport) have already recognized. While external systems are working to reconstruct people through their data output, individuals are not even aware of this data, let alone how it could be perceived and used. Although i do not condone most of the corporate goals with regard to this data, i believe that the mechanism for empowering individuals starts with giving them access to this data in a meaningful way, entirely for their personal use.

For data to be comprehensible, users need more than access to the raw data. While one's browser history is quite interesting, logs are not intuitive. Telling a user that they spent over half of this week's web time at online bookstores is far more meaningful. With the quantity of information available, it needs to be distilled and encapsulated in order to be comprehensible. Doing so requires tools that are intended for this purpose.
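The distillation described above can be illustrated with a minimal sketch. It assumes a hypothetical log of (site category, minutes) pairs; the categories, durations, and function names are invented for illustration and do not come from any real logging system.

```python
from collections import defaultdict

# Hypothetical browsing log for one week: (site category, minutes spent).
# Both the categories and the durations are invented sample data.
visits = [
    ("bookstore", 40),
    ("news", 25),
    ("bookstore", 35),
    ("webmail", 20),
    ("bookstore", 30),
]


def summarize(visits):
    """Return (category, share of total time) pairs, largest share first."""
    totals = defaultdict(int)
    for category, minutes in visits:
        totals[category] += minutes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    shares = {c: m / grand_total for c, m in totals.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


def headline(visits):
    """Condense the raw log into one sentence a user can take in at a glance."""
    summary = summarize(visits)
    if not summary:
        return "No web activity was recorded this week"
    category, share = summary[0]
    return f"{share:.0%} of this week's web time was spent on {category} sites"


print(headline(visits))
```

The raw list of visits corresponds to the unintuitive log; the single sentence produced by `headline` is the kind of encapsulated statement that a user can interpret without reading the log itself.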

Tools for creating awareness

Awareness can either be provided post-facto or integrated into the application. While the former provides reflection, the latter is more desirable as it allows the user to immediately respond to the information. Yet, most research focused on self-awareness deals with post-facto data both for simplicity and because of a lack of access to application source. This work focuses on revealing underlying patterns to the user, quite often through social visualizations.

In order to give the reader a sense of the different approaches, i have selected a sample of awareness tools and offer a brief analysis of their strengths and weaknesses. These tools focus on making social information available to the users, yet the information is not always simply about them; often it is about their relationship to others and to groups. The pieces that i have chosen as examples either emphasize making the raw data accessible or use the data to convey more generalized notions of the people in their space. Much of the work that i address comes from Sociable Media, my own research group, as we continue to be the dominant group working on social visualization.

Making data accessible: Netscan & Blogdex

Although Usenet data is public by nature, it is difficult to ascertain what the trends are within these environments. Netscan (Smith 2001) captures and processes Usenet data and makes it available to the public in the form of statistics - how many people, how regularly do they post, how often do people reply, what groups does an individual participate in, etc. This data provides a digital portrait of users and groups through their statistical habits.
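As a rough illustration of the kind of statistics listed above, the following sketch computes them over a handful of invented posts. The record layout and function names are assumptions made for this example; they are not Netscan's actual schema or implementation.

```python
from collections import Counter

# Hypothetical posts: (author, newsgroup, is_reply). Invented sample data,
# not Netscan's real data model.
posts = [
    ("ana", "rec.music", False),
    ("ben", "rec.music", True),
    ("ana", "rec.music", True),
    ("cat", "comp.lang", False),
    ("ana", "comp.lang", True),
]


def group_stats(posts, group):
    """Statistical portrait of one newsgroup."""
    in_group = [p for p in posts if p[1] == group]
    authors = Counter(author for author, _, _ in in_group)
    replies = sum(1 for _, _, is_reply in in_group if is_reply)
    return {
        "posters": len(authors),                      # how many people
        "posts": len(in_group),                       # total activity
        "reply_ratio": replies / len(in_group) if in_group else 0.0,
        "posts_per_author": dict(authors),            # how regularly each posts
    }


def groups_of(posts, author):
    """Newsgroups in which a given individual participates."""
    return sorted({group for a, group, _ in posts if a == author})


print(group_stats(posts, "rec.music"))
print(groups_of(posts, "ana"))
```

Numbers such as the poster count and reply ratio are easy to produce; what they mean to a participant is a separate question.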

Blogs are quickly emerging as a trendy way to share information with others on the web, as they let people post links to interesting sites and comment on others' posts. While one person's blog is quite interesting, the phenomenon as a whole is even more fascinating, particularly looking at what is fashionable to post, who links to whom, and how rapidly the trends change. By analyzing as many blogs as possible, Blogdex (Marlow 2001) provides a tool for people to see what the trends in blogging are, how their blog relates to the habits as a whole, and what their relation to other bloggers is.

Both Blogdex and Netscan provide a mechanism for aggregating data and conveying information that is often obfuscated, so that users can see their habits within the larger system. Yet, they do not pull out the meaningful trends or convey what the statistics might mean to the user. For example, while Netscan lets a user see that a particular group has 50 active members, this data is most likely meaningless to the user. Even when compared to other groups, the user cannot easily determine if 50 people in a space suggests that the room is more like an empty football stadium or a living room. Additionally, trends are relative. When Blogdex was reported on by the BBC, over 200 Farsi blog owners added their blogs to Blogdex. As a result, the rankings quickly changed due to the increased Farsi traffic; thus, it was clear that the rankings are only appropriate for the types of blogs that have been added to Blogdex. At the same time, it is difficult to get a sense of what types of blogs have been added, and which have not. In both systems, determining the trends or the meaning of the data can be challenging.

While these tools fail to make the leap between the data and their value, they are particularly noteworthy because they take the first step in making otherwise uncollected data accessible in unique ways. By using them, the curious and thorough user can develop their own intuition about the environment by scouring the statistical data for meaning. While these tools currently provide the equivalent of a well-structured system log, they present the most salient statistics for environments that are otherwise not accessible.

Visualizing statistical data: PostHistory & Live Web

Focused on revealing an individual's behavior, PostHistory (Viégas 2001) was developed to give users a sense of their email habits. Drawing from one's email archive, the system analyzes the data to understand who converses with whom, when, and how often. PostHistory conveys this statistical data in an elegant and compelling visualization where users are able to easily see information such as which people write to them the most, what the relation between time and people is, and how often they receive personal messages versus group messages. While the current implementation is graphically compelling and legible, it only provides the essential data and makes no attempts to evaluate it for the user. This is both a strength and a weakness, as users are encouraged to reflect on their own behavior yet they are unable to delve into the data to understand its contextual relevance, partially because PostHistory does not allow users to access the underlying message data. For example, just because an individual sends the largest quantity of messages does not mean that they are that valuable to the user; each message may consist of only a few words or might be solely associated with a listserv. Without being able to see why the individual was rated so highly, the information may be misleading.

When people surf websites, they are sharing the space with others, yet this aspect of social awareness is difficult to perceive other than recognizing that a site is slow. Live Web (Xiong & Brittain 1999) visualizes the data traces that each server maintains about visitors. Thus, users can see who else has recently visited a website and what path they took as they followed various links. This type of a system makes the social aspect of system logs accessible to the public, letting them get a sense of interpersonal context. While people are able to observe one another within one site, they are not able to gather more information about the people or follow them outside of the particular site. Live Web does not provide enough information for anyone to be more than simply an intimate stranger, as it does not provide the motivation or detail for people to communicate with one another.

Developed in Sociable Media as tools for visualizing inaccessible data, both PostHistory and Live Web focus on revealing the underlying logs of the data. They do little to imply information about the user in relation to others or the community itself. Yet, by making the designs so compelling, they reveal data in a meaningful way, thereby offering the first step in providing users with the knowledge necessary to understand the social behavior around them.

Impression-driven visualizations: Loom2, Visual Who & Social Network Fragments

In Loom2 (boyd, et al. 2002), Hyun-Yeul Lee and i began exploring how a visual language could be developed to convey the socially salient features articulated by Whittaker, et al. (1998) that Netscan (Smith 2001) exposed. By trying to understand the relevance of social data to the user, we created a series of artistic and computational sketches that allowed people to interactively explore different aspects of Usenet environments. The Loom2 project focused on a series of sketches and designs that explored different aspects of information presentation, including some that were too complex to fully integrate into current systems. For example, we recognized the power of text in serving as both a functional mechanism for gathering meaning about the message as well as a beautiful form that could convey underlying intentions. Thus, one aspect of Loom2 was to explore how glyphs could be animated with motion to personify the textual individual. Although this was only done through a handful of interactive prototypes, we recognized the power of conveying impressions as well as meaning.

Loom2 started to reveal the importance of giving people multi-layered data, such that visual information could help them create quick meaningful impressions, but also provide them with the detail necessary to explore the actual raw data at a lower level. The value of an interactive visualization system is that it draws both on the power of visual cues as well as layered information, or what Ben Shneiderman (1987) refers to as the Visual Information-Seeking Mantra: overview first, zoom and filter, then details on demand. Loom2 recognizes that people need more than just simple access to data - they need to understand how data relates to them and how they relate to others. By approaching this issue through design, we began to develop a visual language that focused on providing data awareness by relying on cues that people understand, including aspects of motion, color and graphical layout.

Visual Who (Donath 1995) is an interactive visualization of mailing list and other group/member data. By interacting with the system, users are encouraged to comprehend highly dimensional data about their relationship to groups based on the stereotypes of the members of those groups. Thus, Visual Who offers users a tool for comparing themselves to the group, where the groups' value is based on the external activities of all its participants. Closeness does not suggest that an individual is interested in the associated group; merely, it suggests that the individual has much in common with the members of that group. For example, the system strongly associated a Media Lab professor with skateboarding; he was not even remotely interested in skateboarding. As many of his students were skateboarders and he had a lot in common with them, he became associated with that group. One of the problems with this piece is that users can easily mistake the feedback they are receiving as indicative of their relationship with other people. When the system positions two people nearby, it simply suggests that the individuals have the same relative pull to the groups present. Thus, the only thing that they have in common is the same tie ratio.

In order to provide users with an awareness of the structure of their social networks via email interactions, Jeff Potter and i developed Social Network Fragments (SNF), which is detailed in Chapter 7. By analyzing email behavior, we associated a value for different types of email relationships based on how much they indicated an awareness or knowledge of others on a similar message. For example, when a user sends a message to two different people and blind carbon copies another, what can we say about the various ties in terms of how well the people know one another? By assigning a value to each of these ties, we developed a language for quantifying the weight of two people's relationships. Using this, Social Network Fragments visualizes the complete graph of people's relations with one another, focusing on conveying the structure of the social network.
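The idea of valuing ties by what a message's headers reveal can be sketched as follows. This is only an illustration under stated assumptions: the two weights and the scoring rules are hypothetical, chosen to show that a blind carbon copy creates asymmetric awareness; they are not the values or the algorithm used in Social Network Fragments.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical weights, invented for this sketch.
SENDER_KNOWS = 1.0    # the sender deliberately addressed this person
RECIPIENT_SEES = 0.5  # a recipient can see this person in the headers


def awareness(messages):
    """Accumulate directed scores: score[a][b] is evidence that a knows of b."""
    score = defaultdict(lambda: defaultdict(float))
    for msg in messages:
        sender = msg["from"]
        visible = msg.get("to", []) + msg.get("cc", [])
        hidden = msg.get("bcc", [])
        for recipient in visible + hidden:
            score[sender][recipient] += SENDER_KNOWS
            score[recipient][sender] += RECIPIENT_SEES
        # Visible recipients see one another in the headers.
        for a, b in permutations(visible, 2):
            score[a][b] += RECIPIENT_SEES
        # A blind-copied recipient sees the visible ones, but not vice versa.
        for h in hidden:
            for v in visible:
                score[h][v] += RECIPIENT_SEES
    return score


# One message sent to two people with a third blind carbon copied.
messages = [{"from": "ana", "to": ["ben", "cat"], "bcc": ["dev"]}]
scores = awareness(messages)
print(scores["dev"]["ben"], scores["ben"]["dev"])
```

In this sketch the blind-copied person gains evidence of knowing the visible recipients, while they gain none about that person, which is the kind of asymmetry the question in the paragraph above points to.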

All three research pieces focus on providing impressions by constructing a legible social landscape, as described by Donath (1996). By developing a language for relating people and information, both Social Network Fragments and Visual Who offer users an interactive interface in which to explore the social information that the system derives from the data that they produce. As the information that they convey is impression-driven, these systems are bound to be misleading at times. In Visual Who, users often mistake the graphical distance between users to be meaningful, while the only meaningful relationships are between the people and the various groups. Likewise, in SNF, the clustering algorithm can collapse dimensions in a way that places unrelated people near each other on the two-dimensional surface.

As was recognized by the Loom2 project, conveying impressions is a delicate process and the mistakes extend beyond just readability. Not only must researchers concern themselves with how the data is analyzed, but they must also take these qualitative values and convey them as impressions on the screen. Thus, there are bound to be errors in both steps. Yet, this approach is also important, as it is impressions that people want, not simply a vast quantity of unanalyzed data. Even the imperfect impressions that are conveyed by Visual Who and Social Network Fragments are quite compelling, because they are providing insight that is otherwise inaccessible to the users.

By giving people access to both data and the possible connotations that can be drawn, people are able to see a different perspective on their behavior. This awareness provides cues that may not be fully accurate, but neither are most impressions in the physical world. Awareness comes not simply from understanding the statistics that one produces, but from understanding the possible impressions that these make in relation to the individual. Thus, while it can be perceived as a weakness that these systems imply potentially inaccurate information, it can also be seen as a virtue, because it is precisely these impressions that users need to be aware of when they are engaging in social interaction.

Application-driven awareness

While the aforementioned awareness research systems provide users with post-facto awareness, systems have also been built to integrate social transparency into the system. For example, updated versions of ChatCircles (Viégas 1999) share a user's historical movements by leaving traces on the background of the chatroom while Erickson, et al. (1999) have integrated awareness mechanisms of presence and participation into Babble. Both of these systems provide feedback to the users, including: who is there, who can see them, who is participating and with what level of activity. They provide a record of interactions, allowing users to see more than just the current data. The feedback mechanisms in these systems are intended to improve users' experience by making them more aware.

Likewise, many non-research tools incorporate feedback so that users can use the systems in a more effective manner. In particular, these systems reveal some of the data that the user has provided the site. Although this information is available, it is often obfuscated, as it is primarily intended for sporadic review. That which is readily available is intended to help the user browse. Most often, that which provides the best awareness is not intended for such; yet, it can be co-opted by users to reflect on their own behaviors.

For example, some webboards, such as ezboard, give users tools to see the history of their posts, to see and edit a public profile and to track the responses to their messages. Amazon lets the user know that they are being observed by welcoming them by name in a manner that allows the user to see and edit much of the data that the site has stored about them. Yahoogroups lists all of a member's associated groups for their direct access. Ebay provides a user website that lets users modify their preferences as well as respond to the feedback about them and see how their reputation has been affected by others. Presence information can be seen through most instant messaging programs. Most e-commerce and communications sites provide some aspect of data awareness, whether it is the history of one's interactions or a profile that one is presenting to the company and other users.

By providing users with a centralized location where all of their membership data is located, these sites give users an opportunity to observe how the site and others see them. Unfortunately, what is typically provided to the users is not complete transparency; people still do not know how they are given a particular recommendation or why they receive a particular advertisement. Additionally, the structure of most sites does not indicate the level of observation that is occurring. By welcoming the user, Amazon provides a counter-example; the hello informs the user that the system is watching, thereby providing architectural feedback. Yet, for the most part, sites have no motivation to provide awareness as their data collection is usually for advertising purposes. What awareness they do easily provide is usually about other users, such as reputation scores.

Bridging research applications and web feedback

The aforementioned research systems provide direct feedback and make hidden data accessible, while the typical website provides feedback incidentally. Yet, both approaches are advantageous. While sometimes obfuscated, web feedback is incorporated into the system and changes over time. It is limited because it typically provides minimal feedback. On the other hand, the research applications convey rich data; most of them are more effective as portraits than ongoing awareness tools. Unfortunately, when these systems run off of live data, they sit as separate applications, not directly integrated with the application being used. The feedback that they provide is often about the system as a whole, not simply about the user's role within the system. Thus, they may be considered separately.

Recognizing that most self-awareness applications are focused on giving users an overriding image of self and not one that is integrated with the current context, i started to consider what would be a better way of providing awareness. In doing so, i imagined a tool that interactively and continually provided awareness about the user as they operated in the digital world.

Digital Mirror: A tool for reflective self-awareness

A mirror provides an image in which we can see ourselves, our identity and postulate what others see. From Lacan's perspective (1977/1966), the mirror stage in development is when children first get a notion of themselves as unique individuals. This mirror reflection provides a source of feedback that allows us to adjust our presentation in order to convey what we want to project.

The mirror is an interesting metaphor for consideration, as people do not operate with such awareness in the physical world. In fact, performing in front of a mirror takes on an entirely different aura than performing without one. Yet, in our embodied selves, we have a decent sense of what we are projecting. Online, we lack the body with which to project ourselves and thus we project our ideas into a digital representation that serves as our online agent. By operating our agent, we assume we are able to perceive ourselves, as we can access our profiles, manipulate our location, and create textual messages. Yet, this presentation is deceptive as it is not what others can see.

Those who see us are also seeing much of our past. For example, when one logs into a website, the website does not just see the current set of actions, but aggregates them with all previous interactions. As we interact, the information that can be accessed about us is potentially great. Although it is inconvenient to log all conversations, this is potentially available to others. As discussed earlier in this chapter, given any application, much data can be stored and accessed.

Given this, one approach to empowering users through awareness is to give them access to all that could potentially be seen about them. By presenting this data in an accessible manner, an individual could determine what is meaningful. Just as with the mirror, the user sees not only what they believe they are presenting, but also the image that others can see. By revealing what can be seen given the facet that one is presenting, the system could provide the user with a different level of interpersonal contextual information. Certainly, this does not mimic the physical world, nor will the resulting behavior. Yet, by providing such feedback, people can understand how their facets operate online and have the ability to adjust them. Users do not see everything that they may have shared, but everything that is accessible in this given context, with this given facet. By integrating the tool into the interactions that one has and presenting the feedback explicitly, such a mirror system encourages the use of awareness to adjust one's behavior.

Privacy Mirror

As i was contemplating the interface for a Digital Mirror, i stumbled on Nguyen & Mynatt's (2002) concurrent work in constructing a Privacy Mirror for people's online interactions. By recognizing the power of accountability and awareness in inciting change, Nguyen presents a set of ideas that most closely resemble my own thinking. Yet, while i was imagining a tool directed at users for considering their own output in a multi-faceted contextually collapsed world, their system focuses on creating large-scale transparency in public environments such as websites. With such a system, people would be aware of all data logs, not only their own; they would be able to see the history of the people's interactions at a given site. Privacy Mirror would provide detailed transparency, eliminate the "secrets" behind access logs, and otherwise let users know what detailed data is being logged during their interactions.

While i agree with Nguyen & Mynatt that awareness is essential for giving people control, i do not agree with the approach of making all logged data universally public. By doing so, the system would allow for an even greater amount of contextual information to be collapsed. Although many advertising agencies have this information, consider the impact of Privacy Mirror if a boss discovered their employee's off-hours interest in a controversial topic. Such a system does not provide individual privacy, but transparency. Although Brin (1998) argues that such transparency is crucial for addressing the issues of privacy online, i feel as though universal awareness can only bring about harm, as it would provide further drive towards a homogeneous society where all people are performing for the universal public. As this is not the society that i am interested in helping develop, i decided to consider these weaknesses and imagine the interface to an improved mirror.

Digital Mirror: Example scenario

Imagine a tool, which we shall call one's Digital Mirror (or Mirror for short), that is a hovering presence on a user's system. In its window, the Mirror shows an image of the user that changes as the user interacts with various applications. This image is constructed for the given user and is not accessible to anyone else.

Perhaps this image is abstract, using icons to represent different pieces of information. Or perhaps the image is of a person who is caricatured based on the information provided. Both of these approaches have their weaknesses. On one hand, presenting a caricature shows more detail than is truly representative and thus creates an impression that has the same confounding issues as a profile by relying on minimal data to present an entire picture. Yet, at the same time, this is precisely what other people do; perhaps such a representation would make the user consider the data that they are providing. On the other hand, showing bits of data through icons provides the data explicitly, but without the level of impression that often impacts social interaction. It is uncertain which approach is more valuable, and thus this is a fine example of needed future research. In either case, imagine that we are observing "Sarah" as she interacts with different applications using Digital Mirror.

Sarah logs into an IM client as zephoria. The people on her buddy list see that zephoria has logged in; they see her profile, which lists her as male and located in Boston. Many of the people on her list know her as Sarah, mostly from offline interactions; those who only know her online know her as Zephyr.

Sarah's Mirror now indicates her relationship with the IM client and her buddy list, indicating the profile information that is hidden to her when she opens her client. Perhaps the male identity is shown through a % symbol, or perhaps the caricature is given male features. Her location could also be shown through a representative icon, perhaps a state map. Data that is accessible to all those on her buddy list is also integrated into this representation, perhaps the public Google-able data about zephoria.

Seeing one of her friends online, Sarah opens up an IM conversation with Bob123; they have talked many times before.

Although she is still presenting a facet of her identity, the context is narrowed by this direct link; thus, her Mirror changes again, to reflect the facet that she is in direct contact with Bob123. As they've shared long chats, images of conversations scatter the background of Sarah's Mirror. By selecting the conversations on her screen, Sarah can access these previous interactions. Recognizing that the IM character Bob123 is identical to the email character bob@bob.com, the system includes their email interactions as well, as these pertain to the image that Sarah is presenting to Bob123. Drawing on the ideas from Conversation Maps (Sack 2000), Mirror provides users with unique words and expressions that stand out during their conversations, springing this information from the icons containing it. Thus, it is not surprising that personal qualities that Sarah has revealed in chatting litter the representation, indicating her love of music and Italo Calvino. A small graph appears, indicating the parts of Sarah's social network that Bob123 knows about, using mechanisms derived from Social Network Fragments.

Bob123 asks about the well-being of Taylor. Not remembering what she has shared about Taylor, Sarah turns to her Mirror, focusing in on the social network graph. In this graph, she can see all of the people that they've spoken with together online, some through IM, some through email. She can also see all of the people that she has mentioned to Bob123, including Taylor. Focusing in, she is referred to two emails and a chat log where they have discussed Taylor. Realizing that Bob123 is referring to Taylor's health, she returns to the conversation and responds accordingly, noting not to tell him about Taylor's newfound love.

The Mirror lets the user delve into the data to understand from where the representations come. In this way, the system integrates previous work, where the initial Mirror image is a fingerprint, indicating the general information. Simultaneously, it is an interface, allowing Sarah to delve into the fingerprint to see its components; thus, it is the gateway for Sarah to access her facets.

Switching to surf one of her favorite newsmags, Sarah's representation quickly changes to present the other facet's data. Logged in from her work machine, the website quickly notes her IP address and the website from which she came. The advert on the site is being pulled from DoubleClick, an advertising company, which is also aware of her IP address and all of the sites that she has surfed using this IP.

Unlike the more social environment of the IM world, the website is interested in data about her, often to provide her with targeted advertisements. The Mirror reveals this data by showing her representation through her digits, showing the site where she came from and giving her a timeline of her interactions at this newsmag.

Sarah clicks on the timeline to remind herself of the last time she visited. She's fascinated by the patterns, noting that she seems to come twice every day - once in the morning during her usual check-in routine and once, for a far more extended period of time, when she is anxiously awaiting the end of the work day.

As DoubleClick has also received her data, the Mirror links this current interaction to DoubleClick. By recalling the previous sites that she has given them, Sarah's Mirror connects this site to all of those other sites, producing a highly dense graph of the network of Sarah's websurfing.

Intrigued by the suddenly large graph in her Mirror, Sarah decides to navigate the data in order to understand what it means. As DoubleClick has detailed and connected logs of her websurfing habits, she finds interesting tidbits about herself. For example, she always seems to go directly from CNN's website to the New York Times, and both the NYTimes and DoubleClick are aware of this incoming link. In zooming into more details about the specifics of her presence at the NYTimes, she is able to see the profile information that she has provided them, along with her history of articles read. She chuckles as she sees that the NYTimes recognizes her as a low-income male working in the financial district while living on Pennsylvania Avenue in D.C. and reading all articles related to queer culture and military abuses in Afghanistan (without once looking at a stock price). As she zooms out of this profile information, returning to the parent link of DoubleClick, she is able to see a more general profile, which suggests that she is 83% male. The graph also provides her with a view of which facets of her online presentation have been collapsed with others, most notably by sites that have started to note the IP addresses from which she has logged in to the same account.
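The kind of linking described in this scenario can be illustrated with a minimal sketch. It assumes a hypothetical third-party log in which each entry records an IP address, the referring site, and the site visited; the function name, the log format, and the sample entries are all invented for illustration and do not describe how any actual advertising network stores its data.

```python
from collections import Counter

def build_surf_graph(log):
    """Group (referrer, site) transitions by IP address and count them.

    `log` is an iterable of (ip, referrer, site) tuples. This format is a
    hypothetical stand-in for whatever a third party might actually record.
    """
    graph = {}
    for ip, referrer, site in log:
        graph.setdefault(ip, Counter())[(referrer, site)] += 1
    return graph

# Invented sample entries (addresses come from the documentation-only range).
log = [
    ("192.0.2.7", "cnn.com", "nytimes.com"),
    ("192.0.2.7", "cnn.com", "nytimes.com"),
    ("192.0.2.7", "nytimes.com", "newsmag.example"),
    ("192.0.2.9", "cnn.com", "newsmag.example"),
]

graph = build_surf_graph(log)

# The most frequent hop for one address is the sort of pattern a Mirror
# could surface, such as always going from one news site to another.
most_common_hop = graph["192.0.2.7"].most_common(1)[0]
```

Even this small example suggests how quickly separate visits collapse into a single profile once they share an identifier such as an IP address.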

Most people are unaware of the amount of information collected about them online, let alone how easily it is collapsed. Because her Mirror conveys this information, Sarah can quickly see what is being revealed about her behavior. Plus, as Mirror attempts to highlight the most salient characteristics, Sarah can see the most obvious patterns in her behavior - her timing trends, the generalized categories of the sites she visits, the profiling information that has been collected about her, etc. The information should be provided in a highly dense visualization, a technique that Tufte values because it gives viewers control over the data by allowing them "to select, to narrate, to recast and personalize data for their own uses" (1990: 50). Thus, dense visualizations provide good tools for reflection and awareness.

With awareness tools, people want to know how they are perceived. They are not able to see another's face so they must resort to understanding the data that others use to evaluate them. Yet, it is not only the data that matters; the situation in which the data is created drastically affects the impressions that others gather. Understanding how systems perceive a person is much easier, as that observation is usually calculated using out-of-context and numerical data. Thus, the impressions that Sarah can derive from her Digital Mirror when interacting with the web are far more meaningful than those she can derive when seeing her conversational history with an IM friend. At the same time, Sarah probably has a more intuitive sense of how her friend perceives her than how a computer system does, as their conversation inevitably provides feedback in the way that a data-hungry system does not.

This scenario articulates some of the feedback that i imagine would be useful to users, so long as it is distilled in a meaningful way. What is provided goes above and beyond the magnitude of information that people have offline, yet the environment is also quite different. Simply put, "information is power and currency in the virtual world" (Billy Idol, "Cyberpunks"). As people have the ability to access massive amounts of data about others, it becomes useful for users to be aware of what is out there about them. While i believe that widespread transparency is problematic for marginalized individuals, i do feel that personal data should be transparent to its subjects, as it is essential for self-awareness and identity management.

Desired qualities for self-awareness tools

By contemplating Digital Mirror and a potential usage scenario, i highlighted what i believe to be essential characteristics of user-focused awareness tools:

1) The tool should be contextually dependent. Thus, it must be integrated with the actual applications, collecting data from them and presenting it back to the user in an accessible manner as they are using that application. When one's facet bridges multiple applications, the presentation should include this data. Such an application would provide situational contextual information by allowing users to see what applications their facets transcend.

2) Awareness tools should provide only the data that might also be available to the system or person with whom the user is interacting, not everything that the user presents. In doing so, the tools take into consideration the types of faceting that a user has developed and return meaningful information for the interaction at hand. As links between data are shown, the visual aspect of the interface should indicate how likely that link is to be made. For example, a set of email interactions between two people years ago should not have the same weight as ones made more recently.

3) The representation should provide both raw data and impressions, such that the user can quickly ascertain the value of the information or understand any of the impressions that are offered. By utilizing the value of an interactive multi-scaled approach, users can delve into the high-level fingerprints in order to understand how they are being constructed. Through interactive interfaces, the user should be able to get to the raw data from the higher-level impressions. The interface should be compelling and attempt to convey information as legibly as possible.
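The weighting suggested in the second quality, where older interactions count for less than recent ones, could be sketched as follows. This is only one possible approach: the exponential decay, the one-year half-life, the function name, and the sample dates are all assumptions made for illustration, not a mechanism specified in this chapter.

```python
from datetime import date

def link_weight(interaction_dates, today, half_life_days=365.0):
    """Return a recency-weighted strength for a link between two people.

    Each interaction contributes 1.0 if it happened today, and its
    contribution halves every `half_life_days` (an assumed parameter).
    """
    weight = 0.0
    for d in interaction_dates:
        # Treat any date after `today` as if it happened today.
        age_days = max((today - d).days, 0)
        weight += 0.5 ** (age_days / half_life_days)
    return weight

# Invented example: three emails from years ago versus three recent ones.
today = date(2002, 9, 1)
old_emails = [date(1998, 3, 1), date(1998, 4, 15), date(1998, 6, 2)]
recent_emails = [date(2002, 7, 20), date(2002, 8, 5), date(2002, 8, 28)]

old_weight = link_weight(old_emails, today)
recent_weight = link_weight(recent_emails, today)
```

An interface could map such a weight onto a visual property like line thickness or opacity, so that links unlikely to be made fade into the background.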

While i can proselytize such ideas, i have only hypothesized what such a design might look like. The hardest task in bringing these ideas to fruition is that of creating a comprehensible design. Such a system should draw from the developments that have been made by researchers who design visualizations that represent people's behavior post-facto. Simultaneously, further research is necessary to determine what data is appropriate to convey and what better mechanisms exist for making it accessible.

To do so requires strides in two directions - analyzing the appropriate data and representing it in a meaningful way. The efforts made in Loom2 make it quite clear that this work has only just begun. Presenting statistical data in a key-based accessible manner is not that difficult; people can learn to read keys and evaluate the numbers. But numbers do not complete the picture; people need to derive meaning from their environments beyond what is easily computed. Yet, evaluating qualitative data requires great delicacy in order to pull out the desired impressions. Then, once the data is available, conveying those impressions poses yet another challenge, as design is not a systematic art. For example, what is more helpful - abstract statistical representations or potentially inaccurate caricatures? How much abstraction is meaningful? What is an acceptable margin of error for conveying impressions? The goal is to provide a visual tool that requires little more than a glance to get a meaningful impression, but that also offers an interface for extended detail.

Concluding thoughts

Awareness online need not resemble its offline counterpart, as the available data is not comparable. By providing awareness online, the goal is not to mimic offline knowledge, but to supplement the dearth of available digital feedback. In doing so, people can feel more settled by understanding how they are seen, even if they cannot determine the reaction of others. Even this level of awareness increases one's ability to appropriately self-monitor.

Self-awareness allows users to understand who they are in a particular environment, how facets of their identity are manifested and aggregated, and how other people and sites can see them. Such awareness places an individual within the society at large, in relation to other people. While awareness allows users to begin controlling their presence, it is only the first step. Awareness alone is not effective in giving individuals control; they must also be able to change how they are perceived by having the tools to manage their presentation directly.