0018-8670/96/$5.00
© 1996 IBM
Navigating large bodies of text
by D. Small
Reprint Order No. G321-5621.
The display of information by computers does not often fulfill the
promise of the computer as a visual information appliance. A design
experiment is described in this paper in which a large body of text, such
as the complete plays of William Shakespeare, is made visible. The
typography is designed to handle display at a variety of scales so that
the user can move smoothly between detailed views and overviews of as many
as one million words. Visual filtering techniques are described that aid
in the analysis of text at a wide range of scales. Also, the use of
three-dimensional space to organize complex relationships among different
information elements is described. New interaction paradigms are explored
that aid in the navigation of complex, three-dimensional information
spaces.
Initially, new media emulate the media that they
replace, before creating any new paradigms. This can certainly be said of
the way in which electronic media have handled text. The glowing glass
screen of the computer is seen as a flat surface on which are pasted images
that resemble sheets of paper. Window-like systems have advanced this
emulation only to the extent that they allow for many rectangular planes
of infinitely thin virtual paper to be stacked haphazardly around the
glass surface of the computer display. The graphic power of computer
workstations has now advanced to the point where we can begin to explore
new ways of treating text and the computer display. By allowing the
computer to do what it can do well, such as compute three-dimensional
graphics and display moving images, we can develop a truly new design
language for the medium.
It is night and you are dreaming. The sky is dark
and you are floating lightly above the earth. As you cast your thoughts
about, you flit and fly above the landscape from place to place and then
zoom out into space. As soon as a constellation appears ahead, you skim
instantly to that place, exploring, moving as fast as thought.
Navigation of information might be just as I
described--smooth, simple, and as fast as your thoughts. Professor Muriel
Cooper, founder of the Visible Language Workshop within the MIT Media
Laboratory, coined the phrase information landscape [1]
to describe this sort of space, where information "hangs" like
constellations and the reader "flies" from place to place, exploring yet
maintaining context while moving so that the journey itself can be as
meaningful as the final destination (see Figure
1).
Figure 1
A landscape, whether real or virtual, provides an
experience in which context is continuous and meaningful. It is through
context that we can understand new information and can relate it to what
is already known. Drs. Stephen and Rachel Kaplan wrote about the
experience of mystery in landscapes in their book Cognition and
Environment: Functioning in an Uncertain World. [2]
"In the case of mystery, the new information is not present;
it is only suggested or implied. Rather than being sudden, there is a
strong element of continuity: the bend in the road, the brightly lighted
field seen through a screen of foliage--these settings imply that new
information will be continuous with, and related to, that which has gone
before. Given this continuity one can usually think of several
alternative hypotheses as to what one might discover."
By escaping the confines of the flat sheet of paper,
we can arrange information into meaningful landscapes that exhibit
qualities of mystery, continuity, and visual delight.
The plays of William Shakespeare are used to explore
the design of an electronic information space that maintains the qualities
of a meaningful landscape. A large image from the system is shown in the
background on the title page of this paper. Each character in the play
A Midsummer Night's Dream is marked in a different color, and
different typefaces are used to distinguish stage direction, names,
dialog, and commentary. Each scene is laid out in a single column of text.
These scenes are aligned at their top edge and separated horizontally by a
small gutter. The gutter, or space between columns of text, is increased
between acts so that each act forms a distinct visual chunk. The five
acts of each play are arrayed from left to right, and finally, the plays
are arranged one above the other.
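As an illustration only, the column layout described above might be
sketched in Python as follows. The dimensions, the Column record, and the
layout function are hypothetical names and values chosen for this sketch;
they are not taken from the Virtual Shakespeare implementation.

```python
from dataclasses import dataclass

# All dimensions are hypothetical illustration values.
COLUMN_WIDTH = 40.0   # width of one scene column
SCENE_GUTTER = 2.0    # small gutter between scenes within an act
ACT_GUTTER = 10.0     # wider gutter between acts
PLAY_HEIGHT = 500.0   # vertical space reserved for each play


@dataclass
class Column:
    play: int
    act: int
    scene: int
    x: float  # left edge of the column
    y: float  # top edge, shared by every scene of the same play


def layout(plays):
    """Place one column per scene.

    plays is a list of plays; each play is a list giving the number of
    scenes in each act, for example [2, 2, 2, 2, 1].
    """
    columns = []
    for p, acts in enumerate(plays):
        y = -p * PLAY_HEIGHT  # plays are stacked one above the other
        x = 0.0
        for a, scene_count in enumerate(acts):
            for s in range(scene_count):
                if s > 0:
                    x += SCENE_GUTTER   # next scene in the same act
                elif a > 0:
                    x += ACT_GUTTER     # first scene of a new act
                columns.append(Column(p, a + 1, s + 1, x, y))
                x += COLUMN_WIDTH
    return columns


# A hypothetical play with two acts of two scenes each.
for column in layout([[2, 2]]):
    print(column)
```

Because every scene in a play shares the same top edge, the length of a
column is the only thing that varies vertically, which is what lets a
scene's length be read directly from the overview.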
Information landscapes
Before delving into the details of the Virtual
Shakespeare Project, it will be useful to review some previous work in the
design of information landscapes by the students at the Visible Language
Workshop. The purpose of these projects was to help define the design
issues associated with three-dimensional typography.
In this early work, we developed a method of
displaying typographic forms at any size, position, and orientation in
three-dimensional (3D) space. A virtual camera is then moved through and
about the space, exploring the information, both text and images, which
inhabits the space. By adapting techniques developed for two-dimensional
graphic design to the mostly unexplored realm of three-dimensional design,
and expanding on them, we produced a number of visual experiments.
These interactive sketches addressed a number of design issues, including
the use of perspective, scale, space, and interaction to create simple and
flexible visualizations of information.
First, we must remember that letters were designed to
be viewed directly on a flat two-dimensional (2D) surface; allowing
arbitrary viewpoints introduces perspective distortion. Although this is
correct for 3D perception, it is less than ideal for reading. Since it is
not always possible to guarantee the angle of the view relative to the
angle of the text, one cannot be certain of maintaining the integrity of
the letterform. Each new angle will result in a differently shaped letter
and at extreme angles the image can be reduced to a line. Furthermore,
when the camera moves behind the text, it looks reversed as though seen in
a mirror. While certain word shapes can still be recognized in less than
ideal circumstances, in general there are few views from which text holds
its legibility. One can solve this problem by constantly rotating all of
the text objects so that they face the viewer, but doing so destroys
the overall structure of a complex three-dimensional space.
Another solution is to constrain the movement of the camera to maintain a
minimum legibility of text in the scene, but such constraints are not
always acceptable.
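The viewing conditions described above can be tested with a dot product
between the direction the text faces and the direction toward the camera.
The following Python sketch is one way to do this, under stated
assumptions: the function name view_quality and the threshold min_cos are
hypothetical, and the text is treated as a flat panel with a known front
face.

```python
import math


def view_quality(text_normal, text_position, camera_position, min_cos=0.3):
    """Classify how a flat text panel appears from the camera.

    text_normal is a unit vector pointing out of the readable face.
    min_cos is a hypothetical legibility threshold on the cosine of the
    angle between that normal and the direction to the camera.
    """
    to_camera = [c - t for c, t in zip(camera_position, text_position)]
    length = math.sqrt(sum(v * v for v in to_camera))
    if length == 0.0:
        raise ValueError("camera coincides with the text")
    cos_angle = sum(n * v for n, v in zip(text_normal, to_camera)) / length
    if cos_angle < 0.0:
        return "reversed"      # camera is behind the text: mirror image
    if cos_angle < min_cos:
        return "too oblique"   # letters are squeezed toward a line
    return "legible"


print(view_quality((0, 0, 1), (0, 0, 0), (0, 0, 5)))
print(view_quality((0, 0, 1), (0, 0, 0), (0, 0, -5)))
```

A camera constraint of the kind mentioned above could reject any move
whose result is not "legible", at the cost of restricting the viewer.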
In Figure
2 a map of North America is labeled with the country names Canada and
the United States (partially seen), which are easily read when the map is
viewed from the common orientation with north at the top. Still, nothing
prevents the viewer from moving to the North Pole, from where the text
will appear reversed.
Figure 2
A graphic designer can use size differences to visually
distinguish certain elements in a text, such as the headline of a
newspaper story or the fine print on a contract. In a three-dimensional
space, you cannot always resolve the relative size of two objects. If one
object appears smaller in the picture plane, it could actually be smaller,
or it could be the same size and farther away, or it could even be much
larger and very far away. So, in the design of an information space, one
must be careful about using size as a differentiating variable. One
interesting advantage of designing an information space is that the
designer can use an almost unlimited range of scale to represent
information. In print, the difference in size between the largest and
smallest element is limited by the resolution of the printer and the
physical size of the paper, but in a virtual space, typography can have
almost unlimited variations in scale. This was explored in some detail in
the Virtual Shakespeare design that will be discussed later.
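The size ambiguity described above follows from simple perspective
projection, in which apparent size is proportional to true size divided by
distance. A minimal Python sketch, assuming an idealized pinhole camera
(the function name and the focal_length parameter are hypothetical):

```python
def apparent_size(height, distance, focal_length=1.0):
    """Projected height on the picture plane under a pinhole model."""
    if distance <= 0:
        raise ValueError("object must be in front of the camera")
    return focal_length * height / distance


# Three objects of very different true sizes...
small_near = apparent_size(1.0, 10.0)
same_farther = apparent_size(2.0, 20.0)
large_distant = apparent_size(50.0, 500.0)

# ...occupy the same height in the picture plane.
print(small_near, same_farther, large_distant)
```

Without some other depth cue, the viewer has no way to tell these three
cases apart, which is why size alone is a risky differentiating variable.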
We can also examine the use of space and how that
differs in the design of digital media. Traditional graphic design has
always been concerned with the disposition of pictorial space. There is a
constant tension between the desire to include as much content as possible
and using white space to create a harmonious and uncluttered image. In two
dimensions, the designer is constrained by the limits of physical space
and the static nature of the medium. In three dimensions, despite the
easing of those constraints, the problem remains to create clear, legible
relationships. Because of the ease with which one can create content at
different scales and orientations, it is possible in an electronic
landscape to present massive amounts of information while giving an
impression of low visual density. This capability can work against the
design as easily as it can help. Vast amounts of empty space can lead to
an environment with very low legibility. A legible landscape is one that
is meaningful, rich, and clear. Kevin Lynch wrote in his book The Image
of the City: [3]
"By this [legibility] we mean the ease with which its parts
can be recognized and can be organized into a coherent pattern. Just as
this printed page, if it is legible, can be visually grasped as a
related pattern of recognizable symbols, so a legible city would be one
whose districts or landmarks or pathways are easily identifiable and are
easily grouped into an over-all pattern."
Because typographic elements can appear at any scale,
an information landscape can create a good sense of overview and context
while losing a clear understanding of the density of the content. In a
paper book, we can understand at a glance the amount of text by the size
of the book, the width of its spine, and so forth. As we dynamically shift
the scale of an electronic text, we may not be able to have a constant
yardstick or scale against which to understand the size of a text. This
can be seen in the figure on the title page, where an entire Shakespearian
play is visible. It is difficult to get a sense of how many words are in
the play or how long it will take to read.
Finally, we can examine how to use either
increased spatial resolution or time-varying images to increase the
density of information displays. The constant dimensions of the computer
screen and its low resolution (100 dots per inch, or dpi) when compared to
print (over 600 dpi) greatly limit the amount of text that can be
simultaneously presented to the user. One solution to this problem is
simply to increase the resolution of the display (see Figure
3). The Visible Language Workshop has built an extremely
high-resolution display that enables the simultaneous display of large
amounts of information. [4]
Figure 3
This approach, however, makes extreme demands on display
technology, computing power, and data bandwidth. Instead, we can use a
standard size display and dynamically shift our viewpoint around a larger
virtual information space. Although the resolution at any one moment in
time is still limited, we can smoothly move from an overview to a detailed
view in a manner that helps to maintain the all-important context of the
larger body of information. Animated typography can also be used to
enhance the emotional content of a message. Yin Yin Wong has explored the
use of dynamic forms to create expressive visual narratives (see Figure
4). [5]
By using moving typography, she was able to overcome the resolution limits
of the display to create type that appears to be speaking to the reader
with an incredible density of meaning. Her work hints that the computer
can not only faithfully reproduce text, as in a book, but can also express
the meaning of the text, as an actor does when performing a role.
Figure 4
Related work
I would like to briefly discuss two examples that
inspired this work. One particularly interesting piece is Vannevar Bush's
1945 essay "As We May Think," in which the idea of hypertext is first
proposed. [6]
I was particularly struck by his description of the physical device that
would be used to access information, which he termed the Memex.
"If the user wishes to consult a certain book, he taps its
code on the keyboard, and the title page of the book promptly appears
before him, projected onto one of his viewing positions. Frequently-used
codes are mnemonic, so that he seldom consults his code book; but when
he does, a single tap of a key projects it for his use. Moreover, he has
supplemental levers. On deflecting one of these levers to the right he
runs through the book before him, each page in turn being projected at a
speed which just allows a recognizing glance at each. If he deflects it
further to the right, he steps through the book 10 pages at a time;
still further at 100 pages at a time. Deflection to the left gives him
the same control backwards. A special button transfers him immediately
to the first page of the index. Any given book of his library can thus
be called up and consulted with far greater facility than if it were
taken from a shelf. As he has several projection positions, he can leave
one item in position while he calls up another. He can add marginal
notes and comments, taking advantage of one possible type of dry
photography, and it could even be arranged so that he can do this by a
stylus scheme, such as is now employed in the telautograph seen in
railroad waiting rooms, just as though he had the physical page before
him."
Note that there are several ways of traversing the
information. One can jump directly to any point in the corpus. In
addition, we can travel through the work at speeds of increasing orders of
magnitude. Technology has now advanced to the point where building a Memex
machine as Bush describes is perfectly feasible, and some of his ideas,
such as using order of magnitude jumps in navigation, were incorporated
into the design of Virtual Shakespeare.
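The order-of-magnitude stepping in Bush's description can be sketched in
a few lines of Python. This is an illustration of the quoted passage only,
not of the Virtual Shakespeare code; encoding the lever position as an
integer detent from -3 to 3 is an assumption, as are the function names.

```python
def lever_step(deflection):
    """Map a lever detent to a signed page step.

    deflection is an integer detent: 0 is the rest position, positive
    values deflect right (forward), negative values deflect left
    (backward). Detents 1, 2, and 3 give steps of 1, 10, and 100 pages.
    """
    if deflection == 0:
        return 0
    magnitude = 10 ** (min(abs(deflection), 3) - 1)
    return magnitude if deflection > 0 else -magnitude


def turn(page, deflection, page_count):
    """Apply one lever step, keeping the result inside the book."""
    return max(1, min(page_count, page + lever_step(deflection)))


page = 1
for detent in (3, 3, 2, 1, -1):
    page = turn(page, detent, page_count=400)
print(page)  # 1 + 100 + 100 + 10 + 1 - 1 = 211
```

Each further detent multiplies the step by ten, so a reader can cross a
long work in a few coarse moves and then refine the position with fine
ones.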
While Bush's Memex and other descriptions of
"cyberspace" [7]
are useful thought experiments, another example of an existing artifact
that addresses the issues of macro/micro readings is Maya Ying Lin's
Vietnam Veterans Memorial in Washington, D.C. The names of the 58,000 war
casualties are displayed so that one can start by looking at individual names
and slowly include more and more names until the enormity of 58,000 is made
visible. Approaching from either side, you first read a few names. As you
continue forward, the wall grows taller beside you, each panel containing
more and more names. Each name is clearly legible as you pass it, but the
entire list of names becomes a statistical blur in the distance. A
consistent scale helps the visitor to measure the information from the
level of an individual up to the war as a whole. Because the observer can
use his or her own body to measure the text, it is possible to avoid the
problem of indeterminate scale seen in the title page figure. This
suggests that a stronger feeling of embodiment in an information landscape
will aid in its comprehension.
Virtual Shakespeare
The purpose of the Virtual Shakespeare Project was to
explore the design of a large body of textual information. I chose to
visualize the complete plays of William Shakespeare. The amount of text is
on the order of one million words and the work itself has many structures
that can be made visible: speeches, scenes, acts, and so forth. A
rendering model was developed that is optimized for rapid navigation and
changes in scale. If your viewpoint is close to the text it will be fully
rendered. If it is farther away, and therefore smaller on the display, a
simplified texture is used in place of each line of text. This technique,
called greeking, maintains the overall shape of each line, although
individual words are lost. As distance increases to the point where each
line of text blurs into the next, each block of text is drawn as a simple
rectangle of the same size and overall density. Breaks between the dialog
of different characters are used as the delineator for the larger text
blocks. This means that even at a great distance, the reader can still
follow who was speaking and how much was said. The final stage comes when
the dialogs become so small as to merge together. At this point each scene
is rendered as a simple rectangle. These different views are shown in Figure
5. As we move back to include ever larger amounts of information in
our view, the display of the information becomes more abstract while
maintaining visual continuity.
Figure 5
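The four rendering stages described above amount to choosing a level of
detail from the size of the text on the display. The Python sketch below
shows one way such a choice could be made. The pixel thresholds, the
Detail enumeration, and the choose_detail function are hypothetical; the
paper does not give the values or the mechanism actually used.

```python
from enum import Enum


class Detail(Enum):
    FULL_TEXT = 1      # every letter is rendered
    GREEKED_LINE = 2   # one simplified texture per line of text
    SPEECH_BLOCK = 3   # one rectangle per character's block of dialog
    SCENE_BLOCK = 4    # one rectangle per scene


# Hypothetical thresholds, in pixels on the display.
MIN_READABLE_LINE_PX = 6.0    # below this, words cannot be read
MIN_DISTINCT_LINE_PX = 1.0    # below this, lines blur into each other
MIN_DISTINCT_SPEECH_PX = 2.0  # below this, speeches merge together


def choose_detail(line_height_px, speech_height_px):
    """Pick a rendering stage from projected sizes on the display."""
    if line_height_px >= MIN_READABLE_LINE_PX:
        return Detail.FULL_TEXT
    if line_height_px >= MIN_DISTINCT_LINE_PX:
        return Detail.GREEKED_LINE
    if speech_height_px >= MIN_DISTINCT_SPEECH_PX:
        return Detail.SPEECH_BLOCK
    return Detail.SCENE_BLOCK


for line_px, speech_px in [(12.0, 120.0), (3.0, 30.0), (0.5, 5.0), (0.05, 0.5)]:
    print(line_px, speech_px, choose_detail(line_px, speech_px).name)
```

Keying the decision to projected size rather than raw distance means the
same rule works for type set at very different scales in the space.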
It is important that all transitions from one level of
detail to another be as smooth and inconspicuous as possible. The reader
should believe that all the information is there on the screen. A simple
cross-fade is used to blur the transition from one state to another. This
works quite well; however, there are still some problems associated with
color and the typography itself. As typographic elements change size, it
is not always possible to maintain a consistent perceived color for the
text. As an object becomes smaller in the visual field, its surroundings
have a greater effect on its perceived color. In the case of the rendering
engine used in the Virtual Shakespeare Project, the text becomes darker as
it gets smaller. This becomes a problem when color, or even brightness, is
used to distinguish one object from another. Figure
6 shows how highlighting can work effectively at two different
distances. Because it is easy to see Titania's dialog while viewing the
entire play, we can readily explore her thread through the narrative.
Figure 6
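The cross-fade between levels of detail mentioned above can be expressed
as a pair of opacities that always sum to one. The sketch below assumes a
linear fade over a band of projected sizes; the linear ramp, the band
limits, and the function name are assumptions made for illustration.

```python
def crossfade_weights(size_px, fade_start_px, fade_end_px):
    """Return (detailed_alpha, simplified_alpha) for a projected size.

    At or above fade_start_px only the detailed form is drawn; at or
    below fade_end_px only the simplified form is drawn; in between the
    two are blended linearly.
    """
    if fade_end_px >= fade_start_px:
        raise ValueError("fade_end_px must be smaller than fade_start_px")
    t = (fade_start_px - size_px) / (fade_start_px - fade_end_px)
    t = max(0.0, min(1.0, t))
    return 1.0 - t, t


# Text shrinking from 10 px to 2 px, fading between 8 px and 4 px.
for size in (10.0, 8.0, 6.0, 4.0, 2.0):
    print(size, crossfade_weights(size, 8.0, 4.0))
```

Because the two weights change continuously with size, there is no single
frame at which one representation abruptly replaces the other.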
In addition to perceived shifts in color, the typographic
forms themselves appear somewhat unstable. That is, the thickness of the
stems appears to change, the serifs wriggle and fade, and the counters
tend to clog up when the letterforms shrink. Typefaces are generally
designed to be used in only a small range of sizes and always so that they
are flat to the page. In this project, letters were displayed at sizes much
larger or smaller than had ever been intended, and some view angles created
such perspective distortion as to render the typeface illegible. Through
experimentation, we have found that some typefaces are sturdier in this
respect than others; however, no single typeface is adequate for all
situations. What
will be required is to design a new kind of typeface that can dynamically
adjust its form to its environment. For example, in the days of lead
typefaces, each size was designed independently. Designers knew that a
letterform that looked clean and elegant at 12 points would be tall and
spindly looking at 6 points, so letterforms became squatter and thicker as
they grew smaller. New work in multimaster typefaces by Adobe Systems Inc.
allows the generation of a range of faces from a single master; [8]
however, they have not been used in a dynamic display. Future work will
explore the generation of variable typefaces that can adapt to suit their
environment.
Despite these problems, it is possible to use cues
such as color or change in typeface to visually highlight portions of the
text. For example, one may be interested in seeing all of the dialog for a
specific character. Whatever visual technique is used, it should clearly
distinguish the selected text at a wide range of scales. If a change in
typeface, such as boldface or italic, is used, it can be
difficult to see when viewing an entire scene or act. Dynamic
highlighting, such as blinking, can be effective when the selected text is
small and could be swamped by other information; however, it can also
render illegible just that information that one wishes to make visible. To
avoid these problems, I used brightness to cue the dialog of a character.
The contrast between the selected and unselected text was continuously
adjusted to account for changes in scale. As the distance from the text
increases, bright objects become surrounded by more and more black space
and must be made brighter to maintain a consistent visual distinction
from the unselected text. The ability to visually filter out
some portions of the text enables the reader to see patterns and structure
that were impossible to find in the traditional book format. For example,
when Titania's dialog is highlighted, you can immediately see her role in
the narrative structure. She is introduced in the second act, has a rather
long soliloquy, and then comes and goes a few times during the rest of the
play. By allowing us to read within a meaningful context, the computer can
fundamentally change the kinds of understandings we can glean from a text.
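The scale-dependent contrast adjustment described above can be sketched
as a function from viewing distance to a pair of brightness values. The
paper states only that selected text must become brighter as distance
increases; the logarithmic ramp, the distance range, and all brightness
values below are assumptions chosen for illustration.

```python
import math


def highlight_levels(distance, near=1.0, far=1000.0,
                     dim=0.35, near_bright=0.6, far_bright=1.0):
    """Return (selected, unselected) brightness, each in [0, 1].

    The selected brightness rises from near_bright to far_bright as the
    distance grows from near to far, interpolated on a logarithmic scale
    so that each tenfold change in distance adds the same increment.
    """
    if not 0.0 < near < far:
        raise ValueError("require 0 < near < far")
    d = max(near, min(far, distance))
    t = math.log(d / near) / math.log(far / near)
    selected = near_bright + t * (far_bright - near_bright)
    return selected, dim


for d in (1.0, 10.0, 100.0, 1000.0):
    print(d, highlight_levels(d))
```

A logarithmic ramp is used here because the scale changes in this kind of
display span several orders of magnitude; any monotonically increasing
function would serve the same purpose.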
The use of space in an information landscape is
fundamentally different from that of traditional design. One example of
this can be seen in the presentation of footnotes or supplementary
material. In traditional book design there are few options for visually
treating such related materials. Footnotes can be placed at the bottom of
the page or in the margin and referenced by number or asterisk. The length
of the footnote is quite limited, unless it appears in an extremely small
and barely legible typeface. Tschichold carefully enumerates the many
typesetting problems associated with footnotes in The Form of the
Book. [9]
Hypertext systems, such as those used to access the Internet World Wide
Web, allow the designer to tag a text with footnotes of arbitrary size.
However, when the reader selects a link, the footnote appears and
completely obliterates the original text. The use of 3D space gives the
designer new possible solutions to this problem, two of which are shown in
Figure
7.
Figure 7
One obvious solution is to take advantage of the ability
to rapidly change scale and place the footnote next to the referring text,
but much smaller, as shown in the first image. Since size is arbitrary,
you can even put a footnote in the dot of an i or in the period at the end
of a sentence. The problem with this solution, which this extreme example
makes clear, is that it is difficult to see both the footnote and the
referring text at the same time. In the second image, the footnote is
shown at the same size as the main text, but at ninety degrees to it. When
looking directly at the text, the footnotes, being infinitely thin,
disappear, but with a quick twist they can be read. This solution has the
advantage of providing quick, yet unobtrusive access and allowing the
simultaneous display of both texts.
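The behavior of the perpendicular footnote can be checked with a little
trigonometry. If the view is twisted by an angle about the axis shared by
the two planes, the main text is foreshortened by the cosine of that angle
and the footnote by its sine. The sketch below assumes a distant viewer,
so that perspective effects can be ignored; the function name is
hypothetical.

```python
import math


def apparent_widths(twist_degrees):
    """Return the fraction of full width at which each plane appears.

    The result is (main_text, footnote) for a footnote plane set at
    ninety degrees to the main text and a view twisted by twist_degrees
    away from head-on.
    """
    theta = math.radians(twist_degrees)
    return abs(math.cos(theta)), abs(math.sin(theta))


for angle in (0, 15, 30, 45, 90):
    main, note = apparent_widths(angle)
    print(f"{angle:3d} degrees: main {main:.2f}, footnote {note:.2f}")
```

Head-on, the footnote has no visible width at all; at a twist of 45
degrees both texts appear at about 71 percent of their full width.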
Although these new rendering techniques allow many
different views of a large-scale text, the visualization is only useful if
the user can easily navigate about the text. A number of new methods of
navigation were developed to address this problem. Most current interface
paradigms (windows, buttons, mice) were based on a two-dimensional screen.
A three-dimensional model requires new kinds of controls that allow for
easy manipulation in space. Three different approaches were developed that
make use of LEGO** brick technology and a magnetic field sensor that is
used to determine an exact position and orientation of an object in space.
[10]
The first approach was to use one position/orientation
sensor to act as a handle for the text and another as a virtual camera. By
holding the first sensor in one hand it is possible to easily control the
position and orientation of the text. A graphical representation of the
"handle" is displayed along with the text to help orient the user. The
other sensor is built into a small LEGO helicopter that can easily fit
into the other hand. A helicopter was used because it has an implicit
sense of pointing and an implicit orientation (rotor above, tail behind,
windshield in front). The helicopter pilot "looks" at the handle and
determines the position of the virtual camera in the information space
(see Figure
8). In addition to these controls, some simple LEGO machines were
constructed that allow the user to zoom in and out, and to position the
text horizontally and vertically relative to the virtual handle.
Figure 8
The second approach consisted of a small LEGO stage. The
positioning gears were built into the structure of the stage in two
orthogonal orientations (see Figure
9). To move the text left and right, the wheel above the stage is
turned left and right. To move up and down, the wheel on the side of the
stage is used. The angle of the text is controlled by rotating the entire
stage assembly. To select different characters in the play, the
corresponding LEGO actor is placed on the stage. A resistor is soldered
onto the feet of each figure and the stage can then recognize each actor
when he or she is snapped in place. The footlights come on and the text
for the character lights up.
Figure 9
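Recognizing an actor from the resistor in its feet amounts to matching a
measured resistance against a table of known values. The Python sketch
below illustrates the idea. The resistor values, the tolerance, and the
choice of characters in the table are hypothetical; the paper does not
say which values were used or how the stage measured them.

```python
# Hypothetical nominal resistor values in ohms, one per LEGO actor.
ACTOR_RESISTORS = {
    "Titania": 1000.0,
    "Oberon": 2200.0,
    "Puck": 4700.0,
}


def identify_actor(measured_ohms, tolerance=0.05):
    """Return the actor whose resistor matches the reading, or None.

    A reading matches when it lies within the given relative tolerance
    of an actor's nominal value.
    """
    for name, nominal in ACTOR_RESISTORS.items():
        if abs(measured_ohms - nominal) <= tolerance * nominal:
            return name
    return None


print(identify_actor(1020.0))
print(identify_actor(1500.0))
```

The nominal values are spaced far enough apart that their tolerance bands
cannot overlap, so a reading never matches more than one actor.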
The last approach was to merge the display with the
navigation controls (see Figure
10). This was done by building a handheld display prototype of balsa
wood and LEGO bricks. There is a single button to the right of the display
that acts as a clutch. When the button is engaged, moving the display
moves the virtual camera about the information landscape. Tipping the
display forward moves the camera down and scrolls the text up. Pushing the
display away from your body zooms out, and pulling the display close to
your body zooms in. Tilting the display to the right or left causes the
text to slide horizontally in that direction. The amount of camera motion
increases exponentially with the amount the display is moved, which gives
the user both fine control over small motions and very fast, large motions.
When the button is released, the view is locked and will not move.
Figure 10
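The clutch and the exponential response described above can be sketched
as a single mapping from the physical movement of the display to the
movement of the virtual camera. The gain and rate constants and the
function name are hypothetical, and the text does not specify whether the
output should be read as a camera displacement or a camera speed.

```python
import math


def camera_motion(display_movement, clutch_pressed, gain=0.5, rate=3.0):
    """Map movement of the handheld display to signed camera motion.

    The response grows exponentially with the size of the movement and
    is exactly zero when the display has not moved or the clutch button
    is not pressed. The sign of the input is preserved.
    """
    if not clutch_pressed:
        return 0.0
    amount = gain * (math.exp(rate * abs(display_movement)) - 1.0)
    return math.copysign(amount, display_movement)


for movement in (0.0, 0.1, 0.5, 1.0, -1.0):
    print(movement, camera_motion(movement, clutch_pressed=True))
```

Subtracting one from the exponential makes the response nearly linear for
small movements, which preserves fine control, while large movements are
amplified many times over.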
All of these methods increase the ease with which the
reader can navigate the text and greatly increase the utility of the
system. For example, giving the user a physical handle on the text makes
it easy to quickly reorient it to see footnotes, which may be placed at
right angles to the main text. The most important lesson learned from
these experiments was that it is impossible to separate the visual design
from the design of the interface. Subtle interactions between the visual
design and the physical controls may facilitate many actions but make
others more difficult.
Conclusion
The current paradigm of a computer with a fixed, heavy
display, a keyboard, and a mouse is not nearly as comfortable to use as a
simple book. Nonetheless, the ability of the computer to gather, analyze,
and filter vast amounts of data makes it indispensable in today's world.
Any attempt to improve the design of information in electronic media
should address not only the visual display of the information, but also the
design of the computer itself and how one interacts with it in the context
of the real world.
In making information accessible to people, it is
necessary for designers to rethink current design paradigms. The computer
screen is not a piece of paper and should not be treated as such. By
taking advantage of the ability of the computer to display dynamic,
flexible, and adaptive typography, we can invent new ways for people to
read, interact with, and assimilate the written word. Like a garden,
well-designed information should be legible, inviting, and comfortable,
and its exploration should and can be a true delight.
**Trademark or registered trademark of LEGO Systems,
Inc.
Cited references
Accepted for publication March 25, 1996.