Virtual reality and parallel systems performance analysis

June 7, 2017 | Author: Keith Shields | Category: Virtual Reality, Performance Analysis, High Dimensionality, Computer, Parallel Systems

Theme Feature

Daniel A. Reed, Keith A. Shields, Will H. Scullin, Luis F. Tavera, and Christopher L. Elford

University of Illinois, Urbana-Champaign

0018-9162/95/$4.00 © 1995 IEEE, November 1995

A data-immersive virtual world enables exploration of complex, parallel-system performance data and supports real-time adaptive control of parallel-system behavior. It has been operational for about two years.

Scalable parallel systems are becoming the standard architecture for high-performance computing. However, achieving close to peak performance requires careful attention to a plethora of system details. Not only do hundreds of processors interact on a microsecond time scale, but also the space of possible performance optimizations is large, complex, and highly sensitive to both application behavior and system software.

Although no general theory predicts the performance effects of software changes, a cycle of experimentation involving software modifications and performance measurements does permit application performance tuning. Given the complexity of parallel systems and the number of possible performance optimizations, two keys to this tuning are capturing and analyzing dynamic performance data and understanding the performance effects of software changes.

Just as a logic analyzer lets a hardware designer study signal transitions, software event tracing provides the raw performance data needed to understand all possible spatial and temporal interactions of parallel tasks. However, on parallel systems with hundreds of processors, application instrumentation of procedure calls, message passing, and input/output can quickly generate a large amount of performance data. (See Adve et al. [1] and other articles in this issue for a discussion of the alternatives to event tracing.) If the event frequency is high and the number of processors is large (in the hundreds), the aggregate data rate can be many megabytes/second. Moreover, for a fixed application problem size, both processor-interaction frequency and performance data volume can grow superlinearly with the number of processors.

Finally, the relations of specific performance metrics to application performance can vary widely across applications and parallel architectures. To understand these relations while managing potentially large volumes of dynamic performance data, we have developed a data-immersive virtual environment, called Avatar, which explores performance data and provides real-time adaptive control of application behavior.

PERFORMANCE-DATA PRESENTATION TECHNIQUES

Much scientific measurement and computational simulation focuses on physical processes that are continuous in space and time. Hence, complementary scientific visualization techniques focus on intuitive renderings of regularly spaced, n-dimensional data sets. In contrast, performance data is irregular in space and time, and other performance presentation techniques (for example, those from statistical graphics [2]) are more appropriate. Performance measurement generates n metrics for each of the p processors in a parallel system, but measurement times often depend on loosely correlated event transitions in each processor.
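To make the shape of this data concrete, the following is a minimal sketch of one way to hold n metrics for each of p processors when every processor reports at its own irregular times. The class and function names (`Trajectory`, `latest_at`, `snapshot`) are hypothetical and are not taken from Pablo or Avatar; the "most recent sample at or before t" rule is an assumption chosen for illustration.

```python
from bisect import bisect_right


class Trajectory:
    """Irregularly timed samples of n metrics for one processor (hypothetical)."""

    def __init__(self):
        self.times = []    # strictly increasing timestamps, in seconds
        self.samples = []  # one tuple of n metric values per timestamp

    def record(self, t, metrics):
        """Append a sample; timestamps must arrive in increasing order."""
        if self.times and t <= self.times[-1]:
            raise ValueError("timestamps must increase")
        self.times.append(t)
        self.samples.append(tuple(metrics))

    def latest_at(self, t):
        """Most recent sample at or before time t, or None if there is none."""
        i = bisect_right(self.times, t)
        return self.samples[i - 1] if i else None


def snapshot(trajectories, t):
    """State of all p processors at time t (assumed rule: hold the last sample)."""
    return [trajectory.latest_at(t) for trajectory in trajectories]
```

Because each processor keeps its own time axis, a display that needs all processors at one instant has to pick an alignment rule; holding the last sample is the simplest choice, and interpolation would be an equally plausible alternative.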

The behaviors of the p processors define p curves in an n-dimensional performance metric space. The measured data defines a series of irregularly spaced points on each processor's behavioral trajectory. Figure 1 shows the behavioral trajectory for two of eight processors executing a simple Jacobi iteration to solve a sparse linear system. The x-axis represents sliding-window averages of procedure-invocation lifetimes, while the y-axis shows the idle time as the processors await message receipt.

Figure 1. Processor behavioral curves for two processors. (x-axis: procedure lifetime, in milliseconds.)

This behavioral trajectory, called a phase portrait in classical mechanics, shows the relationship between two variables that depend on a third, independent variable (time). Because the code associated with Figure 1 is iterative, the measured processor behaviors define two closed paths in the metric space; for other codes with more irregular behavior, the curves need not be closed. (In Figure 1, differences across iterations are due to measurement variations and changing convergence-verification costs across iterations.) Understanding these trajectories' characteristics and correlations is the key to application and system software tuning.

The simplicity of this behavioral-trajectory analysis problem belies its difficulty. In practice, such problems involve hundreds of processors with 10 or more performance metrics for each, a microsecond time scale for events, and tens or hundreds of megabytes of performance data. Moreover, some of the performance metrics are discrete, while others are continuous, and their dynamic ranges can differ by multiple orders of magnitude. Within this context, we must correlate the movement of hundreds of points in a high-dimensional metric space, identifying those processors and metrics that are the critical performance determinants.

Because human visualization and spatialization skills enable recognition of only two- and three-dimensional projections, understanding the relations among abstract, multivariate data is difficult. In fact, performance analysts and statisticians face many of the same data analysis and visualization problems: In both cases, the data is high-dimensional, irregular, and sparse. Scatterplot matrices, which are used widely in the statistical graphics community, can help alleviate this problem. Avatar, our virtual world for performance analysis and tuning, generalizes these scatterplot matrices to scattercube matrices.

Scatterplot matrices

A scatterplot matrix is a generalization of the simple two-dimensional scatterplot that contains $N^2$ x-y scatterplots. As illustrated in Figure 2, each component scatterplot shows one of the possible projections from N to two dimensions. Because the N projections on the diagonal of the scatterplot matrix are degenerate (both variables on the individual scatterplots are the same), there are $N^2 - N$ nondegenerate projections. By symmetry, the projections above and below the diagonal are simple transpositions of each other.

For example, Figure 2 shows an eight-dimensional scatterplot matrix with performance data from a parallel genome-sequencing code involving extensive input/output and interprocessor communication. The diagonal of this scatterplot matrix contains box and whisker plots [2] of each metric's minimum, mean, and quartiles.

Figure 2 highlights several important aspects of performance data and the limitations of scatterplot matrices. First, and most important, some performance metric pairs, such as seek and file-read durations, are highly correlated, while others, such as blocking (synchronous) message-send delays and procedure-invocation lifetimes, are not. The application code, input data, and underlying system software and hardware determine which metrics are strongly correlated and influence their dynamic ranges; wide variations exist across codes and code executions.

Second, in each projection, the data forms a few behavioral equivalence classes, because all processors typically execute the same code with data-dependent control flow. In practice, a few behaviors dominate, and the data from most processors lies in one of the associated clusters. Understanding why the remaining processors behave as outliers is often the key to improving performance.

Finally, although scatterplot matrices highlight bivariate correlations, trivariate or higher degree correlations are not obvious. The existence of multiple, bivariate clusters does not imply clustering in three or more dimensions. To redress this constraint, statisticians have introduced graphical brushing, where interactively highlighting a cluster of data points in one scatterplot highlights the same points in all other scatterplots. The cluster dimensionality is determined by the number of scatterplots where the highlighted points are adjacent.

In short, scatterplot matrices are attractive and intuitive. But they show only bivariate relations and do not exploit important aspects of our visual sense, notably our kinematic and spatialization skills.

Scattercubes

Understanding a three-dimensional object's shape is greatly simplified by binocular stereo images and the ability to change the viewing perspective; through their own head and body movements, users can study the three-dimensional object in a natural way. To exploit this capability, while retaining the attractive features of scatterplot matrices, we use a three-dimensional generalization of scatterplot matrices.

Figure 2. Scatterplot matrix.
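The graphical brushing idea described for scatterplot matrices can be sketched in a few lines. This is an illustrative sketch only: the function names and the rectangular brush are assumptions, and the sample points are invented to show how a cluster that looks tight in one panel can separate along a third metric.

```python
from itertools import permutations


def projections(n):
    """Ordered (x, y) metric pairs for the off-diagonal panels: n*n - n of them."""
    return list(permutations(range(n), 2))


def brush(points, x_axis, y_axis, x_range, y_range):
    """Indices of the points inside a rectangle drawn in one panel."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    return {
        i
        for i, p in enumerate(points)
        if x_lo <= p[x_axis] <= x_hi and y_lo <= p[y_axis] <= y_hi
    }


def highlighted(points, selected, x_axis, y_axis):
    """Coordinates of the brushed points as they appear in another panel."""
    return [(points[i][x_axis], points[i][y_axis]) for i in sorted(selected)]


# Invented data: three processors, three metrics each.
points = [(0.10, 0.20, 0.90), (0.15, 0.25, 0.10), (0.80, 0.90, 0.50)]
selected = brush(points, 0, 1, (0.0, 0.3), (0.0, 0.3))
# The two brushed points are neighbors in panel (0, 1) ...
# ... but lie far apart along metric 2 when shown in panel (0, 2).
far_apart = highlighted(points, selected, 0, 2)
```

The selection is stored as point indices rather than coordinates, which is what lets every other panel highlight the same processors regardless of which metrics it plots.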

Our three-dimensional generalization of scatterplot matrices, which we call a scattercube matrix, contains $N^3$ three-dimensional scatterplots. Figure 3 shows the case when $N = 6$. In each cube, the coordinate axes correspond to three of the performance metrics. Like a scatterplot matrix, a scattercube matrix contains both degenerate and nondegenerate cubes. In a scattercube matrix, a diagonal of N cubes in the interior is three-fold degenerate (all three axes of the cubes on the diagonal correspond to the same metric); these are the gray cubes in Figure 3. In addition, three planes of $N^2$ cubes are two-fold degenerate; these are the red, blue, and green cubes. Because the three degenerate planes share the same degenerate diagonal, there are $N^3 - 3N^2 + 2N$ nondegenerate cubes (the violet cubes in Figure 3). In terms of symmetry, a scattercube matrix is similar to, but more complex than, a simple scatterplot matrix. Each coordinate-orthogonal plane is a variation on a scatterplot matrix; the diagonal along each coordinate-orthogonal plane is degenerate, and the individual scatterplots are three- rather than two-dimensional. Moreover, the degenerate planes define a three-fold symmetry.
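The cube counts above can be checked by brute-force enumeration. The sketch below is only a verification aid; the function name `classify_cubes` is hypothetical. It labels each of the $N^3$ axis triples by how many distinct metrics it uses.

```python
from itertools import product


def classify_cubes(n):
    """Count scattercubes by degeneracy for n metrics (n**3 cubes in total)."""
    counts = {"threefold": 0, "twofold": 0, "nondegenerate": 0}
    for axes in product(range(n), repeat=3):
        distinct = len(set(axes))
        if distinct == 1:
            counts["threefold"] += 1      # all three axes show the same metric
        elif distinct == 2:
            counts["twofold"] += 1        # exactly two axes show the same metric
        else:
            counts["nondegenerate"] += 1  # three different metrics
    return counts


counts = classify_cubes(6)
```

For $N = 6$ this gives 6 three-fold degenerate cubes, 90 cubes that are strictly two-fold degenerate (the three planes of 36 cubes with the shared diagonal removed from each), and 120 nondegenerate cubes, matching $6^3 - 3 \cdot 6^2 + 2 \cdot 6 = 120$.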

Figure 3. A 6 × 6 × 6 scattercube.

Figure 4 illustrates this symmetry when the degenerate cubes are not displayed. Each cube group in Figure 4 reflects the group diagonally opposite.

Figure 4. Scattercube symmetry for a 6 × 6 × 6 cube.

Finally, the phase portraits in Figure 1 can easily be generalized to three dimensions. Figure 5 shows the interior of a single scattercube, with the current value of each processor's performance metrics denoted by the location of the octahedra in the three-dimensional metric space.

The axes represent performance metrics from a parallel input/output library called the Portable Parallel File System (PPFS) [3], which supports a client-server model. In the PPFS, application clients issue requests to user-level file servers that collectively implement various file-system caching and prefetching policies (see sidebar "Portable Parallel File System").

Figure 5 shows three performance metrics from an execution of an application code with the PPFS library: server hits (the number of times client requests were found in the server file caches), server service time (the time servers spent satisfying client requests), and client service time (the time the user code was blocked on file requests). These metrics reveal the dynamic behavior of the PPFS and show the efficacy of data caching and prefetching policies.

The current position of each processor in the metric space is denoted by an octahedron. A history ribbon can be associated with an octahedron to show the octahedron's last k positions, with its most recent positions marked by the brightest blue ribbon and the oldest positions marked by very dark blue (that is, the ribbon color varies from bright blue to dark blue with age). The yellow octahedron in the upper left corner of Figure 5 has an associated history ribbon. These history ribbons are also visible in Figure 4.
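A history ribbon of the last k positions maps naturally onto a bounded queue. The following sketch assumes a linear fade from a dark floor to full brightness; the class name, the `dark` parameter, and the linear ramp are illustrative choices, not details given in the article.

```python
from collections import deque


class HistoryRibbon:
    """Last k positions of one data point, shaded from dark (old) to bright (new)."""

    def __init__(self, k, dark=0.2):
        self.positions = deque(maxlen=k)  # the oldest entry is dropped automatically
        self.dark = dark                  # assumed blue intensity of the oldest entry

    def update(self, position):
        """Record the point's newest position in metric space."""
        self.positions.append(tuple(position))

    def segments(self):
        """(position, blue_intensity) pairs, ordered oldest first."""
        n = len(self.positions)
        shaded = []
        for rank, position in enumerate(self.positions):  # rank 0 is the oldest
            fraction = 1.0 if n == 1 else rank / (n - 1)
            shaded.append((position, self.dark + (1.0 - self.dark) * fraction))
        return shaded
```

A renderer would draw consecutive segments with these intensities, so the ribbon length is bounded by k no matter how long the application runs.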

Figure 5. Scattercube phase behavior.

VIRTUAL ENVIRONMENT INFRASTRUCTURE

Application performance is critically sensitive to system and application configuration parameters. Therefore, we have designed Avatar to support both performance data immersion and interactive, real-time adaptive control. Immersion lets users observe, explore, and modify attributes of the scattercube display while "inside" the performance data, whereas real-time, interactive control lets users modify application and system parameters and immediately see how performance is affected.

Successfully integrating performance instrumentation and real-time data extraction, an immersive virtual environment, and adaptive control mechanisms imposes rigid software design and interactive response-time constraints. Below, we describe the software and hardware infrastructure needed to satisfy these constraints along with the most salient aspects of their interaction.

Figure 6. Avatar's logical organization. (Diagram components: parallel system running the application program with Pablo performance instrumentation; performance data; data interface; data manager; data-presentation metaphor; data presentation; immersed user; user controls; adaptive controls.)

Software design

Figure 6 shows the logical structure of Avatar's software components. A parallel-application code, instrumented with the University of Illinois's Pablo software [4, 5], generates time-stamped performance data
in Pablo's self-describing data format (SDDF). A data interface accepts performance data from the parallel system and buffers it for subsequent rendering by the presentation-metaphor software (where a display metaphor is a schema that captures a particular perspective on system behavior). Finally, a data manager realizes any needed data transformations, such as scaling, and computes ancillary data, such as data centroids, for the metaphor-rendering software.

Hardware support

Although Avatar was developed on Silicon Graphics (SGI) systems, its object-oriented implementation lets us port the software to various hardware configurations by simply refining the appropriate hardware-interface classes. Currently supported configurations include a simple workstation monitor display, a six-degree-of-freedom tracker and head-mounted display, and the Cave Automatic Virtual Environment (CAVE) virtual reality theater [6]. The CAVE's primary display is a room-sized cube with walls illuminated by high-resolution, rear-projection video displays.

The workstation environment provides an inexpensive, nonimmersive virtual reality, which is adequate for development and testing and effective for simple demonstrations. A user can experience stereo by wearing a pair of LCD shutter glasses that are synchronized with the display of left and right eye views.

The head-mounted display version provides immersion by filling the field of view with synthetic imagery and exploiting special-purpose peripherals that enhance the sense of immersion. Head and hand trackers provide the requisite data to render scenes in response to user movements. Stereo headphones and sound-spatialization hardware create the illusion that sounds originate from particular locations in the virtual environment. Simple speech recognition and synthesis hardware augment the tracked mouse with oral commands and voice acknowledgment.

The CAVE version of the code supports high-resolution imagery without the encumbrance of a head-mounted display; the user needs only LCD shutter glasses, a head tracker, and a tracked mouse to control stereo displays. The CAVE currently supports data sonification but not sound spatialization or voice recognition.

Sidebar: Portable Parallel File System

The PPFS is a user-level, parallel input/output library that lets applications control the placement of file data across multiple storage devices, choose caching and prefetch policies, and specify data-consistency protocols. In the PPFS client-server model, we can dynamically reconfigure file-cache sizes at the application (client) and PPFS (server) level. By carefully matching PPFS parameters and data management policies with application access patterns, we can sometimes increase application input/output performance by an order of magnitude over that achievable with a native Unix file system alone.

Intuitively, request-aggregation, write-behind, prefetching, and caching policies better match the application request stream to the underlying file system's capabilities. And, with dynamic file-system reconfiguration, the user can interactively explore many possible input/output optimizations during a single application execution.

Augmenting PPFS with the Pablo instrumentation library lets us capture event traces of internal PPFS state transitions, procedure calls, and input/output events. This data, along with sliding-window averages of lower level input/output performance (such as queue lengths and delays, service times, and request throughputs), can be transmitted in real time to remote sites via network sockets. Finally, PPFS can accept dynamic reconfiguration requests from a socket. With Avatar, the user can examine the PPFS performance data in real time, change PPFS file-system policies and policy parameters, and see the resulting changes in performance (see Figure A).

Figure A. PPFS/Avatar interaction mechanism. (Diagram labels: client; native messages; control translator; PPFS; Avatar; TCP control packets; UDP SDDF records; controls, including cache and write-back. Abbreviations: PPFS, Portable Parallel File System; SDDF, self-describing data format; TCP, Transmission Control Protocol; UDP, User Datagram Protocol.)

Pablo performance instrumentation

An instrumented application invokes the Pablo instrumentation software to record salient application events. To minimize the volume of data that must be rendered while still providing details on application dynamics, we exploit Pablo's capability to compute sliding-window averages of performance metrics. By adjusting the window size, we can balance instrumentation detail against data volume. The resulting performance data can be either output to a file for postmortem analysis or sent directly to Avatar through a Unix socket for real-time analysis.

All performance data is described by Pablo's SDDF [4], which shares features of other data metaformats but is designed specifically to describe performance data. By letting users interactively select the desired SDDF records, Avatar does not need to make assumptions about the semantics of the records it receives. The same presentation metaphor can be used for many different types of dynamic performance data.

Scattercube metaphor

Relying on the SDDF data metaformat and separating presentation from data management has allowed us to build realizations of the scattercube metaphor for the workstation, head-mounted-display, and CAVE environments. Moreover, this software separation has reduced the effort needed to implement other data-presentation metaphors.

At initialization, the user interactively maps all or part of the performance metrics to the scattercube dimensions. In individual scattercubes, three of these metrics define the position of each data point. Additional attributes of each data point, such as size, color, or sound, let a scattercube represent more than three performance metrics. Within each scattercube, historical display of each point's movements creates three-dimensional phase portraits.

For real-time performance presentation and adaptive control, data updates from the parallel system must be timely, lest the user base decisions on old, possibly obsolete, data. The scattercube metaphor ties each point's transparency to its age. Thus, each data point fades as the time increases since the last performance data was received from the associated processor. If the interval becomes too great, the point simply disappears.

Data sonification

Despite the efforts of a few data sonification pioneers, data presentation has long been synonymous with graphics and visualization, and only recently have nonvisual data representations become widely accepted. The analog in virtual reality systems is three-dimensional audio [7], which provides the illusion through stereo headphones that a sound emanates from a particular location in space. Such spatialized sound can add realism by mimicking the physical world, and, in a virtual environment, can heighten awareness and increase the number of data-presentation options.

Avatar uses sound to reinforce the displayed data, increase the number of effective data-display dimensions by conveying the values of metrics not visually presented, and aid navigation and interaction in virtual space. To convey the statistical characteristics of the performance metrics within each scattercube, a sound source is placed at the time-varying centroid of the data points within that scattercube. The distance from the scattercube origin to the source defines the pitch of the emitted sound. Hence, low-pitched sounds are emitted when the data centroid is near the origin, and high-pitched sounds are generated when the data centroid is far from the origin. When a user first enters a scattercube, the sound's origin helps the user locate most of the data.

Alternatively, users can associate a sound source with an individual point in a scattercube. The attributes (for example, pitch, timbre, or sustain) of this sound source can be fixed or can be associated with other performance metrics. In either case, Avatar can plot a point or trajectory on the basis of sound, and users can hear the phase-space behavior.
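The centroid-to-pitch mapping described above can be sketched as follows. The article states only that pitch rises with the centroid's distance from the scattercube origin; the unit-cube scaling, the 220 Hz to 880 Hz range, and the logarithmic interpolation are assumptions made for this example.

```python
import math


def centroid(points):
    """Mean position of a non-empty list of 3D data points."""
    if not points:
        raise ValueError("at least one data point is required")
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def pitch_hz(points, f_low=220.0, f_high=880.0):
    """Pitch for a scattercube whose metrics are assumed scaled to [0, 1]."""
    distance = math.dist((0.0, 0.0, 0.0), centroid(points))
    fraction = min(max(distance / math.sqrt(3.0), 0.0), 1.0)  # sqrt(3) = cube diagonal
    # Interpolate in log-frequency so equal distance steps sound like equal intervals.
    return f_low * (f_high / f_low) ** fraction
```

With these assumed defaults, a centroid at the origin sounds at 220 Hz and a centroid at the far corner of the cube sounds two octaves higher.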

Interactive controls

Natural, intuitive control is the essence of data immersion; if the controls are awkward or confusing, the illusion of immersion is quickly lost. To lessen the complexity of the virtual environment interface and reduce the learning curve, several less frequently used configuration controls are accessible only through the workstation interface. The remaining controls are accessible through a combination of mouse and voice commands.

The mouse and tracker generate button signals along with mouse position and orientation. Spoken commands are recognized and positively acknowledged via synthesized voice response. The user can combine mouse and voice commands to choose and configure items from a group of windows and menus. For the sake of simplicity and familiarity, these windows and menus resemble those of a standard workstation or PC windowing system. For controls such as those in Figure 7, the movement of the tracked mouse is projected into the control panel window, and the user modifies items by pointing, clicking, and dragging with the mouse.

Figure 7. Application controls (genome-sequencing code).

As Figure 6 suggests, Avatar includes interactive controls for both parallel applications and data-presentation metaphors. One set of controls lets users adjust application behavior in response to observed performance data, while another lets them change display attributes.

PARALLEL APPLICATION CONTROLS. To support interactive control of parallel-system behavior, we have developed a control library that lets application codes accept and respond to interactive-control requests. We have used this library to develop a version of the Portable Parallel File System (PPFS) that can respond to interactive controls issued from Avatar.

Because PPFS supports a rich set of file system policies, choosing the best match of file system policies and application access patterns would typically require several cycles of policy selection and testing. However, with interactive controls, we can immediately change PPFS parameters and see their effects. Current controls include both the PPFS server and client file-cache sizes as well as the cache write-back and prefetch parameters.

Besides enabling file-system controls, the PPFS and our instrumentation software permit interactive adjustment of performance-measurement windows. With large windows, the system reports average performance-metric values over long time intervals, minimizing performance-data volume and data-extraction overhead. Conversely, small windows provide detailed, high-resolution data and can track rapid changes in performance, albeit with higher data volume and extraction overhead. Varying the window size lets users adjust the performance-data rate as needed to balance instrumentation overhead and detail.

METAPHOR CONTROLS. Most controls associated with a particular data-presentation metaphor are necessarily metaphor dependent, although some (for example, flying) are metaphor independent. For the scattercube metaphor, Avatar supports control of display attributes (such as cube wall opacity, data scales, and the mapping of metrics to scattercube axes) and data attributes (such as data-point brushing and history lines). Like application controls, all metaphor controls are realized through menus and control panels.

To move about scattercubes, Avatar users can fly large distances using the mouse or move locally via head and body motions. Movements can be recorded for later replay as a fixed flight path through the scattercubes. Alternatively, user position can be fixed while the scattercube matrix rotates about one or more axes. This lets users see the relations among all the performance metrics without choosing a flight path that circumscribes all the scattercubes.

In addition to the mouse controls, voice-activated toggles can control display of phase-behavior lines and scattercube characteristics. When the scattercube faces are opaque, users can neither see inside multiple scattercubes from the outside nor see through the wall of a single scattercube that they are inside. In our experience, users begin with an external view, where all scattercube faces are translucent (as in Figure 4), then they circumnavigate the scattercube matrix, fly into a single scattercube, and raise the opacity of that cube to focus attention on the data within.

EXPERIENCES

Avatar has been operational for about two years. Although development continues, with emphasis on support for new presentation metaphors, the general structure and functionality have stabilized. On the basis of our experiences with real-time performance analysis and interactive control, we can draw some general conclusions about our design choices.

Genome-sequence comparison under the PPFS

To assess the utility of Avatar, we selected a parallel implementation of a genome-sequence comparison code that executes under our PPFS on an Intel Paragon XP/S. The Paragon XP/S is a two-dimensional mesh of compute, service, and input/output nodes, each with its own local memory.

Because the synthesis methods currently used to determine genetic sequences produce nontrivial numbers of errors, exact string-matching algorithms are inappropriate for biological sequences. One approximate sequence-matching approach is based on a generalization of the Needleman, Wunsch, and Sellers (NWS) [8] dynamic-programming algorithm, with a K-tuple heuristic that prunes the search space to improve performance. With this algorithm, the input sequence is processed against all entries in the genome database, and the database entry generating the highest score is declared the best match to the input sequence.

In our parallel implementation of the NWS algorithm, each Paragon XP/S node independently compares the test sequence against disjoint portions of the sequence database. Unfortunately, a simple static partitioning of the genome database yields poor load balance, because comparison times depend heavily on sequence content and size. Maximizing performance requires a dynamic approach, where parallel tasks read groups of new sequences from the database as needed. If too many sequences are read, load
November 1995

-

imbalances result. Conversely, reading too few sequences fails to amortize the cost of input/output operations.

Because the PPFS library provides a rich set of file-cache and prefetch policies, interposing the PPFS between the genome-sequence comparison code and the native Paragon XP/S file system lets users tune file-system behavior to meet their application needs. By dynamically changing the size of the PPFS cache and the aggressiveness of the sequence prefetch policy, users can find a configuration that maximizes performance without changing the application source code. The correct parameter choices depend on the characteristics of the sequence database and the test sequence. Because these characteristics vary widely and cannot be predicted, interactive file-policy tuning can substantially reduce comparison times.

The Avatar environment lets us adjust PPFS parameters while monitoring real-time performance data. As Figure 8 shows, the distinction between an effective policy configuration and an undesirable configuration is striking. The three performance-metric axes represent the length of the queue of pending read requests, the number of file cache hits, and the time to service application read requests. For these metrics, optimality is at the upper rear corner of the cube: low client service time, small queue length, and high server hit counts. Comparing the top and bottom of Figure 8 shows that increasing the cache size and prefetch amounts dramatically reduces the length of the queues, increases the cache hit count, and decreases the client (application) read-service times. In practice, interactively identifying this combination of PPFS parameters takes only a few minutes, and the correct parameters reduce application execution time by an order of magnitude on 256 processors of the Intel Paragon XP/S. More generally, because the input/output performance of many large-scale, parallel codes is strongly sensitive to request sizes and patterns, seeing system dynamics enables application scientists to more readily understand temporal performance variations and study the effects of changing application parameters and algorithms.

Figure 8. Portable Parallel File System (PPFS) cache configuration: (top) suboptimal cache configuration; (bottom) optimal cache configuration.

Interaction experiences

Intuitive interaction is undoubtedly the most difficult design problem in the creation of a virtual reality system, particularly one like Avatar that shows abstract data. Without effective navigation techniques, users will not fully explore the data space, and without appropriate cues, those who attempted to do so would quickly become lost.

NAVIGATION. In our experience, the utility of a particular navigation technique depends on the desired displacement from the current location. Walking is the most natural and simplest navigation technique, but it is appropriate only for exploration within a single scattercube or its adjoining cubes. Walking makes it difficult to sustain the illusion of immersion when concerned about physical obstacles and cabling constraints, neither of which can be seen when wearing a head-mounted display. Given these constraints, novice users are particularly reluctant to walk while wearing a head-mounted display but feel quite free to move about in the CAVE.

For long-distance movement, either in the CAVE or when wearing the head-mounted display, flying is essential. Unfortunately, flying by pointing the mouse in the desired direction is not intuitive. Inertia and thrust can only be represented visually, making it easy to misjudge angle and speed.

In general, giving users complete control to walk or fly is best when they are moving toward and exploring a specific scattercube. For large-scale exploration, user-controlled navigation must be complemented with fixed flight paths and visual reference cues. Fixed paths let users focus on the data rather than mentally balancing navigation and data analysis.

Rotation and orbiting provide a global view of the scattercubes with little chance of disorientation: The flight path is closed, and the users return to their point of origin. Flight path replay permits more general paths, albeit with greater chance of disorientation. This is especially useful for letting users share an earlier exploratory trip or providing inexperienced users with more interesting views than those from simple rotations.
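The closed flight path behind rotation and orbiting can be illustrated with a short sketch. This is not Avatar's implementation: the function names (orbit_path, look_direction), the choice of a circular orbit in a horizontal plane, and the y-up coordinate convention are all assumptions made for illustration. The sketch generates camera positions around a scattercube's center and repeats the first position at the end, so the viewer returns to the point of origin.

```python
import math


def orbit_path(center, radius, height, steps):
    """Return camera positions on a closed circular orbit around `center`.

    Hypothetical helper: assumes a y-up coordinate system and an orbit
    in the horizontal plane, raised `height` above the center.
    """
    cx, cy, cz = center
    path = []
    for i in range(steps):
        angle = 2.0 * math.pi * i / steps
        path.append((cx + radius * math.cos(angle),
                     cy + height,
                     cz + radius * math.sin(angle)))
    # Repeat the first position so the path is closed and the
    # viewer ends the flight at the starting point.
    path.append(path[0])
    return path


def look_direction(position, center):
    """Unit vector pointing from the camera position toward the center.

    Assumes `position` differs from `center`.
    """
    delta = [c - p for p, c in zip(position, center)]
    norm = math.sqrt(sum(d * d for d in delta))
    return tuple(d / norm for d in delta)
```

Replaying a recorded exploratory trip would use the same idea with an arbitrary stored list of positions in place of the generated circle; only the orbit guarantees a return to the starting point.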

Computer

VISUAL AND SONIC CUES. The scattercube metaphor presents the user with a multiplicity of similar, three-dimensional scatterplots. Without distinctive visual and sonic cues, users can lose the identity of a specific scattercube. Color coding the cubes on the basis of their degeneracy, as in Figure 3, provides rough guidance about the user's current position. However, it does not distinguish cubes of the same class or indicate a specific location within the scattercube array.

We have found that color coding is best used in conjunction with an opacity control for the cube faces. When a user is inside a specific scattercube, opaque faces focus attention on the data in that cube. However, it is difficult to navigate among the cubes, because context is lost. Moreover, it is impossible to view the behavior of any metrics when not inside a cube (for example, while flying). Transparent faces permit observation of data in other cubes, but background clutter makes it difficult to identify data in a specific cube. Translucent cube faces strike a balance between opacity and transparency by affording an overview of many cubes simultaneously (with the nearest cubes the most visible) while still delineating cube boundaries. Adjusting the degree of translucency provides a continuum of local focus and global perspective.

Within a single scattercube, labeling the axes at the origin and adding data scales, as in Figure 5, uniquely identifies each cube. With the resolution of today's head-mounted displays, these labels are readable only from within the cube where the user is currently located. However, with the CAVE's higher resolution projections, axis labels are excellent navigational aids. Users can quickly fly through cubes, searching for a specific projection.

Finally, placing a sound source at the origin or at the data centroid of a single scattercube permits quick orientation relative to the axes or most of the data. Because users hear sound only when they are inside a scattercube, sound sources provide a complementary navigational aid, and users can readily determine the dispersion of the data in a series of scattercubes simply by choosing a flight path that intersects multiple scattercubes.

DATA UNDERSTANDING AND CORRELATION. The scattercube metaphor was designed to study the dynamic behavior of performance metrics and show data clustering in many dimensions. Displaying data-point history is perhaps the most useful mechanism for understanding dynamic behavior, as it extends phase portraits to three dimensions.

If enough history is displayed, not only is any cyclic behavior evident but also, by observing a data point across multiple cubes (for instance, by flying along a coordinate axis), users can see the movement in four or more dimensions. Second, history lines distinguish the movements of multiple points. Without history, if several points moved at once, it would be difficult to determine which new metric values were associated with each point. Finally, history

Related research

Our performance-data presentation system draws on a long history of performance-analysis software, statistical graphics, and virtual reality research. Heath1 describes a suite of two-dimensional graphics displays for representing dynamic performance data. Cleveland2 presents a cogent summary of the statistics community's techniques for visualizing irregular data, including early experiences with three-dimensional scatterplots. Our work differs in its generalization of scatterplot matrices to encompass three-dimensional scatterplots and its integration of history lines to show phase behavior.

The closest analog to our work within the virtual reality community is Beshers and Feiner's work on multidimensional data spaces for visualization of financial data. AutoVisual3 and its predecessor, n-Vision, use "worlds within worlds" to display N-dimensional data. Both create a hierarchy of three-dimensional displays, where users can recursively nest a group of displays within one display by selecting a point. Our work differs in that it imposes no hierarchy on the data dimensions: All are treated as equals, and users need not assign an a priori order or importance.

Finally, because Avatar supports real-time adaptive control, it bears some relation to other telepresence projects, for example, the Monterey Bay Aquarium Research Institute (MBARI) remotely operated vehicle (ROV)4 and the University of North Carolina/University of California, Los Angeles nanomanipulator.5 Avatar differs from both the ROV and the nanomanipulator in its focus on abstract, multivariate data rather than on a physical system. However, like the ROV, our parallel performance instrumentation provides a camera for peering into the murky depths of parallel systems, and like the nanomanipulator, our adaptive controls provide a probe for prodding the mysterious beasts that live there.

References

1. M.T. Heath and J.A. Etheridge, "Visualizing the Performance of Parallel Programs," IEEE Software, Vol. 8, No. 5, Sept. 1991, pp. 29-39.
2. W.S. Cleveland and M.E. McGill, eds., Dynamic Graphics for Statistics, Wadsworth & Brooks/Cole, Pacific Grove, Calif., 1988.
3. C. Beshers and S. Feiner, "AutoVisual: Rule-Based Design of Interactive Multivariate Visualizations," IEEE Computer Graphics & Applications, Vol. 13, No. 4, July 1993, pp. 41-49.
4. B.H. Robinson, "Midwater Research Methods with MBARI's ROV," Marine Tech. Soc. J., Vol. 26, No. 4, Winter 1992, pp. 32-39.
5. R.M. Taylor et al., "The Nanomanipulator: A Virtual Reality Interface for a Scanning Tunneling Microscope," Proc. SIGGraph 93, ACM, New York, 1993, pp. 127-134.
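The generalization from scatterplot matrices to three-dimensional scatterplots amounts to enumerating every choice of three metrics from the N available. The following sketch shows that enumeration under stated assumptions: the function names (scattercube_axes, project) and the dictionary representation of a data point are hypothetical, and the fourth metric name, bytes_read, is invented for illustration. The first three names echo the PPFS metrics discussed in the article.

```python
from itertools import combinations


def scattercube_axes(metric_names):
    """List every unordered triple of metrics; each triple is one scattercube."""
    return list(combinations(metric_names, 3))


def project(point, axes):
    """Project an N-dimensional point (a dict keyed by metric name)
    onto the three axes of a single scattercube."""
    return tuple(point[name] for name in axes)


# Hypothetical metric set for illustration only.
metrics = ["queue_length", "cache_hits", "service_time", "bytes_read"]
cubes = scattercube_axes(metrics)

sample = {"queue_length": 3, "cache_hits": 17,
          "service_time": 0.25, "bytes_read": 4096}
first_cube_point = project(sample, cubes[0])
```

With N metrics this yields N(N-1)(N-2)/6 cubes, so four metrics give four scattercubes and ten metrics give 120. No metric is privileged in the enumeration, which matches the sidebar's point that no hierarchy is imposed on the data dimensions.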

November 1995

lines show the magnitude of metric changes within the sliding-window interval: short lines for small changes and long lines for larger changes.

We have discovered that enabling history for a subset of the points rather than all of them (for example, those in a bounding volume or a few representatives) provides significant insight. In this case, history lines play the role of brushing2 in statistical graphics: If representatives are clustered in all scattercubes, then the data is clustered in all dimensions.

One problem with observing real-time data is temporal accuracy. A data point whose associated metrics have not been recently updated may appear near other, more recently updated points. This can be alleviated via aging: increasing the transparency of a data point on the basis of the time since its last update.

Seeing system dynamics helps us understand the temporal variation in performance and the effects of changing application parameters. Rather than running the entire application several times using test data sets to identify appropriate parameters, we could adjust those parameters interactively to increase performance. This is vital, because the performance of many dynamic codes is strongly sensitive to input data characteristics, which makes it extraordinarily difficult to identify a priori a single, globally optimal parameter configuration.

DIRECTION AND FUTURE WORK

Although Avatar has been operational for about two years, like all experimental software projects its implementation has raised more research questions than it has answered. (See sidebar "Related research.") Besides refining system functionality, we are working to add new data presentation metaphors and interaction mechanisms.

The scattercube metaphor abstracts a parallel program's behavior as a group of dynamic performance metrics. Although this lets us study performance-metric correlations, the direct relation to application-code fragments is lost. We have developed a time-line metaphor to represent the processor interactions when executing application code on parallel systems.

In the time-line metaphor, processor icons are equally spaced about the circumference of a cylinder, with the cylinder axis representing time; that is, each processor's time line extends along the cylinder axis. Along each time line, icons represent processor activities, while lines from the initiating processor to the recipient processor represent cross-processor interactions. Users can determine interaction durations and query the system for additional information on selected activities. In the future, critical and near-critical paths will be highlighted, and users will be able to determine how removing a critical path affects performance.

We have also instrumented Avatar to obtain detailed, dynamic data on tracker overhead and lag, rendering rates, data-processing costs, and command processing. With this data, we will be able to analyze the end-to-end performance as a function of user behavior and data complexity.

We are now using Avatar to study two types of high-performance input/output: parallel scientific codes and World Wide Web servers. The goal of the first collaborative research effort and of the nascent scalable input/output initiative (SIO) is to redress the input/output limitations of today's massively parallel systems via a broad-based effort that includes performance-analysis, operating-system, compiler, and application researchers. In the second domain, we are using Avatar to analyze the access patterns to the National Center for Supercomputing Applications' (NCSA) WWW server, using the request logs recorded by that server. Our goal in this effort is to understand the types of requests and access patterns and the implications for future-generation server design.

RECORDING AND ANALYZING the dynamics of application-program, system-software, and hardware interactions are the keys to understanding and tuning the performance of massively parallel systems. We have implemented Avatar, a data-immersive virtual world for performance analysis and real-time control of application behavior. Avatar shows all possible three-dimensional projections of a sparsely populated, N-dimensional metric space. Our early experiences with Avatar suggest that the combination of its performance-metric correlation and its capability to interactively modify application behavior provides a powerful mechanism for performance optimization.

Acknowledgments

This work was supported in part by the Advanced Research Projects Agency under ARPA contracts DAVT63-91-C-0029 and DABT63-93-C-0040; the National Science Foundation under grants NSF IRI 92-12976, NSF CDA94-01124, and NSF CDA87-22836; the National Aeronautics and Space Administration under NASA Contract Number NAG-1-613; and a collaborative research agreement with the Intel Supercomputer Systems Division. We are indebted to Duane Andres for his software contributions to the early development of Avatar and to Stephen Lamm for his recent additions. Ruth Aydt, Roger Noe, Tara Madhyastha, Bradley Schwartz, and Brian Totty contributed to the Pablo performance analysis software and offered valuable advice on the design of Avatar. Finally, we owe special thanks to Phil Roth for Figure 2.

References

1. V.S. Adve et al., "An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs," to appear in Proc. Supercomputing 95, ACM, New York.
2. W.S. Cleveland and M.E. McGill, eds., Dynamic Graphics for Statistics, Wadsworth & Brooks/Cole, Pacific Grove, Calif., 1988.
3. J.Y. Huber et al., "PPFS: A High-Performance Portable Parallel File System," Proc. Ninth ACM Int'l Conf. Supercomputing, ACM, New York, 1995, pp. 385-394.
4. D.A. Reed, "Experimental Performance Analysis of Parallel Systems: Techniques and Open Problems," Proc. Seventh Int'l Conf. Modeling Techniques and Tools for Computer Performance Evaluation, Springer-Verlag, Secaucus, N.J., 1994, pp. 25-51.
5. D.A. Reed et al., "Scalable Performance Analysis: The Pablo Performance Analysis Environment," Proc. Scalable Parallel Libraries Conf., IEEE CS Press, Los Alamitos, Calif., Order No. 4890, 1993, pp. 104-113.
6. C. Cruz-Neira, D.J. Sandin, and T. DeFanti, "Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE," Proc. SIGGraph 93, ACM, New York, 1993, pp. 135-142.
7. E.M. Wenzel, "A Virtual Display System for Conveying Three-Dimensional Acoustic Information," Proc. Human Factors Soc., Human Factors Soc., Santa Monica, Calif., 1988, pp. 86-90.
8. S.B. Needleman and C.D. Wunsch, "An Efficient Method Applicable to the Search for Similarities in the Amino Acid Sequences of Two Proteins," J. Molecular Biology, Vol. 48, No. 1, Feb. 1970, pp. 444-453.

Daniel A. Reed is a professor in the Department of Computer Science at the University of Illinois, Urbana-Champaign, where he holds a joint appointment with the National Center for Supercomputing Applications (NCSA). Reed received a BS degree (summa cum laude) in computer science from the University of Missouri, Rolla, in 1978 and MS and PhD degrees in computer science from Purdue University in 1980 and 1983, respectively. He was a recipient of the 1987 National Science Foundation Presidential Young Investigator Award. Reed serves on the boards of IEEE Transactions on Parallel and Distributed Systems, Concurrency Practice and Experience, and the International Journal of High-Speed Computing. He is a member of the NASA RIACS Science Council.

Keith A. Shields is a member of the technical staff at the Analytic Sciences Corporation. Shields received a BS degree (magna cum laude) in computer science from the University of South Alabama in 1991 and an MS degree in computer science from the University of Illinois in 1994. Shields is a member of ACM and IEEE.

Will H. Scullin is a graduate student pursuing an MS degree in the Department of Computer Science at the University of Illinois, Urbana-Champaign. He received a BA degree (with distinction) in computer science in 1993 from the University of Minnesota, Morris.

Luis F. Tavera is a PhD candidate in the Department of Computer Science at the University of Illinois, Urbana-Champaign. Tavera received a BS degree in physics engineering in 1988 from the Universidad Iberoamericana in Mexico City, Mexico. He completed an MS degree in computer science at the University of Illinois in 1994.

Christopher L. Elford is a PhD candidate in the Department of Computer Science at the University of Illinois, Urbana-Champaign. Elford received a BS degree (magna cum laude) in computer science from the University of Houston in 1991 and an MS degree in computer science from the University of Illinois in 1994.

Readers can contact the authors at the Department of Computer Science, University of Illinois, Urbana, Illinois 61801; e-mail {reed, shields, scullin, tavera, elford}@cs.uiuc.edu.
