"GeoAnalytics" - Exploring spatio-temporal and multivariate data


Mikael Jern, Johan Franzén
NVIS - Norrköping Visualization and Interaction Studio
Linköping University, Sweden

Abstract

The voluminous nature of social-science, spatial-temporal statistical databases calls for high interactive performance and creatively integrated information and geovisualization tools. A solution to this challenge can be found in the emerging field of Visual Analytics (VA), the science of analytical reasoning facilitated by interactive visual interfaces and innovative visualization, which is now actively pursued by research groups worldwide. In this paper, we present a tool called “GeoAnalytics”, based on the principles behind VA. Our objective is to define suitable new approaches and tools for exploring time-variant and multivariate attributes simultaneously, including a spatial dimension. We introduce parallel coordinates integrated with a time series graph and a trend graph, which together serve as the visual control panel for the application. Multivariate attribute dynamic queries can simultaneously express queries involving time-varying spatial data. VA calls for a bridge between the strengths of human perception and those of computer science technologies. A sense of immediacy and speed-of-thought interaction is achieved through dynamically linked components and maximum allocation of screen area to visual displays, which helps users stay focused on their work and shortens their time to enlightenment.

1. Introduction

The fast-growing quantity of official social-science, spatial-temporal statistical data accessible on the Internet calls for high interactive performance and creatively integrated information and geovisualization tools. While researchers have made substantial advances in information visualization and geovisualization over the past decade, many challenges remain, particularly for working simultaneously with temporal and multivariate attributes that also have a spatial dimension. A solution to this challenge can be found in the emerging field of Visual Analytics (VA), the science of analytical reasoning facilitated by efficient interactive performance and improved fundamental methods for data management and visual exploratory analysis [10].

Proceedings of the Information Visualization (IV’06) 0-7695-2602-0/06 $20.00 © 2006

IEEE

Figure 1. GeoAnalytics’ visual interface provides maximum allocation of screen area for graphics displays.

The term “official data” denotes data collected in censuses and statistical surveys by National Statistics Institutes such as Statistics Sweden (SCB) [14] or EUROSTAT. This statistical data is used to produce “official statistics” for the purpose of making policy decisions, and to facilitate the appreciation of economic, social, demographic and other matters of interest to governments, government departments, local authorities, businesses and the general public. For instance, population and economic census information is of great value in planning public services (education, fund allocation, public transport) as well as in private business (placing new factories, shopping malls or banks, and marketing particular products). Moreover, survey data on specific topics, such as labour force, time use and household budgets, are regularly collected to keep information on economic and social phenomena up to date. The techniques for attaching socio-economic data to specific locations have improved markedly in recent years. For our case study, we selected Sweden’s statistical databases, where official data for the 290 municipalities can be accessed free of charge over the Internet.

Tailor-made and task-oriented applications based on layered component thinking are a foundation for our research. This component-based approach to VA tools, which distributes functionality among a set of independent modules, supports more flexible and scalable solutions. Based on these principles, we have developed an exploratory data visualization tool called “GeoAnalytics”, designed for a target statistical database that forms a high-dimensional data space with a limited number of observations. A sense of immediacy and speed-of-thought interaction is achieved through dynamically linked components and maximum allocation of screen area to visual displays, which helps users stay focused on their work and shortens their time to enlightenment. GeoAnalytics includes six dynamically linked views: parallel coordinates (PC), a time series graph (TG), a time trend graph (TTG), two choropleth maps and one overview and context map (see figure 1). The integrated PC, TG and TTG serve as the visual control panel for selecting the attribute data and time period to be explored, and make it easier to identify multivariate and temporal relationships across spatial domains in the choropleth maps. The PC, TG and TTG use embedded dynamic range sliders both for direct manipulation queries and for conditioning, which constrains the attribute data displayed for the selected time periods to those observations that meet the specified parameters on all attributes and time. Our challenge is to provide analysts with both explorative and communicative VA tools that can analyse multivariate statistical tables for a given time period, that is, exploit the richness of attributes and time, and help discover knowledge.

Our development platform, the “GeoAnalytics framework”, uses Microsoft’s .NET and the DirectX graphics library. The approach provides an industry-standard component-based architecture and easy-to-use tools for implementing a dynamic multiple-view GUI. Visual Studio’s hierarchical layout management provides a non-programming solution for employing not only multiple views but also dynamically resizable views in a single coherent GUI window. The familiar Windows-based GUI most likely represents an appropriate interaction method for the novice end user. We employ our own component-based class libraries and data model. These components are developed in C# on top of DirectX and fulfil most of our defined VA requirements. Our target user group is not restricted to experts; we want a broader group of analysts to feel comfortable with our VA tools.

In this paper, we begin with a brief section on related work that has influenced our research, followed by a conceptual and technical description of the overall system. Next we discuss the visualization and interaction techniques implemented. We finish with our conclusions and ideas for future work.


2. Related work

Existing tools for analyzing observations over time and geography are based on traditional techniques and approaches from disciplines such as GIS and geographical, scientific and information visualization. Visualization of spatio-temporal data has been the subject of several recent research papers [6]. The results include both conceptual models and extensive specialized applications. GeoVISTA Studio [13] is an open source, Java-based visual programming environment that is commonly used for developing geovisualization applications. Another general system is CommonGIS [11], which supports exploratory data analysis and decision-making. Andrienko and Andrienko have described interesting approaches in several papers [3], [4], including the impact of data and task characteristics. Most papers use the well-known dataset based on crime statistics in the USA, which is limited to 51 states. For studying local behaviour, Andrienko and Andrienko propose a time graph symbol superimposed on the map at the location of the respective state. Although this method works well for the 51 well-distributed states of the USA, it is not suitable for denser regions such as the 290 Swedish municipalities. Tominski describes 3D pencil icons [7] for mapping multiple spatio-temporal data attributes. These 3D icons are placed at the centroid of each region. Most of these systems lack support for simultaneously analysing multiple attributes and spatio-temporal behaviour. In our research we propose dynamic and interactive VA methods that also support multivariate thematic attribute data with long time series.

3. Visual Analytics

Figure 2. Four research areas constitute the foundation for Visual Analytics (VA)

Visual Analytics (VA) is an emerging, interdisciplinary frontier defined in the book “Illuminating the Path” [10] as the science of analytical reasoning facilitated by interactive visual interfaces, and is now actively pursued by research groups worldwide. VA takes advantage of human perception capabilities and can be described as “find patterns in known and unknown large dataset via visual interaction and thinking”. Several new trends are emerging from VA. One of the most important is the fusion of visualization techniques with other areas such as cognitive and perceptual sciences, statistical analysis, mathematics, knowledge representation, data mining and GIS to promote broad-based advances. Another trend, which visualization researchers have often not met well to date, is the realization that algorithmic and other technical development should be closely coupled with usability studies, to ensure that techniques and systems are well designed and that their value is verified and quantified. VA arises from a combination of the four research areas shown in figure 2.

4. System implementation

We have set the following generic requirements for the tool design of the GeoAnalytics framework:
• Spatio-temporal and multiple thematic attribute data exploration;
• Interactive performance that gives the user a sense of immediacy and speed-of-thought;
• Tailor-made application design;
• Shortened development time by utilising already developed and assessed components;
• Dynamic and resizable time-linked views;
• Maximized screen area for visualization;
• Linked views animated simultaneously through time;
• Design based on cognitive and perceptual principles.

Figure 3. Layered component architecture comprising basic and functional components developed for DirectX graphics.

Data model components used when filtering data:

4.1 Layered component architecture

A keystone of our project is component thinking. Instead of building large general-purpose applications, GeoAnalytics is based on individual encapsulated objects and customization, using Microsoft’s .NET Framework, C# and DirectX as the development platform. Generic low-level basic and functional VA components, each performing a specific task in the overall VA process, are put together into high-level application components such as GeoAnalytics (figure 3). This layered component architecture enables broad applicability, customization, scalability and reusability of components, and shortens the development time. Interoperability is invaluable to our development of VA tools. Different developers, working almost entirely independently, can contribute VA components to a common, quality-assured component repository. Examples of available components (figure 3) are parallel coordinates, time and trend series graphs, choropleth map, tree map, colour legend and several data model and filter components. Components obtained from this collection can be combined into larger assemblies using a variety of interconnection mechanisms and are readily transferable to most knowledge-intensive and decision-support applications. Practical testing in close collaboration with end users establishes the validity, usability and attractiveness of each individual component.


4.2 Data model

Our data model can be seen as a cube filled with discrete values. The cube has three axes: time, variable and area. In GeoAnalytics, an area is a Swedish municipality and a variable is the measured data type, e.g. crime rate. The time is the data acquisition time. The general method for finding a value in the cube is by its position (variable, time, area). For example, to obtain the migration rate for Stockholm in 2004, find the discrete value at position (2, 3, 4) in the cube, where the values 2, 3 and 4 are the positions of migration rate, 2004 and Stockholm on the variable, time and area axes respectively.

If the application does not scale well with the number of variables (statistical attributes), observations (municipalities) and time steps, its execution time can degrade. The efficiency problem becomes crucial when slow response to interaction annoys the user. We chose to employ a 3D data model optimized for efficiency and scalability in handling large spatio-temporal, multivariate attribute data sets. Filtered values are cached during data initialization, which significantly reduces the loading time during user interaction.

Figure 4. GeoAnalytics’ 3D data model
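The cube lookup and the caching of derived slices described above can be sketched as follows. This is an illustrative Python sketch, not the authors' C# implementation; the class name DataCube, its methods and the flat-list storage layout are assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence


class DataCube:
    """Hypothetical dense cube of values indexed by (variable, time, area)."""

    def __init__(self, variables: Sequence[str], times: Sequence[int],
                 areas: Sequence[str]) -> None:
        self.variables = list(variables)
        self.times = list(times)
        self.areas = list(areas)
        size = len(self.variables) * len(self.times) * len(self.areas)
        # One flat list; a cell is found by arithmetic on its three indices.
        self._values: List[Optional[float]] = [None] * size
        # Cache of 2D slices (assumed caching strategy, keyed by time index).
        self._slice_cache: Dict[int, List[List[Optional[float]]]] = {}

    def _offset(self, v: int, t: int, a: int) -> int:
        return (v * len(self.times) + t) * len(self.areas) + a

    def set(self, v: int, t: int, a: int, value: float) -> None:
        self._values[self._offset(v, t, a)] = value
        self._slice_cache.clear()  # cached slices are now stale

    def get(self, v: int, t: int, a: int) -> Optional[float]:
        """Value at position (variable, time, area), given as axis indices."""
        return self._values[self._offset(v, t, a)]

    def lookup(self, variable: str, time: int, area: str) -> Optional[float]:
        """Same lookup, but by axis labels instead of indices."""
        return self.get(self.variables.index(variable),
                        self.times.index(time),
                        self.areas.index(area))

    def time_slice(self, t: int) -> List[List[Optional[float]]]:
        """2D table (rows = areas, columns = variables) for one time step."""
        if t not in self._slice_cache:
            self._slice_cache[t] = [
                [self.get(v, t, a) for v in range(len(self.variables))]
                for a in range(len(self.areas))
            ]
        return self._slice_cache[t]
```

With "migration rate" at index 2 on the variable axis, 2004 at index 3 on the time axis and Stockholm at index 4 on the area axis, `cube.get(2, 3, 4)` and `cube.lookup("migration rate", 2004, "Stockholm")` address the same cell.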

A three-layered pipeline model is chosen:
• The bottom layer of the pipeline consists of the data providers. The current version has one data provider, which reads data from an Excel file. The framework, however, is built to allow extra data providers to be added. That makes it easy for a developer using the framework to implement and integrate their own component for reading data from, e.g., a web service or a database.
• The middle layer of the framework filters the data provided by the bottom layer. One of the filtering steps converts the 3D data structure provided by the Excel file into a 2D data structure for visualization in, e.g., a PC component.
• The top layer of the framework is the data consumer layer, which includes the GeoAnalytics functional components.

Interoperability with SCB’s databases is achieved through a request for statistical data in an SCB dynamic HTML form connected to the SCB server. The selected data is returned as an Excel file that is imported into GeoAnalytics.
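The three layers above can be sketched as a provider, a filter and a set of consumers. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the in-memory provider stands in for the Excel reader, and the sample values are made up.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# One observation: (variable, time, area, value).
Record = Tuple[str, int, str, float]
Table = Dict[str, Dict[str, float]]  # area -> {variable: value}


def in_memory_provider() -> List[Record]:
    """Bottom layer: hypothetical stand-in for the Excel data provider."""
    return [
        ("unemployment", 2003, "Area A", 5.0),
        ("unemployment", 2004, "Area A", 4.5),
        ("crime rate", 2003, "Area A", 12.0),
        ("unemployment", 2003, "Area B", 7.0),
    ]


def slice_filter(records: Iterable[Record], time: int) -> Table:
    """Middle layer: reduce the 3D records to a 2D table for one time step."""
    table: Table = {}
    for variable, t, area, value in records:
        if t == time:
            table.setdefault(area, {})[variable] = value
    return table


def run_pipeline(provider: Callable[[], Iterable[Record]], time: int,
                 consumers: Iterable[Callable[[Table], None]]) -> Table:
    """Top layer: hand the filtered table to every consumer (e.g. a view)."""
    table = slice_filter(provider(), time)
    for consume in consumers:
        consume(table)
    return table
```

Because the provider is just a callable returning records, a reader for a web service or a database could be swapped in without touching the filter or consumer layers, which is the extensibility the text describes.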

4.3 Linked and resizable views

VA calls for a bridge between the strengths of human perception and those of computer science technologies, enabling the user to take a more active role in the process of exploring data. A sense of analytical reasoning and speed-of-thought interaction is achieved in GeoAnalytics through its ability to time-link views. Parallel coordinates, time and trend graphs and choropleth maps are time-linked so that all of the views are synchronized to the same point in time. Animating time-linked views simultaneously through time is an important feature in many VA tasks and enables users to dynamically compare spatio-temporal data [9].

Tools for arranging views and controlling the size of individual views are not simple programming tasks. We therefore propose using the hierarchical layout management of Microsoft’s Visual Studio .NET, with dynamic embedded resizable views in a single coherent GUI window that maximizes the use of the screen area. View controls are arranged in a hierarchy where each child view is only aware of the size of its parent view. An “anchor” property specifies how a control behaves when the user resizes the window: the control can either resize itself, anchoring itself in proportion to its own edges, or stay the same size, anchoring its position relative to the window’s edges. The related “dock” property specifies that a control should dock to an edge of its container. These alternative views on the data can help stimulate the visual thinking process that is characteristic of visual analytics.

Coordination is implemented using a data linking method in which the components share the same data model, so that any dynamic filtering applied to the 3D data model propagates to all linked visualization components.
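The data linking method described above, where views share one data model and a filter applied to the model propagates to every linked view, can be sketched with a simple observer arrangement. This is an illustrative Python sketch only; SharedModel, RecordingView and their methods are hypothetical names, and the actual GeoAnalytics components are written in C#.

```python
from typing import Callable, Dict, List, Set


class RecordingView:
    """Hypothetical view that just remembers which areas it should show."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.shown: Set[str] = set()

    def update(self, visible: Set[str]) -> None:
        self.shown = set(visible)


class SharedModel:
    """Single data model shared by all linked views."""

    def __init__(self, rows: Dict[str, Dict[str, float]]) -> None:
        self._rows = rows                     # area -> {attribute: value}
        self._views: List[RecordingView] = []
        self._visible: Set[str] = set(rows)   # initially nothing is filtered

    def link(self, view: RecordingView) -> None:
        """Register a view and bring it up to date immediately."""
        self._views.append(view)
        view.update(self._visible)

    def apply_filter(self, predicate: Callable[[Dict[str, float]], bool]) -> None:
        """Filter once in the model, then notify every linked view."""
        self._visible = {area for area, attrs in self._rows.items()
                         if predicate(attrs)}
        for view in self._views:
            view.update(self._visible)
```

The design choice illustrated here is that views never talk to each other; they only observe the model, so adding a further view requires no change to the existing ones.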

5. Visualization and interaction techniques

The GeoAnalytics visual interface (figure 9) is divided into six linked views separated by interactive splitters, allowing users to adapt the layout to their preferences. Maximum screen area is reserved for the visualization and for direct-manipulation visual interaction, while most of the traditional GUI controls are hidden and can be pulled out when needed from context-sensitive pull-down menus. GeoAnalytics employs an event-based approach [7] for user-defined conditions and constraints. This means that detected important events are highlighted while irrelevant information is concealed. A parallel coordinates plot (PC) is integrated with a time series graph (TG) and a time trend graph (TTG), and together they serve GeoAnalytics as its visual control panel for selecting the attribute data and the single time step or period to be explored. These three graphs integrate range sliders for defining spatio-temporal pattern events such as thresholds and conditioning, and support interactive use with minimal cognitive overhead and virtually instantaneous response time.

User-controlled events of interest can be expressed simultaneously in all three graphs, involving multivariate attribute values for time-varying data. Events can be defined for single time steps or over the complete temporal domain. This ability to explore the data easily is helpful in identifying specific patterns of interest, as well as in gaining an understanding of the data set as a whole.
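The range-slider conditioning described above amounts to keeping only the observations whose values fall inside every active range, either at one time step or across the whole period. The following Python sketch illustrates that logic under assumptions: the function names and the table layout (area mapped to attribute values) are hypothetical, and every area is assumed to have a value for each constrained attribute.

```python
from typing import Dict, List, Tuple

Table = Dict[str, Dict[str, float]]          # area -> {attribute: value}
Ranges = Dict[str, Tuple[float, float]]      # attribute -> (low, high)


def within_ranges(attrs: Dict[str, float], ranges: Ranges) -> bool:
    """True if the observation satisfies every slider range (inclusive)."""
    return all(low <= attrs[name] <= high
               for name, (low, high) in ranges.items())


def dynamic_query(table: Table, ranges: Ranges) -> List[str]:
    """Areas meeting all constraints at a single time step."""
    return [area for area, attrs in table.items()
            if within_ranges(attrs, ranges)]


def query_over_period(tables_by_time: Dict[int, Table],
                      ranges: Ranges) -> List[str]:
    """Areas meeting all constraints at every time step in the period."""
    times = sorted(tables_by_time)
    if not times:
        return []
    selected = set(dynamic_query(tables_by_time[times[0]], ranges))
    for t in times[1:]:
        selected &= set(dynamic_query(tables_by_time[t], ranges))
    return sorted(selected)
```

Requiring the condition at every time step is one reading of an event defined "over the complete temporal domain"; an alternative, equally simple variant would take the union and report areas that satisfy the ranges at any time step.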

5.1 Parallel coordinates (PC)

Parallel coordinates (PC) have been used in many multiple-view geovisualization environments [1], [13]. Observations (municipalities in GeoAnalytics) are represented as a series of polylines passing through parallel axes, each axis representing a single dependent attribute (for one time step) in the statistical database. A PC and its connection to the underlying Excel spreadsheet are illustrated in figures 5 and 6. The value of an attribute for a specific municipality is defined by the polyline’s intersection with the vertical axis (an axis can be added, deleted or moved interactively). A single polyline forms a visual representation of the characteristics of one municipality. Differences between selected municipalities can be found by visually comparing the polylines representing them. The number of dependent attributes that can be visualized is restricted only by the horizontal resolution of the dynamically resizable view. Visual interaction features are accessible via embedded controls: range sliders for defining events such as exceeding a given threshold, interactive axis labels for controlling visualizations, and dynamically movable axes.

Figure 5. A column (attribute data) in the Excel spreadsheet is assigned to a PC axis.

5.2 Time graph (TG)

The behaviour over time of the attribute data selected and constrained in the PC is shown in a time graph (TG) for a given time period. In figure 6, the attribute “unemployed total %” is selected in the PC (left) for the year 1999 with no constraints given. The corresponding attribute values for the time period 1999-2004 are shown in the TG (right). The TG is time-linked to the PC, and updating the time in the TG simultaneously changes the time step for all attributes in the PC. Time is also controlled by the time slider below the PC and TG (figure 6). Moving the time slider (animating) dynamically updates the PC and the linked map and directs users’ attention to relevant events. Interesting municipalities can be selected with the mouse in the map, PC or TG and are highlighted in all displays (see the darker lines in figure 6).


Figure 6. Integrated PC and TG graphs.
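The construction of a PC polyline described in section 5.1, where each municipality's value is placed on its attribute axis and the points are joined, can be sketched as below. This is a hedged Python illustration; the function names are hypothetical, and scaling each axis linearly between the minimum and maximum of its attribute is an assumption, since the paper does not state how axes are scaled.

```python
from typing import Dict, List, Sequence, Tuple

Table = Dict[str, Dict[str, float]]  # municipality -> {attribute: value}


def axis_extents(table: Table,
                 attributes: Sequence[str]) -> Dict[str, Tuple[float, float]]:
    """Minimum and maximum of each attribute, used to scale its axis."""
    extents = {}
    for name in attributes:
        values = [attrs[name] for attrs in table.values()]
        extents[name] = (min(values), max(values))
    return extents


def polyline(attrs: Dict[str, float], attributes: Sequence[str],
             extents: Dict[str, Tuple[float, float]],
             width: float = 1.0, height: float = 1.0) -> List[Tuple[float, float]]:
    """Points where one municipality's polyline crosses each vertical axis."""
    n = len(attributes)
    points = []
    for i, name in enumerate(attributes):
        low, high = extents[name]
        # A constant attribute has no spread; place it mid-axis (assumption).
        frac = 0.5 if high == low else (attrs[name] - low) / (high - low)
        x = width * i / (n - 1) if n > 1 else width / 2
        points.append((x, frac * height))
    return points
```

Since the axes are spread evenly over the view width, the number of attributes that can be shown is bounded by the horizontal resolution, as noted in section 5.1.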

5.3 Time trend graph (TTG)

Dynamically animating changes over time is an important feature, but there are more profound challenges. Even if a trend or a sharp change is recorded in a data set, traditional visualizations of the data set’s structural and dynamic properties might not feature such trends and changes prominently enough to draw users’ attention. It is therefore necessary to provide a visualization component with built-in trend detection mechanisms connected to the PC, TG and data modelling components. The time trend graph (TTG) represents Value(t) - Value(t - 1), where t is the time period of the observation. Municipalities with similar changes over the defined time period are shown. The TG in figure 7 shows two municipalities with an almost identical unemployment trend for the time period 1999-2004. Note that these two municipalities show a major divergence. Such trends are discovered in the TTG through dynamic visual manipulation, for example by dynamically filtering each time trend axis to test how sensitive the similarity between municipalities is across the time periods.
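The trend quantity Value(t) - Value(t - 1) and the search for municipalities with similar trends can be sketched as follows. This Python sketch is illustrative only; the function names and the per-step tolerance test are assumptions standing in for the interactive filtering of each trend axis.

```python
from typing import Dict, List, Sequence


def trends(series: Sequence[float]) -> List[float]:
    """Per-step change Value(t) - Value(t - 1) for a time series."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]


def similar_trend(series_by_area: Dict[str, Sequence[float]],
                  reference: str, tolerance: float) -> List[str]:
    """Areas whose change at every step is within `tolerance` of the reference."""
    ref = trends(series_by_area[reference])
    matches = []
    for area, series in series_by_area.items():
        if area == reference:
            continue
        diffs = trends(series)
        if len(diffs) == len(ref) and all(
                abs(d - r) <= tolerance for d, r in zip(diffs, ref)):
            matches.append(area)
    return matches
```

Because only differences are compared, two municipalities with very different absolute levels can still be reported as having the same trend.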

5.4 Choropleth maps

GeoAnalytics provides two choropleth map views (figures 1 and 9), which are time-linked to the PC and, for the right map only, to the TTG. The choropleth map is coloured according to the selected attribute in the PC or the trend values in the TTG, taking constraints on any attributes into consideration. Any change in the classification of the choropleth map also changes the colour of the corresponding municipalities in the PC and TG. The choropleth maps provide multiple functionalities, including:
• Maps are coloured based on the selected attribute in the PC;
• Animated time series based on the selected attribute in the PC and the dynamic time slider below the PC, with all views updated simultaneously;
• Map 1 shows attribute data for time step T and Map 2 shows the trend in % from the TTG;
• Standard transformation tools for zoom and pan.
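Colouring a choropleth map requires assigning each municipality's value to a colour class. The paper does not say which classification method GeoAnalytics uses, so the Python sketch below assumes equal-interval classes purely for illustration; the function name and the `visible` parameter, which models the constraints set elsewhere, are hypothetical.

```python
from typing import Dict, Optional, Set


def classify(values_by_area: Dict[str, float], n_classes: int,
             visible: Optional[Set[str]] = None) -> Dict[str, Optional[int]]:
    """Equal-interval class index (0 .. n_classes - 1) per area.

    Areas excluded by constraints (not in `visible`) get None, so a map
    view could draw them in a neutral colour.
    """
    low = min(values_by_area.values())
    high = max(values_by_area.values())
    span = high - low
    classes: Dict[str, Optional[int]] = {}
    for area, value in values_by_area.items():
        if visible is not None and area not in visible:
            classes[area] = None
        elif span == 0:
            classes[area] = 0  # all values equal: a single class
        else:
            index = int((value - low) / span * n_classes)
            classes[area] = min(index, n_classes - 1)  # maximum falls in top class
    return classes
```

The same class index could then drive the colour of the corresponding polylines in the PC and TG, which is how a change in classification would show up in all linked views.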

5.5 Defined regions

Figure 8. Create regions with similar behaviours

A user-defined region that covers a selection of municipalities of special interest is created either by interactive selection (pick, rectangle) or via dynamic data filtering. These selected or constrained regions can then be used to compare discrepancies between regions.

Figure 7. PC (lower) with multivariate attribute data, TG (middle) with time series data for the attribute selected in the PC, and TTG (upper) with calculated trends. The figure shows youth unemployment for the time period 1999-2004. Two municipalities with similar trends are discovered in the TTG and are highlighted in all three graphs (and in the maps).

Time periods are controlled by a time slider (figures 1 and 9), a linear time scale that is visible underneath the PC representation. Continuous animation of events over time and geography is provided simultaneously for the relevant views as the time slider is moved forwards and backwards in time.

6. Conclusions

This paper presents interactive visual analytics tools for simultaneously analyzing spatio-temporal behaviour for multiple social-science attributes with long time series. We demonstrate a visual control panel comprising integrated parallel coordinates, time graphs and trend graphs that can detect spatio-temporal clusters during a given time period. Using direct manipulation queries in the trend graph with virtually instantaneous response time, analysts can dynamically detect and locate specific features of interest, such as a continuous decrease or increase over a time period. GeoAnalytics provides a highly general view of all temporal behaviours through map animation and trend graphs. We propose that a dynamically time-linked view model can be developed efficiently and quickly using Microsoft’s Visual Studio .NET tools. The familiar Windows GUI of .NET-based applications most likely represents an appropriate interaction method for Windows users. Further usability testing is needed to discover how users, both expert and novice, interact with the proposed linked-views metaphor.

Most important is our aim to promote the use of a layered component-based approach to the development and engineering of applications. Customizable and scalable high-level application and functional components are designed and developed from low-level basic components. We develop visualization components based on the DirectX library. We believe that a layered component approach can potentially provide better scalability and more customizable exploratory tools.

GeoAnalytics has adopted the requirements of visual analytics through the following visual representation and interaction schemes:
• Space and time awareness;
• Integrated views of large-scale information spaces, supporting coordinated viewing of information in context and provision of both overview and detailed information;
• Schemes for the management and exploitation of multiple views that encourage complementary views of the same data;
• Visual representations and interactive schemes from a cognitive perspective;
• Methods for navigation in high-dimensional, multivariate and temporal data;
• Interactive performance that can support analytic reasoning.

Our next step includes a comprehensive user task analysis based on in-depth interviews with potential users, a set of focus group assessments of our proposed visual user interface and the interactive performance achieved, and subsequent controlled experiments to test selected aspects of the methods used, such as the multiple resizable views, the number of attributes and the dynamic range sliders that constrain the animated time series.

7. References

[1] D. Brodbeck and L. Girardin. Design study: Using multiple coordinated views to analyze geo-referenced high-dimensional datasets. In Proceedings of IEEE CMV, 2003.
[2] N. Andrienko and G. Andrienko. Informed Spatial Decisions through Coordinated Views. Information Visualization, 2(4), 2003, pp. 270-285.
[3] N. Andrienko and G. Andrienko. Interactive visual tools to explore spatio-temporal variation. In M.F. Costabile (Ed.), Proceedings of the Working Conference on Advanced Visual Interfaces AVI 2004, Gallipoli, Italy, May 25-28, 2004, ACM Press, pp. 417-420.
[4] G. Andrienko and N. Andrienko. Visual Exploration of Spatial Distribution of Temporal Behaviors. In Proceedings of IEEE IV2005.
[5] D. Carr, J. Chen, S. Bell, L. Pickle, and Y. Zhang. Interactive linked micromap plots and dynamically conditioned choropleth maps. In dg.o2002 Proceedings. Digital Government Research Center (DGRC), 2002.
[6] W. Muller and H. Schumann. Visualization methods for time-dependent data. In Proceedings of the 2003 Winter Simulation Conference, 2003, pp. 737-746.
[7] C. Tominski, P. Schulze-Wollgast, and H. Schumann. 3D information visualization for time dependent data on maps. In Proceedings of IEEE IV2005.
[8] H. Hochheiser and B. Shneiderman. Dynamic query tools for time series data sets: timebox widgets for interactive exploration. Information Visualization, 3(1), Spring 2004, pp. 1-18.
[9] J. C. Roberts. Exploratory Visualization with Multiple Linked Views. In Exploring Geovisualization, J. Dykes, A.M. MacEachren, M.-J. Kraak (Eds.), 2004.
[10] J. Thomas and K. Cook. Illuminating the Path: The Research and Development Agenda for Visual Analytics, 2005. http://nvac.pnl.gov/
[11] CommonGIS. http://www.commongis.de
[12] G. Dang, C. North, and B. Shneiderman. Dynamic queries and brushing on choropleth map. Technical report, Human-Computer Interaction Lab & Department of Computer Science, 2003.
[13] GeoVISTA Studio. http://www.geovistastudio.psu.edu
[14] Statistics Sweden. http://www.scb.se

Figure 9. GeoAnalytics’ visual interface provides maximum allocation of screen area for graphics displays. The PC graph represents 5 attributes for 2003, including unemployment (time step selected in the time slider). The TG graph shows unemployment for the period 1999-2004. Map 1 and Map 2 show unemployment for the time steps 1999 and 2003. Region “A”, South Sweden, is selected. The 2D scatter chart represents the correlation between unemployment and youth unemployment for 2003.

