The Space-Time Intelligence System (STIS) is a recently developed product from TerraSeer, Inc. of Ann Arbor, Michigan. It provides insight into the temporal aspects of spatial data through a distinctive spatial database in which temporal measures are embedded. The STIS uses these data to create linked maps and graphs that allow the user to explore change over time. The product runs on the Windows operating system (no other system requirements are noted by the firm) and comes with three months of free e-mail and fax support. Updates are free as long as the company offers the product. We evaluated version 1.03 of the STIS on a Dell Inspiron 8100 laptop with 512 MB of RAM running Windows 2000. The current version is 1.06.

For this review of the software, we first explore some issues related to analyses that involve both space and time. Then, we summarize the use of the software with examples from the cancer data supplied with the product by TerraSeer. Finally, we discuss the issues that arose when putting our own data, showing West Nile virus (WNV) cases in Illinois in 2002, into the STIS.

Temporal Data and GIS

Many diseases are linked to temperature and precipitation and exhibit seasonal variations. Good examples are vector-borne diseases, such as Lyme disease and WNV. Epidemics are identified when a relatively high number of cases occur during a specific time period. Toxic materials released into the environment will be at their highest levels near the place and time of release, and adverse health outcomes may be correlated both spatially and temporally.

Though the benefit of including temporal data in a spatial analysis may be apparent, the difficulties in analyzing these dimensions can be daunting. One issue is how to express the time period so that it can be used in both a GIS and other more specialized software.
Passing date fields across software packages is generally troublesome: even when a date format is available in a GIS, it may not be correctly interpreted by another program. For instance, the STIS software uses the ISO (International Organization for Standardization) date format of YYYYMMDD for the time stamp. This time expression can be formatted and recognized in most databases. Our data, however, are in ESRI's ArcGIS, and the "date" data type does not include an ISO format option, making it necessary to enter each ISO format date as an integer. It is also possible that the order of the data is more important than the actual date. In such situations, a numeric series may denote the time sequence of the data, and it can be difficult to standardize the "meaning" of the numeric sequence between programs.

Even more complex than time formats, however, is how data are modeled. A set of disease cases measured at points will often include an associated date, such as the day of onset. But if one wishes to aggregate disease cases at the county level, the date concept must be adapted, because the cases in a county will not all have occurred on the same day. Furthermore, epidemiologists typically calculate the prevalence (or the background rate of a disease) by dividing the total number of cases by the population for the geographic unit(s) of interest. The incidence of the disease during a single time period, however, may be of more interest and more revealing to the researcher. That is, it can be more insightful to investigate disease incidence during a particular year, season, or week by aggregating all occurrences for that period. Yet, to determine how a disease varies across time, each time period and geographic unit would need to be measured separately, increasing the complexity of the analysis. This complexity is evident in the STIS itself, which identifies data as part of either a time-slice or a time-series.
Time-slice data vary by some regular time unit across all geographic areas (an example would be the rate of Lyme disease cases by county in the United States for each month). In STIS, each of those slices of time becomes a separate file; thus, five years of data by month requires 60 files.
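The date handling and slice counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the STIS or its documentation: the function names (`to_iso_int`, `from_iso_int`, `monthly_slices`) and the 1998 start year are hypothetical, and the sketch simply shows how a date can be stored as a YYYYMMDD integer and how five years of monthly slices add up to 60 periods.

```python
from datetime import date, timedelta


def to_iso_int(d: date) -> int:
    """Encode a date as a YYYYMMDD integer (ISO 8601 basic format)."""
    return d.year * 10000 + d.month * 100 + d.day


def from_iso_int(value: int) -> date:
    """Decode a YYYYMMDD integer; date() raises ValueError if it is invalid."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


def monthly_slices(start_year: int, years: int) -> list:
    """Return (first_day, last_day) pairs as YYYYMMDD integers, one per month."""
    slices = []
    for index in range(years * 12):
        year = start_year + index // 12
        month = index % 12 + 1
        first = date(year, month, 1)
        # The last day of a month is the day before the first of the next month.
        if month == 12:
            next_first = date(year + 1, 1, 1)
        else:
            next_first = date(year, month + 1, 1)
        last = next_first - timedelta(days=1)
        slices.append((to_iso_int(first), to_iso_int(last)))
    return slices


# Hypothetical example: five years of monthly slices starting in 1998.
slices = monthly_slices(1998, 5)
print(len(slices))   # 60
print(slices[0])     # (19980101, 19980131)
```

Storing the date as an integer keeps the chronological order intact under ordinary numeric sorting, which is one reason the YYYYMMDD form travels well between a GIS attribute table and other software.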

Evaluating Cancer Data

The STIS operates on the concept of projects, which define a set of views and data. Opening a project recalls previously saved work. The STIS includes a project titled "SE and Gulf Coast" as an example, along with tutorial instructions. When opened, the project pulls up a map of counties in the southeast United States and several graphs showing colon cancer mortality rates (see Figure 2). The cancer data are broken into five five-year time slices starting in 1970.

Two distinctive features of these views are immediately apparent. First, a date slider appears at the top of each view. Pulling the slider to the right changes the time period shown in the graph or map view. The time step is adjustable, so users can see data changes at daily, weekly, monthly, or yearly increments. The second distinctive feature is the ability to "brush" across views so that data selected in one view are highlighted in all other views.

Further exploration of the software reveals some other useful capabilities. For instance, each map or histogram view with a date slider can be animated, and users can export the animations as AVI files for playback via the Web or in any software that supports video files, including Microsoft PowerPoint. Anyone who has prepared animated graphics can appreciate this option, as video creation can be quite time-consuming. Another STIS tool allows users to link the date increment across map and histogram views so that when the date of one view changes, the linked views adjust automatically. Users can also leave a view unlinked, keeping it static while the others change.

With the combined features of brushing, animation, and linking, it is possible to explore multiple dimensions of a database. In particular, the relatively higher and lower values of a disease occurrence can be seen in a temporal and geographical context. The STIS also provides functions for quantitative data measures.
Specifically, users can execute a local Moran's I statistical analysis to measure spatial autocorrelation (Anselin 1995). Moran's I measures spatial clustering by determining how similar the values at one place are to those at nearby locations. A higher local Moran's I value for a location indicates that it is more similar to its neighbors. Rather than providing just a single measure for all times, the STIS Moran's I option can measure the autocorrelation for specific time periods, showing how much the location of clusters varies with time. In addition, the STIS lets users create a difference map, subtracting one map from another and highlighting those places with large changes across time periods. Employing this feature in conjunction with a Moran's I analysis can further illuminate one's data analysis.

A West Nile Virus Example

Although at first glance our data appeared compatible with the STIS, an important issue became apparent soon after the effort began: our data on 2002 human cases of WNV were not in a form that could be imported immediately into the system. The data were in an STIS-supported point shapefile format, and each point had a date of onset in ISO standard format, meeting the STIS requirement. However, the point data did not have a quantitative value analogous to the example cancer-rate data, the points did not have values for all time periods, and they did not each start and end at defined points in time.

We considered several options to overcome this difficulty. For instance, we contemplated calculating incidence rates by census tract or other geographic unit for a set of time steps, such as by week. One problem with this approach, however, was that the small number of WNV cases in some tracts, and the different numbers of people in different tracts, resulted in unstable rates, necessitating a smoothing operation for each of the time steps. This was deemed too complex and lengthy for the current purpose, so we developed an alternative.
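The instability of small-area rates mentioned above can be seen with a little arithmetic. The following Python sketch uses invented tract names, case counts, and populations purely for illustration (they are not taken from the WNV data), and the function name `rate_per_100k` is hypothetical. It computes a crude weekly incidence rate per 100,000 residents.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Crude incidence rate per 100,000 people for one area and one time step."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical tracts: (cases in one week, resident population).
tracts = {
    "small tract, 1 case": (1, 1_200),
    "large tract, 1 case": (1, 8_000),
    "large tract, 2 cases": (2, 8_000),
}

for name, (cases, population) in tracts.items():
    print(f"{name}: {rate_per_100k(cases, population):.1f} per 100,000")
```

In this made-up example a single case produces a rate of about 83 per 100,000 in the small tract but only 12.5 in the large one, and one additional case doubles the large tract's rate. Week-to-week swings of that size mostly reflect chance and population size, which is why some form of smoothing would be needed before mapping such rates.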
We chose a study area comprising Cook and DuPage counties, which include Chicago and its surrounding suburbs, and divided the area into a grid of 5-kilometer cells. We then used ArcGIS to calculate the number of WNV cases in each grid cell for each week of a fourteen-week period from July 7 to October 12, 2002. The resulting maps showed that in the early and late parts of the season, when case numbers were low, only a few cells had a case of WNV. At the height of the outbreak in weeks 37 and 38 of 2002 (September 1-14), more than 100 cells had at least one case, and some had as many as four or five cases (see Figure 3).

After creating the individual shapefiles for each week, we prepared to input the data into the STIS, following the program's documentation, which indicated the need to create a DBF file for each time period. The idea was to import one shapefile with all geographic units to establish the geography and then import the rest of the data as DBFs. Each file had to include both an ID field on which to join the data and a quantity field with the same field name for all of the time periods. The STIS interface also required that each file be assigned the time period for which it is valid. The exact steps to perform this successfully were not entirely apparent from the documentation provided with the software, but TerraSeer's online help at www.terraseer.com/products/stis/stis_help.html provided the additional information needed to execute this task.

Once the data were imported, we were excited to see the WNV data animated by weekly time increments (see http://maximus.cvm.uiuc.edu/wnv/maps/STISanimation.mhtml). The animation immediately revealed the ebb and flow of WNV cases across space and time, showing patterns that had not been apparent from our prior work with the data. The program also handled color selection intelligently on the maps. Thus, a value of "1" always had the same color across all the animated maps.
In contrast, when assigning the default colors for a set of maps organized by time, a GIS usually gives the highest color in a color ramp to a "1" value when it is the only value in a map, but that same color can be assigned to a "5" value in a later map.

The STIS software includes many other features that we were hoping to take advantage of, but obstacles remained to making full use of the STIS for the WNV data. For example, we faced a problem because our data included a large number of "0" values, where cells had no cases. (The option for eliminating the zeroes from graphs resulted in an error that was fixed in a subsequent patch.) Another set of options required a second meaningful variable, and we did not have a second variable for the 5-kilometer cells (see the recent paper by AvRuskin et al. 2004, whose authors are involved with STIS development and used the program to examine the relationship between bladder cancer and arsenic exposure).

Potential and Promise

Nonetheless, the promise of the software is not yet fully realized. Several issues still need to be addressed to make it useful to more users. First and foremost, the STIS needs more automation tools for creating and importing diverse databases. The program will be much more useful with a better means of taking data from a GIS or database in the format and form that users are likely to already have. Specifically, point location data for an event at a given time are common not only in disease studies, but also in crime and wildlife analysis. Thus, enhancements to how the STIS handles point location data should be given special consideration in future versions of the product.

The STIS could also be improved by the inclusion of more space-time statistical methods. Functionality from TerraSeer's ClusterSeer software, for example, could be integrated into the STIS.
Analysis tools from ClusterSeer that could be useful in the STIS include Jacquez's k-nearest neighbor statistical method, in which nearest neighbors in both space and time are matched to see whether they occur more often than expected under a random distribution, and Kulldorff's scan method, which has gained popularity for detecting space-time clusters (Jacquez 1996; Kulldorff and Nagarwalla 1995). In addition, some consideration should be given to incorporating traditional time-series analysis for specific geographic areas into the STIS.

Other suggested improvements to the STIS involve more quickly integrating the online help into the software updates. Although online help is a good supplement to the documentation that comes with the system, it is not always possible to access the Web. The documentation provided with the product should also be updated to include the fundamental information about importing data.

Still, the STIS proved to be robust. Although our expectations may have been too high (the name of the product suggests more functionality than is currently developed), the product leaves open many exciting opportunities for expansion. Currently, the spatial-analysis software market does not have a comparable product, and there is room to grow the STIS into a significant tool for sophisticated space-time analysis.

References

AvRuskin, G.A., G.M. Jacquez, J.R. Meliker, M.J. Slotnick, A.M. Kaufmann, and J.O. Nriagu. "Visualization and Exploration of Epidemiologic Data Using a Novel Space Time Information System," International Journal of Health Geographics 3, no. 26 (2004): www.ij-healthgeographics.com/content/3/1/26.

Devesa, S.S., D.J. Grauman, W.J. Blot, G. Pennello, R.N. Hoover, and J.F. Fraumeni Jr. Atlas of Cancer Mortality in the United States, 1950-94. Washington, D.C.: U.S. Government Printing Office, 1999 [NIH Publication No.: (NIH) 99-4564].

Jacquez, G.M. "A K-Nearest Neighbor test for Space-Time Interaction," Statistics in Medicine 15 (1996): 1935.

Kulldorff, M., and N. Nagarwalla. "Spatial Disease Clusters: Detection and Inference," Statistics in Medicine 14 (1995): 799.

Ruiz, M.O., C. Tedesco, T. McTighe, and U. Kitron. "Environmental and Social Determinants of Human Risk for West Nile Virus in the Chicago Region, 2002," International Journal of Health Geographics 3, no. 8 (2004): www.ij-healthgeographics.com/content/3/1/8.

Grant Ian Thrall, Ph.D., is column editor of Shop Talk and First Impressions. He is a consultant; a professor at the University of Florida, Gainesville; and a member of the Geospatial Solutions Editorial Advisory Board.