SaTScan is a powerful stand-alone free software program that runs spatio-temporal scan statistics. It is carefully optimized and contains many tricks to reduce the computational burden, which is doubly intensive. The scanning itself is computer intensive, particularly in spatio-temporal settings, and the Monte Carlo hypothesis testing involves resampling and redoing the scanning for hundreds or thousands of random data sets.
There are two ways to run SaTScan. The easiest way to choose between the many data, analysis, parameter and output options is to use the graphical user interface (GUI). The GUI allows complete control, but precludes automated or repeated operation of multiple analyses over time. While more cumbersome, the SaTScan parameter file makes it possible to run SaTScan in batch mode, bypassing the GUI. It is not trivial to integrate it with various data sets and other analyses though. The rsatscan package contains a set of functions and defines a class and methods to make it easy to work with SaTScan from R. This allows easy automation and integration with data sets and analyses.
Before running rsatscan, it is recommended to first explore the SaTScan GUI to become familiar with the various data, analysis, and parameter options. It is also often a good idea to create a template parameter file using the GUI, and then list those parameter settings using the ss.options() function in rsatscan.
The rsatscan functions can be grouped into three sets: SaTScan parameter functions that set parameters for SaTScan or write them in a file to the OS; write functions that write R data frames to the OS in SaTScan-readable formats; and the satscan() function, which calls out into the OS, runs SaTScan, and returns a satscan class object. Successful use of the package requires a fairly precise understanding of the SaTScan parameter file, for which users are referred to the SaTScan User Guide.
## rsatscan only does anything useful if you have SaTScan
## See http://www.satscan.org/ for free access
Basic usage of the package will:
1. use the ss.options() function to set SaTScan parameters; these are saved in R
2. use the write.ss.prm() function to write the SaTScan parameter file
3. use the satscan() function to run SaTScan
4. use summary() on the satscan object and proceed to analyze the results from SaTScan in R.
The New York City fever data, which are distributed with SaTScan, are also included with the package.
##     zip cases       date
## 1 11229     1 2001/11/22
## 2 11208     1 2001/11/13
## 3 11208     1 2001/11/24
## 4 11212     1  2001/11/3
## 5 11374     1 2001/11/10
## 6 10452     1 2001/11/20
##     zip      lat      long
## 1 10001 40.75037 -73.99674
## 2 10002 40.72199 -73.99000
## 3 10003 40.73097 -73.98841
## 4 10004 40.68834 -74.02002
## 5 10005 40.70550 -74.00816
## 6 10006 40.70754 -74.01292
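The two listings above look like the first rows of the bundled case and coordinates data frames. A minimal sketch of how to view them, assuming the data frames are named NYCfevercas and NYCfevergeo (by analogy with the NHumberside names used later in this document):

```r
library(rsatscan)

# Assumed dataset names: NYCfevercas (cases) and NYCfevergeo (coordinates)
head(NYCfevercas)
head(NYCfevergeo)

# Column names, as shown in the listings above
names(NYCfevercas)
names(NYCfevergeo)
```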
For good style, an analysis would begin by resetting the parameter file:
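A minimal sketch of that reset. The invisible() wrapper, used elsewhere in this document, keeps the full parameter vector from being printed:

```r
library(rsatscan)

# Reset the parameter set held in R to its defaults
invisible(ss.options(reset = TRUE))

# The parameter set is a character vector, one element per parameter-file line
length(ss.options())
```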
Parameters for a specific SaTScan version (>= 9.2) can be set using the ‘version’ argument:
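For instance, a reset to version 10.1, the value used in the later examples of this document, might look like this:

```r
library(rsatscan)

# Reset to the defaults of a specific SaTScan version
invisible(ss.options(reset = TRUE, version = "10.1"))

# Peek at the first few lines of the resulting parameter set
head(ss.options(), 3)
```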
If a version is not specified, the parameters will be set to the latest version available in ‘rsatscan’. More information on ss.options is available in the documentation page, accessible using ?ss.options.
Then, one would change parameters as desired. This can be done in as many or as few steps as you like; the previous state of the parameter set is retained between calls. Here, the parameters used in the example from the SaTScan manual are replicated:
ss.options(list(CaseFile="NYCfever.cas", PrecisionCaseTimes=3))
ss.options(c("StartDate=2001/11/1","EndDate=2001/11/24"))
ss.options(list(CoordinatesFile="NYCfever.geo", AnalysisType=4, ModelType=2, TimeAggregationUnits=3))
ss.options(list(UseDistanceFromCenterOption="y", MaxSpatialSizeInDistanceFromCenter=3, NonCompactnessPenalty=0))
ss.options(list(MaxTemporalSizeInterpretation=1, MaxTemporalSize=7))
ss.options(list(ProspectiveStartDate="2001/11/24", ReportGiniClusters="n", LogRunToHistoryFile="n"))
ss.options(list(SaveSimLLRsDBase="y"))
Note that the second call to ss.options() uses the character vector format, while the others use the list format; either is acceptable.
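A short sketch of both points: settings accumulate across calls, and the two input formats can be mixed. The grep() check is only an illustration of how to confirm that a setting took effect. It is not part of the package.

```r
library(rsatscan)

invisible(ss.options(reset = TRUE))
ss.options(list(CaseFile = "NYCfever.cas"))   # list format
ss.options(c("StartDate=2001/11/1"))          # character vector format

# Both settings are present: the second call did not undo the first
grep("^(CaseFile|StartDate)=", ss.options(), value = TRUE)
```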
It might be reasonable at this point to check what the parameter file looks like:
##  "[Input]" ";case data filename" "CaseFile=NYCfever.cas"
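Output of this kind can be obtained by calling ss.options() with no arguments, which returns the current parameter set, and keeping only the first few elements. A minimal sketch:

```r
library(rsatscan)

invisible(ss.options(reset = TRUE))
ss.options(list(CaseFile = "NYCfever.cas"))

# First three lines of the parameter set as it would be written to file
head(ss.options(), 3)
```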
Then, we write the parameter file, the case file, and the geometry file to some writeable location in the OS, using the functions in the package. These ensure that SaTScan-readable formats are used.
The write.??? functions append the appropriate file extensions to the files they save into the OS.
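A sketch of the writing step. It assumes that td, the directory variable used in the calls later in this document, is a session temporary directory, and that the bundled data frames are named NYCfevercas and NYCfevergeo. Only two parameters are set here to keep the example short.

```r
library(rsatscan)

td <- tempdir()   # assumption: any writeable directory would do

invisible(ss.options(reset = TRUE))
ss.options(list(CaseFile = "NYCfever.cas", CoordinatesFile = "NYCfever.geo"))

write.ss.prm(td, "NYCfever")             # parameter file
write.cas(NYCfevercas, td, "NYCfever")   # case file
write.geo(NYCfevergeo, td, "NYCfever")   # coordinates file

# The extensions were appended by the write functions
list.files(td, pattern = "^NYCfever\\.")
```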
Then we’re ready to run SaTScan. The location and name of the SaTScan executable may well differ on your disk, particularly if you do not use Windows. In a later release of the package, it may be possible to detect the location of the executable.
# This step omitted in compliance with CRAN policies
# Please install SaTScan and run the vignette with this and following code uncommented
# SaTScan can be downloaded from www.satscan.org, free of charge
# you will also find there fully compiled versions of this vignette with results
## NYCfever = satscan(td, "NYCfever", sslocation="C:/Program Files/SaTScan", ssbatchfilename="SaTScanBatch64")
The rsatscan package provides a summary method for satscan objects. The satscan object has a slot for each possible output file that SaTScan creates, and contains whatever output files your call happened to generate.
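A guarded sketch of inspecting the result. It assumes that an object named NYCfever was created by an actual SaTScan run as described above. Without one, the block does nothing.

```r
# Only runs if a satscan object called NYCfever exists in the session
if (exists("NYCfever") && inherits(NYCfever, "satscan")) {
  print(names(NYCfever))   # one element per possible output file
  summary(NYCfever)        # summary method provided by rsatscan
}
```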
If SaTScan generated a shapefile, satscan() reads it, by way of the rgdal function readOGR(), if it’s available, into a class defined in the sp package. You can use the plot methods defined in the sp package to plot it, or use one of the many packages that build on the sp package for further processing.
It might be interesting to examine the scan statistics from the Monte Carlo steps.
This shows why none of the observed clusters had small p-values.
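The link between the simulated statistics and the p-value can be sketched with the usual rank-based Monte Carlo rule. The helper mc_pvalue() is hypothetical and not part of rsatscan, and the numbers passed to it are toy values, not SaTScan output. The guarded part assumes the run produced elements named llr (simulated statistics) and col (cluster table with a TEST_STAT column).

```r
# Rank-based Monte Carlo p-value: the observed statistic is ranked among
# the simulated ones, counting the observed data set itself once
mc_pvalue <- function(observed, simulated) {
  (1 + sum(simulated >= observed)) / (1 + length(simulated))
}

# Toy illustration: one of four replicates is at least as large as 5
mc_pvalue(5, c(1, 2, 3, 6))

# With a real run available, compare observed clusters to the replicates
if (exists("NYCfever") && inherits(NYCfever, "satscan")) {
  sim_llr <- unlist(NYCfever$llr)                        # assumed element name
  hist(sim_llr, main = "Monte Carlo")
  abline(v = NYCfever$col[, "TEST_STAT"], col = "red")   # assumed column name
}
```

An observed statistic that falls in the bulk of the histogram is exceeded by many replicates, so its p-value is large.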
This is another data set included with SaTScan. It differs from the NYC fever example in that denominators are available; these are provided in a population file. The analysis uses the Poisson model rather than the space-time permutation model.
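A sketch of writing the three input files for this example, including the population file. It assumes the bundled data frames are named NMcas, NMgeo and NMpop (matching the NM.cas, NM.geo and NM.pop file names used in the parameters) and that td is a temporary directory.

```r
library(rsatscan)

td <- tempdir()   # assumption: any writeable directory would do

write.cas(NMcas, td, "NM")   # cases
write.geo(NMgeo, td, "NM")   # coordinates
write.pop(NMpop, td, "NM")   # population denominators

list.files(td, pattern = "^NM\\.")
```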
Again, replicating the examples from the SaTScan user guide, we set up and then write the parameter file, then run SaTScan.
invisible(ss.options(reset=TRUE, version="10.1"))
ss.options(list(CaseFile="NM.cas",StartDate="1973/1/1",EndDate="1991/12/31", PopulationFile="NM.pop", CoordinatesFile="NM.geo", CoordinatesType=0, AnalysisType=3))
ss.options(c("NonCompactnessPenalty=0", "ReportGiniClusters=n", "LogRunToHistoryFile=n"))
write.ss.prm(td,"testnm")
## testnm = satscan(td,"testnm", sslocation="C:/Program Files/SaTScan", ssbatchfilename="SaTScanBatch64")
Note that the parameter file need not have the same name as the case and other input files, which also need not share a name, though it may be helpful in keeping things organized.
One of the elements of a satscan class object is the parameter set which was used to call SaTScan. This may be useful later.
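A guarded sketch of looking at that stored parameter set. It assumes the element is named prm and that testnm was created by an actual SaTScan run. Without one, the block does nothing.

```r
# Only runs if a satscan object called testnm exists in the session
if (exists("testnm") && inherits(testnm, "satscan")) {
  head(testnm$prm, 10)   # assumed element name for the stored parameters
}
```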
A third data set included with SaTScan is also included with the package. This one has cases and controls, and uses the Bernoulli model. We replicate the parameters from the SaTScan manual again.
write.cas(NHumbersidecas, td, "NHumberside")
write.ctl(NHumbersidectl, td, "NHumberside")
write.geo(NHumbersidegeo, td, "NHumberside")
invisible(ss.options(reset=TRUE, version="10.1"))
ss.options(list(CaseFile="NHumberside.cas", ControlFile="NHumberside.ctl"))
ss.options(list(PrecisionCaseTimes=0, StartDate="2001/11/1", EndDate="2001/11/24"))
ss.options(list(CoordinatesFile="NHumberside.geo", CoordinatesType=0, ModelType=1))
ss.options(list(TimeAggregationUnits = 3, NonCompactnessPenalty=0))
ss.options(list(ReportGiniClusters="n", LogRunToHistoryFile="n"))
write.ss.prm(td, "NHumberside")
## NHumberside = satscan(td, "NHumberside", sslocation="C:/Program Files/SaTScan", ssbatchfilename="SaTScanBatch64")
## summary(NHumberside)