Join count statistic
Join count statistics are a method of spatial analysis used to assess the degree of association, in particular the autocorrelation, of categorical variables distributed over a spatial map. They were originally introduced by Australian statistician P. A. P. Moran.[1] They have found widespread use in econometrics,[2] remote sensing[3] and ecology,[4] and can be computed in a number of software packages including PASSaGE,[5] GeoDa, PySAL[6] and spdep.[7]
Binary data

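Given binary data $x_i \in \{0, 1\}$ distributed over $N$ spatial sites, where the neighbour relations between regions $i$ and $j$ are encoded in the spatial weight matrix
$$w_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are neighbours} \\ 0 & \text{otherwise} \end{cases}$$
the join count statistics are defined as [8][4]
$$J_{BB} = \frac{1}{2} \sum_{i \neq j} w_{ij} x_i x_j$$
$$J_{BW} = \frac{1}{2} \sum_{i \neq j} w_{ij} (x_i - x_j)^2$$
$$J_{WW} = \frac{1}{2} \sum_{i \neq j} w_{ij} (1 - x_i)(1 - x_j)$$
and the total number of joins is
$$J = \frac{1}{2} \sum_{i \neq j} w_{ij}$$
The subscripts refer to 'black'=1 and 'white'=0 sites. The relation $J_{BB} + J_{BW} + J_{WW} = J$ implies only three of the four numbers are independent. Generally speaking, large values of $J_{BB}$ and $J_{WW}$ relative to $J_{BW}$ imply autocorrelation and relatively large values of $J_{BW}$ imply anti-correlation.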
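As an illustration, the following minimal sketch counts black-black, black-white and white-white neighbouring pairs on a small regular grid. The helper names `rook_weights` and `join_counts` are hypothetical, and the rook (edge-sharing) neighbourhood is an assumption chosen for the example; any symmetric binary weight matrix with a zero diagonal could be used instead.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary, symmetric rook-contiguity weight matrix for a regular grid."""
    n = nrows * ncols
    w = np.zeros((n, n), dtype=int)
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if c + 1 < ncols:              # neighbour to the right
                w[i, i + 1] = w[i + 1, i] = 1
            if r + 1 < nrows:              # neighbour below
                w[i, i + ncols] = w[i + ncols, i] = 1
    return w

def join_counts(x, w):
    """Return (J_BB, J_BW, J_WW, J) for binary x and symmetric binary w."""
    x = np.asarray(x)
    # Each unordered pair appears twice in the double sum, hence the // 2.
    j_bb = (w * np.outer(x, x)).sum() // 2
    j_ww = (w * np.outer(1 - x, 1 - x)).sum() // 2
    j_bw = (w * (x[:, None] - x[None, :]) ** 2).sum() // 2
    j = w.sum() // 2
    return j_bb, j_bw, j_ww, j

# 3x3 grid whose top row is black (1) and whose other sites are white (0)
x = np.array([1, 1, 1,
              0, 0, 0,
              0, 0, 0])
w = rook_weights(3, 3)
j_bb, j_bw, j_ww, j = join_counts(x, w)
print(j_bb, j_bw, j_ww, j)  # 2 3 7 12
```

The three counts add up to the total number of joins, so in this clustered pattern the like-coloured joins (2 + 7) dominate the 3 unlike joins.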
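To assess the statistical significance of these statistics, the expectation under various null models has been computed.[9] For example, if the null hypothesis is that each sample is chosen at random according to a Bernoulli process with probability
$$p_B = P(x_i = 1)$$
then Cliff and Ord [8] show that
$$E[J_{BB}] = J p_B^2$$
$$E[J_{BW}] = 2 J p_B p_W$$
$$E[J_{WW}] = J p_W^2$$
where $p_W = 1 - p_B$ and $J$ is the total number of joins.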
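The expected counts under an independent Bernoulli null can be checked numerically. The sketch below evaluates closed-form moments and compares them with an exhaustive enumeration of all colourings of a four-site path. The function names are hypothetical. The variance expressions are an addition not taken from the text above: they follow from counting covariances between joins that share a site, and they assume a binary, symmetric weight matrix with a zero diagonal.

```python
import itertools
import numpy as np

def bernoulli_moments(w, p_b):
    """Closed-form moments when sites are independent Bernoulli(p_b) draws."""
    w = np.asarray(w)
    p_w = 1.0 - p_b
    pq = p_b * p_w
    j = w.sum() / 2                  # total number of joins
    d = w.sum(axis=1)                # number of neighbours of each site
    m = (d * (d - 1)).sum() / 2      # unordered pairs of joins sharing a site
    return {
        "E_BB": j * p_b**2,
        "E_BW": 2 * j * pq,
        # Variances: an assumption-labelled extra, valid for binary symmetric w.
        "Var_BB": j * p_b**2 + 2 * m * p_b**3 - (j + 2 * m) * p_b**4,
        "Var_BW": 2 * (j + m) * pq - 4 * (j + 2 * m) * pq**2,
    }

def brute_force_moments(w, p_b):
    """Same moments by enumerating all 2^N colourings (small N only)."""
    w = np.asarray(w)
    n = w.shape[0]
    bb, bw, prob = [], [], []
    for cfg in itertools.product([0, 1], repeat=n):
        x = np.array(cfg)
        prob.append(p_b ** x.sum() * (1 - p_b) ** (n - x.sum()))
        bb.append((w * np.outer(x, x)).sum() / 2)
        bw.append((w * (x[:, None] - x[None, :]) ** 2).sum() / 2)
    bb, bw, prob = map(np.array, (bb, bw, prob))
    mean = lambda v: (prob * v).sum()
    return {
        "E_BB": mean(bb),
        "E_BW": mean(bw),
        "Var_BB": mean(bb**2) - mean(bb) ** 2,
        "Var_BW": mean(bw**2) - mean(bw) ** 2,
    }

# Path of four sites: 0 - 1 - 2 - 3
w = np.eye(4, k=1, dtype=int) + np.eye(4, k=-1, dtype=int)
closed = bernoulli_moments(w, p_b=0.3)
exact = brute_force_moments(w, p_b=0.3)
for key in exact:
    print(key, closed[key], exact[key])
```

Enumeration is only feasible for a handful of sites; it is used here purely to confirm that the closed forms agree with the definition of the null model.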
However, in practice[10] an approach based on random permutations is preferred, since it requires fewer assumptions.
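A permutation approach can be sketched as follows: the observed values are shuffled across sites many times, the statistic is recomputed for each shuffle, and the observed value is ranked against the simulated ones. The function names, the choice of 999 permutations, the one-sided pseudo p-value and the ring-shaped example map are all assumptions made for illustration, not a description of any particular package.

```python
import numpy as np

def join_count_bb(x, w):
    """Number of black-black joins for binary x and symmetric binary w."""
    return (w * np.outer(x, x)).sum() / 2

def permutation_test_bb(x, w, n_perm=999, seed=0):
    """One-sided permutation test for an excess of black-black joins."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    observed = join_count_bb(x, w)
    # Shuffling keeps the number of black sites fixed and breaks spatial structure.
    simulated = np.array(
        [join_count_bb(rng.permutation(x), w) for _ in range(n_perm)]
    )
    # Pseudo p-value: share of shuffles at least as extreme as the observed map,
    # with the observed map itself counted once.
    p_value = (np.sum(simulated >= observed) + 1) / (n_perm + 1)
    return observed, simulated, p_value

# Ring of 20 sites, each joined to its two adjacent sites
n = 20
w = np.zeros((n, n), dtype=int)
for i in range(n):
    w[i, (i + 1) % n] = w[(i + 1) % n, i] = 1

# Strongly clustered pattern: ten black sites followed by ten white sites
x = np.array([1] * 10 + [0] * 10)
observed, simulated, p_value = permutation_test_bb(x, w)
print(observed, simulated.mean(), p_value)
```

Because the number of black sites is held fixed by the shuffling, no value of the Bernoulli probability has to be assumed or estimated.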
Local join count statistic
Anselin and Li introduced[11][12] the idea of the local join count statistic, following Anselin's general idea of a Local Indicator of Spatial Association (LISA).[13] The local join count is defined by, e.g.,
$$BB_i = x_i \sum_{j \neq i} w_{ij} x_j$$
with similar definitions for $BW_i$ and $WW_i$. This is equivalent to the Getis–Ord statistics computed with binary data. Some analytic results for the expectation of the local statistics are available based on the hypergeometric distribution,[11] but due to the multiple comparisons problem a permutation-based approach is again preferred in practice.[12]
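A minimal sketch of the local black-black count is given below: for each black site it counts the black neighbours, and it is zero for white sites. The function name `local_join_count_bb` and the five-site path are assumptions made for the example.

```python
import numpy as np

def local_join_count_bb(x, w):
    """Local black-black join count for each site (binary x, binary w)."""
    x = np.asarray(x)
    # (w @ x)[i] is the number of black neighbours of site i;
    # multiplying by x[i] keeps the count only where site i is itself black.
    return x * (w @ x)

# Path of five sites: 0 - 1 - 2 - 3 - 4
w = np.eye(5, k=1, dtype=int) + np.eye(5, k=-1, dtype=int)
x = np.array([1, 1, 1, 0, 1])

bb_local = local_join_count_bb(x, w)
print(bb_local)              # [1 2 1 0 0]

# Each global black-black join is seen from both of its end sites,
# so the local counts sum to twice the global count.
global_bb = (w * np.outer(x, x)).sum() // 2
print(bb_local.sum(), 2 * global_bb)  # 4 4
```

Site 4 is black but has no black neighbour, so its local count is zero, just as for the white site 3.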
Extension to multiple categories

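When there are $k$ categories, join count statistics have been generalised[4][8][9]
$$J_{rr} = \frac{1}{2} \sum_{i \neq j} w_{ij} I_r(x_i) I_r(x_j)$$
$$J_{rs} = \frac{1}{2} \sum_{i \neq j} w_{ij} \left[ I_r(x_i) I_s(x_j) + I_s(x_i) I_r(x_j) \right], \quad r \neq s$$
where $I_r(x_i)$ is an indicator function for the variable $x_i$ belonging to the category $r$. Analytic results are available,[14] or a permutation approach can be used to test for significance as in the binary case.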
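For several categories, the counts can be arranged in a symmetric table whose entry for categories r and s is the number of joins linking a site of category r to a site of category s, with each unordered pair of neighbouring sites counted once. The sketch below assumes categories are coded as integers 0 to k-1 and that the weight matrix is binary and symmetric; the function name is hypothetical.

```python
import numpy as np

def multicategory_join_counts(x, w, k):
    """Symmetric k-by-k table of join counts between categories."""
    x = np.asarray(x)
    ind = np.eye(k, dtype=int)[x]        # (N, k) one-hot indicator matrix
    counts = ind.T @ w @ ind             # sums over ordered pairs of sites
    # Off-diagonal entries already count each r-s join once;
    # same-category joins were counted in both orders, so halve the diagonal.
    counts[np.diag_indices(k)] //= 2
    return counts

# Path of five sites: 0 - 1 - 2 - 3 - 4, with three categories
w = np.eye(5, k=1, dtype=int) + np.eye(5, k=-1, dtype=int)
x = np.array([0, 0, 1, 2, 2])

counts = multicategory_join_counts(x, w, k=3)
print(counts)
# [[1 1 0]
#  [1 0 1]
#  [0 1 1]]

# The upper triangle (including the diagonal) sums to the total number of joins.
print(np.triu(counts).sum(), w.sum() // 2)  # 4 4
```

With k = 2 the diagonal reproduces the white-white and black-black counts and the off-diagonal entry reproduces the black-white count of the binary case.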