
The Local Geary is a local adaptation of Geary's C statistic of spatial autocorrelation. Unlike the Local Moran, the Local Geary uses squared differences to measure dissimilarity. Low values of the Local Geary indicate positive spatial autocorrelation and large values indicate negative spatial autocorrelation. Inference for the Local Geary is based on a permutation approach, which compares the observed value to a reference distribution generated under spatial randomness. The result is a pseudo p-value rather than an analytical p-value: it depends on the number of permutations and should therefore be used with care.

Usage

local_c(x, nb, wt, ...)

local_c_perm(x, nb, wt, nsim = 499, alternative = "two.sided", ...)

Arguments

x

a numeric vector, or list of numeric vectors of equal length.

nb

a neighbor list

wt

a weights list

...

other arguments passed to spdep::localC_perm(), e.g. zero.policy = TRUE to allow for zones without neighbors.

nsim

The number of simulations used to generate the reference distribution.

alternative

A character defining the alternative hypothesis. Must be one of "two.sided", "less" or "greater".
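As a rough illustration of how these arguments fit together, the sketch below calls local_c_perm() directly rather than inside a dplyr pipeline. It assumes sfdep is installed and reuses the guerry data, st_contiguity(), and st_weights() from the Examples section. The object names nb, wt, and res, and the choices nsim = 199 and alternative = "less", are arbitrary picks for illustration only.

```r
library(sfdep)

# Build the neighbor list and weights list from the guerry geometries
# (same helpers as in the Examples section).
nb <- st_contiguity(guerry$geometry)
wt <- st_weights(nb)

# Univariate Local Geary on a single numeric vector, with a smaller number
# of simulations and a one-sided alternative (illustrative values only).
res <- local_c_perm(guerry$crime_pers, nb, wt,
                    nsim = 199, alternative = "less")

# One row of results per observation.
nrow(res)
```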

Details

Overview

The Local Geary can be extended to a multivariate context. When x is a numeric vector, the univariate Local Geary is calculated. To calculate the multivariate Local Geary, provide either a list or a matrix. When x is a list, each element must be a numeric vector, and every vector must have the same length as the neighbor list nb. When x is a matrix, the number of rows must equal the length of the neighbor list nb.

Standardization is not required in the univariate context, but the standardized Local Geary is calculated nonetheless. The multivariate Local Geary is always standardized.

The univariate Local Geary is calculated as \(c_i = \sum_j w_{ij}(x_i - x_j)^2\) and the multivariate Local Geary is calculated as \(c_{k,i} = \sum_{v=1}^{k} c_{v,i}\) as described in Anselin (2019).
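The two formulas above can be sketched by hand in base R. This is a minimal illustration under stated assumptions, not the package's implementation: local_geary_sketch() is a hypothetical helper, the four-observation neighbor list and row-standardized weights are made up, and the inputs are deliberately left unstandardized so the arithmetic is easy to follow.

```r
# Toy neighbor structure: four observations arranged along a line.
nb <- list(2L, c(1L, 3L), c(2L, 4L), 3L)
# Row-standardized weights: each neighbor gets weight 1 / (number of neighbors).
wt <- lapply(nb, function(j) rep(1 / length(j), length(j)))

# Hypothetical helper: c_i = sum_j w_ij * (x_i - x_j)^2 for every i.
local_geary_sketch <- function(x, nb, wt) {
  vapply(
    seq_along(x),
    function(i) sum(wt[[i]] * (x[i] - x[nb[[i]]])^2),
    numeric(1)
  )
}

x <- c(1, 2, 4, 7)
y <- c(3, 3, 5, 5)

# Univariate statistic for x.
ci_x <- local_geary_sketch(x, nb, wt)
ci_x
#> [1] 1.0 2.5 6.5 9.0

# Multivariate statistic as the sum of the per-variable statistics.
ci_multi <- Reduce(`+`, lapply(list(x, y), local_geary_sketch, nb = nb, wt = wt))
ci_multi
#> [1] 1.0 4.5 8.5 9.0
```

Because local_c() and local_c_perm() work with standardized values, their output for the same data will generally differ from these raw numbers.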

Implementation

These functions are based on the implementations of the local Geary statistic in the development version of spdep. They are based on spdep::localC and spdep::localC_perm.

spdep::localC_perm, and thus local_c_perm, uses a conditional permutation approach to approximate a reference distribution: each observation i is held fixed, its neighbors are randomly sampled from the remaining observations, and the local C statistic is calculated for that tuple (ci). This is repeated nsim times. From the simulations, three different types of p-values are calculated, all of which have potential flaws, so be extra judicious when using p-values to draw conclusions.

  • p_ci: utilizes the sample mean and standard deviation of the simulated statistics. The p-value is then calculated using pnorm(), which assumes a normal distribution, and that assumption does not always hold.

  • p_ci_sim: uses the rank of the observed statistic.

  • p_folded_sim: follows the pysal implementation where p-values are in the range of [0, 0.5]. This excludes 1/2 of all p-values and should be used with caution.
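The conditional permutation idea and the three kinds of p-value can be sketched for a single observation in base R. This is a simplified illustration, not the code used by spdep: the data, the neighbor indices, and all object names are made up, and the rank-based two-sided value is assumed to be twice the folded value, which matches the relationship visible in the example output on this page.

```r
set.seed(1)

x    <- c(1, 2, 4, 7, 3, 9, 5, 6)  # made-up data
i    <- 3                          # observation held fixed
nb_i <- c(2L, 4L)                  # its (made-up) neighbors
wt_i <- c(0.5, 0.5)                # row-standardized weights
nsim <- 499

# Observed local statistic for observation i.
obs <- sum(wt_i * (x[i] - x[nb_i])^2)

# Conditional permutation: keep x[i] fixed and draw "neighbors" at random
# from the remaining observations, nsim times.
others <- x[-i]
sims <- replicate(nsim, {
  draw <- sample(others, length(nb_i))
  sum(wt_i * (x[i] - draw)^2)
})

# Normal approximation from the simulated mean and standard deviation.
z      <- (obs - mean(sims)) / sd(sims)
p_norm <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Folded pseudo p-value: count the smaller tail, so values never exceed 0.5.
n_ge     <- sum(sims >= obs)
p_folded <- (min(n_ge, nsim - n_ge) + 1) / (nsim + 1)

# Rank-based two-sided pseudo p-value (assumed here to be twice the folded one).
p_rank <- min(1, 2 * p_folded)

c(p_norm = p_norm, p_rank = p_rank, p_folded = p_folded)
```

With nsim = 499 the smallest attainable folded value is 1 / 500 = 0.002, which is why pseudo p-values are only as fine-grained as the number of simulations allows.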

References

Anselin, L. (1995), Local Indicators of Spatial Association—LISA. Geographical Analysis, 27: 93-115. doi: 10.1111/j.1538-4632.1995.tb00338.x

Anselin, L. (2019), A Local Indicator of Multivariate Spatial Association: Extending Geary's c. Geogr Anal, 51: 133-150. doi: 10.1111/gean.12164

Author

Josiah Parry, josiah.parry@gmail.com

Examples

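library(sfdep)
library(magrittr) # provides the %>% pipe

# Multivariate Local Geary for two variables, one row per observation
guerry %>%
  dplyr::transmute(nb = st_contiguity(geometry),
                   wt = st_weights(nb),
                   geary = local_c_perm(
                     x = list(crime_pers, literacy),
                     nb, wt
                   )) %>%
  tidyr::unnest(geary)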
#> Simple feature collection with 85 features and 12 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -5.139026 ymin: 42.33349 xmax: 8.23032 ymax: 51.08939
#> Geodetic CRS:  WGS 84
#> # A tibble: 85 × 13
#>    nb      wt        ci cluster  e_ci var_ci   z_ci   p_ci p_ci_sim p_folded_sim
#>    <nb>    <list> <dbl> <fct>   <dbl>  <dbl>  <dbl>  <dbl>    <dbl>        <dbl>
#>  1 <int [<dbl … 1.09  Positi…  1.76  0.469 -0.980 0.327     0.376        0.188
#>  2 <int [<dbl … 0.557 Positi…  1.62  0.240 -2.16  0.0309    0.004        0.002
#>  3 <int [<dbl … 0.571 Positi…  2.62  0.685 -2.48  0.0131    0.004        0.002
#>  4 <int [<dbl … 0.525 Positi…  1.51  0.406 -1.54  0.124     0.104        0.052
#>  5 <int [<dbl … 1.69  Positi…  2.57  1.07  -0.851 0.395     0.432        0.216
#>  6 <int [<dbl … 0.803 Positi…  2.32  0.576 -2.00  0.0460    0.02         0.01 
#>  7 <int [<dbl … 1.99  Positi…  4.42  1.99  -1.73  0.0840    0.044        0.022
#>  8 <int [<dbl … 1.09  Positi…  3.56  2.02  -1.74  0.0824    0.044        0.022
#>  9 <int [<dbl … 0.547 Positi…  1.64  0.322 -1.92  0.0550    0.032        0.016
#> 10 <int [<dbl … 0.557 Positi…  1.23  0.260 -1.31  0.189     0.176        0.088
#> # … with 75 more rows, and 3 more variables: skewness <dbl>, kurtosis <dbl>,
#> #   geometry <MULTIPOLYGON [°]>