The univariate local join count statistic is used to identify clusters of rarely occurring binary variables. The binary variable of interest should occur less than half of the time.
Usage
local_jc_uni(
x,
nb,
wt = st_weights(nb, style = "B"),
nsim = 499,
alternative = "two.sided"
)
Arguments
- x
a binary variable, either numeric or logical.
- nb
a neighbors list object.
- wt
default st_weights(nb, style = "B"). A binary weights list as created by st_weights(nb, style = "B").
- nsim
the number of conditional permutation simulations
- alternative
the alternative hypothesis for the test. The Usage signature above sets the default to "two.sided"; "less" and "greater" select one-sided tests.
Details
The local join count statistic requires a binary weights list, which can be generated with st_weights(nb, style = "B"). Additionally, ensure that the binary variable of interest is rare, occurring in no more than half of the observations.
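The two requirements above can be checked before calling the function. The following is a minimal base R sketch, assuming a hypothetical toy indicator `x` and a hand-written neighbor list `nb` (neither comes from the package). It builds binary weights by hand, giving every neighbor a weight of 1, to illustrate the shape of such a weights list; it is not a replacement for st_weights().

```r
# Hypothetical toy inputs (illustrative only, not taken from guerry)
x  <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L),
           c(4L, 6L), c(5L, 7L), c(6L, 8L), 7L)

# Share of observations in which the event occurs
prevalence <- mean(x)

# The event should occur in no more than half of the observations
is_rare <- prevalence <= 0.5

# Binary weights built by hand: one weight of 1 per neighbor
wt <- lapply(nb, function(i) rep(1, length(i)))

# Each weight vector must match its neighbor set in length and hold only ones
weights_ok <- all(lengths(wt) == lengths(nb)) && all(unlist(wt) == 1)
```

If `is_rare` is FALSE, one option is to recode the variable so that the less frequent outcome is the one marked TRUE, provided that outcome is still the one of substantive interest.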
P-values are estimated using a conditional permutation approach. This creates a reference distribution against which the observed statistic is compared. For more, see the GeoDa Glossary.
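To make the conditional permutation idea concrete, here is a minimal base R sketch. It is not the package's internal implementation. It assumes the local join count at location i is x_i multiplied by the weighted sum of its neighbors' values, holds x_i fixed, redraws the neighbor values from the remaining observations, and uses a one-sided "greater or equal" pseudo p-value. The helper names `local_jc` and `perm_p` and the toy data are hypothetical.

```r
set.seed(1)

# Hypothetical toy data: six locations on a line, each adjacent to the next
x  <- c(1, 1, 0, 0, 1, 0)
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), c(4L, 6L), 5L)
wt <- lapply(nb, function(i) rep(1, length(i)))  # binary weights

# Observed statistic: x_i * sum_j w_ij * x_j
local_jc <- function(x, nb, wt) {
  vapply(seq_along(x),
         function(i) x[i] * sum(wt[[i]] * x[nb[[i]]]),
         numeric(1))
}
jc_obs <- local_jc(x, nb, wt)

# Conditional permutation for one location (assumed one-sided comparison)
perm_p <- function(i, x, nb, wt, jc_obs, nsim = 499) {
  # Simplification: no p-value where the event does not occur at i
  if (x[i] == 0) return(NA_real_)
  others <- x[-i]          # x_i is held fixed; only the rest is shuffled
  k <- length(nb[[i]])     # number of neighbor slots to fill
  sims <- replicate(nsim, sum(wt[[i]] * sample(others, k)))
  # Pseudo p-value: share of simulated counts at least as large as observed
  (sum(sims >= jc_obs[i]) + 1) / (nsim + 1)
}

p_sim <- vapply(seq_along(x),
                function(i) perm_p(i, x, nb, wt, jc_obs),
                numeric(1))
```

Adding 1 to both the numerator and the denominator counts the observed value as one draw from the reference distribution, so the pseudo p-value can never be exactly zero.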
Examples
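library(sfdep)
library(magrittr) # provides %>%

# Flag observations with crime_prop above 9000, then compute local join counts
guerry %>%
  dplyr::transmute(
    top_crime = crime_prop > 9000,
    nb = st_contiguity(geometry),
    wt = st_weights(nb, style = "B"),
    jc = local_jc_uni(top_crime, nb, wt)
  ) %>%
  tidyr::unnest(jc)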
#> Simple feature collection with 85 features and 5 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -5.139026 ymin: 42.33349 xmax: 8.23032 ymax: 51.08939
#> Geodetic CRS: WGS 84
#> # A tibble: 85 × 6
#> top_crime nb wt join_count p_sim geometry
#> <lgl> <nb> <list> <dbl> <dbl> <MULTIPOLYGON [°]>
#> 1 TRUE <int [4]> <dbl [4]> 1 0.258 (((4.92452 45.80404, 4.91857…
#> 2 FALSE <int [6]> <dbl [6]> 0 NA (((4.126445 49.67821, 4.1262…
#> 3 FALSE <int [6]> <dbl [6]> 0 NA (((3.773349 46.22719, 3.7850…
#> 4 FALSE <int [4]> <dbl [4]> 0 NA (((5.872688 44.22421, 5.8694…
#> 5 FALSE <int [3]> <dbl [3]> 0 NA (((5.921825 44.24841, 5.9122…
#> 6 TRUE <int [7]> <dbl [7]> 2 0.312 (((4.177986 44.31775, 4.1732…
#> 7 FALSE <int [3]> <dbl [3]> 0 NA (((5.361486 49.59208, 5.3575…
#> 8 TRUE <int [3]> <dbl [3]> 1 0.37 (((1.229289 42.72774, 1.2258…
#> 9 FALSE <int [5]> <dbl [5]> 0 NA (((4.690867 48.08597, 4.6863…
#> 10 TRUE <int [5]> <dbl [5]> 2 0.444 (((2.659241 43.29298, 2.6597…
#> # … with 75 more rows
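A common next step is to keep only locations whose simulated p-value falls below a chosen threshold. The sketch below re-enters the ten rows printed above into a plain data frame, so it runs without the package, and filters them in base R. The 0.05 cutoff and the object names `res` and `sig` are assumptions for illustration.

```r
# First ten rows of the printed output, re-entered by hand
res <- data.frame(
  top_crime  = c(TRUE, FALSE, FALSE, FALSE, FALSE,
                 TRUE, FALSE, TRUE, FALSE, TRUE),
  join_count = c(1, 0, 0, 0, 0, 2, 0, 1, 0, 2),
  p_sim      = c(0.258, NA, NA, NA, NA, 0.312, NA, 0.37, NA, 0.444)
)

# Keep rows with a non-missing p-value at or below the assumed 0.05 cutoff
sig <- subset(res, !is.na(p_sim) & p_sim <= 0.05)

# Number of candidate clusters among these ten rows
n_sig <- nrow(sig)
```

None of these ten rows passes the 0.05 cutoff, since the smallest p_sim shown is 0.258.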