The univariate local join count statistic is used to identify clusters of a rarely occurring binary variable. The event of interest (x equal to 1 or TRUE) should occur in less than half of the observations.

Usage

local_jc_uni(
  x,
  nb,
  wt = st_weights(nb, style = "B"),
  nsim = 499,
  alternative = "two.sided"
)

Arguments

x

a binary variable, either numeric or logical.

nb

a neighbors list object.

wt

default st_weights(nb, style = "B"). A binary weights list as created by st_weights(nb, style = "B").

nsim

the number of conditional permutation simulations

alternative

default "two.sided", as shown in Usage. Other options are "less" and "greater".

Details

The local join count statistic requires a binary weights list, which can be generated with st_weights(nb, style = "B"). Additionally, ensure that the binary variable of interest occurs rarely, in no more than half of the observations.
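
To make the role of the binary weights concrete, the following is a minimal base R sketch of the usual definition of the statistic: for observation i, the join count is x_i multiplied by the weighted sum of x over its neighbors. The helper `local_jc_sketch()` and the toy `nb` and `wt` lists are hypothetical and written only for illustration; this is not the package's implementation, and the hand-built lists merely mimic the shape of a neighbors list and a binary weights list.

```r
# Toy data: seven observations on a line, each neighbouring the adjacent ones.
x  <- c(1, 1, 0, 1, 0, 0, 0)
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), c(4L, 6L), c(5L, 7L), 6L)

# Binary weights: every neighbour gets weight 1 (the idea behind style = "B").
wt <- lapply(nb, function(i) rep(1, length(i)))

# The event of interest should occur in no more than half of the observations.
rare_enough <- mean(x) <= 0.5

# Hypothetical helper: x_i times the weighted sum of x over the neighbours of i.
local_jc_sketch <- function(x, nb, wt) {
  x <- as.numeric(x)
  vapply(
    seq_along(x),
    function(i) x[i] * sum(wt[[i]] * x[nb[[i]]]),
    numeric(1)
  )
}

jc <- local_jc_sketch(x, nb, wt)
```

Under this definition an observation where x is 0 (or FALSE) always has a join count of 0, which matches the pattern in the example output, where rows with top_crime equal to FALSE report a join_count of 0.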

P-values are estimated using a conditional permutation approach. This creates a reference distribution against which the observed statistic is compared. For more, see the GeoDa Glossary.
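
As a rough illustration of the idea, the base R sketch below holds the value at location i fixed, repeatedly draws values for its neighbors from the remaining observations, and compares the observed neighbor count with the simulated ones. The helper `cond_perm_p_sketch()` is hypothetical, it assumes binary weights and a one-sided "greater" pseudo p-value of the form (r + 1) / (nsim + 1), and it returns NA where x is 0; the package's own computation may differ in its details.

```r
# Hypothetical helper: conditional permutation pseudo p-values (one-sided).
cond_perm_p_sketch <- function(x, nb, nsim = 499) {
  x <- as.numeric(x)
  vapply(seq_along(x), function(i) {
    # No statistic is tested where the event of interest is absent.
    if (x[i] == 0) return(NA_real_)

    k      <- length(nb[[i]])   # number of neighbours of i
    obs    <- sum(x[nb[[i]]])   # observed join count at i (binary weights)
    others <- x[-i]             # values available to permute; x[i] stays fixed

    # Reference distribution: join counts under random neighbour values.
    sims <- replicate(nsim, sum(others[sample.int(length(others), k)]))

    # Share of simulations at least as large as the observed value.
    (sum(sims >= obs) + 1) / (nsim + 1)
  }, numeric(1))
}

set.seed(1)
x  <- c(1, 1, 0, 1, 0, 0, 0, 0)
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L),
           c(4L, 6L), c(5L, 7L), c(6L, 8L), 7L)
p  <- cond_perm_p_sketch(x, nb, nsim = 99)
```

Because the result depends on random draws, p-values vary from run to run unless a seed is set, and a larger nsim gives a finer-grained reference distribution.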

Examples

guerry %>%
  dplyr::transmute(top_crime = crime_prop > 9000,
                   nb = st_contiguity(geometry),
                   wt = st_weights(nb, style = "B"),
                   jc = local_jc_uni(top_crime, nb, wt)) %>%
  tidyr::unnest(jc)
#> Simple feature collection with 85 features and 5 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -5.139026 ymin: 42.33349 xmax: 8.23032 ymax: 51.08939
#> Geodetic CRS:  WGS 84
#> # A tibble: 85 × 6
#>    top_crime nb        wt        join_count  p_sim                      geometry
#>    <lgl>     <nb>      <list>         <dbl>  <dbl>            <MULTIPOLYGON [°]>
#>  1 TRUE      <int [4]> <dbl [4]>          1  0.258 (((4.92452 45.80404, 4.91857…
#>  2 FALSE     <int [6]> <dbl [6]>          0 NA     (((4.126445 49.67821, 4.1262…
#>  3 FALSE     <int [6]> <dbl [6]>          0 NA     (((3.773349 46.22719, 3.7850…
#>  4 FALSE     <int [4]> <dbl [4]>          0 NA     (((5.872688 44.22421, 5.8694…
#>  5 FALSE     <int [3]> <dbl [3]>          0 NA     (((5.921825 44.24841, 5.9122…
#>  6 TRUE      <int [7]> <dbl [7]>          2  0.312 (((4.177986 44.31775, 4.1732…
#>  7 FALSE     <int [3]> <dbl [3]>          0 NA     (((5.361486 49.59208, 5.3575…
#>  8 TRUE      <int [3]> <dbl [3]>          1  0.37  (((1.229289 42.72774, 1.2258…
#>  9 FALSE     <int [5]> <dbl [5]>          0 NA     (((4.690867 48.08597, 4.6863…
#> 10 TRUE      <int [5]> <dbl [5]>          2  0.444 (((2.659241 43.29298, 2.6597…
#> # … with 75 more rows