Bayesian areal wombling using false discovery rates

Pei Li, Sudipto Banerjee, Alexander M. McBean, Bradley P. Carlin

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Spatial data arising in public health services are often reported as case counts or rates aggregated over areal regions (e.g., counties, census tracts or ZIP codes), rather than being referenced to the geographical coordinates of individual residences. For such areal data, inferential interest often lies in the formal identification of "barriers", or "difference boundaries", on the map, where "boundary" refers to a border across which the outcome changes sharply. This boundary detection problem is often referred to as "wombling" or, more specifically, "areal wombling" for aggregated areal data, after a foundational article by Womble (1951). Existing statistical frameworks for areal wombling usually follow a two-stage procedure: (i) estimate the spatial effects from an appropriate spatial model, and (ii) detect boundaries by applying appropriate discrepancy metrics to those estimates. Lu and Carlin (2005), and several subsequent articles, explored areal wombling within this framework. This article treats wombling as a hypothesis-testing problem, where we test a substantial number of hypotheses - one for each geographical boundary - and seek to provide policy-makers and analysts with a final set of difference boundaries. Here we must reckon with a lurking multiplicity problem arising from the large number of individual hypotheses we are testing. We proffer a computationally feasible framework to estimate hierarchical spatial models that account for dependence between adjacent regions and to test for equality of spatial effects, while adjusting for multiplicities using false discovery rates (FDR); see, e.g., Benjamini and Hochberg (1995). A simulation study is first conducted to illustrate and assess the new approach, which is then applied to detect boundaries on a county map of Minnesota that records pneumonia and influenza hospitalization rates from the SEER-Medicare program.
AMS 2000 subject classifications: Primary 62F15, 62H11; secondary 62F03.
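
To make the multiplicity adjustment concrete, the following is a minimal sketch of one common posterior-probability-based FDR rule. It is not taken from the article and may differ from the rule the authors use. It assumes that a fitted spatial model has already produced, for each geographical boundary, a posterior probability that the spatial effects on its two sides differ. The function name `select_boundaries`, the input `post_prob`, and the target level `alpha` are hypothetical names chosen for illustration.

```python
def select_boundaries(post_prob, alpha=0.10):
    """Flag edges as difference boundaries under an estimated-FDR bound.

    post_prob[i] is an assumed posterior probability that the spatial
    effects on the two sides of edge i differ (illustrative input, not
    output from the article's model).
    """
    # Rank edges from most to least likely to be a difference boundary.
    order = sorted(range(len(post_prob)),
                   key=lambda i: post_prob[i], reverse=True)

    selected = []
    error_sum = 0.0  # running sum of posterior false-discovery probabilities
    for k, i in enumerate(order, start=1):
        error_sum += 1.0 - post_prob[i]
        # Estimated FDR of the top-k set is the mean of (1 - post_prob).
        # The mean never decreases as k grows, so stop at the first failure.
        if error_sum / k > alpha:
            break
        selected.append(i)
    return sorted(selected)


# Five hypothetical edges; the values are made up for illustration.
flagged = select_boundaries([0.99, 0.95, 0.40, 0.90, 0.10], alpha=0.10)
print(flagged)
```

The rule keeps the largest set of top-ranked edges whose average posterior probability of being a false discovery stays at or below `alpha`. Lowering `alpha` yields fewer, more certain boundaries.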

Original language: English (US)
Pages (from-to): 149-158
Number of pages: 10
Journal: Statistics and its Interface
Volume: 5
Issue number: 2
State: Published - 2012

Keywords

  • Areal data
  • Bayesian inference
  • False discovery rates
  • Hierarchical models
  • Spatial moving averages
