Partial and complete dependency among data sets has minimal consequence on estimates from integrated population models

Mitch D. Weegman; Todd W. Arnold; Robert G. Clark; Michael Schaub

doi:10.1002/eap.2258

Fisheries, Wildlife, and Conservation Biology

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

Integrated population models (IPMs) are widely used to combine disparate data sets in joint analysis to better understand population dynamics and provide guidance for conservation activities. An often-cited assumption of IPMs is independence among component data sets within the combined likelihood. Dependency among data sets should lead to underestimation of variance and bias because individuals contribute data to more than one data set. In practice, studied individuals often occur in multiple data sets in IPMs (i.e., overlap), which is one way for the independence assumption to be violated. Such cases have the potential to dissuade practitioners and limit application of IPMs to solve emerging ecological problems. We assessed precision and bias of demographic rates estimated from IPMs using a complete gradient (0–100%) of overlap among data sets, wide ranges in demographic rates (e.g., survival 0.1–0.8) and sample sizes (100–1,200 individuals) and variable data sources. We compared results from our simulations with those from IPMs constructed using empirical data on tree swallows (Tachycineta bicolor) where data sets either had complete overlap or included different individuals. Contrary to previous investigators, we found no substantive bias or uncertainty in any demographic rate from IPMs derived from data sets with complete overlap. While variability in demographic rates was greater at low sample sizes (i.e., low capture, recapture, and survey probabilities), there were negligible differences in the posterior mean or root mean square error of demographic rates among IPMs with strong dependence vs. complete independence among data sets. Our simulations suggest IPMs can be designed using only capture–recapture data or harvest and capture-recovery data where population estimates are obtained from the same data as survival and productivity data. 
While we encourage researchers to carefully consider the modeling approach best suited for their data sets, our results suggest that dependence among data sets does not generally compromise IPM estimates. Thus, violation of the independence assumption should not dissuade researchers from the application of IPMs in ecological research.

Original language	English (US)
Article number	e2258
Journal	Ecological Applications
Volume	31
Issue number	3
DOIs	https://doi.org/10.1002/eap.2258
State	Published - Apr 2021

Bibliographical note

Funding Information:
We thank B. Morgan, F. Abadi, J. A. Royle, R. McCrea, and T. Besbeas for discussions on this work. We thank the dozens of volunteers and student employees who contributed to the tree swallow data in this paper, especially C. Fehr, D. Shutler, L. Bortolotti, M. Fast, P. Leighton, T. Diamond, and V. Harriman. Tree swallow work was supported through NSERC grants to R. G. Clark. Finally, we thank our respective employers for their support of this research.

Publisher Copyright:
© 2020 by the Ecological Society of America

Keywords

Horvitz-Thompson estimator
capture–mark–recapture model
capture–recovery model
independence among data sets
integrated population model

PubMed: MeSH publication types

Journal Article
Research Support, Non-U.S. Gov't

Cite this

@article{78703b1bd8594e8d9d3e731212f86a8a,
title = "Partial and complete dependency among data sets has minimal consequence on estimates from integrated population models",
abstract = "Integrated population models (IPMs) are widely used to combine disparate data sets in joint analysis to better understand population dynamics and provide guidance for conservation activities. An often-cited assumption of IPMs is independence among component data sets within the combined likelihood. Dependency among data sets should lead to underestimation of variance and bias because individuals contribute data to more than one data set. In practice, studied individuals often occur in multiple data sets in IPMs (i.e., overlap), which is one way for the independence assumption to be violated. Such cases have the potential to dissuade practitioners and limit application of IPMs to solve emerging ecological problems. We assessed precision and bias of demographic rates estimated from IPMs using a complete gradient (0–100%) of overlap among data sets, wide ranges in demographic rates (e.g., survival 0.1–0.8) and sample sizes (100–1,200 individuals) and variable data sources. We compared results from our simulations with those from IPMs constructed using empirical data on tree swallows (Tachycineta bicolor) where data sets either had complete overlap or included different individuals. Contrary to previous investigators, we found no substantive bias or uncertainty in any demographic rate from IPMs derived from data sets with complete overlap. While variability in demographic rates was greater at low sample sizes (i.e., low capture, recapture, and survey probabilities), there were negligible differences in the posterior mean or root mean square error of demographic rates among IPMs with strong dependence vs. complete independence among data sets. Our simulations suggest IPMs can be designed using only capture–recapture data or harvest and capture-recovery data where population estimates are obtained from the same data as survival and productivity data. 
While we encourage researchers to carefully consider the modeling approach best suited for their data sets, our results suggest that dependence among data sets does not generally compromise IPM estimates. Thus, violation of the independence assumption should not dissuade researchers from the application of IPMs in ecological research.",

keywords = "Horvitz-Thompson estimator, capture–mark–recapture model, capture–recovery model, independence among data sets, integrated population model",
author = "Weegman, {Mitch D.} and Arnold, {Todd W.} and Clark, {Robert G.} and Schaub, Michael",
note = "Funding Information: We thank B. Morgan, F. Abadi, J. A. Royle, R. McCrea, and T. Besbeas for discussions on this work. We thank the dozens of volunteers and student employees who contributed to the tree swallow data in this paper, especially C. Fehr, D. Shutler, L. Bortolotti, M. Fast, P. Leighton, T. Diamond, and V. Harriman. Tree swallow work was supported through NSERC grants to R. G. Clark. Finally, we thank our respective employers for their support of this research. Publisher Copyright: {\textcopyright} 2020 by the Ecological Society of America",
year = "2021",
month = apr,
doi = "10.1002/eap.2258",
language = "English (US)",
volume = "31",
journal = "Ecological Applications",
issn = "1051-0761",
publisher = "Ecological Society of America",
number = "3",
}

TY - JOUR
T1 - Partial and complete dependency among data sets has minimal consequence on estimates from integrated population models
AU - Weegman, Mitch D.
AU - Arnold, Todd W.
AU - Clark, Robert G.
AU - Schaub, Michael
N1 - Funding Information: We thank B. Morgan, F. Abadi, J. A. Royle, R. McCrea, and T. Besbeas for discussions on this work. We thank the dozens of volunteers and student employees who contributed to the tree swallow data in this paper, especially C. Fehr, D. Shutler, L. Bortolotti, M. Fast, P. Leighton, T. Diamond, and V. Harriman. Tree swallow work was supported through NSERC grants to R. G. Clark. Finally, we thank our respective employers for their support of this research. Publisher Copyright: © 2020 by the Ecological Society of America

PY - 2021/4
Y1 - 2021/4
AB - Integrated population models (IPMs) are widely used to combine disparate data sets in joint analysis to better understand population dynamics and provide guidance for conservation activities. An often-cited assumption of IPMs is independence among component data sets within the combined likelihood. Dependency among data sets should lead to underestimation of variance and bias because individuals contribute data to more than one data set. In practice, studied individuals often occur in multiple data sets in IPMs (i.e., overlap), which is one way for the independence assumption to be violated. Such cases have the potential to dissuade practitioners and limit application of IPMs to solve emerging ecological problems. We assessed precision and bias of demographic rates estimated from IPMs using a complete gradient (0–100%) of overlap among data sets, wide ranges in demographic rates (e.g., survival 0.1–0.8) and sample sizes (100–1,200 individuals) and variable data sources. We compared results from our simulations with those from IPMs constructed using empirical data on tree swallows (Tachycineta bicolor) where data sets either had complete overlap or included different individuals. Contrary to previous investigators, we found no substantive bias or uncertainty in any demographic rate from IPMs derived from data sets with complete overlap. While variability in demographic rates was greater at low sample sizes (i.e., low capture, recapture, and survey probabilities), there were negligible differences in the posterior mean or root mean square error of demographic rates among IPMs with strong dependence vs. complete independence among data sets. Our simulations suggest IPMs can be designed using only capture–recapture data or harvest and capture-recovery data where population estimates are obtained from the same data as survival and productivity data. 
While we encourage researchers to carefully consider the modeling approach best suited for their data sets, our results suggest that dependence among data sets does not generally compromise IPM estimates. Thus, violation of the independence assumption should not dissuade researchers from the application of IPMs in ecological research.

KW - Horvitz-Thompson estimator
KW - capture–mark–recapture model
KW - capture–recovery model
KW - independence among data sets
KW - integrated population model
UR - http://www.scopus.com/inward/record.url?scp=85100116861&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100116861&partnerID=8YFLogxK
U2 - 10.1002/eap.2258
DO - 10.1002/eap.2258
M3 - Article
C2 - 33176007
AN - SCOPUS:85100116861
SN - 1051-0761
VL - 31
JO - Ecological Applications
JF - Ecological Applications
IS - 3
M1 - e2258
ER -
