Scholarly Work - Mathematical Sciences
Permanent URI for this collection: https://scholarworks.montana.edu/handle/1/8719
8 results
Search Results
Item The integrated nested Laplace approximation applied to spatial log-Gaussian Cox process models
(Informa UK Limited, 2023-04) Flagg, Kenneth; Hoegh, Andrew
Spatial point process models are theoretically useful for mapping discrete events, such as plant or animal presence, across space; however, the computational complexity of fitting these models is often a barrier to their practical use. The log-Gaussian Cox process (LGCP) is a point process driven by a latent Gaussian field, and recent advances have made it possible to fit Bayesian LGCP models using approximate methods that facilitate rapid computation. These advances include the integrated nested Laplace approximation (INLA) with a stochastic partial differential equations (SPDE) approach to sparsely approximate the Gaussian field and an extension using pseudodata with a Poisson response. To help link the theoretical results to statistical practice, we provide an overview of INLA for point process data and then illustrate their implementation using freely available data. The analyzed datasets include both a completely observed spatial field and an incomplete data situation. Our well-commented R code is shared in the online supplement. Our intent is to make these methods accessible to the practitioner of spatial statistics without requiring deep knowledge of point process theory.

Item Pathogen spillover driven by rapid changes in bat ecology
(Springer Science and Business Media LLC, 2023-01) Eby, Peggy; Peel, Alison J.; Hoegh, Andrew; Madden, Wyatt; Giles, John R.; Hudson, Peter J.; Plowright, Raina K.
During recent decades, pathogens that originated in bats have become an increasing public health concern. A major challenge is to identify how those pathogens spill over into human populations to generate a pandemic threat [1].
Many correlational studies associate spillover with changes in land use or other anthropogenic stressors [2,3], although the mechanisms underlying the observed correlations have not been identified [4]. One limitation is the lack of spatially and temporally explicit data on multiple spillovers, and on the connections among spillovers, reservoir host ecology and behaviour and viral dynamics. We present 25 years of data on land-use change, bat behaviour and spillover of Hendra virus from Pteropodid bats to horses in subtropical Australia. These data show that bats are responding to environmental change by persistently adopting behaviours that were previously transient responses to nutritional stress. Interactions between land-use change and climate now lead to persistent bat residency in agricultural areas, where periodic food shortages drive clusters of spillovers. Pulses of winter flowering of trees in remnant forests appeared to prevent spillover. We developed integrative Bayesian network models based on these phenomena that accurately predicted the presence or absence of clusters of spillovers in each of the 25 years. Our long-term study identifies the mechanistic connections between habitat loss, climate and increased spillover risk. It provides a framework for examining causes of bat virus spillover and for developing ecological countermeasures to prevent pandemics.

Item Estimating viral prevalence with data fusion for adaptive two‐phase pooled sampling
(Wiley, 2021-10) Hoegh, Andrew; Peel, Alison J.; Madden, Wyatt; Ruiz-Aravena, Manuel; Morris, Aaron; Washburne, Alex D.; Plowright, Raina K.
1. The COVID-19 pandemic has highlighted the importance of efficient sampling strategies and statistical methods for monitoring infection prevalence, both in humans and in reservoir hosts. Pooled testing can be an efficient tool for learning pathogen prevalence in a population.
Typically, pooled testing requires a second-phase retesting procedure to identify infected individuals, but when the goal is solely to learn prevalence in a population, such as a reservoir host, there are more efficient methods for allocating the second-phase samples. 2. To estimate pathogen prevalence in a population, this manuscript presents an approach for data fusion with two-phase testing of pooled samples that allows more efficient estimation of prevalence with fewer samples than traditional methods. The first phase uses pooled samples to estimate the population prevalence and inform efficient strategies for the second phase. To combine information from both phases, we introduce a Bayesian data fusion procedure that combines pooled samples with individual samples for joint inferences about the population prevalence. 3. Data fusion procedures result in more efficient estimation of prevalence than traditional procedures that only use individual samples or a single phase of pooled sampling. 4. The manuscript presents guidance on implementing the first-phase and second-phase sampling plans using data fusion. Such methods can be used to assess the risk of pathogen spillover from reservoir hosts to humans, or to track pathogens such as SARS-CoV-2 in populations.

Item Why Bayesian Ideas Should Be Introduced in the Statistics Curricula and How to Do So
(Informa UK Limited, 2020-09) Hoegh, Andrew
While computing has become an important part of the statistics field, course offerings are still influenced by a legacy of mathematically centric thinking. Due to this legacy, Bayesian ideas are not required for undergraduate degrees and have largely been taught at the graduate level; however, with recent advances in software and emphasis on computational thinking, Bayesian ideas are more accessible. Statistics curricula need to continue to evolve and students at all levels should be taught Bayesian thinking.
This article advocates for adding Bayesian ideas for three groups of students: intro-statistics students, undergraduate statistics majors, and graduate student scientists; and furthermore, provides guidance and materials for creating Bayesian-themed courses for these audiences. Supplementary files for this article are available online.

Item Msocc: Fit and analyse computationally efficient multi‐scale occupancy models in R
(2020-07) Stratton, Christian; Sepulveda, Adam J.; Hoegh, Andrew
1. Environmental DNA (eDNA) sampling is a promising tool for the detection of rare and cryptic taxa, such as aquatic pathogens, parasites and invasive species. Environmental DNA sampling workflows commonly rely on multi-stage hierarchical sampling designs that induce complicated dependencies within the data. This complex dependence structure can be intuitively modelled with Bayesian multi-scale occupancy models. However, current software for such models is computationally demanding, impeding its use. 2. We present an R package, msocc, that implements a data augmentation strategy to fit fully Bayesian, computationally efficient multi-scale occupancy models. The msocc package allows users to fit multi-scale occupancy models, to estimate and visualize posterior summaries of site, sample and replicate-level occupancy, and to compare different models using Bayesian information criterion. Additionally, we provide a supplemental web application that allows users to investigate study design for multi-scale occupancy models and acts as a graphical user interface to the msocc package. 3. The utility of the msocc package is illustrated on a published dataset and the functions in msocc are compared to the primary Bayesian toolkit for multi-scale occupancy modelling, eDNAoccupancy, using various computational benchmarks. These benchmarks indicate that msocc is capable of fitting models 50 times faster than eDNAoccupancy. 4.
We hope that access to software that efficiently fits, analyses and conducts study design investigations for multi-scale occupancy models facilitates their implementation by the research and wildlife management communities.

Item Modeling Partially Surveyed Point Process Data: Inferring Spatial Point Intensity of Geomagnetic Anomalies
(2020-06) Flagg, Kenneth A.; Hoegh, Andrew; Borkowski, John J.
Many former military training sites contain unexploded ordnance (UXO) and require environmental remediation. For the first phase of UXO remediation, locations of geomagnetic anomalies are recorded over a subregion of the study area to infer the spatial intensity of anomalies and identify high concentration areas. The data resulting from this sampling process contain locations of anomalies across narrow regions that are surveyed; however, the surveyed regions only constitute a small proportion of the entire study area. Existing methods for analysis require selecting a window size to transform the partially surveyed point pattern to a point-referenced dataset. To model the partially surveyed point pattern and infer intensity of anomalies at unsurveyed regions, we propose a Bayesian spatial Poisson process model with a Dirichlet process mixture as the inhomogeneous intensity function. A data augmentation step is used to impute anomalies in unsurveyed locations and reconstruct clusters of anomalies that span surveyed and unsurveyed regions. To verify that data augmentation reconstructs the underlying structure of the data, we demonstrate fitting the model to simulated data, using both the full study area and two different sampled subregions.
Finally, we fit the model to data collected at the Victorville Precision Bombing range in southern California to estimate the intensity surface in anomalies per acre.

Item Agent-Based Models for Collective Animal Movement: Proximity-Induced State Switching
(2021-08) Hoegh, Andrew; van Manen, Frank T.; Haroldson, Mark
Animal movement is a complex phenomenon where individual movement patterns can be influenced by a variety of factors including the animal’s current activity, available terrain and habitat, and locations of other animals. Motivated by modeling grizzly bear movement in the Greater Yellowstone Ecosystem, this article presents an agent-based model represented in a state-space framework for collective animal movement. The novel contribution of this work is a collective animal movement model that captures interactions between animals that can trigger changes in movement patterns, such as when a dominant grizzly bear may cause another subordinate bear to temporarily leave an area. The modeling framework enables learning different movement patterns through a state-space representation with particle-MCMC methods for fully Bayesian model fitting and the prediction of future animal movement behaviors. Supplementary materials accompanying this paper appear online.

Item Evaluating and presenting uncertainty in model‐based unconstrained ordination
(2019-12) Hoegh, Andrew; Roberts, David W.
Variability in ecological community composition is often analyzed by recording the presence or abundance of taxa in sample units, calculating a symmetric matrix of pairwise distances or dissimilarities among sample units and then mapping the resulting matrix to a low‐dimensional representation through methods collectively called ordination. Unconstrained ordination only uses taxon composition data, without any environmental or experimental covariates, to infer latent compositional gradients associated with the sampling units.
Commonly, such distance‐based methods have been used for ordination, but recently there has been a shift toward model‐based approaches. Model‐based unconstrained ordinations are commonly formulated using a Bayesian latent factor model that permits uncertainty assessment for parameters, including the latent factors that correspond to gradients in community composition. While model‐based methods have the additional benefit of addressing uncertainty in the estimated gradients, typically the current practice is to report point estimates without summarizing uncertainty. To demonstrate the uncertainty present in model‐based unconstrained ordination, the well‐known spider and dune data sets were analyzed and shown to have large uncertainty in the ordination projections. Hence, to understand the factors that contribute to the uncertainty, simulation studies were conducted to assess the impact of additional sampling units or species to help inform future ordination studies that seek to minimize variability in the latent factors. Accurate reporting of uncertainty is an important part of transparency in the scientific process; thus, a model‐based approach that accounts for uncertainty is valuable. An R package, UncertainOrd, contains visualization tools that accurately represent estimates of the gradients in community composition in the presence of uncertainty.