    Sleep duration, napping behaviors and restless legs syndrome during pregnancy and the trajectories of ultrasonographic measures of fetal growth: Findings from the NICHD Fetal Growth Studies–Singletons
    (Elsevier BV, 2024) Na, Muzi; Shetty, Samidha Sudhakar; Niu, Xiaoyue; Hinkle, Stefanie N.; Zhang, Cuilin; Gao, Xiang
    Objectives. Given the plausible mechanisms and the lack of empirical evidence, the study aims to investigate how gestational sleep behaviors and the development of sleep disorders, such as restless legs syndrome, influence ultrasonographic measures of fetal growth. Methods. The study included 2457 pregnant women from the NICHD Fetal Growth Studies - Singletons (2009-2013), who were recruited between 8-13 gestational weeks and followed up to five times during pregnancy. Women were categorized into six groups based on their total sleep hours and napping frequency. The trajectory of estimated fetal weight from 10-40 weeks was derived from three ultrasonographic measures. Linear mixed effect models were applied to model the estimated fetal weight in relation to self-reported sleep-napping behaviors and restless legs syndrome status, adjusting for age, race and ethnicity, education, parity, prepregnancy body mass index category, infant sex, and prepregnancy sleep-napping behavior. Results. From enrollment to near delivery, pregnant women’s total sleep duration and nap frequency declined, and the frequency of restless legs syndrome symptoms generally increased. No significant differences in estimated fetal weight were observed by sleep-napping group or by restless legs syndrome status. Results remained similar in sensitivity analyses and stratified analyses by women’s prepregnancy body mass index category (normal vs. overweight/obese) or by infant sex. Conclusions. Our data indicate no association between sleep during pregnancy (assessed as total sleep duration and napping frequency) or restless legs syndrome symptoms and fetal growth from weeks 10 to 40 in healthy pregnant women.
    Lattice structures that parameterize regulatory network dynamics
    (Elsevier BV, 2024-08) Gedeon, Tomáš
    We consider two types of models of regulatory network dynamics: Boolean maps and systems of switching ordinary differential equations. Our goal is to construct all models in each category that are compatible with the directed signed graph that describes the network interactions. This leads to consideration of the lattice of monotone Boolean functions (MBFs), the poset of non-degenerate MBFs, and a lattice of chains in these sets. We describe an explicit inductive construction of these posets, where the induction is on the number of inputs of an MBF. Our results allow enumeration of the potential dynamic behavior of the network for both model types, subject to the practical limitation imposed by the size of the lattice of MBFs, which is given by the Dedekind number.
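    The size limitation mentioned in the abstract above can be made concrete with a brute-force count. The sketch below is not the inductive construction from the paper; it simply tests every truth table for monotonicity, and the function names are hypothetical. It is only feasible for very small numbers of inputs, which is itself a reminder of how quickly the Dedekind numbers grow.

```python
from itertools import product


def monotone_boolean_functions(n):
    """Return every monotone Boolean function on n inputs as a truth-table tuple.

    Brute force: enumerate all 2**(2**n) truth tables and keep those where
    x <= y componentwise implies f(x) <= f(y).
    """
    inputs = list(product((0, 1), repeat=n))
    # Index pairs (i, j) with inputs[i] <= inputs[j] componentwise, i != j.
    comparable = [
        (i, j)
        for i, x in enumerate(inputs)
        for j, y in enumerate(inputs)
        if i != j and all(a <= b for a, b in zip(x, y))
    ]
    return [
        table
        for table in product((0, 1), repeat=len(inputs))
        if all(table[i] <= table[j] for i, j in comparable)
    ]


def dedekind_counts(max_n):
    """Number of monotone Boolean functions for n = 0, 1, ..., max_n."""
    return [len(monotone_boolean_functions(n)) for n in range(max_n + 1)]


if __name__ == "__main__":
    print(dedekind_counts(4))  # [2, 3, 6, 20, 168]
```

    Going from 4 to 5 inputs already means scanning 2**32 truth tables with this approach, so a structured construction is needed well before the counts themselves become unmanageable.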
    Modeling of the daily dynamics in bike rental system using weather and calendar conditions: A semi-parametric approach
    (Elsevier BV, 2024-06) Odoom, Christopher; Boateng, Alexander; Mensah, Sarah Fobi; Maposa, Daniel
    This study proposes a more robust methodological approach to modeling the effect of weather and calendar variables on the number of bike rentals. We employ penalized splines quasi-Poisson regression (a semi-parametric model), which involves some form of regularization, like those used in lasso, ridge, and other types of parametric regularization models. We demonstrate that this modeling approach reveals hidden relationships that a pure parametric model fails to identify. The findings show that visibility, windspeed, season, working day, and year all significantly impact bike rentals. Increased rentals are associated with increased visibility and lower wind speed. Rentals are negatively affected by the spring and winter seasons, while working days and the year show positive trends except in a few cases. The analysis of rentals by registered and casual users reveals similar patterns, though the magnitudes of the effects differ. These findings highlight the importance of considering weather and calendar variables when managing and promoting bike-sharing services. The study has implications for bike-sharing system operators and policymakers, suggesting strategies such as improving visibility and wind protection, seasonally tailoring promotional campaigns, targeting non-working days for casual users, and adapting to changing user demands. The study adds to our understanding of the factors that influence bike rentals and provides suggestions for improving the utilization and accessibility of bike-sharing systems.
    Net Primary Production of Ecoregions Across North America in Response to Drought and Wildfires From 2015 to 2022
    (American Geophysical Union, 2024-04) Potter, Christopher; Pass, Stephanie; Ulrich, Rachel
    Ecosystem models are valuable tools to make climate-related assessments of change when ground-based measurements of water and carbon fluxes are not adequate to realistically capture regional variability. The Carnegie-Ames-Stanford Approach (CASA) is one such model based on satellite observations of monthly vegetation cover to estimate net primary production (NPP) of terrestrial ecosystems. CASA model predictions from 2015 to 2022 revealed several notable high and low periods in growing season NPP totals in certain biomes. Both Temperate Broadleaf and Boreal Forest production shifted from relatively high average NPP values in 2015 through 2019 to lower levels in 2020, typically representing a loss of 10%–14% of growing season NPP flux. This rapid decline in growing season NPP from 2019 to 2020–2021 was also estimated for the Temperate Grasslands and Savanna, Temperate Conifer Forest, and Tundra biomes. In contrast to the climate patterns in the temperate biomes that developed into severe widespread drought in 2020 and 2021 due to low precipitation totals and extreme hot temperatures, growing season NPP in the Tundra biome was depressed in these same years by colder temperature induced drought conditions at the high latitudes of North America. Drought severity classes were closely associated with different levels of decline in NPP in most biomes. Trends in NPP in areas of the largest wildfires in North America that burned between 2012 and 2021 were examined to assess recovery of vegetation and the resiliency of ecosystems during extreme drought periods.
    Variable-coefficient parabolic theory as a high-dimensional limit of elliptic theory
    (Springer Science and Business Media LLC, 2024-01) Davey, Blair; Vega Garcia, Mariana Smit
    This paper continues the study initiated in Davey (Arch Ration Mech Anal 228:159–196, 2018), where a high-dimensional limiting technique was developed and used to prove certain parabolic theorems from their elliptic counterparts. In this article, we extend these ideas to the variable-coefficient setting. This generalized technique is demonstrated through new proofs of three important theorems for variable-coefficient heat operators, one of which establishes a result that is, to the best of our knowledge, also new. Specifically, we give new proofs of $L^2 \to L^2$ Carleman estimates and the monotonicity of Almgren-type frequency functions, and we prove a new monotonicity of Alt–Caffarelli–Friedman-type functions. The proofs in this article rely only on their related elliptic theorems and a limiting argument. That is, each parabolic theorem is proved by taking a high-dimensional limit of a related elliptic result.
    Joint Spatial Modeling Bridges the Gap Between Disparate Disease Surveillance and Population Monitoring Efforts Informing Conservation of At-risk Bat Species
    (Springer Science and Business Media LLC, 2024-02) Stratton, Christian; Irvine, Kathryn M.; Banner, Katharine M.; Almberg, Emily S.; Bachen, Dan; Smucker, Kristina
    White-Nose Syndrome (WNS) is a wildlife disease that has decimated hibernating bats since its introduction in North America in 2006. As the disease spreads westward, assessing the potentially differential impact of the disease on western bat species is an urgent conservation need. The statistical challenge is that the disease surveillance and species response monitoring data are not co-located, available at different spatial resolutions, non-Gaussian, and subject to observation error requiring a novel extension to spatially misaligned regression models for analysis. Previous work motivated by epidemiology applications has proposed two-step approaches that overcome the spatial misalignment while intentionally preventing the human health outcome from informing estimation of exposure. In our application, the impacted animals contribute to spreading the fungus that causes WNS, motivating development of a joint framework that exploits the known biological relationship. We introduce a Bayesian, joint spatial modeling framework that provides inferences about the impact of WNS on measures of relative bat activity and accounts for the uncertainty in estimation of WNS presence at non-surveyed locations. Our simulations demonstrate that the joint model produced more precise estimates of disease occurrence and unbiased estimates of the association between disease presence and the count response relative to competing two-step approaches. Our statistical framework provides a solution that leverages disparate monitoring activities and informs species conservation across large landscapes. Stan code and documentation are provided to facilitate access and adaptation for other wildlife disease applications.
    Coding Code: Qualitative Methods for Investigating Data Science Skills
    (Informa UK Limited, 2023-11) Theobold, Allison S.; Wickstrom, Megan H.; Hancock, Stacey A.
    Despite the elevated importance of Data Science in Statistics, there exists limited research investigating how students learn the computing concepts and skills necessary for carrying out data science tasks. Computer Science educators have investigated how students debug their own code and how students reason through foreign code. While these studies illuminate different aspects of students’ programming behavior or conceptual understanding, a method has yet to be employed that can shed light on students’ learning processes. This type of inquiry necessitates qualitative methods, which allow for a holistic description of the skills a student uses throughout the computing code they produce, the organization of these descriptions into themes, and a comparison of the emergent themes across students or across time. In this article we share how to conceptualize and carry out the qualitative coding process with students’ computing code. Drawing on the Block Model to frame our analysis, we explore two types of research questions which could be posed about students’ learning. Supplementary materials for this article are available online.
    Ribosome Abundance Control in Prokaryotes
    (Springer Science and Business Media LLC, 2023-10) Shea, Jacob; Davis, Lisa; Quaye, Bright; Gedeon, Tomas
    Cell growth is an essential phenotype of any unicellular organism and it crucially depends on precise control of protein synthesis. We construct a model of the feedback mechanisms that regulate abundance of ribosomes in E. coli, a prototypical prokaryotic organism. Since ribosomes are needed to produce more ribosomes, the model includes a positive feedback loop central to the control of cell growth. Our analysis of the model shows that there can be only two coexisting equilibrium states across all 23 parameters. This precludes the existence of hysteresis, suggesting that the ribosome abundance changes continuously with parameters. These states are related by a transcritical bifurcation, and we provide an analytic formula for parameters that admit either state.
    Leveraging social networks for identification of people living with HIV who are virally unsuppressed
    (Wolters Kluwer Health, Inc., 2023-10) Cummins, Breschine; Johnson, Kara; Schneider, John A.; Del Vicchio, Natasha; Moshiri, Niema; Wertheim, Joel O.; Goyal, Ravi; Skaathun, Britt
    Objectives: This study investigates primary peer-referral engagement (PRE) strategies to assess which strategy results in engaging higher numbers of people living with HIV (PLWH) who are virally unsuppressed. Design: We develop a modeling study that simulates an HIV epidemic (transmission, disease progression, and viral evolution) over 6 years using an agent-based model followed by simulating PRE strategies. We investigate two PRE strategies where referrals are based on social network strategies (SNS) or sexual partner contact tracing (SPCT). Methods: We parameterize, calibrate, and validate our study using data from Chicago on Black sexual minority men to assess these strategies for a population with high incidence and prevalence of HIV. For each strategy we calculate the number of PLWH recruited who are undiagnosed or out-of-care and the number of direct or indirect transmissions. Results: SNS and SPCT identified 256.5 (95% C.I.: [234,279]) and 15 (95% C.I.: [7,27]) PLWH, respectively. Of these, SNS identified 159 (95% C.I.: [142,177]) PLWH out-of-care and 32 (95% C.I.: [21,43]) PLWH undiagnosed compared to 9 (95% C.I.: [3,18]) and 2 (95% C.I.: [0,5]) for SPCT. SNS identified 15.5 (95% C.I.: [6,25]) and 7.5 (95% C.I.: [2,11]) indirect and direct transmission pairs, while SPCT identified 6 (95% C.I.: [0,8]) and 5 (95% C.I.: [0,8]), respectively. Conclusions: With no testing constraints, SNS is the more effective strategy to identify undiagnosed and out-of-care PLWH. Neither strategy is successful at identifying sufficient indirect or direct transmission pairs to investigate transmission networks.
    Detecting punctuated evolution in SARS-CoV-2 over the first year of the pandemic
    (Frontiers Media SA, 2023-02) Surya, Kevin; Gardner, Jacob D.; Organ, Chris L.
    The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) evolved slowly over the first year of the Coronavirus Disease 19 (COVID-19) pandemic with differential mutation rates across lineages. Here, we explore how this variation arose. Whether evolutionary change accumulated gradually within lineages or during viral lineage branching is unclear. Using phylogenetic regression models, we show that ~13% of SARS-CoV-2 genomic divergence up to May 2020 is attributable to lineage branching events (punctuated evolution). The net number of branching events along lineages predicts ~5% of the deviation from the strict molecular clock. We did not detect punctuated evolution in SARS-CoV-1, possibly due to the small sample size, and in sarbecovirus broadly, likely due to a different evolutionary process altogether. Punctuation in SARS-CoV-2 is probably neutral because most mutations were not positively selected and because the strength of the punctuational effect remained constant over time, at least until May 2020, and across continents. However, the small punctuational contribution to SARS-CoV-2 diversity is consistent with the founder effect arising from narrow transmission bottlenecks. Therefore, punctuation in SARS-CoV-2 may represent the macroevolutionary consequence (rate variation) of a microevolutionary process (transmission bottleneck).
    Lifetime alcohol consumption patterns and young-onset breast cancer by subtype among Non-Hispanic Black and White women in the Young Women’s Health History Study
    (Springer Nature, 2023-10) Hirko, Kelly A.; Lucas, Darek R.; Pathak, Dorothy R.; Hamilton, Ann S.; Post, Lydia M.; Ihenacho, Ugonna; Carnegie, Nicole Bohme; Houang, Richard T.; Schwartz, Kendra; Velie, Ellen M.
    Purpose. The role of alcohol in young-onset breast cancer (YOBC) is unclear. We examined associations between lifetime alcohol consumption and YOBC in the Young Women’s Health History Study, a population-based case–control study of breast cancer among Non-Hispanic Black and White women < 50 years of age. Methods. Breast cancer cases (n = 1,812) were diagnosed in the Metropolitan Detroit and Los Angeles County SEER registry areas, 2010–2015. Controls (n = 1,381) were identified through area-based sampling and were frequency-matched to cases by age, site, and race. Alcohol consumption and covariates were collected from in-person interviews. Weighted multivariable logistic regression was conducted to calculate adjusted odds ratios (aOR) and 95% confidence intervals (CI) for associations between alcohol consumption and YOBC overall and by subtype (Luminal A, Luminal B, HER2, or triple negative). Results. Lifetime alcohol consumption was not associated with YOBC overall or with subtypes (all ptrend ≥ 0.13). Similarly, alcohol consumption in adolescence, young and middle adulthood was not associated with YOBC (all ptrend ≥ 0.09). An inverse association with triple-negative YOBC, however, was observed for younger age at alcohol use initiation (< 18 years vs. no consumption), aOR (95% CI) = 0.62 (0.42, 0.93). No evidence of statistical interaction by race or household poverty was observed. Conclusions. Our findings suggest alcohol consumption has a different association with YOBC than postmenopausal breast cancer—lifetime consumption was not linked to increased risk and younger age at alcohol use initiation was associated with a decreased risk of triple-negative YOBC. Future studies on alcohol consumption in YOBC subtypes are warranted.
    Using physical simulations to motivate the use of differential equations in models of disease spread
    (Informa UK Limited, 2023-09) Arnold, Elizabeth G.; Burroughs, Elizabeth A.; Burroughs, Owen; Carlson, Mary Alice
    The SIR model is a differential equations based model of the spread of an infectious disease that compartmentalises individuals in a population into one of three states: those who are susceptible to a disease (S), those who are infected and can transmit the disease to others (I), and those who have recovered from the disease and are now immune (R). This Classroom Note describes how to initiate teaching the SIR model with two concrete physical simulations to provide students with first-hand experience with some of the nuanced behaviour of how an infectious disease spreads through a closed population. One simulation physically models disease spread by the exchange of fluids, using pH to simulate infection. A second simulation incorporates randomness through the use of a probability game to keep track of the state of each individual at each time step. Both simulations invite students to ask questions about what factors influence disease spread. The concrete experience from the physical simulations enables students to make connections to the abstract mathematical representation of the SIR model and discuss the sources of stochasticity present in the spread of an infectious disease.
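    For readers who want to see the abstract mathematical representation that the physical simulations lead up to, here is a minimal sketch of the deterministic SIR equations, integrated with a forward Euler step. The parameter names `beta` (transmission rate) and `gamma` (recovery rate) and the example values are assumptions chosen for illustration; they do not come from the Classroom Note.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt, steps):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I,
    dR/dt = gamma*I with forward Euler in a closed population.

    Returns a list of (S, I, R) tuples, one per time step (including t = 0).
    """
    n = s0 + i0 + r0  # total population, constant in a closed population
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical class of 100 students with one initial infection.
    run = simulate_sir(beta=0.3, gamma=0.1, s0=99, i0=1, r0=0, dt=0.1, steps=1000)
    peak_infected = max(i for _, i, _ in run)
    print(f"peak number infected: {peak_infected:.1f}")
```

    Unlike the probability-game simulation, this version has no randomness, so comparing the two is one way to prompt discussion of where stochasticity enters.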
    Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
    (The Royal Society, 2023-07) Liu, Yuxuan; McCalla, Scott G.; Schaeffer, Hayden
    Particle dynamics and multi-agent systems provide accurate dynamical models for studying and forecasting the behaviour of complex interacting systems. They often take the form of a high-dimensional system of differential equations parameterized by an interaction kernel that models the underlying attractive or repulsive forces between agents. We consider the problem of constructing a data-based approximation of the interacting forces directly from noisy observations of the paths of the agents in time. The learned interaction kernels are then used to predict the agents’ behaviour over a longer time interval. The approximation developed in this work uses a randomized feature algorithm and a sparse randomized feature approach. Sparsity-promoting regression provides a mechanism for pruning the randomly generated features which was observed to be beneficial when one has limited data, in particular, leading to less overfitting than other approaches. In addition, imposing sparsity reduces the kernel evaluation cost which significantly lowers the simulation cost for forecasting the multi-agent systems. Our method is applied to various examples, including first-order systems with homogeneous and heterogeneous interactions, second-order homogeneous systems, and a new sheep swarming system.
    Estimating contact network properties by integrating multiple data sources associated with infectious diseases
    (Wiley, 2023-07) Goyal, Ravi; Carnegie, Nicole; Slipher, Sally; Turk, Philip; Little, Susan J.; De Gruttola, Victor
    To effectively mitigate the spread of communicable diseases, it is necessary to understand the interactions that enable disease transmission among individuals in a population; we refer to the set of these interactions as a contact network. The structure of the contact network can have profound effects on both the spread of infectious diseases and the effectiveness of control programs. Therefore, understanding the contact network permits more efficient use of resources. Measuring the structure of the network, however, is a challenging problem. We present a Bayesian approach to integrate multiple data sources associated with the transmission of infectious diseases to more precisely and accurately estimate important properties of the contact network. An important aspect of the approach is the use of the congruence class models for networks. We conduct simulation studies modeling pathogens resembling SARS-CoV-2 and HIV to assess the method; subsequently, we apply our approach to HIV data from the University of California San Diego Primary Infection Resource Consortium. Based on simulation studies, we demonstrate that the integration of epidemiological and viral genetic data with risk behavior survey data can lead to large decreases in mean squared error (MSE) in contact network estimates compared to estimates based strictly on risk behavior information. This decrease in MSE is present even in settings where the risk behavior surveys contain measurement error. Through these simulations, we also highlight certain settings where the approach does not improve MSE.
    Resource allocation accounts for the large variability of rate-yield phenotypes across bacterial strains
    (eLife Sciences Publications, Ltd, 2023-05) Baldazzi, Valentina; Ropers, Delphine; Gouzé, Jean-Luc; Gedeon, Tomas; de Jong, Hidde
    Different strains of a microorganism growing in the same environment display a wide variety of growth rates and growth yields. We developed a coarse-grained model to test the hypothesis that different resource allocation strategies, corresponding to different compositions of the proteome, can account for the observed rate-yield variability. The model predictions were verified by means of a database of hundreds of published rate-yield and uptake-secretion phenotypes of Escherichia coli strains grown in standard laboratory conditions. We found a very good quantitative agreement between the range of predicted and observed growth rates, growth yields, and glucose uptake and acetate secretion rates. These results support the hypothesis that resource allocation is a major explanatory factor of the observed variability of growth rates and growth yields across different bacterial strains. An interesting prediction of our model, supported by the experimental data, is that high growth rates are not necessarily accompanied by low growth yields. The resource allocation strategies enabling high-rate, high-yield growth of E. coli lead to a higher saturation of enzymes and ribosomes, and thus to a more efficient utilization of proteomic resources. Our model thus contributes to a fundamental understanding of the quantitative relationship between rate and yield in E. coli and other microorganisms. It may also be useful for the rapid screening of strains in metabolic engineering and synthetic biology.
    Adversary decision-making using Markov models
    (SPIE, 2023-06) Andreas, Elizabeth; Dorismond, Jessica; Gamarra, Marco
    This study conducts three experiments on adversary decision-making modeled as a graph. Each experiment has the overall goal of understanding how to exploit an adversary’s decision-making in order to obtain desired outcomes, as well as specific goals unique to each experiment. The first experiment models adversary decision-making using an absorbing Markov chain (AMC). A sensitivity analysis of states (nodes in the graph) and actions (edges in the graph) is conducted, which informs how downstream adversary decisions could be manipulated. The next experiment uses a Markov decision process (MDP). Assuming the adversary is initially blind to the rewards they will receive when they take an action, a Q-learning algorithm is used to determine the sequence of actions that maximizes the adversary’s rewards (called an optimum policy). This experiment gives insight into the possible decision-making of an adversary. Lastly, in the third experiment a two-player Markov game is developed, played by an agent (friend) and the adversary (foe). The agent’s goal is to decrease the overall rewards the adversary receives when it follows the optimum policy. All experiments are demonstrated using specific examples.
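    As a rough illustration of the kind of quantity an absorbing Markov chain analysis produces, the sketch below computes absorption probabilities B by solving (I - Q) B = R, where Q holds transient-to-transient transition probabilities and R holds transient-to-absorbing ones. The four-state chain and the function name are hypothetical and are not taken from the study; exact rational arithmetic is used so the result can be checked by hand.

```python
from fractions import Fraction


def absorption_probabilities(Q, R):
    """Solve (I - Q) B = R by Gauss-Jordan elimination.

    Q[i][j]: probability of moving from transient state i to transient state j.
    R[i][k]: probability of moving from transient state i to absorbing state k.
    Returns B, where B[i][k] is the probability of eventually being absorbed
    in state k when starting from transient state i.
    """
    t = len(Q)
    # Augmented matrix [I - Q | R] with exact rational entries.
    m = [
        [Fraction(int(i == j)) - Fraction(Q[i][j]) for j in range(t)]
        + [Fraction(x) for x in R[i]]
        for i in range(t)
    ]
    for col in range(t):
        pivot = next(row for row in range(col, t) if m[row][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [x / pivot_value for x in m[col]]
        for row in range(t):
            if row != col and m[row][col] != 0:
                factor = m[row][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [row[t:] for row in m]


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Hypothetical chain: two transient decision states, two absorbing outcomes.
    Q = [[0, half], [half, 0]]
    R = [[half, 0], [0, half]]
    for row in absorption_probabilities(Q, R):
        print([str(p) for p in row])  # ['2/3', '1/3'] then ['1/3', '2/3']
```

    Re-running the computation after nudging a single entry of Q or R is one simple way to probe how sensitive the final outcomes are to an individual edge.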
    Numerical analysis of a time filtered scheme for a linear hyperbolic equation inspired by DNA transcription modeling
    (Elsevier BV, 2023-09) Boatman, K.; Davis, L.; Pahlevani, F.; Rajan, T. Susai
    The focus of this paper is the development and analysis of a time filtering process for a linear hyperbolic equation motivated by the modeling of the transcription of ribosomal RNA in bacteria (Davis et al., 2021). We demonstrate that a time filter technique can be combined with the classical upwind scheme to produce a new explicit scheme with virtually no dissipation introduced by the method, and the filter can be implemented with minimal computational cost. The analysis shows that the filtered scheme gives the practitioner the ability to adjust the filtering so the dissipation can be made arbitrarily small over a range of time step choices. The analysis also indicates that the filtered scheme has a smaller local truncation error when compared to that of the original upwind method. A CFL condition for the new algorithm is derived, and it is shown to depend explicitly on the filter parameter. Numerical computations illustrate stability and convergence as well as dissipation and dispersion assessments of the filtered upwind scheme.
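    To show the dissipation that the filtered scheme is designed to reduce, the sketch below implements only the classical first-order upwind method for u_t + a u_x = 0 with a > 0 on a periodic grid. The paper's time filter is not reproduced here, since the abstract does not give its form; the grid size, step count, and function names are assumptions made for illustration.

```python
import math


def upwind_step(u, cfl):
    """One classical upwind step for u_t + a u_x = 0 with a > 0.

    cfl = a * dt / dx. Periodic boundary: u[-1] is the left neighbour of u[0].
    """
    return [u[j] - cfl * (u[j] - u[j - 1]) for j in range(len(u))]


def advect(u0, cfl, steps):
    """Apply `steps` upwind updates to the initial profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = upwind_step(u, cfl)
    return u


if __name__ == "__main__":
    n = 50
    u0 = [math.sin(2 * math.pi * j / n) for j in range(n)]
    damped = advect(u0, cfl=0.5, steps=100)
    # The wave amplitude shrinks even though the exact solution only translates.
    print(max(abs(v) for v in u0), max(abs(v) for v in damped))
```

    At a CFL number of exactly 1 the update reduces to a pure shift and no amplitude is lost, while for values between 0 and 1 the amplitude decays at every step; this is the numerical dissipation the abstract refers to.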
    The integrated nested Laplace approximation applied to spatial log-Gaussian Cox process models
    (Informa UK Limited, 2023-04) Flagg, Kenneth; Hoegh, Andrew
    Spatial point process models are theoretically useful for mapping discrete events, such as plant or animal presence, across space; however, the computational complexity of fitting these models is often a barrier to their practical use. The log-Gaussian Cox process (LGCP) is a point process driven by a latent Gaussian field, and recent advances have made it possible to fit Bayesian LGCP models using approximate methods that facilitate rapid computation. These advances include the integrated nested Laplace approximation (INLA) with a stochastic partial differential equations (SPDE) approach to sparsely approximate the Gaussian field and an extension using pseudodata with a Poisson response. To help link the theoretical results to statistical practice, we provide an overview of INLA for point process data and then illustrate their implementation using freely available data. The analyzed datasets include both a completely observed spatial field and an incomplete data situation. Our well-commented R code is shared in the online supplement. Our intent is to make these methods accessible to the practitioner of spatial statistics without requiring deep knowledge of point process theory.
    The Jordan–Chevalley decomposition for 𝐺-bundles on elliptic curves
    (American Mathematical Society, 2022-12) Frăţilă, Dragoş; Gunningham, Sam; Li, Penghui
    We study the moduli stack of degree $0$ semistable $G$-bundles on an irreducible curve $E$ of arithmetic genus $1$, where $G$ is a connected reductive group in arbitrary characteristic. Our main result describes a partition of this stack indexed by a certain family of connected reductive subgroups $H$ of $G$ (the $E$-pseudo-Levi subgroups), where each stratum is computed in terms of $H$-bundles together with the action of the relative Weyl group. We show that this result is equivalent to a Jordan–Chevalley theorem for such bundles equipped with a framing at a fixed basepoint. In the case where $E$ has a single cusp (respectively, node), this gives a new proof of the Jordan–Chevalley theorem for the Lie algebra $\mathfrak {g}$ (respectively, algebraic group $G$). We also provide a Tannakian description of these moduli stacks and use it to show that if $E$ is not a supersingular elliptic curve, the moduli of framed unipotent bundles on $E$ are equivariantly isomorphic to the unipotent cone in $G$. Finally, we classify the $E$-pseudo-Levi subgroups using the Borel–de Siebenthal algorithm, and compute some explicit examples.
    Combining Dynamic Bayesian Networks and Continuous Time Bayesian Networks for Diagnostic and Prognostic Modeling
    (IEEE, 2022-08) Schupbach, Jordan; Pryor, Elliott; Webster, Kyle; Sheppard, John
    The problem of performing general prognostics and health management (PHM), especially in electronic systems, continues to present significant challenges. The low availability of failure data makes learning generalized models difficult, and constructing generalized models during the design phase often requires a level of understanding of the failure mechanism that eludes the designers. In this paper, we present a new, generalized approach to PHM based on two commonly available probabilistic models, Bayesian Networks and Continuous-Time Bayesian Networks, and pose the PHM problem from the perspective of risk mitigation rather than failure prediction. We describe the tools and process for employing these tools in the hopes of motivating new ideas for investigating how best to advance PHM in the aerospace industry.
