Theses and Dissertations at Montana State University (MSU)
Permanent URI for this community: https://scholarworks.montana.edu/handle/1/732
54 results
Search Results
Item String analysis and algorithms with genomic applications (Montana State University - Bozeman, College of Engineering, 2024) Liyana Ralalage, Adiesha Lakshan Liyanage; Chairperson, Graduate Committee: Binhai Zhu
In biology, genome rearrangements are mutations that change the gene content of a genome or the arrangement of the genes on a genome. Understanding how genome rearrangements occur in a genome can help us to understand the evolutionary history of extant species, improve genetic engineering, and understand the basis of genetic diseases. In this dissertation, we explored four problems related to genome partitioning and tandem duplication and deletion rearrangement operations. Our interest was focused on determining how difficult it is to solve these problems and identifying efficient algorithms to solve them. The proposed problems were formulated as string problems and then analyzed using complexity theory. In the first chapter, we explored several variations of the F-strip recovery problem, called XSR-F and GSR-F, and their complexity under different parameters. We proved that the XSR-F problem is hard to solve unless we restrict the allowed block sizes to one size. We provided a polynomial time algorithm for GSR-F under a fixed alphabet and fixed F. In the second and third chapters, we introduced two string problems named longest letter-duplicated subsequence (LLDS) and longest subsequence-repeated subsequence (LSRS), formulated as alternatives to the tandem-duplication distance problem that allow information to be extracted about segments of genes that may have undergone tandem duplication. We analyzed the complexity of their variations and devised efficient algorithms to solve them.
We proved that constrained versions of the LLDS and LSRS problems are NP-hard for parameter d ≥ 4, while the general versions are polynomially solvable, which hints that any variations closer to the original tandem duplication distance problem are still hard to solve. In the final chapter, we delved into two heuristic algorithms designed to compute genomic distance between two mitochondrial genomes and a heuristic algorithm to predict ancestral gene order under the TDRL (tandem-duplication random loss) model. We improved the previously studied method developed for permutation strings by tweaking heuristic choices aimed at calculating the minimum distance between two genomes to apply to non-permutation strings. These heuristic algorithms were implemented and tested on a real-world mitochondrial genome data set.

Item Studies in alternative theories of gravity and advanced data analysis (Montana State University - Bozeman, College of Letters & Science, 2024) Gupta, Toral; Chairperson, Graduate Committee: Neil J. Cornish; This is a manuscript style paper that includes co-authored chapters.
The field of gravitational wave astronomy is generating groundbreaking findings, yielding unique insights into some of the most extraordinary phenomena in the universe and providing invaluable information for testing the principles of general relativity. All gravitational wave signals detected so far appear to come from compact binaries: black holes and neutron stars. We use information from these sources to probe strong fields of gravity and to constrain modified theories of gravity. However, relying solely on template-based searches for known astrophysical sources biases our gravitational wave signal search towards well-modeled systems, potentially overlooking unpredicted sources with limited theoretical models and hindering the extraction of new physics.
Further work in this thesis focuses on building improved signal and noise models to enhance our capability of detecting gravitational signals both within and beyond the constraints of theoretical predictions. This includes the introduction of new basis functions with added modifications to develop a signal-agnostic waveform reconstruction model using Bayesian inference. Additionally, this study discusses improvements in the speed and performance of the BayesWave trans-dimensional Bayesian spectral estimation algorithm, which include implementing a low-latency analysis and various enhancements to the algorithm itself. In essence, this study is centered on developing a comprehensive understanding, both theoretical and observational, of astrophysical objects along with the spacetime that governs their dynamics.

Item From curves to words and back again: geometric computation of minimum-area homotopy (Montana State University - Bozeman, College of Engineering, 2024) McCoy, Bradley Allen; Chairperson, Graduate Committee: Brittany Fasy
Let gamma be a generic closed curve in the plane. The area of a homotopy is the area swept by the homotopy. We consider the problem of computing the minimum null-homotopy area of gamma. Samuel Blank, in his 1967 Ph.D. thesis, determined whether gamma is self-overlapping by geometrically constructing a combinatorial word from gamma. More recently, Zipei Nie, in an unpublished manuscript, computed the minimum homotopy area of gamma by constructing a combinatorial word algebraically. We provide a unified framework for working with both words and determine the settings under which Blank's word and Nie's word are equivalent. Using this equivalence, we give a new geometric proof for the correctness of Nie's algorithm. Unlike previous work, our proof is constructive, which allows us to naturally compute the actual homotopy that realizes the minimum area.
Furthermore, we contribute to the theory of self-overlapping curves by providing the first polynomial-time algorithm to compute a self-overlapping decomposition of any closed curve gamma with minimum area. Next, we describe the first polynomial-time implementation of an algorithm to compute the minimum homotopy area of a piecewise linear closed curve in the plane. We discuss how minimum homotopy area can be used as a similarity measure for curves and include experiments that compare the runtime of our algorithm to an implementation of the Fréchet distance. We then extend our algorithm for computing the minimum homotopy area in the plane to homotopic, non-intersecting, non-contractible curves on an orientable surface with positive genus. Finally, we consider the inverse problem of determining which combinatorial Blank words correspond to closed curves in the plane. We solve a special case of this problem and give an exponential-time algorithm for the general case.

Item Developing and implementing a fall prevention algorithm to improve patient safety: a quality improvement project (Montana State University - Bozeman, College of Nursing, 2024) Doyle, Tera Ann; Chairperson, Graduate Committee: Elizabeth A. Johnson; This is a manuscript style paper that includes co-authored chapters.
Statement of the problem: Approximately one million falls occur in U.S. hospitals every year. Inpatient falls are the leading cause of preventable hospital-acquired adverse events, accounting for 70% of all accidents reported by hospitals. Inpatient falls have a significant impact on healthcare costs, due to increased patient morbidity, mortality and limited reimbursement. Fall prevention clinical practice guidelines lack consensus regarding effective fall prevention interventions. Inpatient falls continue to be a major concern across the globe despite extensive prevention efforts.
Methods: A scoping literature review was conducted to explore the body of evidence available regarding known causes, impact, fall prevention strategies and interventions. A search was conducted across multiple databases using keyword terms related to inpatient falls. Results were screened for inclusion eligibility based on several factors to produce a current, comprehensive, evidence-based review of the known literature. Results: The evidence within the literature is extensive regarding known causes and impacts but variable regarding effective solutions and prevention strategies. Inpatient falls are multifactorial, complex and often caused by non-modifiable risk factors. Implementation, interventions and risk assessment tools vary dramatically across and within organizations, making comparison of research findings difficult. Clinical practice guidelines offer vague and varying recommendations for fall prevention programs. There is emerging evidence that multifactorial approaches that incorporate evidence-based risk assessment tools, risk stratification and tailored interventions are the most effective strategy currently being utilized. Conclusions: Inpatient falls continue to be a concern due to the dramatic impact for both patients and organizations. The lack of consensus in evidence and guidance perpetuates this complex problem. Multifactorial fall prevention programs have emerged as the most effective strategy for reducing and preventing inpatient falls. Quality improvement projects which utilize multifactorial approaches are supported by the evidence within the literature as a cost-effective strategy to prevent and reduce inpatient falls.

Item An exploration of whole-genome comparative genomic strategies for polyploid crop genomes (Montana State University - Bozeman, The Graduate School, 2022) Reynolds, Gillian Lucy; Co-chairs, Graduate Committee: Brendan Mumey and Jennifer A. Lachowiec
Genome comparison for large and complex polyploid crop genomes is a highly complex venture, yet it is critical. Given a rising demand for food coupled with yield-impacting resource limitations and rapidly changing global climates, it has never been more important to characterise the underlying genetic variation which underpins traits of agronomic interest. In this work, the problem of polyploid genome comparison is explored at three levels. The first chapter characterizes the sequence relationships that exist between, and within, polyploid genomes. This is achieved by repurposing a metagenomic strategy for rapid and efficient genome sequence classification. The second chapter then utilizes the identified subgenome-specific k-mer profiles for recruitment of assembled contigs and scaffolds previously only recruitable via more resource-intensive optical mapping strategies. This makes a greater proportion of the assembled data usable for downstream variant analysis. The third chapter then zooms into the problem of how to identify variants from large-scale sequencing data while minimizing bias and computational costs. A critical assessment of modern variant calling for crop genomes is performed, and an algorithm to further extend a new, resource-efficient approach for large-scale comparative genomics is presented and critically evaluated. In all, the work presented herein takes a top-down journey from genome- and subgenome-level comparative genomics all the way to identifying base-pair resolution strategies that are capable of revealing the underlying sequences responsible for keeping the world fed.

Item Increasing the PPI deprescribing rate at a transitional care unit (Montana State University - Bozeman, College of Nursing, 2023) Yu, Linfei; Chairperson, Graduate Committee: Sandra Benavides-Vaello
Background: PPIs are overprescribed worldwide, especially among geriatric populations. The long-term use of PPIs is associated with many adverse effects.
This project aims to utilize deprescribing algorithms to assist healthcare providers in deprescribing inappropriate PPI prescriptions for patients at a 17-bed transitional care unit within a skilled nursing facility. Methods: A seven-step problem-solving model was used for this project. Baseline assessment included a review of patient electronic medical records (EMRs) two months before the intervention. Admission and discharge notes were reviewed to identify the baseline rate of patients with PPI prescriptions and the PPI deprescribing rate by discharge. A literature review was conducted to identify interventions that focused on providers deprescribing PPIs. EMRs were reviewed for the two months following the interventions to identify the PPI deprescribing rate. Interventions: Education, including the provision of the deprescribing algorithm, was provided to address the knowledge gap. A post-education survey was sent to providers to identify readiness and motivation levels for deprescribing PPIs. Patient education pamphlets regarding PPIs were made to enhance the success rate for deprescribing PPIs. Education was also provided to nursing staff to help distribute PPI education pamphlets to patients and remind healthcare providers to review PPI prescriptions. Results: Zero healthcare providers responded to the readiness survey. Following the interventions, 5 patients out of 20 on PPIs were deprescribed, compared to 0 out of 11 patients before the interventions. The five patients were deprescribed from PPIs by the same healthcare provider, who responded to the follow-up emails after the interventions. Conclusions: The project's objectives were not achieved due to healthcare providers' lack of response to the readiness survey, and the deprescription rate was 25% post-intervention at the TCU compared to the aim of 30%.
To improve the chances of success in future QI projects, it is recommended to encourage the participation of healthcare providers and nursing staff through face-to-face education and allow more project time to thoroughly evaluate the impact of chosen interventions.

Item The impact of algorithmic risk assessment tool legislation on racial disparities in criminal sentencing (Montana State University - Bozeman, College of Agriculture, 2023) Brauch, Hannah Clare; Chairperson, Graduate Committee: Wendy A. Stock
The prevailing presence of racial disparities in criminal sentencing motivated the introduction of algorithmic risk assessment tools (RATs) in the U.S. judicial system. These tools provide judges with an algorithm-generated risk score and sentencing recommendation to consider in their decisions. Although this technology is well-intentioned, researchers find that RATs produce racial disparities in their outputs. My research examines the impact of state laws regulating the use of RATs on racial disparities in sentence length and likelihood of receiving probation. Utilizing Gardner's (2021) two-stage differences-in-differences methodology, I exploit the natural experiment arising from 29 states passing some form of RAT law at different times. I find that the impact of RAT laws depends on the components of the state's RAT law, and that the effect varies by racial group. My results suggest that RAT laws significantly decrease the racial sentencing disparity for Hispanics, but increase the disparity for Blacks.
Although my results are somewhat sensitive to specification, they still bear critical policy implications regarding the use of RATs in the judicial system.

Item Using sparse coding as a preprocessing technique for insect detection in pulsed LIDAR data (Montana State University - Bozeman, College of Engineering, 2022) Zsidisin, Connor Reece; Chairperson, Graduate Committee: Brad Whitaker
This research proposes using sparse coding as a preprocessing technique on insect lidar data. This preprocessing technique will be used in conjunction with the Adaptive Boosting (AdaBoost), Random UnderSampling Boosting (RUSBoost), and neural network algorithms to automatically detect insects. The project aims to increase the effectiveness of these algorithms by using new images created by sparse coding. The K-Singular Value Decomposition (KSVD) algorithm will be used to train a dictionary on images that contain the majority class (non-insects). This trained dictionary will be used along with Orthogonal Matching Pursuit (OMP) to reconstruct all lidar images. The difference between the original image and the reconstructed image will be taken and processed by the feature extraction function and then used to train and test the models. Using a complete and an overcomplete dictionary, our results show that the algorithms are able to detect insects at a higher rate. Using an overcomplete dictionary, we are able to classify 93.18% of insect-containing images in the testing dataset.
Using the complete dictionary, we were able to maintain 99.70% of non-insect images while increasing the percentage of insects classified to 84.09%.

Item Flow decomposition algorithms for multiassembly problems (Montana State University - Bozeman, College of Engineering, 2022) Williams, Lucia Gean; Chairperson, Graduate Committee: Brendan Mumey
Current genetic sequencing technologies allow for fast and cheap measurement of short substrings of genetic sequence called reads, which must be assembled to recover the full unknown sequence. In some cases, such as when assembling RNA transcripts or the genomes of a mixture of species taken in a single sample, the reads come from multiple sequences. In this case, we would like to recover all of the distinct unknown sequences and their relative abundances, a task which we call multiassembly. A common model underlying many multiassembly approaches is flow decomposition, which decomposes a flow network into a set of paths and weights that parsimoniously explains the flow. In this dissertation, we formalize two new variations on flow decomposition to better model the information available when performing multiassembly from reads. The first, inexact flow decomposition, allows for some uncertainty in the flow measurements. The second, flow decomposition with subpath constraints, incorporates additional information that may be provided by longer reads. We give algorithms to solve these problems and demonstrate their usefulness for RNA assembly on a simulated dataset. Additionally, we give the first polynomial-size integer linear programming (ILP) formulation for minimum flow decomposition (MFD) and show that it can be adapted to encode both of the variants mentioned above.
An implementation of the ILP using the ILP solver CPLEX runs faster than existing exact MFD solvers on RNA sequencing datasets.

Item Directed graph descriptors and distances for analyzing multivariate time series data (Montana State University - Bozeman, College of Letters & Science, 2022) Belton, Robin Lynne; Chairperson, Graduate Committee: Tomas Gedeon
Local maxima and minima, or extremal events, in experimental time series can be used as a coarse summary to characterize data. However, the discrete sampling in recording experimental measurements suggests uncertainty in the true timing of extrema during the experiment. This in turn gives uncertainty in the timing order of extrema within the time series. Motivated by applications in genomic time series and biological network analysis, we use techniques from persistent homology to construct a weighted directed acyclic graph (DAG), called an extremal event DAG, that is robust to measurement noise. Furthermore, we define a distance between extremal event DAGs based on the edit distance between strings. We prove several properties, including local stability of the extremal event DAG distance with respect to pairwise L1 distances between functions in the time series data. Lastly, we provide algorithms, publicly available free software, and implementations for extremal event DAG construction and comparison.
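
The last abstract above describes its DAG distance as based on the edit distance between strings. For readers unfamiliar with that building block, the following is a minimal sketch of the classical (Levenshtein) string edit distance only. It is the textbook dynamic program, not the extremal event DAG distance defined in the thesis, and the function name `edit_distance` is our own choice for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Classical Levenshtein distance between two strings.

    Counts the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`.
    Illustrative only; not the DAG distance from the thesis.
    """
    # prev[j] is the distance between the current prefix of `a` and b[:j].
    # Initially the prefix of `a` is empty, so the distance is j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # a[:i] versus the empty string: i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # keep (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]
```

For example, `edit_distance("kitten", "sitting")` evaluates to 3 (two substitutions and one insertion). The row-by-row formulation keeps memory proportional to the length of the second string rather than to the product of both lengths.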