Scholarship & Research
Permanent URI for this community: https://scholarworks.montana.edu/handle/1/1
10 results
Search Results
Item Improving the effectiveness of metamorphic testing using systematic test case generation (Montana State University - Bozeman, College of Engineering, 2024) Saha, Prashanta; Chairperson, Graduate Committee: Clemente Izurieta; This is a manuscript style paper that includes co-authored chapters.
Metamorphic testing is a well-known approach to tackle the oracle problem in software testing. This technique requires source test cases that serve as seeds for the generation of follow-up test cases. Systematic design of test cases is crucial for the test quality. Thus, the source test case generation strategy can have a large impact on the fault detection effectiveness of metamorphic testing. Most of the previous studies on metamorphic testing have used either random test data or existing test cases as source test cases. There has been limited research done on systematic source test case generation for metamorphic testing. This thesis explores innovative methods for enhancing the effectiveness of Metamorphic Testing through systematic generation of source test cases. It addresses the challenge of testing complex software systems, including numerical programs and machine learning applications, where traditional testing methods are limited by the absence of a reliable oracle. By focusing on structural and mutation coverage criteria and on characteristics of machine learning datasets, the research introduces strategies to generate source test cases that are more effective in fault detection compared to random test case generation. The proposed techniques include leveraging structural and mutation coverage for numerical programs and aligning random values with machine learning properties for supervised classifier applications. These techniques are integrated into the METTester tool, automating the process and potentially reducing testing costs by minimizing the test suite without sacrificing quality.
The thesis demonstrates that tailored source test case generation can significantly improve the fault detection capabilities of Metamorphic Testing, offering substantial benefits in terms of cost efficiency and reliability in software testing.

Item Exploration of UHPC applications for Montana bridges (Montana State University - Bozeman, College of Engineering, 2023) Starke, James Gerald; Chairperson, Graduate Committee: Kirsten Matteson
The following research project explores bridge applications of ultra-high performance concrete (UHPC). Bridge deterioration is a problem across Montana and UHPC overlays and patching/repairing have been found to be viable alternatives to bridge replacement. The current study began with a literature review on research, specifications, and implementation projects of UHPC bridge deck overlays. A report from FHWA that summarized the results of previous overlay and repair projects and developed its own recommendations was highlighted. A material-level evaluation was performed on three UHPC mixes, primarily focusing on workability, compressive strength, tensile strength, and tension and shear bond strengths. All three UHPCs exhibited adequate behavior and the resultant properties were above recommendations from ACI for concrete repair and overlay applications. Based on the material-level evaluation results, a thixotropic version of Ductal was chosen for subsequent structural testing. Five slab test specimens were designed and constructed to model a deck section from an existing bridge in Montana. The testing and specimens were designed to determine the effects that including a UHPC overlay, overlay thickness, and substrate concrete strength have on the ultimate moment capacity. The slabs consisted of one control slab, two slabs with varying UHPC overlay depths, one with weak substrate concrete, and one tested to emulate a negative moment region on a bridge deck.
The testing demonstrated that including a UHPC overlay increased the ultimate moment capacity of the slabs, even with a weak substrate concrete, but caused the slabs to fail in shear rather than concrete crushing. Additionally, the results imply that a weak deck strengthened with a thin UHPC overlay will respond similarly to a deck composed of much stronger normal concrete. The tensile capacity of the UHPC plays a large role in the overall strength and stiffness of a slab subjected to a negative moment and the tensile strength should be included in capacity calculations, as recommended by FHWA. Overall, the results are promising and shed light on how a UHPC overlay may contribute to the overall strength of an existing bridge deck if implemented in a future overlay project in Montana.

Item Characterization and testing of reduced height (RHT) hypomorphs in durum and spring wheat (Montana State University - Bozeman, College of Agriculture, 2023) Ugrin, Josey Mackinsey; Chairperson, Graduate Committee: Michael J. Giroux
The Reduced Height (Rht) gene in wheat (Triticum aestivum L.) increases yield by partitioning fewer nutrients to stem elongation and more towards spike development. In hexaploid wheat, the mutations Rht-B1b and Rht-D1b create high-yielding semi-dwarf varieties. While Rht-B1b and Rht-D1b have been widely adopted due to their ability to increase yield, they also have drawbacks such as smaller seed size and lower protein content. Furthermore, in tetraploid durum wheat (Triticum turgidum L.), Rht-B1b creates plants that are shorter than in hexaploid wheat under Northern Great Plains growing conditions. This project aimed to further characterize Rht and to develop a plant height intermediate between current standard-height and semi-dwarf varieties to increase yield in both durum and spring wheat. To create novel Rht alleles, seeds were mutagenized with ethyl methanesulfonate (EMS) and mutations were identified.
Near-isogenic lines (NILs) were developed for the two Rht-A1 alleles and Rht-B1b-E529K alleles in semi-dwarf (Rht-B1b) and standard height (Rht-B1a) varieties in durum. In spring wheat, NILs were developed for eight Rht-A1 alleles in two high-yielding Montana varieties. These NILs were planted in field trials and plant height and grain traits were measured. Four novel mutations in spring wheat (Rht-A1-E63K, Rht-A1-Q6*, Rht-A1-V55M, and Rht-A1-53T) and two mutations in durum (Rht-B1b-E529K and Rht-A1-S50F) all significantly changed either height or grain traits. Along with developing and testing Rht alleles for field trait improvement, we conducted a study to characterize an Rht stop-codon dosage response in wheat. Previous studies in rice and barley have indicated that a lack of the functional SLR1/SLN1 gene, respectively, results in an abnormal growth response characterized by taller height and slender appearance. This effect on Rht function has yet to be tested in wheat. Rht nonsense alleles were created by screening an EMS-treated population created using seed of a standard-height Montana variety. We combined mutations creating lines homozygous for single, double, or triple mutations. In field trials, Rht triple mutants exhibited a slender, elongated phenotype with strike heads similar to SLN1 mutants in barley. Differences in height varied for the other crosses but did trend towards increased height with increased Rht-stop mutation dosage.

Item Improving the confidence of machine learning models through improved software testing approaches (Montana State University - Bozeman, College of Engineering, 2022) ur Rehman, Faqeer; Chairperson, Graduate Committee: Clemente Izurieta; This is a manuscript style paper that includes co-authored chapters.
Machine learning is gaining popularity in transforming and improving a number of different domains, e.g., self-driving cars, natural language processing, healthcare, manufacturing, retail, banking, and cybersecurity.
However, because machine learning algorithms are computationally complex, verifying their correctness becomes a challenging task when the oracle is either not available or available but too expensive to apply. Software Engineering for Machine Learning (SE4ML) is an emerging research area that focuses on applying SE best practices and methods for better development, testing, operation, and maintenance of ML models. The focus of this work is on the testing aspect of ML applications by adapting the traditional software testing approaches for improving the confidence in them. First, a statistical metamorphic testing technique is proposed to test Neural Network (NN)-based classifiers in a non-deterministic environment. Furthermore, an MR minimization algorithm is proposed for the program under test, thus saving computational costs and organizational testing resources. Second, a Metamorphic Relation (MR) is proposed to address a data generation/labeling problem; that is, enhancing the effectiveness of the test inputs by extending the prioritized test set with new tests without incurring additional labeling costs. Further, the prioritized test inputs are leveraged to propose a statistical hypothesis testing (for detection) and machine learning-based approach (for prediction) of faulty behavior in two other machine learning classifiers, i.e., NN-based Intrusion Detection Systems. Finally, to test unsupervised ML models, the metamorphic testing approach is utilized to make some insightful contributions that include: i) proposing a broader set of 22 MRs for assessing the behavior of clustering algorithms under test, ii) providing a detailed analysis/reasoning to show how the proposed MRs can be used to target both the verification and validation aspects of testing the programs under investigation, and iii) showing that verification of an MR using multiple criteria is more beneficial than relying on just a single criterion (i.e., clusters assigned).
Thus, the work presented here makes a significant contribution toward addressing the gaps found in the field, enhancing the body of knowledge in the emergent SE4ML field.

Item Full scale component level testing & severity analysis of phantom 3 UAV to Cessna 182b aircraft collisions (Montana State University - Bozeman, College of Engineering, 2021) Hayes, Benjamin Woodruff; Chairperson, Graduate Committee: Robb Larson
Unmanned Aircraft Systems (UAS) are more attainable now than ever before. With uses including reforestation, agriculture, filmmaking, and recreation, a significant amount of airspace is being occupied by UAS. To better understand the risks posed by UAS to other aircraft, the Alliance for System Safety of UAS through Research Excellence (ASSURE) was created. One aspect of ASSURE's agenda is to conduct air-to-air collision studies using Finite Element Analysis (FEA) in combination with full scale collision data. Montana State University contracted with ASSURE to conduct component level testing for the project, and provide data for validating FEA models being developed at the National Institute of Aviation Research (NIAR). Component level testing consisted of the following aircraft components: Cessna 182B struts, wings, and windscreens. In order to accurately simulate in-flight geometry, fixtures were custom fabricated to individually mount aircraft components. High velocity impact data was collected via load cells, high speed video, and Digital Image Correlation (DIC). A drone launching system developed during an MSU-conducted research effort was used to launch Phantom 3 quadcopter UAVs as projectiles for component level tests. For all tests, the impact was captured from two viewpoints using high speed video, and reaction force data was collected using load cells at critical attachment points. For wing and windscreen testing, 2-D DIC and 3-D DIC were used respectively to capture displacements during the collision.
Testing showed that struts received mainly superficial damage, but that both wings and windscreens exhibited the potential for catastrophic failure.

Item Performance of FRP-strengthened reinforced concrete beams subjected to low temperature (Montana State University - Bozeman, College of Engineering, 2021) Ahmed, Emtiaz; Chairperson, Graduate Committee: Kirsten Matteson
The use of Fiber Reinforced Polymer (FRP) to repair and strengthen existing concrete structural elements (beams, columns, beam-column connections, and slabs) has become globally accepted and popular. FRP can be used for this application in several forms, such as externally applied wrapping, Near Surface Mounted (NSM) bars, lamination, and sheets. The strength to weight ratio of this material is one of the main criteria that makes this material approved and desired by engineers and researchers for this application. Also, FRP is corrosion resistant and requires less installation time compared to other repairing techniques such as jacketing, section enlargement, and external post tensioning. The performance of FRP repairs has been studied extensively at conventional, non-extreme temperatures; however, little research has been conducted on the performance of these repairs at cold temperatures. The research discussed herein aims to fill this gap in knowledge so that FRP repairs can be more widely used in cold temperature environments, such as for bridge repairs in the state of Montana. In this work, six beams (6 in. x 8 in., 10 ft long) were constructed and tested in four-point bending at two different temperatures (room temperature and -40 °C). For each temperature, there were three beam types: 1) a control beam, 2) a longitudinal strengthened beam, and 3) a longitudinal + transverse strengthened beam. Overall, the results showed that low temperatures have a generally positive effect on concrete strength and beam performance.
The average concrete compressive strength of frozen cylinders at -40 °C was observed to be 87.18% higher than that of the cylinders tested at room temperature. For all beam types, the ultimate load carrying capacity of the low temperature beams exceeded the capacity of the counterpart beam tested at room temperature. Additionally, at lower temperatures the strengthened beams showed delayed FRP delamination (occurring at higher displacements). Further, the initial stiffnesses of the cold beams were found to be significantly higher than those of the room temperature beams. Overall, the results of this study are promising for the potential use of FRP for repairs in cold environments and future research is warranted.

Item Automated techniques for prioritization of metamorphic relations for effective metamorphic testing (Montana State University - Bozeman, College of Engineering, 2022) Srinivasan, Madhusudan; Chairperson, Graduate Committee: John Paxton and Upulee Kanewala (co-chair)
An oracle is a mechanism to decide whether the outputs of the program for the executed test cases are correct. In many situations, the oracle is not available or too difficult to implement. Metamorphic testing is a testing approach that uses metamorphic relations (MRs), properties of the software under test represented in the form of relations among inputs and outputs of multiple executions, to help verify the correctness of a program. Typically, MRs vary in their ability to detect faults in the program under test, and some MRs tend to detect the same set of faults. In this work, we aim to prioritize MRs to improve the efficiency and effectiveness of MT. We present five MR prioritization approaches: (1) Fault-based, (2) Coverage-based, (3) Statement Centrality-based, (4) Variable-based, and (5) Data Diversity-based. To evaluate these MR prioritization approaches, we conducted experiments on complex open-source software systems and machine learning programs.
Our results suggest that the proposed MR prioritization approaches outperform the current practice of executing the source and follow-up test cases of the MRs randomly. Further, our results show that Statement Centrality-based and Variable-based approaches outperform Code Coverage and random-based approaches. Also, the proposed approaches show a 21% higher rate of fault detection over random-based prioritization. For machine learning programs, the proposed Data Diversity-based MR prioritization approach increases the fault detection effectiveness by up to 40% when compared to the Code Coverage-based approach and reduces the time taken to detect a fault by 29% when compared to random execution of MRs. Further, all the proposed approaches lead to reducing the number of MRs that need to be executed. Overall, our work would result in saving time and cost during the metamorphic testing process.

Item The effects of high-stakes ATI remediation and testing practices including the ATI content mastery series and ATI PN comprehensive predictor (Montana State University - Bozeman, College of Education, Health & Human Development, 2021) Hunter, Elaine Hernandez; Chairperson, Graduate Committee: Tricia Seifert
The purpose of this retrospective, descriptive study was to determine if any differences existed in students' test scores on the Assessment Technologies Institute (ATI) PN tests: Fundamentals, Pharmacology, Medical Surgical and the Comprehensive Predictor before and after implementing a high-stakes remediation and testing policy. The ATI computer-based standardized tests are widely used in nursing programs as a program assessment tool. Also, ATI tests provide correlational evidence of first-time NCLEX-RN passage. The ATI Remediation and tests are commonly added to nursing programs' progression plans. In recent years nursing programs have applied high-stakes ATI Remediation to the ATI tests in response to the high-stakes quality of NCLEX-RN.
In this study the high-stakes ATI tests were administered to associate of science nursing students in the first year of their two-year program. The study took place at a small university located in the Rocky Mountain region of the US. Group comparisons were made between students under the pre-policy (no high-stakes ATI Remediation and testing practices) and students under the post-policy (high-stakes ATI Remediation and testing practices). Descriptive and inferential statistical analyses were used to detect differences in test scores between the two groups. Statistically significant differences were found between groups of test takers on the ATI PN Fundamentals and Comprehensive Predictor tests with the post-policy group scoring higher. One explanation of these findings is that test scores increase with the use of high-stakes ATI Remediation and testing practices. The findings from these tests can assist nurse educators in placing a clearly defined, appropriate high-stakes ATI Remediation and testing into the progression plan.

Item Cessna 182b windscreen material model development and full scale UAS to aircraft impact testing facility (Montana State University - Bozeman, College of Engineering, 2020) Arnold, Forrest Jacob; Chairperson, Graduate Committee: Douglas S. Cairns
Unmanned Aircraft Systems (UAS) have become popular in the last decade. More than 1.5 million have been registered by the Federal Aviation Administration (FAA) since 2015. In order to understand the risk UAS pose to manned aircraft and make informed regulation decisions, the FAA has created air-to-air collision studies. As a part of the FAA general aviation air-to-air collision research, a Cessna 182 windscreen material model and a full scale impact testing facility were required. A Finite Element Crash Model of a Cessna 182 is in development as a part of the general aviation air-to-air collision research.
The National Institute for Aviation Research at Wichita State University is managing development of the model. In support of that work, an LS-DYNA material model of the poly(methyl methacrylate) windscreen was developed. Results from tensile testing at multiple strain rates were used to develop material models using MAT_124 and MAT_187. A model of an impact tower was created to compare the material models to test results. The material models were tuned to better fit the impact tower test results. MAT_187 has more flexible material inputs, which allowed it to outperform MAT_124. A full scale impact testing facility was developed to support Finite Element model validation and direct testing of UAS to aircraft impact. A slingshot style launcher was designed and built to launch common quadcopter style UAS. Testing has shown that the launcher is capable of 120 knots with the accuracy required to repeatably hit the leading edge of a wing. Additionally, the launch site required a system for instrumented testing to compare experimental results with finite element results. A system was developed to allow flexible fixturing, impact speed and orientation measurement, and inclusion of load cells and strain gauges.

Item Predicting metamorphic relations: an evaluation of program representations and machine learning techniques (Montana State University - Bozeman, College of Engineering, 2020) Rahman, Karishma; Chairperson, Graduate Committee: Upulee Kanewala; Upulee Kanewala was a co-author of the article, 'Predicting metamorphic relations for matrix calculation programs' in the 'MET18: Proceedings of the 3rd International Workshop on Metamorphic Testing' which is contained within this thesis.
Testing complex scientific applications can often be a complicated and expensive procedure. A test oracle is used to verify the behavior of the software under test.
However, difficulties due to the implementation of a test oracle make the process of systematically testing scientific applications more challenging. This problem is known as the oracle problem. Metamorphic testing (MT) is an effective technique to test these applications as it uses metamorphic relations (MRs) to determine whether test cases have passed or failed. Metamorphic relations are essential components of metamorphic testing that highly affect its fault detection effectiveness. MRs are usually identified with the help of a domain expert, which is a labor-intensive task. In this work, a previously developed graph kernel-based machine learning method is extended by predicting MRs for functions that perform matrix calculations. Then, a semi-supervised support vector machine (S3VM) is used to build the predictive model for the suggested approach. Finally, call graph (CG) information of the functions is used to calculate the graph kernels to predict MRs. The overall result shows that the random walk kernel performs better than the graphlet kernel, and semi-supervised learning can be effective with more unlabelled data. Also, the use of call graph representation presents a new avenue of research in predicting MRs for unseen functions.
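
Several of the records above rely on metamorphic relations to test programs that lack an oracle. As a rough illustration only, and not code from any of the listed theses, the following Python sketch checks two MRs against a toy matrix function. The function names (matrix_sum, faulty_matrix_sum, count_violations), the choice of relations, and the seeded fault are all assumptions made for this example.

```python
import random


def matrix_sum(matrix):
    """Hypothetical program under test: sum of all matrix entries."""
    return sum(sum(row) for row in matrix)


def faulty_matrix_sum(matrix):
    """Seeded fault for illustration: the first row is skipped."""
    return sum(sum(row) for row in matrix[1:])


def mr_row_permutation(func, source, rng):
    """MR: reordering the rows must not change the output."""
    follow_up = source[:]
    rng.shuffle(follow_up)
    return func(follow_up) == func(source)


def mr_scaling(func, source, rng):
    """MR: multiplying every entry by k must multiply the output by k."""
    k = rng.randint(2, 5)
    follow_up = [[k * x for x in row] for row in source]
    return func(follow_up) == k * func(source)


RELATIONS = {"row_permutation": mr_row_permutation, "scaling": mr_scaling}


def count_violations(func, trials=200, seed=0):
    """Run each MR on random integer source matrices and count violations.

    No oracle is consulted: only the relation between the source and
    follow-up outputs is checked.
    """
    rng = random.Random(seed)
    violations = {name: 0 for name in RELATIONS}
    for _ in range(trials):
        rows, cols = rng.randint(2, 4), rng.randint(2, 4)
        source = [[rng.randint(-9, 9) for _ in range(cols)] for _ in range(rows)]
        for name, relation in RELATIONS.items():
            if not relation(func, source, rng):
                violations[name] += 1
    return violations


if __name__ == "__main__":
    print("correct:", count_violations(matrix_sum))
    print("faulty: ", count_violations(faulty_matrix_sum))
```

In this sketch the row-permutation relation exposes the seeded fault while the scaling relation does not, which is a small instance of the observation in the abstracts that MRs differ in their ability to detect faults.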