Theses and Dissertations at Montana State University (MSU)

Permanent URI for this collection: https://scholarworks.montana.edu/handle/1/733

Now showing 1 - 6 of 6

Improving the confidence of machine learning models through improved software testing approaches
(Montana State University - Bozeman, College of Engineering, 2022) ur Rehman, Faqeer; Chairperson, Graduate Committee: Clemente Izurieta; This is a manuscript style paper that includes co-authored chapters.
Machine learning is gaining popularity in transforming and improving a number of different domains, e.g., self-driving cars, natural language processing, healthcare, manufacturing, retail, banking, and cybersecurity. However, because machine learning algorithms are computationally complex, it is challenging to verify their correctness when the oracle is either unavailable or too expensive to apply. Software Engineering for Machine Learning (SE4ML) is an emerging research area that focuses on applying SE best practices and methods for better development, testing, operation, and maintenance of ML models. The focus of this work is on the testing aspect of ML applications, adapting traditional software testing approaches to improve confidence in them. First, a statistical metamorphic testing technique is proposed to test Neural Network (NN)-based classifiers in a non-deterministic environment. Furthermore, a Metamorphic Relation (MR) minimization algorithm is proposed for the program under test, thus saving computational costs and organizational testing resources. Second, an MR is proposed to address a data generation/labeling problem; that is, enhancing the effectiveness of the test inputs by extending the prioritized test set with new tests without incurring additional labeling costs. Further, the prioritized test inputs are leveraged to propose a statistical hypothesis testing approach (for detection) and a machine learning-based approach (for prediction) of faulty behavior in two other machine learning classifiers, i.e., NN-based Intrusion Detection Systems.
Finally, to test unsupervised ML models, the metamorphic testing approach is utilized to make some insightful contributions that include: i) proposing a broader set of 22 MRs for assessing the behavior of clustering algorithms under test, ii) providing a detailed analysis/reasoning to show how the proposed MRs can be used to target both the verification and validation aspects of testing the programs under investigation, and iii) showing that verification of MRs using multiple criteria is more beneficial than relying on just a single criterion (i.e., clusters assigned). Thus, the work presented here makes a significant contribution toward addressing the gaps found in the field, enhancing the body of knowledge in the emergent SE4ML field.
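As an illustration of what a metamorphic relation for a clustering program can look like, the following is a minimal sketch and not code from the thesis. The toy program under test (`threshold_clusters`), the two relations, and all names are assumptions chosen for this example; the abstract does not list the 22 MRs, so these two are only plausible instances. Each relation transforms the input in a way that should leave the clusters assigned unchanged, so no labeled oracle is needed.

```python
import random
from itertools import combinations


def threshold_clusters(points, cutoff):
    """Toy program under test: link any two points within `cutoff` of each
    other and return the connected groups as a set of frozensets of points.
    Points are assumed to be distinct (x, y) integer tuples."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= cutoff ** 2:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), set()).add(p)
    return {frozenset(g) for g in groups.values()}


def mr_permutation_holds(points, cutoff, seed=0):
    """MR (assumed): shuffling the input order must not change the clusters."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    return threshold_clusters(points, cutoff) == threshold_clusters(shuffled, cutoff)


def mr_translation_holds(points, cutoff, dx=100, dy=-50):
    """MR (assumed): shifting every point by the same offset must not change
    which points are grouped together."""
    moved = [(x + dx, y + dy) for x, y in points]
    moved_back = {
        frozenset((x - dx, y - dy) for x, y in group)
        for group in threshold_clusters(moved, cutoff)
    }
    return threshold_clusters(points, cutoff) == moved_back


if __name__ == "__main__":
    sample = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (30, 30)]
    print(mr_permutation_holds(sample, 2), mr_translation_holds(sample, 2))
```

Both checks compare only the clusters assigned, which is the single criterion the abstract mentions; a fuller check could also compare other outputs of the program under test.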
Towards reduced-cost hyperspectral and multispectral image classification
(Montana State University - Bozeman, College of Engineering, 2021) Morales Luna, Giorgio L.; Chairperson, Graduate Committee: John Sheppard
In recent years, Hyperspectral Imaging systems (HSI) have become a powerful source for reliable data in applications such as remote sensing, agriculture, and biomedicine. However, the abundant spectral and spatial information of hyperspectral images makes them highly complex, which leads to the need for specialized Machine Learning algorithms to process and classify them. In that sense, the contribution of this thesis is multifold. We present a low-cost convolutional neural network designed for hyperspectral image classification called Hyper3DNet. Its architecture consists of two parts: a series of densely connected 3-D convolutions used as a feature extractor, and a series of 2-D separable convolutions used as a spatial encoder. We show that this design involves fewer trainable parameters than other approaches, without detriment to its performance. Furthermore, having observed that hyperspectral images benefit from methods to reduce the number of spectral bands while retaining the most useful information for a specific application, we present two novel hyperspectral dimensionality reduction techniques. First, we propose a filter-based method called Inter-Band Redundancy Analysis (IBRA) based on a collinearity analysis between a band and its neighbors. This analysis helps to remove redundant bands and dramatically reduces the search space. Second, we apply a wrapper-based approach called Greedy Spectral Selection (GSS) to the results of IBRA to select bands based on their information entropy values and train a compact Convolutional Neural Network to evaluate the performance of the current selection. We also propose a feature extraction framework that consists of two main steps: first, it reduces the total number of bands using IBRA; then, it can use any feature extraction method to obtain the desired number of feature channels.
Finally, we use the original hyperspectral data cube to simulate the process of using actual filters in a multispectral imager. Experimental results show that our proposed Hyper3DNet architecture in conjunction with our dimensionality reduction techniques yields better classification results than the compared methods, producing more suitable results for a multispectral sensor design.
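The two ideas named in this abstract, removing bands that are redundant with their neighbors and ranking the remaining bands by information entropy, can be sketched as follows. This is a simplified stand-in and not the thesis's IBRA or GSS algorithms: it assumes Pearson correlation as the redundancy measure, a fixed threshold of 0.95, and bands represented as flat lists of pixel values. All function names are hypothetical.

```python
import math
from collections import Counter


def pearson(a, b):
    """Pearson correlation of two equal-length bands; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_redundant_neighbors(bands, threshold=0.95):
    """Walk the bands in spectral order and keep a band only if its absolute
    correlation with the most recently kept band is below `threshold`."""
    kept = [0]
    for i in range(1, len(bands)):
        if abs(pearson(bands[kept[-1]], bands[i])) < threshold:
            kept.append(i)
    return kept


def entropy(band):
    """Shannon entropy (in bits) of the band's pixel-value histogram."""
    n = len(band)
    return -sum(c / n * math.log2(c / n) for c in Counter(band).values())


def rank_by_entropy(bands, candidates):
    """Order candidate band indices from most to least informative."""
    return sorted(candidates, key=lambda i: entropy(bands[i]), reverse=True)


if __name__ == "__main__":
    # Four tiny made-up bands of four pixels each, for illustration only.
    bands = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2], [5, 5, 5, 6]]
    survivors = drop_redundant_neighbors(bands)
    print(survivors, rank_by_entropy(bands, survivors))
```

In a wrapper-based scheme such as the one the abstract describes, the entropy ranking would only propose candidates; a classifier trained on each candidate selection would decide which bands to keep.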
Extracting abstract spatio-temporal features of weather phenomena for autoencoder transfer learning
(Montana State University - Bozeman, College of Engineering, 2020) McAllister, Richard Arthur; Chairperson, Graduate Committee: John Sheppard
In this dissertation we develop ways to discover encodings within autoencoders that can be used to exchange information among neural network models. We begin by verifying that autoencoders can be used to make predictions in the meteorological domain, specifically for wind vector determination. We use unsupervised pre-training of stacked autoencoders to construct multilayer perceptrons to accomplish this task. We then discuss the role of our approach as an important step in positioning Empirical Weather Prediction as a viable alternative to Numerical Weather Prediction. We continue by exploring the spatial extensibility of the previously developed models, observing that different areas in the atmosphere may be influenced by unique forces. We use stacked autoencoders to generalize across an area of the atmosphere, expanding the application of networks trained in one area to the surrounding areas. As a prelude to exploring transfer learning, we demonstrate that a stacked autoencoder is capable of capturing knowledge universal to these dataspaces. Following this, we observe that in extremely large dataspaces, a single neural network covering that space may not be effective, and generating large numbers of deep neural networks is not feasible. Using functional data analysis and spatial statistics, we analyze deep networks trained from stacked autoencoders in a spatiotemporal application area to determine the extent to which knowledge can be transferred to similar regions. Our results indicate a high likelihood that spatial correlation can be exploited if it can be identified prior to training. We then observe that artificial neural networks (ANNs), being essentially black-box processes, would benefit from effective methods for preserving knowledge for successive generations of training.
We develop an approach to preserving knowledge encoded in the hidden layers of several ANNs and collect this knowledge in networks that more effectively make predictions over subdivisions of the entire dataspace. We show that this method has an accuracy advantage over the single-network approach. We extend the previously developed methodology, adding a non-parametric method for determining transferable encoded knowledge. We also analyze new datasets, focusing on the ability of models trained in this fashion to be transferred to other storms.
Large-scale automated human protein-phenotype relation extraction from biomedical literature
(Montana State University - Bozeman, College of Engineering, 2020) Pourreza Shahri, Morteza; Chairperson, Graduate Committee: Indika Kahanda
Identifying protein-phenotype relations is of paramount importance for applications such as uncovering rare and complex diseases. Human Phenotype Ontology (HPO) is a recently introduced standardized vocabulary for describing disease-related phenotypic abnormalities in humans. While the official HPO knowledge base maintains known associations between human proteins and HPO terms, it is widely believed that this knowledge base is incomplete. However, due to the exponential growth of biomedical literature, timely manual curation is infeasible, creating a need for efficient and accurate computational tools for automated curation. In this work, we present HPcurator, a novel two-step framework for extracting relations between proteins and HPO terms from biomedical literature. First, we implement ProPheno, a comprehensive online dataset composed of human protein-phenotype co-mentions extracted from the entire set of biomedical articles. Subsequently, we show that these co-mentions are useful as a complementary source of input for a different, but highly related, task of automated protein-phenotype prediction. Next, we develop a supervised machine learning model called PPPred, which, to the best of our knowledge, is the first predictive model that can classify the validity of a given sentence-level protein-phenotype co-mention. Using a gold standard dataset composed of manually curated sentence co-mentions, we demonstrate that PPPred significantly outperforms several baseline methods. Finally, we propose SSEnet, a novel deep semi-supervised ensemble framework for relation extraction that combines deep learning, semi-supervised learning, and ensemble learning. This framework is motivated by the fact that while the manual annotation of co-mentions is prohibitively expensive, we have access to millions of unlabeled co-mentions.
We develop a prototype of HPcurator by instantiating SSEnet with ProPheno, self-learning, pre-trained language models, as well as convolutional and recurrent neural networks. This system can successfully output a ranked list of relevant sentences for a user-input protein-phenotype pair. Our experimental results indicate that this system provides state-of-the-art performance in human protein-HPO term relation extraction. The findings and the insight gained from this work have implications for biocurators, biologists, and the computer science community involved in developing biomedical text mining tools.
Convolutional neural networks for multi- and hyper-spectral image classification
(Montana State University - Bozeman, College of Engineering, 2019) Senecal, Jacob John; Chairperson, Graduate Committee: John Sheppard
While a great deal of research has been directed towards developing neural network architectures for classifying RGB images, there is a relative dearth of research directed towards developing neural network architectures specifically for multi-spectral and hyper-spectral imagery. The additional spectral information contained in a multi-spectral or hyper-spectral image can be valuable for land management, agriculture and forestry, disaster control, humanitarian relief operations, and environmental monitoring. However, the massive amounts of data generated by a multi-spectral or hyper-spectral instrument make processing this data a challenge. Machine learning and computer vision techniques could automate the analysis process of these rich data sources. With these benefits in mind, we have adapted recent developments in small efficient convolutional neural networks (CNNs) to create a small CNN architecture capable of being trained from scratch to classify 10-band multi-spectral images, using far fewer parameters than popular deep architectures, such as the ResNet or DenseNet architectures. We show that this network provides higher classification accuracy and greater sample efficiency than the same network using RGB images. We also show that it is possible to employ a transfer learning approach and use a network pre-trained on multi-spectral satellite imagery to increase accuracy on a second much smaller multi-spectral dataset, even though the satellite imagery was captured from a much different perspective (high altitude, overhead vs. ground based at close stand-off distance). These results demonstrate that it is possible to train our small network architectures on small multi-spectral datasets and still achieve high classification accuracy. This is significant as labeled hyper-spectral and multi-spectral datasets are generally much smaller than their RGB counterparts.
Finally, we approximate a Bayesian version of our CNN architecture using a recent technique known as Monte Carlo dropout. By keeping dropout in place during test time, we can perform a Monte Carlo procedure using multiple forward passes of our network to generate a distribution of network outputs, which can be used as a measure of uncertainty in the predictions a network is making. Large variance in the network output corresponds to high uncertainty and vice versa. We show that a network that is capable of working with multi-spectral imagery significantly reduces the uncertainty associated with class predictions compared to using RGB images. This analysis reveals that the benefits of an architecture that works effectively with multi-spectral or hyper-spectral imagery extend beyond higher classification accuracy. Multi-spectral and hyper-spectral imagery allows us to be more confident in the predictions that a deep neural network is making.
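The Monte Carlo dropout procedure described above can be sketched in a few lines. This is only an illustration under strong simplifying assumptions: a single linear unit with fixed, made-up weights stands in for the CNN, dropout is applied to the inputs, and the names `forward_with_dropout` and `mc_dropout_predict` are hypothetical. The point it shows is that leaving dropout active at prediction time and repeating the forward pass yields a distribution of outputs whose variance can serve as an uncertainty measure.

```python
import random
import statistics


def forward_with_dropout(x, weights, drop_prob, rng):
    """One stochastic forward pass of a linear unit. Each input is dropped
    with probability `drop_prob`; kept inputs are rescaled by 1 / keep
    (inverted dropout) so the expected output matches the no-dropout output."""
    keep = 1.0 - drop_prob
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep:
            total += (xi / keep) * wi
    return total


def mc_dropout_predict(x, weights, drop_prob=0.5, passes=200, seed=0):
    """Run `passes` forward passes with dropout left on and summarize them.
    Returns (mean prediction, sample variance of the predictions)."""
    rng = random.Random(seed)
    outputs = [forward_with_dropout(x, weights, drop_prob, rng) for _ in range(passes)]
    return statistics.mean(outputs), statistics.variance(outputs)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]          # made-up input
    weights = [0.5, -1.0, 2.0]   # made-up fixed weights
    mean, variance = mc_dropout_predict(x, weights)
    print(f"mean={mean:.3f} variance={variance:.3f}")
```

With `drop_prob=0.0` every pass is identical and the variance is zero; raising the dropout probability spreads the outputs, which is the quantity the abstract interprets as predictive uncertainty.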
Predicting anticancer peptides and protein function with deep learning
(Montana State University - Bozeman, College of Engineering, 2020) Lane, Nathaniel Patrick; Chairperson, Graduate Committee: Indika Kahanda
Anticancer peptides (ACPs) are a promising alternative to traditional chemotherapy. To aid wet-lab and clinical research, there is a growing interest in using machine learning techniques to help identify good ACP candidates computationally. In this work, we develop DeepACPpred, a novel deep learning model for predicting ACPs using their amino acid sequences. Using several gold-standard ACP datasets, we demonstrate that DeepACPpred is highly effective compared to state-of-the-art ACP prediction models. Furthermore, we adapt the above neural network model for predicting protein function and report our experience with participating in a community-wide large-scale assessment of protein functional annotation tools.
