Theses and Dissertations at Montana State University (MSU)
Permanent URI for this community: https://scholarworks.montana.edu/handle/1/732
12 results
Search Results
Item Data-driven approaches for distribution grid modernization: exploring state estimation, pseudo-measurement generation and false data detection (Montana State University - Bozeman, College of Engineering, 2023) Radhoush, Sepideh; Chairperson, Graduate Committee: Brad Whitaker
Distribution networks must be regularly updated to enhance their performance and meet customer electricity requirements. Advanced technologies and infrastructure--including two-way communication, smart measuring devices, distributed generation in various forms, electric vehicles, variable loads, etc.--have been added to improve the overall efficiency of distribution networks. To keep pace with these new features and structures, the continuous control and monitoring of distribution networks should be intensified to track any changes in distribution network performance. Distribution system state estimation has been introduced for real-time monitoring of distribution networks. State estimation calculations are highly dependent on measurement data, which are collected from measurement devices in distribution networks. However, measurement devices cannot be installed at every bus, so the distribution network is not fully observable from real measurements alone. To address the lack of real measurements, pseudo-measurements are produced from historical load and generation data. Available measurements, along with the physical distribution network topology, are fed into a state estimation algorithm to determine system state variables. Then, state estimation results are sent to a control center for further processing to enhance distribution network operation. However, the accuracy of state estimation results could be degraded by false data injection attacks on measurement data. If these attacks are not detected, distribution network operation could be significantly affected. Different methods have been developed to enhance distribution network operation and management.
Machine learning approaches have also been identified as beneficial in solving different types of problems in a power grid. In this dissertation, machine learning is applied to three areas of distribution systems: generating pseudo-measurements, performing distribution system state estimation calculations, and detecting false data injection attacks on measurement data. In addition to addressing these areas individually, machine learning is used to simultaneously perform distribution system state estimation calculation and false data injection attack detection. This is done by taking advantage of conventional and smart measurement data at different time scales. The results reveal that the operation and performance of a distribution network are improved using machine learning algorithms, leading to more effective power grid modernization.
Item Improving the confidence of machine learning models through improved software testing approaches (Montana State University - Bozeman, College of Engineering, 2022) ur Rehman, Faqeer; Chairperson, Graduate Committee: Clemente Izurieta; This is a manuscript style paper that includes co-authored chapters.
Machine learning is gaining popularity in transforming and improving a number of different domains, e.g., self-driving cars, natural language processing, healthcare, manufacturing, retail, banking, and cybersecurity. However, because machine learning algorithms are computationally complex, verifying their correctness is challenging when the oracle is either unavailable or too expensive to apply. Software Engineering for Machine Learning (SE4ML) is an emerging research area that focuses on applying SE best practices and methods for better development, testing, operation, and maintenance of ML models. The focus of this work is on the testing aspect of ML applications, adapting traditional software testing approaches to improve confidence in them.
First, a statistical metamorphic testing technique is proposed to test Neural Network (NN)-based classifiers in a non-deterministic environment. Furthermore, a Metamorphic Relation (MR) minimization algorithm is proposed for the program under test, thus saving computational costs and organizational testing resources. Second, an MR is proposed to address a data generation/labeling problem; that is, enhancing the effectiveness of the test inputs by extending the prioritized test set with new tests without incurring additional labeling costs. Further, the prioritized test inputs are leveraged to propose a statistical hypothesis testing approach (for detection) and a machine learning-based approach (for prediction) of faulty behavior in two other machine learning classifiers, i.e., NN-based Intrusion Detection Systems. Finally, to test unsupervised ML models, the metamorphic testing approach is utilized to make some insightful contributions that include: i) proposing a broader set of 22 MRs for assessing the behavior of clustering algorithms under test, ii) providing a detailed analysis/reasoning to show how the proposed MRs can be used to target both the verification and validation aspects of testing the programs under investigation, and iii) showing that verifying an MR using multiple criteria is more beneficial than relying on a single criterion (i.e., clusters assigned). Thus, the work presented here makes a significant contribution toward addressing the gaps found in the field, enhancing the body of knowledge in the emergent SE4ML field.
Item Towards reduced-cost hyperspectral and multispectral image classification (Montana State University - Bozeman, College of Engineering, 2021) Morales Luna, Giorgio L.; Chairperson, Graduate Committee: John Sheppard
In recent years, Hyperspectral Imaging systems (HSI) have become a powerful source of reliable data in applications such as remote sensing, agriculture, and biomedicine.
However, the abundant spectral and spatial information of hyperspectral images makes them highly complex, which leads to the need for specialized Machine Learning algorithms to process and classify them. In that sense, the contribution of this thesis is multifold. We present a low-cost convolutional neural network designed for hyperspectral image classification called Hyper3DNet. Its architecture consists of two parts: a series of densely connected 3-D convolutions used as a feature extractor, and a series of 2-D separable convolutions used as a spatial encoder. We show that this design involves fewer trainable parameters than other approaches, without detriment to its performance. Furthermore, having observed that hyperspectral images benefit from methods to reduce the number of spectral bands while retaining the most useful information for a specific application, we present two novel hyperspectral dimensionality reduction techniques. First, we propose a filter-based method called Inter-Band Redundancy Analysis (IBRA) based on a collinearity analysis between a band and its neighbors. This analysis helps to remove redundant bands and dramatically reduces the search space. Second, we apply a wrapper-based approach called Greedy Spectral Selection (GSS) to the results of IBRA to select bands based on their information entropy values and train a compact Convolutional Neural Network to evaluate the performance of the current selection. We also propose a feature extraction framework that consists of two main steps: first, it reduces the total number of bands using IBRA; then, it can use any feature extraction method to obtain the desired number of feature channels. Finally, we use the original hyperspectral data cube to simulate the process of using actual filters in a multispectral imager.
Experimental results show that our proposed Hyper3DNet architecture in conjunction with our dimensionality reduction techniques yields better classification results than the compared methods, producing more suitable results for a multispectral sensor design.
Item Extracting abstract spatio-temporal features of weather phenomena for autoencoder transfer learning (Montana State University - Bozeman, College of Engineering, 2020) McAllister, Richard Arthur; Chairperson, Graduate Committee: John Sheppard
In this dissertation we develop ways to discover encodings within autoencoders that can be used to exchange information among neural network models. We begin by verifying that autoencoders can be used to make predictions in the meteorological domain, specifically for wind vector determination. We use unsupervised pre-training of stacked autoencoders to construct multilayer perceptrons to accomplish this task. We then discuss the role of our approach as an important step in positioning Empirical Weather Prediction as a viable alternative to Numerical Weather Prediction. We continue by exploring the spatial extensibility of the previously developed models, observing that different areas in the atmosphere may be influenced by unique forces. We use stacked autoencoders to generalize across an area of the atmosphere, expanding the application of networks trained in one area to the surrounding areas. As a prelude to exploring transfer learning, we demonstrate that a stacked autoencoder is capable of capturing knowledge universal to these dataspaces. Following this, we observe that in extremely large dataspaces, a single neural network covering that space may not be effective, and generating large numbers of deep neural networks is not feasible. Using functional data analysis and spatial statistics, we analyze deep networks trained from stacked autoencoders in a spatiotemporal application area to determine the extent to which knowledge can be transferred to similar regions.
Our results indicate a high likelihood that spatial correlation can be exploited if it can be identified prior to training. We then observe that artificial neural networks (ANNs), being essentially black-box processes, would benefit from effective methods for preserving knowledge for successive generations of training. We develop an approach to preserving knowledge encoded in the hidden layers of several ANNs and collect this knowledge in networks that more effectively make predictions over subdivisions of the entire dataspace. We show that this method has an accuracy advantage over the single-network approach. We extend the previously developed methodology, adding a non-parametric method for determining transferable encoded knowledge. We also analyze new datasets, focusing on the ability of models trained in this fashion to be transferred to other storms.
Item Large-scale automated human protein-phenotype relation extraction from biomedical literature (Montana State University - Bozeman, College of Engineering, 2020) Pourreza Shahri, Morteza; Chairperson, Graduate Committee: Indika Kahanda
Identifying protein-phenotype relations is of paramount importance for applications such as uncovering rare and complex diseases. Human Phenotype Ontology (HPO) is a recently introduced standardized vocabulary for describing disease-related phenotypic abnormalities in humans. While the official HPO knowledge base maintains known associations between human proteins and HPO terms, it is widely believed to be incomplete. However, due to the exponential growth of biomedical literature, timely manual curation is infeasible, creating a need for efficient and accurate computational tools for automated curation. In this work, we present HPcurator, a novel two-step framework for extracting relations between proteins and HPO terms from biomedical literature.
First, we implement ProPheno, a comprehensive online dataset composed of human protein-phenotype co-mentions extracted from the entire set of biomedical articles. Subsequently, we show that these co-mentions are useful as a complementary source of input for a different, but highly related, task of automated protein-phenotype prediction. Next, we develop a supervised machine learning model called PPPred, which, to the best of our knowledge, is the first predictive model that can classify the validity of a given sentence-level protein-phenotype co-mention. Using a gold standard dataset composed of manually curated sentence co-mentions, we demonstrate that PPPred significantly outperforms several baseline methods. Finally, we propose SSEnet, a novel deep semi-supervised ensemble framework for relation extraction that combines deep learning, semi-supervised learning, and ensemble learning. This framework is motivated by the fact that while the manual annotation of co-mentions is prohibitively expensive, we have access to millions of unlabeled co-mentions. We develop a prototype of HPcurator by instantiating SSEnet with ProPheno, self-learning, pre-trained language models, as well as convolutional and recurrent neural networks. This system can successfully output a ranked list of relevant sentences for a user-input protein-phenotype pair. Our experimental results indicate that this system provides state-of-the-art performance in human protein-HPO term relation extraction.
The findings and the insight gained from this work have implications for biocurators, biologists, and the computer science community involved in developing biomedical text mining tools.
Item Convolutional neural networks for multi- and hyper-spectral image classification (Montana State University - Bozeman, College of Engineering, 2019) Senecal, Jacob John; Chairperson, Graduate Committee: John Sheppard
While a great deal of research has been directed towards developing neural network architectures for classifying RGB images, there is a relative dearth of research directed towards developing neural network architectures specifically for multi-spectral and hyper-spectral imagery. The additional spectral information contained in a multi-spectral or hyper-spectral image can be valuable for land management, agriculture and forestry, disaster control, humanitarian relief operations, and environmental monitoring. However, the massive amounts of data generated by a multi-spectral or hyper-spectral instrument make processing this data a challenge. Machine learning and computer vision techniques could automate the analysis process of these rich data sources. With these benefits in mind, we have adapted recent developments in small efficient convolutional neural networks (CNNs) to create a small CNN architecture capable of being trained from scratch to classify 10-band multi-spectral images, using far fewer parameters than popular deep architectures, such as the ResNet or DenseNet architectures. We show that this network provides higher classification accuracy and greater sample efficiency than the same network using RGB images. We also show that it is possible to employ a transfer learning approach and use a network pre-trained on multi-spectral satellite imagery to increase accuracy on a second much smaller multi-spectral dataset, even though the satellite imagery was captured from a much different perspective (high altitude, overhead vs. ground-based at close stand-off distance). These results demonstrate that it is possible to train our small network architectures on small multi-spectral datasets and still achieve high classification accuracy. This is significant as labeled hyper-spectral and multi-spectral datasets are generally much smaller than their RGB counterparts. Finally, we approximate a Bayesian version of our CNN architecture using a recent technique known as Monte Carlo dropout. By keeping dropout in place during test time, we can perform a Monte Carlo procedure using multiple forward passes of our network to generate a distribution of network outputs, which can be used as a measure of uncertainty in the predictions a network is making. Large variance in the network output corresponds to high uncertainty and vice versa. We show that a network that is capable of working with multi-spectral imagery significantly reduces the uncertainty associated with class predictions compared to using RGB images. This analysis reveals that the benefits of an architecture that works effectively with multi-spectral or hyper-spectral imagery extend beyond higher classification accuracy. Multi-spectral and hyper-spectral imagery allows us to be more confident in the predictions that a deep neural network is making.
Item Predicting anticancer peptides and protein function with deep learning (Montana State University - Bozeman, College of Engineering, 2020) Lane, Nathaniel Patrick; Chairperson, Graduate Committee: Indika Kahanda
Anticancer peptides (ACPs) are a promising alternative to traditional chemotherapy. To aid wet-lab and clinical research, there is a growing interest in using machine learning techniques to help identify good ACP candidates computationally. In this work, we develop DeepACPpred, a novel deep learning model for predicting ACPs using their amino acid sequences.
Using several gold-standard ACP datasets, we demonstrate that DeepACPpred is highly effective compared to state-of-the-art ACP prediction models. Furthermore, we adapt the above neural network model for predicting protein function and report our experience with participating in a community-wide large-scale assessment of protein functional annotation tools.
Item Factored evolutionary algorithms: cooperative coevolutionary optimization with overlap (Montana State University - Bozeman, College of Engineering, 2017) Strasser, Shane Tyler; Chairperson, Graduate Committee: John Sheppard
Factored Evolutionary Algorithms (FEA) define a relatively new class of evolutionary-based optimization algorithms that have been successfully applied to various problems, such as training neural networks and performing abductive inference in graphical models. FEA is unique in that it factors the function being optimized by creating subpopulations that optimize over a subset of dimensions of the function. However, unlike other optimization techniques that subdivide optimization problems, FEA encourages subpopulations to overlap with one another, allowing subpopulations to compete and share information. Although FEA has been shown to be very effective at function optimization, there is still little understanding with respect to its general characteristics. In this dissertation, we present seven results exploring the theoretical and empirical properties of FEA. First, we present a formal definition of FEA and demonstrate its relationships to other multiple population algorithms. Second, we demonstrate that FEA's success is independent of the underlying optimization algorithm by evaluating the performance of FEA using a wide variety of evolutionary- and swarm-based algorithms over single-population and non-overlapping versions.
Third, we demonstrate that for a given problem, there is an optimal way to generate groups of overlapping subpopulations derived using the Markov blanket in Bayesian networks. Fourth, we establish that a class of optimization functions like NK landscapes can be mapped directly to probabilistic graphical models. Additionally, we demonstrate that factor architectures derived from Markov blankets maintain better diversity of individuals in their population. Fifth, we present a new discrete Particle Swarm Optimization (PSO) algorithm and compare its performance to competing approaches. In addition, we analyze the performance of FEA versions of discrete PSO and discover that FEA masks the poor performance of search algorithms. We show what conditions are necessary for FEA to converge and scenarios where FEA may become stuck in suboptimal regions in the search space. Finally, we explore the performance of FEA on unitation functions and discover several instances where FEA struggles to outperform single-population algorithms. These results allow us to determine which situations are appropriate for FEA when solving real-world problems.
Item Relating design process to design outcomes in engineering capstone projects (Montana State University - Bozeman, College of Engineering, 2003) Jain, Vikas Kewal
Item VLSI synthesis of digital application specific neural networks (Montana State University - Bozeman, College of Engineering, 1992) Beagles, Grant Philip