Improving the confidence of machine learning models through improved software testing approaches

dc.contributor.advisor: Chairperson, Graduate Committee: Clemente Izurieta [en]
dc.contributor.author: ur Rehman, Faqeer [en]
dc.contributor.other: This is a manuscript style paper that includes co-authored chapters. [en]
dc.date.accessioned: 2023-04-11T21:21:58Z
dc.date.available: 2023-04-11T21:21:58Z
dc.date.issued: 2022 [en]
dc.description.abstract: Machine learning (ML) is increasingly used to transform and improve many domains, e.g., self-driving cars, natural language processing, healthcare, manufacturing, retail, banking, and cybersecurity. However, because machine learning algorithms are computationally complex, verifying their correctness is challenging when a test oracle is either unavailable or too expensive to apply. Software Engineering for Machine Learning (SE4ML) is an emerging research area that focuses on applying software engineering (SE) best practices and methods to the development, testing, operation, and maintenance of ML models. This work focuses on the testing of ML applications, adapting traditional software testing approaches to improve confidence in them. First, a statistical metamorphic testing technique is proposed to test Neural Network (NN)-based classifiers in a non-deterministic environment. Furthermore, a Metamorphic Relation (MR) minimization algorithm is proposed for the program under test, saving computational costs and organizational testing resources. Second, an MR is proposed to address a data generation/labeling problem; that is, enhancing the effectiveness of the test inputs by extending the prioritized test set with new tests without incurring additional labeling costs. Further, the prioritized test inputs are leveraged to propose a statistical hypothesis testing approach (for detection) and a machine learning-based approach (for prediction) of faulty behavior in two other machine learning classifiers, i.e., NN-based Intrusion Detection Systems.
Finally, to test unsupervised ML models, the metamorphic testing approach is used to make contributions that include: i) proposing a broader set of 22 MRs for assessing the behavior of the clustering algorithms under test, ii) providing a detailed analysis and reasoning to show how the proposed MRs can target both the verification and validation aspects of testing the programs under investigation, and iii) showing that verifying an MR using multiple criteria is more beneficial than relying on a single criterion (i.e., clusters assigned). Thus, the work presented here makes a significant contribution toward addressing the gaps found in the field, enhancing the body of knowledge in the emerging SE4ML field. [en]
dc.identifier.uri: https://scholarworks.montana.edu/handle/1/17614
dc.language.iso: en [en]
dc.publisher: Montana State University - Bozeman, College of Engineering [en]
dc.rights.holder: Copyright 2022 by Faqeer ur Rehman [en]
dc.subject.lcsh: Machine learning [en]
dc.subject.lcsh: Testing [en]
dc.subject.lcsh: Software engineering [en]
dc.subject.lcsh: Neural networks (Computer science) [en]
dc.title: Improving the confidence of machine learning models through improved software testing approaches [en]
dc.type: Dissertation [en]
mus.data.thumbpage: 71 [en]
thesis.degree.committeemembers: Members, Graduate Committee: John Paxton; Mike Wittie; Travis Peters [en]
thesis.degree.department: Computing. [en]
thesis.degree.genre: Dissertation [en]
thesis.degree.name: PhD [en]
thesis.format.extentfirstpage: 1 [en]
thesis.format.extentlastpage: 172 [en]

Files

Original bundle

Name: ur-rehman-improving-2022.pdf
Size: 1.62 MB
Format: Adobe Portable Document Format
Description: Improving the confidence of machine learning models through improved software testing approaches (PDF)
