Utilizing distributions of variable influence for feature selection in hyperspectral images

dc.contributor.advisor: Chairperson, Graduate Committee: John Sheppard
dc.contributor.author: Walton, Neil Stewart
dc.date.accessioned: 2021-06-09T18:47:46Z
dc.date.available: 2021-06-09T18:47:46Z
dc.date.issued: 2019
dc.description.abstract: Optical sensing has been applied as an important tool in many different domains. Specifically, hyperspectral imaging has enjoyed success in a variety of tasks ranging from plant species classification to ripeness evaluation in produce. Although effective, hyperspectral imaging can be prohibitively expensive to deploy at scale. In the first half of this thesis, we develop a method to assist in designing a low-cost multispectral imager for produce monitoring by using a genetic algorithm (GA) that simultaneously selects a subset of informative wavelengths and identifies effective filter bandwidths for such an imager. Instead of selecting the single fittest member of the final population as our solution, we fit a univariate Gaussian mixture model to a histogram of the overall GA population, selecting the wavelengths associated with the peaks of the distributions as our solution. By evaluating the entire population, rather than a single solution, we are also able to specify filter bandwidths by calculating the standard deviations of the Gaussian distributions and computing the full-width at half-maximum values. In our experiments, we find that this novel histogram-based method for feature selection is effective when compared to both the standard GA and partial least squares discriminant analysis. In the second half of this thesis, we investigate how common feature selection frameworks such as feature ranking, forward selection, and backward elimination break down when faced with the multicollinearity present in hyperspectral data. We then propose two novel algorithms, Variable Importance for Distribution-based Feature Selection (VI-DFS) and Layer-wise Relevance Propagation for Distribution-based Feature Selection (LRP-DFS), that make use of variable importance and feature relevance, respectively. Both methods operate by fitting Gaussian mixture models to the plots of their respective scores over the input wavelengths and select the wavelengths associated with the peaks of each Gaussian component. In our experiments, we find that both novel methods outperform variable ranking, forward selection, and backward elimination and are competitive with the genetic algorithm over all datasets considered.
dc.identifier.uri: https://scholarworks.montana.edu/handle/1/16207
dc.language.iso: en
dc.publisher: Montana State University - Bozeman, College of Engineering
dc.rights.holder: Copyright 2019 by Neil Stewart Walton
dc.subject.lcsh: Optical spectroscopy
dc.subject.lcsh: Photography
dc.subject.lcsh: Distribution (Probability theory)
dc.subject.lcsh: Genetic algorithms
dc.subject.lcsh: Remote sensing
dc.subject.lcsh: Farm produce
dc.title: Utilizing distributions of variable influence for feature selection in hyperspectral images
dc.type: Thesis
mus.data.thumbpage: 23
thesis.degree.committeemembers: Members, Graduate Committee: David Millman; Joseph A. Shaw
thesis.degree.department: Computing.
thesis.degree.genre: Thesis
thesis.degree.name: MS
thesis.format.extentfirstpage: 1
thesis.format.extentlastpage: 125
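
Illustrative sketch (not taken from the thesis): the abstract describes fitting a univariate Gaussian mixture model to the wavelengths present in the GA population, taking each component's peak as a selected wavelength, and deriving a filter bandwidth from the component's standard deviation via the full-width at half-maximum, FWHM = 2 * sqrt(2 * ln 2) * sigma. The Python below is a minimal reading of that idea under stated assumptions: the population is synthetic, the number of components is fixed in advance, the mixture is fitted with a plain expectation-maximization loop, and the names fit_gmm_1d and select_bands are hypothetical. The thesis may differ on all of these points.

```python
import math
import random


def fit_gmm_1d(samples, k, iters=200):
    """Fit a k-component univariate Gaussian mixture with plain EM.

    Assumption: k is known in advance; the abstract does not say how the
    number of components is chosen.
    """
    xs = sorted(samples)
    n = len(xs)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [var_all / k] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # keep variances strictly positive
    return weights, means, variances


def select_bands(samples, k):
    """Return sorted (center wavelength, FWHM bandwidth) pairs, one per component."""
    _, means, variances = fit_gmm_1d(samples, k)
    fwhm_factor = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.3548
    return sorted((m, fwhm_factor * math.sqrt(v)) for m, v in zip(means, variances))


# Synthetic stand-in for the pooled wavelengths of a final GA population
# (hypothetical values in nm, not data from the thesis).
rng = random.Random(0)
population = [rng.gauss(550.0, 10.0) for _ in range(300)]
population += [rng.gauss(700.0, 20.0) for _ in range(300)]

bands = select_bands(population, k=2)
for center, fwhm in bands:
    print(f"center = {center:.1f} nm, FWHM = {fwhm:.1f} nm")
```

Each returned pair gives a candidate filter center and bandwidth. The same peak-picking step could in principle be applied to a curve of variable-importance or relevance scores over wavelength, as the abstract describes for VI-DFS and LRP-DFS, although those scores would first need to be treated as a weighting over wavelengths, which this sketch does not do.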

Files

Original bundle

Name: walton-utilizing-distributions-2019.pdf
Size: 3.63 MB
Format: Adobe Portable Document Format
Description: Utilizing distributions of variable influence for feature selection in hyperspectral images (PDF)
