Machine learning pipeline for rare-event detection in synthetic-aperture radar and LIDAR data
Scofield, Trey Palmer
In this work, we develop a machine learning pipeline to autonomously classify synthetic aperture radar (SAR) and lidar data in rare-event remote sensing applications. Specifically, we predict the presence of volcanoes on the surface of Venus, fish in Yellowstone Lake, and select marine life in the Gulf of Mexico. Because SAR images collected from space and airborne lidar geographical surveys can be acquired so efficiently, the resulting datasets are immense. Large training sets are desirable for machine learning models; however, the vast majority of the data we use contain no volcanoes or fish. Thus, the machine learning models must be formulated so that they place a high emphasis on the minority target classes. The developed pipeline includes data preprocessing, unsupervised clustering, feature extraction, and classification. For each collection of data, sub-images are first fed through the pipeline to capture fine-detail characteristics and are then mapped back to their original image to identify overall region behavior and the location of the target class(es). For both sub-images and original images, results were quantified and the most effective algorithm combinations and parameters were selected. From this analysis, we determined that the classification results are not sufficient to support a completely autonomous system; some manual inspection of the data will still be needed. Nonetheless, the pipeline serves as an effective tool to reduce the costs associated with electronic storage and transmission of the data, as well as the human labor of manually inspecting the data. It does this by removing, in some cases, a majority of the unimportant non-target data while successfully retaining a high percentage of the important images.
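The trade-off described in the closing sentences, discarding non-target data while retaining most target images, can be illustrated with a minimal sketch. This is not the pipeline developed in this work: it assumes some classifier has already produced a score per image, and the function names, scores, and labels below are hypothetical. It picks the highest score threshold that still keeps a chosen fraction of the target images, then reports how much non-target data that threshold removes.

```python
import math


def threshold_for_recall(scores, labels, min_recall):
    """Return the highest threshold that keeps at least min_recall of targets.

    scores: hypothetical classifier scores (higher means more target-like).
    labels: 1 for a target image, 0 for a non-target image.
    """
    target_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not target_scores:
        raise ValueError("no target examples to calibrate on")
    # Number of target images that must be kept to reach min_recall.
    k = max(1, math.ceil(min_recall * len(target_scores)))
    return target_scores[k - 1]


def filter_summary(scores, labels, threshold):
    """Report target recall and the fraction of non-target data removed."""
    kept = [y for s, y in zip(scores, labels) if s >= threshold]
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    kept_target = sum(kept)
    kept_nontarget = len(kept) - kept_target
    return {
        "recall": kept_target / n_target,
        "nontarget_removed": 1 - kept_nontarget / n_nontarget,
    }


# Made-up example: 3 target images among 10 (the rare class).
scores = [0.9, 0.8, 0.2, 0.1, 0.3, 0.05, 0.7, 0.15, 0.4, 0.6]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

threshold = threshold_for_recall(scores, labels, min_recall=1.0)
summary = filter_summary(scores, labels, threshold)
print(threshold, summary)
```

In this toy case every target image is kept while five of the seven non-target images are dropped; lowering `min_recall` raises the threshold and removes more non-target data at the cost of missed targets, which is the kind of decision that would still call for some manual review.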