Adapting archetypal analysis to scientific imaging applications
Date
2022
Publisher
Montana State University - Bozeman, College of Letters & Science
Abstract
Scientific imaging applications create large sets of high-dimensional data, which may be difficult to process using traditional supervised machine learning representative models. First, many representative models generate computational elements that are difficult to interpret in terms of the scientific application. Second, the high embedding dimension of the images often makes generating the models computationally inefficient. We propose using archetypal analysis (AA) as the representative model for these scientific imaging problems, since its computational elements, the so-called archetypes, resemble members of the original dataset. Specifically, the archetypes are generated as extreme points of an approximation of the convex hull of the data cloud, which means they maintain the structure of individual data points. To reduce the computational cost of generating the AA model, we propose a sketch-based AA method that projects the data to a lower embedding dimension before calculating the computational elements. This lowers computation time for these high-dimensional problems while retaining enough of the geometric structure that the computational elements closely match the results of AA. We also applied a primal-dual hybrid gradient (PDHG) solver to the AA algorithm structure in an attempt to speed up computation. To verify the significance of the interpretation of AA, we applied AA to transient fluorescent calcium images, recorded as videos in the Kunze Neuroengineering Lab, to determine whether adding different nanoparticles changed the way the neurons in culture communicate. We also applied our sketch-based AA method to other kinds of imaging data sets, exploring the differences between our method and the standard AA method.
Our experimentation shows the different ways that AA can be adapted to scientific imaging applications, providing a machine learning representation model that is interpretable in the context of the imaging problem, and verifies the benefits of the sketch-based method in terms of computation time.
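The two ideas in the abstract, archetypes as convex combinations of data points and sketching to a lower embedding dimension before fitting, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names (`project_rows_to_simplex`, `archetypal_analysis`, `sketched_archetypes`), the Gaussian random projection used as the sketch, and the alternating projected-gradient solver, which stands in for whatever solver the author used and is not the PDHG solver mentioned above. Data points are rows of `X`, archetypes are `Z = B @ X`, and each data point is approximated by a convex combination `A @ Z`.

```python
import numpy as np


def project_rows_to_simplex(M):
    """Euclidean projection of each row of M onto the probability simplex."""
    n, k = M.shape
    U = np.sort(M, axis=1)[:, ::-1]              # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    cond = U - css / np.arange(1, k + 1) > 0
    rho = k - np.argmax(cond[:, ::-1], axis=1)   # 1-based position of the last True per row
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(M - theta[:, None], 0.0)


def archetypal_analysis(X, k, n_iter=200, seed=0):
    """Approximately minimize 0.5 * ||X - A @ B @ X||_F^2 with the rows of
    A (n x k) and B (k x n) constrained to the simplex.
    Illustrative alternating projected gradient, not the thesis's solver."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_rows_to_simplex(rng.random((n, k)))
    B = project_rows_to_simplex(rng.random((k, n)))
    x_norm2 = np.linalg.norm(X, 2) ** 2          # squared largest singular value of X
    for _ in range(n_iter):
        Z = B @ X                                # current archetypes
        step_A = 1.0 / max(np.linalg.norm(Z, 2) ** 2, 1e-12)
        A = project_rows_to_simplex(A - step_A * (A @ Z - X) @ Z.T)
        step_B = 1.0 / max(np.linalg.norm(A, 2) ** 2 * x_norm2, 1e-12)
        B = project_rows_to_simplex(B - step_B * A.T @ (A @ Z - X) @ X.T)
    return A, B


def sketched_archetypes(X, k, m, n_iter=200, seed=0):
    """Fit AA on a random m-dimensional sketch of X, then form the archetypes
    in the original space. The Gaussian sketch is an assumed choice."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((X.shape[1], m)) / np.sqrt(m)
    A, B = archetypal_analysis(X @ S, k, n_iter=n_iter, seed=seed)
    return A, B, B @ X                           # archetypes are mixtures of original rows


# Usage on synthetic data: 60 "images" with 50 pixels each, sketched to 10 dimensions.
X_demo = np.random.default_rng(1).random((60, 50))
A_demo, B_demo, Z_demo = sketched_archetypes(X_demo, k=3, m=10)
```

Because the weights in `B` are computed in the sketched space but applied to the unsketched rows of `X`, each returned archetype is still a convex combination of original data points, which is the property that makes archetypes interpretable as images. The solver only ever sees the `m`-dimensional sketch, so the per-iteration cost depends on `m` rather than on the original embedding dimension.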