A customizable artificial auditory fovea
Casebeer, Christopher Ness
We, as humans, can separate and attend to individual audio sources within mixtures of sound and noise. We can listen distinctly to a friend at a party amid a sea of background noise and conversations. Human auditory neurology exceeds even state-of-the-art audio algorithms. How are we able to do this? This dissertation takes inspiration from biology to frame a novel audio processing front-end. Neurobiology shows that auditory neurons isolate signal onsets, timing, frequency, amplitude, and modulation characteristics. Why, then, do many standard processing methods discard this information, or assume that machine learning will extract it regardless of the input processing? This dissertation applies time-frequency analysis principles to build a new front-end aimed at preserving the fine temporal and spectral details of the original signal, improving audio system detection and recognition. The system retains the fine frequency and time characteristics of a signal during analysis, while allowing customization of how much resolution is kept and where. Like biology, this front-end can dedicate resources to detecting important signal events: it can over-represent, or foveate, regions of the time-frequency plane that are important to the signal processing task at hand. These fine details are hypothesized to help audio learning algorithms detect the nuances that distinguish musical instruments, characterize a specific person's voice, or even detect a person's emotional state. This customizable auditory fovea aims to mimic the powerful detection capability found in biology, in contrast to standard methods in audio signal processing.
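As a rough illustration of the foveation idea described above, the sketch below concentrates frequency resolution inside one chosen band (the "fovea") by analyzing it with a long window, while covering the rest of the time-frequency plane with a short window that preserves fine time detail. This is a minimal assumption-laden sketch, not the dissertation's actual front-end: the function names, the two-window scheme, and the band choice are all illustrative.

```python
import numpy as np

def stft_mag(x, win_len, hop):
    """Magnitude STFT with a Hann window; rows are frames, columns are bins."""
    win = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop : i * hop + win_len] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def foveated_analysis(x, fs, band, long_win=1024, short_win=128, hop=64):
    """Hypothetical foveated front-end sketch.

    Inside `band` (Hz) the signal is analyzed with a long window, giving
    many fine frequency bins (the fovea); everywhere else a short window
    gives fine time resolution (the periphery).
    """
    fine = stft_mag(x, long_win, hop)     # fine frequency resolution
    coarse = stft_mag(x, short_win, hop)  # fine time resolution
    freqs = np.fft.rfftfreq(long_win, 1.0 / fs)
    lo, hi = np.searchsorted(freqs, band)
    return fine[:, lo:hi], coarse, (lo, hi)

# Example: foveate roughly the speech band (300-3400 Hz) of a noisy tone.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.random.randn(fs)
fovea, periphery, (lo, hi) = foveated_analysis(x, fs, (300.0, 3400.0))
```

Here the foveated band receives far more frequency bins per frame than the periphery, mirroring how resolution can be dedicated to the region that matters for the task.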