Fasy, Brittany; Millman, David; Micka, Samuel; Padula, Luke; Makarchuk, Maksym
2024-07-23
2024-07-23
2024-04
https://scholarworks.montana.edu/handle/1/18691

Increasingly, topological descriptors like the Euler characteristic curve and persistence diagrams are utilized to represent complex data. Recent studies suggest that a meticulously selected set of these descriptors can encode geometric and topological information about shapes in d-dimensional space. In practical applications, epsilon-nets are employed to sample data, presenting two extremes: oversampling, where epsilon is small enough to ensure a comprehensive representation but may lead to computational inefficiencies, and undersampling, where epsilon lacks a grounded rationale, offering faster computations but risking an incomplete shape description without theoretical guarantees. This research investigates phenomena of oversampling and undersampling, delving into their prevalence across synthetic and real-world datasets. It experimentally verifies excessive oversampling in theory-guided approaches and examines the implications of undersampling, shedding light on the behavior and consequences of both extremes. We establish lower bounds on the number of descriptors required for exact encodings and explore the trade-offs associated with undersampling, contributing insights into the potential information loss and the resulting impact on the overall shape representation.

en-US
Copyright 2024
Sampling Bounds for Topological Descriptors
Presentation
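
As a rough illustration of the setting the abstract describes, the sketch below computes Euler characteristic curves of a small simplicial complex in several directions. It is not the authors' method: the function names `euler_characteristic_curve` and `circle_directions` are hypothetical, the lower-star (height) filtration is an assumed choice of descriptor construction, and evenly spaced directions on the unit circle stand in for an epsilon-net (with n directions, every direction lies within angle pi/n of a sample).

```python
import math


def euler_characteristic_curve(vertices, simplices, direction):
    """Return sorted (height, chi) pairs for the lower-star filtration.

    Assumption: a simplex enters the filtration at the height of its
    highest vertex, where height is the dot product with `direction`.
    """
    heights = [sum(c * d for c, d in zip(v, direction)) for v in vertices]
    events = {}
    for simplex in simplices:
        entry = max(heights[i] for i in simplex)
        # A k-simplex has k + 1 vertices and contributes (-1)^k to chi.
        sign = 1 if len(simplex) % 2 == 1 else -1
        events[entry] = events.get(entry, 0) + sign
    curve, chi = [], 0
    for h in sorted(events):
        chi += events[h]
        curve.append((h, chi))
    return curve


def circle_directions(n):
    """n evenly spaced unit vectors on the circle (hypothetical sampler)."""
    return [
        (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


# Toy shape: one filled triangle in the plane.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]

# One descriptor per sampled direction; fewer directions means a coarser net.
descriptors = [
    euler_characteristic_curve(vertices, simplices, d)
    for d in circle_directions(8)
]
```

Increasing the argument of `circle_directions` corresponds to a smaller epsilon and more descriptors to compute, which is the oversampling cost the abstract refers to; decreasing it corresponds to the undersampled regime.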