Statistical methodology for biological signals in the presence of measurement uncertainty
Barbour, Christopher Robert
In recent years, increasing amounts of complex biological data have been collected on patients in many branches of medical research. Many of these signals are measured with some degree of imprecision. Two areas of multiple sclerosis (MS) research where this arises are clinical scale development and proteomics analysis. Scales are often constructed from multiple outcome measures to create a combined metric that measures the true trait of interest better than any of the original components. When the goal is a scale that is sensitive to change over time, developing it from cross-sectional data may not tune the projection optimally for detecting that change. The proposed methodology, termed the Constructed Composite Response (CCR), was developed to maximize detected longitudinal change. A simulation study and an analysis of a motivating dataset demonstrated that the CCR methodology captures longitudinal change better than traditional techniques. Including sparsifying constraints, motivated by penalized regression models, improved the performance of the CCR in high-dimensional data. In proteomics data, undesirable sources of variation are often present; examples include temporal fluctuation in control samples and technical variability across multiple assay runs. When developing a molecular classifier of MS, a novel variable screening procedure was implemented to eliminate proteins with high levels of this unwanted variation. A simulation study compared this procedure with traditional screening approaches, and the findings are discussed. Future extensions and directions of research are also discussed.
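As a rough illustration of what tuning a composite to longitudinal change can look like, the sketch below chooses weights that maximize the standardized mean change of a weighted sum of outcome measures between two visits. It is not the CCR algorithm itself, whose details are not given in this abstract: the closed-form objective, the function names, the small ridge term, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def change_maximizing_weights(baseline, followup, ridge=1e-6):
    """Hypothetical helper: weights w that maximize
    mean(w @ d) / sd(w @ d), where d = followup - baseline per patient.
    The maximizer is proportional to inv(cov(d)) @ mean(d)."""
    diffs = np.asarray(followup, dtype=float) - np.asarray(baseline, dtype=float)
    mean_change = diffs.mean(axis=0)
    cov = np.atleast_2d(np.cov(diffs, rowvar=False))
    # Small ridge term (an assumption) keeps the solve stable if cov is near-singular.
    cov = cov + ridge * np.eye(cov.shape[0])
    w = np.linalg.solve(cov, mean_change)
    return w / np.linalg.norm(w)

def standardized_change(weights, baseline, followup):
    """Mean within-patient change of the composite divided by its standard deviation."""
    d = (np.asarray(followup, dtype=float) - np.asarray(baseline, dtype=float)) @ weights
    return d.mean() / d.std(ddof=1)

# Simulated example (not real patient data): four outcome measures,
# only the first two of which truly change between visits.
rng = np.random.default_rng(0)
n_patients, n_measures = 200, 4
baseline = rng.normal(size=(n_patients, n_measures))
true_shift = np.array([0.5, 0.3, 0.0, 0.0])
followup = baseline + true_shift + rng.normal(size=(n_patients, n_measures))

w_opt = change_maximizing_weights(baseline, followup)
w_equal = np.ones(n_measures) / np.sqrt(n_measures)

score_opt = standardized_change(w_opt, baseline, followup)
score_equal = standardized_change(w_equal, baseline, followup)
print("change-tuned composite:", score_opt)
print("equal-weight composite:", score_equal)
```

In-sample, the change-tuned weights cannot score lower than the equal-weight composite (up to the ridge perturbation), because they maximize exactly this criterion. A fair comparison would evaluate both on held-out patients, and a sparsity-inducing penalty on the weights would be the natural analogue of the sparsifying constraints mentioned above.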