THE RELATIONSHIP BETWEEN PHYSIOLOGICAL STRESS RESPONSE AND VARIATION IN OMICS DATA

by
Katherine Fay Steward

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biochemistry

MONTANA STATE UNIVERSITY
Bozeman, Montana
April, 2021

©COPYRIGHT by Katherine Fay Steward 2021
All Rights Reserved

DEDICATION

For Piper, Belle and Velcro. The emotional support you have provided me is immeasurable.

ACKNOWLEDGEMENTS

I would like to thank Dr. Brian Bothner, whose patience, guidance and constant faith helped lead me to this point. I appreciate every pep talk more than you know. To my committee, I have learned lifelong lessons from each of you and I appreciate your help and support over the past five years. Thanks to the Bothner Lab, past and present: doctors, grads and undergrads; team Bothner is the best. To my collaborators, I truly believe that science is a team effort, thanks for being on a team with me. Last, but never least, the staff of the MSU Chemistry Department; they always had the answers for an anxious grad student.

TABLE OF CONTENTS

1. INTRODUCTION AND BACKGROUND
    Background
    Metabolomics
        Metabolomics Approaches and Techniques
        Applications of Metabolomics
        Limitations of Metabolomics
    Proteomics
        Proteomics Applications
        Limitations of Proteomics
    Cellular Stress Response
        Measuring CSR
    Research Goals
    References

2. METABOLIC IMPLICATIONS OF USING BIOORTHOGONAL NON-CANONICAL AMINO ACID TAGGING (BONCAT) FOR TRACKING PROTEIN SYNTHESIS
    Contribution of Authors and Co-Authors
    Manuscript Information Page
    Abstract
    Introduction
    Materials and Methods
        Reagents
        Cell Culturing
        Metabolite Extraction
        LCMS Instrumentation and Metabolite Analysis
        Statistical Analysis of MS Data
        Sample Preparation and NMR Analysis
        Statistical Analysis of NMR Data
    Results and Discussion
        Mass Spectrometry-Based Metabolomics of Non-Canonical Amino Acids
        NMR Metabolite Profiles of E. coli Grown in the Presence of Non-Canonical Amino Acids
        E. coli Grown With Non-Canonical Amino Acids Under Heat Stress
        MS Metabolomics of Cultures Grown With Non-Canonical Amino Acids Under Heat Stress
        NMR Metabolomics Analysis of Cultures Grown With Non-Canonical Amino Acids Under Heat Stress
        Combined Pathway Analysis of Heat Stress E. coli Cultures
        Screening for Potential Degradation Products of AHA and HPG
    Summary
    Data Availability Statement
    Author Contributions
    Funding
    Acknowledgements
    Supplementary Material Statement
    References

3. ACUTE STRESS REDUCES POPULATION-LEVEL METABOLIC AND PROTEOMIC VARIATION
    Contribution of Authors and Co-Authors
    Manuscript Information Page
    Abstract
    Introduction
    Results
        Analysis of Public Omics Data Sets
        Proteomics Data
        Exceptions to the Model
        Simulations
    Discussion
    Methods
        Metabolomics Analysis of Heat Shocked Avena Fatua
        Proteomic Analysis of Escherichia coli Grown Under Aerobic or Anaerobic Conditions
        Mining of Public Data
        Statistical Analysis
        Simulated Data Analysis
    References

4. PROBING MECHANISMS OF REDUCTIVE PYRITE DISSOLUTION IN METHANOCOCCUS VOLTAE BY PROTEOMICS
    Contribution of Authors and Co-Authors
    Manuscript Information Page
    Abstract
    Introduction
    Methods
        Cell Culture Conditions
        Cultivation Procedures
        Protein Extraction
        Proteomics Analysis
        Data Analysis
        Statistical Analysis
    Results
        Global Intracellular Proteomics
        Chemical and Functional Analysis of the Proteome
        Iron Binding Proteins
        Conserved Proteins and Oxidoreductases
        Membrane Proteins
        Extracellular Proteins
        Stressed Phenotype Analysis
    Discussion
    References

5. Concluding Remarks
    References

REFERENCES CITED

APPENDICES
    APPENDIX A: Supplemental Material for Chapter Two
    APPENDIX B: Supplemental Material for Chapter Three
    APPENDIX C: Supplemental Material for Chapter Four

LIST OF TABLES

4.1 RVA and MIF protein summary

LIST OF FIGURES

1.1 Hierarchy of Omics Analysis
1.2 Metabolomics Keyword Growth
1.3 Cellular Stress Response
1.4 Variation in Metabolomics Data
1.5 The Who, What and Why of RVA
2.1 2D-PCA Plots of All Experimental Conditions from E. coli NCAA Experiment
2.2 Heatmaps of Treatment Groups Clustered on Metabolite Intensity from E. coli NCAA Experiment
2.3 2D PCA Plot of MS Features and NMR Features of Heat Stressed E. coli Cultures
2.4 NMR Heatmaps of Heat Stressed E. coli Cultures
2.5 3D-PLSDA of Metabolites as Identified by NMR from Heat Stressed NCAA doped E. coli cultures and Corresponding VIP Scores Table
3.1 Metabolic Variation in Response to Hemorrhagic Shock in a Mammal
3.2 Metabolic Variation in E. coli Treated With NCAA
3.3 Distribution of CV in A. fatua and Temporal RVA Analysis
3.4 RVA of Proteomics Data and Simulation Analysis
4.1 Global Proteomics
4.2 GO Pathway Proteins
4.3 Volcano Plots
4.4 Extracellular Protein Pools
4.5 RVA of M. voltae
5.1 RVA Focuses Data Analysis

ABSTRACT

Omics analysis is the cornerstone of systems biology.
It offers comprehensive assessments of stress, interaction networks and connections to phenotype. Defining a stressed phenotype can be challenging, however, as stress response mechanisms can arise from a range of environmental conditions and experimental perturbations. Previous work from our lab noted the possibility of a relationship between stress in omics data and the variation of that data. This connection has yet to be clearly defined, and the cellular mechanisms responsible for the canalization of omics data remain a mystery. In this work I have taken advantage of the sensitivity of metabolomics and proteomics to detect cellular stress and characterize its relationship to variation. By utilizing the coefficient of variation (CV) as a statistic of merit, the depth of the relationship between stress and variation can be uncovered. Once the model was clearly defined, a proteomics dataset with a large proportion of protein coverage was utilized to investigate what pathways might be responsible for the metabolite and protein canalization.

CHAPTER ONE

INTRODUCTION AND BACKGROUND

Background

Systems biology is the study of the complex biological networks that make up an organism [1]. It utilizes a multi-disciplinary approach to probe processes and regulation of biochemical networks of a whole system, rather than through the reductionist techniques of traditional approaches [2]. Omics analyses have enabled holistic research and are the cornerstone of systems biology studies. The omics approaches most commonly utilized are genomics, transcriptomics, proteomics and metabolomics (FIGURE 1.1). As the prefix for each implies, each omics approach is the examination of all the genes, transcripts, proteins or metabolites in a system [1]. Through the use of these techniques, unexpected properties of essential cellular function and networks can be revealed [3].
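As a concrete illustration of the coefficient of variation (CV) that this work uses as its statistic of merit, the sketch below computes a per-feature CV (sample standard deviation divided by the mean) across replicate measurements. It is a minimal sketch only: the function names, feature names and intensity values are hypothetical and invented for illustration, and it does not reproduce the analysis pipeline used in later chapters.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV (standard deviation / mean) of replicate intensities."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / mu


def per_feature_cv(table):
    """Map each feature name to the CV of its replicate intensities."""
    return {name: coefficient_of_variation(vals) for name, vals in table.items()}


# Hypothetical replicate intensities (arbitrary units), invented for illustration.
control = {"feature_A": [100.0, 120.0, 80.0], "feature_B": [50.0, 55.0, 45.0]}
stressed = {"feature_A": [100.0, 105.0, 95.0], "feature_B": [50.0, 51.0, 49.0]}

control_cv = per_feature_cv(control)
stressed_cv = per_feature_cv(stressed)

for name in control:
    print(f"{name}: control CV = {control_cv[name]:.3f}, "
          f"stressed CV = {stressed_cv[name]:.3f}")
```

Because the CV is unitless, features with very different absolute intensities can be compared on the same scale, which is one reason it is convenient for comparing variation between treatment groups.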
Global analyses, such as metabolomics and proteomics, enable researchers to characterize an organism’s integrated response to perturbations like stress, nutrient deficiencies, or environmental changes. These descriptions help outline a phenotype that can be utilized to define disease states in human health, agriculture and adaptive response studies.

Metabolomics

Metabolomics is of particular importance when characterizing phenotype. The products and intermediates of metabolism and regulatory processes within the cell, biofluids or tissues are the small molecules that make up the metabolome [4]. The study and quantitative measure of metabolites in response to stress, environmental stimulus or genetic alterations yields the comprehensive study of metabolism, which helps describe a phenotype [5].

Figure 1.1. Hierarchy of omics analysis: displaying name of omics approach, what is analyzed, how typical analyses are done and applications of those studies.

Metabolomics is considered the most dynamic of the omics approaches, as it can measure metabolite catabolism or genesis on a timescale of seconds, whereas upstream omics approaches, like transcriptomics, measure changes in mRNA degradation or production that can take hours within a living system [6]. Because it captures early changes and adjustments made within a cell, metabolomics is integral to identifying biomarkers of stress and disease [3]. One example is the identification of oncometabolites such as sarcosine, choline, succinate and glucose, along with others, that have been associated with tumor growth and serve as biomarkers for leukemia and other cancers [4].

The field of metabolomics has advanced at a rapid pace since its introduction in 1998 (FIGURE 1.2). In 2005 the METLIN database was introduced, categorizing hundreds of thousands of metabolites [2]. With technological advances and the use of online databases like METLIN, metabolomics will continue to push the boundaries of biochemical analysis.
Figure 1.2. Metabolomics Keyword Growth: Bar chart showing number of publications per year (2000-2021) with searchable keyword “metabolomics”. Year 2021 includes January-March 20, 2021.

Metabolomics Approaches and Techniques

There are two basic approaches to metabolomics: untargeted and targeted. Untargeted analyses aim to investigate all the metabolic constituents of a cellular system or locale without focusing on specific metabolites, while targeted analyses investigate predetermined specific metabolites in a system.

Untargeted metabolomics analysis is generally thought of as the “discovery” arm of metabolomics and qualitatively analyzes hundreds to thousands of metabolites. This approach utilizes samples from biofluids, tissue or cells to detect small molecules in the samples and make relative comparisons between treatment groups [7]. This comprehensive analysis of the metabolome offers generation of hypotheses, qualitative identification, and relative quantification of the system under study [8]. Untargeted metabolomics often utilizes databases like METLIN to identify unknowns, or metabolite features. Although metabolomics analysis is rapidly progressing and databases have been significantly expanded, the identification of these unknown features is considered the bottleneck of untargeted approaches [7].

Targeted metabolomics is the hypothesis driven “validation” arm and focuses on a defined set of metabolites from biological samples. The metabolites to be analyzed are determined by the biological question or by the analytical library available [2]. The targeted approach offers absolute quantification of the analytes of interest, although these are limited in number, usually tens to hundreds [4].
The two arms of metabolomics can be utilized together, when possible, to yield the most comprehensive data.

There are two primary techniques used for metabolomics analysis: Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR). Mass spectrometry measures the mass to charge ratio (m/z) of ions, and this measurement allows the exact molecular weight of a molecule to be calculated [1]. Mass spectrometry has been in use since the early 20th century. Originally conceived during the “hunt for the electron”, MS technology has seen decades of advancements [9]. During this time, MS spectra analysis was described and characterized in detail, and its use for biological applications was expanded. In 1988 electrospray ionization (ESI) was introduced, which allowed for nonvolatile analyte detection and quantitation utilizing MS [10]. The introduction of ESI allowed for coupling of MS to liquid chromatography (LC). LC-ESI uses a liquid mobile phase and columns packed with stationary phase particles so that analytes elute at different times. The eluent is ionized through ESI and sent into the MS for detection and quantification [11]. LC-MS is one major analytical technology utilized for metabolomics analysis; it provides detection and quantitation of thousands of metabolite features at trace level amounts [2]. LC-MS offers analysis of both polar and nonpolar metabolites, and allows complex biological matrices like blood, plasma and urine to be analyzed with fast and reliable methods.

Nuclear Magnetic Resonance Spectroscopy (NMR) is the other major analytical approach used for metabolomics; it measures the local magnetic field around atomic nuclei [12]. The excitation of the nuclei is measured, which yields details about the electronic structure of the molecule, and thus the structure and functional groups of the compound can be determined [13].
In 1984, human urine samples were characterized using NMR, and during the past 20 years the use of NMR for metabolomics analysis has expanded exponentially [14]. NMR offers several advantages for metabolomics: it is highly reproducible, it can identify unknown molecules through structure elucidation, and it does not consume the biological samples being analyzed. NMR can be used when compounds can’t be ionized, as well as to differentiate compounds with identical masses. Detection limits for NMR are much higher than for LC-MS, however, and a typical NMR analysis quantifies only 50-200 metabolites [15]. Ideally, multi-platform (LC-MS and NMR) approaches for metabolomics analysis are utilized.

Applications of Metabolomics

The large amount of data generated from metabolomics studies ideally results in the identification of many metabolites and a snapshot of the metabolic machinery at the time of sampling. Although the archetypal metabolomics dataset will have many identified compounds, in MS based analysis there are often still many uncharacterized features. Through the use of pattern recognition statistics, global changes in the metabolome can be described in addition to the annotated pathway maps [2]. This enables researchers to profile the metabolite changes in relation to the biological phenotype under study, which should also lead to the discovery of biomarkers indicative of disease, stress or toxicity.

Personalized medicine has recently become a popular approach to determining an individual’s disease risk and treatment, and even dietary restrictions and weight loss plans [16]. Personalized medicine based not only on the genomic makeup, but on the metabolome as well, offers more accurate prediction of diseases like arteriosclerosis, cancer and diabetes [17].
Recent biomarker discovery advances in metabolomics cover a wide range, from dysregulation of lysine and phenylalanine metabolism intermediates indicative of renal cell carcinoma [18], to metabotyping for autism spectrum disorder based on ratios of lactate, succinate, pyruvate, alpha-ketoglutarate, glycine, 4-hydroxyproline and ornithine with other metabolites [19]. This clinical application of metabolomics previews an era of modern medicine that can detect the earliest stages of diseases and help develop more effective therapies to combat them.

Plotting disease progression is another application of metabolomics. With the sensitivity and dynamic cellular maps that metabolomics offers, it is well suited to capturing the pathophysiological changes due to disease. Recent advances in ALS (amyotrophic lateral sclerosis) research showed that ceramide, creatine metabolism, benzoate metabolism and fatty acid metabolism pathways contain metabolites that correlate with ALS pathology [20]. An NMR based metabolomics study on serum from hepatocellular carcinoma (HCC) patients showed that differences in lipid, amino acid, lactate and glucose levels can discriminate between early and late stage HCC [21].

As mentioned, metabolomics is the most time sensitive representation of the biochemical activity of a cell or organism from an omics perspective. Thus, it follows that metabolomics is the most accurate representation of the molecular phenotype [5]. Phenotype is a result of the interaction of the biochemical and physiological properties of an individual with its environment. This relationship between phenotype and metabolism has been referred to as the “missing link” in the prediction of phenotype from genotype [8]. Phenotype prediction from the metabolome is a powerful analytical tool that showcases another usage of metabolomics data.

Limitations of Metabolomics

Metabolomics has the power to capture subtle biological changes between groups and offers dynamic phenotypic information.
There are some challenges, however, that the burgeoning field of metabolomics analysis faces. Comprehensive data standards across the field are lacking, which hinders the use of data in regulatory and some clinical settings [7]. While this is getting some attention in the field, and the international Metabolomics Society coordinated a Data Standards Task Group to address standardization, validation, and reporting formats, it remains one of the larger problems to overcome within the discipline [22].

Metabolic flux within the system of study is another area that lacks attention in the field. Metabolomic approaches are varied and diverse, with experimental methods and designs that are equally unique [3]. Because of the flexibility of the approach, timing of experiments can be crucial to metabolomics analysis, and a poorly chosen sampling schedule can miss valuable information held in the metabolome.

Metabolite identification, and subsequent biomarker identification, remains one of the biggest hurdles within the analysis [23]. Unlike other omics platforms, untargeted metabolomics does not result in metabolite identification per se. In both NMR and LC-MS approaches a large number of features are detected in the spectra, which are then matched based on frequency axis signals using location and intensity (NMR) or by m/z values and retention times (LC-MS) [2]. Downstream analysis techniques for both have been vastly improved in recent years, with software and web tools that offer metabolite feature identification and streamlined statistical analysis [24]. With further advancements in data collection and interpretation, the application of metabolomics in a diversity of fields will continue to expand. Specifics of methods utilized here are detailed in chapters two and three.

Proteomics

As mentioned, the omics hierarchy utilizes a systems biology approach for holistic analysis of entire biological environments within a cell.
Proteomics is the identification and quantitation of all the proteins within a system [25]. Since proteins are the biological effectors within the cell, proteomics offers a detailed look into cellular functions and regulation. Over 50 years ago, the concept of analyzing the entire proteome was introduced, but it wasn’t until the mid-1990s that proteomics analysis started to become accessible [26]. Advances in technology on four fronts enabled proteomics analysis: MS technology and method development, large-scale genomic research that was the basis for protein databases, advancements in two-dimensional gel electrophoresis (2D-PAGE), and bioinformatic tools for the analysis of the large volume of data produced from the MS analysis [27].

As with metabolomics, proteomics can be carried out through a variety of techniques at different levels of analysis, but generally ends with MS based detection and quantitation. Arguably one of the most common approaches utilized today is enzyme digestion of purified proteins or complex protein mixtures into peptides, followed by nano-LC-ESI coupled tandem mass spectrometry (MSMS) for analysis [25]. This approach allows for the analysis of hundreds to thousands of proteins from a biological sample.

Proteomics Applications

Biological research, medical applications and drug discovery are three major applications of proteomics analysis [28]. Significant research demonstrating the relationship between genes and proteins has resulted in a deeper understanding of the sophisticated relationship between cell function and genes. A recent study on multiple cancer cell lines, using the Cancer Cell Line Encyclopedia database and a comprehensive proteomics analysis of 375 cell lines, showed expected dysregulation of proteins in the Microsatellite Instable (MSI) cells, as well as new associations. Two RNA monitoring proteins that had not previously been associated with MSI were identified [29].
Numerous clinical studies utilize proteomics approaches with wide success. Studies on mental health disorders, heart disease and many cancers have led to the identification of specific proteins or classes of proteins that lead to pathology. One medical application of proteomics was an investigation of patients with chronic kidney disease that identified 273 peptides that differentiated healthy from disease state individuals. By combining these results, a pathological profile was constructed and recommended by the US Food and Drug Administration for diagnosis and prognosis of kidney disease [30]. Because of these clinical applications, potential drug targets have also been discovered. Comparative proteomics of muscle invasive and non-muscle invasive bladder cancer tissues showed a dysregulation in the eukaryotic translation initiation factor 3 subunit D (EIF3D). Targeting this protein by silencing or knockdown decreases cell proliferation and colony formation. A therapy for this protein target has reached animal model testing [31].

Limitations of Proteomics

While proteomics can offer a lot of data that helps make biochemical connections, there are some challenges inherent to the technique. Sample preparation is potentially the biggest hurdle in proteomics analysis [32]. It is delicate work that must account for the diversity of protein molecular sizes, post translational modifications (PTM), hydrophobicity, protein conformation and cellular distribution, to name just a few considerations [33]. Conventional methods try to capture as much of the proteome as possible, but require tuning if low abundance proteins or specific localizations are the target.

Large amounts of data are produced from LC-MSMS proteomics analysis. There are many software tools available at varying price points that automate the process of detecting and identifying peptides [32].
Utilizing the proper settings to minimize false positives, such as peptide detection tolerance, digestion details and relevant PTMs, is paramount to accurate and quality data. Pairing the appropriate database for data analysis is also imperative in obtaining quality results. Specifics of methods utilized for this work are detailed in chapter four.

Cellular Stress Response

All cells have stress response mechanisms to protect from and mitigate environmental changes that could be harmful [34]. Various biochemical processes can be utilized by the cell to adapt in the short term to conserve cellular integrity or in the long term to afford resistance or adaptation to adversity. Many mechanisms of cellular stress response aren’t specific to a type of stress because the cell responds to the molecular damage that occurs, rather than to what caused it [35]. How the cell responds to the stress, and whether it can surmount it, determines the fate of the cell: whether it will adapt and survive, or induce processes of cell death. Four primary stress response pathways in the cell’s arsenal are the responses to oxidative stress, the unfolded protein response, DNA damage repair and the heat shock response (FIGURE 1.3) [35].

Oxidative stress occurs when the balance between pro-oxidants and antioxidants is disrupted. Reactive oxygen species (ROS) and reactive nitrogen species (RNS) include hydrogen peroxide, peroxy radicals, hydroxyl radicals, singlet oxygen, superoxide anion, nitrogen oxide and peroxynitrate [36]. These pro-oxidant species can inflict macromolecular damage to lipids, carbohydrates, nucleic acids and proteins. The cell utilizes antioxidants like glutathione (GSH), superoxide dismutase (SOD), glutathione peroxidase and catalase to mitigate an abundance of ROS [35]. Many sources of ROS come from intracellular auto-oxidation reactions. Ascorbic acid, flavin, adrenalin, peroxisomes, and some low molecular weight thiol coenzymes will result in ROS production.
The mitochondrial electron transport chain can also produce ROS intermediates. There are many diverse exogenous sources of ROS, including heavy metals, pollution, radiation, and pesticides. SODs are employed by the cell as a first line of defense, which is generally sufficient to restore the oxidant/antioxidant balance. Antioxidants can function directly by reacting with the free radical ROS or indirectly by inhibiting their formation or promoting the production and activity of antioxidant enzymes.

Figure 1.3. Cellular Stress Response: Basic schematic of various types of cellular stress response.

Another major stress response pathway comprises a superfamily of proteins called heat shock proteins (Hsps) [37]. The heat shock response is a highly conserved stress response that utilizes a set of molecular chaperone proteins (Hsps) that are named after their molecular weight. Hsps help maintain protein homeostasis by degrading or refolding misfolded proteins or preventing protein aggregation [38]. This class of proteins was first discovered in experiments evaluating heat stress, which provided the naming scheme [37]. The Hsp response isn’t exclusive to thermal stress; many stressors, like heavy metals and oxidative stress, can activate this cellular mechanism [35]. If heat stress goes unchecked, it can result in accumulation of misfolded proteins, cellular defects, and cell death [39]. Interestingly, if the Hsps ameliorate the pressure on the cell, it can result in stress resistance and even cross-protection from other types of stress [38].

Protein degradation, or the unfolded protein response (UPR), is carried out by the collective protease machinery responsible for degrading protein targets into peptides to avoid accumulation of misfolded or aggregated proteins [40]. Various environmental stressors or physiological changes can result in UPR. In bacteria, AAA+ proteases and ClpXP proteins control protein degradation.
Tagged substrates are delivered to the degradation complex and, through repeated cleavage, the protein is broken down into peptides. ClpXP has dual functionality within the cell, as it contributes not only to maintaining proteostasis but also to stress adaptation mechanisms [41]. There are over 50 known protein substrate targets for ClpXP, and it also acts as a transcriptional regulator on stress response pathway transcription factors SigmaS and SigmaE. This type of crosstalk is common with stress response mechanisms, as noted above [41].

The response to DNA damage is a widespread biological response to physical alterations to DNA [42]. When DNA is modified by oxidation, methylation or alkylation, contains mismatched bases, or is hydrolyzed via deamination or nucleic acid removal, the cell signals for DNA damage repair [40]. The SOS pathway response to damage is one of the most well studied paradigms of bacterial stress response. The SOS response is induced after DNA damage but has also been shown to be a response to antibiotics in bacteria. The pathway is regulated by the proteins LexA and RecA; LexA regulates transcription of about 50 genes, including LexA and RecA [43]. Damage to DNA, also called lesions, is repaired via polymerases or recombination, or is bypassed. The SOS response is not the only DNA damage pathway a cell can employ. The adaptive response uses AidB as a detoxification protein that can dealkylate DNA and protect cells from oxidative stress during periods of restricted nutrition [44]. Survival after DNA damage can be mediated by cell growth inhibition, cell death and replication repair to template lesions.

If the cell can sustain and combat the prolonged stress, disease often results from the chronic exposure. Cancer, diabetes, and Parkinson’s disease are just a few pathologies that can arise from prolonged stress [35]. Continued exposure to stress can result in autophagy or apoptotic cell death.
This intentional removal of damaged cells promotes cell survival and is another form of cellular stress response [45]. Although the cell will mount survival responses to stress, like the heat shock response, if the stress is severe or the duration too long the cell will signal death.

Measuring CSR

The quantification of cellular stress can be divided into two categories: morphological/physiological analyses or transcription/translation attenuation measurements [46]. Cells exposed to stress can show dysmorphic shapes, mutated cellular structures and deviations from average size [47], [48]. The second category measures stress through either gene, transcript or protein level changes or through stressor measurements, like ROS quantities. Recent advancements in MS-based techniques have led to better transcript and protein detection, but many assays rely on traditional approaches like immunoblotting and fluorescence tagging [49].

Quantifying systems under stress can be messy, with confounding factors resulting in dysregulation within multiple cellular mechanisms. In mammals, stress is often assessed by measuring glucocorticoids. This can be problematic because, while they are part of the endocrine stress response, they are influenced by a number of factors, making them an unreliable predictor [50]. Potentially adding to the confusion are the adaptations that cells use when exposed to mild stress. Many studies have shown that mild exposure to stress can result in dynamic homeostasis, and highly controlled regulatory paradigms in response to stress as a survival mechanism [51]. If stress is not the intended outcome of a treatment, parsing changes due to differential cellular mechanisms and CSR can be challenging without additional experimentation to determine types and amounts of perturbation.
Molecular crosstalk between various types of stress is well documented; for example, drought, temperature and salinity stress in apple trees share common differentially expressed genes related to signal transduction and metabolism [52]. These genetic fluctuations could also result from changes in light or growth, however, so physiological observations of size and proliferation need to be made in order to distinguish potential causes of differential gene expression [53].

Research Goals

Cellular Stress Response is a challenging phenomenon to study because biology has evolved a diverse set of defense mechanisms that enable homeostatic plasticity. My research was motivated by this complexity and attempts to develop a method to accurately describe a stressed phenotype, regardless of stress severity. This work utilizes standard metabolomics workflows to probe global metabolome changes and repurposes untargeted data to examine questions about stress. Much of metabolomics research is aimed at quantifying metabolites that result in a major phenotypic change, like disease or cancer. Here, I exploit the sensitivity of the metabolome to exogenous stressors to help answer questions about how species adapt to and tolerate perturbations.

In order to meet this aim, I developed a categorical approach. I first used metabolomics data in a very straightforward manner to analyze metabolomic flux due to a minimal stressor. Characterizing dysregulation and building a metabolic map of what adaptations occur under a minimal stress lays the foundation for deepening our understanding of metabolic adjustments. Detailing the metabolomic adjustments to a minimal stress was the starting point for quantifying that stress. Current statistical workflows alone do not offer one clear phenotypic assessment of the systems under investigation.
Omics analysis offers a comprehensive view of the system, but there are aspects of omics data that go unused that could potentially inform on phenotype and environmental condition. Unique approaches to mining metabolomics data for identification of biomarkers are becoming more common. One such approach previously investigated by our lab led to an observation that variation in the metabolome changed when a system experienced stress (FIGURE 1.4) [54]. The second goal of my research was to capitalize on this observation. Is there a repeatable and reliable way to determine if stress on a system results in less variation in the metabolome, and how can this be developed into an omics statistical analysis?

Figure 1.4. Variation in Metabolomics Data (a.) First instance of smaller variation in omics data published by the Bothner group in 2014, Heinemann et al.

If a stressed phenotype can be assessed using omics variation analysis and conventional workflows, the joint approach should offer an additional way to identify potential biomarkers. Additionally, if annotated data is analyzed with this combined approach, can we identify pathways or specific proteins responsible for the change in variation due to stress? The final aim of my research was to investigate this possibility.

This work outlines how a traditional metabolomics approach can describe metabolic adaptations to a minimal stress. The use of non-canonical amino acids in E. coli cultures caused mild perturbation that an investigation at the metabolite level captured. This work also suggested that different types of amino acid additions to growth media caused varying amounts of perturbation to the cultures. Detailing the differing amounts of perturbation in E. coli laid the foundation for utilizing variation within omics data to help describe cellular stress response. I was able to outline a statistical model for the relationship between omics data and stress response.
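The idea of comparing replicate-to-replicate variation between sample groups can be illustrated with a short sketch. This is only a minimal illustration and not the statistical model developed in this work: it assumes the coefficient of variation (standard deviation divided by mean) as the per-feature measure of variation, summarizes each group by its median, and uses made-up intensity values. The function name `feature_cvs` and the data are hypothetical.

```python
from statistics import mean, median, stdev


def feature_cvs(replicates):
    """Coefficient of variation (sd / mean) for each feature.

    `replicates` is a list of biological replicates; each replicate is a list
    of intensities, one per feature (metabolite or protein), in the same order.
    CV is an assumed choice of variation measure for this sketch.
    """
    cvs = []
    for values in zip(*replicates):  # one tuple of intensities per feature
        m = mean(values)
        if m > 0:  # skip features with no signal
            cvs.append(stdev(values) / m)
    return cvs


# Hypothetical intensities: 3 replicates x 4 features per group (not real data).
control = [
    [100.0, 50.0, 200.0, 10.0],
    [140.0, 35.0, 260.0, 16.0],
    [60.0, 65.0, 140.0, 4.0],
]
stressed = [
    [100.0, 50.0, 200.0, 10.0],
    [110.0, 48.0, 210.0, 11.0],
    [90.0, 52.0, 190.0, 9.0],
]

control_cv = median(feature_cvs(control))
stressed_cv = median(feature_cvs(stressed))
print("median CV, control: ", control_cv)
print("median CV, stressed:", stressed_cv)
print("stressed group is less variable:", stressed_cv < control_cv)
```

In this toy data set the group means are identical and only the spread between replicates differs, which is the kind of signal a conventional fold-change comparison would not report.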
Acute stress leads to less variable metabolite and protein production between biological replicates, regardless of stress type or model system. This statistical analysis helps add clarity to systems biology analysis (FIGURE 1.5). It clarifies whether a stressed phenotype is at play and can help identify important metabolites or proteins that differentiate sample groups.

Figure 1.5. The Who, What and Why of RVA: Schematic of variation analysis from raw data and detection of this pattern, characterizing the variation model and implementing variation analysis, and potential mechanisms that underlie it.

Adding variation analysis to an omics workflow adds a deeper phenotypic assessment. Here, I use a comparative proteomics analysis of methanogenic archaea aimed at identifying iron and sulfur acquisition and trafficking to demonstrate that variation analysis helps parse differences in the cultures due to environment and stress response. The additional analysis offers dimension to the phenotype that would have been otherwise overlooked.

References

[1] F. Girolamo, I. Lante, M. Muraca, and L. Putignani, “The Role of Mass Spectrometry in the ‘Omics’ Era,” Curr. Org. Chem., vol. 17, no. 23, pp. 2891–2905, Dec. 2013, doi: 10.2174/1385272817888131118162725.
[2] Q. Yang et al., “Metabolomics biotechnology, applications, and future trends: A systematic review,” RSC Advances, vol. 9, no. 64. Royal Society of Chemistry, pp. 37245–37257, Nov. 14, 2019, doi: 10.1039/c9ra06697g.
[3] V. Tolstikov, A. James Moser, R. Sarangarajan, N. R. Narain, and M. A. Kiebish, “Current status of metabolomic biomarker discovery: Impact of study design and demographic characteristics,” Metabolites, vol. 10, no. 6. MDPI AG, Jun. 01, 2020, doi: 10.3390/metabo10060224.
[4] C. H. Johnson, J. Ivanisevic, and G. Siuzdak, “Metabolomics: Beyond biomarkers and towards mechanisms,” Nature Reviews Molecular Cell Biology, vol. 17, no. 7. Nature Publishing Group, pp. 451–459, Jul. 01, 2016, doi: 10.1038/nrm.2016.25.
[5] P. P. Handakumbura, B. Stanfill, A. Rivas-Ubach, D. Fortin, J. P. Vogel, and C. Jansson, “Metabotyping as a Stopover in Genome-to-Phenome Mapping,” Sci. Rep., vol. 9, no. 1, pp. 1–12, Dec. 2019, doi: 10.1038/s41598-019-38483-0.
[6] R. Rauhut and G. Klug, “mRNA degradation in bacteria,” FEMS Microbiol. Rev., vol. 23, no. 3, pp. 353–370, Jun. 1999, doi: 10.1111/j.1574-6976.1999.tb00404.x.
[7] I. Gertsman and B. A. Barshop, “Promises and pitfalls of untargeted metabolomics,” J. Inherit. Metab. Dis., vol. 41, no. 3, pp. 355–366, May 2018, doi: 10.1007/s10545-017-0130-7.
[8] D. S. Wishart, “Metabolomics for investigating physiological and pathophysiological processes,” Physiol. Rev., vol. 99, no. 4, pp. 1819–1875, 2019, doi: 10.1152/physrev.00035.2018.
[9] J. Griffiths, “A brief history of mass spectrometry,” Analytical Chemistry, vol. 80, no. 15. American Chemical Society, pp. 5678–5683, Aug. 01, 2008, doi: 10.1021/ac8013065.
[10] J. B. Fenn, “Electrospray ionization mass spectrometry: How it all began,” Journal of Biomolecular Techniques, vol. 13, no. 3. The Association of Biomolecular Resource Facilities, pp. 101–118, Sep. 2002, Accessed: Mar. 03, 2021. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2279858/.
[11] M. E. Swartz, “UPLC™: An Introduction and Review,” J. Liq. Chromatogr. Relat. Technol., vol. 28, no. 7–8, pp. 1253–1263, Apr. 2005, doi: 10.1081/JLC-200053046.
[12] P. A. Bottomley, “NMR imaging techniques and applications: A review,” Review of Scientific Instruments, vol. 53, no. 9. American Institute of Physics, pp. 1319–1337, Sep. 04, 1982, doi: 10.1063/1.1137180.
[13] B. Diehl, “Principles in NMR spectroscopy,” in NMR Spectroscopy in Pharmaceutical Analysis, Elsevier, 2008, pp. 1–41.
[14] E. D. Becker, “A BRIEF HISTORY OF NUCLEAR MAGNETIC RESONANCE,” Anal. Chem., vol. 65, no. 6, pp. 295A-302A, Mar. 1993, doi: 10.1021/ac00054a716.
[15] A. H. Emwas et al., “NMR spectroscopy for metabolomics research,” Metabolites, vol. 9, no. 7. MDPI AG, Jul. 01, 2019, doi: 10.3390/metabo9070123.
[16] M. Jacob, A. L. Lopata, M. Dasouki, and A. M. Abdel Rahman, “Metabolomics toward personalized medicine,” Mass Spectrometry Reviews, vol. 38, no. 3. John Wiley and Sons Inc., pp. 221–238, May 01, 2019, doi: 10.1002/mas.21548.
[17] D. S. Wishart, “Emerging applications of metabolomics in drug discovery and precision medicine,” Nature Reviews Drug Discovery, vol. 15, no. 7. Nature Publishing Group, pp. 473–484, Jun. 30, 2016, doi: 10.1038/nrd.2016.32.
[18] M. Zhang, X. Liu, X. Liu, H. Li, W. Sun, and Y. Zhang, “A pilot investigation of a urinary metabolic biomarker discovery in renal cell carcinoma,” Int. Urol. Nephrol., vol. 52, no. 3, pp. 437–446, Mar. 2020, doi: 10.1007/s11255-019-02332-w.
[19] A. M. Smith et al., “A Metabolomics Approach to Screening for Autism Risk in the Children’s Autism Metabolome Project,” Autism Res., vol. 13, no. 8, pp. 1270–1285, Aug. 2020, doi: 10.1002/aur.2330.
[20] S. A. Goutman et al., “Untargeted metabolomics yields insight into ALS disease mechanisms,” J. Neurol. Neurosurg. Psychiatry, vol. 91, no. 12, pp. 1329–1338, Dec. 2020, doi: 10.1136/jnnp-2020-323611.
[21] A. Casadei-Gardini et al., “1H-NMR Based Serum Metabolomics Highlights Different Specific Biomarkers between Early and Advanced Hepatocellular Carcinoma Stages,” Cancers (Basel)., vol. 12, no. 1, p. 241, Jan. 2020, doi: 10.3390/cancers12010241.
[22] C. Steinbeck et al., “The future of metabolomics in ELIXIR,” F1000Research, vol. 6, 2017, doi: 10.12688/f1000research.12342.2.
[23] C. H. Johnson and F. J. Gonzalez, “Challenges and opportunities of metabolomics,” Journal of Cellular Physiology, vol. 227, no. 8. NIH Public Access, pp. 2975–2981, Aug. 2012, doi: 10.1002/jcp.24002.
[24] W. J. Nash and W. B. Dunn, “From mass to metabolite in human untargeted metabolomics: Recent advances in annotation of metabolites applying liquid chromatography-mass spectrometry data,” TrAC - Trends in Analytical Chemistry, vol. 120. Elsevier B.V., p. 115324, Nov. 01, 2019, doi: 10.1016/j.trac.2018.11.022.
[25] B. Aslam, M. Basit, M. A. Nisar, M. Khurshid, and M. H. Rasool, “Proteomics: Technologies and their applications,” Journal of Chromatographic Science, vol. 55, no. 2. Oxford University Press, pp. 182–196, Feb. 01, 2017, doi: 10.1093/chromsci/bmw167.
[26] F. Vitzthum, F. Behrens, N. L. Anderson, and J. H. Shaw, “Proteomics: From basic research to diagnostic application. A review of requirements & needs,” Journal of Proteome Research, vol. 4, no. 4. American Chemical Society, pp. 1086–1097, Jul. 2005, doi: 10.1021/pr050080b.
[27] M. Bantscheff, S. Lemeer, M. M. Savitski, and B. Kuster, “Quantitative mass spectrometry in proteomics: Critical review update from 2007 to the present,” Analytical and Bioanalytical Chemistry, vol. 404, no. 4. Springer, pp. 939–965, Sep. 08, 2012, doi: 10.1007/s00216-012-6203-4.
[28] E. J. Dupree, M. Jayathirtha, H. Yorkey, M. Mihasan, B. A. Petre, and C. C. Darie, “A critical review of bottom-up proteomics: The good, the bad, and the future of this field,” Proteomes, vol. 8, no. 3. MDPI AG, pp. 1–26, Sep. 01, 2020, doi: 10.3390/proteomes8030014.
[29] D. P. Nusinow, J. Szpyt, M. Ghandi, L. A. Garraway, W. R. Sellers, and S. P. Gygi, “Quantitative Proteomics of the Cancer Cell Line Encyclopedia,” Cell, vol. 180, pp. 387-402.e16, 2020, doi: 10.1016/j.cell.2019.12.023.
[30] E. Nkuipou-Kenfack, P. Zürbig, and H. Mischak, “The long path towards implementation of clinical proteomics: Exemplified based on CKD273,” PROTEOMICS - Clin. Appl., vol. 11, no. 5–6, p. 1600104, May 2017, doi: 10.1002/prca.201600104.
[31] M. Frantzi, A. Latosinska, and H. Mischak, “Proteomics in Drug Development: The Dawn of a New Era?,” PROTEOMICS – Clin. Appl., vol. 13, no. 2, p. 1800087, Mar. 2019, doi: 10.1002/prca.201800087.
[32] K. A. Brown, J. A. Melby, D. S. Roberts, and Y. Ge, “Top-down proteomics: challenges, innovations, and applications in basic and clinical research,” Expert Review of Proteomics, vol. 17, no. 10. Taylor and Francis Ltd., pp. 719–733, 2020, doi: 10.1080/14789450.2020.1855982.
[33] K. A. Brown, J. A. Melby, D. S. Roberts, and Y. Ge, “Top-down proteomics: challenges, innovations, and applications in basic and clinical research,” Expert Review of Proteomics, vol. 17, no. 10. Taylor and Francis Ltd., pp. 719–733, 2020, doi: 10.1080/14789450.2020.1855982.
[34] G. S. Hotamisligil and R. J. Davis, “Cell signaling and stress responses,” Cold Spring Harb. Perspect. Biol., vol. 8, no. 10, p. a006072, Oct. 2016, doi: 10.1101/cshperspect.a006072.
[35] S. Fulda, A. M. Gorman, O. Hori, and A. Samali, “Cellular stress responses: Cell survival and cell death,” International Journal of Cell Biology. 2010, doi: 10.1155/2010/214074.
[36] J. M. Lü, P. H. Lin, Q. Yao, and C. Chen, “Chemical and molecular mechanisms of antioxidants: Experimental approaches and model systems,” J. Cell. Mol. Med., vol. 14, no. 4, pp. 840–860, Apr. 2010, doi: 10.1111/j.1582-4934.2009.00897.x.
[37] K. Richter, M. Haslbeck, and J. Buchner, “The Heat Shock Response: Life on the Verge of Death,” Molecular Cell, vol. 40, no. 2. Cell Press, pp. 253–266, Oct. 22, 2010, doi: 10.1016/j.molcel.2010.10.006.
[38] A. D. Nguyen, N. J. Gotelli, and S. H. Cahan, “The evolution of heat shock protein sequences, cis-regulatory elements, and expression profiles in the eusocial Hymenoptera,” BMC Evol. Biol., vol. 16, no. 1, Jan. 2016, doi: 10.1186/s12862-015-0573-0.
[39] S. H. Peeters and M. I. de Jonge, “For the greater good: Programmed cell death in bacterial communities,” Microbiological Research, vol. 207. Elsevier GmbH, pp. 161–169, Mar. 01, 2018, doi: 10.1016/j.micres.2017.11.016.
[40] A. J. L. Macario and E. Conway De Macario, “THE MOLECULAR CHAPERONE SYSTEM AND OTHER ANTI-STRESS MECHANISMS IN ARCHAEA,” 2001.
[41] K. N. Truscott, A. Bezawork-Geleta, and D. A. Dougan, “Unfolded protein responses in bacteria and mitochondria: A central role for the ClpXP machine,” IUBMB Life, vol. 63, no. 11, pp. 955–963, Nov. 2011, doi: 10.1002/iub.526.
[42] K. N. Kreuzer, “DNA damage responses in prokaryotes: Regulating gene expression, modulating growth patterns, and manipulating replication forks,” Cold Spring Harb. Perspect. Biol., vol. 5, no. 11, Nov. 2013, doi: 10.1101/cshperspect.a012674.
[43] K. N. Kreuzer, “DNA damage responses in prokaryotes: Regulating gene expression, modulating growth patterns, and manipulating replication forks,” Cold Spring Harb. Perspect. Biol., vol. 5, no. 11, Nov. 2013, doi: 10.1101/cshperspect.a012674.
[44] M. S. Rohankhedkar, S. B. Mulrooney, W. J. Wedemeyer, and R. P. Hausinger, “The AidB component of the Escherichia coli adaptive response to alkylating agents is a flavin-containing, DNA-binding protein,” J. Bacteriol., vol. 188, no. 1, pp. 223–230, Jan. 2006, doi: 10.1128/JB.188.1.223-230.2006.
[45] W. Zheng et al., “Multiple Modes of Cell Death Discovered in a Prokaryotic (Cyanobacterial) Endosymbiont,” PLoS One, vol. 8, no. 6, p. e66147, Jun. 2013, doi: 10.1371/journal.pone.0066147.
[46] M. Gaidica and B. Dantzer, “Quantifying the autonomic response to stressors-One way to expand the definition of ‘stress’ in animals,” Integr. Comp. Biol., vol. 60, no. 1, pp. 113–125, Jul. 2020, doi: 10.1093/icb/icaa009.
[47] F. S. Hsu, S. Spannl, C. Ferguson, A. A. Hyman, R. G. Parton, and M. Zerial, “Rab5 and Alsin regulate stress-activated cytoprotective signaling on mitochondria,” Elife, vol. 7, Feb. 2018, doi: 10.7554/eLife.32282.
[48] V. A. Sleight, L. S. Peck, E. A. Dyrynda, V. J. Smith, and M. S. Clark, “Cellular stress responses to chronic heat shock and shell damage in temperate Mya truncata,” Cell Stress Chaperones, vol. 23, no. 5, pp. 1003–1017, Sep. 2018, doi: 10.1007/s12192-018-0910-5.
[49] K. Klann and G. Tascher, “Functional Translatome Proteomics Reveal Converging and Dose-Dependent Regulation by mTORC1 and eIF2α,” Mol. Cell, vol. 77, pp. 913-925.e4, 2020, doi: 10.1016/j.molcel.2019.11.010.
[50] R. M. Sapolsky, L. M. Romero, and A. U. Munck, “How Do Glucocorticoids Influence Stress Responses? Integrating Permissive, Suppressive, Stimulatory, and Preparative Actions*,” Endocr. Rev., vol. 21, no. 1, pp. 55–89, Feb. 2000, doi: 10.1210/edrv.21.1.0389.
[51] Y. Goulev et al., “Nonlinear feedback drives homeostatic plasticity in H2O2 stress response,” Elife, vol. 6, Apr. 2017, doi: 10.7554/eLife.23971.
[52] X. Li, M. Li, B. Zhou, Y. Yang, Q. Wei, and J. Zhang, “Transcriptome analysis provides insights into the stress response crosstalk in apple (Malus × domestica) subjected to drought, cold and high salinity,” Sci. Rep., vol. 9, no. 1, pp. 1–10, Dec. 2019, doi: 10.1038/s41598-019-45266-0.
[53] A. Trewavas, “Plant cell signal transduction: The emerging phenotype,” Plant Cell, vol. 14, no. SUPPL. American Society of Plant Biologists, pp. S3–S4, May 01, 2002, doi: 10.1105/tpc.141360.
[54] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, and B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” Metabolomics, vol. 10, no. 6, pp. 1121–1128, Dec. 2014, doi: 10.1007/s11306-014-0651-0.

CHAPTER TWO

METABOLIC IMPLICATIONS OF USING BIOORTHOGONAL NON-CANONICAL AMINO ACID TAGGING (BONCAT) FOR TRACKING PROTEIN SYNTHESIS

Contribution of Authors and Co-Authors

Manuscript in Chapter Two

Author: Katherine F. Steward
Contributions: Study design and conceptualization, data analysis and interpretation, manuscript revision, manuscript drafting.

Co-Author: Brian Eilers
Contributions: Experimental setup and manipulations, data analysis and interpretation and manuscript revision.
Co-Author: Brian Tripet
Contributions: Data analysis and interpretation and manuscript revision.

Co-Author: Amanda Fuchs
Contributions: Data analysis and interpretation and manuscript revision.

Co-Author: Michael Dorle
Contributions: Experimental setup and manipulations, data analysis and interpretation and manuscript revision.

Co-Author: Rachel Rawle
Contributions: Data analysis and interpretation and manuscript revision.

Co-Author: Berliza Soriano
Contributions: Experimental setup and manipulations, data analysis and interpretation and manuscript revision.

Co-Author: Narayanaganesh Balasubramanian
Contributions: Experimental setup and manipulations, data analysis and interpretation and manuscript revision.

Co-Author: Valérie Copié
Contributions: Study design and conceptualization, data analysis and interpretation, manuscript revision, manuscript drafting.

Co-Author: Brian Bothner
Contributions: Study design and conceptualization, data analysis and interpretation, manuscript revision, manuscript drafting.

Co-Author: Roland Hatzenpichler
Contributions: Study design and conceptualization, experimental design and manipulations, data analysis and interpretation, manuscript revision, manuscript drafting.

Manuscript Information

Katherine F. Steward, Brian Eilers, Brian Tripet, Amanda Fuchs, Michael Dorle, Rachel Rawle, Berliza Soriano, Narayanaganesh Balasubramanian, Valérie Copié, Brian Bothner and Roland Hatzenpichler

Journal: Frontiers in Microbiology

Status of Manuscript:
____ Prepared for submission to a peer-reviewed journal
____ Officially submitted to a peer-reviewed journal
____ Accepted by a peer-reviewed journal
__X__ Published in a peer-reviewed journal

Frontiers Media SA
07 December 2019
13 February 2020
February 2020, volume 11, article 197
doi: 10.3389/fmicb.2020.00197

METABOLIC IMPLICATIONS OF USING BIOORTHOGONAL NON-CANONICAL AMINO ACID TAGGING (BONCAT) FOR TRACKING PROTEIN SYNTHESIS

Katherine F. Steward1, Brian Eilers1, Brian Tripet1, Amanda Fuchs1, Michael Dorle1, Rachel Rawle1, Berliza Soriano1, Narayanaganesh Balasubramanian1, Valérie Copié1,2, Brian Bothner1,2* and Roland Hatzenpichler1,2,3*

1 Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT, United States
2 Thermal Biology Institute, Montana State University, Bozeman, MT, United States
3 Center for Biofilm Engineering, Montana State University, Bozeman, MT, United States

*Correspondence:
Brian Bothner, bbothner@montana.edu
Roland Hatzenpichler, roland.hatzenpichler@montana.edu

Abstract

BioOrthogonal Non-Canonical Amino acid Tagging (BONCAT) is a powerful tool for tracking protein synthesis on the level of single cells within communities and whole organisms. A basic premise of BONCAT is that the non-canonical amino acids (NCAA) used to track translational activity do not significantly alter cellular physiology. If the NCAA induced changes in the metabolic state of cells, interpretation of BONCAT studies could be challenging. To address this knowledge gap, we used global metabolomics analyses to assess the intracellular effects of NCAA incorporation. Two NCAA were tested: L-azidohomoalanine (AHA) and L-homopropargylglycine (HPG); L-methionine (MET) was used as a minimal stress baseline control. Liquid Chromatography Mass Spectrometry (LC-MS) and Nuclear Magnetic Resonance (NMR) were used to characterize intracellular metabolite profiles of Escherichia coli cultures, with multivariate statistical analysis using XCMS and MetaboAnalyst. Results show that doping with NCAA induces metabolic changes; however, the metabolic impact was not dramatic. A second set of experiments in which cultures were placed under mild stress to simulate real-world environmental conditions showed a more consistent and more robust perturbation. Pathways that changed include amino acid and protein synthesis, choline and betaine, and the TCA cycle.
Globally, these changes were statistically minor, indicating that NCAA are unlikely to exert a significant impact on cells during incorporation. Our results are consistent with previous reports of NCAA doping under replete conditions and extend these results to bacterial growth under environmentally relevant conditions. Our work highlights the power of metabolomics studies in detecting cellular response to growth conditions and the complementarity of NMR and LC-MS as omics tools.

Keywords: metabolomics, BONCAT, non-canonical amino acids, L-azidohomoalanine, L-homopropargylglycine

Introduction

Dieterich et al. (2006) introduced a method for visualizing newly synthesized proteins in mammalian cells termed BioOrthogonal Non-Canonical Amino acid Tagging (BONCAT). BONCAT facilitates the tracking and localization of protein translation in single cells following a short incubation with a synthetic amino acid that can later be detected via azide-alkyne click chemistry, a sensitive and precise biocompatible reaction (Kolb et al., 2001). BONCAT has proven to be particularly useful for monitoring cellular activity in complex microbial communities (Hatzenpichler et al., 2014, 2016; Samo et al., 2014; Leizeaga et al., 2017; Sebastián et al., 2019), and adds a convenient approach to the molecular tool box available for analyzing microbial community function (Hatzenpichler et al., 2020) because it avoids the use of radioactive substrates and is understood to only minimally impact protein structure and cell physiology.

Currently, the two most widely used non-canonical amino acids (NCAA) are L-azidohomoalanine (AHA) and L-homopropargylglycine (HPG), which both replace L-methionine (MET) during translation (Kiick et al., 2002). These amino acids contain either an azide (AHA) or an alkyne (HPG) functional group, both of which are amenable to azide-alkyne click chemistry (Kolb et al., 2001).
Experimental protocols for performing BONCAT studies and click-labeling newly made proteins are well established in microbiology and microbial ecology (Bagert et al., 2014; Hatzenpichler et al., 2014, 2016; Mahdavi et al., 2014; Hatzenpichler and Orphan, 2015; Babin et al., 2016; Bagert et al., 2016).

Phenotypic markers of optical density, behavioral tests, and responses to visual cues have been utilized to assess the impact of cell treatments with NCAA. Studies on HeLa cells (Bagert et al., 2014), a range of bacterial and archaeal pure cultures (Bagert et al., 2014; Hatzenpichler et al., 2014; Hatzenpichler and Orphan, 2015), and environmental samples (Hatzenpichler et al., 2014, 2016) have demonstrated that the addition of low concentrations (nM–mM range) of AHA or HPG to a sample over short periods of time (typically 1–2 cell generations) has only minimal effects on the physiology, growth rate, or protein expression patterns of organisms. Hinz et al. investigated the potential effects of NCAA labeling in vivo in zebrafish and found that AHA was incorporated into proteins in a time- and concentration-dependent manner, was non-toxic, and had no detrimental effect on animal behavior (Hinz et al., 2012). A recent proteomic study investigating the effect of AHA and HPG on protein expression and the ability to incorporate these reagents into mice showed that a small percentage (∼10%) of proteins change their expression patterns in response to AHA doping (Calve et al., 2016). Lastly, a recent study indicated that the incorporation of AHA into a model protein only minimally affected the protein tertiary structure (Lehner et al., 2017). Recent applications of proteomics investigating the cell machinery have also shown that AHA and HPG have little impact on the overall fitness of the organism (Dieterich et al., 2006; Landgraf et al., 2015).
However, a deeper look into the metabolism of NCAA-doped organisms has, to our knowledge, never been carried out. This study aimed to characterize the metabolome of Escherichia coli when grown in the presence of NCAA and identify potentially differentiated metabolite patterns that might inform us on the metabolic impact of NCAA on cell homeostasis and organismal health. In order to investigate how NCAA exposure affects intracellular metabolism, E. coli cell cultures were grown with and without AHA or HPG. One sample group included additional MET as a minimal perturbation to be used as a control experiment for baseline stress due to media supplementation. Control cultures were also grown in media without amino acid amendment. “Real world” experimental and cell culturing conditions were utilized in our analysis to best evaluate the potential effects of NCAA, and to mimic current field work in the environment that attempts to find suitable growth conditions for otherwise unculturable microorganisms. Comprehensive metabolite mapping techniques using Liquid Chromatography Mass Spectrometry (LC-MS) and Nuclear Magnetic Resonance (NMR) spectroscopy were employed to assess potential metabolome differences between different E. coli cell cultures and growth conditions.

Materials and Methods

Reagents

HPLC-grade solvents (water, methanol, and acetonitrile) were purchased from Fisher (Waltham, MA, United States). AHA and HPG were purchased from Click Chemistry Tools (Scottsdale, AZ, United States). All other chemicals were purchased from Millipore Sigma (St. Louis, MO, United States) and were used as provided, with no additional purification steps.

Cell Culturing

An overnight culture of E. coli K12 DH10B, which had been grown on M9 minimal medium (200 mg/L thiamine, 0.2% glucose), was inoculated 1:20 into 6 L of M9 medium (200 mg/L thiamine, 0.2% glucose) to yield a fresh culture of optical density, measured at a wavelength λ of 600 nm, i.e., OD600 of 0.041.
150 mL aliquots of this culture were then dispensed into 36 Erlenmeyer flasks, which were incubated at 37 °C on rotary shakers run at 200 rpm. Temperatures were independently checked with a thermometer at regular intervals to validate that temperatures were consistent across incubators throughout the experiments. Immediately following inoculation, the following incubations were started: five flasks each for (1) 50 µM MET; (2) 50 µM AHA; and (3) 50 µM HPG. One additional control flask without amendment was used to monitor growth of the cultures via optical density (OD600). This was done to avoid disturbing the experimental cultures given the large number of flasks. Growth experiments with 50 µM amino acid addition (MET, AHA, or HPG) were stopped after 85 min of incubation when the control flask had reached an OD600 of 0.072, corresponding to ∼0.74 cell generations (Supplementary Figure S1). From each flask, 2 × 50 mL of culture was then decanted into two 50 mL tubes. Tubes were centrifuged for 5 min at 4,700 g at room temperature. Resulting supernatants were decanted and the cell pellets flash-frozen in liquid N2 and stored at −80 °C until further processing. After these samples had been stored at −80 °C, the remaining 20 flasks were processed the following way: 1 mM of (1) MET, (2) AHA, and (3) HPG were added to five flasks each; five additional culture flasks served as no-amendment controls, which were used to track cell growth. The incubation was continued as described, with a starting OD600 of control cell cultures of 0.27, and stopped after 5 min of amino acid pulse labeling, at which point the control cultures had reached an OD600 of 0.31, corresponding to ∼0.04 cell generations (Supplementary Table S1). Cells were pelleted, pellets flash frozen, and samples stored as described above. Cultures for the heat stress experiments were conducted as described above except that cell cultures were grown and maintained at 42 °C.
Metabolite Extraction

Escherichia coli intracellular metabolites were extracted using published protocols (Hamerly et al., 2015). Briefly, frozen cell pellets were re-suspended with water, then sonicated using a Biologics Ultrasonic Homogenizer model 3000 for 10 pulses of 3 s each. Resulting supernatant was centrifuged and transferred to 10 mL scintillation vials to which four volumes of ice cold acetone were added, followed by storage of the samples at −80 °C overnight for protein precipitation. Protein concentration in the samples was determined using a Bradford assay (Bradford, 1976) (Supplementary Table S2). Samples were vortexed, centrifuged, and split into two fractions for concurrent analysis by LC-MS and NMR: 1 mL for LC-MS analysis and 4 mL for NMR metabolomics analysis. Both fractions were dried completely using vacuum speed concentration with no heat, and subsequently frozen at −80 °C until further use.

LCMS Instrumentation and Metabolite Analysis

The dried metabolite fraction used for liquid chromatography mass spectrometry (LC-MS) was re-suspended with 20 µL of 50:50 MeOH/H2O before injection into the mass spectrometer. MS-based analysis of polar metabolites was accomplished using an Agilent 1290 ultra-high performance liquid chromatography (UPLC) system coupled to an Agilent 6538 Accurate-Mass quadrupole Time of Flight (TOF) mass spectrometer. A Cogent diamond hydride HILIC chromatography column (2.2 µm, 120 Å, 150 mm × 2.1 mm; Microsolv, Leland, NC, United States) was used for metabolite separation. The gradient began with solvent B (0.1% formic acid in acetonitrile) for 2 min at 50%, followed by a gradient ramp of 50–100% B over 14 min. This step was followed by a hold at 100% solvent B for 1 min, and then a return to initial conditions.
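The solvent program described above can be written out as a small piecewise function, which may help when checking the expected composition at a given retention time. This is only an illustrative sketch: the function name `percent_b` is hypothetical, the ramp is assumed to be linear, and the timing of the return to initial conditions is not stated in the text, so times beyond 17 min are rejected rather than guessed.

```python
def percent_b(t_min: float) -> float:
    """Return the percentage of solvent B at time t_min (minutes into the run)."""
    hold_end = 2.0    # end of the initial hold at 50% B
    ramp_end = 16.0   # 2 min hold + 14 min ramp
    run_end = 17.0    # 1 min hold at 100% B

    if t_min < 0 or t_min > run_end:
        # The re-equilibration timing is not given in the text, so it is not modeled.
        raise ValueError("time outside the described 0-17 min program")
    if t_min <= hold_end:
        return 50.0
    if t_min <= ramp_end:
        # Assumption: linear ramp from 50% to 100% B.
        return 50.0 + 50.0 * (t_min - hold_end) / (ramp_end - hold_end)
    return 100.0


if __name__ == "__main__":
    for t in (0, 2, 9, 16, 17):
        print(f"{t:>4} min: {percent_b(t):.1f}% B")
```

Under the linear-ramp assumption, the midpoint of the ramp (9 min) corresponds to 75% B.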
Mass analysis was conducted in positive mode with a capillary voltage of 3500 V, dry gas temperature of 350 °C at a flow of 8 L/min and the nebulizer was set at 60 psi, injecting 2 µL sample volumes, with blanks run intermittently between samples. Data acquisition parameters were as follows: 50–1,000 mass range at 1 Hz scan rate with a resolution of 18,000. Accuracy based on calibration standards was approximately 5 ppm.

Statistical Analysis of MS Data

Extracted ion chromatograms, peak detection, peak annotation, chromatogram alignment, gap filling and relative quantitation of identified features were completed using MZmine (Pluskal et al., 2010), MetaboAnalyst (Chong and Xia, 2018), and XCMS (Tautenhahn et al., 2012). Metabolite identifications were made based on exact mass and retention time matches to authentic standards using an in-house library of ∼500 compounds. Statistical analysis of the MZmine output was done using Microsoft Excel version 2016 and MetaboAnalyst v4.0. XCMS utilizes an all-inclusive processing package with a similar workflow, in which it extracts chromatograms, identifies peaks, matches peaks across samples, gap fills, performs statistical analyses and in silico compound identification, and graphical visualization of the data. Identifications of unknown features were made using the MetLin Metabolite Database, which provided a list of possible metabolites based on exact mass, species, and likelihood (Myers et al., 2017).

Sample Preparation and NMR Analysis

Dried metabolite mixtures were re-suspended in 600 µL of NMR buffer (containing 0.25 mM 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) in 90% H2O/10% D2O, 25 mM sodium phosphate, pH 7), and transferred into 5 mm NMR tubes.
All one dimensional (1D) 1H NMR spectra were recorded at 298 K using a Bruker AVANCE III solution NMR spectrometer operating at 600.13 MHz 1H Larmor frequency and equipped with a 5 mm liquid-helium-cooled TCI cryoprobe with Z-gradient and a SampleJet™ automatic sample loading system. 1D 1H NMR data were acquired using the Bruker supplied 1D excitation sculpting water suppression pulse sequence ‘zgesgp’ with 256 scans, a 1H spectral window of 9,600 Hz, 32K data points, a dwell time interval of 52 µs, and a recycle delay of 5 s between scan acquisitions. The data were first processed with the Bruker TOPSPIN 3.5 software using standard parameters for chemical shift referencing using the DSS signal and line broadening (0.3 Hz). Spectral phases were manually adjusted, and a polynomial function was applied (qfil, 0.2 ppm width) on the residual water peak to remove its signal. Metabolite identification and quantification were carried out using the Chenomx™ NMR suite software (version 8.3) and its associated 600 MHz small molecule reference spectral database. DSS was used as an internal standard for metabolite quantification, while imidazole NMR signals were used to correct for small chemical shift changes arising from slight pH variations between samples. The metabolite concentration tables (mM) generated with Chenomx were exported to a .csv file and converted to µM and normalized to sample protein concentration as established from Bradford protein assays. Validation of metabolite IDs, which were annotated in Chenomx, was accomplished using 2D 1H-1H and 2D 1H-13C total correlation spectroscopy (TOCSY) NMR or by spiking, when available, pure metabolite standards into the samples and monitoring resulting spectral changes in the 1D 1H NMR spectra.
2D 1H-1H TOCSY spectra were acquired for representative samples using the Bruker-supplied ‘mlevphpr.2/mlevgpph19’ pulse sequences (256 × 2048 data points, 2 s relaxation delay, 32 transients per FID, 1H spectral window of 6602.11 Hz, 80 ms TOCSY spin lock mixing period). 2D 1H-1H TOCSY spectra were processed using Topspin software (Bruker version 3.2).

Statistical Analysis of NMR Data

The NMR-based metabolite data were uploaded to the MetaboAnalyst v4.0 web server for multivariate statistical analysis. Metabolite concentrations were normalized by log transformation and auto-scaling (mean centered divided by the standard deviation of each variable) prior to univariate and multivariate statistical analysis. Student's t-test, principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed to identify potentially distinct metabolite patterns between the E. coli sample groups grown under the different conditions. Variable importance in projection (VIP) plots generated from the PLS-DA data were employed to assess the importance of each variable (i.e., metabolite) in the projection used in PLS-DA model building; statistics were calculated for the data shown in Figure 5, using 3 components, yielding Q2 and R2 values of 0.646 and 0.913, respectively. PLS-DA model validity was further assessed using the (B/W) permutation test function of MetaboAnalyst which, using 2,000 permutation steps, yielded a p-value of < 5 × 10⁻⁴, as a measure of the significance of the PLS-DA model. For hierarchical clustering analysis (HCA), distances were measured using Euclidean distance and clustering was performed with the Ward algorithm.

Results and Discussion

Mass Spectrometry-Based Metabolomics of Non-canonical Amino Acids

An initial set of experiments was conducted to determine the physiological impact of NCAA additions to E. coli cell cultures under otherwise normal growth conditions.
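As a brief aside on the normalization described under Statistical Analysis of NMR Data, the log transformation followed by auto-scaling can be sketched for a single metabolite as below. This is not the MetaboAnalyst implementation: the function name `log_autoscale` is hypothetical, a plain base-10 logarithm is assumed (the web server's transformation may differ in detail), and the sample standard deviation is assumed for scaling.

```python
import math
import statistics


def log_autoscale(concentrations):
    """Log-transform one metabolite's concentrations, then mean-center and
    divide by the standard deviation (auto-scaling)."""
    if any(c <= 0 for c in concentrations):
        # A plain log is undefined for zero or negative values.
        raise ValueError("concentrations must be strictly positive")
    logged = [math.log10(c) for c in concentrations]   # assumption: base-10 log
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)                      # assumption: sample SD (n - 1)
    return [(x - mean) / sd for x in logged]


if __name__ == "__main__":
    # Illustrative concentrations (µM) for one metabolite across three samples.
    print(log_autoscale([10.0, 100.0, 1000.0]))
```

After this step each metabolite has zero mean and unit variance across samples, so abundant and scarce metabolites contribute on a comparable scale to PCA and PLS-DA.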
This study established which NCAA concentrations are needed to evaluate changes in the metabolome of E. coli that may be relevant to normal growth conditions (i.e., cells grown at 37 °C). Cultures were spiked with either 1 mM or 50 µM concentrations of AHA, HPG, or MET. MET was added as a baseline perturbation control experiment to account for the impact additional amino acid would have on the metabolism of E. coli, as opposed to the control group, which had no amendment to the minimal growth medium. Metabolite extracts from the six conditions and control groups were prepared and analyzed by LC-MS using a high-resolution Q-TOF instrument. A total of 4,036 mass features were detected across all sample groups using the MZmine data reduction approach, as described above. Statistical analysis was done in Excel (2016) using the MZmine output, with additional statistical analysis performed using MetaboAnalyst (Chong and Xia, 2018) and XCMS (Tautenhahn et al., 2012). PCA was used to gather information about the variation between sample treatments and replicates. 2D-PCA plots indicated no clear separation among the different experimental groups when all m/z features were analyzed as a single input (Figure 2.1A). The 1 mM MET and both HPG groups displayed the largest separation from the other treatment groups. While principal component 1 (PC1) accounted for 44.5% of the variance, it primarily separated the 1 mM HPG samples from the other experimental conditions. The second principal component (PC2) accounted for 14.7% of the variance but did little to differentiate between the different sample groups. Overall, there was greater variation between sample treatment replicates than between the different sample treatment groups. A heatmap was constructed to visualize differentiated MS features between treatment groups. Heatmaps are a powerful tool for visualizing trends and correlated changes via hierarchical clustering across all samples and all features.
The control and baseline-MET samples intermingled, while AHA and the HPG sample treatments mixed on the hierarchical cluster but were generally clustered apart from the MET and control samples (Figure 2.2A). The heatmap patterns indicated that not all replicates from each treatment clustered with each other; however, a general grouping by type and concentration of NCAA was discernible. The relatively few “hot zones,” or regions of high (dark red) or low (dark blue) abundance features on the heatmap suggested that only a small number of MS features exhibited log2 fold changes > 2, indicating that generally minor metabolic differences existed between the different E. coli treatment groups. Based on 2D-PCA, ANOVA (Supplementary Table S3), t-test and HCA, we thus concluded, from the global MS data, that only small metabolic changes are occurring in E. coli grown in the presence of NCAA.

NMR Metabolite Profiles of E. coli Grown in the Presence of Non-canonical Amino Acids

Intracellular metabolite extracts from the same E. coli cell cultures were analyzed using 1D 1H NMR spectroscopy. As with the MS studies, the NMR metabolomics data readily detected some changes in the metabolome of E. coli as a function of NCAA incorporation. However, these metabolite pattern changes were found to be relatively small and insufficient to unambiguously distinguish the different E. coli cell cultures based on 2D-PCA and HCA analyses (Figures 2.1B, 2.2B) of the different NMR-based metabolite profiles. 54 metabolites were annotated by analysis of the 1D 1H NMR spectra of the E. coli intracellular metabolite extracts using Chenomx (Supplementary Table S4). These metabolite IDs were further validated using spiking of standards and 2D 1H-1H and natural abundance 1H-13C TOCSY experiments (Supplementary Table S5). Metabolite patterns between the different E. coli sample groups, i.e., E.
coli grown with MET, HPG and AHA at 1 mM and 50 µM conditions, were investigated by PCA of the resulting metabolite profiles to assess differences between groups (Figure 2.2B). The NMR metabolomics results are consistent with the MS spectral data. Only the 50 µM HPG samples separated as a unique cluster, similar to what is observed in the heatmap of the LC-MS spectral features (Figure 2.2A). Taken together, our results suggest that overall, the variability between biological replicates is comparable in magnitude to potential metabolic changes arising from the addition of NCAA in the E. coli cell cultures. This combined MS and NMR metabolomics analysis of NCAA-treated E. coli cell cultures demonstrated that these analytical platforms can detect subtle changes in the metabolomes of E. coli grown under different culturing conditions, but that additional studies were needed to parse out how these small but potentially significant metabolic changes may impact cellular phenotypes.

Figure 2.1. 2D-PCA plots of all experimental conditions from E. coli NCAA experiment. (A) MS data and (B) NMR data are shown. (A) The PCA plot of the MS data shows that only cultures grown with 1 mM HPG separate from the other experiment conditions. (B) The NMR data show significant overlap and a lack of differentiation between experimental groups, the only exception being the group with 50 μM HPG.

E. coli Grown With Non-canonical Amino Acid Under Heat Stress

While planning additional experiments, we evaluated how NCAA are typically used in field work and “real-world” research applications to evaluate how an organism regulates its translational activity in response to environmental (Samo et al., 2014; Hatzenpichler et al., 2016; Leizeaga et al., 2017; Sebastián et al., 2019) or (co)cultivation conditions (Mahdavi et al., 2014; Babin S. A. et al., 2016; Bagert et al., 2016). For such purposes, cellular organisms are often grown for short periods of time under environmental perturbations or cellular stress.

Figure 2.2.
Heatmaps of treatment groups clustered on metabolite intensity from E. coli NCAA experiment. (A) MS data and (B) NMR data are shown. The scale of the heat map indicates blue as lowest and red as highest in abundance as calculated across sample groups after normalization using fold change. (A) The heatmap with HCA of 4,036 features as detected by MS is shown. The lack of clustering of the different experimental groups and no significant patterns of up- or down-regulated features for the different groups are indicative of a lack of differentiation between sample types. (B) The NMR data (40 metabolites) also shows a lack of group clustering, the only exception being the group with 50 μM HPG. For enlarged images with metabolite names and sample identifications see Supplementary Figure S5.

We concluded that a more real-world evaluation of an NCAA treatment would include an environmental stressor. Because of the extensive literature available on heat stress response in E. coli (Jozefczuk et al., 2010; Ye et al., 2012), we chose temperature increase as an appropriate stressor. Assessing metabolome changes under BONCAT treatment during heat stress would thus not only help clarify how E. coli cells are adapting to the incorporation of NCAA, it would also recreate a stress condition that may best reflect “real world” research applications. This rationale thus led to a second set of metabolomics investigations, which utilized high temperature as a stress condition during E. coli cell growth with or without an NCAA or MET present. The heat-treated experiment of E. coli consisted of four groups grown at 42 °C. Based on the first set of experiments, we narrowed the experimental conditions to 50 µM AHA, HPG or MET. A MET-supplemented culture was again used as a baseline comparison for BONCAT addition, while the control samples contained minimal media.
Metabolomics studies and resulting multivariate statistical analysis of LC-MS and NMR metabolite profiles were conducted on heat stressed E. coli cell cultures, using the same approach described above for the initial study. As with our initial NCAA addition experiments, physiological data were recorded throughout the growth of the E. coli to monitor phenotypic changes and to assess microbial health. Optical density measurements, averaged over biological replicates, were recorded throughout the E. coli incubation and growth periods, and indicated that all of the cultures were within 8% OD600 of each other, with an average OD600 of 1.3 after 210 min of cell growth. Bradford protein assays were utilized to assess protein content and translational activity, prior to intracellular metabolite extraction, and indicated an average protein concentration of 2.1 mg/mL with all samples within 15% of the average concentration (Supplementary Table S2). The range in protein concentration revealed that E. coli grown in the presence of AHA, HPG or MET resulted in greater intracellular amounts of proteins than the control cell cultures.

MS Metabolomics of Cultures Grown with Non-canonical Amino Acids Under Heat Stress

NMR and MS analyses of the intracellular metabolomes of E. coli cell cultures grown under heat stress were undertaken utilizing the same analytical approaches described for our first set of experiments. LC-MS analysis identified 5,960 features across all samples. To assess variation and replication trends in the data, PCA analysis was undertaken using the MS metabolite profile data recorded on the heat stressed E. coli cell cultures grown in the presence of AHA, HPG, MET or the no addition control conditions. Resulting 2D-PCA plots did not reveal significant separations between these different groups (Figure 2.3A), with PC1 accounting for 77.8% of the variance between AHA, HPG or MET treated groups (red, blue, and cyan circles) compared to control (green circle).
Principal component 2 accounted for an additional 8% of the variance, reinforcing that similarities rather than differences in metabolic profiles between the treatment groups were most prominent. Variability between the control and the MET-treated cell cultures was as great as the difference between these two groups and the AHA and HPG treated groups. The MET, AHA, and HPG groups clustered more tightly, as illustrated by the shaded 95% confidence intervals of the different groups in the 2D PCA scores plot shown in Figure 2.3A, compared to that of the control group. The NCAA treated samples clustered with each other, as did the control and MET-treated E. coli samples. This trend was present in the initial set of experiments conducted without heat stress and became more apparent in the PCA analysis of the stressed E. coli sample groups (Figure 2.3A). An analysis of variance (ANOVA) was also undertaken for the MS-based metabolite profiles of the heat stressed and amino acid treated E. coli cell cultures (Supplementary Figure S2). A comparison of the treatment groups to each other resulted in an F value (Ståhle and Wold, 1989) of 0.274, and an F critical value of 2.61, indicating that the means of the metabolite profiles, i.e., means of the intensities of the MS spectral features, for all the sample groups were not significantly different, and no treatment group differed significantly from the others (Supplementary Table S5). A post hoc analysis using Tukey’s honestly significant difference test (Tukey’s HSD test) was conducted with MetaboAnalyst on individual MS spectral features to identify which features accounted most significantly for group differences between the different E. coli growth conditions (Supplementary Table S6).

Figure 2.3. 2D PCA plot of MS features and NMR features of heat stressed E. coli cultures. (A) MS data and (B) NMR data are shown. (A) The variation within the control group in the MS data completely encompasses the spread of the other sample types. (B) Experimental groups show partial separation by NMR. Data is similar to the MS non-stressed PCA plot in that E. coli cells with HPG have the greatest separation.

The analysis resulted in the identification of 907 features that changed in abundance (Supplementary Figure S2). This is roughly 15% of all observed mass features. Each significantly changed feature was subjected to the post hoc Tukey HSD test. Features that fell outside the means of other treatment groups are listed (Supplementary Table S6). A second method employed to assess the data and the impact of AHA and HPG on the intracellular metabolome of E. coli was to analyze differentially regulated mass features in the heat stressed and AHA or HPG cell cultures compared to the E. coli heat stressed control groups and MET-doped cell cultures. Pairwise comparisons of AHA or HPG treated groups against the MET-treated E. coli cell cultures were conducted, as well as comparisons of control group and MET-treated cultures, which found that only 7% of the mass features were significantly different (fold change > 2, p < 0.10). Using the same criteria, the AHA and HPG samples were found to contain differentially expressed features at levels of 8 and 19% respectively compared to the MET-doped cultures. These analyses indicate that, while E. coli adapts metabolically to the presence of NCAA in its growth medium under heat stress, each NCAA supplementation impacts the intracellular metabolomes of the E. coli cultures in different ways. It also appears that HPG has a greater impact on the metabolome of E. coli than AHA, based on pairwise t-tests. HCA results were plotted on a heatmap to visualize changes in the patterns of individual MS features identified between the NCAA supplemented heat stressed E. coli cell cultures. When taking into account all of the MS features, the AHA and HPG supplemented E.
coli samples separated to a greater extent from the control and MET-supplemented samples, compared to the same groups analyzed in our initial study in the absence of heat stress (Figures 2.2A, 2.4A). Although the boundaries between groups were clearer, the HCA did not separate all replicates of a group into unique clusters, nor did the heatmap reveal the presence of a large number of features with significant fold changes (Figure 2.4A). A heatmap of the top 250 differentiated MS features, as assessed by Tukey’s HSD test, segregated into distinct sample groups best described by growth condition. The heatmap contained blocks of upregulated features that were characteristic of each of the treatment groups (Supplementary Figure S3). In this HCA analysis, the AHA and HPG-supplemented groups clustered next to each other while the MET-supplemented and E. coli control groups were more similar. The differences between the AHA and HPG treated E. coli cell cultures compared to the control and MET-treated groups again showed that MS metabolomics can easily distinguish between the different growth conditions, even when the differentiated features amount to a relatively small proportion of the intracellular metabolome mass spectral features. The color changes on the heatmap indicate fold change and, although not large, reveal that a metabolic adaptation takes place upon addition of the NCAA to the growth medium. To complement the MS-based metabolomics analysis, NMR was utilized to expand metabolite identification and coverage, and to help with the assessment of the potential biological impact of those metabolic adaptations on the cellular phenotypes of E. coli.

Figure 2.4. Heatmaps of heat stressed E. coli cultures. Data from MS and NMR are shown (A, B, respectively). Heat map is coded with blue as low and red as high abundance. Fold change is indicated on the scale. (A) AHA and HPG doped stressed E.
coli cultures show segregated clustering in the MS heatmap of all features (5,960 detected features). (B) The NMR heatmap (55 identified features) shows distinct clustering of Control, HPG, and MET, the exception being AHA which had moderate clustering with MET. For enlarged images that show metabolite names and sample identifications see Supplementary Figure S5.

NMR Metabolomics Analysis of Cultures Grown With Non-canonical Amino Acids Under Heat Stress

From analysis of 1D 1H NMR spectra and spectral profiling using the Chenomx software, 55 metabolites were identified and quantified from intracellular metabolite extracts of the heat stressed, NCAA-supplemented E. coli cultures (Supplementary Table S7). While the MS metabolomics data demonstrated the presence of a certain degree of metabolic adaptation occurring in these cell cultures, the 55 metabolites annotated and validated by NMR provided some clues as to which metabolic pathways may be involved in these metabolic adaptations. The NMR metabolomics studies of the heat stressed, AHA, HPG, and MET supplemented E. coli cell cultures employed the same experimental workflow used for examining the intracellular metabolomes of the E. coli cell cultures in absence of heat stress. Group separations were assessed using PCA. The resulting 2D-PCA scores plots (Figure 2.3B) revealed that HPG, AHA, and MET-supplemented E. coli groups could be separated based on their distinct NMR-based metabolome profiles from the control group. Furthermore, the AHA and HPG treated groups also separated from each other based on distinct metabolite patterns (Figure 2.3B, red and purple 95% confidence interval circles), while the metabolic profile of the MET supplemented group overlapped with that of the AHA-treated group (Figure 2.3B, red and cyan 95% confidence intervals).
Analysis of loading factors (Supplementary Table S8) contributing to PC1 and PC2 of the 2D PCA-score plots revealed that betaine, xanthosine, N-carbamoyl-aspartate, glucose, 4-aminobutyrate, and adenosine contributed significantly to PC1, which accounted for 50.5% of the variance, while PC2 accounted for an additional 10.7%. Although the samples separated from each other primarily along the PC1 axis, as with the MS metabolomics findings, the variation between sample replicates was rather large and resulted in minimal separation by treatment type. In other words, although metabolic adaptations occur within the cell in the presence of NCAA under heat stress, those metabolic responses appear to be rather limited and do not seem to suggest that a significant overhaul of the metabolic machinery of E. coli is taking place. HCA, schematically represented as a heatmap of relative NMR metabolite abundance, was employed to further evaluate differences among the metabolite profiles of each E. coli treatment group (Figure 2.4B). This heatmap was generated from changes in metabolite concentrations observed for the 55 metabolites that were identified by NMR. The control and MET-supplemented groups clustered more closely, while the NCAA treated groups formed a second cluster. The one exception was a replicate from the E. coli cell cultures supplemented with MET (Figure 2.4B). Consistent with the HCA analysis of the MS spectral features, the heatmap representation of the NMR-based metabolite profiles suggests that although AHA or HPG supplementation does induce changes in the intracellular metabolome of E. coli, no dramatic metabolic alterations appeared to have taken place within the cells.

Combined Pathway Analysis of Heat Stressed E. coli Cultures

The NMR metabolomics data provided important information about potential changes in metabolic pathway usage based on a metabolic pathway impact analysis that was conducted using MetaboAnalyst.
Partial least squares discriminant analysis (PLS-DA) (Figure 2.5A) was used, with the resulting variable importance in projection (VIP) scores identifying the metabolites that have the highest discriminatory power among the treatment groups (Cho et al., 2008). Metabolites that contributed most to the separation of the different sample groups are listed in the VIP scores plot and revealed interesting trends between the different cell culture treatment groups (Figure 2.5B). This analysis indicated that NCAA addition impacted amino acid, protein, and lipid metabolism. Intermediates in central carbon metabolism via lipid and amino acid synthesis and TCA cycle related metabolites, including aspartate, glycine, fumarate, glucose, pyruvate, malate, and 4-aminobutyrate were altered as a result of AHA or HPG supplementation in the growth medium, resulting in high differentiation between sample treatments for these molecules (Supplementary Tables S9, S10). Several metabolites related to pyruvate metabolism demonstrated a statistically significant differentiation between the NCAA-treated E. coli groups, supporting the idea that TCA cycle activity was altered in the AHA and HPG supplemented E. coli cell cultures. Intracellular levels of pyruvate, succinate, formate, and acetate were found to be higher in the E. coli cell cultures grown under heat stress and supplemented with NCAA. Metabolites associated with purine and amino acid metabolism, like xanthosine, dTTP, glycine and adenosine, also pointed to metabolic networks related to energy production as being altered in the NCAA treated cells.

Figure 2.5. 3D-PLSDA of metabolites as identified by NMR from the heat stressed non-canonical amino acid doped E. coli cultures and corresponding VIP scores table. (A) The PLSDA shows distinct separation between the doping groups. (B) VIP shows the top 12 metabolites (out of 55 total metabolites) that contributed the most to the variation between sample types.
In addition to amino acid biosynthesis, glycerophospholipid metabolism was dysregulated as a result of NCAA incorporation, with O-phosphocholine and sn-glycero-3-phosphocholine present at higher concentrations in the HPG and AHA supplemented cultures. Metabolites associated with lipid, amino acid and purine metabolism were consistently higher in abundance in the NCAA-treated E. coli groups than in the control and methionine treated E. coli. While NCAA addition impacted TCA cycle activity within the cell and potentially energy production via amino acid, purine and lipid metabolism, the implications of such metabolic changes remain unclear. Amino acid biosynthesis and degradation were altered, as leucine, MET, and tyrosine were present at higher concentrations in the AHA and HPG doped samples than in the control and the MET doped samples (Supplementary Figure S4). These results indicate that amino acid metabolism in E. coli is altered upon addition of an NCAA, suggesting that leucine, MET, and tyrosine catabolism may be suppressed in the AHA and HPG supplemented cell cultures, or that other metabolites serve as metabolic precursors for energy production under these conditions, sparing the utilization of leucine, MET, and tyrosine for this purpose. In the heat exposed, AHA or HPG doped growth conditions, intracellular levels of amino acids were found to be higher than in control or MET-treated groups (Supplementary Tables S9, S10), including higher abundance of acetylated amino acids like N-acetylglycine, N-acetylglutamate, and N-acetylaspartate. Acetylated amino acids could represent breakdown products of proteins that have been acetylated (Arnesen, 2011) and have been reported to be used for metabolic adaptations of microorganisms. Protein acetylation is a common post- and co-translational modification process for metabolic enzymes involved in central metabolism (Christensen et al., 2019).
This modification is usually found on the side chains of amino acids, not on protein backbone residues, and could explain why free acetylated amino acids were detected in high abundances in the AHA and HPG-treated E. coli cell cultures. Amino acid acetylation could also be indicative of a higher rate of post-translational modifications in the NCAA doped samples (Elf and Ehrenberg, 2005), which would suggest changes in the accuracy of protein translation, protein signaling, and protein-protein interactions. There are differing hypotheses on the implications of acetylation on protein degradation. One school of thought indicates that it is protective (Carabetta and Cristea, 2017), while other studies have reported the opposite (Arnesen, 2011). Additional studies are needed to fully elucidate the impact of AHA or HPG supplementation on protein acetylation.

Screening for Potential Degradation Products of AHA and HPG

A question remaining to be addressed on the use of BONCAT is whether protein synthesis indeed serves as the only sink for the incorporation of NCAA or whether some organisms could be capable of metabolizing NCAA for their energetic needs. To address this question, our LC-MS data were searched for potential breakdown and conversion products of AHA and HPG as predicted from KEGG pathways, assuming that AHA or HPG could serve as substrates for enzymatic conversions. Potential compounds of interest included N-formyl-AHA and N-formyl-HPG as well as AHA/HPG versions of 4-(methylsulfanyl)-2-oxobutanoate. Other degradation products were ruled out because they all required activation of the MET-sulfur functional group, which is absent from both HPG and AHA. No features that matched these suspected products were detected. This implies that breakdown of AHA and HPG is not a major metabolic activity of E. coli, and that the main sink for AHA and HPG is, indeed, protein synthesis.
Summary

Utilizing NMR and LC-MS approaches, we were able to establish that NCAA addition can cause metabolic perturbation and adaptation in E. coli, especially when the bacteria are subjected to heat stress. MS analyses indicated that the presence of NCAA altered the concentration of approximately 15% of the global mass features identified based on ANOVA. To put this into perspective, the addition of MET altered the abundance of 7% of all the mass spectral features detected in the E. coli cells. This mild perturbation is consistent with previous studies that have investigated the impact of AHA or HPG replacement of MET (Dieterich et al., 2006; Bagert et al., 2014; Hatzenpichler et al., 2014, 2016; Hatzenpichler and Orphan, 2015; Landgraf et al., 2015; Calve et al., 2016; Lehner et al., 2017). Although the observed metabolic changes were mild, the heatmaps and 2D PCA score plots highlighted trends between the different E. coli treatment groups. HCA also showed that while AHA and HPG addition impacts the global metabolism of E. coli to some extent, the lack of group separation based on distinct metabolite profiles suggests that these metabolic changes are minimal under regular growth conditions and become more pronounced when cells are subjected to heat stress. The global NMR and MS data are consistent in revealing the absence of significant group separation between the different E. coli cell cultures. The largest difference between groups was observed for the HPG-treated cells. Along this same trend, the AHA- and MET-doped cultures were more similar in metabolite profiles, group clustering, and metabolic change at the individual metabolite level. HPG seemed to perturb E. coli to a larger extent than AHA based on paired t-tests, in which 19% and 8% of metabolites changed, respectively. This was not expected because the differential impact of AHA and HPG on E. coli had not been reported previously.
The NMR data lent power to our analysis in the form of metabolite annotation and validation. Changes in specific metabolite levels indicated that pyruvate metabolism and intermediates of the TCA cycle were affected. Changes in central carbon metabolism are a common stress response in E. coli, so the perturbations we observed as a result of NCAA additions are consistent with this archetypical stress response (Jozefczuk et al., 2010). Along with TCA metabolites, glycerophospholipids, amino acids and acetylated amino acids were detected at higher concentrations in the AHA and HPG supplemented E. coli samples. In-depth NMR and MS metabolomic analyses show that supplementing E. coli cultures with NCAA has an impact on the concentration of specific metabolites, leading to a metabolic adjustment. This should serve as a cautionary note to scientists about how and when NCAA can be used. Our data imply that the common practices of using optical density for cells in culture or behavioral analyses for multicellular species to assess the impact of NCAA supplementation do not tell a complete story. Metabolic profiles do change, but our overall assessment is that under normal or even moderately stressful growth conditions, NCAA doping causes minor perturbations to the overall metabolic homeostasis of microbial cells.

Data Availability Statement

The datasets generated for this study can be found in the Metabolomics Workbench, https://www.metabolomicsworkbench.org/data/MWTABMetadata4.php?F=kfsteward_20191206_111451_mwtab_analysis_1.txt&Mode=Study&DataMode=AllData&StudyType=MS#DataTabs.

Author Contributions

RH, BB, VC, and KS conceptualized and designed the study. RH, BS, MD, BE, and NB worked on the experimental setup and manipulations. All authors analyzed and interpreted the data and critically revised the manuscript for important intellectual content. KS, BB, RH, and VC drafted the manuscript.
Funding

This research was supported in part by funding from the Keck Foundation and the National Science Foundation (MCB1817428). Undergraduate participation in this research was made possible through a National Science Foundation Research Experiences for Undergraduates Grant (REU-1461218). Funding for the Proteomics, Metabolomics and Mass Spectrometry Facility used in this publication was made possible in part by the MJ Murdock Charitable Trust and the National Institute of General Medical Sciences of the National Institutes of Health under Award Number P20GM103474. Funding for the NMR facility was provided in part by the NIH SIG program (1S10RR13878 and 1S10RR026659), the National Science Foundation (NSF-MRI:DBI-1532078), the Murdock Charitable Trust Foundation (2015066:MNL), and support from the Office of the Vice President for Research and Economic Development at MSU.

Acknowledgments

The authors thank Jesse Thomas for technical assistance with mass spectrometry.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2020.00197/full#supplementary-material

References

Arnesen, T. (2011). Towards a functional understanding of protein N-terminal acetylation. PLoS Biol. 9:e1001074. doi: 10.1371/journal.pbio.1001074
Babin, M. B., Bergkessel, M., Sweredoski, M. J., Moradian, A., Hess, S., Newman, D. K., et al. (2016). SutA is a bacterial transcription factor expressed during slow growth in Pseudomonas aeruginosa. Proc. Natl. Acad. Sci. U.S.A. 113, E597–E605. doi: 10.1073/pnas.1514412113
Babin, S. A., Zlobina, E. A., Kablukov, S. I., and Podivilov, E. V. (2016). High-order random Raman lasing in a PM fiber with ultimate efficiency and narrow bandwidth. Sci. Rep. 6:22625. doi: 10.1038/srep22625
Bagert, J. D., van Kessel, J. C., Sweredoski, M. J., Feng, L., Hess, S., Bassler, B. L., et al. (2016). Time-resolved proteomic analysis of quorum sensing in Vibrio harveyi. Chem. Sci. 7, 1797–1806.
doi: 10.1039/c5sc03340c
Bagert, J. D., Xie, Y. J., Sweredoski, M. J., Qi, Y., Hess, S., Schuman, E. M., et al. (2014). Quantitative, time-resolved proteomic analysis by combining bioorthogonal noncanonical amino acid tagging and pulsed stable isotope labeling by amino acids in cell culture. Mol. Cell. Proteomics 13, 1352–1358. doi: 10.1074/mcp.M113.031914
Bradford, M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72, 248–254.
Calve, S., Witten, A. J., Ocken, A. R., and Kinzer-Ursem, T. L. (2016). Incorporation of non-canonical amino acids into the developing murine proteome. Sci. Rep. 6:32377. doi: 10.1038/srep32377
Carabetta, V. J., and Cristea, I. M. (2017). Regulation, function, and detection of protein acetylation in bacteria. J. Bacteriol. 199, e107–e117. doi: 10.1128/JB.00107-17
Cho, H.-W., Kim, S. B., Jeong, M. K., Park, Y., Miller, N. G., Ziegler, T. R., et al. (2008). Discovery of metabolite features for the modelling and analysis of high-resolution NMR spectra. Int. J. Data Min. Bioinform. 2, 176–192.
Chong, J., and Xia, J. (2018). MetaboAnalystR: an R package for flexible and reproducible analysis of metabolomics data. Bioinformatics 34, 4313–4314. doi: 10.1093/bioinformatics/bty528
Christensen, D. G., Xie, X., Basisty, N., Byrnes, J., McSweeney, S., Schilling, B., et al. (2019). Post-translational protein acetylation: an elegant mechanism for bacteria to dynamically regulate metabolic functions. Front. Microbiol. 10:1604. doi: 10.3389/fmicb.2019.01604
Dieterich, D. C., Link, A. J., Graumann, J., Tirrell, D. A., and Schuman, E. M. (2006). Selective identification of newly synthesized proteins in mammalian cells using bioorthogonal noncanonical amino acid tagging (BONCAT). Proc. Natl. Acad. Sci. U.S.A. 103, 9482–9487.
Elf, J., and Ehrenberg, M. (2005). Near-critical behavior of aminoacyl-tRNA pools in E. coli at rate-limiting supply of amino acids. Biophys.
J. 88, 132–146.
Hamerly, T., Tripet, B. P., Tigges, M., Giannone, R. J., Wurch, L., Hettich, R. L., et al. (2015). Untargeted metabolomics studies employing NMR and LC–MS reveal metabolic coupling between Nanoarcheum equitans and its archaeal host Ignicoccus hospitalis. Metabolomics 11, 895–907.
Hatzenpichler, R., Connon, S. A., Goudeau, D., Malmstrom, R. R., Woyke, T., and Orphan, V. J. (2016). Visualizing in situ translational activity for identifying and sorting slow-growing archaeal-bacterial consortia. Proc. Natl. Acad. Sci. U.S.A. 113, E4069–E4078. doi: 10.1073/pnas.1603757113
Hatzenpichler, R., Krukenberg, V., Spietz, R. L., and Jay, Z. J. (2020). Next-generation physiology approaches to study microbiome function at the single cell level. Nat. Rev. Microbiol. doi: 10.1038/s41579-020-0323-1
Hatzenpichler, R., and Orphan, V. J. (2015). Detection of Protein-Synthesizing Microorganisms in the Environment via Bioorthogonal Noncanonical Amino Acid Tagging (BONCAT). Berlin, Heidelberg: Springer, 145–157.
Hatzenpichler, R., Scheller, S., Tavormina, P. L., Babin, B. M., Tirrell, D. A., and Orphan, V. J. (2014). In situ visualization of newly synthesized proteins in environmental microbes using amino acid tagging and click chemistry. Environ. Microbiol. 16, 2568–2590. doi: 10.1111/1462-2920.12436
Hinz, F. I., Dieterich, D. C., Tirrell, D. A., and Schuman, E. M. (2012). Non-canonical amino acid labeling in vivo to visualize and affinity purify newly synthesized proteins in larval zebrafish. ACS Chem. Neurosci. 3, 40–49. doi: 10.1021/cn2000876
Jozefczuk, S., Klie, S., Catchpole, G., Szymanski, J., Cuadros-Inostroza, A., Steinhauser, D., et al. (2010). Metabolomic and transcriptomic stress response of Escherichia coli. Mol. Syst. Biol. 6:364. doi: 10.1038/msb.2010.18
Kiick, K. L., Saxon, E., Tirrell, D. A., and Bertozzi, C. R. (2002). Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation. Proc. Natl. Acad. Sci. U.S.A. 99, 19–24.
Kolb, H. C., Finn, M. G., and Sharpless, K. B. (2001). Click chemistry: diverse chemical function from a few good reactions. Angew. Chemie Int. Ed. 40, 2004–2021.
Landgraf, P., Antileo, E. R., Schuman, E. M., and Dieterich, D. C. (2015). BONCAT: Metabolic Labeling, Click Chemistry, and Affinity Purification of Newly Synthesized Proteomes. New York, NY: Humana Press, 199–215.
Lehner, F., Kudlinzki, D., Richter, C., Müller-Werkmeister, H. M., Eberl, K. B., Bredenbeck, J., et al. (2017). Impact of azidohomoalanine incorporation on protein structure and ligand binding. Chem. Biol. Chem. 18, 2340–2350. doi: 10.1002/cbic.201700437
Leizeaga, A., Estrany, M., Forn, I., and Sebastián, M. (2017). Using click-chemistry for visualizing in situ changes of translational activity in planktonic marine bacteria. Front. Microbiol. 8:2360. doi: 10.3389/fmicb.2017.02360
Mahdavi, A., Szychowski, J., Ngo, J. T., Sweredoski, M. J., Graham, R. L., Hess, S., et al. (2014). Identification of secreted bacterial proteins by noncanonical amino acid tagging. Proc. Natl. Acad. Sci. U.S.A. 111, 433–438. doi: 10.1073/pnas.1301740111
Myers, O. D., Sumner, S. J., Li, S., Barnes, S., and Du, X. (2017). Detailed investigation and comparison of the XCMS and MZmine 2 chromatogram construction and chromatographic peak detection methods for preprocessing mass spectrometry metabolomics data. Anal. Chem. 89, 8689–8695. doi: 10.1021/acs.analchem.7b01069
Pluskal, T., Castillo, S., Villar-Briones, A., and Oresic, M. (2010). MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 11:395. doi: 10.1186/1471-2105-11-395
Samo, T. J., Smriga, S., Malfatti, F., Sherwood, B. P., and Azam, F. (2014). Broad distribution and high proportion of protein synthesis active marine bacteria revealed by click chemistry at the single cell level. Front. Mar. Sci. 1:48. doi: 10.3389/fmars.2014.00048
Sebastián, M., Estrany, M., Ruiz-Gonzalez, C., Forn, I., Sala, M.
M., Gasol, J. M., et al. (2019). High growth potential of long-term starved deep ocean opportunistic heterotrophic bacteria. Front. Microbiol. 10:760. doi: 10.3389/fmicb.2019.00760
Ståhle, L., and Wold, S. (1989). Analysis of variance (ANOVA). Chemom. Intell. Lab. Syst. 6, 259–272.
Tautenhahn, R., Patti, G. J., Rinehart, D., and Siuzdak, G. (2012). XCMS Online: a web-based platform to process untargeted metabolomic data. Anal. Chem. 84, 5035–5039. doi: 10.1021/ac300698c
Ye, Y., Zhang, L., Hao, F., Zhang, J., Wang, Y., and Tang, H. (2012). Global metabolomic responses of Escherichia coli to heat stress. J. Proteome Res. 11, 2559–2566. doi: 10.1021/pr3000128

CHAPTER THREE

ACUTE STRESS REDUCES POPULATION-LEVEL METABOLIC AND PROTEOMIC VARIATION

Contribution of Authors and Co-Authors

Manuscript in Chapter Three

Author: Katherine F. Steward
Contributions: Study design and conceptualization, data compilation, data analysis and interpretation, drafting and editing manuscript.
Co-Author: William E. Dyer
Contributions: Study design and conceptualization, data interpretation, manuscript revision, manuscript drafting.
Co-Author: Valérie Copié
Contributions: Study design and conceptualization, data interpretation, manuscript revision, manuscript drafting.
Co-Author: Jennifer Lachowiec
Contributions: Study design and conceptualization, data analysis and interpretation, manuscript revision, manuscript drafting.
Co-Author: Brian Bothner
Contributions: Study design and conceptualization, data analysis and interpretation, manuscript revision, manuscript drafting.

Manuscript Information

Katherine F. Steward, William E.
Dyer, Valérie Copié, Jennifer Lachowiec, Brian Bothner
Communications Biology
Status of Manuscript:
____ Prepared for submission to a peer-reviewed journal
_X___ Officially submitted to a peer-reviewed journal
____ Accepted by a peer-reviewed journal
____ Published in a peer-reviewed journal
Nature Research Submitted Manuscript

ACUTE STRESS REDUCES POPULATION-LEVEL METABOLIC AND PROTEOMIC VARIATION

Katherine F. Steward1, William E. Dyer1,3, Valérie Copié1,2, Jennifer Lachowiec3, and Brian Bothner1,2
(1) Department of Chemistry and Biochemistry, Montana State University, Bozeman MT 59717
(2) Thermal Biology Institute, Montana State University
(3) Department of Plant Sciences and Plant Pathology, Montana State University
To whom correspondence should be addressed: Brian Bothner, Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT 59717, Phone: 406-994-5270, FAX: 406-994-4807, E-mail: bbothner@chemistry.montana.edu

Abstract:

Variation in omics data due to intrinsic biological stochasticity is often viewed as a challenging and undesirable feature of complex systems analyses. In fact, numerous statistical methods are utilized to minimize the variation among biological replicates. We demonstrate that the common statistics relative standard deviation (RSD) and coefficient of variation (CV), which are often used for quality control or as part of a larger pipeline in omics analyses, can also be used as a metric of an organism’s response to physiological stress. In an approach we term Replicate Variation Analysis (RVA), the data and analyses presented here demonstrate that metabolome or proteome CV profiles reflect acute stress when feature-wide canalization of CV is observed. Multiple in-house mass spectrometry omics datasets in addition to publicly available data were analyzed to assess changes in CV profiles in plants, animals, and microorganisms.
We contrasted RVA results across experiments to create a foundation for understanding omics level adaptations due to stress. Our RVA approach helps characterize stress response and recovery, and could be deployed to detect populations under stress, monitor health status, and conduct environmental monitoring.

Introduction:

Cells respond to stress through numerous mechanisms to maintain homeostasis. For example, DNA damage repair, the unfolded protein response, mitochondrial stress signaling, and regulated cell death are all global stress pathways [1]. These programs are initiated by signaling molecules including metabolites and proteins. Metabolomics and proteomics methods are thus well suited for investigating cellular stress response (CSR), as they capture global snapshots of an organism’s physiological state at a given time [2], [3]. Global phenotypic information, built from data on individual molecules, helps explain not only stress, but also disease states, antibiotic or herbicide resistance, and evolutionary fitness [4] by characterizing phenotypic plasticity and baseline physiology [2], [5], [6]. Studies investigating CSR generally focus on a specific stressor, model system, or pathway. In this study, we demonstrate that acute stress in plants, animals, and microorganisms can decrease global physiological variability, and that measures of phenomic variation are a useful descriptive statistical metric.

Standard omics workflows typically report variability among individuals and groups using approaches including relative standard deviation (RSD) or coefficient of variation (CV), hierarchical clustering, principal component analysis (PCA), or other multivariate statistical analyses [7]. CV is used in omics analyses to evaluate the repeatability of a biological assay or the precision of an experiment [8] and is reported as the ratio of the standard deviation to the mean.
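As a minimal sketch of the CV calculation described above, the following Python snippet computes a percent CV for each feature across the biological replicates of one group. The function names, the feature names, and the abundance values are hypothetical and chosen only for illustration; the use of the sample standard deviation (n − 1 denominator) is also an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def feature_cv(values):
    """Percent CV of one feature across replicates: 100 * SD / mean.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean abundance is zero")
    return 100.0 * stdev(values) / m


def cv_profile(table):
    """Map each feature name to its percent CV.

    `table` is a dict of feature name -> list of replicate abundances
    for a single treatment group.
    """
    return {name: feature_cv(vals) for name, vals in table.items()}


# Hypothetical abundances for two features measured in four replicates.
control = {
    "feature_1": [10.0, 12.0, 8.0, 10.0],
    "feature_2": [5.0, 5.0, 5.0, 5.0],
}
profile = cv_profile(control)
for name, cv in profile.items():
    print(f"{name}: CV = {cv:.1f}%")
```

The collection of per-feature CV values for a group is what the text refers to as a CV profile; the group-level mean and median CV are simple summaries of that collection.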
However, variability in data is generally considered to be undesirable, and many methods have been employed to minimize intra-group variation among biological replicates [9]–[11]. Nonetheless, intrinsic phenotypic variability among individuals in a population has been exploited to provide population-level insights in the fields of ecology, evolution, and genetics [12]. For example, Yablokov et al. used standard deviation and CV metrics to report ranges of phenotypic states within a population of marine mammals, and proposed that such data inform on how new taxa arise [13]. We now expand on this foundation by characterizing and comparing CV profiles of metabolome and proteome data obtained from resting and stress-challenged organisms in order to better describe CSR. Changes in CV means and medians were determined, and CV distribution profiles compared [14], to comprise what we term Replicate Variation Analysis (RVA).

Previously, we reported that the CV of metabolites of unique model systems including wild pig (Sus scrofa) decreased due to acute stress [15]. Using a standard metabolomics workflow, multivariate clustering and PCA were used to detect differences between treatment groups, as well as identify global trends for reduced metabolite variation in both species when under acute stress. We have now extended the analyses of S. scrofa data using RVA and applied the same approach to stress responses in a variety of organisms including plant, bacterial, and multiple eukaryotic species.

Results

This project began with our previous observation that CV distributions of metabolomes (n = 8) derived from urine were altered during hemorrhagic shock in S. scrofa [15]. The focus of that work was to identify relevant stress biomarkers. Now, our reanalysis of the data revealed that metabolic variation among individuals was significantly reduced during hemorrhagic shock.
A 2D PCA score plot indicated that data variability is canalized under stress with a reduction in both PC1 and PC2 (Figure 3.1A). Further analysis revealed that stress is associated with a reduced median CV value (from 59% to 46%), a significantly reduced mean CV (62% to 49%, Wilcoxon T test, p < 0.001), and a CV profile that is shifted towards a peaked distribution (Kolmogorov-Smirnov test, d = 0.287, p < 0.001, Figure 3.1B).

Figure 3.1. Metabolic variation in response to hemorrhagic shock in a mammal. A. Principal component analysis of control (red) and shocked (green) S. scrofa (n = 8). B. Profile distribution plots of the CV of metabolite features from S. scrofa replicates from a control (black) and a shocked group (pink). The X axis shows the CV and the Y axis is the proportion of metabolites in the metabolome. Adapted from Heinemann et al. 2014.

To establish whether decreased variation in metabolite abundance among biological replicates is a general outcome of acute stress, we analyzed additional in-house metabolomics data sets. Our attention turned to a data set that was employed to investigate the metabolic impact of Bio Orthogonal Non-Canonical Amino Acid Tags (BONCAT) on the cellular growth of Escherichia coli [16]. Batch cultures of E. coli were grown on minimal medium (Control) or with additions of methionine (MET), azidohomoalanine (AHA), or homopropargylglycine (HPG) (n = 5). Intracellular metabolite profiles were analyzed using both MS and NMR-based metabolomics techniques. 2D PCA analysis of the MS metabolomics data (Figure 3.2A) revealed that the control cultures displayed greater variation among biological replicates than the treatment groups. When the same mass spectrometry data were analyzed using RVA, changes in the CV profiles between the control and amino acid tag additions were also observed (Figure 3.2B). CV means and medians were decreased, and the distribution profiles became narrowed with a sharper peak (Figure 3.2C).
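The comparison of CV profiles reported above (median CV plus a Kolmogorov-Smirnov d statistic) can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in this study: the function name and the CV values are hypothetical, and the code computes only the two-sample d statistic, not its p-value. In practice a statistics library would normally be used for the full test (for example `scipy.stats.ks_2samp`), which is an assumption about tooling rather than something stated in the text.

```python
from bisect import bisect_right
from statistics import median


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov d statistic.

    d is the largest vertical gap between the empirical cumulative
    distribution functions (ECDFs) of the two samples.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The largest gap between two step functions occurs at a data point,
    # so it is enough to evaluate both ECDFs at every observed value.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical per-feature percent CVs for a control and a stressed group.
control_cv = [60.0, 55.0, 70.0, 45.0, 80.0, 65.0]
stressed_cv = [40.0, 45.0, 50.0, 35.0, 48.0, 52.0]

d = ks_statistic(control_cv, stressed_cv)
print(f"median CV: control = {median(control_cv)}, stressed = {median(stressed_cv)}")
print(f"K-S d = {d:.3f}")
```

A d value near 0 means the two CV profiles are nearly indistinguishable, while a value near 1 means they barely overlap, which is how the d statistics quoted throughout this chapter can be read.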
The NMR data revealed a similar pattern in distribution between the control and HPG samples (K-S test, d = 0.28, p = 0.022) with a median decrease from 18% (control) to 13% (HPG) and a significant decrease in mean %CV (control = 26%, HPG = 15%, Wilcoxon T test, p = 0.0015; Figure 3.2D). The RVA approach demonstrated that metabolomic dysregulation in HPG was greater than AHA, which was greater than MET, and all three treatments caused a decrease in variation relative to the control, a pattern mirrored in the NMR metabolomics data as well (Supplemental figure IA). The RVA distribution profiles matched the differential abundance analysis of the original work, in which we showed that the HPG, AHA and MET additions resulted in significant perturbation to 19, 11, and 7% of the metabolites, respectively. RVA thus has the potential to be used as a measure of stress, as it correlates with dysregulation of analytes.

We next analyzed data from physiological investigations of the weedy plant Avena fatua (wild oat). To investigate the global impact of acute stress, we inflicted a heat shock treatment (40 °C, 24 h) on inbred seedlings, followed by metabolomics analyses after increasing durations of recovery (n = 8). This study demonstrated that CV distribution profiles were markedly altered soon after heat shock (Figure 3.3A). Median CV values were reduced from 67% in untreated plants to 28% in heat-shocked plants. Mean CV values also showed a significant change between untreated and heat shock groups (control = 76%, heat shock = 37%, Wilcoxon T test, p < 0.001). As documented for S. scrofa and E. coli above, CV distributions were also significantly canalized following heat shock (K-S test, d = 0.46, p < 0.001). These A. fatua data were also analyzed to assess the kinetics of recovery from stress, and how this impacts CV distribution.
Over the course of a 100-hour recovery period, CV distribution means increased from 37% to 76% (Wilcoxon T test, p < 0.001) while K-S test d values decreased from 0.45 to 0.13 (Figure 3.3B). During recovery, the CV distribution widened and became less peaked, with the metabolome approaching a distribution that resembled data from untreated plants. As seen in the E. coli data above, the temporal CV distribution profiles of heat shock and recovery in A. fatua suggest that a qualitative measure of stress can be assessed based on CV distributions of the population.

Figure 3.2. Metabolic variation in E. coli treated with non-canonical amino acids. A. Principal component analysis of four different treatment groups from non-canonical amino acid treatment experiments on E. coli cell cultures with median displayed as a solid line (red = AHA treatment, green = control, blue = HPG and cyan = MET) (Steward et al. 2020). B. Distribution plots of CV of mass spectrometry metabolite feature profiles for the non-canonical amino acid treated cultures of E. coli. C. Table of CV statistics, including the K-S d statistic for the different comparisons of the Control to the other groups, the CV mean and the CV median. D. Profile distribution plots of the CV of NMR metabolite features from E. coli replicates from a control (black) and HPG treated (pink).

Analysis of public omics data sets

To examine the generality of our approach and observations, a series of systematically selected, published data sets from other research groups were analyzed. We employed our approach on MS-based metabolomics data to track the metabolic adaptations of a methionine sensitive cancer cell line [17]. The original experiment involved replacing methionine in the growth medium with homocysteine, followed by an acclimatization period (n = 4). The cell lines stressed by the loss of methionine failed to thrive in its absence, but supplementation with homocysteine resulted in adaptations that enabled cell growth.
RVA analyses of the metabolic mass spectral features demonstrated that the stress imparted by the absence of methionine resulted in significantly reduced intra-group variations (K-S test, d = 0.72, p < 0.001) (Supplemental figure IB). This pattern was also reflected in mean CV values, which decreased from 15% to 6% (Wilcoxon T test, p < 0.001) for the control and stressed groups, respectively, and median CV values decreased from 15% to 4%. This cancer cell dataset was of particular interest because it also included a temporal analysis of CSR. Adaptation to homocysteine was tracked over 12 hours by periodic removal of metabolite samples from untreated and methionine-stressed cells. The CV profiles indicated that the peaked profile of early time points shifted to a wider distribution resembling that of the control group, and the K-S test d statistic changed from 0.72 to 0.22 between the 2- and 12-hour timepoints (Figure 3.3C).

Figure 3.3: Distribution of CV in A. fatua and temporal RVA analysis. A. Distribution profile plot of metabolomic CV of A. fatua exposed to heat shock at 40 °C (pink) and the control group (black). B. Temporal CV profiles from heat stressed A. fatua. Time post-stress is from zero to 100 hours of recovery. Table below: values of the K-S test. C. Temporal CV profiles of methionine dependent cancer cell line supplemented with homocysteine in the growth media, with timepoints collected after 2, 4, 8 and 12 hours of acclimation. Table below: values of the K-S test.

The second external dataset came from a study in which Neocloeon triangulifer (mayflies) were fasted overnight and then subjected to heat stress or ambient temperature [18]. Metabolite samples (n = 6) were analyzed by LCMS. RVA analysis revealed subtle changes in CV values, with a slight decrease in mean CV from 23% to 20% and in median CV from 18.1% to 16.5% between the ambient temperature insects and the heat shocked group.
Although mean and median changes were small, CV distributions tended towards a canalized profile in the heat exposed group (K-S test, d = 0.078, p = 0.037) (Supplemental figure IC). The difference in CV profiles reflects the impact of acute thermal stress, even under a shared fasting condition.

The third external data set originated from a metabolomics study that investigated the impact of diet on mouse intestinal digesta composition. The treated group was fed a low protein, low fat chow to mimic malnourishment, and mass spectrometry metabolite data were collected from control and diet-restricted mice (n = 4) [19]. RVA analysis demonstrated a clear change in CV distribution profiles (K-S test, d = 0.48, p < 0.001), with a change in median CVs from 48% (control) to 21% (diet) and mean CVs (control = 48%, diet = 21%) (Supplemental figure IIA).

The fourth and fifth datasets originated from a two-treatment study in which Haliotis discus hannai (sea abalone) (n = 9) that had been acclimated to either high or low temperature were subjected to heat stress or no heat treatment, and mass spectrometry metabolite profiles were compared [20]. When analyzed using RVA, the CV distributions of cold-acclimated abalone with and without heat shock displayed a significant change (Wilcoxon T test, p < 0.001) in CV mean from 29% (control) to 24% (heat shock) and a median CV decrease from 25% (control) to 20% (heat shock) (K-S test, d = 0.18, p < 0.001; Supplemental figure IIB). High temperature acclimated abalone groups with and without heat shock also exhibited a significant change in CV distribution profiles (K-S test, d = 0.077, p = 0.033), representing a narrower distribution for the heat shock group, though the CV means were similar (Supplemental figure IIB).
Together, re-analysis of the mayfly and high temperature acclimated abalone data highlights that RVA profiles can detect even small changes in intra-group metabolome variation and CV distribution imparted by acute stress, even after a stress acclimation period.

Proteomics Data

We next wished to establish whether proteome data also reflected a canalization of variation following acute stress. We first used an in-house proteomic study investigating E. coli cell cultures grown under aerobic or anaerobic conditions (n = 4). 2D PCA score plots indicated less variation across both PC1 and PC2 in the anaerobic group (Figure 3.4A), while RVA revealed a significant difference in CV distribution (K-S test, d = 0.19, p < 0.001) as well as a trend towards a smaller CV mean (from 7.7% to 6.6%) and CV median (from 6.7% to 5.3%) for the anaerobic group (Figure 3.4B). We followed this analysis by mining the PRIDE proteome archive database [21] to search for additional external examples, including an investigation of drought stress effects on two varieties of Chinese bread wheat (Triticum aestivum L.) (n = 3) [22]. Our RVA analysis demonstrated reduced CV distributions in drought-stressed proteome profiles (K-S test, d = 0.47, p < 0.001), with lower mean and median CV values in the stressed group (mean = 25%, median = 11.1%) than in the control group (mean = 42%, median = 29.8%; Wilcoxon T test, p < 0.001; Supplemental figure III). Both prokaryotic and eukaryotic proteome datasets thus provide evidence that a reduction in intra-group variation in response to acute stress applies to diverse classes of omics data.

Figure 3.4: RVA of Proteomics data and Simulation Analysis. A. Principal component analysis of proteomic data from anaerobic and aerobic E. coli cultures, shown in green and red respectively. B. CV distribution plots for anaerobic (pink) versus aerobic (black) E. coli cultures. C/D.
Simulated data with 3, 6, 10 or 20 replicates using 50, 500 or 5,000 features. The standard deviation was modeled at 0.5 of the mean (C) and 0.23 of the mean (D).

Exceptions to the Model

Through mining the Metabolomics Workbench data repository, we determined that not all datasets exhibit this relationship between variation and stress. Three metabolomics datasets examined did not display a significant change in CV distribution when control and treatment groups were compared. After a thorough analysis of experimental design, the exceptions were classified into two categories. The first category was for metabolomics analyses conducted using alternative analytical approaches, such as a targeted analysis in which isotopically labeled carbon from 256 specific metabolites was tracked to evaluate heat shock on Caenorhabditis elegans (nematodes) [23]. CV distributions of the data from heat shocked and control groups were not significantly different (K-S test, d = 0.04, p = 0.83), indicating that a targeted approach may not detect CV canalization. In support of this premise, an NMR metabolomics study analyzing cadmium exposure in Danio rerio (zebrafish) embryos also failed to show a difference in CV distribution between control and treatment groups (K-S test, d = 0.27, p = 0.17) [24]. Similar to the nematode study, only 33 metabolites were analyzed in the zebrafish study. While the change in CV distribution does not appear to depend on the number of features analyzed, it is more difficult to quantify a difference between two discontinuous distributions, like those resulting from targeted NMR data, than between distributions built from the 1000+ spectral features that make up a metabolomics or proteomics mass spectrometry data set.

Other exceptions that did not reveal stress-induced CV profile changes involved studies where the biological groups in question were subject to chronic rather than acute stress.
A blood plasma metabolomics study of Chronic Fatigue Syndrome (CFS) in both male (control = 18, CFS = 22) and female (control = 23, CFS = 21) human patients revealed that the CV distribution significantly increased in patients suffering from chronic fatigue compared to healthy control subjects (males: K-S test, d = 0.10, p < 0.001; females: K-S test, d = 0.10, p < 0.001), and the mean values increased slightly as well (males: 32% to 34%; females: 36% to 39%). Given our observations that the period of stress and/or recovery time impacts CV distribution, we surmise that in contrast to acute stress, chronic stress may result in an opposite trend and a corresponding increase in CV distribution patterns. This idea is consistent with the evolutionary theory that directional evolution based on environmental stress induces increased phenotypic and genetic variation [25]. However, additional work is needed to fully characterize the relationship between chronic stress and CSR variation in omics data.

Simulations

The fact that targeted or less than global data failed to display canalization stood out as potentially impactful to RVA. NMR datasets typically report on tens to hundreds of metabolites, while mass spectrometry-based metabolomics data often contain a thousand or more spectral features. We hypothesized that the number of features comprising the CV distribution may affect the statistical power to discern differences between data sets. To test this, known CV profiles were simulated to understand how varying feature number (50, 500, and 5000), replicate number (3, 6, 10, and 20), and the ratio of feature mean to standard deviation influenced the error in CV distributions. These values were chosen as they are reasonable representations of different omics experimental designs. A CV profile was simulated first, from individual feature means and standard deviations based upon mayfly data [18].
Next, varied numbers of samples were drawn 1000 times for assessment. Calculating the correlation coefficient between the “known” and sampled CV distributions revealed that more replicates in the experiment and a smaller ratio of standard deviation to the mean yield more accurate estimates (Figure 3.4C and 3.4D). Unexpectedly, the number of features was not a predictor of the accuracy of CV distribution calculations, contrary to our hypothesis. This suggests that stress response can be an emergent property from the collective action of untargeted features. The number of biological replicates and the variance of a specific feature, however, are primary considerations. RVA is most informative when maximizing replicate numbers, which should be a priority consideration for experimental design.

Discussion:

The present study demonstrates a connection between variability across samples and stress that can be quantified at the omics level. By repurposing CV as a statistic of merit, a stressed phenotype was identified. The approach presented here can help characterize CSR and holds the potential to assess the magnitude of stress recovery. The biological mechanisms underlying reduced variation have the potential to categorize disease and stress states of a population as a property of the phenome. A metabolic bottleneck (i.e. a single optimum solution to resource use; aka convergence) [26]–[28] is one possible mechanism to explain this reduction. However, we also propose that the change could be less of an active CSR pathway initiation and more of a passive reaction in which ancillary metabolic pathways are quieted in the perturbed organism. The lack of nutrients or influx of stressors on the system activates CSR, and thus other pathways may be down regulated to mitigate the physiological effects of stress [29]. The CSR mechanisms that result in canalization are unknown, but the ability to observe and quantify this population-level response provides a valuable perspective on the phenome.
Whether it is activating CSR, turning down auxiliary pathways or a combination of both, our analyses demonstrate that acute stress results in omics profiles that are less variable. The change in variation distribution holds valuable information but leads to additional questions. The temporal studies visualized stress adaptation as changes in the CV distributions. Following acute stress, the CV distribution shifts towards a smaller mean CV. This shift could be a gradient of CSR, or it could be individuals within the population relaxing their stress response at slightly different times. Single cell analysis of Xenopus oocytes investigated this idea, studying the MAPK cascade response to progesterone stimulus [30]. Ferrell and Machleder were able to determine that patterns of phosphorylation in the population exhibited a bimodal distribution, with individuals responding to stress not gradually, but as if a switch had been flipped [30]. Research along this line, using RVA, will help to answer an ongoing and fundamental question about CSR: does it function as a rheostat or a switch? RVA provides a finite characterization that can feasibly yield identification of physiological mediators responsible for the metabolic canalization of a stressed phenotype.

CV as a global bottom-up statistic holds much potential; however, it is not without limitations. As we have shown, not all treatment versus control data sets follow the trend outlined here. Commonalities of studies that did not have reduced variability in “stress” groups included the presence of chronic stress and the use of a targeted rather than nontargeted analytical approach. Chronic stress on a system is a known cause of deleterious mutations that can result in disease, cancer and even death [31]. Data that support a reduced CV are from systems under acute stress that did not cause overt cellular death or an immediate disease state; permanently altered CSR is not part of the metabolic response examined here.
As shown in the wild oat and cancer cell examples, a temporal RVA analysis shows that CSR recovery allows the CV distributions to relax, closer to a control distribution. Potentially the data that do not match this trend were collected at timepoints not relevant to active stress amelioration. Data appropriate for this model also need to be sufficiently comprehensive, as a discontinuous distribution will hinder RVA. Used in coordination with typical omics workflows, the addition of RVA has the potential to impact many areas of research and help make otherwise unrecognized connections.

Describing the phenotypic differences of a genotype due to environmental conditions is the aim of phenomics, which is at the intersection of metabolomics, proteomics, and genomics, and at the forefront of multiple human health and agricultural studies. RVA can help characterize a population phenome with statistics that are straightforward to generate. Additionally, RVA could potentially be used as a predictive tool, to help pinpoint early changes in metabolite or protein levels that indicate stress or future disease. RVA also has implications at the juncture of stress response and resistance. It has been shown that repeated exposure to acute stress can result in long term phenotypic changes, as observed in antibiotic resistant E. coli populations, herbicide resistant weedy species [32], and prolonged stress adaptation in Drosophila melanogaster [33]–[35]. The nuances of the relationship between intra-population variability (the variome) and stress response are a promising area for additional study.

Methods:

For previously published data, experimental details can be found in the respective publications. The Sus scrofa study analyzed machine learning techniques to identify biomarkers of hemorrhagic shock; CV was noted in that paper but not further analyzed [36]. The effect of Bio Orthogonal Non-Canonical Amino Acids on E.
coli was evaluated at the metabolite level, analyzing the addition of either AHA, HPG or methionine [16]. Methionine sensitive cancer cells were subjected to methionine starvation with homocysteine replacement in the media, with the metabolite changes tracked over time [17]. The next study focused on heat shock treatment of mayflies to analyze stress tolerance, using GC-MS for metabolomics analysis [18]. The next study used mouse models to evaluate malnutrition, analyzing MS-based metabolome changes [19]. The last two examples used in the metabolomics section came from a study on heat stress in abalone, studying metabolome effects of heat stress after a high or low temperature acclimation [20]. The proteomics data set utilized here analyzed drought stress in Chinese bread wheat leaves [22].

Metabolomics Analysis of Heat Shocked Avena fatua

Avena fatua plants were grown from origin seeds as described in Burns et al., 2018. After three weeks of growth, plants were placed in a temperature-controlled chamber for 24 hours at 40 °C. Shoot material was harvested at time intervals of 0, 6, 24, 48, and 100 hours after heat shock. The material was immediately placed in liquid nitrogen and stored at -80 °C for metabolite extraction. Frozen tissue was ground for 1 minute in liquid N2 with a mortar and pestle. The powdered tissue (approximately 150 mg per sample) was suspended in methanol (MeOH) at 70 °C for 15 minutes. Samples were vortexed for 1 min and then centrifuged (25,000 g, 10 minutes, 4 °C) to remove cellular debris from the soluble fraction. To precipitate proteins from the soluble metabolite fraction, ice cold acetone was added at a ratio of 4:1 acetone:extract and the mixture was stored at -20 °C overnight, followed by centrifugation (25,000 g) at 4 °C for 10 minutes. The resulting supernatant fraction was dried and stored at -80 °C. Prior to analyses by LC-MS, samples were resuspended in 40 µL of 50% HPLC grade water / 50% MeOH.
MS-based analysis of polar metabolites was accomplished using an Agilent 1290 ultra-high performance liquid chromatography (UPLC) system coupled to an Agilent 6538 Accurate-Mass quadrupole Time of Flight (TOF) mass spectrometer, using a HILIC column (Cogent Diamond Hydride HILIC, 2.2 µm, 120 Å, 150 mm x 2.1 mm; Microsolv, Leland, NC) for metabolite separation. The gradient for separation started with a hold of solvent B (0.1% formic acid in acetonitrile) for 2 minutes at 50%, followed by a gradient ramp of 50-100% B over fourteen minutes, then an isocratic hold at 100% solvent B for one minute, with a return to initial conditions. Mass analysis was conducted in positive mode with a capillary voltage of 3500 V, a dry gas temperature of 350 °C at a flow of 8 L/min, and the nebulizer set at 60 psi, injecting 2 µL sample volumes, with blanks run intermittently between samples. Data acquisition parameters were as follows: 50-1,000 mass range at 1 Hz scan rate with a resolution of 18,000. Accuracy based on calibration standards was approximately 5 ppm.

Proteomic Analysis of Escherichia coli Grown Under Aerobic or Anaerobic Conditions

Proteomics analysis of aerobic versus anaerobic E. coli cultures was carried out on MG1655 (K12) in LB media at 37 °C. Four replicate cultures were started with a 5 µL inoculation from an overnight culture and grown under an atmosphere of nitrogen or ambient air until harvest at mid-log phase (0.4 OD for the aerobic samples and 0.3 OD for anaerobic samples). Cells were pelleted using centrifugation and proteins extracted immediately. The cell pellets were resuspended in 0.1 M Tris-HCl pH 7.5 buffer with 8 M urea and subjected to three freeze/thaw cycles in liquid nitrogen, followed by ultrasonication for 5 minutes (Biologix - Model 13000). Samples were centrifuged, the resulting supernatant was removed, and proteins were precipitated from it using ice cold acetone and storage at -20 °C for one hour.
The precipitated proteins were centrifuged, the supernatant was removed and the protein pellet was resuspended in 0.1 M Tris-HCl pH 6.8, 5 µM EDTA, 50 mM N-ethylmaleimide in 6 M urea. This sample was transferred to a 3K MWCO Nanosep centrifuge device and a modified FASP digestion was carried out. The sample was reduced with an excess of DTT and alkylated using 50 mM iodoacetamide. The samples were washed four times with 50 mM ammonium bicarbonate pH 7.8 and then digested using sequencing grade trypsin at a 20:1 protein:protease ratio for 18 hours. Samples were run on a Dionex Ultimate 3000 Nano UHPLC equipped with an Acclaim PepMap 100 C18 trap column (100 µm x 2 cm) and an Acclaim PepMap RSLC C18 column (75 µm x 50 cm, C18, 2 µm, 100 Å) for separation. Mobile phase A was 0.1% formic acid in HPLC grade water and B was 80/20 acetonitrile:water. Peptides were separated at 0.6 nL/min using a linear solvent gradient from 3-30% B over 120 min. The LC system was coupled to a Bruker maXis Impact mass spectrometer with a captive spray ESI source, which was used to collect spectra from 150 to 1750 m/z at a maximum rate of 2 Hz for precursor and fragment spectra, with adaptive acquisition for highly abundant ions. Data dependent MS/MS was used to collect sequence information on the 5 most abundant ions per full scan. Data analysis was done using MaxQuant (v1.6.4.0) and Perseus (v1.6.4.10).

Mining of Public Data

Data were obtained from the Metabolomics Workbench [37] and the PRIDE proteomics repository [21]. The archives were searched for data sets that matched “stress” in the keyword search. If the summary described an omics data set that evaluated a stress or perturbation and a control group, both with at least three biological replicates, the uploaded data set was evaluated. If the data provided were in a raw format (e.g. “sample.d” datafile), the set was discarded in order to avoid potential bias from our in-house processing pipeline.
If the data were in a final, processed tabular format and experimental conditions were clearly described, the data were used. Reasons for not using a data set included lack of clearly defined experimental and control groups, undecipherable sample codes, or incomplete data inclusion. Data sets that met the criteria of containing stress and control groups with at least three biological replicates were evaluated by replicate variation analysis.

Statistical Analysis

CV statistics were calculated using the standard deviation and the mean of individual metabolites or proteins in a group. The standard deviation was taken as a ratio to the mean and reported as a percentage. This was done for every detected metabolite feature or protein to obtain the distribution of the omic population. Statistical analysis was carried out in R [38], and distribution plots were made using ggplot2 [39] and ggridges [40]. PCA plots, histograms of CV, distribution plots, and distribution statistics of mean and median were all calculated and plotted. A two sample Kolmogorov-Smirnov (KS) test was utilized to compare the empirical distribution functions of the control and the treatment groups. The two sample KS test describes the differences between the shape and location of the two distributions being tested using the d statistic with a calculated p value. A larger d statistic indicates a larger change between the two distributions being compared [14].

Simulated Data Analysis

The process of simulating these CV distributions requires two levels of simulation: first, a simulation of the population level CV distribution and second, simulations of the individual replicates sampled from these CV distributions. Therefore, the “true” CV distributions across the population level were simulated first. For this, both the means and standard deviations were simulated for each omics feature. The Mayfly treatment dataset presented in Figure 2 was used to parameterize simulations.
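The two-level design described in this subsection can be sketched in Python as follows. This is a hedged illustration, not the simulation code used for Figure 3.4: the log-scale parameters log_mean and log_sd are placeholders (the mayfly-derived values are not reproduced here), the standard deviation of each feature is read as a fold of its mean drawn from a normal distribution centered on 0.23 or 0.5 with a spread of 0.1 (one interpretation of the description), and the function names and the reduced iteration count are assumptions chosen to keep the example short.

```python
import math
import random
import statistics


def ranks(values):
    """Rank of each value (1 = smallest); continuous draws make ties unlikely."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def simulate_once(n_features, n_replicates, sd_fold, rng,
                  log_mean=5.0, log_sd=1.0, fold_sd=0.1):
    """One simulated experiment; returns Spearman rho of sampled vs "true" CV."""
    true_cv, sampled_cv = [], []
    for _ in range(n_features):
        # Level 1: population-level mean and SD for this feature.
        mean = math.exp(rng.gauss(log_mean, log_sd))  # log scale keeps means > 0
        fold = rng.gauss(sd_fold, fold_sd)
        while fold <= 0:  # redraw so every feature has a positive SD
            fold = rng.gauss(sd_fold, fold_sd)
        sd = fold * mean
        true_cv.append(sd / mean)
        # Level 2: replicates drawn for this feature, then the sample CV.
        reps = [rng.gauss(mean, sd) for _ in range(n_replicates)]
        sampled_cv.append(statistics.stdev(reps) / statistics.mean(reps))
    return spearman(true_cv, sampled_cv)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mean_corr = {}
for n_rep in (3, 20):
    # 50 iterations for speed; the text describes 1000 per combination.
    runs = [simulate_once(200, n_rep, 0.23, rng) for _ in range(50)]
    mean_corr[n_rep] = statistics.mean(runs)

print(mean_corr)  # the 20-replicate design should track the "true" CV more closely
```

Repeating the loop over feature counts (50, 500, 5000) and over both fold settings would reproduce the full grid of conditions described in the text.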
The means were drawn from a normal distribution with 1) a mean equal to the log(mean) of the Mayfly dataset, to disallow negative values, and 2) a standard deviation equal to the standard deviation of the log(mean) of the dataset. Each mean also required a corresponding simulated standard deviation. Within the Mayfly treatment dataset, the standard deviation varies from 0.02-1.65x of its corresponding mean, with a mean standard deviation fold-change of 0.23. Therefore, we tested both 0.23-fold and 0.5-fold of the mean and 0.1 as the standard deviation to randomly assign each mean a corresponding standard deviation. Finally, the CV was calculated for each mean-standard deviation pair to create the “true” CV distribution. Forty distributions were simulated. Random sampling from each of the CV distributions was simulated as follows: for each mean and standard deviation pair, varied numbers of replicates were drawn, and the CV was computed. The Spearman’s correlation between the CV for these simulated samples and the “true” CV simulated in the first step was determined. The process was repeated 1000 times for each replicate and feature number combination.

References Cited

[1] L. Galluzzi, J. M. Bravo-San Pedro, O. Kepp, and G. Kroemer, “Regulated cell death and adaptive stress responses,” Cellular and Molecular Life Sciences, vol. 73, no. 11–12. Birkhauser Verlag AG, pp. 2405–2410, Jun. 01, 2016, doi: 10.1007/s00018-016-2209-y.
[2] C. H. Johnson, J. Ivanisevic, and G. Siuzdak, “Metabolomics: Beyond biomarkers and towards mechanisms,” Nature Reviews Molecular Cell Biology, vol. 17, no. 7. Nature Publishing Group, pp. 451–459, Jul. 01, 2016, doi: 10.1038/nrm.2016.25.
[3] R. Schuhmacher, R. Krska, W. Weckwerth, and R. Goodacre, “Metabolomics and metabolite profiling,” Analytical and Bioanalytical Chemistry, vol. 405, no. 15. Springer, pp. 5003–5004, Jun. 17, 2013, doi: 10.1007/s00216-013-6939-5.
[4] W. S. Bush, M. T. Oetjens, and D. C.
Crawford, “Unravelling the human genome-phenome relationship using phenome-wide association studies,” Nature Reviews Genetics, vol. 17, no. 3. Nature Publishing Group, pp. 129–145, Mar. 01, 2016, doi: 10.1038/nrg.2015.36.
[5] T. M. Healy and P. M. Schulte, “Phenotypic plasticity and divergence in gene expression,” Molecular Ecology, vol. 24, no. 13. Blackwell Publishing Ltd, pp. 3220–3222, Jul. 01, 2015, doi: 10.1111/mec.13246.
[6] D. Houle, D. R. Govindaraju, and S. Omholt, “Phenomics: The next challenge,” Nature Reviews Genetics, vol. 11, no. 12. Nature Publishing Group, pp. 855–866, Dec. 18, 2010, doi: 10.1038/nrg2897.
[7] E. J. Want et al., “Solvent-Dependent Metabolite Distribution, Clustering, and Protein Extraction for Serum Profiling with Mass Spectrometry,” Anal. Chem., vol. 78, no. 3, pp. 743–752, Feb. 2006, doi: 10.1021/ac051312t.
[8] C. E. Brown, “Coefficient of Variation,” in Applied Multivariate Statistics in Geohydrology and Related Sciences, Springer Berlin Heidelberg, 1998, pp. 155–157.
[9] P. H. Bessette, F. Aslund, J. Beckwith, G. Georgiou, and S. Blanquet, “Efficient folding of proteins with multiple disulfide bonds in the Escherichia coli cytoplasm,” Proc. Natl. Acad. Sci., vol. 96, no. 24, pp. 13703–13708, Nov. 1999, doi: 10.1073/pnas.96.24.13703.
[10] A. M. De Livera, G. Olshansky, J. A. Simpson, and D. J. Creek, “NormalizeMets: assessing, selecting and implementing statistical methods for normalizing metabolomics data,” Metabolomics, vol. 14, no. 5, p. 54, May 2018, doi: 10.1007/s11306-018-1347-7.
[11] A. M. De Livera et al., “Statistical Methods for Handling Unwanted Variation in Metabolomics Data,” Anal. Chem., vol. 87, no. 7, pp. 3606–3615, Apr. 2015, doi: 10.1021/ac502439y.
[12] J. M. Jimenez-Gomez, J. A. Corwin, B. Joseph, J. N. Maloof, and D. J. Kliebenstein, “Genomic Analysis of QTLs and Genes Altering Natural Variation in Stochastic Noise,” PLoS Genet., vol. 7, no. 9, p. e1002295, Sep.
2011, doi: 10.1371/journal.pgen.1002295.
[13] E. C. Olson and A. V. Yablokov, “Variability in Mammals,” J. Mammal., vol. 48, no. 3, p. 500, Aug. 1967, doi: 10.2307/1377806.
[14] F. J. Massey, “The Kolmogorov-Smirnov Test for Goodness of Fit,” J. Am. Stat. Assoc., vol. 46, no. 253, pp. 68–78, 1951, doi: 10.1080/01621459.1951.10500769.
[15] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, and B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” Metabolomics, vol. 10, no. 6, pp. 1121–1128, Mar. 2014, doi: 10.1007/s11306-014-0651-0.
[16] K. F. Steward et al., “Metabolic Implications of Using BioOrthogonal Non-Canonical Amino Acid Tagging (BONCAT) for Tracking Protein Synthesis,” Front. Microbiol., vol. 11, p. 197, Feb. 2020, doi: 10.3389/fmicb.2020.00197.
[17] S. L. Borrego et al., “Metabolic changes associated with methionine stress sensitivity in MDA-MB-468 breast cancer cells,” Cancer Metab., vol. 4, no. 1, p. 9, Dec. 2016, doi: 10.1186/s40170-016-0148-6.
[18] H. Chou, W. Pathmasiri, J. Deese-Spruill, S. Sumner, and D. B. Buchwalter, “Metabolomics reveal physiological changes in mayfly larvae (Neocloeon triangulifer) at ecological upper thermal limits,” J. Insect Physiol., vol. 101, pp. 107–112, Aug. 2017, doi: 10.1016/j.jinsphys.2017.07.008.
[19] E. M. Brown et al., “Diet and specific microbial exposure trigger features of environmental enteropathy in a novel murine model,” Nat. Commun., vol. 6, Aug. 2015, doi: 10.1038/ncomms8806.
[20] F. Xu, T. Gao, and X. Liu, “Metabolomics Adaptation of Juvenile Pacific Abalone Haliotis discus hannai to Heat Stress,” Sci. Rep., vol. 10, no. 1, pp. 1–11, Dec. 2020, doi: 10.1038/s41598-020-63122-4.
[21] “PRIDE - Proteomics Identification Database.” https://www.ebi.ac.uk/pride/archive/ (accessed Oct. 09, 2020).
[22] M.
Zhang et al., “Phosphoproteome analysis reveals new drought response and defense mechanisms of seedling leaves in bread wheat (Triticum aestivum L.),” J. Proteomics, vol. 109, pp. 290–308, Sep. 2014, doi: 10.1016/j.jprot.2014.07.010.
[23] “Human Metabolome Database.” https://hmdb.ca/ (accessed Oct. 09, 2020).
[24] A. J. Green et al., “Cadmium exposure increases the risk of juvenile obesity: a human and zebrafish comparative study,” Int. J. Obes., vol. 42, no. 7, pp. 1285–1295, Jul. 2018, doi: 10.1038/s41366-018-0036-y.
[25] A. V. Badyaev, “Stress-induced variation in evolution: From behavioural plasticity to genetic assimilation,” Proceedings of the Royal Society B: Biological Sciences, vol. 272, no. 1566. Royal Society, pp. 877–886, May 07, 2005, doi: 10.1098/rspb.2004.3045.
[26] F. T. C. Pan, S. L. Applebaum, and D. T. Manahan, “Differing thermal sensitivities of physiological processes alter ATP allocation,” J. Exp. Biol., vol. 224, no. 2, p. jeb233379, Jan. 2021, doi: 10.1242/jeb.233379.
[27] J. R. Banavar, J. Damuth, A. Maritan, and A. Rinaldo, “Supply-demand balance and metabolic scaling,” Proc. Natl. Acad. Sci. U. S. A., vol. 99, no. 16, pp. 10506–10509, Aug. 2002, doi: 10.1073/pnas.162216899.
[28] C. Pollock, J. Farrar, D. Tomos, J. Gallagher, C. Lu, and O. Koroleva, “Balancing supply and demand: the spatial regulation of carbon metabolism in grass and cereal leaves,” J. Exp. Bot., vol. 54, no. 382, pp. 489–494, Jan. 2003, doi: 10.1093/jxb/erg037.
[29] F. Chen, A. Evans, J. Pham, and B. Plosky, “Molecular Cell Editorial Cellular Stress Responses: A Balancing Act,” Mol. Cell, vol. 40, p. 175, 2010, doi: 10.1016/j.molcel.2010.10.008.
[30] J. E. Ferrell and E. M. Machleder, “The biochemical basis of an all-or-none cell fate switch in Xenopus oocytes,” Science, vol. 280, no. 5365, pp. 895–898, May 1998, doi: 10.1126/science.280.5365.895.
[31] R. P. Juster, B. S. McEwen, and S. J.
Lupien, “Allostatic load biomarkers of chronic stress and impact on health and cognition,” Neuroscience and Biobehavioral Reviews, vol. 35, no. 1. Pergamon, pp. 2–16, Sep. 01, 2010, doi: 10.1016/j.neubiorev.2009.10.002.
[32] W. E. Dyer, “Stress-induced evolution of herbicide resistance and related pleiotropic effects,” Pest Manag. Sci., vol. 74, no. 8, pp. 1759–1768, Aug. 2018, doi: 10.1002/ps.5043.
[33] A. M. Pickering, L. Vojtovich, J. Tower, and K. J. A. Davies, “Oxidative stress adaptation with acute, chronic, and repeated stress,” Free Radic. Biol. Med., vol. 55, pp. 109–118, Feb. 2013, doi: 10.1016/j.freeradbiomed.2012.11.001.
[34] W. E. Dyer, “Stress-induced evolution of herbicide resistance and related pleiotropic effects,” Pest Manag. Sci., vol. 74, no. 8, pp. 1759–1768, Aug. 2018, doi: 10.1002/ps.5043.
[35] M. N. Ahmed, A. Porse, M. O. A. Sommer, N. Høiby, and O. Ciofu, “Evolution of antibiotic resistance in biofilm and planktonic Pseudomonas aeruginosa populations exposed to subinhibitory levels of ciprofloxacin,” Antimicrob. Agents Chemother., vol. 62, no. 8, Aug. 2018, doi: 10.1128/AAC.00320-18.
[36] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, and B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” Metabolomics, vol. 10, no. 6, pp. 1121–1128, Dec. 2014, doi: 10.1007/s11306-014-0651-0.
[37] “Metabolomics Workbench : NIH Data Repository : Overview.” https://www.metabolomicsworkbench.org/data/index.php (accessed Oct. 09, 2020).
[38] “RStudio | Open source & professional software for data science teams - RStudio.” https://rstudio.com/ (accessed Oct. 09, 2020).
[39] “ggplot2 citation info.” https://cran.r-project.org/web/packages/ggplot2/citation.html (accessed Oct. 16, 2020).
[40] C. O. Wilke, “Ridgeline Plots in ‘ggplot2’ [R package ggridges version 0.5.2],” Jan. 2020, Accessed: Oct. 16, 2020. [Online]. Available: https://cran.r-project.org/package=ggridges.
CHAPTER FOUR

PROBING MECHANISMS OF REDUCTIVE PYRITE DISSOLUTION IN METHANOCOCCUS VOLTAE BY PROTEOMICS

Contribution of Authors and Co-Authors

Manuscript in Chapter Four

Author: Katherine F. Steward
Contributions: sample preparation, data analysis, data interpretation, manuscript drafting, manuscript editing, manuscript submission.

Co-Author: Rachel L. Spietz
Contributions: sample preparation, data analysis, data interpretation, manuscript drafting, manuscript editing, manuscript submission.

Co-Author: Devon Payne
Contributions: sample preparation, data interpretation, manuscript drafting, manuscript editing, manuscript submission.

Co-Author: Will Kincannon
Contributions: data analysis, manuscript editing

Co-Author: Christina Johnson
Contributions: data interpretation

Co-Author: Malachi Lensing
Contributions: data analysis, manuscript drafting

Co-Author: Hunter Fausset
Contributions: data interpretation, manuscript drafting

Co-Author: Brigitta Németh
Contributions: data analysis, data interpretation, manuscript drafting

Co-Author: Eric M. Shepard
Contributions: data interpretation, manuscript drafting, manuscript editing

Co-Author: William E. Broderick
Contributions: data interpretation, manuscript drafting

Co-Author: Joan B. Broderick
Contributions: data interpretation, manuscript drafting, manuscript editing

Co-Author: Jen Dubois
Contributions: data analysis, data interpretation, manuscript drafting, manuscript editing

Co-Author: Eric S. Boyd
Contributions: data interpretation, manuscript drafting, manuscript editing

Co-Author: Brian Bothner
Contributions: data analysis, data interpretation, manuscript drafting, manuscript editing, manuscript submission.

Manuscript Information

Katherine F. Steward, Rachel L. Spietz, Devon Payne, Will Kincannon, Christina Johnson, Malachi Lensing, Hunter Fausset, Brigitta Németh, Eric M. Shepard, William E. Broderick, Joan B. Broderick, Jen Dubois, Eric S.
Boyd, Brian Bothner

Status of Manuscript:
__x__ Prepared for submission to a peer-reviewed journal
____ Officially submitted to a peer-reviewed journal
____ Accepted by a peer-reviewed journal
____ Published in a peer-reviewed journal

PROBING MECHANISMS OF REDUCTIVE PYRITE DISSOLUTION IN METHANOCOCCUS VOLTAE BY PROTEOMICS

Katherine F. Steward1, Rachel L. Spietz2, Devon Payne2, Will Kincannon1, Christina Johnson1, Malachi Lensing1, Hunter Fausset1, Brigitta Németh1, Eric M. Shepard1, William E. Broderick1, Joan B. Broderick1, Jen Dubois1, Eric S. Boyd2, Brian Bothner1

1Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana, 59717
2Department of Microbiology and Immunology, Montana State University, Bozeman, Montana, 59717

To whom correspondence should be addressed: Brian Bothner, Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT 59717, Phone: 406-994-5270, FAX: 406-994-4807, E-mail: bbothner@chemistry.montana.edu

Keywords: methanogen, iron-sulfur cluster, proteomics, FeS2, pyrite, mackinawite, FeS

Abstract

The use of iron-sulfur proteins is ubiquitous across all Archaea, Bacteria and Eukaryotes. Although methanogens genetically code for more Fe-S proteins than other organisms, the biogenesis of Fe-S clusters is poorly understood. Here, we utilize a comprehensive proteomics analysis of the model archaeon Methanococcus voltae A3 in the presence of FeS2 or Fe(II)/HS- to elucidate candidate proteins and protein networks involved in Fe and S acquisition, trafficking and storage. The model archaeal system M. voltae was used to evaluate Fe and S assimilation. M. voltae was selected as it has a larger proportion of Fe-S binding motif proteins. Among interesting proteins differentiated on FeS2 are Suf proteins, FeoA/B, iron-sulfur binding ferredoxin motif containing proteins and oxidoreductases. A concurrent study on M.
voltae grown on FeS2 indicated that the archaea were experiencing stress, as evidenced by their smaller size. We used a variation analysis to distinguish changes in M. voltae due to stress from those due to Fe and S availability. This analysis reiterated proteins of importance to Fe and S assimilation and showed differentiation between stress proteins like Hsp20 and the Universal Stress Protein. This study helped identify important candidate proteins involved in Fe and S acquisition and helped characterize a phenotype that arose from changes in Fe and S availability.

Introduction

All cells require iron (Fe) and sulfur (S) as components of amino acids, vitamins, coenzymes and cofactors [1]. Specifically, these elements are utilized in cysteine, methionine, heme, and biological [Fe-S] clusters that function in electron transfer, substrate binding, and enzyme catalysis [2]. The ability to obtain Fe and S from the environment is critical for the growth of microorganisms. Oxidative weathering of minerals was of minimal importance prior to the evolution of oxygenic photosynthesis and the subsequent accumulation of O2 in Earth’s atmosphere [3], [4], [5]. This assumption, that anoxic conditions meant sulfide minerals like FeS2 were not biologically available, raises key questions about how the acquisition of S, Fe and other minerals occurs in microorganisms. Methanogens are a deeply rooted branch of archaea that produce methane as a byproduct of their central metabolism [6]. As a group, they are obligate anaerobes that utilize carbon dioxide (CO2) for energy production, with a final product of methane (CH4). This process is catalyzed by enzymes with Fe-S cofactors [7]. Methanogens also have Fe-S containing metalloenzymes that are capable of using key metalloclusters to catalyze oxidation-reduction reactions and to carry out electron transfer, storage and a range of other reactions [8].
Examples of important enzyme systems that rely on Fe-S clusters yet evolved before biological oxidation of the environment include NiFe and MoFe enzymes that catalyze the reversible oxidation of H2 to H+ (hydrogenase) and the reduction of N2 to NH3 (nitrogenase), respectively [9]. Hydrogenases, nitrogenases, methane-generating enzymes, and much of the supporting biochemistry all require cells to have steady access to soluble Fe and S. Methanogens are among the most primitive of extant organisms [4], [6], [10]. Given that this group arose before oxygenation of the atmosphere, when bioavailability of Fe and S was likely limited, one would not expect a higher than average reliance on these elements. Yet studies comparing the Fe content of Escherichia coli and Methanococcus maripaludis revealed that M. maripaludis uses 15-fold more Fe than E. coli per mg of protein [11]. Methanogens are divided into two lineages based on the presence of a SufS gene, which codes for a cysteine desulfurase that is required to liberate S from cysteine [12]. The ancestral lineage (Class I) does not have SufS. Recent work has demonstrated that Class I methanogens can reductively assimilate Fe and S directly from FeS2 [13]. Further, another study found that Methanococcus voltae cells contain 167% more Fe when grown on FeS2 than on ferrous Fe (Fe(II)) and sulfide (HS-) [14]. Therefore, Fe and S in FeS2 must be bioavailable to some microorganisms in anoxic environments, forcing a reevaluation of modern and ancient biogeochemical cycles. The mechanism and cellular pathways involved in this process have yet to be elucidated, but recent transcriptomics work in Methanosarcina barkeri growing with different Fe/S sources implicates alpha-keto reductases, a flavin mononucleotide-dependent flavodoxin reductase and hydrolases as potential mediators of FeS2 reduction [15]. Here we use M.
voltae A3, a model class I methanogen, to gain insight into the proteins and pathways critical for reductive assimilation of Fe and S by analyzing samples prepared from both FeS2 and canonical (Fe(II)/HS-) culture conditions. Unlike many class I methanogens, M. voltae A3 does not encode any form of nitrogenase in its published genome (NCBI Taxonomy ID: 456320). The long-term goal of this work is to identify candidate proteins responsible for the reduction of FeS2 (pyrite) and subsequent uptake of Fe and S (presumably as small FeS(aq) molecular clusters), membrane proteins involved in transport, as well as the intracellular partners that are involved in the storage of Fe and S and the subsequent assembly of [Fe-S] clusters. These experiments will also shed light on cell-wide changes in protein synthesis, energetic strategies, and metabolic priorities. Our in-depth LCMS-based proteomics analysis of the intra- and extracellular proteomes under different conditions captured 77% of the predicted protein-coding regions of M. voltae. Widespread changes in the intra- and extracellular fractions demonstrate that this class I methanogen is highly sensitive to the available Fe and S and makes large-scale changes in a wide range of metabolic, redox-active, and transport pathways to take advantage of the available Fe and S species.

Methods

Cell culture conditions

M. voltae strain A3, obtained from the American Type Culture Collection (ATCC BAA-1334), was grown in Fe- and S-free basal medium that contained (g L-1): NaCl, 21.98; MgCl2 · 6H2O, 5.10; NaHCO3, 5.00; NH4Cl, 0.50; K2HPO4, 0.14; KCl, 0.33; CaCl2 · 2H2O, 0.10. The basal medium was amended with 0.01 g L-1 Fe(NH4)2(SO4)2 · 6H2O and 0.480 g L-1 Na2S · 9H2O for Fe(II)/HS- grown cells. Thirty minutes prior to inoculation, sulfide was added from an anoxic, sterile stock. Basal medium was amended with a synthetic FeS2 slurry to 2 mM Fe for FeS2 cultures.
Trace element, vitamin, and organic solutions were added to the basal medium (each 1% v/v), based on Whitman et al. [16] but omitting Fe and replacing sulfate salts with chloride salts at the same molar concentrations. The trace element solution contained (g L-1): nitrilotriacetic acid, 1.500; MnCl2 · 4H2O, 0.085; CoCl2 · H2O, 0.100; ZnCl2, 0.047; CuCl2 · 2H2O, 0.0683; NiCl2 · 6H2O, 0.0683; Na2SeO3, 0.200; Na2MoO4 · 2H2O, 0.100; and Na2WO4 · 2H2O, 0.100. The vitamin solution contained (g L-1): pyridoxine HCl, 0.01; thiamine HCl, 0.005; riboflavin, 0.005; nicotinic acid, 0.005; calcium D(+) pantothenate, 0.005; biotin, 0.002; folic acid, 0.002; and cobalamin, 0.0001. The organics solution consisted of 1 M sodium acetate · 3H2O, 75 mM L-leucine HCl, and 75 mM L-isoleucine HCl. M. voltae cultures were supplemented with a 40% (wt/v) sodium formate stock solution added to a final concentration of 0.4% (v/v) prior to inoculation.

Cultivation Procedures

Seventy-five mL cultures of M. voltae were grown in 165 mL serum bottles and were harvested during the log phase of growth. Anaerobic conditions were maintained for culture harvesting. Samples were then centrifuged at 4,696 x g for 20 minutes at 4 °C in a swinging bucket rotor. For extracellular fractions, 10 mL of spent media supernatant was decanted under aerobic conditions into 40 mL of ice-cold acetone (Fisher Scientific, Fair Lawn, NJ) and left at -20 °C for four hours. The samples were then centrifuged to pellet the extracellular proteins and stored at -80 °C.

Protein extraction

Cell pellets were resuspended in 500 mL of pH 7 phosphate buffer (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4) with protease inhibitor mix (Complete Mini EDTA Free Protease Inhibitor Cocktail, Roche). Samples were lysed using an ultrasonic homogenizer on ice for 15 minutes and were then centrifuged, leaving the soluble protein fraction in suspension in the supernatant.
The supernatant was collected and four column volumes of ice-cold acetone were added to precipitate the proteins; samples were then left in the -80 °C freezer for 1 hour and then at -20 °C overnight. The acetone was removed and protein pellets were stored at -80 °C for proteomic analyses.

Proteomics analysis

Protein pellets were digested using the Thermo Scientific EasyPep Mini MS sample prep kit (Cat# A40006). Briefly, samples were reduced and alkylated using iodoacetamide and digested with a mixture of trypsin/LysC, a modified version of Lundby et al. [9]. Samples were passed over a C18 reverse-phase column prior to Liquid Chromatography Mass Spectrometry (LCMS) to remove undigested protein. LCMS was performed on an UltiMate 3000 RSLCnano system (Thermo Scientific, San Jose, CA) using a self-packed ReproSil-Pur C18 column (100 µm x 35 cm). The gradient changed solvent B from 2% to 90% over 92 minutes. Solvent A was water with 0.1% formic acid; solvent B was acetonitrile with 0.1% formic acid. The LC was coupled to the mass spectrometer through a digital PicoView nanospray source (New Objective, Woburn, MA) that was modified with a custom-built column heater and an ABIRD background suppressor (ESI Source Solutions, Woburn, MA). The column was packed at 9000 psi using a nano LC column packing kit (nanoLCMS, Gold River, CA). Data-Independent Acquisition (DIA) [1], [2] mass spectral analysis was performed using an Orbitrap Fusion mass spectrometer (Thermo Scientific, San Jose, CA). Six gas phase fractions (GPF) of the biological sample pool were used to generate a reference library. The GPF acquisition used 4 m/z precursor isolation windows in a staggered pattern (GPF1 398.4-502.5 m/z, GPF2 498.5-602.5 m/z, GPF3 598.5-702.6 m/z, GPF4 698.6-802.6 m/z, GPF5 798.6-902.7 m/z, GPF6 898.7-1002.7 m/z). Biological samples were run on an identical gradient as the GPFs using a staggered window scheme (4 m/z Exploris 480, 24 m/z Fusion) over a mass range of 385-1015 m/z.
An empirically corrected library, which combines the GPF data and the deep neural network Prosit [17], was used to generate predicted fragments and retention times using Scaffold DIA (Proteome Software, Portland, OR).

Data Analysis

DIA data were analyzed using Scaffold DIA (2.1.0). Raw data files were converted to mzML format using ProteoWizard (3.0.19254) [5]. Deconvolution of staggered windows was performed. Analytical samples were aligned based on retention times and individually searched against uniprot-M_Voltae_UP000007722_20200218.fasta.z3_nce33_v2.dlib with a peptide mass tolerance of 10.0 ppm and a fragment mass tolerance of 10.0 ppm. Variable modifications considered were: Modification on Cysteine. The digestion enzyme was assumed to be trypsin with a maximum of 1 missed cleavage site allowed. Only peptides with charges in the range [18] and length in the range [6-30] were considered. Peptides identified in each sample were filtered by Percolator (3.01.nightly-13-655e4c7-dirty) [19]–[21] to achieve a maximum FDR of 0.01. Individual search results were combined and peptide identifications were assigned posterior error probabilities and filtered to an FDR threshold of 0.01 by Percolator (3.01.nightly-13-655e4c7-dirty). Peptide quantification was performed by EncyclopeDIA (0.9.2). For each peptide, the 5 highest quality fragment ions were selected for quantitation. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis were grouped to satisfy the principles of parsimony. Proteins with a minimum of 2 identified peptides were thresholded to achieve a protein FDR threshold of 1.0%.

Statistical Analysis

Data from Scaffold represent the average protein intensity, which takes into account all peptide intensities assigned to a protein. The intensities for data analyzed within the Scaffold software were log10 transformed and normalized using the Scaffold method.
Subsequent analysis was performed using Excel and bioinformatics tools: MetaboAnalyst [22], PHYRE [23], and PSORTb [24]. For MetaboAnalyst, data spreadsheets were first uploaded and checked for integrity. Protein abundance values were interquartile range (IQR) filtered to eliminate outliers. Missing values were imputed using the k-nearest neighbors (KNN) algorithm, and features with >50% missing values were discarded. Protein abundances were then normalized by the sum of all features within a sample, log transformed, and auto scaled (mean-centered and divided by the standard deviation of each variable) prior to statistical analysis. T-test and fold change analyses were performed to assess the significance and magnitude of protein abundance differences (volcano plot). The heatmap employed hierarchical cluster analyses of the samples and features, with the sample dendrogram on the x axis and the feature dendrogram on the y axis. Distances in the dendrograms are Euclidean and use the Ward clustering algorithm; feature rankings are based on their t-test values.

Results

Global Intracellular Proteomics

In the present study, we analyzed the global proteomic response of M. voltae A3 cells grown in minimal base salts medium provided with either 20 µM Fe(II) and 2 mM HS- or 2 mM FeS2 as the sole source of Fe and S and with formate as the methanogenesis substrate. The intracellular and extracellular protein fractions were analyzed separately by shotgun proteomics to identify and quantify changes in cellular protein expression. Protein fractions from batch cultures were digested with trypsin and analyzed using LC-MS/MS. We identified 1,269 (77%) of 1,658 protein-coding genes [25] predicted from the M. voltae A3 genome. Grouping of the identified proteins by pathway analysis showed broad coverage of functional classes, as expected given the high percent coverage of the genome (Figure S1).
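The preprocessing sequence described in the Statistical Analysis section (normalization by sum, log transformation, auto scaling) can be illustrated with a minimal sketch. It is not the MetaboAnalyst implementation: the function name `normalize_abundances`, the choice of log base 10, and the toy input are assumptions made only for illustration, and the IQR filtering and KNN imputation steps are omitted.

```python
import math
from statistics import mean, stdev


def normalize_abundances(samples):
    """Sum-normalize, log10-transform, and auto scale a samples x proteins table.

    `samples` is a list of rows (one per sample); every row holds strictly
    positive abundances for the same proteins in the same order.
    """
    # 1) Normalize each sample by the sum of all of its features.
    summed = [[v / sum(row) for v in row] for row in samples]
    # 2) Log transform (base 10 is an assumption for this sketch).
    logged = [[math.log10(v) for v in row] for row in summed]
    # 3) Auto scale each protein: mean-center, then divide by its
    #    standard deviation across samples.
    scaled = [row[:] for row in logged]
    for j in range(len(logged[0])):
        column = [row[j] for row in logged]
        mu, sd = mean(column), stdev(column)
        for i in range(len(logged)):
            scaled[i][j] = (logged[i][j] - mu) / sd if sd > 0 else 0.0
    return scaled


if __name__ == "__main__":
    toy = [[10.0, 20.0, 70.0], [20.0, 30.0, 50.0], [30.0, 10.0, 60.0]]
    for row in normalize_abundances(toy):
        print([round(v, 3) for v in row])
```

After auto scaling, every protein has mean 0 and unit standard deviation across samples, so highly abundant proteins do not dominate the clustering or the t-test ranking.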
Comparative analysis of protein expression under the two growth conditions (Fe(II)/HS- vs FeS2) using a Student’s t-test showed that 509 proteins had significantly different abundances (fold change (fc) > 2, p < 0.05), with 285 of these proteins more abundant in the Fe(II)/HS- condition and 224 proteins in greater abundance in cells grown on FeS2 (Figure 4.1A and Tables S1 and S2). Hierarchical clustering of proteins was used to compare the global response of M. voltae cultures under the two growth conditions. By limiting this analysis to the 500 most differentially expressed proteins between the two treatments, we saw that Fe and S source had a dramatic impact on the proteome (Figure 4.1B). To gain more insight into the roles of these differentially expressed proteins, we assigned functional annotations to each using gene ontology (GO) categorization (Table 4.1).

Figure 4.1. Global Proteomics: Differential analysis of sample groups. A. Volcano plot of differentiated proteins. Dots displayed in blue or red have a fold change of > 2 and a t-test p value of < 0.05. B. Heatmap based on the top 500 proteins that differentiate cells grown on FeS2 from Fe(II)/HS- mineral source. Biological replicates (columns) and proteins (rows) are arranged by hierarchical clustering using Euclidean distance and Ward clustering algorithms. Legend indicates fold change in protein abundance.

M. voltae cells grown with Fe(II)/HS- had greater abundance of proteins associated with amino acid, protein, and nucleic acid metabolism, ribosomal proteins, membrane transport, and cofactor biosynthesis relative to cells grown with FeS2. In contrast, proteins with potential roles in Fe and S metabolism were detected in higher abundance in the FeS2 samples, including metal uptake, trafficking, and storage proteins as well as transcriptional regulators and oxidoreductases.
It is important to note that 22% (111) of the regulated proteins lacked an assigned function, with 62% (69 proteins) of these in the FeS2 group, suggesting that uncharacterized proteins could serve a role in growth on FeS2. The next step was to investigate specific metabolic pathways to develop a deeper understanding of the physiological demands imposed by the different Fe and S sources. We first looked at core metabolic pathways including the tricarboxylic acid (TCA) cycle, sulfur metabolism, and the sulfur relay system as annotated in KEGG [28]. For the most part these either increased in Fe(II)/HS- or were unchanged (Figure 4.2, Table 4.2). The higher abundance of proteins associated with the TCA cycle, amino acid metabolism, and nucleic acid metabolism in the Fe(II)/HS- cultures suggests a growth-favored phenotype compared to FeS2. Consistent with this idea is the greater representation of proteins with ATP binding domains in Fe(II)/HS- grown cells compared to FeS2 grown cells (Figure 4.2). Proteins predicted to have a role in nitrogenase-like pathways and nitrogen storage had similar abundances except for two proteins with similarity to NifB and more specifically the IssA clade (Mvol_0693 and Mvol_0689) that were more abundant in the Fe(II)/HS- condition, and a nifH homolog that was significantly higher in the FeS2 condition. Importantly, M. voltae fails to grow diazotrophically and does not express nitrogenase.

Figure 4.2. GO Pathway Proteins: Overview of pathway specific changes with respect to culture conditions. Colored segments show the percent of intracellular proteins in a given pathway that were more abundant in the presence of FeS2 (green), Fe(II)/HS- (blue), and unchanged (yellow). Numerals on the bars show the actual number of proteins. Annotations were made from UniProt annotations, GO annotations, STRING and KEGG annotations.

Further investigation of these homologs found them to be similar to an IssA protein
related to Fe storage in intracellular thioferrate nanoparticles [14], [29]. Proteins associated with methanogenesis were compared to assess changes to the central energy metabolism of M. voltae [4], [26]. While there were not widespread differences, the FeS2 condition had significantly higher abundances of 11 of the 31 proteins (p < 0.05) specific to the pathway (Figure 4.2, Tables 4.1 and 4.2). The proteome also had methanogenesis marker proteins annotated from UniProt [27] that were not differentiated in the two conditions.

Chemical and functional analysis of the proteome

The biological reductive dissolution of FeS2 results in the production of aqueous FeS clusters, which are hypothesized to be the assimilation products to meet Fe and S demands of methanogen cells [30]. Rather than relying strictly on protein annotation, we reasoned that the physical and chemical properties could be informative, particularly due to the relatively high percentage of unassigned proteins. We first examined the amino acid content of proteins differentially expressed between the two growth conditions to look for enrichment of motifs involved in metal binding and FeS cluster coordination. Cells grown on FeS2 expressed more cysteine-rich proteins (>2.5% Cys) compared to cells grown with Fe(II)/HS- (Figure 4.3, Table S3). While we did not detect any significant difference in proteins enriched in acidic residues (Asp and Glu) that could potentially substitute for thiols in binding metal cations such as iron, we did see a striking number of highly basic proteins (>20% Lys, Arg, or His) that were more abundant in cells grown with Fe(II)/HS- (Figure 4.3B). Positively charged polypeptide regions are often found in proteins that bind negatively charged molecules such as nucleotides and nucleic acids. For example, ribosomal proteins, transcription factors, and translational machinery all bind DNA, RNA, and/or nucleotides such as ATP and GTP.
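The composition-based screen described above can be sketched as follows. This is an illustrative sketch, not the script used in this work: the function names are hypothetical, the toy sequences are invented, and the ">20% Lys, Arg, or His" criterion is assumed here to mean the combined fraction of the three residues.

```python
def residue_fraction(sequence, residues):
    """Fraction of positions in `sequence` occupied by any residue in `residues`."""
    sequence = sequence.upper()
    return sum(sequence.count(r) for r in residues) / len(sequence)


def composition_flags(sequence, cys_cutoff=0.025, basic_cutoff=0.20):
    """Flag a protein as cysteine-rich and/or highly basic.

    Cutoffs follow the thresholds quoted in the text (>2.5% Cys; >20% Lys,
    Arg, or His); treating K + R + H as one combined fraction is an assumption.
    """
    return {
        "cysteine_rich": residue_fraction(sequence, "C") > cys_cutoff,
        "highly_basic": residue_fraction(sequence, "KRH") > basic_cutoff,
        "acidic_fraction": residue_fraction(sequence, "DE"),
    }


if __name__ == "__main__":
    # Toy sequences, not M. voltae proteins.
    print(composition_flags("CKKRHAAAAA"))
    print(composition_flags("AAAAAAAAAA"))
```

Applied to every differentially abundant protein, flags of this kind give the groups that are circled on the volcano plots.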
Using Gene Ontology (GO) annotations, we examined proteins involved in ATP binding (GO:0005524; Figure 4.3C). This category had a large number of proteins that changed abundance, with many more found in the Fe(II)/HS- group. As a final group, we singled out those classified as structural constituents of the ribosome (GO:0003735). Proteins with this GO classification were overwhelmingly more highly expressed in cells grown on Fe(II)/HS- compared to those grown on FeS2. This is congruent with the observed profile of highly basic protein enrichment in cells grown on Fe(II)/HS-. Together, these observations suggest that M. voltae is able to dedicate more energy to growth, cell division, and motility when grown on Fe(II)/HS- compared to growth on FeS2. Additionally, the change in a large number of ribosome structural components suggests ribosome function is tuned differently in the two growth conditions.

Figure 4.3. Volcano plot of proteins from cells cultured with FeS2 or Fe(II)/HS-. Each spot represents a protein with the fold change (horizontal) and p-value (vertical) indicated by position in the graph. Circled spots highlight proteins annotated to contain specific sequence and/or chemical characteristics. A. ATP binding domains based on GO categories (GO:0005524). B. Proteins with high cysteine content. C. Structural components of the ribosome. D. Basic proteins, enriched in lysine, arginine, and histidine. Tabular data for the plots can be found in Supplemental Table 8.

Iron binding proteins

M. voltae cells grown with FeS2 as their sole source of Fe and S expressed a greater abundance of oxidoreductases (GO:0016491) and Fe-S binding proteins (GO:0051536) compared to cells grown on Fe(II)/HS-. This observation prompted further investigation into Fe-S binding proteins expressed on each growth condition.
Binding and transport of Fe is facilitated by specific cysteine-rich motifs, which we utilized as a search motif within the differentiated proteins predicted to interact with iron. A large number of proteins with iron-sulfur cluster binding motifs were differentially regulated. As an example, we looked at CX2CX2C ferredoxin motifs [31], for which there were 62 hits in the M. voltae proteome based on genome sequence analysis (Table S8). The proteomics data show that 27 of these were differentially regulated (Table S4). Under FeS2 growth conditions, 16 annotated proteins (mostly oxidoreductases) and 1 of unknown function were more abundant. The majority show high similarity to [4Fe-4S] coordinating ferredoxins from methanogens. Mvol_0976 was detected 69-fold higher in FeS2 conditions. This hypothetical protein is located downstream from a FeoB protein (Mvol_0975, up 14-fold in FeS2), and upstream from a FeoA protein (Mvol_0977, up 211-fold in FeS2). These observations indicate that Mvol_0976 is a functional protein ORF whose abundance increases more than that of the Fe transporter FeoB but much less than that of FeoA, a regulatory protein that modulates FeoB function, under FeS2 growth conditions [32]. The location of Mvol_0976 suggests that this protein is part of the Feo operon in M. voltae. A BLAST search [33] resulted in only one hit, suggesting that this protein is unique to M. voltae and may have a distinct role in iron acquisition when FeS2 is the only source of Fe and S. A different set of proteins with ferredoxin motifs is more abundant in the Fe(II)/HS- sample group. Overall, the differential regulation of ferredoxin-motif proteins under FeS2 vs. Fe(II)/HS- growth conditions likely reflects their importance in iron-sulfur cluster acquisition and utilization in M. voltae. Of particular interest are several differentially expressed proteins identified as members of the radical SAM superfamily [34]-[36]. Of the 36 ORFs in the M.
voltae genome that harbor characteristic radical SAM cysteine motifs, 19 were differentially regulated, with seven of these being more abundant under FeS2 growth conditions, and the other 12 being increased under Fe(II)/HS- growth conditions (Table S5). Among those more abundant under FeS2 growth conditions was a MiaB-like tRNA modifying enzyme (Mvol_1647). Interestingly, tRNA modification, such as that catalyzed by MiaB, is proposed to be part of a global regulatory mechanism in response to environmental stress conditions [37], [38]. This suggests that FeS2-based growth is a stress for M. voltae. Also increased is the anaerobic ribonucleotide reductase activating enzyme, an absolutely essential enzyme for nucleotide metabolism under anaerobic conditions [39]. Two other radical SAM proteins that increased with FeS2 (Mvol_0698 and Mvol_0696) have unassigned functions but are close together on the genome. Both of these radical SAM proteins have cysteine residues in addition to those required for catalysis that could bind additional iron-sulfur clusters. The genetic context of these two radical SAM proteins suggests they play roles in cofactor biosynthesis. Five radical SAM enzymes with no functional annotation were found in higher abundance under sulfide growth conditions. Mvol_0045 and Mvol_1681 are both surrounded by ORFs related to nucleotide metabolism. Mvol_1414 is a radical SAM enzyme that uses the less common CX5CX2C HmdB motif [40] and has homology to the hydrogenase maturation protein HydE [41]. Mvol_1348 shows similarity to the [FeFe]-hydrogenase maturation enzymes HydE and HydG, and nearby is an ORF for HypD, the Fe-only hydrogenase [31], suggesting that Mvol_1348 may be involved in cofactor biosynthesis for HypD. These proteomics data reveal that the radical SAM enzymes with significantly different abundance between FeS2 and Fe(II)/HS- growth conditions are involved in stress response, nucleotide metabolism, and cofactor biosynthesis.
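Cysteine motif searches of the kind used in this section, for example CX2CX2C and CX5CX2C, amount to pattern matching on the protein sequence. The following is a minimal sketch under stated assumptions: the helper names `motif_to_pattern` and `find_motif` are hypothetical, the toy sequences are invented, and only the simple "Xn" spacer notation is handled. It is not the search tool used in this work.

```python
import re


def motif_to_pattern(motif):
    """Translate spacer notation such as 'CX2CX2C' into a regex string.

    'X' followed by a number n means 'any n residues'; all other letters
    are matched literally.
    """
    return re.sub(r"X(\d+)", lambda m: ".{" + m.group(1) + "}", motif)


def find_motif(sequence, motif):
    """Return (1-based start position, matched segment) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile("(?=(" + motif_to_pattern(motif) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequences, not M. voltae proteins.
    print(find_motif("MACAACAACGG", "CX2CX2C"))
    print(find_motif("CAAAAACAAC", "CX5CX2C"))
```

Because each spacer accepts any residue, including cysteine, hits in cysteine-dense regions can overlap; the lookahead reports each of them separately.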
Two other proteins of specific interest that contribute to the differences between the two conditions were a DsrE domain containing protein (Mvol_0773) and DUF 2193 (Mvol_0354), both higher in the FeS2 condition. DsrE domains are involved in intracellular sulfur reduction and interact with desulfoferrodoxin ferrous iron binding proteins (Mvol_0775) [42]. These two oxidoreductases are next to each other on the genome and are both found in significantly greater amounts in the FeS2 condition. Interestingly, an uncharacterized protein (Mvol_0772) and a cell wall binding protein (Mvol_0771) were also detected in higher quantities in the FeS2 condition. We hypothesize that it is not by chance that these proteins are encoded next to each other and are found in large amounts when Fe and S are not readily available. In the Fe(II)/HS- condition, an ApbE-like protein (COG2122) was more abundant. Because this protein is essential for sulfide assimilation and has a role in the biosynthesis of cysteine and homocysteine, this increase could be expected [43]. Of particular note is DUF 2193. This protein of unknown function was more abundant in FeS2 samples and was found to be highly conserved in methanogens. A cluster of conserved CX2CX6DX2(H/C)X2C residues near the C termini of these proteins could act as a ligand for labile [Fe-S] cluster(s) coordination.

Conserved Proteins and Oxidoreductases

We anticipated that oxidoreductases could be essential for mobilizing FeS2 (which is composed of ferrous-persulfide units, {Fe(II)S2}0), as well as changing the oxidation states and speciation of Fe and S within the cell. In the M. voltae proteome, we identified 33 oxidoreductases (Table S6) that were not clearly associated with a well-defined metabolic pathway (for example, methanogenesis, hydrogenase chemistry, or cellular respiration). Of these, 21 were more abundant on FeS2, 18 in the intracellular fraction and 3 extracellularly.
Two of these proteins (Mvol_0775, Mvol_0776) are conserved in two other methanogen strains, M. maripaludis and Methanosarcina barkeri, now known to reduce FeS2 [13]. Mvol_0775 is annotated as a mononuclear iron-binding desulfoferrodoxin protein, with relatives involved in superoxide reduction to peroxide as part of cellular antioxidant defense [44], while Mvol_0776 is a carboxymuconolactone decarboxylase similar to peroxiredoxins. These enzymes are mainly distributed in anaerobic Archaea and Bacteria, including sulfate reducers [45]. Either of these enzymes could play a role in antioxidant defense, or possibly, in reduction of the persulfide unit in FeS2 or its derivatives. To place the differentiated proteins from M. voltae in perspective, we analyzed the data looking for proteins conserved across species. We searched a list of Fe-S proteins and motifs conserved across many species of Archaea and Bacteria involved in the uptake, trafficking, and storage of Fe and S. Eleven of the 41 proteins described in that work are annotated in M. voltae (Table S7). Of those 11, 10 were detected in this work, four of which were differentiated. Three were found in higher abundance in the FeS2 condition: two FeoA type proteins, and FeoB. A HemC type protein, Mvol_0134, which is the probable porphobilinogen deaminase (PBGD), was found in higher abundance in the Fe(II)/HS- condition. This probable PBGD protein is likely involved in the production of linear tetrapyrroles, which serve as precursors to a variety of cofactors including the methanogenesis-associated F430 [46]. Upregulation of this protein could indicate generally high metabolic activity, as described above.

Membrane Proteins

A critical step in uptake of extracellular material involves membrane transport. While the approach used here focused on soluble proteins, 55 membrane or membrane associated proteins were significantly differentiated between Fe(II)/HS- and FeS2 conditions (Table S8).
Membrane proteins were categorized by GO annotation from UniProt [47], Pfam [48], and PHYRE [23]. PHYRE was used when standard methods failed to assign a functional category. If PHYRE failed to yield results, PSORTb was utilized to predict cellular localization. Twenty-nine of the 55 regulated membrane proteins were in higher abundance in the FeS2 samples. The energy converting hydrogenase, Eha or (NiFe)-hydrogenase-3-type complex (Mvol_1594), is a multi-subunit membrane-bound protein that has been identified as an essential protein in methanogenesis by supplying electrons to anaplerotically reduce CO2 to formylmethanofuran [49]. Mvol_1594 was detected in higher abundance in the FeS2 grown cultures. As a known transmembrane protein and proton pump it may be important for Fe and S transport [50]. FeoB (Mvol_0975, discussed above) is involved in ferrous iron uptake and transport. Mvol_0781, a heavy metal translocating P-type ATPase, is another candidate for involvement in ferrous iron uptake; a homolog found in Pseudomonas aeruginosa was shown to have selective uptake of zinc and copper [51]. Two membrane associated transport proteins were found in higher abundance in the Fe(II)/HS- condition, Mvol_0749 and ecfA (Mvol_1619). These membrane proteins are involved in energy coupling transport: ecfA is part of an ABC-transporter complex and Mvol_0749 is a molybdenum ABC transporter. Other proteins, including the transcriptional regulator TrmB (Mvol_1582) and formate hydrogenlyase subunit 4-like protein (Mvol_1241), were also found in higher abundance in Fe(II)/HS-. TrmB is a control point for sugar metabolism [52] and formate hydrogenlyase is a central enzyme in anaerobic metabolism [53], contributing to the reasoning that M. voltae in the Fe(II)/HS- condition have phenotypic adaptations due to the perceived Fe and S availability.

Extracellular Proteins

In order to address the possibility that M.
voltae may excrete essential enzymes targeted at the FeS2 reductive dissolution process, we analyzed the media for proteins that could facilitate mineral reduction and metal transport. After carefully removing cells, acetone precipitation was used to collect proteins from the extracellular fraction. The same proteomics workflow was used as with the intracellular samples. A tiered approach was used to compare the FeS2 and Fe(II)/HS- extracellular proteomes. First, protein abundance was compared between intra- and extracellular fractions within each condition. We focused only on proteins that were highly enriched in the extracellular fraction (fc > 20, p < 0.05). This step was taken to eliminate proteins found extracellularly due to minor cell lysis rather than active excretion. The filtered lists from each condition were then compared.

Figure 4.4. Extracellular protein pools: A. Comparison of proteins present in the media under different growth conditions. Proteins unique to a condition were at least 20-fold more abundant in that condition. B. Functional categorization of extracellular proteins upregulated under I.) FeS2, II.) Fe(II)/HS-, and III.) both growth conditions. Upregulated proteins were functionally categorized according to their UniProt annotations and GO classifications. While the functional distribution of proteins from the sulfide and pyrite conditions was similar, proteins upregulated specifically by one or the other growth condition were predominantly in the pathway independent Fe/S binding proteins, energy metabolism, and element metabolism groups. (Pathway independent indicates that the proteins could not be clearly identified with a particular metabolic pathway or process.) Further division of each group into subgroups is given in Table S12.

This yielded 25 proteins that were highly abundant in both
Fe(II)/HS- and FeS2 extracellular fractions, 99 specific to FeS2 and 142 distinct to the Fe(II)/HS- condition (Figure 4.4, Table S9-S11). As a group, the proteins found in both conditions included a relatively high proportion of uncharacterized proteins (6 of 25), two membrane proteins (Mvol_0383 and Mvol_0341), and two transferase proteins (Mvol_0341 and Mvol 0341) (Table S11). Proteins enriched in only one condition were grouped by functional annotation for comparison. Overall, in the FeS2 extracellular condition, proteins involved in membrane transport and sulfur metabolism, oxidoreductases, and uncharacterized (putative) Fe-S binding proteins were in higher abundance (Figure 4.3B). The sulfide extracellular component had four radical SAM proteins, 45 uncharacterized proteins and 9 transport proteins in higher abundance (Figure 4.3C). A total of 84 proteins that were differentiated among the sample conditions and annotated as uncharacterized were investigated using PHYRE (Table S13). Some standouts in the extracellular FeS2 fraction were three radical SAM 4Fe-4S proteins (Mvol_0826, Mvol_1151 and Mvol_0698) and one 4Fe-4S ferredoxin iron-sulfur binding domain protein (Mvol_0878). The cysteine rich protein Mvol_1221 and the periplasmic copper binding protein (Mvol_0646) were also in high abundance in the FeS2 extracellular fraction.

Stressed Phenotype Analysis

We employed replicate variation analysis (RVA) as a tool to evaluate stress on the M. voltae system and to help identify target proteins that are under tight control by the cell. RVA utilizes the coefficient of variation (CV = standard deviation / mean) of every protein. The distribution of each treatment's CVs is plotted, and the KS d statistic, the median and the mean of each treatment group's CVs are assessed. The CV distributions of the FeS2 and the Fe(II)/HS- intracellular proteomes were significantly different (KS d = 0.27616, p<0.001) (Figure 4.5).
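The RVA quantities described above can be sketched in a few lines of Python. This is a minimal illustration and not the workflow used in this study: it assumes that CV is the sample standard deviation divided by the mean (expressed as a percentage) and that the KS d statistic is the usual two-sample maximum distance between empirical cumulative distributions. The function names and the replicate intensities are hypothetical.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation (%): sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


def ks_d(sample_a, sample_b):
    """Two-sample KS d: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a + b)):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def rva_summary(abundances):
    """abundances: {protein_id: [replicate intensities]} for one treatment."""
    cvs = [cv_percent(reps) for reps in abundances.values()]
    return {"cvs": cvs, "mean": mean(cvs), "median": median(cvs)}


# Hypothetical replicate intensities (illustrative only, not thesis data).
pyrite = {"P1": [100, 102, 98], "P2": [50, 51, 49], "P3": [10, 11, 9]}
sulfide = {"P1": [100, 130, 70], "P2": [50, 60, 40], "P3": [10, 14, 6]}

pyr, sul = rva_summary(pyrite), rva_summary(sulfide)
print(pyr["mean"], pyr["median"], sul["mean"], sul["median"])
print(ks_d(pyr["cvs"], sul["cvs"]))
```

In this toy input every CV in the first group is smaller than every CV in the second, so the two distributions do not overlap. A p-value for the KS statistic would come from a statistics package and is not computed here.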
The mean and median of the FeS2 cultures (mean = 18.5, median = 13.2) were also lower than those of the Fe(II)/HS- cells (mean = 25.8, median = 25.5).

Figure 4.5. RVA M. voltae: Distribution profile plot of proteomic CV of M. voltae grown with FeS2 (pyrite, blue) and the control Fe(II)/HS- group (sulfide, purple).

Previous work from our lab has shown a direct link between acute stress on a system and a decreased CV distribution, mean and median. Comparing the CV landscape of the proteome offers biological insight into the cell's protein regulation. Proteins that have a very small CV are likely under tight management by the cell, since averages over thousands of cells yield nearly identical amounts of that specific protein across replicates. The FeS2 grown cells had a smaller average CV than the Fe(II)/HS- grown cells. The t-test between the two culture conditions identified 509 differentially regulated proteins. Of the 224 proteins that were found in higher abundance in the FeS2 culture, only 5 had a CV > 30%, which is only 2.2% of the FeS2 differentiated proteins. In contrast, of the 285 proteins found in greater abundance in the Fe(II)/HS- condition, 56 had a CV > 30%, which is 19.6% (Table S14). This indicates that the proteins differentiated in the FeS2 condition were under strict cellular control, likely due to perceived Fe and S bioavailability. In order to identify additional proteins under tight control by the cell, we utilized fold change and CV. We first took the difference in CV between the stress and the control condition. We then multiplied the absolute value of the difference in CV by the fold change for each protein. We termed this new ranking statistic the Most Important Feature (MIF) ranking. By looking at the proteins with the largest MIF ranking values, we recovered many proteins that were previously identified as potentially important players in Fe acquisition and trafficking.
These included SufBD (Mvol_0653), desulfoferredoxin ferrous iron-binding protein (Mvol_0775), FeoA (Mvol_0619), cysteine-rich small domain protein (Mvol_1221) and the DrsE domain containing protein (Mvol_0773) (all discussed previously). Using RVA and a MIF ranking greater than 20 to analyze individual proteins and GO-annotated functional groups of proteins identified pathways that M. voltae was utilizing to manage the restricted access to Fe and S (Table 4.1). As expected, proteins with 4Fe-4S ferredoxin iron sulfur binding domains were found in significantly larger amounts in the FeS2 condition, and 16 ferredoxin proteins had a MIF ranking larger than 20 in FeS2. The FeoA/B proteins involved in Fe cluster assembly reflected the same narrative, with three of the five annotated proteins in our data having a MIF ranking larger than 20 in the FeS2 condition. Interestingly, cellular metabolism, methane metabolism and methanogenesis marker proteins also had a larger proportion of proteins with a smaller CV in the FeS2 than in the Fe(II)/HS- condition, but only 15 of 56 proteins in these groups had a MIF ranking greater than 20 and the significantly differentiated proteins were more equally distributed between the two conditions (discussed above). Not surprisingly, proteins indicative of stress were more abundant in the FeS2 condition and had very small CVs compared to the Fe(II)/HS- condition (Table 4.1, Table S15, S16). As mentioned, the MiaB protein (Mvol_1647) was detected in 2.3-fold higher amounts and had a MIF ranking of 54.8. The universal stress protein A (Mvol_0764) was 3.2-fold more abundant in the FeS2 condition, with a MIF ranking of 52.5. Hsp20 (Mvol_0638) was 3.3-fold more abundant in the FeS2 condition, with a MIF ranking of 38.5. MiaB contains two 4Fe-4S groups and is responsible for the methylthiolation of tRNA, which may act as a global regulator in response to different environmental conditions, although the mechanism is unknown [54].
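The MIF ranking defined above (the absolute difference in CV between the stress and control conditions, multiplied by the fold change) can be written as a short sketch. The example below is illustrative only. It assumes the fold change is a linear ratio rather than a log-transformed value, which the text does not specify, and the protein identifiers and numbers are hypothetical.

```python
def mif_rank(cv_stress, cv_control, fold_change):
    """MIF = |CV(stress) - CV(control)| * fold change (linear ratio assumed)."""
    return abs(cv_stress - cv_control) * fold_change


# Hypothetical records: (protein id, CV% stress, CV% control, fold change).
proteins = [
    ("protA", 5.0, 30.0, 2.0),   # |5 - 30| * 2.0 = 50.0
    ("protB", 12.0, 15.0, 3.0),  # |12 - 15| * 3.0 = 9.0
    ("protC", 8.0, 20.0, 2.5),   # |8 - 20| * 2.5 = 30.0
]

# Rank proteins from largest to smallest MIF value.
ranked = sorted(
    ((pid, mif_rank(s, c, fc)) for pid, s, c, fc in proteins),
    key=lambda item: item[1],
    reverse=True,
)

# Keep proteins above the cutoff of 20 used in the text.
selected = [pid for pid, score in ranked if score > 20]
print(ranked)
print(selected)
```

Because the statistic multiplies a variation term by an abundance term, a protein scores highly only when both its CV and its abundance differ between conditions.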
The universal stress protein, UspA, is part of the cell’s response to DNA damage and oxidative stress and can cause growth arrest, as reflected by the lower abundances of translation and transcription proteins in the FeS2 condition (Table 4.1) [55]. Hsp20 is part of the small heat shock protein family, whose members are induced when a cell is exposed to heat or environmental stress. It functions as a molecular chaperone to prevent protein aggregation [56]. Taken together, the global protein analysis indicated that the Fe(II)/HS- condition is a more hospitable growth environment for M. voltae. Examining GO-annotated groups of proteins using RVA, MIF ranking and differential protein analysis gave results consistent with this. The FeS2 growth condition resulted in 881 detected proteins with a smaller CV than in the Fe(II)/HS- supplemented media. Comparatively, the Fe(II)/HS- condition had 385 proteins with a smaller CV (Table 4.1). These groups of proteins with a large MIF ranking and a smaller CV in the Fe(II)/HS- condition were likely less controlled in the FeS2 condition, indicating that these groups of proteins are not integral to Fe and S acquisition or CSR (Table 4.1, S15, S17).

Table 4.1. Summary of CV and MIF rankings of all GO annotated protein groups.

Protein group                                    FeS2   Fe(II)/HS-   Notes
Proteins with smaller CV than other condition    881    385
Proteins with VA >20                             277    362
Fe and S trafficking                             8      3            FeS2: FeoA/B, cysteine rich domain proteins. Fe(II)/HS-: Fe-S assembly protein, SirA (sulfur relay)
Ferredoxins                                      16     6
Oxidoreductases                                  10     4            FeS2: desulfoferredoxin ferrous iron-binding, DrsE, rubrerythrin. Fe(II)/HS-: flavin reductase
Uncharacterized proteins                         58     57
r-SAM                                            5      8
CRISPR                                           7      1            FeS2: adaptive fitness
Transcription/translation                        8      56
Membrane proteins                                7      4
Glycolysis/gluconeogenesis                       3      0
ABC transporters                                 2      6
Chemotaxis                                       2      3
Flagella                                         0      5
Stress proteins                                  22     14           FeS2: USP, phage shock protein, Hsp20, oxidative stress, DNA damage. Fe(II)/HS-: UPR, DNA damage

Translation and transcription cell machinery proteins were more abundant in the Fe(II)/HS- cells compared to the FeS2 cells. These groups had 56 proteins with a MIF ranking larger than 20 in the Fe(II)/HS- condition. ATP binding cassette (ABC) transporter proteins were also more abundant in the Fe(II)/HS- cultures, with 6 of the 15 proteins having a MIF ranking greater than 20. ABC transporter proteins are membrane-bound and play a critical part in substrate uptake, export and osmoregulation. They enable the archaea to scavenge substrates efficiently and have been shown to play a role in nitrate respiration [57]. Another mechanism that allows for effective substrate localization and uptake is the chemotaxis machinery. Chemoreceptors, coupled to the motility functions of the cell, can assist the cell in finding vital nutrients [58]. More significant regulation of chemotaxis proteins was detected in the Fe(II)/HS- growth condition, which is unlikely to be a coincidence when combined with the ABC protein differentiation. These proteins, which were more abundant in the Fe(II)/HS- condition, also had larger MIF values in the Fe(II)/HS- condition compared to FeS2. RVA expanded the original aim of this analysis: we were able to add a phenotypic dimension to our Fe and S trafficking investigation.

Discussion

This comparative shotgun proteomics analysis of M. voltae, grown in the presence of two fundamentally different sources of Fe and S, generated a deep view into the proteome of this organism. With 1,658 predicted protein-coding genes, this organism has less than half of the genes in a typical strain of E. coli. As one might expect for an organism with a small genome, a high percentage of the proteome would be translated under any given circumstance. In this case, 1269, or 77%, of the predicted ORFs were detected. Even for an archaeon, this is impressive coverage of protein space and speaks to the depth of the differential analysis.
The first noteworthy clue that the availability and form of Fe and S are critical to this organism is that 509 (40%) of the proteins captured had a significantly different abundance (FC > 2, corrected p value < 0.05) between sample groups. This is an unusually dramatic change for an archaeal species when compared with other environmental pressures such as viral infection and acute oxidative stress [59], [60]. While the shift in the proteome was clear and dramatic, as shown in the heat map (Figure 2), the response was balanced, with 285 proteins increased in the Fe(II)/HS- cultures and 224 increased in the presence of FeS2. Such a balanced response suggests that M. voltae has specialized pathways optimized for each condition. A global perspective suggested that a growth phenotype was adopted in the presence of Fe(II)/HS-. This conclusion arose from differentiation of proteins associated with amino acid, protein and nucleic acid metabolism. There were also increases in proteins related to respiration, membrane transport, cofactor biosynthesis, and TCA intermediates. Given that the Fe(II)/HS- condition is used because it facilitates growth in culture, these findings come as no surprise. Proteins that potentially have roles in Fe and S metabolism of minerals were detected in higher abundance in the FeS2 samples, including metal uptake, iron-sulfur trafficking and storage proteins, transcriptional regulators, and oxidoreductases. Analysis of methanogenesis pathway proteins showed a significant (FC > 2, p < 0.05) change in 11 of 31 proteins (Figure 4.2). While these changes were significant for several proteins, we found that overall protein expression of complete enzyme complexes was similar. The relatively modest change in methanogenesis-related proteins suggests little change in this cellular process between the two growth conditions. This work began with the hypothesis that a specialized set of proteins would be required for M.
voltae to assimilate Fe and S by reducing FeS2. Metal binding activity would be requisite in such a protein pool; therefore, we queried the data for proteins with the cysteine rich ferredoxin motif CX2CX2C. We detected 27 of the 62 CX2CX2C domain containing proteins in the genome at differential abundances dependent on the growth condition. Of the regulated proteins, 16 were increased in the presence of FeS2. Proteins with predicted involvement in metal binding and transport were of primary interest, such as the FeoAB pair (Mvol 0977 and 0975) and the associated transcription factor DtxR (Mvol_0620). DtxR is a transcriptional regulator involved in maintaining transition metal homeostasis [61]-[63]. When Fe(II) is low in abundance and unavailable to DtxR, this protein binds the promoter of FeoAB and induces expression. When DtxR binds Fe(II), it suppresses its own expression and that of FeoAB. The increased expression of DtxR and FeoAB could imply that the cells sense Fe(II) limitation when grown with FeS2 (as discussed in [14]). Another protein of interest was DUF 2193. This protein of unknown function has metal binding motifs, was more abundant in FeS2 samples, and was found to be highly conserved in methanogens. This protein is currently being investigated for its role in binding different forms of Fe and S. Oxidoreductases could also be playing a direct role in the assimilation process by reducing FeS2, or could be involved in the transfer processes. Thirty-three oxidoreductases without a clearly defined metabolic role were identified, with 21 more abundant in the FeS2 samples (Table S6). Due to this differential abundance in the FeS2 condition and the 4Fe-4S binding site, these oxidoreductases, three of which were enriched in the extracellular fractions, and DUF 2193 are all interesting targets for further investigation. Thirteen radical SAM enzymes were significantly different between conditions.
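The CX2CX2C motif query mentioned above can be illustrated with a regular expression. This sketch assumes the conventional reading of the notation, in which X2 stands for any two residues, and it is not the search tool used in this study. The sequences and identifiers are hypothetical and are not M. voltae proteins.

```python
import re

# CX2CX2C: cysteine, any two residues, cysteine, any two residues, cysteine.
MOTIF = re.compile(r"C.{2}C.{2}C")


def has_cx2cx2c(sequence):
    """Return True if the one-letter protein sequence contains the motif."""
    return MOTIF.search(sequence.upper()) is not None


# Hypothetical sequences for illustration only.
sequences = {
    "seq_with_motif": "MKCAACGGCPL",    # C-AA-C-GG-C matches
    "seq_wrong_spacing": "MKCAAACGGC",  # three residues between the first two C
    "seq_no_cysteine": "MKLAVGGSPL",
}

hits = [name for name, seq in sequences.items() if has_cx2cx2c(seq)]
print(hits)
```

A pattern match only flags candidate sequences. Whether a flagged protein actually binds a metal cluster still depends on annotation or experimental evidence.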
Among those significantly increased in FeS2 growth conditions was a MiaB-like tRNA modifying enzyme. Interestingly, the tRNA modification catalyzed by MiaB is proposed to be part of a global regulatory mechanism in response to environmental stress conditions [37], [38]. This is consistent with the suppression of proteins in growth associated pathways in the FeS2 condition. As a final point, the role of differentially regulated radical SAM proteins in the extracellular fractions remains to be elucidated. It remains unclear why such a large number of proteins annotated as cytoplasmic were detected in the extracellular fraction. We enforced strict criteria when analyzing global changes in the extracellular proteome: protein abundance had to be greater in the extracellular than in the intracellular fraction, which would rule out the obvious explanation of cell lysis during culture and/or handling. Archaea are known to secrete large numbers of extracellular vesicles [64]. It is intriguing to postulate that this process could be used to deliver proteins extracellularly for uptake and transport of Fe and S. Importantly, recent results have shown that cells appear to require direct access to FeS2 in order to catalyze its reduction [14]. At this time, it is not clear if M. voltae actively secretes proteins to facilitate reductive dissolution of FeS2 and/or acquisition of soluble Fe/S species. Regardless, proteins such as oxidoreductases, Fe-S binding proteins, and radical SAM enzymes could have roles in the multi-step process that begins with FeS2 reduction and continues with the generation of soluble FeSaq molecular clusters (Fe150S150) and the trafficking of these species into the cell. These could then be stored in the cell by other classes of proteins using either cysteine rich motifs or noncovalent interactions, such as has been observed for the IssA protein in Pyrococcus furiosus [27].
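The extracellular enrichment criteria discussed above (fc > 20 and p < 0.05 for extracellular versus intracellular abundance, followed by a comparison of the filtered lists) can be sketched as a set operation. This is a simplified illustration under stated assumptions rather than the analysis pipeline used here: it treats fold change as a linear extracellular-to-intracellular ratio, and the protein identifiers and values are hypothetical.

```python
def enriched(records, min_fc=20.0, max_p=0.05):
    """Proteins whose extracellular/intracellular fold change and p-value pass."""
    return {pid for pid, (fc, p) in records.items() if fc > min_fc and p < max_p}


# Hypothetical {protein id: (fold change, p-value)} tables for each condition.
fes2 = {"a": (35.0, 0.01), "b": (25.0, 0.20), "c": (50.0, 0.001), "d": (3.0, 0.01)}
sulfide = {"a": (22.0, 0.03), "c": (5.0, 0.01), "e": (40.0, 0.002)}

# Tier 1: filter within each condition.
fes2_enriched = enriched(fes2)
sulfide_enriched = enriched(sulfide)

# Tier 2: compare the filtered lists.
shared = fes2_enriched & sulfide_enriched
only_fes2 = fes2_enriched - sulfide_enriched
only_sulfide = sulfide_enriched - fes2_enriched
print(sorted(shared), sorted(only_fes2), sorted(only_sulfide))
```

Filtering within each condition first means a protein is only compared across conditions after it has passed the enrichment threshold that guards against lysis artifacts.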
Even though we did not specifically target integral membrane proteins, the increased abundance of FeoAB and other annotated transporters in the FeS2 condition supports our hypothesis that M. voltae has specialized protein machinery facilitating the reductive dissolution and subsequent assimilation of Fe and S directly from FeS2.

By combining traditional proteomics analysis with RVA, we were able to elucidate the Fe acquisition machinery being deployed and to detect the cellular stress that the apparent lack of Fe and S availability imposes on the cell. The obvious targets for Fe acquisition, including ferredoxins, oxidoreductases, and the FeoA Fe-cluster assembly machinery, were all detected in higher abundance in the FeS2 cells, but also appeared to be under strict regulation by the cell, as indicated by the low CV in these groups of proteins (Table 4.1). Work from our collaborators showed that M. voltae grown on FeS2, which is believed to be a less bioavailable source of Fe and S, showed signs of a stressed phenotype. Analysis using transmission electron microscopy (TEM) showed that FeS2 grown cells were smaller than the Fe(II)/HS- grown cells [14]. While all of the cells were roughly the same irregular, elongated coccoid shape, the FeS2 cells were nearly 33% smaller than the Fe(II)/HS- cells. This work suggested that the change in size was due to a perceived lack of Fe availability rather than a true limitation, as both growth conditions produced HS- accumulation in the media, indicating that the cells had a rich supply of S. Utilizing RVA reinforced a stressed phenotype, as the CV profile of FeS2 was significantly smaller than the Fe(II)/HS- CV profile. M. voltae employed stress management through mechanisms like the heat shock response, the unfolded protein response and the universal stress response [65]. Proteins from all of these categories were detected in larger amounts in the FeS2 condition and had smaller CVs and a large MIF ranking. Another tool M.
voltae might have employed to mitigate stress was to quiet some cellular processes. The Fe(II)/HS- condition had significantly more abundant proteins involved with transcription, translation, ABC transport and chemotaxis, with more proteins having a smaller CV and larger MIF ranking in the Fe(II)/HS- compared to the FeS2 condition.

This study establishes a foundation on which to build the pathways and proteins responsible for Fe and S acquisition in the model methanogen M. voltae. The observation that M. voltae can utilize FeS2 as the sole source of Fe and S provides unique opportunities to understand the mechanisms employed by this organism, as well as other class I and class II methanogens, in acquiring, trafficking, and storing these elements. Through the use of variation analysis, we were also able to help clarify the small phenotype that M. voltae adopted when grown on FeS2. Separating the changes in biology that reflect assimilation strategies driven by Fe and S bioavailability from those that reflect perceived stress is an important distinction that adds dimension to proposed Fe and S acquisition schemes. Knowledge along these lines will undoubtedly open up new avenues of Fe and S biochemistry and yield valuable industrial and biotechnological insights with applications related to metal sequestration and processing.

References Cited

[1] K. Brzóska, S. Męczyńska, and M. Kruszewski, “Iron-sulfur cluster proteins: electron transfer and beyond,” 2006. Accessed: Feb. 26, 2021. [Online]. Available: www.actabp.pl.
[2] D. R. Martin and D. V. Matyushov, “Electron-transfer chain in respiratory complex I,” Sci. Rep., vol. 7, no. 1, pp. 1–11, Dec. 2017, doi: 10.1038/s41598-017-05779-y.
[3] B. A. Berghuis, F. B. Yu, F. Schulz, P. C. Blainey, T. Woyke, and S. R. Quake, “Hydrogenotrophic methanogenesis in archaeal phylum Verstraetearchaeota reveals the shared ancestry of all methanogens,” Proc. Natl. Acad. Sci. U. S. A., vol. 116, no. 11, pp. 5037–5044, Mar.
2019, doi: 10.1073/pnas.1815631116.
[4] Y. Liu, L. L. Beer, and W. B. Whitman, “Methanogens: A window into ancient sulfur metabolism,” Trends in Microbiology, vol. 20, no. 5. Trends Microbiol, pp. 251–258, May 2012, doi: 10.1016/j.tim.2012.02.002.
[5] H. Beinert, R. H. Holm, and E. Münck, “Iron-sulfur clusters: Nature’s modular, multipurpose structures,” Science, vol. 277, no. 5326, pp. 653–659, Aug. 1997, doi: 10.1126/science.277.5326.653.
[6] R. K. Thauer, A. K. Kaster, H. Seedorf, W. Buckel, and R. Hedderich, “Methanogenic archaea: Ecologically relevant differences in energy conservation,” Nature Reviews Microbiology, vol. 6, no. 8. Nature Publishing Group, pp. 579–591, Aug. 30, 2008, doi: 10.1038/nrmicro1931.
[7] M. Fontecave and S. Ollagnier-de-Choudens, “Iron-sulfur cluster biosynthesis in bacteria: Mechanisms of cluster assembly and transfer,” Arch. Biochem. Biophys., vol. 474, no. 2, pp. 226–237, Jun. 2008, doi: 10.1016/j.abb.2007.12.014.
[8] A. Mahadevan and S. Fernando, “Inorganic iron-sulfur clusters enhance electron transport when used for wiring the NAD-glucose dehydrogenase based redox system,” Microchim. Acta, vol. 185, no. 7, pp. 1–8, Jul. 2018, doi: 10.1007/s00604-018-2871-x.
[9] J. W. Peters et al., “[FeFe]- and [NiFe]-hydrogenase diversity, mechanism, and maturation,” Biochimica et Biophysica Acta - Molecular Cell Research, vol. 1853, no. 6. Elsevier, pp. 1350–1369, Jun. 01, 2015, doi: 10.1016/j.bbamcr.2014.11.021.
[10] E. S. Boyd, M. J. Amenabar, S. Poudel, and A. S. Templeton, “Bioenergetic constraints on the origin of autotrophic metabolism,” Philos. Trans. R. Soc. A Math. Phys. Eng. Sci., vol. 378, no. 2165, Feb. 2020, doi: 10.1098/rsta.2019.0151.
[11] Y. Liu, M. Sieprawska-Lupa, W. B. Whitman, and R. H. White, “Cysteine is not the sulfur source for iron-sulfur cluster and methionine biosynthesis in the methanogenic archaeon Methanococcus maripaludis,” J. Biol. Chem., vol. 285, no. 42, pp. 31923–31929, Oct.
2010, doi: 10.1074/jbc.M110.152447.
[12] E. S. Boyd, K. M. Thomas, Y. Dai, J. M. Boyd, and F. W. Outten, “Interplay between Oxygen and Fe-S Cluster Biogenesis: Insights from the Suf Pathway,” Biochemistry, vol. 53, no. 37. American Chemical Society, pp. 5834–5847, Sep. 23, 2014, doi: 10.1021/bi500488r.
[13] E. S. Payne, D., Spietz, R.L., Boyd, “Reductive Dissolution of pyrite by methanogenic archaea,” J. Bacteriol.
[14] E. S. B. Devon Payne, Eric M. Shepard Rachel L. Spietz, Katherine F. Steward, Sue Brunfield, Mark Young, Brian Bothner, William E. Broderick, Joan B. Broderick, “Reductive dissolution of pyrite II: Examining pathways of iron and sulfur acquisition, trafficking, deployment, and storage in mineral-grown methanogen cells,” J. Bacteriol.
[15] E. S. B. Rachel L. Spietz, Devon R. Payne, Will Kincanonn, Gargi Kulkarni, William W. Metcalf, Jennifer L. DuBois, “Reductive dissolution of pyrite I: Probing abiotic and biotic mechanisms of mineral reduction,” J. Bacteriol.
[16] W. B. Whitman, E. Ankwanda, and R. S. Wolfe, “Nutrition and carbon metabolism of Methanococcus voltae,” J. Bacteriol., vol. 149, no. 3, pp. 852–863, 1982, doi: 10.1128/jb.149.3.852-863.1982.
[17] S. Gessulat et al., “Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning,” Nat. Methods, vol. 16, no. 6, pp. 509–518, Jun. 2019, doi: 10.1038/s41592-019-0426-7.
[18] B. C. Searle et al., “Generating high-quality libraries for DIA-MS with empirically-corrected peptide predictions,” bioRxiv. bioRxiv, p. 682245, Jun. 27, 2019, doi: 10.1101/682245.
[19] “Semi-supervised learning for peptide identification from shotgun proteomics datasets.” https://noble.gs.washington.edu/proj/percolator/ (accessed Feb. 27, 2021).
[20] L. Käll, J. D. Storey, M. J. MacCoss, and W. S. Noble, “Assigning significance to peptides identified by tandem mass spectrometry using decoy databases,” Journal of Proteome Research, vol. 7, no. 1. American Chemical Society, pp. 29–34, Jan.
2008, doi: 10.1021/pr700600n.
[21] L. Käll, J. D. Storey, and W. S. Noble, “Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry,” in Bioinformatics, Aug. 2008, vol. 24, no. 16, p. i42, doi: 10.1093/bioinformatics/btn294.
[22] J. Xia, I. V. Sinelnikov, B. Han, and D. S. Wishart, “MetaboAnalyst 3.0—making metabolomics more meaningful,” Nucleic Acids Res., vol. 43, no. W1, pp. W251–W257, Jul. 2015, doi: 10.1093/nar/gkv380.
[23] L. A. Kelley, S. Mezulis, C. M. Yates, M. N. Wass, and M. J. E. Sternberg, “The Phyre2 web portal for protein modeling, prediction and analysis,” Nat. Protoc., vol. 10, no. 6, pp. 845–858, Jun. 2015, doi: 10.1038/nprot.2015.053.
[24] N. Y. Yu et al., “PSORTb 3.0: Improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes,” Bioinformatics, vol. 26, no. 13, pp. 1608–1615, May 2010, doi: 10.1093/bioinformatics/btq249.
[25] “Methanococcus voltae (ID 749) - Genome - NCBI.” https://www.ncbi.nlm.nih.gov/genome?term=txid2188%5Borgn%5D (accessed Feb. 27, 2021).
[26] K. Mosbahi, M. Wojnowska, A. Albalat, and D. Walker, “Bacterial iron acquisition mediated by outer membrane translocation and cleavage of a host protein,” Proc. Natl. Acad. Sci. U. S. A., vol. 115, no. 26, pp. 6840–6845, Jun. 2018, doi: 10.1073/pnas.1800672115.
[27] B. J. Vaccaro et al., “Biological iron-sulfur storage in a thioferrate-protein nanoparticle,” Nat. Commun., vol. 8, no. 1, pp. 1–9, Jul. 2017, doi: 10.1038/ncomms16110.
[28] M. O. Dayhoff, R. V. Eck, and C. M. Park, “A model of evolutionary change in proteins.” National Biomedical Research Foundation, pp. 89–100, 1972.
[29] T. Miyata, S. Miyazawa, and T. Yasunaga, “Two types of amino acid substitutions in protein evolution,” J. Mol. Evol., vol. 12, no. 3, pp. 219–236, Mar. 1979, doi: 10.1007/BF01732340.
[30] D. Schneider and C. L.
Schmidt, “Multiple Rieske proteins in prokaryotes: Where and why?,” Biochimica et Biophysica Acta - Bioenergetics, vol. 1710, no. 1. Elsevier, pp. 1–12, Nov. 15, 2005, doi: 10.1016/j.bbabio.2005.09.003.
[31] M. Blokesch, S. P. J. Albracht, B. F. Matzanke, N. M. Drapal, A. Jacobi, and A. Böck, “The complex between hydrogenase-maturation proteins HypC and HypD is an intermediate in the supply of cyanide to the active site iron of [NiFe]-hydrogenases,” J. Mol. Biol., vol. 344, no. 1, pp. 155–167, Nov. 2004, doi: 10.1016/j.jmb.2004.09.040.
[32] C. K. Y. Lau, K. D. Krewulak, and H. J. Vogel, “Bacterial ferrous iron transport: The Feo system,” FEMS Microbiol. Rev., vol. 40, no. 2, pp. 273–298, Jan. 2016, doi: 10.1093/femsre/fuv049.
[33] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” J. Mol. Biol., vol. 215, no. 3, pp. 403–410, 1990, doi: 10.1016/S0022-2836(05)80360-2.
[34] G. L. Holliday et al., “Atlas of the Radical SAM Superfamily: Divergent Evolution of Function Using a ‘Plug and Play’ Domain,” in Methods in Enzymology, vol. 606, Academic Press Inc., 2018, pp. 1–71.
[35] J. B. Broderick, B. R. Duffus, K. S. Duschene, and E. M. Shepard, “Radical S-adenosylmethionine enzymes,” Chemical Reviews, vol. 114, no. 8. American Chemical Society, pp. 4229–4317, Apr. 23, 2014, doi: 10.1021/cr4004709.
[36] P. A. Frey, A. D. Hegeman, and F. J. Ruzicka, “The radical SAM superfamily,” Critical Reviews in Biochemistry and Molecular Biology, vol. 43, no. 1. Crit Rev Biochem Mol Biol, pp. 63–88, Jan. 2008, doi: 10.1080/10409230701829169.
[37] C. Ranquet, S. Ollagnier-de-Choudens, L. Loiseau, F. Barras, and M. Fontecave, “Cobalt stress in Escherichia coli: The effect on the iron-sulfur proteins,” J. Biol. Chem., vol. 282, no. 42, pp. 30442–30451, Oct. 2007, doi: 10.1074/jbc.M702519200.
[38] R. G. Björk and T. Rasmuson, Modification and Editing of RNA. Washington, DC: ASM Press, 1998.
[39] P. Nordlund and P.
Reichard, “Ribonucleotide Reductases,” Annu. Rev. Biochem., vol. 75, no. 1, pp. 681–706, Jun. 2006, doi: 10.1146/annurev.biochem.75.103004.142443.
[40] S. E. McGlynn et al., “Identification and characterization of a novel member of the radical AdoMet enzyme superfamily and implications for the biosynthesis of the Hmd hydrogenase active site cofactor,” Journal of Bacteriology, vol. 192, no. 2. American Society for Microbiology Journals, pp. 595–598, Jan. 15, 2010, doi: 10.1128/JB.01125-09.
[41] Y. Nicolet et al., “X-ray structure of the [FeFe]-hydrogenase maturase HydE from Thermotoga maritima,” J. Biol. Chem., vol. 283, no. 27, pp. 18861–18872, Jul. 2008, doi: 10.1074/jbc.M801161200.
[42] A. S. Pott and C. Dahl, “Sirohaem sulfite reductase and other proteins encoded by genes at the dsr locus of Chromatium vinosum are involved in the oxidation of intracellular sulfur,” Microbiology, vol. 144, no. 7, pp. 1881–1894, 1998, doi: 10.1099/00221287-144-7-1881.
[43] B. J. Rauch and J. J. Perona, “Efficient sulfide assimilation in Methanosarcina acetivorans is mediated by the MA1715 protein,” J. Bacteriol., vol. 198, no. 14, pp. 1974–1983, Jul. 2016, doi: 10.1128/JB.00141-16.
[44] V. Adam, A. Royant, V. Nivière, F. P. Molina-Heredia, and D. Bourgeois, “Structure of superoxide reductase bound to ferrocyanide and active site expansion upon X-ray-induced photo-reduction,” Structure, vol. 12, no. 9, pp. 1729–1740, Sep. 2004, doi: 10.1016/j.str.2004.07.013.
[45] S. Lee et al., “A 1-Cys Peroxiredoxin from a Thermophilic Archaeon Moonlights as a Molecular Chaperone to Protect Protein and DNA against Stress-Induced Damage,” PLoS One, vol. 10, no. 5, p. e0125325, May 2015, doi: 10.1371/journal.pone.0125325.
[46] S. Storbeck, S. Rolfes, E. Raux-Deery, M. J. Warren, D. Jahn, and G. Layer, “A novel pathway for the biosynthesis of heme in Archaea: genome-based bioinformatic predictions and experimental evidence,” Archaea, vol. 2010, p. 175050, 2010, doi: 10.1155/2010/175050.
[47] A. Bateman et al., “UniProt: The universal protein knowledgebase,” Nucleic Acids Res., vol. 45, no. D1, pp. D158–D169, Jan. 2017, doi: 10.1093/nar/gkw1099.
[48] R. D. Finn et al., “Pfam: The protein families database,” Nucleic Acids Research, vol. 42, no. D1. Oxford University Press, p. D222, Jan. 01, 2014, doi: 10.1093/nar/gkt1223.
[49] T. J. Lie, K. C. Costa, B. Lupa, S. Korpole, W. B. Whitman, and J. A. Leigh, “Essential anaplerotic role for the energy-converting hydrogenase Eha in hydrogenotrophic methanogenesis,” Proc. Natl. Acad. Sci. U. S. A., vol. 109, no. 38, pp. 15473–15478, Sep. 2012, doi: 10.1073/pnas.1208779109.
[50] S. P. Gilmore et al., “Genomic analysis of methanogenic archaea reveals a shift towards energy conservation,” BMC Genomics, vol. 18, no. 1, Aug. 2017, doi: 10.1186/s12864-017-4036-4.
[51] O. Lewinson, A. T. Lee, and D. C. Rees, “A P-type ATPase importer that discriminates between essential and toxic transition metals,” Proc. Natl. Acad. Sci. U. S. A., vol. 106, no. 12, pp. 4677–4682, Mar. 2009, doi: 10.1073/pnas.0900666106.
[52] A. Gindner, W. Hausner, and M. Thomm, “The TrmB family: a versatile group of transcriptional regulators in Archaea,” Extremophiles, vol. 18, no. 5. Springer-Verlag Tokyo, pp. 925–936, Sep. 01, 2014, doi: 10.1007/s00792-014-0677-2.
[53] J. S. McDowall, B. J. Murphy, M. Haumann, T. Palmer, F. A. Armstrong, and F. Sargent, “Bacterial formate hydrogenlyase complex,” Proc. Natl. Acad. Sci. U. S. A., vol. 111, no. 38, pp. E3948–E3956, Sep. 2014, doi: 10.1073/pnas.1407927111.
[54] H. L. Hernández et al., “MiaB, a Bifunctional Radical-S-Adenosylmethionine Enzyme Involved in the Thiolation and Methylation of tRNA, Contains Two Essential [4Fe-4S] Clusters,” 2007, doi: 10.1021/bi7000449.
[55] K. Kvint, L. Nachin, A. Diez, and T. Nyström, “The bacterial universal stress protein: Function and regulation,” Current Opinion in Microbiology, vol. 6, no. 2. Elsevier Ltd, pp.
140–145, 2003, doi: 10.1016/S1369-5274(03)00025-0.
[56] A. Bepperling et al., “Alternative bacterial two-component small heat shock protein systems,” Proc. Natl. Acad. Sci. U. S. A., vol. 109, no. 50, pp. 20407–20412, Dec. 2012, doi: 10.1073/pnas.1209565109.
[57] S. V. Albers, S. M. Koning, W. N. Konings, and A. J. M. Driessen, “Insights into ABC Transport in Archaea,” Journal of Bioenergetics and Biomembranes, vol. 36, no. 1. J Bioenerg Biomembr, pp. 5–15, Feb. 2004, doi: 10.1023/B:JOBB.0000019593.84933.e6.
[58] A. Briegel, D. R. Ortega, A. N. Huang, C. M. Oikonomou, R. P. Gunsalus, and G. J. Jensen, “Structural conservation of chemotaxis machinery across Archaea and Bacteria,” Environ. Microbiol. Rep., vol. 7, no. 3, pp. 414–419, Jun. 2015, doi: 10.1111/1758-2229.12265.
[59] W. S. Maaty et al., “Something old, something new, something borrowed; how the thermoacidophilic archaeon Sulfolobus solfataricus responds to oxidative stress,” PLoS One, vol. 4, no. 9, Sep. 2009, doi: 10.1371/journal.pone.0006964.
[60] W. S. Maaty et al., “Proteomic analysis of Sulfolobus solfataricus during Sulfolobus turreted icosahedral virus infection,” J. Proteome Res., vol. 11, no. 2, pp. 1420–1432, Feb. 2012, doi: 10.1021/pr201087v.
[61] E. Guedon and J. D. Helmann, “Origins of metal ion selectivity in the DtxR/MntR family of metalloregulators,” Mol. Microbiol., vol. 48, no. 2, pp. 495–506, Apr. 2003, doi: 10.1046/j.1365-2958.2003.03445.x.
[62] Y. Zhu, S. Kumar, A. L. Menon, R. A. Scott, and M. W. W. Adams, “Regulation of iron metabolism by Pyrococcus furiosus,” J. Bacteriol., vol. 195, no. 10, pp. 2400–2407, May 2013, doi: 10.1128/JB.02280-12.
[63] K. Hantke, “Iron and metal regulation in bacteria,” Current Opinion in Microbiology, vol. 4, no. 2. Elsevier Ltd, pp. 172–177, 2001, doi: 10.1016/S1369-5274(00)00184-3.
[64] B. L. Deatherage and B. T.
Cooksona, “Membrane vesicle release in bacteria, eukaryotes, and archaea: A conserved yet underappreciated aspect of microbial life,” Infection 128 and Immunity, vol. 80, no. 6. American Society for Microbiology (ASM), pp. 1948–1957, Jun. 2012, doi: 10.1128/IAI.06014-11. [65] A. J. L. Macario, M. Lange, B. K. Ahring, and E. C. De Macario, “Stress Genes and Proteins in the Archaea,” Microbiol. Mol. Biol. Rev., vol. 63, no. 4, pp. 923–967, Dec. 1999, doi: 10.1128/mmbr.63.4.923-967.1999. 129 CHAPTER FIVE CONCLUDING REMARKS Throughout this work, a clear connection between the biological variation within an experimental group and a stressed phenotype was developed. Cellular response to stress is a multifaceted and complex process. Because of this, defining and characterizing a stressed phenotype from omics data is not always straight-forward. Stress mechanisms are often activated at the onset of even minor perturbations. This can create complications when working with experimental conditions that are designed to test a hypothesis that does not include cellular stress response. The ability to evaluate multiple pathways such as stress response and central carbon metabolism simultaneously, is one major advantage of omics approaches. Metabolomics is a highly sensitive technique that enables the detection of multiple and/or small metabolic changes in a system. By exploiting this sensitivity and using a novel approach for analyzing statistical variation within the data, a distinct pattern was revealed; metabolic variation between organisms decreases in the presence of external stress. This work began with a simple question about non-canonical amino acids used as a protein tag and their potential impact on an organism when they are utilized. A traditional metabolomics workflow demonstrated that the addition of the amino acid methionine and the non-canonical amino acids AHA and HPG resulted in metabolomic changes of 7%, 19% and 8%, respectively. 
These perturbations resulted in adaptations to intermediates in the TCA cycle, glycerophospholipids, amino acids and acetylated amino acids. Interestingly, the multivariate analysis that outlined the metabolic changes was mirrored when the variation within the data was calculated. This led to additional work that expanded variation analysis and its relationship to omics data.

We first noted that variation changes in response to stress in work from 2014 [1], and the phenomenon reappeared repeatedly, not only in our metabolomics data but in our proteomics data as well. To better understand and fully characterize this phenomenon, data were collected from the omics repositories Metabolomics Workbench and PRIDE. By augmenting our data with data from published studies by other research groups, I was able to construct a model for the analysis of variation as it relates to stress: Replicate Variation Analysis (RVA). Analyzing the change in coefficient of variation (CV) and population statistics provided multiple examples of a system under stress being reflected as a smaller RVA in omics data.

An interesting exception to the RVA model is that chronic stress seems to have a different signature. In contrast to the reduced variation observed in response to an acute stress, systems enduring continued pressure that leads to chronic stress in some cases exhibit a rebound that generates increased variability. As described in chapter three, patients with Chronic Fatigue Syndrome had significantly larger RVA statistics. Given our observations that time impacts the RVA, we surmise that, in contrast to acute stress, chronic stress may result in an opposite trend and a corresponding increase in CV distribution patterns.
This idea is consistent with the evolutionary theory that directional evolution based on environmental stress induces increased phenotypic and genetic variation [2]; however, additional work is needed to fully characterize the relationship between chronic stress and CSR variation in omics data. An additional point for further investigation is the potential to use RVA as a measure of the severity of stress and as a reflection of recovery, or plastic homeostasis, of the system. By taking the traditional omics approach and using the same data for a parallel analysis, I was able to identify phenome changes that were previously overlooked. RVA is a fast and simple tool that can offer clarity to complicated data in a highly reductive way (Figure 5.1). This view of the data can offer new connections between the phenotype and the metabolome or proteome.

Figure 5.1. RVA Focuses Data Analysis: Graphical abstract of the utilization of Replicate Variation Analysis (RVA) as a tool for omics analysis and for characterization of stressed phenotypes and of pathways that are tightly controlled due to this phenomenon.

A proteomic analysis of an archaeal system helped explain some of the biochemical changes that contribute to the change in variation due to stress. Stress was not the predicted outcome of the experimental setup, but TEM analysis of cell size and shape by our collaborators indicated that the organism was adopting a stressed phenotype. To investigate the potential stress and its interaction with the experimental conditions, RVA was used to help untangle these mixed results. The original aim of the work was to probe potential schemes of Fe and S trafficking in Methanococcus voltae, so we completed a deep proteomic analysis of the system when grown on different sources of these minerals. We had excellent coverage of the proteome, identifying 1,269 proteins, a surprising 79% of predicted open reading frames.
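As a minimal sketch of the kind of comparison that RVA makes, and not the implementation used in this work, the following Python example computes a per-feature CV across replicates for two groups and then compares the two CV distributions. The toy intensities, the function names `feature_cvs` and `ks_statistic`, and the choice of a two-sample Kolmogorov-Smirnov statistic as the distribution comparison are all assumptions made for illustration.

```python
from statistics import mean, median, stdev


def feature_cvs(replicates):
    """Return the CV (sample sd / mean) of each feature across replicates.

    `replicates` is a list of samples; each sample is a list of feature
    intensities given in the same feature order.
    """
    cvs = []
    for values in zip(*replicates):      # one tuple of replicate values per feature
        mu = mean(values)
        if mu > 0:                       # skip features with no signal
            cvs.append(stdev(values) / mu)
    return cvs


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    gap = 0.0
    for x in sorted(a + b):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical toy data: 3 replicates x 4 features per group (not from this work).
control = [[10, 20, 30, 40], [14, 26, 24, 52], [6, 14, 36, 28]]
stressed = [[10, 20, 30, 40], [11, 21, 29, 42], [9, 19, 31, 38]]

cv_control = feature_cvs(control)
cv_stressed = feature_cvs(stressed)

print("median CV, control: ", round(median(cv_control), 3))
print("median CV, stressed:", round(median(cv_stressed), 3))
print("KS statistic:", ks_statistic(cv_control, cv_stressed))
```

In this toy example the stressed group has the lower median CV (0.05 versus 0.3 for the control) and its CV distribution does not overlap that of the control, which is the acute-stress pattern described above. Real data would involve far more features and replicates, and a significance test on the distribution comparison.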
To explore the potential CSR that might also be at play, we applied RVA to help identify a stressed phenotype, stress proteins and proteins that were under tight control by the cells. A protein with a small RVA in an experimental condition compared to the control indicates that the protein is important for Fe and S integration. This reaffirmed many proteins that we had already identified through the traditional proteomics approach as potential players in Fe and S trafficking. This analysis also identified potential mechanisms that underlie RVA. Unsurprisingly, stress response pathways such as heat shock, unfolded protein response and DNA damage repair were upregulated in the stressed phenotype and also had a small RVA. The control (non-stressed) condition had larger amounts of proteins relating to transcription and translation, with larger RVA values compared to the stressed group. This indicates that the cell had more metabolic freedom in the control environment, a conclusion that was reflected in other analyses as well.

There are many areas of future research that arise from the characterization of the relationship between variation in omics data and phenotype. While I demonstrated here that experimental groups that had been characterized as “stressed” showed a smaller CV distribution profile, further characterization is necessary. Is it truly stress, or is it just a stress-adapted phenotype? CSR can have deleterious effects and even result in death; what does the CV profile look like throughout this progression? As mentioned in chapter three, the onset and the relaxation of the “adapted phenotype” should be tracked in temporal studies to help identify plastic and non-plastic phenotypic adaptations. Further annotated biological data should also be evaluated to help characterize global trends in changes in CV distribution. The M.
voltae analysis provided information indicating that a stress response was occurring and that some cellular processes, such as transcription and translation, were quieted, but this might not be the underlying mechanism for every organism that demonstrates this change in CV profile.

This work began with a typical metabolomics analysis to investigate the implications of non-canonical amino acid tags in E. coli. Within that work, a relationship between variation in omics data and metabolomic dysregulation was first identified. Then, through a meta-analysis of various omics analyses, I was able to characterize this relationship and describe the statistical model needed to compare variation distributions between two experimental groups (RVA). Finally, by using RVA in addition to standard statistical approaches for a proteomics analysis in the archaeon M. voltae, large global changes were better explained, and phenotypic depth was added to pathway differentiation.

The results presented in this dissertation show that metabolomics and proteomics are powerful tools for elucidating metabolic adaptations to minor and more severe stress. We demonstrated that the link between stress and variation in omics data is reproducible and is an effective way to evaluate a stressed phenotype. Not only was I able to characterize an additional omics statistic that would add valuable information to any omics workflow, I was also able to characterize some of the underlying mechanisms that result in the metabolic tightening that occurs due to stress.

References Cited

[1] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, and B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” Metabolomics, vol. 10, no. 6, pp. 1121–1128, Dec. 2014, doi: 10.1007/s11306-014-0651-0.
[2] A. V. Badyaev, “Stress-induced variation in evolution: From behavioural plasticity to genetic assimilation,” Proceedings of the Royal Society B: Biological Sciences, vol.
272, no. 1566. Royal Society, pp. 877–886, May 07, 2005, doi: 10.1098/rspb.2004.3045. 136 REFERENCES CITED 137 [1] D. C. Dieterich, A. J. Link, J. Graumann, D. A. Tirrell, and E. M. Schuman, “Selective identification of newly synthesized proteins in mammalian cells using bioorthogonal noncanonical amino acid tagging (BONCAT),” Proc. Natl. Acad. Sci., vol. 103, no. 25, pp. 9482–9487, Jun. 2006, doi: 10.1073/pnas.0601637103. [2] H. C. Kolb, M. G. Finn, and K. B. Sharpless, “Click Chemistry: Diverse Chemical Function from a Few Good Reactions,” Angew. Chemie Int. Ed., vol. 40, no. 11, pp. 2004–2021, Jun. 2001, doi: 10.1002/1521-3773(20010601)40:11<2004::AID-ANIE2004>3.0.CO;2-5. [3] T. J. Samo, S. Smriga, F. Malfatti, B. P. Sherwood, and F. Azam, “Broad distribution and high proportion of protein synthesis active marine bacteria revealed by click chemistry at the single cell level,” Front. Mar. Sci., vol. 1, p. 48, Oct. 2014, doi: 10.3389/fmars.2014.00048. [4] A. Leizeaga, M. Estrany, I. Forn, and M. Sebastián, “Using Click-Chemistry for Visualizing in Situ Changes of Translational Activity in Planktonic Marine Bacteria.,” Front. Microbiol., vol. 8, p. 2360, 2017, doi: 10.3389/fmicb.2017.02360. [5] M. Sebastián et al., “High Growth Potential of Long-Term Starved Deep Ocean Opportunistic Heterotrophic Bacteria,” Front. Microbiol., vol. 10, p. 760, Apr. 2019, doi: 10.3389/fmicb.2019.00760. [6] R. Hatzenpichler, S. A. Connon, D. Goudeau, R. R. Malmstrom, T. Woyke, and V. J. Orphan, “Visualizing in situ translational activity for identifying and sorting slow-growing archaeal-bacterial consortia.,” Proc. Natl. Acad. Sci. U. S. A., vol. 113, no. 28, pp. E4069-78, Jul. 2016, doi: 10.1073/pnas.1603757113. [7] R. Hatzenpichler, S. Scheller, P. L. Tavormina, B. M. Babin, D. A. Tirrell, and V. J. Orphan, “In situ visualization of newly synthesized proteins in environmental microbes using amino acid tagging and click chemistry,” Environ. Microbiol., vol. 16, no. 8, pp. 
2568–2590, Aug. 2014, doi: 10.1111/1462-2920.12436. [8] K. L. Kiick, E. Saxon, D. A. Tirrell, and C. R. Bertozzi, “Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation.,” Proc. Natl. Acad. Sci. U. S. A., vol. 99, no. 1, pp. 19–24, Jan. 2002, doi: 10.1073/pnas.012583299. [9] J. D. Bagert et al., “Quantitative, Time-Resolved Proteomic Analysis by Combining Bioorthogonal Noncanonical Amino Acid Tagging and Pulsed Stable Isotope Labeling by Amino Acids in Cell Culture,” Mol. Cell. Proteomics, vol. 13, no. 5, pp. 1352–1358, May 2014, doi: 10.1074/mcp.M113.031914. [10] R. Hatzenpichler and V. J. Orphan, “Detection of Protein-Synthesizing Microorganisms in the Environment via Bioorthogonal Noncanonical Amino Acid Tagging (BONCAT),” Springer, Berlin, Heidelberg, 2015, pp. 145–157. [11] P. Landgraf, E. R. Antileo, E. M. Schuman, and D. C. Dieterich, “BONCAT: Metabolic Labeling, Click Chemistry, and Affinity Purification of Newly Synthesized Proteomes,” Humana Press, New York, NY, 2015, pp. 199–215. [12] S. Calve, A. J. Witten, A. R. Ocken, and T. L. Kinzer-Ursem, “Incorporation of non-canonical amino acids into the developing murine proteome,” Sci. Rep., vol. 6, no. 1, p. 32377, Oct. 2016, doi: 10.1038/srep32377. [13] F. Lehner et al., “Impact of Azidohomoalanine Incorporation on Protein Structure and Ligand Binding,” ChemBioChem, vol. 18, no. 23, pp. 2340–2350, Dec. 2017, doi: 10.1002/cbic.201700437. 138 [14] P. Landgraf, E. R. Antileo, E. M. Schuman, and D. C. Dieterich, “BONCAT: Metabolic Labeling, Click Chemistry, and Affinity Purification of Newly Synthesized Proteomes,” Humana Press, New York, NY, 2015, pp. 199–215. [15] T. Hamerly et al., “Untargeted metabolomics studies employing NMR and LC– MS reveal metabolic coupling between Nanoarcheum equitans and its archaeal host Ignicoccus hospitalis,” Metabolomics, vol. 11, no. 4, pp. 895–907, Aug. 2015, doi: 10.1007/s11306-014- 0747-6. [16] M. 
Bradford, “A Rapid and Sensitive Method for the Quantitation of Microgram Quantities of Protein Utilizing the Principle of Protein-Dye Binding,” Anal. Biochem., vol. 72, no. 1–2, pp. 248–254, May 1976, doi: 10.1006/abio.1976.9999. [17] T. Pluskal, S. Castillo, A. Villar-Briones, and M. Oresic, “MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data.,” BMC Bioinformatics, vol. 11, no. 1, p. 395, Jul. 2010, doi: 10.1186/1471-2105-11-395. [18] J. Chong and J. Xia, “MetaboAnalystR: an R package for flexible and reproducible analysis of metabolomics data,” Bioinformatics, vol. 34, no. 24, pp. 4313–4314, Dec. 2018, doi: 10.1093/bioinformatics/bty528. [19] R. Tautenhahn, G. J. Patti, D. Rinehart, and G. Siuzdak, “XCMS Online: a web- based platform to process untargeted metabolomic data.,” Anal. Chem., vol. 84, no. 11, pp. 5035–9, Jun. 2012, doi: 10.1021/ac300698c. [20] O. D. Myers, S. J. Sumner, S. Li, S. Barnes, and X. Du, “Detailed Investigation and Comparison of the XCMS and MZmine 2 Chromatogram Construction and Chromatographic Peak Detection Methods for Preprocessing Mass Spectrometry Metabolomics Data,” Anal. Chem., vol. 89, no. 17, pp. 8689–8695, Sep. 2017, doi: 10.1021/acs.analchem.7b01069. [21] A. Mahdavi et al., “Identification of secreted bacterial proteins by noncanonical amino acid tagging.,” Proc. Natl. Acad. Sci. U. S. A., vol. 111, no. 1, pp. 433–8, Jan. 2014, doi: 10.1073/pnas.1301740111. [22] J. D. Bagert et al., “Time-resolved proteomic analysis of quorum sensing in Vibrio harveyi,” Chem. Sci., vol. 7, no. 3, pp. 1797–1806, Feb. 2016, doi: 10.1039/C5SC03340C. [23] B. M. Babin et al., “SutA is a bacterial transcription factor expressed during slow growth in Pseudomonas aeruginosa,” Proc. Natl. Acad. Sci., vol. 113, no. 5, pp. E597–E605, Feb. 2016, doi: 10.1073/pnas.1514412113. [24] S. A. Babin, E. A. Zlobina, S. I. Kablukov, and E. V. 
Podivilov, “High-order random Raman lasing in a PM fiber with ultimate efficiency and narrow bandwidth,” Sci. Rep., vol. 6, no. 1, p. 22625, Sep. 2016, doi: 10.1038/srep22625. [25] J. D. Bagert et al., “Time-resolved proteomic analysis of quorum sensing in Vibrio harveyi,” Chem. Sci., vol. 7, no. 3, pp. 1797–1806, Feb. 2016, doi: 10.1039/C5SC03340C. [26] L. St»hle and S. Wold, “Analysis of variance (ANOVA),” Chemom. Intell. Lab. Syst., vol. 6, no. 4, pp. 259–272, Nov. 1989, doi: 10.1016/0169-7439(89)80095-4. [27] H.-W. Cho et al., “Discovery of metabolite features for the modelling and analysis of high-resolution NMR spectra.,” Int. J. Data Min. Bioinform., vol. 2, no. 2, pp. 176–92, 2008, Accessed: Aug. 13, 2019. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/18767354. 139 [28] T. Arnesen, “Towards a Functional Understanding of Protein N-Terminal Acetylation,” PLoS Biol., vol. 9, no. 5, p. e1001074, May 2011, doi: 10.1371/journal.pbio.1001074. [29] D. G. Christensen et al., “Post-translational Protein Acetylation: An Elegant Mechanism for Bacteria to Dynamically Regulate Metabolic Functions,” Front. Microbiol., vol. 10, p. 1604, Jul. 2019, doi: 10.3389/fmicb.2019.01604. [30] J. Elf and M. Ehrenberg, “Near-Critical Behavior of Aminoacyl-tRNA Pools in E. coli at Rate-Limiting Supply of Amino Acids,” Biophys. J., vol. 88, no. 1, pp. 132–146, Jan. 2005, doi: 10.1529/BIOPHYSJ.104.051383. [31] V. J. Carabetta and I. M. Cristea, “Regulation, Function, and Detection of Protein Acetylation in Bacteria.,” J. Bacteriol., vol. 199, no. 16, pp. e00107-17, Aug. 2017, doi: 10.1128/JB.00107-17. [32] T. Arnesen, “Towards a Functional Understanding of Protein N-Terminal Acetylation,” PLoS Biol., vol. 9, no. 5, p. e1001074, May 2011, doi: 10.1371/journal.pbio.1001074. [33] S. Jozefczuk et al., “Metabolomic and transcriptomic stress response of Escherichia coli,” Mol. Syst. Biol., vol. 6, no. 1, p. 364, Jan. 2010, doi: 10.1038/msb.2010.18. [34] L. Galluzzi, J. M. 
Bravo-San Pedro, O. Kepp, and G. Kroemer, “Regulated cell death and adaptive stress responses,” Cellular and Molecular Life Sciences, vol. 73, no. 11–12. Birkhauser Verlag AG, pp. 2405–2410, Jun. 01, 2016, doi: 10.1007/s00018-016-2209-y. [35] R. Schuhmacher, R. Krska, W. Weckwerth, and R. Goodacre, “Metabolomics and metabolite profiling,” Analytical and Bioanalytical Chemistry, vol. 405, no. 15. Springer, pp. 5003–5004, Jun. 17, 2013, doi: 10.1007/s00216-013-6939-5. [36] W. S. Bush, M. T. Oetjens, and D. C. Crawford, “Unravelling the human genome- phenome relationship using phenome-wide association studies,” Nature Reviews Genetics, vol. 17, no. 3. Nature Publishing Group, pp. 129–145, Mar. 01, 2016, doi: 10.1038/nrg.2015.36. [37] T. M. Healy and P. M. Schulte, “Phenotypic plasticity and divergence in gene expression,” Molecular Ecology, vol. 24, no. 13. Blackwell Publishing Ltd, pp. 3220–3222, Jul. 01, 2015, doi: 10.1111/mec.13246. [38] D. Houle, D. R. Govindaraju, and S. Omholt, “Phenomics: The next challenge,” Nature Reviews Genetics, vol. 11, no. 12. Nature Publishing Group, pp. 855–866, Dec. 18, 2010, doi: 10.1038/nrg2897. [39] C. E. Brown and C. E. Brown, “Coefficient of Variation,” in Applied Multivariate Statistics in Geohydrology and Related Sciences, Springer Berlin Heidelberg, 1998, pp. 155– 157. [40] P. H. Bessette, F. Aslund, J. Beckwith, G. Georgiou, and S. Blanquet, “Efficient folding of proteins with multiple disulfide bonds in the Escherichia coli cytoplasm,” Proc. Natl. Acad. Sci., vol. 96, no. 24, pp. 13703–13708, Nov. 1999, doi: 10.1073/pnas.96.24.13703. [41] A. M. De Livera, G. Olshansky, J. A. Simpson, and D. J. Creek, “NormalizeMets: assessing, selecting and implementing statistical methods for normalizing metabolomics data,” Metabolomics, vol. 14, no. 5, p. 54, May 2018, doi: 10.1007/s11306-018-1347-7. [42] A. M. De Livera et al., “Statistical Methods for Handling Unwanted Variation in Metabolomics Data,” Anal. Chem., vol. 87, no. 7, pp. 
3606–3615, Apr. 2015, doi: 10.1021/ac502439y. 140 [43] E. C. Olson and A. V. Yablokov, “Variability in Mammals,” J. Mammal., vol. 48, no. 3, p. 500, Aug. 1967, doi: 10.2307/1377806. [44] F. J. Massey, “The Kolmogorov-Smirnov Test for Goodness of Fit,” J. Am. Stat. Assoc., vol. 46, no. 253, pp. 68–78, 1951, doi: 10.1080/01621459.1951.10500769. [45] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, and B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” Metabolomics, vol. 10, no. 6, pp. 1121–1128, Mar. 2014, doi: 10.1007/s11306-014- 0651-0. [46] K. F. Steward et al., “Metabolic Implications of Using BioOrthogonal Non- Canonical Amino Acid Tagging (BONCAT) for Tracking Protein Synthesis,” Front. Microbiol., vol. 11, p. 197, Feb. 2020, doi: 10.3389/fmicb.2020.00197. [47] S. L. Borrego et al., “Metabolic changes associated with methionine stress sensitivity in MDA-MB-468 breast cancer cells,” Cancer Metab., vol. 4, no. 1, p. 9, Dec. 2016, doi: 10.1186/s40170-016-0148-6. [48] H. Chou, W. Pathmasiri, J. Deese-Spruill, S. Sumner, and D. B. Buchwalter, “Metabolomics reveal physiological changes in mayfly larvae (Neocloeon triangulifer) at ecological upper thermal limits,” J. Insect Physiol., vol. 101, pp. 107–112, Aug. 2017, doi: 10.1016/j.jinsphys.2017.07.008. [49] E. M. Brown et al., “Diet and specific microbial exposure trigger features of environmental enteropathy in a novel murine model,” Nat. Commun., vol. 6, Aug. 2015, doi: 10.1038/ncomms8806. [50] F. Xu, T. Gao, and X. Liu, “Metabolomics Adaptation of Juvenile Pacific Abalone Haliotis discus hannai to Heat Stress,” Sci. Rep., vol. 10, no. 1, pp. 1–11, Dec. 2020, doi: 10.1038/s41598-020-63122-4. [51] “PRIDE - Proteomics Identification Database.” https://www.ebi.ac.uk/pride/archive/ (accessed Oct. 09, 2020). [52] M. 
Zhang et al., “Phosphoproteome analysis reveals new drought response and defense mechanisms of seedling leaves in bread wheat (Triticum aestivum L.),” J. Proteomics, vol. 109, pp. 290–308, Sep. 2014, doi: 10.1016/j.jprot.2014.07.010. [53] “Human Metabolome Database.” https://hmdb.ca/ (accessed Oct. 09, 2020). [54] A. J. Green et al., “Cadmium exposure increases the risk of juvenile obesity: a human and zebrafish comparative study,” Int. J. Obes., vol. 42, no. 7, pp. 1285–1295, Jul. 2018, doi: 10.1038/s41366-018-0036-y. [55] A. V. Badyaev, “Stress-induced variation in evolution: From behavioural plasticity to genetic assimilation,” Proceedings of the Royal Society B: Biological Sciences, vol. 272, no. 1566. Royal Society, pp. 877–886, May 07, 2005, doi: 10.1098/rspb.2004.3045. [56] F. T. C. Pan, S. L. Applebaum, and D. T. Manahan, “Differing thermal sensitivities of physiological processes alter ATP allocation,” J. Exp. Biol., vol. 224, no. 2, p. jeb233379, Jan. 2021, doi: 10.1242/jeb.233379. [57] J. R. Banavar, J. Damuth, A. Maritan, and A. Rinaldo, “Supply-demand balance and metabolic scaling,” Proc. Natl. Acad. Sci. U. S. A., vol. 99, no. 16, pp. 10506–10509, Aug. 2002, doi: 10.1073/pnas.162216899. 141 [58] C. Pollock, J. Farrar, D. Tomos, J. Gallagher, C. Lu, and O. Koroleva, “Balancing supply and demand: the spatial regulation of carbon metabolism in grass and cereal leaves,” J. Exp. Bot., vol. 54, no. 382, pp. 489–494, Jan. 2003, doi: 10.1093/jxb/erg037. [59] F. Chen, A. Evans, J. Pham, and B. Plosky, “Molecular Cell Editorial Cellular Stress Responses: A Balancing Act,” Mol. Cell, vol. 40, p. 175, 2010, doi: 10.1016/j.molcel.2010.10.008. [60] J. E. Ferrell and E. M. Machleder, “The biochemical basis of an all-or-none cell fate switch in xenopus oocytes,” Science (80-. )., vol. 280, no. 5365, pp. 895–898, May 1998, doi: 10.1126/science.280.5365.895. [61] R. P. Juster, B. S. McEwen, and S. J. 
Lupien, “Allostatic load biomarkers of chronic stress and impact on health and cognition,” Neuroscience and Biobehavioral Reviews, vol. 35, no. 1. Pergamon, pp. 2–16, Sep. 01, 2010, doi: 10.1016/j.neubiorev.2009.10.002. [62] W. E. Dyer, “Stress-induced evolution of herbicide resistance and related pleiotropic effects,” Pest Manag. Sci., vol. 74, no. 8, pp. 1759–1768, Aug. 2018, doi: 10.1002/ps.5043. [63] A. M. Pickering, L. Vojtovich, J. Tower, and K. J. A. Davies, “Oxidative stress adaptation with acute, chronic, and repeated stress,” Free Radic. Biol. Med., vol. 55, pp. 109– 118, Feb. 2013, doi: 10.1016/j.freeradbiomed.2012.11.001. [64] W. E. Dyer, “Stress-induced evolution of herbicide resistance and related pleiotropic effects,” Pest Manag. Sci., vol. 74, no. 8, pp. 1759–1768, Aug. 2018, doi: 10.1002/ps.5043. [65] M. N. Ahmed, A. Porse, M. O. A. Sommer, N. Høiby, and O. Ciofu, “Evolution of antibiotic resistance in biofilm and planktonic pseudomonas aeruginosa populations exposed to subinhibitory levels of ciprofloxacin,” Antimicrob. Agents Chemother., vol. 62, no. 8, Aug. 2018, doi: 10.1128/AAC.00320-18. [66] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, and B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” Metabolomics, vol. 10, no. 6, pp. 1121–1128, Dec. 2014, doi: 10.1007/s11306-014- 0651-0. [67] “Metabolomics Workbench : NIH Data Repository : Overview.” https://www.metabolomicsworkbench.org/data/index.php (accessed Oct. 09, 2020). [68] “RStudio | Open source & professional software for data science teams - RStudio.” https://rstudio.com/ (accessed Oct. 09, 2020). [69] “ggplot2 citation info.” https://cran.r- project.org/web/packages/ggplot2/citation.html (accessed Oct. 16, 2020). [70] C. O. Wilke, “Ridgeline Plots in ‘ggplot2’ [R package ggridges version 0.5.2],” Jan. 2020, Accessed: Oct. 16, 2020. [Online]. Available: https://cran.r- project.org/package=ggridges. [71] K. Brzóska, S. 
Męczyńska, and M. Kruszewski, “Iron-sulfur cluster proteins: electron transfer and beyond *,” 2006. Accessed: Feb. 26, 2021. [Online]. Available: www.actabp.pl. [72] D. R. Martin and D. V. Matyushov, “Electron-transfer chain in respiratory complex i,” Sci. Rep., vol. 7, no. 1, pp. 1–11, Dec. 2017, doi: 10.1038/s41598-017-05779-y. 142 [73] R. K. Thauer, A. K. Kaster, H. Seedorf, W. Buckel, and R. Hedderich, “Methanogenic archaea: Ecologically relevant differences in energy conservation,” Nature Reviews Microbiology, vol. 6, no. 8. Nature Publishing Group, pp. 579–591, Aug. 30, 2008, doi: 10.1038/nrmicro1931. [74] M. Fontecave and S. Ollagnier-de-Choudens, “Iron-sulfur cluster biosynthesis in bacteria: Mechanisms of cluster assembly and transfer,” Arch. Biochem. Biophys., vol. 474, no. 2, pp. 226–237, Jun. 2008, doi: 10.1016/j.abb.2007.12.014. [75] A. Mahadevan and S. Fernando, “Inorganic iron-sulfur clusters enhance electron transport when used for wiring the NAD-glucose dehydrogenase based redox system,” Microchim. Acta, vol. 185, no. 7, pp. 1–8, Jul. 2018, doi: 10.1007/s00604-018-2871-x. [76] J. W. Peters et al., “[FeFe]- and [NiFe]-hydrogenase diversity, mechanism, and maturation,” Biochimica et Biophysica Acta - Molecular Cell Research, vol. 1853, no. 6. Elsevier, pp. 1350–1369, Jun. 01, 2015, doi: 10.1016/j.bbamcr.2014.11.021. [77] Y. Liu, L. L. Beer, and W. B. Whitman, “Methanogens: A window into ancient sulfur metabolism,” Trends in Microbiology, vol. 20, no. 5. Trends Microbiol, pp. 251–258, May 2012, doi: 10.1016/j.tim.2012.02.002. [78] E. S. Boyd, M. J. Amenabar, S. Poudel, and A. S. Templeton, “Bioenergetic constraints on the origin of autotrophic metabolism,” Philos. Trans. R. Soc. A Math. Phys. Eng. Sci., vol. 378, no. 2165, Feb. 2020, doi: 10.1098/rsta.2019.0151. [79] Y. Liu, M. Sieprawska-Lupa, W. B. Whitman, and R. H. 
White, “Cysteine is not the sulfur source for iron-sulfur cluster and methionine biosynthesis in the methanogenic archaeon Methanococcus maripaludis,” J. Biol. Chem., vol. 285, no. 42, pp. 31923–31929, Oct. 2010, doi: 10.1074/jbc.M110.152447. [80] E. S. Boyd, K. M. Thomas, Y. Dai, J. M. Boyd, and F. W. Outten, “Interplay between Oxygen and Fe-S Cluster Biogenesis: Insights from the Suf Pathway,” Biochemistry, vol. 53, no. 37. American Chemical Society, pp. 5834–5847, Sep. 23, 2014, doi: 10.1021/bi500488r. [81] E. S. Payne, D., Spietz, R.L., Boyd, “Reductive Dissolution of pyrite by methanogenic archaea,” J. Bacteriol. [82] W. B. Whitman, E. Ankwanda, and R. S. Wolfe, “Nutrition and carbon metabolism of Methanococcus voltae,” J. Bacteriol., vol. 149, no. 3, pp. 852–863, 1982, doi: 10.1128/jb.149.3.852-863.1982. [83] S. Gessulat et al., “Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning,” Nat. Methods, vol. 16, no. 6, pp. 509–518, Jun. 2019, doi: 10.1038/s41592-019-0426-7. [84] B. C. Searle et al., “Generating high-quality libraries for DIA-MS with empirically-corrected peptide predictions,” bioRxiv. bioRxiv, p. 682245, Jun. 27, 2019, doi: 10.1101/682245. [85] “Semi-supervised learning for peptide identification from shotgun proteomics datasets.” https://noble.gs.washington.edu/proj/percolator/ (accessed Feb. 27, 2021). [86] L. Käll, J. D. Storey, M. J. MacCoss, and W. S. Noble, “Assigning significance to peptides identified by tandem mass spectrometry using decoy databases,” Journal of Proteome Research, vol. 7, no. 1. American Chemical Society, pp. 29–34, Jan. 2008, doi: 10.1021/pr700600n. 143 [87] L. Käll, J. D. Storey, and W. S. Noble, “Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry,” in Bioinformatics, Aug. 2008, vol. 24, no. 16, p. i42, doi: 10.1093/bioinformatics/btn294. [88] J. Xia, I. V. Sinelnikov, B. Han, and D. S. 
Wishart, “MetaboAnalyst 3.0—making metabolomics more meaningful,” Nucleic Acids Res., vol. 43, no. W1, pp. W251–W257, Jul. 2015, doi: 10.1093/nar/gkv380. [89] L. A. Kelley, S. Mezulis, C. M. Yates, M. N. Wass, and M. J. E. Sternberg, “The Phyre2 web portal for protein modeling, prediction and analysis,” Nat. Protoc., vol. 10, no. 6, pp. 845–858, Jun. 2015, doi: 10.1038/nprot.2015.053. [90] N. Y. Yu et al., “PSORTb 3.0: Improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes,” Bioinformatics, vol. 26, no. 13, pp. 1608–1615, May 2010, doi: 10.1093/bioinformatics/btq249. [91] “Methanococcus voltae (ID 749) - Genome - NCBI.” https://www.ncbi.nlm.nih.gov/genome?term=txid2188%5Borgn%5D (accessed Feb. 27, 2021). [92] M. O. Dayhoff, R. V. Eck, and C. M. Park, “A model of evolutionary change in proteins.” National Biomedical Research Foundation, pp. 89–100, 1972. [93] T. Miyata, S. Miyazawa, and T. Yasunaga, “Two types of amino acid substitutions in protein evolution,” J. Mol. Evol., vol. 12, no. 3, pp. 219–236, Mar. 1979, doi: 10.1007/BF01732340. [94] C. K. Y. Lau, K. D. Krewulak, and H. J. Vogel, “Bacterial ferrous iron transport: The Feo system,” FEMS Microbiol. Rev., vol. 40, no. 2, pp. 273–298, Jan. 2016, doi: 10.1093/femsre/fuv049. [95] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” J. Mol. Biol., vol. 215, no. 3, pp. 403–410, 1990, doi: 10.1016/S0022- 2836(05)80360-2. [96] G. L. Holliday et al., “Atlas of the Radical SAM Superfamily: Divergent Evolution of Function Using a ‘Plug and Play’ Domain,” in Methods in Enzymology, vol. 606, Academic Press Inc., 2018, pp. 1–71. [97] J. B. Broderick, B. R. Duffus, K. S. Duschene, and E. M. Shepard, “Radical S- adenosylmethionine enzymes,” Chemical Reviews, vol. 114, no. 8. American Chemical Society, pp. 4229–4317, Apr. 23, 2014, doi: 10.1021/cr4004709. [98] P. A. Frey, A. D. 
Hegeman, and F. J. Ruzicka, “The radical SAM superfamily,” Crit. Rev. Biochem. Mol. Biol., vol. 43, no. 1, pp. 63–88, Jan. 2008, doi: 10.1080/10409230701829169.
[99] C. Ranquet, S. Ollagnier-de-Choudens, L. Loiseau, F. Barras, and M. Fontecave, “Cobalt stress in Escherichia coli: The effect on the iron-sulfur proteins,” J. Biol. Chem., vol. 282, no. 42, pp. 30442–30451, Oct. 2007, doi: 10.1074/jbc.M702519200.
[100] R. G. Björk and T. Rasmuson, Modification and Editing of RNA. Washington, DC: ASM Press, 1998.
[101] P. Nordlund and P. Reichard, “Ribonucleotide Reductases,” Annu. Rev. Biochem., vol. 75, no. 1, pp. 681–706, Jun. 2006, doi: 10.1146/annurev.biochem.75.103004.142443.
[102] S. E. McGlynn et al., “Identification and characterization of a novel member of the radical AdoMet enzyme superfamily and implications for the biosynthesis of the Hmd hydrogenase active site cofactor,” J. Bacteriol., vol. 192, no. 2, pp. 595–598, Jan. 2010, doi: 10.1128/JB.01125-09.
[103] Y. Nicolet et al., “X-ray structure of the [FeFe]-hydrogenase maturase HydE from Thermotoga maritima,” J. Biol. Chem., vol. 283, no. 27, pp. 18861–18872, Jul. 2008, doi: 10.1074/jbc.M801161200.
[104] A. S. Pott and C. Dahl, “Sirohaem sulfite reductase and other proteins encoded by genes at the dsr locus of Chromatium vinosum are involved in the oxidation of intracellular sulfur,” Microbiology, vol. 144, no. 7, pp. 1881–1894, 1998, doi: 10.1099/00221287-144-7-1881.
[105] B. J. Rauch and J. J. Perona, “Efficient sulfide assimilation in Methanosarcina acetivorans is mediated by the MA1715 protein,” J. Bacteriol., vol. 198, no. 14, pp. 1974–1983, Jul. 2016, doi: 10.1128/JB.00141-16.
[106] V. Adam, A. Royant, V. Nivière, F. P. Molina-Heredia, and D. Bourgeois, “Structure of superoxide reductase bound to ferrocyanide and active site expansion upon X-ray-induced photo-reduction,” Structure, vol. 12, no. 9, pp. 1729–1740, Sep. 2004, doi: 10.1016/j.str.2004.07.013.
[107] S. Lee et al., “A 1-Cys Peroxiredoxin from a Thermophilic Archaeon Moonlights as a Molecular Chaperone to Protect Protein and DNA against Stress-Induced Damage,” PLoS One, vol. 10, no. 5, p. e0125325, May 2015, doi: 10.1371/journal.pone.0125325.
[108] S. Storbeck, S. Rolfes, E. Raux-Deery, M. J. Warren, D. Jahn, and G. Layer, “A novel pathway for the biosynthesis of heme in Archaea: genome-based bioinformatic predictions and experimental evidence,” Archaea, vol. 2010, p. 175050, 2010, doi: 10.1155/2010/175050.
[109] A. Bateman et al., “UniProt: The universal protein knowledgebase,” Nucleic Acids Res., vol. 45, no. D1, pp. D158–D169, Jan. 2017, doi: 10.1093/nar/gkw1099.
[110] R. D. Finn et al., “Pfam: The protein families database,” Nucleic Acids Res., vol. 42, no. D1, p. D222, Jan. 2014, doi: 10.1093/nar/gkt1223.
[111] T. J. Lie, K. C. Costa, B. Lupa, S. Korpole, W. B. Whitman, and J. A. Leigh, “Essential anaplerotic role for the energy-converting hydrogenase Eha in hydrogenotrophic methanogenesis,” Proc. Natl. Acad. Sci. U. S. A., vol. 109, no. 38, pp. 15473–15478, Sep. 2012, doi: 10.1073/pnas.1208779109.
[112] S. P. Gilmore et al., “Genomic analysis of methanogenic archaea reveals a shift towards energy conservation,” BMC Genomics, vol. 18, no. 1, Aug. 2017, doi: 10.1186/s12864-017-4036-4.
[113] A. Gindner, W. Hausner, and M. Thomm, “The TrmB family: a versatile group of transcriptional regulators in Archaea,” Extremophiles, vol. 18, no. 5, pp. 925–936, Sep. 2014, doi: 10.1007/s00792-014-0677-2.
[114] J. S. McDowall, B. J. Murphy, M. Haumann, T. Palmer, F. A. Armstrong, and F. Sargent, “Bacterial formate hydrogenlyase complex,” Proc. Natl. Acad. Sci. U. S. A., vol. 111, no. 38, pp. E3948–E3956, Sep. 2014, doi: 10.1073/pnas.1407927111.
[115] W. S.
Maaty et al., “Something old, something new, something borrowed; how the thermoacidophilic archaeon Sulfolobus solfataricus responds to oxidative stress,” PLoS One, vol. 4, no. 9, Sep. 2009, doi: 10.1371/journal.pone.0006964.
[116] W. S. Maaty et al., “Proteomic analysis of Sulfolobus solfataricus during Sulfolobus turreted icosahedral virus infection,” J. Proteome Res., vol. 11, no. 2, pp. 1420–1432, Feb. 2012, doi: 10.1021/pr201087v.
[117] E. Guedon and J. D. Helmann, “Origins of metal ion selectivity in the DtxR/MntR family of metalloregulators,” Mol. Microbiol., vol. 48, no. 2, pp. 495–506, Apr. 2003, doi: 10.1046/j.1365-2958.2003.03445.x.
[118] Y. Zhu, S. Kumar, A. L. Menon, R. A. Scott, and M. W. W. Adams, “Regulation of iron metabolism by Pyrococcus furiosus,” J. Bacteriol., vol. 195, no. 10, pp. 2400–2407, May 2013, doi: 10.1128/JB.02280-12.
[119] K. Hantke, “Iron and metal regulation in bacteria,” Curr. Opin. Microbiol., vol. 4, no. 2, pp. 172–177, 2001, doi: 10.1016/S1369-5274(00)00184-3.
[120] B. L. Deatherage and B. T. Cookson, “Membrane vesicle release in bacteria, eukaryotes, and archaea: A conserved yet underappreciated aspect of microbial life,” Infect. Immun., vol. 80, no. 6, pp. 1948–1957, Jun. 2012, doi: 10.1128/IAI.06014-11.

APPENDICES

APPENDIX A
SUPPLEMENTAL MATERIAL FOR CHAPTER TWO

Table S1. OD growth curves. All within 8.0% of the overall average at 1.3.

Sample   OD 0 min   OD 210 min   Average OD
C1       0.871      1.231        1.217
C2       0.871      1.264
C3       0.871      1.224
C4       0.871      1.180
C5       0.871      1.184
A1       0.870      1.290        1.335
A2       0.870      1.349
A3       0.870      1.412
A4       0.870      1.276
A5       0.870      1.347
H1       0.872      1.493        1.418
H2       0.872      1.311
H3       0.872      1.354
H4       0.872      1.456
H5       0.872      1.474
M1       0.871      1.283        1.265
M2       0.871      1.230
M3       0.871      1.258
M4       0.871      1.334
M5       0.871      1.222

Table S2. Bradford assay protein concentrations. All within 15.0% of the overall average at 2.1.

Sample #   mg/mL   Average mg/mL
1          1.94    1.83
2          1.90
3          1.87
4          1.74
5          1.69
6          1.91    1.94
7          1.91
8          1.91
9          2.16
10         1.82
11         2.15    2.36
12         2.37
13         2.45
14         2.25
15         2.56
16         2.37    2.43
17         2.35
18         2.40
19         2.40
20         2.63

Table S3. ANOVA single factor MS data in initial study

SUMMARY
Groups     Count   Sum        Average   Variance
CTRL avg   4036    9.57E+08   237029    5.66E+11
M1 avg     4036    9.99E+08   247604    7.78E+11
M50 avg    4036    9.43E+08   233555    6.93E+11
A1 avg     4036    9.11E+08   225683    6.16E+11
A50 avg    4036    8.59E+08   212862    5.88E+11
H1 avg     4036    9.24E+08   228978    1.44E+12
H50 avg    4036    8.56E+08   212201    7.09E+11

ANOVA
Source of Variation   SS         df      MS         F         P-value   F crit
Between Groups        3.96E+12   6       6.60E+11   0.85787   0.52509   2.09892
Within Groups         2.17E+16   28245   7.69E+11
Total                 2.17E+16   28251

The single-factor ANOVA on the non-stressed data shows no significant difference among the group means: the p-value (0.525) is greater than the alpha value of 0.05, and the F value (0.858) is much smaller than the F critical value (2.099).

Table S4. NMR metabolites in initial study. Co M M A AH H HP Mntrol et-1mM et-50µM HA-1mM A-50µM PG-1mM G-50µM etabol A A A A A A A ite S S S S S S S ver ver ver ver vera ver ver D D D D D D D age age age age ge age age 154 2 - Amin obutyr 3 0 1 0 3 0 2 1 2 0 1 0 1 0 ate .0 .3 .8 .5 .3 .7 .7 .8 .5 .8 .0 .4 .2 .0 4 - Amin 1 obutyr 59. 4 1 6 1 7 1 5 1 1 1 4 1 3 ate 7 4.6 38.3 3.4 43.8 4.5 38.3 3.8 06.3 5.1 98.6 5.0 30.2 7.1 6 2 2 1 2 1 1 A13. 25. 475. 364. 565. 4 8 864. 035. cetate 4 4 47.9 0 91.1 6 75.0 0 29.4 5.3 04.8 3 45.5 8 A 2 0 3 1 2 0 2 1 1 0 1 0 1 0 cetoin .6 .5 .0 .0 .1 .9 .2 .7 .8 .4 .0 .6 .4 .5 A denosi 1 1 2 1 1 0 1 0 1 0 1 1 0 0 ne .6 .1 .3 .9 .3 .9 .3 .7 .6 .9 .1 .2 .0 .0 1 A23.
2 1 1 1 6 8 4 7 1 6 4 5 7 lanine 6 8.8 19.1 9.4 25.7 .2 5.3 3.9 1.0 0.9 5.7 3.2 7.7 .4 A 1 4 6 6 1 2 6 2 3 1 5 1 8 0 MP 0.3 .4 .6 .6 1.9 .7 .4 .2 .4 .9 .8 .2 .3 .9 A spartat 5 0 4 1 4 1 9 1 3 0 1 6 3 0 e .0 .8 .3 .5 .7 .7 .6 0.7 .5 .6 0.2 .8 .6 .8 D imeth ylami 1 0 2 0 1 0 1 0 1 0 1 0 1 0 ne .9 .3 .0 .3 .8 .2 .5 .5 .0 .2 .1 .2 .4 .3 d 2 6 3 7 3 6 2 5 1 2 2 3 1 1 TTP 8.4 .2 0.9 .6 4.6 .6 3.2 .0 8.2 .5 1.5 .9 1.4 .5 F 2 1 ormat 74. 8 3 2 346. 2 6 2 4 2 8 1 2 e 8 6.8 69.7 4.8 70.7 5 95.6 2.4 49.4 6.8 86.9 8.0 70.7 0.1 155 F umara 5 2 4 0 3 0 5 2 4 0 3 0 5 1 te .6 .0 .2 .3 .5 .7 .7 .8 .5 .5 .1 .6 .6 .0 2 2 1 2 2 2 1 G88. 43. 926. 537. 011. 246. 140. 4 8 lucose 5 0 50.0 8 48.6 8 67.9 8 81.0 7 9.5 7 45.6 8.4 G lucose 6 1 7 1 7 1 4 1 3 0 3 1 5 0 -1- .2 .3 .0 .4 .1 .3 .9 .3 .8 .8 .9 .1 .7 .9 phosp hate G lutama 1 2 6 7 5 7 2 7 1 5 1 2 1 5 te 7.1 .2 6.1 0.9 8.7 0.0 0.3 .9 8.6 .8 3.0 .6 5.7 .7 G lutathi 3 4 4 7 5 2 2 1 2 4 3 6 2 2 one 6.1 .6 7.8 .5 4.4 1.2 8.4 1.3 1.0 .7 2.6 .6 3.7 .4 G 1 3 2 3 1 7 1 2 1 2 3 1 1 1 lycine 5.3 .7 1.4 .6 5.9 .0 3.0 .9 4.6 .7 5.5 1.1 0.4 .1 H istidin 1 0 1 0 1 0 1 0 1 0 1 0 0 0 e .1 .5 .5 .7 .1 .3 .3 .5 .2 .4 .0 .4 .8 .1 H ypoxa 9 3 1 1 1 3 5 1 4 1 6 2 3 0 nthine .3 .1 2.2 .7 1.2 .7 .5 .6 .7 .1 .5 .2 .4 .5 I soleuc 2 0 2 0 3 2 4 1 1 0 2 0 1 0 ine .6 .5 .5 .3 .6 .0 .0 .8 .8 .3 .3 .5 .1 .2 1 L18. 
6 2 3 1 6 7 3 4 1 7 6 3 1 actate 1 2.0 10.3 8.9 58.5 1.8 3.9 1.2 6.5 3.8 1.5 5.7 7.2 2.2 L 4 0 4 0 4 0 5 1 3 0 3 0 3 0 eucine .0 .4 .3 .6 .2 .4 .3 .7 .6 .3 .3 .6 .0 .5 156 M 5 9 5 1 5 4 7 3 4 6 5 8 5 8 alate 9.8 .3 5.4 1.5 2.3 .7 2.2 5.4 7.1 .0 0.1 .5 6.6 .9 M ethion 3 1 3 3 1 2 5 2 2 0 0 0 0 0 ine .2 .0 61.1 4.5 6.9 .9 .9 .6 .7 .4 .0 .0 .0 .0 N - Acetyl aspart 1 3 2 1 1 2 2 1 1 8 1 1 1 3 ate 5.5 .4 0.2 6.1 1.8 .8 1.0 7.2 7.9 .1 2.9 4.1 6.8 .5 N - Acetyl glycin 1 0 2 0 1 0 1 0 0 0 0 0 0 0 e .1 .2 .1 .5 .7 .3 .0 .4 .7 .2 .7 .2 .5 .0 N 2 5 2 4 2 3 2 7 1 1 2 5 1 1 AD+ 2.2 .0 4.1 .9 7.5 .9 0.8 .6 4.3 .4 1.4 .4 0.3 .5 N 3 0 3 0 3 1 2 1 2 0 1 0 1 0 ADP+ .4 .8 .5 .5 .6 .0 .2 .0 .0 .7 .9 .9 .6 .4 P antoth 1 0 1 0 1 0 1 0 0 0 1 0 0 0 enate .2 .3 .1 .3 .6 .6 .2 .5 .8 .1 .4 .5 .4 .1 P henyla 2 0 2 1 2 1 2 0 1 0 2 0 1 0 lanine .5 .6 .2 .0 .2 .7 .0 .4 .7 .5 .1 .7 .7 .3 P ropyle ne 1 0 1 0 1 0 1 0 1 0 1 0 1 0 glycol .7 .4 .9 .5 .7 .3 .6 .7 .5 .5 .2 .5 .1 .2 P utresci 5 1 6 3 5 1 5 2 4 1 5 2 3 9 ne 3.5 8.5 4.6 0.8 2.2 3.7 0.8 1.4 9.7 3.7 9.3 5.7 6.0 .7 P yruvat 9 2 1 4 1 5 5 3 3 2 2 1 4 1 e 5.0 8.0 29.4 2.8 05.9 9.5 4.8 5.7 4.7 3.8 0.3 9.5 7.3 0.5 157 S uccina 9 2 1 8 1 5 9 3 6 2 1 3 3 5 te 7.3 7.2 19.1 .7 43.4 7.3 2.8 2.3 7.7 7.9 14.9 0.4 9.9 .7 T yrosin 9 2 1 1 1 3 7 3 5 1 5 2 7 1 e .6 .5 0.4 .2 0.4 .9 .7 .0 .4 .1 .9 .7 .1 .5 U DP- glucos 1 4 2 4 1 3 1 7 1 3 9 3 9 1 e 5.9 .6 2.3 .7 6.9 .1 5.0 .6 0.7 .1 .4 .9 .1 .0 U DP-N- 3 0 3 0 4 1 2 0 2 0 0 1 3 0 Acetyl .7 .6 .9 .8 .2 .8 .5 .9 .4 .3 .8 .8 .6 .4 glucos amine U 1 1 1 2 9 2 8 4 5 2 4 3 7 1 MP 0.4 .3 0.8 .2 .8 .3 .4 .3 .7 .0 .4 .9 .5 .1 U 1 3 1 5 1 2 6 2 7 1 8 4 3 0 racil 1.6 .2 0.9 .1 3.5 .2 .7 .7 .7 .9 .1 .4 .1 .4 V 1 2 1 4 1 2 1 5 9 2 8 2 6 0 aline 4.3 .6 7.8 .6 6.1 .4 3.0 .2 .7 .2 .0 .2 .4 .6 *Assignment with best-matched signals, all others validated. Table S5. ANOVA single factor MS heat-stressed data. 

SUMMARY
Groups     Count   Sum        Average   Variance
CTRL avg   5960    9.80E+08   164503    1.53E+12
MET avg    5960    1.10E+09   184611    2.22E+12
AHA avg    5960    9.93E+08   166573    1.65E+12
HPG avg    5960    1.01E+09   169402    1.83E+12

ANOVA
Source of Variation   SS         df      MS         F         P-value   F crit
Between Groups        1.49E+12   3       4.95E+11   0.27394   0.84424   2.60528
Within Groups         4.31E+16   23836   1.81E+12
Total                 4.31E+16   23839

APPENDIX B
SUPPLEMENTAL MATERIAL FOR CHAPTER THREE

Supplemental Figure I. A. Distribution plots of CV of NMR metabolite feature profiles for the non-canonical amino acid treated cultures of E. coli. B. CV profiles of metabolites in methionine dependent cancer cells with methionine (MET) or homocysteine (Hcy). C. CV profiles of metabolites from replicates of mayflies that were exposed to heat stress (pink) and the analogous control group (black).

Supplemental Figure II. Malnourished mouse model studies and temperature acclimated abalone. A. Profile plots of CV metabolite features from control mice (black) and malnourished mice (pink). B. Profile plots of CV metabolite features from replicates of cold (left panel) or high temperature (right panel) acclimated Haliotis discus hannai that were exposed to heat stress (pink) and the analogous control group (black).

Supplemental Figure III. Distribution profile plots of proteomic data collected from wheat leaves (black) and wheat leaves that had been exposed to drought stress using PEG (pink).

Supplemental Table I. Study Summaries

Study                                                         KS value   p value    Type of data set
S. scrofa hemorrhagic shock v. control                        0.2868     2.20E-16   metabolomics
Non-canonical amino acid (HPG) treated E. coli v.             0.46329    2.20E-16   metabolomics
  control E. coli (mass spec)
Non-canonical amino acid (HPG) treated E. coli v.             0.2807     0.02197    metabolomics (NMR)
  control E. coli (NMR)
A. fatua control v. A. fatua heat shocked                     0.45412    2.20E-16   metabolomics
Methionine dependent cancer cell line control v.              0.71909    2.20E-16   metabolomics
  Hcy (homocysteine) treated Met-dependent cancer cell line
Control mice v. malnourished mice                             0.48098    2.20E-16   metabolomics
Control mayflies v. heat shocked mayflies                     0.078221   0.03703    metabolomics
Aerobic v. anaerobic E. coli                                  0.18756    3.55E-15   proteomics
Wheat leaf control v. PEG stressed wheat leaf                 0.46843    2.20E-16   proteomics
Control v. heat shocked nematodes                             0.037567   0.8251     metabolomics
Chronic fatigue syndrome in human males v. control            0.10233    0.02217    metabolomics
Chronic fatigue syndrome in human females v. control          0.10476    0.01991    metabolomics
Abalone low acclimatized v. low acclimatized heat shocked     0.18076    3.69E-10   metabolomics
Abalone high acclimatized v. high acclimatized heat shocked   0.077259   0.03332    metabolomics

APPENDIX C
SUPPLEMENTAL MATERIAL FOR CHAPTER FOUR

Figure S1. Pathway distribution of identified intracellular proteins based on gene annotations in DAVID (Nature Protocols 2009; 4(1):44 & Nucleic Acids Res. 2009;37(1):1).

Supplemental tables submitted as a separate file.
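
Note on the ANOVA tables in Appendix A. The F statistics reported in Tables S3 and S5 can be checked directly from the printed sums of squares and degrees of freedom, since F is the between-group mean square divided by the within-group mean square. The following is a minimal sketch of that arithmetic, not the analysis pipeline used in the study; the function names `anova_f` and `check` are illustrative, and the inputs are the three-significant-figure values as printed, so small rounding differences from the reported F values are to be expected.

```python
# Recompute single-factor ANOVA F statistics from the SS and df values
# printed in Tables S3 and S5. Inputs are rounded as printed, so the
# recomputed F is only expected to agree with the reported F approximately.

def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS_between, MS_within, F) for a single-factor ANOVA."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# table: (SS_between, df_between, SS_within, df_within, reported F, reported F crit)
tables = {
    "S3": (3.96e12, 6, 2.17e16, 28245, 0.85787, 2.09892),
    "S5": (1.49e12, 3, 4.31e16, 23836, 0.27394, 2.60528),
}

def check(name):
    """Return (recomputed F, reported F, whether recomputed F < F crit)."""
    ss_b, df_b, ss_w, df_w, f_reported, f_crit = tables[name]
    _, _, f = anova_f(ss_b, df_b, ss_w, df_w)
    return f, f_reported, f < f_crit

for name in tables:
    f, f_reported, below = check(name)
    print(f"Table {name}: recomputed F = {f:.3f}, "
          f"reported F = {f_reported}, F < F crit: {below}")
```

In both tables the recomputed F falls below the reported F critical value, which is the comparison the ANOVA note under Table S3 relies on. The p-values are not recomputed here because that would require the F distribution, which the Python standard library does not provide.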