ORIGINAL RESEARCH
published: 11 September 2020
doi: 10.3389/fmicb.2020.536978

Repetitive Sampling and Control Threshold Improve 16S rRNA Gene Sequencing Results From Produced Waters Associated With Hydraulically Fractured Shale

Jenna L. Shelton1*, Elliott P. Barnhart2,3, Leslie Ruppert4, Aaron M. Jubb4, Madalyn S. Blondes4 and Christina A. DeVera4

1 Eastern Energy Resources Science Center, U.S. Geological Survey, Sacramento, CA, United States
2 Wyoming-Montana Water Science Center, U.S. Geological Survey, Helena, MT, United States
3 Center for Biofilm Engineering, Montana State University, Bozeman, MT, United States
4 Eastern Energy Resources Science Center, U.S. Geological Survey, Reston, VA, United States

Edited by: Florin Musat, Helmholtz Centre for Environmental Research (UFZ), Germany
Reviewed by: Jeffrey M. Dick, Central South University, China; Paula J. Mouser, University of New Hampshire, United States
*Correspondence: Jenna L. Shelton, jlshelton@usgs.gov
Specialty section: This article was submitted to Microbiological Chemistry and Geomicrobiology, a section of the journal Frontiers in Microbiology
Received: 21 February 2020
Accepted: 21 August 2020
Published: 11 September 2020
Citation: Shelton JL, Barnhart EP, Ruppert L, Jubb AM, Blondes MS and DeVera CA (2020) Repetitive Sampling and Control Threshold Improve 16S rRNA Gene Sequencing Results From Produced Waters Associated With Hydraulically Fractured Shale. Front. Microbiol. 11:536978. doi: 10.3389/fmicb.2020.536978

Frontiers in Microbiology | www.frontiersin.org | September 2020 | Volume 11 | Article 536978

Sequencing microbial DNA from deep subsurface environments is complicated by a number of issues ranging from contamination to non-reproducible results. Many samples obtained from these environments, which are of great interest due to the potential to stimulate microbial methane generation, contain low biomass. Therefore, samples from these environments are difficult to study as sequencing results can be easily impacted by contamination. In this case, the low amount of sample biomass may be effectively swamped by the contaminating DNA and generate misleading results. Additionally, performing field work in these environments can be difficult, as researchers generally have limited access to and time on site. Therefore, optimizing a sampling plan to produce the best results while collecting the greatest number of samples over a short period of time is ideal. This study aimed to recommend an adequate sampling plan for field researchers obtaining microbial biomass for 16S rRNA gene sequencing, applicable specifically to low biomass oil and gas-producing environments. Forty-nine different samples were collected by filtering specific volumes of produced water from a hydraulically fractured well producing from the Niobrara Shale. Water was collected in two different sampling events 24 h apart. Four to five samples were collected from 11 specific volumes. These samples along with eight different blanks were submitted for analysis. DNA was extracted from each sample, and quantitative polymerase chain reaction (qPCR) and 16S rRNA Illumina MiSeq gene sequencing were performed to determine relative concentrations of biomass and microbial community composition, respectively. The qPCR results varied across sampled volumes, while no discernible trend correlated contamination to volume of water filtered. This suggests that collecting a larger volume of sample may not result in larger biomass concentrations or better representation of a sampled environment. Researchers could prioritize collecting many low volume samples over few high-volume samples. Our results suggest that
there also may be variability in the concentration of microbial communities present in produced waters over short (i.e., hours) time scales, which warrants further investigation. Submission of multiple blanks is also vital to determining how contamination or low biomass effects may influence a sample set collected from an unknown environment.

Keywords: low biomass samples, 16S/18S ribosomal RNA gene analysis, produced water, blanks, hydraulic fracturing

INTRODUCTION

Microbial generation of methane occurs in many terrestrial environments. Recent interest has focused on microbial communities in deep subsurface hydrocarbon reservoirs as they can be stimulated to produce additional natural gas from residual organic material in crude oil, coal, and/or shale reservoirs (Schlegel et al., 2013; Wuchter et al., 2013; Larter et al., 2015; Ritter et al., 2015; Daly et al., 2016; Barnhart et al., 2017). However, these environments typically contain low biomass concentrations due to inherent reservoir characteristics: low concentrations of essential nutrients, high temperatures, brackish to brine salinity conditions, high pressures, and low water drives (e.g., Head et al., 2003; Silva et al., 2013; Cai et al., 2015; Gieg, 2018). Unfortunately, field campaigns to collect samples can be complicated by associated expenses, access to wells from operators, and limited field access. Importantly, most researchers cannot determine parameters such as biomass concentrations prior to completing field sampling of hydrocarbon wells and may be left with samples that are compromised or of low quality. Therefore, understanding the microbial constraints and controls on stimulating methanogenesis is challenging because identifying the microorganisms innate to these environments with field-based studies can be difficult with low biomass concentrations or other sampling issues, such as short time scale (e.g., days) microbial population changes (e.g., Zelaya et al., 2019) and the challenging and complex nature of produced water composition.

Low biomass concentrations have been identified in many environments outside of deep hydrocarbon reservoirs, such as those associated with subsurface sediments (Ogram et al., 1995), carbonate caves (Barton et al., 2006), spacecraft assembly cleanrooms (Vaishampayan et al., 2013), acidic, arsenic-rich creeks (Giloteaux et al., 2010), and subseafloor ocean crust (Santelli et al., 2010). However, studies on how the low-biomass characteristic impacts microbial sequencing are limited (e.g., Salter et al., 2014; Glassing et al., 2016). In these environments, many specific challenges with generating 16S rRNA gene data from sediment, rock, fluid or other materials have been identified. Irreproducible or low-quality DNA extraction is one common barrier to sequencing data from these environments, resulting in unconvincing results. Many researchers are developing tools or methods to deal with low-biomass results, such as modifying DNA extraction techniques (e.g., Webster et al., 2003; Barton et al., 2006), creating filters or other software that target contaminants via bioinformatics (e.g., Minich et al., 2018; Karstens et al., 2019), analyzing non-reproducible data (e.g., Chandler et al., 1997), and attempting to mitigate cross-contamination and contaminant DNA in samples (e.g., Eisenhofer et al., 2019). Laboratory contamination can occur via many routes, including contamination of extraction or PCR reagents and/or materials, surfaces, or human error (Salter et al., 2014; Glassing et al., 2016). Furthermore, variation in sequencing results has been observed across laboratories (e.g., Salter et al., 2014). Therefore, not only can samples from hydrocarbon wells possess low biomass, but they are also susceptible to contamination issues that are magnified by their innate low biomass nature. This means that biomass from contaminants may be proportional to sample biomass in low biomass samples but swamped by sample biomass in high biomass samples.

In this study, we collected biomass by filtering produced water from one hydraulically fractured well producing from the Niobrara Shale in northeastern Colorado. Hydraulic fracturing is a process where water, sand, and other chemicals are injected into a rock at a pressure great enough to fracture it, increase permeability, and stimulate hydrocarbon flow. The goal of the study was to ascertain a suitable sampling protocol for produced waters so that the highest quality data could be obtained in the most efficient way. We filtered specific volumes of water for biomass to determine how field measurements of 16S rRNA gene sequencing results vary across sample volume and whether results from samples of the same volume of filtered water were comparable. The hypothesis was that increasing volumes of water filtered would result in increasing concentrations of biomass collected. We attempted to simulate a situation where biomass concentrations are unknown and standard operating procedures are used to acquire data (e.g., non-low biomass specific DNA extraction methods) so that a researcher could use these results to determine the quality of the resulting 16S rRNA gene sequencing data. The specific research questions for this study were (i) do smaller volumes of sample result in sequentially smaller biomass concentrations; (ii) can field researchers use Cp (crossing point-PCR-cycle) values and blank samples to determine a quality threshold for low biomass samples; and (iii) can an ideal sampling plan be developed for researchers sampling low biomass produced waters. These results may help guide future sampling efforts in low-biomass environments to provide reproducible and quality data.

MATERIALS AND METHODS

Field Methods

Produced water was collected in October 2018 from one hydraulically fractured oil and gas well producing formation water, oil, and gas from the Niobrara B Chalk in the Denver-Julesburg Basin. The well was located in Weld County, Colorado, United States. The operator and exact location of the well are confidential through a Technical Assistance Agreement with the operator. Water was collected from the well separator into six 5-L Nalgene HDPE carboys over a period of approximately 48 h, collecting a total of 30 L. As these carboys were unable to be autoclaved prior to field work, they were cleaned in the field according to USGS protocol by rinsing each carboy three times with sample water prior to filling the carboy to the brim (Graham et al., 2008). Carboy 1 was collected on day one, carboys 2 and 3 were collected concurrently on day two, and carboys 4, 5, and 6 were collected concurrently on day three. When collecting the sample water, each triple rinsed carboy was filled to the brim (i.e., filled with no headspace), and closed tightly until filtration (to limit exposure to the atmosphere). First, we needed to determine the maximum volume of water that could be filtered before the filter clogged, so that a consistent maximum volume could be filtered without clogging.
Sterile Nalgene tubing was inserted into the mouth of a carboy and threaded through a peristaltic pump. A Sterivex GP Filter unit was attached to the other end of the tubing, and the pump was turned on. The filtrate (i.e., the water that passed through the filter) was measured using a graduated cylinder. The pump remained on until the filter clogged, and the volume of filtrate was then measured. The maximum volume of water that could be filtered was, on average, 1083 mL. Therefore, 1000 mL of filtrate was used as the maximum volume for this study.

Fifty-seven filters were collected after filtering varied and specific volumes of filtrate. Volumes were selected that decreased sequentially from 1000 mL in an attempt to simulate changes in biomass concentrations. The following volumes of water were collected: 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, and 0 mL (Figure 1). At least four filters were collected at each given volume using the method described above for determining the maximum volume of water. We attempted to remove any bias or error that may have been generated due to using six different carboys of sample water by randomizing the filters that came from each carboy. For example, the four 1000 mL filters were not all generated by filtering water from the same carboy (see Table 1 for information about which samples came from each of the six carboys). This approach should eliminate any bias introduced by collecting water from different time points (i.e., potential differences in biomass concentrations across the carboys of water would be present across multiple volumes).

FIGURE 1 | Boxplot of qPCR results. Volume of water filtered for each of the 57 samples is compared to the 16S rRNA copies/µL for each sample. The thick line in each box represents the median for each volume while the whiskers extend to roughly a 95% confidence interval. The red trendline has an R² value of 0.0695. Samples are colored based on sample volume.

After a given volume of filtrate was reached, the filter was removed from the tubing, capped, and placed immediately on dry ice. A new sterile filter was then attached to the Nalgene tubing, and a different specific volume of water was filtered through that filter by repeating the above process. The Nalgene carboys were kept well-mixed during the filtration process by physically shaking the carboys. In addition to the samples discussed above, a total of eight different internal sample blanks were collected, four by filtering 2 L of 18.2 MΩ-cm lab-purified water through a Sterivex filter, and four by submitting blank Sterivex filters (opened but unused filters). The blanks were meant to serve as an internal quality control and as a baseline against which any instances of low biomass (i.e., close to the qPCR detection limit) could be compared. All filters were shipped on dry ice to the Argonne Environmental Sample Preparation and Sequencing Facility at Argonne National Laboratory in Lemont, IL, for analysis, and kept at −80°C until extraction. Notably, we could not guarantee that the same amount of DNA would be collected on each Sterivex filter at each filtered volume (i.e., all 1000 mL samples did not necessarily have the exact same biomass concentrations).

TABLE 1 | Sequences per sample and OTUs identified per sample for each sample before and after contaminant removal.

| Volume filtered | Carboy number | Above or below Cp = 30.5 detection limit | Sequences per sample before contaminant removal | Sequences per sample after contaminant removal | Percent difference in sequences per sample after contaminant removal | OTUs per sample before contaminant removal | OTUs per sample after contaminant removal | Percent difference in OTUs per sample after contaminant removal |
|---|---|---|---|---|---|---|---|---|
| 1000 mL | 4 | Above | 51167 | 12366 | 61.1 | 68 | 57 | 8.8 |
| 900 mL | 1 | Above | 39982 | 19410 | 34.6 | 59 | 52 | 6.3 |
| 900 mL | 2 | Above | 12976 | 7668 | 25.7 | 40 | 30 | 14.3 |
| 900 mL | 4 | Above | 85239 | 19520 | 62.7 | 84 | 76 | 5 |
| 800 mL | 5 | Above | 7622 | 4703 | 23.7 | 27 | 18 | 20 |
| 800 mL | 2 | Above | 21589 | 10248 | 35.6 | 48 | 38 | 11.6 |
| 800 mL | 1 | Above | 35871 | 7232 | 66.4 | 56 | 50 | 5.7 |
| 800 mL | 4 | Above | 34555 | 7257 | 65.3 | 55 | 51 | 3.8 |
| 700 mL | 4 | Above | 31501 | 6218 | 67 | 61 | 51 | 8.9 |
| 700 mL | 1 | Above | 11182 | 3849 | 48.8 | 31 | 24 | 12.7 |
| 600 mL | 4 | Above | 24142 | 4996 | 65.7 | 58 | 53 | 4.5 |
| 600 mL | 1 | Above | 14245 | 3980 | 56.3 | 31 | 23 | 14.8 |
| 500 mL | 1 | Above | 13253 | 2554 | 67.7 | 45 | 38 | 8.4 |
| 500 mL | 4 | Above | 34148 | 9727 | 55.7 | 82 | 72 | 6.5 |
| 500 mL | 5 | Above | 14125 | 6317 | 38.2 | 36 | 30 | 9.1 |
| 400 mL | 4 | Above | 13344 | 2349 | 70.1 | 40 | 37 | 3.9 |
| 400 mL | 5 | Above | 50256 | 28207 | 28.1 | 132 | 110 | 9.1 |
| 300 mL | 1 | Above | 25521 | 8880 | 48.4 | 30 | 25 | 9.1 |
| 300 mL | 5 | Above | 22631 | 3556 | 72.8 | 41 | 33 | 10.8 |
| 300 mL | 5 | Above | 1823 | 318 | 70.3 | 23 | 18 | 12.2 |
| 200 mL | 5 | Above | 67754 | 18625 | 56.9 | 56 | 51 | 4.7 |
| 200 mL | 5 | Above | 12712 | 1982 | 73 | 39 | 34 | 6.8 |
| 100 mL | 5 | Above | 89394 | 40948 | 37.2 | 66 | 54 | 10 |
| 100 mL | 5 | Above | 52272 | 10315 | 67 | 34 | 29 | 7.9 |
| 1000 mL | 2 | Below | 20424 | 14398 | 17.3 | 101 | 79 | 12.2 |
| 1000 mL | 3 | Below | 2289 | 642 | 56.2 | 11 | 6 | 29.4 |
| 1000 mL | 6 | Below | 10457 | 6213 | 25.5 | 42 | 30 | 16.7 |
| 900 mL | 6 | Below | 12956 | 8091 | 23.1 | 66 | 52 | 11.9 |
| 900 mL | 6 | Below | 3642 | 1950 | 30.3 | 36 | 26 | 16.1 |
| 800 mL | 6 | Below | 1447 | 572 | 43.3 | 17 | 12 | 17.2 |
| 700 mL | 2 | Below | 6449 | 4394 | 19 | 25 | 16 | 22 |
| 700 mL | 5 | Below | 3554 | 534 | 73.9 | 11 | 5 | 37.5 |
| 700 mL | 3 | Below | 20527 | 12873 | 22.9 | 89 | 62 | 17.9 |
| 600 mL | 5 | Below | 5789 | 1827 | 52 | 14 | 7 | 33.3 |
| 600 mL | 3 | Below | 1535 | 1231 | 11 | 15 | 13 | 7.1 |
| 600 mL | 2 | Below | 25405 | 16577 | 21 | 59 | 47 | 11.3 |
| 500 mL | 3 | Below | 4205 | 2132 | 32.7 | 15 | 11 | 15.4 |
| 500 mL | 6 | Below | 5071 | 2059 | 42.2 | 23 | 11 | 35.3 |
| 400 mL | 3 | Below | 1412 | 131 | 83 | 12 | 4 | 50 |
| 400 mL | 1 | Below | 15585 | 8582 | 29 | 46 | 27 | 26 |
| 400 mL | 3 | Below | 25021 | 11659 | 36.4 | 54 | 34 | 22.7 |
| 300 mL | 3 | Below | 6075 | 2206 | 46.7 | 21 | 13 | 23.5 |
| 300 mL | 2 | Below | 447 | 28 | 88.2 | 4 | 3 | 14.3 |
| 200 mL | 3 | Below | 18672 | 10846 | 26.5 | 52 | 34 | 20.9 |
| 200 mL | 1 | Below | 9187 | 6528 | 16.9 | 28 | 20 | 16.7 |
| 200 mL | 3 | Below | 19241 | 13602 | 17.2 | 58 | 44 | 13.7 |
| 100 mL | 3 | Below | 16867 | 9603 | 27.4 | 55 | 35 | 22.2 |
| 100 mL | 3 | Below | 3503 | 2186 | 23.1 | 52 | 37 | 16.9 |
| 100 mL | 1 | Below | 27365 | 19124 | 17.7 | 64 | 46 | 16.4 |

Cells colored in red indicate a larger percent difference between the two measured values, while green-colored cells indicate a smaller percent change between the two measured values.

Laboratory Methods

Standard qPCR and sample preparation methods were used to reduce bias and enable the development of a methodology for produced water sample collection regardless of prior knowledge of biomass concentrations. DNA was extracted from the Sterivex filters using the Qiagen DNeasy PowerWater Sterivex extraction kit (Cat No./ID: 14600-50-NF) following manufacturer instructions. The extracted DNA was used as template for qPCR and Illumina MiSeq sequencing. Each 20 µL qPCR reaction contained 10 µL of SYBR Green Master Mix, 1 µL of Caporaso et al. (2011) 515F forward primer, 1 µL of Caporaso et al. (2011) 806R reverse primer, 7 µL of PCR pure water, and 1 µL of template DNA loaded into each well. The qPCR conditions were as follows: denaturing DNA at 94°C for 3 min followed by a three step cycle 40 times, 94°C for 45 s, 50°C for 60 s, and 72°C for 90 s. All samples were run in triplicate. Positive controls were run in duplicate to ensure a precise standard curve. The qPCR efficiency averaged approximately 96% across the eight-point standard curve.

A barcoded primer set adapted for Illumina MiSeq was used to produce PCR amplicon libraries targeting the 16S rRNA encoding gene. After PCR optimization, the V4 region of the 16S rRNA gene (515F-806R) was then amplified using PCR with region-specific universal primers (Caporaso et al., 2011), including sequencer adapter sequences used in the Illumina flowcell and a 12 base barcode sequence that supports sample pooling in each lane (Caporaso et al., 2011, 2012). Each PCR reaction contained 9.5 µL of certified DNA-free MoBio PCR water, 12.5 µL of QuantaBio Accustart II PCR ToughMix (2× concentration, 1× final), 1 µL Golay barcode tagged forward primer (5 µM concentration, 200 pM final), 1 µL reverse primer (5 µM concentration, 200 pM final), and 1 µL of template DNA. PCR conditions were denaturing DNA at 94°C for 3 min, with 35 cycles at 94°C for 45 s, 50°C for 60 s, and 72°C for 90 s, and a final extension of 10 min at 72°C to ensure complete amplification. Amplicons were then quantified with a plate reader (Infinite® 200 PRO, Tecan) and PicoGreen (Invitrogen). After quantification, volumes of each product were pooled into a single tube to ensure equimolar amounts of each amplicon. The pool was cleaned using AMPure XP Beads (Beckman Coulter) and quantified using a fluorometer (Qubit, Invitrogen). The molarity of the pool was determined after quantification and diluted down to 2 nM. The pool was denatured and further diluted to a final concentration of 6.75 pM with a 10% PhiX spike for Illumina MiSeq sequencing. Amplicons were sequenced on a 151 base pair × 12 base pair × 151 base pair MiSeq run using customized sequencing primers and procedures.

Resulting Illumina MiSeq data were processed using QIIME2 (Bolyen et al., 2019). Operational Taxonomic Units (OTUs) were mapped at greater than 99% similarity and taxonomy was assigned at the species level. Taxonomic assignments were performed using Silva 132 (Yilmaz et al., 2013) and the dataset was exported to R (R Core Team, 2019) to perform cleaning steps and all statistical analyses. Sequences for each of the 57 samples (including the eight blanks) were scaled to represent percent abundance (i.e., summing all sequences per sample resulted in a value of 100 percent for every sample) so that rarefaction would not occur and limit the dataset by potentially removing OTUs. Sequence reads for each sample were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under Bioproject PRJNA529810. Data, OTU table, taxonomic table, associated metadata, and code used are available in Shelton and DeVera (2019).

Various methods were used to test the effectiveness of laboratory procedures and data quality for the samples post-16S rRNA gene sequencing, as discussed in the results section. Statistical analyses were performed in R (R Core Team, 2019) with base packages, vegan (Oksanen et al., 2019), ggplot2 (Wickham, 2016), reshape2 (Wickham, 2012), RColorBrewer (Neuwirth and Brewer, 2014), and plyr (Wickham, 2009). Resulting data were analyzed based on Argonne National Laboratory internal qPCR blanks and thresholds and submitted sample set blanks. The results of the Illumina MiSeq sequencing run were used to compare communities of microbes identified in all samples collected to look for differences across the entire sample set and between the smaller volumetric subsets (e.g., the five samples at 1000 mL filtered).

RESULTS AND DISCUSSION

Using qPCR to Determine if Smaller Volumes of Sample Result in Sequentially Smaller Biomass Concentrations

Fifty-seven samples including eight external blanks, along with one internal laboratory extraction blank, were analyzed by qPCR using an eight-point calibration curve (not shown). The results of the qPCR analysis were used to compare relative amounts of biomass in each sample collected and across samples with the same volume of filtrate (e.g., to compare the five 400 mL filtered volume samples). This was performed in order to determine if decreasing sample volume correlated with decreasing biomass concentrations and increasing contamination. Triplicate analyses were performed for each sample, the eight blanks, and one laboratory extraction blank, producing three different Cp values per sample which were averaged (Supplementary Table S1). The Cp or CT (threshold cycle) value is the cycle at which the fluorescence achieves a defined threshold and can be useful to understand biomass concentrations in samples. A smaller Cp value is indicative of a larger target expression in a given sample, or more generally, indicative of a larger concentration of targeted DNA per sample. The range of average Cp values for the samples in this study was 25.55 (indicating the largest 16S rRNA copies/µL) for sample JC30 (900 mL) to 40.41 (indicating the sample with the smallest 16S rRNA copies/µL) for sample JC59 (a blank, 0 mL).

Average (not displayed) and median 16S rRNA copies/µL generally displayed a weak trend when compared to filtered volume (Figure 1). The R² value for a linear correlation between sample volume and Cp value was 0.0695. Additionally, because Cp values were not normally distributed across sample volumes, a Kruskal-Wallis rank sum test was used; it confirmed that there was no significant relationship between Cp value and sample volume (p value > 0.05).

As this is quite an unusual result, it could likely be explained by either (1) variability in biomass concentrations (and also contaminants) in produced water during production from a hydrocarbon well (i.e., biomass concentration varies over short periods of time, such as minutes to hours, during production of water, oil, and gas from a well), (2) the presence of PCR inhibitors disproportionally affecting samples of the same volume, or (3) filtered volumes that were too small to detect differences in microbial density. If biomass concentrations could change across short time
scales in hydraulically fractured shale environments, then it is not unreasonable to assume that PCR inhibitor concentrations could change across similar time scales, which could have caused the differences observed in Cp value across identical volumes of sample. PCR inhibitors are chemicals that interfere with the PCR process and are predominantly dissolved or solid organic compounds such as clays, humic acids, phenols, and proteins (Rossen et al., 1992; Abbaszadegan et al., 1993; Ijzerman et al., 1997; Rådström et al., 2004; Schrader et al., 2012). However, previous studies (e.g., Hull et al., 2018; Oetjen et al., 2018) have concluded that the produced water geochemistry of hydraulically fractured shale wells does not change dramatically once in steady state; therefore, PCR inhibitors may also be less variable in concentration once in steady state production. To the authors' knowledge, there are no studies looking at changes in microbial community composition or geochemistry across short (minutes to hours) time scales in mature hydraulically fractured shale wells, so some variability may have been missed in previous studies. Therefore, more investigations should be done to ensure that variability in both water chemistry and biomass does not occur at short time scales in produced water associated with hydraulically fractured shale.

It is also important to note that some of the variability observed in Cp values between samples of the same filtrate volume may be due to the batch of water used during sampling. For instance, carboys 3 and 6 (collected at approximately 0 and 24 h, respectively) only produced samples that had Cp values larger than the suggested detection limit of 30.5. As all of the carboys were identical, it is unlikely that the carboy itself caused these differences. However, carboys one, two and three were all collected minutes apart on day 1 while carboys four, five, and six were collected minutes apart on day 2; this suggests that there may be variation in biomass concentrations in produced waters from shale over very short time scales (e.g., minutes to hours). This will be investigated further in future work.

Can a Quality Threshold for Low Biomass Samples Be Determined Using Cp Values and Field Blanks?

As simply increasing sample volume was not significantly correlated with increasing biomass concentrations, it would then be ideal to determine a given Cp value that could identify low quality (i.e., low biomass) samples. This Cp would serve as a cutoff threshold where samples with Cp values larger than the threshold are always considered "low-biomass" and potentially could be eliminated from sample sets. To identify this Cp value, multiple Cp detection limits were considered as candidate quality thresholds for low biomass samples, based on internal laboratory detection limits and externally submitted blank samples.

The two Cp value thresholds tested were the laboratory's internal QC threshold, and the Cp value generated based on the blank samples submitted for analysis. Argonne National Laboratory provided information as to which of the 57 samples did not amplify above their internal QC threshold (Figure 2). Samples below the laboratory's detection limit were generally lower volume samples (200 mL filtered or less), while 7 of the 8 external blank samples also fell below this detection limit (Figure 2). All samples with filtered volumes of 700 mL and greater were above the laboratory detection limit. However, Cp values did not correlate with this detection limit (Figure 2), as many samples had very similar Cp values but were not similarly classified by detection limit (i.e., two samples with the same Cp value were not both below the laboratory detection limit).

FIGURE 2 | qPCR data visualized with data categorized based on the internal detection limit (library amplification) supplied by the laboratory. Samples are either above (dark blue color) or below (light blue color) the suggested detection limit for library amplification.

That one blank sample was above the laboratory detection limit indicates that (1) this blank sample had a contaminated filter, introducing more biomass than expected; (2) this blank sample was contaminated during the laboratory or analytical processes; and/or (3) this detection limit was not suitable to discern low biomass samples. However, the internal extraction blank submitted by the laboratory had a much larger Cp (i.e., lower value of 16S rRNA copies/µL) than this blank sample, meaning that laboratory, analytical, or background contamination should have been minimal. Therefore, it is likely that this suggested detection limit was inadequate to fully capture poor quality samples and it should not be used as a threshold.

A different detection limit was tested so that every blank sample would fall below it (i.e., all blank samples would be classified as low biomass). The smallest Cp value generated from all eight blanks, Cp = 30.5, was selected as a new detection limit. Samples with an average Cp value greater than that threshold were deemed below the detection limit (n = 33) while those with an average Cp value smaller than that threshold were deemed above the detection limit (n = 26), as shown in Figure 3. There is no trend in volume filtered when compared to samples falling above or below this new threshold (Figure 3). Every volume sampled had at least one sample above and below the threshold (except the blanks), suggesting variability in the composition of the water sampled, widespread contamination, or that all samples collected were impacted by low biomass.

FIGURE 3 | qPCR data visualized with data categorized based on the blank-defined detection limit (Cp = 30.5). Samples are either above (dark blue color) or below (light blue color) the detection limit.

Glassing et al. (2016) found that their low bacterial biomass samples had Cp values equal to or less than those generated for their no template (i.e., negative) controls. These values ranged from 26 to 31 with an average of 29, values that are much smaller than those identified for blanks in this study (Supplementary Table S1). However, many samples in this study had Cp values outside that range; all of the blank samples had an average Cp value greater than 31. This suggests that there may not be one specific Cp value that classifies low biomass conditions or low-quality samples, and that submitting multiple blanks with a sample set, to establish a well vetted Cp threshold, is vital to establishing variation in baseline or non-detect scenarios. Additional work will need to be done to determine if Cp values vary by laboratory, extraction kit, or other circumstances when submitting blanks.

Comparing the Suggested Cp Threshold to 16S rRNA Illumina MiSeq Sequencing Results

To determine if the suggested blank-defined detection limit (Cp value = 30.5) could filter out poor quality samples, the sequencing data generated for these samples were considered. As discussed previously, because every sample in this study was from the same water source over the course of 24 h, the microbial community composition of every sample (excluding the blanks) should be statistically similar to one another, and the blanks should be statistically dissimilar from the actual samples. The extracted DNA from the 57 samples in this sample set was sequenced so that the microbial community composition of each sample could be compared across the sample set and within its respective volume bin (see Supplementary Table S2 for abbreviated taxonomic table or Shelton and DeVera (2019) for full taxonomic table). Results presented below include the eight blank samples.

There were 875 different OTUs identified in the sample set (including the eight blanks), with only one OTU identified in every sample, Escherichia-Shigella sp. No other OTUs were identified in every blank or identified in every non-blank sample. The most abundant OTU in each sample did not necessarily dominate (i.e., was present at greater than 20%) the given sample. For example, the most prominent OTU in a given 700 mL sample was present at 4.4% abundance (Acidobacteria, Subgroup 6). Methanogens, thermophilic and halophilic organisms are present, typical of those identified in other waters produced from hydraulically fractured shales (e.g., Kirk et al., 2012; Murali Mohan et al., 2013; Cluff et al., 2014; Wang et al., 2019). Sample richness (or number of OTUs identified per sample) ranged from four OTUs (300 mL sample) to 132 OTUs (400 mL sample). The Shannon Diversity index (H), a measurement of diversity across a sampled microbial community (e.g., Haegeman et al., 2013), ranged from 4.4 (1000 mL sample) to 0.3 (300 mL sample). In general, the 900 mL filtered samples had the highest richness and the 0 mL filtered samples had the lowest sample richness (Figure 4); no general trend was observed in sample diversity, either in variation or in average values (Figure 4). The median Shannon Diversity index was generally similar for the 1000, 800, 700, 600, 500, and 400 mL samples and, unexpectedly, the blanks. The largest within-volume variation was observed in the 100 and 300 mL samples, possibly suggesting that either these volumes did not capture the representative microbial community of the sampled well and may have been influenced by contaminants or other low biomass artifacts, or the different within-volume samples captured a limited representation of the subsurface microbial community.

A species-based Bray-Curtis distance matrix was visualized using a nonmetric multidimensional scaling plot (NMDS; Figure 5) to determine which samples were most similar to each other (i.e., did samples from the same volume have similar microbial community compositions). An adonis2 test was used to determine if a significant difference in microbial community composition existed between samples above and below the two identified detection limits (laboratory versus submitted blanks). Samples were grouped by either being above or below the library amplification (Figure 5A), or the blank-defined (Cp equal to 30.5; Figure 5B) detection limit. An adonis2 test produced a non-significant p value (>0.05) when samples were grouped using the laboratory detection limit, but produced a significant p value (0.001) when grouped by the blank-determined detection limit.

detection limit tested successfully separated low biomass samples from the sample set within. However, the microbial ecology of produced fluids of hydraulically fractured wells is known to change over the lifetime of the well (Cluff et al., 2014; Evert et al., 2016), although most of the significant change occurs during the flowback period (typically within the first 2 months). Studies suggest that the microbial ecology becomes stable in
mature hydrocarbon-producing wells (e.g., Cluff et al., 2014). This suggests that the blank-determined Cp value was able to Geochemical conditions in established hydrocarbon-producing successfully group samples with significantly different microbial wells where no injection of outside fluids is occurring also do community compositions based solely on biomass concentration. not vary widely between sampled points, specifically, for the Therefore, samples below this detection limit had a significantly Niobrara Shale (e.g., Hull et al., 2018; Oetjen et al., 2018). The different microbial community composition than samples above methods used, such as keeping the sample water well mixed and this detection limit, meaning that any effects that low biomass filtering the water across different volumetric batches instead samples may have had on the sequencing data may be removed of in succession (i.e., sampling 800, 700, 600, 500 mL in a when using this detection limit. batch instead of 800, 800, 800, and 800 mL), were done to Contamination is the most likely reason that the samples reduce variability across batches of water collected over the below the detection limit had a different microbial community 24 h. Therefore, the microbial ecology should be considerably composition than those above. As contamination more strongly stable over the sampling time period given the age of the impacts low biomass samples than non-low biomass samples well. Theoretically, similar diversity and richness across samples (Salter et al., 2014; Eisenhofer et al., 2019; Karstens et al., 2019; should also occur if the sampled water was indeed uniform and Weyrich et al., 2019), the difference in microbial composition unchanging over time, as all filtered water originated from the between these two groups of samples may suggest that the same hydrocarbon well. 
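The quantities discussed above can be illustrated with a minimal Python sketch: a blank-defined Cp threshold (the smallest Cp among the blanks), sample richness, the Shannon Diversity index, and a Bray-Curtis dissimilarity between two samples. This is a sketch under stated assumptions and not the workflow used in this study: all sample names, Cp values, and OTU counts are hypothetical, the function names are illustrative, the natural logarithm is assumed for the Shannon index, and the adonis2 test is not reproduced.

```python
import math


def blank_defined_threshold(blank_cps):
    """Return the smallest Cp value among the blanks (the detection limit)."""
    return min(blank_cps)


def classify_by_cp(sample_cps, threshold):
    """Label samples relative to the detection limit.

    A Cp larger than the threshold is treated as low biomass ("below"),
    following the cutoff rule described in the text.
    """
    return {
        name: "below" if cp > threshold else "above"
        for name, cp in sample_cps.items()
    }


def richness(counts):
    """Number of OTUs with at least one sequence."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon Diversity index H = -sum(p_i * ln p_i); natural log is an assumption."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two OTU count vectors of equal length."""
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


if __name__ == "__main__":
    blanks = [30.5, 31.2, 33.0]                      # hypothetical blank Cp values
    samples = {"S1_500mL": 27.8, "S2_500mL": 32.1}   # hypothetical sample Cp values
    limit = blank_defined_threshold(blanks)
    print(limit, classify_by_cp(samples, limit))

    otus_s1 = [120, 30, 0, 50]                       # hypothetical OTU counts
    otus_s2 = [10, 0, 200, 5]
    print(richness(otus_s1), round(shannon(otus_s1), 3))
    print(round(bray_curtis(otus_s1, otus_s2), 3))
```

Because the threshold is the minimum over the blanks, removing or adding a single blank can move the cutoff and change which samples are labeled "below", which is one reason the number of blanks submitted matters.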
FIGURE 4 | Box plots of Shannon Diversity index and sample richness grouped by volume of sample water filtered.

FIGURE 5 | Nonmetric-multidimensional scaling (NMDS) plots using a Bray-Curtis dissimilarity matrix with data categorized based on detection limit used. Samples are either above (dark blue color) or below (light blue color) the given detection limit. (A) Laboratory internal detection limit; (B) Blank-determined detection limit, Cp value equal to 30.5. Results of an adonis2 test provided on both plots.

When comparing the microbial community composition of samples above and below the threshold defined by the smallest Cp value identified in the blanks, Cp = 30.5, the most obvious difference across the two groups is the abundance of the class Thermotogae in the above detection limit samples and the abundance of Gammaproteobacteria in the samples below the detection limit (Supplementary Figure S1). OTUs of the class Thermotogae are anaerobic, thermophilic, and saccharolytic bacteria that have been associated with thiosulfate reduction to sulfide (Huber et al., 1986), drilling mud in Barnett Shale natural gas wells (Struchtemeyer et al., 2011), hydraulic fracturing flowback water impoundments from the Marcellus Shale (Murali Mohan et al., 2013) and produced waters from oil-producing reservoirs (Salinas et al., 2004; Magot, 2005). Thermotogae is not listed in the low-biomass contaminant database defined by Barton et al. (2006); therefore, its presence or absence may be a good indicator to distinguish between high- and low-quality samples, respectively, in samples for this study. The presence of Escherichia-Shigella sp. largely explains the dominance of Gammaproteobacteria in the below detection limit samples.

16S rRNA Sequencing Contaminant Removal and Testing the Proposed Cp Threshold

As a detection limit has been identified, Cp = 30.5, that seemingly was able to distinguish between low biomass samples and non-low biomass samples, the next step was to remove any contamination to see if the above detection limit samples become more similar to each other, to test whether or not the blank-defined detection limit was able to successfully capture most of the contamination within the dataset. Therefore, any OTU identified in a blank sample was removed from all other samples in this study.

The blank samples had, on average, 8066 different sequences across 156 different observed OTUs. However, most OTUs had a very low average percent abundance (less than 0.1% abundance) with only one OTU present at greater than 5% abundance, Escherichia-Shigella sp., and only 12 OTUs present at greater than 1% abundance: Nitrososphaeraceae sp., Acidobacteria Subgroup 6 uncultured bacterium, Acidimicrobiia IMCC26256 uncultured bacterium, Sporichthyaceae sp., Sediminibacterium sp., uncultured Flexibacter sp., Mucilaginibacter sp., uncultured rumen bacterium from the class Kiritimatiellae, SAR11 Clade Ia sp., Escherichia-Shigella sp., uncultured Chthoniobacteraceae LD29, and unknown Bacteria sequences. Some of these are organisms commonly identified as contaminants in DNA extraction kits and in the generation of Taq polymerase (e.g., Salter et al., 2014; Chen et al., 2015; Glassing et al., 2016). Removing all OTUs identified in the blank samples from the rest of the sample set reduced the total number of OTUs in the remaining 49 samples to 719. The minimum and maximum number of sequences per sample changed from 447 and 89,394, respectively, to 28 and 40,948, respectively. The impact of removing the contaminant OTUs identified in the blank samples can be observed in Table 1.

Samples in Table 1 are organized as either above or below the detection limit of Cp = 30.5 (as discussed in previous sections). Fewer contaminants were present in the samples classified above the Cp detection limit than those below the detection limit, supporting the use of the smallest Cp value generated for the blanks as a good threshold for determining data quality. Theoretically, contamination could affect low biomass and non-low biomass samples in the same way but impacts of that contamination would be greater in low biomass samples, as there is less real sample DNA (e.g., Salter et al., 2014) or the contaminants are a much larger proportion of the sample when below the detection limit. Therefore, one would expect that the number of OTUs removed per sample and/or the relative number of sequences per sample removed due to contamination should be much higher in low biomass samples than in non-low biomass samples.

A study by Karstens et al. (2019) used serial dilutions of a mock community to investigate how biomass concentration and contamination are related. Their experiment found that contamination increased with decreasing starting biomass concentration, or that increasing dilution increased contamination. This is not in agreement with what we observe here; we see no relationship between smaller volumes of water filtered and contamination. Increasing sample volume, which should theoretically be tied to increasing biomass volume, is not related to decreasing contamination or higher-quality samples. However, the samples with Cp values greater than 30.5 do have a greater percentage of their OTUs comprised of contaminant OTUs than those above this detection limit, but the inverse is generally true for contaminant sequences per sample. It appears that natural systems are harder to decipher than mock community dilutions like those presented in Karstens et al. (2019), and that it is not appropriate to assume that collecting a greater volume of sample will result in greater amounts of biomass and thus, fewer impacts from contamination.

When a Bray-Curtis distance matrix of the blank-removed dataset was visualized via an NMDS, significant clustering was observed (Figure 6). The two different detection limit scenarios are illustrated, with the laboratory’s library amplification threshold plotted in Figure 6A and the blank-defined threshold (Cp = 30.5) used as the detection limit in Figure 6B. An adonis2 test produced a significant p value (p = 0.001) for both detection limits even though visually, it appears that the blank-defined threshold more successfully captures the samples with extreme microbial community composition similarity (as many samples plot on top of each other). The samples that cluster near the origin of the plot all have very similar microbial community compositions, which is expected for samples representing the composition of the single well sampled for this study. In Figure 6A, there is no clear clustering of samples based on detection limit even though the two groups cluster significantly according to an adonis2 test. Although the two groups in Figure 6A (above and below the detection limit) are significantly different, the similarity of the samples above the detection limit is greater in Figure 6B, or when Cp = 30.5 is used as the detection limit.

The Role of Blanks in Low Biomass Samples

As the blank-defined detection limit proved to be better at differentiating low biomass samples from a sample set, it is clear that the submission of external blanks is critical when sampling a potentially low biomass environment. Additionally,
Additionally, Frontiers in Microbiology | www.frontiersin.org 10 September 2020 | Volume 11 | Article 536978 fmicb-11-536978 September 11, 2020 Time: 10:27 # 11 Shelton et al. Low Biomass Environment Sampling Reproducibility FIGURE 6 | Non-metric multidimensional scaling plots of a Bray-Curtis distance matrix based on the dataset after blanks and contaminant removal and normalization of sequences per sample. Plots are of the two different detection limits described in the test. (A) Detection limit is based on the laboratory’s internal amplification threshold. (B) Detection limit is based on blank-defined detection limit, Cp = 30.5. Results from an adonis2 test are provided on both plots. the development of this detection limit was highly dependent on though all samples collected were from the same produced the number of blanks submitted with a sample set. Removing water source, the microbial community compositions before and any of these eight blanks could change the Cp value used after contaminant removal varied widely across the sample set. as the detection limit for these scenarios and could therefore This suggested that simply implementing contaminant removal easily change which samples are classified as above or below a techniques for samples may not be enough to prove worthwhile in suggested detection limit. It is therefore critical to submit a large downstream analysis, as many samples in the sample set with the number of blanks when sampling; future work will focus on largest DNA concentrations had significantly similar microbial how many blanks proves most successful in creating a usable Cp community compositions. detection limit. Furthermore, even after contaminant removal, there was Microorganisms Identified in High still variability in microbial community composition across the Biomass Samples remaining 49 samples. 
This suggests that simply removing There were 24 samples that had adequate biomass concentrations contaminants from low biomass samples may not improve the (i.e., fell above the suggested Cp threshold of 30.5). Although samples enough to be worthwhile to include in data analysis. investigating the identified microbial community composition We suggest that it is imperative to differentiate from samples of these 24 samples was not the focus of this study, expanding impacted by their low biomass signature and remove them from on the communities identified would be useful to further sample analysis. characterize the microbes present in the Niobrara Shale. After Overall, these results suggest that increasing sample volume contaminant removal, the major taxa identified in these 24 does not necessarily directly relate to increasing biomass samples were Thermovirga spp. (10.2% average abundance across concentrations (possibly due to increased presence of PCR the 24 samples), uncultured Methanothermobacter (9.2% average inhibitors) or the likelihood for a sample to be close to or abundance), Caldanaerobacter spp. (8.3% average abundance), at the detection limit for qPCR. These results also imply and Thermoanaerobacter spp. (5.7% average abundance). Many that no relationship between sample volume and microbial of the identified OTUs are thermophilic, methanogens, or community composition exist in this sample set. The samples halophilic organisms. most affected by the removal of contaminant OTUs generally The orders Methanobacter (uncultured fell below the detection limit, suggesting that using the smallest Methanothermobacter) and Thermoanaerobacterales Cp value in all submitted blanks as a detection limit may serve (Thermoanaerobacter spp.), and the classes Clostridia as a good metric for weeding out low biomass samples that (Caldanaerobacter spp. and Thermoanaerobacter spp.) and may result in misleading or incorrect data. However, this is Synergistia (Thermovirga spp.) 
have been previously identified in highly dependent on the number of blanks submitted. Even early (i.e., in production for fewer than 100 days) Niobrara Frontiers in Microbiology | www.frontiersin.org 11 September 2020 | Volume 11 | Article 536978 fmicb-11-536978 September 11, 2020 Time: 10:27 # 12 Shelton et al. Low Biomass Environment Sampling Reproducibility produced waters (Hull et al., 2018; Oetjen et al., 2018; data. However, this is highly dependent on the number of Wang et al., 2019). Thermoanaerobacterales are sulfidogenic blanks submitted. Even though all samples collected were from organisms, potentially indicating the presence of sulfate in the same produced water source, the microbial community these produced fluids and the potential for well souring (Davis compositions before and after contaminant removal varied et al., 2012). Methanothermobacter is typically associated with widely across the sample set. This suggested that simply hydrogenotrophic methanogenesis (e.g., Wang et al., 2019), and implementing contaminant removal techniques for samples its presence in these samples could indicate the potential for may not be enough to prove worthwhile in downstream ongoing methane production in the Niobrara Shale that could analysis, as many samples in the sample set with the be stimulated. largest DNA concentrations had significantly similar microbial Similar to Hull et al. (2018) and Oetjen et al. (2018), community compositions. Halanaerobium was not detected in any samples within, perhaps This suggests that researchers may be able to collect many providing further evidence that the microbiology of hydraulically lower volume samples (e.g., 500 mL compared to 1000 mL) and fractured shales are not uniform and may be specific to get the same quality data (i.e., a representative sample), which other formation conditions, such as salinity (Kondash et al., could save time in the field. One could argue that collecting a 2017). Hull et al. 
(2018) also identified an abundance of larger volume of sample over a longer time is needed to fully Methanothermobacter in Niobrara Shale horizontal produced capture a truly representative sample of the produced water fluids at a different location in the DJ Basin, indicating that wide- microbial community, but the results within do not support spread enhancement of methanogenesis across the shale may be that a large sample is necessary. If there is indeed variability possible, and that hydrogenotrophic methanogenesis may be the in the microbiology of produced fluids from shale wells over major metabolic pathway for methane generation. Further work short time periods like those sampled in this study (hours to on the genomics of Niobrara Shale produced waters would be days), collecting many smaller volume samples would capture necessary to confirm this hypothesis. this variability better than a few large volume samples, because The four predominate orders identified in this study were the lower volume samples also represent smaller points in time. not identified in any of the samples below the blank-defined Additionally, this could also suggest that it is more prudent detection limit. This indicates two things: (i) that these four to obtain multiple samples in collected and composite time- classes could be used as indicator species for waters produced integrated water samples. from late-time series wells in the Niobrara Shale, and (ii) that, When collecting samples of unknown biomass concentrations, again, the established blank-determined threshold was valid we recommend simply collecting multiple lower volume samples for use in this study. The organisms identified in abundance over a few large volume samples. 
This is because potential in the samples above the low biomass threshold are typically variability in biomass across short time scales in a production well identified in extremely similar environments (i.e., the Niobrara may be more adequately captured in smaller volume samples. Shale). Our analysis indicates these microorganisms are present We suggest submitting multiple blanks and using the smallest in mature steady-state formation water long after the flowback Cp value as a cutoff for usable data in downstream analyses. period. Although the organisms identified in the low biomass Simply submitting one blank or only using an extraction blank samples can be loosely tied to oil and gas production, the may not be adequate to account for any contamination or four communities discussed above were not abundant in those underlying variation in sequencing results due to low biomass samples. Therefore, this provides further evidence that low conditions. Setting a conservative detection limit using a large biomass samples must be screened and that this study may number of internal and external blanks is the key to obtaining represent an adequate sampling plan to capture organisms reliable data. If all Cp values are below the defined threshold present in low biomass environments. The microbial ecology suggested by blank submission, other methods such as sample of formation water associated with the Niobrara Shale will be pooling (i.e., taking multiple samples from a sample site and expanded on in future research. pooling the extracted DNA from those samples into one sample) may be required to overcome the limitations of low biomass Sampling Plan Recommendations settings. Additional research on the reproducibility of multiple Theoretically, increasing the volume of water sampled should low volume samples compared to a few large volume samples in increase the volume of biomass collected, but this relationship environments outside produced water is necessary. 
Testing this was not observed in this study. Overall, these results suggest hypothesis on shotgun metagenomic data would also be useful that increasing sample volume does not necessarily directly for future studies as well as investigating changes in microbial relate to increasing biomass concentrations or the likelihood community composition over short time periods (hours to days) for a sample to be close to or at the detection limit for in mature oil and gas wells. qPCR. No relationship between sample volume and microbial community composition exist in this sample set. The samples most affected by the removal of contaminant OTUs generally DATA AVAILABILITY STATEMENT fell below the detection limit, suggesting that using the smallest Cp value in all submitted blanks as a detection limit may Sequence reads for each sample were deposited in the National serve as a good metric for filtering out low biomass samples, Center for Biotechnology Information (NCBI) Sequence Read where their inclusion may result in misleading or incorrect Archive (SRA) under BioProject PRJNA529810. Frontiers in Microbiology | www.frontiersin.org 12 September 2020 | Volume 11 | Article 536978 fmicb-11-536978 September 11, 2020 Time: 10:27 # 13 Shelton et al. Low Biomass Environment Sampling Reproducibility AUTHOR CONTRIBUTIONS and two reviewers for their thoughtful reviews that greatly increased the quality of this manuscript. We also thank JS developed the research question. JS, LR, and AJ devised the the energy company that provided site access for this research plan. JS, MB, and CD performed field work. JS and study under a Technical Assistance Agreement. Any use of EB performed data analysis and interpretations. All authors trade, firm, or product names is for descriptive purposes drafted the manuscript. only and does not imply endorsement by the United States Government. FUNDING This project was funded by the U.S. 
Geological Survey’s Energy SUPPLEMENTARY MATERIAL Resource Program (Walter Guidroz, Program Coordinator). The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb. ACKNOWLEDGMENTS 2020.536978/full#supplementary-material We thank Adam Mumford and Daniel Hayba for comments TABLE S1 | Taxa identified in the raw (blanks included) dataset. Taxa are and discussions that sharpened our thinking and presentation, presented to the class level only. REFERENCES and recommendations. Trends Microbiol. 27, 105–117. doi: 10.1016/j.tim.2018. 11.003 Abbaszadegan, M., Huber, M. S., Gerba, C. P., and Pepper, I. L. (1993). Detection Evert, M., Panescu, J., Daly, R. A., Welch, S. A., Hespen, J., Sharma, S., et al. (2016). of enteroviruses in groundwater with the polymerase chain reaction. Appl. “Temporal changes in fluid biogeochemistry and microbial cell abundance after Environ. Microbiol. 59, 1318–1324. doi: 10.1128/aem.59.5.1318-1324.1993 hydraulic fracturing in marcellus shale,” in AAPG Eastern Section Meeting, Barnhart, E. P., Davis, K. J., Varonka, M., Orem, W., Cunningham, A. B., Ramsay, Lexington, KY. B. D., et al. (2017). Enhanced coal-dependent methanogenesis coupled with Gieg, L. M. (2019). “Microbial communities in oil shales, biodegraded and algal biofuels: potential water recycle and carbon capture. Int. J. Coal Geol. 171, heavy oil reservoirs, and bitumen deposits,” in Microbial Communities 69–75. doi: 10.1016/j.coal.2017.01.001 Utilizing Hydrocarbons and Lipids: Members, Metagenomics and Ecophysiology. Barton, H. A., Taylor, N. M., Lubbers, B. R., and Pemberton, A. C. (2006). Handbook of Hydrocarbon and Lipid Microbiology, ed. T. McGenity (Cham: DNA extraction from low-biomass carbonate rock: an improved method with Springer). doi: 10.1007/978-3-030-14785-3_4 reduced contamination and the low-biomass contaminant database. J. Micro. Giloteaux, L., Goni-Urriza, M., and Duran, R. (2010). 
Nested PCR and new primers Meth. 66, 21–31. doi: 10.1016/j.mimet.2005.10.005 for analysis of sulfate-reducing bacteria in low-cell-biomass environments. Bolyen, E., Rideout, J. R., Dillon, M. R., Bokulich, N. A., Abnet, C. C., Al- Appl. Environ. Microbiol. 76, 2856–2865. doi: 10.1128/aem.02023-09 Ghalith, G. A., et al. (2019). Reproducible, interactive, scalable and extensible Glassing, A., Dowd, S. E., Galandiuk, S., Davis, B., and Chiodini, R. J. (2016). microbiome data science using QIIME 2. Nat. Biotech. 37, 852–857. Inherent bacterial DNA contamination of extraction and sequencing reagents Cai, M., Jiménez, N., Krüger, M., Guo, H., Jun, Y., Straaten, N., et al. (2015). may affect interpretation of microbiota in low bacterial biomass samples. Gut Potential for aerobic and methanogenic oil biodegradation in a water flooded Path. 8:24. oil field (Dagang oil field). Fuel 141, 143–153. doi: 10.1016/j.fuel.2014.10.035 Graham, J. L., Loftin, K. A., Ziegler, A. C., and Meyer, M. T. (2008). Cyanobacteria Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Huntley, J., Fierer, in Lakes and Reservoirs—Toxin and Taste-and-Odor Sampling Guidelines (ver. N., et al. (2012). Ultra-high-throughput microbial community analysis on the 1.0): U.S. Geological Survey Techniques of Water-Resources Investigations, Illumina HiSeq and MiSeq platforms. ISME J. 6, 1621–1624. doi: 10.1038/ismej. Book 9, Chap. A7, section 7.5. Avaliable at: http://pubs.water.usgs.gov/twri9A/ 2012.8 (accessed January 17, 2019). Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Lozupone, C. A., Haegeman, B., Hamelin, J., Moriarty, J., Neal, P., Dushoff, J., and Weitz, J. S. (2013). Turnbaugh, P. J., et al. (2011). Global patterns of 16S rRNA diversity at a depth Robust estimation of microbial diversity in theory and in practice. ISME J. 7, of millions of sequences per sample. Pro. Nat. Acad. Sci. U.S.A. 108, 4516–4522. 1092–1101. 
doi: 10.1038/ismej.2013.10 doi: 10.1073/pnas.1000080107 Head, I. M., Jones, D. M., and Larter, S. R. (2003). Biological activity in the deep Chandler, D. P., Fredrickson, J. K., and Brockman, F. J. (1997). Effect of subsurface and the origin of heavy oil. Nature 426, 344–352. doi: 10.1038/ PCR template concentration on the composition and distribution of total nature02134 community 16S rDNA clone libraries. Mol. Ecol. 6, 475–482. doi: 10.1046/j. Huber, R., Langworthy, T. A., König, H., Thomm, M., Woese, C. R., Sleytr, U. B., 1365-294x.1997.00205.x et al. (1986). Thermotoga maritima sp. nov. represents a new genus of unique Chen, S., Zheng, X., Cao, H., Jiang, L., Liu, F., and Sun, X. (2015). A simple and extremely thermophilic eubacteria growing up to 90 C. Arch. Microbiol. 144, efficient method for extraction of Taq DNA polymerase. J. Biotech. 18, 343–346. 324–333. doi: 10.1007/bf00409880 Cluff, M. A., Hartsock, A., MacRae, J. D., Carter, K., and Mouser, P. J. (2014). Hull, N. M., Rosenblum, J. S., Robertson, C. E., Harris, J. K., and Linden, K. G. Temporal changes in microbial ecology and geochemistry in produced water (2018). Succession of toxicity and microbiota in hydraulic fracturing flowback from hydraulically fractured Marcellus Shale gas wells. Env. Sci. Tech. 48, and produced water in the denver–julesburg basin. Sci. Total Envin. 644, 6508–6517. doi: 10.1021/es501173p 183–192. doi: 10.1016/j.scitotenv.2018.06.067 Daly, R. A., Borton, M. A., Wilkins, M. J., Hoyt, D. W., Kountz, D. J., Wolfe, R. A., Ijzerman, M. M., Dahling, D. R., and Fout, G. S. (1997). A method to remove et al. (2016). Microbial metabolisms in a 2.5-km-deep ecosystem created by environmental inhibitors prior to the detection of waterborne enteric viruses hydraulic fracturing in shales. Nat. Microbiol. 1:16146. by reverse transcription-polymerase chain reaction. J. Virol. Meth. 63, 145–153. Davis, J. P., Struchtemeyer, C. G., and Elshahed, M. S. (2012). 
Bacterial doi: 10.1016/s0166-0934(96)02123-4 communities associated with production facilities of two newly drilled Karstens, L., Asquith, M., Davin, S., Fair, D., Gregory, W. T., Wolfe, A. J., thermogenic natural gas wells in the Barnett Shale (Texas. USA). Micro. Ecol. et al. (2019). Controlling for contaminants in low-biomass 16S rRNA gene 64, 942–954. doi: 10.1007/s00248-012-0073-3 sequencing experiments. mSystems 4, 290–319. Eisenhofer, R., Minich, J. J., Marotz, C., Cooper, A., Knight, R., and Weyrich, L. S. Kirk, M. F., Martini, A. M., Breecker, D. O., Colman, D. R., Takacs-Vesbach, (2019). Contamination in low microbial biomass microbiome studies: issues C., and Petsch, S. T. (2012). Impact of commercial natural gas production Frontiers in Microbiology | www.frontiersin.org 13 September 2020 | Volume 11 | Article 536978 fmicb-11-536978 September 11, 2020 Time: 10:27 # 14 Shelton et al. Low Biomass Environment Sampling Reproducibility on geochemistry and microbiology in a shale-gas reservoir. Chem. Geol. 332, Schrader, C., Schielke, A., Ellerbroek, L., and Johne, R. (2012). PCR inhibitors– 15–25. doi: 10.1016/j.chemgeo.2012.08.032 occurrence, properties and removal. J. App. Microbio. 113, 1014–1026. doi: Kondash, A. J., Albright, E., and Vengosh, A. (2017). Quantity of flowback and 10.1111/j.1365-2672.2012.05384.x produced waters from unconventional oil and gas exploration. Sci. Total. Shelton, J. L., and DeVera, C. A. (2019). Low Biomass Microbiology Samples Environ. 574, 314–321. doi: 10.1016/j.scitotenv.2016.09.069 Collected From a Hydraulically Fractured Well Producing From the Niobrara Larter, S. R., Head, I. M., Jones, D. M., Erdmann, M., and Wilhelms, A. (2015). U.S. Shale in Colorado. Reston, VA: U.S. Geological Survey Data release, doi: 10. Patent No. 9,068,107. Washington, DC: U.S. Patent and Trademark Office. 5066/P9D9ZOGU Magot, M. (2005). “Indigenous microbial communities in oil fields,” in Petroleum Silva, T. R., Verde, L. C. L., Neto, E. 
S., and Oliveira, V. M. (2013). Diversity analyses Microbiology, eds B. Ollivier, and M. Magot, (Washington, DC: ASM Press), of microbial communities in petroleum samples from Brazilian oil fields. Int. 21–34. doi: 10.1128/9781555817589.ch2 Biodeter. Biodeg. 81, 57–70. doi: 10.1016/j.ibiod.2012.05.005 Minich, J. J., Zhu, Q., Janssen, S., Hendrickson, R., Amir, A., Vetter, R., Struchtemeyer, C. G., Davis, J. P., and Elshahed, M. S. (2011). Influence of the et al. (2018). KatharoSeq enables high-throughput microbiome analysis drilling mud formulation process on the bacterial communities in thermogenic from low-biomass samples. mSystems 3:e00218-17. doi: 10.1128/msystems.002 natural gas wells of the Barnett Shale. Appl. Environ. Microbiol. 77, 4744–4753. 18-17 doi: 10.1128/aem.00233-11 Murali Mohan, A., Hartsock, A., Hammack, R. W., Vidic, R. D., and Gregory, Vaishampayan, P., Probst, A. J., La Duc, M. T., Bargoma, E., Benardini, K. B. (2013). Microbial communities in flowback water impoundments from J. N., Andersen, G. L., et al. (2013). New perspectives on viable microbial hydraulic fracturing for recovery of shale gas. FEMS Micro. Eco. 86, 567–580. communities in low-biomass cleanroom environments. ISME J. 7, 312–324. doi: 10.1111/1574-6941.12183 doi: 10.1038/ismej.2012.114 Neuwirth, E., and Brewer, R. C. (2014). ColorBrewer palettes. R Package Version, Wang, H., Lu, L., Chen, X., Bian, Y., and Ren, Z. J. (2019). Geochemical and 1–1. microbial characterizations of flowback and produced water in three shale oil Oetjen, K., Chan, K. E., Gulmark, K., Christensen, J. H., Blotevogel, J., Borch, and gas plays in the central and western United States. Water Res. 164:114942. T., et al. (2018). Temporal characterization and statistical analysis of flowback doi: 10.1016/j.watres.2019.114942 and produced waters and their potential for reuse. Sci. Tot. Env. 619, 654–664. Webster, G., Newberry, C. J., Fry, J. C., and Weightman, A. J. (2003). 
Assessment doi: 10.1016/j.scitotenv.2017.11.078 of bacterial community structure in the deep sub-seafloor biosphere by 16S Ogram, A., Sun, W., Brockman, F. J., and Fredrickson, J. K. (1995). Isolation and rDNA-based techniques: a cautionary tale. J. Microbiol. Meth. 55, 155–164. characterization of RNA from low-biomass deep-subsurface sediments. Appl. doi: 10.1016/s0167-7012(03)00140-4 Environ. Microbiol. 61, 763–768. doi: 10.1128/aem.61.2.763-768.1995 Weyrich, L. S., Farrer, A. G., Eisenhofer, R., Arriola, L. A., Young, J., Selway, C. A., Oksanen, J., Blanchet, G., Kindt, R., Legendre, P., Minchin, P. R., and O’Hara, R. B. et al. (2019). Laboratory contamination over time during low-biomass sample (2019). vegan: Community Ecology Package. R Package Version 2.3–5. analysis. Mole. Ecol. Resour. 19, 982–996. doi: 10.1111/1755-0998.13011 R Core Team, (2019). R: A Language and Environment for Statistical Computing. Wickham, H. (2009). plyr: Tools for Splitting, Applying and Combining Data. R Vienna: R Core Team. Package Version 0.1, 9, 651. Rådström, P., Knutsson, R., Wolffs, P., Lövenklev, M., and Löfström, C. (2004). Wickham, H. (2012). reshape2: Flexibly Reshape Data: A Reboot of the Reshape Pre-PCR processing. Mol. Biotech. 26, 133–146. doi: 10.1385/mb:26:2:133 Package. R Package Version, Vol. 1. Ritter, D., Vinson, D., Barnhart, E., Akob, D. M., Fields, M. W., Cunningham, Wickham, H. (2016). ggplot2: Elegant graphics for Data Analysis. Berlin: Springer. A. B., et al. (2015). Enhanced microbial coalbed methane generation: a review of Wuchter, C., Banning, E., Mincer, T., Drenzek, N. J., and Coolen, M. J. (2013). research, commercial activity, and remaining challenges. Int. J. Coal Geol. 146, Microbial diversity and methanogenic activity of antrim shale formation waters 28–41. doi: 10.1016/j.coal.2015.04.013 from recently fractured wells. Front. Microbiol. 4:367. doi: 10.3389/fmicb.2013. Rossen, L., Nørskov, P., Holmstrøm, K., and Rasmussen, O. F. (1992). 
Inhibition of 00367 PCR by components of food samples, microbial diagnostic assays and DNA- Yilmaz, P., Parfrey, L. W., Yarza, P., Gerken, J., Pruesse, E., Quast, C., et al. (2013). extraction solutions. Int. J. Food Microbiol. 17, 37–45. doi: 10.1016/0168- The SILVA and “all-species living tree project (LTP)” taxonomic frameworks. 1605(92)90017-w Nucleic Acids Res. 42, D643–D648. Salinas, M. B., Fardeau, M. L., Cayol, J. L., Casalot, L., Patel, B. K., Thomas, P., Zelaya, A. J., Parker, A. E., Bailey, K. L., Zhang, P., Van Nostrand, J., Ning, D., et al. (2004). Petrobacter succinatimandens gen. nov., sp. nov., a moderately et al. (2019). High spatiotemporal variability of bacterial diversity over short thermophilic, nitrate-reducing bacterium isolated from an Australian oil well. time scales with unique hydrochemical associations within a shallow aquifer. Int. J. Syst. Evol. Microbiol. 54, 645–649. doi: 10.1099/ijs.0.02732-0 Water Res. 164:114917. doi: 10.1016/j.watres.2019.114917 Salter, S. J., Cox, M. J., Turek, E. M., Calus, S. T., Cookson, W. O., Moffatt, M. F., et al. (2014). Reagent and laboratory contamination can critically impact Conflict of Interest: The authors declare that the research was conducted in the sequence-based microbiome analyses. BMC Biol. 12:87. doi: 10.1186/s12915- absence of any commercial or financial relationships that could be construed as a 014-0087-z potential conflict of interest. Santelli, C. M., Banerjee, N., Bach, W., and Edwards, K. J. (2010). Tapping the subsurface ocean crust biosphere: low biomass and drilling-related Copyright © 2020 Shelton, Barnhart, Ruppert, Jubb, Blondes and DeVera. This is an contamination calls for improved quality controls. Geomicrobiol. J. 27, 158–169. open-access article distributed under the terms of the Creative Commons Attribution doi: 10.1080/01490450903456780 License (CC BY). The use, distribution or reproduction in other forums is permitted, Schlegel, M. E., McIntosh, J. C., Petsch, S. 
T., Orem, W. H., Jones, E. J., and Martini, provided the original author(s) and the copyright owner(s) are credited and that the A. M. (2013). Extent and limits of biodegradation by in situ methanogenic original publication in this journal is cited, in accordance with accepted academic consortia in shale and formation fluids. App. Geochem. 28, 172–184. doi: practice. No use, distribution or reproduction is permitted which does not comply 10.1016/j.apgeochem.2012.10.008 with these terms. Frontiers in Microbiology | www.frontiersin.org 14 September 2020 | Volume 11 | Article 536978