Microb Ecol DOI 10.1007/s00248-014-0377-6

SOIL MICROBIOLOGY

Land Use Type Significantly Affects Microbial Gene Transcription in Soil

Heiko Nacke & Christiane Fischer & Andrea Thürmer & Peter Meinicke & Rolf Daniel

Received: 31 October 2013 / Accepted: 29 January 2014
© Springer Science+Business Media New York 2014

Abstract

Soil microorganisms play an essential role in sustaining biogeochemical processes and cycling of nutrients across different land use types. To gain insights into microbial gene transcription in forest and grassland soil, we isolated mRNA from 32 sampling sites. After sequencing of generated complementary DNA (cDNA), a total of 5,824,229 sequences could be further analyzed. We were able to assign nonribosomal cDNA sequences to all three domains of life. A dominance of bacterial sequences, which were affiliated to 25 different phyla, was found. Bacterial groups capable of aromatic compound degradation such as Phenylobacterium and Burkholderia were detected in significantly higher relative abundance in forest soil than in grassland soil. Accordingly, KEGG pathway categories related to degradation of aromatic ring-containing molecules (e.g., benzoate degradation) were identified in high abundance within forest soil-derived metatranscriptomic datasets. The impact of the land use type forest on community composition and activity is evidently caused to a high degree by the presence of wood breakdown products. Correspondingly, bacterial groups known to be involved in lignin degradation and containing ligninolytic genes such as Burkholderia, Bradyrhizobium, and Azospirillum exhibited increased transcriptional activity in forest soil. Higher solar radiation in grassland presumably induced increased transcription of photosynthesis-related genes within this land use type. This is in accordance with the high abundance of photosynthetic organisms and plant-infecting viruses in grassland.

H. N. and C. F. contributed equally to this work.

Electronic supplementary material The online version of this article (doi:10.1007/s00248-014-0377-6) contains supplementary material, which is available to authorized users.

H. Nacke · C. Fischer · A. Thürmer · R. Daniel (*)
Department of Genomic and Applied Microbiology and Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Grisebachstr. 8, 37077 Göttingen, Germany
e-mail: [email protected]

P. Meinicke
Department of Bioinformatics, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Grisebachstr. 8, 37077 Göttingen, Germany

Present Address:
C. Fischer
BiK-F Biodiversität und Klima Forschungszentrum, Biodiversity Exploratories, Senckenberg Gesellschaft für Naturforschung, Senckenberganlage 25, 60325 Frankfurt, Germany

Introduction

Soils are considered to represent the most diverse microbial habitat worldwide with respect to species diversity and community size. They probably harbor the highest level of prokaryotic diversity and abundance of any environment on Earth [1]. One gram of soil can contain up to 10⁹ microbial cells and an estimated 2,000 to 18,000 different bacterial genomes [2, 3]. Microorganisms colonizing soil mediate nearly all biogeochemical cycles in terrestrial ecosystems, play a key role in soil formation, and are responsible for most nutrient transformations in soil, thereby influencing the aboveground plant diversity and productivity [4]. Large datasets on soil-inhabiting microorganisms have been gathered by DNA-based analysis of full-length or partial 16S ribosomal RNA (rRNA) and 18S rRNA genes [5–7]. In addition, factors driving soil microbial community composition and diversity have been detected. It has been reported that land use and plant species, as well as soil characteristics such as pH, organic carbon content, and soil texture, can induce soil microbial community shifts [8–12]. Despite these studies on soil microbial community composition and the importance of soil microorganisms for ecosystem function, we understand little about how environmental differences, e.g., in land use type, affect the functions of soil microbial communities. Theoretically, metatranscriptomic analyses based on isolation and sequencing of soil messenger RNA (mRNA; complementary DNA (cDNA)) allow whole-genome transcription profiling of soil microbial communities. Nevertheless, soil metatranscriptomic approaches are a technological challenge, as extraction of high-quality soil mRNA is difficult. Total RNA extracts contain a low amount of mRNA, accounting for only 1 to 5 % [13]. In addition, prokaryotic mRNA molecules are highly unstable toward RNases, are not polyadenylated, and have short half-lives [14, 15]. To assess microbial gene transcription profiles based on soil-derived mRNA, the application of pyrosequencing allows the generation of tens to hundreds of thousands of sequences with a high number of long reads (≥500 bp). This increases the likelihood of accurate annotation of gene fragments using various databases [16]. Only four pyrosequencing-based metatranscriptome analyses of soils have been published. Samples for two of these studies were derived from a German sandy lawn or a North American pine/hemlock forest [17, 18]. In these studies, 21,133 and 445,479 reads, respectively, matching protein-coding genes were obtained. Furthermore, microcosm experiments investigating the effect of phenanthrene on soil microbial communities and analyses of high-Arctic peat soil were performed [19, 20]. Sanger sequencing-based transcriptional activity analyses of paddy soil bacteria and eukaryotic microorganisms in forest soil were also carried out [15, 21, 22]. In addition to assessment of the soil metatranscriptome, expression studies of individual genes such as pmoA were performed [23]. So far, next-generation sequencing-based surveys investigating transcriptional activity responses of soil microbial communities to varying land use type have not been published.
In a previous amplicon-based 16S rRNA gene survey of the samples used in this study, we discovered that land use type is a major driver of bacterial community composition and diversity [11]. To verify and evaluate the effect of land use type on a functional level, we applied metatranscriptomics to analyze microbial gene transcription in forest and grassland soils. Replicates of forest sites (n = 15) and grassland sites (n = 17) were selected for soil sampling. Soil samples were derived from the German Biodiversity Exploratories Hainich-Dün and Schwäbische Alb [24]. For each soil sample, we enriched mRNA and used it for the synthesis of cDNA, which was subsequently sequenced and analyzed. We assessed statistically significant land use type-induced effects on taxonomically and functionally analyzed soil metatranscriptomes.

Materials and Methods

Soil Sampling

The sampling sites are part of the German Biodiversity Exploratories initiative [24] and are located in the Hainich region (Thuringia) and the Schwäbische Alb (Baden-Württemberg). Soil samples from the A horizons were derived from 17 grassland sites (nine from Hainich and eight from Schwäbische Alb) and from 15 forest sites (six from Hainich and nine from Schwäbische Alb). A description of the different sampling sites is provided in Supplementary Table S1. Sampling was performed in spring 2008 as described by Will et al. [10] and Nacke et al. [11]. Field work permits were kindly issued by the responsible state environmental offices of Baden-Württemberg and Thuringia. The samples were frozen in liquid nitrogen and stored at −80 °C prior to extraction of RNA.

RNA Extraction and Enrichment of mRNA

Total microbial community RNA was extracted from approximately 7 g of soil per sample by employing the PowerSoil™ total RNA isolation kit (MoBio Laboratories, Carlsbad, CA, USA) according to the manufacturer’s instructions. Subsequently, total RNA extracts were treated with DNase using the TURBO DNA-free kit (Applied Biosystems, Darmstadt, Germany) to remove remaining traces of DNA. DNase-treated RNA was purified and concentrated by using the RNeasy MinElute Cleanup kit as recommended by the manufacturer (Qiagen, Hilden, Germany). The presence of remaining DNA was tested by PCR using the 16S rRNA gene as a target for amplification. The following primer set was employed: V4F_517_17/907R (5′-AGGCAGCAGTGGGGAAT-3′ [25] and 5′-CCGTCAATTCCTTTRAGTTT-3′ [26]). The PCR reaction mixture (25 μl) contained 2.5 μl of 10-fold Mg-free Taq polymerase buffer (Thermo Fisher Scientific, Inc., Waltham, MA, USA), 200 μM of each of the four deoxynucleoside triphosphates, 1.75 mM MgCl2, 0.2 μM of each primer, 1.5 U of Taq DNA polymerase (Thermo Fisher Scientific, Inc.), and purified RNA sample as template.
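As an illustration of the arithmetic behind the reaction mixture described above, the following minimal sketch converts the stated final concentrations into pipetting volumes via C1·V1 = C2·V2. The final concentrations and the 25 μl reaction volume are taken from the text. All stock concentrations (10 mM per dNTP, 25 mM MgCl2, 10 μM primers, 5 U/μl Taq polymerase) and the 1 μl template volume are assumptions made purely for illustration, because the text does not report them.

```python
TOTAL_UL = 25.0  # reaction volume stated in the text


def volume_from_stock(final_conc, stock_conc, total_ul=TOTAL_UL):
    """Stock volume (ul) that yields final_conc in total_ul (C1*V1 = C2*V2).

    Both concentrations must be given in the same unit.
    """
    return final_conc * total_ul / stock_conc


# Final concentrations are from the text; every stock concentration is ASSUMED.
mix_ul = {
    "10x Mg-free buffer": 2.5,                          # volume stated in the text
    "dNTP mix": volume_from_stock(200.0, 10_000.0),     # uM; assumed 10 mM of each dNTP
    "MgCl2": volume_from_stock(1.75, 25.0),             # mM; assumed 25 mM stock
    "forward primer": volume_from_stock(0.2, 10.0),     # uM; assumed 10 uM stock
    "reverse primer": volume_from_stock(0.2, 10.0),     # uM; assumed 10 uM stock
    "Taq polymerase": 1.5 / 5.0,                        # 1.5 U from an assumed 5 U/ul stock
}

template_ul = 1.0  # ASSUMED; the template volume is not given in the text
water_ul = TOTAL_UL - sum(mix_ul.values()) - template_ul

for component, volume in mix_ul.items():
    print(f"{component:20s} {volume:5.2f} ul")
print(f"{'template (assumed)':20s} {template_ul:5.2f} ul")
print(f"{'water':20s} {water_ul:5.2f} ul")
```

With other stock concentrations only the second argument of `volume_from_stock` changes; the water volume always fills the reaction up to 25 μl.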
The following thermal cycling scheme was used: initial denaturation at 95 °C for 2 min, followed by 30 cycles of denaturation at 95 °C for 1.5 min, annealing at 55 °C for 1 min, and extension at 72 °C for 1 min. The final extension was carried out at 72 °C for 10 min. Enriched mRNA was obtained from total RNA using the Ribominus transcriptome isolation kit (Invitrogen GmbH, Karlsruhe, Germany) following the manufacturer’s protocol with one modification: the denaturation of the RNA was performed at 70 °C for 10 min. The enriched mRNA was purified and concentrated by using the RNeasy MinElute cleanup kit (Qiagen). Total RNA and enriched mRNA were analyzed by using a NanoDrop ND-1000 UV–vis Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and an Agilent 2100 Bioanalyser (Agilent Technologies, Böblingen, Germany) according to the protocols of the manufacturers.

Synthesis of cDNA and Pyrosequencing

The purified mRNA was reverse-transcribed to cDNA by using random primer pd(N)6 (Roche, Mannheim, Germany) and the SuperScript™ double-stranded cDNA synthesis kit (Invitrogen) as recommended by the manufacturer. The preparation of cDNA libraries for pyrosequencing was conducted as described by the manufacturer (Roche). Sequencing was performed by employing a 454 GS-FLX pyrosequencer (Roche) and Titanium chemistry. The pyrosequencing-derived sequence datasets have been deposited in the GenBank short-read archive as part of projects SRP003237 and SRP002807.

Removal of Residual rRNA Contamination Within Pyrosequencing-Derived Datasets

We used the hmm_rRNA script [27] as implemented within the WebMGA server [28] to remove ribosomal sequences from the pyrosequencing-derived datasets. The script is based on a HMMER [29] search against the 5S ribosomal RNA database [30] and the European ribosomal RNA database [31]. In a second approach, all generated cDNA sequences were compared with the SILVA ribosomal RNA database release 104 [32] using BLASTN. BLAST results with a bit score of >50 were considered positive hits and were removed from the datasets.

Taxonomic and Functional Analyses of cDNA Sequences

All remaining nonribosomal cDNA sequences were queried using BLASTX against the NCBI nonredundant protein database (http://www.ncbi.nlm.nih.gov/). The BLASTX result was analyzed with MEGAN [33, 34] version 4 using default settings. This program allows taxonomic as well as functional investigation of metatranscriptomic data. The taxonomic analysis was performed based on the NCBI taxonomy.
We used the t test and, for nonparametric data, the Mann–Whitney U (M-W-U) test to compare relative abundances of bacterial groups within the present study as well as between this survey and previous studies, using the software package PAST [35]. For functional analyses of cDNA sequences, the Kyoto Encyclopedia of Genes and Genomes (KEGG) [36] integrated in MEGAN4 was used. Significant differences between gene transcription patterns in forest and grassland were tested with t tests and, for nonparametric data, M-W-U tests by using STAMP v2.0.0 [37] and PAST [35].
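The nonparametric comparison of relative abundances between the two land use types can be illustrated with a minimal sketch of the Mann–Whitney U test. This is not the implementation used by PAST or STAMP. It uses average ranks for ties and a plain normal approximation without tie or continuity correction, which is a simplifying assumption. The relative abundance values are invented for illustration and are not data from this study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of values, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided P) using a normal approximation (no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# HYPOTHETICAL relative abundances (%) of one taxon per sampling site.
forest = [2.1, 1.8, 2.6, 3.0, 2.4]
grassland = [0.9, 1.2, 0.7, 1.5, 1.1]

u_stat, p_value = mann_whitney_u(forest, grassland)
print(f"U = {u_stat}, P = {p_value:.4f}")
```

For small sample sizes an exact test is preferable to the normal approximation; established statistics packages provide one.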

Comparative Functional Profiling with CoMet

The CoMet Web server [38] provides a platform for comparative functional profiling based on the identification of protein domains in metagenomic and metatranscriptomic sequencing reads. The fast UFO domain detection engine [39] and a speed-optimized version of the Orphelia ORF finder [40] are combined to compute protein domain frequencies in uploaded multi-FASTA files according to the Pfam A database (release 24) [41]. These frequencies are used for a pairwise statistical comparison of different samples. Statistical significance is indicated by a Bonferroni-corrected P value, which is based on a normal approximation of the binomial distribution used for testing. As a general significance threshold, we used the default setting of CoMet (0.05). For evaluation of the pooled metatranscriptomic data in this study, we used the sorted lists of significant Pfams and Gene Ontology (GO) terms together with the associated P values [38].
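To make the kind of test described above concrete, the following sketch compares domain frequencies between two samples with a pooled two-proportion z-test (a normal approximation of the binomial distribution) and applies a Bonferroni correction across all tested domains. The exact test statistic used by CoMet is not specified here, so this formulation is an assumption. The domain names and counts are hypothetical.

```python
import math


def two_proportion_p(k1, n1, k2, n2):
    """Two-sided P value of a pooled two-proportion z-test.

    k1 of n1 and k2 of n2 are the domain counts and total domain hits
    in sample 1 and sample 2, respectively.
    """
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (k1 / n1 - k2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# HYPOTHETICAL domain counts in two pooled samples.
forest_counts = {"domain_A": 100, "domain_B": 40, "domain_C": 860}
grassland_counts = {"domain_A": 50, "domain_B": 45, "domain_C": 905}

n_forest = sum(forest_counts.values())
n_grassland = sum(grassland_counts.values())

domains = sorted(forest_counts)
raw = [
    two_proportion_p(forest_counts[d], n_forest, grassland_counts[d], n_grassland)
    for d in domains
]
corrected = bonferroni(raw)

ALPHA = 0.05  # significance threshold named in the text
for d, p in zip(domains, corrected):
    flag = "significant" if p < ALPHA else "not significant"
    print(f"{d}: corrected P = {p:.3g} ({flag})")
```

The Bonferroni correction is conservative when many domains are tested, which reduces false positives at the cost of sensitivity.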

Results and Discussion

General Characteristics of the Soil-Derived Metatranscriptomic Dataset

In this study, we assessed microbial gene transcription in forest and grassland soils by pyrosequencing-based analysis of cDNA generated from enriched mRNA. Total RNA was extracted from 32 soil samples overall and enabled recovery of mRNA. The sampling sites were located in central and southern Germany and covered replicates of forest (n=15) and grassland (n=17) (Supplementary Table S1). The RNA yield in the forest sites ranged from 2.9 to 13.8 μg g⁻¹ soil and in the grassland sites from 0.38 to 11.8 μg g⁻¹ soil. Currently, only a few soil metatranscriptomic studies and corresponding RNA yields are available. Approximately 0.36 μg RNA g⁻¹ soil was extracted from a monospecific Pinus pinaster forest planted on a stabilized coastal sand dune in South West France [21]. In addition, after sampling several soil types, including organic compost and garden topsoil from a private household and sugarcane soil, McGrath et al. [42] obtained a yield of 2.3 μg RNA g⁻¹ soil. To minimize the introduction of amplification-based biases, RNA or cDNA amplification was not carried out. This approach results in a more accurate reflection of original microbial transcriptional activities in soil. Currently, RNA amplification has been avoided in only two other pyrosequencing-based metatranscriptome analyses of soil [17, 20]. A total of 5,824,229 reads with an average length of 516 bases was obtained in the present study. To remove remaining rRNA sequences, we compared two filtering methods: searching the 5S ribosomal RNA database [30] and the European ribosomal RNA database [31] for sequence homologies by employing the hmm_rRNA script [27, 28] and BLAST against the SILVA ribosomal RNA database [32]. The total number of reads of every sample and the number of reads remaining after filtering are listed in Supplementary Table S2. Initial filtering based on the hmm_rRNA script removed 78 % of the reads from the dataset, whereas filtering against the SILVA ribosomal RNA database removed 84 % of the reads. We decided to continue working with the SILVA-filtered data, accounting for a total of 947,485 nonribosomally derived sequences (503,591 from grasslands and 443,894 from forests). Although we might lose potential protein-encoding reads by employing the stricter filtering method, the risk of working with false positive protein hits was reduced.

Taxa Contributing to Microbial Gene Transcription in Forest and Grassland Soil

Based on matches in the NCBI nonredundant protein database, nonribosomal cDNA sequences could be assigned to all three domains of life. Bacterial cDNA sequences were most abundant (90.3 %), followed by eukaryotic (9.2 %) and archaeal cDNA sequences (0.5 %) (Fig. 1). The bacterial cDNA sequences were affiliated to 25 phyla (Fig. 2) and 3 candidate divisions (NC10, TM7, and WWE1). The dominant bacterial phyla across all forest and grassland soil samples were Proteobacteria, Actinobacteria, Verrucomicrobia, Acidobacteria, Planctomycetes, Bacteroidetes, Chloroflexi, Firmicutes, Cyanobacteria, and Gemmatimonadetes, representing 47.69, 12.32, 11.70, 10.80, 5.42, 4.87, 1.79, 1.72, 1.44, and 0.87 % of all bacterial sequences, respectively (Supplementary Table S3). These phyla are typically encountered in soil and were also present in a meta-analysis of 32 16S rRNA gene libraries derived from a variety of soils including forest and grassland samples [43]. Bacterial community compositions of forest and grassland soil samples analyzed in this study were previously assessed by amplicon-based 16S rRNA gene surveys [10, 11].
All phyla detected in these surveys were also found in the forest and grassland metatranscriptomes of the present study. Additionally, Chlorobi (0.16 %), Chlamydiae (0.07 %), Lentisphaerae (0.06 %), Aquificae (0.05 %), Synergistetes (0.04 %), Thermotogae (0.03 %), Deferribacteres (0.01 %), Dictyoglomi (0.01 %), and Elusimicrobia (0.01 %) cDNA sequences were discovered in the forest and grassland soil metatranscriptomes in low relative abundances (Fig. 2; Supplementary Table S3). Except Aquificae, Synergistetes, and Elusimicrobia, these phyla were also identified within 193,219 rRNA-tags derived from a sandy soil ecosystem [17]. The most abundant bacterial phylum, Proteobacteria, comprised six different classes in this study. Alphaproteobacteria were most abundant (55.39 %), followed by Deltaproteobacteria (17.77 %), Betaproteobacteria (15.58 %), Gammaproteobacteria (10.86 %), Epsilonproteobacteria (0.24 %), and Zetaproteobacteria (0.04 %). Within the Eukaryota, the majority of sequences originated from the Fungi/Metazoa group (71.2 % of all eukaryotic sequences) (Fig. 1). The remaining eukaryotic sequences were assigned to the Viridiplantae (17.5 %) and the protist groups Alveolata, Amoebozoa, Cryptophyta, Euglenozoa, Fornicata, Heterolobosea, Parabasalia, Rhizaria, and Stramenopiles, which altogether represent 11.3 %. Consistently, in a recent Sanger sequencing survey on the diversity of genes expressed by eukaryotes in French forest soils, the annotated cDNA sequences of the opisthokont group (essentially Fungi and Metazoa) also dominated (60–61 %) [22]. Sequences of the least abundant domain in our metatranscriptomic dataset, Archaea, were affiliated to the Euryarchaeota (52.89 %), Thaumarchaeota (32.08 %), Crenarchaeota (14.31 %), and Korarchaeota (0.71 %) (Fig. 1). In addition to sequences belonging to one of the three domains of life, a number of viral cDNA sequences were obtained (2,955 sequences) (Fig. 1).

Land Use Type Effects on Taxonomic Community Profiles

Comparisons of forest and grassland soil microbial communities based on 16S rRNA and 18S rRNA genes have been carried out in previous surveys [8, 9, 11]. It has been shown that soil microbial taxa such as Alphaproteobacteria can vary significantly between the two land use types forest and grassland [11]. However, it has not been investigated if soil-derived nonribosomal cDNAs that have been taxonomically assigned can also be used to show statistically significant variations between these two land use types. With respect to bacterial phyla and proteobacterial classes, Acidobacteria and Alphaproteobacteria showed strong significant variations between forest and grassland soils (P
