Surveys of microbial diversity and function rely on combining complementary molecular tools. Among them, metagenomics is a PCR-free approach that captures all genetic information from microbial assemblages and is now performed at a relatively large scale and reasonable cost, mostly based on very short reads. Here we investigated the potential of metagenomics to provide taxonomic reports of marine microbial eukaryotes. We prepared a curated database with reference sequences of the V4 region of 18S rDNA clustered at 97% similarity and used this database to extract and classify metagenomic reads. More than half of these reads were unambiguously affiliated with a single reference, whilst the rest could be assigned to a given taxonomic group. The overall diversity reported by metagenomics was similar to that obtained by amplicon sequencing of the V4 and V9 regions of the 18S rRNA gene, although one or both of these amplicon surveys performed poorly for groups such as Excavata, Amoebozoa, Fungi and Haptophyta. We then studied the diversity of picoeukaryotes and nanoeukaryotes using 91 metagenomes from surface down to bathypelagic layers in different oceans, unveiling a clear taxonomic separation between size fractions and depth layers. Finally, we retrieved long rDNA sequences from assembled metagenomes that improved phylogenetic reconstructions of particular groups. Overall, this study shows that metagenomics is an excellent resource for taxonomic exploration of marine microbial eukaryotes.
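The read classification described above (a unique reference for some reads, a broader taxonomic group for the rest) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that each read comes with a list of database hits, that a read is "unambiguous" when one reference has the single highest identity, and that tied reads fall back to the deepest taxonomic rank shared by the tied references. The names `Hit`, `classify_read`, `common_lineage`, and the `min_identity` cutoff of 97 are hypothetical choices made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    """One match of a metagenomic read against a V4 reference (hypothetical record)."""
    reference_id: str
    identity: float            # percent identity between read and reference
    lineage: Tuple[str, ...]   # taxonomy ordered from broad to specific


def common_lineage(lineages: List[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Return the leading ranks that all lineages share."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return tuple(shared)


def classify_read(
    hits: List[Hit], min_identity: float = 97.0  # assumed cutoff, not from the source
) -> Tuple[str, Optional[str], Tuple[str, ...]]:
    """Classify one read as 'unique', 'group', or 'unclassified'."""
    kept = [h for h in hits if h.identity >= min_identity]
    if not kept:
        return ("unclassified", None, ())
    best = max(h.identity for h in kept)
    top = [h for h in kept if h.identity == best]
    if len(top) == 1:
        # A single best reference: the read is unambiguously affiliated.
        return ("unique", top[0].reference_id, top[0].lineage)
    # Several equally good references: report only the taxonomy they share.
    return ("group", None, common_lineage([h.lineage for h in top]))
```

Under these assumptions, a read that matches a dinoflagellate and a ciliate reference equally well would be reported at the level of Alveolata rather than forced onto either reference.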
Bibliographical note
KAUST Repository Item: Exported on 2020-04-23
Acknowledged KAUST grant number(s): OSR #3362
Acknowledgements: This research was supported by the Spanish Ministry of Economy and Competitiveness projects Malaspina-2010 (CSD2008-00077) and ALLFLAGS (CTM2016-75083-R) and by King Abdullah University of Science and Technology (KAUST) under contract OSR #3362. AO was supported by a Spanish FPI grant. We thank all the scientists and crew who participated in the Malaspina 2010 expedition. We also thank Javier del Campo for useful discussions. Bioinformatic analyses were performed at the Marbits platform (ICM-CSIC; https://marbits.icm.csic.es).