TY - JOUR
T1 - A Preliminary Metagenome Analysis Based on a Combination of Protein Domains
AU - Igarashi, Yoji
AU - Mori, Daisuke
AU - Mitsuyama, Susumu
AU - Yoshitake, Kazutoshi
AU - Ono, Hiroaki
AU - Watanabe, Tsuyoshi
AU - Taniuchi, Yukiko
AU - Sakami, Tomoko
AU - Kuwata, Akira
AU - Kobayashi, Takanori
AU - Ishino, Yoshizumi
AU - Watabe, Shugo
AU - Gojobori, Takashi
AU - Asakawa, Shuichi
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This work was supported by CREST (Core Research for Evolutional Science and Technology) of the Japan Science and Technology Corporation (JST).
PY - 2019/4/29
Y1 - 2019/4/29
N2 - Metagenomic data have mainly been analyzed by showing the composition of organisms based on a small, well-characterized part of the genomic sequence, such as ribosomal RNA genes and mitochondrial DNA. In contrast, whole metagenomic data obtained by shotgun sequencing have seldom been fully analyzed through homology searches because the genomic data available in databases for living organisms on earth are insufficient. To complement the results obtained through homology-search-based methods with shotgun metagenomic data, we focused on the composition of protein domains deduced from the sequences of genomes and metagenomes, and we used these compositions to characterize genomes and metagenomes, respectively. First, we compared the relationships based on similarities in protein domain composition with the relationships based on sequence similarities. We searched for the protein domains of 325 bacterial species using the Pfam database. Next, we examined the correlation coefficients of protein domain compositions between every pair of bacteria. Every pairwise genetic distance was also calculated from 16S rRNA or DNA gyrase subunit B. We compared the results of these methods and found a moderate correlation between them. Essentially the same results were obtained when we used random partial 100 bp DNA sequences of the bacterial genomes, which simulated raw sequence data obtained from short-read next-generation sequencing. We then applied the method to actual environmental data obtained by shotgun sequencing. The method showed that a transition of the microbial phase occurred because of the seasonal change in water temperature. These results demonstrate the usability of the method for characterizing metagenomic data based on protein domain compositions.
AB - Metagenomic data have mainly been analyzed by showing the composition of organisms based on a small, well-characterized part of the genomic sequence, such as ribosomal RNA genes and mitochondrial DNA. In contrast, whole metagenomic data obtained by shotgun sequencing have seldom been fully analyzed through homology searches because the genomic data available in databases for living organisms on earth are insufficient. To complement the results obtained through homology-search-based methods with shotgun metagenomic data, we focused on the composition of protein domains deduced from the sequences of genomes and metagenomes, and we used these compositions to characterize genomes and metagenomes, respectively. First, we compared the relationships based on similarities in protein domain composition with the relationships based on sequence similarities. We searched for the protein domains of 325 bacterial species using the Pfam database. Next, we examined the correlation coefficients of protein domain compositions between every pair of bacteria. Every pairwise genetic distance was also calculated from 16S rRNA or DNA gyrase subunit B. We compared the results of these methods and found a moderate correlation between them. Essentially the same results were obtained when we used random partial 100 bp DNA sequences of the bacterial genomes, which simulated raw sequence data obtained from short-read next-generation sequencing. We then applied the method to actual environmental data obtained by shotgun sequencing. The method showed that a transition of the microbial phase occurred because of the seasonal change in water temperature. These results demonstrate the usability of the method for characterizing metagenomic data based on protein domain compositions.
UR - http://hdl.handle.net/10754/652833
UR - https://www.mdpi.com/2227-7382/7/2/19
UR - http://www.scopus.com/inward/record.url?scp=85066760394&partnerID=8YFLogxK
U2 - 10.3390/proteomes7020019
DO - 10.3390/proteomes7020019
M3 - Article
C2 - 31035705
SN - 2227-7382
VL - 7
SP - 19
JO - Proteomes
JF - Proteomes
IS - 2
ER -