Summary of Research Achievements |
The influence of interactions with other organisms, particularly the soil microbiota, is gaining attention as a crucial factor in cultivation management. The soil microbiota plays a vital role in nutrient cycling, plant growth and health, and overall soil quality. Next-generation sequencing enables large-scale genomic (metagenomic) and functional analysis of the soil microbiota. To capture the heterogeneous patterns of individual-to-individual variability in microbiome data, I introduced stochastic variational variable selection (SVVS), which identifies a minimal core set of representative microbial species. This core set significantly improved the performance of the clustering method, considerably reduced the computational burden, and captured biological variability. This methodology was published in the journal Microbiome (IF: 16.837).
Currently, I am proposing a novel framework, integrative stochastic variational variable selection (I-SVVS), which extends the stochastic variational variable selection for high-dimensional microbiome data described in my previous paper. I-SVVS assigns a specific Bayesian mixture model to each type of omics data, namely an infinite Dirichlet multinomial mixture (DMM) model for microbiome data and an infinite Gaussian mixture model for metabolomic data, to improve the accuracy and computational time of the clustering process. The method can also identify a critical set of representative variables in multi-omics microbiome data. I demonstrate I-SVVS on three large datasets that integrate microbiome and metabolome data from soybean, mice, and humans.
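As a rough illustration of the count-data component mentioned above, the following sketch evaluates the Dirichlet-multinomial likelihood that a DMM component is built on and uses it to compute cluster membership probabilities for one sample. It is not the I-SVVS implementation: the number of components is fixed at two rather than infinite, the parameters are hand-picked rather than learned by stochastic variational inference, and the names `dm_log_pmf`, `responsibilities`, `weights`, and `alphas` are hypothetical.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial
    distribution with concentration parameters `alpha`."""
    n = sum(counts)
    a = sum(alpha)
    # Terms that depend only on the totals.
    log_p = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    # Per-taxon terms.
    for x, al in zip(counts, alpha):
        log_p += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return log_p


def responsibilities(counts, weights, alphas):
    """Posterior probability of each mixture component for one sample,
    given fixed (assumed) mixing weights and component parameters."""
    log_terms = [
        math.log(w) + dm_log_pmf(counts, a) for w, a in zip(weights, alphas)
    ]
    # Log-sum-exp normalisation for numerical stability.
    m = max(log_terms)
    unnorm = [math.exp(t - m) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Toy example: one sample with counts for three taxa and two hand-picked
# components (illustrative values only, not taken from any dataset).
sample = [8, 1, 1]
weights = [0.5, 0.5]
alphas = [[10.0, 1.0, 1.0], [1.0, 1.0, 10.0]]
resp = responsibilities(sample, weights, alphas)
print(resp)
```

In this toy setting the sample is dominated by the first taxon, so the first component, whose parameters favour that taxon, receives the larger membership probability.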
|
Strategy for Future Research Activity |
Although I have proposed several solutions to important challenges in microbiome multi-omics analysis, I-SVVS is not free of limitations. The model focuses on optimizing the contribution of the microbiome data, which can significantly improve its performance in the joint analysis of multi-omics datasets. Although the DMM approach is the best mixture model for microbiome count data, it cannot efficiently model count data from other omics datasets. Future extensions of I-SVVS may address this problem by developing and integrating specific Bayesian mixture models for other omics data, such as metatranscriptomic RNA sequencing (MT) and shotgun mass spectrometry-based metaproteomics (MP), into its framework. Additionally, variations in the structure of omics data, such as an imbalance in the number of features across omics datasets, affect the stability and optimal performance of I-SVVS. Future developments could take this point into account. Finally, although I-SVVS successfully identified a small number of vital features from the different omics datasets, it is difficult to infer the interactions among the selected features. Therefore, there is room for future extensions that capture the important relationships across omics more effectively than correlation does.
|