Outline of Research at the Start |
High-dimensional multi-omics microbiome data play an important role in elucidating how microbial communities interact with their hosts and environments in critical diseases and ecological changes. I develop a novel framework that extends stochastic variational variable selection for high-dimensional microbiome data. My approach assigns a specific Bayesian mixture model to each type of omics data in order to improve the accuracy and reduce the computational time of the clustering process. I demonstrate the integration of microbiome and metabolome data from soybean, mice, and humans.
|
Outline of Research Achievements |
In my Ph.D. research (DC1), the methodologies of Random Forest with Forward Variable Selection (RF-FVS), Stochastic Variational Variable Selection (SVVS), and Integrative Stochastic Variational Variable Selection (I-SVVS) provide crucial tools for multi-omics microbiome data analysis. These approaches address challenges such as high dimensionality, computational efficiency, and feature selection. Although these methods have proven useful for analyzing 16S ribosomal RNA microbiome datasets, they have limitations. Short sequencing read lengths, potential sequencing errors, and variability arising from the choice of sequencing region can limit the accuracy and comprehensiveness of taxonomic profiles. Future extensions of these methodologies will involve comprehensive analyses integrating diverse host databases, including the host genome, transcriptome, proteome, and metabolome, alongside microbiome databases covering the whole metagenome, metatranscriptome, and metaproteome. This broader scope will accommodate various types of data, including multicategory data (e.g., copy number states: loss/normal/gain), binary data (e.g., mutation status), and count data (e.g., sequencing data).
|