Project/Area Number |
22KJ0656
|
Project Number under Single-year Grants |
21J21850 (2021-2022)
|
Research Category |
Grant-in-Aid for JSPS Fellows
|
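Allocation Type | Multi-year Fund (2023) Single-year Grants (2021-2022) |
Application Category | Domestic |
Review Section |
Basic Section 39010: Science in plant genetics and breeding-related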
|
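Research Institution | The University of Tokyo |
Principal Investigator |
Dang Tung, The University of Tokyo, Graduate School of Agricultural and Life Sciences, JSPS Research Fellow (DC1)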
|
Project Period (FY) |
2023-03-08 – 2024-03-31
|
Project Status |
Granted (FY2023)
|
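Budget Amount *Note |
2,200 thousand yen (Direct Cost: 2,200 thousand yen)
FY2023: 700 thousand yen (Direct Cost: 700 thousand yen)
FY2022: 700 thousand yen (Direct Cost: 700 thousand yen)
FY2021: 800 thousand yen (Direct Cost: 800 thousand yen)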
|
Keywords | integrative analysis / variational inference / Bayesian model / variable selection / drought irrigation / microbiome / stochastic optimization / metabolome |
Outline of Research at the Start |
High-dimensional multiomics microbiome data play an important role in elucidating the interactions of microbial communities with their hosts and environment in critical diseases and ecological changes. I develop a novel framework that extends stochastic variational variable selection for high-dimensional microbiome data. My approach assigns a specific Bayesian mixture model to each type of omics data in order to improve the accuracy and computational time of the clustering process. I demonstrate the integration of microbiome and metabolome data from soybean, mice, and humans.
|
Outline of Annual Research Achievements |
The influence of interactions with other organisms, particularly the soil microbiota, is gaining attention as a crucial factor in cultivation management. The soil microbiota plays a vital role in nutrient cycling, plant growth and health, and overall soil quality. Next-generation sequencing enables large-scale genomic (metagenomic) and functional analysis of the soil microbiota. To capture the heterogeneous patterns of individual-to-individual variability in microbiome data, I introduced stochastic variational variable selection (SVVS), which identifies a minimal core set of representative microbial species. This selection significantly improved the performance of the clustering method, considerably reduced the computational burden, and captured biological variability. This methodology was published in the journal Microbiome (IF: 16.837).
Currently, I propose a novel framework, integrative stochastic variational variable selection (I-SVVS), which extends the SVVS approach for high-dimensional microbiome data from my previous paper. The I-SVVS approach assigns a specific Bayesian mixture model to each type of omics data, i.e., an infinite Dirichlet multinomial mixture (DMM) model for microbiome data and an infinite Gaussian mixture model for metabolomic data, to improve the accuracy and computational time of the clustering process. The method can also identify a critical set of representative variables in multiomics microbiome data. I demonstrate I-SVVS on three large datasets that integrate microbiome and metabolome data from soybean, mice, and humans.
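The idea of pairing each omics type with its own likelihood can be illustrated with a minimal Python sketch. It is not the I-SVVS implementation: it assumes a finite mixture with fixed, known parameters in place of the infinite mixtures and stochastic variational inference, it assumes that both data types share one cluster label per sample, and it uses a diagonal-covariance Gaussian. All function and parameter names are hypothetical.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for x, al in zip(counts, alpha):
        out += math.lgamma(x + al) - math.lgamma(x + 1) - math.lgamma(al)
    return out


def diag_gaussian_logpdf(values, means, variances):
    """Log-density of a vector under a Gaussian with diagonal covariance."""
    out = 0.0
    for v, m, s2 in zip(values, means, variances):
        out += -0.5 * (math.log(2 * math.pi * s2) + (v - m) ** 2 / s2)
    return out


def responsibilities(counts, values, weights, alphas, means, variances):
    """Posterior cluster probabilities for one sample.

    Assumption (not from the source): microbiome counts and metabolite
    values of a sample share a single cluster label, so their
    log-likelihoods are simply added within each cluster.
    """
    log_post = [
        math.log(w)
        + dirichlet_multinomial_logpmf(counts, a)
        + diag_gaussian_logpdf(values, m, s2)
        for w, a, m, s2 in zip(weights, alphas, means, variances)
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

Calling `responsibilities([30, 2], [0.1], [0.5, 0.5], [[10.0, 1.0], [1.0, 10.0]], [[0.0], [5.0]], [[1.0], [1.0]])` returns two probabilities that sum to one and strongly favor the first cluster, because both the taxon counts and the metabolite value are better explained by its parameters.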
|
Current Status of Research Progress |
2: Research is progressing largely as planned
Reason
Thanks to JSPS research funding in 2022, I continued my experiments at Tottori University to collect microbiome and metabolome datasets in summer and winter. I purchased equipment to improve the performance of the computers used to analyze the high-dimensional microbiome multiomics datasets. I attended a number of Japanese and international bioinformatics conferences to report my novel methodology and new results on the analysis of microbiome datasets. I published a paper in the journal Microbiome (IF: 16.837). I have written a new paper that reports a novel methodology and new results for the integrated analysis of microbiome-metabolome datasets. I will submit this manuscript and attend several conferences to report the new results.
|
Strategy for Future Research Activity |
Although I have proposed several solutions to important challenges in microbiome multi-omics analysis, I-SVVS is not free of limitations. The model focuses on optimizing the contribution of the microbiome data, which can significantly improve its performance in the joint analysis of multi-omics datasets. Although the DMM approach is the best mixture model for the count data in microbiome datasets, it cannot efficiently model count data from other omics datasets. Future extensions of I-SVVS may address these problems by developing specific Bayesian mixture models for other omics data, such as metatranscriptomic RNA sequencing (MT) and shotgun mass spectrometry-based metaproteomics (MP), and integrating them into its framework. Additionally, variations in the structure of omics data, such as an imbalance in the number of features across omics datasets, affect the stability and optimal performance of I-SVVS. Future developments could take this point into account. Finally, although I-SVVS successfully identified a small number of vital features in the different omics datasets, it is difficult to infer the interactions among the selected features. Therefore, there is room for future extensions that capture the important relationships across omics more efficiently than correlation does.
|