
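Elucidation of Unknown Metabolic Functions and Creation of Environmental Engineering Innovation through the Fusion of Environmental Genomics and Machine Learning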

Research Project

Project/Area Number 20F20346
Research Category

Grant-in-Aid for JSPS Fellows

Allocation Type Single-year Grants
Section Overseas
Review Section Basic Section 22060:Environmental systems for civil engineering-related
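Research Institution National Institute of Advanced Industrial Science and Technology
Host Researcher 延 優  National Institute of Advanced Industrial Science and Technology, Department of Life Science and Biotechnology, Researcher (40805644)
Foreign Research Fellow LENG LING  National Institute of Advanced Industrial Science and Technology, Department of Life Science and Biotechnology, Foreign Special Researcher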
Project Period (FY) 2020-11-13 – 2023-03-31
Project Status Granted (Fiscal Year 2021)
Budget Amount
¥2,200,000 (Direct Cost: ¥2,200,000)
Fiscal Year 2021: ¥1,100,000 (Direct Cost: ¥1,100,000)
Fiscal Year 2020: ¥600,000 (Direct Cost: ¥600,000)
Keywords Metagenomics / Machine learning / Genome annotation / Metabolism
Outline of Research at the Start

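Microorganisms are used in a variety of industries, but the functions of many microorganisms found in nature remain unknown, and they therefore cannot yet be applied industrially. We will use machine learning to discover unknown genes and unknown functions. To investigate them in greater depth, we will build laboratory-scale wastewater treatment processes, in which diverse unknown functions and unknown microorganisms emerge, and study unknown genes that mediate interactions within microbial consortia as well as novel metabolic functions that operate across multiple microorganisms.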

Outline of Annual Research Achievements

A comprehensive genome database was successfully constructed, and a machine-learning-based pipeline has been developed to mine bacterial genomes for novel functions. Preliminary analyses have identified potential metabolic networks that have been overlooked in both cultured and uncultured bacteria. We have further explored what proportion of genes are core and specific to each pathway, and which protein families tend to be non-essential or non-specific. Collectively, these results will serve as foundational criteria for identifying genes that share the same function and contribute to a specific pathway.
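One way to picture the "core" and "specific" criteria is as two simple proportions per protein family. The sketch below is illustrative only: the report does not give its definitions, so the function `core_and_specificity`, the genome records, and the family names `F1` to `F3` are all hypothetical. Here "coreness" is assumed to mean the fraction of pathway-carrying genomes that encode the family, and "specificity" the fraction of family-carrying genomes that also carry the pathway.

```python
def core_and_specificity(family, genomes):
    """Return (coreness, specificity) of a protein family for one pathway.

    Assumed definitions (not taken from the report):
      coreness    = genomes with pathway AND family / genomes with pathway
      specificity = genomes with pathway AND family / genomes with family
    """
    with_pathway = [g for g in genomes if g["has_pathway"]]
    carriers = [g for g in genomes if family in g["families"]]
    both = [g for g in carriers if g["has_pathway"]]
    coreness = len(both) / len(with_pathway) if with_pathway else 0.0
    specificity = len(both) / len(carriers) if carriers else 0.0
    return coreness, specificity


# Hypothetical toy data: four genomes, two of which carry the pathway.
genomes = [
    {"has_pathway": True, "families": {"F1", "F2"}},
    {"has_pathway": True, "families": {"F1"}},
    {"has_pathway": False, "families": {"F2"}},
    {"has_pathway": False, "families": {"F3"}},
]

f1_scores = core_and_specificity("F1", genomes)  # present in every pathway genome, nowhere else
f2_scores = core_and_specificity("F2", genomes)  # present in half of each group
```

Under these assumed definitions, a family such as `F1` would count as both core and specific, while `F2` would be a candidate for the non-essential or non-specific group.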

Current Status of Research Progress

2: Research has progressed on the whole more than it was originally planned.

Reason

The project is proceeding as expected.

Strategy for Future Research Activity

The next step is to discover new biological pathways that contain the identified essential and specific proteins. With each core gene represented as one dimension, a similarity-based learning approach (i.e., the nearest-neighbor algorithm) will be adopted to group the sequence windows that contain these genes. Subsequently, the category of each potential pathway-associated window will be predicted from its similarity (distance in this feature space) to known aromatic compound degradation pathways. This allows us to identify candidate sequence windows for putative aromatic compound degradation pathways. In preliminary results, xenobiotics-related genes consistently showed distinct phylogenetic behavior (tight clustering and confinement to specific habitats) compared with genes associated with the degradation of natural aromatic compounds. Using this trend, we can further differentiate pathways related to xenobiotics from those related to natural compounds. To strengthen the connection between machine learning and pathway prediction, we will use thermodynamics to estimate the feasibility and directionality of predicted biochemical reactions.
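The nearest-neighbor step described above can be sketched roughly as follows. This is a minimal illustration, not the project's pipeline: the gene names in `CORE_GENES`, the labels, the presence/absence encoding, and the use of Euclidean distance are all assumptions made for the example.

```python
from collections import Counter
from math import dist

# Hypothetical core genes; each one becomes a dimension of the feature space.
CORE_GENES = ["geneA", "geneB", "geneC", "geneD"]


def encode(window_genes):
    """Encode a sequence window as a presence/absence vector over CORE_GENES."""
    return [1.0 if gene in window_genes else 0.0 for gene in CORE_GENES]


def knn_predict(query, references, k=1):
    """Label a query vector by majority vote among its k nearest references.

    `references` is a list of (vector, label) pairs; distance is Euclidean.
    """
    ranked = sorted(references, key=lambda ref: dist(query, ref[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference windows with known pathway categories.
references = [
    (encode({"geneA", "geneB"}), "aromatic_degradation"),
    (encode({"geneA", "geneB", "geneC"}), "aromatic_degradation"),
    (encode({"geneD"}), "other"),
    (encode({"geneC", "geneD"}), "other"),
]

# A candidate window is assigned the category of its closest known windows.
candidate = encode({"geneA", "geneB", "geneD"})
label = knn_predict(candidate, references, k=3)
```

In practice the choice of distance metric and of `k` would need to be tuned to the data; a set-based measure such as Jaccard distance may suit presence/absence vectors better than Euclidean distance.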

Report

(1 result)
  • 2020 Annual Research Report


Published: 2020-11-16   Modified: 2021-12-27  
