Development of information technology for high-accuracy methylation detection from nanopore sequencing that accounts for genetic variants

Research Project

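Project/Area Number	21K12104
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 62010: Life, health and medical informatics-related
Research Institution	The University of Tokyo
Principal Investigator	Zhang Yao-zhong (張 耀中), The University of Tokyo, Institute of Medical Science, Associate Professor (60817138)
Project Period (FY)	2021-04-01 – 2024-03-31
Project Status	Completed (FY2023)
Budget Amount	4,160 thousand yen (direct costs: 3,200 thousand yen; indirect costs: 960 thousand yen). FY2023: 910 thousand yen (direct costs: 700 thousand yen; indirect costs: 210 thousand yen). FY2022: 1,950 thousand yen (direct costs: 1,500 thousand yen; indirect costs: 450 thousand yen). FY2021: 1,300 thousand yen (direct costs: 1,000 thousand yen; indirect costs: 300 thousand yen)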
Keywords	methylation / nanopore / deep learning / nanopore methylation / k-mer model / representation learning / pre-training / pre-trained representation learning model / k-mer / whole-genome representation learning / nanopore sequencing / structural variation
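Outline of Research at the Start	To accurately profile complex methylation in cancer genomes and RNA viruses from nanopore sequencing, this project builds an information analysis technology that detects methylation with high accuracy using deep neural networks that take specific genotypes into account. Genome assembly, genetic variant identification, and structural variant detection have so far been analyzed independently. We propose an information technology that integrates these analyses and applies ensembling to achieve accurate methylation profiling.
Outline of Final Research Achievements	We developed high-performance methylation detection methods for nanopore sequencing data at both the model level and the pipeline level. At the model level, we developed methBERT using the encoder architecture of the Transformer model, and we examined representation learning of nucleotide sequences with BERT models. We also developed a new methylation caller that jointly uses reads aligned to the same genomic loci. At the pipeline level, we built a methylation prediction pipeline that accounts for haplotypes and genomic variants, and validated it on normal and tumor cell lines. In addition, we developed a new representation learning method that introduces biological relationships through contrastive learning.
Academic Significance and Societal Importance of the Research Achievements	As the cost of genome sequencing falls, its use has spread. Analyzing genome sequencing data faster and more accurately is important for healthcare and disease diagnosis. In this project, we developed a high-accuracy methylation profiling technology for nanopore sequencing. The technology enables fast and accurate methylation detection and is expected to contribute to understanding epigenetic changes in aging and disease.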

Reports

(4 results)

Research Products

(20 results)


Journal Article (11 results; 9 peer reviewed, 7 open access), Presentation (5 results; 5 at international conferences), Remarks (4 results)

[Journal Article] Predicting cell types with supervised contrastive learning on cells and their types (2024)
- Author(s)/Presenter(s)
  Heryanto Yusri Dwi, Zhang Yao-zhong, Imoto Seiya
- Journal Title
  Scientific Reports
  Volume: 14 Issue: 1 Pages: 1-16
- DOI
  10.1038/s41598-023-50185-2
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Zero-shot-capable identification of phage-host relationships with whole-genome sequence representation by contrastive learning (2023)
- Author(s)/Presenter(s)
  Zhang Yao-zhong, Liu Yunjie, Bai Zeheng, Fujimoto Kosuke, Uematsu Satoshi, Imoto Seiya
- Journal Title
  Briefings in Bioinformatics
  Volume: 24 Issue: 5 Pages: 1-10
- DOI
  10.1093/bib/bbad239
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Investigation of the BERT model on nucleotide sequences with non-standard pre-training and evaluation of different k-mer embeddings (2023)
- Author(s)/Presenter(s)
  Zhang Yao-zhong, Bai Zeheng, Imoto Seiya
- Journal Title
  Bioinformatics
  Volume: 39 Issue: 10 Pages: 1-10
- DOI
  10.1093/bioinformatics/btad617
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Microbial Gene Ontology informed deep neural network for microbe functionality discovery in human diseases (2023)
- Author(s)/Presenter(s)
  Liu Yunjie, Zhang Yao-zhong, Imoto Seiya
- Journal Title
  PLOS ONE
  Volume: 18 Issue: 8 Pages: 1-13
- DOI
  10.1371/journal.pone.0290307
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Imputing time-series microbiome abundance profiles with diffusion model (2023)
- Author(s)/Presenter(s)
  Seki Misato, Zhang Yao-Zhong, Imoto Seiya
- Journal Title
  IEEE Xplore
  Volume: BIBM 2023 Pages: 914-919
- DOI
  10.1109/bibm58861.2023.10385703
- Related Report
  2023 Annual Research Report
- Peer Reviewed
[Journal Article] Identification of bacteriophage genome sequences with representation learning (2022)
- Author(s)/Presenter(s)
  Bai Zeheng, Zhang Yao-zhong, Miyano Satoru, Yamaguchi Rui, Fujimoto Kosuke, Uematsu Satoshi, Imoto Seiya
- Journal Title
  Bioinformatics
  Volume: 38 Issue: 18 Pages: 4264-4270
- DOI
  10.1093/bioinformatics/btac509
- Related Report
  2022 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Dysfunctional analysis of the pre-training model on nucleotide sequences and the evaluation of different k-mer embeddings (2022)
- Author(s)/Presenter(s)
  Zhang Yao-zhong, Bai Zeheng, Imoto Seiya
- Journal Title
  bioRxiv
  Volume: preprint Pages: 1-7
- DOI
  10.1101/2022.12.05.518770
- Related Report
  2022 Research-status Report
- Open Access
[Journal Article] On the application of BERT models for nanopore methylation detection (2021)
- Author(s)/Presenter(s)
  Zhang Yao-Zhong, Yamaguchi Kiyoshi, Hatakeyama Sera, Furukawa Yoichi, Miyano Satoru, Yamaguchi Rui, Imoto Seiya
- Journal Title
  Proceedings of 2021 IEEE International Conference on Bioinformatics and Biomedicine
  Pages: 320-327
- DOI
  10.1109/bibm52615.2021.9669841
- Related Report
  2021 Research-status Report
- Peer Reviewed
[Journal Article] Enhancing breakpoint resolution with deep segmentation model: A general refinement method for read-depth based structural variant callers (2021)
- Author(s)/Presenter(s)
  Zhang Yao-zhong, Imoto Seiya, Miyano Satoru, Yamaguchi Rui
- Journal Title
  PLOS Computational Biology
  Volume: 17 Issue: 10 Pages: 1009186-1009186
- DOI
  10.1371/journal.pcbi.1009186
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Identification of Bacteriophages Using Deep Representation Model with Pre-training (2021)
- Author(s)/Presenter(s)
  Bai Zeheng, Zhang Yao-zhong, Miyano Satoru, Yamaguchi Rui, Uematsu Satoshi, Imoto Seiya
- Journal Title
  bioRxiv
  Pages: 1-7
- DOI
  10.1101/2021.09.25.461359
- Related Report
  2021 Research-status Report
[Journal Article] Discovering microbe functionality in human disease with a gene-ontology-aware model (2021)
- Author(s)/Presenter(s)
  Liu Yunjie, Zhang Yaozhong, Imoto Seiya
- Journal Title
  Proceedings of 2021 IEEE International Conference on Bioinformatics and Biomedicine
  Pages: 1873-1880
- DOI
  10.1109/bibm52615.2021.9669492
- Related Report
  2021 Research-status Report
- Peer Reviewed
[Presentation] Zero-shot-capable identification of phage-host relationships with whole-genome sequence representation by contrastive learning (2023)
- Author(s)/Presenter(s)
  Yao-zhong Zhang
- Conference
  International Workshop on Bioinformatics and Systems Biology (IBSB)
- Related Report
  2023 Annual Research Report
- International conference
[Presentation] Imputing time-series microbiome abundance profiles with diffusion model (2023)
- Author(s)/Presenter(s)
  Misato Seki, Yao-zhong Zhang, Seiya Imoto
- Conference
  International Conference on Bioinformatics and Biomedicine (BIBM)
- Related Report
  2023 Annual Research Report
- International conference
[Presentation] Dysfunctional analysis of the pre-training model on nucleotide sequences and the evaluation of different k-mer embeddings (2023)
- Author(s)/Presenter(s)
  Yao-zhong Zhang
- Conference
  27th Annual International Conference on Research in Computational Molecular Biology
- Related Report
  2022 Research-status Report
- International conference
[Presentation] On the application of BERT models for nanopore methylation detection (2021)
- Author(s)/Presenter(s)
  Yao-zhong Zhang
- Conference
  IEEE International Conference on Bioinformatics and Biomedicine
- Related Report
  2021 Research-status Report
- International conference
[Presentation] Discovering microbe functionality in human disease with a gene-ontology-aware model (2021)
- Author(s)/Presenter(s)
  Yunjie Liu, Yao-zhong Zhang, Seiya Imoto
- Conference
  Biological Ontologies and Knowledge Bases workshop 2021
- Related Report
  2021 Research-status Report
- International conference
[Remarks] CL4PHI
- URL
  https://github.com/yaozhong/CL4PHI
- Related Report
  2023 Annual Research Report
[Remarks] SCLSC
- URL
  https://github.com/yaozhong/SCLSC
- Related Report
  2023 Annual Research Report
[Remarks] bert investigation
- URL
  https://github.com/yaozhong/bert_investigation
- Related Report
  2022 Research-status Report
[Remarks] methBERT open source software
- URL
  https://methbert.readthedocs.io/en/latest/index.html
- Related Report
  2021 Research-status Report
