The development of a high-performance nanopore methylation detection method with consideration of structural variation

Research Project

Project/Area Number 21K12104
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 62010: Life, health and medical informatics-related
Research Institution The University of Tokyo

Principal Investigator

ZHANG Yaozhong  The University of Tokyo, The Institute of Medical Science, Associate Professor (60817138)

Project Period (FY) 2021-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2023: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2022: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2021: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Keywords methylation / nanopore / deep learning / nanopore methylation / k-mer model / representation learning / pre-training / pre-trained representation learning model / k-mer / whole-genome representation learning / nanopore sequencing / structural variation
Outline of Research at the Start

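In this research, to accurately profile complex methylation in cancer genomes and RNA viruses from nanopore sequencing data, we will build an analysis technique that detects methylation with high accuracy using deep neural networks that take specific genotypes into account. We propose an information technology that achieves accurate methylation profiling by integrating and ensembling genome assembly, genetic variant identification, and structural variant detection, which have so far been analyzed independently.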

Outline of Final Research Achievements

In this project, we developed both model-level and pipeline-level high-performance methylation callers for nanopore sequencing data. We developed methBERT using the encoder architecture of the transformer model. In addition to signal analysis, we investigated the learning of nucleotide representation in the BERT model through pre-training. We analyzed representations for signals and nucleotides and developed a novel methylation caller based on the alignment of reads at target positions. At the pipeline level, we built a haplotype-aware and structural-variant-informed methylation detection pipeline, which we tested on both normal and tumor cells. Besides developing high-performance methylation callers, we extended our findings to whole-genome-level nucleotide sequence representation and single-cell representations using contrastive learning with biological constraints.

Academic Significance and Societal Importance of the Research Achievements

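As the cost of genome sequencing has fallen, its use has become more widespread. Analyzing genome sequencing data faster and more accurately is important for healthcare and disease diagnosis. In this research, we developed a high-accuracy methylation profiling technique for nanopore sequencing data. This technique enables fast and accurate methylation detection and is expected to contribute to understanding epigenetic changes in aging and disease.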

Report

(4 results)
  • 2023 Annual Research Report   Final Research Report ( PDF )
  • 2022 Research-status Report
  • 2021 Research-status Report
  • Research Products

    (20 results)

All Journal Article (11 results) (of which Peer Reviewed: 9 results,  Open Access: 7 results) Presentation (5 results) (of which Int'l Joint Research: 5 results) Remarks (4 results)

  • [Journal Article] Predicting cell types with supervised contrastive learning on cells and their types (2024)

    • Author(s)
      Heryanto Yusri Dwi, Zhang Yao-zhong, Imoto Seiya
    • Journal Title

      Scientific Reports

      Volume: 14 Issue: 1 Pages: 1-16

    • DOI

      10.1038/s41598-023-50185-2

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Zero-shot-capable identification of phage-host relationships with whole-genome sequence representation by contrastive learning (2023)

    • Author(s)
      Zhang Yao-zhong, Liu Yunjie, Bai Zeheng, Fujimoto Kosuke, Uematsu Satoshi, Imoto Seiya
    • Journal Title

      Briefings in Bioinformatics

      Volume: 24 Issue: 5 Pages: 1-10

    • DOI

      10.1093/bib/bbad239

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Investigation of the BERT model on nucleotide sequences with non-standard pre-training and evaluation of different k-mer embeddings (2023)

    • Author(s)
      Zhang Yao-zhong, Bai Zeheng, Imoto Seiya
    • Journal Title

      Bioinformatics

      Volume: 39 Issue: 10 Pages: 1-10

    • DOI

      10.1093/bioinformatics/btad617

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Microbial Gene Ontology informed deep neural network for microbe functionality discovery in human diseases (2023)

    • Author(s)
      Liu Yunjie, Zhang Yao-zhong, Imoto Seiya
    • Journal Title

      PLOS ONE

      Volume: 18 Issue: 8 Pages: 1-13

    • DOI

      10.1371/journal.pone.0290307

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Imputing time-series microbiome abundance profiles with diffusion model (2023)

    • Author(s)
      Seki Misato, Zhang Yao-Zhong, Imoto Seiya
    • Journal Title

      IEEE Xplore

      Volume: BIBM 2023 Pages: 914-919

    • DOI

      10.1109/bibm58861.2023.10385703

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Identification of bacteriophage genome sequences with representation learning (2022)

    • Author(s)
      Bai Zeheng, Zhang Yao-zhong, Miyano Satoru, Yamaguchi Rui, Fujimoto Kosuke, Uematsu Satoshi, Imoto Seiya
    • Journal Title

      Bioinformatics

      Volume: 38 Issue: 18 Pages: 4264-4270

    • DOI

      10.1093/bioinformatics/btac509

    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Dysfunctional analysis of the pre-training model on nucleotide sequences and the evaluation of different k-mer embeddings (2022)

    • Author(s)
      Zhang Yao-zhong, Bai Zeheng, Imoto Seiya
    • Journal Title

      bioRxiv

      Volume: preprint Pages: 1-7

    • DOI

      10.1101/2022.12.05.518770

    • Related Report
      2022 Research-status Report
    • Open Access
  • [Journal Article] On the application of BERT models for nanopore methylation detection (2021)

    • Author(s)
      Zhang Yao-Zhong, Yamaguchi Kiyoshi, Hatakeyama Sera, Furukawa Yoichi, Miyano Satoru, Yamaguchi Rui, Imoto Seiya
    • Journal Title

      Proceedings of 2021 IEEE International Conference on Bioinformatics and Biomedicine

      Pages: 320-327

    • DOI

      10.1109/bibm52615.2021.9669841

    • Related Report
      2021 Research-status Report
    • Peer Reviewed
  • [Journal Article] Enhancing breakpoint resolution with deep segmentation model: A general refinement method for read-depth based structural variant callers (2021)

    • Author(s)
      Zhang Yao-zhong, Imoto Seiya, Miyano Satoru, Yamaguchi Rui
    • Journal Title

      PLOS Computational Biology

      Volume: 17 Issue: 10 Pages: 1009186-1009186

    • DOI

      10.1371/journal.pcbi.1009186

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Identification of Bacteriophages Using Deep Representation Model with Pre-training (2021)

    • Author(s)
      Bai Zeheng, Zhang Yao-zhong, Miyano Satoru, Yamaguchi Rui, Uematsu Satoshi, Imoto Seiya
    • Journal Title

      bioRxiv

      Pages: 1-7

    • DOI

      10.1101/2021.09.25.461359

    • Related Report
      2021 Research-status Report
  • [Journal Article] Discovering microbe functionality in human disease with a gene-ontology-aware model (2021)

    • Author(s)
      Liu Yunjie, Zhang Yaozhong, Imoto Seiya
    • Journal Title

      Proceedings of 2021 IEEE International Conference on Bioinformatics and Biomedicine

      Pages: 1873-1880

    • DOI

      10.1109/bibm52615.2021.9669492

    • Related Report
      2021 Research-status Report
    • Peer Reviewed
  • [Presentation] Zero-shot-capable identification of phage-host relationships with whole-genome sequence representation by contrastive learning (2023)

    • Author(s)
      Yao-zhong Zhang
    • Organizer
      International Workshop on Bioinformatics and Systems Biology (IBSB)
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Imputing time-series microbiome abundance profiles with diffusion model (2023)

    • Author(s)
      Misato Seki, Yao-zhong Zhang, Seiya Imoto
    • Organizer
      International Conference on Bioinformatics and Biomedicine (BIBM)
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Dysfunctional analysis of the pre-training model on nucleotide sequences and the evaluation of different k-mer embeddings (2023)

    • Author(s)
      Yao-zhong Zhang
    • Organizer
      27th Annual International Conference on Research in Computational Molecular Biology
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] On the application of BERT models for nanopore methylation detection (2021)

    • Author(s)
      Yao-zhong Zhang
    • Organizer
      IEEE International Conference on Bioinformatics and Biomedicine
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
  • [Presentation] Discovering microbe functionality in human disease with a gene-ontology-aware model (2021)

    • Author(s)
      Yunjie Liu, Yao-zhong Zhang, Seiya Imoto
    • Organizer
      Biological Ontologies and Knowledge Bases workshop 2021
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
  • [Remarks] CL4PHI

    • URL

      https://github.com/yaozhong/CL4PHI

    • Related Report
      2023 Annual Research Report
  • [Remarks] SCLSC

    • URL

      https://github.com/yaozhong/SCLSC

    • Related Report
      2023 Annual Research Report
  • [Remarks] bert_investigation

    • URL

      https://github.com/yaozhong/bert_investigation

    • Related Report
      2022 Research-status Report
  • [Remarks] methBERT open source software

    • URL

      https://methbert.readthedocs.io/en/latest/index.html

    • Related Report
      2021 Research-status Report

Published: 2021-04-28   Modified: 2025-01-30  
