
Pure transformer encoder-based generative adversarial networks for molecular generation

Research Project

Project/Area Number 23KF0063
Research Category

Grant-in-Aid for JSPS Fellows

Allocation Type Multi-year Fund
Section Foreign
Review Section Basic Section 62010: Life, health and medical informatics-related
Research Institution Nagoya University

Principal Investigator

Yamanishi Yoshihiro (山西 芳裕)  Nagoya University, Graduate School of Informatics, Professor (60437267)

Co-Investigator (Kenkyū-buntansha) LI CHEN  Nagoya University, Graduate School of Informatics, JSPS International Research Fellow
Project Period (FY) 2023-04-25 – 2025-03-31
Project Status Granted (Fiscal Year 2023)
Budget Amount
¥2,000,000 (Direct Cost: ¥2,000,000)
Fiscal Year 2024: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2023: ¥1,000,000 (Direct Cost: ¥1,000,000)
Keywords Generative AI Model / Deep Learning / Drug Discovery / Molecular Generation / Property Optimization
Outline of Research at the Start

In this study, I propose a new method that combines a transformer and a GAN to generate realistic molecules. Specifically, I propose a property-optimized GAN that contains only transformer encoders to generate molecules with the desired chemical properties.

Outline of Annual Research Achievements

To tackle the complexity of generating molecular representations (SMILES) with GANs, the non-uniqueness of SMILES representations, and the instability of GAN training, I proposed a de novo molecular generative model. To better capture features within SMILES representations, a transformer and its variants were used as the generator and discriminator of the GAN. In addition, variant SMILES were used to train the model comprehensively, exploiting the fact that a single molecule can be written as multiple distinct SMILES strings. Furthermore, molecular chemical properties were used as rewards within a reinforcement learning-based framework, and these rewards guide the update of the generator.
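As a toy illustration of why one molecule has several SMILES strings, the sketch below writes a small acyclic molecule by depth-first traversal starting from each atom in turn. The helper names `to_smiles` and `variant_smiles` are hypothetical, the writer handles only ring-free, single-bonded molecules, and it is not the enumeration procedure used in the project.

```python
def to_smiles(atoms, bonds, start):
    """Write an acyclic, single-bonded molecule as SMILES by DFS from `start`."""
    adj = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)

    def walk(node, parent):
        children = [n for n in adj[node] if n != parent]
        out = atoms[node]
        # Every child except the last is written as a parenthesised branch.
        for child in children[:-1]:
            out += "(" + walk(child, node) + ")"
        if children:
            out += walk(children[-1], node)
        return out

    return walk(start, None)


def variant_smiles(atoms, bonds):
    """Collect the distinct strings obtained from every possible start atom."""
    return sorted({to_smiles(atoms, bonds, s) for s in range(len(atoms))})


# Ethanol: C-C-O, atoms indexed 0, 1, 2.
variants = variant_smiles(["C", "C", "O"], [(0, 1), (1, 2)])
print(variants)  # ['C(C)O', 'CCO', 'OCC']
```

All three strings describe the same molecule, so a model trained on only one of them sees a narrower view of the data than one trained on the variants.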
To address the challenge of preserving the molecular scaffold in de novo molecular generation, a functional group generative model was introduced. This model generates functional groups for a given molecular scaffold while simultaneously optimizing molecular properties. Unlike a conventional transformer, this model uses a reverse transformer with a decoder-first, encoder-second architecture to achieve GAN functionality.
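The idea of keeping a scaffold fixed while rewards steer the choice of functional group can be sketched with a plain policy-gradient (REINFORCE) update. This is a minimal sketch under strong assumptions, not the project's model: there is no transformer and no discriminator, the benzene scaffold and the four candidate groups are chosen only for illustration, and the scores in `TOY_PROPERTY` are made up rather than computed chemical properties.

```python
import math
import random

SCAFFOLD = "c1ccccc1"          # benzene ring, never modified (illustrative choice)
GROUPS = ["C", "N", "O", "F"]  # candidate substituents (hypothetical set)
TOY_PROPERTY = {"C": 0.2, "N": 0.6, "O": 1.0, "F": 0.1}  # made-up reward values


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=300, batch=32, lr=0.5, seed=0):
    """REINFORCE over a categorical policy that picks one group per molecule."""
    rng = random.Random(seed)
    logits = [0.0] * len(GROUPS)
    for _ in range(steps):
        probs = softmax(logits)
        picks = rng.choices(range(len(GROUPS)), weights=probs, k=batch)
        rewards = [TOY_PROPERTY[GROUPS[i]] for i in picks]
        baseline = sum(rewards) / batch  # batch-mean baseline reduces variance
        grad = [0.0] * len(GROUPS)
        for i, r in zip(picks, rewards):
            for j in range(len(GROUPS)):
                # d log p(i) / d logit_j = 1[i == j] - p_j
                grad[j] += (r - baseline) * ((1.0 if i == j else 0.0) - probs[j])
        logits = [l + lr * g / batch for l, g in zip(logits, grad)]
    return softmax(logits)


probs = train()
best_group = GROUPS[probs.index(max(probs))]
molecules = [SCAFFOLD + g for g in GROUPS]  # every candidate keeps the scaffold
print(SCAFFOLD + best_group, [round(p, 3) for p in probs])
```

Because only the appended group is sampled, every generated string retains the scaffold by construction, and the reward signal shifts probability toward the highest-scoring group.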

Current Status of Research Progress

1: Research has progressed more than originally planned.

Reason

The research project has progressed well beyond my original plans. At the intersection of AI and bioinformatics, my efforts have focused on shortening the development phases and reducing the substantial trial-and-error costs inherent in conventional drug discovery. The main contributions are de novo molecular generative models [1], functional group generation based on molecular scaffolds [2], and drug candidate generation based on gene expression profiles [3].
[1] C. Li and Y. Yamanishi (2024), “TenGAN: Pure transformer encoders make an efficient discrete GAN for de novo molecular generation,” in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), a top AI conference.
[2] C. Li and Y. Yamanishi (2023), “SpotGAN: A reverse-transformer GAN generates scaffold-constrained molecules with property optimization,” in Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023).
[3] C. Li and Y. Yamanishi (2024), “GxVAEs: Two joint VAEs generate hit molecules from gene expression profiles,” in Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024), a top AI conference; one of the three outstanding papers among the 2342 accepted papers.

Strategy for Future Research Activity

In future work, I aim to use advanced deep learning techniques to generate novel molecules with desired chemical properties. Moreover, given the rich biological information available in gene expression profiles, I aim to generate molecules in combination with gene expression profiles.
Moment Soft-Actor-Critic Reinforcement Learning Driven GAN for De Novo Molecule Generation
My previous work [1,2] generated molecules with desired chemical properties using reinforcement learning based on Monte Carlo tree search (MCTS). Although MCTS is a powerful tool for molecular generation, it is computationally expensive: the search space of molecular generation is large and complex, so MCTS is both time-consuming and costly to run. Additionally, the discriminators in past studies must evaluate entire SMILES strings, even though a molecule is often judged fake by the discriminator because of only a few unsuitable atoms. In future work, I aim to propose a GAN based on soft actor-critic reinforcement learning with a discriminator that evaluates the generated SMILES strings in a stepwise manner.
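The cost argument can be made concrete with a toy count of discriminator evaluations. The sketch assumes that, with a whole-string discriminator, per-token rewards are estimated by randomly completing each partial string several times (a rollout scheme in the spirit of the Monte Carlo search described above), whereas a stepwise discriminator scores each prefix once. The parenthesis-balance scorer, the vocabulary, and all function names are hypothetical stand-ins, and no soft actor-critic component is implemented here.

```python
import random

VOCAB = ["C", "N", "O", "(", ")"]


class CountingDiscriminator:
    """Toy whole-string scorer: 1.0 if parentheses balance, else 0.0."""

    def __init__(self):
        self.calls = 0

    def score(self, tokens):
        self.calls += 1
        depth = 0
        for tok in tokens:
            depth += (tok == "(") - (tok == ")")
            if depth < 0:
                return 0.0
        return 1.0 if depth == 0 else 0.0


def rollout_rewards(tokens, disc, n_rollouts, rng):
    """Estimate one reward per token by randomly completing each prefix."""
    length = len(tokens)
    rewards = []
    for t in range(1, length):
        total = 0.0
        for _ in range(n_rollouts):
            tail = [rng.choice(VOCAB) for _ in range(length - t)]
            total += disc.score(tokens[:t] + tail)
        rewards.append(total / n_rollouts)
    rewards.append(disc.score(tokens))  # the finished string needs no rollout
    return rewards


def stepwise_rewards(tokens):
    """Score each prefix once: 1.0 while open branches can still be closed."""
    length, depth, ok = len(tokens), 0, True
    rewards, calls = [], 0
    for i, tok in enumerate(tokens):
        calls += 1
        depth += (tok == "(") - (tok == ")")
        ok = ok and 0 <= depth <= length - i - 1
        rewards.append(1.0 if ok else 0.0)
    return rewards, calls


tokens = list("CC(N)COC")
disc = CountingDiscriminator()
mc = rollout_rewards(tokens, disc, n_rollouts=16, rng=random.Random(0))
step, step_calls = stepwise_rewards(tokens)
print(disc.calls, step_calls)  # 113 8
```

For a string of length T and N rollouts per prefix, the rollout scheme needs (T - 1) * N + 1 whole-string evaluations, while the stepwise scheme needs T prefix evaluations.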

Report

(1 results)
  • 2023 Research-status Report
  • Research Products

    (5 results)

Journal Article (2 results) (of which Peer Reviewed: 2 results, Open Access: 2 results) Presentation (3 results) (of which Int'l Joint Research: 3 results)

  • [Journal Article] GxVAEs: Two Joint VAEs Generate Hit Molecules from Gene Expression Profiles (2024)

    • Author(s)
      Li Chen, Yamanishi Yoshihiro
    • Journal Title

      Proceedings of the AAAI Conference on Artificial Intelligence

      Volume: 38 Issue: 12 Pages: 13455-13463

    • DOI

      10.1609/aaai.v38i12.29248

    • Related Report
      2023 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] SpotGAN: A Reverse-Transformer GAN Generates Scaffold-Constrained Molecules with Property Optimization (2023)

    • Author(s)
      Li Chen, Yamanishi Yoshihiro
    • Journal Title

      Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023)

      Volume: 2023 Pages: 323-338

    • DOI

      10.1007/978-3-031-43412-9_19

    • ISBN
      9783031434112, 9783031434129
    • Related Report
      2023 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] GxVAEs: Two Joint VAEs Generate Hit Molecules from Gene Expression Profiles (2024)

    • Author(s)
      Li, C. and Yamanishi, Y.
    • Organizer
      The 38th Annual AAAI Conference on Artificial Intelligence (AAAI2024)
    • Related Report
      2023 Research-status Report
    • Int'l Joint Research
  • [Presentation] SpotGAN: A Reverse-Transformer GAN Generates Scaffold-Constrained Molecules with Property Optimization (2023)

    • Author(s)
      Li, C. and Yamanishi, Y.
    • Organizer
      The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023)
    • Related Report
      2023 Research-status Report
    • Int'l Joint Research
  • [Presentation] Scaffold-Retained Transformer GAN for Molecular Generation with Chemical Property Optimization (2023)

    • Author(s)
      Chen Li and Yoshihiro Yamanishi
    • Organizer
      Chem-Bio Informatics Society (CBI) Annual Meeting 2023
    • Related Report
      2023 Research-status Report
    • Int'l Joint Research

Published: 2023-04-26   Modified: 2024-12-25  
