Project/Area Number |
23KF0063
|
Research Category |
Grant-in-Aid for JSPS Fellows
|
Allocation Type | Multi-year Fund |
Section | Foreign |
Review Section |
Basic Section 62010: Life, health and medical informatics-related
|
Research Institution | Nagoya University |
Principal Investigator |
山西 芳裕 (Yoshihiro Yamanishi), Nagoya University, Graduate School of Informatics, Professor (60437267)
|
Co-Investigator(Kenkyū-buntansha) |
LI CHEN, Nagoya University, Graduate School of Informatics, International Research Fellow
|
Project Period (FY) |
2023-04-25 – 2025-03-31
|
Project Status |
Granted (Fiscal Year 2023)
|
Budget Amount |
¥2,000,000 (Direct Cost: ¥2,000,000)
Fiscal Year 2024: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2023: ¥1,000,000 (Direct Cost: ¥1,000,000)
|
Keywords | Generative AI Model / Deep Learning / Drug Discovery / Molecular Generation / Property Optimization |
Outline of Research at the Start |
This study proposes a new method that combines a transformer with a generative adversarial network (GAN) to generate realistic molecules. Specifically, I propose a property-optimized GAN built solely from transformer encoders to generate molecules with desired chemical properties.
|
Outline of Annual Research Achievements |
To tackle the difficulty of generating molecular string representations (SMILES) with GANs, the non-uniqueness of SMILES representations, and the instability of GAN training, I proposed a de novo molecular generative model. To better capture features within SMILES strings, a transformer and its variants were used as the generator and discriminator of the GAN. In addition, the model was trained on variant SMILES, exploiting the fact that a single molecule can be written as multiple distinct SMILES strings. Molecular chemical properties were used as rewards within a reinforcement learning-based framework, and these rewards guide the updates of the generator. To address the challenge of preserving the molecular scaffold in de novo molecular generation, a functional group generative model was introduced. This model generates functional groups for a given molecular scaffold and optimizes molecular properties at the same time. Unlike the conventional transformer, this model uses a reverse transformer with a first-decoder-then-encoder architecture to achieve GAN functionality.
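The idea of using a property score as a reinforcement learning reward for the generator can be sketched with a plain REINFORCE update. This is a toy illustration under strong assumptions, not the project's model: the three-token vocabulary, the single shared categorical policy, the `toy_property_reward` function, and all hyperparameters are hypothetical stand-ins for a transformer generator and a real chemical property.

```python
import math
import random

# Hypothetical toy setup: none of these names come from the project itself.
VOCAB = ["C", "N", "O"]   # toy token set, not a real SMILES vocabulary
SEQ_LEN = 5               # every generated sequence has this fixed length


def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_property_reward(tokens):
    """Stand-in for a chemical property score: fraction of 'O' tokens."""
    return tokens.count("O") / len(tokens)


def reinforce_step(logits, rng, batch_size=32, lr=0.5):
    """One REINFORCE update with a mean-reward baseline."""
    probs = softmax(logits)
    batch = [rng.choices(range(len(VOCAB)), weights=probs, k=SEQ_LEN)
             for _ in range(batch_size)]
    rewards = [toy_property_reward([VOCAB[i] for i in seq]) for seq in batch]
    baseline = sum(rewards) / len(rewards)
    grad = [0.0] * len(VOCAB)
    for seq, reward in zip(batch, rewards):
        advantage = reward - baseline
        for tok in seq:
            # d log p(tok) / d logit_k = 1[k == tok] - p_k
            for k in range(len(VOCAB)):
                grad[k] += advantage * ((1.0 if k == tok else 0.0) - probs[k])
    # Gradient ascent on the expected reward.
    return [l + lr * g / batch_size for l, g in zip(logits, grad)]


def train(steps=200, seed=0):
    """Train the toy policy and return its final token probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(VOCAB)
    for _ in range(steps):
        logits = reinforce_step(logits, rng)
    return softmax(logits)
```

Calling `train()` returns the learned token probabilities; in this toy setting the policy should shift its mass toward the token that the reward favors. In a GAN setting the reward would typically also include a discriminator signal, which is omitted here.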
|
Current Status of Research Progress |
1: Research has progressed more than originally planned.
Reason
The research project has progressed well beyond my original plans. At the intersection of AI and bioinformatics, my efforts have focused on shortening the development phases and reducing the substantial trial-and-error costs inherent in conventional drug discovery. The main contributions are de novo molecular generative models [1], functional group generation based on molecular scaffolds [2], and drug candidate generation based on gene expression profiles [3].
[1] C. Li and Y. Yamanishi (2024), “TenGAN: Pure transformer encoders make an efficient discrete GAN for de novo molecular generation,” in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), a top AI conference.
[2] C. Li and Y. Yamanishi (2023), “SpotGAN: A reverse-transformer GAN generates scaffold-constrained molecules with property optimization,” in Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023).
[3] C. Li and Y. Yamanishi (2024), “GxVAEs: Two joint VAEs generate hit molecules from gene expression profiles,” 38th AAAI Conference on Artificial Intelligence (AAAI 2024), a top AI conference; one of the three outstanding papers among the 2342 accepted papers.
|
Strategy for Future Research Activity |
In future work, I aim to use advanced deep learning techniques to generate novel molecules with desired chemical properties. Moreover, considering the rich biological information available in gene expression profiles, I aim to generate molecules in combination with gene expression profiles.
Moment Soft-Actor-Critic Reinforcement Learning Driven GAN for De Novo Molecule Generation
My previous work [1,2] generated molecules with desired chemical properties using a Monte Carlo tree search (MCTS) reinforcement learning algorithm. While MCTS is a powerful tool for molecular generation, a potential drawback is its computational cost. Because molecular generation involves a complex search space, MCTS can require substantial computational resources, making it both time-consuming and expensive to run. Additionally, the discriminators in past studies must evaluate entire SMILES strings. However, a molecule that the discriminator judges to be fake is often rejected because of only a few unsuitable atoms. In future work, I aim to propose a GAN based on soft actor-critic reinforcement learning with a discriminator that evaluates the generated SMILES strings in a stepwise manner.
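The contrast between sampling-based reward estimation and a stepwise discriminator can be illustrated with a small sketch. It is a simplification under stated assumptions: plain Monte Carlo rollouts stand in for full MCTS, the token "X" is a hypothetical marker for an unsuitable atom, and `toy_full_score` and `toy_step_score` are invented placeholders for learned discriminators.

```python
import random

# Hypothetical toy vocabulary; "X" plays the role of an unsuitable atom.
VOCAB = ["C", "N", "O", "X"]


def toy_full_score(tokens):
    """Placeholder full-sequence discriminator: 1.0 if no 'X', else 0.0."""
    return 0.0 if "X" in tokens else 1.0


def toy_step_score(prefix):
    """Placeholder stepwise discriminator: scores only the newest token."""
    return 0.0 if prefix[-1] == "X" else 1.0


def rollout_rewards(tokens, n_rollouts, rng):
    """Estimate a reward per step by randomly completing each prefix.

    Returns the per-step rewards and the number of extra tokens sampled,
    which grows with both sequence length and rollout count.
    """
    rewards, sampled = [], 0
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        missing = len(tokens) - t
        runs = n_rollouts if missing else 1  # a full sequence needs no rollout
        total = 0.0
        for _ in range(runs):
            completion = [rng.choice(VOCAB) for _ in range(missing)]
            sampled += missing
            total += toy_full_score(prefix + completion)
        rewards.append(total / runs)
    return rewards, sampled


def stepwise_rewards(tokens):
    """One discriminator call per step, with no extra sampling."""
    return [toy_step_score(tokens[:t]) for t in range(1, len(tokens) + 1)]
```

For a five-token sequence with 16 rollouts per prefix, the rollout estimate samples 160 extra tokens, while the stepwise variant samples none and assigns the low reward only to the position holding the unsuitable token.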
|