2020 Fiscal Year Final Research Report
Feature Representation Design for Graph Machine Learning
Project/Area Number |
17H01783
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Institute of Physical and Chemical Research (2019-2020) Hokkaido University (2017-2018) |
Principal Investigator |
Takigawa Ichigaku RIKEN (National Research and Development Agency), Center for Advanced Intelligence Project, Researcher (10374597)
|
Project Period (FY) |
2017-04-01 – 2021-03-31
|
Keywords | Machine learning / Graph data / Molecular representation |
Outline of Final Research Achievements |
This project addresses feature representation problems in graph machine learning. Extending our previous work on sparse linear learning over the subgraph-feature search space, we developed several related methods: decision tree ensemble learning over the subgraph search space; decision tree learning that treats the subgraph search space as a trie; efficient learning by stochastic search over the subgraph space; graph learning from subgraph co-occurrences; compression of the subgraph search space with decision diagrams; dual graph convolutions for a graph of graphs; self-attentive graph learning for molecular property prediction; and user-edit-aware generative graph autocompletion.
|
Free Research Field |
Machine learning
|
Academic Significance and Societal Importance of the Research Achievements |
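Small organic molecules, the main subject of graph representations of molecules, (a) have a wide range of impact, spanning pharmaceuticals, intracellular metabolites, organic EL materials, foods, cosmetics, and more; (b) express their activity through mechanisms so complex that they are difficult to model; and (c) have a combinatorially huge number of possible candidate molecules. Against this background, data science techniques are strongly desired for understanding activity, and the findings of this project can be expected to have a broad ripple effect. In addition, the setting of graph-represented data is highly general: it makes it easy to build many concrete evaluation systems based on the diverse assay data in public repositories, and it serves as a good model case for systematically evaluating the difficulties that arise from strong correlations between features and from exponentially high dimensionality.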
|