FY2023 Research Status Report

Machine Learning for Structure-Rich Data-Scarce Domains

Research Project

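Project/Area Number	22K12150
Research Institution	Kyoto University
Principal Investigator	NGUYEN Canh・Hao  Kyoto University, Institute for Chemical Research, Lecturer (90626889)
Project Period (FY)	2022-04-01 – 2025-03-31
Keywords	Graph neural networks / Convex Clustering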
Outline of Research Achievements	This year, we worked on data representations that remain faithful to the original features while also exhibiting cluster structure. We investigated convex clustering as a way to obtain such a representation from a convex program, which can be solved efficiently and to global optimality. The key idea is to assume that the data follow a cluster structure and to recover that structure with convex clustering. One advantage of convex clustering is that, being a convex program, it guarantees an optimal solution. Another is that it is a relaxation of both k-means and agglomerative clustering, and so potentially offers the advantages of both. Our main work is an analytical characterization of the clusters that convex clustering recovers, and of its pros and cons relative to the other two algorithms. We found that convex clustering can learn only convex clusters, which is similar to k-means and different from agglomerative clustering. We also found that the clusters can be bounded in balls, making them round-shaped, and that there are gaps between them. These properties show that convex clustering finds rather specific types of clusters and is less flexible than the other two algorithms.
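To make the convex program mentioned above concrete, here is a minimal sketch under stated assumptions. The report does not give its formulation, so the objective used here is the commonly cited one, (1/2) * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2, with uniform pair weights. The solver is a plain projected-gradient ascent on the dual, not the method used in the project, and the names `convex_clustering`, `fused_labels`, `lam`, and `tol` are hypothetical.

```python
import math


def convex_clustering(points, lam, n_iter=5000):
    """Approximate the centroids u_i minimizing
        1/2 * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2
    (uniform weights on all pairs: an assumption of this sketch)."""
    n, d = len(points), len(points[0])
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    step = 1.0 / n  # conservative step size for the complete graph
    dual = {e: [0.0] * d for e in edges}  # one dual vector per pair

    def centroids():
        # Primal recovery: u_i = x_i + (duals leaving i) - (duals entering i).
        u = [list(p) for p in points]
        for (i, j), z in dual.items():
            for k in range(d):
                u[i][k] += z[k]
                u[j][k] -= z[k]
        return u

    for _ in range(n_iter):
        u = centroids()
        for (i, j) in edges:
            z = dual[(i, j)]
            # Gradient step on the dual, then projection onto the
            # Euclidean ball of radius lam.
            g = [z[k] - step * (u[i][k] - u[j][k]) for k in range(d)]
            norm = math.sqrt(sum(v * v for v in g))
            scale = 1.0 if norm <= lam else lam / norm
            dual[(i, j)] = [scale * v for v in g]
    return centroids()


def fused_labels(centroids, tol=1e-3):
    """Give the same label to centroids that coincide up to tol."""
    labels, reps = [], []
    for u in centroids:
        for c, r in enumerate(reps):
            if math.dist(u, r) <= tol:
                labels.append(c)
                break
        else:
            reps.append(u)
            labels.append(len(reps) - 1)
    return labels


if __name__ == "__main__":
    # Toy data: two small, well-separated groups in the plane.
    X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    for lam in (0.0, 1.0, 3.0):
        print(lam, fused_labels(convex_clustering(X, lam)))
```

On this toy input the partition should move from six singletons at lam = 0, through the two visible groups at lam = 1, to a single cluster at lam = 3. Points are assigned to the same cluster exactly when their centroids fuse, so the single parameter lam traces a path between the finest and the coarsest partition.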
Current Status of Research Progress (Category)	3: Progress in research has been slightly delayed. Reason: We are working on a problem made difficult by the need to understand the formulation of convex clustering, which has not been well studied before.
Strategy for Future Research Activity	We plan to continue looking for suitable data representations that combine the original features with additional information such as graphs, and that are guaranteed to extract more information than currently used methods.
Reason for Incurring Amount to Be Used Next Fiscal Year	We did not carry out the travel or purchase the articles provided for in the research plan.