Project/Area Number |
22KJ2289
|
Project Number (Grant-in-Aid) |
22J11908 (2022)
|
Research Category |
Grant-in-Aid for JSPS Fellows
|
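Allocation Type | Multi-year Fund (2023) Single-year Grants (2022) |
Application Section | Domestic |
Review Section |
Basic Section 61030: Intelligent informatics-related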
|
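Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
Thonglek Kundjanasith, Nara Institute of Science and Technology, Graduate School of Science and Technology, JSPS Research Fellow (DC2)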
|
Project Period (FY) |
2023-03-08 – 2024-03-31
|
Project Status |
Granted (FY2023)
|
Budget Amount *Note |
¥1,700,000 (Direct Cost: ¥1,700,000)
FY2023: ¥800,000 (Direct Cost: ¥800,000)
FY2022: ¥900,000 (Direct Cost: ¥900,000)
|
Keywords | Collaborative Develop / Distributed Computing / Edge Machine Learning / Federated Learning / Privacy Preservation / Resource Heterogeneity |
Outline of Research at the Start |
I propose LiberatAI, an infrastructure that lets researchers work together to develop machine learning models collaboratively. LiberatAI applies federated learning to train the models while preserving data privacy. It allows individuals to train models collaboratively in their own environments, which are usually heterogeneous. Three modules in LiberatAI support training a model across diverse storage, computing, and communication resources. LiberatAI was evaluated using models that detect COVID-19, one of the most popular applications involving privacy-sensitive data.
|
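As a rough illustration of the federated learning step described above, the sketch below averages client model parameters on a server, weighted by each client's local sample count (the widely used FedAvg rule). It is a generic example and not LiberatAI's actual implementation; the function name `federated_average` and the flat-list representation of parameters are assumptions made purely for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Weighted average of client parameter vectors (FedAvg-style).

    Hypothetical helper: each client sends only a flat list of model
    parameters and its local sample count; raw data stays on the client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must send vectors of the same length")

    averaged = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        weight = size / total  # clients with more data count for more
        for i, value in enumerate(params):
            averaged[i] += weight * value
    return averaged


if __name__ == "__main__":
    # Two clients; the second holds three times as much data as the first.
    print(federated_average([[0.0, 4.0], [4.0, 0.0]], [1, 3]))
```

Only parameters and sample counts cross the network in this scheme, which is what lets training proceed without centralizing the privacy-sensitive data.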
Outline of Annual Research Achievements |
I proposed an infrastructure that allows individuals to develop machine learning models collaboratively in their own environments, which are usually heterogeneous. The infrastructure lets researchers work together and potentially build better models than big companies can. It applies federated learning to train the models while preserving data privacy. The infrastructure comprises three components that support efficient training on diverse storage, computing, and network resources. The first component reduces the model size to fit the storage capacity of each heterogeneous environment. The second aggregates the models trained on heterogeneous computing resources. The third sparsifies the model before it is exchanged between the server and the clients. The infrastructure was evaluated using state-of-the-art neural network models that detect COVID-19 cases from chest X-ray images; COVID-19 detection is one of the most popular machine learning applications involving privacy-sensitive data. In this evaluation, the ensemble model with heterogeneous structures, trained on six different hardware environments using the proposed infrastructure, achieved 5.39% higher accuracy than a single trained COVID-NET.
|
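The source does not say which sparsification method the third component uses. As one common possibility, the sketch below applies top-k magnitude sparsification to a model update before transmission: only the k entries with the largest absolute value are sent, as index/value pairs. The names `sparsify_top_k` and `densify` are hypothetical, and the technique is assumed here for illustration only.

```python
from typing import Dict, List, Sequence


def sparsify_top_k(update: Sequence[float], k: int) -> Dict[int, float]:
    """Keep the k largest-magnitude entries as an {index: value} map.

    Assumed technique (top-k magnitude sparsification); the remaining
    entries are treated as zero and are not transmitted.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank indices by absolute value, largest first.
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def densify(sparse: Dict[int, float], length: int) -> List[float]:
    """Rebuild a dense vector on the receiver, filling gaps with zeros."""
    dense = [0.0] * length
    for index, value in sparse.items():
        dense[index] = value
    return dense


if __name__ == "__main__":
    update = [0.1, -2.0, 0.05, 1.5]
    sparse = sparsify_top_k(update, k=2)
    print(sparse)
    print(densify(sparse, len(update)))
```

With k much smaller than the model size, the payload exchanged between server and clients shrinks roughly in proportion, at the cost of dropping small updates.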
Current Status of Research Progress |
2: Progressing rather smoothly
Reason
The project is progressing well. I have finished developing the proposed components, which address the technical challenges of efficiently training machine learning models on heterogeneous storage, computing, and network resources. I integrated the three components into the proposed infrastructure on schedule. Moreover, the infrastructure was evaluated using state-of-the-art neural network models that detect COVID-19 cases from chest X-ray images; COVID-19 detection is one of the most popular machine learning applications involving privacy-sensitive data. In this evaluation, the ensemble model with heterogeneous structures, trained on six different hardware environments using the proposed infrastructure, achieved 5.39% higher accuracy than a single trained COVID-NET.
|
Strategy for Future Research Activity |
In the future, I will investigate the generality of the proposed infrastructure using a variety of machine learning applications with diverse model structures. I plan to evaluate the infrastructure on a large number of edge devices and then improve its resource utilization. I will also work on data security technology to strengthen the data protection mechanism in the infrastructure. Additionally, I will release the infrastructure as open-source software and make it available to international and domestic research communities, so that data privacy limitations and existing resource constraints no longer stand in the way of collaborative development of machine learning models.
|