Summary of Research Achievements |
I proposed an infrastructure that allows individuals to collaboratively develop machine learning models in their own environments, which are usually heterogeneous. It lets researchers work together and potentially build better models than large companies can. The infrastructure applies federated learning to train models while preserving data privacy. It comprises three components that support efficient training across diverse storage, computing, and network resources. The first reduces the model size to fit the storage capacity of each heterogeneous environment. The second aggregates models trained on heterogeneous computing resources. The third sparsifies the model so that it can be exchanged efficiently between the server and clients. I evaluated the infrastructure using state-of-the-art neural network models that detect COVID-19 cases from chest X-ray images; COVID-19 detection is one of the most popular machine learning applications involving privacy-sensitive data. The resulting ensemble model, with heterogeneous structures trained on six different hardware environments, achieved 5.39% higher accuracy than a single trained COVID-NET.
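To make the sparsify-then-aggregate idea concrete, the following is a minimal sketch and not the project's implementation. The report does not specify its algorithms, so the sketch assumes top-k magnitude sparsification of a flat weight update on each client and a sample-count-weighted average (in the style of federated averaging) on the server. All function names are hypothetical, and the sketch assumes every client shares the same weight layout, which the heterogeneous-structure aggregation described above does not require.

```python
from typing import Dict, List, Tuple


def sparsify_top_k(update: List[float], k: int) -> Dict[int, float]:
    """Keep only the k largest-magnitude entries of a flat update.

    Returns a sparse {index: value} mapping, which is what a client
    would send to the server instead of the full dense update.
    (Assumed technique: top-k magnitude sparsification.)
    """
    if k <= 0:
        return {}
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in ranked[:k]}


def densify(sparse: Dict[int, float], size: int) -> List[float]:
    """Expand a sparse update back into a dense list of the given size."""
    dense = [0.0] * size
    for i, v in sparse.items():
        dense[i] = v
    return dense


def federated_average(
    global_weights: List[float],
    client_updates: List[Tuple[Dict[int, float], int]],
) -> List[float]:
    """Apply the sample-count-weighted mean of sparse client updates.

    Each element of client_updates is (sparse_update, num_samples).
    (Assumed technique: weighted averaging over identically shaped weights.)
    """
    total = sum(n for _, n in client_updates)
    if total == 0:
        return list(global_weights)
    merged = [0.0] * len(global_weights)
    for sparse, n in client_updates:
        for i, v in sparse.items():
            merged[i] += v * n / total
    return [w + d for w, d in zip(global_weights, merged)]


# Two hypothetical clients, each sending only its 2 largest update entries.
client_a = (sparsify_top_k([0.5, -0.1, 0.0, 2.0], k=2), 100)
client_b = (sparsify_top_k([0.1, -1.0, 0.3, 0.0], k=2), 300)
new_global = federated_average([0.0, 0.0, 0.0, 0.0], [client_a, client_b])
```

In this sketch each client transmits two values instead of four, and the client with more training samples contributes proportionally more to the averaged update.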
|
Current Status of Research Progress (Category) |
2: Generally progressing smoothly
Reason
The project is progressing well. I have finished developing the components that address the technical challenges of efficiently training machine learning models on heterogeneous storage, computing, and network resources, and I integrated the three components into the proposed infrastructure on schedule. Moreover, I evaluated the infrastructure using state-of-the-art neural network models that detect COVID-19 cases from chest X-ray images; COVID-19 detection is one of the most popular machine learning applications involving privacy-sensitive data. The resulting ensemble model, with heterogeneous structures trained on six different hardware environments, achieved 5.39% higher accuracy than a single trained COVID-NET.
|
Strategy for Future Research Activity |
In the future, I will investigate the generality of the proposed infrastructure using a variety of machine learning applications with diverse model structures. I plan to evaluate the infrastructure on a large number of edge devices and then improve its resource utilization. I will also work on data security technology to strengthen the infrastructure's data protection mechanism. Additionally, I will release the infrastructure as open-source software, making it available to international and domestic research communities, so that data privacy limitations and existing resource constraints no longer stand as barriers to the collaborative development of machine learning models.
|