Project/Area Number |
22K12038
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 60070: Information security-related
|
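Research Institution | National Institute of Information and Communications Technology |
Principal Investigator |
班 涛, National Institute of Information and Communications Technology, Cybersecurity Research Institute, Senior Researcher (80462878)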
|
Project Period (FY) |
2022-04-01 – 2025-03-31
|
Project Status |
Granted (FY 2022)
|
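Budget Amount *note |
¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
FY 2024: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
FY 2023: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
FY 2022: ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)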
|
Keywords | IoT malware analysis / static analysis / graph embedding / Explainable AI / machine learning / function call graph / Malware analysis / IoT malware / CPU architecture / Static analysis |
Outline of Research at the Start |
The diversity of CPU architectures and the resource constraints of IoT devices make conventional protection schemes impractical, which hinders malware prevention and countermeasures. We propose integrating advanced machine learning methods with security domain knowledge to build a practical IoT malware detection and prevention scheme that meets requirements on accuracy, computational and resource efficiency, adaptability to various application scenarios, and robustness against new attacks.
|
Outline of Annual Research Achievements |
To enhance the security of IoT devices, we studied malware protection schemes that safeguard these devices effectively and efficiently while remaining independent of the CPU architecture and robust to cyberattacks. We obtained the following results in FY 2022.
(1) We investigated using graph2vec to encode function call graphs, obtained by static analysis of IoT malware, for malware family classification. We proposed two methods to improve the feature representation: reinterpreting opcode sequences to give user-defined functions unified names, and integrating literal information into the embedding. Tested on a large-scale dataset of over 108K malware binaries, the proposed method achieved higher accuracy across various architectures and superior overall performance.
(2) We explored the use of Explainable Artificial Intelligence (XAI) to identify the features that distinguish malware families. We proposed the Color-coded Attribute Graph (CAG), which uses feature importance scores from classifier models to create a visual representation of malware samples. The results show that CAG is effective in interpreting machine learning-based methods for IoT malware classification, supporting more accurate analysis.
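To make item (1) more concrete, the following is a minimal sketch of Weisfeiler-Lehman (WL) relabeling, the step graph2vec relies on to turn rooted subgraphs into tokens before learning an embedding. It is not the project's implementation: the toy call graph, the initial labels, and the function name `wl_subgraph_tokens` are assumptions made for illustration, and the embedding training itself is omitted.

```python
import hashlib
from collections import Counter


def wl_subgraph_tokens(graph, labels, iterations=2):
    """Collect WL labels (rooted-subgraph tokens) for a directed call graph.

    graph:  dict mapping every node to the list of nodes it calls
    labels: dict mapping every node to an initial label string
    Both inputs are hypothetical stand-ins for a real function call graph.
    """
    current = dict(labels)
    tokens = Counter(current.values())  # iteration 0: the raw labels
    for _ in range(iterations):
        updated = {}
        for node, callees in graph.items():
            # Combine the node's label with the sorted labels of its callees.
            neighbor_labels = sorted(current[c] for c in callees)
            signature = current[node] + "|" + ",".join(neighbor_labels)
            digest = hashlib.sha1(signature.encode("utf-8")).hexdigest()
            updated[node] = digest[:8]
        current = updated
        tokens.update(current.values())
    return tokens


# Toy graph (made-up data): one entry point, one user function, one libc call.
call_graph = {"main": ["sub_a", "connect"], "sub_a": ["connect"], "connect": []}
initial_labels = {"main": "entry", "sub_a": "user", "connect": "libc:connect"}
tokens = wl_subgraph_tokens(call_graph, initial_labels)

# Same structure and labels under different node names, as might happen when
# a disassembler assigns arbitrary names to user-defined functions.
renamed_graph = {"start": ["sub_401000", "connect"],
                 "sub_401000": ["connect"], "connect": []}
renamed_labels = {"start": "entry", "sub_401000": "user",
                  "connect": "libc:connect"}
renamed_tokens = wl_subgraph_tokens(renamed_graph, renamed_labels)
```

Because the tokens depend only on labels and call structure, the two graphs above yield identical token counts. This illustrates why consistent labels for user-defined functions matter for the embedding.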
|
Current Status of Research Progress |
Current Status of Research Progress
1: Research has progressed more than originally planned.
Reason
In FY 2022, we carried out our planned research tasks, which included benchmark dataset collection, research on embedding methods, and research on static strings. The research output comprised one paper published in a top journal, one paper submitted to an international conference, and one research presentation.
Additionally, further research is in progress and is being written up for publication. One line of work focuses on the efficient implementation of string kernels for IoT malware analysis. We designed an efficient algorithm based on the suffix tree data structure for fast search of similar components across malware samples. This work aims to accelerate malware analysis, which is crucial for detecting and mitigating malware attacks on IoT devices.
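As an illustration of what a string kernel computes, the sketch below implements a k-spectrum kernel over opcode sequences with plain hash maps. It is a simpler stand-in, not the suffix-tree algorithm described above: the opcode sequences are made-up data and the function names are hypothetical. A suffix-tree implementation would compute the same kind of similarity without enumerating every k-mer explicitly.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous length-k subsequence (k-mer) of seq."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    if len(counts_a) > len(counts_b):
        # Iterate over the smaller table; the product is symmetric.
        counts_a, counts_b = counts_b, counts_a
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def normalized_kernel(a, b, k=3):
    """Cosine-normalized kernel in [0, 1]; 0.0 if either input has no k-mers."""
    denom = (spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k)) ** 0.5
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


# Made-up opcode sequences standing in for two disassembled samples.
ops_a = ["push", "mov", "call", "mov", "call", "ret"]
ops_b = ["mov", "call", "ret", "push", "mov", "call"]
shared = spectrum_kernel(ops_a, ops_b, k=2)        # 6
similarity = normalized_kernel(ops_a, ops_b, k=2)  # 6/7
```

The normalized value can be passed to any kernel-based classifier as a similarity score between samples.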
Furthermore, we are studying the detection of IoT malware in packed samples, which poses unique challenges for malware analysis. Our proposed solution uses feature selection to handle the ambiguous opcodes generated when unpacking fails. We aim to improve the accuracy and efficiency of malware detection on packed samples, contributing to the overall security of IoT devices.
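The text does not state which feature selection criterion is used. Purely as an illustration, the sketch below ranks opcode features by their mutual information with the family label and keeps the top-scoring ones, so that opcodes carrying no class information are dropped. The criterion, the function names, and the sample data are all assumptions.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between a discrete feature and labels."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    p_feature = Counter(feature_values)
    p_label = Counter(labels)
    score = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts.
        score += p_xy * math.log2(p_xy * n * n / (p_feature[x] * p_label[y]))
    return score


def select_top_features(samples, labels, top_k):
    """Rank opcodes by the information their presence carries about the label.

    samples: list of dicts mapping opcode -> count (hypothetical format)
    Returns the top_k opcode names and the full score table.
    """
    vocabulary = sorted({op for sample in samples for op in sample})
    scores = {
        op: mutual_information([op in sample for sample in samples], labels)
        for op in vocabulary
    }
    ranked = sorted(vocabulary, key=lambda op: (-scores[op], op))
    return ranked[:top_k], scores


# Made-up opcode histograms for four samples from two placeholder families.
samples = [
    {"mov": 3, "bl": 1},
    {"mov": 2, "bl": 2},
    {"mov": 1, "svc": 1},
    {"mov": 4, "svc": 2},
]
labels = ["family_a", "family_a", "family_b", "family_b"]
selected, scores = select_top_features(samples, labels, top_k=2)
```

In this toy data, `mov` appears in every sample and scores zero, while `bl` and `svc` each separate the two families perfectly and are selected.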
|
Strategy for Future Research Activity |
In FY 2023, we aim to improve the effectiveness and efficiency of our IoT malware protection scheme by applying recent learning algorithms. We will use Word2Vec, Doc2Vec, and FastText to preprocess high-dimensional vectors and evaluate their performance with deep neural networks, including Convolutional Neural Networks and Recurrent Neural Networks. We will compare these algorithms against the conventional methods we examined previously, such as Random Forest, Support Vector Machine, and Neural Networks. Our objective is a high level of generalization performance for the protection scheme.
In addition, we will continue our research on detecting IoT malware in packed samples, which poses unique challenges for malware analysis. Our proposed solution uses feature selection to handle the ambiguous opcodes generated when unpacking fails. The aim is to improve the accuracy and efficiency of malware detection on packed samples, thereby improving the overall security of IoT devices.
In FY 2024, we plan to adopt adversarial learning to strengthen the model's resilience against obfuscation techniques. We will use Generative Adversarial Networks, a type of generative deep learning algorithm, to generate adversarial data instances that improve the models' robustness. After verifying performance on benchmark datasets, we plan to develop a prototype of the protection scheme and test it on popular IoT devices.
|