Project/Area Number |
19K11993
|
Research Institution | RIKEN |
Principal Investigator |
GEROFI BALAZS RIKEN, Center for Computational Science, Senior Scientist (70633501)
|
Project Period (FY) |
2019-04-01 – 2022-03-31
|
Keywords | Memory access tracking / Neural network training / I/O of deep learning |
Outline of Annual Research Achievements |
We completed the extension to the gem5 simulator that supports heterogeneous memory systems, adding the ability to define an arbitrary number of memory devices, each with its own performance characteristics. We also completed the Python interface for real-time communication of memory accesses between gem5 and PyTorch, and developed test code that runs simple analyses on the captured data. Because of gem5's high runtime overhead, we also started working on a simplified simulator based on the leading-loads model that uses gem5 results; this runtime estimator will be better suited to being plugged into a reinforcement learning framework. As a side topic, we started exploring the I/O implications of large-scale training, a necessary consideration for distributed training of large neural networks in supercomputing environments.
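To make the leading-loads idea concrete, the following is a minimal sketch and not the project's estimator. It assumes the textbook formulation of the model: a cache miss is "leading" if it is issued while no other miss is outstanding, and only leading misses contribute to memory stall time. The `MemoryDevice` class, the single fixed latency per device, and the trace format of `(issue_cycle, complete_cycle)` pairs are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class MemoryDevice:
    """Hypothetical description of one memory device."""
    name: str
    latency_cycles: int  # assumed average load-miss latency in CPU cycles


def count_leading_loads(misses: Iterable[Tuple[int, int]]) -> int:
    """Count misses that are issued while no other miss is outstanding.

    Each miss is an (issue_cycle, complete_cycle) pair.
    """
    leading = 0
    busy_until = -1  # cycle at which the latest outstanding miss completes
    for issue, complete in sorted(misses):
        if issue >= busy_until:
            leading += 1  # nothing outstanding: this miss starts a new burst
        busy_until = max(busy_until, complete)
    return leading


def estimate_runtime(total_cycles: int,
                     misses: Iterable[Tuple[int, int]],
                     baseline: MemoryDevice,
                     target: MemoryDevice) -> int:
    """Rescale the memory-stall part of a measured run to another device.

    Assumption: stall time equals (number of leading loads) x (device latency);
    the remaining cycles are treated as latency-independent compute time.
    """
    n_lead = count_leading_loads(misses)
    compute_cycles = total_cycles - n_lead * baseline.latency_cycles
    return compute_cycles + n_lead * target.latency_cycles


# Illustrative numbers only; these are not measured values.
dram = MemoryDevice("DRAM", latency_cycles=100)
slow = MemoryDevice("slow-tier", latency_cycles=300)
trace = [(0, 100), (10, 110), (300, 400)]  # the second miss overlaps the first
estimated = estimate_runtime(1000, trace, dram, slow)
```

With this trace, two of the three misses are leading, so only two device latencies are rescaled when moving from the baseline to the slower device. An estimator of this shape runs in time linear in the trace length (after sorting), which is what makes it attractive compared with a full cycle-level simulation.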
|
Current Status of Research Progress |
4: Delayed
Reason
The postdoctoral researcher who was scheduled to work on this project could not come to Japan due to COVID-19 and resigned from his RIKEN position. We currently lack the personnel needed for the project to progress as originally planned.
|
Strategy for Future Research Activity |
We will continue implementing the leading-loads-based runtime estimator and exploring memory-sensitive applications. We will start investigating an alternative runtime estimator based on precise event-based sampling on heterogeneous memory platforms (primary targets: Intel Optane+DRAM or DRAM+MCDRAM configurations). We will also continue developing I/O improvements for large-scale training.
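As a rough illustration of how sampled memory accesses could feed such an estimator, the sketch below attributes sampled data addresses to memory devices by address range and scales the counts by the sampling period. It is a sketch under stated assumptions, not the planned implementation: the address map, the device names, and the `accesses_per_device` helper are hypothetical, and the samples are plain integers rather than records read from a real sampling facility.

```python
from bisect import bisect_right
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical address map entry: (start, end_exclusive, device_name)
AddressRange = Tuple[int, int, str]


def accesses_per_device(samples: Iterable[int],
                        ranges: List[AddressRange],
                        sampling_period: int) -> Dict[str, int]:
    """Estimate access counts per device from sampled data addresses.

    Each sample stands for roughly `sampling_period` real accesses.
    Samples that fall outside every range are ignored.
    """
    ordered = sorted(ranges)
    starts = [start for start, _, _ in ordered]
    counts: Counter = Counter()
    for addr in samples:
        i = bisect_right(starts, addr) - 1  # last range starting at or before addr
        if i >= 0 and addr < ordered[i][1]:
            counts[ordered[i][2]] += sampling_period
    return dict(counts)


# Illustrative address map and samples only; these are not measured values.
address_map = [(0x0000, 0x1000, "DRAM"), (0x1000, 0x3000, "NVM")]
sampled_addresses = [0x10, 0x1800, 0x2000, 0x5000]  # the last one is unmapped
per_device = accesses_per_device(sampled_addresses, address_map, sampling_period=1000)
```

Per-device counts of this kind could then be combined with per-device latencies to approximate the memory time of a run without simulating it.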
|
Causes of Carryover |
Most of the funds will be used to rent compute capacity for running experiments. Depending on the COVID-19 situation, some of the funds may be used for international travel.
|