Summary of Research Achievements |
Our research was centered around bringing fast object-oriented programming (OOP) to GPUs. We improved our previously developed CUDA framework "DynaSOAr". One of DynaSOAr's main components is a dynamic memory manager for object-oriented Single-Method Multiple-Objects (SMMO) applications.
The main improvement to DynaSOAr was "CompactGpu", a memory defragmentation system. CompactGpu can improve the performance of memory-bound GPU applications by storing allocations in a denser format: it physically rearranges objects in memory so that they are stored in a few compact blocks. With dynamic memory allocation, fragmentation is often caused by unfavorable allocate-deallocate patterns.
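The idea of rearranging objects into a few compact blocks can be illustrated with a minimal, sequential host-side sketch. It is not CompactGpu's actual (parallel, GPU-side) algorithm: the `Block` type, `kSlotsPerBlock`, and `compact_blocks` are hypothetical names chosen for illustration, and the sketch ignores the pointer updates that a real defragmentation system needs after moving objects.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block layout: a fixed number of object slots per block,
// with one occupancy flag per slot.
constexpr std::size_t kSlotsPerBlock = 4;

struct Block {
  std::array<int, kSlotsPerBlock> payload{};  // stand-in for object data
  std::array<bool, kSlotsPerBlock> used{};    // true if the slot holds an object
};

// Moves objects from the back of the heap into free slots at the front,
// so that all live objects end up in the leading blocks.
// Returns the number of blocks that still contain objects.
std::size_t compact_blocks(std::vector<Block>& blocks) {
  const std::size_t total = blocks.size() * kSlotsPerBlock;
  std::size_t lo = 0;      // candidate free slot, scanning forward
  std::size_t hi = total;  // one past the candidate used slot, scanning backward
  for (;;) {
    while (lo < hi && blocks[lo / kSlotsPerBlock].used[lo % kSlotsPerBlock]) ++lo;
    while (hi > lo &&
           !blocks[(hi - 1) / kSlotsPerBlock].used[(hi - 1) % kSlotsPerBlock]) --hi;
    if (lo == hi) break;  // every slot below lo is used, every slot from hi on is free
    Block& src = blocks[(hi - 1) / kSlotsPerBlock];
    Block& dst = blocks[lo / kSlotsPerBlock];
    dst.payload[lo % kSlotsPerBlock] = src.payload[(hi - 1) % kSlotsPerBlock];
    dst.used[lo % kSlotsPerBlock] = true;
    src.used[(hi - 1) % kSlotsPerBlock] = false;
  }
  assert(lo <= total);
  // lo now equals the number of live objects; round up to whole blocks.
  return (lo + kSlotsPerBlock - 1) / kSlotsPerBlock;
}
```

For example, four objects scattered over three of four blocks would be packed into the first block, so later traversals touch one block instead of three.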
Memory defragmentation proved to be particularly useful for two kinds of applications: (1) applications that use only a small amount of memory and fit largely into the L1/L2 cache if stored densely, and (2) applications that already benefit from a Structure of Arrays (SOA) memory layout, which get even faster because vector loads and stores are more efficient if data is stored in a compact form. Our work, published at ISMM 2019, achieved a speedup of up to 16% in our benchmark applications.
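To make the SOA point concrete, the following host-side sketch contrasts an array-of-structures layout with a structure-of-arrays layout. The `ParticleAos` and `ParticlesSoa` types and their fields are hypothetical and are not taken from DynaSOAr; the sketch only shows why values of one field sit next to each other in SOA, which is the property that vectorized or coalesced accesses rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structures: the fields of one object are adjacent in memory.
struct ParticleAos {
  float x;
  float y;
  float mass;
};

// Structure of arrays: all values of one field are adjacent in memory.
struct ParticlesSoa {
  std::vector<float> x, y, mass;
  explicit ParticlesSoa(std::size_t n) : x(n), y(n), mass(n) {}
};

// Distance in bytes between the `mass` fields of two consecutive objects.
constexpr std::size_t mass_stride_aos() { return sizeof(ParticleAos); }
constexpr std::size_t mass_stride_soa() { return sizeof(float); }

// Reads only `mass`, but steps over x and y of every object as well.
float total_mass_aos(const std::vector<ParticleAos>& ps) {
  float sum = 0.0f;
  for (const ParticleAos& p : ps) sum += p.mass;
  return sum;
}

// Reads one dense array; gaps left by deleted objects would reduce this density.
float total_mass_soa(const ParticlesSoa& ps) {
  float sum = 0.0f;
  for (float m : ps.mass) sum += m;
  return sum;
}
```

Both functions compute the same result; the difference lies in how far apart consecutive `mass` values are, which is one `float` in the SOA layout and a whole object in the AOS layout.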
|