2022 Fiscal Year Research-status Report
Data locality for sparse matrices via advanced optimisations in large-scale scientific programs
Project/Area Number |
22K17900
|
Research Institution | Institute of Physical and Chemical Research |
Principal Investigator |
Vatai Emil RIKEN, Center for Computational Science, Research Scientist (70889633)
|
Project Period (FY) |
2022-04-01 – 2026-03-31
|
Keywords | sparsity / data locality / spin chemistry / machine learning / NLP / linear solvers / numerical methods |
Outline of Annual Research Achievements |
Of the three topics, s-step CG progressed the least, and this direction remains mostly unexplored: with an intern, we made only minor progress in implementing s-step CG and combining it with DMPK.
We recently started working on sparsity and explainability in NLP with another student. We have looked at different approaches to sparsification, such as the STEN library, matrix decompositions, and graph neural networks (GNNs), including graph convolutional networks (GCNs). We have gained a comprehensive understanding of the challenges of sparsity in ML training, such as density creeping back quickly into the weights (because sums and products of sparse matrices easily become dense), as well as the problem of dense gradients.
Finally, the most promising direction is a collaboration with spin chemists on the topic of radical pair simulations. We have developed a Python package for simulating radical pairs, called RadicalPy. The package includes various components that can be used to construct simulations of various experiments in the field: construction of Hamiltonians modelling the different forces in the reactions, kinetics and relaxation mechanisms, experiment schemes, data conversion and plotting tools, and a molecule database that enables easy and convenient loading of molecules and isotopes.
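The fill-in effect mentioned above can be illustrated with a minimal sketch. It is not taken from the project's code: the matrix size, the 1% density, and the function names `density` and `fill_in_demo` are assumptions chosen for illustration. It measures how the fraction of non-zero entries grows when two random sparse matrices are added or multiplied.

```python
import scipy.sparse as sp


def density(m):
    """Fraction of stored non-zero entries in a sparse matrix."""
    return m.nnz / (m.shape[0] * m.shape[1])


def fill_in_demo(n=500, d=0.01, seed=0):
    """Compare densities of A, A + B, A @ B and A @ A @ A.

    A and B are random n-by-n sparse matrices with density d.
    All sizes and densities here are illustrative assumptions.
    """
    a = sp.random(n, n, density=d, format="csr", random_state=seed)
    b = sp.random(n, n, density=d, format="csr", random_state=seed + 1)
    return {
        "a": density(a),
        "a_plus_b": density(a + b),
        "a_times_b": density(a @ b),
        "a_cubed": density(a @ a @ a),
    }


if __name__ == "__main__":
    for name, value in fill_in_demo().items():
        print(f"{name:10s} density = {value:.4f}")
```

With these settings the sum is roughly twice as dense as either operand, and each further product increases the density again, which is why a sparsity pattern imposed on weights tends not to survive repeated matrix operations unless it is enforced explicitly.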
|
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
The research has been going smoothly, albeit not exactly in the planned direction. The planned research was mostly about applying DMPK to the s-step CG method for solving sparse linear systems; however, this direction did not bear much fruit.
The radical pair simulations, on the other hand, are a fertile direction in which we have already made substantial contributions in the form of the RadicalPy Python package. The original problem, which came from the spin chemists, was that the number of particles that can be simulated is very limited, since the size of the matrix describing the simulated system grows (super-)exponentially as the number of particles increases. Problems with such huge matrices are solvable only on supercomputers, which motivated us to start investigating this task. It turns out that the matrix of the system is extremely sparse, so it fits perfectly into this work. RadicalPy is the first iteration of this grand simulation software. It is written using numpy and scipy, and it is not yet meant for large multi-node distributed simulations. In the context of large simulations (of 30+ particles), it is just a toy example; however, it is written in a very user-friendly way, complete with full documentation and continuous integration/deployment (CI/CD), and so it serves as a complete software tool for spin chemists exploring reactions involving radical pairs.
The sparsity and explainability in NLP direction is relatively new and has been progressing smoothly so far.
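The growth and sparsity described above can be made concrete with a small sketch. It does not use the RadicalPy API and is not the Hamiltonian of any real radical pair: it assumes a hypothetical chain of n spin-1/2 particles with a Zeeman term and nearest-neighbour isotropic coupling, and the names `embed`, `toy_hamiltonian` and `sparsity_report`, as well as the parameter values, are illustrative. The matrix dimension is 2^n, while the number of non-zero entries grows far more slowly than the number of matrix elements.

```python
import numpy as np
import scipy.sparse as sp

# Spin-1/2 operators (in units where hbar = 1).
SX = sp.csr_matrix(np.array([[0, 1], [1, 0]], dtype=complex) / 2)
SY = sp.csr_matrix(np.array([[0, -1j], [1j, 0]], dtype=complex) / 2)
SZ = sp.csr_matrix(np.array([[1, 0], [0, -1]], dtype=complex) / 2)


def embed(op, i, n):
    """Single-spin operator `op` acting on spin i of n, identity elsewhere."""
    left = sp.identity(2**i, dtype=complex, format="csr")
    right = sp.identity(2 ** (n - i - 1), dtype=complex, format="csr")
    return sp.kron(sp.kron(left, op), right, format="csr")


def toy_hamiltonian(n, omega=1.0, coupling=0.5):
    """Zeeman term plus nearest-neighbour S_i . S_(i+1) coupling (toy model)."""
    dim = 2**n
    h = sp.csr_matrix((dim, dim), dtype=complex)
    for i in range(n):
        h = h + omega * embed(SZ, i, n)
    for i in range(n - 1):
        for s in (SX, SY, SZ):
            h = h + coupling * (embed(s, i, n) @ embed(s, i + 1, n))
    h.eliminate_zeros()  # drop entries that cancelled exactly
    return h


def sparsity_report(max_n=8):
    """Return (n, dimension, nnz, density) for n = 2 .. max_n."""
    rows = []
    for n in range(2, max_n + 1):
        h = toy_hamiltonian(n)
        rows.append((n, h.shape[0], h.nnz, h.nnz / h.shape[0] ** 2))
    return rows


if __name__ == "__main__":
    for n, dim, nnz, dens in sparsity_report():
        print(f"n={n:2d}  dim={dim:5d}  nnz={nnz:6d}  density={dens:.5f}")
```

In this toy model a dense representation needs 4^n entries, whereas the sparse one stores at most (n + 1) * 2^(n - 1) non-zeros, so the density falls quickly as particles are added. A realistic system has different couplings and spin multiplicities, so the exact counts would differ.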
|
Strategy for Future Research Activity |
As we have made the least progress on the topic of s-step CG with DMPK for sparse iterative linear solvers, we will continue to pursue this direction as time and opportunity permit, with reduced priority.
The main focus will be on spin chemistry, specifically radical pair simulations. The programming of the RadicalPy Python package is mostly done; however, the documentation needs some work. The immediate plan for future work is to finish the RadicalPy documentation, which we are doing in parallel with writing a paper about RadicalPy. While there is ample opportunity for improvement and expansion of RadicalPy, the main task for the future is a distributed simulation software package for large radical pair simulations (written in C/C++, using MPI or a similar library such as PETSc). This simulation software will eventually be integrated into RadicalPy, but it will first be developed as a separate piece of software. Its development will require a lot of work and research, and might take more than one year.
|
Causes of Carryover |
I had to reallocate some funds due to unexpectedly high airplane ticket prices. I also misunderstood which budget category online interns belong to, and this will require some budget adjustments.
|
Research Products
(4 results)