2023 Fiscal Year Research-status Report
Towards Efficient Code Review: Automatic Recommendation of Needed Information
Project/Area Number | 23K16864 |
Research Institution | Kyushu University |
Principal Investigator | 王 棟, Kyushu University, Faculty of Information Science and Electrical Engineering, Assistant Professor (30965075) |
Project Period (FY) | 2023-04-01 – 2025-03-31 |
Keywords | Code Review / Information Need |
Outline of Annual Research Achievements |
In FY2023, I established the research environment and began mining data from social coding platforms such as GitHub and OpenStack. As part of this work, I have been investigating developers' activities across several development channels, including code review channels, GitHub Discussions, and GitHub Issues. Data from over 10 million GitHub repositories has now been collected, and the project is ready for the next stage. The publications achieved so far are summarized below.
- Information needs in continuous integration. I worked with international collaborators on an empirical study of the software waste caused by misuse of the recheck command on continuous integration failures.
- Information spread across various channels. I conducted a study of developer activities on GitHub Discussions. The results suggest that, in addition to issues, many code reviews were mentioned or converted in GitHub Discussions.
- Other developer activities. I also examined developer communication through issues (i.e., the use of visuals to report bugs) and code comments (i.e., self-admitted technical debt).
|
Current Status of Research Progress |
Current Status of Research Progress
2: Research has progressed on the whole more than it was originally planned.
Reason
So far, I have collected a large amount of data from open-source software ecosystems. This has resulted in several publications in top international journals and conferences.
These publications help fill the gap in understanding the information developers need during the code review process (such as continuous integration information and cross-channel knowledge).
Large language models have demonstrated impressive performance in a variety of recommendation tasks. This further supports the feasibility of automated information recommendation and could accelerate research progress.
|
Strategy for Future Research Activity |
The next step is to further mine developers' information needs from other communication channels, particularly issues, in order to establish the relationship between code reviews and issues. Inspired by state-of-the-art Retrieval-Augmented Generation (RAG) technology, I plan to construct a high-quality knowledge graph designed specifically for code review activities, based on the information and knowledge spread across the various communication channels. This knowledge graph would be the foundation for employing large language models to automate information recommendation for code reviews.
|
Causes of Carryover |
A top international conference (ICSE) takes place in April. The cost of this trip is relatively high because the conference is held in Portugal, so I carried over part of the budget to cover it.
|