Project/Area Number |
23K16864
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 60050:Software-related
|
Research Institution | Kyushu University |
Principal Investigator |
王 棟 (Wang Dong), Kyushu University, Faculty of Information Science and Electrical Engineering, Assistant Professor (30965075)
|
Project Period (FY) |
2023-04-01 – 2025-03-31
|
Project Status |
Discontinued (Fiscal Year 2023)
|
Budget Amount |
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2025: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2024: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2023: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
|
Keywords | Code Review / Information Need / Software Engineering / Information Needs / Repository Mining |
Outline of Research at the Start |
To meet developers' information needs and facilitate effective code review, the applicant proposes a framework for an intelligent, non-intrusive notification mechanism that automatically and seamlessly recommends the information developers should be aware of, instantly and dynamically, during the code review process.
|
Outline of Annual Research Achievements |
In FY2023, I established the research environment and began mining information from social coding platforms such as GitHub and OpenStack. As part of this work, I have been investigating developers' activities across various development channels, including code review channels, GitHub Discussions, and GitHub Issues. I have now collected data from over 10 million GitHub repositories and am ready for the next stage. The resulting publications are summarized below.
- Information needs of continuous integration. I worked with international collaborators on an empirical study to understand the software waste resulting from misuse of the recheck command on continuous integration failures.
- Information spread across various channels. Specifically, I conducted a study investigating developer activities on GitHub Discussions. The results suggested that, in addition to issues, many code reviews were mentioned or converted in GitHub Discussions.
- Other developer activities. I also studied developer communication through issues (i.e., the use of visuals to report bugs) and code comments (i.e., self-admitted technical debt).
|
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
So far, I have collected a large amount of data from open-source software ecosystems. This has resulted in several publications in top international journals and conferences.
These publications help fill the gap in understanding developers' information needs during the code review process (such as continuous integration information and cross-channel knowledge).
Large language models have demonstrated impressive performance in a variety of recommendation tasks. This could further demonstrate the feasibility of automated information recommendation and accelerate research progress.
|
Strategy for Future Research Activity |
The next step is to further mine developers' information needs from other communication channels, particularly issues, in order to establish a relationship between code reviews and issues. Inspired by state-of-the-art Retrieval-Augmented Generation (RAG) technology, I plan to construct a high-quality knowledge graph designed specifically for code review activities, based on the information and knowledge spread across the various communication channels. The knowledge graph would serve as the foundation for employing large language models to automate information recommendation for code reviews.
|