Autorepairability: creating and disseminating a new software quality indicator

Research Project

Project/Area Number	21K18302
Research Category	Grant-in-Aid for Challenging Research (Pioneering)
Allocation Type	Multi-year Fund
Review Section	Medium-sized Section 60: Information science, computer engineering, and related fields
Research Institution	Osaka University
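Principal Investigator	肥後芳樹 (Yoshiki Higo), Osaka University, Graduate School of Information Science and Technology, Professor (70452414)
Co-Investigator (Kenkyū-buntansha)	林晋平 (Shinpei Hayashi), Tokyo Institute of Technology, School of Computing, Associate Professor (40541975); 松本真佑 (Shinsuke Matsumoto), Osaka University, Graduate School of Information Science and Technology, Assistant Professor (90583948)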
Project Period (FY)	2021-07-09 – 2025-03-31
Project Status	Granted (Fiscal Year 2023)
Budget Amount	¥24,440,000 (Direct Cost: ¥18,800,000, Indirect Cost: ¥5,640,000)
  Fiscal Year 2024: ¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
  Fiscal Year 2023: ¥4,940,000 (Direct Cost: ¥3,800,000, Indirect Cost: ¥1,140,000)
  Fiscal Year 2022: ¥7,670,000 (Direct Cost: ¥5,900,000, Indirect Cost: ¥1,770,000)
  Fiscal Year 2021: ¥7,280,000 (Direct Cost: ¥5,600,000, Indirect Cost: ¥1,680,000)
Keywords	Automated program repair / Mutation testing / Large language models / Code clones / Program analysis / Autorepairability / Automated test generation / Software quality / Fault localization
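Outline of Research at the Start	Autorepairability, as proposed in this research, is a new software quality indicator that expresses the affinity between target software and automated program repair (APR) techniques. The research addresses three items. Item A clarifies that the way source code is implemented affects the success or failure of APR. Item B devises a method for automatically measuring autorepairability based on the characteristics of bugs. Item C devises a method for automatically transforming programs with low autorepairability into programs with high autorepairability.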
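Outline of Annual Research Achievements	In FY2023, we worked on improving the accuracy of code clone detection based on large language models, using the large-scale database of functionally equivalent methods created in FY2022. Clone detection with large language models is known to detect code clones with low syntactic similarity more accurately than conventional detection techniques. However, with GPT-3.5-turbo and GPT-4, accuracy on clones with low syntactic similarity cannot be said to be sufficiently high, and Llama2 currently judges almost all method pairs to be code clones. We therefore attempted to improve detection accuracy by fine-tuning these models on functionally equivalent methods. As a result, for GPT-3.5-turbo, false positives decreased but false negatives increased. A similar tendency was observed for Llama2, and we confirmed that overall detection accuracy improved. We also used the functionally equivalent method database to measure autorepairability. We measured autorepairability for both methods of each functionally equivalent pair and investigated the cases in which the values differ. For Java, we found, among other things, that using the ternary operator yields a higher value than writing consecutive if statements, and that writing a single if statement whose condition combines several simple conditions (and is therefore more complex) yields a higher value than writing multiple if statements with simple conditions. These results clarified, to some extent, how humans should write programs so that bugs are easier to fix with automated program repair techniques.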
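The two Java findings reported above can be illustrated with a small sketch of functionally equivalent method pairs. All class and method names here are hypothetical and do not come from the project's dataset. The example only shows what "consecutive if statements versus a ternary operator" and "several simple if statements versus one combined condition" might look like; it does not measure autorepairability, and it assumes `low <= high`.

```java
// Hypothetical illustration; names are not taken from the project's dataset.
class EquivalentPairSketch {

    // Pair 1, variant A: consecutive if statements.
    public static int clampWithIfs(int value, int low, int high) {
        if (value < low) {
            return low;
        }
        if (value > high) {
            return high;
        }
        return value;
    }

    // Pair 1, variant B: the same behavior written with ternary operators.
    public static int clampWithTernary(int value, int low, int high) {
        return value < low ? low : (value > high ? high : value);
    }

    // Pair 2, variant A: several if statements, each with a simple condition.
    public static boolean inRangeWithSeparateIfs(int value, int low, int high) {
        if (value < low) {
            return false;
        }
        if (value > high) {
            return false;
        }
        return true;
    }

    // Pair 2, variant B: one if statement with a combined condition.
    public static boolean inRangeWithMergedIf(int value, int low, int high) {
        if (value < low || value > high) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] samples = {-5, 0, 3, 10, 42};
        for (int v : samples) {
            // Both variants of each pair must agree on every sample input.
            if (clampWithIfs(v, 0, 10) != clampWithTernary(v, 0, 10)) {
                throw new AssertionError("clamp variants disagree for " + v);
            }
            if (inRangeWithSeparateIfs(v, 0, 10) != inRangeWithMergedIf(v, 0, 10)) {
                throw new AssertionError("inRange variants disagree for " + v);
            }
        }
        System.out.println("All variants agree on the sample inputs.");
    }
}
```

In each pair, variant B is written in the style for which the report observed higher autorepairability values; whether that holds for any particular method would have to be measured.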
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: No particular problems have arisen, and the research is progressing smoothly.
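Strategy for Future Research Activity	In FY2024, we plan to build databases of functionally equivalent methods/functions for C and Python as well. The resulting databases will be published on GitHub or a similar platform so that other researchers can use them. The FY2023 attempt to improve LLM-based code clone detection using the functionally equivalent method database was still small in experimental scale and cannot be said to have produced sufficient results, so in FY2024 we will run larger-scale experiments with a variety of models. We also plan to investigate how autorepairability changes as source code undergoes bug fixes and feature additions. We believe this will make it possible to evaluate software evolution from the perspective of autorepairability as a quality indicator.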

Report

(4 results)

2023 Research-status Report
2022 Research-status Report
2021 Comments on the Screening Results
2021 Research-status Report

Research Products
(41 results)

Journal Article (6 results) (of which Peer Reviewed: 6 results, Open Access: 2 results) / Presentation (34 results) (of which Int'l Joint Research: 12 results) / Remarks (1 result)

[Journal Article] Program Slice-based Crossover for Automated Program Generation (2024)
- Author(s)
  渡辺大登, 松本真佑, 肥後芳樹, 楠本真二, 倉林利行, 切貫弘之, 丹野治門
- Journal Title
  情報処理学会論文誌 (IPSJ Journal)
  Volume: 65 Issue: 3 Pages: 718-728
- DOI
  10.20729/00233254
- ISSN
  1882-7764
- Year and Date
  2024-03-15
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Dataset of Functionally Equivalent Java Methods and Its Application to Evaluating Clone Detection Tools (2024)
- Author(s)
  Yoshiki Higo
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E107-D
- Related Report
  2023 Research-status Report
- Peer Reviewed
[Journal Article] Evaluation of Deep Learning-Based Code Clone Detection Methods Using SemanticCloneBench (2024)
- Author(s)
  鶴智秋, 松下誠, 肥後芳樹
- Journal Title
  電子情報通信学会論文誌D (IEICE Transactions D, Japanese edition)
  Volume: J107-D
- Related Report
  2023 Research-status Report
- Peer Reviewed
[Journal Article] Introducing a Multi-Objective Genetic Algorithm to Automated Program Generation: Aiming at Complementary Individual Selection (2022)
- Author(s)
  渡辺大登, 松本真佑, 肥後芳樹, 楠本真二, 倉林利行, 切貫弘之, 丹野治門
- Journal Title
  情報処理学会論文誌 (IPSJ Journal)
  Volume: 63 Pages: 1564-1573
- Related Report
  2022 Research-status Report
- Peer Reviewed
[Journal Article] Historinc: An Incremental Repository Transformation Tool for Fine-Grained History Tracking (2022)
- Author(s)
  柴駿太, 林晋平
- Journal Title
  コンピュータソフトウェア (Computer Software)
  Volume: 39 Pages: 75-85
- Related Report
  2022 Research-status Report
- Peer Reviewed
[Journal Article] Supporting Proactive Refactoring: An Exploratory Study on Decaying Modules and Their Prediction (2021)
- Author(s)
  Natthawute Sae-Lim, Shinpei Hayashi, Motoshi Saeki
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E104.D Issue: 10 Pages: 1601-1615
- DOI
  10.1587/transinf.2020EDP7255
- NAID
  130008095639
- ISSN
  0916-8532, 1745-1361
- Year and Date
  2021-10-01
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Presentation] Osmy: A Tool for Periodic Software Vulnerability Assessment and File Integrity Verification using SPDX Documents (2024)
- Author(s)
  Rio Kishimoto
- Organizer
  the 31st IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER2024)
- Related Report
  2023 Research-status Report
- Int'l Joint Research
[Presentation] Autorepairability: A New Software Quality Characteristic (2024)
- Author(s)
  Pongpop Lapvikai
- Organizer
  the 31st IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER2024)
- Related Report
  2023 Research-status Report
- Int'l Joint Research
[Presentation] Improving the Accuracy of LLM-Based Code Clone Detection Using a Dataset of Functionally Equivalent Methods (2024)
- Author(s)
  井上龍太郎
- Organizer
  信学技報 (IEICE Technical Report)
- Related Report
  2023 Research-status Report
[Presentation] Impacts of Program Structures on Code Coverage of Generated Test Suites (2023)
- Author(s)
  Ryoga Watanabe
- Organizer
  the 24th International Conference on Product-Focused Software Process Improvement
- Related Report
  2023 Research-status Report
- Int'l Joint Research
[Presentation] Do Exceptional Behavior Tests Matter on Spectrum-Based Fault Localization? (2023)
- Author(s)
  Haruka Yoshioka
- Organizer
  the 24th International Conference on Product-Focused Software Process Improvement (PROFES2023)
- Related Report
  2023 Research-status Report
- Int'l Joint Research
[Presentation] PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets (2023)
- Author(s)
  Shiyu Yang
- Organizer
  the 31st IEEE/ACM International Conference on Program Comprehension (ICPC2023)
- Related Report
  2023 Research-status Report
- Int'l Joint Research
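[Presentation] Constructing a Dataset of Functionally Equivalent Methods Using Automated Test Generation Techniques (2023)
- Author(s)
  肥後芳樹
- Organizer
  ソフトウェアエンジニアリングシンポジウム2023 (Software Engineering Symposium 2023)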
- Related Report
  2023 Research-status Report
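[Presentation] Reinvestigating Program Structures Suitable for Fault Localization Using a Large-Scale Dataset and Diverse Mutation Operators (2023)
- Author(s)
  久保光生
- Organizer
  ソフトウェアエンジニアリングシンポジウム2023 (Software Engineering Symposium 2023)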
- Related Report
  2023 Research-status Report
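[Presentation] Investigating the Impact of Tests That Check Exception Handling on Execution-Path-Based Fault Localization Techniques (2023)
- Author(s)
  吉岡遼
- Organizer
  ソフトウェアエンジニアリングシンポジウム2023 (Software Engineering Symposium 2023)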
- Related Report
  2023 Research-status Report
[Presentation] Large-Scale Evaluation of Method-Level Bug Localization with FinerBench4BL (2023)
- Author(s)
  Shizuka Tsumita
- Organizer
  the 30th IEEE International Conference on Software Analysis, Evolution and Reengineering
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Identifying Refactoring Commits by Learning from Source Code Change Diffs (2023)
- Author(s)
  青木俊介
- Organizer
  情報処理学会ソフトウェア工学研究発表会 (IPSJ Software Engineering Research Meeting)
- Related Report
  2022 Research-status Report
[Presentation] Supporting Batch Rename Refactoring Considering Word Forms and Abbreviations (2023)
- Author(s)
  大住祐輝
- Organizer
  情報処理学会ソフトウェア工学研究発表会 (IPSJ Software Engineering Research Meeting)
- Related Report
  2022 Research-status Report
[Presentation] Cross-Language Bug Localization Using Word Embeddings (2023)
- Author(s)
  大柴昂輝
- Organizer
  情報処理学会ソフトウェア工学研究発表会 (IPSJ Software Engineering Research Meeting)
- Related Report
  2022 Research-status Report
[Presentation] Design and Implementation of a Refactoring Example Search System (2023)
- Author(s)
  阿部元輝
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)
- Related Report
  2022 Research-status Report
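[Presentation] Proposal of a Refactoring Detection Method That Requires No Predefined Syntax Definitions (2023)
- Author(s)
  古藤寛大
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)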
- Related Report
  2022 Research-status Report
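[Presentation] Investigating the Impact of Program Structures on the Coverage of Automatically Generated Tests (2023)
- Author(s)
  渡邉凌雅
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)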
- Related Report
  2022 Research-status Report
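[Presentation] Reinvestigating Program Structures Suitable for Spectrum-Based Fault Localization (2023)
- Author(s)
  久保光生
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)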
- Related Report
  2022 Research-status Report
[Presentation] Classification of Changes Based on API (2022)
- Author(s)
  Masashi Iriyama
- Organizer
  the 23rd International Conference on Product-Focused Software Process Improvement (PROFES2022)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Are NLP Metrics Suitable for Evaluating Generated Code? (2022)
- Author(s)
  Riku Takaichi
- Organizer
  the 23rd International Conference on Product-Focused Software Process Improvement (PROFES2022)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Improving Weighted-SBFL by Blocking Spectrum (2022)
- Author(s)
  Haruka Yoshioka
- Organizer
  the 22nd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM2022)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Constructing Dataset of Functionally Equivalent Java Methods (2022)
- Author(s)
  Yoshiki Higo
- Organizer
  the 19th International Conference on Mining Software Repositories (MSR2022)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Revisiting the Effect of Branch Handling Strategies on Change Recommendation (2022)
- Author(s)
  Keisuke Isemoto
- Organizer
  the 30th IEEE/ACM International Conference on Program Comprehension
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Impact of Change Granularity in Refactoring Detection (2022)
- Author(s)
  Lei Chen
- Organizer
  the 30th IEEE/ACM International Conference on Program Comprehension
- Related Report
  2022 Research-status Report
- Int'l Joint Research
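[Presentation] Prototyping a History Rewriting Tool as Preprocessing for Repository Mining Techniques (2022)
- Author(s)
  柴駿太
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)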
- Related Report
  2022 Research-status Report
[Presentation] Using Review Effort Estimation in Search-Based Refactoring Recommendation (2022)
- Author(s)
  陳磊
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)
- Related Report
  2022 Research-status Report
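[Presentation] Making Bug Localization Techniques Finer-Grained via Repository Transformation and Its Evaluation (2022)
- Author(s)
  積田静夏
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)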
- Related Report
  2022 Research-status Report
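[Presentation] An Investigation into the Necessity of Using Immutable Classes: Targeting Data Types That Use Hash Values (2022)
- Author(s)
  橋本周
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)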
- Related Report
  2022 Research-status Report
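[Presentation] Investigating Automatic Evaluation Metrics for Source Code That Can Evaluate Programs Containing Syntax Errors (2022)
- Author(s)
  高市陸
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)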
- Related Report
  2022 Research-status Report
[Presentation] Evaluating Easy-to-Repair Program Structures Using Autorepairability (2022)
- Author(s)
  前島葵
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)
- Related Report
  2021 Research-status Report
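[Presentation] Change Recommendation Considering the Cross-Project Commonality of Source Code Change Patterns (2022)
- Author(s)
  安藤直樹
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)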
- Related Report
  2021 Research-status Report
[Presentation] Autorepairability: A New Software Quality Indicator and Its Measurement (2021)
- Author(s)
  九間哲士
- Organizer
  ソフトウェアエンジニアリングシンポジウム2021 (Software Engineering Symposium 2021)
- Related Report
  2021 Research-status Report
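[Presentation] Investigating the Impact of Source Code Changes over Time on Bug Localization (2021)
- Author(s)
  三井亮称
- Organizer
  電子情報通信学会ソフトウェアサイエンス研究会 (IEICE Technical Committee on Software Science)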
- Related Report
  2021 Research-status Report
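[Presentation] Design and Implementation of an Incremental Repository Transformation Tool for Fine-Grained History Tracking (2021)
- Author(s)
  柴駿太
- Organizer
  日本ソフトウェア科学会第38回大会 (38th Annual Conference of the Japan Society for Software Science and Technology)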
- Related Report
  2021 Research-status Report
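[Presentation] Toward Efficient Trend Analysis of Composite Metrics: Application to Module Decay (2021)
- Author(s)
  林辰宜
- Organizer
  第28回ソフトウェア工学の基礎ワークショップ (28th Workshop on Foundations of Software Engineering)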
- Related Report
  2021 Research-status Report
[Remarks] FEMPDataset (Functionally Equivalent Method Pair Dataset)
- URL
  https://github.com/YoshikiHigo/FEMPDataset
- Related Report
  2023 Research-status Report
