CREF: An LLM-Based Conversational Software Repair Framework for Programming Tutors
With the proven effectiveness of Large Language Models (LLMs) in code-related tasks, researchers have explored their potential for program repair. However, existing repair benchmarks may have been included in LLM training data, potentially causing data leakage. To evaluate LLMs' realistic repair capabilities, (i) we introduce TutorCode, an extensive, non-crawled benchmark comprising 1,239 defective C++ programs and associated information such as tutor guidance, solution descriptions, failing test cases, and the corrected code. We assess LLMs' repair performance on TutorCode, measuring repair correctness (TOP-5 and AVG-5) and patch precision (RPSR). (ii) We then provide a comprehensive investigation into which types of extra information help LLMs improve their repair performance. Among these types, tutor guidance is the most effective. To fully harness LLMs' conversational capabilities and the benefits of augmented information, (iii) we introduce CREF, a novel conversational semi-automatic repair framework that assists human programming tutors. CREF improves AVG-5 by 17.2%-24.6% over the baseline and achieves an AVG-5 of 76.6% when using GPT-4. These results highlight the potential for enhancing LLMs' repair capabilities through tutor interactions and historical conversations. The successful application of CREF in a real-world educational setting demonstrates its effectiveness in reducing tutors' workload and improving students' learning experience, and shows promise for code review and other software engineering tasks.
Wed 18 Sep (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
15:30 - 17:10
15:30 | 20m | Talk | Automated Program Repair via Conversation: Fixing 162 out of 337 Bugs for $0.42 Each using ChatGPT | Technical Papers | Chunqiu Steven Xia (University of Illinois at Urbana-Champaign), Lingming Zhang (University of Illinois at Urbana-Champaign)
15:50 | 20m | Talk | ThinkRepair: Self-Directed Automated Program Repair | Technical Papers | Xin Yin (Zhejiang University), Chao Ni (Zhejiang University), Shaohua Wang (Central University of Finance and Economics), Zhenhao Li (York University), Limin Zeng (Zhejiang University), Xiaohu Yang (Zhejiang University)
16:10 | 20m | Talk | BRAFAR: Bidirectional Refactoring, Alignment, Fault Localization, and Repair for Programming Assignments | Technical Papers | Linna Xie (Nanjing University), Chongmin Li (Nanjing University), Yu Pei (Hong Kong Polytechnic University), Tian Zhang (Nanjing University), Minxue Pan (Nanjing University)
16:30 | 20m | Talk | CREF: An LLM-Based Conversational Software Repair Framework for Programming Tutors | Technical Papers | Boyang Yang (Yanshan University; Beijing JudaoYouda Network Technology), Haoye Tian (University of Melbourne), Weiguo Pian (University of Luxembourg), Haoran Yu (Beijing JudaoYouda Network Technology), Haitao Wang (Beijing JudaoYouda Network Technology), Jacques Klein (University of Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg), Shunfu Jin (Yanshan University)
16:50 | 20m | Talk | One Size Does Not Fit All: Multi-granularity Patch Generation for Better Automated Program Repair (ACM SIGSOFT Distinguished Paper Award) | Technical Papers | Bo Lin (National University of Defense Technology), Shangwen Wang (National University of Defense Technology), Ming Wen (Huazhong University of Science and Technology), Liqian Chen (National University of Defense Technology), Xiaoguang Mao (National University of Defense Technology)