LPR: Large Language Models-Aided Program Reduction
Program reduction is a widely used technique to facilitate debugging
compilers by automatically minimizing programs that trigger
compiler bugs. Existing program reduction techniques are either
generic to a wide range of languages (such as Perses and Vulcan)
or specifically optimized for one particular language by exploiting
language-specific knowledge (e.g., C-Reduce). However, combining
generality across languages with optimization for a specific
language in program reduction has yet to be explored.
This paper proposes LPR, the first technique that leverages
LLMs to perform language-specific program reduction for
multiple languages. The key insight is to utilize both the language
generality of program reducers such as Perses and the language-specific
semantics learned by LLMs. Concretely, language-generic
program reducers can efficiently reduce programs into a small size
that is suitable for LLMs to process; LLMs can effectively transform
programs via the learned semantics to create new reduction opportunities
for the language-generic program reducers to further
reduce the programs.
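
The alternation described above can be illustrated with a minimal sketch. This is not LPR's implementation: the line-deletion reducer stands in for a language-generic reducer such as Perses, the string-rewriting `inline_call` stands in for an LLM-proposed transformation, and all names (`alternate_reduce`, `delete_lines`, `inline_call`, `still_triggers_bug`, `max_rounds`) are hypothetical, chosen only for illustration.

```python
from typing import Callable, List

Oracle = Callable[[str], bool]


def token_count(program: str) -> int:
    # Whitespace-separated tokens: a crude size measure for this sketch.
    return len(program.split())


def delete_lines(program: str, still_triggers_bug: Oracle) -> str:
    # Stand-in for a language-generic reducer: repeatedly drop any single
    # line whose removal keeps the bug-triggering property.
    lines = program.splitlines()
    changed = True
    while changed:
        changed = False
        for i in range(len(lines)):
            trial = lines[:i] + lines[i + 1:]
            if still_triggers_bug("\n".join(trial)):
                lines = trial
                changed = True
                break
    return "\n".join(lines)


def alternate_reduce(program: str,
                     still_triggers_bug: Oracle,
                     generic_reduce: Callable[[str, Oracle], str],
                     transforms: List[Callable[[str], str]],
                     max_rounds: int = 10) -> str:
    # First shrink the program with the generic reducer, then alternate:
    # apply a transformation, re-check the property, reduce again.
    best = generic_reduce(program, still_triggers_bug)
    for _ in range(max_rounds):
        improved = False
        for transform in transforms:
            candidate = transform(best)
            if not still_triggers_bug(candidate):
                continue  # the transformation lost the bug; discard it
            candidate = generic_reduce(candidate, still_triggers_bug)
            if token_count(candidate) < token_count(best):
                best = candidate
                improved = True
        if not improved:
            break  # no transformation opened a new reduction opportunity
    return best


# Toy property: the program must call crash(, and if it still calls f(1)
# then the definition of f must be present (otherwise it would not compile).
def still_triggers_bug(p: str) -> bool:
    return "crash(" in p and ("f(1)" not in p or "int f(" in p)


# Hypothetical stand-in for an LLM-proposed rewrite: inline the call to f.
def inline_call(p: str) -> str:
    return p.replace("f(1)", "1")


source = "\n".join([
    "int f(int a) { return a; }",
    "int unused = 0;",
    "int main() { crash(f(1)); }",
])

result = alternate_reduce(source, still_triggers_bug, delete_lines, [inline_call])
print(result)  # prints: int main() { crash(1); }
```

In this toy run, line deletion alone cannot remove the definition of `f` because `main` still calls it; only after the inlining step does that definition become deletable, which mirrors how a semantic transformation can create a new opportunity for the generic reducer.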
Our thorough evaluation on 50 benchmarks across three programming
languages (i.e., C, Rust and JavaScript) has demonstrated
LPR’s practicality and superiority over Vulcan, the state-of-the-art
language-generic program reducer. For effectiveness, LPR surpasses
Vulcan by producing 24.93%, 4.47%, and 11.71% smaller programs
on benchmarks in C, Rust and JavaScript, respectively. Moreover, LPR
and Vulcan have the potential to complement each other. For the C
language for which C-Reduce is optimized, by applying Vulcan to
the output produced by LPR, we can attain program sizes that are
on par with those achieved by C-Reduce. For efficiency perceived
by users, LPR is more efficient when reducing large and complex
programs, taking 10.77%, 34.88%, and 36.96% less time than Vulcan to
finish all the benchmarks in C, Rust and JavaScript, respectively.
Wed 18 Sep (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:30 - 11:50 | Code Mutation and Reduction | Technical Papers at EI 10 Fritz Paschke | Chair(s): Andreas Zeller (CISPA Helmholtz Center for Information Security)
10:30 | 20m | Talk | Large Language Models for Equivalent Mutant Detection: How Far Are We? (ACM SIGSOFT Distinguished Paper Award) | Technical Papers | Zhao Tian (Tianjin University), Honglin Shu (Kyushu University), Dong Wang (Tianjin University), Xuejie Cao (Tianjin University), Yasutaka Kamei (Kyushu University), Junjie Chen (Tianjin University)
10:50 | 20m | Talk | An Empirical Examination of Fuzzer Mutator Performance | Technical Papers | James Kukucka (George Mason University), Luís Pina (University of Illinois at Chicago), Paul Ammann (George Mason University), Jonathan Bell (Northeastern University)
11:10 | 20m | Talk | Equivalent Mutants in the Wild: Identifying and Efficiently Suppressing Equivalent Mutants for Java Programs | Technical Papers | Benjamin Kushigian (University of Washington), Samuel Kaufman (University of Washington), Ryan Featherman (University of Washington), Hannah Potter (University of Washington), Ardi Madadi (University of Washington), René Just (University of Washington)
11:30 | 20m | Talk | LPR: Large Language Models-Aided Program Reduction | Technical Papers | Mengxiao Zhang (University of Waterloo), Yongqiang Tian (Hong Kong University of Science and Technology), Zhenyang Xu (University of Waterloo), Yiwen Dong (University of Waterloo), Shin Hwei Tan (Concordia University), Chengnian Sun (University of Waterloo)