ISSTA 2024
Mon 16 - Fri 20 September 2024 Vienna, Austria
co-located with ISSTA/ECOOP 2024
Fri 20 Sep 2024 14:30 - 14:50 at EI 7 - LLMs for Code Chair(s): Jacques Klein

Fuzz drivers are essential for library API fuzzing. However, automatically generating fuzz drivers is a complex task, as it demands high-quality, correct, and robust API usage code. Generating fuzz drivers with large language models (LLMs) is a promising area of research. Unlike traditional generators based on program analysis, this text-based approach is more general and can draw on a variety of API usage information, producing code that is easy for humans to read. However, the fundamental issues in this direction, such as its effectiveness and potential challenges, are still poorly understood.

To bridge this gap, we conducted the first in-depth study of the important issues in using LLMs to generate effective fuzz drivers. Our study features a curated dataset of 86 fuzz driver generation questions from 30 widely used C projects. Six prompting strategies were designed and tested across five state-of-the-art LLMs with five different temperature settings. In total, our study evaluated 736,430 generated fuzz drivers, at a cost of 0.85 billion tokens (over $8,000 in charged tokens). Additionally, we compared the LLM-generated drivers against those used in industry, conducting extensive fuzzing experiments (3.75 CPU-years). Our study found that:

1) While LLM-based fuzz driver generation is a promising direction, it still faces several obstacles to practical application;

2) LLMs have difficulty generating effective fuzz drivers for APIs with intricate specifics. Three design choices in prompt strategies can be beneficial: issuing repeated queries, querying with examples, and employing an iterative querying process;

3) While LLM-generated drivers can yield fuzzing outcomes on par with those used in industry, there are substantial opportunities for enhancement, such as extending the API usage they contain or integrating semantic oracles to facilitate the detection of logical bugs.

Our insights have been implemented to improve the OSS-Fuzz-Gen project, facilitating practical fuzz driver generation in industry.
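
As a rough illustration of the three prompt design choices named in finding 2 (repeated queries, querying with examples, and iterative querying), the following is a minimal sketch rather than the study's actual pipeline or prompts. The names `build_prompt`, `generate_driver`, `fake_llm`, and `fake_validate` are hypothetical; `query_llm` stands in for any LLM client, and `validate` for a check such as compiling the driver and fuzzing it briefly. The only real identifier is `LLVMFuzzerTestOneInput`, the libFuzzer entry point.

```python
from typing import Callable, Optional

# Hypothetical interfaces (assumptions, not from the paper):
#   QueryFn    sends a prompt to an LLM and returns the generated driver source.
#   ValidateFn returns an error message, or None if the driver is acceptable.
QueryFn = Callable[[str], str]
ValidateFn = Callable[[str], Optional[str]]


def build_prompt(api_signature: str, example_usage: str = "", feedback: str = "") -> str:
    """Assemble a prompt, optionally adding an API usage example and error feedback."""
    parts = [f"Write a libFuzzer fuzz driver in C for this API:\n{api_signature}"]
    if example_usage:  # querying with examples
        parts.append(f"Example usage of the API:\n{example_usage}")
    if feedback:  # iterative querying: report the last failure back to the model
        parts.append(f"The previous driver failed with:\n{feedback}\nPlease fix it.")
    return "\n\n".join(parts)


def generate_driver(
    api_signature: str,
    query_llm: QueryFn,
    validate: ValidateFn,
    example_usage: str = "",
    repeats: int = 3,
    max_fix_rounds: int = 2,
) -> Optional[str]:
    """Return the first driver that passes validation, or None if all attempts fail."""
    for _ in range(repeats):  # repeated queries: start over from a fresh prompt
        driver = query_llm(build_prompt(api_signature, example_usage))
        error = validate(driver)
        fixes = 0
        while error is not None and fixes < max_fix_rounds:
            driver = query_llm(build_prompt(api_signature, example_usage, feedback=error))
            error = validate(driver)
            fixes += 1
        if error is None:
            return driver
    return None


def fake_llm(prompt: str) -> str:
    """Stub model: only produces a usable driver after it has seen error feedback."""
    if "failed with" in prompt:
        return "int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) { /* ... */ return 0; }"
    return "int main(void) { return 0; }"


def fake_validate(driver: str) -> Optional[str]:
    """Stub validator: only checks that the libFuzzer entry point is present."""
    if "LLVMFuzzerTestOneInput" in driver:
        return None
    return "missing LLVMFuzzerTestOneInput entry point"


result = generate_driver(
    "int parse(const char *buf, size_t len);",  # hypothetical target API
    fake_llm,
    fake_validate,
    example_usage="parse(buf, len);",
)
```

With these stubs, the first query yields a driver without the entry point, the validator's error is fed back, and the second query returns a driver that passes. In a real setting the validator would be the expensive part, so the repeat and fix-round limits bound the cost per API.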

Fri 20 Sep

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

13:30 - 14:50
LLMs for Code - Technical Papers at EI 7
Chair(s): Jacques Klein (University of Luxembourg)
13:30
20m
Talk
Bridge and Hint: Extending Pre-trained Language Models for Long-Range Code
Technical Papers
Yujia Chen (Harbin Institute of Technology), Cuiyun Gao (Harbin Institute of Technology), Zezhou Yang (Harbin Institute of Technology), Hongyu Zhang (Chongqing University), Qing Liao (Harbin Institute of Technology)
13:50
20m
Talk
CoSec: On-the-Fly Security Hardening of Code LLMs via Supervised Co-decoding
Technical Papers
Dong Li (Chongqing University), Meng Yan (Chongqing University), Yaosheng Zhang (Chongqing University), Zhongxin Liu (Zhejiang University), Chao Liu (Chongqing University), Xiaohong Zhang (Chongqing University), Ting Chen (University of Electronic Science and Technology of China), David Lo (Singapore Management University)
14:10
20m
Talk
Oracle-Guided Program Selection from Large Language Models
Technical Papers
Zhiyu Fan (National University of Singapore), Haifeng Ruan (National University of Singapore), Sergey Mechtaev (Peking University), Abhik Roychoudhury (National University of Singapore)
14:30
20m
Talk
How Effective Are They? Exploring Large Language Model Based Fuzz Driver Generation
Technical Papers
Cen Zhang (Nanyang Technological University), Yaowen Zheng (Nanyang Technological University), Mingqiang Bai (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yeting Li (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Wei Ma (Nanyang Technological University), Xiaofei Xie (Singapore Management University), Yuekang Li (UNSW), Limin Sun (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yang Liu (Nanyang Technological University)

Information for Participants
Fri 20 Sep 2024 13:30 - 14:50 at EI 7 - LLMs for Code Chair(s): Jacques Klein
Info for room EI 7:

Map: https://tuw-maps.tuwien.ac.at/?q=CDEG13

Room tech: https://raumkatalog.tiss.tuwien.ac.at/room/15417