See the Forest, not Trees: Unveiling and Escaping the Pitfalls of Error-Triggering Inputs in Neural Network Testing
Recent efforts in deep neural network (DNN) testing commonly use error-triggering inputs (ETIs) to quantify DNN errors and to fine-tune the tested DNN for repair. This study reveals the pitfalls of ETIs in DNN testing. Specifically, merely seeking more ETIs "traps" the testing campaign in local plateaus, where similar ETIs are continuously generated by a few fixed input transformations. Likewise, fine-tuning the DNN with ETIs, while capable of fixing the exposed mis-predictions, undermines the DNN's resilience to certain input transformations. These ETI-induced pitfalls have been overlooked in previous research because too few input transformations (usually fewer than 10) were considered; we show that the severity of these deceptive phenomena grows when DNNs are tested with a larger and more diverse set of real-life input transformations.
This paper presents a comprehensive study of the pitfalls of ETIs in DNN testing. We first augment conventional DNN testing pipelines with a large set of input transformations; the correctness and validity of these new transformations are verified through large-scale human studies. On this basis, we show that an endless pursuit of ETIs cannot alleviate the "trapped testing" issue, and that the undermined resilience occurs pervasively across many input transformations. Accordingly, we propose a novel and holistic view of DNN errors: instead of counting which inputs trigger DNN mis-predictions, we record which input transformations can generate ETIs. The input property targeted by such a transformation, termed an erroneous property (EP), counts as one DNN error and guides DNN testing; that is, our new paradigm aims to find more EPs rather than more ETIs. Evaluation shows that this EP-oriented testing paradigm significantly expands the explored DNN error space. Moreover, fine-tuning DNNs with EPs effectively improves their resilience to different input transformations.
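To make the contrast between the two paradigms concrete, below is a minimal, hypothetical Python sketch (not the paper's artifact) of the two error-counting views: the conventional paradigm counts every error-triggering input, while the EP-oriented paradigm counts a transformation once, as an erroneous property, as soon as it yields any ETI. The `predict` function, the seed set, and the two example transformations are illustrative assumptions.

```python
# Illustrative sketch: ETI-oriented vs. EP-oriented error counting.
# The model, seeds, and transformations are assumed stand-ins, not the
# paper's actual implementation.
import numpy as np

def brighten(x):   # example input transformation (assumed)
    return np.clip(x + 0.3, 0.0, 1.0)

def add_noise(x):  # example input transformation (assumed)
    return np.clip(x + np.random.normal(0, 0.05, x.shape), 0.0, 1.0)

def count_errors(predict, seeds, transformations):
    """ETI view: every mis-predicted transformed input is one error.
    EP view : a transformation counts once, as an erroneous property,
              as soon as it produces any error-triggering input."""
    etis, eps = [], set()
    for transform in transformations:
        for x, label in seeds:
            x_t = transform(x)                 # apply input transformation
            if predict(x_t) != label:          # mis-prediction observed
                etis.append(x_t)               # one more ETI
                eps.add(transform.__name__)    # the EP is the transformation
    return len(etis), eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seeds = [(rng.random((8, 8)), 1) for _ in range(5)]
    def predict(x):  # dummy predictor (assumed): misreads bright images
        return 0 if x.mean() > 0.7 else 1
    n_etis, eps = count_errors(predict, seeds, [brighten, add_noise])
    print(f"ETIs: {n_etis}, EPs: {sorted(eps)}")
```

In this toy run, brightening every seed produces five ETIs but only one EP, illustrating the paper's point: an ETI count can inflate with many similar inputs from the same few transformations, whereas the EP count measures the breadth of the explored error space.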
Wed 18 Sep (times shown in the Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna time zone)
15:30 - 17:10 | Session: Testing and Repairing Neural Networks (Technical Papers) at EI 9 Hlawka
Chair(s): Mike Papadakis (University of Luxembourg)

15:30 (20m) Talk: Interoperability in Deep Learning: A User Survey and Failure Analysis of ONNX Model Converters
Purvish Jajal, Wenxin Jiang, Arav Tewari, Erik Kocinare, Joseph Woo, Anusha Sarraf, Yung-Hsiang Lu (Purdue University); George K. Thiruvathukal (Loyola University Chicago); James C. Davis (Purdue University)
DOI | Pre-print

15:50 (20m) Talk: Interpretability Based Neural Network Repair
Zuohui Chen, Jun Zhou (Zhejiang University of Technology; Binjiang Institute of Artificial Intelligence); Youcheng Sun (University of Manchester); Jingyi Wang (Zhejiang University); Qi Xuan (Zhejiang University of Technology; Binjiang Institute of Artificial Intelligence); Xiaoniu Yang (Zhejiang University of Technology; National Key Laboratory of Electromagnetic Space Security)
DOI

16:10 (20m) Talk: See the Forest, not Trees: Unveiling and Escaping the Pitfalls of Error-Triggering Inputs in Neural Network Testing
Yuanyuan Yuan, Shuai Wang (Hong Kong University of Science and Technology); Zhendong Su (ETH Zurich)
DOI

16:30 (20m) Talk: Isolation-Based Debugging for Neural Networks
Jialuo Chen, Jingyi Wang (Zhejiang University); Youcheng Sun (University of Manchester); Peng Cheng (Zhejiang University); Jiming Chen (Zhejiang University; Hangzhou Dianzi University)
DOI

16:50 (20m) Talk: Certified Continual Learning for Neural Network Regression
DOI