Interpretability Based Neural Network Repair
With the prevalent use of deep neural networks (DNNs), concerns have been raised about security threats to DNNs, such as backdoors embedded in the network.
While neural network repair methods have been shown to be effective at fixing defects in DNNs, they have also been found to produce biased models with imbalanced accuracy across classes, or to weaken adversarial robustness, allowing malicious attackers to trick the model by adding small perturbations. To address these challenges, we propose INNER, an INterpretability-based NEural Repair approach.
INNER formulates the idea of neuron routing for identifying fault neurons, in which the interpretability technique of model probing is used to evaluate each neuron's contribution to the undesired behaviour of the neural network.
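The abstract does not spell out how the model probe is constructed. As one plausible, illustrative reading (not the paper's actual algorithm), a linear probe can be trained on a layer's activations to predict whether an input triggers the undesired behaviour, and the probe's weight magnitudes then serve as per-neuron contribution scores. The function name, inputs, and probe architecture below are assumptions for this sketch.

# Hypothetical sketch of probe-based fault-neuron scoring (assumed details,
# not taken from the paper).
import torch
import torch.nn as nn

def probe_neuron_scores(activations, undesired_labels, epochs=100, lr=1e-2):
    """Train a linear probe that predicts undesired behaviour (e.g. a backdoor
    firing) from one layer's activations, then use the probe's weight
    magnitudes as per-neuron contribution scores.

    activations      : (N, num_neurons) tensor of hidden activations
    undesired_labels : (N,) tensor, 1 if the input triggers the undesired
                       behaviour, 0 otherwise
    """
    activations = activations.detach()          # treat activations as fixed inputs
    num_neurons = activations.shape[1]
    probe = nn.Linear(num_neurons, 1)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()

    for _ in range(epochs):
        opt.zero_grad()
        logits = probe(activations).squeeze(1)
        loss = loss_fn(logits, undesired_labels.float())
        loss.backward()
        opt.step()

    # Larger |weight| -> the neuron carries more information about the
    # undesired behaviour and is a candidate fault neuron.
    return probe.weight.detach().abs().squeeze(0)

In practice, the activations would be collected with a forward hook on the layer of interest, and the resulting scores ranked or thresholded to select the candidate fault neurons.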
INNER then optimizes the identified neurons to repair the neural network. We evaluate INNER on three typical application scenarios: backdoor attacks, adversarial attacks, and wrong predictions. Our experimental results demonstrate that INNER effectively repairs neural networks while ensuring accuracy, fairness, and robustness. Moreover, the performance of other repair methods can also be improved by reusing the fault neurons found by INNER, demonstrating the generality of the proposed approach.
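The abstract only states that the identified neurons are optimized; one common realisation of such a repair step is to fine-tune just those neurons on clean data while freezing the rest of the network. The sketch below follows that reading; the layer type, loss, and data loader are placeholders, not details from the paper.

# Hypothetical sketch: repair by fine-tuning only the identified fault
# neurons of one nn.Linear layer (assumed objective and setup).
import torch

def repair_fault_neurons(model, layer, fault_idx, clean_loader, epochs=5, lr=1e-3):
    """Update only the weight/bias rows of `layer` that produce the
    identified fault neurons, keeping the rest of the network frozen."""
    loss_fn = torch.nn.CrossEntropyLoss()

    for p in model.parameters():
        p.requires_grad_(False)
    layer.weight.requires_grad_(True)
    layer.bias.requires_grad_(True)

    w_mask = torch.zeros_like(layer.weight)
    w_mask[fault_idx] = 1.0                    # rows producing fault neurons
    b_mask = torch.zeros_like(layer.bias)
    b_mask[fault_idx] = 1.0

    model.eval()                               # keep frozen BN statistics fixed
    opt = torch.optim.Adam([layer.weight, layer.bias], lr=lr)
    for _ in range(epochs):
        for x, y in clean_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            layer.weight.grad.mul_(w_mask)     # restrict updates to fault neurons
            layer.bias.grad.mul_(b_mask)
            opt.step()
    return model

Restricting the update to the scored neurons is one way to preserve the behaviour of the rest of the network, which is consistent with the abstract's emphasis on maintaining accuracy, fairness, and robustness after repair.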
Wed 18 Sep
15:30 - 17:10 | Testing and Repairing Neural Networks (Technical Papers) at EI 9 Hlawka
Chair(s): Mike Papadakis (University of Luxembourg)

15:30 (20m) Talk | Interoperability in Deep Learning: A User Survey and Failure Analysis of ONNX Model Converters
Purvish Jajal (Purdue University), Wenxin Jiang (Purdue University), Arav Tewari (Purdue University), Erik Kocinare (Purdue University), Joseph Woo (Purdue University), Anusha Sarraf (Purdue University), Yung-Hsiang Lu (Purdue University), George K. Thiruvathukal (Loyola University Chicago), James C. Davis (Purdue University)

15:50 (20m) Talk | Interpretability Based Neural Network Repair
Zuohui Chen (Zhejiang University of Technology; Binjiang Institute of Artificial Intelligence), Jun Zhou (Zhejiang University of Technology; Binjiang Institute of Artificial Intelligence), Youcheng Sun (University of Manchester), Jingyi Wang (Zhejiang University), Qi Xuan (Zhejiang University of Technology; Binjiang Institute of Artificial Intelligence), Xiaoniu Yang (Zhejiang University of Technology; National Key Laboratory of Electromagnetic Space Security)

16:10 (20m) Talk | See the Forest, not Trees: Unveiling and Escaping the Pitfalls of Error-Triggering Inputs in Neural Network Testing
Yuanyuan Yuan (Hong Kong University of Science and Technology), Shuai Wang (Hong Kong University of Science and Technology), Zhendong Su (ETH Zurich)

16:30 (20m) Talk | Isolation-Based Debugging for Neural Networks
Jialuo Chen (Zhejiang University), Jingyi Wang (Zhejiang University), Youcheng Sun (University of Manchester), Peng Cheng (Zhejiang University), Jiming Chen (Zhejiang University; Hangzhou Dianzi University)

16:50 (20m) Talk | Certified Continual Learning for Neural Network Regression