Enhancing Robustness of Code Authorship Attribution through Expert Feature Knowledge
Code authorship attribution has been an interesting research problem for decades. Recent studies have revealed that existing methods for code authorship attribution suffer from weak robustness: small perturbations added by an attacker greatly reduce their accuracy. To date, no code authorship attribution method can effectively handle such attacks. In this paper, we attribute the weak robustness of code authorship attribution methods to dataset bias and argue that this bias can be mitigated by adjusting the feature learning strategy. We first propose a robust feature combination framework for code authorship attribution, which is composed only of simple shallow neural network structures and makes feature extraction controllable by incorporating expert knowledge. Experiments show that the framework is significantly more robust than mainstream code authorship attribution methods, with an average drop of 23.4 percentage points (from 37.8% to 14.3%) in the success rate of targeted attacks and 25.9 percentage points (from 46.7% to 20.8%) in the success rate of untargeted attacks. At the same time, it achieves accuracy comparable to that of mainstream code authorship attribution methods.
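The reported reductions can be read either as absolute drops (percentage points) or as relative drops in attack success rate. The following is a minimal sketch of that arithmetic, using only the four success rates quoted in the abstract; the helper name `reduction` is hypothetical and not part of the paper. Note that the rounded endpoints for targeted attacks differ by 23.5 rather than the quoted 23.4, presumably because the paper averages unrounded values.

```python
def reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent).

    Hypothetical helper for illustration; not taken from the paper.
    """
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


# Average attack success rates (percent) as quoted in the abstract.
targeted = reduction(37.8, 14.3)
untargeted = reduction(46.7, 20.8)

print(f"targeted:   {targeted[0]:.1f} points, {targeted[1]:.1f}% relative")
print(f"untargeted: {untargeted[0]:.1f} points, {untargeted[1]:.1f}% relative")
```

Under these figures, the relative reduction is roughly 62% for targeted attacks and roughly 55% for untargeted attacks.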
Fri 20 Sep (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
15:30 - 16:30 | Analysis of Code Origin | Technical Papers at EI 10 Fritz Paschke | Chair(s): Darko Marinov (University of Illinois at Urbana-Champaign)
15:30 | 20m Talk | Enhancing Robustness of Code Authorship Attribution through Expert Feature Knowledge | Technical Papers | Xiaowei Guo (Huazhong University of Science and Technology), Cai Fu (Huazhong University of Science and Technology), Juan Chen (Huazhong University of Science and Technology), Hongle Liu (Huazhong University of Science and Technology), Lansheng Han (Huazhong University of Science and Technology), Wenjin Li (NSFOCUS Technologies Group)
15:50 | 20m Talk | Your “Notice” Is Missing: Detecting and Fixing Violations of Modification Terms in Open Source Licenses during Forking | Technical Papers | Kaifeng Huang (Tongji University), Yingfeng Xia (Fudan University), Bihuan Chen (Fudan University), Siyang He (Fudan University), Huazheng Zeng (Fudan University), Zhuotong Zhou (Fudan University), Jin Guo (Fudan University), Xin Peng (Fudan University)
16:10 | 20m Talk | DeLink: Source File Information Recovery in Binaries | Technical Papers | Zhe Lang (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Zhengzi Xu (Nanyang Technological University; Imperial Global Singapore), Xiaohui Chen (China Mobile Research Institute), Shichao Lv (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Zhanwei Song (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Zhiqiang Shi (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Limin Sun (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences)