Inconsistencies in TeX-Produced Documents
TeX is a widely used typesetting system adopted by most publishers and professional societies. While TeX is responsible for generating a significant number of documents, irregularities in the TeX ecosystem may produce inconsistent documents. These inconsistencies may occur across different TeX engines or different versions of TeX distributions, resulting in failures to adhere to formatting specifications, or in the same document rendering differently for different authors. In this work, we investigate and quantify the robustness of the TeX ecosystem through a large-scale study of 432 documents. We developed an automated pipeline to evaluate the cross-engine and cross-version compatibility of the TeX ecosystem. We found significant inconsistencies in the outputs of different TeX engines: only 0.2% of documents compiled to identical output with XeTeX and PDFTeX, owing to a lack of cross-engine support in popular LaTeX packages and classes used in academic conferences. A smaller, yet still significant, extent of inconsistencies was found across different TeX Live distributions, with only 42.1% of documents producing the same output from 2020 to 2023. Our automated pipeline additionally reduces the human effort in bug-finding: from a sample of 10 unique root causes of inconsistencies, we identified two new bugs in LaTeX packages and five existing bugs that were fixed independently of this study. We also observed potentially unintended inconsistencies across different TeX Live distributions beyond the updates listed in changelogs. We expect that this study will help authors of TeX documents to avoid unexpected outcomes by understanding how they may be affected by the often undocumented subtleties of the TeX ecosystem, while benefiting developers by demonstrating how different implementations result in unintended inconsistencies.
Fri 20 Sep. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
10:30 - 11:50
10:30 | 20m Talk | Inconsistencies in TeX-Produced Documents | Technical Papers
10:50 | 20m Talk | Fuzzing MLIR Compiler Infrastructure via Operation Dependency Analysis | Technical Papers | Chenyao Suo (Tianjin University), Junjie Chen (Tianjin University), Shuang Liu (Renmin University of China), Jiajun Jiang (Tianjin University), Yingquan Zhao (Tianjin University), Jianrong Wang (Tianjin University)
11:10 | 20m Talk | Towards Understanding the Bugs in Solidity Compiler | Technical Papers | Haoyang Ma (Hong Kong University of Science and Technology), Wuqi Zhang (Hong Kong University of Science and Technology), Qingchao Shen (Tianjin University), Yongqiang Tian (Hong Kong University of Science and Technology), Junjie Chen (Tianjin University), Shing-Chi Cheung (Hong Kong University of Science and Technology)
11:30 | 20m Talk | Uncovering and Mitigating the Impact of Code Obfuscation on Dataset Annotation with Antivirus Engines | Technical Papers | Gao Cuiying (Huazhong University of Science and Technology; JD.com), Yueming Wu (Nanyang Technological University), Heng Li (Huazhong University of Science and Technology), Wei Yuan (Huazhong University of Science and Technology), Haoyu Jiang (Huazhong University of Science and Technology), Qidan He (JD.com), Yang Liu (Nanyang Technological University)