VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction (ISSTA 2024 - Technical Papers)

Who

Thanh-Dat Nguyen, Tung Do-Viet, Hung Nguyen-Duy, Tuan-Hai Luu, Hung Le, Xuan-Bach D. Le, Patanamon Thongtanunam

Track

ISSTA 2024 Technical Papers

When

Wed 18 Sep 2024 16:10 - 16:30 at EI 3 Sahulka - Static Analysis and Verification Chair(s): Jian Zhang

Abstract

Businesses often need to query visually rich documents (VRDs), such as purchase receipts, medical records, and insurance forms from multiple vendors, to make informed decisions. Several techniques have therefore been proposed to automatically extract independent entities of interest from VRDs, for example price tags from purchase receipts. However, when extracting semantically linked entities, such as finding the corresponding price tag for each item, these techniques either handle new layouts poorly (template-based approaches) or require extensive pre-training data and still do not perform well (deep-learning approaches).

In this work, we introduce a program synthesis method, namely VRDSynth, to automatically generate programs to extract entity relations from multilingual VRDs. Two key novelties, which empower VRDSynth to tackle flexible layouts while requiring no pre-training data for extracting entity relations, include: (1) a new domain-specific language (DSL) to effectively capture the spatial and textual relations between document entities, and (2) a novel synthesis algorithm that makes use of frequent spatial relations between entities to construct initial programs, equivalent reduction to prune the search space, and a combination of positive, negative, and mutually exclusive programs to improve the coverage of programs.
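As a rough illustration of linking entities through spatial relations and of combining positive and negative programs, consider the Python sketch below. It is not VRDSynth's DSL or its synthesis algorithm: the `Entity` class, the `right_of` relation, the `max_gap` threshold, and both rule functions are hypothetical names invented for this example, and the rules are written by hand rather than synthesized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A document entity with a bounding box (hypothetical representation)."""
    text: str
    label: str  # "question" or "answer" in this toy example
    x0: float
    y0: float
    x1: float
    y1: float


def right_of(a: Entity, b: Entity, max_gap: float = 200.0) -> bool:
    """True if b starts to the right of a, within max_gap, on an overlapping line."""
    vertical_overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    horizontal_gap = b.x0 - a.x1
    return vertical_overlap > 0 and 0 <= horizontal_gap <= max_gap


def positive_links(entities):
    """Positive rule: propose question -> answer links along the same line."""
    return {(q.text, a.text)
            for q in entities for a in entities
            if q.label == "question" and a.label == "answer" and right_of(q, a)}


def negative_links(entities):
    """Negative rule: veto a link when another question starts before the answer."""
    return {(q.text, a.text)
            for q in entities for a in entities for other in entities
            if q.label == "question" and a.label == "answer"
            and other.label == "question"
            and right_of(q, other) and other.x0 < a.x0}


def link_entities(entities):
    """Keep the proposed links that no negative rule vetoes."""
    return positive_links(entities) - negative_links(entities)


# One form line: "Name: Alice   Date: 2024"
form = [
    Entity("Name:", "question", 0, 0, 50, 10),
    Entity("Alice", "answer", 60, 0, 100, 10),
    Entity("Date:", "question", 110, 0, 150, 10),
    Entity("2024", "answer", 160, 0, 200, 10),
]

print(sorted(link_entities(form)))
# [('Date:', '2024'), ('Name:', 'Alice')]
```

On its own, the positive rule also proposes the spurious link from "Name:" to "2024"; the negative rule removes it because "Date:" sits between the two. This is only meant to show why excluding links can improve precision over a single permissive rule.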

We evaluate our method on the semantic entity linking task using two popular VRD understanding benchmarks, FUNSD and XFUND, which together comprise 1,600 forms in 8 different languages. Experiments show that VRDSynth, despite using no pre-training data, outperforms the state-of-the-art pre-trained deep-learning approach, LayoutXLM, in 5 out of 8 languages. Notably, VRDSynth achieves an improvement of 42% over LayoutXLM in terms of F1 score on FUNSD, while being complementary to LayoutXLM in 7 out of 8 languages. Regarding efficiency, VRDSynth significantly reduces the memory footprint required for storage and inference (1M and 380MB, versus 1.48GB and 3GB for LayoutXLM), while maintaining similar time efficiency despite the speed differences between the implementation languages (Python vs C++).
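For readers unfamiliar with the metric, the sketch below computes an F1 score over predicted entity links using the standard set-based definition of precision and recall. The paper's exact evaluation protocol is not described on this page, so the function name `link_f1`, the pair representation, and the sample data are assumptions made for illustration only.

```python
def link_f1(predicted, gold):
    """F1 over sets of (key, value) links; standard definition, assumed here."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up links, not taken from FUNSD or XFUND.
gold_links = {("Name:", "Alice"), ("Date:", "2024"), ("Total:", "42")}
predicted_links = {("Name:", "Alice"), ("Date:", "2024"), ("Name:", "2024")}

score = link_f1(predicted_links, gold_links)
print(f"F1 = {score:.3f}")
```

Here two of the three predicted links are correct and two of the three gold links are found, so precision and recall are both 2/3 and the F1 score is also 2/3.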

Link to Preprint

https://arxiv.org/pdf/2407.06826

DOI

https://doi.org/10.1145/3650212.3680314

Thanh-Dat Nguyen

University of Melbourne

Australia

Tung Do-Viet

Cinnamon AI

Vietnam

Hung Nguyen-Duy

Independent Researcher

Vietnam

Tuan-Hai Luu

Cinnamon AI

Vietnam

Hung Le

Deakin University

Australia

Xuan-Bach D. Le

University of Melbourne

Australia

Patanamon Thongtanunam

University of Melbourne

Australia

Session Program

Wed 18 Sep
Displayed time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

15:30 - 17:10	Static Analysis and Verification (Technical Papers) at EI 3 Sahulka. Chair(s): Jian Zhang (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences)

15:30 20m Talk		Learning to Check LTL Satisfiability and to Generate Traces via Differentiable Trace Checking (Technical Papers). Weilin Luo (Sun Yat-sen University), Pingjia Liang (Sun Yat-sen University), Junming Qiu (Sun Yat-sen University), Polong Chen (Sun Yat-sen University), Hai Wan (Sun Yat-sen University), Jianfeng Du (Guangdong University of Foreign Studies), Weiyuan Fang (Sun Yat-sen University)
15:50 20m Talk		Interprocedural Path Complexity Analysis (Technical Papers). Mira Kaniyur (Harvey Mudd College), Ana Cavalcante-Studart (Harvey Mudd College), Yihan Yang (Harvey Mudd College), Sangeon Park (Harvey Mudd College), David Chen (Harvey Mudd College), Duy Lam (Harvey Mudd College), Lucas Bang (Harvey Mudd College)
16:10 20m Talk		VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction (Technical Papers). Thanh-Dat Nguyen (University of Melbourne), Tung Do-Viet (Cinnamon AI), Hung Nguyen-Duy (Independent Researcher), Tuan-Hai Luu (Cinnamon AI), Hung Le (Deakin University), Xuan-Bach D. Le (University of Melbourne), Patanamon Thongtanunam (University of Melbourne)
16:30 20m Talk		Characterizing and Detecting Program Representation Faults of Static Analysis Frameworks (Technical Papers). Huaien Zhang (The Hong Kong Polytechnic University), Yu Pei (Hong Kong Polytechnic University), Shuyun Liang (Southern University of Science and Technology), Zezhong Xing (Southern University of Science and Technology), Shin Hwei Tan (Concordia University)
16:50 20m Talk		API Misuse Detection via Probabilistic Graphical Model (Technical Papers). Yunlong Ma (Beihang University), Wentong Tian (Beihang University), Xiang Gao (Beihang University), Hailong Sun (Beihang University), Li Li (Beihang University)

Information for Participants

Wed 18 Sep 2024 15:30 - 17:10 at EI 3 Sahulka - Static Analysis and Verification Chair(s): Jian Zhang

Info for room EI 3 Sahulka:

Map: https://tuw-maps.tuwien.ac.at/?q=CF0205

Room tech: https://raumkatalog.tiss.tuwien.ac.at/room/15663