Remote Presentation Links for the Session and Poster Tracks can be found on the Program Overview.
Presentation details for the ICDAR 2023 main conference sessions are listed below.
Note: All times are Pacific Daylight Time (PDT).
Note: Links to individual paper DOIs will begin working once the publisher finalizes the proceedings.
Oral Session 1 – Graphics 1: Graphics Recognition
Chair: Richard Zanibbi
Monday, August 21, 2023 – 10:50-12:30 PDT
O1.1 | 5340 | A Holistic Approach for Aligned Music and Lyrics Transcription | Juan C. Martinez-Sevilla, Antonio Ríos-Vila, Francisco J. Castellanos and Jorge Calvo-Zaragoza |
O1.2 | J151 | End-to-end Optical Music Recognition for Pianoform Sheet Music (IJDAR Track) | Antonio Ríos-Vila, David Rizo, José M. Iñesta, Jorge Calvo-Zaragoza |
O1.3 | 8444 | A multi-level synthesis strategy for online handwritten chemical equation recognition | Haoyang Shen, Jinrong Li, Jianmin Lin and Wei Wu |
O1.4 | 9527 | Context and Structure Understanding Oriented Chart Object Detection | Pengyu Yan, Saleem Ahmed and David Doermann |
O1.5 | 3117 | SCI-3000: A Dataset for Figure, Table and Caption Extraction from Scientific PDFs | Filip Darmanović, Allan Hanbury and Markus Zlabinger |
Oral Session 2 – D-NLP 1: Document NLP
Chair: Rajiv Jain
Monday, August 21, 2023 – 10:50-12:30 PDT
O2.1 | 1653 | Consistent Nested Named Entity Recognition in handwritten documents via Lattice Rescoring | David Villanova-Aparisi, Carlos David Martinez-Hinarejos, Verónica Romero and Moisés Pastor-Gadea |
O2.2 | 424 | Search for Hyphenated Words in Probabilistic Indices: a Machine Learning Approach | José Andrés, Alejandro H. Toselli and Enrique Vidal |
O2.3 | 9104 | A Unified Document-level Chinese Discourse Parser on Different Granularity Levels | Weihao Liu, Feng Jiang, Yaxin Fan, Xiaomin Chu, Peifeng Li and Qiaoming Zhu |
O2.4 | J158 | LSTM-Based Siamese Neural Network for Urdu News Story Segmentation (IJDAR Track) | Muhammad Nauman Ahmed Bhatti, Imran Siddiqi, Momina Moetesum |
O2.5 | J140 | Large Scale Genealogical Information Extraction From Handwritten Quebec Parish Records (IJDAR Track) | Solène Tarride, Martin Maarand, Mélodie Boillet, James McGrath, Eugénie Capel, Hélène Vézina, Christopher Kermorvant |
Oral Session 3 – Graphics 2: Tables and Charts
Chair: Jean-Christophe Burie
Monday, August 21, 2023 – 16:00-18:00 PDT
O3.1 | 1070 | A Study on Reproducibility and Replicability of Table Structure Recognition Methods | Kehinde Ajayi, Muntabir Choudhury, Sarah Rajtmajer and Jian Wu |
O3.2 | 3372 | An End-to-End Local Attention Based Model for Table Recognition | Nam Tuan Ly and Atsuhiro Takasu |
O3.3 | 1710 | Optimized Table Tokenization for Table Structure Recognition | Maksym Lysak, Ahmed Nassar, Nikolaos Livathinos, Christoph Auer and Peter Staar |
O3.4 | 1221 | Towards End-to-End Semi-Supervised Table Detection with Deformable Transformer | Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki and Muhammad Zeshan Afzal |
O3.5 | 897 | SpaDen: Sparse and Dense Keypoint Estimation for Real-World Chart Understanding | Saleem Ahmed, David Doermann, Srirangaraj Setlur, Venu Govindaraju and Pengyu Yan |
O3.6 | 9623 | Generalization of Fine Granular Extractions from Charts | Shubham Singh Paliwal, Manasi Patwardhan and Lovekesh Vig |
Oral Session 4 – D-NLP 2: Information Extraction
Chair: Josep Lladós
Monday, August 21, 2023 – 16:00-18:00 PDT
O4.1 | 286 | Improving Information Extraction from Semi-Structured Documents Using Attention based Semi-variational Graph Auto-encoder | Djedjiga Belhadj, Abdel Belaïd and Yolande Belaïd |
O4.2 | 673 | Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding | Bhanu Prakash Voutharoja, Lizhen Qu and Fatemeh Shiri |
O4.3 | 910 | DocILE Benchmark for Document Information Localization and Extraction | Štěpán Šimsa, Milan Šulc, Michal Uřičář, Yash Patel, Ahmed Hamdi, Matěj Kocián, Matyáš Skalický, Jiří Matas, Antoine Doucet, Mickaël Coustaty and Dimosthenis Karatzas |
O4.4 | 2969 | Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks | Le Xue, Mingfei Gao, Zeyuan Chen, Caiming Xiong and Ran Xu |
O4.5 | 1200 | Key-value information extraction from full handwritten pages | Solène Tarride, Mélodie Boillet and Christopher Kermorvant |
O4.6 | 4995 | Information Extraction from Documents: Question Answering vs Token Classification in real-world setups | Laurent Lam, Pirashanth Ratnamogan, Joël Tang, William Vanhuffel and Fabien Caspani |
Oral Session 5 – Applications 1: Medical, Legal, and Financial
Chair: Elisa Barney Smith
Tuesday, August 22, 2023 – 09:00-10:20 PDT
O5.1 | 855 | Multi-Stage Fine-tuning Deep Learning Models Improves Automatic Assessment of the Rey-Osterrieth Complex Figure Test | Benjamin Schuster, Florian Kordon, Martin Mayr, Mathias Seuret and Vincent Christlein |
O5.2 | 7277 | Structure Diagram Recognition in Financial Announcements | Meixuan Qiao, Jun Wang, Junfu Xiang, Qiyu Hou and Ruixuan Li |
O5.3 | 2113 | TransDocAnalyser: A framework for semi-structured offline handwritten documents analysis with an application to legal domain | Sagar Chakraborty, Gaurav Harit and Saptarshi Ghosh |
O5.4 | J161 | Inv3D: A High-Resolution 3D Invoice Dataset for Template-Guided Single-Image Document Unwarping (IJDAR Track) | Felix Hertlein, Alexander Naumann, Patrick Philipp |
Oral Session 6 – Handwriting 1: Online Documents
Chair: Gernot Fink
Tuesday, August 22, 2023 – 09:00-10:20 PDT
O6.1 | J147 | Online Handwriting Trajectory Reconstruction from Kinematic Sensors using Temporal Convolutional Network (IJDAR Track) | Wassim Swaileh, Florent Imbert, Yann Soullard, Romain Tavenard, Eric Anquetil |
O6.2 | J163 | IAMonSense: Multi-level Handwriting Classification using Spatio-temporal Information (IJDAR Track) | Ahmad Mustafid, Junaid Younas, Paul Lukowicz, Sheraz Ahmed |
O6.3 | 2503 | SET, SORT! A Novel Sub-Stroke Level Transformer for Offline Handwriting to Online Conversion | Elmokhtar Mohamed Moussa, Thibault Lelore and Harold Mouchère |
O6.4 | 4206 | Character Queries: A Transformer-based Approach to On-Line Handwritten Character Segmentation | Michael Jungo, Beat Wolf, Andrii Maksai, Claudiu Musat and Andreas Fischer |
Oral Session 7 – DAR 1: Document Layout Analysis
Chair: Koichi Kise
Tuesday, August 22, 2023 – 10:50-12:30 PDT
O7.1 | 8783 | SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation | Ayan Banerjee, Sanket Biswas, Josep Lladós and Umapada Pal |
O7.2 | 8654 | BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset | Md. Istiak Hossain Shihab, Md. Rakibul Hasan, Mahfuzur Rahman Emon, Syed Mobassir Hossen, Md. Nazmuddoha Ansary, Intesur Ahmed, Fazle Rabbi Rakib, Shahriar Elahi Dhruvo, Souhardya Saha Dip, Akib Hasan Pavel, Marsia Haque Meghla, Md. Rezwanul Haque, Sayma Sultana Chowdhury, Farig Sadeque, Tahsin Reasat, Ahmed Imtiaz Humayun and Asif Shahriyar Sushmit |
O7.3 | 9548 | SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation | Subhajit Maity, Sanket Biswas, Siladittya Manna, Ayan Banerjee, Josep Lladós, Saumik Bhattacharya and Umapada Pal |
O7.4 | J182 | Line Extraction in Handwritten Documents via Instance Segmentation (IJDAR Track) | Adeela Islam, Tayaba Anjum, Nazar Khan |
O7.5 | 3827 | Diffusion-based document layout generation | Liu He, Yijuan Lu, John Corring, Dinei Florencio and Cha Zhang |
Oral Session 8 – Handwriting 2: Historical Documents
Chair: Rolf Ingold
Tuesday, August 22, 2023 – 10:50-12:30 PDT
O8.1 | 9690 | DTDT: Highly Accurate Dense Text Line Detection in Historical Documents via Dynamic Transformer | Haiyang Li, Chongyu Liu, Jiapeng Wang, Mingxin Huang, Weiying Zhou and Lianwen Jin |
O8.2 | 5871 | The Bullinger Writer Adaptation Challenge | Anna Scius-Bertrand and Andreas Fischer |
O8.3 | 9679 | Towards Writer Retrieval for Historical Datasets | Marco Peer, Florian Kleber and Robert Sablatnig |
O8.4 | 5959 | HisDoc R-CNN: Robust Chinese Historical Document Text Line Detection with Dynamic Rotational Proposal Network and Iterative Attention Head | Cheng Jian, Lianwen Jin, Lingyu Liang and Chongyu Liu |
O8.5 | 7655 | Keyword Spotting Simplified: A Segmentation-Free Approach using Character Counting and CTC re-scoring | George Retsinas, Giorgos Sfikas and Christophoros Nikou |
Poster Session 1
Tuesday, August 22, 2023 – 14:30-16:00 PDT
P1.1 | 1120 | Evaluation of different tagging schemes for Named Entity Recognition in Handwritten Documents | David Villanova-Aparisi, Carlos David Martinez-Hinarejos, Verónica Romero and Moisés Pastor-Gadea | D-NLP |
P1.2 | 1633 | DAMGCN: Entity Linking in Visually Rich Documents with Dependency-Aware Multimodal Graph Convolutional Network | Yi-Ming Chen, Xiang-Ting Hou, Dong-Fang Lou, Zhi-Lin Liao and Cheng-Lin Liu | D-NLP |
P1.3 | 3015 | RealCQA: Scientific Chart Question Answering as a Test-bed for First-Order Logic | Saleem Ahmed, Bhavin Jawade, Shubham Pandey, Srirangaraj Setlur and Venu Govindaraju | D-NLP |
P1.4 | 4131 | QuOTeS: Query-Oriented Technical Summarization | Juan Antonio Ramirez-Orta, Eduardo Xamena, Ana Maguitman, Axel J. Soto, Flavia P. Zanoto and Evangelos Milios | D-NLP |
P1.5 | 5117 | “Explain Thyself Bully”: Sentiment Aided Cyberbullying Detection with Explanation | Krishanu Maity, Prince Jha, Raghav Jain, Sriparna Saha and Pushpak Bhattacharyya | D-NLP |
P1.6 | 5939 | Topic Shift Detection in Chinese Dialogues: Corpus and Benchmark | Jiangyi Lin, Yaxin Fan, Feng Jiang, Xiaomin Chu and Peifeng Li | D-NLP |
P1.7 | 6780 | CED: Catalog Extraction from Documents | Tong Zhu, Guoliang Zhang, Zechang Li, Zijian Yu, Junfei Ren, Mengsong Wu, Zhefeng Wang, Baoxing Huai, Pingfu Chao and Wenliang Chen | D-NLP |
P1.8 | 8939 | Multimodal Rumour Detection: Catching news that never transpired! | Raghvendra Kumar, Ritika Sinha, Sriparna Saha and Adam Jatowt | D-NLP |
P1.9 | 9559 | I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection | Yongzhu Chang, Rongsheng Zhang and Jiashu Pu | D-NLP |
P1.10 | 7991 | On Web-based Visual Corpus Construction for Visual Document Understanding | DongHyun Kim, Teakgyu Hong, Moonbin Yim, Yoonsik Kim and Geewook Kim | Data and Synthesis |
P1.11 | 4066 | Analyzing Font Style Usage and Contextual Factors in Real Images | Naoya Yasukochi, Hideaki Hayashi, Daichi Haraguchi and Seiichi Uchida | Data and Synthesis |
P1.12 | 5935 | ESTER-Pt: An Evaluation Suite for TExt Recognition in Portuguese | Moniele Kunrath Santos, Guilherme Bazzo, Lucas Lima de Oliveira and Viviane P. Moreira | Data and Synthesis |
P1.13 | 7047 | TextREC: a Dataset for Referring Expression Comprehension with Reading Comprehension | Chenyang Gao, Biao Yang, Hao Wang, Mingkun Yang, Wenwen Yu, Yuliang Liu and Xiang Bai | Data and Synthesis |
P1.14 | 8519 | DocImagen: Diffusion Model for Layout Conditioned Document Image Generation | Noman Tanveer, Adnan Ul-Hasan and Faisal Shafait | Data and Synthesis |
P1.15 | 8652 | EnsExam: A Dataset for Handwritten Text Erasure on Examination Papers | Liufeng Huang, Bangdong Chen, Chongyu Liu, Dezhi Peng, Weiying Zhou, Yaqiang Wu, Hui Li, Hao Ni and Lianwen Jin | Data and Synthesis |
P1.16 | 491 | Aligning benchmark datasets for table structure recognition | Brandon Smock, Rohith Pesala and Robin Abraham | Graphics |
P1.17 | 2121 | Line-of-sight with Graph Attention Parser (LGAP) for Math Formulas | Ayush Kumar Shah and Richard Zanibbi | Graphics |
P1.18 | 5003 | Line Graphics Digitization: A Step Towards Full Automation | Omar Moured, Jiaming Zhang, Alina Roitberg, Thorsten Schwarz and Rainer Stiefelhagen | Graphics |
P1.19 | 6359 | TRACE: Table Reconstruction Aligned to Corner and Edges | Youngmin Baek, Daehyun Nam, Jaeheung Surh, Seung Shin and Seonghyeon Kim | Graphics |
P1.20 | 7707 | Towards Making Flowchart Images Machine Interpretable | Shreya Shukla, Prajwal Gatti, Yogesh Kumar, Vikash Yadav and Anand Mishra | Graphics |
P1.21 | 9362 | GriTS: Grid table similarity metric for table structure recognition | Brandon Smock, Rohith Pesala and Robin Abraham | Graphics |
P1.22 | 171 | Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model | Haisong Ding, Bozhi Luan, Dongnan Gui, Kai Chen and Qiang Huo | Handwriting |
P1.23 | 832 | Vision Conformer: Incorporating Convolutions into Vision Transformer Layers | Brian Kenji Iwana and Akihiro Kusuda | Handwriting |
P1.24 | 1442 | Exploring Semantic Word Representations for Recognition-free NLP on Handwritten Document Images | Oliver Tüselmann and Gernot A. Fink | Handwriting |
P1.25 | 2095 | A Unified Architecture for Urdu Printed and Handwritten Text Recognition | Arooba Maqsood, Nauman Riaz, Adnan Ul-Hasan and Faisal Shafait | Handwriting |
P1.26 | 3789 | Linguistic Knowledge within Handwritten Text Recognition Models: A Real-World Case Study | Samuel Londner, Yoav Phillips, Hadar Miller, Nachum Dershowitz, Tsvi Kuflik and Moshe Lavee | Handwriting |
P1.27 | 4083 | Faster DAN: Multi-target Queries with Document Positional Encoding for End-to-end Handwritten Document Recognition | Denis Coquenet, Clément Chatelain and Thierry Paquet | Handwriting |
P1.28 | 4601 | DSS: Synthesizing long Digital Ink using Data augmentation, Style encoding and Split generation | Aleksandr Timofeev, Anastasiia Fadeeva, Andrii Maksai, Claudiu Musat and Andrei Afonin | Handwriting |
P1.29 | 6471 | Fine-tuning Vision Encoder-Decoder Transformers for Handwriting Text Recognition on Historical Documents | Daniel Parres Montoya and Roberto Paredes Palacios | Handwriting |
P1.30 | 7403 | Incremental Teacher Model with Mixed Augmentations and Scheduled Pseudo-Label Loss for Handwritten Text Recognition | Masayuki Honda, Hung Tuan Nguyen, Cuong Tuan Nguyen, Cong Kha Nguyen, Ryosuke Odate, Takashi Kanemaru and Masaki Nakagawa | Handwriting |
P1.31 | 7741 | SeamFormer: High Precision Text Line Segmentation for Handwritten Documents | Niharika Vadlamudi, Rahul Krishna and Ravi Kiran Sarvadevabhatla | Handwriting |
P1.32 | 8630 | Adversarial Attacks on Convolutional Siamese Signature Verification Networks | Maham Jahangir, Muhammad Imran Malik and Faisal Shafait | Handwriting |
P1.33 | 9048 | Towards Writing Style Adaptation in Handwriting Recognition | Jan Kohút, Michal Hradiš and Martin Kišš | Handwriting |
P1.34 | 9806 | Group, Contrast and Recognize: A Self-supervised Method for Chinese Character Recognition | Xinzhe Jiang, Jun Du, Pengfei Hu, Mobai Xue, Jiefeng Ma, Jiajia Wu and Jianshu Zhang | Handwriting |
P1.35 | 9904 | Weakly supervised information extraction from inscrutable handwritten document images | Sujoy Paul, Gagan Madan, Akankshya Mishra, Narayan Hegde, Pradeep Kumar and Gaurav Aggarwal | Handwriting |
P1.36 | 1827 | TDAE: Text Detection with Affinity Areas and Evolution Strategies | Kefan Ma, Yuchen Luo, Zheng Huang, Kai Chen, Jie Guo and Weidong Qiu | Scene Text |
P1.37 | 2311 | Scene Text Recognition with Image-Text Matching-guided Dictionary | Jiajun Wei, Hongjian Zhan, Xiao Tu, Yue Lu and Umapada Pal | Scene Text |
P1.38 | 3232 | Open-Set Text Recognition via Shape-Awareness Visual Reconstruction | Chang Liu, Chun Yang and Xu-Cheng Yin | Scene Text |
P1.39 | 4204 | Text Enhancement: Scene Text Recognition in Hazy Weather | En Deng, Gang Zhou, Jiakun Tian, Yangxin Liu and Zhenhong Jia | Scene Text |
P1.40 | 5525 | TPFNet: A Novel Text In-painting Transformer for Text Removal | Onkar Susladkar, Dhruv Makwana, Gayatri Deshmukh, Sparsh Mittal, R Sai Chandra Teja and Rekha Singhal | Scene Text |
P1.41 | 1934 | Incremental Learning and Ambiguity Rejection for Document Classification | Tri-Cong Pham, Mickaël Coustaty, Aurélie Joseph, Vincent Poulain d’Andecy, Muriel Visani and Nicolas Sidere | Text & Document Recognition |
P1.42 | 2309 | A Graphical Approach to Document Layout Analysis | Jilin Wang, Michael Krumdick, Baojia Tong, Delphine Vendryes, Hamima Halim, Maxim Sokolov, Vadym Barda and Chris Tanner | Text & Document Recognition |
P1.43 | 2678 | Ensuring an error-free transcription on a full engineering tags dataset through unsupervised Post-OCR methods | Mathieu Francois and Véronique Eglin | Text & Document Recognition |
P1.44 | 3475 | Optimizing the Performance of Text Classification Models by Improving the Isotropy of the Embeddings using a Joint Loss Function | Joseph Attieh, Abraham Woubie Zewoudie, Vladimir Vlassov, Adrian Flanagan and Tom Bäckström | Text & Document Recognition |
P1.45 | 3833 | DocParser: end-to-end OCR-free information extraction from Visually Rich Documents | Mohamed Dhouib, Ghassen Bettaieb and Aymen Shabou | Text & Document Recognition |
P1.46 | 4289 | A Hybrid Approach to Document Layout Analysis for Heterogeneous Document Images | Zhuoyao Zhong, Jiawei Wang, Haiqing Sun, Kai Hu, Erhan Zhang, Lei Sun and Qiang Huo | Text & Document Recognition |
P1.47 | 4485 | You Only Look for a Symbol Once: An Object Detector for Symbols and Regions in Documents | William Smith and Toby Pillatt | Text & Document Recognition |
P1.48 | 5017 | TACTFUL: A framework for Targeted Active Learning for Document Analysis | Venkatapathy Subramanian, Sagar Poudel, Ganesh Ramakrishnan and Parag Chaudhuri | Text & Document Recognition |
P1.49 | 6512 | Evaluating Adversarial Robustness on Document Image Classification | Timothée Fronteau, Arnaud Paran and Aymen Shabou | Text & Document Recognition |
P1.50 | 7080 | Layout Analysis of Historical Document Images using a Light Fully Convolutional Network | Najoua Rahal, Lars Vögtlin and Rolf Ingold | Text & Document Recognition |
P1.51 | 8595 | Detecting Text on Historical Maps by Selecting Best Candidates of Deep Neural Networks Output | Gerasimos Matidis, Basilis Gatos, Anastasios Kesidis and Panagiotis Kaddas | Text & Document Recognition |
P1.52 | inv-2 | ICDAR 2023 Competition on Video Text Reading for Dense and Small Text | Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas and Xiang Bai | Competition |
P1.53 | inv-10 | ICDAR 2023 Competition on Born Digital Video Text Question Answering | Zhibo Yang, Xiaoge Song, Sibo Song, Tong Lu, Xiang Bai, Cheng-Lin Liu, Fei Huang and Cong Yao | Competition |
P1.54 | inv-5 | ICDAR 2023 Competition on Indic Handwriting Text Recognition | Ajoy Mondal and C. V. Jawahar | Competition |
P1.55 | inv-11 | ICDAR 2023 Competition on Reading the Seal Title | Wenwen Yu, Mingyu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas and Xiang Bai | Competition |
P1.56 | inv-16 | ICDAR 2023 Competition on Detecting Tampered Text in Images | Dongliang Luo, Yu Zhou, Rui Yang, Yuliang Liu, Xianjin Liu, Jishen Zeng, Enming Zhang, Biao Yang, Ziming Huang, Lianwen Jin and Xiang Bai | Competition |
Oral Session 9 – DAR 2: Camera Images and Scene Text
Chair: Seiichi Uchida
Tuesday, August 22, 2023 – 16:00-18:00 PDT
O9.1 | 2427 | ViSA: Visual and Semantic Alignment for Robust Scene Text Recognition | Zhenru Pan, Zhilong Ji, Xiao Liu, Jinfeng Bai and Cheng-Lin Liu |
O9.2 | J144 | An Accurate Approach to Real-time Machine Readable Zone Detection with Mobile Devices (IJDAR Track) | Alexander Gayer, Daria Ershova, Vladimir V. Arlazarov |
O9.3 | 9711 | DQ-DETR: Dynamic Queries Enhanced Detection Transformer for Arbitrary Shape Text Detection | Chixiang Ma, Lei Sun, Jiawei Wang and Qiang Huo |
O9.4 | 1462 | Decoupling Visual-Semantic Features Learning with Dual Masked Autoencoder for Self-Supervised Scene Text Recognition | Zhi Qiao, Zhilong Ji, Ye Yuan and Jinfeng Bai |
O9.5 | 7705 | Re-thinking Text Clustering for Images with Text | Shwet Kamal Mishra, Soham Joshi and Viswanath Gopalakrishnan |
O9.6 | 2326 | Scene Table Structure Recognition with Segmentation and Key Point Collaboration | Li Zhuoming, Peng Fan, Xue Yang, Ni Hao and Jin Lianwen |
Oral Session 10 – Handwriting 3: Document Synthesis
Chair: Robert Sablatnig
Tuesday, August 22, 2023 – 16:00-18:00 PDT
O10.1 | J150 | Historical Document Image Analysis using Controlled Data for Pre-Training (IJDAR Track) | Najoua Rahal, Lars Vögtlin, Rolf Ingold |
O10.2 | 1580 | Handwritten Text Generation with Character-specific Encoding for Style Imitation | Jan Zdenek and Hideki Nakayama |
O10.3 | 3176 | How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning | Vittorio Pippi, Silvia Cascianelli, Christopher Kermorvant and Rita Cucchiara |
O10.4 | 4838 | TBM-GAN: Synthetic Document Generation with Degraded Background | Arnab Poddar, Soumyadeep Dey, Pratik Jawanpuria, Jayanta Mukhopadhyay and Prabir Kumar Biswas |
O10.5 | 1250 | Styled Text-to-Text-Content-Image Generation with Latent Diffusion Models | Konstantina Nikolaidou, George Retsinas, Vincent Christlein, Mathias Seuret, Giorgos Sfikas, Elisa Barney Smith, Hamam Mokayed and Marcus Liwicki |
O10.6 | 6936 | Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition | Dongnan Gui, Kai Chen, Haisong Ding and Qiang Huo |
Oral Session 11 – Competitions
Chair: Kenny Davila
Wednesday, August 23, 2023 – 09:00-10:20 PDT
O11.1 | inv-13 | ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical Expressions | Yejing Xie, Harold Mouchère, Foteini Simistira Liwicki, Sumit Rakesh, Rajkumar Saini, Masaki Nakagawa, Cuong Tuan Nguyen and Thanh-Nghia Truong |
O11.2 | inv-8 | ICDAR 2023 Competition on Hierarchical Text Detection and Recognition | Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii and Michalis Raptis |
O11.3 | inv-15 | ICDAR 2023 Competition on RoadText Video Text Detection, Tracking and Recognition | George Tom, Minesh Mathew, Sergi Garcia, Dimosthenis Karatzas and C. V. Jawahar |
O11.4 | inv-4 | ICDAR 2023 Competition on Document UnderstanDing of Everything (DUDE) | Jordy Van Landeghem, Rubèn Tito, Łukasz Borchmann, Michał Pietruszka, Dawid Jurkiewicz, Rafał Powalski, Paweł Józiak, Sanket Biswas, Mickaël Coustaty and Tomasz Stanisławek |
Oral Session 12 – Graphics 3: Math Recognition
Chair: Harold Mouchère
Wednesday, August 23, 2023 – 09:00-10:20 PDT
O12.1 | 1641 | Relative position embedding asymmetric siamese network for Offline handwritten mathematical expression recognition | Chunyi Wang, Wei Hu, Xiaqing Rao, Runqi Luohu, Ning Bi and Tan Jun |
O12.2 | 2017 | EDSL: An Encoder-Decoder Architecture with Symbol-Level Features for Printed Mathematical Expression Recognition | Yingnan Fu, Tingting Liu, Ming Gao and Aoying Zhou |
O12.3 | 4261 | Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition | Zhuang Liu, Ye Yuan, Zhilong Ji, Jinfeng Bai and Xiang Bai |
O12.4 | 8247 | An Encoder-Decoder Method with Position-Aware for Printed Mathematical Expression Recognition | Quan Hong, Jun Long and Liu Yang |
Oral Session 13 – DAR 3: Text and Document Recognition
Chair: Mickaël Coustaty
Wednesday, August 23, 2023 – 10:50-12:30 PDT
O13.1 | 9627 | A hybrid model for multilingual OCR | David Etter, Cameron Carpenter and Nolan King |
O13.2 | 1261 | Multi-Teacher Knowledge Distillation for End-to-End Text Image Machine Translation | Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou and Chengqing Zong |
O13.3 | J162 | Printed Ottoman Text Recognition Using Synthetic Data and Data Augmentation (IJDAR Track) | Esma F. Bilgin Tasdemir |
O13.4 | J149 | Classification of Incunable Glyphs and Out-of-distribution Detection with Joint Energy-based Models (IJDAR Track) | Florian Kordon, Nikolaus Weichselbaumer, Randall Herz, Stephen Mossman, Edward Potten, Mathias Seuret, Martin Mayr, Vincent Christlein |
O13.5 | J154 | Analyzing the Potential of Active Learning for Document Image Classification (IJDAR Track) | Saifullah Saifullah, Stefan Agne, Andreas Dengel, Sheraz Ahmed |
Oral Session 14 – Applications 2: Document Analysis Systems
Chair: Faisal Shafait
Wednesday, August 23, 2023 – 10:50-12:30 PDT
O14.1 | 7163 | Multimodal Scoring Model for Handwritten Chinese Essay | Tonghua Su, Jifeng Wang, Hongming You and Zhongjie Wang |
O14.2 | 3025 | FCN-Boosted Historical Map Segmentation with Little Training Data | Josef Baloun, Ladislav Lenc and Pavel Král |
O14.3 | 1830 | MemeGraphs: Linking Memes to Knowledge Graphs | Vasiliki Kougia, Simon Fetzel, Thomas Kirchmair, Erion Çano, Sina Baharlou, Sahand Sharifzadeh and Benjamin Roth |
O14.4 | J159 | Scheme for Palimpsests Reconstruction Using Synthesized Dataset (IJDAR Track) | Boraq Madi, Reem Alaasam, Raed Shammas and Jihad El-Sana |
O14.5 | 9420 | Context Aware Document Binarization and Its Application to Information Extraction from Structured Documents | Ján Koloda and Jue Wang |
Poster Session 2
Wednesday, August 23, 2023 – 14:30-16:00 PDT
P2.1 | 1419 | Analyzing the Impact of Tokenization on Multilingual Epidemic Surveillance in Low-resource Languages | Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Adam Jatowt, Gaël Lejeune and Moses Odeo | D-NLP |
P2.2 | 2100 | Analysing Textual Information from Financial Statements for Default Prediction | Chinesh Doshi, Himani Shrotriya, Rohit Bhiogade, Himanshu Sharad Bhatt and Abhishek Jha | D-NLP |
P2.3 | 3165 | An Iterative Graph Learning Convolution Network for Key Information Extraction Based on the Document Inductive Bias | Jiyao Deng, Yi Zhang, Xinpeng Zhang, Zhi Tang and Liangcai Gao | D-NLP |
P2.4 | 4804 | A Benchmark of Nested Named Entity Recognition Approaches in Historical Structured Documents | Solenn Tual, Nathalie Abadie, Bertrand Duménieu, Joseph Chazalon and Edwin Carlinet | D-NLP |
P2.5 | 5441 | LayoutGCN: A Lightweight Architecture for Visually Rich Document Understanding | Dengliang Shi, Siliang Liu, Jintao Du and Huijia Zhu | D-NLP |
P2.6 | 6475 | Detecting Forged Receipts with Domain-specific Ontology-based Entities & Relations | Beatriz Martínez Tornés, Emanuela Boros, Petra Gomez-Krämer, Antoine Doucet and Jean-Marc Ogier | D-NLP |
P2.7 | 7131 | A Character-level Document Key Information Extraction Method with Contrastive Learning | Xinpeng Zhang, Liangcai Gao and Jiyao Deng | D-NLP |
P2.8 | 9403 | Semantic triple-assisted learning for question answering passage re-ranking | Dinesh Nagumothu, Bahadorreza Ofoghi and Peter Eklund | D-NLP |
P2.9 | 9981 | Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Seif Edinne Laatiri, Pirashanth Ratnamogan, Joël Tang, Laurent Lam, William Vanhuffel and Fabien Caspani | D-NLP |
P2.10 | 3928 | Ambigram Generation by A Diffusion Model | Takahiro Shirakawa and Seiichi Uchida | Data and Synthesis |
P2.11 | 5155 | CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data | Michał Turski, Tomasz Stanisławek, Karol Kaczmarek, Paweł Dyda and Filip Graliński | Data and Synthesis |
P2.12 | 6077 | Augraphy: A Data Augmentation Library for Document Images | Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini and Jonathan Boarman | Data and Synthesis |
P2.13 | 7774 | SIMARA: a database for key-value information extraction from full-page handwritten documents | Solène Tarride, Mélodie Boillet, Jean-François Moufflet and Christopher Kermorvant | Data and Synthesis |
P2.14 | 9867 | Receipt Dataset for Document Forgery Detection | Beatriz Martínez Tornés, Théo Taburet, Emanuela Boros, Kais Rouis, Petra Gomez-Krämer, Nicolas Sidere, Antoine Doucet and Vincent Poulain d’Andecy | Data and Synthesis |
P2.15 | 200 | MIDV-Holo: a dataset for ID document hologram detection in a video stream | Leisan Koliaskina, Ekaterina Emelianova, Daniil Tropin, Vladimir Popov, Konstantin Bulatov, Dmitry Nikolaev and Vladimir V. Arlazarov | Data and Synthesis |
P2.16 | 2013 | LineFormer: Line Chart Data Extraction using Instance Segmentation | Jay Lal, Aditya Mitkari, Mahesh Bhosale and David Doermann | Graphics |
P2.17 | 2566 | PyramidTabNet: Transformer based Table Recognition in Image-based Documents | Muhammad Umer, Ahmed Mohsin, Adnan Ul-Hasan and Faisal Shafait | Graphics |
P2.18 | 5671 | Linear Object Detection in Document Images using Multiple Object Tracking | Philippe Bernet, Joseph Chazalon, Edwin Carlinet, Alexandre Bourquelot and Elodie Puybareau | Graphics |
P2.19 | 6754 | Contour Completion by Transformers and Its Application to Vector Font Data | Yusuke Nagata, Brian Kenji Iwana and Seiichi Uchida | Graphics |
P2.20 | 9308 | Formerge: Recover spanning cells in complex table structure using transformer network | Nam Quan Nguyen, Anh Duy Le, Anh Khoa Lu, Xuan Toan Mai and Tuan Anh Tran | Graphics |
P2.21 | 125 | A Shallow Graph Neural Network with Innovative Node Updating for Online Handwritten Stroke Classification | Yan-Rong Wang, Da-Han Wang, Xiao-Long Yun, Yan-Ming Zhang, Fei Yin and Shunzhi Zhu | Handwriting |
P2.22 | 590 | Improved Learning for Online Handwritten Chinese Text Recognition with Convolutional Prototype Network | Yi Chen, Heng Zhang and Cheng-Lin Liu | Handwriting |
P2.23 | 1118 | Modeling Cross-layer Interaction for Chinese Calligraphy Style Classification | Zhigang Li, Li Liu, Taorong Qiu, Yue Lu and Ching Y. Suen | Handwriting |
P2.24 | 1887 | OCR Language Models with Custom Vocabularies | Peter Garst, Yasuhisa Fujii and Reeve Ingle | Handwriting |
P2.25 | 2745 | Sampling and Ranking for Digital Ink Generation on a tight computational budget | Andrii Maksai, Andrei Afonin, Aleksandr Timofeev and Claudiu Musat | Handwriting |
P2.26 | 4033 | Decoupled Learning for Long-Tailed Oracle Character Recognition | Jing Li, Bin Dong, Qiu-Feng Wang, Lei Ding, Rui Zhang and Kaizhu Huang | Handwriting |
P2.27 | 4287 | Shared-Operation Hypercomplex Networks for Handwritten Text Recognition | Giorgos Sfikas, George Retsinas, Panagiotis Dimitrakopoulos, Basilis Gatos and Christophoros Nikou | Handwriting |
P2.28 | 6036 | Precise Segmentation for Children Handwriting Analysis by Combining Multiple Deep Models with Online Knowledge | Simon Corbillé, Éric Anquetil and Élisa Fromont | Handwriting |
P2.29 | 7310 | Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition | Jan Kohút and Michal Hradiš | Handwriting |
P2.30 | 7663 | AFFGANwriting: A handwriting image generation method based on multi-feature fusion | Heng Wang, Yiming Wang and Hongxi Wei | Handwriting |
P2.31 | 8156 | SegCTC: Offline Handwritten Chinese Text Recognition via Better Fusion between Explicit and Implicit Segmentation | Jiarong Huang, Dezhi Peng, Hongliang Li, Hao Ni and Lianwen Jin | Handwriting |
P2.32 | 8727 | A System for Processing and Recognition of Greek Byzantine and Post-Byzantine Documents | Panagiotis Kaddas, Konstantinos Palaiologos, Basilis Gatos, Vassilis Katsouros and Katerina Christopoulou | Handwriting |
P2.33 | 9669 | Historical document image segmentation combining deep learning and Gabor features | Maroua Mehri, Akrem Sellami and Salvatore Tabbone | Handwriting |
P2.34 | 9897 | Content-Aware Urdu Handwriting Generation | Zeeshan Memon, Adnan Ul-Hasan and Faisal Shafait | Handwriting |
P2.35 | 1429 | Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation | Renshen Wang, Yasuhisa Fujii and Alessandro Bissacco | Scene Text |
P2.36 | 2111 | Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution | Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren and Xiang Bai | Scene Text |
P2.37 | 2850 | E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou and Chengqing Zong | Scene Text |
P2.38 | 3409 | Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning | Sergi Garcia-Bordils, Dimosthenis Karatzas and Marçal Rusiñol | Scene Text |
P2.39 | 5000 | Reading Between the Lanes: Text VideoQA on the Road | George Tom, Minesh Mathew, Sergi Garcia, Dimosthenis Karatzas and C. V. Jawahar | Scene Text |
P2.40 | 864 | Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections | Lluis Gomez, Francesc Net, Pep Casals-Puig and Marc Folia | Text & Document Recognition |
P2.41 | 2194 | EEBO-Verse: Sifting for Poetry in Large Early Modern Corpora using Visual Features | Danlu Chen, Nan Jiang and Taylor Berg-Kirkpatrick | Text & Document Recognition |
P2.42 | 2627 | Gaussian Kernels based Network for Multiple License Plate Number Detection in Day-Night Images | Soumi Das, Shivakumara Palaiahnakote, Umapada Pal and Raghavendra Ramachandra | Text & Document Recognition |
P2.43 | 2771 | Unraveling confidence: examining confidence scores as proxy for OCR quality | Mirjam Cuper, Corine van Dongen and Tineke Koster | Text & Document Recognition |
P2.44 | 3792 | FTDNet: Joint Semantic Learning for Scene Text Detection in Adverse Weather Conditions | Jiakun Tian, Gang Zhou, Yangxin Liu, En Deng and Zhenhong Jia | Text & Document Recognition |
P2.45 | 4178 | MUGS: A Multiple Granularity Semi-Supervised Method for Text Recognition | Qi Song, Qianyi Jiang, Wang Lei, Lingling Zhao and Rui Zhang | Text & Document Recognition |
P2.46 | 4319 | ColDBin: Cold Diffusion for Document Image Binarization | Saifullah Saifullah, Stefan Agne, Andreas Dengel and Sheraz Ahmed | Text & Document Recognition |
P2.47 | 4548 | SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition | Junyi Zhang, Chang Liu and Chun Yang | Text & Document Recognition |
P2.48 | 5951 | End-to-end Multi-line License Plate Recognition with Cascaded Perception | Song-Lu Chen, Qi Liu, Feng Chen and Xu-Cheng Yin | Text & Document Recognition |
P2.49 | 6516 | UTRNet: High-Resolution Urdu Text Recognition In Printed Documents | Abdur Rahman, Chetan Arora and Arjun Ghosh | Text & Document Recognition |
P2.50 | 7319 | Combining OCR Models for Reading Early Modern Books | Mathias Seuret, Janne van der Loop, Nikolaus Weichselbaumer, Martin Mayr, Janina Molnar, Tatjana Hass and Vincent Christlein | Text & Document Recognition |
P2.51 | inv-6 | ICDAR 2023 Competition on Visual Question Answering on Business Document Images | Sachin Raja, Ajoy Mondal and C. V. Jawahar | Competition |
P2.52 | inv-7 | ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents | Christoph Auer, Ahmed Nassar, Maksym Lysak, Michele Dolfi, Nikolaos Livathinos and Peter Staar | Competition |
P2.53 | inv-12 | ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images | Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang and Xiang Bai | Competition |
P2.54 | inv-9 | ICDAR 2023 Competition on Detection and Recognition of Greek Letters on Papyri | Mathias Seuret, Isabelle Marthot-Santaniello, Stephen A. White, Olga Serbaeva Saraogi, Selaudin Agolli, Guillaume Carrière, Dalia Rodriguez-Salas and Vincent Christlein | Competition |
P2.55 | inv-14 | ICDAR 2023 Competition on Recognition of Multi-line Handwritten Mathematical Expressions | Chenyang Gao, Yuliang Liu, Shiyu Yao, Jinfeng Bai, Xiang Bai, Lianwen Jin and Cheng-Lin Liu | Competition |
P2.56 | N/A | ICDAR 2023 Competition on Document Information Localization and Extraction | Stepan Simsa, Milan Sulc, Matyas Skalicky, Yash Patel and Ahmed Hamdi | Competition |
Doctoral Consortium
Tuesday, August 22, 2023 – 14:30-16:00 PDT
Wednesday, August 23, 2023 – 14:30-16:00 PDT
DC.1 | Computer Vision Techniques for Handwritten Optical Music Recognition | Pau Torras |
DC.2 | Graph based deep learning research for recognition of on-line handwritten mathematical expression | Yejing Xie |
DC.3 | Strokes Trajectory Recovery for Unconstrained Handwritten Documents with Automatic Evaluation | Sidra Hanif |
DC.4 | Enabling Deep Document Image Analysis with Generative Models | Konstantina Nikolaidou |
DC.5 | Enhancing Information Extraction in Business Documents through Line-Level Analysis and Automation | Eliott Thomas |
DC.6 | Writer Retrieval for Historical Documents | Marco Peer |
DC.7 | HTR for distant reading of medieval charters | Nicolas Renet |
DC.8 | Line-of-Sight Graph Attention and Graph-based Task Interaction (LGATI) for Visual Parsing of Math Formulas and Chemical Diagrams | Ayush Kumar Shah |