125 | A Shallow Graph Neural Network with Innovative Node Updating for Online Handwritten Stroke Classification | Yan-Rong Wang, Da-Han Wang, Xiao-Long Yun, Yan-Ming Zhang, Fei Yin and Shunzhi Zhu |
171 | Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model | Haisong Ding, Bozhi Luan, Dongnan Gui, Kai Chen and Qiang Huo |
200 | MIDV-Holo: a dataset for ID document hologram detection in a video stream | Leisan Koliaskina, Ekaterina Emelianova, Daniil Tropin, Vladimir Popov, Konstantin Bulatov, Dmitry Nikolaev and Vladimir V. Arlazarov |
491 | Aligning benchmark datasets for table structure recognition | Brandon Smock, Rohith Pesala and Robin Abraham |
590 | Improved Learning for Online Handwritten Chinese Text Recognition with Convolutional Prototype Network | Yi Chen, Heng Zhang and Cheng-Lin Liu |
832 | Vision Conformer: Incorporating Convolutions into Vision Transformer Layers | Brian Kenji Iwana and Akihiro Kusuda |
864 | Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections | Francesc Net, Marc Folia, Pep Casals-Puig and Lluis Gomez |
1118 | Modeling Cross-layer Interaction for Chinese Calligraphy Style Classification | Zhigang Li, Li Liu, Taorong Qiu, Yue Lu and Ching Y. Suen |
1120 | Evaluation of different tagging schemes for Named Entity Recognition in Handwritten Documents | David Villanova-Aparisi, Carlos David Martinez-Hinarejos, Verónica Romero and Moisés Pastor-Gadea |
1419 | Analyzing the Impact of Tokenization on Multilingual Epidemic Surveillance in Low-resource Languages | Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Adam Jatowt, Gaël Lejeune and Moses Odeo |
1429 | Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation | Renshen Wang, Yasuhisa Fujii and Alessandro Bissacco |
1442 | Exploring Semantic Word Representations for Recognition-free NLP on Handwritten Document Images | Oliver Tüselmann and Gernot A. Fink |
1633 | DAMGCN: Entity Linking in Visually Rich Documents with Dependency-Aware Multimodal Graph Convolutional Network | Yi-Ming Chen, Xiang-Ting Hou, Dong-Fang Lou, Zhi-Lin Liao and Cheng-Lin Liu |
1827 | TDAE: Text Detection with Affinity Areas and Evolution Strategies | Kefan Ma, Yuchen Luo, Zheng Huang, Kai Chen, Jie Guo and Weidong Qiu |
1887 | OCR Language Models with Custom Vocabularies | Peter Garst, Yasuhisa Fujii and Reeve Ingle |
1934 | Incremental Learning and Ambiguity Rejection for Document Classification | Tri-Cong Pham, Mickaël Coustaty, Aurélie Joseph, Vincent Poulain D’Andecy, Muriel Visani and Nicolas Sidere |
2013 | LineFormer: Line Chart Data Extraction using Instance Segmentation | Jay Lal, Aditya Mitkari, Mahesh Bhosale and David Doermann |
2095 | A Unified Architecture for Urdu Printed and Handwritten Text Recognition | Arooba Maqsood, Nauman Riaz, Adnan Ul-Hasan and Faisal Shafait |
2100 | Analysing Textual Information from Financial Statements for Default Prediction | Chinesh Doshi, Himani Shrotriya, Rohit Bhiogade, Himanshu Sharad Bhatt and Abhishek Jha |
2111 | Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution | Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren and Xiang Bai |
2121 | Line-of-sight with Graph Attention Parser (LGAP) for Math Formulas | Ayush Kumar Shah and Richard Zanibbi |
2194 | EEBO-Verse: Sifting for Poetry in Large Early Modern Corpora using Visual Features | Danlu Chen, Nan Jiang and Taylor Berg-Kirkpatrick |
2309 | A Graphical Approach to Document Layout Analysis | Jilin Wang, Michael Krumdick, Baojia Tong, Delphine Vendryes, Hamima Halim, Maxim Sokolov, Vadym Barda and Chris Tanner |
2311 | Scene Text Recognition with Image-Text Matching-guided Dictionary | Jiajun Wei, Hongjian Zhan, Xiao Tu, Yue Lu and Umapada Pal |
2566 | PyramidTabNet: Transformer based Table Recognition in Image-based Documents | Muhammad Umer, Ahmed Mohsin, Adnan Ul-Hasan and Faisal Shafait |
2627 | Gaussian Kernels based Network for Multiple License Plate Number Detection in Day-Night Images | Soumi Das, Shivakumara Palaiahnakote, Umapada Pal and Raghavendra Ramachandra |
2678 | Ensuring an error-free transcription on a full engineering tags dataset through unsupervised Post-OCR methods | Mathieu Francois and Véronique Eglin |
2745 | Sampling and Ranking for Digital Ink Generation on a tight computational budget | Andrii Maksai, Andrei Afonin, Aleksandr Timofeev and Claudiu Musat |
2771 | Unraveling confidence: examining confidence scores as proxy for OCR quality | Mirjam Cuper, Corine van Dongen and Tineke Koster |
2850 | E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou and Chengqing Zong |
3015 | RealCQA: Scientific Chart Question Answering as a Test-bed for First-Order Logic | Saleem Ahmed, Bhavin Jawade, Shubham Pandey, Srirangaraj Setlur and Venu Govindaraju |
3165 | An Iterative Graph Learning Convolution Network for Key Information Extraction Based on the Document Inductive Bias | Jiyao Deng, Yi Zhang, Xinpeng Zhang, Zhi Tang and Liangcai Gao |
3232 | Open-Set Text Recognition via Shape-Awareness Visual Reconstruction | Chang Liu, Chun Yang and Xu-Cheng Yin |
3409 | Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning | Sergi Garcia-Bordils, Dimosthenis Karatzas and Marçal Rusiñol |
3475 | Optimizing the Performance of Text Classification Models by Improving the Isotropy of the Embeddings using a Joint Loss Function | Joseph Attieh, Abraham Woubie Zewoudie, Vladimir Vlassov, Adrian Flanagan and Tom Bäckström |
3789 | Linguistic Knowledge within Handwritten Text Recognition Models: A Real-World Case Study | Samuel Londner, Yoav Phillips, Hadar Miller, Nachum Dershowitz, Tsvi Kuflik and Moshe Lavee |
3792 | FTDNet: Joint Semantic Learning for Scene Text Detection in Adverse Weather Conditions | Jiakun Tian, Gang Zhou, Yangxin Liu, En Deng and Zhenhong Jia |
3833 | DocParser: end-to-end OCR-free information extraction from Visually Rich Documents | Mohamed Dhouib, Ghassen Bettaieb and Aymen Shabou |
3928 | Ambigram Generation by A Diffusion Model | Takahiro Shirakawa and Seiichi Uchida |
4033 | Decoupled Learning for Long-Tailed Oracle Character Recognition | Jing Li, Bin Dong, Qiu-Feng Wang, Lei Ding, Rui Zhang and Kaizhu Huang |
4066 | Analyzing Font Style Usage and Contextual Factors in Real Images | Naoya Yasukochi, Hideaki Hayashi, Daichi Haraguchi and Seiichi Uchida |
4083 | Faster DAN: Multi-target Queries with Document Positional Encoding for End-to-end Handwritten Document Recognition | Denis Coquenet, Clément Chatelain and Thierry Paquet |
4131 | QuOTeS: Query-Oriented Technical Summarization | Juan Antonio Ramirez-Orta, Eduardo Xamena, Ana Maguitman, Axel J. Soto, Flavia P. Zanoto and Evangelos Milios |
4178 | MUGS: A Multiple Granularity Semi-Supervised Method for Text Recognition | Qi Song, Qianyi Jiang, Wang Lei, Lingling Zhao and Rui Zhang |
4204 | Text Enhancement: Scene Text Recognition in Hazy Weather | En Deng, Gang Zhou, Jiakun Tian, Yangxin Liu and Zhenhong Jia |
4287 | Shared-Operation Hypercomplex Networks for Handwritten Text Recognition | Giorgos Sfikas, George Retsinas, Panagiotis Dimitrakopoulos, Basilis Gatos and Christophoros Nikou |
4289 | A Hybrid Approach to Document Layout Analysis for Heterogeneous Document Images | Zhuoyao Zhong, Jiawei Wang, Haiqing Sun, Kai Hu, Erhan Zhang, Lei Sun and Qiang Huo |
4319 | ColDBin: Cold Diffusion for Document Image Binarization | Saifullah Saifullah, Stefan Agne, Andreas Dengel and Sheraz Ahmed |
4485 | You Only Look for a Symbol Once: An Object Detector for Symbols and Regions in Documents | William Smith and Toby Pillatt |
4548 | SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition | Junyi Zhang, Chang Liu and Chun Yang |
4601 | DSS: Synthesizing long Digital Ink using Data augmentation, Style encoding and Split generation | Aleksandr Timofeev, Anastasiia Fadeeva, Andrii Maksai, Claudiu Musat and Andrei Afonin |
4804 | A Benchmark of Nested Named Entity Recognition Approaches in Historical Structured Documents | Solenn Tual, Nathalie Abadie, Bertrand Duménieu, Joseph Chazalon and Edwin Carlinet |
5000 | Reading Between the Lanes: Text VideoQA on the Road | George Tom, Minesh Mathew, Sergi Garcia, Dimosthenis Karatzas and C.V. Jawahar |
5003 | Line Graphics Digitization: A Step Towards Full Automation | Omar Moured, Jiaming Zhang, Alina Roitberg, Thorsten Schwarz and Rainer Stiefelhagen |
5017 | TACTFUL: A framework for Targeted Active Learning for Document Analysis | Venkatapathy Subramanian, Sagar Poudel, Ganesh Ramakrishnan and Parag Chaudhuri |
5117 | “Explain Thyself Bully”: Sentiment Aided Cyberbullying Detection with Explanation | Krishanu Maity, Prince Jha, Raghav Jain, Sriparna Saha and Pushpak Bhattacharyya |
5155 | CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data | Michał Turski, Tomasz Stanisławek, Karol Kaczmarek, Paweł Dyda and Filip Graliński |
5441 | LayoutGCN: A Lightweight Architecture for Visually Rich Document Understanding | Dengliang Shi, Siliang Liu, Jintao Du and Huijia Zhu |
5525 | TPFNet: A Novel Text In-painting Transformer for Text Removal | Onkar Susladkar, Dhruv Makwana, Gayatri Deshmukh, Sparsh Mittal, R Sai Chandra Teja and Rekha Singhal |
5671 | Linear Object Detection in Document Images using Multiple Object Tracking | Philippe Bernet, Joseph Chazalon, Edwin Carlinet, Alexandre Bourquelot and Elodie Puybareau |
5935 | ESTER-Pt: An Evaluation Suite for TExt Recognition in Portuguese | Moniele Kunrath Santos, Guilherme Bazzo, Lucas Lima de Oliveira and Viviane P. Moreira |
5939 | Topic Shift Detection in Chinese Dialogues: Corpus and Benchmark | Jiangyi Lin, Yaxin Fan, Feng Jiang, Xiaomin Chu and Peifeng Li |
5951 | End-to-end Multi-line License Plate Recognition with Cascaded Perception | Song-Lu Chen, Qi Liu, Feng Chen and Xu-Cheng Yin |
6036 | Precise Segmentation for Children Handwriting Analysis by Combining Multiple Deep Models with Online Knowledge | Simon Corbillé, Éric Anquetil and Élisa Fromont |
6077 | Augraphy: A Data Augmentation Library for Document Images | Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini and Jonathan Boarman |
6359 | TRACE: Table Reconstruction Aligned to Corner and Edges | Youngmin Baek, Daehyun Nam, Jaeheung Surh, Seung Shin and Seonghyeon Kim |
6471 | Fine-tuning Vision Encoder-Decoder Transformers for Handwriting Text Recognition on Historical Documents | Daniel Parres Montoya and Roberto Paredes Palacios |
6475 | Detecting Forged Receipts with Domain-specific Ontology-based Entities & Relations | Beatriz Martínez Tornés, Emanuela Boros, Petra Gomez-Krämer, Antoine Doucet and Jean-Marc Ogier |
6512 | Evaluating Adversarial Robustness on Document Image Classification | Timothée Fronteau, Arnaud Paran and Aymen Shabou |
6516 | UTRNet: High-Resolution Urdu Text Recognition In Printed Documents | Abdur Rahman, Chetan Arora and Arjun Ghosh |
6754 | Contour Completion by Transformers and Its Application to Vector Font Data | Yusuke Nagata, Brian Kenji Iwana and Seiichi Uchida |
6780 | CED: Catalog Extraction from Documents | Tong Zhu, Guoliang Zhang, Zechang Li, Zijian Yu, Junfei Ren, Mengsong Wu, Zhefeng Wang, Baoxing Huai, Pingfu Chao and Wenliang Chen |
7047 | TextREC: a Dataset for Referring Expression Comprehension with Reading Comprehension | Chenyang Gao, Biao Yang, Hao Wang, Mingkun Yang, Wenwen Yu, Yuliang Liu and Xiang Bai |
7080 | Layout Analysis of Historical Document Images using a Light Fully Convolutional Network | Najoua Rahal, Lars Vögtlin and Rolf Ingold |
7131 | A Character-level Document Key Information Extraction Method with Contrastive Learning | Xinpeng Zhang, Liangcai Gao and Jiyao Deng |
7310 | Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition | Jan Kohút and Michal Hradiš |
7319 | Combining OCR Models for Reading Early Modern Books | Mathias Seuret, Janne van der Loop, Nikolaus Weichselbaumer, Martin Mayr, Janina Molnar, Tatjana Hass and Vincent Christlein |
7403 | Incremental Teacher Model with Mixed Augmentations and Scheduled Pseudo-Label Loss for Handwritten Text Recognition | Masayuki Honda, Hung Tuan Nguyen, Cuong Tuan Nguyen, Cong Kha Nguyen, Ryosuke Odate, Takashi Kanemaru and Masaki Nakagawa |
7663 | AFFGANwriting: A handwriting image generation method based on multi-feature fusion | Heng Wang, Yiming Wang and Hongxi Wei |
7707 | Towards Making Flowchart Images Machine Interpretable | Shreya Shukla, Prajwal Gatti, Yogesh Kumar, Vikash Yadav and Anand Mishra |
7741 | SeamFormer: High Precision Text Line Segmentation for Handwritten Documents | Niharika Vadlamudi, Rahul Krishna and Ravi Kiran Sarvadevabhatla |
7774 | SIMARA: a database for key-value information extraction from full-page handwritten documents | Solène Tarride, Mélodie Boillet, Jean-François Moufflet and Christopher Kermorvant |
7991 | On Web-based Visual Corpus Construction for Visual Document Understanding | DongHyun Kim, Teakgyu Hong, Moonbin Yim, Yoonsik Kim and Geewook Kim |
8156 | SegCTC: Offline Handwritten Chinese Text Recognition via Better Fusion between Explicit and Implicit Segmentation | Jiarong Huang, Dezhi Peng, Hongliang Li, Hao Ni and Lianwen Jin |
8519 | DocImagen: Diffusion Model for Layout Conditioned Document Image Generation | Noman Tanveer, Adnan Ul-Hasan and Faisal Shafait |
8595 | Detecting Text on Historical Maps by Selecting Best Candidates of Deep Neural Networks Output | Gerasimos Matidis, Basilis Gatos, Anastasios Kesidis and Panagiotis Kaddas |
8630 | Adversarial Attacks on Convolutional Siamese Signature Verification Networks | Maham Jahangir, Muhammad Imran Malik and Faisal Shafait |
8652 | EnsExam: A Dataset for Handwritten Text Erasure on Examination Papers | Liufeng Huang, Bangdong Chen, Chongyu Liu, Dezhi Peng, Weiying Zhou, Yaqiang Wu, Hui Li, Hao Ni and Lianwen Jin |
8727 | A System for Processing and Recognition of Greek Byzantine and Post-Byzantine Documents | Panagiotis Kaddas, Konstantinos Palaiologos, Basilis Gatos, Vassilis Katsouros and Katerina Christopoulou |
8939 | Multimodal Rumour Detection: Catching news that never transpired! | Raghvendra Kumar, Ritika Sinha, Sriparna Saha and Adam Jatowt |
9048 | Towards Writing Style Adaptation in Handwriting Recognition | Jan Kohút, Michal Hradiš and Martin Kišš |
9308 | Formerge: Recover spanning cells in complex table structure using transformer network | Nam Quan Nguyen, Anh Duy Le, Anh Khoa Lu, Xuan Toan Mai and Tuan Anh Tran |
9362 | GriTS: Grid table similarity metric for table structure recognition | Brandon Smock, Rohith Pesala and Robin Abraham |
9403 | Semantic triple-assisted learning for question answering passage re-ranking | Dinesh Nagumothu, Bahadorreza Ofoghi and Peter Eklund |
9559 | I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection | Yongzhu Chang, Rongsheng Zhang and Jiashu Pu |
9669 | Historical document image segmentation combining deep learning and Gabor features | Maroua Mehri, Akrem Sellami and Salvatore Tabbone |
9806 | Group, Contrast and Recognize: A Self-supervised Method for Chinese Character Recognition | Xinzhe Jiang, Jun Du, Pengfei Hu, Mobai Xue, Jiefeng Ma, Jiajia Wu and Jianshu Zhang |
9867 | Receipt Dataset for Document Forgery Detection | Beatriz Martínez Tornés, Théo Taburet, Emanuela Boros, Kais Rouis, Petra Gomez-Krämer, Nicolas Sidere, Antoine Doucet and Vincent Poulain d’Andecy |
9897 | Content-Aware Urdu Handwriting Generation | Zeeshan Memon, Adnan Ul-Hasan and Faisal Shafait |
9904 | Weakly supervised information extraction from inscrutable handwritten document images | Sujoy Paul, Gagan Madan, Akankshya Mishra, Narayan Hegde, Pradeep Kumar and Gaurav Aggarwal |
9981 | Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Seif Edinne Laatiri, Pirashanth Ratnamogan, Joël Tang, Laurent Lam, William Vanhuffel and Fabien Caspani |