ICDAR 2026 Competition on Long-Term Handwriting Author Identification

The AnyScript Challenge

The first benchmark to evaluate writer identification systems across manuscripts separated by decades, incorporating the temporal dimension of handwriting evolution.

1,019 Authors

2,286 Books

527,340 Pages

300+ Years

Introduction & Motivation

Traditional writer identification frameworks treat handwriting style as a static snapshot, overlooking the crucial dimension of temporal evolution. Handwriting naturally evolves due to aging, context, and changing writing materials, limiting the real-world applicability of current approaches.

The AnyScript competition introduces temporal transitivity to authorship identification, challenging systems to identify common authorship across manuscripts separated by decades. Our dataset from the Spanish Digital Library (Biblioteca Digital Hispánica) spans 300+ years of authentic autograph manuscripts.

Temporal Evolution

Account for progressive changes in writing style over time

Multi-Page Context

Process long contextual sequences efficiently

Historical Significance

18th-20th century literary and artistic works

Task Definition

The competition follows a query-by-example retrieval framework. Given a query document (page or book), systems must retrieve other documents written by the same author from the dataset.

For further details on how the tasks are formulated, refer to this README.md file.
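As a rough illustration of the query-by-example setup, the sketch below ranks a gallery of documents by cosine similarity to a query. It assumes each document has already been mapped to a fixed-length feature vector by some model. The embedding step, the function names `cosine` and `rank_gallery`, and the choice of cosine similarity are illustrative assumptions, not part of the competition specification.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_gallery(query_vec, gallery):
    """Return (document_id, score) pairs sorted from most to least similar.

    `gallery` maps a document id to its (hypothetical) feature vector.
    """
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The ranked list produced this way is the kind of output the retrieval metrics below would consume.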

Intra-Book Evaluation

Goal: Identify which books query pages belong to

Input: 1,000 individual pages excluded from training books

Output: Page-level retrieval rankings

Challenge: Match pages to their source books

Extra-Book Evaluation

Goal: Identify books by the same author across temporal spans

Input: 200 complete books excluded from training set

Output: Book-level retrieval rankings

Challenge: Long-term stylistic continuity across decades

Temporal Transitivity

Systems are expected to incorporate temporal reasoning. Since direct inference between temporally distant manuscripts (A ↔ C) is challenging, methods should leverage intermediate associations through temporally closer manuscripts (A ↔ B ↔ C).
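One possible way to realize the A ↔ B ↔ C idea is to let a pair of manuscripts inherit similarity through a chain of intermediates, where a chain is only as strong as its weakest link. The sketch below computes this "widest path" closure over a pairwise similarity matrix with a Floyd-Warshall style loop. It is an illustrative assumption rather than the organizers' method, and for simplicity it ignores dates and allows any manuscript to act as an intermediate; `chained_similarity` is a hypothetical name.

```python
def chained_similarity(sim):
    """Widest-path closure of a square similarity matrix.

    best[i][j] becomes the maximum, over all chains from i to j,
    of the smallest pairwise similarity along the chain.
    """
    n = len(sim)
    best = [row[:] for row in sim]  # copy so the input is not modified
    for m in range(n):  # m is the candidate intermediate manuscript
        for i in range(n):
            for j in range(n):
                via = min(best[i][m], best[m][j])
                if via > best[i][j]:
                    best[i][j] = via
    return best
```

For example, if A and C have a direct similarity of 0.2 but A ↔ B is 0.9 and B ↔ C is 0.8, the chained similarity between A and C rises to 0.8. A date-aware variant could restrict intermediates to manuscripts written between the two endpoints.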

Evaluation Metrics

Three complementary metrics evaluate retrieval performance:

Recall@K

Measures the proportion of all relevant documents that appear among the top K retrieved items (K = 50).

Recall@K = (Number of relevant documents in top K) / (Total number of relevant documents)
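The formula above can be sketched in a few lines of Python. This is a minimal reading of the definition, not the official evaluation script; the function name and the convention of returning 0.0 for a query with no relevant documents are assumptions.

```python
def recall_at_k(ranked_ids, relevant_ids, k=50):
    """Fraction of all relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # assumed convention for queries without relevant documents
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)
```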

Mean Average Precision (mAP)

Summarizes ranking performance by computing the average precision (AP) of each query and then averaging it across all queries. Rewards systems that rank relevant items higher.

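AP(q) = (1/R_q) Σ_k P_q(k) × rel_q(k)
mAP = (1/Q) Σ_q AP(q)

Here R_q is the number of relevant documents for query q, P_q(k) is the precision at rank k, rel_q(k) is 1 if the item at rank k is relevant and 0 otherwise, and Q is the number of queries.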
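A minimal sketch of these two formulas follows, assuming binary relevance and rankings without duplicate entries. The function names and the dictionary-based input layout are illustrative choices, not the official evaluation interface.

```python
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of the precision values at each relevant hit."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # assumed convention for queries without relevant documents
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    return precision_sum / len(relevant)


def mean_average_precision(rankings, relevance):
    """mAP over queries; both arguments are dicts keyed by query id."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranked, relevance.get(query_id, ()))
        for query_id, ranked in rankings.items()
    )
    return total / len(rankings)
```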

Normalized Discounted Cumulative Gain (nDCG)

Uses the normalized temporal distance as the relevance score to promote historically coherent rankings.

DCG_q = Σ (2^rel_q(k) - 1) / log₂(k + 1)
nDCG_q = DCG_q / IDCG_q
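The two formulas can be sketched as follows. The input is the list of relevance scores in ranked order. How temporal distance is mapped to a relevance score is not specified here, so the values passed in are placeholders; computing the ideal ranking from the same list of scores is also an assumption.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(k + 1)
        for k, rel in enumerate(relevances, start=1)
    )


def ndcg(ranked_relevances):
    """DCG divided by the DCG of the ideal (descending) ordering of the same scores."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    if ideal == 0:
        return 0.0  # assumed convention when no item has positive relevance
    return dcg(ranked_relevances) / ideal
```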

The AnyScript Dataset

Dataset Statistics

500,000+ images (colored + binarized)
2,286 books by 1,019 unique authors
Temporal span: 300+ years (18th-20th centuries)
Content: Novels, plays, music scores, literary works
Language: Primarily Spanish
Source: Spanish Digital Library autograph copies

Data Characteristics

The dataset contains verified autograph manuscripts - original works produced by their creators. Documents labeled as autographs but dating after authors' deaths have been removed to ensure authenticity.

Temporal coverage focuses on the late 19th and early 20th centuries, with some authors having continuous book representation spanning 4 decades, enabling study of long-term stylistic evolution.

Dataset Examples

The AnyScript dataset contains diverse historical documents including personal letters, theater scripts, music scores, and literary works spanning over 300 years. Below are representative examples of the types of manuscripts included in our collection.

Personal letter from 19th century

Historical handwritten correspondence demonstrating the evolution of personal writing styles and formal letter composition.

Music score manuscript

Handwritten musical notation showcasing the unique challenges of identifying authorship in specialized symbolic writing systems.

Literary work

Antique handwritten document illustrating the temporal evolution of handwriting styles across different literary contexts.

Temporal Challenge

These examples demonstrate the core challenge of our competition: identifying common authorship across documents that may span decades, accounting for natural evolution in handwriting style, different contexts (personal vs. formal), and varying writing materials.

Submission Format

Both evaluation tracks require CSV file submissions with the following structure:

Required CSV Columns

query_document_id,retrieved_document_id,similarity_score

Intra-Book Track

Submit page-to-page similarity scores for 1,000 query pages against all training pages.

Example:
query_page_001,train_page_1234,0.89
query_page_001,train_page_5678,0.76

Extra-Book Track

Submit book-to-book similarity scores for 200 query books against all training books.

Example:
query_book_001,train_book_045,0.92
query_book_001,train_book_128,0.68
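A submission file in this layout could be produced with Python's standard `csv` module, as in the sketch below. Whether a header row is expected is not stated above, so the `include_header` flag is an assumption, as are the function name, the four-decimal score formatting, and the ordering of rows by descending score.

```python
import csv
import io


def write_submission(handle, rankings, include_header=True):
    """Write rows of query_document_id, retrieved_document_id, similarity_score.

    `rankings` maps a query id to a list of (retrieved_id, score) pairs.
    """
    writer = csv.writer(handle)
    if include_header:  # assumption: drop the header if the server rejects it
        writer.writerow(
            ["query_document_id", "retrieved_document_id", "similarity_score"]
        )
    for query_id, scored in rankings.items():
        # Most similar documents first within each query.
        for doc_id, score in sorted(scored, key=lambda p: p[1], reverse=True):
            writer.writerow([query_id, doc_id, f"{score:.4f}"])


# Usage: write to an in-memory buffer; a real run would pass
# open("submission.csv", "w", newline="") instead.
buffer = io.StringIO()
write_submission(
    buffer,
    {"query_book_001": [("train_book_128", 0.68), ("train_book_045", 0.92)]},
)
csv_text = buffer.getvalue()
```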

Downloads

Dataset Component	Description	Status
Training Set	Full-resolution colored images from training books	700 GB
Training Set (Binarized)	Binarized versions of training images	Coming Soon
Training Labels	Author and date annotations for training data	Dates coming soon
Test Queries	Unlabeled test queries for evaluation	See submission examples!

Competition Schedule

January 2026

Competition Launch

Competition website and validation set live

March 01, 2026

Test Set Release

Test set available on evaluation server

April 03, 2026

Results Deadline

Competition results submission deadline

April 17, 2026

Report Submission

Initial competition report submission due

May 5, 2026

Camera-Ready Papers

Camera-ready paper submission due

June 22, 2026

Winners Announced

Competition winners announcement

Organizing Team

Adrià Molina Rodríguez

PhD Candidate

amolina@cvc.uab.cat

Cultural heritage, data governance, responsible AI, retrieval systems, data visualization, graphs

Carlos Boned Riera

PhD Student

cboned@cvc.uab.cat

Representation learning, graph-based methods, historical manuscripts, retrieval systems

Pau Torras