The rapid development of Document Intelligence (DI) has transformed traditional Document Analysis and Recognition (DAR) into a sophisticated, AI-driven field. This summer school provides cutting-edge tools for information extraction using Retrieval-Augmented Generation (RAG), Vision-Language Models (VLMs), and structured knowledge representation.
Through a Challenge-Based Learning framework, participants progress from foundational lectures to hands-on practice sessions, and finish by solving real-world industry challenges in collaborative teams.
Lessons learned from training large-scale vision-language models.
How to use VLMs to extract structured information from documents.
Use graph neural networks to analyze and interpret document structures.
Convert documents into structured knowledge graphs for enhanced understanding.
How to augment your models with graph retrieval-augmented generation techniques.
Panel with experts working on trustworthiness and explainability in document analysis.
Panels with experts discussing the latest research and challenges in historical document analysis.
Fine-tune a small VLM on documents of a specific domain.
Explore the use of graph neural networks for document structure analysis.
Build an agentic system using GraphRAG for advanced document analysis.
Teams of 5 participants will tackle real-world industry problems.
Students in their 1st-3rd year working on document analysis, information retrieval, or related fields
Advanced students preparing for PhD studies in document analysis
Industry professionals seeking to deepen expertise in document understanding
We offer 5 full registration fee grants. Apply during registration!
Evaluation criteria:
A stunning glacial valley nestled in the heart of the Pyrenees mountains of Catalonia, at approximately 2,000 meters elevation. It is accessible only by a scenic cogwheel train and offers a peaceful, inspiring environment away from everyday distractions.
All participants stay at the hotel in the center of the valley. Shared rooms (double/triple) with private bathrooms. All meals included: buffet-style breakfast, lunch, and dinner with vegetarian/vegan options.
Includes accommodation, all meals, coffee breaks, transport, and materials
Capacity: Limited to 50 participants • Format: In-person only • Language: English
5 full registration grants available thanks to IAPR support. Apply during registration!
Questions? Contact: {amolina, allabres}@cvc.uab.es
Sponsorship opportunity available
Computer Vision Center
Universitat Autònoma de Barcelona
IAPR 10th and 11th Technical Committees
IAPR
ELLIS Unit Barcelona
Barcelona
ELLIOT Project
Horizon Europe project
Support the next generation of document understanding researchers and gain visibility in the academic community.
Associate Director, CVC
Head of Vision, Language and Reading Group
Director, CVC
Head of Document Analysis Group
Researcher, CVC
Coordinator of the AI Degree at UAB
PhD Student, CVC
Document Analysis • Retrieval Systems
PhD Student, CVC
Vision & Language • Document Understanding
Nuria Martinez • Aurora Garcia • Xavier Galvez