About
Welcome to the Workshop on Medical Video Understanding (MedVidU), held in conjunction with ECCV 2026.
Medical video understanding is a rapidly emerging frontier at the intersection of computer vision and healthcare. Medical procedures generate massive volumes of video data, yet automatically interpreting this data (recognizing phases, detecting critical anatomical structures, grounding temporal events, and generating fine-grained textual descriptions) remains an open and challenging problem. Recent advances in large multimodal models and video foundation models have created new opportunities to tackle these tasks, but standardized benchmarks spanning diverse tasks and community-wide evaluation protocols are still lacking.
This workshop addresses this gap by bringing together researchers working on medical AI and video understanding. To catalyze progress, the workshop hosts the MedVidU Challenge, centered on the diverse MedVidBench dataset (CVPR 2026), spanning four domains: laparoscopic surgery, open surgery, robotic surgery, and nursing.
MedVidU Challenge
The MedVidU Challenge is built on the MedVidBench dataset (CVPR 2026), providing a unified evaluation across diverse tasks including Critical View of Safety (CVS) assessment, next action prediction, skill assessment, temporal action grounding, dense video captioning, and video summary & region captioning. The benchmark spans four domains: laparoscopic surgery, open surgery, robotic surgery, and nursing.
Challenge Format
The challenge runs in two phases. See the Challenge Guide for detailed step-by-step instructions.
Phase 1 — Public Leaderboard
- Download the MedVidU ECCV 2026 train/val split from UII-AI/MedVidU_ECCV2026_TrainVal and train your model.
- Run inference on the MedVidBench test set and submit your predictions to the MedVidBench Leaderboard.
- Challenge participants are required to submit a report on OpenReview (coming soon) for the MedVidU Challenge.
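As a rough illustration of the first Phase 1 step, the sketch below fetches the train/val split with the `huggingface_hub` package. It assumes that `UII-AI/MedVidU_ECCV2026_TrainVal` is a Hugging Face dataset repository and that your account has access to it (registration or a login token may be needed). The `download_trainval` helper and the `data/medvidu_trainval` directory are hypothetical names chosen for this example; the Challenge Guide remains the authoritative reference.

```python
from pathlib import Path

# Repository ID as given in the Phase 1 instructions.
REPO_ID = "UII-AI/MedVidU_ECCV2026_TrainVal"


def download_trainval(local_dir: str = "data/medvidu_trainval") -> Path:
    """Download the train/val split and return the local directory.

    Assumption: REPO_ID is a Hugging Face *dataset* repository.
    Change repo_type if the Challenge Guide says otherwise.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,  # hypothetical target directory
    )
    return Path(path)


# Example usage (requires network access and `pip install huggingface_hub`):
#     root = download_trainval()
#     print(sorted(p.name for p in root.iterdir()))
```

Listing the top-level entries after the download is a quick way to see how the split is organised before writing a data loader.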
Phase 2 — Held-Out Evaluation
- The top 5 teams on the Phase 1 leaderboard will be further evaluated on a completely held-out test set.
- A Docker image will be released for Phase 2 submissions.
Confidentiality: We encourage teams to open-source their models, but this is not required. Submitted Docker images can be kept private; we never share or open-source them, and we delete them after evaluation. See the Challenge Guide for details.
Important Dates
| Milestone | Date |
|---|---|
| Registration Opens & Dataset Release (HuggingFace) | May 20, 2026 |
| Beginning of Phase 1 (Public Validation Leaderboard) | May 20, 2026 |
| Workshop Paper Submission (Mandatory for Challenge Participants) | July 13, 2026 (23:59 AoE) |
| Phase 1 Deadline (Public Leaderboard Closes) | July 13, 2026 (23:59 AoE) |
| Phase 2 Docker Image Released (Top 5 Teams) | TBA |
*Deadlines are Anywhere on Earth (AoE). Timelines are subject to change.
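Because AoE deadlines are easy to misread, here is a minimal sketch of converting one to UTC and to your local time. It relies only on the fact that Anywhere on Earth corresponds to UTC-12; the `aoe_deadline_to_utc` helper is a hypothetical name for this example, and the date used is the Phase 1 deadline listed above, which is subject to change.

```python
from datetime import datetime, timedelta, timezone

# Anywhere on Earth (AoE) corresponds to a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_to_utc(year, month, day, hour=23, minute=59):
    """Return the UTC instant of a deadline stated in AoE."""
    deadline = datetime(year, month, day, hour, minute, tzinfo=AOE)
    return deadline.astimezone(timezone.utc)


# Phase 1 deadline: July 13, 2026 (23:59 AoE).
phase1_close = aoe_deadline_to_utc(2026, 7, 13)
print(phase1_close.isoformat())  # 2026-07-14T11:59:00+00:00

# The same instant in the time zone of the machine running the script.
print(phase1_close.astimezone().isoformat())
```

In other words, a 23:59 AoE deadline falls at 11:59 UTC on the following calendar day.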
The top 3 teams will be invited to present their methods at the workshop.
We plan to prepare an extended journal submission together with all Phase 2 qualifiers. Prize information will be available soon.
Call for Papers
Please submit your manuscript formatted according to the ECCV 2026 author guidelines. We intend to publish accepted papers in the ECCV 2026 workshop proceedings.
We encourage submissions reporting novel theories, methods, and applications of video understanding in medical and surgical settings.
Note: MedVidU 2026 Challenge participants are required to submit a paper.
Topics of Interest
- Temporal action grounding in medical video
- Dense video captioning for medical procedures
- Surgical tool detection and tracking
- Medical video foundation models and multimodal large language models
- Surgical phase and action recognition
- Medical scene understanding from videos and 3D/4D reconstruction
- Privacy-preserving analysis of surgical recordings
- Explainability and trustworthiness in AI for medical videos
- Efficient video representation learning for long medical procedures
Important Dates
| Milestone | Date |
|---|---|
| Paper Submission | July 13, 2026 (23:59 AoE) |
| ECCV Accepted Paper Submission | July 24, 2026 (23:59 AoE) |
| Paper Notification | July 31, 2026 |
| Camera Ready | August 15, 2026 |
*All deadlines are Anywhere on Earth (AoE). Timelines are subject to change. The submission portal will be announced soon.
Keynote Speakers
Speakers to be announced.
Workshop Program
Half-day in-person event · Tentative schedule
| Time | Activity |
|---|---|
| 08:30 – 08:45 | Opening Remarks |
| 08:45 – 09:25 | Keynote #1 — 30 min talk + 10 min Q&A |
| 09:25 – 09:35 | Accepted Paper Oral Presentation #1 |
| 09:35 – 09:45 | Accepted Paper Oral Presentation #2 |
| 09:45 – 09:55 | Accepted Paper Oral Presentation #3 |
| 09:55 – 10:35 | Coffee Break / Poster Session |
| 10:35 – 11:15 | Keynote #2 — 30 min talk + 10 min Q&A |
| 11:15 – 11:30 | MedVidU Challenge: Introduction |
| 11:30 – 12:00 | Presentations from Top 3 Teams (10 min each) |
| 12:00 – 12:15 | Challenge Awards & Closing Remarks |
Organizers
United Imaging Intelligence · University of Strasbourg / IHU Strasbourg · TUM
Venue & Location
The workshop will be held in conjunction with ECCV 2026.
Workshop Location
Malmö, Sweden
8–9 September 2026
MedVidU is co-located with the European Conference on Computer Vision (ECCV 2026). Please refer to the main ECCV 2026 website for details on travel and accommodation.
Contact
For general inquiries about the workshop, please email medvidu@googlegroups.com.