Medical VLM with Global Anomaly Detection (MEDIC-AD)


Abstract

Lesion detection, symptom tracking, and visual explainability are central to real-world medical image analysis, yet current medical vision-language models (VLMs) still lack mechanisms that translate their broad knowledge into clinically actionable outputs.

To bridge this gap, we present MEDIC-AD, a clinically oriented VLM that strengthens these capabilities through a stage-wise framework. First, learnable anomaly-aware tokens ($\langle Ano \rangle$) encourage the model to focus on abnormal regions and build more discriminative lesion-centered representations. Second, inter-image difference tokens ($\langle Diff \rangle$) explicitly encode temporal changes between studies, allowing the model to distinguish worsening, improvement, and stability in disease burden.
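The token mechanism above can be sketched schematically. The snippet below is an illustrative toy in numpy, not the paper's implementation: the token counts, embedding size, and sequence layout (prepending $\langle Ano \rangle$ tokens and interleaving $\langle Diff \rangle$ tokens between the current and prior study) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8        # embedding dimension (illustrative, not the paper's value)
N_PATCH = 16 # image patch tokens per study (assumption)
N_ANO = 4    # number of learnable <Ano>/<Diff> tokens (assumption)

# Hypothetical learnable token banks; in a real model these would be
# trained parameters, here they are just random initializations.
ano_tokens = rng.normal(size=(N_ANO, D))
diff_tokens = rng.normal(size=(N_ANO, D))

def build_sequence(curr_patches, prev_patches=None):
    """Prepend <Ano> tokens to the current study's patch embeddings;
    when a prior study is available, also insert <Diff> tokens before
    its patches so the model can attend to temporal change."""
    parts = [ano_tokens, curr_patches]
    if prev_patches is not None:
        parts += [diff_tokens, prev_patches]
    return np.concatenate(parts, axis=0)

curr = rng.normal(size=(N_PATCH, D))
prev = rng.normal(size=(N_PATCH, D))
seq = build_sequence(curr, prev)
print(seq.shape)  # (N_ANO + N_PATCH + N_ANO + N_PATCH, D) = (40, 8)
```

The point of the sketch is only the interface: anomaly and difference tokens enter the transformer as extra sequence positions, so attention can route lesion-relevant and change-relevant information through them.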

Finally, a dedicated explainability stage trains the model to generate heatmaps that highlight lesion-related regions, offering visual evidence that is consistent with the model's reasoning. Through this staged design, MEDIC-AD progressively improves performance across anomaly detection, symptom tracking, and anomaly segmentation, outperforming both closed-source and medical-specialized baselines.
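One common way such heatmaps are produced is by projecting token-to-patch attention back onto the image grid. The following is a minimal numpy sketch of that general idea, assuming hypothetical attention scores from an $\langle Ano \rangle$ token over a small patch grid; the grid size, upsampling factor, and nearest-neighbor upsampling are assumptions, not details from MEDIC-AD.

```python
import numpy as np

G = 4      # patches per image side (illustrative)
SCALE = 8  # upsampling factor from patch grid to pixel grid (assumption)

rng = np.random.default_rng(1)
# Hypothetical attention scores from an <Ano> token to each image patch.
attn = rng.random(G * G)
attn = attn / attn.sum()          # normalize into a distribution over patches
grid = attn.reshape(G, G)         # lay scores out on the patch grid
# Nearest-neighbor upsample to pixel resolution via a Kronecker product.
heatmap = np.kron(grid, np.ones((SCALE, SCALE)))
print(heatmap.shape)  # (32, 32)
```

In practice a trained model would replace the random scores with learned attention or a dedicated segmentation head, but the mapping from per-patch scores to a pixel-level heatmap follows the same shape logic.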

Evaluations on real longitudinal clinical data collected from hospital workflows further show that MEDIC-AD delivers stable predictions and clinically faithful explanations in practical patient-monitoring and decision-support settings.