November 18, 2025
Does AI-assisted capsule endoscopy improve detection of small bowel ulcers and erosions in Crohn’s disease?
AI-assisted reading significantly increased the detection yield and sensitivity of capsule endoscopy for small bowel Crohn’s disease, with lower specificity than standard reading, and may be a useful clinical adjunct.
Andrade P, Mascarenhas M, Mendes F, et al. AI-Assisted Capsule Endoscopy for Detection of Ulcers and Erosions in Crohn’s Disease: A Multicenter Validation Study. Clin Gastroenterol Hepatol. Epub ahead of print Nov 5, 2025; https://www.sciencedirect.com
This multicenter validation study evaluated an artificial intelligence (AI) model for detecting small bowel ulcers and erosions on capsule endoscopy in patients with suspected or known Crohn’s disease (CD). The study included 259 patients from four centers across Portugal, Spain and the United States who underwent capsule endoscopy between January 2021 and April 2024 using either PillCam SB3 (91.9%) or Olympus EC-10 (8.1%) devices. The mean patient age was 57.3 years, and 55.2% were female. Roughly 80% of patients underwent capsule endoscopy for suspected CD, while the remainder had confirmed disease.
The AI-assisted reading system (Deep Capsule) was compared against standard-of-care reading by an endoscopist. An independent expert review board reviewed all of the footage and established the reference standard for ulcer and erosion detection.
AI-assisted reading identified lesions in 109 patients (42.1%; 95% confidence interval [CI]: 36.0–48.1), while standard-of-care reading detected lesions in 65 patients (25.1%; 95% CI: 19.8–30.4) (p<0.001). Review by the expert board confirmed ulcers and erosions in 93 of 259 patients (35.9%). The mean difference in detection yield of 17 percentage points (95% CI: 12.3–21.7) demonstrated statistical non-inferiority as well as superiority for AI-assisted reading.
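The detection yields and their intervals follow directly from the reported patient counts. The sketch below recomputes them with a normal-approximation (Wald) interval. This summary does not state which interval method the authors used, so the method and the helper name `wald_ci` are assumptions, and the bounds may differ from the reported ones in the last decimal place.

```python
import math

def wald_ci(events, n, z=1.96):
    """Return (proportion, lower, upper) using a normal-approximation
    95% interval. The interval method is an assumption, not the study's."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

N_PATIENTS = 259  # cohort size reported in the summary

# Patients with lesions detected by each reading strategy (reported counts).
for label, events in [("AI-assisted", 109), ("standard-of-care", 65)]:
    p, lo, hi = wald_ci(events, N_PATIENTS)
    print(f"{label}: {p:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

The interval for the paired difference cannot be rebuilt from these marginal counts alone, because it depends on how often the two readings disagreed within the same patient.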
AI-assisted reading had 90.2% sensitivity, 84.4% specificity, 76.1% positive predictive value, 94% negative predictive value and 86.5% overall accuracy. In contrast, standard-of-care reading achieved 69.6% sensitivity, 99.4% specificity, 98.5% positive predictive value, 85.6% negative predictive value and 88.8% accuracy. The area under the receiver operating characteristic curve was 0.876 for AI-assisted reading and 0.840 for standard-of-care.
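For readers who want to see how these measures relate to one another, the sketch below computes them from a 2x2 table. The per-cell counts are not given in this summary: the values used here are illustrative, back-calculated so that the formulas reproduce the reported percentages to one decimal place. They sum to 92 reference-positive patients, one fewer than the 93 reported above, so they should not be read as the study's actual table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# ASSUMPTION: illustrative counts back-calculated from the reported
# percentages; they are not taken from the published study.
ai = diagnostic_metrics(tp=83, fp=26, fn=9, tn=141)
soc = diagnostic_metrics(tp=64, fp=1, fn=28, tn=166)

for name, metrics in [("AI-assisted", ai), ("standard-of-care", soc)]:
    print(name, {k: f"{v:.1%}" for k, v in metrics.items()})
```

Under these assumed counts, the trade-off is visible in the cells themselves: AI-assisted reading misses fewer reference-positive patients (fewer false negatives) but flags more reference-negative patients (more false positives).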
AI-assisted reading identified 568 of 600 five-minute video segments containing ulcers or erosions confirmed by expert board review (94.7%; 95% CI: 92.6–96.3). The median time required for physicians to review AI-flagged frames and complete reporting for an entire capsule endoscopy examination was 172 seconds (interquartile range: 127 seconds).
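The segment-level detection rate can be checked the same way. The sketch below uses a Wilson score interval, a common choice for proportions close to 0 or 1. The summary does not say which method the authors used, so this is an assumption, as is the helper name `wilson_ci`. The result should land close to the reported bounds but need not match them exactly.

```python
import math

def wilson_ci(events, n, z=1.96):
    """Wilson score 95% interval for a binomial proportion
    (assumed method; the study's method is not stated here)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Reported counts: 568 of 600 lesion-containing segments were flagged.
lo, hi = wilson_ci(568, 600)
print(f"{568 / 600:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```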
Subgroup analysis by indication showed that for patients with suspected CD, AI-assisted reading achieved a 37.7% detection yield, compared with 19.8% for standard-of-care reading. In patients with confirmed CD, the yields were 59.6% and 46.2%, respectively. AI-assisted reading demonstrated superior sensitivity in both groups, though in the suspected CD cohort its positive predictive value was lower than that of standard-of-care reading (69.2% vs. 97.6%), reflecting a higher rate of false positives. The researchers found these false positives were primarily attributable to poor visualization, bowel contents, bubbles and non-erosive inflammatory findings.
Performance remained consistent across both capsule endoscopy devices and all participating centers, supporting the model’s generalizability and real-world applicability.
Details
Study Design: Cross-sectional cohort study
Funding: None reported
Allocation: Not applicable
Setting: Multicenter
Level of Evidence: 2b