Case ID: M25-290L

Published: 2026-02-27 15:18:55

Last Updated: 2026-02-27 15:18:55


Inventor(s)

Yalin Wang
Xin Li
Wenhui Zhu
Oana Dumitrascu

Technology categories

Artificial Intelligence/Machine Learning
Computing & Information Technology
Diagnostic Assays/Devices
Imaging
Life Science (All LS Techs)
Medical Imaging

Licensing Contacts

Jovan Heusser
Director of Licensing and Business Development

RetinalGPT: Multimodal Large Language Model for Retinal Image Analysis

Invention Description
Multimodal large language models (MLLMs) have shown strong potential in analyzing complex data types such as images, video, and audio, prompting growing interest in their use for medical applications. While several general-domain MLLMs have been adapted for healthcare tasks, including retinal imaging, their performance remains limited when applied to clinically meaningful interpretation. In particular, existing models struggle to provide the quantitative analysis that medical experts rely on for accurate disease detection and assessment. This reveals a critical gap between general-purpose MLLMs and the specialized requirements of medical diagnostics, where precision, interpretability, and domain knowledge are essential. Bridging this gap is necessary for deploying MLLMs as reliable tools in clinical decision-making.
 
Researchers at Arizona State University have developed RetinalGPT, a multimodal conversational assistant designed specifically for the quantitative analysis of retinal images that clinicians prefer. This vision-language model combines large-scale retinal image datasets with a two-stage training process that aligns generic medical knowledge with specialized retinal diagnostics, enabling superior detection of retinal diseases and detailed lesion and vascular analyses. Beyond classification, it provides quantitative measurements and lesion localization, improving interpretability and clinical relevance.
 
RetinalGPT is a cutting-edge multimodal large language model designed to improve retinal disease diagnosis and lesion localization through advanced retinal image analysis.
 
Potential Applications
  • Clinical diagnosis and monitoring of retinal diseases
  • Automated lesion detection and localization tools in ophthalmology
  • Medical research in ophthalmology and retinal pathology
  • Development of AI-assisted diagnostic platforms for eye care
  • Integration into telemedicine platforms for remote retinal analysis
Benefits and Advantages
  • Detailed lesion localization and vascular structure analysis capabilities
  • Comprehensive processing of clinical features including disease labels, lesion bounding boxes, and vascular characteristics
  • Two-stage training that balances generic medical knowledge with retinal domain expertise
  • Improved interpretability in medical image analysis for clinical research
  • Specialized training dataset curated for clinical preferences in retinal analysis
  • Maintains broad medical knowledge while enhancing retinal-specific expertise
  • Superior performance across multiple benchmark datasets for retinal disease diagnosis
For more information about this opportunity, please see