Next generation medical image interpretation with MedGemma 1.5 and medical speech output with MedASR

Next generation medical image interpretation with MedGemma 1.5 and medical speech output with MedASR

By research.google
Publication Date: 2026-01-13 12:00:00

Improved performance for medical imaging use cases

MedGemma was designed from the ground up as a multimodal model that reflects the multimodal nature of medicine. MedGemma 1 included support for the interpretation of two-dimensional medical images, including chest radiographs, dermatology images, fundus images, and histopathology patches.

With MedGemma 1.5, we expand support for high-dimensional medical imaging, starting with three-dimensional volume representations of CT imaging and MRI, as well as whole-body histopathological imaging. Developers can create applications in which multiple slices (for CT or MRI) or multiple patches (for histopathology) are provided as input along with a prompt describing the task.

In internal benchmarks, the absolute baseline accuracy of MedGemma 1.5 improved by 3% (61% vs. 58%) over MedGemma 1 in classifying disease-related CT findings and by 14% (65% vs. 51%)…