VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics
Abstract: Visual attribution in medical imaging seeks to make evident the diagnostically relevant components of a medical image, in contrast to the more common detection of diseased tissue deployed in standard machine vision pipelines, whose outputs are less straightforwardly interpretable or explainable to clinicians. We present a novel generative visual attribution technique that leverages latent diffusion models in combination with domain-specific large language models (LLMs) to generate normal counterparts of abnormal images. The discrepancy between an abnormal image and its generated normal counterpart yields a map indicating the diagnostically relevant image components. To achieve this, we deploy image priors in conjunction with appropriate conditioning mechanisms to control the image generation process, including natural language text prompts acquired from medical science and applied radiology. We perform experiments on the COVID-19 Radiography Database, which contains labelled chest X-rays with differing pathologies, and quantitatively evaluate our results using the Fréchet Inception Distance (FID), Structural Similarity (SSIM) and Multi-Scale Structural Similarity (MS-SSIM) metrics computed between real and generated images. The resulting system also exhibits a range of latent capabilities, including zero-shot localized disease induction, which are evaluated with real examples from the CheXpert dataset.
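The discrepancy step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the discrepancy is computed, so the absolute per-pixel difference and the rescaling to [0, 1] used here are assumptions, and the function name `discrepancy_map` is hypothetical. The "generated normal" image is a stand-in array, not the output of a latent diffusion model.

```python
import numpy as np


def discrepancy_map(abnormal: np.ndarray, generated_normal: np.ndarray) -> np.ndarray:
    """Return an attribution map as the absolute per-pixel difference.

    Assumption: a plain absolute difference rescaled to [0, 1]; the paper's
    actual discrepancy computation may differ.
    """
    if abnormal.shape != generated_normal.shape:
        raise ValueError("images must have the same shape")
    diff = np.abs(abnormal.astype(np.float64) - generated_normal.astype(np.float64))
    peak = diff.max()
    # Avoid division by zero when the two images are identical.
    return diff / peak if peak > 0 else diff


# Toy example: a 4x4 "abnormal" image with a bright 2x2 patch, and a
# stand-in "normal counterpart" in which that patch has been removed.
abnormal = np.zeros((4, 4))
abnormal[1:3, 1:3] = 0.8
generated_normal = np.zeros((4, 4))

attribution = discrepancy_map(abnormal, generated_normal)
```

In this toy case the map is nonzero only over the 2x2 patch, which is the region that differs between the two images. In practice the second argument would be the model-generated normal counterpart of the input scan.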