
COMPRER: A Multimodal Multi-Objective Pretraining Framework for Enhanced Medical Image Representation

Published 4 Feb 2024 in cs.CV and cs.LG (arXiv:2403.09672v1)

Abstract: Substantial advances in multi-modal AI facilitate the combination of diverse medical modalities to achieve holistic health assessments. We present COMPRER, a novel multi-modal, multi-objective pretraining framework that enhances medical-image representation, diagnostic inference, and disease prognosis. COMPRER employs a multi-objective training scheme in which each objective introduces distinct knowledge to the model: a multimodal loss consolidates information across different imaging modalities; a temporal loss imparts the ability to discern patterns over time; a medical-measure prediction objective adds clinically relevant insight; and a reconstruction loss preserves the integrity of image structure within the latent space. Despite the concern that multiple objectives could weaken task performance, our findings show that this combination actually boosts outcomes on certain tasks. We apply the framework to both fundus images and carotid ultrasound, and validate its downstream-task capabilities by predicting both current and future cardiovascular conditions. COMPRER achieved higher Area Under the Curve (AUC) scores than existing models on held-out data when evaluating medical conditions. On the out-of-distribution (OOD) UK Biobank dataset, COMPRER maintains favorable performance over well-established models with more parameters, even though those models were trained on $75\times$ more data than COMPRER. In addition, to better assess our model's performance in contrastive learning, we introduce a novel evaluation metric that provides deeper understanding of the effectiveness of latent-space pairing.
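The abstract names four objectives that are combined during pretraining. As a minimal illustrative sketch only (the paper's actual architecture, loss definitions, and weighting are not specified here; all function names, dictionary keys, and weights below are hypothetical), a weighted sum of an InfoNCE-style multimodal loss, a temporal loss, a measure-prediction regression loss, and a reconstruction loss could look like:

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    # Standard InfoNCE contrastive loss over paired embeddings:
    # row i of z_a is the positive match for row i of z_b.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # log-softmax per row; the diagonal holds the matching pairs
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def total_loss(batch, weights):
    # Hypothetical weighted combination of the four objectives named
    # in the abstract (not the paper's actual formulation).
    l_modal = info_nce(batch["fundus_z"], batch["carotid_z"])     # cross-modal pairing
    l_temporal = info_nce(batch["visit1_z"], batch["visit2_z"])   # same subject over time
    l_measure = np.mean((batch["measure_pred"] - batch["measure_true"]) ** 2)
    l_recon = np.mean((batch["image_recon"] - batch["image"]) ** 2)
    return (weights["modal"] * l_modal
            + weights["temporal"] * l_temporal
            + weights["measure"] * l_measure
            + weights["recon"] * l_recon)
```

Each term is non-negative, so with non-negative weights the total is as well; in practice the relative weights would be tuned so that no single objective dominates the shared representation.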
