Multimodal Generalized Zero Shot Learning for Gleason Grading using Self-Supervised Learning

Published 15 Nov 2021 in cs.CV, cs.LG, and eess.IV | (2111.07646v1)

Abstract: Gleason grading from histopathology images is essential for accurate prostate cancer (PCa) diagnosis. Since such images are obtained after invasive tissue resection, quick diagnosis is challenging under the existing paradigm. We propose a method to predict Gleason grades from magnetic resonance (MR) images, which are non-interventional and easily acquired. We solve the problem in a generalized zero-shot learning (GZSL) setting since we may not access training images of every disease grade. Synthetic MRI feature vectors of unseen grades (classes) are generated by exploiting Gleason grades' ordered nature through a conditional variational autoencoder (CVAE) incorporating self-supervised learning. Corresponding histopathology features are generated using cycle GANs, and combined with MR features to predict Gleason grades of test images. Experimental results show our method outperforms competing feature generating approaches for GZSL, and comes close to the performance of fully supervised methods.

Summary

The paper introduces a novel multimodal framework combining GZSL, CVAE, and self-supervised learning to enhance Gleason grading using non-invasive MRI data.
The methodology employs cycle-GANs and contrastive predictive coding to synthesize and transform MRI-derived features for accurately classifying unseen Gleason grades.
Experimental results demonstrate near-supervised performance, validating the approach's effectiveness in predicting both seen and unseen Gleason grades.

Introduction

The paper presents a method for predicting Gleason grades from magnetic resonance (MR) images by combining a generalized zero-shot learning (GZSL) framework with self-supervised learning (SSL). Traditionally, Gleason grading is performed on high-resolution histopathology images, which requires an invasive procedure to obtain tissue. The proposed approach offers a non-invasive alternative: MR images are easy to acquire but have lower resolution and more noise than digital pathology images, so the method relies on feature synthesis techniques to compensate.

The GZSL framework addresses the lack of training images for every Gleason grade, a practical consequence of how difficult it is to obtain a fully annotated dataset covering all diagnostic categories. The method exploits the ordered nature of Gleason grades to generate synthetic MRI feature vectors for unseen classes with a conditional variational autoencoder (CVAE). Cycle-consistent generative adversarial networks (cycle-GANs) then transform MR features into histopathology feature representations, and the combined features are used to classify and grade test images.

Methodology

Feature Extraction and Transformation

The method begins by extracting feature vectors from both MR and digital pathology images, using a ResNet-50 network as the feature extractor for each modality. The central challenge in GZSL is generating accurate feature representations for classes that are unseen during training. The proposed approach combines MR and histopathology features and uses them to train a softmax classifier. Cycle-GANs bridge the modality gap by learning mapping functions between the MR and digital pathology feature spaces.

Figure 1: Training workflow. Feature extraction from MR and digital pathology images generates the respective feature vectors $F_{MRI}$ and $F_{DP}$.
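The classification step described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the linear map `W_mr_to_dp` is a hypothetical stand-in for the learned MR-to-histopathology mapping, concatenation is assumed as the way the two feature vectors are combined, and the dimensions and class count are toy values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 8      # hypothetical toy size; a standard ResNet-50 pooled feature is 2048-d
NUM_GRADES = 5    # hypothetical number of Gleason grade classes

# Hypothetical stand-in for the learned MR -> histopathology feature mapping.
W_mr_to_dp = rng.normal(scale=0.1, size=(FEAT_DIM, FEAT_DIM))

def mr_to_dp(f_mri):
    """Map MR features into the histopathology feature space (toy linear map)."""
    return f_mri @ W_mr_to_dp

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def predict_grades(f_mri, W_cls, b_cls):
    """Concatenate MR and generated histopathology features, then classify."""
    f_dp = mr_to_dp(f_mri)
    fused = np.concatenate([f_mri, f_dp], axis=1)   # shape (N, 2 * FEAT_DIM)
    probs = softmax(fused @ W_cls + b_cls)
    return probs.argmax(axis=1), probs

# Untrained classifier weights, used only to exercise the shapes.
W_cls = rng.normal(scale=0.1, size=(2 * FEAT_DIM, NUM_GRADES))
b_cls = np.zeros(NUM_GRADES)

f_mri_batch = rng.normal(size=(4, FEAT_DIM))
labels, probs = predict_grades(f_mri_batch, W_cls, b_cls)
```

At test time only the MR feature is needed as input, since the histopathology feature is generated from it; this is what makes the pipeline usable without tissue resection.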

CVAE and Self-Supervised Learning

A CVAE synthesizes MRI features, which are then transformed into histopathology features. The transformation network is trained with an adversarial loss term for domain adaptation and a constraint that enforces cycle-consistency between input and transformed features. A significant innovation in this work is the integration of self-supervised learning: the ordered relationships between Gleason grades are leveraged to generate new features using contrastive predictive coding (CPC), which helps bridge the semantic gap between seen and unseen classes.
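To make the CVAE's role concrete, the sketch below shows the standard building blocks of a conditional VAE: the reparameterization trick, the KL term against a standard normal prior, and class-conditional sampling at generation time. It is a generic illustration rather than the paper's network. The one-hot grade encoding used as the condition, the linear decoder `W_dec`, and all dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 4   # hypothetical latent size
FEAT_DIM = 8     # hypothetical feature size
NUM_GRADES = 5   # hypothetical number of grade classes

def one_hot(grade, num_classes=NUM_GRADES):
    """Encode a grade index as a one-hot condition vector (an assumption)."""
    vec = np.zeros(num_classes)
    vec[grade] = 1.0
    return vec

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

# Toy linear decoder standing in for a trained decoder network.
W_dec = rng.normal(scale=0.1, size=(LATENT_DIM + NUM_GRADES, FEAT_DIM))

def generate_features(grade, n_samples, rng):
    """Draw latent codes from the prior and decode them conditioned on a grade."""
    z = rng.standard_normal((n_samples, LATENT_DIM))
    cond = np.tile(one_hot(grade), (n_samples, 1))
    return np.concatenate([z, cond], axis=1) @ W_dec

# Synthesize ten feature vectors for a grade treated as unseen in this example.
synthetic = generate_features(grade=3, n_samples=10, rng=rng)
```

In a GZSL setting, features generated this way for unseen grades are added to the real features of seen grades before the classifier is trained.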

Figure 2: Feature visualizations: (a) seen and unseen classes from the actual dataset; (b) distribution of synthetic samples generated by MM$_{GZSL}$.
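Contrastive predictive coding, mentioned in the section above, is usually trained with the InfoNCE loss, which scores one positive pair against a set of negatives. The sketch below implements that generic loss for a single anchor. How the paper builds positive and negative pairs from the grade ordering is not described in this summary, so the pairing in the example, the cosine similarity, and the temperature value are illustrative assumptions only.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax score of the positive."""
    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # Index 0 holds the positive pair; the remaining entries are negatives.
    logits = np.array(
        [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    ) / temperature
    logits -= logits.max()                       # numerical stability
    log_prob_positive = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob_positive

# Hypothetical 2-d features: the loss is small when the positive matches the anchor.
anchor = np.array([1.0, 0.0])
loss_aligned = info_nce_loss(
    anchor,
    positive=np.array([1.0, 0.0]),
    negatives=[np.array([-1.0, 0.0]), np.array([0.0, 1.0])],
)
# The loss is large when a negative matches the anchor better than the positive.
loss_misaligned = info_nce_loss(
    anchor,
    positive=np.array([-1.0, 0.0]),
    negatives=[np.array([1.0, 0.0]), np.array([0.0, 1.0])],
)
```

Minimizing this loss pulls the anchor toward its positive and away from the negatives, which is the mechanism a contrastive objective uses to structure the feature space.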

Experimental Results

The proposed MM$_{GZSL}$ method is validated against several competing GZSL frameworks, including approaches based on GANs, over-complete distributions, and other SSL-based methods. The evaluations show that MM$_{GZSL}$ predicts Gleason grades more accurately than these methods for both seen and unseen classes. It approaches the performance of fully supervised methods even though it uses neither class attribute vectors nor unlabeled target data during training, both of which traditional GZSL methods commonly require.
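GZSL methods are commonly compared by reporting per-class accuracy on seen and unseen classes separately, together with their harmonic mean. This summary does not state which metrics the paper reports, so the sketch below shows only that common convention. The grade labels and the seen/unseen split are hypothetical.

```python
def per_class_accuracy(y_true, y_pred, classes):
    """Mean of per-class accuracies over the given classes."""
    accs = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        if not idx:
            continue  # skip classes absent from the test set
        correct = sum(1 for i in idx if y_pred[i] == c)
        accs.append(correct / len(idx))
    return sum(accs) / len(accs) if accs else 0.0

def gzsl_harmonic_mean(acc_seen, acc_unseen):
    """Harmonic mean of seen and unseen accuracy; 0 if both are 0."""
    if acc_seen + acc_unseen == 0:
        return 0.0
    return 2 * acc_seen * acc_unseen / (acc_seen + acc_unseen)

# Hypothetical labels: grades 1-3 are treated as seen, grade 4 as unseen.
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 1, 2, 1, 3, 3, 4, 2]

acc_seen = per_class_accuracy(y_true, y_pred, classes=[1, 2, 3])    # 5/6
acc_unseen = per_class_accuracy(y_true, y_pred, classes=[4])        # 1/2
h_score = gzsl_harmonic_mean(acc_seen, acc_unseen)                  # 0.625
```

The harmonic mean is low whenever either accuracy is low, so a model cannot score well by classifying only the seen grades correctly.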

The comparative analysis indicates that combining CVAE feature generation with self-supervised learning improves classification performance across multiple Gleason grades. Ablation studies show that each component of the framework matters: excluding any single term markedly degrades performance.

Conclusion

This research contributes a multimodal GZSL approach for prostate cancer Gleason grading that achieves high accuracy from non-invasive MR images. The method stands out for eliminating the need for explicit class attribute vectors and for using SSL in feature synthesis. Because it predicts grades from MR data alone, it could support earlier diagnosis and has potential for broader clinical adoption and extension to other medical imaging problems, although further work is required to refine the registration of MR and histopathology images when MRI quality is low. Future directions may include applying the framework to other pathological imaging tasks.
