
Domain shifts in dermoscopic skin cancer datasets: Evaluation of essential limitations for clinical translation

Published 14 Apr 2023 in cs.CV | (2304.06968v3)

Abstract: The limited ability of Convolutional Neural Networks (CNNs) to generalize to images from previously unseen domains is a major limitation, particularly for safety-critical clinical tasks such as dermoscopic skin cancer classification. To translate CNN-based applications into the clinic, it is essential that they can adapt to domain shifts. Such shifts can arise from different image acquisition systems or varying lighting conditions. In dermoscopy, shifts can also arise from a change in patient age or the occurrence of rare lesion localizations (e.g. palms). These conditions are not prominently represented in most training datasets and can therefore lead to a decrease in performance. To verify the generalizability of classification models in real-world clinical settings, it is crucial to have access to data that mimics such domain shifts. To our knowledge, no dermoscopic image dataset exists in which such domain shifts are properly described and quantified. We therefore grouped publicly available images from the ISIC archive based on their metadata (e.g. acquisition location, lesion localization, patient age) to generate meaningful domains. To verify that these domains are in fact distinct, we used multiple quantification measures to estimate the presence and intensity of domain shifts. Additionally, we analyzed the performance on these domains with and without an unsupervised domain adaptation technique. We observed that domain shifts do in fact exist in most of our grouped domains. Based on our results, we believe these datasets to be helpful for testing the generalization capabilities of dermoscopic skin cancer classifiers.
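
The grouping-and-quantification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record fields (`site`, `age`, `brightness`), the domain key, and the choice of Jensen-Shannon divergence over a per-image brightness histogram are all assumptions made for illustration, since the abstract does not name the measures or features used.

```python
import math
from collections import defaultdict


def group_by_domain(records, key_fn):
    """Group image records into domains using a metadata-derived key."""
    domains = defaultdict(list)
    for rec in records:
        domains[key_fn(rec)].append(rec)
    return dict(domains)


def histogram(values, bins, lo, hi):
    """Normalized histogram of `values` over [lo, hi] with equal-width bins."""
    if not values:
        raise ValueError("cannot build a histogram from an empty domain")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, bounded in [0, 1]."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical metadata records; field names and values are illustrative only.
records = [
    {"site": "torso", "age": 45, "brightness": 0.62},
    {"site": "torso", "age": 52, "brightness": 0.58},
    {"site": "palms", "age": 47, "brightness": 0.31},
    {"site": "palms", "age": 60, "brightness": 0.27},
]

# Assumed domain key: lesion localization only.
domains = group_by_domain(records, key_fn=lambda r: r["site"])

# Compare the brightness distributions of two domains.
p = histogram([r["brightness"] for r in domains["torso"]], bins=4, lo=0.0, hi=1.0)
q = histogram([r["brightness"] for r in domains["palms"]], bins=4, lo=0.0, hi=1.0)
shift = js_divergence(p, q)
```

A value of `shift` near 0 suggests the two groups look alike under the chosen feature, while a value near 1 suggests a pronounced shift. In practice one would likely compare learned feature embeddings rather than a single scalar per image, and use more than one measure.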
