DC3 -- A Diagnostic Case Challenge Collection for Clinical Decision Support

Published 22 Aug 2019 in cs.IR and cs.AI | (1908.08581v1)

Abstract: In clinical care, obtaining a correct diagnosis is the first step towards successful treatment and, ultimately, recovery. Depending on the complexity of the case, the diagnostic phase can be lengthy and ridden with errors and delays. Such errors have a high likelihood to cause patients severe harm or even lead to their death and are estimated to cost the U.S. healthcare system several hundred billion dollars each year. To avoid diagnostic errors, physicians increasingly rely on diagnostic decision support systems drawing from heuristics, historic cases, textbooks, clinical guidelines and scholarly biomedical literature. The evaluation of such systems, however, is often conducted in an ad-hoc fashion, using non-transparent methodology, and proprietary data. This paper presents DC3, a collection of 31 extremely difficult diagnostic case challenges, manually compiled and solved by clinical experts. For each case, we present a number of temporally ordered physician-generated observations alongside the eventually confirmed true diagnosis. We additionally provide inferred dense relevance judgments for these cases among the PubMed collection of 27 million scholarly biomedical articles.




Summary

The paper introduces DC3, a dataset of 31 intricate diagnostic cases designed to support transparent evaluation of clinical decision support systems and, through them, to help reduce diagnostic errors.
It employs a structured temporal case format with UMLS identifiers to trace evolving clinical observations and bolster literature relevance judgments.
Experimental tasks in document retrieval and classification demonstrate the dataset's potential to advance CDSS performance.

DC $^3$ -- A Diagnostic Case Challenge Collection for Clinical Decision Support

This essay provides an overview of the paper "DC $^3$ -- A Diagnostic Case Challenge Collection for Clinical Decision Support," emphasizing the dataset's creation and its implications for clinical decision support systems (CDSS). DC $^3$ aims to address current limitations in diagnosing complex medical cases by offering a standardized benchmark for evaluating CDSS accuracy and effectiveness.

Introduction

The paper introduces DC $^3$, addressing the critical need for improved diagnostic accuracy in clinical settings. Diagnostic errors present significant risks, leading to severe patient harm and substantial economic costs. Many errors arise when the cognitive diagnostic process favors common conditions and overlooks less prevalent ones. The collection is designed to improve physician support during challenging diagnoses by consolidating rare and complex case episodes into an easily accessible format.

Figure 1: An example of the DC $^3$ case file format.

Dataset Characteristics

DC $^3$ comprises 31 diagnostic cases from Massachusetts General Hospital, dating from 2013 to 2018, that illustrate complex and rare conditions often misdiagnosed because of their low prevalence in developed-world hospitals. Each case follows a structured presentation in which observations appear in the temporal order of the clinical notes, and each is annotated with several Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs) representing the diagnoses.

Temporal Case Structure

The dataset's structured narrative format enables detailed temporal tracking of diagnostic processes, shedding light on the evolution of physician-generated observations and confirmed diagnoses. This format supports the inference of dense relevance judgments across publications, enhancing CDSS implementation in diverse clinical settings.
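As a minimal sketch of how such a temporally ordered case could be held in memory, the Python below models a case as a list of observations plus diagnosis CUIs and builds a query from everything known up to a given step. The class names, field names, and the toy case are assumptions made for illustration; they are not the DC $^3$ file format shown in Figure 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    step: int  # position in the temporal order (hypothetical field name)
    text: str  # physician-generated observation text


@dataclass
class DiagnosticCase:
    case_id: str
    observations: List[Observation] = field(default_factory=list)
    diagnosis_cuis: List[str] = field(default_factory=list)  # UMLS CUIs

    def query_up_to(self, step: int) -> str:
        """Join all observations recorded at or before `step`, in temporal order."""
        ordered = sorted(self.observations, key=lambda o: o.step)
        return " ".join(o.text for o in ordered if o.step <= step)


# Invented toy case for illustration only; it is not taken from DC3.
case = DiagnosticCase(
    case_id="toy-001",
    observations=[
        Observation(2, "Chest radiograph shows bilateral infiltrates."),
        Observation(1, "Patient presents with fever and cough."),
    ],
    diagnosis_cuis=["C0000000"],  # placeholder string, not a real diagnosis CUI
)

print(case.query_up_to(1))
print(case.query_up_to(2))
```

Building the query incrementally in this way mirrors the idea that a decision support system sees a case as it unfolds rather than all at once.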

Comparison to Existing Resources

DC $^3$ extends beyond existing collections like TREC CDS, CLEF eHealth, MIMIC-III, and i2b2 by providing publicly accessible data that both covers the complexities of multi-morbidity cases and offers relevance judgments linked to biomedical literature. These features support CDSS evaluation efforts, enable reproducible research, and foster advances in medical AI applications.

Experimental Tasks and Baseline Performance

Two major experimental tasks using DC $^3$ are outlined: patient-centric document retrieval and classification. These tasks aim to measure the effectiveness of CDSS models in retrieving relevant biomedical literature and diagnosing based on structured case descriptions.

Patient-centric Document Retrieval

The authors index PubMed abstracts with Lucene and use the case descriptions as retrieval queries. Standard retrieval models are compared by their nDCG scores, and the results show how difficult it is to extract pertinent literature for these cases.
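For readers unfamiliar with the metric, the following is a minimal Python sketch of nDCG using the common formulation $\mathrm{DCG} = \sum_i rel_i / \log_2(i + 1)$ with ranks starting at 1. The cutoff `k`, the optional `pool` argument, and the graded relevance values are assumptions for illustration; the exact nDCG variant and cutoff used in the paper are not stated in this overview.

```python
import math
from typing import Optional, Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount, ranks from 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(ranked: Sequence[float],
         pool: Optional[Sequence[float]] = None,
         k: int = 10) -> float:
    """nDCG@k for one query.

    `ranked` holds the relevance grades of the returned documents in rank order.
    `pool` holds the grades of all judged documents for the query; if omitted,
    the ideal ranking is built from `ranked` itself (a simplification).
    """
    source = ranked if pool is None else pool
    ideal_dcg = dcg(sorted(source, reverse=True)[:k])
    if ideal_dcg == 0:
        return 0.0  # no relevant documents: define the score as zero
    return dcg(list(ranked)[:k]) / ideal_dcg


print(ndcg([3, 2, 1]))  # already in ideal order
print(ndcg([1, 3]))     # best document ranked second
```

Passing the full judged pool matters when relevant documents are missing from the result list, since otherwise the ideal ranking is too lenient.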

Supervised Classification

A second approach frames diagnostic decision support as a classification task, testing Naïve Bayes, Logistic Regression, and SVM models on PubMed abstracts that mention diagnosis-related terms. The reported performance, primarily $F_1$ scores, indicates how hard the task is and how much room remains for methods that support unconstrained primary care diagnostics.
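To make the metric concrete, here is a small Python sketch that computes per-class $F_1 = 2TP / (2TP + FP + FN)$ and its unweighted (macro) average from gold and predicted labels. The function names and toy labels are illustrative assumptions, and macro averaging is chosen only as an example; this overview does not say which averaging scheme the paper reports.

```python
from typing import Dict, Hashable, Sequence


def f1_per_class(gold: Sequence[Hashable],
                 predicted: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return the F1 score of every label seen in gold or predicted."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    scores: Dict[Hashable, float] = {}
    for label in set(gold) | set(predicted):
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: Sequence[Hashable], predicted: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(gold, predicted)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy labels for illustration only; not results from the paper.
gold = ["a", "a", "b", "b"]
pred = ["a", "b", "b", "b"]
print(f1_per_class(gold, pred))
print(macro_f1(gold, pred))
```

Macro averaging weights every diagnosis equally, which can be a reasonable choice when rare conditions matter as much as common ones.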

Conclusion

DC $^3$ is a resource for evaluating and advancing CDSS: a standardized collection of complex diagnostic cases coupled with literature relevance judgments. Anticipated future improvements include manual expert annotations and a broader set of diagnostic and retrieval tasks, which would extend the dataset's utility across the diagnostic process. As CDSS technology evolves, DC $^3$ could serve as a useful benchmark for improving the precision of clinical diagnostics.

In summary, DC $^3$ has the potential to ease diagnostic challenges by supporting a better understanding of rare medical conditions and of their practical implications for physician decision-making.