Designing AI-Based Systems to Support Clinicians' Diagnostic Reasoning: A Conceptual Framework for Complementarity

Core Concepts
This article proposes a conceptual framework for designing AI-based systems that support clinicians' diagnostic reasoning through feature importance, counterexample explanations, and similar-case explanations, rather than by recommending a diagnosis or explaining a recommendation.
The article presents a conceptual framework and a case study for designing AI-based systems to support clinicians' diagnostic reasoning in healthcare. The key points are:
The authors argue that the current paradigm of using Explainable AI (XAI) to build trust in AI systems is limited, as explanations can lead to over-reliance or under-reliance on AI recommendations. Instead, they propose a human-centered approach that focuses on supporting the clinician's abductive reasoning process.
The proposed conceptual framework is based on three pillars: feature importance, counterexample explanations, and similar-case explanations. These techniques aim to provide clinicians with insights into the AI model's decision-making without directly recommending a diagnosis.
The authors conducted a case study involving clinicians, technicians, and researchers in a participatory design approach. They developed a high-fidelity prototype based on a real dataset of thyroid disease diagnosis.
The co-design process with clinicians revealed that they appreciated the approach of providing counterexamples and similar-case explanations rather than direct recommendations, as it aligned with their diagnostic reasoning strategy.
The prototype allows clinicians to input a patient case, select their hypothesis, and view feature importance, counterexamples, and similar-case explanations to support their decision-making process.
The authors conclude that this approach can contribute to the current discourse on designing AI systems to support clinicians' decision-making processes and promote the virtuous adoption of AI in healthcare.
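The prototype interaction described above (enter a patient case, select a hypothesis, view similar cases and counterexamples) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not say how cases are retrieved, so nearest-neighbour search with a standardized Euclidean distance is an assumption, and the feature names (`age`, `TSH`, `T3`), the toy records, and the `retrieve` function are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Features = Dict[str, float]
Case = Tuple[Features, str]  # (feature values, recorded diagnosis)


def feature_scales(cases: List[Case], features: List[str]) -> Dict[str, float]:
    """Per-feature standard deviation, so no single unit dominates the distance."""
    scales = {}
    for f in features:
        values = [case[0][f] for case in cases]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        scales[f] = math.sqrt(variance) or 1.0  # fall back to 1.0 for constant features
    return scales


def distance(a: Features, b: Features, scales: Dict[str, float]) -> float:
    """Euclidean distance on standardized features (an assumed similarity measure)."""
    return math.sqrt(sum(((a[f] - b[f]) / scales[f]) ** 2 for f in scales))


def retrieve(patient: Features, hypothesis: str, cases: List[Case], k: int = 2):
    """Return the k nearest cases that match the clinician's hypothesis (similar
    cases) and the k nearest cases with a different diagnosis (counterexamples)."""
    scales = feature_scales(cases, list(patient))
    ranked = sorted(cases, key=lambda case: distance(patient, case[0], scales))
    similar = [c for c in ranked if c[1] == hypothesis][:k]
    counterexamples = [c for c in ranked if c[1] != hypothesis][:k]
    return similar, counterexamples


# Toy records invented for illustration; not taken from the study's dataset.
CASES: List[Case] = [
    ({"age": 45, "TSH": 0.05, "T3": 4.1}, "hyperthyroid"),
    ({"age": 52, "TSH": 0.10, "T3": 3.8}, "hyperthyroid"),
    ({"age": 47, "TSH": 1.80, "T3": 2.0}, "negative"),
    ({"age": 60, "TSH": 2.20, "T3": 1.9}, "negative"),
    ({"age": 38, "TSH": 9.50, "T3": 0.9}, "hypothyroid"),
]

patient = {"age": 48, "TSH": 0.08, "T3": 4.0}
similar, counterexamples = retrieve(patient, "hyperthyroid", CASES, k=1)
print("Similar case:", similar[0])
print("Counterexample:", counterexamples[0])
```

Note that the sketch returns evidence for and against the clinician's own hypothesis and never outputs a predicted diagnosis, which is the design stance the framework argues for.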
The dataset used in the case study contains 7,142 records: 89.4% negative cases, 8.15% hyperthyroid cases, and 2.45% hypothyroid cases. The input features include age, sex, thyroid-related medical history, and various thyroid hormone levels. The XGB (XGBoost) model trained on this dataset achieved an average accuracy of 0.99 across 10-fold cross-validation.
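To put the reported accuracy in context, the class shares above can be turned into approximate counts. The short calculation below is only arithmetic on the figures quoted in this summary, and the variable names are illustrative. It also shows that always predicting "negative" would already score about 0.894 accuracy; that is a general property of imbalanced data, not a claim made in the article.

```python
TOTAL_RECORDS = 7142
CLASS_SHARES = {"negative": 0.894, "hyperthyroid": 0.0815, "hypothyroid": 0.0245}

# Approximate number of records per class, rounded to whole cases.
counts = {label: round(TOTAL_RECORDS * share) for label, share in CLASS_SHARES.items()}

# Average number of cases of each class per fold in 10-fold cross-validation.
per_fold = {label: n / 10 for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the most common class.
majority_baseline = max(CLASS_SHARES.values())

print(counts)
print(per_fold)
print(majority_baseline)
```

With only about 175 hypothyroid records, each fold holds roughly 17 or 18 of them on average, so per-class metrics would say more about the minority classes than overall accuracy does.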
"providing AI with explainability [...] is more akin to painting the black box of inscrutable algorithms [...] white, rather than making them transparent."
"what is needed for enabling trust and adoption is not so much explainability, intended as the capability of providing explanations for the behaviour or output of an AI system, but rather accuracy and reliability."
"the AI decision support, we claim, should merely support the specific diagnostic reasoning of the clinician at stake in the setting under consideration, and thus, in principle, neither be focused on recommending decisions nor on providing explanations for them."

Key Insights Distilled From

by Elisa Rubegn... at 04-09-2024
Designing for Complementarity

Deeper Inquiries

How can the proposed framework be extended to support clinicians' decision-making in other healthcare domains beyond thyroid disease diagnosis?

The framework can be extended to other healthcare domains by adapting its three pillars to each domain's data and diagnostic requirements. In radiology, for instance, feature importance could focus on imaging parameters, while in cardiology it could emphasize specific biomarkers or ECG readings. Counterexample and similar-case explanations can likewise be tailored to the diagnostic challenges and data types of each domain. Customized in this way, the framework can give clinicians in different specialties more relevant and actionable insights for their decision-making.

What are the potential limitations or drawbacks of the counterexample and similar-case explanation approaches, and how can they be addressed?

One potential limitation of counterexample explanations is that they may present unrealistic or impractical scenarios that do not match clinical reality. Similar-case explanations, in turn, may not cover a sufficiently diverse range of relevant examples, leading to biased or incomplete insights. To address these limitations, the algorithms that generate counterexamples and similar cases should be tuned so that their outputs are clinically plausible and representative of the diverse scenarios clinicians encounter. Feedback mechanisms that let clinicians validate the relevance and accuracy of these explanations can further improve their effectiveness.
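One way to picture these two mitigations is a filter that drops candidate counterexamples whose values fall outside plausible ranges or that a clinician has already flagged as unhelpful. This is a hypothetical sketch, not something described in the article: the function names, the record layout, and especially the numeric bounds are placeholders, not clinical reference values.

```python
from typing import Any, Dict, List, Set, Tuple

# Placeholder bounds for illustration only; NOT clinical reference values.
PLAUSIBLE_RANGES: Dict[str, Tuple[float, float]] = {
    "age": (0, 120),
    "TSH": (0.0, 100.0),
}


def is_plausible(features: Dict[str, float],
                 ranges: Dict[str, Tuple[float, float]]) -> bool:
    """True if every bounded feature present in the case lies within its range."""
    return all(low <= features[name] <= high
               for name, (low, high) in ranges.items()
               if name in features)


def filter_counterexamples(candidates: List[Dict[str, Any]],
                           ranges: Dict[str, Tuple[float, float]],
                           rejected_ids: Set[int]) -> List[Dict[str, Any]]:
    """Keep candidates that are in range and were not rejected by a clinician."""
    return [c for c in candidates
            if c["id"] not in rejected_ids and is_plausible(c["features"], ranges)]


# Invented candidates; ids 2 and 3 violate the placeholder bounds.
candidates = [
    {"id": 1, "features": {"age": 47, "TSH": 1.8}},
    {"id": 2, "features": {"age": 47, "TSH": -3.0}},
    {"id": 3, "features": {"age": 150, "TSH": 2.0}},
    {"id": 4, "features": {"age": 60, "TSH": 2.2}},
]
rejected_by_clinicians = {4}  # hypothetical feedback collected through the interface

kept = filter_counterexamples(candidates, PLAUSIBLE_RANGES, rejected_by_clinicians)
print([c["id"] for c in kept])
```

In practice the ranges and the feedback would have to come from clinicians themselves, which fits the participatory design approach the case study describes.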

How can the integration of the proposed framework with other AI-based decision support tools, such as those providing recommendations, be explored to achieve an optimal balance between complementarity and automation in clinical decision-making?

The framework could be integrated with recommendation-based decision support tools by combining the strengths of each approach. Presenting feature importance, counterexample explanations, and similar-case explanations alongside a recommendation would give clinicians a view of the data, the model's prediction, and alternative scenarios, so that decisions draw on both AI-generated insights and clinical expertise. To balance complementarity and automation, the system should be designed so that clinicians retain control over the decision while using AI support to improve accuracy and efficiency. Regular feedback loops and continuous validation of the AI recommendations can help maintain this balance and keep clinicians central to the decision-making process.