
The Fragility of Fully Autonomous Decision-Making in Healthcare

  • Writer: jvourganas
  • May 18
  • 6 min read

Updated: Jun 6



AI systems in healthcare promise to augment diagnostic precision, streamline treatment planning, and reduce clinician burden. However, reliance on fully autonomous AI without embedded human oversight mechanisms introduces both clinical risks and governance challenges. The following cases illustrate the consequences of omitting structured human-in-the-loop (HITL) models, while also demonstrating the trade-offs between automation and accountability.


Case 1: IBM Watson for Oncology


IBM’s Watson for Oncology was designed to assist clinicians by offering cancer treatment recommendations based on natural language processing of medical literature and guidelines. Initially deployed in over 230 hospitals across 13 countries, the system’s ambition was immense: to democratize expert-level oncology insights.


Pros:

  • Knowledge aggregation: Watson could ingest and synthesize vast volumes of oncological data, guidelines, and research.

  • Speed: Provided instantaneous treatment suggestions for complex cases.

  • Scalability: Potential to reduce disparities in clinical expertise across regions.


Cons and Reasons for Failure:

  • Bias from synthetic training data: The system was reportedly trained not on real patient data but on hypothetical scenarios created by a small group of Memorial Sloan Kettering (MSK) oncologists. This introduced confirmation bias and limited generalizability to diverse patient profiles [1].

  • Opaque reasoning: The model provided little interpretability regarding why specific treatments were recommended, violating the principle of clinical accountability.

  • Insufficient oversight: There was no institutional mechanism requiring clinician review or override before implementing recommendations.

  • Unsafe outputs: Internal documents revealed that Watson had, at times, suggested treatments that were “unsafe and incorrect” [2].


Academic Perspective:

In [3], the author argues that clinical AI systems lacking transparent justification mechanisms create a "responsibility gap" in which neither the system nor the practitioner is clearly accountable. Watson exemplifies this, demonstrating how algorithmic opacity and designer overconfidence can yield ethically fraught outcomes.



Case 2: Sepsis Prediction Algorithms


AI models such as the Epic Sepsis Model (ESM) and TREWS (Targeted Real-time Early Warning System) have been widely adopted in U.S. hospitals to detect early signs of sepsis, a condition in which delayed intervention can be fatal.


Pros:

  • Early detection potential: Timely alerts could theoretically enable rapid clinical response.

  • Operational integration: Embedded in EHR systems, enabling real-time monitoring.

  • Scalability: Automated detection may support overburdened clinical teams.


Cons and Reasons for Failure:

  • Low precision and high false positive rates: A 2021 peer-reviewed study found that the Epic model identified only 7% of sepsis cases, with an overwhelming number of false positives [4].

  • Limited clinician trust: Due to excessive alerts, many providers experienced “alert fatigue,” leading to widespread disregard of AI warnings.

  • Black box limitations: Clinicians could not inspect or challenge model logic due to proprietary constraints, undermining clinical judgment.

  • Absence of escalation protocols: AI outputs were often routed directly into workflows without any mandated human validation or triage.


Academic Perspective:

In [5], the authors argue that “algorithmic stewardship” is essential in clinical AI: models must be monitored, validated, and audited with clinician feedback as a systemic requirement.

The ESM failure highlights what happens when predictive power is prioritized over interpretability and contextual relevance.


Synthesis: Implications for Oversight and Governance


These failures are not merely technical missteps; they are symptomatic of deeper governance deficits. In both cases, institutional deployment proceeded without adequate human override, transparency protocols, or accountability structures. They expose a recurrent pattern: the substitution of human judgment with unverified automation in high-risk domains is not only imprudent but increasingly indefensible.

These examples offer a compelling rationale for the codification of human oversight in standards such as ISO/IEC 42001, and align with broader academic calls for sociotechnical system thinking in AI deployment [6].





Academic References:


[1] Ross, C., & Swetlitz, I. (2017). IBM’s Watson supercomputer recommended “unsafe and incorrect” cancer treatments, internal documents show. STAT News.

[2] Tanne, J. H. (2018). IBM’s Watson recommended unsafe cancer treatments, documents show. BMJ, 362, k3430.

[3] London, A. J. (2019). Artificial intelligence and black-box medical decisions: Accuracy versus explainability. Hastings Center Report, 49(1), 15–21.

[4] Singh, K., Valley, T. S., Tang, S., et al. (2021). Performance of a sepsis prediction algorithm on patients with sepsis associated with COVID-19. Journal of Critical Care, 63, 34–38.

[5] Sendak, M., D’Arcy, J., Kashyap, S., Gao, M., Nichols, M., Corey, K., Ratliff, W., & Balu, S. (2020). A path for translation of machine learning products into healthcare delivery. NPJ Digital Medicine, 3(1), 1–10.

[6] Amann, J., Blasimme, A., Vayena, E., Frey, D., & Madai, V. I. (2020). Explainability for artificial intelligence in healthcare: A multidisciplinary perspective. BMC Medical Informatics and Decision Making, 20(1), 310.



Example


Disclaimer on Data and Functionality

This demonstration represents a simplified version of the full application. The machine learning algorithms integrated herein were trained exclusively on synthetic data in compliance with GDPR and other applicable regulatory and ethical standards. No real or personally identifiable datasets were used in the development or testing of this version.

Please note that not all features present in the complete application are included in this demonstration. The data utilized for this prototype were artificially generated and sourced from the publicly available mock dataset provided by Airbnb’s Visx project, accessible at: https://github.com/airbnb/visx/blob/master/packages/visx-mock-data/src/generators/genDateValue
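As a rough illustration of how such synthetic series can be produced, the sketch below generates date/value pairs similar in shape to the Visx mock-data output. The function name, ranges, and sampling strategy are illustrative assumptions for this write-up, not the project's actual generator.

```typescript
// Hypothetical sketch only: synthetic {date, value} pairs, no real patient data.
interface DateValue {
  date: Date;
  value: number;
}

function genSyntheticVitals(days: number, min = 60, max = 120): DateValue[] {
  const today = new Date();
  return Array.from({ length: days }, (_, i) => ({
    // one sample per day, counting back from today
    date: new Date(today.getTime() - i * 24 * 60 * 60 * 1000),
    // uniformly random value in [min, max); purely synthetic
    value: min + Math.random() * (max - min),
  }));
}

// Example: 30 days of synthetic heart-rate-like values
const series = genSyntheticVitals(30);
console.log(series[0]);
```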


Sepsis Sentinel: Early Detection Saves Lives


Sepsis Sentinel is an AI-powered clinical decision support system designed to help healthcare providers detect sepsis early, when intervention is most effective. By continuously monitoring patient vital signs, laboratory results, and clinical data, our system calculates real-time risk scores to identify patients showing early signs of sepsis before traditional detection methods.
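For intuition only, here is a minimal sketch of a rule-based risk score, assuming illustrative vital-sign thresholds and weights; the production model, its features, and its cut-offs are not described in this post, so nothing below should be read as the actual implementation.

```typescript
// Minimal, assumed rule-based scoring sketch; thresholds and weights are illustrative.
interface VitalSnapshot {
  heartRate: number;       // beats per minute
  respiratoryRate: number; // breaths per minute
  temperatureC: number;    // degrees Celsius
  lactate: number;         // mmol/L, from laboratory results
}

type RiskLevel = 'Low' | 'Medium' | 'High';

function riskScore(v: VitalSnapshot): { score: number; level: RiskLevel } {
  let score = 0;
  if (v.heartRate > 90) score += 1;                             // tachycardia
  if (v.respiratoryRate > 20) score += 1;                       // tachypnoea
  if (v.temperatureC > 38 || v.temperatureC < 36) score += 1;   // fever or hypothermia
  if (v.lactate > 2) score += 2;                                // elevated lactate weighted higher
  const level: RiskLevel = score >= 4 ? 'High' : score >= 2 ? 'Medium' : 'Low';
  return { score, level };
}

// Example: a high-risk result is surfaced for clinician review, never acted on automatically
console.log(riskScore({ heartRate: 104, respiratoryRate: 24, temperatureC: 38.6, lactate: 2.8 }));
```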

With a user-friendly interface that highlights high-risk patients, facilitates clinical review, and maintains comprehensive audit trails, Sepsis Sentinel integrates seamlessly into your existing workflows.

Our goal is simple: help clinicians save more lives by identifying sepsis earlier, reducing mortality rates, length of hospital stays, and overall healthcare costs.



How to use the app: Navigation

The application features a navigation menu that provides access to all main sections:

  • Dashboard: Overview of patient risk levels and pending reviews

  • Patients: Complete list of monitored patients with filtering options

  • Reviews: Cases flagged by the AI that require clinical review

  • Technical: Audit logs and technical performance metrics

  • Customer View: Information for hospital administrators


Dashboard

The dashboard provides an at-a-glance view of your patient population; a minimal aggregation sketch follows the highest-risk list below:


Key Metrics:
  • Total number of monitored patients

  • High-risk cases requiring attention

  • Pending reviews awaiting assessment

  • Average response time for high-risk cases


Highest Risk Patients:

This section displays the top 3 patients with the highest risk scores, allowing for quick identification of the most critical cases.
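A minimal sketch of these aggregations, assuming a simple in-memory patient list with hypothetical field names, could look like this:

```typescript
// Illustrative sketch only; field names are assumptions for this write-up.
interface MonitoredPatient {
  id: string;
  name: string;
  riskScore: number;                      // 0-100 in this sketch
  riskLevel: 'Low' | 'Medium' | 'High';
  pendingReview: boolean;
}

function dashboardSummary(patients: MonitoredPatient[]) {
  // Top 3 patients by risk score, highest first, for the "Highest Risk Patients" card
  const topThree = [...patients].sort((a, b) => b.riskScore - a.riskScore).slice(0, 3);
  return {
    totalMonitored: patients.length,
    highRiskCount: patients.filter(p => p.riskLevel === 'High').length,
    pendingReviews: patients.filter(p => p.pendingReview).length,
    highestRisk: topThree,
  };
}
```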


Pending Reviews
Shows patients who have been flagged by the AI system for review by a clinician.

Actions
  • Click on any patient card to view detailed information

  • Use the "Review Case" button to directly access the review workflow for flagged patients


Patient Management

The Patients section allows you to view and manage all patients being monitored by the system.


Patient List Features

  1. Search: Filter patients by name or ID

  2. Risk Filtering: Filter patients by risk level (All, High, Medium, Low); see the filtering sketch after this list

  3. Patient Cards: Visual representation of each patient with key information:

    • Name and demographic information
    • Current location in the hospital
    • Admission date
    • Risk score and level indicator
    • Quick actions
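A possible shape for that filtering logic, assuming case-insensitive matching on name or ID (the demo's actual matching rules are not documented here), is sketched below; the patient shape mirrors the dashboard sketch above.

```typescript
// Assumed patient shape and filtering logic, for illustration only.
interface MonitoredPatient {
  id: string;
  name: string;
  riskLevel: 'Low' | 'Medium' | 'High';
}

type RiskFilter = 'All' | 'High' | 'Medium' | 'Low';

function filterPatients(patients: MonitoredPatient[], searchTerm: string, risk: RiskFilter): MonitoredPatient[] {
  const term = searchTerm.trim().toLowerCase();
  return patients.filter(p =>
    // match on name or ID when a search term is present
    (term === '' || p.name.toLowerCase().includes(term) || p.id.toLowerCase().includes(term)) &&
    // apply the risk-level filter unless "All" is selected
    (risk === 'All' || p.riskLevel === risk),
  );
}
```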


Detailed Patient View

Clicking on a patient card opens a detailed view with:

  • Complete demographic information

  • Current risk assessment

  • Vital signs

  • Laboratory results

  • Option to initiate review for high-risk patients


Case Reviews

The Reviews section lists all cases that have been flagged by the AI for clinical review; a minimal sketch of a review submission follows the form components below.


Review Process
  1. Select a case from the pending reviews list

  2. Review the patient's clinical information

  3. Analyze the AI risk assessment

  4. Add clinical notes and observations

  5. Select appropriate intervention options

  6. Submit your review with clinical judgment

  7. The system will update the patient's status based on your input


Review Form Components
  • Clinical assessment section

  • Notes field for documentation

  • Intervention selection options

  • Sepsis protocol implementation checkbox

  • Submission button
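To make the workflow concrete, here is a hedged sketch of what a submitted review record might contain; the field names and the update behavior are assumptions based on the steps and form components above, not the demo's actual schema.

```typescript
// Assumed review record and submission handler; illustrative only.
interface CaseReview {
  patientId: string;
  reviewerId: string;                                   // the clinician submitting the review
  clinicalAssessment: 'confirmed' | 'ruled_out' | 'monitor';
  notes: string;                                        // free-text documentation
  interventions: string[];                              // selected intervention options
  sepsisProtocolInitiated: boolean;
  submittedAt: Date;
}

function submitReview(review: CaseReview): { status: 'updated'; auditId: string } {
  // In the described flow, the patient's status is updated from the clinician's
  // judgment and an audit entry is recorded so the decision stays traceable.
  const auditId = `${review.patientId}-${review.submittedAt.toISOString()}`;
  return { status: 'updated', auditId };
}
```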


Technical Audit

The Technical section provides an audit trail of all system activities; a minimal filtering and export sketch follows the lists below:


Available Information
  • AI prediction logs

  • Clinical decision timestamps

  • User interactions with high-risk patients

  • System performance metrics

  • Data quality indicators


Audit Features
  • Searchable log entries

  • Filterable by date, user, and action type

  • Exportable reports for compliance documentation
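A minimal sketch of that filtering and export, assuming a flat in-memory log with hypothetical fields, might look like this; a real deployment would persist and protect these entries rather than hold them in memory.

```typescript
// Assumed audit entry shape, filter, and CSV export; illustrative only.
interface AuditEntry {
  timestamp: Date;
  user: string;
  action: 'prediction' | 'review_submitted' | 'patient_viewed' | 'export';
  patientId?: string;
  details: string;
}

function filterAudit(
  log: AuditEntry[],
  opts: { from?: Date; to?: Date; user?: string; action?: AuditEntry['action'] },
): AuditEntry[] {
  return log.filter(e =>
    (!opts.from || e.timestamp >= opts.from) &&
    (!opts.to || e.timestamp <= opts.to) &&
    (!opts.user || e.user === opts.user) &&
    (!opts.action || e.action === opts.action),
  );
}

// Export filtered entries as CSV lines for compliance documentation
function exportCsv(entries: AuditEntry[]): string {
  const header = 'timestamp,user,action,patientId,details';
  const rows = entries.map(e =>
    [e.timestamp.toISOString(), e.user, e.action, e.patientId ?? '', e.details].join(','),
  );
  return [header, ...rows].join('\n');
}
```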


Customer View

The Customer View section provides information about Sepsis AI for hospital administrators:


Content
  • System performance metrics

  • Implementation documentation

  • Training materials

  • FAQs for administrative staff

  • Contact information for technical support





 

                          


Mandatory & Recommended Standards / Regulatory Frameworks


Regulatory & Compliance Standards


  • ISO/IEC 42001: Artificial Intelligence Management Systems (AIMS)
  • EU AI Act (for use in Europe)
  • U.S. FDA Guidance for Clinical Decision Support Software (CDSS)


Healthcare-Specific Standards

  • ISO 13485: Quality Management for Medical Devices
  • ISO 14971: Risk Management in Medical Devices
  • HL7 FHIR (Fast Healthcare Interoperability Resources) Standard


Data Privacy & Security Standards

  • HIPAA (U.S.) and GDPR (EU): patient data use must comply with both
  • ISO/IEC 27001: Information Security Management


AI Ethics & Explainability

  • IEEE 7000 & 7001 Series
  • Guidance from WHO & OECD on Trustworthy AI in Health





Requirement | Relevant Standard/Regulation | How It Applies
Risk classification & lifecycle governance | ISO/IEC 42001, EU AI Act | Document AI risks, human control, auditability
Clinical safety & validation | ISO 13485, ISO 14971, FDA CDS Guidance | Model performance evaluation, clinical trials
Data security & privacy | HIPAA, GDPR, ISO 27001 | Access control, encryption, logging
Human oversight | EU AI Act Art. 14, ISO/IEC 42001 | Review workflows, override buttons, traceable decisions
Interoperability | HL7 FHIR | Integrate with EHRs, reduce manual errors
Transparency & explainability | IEEE 7001, EU AI Act Art. 13 | Explain risk scores and recommendations to clinicians
Bias monitoring | WHO & OECD AI Principles | Fairness analysis, post-deployment monitoring








 
 
 
