ChatGPT Health performance in a structured test of triage recommendations

by
Nature
February 23, 2026

Abstract

ChatGPT Health launched in January 2026 as OpenAI’s consumer health tool, reaching millions of users. Here, we conducted a structured stress test of triage recommendations using 60 clinician-authored vignettes across 21 clinical domains under 16 factorial conditions (960 total responses). Performance followed an inverted U-shaped pattern, with the most dangerous failures concentrated at clinical extremes: non-urgent presentations (35%) and emergency conditions (48%). Among gold-standard emergencies, the system under-triaged 52% of cases, directing patients with diabetic ketoacidosis and impending respiratory failure to 24–48-hour evaluation rather than the emergency department, while correctly triaging classical emergencies such as stroke and anaphylaxis. When family or friends minimized symptoms (anchoring bias), triage recommendations shifted significantly in edge cases (OR 11.7, 95% CI 3.7–36.6), with the majority of shifts toward less urgent care. Crisis intervention messages activated unpredictably across suicidal ideation presentations, firing more when patients described no specific method than when they did. Patient race, gender, and barriers to care showed no significant effects, though confidence intervals did not exclude clinically meaningful differences. Our findings reveal missed high-risk emergencies and inconsistent activation of crisis safeguards, raising safety concerns that warrant prospective validation before consumer-scale deployment of artificial intelligence triage systems.
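
As a reading aid for the anchoring-bias estimate in the abstract, the sketch below checks that the reported odds ratio and its 95% confidence interval are mutually consistent. It assumes the interval is a Wald-type interval that is symmetric on the log-odds scale; the abstract does not state how the interval was computed, so this is an illustrative assumption and not the authors' method. The function name `wald_summary` is hypothetical.

```python
import math

def wald_summary(lower, upper, z=1.959964):
    """Recover the point estimate and log-scale standard error implied by
    a confidence interval, ASSUMING it is symmetric on the log scale."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    point = math.exp((log_lower + log_upper) / 2)   # geometric mean of the bounds
    se = (log_upper - log_lower) / (2 * z)          # half-width divided by z
    return point, se

# Bounds as reported in the abstract: 95% CI 3.7 to 36.6
implied_or, implied_se = wald_summary(3.7, 36.6)
print(f"implied OR = {implied_or:.2f}, SE(log OR) = {implied_se:.3f}")
```

Under that assumption the implied point estimate lies close to the reported OR of 11.7, and the wide interval reflects a large standard error on the log-odds scale, as would be expected for an effect estimated in a small subset of edge cases.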

Author information

Author notes

These authors contributed equally: Eyal Klang, Girish N. Nadkarni.

Authors and Affiliations

The Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai and Mount Sinai Health System, New York, NY, USA
Ashwin Ramaswamy, Alvira Tyagi, Alexis E. Te, Steven A. Kaplan, Ashutosh K. Tewari & Michael A. Gorin
Department of Medicine, NYC Health + Hospitals / Elmhurst, Icahn School of Medicine at Mount Sinai and Mount Sinai Health System, New York, NY, USA
Hannah Hugo
The Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai and Mount Sinai Health System, New York, NY, USA
Joy Jiang, Pushkala Jayaraman, Joshua Lampert, Robert Freeman, Ankit Sakhuja, Bilal Naved, Alexander W. Charney, Mahmud Omar, Michael A. Gorin, Eyal Klang & Girish N. Nadkarni
The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai and Mount Sinai Health System, New York, NY, USA
Joy Jiang, Pushkala Jayaraman, Mateen Jangda, Ankit Sakhuja, Bilal Naved, Alexander W. Charney & Girish N. Nadkarni
University of Miami Miller School of Medicine, Miami, FL, USA
Mateen Jangda
Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai and Mount Sinai Health System, New York, NY, USA
Nicholas Gavin
The Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai and Mount Sinai Health System, New York, NY, USA
Ankit Sakhuja, Bilal Naved, Alexander W. Charney, Mahmud Omar, Eyal Klang & Girish N. Nadkarni

Authors

Ashwin Ramaswamy
Alvira Tyagi
Hannah Hugo
Joy Jiang
Pushkala Jayaraman
Mateen Jangda
Alexis E. Te
Steven A. Kaplan
Joshua Lampert
Robert Freeman
Nicholas Gavin
Ashutosh K. Tewari
Ankit Sakhuja
Bilal Naved
Alexander W. Charney
Mahmud Omar
Michael A. Gorin
Eyal Klang
Girish N. Nadkarni

Corresponding authors

Correspondence to Ashwin Ramaswamy or Girish N. Nadkarni.

About this article

Cite this article

Ramaswamy, A., Tyagi, A., Hugo, H. et al. ChatGPT Health performance in a structured test of triage recommendations. Nat Med (2026). https://doi.org/10.1038/s41591-026-04297-7

Received: 15 January 2026
Accepted: 20 February 2026
Published: 23 February 2026
DOI: https://doi.org/10.1038/s41591-026-04297-7