Abridge Says GPT-5.5 Improves Clinical Synthesis as Tool Complexity Rises
Abridge’s Chaitanya Asawa says GPT-5.5 improved the company’s clinical decision-support system as it added more tools and context, a signal that the model could better synthesize information under complexity. His case is that stronger reasoning and tool use can turn patient context, live clinical conversation, and trusted medical guidance into denser point-of-care support, while leaving clinicians to review answers and accept or reject proposed note edits.

GPT-5.5 improved as Abridge added tools
Chaitanya Asawa says Abridge saw an upward trend on its evaluation set when it used GPT-5.5 and increased the number of tools available to the model. For Abridge, that was the key signal: the model appeared better able to synthesize information as the environment around it became more complex.
“When we used GPT-5.5, as we increased the number of tools we actually saw an upward trend on our evaluation set. And to us that was a clear sign that this model is better able to synthesize information.”
Asawa, Abridge’s head of engineering for clinical decision support, describes the product problem as synthesis at the point of care: bringing together medical information, patient context, and the live conversation between a patient and doctor while a clinician is making decisions.
The improvement he attributes to GPT-5.5 is “higher, better clinical reasoning and tool use.” He frames the goal as bringing “the most information density right at the point of care,” after which clinicians “can make the ultimate decision.” The point is not simply that GPT-5.5 can generate more output; it is that Abridge believes the model can turn larger amounts of clinical context into denser support.
For this workflow, additional tools are useful only if the model can sort relevance, use the right source at the right time, and present the result in a form that can be inspected quickly. Asawa’s evaluation result is that GPT-5.5 moved in the right direction as tool complexity increased, rather than degrading under the added load.
The workflow combines patient context with UpToDate guidance
Abridge’s interface frames the workflow as “context-aware clinical insights, right where care happens.” The screen shown says Abridge synthesizes the patient conversation, the clinical note, and trusted UpToDate guidance from Wolters Kluwer so a clinician can review treatment information, ask follow-up questions, and edit documentation from the same context.
The example centers on an outpatient COPD exacerbation without pneumonia. The interface suggests questions specific to the patient situation: whether prednisone 40 mg for five days is optimal compared with other regimens under GOLD guidance; whether symptoms warrant antibiotics and how Anthonisen criteria apply given dry cough, wheeze, and no chest X-ray consolidation; and what follow-up or monitoring is recommended after outpatient steroid treatment, including red flags and spirometry timing.
The generated answer is qualified rather than absolute. It says that after an outpatient COPD exacerbation treated with a short prednisone course, “one approach” is early clinical reassessment plus clear return precautions. It describes pulse oximetry as something used pragmatically and says spirometry is deferred until recovery. It suggests follow-up contact or a visit within about one week to confirm symptom trajectory and inhaler use or technique, and to make sure there is no evolving alternative diagnosis.
The interface also exposes sourcing. The displayed answer cites “COPD exacerbations: Management” from UpToDate and includes a “View Sources” option. The relevant behavior is the combination: the system uses patient context to generate clinical questions, draws from trusted guidance, and produces a response that can be reviewed before it affects care or documentation.
Decision support feeds back into the note with accept and reject controls
The note-editing screen shows the COPD example moving from question-answering into documentation. The patient is described as having COPD, wheezing, and a dry cough for the past week, with a recent chest X-ray showing no pneumonia or fluid. The assessment identifies a likely COPD exacerbation, includes prednisone 40 mg once daily for five days, and calls for follow-up monitoring of symptoms and treatment response.
Alongside the note, the Abridge AI pane shows the clinician’s request: “Add these follow-up plans to my Assessment & Plan...” The assistant reports that the request was processed and that edits were suggested. The suggested update adds an outpatient COPD exacerbation follow-up and monitoring plan, including timing, what to monitor, red flags, and pulse oximetry guidance.
The UI also shows a boundary on what the assistant added. The assistant says it did not add spirometry timing because that point “was not explicitly discussed in the prior plan and could vary by practice/patient stability.” The system applies earlier guidance to the note but withholds a possible addition when its basis in the prior plan is not explicit enough.
The visible controls are “Accept” and “Reject.” The model concentrates context into proposed edits, but those edits remain reviewable before they enter the note.
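The review gate described above can be sketched in a few lines. This is a minimal illustration, not Abridge's implementation: the names `ProposedEdit`, `Note`, `Status`, and `review` are hypothetical, and the only behavior taken from the interface is that a suggested edit enters the note only after the clinician accepts it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class ProposedEdit:
    # Hypothetical shape for a model-suggested note edit.
    section: str
    text: str
    source: str
    status: Status = Status.PENDING


@dataclass
class Note:
    # Maps a section name to the lines it contains.
    sections: dict = field(default_factory=dict)

    def apply(self, edit: ProposedEdit) -> None:
        # Refuse anything the clinician has not explicitly accepted.
        if edit.status is not Status.ACCEPTED:
            raise ValueError("only accepted edits may enter the note")
        self.sections.setdefault(edit.section, []).append(edit.text)


def review(note: Note, edit: ProposedEdit, accept: bool) -> Note:
    # Record the clinician's decision; write to the note only on accept.
    edit.status = Status.ACCEPTED if accept else Status.REJECTED
    if accept:
        note.apply(edit)
    return note


note = Note({"Assessment & Plan": ["Prednisone 40 mg once daily for five days."]})
edit = ProposedEdit(
    section="Assessment & Plan",
    text="Follow-up contact or visit within about one week.",
    source="UpToDate: COPD exacerbations: Management",
)
review(note, edit, accept=True)
```

Keeping the decision on the edit object rather than writing directly to the note means a rejected or still-pending suggestion leaves the documentation untouched.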
More context is the product promise and the engineering burden
Chaitanya Asawa says the most exciting part of GPT-5.5 is its ability to keep up as Abridge adds more tools and context. For clinical decision support, the product becomes more valuable when it can use more information: live conversation, patient context, note content, clinical references, and source-backed treatment guidance. It also becomes harder to keep the output concise, relevant, and bounded.
“The thing that excites me the most about this model is that as we keep adding tools and context, we believe the model will actually be able to keep up with that complexity.”
Asawa says Abridge saw a “clear, clear jump” in results and that GPT-5.5 can deliver more value to users “at the rigor that we want and that we need for such a product.” His account of GPT-5.5 is about handling the burden created by richer clinical context: more inputs should become denser point-of-care support, not more noise.


