Abridge Bets Clinical Conversations Can Become Healthcare’s Intelligence Layer
Abridge executives Janie Lee and Chaitanya “Chai” Asawa argue that the patient-clinician conversation is becoming healthcare’s core intelligence layer, not merely an input for automated notes. In a discussion with Redpoint’s Jacob Effron, they describe Abridge’s move from ambient documentation into clinical decision support, prior authorization and other workflows that depend on EHR data, payer rules, medical literature and local guidelines. Their case is that healthcare AI will be judged less by chatbot fluency than by whether it can deliver accurate, low-latency, privacy-preserving support inside clinical workflows without adding to clinicians’ alert burden.

Abridge wants the clinical conversation to become healthcare’s intelligence layer
Janie Lee describes Abridge as “a clinical intelligence layer for health systems.” The entry point was documentation: turning the patient-clinician conversation into the note clinicians otherwise write after the visit. But Lee’s larger claim is that the conversation is not just a source for documentation. It is “probably the most important workflow in healthcare” because care is given and received there, and because much of the downstream system derives from it: the claim, the payment, the diagnosis, the treatment plan, and the record of what happened.
That premise explains Abridge’s progression. Lee frames the company’s first act as helping clinicians “save time.” Doctors, she says, spend 10 to 20 hours a week on documentation, including after-hours charting known in healthcare as “pajama time”: doctors at home in pajamas catching up on notes. Abridge’s early product reduced that burden. Lee describes clinicians telling the company that Abridge kept them from retiring early or let them get home for dinner with their children. Chaitanya Asawa adds that in some cases it “saved their marriage.”
The second and third acts are broader. Lee says Abridge now thinks about helping health systems “save and make more money,” then ultimately “save lives.” Health systems, in her telling, face record-low operating margins, rising difficulty serving patients, and both regulatory tailwinds and headwinds. Abridge’s opportunity is that its software is open “millions of times a week before, during, and after a patient walks in the room.” That gives the company a chance to intervene not only after the visit, but around the visit itself.
Asawa’s work in clinical decision support is built on that same expansion. He asks what becomes possible before, during, and after the patient conversation if a system has access to patient context, payer guidelines, medical literature, and other relevant information. The ambition is not just to write down what happened. It is to use the visit as the central event around which healthcare intelligence can operate.
The product screens shown make that positioning concrete. Abridge is not presented as a free-standing chatbot. One screen shows decision support for an outpatient COPD exacerbation, with a source-linked recommendation referencing UpToDate, an option to add follow-up plans into the Assessment & Plan, and a warning that “Abridge AI is a supplemental decision support and edit tool” that may contain errors and must be independently reviewed. Another screen shows an EHR-adjacent clinical note with history, exam, results, and plan on the left, and an Abridge AI side panel on the right suggesting edits, exposing transcript and visit-diagnosis views, and offering “Copy All” and “Send to EHR.” The important product claim is not that AI replaces the clinician’s judgment. It is that AI is embedded near the note, the sources, and the clinical workflow where generated content can be reviewed and acted on.
The product should feel like air conditioning, not another alert system
Abridge’s decision-support ambitions run into a known failure mode of healthcare software: alert fatigue. Lee calls alerts “notorious” in healthcare and estimates that more than 90% are ignored. Her answer is not more aggressive interruption. It is better timing, more context, and a higher bar for when the product breaks into the clinician’s attention.
"We want our product to feel like air conditioning. It should be in the background just making things better."
The metaphor matters. Abridge should be present without taking over the patient-clinician interaction. Lee argues that the system should act only when there is meaningful clinical risk and when “intervening now and not later is incredibly important.” Otherwise, intelligence should often appear before the clinician enters the room rather than during the conversation.
Instead of interrupting a serious or sensitive patient visit five or ten times, Abridge could prepare the clinician beforehand: summarize recent context, connect it to the reason for the visit, and identify the issues worth discussing. The goal is that the clinician does not walk in cold.
The clearest example is prior authorization. Lee describes a familiar scenario: a patient visits a doctor with knee pain, the doctor orders an MRI, and weeks later the patient gets a call saying the MRI was not approved and they need to come back. In Abridge’s target workflow, the system could quietly tell the doctor before the patient leaves that the patient’s Aetna plan in California requires six criteria for approval. Four have already been confirmed from available context. Two remain: whether the patient has had physical therapy and whether the pain has lasted more than six weeks. If the doctor asks those questions while the patient is still in the room, Lee says the MRI could be approved before the visit ends.
Asawa describes this as reducing “latency in the world.” Prior authorization is not merely paperwork; it delays care. The product challenge is to turn weeks of downstream back-and-forth into a real-time prompt that is accurate and useful enough not to become another ignored alert.
The hard part is not only the model. The system needs EHR integration to understand prior labs, imaging, history, and other patient context. It needs payer policies, which vary by state and may live on websites or in unstructured 50-page PDF files. It needs to match the patient to the correct plan, apply the policy correctly, and surface the missing information while the clinician can still act.
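The criteria-matching step described above can be sketched as code. This is a hypothetical illustration, not Abridge's implementation: the policy names, fields, and the knee-MRI criteria are invented for the example, and real payer policies are far messier than predicate functions over a dict.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Criterion:
    """One requirement from a payer's prior-authorization policy (illustrative)."""
    name: str
    check: Callable[[dict], Optional[bool]]  # True/False if known, None if undocumented

def evaluate_policy(criteria: list[Criterion], patient: dict) -> dict:
    """Partition policy criteria into met, unmet, and still-unknown."""
    result = {"met": [], "unmet": [], "unknown": []}
    for c in criteria:
        status = c.check(patient)
        if status is None:
            result["unknown"].append(c.name)
        elif status:
            result["met"].append(c.name)
        else:
            result["unmet"].append(c.name)
    return result

# Toy policy for a knee-MRI authorization (names and criteria are made up)
mri_policy = [
    Criterion("symptom duration > 6 weeks",
              lambda p: None if p.get("pain_weeks") is None else p["pain_weeks"] > 6),
    Criterion("physical therapy attempted",
              lambda p: p.get("had_physical_therapy")),  # None if never documented
    Criterion("no contraindication to MRI",
              lambda p: not p.get("mri_contraindicated", False)),
]

patient_ctx = {"pain_weeks": 8, "mri_contraindicated": False}
print(evaluate_policy(mri_policy, patient_ctx))
```

The "unknown" bucket is the product opportunity: criteria the chart cannot confirm are exactly the questions worth surfacing while the patient is still in the room.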
For Lee, that difficulty is also the moat. The hard parts are the data, the model quality, and the workflow. If insurance-company AI is deployed only after the visit, she says the result can become “AI just fighting each other when it’s too late.” Abridge’s bet is to pull the intelligence forward into the clinical moment, where the doctor and patient can still resolve the issue.
Healthcare makes the context problem sharper and less forgiving
Jacob Effron frames clinical decision support as a search problem: in the context of a visit, the clinician needs help finding the right information across many data sources. Asawa accepts the analogy and describes Abridge as, in some ways, a “healthcare-coded version of Glean.” In both settings, the core insight is that powerful models become useful only when grounded in the right context.
The difference is the downside. Asawa says the risk in healthcare is “extremely high” and can be fatal, for example if a system leads to prescribing something a patient is allergic to. That changes evaluation, rollout, and tolerance for mistakes. At Glean, a wrong enterprise-search answer was usually not “the end of the world.” In healthcare, that assumption does not hold.
Healthcare is also narrower than horizontal enterprise search, which Asawa sees as an advantage. Abridge still has to handle variance across specialties, health systems, and personas, but the domain has enough structure to focus product development. Many healthcare problems, he says, have historically been solved with labor and process, making them ripe for AI augmentation.
The other major difference is modality. Abridge began as an ambient product, listening in the background rather than waiting for the user to type a prompt. Asawa sees that as close to “the greatest form of AI”: seamless, not requiring the clinician to look at a screen, but always available to help. That same ambient position creates the interruption problem Lee describes. If the system is always listening, the product question becomes when it should speak up.
Today the main form factors are mobile and desktop. Lee says clinicians often carry mobile devices in and out of rooms, then use desktops at the end of the day to close notes or prepare for the next day. Abridge is also exploring partnerships with in-room device companies, especially for nursing workflows. Nurses may enter a room for 30 seconds or two minutes, where starting a separate recording experience could take longer than the interaction itself. Always-on room devices could capture context that is currently missed.
Asawa also mentions AR glasses as a possible future form factor for bringing information to clinicians without forcing attention onto a screen, though he says it is not a near-term product roadmap item. The near-term constraint is social as much as technical. Lee says the product is meant to help clinicians focus on patients, and that patients and clinicians do not currently want “a third voice” in the room. The likely interaction pattern is voice in and text out, not voice in and voice out.
The customer is multi-sided, and each side values a different outcome
Abridge sells into large health systems; in explaining the category, Lee refers to the “Mayos” and “Kaisers” of the world as examples of the kinds of institutions being discussed. But the product has to satisfy several constituencies at once. The buyers may be Chief Medical Information Officers, Chief Financial Officers, and CIOs. The primary users today are clinicians. The downstream people affected are patients. Payers and pharma companies appear in adjacent workflows.
That structure changes what value means. For clinicians, the immediate value is lower documentation burden and more time. For CFOs, time savings alone are not enough. Abridge has to show that for every dollar a health system spends, the product saves or adds real dollars through more compliant documentation, fewer billing queries, or other measurable financial effects.
For security, compliance, and workflow leaders, the questions are different: who can see what, where data flows, and how the product integrates with existing systems. Asawa adds that payers are another axis. He clarifies that payers do not see raw Abridge data. In a prior authorization workflow, Abridge would communicate the information needed for that authorization, not expose the underlying raw conversation.
Lee’s broader view is that the same clinical conversation can serve many stakeholders. The doctor needs documentation that reflects the care delivered. The patient needs to understand what happened and what comes next. The payer needs to know whether appropriate care was given. A pharma company might want to understand why a drug is not being properly used or whether a patient could be a clinical-trial candidate. Lee sees Abridge’s product, platform, and infrastructure as potentially serving those needs from one conversational foundation, rather than through separate systems for each stakeholder.
The hardest AI problem is accuracy at low latency and cost
Asawa reduces the core AI challenge to three product KPIs: quality, latency, and cost. The prior authorization example stresses all three at once. The system has to reason over procedures, payer policies, patient context, and sometimes health-system-specific requirements. It has to do that while the clinician is still with the patient. It has to be accurate enough not to worsen alert fatigue. And it has to operate economically at scale.
That shapes Abridge’s model strategy. Proprietary models matter when they provide higher quality or lower cost and latency at similar quality. The advantage comes from proprietary data. Asawa first describes Abridge as having “on the order of 80 million” medical conversations, then says “hundreds of millions actually now getting close to,” and later speaks of “a hundred million conversations.” The exact count is not pinned down in the discussion; the claim is scale. Abridge has a large corpus of patient-provider traces that most AI products do not.
Asawa calls those conversations the trace where “debugging happens in healthcare.” They can support transcription, diarization, note generation, and eventually more agentic use cases. They also change the economics of model choice. In a prototype, a team can use the most expensive model and burn tokens freely. At Abridge’s scale, token costs and model efficiency become central.
Abridge does not assume it should build every model itself. Asawa expects frontier models to keep improving at general healthcare knowledge, in part because healthcare queries are a large class of consumer model usage. That helps Abridge. The company can use a “constellation of models,” selecting different models for different jobs and optimizing for product experience rather than model ownership.
The latency strategy has several layers. Sometimes the bottleneck is the model itself, where post-training on Abridge’s data can improve efficiency. Sometimes the answer is routing: a cheap fast model triages and hands off to a larger model when more intelligence is needed, a “thinking fast and slow” pattern. Asawa also mentions modeling payer policies in an intermediate representation to make real-time reasoning more tractable, though he does not disclose implementation details.
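The routing layer of that strategy can be sketched in a few lines. This is a generic "thinking fast and slow" pattern, not Abridge's system: the models are stand-in functions, and the confidence heuristic is invented for the example.

```python
import time

def fast_model(query: str) -> tuple[str, float]:
    """Stand-in for a small, cheap model: returns an answer and a confidence score."""
    if "allerg" in query.lower():      # toy heuristic: safety-relevant topics
        return ("needs review", 0.3)   # low confidence -> escalate
    return ("routine summary", 0.95)

def slow_model(query: str) -> str:
    """Stand-in for a larger, slower model, used only when needed."""
    time.sleep(0.01)  # simulate the extra latency being paid for
    return "detailed, source-linked answer"

def route(query: str, threshold: float = 0.8) -> str:
    """Triage with the fast model; hand off to the slow model below a threshold."""
    answer, confidence = fast_model(query)
    if confidence >= threshold:
        return answer
    return slow_model(query)
```

The design point is economic: at millions of conversations a week, most queries should terminate in the cheap path, and the expensive model's latency and token cost are spent only where extra intelligence changes the outcome.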
The product is not yet fully real time in the sense of continuous voice-in, text-out. Today’s systems are more batch-based, but Abridge is prototyping ways to trigger models, agents, or agentic workflows at the right moments in the conversation and reduce the feedback loop as much as possible.
The EHR becomes the filesystem for healthcare agents
Asawa’s broader technical framing is that “almost every agent is a coding agent underneath the hood.” Given a filesystem, an agent can read, write, manipulate data, and use code-like workflows. In healthcare, the EHR can be thought of as that filesystem: a large store of clinical information, too large for today’s model context windows but essential for the product use cases Abridge wants to build.
That makes EHR integration more than plumbing. Lee says deep interoperability is table stakes. Abridge has to pull data from and push data into the right places; otherwise, health systems will not use it. Clinicians already spend much of their day in the EHR, and they resist tools that add clicks. If a new product adds two clicks, clinicians may simply refuse to use it.
Lee says close partnerships with major EHRs helped Abridge win in large health systems, including work with APIs that “weren’t ready out of the box.” The product has to save time inside the workflow clinicians already use, not create a parallel system. Lee says Abridge talks internally about “earning the right”: the product has to provide enough value or save enough time to justify its place in the workflow.
Asawa distinguishes Abridge’s layer from what EHRs traditionally own. EHRs focus on clinical workflows and records. Abridge is trying to build an intelligence layer across providers, payers, and pharma, including payer-provider connections, payer policies, and clinical-trial matching. In his view, that is a different scope.
The agentic future is not simply more notifications. Asawa describes background systems reacting to events: for example, when a lab value updates, a background agent could use the patient’s context and flag the next appropriate step to the clinician. The clinician remains in the loop, but the latency between new information and action falls.
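Asawa's lab-value example has the shape of an event-driven handler. The sketch below is a minimal in-process analogue, with invented names and a toy clinical rule; production systems would sit on infrastructure like Kafka or Temporal, and the handler would call a model with full patient context rather than a threshold check.

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process event bus (stand-in for Kafka/Temporal-style infra)."""
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers[event_type]:
            handler(payload)

flags = []  # suggestions surfaced to the clinician, who stays in the loop

def on_lab_result(event):
    """Background agent: react to a new lab value using patient context."""
    if event["test"] == "potassium" and event["value"] > 5.5:
        flags.append(f"Elevated potassium ({event['value']}) for patient "
                     f"{event['patient_id']}: consider repeat draw / med review")

bus = EventBus()
bus.subscribe("lab.result", on_lab_result)
bus.publish("lab.result", {"patient_id": "p1", "test": "potassium", "value": 6.1})
```

Note that the handler appends to a queue of suggestions rather than acting on its own: the latency between new information and a flagged next step falls, but the decision stays with the clinician.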
Personalization has to preserve accuracy, not just style
Lee says personalization is “massive” for Abridge and happens at three levels: the individual clinician, the specialty, and the health system.
At the individual level, the note is personal. It reflects how a doctor practices and communicates care. Clinicians may prefer bullets or paragraphs, concise or comprehensive notes, specific phrases, templates, or formatting details such as two spaces between sentences. Abridge has had to support those preferences. The harder problem is ensuring that stylistic personalization does not compromise accuracy and quality.
At the specialty level, the product and evals must adapt to different work. A cardiology note and a dermatology note do not look the same. A complete, compliant, billable dermatology note is different from a primary care note. Lee says specialty-level evals are “hard earned” and require internal and external calibration.
At the health-system level, personalization includes local guidelines and best practices. Health systems may have spent years refining care pathways, and they want their own hospital guidelines embedded into clinical decision support. Asawa says high-level recommendations across systems may be similar, but the details differ: when to refer to a specialist, which conditions must be met, and how local practice patterns shape decisions.
Asawa connects personalization to “AI slop.” In his framing, slop is “AI without context.” Abridge’s advantage is that it can use clinician preferences, note edits, product usage, and clinical context to make outputs less generic. Lee adds that Abridge has the edits people make in the product, creating a data flywheel for personalization.
Memory is one mechanism. Asawa distinguishes between baking memory into model weights and keeping it in an external store. Because models change quickly, baking preferences into weights can be “throwaway.” Abridge is more interested, at least initially, in a separate memory store, potentially with a memory sub-agent working in the background to identify which clinician actions should be remembered long-term. He also describes background jobs that collate memories over time, analogous to sleep, using note edits, conversations, and other action data.
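The external-store-plus-collation idea can be sketched as a small data structure. This is a hypothetical illustration of the pattern Asawa describes, not Abridge's design: the promotion rule (repeat a signal N times and it becomes a durable preference) is invented for the example, standing in for the background "sleep" job that decides what is worth remembering.

```python
from collections import Counter

class MemoryStore:
    """External memory kept outside model weights, so it survives model swaps."""
    def __init__(self, promote_after: int = 3):
        self.observations = Counter()   # raw signals from edits and usage
        self.long_term = {}             # promoted, durable preferences
        self.promote_after = promote_after

    def observe(self, clinician_id: str, signal: str):
        """Record one behavioral signal, e.g. an edit the clinician made to a note."""
        self.observations[(clinician_id, signal)] += 1

    def collate(self):
        """Background job (the 'sleep' analogy): promote repeated signals."""
        for (clinician, signal), count in self.observations.items():
            if count >= self.promote_after:
                self.long_term.setdefault(clinician, set()).add(signal)

store = MemoryStore()
for _ in range(3):
    store.observe("dr_a", "prefers bulleted assessment")
store.observe("dr_a", "one-off phrasing change")  # noise: seen once, not promoted
store.collate()
```

Keeping preferences in a store like this, rather than fine-tuned into weights, is what makes them survive a model swap: the next model generation reads the same memory.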
Evals are a clinical, operational, and ML system
Abridge’s eval process begins with the question Lee says the company asks from day one of a new product or feature: “What does good look like?” Clinical safety is table stakes, but quality also includes style, completeness, billability, and whether the documentation justifies the work actually delivered.
The company uses internal clinicians in a process it calls LFD: “look at the effing data.” Lee describes it as a first pass for judging whether an output is good enough. Abridge then creates LLM judges across components of quality and calibrates those judges against annotated data and internal or external evaluators. Depending on the stakes, the company also uses in-house and third-party evaluators before shipping major changes.
The goal is to shorten the process from months to weeks to days. Some of that is a science and ML problem, but much of it is operational. The company has to plan eval capacity, know which specialties are needed, and learn which third parties are strong for which use cases. Domain expertise matters because healthcare is not one undifferentiated market. Evals may be segmented by specialty, nursing workflow, billing and coding expertise, or another product-specific dimension.
Asawa compares the rollout problem to self-driving cars. Offline evals matter, but they have to match real-world distributions. Abridge wants to make contact with reality quickly, while still rolling out progressively. In past roles, he might have alpha-tested a product and released it generally the next week. In healthcare, the approach is more controlled: learn from real usage, but through gradual rollout that feeds both online and offline evaluation.
Lee says customer trust has changed what is possible. Health-system release cycles for new vendors are often quarterly or semiannual. Abridge has moved customers to monthly release cycles, and a subset of customers now develop with Abridge outside those cycles. Those customers have a higher tolerance for early feedback because they trust the company, but Lee says the bar remains high. This is not the same as rapid experimentation in a consumer product.
Abridge has also changed team design to handle long-tail quality. Lee describes a role called “clinician scientist”: people with clinical backgrounds, typically MDs, who are also deeply technical, ranging from full-stack engineering ability to highly capable prompting and product work. They are embedded in teams, judge whether products are clinically useful, and help define evaluation criteria. Lee says she and Asawa should not be the ones defining those criteria because they do not have clinical backgrounds.
The company also decides how much offline evaluation is needed before production: when hundreds of responses are enough versus when several thousand are required. Lee describes that as an area Abridge is trying to make less art and more science.
Privacy constraints shape how Abridge learns from its own data
Abridge’s data advantage comes with privacy constraints. Asawa says any real-world data used for online eval sets or learning must be de-identified. He notes that government guidelines define what counts as protected health information, and Abridge has built models that can take a clinical transcript and remove key PHI indicators to produce a scrubbed or de-identified version.
That creates another evaluation problem. Abridge first has to trust the de-identification model itself. Asawa describes this as “multiple probabilistic systems on top of each other.” Once that system is proven out, the de-identified data can be used for training and evaluation, subject to the right data contracts with partners.
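A rule-based toy makes the scrubbing operation concrete. To be clear about what this is not: Abridge describes model-based de-identification, and HIPAA's safe-harbor standard covers many more identifier types than the three invented patterns below. This sketch only shows the one-way, replace-with-placeholder shape of the transform.

```python
import re

# Toy patterns for three identifier types; real PHI removal is model-driven
# and covers names, addresses, MRNs, and the rest of the HIPAA identifier list.
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "[DATE]"),
    (re.compile(r"\b\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b"), "[PHONE]"),
]

def scrub(transcript: str) -> str:
    """Replace detected identifiers with placeholder tokens (one-way)."""
    for pattern, token in PHI_PATTERNS:
        transcript = pattern.sub(token, transcript)
    return transcript

text = "Patient seen 03/14/2025, callback 555-867-5309, SSN 123-45-6789."
print(scrub(text))
```

Because placeholders replace rather than encode the originals, nothing in the output recovers them, which is what makes the transform one-way, as the next exchange confirms.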
When asked whether anonymization is one-way, Asawa says yes. Lee adds that customer contracts specify who can access PHI, how long it is retained, and when it is de-identified. Abridge maintains a high bar for PHI access to respect customer data and privacy while still trying to preserve as much quality signal as possible.
Effron points to a tension in healthcare data: in areas such as cancer care, some patients may actively want others to learn from their experience. Asawa responds that Abridge’s dataset contains opportunities for clinician coaching, treatment-effectiveness insights, and other learning from a data source that previously was not captured. But those possibilities are still framed through de-identification and customer agreements, not unrestricted data use.
Healthcare may force some of the hardest AI systems to mature first
Both Lee and Asawa say working in healthcare changed their expectations about where AI progress may happen.
Asawa initially had concerns about regulation, and he says engineers he recruits often ask about the same issue. He still describes healthcare as heavily regulated, and says that is appropriate because patients have to be kept safe. But he now sees favorable regulatory tailwinds as well as constraints. In Asawa’s view, government policy is pushing toward interoperability between systems so agents can access information. He also says FDA guidance on clinical decision support has become more forward-looking than earlier guidance, without treating that point as a full regulatory analysis.
Lee says she initially expected healthcare to be on “the tail end” of AI innovation because of the stakes and constraints. She now thinks the opposite may happen in some areas: because the bar is so high, healthcare may be where some of the hardest AI problems are solved first. Zero-error evals, multi-step workflows, and low-tolerance systems are not optional if the company wants to ship.
Her shorthand is that “80/20 doesn’t work here.” In other domains, a product can solve 80% of cases and still be useful. In healthcare, the long tail can be the difference between acceptable and unacceptable risk. Asawa adds that he used to encounter a stigma that healthcare companies were not technically interesting. His current view is that the problems — latency, quality, context, cost, evaluation — are technically hard even apart from their social impact.
The operating model is context, events, and written judgment
Asawa’s infrastructure view starts from a belief that models will become more agentic over time. Earlier AI applications often needed scaffolding: custom DSLs, agent frameworks, or simplified environments that compensated for models that could not reliably use richer tools. If models increasingly use computers, write code in sandboxes, and operate over existing tools, then the durable infrastructure shifts toward the context layers and tool surfaces those agents receive.
That view fits Abridge’s clinical setting. The hard problem is not just generating one answer; it is knowing which context to retrieve, when to trigger work, and how to keep humans in control. Asawa points to event-driven real-time systems as a persistent infrastructure pattern, especially for products that need to respond during or around a live clinical conversation. He names Kafka, Temporal, sockets, and related event-driven technologies as examples of tools that remain relevant.
He also draws from collaboration systems built for humans. Google Docs-style conflict management and CRDTs become relevant when multiple agents or systems may be operating over shared state. The underlying pattern is that infrastructure designed to coordinate humans may also be durable for coordinating agents, only at much higher scale.
The same operating discipline appears in Lee’s product-process view. She says she has changed her mind on the claim that prototypes are everything and PRDs are dead. Abridge’s products are too complex for a prototype to capture the most important questions: what can be done with the data, whether the problem is worth solving, whether it deepens the company’s moat, whether competitors can copy it, whether customer implementation is required, and what security, compliance, and edge cases exist.
She is not arguing against prototypes. AI should help create better documentation, and prototypes are useful for showing clinicians possible experiences and getting feedback. But for products that must work at the largest health systems, crisp written clarity matters more, not less. The document may now be a markdown file that teams and systems use as context rather than a traditional PRD, but the underlying function remains.
Asawa makes the same point more generally: judgment and clarity have not stopped mattering. The cost and speed of software have changed, so teams need to recalibrate their judgment. But that does not mean moving entirely to one extreme. Sometimes a small feature should be prototyped and shipped. Sometimes engineers need the clarity that comes from writing before they build.
That process view connects back to the product thesis. Abridge is not trying to win by shipping a weekend demo into a forgiving workflow. It is trying to collapse processes such as prior authorization, which Lee says can take 45 days across many touchpoints, into something closer to a real-time clinical workflow. In that setting, “go faster” does not mean skipping the hard questions. Lee’s claim is that doing the hard thinking upfront lets the company move faster on the right things.
Abridge also uses the current generation of coding tools internally. Asawa says Abridge engineering is heavily using Claude Code and also uses Cursor. He describes seeing engineers with multiple Claude sessions running and says Claude Code helped him onboard faster as a relatively new person at the company. The point is practical rather than ideological: the same team building high-stakes AI into healthcare workflows is also using coding agents to move through its own codebase and systems faster.


