Agent Memory Requires Session State, Shared Workflow State, and Persistence
Google Cloud’s Annie Wang argues that agents that forget recent user context are usually facing an architecture problem, not a model problem. In a Serverless Expeditions walkthrough with Martin Omander, she says useful agent systems need integrated memory alongside models and tools, and lays out three initial patterns: session state for the current conversation, shared state for multi-agent workflows, and persistence for memory that survives restarts and future sessions.

An agent that cannot remember will look unintelligent
The failure mode is simple: a user tells an agent they are planning a two-day trip to Tokyo and like historic sites; the agent suggests the Imperial Palace; the user asks it to send the itinerary; the agent responds as if the conversation never happened. Annie Wang calls this the “goldfish memory problem.”
“Yeah, this is the goldfish memory problem. So even if our AI agent is brilliant, it will look dumb if it doesn’t remember anything.”
Wang’s broader point is architectural. Developers often treat AI intelligence as a combination of model and tooling. The initial architecture shown on screen presents that as “The Common View: Model & Tooling.” Wang argues that this leaves out the third piece needed for useful agent behavior: memory. The revised version adds a memory database icon and restates the modern systems approach as “Model + Tooling + Integrated Memory.”
The practical claim is that memory is not one feature. Wang presents six memory patterns for agents: session state, multi-agent state, persistence, callbacks, custom tools, and multimodal memory. The first three address different scopes of forgetting: the current conversation, state shared across agents in a workflow, and memory that survives across sessions and restarts.
Session state keeps the current conversation from resetting
The first and most basic memory pattern is session state. Annie Wang compares it to ordinary human conversation: it would be strange if someone forgot what you told them one minute ago, and the same expectation applies to agents. A session holds the conversation history so the agent can use prior turns while responding to later ones.
In Google’s Agent Development Kit, Wang describes the implementation as creating and reusing a session object for the whole conversation. The code shown creates one session with session_service.create_session, passing the application name and user ID:
trip_session = await session_service.create_session(
    app_name=multi_day_agent.name,
    user_id=user_id
)
With that session in place, the travel agent can combine multiple user turns instead of treating each prompt as isolated. In Wang’s example, the user first says, “I am visiting Tokyo, I like historic sites.” The next prompt asks: “I am visiting for 3 days, please give me an itinerary.” The agent’s response uses both the destination and the preference, opening with “A 3-day trip to Tokyo with a focus on historic sites” and proposing Day 1 around the Imperial Palace East Garden and Kokyo Gaien National Garden.
The point is narrow: session memory is not long-term personalization. It is the baseline requirement for an agent to treat one conversation as one conversation.
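To make the pattern concrete, here is a minimal, self-contained sketch of session state in plain Python. It is not ADK code; the `Session` class, `add_turn`, and `build_prompt` are illustrative names standing in for what a session service does: accumulate the conversation history and put prior turns in front of the model on every new prompt.

```python
# Minimal illustration of the session-state pattern (not the ADK API):
# a session accumulates the conversation, and each new prompt is
# answered with that history in view.

class Session:
    """Holds the running conversation for one user interaction."""

    def __init__(self, app_name: str, user_id: str):
        self.app_name = app_name
        self.user_id = user_id
        self.history: list = []  # (role, text) turns in order

    def add_turn(self, role: str, text: str) -> None:
        self.history.append((role, text))

    def build_prompt(self, new_user_text: str) -> str:
        """Combine prior turns with the new prompt, so the model still
        sees 'I like historic sites' when asked for an itinerary."""
        lines = [f"{role}: {text}" for role, text in self.history]
        lines.append(f"user: {new_user_text}")
        return "\n".join(lines)


session = Session(app_name="multi_day_agent", user_id="user_01")
session.add_turn("user", "I am visiting Tokyo, I like historic sites.")
prompt = session.build_prompt(
    "I am visiting for 3 days, please give me an itinerary."
)
# The assembled prompt now carries both the destination and the preference.
```

Without the accumulated history, the second prompt would arrive alone and the agent would have no idea which city or preferences it concerns, which is exactly the goldfish failure described above.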
Multi-agent state lets specialist agents pass work to one another
Session state handles memory within a conversation between user and agent. More complex applications often involve several agents working together, and Annie Wang’s second pattern addresses how those agents share context. She describes state as “a shared digital folder for the entire session.”
The example asks an agent to solve a compound task: “I am currently at San Francisco downtown, find the best sushi in Palo Alto and then tell me how to get there.” Behind the scenes, Wang says, a foodie agent finds the restaurant and a navigation agent gets directions to it. The response gives directions to Fuki Sushi in Palo Alto from downtown San Francisco and includes the address “4119 El Camino Real, Palo Alto, CA 94306.”
The mechanism is a shared state key. In the foodie agent definition, the code includes an output_key named destination. The on-screen comment explains the effect: ADK saves the foodie agent’s final response to state["destination"]. The transportation agent then reads that same value through a {destination} placeholder in its prompt, which the on-screen comment says ADK fills automatically from state.
The agents are assembled into a SequentialAgent, which runs them in a fixed order: first the foodie agent, then the transportation agent.
root_agent = SequentialAgent(
    name="find_and_navigate_agent",
    sub_agents=[foodie_agent, transportation_agent],
    description="A workflow that first finds a location and then provides directions."
)
This pattern lets each agent remain specialized while still participating in a larger workflow. The restaurant-finding agent does not need to become a navigation agent, and the navigation agent does not need to rediscover the restaurant. They coordinate through state.
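The coordination mechanism can be sketched without the ADK at all. In the hypothetical Python below, each "agent" is just a function; a sequential runner writes each agent's result into a shared state dict under its output key, and the next agent reads that key, mirroring how ADK's `output_key` and `{destination}` placeholder behave in the video's example. All names here are illustrative, and the agent bodies are canned stand-ins for model calls.

```python
# Hypothetical sketch of the shared-state pattern. This is plain Python,
# not the ADK API: output keys and the shared state dict mirror the
# video's mechanism, but the agent functions just return canned text.

from typing import Callable, Optional


def foodie_agent(task: str, state: dict) -> str:
    # Stand-in for the restaurant-finding agent's model call.
    return "Fuki Sushi, 4119 El Camino Real, Palo Alto, CA 94306"


def transportation_agent(task: str, state: dict) -> str:
    # Reads the destination the previous agent saved to state,
    # the way a {destination} placeholder would be filled.
    return f"Directions from San Francisco downtown to {state['destination']}"


def run_sequential(agents: list, task: str) -> dict:
    """Run agents in order; save each output under its output key, if any."""
    state: dict = {}
    for agent, output_key in agents:  # (callable, Optional[str]) pairs
        result = agent(task, state)
        if output_key:
            state[output_key] = result
    return state


state = run_sequential(
    [(foodie_agent, "destination"), (transportation_agent, "route")],
    "Find the best sushi in Palo Alto and tell me how to get there.",
)
```

The design point survives the simplification: neither agent knows about the other. They only agree on the key name, which is what keeps specialists decoupled while still composing into one workflow.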
Persistence is what separates temporary context from memory across time
The first two patterns are still in-memory. Martin Omander raises the operational problem: what happens if the app closes or the server reboots? Wang’s answer is blunt: “That will be a total memory loss.” If the script stops, the agent forgets the user ever existed.
Persistence is the third pattern. Instead of relying only on an in-memory session service, Wang says to swap in a persistent database session service. That lets an agent refer back to conversations from “days, weeks, or even months ago.” Omander frames the result as the “personal assistant feel”: the agent can remember preferences over time.
The implementation shown has three important moves. First, retrieve the old session from the session service:
old_session = await session_service.get_session(
    app_name=root_agent.name,
    user_id="user_01",
    session_id=session_id
)
Second, extract relevant information from the old session. Wang’s demonstrated approach is explicitly labeled naive: it gets all user/model turns from the old session events and builds a previous_context string labeled “PREVIOUS TRIP CONTEXT.”
Third, inject that previous context into the first query of a new session. The prompt shown tells the agent that the user is planning a new trip to Osaka and asks what they should eat based on previous preferences:
query_3 = f"""
I'm planning a new trip to Osaka this time.
Based on my previous preferences (above), what should I eat?
{previous_context}
"""
This is not presented as a sophisticated retrieval architecture. But it establishes the persistence boundary: once prior session history can be retrieved and placed into the new interaction, the agent can use earlier user preferences after a restart or across separate sessions.
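The three moves can be sketched end to end with a JSON file standing in for a database-backed session service. This is an assumption-laden illustration, not ADK code: `save_session`, `get_session`, and `build_previous_context` are invented names, and the naive extraction deliberately mirrors the video's approach of replaying every user/model turn under a “PREVIOUS TRIP CONTEXT” label.

```python
# Sketch of the persistence boundary (not the ADK API). A JSON file
# stands in for a persistent database session service; the naive
# extraction replays all user/model turns as labeled context.

import json
from pathlib import Path


def save_session(path: Path, session_id: str, events: list) -> None:
    """Persist the session's events so they survive a process restart."""
    path.write_text(json.dumps({"session_id": session_id, "events": events}))


def get_session(path: Path) -> dict:
    """Retrieve a previously saved session (move 1)."""
    return json.loads(path.read_text())


def build_previous_context(old_session: dict) -> str:
    """Naively extract every user/model turn into one block (move 2)."""
    lines = ["PREVIOUS TRIP CONTEXT:"]
    for event in old_session["events"]:
        if event["role"] in ("user", "model"):
            lines.append(f"{event['role']}: {event['text']}")
    return "\n".join(lines)


store = Path("trip_session.json")
save_session(store, "session_01", [
    {"role": "user", "text": "I am visiting Tokyo, I like historic sites."},
    {"role": "model", "text": "Day 1: Imperial Palace East Garden."},
])

# After a "restart": retrieve the old session and inject its context
# into the first query of a new session (move 3).
previous_context = build_previous_context(get_session(store))
query = (
    "I'm planning a new trip to Osaka this time.\n"
    "Based on my previous preferences (above), what should I eat?\n"
    f"{previous_context}"
)
```

Swapping the JSON file for a real database changes the storage layer, not the shape of the pattern: as long as prior events can be fetched and folded into the new session's first prompt, the agent can act on preferences it learned before the restart.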
The first three memory patterns solve different forgetting problems
Annie Wang’s three patterns are related but not interchangeable. Session state prevents an agent from losing the thread inside the current conversation. Multi-agent state lets agents in the same workflow share intermediate results, such as a restaurant destination passed from a food critic agent to a navigation agent. Persistence stores and retrieves memory beyond the lifetime of a running process, so previous conversations can inform future sessions.
Callbacks, custom tools, and multimodal memory are the next three patterns, but Wang leaves them for a later installment. The core memory stack presented here is narrower: keep the current dialogue coherent, let cooperating agents share notes, and back memory with storage so it survives system restarts.
