Orply.

Useful AI Agents Need Smaller Contexts and Simpler Representations

Angus McLeanAI EngineerMonday, May 25, 202611 min read

Angus McLean, an AI Director at OLIVER, argues that useful agents are not the most autonomous ones but the best constrained. Drawing on OLIVER’s production use of AI across thousands of daily creative assets, he says builders should resist both model and developer tendencies toward verbosity and over-engineering: use curated documentation instead of open web access, ask how little context a task needs, choose simple representations such as HTML when they work, and avoid automating jobs they cannot do themselves.

Agent work is becoming production work, not just experimentation

Angus McLean frames agent design as a practical discipline for people already building with large language models, not as a definitive theory of autonomy. His title, “Bounded Autonomy,” points to the main tension in his advice: useful agents sit between open-ended possibility and imposed constraint. The work is not simply to give models more agency, but to decide how that agency should be limited through context, representation, workflow design, and human judgment.

McLean’s perspective comes from advertising production at OLIVER, where he works as an AI Director. He describes the company as having 3,000 staff across 46 countries and says it generates around 4,000 assets a day for more than 200 brands. The important point, in his account, is not just volume. OLIVER puts meaningful media spend behind generated assets — from around £20,000 to a few million — so the work is measured “in the wild.” That gives the organization feedback loops, performance data, and faster iteration on what actually works.

4,000
creative assets generated per day, according to McLean

That production setting changes the standard for agents. In advertising, creative and strategy work used to be knowledge work performed by teams: accounts managing the client relationship and project flow, creative turning ideas into ads and content, and strategy defining the insight, audience, and direction behind the work. McLean says creative and strategy are now increasingly agentic. Agents support generation, ideation, copywriting, content production, audience insight, trend analysis, competitor analysis, and performance optimization.

The reason to use agents is primarily speed and secondarily scale. Creative teams need speed for iteration and testing. Strategy teams need scale for research, such as campaign personalization across personas, territories, or cities. A brand might localize campaign insight for New York and Miami, for example, rather than treat “the consumer” as one abstract audience. The broader goal is to “do much more with much less” and produce more effective advertising.

But the same production context makes agent failure more consequential. McLean stresses that OLIVER is predominantly customer- and client-facing, operating in a fast-paced and high-risk environment. When AI-generated assets are scaled, poor reception can harm a brand as easily as strong reception can help it. That is the backdrop for the rest of his advice: do not mistake agentic capability for permission to automate loosely.

The core model has not changed as much as the tooling suggests

The first practical instruction is to slow down. AI tools, patterns, and acronyms can create a “blink and you’ll miss it” mentality, but Angus McLean asks whether a missed tool was really important if it disappeared that quickly. He has seen many tools come and go, and argues that the underlying constraints of large language models have not changed as dramatically as the surrounding ecosystem can make it feel.

His working mental model of an LLM is deliberately plain: a “closed box with knowledge inside,” or a flexible database capable of “semantic math.” He says he does not expect emergence or continuous learning from the model itself, and he presents this as a useful simplification for builders rather than a technical proof of what models are.

I prefer to think of it as like a flexible database capable of doing semantic math than anything else.

Angus McLean · Source

McLean points to limitations that, in his account, remain visible even as systems become more capable. Humans can learn from very few examples, while models require massive datasets to reach comparatively simple conclusions. Humans continuously learn without forgetting in a way models do not. In advertising, this shows up sharply in trend identification: if something is genuinely new, the model may not recognize it. That matters for work that depends on cultural recency.

He also characterizes much of the agent tooling stack as a set of “band aids” around model constraints. A band-aid solution, in his slide, is temporary, superficial, and inadequate: it masks symptoms rather than fixing the underlying problem. McLean extends that critique not only to developer tools but even, arguably, to some training approaches themselves. Production systems may need these fixes, but builders should know when they are compensating for a constraint rather than removing it.

Context-window growth is, for McLean, one of the clearest sources of recent agentic improvement. Long-running agents need to keep track of goals and plans, action history, tool outputs, and evolving structure over time. Without sufficient context, the system forgets mid-task and fails at long, complex workflows. He contrasts older, tiny context windows with million-token-scale windows, but refuses to treat larger context as the final solution.

His reason is simple: context demand expands with capability. The total amount of knowledge available to a system is vast and growing, and users will always want more. Even very large context windows will not be enough if the strategy remains “put more in.”

Context is a design surface, not a dumping ground

Context is one of the main ways builders shape model behavior. Angus McLean treats it as a form of soft constraint, comparable in practical importance to guardrails. Guardrails are hard limits: policies, safety rules, and system-level restrictions defining what a model must not do. Context is more flexible. It comes from prompts, instructions, conversation history, retrieved documents, memory, and tool outputs. It shapes the model’s behavior indirectly.

That makes context management one of the core tasks of agent design. McLean gives a concrete example: do not give the model web search if high-quality documentation will do. In his experience, models are poor at spotting promotional content and are highly susceptible to SEO. For competitor research in advertising, a web-connected model may absorb what competitors say about themselves instead of the consumer information the strategist actually wants. Once the model uses an external tool, its output becomes constrained by how it uses that tool.

This leads McLean to a shift in how context engineering should be understood. Earlier approaches dealt with scarcity: top-k retrieval, cluster labels, TF-IDF, and other methods for squeezing useful material into small context windows. With larger windows and dynamic context assembly, the question is no longer only what to include. It is what to exclude.

The challenge is no longer getting context in — it's keeping noise out.

Angus McLean

He argues for “quality over quantity” in context construction. The model may manage some of its own context in a dynamic agent system, pulling in retrieved documents, tool outputs, and memory at runtime. But the builder still has to decide what the system is allowed to see, what representations it should use, and how much irrelevant material it should be protected from.

McLean connects this to a broader principle: constraints create creativity. Abundance makes builders less scrappy. Instead of asking how to use the whole context window, he proposes asking how little context is needed to complete the task. That question forces the builder to understand the task, the data, and the model’s failure modes.

His suggested exercises are intentionally fundamental. Try an older or smaller model, not necessarily in production, to understand what the model can and cannot do without the crutch of frontier capability. Build a simple harness, memory, or compaction layer yourself. Preprocess and archive data. Learn how file systems, knowledge graphs, clustering, and retrieval structures affect the model’s behavior. McLean argues these practices improve prompting, control, context management, and understanding of the underlying data.

The analogy he reaches for is early computing and game development: severe compute and memory limits forced engineers to invent disciplined approaches. Spacewar! was built within 4,000 words. Crash Bandicoot, in his example, relied on memory constraints that pushed developers toward clever solutions. Strong systems often come from builders who cannot simply spend more tokens, compute, or orchestration complexity to avoid thinking.

The simplest working system often beats the agent architecture

A complex architecture can lose to a better representation. Angus McLean’s CV example is deliberately humbling: he built an elaborate application to generate his CV, involving a provenance ledger, event log schema, node path taxonomy, multi-vector setup, graph layer, file system structure, governance rules, recursive interpretation flow, layout budgeting, truth separation, a research subsystem, LangGraph topology, a voice service, and a front-end folder.

Then he found that the model did better with four letters: HTML. Asking for a CV “in HTML” produced what he first labels a 10x performance improvement on the slide and then says he would put closer to 100x.

ApproachWhat McLean showedResult he reported
Complex CV applicationProvenance ledger, event log schema, graph layer, recursive interpretation flow, LangGraph patterns, voice service, and other componentsWorked, but not as well
Simple representationThe prompt “write me a cv in html”10x on the slide; McLean said probably 100x
McLean’s CV example contrasts a complex agent application with a simple HTML constraint.

The visual contrast was central to the example. One slide asked “WHO WOULD WIN?” and placed a long stack of architecture terms opposite “4 Simple Letters.” The stack included a provenance ledger, event log schema, node path taxonomy, multi-vector setup, graph layer, file system structure, recursive interpretation flow, deep research subsystem, LangGraph topology, voice service, and front-end folder. The next slide showed two prompts — “write me a cv” and “write me a cv in html” — beside generated HTML code and the caption “10x performance improvement.”

The lesson is that a simple representation can give the model enough structure to solve the real task without a large agent architecture around it. McLean describes this as a pattern builders will recognize: you build something elaborate and then realize the model could do better with a simpler instruction or format.

He warns that both models and developers tend toward complexity. Models are naturally verbose and will often propose overbuilt solutions. Developers working with powerful tools are tempted to use them because they can. McLean’s advice is to resist that impulse: do not waste time, do not waste tokens, and do not create unnecessary work.

The practical standard is “the simplest version that works.” Build that first. Keep the feedback loop with reality short. Then iterate. McLean distinguishes this from only thinking about an agent’s internal feedback loop during task execution; the builder also needs a short path to discovering whether the product idea, architecture, or abstraction works at all.

Meet the model in the middle by choosing the right representation

The most conceptual argument in the talk is that AI can be usefully understood as translation. Angus McLean points back to “Attention Is All You Need,” which he describes as initially about English-to-French translation, and extends the frame to contemporary multimodal systems: text to image, image to audio, audio to video. In this framing, the model transforms one representation into another.

From there he makes a broader, tentative claim: there is an argument that knowledge production is itself summarization. His own talk, he says, is a compression of several years of experience into usable knowledge. If much of the work is transforming, compacting, and re-expressing information, then the builder’s job is to choose representations that let the model operate well.

Different data types can be converted into a common internal representation and then transformed into something else. Unstructured inputs can become structured outputs. Structured inputs can become unstructured outputs. McLean connects this to MCP, model handoffs, and long-running agents: something structured may sit on one side, something unstructured on the other, and the system’s usefulness depends on moving between them.

The implication, as McLean states it, is that structure is not simply an inherent property of the data. It is a property of the representation and the observer. The same content might be best expressed as a diagram, a written explanation, a voiceover, a table, or a timeline depending on what the user needs to see.

For builders, McLean recommends multiple representation layers rather than a single universal structure. Markdown or outlines are useful for human-readable hierarchy and authoring. Graphs capture relationships, references, dependencies, and causality. Clustering and embeddings help with similarity, discovery, and analogy in large or unstructured text bodies. Folders and trees support containment, ownership, navigation, and fast retrieval. Tables support consistent attributes and filtering. Timelines support sequence and change over time.

This is what he calls “organic software”: systems that use multiple representation structures. Meeting the model in the middle means understanding which representation makes a task easier for the model and more legible for the human. In McLean’s examples, that can mean choosing ordinary structures — Markdown, a folder tree, a table, a timeline, or HTML — instead of reaching first for a larger agent architecture.

Do not automate a job you cannot do yourself

Experimentation and production require different tolerances for autonomy. In production, agent work tends toward workflow automation: repeatable, dependable, constrained systems with low error tolerance. In experimentation, builders can work with long-running agents, higher autonomy, open-ended exploration, and new interaction modalities. Angus McLean argues that people who want to get good at this work often need to experiment in their own time, because most companies will not allow the same degree of play inside production roles.

That does not mean production should avoid agents. It means the builder has to know what kind of system the environment can tolerate. McLean says he had planned to discuss structured versus less structured agents in the workplace, and then invokes Adam Smith’s pin factory: capitalism naturally breaks tasks down into small, repeatable chunks. In OLIVER’s work, he says, workflows have often been more effective.

The final principle is blunt: do not automate a job unless you can do it yourself. McLean illustrates this with his old domain, social media intelligence. He shows an “insight deck” for a social listening project created from Twitter data: 49,230 posts analyzed, 34 clusters, using claude-3-opus-20240229 over a source file labeled TWITTER_DATA.json.

49,230
posts analyzed in McLean’s social-listening example

The report includes cluster volume summaries, engagement metrics, momentum, and standout posts, turning roughly 50,000 tweets into organized strategic insight for creative and strategy teams. The visible deck showed a “Social Listening” project page, an “Insight Deck” titled “Steering conversation: themes, momentum, and implications,” charts under “Scale, engagement, and momentum,” and a section of key posts for a cluster. McLean says this is the sort of information an advertising agency would want, and that agents can create it almost instantly for creative or strategic teams.

The person automating the process should understand what good analysis looks like, what information an advertising agency needs, and how the outputs will be used. Without that domain competence, automation can produce plausible-looking reports without the judgment needed to make them useful.

That principle ties together the practical advice. Slow down because tool churn can distract from persistent model limitations. Constrain context because more information is not the same as better information. Keep systems simple because the model often needs a better representation, not a larger architecture. Meet the model in the middle because task performance depends on choosing the right structure. Experiment because fluency comes from building, breaking, fixing, and repeating.

The frontier, in your inbox tomorrow at 08:00.

Sign up free. Pick the industry Briefs you want. Tomorrow morning, they land. No credit card.

Sign up free