On Human-Agent Collaboration: Métis, Centaur Chess, and the Milk Problem

I spent Wednesday afternoon grabbing boba at the Snorkel AI office on 2nd Street in SF for their reading group.

The talk was by Yijia Shao, a PhD candidate at Stanford NLP, covering her recent ICLR paper on Collaborative Gym (Co-Gym). The core question was: how do we build AI agents that actually collaborate with us, rather than just taking over and making a mess?

The "High-Modernist" AI Problem

Yijia opened with an insane stat: the length of tasks AI models can successfully execute is doubling almost every seven months, which she compared to Moore's Law but for agentic stamina.
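
To get a feel for what a seven-month doubling time implies, here is a back-of-the-envelope sketch. The one-hour starting horizon is a made-up number for illustration, not a figure from the talk, and the function name is mine.

```python
def projected_task_length(start_hours: float, months_elapsed: float,
                          doubling_months: float = 7.0) -> float:
    """Exponential growth: the task length doubles every `doubling_months`."""
    return start_hours * 2 ** (months_elapsed / doubling_months)


# Assumption: an agent that can handle a 1-hour task today.
for months in (0, 7, 14, 28):
    hours = projected_task_length(1.0, months)
    print(f"after {months:>2} months: {hours:.0f} hour(s)")
```

Under that assumption, a bit over two years of doubling turns a one-hour task into a sixteen-hour one, which is why the supervision question gets urgent quickly.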

But just because an agent can run for hours doesn't mean it should run unsupervised. When we let these agents off the leash, they often return with what Yijia called "AI slop": output that takes humans longer to clean up and verify than doing the work themselves would have.

She brought up this silly (and slightly scary) example of OpenAI's "Operator" agent. A user asked it to help with grocery shopping, and instead of asking clarifying questions (where do you live? what kind of milk do you want?), it just confidently went online and started searching for milk at a random grocery store in a random city.

"We develop AI to do boring jobs so that humans get to be creative. But the reality is... the opposite is happening and people are reviewing what this AI has produced." — Yijia, referencing a Reddit comment

This reminded me of James C. Scott's Seeing Like a State. Scott's argument is that centralized systems fail because they rely on legible, oversimplified metrics while ignoring métis, which is the localized/messy knowledge of people on the ground. Fully autonomous AI agents run into the same wall. The agent has a goal ("acquire milk") but no localized context ("I live in SF and only drink 1%"), and no way to ask for it.

Enter Co-Gym: Cybernetics and Intelligent Deferral

To fix this, Yijia and her team built Collaborative Gym, a framework that moves away from the rigid turn-taking chatbot model toward something more like a continuous event loop, giving both the human and the agent dual control over a shared environment. The two moves that make this work are:

Proactive Communication: the agent can message you before you prompt it, if it realizes it's missing context
Intelligent Deferral: if the agent hits a wall, it can pause and hand a specific action back to the human
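
Here is how I picture those two moves, as a toy sketch using the milk example. To be clear, this is not Co-Gym's actual API: every name here (Task, agent_step, REQUIRED_CONTEXT, the event labels) is hypothetical, and the real framework is asynchronous rather than a simple loop with scripted human answers.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    goal: str
    context: dict = field(default_factory=dict)  # what the human has told us
    log: list = field(default_factory=list)      # every event, in order


# Hypothetical: the context this toy agent needs before it may act.
REQUIRED_CONTEXT = ("city", "milk_type")


def agent_step(task: Task) -> tuple:
    """Return the agent's next event as a (kind, payload) pair."""
    # Proactive communication: ask for missing context instead of guessing.
    for key in REQUIRED_CONTEXT:
        if key not in task.context:
            return ("ask_human", key)
    # Intelligent deferral: hand one specific action back to the human.
    if "confirm_order" not in task.context:
        return ("defer_to_human", "confirm_order")
    summary = f"ordered {task.context['milk_type']} milk in {task.context['city']}"
    return ("done", summary)


def run(task: Task, human_answers: dict) -> str:
    """Loop until the task is done, routing questions and deferrals to the human."""
    while True:
        kind, payload = agent_step(task)
        task.log.append((kind, payload))
        if kind == "done":
            return payload
        # A scripted dict stands in for the human's reply in this sketch.
        task.context[payload] = human_answers[payload]


task = Task(goal="acquire milk")
result = run(task, {"city": "SF", "milk_type": "1%", "confirm_order": True})
print(result)
print([kind for kind, _ in task.log])
```

The point of the sketch is only the control flow: the agent never searches a random store in a random city, because missing context produces a question and the irreversible step produces a handoff.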

This is basically Norbert Wiener's original vision for cybernetics: a continuous feedback loop between human steering and machine execution. It's also the logic behind the Centaur Chess model, where a human and a computer play as a team; in the freestyle tournaments of the mid-2000s, such teams famously beat both grandmasters and engines playing alone. When Yijia's team ran the experiments, the collaborative agents outperformed fully autonomous ones, and real users preferred them because the interaction felt flexible rather than railroaded.

The Messiness of Real Humans

My favorite part of the Q&A was the difference between training agents on "simulated humans" (other LLMs) versus testing on real humans from Upwork.

Simulated humans are pushovers. Real humans are stubborn. In simulations, if the agent and the simulated human get confused about who's supposed to do what, the task just dies. With real humans, even when the agent is hallucinating or failing to coordinate, the human will drag the task to completion out of sheer willpower. As Yijia put it, human fuzziness isn't a bug; it's the new frontier.

The SF Startup Energy

After the talk I ended up chatting with a few founders iterating on human-in-the-loop interfaces. The conversation kept circling back to the same question: if an agent pauses to defer, how do you design an interface that makes it instantly clear why it stopped and what it needs, without making the human dig through a terminal? Nobody's really solved that cleanly yet, and it seems like the most interesting open problem in this space right now: fewer tools that replace us, more tools that know how to pass the baton.

Things I referenced

James C. Scott — Seeing Like a State (1998)
Centaur Chess / Advanced Chess
Yijia Shao — Collaborative Gym (Co-Gym); her Augmented Mind Podcast covers more on human-centered AI systems