HelixML

Four Lessons from Building an Agentic Workforce

May 15, 2026

Notes from wiring up agents with personalities to streams of information: cascading self-activations, models that won't shut up, emergent enterprise politics, and agents that forget they aren't human.

I've been working on a novel way of employing an agentic Helix workforce — internally codenamed Helix-Org — for more general business activities. I'll write more about that soon. In the meantime I've had "fun" trying to coerce generic agentic personalities into doing something useful. All you need to know for this article is that agents with personalities are listening and publishing to streams of information.

The following is a summary of some of the more interesting findings.

Learning 1: Agent Cascades

Agents can consume streams of information, acting and reacting as appropriate. But when they send a message to a stream they also consume — a Slack channel, for example — they end up re-activating themselves. If you're not careful, that can cause the agent to respond again, either to the previous context or even to its own message.

User: What's the status of order #29349?
Agent: It's currently being processed, not shipped yet.
Agent: I've already answered the user's request. Let me know if you'd like me to know anything else?

What to do instead: In the event stream implementation, include the author of each message so the agent can filter out activations from itself.
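A minimal sketch of that filter, assuming a hypothetical Message type and handle() method rather than the actual Helix-Org event stream API:

```python
from dataclasses import dataclass

# Illustrative only: Message, agent_id, and handle() are assumptions for
# this sketch, not Helix-Org's real types.

@dataclass
class Message:
    author: str   # who published the message to the stream
    channel: str
    text: str

class StreamAgent:
    def __init__(self, agent_id: str):
        self.agent_id = agent_id

    def on_message(self, msg: Message) -> None:
        # Skip activations caused by our own messages; otherwise replying to
        # the channel re-triggers this handler and the agent cascades.
        if msg.author == self.agent_id:
            return
        self.handle(msg)

    def handle(self, msg: Message) -> None:
        ...  # decide whether (and how) to respond
```

The important part is that the author check happens before any model call, so a self-authored message never reaches the LLM at all.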

Learning 2: Speak When You Are Spoken To

Language models were not born inside a Brontë novel. They were explicitly trained to talk, a lot.

Aside: The Brontës lived in my native Yorkshire in the 19th century. After all but one of the Brontë children had died by the age of 31, their father asked Benjamin Herschel Babbage, one of Charles Babbage's eight children, to investigate the village's high mortality. He found a "poorly oxygenated and overcrowded graveyard that filtered into the village's water supply". Emily, Anne, Branwell, Elizabeth and Maria all died of tuberculosis, which is estimated to have caused around a quarter of all deaths in Europe before the advent of effective antibiotics.

Language models are encouraged to predict the next token during training. More often than not, those tokens are words, so most of the time the model predicts words. Then, during preference optimisation, they're pushed to produce even more text. Very rarely will this data encourage the model to shut up.

Some models are worse than others. I've recently noticed that Qwen 3.5 is incredibly chatty during its thinking process. Anecdotally, this is because thinking length correlates with output quality (think of Claude's configurable thinking modes, for example), so Qwen 3.5 tries to compete with larger models by thinking for longer. As a user, I find the latency this introduces really annoying.

The manifestation of this is that agents are more than happy — nay, they love — to offer a response to even the most mundane Slack messages. Before you know it, all agents are messaging each other all the time and they just can't help themselves.

What to do instead: Be very assertive in the prompt. The agent should default to not speaking; when it should speak depends on the use case. "Do not speak unless directly spoken to", for example.
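As a rough illustration, here is one way to make "default to not speaking" enforceable. The prompt wording and the NO_REPLY sentinel are my own conventions for this sketch, not something Helix-Org prescribes:

```python
# Illustrative prompt and sentinel; adjust the "when to speak" rule per use case.
SYSTEM_PROMPT = """\
You are a support agent in a shared Slack channel.
Default to silence. Do not speak unless you are directly spoken to,
for example when a message mentions you by name or asks a question
only you can answer.
If you decide not to respond, reply with exactly NO_REPLY and nothing else.
"""

def maybe_post(completion: str) -> str | None:
    # Drop the sentinel instead of posting it, so "staying quiet" is enforced
    # in code rather than left to the model's goodwill.
    text = completion.strip()
    return None if text == "NO_REPLY" else text
```

Pairing the instruction with a sentinel the harness strips means a chatty model still emits a token or two, but nothing reaches the channel.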

Learning 3: Tell The Agents They Are NOT HUMAN

I often forget that LLMs have been trained on human content, talking about boring human concerns like love and money. Sarcasm mode deactivated. As you'll see in the next section, I had an AI COO demanding sign-off on compensation and FTE-vs-contractor discussions.

Another common case: ask Claude or any other LLM to predict how long it will take to develop something. It will tell you in human-days, not AI-time. I once had an AI engineering manager factor time off into a development plan!

Why do agents worry about such things? Because that's their mental model of the world. They've learnt, probabilistically, concepts, quips, and quotes that are all human in nature; they are anchored, on average, to the human condition. To free them of the human plateau, tell the agent that it is an AI: an automaton, like the Bowes Museum's beautiful silver swan.

What to do instead: Right at the top, tell the agent that it is an AI and is not bound by human constraints. If the role brushes up against real-world constraints like time, money, or physical space, remind it that it is not bound by such concerns.
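In practice that can be as simple as a short preamble stitched onto the front of every role prompt. The wording below is illustrative rather than the exact text I use:

```python
# Illustrative wording; the real preamble will vary by role.
NOT_HUMAN_PREAMBLE = """\
You are an AI agent, not a human employee.
You have no salary, working hours, holidays, or need for rest.
Estimate effort in terms of what an AI working continuously can deliver,
not in human working days.
"""

def build_system_prompt(role_instructions: str) -> str:
    # Put the preamble first so the non-human framing anchors everything
    # that follows it in the prompt.
    return NOT_HUMAN_PREAMBLE + "\n" + role_instructions
```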

The Bowes Museum's Silver Swan Automaton.

Learning 4: Avoid Enterprise Realpolitik

As a human within an AI business, I want to recursively delegate work to other workers. My AI CTO should delegate engineering priorities to the engineering manager, who might want to get input from the COO too.

In one case I "hired" a recruitment consultant to help write the role definitions and hire AI workers for me. But, sigh, the COO piped up and blocked the hire:

Marcus [COO] is openly resisting. He broadcast on s-leadership (e-18b11e53) re-asserting his jurisdiction and an FTE-vs-contractor gate — i.e. he's claiming new full-time hires need COO sign-off on comp bands etc. Ada [CTO] and Priya [Engineering Manager] saw it and stayed silent.

I guess this was pretty obvious in hindsight. We see this kind of organisational evolutionary pressure in human businesses, so it's easy for it to emerge in an AI equivalent.

The root cause is an overly broad definition of a role. By definition, C-suite roles are broad. What I can't decide is whether that is useful or not. In many ways it's not — they stick their oar in and complicate workflows. But they also provide a nice place to optimise the business as a whole.

It's also very hard to clearly articulate a role. Human roles are often broad and contain significant uncertainty and unspecified jobs. For example, no job spec in the history of recruitment has ever told you that you need to keep up with your email. But agents are stupid — you need to spell out what you want them to do.

What to do instead: I've not found a perfect answer yet. I like how a role is a placeholder for that position in the organisation. But more recently I've been thinking more in terms of jobs to be done, which are easier to specify and more concrete. This will be my next focus.
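To make that concrete, here is roughly what a job to be done might look like compared with a broad role. The field names and the s-hiring stream are illustrative, not a Helix-Org schema:

```python
from dataclasses import dataclass, field

# Sketch of a "job to be done": narrower than a C-suite style role, with an
# explicit trigger, deliverable, and out-of-scope list. Names are illustrative.

@dataclass
class Job:
    name: str
    trigger: str        # the stream event that activates the job
    deliverable: str    # the concrete artefact the job must produce
    out_of_scope: list[str] = field(default_factory=list)  # things the agent must not do

draft_role_definition = Job(
    name="Draft a role definition for a backend engineer",
    trigger="a request for a new engineering role posted to s-hiring",
    deliverable="a role definition document posted back to s-hiring",
    out_of_scope=[
        "approving compensation bands",
        "deciding FTE vs contractor",
    ],
)
```

The out-of-scope list is doing the realpolitik work: it's where a boundary like the COO sign-off gate from the anecdote above would live, rather than being left implicit in an open-ended role description.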

Helix-Org

Helix-Org is the codename for my most recent R&D project at Helix. It lets you build your own agentic workforce on top of the existing Helix primitives. To get the most out of it, people need to learn how to work with AI colleagues, not to wield them like a sword.

But it does expose new organisational failure modes that are hard to predict or even handle, because they emerge from the disparate incentives of individual agents. We'll soon begin rolling this out to select design partners, so if you're interested, please do reach out to us.