Everyone is selling AI. Most of it is a ChatGPT wrapper with a logo on it.
Here's what I mean by AI agents, and why they're different from what most people are buying.
A Chatbot Answers Questions. An Agent Runs Operations.
A chatbot sits on your website and waits for someone to type something. It gives a response. Maybe it's useful, maybe it hallucinates. Either way, it's reactive. It does nothing until a human initiates.
An AI agent is different. It's embedded in your operational workflows. It triggers on events — a new lead comes in, a follow-up is overdue, a pipeline stage changes. It classifies, routes, personalises, and acts. No human initiates. No human monitors. It runs at 3am on a Tuesday when nobody is watching, and the work is done by morning.
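To make "triggers on events" concrete, here's a minimal sketch of an event dispatcher in Python. The event names (`lead.created`, `followup.overdue`) and the handler functions are hypothetical; a real CRM defines its own webhook payloads, and the handlers here only return a description of what they would do.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], str]

# Maps an event type to the handlers that should run when it arrives.
HANDLERS: Dict[str, List[Handler]] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event type."""
    def register(fn: Handler) -> Handler:
        HANDLERS.setdefault(event_type, []).append(fn)
        return fn
    return register


@on("lead.created")  # hypothetical event name
def classify_and_route(event: dict) -> str:
    return f"classified lead {event['contact_id']}"


@on("followup.overdue")  # hypothetical event name
def send_reminder(event: dict) -> str:
    return f"queued reminder for {event['contact_id']}"


def dispatch(event: dict) -> List[str]:
    """Run every handler registered for the event's type.

    Unknown event types run no handlers and return an empty list.
    """
    return [handler(event) for handler in HANDLERS.get(event["type"], [])]
```

Nothing in this loop waits for a person to type. The CRM emits an event, `dispatch` runs, and the work happens whether or not anyone is awake.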
That's the difference. One is a feature. The other is infrastructure.
What AI Agents Actually Do (Real Examples)
Lead classification. A new contact enters the CRM. The agent reads the intake data — age of children, location, inquiry type — and classifies them into a segment. Not based on rules someone wrote. Based on patterns the model learns from your actual conversion data. 93% accuracy on classification, validated before launch.
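"Validated before launch" can be as simple as scoring the classifier against contacts whose correct segment is already known. A rough sketch, assuming a labelled holdout set exists; the `toy_classifier` below is a stand-in for the real model, and the field and segment names are invented for illustration.

```python
from typing import Callable, List, Tuple

Labelled = List[Tuple[dict, str]]  # (intake data, known-correct segment)


def accuracy(classify: Callable[[dict], str], labelled: Labelled) -> float:
    """Share of labelled contacts the classifier puts in the right segment."""
    if not labelled:
        raise ValueError("need at least one labelled example")
    correct = sum(1 for intake, expected in labelled if classify(intake) == expected)
    return correct / len(labelled)


def ready_to_launch(classify: Callable[[dict], str], labelled: Labelled,
                    threshold: float) -> bool:
    """Launch gate: only ship if measured accuracy meets the agreed threshold."""
    return accuracy(classify, labelled) >= threshold


def toy_classifier(intake: dict) -> str:
    # Stand-in for the real model; hypothetical field and segment names.
    return "enrolment" if intake.get("inquiry_type") == "pricing" else "general"


holdout: Labelled = [
    ({"inquiry_type": "pricing"}, "enrolment"),
    ({"inquiry_type": "pricing"}, "enrolment"),
    ({"inquiry_type": "support"}, "general"),
    ({"inquiry_type": "support"}, "enrolment"),  # the toy classifier gets this one wrong
]
```

The threshold is a business decision, not a property of the code. The point is that the number gets measured on real labelled data before the agent touches a live contact.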
Personalised outreach at scale. 5,400 messages a month. Each one personalised to the contact's segment, history, and stage in the pipeline. Not a mail merge with a first name token. Actual personalisation — different tone, different content, different call to action based on where they are in the journey.
Follow-up sequencing. Lead goes cold after 48 hours? The agent triggers a re-engagement sequence. Different from the initial outreach. Calibrated to the reason they went cold: no response, showed interest but didn't book, or booked but cancelled. Each path gets a different approach.
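The branching logic is simple enough to sketch. This assumes each lead record carries a `last_activity` timestamp plus two flags (`replied`, `cancelled_booking`); those field names and the sequence names are hypothetical, not taken from any particular CRM.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

COLD_AFTER = timedelta(hours=48)


class ColdReason(Enum):
    NO_RESPONSE = "no_response"
    INTERESTED_NOT_BOOKED = "interested_not_booked"
    BOOKED_CANCELLED = "booked_cancelled"


# Hypothetical sequence names: one re-engagement path per reason.
SEQUENCES = {
    ColdReason.NO_RESPONSE: "reengage_short_nudge",
    ColdReason.INTERESTED_NOT_BOOKED: "reengage_booking_link",
    ColdReason.BOOKED_CANCELLED: "reengage_reschedule_offer",
}


def cold_reason(lead: dict) -> ColdReason:
    """Work out why the lead went cold, checking the most specific signal first."""
    if lead.get("cancelled_booking"):
        return ColdReason.BOOKED_CANCELLED
    if lead.get("replied"):
        return ColdReason.INTERESTED_NOT_BOOKED
    return ColdReason.NO_RESPONSE


def next_sequence(lead: dict, now: datetime) -> Optional[str]:
    """Return the sequence to trigger, or None if the lead is not cold yet."""
    if now - lead["last_activity"] < COLD_AFTER:
        return None
    return SEQUENCES[cold_reason(lead)]
```

A lead that replied but never booked gets the booking-link path. A lead that was active 12 hours ago gets nothing yet.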
Multilingual operations. An education company with students across Latin America. The agent handles inbound messages in Spanish, Portuguese, and English — classifying intent, routing to the right team, and responding in the contact's language. No separate system per language. One agent, three languages, running 24/7.
The Boring Infrastructure That Makes It Work
Nobody talks about this part. It's not exciting. It's what separates a demo from a production system.
Tiered model routing. Not every task needs the most expensive model. Lead classification uses a fast, cheap model. Personalised copy generation uses a more capable one. The system routes each task to the right tier automatically. This cuts AI costs by 60-80% without losing quality where it matters.
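Here's a minimal sketch of what that routing can look like. The tier names, task names, and cost figures are made up for illustration; they are not real model names or real prices.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # illustrative units, not real pricing


FAST = ModelTier("fast-small-model", 1.0)         # hypothetical cheap tier
CAPABLE = ModelTier("capable-large-model", 10.0)  # hypothetical expensive tier

# Each task type is pinned to the cheapest tier that does the job well.
TASK_TIERS = {
    "classify_lead": FAST,
    "detect_intent": FAST,
    "write_outreach": CAPABLE,
}


def route(task: str) -> ModelTier:
    """Pick the tier for a task; unknown tasks default to the capable tier."""
    return TASK_TIERS.get(task, CAPABLE)


def estimated_cost(workload: List[Tuple[str, int]]) -> float:
    """Total cost for a list of (task, token_count) pairs under this routing."""
    return sum(route(task).cost_per_1k_tokens * tokens / 1000 for task, tokens in workload)
```

Defaulting unknown tasks to the capable tier is a deliberate choice in this sketch: a new task type costs a bit more until someone assigns it a tier, but quality is never silently downgraded.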
Observability. Every agent action is logged. What it classified, why, what it sent, what the contact did next. If something breaks — a misclassification, a weird response, a failed send — you can trace it. Not "the AI did something weird." Actual logs, actual traces, actual debugging.
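One way to get there is a structured log record per agent action, tied together by a trace ID. A small sketch using only the Python standard library; the field names are my own choice here, not a fixed schema.

```python
import json
import logging
import uuid
from datetime import datetime, timezone

logger = logging.getLogger("agent.actions")


def log_action(contact_id: str, action: str, decision: str, reason: str,
               trace_id: str = "") -> dict:
    """Emit one JSON log line for an agent action and return the record.

    Pass the same trace_id for every step taken for one contact
    (classify, generate, send) so the whole chain can be pulled up later.
    """
    record = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "contact_id": contact_id,
        "action": action,
        "decision": decision,
        "reason": reason,
    }
    logger.info(json.dumps(record))
    return record
```

With records like this, "why did this contact get that message?" becomes a search on one trace ID instead of a guess.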
Guardrails. The agent has boundaries. It doesn't hallucinate pricing. It doesn't promise things the business can't deliver. It doesn't send messages outside approved hours. Every production agent has a constraint layer that's as important as the intelligence layer.
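A constraint layer can start as a plain function that checks every outgoing message before it is sent. This sketch assumes a fixed approved-hours window, an approved price list, and a short list of banned phrases; all three values are placeholders, and a real deployment would load them from the business's own configuration.

```python
import re
from datetime import time
from typing import List

# Placeholder values for illustration only.
APPROVED_START = time(9, 0)
APPROVED_END = time(18, 0)
APPROVED_PRICES = {"$49", "$99"}
BANNED_PHRASES = ("guarantee",)

# Matches amounts like $49 or $49.00.
PRICE_PATTERN = re.compile(r"\$\d+(?:\.\d{2})?")


def violations(message: str, send_at: time) -> List[str]:
    """Return every guardrail the message breaks; an empty list means safe to send."""
    found: List[str] = []
    if not (APPROVED_START <= send_at <= APPROVED_END):
        found.append("outside_approved_hours")
    for price in PRICE_PATTERN.findall(message):
        if price not in APPROVED_PRICES:
            found.append(f"unapproved_price:{price}")
    lowered = message.lower()
    if any(phrase in lowered for phrase in BANNED_PHRASES):
        found.append("unsupported_promise")
    return found
```

The check runs on the model's output, not its prompt. Asking the model nicely not to invent prices is not a guardrail; refusing to send a message with an unapproved price is.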
Failover. If the AI model is down, the system doesn't break. Messages queue. Fallback logic kicks in. The business keeps running. A production system is not a demo that works when conditions are perfect.
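A minimal sketch of that queue-and-fallback behaviour, assuming the model call raises an exception when the provider is unavailable. `ModelUnavailable`, `FailoverSender`, and the fallback text are names invented for this example.

```python
from collections import deque
from typing import Callable, Deque, List


class ModelUnavailable(Exception):
    """Raised by the model client when the provider cannot be reached."""


FALLBACK_TEMPLATE = "Thanks for getting in touch. We'll reply shortly."


class FailoverSender:
    def __init__(self, generate: Callable[[dict], str]) -> None:
        self.generate = generate
        self.retry_queue: Deque[dict] = deque()

    def handle(self, lead: dict) -> str:
        """Return a personalised message, or the fallback if the model is down."""
        try:
            return self.generate(lead)
        except ModelUnavailable:
            self.retry_queue.append(lead)  # keep the lead for a proper follow-up
            return FALLBACK_TEMPLATE

    def drain(self) -> List[str]:
        """Retry queued leads in order; stop at the first failure and keep the rest."""
        results: List[str] = []
        while self.retry_queue:
            lead = self.retry_queue[0]
            try:
                results.append(self.generate(lead))
            except ModelUnavailable:
                break
            self.retry_queue.popleft()
        return results
```

The contact still hears back during the outage, and nothing is lost: once the model recovers, `drain` works through the queue in the order the leads arrived.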
Why Most "AI Solutions" Fail
They skip the boring part. They build the intelligence layer — the model, the prompt, the demo — and call it done. No observability. No guardrails. No failover. No integration with the actual CRM where the data lives.
The result: an AI feature that works in a demo and breaks in production. Or worse, one that fails silently, sending the wrong message to the wrong contact at the wrong time, and nobody knows until a customer complains.
Production-grade means: it runs unsupervised, it handles edge cases, it logs everything, and it degrades gracefully when something goes wrong. That's the bar. Most AI implementations don't clear it.
The Stack Behind It
There's no single tool that does all of this. It's a system.
The CRM holds the contacts and pipeline. The AI layer handles classification, personalisation, and routing. The workflow engine orchestrates the triggers and sequences. The observability layer tracks every action. The integration layer connects it all — webhooks, APIs, data transforms.
I pick the right tool for each job. Not the one I know best. Sometimes that's GoHighLevel (GHL) for the CRM. Sometimes it's Supabase for the data layer. Sometimes it's a custom Python service for the AI logic. The stack serves the operation, not the other way around.
The test: Ask your AI vendor what happens when the model goes down at 2am. If they don't have a clear answer, you don't have a production system. You have a demo.
Running on a stack that grew by accident?
Tools added one at a time, never architected together. That's the problem I solve. Book 45 minutes and I'll map what moves, what stays, and what makes sense for your operation.
Book a Discovery Call