After two years of hype, autonomous agents are quietly being deployed inside Fortune 500 companies — but only for the narrow tasks where they actually work.
If you spent 2024 watching agent demos crash spectacularly in front of live audiences, you would have been forgiven for writing the whole category off. Two years later, the picture is very different. AI agents are now running production workloads at Walmart, JPMorgan, Siemens, and dozens of mid-market firms — and most users don't even know they exist.
Where agents actually work today
The pattern is consistent: agents thrive in narrow, well-instrumented, reversible domains, where the task is tightly scoped, every action is logged, and a mistake can be undone. Customer support triage, internal IT helpdesk, code migration, contract review, and data-pipeline orchestration are all proving to be sweet spots. What still doesn't work? Open-ended research, anything safety-critical, and tasks with sparse feedback signals.
The new tooling stack
The agents winning in production share a common architecture: a strong base model (Claude, GPT, or Gemini), a tightly scoped tool API, durable execution via something like Temporal or Restate, and — critically — a human-in-the-loop checkpoint for any action that touches money or sends external communication.
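To make the last two pieces concrete, here is a minimal sketch of a scoped tool surface with an approval gate, in plain Python. Everything in it is hypothetical: the ToolGateway class, the lookup_order and issue_refund tools, and the in-memory pending queue are illustrations, not any vendor's API. In a real deployment the pending queue would presumably live in a durable execution layer rather than in process memory.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class Tool:
    name: str
    handler: Callable[..., Any]
    # Set to True for anything that moves money or sends external messages.
    requires_approval: bool = False


@dataclass
class ToolGateway:
    """Hypothetical gateway: the agent can only call registered tools."""
    tools: Dict[str, Tool] = field(default_factory=dict)
    pending: Dict[int, Tuple[str, Dict[str, Any]]] = field(default_factory=dict)
    _next_ticket: int = 0

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def call(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        tool = self.tools.get(name)
        if tool is None:
            # Tight scoping: anything outside the registry is refused.
            return {"status": "rejected", "reason": f"unknown tool: {name}"}
        if tool.requires_approval:
            # Park the action instead of running it; a human decides later.
            ticket = self._next_ticket
            self._next_ticket += 1
            self.pending[ticket] = (name, args)
            return {"status": "pending_approval", "ticket": ticket}
        return {"status": "ok", "result": tool.handler(**args)}

    def approve(self, ticket: int) -> Dict[str, Any]:
        # pop() raises KeyError on a second approval, so an action runs at most once.
        name, args = self.pending.pop(ticket)
        return {"status": "ok", "result": self.tools[name].handler(**args)}

    def deny(self, ticket: int) -> Dict[str, Any]:
        self.pending.pop(ticket)
        return {"status": "denied"}


# Hypothetical tools for a support-triage agent.
def lookup_order(order_id: str) -> Dict[str, str]:
    return {"order_id": order_id, "status": "shipped"}


def issue_refund(order_id: str, amount: float) -> str:
    return f"refunded {amount:.2f} on {order_id}"


def build_gateway() -> ToolGateway:
    gateway = ToolGateway()
    gateway.register(Tool("lookup_order", lookup_order))
    gateway.register(Tool("issue_refund", issue_refund, requires_approval=True))
    return gateway
```

With this setup, a call to lookup_order runs immediately, a call to issue_refund comes back as pending_approval with a ticket number, and nothing touches money until a person calls approve on that ticket. The point of the design is that the gate lives in the gateway, not in the prompt, so the model cannot talk its way past it.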
What it means for engineers
The skillset is shifting. The 2026 AI engineer spends less time prompting and more time designing tool surfaces, writing evals, and building observability. If you want to ride this wave, start by mastering function calling and structured outputs: they are the interface between the model and every tool you expose, and that's where the real work happens.
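For a feel of what that looks like in practice, here is a rough sketch of a tool definition and a check on the arguments a model sends back. The route_ticket tool, its fields, and the validate_arguments helper are all made up for illustration. The definition follows the general JSON Schema shape that function-calling APIs tend to use, but the exact key names differ between providers, so treat "parameters" here as an assumption. The validator covers only a small slice of JSON Schema (required fields, basic types, enums) and is not a substitute for a full schema library.

```python
import json
from typing import Any, Dict, List

# Hypothetical tool definition for a support-triage agent.
ROUTE_TICKET_TOOL: Dict[str, Any] = {
    "name": "route_ticket",
    "description": "Route a support ticket to a queue.",
    "parameters": {
        "type": "object",
        "properties": {
            "queue": {"type": "string", "enum": ["billing", "technical", "account"]},
            "priority": {"type": "integer"},
            "summary": {"type": "string"},
        },
        "required": ["queue", "priority"],
        "additionalProperties": False,
    },
}

_JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_arguments(raw: str, schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments are usable."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    errors: List[str] = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required field: {key}")
    for key, value in args.items():
        if key not in props:
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected field: {key}")
            continue
        spec = props[key]
        expected = _JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        wrong_type = expected is not None and (
            not isinstance(value, expected)
            or (isinstance(value, bool) and spec["type"] != "boolean")
        )
        if wrong_type:
            errors.append(f"{key} must be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key} must be one of {spec['enum']}")
    return errors
```

Calling validate_arguments on the raw argument string from the model, with ROUTE_TICKET_TOOL["parameters"] as the schema, returns an empty list when the call is safe to execute. A non-empty list can be logged for observability, counted in an eval, or fed back to the model as a retry hint, which ties together the three skills named above.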