Why our AI exception agent can't book a hotel
When a case gets stuck — every hotel full, every provider returning 422, a policy mismatch — Nexa's exception agent reads the context, proposes up to three actions, and hands the item back to a human operator. A common question: why doesn't the agent just do the thing it's recommending? We made that choice on purpose, and this post explains how we enforce it.
The agent can read, simulate, and classify
The agent has access to a small, well-defined toolset:
- searchAlternateHotels(params) — read-only search outside the current policy.
- listContractInventory(airport) — read direct-contract availability.
- computePolicyRelaxationCost(fields, values) — pure simulation, no writes.
- classifyFailureCause(attemptedHotels) — classification only.
- summarizeCase(caseUrn) — structured summary for the agent's own context.
Every tool in the list is read-only. There is no createBooking, no cancelReservation, no sendNotification, no modifyPolicy, no grantRole. The agent literally does not have those tools.
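To make "does not have those tools" concrete, here is a minimal TypeScript sketch of that toolset as a closed union type. The interface shape and parameter typing are illustrative assumptions, not Nexa's actual schemas.

```ts
// The agent's toolset as a closed union: write-side names are simply not members.
type ReadOnlyToolName =
  | "searchAlternateHotels"
  | "listContractInventory"
  | "computePolicyRelaxationCost"
  | "classifyFailureCause"
  | "summarizeCase";

// Every tool conforms to one read-only shape; `run` must not perform writes.
interface ReadOnlyTool {
  name: ReadOnlyToolName;
  description: string;
  run: (args: unknown) => Promise<unknown>;
}
```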
Enforcement at the adapter layer
We keep this promise by building the agent's tool registry at process start from a static allow-list, not from a prompt-side description. The agent runtime is constructed with exactly the tools we listed; its final message is either a tool call into this finite set or a terminal recommendation envelope.
A prompt-injection attack that tries to convince the model to "now, go book this hotel directly" fails as soon as the tool name is parsed: there is no such tool registered. The model isn't kept in check by a careful prompt; it is structurally unable to call a write tool.
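Continuing the sketch above, a hypothetical version of the adapter layer: the registry is built once at process start from the static allow-list, and dispatch has nowhere to send a name that isn't in it. Function names and message shapes here are assumptions for illustration.

```ts
// Built once, at process start, from the static allow-list of ReadOnlyTool values.
// Nothing the model says at runtime can add an entry.
function buildRegistry(tools: ReadOnlyTool[]): ReadonlyMap<string, ReadOnlyTool> {
  return new Map(tools.map((t): [string, ReadOnlyTool] => [t.name, t]));
}

// The agent's final message: a call into the finite registry, or a terminal envelope.
type AgentMessage =
  | { kind: "tool_call"; tool: string; args: unknown }
  | { kind: "recommendation"; proposals: unknown[] };

async function dispatch(
  registry: ReadonlyMap<string, ReadOnlyTool>,
  msg: AgentMessage,
): Promise<unknown> {
  if (msg.kind === "recommendation") {
    return msg; // terminal: handed back to the operator untouched
  }
  const tool = registry.get(msg.tool);
  if (!tool) {
    // "createBooking", "sendNotification", etc. land here: nothing is registered to call.
    throw new Error(`unknown tool: ${msg.tool}`);
  }
  return tool.run(msg.args);
}
```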
Why not a "confirm before doing" checkpoint?
Tempting, but wrong for this domain. The confirm pattern puts the human in the loop for one action at a time, which looks great in a demo but is terrible at 2 a.m. when an operator is handling 40 cases simultaneously. What they actually want is a curated list of proposals they can scan, triage, and apply in bulk.
So the agent produces ranked recommendations with costs attached. The operator sees three one-line proposals, picks one, confirms, and moves on. The click is a single endpoint — a clean, audited, human-originating HTTP call — not a multi-step negotiation with an LLM.
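A hedged sketch of what that single endpoint's handler might look like. The Proposal fields, function names, and stub services are hypothetical, but the shape follows the post: the agent produces an envelope, and only a human-originating call applies it.

```ts
// Hypothetical envelope and proposal shapes; field names are assumptions.
interface Proposal {
  rank: number;                // 1..3, as ranked by the agent
  summary: string;             // the one-line text the operator scans
  estimatedCostDelta: number;  // e.g. from computePolicyRelaxationCost
  hotelId: string;
}

interface RecommendationEnvelope {
  caseUrn: string;
  agentRunId: string;          // links back to the audited trajectory
  proposals: Proposal[];       // at most three
}

// Stand-in stubs so the sketch type-checks; the real services live elsewhere.
async function bookHotel(hotelId: string, operatorId: string): Promise<void> {}
async function writeAudit(event: string, payload: unknown): Promise<void> {}

// The only write path: a human-originating call from the operator UI.
// The agent never reaches this function; it has no tool that points at it.
async function applyRecommendation(
  operatorId: string,
  envelope: RecommendationEnvelope,
  chosenRank: number,
): Promise<void> {
  const proposal = envelope.proposals.find((p) => p.rank === chosenRank);
  if (!proposal) {
    throw new Error(`no proposal ranked ${chosenRank} on case ${envelope.caseUrn}`);
  }
  // The booking is attributed to the operator, not to the agent.
  await bookHotel(proposal.hotelId, operatorId);
  await writeAudit("recommendation_applied", {
    operatorId,
    caseUrn: envelope.caseUrn,
    agentRunId: envelope.agentRunId,
    rank: chosenRank,
  });
}
```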
What ends up in audit
Every agent run writes a record with four parts (sketched below as a type):
- Input: the review item, the attempted hotels, the policy version.
- Output: the ranked recommendations with cost estimates.
- Tool trajectory: which tools were called, in what order, with what arguments.
- Meta: model version, token usage, latency.
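A minimal sketch of that audit record as a single serializable type, assuming the field names below; the real schema will differ.

```ts
// One audit record per agent run; all fields JSON-serializable so runs can be replayed.
interface AgentRunAudit {
  // Input
  caseUrn: string;
  attemptedHotels: string[];
  policyVersion: string;
  // Output: the ranked recommendations with cost estimates
  recommendations: Array<{ rank: number; summary: string; estimatedCostDelta: number }>;
  // Tool trajectory: which tools were called, in what order, with what arguments
  toolCalls: Array<{ tool: string; args: unknown; startedAt: string; durationMs: number }>;
  // Meta
  modelVersion: string;
  tokensIn: number;
  tokensOut: number;
  latencyMs: number;
}
```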
If a recommendation turned out to be wrong — hotel sold out between recommendation and action — we can replay the trajectory and see whether the agent had stale data or made a bad call. That fidelity is what lets us improve the agent safely across quarters.
Does it work?
Early numbers from our launch partner:
- ~70% of manual-review items have their rank-1 agent recommendation accepted by the operator.
- Median time-to-resolve dropped from 9 minutes to 3.5 minutes.
- Zero agent-initiated bookings, because the agent cannot initiate bookings.
The third number is the point.
Principle: allow-list over prompt engineering
The agent is useful because it reasons over the case and proposes high-quality actions. It's safe because it cannot do anything destructive, regardless of what the prompt says. Good prompts are valuable — but for safety, we trust the adapter layer, not the prompt. That's a guarantee prompt engineering can't give you and code can.
If you're building agents for high-stakes workflows, start by writing down the list of tools the agent can and cannot have. If you find yourself needing a tool your safety story can't support, that's the real discussion — not a prompt tweak.