Exception Agent
When a case lands in manual review, the Nexa Exception Agent — a Nexa Trained Model on Google — reviews the context, picks from a small allow-list of tools, and returns up to three recommendations for the human operator.
The agent is read-only. It can search, simulate, and classify — it cannot book, cancel, or notify.
When it runs
The agent is invoked synchronously when a manual-review item is created (if the agent is enabled for the tenant). It has a hard timeout (default 120s) and a max step count (default 10).
Input
A manual-review item — see Manual Review for the shape. The agent sees:
- The case (passengers, stay plan, tiers)
- The allocation wave (candidates, scores, discarded alternatives)
- Booking attempts and their errors
- The active policy
- The airport's context (contracts, nearby hotels)
Tools (allow-list)
| Tool | Description |
|---|---|
| searchAlternateHotels(params) | Runs a hotel search outside the current policy's constraints (e.g., wider radius, lower stars) |
| listContractInventory(airport) | Returns current direct-contract availability |
| computePolicyRelaxationCost(fields, values) | Simulates what a policy relaxation would cost, given the current case |
| classifyFailureCause(attemptedHotels) | Clusters provider errors into a root cause |
| summarizeCase(caseUrn) | Returns a structured summary for the agent's own context management |
Tools not in the list include createBooking, cancelReservation, sendNotification, modifyPolicy, and anything that sends email. These are blocked at the adapter layer, not merely by the prompt.
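A minimal sketch of what adapter-layer enforcement means, assuming a registry built once at process start. `ToolRegistry` and `ALLOWED_TOOLS` are illustrative names, not the platform's actual classes:

```python
# The five tools from the allow-list table above.
ALLOWED_TOOLS = {
    "searchAlternateHotels",
    "listContractInventory",
    "computePolicyRelaxationCost",
    "classifyFailureCause",
    "summarizeCase",
}

class ToolRegistry:
    """Built from the static allow-list at process start. The model can
    only reach what is registered here, regardless of what the prompt
    (or a prompt injection) asks for."""
    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        if name not in ALLOWED_TOOLS:
            raise PermissionError(f"{name} is not on the allow-list")
        self._tools[name] = fn

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise PermissionError(f"{name} is not available")
        return self._tools[name](**kwargs)
```

Because the check happens when the registry is built, a forbidden tool like createBooking simply does not exist at the layer the model talks to.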
Output
{
  "itemUrn": "urn:nexa:review:abc",
  "agentUrn": "urn:nexa:agent-run:def",
  "recommendations": [
    {
      "rank": 1,
      "action": "rebook",
      "hotelUrn": "urn:nexa:hotel:alt",
      "cost": { "nightly": 130, "total": 260, "currency": "EUR" },
      "reason": "Policy-compliant, 18km (4km over cap) — add ~€30 over original.",
      "relaxations": [{ "field": "maxDistanceKm", "proposed": 20 }]
    },
    {
      "rank": 2,
      "action": "splitGroup",
      "groupId": "GRP-3",
      "into": [["PAX-1"], ["PAX-2","PAX-3"]],
      "reason": "PAX-1 has accessibility requirement unmet by any near-airport hotel."
    },
    {
      "rank": 3,
      "action": "relaxPolicy",
      "fields": ["minStars"],
      "proposed": { "minStars": 3 },
      "reason": "All 4★+ sold within radius; dropping to 3★ unlocks 12 compliant candidates."
    }
  ],
  "rootCause": "INVENTORY_EXHAUSTED",
  "steps": 4,
  "latencyMs": 3400,
  "modelVersion": "nexa-exception-agent-v3"
}
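The schema gate that stands between the model and the operator could look roughly like this. `validate_agent_output` and its checks are a simplified stand-in for the real schema, covering only the invariants visible in the example above (up to three recommendations, contiguous ranks, known actions):

```python
KNOWN_ACTIONS = {"rebook", "splitGroup", "relaxPolicy"}

def validate_agent_output(out):
    """Minimal structural check mirroring the example output above.
    The production schema is richer; this shows the idea: invalid
    output never reaches the operator."""
    assert isinstance(out.get("itemUrn"), str) and out["itemUrn"].startswith("urn:nexa:")
    recs = out.get("recommendations", [])
    assert 1 <= len(recs) <= 3, "agent returns up to three recommendations"
    assert [r["rank"] for r in recs] == list(range(1, len(recs) + 1)), "ranked 1..n"
    for r in recs:
        assert r["action"] in KNOWN_ACTIONS
        assert r.get("reason"), "every recommendation must cite a reason"
    return out
```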
Operator experience
In the operations console, manual-review items render the agent's recommendations as three cards with a one-click "apply" button. Each click triggers the human-gated action:
- "Apply rec 1" → operator confirms and fires POST /manual-review/:urn/rebook.
- "Relax policy" → operator confirms scope (this case only, vs. permanent policy change).
- "Split group" → operator confirms reason.
Nothing happens without that click.
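The human gate can be sketched as a dispatcher that refuses to act without the operator's explicit confirmation. Only the /rebook route appears above; the /relax and /split paths, and `apply_recommendation` itself, are hypothetical names for illustration:

```python
def apply_recommendation(item_urn, rec, operator_confirmed=False, scope=None):
    """Returns the request the console would fire on click.
    Nothing is dispatched unless operator_confirmed is True."""
    if not operator_confirmed:
        raise PermissionError("human confirmation required")
    if rec["action"] == "rebook":
        return ("POST", f"/manual-review/{item_urn}/rebook",
                {"hotelUrn": rec["hotelUrn"]})
    if rec["action"] == "relaxPolicy":
        # Operator must choose: this case only, or a permanent policy change.
        if scope not in {"case", "permanent"}:
            raise ValueError("operator must confirm relaxation scope")
        # Hypothetical route; only /rebook is documented above.
        return ("POST", f"/manual-review/{item_urn}/relax",
                {"scope": scope, **rec["proposed"]})
    if rec["action"] == "splitGroup":
        # Hypothetical route; only /rebook is documented above.
        return ("POST", f"/manual-review/{item_urn}/split", {"into": rec["into"]})
    raise ValueError(f"unknown action {rec['action']}")
```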
Prompt shape
System prompt (cached):
- Nexa policy schema
- Tool definitions
- Hard constraints ("never recommend actions outside the tool allow-list", "always cite attempted hotels", "always rank recommendations by viability, not alphabetical")
User turn (per case):
- Structured case summary (produced by summarizeCase)
- Allocation wave JSON
- Booking attempts with provider errors
The agent then iterates with tool calls. Typical runs take 2–5 steps.
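The iterate-with-tool-calls loop, bounded by the max-step cap from "When it runs", might look like this sketch. `agent_loop` and the tuple protocol for model steps are illustrative assumptions:

```python
def agent_loop(model_step, registry_call, max_steps=10):
    """model_step(history) returns either ("tool", name, args) to request
    a tool call, or ("final", output) to finish. The step cap bounds the
    run even if the model keeps requesting tools."""
    history = []
    for step in range(1, max_steps + 1):
        kind, *rest = model_step(history)
        if kind == "final":
            output, = rest
            output["steps"] = step
            return output
        name, args = rest
        # Tool results are appended so the next model turn can see them.
        history.append((name, registry_call(name, **args)))
    return {"recommendations": [], "state": "MAX_STEPS_EXCEEDED", "steps": max_steps}
```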
Why this design
- Strong structured-output reliability — recommendations validate against a schema before they reach the operator.
- Tool-use behaviour is deterministic (the tool registry is built from a static allow-list at process start, not from a prompt-side description).
- Data-handling contracts aligned with airline compliance needs.
The model used is Nexa's; tenants do not manage LLM API keys or quotas.
Cost and latency
- Typical case: ~2200 tokens of context + ~400 tokens of output.
- P50 latency: 2–4 seconds. P95 under 10 seconds.
- Cost is metered as part of the platform subscription — there is no per-token bill from the customer's side.
- Timeouts don't block operators — when the agent misses the deadline, the review item shows a "no recommendations" state and the operator works from the provider error messages.
Drift and eval
A shadow-eval harness replays past manual-review items monthly against the current agent and compares:
- Whether recommendations would have worked (simulated against the frozen case)
- Acceptance rate (which agent rec the human ultimately picked)
- Latency and cost trends
Results feed prompt tweaks and model-version pinning.
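The acceptance-rate comparison could be computed along these lines; `acceptance_rate` and the replay tuple shape are assumptions, not the harness's actual interface:

```python
def acceptance_rate(replays):
    """replays: list of (picked_rank_or_None, n_recommendations) pairs
    from past manual-review items. Returns the fraction of items with
    recommendations where the operator applied one of them."""
    eligible = [picked for picked, n_recs in replays if n_recs > 0]
    if not eligible:
        return 0.0
    return sum(1 for picked in eligible if picked is not None) / len(eligible)
```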
Disabling the agent
Tenant admins disable the agent in the operations console under Settings → AI → Exception agent. The review item is still created, operators still see the case context and attempted hotels — they just don't get the ranked recommendations. The deterministic pipeline is unaffected.