Shipping the flight-disruption predictor as a Nexa Trained Model on Google
The flight-disruption predictor is the model that lets Nexa start sourcing hotel inventory before an airline officially declares a disruption. It runs as a Nexa Trained Model on Google — Nexa-built, Nexa-trained, Nexa-served — with no third-party AI service for the customer to provision. This post covers how we approached it.
Why a custom model
The canonical "predict flight cancellation" demo runs an off-the-shelf tabular AutoML model over a flight on-time-performance CSV. That's a fine starting point, but the real signal isn't in the flight data — it's in the features around the flight: storm pressure, runway NOTAMs, crew labor state, airport traffic density, recent cancel rate, seasonality.
We built a dual-head model — one head for cancel probability, one for delay minutes — because the two targets share features heavily but need different loss functions (log loss for the classification head, squared error for the regression head). Training both heads in a single managed job wins on training simplicity and inference latency.
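The dual-head idea can be sketched in a few lines of numpy. Everything below is illustrative, not Nexa's actual architecture: the data is synthetic, the shared layer is kept linear for brevity (a real model would add a nonlinearity), and the delay target is normalized rather than raw minutes. The point is the structure — one shared representation, two heads, two losses, gradients from both summed into the shared weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 8 shared features per flight (mirroring the
# serving contract), a binary cancel label, a normalized delay target.
X = rng.random((256, 8))
y_cancel = (X[:, 0] + X[:, 4] > 1.0).astype(float)
y_delay = 0.6 * X[:, 0] + 0.2 * X[:, 2] + rng.normal(0, 0.05, 256)

# One shared layer feeding two heads: logistic (cancel) and linear (delay).
w_shared = rng.normal(0, 0.1, (8, 4))
w_cancel = rng.normal(0, 0.1, 4)
w_delay = rng.normal(0, 0.1, 4)
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward():
    H = X @ w_shared                     # shared representation
    return H, sigmoid(H @ w_cancel), H @ w_delay

_, p0, d0 = forward()
mse_before = np.mean((d0 - y_delay) ** 2)

for _ in range(2000):
    H, p, d = forward()
    g_p = (p - y_cancel) / len(X)        # log-loss gradient, logistic head
    g_d = 2.0 * (d - y_delay) / len(X)   # squared-error gradient, delay head
    g_H = np.outer(g_p, w_cancel) + np.outer(g_d, w_delay)
    w_cancel -= lr * (H.T @ g_p)
    w_delay -= lr * (H.T @ g_d)
    w_shared -= lr * (X.T @ g_H)         # both losses update the shared layer

_, p, d = forward()
mse_after = np.mean((d - y_delay) ** 2)
```

Because both losses backpropagate into the same shared weights, the heads regularize each other — which is also what makes a single managed training job natural.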
A clean serving contract
The Nexa AI Model exposes a small, versioned contract internally:
POST /predict
{
"instances": [
{
"airportIata": "MAD",
"destinationIata": "JFK",
"departureScheduledAt": "2026-04-22T16:00:00Z",
"values": {
"weatherPressure": 0.42,
"laborPressure": 0.10,
"trafficPressure": 0.25,
"hazardPressure": 0.05,
"flightOpsPressure": 0.37,
"destinationPressure": 0.28,
"recentCancelRate": 0.12,
"seasonalityWeight": 0.6
}
}
]
}
Response:
{
"predictions": [
{ "cancelProbability": 0.072, "predictedDelayMinutes": 23, "confidenceScore": 0.78 }
]
}
Clean, minimal, and versioned — the model carries a schema version so we can evolve feature shapes without breaking callers.
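A caller sketch, using only the request and response shapes above. The endpoint URL is a made-up placeholder (the real service is internal to Nexa), and the helper names are ours, not part of the contract:

```python
import json
import urllib.request

# Placeholder URL — the real /predict endpoint is internal to Nexa.
PREDICT_URL = "https://models.internal.nexa.example/predict"

def build_request(airport, destination, departs_at, values):
    """Assemble one instance in the shape the contract expects."""
    return {
        "instances": [
            {
                "airportIata": airport,
                "destinationIata": destination,
                "departureScheduledAt": departs_at,
                "values": values,
            }
        ]
    }

def predict(payload, timeout=2.0):
    """POST the payload and return the first prediction dict."""
    req = urllib.request.Request(
        PREDICT_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["predictions"][0]
```

The short timeout is deliberate: a slow answer is treated the same as no answer, which hands control to the deterministic fallback described next.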
The deterministic fallback
The Nexa API has a deterministic baseline it falls back to if the trained model is unreachable. This matters because disruption events don't wait for your inference service to be healthy. The baseline is a simple linear combination over the same features — accurate enough to be useful, fast enough to always answer.
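The shape of such a baseline is easy to show. The weights below are invented for illustration — the real coefficients are Nexa's — but the mechanics are the same: a fixed weighted sum over the same feature dict the trained model receives, clamped to [0, 1], with no external dependency that can be down:

```python
# Illustrative weights only — not Nexa's actual coefficients.
BASELINE_WEIGHTS = {
    "weatherPressure": 0.30,
    "laborPressure": 0.15,
    "trafficPressure": 0.10,
    "hazardPressure": 0.15,
    "flightOpsPressure": 0.15,
    "destinationPressure": 0.05,
    "recentCancelRate": 0.20,
    "seasonalityWeight": 0.05,
}

def baseline_cancel_probability(values: dict) -> float:
    """Fixed linear blend of the features, clamped to [0, 1].

    Missing features default to 0.0, so the baseline always answers.
    """
    score = sum(w * values.get(k, 0.0) for k, w in BASELINE_WEIGHTS.items())
    return min(1.0, max(0.0, score))
```

Fed the example instance from the contract above, this sketch returns a probability around 0.30 — plausible-looking numbers fall out of even a crude blend, which is exactly why it is "accurate enough to be useful."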
Does it actually help?
On our launch partner's history, pre-warming allocation when cancelProbability > 0.6 with confidenceScore >= 0.7 shaves ~8 minutes off the mean "event to voucher" time. The inventory cache is hot, contract rooms are soft-held, and the airport operator has been paged before the airline's disruption system is done with its own state transitions.
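The gate itself is two comparisons. A minimal sketch, using the thresholds from the launch-partner analysis and the prediction fields from the contract above (the function name is ours):

```python
def should_prewarm(prediction: dict) -> bool:
    """Pre-warm hotel allocation only when the model predicts a likely
    cancellation AND is confident enough in that prediction."""
    return (
        prediction["cancelProbability"] > 0.6
        and prediction["confidenceScore"] >= 0.7
    )
```

Requiring both thresholds keeps the false-positive cost down: a high cancel probability with low confidence does not soft-hold contract rooms.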
Useful number: ~8 minutes × ~200 affected passengers per disruption, multiplied by the daily disruption count, adds up to a meaningful reduction in both passenger frustration and overtime cost for the night shift.
What's next
- Weekly incremental retraining (already live) with a quarterly full retrain + hyperparameter sweep.
- Feature contract unification so the same features used at training are guaranteed at serving.
- Region-specific sub-models for airports with unusual patterns (high-altitude, curfewed, hub-and-spoke).
For the full architectural deep-dive — multi-agent signal layer, the 14-feature snapshot, the deterministic baseline that's currently outperforming the trained head, and the runtime/trainer feature drift we're working off — see Inside the Flight Predictor.