Methodology — FlySafe

What follows is the framework behind every FlySafe score — the signal classes we ingest, the principles that shape each output, and the coverage we stand behind. The methodology itself — how those signals are weighted, how thresholds adapt per FIR, and how the model fuses and calibrates them — is proprietary and continuously refined. We publish the foundation in full; the recipe stays closed by design.

Output

For each Flight Information Region (FIR) in our active set, we produce a numerical risk index (0–100), a discrete risk level (low / moderate / high / critical), a per-factor breakdown, and a confidence tier reflecting our observational coverage of that FIR. Multi-horizon outputs cover 24h, 7d, and 30d windows. Threshold crossings are emitted as webhooks on supported plans.
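As a rough illustration of the output described above, a consumer might model one record per FIR and horizon along these lines. This is a sketch only: the field names, the band cut-offs (25 / 50 / 75), and the `crossed_threshold` helper are assumptions made for the example, not the published schema or the real thresholds.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

LEVELS = ("low", "moderate", "high", "critical")

# Hypothetical cut-offs; the real mapping from index to level is not published.
ASSUMED_CUTOFFS = (25, 50, 75)


def level_for(index: float) -> str:
    """Map a 0-100 index to a discrete level using the assumed cut-offs."""
    if not 0 <= index <= 100:
        raise ValueError("risk index must lie in 0-100")
    return LEVELS[sum(index >= c for c in ASSUMED_CUTOFFS)]


@dataclass
class FirScore:
    fir: str                 # FIR identifier
    horizon: str             # "24h", "7d" or "30d"
    risk_index: float        # 0-100
    confidence_tier: str     # label reflecting observational coverage
    factors: Dict[str, float] = field(default_factory=dict)  # per-factor breakdown

    @property
    def risk_level(self) -> str:
        return level_for(self.risk_index)


def crossed_threshold(previous: FirScore, current: FirScore) -> Optional[str]:
    """Return the new level if the discrete level changed, else None."""
    if previous.risk_level != current.risk_level:
        return current.risk_level
    return None
```

A webhook receiver built on this shape would compare consecutive records for the same FIR and horizon and act only when `crossed_threshold` returns a level.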

Signal stack

Inputs are organised into seven signal classes. Each class admits multiple primary sources for redundancy and is cross-validated where mandates overlap.

01 Regulatory bulletins. Conflict-zone advisories, SFARs, ICAO state letters, prohibitory NOTAMs from national civil aviation authorities.
02 Aircraft telemetry. ADS-B state vectors, position-integrity reports (NIC / NACp), traffic-density anomalies.
03 GNSS interference. Position-integrity degradation aggregated to airspace cells, cross-referenced against pilot reports and operator advisories.
04 Conflict & geopolitical events. Curated event streams with location, severity, and category attribution.
05 Natural hazards. Tropical cyclone tracks, volcanic ash advisories, wildfire detection, seismic activity.
06 Operator signals. Public airline routing changes, suspensions, advisory disclosures, ops-group bulletins.
07 Market signals. War-risk premium movement, coverage withdrawals, underwriter advisories where publicly disclosed.

Each class has its own ingestion cadence, ranging from minutes (telemetry, regulatory feeds) to weekly (market signals). Overlapping sources are treated as a feature, not as duplication.

Design principles

A handful of opinionated choices shape every output. These are not implementation details; they are positions we have committed to.

Cause separation over aggregate scoring

A FIR closed by aerial strike, a FIR ground-stopped by a typhoon, and a FIR struck by an ATC labour dispute carry materially different operational implications. Most public indices conflate these. We do not.
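To make the distinction concrete, here is a minimal sketch in which two FIRs share the same aggregate number but differ in cause. The factor names and values are invented, the sketch assumes factors are additive (which is not stated anywhere in this document), and `dominant_cause` is a hypothetical helper rather than part of any FlySafe API.

```python
from typing import Dict


def dominant_cause(factors: Dict[str, float]) -> str:
    """Return the factor contributing most to the score."""
    if not factors:
        raise ValueError("empty factor breakdown")
    return max(factors, key=factors.get)


# Invented breakdowns: both sum to 80, yet they call for different responses.
fir_a = {"conflict": 70.0, "natural_hazard": 5.0, "atc_action": 5.0}
fir_b = {"conflict": 5.0, "natural_hazard": 70.0, "atc_action": 5.0}

print(sum(fir_a.values()) == sum(fir_b.values()))    # True
print(dominant_cause(fir_a), dominant_cause(fir_b))  # conflict natural_hazard
```

An aggregate-only index would report these two cases identically; keeping the breakdown lets a consumer route them to different playbooks.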

Coverage-honest scoring

Every FIR carries a coverage tier that reflects how observable it is via independent telemetry. A quiet ADS-B feed in an oceanic FIR is not the same evidence as a quiet feed over a major hub. Our confidence label propagates this distinction to the API consumer.
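On the consumer side, one way to honour that distinction is to avoid reading a low score as evidence of calm when little is observable. The sketch below is an assumption about how a client might do this; the tier name "low" and the `interpret` function are illustrative, not part of the API contract.

```python
def interpret(risk_level: str, confidence_tier: str) -> str:
    """Consumer-side reading of a score; the tier names are assumed."""
    if risk_level == "low" and confidence_tier == "low":
        # A quiet feed where little can be observed is weak evidence.
        return "insufficient evidence"
    return risk_level
```

Elevated levels pass through unchanged, so a low-confidence FIR is never silently downgraded.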

Independent validation against verified closures

The system is calibrated against a curated record of documented airspace closures across multiple categories — conflict, cyclone, volcano, sanctions, ATC action, pandemic. Predictions are graded with per-region hold-out splits, not pooled accuracy that flatters Middle-East-only performance.
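The difference between pooled accuracy and per-region hold-out grading can be sketched with a toy example. Everything here is illustrative: the records are invented, the region names are placeholders, and the majority-label "model" is a stand-in chosen for brevity, not the scoring engine.

```python
from typing import Dict, Iterator, List, Tuple

Record = Tuple[str, bool]  # (region, closure_occurred); invented data


def leave_one_region_out(
    records: List[Record],
) -> Iterator[Tuple[str, List[Record], List[Record]]]:
    """Yield (held_out_region, train, test) with no region in both sets."""
    for held_out in sorted({region for region, _ in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test


def majority_label(train: List[Record]) -> bool:
    """Stand-in model: predict whichever outcome is more common in training."""
    positives = sum(label for _, label in train)
    return positives * 2 >= len(train)


def per_region_accuracy(records: List[Record]) -> Dict[str, float]:
    """Grade each region using a model fitted without that region."""
    scores = {}
    for region, train, test in leave_one_region_out(records):
        guess = majority_label(train)
        scores[region] = sum(label == guess for _, label in test) / len(test)
    return scores


def pooled_accuracy(records: List[Record]) -> float:
    """Fit and grade on the same pooled records (the flattering number)."""
    guess = majority_label(records)
    return sum(label == guess for _, label in records) / len(records)


records = (
    [("region_a", True)] * 9 + [("region_a", False)] + [("region_b", False)] * 2
)
print(pooled_accuracy(records))      # 0.75
print(per_region_accuracy(records))  # {'region_a': 0.1, 'region_b': 0.0}
```

In this toy data the pooled figure looks respectable only because one well-represented region dominates the record; grading each region on a model that never saw it exposes that nothing transfers.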

Public-first by design

The foundation is data publicly available from cited sources — this keeps outputs reproducible and avoids the policy exposure that comes with classified inputs. Where partner-contributed datasets meet our quality and access-terms standards, we incorporate them. We do not ingest classified intelligence, restricted military communications, or non-public regulatory deliberations.

Stable contract over moving model

The scoring engine evolves. The API contract does not. Output shapes are versioned (v1 / v3) and breaking changes follow a public deprecation policy. Internal model versions are documented in the roadmap.

Coverage

· 428: FIRs in active inventory
· 300+: FIRs continuously scored
· Global: ICAO ARTCC coverage (23 of 24 regions)
· Multi-source: 15+ feeds across 7 signal classes

Coverage expands by partner request and as observational density permits. Single-source FIRs are flagged at lower confidence rather than scored aggressively.

What we do not claim

× We are not a certified aviation service provider. Outputs are computational summaries of public signals, not advisory products under any regulatory framework.
× We do not replace operational documents. Crew NOTAM briefings, SIGMETs, and AIPs remain authoritative. Our role is integration, not substitution.
× We do not predict zero-history events. For black-swan onsets, our detection latency equals that of the fastest underlying signal we observe. We model recurrence and escalation, not novelty.
× We do not disclose the model. Weights, calibration, and feature engineering are proprietary. Outputs are auditable; internals are not.

Known limitations

· Source latency. Regulatory bulletins lag the underlying event by hours; market signals by days. Our latency claim is per-class, not global.
· Observation gaps. ADS-B receiver density is uneven; oceanic and certain African / Pacific FIRs have lower confidence by design.
· Historical depth. Calibration windows for some FIRs are short. We surface this in the confidence label rather than smoothing it away.
· Cause attribution. Auto-tagging of closure cause uses LLM-assisted extraction over curated source text. We grade extraction quality continuously and prefer a tagged-but-uncertain closure to an untagged one.
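The source-latency point can be read as a simple rule: an event surfaces no sooner than the fastest signal class that observes it. The sketch below assumes placeholder lag figures chosen only to respect the stated ordering (regulatory in hours, market in days); they are not measured values, and `detection_latency` is a hypothetical helper.

```python
from typing import Dict, Iterable

# Placeholder lags in hours; only the rough ordering comes from the text above.
ASSUMED_LAG_HOURS: Dict[str, float] = {
    "regulatory": 6.0,   # stated only as "hours"
    "market": 72.0,      # stated only as "days"
}


def detection_latency(
    observing_classes: Iterable[str],
    lag_hours: Dict[str, float] = ASSUMED_LAG_HOURS,
) -> float:
    """Earliest an event can surface: the smallest lag among classes that see it."""
    lags = [lag_hours[c] for c in observing_classes]
    if not lags:
        raise ValueError("no signal class observes this event")
    return min(lags)
```

An event visible only to market signals inherits the slower lag, which is why latency is stated per class rather than as one global figure.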

For technical evaluators

The output surface is the verifiable artefact. Evaluation paths:

· Sandbox API access — sample the output shape on real data, no commitment.
· Public roadmap — shipped capabilities and forward direction, with engine version markers.
· Per-request audit log on supported plans — every score has a traceable input snapshot timestamp.

FlySafe provides automated computation of numerical indices from publicly available data. Indices are raw computational output and do not represent opinions, assessments, recommendations, or advice of any kind. They do not replace official NOTAMs, SIGMETs, AIPs, or communications from aviation authorities. Each operator is responsible for their own independent assessment. See Terms of Service.

Last reviewed: May 2026