Additional modules temporarily inaccessible

Incident Report for SchemeServe

Postmortem

Incident Report: Partial Frontend Outage on March 31, 2026

Issue

The automatic renewal of an internal SSL certificate failed, causing our feature flag service to go down. As a result, newer interfaces—QSE and Digit—did not load correctly when the frontend attempted to retrieve feature flags from the service.

Summary

On March 31, 2026, a subset of users had trouble loading the new frontends (QSE and Digit). The failure occurred when the frontend requested feature flags from the feature flag service. Only the frontend was affected; no other services or processes were impacted. We know how disruptive this was and sincerely regret the inconvenience.

Timeline

  • ~12:16: Internal SSL certificate failed to renew automatically.
  • ~15:00: First reported instance of QSE failing to load.
  • 15:11: Root cause identified.
  • 15:46: Service restored.

Impact

  • Affected: New frontend screens—QSE, Digit, Translations, Email Logs, Machine Keys.
  • Duration: Approximately 3.5 hours.

Root Cause Analysis

The internal SSL certificate between Cloudflare and our feature flag service failed to automatically renew as expected. This caused the service to become unreachable, resulting in the frontend either displaying a “white screen of death” or reverting to older UI versions when making network requests to the feature flag API service.

Resolution Steps

  • Removed the automated renewal process.
  • Created a long-lived SSL certificate for this service.

Preventative Actions

  • Implemented longer-lived SSL certificates:
    • Eliminates automated renewal dependence, reducing the risk of unexpected outages.
  • Enhanced alerting on the uptime state of the feature flag service:
    • Ensures faster detection and response to potential downtime.
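
The alerting described above is on uptime. Because a long-lived certificate still expires eventually, one complementary check is to alert on the days remaining before expiry. The following is a minimal sketch of that idea, not a description of what was deployed: the function names, the 14-day threshold, and the use of a direct TLS connection to the service are all assumptions made for illustration.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the days left before a certificate's notAfter timestamp.

    `not_after` uses the format returned by ssl.SSLSocket.getpeercert(),
    for example "Jan  1 00:00:00 2026 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def should_alert(not_after: str, threshold_days: float = 14,
                 now: Optional[float] = None) -> bool:
    """True when the certificate expires within `threshold_days` (assumed value)."""
    return days_until_expiry(not_after, now) < threshold_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field from the certificate presented by host:port.

    If the certificate has already expired, the handshake raises
    ssl.SSLCertVerificationError, which a monitor should also treat as an alert.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `should_alert(fetch_not_after(host))` with the hostname of the service and page the on-call engineer when it returns `True` or when the connection itself fails.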

Apology

We sincerely apologize for any disruption this may have caused your business. We are deeply committed to preventing similar incidents in the future. Thank you for your understanding and patience.

Posted Apr 01, 2026 - 09:49 BST

Resolved

This incident has been resolved.
Posted Mar 31, 2026 - 15:48 BST

Investigating

Some of our additional modules are experiencing issues and may be inaccessible. We are investigating this as our highest priority.
Posted Mar 31, 2026 - 15:10 BST
This incident affected: 🎩 SchemeServe.