Chapter 17
The White Tree
Silent Failure & Loss of Visibility
The White Tree didn't block attackers. It watched.
Its destruction mattered because no one noticed—until it was too late.
In modern systems, this is one of the most dangerous failure modes:
security controls that stop working without alerting you.
Examples in real systems:
- runtime security agents that crash or get disabled,
- eBPF-based tools that stop loading after a kernel update,
- audit logs redirected, truncated, or silently dropped,
- metrics pipelines misconfigured or rate-limited,
- "green dashboards" that are green because data stopped arriving.
This can also happen in CI pipelines, infrastructure-as-code scans, vulnerability scanners, and more.
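The "green dashboard" problem comes down to treating missing data as good news. A minimal sketch of the alternative follows, assuming a hypothetical `Sample` record and a hypothetical `dashboard_status` function (neither comes from any particular monitoring product). It reports a separate `NO_DATA` state whenever the latest sample is missing or older than a freshness window:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """Hypothetical record: the latest data point reported by a security control."""
    timestamp: float   # seconds since epoch when the sample was produced
    alert_count: int   # number of alerts the control raised in that sample


def dashboard_status(latest: Optional[Sample], now: float,
                     max_age_s: float = 300.0) -> str:
    """Return GREEN, RED, or NO_DATA for a single control.

    Absence of data is never reported as GREEN: a missing or stale
    sample yields NO_DATA so that silence is visible on the dashboard.
    """
    if latest is None or now - latest.timestamp > max_age_s:
        return "NO_DATA"
    return "RED" if latest.alert_count > 0 else "GREEN"
```

With this shape, a control that stopped reporting ten minutes ago shows up as `NO_DATA` instead of staying green. The 300-second window is an arbitrary illustrative default; a real value would depend on how often the control is expected to report.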
To mitigate silent failures:
- Health Checks & Heartbeats. Implement regular health checks for security controls. Use heartbeats to verify that agents and services are running as expected.
- Alerting & Monitoring. Set up alerts for anomalies in security control behavior, such as unexpected shutdowns or performance degradation.
- Regular Audits & Testing. Periodically audit and test security controls to verify their effectiveness and functionality.
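The heartbeat idea above can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: the `HeartbeatMonitor` class, its method names, and the agent names in the usage note are all hypothetical. The key point is that the monitor starts from a list of controls that are expected to report, so an agent that never checks in is flagged just like one that went quiet.

```python
import time
from typing import Dict, Iterable, List, Optional


class HeartbeatMonitor:
    """Hypothetical tracker for the last heartbeat of each expected control."""

    def __init__(self, expected: Iterable[str], timeout_s: float = 60.0) -> None:
        self.timeout_s = timeout_s
        # None means "expected, but never seen": this counts as stale.
        self.last_seen: Dict[str, Optional[float]] = {name: None for name in expected}

    def beat(self, name: str, now: Optional[float] = None) -> None:
        """Record a heartbeat from a control (uses the wall clock by default)."""
        self.last_seen[name] = time.time() if now is None else now

    def stale(self, now: Optional[float] = None) -> List[str]:
        """Return the sorted names of controls that are silent for too long."""
        current = time.time() if now is None else now
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if seen is None or current - seen > self.timeout_s
        )
```

A usage sketch: create `HeartbeatMonitor(["runtime-agent", "audit-forwarder"], timeout_s=60)`, call `beat("runtime-agent")` whenever that agent checks in, and alert on any non-empty result from `stale()`. The check itself should run somewhere independent of the controls it watches, otherwise the watcher can fail silently along with them.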
Exercise
- Which different types of monitoring and alerting do you have in place for your security controls? Are there any gaps where a failure could go unnoticed?
- Are your alerts too "relaxed" (e.g., only alerting on critical failures) or too "noisy" (e.g., alerting on every minor issue)? Find the right balance to ensure you are notified of important issues without overwhelming your team.
