Datadog / docker

6 posts

datadog

2023-03-08 incident: A deep dive into the platform-level impact | Datadog

The March 2023 Datadog outage was a simultaneous, global failure across multiple cloud providers and regions, caused by an unexpected interaction between a systemd security patch and Ubuntu 22.04’s default networking behavior. While Datadog typically employs rigorous, staged rollouts for infrastructure changes, automated OS-level security updates bypassed these controls. The incident highlights the hidden risks in system-level defaults and the potential for "unattended upgrades" to create synchronized failures across supposedly isolated environments.

## The systemd-networkd Routing Change

* In December 2020, systemd version 248 introduced a change where `systemd-networkd` flushes all IP routing rules it does not recognize upon startup.
* Version 249 introduced the `ManageForeignRoutingPolicyRules` setting, which defaults to "yes," meaning `systemd-networkd` removes any rules not explicitly defined in systemd configuration files.
* These changes were backported to earlier versions (v247 and v248) but were notably absent from v245, the version used in Ubuntu 20.04.

## Dormant Risks in the Ubuntu 22.04 Migration

* Datadog began migrating its fleet from Ubuntu 20.04 to 22.04 in late 2022, eventually reaching 90% coverage across its infrastructure.
* Ubuntu 22.04 uses systemd v249, meaning the majority of the fleet was susceptible to the routing rule flushing behavior.
* The risk remained dormant during the initial rollout because `systemd-networkd` typically starts only during the initial boot sequence, before any complex routing rules have been established.

## The Trigger: Unattended Upgrades and the CVE Patch

* On March 7, 2023, a security patch for a systemd CVE was released to the Ubuntu security repositories.
* Datadog’s fleet used the Ubuntu default configuration for `unattended-upgrades`, which automatically installs security-labeled patches once a day, typically between 06:00 and 07:00 UTC.
* Installing the patch forced a restart of the `systemd-networkd` service on active, running nodes.
* Upon restarting, the service identified existing IP routing rules (crucial for container networking) as "foreign" and deleted them, effectively severing network connectivity for the nodes.

## Failure of Regional Isolation

* Because the security patch was released globally and the automated upgrade window was synchronized across regions, the failure occurred nearly simultaneously worldwide.
* This automation bypassed Datadog’s standard practice of "baking" changes in staging and experimental clusters for weeks before proceeding to production.
* Nodes on the older Ubuntu 20.04 (systemd v245) were unaffected by the patch, as that version of systemd does not flush IP rules upon a service restart.

To mitigate similar risks, infrastructure teams should consider explicitly disabling the management of foreign routing rules in the systemd-networkd configuration when using third-party networking plugins. Furthermore, while automated security patching is a best practice, organizations must balance the speed of patching against the need for controlled, staged rollouts to prevent synchronized, fleet-wide failures.
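As a rough illustration of the mitigation mentioned above, the following Python sketch checks whether a `networkd.conf` file leaves `ManageForeignRoutingPolicyRules` at its default. The function name `manages_foreign_rules` is hypothetical, and the sketch assumes the setting lives in the `[Network]` section of `networkd.conf`; it only inspects text passed to it and does not account for drop-in directories or how systemd merges multiple files.

```python
import configparser

# Spellings treated as "false" here; an assumption modeled on systemd's
# boolean parsing, not an exhaustive reimplementation of it.
_FALSE = {"0", "no", "n", "false", "f", "off"}


def manages_foreign_rules(conf_text: str) -> bool:
    """Return True if networkd would manage (and may flush) foreign routing rules.

    conf_text is the content of a networkd.conf-style file. An unset or
    unrecognized value is treated as the default, "yes".
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive as written
    parser.read_string(conf_text)
    value = parser.get(
        "Network", "ManageForeignRoutingPolicyRules", fallback="yes"
    )
    return value.strip().lower() not in _FALSE


if __name__ == "__main__":
    hardened = "[Network]\nManageForeignRoutingPolicyRules=no\n"
    print(manages_foreign_rules(hardened))
    print(manages_foreign_rules(""))
```

A check like this could run in a node image build or a fleet audit, so that hosts relying on third-party networking plugins are flagged before a service restart can remove their rules.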

datadog

Using the Dirty Pipe vulnerability to break out from containers | Datadog

The Dirty Pipe vulnerability (CVE-2022-0847) is a critical Linux kernel flaw that allows unprivileged processes to write data to any file they can read, effectively bypassing standard write permissions. This primitive is particularly dangerous in containerized environments like Kubernetes, where it can be leveraged to overwrite the host’s container runtime binary. By exploiting how the kernel manages the page cache, an attacker can achieve a full container breakout and gain administrative privileges on the underlying host.

## Container Runtimes and the OCI Specification

* Kubernetes uses the Container Runtime Interface (CRI) to manage containers via high-level runtimes like containerd or CRI-O.
* These high-level runtimes rely on low-level Open Container Initiative (OCI) runtimes, most commonly runC, to handle the heavy lifting of namespaces and control groups.
* Isolation is achieved by runC setting up a restricted environment before executing the user-supplied entrypoint via the `execve` system call.

## Evolution of runC Vulnerabilities

* A historical vulnerability, CVE-2019-5736, previously allowed escapes by overwriting the host’s runC binary through the `/proc/self/exe` file descriptor.
* To mitigate this, runC was updated to either clone the binary before execution or mount the host's runC binary as read-only inside the container.
* While the read-only mount improved performance through kernel page cache sharing, it created a target for the Dirty Pipe vulnerability, which specifically targets the kernel page cache.

## The Dirty Pipe Exploitation Primitive

* Dirty Pipe allows an attacker to overwrite any file they can read, including read-only files, by manipulating the kernel's internal pipe-buffer structures.
* The exploit targets the page cache, meaning the overwrite is non-persistent and resides only in memory; the original file on disk remains unchanged.
* In a container escape scenario, the attacker waits for a runC process to start (triggered by actions like `kubectl exec`) and targets the file descriptor at `/proc/<runC-pid>/exe`.

## Proof-of-Concept Escape Walkthrough

* The attack begins with a standard, unprivileged pod running a malicious script that monitors the system for new runC processes.
* Once an administrator issues a `kubectl exec` command, the script identifies the runC PID and applies the Dirty Pipe exploit to the associated executable.
* The exploit overwrites the runC binary in the kernel page cache with a malicious ELF binary.
* Because the host executes this hijacked binary with root privileges to manage the container, the attacker’s malicious code (e.g., a reverse shell or administrative command) runs with full host-level authority.

To protect against this attack vector, it is essential to patch the Linux kernel to a version that includes the fix for CVE-2022-0847 and ensure that container nodes are running updated distributions.
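On the defensive side, a node audit might start with a coarse kernel version check. The Python sketch below is a heuristic only: the function name `may_be_vulnerable` is hypothetical, and the version boundaries (bug introduced in 5.8; fixed upstream in 5.10.102, 5.15.25, and 5.16.11) are taken from the public CVE-2022-0847 disclosure rather than from the summary above. Distribution kernels often backport fixes without changing the upstream version number, so a `True` result means "check the vendor advisory", not "confirmed vulnerable".

```python
import re

# Upstream stable series and the first patch release carrying the fix,
# per the public CVE-2022-0847 disclosure (an assumption of this sketch).
FIXED_PATCH = {(5, 10): 102, (5, 15): 25, (5, 16): 11}


def may_be_vulnerable(release: str) -> bool:
    """Heuristically flag an upstream-style kernel release string.

    Accepts strings such as "5.13.0-30-generic" (the format returned by
    platform.release()). Ignores distribution backports entirely.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)

    if (major, minor) < (5, 8):
        return False  # predates the vulnerable code
    if (major, minor) >= (5, 17):
        return False  # released with the fix included
    fixed = FIXED_PATCH.get((major, minor))
    if fixed is None:
        return True  # affected series with no upstream fix release listed
    return patch < fixed


if __name__ == "__main__":
    import platform

    print(platform.release(), may_be_vulnerable(platform.release()))
```

Because the check ignores vendor backports, it is better suited to narrowing down which nodes need a closer look than to serving as a final verdict.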