# Keeping Secrets Out of Logs

Posted on Aug 2, 2024

This article is a blog transcript of a talk at LocoMocoSec 2024 discussing how to prevent sensitive data ("secrets") from leaking into logs. It asserts there is no silver bullet but proposes multiple "lead bullets": practical, layered techniques to mitigate this problem effectively.

---

## The Problem

Secrets leaking into logs is a problem that is both annoying and difficult. It's frustrating because logs bypass many standard security measures and are more broadly accessible than databases. Secrets can range from internal credentials (API keys) to PII or customer passwords, each with different impact levels.

Even companies with mature security practices (e.g., Twitter, Google, Facebook) have suffered from secret leaks in logs. Logs often store plaintext secrets, which is as damaging as storing passwords in plaintext. The problem affects organizations of all sizes and is complex because secrets leak in unpredictable ways.

---

## Common Causes of Secrets Leaking into Logs

- Direct Logging: Accidentally logging secrets, e.g., leaving debug statements with sensitive info.
- Kitchen Sinks: Logging entire objects/structures (like error objects or API responses) that contain secrets hidden inside.
- Configuration Changes: Increasing log verbosity (e.g., to DEBUG) so that sensitive data is emitted unexpectedly.
- Embedded Secrets: Secrets embedded in data formats, like magic login links inside URLs, which get logged automatically in HTTP logs.
- Telemetry: Error monitoring and analytics systems like Sentry may capture and log sensitive runtime state, including secrets.
- User Input: Users may enter secrets in unexpected fields (e.g., passwords in username fields), causing secrets to appear in logs.

The variety of these proximate causes shows that a single fix won't suffice.

---

## Lead Bullets: Practical Fixes to Keep Secrets Out of Logs

### 📐 Data Architecture

Centralize all logging through a single controlled pipeline. Eliminate stray logs that write directly to the filesystem or stdout.
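
One way to picture a single controlled pipeline is the minimal TypeScript sketch below. It is an illustration under assumptions, not the talk's implementation: the names `createLogger`, `LogProcessor`, `LogSink`, and `stripPassword` are hypothetical, and the in-memory sink stands in for whatever destination a real system would use.

```typescript
// Hypothetical sketch: every log entry flows through one function,
// passes through an ordered list of processors, and leaves via one sink.
type LogEntry = {
  level: "debug" | "info" | "warn" | "error";
  message: string;
  fields: Record<string, unknown>;
};

// A processor may transform an entry, or return null to drop it.
type LogProcessor = (entry: LogEntry) => LogEntry | null;

// The sink is the only place where entries leave the process.
type LogSink = (line: string) => void;

function createLogger(processors: LogProcessor[], sink: LogSink) {
  return function log(
    level: LogEntry["level"],
    message: string,
    fields: Record<string, unknown> = {},
  ): void {
    let current: LogEntry = { level, message, fields };
    for (const step of processors) {
      const next = step(current);
      if (next === null) return; // entry dropped by a pipeline step
      current = next;
    }
    sink(JSON.stringify(current));
  };
}

// Example processor (an assumption for illustration): remove a field
// named "password" before the entry reaches the sink.
const stripPassword: LogProcessor = (entry) => {
  const fields = { ...entry.fields };
  delete fields["password"];
  return { ...entry, fields };
};

// In-memory sink for demonstration; a real sink would write to the
// log transport the organization controls.
const lines: string[] = [];
const log = createLogger([stripPassword], (line) => lines.push(line));

log("info", "user signed in", { user: "alice", password: "hunter2" });
```

Because application code only ever calls `log`, there is one place to add or audit safeguards, instead of many call sites that each write wherever they like.
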
Control and understand data flows to reduce the attack surface.

### 🍞 Data Transformations

Apply data minimization: avoid logging secrets at all. Use redaction (e.g., [redacted]), tokenization, hashing, encryption, or masking to protect secrets in logs. Each method has pros and cons; redaction and minimization are safest.

### 🪨 Domain Primitives

Instead of treating secrets as plain strings, create special types or classes for secrets. These domain primitives encapsulate sensitive data with security guarantees. Example: branded types in TypeScript that flag, at compile time, code that tries to log a secret. At runtime, override serialization (e.g., toString) to output [redacted].

### 🎁 Read-Once Objects

Wrap secrets in objects that allow extracting the secret exactly once. After the first read, further accesses fail loudly. This prevents accidental multiple exposures or misuse.

### 🗃️ Log Formatters

Build middleware to inspect log entries. Introspect structured logs and redact or drop secret fields (e.g., tokens, PII). Insert redaction steps into the log pipeline, before logs are persisted. This works well for catching kitchen sinks, embedded secrets, and config leaks.

### 🧪 Unit Tests

Use existing tests to validate logging behavior. In test environments, escalate detection of secrets in logs to test failures. This helps catch unsafe logging before production deploys.

### 🕵️ Sensitive Data Scanners

Automate scanning logs for secrets using regex or heuristic-based scanners. These are typically used as defense in depth, since they detect leaks after logs are created. Evaluate scanner placement, detection versus redaction behavior, false positives, and the cost of scanning. Consider sampling strategies to handle log volume.

### 🤖 Log Pre-Processors

Log streams can