mirror of
https://github.com/danny-avila/LibreChat.git
synced 2026-05-13 07:46:47 +00:00
🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737)
* 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels The default console formatter discarded every structured metadata key on the winston info object — only `CONSOLE_JSON=true` preserved it. That meant failures emitted by the agents SDK (e.g. "Summarization LLM call failed") reached stdout without the provider, model, or underlying error attached, leaving users unable to diagnose the root cause. - Add `formatConsoleMeta` helper to serialize non-reserved metadata as a compact JSON trailer, with per-value string truncation and safe handling of circular references. - Append the metadata trailer to warn/error console lines; info/debug behavior is unchanged. - Relax `debugTraverse`'s debug-only gate so warn/error messages routed through the debug formatter also surface their metadata. - Add a `formatConsoleMeta` stub to the shared logger mock so existing tests keep working. * 🔐 fix: Also Redact Sensitive Patterns on Warn Console Lines The warn-level console output now includes a metadata trailer that may contain provider-returned error strings with embedded tokens or keys (e.g. `Bearer ...`, `sk-...`). Apply `redactMessage` to warn lines in addition to error, matching the new surface area. * 🔐 fix: Redact Sensitive Tokens Embedded in JSON Metadata Two gaps in the existing console redaction that became user-visible once warn/error lines started emitting structured metadata: 1. The OpenAI-key regex (`/^(sk-)[^\s]+/`) was anchored to start-of-line, so keys embedded inside JSON payloads (e.g. `{"apiKey":"sk-..."}`) were never redacted. Every console line begins with a timestamp, so the anchor effectively made this pattern dead code. 2. `formatConsoleMeta` stringified metadata values verbatim; a sensitive string value was only redacted by the whole-line regex pass, which missed the anchored `sk-` case above. Fix: - Drop the `^` anchor; add `/g` so every occurrence is redacted, not just the first. - Also exclude `"` and `'` from the token body so JSON-embedded values terminate at the closing quote rather than chewing into the next field. - Simplify `redactMessage` to apply patterns directly (dropping the `getMatchingSensitivePatterns` filter) — the filter used `.test()` which has stateful behavior on `/g` regexes and is no longer needed. - `formatConsoleMeta` now runs `redactMessage` over every string value before JSON serialization, so the metadata trailer is safe even on the warn path. - Add regression tests covering both fixes. Reviewed-by: Codex (P1 finding on PR #12737, commit68c31b6). * 🔐 fix: Redact Metadata in debugTraverse for Warn and Error Relaxing the debug-only gate in debugTraverse (in commit59371be0) routed warn/error records through the traversal path, which emits leaf string values verbatim (via truncateLongStrings only). Because DEBUG_LOGGING defaults to true, those records are also written to the rotating debug log file — which means payloads like `{ auth: 'Bearer ...' }` or `{ openaiKey: 'sk-...' }` were persisted unredacted once my earlier change took effect. Apply redactMessage to the final formatted string when the level is warn or error. Debug-level behavior is unchanged (matching prior art). Includes regression tests covering error/warn redaction and debug-level preservation. Reviewed-by: Codex (P1 finding on PR #12737, commite288f7fd). * 🔐 fix: Anchor Secret Regexes at Word Boundaries to Prevent Over-Redaction Removing the `^` anchor in commite288f7fdlet the OpenAI-key regex match anywhere in the line — including inside ordinary words like `task-runner` or `mask-value`, where `sk-` appears mid-word. Non-secret text was being rewritten to `task-[REDACTED]`, hiding real log content from operators. - Anchor every sensitive-key pattern with `\b` so matches only fire at word boundaries. - Constrain the OpenAI-key body to the documented charset (`[a-zA-Z0-9_-]+`) instead of the broader "not whitespace or quote" character class. - Add `&` to the `key=` exclusion so a query-string value stops at the next parameter separator. - Regression tests covering both the over-redaction cases (`task-runner`, `monkey=10`) and the intended redactions still firing. Reviewed-by: Codex (P2 finding on PR #12737, commitc09d293d). * 🔐 fix: Redact Before Colorize To Survive ANSI Word-Boundary Interference The console pipeline runs `redactFormat → colorize({ all: true }) → printf`. With `all: true`, winston wraps `info.message` in ANSI escapes whose trailing `m` is a word character. That means `\b(Bearer )…` placed at the start of a colorized segment can fall on a (word,word) boundary and miss — the earlier line-wise `redactMessage(line)` pass in printf suffers the same issue because it runs after colorize. Extend `redactFormat` to run for `warn` in addition to `error`, operating on the raw pre-colorize `info.message` + `Symbol.for('message')` strings. The later in-printf `redactMessage(line)` stays as a backstop, but the primary redaction now happens where the regex can actually see the text. Metadata redaction already operates on the raw info object via `formatConsoleMeta`, so it was never affected by ANSI — no change there. Includes regression tests for the new warn-level behavior and for the info/debug no-op path. Reviewed-by: Codex (P2 finding on PR #12737, commitfdb6b361). * 🧹 fix: Prefer Structured Metadata Over Consumed Splat Args in Traversal `debugTraverse` previously read `metadata[Symbol.for('splat')][0]` first and only fell back to the structured metadata object. When a caller uses printf interpolation alongside a metadata object — for example `logger.warn('failed for %s', tenant, { provider })` — winston leaves the *consumed* positional arg (`tenant`) in `SPLAT[0]` after interpolation. The formatter would then append the tenant a second time and skip the real metadata, regressing debug-file and `DEBUG_CONSOLE` output quality now that warn/error share this path. Prefer the structured metadata object (via `extractMetaObject`) and only fall back to `SPLAT[0]` when there's nothing else, so the surviving log line surfaces the actual key/value pairs regardless of call shape. Reviewed-by: Codex (P2 finding on PR #12737, commit1e43d636). * 🧹 fix: Skip Consumed Splat Primitives in Warn/Error Debug Traversal When no structured metadata is attached, winston still leaves consumed `%s` / `%d` arguments in `Symbol.for('splat')`. Previous fix preferred the structured object but still fell back to whatever sat at `SPLAT[0]` — so `logger.warn('failed for %s', tenantId)` emitted `failed for tenant-7 tenant-7` in the traversal path (debug file and `DEBUG_CONSOLE`), now regressed outside of `debug` level because warn/error share the path. Only accept the splat fallback when the value is a plain object or an array (structural data worth surfacing). Primitives there are almost certainly consumed printf args and get skipped. Regression tests cover the single-%s case and the array-as-metadata case (which still surfaces through splat). Reviewed-by: Codex (P2 finding on PR #12737, commitbccbf117). * 🧹 fix: Skip Numeric Splat Keys When Extracting Log Metadata When a caller passes a primitive as the second argument — e.g. `logger.warn('Unhandled step creation type:', step.type)` — winston / `format.splat()` can leave character-index keys (`"0"`, `"1"`, …) on the `info` object. With the warn/error metadata trailer in play, those synthetic artifacts were being surfaced as bogus metadata, producing noisy console and debug-file output. Filter out numeric-string keys in `extractMetaObject` so only real metadata fields reach the trailer. Added a regression test. Reviewed-by: Codex (P2 finding on PR #12737, commitb34628de). * 🧹 fix: Preserve Unconsumed Primitive Splat Args in Debug Traversal The previous round dropped every primitive SPLAT[0] value to avoid duplicating consumed %s args, but that removed useful context from calls like \`logger.debug('prefix:', detail)\` where the primitive was never interpolated — users lost the \`detail\` value. Refine the heuristic: skip a primitive splat value only when it already appears inside the (post-interpolation) \`info.message\`; otherwise surface it. Arrays and objects continue to surface unconditionally. Regression test covers the 'prefix:', detail case. Reviewed-by: Codex (P2 finding on PR #12737, commit6bf9548f). * 🧹 fix: Traverse Filtered Metadata, Not Raw Metadata, In debugTraverse `debugTraverse` computes `extracted = extractMetaObject(metadata)` to strip reserved keys, underscore-prefixed internals, and numeric splat artifacts — but the later \`klona(metadata)\` + \`traverse\` path still read the raw object, putting all the filtered junk back into the rendered multi-line output. Clone and traverse \`debugValue\` (the already-filtered object) instead. Regression test exercises the case where numeric splat artifacts sit alongside a real metadata field. Reviewed-by: Codex (P2 finding on PR #12737, commitc29c18e8). * 🧹 refactor: Split Warn/Error From Debug-Level Traversal in Parsers Retrofitting `debugTraverse`'s multi-line object walker to cover warn/error created a minefield of splat-interaction edge cases (numeric artifacts, consumed %s args, bogus `_`-prefix filtering, over-eager suppression of unconsumed primitives). Each fix kept introducing new corner cases. Split the two concerns instead: - Warn/error now emit a compact single-line JSON metadata trailer via `formatConsoleMeta`, then pass the full line through `redactMessage`. This mirrors what the console formatter already does, so behavior between the console and debug-file outputs stays consistent for warn/error — and none of the splat/traversal edge cases apply. - Debug level keeps its original code path verbatim (including the raw `metadata` traversal and SPLAT\[0\] fallback). No regressions from my earlier iterations. - `extractMetaObject` no longer filters underscore-prefixed keys, so legitimate fields like MongoDB `_id` still appear. Reserved winston keys and numeric splat artifacts remain filtered. Updated tests reflect the simpler contract (underscore preservation, single-line trailer expectations already covered). Reviewed-by: Codex (two P2 findings on PR #12737, commit9ea11529: `_id` regression and over-eager primitive suppression). * 🧹 fix: Preserve Scalar Metadata When One Value Is Circular `formatConsoleMeta` previously wrapped a single `JSON.stringify` in try/catch — any circular reference inside any field (e.g. an attached request/response object) caused the entire trailer to be dropped. That defeats the goal of making failures diagnosable: one malformed field would mask the provider/model/status we wanted to surface. Use a `WeakSet`-based replacer that emits `[Circular]` for repeated object visits. On the whole-object serialization failing, fall back to per-field serialization so scalar keys always land and only the offending field is replaced with \`"[Unserializable]"\`. Reviewed-by: Codex (P2 finding on PR #12737, commitd63742a5). * 🧹 fix: Address Audit Findings (JSDoc, Case-Insensitive Api-Key, Tests) Audit review identified several MINOR/NIT items on top of the codex rounds. This commit closes the actionable ones: - **JSDoc (#1, #2)**: `extractMetaObject` no longer claims to filter underscore-prefixed keys (that filter was removed intentionally for MongoDB `_id`). `debugTraverse`'s docblock now describes the three code paths (warn/error compact trailer, debug multi-line traversal, other levels). - **Case-insensitive api-key regex (#6)**: `/gi` so the Azure style `Api-Key:` / `API-KEY:` also gets redacted. Pre-existing behavior was lowercase-only. - **Consolidated redundant branch (#5)**: `consoleFormat` printf was checking `isError || isWarn` twice; merged into one block. - **Pre-compiled regex (#9)**: `NUMERIC_KEY_RE` moved to module scope. - **Test coverage (#3, #4)**: Added regression tests for - per-field serialization fallback when a value's `toJSON` throws, - sensitive strings nested inside metadata objects, - the Azure-style `Api-Key:` header.
This commit is contained in:
parent
ccf3a6c670
commit
48c3d31db3
4 changed files with 532 additions and 39 deletions
|
|
@ -56,5 +56,6 @@ jest.mock('~/config/parsers', () => {
|
|||
redactMessage: jest.fn(),
|
||||
redactFormat: jest.fn(),
|
||||
debugTraverse: jest.fn(),
|
||||
formatConsoleMeta: jest.fn(() => ''),
|
||||
};
|
||||
});
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue