A grayscale SVG can draw its background as a full-canvas path (not just a
rect), e.g. a white background path plus a black glyph. The rect-only
background check missed that, and the icon flattened to a solid currentColor
block under the CSS mask.
Tint only when the SVG resolves to a single grayscale tone. Any second tone
(a background shape drawn as a path or rect, an accent, or a second shade)
now preserves the icon's own colors, which covers full-canvas path
backgrounds without per-shape geometry parsing.
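A minimal sketch of the single-tone rule, assuming the paint colors have already been collected into a list. The helper names (`toRgb`, `isSingleGrayscaleTone`) are hypothetical, and only hex colors plus the `black`/`white` keywords are handled here; the real analysis may accept more formats.

```typescript
// Hypothetical helper: parse #rgb / #rrggbb or the keywords black/white.
function toRgb(color: string): [number, number, number] | null {
  const c = color.trim().toLowerCase();
  if (c === 'black') return [0, 0, 0];
  if (c === 'white') return [255, 255, 255];
  if (!c.startsWith('#')) return null;
  let hex = c.slice(1);
  if (/^[0-9a-f]{3}$/.test(hex)) {
    // Expand shorthand: "abc" -> "aabbcc".
    hex = hex.split('').map((ch) => ch + ch).join('');
  }
  if (!/^[0-9a-f]{6}$/.test(hex)) return null;
  const n = parseInt(hex, 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

// Tintable only when every paint is gray (r === g === b) and all share one tone.
function isSingleGrayscaleTone(colors: string[]): boolean {
  const tones = new Set<number>();
  for (const color of colors) {
    const rgb = toRgb(color);
    if (rgb === null) return false; // unknown paint: keep the icon's own colors
    const [r, g, b] = rgb;
    if (r !== g || g !== b) return false; // chromatic color
    tones.add(r);
  }
  return tones.size === 1;
}
```

Under this rule a white background plus a black glyph yields two tones, so the icon is not tinted regardless of how the background shape is drawn.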
The monochrome/tintable decision scraped SVG markup with regexes, which kept
missing edge cases (opaque backgrounds, missing or comma-separated viewBox,
stroke-width vs canvas width, embedded raster images).
Parse the SVG once with DOMParser and inspect real elements and attributes:
reject embedded <image>/<foreignObject> content, detect a full-canvas opaque
background rect, read the canvas size from the viewBox or root width/height,
and gather paint colors from attributes, inline styles, and <style> blocks.
Unparseable input is treated as not tintable. Tests cover these cases.
The viewBox regex only accepted whitespace between values, so a valid
viewBox like "0,0,24,24" failed to parse and an opaque background went
undetected, tinting the icon into a solid block. Accept commas and
whitespace as separators. Add a test for the comma-separated case.
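For illustration, a viewBox parser that accepts both separators could look like the sketch below. `parseViewBox` and its return shape are assumptions for this example, not the actual helper.

```typescript
interface ViewBox {
  minX: number;
  minY: number;
  width: number;
  height: number;
}

// Split on any run of commas and/or whitespace, then require four finite numbers.
function parseViewBox(value: string): ViewBox | null {
  const parts = value.trim().split(/[\s,]+/).map(Number);
  if (parts.length !== 4 || parts.some((n) => !Number.isFinite(n))) {
    return null;
  }
  const [minX, minY, width, height] = parts as [number, number, number, number];
  if (width <= 0 || height <= 0) return null;
  return { minX, minY, width, height };
}
```

Both `"0,0,24,24"` and `"0 0 24 24"` split into the same four values, so the opaque-background check sees the same canvas either way.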
Opaque background detection only read canvas dimensions from the viewBox, so
an SVG that declares width and height on the root element but omits the
viewBox slipped through and was tinted into a solid block.
Fall back to the root svg width and height when no viewBox is present, and
match attribute names exactly so stroke-width is not mistaken for the canvas
width. Add tests for the no-viewBox cases.
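A rough sketch of the fallback and the exact-name match, operating on the raw root tag string as the regex-based version did. `readAttr` and `canvasSize` are hypothetical names for this example.

```typescript
// Read one attribute from a tag string. The (?:^|\s) prefix requires the name
// to start after whitespace, so "stroke-width" never matches "width".
function readAttr(tag: string, name: string): string | null {
  const re = new RegExp(`(?:^|\\s)${name}\\s*=\\s*["']([^"']*)["']`);
  const match = re.exec(tag);
  return match ? match[1] ?? null : null;
}

// Prefer the viewBox; fall back to the root width/height when it is absent.
function canvasSize(rootTag: string): { width: number; height: number } | null {
  const viewBox = readAttr(rootTag, 'viewBox');
  if (viewBox !== null) {
    const parts = viewBox.trim().split(/[\s,]+/).map(Number);
    const w = parts[2];
    const h = parts[3];
    if (parts.length === 4 && w !== undefined && h !== undefined && w > 0 && h > 0) {
      return { width: w, height: h };
    }
  }
  // parseFloat tolerates unit suffixes such as "24px"; NaN fails the > 0 check.
  const width = parseFloat(readAttr(rootTag, 'width') ?? '');
  const height = parseFloat(readAttr(rootTag, 'height') ?? '');
  if (width > 0 && height > 0) return { width, height };
  return null;
}
```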
A grayscale SVG with an opaque full-canvas background (for example an
exported logo with a white background rect and a black glyph) passed the
monochrome check and was drawn through a CSS mask. Masks key off the alpha
channel, so the opaque background filled the whole area with the tint color
and the icon collapsed into a solid block.
Detect a full-canvas opaque background rect and exclude such SVGs from
tinting, rendering them with their own colors instead. Transparent
single-color and multi-shade glyphs remain tintable. Add tests for the
background cases.
A reused CustomIcon instance kept the monochrome verdict from a previous
source until its effect re-ran, so switching to a raster image or a new
multi-color SVG could briefly render it as a currentColor silhouette.
Key the resolved verdict to the current source and reset it synchronously
during render (seeding from cache when available), so a stale verdict can
never tint a different icon. Add hook tests covering the source change.
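The keying idea can be shown without React. In this sketch `Verdict` and `verdictFor` are hypothetical names; the point is only that a stored verdict is used when its source matches the one being rendered, and otherwise the cache (or a safe "not monochrome" default) wins.

```typescript
interface Verdict {
  src: string;
  monochrome: boolean;
}

// Return the verdict to render with for `src`, ignoring any verdict that was
// resolved for a different source.
function verdictFor(
  src: string,
  resolved: Verdict | null,
  cache: Map<string, boolean>,
): boolean {
  if (resolved !== null && resolved.src === src) {
    return resolved.monochrome;
  }
  // Stale or missing: seed from cache, else default to the untinted image.
  return cache.get(src) ?? false;
}
```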
Custom icons (MCP server iconPath, model spec groupIcon) were rendered as
plain <img>, so monochrome SVGs kept fixed dark colors and were nearly
invisible in dark theme.
Introduce a shared CustomIcon component that detects monochrome SVG glyphs
and tints them with currentColor so they follow the active theme, while
multi-color SVG logos and raster images keep their original colors. The
monochrome decision parses the SVG's color tokens; content is fetched once,
cached, and any failure falls back to the original image. Monochrome SVGs
render via CSS mask, never inlined, so no SVG markup reaches the DOM.
Apply across all custom-icon surfaces: MCP settings cards, the chat MCP
dropdown, stacked MCP icons, tool-call headers, and model group icons.
Also support SVG in the MCP avatar uploader: add SVG to the accepted file
types and sanitize uploaded SVGs with DOMPurify before storing them, and
make the dialog preview theme-adaptive via the same component.
Add unit tests for SVG detection, monochrome analysis, sanitization, and
CustomIcon rendering.
* 🔧 chore: Update `@librechat/agents` to v3.2.38 and bump related dependencies in package-lock.json and package.json files
* 🔧 chore: Upgrade `multer` dependency to version 2.2.0 in package-lock.json and package.json
* 🔧 chore: Upgrade `nodemailer` dependency to version 9.0.1 in package-lock.json and package.json
* 🔧 chore: Upgrade `@aws-sdk/client-bedrock-agent-runtime` and `@aws-sdk/client-bedrock-runtime` to versions 3.1071.0, update related dependencies in package-lock.json and package.json
* 🔧 chore: Upgrade `form-data` to version 4.0.6 and `hono` to version 4.12.25, update related dependencies in package-lock.json and package.json
* 🔧 chore: npm audit fix
* 🔧 chore: Remove unused Babel dependencies from package-lock.json and package.json
* 🔧 chore: Add '@mistralai/mistralai' to esModules in Jest configuration files
* feat: add `convo.pinned`
We want to be able to pin convos (so users can easily find them), thus we
added a new field to the DB schema: `pinned`.
We also added an API method for pinning a convo, with thorough tests.
It mirrors how /api/convos/archive works, only for pinning.
* feat: add 'pinned' section to conversation list
If there are any pinned conversations, they will appear above the normal
"chats" list, with a pinned icon next to them.
* feat: added pin/unpin to convo options
ConvoOptions now has a pin/unpin button which lets you change the
pin status of any given conversation.
* fix: adjust ellipsizing gradient on ConvoLink
Because it went across the whole ConvoLink, it would cover up any
children (i.e. icons) that appear after the title. However, the point
of the gradient is just to gradually make the title disappear, not
the icons.
This change places the gradient on the title only, so it achieves
the same ellipsizing effect without interfering with the display of
the child icons.
* Fixed import sorting
* 🔐 fix: Honor Admin-Panel MCP Allowlist Overrides Without Restart
MCPServersRegistry was built once at boot from getAppConfig({ baseOnly:
true }), freezing allowedDomains/allowedAddresses to YAML. Admin-panel
mcpSettings overrides were ignored by both inspection (addServer/
reinspectServer/updateServer/lazyInitConfigServer) and runtime connection
enforcement (assertResolvedRuntimeConfigAllowed), so a domain allowed only
via the panel failed inspection and never connected.
Make the registry's effective allowlists mutable and refresh them from the
merged admin-panel config: seed at boot, and re-apply on every config
mutation via invalidateConfigCaches -> clearMcpConfigCache. Both inspection
and connection paths read the same getters, so both honor overrides without
a restart. Fail-safe: current allowlists are preserved when the merged read
fails.
* 🛡️ fix: Scope MCP allowlist refresh to global config, fail-safe on DB error
Address Codex P1 review findings on the allowlist-refresh path:
- Tenant-scoped config mutations no longer push one tenant's merged
mcpSettings into the process-wide registry singleton (read by all MCP
connection paths), which would leak allowlists across tenants. Only
global (non-tenant) mutations refresh the registry; tenant mutations
still evict the config-server cache.
- The refresh read now uses strictOverrides:true so a transient DB error
throws instead of silently returning YAML base config — preserving the
last-known allowlists rather than overwriting them with fallback values.
Adds the strictOverrides option to getAppConfig (default off, no behavior
change for existing callers).
* ♻️ refactor: Resolve MCP allowlists per-request (tenant-scoped) instead of a global singleton
Supersedes the prior global-mutation approach. MCP allowlists live in
mcpSettings, which is tenant/principal-scoped admin config, so a process-wide
singleton value is the wrong model — it caused cross-tenant bleed and stale
reads.
Instead, inject a resolver (from the app layer, where the merged config lives)
that the registry calls per inspection and per connection. It reads the ALS
tenant context via getAppConfig and accepts the acting user so user/role-scoped
overrides resolve; config-source inspection (no user) resolves at tenant scope.
Falls back to the YAML base allowlists when no resolver is set or the lookup
fails, so a transient error fails to the operator baseline rather than
disabling the allowlist.
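A simplified, synchronous sketch of the fallback behavior. The real resolver reads merged config and is presumably asynchronous; the `McpAllowlists` type and the function signature here are assumptions for illustration.

```typescript
interface McpAllowlists {
  allowedDomains: string[] | null;
  allowedAddresses: string[] | null;
}

// Hypothetical resolver type: returns the merged allowlists for the acting user.
type AllowlistResolver = (userId?: string) => McpAllowlists;

// Use the injected resolver when present; fall back to the YAML base values
// when it is missing or throws, so an error never disables the allowlist.
function resolveAllowlists(
  base: McpAllowlists,
  resolver: AllowlistResolver | undefined,
  userId?: string,
): McpAllowlists {
  if (!resolver) return base;
  try {
    return resolver(userId);
  } catch {
    return base;
  }
}
```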
Removes the now-unnecessary setAllowlists / boot-seed / invalidateConfigCaches
refresh / getAppConfig.strictOverrides machinery.
* 🔒 fix: Scope config-source cache by allowlist; resolve OAuth allowlists per-request
Address Codex review of the per-request resolver:
- Config-source cache key now folds in the resolved allowlists, not just the
raw-config hash. Inspection results became allowlist-dependent, so without
this a tenant whose allowlist rejects a URL could poison the shared key with
an inspectionFailed stub for a tenant that allows it (and vice versa). The
tenant-scoped allowlist is resolved once per ensureConfigServers pass and
threaded through the cache key + inspection.
- The two remaining request-time OAuth allowlist reads now use the merged
config instead of the YAML base getters: the fallback OAuth-initiate path
(routes/mcp.js) via resolveAllowlists, and OAuth revocation
(UserController.maybeUninstallOAuthMCP) via the request's already-merged
appConfig.mcpSettings. Without this, an OAuth endpoint allowed only by an
admin-panel override was rejected while inspection/connection allowed it.
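To illustrate the first point, a cache key that folds in the resolved allowlists might be built as below. `configCacheKey` and the `ResolvedAllowlists` shape are hypothetical; the actual key format is not shown in this log.

```typescript
interface ResolvedAllowlists {
  allowedDomains: string[] | null;
  allowedAddresses: string[] | null;
}

// Combine the raw-config hash with a normalized view of the allowlists so that
// tenants with different allowlists never share an inspection result.
function configCacheKey(rawConfigHash: string, allowlists: ResolvedAllowlists): string {
  const normalize = (list: string[] | null): string[] | null =>
    list === null ? null : [...list].sort();
  const scope = JSON.stringify([
    normalize(allowlists.allowedDomains),
    normalize(allowlists.allowedAddresses),
  ]);
  return `${rawConfigHash}:${scope}`;
}
```

Sorting before serializing keeps the key stable when two tenants list the same entries in a different order.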
* ✅ test: Update MCP OAuth registry/config mocks for per-request allowlists
CI fix for the Finding-12 change. The OAuth-initiate route now calls
registry.resolveAllowlists() and the revocation path reads the merged
appConfig.mcpSettings, so the affected specs' mocks were asserting the old
base-getter values:
- routes/__tests__/mcp.spec.js: add resolveAllowlists to the registry mock.
- UserController.mcpOAuth.spec.js: provide mcpSettings on the getAppConfig
mock so revokeOAuthToken still receives the expected allowlists.
* 🧪 test: e2e proof that admin-panel MCP allowlist override takes effect
Adds a Playwright mock-harness spec for #13809. A URL-based MCP fixture
(e2e-http, streamable-http SDK server) boots inspectionFailed because its
origin is omitted from the YAML mcpSettings.allowedDomains; the spec adds that
origin via an admin config override (PUT /api/admin/config/user/:id) and
asserts the server reinitializes — exercising the real resolver path through
the backend + DB. Before the fix, reinspection used the frozen YAML allowlist
and the server stayed unreachable.
- e2e/setup/fake-mcp-http-server.js: streamable-HTTP MCP fixture (health GET /).
- e2e/playwright.config.mock.ts: boot the fixture as a second webServer.
- e2e/config/librechat.e2e.yaml: mcpSettings.allowedDomains (excludes 127.0.0.1)
+ the e2e-http server.
- e2e/specs/mock/mcp-allowlist-override.spec.ts: login → baseline reinit fails →
apply override → reinit succeeds.
* 🛡️ fix: Bound object-traverse against DAG fan-out and shared refs
Detect cycles via the ancestor chain (so shared, non-circular references in sibling branches / DAGs are traversed correctly) and add defensive maxNodes (100k) / maxDepth (100) caps. The removed global visited set was implicitly bounding work at O(distinct nodes); ancestor-chain-only detection is O(root-to-node paths), exponential on DAGs (a depth-24 diamond went from 26 to 50M visits / 1.6s of synchronous work). The caps bound it to ~9ms while leaving normal traversal untouched. Adds a spec covering shared refs, cycles, DAGs, and both bounds. The lone consumer, debugTraverse, inherits the defaults with no change.
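A compact sketch of the approach, assuming a traversal that only counts visits. `countVisits` is a hypothetical stand-in for the real object-traverse, and it uses `Object.values` for brevity; the defaults mirror the caps named above.

```typescript
interface TraverseOptions {
  maxNodes?: number;
  maxDepth?: number;
}

function countVisits(root: unknown, options: TraverseOptions = {}): number {
  const maxNodes = options.maxNodes ?? 100000;
  const maxDepth = options.maxDepth ?? 100;
  const ancestors: object[] = []; // current root-to-node chain only
  let visited = 0;

  const walk = (node: unknown, depth: number): void => {
    if (visited >= maxNodes) return; // node budget spent
    visited += 1;
    if (typeof node !== 'object' || node === null) return;
    if (depth >= maxDepth) return; // do not descend further
    if (ancestors.includes(node)) return; // true cycle: node is its own ancestor
    ancestors.push(node);
    for (const child of Object.values(node)) {
      if (visited >= maxNodes) break;
      walk(child, depth + 1);
    }
    ancestors.pop(); // siblings may legitimately revisit this object
  };

  walk(root, 0);
  return visited;
}
```

Because the ancestor stack is popped on the way out, an object shared by two sibling branches is walked in both, while a self-reference stops after one extra visit.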
* 🪵 refactor: Remove legacy api/config logger duplicate
The api/config winston logger was a stale parallel implementation of the canonical @librechat/data-schemas logger, with unbounded redaction (regex-only redactFormat, npm traverse-based debugTraverse). Its winston instance and the logger export from api/config/index.js had zero consumers — every ~/config importer uses the MCP/flow-manager exports. The only live tie was ToolService's use of redactMessage.
Re-export redactMessage from @librechat/data-schemas (behaviorally identical, a superset of the regex set), point ToolService at it, delete api/config/winston.js and api/config/parsers.js, drop the dead logger export, and remove the orphaned ~/config/parsers mock from the global test setup.
* 🧹 chore: Drop orphaned traverse dep and stale legacy logger tests
Deleting api/config/{winston,parsers}.js left the npm 'traverse' package unused in api/package.json (flagged by the detect-unused-packages CI check) and orphaned two tests that imported the deleted modules. Remove the traverse dependency (sync package-lock), and delete api/config/__tests__/{parsers,logToFile}.spec.js — the canonical logger's behavior is covered by packages/data-schemas/src/config/parsers.spec.ts.
* 🩹 fix: Make object-traverse caps bound work and survive update()
Address Codex review: (1) break the child loops as soon as the node budget is spent and iterate objects via for...in instead of materializing Object.entries/Object.keys, so maxNodes actually bounds work for wide arrays/objects; (2) detect ancestor cycles against an immutable original-node stack rather than context.node, which a callback's update() can reassign (the debug formatter rewrites array nodes in place). Adds tests for the wide-array bound and the update()-cycle case.
* 🎚️ fix: Tighten object-traverse defaults to a ~1ms log budget
Lower maxNodes 100000 -> 2500 and maxDepth 100 -> 5. Measured cost is ~140ns/node with the debug formatter callback, so 2500 nodes keeps a single log under ~1ms even on slower prod hardware; real log objects are ~25-30 nodes at depth 3-4, leaving ample headroom. maxNodes is the fan-out/cost lever; maxDepth bounds recursion and output readability (depth-5 covers typical logs, deeper renders compactly).
* 🪙 feat: Context-usage projection — data-provider + client wiring
Consumer side of the SDK-aligned context projection (agents
`projectAgentContextUsage`). Adds the `/api/endpoints/context-projection`
data-provider plumbing (endpoint, service, query key, `TContextProjectionRequest`)
and a `useContextProjectionQuery` gated to fire only when no fresh snapshot
covers the viewed branch.
Wires `useTokenUsage` precedence to: live snapshot → fresh persisted snapshot
(window matches the resolved one) → server projection → per-message estimate.
A model/window switch marks the baked snapshot stale (its `maxContextTokens`
no longer matches) and falls to the projection — closing the gauge's
window-switch (G1) and snapshot-less-branch (G2) gaps. Snapshot and projection
share the render-relevant fields, so they render uniformly.
Backend endpoint + agents version bump land in follow-up commits. Includes the
design spec (CONTEXT_PROJECTION_SPEC.md).
* 🪙 feat: Context-projection backend endpoint
POST /api/endpoints/context-projection → resolveContextProjection (packages/api):
reconstructs the viewed branch (parent-chain walk from messageId), resolves the
agent config (instructions/provider/model/maxContextTokens), reuses LibreChat's
stored per-message tokenCounts as the index map (no re-tokenizing), and calls
the agents SDK projectAgentContextUsage — no model call. Thin controller injects
db.getMessages/db.getAgent; route mirrors /token-config.
First cut targets message-windowing accuracy; tool-schema tokens are deferred to
a follow-up that reuses the full initializeAgent path.
* 🩹 fix: Codex review on context projection (G1 guard, IDOR, recount, summary)
- Guard `currentActive` against a stale window: a model/window switch on the
current branch left the live snapshot outranking the projection (G1 didn't
fire). Now defers to the projection unless streaming or the window matches.
- Scope branch lookups to the authenticated user (`getMessages` filter +
injected `userId`) — was loading any conversation by id (IDOR).
- Recount messages with no stored `tokenCount` via the tokenizer instead of
charging 0, so snapshot-less/imported histories don't under-report.
- Fall back (null) for already-summarized branches rather than projecting from
the full raw parent chain (the next call would send summary + tail); the
client's summary-baseline-aware estimate handles them until a follow-up
replays the summary boundary.
* 🩹 fix: Codex round 2 — drop agent load, summary marker, edit-invalidation
- Stop loading agent/model-spec config server-side (closes the agent-access
IDOR and the spec-prompt special-casing). Provider/model/window now come from
the client-resolved request (`limits.endpoint`/model — the agent's real
provider, not the `agents` endpoint, so the tokenizer is right). Agent/spec/
promptPrefix instructions are uniformly deferred to the full-fidelity follow-up.
- Detect summarized branches via the live path's `metadata.summaryUsedTokens`
marker (was the wrong `summaryTokenCount` field) and fall back to the
summary-aware estimate.
- Invalidate the projection query on in-place message edits via a branch
content `revision` in the cache key (the tail id is unchanged on edit).
Deferred (valid, not a regression): same-window endpoint/model switch keeps a
window-matched snapshot — needs endpoint/model persisted on the snapshot, which
lands with the fidelity follow-up. Smoke-tested: fits / prunes / summarized→null
/ no-window→null.
* 🛡️ fix: make context projection strictly additive (no-regression)
Revert the G1 window-match guard on the live/branch snapshot. When no explicit
maxContextTokens is set (the common default), the SDK's snapshot window is
reserve-derived (~0.9·(modelContext − maxOutputTokens)) while useTokenLimits
resolves the raw model context — so `snapshot.maxContextTokens === resolvedMax`
is false for the SAME model, and the guard would wrongly drop a valid
current-branch snapshot to projection/estimate post-stream (a regression in the
default case, per initialize.ts:1240-1243).
The projection now activates ONLY for snapshot-less branches (G2): the
precedence is live snapshot → persisted branch snapshot → projection → estimate,
where the first two are byte-for-byte the prior behavior and the projection just
slots ahead of the estimate. Window/model-switch (G1) detection needs the
snapshot to carry its model/window and defers to the fidelity follow-up.
* 🩹 fix: surface projections as estimates, not authoritative snapshots
A first-cut projection carries the SDK's windowing but omits instruction/tool
overhead, so rendering it as `isEstimate: false` showed a confident under-count
for snapshot-less branches. Mark projection-sourced views `isEstimate: true` +
`snapshotActive: false` (and drop the snapshot field) so they present as a
better estimate than sumBranch — improved used/window number, estimate framing,
no misleading granular breakdown with ~0 tools. Real snapshots stay
authoritative. (Codex round 3, projection.ts:139.)
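The resulting precedence and estimate framing can be summarized in a small sketch. `pickUsage` and its input shape are assumptions for illustration (each source is reduced to a used-token count); the real hook works on richer snapshot objects.

```typescript
type UsageSource = 'live' | 'persisted' | 'projection' | 'estimate';

interface UsageView {
  source: UsageSource;
  usedTokens: number;
  isEstimate: boolean;
}

interface UsageInputs {
  live?: number | null;
  persisted?: number | null;
  projection?: number | null;
  estimate: number;
}

// live snapshot -> persisted branch snapshot -> projection -> estimate.
// Only real snapshots are presented as authoritative.
function pickUsage(inputs: UsageInputs): UsageView {
  if (inputs.live != null) {
    return { source: 'live', usedTokens: inputs.live, isEstimate: false };
  }
  if (inputs.persisted != null) {
    return { source: 'persisted', usedTokens: inputs.persisted, isEstimate: false };
  }
  if (inputs.projection != null) {
    return { source: 'projection', usedTokens: inputs.projection, isEstimate: true };
  }
  return { source: 'estimate', usedTokens: inputs.estimate, isEstimate: true };
}
```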
* 🧹 chore: drop CONTEXT_PROJECTION_SPEC.md from the PR
* 🎨 style: fix import-sort order in projection.ts (CI sort-imports check)
* 🔧 chore: update @librechat/agents dependency to version 3.2.36 in package-lock.json and related package.json files
* chore: npm audit fix
* 🎨 style: fix import-sort order in data-service.ts (CI sort-imports check)
* 🩹 fix: drop dead calibrationRatio in projectionParams (tsc never error)
Inside the ternary, branchSnapshot is narrowed to null by the gate, so the
calibrationRatio read accessed a property on never (frontend typecheck
failure). It was also dead: there is never a snapshot to seed from in this
branch, so just remove it.
* Revert "chore: npm audit fix"
This reverts commit 4cdb862d0c.
The installed @librechat/agents folds cache_creation + cache_read into
Anthropic usage_metadata.input_tokens (cache-inclusive), but
cacheSubsetProviders omitted anthropic, so splitUsage() took the additive
branch and billed cache tokens twice — at the full input rate and again at
the cache write/read rate. Verified live: a cache-read-heavy Sonnet call was
overcharged 10.7x.
Add Providers.ANTHROPIC to cacheSubsetProviders (single source of truth for
backend billing and client usage normalization). Bedrock stays additive: its
Converse path passes AWS inputTokens through unmodified. Update the Anthropic
regression tests to production-accurate cache-inclusive fixtures.
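The difference between the two branches can be sketched as follows. `splitInput` and the `ProviderUsage` field names are hypothetical simplifications, and the set below lists only the provider discussed here; the real `cacheSubsetProviders` and `splitUsage()` are not reproduced.

```typescript
interface ProviderUsage {
  input_tokens: number;
  cache_creation: number;
  cache_read: number;
}

interface SplitTokens {
  input: number; // billed at the full input rate
  cacheWrite: number; // billed at the cache write rate
  cacheRead: number; // billed at the cache read rate
}

// Assumption for this sketch: providers whose input_tokens already include cache tokens.
const cacheInclusiveProviders = new Set<string>(['anthropic']);

function splitInput(provider: string, usage: ProviderUsage): SplitTokens {
  const cacheTotal = usage.cache_creation + usage.cache_read;
  const input = cacheInclusiveProviders.has(provider)
    ? Math.max(0, usage.input_tokens - cacheTotal) // subset: remove cache from input
    : usage.input_tokens; // additive: input already excludes cache
  return { input, cacheWrite: usage.cache_creation, cacheRead: usage.cache_read };
}
```

With a cache-inclusive input of 1000 tokens, of which 800 are cache reads and 100 are cache writes, only the remaining 100 are charged at the full input rate.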
Fixes #13795
* 🥽 fix: Redact Non-User-Sourced MCP Server URLs by ACL Edit Permission
GET /api/mcp/servers and GET /api/mcp/servers/:serverName return MCP server configs to any caller with MCP-use permission. For user-sourced configs (DB-stored, UI-submitted), the URL is the caller's own and is intentionally disclosed. For non-user-sourced configs (YAML or config-tier, operator-defined), the URL and OAuth flow endpoints (authorization_url, token_url) are operator-sensitive: they can encode internal infrastructure hostnames and are not editable through the API.
This change redacts those fields on non-user-sourced configs unless the caller has edit authority on the resource, using the same ACL check (PermissionBits.EDIT) that the PATCH and DELETE routes already enforce via canAccessMCPServerResource. Callers with broad MANAGE_MCP_SERVERS capability bypass the per-resource check, matching the existing capability bypass in canAccessResource. customUserVars is intentionally not redacted: its values are UI hint metadata (title, description, sensitive), not user-supplied secrets; blanking it would give non-editor callers a Configure form with no field labels.
* 🥽 fix: Correct getResourcePermissionsMap import path + tighten redact comments
The MCP server redaction commit imported getResourcePermissionsMap from ~/server/controllers/PermissionsController, but that controller is a consumer of the helper, not its exporter. The canonical export lives in ~/server/services/PermissionService (which controllers/agents/v1.js already imports from). Fixes the runtime "getResourcePermissionsMap is not a function" failure on GET /api/mcp/servers and the four downstream route-spec failures whose config mocks lacked a source field and were therefore wrongly treated as non-user-sourced; mocks now reflect the real registry behavior (addServer/updateServer tag DB-stored configs with source: 'user'). Trims narrating JSDoc on the redact helpers and re-sorts the librechat-data-provider destructure by length.
* chore: import order
* 🥽 fix: Redact OAuth Revocation Endpoint Alongside Authorization And Token URLs
The OAuth-URL strip path only dropped authorization_url and token_url. The UserOAuthOptionsSchema in packages/data-provider/src/mcp.ts (line 146) accepts revocation_endpoint as another operator-configurable URL, and the OAuth handler uses it to revoke tokens; it can hold the same internal IdP hostnames the existing strip is trying to hide. Adds revocation_endpoint to the destructure so a non-user-sourced YAML/config MCP server config no longer leaks the revocation URL to non-editor callers. The existing strip url and oauth flow URLs spec is extended with a revocation_endpoint value to lock in the new field.
* 🥽 fix: Gate Shared DB Server URL Disclosure On ACL Edit Permission
Source-driven URL disclosure was incorrect for shared DB-backed MCP servers. ServerConfigsDB.mapDBServerToParsedConfig (packages/api/src/mcp/registry/db/ServerConfigsDB.ts:465) sets source: 'user' on every DB-stored config it returns, regardless of who is accessing it. A user with only VIEW share on a DB server, or with agent-mediated access, was therefore treated by the redaction layer as if they owned the URL, and GET /api/mcp/servers disclosed the owner's URL and OAuth flow URLs to viewers who could not edit the resource.
The redaction is now driven purely by ACL edit authority: computeCanEditByServer routes every dbId-bearing config through PermissionBits.EDIT regardless of source; redactServerSecrets strips on !canEdit regardless of source. POST and PATCH controllers explicitly pass canEdit: true since both endpoints establish edit authority (POST creates the resource, PATCH is gated on the EDIT middleware). Legacy/ephemeral configs without a dbId still fall back to the source heuristic.
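A minimal sketch of ACL-driven redaction under these rules. The `ServerConfigView` and `OAuthView` shapes are simplified assumptions, and this is not the actual `redactServerSecrets` implementation; it only shows which fields are dropped when the caller cannot edit.

```typescript
interface OAuthView {
  authorization_url?: string;
  token_url?: string;
  revocation_endpoint?: string;
  scope?: string;
}

interface ServerConfigView {
  url?: string;
  title?: string;
  oauth?: OAuthView;
}

// Strip the server URL and OAuth endpoint URLs unless the caller has edit authority.
// Works on copies so the stored config is never mutated.
function redactForCaller(config: ServerConfigView, canEdit: boolean): ServerConfigView {
  if (canEdit) return config;
  const redacted: ServerConfigView = { ...config };
  delete redacted.url;
  if (config.oauth) {
    const oauth: OAuthView = { ...config.oauth };
    delete oauth.authorization_url;
    delete oauth.token_url;
    delete oauth.revocation_endpoint;
    redacted.oauth = oauth;
  }
  return redacted;
}
```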
* 📝 docs: correct redactServerSecrets URL-disclosure comment
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* 🎤 fix: Keep Microphone Icon Visible On Initial Chat Render
AudioRecorder returned null while the parent ChatForm's textAreaRef was still null on first paint, hiding the mic icon until an unrelated re-render. Render the button disabled instead so the icon is always present.
Closes #13786
* 🎤 refactor: Drop Unused textAreaRef Dependency From AudioRecorder
Per Codex review: deriving the button's disabled state from textAreaRef.current could leave the mic permanently disabled until an unrelated re-render, since assigning a ref does not trigger one. The handlers never read the ref, so remove the dependency entirely along with the now-unused prop.
* 🪙 fix: Reconcile Context Gauge to Actual Provider Tokens
The context gauge could read several× too high (e.g. 213K when the real prompt
was 56K) and stay there across reloads. Root cause: the SDK's calibrationRatio is
`cumulativeProviderReported / cumulativeRawSent`, but a provider's server-side
web search injects large fetched content into the prompt that the SDK never sent
or counted — pinning the ratio at its cap (5) and multiplying every later message
estimate, including post-summary ones. The gauge rendered (and persisted) that
inflated estimate, never the provider's actual token count.
Fix: reconcile the snapshot to the call's ACTUAL prompt tokens (input + cache),
which already arrive in on_token_usage. Only messageTokens is calibration-scaled
(instructions/summary are raw tiktoken), so keep those and set messageTokens to
the remainder, recomputing free space. Shared `promptTokensFromUsage` +
`reconcileContextUsage` in data-provider; applied server-side in
buildPersistedContextUsage (reload-stable) and client-side in useUsageHandler on
each primary usage (corrects at turn-end, no follow-up needed). Also drop the
summary double-count from the Breakdown Messages row.
Deferred (separate agents PR): the SDK over-calibration also fires summarization
prematurely; fixing it needs decoupling real-content estimation from server-side
injection headroom without weakening pruning-overflow safety.
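The reconciliation arithmetic, sketched under assumed field names (`instructionTokens`, `summaryTokens`, `freeTokens` are hypothetical; the real snapshot shape and `reconcileContextUsage` signature may differ):

```typescript
interface ContextSnapshot {
  maxContextTokens: number;
  instructionTokens: number; // raw count, not calibration-scaled
  summaryTokens: number; // raw count, not calibration-scaled
  messageTokens: number; // calibration-scaled, may be inflated
  freeTokens: number;
}

// Keep instructions/summary, set messageTokens to the remainder of the
// provider-reported prompt tokens (input + cache), then recompute free space.
function reconcileSnapshot(
  snapshot: ContextSnapshot,
  actualPromptTokens: number,
): ContextSnapshot {
  const fixed = snapshot.instructionTokens + snapshot.summaryTokens;
  const messageTokens = Math.max(0, actualPromptTokens - fixed);
  const used = fixed + messageTokens;
  return {
    ...snapshot,
    messageTokens,
    freeTokens: Math.max(0, snapshot.maxContextTokens - used),
  };
}
```

For example, with a 200000-token window, 4000 instruction tokens, 2000 summary tokens, and an actual prompt of 56000 tokens, messages reconcile to 50000 and free space to 144000, whatever the inflated estimate was.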
* 🪙 fix: Harden Token Reconciliation for Provider-less + Resume Paths
Codex review on the reconciliation:
- promptTokensFromUsage: when the provider is absent (custom/OpenAI-compatible
payloads), fall back to the same magnitude heuristic normalizeUsageUnits uses
(cache ≤ input ⇒ already included) so cached events aren't re-inflated.
- Resume: backfillUsage restores a primary call's usage without replaying a live
on_token_usage (Redis mode), so the live reconcile never ran and a reconnected
session stayed on the inflated estimate. New reconcileBackfill reconciles the
restored snapshot from the final primary call after contextHandler installs it.
* 🪙 fix: Reconcile Resume Snapshot Server-Side, Not via Backfill
Codex: the client reconcileBackfill scanned the resumed run's collectedUsage and
applied the final primary to the latest snapshot — but on a mid-call resume that
usage belongs to an EARLIER call, corrupting the restored gauge.
Move the resume reconciliation server-side: GenerationJobManager.persistTokenUsage
reconciles the stored contextUsage to a primary usage's actual prompt tokens as it
arrives. That usage is the post-invoke truth for the call the latest stored
snapshot precedes (no snapshot is captured between a call's pre-invoke dispatch
and its usage), so it's correct by construction and run-matched. A mid-call resume
(no usage yet) keeps the raw snapshot instead of mis-applying an earlier call's
tokens; it reconciles once the call completes. Removed client reconcileBackfill;
the live-path reconcile (non-resume) stays.
* 🪙 fix: Guard Reconciliation Against Replays and Snapshot Races
Two Codex concurrency findings on the reconciliation:
- Client: reconcile only on a NEWLY folded primary usage. A replayed duplicate
(folded=false on resume) can be an earlier tool-loop call sharing the run id,
which would overwrite the latest snapshot with an earlier, smaller prompt. Moved
the reconcile after the folded guard.
- Server: serialize the context-usage write through the same per-stream queue as
the token-usage write. persistTokenUsage reconciles the stored snapshot
(read-modify-write); an unserialized trackContextUsage could store a newer
snapshot between the read and write — or a stale reconciled write could land
after a newer snapshot — clobbering the newer run's gauge when calls interleave.
FIFO keeps each call's snapshot ahead of its own usage and behind the next.
* chore: import order in GenerationJobManager.ts