Commit graph

4053 commits

github-actions[bot]
2427edef90
🌍 i18n: Update translation.json with latest translations (#12711)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-21 11:41:33 -04:00
Danny Avila
d1573a8493
🔒 fix: Validate MCP OAuth Protected Resource Metadata binding (#12755)
* 🔒 fix: Validate MCP OAuth Protected Resource Metadata binding (GHSA-gvpj-vm2f-2m23)

RFC 9728 §3.3/§7.3 requires clients to verify that the `resource` identifier
advertised by an OAuth Protected Resource Metadata document matches the URL
used to fetch it. Without this check, a malicious MCP server can serve
metadata pointing at a legitimate server's `authorization_servers`, causing
LibreChat to obtain an access token for the real server and send it to the
attacker in subsequent API calls.

Validation happens at discovery time so the entire metadata document is
discarded on mismatch — `authorization_servers` on a spoofed document is
equally untrustworthy and is the primary theft vector in the PoC.

Uses the MCP SDK's own `checkResourceAllowed` (origin + path-prefix) for
semantic parity with `selectResourceURL`, an SDK code path LibreChat bypasses.
This is looser than RFC-strict equality (handles common cases like
`/mcp/sse` server vs `/mcp` advertised resource, or trailing-slash
normalization) while still rejecting cross-origin spoofs and same-origin
sibling-path confusion.
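
A minimal sketch of the discovery-time check, assuming the SDK's import path and using an illustrative helper body (the name `assertResourceBoundToServer` appears later in this PR, but this is not the actual implementation):

```ts
// Assumed import path; the commit only confirms the SDK exports checkResourceAllowed.
import { checkResourceAllowed } from '@modelcontextprotocol/sdk/shared/auth-utils.js';

function assertResourceBoundToServer(
  metadata: { resource?: string },
  serverUrl: string,
): void {
  if (!metadata.resource) {
    throw new Error('Protected Resource Metadata has no `resource` identifier (RFC 9728 §3.3)');
  }
  // Origin + path-prefix comparison, for semantic parity with selectResourceURL.
  const allowed = checkResourceAllowed({
    requestedResource: serverUrl,
    configuredResource: metadata.resource,
  });
  if (!allowed) {
    // Reject the whole document: its authorization_servers are equally untrustworthy.
    throw new Error(
      `Advertised resource "${metadata.resource}" is not bound to MCP server "${serverUrl}"`,
    );
  }
}
```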

* 🛡️ fix: Re-validate Resource Binding at MCP OAuth Token Exchange

Adds a defense-in-depth re-assertion of the RFC 9728 §3.3 resource/server
binding inside `completeOAuthFlow`. Flows have a 10-minute TTL, so a flow
initiated under pre-fix (vulnerable) code could still be in-flight at upgrade
time carrying unvalidated resource metadata. Re-checking here closes that
window without requiring operators to flush flow state on deploy.

Also guards against future regressions that might re-introduce unvalidated
paths into the flow-metadata pipeline (GHSA-gvpj-vm2f-2m23).

* test: Address Review Findings on MCP OAuth Resource Validation

Follow-up to GHSA-gvpj-vm2f-2m23 fix. Resolves three reviewer findings:

- Assert `failFlow` is called when `completeOAuthFlow` rejects at
  re-validation — locks in the "no stuck PENDING entry" guarantee that the
  catch block already provides in production code.
- Update the "no resource metadata" warning in `initiateOAuthFlow`: post-fix,
  that branch is only reachable when PRM discovery returned nothing (404,
  network error, server without RFC 9728). A document with a missing
  `resource` field now throws earlier in `assertResourceBoundToServer`, so
  the old "missing 'resource' property" phrasing described a case that can
  no longer reach this branch.
- Add a test for an unparseable `resource` string triggering the
  error-wrapping path in `assertResourceBoundToServer` (verifies the wrapper
  surfaces a descriptive message instead of leaking a raw `TypeError:
  Invalid URL` from the SDK's `new URL()` call).
2026-04-21 05:44:09 -07:00
Danny Avila
7cf8b84778
📦 chore: Update @librechat/agents to v3.1.68 (#12752)
- Bumped the version of @librechat/agents from 3.1.67 to 3.1.68 in multiple package.json files to ensure consistency and access to the latest features and fixes.
- Updated package-lock.json to reflect the new version and maintain dependency integrity.
2026-04-20 17:45:58 -07:00
Danny Avila
8e866e6010
🗺️ fix: Resolve Custom-Endpoint Providers for Summarization (#12739)
* 🔧 fix: resolve custom-endpoint providers for summarization

When `summarization.provider` in `librechat.yaml` is set to a custom-endpoint
name (e.g. `Ollama`), the string was passed verbatim to the agents SDK, which
only knows a fixed set of provider names and threw
`Unsupported LLM provider: Ollama`.

Before shaping the summarization config for the SDK, resolve the provider
through `getProviderConfig`: custom-endpoint labels are remapped to the
underlying SDK provider (e.g. `openAI`) and the endpoint's baseURL/apiKey are
injected into `parameters` so the summarization model reaches the right
backend, even when summarization targets a different custom endpoint than the
main agent.

Unknown names and names with no matching endpoint flow through
unchanged so the SDK can surface a clear error. User-provided credentials and
unresolved env-var references are skipped rather than forwarded, letting the
SDK's self-summarize path reuse the agent's own clientOptions.

Ref: LibreChat Discussion #12614
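
A hedged sketch of the resolution step — `getProviderConfig` is named above, but its signature and return shape here are assumptions:

```ts
interface SummarizationConfig {
  provider: string;
  parameters?: Record<string, unknown>;
}

type ProviderLookup = (name: string) =>
  | { provider: string; baseURL?: string; apiKey?: string }
  | null;

function resolveSummarizationProvider(
  config: SummarizationConfig,
  getProviderConfig: ProviderLookup,
): SummarizationConfig {
  const resolved = getProviderConfig(config.provider);
  if (resolved == null) {
    // Unknown name: flow through unchanged so the SDK surfaces a clear error.
    return config;
  }
  return {
    // Custom-endpoint label remapped to the underlying SDK provider (e.g. "openAI").
    provider: resolved.provider,
    parameters: {
      ...config.parameters,
      // Endpoint credentials injected so summarization reaches the right backend.
      ...(resolved.baseURL != null && { baseURL: resolved.baseURL }),
      ...(resolved.apiKey != null && { apiKey: resolved.apiKey }),
    },
  };
}
```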

* address: widen unresolved-env-var guard, fix test naming

- Reject summarization overrides when the extracted baseURL/apiKey still
  contains any `${...}` placeholder, including prefix/suffix patterns like
  `https://${UNSET}.example.com` that `envVarRegex` (exact-match) missed.
- Rename the "case-insensitive" test to reflect that only `Ollama` is
  normalized via `normalizeEndpointName`; add coverage proving other
  custom-endpoint names match case-sensitively.

* address: use req.config in responses.js; forward full endpoint options

- `responses.js` relied on a module-level `appConfig` set via `setAppConfig`,
  which is never called anywhere. Use `req.config` directly so the
  summarization provider resolver actually runs on the responses route.
- Route the custom endpoint config through `getOpenAIConfig` so summarization
  inherits the same `headers`, `defaultQuery`, `addParams`/`dropParams`, and
  `customParams` transforms (Anthropic/Google/etc.) that `initializeCustom`
  applies for the main agent flow. Strip the stale `model`/`modelName`
  defaults so `summarization.model` still wins.

* address: skip overrides when summarization matches agent endpoint

When `summarization.provider` resolves to the same custom endpoint as the
main agent, rely on the SDK's self-summarize path (which reuses
`agentContext.clientOptions` unchanged) rather than injecting overrides.
Otherwise the shallow spread of `clientOverrides.configuration` would
replace the agent's request-resolved state (dynamic headers, proxy/fetch
options) with yaml-only config.

Only applies when summarization targets a *different* endpoint from the
agent; the yaml config is all we have in that case, so overrides still
flow through.

* address: preserve raw provider when overrides cannot be built

When summarization points at a different custom endpoint than the agent
and we can't resolve the endpoint's credentials (user_provided, or a
still-unresolved `${VAR}` after env extraction), remapping to `openAI`
without overrides would silently route summaries to the default OpenAI
client. Preserve the raw provider name so the SDK raises a clear
"Unsupported LLM provider" error (now also logged, via the agents SDK
defense-in-depth fix) instead of sending traffic to the wrong backend.

* address: resolve endpoint headers and forward PROXY to summarization

- Custom-endpoint `headers` now flow through `resolveHeaders` before
  reaching `getOpenAIConfig`, matching the main agent path. This ensures
  templated values like `${PORTKEY_API_KEY}` or `{{LIBRECHAT_BODY_...}}`
  are substituted for summarization requests instead of being forwarded
  literally.
- `PROXY` env var is now passed into `getOpenAIConfig` so cross-endpoint
  summarization honors outbound proxy dispatchers configured for the rest
  of the deployment.

* address: user summarization parameters win over endpoint defaults

Flip the merge order so `summarization.parameters` from yaml override
`clientOverrides` defaults (which come from `getOpenAIConfig` and always
include `streaming: true` etc.). A user who sets `parameters.streaming:
false` in their config should still see non-streaming summarization for
providers that require it.

* address: review feedback (logging, dead code, DRY, types, deep-merge)

- Log error in the resolveSummarizationProvider catch-all so programming
  bugs in getProviderConfig/getOpenAIConfig/resolveHeaders surface in
  operator logs instead of falling through silently.
- Drop dead `setAppConfig`/`appConfig` infrastructure in responses.js and
  fix adjacent `allowedProviders` reference that also relied on the
  never-initialized module-level appConfig. Uses `req.config` directly.
- Import canonical `normalizeEndpointName` from librechat-data-provider
  instead of duplicating it locally.
- Replace `SummarizationClientOverrides = Record<string, unknown>` with
  an explicit interface covering the known fields.
- Deep-merge `configuration` when user-supplied
  `summarization.parameters.configuration` overlaps the resolved endpoint
  configuration, so user additions (e.g. `defaultQuery`) don't wipe out
  `baseURL`/`defaultHeaders`.
- Wrap `process.env` mutations in test in `try/finally` so a failed
  assertion doesn't leak env state into subsequent tests.
- Drop `as unknown as AppConfig` in test helper; fixture now matches the
  `AppConfig` shape directly using a `Partial<TEndpoint>` union.
- Trim JSDoc that restated the name it was attached to.
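
The merge semantics from the last few items, sketched (shapes assumed; the real code covers more fields):

```ts
type Params = Record<string, unknown> & { configuration?: Record<string, unknown> };

// User-supplied summarization.parameters spread last so they win over endpoint
// defaults (a user's `streaming: false` survives the resolver's `streaming: true`),
// while `configuration` merges one level deep so user additions like `defaultQuery`
// don't wipe out the resolved `baseURL`/`defaultHeaders`.
function mergeSummarizationParameters(overrides: Params, userParams: Params): Params {
  const merged: Params = { ...overrides, ...userParams };
  if (overrides.configuration != null || userParams.configuration != null) {
    merged.configuration = { ...overrides.configuration, ...userParams.configuration };
  }
  return merged;
}
```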

* address: review nits — import order, local test type, conflict test

- Move `import { logger }` up into the package value-imports section so
  it no longer sits between `import type` blocks.
- Replace `as unknown as SummarizationConfig['parameters']` in the
  deep-merge test with a named `TestSummarizationParameters` type and a
  single narrowing cast at the call site, making intent explicit.
- Add a test proving that user-supplied `configuration.baseURL` wins
  over the resolved endpoint baseURL, locking in the deep-merge's
  user-wins-on-conflict semantics that the previous suite only exercised
  additively.
2026-04-20 12:00:46 -04:00
Danny Avila
d2cbd551b7
🤝 fix: Load Handoff Agents for Agents API (#12740)
* 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (#12726)

Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents
into a shared `discoverConnectedAgents` helper in `@librechat/api` and
wires it into the OpenAI-compatible `/v1/chat/completions` and Open
Responses `/v1/responses` controllers. These endpoints previously only
passed the primary agent config to `createRun` while keeping
`primaryConfig.edges` intact, which forced `MultiAgentGraph` into
multi-agent mode without loading the referenced sub-agents and caused
StateGraph to throw "Found edge ending at unknown node <id>".

The discovery helper also filters orphaned edges (deleted sub-agents or
those the caller lacks VIEW permission on), so API users see the same
graceful fallback the chat UI already had.
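
The discovery idea reduces to a BFS over handoff edges; a minimal sketch, with the edge shape and loader signature assumed (the real helper also threads ACL checks, tool contexts, and MCP auth maps):

```ts
type Edge = { from: string | string[]; to: string | string[] };
const toArray = (v: string | string[]): string[] => (Array.isArray(v) ? v : [v]);

// BFS over handoff edges from the primary agent; loadAgent resolves to null when
// the sub-agent is deleted or the caller lacks VIEW permission (it gets skipped).
async function discoverConnectedAgents(
  primaryId: string,
  edges: Edge[],
  loadAgent: (id: string) => Promise<object | null>,
): Promise<Map<string, object>> {
  const discovered = new Map<string, object>();
  const seen = new Set<string>([primaryId]);
  const queue = [primaryId];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const edge of edges) {
      if (!toArray(edge.from).includes(current)) continue;
      for (const next of toArray(edge.to)) {
        if (seen.has(next)) continue;
        seen.add(next);
        const agent = await loadAgent(next);
        if (agent != null) {
          discovered.set(next, agent);
          queue.push(next);
        }
      }
    }
  }
  return discovered;
}
```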

* 🧪 fix: use ServerRequest in discovery spec helpers

CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test
helpers typed `req` as `express.Request`, which is not assignable to
`DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose
`user` is `IUser`). Local jest passed because ts-jest is transpile-only,
but the CI typecheck uses the full compiler.

* 🪲 fix: drop orphan edges on both endpoints, not just `to`

Addresses the P1 codex finding on #12740: `filterOrphanedEdges`
previously only removed edges whose `to` referenced a skipped agent.
Edges whose `from` was a skipped agent — the symmetric case in a
bidirectional graph like `A <-> B` where `B` is deleted or the user
lacks VIEW on it — leaked through to `createRun` and re-triggered
`Found edge ending at unknown node <id>` at StateGraph compile time.

The filter now drops an edge if either endpoint references a skipped
id, and the existing `to`-only test cases were updated to reflect the
stricter behavior. Adds a bidirectional-graph regression test in
`discovery.spec.ts`.

* 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes

Addresses the second P1 codex finding on #12740: the OpenAI-compat
`/v1/chat/completions` and Open Responses `/v1/responses` routes gate
the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`),
but `discoverConnectedAgents` was checking handoff sub-agents against
the looser in-app `AGENT` resource type. That allowed a remote caller
who could reach the orchestrator but had only in-app visibility on a
sub-agent to invoke it via the API — bypassing the remote-sharing
boundary.

Adds an optional `resourceType` param to `discoverConnectedAgents`
(defaulting to `AGENT` for the chat UI path) and passes
`ResourceType.REMOTE_AGENT` from both API controllers so every
discovered sub-agent clears the same sharing boundary enforced at
route entry.

* 🧯 fix: enforce allowedProviders for discovered sub-agents

Addresses the third P1 codex finding on #12740: `discoverConnectedAgents`
forwarded the caller's `endpointOption` verbatim into `initializeAgent`,
but on the OpenAI-compat routes that option's `endpoint` is the primary
agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only
enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)`
is true, so handoff sub-agents silently bypassed the provider allowlist
configured under `endpoints.agents.allowedProviders`.

Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every
per-sub-agent init call. The primary agent still uses the caller's
endpointOption as before — this only affects the BFS-loaded handoff
targets. Regression test asserts the override.

* ✂️ fix: prune unreachable sub-agents after orphan-edge filtering

Addresses the fourth P1 codex finding on #12740: BFS eagerly initializes
every sub-agent referenced in the primary's edge scan, but once
`filterOrphanedEdges` drops edges whose endpoints were skipped, some of
those sub-agents end up disconnected from the primary. In an `A -> B ->
C` graph (edges stored directly on A) where B is skipped (missing or
no VIEW), both edges are filtered, but C was already loaded and would
still be passed to `createRun` — which flips into multi-agent mode on
`agents.length > 1` and turns C into an unintended parallel start node.

After filtering edges, compute the set of agent ids reachable from the
primary through the surviving edge set and prune `agentConfigs` to that
set. Two regression tests added: one for the pruning case, one that
confirms agents connected via surviving edges are still kept.

* 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback

Addresses the fifth P1 codex finding on #12740: `onAgentInitialized`
fires during BFS, BEFORE the helper prunes agents that become
disconnected once `filterOrphanedEdges` runs. Writing the sub-agent
straight into the outer `agentConfigs` there and then only additively
merging the pruned `discoveredConfigs` left stranded entries in the
outer map, and `AgentClient` would still hand them to `createRun` as
extra parallel start nodes (the exact failure mode the pass-4 prune
was meant to eliminate for the API controllers).

Drop the `agentConfigs.set` from the callback and replace the additive
merge with a direct copy from `discoveredConfigs`, which is now the
single authoritative source of what the run should see. The
per-agent tool context map is still populated during BFS — stale
entries there are harmless because they're only read by closure inside
`ON_TOOL_EXECUTE` and are unreachable once the agent is not in
`agentConfigs`.

* 🔬 fix: address audit findings on discovery helper

Resolves findings from a comprehensive external audit of #12740.

**Finding 1 (CRITICAL) — stale edges survive the reachability prune.**
The pass-4 prune removed unreachable agents from `agentConfigs` but left
matching edges in the return value. In an `A -> B -> C -> D` graph (all
edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B
and B->C but keeps C->D (neither endpoint is skipped). The caller then
sees `agentConfigs` without C/D but `edges` still references them,
flipping `createRun` into multi-agent mode with mismatched agents/edges
— the exact crash this PR is supposed to fix. Now filter the edge list
to the reachable set in the same pass, so the returned shape is
self-consistent: every edge endpoint is either the primary id or a key
of `agentConfigs`. New regression test covers A->B->C->D with B skipped.

**Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API
request.** The OpenAI-compat and Responses controllers called
`getModelsConfig(req)` and `discoverConnectedAgents` even when the
primary agent had no edges (the common single-agent API case). Gate
both behind `primaryConfig.edges?.length > 0` so single-agent runs
don't pay that cost.

**Finding 5 (MINOR) — silent mutation of caller's
`primaryConfig.userMCPAuthMap`.** The helper aliased that object and
then `Object.assign`'d sub-agent entries into it, changing the caller's
config in-place. Shallow-clone up front so the returned merged map is
the only destination.

**Finding 7 (NIT) — dead `?? []` coalescing.**
`filterOrphanedEdges` always returns a concrete array, so the
`discoveredEdges ?? []` fallback was never reached. Simplified the
`primaryConfig.edges = …` assignment.

Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not
mutated in-place.

* 🧹 chore: address audit NITs on discovery helper

Addresses two NIT findings from the post-fix audit:

**F1** — the shallow clone on `primaryConfig.userMCPAuthMap` was only
applied on the primary side; the `else` branch (hit when the primary
had no MCP auth and the first sub-agent seeds the map) assigned the
sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's
`Object.assign` mutated the first one's map in place. Harmless in
practice (per-request ephemeral objects) but asymmetric. Clone in the
else branch too. Test added.

**F2** — `initialize.js` had a defensive `if (agentConfigs.size > 0 &&
!edges) edges = []` normalizer. Pre-existing dead code: the helper now
always returns a concrete array from `filteredEdges.filter(...)`.
Removed for clarity.

* 🕸 fix: require all sources reachable when traversing fan-in edges

Addresses the seventh P1 codex finding on #12740: the reachability BFS
advanced through an edge as soon as any of its `from` endpoints matched
the current frontier node (`sources.includes(current)`), but the
subsequent edge filter required ALL sources to be reachable (`every`).
The two-semantics mismatch let a fan-in edge like `{from: ['A','B'],
to: 'C'}` mark C reachable purely via A even when B had no path from
the primary, then drop the edge itself at filter time. Result: C
survived in `agentConfigs` with no surviving edge connecting it to A,
so `createRun` flipped into multi-agent mode on `agents.length > 1`
and C ran as an unintended parallel root.

Replace the BFS with a fixed-point iteration keyed on the same
all-sources-reachable predicate used by the filter, so traversal and
filtering stay aligned and multi-source edges only fire once every
source is in the reachable set.

Two regression tests added:
- `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts
  neither B nor C leak into the result.
- `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires
  and D becomes reachable once both B and C are.

* 🔀 fix: match SDK OR semantics for multi-source edge reachability

Reverts the all-sources-required reachability gate from 4982f1c3b and
replaces it with an any-source-reachable model, which matches how
`@librechat/agents`'s `MultiAgentGraph.createWorkflow` actually wires
multi-source edges at runtime (per-source `builder.addEdge(source,
destination)`). With the previous `every` gate, a legitimate handoff
edge `{ from: ['A', 'B'], to: 'C' }` where B had no incoming path was
pruned along with C, regressing OR-semantics routing that the SDK
would otherwise handle correctly.

New behavior:

1. Reachability: an edge advances when ANY of its `from` endpoints is
   already reachable. Fixed-point iteration over `filteredEdges`.
2. Edge filter: keep an edge when it has at least one reachable source
   AND all destinations are reachable (a missing destination would
   still crash `StateGraph.compile` with `Found edge ending at unknown
   node`).
3. Agent prune: keep agents that are reachable OR referenced on any
   endpoint of a surviving edge. The second clause preserves co-sources
   in multi-source edges (B in `{ from: ['A','B'], to: 'C' }` when
   nothing else reaches B) so the SDK's per-source `addEdge` — and the
   `validateEdgeAgents` safety-net I added to the SDK in #111 — still
   finds B as a node.

The pass-audit A->B->C->D regression test continues to pass: with B
skipped, `filterOrphanedEdges` drops both B-adjacent edges, reachability
never expands past A, C->D has no reachable source so it gets filtered,
and C/D are pruned because they're neither reachable nor referenced.
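
A sketch of the any-source fixed-point iteration under the same assumed edge shape; the real helper interleaves this with the edge filter and agent prune described above:

```ts
type Edge = { from: string | string[]; to: string | string[] };
const toArray = (v: string | string[]): string[] => (Array.isArray(v) ? v : [v]);

// An edge fires when ANY of its sources is already reachable, matching the SDK's
// per-source addEdge wiring. Iterate to a fixed point since edges are unordered.
function computeReachable(seedIds: string[], edges: Edge[]): Set<string> {
  const reachable = new Set(seedIds);
  let changed = true;
  while (changed) {
    changed = false;
    for (const edge of edges) {
      if (!toArray(edge.from).some((id) => reachable.has(id))) continue;
      for (const dest of toArray(edge.to)) {
        if (!reachable.has(dest)) {
          reachable.add(dest);
          changed = true;
        }
      }
    }
  }
  return reachable;
}
```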

* ✂️ fix: strip skipped co-members from multi-source/multi-dest edges

Addresses codex pass-9 P2 on #12740. `filterOrphanedEdges` previously
dropped an edge whenever any `from` id was skipped, which was correct
for scalar edges but over-aggressive for multi-source ones: the agents
SDK adds one `builder.addEdge(source, destination)` per source, so
`{ from: ['A','B'], to: 'C' }` with B skipped still has a valid
`A -> C` route that was being thrown away.

Now sanitize each endpoint:
- Scalar skipped → drop the whole edge (no route survives).
- Array with some skipped → strip the skipped ids, keep the edge with
  the surviving members. If the array empties out, drop the edge.

Symmetric handling for `to` covers multi-destination fan-out when one
co-destination is skipped. Tests updated/added:
- `strips skipped co-sources from multi-source edges…`
- `strips skipped co-destinations from multi-destination edges`
- `drops multi-member edges only when every member on a side is skipped`
- Discovery-side: `preserves valid routes when one co-source of a
  multi-source edge is skipped` asserts the end-to-end behavior —
  skipped co-source B gets stripped from the edge, A->C routing
  survives, and C remains in `agentConfigs`.
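
The per-endpoint sanitization, sketched under the same assumed shapes:

```ts
// Returns the surviving endpoint, or null when no route through it remains.
function sanitizeEndpoint(
  endpoint: string | string[],
  skipped: Set<string>,
): string | string[] | null {
  if (Array.isArray(endpoint)) {
    const surviving = endpoint.filter((id) => !skipped.has(id));
    // Strip skipped co-members; drop the edge only when the whole side empties out.
    return surviving.length > 0 ? surviving : null;
  }
  // Scalar skipped -> no route survives -> drop the whole edge.
  return skipped.has(endpoint) ? null : endpoint;
}
```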

* 🔓 fix: respect SHARE-on-AGENT fallback for handoff ACL on API routes

Addresses codex pass-10 P1 on #12740. The API controllers were handing
`discoverConnectedAgents` a raw `PermissionService.checkPermission` call
against `ResourceType.REMOTE_AGENT`, but the route-level middleware
(`createCheckRemoteAgentAccess`) authorizes the primary agent via
`getRemoteAgentPermissions`, which first consults the AGENT ACL and
treats owners with the SHARE bit as remotely authorized even without
an explicit REMOTE_AGENT grant. The mismatch meant a user could open
the primary via `/v1/chat/completions` or `/v1/responses`, but their
own owned handoff sub-agents were silently skipped — breaking
multi-agent handoffs for the common "owner runs their own multi-agent
orchestrator" case.

Both controllers now pass `discoverConnectedAgents` a `checkPermission`
wrapper that delegates to `getRemoteAgentPermissions` (with
`getEffectivePermissions` injected from `PermissionService`) and
compares the returned bitmask against the required permission via
`hasPermissions`. Sub-agents are now authorized by the exact same
rules the route middleware applies to the primary.

* 🌱 fix: preserve user-defined parallel-start branches

Addresses codex pass-11 P2 on #12740. The post-filter reachability
prune seeded only from `primaryConfig.id`, which killed
`MultiAgentGraph`'s legitimate multi-start pattern — a user-defined
edge like `X -> Y` where X has no incoming path (X is an intentional
parallel starting node, run alongside the primary) was being dropped
because neither X nor Y was reachable from the primary.

Reconcile the tension with pass-4 ("prune accidental orphans when an
intermediate is skipped") by using pre-filter reachability as the
signal:

- An agent that WAS reachable from the primary via the original
  (pre-filter) edges but loses that path when `filterOrphanedEdges`
  runs is an accidental orphan (a skipped hop broke the chain) — prune.
- An agent that was NEVER reachable from the primary, even pre-filter,
  is an intentional parallel start — seed it into post-filter
  reachability so its component survives.

Surviving-edge endpoint references still keep an agent (co-sources in
multi-source edges). New test `preserves user-defined parallel-start
branches disconnected from the primary` covers the pass-11 scenario;
the existing `A->B->C->D, B skipped` regression test continues to
pass because C/D were pre-filter reachable through B and lose that
reachability after filtering.

* 🎯 fix: tighten parallel-start seed criterion to 'no pre-filter incoming edge'

Addresses codex pass-12 P1 on #12740. The pass-11 seed heuristic — 'agent
is in `agentConfigs` but was not pre-filter reachable from the primary' —
was too permissive. A downstream agent like Y in `X -> Y` where X gets
skipped (missing / no VIEW) was never pre-filter reachable from the
primary either, so the old rule promoted Y to a parallel start node and
discovery returned `agents: [primary, Y]` with no connecting edge. The
SDK then ran Y as an unintended parallel root — exactly the orphan
behavior pass-4 wanted to prevent.

Tighter criterion: seed a post-filter reachability root only when the
agent had NO incoming edge in the pre-filter graph. That matches
`MultiAgentGraph.analyzeGraph`'s "no-incoming-edge" definition of a
start node applied to the user's original declared topology, so:

- `A -> B` plus a user-defined `X -> Y` parallel branch: X has no
  incoming pre-filter → seeded → X and Y both survive.
- `A -> B` plus `X -> Y` with X skipped: Y had an incoming pre-filter
  (`X -> Y`) → NOT seeded → Y is pruned as the orphan it is.
- `A -> B -> C` with B skipped: C had an incoming pre-filter (`B -> C`)
  → NOT seeded → C is pruned.

New test `does not promote a downstream orphan to a parallel start when
its only upstream is skipped` locks in the pass-12 scenario. The pass-11
`preserves user-defined parallel-start branches` test continues to hold.

* 📁 fix: don't enforce AGENT-only file ACL on REMOTE_AGENT API callers

Addresses codex pass-13 P1 on #12740. When I refactored the API
controllers' DB-method bundle, I inadvertently started forwarding
`filterFilesByAgentAccess` into `initializeAgent`. That helper calls
`checkPermission` with `resourceType: ResourceType.AGENT`, but these
routes authorize callers through `REMOTE_AGENT` (via
`getRemoteAgentPermissions`). A user granted `REMOTE_AGENT_VIEWER` on
a shared agent but lacking direct `AGENT_VIEW` could invoke the agent
yet all its owner-attached context files would get silently filtered
out — breaking `file_search`/context retrieval for remote consumers.

Drop `filterFilesByAgentAccess` from the OpenAI-compat and Responses
controllers' `dbMethods` (and remove the now-unused import). The chat
UI's `initialize.js` keeps it since that path legitimately authorizes
at the AGENT level. No functional change inside the helper — passing
`undefined` simply tells `primeResources` to skip the per-file ACL
filter, restoring the pre-refactor API behavior.

* 🪓 fix: strip unreachable co-sources from surviving multi-source edges

Addresses codex pass-14 P1 on #12740. The earlier pass-8 fix kept any
agent referenced as an endpoint of a surviving edge (via a
`referencedByEdge` fallback) to avoid the SDK's `validateEdgeAgents`
failing on missing nodes. But that fallback propped up unreachable
co-sources too: with `[A -> C, X -> B, [B,C] -> D]` and X skipped,
`X -> B` gets filtered, the `[B,C] -> D` fan-in survives because C is
reachable, and B stays in `agentConfigs` solely because the fan-in
still lists it. `MultiAgentGraph.analyzeGraph` then sees B with no
incoming edge and runs it as an unintended parallel root.

Sanitize surviving edges instead: for a kept edge whose `from` is an
array, filter out any co-source that isn't reachable. The SDK's
per-source `addEdge` fires independently, so dropping an unreachable
co-source doesn't invalidate the remaining routes — in the scenario
above `[B,C] -> D` becomes `[C] -> D`, every endpoint of every
surviving edge is now reachable, and the agent prune collapses to a
strict `reachable.has(agentId)` check. No more referenced-by-edge
fallback.

Regression test added: `strips unreachable co-sources from surviving
multi-source edges (no stray parallel root)` — asserts B is absent
from every surviving edge endpoint and the fan-in's `from` is just
`['C']`. All 22 prior discovery tests still pass unchanged.
2026-04-20 02:20:43 -04:00
Danny Avila
48c3d31db3
🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737)
* 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels

The default console formatter discarded every structured metadata key on the
winston info object — only `CONSOLE_JSON=true` preserved it. That meant
failures emitted by the agents SDK (e.g. "Summarization LLM call failed")
reached stdout without the provider, model, or underlying error attached,
leaving users unable to diagnose the root cause.

- Add `formatConsoleMeta` helper to serialize non-reserved metadata as a
  compact JSON trailer, with per-value string truncation and safe handling
  of circular references.
- Append the metadata trailer to warn/error console lines; info/debug
  behavior is unchanged.
- Relax `debugTraverse`'s debug-only gate so warn/error messages routed
  through the debug formatter also surface their metadata.
- Add a `formatConsoleMeta` stub to the shared logger mock so existing
  tests keep working.
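
A sketch of the trailer helper, assuming a reserved-key set and truncation length; the try/catch circular handling shown here is what a later commit in this PR replaces with a `WeakSet` replacer:

```ts
const RESERVED_KEYS = new Set(['level', 'message', 'timestamp']); // assumed reserved set

function formatConsoleMeta(info: Record<string, unknown>, maxLen = 256): string {
  const meta: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(info)) {
    if (RESERVED_KEYS.has(key)) continue;
    meta[key] =
      typeof value === 'string' && value.length > maxLen
        ? `${value.slice(0, maxLen)}...` // per-value truncation
        : value;
  }
  if (Object.keys(meta).length === 0) return '';
  try {
    return ` ${JSON.stringify(meta)}`; // compact single-line JSON trailer
  } catch {
    return ' [unserializable metadata]'; // e.g. circular references
  }
}
```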

* 🔐 fix: Also Redact Sensitive Patterns on Warn Console Lines

The warn-level console output now includes a metadata trailer that may
contain provider-returned error strings with embedded tokens or keys
(e.g. `Bearer ...`, `sk-...`). Apply `redactMessage` to warn lines in
addition to error, matching the new surface area.

* 🔐 fix: Redact Sensitive Tokens Embedded in JSON Metadata

Two gaps in the existing console redaction that became user-visible once
warn/error lines started emitting structured metadata:

1. The OpenAI-key regex (`/^(sk-)[^\s]+/`) was anchored to start-of-line,
   so keys embedded inside JSON payloads (e.g. `{"apiKey":"sk-..."}`)
   were never redacted. Every console line begins with a timestamp, so
   the anchor effectively made this pattern dead code.
2. `formatConsoleMeta` stringified metadata values verbatim; a sensitive
   string value was only redacted by the whole-line regex pass, which
   missed the anchored `sk-` case above.

Fix:
- Drop the `^` anchor; add `/g` so every occurrence is redacted, not just
  the first.
- Also exclude `"` and `'` from the token body so JSON-embedded values
  terminate at the closing quote rather than chewing into the next field.
- Simplify `redactMessage` to apply patterns directly (dropping the
  `getMatchingSensitivePatterns` filter) — the filter used `.test()`
  which has stateful behavior on `/g` regexes and is no longer needed.
- `formatConsoleMeta` now runs `redactMessage` over every string value
  before JSON serialization, so the metadata trailer is safe even on the
  warn path.
- Add regression tests covering both fixes.

Reviewed-by: Codex (P1 finding on PR #12737, commit 68c31b6).

* 🔐 fix: Redact Metadata in debugTraverse for Warn and Error

Relaxing the debug-only gate in debugTraverse (in commit 59371be0)
routed warn/error records through the traversal path, which emits leaf
string values verbatim (via truncateLongStrings only). Because
DEBUG_LOGGING defaults to true, those records are also written to the
rotating debug log file — which means payloads like
`{ auth: 'Bearer ...' }` or `{ openaiKey: 'sk-...' }` were persisted
unredacted once my earlier change took effect.

Apply redactMessage to the final formatted string when the level is
warn or error. Debug-level behavior is unchanged (matching prior art).

Includes regression tests covering error/warn redaction and
debug-level preservation.

Reviewed-by: Codex (P1 finding on PR #12737, commit e288f7fd).

* 🔐 fix: Anchor Secret Regexes at Word Boundaries to Prevent Over-Redaction

Removing the `^` anchor in commit e288f7fd let the OpenAI-key regex match
anywhere in the line — including inside ordinary words like `task-runner`
or `mask-value`, where `sk-` appears mid-word. Non-secret text was being
rewritten to `task-[REDACTED]`, hiding real log content from operators.

- Anchor every sensitive-key pattern with `\b` so matches only fire at
  word boundaries.
- Constrain the OpenAI-key body to the documented charset
  (`[a-zA-Z0-9_-]+`) instead of the broader "not whitespace or quote"
  character class.
- Add `&` to the `key=` exclusion so a query-string value stops at the
  next parameter separator.
- Regression tests covering both the over-redaction cases (`task-runner`,
  `monkey=10`) and the intended redactions still firing.

Reviewed-by: Codex (P2 finding on PR #12737, commit c09d293d).
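
A hedged reconstruction of the pattern this commit lands on (the repo's exact regex may differ); the replacement keeps the captured prefix, which is why the earlier over-redaction rendered as `task-[REDACTED]`:

```ts
// \b stops mid-word matches; the documented key charset stops at quotes and separators.
const OPENAI_KEY_RE = /\b(sk-)[a-zA-Z0-9_-]+/g;
const redact = (line: string): string => line.replace(OPENAI_KEY_RE, '$1[REDACTED]');

redact('{"apiKey":"sk-abc123"}'); // -> '{"apiKey":"sk-[REDACTED]"}'
redact('task-runner finished');   // unchanged: "sk-" sits mid-word, so \b never fires
```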

* 🔐 fix: Redact Before Colorize To Survive ANSI Word-Boundary Interference

The console pipeline runs `redactFormat → colorize({ all: true }) → printf`.
With `all: true`, winston wraps `info.message` in ANSI escapes whose
trailing `m` is a word character. That means `\b(Bearer )…` placed at the
start of a colorized segment can fall on a (word,word) boundary and miss —
the earlier line-wise `redactMessage(line)` pass in printf suffers the
same issue because it runs after colorize.

Extend `redactFormat` to run for `warn` in addition to `error`, operating
on the raw pre-colorize `info.message` + `Symbol.for('message')` strings.
The later in-printf `redactMessage(line)` stays as a backstop, but the
primary redaction now happens where the regex can actually see the text.

Metadata redaction already operates on the raw info object via
`formatConsoleMeta`, so it was never affected by ANSI — no change there.

Includes regression tests for the new warn-level behavior and for the
info/debug no-op path.

Reviewed-by: Codex (P2 finding on PR #12737, commit fdb6b361).

* 🧹 fix: Prefer Structured Metadata Over Consumed Splat Args in Traversal

`debugTraverse` previously read `metadata[Symbol.for('splat')][0]` first
and only fell back to the structured metadata object. When a caller uses
printf interpolation alongside a metadata object — for example
`logger.warn('failed for %s', tenant, { provider })` — winston leaves the
*consumed* positional arg (`tenant`) in `SPLAT[0]` after interpolation.
The formatter would then append the tenant a second time and skip the
real metadata, regressing debug-file and `DEBUG_CONSOLE` output quality
now that warn/error share this path.

Prefer the structured metadata object (via `extractMetaObject`) and only
fall back to `SPLAT[0]` when there's nothing else, so the surviving log
line surfaces the actual key/value pairs regardless of call shape.

Reviewed-by: Codex (P2 finding on PR #12737, commit 1e43d636).

* 🧹 fix: Skip Consumed Splat Primitives in Warn/Error Debug Traversal

When no structured metadata is attached, winston still leaves consumed
`%s` / `%d` arguments in `Symbol.for('splat')`. Previous fix preferred
the structured object but still fell back to whatever sat at `SPLAT[0]`
— so `logger.warn('failed for %s', tenantId)` emitted
`failed for tenant-7 tenant-7` in the traversal path (debug file and
`DEBUG_CONSOLE`), now regressed outside of `debug` level because
warn/error share the path.

Only accept the splat fallback when the value is a plain object or an
array (structural data worth surfacing). Primitives there are almost
certainly consumed printf args and get skipped.

Regression tests cover the single-%s case and the array-as-metadata case
(which still surfaces through splat).

Reviewed-by: Codex (P2 finding on PR #12737, commit bccbf117).

* 🧹 fix: Skip Numeric Splat Keys When Extracting Log Metadata

When a caller passes a primitive as the second argument — e.g.
`logger.warn('Unhandled step creation type:', step.type)` — winston /
`format.splat()` can leave character-index keys (`"0"`, `"1"`, …) on the
`info` object. With the warn/error metadata trailer in play, those
synthetic artifacts were being surfaced as bogus metadata, producing
noisy console and debug-file output.

Filter out numeric-string keys in `extractMetaObject` so only real
metadata fields reach the trailer. Added a regression test.

Reviewed-by: Codex (P2 finding on PR #12737, commit b34628de).

* 🧹 fix: Preserve Unconsumed Primitive Splat Args in Debug Traversal

The previous round dropped every primitive SPLAT[0] value to avoid
duplicating consumed %s args, but that removed useful context from
calls like `logger.debug('prefix:', detail)` where the primitive was
never interpolated — users lost the `detail` value.

Refine the heuristic: skip a primitive splat value only when it already
appears inside the (post-interpolation) `info.message`; otherwise
surface it. Arrays and objects continue to surface unconditionally.

Regression test covers the 'prefix:', detail case.

Reviewed-by: Codex (P2 finding on PR #12737, commit 6bf9548f).

* 🧹 fix: Traverse Filtered Metadata, Not Raw Metadata, In debugTraverse

`debugTraverse` computes `extracted = extractMetaObject(metadata)` to
strip reserved keys, underscore-prefixed internals, and numeric splat
artifacts — but the later `klona(metadata)` + `traverse` path still
read the raw object, putting all the filtered junk back into the
rendered multi-line output.

Clone and traverse `debugValue` (the already-filtered object) instead.
Regression test exercises the case where numeric splat artifacts sit
alongside a real metadata field.

Reviewed-by: Codex (P2 finding on PR #12737, commit c29c18e8).

* 🧹 refactor: Split Warn/Error From Debug-Level Traversal in Parsers

Retrofitting `debugTraverse`'s multi-line object walker to cover
warn/error created a minefield of splat-interaction edge cases
(numeric artifacts, consumed %s args, bogus `_`-prefix filtering,
over-eager suppression of unconsumed primitives). Each fix kept
introducing new corner cases.

Split the two concerns instead:

- Warn/error now emit a compact single-line JSON metadata trailer via
  `formatConsoleMeta`, then pass the full line through `redactMessage`.
  This mirrors what the console formatter already does, so behavior
  between the console and debug-file outputs stays consistent for
  warn/error — and none of the splat/traversal edge cases apply.
- Debug level keeps its original code path verbatim (including the
  raw `metadata` traversal and SPLAT[0] fallback). No regressions from
  my earlier iterations.
- `extractMetaObject` no longer filters underscore-prefixed keys, so
  legitimate fields like MongoDB `_id` still appear. Reserved winston
  keys and numeric splat artifacts remain filtered.

Updated tests reflect the simpler contract (underscore preservation,
single-line trailer expectations already covered).

Reviewed-by: Codex (two P2 findings on PR #12737, commit 9ea11529:
`_id` regression and over-eager primitive suppression).

* 🧹 fix: Preserve Scalar Metadata When One Value Is Circular

`formatConsoleMeta` previously wrapped a single `JSON.stringify` in
try/catch — any circular reference inside any field (e.g. an attached
request/response object) caused the entire trailer to be dropped. That
defeats the goal of making failures diagnosable: one malformed field
would mask the provider/model/status we wanted to surface.

Use a `WeakSet`-based replacer that emits `[Circular]` for repeated
object visits. On the whole-object serialization failing, fall back to
per-field serialization so scalar keys always land and only the
offending field is replaced with `"[Unserializable]"`.

Reviewed-by: Codex (P2 finding on PR #12737, commit d63742a5).
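
The replacer plus per-field fallback, as a minimal sketch (helper names assumed):

```ts
function stringifyWithCircularGuard(value: unknown): string {
  const seen = new WeakSet<object>();
  return JSON.stringify(value, (_key, val) => {
    if (typeof val === 'object' && val !== null) {
      if (seen.has(val)) return '[Circular]'; // repeated object visit -> placeholder
      seen.add(val);
    }
    return val;
  });
}

function safeSerializeMeta(meta: Record<string, unknown>): string {
  try {
    return stringifyWithCircularGuard(meta);
  } catch {
    // Whole-object pass failed (e.g. a throwing toJSON): fall back per field so
    // scalar keys always land and only the offending field is replaced.
    const parts = Object.entries(meta).map(([key, val]) => {
      try {
        return `${JSON.stringify(key)}:${stringifyWithCircularGuard(val)}`;
      } catch {
        return `${JSON.stringify(key)}:"[Unserializable]"`;
      }
    });
    return `{${parts.join(',')}}`;
  }
}
```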

* 🧹 fix: Address Audit Findings (JSDoc, Case-Insensitive Api-Key, Tests)

Audit review identified several MINOR/NIT items on top of the codex
rounds. This commit closes the actionable ones:

- **JSDoc (#1, #2)**: `extractMetaObject` no longer claims to filter
  underscore-prefixed keys (that filter was removed intentionally for
  MongoDB `_id`). `debugTraverse`'s docblock now describes the three
  code paths (warn/error compact trailer, debug multi-line traversal,
  other levels).
- **Case-insensitive api-key regex (#6)**: `/gi` so the Azure style
  `Api-Key:` / `API-KEY:` also gets redacted. Pre-existing behavior
  was lowercase-only.
- **Consolidated redundant branch (#5)**: `consoleFormat` printf was
  checking `isError || isWarn` twice; merged into one block.
- **Pre-compiled regex (#9)**: `NUMERIC_KEY_RE` moved to module scope.
- **Test coverage (#3, #4)**: Added regression tests for
  - per-field serialization fallback when a value's `toJSON` throws,
  - sensitive strings nested inside metadata objects,
  - the Azure-style `Api-Key:` header.
2026-04-19 21:49:41 -07:00
Danny Avila
ccf3a6c670
📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types (#12735)
* 🐛 fix: accept documented `summarization.trigger.type` values

The Zod schema for `summarization.trigger.type` only accepted
`'token_count'`, but:

- the documentation lists `token_ratio`, `remaining_tokens`, and
  `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and
  silently no-ops on anything else

The result was a double failure: any user following the docs hit a
startup Zod error, and anyone who matched the schema by using
`token_count` got a silent no-op at runtime where summarization never
fired.

Align the schema with the documented, runtime-supported trigger types.

Closes #12721

* 🧹 fix: bound `token_ratio` trigger value to (0, 1]

Per Codex review: the previous schema accepted `value: z.number().positive()`
for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }`
(presumably meant as "80%") passed validation and then silently never fired —
because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always
false. That is exactly the silent-no-op pattern this PR is trying to eliminate.

Switch to a discriminated union so each trigger type has its own value
constraint:

- `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense
- `remaining_tokens`: positive — token counts can be large
- `messages_to_refine`: positive — message counts can be > 1

Added tests for the upper-bound rejection and the inclusive upper bound
(`value: 1` still accepted as a valid "fire at 100%" extreme).
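
A sketch of the discriminated-union shape at this point in the PR (schema variable name assumed; the next commits relax `token_ratio` to `.min(0)` and add `.finite()`/`.int()`):

```ts
import { z } from 'zod';

// Each trigger type carries its own value constraint.
const triggerSchema = z.discriminatedUnion('type', [
  z.object({ type: z.literal('token_ratio'), value: z.number().positive().max(1) }), // (0, 1]
  z.object({ type: z.literal('remaining_tokens'), value: z.number().positive() }),
  z.object({ type: z.literal('messages_to_refine'), value: z.number().positive() }),
]);

triggerSchema.safeParse({ type: 'token_ratio', value: 80 }).success; // false: not a fraction
triggerSchema.safeParse({ type: 'token_ratio', value: 1 }).success;  // true: inclusive bound
```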

* 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range

Per Codex review: `.positive()` rejected `value: 0`, but the docs
describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins
who copy the documented lower bound into their YAML would fail schema
validation at startup.

Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme)
setting — the agents SDK's `usedRatio >= 0` check will fire as soon as
there is anything to refine, which is a legitimate "always summarize
when pruning happens" configuration.

`remaining_tokens` and `messages_to_refine` keep `.positive()`: both
are counts, and `0` there produces no meaningful behavior (the SDK
has an early return for `messagesToRefineCount <= 0`).

* 🐛 fix: preserve `token_ratio` trigger when `value: 0`

Per Codex review: now that the schema accepts `token_ratio: 0`,
`shapeSummarizationConfig` would silently drop it because of a truthy
check on `config?.trigger?.value`. The trigger would disappear and the
runtime would fall back to "no trigger configured" — which fires on any
pruning rather than honoring the explicit ratio.

Switch to `typeof value === 'number'`, which preserves `0` while still
rejecting `undefined`/`null`.

Added a regression test that asserts `{ type: 'token_ratio', value: 0 }`
survives the shaping function untouched.
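
The shape of the guard, sketched:

```ts
type Trigger = { type: string; value?: number | null };

// A truthy check drops a legitimate `value: 0`; the explicit typeof check
// preserves it while still rejecting undefined/null.
function shapeTrigger(trigger?: Trigger): { type: string; value: number } | undefined {
  if (trigger != null && typeof trigger.value === 'number') {
    return { type: trigger.type, value: trigger.value };
  }
  return undefined;
}
```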

* 🧹 fix: reject non-finite trigger values at schema level

Per Codex review: `z.number().positive()` still accepts `Infinity` and
`NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but
the agents SDK guards every trigger path with `Number.isFinite(...)` and
silently returns `false` — summarization never fires while the server
starts cleanly. That is the exact schema/runtime split this PR is trying
to eliminate.

Add `.finite()` to every trigger value. `token_ratio` already had an
implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps
the intent obvious and catches `NaN` (which `.max(1)` does not).

* 🧹 fix: integer counts + targeted token_count migration warning

Two findings from the comprehensive review:

1. `remaining_tokens` and `messages_to_refine` are token/message counts
   and are always integers in the runtime (`Number.isFinite(...)` guards
   already assume integer semantics). `z.number().positive()` accepted
   fractional values like `2.5`, which was semantically confusing and
   would round oddly against the runtime's `>=` / `<=` comparisons. Add
   `.int()` to both count-based branches; `token_ratio` stays fractional.

2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got
   the generic "Invalid summarization config" warning plus a flattened
   Zod error. Detect that specific case in `loadSummarizationConfig` and
   emit a migration-friendly message that names the three valid
   replacements. Export the function so the behavior is unit-testable.

Also added a parameterized passthrough test covering `remaining_tokens`
and `messages_to_refine` shaping, complementing the existing
`token_ratio` coverage.

* 🧹 fix: accurate fallback wording + bare-string trigger test

Two nits from the follow-up audit:

1. The legacy-`token_count` warning claimed "Summarization will be
   disabled," but `shapeSummarizationConfig` treats a missing
   summarization config as self-summarize mode (fires on every pruning
   event using the agent's own provider/model). "Disabled" would
   mislead an admin into stopping investigation. Reword to describe the
   actual fallback and assert the new wording in the spec.

2. Add a regression test for the `trigger: 'bare-string'` YAML case, so
   the `typeof raw.trigger === 'object'` guard is exercised rather than
   implied.

3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message
   is safe in every terminal/aggregator regardless of UTF-8 handling.

* 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union

CI TS check failed: after the schema tightening, `raw.trigger.type` is
narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine"
| undefined`, so the runtime comparison to `"token_count"` is a
TS2367 ("no overlap") error even though that's exactly the comparison
we want for the migration guard.

Widen just that one access via `as { type?: unknown }` so the migration
check reads runtime-shaped YAML input without the type system folding
it back into the narrowed union.
2026-04-19 19:33:52 -07:00
Danny Avila
2358d07b64
📝 fix: Preserve Raw Markdown Formatting on Upload as Text (#12734)
* 🐛 fix: Preserve Raw Markdown on `Upload as Text`

When `RAG_API_URL` is configured, `.md` uploads were sent to the RAG API
`/text` endpoint, which routes Markdown through `UnstructuredMarkdownLoader`
and strips formatting (`#`, `**`, lists, blockquotes). Users expect `Upload
as Text` to preserve raw content - identical bytes in a `.txt` file round-trip
verbatim, while the `.md` came back stripped.

Short-circuit the RAG API call for Markdown files (by MIME type or `.md` /
`.markdown` extension) and read the file verbatim via `parseTextNative`.
Non-Markdown paths are unaffected, and the embedding path (`/embed`) keeps
its existing loader so vector search quality is unchanged.

* 🐛 fix: normalize markdown MIME and accept `text/md`

Addressing review feedback on the `Upload as Text` short-circuit:

- Accept `text/md` in the markdown MIME set (LibreChat treats it as a
  valid markdown type elsewhere, e.g. the artifact-rendering prompt).
- Normalize the incoming MIME type (lowercase + strip parameters) before
  the set lookup so parameterized values like
  `text/markdown; charset=utf-8` and uppercase `TEXT/MARKDOWN` still
  short-circuit. Extensionless uploads relying only on the `Content-Type`
  header would otherwise fall through to the RAG `/text` endpoint and
  lose their markdown formatting.

Extend `text.spec.ts` parametrized cases with `text/md`, parameterized
MIME, uppercase, and whitespace-padded variants.
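
A sketch of the normalization plus set lookup (helper name from the next commit; bodies assumed):

```ts
// "text/markdown; charset=utf-8" -> "text/markdown"; "TEXT/MARKDOWN" -> "text/markdown"
function normalizeMimeType(mimeType: string): string {
  return mimeType.split(';')[0].trim().toLowerCase();
}

// Markdown MIME types named in this commit (may not be the repo's full set).
const MARKDOWN_MIME_TYPES = new Set(['text/markdown', 'text/md']);

MARKDOWN_MIME_TYPES.has(normalizeMimeType('text/markdown; charset=utf-8')); // true
```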

* 🧹 chore: Address Code Review Follow-ups on `Upload as Text` fix

Addressing comprehensive review feedback:

- Debug log now includes filename and MIME type so operators can
  identify which upload triggered the short-circuit without having
  to correlate other logs.
- Expand markdown extension detection beyond `.md` / `.markdown` to
  cover `.mdown`, `.mkdn`, `.mkd`, `.mdwn` (case-insensitive regex).
- Tighten `normalizeMimeType` parameter type from `string | undefined`
  to `string` to match the actual Express.Multer.File type. The
  falsy-check still protects against empty strings at runtime.
- Extend parametrized tests with the most common real-world shapes:
  `text/plain` + `.md` (the MIME most browsers/servers assign),
  the new rare extensions, and empty MIME + `.md` (pure extension
  fallback path).
- Add a positive assertion that `readFileAsString` was called with the
  expected arguments on every short-circuit case, so tests fail loudly
  if the native-parse path ever regresses.

* 🧪 test: Cover `.mdwn` regex branch in Markdown short-circuit

Every other alternation in `MARKDOWN_EXTENSIONS_RE` has at least one
test case (`md`, `markdown`, `mdown`, `mkdn`, `mkd`) but `mdwn` did
not, leaving a typo in that branch undetectable.
2026-04-19 19:31:39 -07:00
Danny Avila
3bd2681272
🪐 fix: Replace $bitsAllSet ACL Queries for Azure Cosmos DB Compatibility (#12736)
* 🐛 fix: replace `$bitsAllSet` ACL queries for Cosmos DB compatibility

Azure Cosmos DB for MongoDB API does not implement the `$bitsAllSet`
query operator, so every permission check against Cosmos DB threw. The
five read paths in `aclEntry.ts` (`hasPermission`, `findAccessibleResources`,
`findPublicResourceIds`, and two sites in `getSoleOwnedResourceIds`) now
fetch candidate entries and apply the bitwise mask in application code.
This matches the existing FerretDB-compatible pattern.

Fixes #12729.

* 🐛 fix: delegate `findPubliclyAccessibleResources` to fixed DB method

`AccessControlService.findPubliclyAccessibleResources` inlined the same
`$bitsAllSet` query as the data-schemas layer, which fails on Azure
Cosmos DB for MongoDB. Delegate to `_dbMethods.findPublicResourceIds`
so a single implementation carries the Cosmos-compatible bitwise logic.

Refs #12729.

* 🐛 fix: move `$bitsAllSet` filter out of remote-agent aggregation

`enrichRemoteAgentPrincipals` used `$bitsAllSet` inside an aggregation
`$match`, which Azure Cosmos DB for MongoDB does not implement. Project
`permBits` through the pipeline and filter for `PermissionBits.SHARE` in
application code. The extra documents fetched are bounded by ACL entries
on a single agent resource, so the cost is negligible.

Refs #12729.

* 🧪 test: rename misleading public-dedup test and add real dedup coverage

The test previously named "returns deduplicated IDs even if the public
principal has multiple entries" only set up a single ACL entry, so it
did not actually exercise deduplication. Split into two tests: one for
the happy path (single entry with required bits), and one that bypasses
`grantPermission`'s upsert via `AclEntry.create` to confirm the
application-layer dedup Map handles genuine duplicates.

Refs #12729.

* 🧪 test: cover SHARE-bit filter in `enrichRemoteAgentPrincipals`

The `$bitsAllSet` match stage in `enrichRemoteAgentPrincipals` previously
guaranteed every aggregation row had SHARE; the Cosmos DB fix moved that
check into a JS `continue` branch with no direct coverage. Add a
dependency-injected unit test that stubs the aggregation with mixed
SHARE / non-SHARE / zero-bit rows and asserts only SHARE holders are
enriched and queued for backfill. Also includes a regression guard that
the `$match` pipeline stage no longer contains a `permBits` filter.

Refs #12729.

* ♻️ refactor: extract `filterByBitsAndDedup` helper for ACL reads

`findAccessibleResources` and `findPublicResourceIds` each inlined the
same bitmask-filter + `Map`-based dedup loop. Lift it into a private
`filterByBitsAndDedup(entries, requiredBits)` helper so the Cosmos-DB
compatible pattern lives in one place. Pure rename/extract — no
behavior change.

Refs #12729.

* 📝 docs: fix stale `$bitsAllSet` references in FerretDB spec

The describe block and header comment in the FerretDB parity spec still
referenced `$bitsAllSet` queries after the Cosmos DB compatibility fix
moved the bit mask into application code. Update the title to
"Bitwise permission queries" and rewrite the header comment to
describe the application-layer behavior being validated.

Refs #12729.

* perf: push permission-mask filter back to the database via `$in`

The original fix for #12729 moved `$bitsAllSet` filtering into application
code, which meant every ACL read fetched the full set of rows for a
principal/resource and filtered in JS. For tenants with large ACL
collections this inflates wire transfer and heap.

Replace the JS filter with `permBits: { $in: permissionBitSupersets(X) }`.
For the 4-bit `PermissionBits` enum the `$in` list is at most 16 values
(8 for a single-bit mask like SHARE). `$in` is indexable and supported by
Azure Cosmos DB for MongoDB, so the filter runs on the server again —
restoring `.distinct('resourceId')` and `findOne()` semantics.

`permissionBitSupersets(requiredBits)` is memoized and exported from
`@librechat/data-schemas`. Callers restored:
  - `hasPermission`: back to `findOne` short-circuit
  - `findAccessibleResources` / `findPublicResourceIds`: back to `.distinct()`
  - `getSoleOwnedResourceIds`: back to the `$match` + `$group` aggregation
  - `enrichRemoteAgentPrincipals`: bit filter back in `$match`, JS `continue` removed

Refs #12729.
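
A minimal sketch of the enumeration (memoization and the `Object.freeze` from the later hardening commit omitted):

```ts
const MAX_PERM_BITS = 15; // all four PermissionBits set; derived from the enum in the real code

// Every value in [0, MAX_PERM_BITS] whose set bits include all of requiredBits —
// exactly the rows $bitsAllSet would match, expressed as an indexable $in list.
function permissionBitSupersets(requiredBits: number): number[] {
  const supersets: number[] = [];
  for (let bits = 0; bits <= MAX_PERM_BITS; bits++) {
    if ((bits & requiredBits) === requiredBits) {
      supersets.push(bits);
    }
  }
  return supersets;
}

permissionBitSupersets(0).length; // 16: every combination
permissionBitSupersets(8).length; // 8: a single-bit mask like SHARE (assuming SHARE = 8)
```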

* 🧪 test: add `$bitsAllSet` vs `$in` parity + perf spec

Introduces `aclEntry.parity.spec.ts` — a side-by-side spec that runs the
legacy `$bitsAllSet` query and the current `$in`-based query against the
same `mongodb-memory-server` fixture and asserts identical output sets for
every affected method (`hasPermission`, `findAccessibleResources`,
`findPublicResourceIds`, `getSoleOwnedResourceIds`) across all 7 meaningful
permBits combinations.

Also logs median wall-clock time for the two query paths over 20 runs on
an 800-entry fixture, with a loose 3x guard against catastrophic
regressions. Initial local numbers: 1.05 ms vs 1.07 ms
(findAccessibleResources), 1.10 ms vs 1.05 ms (findPublicResourceIds).

Refs #12729.

* 🔒 hardening: freeze `permissionBitSupersets` cache + enum-shape guard

Two defensive changes from the comprehensive audit:

* Cached superset arrays are now `Object.freeze`d and the return type is
  `readonly number[]`. Previously the cached arrays were returned by
  mutable reference, so a caller that mutated the result would silently
  corrupt the process-wide cache for every subsequent permission check.
  `Object.freeze` turns that into a loud `TypeError` at the mutation
  site. All existing call sites pass the result directly to Mongoose's
  `$in`, which does not mutate.
* Added a module-load guard `if (MAX_PERM_BITS === 0) throw`. If
  `PermissionBits` is ever refactored to a `const` object or string
  enum, `Object.values(...).filter(isNumber)` would return `[]` and
  `MAX_PERM_BITS` would silently become 0, making every query match no
  rows and breaking every permission check. The guard fails loudly
  instead.

Also collapsed four identical JSDoc lines across `hasPermission`,
`findAccessibleResources`, `findPublicResourceIds`, and
`getSoleOwnedResourceIds` into a single `{@link permissionBitSupersets}`
reference.

Refs #12729.

* 🧪 test: add focused unit tests for `permissionBitSupersets`

The helper is the single point of correctness for every ACL read path
(every query uses `permBits: { $in: permissionBitSupersets(X) }`), so it
warrants direct coverage independent of the higher-level parity and
behavior specs. Six cases added:

* `requiredBits=0` returns all 16 values
* `requiredBits=15` returns `[15]` only
* every returned value is a bitwise superset of `requiredBits`
* full parity against the `$bitsAllSet` definition for every
  `required` in 0..15
* memoization: repeat calls return the same frozen reference
* frozen result throws `TypeError` on mutation attempts

Refs #12729.

* 🧪 test: tighten parity perf guard and document fixture constants

The `expect(currentMs).toBeLessThan(legacyMs * 3 + 50)` form was
dominated by the `+ 50` additive term at typical sub-ms query
latencies — at legacy=1ms, a 50x regression would still pass. Replace
with `Math.max(legacyMs * 5, 50)` so the multiplicative ceiling is
intact once the new path climbs out of the fixed noise floor.

Also added inline rationale for the `FIXTURE_SIZE = 800` and
`PERF_ITERATIONS = 20` constants.

Refs #12729.

* 🧹 chore: remove stale perf-guard comment, hoist rationale to describe

Commit a75f122 left a "Sanity check: A 3x multiplier" comment block
above the first perf guard, and 21852ac layered a second
"Multiplicative-dominant guard" block directly below it. The stale
block described the `legacyMs * 3 + 50` formula that no longer exists,
so both blocks coexisted and contradicted each other.

Delete the stale block from the first test, remove the redundant copy
from the second test, and lift the (now-single) rationale to the
enclosing `describe('performance')` block. Each test is now one
`expect` line — the rationale lives once, at the scope it applies to.

Refs #12729.

* 📝 docs: sharpen MAX_PERM_BITS guard message + test-sort consistency

Two NITs from the follow-up audit:

* The `MAX_PERM_BITS === 0` guard now names the specific refactors that
  could cause it (const object / string enum / all-string shape) and
  gives two concrete remediation paths (rewrite the reducer, or give
  `permissionBitSupersets` an explicit bit-width parameter). Previously
  the message just said "update permissionBitSupersets before
  continuing", which was vague.
* The `requiredBits=15` unit test now applies the same
  `[...result].sort((a, b) => a - b)` normalization as the other tests
  so the set-equality assertions are uniform. The function happens to
  return values in ascending order, but the helper's JSDoc does not
  promise ordering, so sorting before comparing is the correct
  defensive pattern.

Refs #12729.

* 🔒 fix: enforce `permBits <= MAX_PERM_BITS` at the schema level

Codex flagged a real silent regression: `permissionBitSupersets`
enumerates `$in` candidates only in `[0, MAX_PERM_BITS]` (currently 15),
so any ACL row with bits above the enum — e.g. `permBits = 31` from a
future role or a manual DB write — would be excluded from reads that
`$bitsAllSet` would have matched. Current write paths only produce
values in `[0, 15]` via the `PermissionBits` / `RoleBits` enums, so
no live data is affected, but the schema did not prevent out-of-range
writes, so the divergence was reachable.

Fix: add `min: 0`, `max: MAX_PERM_BITS`, and an `isInteger` validator to
the `permBits` field in the AclEntry schema. `MAX_PERM_BITS` is derived
from `PermissionBits` the same way `permissionBitSupersets` computes it,
so when a new bit is added to the enum both the read-side enumeration
and the write-side bound expand together.
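
A minimal sketch of the described bounds, assuming a standard Mongoose
schema definition (validator message is illustrative):

```ts
import { Schema } from 'mongoose';

declare const MAX_PERM_BITS: number; // derived from PermissionBits, 15 today

const aclEntrySchema = new Schema({
  permBits: {
    type: Number,
    min: 0,
    max: MAX_PERM_BITS, // write-side bound expands with the enum
    validate: {
      validator: Number.isInteger,
      message: 'permBits must be an integer',
    },
  },
});
```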

Tests: four new cases cover the upper bound, over-limit, negative, and
non-integer inputs, each asserting `Mongoose.Error.ValidationError`.
Plus updated JSDoc on `permissionBitSupersets` to document the
invariant that the schema now enforces.

Refs #12729.

* ♻️ refactor: hoist `MAX_PERM_BITS` + enum-shape/ceiling guards to shared util

Addresses one MINOR DRY finding and one NIT defensive-guard finding from
the latest audit. `MAX_PERM_BITS` was duplicated in `methods/aclEntry.ts`
and `schema/aclEntry.ts` with identical computations; they could silently
diverge. Move the constant plus both module-load guards to
`src/common/permissions.ts`:

* `MAX_PERM_BITS === 0` guard — fail loudly if `PermissionBits` is ever
  refactored to a `const` object or string enum (applied in both sites
  now, not just methods).
* `MAX_PERM_BITS > 255` guard — circuit-breaker for the `$in` enumeration
  strategy. At 4 bits the list tops out at 16 entries; at 8 bits, 256.
  Beyond that the approach degrades, so fail at module load rather than
  emit an unusably large `$in` list.

Both the schema and the methods file now import from the single
`src/common/permissions.ts`. `readonly number[]` return type and
`Object.freeze` cache-value protection from the prior commits are
preserved.

Refs #12729.

* 🔒 fix: reject out-of-range `requiredBits` without caching (DoS fix)

Codex flagged a real unbounded-cache DoS vector:
`api/server/controllers/agents/v1.js` parses `req.query.requiredPermission`
via `parseInt` and forwards it unvalidated to `findAccessibleResources`,
which eventually calls `permissionBitSupersets(requiredBits)`. Because
the old code memoized *every* distinct input in a process-global Map,
an attacker could flood the cache with unique integers and grow memory
without bound.

Fix: when `requiredBits` is not an integer, is negative, or has any bits
set above `MAX_PERM_BITS`, return a single shared frozen empty array and
do NOT touch the cache. An empty `$in` list correctly matches zero rows
(rows cannot satisfy a bit we do not support), so the response is also
semantically correct — previously it "worked" only because invalid
inputs coincidentally expanded to `[]` too, but at the cost of a cache
entry each time.
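
A sketch of the rejection fast-path under those rules, with
`computeAndCacheSupersets` standing in for the pre-existing happy path:

```ts
declare const MAX_PERM_BITS: number; // 15 today
declare function computeAndCacheSupersets(requiredBits: number): readonly number[];

const EMPTY_SUPERSETS: readonly number[] = Object.freeze([]);

function permissionBitSupersets(requiredBits: number): readonly number[] {
  const invalid =
    !Number.isInteger(requiredBits) ||
    requiredBits < 0 ||
    (requiredBits & ~MAX_PERM_BITS) !== 0; // any bit above the supported range

  if (invalid) {
    // One shared frozen array: `$in: []` correctly matches zero rows, and
    // the memoization Map is never written, so attacker input cannot grow it.
    return EMPTY_SUPERSETS;
  }
  return computeAndCacheSupersets(requiredBits);
}
```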

Five new tests exercise the rejection path: upper-bound overflow, mixed
in-range+out-of-range bits, shared-instance identity across rejected
inputs, a 2000-unique-value cache-growth probe, and a regression check
that legitimate in-range inputs still get memoized normally.

Refs #12729.

* 🧪 test: probe `Map.prototype.set` to prove cache doesn't grow on rejection

Strengthens the DoS cache-growth test the review pass flagged: reference
identity alone allows a hypothetical regression like
`supersetCache.set(requiredBits, EMPTY_SUPERSETS); return EMPTY_SUPERSETS;`
to pass while still leaking one entry per attacker request.

Add a second test that spies on `Map.prototype.set`, records the global
call count before and after a burst of 1500 rejected inputs (500 each of
over-MAX integers, negatives, and non-integers), and asserts the delta
is zero. Manually verified the test has teeth: injecting the
hypothetical regression in the implementation produced exactly 1500
extra `Map.set` calls and the new test failed as expected; reverting
restored the clean state.
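
A minimal Jest sketch of such a probe (input counts reduced for
brevity):

```ts
declare function permissionBitSupersets(requiredBits: number): readonly number[];
declare const MAX_PERM_BITS: number;

it('performs zero cache writes for rejected inputs (Map-write probe)', () => {
  const setSpy = jest.spyOn(Map.prototype, 'set');
  const before = setSpy.mock.calls.length;

  for (let i = 0; i < 100; i++) {
    permissionBitSupersets(MAX_PERM_BITS + 1 + i); // over-max integers
    permissionBitSupersets(-1 - i); // negatives
    permissionBitSupersets(i + 0.5); // non-integers
  }

  // A regression that wrote the shared empty array into the cache would
  // pass reference-identity checks but show up here as a nonzero delta.
  expect(setSpy.mock.calls.length - before).toBe(0);
  setSpy.mockRestore();
});
```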

Renames the existing identity-based test to `(reference identity)` so
its scope is clear alongside the new `(Map-write probe)` companion.

Refs #12729.
2026-04-19 22:28:48 -04:00
Luna
8f39f31c87
🤝 fix: Normalize Empty Handoff Fields to Restore Default Fallback (#12707)
* fix(agents): normalize handoff prompt and parameter values to ensure default instructions

* fix(agents): address lint errors and codex review comments
2026-04-19 11:53:09 -04:00
Danny Avila
b579390287
📦 chore: npm audit & bump @librechat/agents to v3.1.67 (#12710)
* chore: Update package-lock.json with new dependencies and version upgrades

- Added new dependencies for @langchain/anthropic and @langchain/core, including @anthropic-ai/sdk and fast-xml-parser.
- Updated existing dependencies for @librechat/agents, @opentelemetry/api-logs, @opentelemetry/core, and related packages to their latest versions.
- Enhanced integrity checks and licensing information for new and updated packages.

* chore: Update @librechat/agents dependency to version 3.1.66 in package.json and package-lock.json

- Bumped the version of @librechat/agents from 3.1.65 to 3.1.66 across multiple package.json files to ensure consistency and access to the latest features and fixes.

* chore: Update dompurify and fast-xml-parser dependencies to version 3.4.0 and 5.6.0 respectively

- Bumped the version of dompurify across multiple package.json files to ensure consistency and access to the latest features and security fixes.
- Updated fast-xml-parser to the latest version in relevant package.json files for improved functionality.

* chore: Update @librechat/agents dependency to version 3.1.67 in package.json and package-lock.json

- Bumped the version of @librechat/agents from 3.1.66 to 3.1.67 across multiple package.json files to ensure consistency and access to the latest features and fixes.
2026-04-16 22:44:00 -04:00
Danny Avila
034b672d0c
🫧 feat: Claude Opus 4.7 Reasoning Visibility (#12701)
* 🫧 fix: Restore Claude Opus 4.7 Reasoning Visibility

Claude Opus 4.7 omits `thinking` content from Messages API responses by
default — empty thinking blocks still stream, but the `thinking` field is
blank unless the caller passes `display: "summarized"` in the adaptive
thinking config. This silenced the LibreChat "Thoughts" UI for Anthropic
(and Anthropic-on-Bedrock) adaptive models.

- Extend `ThinkingConfigAdaptive` in `packages/api/src/types/anthropic.ts`
  with an optional `display: 'summarized' | 'omitted'` field
- Emit `{ type: 'adaptive', display: 'summarized' }` from
  `configureReasoning` in `packages/api/src/endpoints/anthropic/helpers.ts`
- Emit `{ type: 'adaptive', display: 'summarized' }` from
  `bedrockInputParser` in `packages/data-provider/src/bedrock.ts` and
  update the local `ThinkingConfig` union
- Update existing adaptive-thinking assertions to include the new field
- Add dedicated tests asserting `display: 'summarized'` flows through
  both the Anthropic endpoint and the Bedrock parser

See https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7#thinking-content-omitted-by-default

* refactor: Gate `display: summarized` on Opus 4.7+

Narrow the reasoning-visibility opt-in to the models that actually omit
thinking content by default, instead of applying it to every adaptive
model. Pre-Opus-4.7 adaptive models (Opus 4.6, Sonnet 4.6) already return
summaries, so sending the field is unnecessary noise.

- Add `omitsThinkingByDefault(model)` in `packages/data-provider/src/bedrock.ts`
  that returns true only for Opus 4.7+ (including future majors like Opus 5+)
- Bedrock parser now only attaches `display: 'summarized'` when the helper
  matches, keeping the adaptive object unchanged for older models
- Anthropic endpoint `configureReasoning` uses the same helper so its emit
  path matches the Bedrock one
- Tests: replace the blanket `display: 'summarized'` assertions with
  model-specific ones (Opus 4.7 gets it, Opus 4.6 / Sonnet 4.6 do not),
  add a dedicated `omitsThinkingByDefault` suite covering naming variants
  and future versions
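
A hedged sketch of such a version gate; the real matcher in
`bedrock.ts` handles more naming variants than this one regex:

```ts
// Hypothetical matcher: true only for Opus 4.7+ (including Opus 5+).
function omitsThinkingByDefault(model: string): boolean {
  const match = /claude-opus-(\d+)(?:[.-](\d+))?/.exec(model);
  if (!match) {
    return false;
  }
  const major = Number(match[1]);
  const minor = Number(match[2] ?? '0');
  return major > 4 || (major === 4 && minor >= 7);
}

// omitsThinkingByDefault('anthropic.claude-opus-4-7') === true
// omitsThinkingByDefault('claude-opus-4-6') === false
```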

* feat: Configurable Thought Visibility for Anthropic Adaptive Models

Expose the Anthropic `thinking.display` API field as a user-facing
parameter so users can override the `auto` default (which stays as the
Opus-4.7+ opt-in added earlier in this PR). Also fixes the CI type error
by widening the adaptive thinking type assignment via a resolver helper
that returns a properly-typed object.

- Add `ThinkingDisplay` enum (`auto` | `summarized` | `omitted`) and
  matching zod schema in `packages/data-provider/src/schemas.ts`
- Add `thinkingDisplay` to `tConversationSchema`, `anthropicSettings`, and
  the pick lists for Bedrock input/parser + Anthropic agent params
- Add `resolveThinkingDisplay(model, explicit)` helper in
  `packages/data-provider/src/bedrock.ts` that returns the wire value or
  undefined (auto → model default, explicit → always honored)
- `bedrockInputParser` now reads `thinkingDisplay` from input and emits
  `display` only when the resolver returns a value; strips the field
  on non-adaptive-model branches so it does not leak
- `configureReasoning` in the Anthropic endpoint threads
  `thinkingDisplay` through, uses the resolver, and casts the adaptive
  config to `AnthropicClientOptions['thinking']` so the widened shape
  compiles against the stale installed SDK types
- Add UI slider for `thinkingDisplay` in `parameterSettings.ts` next to
  `effort`, with three-position `com_ui_auto` / `com_ui_summarized` /
  `com_ui_omitted` labels
- Add translation keys `com_endpoint_anthropic_thinking_display`,
  `com_endpoint_anthropic_thinking_display_desc`, `com_ui_summarized`,
  `com_ui_omitted`
- Add tests: `resolveThinkingDisplay` suite (5 cases covering auto /
  explicit / unknown input), parser round-trip tests for all three
  modes on Opus 4.6 and Opus 4.7, Anthropic endpoint tests for explicit
  summarized/omitted overrides
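
A sketch of the resolver contract described above; types are
illustrative:

```ts
type ThinkingDisplay = 'auto' | 'summarized' | 'omitted';
declare function omitsThinkingByDefault(model: string): boolean;

function resolveThinkingDisplay(
  model: string,
  explicit?: ThinkingDisplay,
): Exclude<ThinkingDisplay, 'auto'> | undefined {
  if (explicit && explicit !== 'auto') {
    return explicit; // an explicit user choice is always honored
  }
  // 'auto': only models that omit thinking by default need the opt-in;
  // undefined keeps the wire payload unchanged for everything else.
  return omitsThinkingByDefault(model) ? 'summarized' : undefined;
}
```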

* fix: Drop `thinkingDisplay` When Adaptive Thinking Is Disabled

If a user turns adaptive thinking off but had previously selected a
`thinkingDisplay` value, the stale field was left in `additionalFields`
and ended up merged into the Bedrock request's
`additionalModelRequestFields`. That leaks a non-Bedrock key into the
payload and can round-trip back into `llmConfig`.

- Delete `additionalFields.thinkingDisplay` alongside `thinking` and
  `thinkingBudget` in the `thinking === false` branch of
  `bedrockInputParser`
- Add a regression test asserting `thinking`, `thinkingBudget`, and
  `thinkingDisplay` are all absent when adaptive thinking is disabled on
  an Opus 4.7 request

Reported by chatgpt-codex-connector on PR #12701.

* refactor: Consolidate `ThinkingDisplay` Types and Preserve Persisted Display

Address review findings on PR #12701:

- [Codex P2] `bedrockInputSchema.transform` now extracts
  `thinking.display` from persisted `additionalModelRequestFields` back
  into the top-level `thinkingDisplay` field so explicit `'omitted'`
  round-trips through storage instead of being silently reverted to
  `'summarized'` on the next parse.
- [Codex P2] `getLLMConfig` in the Anthropic endpoint now reads
  `.display` from a persisted `thinking` object (agents store the full
  Anthropic shape) and uses it as the fallback for `thinkingDisplay`
  when no top-level override is present.
- [Audit #2] Collapse the three parallel wire-value types into a single
  `ThinkingDisplayWireValue = Exclude<ThinkingDisplay, 'auto'>` exported
  from `schemas.ts`; remove the duplicate `ThinkingDisplay` alias in
  `packages/api/src/types/anthropic.ts` (which collided with the enum
  name) and the `ThinkingDisplayValue` alias in `bedrock.ts`.
- [Audit #3] Add `thinkingDisplay` to the `TEndpointOption` pick list
  next to `effort`.
- [Audit #4] Add a TODO comment next to the `as
  AnthropicClientOptions['thinking']` cast explaining the stale
  `@librechat/agents` SDK types that require it.
- Add tests: four round-trip cases asserting `bedrockInputSchema`
  recovers `display` from persisted AMRF (Opus 4.7 omitted, pre-4.7
  summarized, unknown-value ignore, explicit top-level wins), and two
  `getLLMConfig` cases asserting the Anthropic endpoint preserves and
  overrides persisted `thinking.display`.

* fix: Preserve Persisted `thinking.display` in bedrockInputParser

The parser constructed a fresh adaptive thinking config without looking at
any `display` already embedded in the incoming
`additionalModelRequestFields.thinking`. On round-trip through
`initializeBedrock`, a persisted user choice of `'omitted'` on Opus 4.7+
was silently reverted to `'summarized'` by the auto fallback.

- Extract `extractPersistedDisplay` helper and reuse it in both the
  schema transform (form-state round-trip) and the parser (wire-request
  round-trip)
- `bedrockInputParser` now feeds the persisted display as the resolver's
  explicit value when no top-level `thinkingDisplay` override is set
- Add regression tests: parser preserves `display: 'omitted'` for
  persisted Opus 4.7 AMRF, and top-level `thinkingDisplay` still wins
  over persisted AMRF display
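
A minimal sketch of the helper's contract; the persisted AMRF shape
shown is an assumption:

```ts
type PersistedAMRF = { thinking?: { display?: string } };

function extractPersistedDisplay(
  amrf?: PersistedAMRF,
): 'summarized' | 'omitted' | undefined {
  const display = amrf?.thinking?.display;
  // Unknown values are ignored rather than forwarded to the wire.
  return display === 'summarized' || display === 'omitted' ? display : undefined;
}
```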

Reported by chatgpt-codex-connector (P1) on PR #12701.
2026-04-16 21:56:52 -04:00
Danny Avila
e2e3284713
🦉 feat: Claude Opus 4.7 Model Support (#12698)
* 🦉 feat: Claude Opus 4.7 Model Support

- Add `claude-opus-4-7` to shared Anthropic models and `anthropic.claude-opus-4-7` to Bedrock models
- Register 1M context window and 128K max output in anthropic token maps
- Add token pricing ($5/$25), cache rates (6.25/0.5), and premium tier ($10/$37.50 above 200K) in tx.ts
- Update `.env.example` with Opus 4.7 IDs in `ANTHROPIC_MODELS` and `BEDROCK_AWS_MODELS` examples
- Add parallel Opus 4.7 test cases for token/cache/premium rates, context length, max output, name-variation matching, and 1M-context qualification

* feat: Add `xhigh` Effort Level for Opus 4.7

- Add `xhigh` variant to `AnthropicEffort` enum between `high` and `max`
- Expose `xhigh` in `anthropicSettings.effort.options` and the UI slider `enumMappings`
- Reuse existing `com_ui_xhigh` translation key

* test: Cover `xhigh` Effort and Exact Opus 4.7 Premium Rates

- Assert `xhigh` position (between high and max), inclusion in
  `anthropicSettings.effort.options`, zod acceptance, and rejection of
  unknown values in schemas.spec.ts
- Verify bedrockInputParser emits `output_config: { effort: 'xhigh' }`
  for adaptive `anthropic.claude-opus-4-7`
- Verify getLLMConfig sets adaptive thinking and `output_config.effort =
  'xhigh'` for `claude-opus-4-7`
- Pin Opus 4.7 premium pricing to exact threshold/prompt/completion
  values (200000 / 10 / 37.5) so silent rate drift fails the test
2026-04-16 14:51:00 -04:00
Danny Avila
49f228de78
🔼 refactor: Improve UX for Command Popovers (#12677)
* refactor: Improve UX for Command Popovers

* Added loading state handling in Mention and PromptsCommand components to display a spinner when data is being fetched.
* Refactored onFocus logic to clear the textarea and set the search value based on command character input.
* Introduced a new `isLoading` state in the useMentions hook to manage loading indicators across multiple data queries.
* Added unit tests for the useHandleKeyUp hook to ensure command triggering works correctly under various conditions.

* ci: useHandleKeyUp tests for command navigation

* Added tests to ensure that the command popovers do not trigger when the cursor is mid-text after pressing ArrowLeft or Delete.
* Updated the shouldTriggerCommand function to refine the conditions under which commands are triggered based on cursor position.
* Improved agent query handling in useMentions hook for better performance and clarity.

* refactor: Optimize Mention and PromptsCommand Components

* Refactored Mention and PromptsCommand components to utilize Recoil state for popover visibility, improving state management and reducing prop drilling.
* Simplified onFocus logic to enhance user experience when interacting with command inputs.
* Added unit tests for useHandleKeyUp to ensure proper command handling and popover visibility based on user input.
* Improved performance by memoizing popover state and reducing unnecessary re-renders.

* fix: Address review findings for command popover refactor

- Fix endpointType regression: add effectiveEndpointByIndex selector
  that returns endpointType ?? endpoint, matching the original ChatForm
  guard for custom endpoints proxying assistants
- Extract duplicated initInputRef callback into shared useInitPopoverInput
  hook, used by both Mention and PromptsCommand
- Add navigation keys (ArrowLeft, ArrowRight, ArrowDown, Home, End,
  Delete) to invalidKeys to prevent false popover triggers
- Add endpoint gating tests for assistants/azureAssistants blocking the
  + command
- Remove unused _index param from MentionContent
2026-04-15 17:47:24 -04:00
Danny Avila
dd26a2fda5
🧩 style: Agent Side Panel Layout and Consistency Fixes (#12676)
* style: Update padding in ActionsPanel, ModelPanel, and AdvancedPanel for improved layout

* Adjusted padding in ActionsPanel, ModelPanel, and AdvancedPanel components to enhance visual consistency and layout.
* Changed `py-4` to `pt-2` in the main container of each panel to reduce vertical spacing and improve overall design aesthetics.

* style: Update text size in MCPTool component for improved accessibility

* Changed text size in the MCPTool component to `text-sm` for better readability and consistency across the UI.
* This adjustment enhances the user experience by ensuring that text is appropriately sized for various display settings.

* style: Enhance layout and accessibility in ActionsInput, ActionsPanel, and AgentPanel components

* Updated the layout in ActionsInput to improve flex properties and ensure better responsiveness.
* Refined the structure of ActionsPanel for a more consistent visual hierarchy and added accessibility features.
* Adjusted the AgentPanel form layout for improved usability and streamlined component integration.
* Changed text size in Dropdown component to `text-sm` for better readability across the UI.

* style: Refactor layout in ActionsInput component for improved responsiveness

* Updated the layout of the ActionsInput component to enhance flex properties and ensure better responsiveness.
* Adjusted the textarea styling for improved usability and consistency in design.
* Removed commented-out code related to example functionality to clean up the component structure.

* refactor: Simplify single line code detection in MarkdownComponents

* Introduced a new utility function `isSingleLineCode` to streamline the logic for determining if code is a single line.
* Updated references in the `MarkdownCode` and `MarkdownCodeNoExecution` components to use the new utility function for improved readability and maintainability.
* Enhanced the `processChildren` function in the Markdown editor to handle non-code elements more effectively.
2026-04-15 14:27:13 -04:00
kojinseok-del
edd4c6d60c
💎 fix: Handle usage_metadata in Title Transaction for Gemini Models (#12386)
* fix: handle `usage_metadata` in title transaction for Gemini models

Gemini models return token usage via `usage_metadata` instead of `usage`
or `tokenUsage`. The `collectedUsage` mapping in `titleConvo` only
handled the latter two, causing title generation transactions to be
silently skipped for Gemini. Adds an `else if (item.usage_metadata)`
branch to extract `input_tokens`/`output_tokens`.
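
A hedged sketch of the described branch; the field names on the pushed
record are assumptions:

```ts
type TitleUsageItem = {
  usage?: { prompt_tokens: number; completion_tokens: number };
  tokenUsage?: { promptTokens: number; completionTokens: number };
  usage_metadata?: { input_tokens: number; output_tokens: number };
};

declare const collectedUsage: Array<{ promptTokens: number; completionTokens: number }>;

function collectItem(item: TitleUsageItem): void {
  if (item.usage) {
    // existing handling elided
  } else if (item.tokenUsage) {
    // existing handling elided
  } else if (item.usage_metadata) {
    // Gemini reports token counts here instead of usage/tokenUsage.
    collectedUsage.push({
      promptTokens: item.usage_metadata.input_tokens,
      completionTokens: item.usage_metadata.output_tokens,
    });
  }
}
```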

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove trailing whitespace in usage_metadata handler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:09:37 -04:00
Danny Avila
6183303653
🔉 fix: Normalize audio MIME types in STT format validation (#12674)
* fix: normalize audio MIME types in STT format validation

Use getFileExtensionFromMime() to normalize non-standard MIME types
(e.g. audio/x-m4a, audio/x-wav, audio/x-flac) before checking against
the accepted formats list in azureOpenAIProvider. This is the same class
of bug as #12608 (text/x-markdown), but for STT audio validation.

Only audio/ and video/ MIME prefixes are normalized to prevent
non-audio types from matching via the webm default fallback.

Export getFileExtensionFromMime for testability.

Fixes #12632

* fix: reject unknown audio subtypes in STT format validation

Use MIME_TO_EXTENSION_MAP for normalization instead of
getFileExtensionFromMime() which falls back to 'webm' for unrecognized
types. Gate raw subtype matching on audio/video prefix to prevent
non-audio types (e.g. text/webm) from passing validation.

Resolves Codex review comment about unknown subtypes silently passing.
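
A sketch of the combined rules, with an abbreviated map and a
hypothetical function name:

```ts
// Map contents abbreviated; the real MIME_TO_EXTENSION_MAP is larger.
const MIME_TO_EXTENSION_MAP: Record<string, string> = {
  'audio/x-m4a': 'm4a',
  'audio/x-wav': 'wav',
  'audio/x-flac': 'flac',
};

function normalizeSttFormat(mimeType: string): string | undefined {
  const mapped = MIME_TO_EXTENSION_MAP[mimeType];
  if (mapped) {
    return mapped;
  }
  const [prefix, subtype] = mimeType.split('/');
  // Only audio/video subtypes may fall through to raw matching, so
  // something like text/webm can never pass validation.
  return prefix === 'audio' || prefix === 'video' ? subtype : undefined;
}
```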

---------

Co-authored-by: Tobias Jonas <t.jonas@innfactory.de>
2026-04-15 09:58:07 -04:00
Danny Avila
76e9543f99
🧹 chore: Cap PR Indexes at 3 and Add Delete-Before-Sync (#12672)
* fix: add docker system prune before image pull to prevent disk exhaustion

The 60GB droplet filled up after ~40 deploys because each
docker compose pull leaves the previous image's layers as
dangling/unused. The gitnexus image is ~700MB, so ~40 stale
copies ≈ 28GB of dead layers. Combined with indexes, OS, and
Docker's build cache, the disk hits 100% and the next pull fails
with 'no space left on device'.

Add a docker system prune -af --volumes BEFORE pulling the new
image on every deploy. This removes stopped containers, unused
networks, all images not referenced by a running container, and
build cache. Running containers are never touched. Typically
frees 1-2GB per deploy (the previous image's layers).

Also add a hard 2GB free-space guard after prune so the deploy
fails with a clear error instead of letting docker pull attempt
a 700MB extract onto a near-full disk.

* fix: cap PR indexes at 3 + delete-before-sync for 10GB disk

The 10GB droplet has ~2GB free. Each index is ~130MB, so 7 PR indexes
(~900MB) plus main+dev (~260MB) plus the ~700MB Docker image leaves
almost nothing for image pulls. The deploy failed with 'no space left
on device' during docker compose pull.

Three changes:

1. Cap PR indexes at MAX_PR_INDEXES=3. The resolve step now sorts
   PR artifacts by created_at descending and only keeps the 3 most
   recent. Older PR indexes are logged as evicted and their droplet
   folders get cleaned by the prune step.

2. Prune BEFORE sync (was after). Freeing disk space from evicted
   indexes before rsyncing new data is critical on a tight disk. The
   old order (sync then prune) could briefly hold both old evicted
   indexes and newly-uploaded ones simultaneously.

3. Delete-before-sync for every index, including main/dev. Instead
   of rsync --delete (which transfers new files then removes extras),
   rm -rf the target folder before rsync so the disk never holds both
   old and new copies of the same index (~260MB saved per index).
   Main/dev are only deleted when a fresh artifact is about to replace
   them — never evicted between deploys.

Budget on 10GB disk:
  OS + Docker engine:    ~4.0 GB
  Docker image (running): ~0.7 GB
  main + dev indexes:    ~0.26 GB
  3 PR indexes:          ~0.39 GB
  Docker prune headroom: ~0.7 GB (for image pull)
  Free:                  ~3.9 GB

* refine: restrict automatic PR indexing to danny-avila authored PRs

With 200+ open PRs and a 10GB disk capped at 3 served PR indexes,
auto-indexing every contributor PR burns CI minutes for artifacts
that will mostly be evicted before anyone queries them.

Narrow the pull_request auto-trigger to PRs authored by danny-avila
only. Other contributors' PRs can still be indexed on demand via
/gitnexus index (contributor-gated comment command) or manual
workflow_dispatch — both arrive as workflow_dispatch events and
bypass the pull_request filter entirely.

* fix: drop --volumes from docker system prune to preserve Caddy TLS state

The deploy workflow explicitly handles a caddy-not-running state later
in the same step. If Caddy is stopped when the prune runs, --volumes
deletes the caddy-data and caddy-config volumes (TLS certs + ACME
account keys), forcing a Let's Encrypt re-issuance on next start.
LE rate-limits to 5 certs per domain per week, so repeated wipes
could brick HTTPS for days.

docker system prune -af (without --volumes) still removes stopped
containers, unused networks, all dangling/unreferenced images, and
build cache — which is where the disk savings come from. Named
volumes are left untouched.

* fix: rsync-then-swap instead of delete-before-sync

The delete-before-sync pattern removed the live index BEFORE rsync
ran. If rsync failed (SSH timeout, disk pressure, network error),
the index was already gone — production served nothing for that
repo until a later deploy succeeded.

Replace with rsync-then-swap: upload to a .new temp directory, and
only rm + mv into place after rsync succeeds. On rsync failure,
the .new temp is cleaned up and the old index stays live. The cost
is ~130MB of extra disk while both old and new coexist, but the
prune step runs first and frees evicted PR indexes, so this fits
comfortably on the 10GB disk.

* fix: fail deploy on main/dev rsync failure, soft-fail PRs only

The rsync-then-swap pattern downgraded ALL failures to a warning,
so the deploy continued even when LibreChat or LibreChat-dev failed
to sync. The job would pull the new image, restart the container,
and report success while serving stale or missing core indexes.

Split by criticality: main/dev rsync failures now exit 1 (aborting
the deploy before the container restart). PR index failures remain
soft-fail with a warning — a missing PR index is inconvenient but
shouldn't take the whole server down.
2026-04-15 09:46:48 -04:00
Ivan
f6b73938af
📝 docs: add RAG_API_URL to .env.example (#12665) 2026-04-15 09:46:29 -04:00
Danny Avila
a613caced3
🧊 fix: In-Memory Endpoint Token Config Cache Isolation (#12673)
* fix: endpoint token config not using shared cache in same process (initializing clients)

* refactor: Update default max context tokens for agent initialization

- Introduced a constant `DEFAULT_MAX_CONTEXT_TOKENS` set to 32000.
- Updated the `initializeAgent` function to use this constant instead of hardcoded values for maximum context tokens, improving maintainability and clarity.

* refactor: shared caching mechanism for token configuration

- Introduced a memoized in-memory cache for Keyv instances to ensure shared access across the same namespace, improving cache efficiency.
- Updated the `standardCache` function to utilize the new in-memory cache for the TOKEN_CONFIG namespace.
- Refactored the `initializeCustom` function to use the `tokenConfigCache` for better cache management.
- Removed redundant tokenCache parameter from `fetchModels` to streamline the function signature.

* fix: match TOKEN_CONFIG TTL and add memoization tests

Pass Time.THIRTY_MINUTES to tokenConfigCache() to match the TTL used
by getLogStores.js, preventing load-order-dependent expiry behavior.

Add 7 automated tests covering in-memory memoization: referential
identity, cross-call-site data sharing, namespace isolation,
first-caller TTL semantics, fallbackStore bypass, and tokenConfigCache
parity with direct standardCache access.
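
A hedged sketch of the memoization described above; constant names are
illustrative:

```ts
import Keyv from 'keyv';

const THIRTY_MINUTES = 30 * 60 * 1000;
const keyvInstances = new Map<string, Keyv>();

function memoizedKeyv(namespace: string, ttl?: number): Keyv {
  let instance = keyvInstances.get(namespace);
  if (!instance) {
    // First caller wins: its TTL becomes the namespace's TTL, which is
    // why tokenConfigCache must pass the same value getLogStores.js uses.
    instance = new Keyv({ namespace, ttl });
    keyvInstances.set(namespace, instance);
  }
  return instance;
}

const tokenConfigCache = () => memoizedKeyv('TOKEN_CONFIG', THIRTY_MINUTES);
```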

* fix: export DEFAULT_MAX_CONTEXT_TOKENS, address review nits

- Export the constant so tests (and future consumers) reference it
  directly instead of hardcoding the numeric value.
- Add independent TTL assertion for tokenConfigCache (R-1 nit).
- Add tokenConfigCache mock to custom/initialize.spec.ts.
2026-04-15 09:41:42 -04:00
Danny Avila
b40e8be7c8
🖼️ fix: Hide Duplicate Image Placeholder During Image Generation (#12654)
* fix: Hide duplicate image placeholder during image generation

* test: Update OpenAIImageGen tests for conditional Image rendering
2026-04-14 07:53:23 -04:00
github-actions[bot]
5d108df665
🌍 i18n: Update translation.json with latest translations (#12646)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-13 22:32:02 -04:00
Danny Avila
60cee6eec7
🔍 fix: Anthropic Web Search Multi-Turn Issue and Attachment Results (#12651)
* 🔍 fix: Improve WebSearch Progress Handling Based on Attachment Results

- Adjusted progress handling in the WebSearch component to treat searches as complete if attachments contain results, addressing issues with server tool calls not receiving completion signals.
- Introduced `effectiveProgress` to reflect the actual state of progress based on the presence of results, enhancing the accuracy of cancellation and completion states.
- Updated related logic to ensure proper handling of search completion and finalization states based on the new progress calculations.

* fix: only override progress when not streaming

During streaming (isSubmitting=true), use actual progress so the
searching/processing/reading states display correctly. Only override
to 1 after streaming completes to prevent the cancelled check from
hiding the component.
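
As a sketch, the override reduces to one guarded expression, where
`hasResults` stands in for the attachment-results check:

```ts
declare const isSubmitting: boolean;
declare const progress: number;
declare const hasResults: boolean;

// While streaming, real progress drives the searching/processing/reading
// states; once streaming ends, any turn with results counts as complete.
const effectiveProgress = !isSubmitting && hasResults ? 1 : progress;
```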

* chore: Update @librechat/agents and mathjs dependencies to latest versions

* chore: Upgrade mathjs dependency to version 15.2.0 across package-lock and package.json files
2026-04-13 15:44:41 -04:00
Phạm Trung Nam
5cc783b8e8
🎯 fix: Preserve Selected Artifact When Clicking Artifact Button (#12601)
* fix: preserve selected artifact when clicking artifact button

* fix: preserve artifact selection on click and during streaming

* fix: remove broken streaming guard, add JSDoc and test

- Remove `userHasManualSelection` guard from effect #3: it cannot
  distinguish manual clicks from system auto-selection, blocking
  auto-advancement to new artifacts during streaming.
- Add JSDoc on `currentArtifactIdRef` explaining why it must not be
  added to effect deps (toggle-close regression).
- Add test verifying auto-advancement during streaming.

* style: use standard multi-line JSDoc format for ref comment

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-13 10:37:30 -04:00
Danny Avila
c4bb41137d
🔑 fix: Clear Stale Client Registration on invalid_client During OAuth Token Refresh (#12643)
* fix: clear stale client registration on invalid_client during token refresh

When a token refresh fails with `invalid_client`, the stored DCR client
registration is no longer valid on the authorization server. The existing
error handler only checked for `unauthorized_client` and returned null,
leaving the stale client_id cached in the database permanently. Every
subsequent token refresh attempt would fail with the same error.

Now when `invalid_client` is detected during refresh:
1. The stale client registration is deleted from the database
2. A `ReauthenticationRequiredError` is thrown to trigger a fresh OAuth
   flow with new dynamic client registration

Also passes `deleteTokens` from MCPConnectionFactory to getTokens() so
the cleanup has access to the token deletion method.
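
A hedged sketch of the refresh error handling; the helper names appear
in these commits, while the plumbing around them is an assumption:

```ts
declare function isInvalidClientMessage(message: string): boolean;
declare class ReauthenticationRequiredError extends Error {}

async function refreshWithCleanup(
  doRefresh: () => Promise<unknown>,
  deleteTokens?: () => Promise<void>,
): Promise<unknown | null> {
  try {
    return await doRefresh();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    if (isInvalidClientMessage(message)) {
      // The stored DCR registration is dead on the authorization server:
      // clear it (and the stale refresh token) so the next attempt
      // re-registers instead of failing forever.
      await deleteTokens?.();
      throw new ReauthenticationRequiredError(message);
    }
    if (message.toLowerCase().includes('unauthorized_client')) {
      return null; // pre-existing behavior for this error class
    }
    throw error;
  }
}
```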

* fix: address review findings for stale client cleanup on token refresh

- Delete stale refresh token alongside client registration on invalid_client (Finding 1)
- Add tests for all new code paths: cleanup, warning, case-insensitivity, cleanup failure (Finding 2)
- Detect all vendor-specific client rejection patterns (client_id mismatch, client not found, unknown client) with case-insensitive matching (Finding 3)
- Use else-if for mutually exclusive error branches (Finding 4)
- Log warning when deleteTokens is not available on client rejection (Finding 6)
- Fix log message to say "attempting to clear" before async cleanup (Finding 7)
- Extract isClientRejectionMessage to shared utility, refactor MCPConnectionFactory.isClientRejection to use it

* fix: address followup review findings

- Extract isInvalidClientMessage (4 stale-client patterns) from isClientRejectionMessage
  to eliminate pattern duplication between utils.ts and tokens.ts (Finding 1)
- Remove redundant staleIdentifier variable, reuse identifier already in scope (Finding 2)
- Separate await from .then() on Promise.allSettled for readability (Finding 3)
- Add dedicated unit tests for isInvalidClientMessage and isClientRejectionMessage (Finding 4)
- Assert error message content in primary invalid_client test (Finding 5)
- Add JSDoc on deleteTokens in GetTokensParams (Finding 6)

---------

Co-authored-by: Mani Japra <mani@muonspace.com>
2026-04-13 10:12:54 -04:00
Danny Avila
7b48203906
🗂️ feat: Sidebar Icon Toggle & New Chat History Switch (#12642)
* 🗂️ feat: Sidebar Icon Toggle & New Chat History Switch

Add collapse-on-active-click for sidebar icons (VSCode-style) and optionally switch to Chat History panel when creating a new chat.

* fix: Address review findings — extract DEFAULT_PANEL constant, add tests

Export DEFAULT_PANEL from ActivePanelContext and use it in ExpandedPanel
instead of hardcoding 'conversations'. Add ExpandedPanel tests covering
NavIconButton collapse toggle and NewChatButton panel switch behaviors.

* fix: Address review — prop-drill setActive, test disabled setting, strengthen assertions

Pass setActive as a prop to NewChatButton instead of subscribing to
ActivePanelContext, avoiding wasted re-renders on every panel switch.
Add negative-path test for switchToHistory=false. Add positive panel
assertions to inactive-icon click tests. Fix import order.
2026-04-13 09:46:38 -04:00
Caesar Isidro Va-ay
9b9a86d17d
🔀 fix: Resolve Action Tools by Exact Name to Prevent Multi-Action Domain Collision (#12594)
* 🐛 fix: resolve Action tools by exact tool name to prevent multi-action collision

When two OpenAPI Actions on the same Agent share a hostname, the second
action's entry overwrote the first in the encoded-domain Map and one
action's tools silently disappeared from the LLM payload. The buggy
resolution loop also used substring matching, which caused similar
shadowing for any encoded-domain prefix overlap.

This change builds a Map keyed on the full tool name
(`<operationId>_action_<encoded-domain>`) directly, mirroring the exact
lookup pattern that getActionToolDefinitions already uses. Each function
in an action's spec gets its own slot, so two actions sharing a hostname
no longer collide. Both the new and the legacy domain encodings are
registered for each function so agents whose stored tool names predate
the current encoding still resolve.

Applied at all three call sites that had the buggy pattern:

  - processRequiredActions (assistants/threads path)
  - loadAgentTools (agent build path)
  - loadActionToolsForExecution (agent execution path)

Adds three regression tests covering both ordering directions and the
execution path. Tests fail without the fix and pass with it.
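
A sketch of the exact-name registration pattern; shapes and delimiter
values are assumptions drawn from this PR's discussion:

```ts
type ActionEntry = { sig: { name: string } };
declare const functionSignatures: Array<{ name: string }>;
declare function makeEntry(sig: { name: string }): ActionEntry;
declare const actionDelimiter: string; // '_action_'
declare const encodedDomain: string;
declare const legacyEncodedDomain: string;

const toolToAction = new Map<string, ActionEntry>();
for (const sig of functionSignatures) {
  // Key on the full tool name <operationId>_action_<encoded-domain>,
  // so two actions sharing a hostname each keep their own slot.
  toolToAction.set(`${sig.name}${actionDelimiter}${encodedDomain}`, makeEntry(sig));
  if (legacyEncodedDomain !== encodedDomain) {
    // Stored tool names that predate the current encoding still resolve.
    toolToAction.set(`${sig.name}${actionDelimiter}${legacyEncodedDomain}`, makeEntry(sig));
  }
}
// Lookup is then an exact match: toolToAction.get(toolName)
```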

* 🐛 fix: Normalize action tool name at lookup + cover assistants path

Follow-up to the multi-action domain collision fix. Addresses PR #12594
review feedback:

**Must-fix #1 — short-hostname lookup mismatch.** The toolToAction map
is keyed on the `_`-collapsed domain, but `agent.tools` and
`currentAction.tool` persist the raw `domainParser(..., true)` output,
which for hostnames ≤ ENCODED_DOMAIN_LENGTH is a `---`-separated string
(e.g. `medium---com`). Exact-match `Map.get()` missed those keys and
silently dropped the tool. Fix: normalize every incoming tool name
through a new `normalizeActionToolName` helper before the lookup in
`loadAgentTools`, `processRequiredActions`, and
`loadActionToolsForExecution`.

**Must-fix #2 — assistants path coverage.** `processRequiredActions`
received the same structural rewrite but had zero tests. Added a
regression test under `multi-action domain collision regression` that
drives two shared-hostname actions through the assistants path and
asserts each tool reaches its own request builder.

**Must-fix #3 — legacy encoding branch coverage.** The
`if (legacyNormalized !== normalizedDomain)` registration was never
exercised by any test. Added a test where `agent.tools` stores the
legacy-format name and asserts it still resolves.

**Should-fix #4 — DRY the registration loop.** Extracted
`registerActionTools({ toolToAction, functionSignatures,
normalizedDomain, legacyNormalized, makeEntry })`. All three call sites
now share the same key-building logic; the key template lives in one
place.

**Should-fix #5 — remove stale optional chaining.** In
`loadActionToolsForExecution`, `functionSignature?.description ?? ''`
became `functionSignature.description` — `sig` is always defined by the
iterator, matching the style of `loadAgentTools`.

**Should-fix #6 — drop unreachable `!requestBuilder` guard.** Entries
in `processRequiredActions` are now pre-built with
`requestBuilder: requestBuilders[sig.name]`, which `openapiToFunction`
always produces alongside the signature, so the guard is dead.

**Should-fix #7 — unwrap `actionSetsData`.** It now holds a bare
`Map` instead of `{ toolToAction }`; the sentinel `!actionSetsData`
check still works because `new Map()` is truthy.

Also added a short-hostname regression test
(`loadAgentTools resolves raw ---separated tool names`) that reproduces
Must-fix #1: it fails against the previous commit (0 create calls) and
passes with the normalization in place.

41 tests, all passing. The 3 new regression tests are under
`multi-action domain collision regression` and cover the assistants
path, the legacy encoding branch, and the short-hostname lookup path.

* 🐛 fix: Tighten registerActionTools key handling and assistants test

Follow-up to d643444 addressing the second review pass on PR #12594.

**ESLint** — Two prettier errors in the spec file (multi-line arrow
function bodies that should fit on one line). Auto-fixed.

**[MINOR] operationId containing `---` → key mismatch.** The lookup
path collapsed every `actionDomainSeparator` sequence in the full tool
name, but the registration path passed `sig.name` through unchanged.
A `---` that survived into an operationId would shift the underscore
boundary at lookup and miss its own key. Fix in `registerActionTools`:
normalize `sig.name` with the same helper so registration and lookup
always agree on the canonical form. `sanitizeOperationId` strips the
characters that produce `---` in practice, so this is theoretical
hardening, not a fix for a known reproducer.

**[MINOR] Same-operationId + same-hostname silent overwrite.** Two
actions sharing both an operationId and a hostname still produced a
silent `Map.set()` overwrite (the new key is identical, so neither the
operationId nor the domain disambiguates). Added a `setKey` helper
inside `registerActionTools` that logs a `[Actions] operationId
collision: ...` warning whenever a key is already present, naming the
overwriting action_id. The silent-overwrite mode from the original bug
cannot reappear under a different disguise without surfacing in the
logs.

**[NIT] processRequiredActions test simulated a runtime crash.**
`mockCreateActionTool` returned a tool with `_call: jest.fn()`, which
resolves to `undefined`. `processRequiredActions` chains
`.then(handleToolOutput).catch(handleToolError)` directly onto that
return, so `undefined.then(...)` threw synchronously and the outer
try/catch funneled the error into `handleToolError`. Creation count
assertions still passed because `createActionTool` runs before the
crash, but the test was silently exercising the failure path. Updated
the global mock to `_call: jest.fn().mockResolvedValue('{"status":"ok"}')`
so the success path runs end-to-end. The assistants regression test
now executes in ~5ms instead of ~90ms, which corroborates that it's
no longer hitting the synchronous throw.

**[NIT] Duplicated rationale comments.** All three call sites carried
multi-line comment blocks restating why we key on the full tool name.
That rationale now lives canonically in `registerActionTools`'s JSDoc;
the inline blocks collapsed to `// See registerActionTools for the
key-shape rationale.` Net -22 lines of comments.

41/41 tests still pass; lint is clean.

* 🐛 fix: Scope tool-name normalization to the encoded-domain suffix

Follow-up to f22228e addressing the Codex P1 on PR #12594.

**Regression.** The previous commit normalized the entire tool name
(`normalizeActionToolName(sig.name)` at registration, full-name
`.replace()` at lookup) to handle operationIds that theoretically
contained `---`. But `openapiToFunction` uses user-supplied
operationIds verbatim and the fallback `sanitizeOperationId` only
strips characters outside `[a-zA-Z0-9_-]`, so specs can legitimately
produce operationIds like `get_foo---bar` and `get_foo_bar` side by
side. Collapsing `---` to `_` across the entire key merged those two
into a single map slot — one silently overwrote the other, and both
tool requests routed to the surviving entry's request builder.

**Fix.** Limit normalization to the encoded-domain portion of the
full tool name, i.e. the substring after the last `actionDelimiter`.
The operationId half is left verbatim, so hyphens-vs-underscores
remain disambiguating. The short-hostname bug (Must-fix #1 from the
original review) is still covered because the `---` → `_` collapse
still happens on the domain suffix where it matters:

```js
const normalizeActionToolName = (toolName) => {
  const delimiterIndex = toolName.lastIndexOf(actionDelimiter);
  if (delimiterIndex === -1) return toolName;
  const prefixEnd = delimiterIndex + actionDelimiter.length;
  const encodedDomain = toolName.slice(prefixEnd);
  return toolName.slice(0, prefixEnd) + encodedDomain.replace(domainSeparatorRegex, '_');
};
```

`registerActionTools` reverts to `sig.name` verbatim — no more
`normalizeActionToolName(sig.name)` ahead of the key build.

**Regression test.** Added
`loadAgentTools distinguishes operationIds that differ only by
---` vs `_`` under `multi-action domain collision regression`. It
loads two actions sharing a hostname whose operationIds are
`get_foo---bar` and `get_foo_bar` respectively, each pointing at a
different path (`/foo-bar`, `/foo_bar`), and asserts that each tool
resolves to its own request builder. Verified the test fails against
f22228e (`hyphenTool` resolves to `/foo_bar` — the sibling's builder)
and passes with this commit.

42/42 tests pass; lint clean.
2026-04-13 09:08:06 -04:00
Airam Hernández Hernández
277fdd2b43
🪪 feat: Optimized Entra ID Group Sync with Auto-Creation (#12606)
* feat: implement optimized Entra group sync with auto-creation

## Changes

### MUST FIX (Critical Issues) - RESOLVED

1. **BUG FIX: Prevent unintended user removal from existing groups**
   - ISSUE: db.syncUserEntraGroups() was called with only missing groups, causing removal
     from all existing Entra groups (full bidirectional sync behavior)
   - SOLUTION: Replaced with db.upsertGroupByExternalId() for each missing group followed
     by single bulkUpdateGroups() to add memberships (race-safe, idempotent)
   - BENEFIT: User memberships correctly maintained for mix of existing + new groups

2. **JSDoc @throws contradiction**
   - ISSUE: JSDoc declared function throws, but implementation catches all errors
   - SOLUTION: Removed @throws from JSDoc - function is best-effort
   - BENEFIT: Prevents unnecessary try/catch in caller code

3. **Missing test for group creation flow**
   - ISSUE: Auto-creating missing Entra groups had no test coverage
   - SOLUTION: Added regression test for mix of existing + new groups scenario
   - BENEFIT: Prevents future regressions on critical path

### SHOULD FIX (Important Improvements) - RESOLVED

4. **E11000 race condition handling**
   - SOLUTION: Upserts are idempotent and race-safe by design
   - BENEFIT: Concurrent logins no longer race each other

5. **Direct Mongoose access instead of db layer**
   - SOLUTION: Added findGroupsByExternalIds() helper to userGroup.ts
   - BENEFIT: Centralized data access, easier to add tenant scoping

6. **Serial DB round-trips on login path**
   - ISSUE: 40+ queries for user with 20 new groups
   - SOLUTION: Promise.all() for parallel upserts + single bulkUpdate
   - BENEFIT: ~10x performance improvement

7. **Graph API 429/503 throttling unhandled**
   - SOLUTION: Retry logic with exponential backoff (1s, 2s delays)
   - BENEFIT: Temporary API issues no longer cause permanent membership loss

8. **Sequential batch requests slow**
   - ISSUE: 200 groups = 10 batches × 200ms = ~2s sequential
   - SOLUTION: Promise.all() with concurrency limit (5 parallel batches)
   - BENEFIT: ~400ms total time
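
A hedged sketch of the upsert-then-bulk-update flow from items 1 and 6;
argument shapes are assumptions:

```ts
declare const db: {
  upsertGroupByExternalId: (
    externalId: string,
    fields: { name: string },
  ) => Promise<{ _id: string }>;
  bulkUpdateGroups: (
    ops: Array<{ groupId: string; addUserIds: string[] }>,
  ) => Promise<void>;
};
declare const missingGroups: Array<{ externalId: string; displayName: string }>;
declare const user: { _id: string };

// Inside syncUserEntraGroupMemberships: idempotent, race-safe upserts
// run in parallel, then one bulk write adds the memberships without
// ever removing the user from groups the sync did not touch.
async function addMissingMemberships(): Promise<void> {
  const groups = await Promise.all(
    missingGroups.map((g) => db.upsertGroupByExternalId(g.externalId, { name: g.displayName })),
  );
  await db.bulkUpdateGroups(groups.map((g) => ({ groupId: g._id, addUserIds: [user._id] })));
}
```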

## Minor Fixes

- Removed dead code check
- PII removal: user._id instead of user.email in logs
- ES6 shorthand fixes
- Style consistency (blank lines)
- Projection optimization

## Verification

- npm run build - success
- npm run test:api - 61/61 passing (+ new regression test)
- npm run lint - no errors
- All feedback from danny-avila resolved

* docs: better JSDoc for the syncUserEntraGroupMemberships method

---------

Co-authored-by: Airam Hernández Hernández <airam.hernandez@intelequia.com>
2026-04-13 08:50:52 -04:00
Danny Avila
55840286d4
🔬 fix: Scope Web Search Results to Own Turn (#12631)
* 🔬 fix: Scope Web Search Results to Own Turn in WebSearch Component

Fix duplicated search results when multiple web_search tool calls occur
in a single response. Each WebSearch instance now displays only its own
turn's results instead of aggregating all turns.

* test: Add WebSearch turn-scoping tests

Cover the regression-prone turn isolation logic: verify each WebSearch
instance renders only its own turn's sources when sharing a SearchContext,
test the attachments-first priority path and the searchResults fallback,
and assert component states (cancelled, error, streaming, complete).

* fix: Use getAllByText for duplicate sr-only + visible text in tests

WebSearch renders status text in both an sr-only aria-live region and
a visible span, causing getByText to fail with multiple matches.

* refactor: Make ownTurn a string to avoid repeated conversions
2026-04-12 18:38:26 -04:00
Danny Avila
546f006e42
💬 feat: Serialize GitNexus Deploys and Post Completion Comments on PR Commands (#12623)
Three related changes that tighten the GitNexus CI/CD loop.

Serialized deploys
- Previous concurrency group was keyed by head ref with cancel-in-progress,
  which let deploys targeting different refs (e.g. main push + PR command)
  run in parallel. That's a data race: the prune-stale-indexes step
  computes active_names up front, so deploy A rsyncing
  /opt/gitnexus/indexes/LibreChat-pr-12580 can collide with deploy B
  pruning the same folder based on a pre-rsync view of the active set.
- Collapse to a single global group gitnexus-deploy with
  cancel-in-progress: false. All deploys queue behind one another.
  A rsync/docker-compose restart is never killed mid-operation.
  The 20-minute job timeout bounds queue depth.

PR completion feedback
- Add a "index complete" comment step in gitnexus-index.yml that
  fires only when inputs.pr_number is set (i.e. the run came via the
  /gitnexus command). Posts success or failure with a link to the
  run and whether embeddings were generated.
- Add a "deploy complete" comment step in gitnexus-deploy.yml that
  handles both trigger paths: workflow_run from a native PR auto-index
  (PR number recovered from the matrix entry whose runId matches the
  trigger run), and workflow_dispatch from the index workflow's bot-
  fallback path (PR number passed through as a new inputs.pr_number).
- Plumb inputs.pr_number through the bot-fallback dispatch in
  gitnexus-index.yml so the deploy workflow knows where to comment
  for command-triggered runs.
- Only comments on the PR that asked for the index, never broadcasts.

Workflow rename
- Drop the "DigitalOcean" suffix from the deploy workflow's display
  name and filename. The platform is still DO (.do/gitnexus/ still
  holds the compose + caddy config) but the workflow itself is
  platform-agnostic in form and the suffix was visual noise.
- File renamed gitnexus-deploy-do.yml -> gitnexus-deploy.yml.
- Concurrency group and all cross-references updated in lock-step.
- permissions at deploy job level now includes pull-requests: write
  so the completion comment can post.
2026-04-11 18:15:56 -04:00
Danny Avila
8cb5c62fa1
🔗 chore: Dispatch GitNexus Deploy When Index Is Bot-Triggered (#12621)
* fix: dispatch deploy from index when triggered by github-actions[bot]

GitHub Actions suppresses workflow_run events for workflow runs whose
triggering actor is GITHUB_TOKEN (to prevent recursive chains). This
means when gitnexus-pr-command.yml uses `gh api workflow_dispatch`
to kick off gitnexus-index.yml, the downstream gitnexus-deploy-do.yml
workflow_run trigger never fires — the PR command indexes the PR but
the new artifact never makes it onto the droplet.

Add a final step in gitnexus-index.yml that dispatches the deploy
workflow directly via API, but ONLY when the triggering actor is
github-actions[bot]. User-triggered runs (push, pull_request, manual
workflow_dispatch from the UI) continue to rely on workflow_run as
before, so we don't double-deploy.

Requires a new actions:write permission at the workflow level for
this dispatch. contents:read is unchanged.

* fix: resolve main/dev indexes by artifact name, not branch run query

The resolve step was querying listWorkflowRuns filtered by branch=main
and branch=dev, then assuming the latest successful run on each branch
produced the expected gitnexus-index-main / gitnexus-index-dev
artifact. That assumption breaks for /gitnexus index command runs:

The PR command workflow dispatches gitnexus-index.yml with ref=main
(because that's where the workflow file lives) and an input pr_number.
The resulting run has head_branch='main' but uploads its artifact as
gitnexus-index-pr-<N>, not gitnexus-index-main. listWorkflowRuns
returns that run as the "latest success on main", the download step
tries to fetch gitnexus-index-main from it, and the API returns
"no artifact matches any of the names or patterns provided".

Fix: resolve all indexes (main, dev, and PRs) through the same
listArtifactsForRepo path the PR discovery already uses. Looks up
the freshest non-expired artifact by name directly, so the run's
head_branch and event type don't matter — if the artifact exists,
we find it; if not, we warn and move on.

Side benefit: the resolution logic is now shorter and consistent
across branches and PRs.

* fix: paginate open PRs and parallelize artifact lookups

The resolve step was capped at 100 open PRs by github.rest.pulls.list's
per_page ceiling — LibreChat has 200+ open at any given time, so the
tail of the PR queue was silently skipped. On top of that, the inner
artifact lookup loop was serial, so even after pagination the resolve
step would take 40-60 seconds on a busy repo (one API call per PR).

- Replace the single-page rest.pulls.list call with github.paginate,
  which follows the Link header across pages and returns the full
  open-PR set regardless of count.
- Drop the 100-PR truncation warning that was a known-limitation
  notice for exactly this case.
- Batch the per-PR artifact lookups into groups of 10 via Promise.all.
  200 PRs now take ~10 seconds instead of ~60, and the burst stays
  well within the authenticated rate limit (5000/hr).
- Add a final core.info summary showing how many of the open PRs
  actually had a servable index artifact, so the log is useful for
  debugging why a specific PR isn't showing up on the droplet.
2026-04-11 14:02:00 -04:00
Danny Avila
990763cbee
🧠 feat: Enable GitNexus Embeddings for Dev Branch and PR Indexes (#12620)
* feat: auto-enable embeddings for dev and PR indexes too

Previously only main branch pushes got --embeddings; dev and
contributor PRs ran graph-only and relied on BM25 search. Semantic
search on those indexes silently returned empty, which defeats the
whole point of serving them to MCP clients.

New logic: every automatic trigger (push to main/dev, pull_request
from contributors) enables --embeddings. Only workflow_dispatch
still respects the explicit input toggle, so operators can run a
fast graph-only re-index when they don't need fresh vectors.

Cost: adds ~3-5 minutes per index run. Acceptable tradeoff for
having semantic search work across all served branches + open PRs
instead of just main.

* refine: gate PR embeddings on unit-test path relevance

Previous version auto-enabled --embeddings on every contributor PR,
which cost ~3-5 min of extra CI per index even on PRs that couldn't
benefit from semantic code search (docs, config, workflow files,
i18n strings, etc.).

New logic mirrors the backend-review.yml and frontend-review.yml
path filters — if a PR doesn't touch api/, client/, or packages/
it won't trigger unit tests and it doesn't need embeddings. The
check queries the GitHub API for the PR's changed file list via
`gh api repos/.../pulls/<N>/files` (paginated for very large PRs)
and enables embeddings only when at least one path matches.

main/dev pushes still always embed. workflow_dispatch still respects
the explicit input toggle, which also covers the /gitnexus index
[embeddings] PR command.

The contributor gate at the job level is unchanged — non-contributor
PRs are still skipped entirely regardless of paths.

* feat: /gitnexus command works for non-contributor and fork PRs

The command workflow already gated on the commenter's author
association (not the PR author's), so a contributor commenting
/gitnexus index on an outside contributor's PR passes the auth
check. But the downstream index workflow checked out the PR's
raw head SHA, which only exists in the fork for cross-repo PRs —
actions/checkout fetches from the base repo's origin and fails.

Switch the command workflow to dispatch with refs/pull/<N>/head
instead of the SHA. GitHub mirrors every PR's head into the base
repo as this ref regardless of whether the PR is from a fork, so
the checkout always resolves.

End result: a contributor can type `/gitnexus index embeddings`
on any PR — including one opened by a first-time contributor from
a fork — and the index (with embeddings, if requested) is built
and served. The contributor takes responsibility for the trust
boundary by typing the command.

Updated the relevant header/inline comments in both workflows so
the next maintainer understands the refs/pull/<N>/head choice and
the commenter-based gating.

* refine: /gitnexus index defaults to embeddings on

A contributor typing the command has already chosen to spend ~5
minutes of CI on a full re-index; they wouldn't invoke the command
just to get a BM25-only result. Flip the default so the short form
`/gitnexus index` produces an embeddings-enabled index.

Modifier semantics:
  /gitnexus index               -> embeddings ON (new default)
  /gitnexus index embeddings    -> embeddings ON (explicit, no-op alias)
  /gitnexus index fast          -> embeddings OFF (opt-out)
  /gitnexus index graph-only    -> embeddings OFF (alias)
  /gitnexus index no-embeddings -> embeddings OFF (alias)

The previous `embeddings` modifier is preserved as a no-op alias so
anyone who learned the earlier form still gets what they expected.
2026-04-11 13:35:29 -04:00
Danny Avila
39fb93f6c4
🏗️ chore: Set Up Docker Buildx for GitNexus Image GHA Cache Export (#12619)
The first GHCR build on main failed with:
  ERROR: Cache export is not supported for the docker driver.

The default docker driver on ubuntu-latest runners can't export
cache to type=gha. docker/setup-buildx-action@v3 without a driver
argument defaults to docker-container, which supports both
cache-from and cache-to. Gated on the same condition as the build
step so it only runs when an image rebuild is actually needed.
2026-04-11 13:18:17 -04:00
Danny Avila
8eab39bc8f
🌊 feat: Add GitNexus DigitalOcean Pipeline with PR Index Serving (#12612)
* feat: migrate GitNexus deployment from Fly.io to DigitalOcean droplet

Fly.io's 1GB machine was pegged at ~900MB memory with load spiking to
2.7 under even modest query load. Moving to a 2GB+ DO droplet that can
take advantage of existing credits.

Architecture change: indexes no longer baked into the image. Instead,
a long-lived image (built only when .do/gitnexus/ changes) is pulled
from GHCR, and the deploy workflow rsyncs .gitnexus/ data into
/opt/gitnexus/indexes/<name>/ on the droplet and restarts only the
gitnexus container. Caddy stays running so TLS certs don't churn.

- Add .do/gitnexus/Dockerfile (same native-addon + extension patch
  layers as the Fly variant, but no COPY indexes/ step)
- Add .do/gitnexus/docker-compose.yml with gitnexus + caddy services
  on an internal bridge network, 1.8GB memory limit, healthcheck
- Add .do/gitnexus/Caddyfile with automatic HTTPS for the configured
  subdomain and bearer token auth for all routes except /health
- Add .do/gitnexus/entrypoint.sh that registers every index mounted
  at /indexes/<name>/.gitnexus at container start, then runs
  gitnexus serve bound to 0.0.0.0 (internal docker network only)
- Add .do/gitnexus/install-extensions.js for LadybugDB FTS/vector
  extension pre-install (workaround for upstream bug)
- Add .github/workflows/gitnexus-deploy-do.yml that builds the image
  only on Dockerfile/entrypoint changes, pushes to GHCR, rsyncs the
  index artifacts to the droplet, and restarts the gitnexus container
- Remove .fly/gitnexus/ and .github/workflows/gitnexus-deploy.yml —
  Fly app will be destroyed after DO deploy is verified working

Required new secrets: DO_HOST, DO_USER, DO_SSH_KEY. GITNEXUS_DOMAIN
and API_TOKEN live in /opt/gitnexus/.env on the droplet itself.

* refactor: prefix deploy secrets with GITNEXUS_ for namespace isolation

Rename DO_HOST -> GITNEXUS_DO_HOST, DO_USER -> GITNEXUS_DO_USER, and
DO_SSH_KEY -> GITNEXUS_DO_SSH_KEY so the secrets are clearly scoped
to the gitnexus deploy and don't collide with any other DigitalOcean
secrets LibreChat might add later.

* feat: serve PR indexes alongside main/dev and add /gitnexus command

The index workflow was already building and uploading per-PR indexes
(gitnexus-index-pr-<N>) for contributor PRs, but the deploy workflow
only consumed main and dev artifacts. PR indexes were sitting in
storage doing nothing. This wires them all the way through to the
live MCP server, with proper cleanup when PRs close.

Deploy workflow changes:
- Drop the branches filter on workflow_run so PR index completions
  also trigger deploys (PR indexes are already contributor-gated
  upstream in gitnexus-index.yml via author_association)
- Resolve all open PRs via the GitHub API, look up each one's latest
  non-expired gitnexus-index-pr-<N> artifact, and serve whichever
  ones exist. PRs without an index artifact are skipped — we don't
  retroactively index anything. (See the sketch after this list.)
- Per-ref concurrency group so rapid pushes to the same PR coalesce
  but different refs still deploy in parallel
- After rsyncing active indexes, prune any /opt/gitnexus/indexes/
  folder that isn't in the active set. Safety net for missed PR
  close events.
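
A TypeScript sketch of the per-PR artifact lookup (names assumed, and
the listing is assumed newest-first; the endpoint supports an
exact-name filter):

  import { Octokit } from '@octokit/rest';

  const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  async function latestPrIndexArtifact(owner: string, repo: string, prNumber: number) {
    const { data } = await octokit.rest.actions.listArtifactsForRepo({
      owner,
      repo,
      name: `gitnexus-index-pr-${prNumber}`, // exact-name filter
      per_page: 10,
    });
    return data.artifacts.find((artifact) => !artifact.expired) ?? null;
  }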

New workflow: gitnexus-cleanup-pr.yml
- Fires on pull_request closed (merged or not)
- SSHs to the droplet, removes /opt/gitnexus/indexes/LibreChat-pr-<N>,
  restarts the gitnexus container

New workflow: gitnexus-pr-command.yml
- Listens for issue_comment events where body starts with /gitnexus
- Contributor gated via author_association
- Supports: /gitnexus index            — index with defaults
            /gitnexus index embeddings — index with --embeddings
- Dispatches gitnexus-index.yml with the PR number and head SHA,
  reacts to the comment with a rocket emoji

Index workflow changes:
- New dispatch inputs pr_number and pr_ref for command-driven runs
- Checkout step uses inputs.pr_ref when set so the PR's head commit
  is analyzed instead of the default branch
- Artifact naming falls back through pr_number -> pull_request number
  -> ref_name, keeping existing behavior for push/PR events
- Concurrency group switches to pr-<N> when dispatched by the command
  so re-runs on the same PR debounce correctly

* chore: remove Fly variant reference from Dockerfile header

The Fly variant no longer exists in the repo, so the comparison
comment is meaningless. Rewritten as a standalone description of
the image's design.

* review: resolve 15 findings from review audit

Critical
- Drop the unused caddy binary from the gitnexus image. Caddy runs in
  its own container in this architecture; installing it inside the
  gitnexus image added ~40-60MB for no reason and contradicted the
  Dockerfile header comment.

Major
- Replace ssh-keyscan TOFU with a GITNEXUS_DO_KNOWN_HOST secret.
  Both deploy and cleanup workflows now pin the droplet's host key
  from a stored value instead of silently trusting whatever the host
  presents at deploy time. Fails the workflow if the secret is empty
  so no one accidentally regresses to TOFU.
- Gate gitnexus-cleanup-pr.yml to same-repo PRs via
  github.event.pull_request.head.repo.full_name == github.repository.
  Fork PR closes no longer produce failed runs when secrets are
  withheld by GitHub. The deploy workflow's stale-folder prune step
  remains the safety net for any fork-contributor indexes.
- Fail fast in entrypoint.sh when main/dev index registration errors.
  Previously `|| echo WARN` swallowed failures so a broken index
  passed the docker healthcheck and the deploy was marked green
  while queries returned empty. PR indexes stay best-effort
  (a corrupt PR index shouldn't take the whole server down).
- Authenticate the droplet with GHCR on every deploy using
  GITHUB_TOKEN, so private GHCR packages work without documentation
  detours or manual docker login on the host. Bootstrap comments
  explain the flow.
- Switch docker-compose caddy.depends_on from the short-form
  (service_started) to service_healthy so Caddy doesn't route to a
  gitnexus container that's still starting (500ms-60s window after
  recreation).

Minor
- Guard the HEAD~1 diff with `git rev-parse --verify HEAD~1` so the
  first-commit and workflow_run-from-PR cases default to rebuild
  instead of silently skipping a legitimately-changed image.
- Move `packages: write` off the workflow-level permissions and onto
  the build-image job. deploy no longer inherits unnecessary GHCR
  write access.
- Skip the SSH session in cleanup-pr.yml when no gitnexus-index-pr-<N>
  artifact ever existed for the PR. Eliminates ~95% of no-op SSH
  round-trips on a busy repo (docs-only PRs, paths-ignored PRs, etc).
- Reload Caddy in-place after config upload with `caddy reload`,
  falling back to force-recreate on reload failure and `compose up`
  on first-time bootstrap. Picks up Caddyfile or env changes without
  losing TLS certs.
- Replace `sleep 5` post-deploy with a real readiness poll against
  docker's health status. Fails the workflow if gitnexus doesn't
  report healthy within 120s, so a broken startup surfaces in CI
  instead of being silently marked green.
- Warn when listPulls hits the 100-item per_page ceiling so a future
  growth spurt past 100 open PRs doesn't silently drop indexes.

Nit
- Tighten NODE_OPTIONS --max-old-space-size from 1536 to 1280MB,
  giving KuzuDB's C++ heap ~512MB of room under the 1792MB cgroup
  limit instead of ~256MB.
- Rewrite the stale "headroom for Caddy" comment in entrypoint.sh
  (Caddy lives in a separate container now).
- Restore load-bearing comments in install-extensions.js explaining
  the @ladybugdb/core path layout and the throwaway-db cache-priming
  pattern.
- Parameterize the docker-compose image reference as
  ${GITNEXUS_IMAGE:-ghcr.io/danny-avila/librechat-gitnexus:latest}
  so forks or pinned version tags can override via /opt/gitnexus/.env.

Deferred
- Finding 12 (memory headroom) addressed partially via the heap cap
  reduction; full profiling of KuzuDB C++ allocations under query
  load deferred to post-deploy monitoring.

* review: resolve 8 follow-up findings from second review pass

Security
- F1: pipe GHCR token via SSH stdin instead of expanding it into the
  remote command string. Previously `"echo '$GH_TOKEN' | docker login"`
  expanded the token locally before SSH sent it as an argument, so the
  live token was briefly visible in /proc/<pid>/cmdline on the droplet
  to any process running as deploy or root. New form uses
  `printf '%s' "$GH_TOKEN" | ssh ... "docker login --password-stdin"`
  so the token only travels through the encrypted SSH stdin pipe.

Reliability
- F2: add json-file log rotation (50m x 3 files) via a YAML anchor
  shared by both services. Default Docker logging is unbounded and
  would eventually fill the 60GB droplet disk.
- F4: set memswap_limit=1792m to match mem_limit. Without this, Docker
  lets the container silently spill onto host swap when KuzuDB's C++
  heap overruns the 1792m RAM budget, turning sub-second graph queries
  into multi-second ones with no alert. Hard OOM-kill is preferable —
  unless-stopped restarts the container, the deploy health poll
  catches it, the failure is explicit.
- F5: extend the post-deploy health poll from 24 iterations (120s) to
  36 iterations (180s) so it clears Docker's own unhealthy-detection
  ceiling (start_period 60s + retries 3 * interval 30s = 150s). A
  container that legitimately takes 125s to warm up would previously
  fail the deploy at 120s while Docker would still report it as
  "starting".

Operability
- F3: document the `--no-deps` escape hatch in the compose file header
  so an operator can restart Caddy during a gitnexus outage without
  being trapped by the service_healthy dependency (e.g. emergency
  Caddyfile fix while gitnexus is thrashing).
- F7: rewrite the misleading service_healthy comment. The old text
  said it prevents 502s "after a restart", implying continuous
  protection. Clarified that depends_on only governs initial compose
  up ordering — during force-recreates Caddy briefly routes to a
  starting gitnexus and the deploy's health poll is the actual guard.
- F6: add `shopt -s nullglob` before the prune loop so an empty
  /opt/gitnexus/indexes directory is an explicit no-op instead of
  relying on the quirk that `rm -rf "*"` (with literal "*") silently
  succeeds. Next reader won't have to recognize the bash default.
- F8: soft-fail PR artifact downloads when the artifact disappeared
  between resolve and download. Main/dev artifact failures stay fatal
  because a missing main/dev index is a real deploy failure, but a
  deleted PR artifact no longer aborts the whole deploy.

* review: resolve 3 NITs from third review pass

- F1: rewrite the printf '%s' comment. The previous version claimed
  docker login --password-stdin rejects trailing newlines, which is
  inaccurate — docker login strips whitespace. The real reason for
  printf over echo is byte-exact output and portability, and the
  token-in-process-table security rationale is already documented
  in the preceding sentences.

- F2: when a PR artifact download soft-fails, the PR's name stays
  in active_names so the prune step keeps the droplet's existing
  copy instead of wiping it (stale > empty). Make this transition
  visible by spelling it out in the ⚠️ message.

- F3: fencepost fix in the health poll. The previous loop ran 36
  iterations and claimed "180s" in the comment, but the final
  iteration exits without a trailing sleep, so the real ceiling
  was 35 * 5s = 175s. Extended to 37 iterations (36 sleeps * 5s
  = 180s) so the comment matches reality.
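
The corrected arithmetic, as a TypeScript sketch (isHealthy stands in
for the droplet-side docker health probe):

  const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

  async function pollHealthy(isHealthy: () => Promise<boolean>): Promise<void> {
    // 37 checks, sleeping only between checks: 36 sleeps * 5s = 180s ceiling.
    for (let attempt = 0; attempt < 37; attempt++) {
      if (await isHealthy()) return;
      if (attempt < 36) await sleep(5000); // no trailing sleep after the last check
    }
    throw new Error('gitnexus did not report healthy within 180s');
  }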
2026-04-11 13:04:46 -04:00
Daniel Lew
b1fee80de4
📑 fix: Alias Mimetype text/x-markdown to text/markdown (#12608)
text/x-markdown is a deprecated form of the markdown MIME type, but
we're seeing that some clients still send it. This change aliases
those uploads to text/markdown so the files are accepted.
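
A minimal sketch of the aliasing, assuming a simple lookup table
(names are illustrative, not the actual implementation):

  const MIME_ALIASES: Record<string, string> = {
    'text/x-markdown': 'text/markdown', // deprecated form -> canonical form
  };

  const normalizeMimeType = (mimeType: string): string => MIME_ALIASES[mimeType] ?? mimeType;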
2026-04-11 08:23:04 -04:00
Danny Avila
70c91f8afd
🩹 chore: Correct lbug-adapter Path for GitNexus Vector Extension Patch (#12609)
The published gitnexus npm package compiles pool-adapter.ts to
dist/mcp/core/lbug-adapter.js, not dist/core/lbug/pool-adapter.js
as I guessed in the previous PR. The sed patch failed the Docker
build because the target file didn't exist.

Point the patch at the correct compiled path and add a grep -c
verification after sed to confirm the replacement actually landed.
2026-04-10 13:59:41 -04:00
Danny Avila
711747a5a0
🔎 fix: Install LadybugDB Extensions and Patch Vector Load for GitNexus Search (#12607)
Two upstream GitNexus 1.5.3 bugs combine to break query() in serve mode:

1. pool-adapter.ts calls LOAD EXTENSION fts but never INSTALL fts. The
   CI-produced .gitnexus/ artifact doesn't include the extension cache
   (~/.kuzu/extension/), so LOAD silently fails in the fresh container
   and all BM25/FTS searches return empty.

2. pool-adapter.ts only loads the FTS extension — it never loads the
   vector extension. Every CALL QUERY_VECTOR_INDEX in semanticSearch
   fails, so hybrid search's semantic leg returns empty too.

Combined, query() returns empty even for exact function names because
both the BM25 and semantic legs of RRF merging have zero inputs.
Meanwhile context()/cypher()/route_map() still work because they use
plain Cypher with no extensions.

Workaround:
- Add install-extensions.js that runs INSTALL fts and INSTALL vector
  against a throwaway database during Docker build, populating
  ~/.kuzu/extension/ with the cached extension binaries
- Sed-patch pool-adapter.js at build time to also LOAD EXTENSION vector
  alongside the existing FTS load, wrapped in try/catch
- Update deploy workflow to copy install-extensions.js into the build
  context
2026-04-10 13:41:55 -04:00
Danny Avila
a121ae3dea
🩺 fix: Capture GitNexus Serve Output and Add Startup Health Check (#12599)
* fix: capture gitnexus serve output and add startup health check

gitnexus serve was backgrounded with no log capture, so crash
reasons were invisible in Fly logs. Now pipes output to stdout
and waits up to 30s for the server to be ready before starting
Caddy, with early exit if the process dies.

* fix: install newer libstdc++ for LadybugDB native addon

@ladybugdb/core prebuilt binary requires GLIBCXX_3.4.32 (GCC 13+)
but node:24-slim ships Bookworm's libstdc++6 which only has 3.4.31.
Pull libstdc++6 from Debian Trixie to satisfy the runtime dependency.

* fix: separate build tools from libstdc++ upgrade to avoid conflict

Installing Trixie's libstdc++6 alongside Bookworm's g++ fails because
the Trixie libc6 transitive dep conflicts with Bookworm's libc6-dev.
Split into two RUN steps: compile native addons first, remove g++,
then upgrade libstdc++ from Trixie with no conflicting packages.

* fix: name repo as LibreChat and add dev branch deploy support

- Use REPO_NAME build arg (default: LibreChat) as the WORKDIR so
  gitnexus registers the index with a proper name instead of "repo"
- Deploy workflow now triggers on both main and dev branch index runs
- Dev branch registers as "LibreChat-dev", main as "LibreChat"
- workflow_dispatch gains a branch selector input

* feat: serve both main and dev indexes from one container + auto-embed main

- Dockerfile now copies multiple indexes from indexes/<name>/.gitnexus
  and registers each with gitnexus, so one server handles both branches
- Deploy workflow downloads latest successful main + dev artifacts in
  parallel and bundles them into a single deploy
- list_repos returns LibreChat and LibreChat-dev; queries target either
  via the repo parameter
- Main branch pushes auto-enable --embeddings for semantic search;
  dev and PRs remain graph-only for speed (opt-in via dispatch input)
- Bump index job timeout to 25m to account for embedding generation

* fix: raise Fly machine memory to 1GB + add swap + cap Node heap

The 512MB machine was getting OOM-killed when gitnexus serve's default
--max-old-space-size=8192 over-committed memory during query spikes.

- Bump VM memory 512mb -> 1gb
- Add 512MB swap file to absorb transient spikes
- Cap Node heap at 768MB via NODE_OPTIONS so it stays within the
  machine's real capacity and leaves headroom for Caddy and the OS
2026-04-10 12:35:11 -04:00
Danny Avila
4f133f8955
v0.8.5-rc1 (#12569) 2026-04-09 20:06:31 -04:00
Danny Avila
f128587bb7
📦 chore: bump axios, @librechat/agents (#12598)
* chore: bump @librechat/agents to v3.1.64

* chore: update axios to version 1.15.0 across multiple packages
2026-04-09 19:48:21 -04:00
UnicronBE
46b529a86a
📩 fix: Restore Primary Action Button Visibility in Light Mode (#12591)
The default <Button> variant (bg-primary / text-primary-foreground)
renders invisible in light mode on the Agent Marketplace 'Start Chat'
button and the Grant Access dialog 'Save Changes' button, making these
primary actions undiscoverable without hovering.

Switch these three affirmative-action buttons to variant='submit',
which uses the hardcoded 'bg-surface-submit text-white' combination
already defined in the Button component specifically to avoid the
contrast issues of the default variant (see the comment above the
'submit' variant in packages/client/src/components/Button.tsx).

No visual change in dark mode; in light mode the buttons now render
as the green submit color with white text, matching the semantic
intent of the action.

Co-authored-by: Timothy Look <timothy.look@pmv.eu>
2026-04-09 18:45:54 -04:00
Marco Beretta
1a83f36cda
📌 feat: Add Pin Support for Model Specs (#11219)
* feat: Enhance favorites functionality to support model specs

* refactor(FavoritesList): reorder imports based on length

* feat: improve favorite modelSpec controller; refactor: useIsActiveItem hook

* refactor: consolidate Favorite type, harden controller, add tests

- Add canonical TUserFavorite type in data-provider, replace three
  duplicate definitions (data-service, data-schemas, store/favorites)
- Consolidate FavoritesController spec validation into single block,
  add return on 500 paths, add maxlength to mongoose sub-schema
- Fix import order in FavoritesList.tsx, merge namespace type imports
  in FavoriteItem.tsx and FavoritesList.tsx
- Add focus-visible:ring-inset on pin buttons to prevent ring clipping
- Add explicit return type and JSDoc on useIsActiveItem hook
- Use props.type for narrowing consistency in FavoriteItem getTypeLabel
- Add 22 backend tests for FavoritesController (spec validation,
  typeCount exclusivity, persistence, GET path)
- Add 40 frontend tests: useFavorites spec methods, useIsActiveItem
  observer lifecycle, ModelSpecItem pin button, FavoriteItem all three
  type branches, FavoritesList spec rendering

* fix: address PR review findings for pin model specs

- Harden backend validation to reject partial cross-type fields
  (e.g. spec+endpoint, agentId+model without endpoint; see the
  sketch after this list)
- Add stale-spec auto-cleanup in FavoritesList mirroring agent cleanup
- Add type="button" to pin buttons in ModelSpecItem/EndpointModelItem
- Fix import order violations in EndpointModelItem and ModelSpecItem
- Remove hollow test, dead key prop, inline trivial helpers
- Fix misleading test description, add onSelectSpec to test mock
- Add return to controller success responses for consistency
- Add 6 backend tests for partial cross-type field validation
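
One way to read the cross-type rules is as a discriminated union;
this sketch is an assumed shape, not the actual TUserFavorite
definition:

  // Assumed shape: exactly one variant, never a mix of fields across variants.
  type TUserFavorite =
    | { type: 'spec'; spec: string }
    | { type: 'agent'; agentId: string }
    | { type: 'model'; endpoint: string; model: string };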

* fix: guard stale-spec cleanup against unloaded startupConfig

Prevents race condition where spec favorites are incorrectly deleted
on cold start before startupConfig has loaded. Mirrors the existing
agentsMap === undefined guard pattern used for stale agent cleanup.

Also adds tests for stale-spec cleanup persistence and fixes namespace
import pattern in FavoritesList.spec.tsx.

* fix: replace nested ternaries with if/else in FavoriteItem

Resolves ESLint no-nested-ternary warnings for name and typeLabel
derivations.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-09 18:37:25 -04:00
Danny Avila
daa8f0ea6b
📂 fix: Respect supportedMimeTypes Config in File Picker Accept Filter (#12596)
* fix: respect supportedMimeTypes config in file picker accept filter

The browser file picker's accept attribute was hardcoded by provider
identity, ignoring the endpoint's supportedMimeTypes from fileConfig.
Users who configured permissive MIME types (e.g., '.*') still saw a
restrictive filter in the upload dialog.

Add isPermissiveMimeConfig utility that detects wildcard patterns in
the endpoint's supportedMimeTypes. When permissive, the file picker
accept attribute is set to empty (unrestricted). Non-permissive
configs retain the existing provider-based defaults.

Closes #12589

* fix: address review findings for isPermissiveMimeConfig

- Use non-standard MIME namespace probe (x-librechat/x-probe) so
  category-wildcard patterns like ^application\/.*$ no longer
  false-positive as permissive
- Add single-line JSDoc to isPermissiveMimeConfig
- Use !== undefined instead of != null (fileType is never null)
- Add endpointFileConfig to dropdownItems useMemo deps to prevent
  stale closure when config changes without endpoint change
- Add tests for broad application and multi-category patterns
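
A sketch of the probe idea (signature assumed): no real config
targets this namespace, so only a true wildcard matches it:

  const PROBE_MIME = 'x-librechat/x-probe';

  function isPermissiveMimeConfig(supportedMimeTypes?: Array<string | RegExp>): boolean {
    if (supportedMimeTypes === undefined) return false;
    // A category wildcard like ^application\/.*$ fails this test; only
    // match-anything patterns (e.g. /.*/) accept the probe namespace.
    return supportedMimeTypes.some((pattern) => new RegExp(pattern).test(PROBE_MIME));
  }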

* fix: wrap handleUploadClick in useCallback to satisfy exhaustive-deps

handleUploadClick is captured inside the dropdownItems useMemo but was
not in its dependency array. Wrap it in useCallback with
endpointFileConfig.supportedMimeTypes as the sole dependency, then
reference the stable callback in the useMemo deps.
2026-04-09 17:43:53 -04:00
Danny Avila
81275ff0e0
⏱️ refactor: User Job Tracking TTL and Proactive Cleanup to Redis Job Store (#12595)
* refactor: Add user job tracking TTL to RedisJobStore

- Introduced a new TTL for per-user job tracking sets, set to 24 hours, so orphaned tracking sets expire instead of accumulating.
- Updated RedisJobStoreOptions interface to include userJobsSetTtl for configuration.
- Modified job creation and deletion methods to manage user job sets effectively, ensuring proper expiration and cleanup.
- Enhanced comments for clarity on the new TTL functionality and its implications for user job tracking.

* fix: Address review findings for user job tracking TTL

- Remove redundant `del(userJobsKey)` in `getActiveJobIdsByUser` that
  raced with concurrent `createJob` on other replicas (Redis auto-deletes
  empty Sets after SREM)
- Guard `userJobsSetTtl: 0` from silently destroying tracking sets
  (`EXPIRE key 0` deletes the key on Redis 7.0+)
- Extract `deleteJobInternal` so `cleanup()` reuses the already-fetched
  userId instead of issuing a redundant HGETALL per stale job
- Add integration tests for TTL behavior, proactive SREM, configurable
  userJobsSetTtl, and TTL refresh on repeated createJob
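
The tracking-set lifecycle, sketched with ioredis (key format and
names assumed):

  import Redis from 'ioredis';

  const redis = new Redis();
  const USER_JOBS_SET_TTL = 24 * 60 * 60; // seconds; configurable as userJobsSetTtl

  async function trackJob(userId: string, jobId: string, ttl = USER_JOBS_SET_TTL) {
    const key = `user-jobs:${userId}`;
    await redis.sadd(key, jobId);
    if (ttl > 0) {
      await redis.expire(key, ttl); // guard: EXPIRE with 0 would delete the key outright
    }
  }

  async function untrackJob(userId: string, jobId: string) {
    // Redis drops empty Sets automatically after the last SREM.
    await redis.srem(`user-jobs:${userId}`, jobId);
  }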

* fix: Address follow-up review findings for RedisJobStore

- Use deleteJobInternal in cleanup() terminal-but-in-running-set path
  to ensure userJobsKey SREM is not skipped
- Clear local caches in deleteJob before the fallible getJob call so
  they are cleaned even on transient Redis errors
- Add proactive SREM tests for aborted and error terminal statuses
- Add test for tenant-qualified user tracking key format

* fix: Preserve completedTtl for non-running jobs in cleanup()

The cleanup() terminal-status branch should only remove tracking set
membership, not delete the job hash. deleteJobInternal bypasses the
completedTtl window that updateJob already applied, causing clients
polling for final status to lose the job data early.
2026-04-09 17:42:54 -04:00
Danny Avila
cc8ce15c38
📂 fix: Enable Hidden File Upload for GitNexus Index Artifact (#12597)
upload-artifact@v4 defaults include-hidden-files to false,
which silently skips the .gitnexus/ directory (dotfile).
Scoped to the .gitnexus/ path so only index files are affected.
2026-04-09 17:02:51 -04:00
github-actions[bot]
0d97f7354a
🌍 i18n: Update translation.json with latest translations (#12588)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-09 15:26:37 -04:00
Danny Avila
7ef03391b5
📦 chore: bump nodemailer to v8.0.5 (#12587) 2026-04-09 09:59:57 -04:00
Danny Avila
452af50eff
🧮 fix: Atomize Redis Event Sequence Counters for Multi-Replica Deployments (#12578)
* fix: atomize Redis event sequence counters for multi-replica deployments

Replace in-memory sequenceCounters Map with shared atomic Redis counter
(INCR via Lua script) so all replicas share an authoritative sequence
source. Make syncReorderBuffer async to read current sequence from Redis
instead of defaulting to 0 on non-publishing replicas.

Closes #12575

* fix: harden emitChunk error handling and syncReorderBuffer race safety

Wrap getNextSequence inside emitChunk's try/catch so a transient Redis
failure is logged-and-swallowed, preserving the non-fatal chunk emission
contract. In syncReorderBuffer, replace pending.clear() with selective
discard of stale entries (seq < currentSeq) followed by
flushPendingMessages, preventing loss of chunks that arrived during the
async GET window.

* fix: address review findings — harden error paths, validate types, DRY tests

- Wrap syncReorderBuffer in try/catch in GenerationJobManager.subscribe()
  so a Redis GET failure degrades gracefully instead of crashing SSE reconnect.
- Validate eval return type in getNextSequence; throw on unexpected type
  instead of silently computing NaN/-1 and dropping all messages.
- Add NaN guard to parseInt in syncReorderBuffer for corrupted Redis values.
- Consolidate two streams loops in destroy() into a single pass.
- Extract createMockPublisher to shared helpers/publisher.ts (DRY).
- Add tests for eval failure and unexpected eval return type in emitChunk.
- Document performance tradeoff (2x RTT per emit) and TTL rationale.

* test: add cross-replica integration tests for sequence desync fix (#12575)

Three real-Redis tests that reproduce the exact multi-replica failure:

- Late subscriber on a different replica syncs to the shared counter
  and receives chunks immediately (no 500ms force-flush).
- Multiple subscribe/unsubscribe cycles across replicas maintain
  correct sequence alignment on every reconnect.
- Shared counter key is cleaned up when the stream is destroyed.

* fix: close syncReorderBuffer race, stop destroy() from nuking shared keys

- Replace selective prune with Math.min(currentSeq, minPendingSeq) so
  chunks that arrive in pending during the async GET window are preserved
  instead of incorrectly pruned. Since unsubscribe() already clears
  pending, any entries at sync time are live messages from the current
  subscription and must not be discarded.
- Remove DEL from destroy() — the sequence key is shared across replicas
  and a shutting-down instance must not nuke it for active publishers.
  Keys expire naturally via their 1-hour TTL; cleanup() handles
  single-stream lifecycle teardown.
- Add race-condition test: pauses GET, injects a message into pending
  during the window, resolves GET, asserts the chunk is delivered.
- Add emitDone/emitError eval failure tests to cover the asymmetric
  error contract (chunk swallows, done/error propagates).
- Add explicit MockPublisher return type to helpers/publisher.ts.

* fix: context-aware syncReorderBuffer to prevent duplicate delivery, silence test logs

The Math.min(currentSeq, minPending) fix from the prior commit correctly
handles the cross-replica race (chunk arriving during async GET), but
causes duplicate delivery in same-replica mode: onAbort subscribes to
the Redis channel during createJob, so chunks published before
subscribe() may arrive via pub/sub AND earlyEventBuffer. The Math.min
logic then flushes the pub/sub copy as a "live" message.

Fix: add a clearPending parameter to syncReorderBuffer. The manager
passes true when earlyEventBuffer was replayed (same-replica: pending
entries are duplicates → clear them) and false/undefined when it was
not (cross-replica: pending entries are live → preserve via Math.min).

- Silence winston logger in stream test files via logger.silent = true,
  following the existing pattern in data-schemas/prompt.spec.ts.
- Remove sequence key DEL from destroy() — shared across replicas,
  a shutting-down instance must not nuke the counter for active
  publishers. Keys expire via TTL; cleanup() handles stream teardown.

* fix: share publisher client in cross-replica tests for Redis Cluster compat

The cross-replica tests created publisherB via (ioredisClient as Redis)
.duplicate(), which produces a plain Redis connection to a single node.
In Redis Cluster, GET/EVAL on this client can't follow MOVED redirects,
so syncReorderBuffer reads null → nextSeq=0 → chunks are buffered.

Fix: share ioredisClient as the publisher for both replicas. It's the
correct Cluster-aware client and handles slot routing automatically.
Only subscriber connections need to be separate (pub/sub requirement).

* fix: increase cross-replica test timeouts and use polling for CI cluster

Replace fixed 300ms delivery waits with polling (50ms intervals, 2s max)
to handle Redis Cluster's cross-node pub/sub broadcast latency in CI.
Increase subscription activation wait from 100ms to 500ms to match the
pattern used by existing same-instance tests.

* chore: silence winston logs in remaining stream test files

Add logger.silent = true to GenerationJobManager, RedisJobStore, and
collectedUsage test files, matching the pattern already applied to
RedisEventTransport and reconnect-reorder-desync tests.

* fix: remove flaky cluster pub/sub tests, fix log suppression for resetModules

Remove two cross-replica transport-level integration tests that are
inherently non-deterministic in Redis Cluster: cluster pub/sub fan-out
is async across nodes, so pre-subscribe PUBLISHes can arrive at the
subscriber's node after SUBSCRIBE takes effect, causing random +/- 1
message counts. The core logic is already covered by deterministic unit
tests (mock publisher) and GenerationJobManager integration tests
(end-to-end with earlyEventBuffer). The cleanup test is retained.

Switch log suppression from logger.silent (which doesn't survive
jest.resetModules) to jest.spyOn(console, 'log').mockImplementation()
for files that use resetModules (GenerationJobManager, RedisJobStore,
collectedUsage). The logger.silent approach remains for files that
don't use resetModules (RedisEventTransport, reconnect-reorder-desync).

* fix: replace Lua EVAL+TTL with plain INCR, preserve live chunks during same-replica sync

P1: syncReorderBuffer with clearPending=true was unconditionally clearing
pending, dropping NEW chunks (seq >= currentSeq) from ongoing generation
that arrived via pub/sub during the async GET. Fix: selectively prune
only entries with seq < currentSeq (duplicates of earlyEventBuffer) and
flush remaining live entries.

P2: The 1-hour TTL on sequence keys could expire mid-stream during long
quiet periods (e.g., slow tool calls), restarting the counter from zero
and causing subscribers to silently drop all subsequent messages. Fix:
remove the Lua EVAL+EXPIRE script entirely and use plain Redis INCR —
no TTL. Keys are cleaned up explicitly by cleanup()/resetSequence()
when streams end. Orphaned keys from crashed processes are a few bytes
each, negligible compared to the production risk of TTL expiry.

- Update mock publisher helper: eval → incr
- Remove "unexpected eval type" tests (not applicable to incr)
- Update remaining eval error tests to reference incr
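
The resulting counter path, sketched with ioredis (key format
assumed):

  import Redis from 'ioredis';

  // INCR is atomic, so every replica draws from the same authoritative sequence.
  async function getNextSequence(redis: Redis, streamId: string): Promise<number> {
    const value = await redis.incr(`stream-seq:${streamId}`);
    return value - 1; // INCR returns the post-increment value; sequences start at 0
  }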

* fix: re-arm flush timeout after syncReorderBuffer when gaps remain

syncReorderBuffer clears flushTimeout before processing, but if pending
still has gaps after flushPendingMessages, no timeout is re-armed.
Without new messages arriving to trigger scheduleFlushTimeout, those
buffered entries sit indefinitely — stalling the stream after reconnect.

* chore: address final review — fix stale docs, rename clearPending, tighten tests

Fix all 10 findings from final review pass:

F1: Fix stale TTL comment in destroy() — no TTL exists after EVAL removal
F2: Fix stale EVAL reference in emitChunk JSDoc → INCR + PUBLISH
F3: Fix syncReorderBuffer JSDoc — describes old broken behavior, not
    the current selective prune
F4: Rename clearPending → pruneStaleEntries for clarity at call sites
F5: Add publish-not-called assertion to INCR failure test
F6: Replace Math.min(...spread) with explicit loop to avoid heap alloc
F7: Remove resetSequence from IEventTransport interface (no external
    callers; implementation detail only)
F8: Consolidate cleanup/resetSequence overlap — cleanup() owns the DEL,
    resetReorderBuffer() (now private) handles buffer state only
F9: Fix import order in reconnect test (value before type imports)
F10: Export MockPublisher interface from helpers/publisher.ts

* chore: fix stale resetSequence references and hadBufferReplay JSDoc

Two comments referenced resetSequence() which was removed in the F7/F8
refactor (now private resetReorderBuffer(), which doesn't DEL the key).
Only cleanup() deletes the Redis key. Also update hadBufferReplay JSDoc
to say "prune stale entries" instead of the pre-refactor "clear pending".

* chore: grammar

* fix: use earlyReplayCount as prune cutoff instead of Redis counter

The boolean pruneStaleEntries flag used currentSeq (from Redis GET) as
the prune threshold, but INCR can advance the counter past a live
chunk's seq during the GET window. Example: earlyEventBuffer held seqs
0-4, generation emits seq 5 during GET, INCR advances counter to 6,
GET returns 6, prune condition 5 < 6 deletes the live chunk.

Fix: pass the earlyEventBuffer replay count (5) as the prune cutoff.
Entries with seq < earlyReplayCount are true duplicates; entries at or
above are live regardless of what the Redis counter reads. After
pruning, the unified Math.min(currentSeq, minPending) logic handles
both same-replica and cross-replica paths correctly.
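
A sketch of that sync step (names assumed):

  function syncFrontier(
    pending: Map<number, unknown>,
    currentSeq: number, // from the async Redis GET
    earlyReplayCount: number, // events already delivered via buffer replay
  ): number {
    for (const seq of [...pending.keys()]) {
      if (seq < earlyReplayCount) pending.delete(seq); // true duplicate of the replay
    }
    let minPending = Infinity;
    for (const seq of pending.keys()) {
      if (seq < minPending) minPending = seq;
    }
    return Math.min(currentSeq, minPending); // next expected sequence
  }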

Add a targeted test that exercises this exact race: pauses GET, emits
seq 5 during the window, resolves GET with counter=6, asserts seq 5
is preserved (would have been dropped with the old boolean approach).

* fix: pass earlyReplayCount for skipBufferReplay path to prune pub/sub duplicates

When skipBufferReplay is true (resume scenario), earlyEventBuffer events
are delivered via the resume sync payload, not replayed directly. But
earlyReplayCount was left at 0, so syncReorderBuffer treated all pending
entries as live — meaning delayed pub/sub copies of those buffered
events could be delivered again, duplicating content.

Fix: capture earlyEventBuffer.length before the skip/replay branch so
the count is always passed to syncReorderBuffer regardless of delivery
method. Seqs 0..earlyReplayCount-1 are pruned as duplicates whether
they were replayed or delivered via sync payload.

* fix: keep nextSeq monotonic in syncReorderBuffer after async GET

handleOrderedChunk can deliver in-order messages and advance nextSeq
during the async GET window. If those messages leave pending empty,
the unconditional nextSeq = currentSeq could regress nextSeq below
its already-advanced value, reopening a delivered gap and causing
subsequent messages to be buffered until force-flush.

Fix: wrap both nextSeq assignments with Math.max(nextSeq, ...) so
syncReorderBuffer never moves the delivery frontier backward.
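
In sketch form (variables named as above):

  function advanceFrontier(nextSeq: number, currentSeq: number, minPending: number): number {
    return Math.max(nextSeq, Math.min(currentSeq, minPending)); // frontier only moves forward
  }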

* fix: cap nextSeq at earlyReplayCount, add 24h safety TTL, add post-GET race test

Finding 1 (CRITICAL): When earlyReplayCount > 0 and pending is empty,
nextSeq was set to currentSeq — but INCR can advance the counter past
a live chunk whose pub/sub hasn't arrived yet. Cap at earlyReplayCount
instead (what was actually delivered), so in-flight chunks are not
skipped. Adds test for the message-arrives-AFTER-GET-resolves scenario.

Finding 3: Add a 24-hour safety-net TTL set once on first INCR only
(val === 1), never refreshed. This caps orphan lifetime from crashed
processes without risking mid-stream counter resets.
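
Sketched with ioredis (the real change logs EXPIRE failures like
other fire-and-forget errors):

  import Redis from 'ioredis';

  async function nextSeqWithSafetyNet(redis: Redis, key: string): Promise<number> {
    const value = await redis.incr(key);
    if (value === 1) {
      // Armed only by the INCR that creates the key, so never refreshed mid-stream.
      redis.expire(key, 24 * 60 * 60).catch(() => {}); // real code logs this failure
    }
    return value - 1;
  }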

Finding 6: Replace setTimeout(100) in cleanup test with polling loop.

Finding 9: Fix variadic DEL mock + add expire mock for the new TTL.

Revert P2 fix (earlyReplayCount for skipBufferReplay path) — when
skipBufferReplay is true, the resume sync payload delivers everything
up to currentSeq, so syncReorderBuffer should trust the Redis counter
as the frontier, not the buffer length.

* chore: log expire failures consistently with other fire-and-forget errors
2026-04-09 09:57:54 -04:00
Danny Avila
632ffbcb87
🧬 fix: Merge Custom Endpoints by Name Instead of Replacing Entire Array (#12586)
* fix: Merge Custom Endpoints by Name Instead of Replacing Entire Array

The DB base config's `endpoints.custom` array was wholesale-replacing
the YAML-derived array, causing YAML endpoint additions to be silently
lost after the first admin panel save. Add path-aware array merging
to `deepMerge` so keyed arrays (matched by `name`) are merged item-by-item
instead of replaced.
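
A minimal sketch of the keyed merge, assuming shallow merging of
matched items (the real deepMerge recurses and applies UNSAFE_KEYS
sanitization):

  function mergeArrayByKey<T extends { name?: string }>(target: T[], source: T[]): T[] {
    const overrides = new Map<string, T>();
    for (const item of source) {
      if (item.name) overrides.set(item.name, item); // duplicate names: last write wins
    }
    const merged = target.map((item) =>
      item.name && overrides.has(item.name)
        ? { ...item, ...overrides.get(item.name)! } // merge matched items in place
        : { ...item }, // unmatched (including keyless) target items are preserved
    );
    const targetNames = new Set(target.map((item) => item.name));
    for (const [name, item] of overrides) {
      if (!targetNames.has(name)) merged.push({ ...item }); // append source-only items
    }
    return merged; // keyless source items never enter `overrides`, so they are dropped
  }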

* fix: Harden mergeArrayByKey — deduplicate, sanitize, and prevent mutation

- Remove redundant sourceOrder array; iterate Map.keys() instead to
  prevent duplicate entries when source contains repeated names.
- Sanitize override-only appended items through deepMerge to enforce
  UNSAFE_KEYS prototype-pollution protection on all code paths.
- Shallow-copy unmatched base items to prevent mutation leak-back.
- Add post-OVERRIDE_KEY_MAP remapping note to ARRAY_MERGE_KEYS JSDoc.
- Add tests: duplicate source names, base mutation safety, multi-priority
  sequential merging of the same custom endpoint.

* fix: Add inline comments and test for keyless source items

- Document keyless item drop behavior in mergeArrayByKey with inline
  comment and matching test case.
- Add last-write-wins comment to deduplication test assertion.
- Clarify path semantics in mergeArrayByKey target-iteration comment.

* fix: Relocate keyless-item comment and test target-side preserve

- Move keyless-item comment to the if-guard where the skip happens and
  clarify that target-side keyless items are preserved, not dropped.
- Add test verifying base items without a name field are kept in output.
2026-04-09 09:38:26 -04:00
github-actions[bot]
72cdeb0437
🌍 i18n: Update translation.json with latest translations (#12583)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-09 07:45:13 -04:00