Commit graph

4053 commits

github-actions[bot]
2427edef90
🌍 i18n: Update translation.json with latest translations (#12711)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-21 11:41:33 -04:00
Danny Avila
d1573a8493
🔒 fix: Validate MCP OAuth Protected Resource Metadata binding (#12755)
* 🔒 fix: Validate MCP OAuth Protected Resource Metadata binding (GHSA-gvpj-vm2f-2m23)

RFC 9728 §3.3/§7.3 requires clients to verify that the `resource` identifier
advertised by an OAuth Protected Resource Metadata document matches the URL
used to fetch it. Without this check, a malicious MCP server can serve
metadata pointing at a legitimate server's `authorization_servers`, causing
LibreChat to obtain an access token for the real server and send it to the
attacker in subsequent API calls.

Validation happens at discovery time so the entire metadata document is
discarded on mismatch — `authorization_servers` on a spoofed document is
equally untrustworthy and is the primary theft vector in the PoC.

Uses the MCP SDK's own `checkResourceAllowed` (origin + path-prefix) for
semantic parity with `selectResourceURL`, an SDK code path LibreChat bypasses.
This is looser than RFC-strict equality (handles common cases like
`/mcp/sse` server vs `/mcp` advertised resource, or trailing-slash
normalization) while still rejecting cross-origin spoofs and same-origin
sibling-path confusion.
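
A minimal sketch of the discovery-time check, assuming the SDK's import path and using an illustrative helper body (the name `assertResourceBoundToServer` appears later in this PR, but this is not the actual implementation):

```ts
// Assumed import path; the commit only confirms the SDK exports checkResourceAllowed.
import { checkResourceAllowed } from '@modelcontextprotocol/sdk/shared/auth-utils.js';

function assertResourceBoundToServer(
  metadata: { resource?: string },
  serverUrl: string,
): void {
  if (!metadata.resource) {
    throw new Error('Protected Resource Metadata has no `resource` identifier (RFC 9728 §3.3)');
  }
  // Origin + path-prefix comparison, for semantic parity with selectResourceURL.
  const allowed = checkResourceAllowed({
    requestedResource: serverUrl,
    configuredResource: metadata.resource,
  });
  if (!allowed) {
    // Reject the whole document: its authorization_servers are equally untrustworthy.
    throw new Error(
      `Advertised resource "${metadata.resource}" is not bound to MCP server "${serverUrl}"`,
    );
  }
}
```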

* 🛡️ fix: Re-validate Resource Binding at MCP OAuth Token Exchange

Adds a defense-in-depth re-assertion of the RFC 9728 §3.3 resource/server
binding inside `completeOAuthFlow`. Flows have a 10-minute TTL, so a flow
initiated under pre-fix (vulnerable) code could still be in-flight at upgrade
time carrying unvalidated resource metadata. Re-checking here closes that
window without requiring operators to flush flow state on deploy.

Also guards against future regressions that might re-introduce unvalidated
paths into the flow-metadata pipeline (GHSA-gvpj-vm2f-2m23).

* test: Address Review Findings on MCP OAuth Resource Validation

Follow-up to GHSA-gvpj-vm2f-2m23 fix. Resolves three reviewer findings:

- Assert `failFlow` is called when `completeOAuthFlow` rejects at
  re-validation — locks in the "no stuck PENDING entry" guarantee that the
  catch block already provides in production code.
- Update the "no resource metadata" warning in `initiateOAuthFlow`: post-fix,
  that branch is only reachable when PRM discovery returned nothing (404,
  network error, server without RFC 9728). A document with a missing
  `resource` field now throws earlier in `assertResourceBoundToServer`, so
  the old "missing 'resource' property" phrasing described a case that can
  no longer reach this branch.
- Add a test for an unparseable `resource` string triggering the
  error-wrapping path in `assertResourceBoundToServer` (verifies the wrapper
  surfaces a descriptive message instead of leaking a raw `TypeError:
  Invalid URL` from the SDK's `new URL()` call).
2026-04-21 05:44:09 -07:00
Danny Avila
7cf8b84778
📦 chore: Update @librechat/agents to v3.1.68 (#12752)
- Bumped the version of @librechat/agents from 3.1.67 to 3.1.68 in multiple package.json files to ensure consistency and access to the latest features and fixes.
- Updated package-lock.json to reflect the new version and maintain dependency integrity.
2026-04-20 17:45:58 -07:00
Danny Avila
8e866e6010
🗺️ fix: Resolve Custom-Endpoint Providers for Summarization (#12739)
* 🔧 fix: resolve custom-endpoint providers for summarization

When `summarization.provider` in `librechat.yaml` is set to a custom-endpoint
name (e.g. `Ollama`), the string was passed verbatim to the agents SDK, which
only knows a fixed set of provider names and threw
`Unsupported LLM provider: Ollama`.

Before shaping the summarization config for the SDK, resolve the provider
through `getProviderConfig`: custom-endpoint labels are remapped to the
underlying SDK provider (e.g. `openAI`) and the endpoint's baseURL/apiKey are
injected into `parameters` so the summarization model reaches the right
backend, even when summarization targets a different custom endpoint than the
main agent.

Unknown names and names with no matching endpoint flow through
unchanged so the SDK can surface a clear error. User-provided credentials and
unresolved env-var references are skipped rather than forwarded, letting the
SDK's self-summarize path reuse the agent's own clientOptions.

Ref: LibreChat Discussion #12614
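
A hedged sketch of the resolution step — `getProviderConfig` is named above, but its signature and return shape here are assumptions:

```ts
interface SummarizationConfig {
  provider: string;
  parameters?: Record<string, unknown>;
}

type ProviderLookup = (name: string) =>
  | { provider: string; baseURL?: string; apiKey?: string }
  | null;

function resolveSummarizationProvider(
  config: SummarizationConfig,
  getProviderConfig: ProviderLookup,
): SummarizationConfig {
  const resolved = getProviderConfig(config.provider);
  if (resolved == null) {
    // Unknown name: flow through unchanged so the SDK surfaces a clear error.
    return config;
  }
  return {
    // Custom-endpoint label remapped to the underlying SDK provider (e.g. "openAI").
    provider: resolved.provider,
    parameters: {
      ...config.parameters,
      // Endpoint credentials injected so summarization reaches the right backend.
      ...(resolved.baseURL != null && { baseURL: resolved.baseURL }),
      ...(resolved.apiKey != null && { apiKey: resolved.apiKey }),
    },
  };
}
```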

* address: widen unresolved-env-var guard, fix test naming

- Reject summarization overrides when the extracted baseURL/apiKey still
  contains any `${...}` placeholder, including prefix/suffix patterns like
  `https://${UNSET}.example.com` that `envVarRegex` (exact-match) missed.
- Rename the "case-insensitive" test to reflect that only `Ollama` is
  normalized via `normalizeEndpointName`; add coverage proving other
  custom-endpoint names match case-sensitively.

* address: use req.config in responses.js; forward full endpoint options

- `responses.js` relied on a module-level `appConfig` set via `setAppConfig`,
  which is never called anywhere. Use `req.config` directly so the
  summarization provider resolver actually runs on the responses route.
- Route the custom endpoint config through `getOpenAIConfig` so summarization
  inherits the same `headers`, `defaultQuery`, `addParams`/`dropParams`, and
  `customParams` transforms (Anthropic/Google/etc.) that `initializeCustom`
  applies for the main agent flow. Strip the stale `model`/`modelName`
  defaults so `summarization.model` still wins.

* address: skip overrides when summarization matches agent endpoint

When `summarization.provider` resolves to the same custom endpoint as the
main agent, rely on the SDK's self-summarize path (which reuses
`agentContext.clientOptions` unchanged) rather than injecting overrides.
Otherwise the shallow spread of `clientOverrides.configuration` would
replace the agent's request-resolved state (dynamic headers, proxy/fetch
options) with yaml-only config.

Only applies when summarization targets a *different* endpoint from the
agent; the yaml config is all we have in that case, so overrides still
flow through.

* address: preserve raw provider when overrides cannot be built

When summarization points at a different custom endpoint than the agent
and we can't resolve the endpoint's credentials (user_provided, or a
still-unresolved `${VAR}` after env extraction), remapping to `openAI`
without overrides would silently route summaries to the default OpenAI
client. Preserve the raw provider name so the SDK raises a clear
"Unsupported LLM provider" error (now also logged, via the agents SDK
defense-in-depth fix) instead of sending traffic to the wrong backend.

* address: resolve endpoint headers and forward PROXY to summarization

- Custom-endpoint `headers` now flow through `resolveHeaders` before
  reaching `getOpenAIConfig`, matching the main agent path. This ensures
  templated values like `${PORTKEY_API_KEY}` or `{{LIBRECHAT_BODY_...}}`
  are substituted for summarization requests instead of being forwarded
  literally.
- `PROXY` env var is now passed into `getOpenAIConfig` so cross-endpoint
  summarization honors outbound proxy dispatchers configured for the rest
  of the deployment.

* address: user summarization parameters win over endpoint defaults

Flip the merge order so `summarization.parameters` from yaml override
`clientOverrides` defaults (which come from `getOpenAIConfig` and always
include `streaming: true` etc.). A user who sets `parameters.streaming:
false` in their config should still see non-streaming summarization for
providers that require it.

* address: review feedback (logging, dead code, DRY, types, deep-merge)

- Log error in the resolveSummarizationProvider catch-all so programming
  bugs in getProviderConfig/getOpenAIConfig/resolveHeaders surface in
  operator logs instead of falling through silently.
- Drop dead `setAppConfig`/`appConfig` infrastructure in responses.js and
  fix adjacent `allowedProviders` reference that also relied on the
  never-initialized module-level appConfig. Uses `req.config` directly.
- Import canonical `normalizeEndpointName` from librechat-data-provider
  instead of duplicating it locally.
- Replace `SummarizationClientOverrides = Record<string, unknown>` with
  an explicit interface covering the known fields.
- Deep-merge `configuration` when user-supplied
  `summarization.parameters.configuration` overlaps the resolved endpoint
  configuration, so user additions (e.g. `defaultQuery`) don't wipe out
  `baseURL`/`defaultHeaders`.
- Wrap `process.env` mutations in test in `try/finally` so a failed
  assertion doesn't leak env state into subsequent tests.
- Drop `as unknown as AppConfig` in test helper; fixture now matches the
  `AppConfig` shape directly using a `Partial<TEndpoint>` union.
- Trim JSDoc that restated the name it was attached to.
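
The merge semantics from the last few items, sketched (shapes assumed; the real code covers more fields):

```ts
type Params = Record<string, unknown> & { configuration?: Record<string, unknown> };

// User-supplied summarization.parameters spread last so they win over endpoint
// defaults (a user's `streaming: false` survives the resolver's `streaming: true`),
// while `configuration` merges one level deep so user additions like `defaultQuery`
// don't wipe out the resolved `baseURL`/`defaultHeaders`.
function mergeSummarizationParameters(overrides: Params, userParams: Params): Params {
  const merged: Params = { ...overrides, ...userParams };
  if (overrides.configuration != null || userParams.configuration != null) {
    merged.configuration = { ...overrides.configuration, ...userParams.configuration };
  }
  return merged;
}
```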

* address: review nits — import order, local test type, conflict test

- Move `import { logger }` up into the package value-imports section so
  it no longer sits between `import type` blocks.
- Replace `as unknown as SummarizationConfig['parameters']` in the
  deep-merge test with a named `TestSummarizationParameters` type and a
  single narrowing cast at the call site, making intent explicit.
- Add a test proving that user-supplied `configuration.baseURL` wins
  over the resolved endpoint baseURL, locking in the deep-merge's
  user-wins-on-conflict semantics that the previous suite only exercised
  additively.
2026-04-20 12:00:46 -04:00
Danny Avila
d2cbd551b7
🤝 fix: Load Handoff Agents for Agents API (#12740)
* 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (#12726)

Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents
into a shared `discoverConnectedAgents` helper in `@librechat/api` and
wires it into the OpenAI-compatible `/v1/chat/completions` and Open
Responses `/v1/responses` controllers. These endpoints previously only
passed the primary agent config to `createRun` while keeping
`primaryConfig.edges` intact, which forced `MultiAgentGraph` into
multi-agent mode without loading the referenced sub-agents and caused
StateGraph to throw "Found edge ending at unknown node <id>".

The discovery helper also filters orphaned edges (deleted sub-agents or
those the caller lacks VIEW permission on), so API users see the same
graceful fallback the chat UI already had.
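
The discovery idea reduces to a BFS over handoff edges; a minimal sketch, with the edge shape and loader signature assumed (the real helper also threads ACL checks, tool contexts, and MCP auth maps):

```ts
type Edge = { from: string | string[]; to: string | string[] };
const toArray = (v: string | string[]): string[] => (Array.isArray(v) ? v : [v]);

// BFS over handoff edges from the primary agent; loadAgent resolves to null when
// the sub-agent is deleted or the caller lacks VIEW permission (it gets skipped).
async function discoverConnectedAgents(
  primaryId: string,
  edges: Edge[],
  loadAgent: (id: string) => Promise<object | null>,
): Promise<Map<string, object>> {
  const discovered = new Map<string, object>();
  const seen = new Set<string>([primaryId]);
  const queue = [primaryId];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const edge of edges) {
      if (!toArray(edge.from).includes(current)) continue;
      for (const next of toArray(edge.to)) {
        if (seen.has(next)) continue;
        seen.add(next);
        const agent = await loadAgent(next);
        if (agent != null) {
          discovered.set(next, agent);
          queue.push(next);
        }
      }
    }
  }
  return discovered;
}
```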

* 🧪 fix: use ServerRequest in discovery spec helpers

CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test
helpers typed `req` as `express.Request`, which is not assignable to
`DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose
`user` is `IUser`). Local jest passed because ts-jest is transpile-only,
but the CI typecheck uses the full compiler.

* 🪲 fix: drop orphan edges on both endpoints, not just `to`

Addresses the P1 codex finding on #12740: `filterOrphanedEdges`
previously only removed edges whose `to` referenced a skipped agent.
Edges whose `from` was a skipped agent — the symmetric case in a
bidirectional graph like `A <-> B` where `B` is deleted or the user
lacks VIEW on it — leaked through to `createRun` and re-triggered
`Found edge ending at unknown node <id>` at StateGraph compile time.

The filter now drops an edge if either endpoint references a skipped
id, and the existing `to`-only test cases were updated to reflect the
stricter behavior. Adds a bidirectional-graph regression test in
`discovery.spec.ts`.

* 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes

Addresses the second P1 codex finding on #12740: the OpenAI-compat
`/v1/chat/completions` and Open Responses `/v1/responses` routes gate
the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`),
but `discoverConnectedAgents` was checking handoff sub-agents against
the looser in-app `AGENT` resource type. That allowed a remote caller
who could reach the orchestrator but had only in-app visibility on a
sub-agent to invoke it via the API — bypassing the remote-sharing
boundary.

Adds an optional `resourceType` param to `discoverConnectedAgents`
(defaulting to `AGENT` for the chat UI path) and passes
`ResourceType.REMOTE_AGENT` from both API controllers so every
discovered sub-agent clears the same sharing boundary enforced at
route entry.

* 🧯 fix: enforce allowedProviders for discovered sub-agents

Addresses the third P1 codex finding on #12740: `discoverConnectedAgents`
forwarded the caller's `endpointOption` verbatim into `initializeAgent`,
but on the OpenAI-compat routes that option's `endpoint` is the primary
agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only
enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)`
is true, so handoff sub-agents silently bypassed the provider allowlist
configured under `endpoints.agents.allowedProviders`.

Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every
per-sub-agent init call. The primary agent still uses the caller's
endpointOption as before — this only affects the BFS-loaded handoff
targets. Regression test asserts the override.

* ✂️ fix: prune unreachable sub-agents after orphan-edge filtering

Addresses the fourth P1 codex finding on #12740: BFS eagerly initializes
every sub-agent referenced in the primary's edge scan, but once
`filterOrphanedEdges` drops edges whose endpoints were skipped, some of
those sub-agents end up disconnected from the primary. In an `A -> B ->
C` graph (edges stored directly on A) where B is skipped (missing or
no VIEW), both edges are filtered, but C was already loaded and would
still be passed to `createRun` — which flips into multi-agent mode on
`agents.length > 1` and turns C into an unintended parallel start node.

After filtering edges, compute the set of agent ids reachable from the
primary through the surviving edge set and prune `agentConfigs` to that
set. Two regression tests added: one for the pruning case, one that
confirms agents connected via surviving edges are still kept.

* 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback

Addresses the fifth P1 codex finding on #12740: `onAgentInitialized`
fires during BFS, BEFORE the helper prunes agents that become
disconnected once `filterOrphanedEdges` runs. Writing the sub-agent
straight into the outer `agentConfigs` there and then only additively
merging the pruned `discoveredConfigs` left stranded entries in the
outer map, and `AgentClient` would still hand them to `createRun` as
extra parallel start nodes (the exact failure mode the pass-4 prune
was meant to eliminate for the API controllers).

Drop the `agentConfigs.set` from the callback and replace the additive
merge with a direct copy from `discoveredConfigs`, which is now the
single authoritative source of what the run should see. The
per-agent tool context map is still populated during BFS — stale
entries there are harmless because they're only read by closure inside
`ON_TOOL_EXECUTE` and are unreachable once the agent is not in
`agentConfigs`.

* 🔬 fix: address audit findings on discovery helper

Resolves findings from a comprehensive external audit of #12740.

**Finding 1 (CRITICAL) — stale edges survive the reachability prune.**
The pass-4 prune removed unreachable agents from `agentConfigs` but left
matching edges in the return value. In an `A -> B -> C -> D` graph (all
edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B
and B->C but keeps C->D (neither endpoint is skipped). The caller then
sees `agentConfigs` without C/D but `edges` still references them,
flipping `createRun` into multi-agent mode with mismatched agents/edges
— the exact crash this PR is supposed to fix. Now filter the edge list
to the reachable set in the same pass, so the returned shape is
self-consistent: every edge endpoint is either the primary id or a key
of `agentConfigs`. New regression test covers A->B->C->D with B skipped.

**Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API
request.** The OpenAI-compat and Responses controllers called
`getModelsConfig(req)` and `discoverConnectedAgents` even when the
primary agent had no edges (the common single-agent API case). Gate
both behind `primaryConfig.edges?.length > 0` so single-agent runs
don't pay that cost.

**Finding 5 (MINOR) — silent mutation of caller's
`primaryConfig.userMCPAuthMap`.** The helper aliased that object and
then `Object.assign`'d sub-agent entries into it, changing the caller's
config in-place. Shallow-clone up front so the returned merged map is
the only destination.

**Finding 7 (NIT) — dead `?? []` coalescing.**
`filterOrphanedEdges` always returns a concrete array, so the
`discoveredEdges ?? []` fallback was never reached. Simplified the
`primaryConfig.edges = …` assignment.

Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not
mutated in-place.

* 🧹 chore: address audit NITs on discovery helper

Addresses two NIT findings from the post-fix audit:

**F1** — the shallow clone on `primaryConfig.userMCPAuthMap` was only
applied on the primary side; the `else` branch (hit when the primary
had no MCP auth and the first sub-agent seeds the map) assigned the
sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's
`Object.assign` mutated the first one's map in place. Harmless in
practice (per-request ephemeral objects) but asymmetric. Clone in the
else branch too. Test added.

**F2** — `initialize.js` had a defensive `if (agentConfigs.size > 0 &&
!edges) edges = []` normalizer. Pre-existing dead code: the helper now
always returns a concrete array from `filteredEdges.filter(...)`.
Removed for clarity.

* 🕸 fix: require all sources reachable when traversing fan-in edges

Addresses the seventh P1 codex finding on #12740: the reachability BFS
advanced through an edge as soon as any of its `from` endpoints matched
the current frontier node (`sources.includes(current)`), but the
subsequent edge filter required ALL sources to be reachable (`every`).
The two-semantics mismatch let a fan-in edge like `{from: ['A','B'],
to: 'C'}` mark C reachable purely via A even when B had no path from
the primary, then drop the edge itself at filter time. Result: C
survived in `agentConfigs` with no surviving edge connecting it to A,
so `createRun` flipped into multi-agent mode on `agents.length > 1`
and C ran as an unintended parallel root.

Replace the BFS with a fixed-point iteration keyed on the same
all-sources-reachable predicate used by the filter, so traversal and
filtering stay aligned and multi-source edges only fire once every
source is in the reachable set.

Two regression tests added:
- `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts
  neither B nor C leak into the result.
- `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires
  and D becomes reachable once both B and C are.

* 🔀 fix: match SDK OR semantics for multi-source edge reachability

Reverts the all-sources-required reachability gate from 4982f1c3b and
replaces it with an any-source-reachable model, which matches how
`@librechat/agents`'s `MultiAgentGraph.createWorkflow` actually wires
multi-source edges at runtime (per-source `builder.addEdge(source,
destination)`). With the previous `every` gate, a legitimate handoff
edge `{ from: ['A', 'B'], to: 'C' }` where B had no incoming path was
pruned along with C, regressing OR-semantics routing that the SDK
would otherwise handle correctly.

New behavior:

1. Reachability: an edge advances when ANY of its `from` endpoints is
   already reachable. Fixed-point iteration over `filteredEdges`.
2. Edge filter: keep an edge when it has at least one reachable source
   AND all destinations are reachable (a missing destination would
   still crash `StateGraph.compile` with `Found edge ending at unknown
   node`).
3. Agent prune: keep agents that are reachable OR referenced on any
   endpoint of a surviving edge. The second clause preserves co-sources
   in multi-source edges (B in `{ from: ['A','B'], to: 'C' }` when
   nothing else reaches B) so the SDK's per-source `addEdge` — and the
   `validateEdgeAgents` safety-net I added to the SDK in #111 — still
   finds B as a node.

The pass-audit A->B->C->D regression test continues to pass: with B
skipped, `filterOrphanedEdges` drops both B-adjacent edges, reachability
never expands past A, C->D has no reachable source so it gets filtered,
and C/D are pruned because they're neither reachable nor referenced.
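
A sketch of the any-source fixed-point iteration under the same assumed edge shape; the real helper interleaves this with the edge filter and agent prune described above:

```ts
type Edge = { from: string | string[]; to: string | string[] };
const toArray = (v: string | string[]): string[] => (Array.isArray(v) ? v : [v]);

// An edge fires when ANY of its sources is already reachable, matching the SDK's
// per-source addEdge wiring. Iterate to a fixed point since edges are unordered.
function computeReachable(seedIds: string[], edges: Edge[]): Set<string> {
  const reachable = new Set(seedIds);
  let changed = true;
  while (changed) {
    changed = false;
    for (const edge of edges) {
      if (!toArray(edge.from).some((id) => reachable.has(id))) continue;
      for (const dest of toArray(edge.to)) {
        if (!reachable.has(dest)) {
          reachable.add(dest);
          changed = true;
        }
      }
    }
  }
  return reachable;
}
```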

* ✂️ fix: strip skipped co-members from multi-source/multi-dest edges

Addresses codex pass-9 P2 on #12740. `filterOrphanedEdges` previously
dropped an edge whenever any `from` id was skipped, which was correct
for scalar edges but over-aggressive for multi-source ones: the agents
SDK adds one `builder.addEdge(source, destination)` per source, so
`{ from: ['A','B'], to: 'C' }` with B skipped still has a valid
`A -> C` route that was being thrown away.

Now sanitize each endpoint:
- Scalar skipped → drop the whole edge (no route survives).
- Array with some skipped → strip the skipped ids, keep the edge with
  the surviving members. If the array empties out, drop the edge.

Symmetric handling for `to` covers multi-destination fan-out when one
co-destination is skipped. Tests updated/added:
- `strips skipped co-sources from multi-source edges…`
- `strips skipped co-destinations from multi-destination edges`
- `drops multi-member edges only when every member on a side is skipped`
- Discovery-side: `preserves valid routes when one co-source of a
  multi-source edge is skipped` asserts the end-to-end behavior —
  skipped co-source B gets stripped from the edge, A->C routing
  survives, and C remains in `agentConfigs`.
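
The per-endpoint sanitization, sketched under the same assumed shapes:

```ts
// Returns the surviving endpoint, or null when no route through it remains.
function sanitizeEndpoint(
  endpoint: string | string[],
  skipped: Set<string>,
): string | string[] | null {
  if (Array.isArray(endpoint)) {
    const surviving = endpoint.filter((id) => !skipped.has(id));
    // Strip skipped co-members; drop the edge only when the whole side empties out.
    return surviving.length > 0 ? surviving : null;
  }
  // Scalar skipped -> no route survives -> drop the whole edge.
  return skipped.has(endpoint) ? null : endpoint;
}
```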

* 🔓 fix: respect SHARE-on-AGENT fallback for handoff ACL on API routes

Addresses codex pass-10 P1 on #12740. The API controllers were handing
`discoverConnectedAgents` a raw `PermissionService.checkPermission` call
against `ResourceType.REMOTE_AGENT`, but the route-level middleware
(`createCheckRemoteAgentAccess`) authorizes the primary agent via
`getRemoteAgentPermissions`, which first consults the AGENT ACL and
treats owners with the SHARE bit as remotely authorized even without
an explicit REMOTE_AGENT grant. The mismatch meant a user could open
the primary via `/v1/chat/completions` or `/v1/responses`, but their
own owned handoff sub-agents were silently skipped — breaking
multi-agent handoffs for the common "owner runs their own multi-agent
orchestrator" case.

Both controllers now pass `discoverConnectedAgents` a `checkPermission`
wrapper that delegates to `getRemoteAgentPermissions` (with
`getEffectivePermissions` injected from `PermissionService`) and
compares the returned bitmask against the required permission via
`hasPermissions`. Sub-agents are now authorized by the exact same
rules the route middleware applies to the primary.

* 🌱 fix: preserve user-defined parallel-start branches

Addresses codex pass-11 P2 on #12740. The post-filter reachability
prune seeded only from `primaryConfig.id`, which killed
`MultiAgentGraph`'s legitimate multi-start pattern — a user-defined
edge like `X -> Y` where X has no incoming path (X is an intentional
parallel starting node, run alongside the primary) was being dropped
because neither X nor Y was reachable from the primary.

Reconcile the tension with pass-4 ("prune accidental orphans when an
intermediate is skipped") by using pre-filter reachability as the
signal:

- An agent that WAS reachable from the primary via the original
  (pre-filter) edges but loses that path when `filterOrphanedEdges`
  runs is an accidental orphan (a skipped hop broke the chain) — prune.
- An agent that was NEVER reachable from the primary, even pre-filter,
  is an intentional parallel start — seed it into post-filter
  reachability so its component survives.

Surviving-edge endpoint references still keep an agent (co-sources in
multi-source edges). New test `preserves user-defined parallel-start
branches disconnected from the primary` covers the pass-11 scenario;
the existing `A->B->C->D, B skipped` regression test continues to
pass because C/D were pre-filter reachable through B and lose that
reachability after filtering.

* 🎯 fix: tighten parallel-start seed criterion to 'no pre-filter incoming edge'

Addresses codex pass-12 P1 on #12740. The pass-11 seed heuristic — 'agent
is in `agentConfigs` but was not pre-filter reachable from the primary' —
was too permissive. A downstream agent like Y in `X -> Y` where X gets
skipped (missing / no VIEW) was never pre-filter reachable from the
primary either, so the old rule promoted Y to a parallel start node and
discovery returned `agents: [primary, Y]` with no connecting edge. The
SDK then ran Y as an unintended parallel root — exactly the orphan
behavior pass-4 wanted to prevent.

Tighter criterion: seed a post-filter reachability root only when the
agent had NO incoming edge in the pre-filter graph. That matches
`MultiAgentGraph.analyzeGraph`'s "no-incoming-edge" definition of a
start node applied to the user's original declared topology, so:

- `A -> B` plus a user-defined `X -> Y` parallel branch: X has no
  incoming pre-filter → seeded → X and Y both survive.
- `A -> B` plus `X -> Y` with X skipped: Y had an incoming pre-filter
  (`X -> Y`) → NOT seeded → Y is pruned as the orphan it is.
- `A -> B -> C` with B skipped: C had an incoming pre-filter (`B -> C`)
  → NOT seeded → C is pruned.

New test `does not promote a downstream orphan to a parallel start when
its only upstream is skipped` locks in the pass-12 scenario. The pass-11
`preserves user-defined parallel-start branches` test continues to hold.

* 📁 fix: don't enforce AGENT-only file ACL on REMOTE_AGENT API callers

Addresses codex pass-13 P1 on #12740. When I refactored the API
controllers' DB-method bundle, I inadvertently started forwarding
`filterFilesByAgentAccess` into `initializeAgent`. That helper calls
`checkPermission` with `resourceType: ResourceType.AGENT`, but these
routes authorize callers through `REMOTE_AGENT` (via
`getRemoteAgentPermissions`). A user granted `REMOTE_AGENT_VIEWER` on
a shared agent but lacking direct `AGENT_VIEW` could invoke the agent
yet all its owner-attached context files would get silently filtered
out — breaking `file_search`/context retrieval for remote consumers.

Drop `filterFilesByAgentAccess` from the OpenAI-compat and Responses
controllers' `dbMethods` (and remove the now-unused import). The chat
UI's `initialize.js` keeps it since that path legitimately authorizes
at the AGENT level. No functional change inside the helper — passing
`undefined` simply tells `primeResources` to skip the per-file ACL
filter, restoring the pre-refactor API behavior.

* 🪓 fix: strip unreachable co-sources from surviving multi-source edges

Addresses codex pass-14 P1 on #12740. The earlier pass-8 fix kept any
agent referenced as an endpoint of a surviving edge (via a
`referencedByEdge` fallback) to avoid the SDK's `validateEdgeAgents`
failing on missing nodes. But that fallback propped up unreachable
co-sources too: with `[A -> C, X -> B, [B,C] -> D]` and X skipped,
`X -> B` gets filtered, the `[B,C] -> D` fan-in survives because C is
reachable, and B stays in `agentConfigs` solely because the fan-in
still lists it. `MultiAgentGraph.analyzeGraph` then sees B with no
incoming edge and runs it as an unintended parallel root.

Sanitize surviving edges instead: for a kept edge whose `from` is an
array, filter out any co-source that isn't reachable. The SDK's
per-source `addEdge` fires independently, so dropping an unreachable
co-source doesn't invalidate the remaining routes — in the scenario
above `[B,C] -> D` becomes `[C] -> D`, every endpoint of every
surviving edge is now reachable, and the agent prune collapses to a
strict `reachable.has(agentId)` check. No more referenced-by-edge
fallback.

Regression test added: `strips unreachable co-sources from surviving
multi-source edges (no stray parallel root)` — asserts B is absent
from every surviving edge endpoint and the fan-in's `from` is just
`['C']`. All 22 prior discovery tests still pass unchanged.
2026-04-20 02:20:43 -04:00
Danny Avila
48c3d31db3
🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737)
* 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels

The default console formatter discarded every structured metadata key on the
winston info object — only `CONSOLE_JSON=true` preserved it. That meant
failures emitted by the agents SDK (e.g. "Summarization LLM call failed")
reached stdout without the provider, model, or underlying error attached,
leaving users unable to diagnose the root cause.

- Add `formatConsoleMeta` helper to serialize non-reserved metadata as a
  compact JSON trailer, with per-value string truncation and safe handling
  of circular references.
- Append the metadata trailer to warn/error console lines; info/debug
  behavior is unchanged.
- Relax `debugTraverse`'s debug-only gate so warn/error messages routed
  through the debug formatter also surface their metadata.
- Add a `formatConsoleMeta` stub to the shared logger mock so existing
  tests keep working.
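
A sketch of the trailer helper, assuming a reserved-key set and truncation length; the try/catch circular handling shown here is what a later commit in this PR replaces with a `WeakSet` replacer:

```ts
const RESERVED_KEYS = new Set(['level', 'message', 'timestamp']); // assumed reserved set

function formatConsoleMeta(info: Record<string, unknown>, maxLen = 256): string {
  const meta: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(info)) {
    if (RESERVED_KEYS.has(key)) continue;
    meta[key] =
      typeof value === 'string' && value.length > maxLen
        ? `${value.slice(0, maxLen)}...` // per-value truncation
        : value;
  }
  if (Object.keys(meta).length === 0) return '';
  try {
    return ` ${JSON.stringify(meta)}`; // compact single-line JSON trailer
  } catch {
    return ' [unserializable metadata]'; // e.g. circular references
  }
}
```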

* 🔐 fix: Also Redact Sensitive Patterns on Warn Console Lines

The warn-level console output now includes a metadata trailer that may
contain provider-returned error strings with embedded tokens or keys
(e.g. `Bearer ...`, `sk-...`). Apply `redactMessage` to warn lines in
addition to error, matching the new surface area.

* 🔐 fix: Redact Sensitive Tokens Embedded in JSON Metadata

Two gaps in the existing console redaction that became user-visible once
warn/error lines started emitting structured metadata:

1. The OpenAI-key regex (`/^(sk-)[^\s]+/`) was anchored to start-of-line,
   so keys embedded inside JSON payloads (e.g. `{"apiKey":"sk-..."}`)
   were never redacted. Every console line begins with a timestamp, so
   the anchor effectively made this pattern dead code.
2. `formatConsoleMeta` stringified metadata values verbatim; a sensitive
   string value was only redacted by the whole-line regex pass, which
   missed the anchored `sk-` case above.

Fix:
- Drop the `^` anchor; add `/g` so every occurrence is redacted, not just
  the first.
- Also exclude `"` and `'` from the token body so JSON-embedded values
  terminate at the closing quote rather than chewing into the next field.
- Simplify `redactMessage` to apply patterns directly (dropping the
  `getMatchingSensitivePatterns` filter) — the filter used `.test()`
  which has stateful behavior on `/g` regexes and is no longer needed.
- `formatConsoleMeta` now runs `redactMessage` over every string value
  before JSON serialization, so the metadata trailer is safe even on the
  warn path.
- Add regression tests covering both fixes.

Reviewed-by: Codex (P1 finding on PR #12737, commit 68c31b6).

* 🔐 fix: Redact Metadata in debugTraverse for Warn and Error

Relaxing the debug-only gate in debugTraverse (in commit 59371be0)
routed warn/error records through the traversal path, which emits leaf
string values verbatim (via truncateLongStrings only). Because
DEBUG_LOGGING defaults to true, those records are also written to the
rotating debug log file — which means payloads like
`{ auth: 'Bearer ...' }` or `{ openaiKey: 'sk-...' }` were persisted
unredacted once my earlier change took effect.

Apply redactMessage to the final formatted string when the level is
warn or error. Debug-level behavior is unchanged (matching prior art).

Includes regression tests covering error/warn redaction and
debug-level preservation.

Reviewed-by: Codex (P1 finding on PR #12737, commit e288f7fd).

* 🔐 fix: Anchor Secret Regexes at Word Boundaries to Prevent Over-Redaction

Removing the `^` anchor in commit e288f7fd let the OpenAI-key regex match
anywhere in the line — including inside ordinary words like `task-runner`
or `mask-value`, where `sk-` appears mid-word. Non-secret text was being
rewritten to `task-[REDACTED]`, hiding real log content from operators.

- Anchor every sensitive-key pattern with `\b` so matches only fire at
  word boundaries.
- Constrain the OpenAI-key body to the documented charset
  (`[a-zA-Z0-9_-]+`) instead of the broader "not whitespace or quote"
  character class.
- Add `&` to the `key=` exclusion so a query-string value stops at the
  next parameter separator.
- Regression tests covering both the over-redaction cases (`task-runner`,
  `monkey=10`) and the intended redactions still firing.

Reviewed-by: Codex (P2 finding on PR #12737, commit c09d293d).
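
A hedged reconstruction of the pattern this commit lands on (the repo's exact regex may differ); the replacement keeps the captured prefix, which is why the earlier over-redaction rendered as `task-[REDACTED]`:

```ts
// \b stops mid-word matches; the documented key charset stops at quotes and separators.
const OPENAI_KEY_RE = /\b(sk-)[a-zA-Z0-9_-]+/g;
const redact = (line: string): string => line.replace(OPENAI_KEY_RE, '$1[REDACTED]');

redact('{"apiKey":"sk-abc123"}'); // -> '{"apiKey":"sk-[REDACTED]"}'
redact('task-runner finished');   // unchanged: "sk-" sits mid-word, so \b never fires
```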

* 🔐 fix: Redact Before Colorize To Survive ANSI Word-Boundary Interference

The console pipeline runs `redactFormat → colorize({ all: true }) → printf`.
With `all: true`, winston wraps `info.message` in ANSI escapes whose
trailing `m` is a word character. That means `\b(Bearer )…` placed at the
start of a colorized segment can fall on a (word,word) boundary and miss —
the earlier line-wise `redactMessage(line)` pass in printf suffers the
same issue because it runs after colorize.

Extend `redactFormat` to run for `warn` in addition to `error`, operating
on the raw pre-colorize `info.message` + `Symbol.for('message')` strings.
The later in-printf `redactMessage(line)` stays as a backstop, but the
primary redaction now happens where the regex can actually see the text.

Metadata redaction already operates on the raw info object via
`formatConsoleMeta`, so it was never affected by ANSI — no change there.

Includes regression tests for the new warn-level behavior and for the
info/debug no-op path.

Reviewed-by: Codex (P2 finding on PR #12737, commit fdb6b361).

* 🧹 fix: Prefer Structured Metadata Over Consumed Splat Args in Traversal

`debugTraverse` previously read `metadata[Symbol.for('splat')][0]` first
and only fell back to the structured metadata object. When a caller uses
printf interpolation alongside a metadata object — for example
`logger.warn('failed for %s', tenant, { provider })` — winston leaves the
*consumed* positional arg (`tenant`) in `SPLAT[0]` after interpolation.
The formatter would then append the tenant a second time and skip the
real metadata, regressing debug-file and `DEBUG_CONSOLE` output quality
now that warn/error share this path.

Prefer the structured metadata object (via `extractMetaObject`) and only
fall back to `SPLAT[0]` when there's nothing else, so the surviving log
line surfaces the actual key/value pairs regardless of call shape.

Reviewed-by: Codex (P2 finding on PR #12737, commit 1e43d636).

* 🧹 fix: Skip Consumed Splat Primitives in Warn/Error Debug Traversal

When no structured metadata is attached, winston still leaves consumed
`%s` / `%d` arguments in `Symbol.for('splat')`. Previous fix preferred
the structured object but still fell back to whatever sat at `SPLAT[0]`
— so `logger.warn('failed for %s', tenantId)` emitted
`failed for tenant-7 tenant-7` in the traversal path (debug file and
`DEBUG_CONSOLE`), now regressed outside of `debug` level because
warn/error share the path.

Only accept the splat fallback when the value is a plain object or an
array (structural data worth surfacing). Primitives there are almost
certainly consumed printf args and get skipped.

Regression tests cover the single-%s case and the array-as-metadata case
(which still surfaces through splat).

Reviewed-by: Codex (P2 finding on PR #12737, commit bccbf117).

* 🧹 fix: Skip Numeric Splat Keys When Extracting Log Metadata

When a caller passes a primitive as the second argument — e.g.
`logger.warn('Unhandled step creation type:', step.type)` — winston /
`format.splat()` can leave character-index keys (`"0"`, `"1"`, …) on the
`info` object. With the warn/error metadata trailer in play, those
synthetic artifacts were being surfaced as bogus metadata, producing
noisy console and debug-file output.

Filter out numeric-string keys in `extractMetaObject` so only real
metadata fields reach the trailer. Added a regression test.

Reviewed-by: Codex (P2 finding on PR #12737, commit b34628de).

* 🧹 fix: Preserve Unconsumed Primitive Splat Args in Debug Traversal

The previous round dropped every primitive SPLAT[0] value to avoid
duplicating consumed %s args, but that removed useful context from
calls like `logger.debug('prefix:', detail)` where the primitive was
never interpolated — users lost the `detail` value.

Refine the heuristic: skip a primitive splat value only when it already
appears inside the (post-interpolation) `info.message`; otherwise
surface it. Arrays and objects continue to surface unconditionally.

Regression test covers the 'prefix:', detail case.

Reviewed-by: Codex (P2 finding on PR #12737, commit 6bf9548f).

* 🧹 fix: Traverse Filtered Metadata, Not Raw Metadata, In debugTraverse

`debugTraverse` computes `extracted = extractMetaObject(metadata)` to
strip reserved keys, underscore-prefixed internals, and numeric splat
artifacts — but the later `klona(metadata)` + `traverse` path still
read the raw object, putting all the filtered junk back into the
rendered multi-line output.

Clone and traverse `debugValue` (the already-filtered object) instead.
Regression test exercises the case where numeric splat artifacts sit
alongside a real metadata field.

Reviewed-by: Codex (P2 finding on PR #12737, commit c29c18e8).

* 🧹 refactor: Split Warn/Error From Debug-Level Traversal in Parsers

Retrofitting `debugTraverse`'s multi-line object walker to cover
warn/error created a minefield of splat-interaction edge cases
(numeric artifacts, consumed %s args, bogus `_`-prefix filtering,
over-eager suppression of unconsumed primitives). Each fix kept
introducing new corner cases.

Split the two concerns instead:

- Warn/error now emit a compact single-line JSON metadata trailer via
  `formatConsoleMeta`, then pass the full line through `redactMessage`.
  This mirrors what the console formatter already does, so behavior
  between the console and debug-file outputs stays consistent for
  warn/error — and none of the splat/traversal edge cases apply.
- Debug level keeps its original code path verbatim (including the
  raw `metadata` traversal and SPLAT[0] fallback). No regressions from
  my earlier iterations.
- `extractMetaObject` no longer filters underscore-prefixed keys, so
  legitimate fields like MongoDB `_id` still appear. Reserved winston
  keys and numeric splat artifacts remain filtered.

Updated tests reflect the simpler contract (underscore preservation,
single-line trailer expectations already covered).

Reviewed-by: Codex (two P2 findings on PR #12737, commit 9ea11529:
`_id` regression and over-eager primitive suppression).

* 🧹 fix: Preserve Scalar Metadata When One Value Is Circular

`formatConsoleMeta` previously wrapped a single `JSON.stringify` in
try/catch — any circular reference inside any field (e.g. an attached
request/response object) caused the entire trailer to be dropped. That
defeats the goal of making failures diagnosable: one malformed field
would mask the provider/model/status we wanted to surface.

Use a `WeakSet`-based replacer that emits `[Circular]` for repeated
object visits. On the whole-object serialization failing, fall back to
per-field serialization so scalar keys always land and only the
offending field is replaced with `"[Unserializable]"`.

Reviewed-by: Codex (P2 finding on PR #12737, commit d63742a5).
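
The replacer plus per-field fallback, as a minimal sketch (helper names assumed):

```ts
function stringifyWithCircularGuard(value: unknown): string {
  const seen = new WeakSet<object>();
  return JSON.stringify(value, (_key, val) => {
    if (typeof val === 'object' && val !== null) {
      if (seen.has(val)) return '[Circular]'; // repeated object visit -> placeholder
      seen.add(val);
    }
    return val;
  });
}

function safeSerializeMeta(meta: Record<string, unknown>): string {
  try {
    return stringifyWithCircularGuard(meta);
  } catch {
    // Whole-object pass failed (e.g. a throwing toJSON): fall back per field so
    // scalar keys always land and only the offending field is replaced.
    const parts = Object.entries(meta).map(([key, val]) => {
      try {
        return `${JSON.stringify(key)}:${stringifyWithCircularGuard(val)}`;
      } catch {
        return `${JSON.stringify(key)}:"[Unserializable]"`;
      }
    });
    return `{${parts.join(',')}}`;
  }
}
```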

* 🧹 fix: Address Audit Findings (JSDoc, Case-Insensitive Api-Key, Tests)

Audit review identified several MINOR/NIT items on top of the codex
rounds. This commit closes the actionable ones:

- **JSDoc (#1, #2)**: `extractMetaObject` no longer claims to filter
  underscore-prefixed keys (that filter was removed intentionally for
  MongoDB `_id`). `debugTraverse`'s docblock now describes the three
  code paths (warn/error compact trailer, debug multi-line traversal,
  other levels).
- **Case-insensitive api-key regex (#6)**: `/gi` so the Azure style
  `Api-Key:` / `API-KEY:` also gets redacted. Pre-existing behavior
  was lowercase-only.
- **Consolidated redundant branch (#5)**: `consoleFormat` printf was
  checking `isError || isWarn` twice; merged into one block.
- **Pre-compiled regex (#9)**: `NUMERIC_KEY_RE` moved to module scope.
- **Test coverage (#3, #4)**: Added regression tests for
  - per-field serialization fallback when a value's `toJSON` throws,
  - sensitive strings nested inside metadata objects,
  - the Azure-style `Api-Key:` header.
2026-04-19 21:49:41 -07:00
Danny Avila
ccf3a6c670
📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types (#12735)
* 🐛 fix: accept documented `summarization.trigger.type` values

The Zod schema for `summarization.trigger.type` only accepted
`'token_count'`, but:

- the documentation lists `token_ratio`, `remaining_tokens`, and
  `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and
  silently no-ops on anything else

The result was a double failure: any user following the docs hit a
startup Zod error, and anyone who matched the schema by using
`token_count` got a silent no-op at runtime where summarization never
fired.

Align the schema with the documented, runtime-supported trigger types.

Closes #12721

* 🧹 fix: bound `token_ratio` trigger value to (0, 1]

Per Codex review: the previous schema accepted `value: z.number().positive()`
for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }`
(presumably meant as "80%") passed validation and then silently never fired —
because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always
false. That is exactly the silent-no-op pattern this PR is trying to eliminate.

Switch to a discriminated union so each trigger type has its own value
constraint:

- `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense
- `remaining_tokens`: positive — token counts can be large
- `messages_to_refine`: positive — message counts can be > 1

Added tests for the upper-bound rejection and the inclusive upper bound
(`value: 1` still accepted as a valid "fire at 100%" extreme).
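
A sketch of the discriminated-union shape at this point in the PR (schema variable name assumed; the next commits relax `token_ratio` to `.min(0)` and add `.finite()`/`.int()`):

```ts
import { z } from 'zod';

// Each trigger type carries its own value constraint.
const triggerSchema = z.discriminatedUnion('type', [
  z.object({ type: z.literal('token_ratio'), value: z.number().positive().max(1) }), // (0, 1]
  z.object({ type: z.literal('remaining_tokens'), value: z.number().positive() }),
  z.object({ type: z.literal('messages_to_refine'), value: z.number().positive() }),
]);

triggerSchema.safeParse({ type: 'token_ratio', value: 80 }).success; // false: not a fraction
triggerSchema.safeParse({ type: 'token_ratio', value: 1 }).success;  // true: inclusive bound
```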

* 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range

Per Codex review: `.positive()` rejected `value: 0`, but the docs
describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins
who copy the documented lower bound into their YAML would fail schema
validation at startup.

Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme)
setting — the agents SDK's `usedRatio >= 0` check will fire as soon as
there is anything to refine, which is a legitimate "always summarize
when pruning happens" configuration.

`remaining_tokens` and `messages_to_refine` keep `.positive()`: both
are counts, and `0` there produces no meaningful behavior (the SDK
has an early return for `messagesToRefineCount <= 0`).

* 🐛 fix: preserve `token_ratio` trigger when `value: 0`

Per Codex review: now that the schema accepts `token_ratio: 0`,
`shapeSummarizationConfig` would silently drop it because of a truthy
check on `config?.trigger?.value`. The trigger would disappear and the
runtime would fall back to "no trigger configured" — which fires on any
pruning rather than honoring the explicit ratio.

Switch to `typeof value === 'number'`, which preserves `0` while still
rejecting `undefined`/`null`.

Added a regression test that asserts `{ type: 'token_ratio', value: 0 }`
survives the shaping function untouched.
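
The shape of the guard, sketched:

```ts
type Trigger = { type: string; value?: number | null };

// A truthy check drops a legitimate `value: 0`; the explicit typeof check
// preserves it while still rejecting undefined/null.
function shapeTrigger(trigger?: Trigger): { type: string; value: number } | undefined {
  if (trigger != null && typeof trigger.value === 'number') {
    return { type: trigger.type, value: trigger.value };
  }
  return undefined;
}
```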

* 🧹 fix: reject non-finite trigger values at schema level

Per Codex review: `z.number().positive()` still accepts `Infinity` and
`NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but
the agents SDK guards every trigger path with `Number.isFinite(...)` and
silently returns `false` — summarization never fires while the server
starts cleanly. That is the exact schema/runtime split this PR is trying
to eliminate.

Add `.finite()` to every trigger value. `token_ratio` already had an
implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps
the intent obvious and catches `NaN` (which `.max(1)` does not).

* 🧹 fix: integer counts + targeted token_count migration warning

Two findings from the comprehensive review:

1. `remaining_tokens` and `messages_to_refine` are token/message counts
   and are always integers in the runtime (`Number.isFinite(...)` guards
   already assume integer semantics). `z.number().positive()` accepted
   fractional values like `2.5`, which was semantically confusing and
   would round oddly against the runtime's `>=` / `<=` comparisons. Add
   `.int()` to both count-based branches; `token_ratio` stays fractional.

2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got
   the generic "Invalid summarization config" warning plus a flattened
   Zod error. Detect that specific case in `loadSummarizationConfig` and
   emit a migration-friendly message that names the three valid
   replacements. Export the function so the behavior is unit-testable.

Also added a parameterized passthrough test covering `remaining_tokens`
and `messages_to_refine` shaping, complementing the existing
`token_ratio` coverage.

* 🧹 fix: accurate fallback wording + bare-string trigger test

Two nits from the follow-up audit:

1. The legacy-`token_count` warning claimed "Summarization will be
   disabled," but `shapeSummarizationConfig` treats a missing
   summarization config as self-summarize mode (fires on every pruning
   event using the agent's own provider/model). "Disabled" would
   mislead an admin into stopping investigation. Reword to describe the
   actual fallback and assert the new wording in the spec.

2. Add a regression test for the `trigger: 'bare-string'` YAML case, so
   the `typeof raw.trigger === 'object'` guard is exercised rather than
   implied.

3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message
   is safe in every terminal/aggregator regardless of UTF-8 handling.

* 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union

CI TS check failed: after the schema tightening, `raw.trigger.type` is
narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine"
| undefined`, so the runtime comparison to `"token_count"` is a
TS2367 ("no overlap") error even though that's exactly the comparison
we want for the migration guard.

Widen just that one access via `as { type?: unknown }` so the migration
check reads runtime-shaped YAML input without the type system folding
it back into the narrowed union.
2026-04-19 19:33:52 -07:00
Danny Avila
2358d07b64
📝 fix: Preserve Raw Markdown Formatting on Upload as Text (#12734)
* 🐛 fix: Preserve Raw Markdown on `Upload as Text`

When `RAG_API_URL` is configured, `.md` uploads were sent to the RAG API
`/text` endpoint, which routes Markdown through `UnstructuredMarkdownLoader`
and strips formatting (`#`, `**`, lists, blockquotes). Users expect `Upload
as Text` to preserve raw content - identical bytes in a `.txt` file round-trip
verbatim, while the `.md` came back stripped.

Short-circuit the RAG API call for Markdown files (by MIME type or `.md` /
`.markdown` extension) and read the file verbatim via `parseTextNative`.
Non-Markdown paths are unaffected, and the embedding path (`/embed`) keeps
its existing loader so vector search quality is unchanged.

* 🐛 fix: normalize markdown MIME and accept `text/md`

Addressing review feedback on the `Upload as Text` short-circuit:

- Accept `text/md` in the markdown MIME set (LibreChat treats it as a
  valid markdown type elsewhere, e.g. the artifact-rendering prompt).
- Normalize the incoming MIME type (lowercase + strip parameters) before
  the set lookup so parameterized values like
  `text/markdown; charset=utf-8` and uppercase `TEXT/MARKDOWN` still
  short-circuit. Extensionless uploads relying only on the `Content-Type`
  header would otherwise fall through to the RAG `/text` endpoint and
  lose their markdown formatting.

Extend `text.spec.ts` parametrized cases with `text/md`, parameterized
MIME, uppercase, and whitespace-padded variants.
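
A sketch of the normalization plus set lookup (helper name from the next commit; bodies assumed):

```ts
// "text/markdown; charset=utf-8" -> "text/markdown"; "TEXT/MARKDOWN" -> "text/markdown"
function normalizeMimeType(mimeType: string): string {
  return mimeType.split(';')[0].trim().toLowerCase();
}

// Markdown MIME types named in this commit (may not be the repo's full set).
const MARKDOWN_MIME_TYPES = new Set(['text/markdown', 'text/md']);

MARKDOWN_MIME_TYPES.has(normalizeMimeType('text/markdown; charset=utf-8')); // true
```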

* 🧹 chore: Address Code Review Follow-ups on `Upload as Text` fix

Addressing comprehensive review feedback:

- Debug log now includes filename and MIME type so operators can
  identify which upload triggered the short-circuit without having
  to correlate other logs.
- Expand markdown extension detection beyond `.md` / `.markdown` to
  cover `.mdown`, `.mkdn`, `.mkd`, `.mdwn` (case-insensitive regex).
- Tighten `normalizeMimeType` parameter type from `string | undefined`
  to `string` to match the actual Express.Multer.File type. The
  falsy-check still protects against empty strings at runtime.
- Extend parametrized tests with the most common real-world shapes:
  `text/plain` + `.md` (the MIME most browsers/servers assign),
  the new rare extensions, and empty MIME + `.md` (pure extension
  fallback path).
- Add a positive assertion that `readFileAsString` was called with the
  expected arguments on every short-circuit case, so tests fail loudly
  if the native-parse path ever regresses.

* 🧪 test: Cover `.mdwn` regex branch in Markdown short-circuit

Every other alternation in `MARKDOWN_EXTENSIONS_RE` has at least one
test case (`md`, `markdown`, `mdown`, `mkdn`, `mkd`) but `mdwn` did
not, leaving a typo in that branch undetectable.
2026-04-19 19:31:39 -07:00
Danny Avila
3bd2681272
🪐 fix: Replace $bitsAllSet ACL Queries for Azure Cosmos DB Compatibility (#12736)
* 🐛 fix: replace `$bitsAllSet` ACL queries for Cosmos DB compatibility

Azure Cosmos DB for MongoDB API does not implement the `$bitsAllSet`
query operator, so every permission check against Cosmos DB threw. The
five read paths in `aclEntry.ts` (`hasPermission`, `findAccessibleResources`,
`findPublicResourceIds`, and two sites in `getSoleOwnedResourceIds`) now
fetch candidate entries and apply the bitwise mask in application code.
This matches the existing FerretDB-compatible pattern.

Fixes #12729.

* 🐛 fix: delegate `findPubliclyAccessibleResources` to fixed DB method

`AccessControlService.findPubliclyAccessibleResources` inlined the same
`$bitsAllSet` query as the data-schemas layer, which fails on Azure
Cosmos DB for MongoDB. Delegate to `_dbMethods.findPublicResourceIds`
so a single implementation carries the Cosmos-compatible bitwise logic.

Refs #12729.

* 🐛 fix: move `$bitsAllSet` filter out of remote-agent aggregation

`enrichRemoteAgentPrincipals` used `$bitsAllSet` inside an aggregation
`$match`, which Azure Cosmos DB for MongoDB does not implement. Project
`permBits` through the pipeline and filter for `PermissionBits.SHARE` in
application code. The extra documents fetched are bounded by ACL entries
on a single agent resource, so the cost is negligible.

Refs #12729.

* 🧪 test: rename misleading public-dedup test and add real dedup coverage

The test previously named "returns deduplicated IDs even if the public
principal has multiple entries" only set up a single ACL entry, so it
did not actually exercise deduplication. Split into two tests: one for
the happy path (single entry with required bits), and one that bypasses
`grantPermission`'s upsert via `AclEntry.create` to confirm the
application-layer dedup Map handles genuine duplicates.

Refs #12729.

* 🧪 test: cover SHARE-bit filter in `enrichRemoteAgentPrincipals`

The `$bitsAllSet` match stage in `enrichRemoteAgentPrincipals` previously
guaranteed every aggregation row had SHARE; the Cosmos DB fix moved that
check into a JS `continue` branch with no direct coverage. Add a
dependency-injected unit test that stubs the aggregation with mixed
SHARE / non-SHARE / zero-bit rows and asserts only SHARE holders are
enriched and queued for backfill. Also includes a regression guard that
the `$match` pipeline stage no longer contains a `permBits` filter.

Refs #12729.

* ♻️ refactor: extract `filterByBitsAndDedup` helper for ACL reads

`findAccessibleResources` and `findPublicResourceIds` each inlined the
same bitmask-filter + `Map`-based dedup loop. Lift it into a private
`filterByBitsAndDedup(entries, requiredBits)` helper so the Cosmos-DB
compatible pattern lives in one place. Pure rename/extract — no
behavior change.

Refs #12729.

* 📝 docs: fix stale `$bitsAllSet` references in FerretDB spec

The describe block and header comment in the FerretDB parity spec still
referenced `$bitsAllSet` queries after the Cosmos DB compatibility fix
moved the bit mask into application code. Update the title to
"Bitwise permission queries" and rewrite the header comment to
describe the application-layer behavior being validated.

Refs #12729.

* perf: push permission-mask filter back to the database via `$in`

The original fix for #12729 moved `$bitsAllSet` filtering into application
code, which meant every ACL read fetched the full set of rows for a
principal/resource and filtered in JS. For tenants with large ACL
collections this inflates wire transfer and heap.

Replace the JS filter with `permBits: { $in: permissionBitSupersets(X) }`.
For the 4-bit `PermissionBits` enum the `$in` list is at most 16 values
(8 for a single-bit mask like SHARE). `$in` is indexable and supported by
Azure Cosmos DB for MongoDB, so the filter runs on the server again —
restoring `.distinct('resourceId')` and `findOne()` semantics.

`permissionBitSupersets(requiredBits)` is memoized and exported from
`@librechat/data-schemas`. Callers restored:
  - `hasPermission`: back to `findOne` short-circuit
  - `findAccessibleResources` / `findPublicResourceIds`: back to `.distinct()`
  - `getSoleOwnedResourceIds`: back to the `$match` + `$group` aggregation
  - `enrichRemoteAgentPrincipals`: bit filter back in `$match`, JS `continue` removed

Refs #12729.
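
A minimal sketch of the enumeration (memoization and the `Object.freeze` from the later hardening commit omitted):

```ts
const MAX_PERM_BITS = 15; // all four PermissionBits set; derived from the enum in the real code

// Every value in [0, MAX_PERM_BITS] whose set bits include all of requiredBits —
// exactly the rows $bitsAllSet would match, expressed as an indexable $in list.
function permissionBitSupersets(requiredBits: number): number[] {
  const supersets: number[] = [];
  for (let bits = 0; bits <= MAX_PERM_BITS; bits++) {
    if ((bits & requiredBits) === requiredBits) {
      supersets.push(bits);
    }
  }
  return supersets;
}

permissionBitSupersets(0).length; // 16: every combination
permissionBitSupersets(8).length; // 8: a single-bit mask like SHARE (assuming SHARE = 8)
```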

* 🧪 test: add `$bitsAllSet` vs `$in` parity + perf spec

Introduces `aclEntry.parity.spec.ts` — a side-by-side spec that runs the
legacy `$bitsAllSet` query and the current `$in`-based query against the
same `mongodb-memory-server` fixture and asserts identical output sets for
every affected method (`hasPermission`, `findAccessibleResources`,
`findPublicResourceIds`, `getSoleOwnedResourceIds`) across all 7 meaningful
permBits combinations.

Also logs median wall-clock time for the two query paths over 20 runs on
an 800-entry fixture, with a loose 3x guard against catastrophic
regressions. Initial local numbers: 1.05 ms vs 1.07 ms
(findAccessibleResources), 1.10 ms vs 1.05 ms (findPublicResourceIds).

Refs #12729.

* 🔒 hardening: freeze `permissionBitSupersets` cache + enum-shape guard

Two defensive changes from the comprehensive audit:

* Cached superset arrays are now `Object.freeze`d and the return type is
  `readonly number[]`. Previously the cached arrays were returned by
  mutable reference, so a caller that mutated the result would silently
  corrupt the process-wide cache for every subsequent permission check.
  `Object.freeze` turns that into a loud `TypeError` at the mutation
  site. All existing call sites pass the result directly to Mongoose's
  `$in`, which does not mutate.
* Added a module-load guard `if (MAX_PERM_BITS === 0) throw`. If
  `PermissionBits` is ever refactored to a `const` object or string
  enum, `Object.values(...).filter(isNumber)` would return `[]` and
  `MAX_PERM_BITS` would silently become 0, making every query match no
  rows and breaking every permission check. The guard fails loudly
  instead.

Also collapsed four identical JSDoc lines across `hasPermission`,
`findAccessibleResources`, `findPublicResourceIds`, and
`getSoleOwnedResourceIds` into a single `{@link permissionBitSupersets}`
reference.

Refs #12729.

* 🧪 test: add focused unit tests for `permissionBitSupersets`

The helper is the single point of correctness for every ACL read path
(every query uses `permBits: { $in: permissionBitSupersets(X) }`), so it
warrants direct coverage independent of the higher-level parity and
behavior specs. Six cases added:

* `requiredBits=0` returns all 16 values
* `requiredBits=15` returns `[15]` only
* every returned value is a bitwise superset of `requiredBits`
* full parity against the `$bitsAllSet` definition for every
  `required` in 0..15
* memoization: repeat calls return the same frozen reference
* frozen result throws `TypeError` on mutation attempts

Refs #12729.

* 🧪 test: tighten parity perf guard and document fixture constants

The `expect(currentMs).toBeLessThan(legacyMs * 3 + 50)` form was
dominated by the `+ 50` additive term at typical sub-ms query
latencies — at legacy=1ms, a 50x regression would still pass. Replace
with `Math.max(legacyMs * 5, 50)` so the multiplicative ceiling is
intact once the new path climbs out of the fixed noise floor.

Also added inline rationale for the `FIXTURE_SIZE = 800` and
`PERF_ITERATIONS = 20` constants.

Refs #12729.

* 🧹 chore: remove stale perf-guard comment, hoist rationale to describe

Commit a75f122 left a "Sanity check: A 3x multiplier" comment block
above the first perf guard, and 21852ac layered a second
"Multiplicative-dominant guard" block directly below it. The stale
block described the `legacyMs * 3 + 50` formula that no longer exists,
so both blocks coexisted and contradicted each other.

Delete the stale block from the first test, remove the redundant copy
from the second test, and lift the (now-single) rationale to the
enclosing `describe('performance')` block. Each test is now one
`expect` line — the rationale lives once, at the scope it applies to.

Refs #12729.

* 📝 docs: sharpen MAX_PERM_BITS guard message + test-sort consistency

Two NITs from the follow-up audit:

* The `MAX_PERM_BITS === 0` guard now names the specific refactors that
  could cause it (const object / string enum / all-string shape) and
  gives two concrete remediation paths (rewrite the reducer, or give
  `permissionBitSupersets` an explicit bit-width parameter). Previously
  the message just said "update permissionBitSupersets before
  continuing", which was vague.
* The `requiredBits=15` unit test now applies the same
  `[...result].sort((a, b) => a - b)` normalization as the other tests
  so the set-equality assertions are uniform. The function happens to
  return values in ascending order, but the helper's JSDoc does not
  promise ordering, so sorting before comparing is the correct
  defensive pattern.

Refs #12729.

* 🔒 fix: enforce `permBits <= MAX_PERM_BITS` at the schema level

Codex flagged a real silent regression: `permissionBitSupersets`
enumerates `$in` candidates only in `[0, MAX_PERM_BITS]` (currently 15),
so any ACL row with bits above the enum — e.g. `permBits = 31` from a
future role or a manual DB write — would be excluded from reads that
`$bitsAllSet` would have matched. Current write paths only produce
values in `[0, 15]` via the `PermissionBits` / `RoleBits` enums, so
no live data is affected, but the schema did not prevent out-of-range
writes, so the divergence was reachable.

Fix: add `min: 0`, `max: MAX_PERM_BITS`, and an `isInteger` validator to
the `permBits` field in the AclEntry schema. `MAX_PERM_BITS` is derived
from `PermissionBits` the same way `permissionBitSupersets` computes it,
so when a new bit is added to the enum both the read-side enumeration
and the write-side bound expand together.
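
A minimal sketch of the described bounds, assuming a standard Mongoose
schema definition (validator message is illustrative):

```ts
import { Schema } from 'mongoose';

declare const MAX_PERM_BITS: number; // derived from PermissionBits, 15 today

const aclEntrySchema = new Schema({
  permBits: {
    type: Number,
    min: 0,
    max: MAX_PERM_BITS, // write-side bound expands with the enum
    validate: {
      validator: Number.isInteger,
      message: 'permBits must be an integer',
    },
  },
});
```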

Tests: four new cases cover the upper bound, over-limit, negative, and
non-integer inputs, each asserting `Mongoose.Error.ValidationError`.
Plus updated JSDoc on `permissionBitSupersets` to document the
invariant that the schema now enforces.

Refs #12729.

* ♻️ refactor: hoist `MAX_PERM_BITS` + enum-shape/ceiling guards to shared util

Addresses one MINOR DRY finding and one NIT defensive-guard finding from
the latest audit. `MAX_PERM_BITS` was duplicated in `methods/aclEntry.ts`
and `schema/aclEntry.ts` with identical computations; they could silently
diverge. Move the constant plus both module-load guards to
`src/common/permissions.ts`:

* `MAX_PERM_BITS === 0` guard — fail loudly if `PermissionBits` is ever
  refactored to a `const` object or string enum (applied in both sites
  now, not just methods).
* `MAX_PERM_BITS > 255` guard — circuit-breaker for the `$in` enumeration
  strategy. At 4 bits the list tops out at 16 entries; at 8 bits, 256.
  Beyond that the approach degrades, so fail at module load rather than
  emit an unusably large `$in` list.

Both the schema and the methods file now import from the single
`src/common/permissions.ts`. `readonly number[]` return type and
`Object.freeze` cache-value protection from the prior commits are
preserved.

Refs #12729.

* 🔒 fix: reject out-of-range `requiredBits` without caching (DoS fix)

Codex flagged a real unbounded-cache DoS vector:
`api/server/controllers/agents/v1.js` parses `req.query.requiredPermission`
via `parseInt` and forwards it unvalidated to `findAccessibleResources`,
which eventually calls `permissionBitSupersets(requiredBits)`. Because
the old code memoized *every* distinct input in a process-global Map,
an attacker could flood the cache with unique integers and grow memory
without bound.

Fix: when `requiredBits` is not an integer, is negative, or has any bits
set above `MAX_PERM_BITS`, return a single shared frozen empty array and
do NOT touch the cache. An empty `$in` list correctly matches zero rows
(rows cannot satisfy a bit we do not support), so the response is also
semantically correct — previously it "worked" only because invalid
inputs coincidentally expanded to `[]` too, but at the cost of a cache
entry each time.
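
A sketch of the rejection fast-path under those rules, with
`computeAndCacheSupersets` standing in for the pre-existing happy path:

```ts
declare const MAX_PERM_BITS: number; // 15 today
declare function computeAndCacheSupersets(requiredBits: number): readonly number[];

const EMPTY_SUPERSETS: readonly number[] = Object.freeze([]);

function permissionBitSupersets(requiredBits: number): readonly number[] {
  const invalid =
    !Number.isInteger(requiredBits) ||
    requiredBits < 0 ||
    (requiredBits & ~MAX_PERM_BITS) !== 0; // any bit above the supported range

  if (invalid) {
    // One shared frozen array: `$in: []` correctly matches zero rows, and
    // the memoization Map is never written, so attacker input cannot grow it.
    return EMPTY_SUPERSETS;
  }
  return computeAndCacheSupersets(requiredBits);
}
```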

Five new tests exercise the rejection path: upper-bound overflow, mixed
in-range+out-of-range bits, shared-instance identity across rejected
inputs, a 2000-unique-value cache-growth probe, and a regression check
that legitimate in-range inputs still get memoized normally.

Refs #12729.

* 🧪 test: probe `Map.prototype.set` to prove cache doesn't grow on rejection

Strengthens the DoS cache-growth test the review pass flagged: reference
identity alone allows a hypothetical regression like
`supersetCache.set(requiredBits, EMPTY_SUPERSETS); return EMPTY_SUPERSETS;`
to pass while still leaking one entry per attacker request.

Add a second test that spies on `Map.prototype.set`, records the global
call count before and after a burst of 1500 rejected inputs (500 each of
over-MAX integers, negatives, and non-integers), and asserts the delta
is zero. Manually verified the test has teeth: injecting the
hypothetical regression in the implementation produced exactly 1500
extra `Map.set` calls and the new test failed as expected; reverting
restored the clean state.
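
A minimal Jest sketch of such a probe (input counts reduced for
brevity):

```ts
declare function permissionBitSupersets(requiredBits: number): readonly number[];
declare const MAX_PERM_BITS: number;

it('performs zero cache writes for rejected inputs (Map-write probe)', () => {
  const setSpy = jest.spyOn(Map.prototype, 'set');
  const before = setSpy.mock.calls.length;

  for (let i = 0; i < 100; i++) {
    permissionBitSupersets(MAX_PERM_BITS + 1 + i); // over-max integers
    permissionBitSupersets(-1 - i); // negatives
    permissionBitSupersets(i + 0.5); // non-integers
  }

  // A regression that wrote the shared empty array into the cache would
  // pass reference-identity checks but show up here as a nonzero delta.
  expect(setSpy.mock.calls.length - before).toBe(0);
  setSpy.mockRestore();
});
```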

Renames the existing identity-based test to `(reference identity)` so
its scope is clear alongside the new `(Map-write probe)` companion.

Refs #12729.
2026-04-19 22:28:48 -04:00
Luna
8f39f31c87
🤝 fix: Normalize Empty Handoff Fields to Restore Default Fallback (#12707)
* fix(agents): normalize handoff prompt and parameter values to ensure default instructions

* fix(agents): address lint errors and codex review comments
2026-04-19 11:53:09 -04:00
Danny Avila
b579390287
📦 chore: npm audit & bump @librechat/agents to v3.1.67 (#12710)
* chore: Update package-lock.json with new dependencies and version upgrades

- Added new dependencies for @langchain/anthropic and @langchain/core, including @anthropic-ai/sdk and fast-xml-parser.
- Updated existing dependencies for @librechat/agents, @opentelemetry/api-logs, @opentelemetry/core, and related packages to their latest versions.
- Enhanced integrity checks and licensing information for new and updated packages.

* chore: Update @librechat/agents dependency to version 3.1.66 in package.json and package-lock.json

- Bumped the version of @librechat/agents from 3.1.65 to 3.1.66 across multiple package.json files to ensure consistency and access to the latest features and fixes.

* chore: Update dompurify and fast-xml-parser dependencies to version 3.4.0 and 5.6.0 respectively

- Bumped the version of dompurify across multiple package.json files to ensure consistency and access to the latest features and security fixes.
- Updated fast-xml-parser to the latest version in relevant package.json files for improved functionality.

* chore: Update @librechat/agents dependency to version 3.1.67 in package.json and package-lock.json

- Bumped the version of @librechat/agents from 3.1.66 to 3.1.67 across multiple package.json files to ensure consistency and access to the latest features and fixes.
2026-04-16 22:44:00 -04:00
Danny Avila
034b672d0c
🫧 feat: Claude Opus 4.7 Reasoning Visibility (#12701)
* 🫧 fix: Restore Claude Opus 4.7 Reasoning Visibility

Claude Opus 4.7 omits `thinking` content from Messages API responses by
default — empty thinking blocks still stream, but the `thinking` field is
blank unless the caller passes `display: "summarized"` in the adaptive
thinking config. This silenced the LibreChat "Thoughts" UI for Anthropic
(and Anthropic-on-Bedrock) adaptive models.

- Extend `ThinkingConfigAdaptive` in `packages/api/src/types/anthropic.ts`
  with an optional `display: 'summarized' | 'omitted'` field
- Emit `{ type: 'adaptive', display: 'summarized' }` from
  `configureReasoning` in `packages/api/src/endpoints/anthropic/helpers.ts`
- Emit `{ type: 'adaptive', display: 'summarized' }` from
  `bedrockInputParser` in `packages/data-provider/src/bedrock.ts` and
  update the local `ThinkingConfig` union
- Update existing adaptive-thinking assertions to include the new field
- Add dedicated tests asserting `display: 'summarized'` flows through
  both the Anthropic endpoint and the Bedrock parser

See https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7#thinking-content-omitted-by-default

* refactor: Gate `display: summarized` on Opus 4.7+

Narrow the reasoning-visibility opt-in to the models that actually omit
thinking content by default, instead of applying it to every adaptive
model. Pre-Opus-4.7 adaptive models (Opus 4.6, Sonnet 4.6) already return
summaries, so sending the field is unnecessary noise.

- Add `omitsThinkingByDefault(model)` in `packages/data-provider/src/bedrock.ts`
  that returns true only for Opus 4.7+ (including future majors like Opus 5+)
- Bedrock parser now only attaches `display: 'summarized'` when the helper
  matches, keeping the adaptive object unchanged for older models
- Anthropic endpoint `configureReasoning` uses the same helper so its emit
  path matches the Bedrock one
- Tests: replace the blanket `display: 'summarized'` assertions with
  model-specific ones (Opus 4.7 gets it, Opus 4.6 / Sonnet 4.6 do not),
  add a dedicated `omitsThinkingByDefault` suite covering naming variants
  and future versions
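
A hedged sketch of such a version gate; the real matcher in
`bedrock.ts` handles more naming variants than this one regex:

```ts
// Hypothetical matcher: true only for Opus 4.7+ (including Opus 5+).
function omitsThinkingByDefault(model: string): boolean {
  const match = /claude-opus-(\d+)(?:[.-](\d+))?/.exec(model);
  if (!match) {
    return false;
  }
  const major = Number(match[1]);
  const minor = Number(match[2] ?? '0');
  return major > 4 || (major === 4 && minor >= 7);
}

// omitsThinkingByDefault('anthropic.claude-opus-4-7') === true
// omitsThinkingByDefault('claude-opus-4-6') === false
```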

* feat: Configurable Thought Visibility for Anthropic Adaptive Models

Expose the Anthropic `thinking.display` API field as a user-facing
parameter so users can override the `auto` default (which stays as the
Opus-4.7+ opt-in added earlier in this PR). Also fixes the CI type error
by widening the adaptive thinking type assignment via a resolver helper
that returns a properly-typed object.

- Add `ThinkingDisplay` enum (`auto` | `summarized` | `omitted`) and
  matching zod schema in `packages/data-provider/src/schemas.ts`
- Add `thinkingDisplay` to `tConversationSchema`, `anthropicSettings`, and
  the pick lists for Bedrock input/parser + Anthropic agent params
- Add `resolveThinkingDisplay(model, explicit)` helper in
  `packages/data-provider/src/bedrock.ts` that returns the wire value or
  undefined (auto → model default, explicit → always honored)
- `bedrockInputParser` now reads `thinkingDisplay` from input and emits
  `display` only when the resolver returns a value; strips the field
  on non-adaptive-model branches so it does not leak
- `configureReasoning` in the Anthropic endpoint threads
  `thinkingDisplay` through, uses the resolver, and casts the adaptive
  config to `AnthropicClientOptions['thinking']` so the widened shape
  compiles against the stale installed SDK types
- Add UI slider for `thinkingDisplay` in `parameterSettings.ts` next to
  `effort`, with three-position `com_ui_auto` / `com_ui_summarized` /
  `com_ui_omitted` labels
- Add translation keys `com_endpoint_anthropic_thinking_display`,
  `com_endpoint_anthropic_thinking_display_desc`, `com_ui_summarized`,
  `com_ui_omitted`
- Add tests: `resolveThinkingDisplay` suite (5 cases covering auto /
  explicit / unknown input), parser round-trip tests for all three
  modes on Opus 4.6 and Opus 4.7, Anthropic endpoint tests for explicit
  summarized/omitted overrides
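
A sketch of the resolver contract described above; types are
illustrative:

```ts
type ThinkingDisplay = 'auto' | 'summarized' | 'omitted';
declare function omitsThinkingByDefault(model: string): boolean;

function resolveThinkingDisplay(
  model: string,
  explicit?: ThinkingDisplay,
): Exclude<ThinkingDisplay, 'auto'> | undefined {
  if (explicit && explicit !== 'auto') {
    return explicit; // an explicit user choice is always honored
  }
  // 'auto': only models that omit thinking by default need the opt-in;
  // undefined keeps the wire payload unchanged for everything else.
  return omitsThinkingByDefault(model) ? 'summarized' : undefined;
}
```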

* fix: Drop `thinkingDisplay` When Adaptive Thinking Is Disabled

If a user turns adaptive thinking off but had previously selected a
`thinkingDisplay` value, the stale field was left in `additionalFields`
and ended up merged into the Bedrock request's
`additionalModelRequestFields`. That leaks a non-Bedrock key into the
payload and can round-trip back into `llmConfig`.

- Delete `additionalFields.thinkingDisplay` alongside `thinking` and
  `thinkingBudget` in the `thinking === false` branch of
  `bedrockInputParser`
- Add a regression test asserting `thinking`, `thinkingBudget`, and
  `thinkingDisplay` are all absent when adaptive thinking is disabled on
  an Opus 4.7 request

Reported by chatgpt-codex-connector on PR #12701.

* refactor: Consolidate `ThinkingDisplay` Types and Preserve Persisted Display

Address review findings on PR #12701:

- [Codex P2] `bedrockInputSchema.transform` now extracts
  `thinking.display` from persisted `additionalModelRequestFields` back
  into the top-level `thinkingDisplay` field so explicit `'omitted'`
  round-trips through storage instead of being silently reverted to
  `'summarized'` on the next parse.
- [Codex P2] `getLLMConfig` in the Anthropic endpoint now reads
  `.display` from a persisted `thinking` object (agents store the full
  Anthropic shape) and uses it as the fallback for `thinkingDisplay`
  when no top-level override is present.
- [Audit #2] Collapse the three parallel wire-value types into a single
  `ThinkingDisplayWireValue = Exclude<ThinkingDisplay, 'auto'>` exported
  from `schemas.ts`; remove the duplicate `ThinkingDisplay` alias in
  `packages/api/src/types/anthropic.ts` (which collided with the enum
  name) and the `ThinkingDisplayValue` alias in `bedrock.ts`.
- [Audit #3] Add `thinkingDisplay` to the `TEndpointOption` pick list
  next to `effort`.
- [Audit #4] Add a TODO comment next to the `as
  AnthropicClientOptions['thinking']` cast explaining the stale
  `@librechat/agents` SDK types that require it.
- Add tests: four round-trip cases asserting `bedrockInputSchema`
  recovers `display` from persisted AMRF (Opus 4.7 omitted, pre-4.7
  summarized, unknown-value ignore, explicit top-level wins), and two
  `getLLMConfig` cases asserting the Anthropic endpoint preserves and
  overrides persisted `thinking.display`.

* fix: Preserve Persisted `thinking.display` in bedrockInputParser

The parser constructed a fresh adaptive thinking config without looking at
any `display` already embedded in the incoming
`additionalModelRequestFields.thinking`. On round-trip through
`initializeBedrock`, a persisted user choice of `'omitted'` on Opus 4.7+
was silently reverted to `'summarized'` by the auto fallback.

- Extract `extractPersistedDisplay` helper and reuse it in both the
  schema transform (form-state round-trip) and the parser (wire-request
  round-trip)
- `bedrockInputParser` now feeds the persisted display as the resolver's
  explicit value when no top-level `thinkingDisplay` override is set
- Add regression tests: parser preserves `display: 'omitted'` for
  persisted Opus 4.7 AMRF, and top-level `thinkingDisplay` still wins
  over persisted AMRF display
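
A minimal sketch of the helper's contract; the persisted AMRF shape
shown is an assumption:

```ts
type PersistedAMRF = { thinking?: { display?: string } };

function extractPersistedDisplay(
  amrf?: PersistedAMRF,
): 'summarized' | 'omitted' | undefined {
  const display = amrf?.thinking?.display;
  // Unknown values are ignored rather than forwarded to the wire.
  return display === 'summarized' || display === 'omitted' ? display : undefined;
}
```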

Reported by chatgpt-codex-connector (P1) on PR #12701.
2026-04-16 21:56:52 -04:00
Danny Avila
e2e3284713
🦉 feat: Claude Opus 4.7 Model Support (#12698)
* 🦉 feat: Claude Opus 4.7 Model Support

- Add `claude-opus-4-7` to shared Anthropic models and `anthropic.claude-opus-4-7` to Bedrock models
- Register 1M context window and 128K max output in anthropic token maps
- Add token pricing ($5/$25), cache rates (6.25/0.5), and premium tier ($10/$37.50 above 200K) in tx.ts
- Update `.env.example` with Opus 4.7 IDs in `ANTHROPIC_MODELS` and `BEDROCK_AWS_MODELS` examples
- Add parallel Opus 4.7 test cases for token/cache/premium rates, context length, max output, name-variation matching, and 1M-context qualification

* feat: Add `xhigh` Effort Level for Opus 4.7

- Add `xhigh` variant to `AnthropicEffort` enum between `high` and `max`
- Expose `xhigh` in `anthropicSettings.effort.options` and the UI slider `enumMappings`
- Reuse existing `com_ui_xhigh` translation key

* test: Cover `xhigh` Effort and Exact Opus 4.7 Premium Rates

- Assert `xhigh` position (between high and max), inclusion in
  `anthropicSettings.effort.options`, zod acceptance, and rejection of
  unknown values in schemas.spec.ts
- Verify bedrockInputParser emits `output_config: { effort: 'xhigh' }`
  for adaptive `anthropic.claude-opus-4-7`
- Verify getLLMConfig sets adaptive thinking and `output_config.effort =
  'xhigh'` for `claude-opus-4-7`
- Pin Opus 4.7 premium pricing to exact threshold/prompt/completion
  values (200000 / 10 / 37.5) so silent rate drift fails the test
2026-04-16 14:51:00 -04:00
Danny Avila
49f228de78
🔼 refactor: Improve UX for Command Popovers (#12677)
* refactor: Improve UX for Command Popovers

* Added loading state handling in Mention and PromptsCommand components to display a spinner when data is being fetched.
* Refactored onFocus logic to clear the textarea and set the search value based on command character input.
* Introduced a new `isLoading` state in the useMentions hook to manage loading indicators across multiple data queries.
* Added unit tests for the useHandleKeyUp hook to ensure command triggering works correctly under various conditions.

* ci: useHandleKeyUp tests for command navigation

* Added tests to ensure that the command popovers do not trigger when the cursor is mid-text after pressing ArrowLeft or Delete.
* Updated the shouldTriggerCommand function to refine the conditions under which commands are triggered based on cursor position.
* Improved agent query handling in useMentions hook for better performance and clarity.

* refactor: Optimize Mention and PromptsCommand Components

* Refactored Mention and PromptsCommand components to utilize Recoil state for popover visibility, improving state management and reducing prop drilling.
* Simplified onFocus logic to enhance user experience when interacting with command inputs.
* Added unit tests for useHandleKeyUp to ensure proper command handling and popover visibility based on user input.
* Improved performance by memoizing popover state and reducing unnecessary re-renders.

* fix: Address review findings for command popover refactor

- Fix endpointType regression: add effectiveEndpointByIndex selector
  that returns endpointType ?? endpoint, matching the original ChatForm
  guard for custom endpoints proxying assistants
- Extract duplicated initInputRef callback into shared useInitPopoverInput
  hook, used by both Mention and PromptsCommand
- Add navigation keys (ArrowLeft, ArrowRight, ArrowDown, Home, End,
  Delete) to invalidKeys to prevent false popover triggers
- Add endpoint gating tests for assistants/azureAssistants blocking the
  + command
- Remove unused _index param from MentionContent
2026-04-15 17:47:24 -04:00
Danny Avila
dd26a2fda5
🧩 style: Agent Side Panel Layout and Consistency Fixes (#12676)
* style: Update padding in ActionsPanel, ModelPanel, and AdvancedPanel for improved layout

* Adjusted padding in ActionsPanel, ModelPanel, and AdvancedPanel components to enhance visual consistency and layout.
* Changed `py-4` to `pt-2` in the main container of each panel to reduce vertical spacing and improve overall design aesthetics.

* style: Update text size in MCPTool component for improved accessibility

* Changed text size in the MCPTool component to `text-sm` for better readability and consistency across the UI.
* This adjustment enhances the user experience by ensuring that text is appropriately sized for various display settings.

* style: Enhance layout and accessibility in ActionsInput, ActionsPanel, and AgentPanel components

* Updated the layout in ActionsInput to improve flex properties and ensure better responsiveness.
* Refined the structure of ActionsPanel for a more consistent visual hierarchy and added accessibility features.
* Adjusted the AgentPanel form layout for improved usability and streamlined component integration.
* Changed text size in Dropdown component to `text-sm` for better readability across the UI.

* style: Refactor layout in ActionsInput component for improved responsiveness

* Updated the layout of the ActionsInput component to enhance flex properties and ensure better responsiveness.
* Adjusted the textarea styling for improved usability and consistency in design.
* Removed commented-out code related to example functionality to clean up the component structure.

* refactor: Simplify single line code detection in MarkdownComponents

* Introduced a new utility function `isSingleLineCode` to streamline the logic for determining if code is a single line.
* Updated references in the `MarkdownCode` and `MarkdownCodeNoExecution` components to use the new utility function for improved readability and maintainability.
* Enhanced the `processChildren` function in the Markdown editor to handle non-code elements more effectively.
2026-04-15 14:27:13 -04:00
kojinseok-del
edd4c6d60c
💎 fix: Handle usage_metadata in Title Transaction for Gemini Models (#12386)
* fix: handle `usage_metadata` in title transaction for Gemini models

Gemini models return token usage via `usage_metadata` instead of `usage`
or `tokenUsage`. The `collectedUsage` mapping in `titleConvo` only
handled the latter two, causing title generation transactions to be
silently skipped for Gemini. Adds an `else if (item.usage_metadata)`
branch to extract `input_tokens`/`output_tokens`.
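
A hedged sketch of the described branch; the field names on the pushed
record are assumptions:

```ts
type TitleUsageItem = {
  usage?: { prompt_tokens: number; completion_tokens: number };
  tokenUsage?: { promptTokens: number; completionTokens: number };
  usage_metadata?: { input_tokens: number; output_tokens: number };
};

declare const collectedUsage: Array<{ promptTokens: number; completionTokens: number }>;

function collectItem(item: TitleUsageItem): void {
  if (item.usage) {
    // existing handling elided
  } else if (item.tokenUsage) {
    // existing handling elided
  } else if (item.usage_metadata) {
    // Gemini reports token counts here instead of usage/tokenUsage.
    collectedUsage.push({
      promptTokens: item.usage_metadata.input_tokens,
      completionTokens: item.usage_metadata.output_tokens,
    });
  }
}
```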

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove trailing whitespace in usage_metadata handler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:09:37 -04:00
Danny Avila
6183303653
🔉 fix: Normalize audio MIME types in STT format validation (#12674)
* fix: normalize audio MIME types in STT format validation

Use getFileExtensionFromMime() to normalize non-standard MIME types
(e.g. audio/x-m4a, audio/x-wav, audio/x-flac) before checking against
the accepted formats list in azureOpenAIProvider. This is the same class
of bug as #12608 (text/x-markdown), but for STT audio validation.

Only audio/ and video/ MIME prefixes are normalized to prevent
non-audio types from matching via the webm default fallback.

Export getFileExtensionFromMime for testability.

Fixes #12632

* fix: reject unknown audio subtypes in STT format validation

Use MIME_TO_EXTENSION_MAP for normalization instead of
getFileExtensionFromMime() which falls back to 'webm' for unrecognized
types. Gate raw subtype matching on audio/video prefix to prevent
non-audio types (e.g. text/webm) from passing validation.

Resolves Codex review comment about unknown subtypes silently passing.
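
A sketch of the combined rules, with an abbreviated map and a
hypothetical function name:

```ts
// Map contents abbreviated; the real MIME_TO_EXTENSION_MAP is larger.
const MIME_TO_EXTENSION_MAP: Record<string, string> = {
  'audio/x-m4a': 'm4a',
  'audio/x-wav': 'wav',
  'audio/x-flac': 'flac',
};

function normalizeSttFormat(mimeType: string): string | undefined {
  const mapped = MIME_TO_EXTENSION_MAP[mimeType];
  if (mapped) {
    return mapped;
  }
  const [prefix, subtype] = mimeType.split('/');
  // Only audio/video subtypes may fall through to raw matching, so
  // something like text/webm can never pass validation.
  return prefix === 'audio' || prefix === 'video' ? subtype : undefined;
}
```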

---------

Co-authored-by: Tobias Jonas <t.jonas@innfactory.de>
2026-04-15 09:58:07 -04:00
Danny Avila
76e9543f99
🧹 chore: Cap PR Indexes at 3 and Add Delete-Before-Sync (#12672)
* fix: add docker system prune before image pull to prevent disk exhaustion

The 60GB droplet filled up after ~40 deploys because each
docker compose pull leaves the previous image's layers as
dangling/unused. The gitnexus image is ~700MB, so ~40 stale
copies ≈ 28GB of dead layers. Combined with indexes, OS, and
Docker's build cache, the disk hits 100% and the next pull fails
with 'no space left on device'.

Add a docker system prune -af --volumes BEFORE pulling the new
image on every deploy. This removes stopped containers, unused
networks, all images not referenced by a running container, and
build cache. Running containers are never touched. Typically
frees 1-2GB per deploy (the previous image's layers).

Also add a hard 2GB free-space guard after prune so the deploy
fails with a clear error instead of letting docker pull attempt
a 700MB extract onto a near-full disk.

* fix: cap PR indexes at 3 + delete-before-sync for 10GB disk

The 10GB droplet has ~2GB free. Each index is ~130MB, so 7 PR indexes
(~900MB) plus main+dev (~260MB) plus the ~700MB Docker image leaves
almost nothing for image pulls. The deploy failed with 'no space left
on device' during docker compose pull.

Three changes:

1. Cap PR indexes at MAX_PR_INDEXES=3. The resolve step now sorts
   PR artifacts by created_at descending and only keeps the 3 most
   recent. Older PR indexes are logged as evicted and their droplet
   folders get cleaned by the prune step.

2. Prune BEFORE sync (was after). Freeing disk space from evicted
   indexes before rsyncing new data is critical on a tight disk. The
   old order (sync then prune) could briefly hold both old evicted
   indexes and newly-uploaded ones simultaneously.

3. Delete-before-sync for every index, including main/dev. Instead
   of rsync --delete (which transfers new files then removes extras),
   rm -rf the target folder before rsync so the disk never holds both
   old and new copies of the same index (~260MB saved per index).
   Main/dev are only deleted when a fresh artifact is about to replace
   them — never evicted between deploys.

Budget on 10GB disk:
  OS + Docker engine:    ~4.0 GB
  Docker image (running): ~0.7 GB
  main + dev indexes:    ~0.26 GB
  3 PR indexes:          ~0.39 GB
  Docker prune headroom: ~0.7 GB (for image pull)
  Free:                  ~3.9 GB

* refine: restrict automatic PR indexing to danny-avila authored PRs

With 200+ open PRs and a 10GB disk capped at 3 served PR indexes,
auto-indexing every contributor PR burns CI minutes for artifacts
that will mostly be evicted before anyone queries them.

Narrow the pull_request auto-trigger to PRs authored by danny-avila
only. Other contributors' PRs can still be indexed on demand via
/gitnexus index (contributor-gated comment command) or manual
workflow_dispatch — both arrive as workflow_dispatch events and
bypass the pull_request filter entirely.

* fix: drop --volumes from docker system prune to preserve Caddy TLS state

The deploy workflow explicitly handles a caddy-not-running state later
in the same step. If Caddy is stopped when the prune runs, --volumes
deletes the caddy-data and caddy-config volumes (TLS certs + ACME
account keys), forcing a Let's Encrypt re-issuance on next start.
LE rate-limits to 5 certs per domain per week, so repeated wipes
could brick HTTPS for days.

docker system prune -af (without --volumes) still removes stopped
containers, unused networks, all dangling/unreferenced images, and
build cache — which is where the disk savings come from. Named
volumes are left untouched.

* fix: rsync-then-swap instead of delete-before-sync

The delete-before-sync pattern removed the live index BEFORE rsync
ran. If rsync failed (SSH timeout, disk pressure, network error),
the index was already gone — production served nothing for that
repo until a later deploy succeeded.

Replace with rsync-then-swap: upload to a .new temp directory, and
only rm + mv into place after rsync succeeds. On rsync failure,
the .new temp is cleaned up and the old index stays live. The cost
is ~130MB of extra disk while both old and new coexist, but the
prune step runs first and frees evicted PR indexes, so this fits
comfortably on the 10GB disk.

* fix: fail deploy on main/dev rsync failure, soft-fail PRs only

The rsync-then-swap pattern downgraded ALL failures to a warning,
so the deploy continued even when LibreChat or LibreChat-dev failed
to sync. The job would pull the new image, restart the container,
and report success while serving stale or missing core indexes.

Split by criticality: main/dev rsync failures now exit 1 (aborting
the deploy before the container restart). PR index failures remain
soft-fail with a warning — a missing PR index is inconvenient but
shouldn't take the whole server down.
2026-04-15 09:46:48 -04:00
Ivan
f6b73938af
📝 docs: add RAG_API_URL to .env.example (#12665) 2026-04-15 09:46:29 -04:00
Danny Avila
a613caced3
🧊 fix: In-Memory Endpoint Token Config Cache Isolation (#12673)
* fix: endpoint token config not using shared cache in same process (initializing clients)

* refactor: Update default max context tokens for agent initialization

- Introduced a constant `DEFAULT_MAX_CONTEXT_TOKENS` set to 32000.
- Updated the `initializeAgent` function to use this constant instead of hardcoded values for maximum context tokens, improving maintainability and clarity.

* refactor: shared caching mechanism for token configuration

- Introduced a memoized in-memory cache for Keyv instances to ensure shared access across the same namespace, improving cache efficiency.
- Updated the `standardCache` function to utilize the new in-memory cache for the TOKEN_CONFIG namespace.
- Refactored the `initializeCustom` function to use the `tokenConfigCache` for better cache management.
- Removed redundant tokenCache parameter from `fetchModels` to streamline the function signature.

* fix: match TOKEN_CONFIG TTL and add memoization tests

Pass Time.THIRTY_MINUTES to tokenConfigCache() to match the TTL used
by getLogStores.js, preventing load-order-dependent expiry behavior.

Add 7 automated tests covering in-memory memoization: referential
identity, cross-call-site data sharing, namespace isolation,
first-caller TTL semantics, fallbackStore bypass, and tokenConfigCache
parity with direct standardCache access.
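
A hedged sketch of the memoization described above; constant names are
illustrative:

```ts
import Keyv from 'keyv';

const THIRTY_MINUTES = 30 * 60 * 1000;
const keyvInstances = new Map<string, Keyv>();

function memoizedKeyv(namespace: string, ttl?: number): Keyv {
  let instance = keyvInstances.get(namespace);
  if (!instance) {
    // First caller wins: its TTL becomes the namespace's TTL, which is
    // why tokenConfigCache must pass the same value getLogStores.js uses.
    instance = new Keyv({ namespace, ttl });
    keyvInstances.set(namespace, instance);
  }
  return instance;
}

const tokenConfigCache = () => memoizedKeyv('TOKEN_CONFIG', THIRTY_MINUTES);
```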

* fix: export DEFAULT_MAX_CONTEXT_TOKENS, address review nits

- Export the constant so tests (and future consumers) reference it
  directly instead of hardcoding the numeric value.
- Add independent TTL assertion for tokenConfigCache (R-1 nit).
- Add tokenConfigCache mock to custom/initialize.spec.ts.
2026-04-15 09:41:42 -04:00
Danny Avila
b40e8be7c8
🖼️ fix: Hide Duplicate Image Placeholder During Image Generation (#12654)
* fix: Hide duplicate image placeholder during image generation

* test: Update OpenAIImageGen tests for conditional Image rendering
2026-04-14 07:53:23 -04:00
github-actions[bot]
5d108df665
🌍 i18n: Update translation.json with latest translations (#12646)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-13 22:32:02 -04:00
Danny Avila
60cee6eec7
🔍 fix: Anthropic Web Search Multi-Turn Issue and Attachment Results (#12651)
* 🔍 fix: Improve WebSearch Progress Handling Based on Attachment Results

- Adjusted progress handling in the WebSearch component to treat searches as complete if attachments contain results, addressing issues with server tool calls not receiving completion signals.
- Introduced `effectiveProgress` to reflect the actual state of progress based on the presence of results, enhancing the accuracy of cancellation and completion states.
- Updated related logic to ensure proper handling of search completion and finalization states based on the new progress calculations.

* fix: only override progress when not streaming

During streaming (isSubmitting=true), use actual progress so the
searching/processing/reading states display correctly. Only override
to 1 after streaming completes to prevent the cancelled check from
hiding the component.
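
As a sketch, the override reduces to one guarded expression, where
`hasResults` stands in for the attachment-results check:

```ts
declare const isSubmitting: boolean;
declare const progress: number;
declare const hasResults: boolean;

// While streaming, real progress drives the searching/processing/reading
// states; once streaming ends, any turn with results counts as complete.
const effectiveProgress = !isSubmitting && hasResults ? 1 : progress;
```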

* chore: Update @librechat/agents and mathjs dependencies to latest versions

* chore: Upgrade mathjs dependency to version 15.2.0 across package-lock and package.json files
2026-04-13 15:44:41 -04:00
Phạm Trung Nam
5cc783b8e8
🎯 fix: Preserve Selected Artifact When Clicking Artifact Button (#12601)
* fix: preserve selected artifact when clicking artifact button

* fix: preserve artifact selection on click and during streaming

* fix: remove broken streaming guard, add JSDoc and test

- Remove `userHasManualSelection` guard from effect #3: it cannot
  distinguish manual clicks from system auto-selection, blocking
  auto-advancement to new artifacts during streaming.
- Add JSDoc on `currentArtifactIdRef` explaining why it must not be
  added to effect deps (toggle-close regression).
- Add test verifying auto-advancement during streaming.

* style: use standard multi-line JSDoc format for ref comment

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-13 10:37:30 -04:00
Danny Avila
c4bb41137d
🔑 fix: Clear Stale Client Registration on invalid_client During OAuth Token Refresh (#12643)
* fix: clear stale client registration on invalid_client during token refresh

When a token refresh fails with `invalid_client`, the stored DCR client
registration is no longer valid on the authorization server. The existing
error handler only checked for `unauthorized_client` and returned null,
leaving the stale client_id cached in the database permanently. Every
subsequent token refresh attempt would fail with the same error.

Now when `invalid_client` is detected during refresh:
1. The stale client registration is deleted from the database
2. A `ReauthenticationRequiredError` is thrown to trigger a fresh OAuth
   flow with new dynamic client registration

Also passes `deleteTokens` from MCPConnectionFactory to getTokens() so
the cleanup has access to the token deletion method.
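
A hedged sketch of the refresh error handling; the helper names appear
in these commits, while the plumbing around them is an assumption:

```ts
declare function isInvalidClientMessage(message: string): boolean;
declare class ReauthenticationRequiredError extends Error {}

async function refreshWithCleanup(
  doRefresh: () => Promise<unknown>,
  deleteTokens?: () => Promise<void>,
): Promise<unknown | null> {
  try {
    return await doRefresh();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    if (isInvalidClientMessage(message)) {
      // The stored DCR registration is dead on the authorization server:
      // clear it (and the stale refresh token) so the next attempt
      // re-registers instead of failing forever.
      await deleteTokens?.();
      throw new ReauthenticationRequiredError(message);
    }
    if (message.toLowerCase().includes('unauthorized_client')) {
      return null; // pre-existing behavior for this error class
    }
    throw error;
  }
}
```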

* fix: address review findings for stale client cleanup on token refresh

- Delete stale refresh token alongside client registration on invalid_client (Finding 1)
- Add tests for all new code paths: cleanup, warning, case-insensitivity, cleanup failure (Finding 2)
- Detect all vendor-specific client rejection patterns (client_id mismatch, client not found, unknown client) with case-insensitive matching (Finding 3)
- Use else-if for mutually exclusive error branches (Finding 4)
- Log warning when deleteTokens is not available on client rejection (Finding 6)
- Fix log message to say "attempting to clear" before async cleanup (Finding 7)
- Extract isClientRejectionMessage to shared utility, refactor MCPConnectionFactory.isClientRejection to use it

* fix: address followup review findings

- Extract isInvalidClientMessage (4 stale-client patterns) from isClientRejectionMessage
  to eliminate pattern duplication between utils.ts and tokens.ts (Finding 1)
- Remove redundant staleIdentifier variable, reuse identifier already in scope (Finding 2)
- Separate await from .then() on Promise.allSettled for readability (Finding 3)
- Add dedicated unit tests for isInvalidClientMessage and isClientRejectionMessage (Finding 4)
- Assert error message content in primary invalid_client test (Finding 5)
- Add JSDoc on deleteTokens in GetTokensParams (Finding 6)

---------

Co-authored-by: Mani Japra <mani@muonspace.com>
2026-04-13 10:12:54 -04:00
Danny Avila
7b48203906
🗂️ feat: Sidebar Icon Toggle & New Chat History Switch (#12642)
* 🗂️ feat: Sidebar Icon Toggle & New Chat History Switch

Add collapse-on-active-click for sidebar icons (VSCode-style) and optionally switch to Chat History panel when creating a new chat.

* fix: Address review findings — extract DEFAULT_PANEL constant, add tests

Export DEFAULT_PANEL from ActivePanelContext and use it in ExpandedPanel
instead of hardcoding 'conversations'. Add ExpandedPanel tests covering
NavIconButton collapse toggle and NewChatButton panel switch behaviors.

* fix: Address review — prop-drill setActive, test disabled setting, strengthen assertions

Pass setActive as a prop to NewChatButton instead of subscribing to
ActivePanelContext, avoiding wasted re-renders on every panel switch.
Add negative-path test for switchToHistory=false. Add positive panel
assertions to inactive-icon click tests. Fix import order.
2026-04-13 09:46:38 -04:00
Caesar Isidro Va-ay
9b9a86d17d
🔀 fix: Resolve Action Tools by Exact Name to Prevent Multi-Action Domain Collision (#12594)
* 🐛 fix: resolve Action tools by exact tool name to prevent multi-action collision

When two OpenAPI Actions on the same Agent share a hostname, the second
action's entry overwrote the first in the encoded-domain Map and one
action's tools silently disappeared from the LLM payload. The buggy
resolution loop also used substring matching, which caused similar
shadowing for any encoded-domain prefix overlap.

This change builds a Map keyed on the full tool name
(`<operationId>_action_<encoded-domain>`) directly, mirroring the exact
lookup pattern that getActionToolDefinitions already uses. Each function
in an action's spec gets its own slot, so two actions sharing a hostname
no longer collide. Both the new and the legacy domain encodings are
registered for each function so agents whose stored tool names predate
the current encoding still resolve.

Applied at all three call sites that had the buggy pattern:

  - processRequiredActions (assistants/threads path)
  - loadAgentTools (agent build path)
  - loadActionToolsForExecution (agent execution path)

Adds three regression tests covering both ordering directions and the
execution path. Tests fail without the fix and pass with it.
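
A sketch of the exact-name registration pattern; shapes and delimiter
values are assumptions drawn from this PR's discussion:

```ts
type ActionEntry = { sig: { name: string } };
declare const functionSignatures: Array<{ name: string }>;
declare function makeEntry(sig: { name: string }): ActionEntry;
declare const actionDelimiter: string; // '_action_'
declare const encodedDomain: string;
declare const legacyEncodedDomain: string;

const toolToAction = new Map<string, ActionEntry>();
for (const sig of functionSignatures) {
  // Key on the full tool name <operationId>_action_<encoded-domain>,
  // so two actions sharing a hostname each keep their own slot.
  toolToAction.set(`${sig.name}${actionDelimiter}${encodedDomain}`, makeEntry(sig));
  if (legacyEncodedDomain !== encodedDomain) {
    // Stored tool names that predate the current encoding still resolve.
    toolToAction.set(`${sig.name}${actionDelimiter}${legacyEncodedDomain}`, makeEntry(sig));
  }
}
// Lookup is then an exact match: toolToAction.get(toolName)
```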

* 🐛 fix: Normalize action tool name at lookup + cover assistants path

Follow-up to the multi-action domain collision fix. Addresses PR #12594
review feedback:

**Must-fix #1 — short-hostname lookup mismatch.** The toolToAction map
is keyed on the `_`-collapsed domain, but `agent.tools` and
`currentAction.tool` persist the raw `domainParser(..., true)` output,
which for hostnames ≤ ENCODED_DOMAIN_LENGTH is a `---`-separated string
(e.g. `medium---com`). Exact-match `Map.get()` missed those keys and
silently dropped the tool. Fix: normalize every incoming tool name
through a new `normalizeActionToolName` helper before the lookup in
`loadAgentTools`, `processRequiredActions`, and
`loadActionToolsForExecution`.

**Must-fix #2 — assistants path coverage.** `processRequiredActions`
received the same structural rewrite but had zero tests. Added a
regression test under `multi-action domain collision regression` that
drives two shared-hostname actions through the assistants path and
asserts each tool reaches its own request builder.

**Must-fix #3 — legacy encoding branch coverage.** The
`if (legacyNormalized !== normalizedDomain)` registration was never
exercised by any test. Added a test where `agent.tools` stores the
legacy-format name and asserts it still resolves.

**Should-fix #4 — DRY the registration loop.** Extracted
`registerActionTools({ toolToAction, functionSignatures,
normalizedDomain, legacyNormalized, makeEntry })`. All three call sites
now share the same key-building logic; the key template lives in one
place.

**Should-fix #5 — remove stale optional chaining.** In
`loadActionToolsForExecution`, `functionSignature?.description ?? ''`
became `functionSignature.description` — `sig` is always defined by the
iterator, matching the style of `loadAgentTools`.

**Should-fix #6 — drop unreachable `!requestBuilder` guard.** Entries
in `processRequiredActions` are now pre-built with
`requestBuilder: requestBuilders[sig.name]`, which `openapiToFunction`
always produces alongside the signature, so the guard is dead.

**Should-fix #7 — unwrap `actionSetsData`.** It now holds a bare
`Map` instead of `{ toolToAction }`; the sentinel `!actionSetsData`
check still works because `new Map()` is truthy.

Also added a short-hostname regression test
(`loadAgentTools resolves raw ---separated tool names`) that reproduces
Must-fix #1: it fails against the previous commit (0 create calls) and
passes with the normalization in place.

41 tests, all passing. The 3 new regression tests are under
`multi-action domain collision regression` and cover the assistants
path, the legacy encoding branch, and the short-hostname lookup path.

* 🐛 fix: Tighten registerActionTools key handling and assistants test

Follow-up to d643444 addressing the second review pass on PR #12594.

**ESLint** — Two prettier errors in the spec file (multi-line arrow
function bodies that should fit on one line). Auto-fixed.

**[MINOR] operationId containing `---` → key mismatch.** The lookup
path collapsed every `actionDomainSeparator` sequence in the full tool
name, but the registration path passed `sig.name` through unchanged.
A `---` that survived into an operationId would shift the underscore
boundary at lookup and miss its own key. Fix in `registerActionTools`:
normalize `sig.name` with the same helper so registration and lookup
always agree on the canonical form. `sanitizeOperationId` strips the
characters that produce `---` in practice, so this is theoretical
hardening, not a fix for a known reproducer.

**[MINOR] Same-operationId + same-hostname silent overwrite.** Two
actions sharing both an operationId and a hostname still produced a
silent `Map.set()` overwrite (the new key is identical, so neither the
operationId nor the domain disambiguates). Added a `setKey` helper
inside `registerActionTools` that logs a `[Actions] operationId
collision: ...` warning whenever a key is already present, naming the
overwriting action_id. The silent-overwrite mode from the original bug
cannot reappear under a different disguise without surfacing in the
logs.

**[NIT] processRequiredActions test simulated a runtime crash.**
`mockCreateActionTool` returned a tool with `_call: jest.fn()`, which
resolves to `undefined`. `processRequiredActions` chains
`.then(handleToolOutput).catch(handleToolError)` directly onto that
return, so `undefined.then(...)` threw synchronously and the outer
try/catch funneled the error into `handleToolError`. Creation count
assertions still passed because `createActionTool` runs before the
crash, but the test was silently exercising the failure path. Updated
the global mock to `_call: jest.fn().mockResolvedValue('{"status":"ok"}')`
so the success path runs end-to-end. The assistants regression test
now executes in ~5ms instead of ~90ms, which corroborates that it's
no longer hitting the synchronous throw.

**[NIT] Duplicated rationale comments.** All three call sites carried
multi-line comment blocks restating why we key on the full tool name.
That rationale now lives canonically in `registerActionTools`'s JSDoc;
the inline blocks collapsed to `// See registerActionTools for the
key-shape rationale.` Net -22 lines of comments.

41/41 tests still pass; lint is clean.

* 🐛 fix: Scope tool-name normalization to the encoded-domain suffix

Follow-up to f22228e addressing the Codex P1 on PR #12594.

**Regression.** The previous commit normalized the entire tool name
(`normalizeActionToolName(sig.name)` at registration, full-name
`.replace()` at lookup) to handle operationIds that theoretically
contained `---`. But `openapiToFunction` uses user-supplied
operationIds verbatim and the fallback `sanitizeOperationId` only
strips characters outside `[a-zA-Z0-9_-]`, so specs can legitimately
produce operationIds like `get_foo---bar` and `get_foo_bar` side by
side. Collapsing `---` to `_` across the entire key merged those two
into a single map slot — one silently overwrote the other, and both
tool requests routed to the surviving entry's request builder.

**Fix.** Limit normalization to the encoded-domain portion of the
full tool name, i.e. the substring after the last `actionDelimiter`.
The operationId half is left verbatim, so hyphens-vs-underscores
remain disambiguating. The short-hostname bug (Must-fix #1 from the
original review) is still covered because the `---` → `_` collapse
still happens on the domain suffix where it matters:

```js
const normalizeActionToolName = (toolName) => {
  const delimiterIndex = toolName.lastIndexOf(actionDelimiter);
  if (delimiterIndex === -1) return toolName;
  const prefixEnd = delimiterIndex + actionDelimiter.length;
  const encodedDomain = toolName.slice(prefixEnd);
  return toolName.slice(0, prefixEnd) + encodedDomain.replace(domainSeparatorRegex, '_');
};
```

`registerActionTools` reverts to `sig.name` verbatim — no more
`normalizeActionToolName(sig.name)` ahead of the key build.

**Regression test.** Added
`loadAgentTools distinguishes operationIds that differ only by
---` vs `_`` under `multi-action domain collision regression`. It
loads two actions sharing a hostname whose operationIds are
`get_foo---bar` and `get_foo_bar` respectively, each pointing at a
different path (`/foo-bar`, `/foo_bar`), and asserts that each tool
resolves to its own request builder. Verified the test fails against
f22228e (`hyphenTool` resolves to `/foo_bar` — the sibling's builder)
and passes with this commit.

42/42 tests pass; lint clean.
2026-04-13 09:08:06 -04:00
Airam Hernández Hernández
277fdd2b43
🪪 feat: Optimized Entra ID Group Sync with Auto-Creation (#12606)
* feat: implement optimized Entra group sync with auto-creation

## Changes

### MUST FIX (Critical Issues) - RESOLVED

1. **BUG FIX: Prevent unintended user removal from existing groups**
   - ISSUE: db.syncUserEntraGroups() was called with only missing groups, causing removal
     from all existing Entra groups (full bidirectional sync behavior)
   - SOLUTION: Replaced with db.upsertGroupByExternalId() for each missing group followed
     by single bulkUpdateGroups() to add memberships (race-safe, idempotent)
   - BENEFIT: User memberships correctly maintained for mix of existing + new groups

2. **JSDoc @throws contradiction**
   - ISSUE: JSDoc declared function throws, but implementation catches all errors
   - SOLUTION: Removed @throws from JSDoc - function is best-effort
   - BENEFIT: Prevents unnecessary try/catch in caller code

3. **Missing test for group creation flow**
   - ISSUE: Auto-creating missing Entra groups had no test coverage
   - SOLUTION: Added regression test for mix of existing + new groups scenario
   - BENEFIT: Prevents future regressions on critical path

### SHOULD FIX (Important Improvements) - RESOLVED

4. **E11000 race condition handling**
   - SOLUTION: Upserts are idempotent and race-safe by design
   - BENEFIT: Concurrent logins no longer race each other

5. **Direct Mongoose access instead of db layer**
   - SOLUTION: Added findGroupsByExternalIds() helper to userGroup.ts
   - BENEFIT: Centralized data access, easier to add tenant scoping

6. **Serial DB round-trips on login path**
   - ISSUE: 40+ queries for user with 20 new groups
   - SOLUTION: Promise.all() for parallel upserts + single bulkUpdate
   - BENEFIT: ~10x performance improvement

7. **Graph API 429/503 throttling unhandled**
   - SOLUTION: Retry logic with exponential backoff (1s, 2s delays)
   - BENEFIT: Temporary API issues no longer cause permanent membership loss

8. **Sequential batch requests slow**
   - ISSUE: 200 groups = 10 batches × 200ms = ~2s sequential
   - SOLUTION: Promise.all() with concurrency limit (5 parallel batches)
   - BENEFIT: ~400ms total time
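
A hedged sketch of the upsert-then-bulk-update flow from items 1 and 6;
argument shapes are assumptions:

```ts
declare const db: {
  upsertGroupByExternalId: (
    externalId: string,
    fields: { name: string },
  ) => Promise<{ _id: string }>;
  bulkUpdateGroups: (
    ops: Array<{ groupId: string; addUserIds: string[] }>,
  ) => Promise<void>;
};
declare const missingGroups: Array<{ externalId: string; displayName: string }>;
declare const user: { _id: string };

// Inside syncUserEntraGroupMemberships: idempotent, race-safe upserts
// run in parallel, then one bulk write adds the memberships without
// ever removing the user from groups the sync did not touch.
async function addMissingMemberships(): Promise<void> {
  const groups = await Promise.all(
    missingGroups.map((g) => db.upsertGroupByExternalId(g.externalId, { name: g.displayName })),
  );
  await db.bulkUpdateGroups(groups.map((g) => ({ groupId: g._id, addUserIds: [user._id] })));
}
```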

## Minor Fixes

- Removed dead code check
- PII removal: user._id instead of user.email in logs
- ES6 shorthand fixes
- Style consistency (blank lines)
- Projection optimization

## Verification

- npm run build - success
- npm run test:api - 61/61 passing (+ new regression test)
- npm run lint - no errors
- All feedback from danny-avila resolved

* docs: better JSDoc for the syncUserEntraGroupMemberships method

---------

Co-authored-by: Airam Hernández Hernández <airam.hernandez@intelequia.com>
2026-04-13 08:50:52 -04:00
Danny Avila
55840286d4
🔬 fix: Scope Web Search Results to Own Turn (#12631)
* 🔬 fix: Scope Web Search Results to Own Turn in WebSearch Component

Fix duplicated search results when multiple web_search tool calls occur
in a single response. Each WebSearch instance now displays only its own
turn's results instead of aggregating all turns.

* test: Add WebSearch turn-scoping tests

Cover the regression-prone turn isolation logic: verify each WebSearch
instance renders only its own turn's sources when sharing a SearchContext,
test the attachments-first priority path and the searchResults fallback,
and assert component states (cancelled, error, streaming, complete).

* fix: Use getAllByText for duplicate sr-only + visible text in tests

WebSearch renders status text in both an sr-only aria-live region and
a visible span, causing getByText to fail with multiple matches.

* refactor: Make ownTurn a string to avoid repeated conversions
2026-04-12 18:38:26 -04:00
Danny Avila
546f006e42
💬 feat: Serialize GitNexus Deploys and Post Completion Comments on PR Commands (#12623)
Three related changes that tighten the GitNexus CI/CD loop.

Serialized deploys
- Previous concurrency group was keyed by head ref with cancel-in-progress,
  which let deploys targeting different refs (e.g. main push + PR command)
  run in parallel. That's a data race: the prune-stale-indexes step
  computes active_names up front, so deploy A rsyncing
  /opt/gitnexus/indexes/LibreChat-pr-12580 can collide with deploy B
  pruning the same folder based on a pre-rsync view of the active set.
- Collapse to a single global group gitnexus-deploy with
  cancel-in-progress: false. All deploys queue behind one another.
  A rsync/docker-compose restart is never killed mid-operation.
  The 20-minute job timeout bounds queue depth.

PR completion feedback
- Add a "index complete" comment step in gitnexus-index.yml that
  fires only when inputs.pr_number is set (i.e. the run came via the
  /gitnexus command). Posts success or failure with a link to the
  run and whether embeddings were generated.
- Add a "deploy complete" comment step in gitnexus-deploy.yml that
  handles both trigger paths: workflow_run from a native PR auto-index
  (PR number recovered from the matrix entry whose runId matches the
  trigger run), and workflow_dispatch from the index workflow's bot-
  fallback path (PR number passed through as a new inputs.pr_number).
- Plumb inputs.pr_number through the bot-fallback dispatch in
  gitnexus-index.yml so the deploy workflow knows where to comment
  for command-triggered runs.
- Only comments on the PR that asked for the index, never broadcasts.

Workflow rename
- Drop the "DigitalOcean" suffix from the deploy workflow's display
  name and filename. The platform is still DO (.do/gitnexus/ still
  holds the compose + caddy config) but the workflow itself is
  platform-agnostic in form and the suffix was visual noise.
- File renamed gitnexus-deploy-do.yml -> gitnexus-deploy.yml.
- Concurrency group and all cross-references updated in lock-step.
- permissions at deploy job level now includes pull-requests: write
  so the completion comment can post.
2026-04-11 18:15:56 -04:00
Danny Avila
8cb5c62fa1
🔗 chore: Dispatch GitNexus Deploy When Index Is Bot-Triggered (#12621)
* fix: dispatch deploy from index when triggered by github-actions[bot]

GitHub Actions suppresses workflow_run events for workflow runs whose
triggering actor is GITHUB_TOKEN (to prevent recursive chains). This
means when gitnexus-pr-command.yml uses `gh api workflow_dispatch`
to kick off gitnexus-index.yml, the downstream gitnexus-deploy-do.yml
workflow_run trigger never fires — the PR command indexes the PR but
the new artifact never makes it onto the droplet.

Add a final step in gitnexus-index.yml that dispatches the deploy
workflow directly via API, but ONLY when the triggering actor is
github-actions[bot]. User-triggered runs (push, pull_request, manual
workflow_dispatch from the UI) continue to rely on workflow_run as
before, so we don't double-deploy.

Requires a new actions:write permission at the workflow level for
this dispatch. contents:read is unchanged.

* fix: resolve main/dev indexes by artifact name, not branch run query

The resolve step was querying listWorkflowRuns filtered by branch=main
and branch=dev, then assuming the latest successful run on each branch
produced the expected gitnexus-index-main / gitnexus-index-dev
artifact. That assumption breaks for /gitnexus index command runs:

The PR command workflow dispatches gitnexus-index.yml with ref=main
(because that's where the workflow file lives) and an input pr_number.
The resulting run has head_branch='main' but uploads its artifact as
gitnexus-index-pr-<N>, not gitnexus-index-main. listWorkflowRuns
returns that run as the "latest success on main", the download step
tries to fetch gitnexus-index-main from it, and the API returns
"no artifact matches any of the names or patterns provided".

Fix: resolve all indexes (main, dev, and PRs) through the same
listArtifactsForRepo path the PR discovery already uses. Looks up
the freshest non-expired artifact by name directly, so the run's
head_branch and event type don't matter — if the artifact exists,
we find it; if not, we warn and move on.

Side benefit: the resolution logic is now shorter and consistent
across branches and PRs.

* fix: paginate open PRs and parallelize artifact lookups

The resolve step was capped at 100 open PRs by github.rest.pulls.list's
per_page ceiling — LibreChat has 200+ open at any given time, so the
tail of the PR queue was silently skipped. On top of that, the inner
artifact lookup loop was serial, so even after pagination the resolve
step would take 40-60 seconds on a busy repo (one API call per PR).

- Replace the single-page rest.pulls.list call with github.paginate,
  which follows the Link header across pages and returns the full
  open-PR set regardless of count.
- Drop the 100-PR truncation warning that was a known-limitation
  notice for exactly this case.
- Batch the per-PR artifact lookups into groups of 10 via Promise.all.
  200 PRs now take ~10 seconds instead of ~60, and the burst stays
  well within the authenticated rate limit (5000/hr).
- Add a final core.info summary showing how many of the open PRs
  actually had a servable index artifact, so the log is useful for
  debugging why a specific PR isn't showing up on the droplet.
2026-04-11 14:02:00 -04:00
Danny Avila
990763cbee
🧠 feat: Enable GitNexus Embeddings for Dev Branch and PR Indexes (#12620)
* feat: auto-enable embeddings for dev and PR indexes too

Previously only main branch pushes got --embeddings; dev and
contributor PRs ran graph-only and relied on BM25 search. Semantic
search on those indexes silently returned empty, which defeats the
whole point of serving them to MCP clients.

New logic: every automatic trigger (push to main/dev, pull_request
from contributors) enables --embeddings. Only workflow_dispatch
still respects the explicit input toggle, so operators can run a
fast graph-only re-index when they don't need fresh vectors.

Cost: adds ~3-5 minutes per index run. Acceptable tradeoff for
having semantic search work across all served branches + open PRs
instead of just main.

* refine: gate PR embeddings on unit-test path relevance

Previous version auto-enabled --embeddings on every contributor PR,
which cost ~3-5 min of extra CI per index even on PRs that couldn't
benefit from semantic code search (docs, config, workflow files,
i18n strings, etc.).

New logic mirrors the backend-review.yml and frontend-review.yml
path filters — if a PR doesn't touch api/, client/, or packages/
it won't trigger unit tests and it doesn't need embeddings. The
check queries the GitHub API for the PR's changed file list via
`gh api repos/.../pulls/<N>/files` (paginated for very large PRs)
and enables embeddings only when at least one path matches.

main/dev pushes still always embed. workflow_dispatch still respects
the explicit input toggle, which also covers the /gitnexus index
[embeddings] PR command.

The contributor gate at the job level is unchanged — non-contributor
PRs are still skipped entirely regardless of paths.

* feat: /gitnexus command works for non-contributor and fork PRs

The command workflow already gated on the commenter's author
association (not the PR author's), so a contributor commenting
/gitnexus index on an outside contributor's PR passes the auth
check. But the downstream index workflow checked out the PR's
raw head SHA, which only exists in the fork for cross-repo PRs —
actions/checkout fetches from the base repo's origin and fails.

Switch the command workflow to dispatch with refs/pull/<N>/head
instead of the SHA. GitHub mirrors every PR's head into the base
repo as this ref regardless of whether the PR is from a fork, so
the checkout always resolves.

End result: a contributor can type `/gitnexus index embeddings`
on any PR — including one opened by a first-time contributor from
a fork — and the index (with embeddings, if requested) is built
and served. The contributor takes responsibility for the trust
boundary by typing the command.

Updated the relevant header/inline comments in both workflows so
the next maintainer understands the refs/pull/<N>/head choice and
the commenter-based gating.

* refine: /gitnexus index defaults to embeddings on

A contributor typing the command has already chosen to spend ~5
minutes of CI on a full re-index; they wouldn't invoke the command
just to get a BM25-only result. Flip the default so the short form
`/gitnexus index` produces an embeddings-enabled index.

Modifier semantics:
  /gitnexus index               -> embeddings ON (new default)
  /gitnexus index embeddings    -> embeddings ON (explicit, no-op alias)
  /gitnexus index fast          -> embeddings OFF (opt-out)
  /gitnexus index graph-only    -> embeddings OFF (alias)
  /gitnexus index no-embeddings -> embeddings OFF (alias)

The previous `embeddings` modifier is preserved as a no-op alias so
anyone who learned the earlier form still gets what they expected.
2026-04-11 13:35:29 -04:00
Danny Avila
39fb93f6c4
🏗️ chore: Set Up Docker Buildx for GitNexus Image GHA Cache Export (#12619)
The first GHCR build on main failed with:
  ERROR: Cache export is not supported for the docker driver.

The default docker driver on ubuntu-latest runners can't export
cache to type=gha. docker/setup-buildx-action@v3 without a driver
argument defaults to docker-container, which supports both
cache-from and cache-to. Gated on the same condition as the build
step so it only runs when an image rebuild is actually needed.
2026-04-11 13:18:17 -04:00
Danny Avila
8eab39bc8f
🌊 feat: Add GitNexus DigitalOcean Pipeline with PR Index Serving (#12612)
* feat: migrate GitNexus deployment from Fly.io to DigitalOcean droplet

Fly.io's 1GB machine was pegged at ~900MB memory with load spiking to
2.7 under even modest query load. Moving to a 2GB+ DO droplet that can
take advantage of existing credits.

Architecture change: indexes no longer baked into the image. Instead,
a long-lived image (built only when .do/gitnexus/ changes) is pulled
from GHCR, and the deploy workflow rsyncs .gitnexus/ data into
/opt/gitnexus/indexes/<name>/ on the droplet and restarts only the
gitnexus container. Caddy stays running so TLS certs don't churn.

- Add .do/gitnexus/Dockerfile (same native-addon + extension patch
  layers as the Fly variant, but no COPY indexes/ step)
- Add .do/gitnexus/docker-compose.yml with gitnexus + caddy services
  on an internal bridge network, 1.8GB memory limit, healthcheck
- Add .do/gitnexus/Caddyfile with automatic HTTPS for the configured
  subdomain and bearer token auth for all routes except /health
- Add .do/gitnexus/entrypoint.sh that registers every index mounted
  at /indexes/<name>/.gitnexus at container start, then runs
  gitnexus serve bound to 0.0.0.0 (internal docker network only)
- Add .do/gitnexus/install-extensions.js for LadybugDB FTS/vector
  extension pre-install (workaround for upstream bug)
- Add .github/workflows/gitnexus-deploy-do.yml that builds the image
  only on Dockerfile/entrypoint changes, pushes to GHCR, rsyncs the
  index artifacts to the droplet, and restarts the gitnexus container
- Remove .fly/gitnexus/ and .github/workflows/gitnexus-deploy.yml —
  Fly app will be destroyed after DO deploy is verified working

Required new secrets: DO_HOST, DO_USER, DO_SSH_KEY. GITNEXUS_DOMAIN
and API_TOKEN live in /opt/gitnexus/.env on the droplet itself.

* refactor: prefix deploy secrets with GITNEXUS_ for namespace isolation

Rename DO_HOST -> GITNEXUS_DO_HOST, DO_USER -> GITNEXUS_DO_USER, and
DO_SSH_KEY -> GITNEXUS_DO_SSH_KEY so the secrets are clearly scoped
to the gitnexus deploy and don't collide with any other DigitalOcean
secrets LibreChat might add later.

* feat: serve PR indexes alongside main/dev and add /gitnexus command

The index workflow was already building and uploading per-PR indexes
(gitnexus-index-pr-<N>) for contributor PRs, but the deploy workflow
only consumed main and dev artifacts. PR indexes were sitting in
storage doing nothing. This wires them all the way through to the
live MCP server, with proper cleanup when PRs close.

Deploy workflow changes:
- Drop the branches filter on workflow_run so PR index completions
  also trigger deploys (PR indexes are already contributor-gated
  upstream in gitnexus-index.yml via author_association)
- Resolve all open PRs via the GitHub API, look up each one's latest
  non-expired gitnexus-index-pr-<N> artifact, and serve whichever
  ones exist. PRs without an index artifact are skipped — we don't
  retroactively index anything. (See the sketch after this list.)
- Per-ref concurrency group so rapid pushes to the same PR coalesce
  but different refs still deploy in parallel
- After rsyncing active indexes, prune any /opt/gitnexus/indexes/
  folder that isn't in the active set. Safety net for missed PR
  close events.
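
A TypeScript sketch of the per-PR artifact lookup (names assumed, and
the listing is assumed newest-first; the endpoint supports an
exact-name filter):

  import { Octokit } from '@octokit/rest';

  const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  async function latestPrIndexArtifact(owner: string, repo: string, prNumber: number) {
    const { data } = await octokit.rest.actions.listArtifactsForRepo({
      owner,
      repo,
      name: `gitnexus-index-pr-${prNumber}`, // exact-name filter
      per_page: 10,
    });
    return data.artifacts.find((artifact) => !artifact.expired) ?? null;
  }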

New workflow: gitnexus-cleanup-pr.yml
- Fires on pull_request closed (merged or not)
- SSHs to the droplet, removes /opt/gitnexus/indexes/LibreChat-pr-<N>,
  restarts the gitnexus container

New workflow: gitnexus-pr-command.yml
- Listens for issue_comment events where body starts with /gitnexus
- Contributor gated via author_association
- Supports: /gitnexus index            — index with defaults
            /gitnexus index embeddings — index with --embeddings
- Dispatches gitnexus-index.yml with the PR number and head SHA,
  reacts to the comment with a rocket emoji

Index workflow changes:
- New dispatch inputs pr_number and pr_ref for command-driven runs
- Checkout step uses inputs.pr_ref when set so the PR's head commit
  is analyzed instead of the default branch
- Artifact naming falls back through pr_number -> pull_request number
  -> ref_name, keeping existing behavior for push/PR events
- Concurrency group switches to pr-<N> when dispatched by the command
  so re-runs on the same PR debounce correctly

* chore: remove Fly variant reference from Dockerfile header

The Fly variant no longer exists in the repo, so the comparison
comment is meaningless. Rewritten as a standalone description of
the image's design.

* review: resolve 15 findings from review audit

Critical
- Drop the unused caddy binary from the gitnexus image. Caddy runs in
  its own container in this architecture; installing it inside the
  gitnexus image added ~40-60MB for no reason and contradicted the
  Dockerfile header comment.

Major
- Replace ssh-keyscan TOFU with a GITNEXUS_DO_KNOWN_HOST secret.
  Both deploy and cleanup workflows now pin the droplet's host key
  from a stored value instead of silently trusting whatever the host
  presents at deploy time. Fails the workflow if the secret is empty
  so no one accidentally regresses to TOFU.
- Gate gitnexus-cleanup-pr.yml to same-repo PRs via
  github.event.pull_request.head.repo.full_name == github.repository.
  Fork PR closes no longer produce failed runs when secrets are
  withheld by GitHub. The deploy workflow's stale-folder prune step
  remains the safety net for any fork-contributor indexes.
- Fail fast in entrypoint.sh when main/dev index registration errors.
  Previously `|| echo WARN` swallowed failures so a broken index
  passed the docker healthcheck and the deploy was marked green
  while queries returned empty. PR indexes stay best-effort
  (a corrupt PR index shouldn't take the whole server down).
- Authenticate the droplet with GHCR on every deploy using
  GITHUB_TOKEN, so private GHCR packages work without documentation
  detours or manual docker login on the host. Bootstrap comments
  explain the flow.
- Switch docker-compose caddy.depends_on from the short-form
  (service_started) to service_healthy so Caddy doesn't route to a
  gitnexus container that's still starting (500ms-60s window after
  recreation).

Minor
- Guard the HEAD~1 diff with `git rev-parse --verify HEAD~1` so the
  first-commit and workflow_run-from-PR cases default to rebuild
  instead of silently skipping a legitimately-changed image.
- Move `packages: write` off the workflow-level permissions and onto
  the build-image job. deploy no longer inherits unnecessary GHCR
  write access.
- Skip the SSH session in cleanup-pr.yml when no gitnexus-index-pr-<N>
  artifact ever existed for the PR. Eliminates ~95% of no-op SSH
  round-trips on a busy repo (docs-only PRs, paths-ignored PRs, etc).
- Reload Caddy in-place after config upload with `caddy reload`,
  falling back to force-recreate on reload failure and `compose up`
  on first-time bootstrap. Picks up Caddyfile or env changes without
  losing TLS certs.
- Replace `sleep 5` post-deploy with a real readiness poll against
  docker's health status. Fails the workflow if gitnexus doesn't
  report healthy within 120s, so a broken startup surfaces in CI
  instead of being silently marked green.
- Warn when listPulls hits the 100-item per_page ceiling so a future
  growth spurt past 100 open PRs doesn't silently drop indexes.

Nit
- Tighten NODE_OPTIONS --max-old-space-size from 1536 to 1280MB,
  giving KuzuDB's C++ heap ~512MB of room under the 1792MB cgroup
  limit instead of ~256MB.
- Rewrite the stale "headroom for Caddy" comment in entrypoint.sh
  (Caddy lives in a separate container now).
- Restore load-bearing comments in install-extensions.js explaining
  the @ladybugdb/core path layout and the throwaway-db cache-priming
  pattern.
- Parameterize the docker-compose image reference as
  ${GITNEXUS_IMAGE:-ghcr.io/danny-avila/librechat-gitnexus:latest}
  so forks or pinned version tags can override via /opt/gitnexus/.env.

Deferred
- Finding 12 (memory headroom) addressed partially via the heap cap
  reduction; full profiling of KuzuDB C++ allocations under query
  load deferred to post-deploy monitoring.

* review: resolve 8 follow-up findings from second review pass

Security
- F1: pipe GHCR token via SSH stdin instead of expanding it into the
  remote command string. Previously `"echo '$GH_TOKEN' | docker login"`
  expanded the token locally before SSH sent it as an argument, so the
  live token was briefly visible in /proc/<pid>/cmdline on the droplet
  to any process running as deploy or root. New form uses
  `printf '%s' "$GH_TOKEN" | ssh ... "docker login --password-stdin"`
  so the token only travels through the encrypted SSH stdin pipe.

Reliability
- F2: add json-file log rotation (50m x 3 files) via a YAML anchor
  shared by both services. Default Docker logging is unbounded and
  would eventually fill the 60GB droplet disk.
- F4: set memswap_limit=1792m to match mem_limit. Without this, Docker
  lets the container silently spill onto host swap when KuzuDB's C++
  heap overruns the 1792m RAM budget, turning sub-second graph queries
  into multi-second ones with no alert. Hard OOM-kill is preferable —
  unless-stopped restarts the container, the deploy health poll
  catches it, the failure is explicit.
- F5: extend the post-deploy health poll from 24 iterations (120s) to
  36 iterations (180s) so it clears Docker's own unhealthy-detection
  ceiling (start_period 60s + retries 3 * interval 30s = 150s). A
  container that legitimately takes 125s to warm up would previously
  fail the deploy at 120s while Docker would still report it as
  "starting".

Operability
- F3: document the `--no-deps` escape hatch in the compose file header
  so an operator can restart Caddy during a gitnexus outage without
  being trapped by the service_healthy dependency (e.g. emergency
  Caddyfile fix while gitnexus is thrashing).
- F7: rewrite the misleading service_healthy comment. The old text
  said it prevents 502s "after a restart", implying continuous
  protection. Clarified that depends_on only governs initial compose
  up ordering — during force-recreates Caddy briefly routes to a
  starting gitnexus and the deploy's health poll is the actual guard.
- F6: add `shopt -s nullglob` before the prune loop so an empty
  /opt/gitnexus/indexes directory is an explicit no-op instead of
  relying on the quirk that `rm -rf "*"` (with literal "*") silently
  succeeds. Next reader won't have to recognize the bash default.
- F8: soft-fail PR artifact downloads when the artifact disappeared
  between resolve and download. Main/dev artifact failures stay fatal
  because a missing main/dev index is a real deploy failure, but a
  deleted PR artifact no longer aborts the whole deploy.

* review: resolve 3 NITs from third review pass

- F1: rewrite the printf '%s' comment. The previous version claimed
  docker login --password-stdin rejects trailing newlines, which is
  inaccurate — docker login strips whitespace. The real reason for
  printf over echo is byte-exact output and portability, and the
  token-in-process-table security rationale is already documented
  in the preceding sentences.

- F2: when a PR artifact download soft-fails, the PR's name stays
  in active_names so the prune step keeps the droplet's existing
  copy instead of wiping it (stale > empty). Make this transition
  visible by spelling it out in the ⚠️ message.

- F3: fencepost fix in the health poll. The previous loop ran 36
  iterations and claimed "180s" in the comment, but the final
  iteration exits without a trailing sleep, so the real ceiling
  was 35 * 5s = 175s. Extended to 37 iterations (36 sleeps * 5s
  = 180s) so the comment matches reality.
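
The corrected arithmetic, as a TypeScript sketch (isHealthy stands in
for the droplet-side docker health probe):

  const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

  async function pollHealthy(isHealthy: () => Promise<boolean>): Promise<void> {
    // 37 checks, sleeping only between checks: 36 sleeps * 5s = 180s ceiling.
    for (let attempt = 0; attempt < 37; attempt++) {
      if (await isHealthy()) return;
      if (attempt < 36) await sleep(5000); // no trailing sleep after the last check
    }
    throw new Error('gitnexus did not report healthy within 180s');
  }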
2026-04-11 13:04:46 -04:00
Daniel Lew
b1fee80de4
📑 fix: Alias Mimetype text/x-markdown to text/markdown (#12608)
text/x-markdown is a deprecated form of the markdown MIME type, but
we're seeing that some clients still send it. This change aliases
those uploads to text/markdown so the files are accepted.
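
A minimal sketch of the aliasing, assuming a simple lookup table
(names are illustrative, not the actual implementation):

  const MIME_ALIASES: Record<string, string> = {
    'text/x-markdown': 'text/markdown', // deprecated form -> canonical form
  };

  const normalizeMimeType = (mimeType: string): string => MIME_ALIASES[mimeType] ?? mimeType;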
2026-04-11 08:23:04 -04:00
Danny Avila
70c91f8afd
🩹 chore: Correct lbug-adapter Path for GitNexus Vector Extension Patch (#12609)
The published gitnexus npm package compiles pool-adapter.ts to
dist/mcp/core/lbug-adapter.js, not dist/core/lbug/pool-adapter.js
as I guessed in the previous PR. The sed patch failed the Docker
build because the target file didn't exist.

Point the patch at the correct compiled path and add a grep -c
verification after sed to confirm the replacement actually landed.
2026-04-10 13:59:41 -04:00
Danny Avila
711747a5a0
🔎 fix: Install LadybugDB Extensions and Patch Vector Load for GitNexus Search (#12607)
Two upstream GitNexus 1.5.3 bugs combine to break query() in serve mode:

1. pool-adapter.ts calls LOAD EXTENSION fts but never INSTALL fts. The
   CI-produced .gitnexus/ artifact doesn't include the extension cache
   (~/.kuzu/extension/), so LOAD silently fails in the fresh container
   and all BM25/FTS searches return empty.

2. pool-adapter.ts only loads the FTS extension — it never loads the
   vector extension. Every CALL QUERY_VECTOR_INDEX in semanticSearch
   fails, so hybrid search's semantic leg returns empty too.

Combined, query() returns empty even for exact function names because
both the BM25 and semantic legs of RRF merging have zero inputs.
Meanwhile context()/cypher()/route_map() still work because they use
plain Cypher with no extensions.

Workaround:
- Add install-extensions.js that runs INSTALL fts and INSTALL vector
  against a throwaway database during Docker build, populating
  ~/.kuzu/extension/ with the cached extension binaries
- Sed-patch pool-adapter.js at build time to also LOAD EXTENSION vector
  alongside the existing FTS load, wrapped in try/catch
- Update deploy workflow to copy install-extensions.js into the build
  context
2026-04-10 13:41:55 -04:00
Danny Avila
a121ae3dea
🩺 fix: Capture GitNexus Serve Output and Add Startup Health Check (#12599)
* fix: capture gitnexus serve output and add startup health check

gitnexus serve was backgrounded with no log capture, so crash
reasons were invisible in Fly logs. Now pipes output to stdout
and waits up to 30s for the server to be ready before starting
Caddy, with early exit if the process dies.

* fix: install newer libstdc++ for LadybugDB native addon

@ladybugdb/core prebuilt binary requires GLIBCXX_3.4.32 (GCC 13+)
but node:24-slim ships Bookworm's libstdc++6 which only has 3.4.31.
Pull libstdc++6 from Debian Trixie to satisfy the runtime dependency.

* fix: separate build tools from libstdc++ upgrade to avoid conflict

Installing Trixie's libstdc++6 alongside Bookworm's g++ fails because
the Trixie libc6 transitive dep conflicts with Bookworm's libc6-dev.
Split into two RUN steps: compile native addons first, remove g++,
then upgrade libstdc++ from Trixie with no conflicting packages.

* fix: name repo as LibreChat and add dev branch deploy support

- Use REPO_NAME build arg (default: LibreChat) as the WORKDIR so
  gitnexus registers the index with a proper name instead of "repo"
- Deploy workflow now triggers on both main and dev branch index runs
- Dev branch registers as "LibreChat-dev", main as "LibreChat"
- workflow_dispatch gains a branch selector input

* feat: serve both main and dev indexes from one container + auto-embed main

- Dockerfile now copies multiple indexes from indexes/<name>/.gitnexus
  and registers each with gitnexus, so one server handles both branches
- Deploy workflow downloads latest successful main + dev artifacts in
  parallel and bundles them into a single deploy
- list_repos returns LibreChat and LibreChat-dev; queries target either
  via the repo parameter
- Main branch pushes auto-enable --embeddings for semantic search;
  dev and PRs remain graph-only for speed (opt-in via dispatch input)
- Bump index job timeout to 25m to account for embedding generation

* fix: raise Fly machine memory to 1GB + add swap + cap Node heap

The 512MB machine was getting OOM-killed when gitnexus serve's default
--max-old-space-size=8192 over-committed memory during query spikes.

- Bump VM memory 512mb -> 1gb
- Add 512MB swap file to absorb transient spikes
- Cap Node heap at 768MB via NODE_OPTIONS so it stays within the
  machine's real capacity and leaves headroom for Caddy and the OS
2026-04-10 12:35:11 -04:00
Danny Avila
4f133f8955
v0.8.5-rc1 (#12569) 2026-04-09 20:06:31 -04:00
Danny Avila
f128587bb7
📦 chore: bump axios, @librechat/agents (#12598)
* chore: bump @librechat/agents to v3.1.64

* chore: update axios to version 1.15.0 across multiple packages
2026-04-09 19:48:21 -04:00
UnicronBE
46b529a86a
📩 fix: Restore Primary Action Button Visibility in Light Mode (#12591)
The default <Button> variant (bg-primary / text-primary-foreground)
renders invisible in light mode on the Agent Marketplace 'Start Chat'
button and the Grant Access dialog 'Save Changes' button, making these
primary actions undiscoverable without hovering.

Switch these three affirmative-action buttons to variant='submit',
which uses the hardcoded 'bg-surface-submit text-white' combination
already defined in the Button component specifically to avoid the
contrast issues of the default variant (see the comment above the
'submit' variant in packages/client/src/components/Button.tsx).

No visual change in dark mode; in light mode the buttons now render
as the green submit color with white text, matching the semantic
intent of the action.

Co-authored-by: Timothy Look <timothy.look@pmv.eu>
2026-04-09 18:45:54 -04:00
Marco Beretta
1a83f36cda
📌 feat: Add Pin Support for Model Specs (#11219)
* feat: Enhance favorites functionality to support model specs

* refactor(FavoritesList): reorder imports based on length

* feat: improve favorite modelSpec controller; refactor: useIsActiveItem hook

* refactor: consolidate Favorite type, harden controller, add tests

- Add canonical TUserFavorite type in data-provider, replace three
  duplicate definitions (data-service, data-schemas, store/favorites)
- Consolidate FavoritesController spec validation into single block,
  add return on 500 paths, add maxlength to mongoose sub-schema
- Fix import order in FavoritesList.tsx, merge namespace type imports
  in FavoriteItem.tsx and FavoritesList.tsx
- Add focus-visible:ring-inset on pin buttons to prevent ring clipping
- Add explicit return type and JSDoc on useIsActiveItem hook
- Use props.type for narrowing consistency in FavoriteItem getTypeLabel
- Add 22 backend tests for FavoritesController (spec validation,
  typeCount exclusivity, persistence, GET path)
- Add 40 frontend tests: useFavorites spec methods, useIsActiveItem
  observer lifecycle, ModelSpecItem pin button, FavoriteItem all three
  type branches, FavoritesList spec rendering

* fix: address PR review findings for pin model specs

- Harden backend validation to reject partial cross-type fields
  (e.g. spec+endpoint, agentId+model without endpoint; see the
  sketch after this list)
- Add stale-spec auto-cleanup in FavoritesList mirroring agent cleanup
- Add type="button" to pin buttons in ModelSpecItem/EndpointModelItem
- Fix import order violations in EndpointModelItem and ModelSpecItem
- Remove hollow test, dead key prop, inline trivial helpers
- Fix misleading test description, add onSelectSpec to test mock
- Add return to controller success responses for consistency
- Add 6 backend tests for partial cross-type field validation
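
One way to read the cross-type rules is as a discriminated union;
this sketch is an assumed shape, not the actual TUserFavorite
definition:

  // Assumed shape: exactly one variant, never a mix of fields across variants.
  type TUserFavorite =
    | { type: 'spec'; spec: string }
    | { type: 'agent'; agentId: string }
    | { type: 'model'; endpoint: string; model: string };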

* fix: guard stale-spec cleanup against unloaded startupConfig

Prevents race condition where spec favorites are incorrectly deleted
on cold start before startupConfig has loaded. Mirrors the existing
agentsMap === undefined guard pattern used for stale agent cleanup.

Also adds tests for stale-spec cleanup persistence and fixes namespace
import pattern in FavoritesList.spec.tsx.

* fix: replace nested ternaries with if/else in FavoriteItem

Resolves ESLint no-nested-ternary warnings for name and typeLabel
derivations.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-09 18:37:25 -04:00
Danny Avila
daa8f0ea6b
📂 fix: Respect supportedMimeTypes Config in File Picker Accept Filter (#12596)
* fix: respect supportedMimeTypes config in file picker accept filter

The browser file picker's accept attribute was hardcoded by provider
identity, ignoring the endpoint's supportedMimeTypes from fileConfig.
Users who configured permissive MIME types (e.g., '.*') still saw a
restrictive filter in the upload dialog.

Add isPermissiveMimeConfig utility that detects wildcard patterns in
the endpoint's supportedMimeTypes. When permissive, the file picker
accept attribute is set to empty (unrestricted). Non-permissive
configs retain the existing provider-based defaults.

Closes #12589

* fix: address review findings for isPermissiveMimeConfig

- Use non-standard MIME namespace probe (x-librechat/x-probe) so
  category-wildcard patterns like ^application\/.*$ no longer
  false-positive as permissive
- Add single-line JSDoc to isPermissiveMimeConfig
- Use !== undefined instead of != null (fileType is never null)
- Add endpointFileConfig to dropdownItems useMemo deps to prevent
  stale closure when config changes without endpoint change
- Add tests for broad application and multi-category patterns
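
A sketch of the probe idea (signature assumed): no real config
targets this namespace, so only a true wildcard matches it:

  const PROBE_MIME = 'x-librechat/x-probe';

  function isPermissiveMimeConfig(supportedMimeTypes?: Array<string | RegExp>): boolean {
    if (supportedMimeTypes === undefined) return false;
    // A category wildcard like ^application\/.*$ fails this test; only
    // match-anything patterns (e.g. /.*/) accept the probe namespace.
    return supportedMimeTypes.some((pattern) => new RegExp(pattern).test(PROBE_MIME));
  }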

* fix: wrap handleUploadClick in useCallback to satisfy exhaustive-deps

handleUploadClick is captured inside the dropdownItems useMemo but was
not in its dependency array. Wrap it in useCallback with
endpointFileConfig.supportedMimeTypes as the sole dependency, then
reference the stable callback in the useMemo deps.
2026-04-09 17:43:53 -04:00
Danny Avila
81275ff0e0
⏱️ refactor: User Job Tracking TTL and Proactive Cleanup to Redis Job Store (#12595)
* refactor: Add user job tracking TTL to RedisJobStore

- Introduced a new TTL for per-user job tracking sets, set to 24 hours, so orphaned tracking sets expire instead of accumulating.
- Updated RedisJobStoreOptions interface to include userJobsSetTtl for configuration.
- Modified job creation and deletion methods to manage user job sets effectively, ensuring proper expiration and cleanup.
- Enhanced comments for clarity on the new TTL functionality and its implications for user job tracking.

* fix: Address review findings for user job tracking TTL

- Remove redundant `del(userJobsKey)` in `getActiveJobIdsByUser` that
  raced with concurrent `createJob` on other replicas (Redis auto-deletes
  empty Sets after SREM)
- Guard `userJobsSetTtl: 0` from silently destroying tracking sets
  (`EXPIRE key 0` deletes the key on Redis 7.0+)
- Extract `deleteJobInternal` so `cleanup()` reuses the already-fetched
  userId instead of issuing a redundant HGETALL per stale job
- Add integration tests for TTL behavior, proactive SREM, configurable
  userJobsSetTtl, and TTL refresh on repeated createJob
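
The tracking-set lifecycle, sketched with ioredis (key format and
names assumed):

  import Redis from 'ioredis';

  const redis = new Redis();
  const USER_JOBS_SET_TTL = 24 * 60 * 60; // seconds; configurable as userJobsSetTtl

  async function trackJob(userId: string, jobId: string, ttl = USER_JOBS_SET_TTL) {
    const key = `user-jobs:${userId}`;
    await redis.sadd(key, jobId);
    if (ttl > 0) {
      await redis.expire(key, ttl); // guard: EXPIRE with 0 would delete the key outright
    }
  }

  async function untrackJob(userId: string, jobId: string) {
    // Redis drops empty Sets automatically after the last SREM.
    await redis.srem(`user-jobs:${userId}`, jobId);
  }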

* fix: Address follow-up review findings for RedisJobStore

- Use deleteJobInternal in cleanup() terminal-but-in-running-set path
  to ensure userJobsKey SREM is not skipped
- Clear local caches in deleteJob before the fallible getJob call so
  they are cleaned even on transient Redis errors
- Add proactive SREM tests for aborted and error terminal statuses
- Add test for tenant-qualified user tracking key format

* fix: Preserve completedTtl for non-running jobs in cleanup()

The cleanup() terminal-status branch should only remove tracking set
membership, not delete the job hash. deleteJobInternal bypasses the
completedTtl window that updateJob already applied, causing clients
polling for final status to lose the job data early.
2026-04-09 17:42:54 -04:00
Danny Avila
cc8ce15c38
📂 fix: Enable Hidden File Upload for GitNexus Index Artifact (#12597)
upload-artifact@v4 defaults include-hidden-files to false,
which silently skips the .gitnexus/ directory (dotfile).
Scoped to the .gitnexus/ path so only index files are affected.
2026-04-09 17:02:51 -04:00
github-actions[bot]
0d97f7354a
🌍 i18n: Update translation.json with latest translations (#12588)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-09 15:26:37 -04:00
Danny Avila
7ef03391b5
📦 chore: bump nodemailer to v8.0.5 (#12587) 2026-04-09 09:59:57 -04:00
Danny Avila
452af50eff
🧮 fix: Atomize Redis Event Sequence Counters for Multi-Replica Deployments (#12578)
* fix: atomize Redis event sequence counters for multi-replica deployments

Replace in-memory sequenceCounters Map with shared atomic Redis counter
(INCR via Lua script) so all replicas share an authoritative sequence
source. Make syncReorderBuffer async to read current sequence from Redis
instead of defaulting to 0 on non-publishing replicas.

Closes #12575

* fix: harden emitChunk error handling and syncReorderBuffer race safety

Wrap getNextSequence inside emitChunk's try/catch so a transient Redis
failure is logged-and-swallowed, preserving the non-fatal chunk emission
contract. In syncReorderBuffer, replace pending.clear() with selective
discard of stale entries (seq < currentSeq) followed by
flushPendingMessages, preventing loss of chunks that arrived during the
async GET window.

* fix: address review findings — harden error paths, validate types, DRY tests

- Wrap syncReorderBuffer in try/catch in GenerationJobManager.subscribe()
  so a Redis GET failure degrades gracefully instead of crashing SSE reconnect.
- Validate eval return type in getNextSequence; throw on unexpected type
  instead of silently computing NaN/-1 and dropping all messages.
- Add NaN guard to parseInt in syncReorderBuffer for corrupted Redis values.
- Consolidate two streams loops in destroy() into a single pass.
- Extract createMockPublisher to shared helpers/publisher.ts (DRY).
- Add tests for eval failure and unexpected eval return type in emitChunk.
- Document performance tradeoff (2x RTT per emit) and TTL rationale.

* test: add cross-replica integration tests for sequence desync fix (#12575)

Three real-Redis tests that reproduce the exact multi-replica failure:

- Late subscriber on a different replica syncs to the shared counter
  and receives chunks immediately (no 500ms force-flush).
- Multiple subscribe/unsubscribe cycles across replicas maintain
  correct sequence alignment on every reconnect.
- Shared counter key is cleaned up when the stream is destroyed.

* fix: close syncReorderBuffer race, stop destroy() from nuking shared keys

- Replace selective prune with Math.min(currentSeq, minPendingSeq) so
  chunks that arrive in pending during the async GET window are preserved
  instead of incorrectly pruned. Since unsubscribe() already clears
  pending, any entries at sync time are live messages from the current
  subscription and must not be discarded.
- Remove DEL from destroy() — the sequence key is shared across replicas
  and a shutting-down instance must not nuke it for active publishers.
  Keys expire naturally via their 1-hour TTL; cleanup() handles
  single-stream lifecycle teardown.
- Add race-condition test: pauses GET, injects a message into pending
  during the window, resolves GET, asserts the chunk is delivered.
- Add emitDone/emitError eval failure tests to cover the asymmetric
  error contract (chunk swallows, done/error propagates).
- Add explicit MockPublisher return type to helpers/publisher.ts.

* fix: context-aware syncReorderBuffer to prevent duplicate delivery, silence test logs

The Math.min(currentSeq, minPending) fix from the prior commit correctly
handles the cross-replica race (chunk arriving during async GET), but
causes duplicate delivery in same-replica mode: onAbort subscribes to
the Redis channel during createJob, so chunks published before
subscribe() may arrive via pub/sub AND earlyEventBuffer. The Math.min
logic then flushes the pub/sub copy as a "live" message.

Fix: add a clearPending parameter to syncReorderBuffer. The manager
passes true when earlyEventBuffer was replayed (same-replica: pending
entries are duplicates → clear them) and false/undefined when it was
not (cross-replica: pending entries are live → preserve via Math.min).

- Silence winston logger in stream test files via logger.silent = true,
  following the existing pattern in data-schemas/prompt.spec.ts.
- Remove sequence key DEL from destroy() — shared across replicas,
  a shutting-down instance must not nuke the counter for active
  publishers. Keys expire via TTL; cleanup() handles stream teardown.

* fix: share publisher client in cross-replica tests for Redis Cluster compat

The cross-replica tests created publisherB via (ioredisClient as Redis)
.duplicate(), which produces a plain Redis connection to a single node.
In Redis Cluster, GET/EVAL on this client can't follow MOVED redirects,
so syncReorderBuffer reads null → nextSeq=0 → chunks are buffered.

Fix: share ioredisClient as the publisher for both replicas. It's the
correct Cluster-aware client and handles slot routing automatically.
Only subscriber connections need to be separate (pub/sub requirement).

* fix: increase cross-replica test timeouts and use polling for CI cluster

Replace fixed 300ms delivery waits with polling (50ms intervals, 2s max)
to handle Redis Cluster's cross-node pub/sub broadcast latency in CI.
Increase subscription activation wait from 100ms to 500ms to match the
pattern used by existing same-instance tests.

* chore: silence winston logs in remaining stream test files

Add logger.silent = true to GenerationJobManager, RedisJobStore, and
collectedUsage test files, matching the pattern already applied to
RedisEventTransport and reconnect-reorder-desync tests.

* fix: remove flaky cluster pub/sub tests, fix log suppression for resetModules

Remove two cross-replica transport-level integration tests that are
inherently non-deterministic in Redis Cluster: cluster pub/sub fan-out
is async across nodes, so pre-subscribe PUBLISHes can arrive at the
subscriber's node after SUBSCRIBE takes effect, causing random +/- 1
message counts. The core logic is already covered by deterministic unit
tests (mock publisher) and GenerationJobManager integration tests
(end-to-end with earlyEventBuffer). The cleanup test is retained.

Switch log suppression from logger.silent (which doesn't survive
jest.resetModules) to jest.spyOn(console, 'log').mockImplementation()
for files that use resetModules (GenerationJobManager, RedisJobStore,
collectedUsage). The logger.silent approach remains for files that
don't use resetModules (RedisEventTransport, reconnect-reorder-desync).

* fix: replace Lua EVAL+TTL with plain INCR, preserve live chunks during same-replica sync

P1: syncReorderBuffer with clearPending=true was unconditionally clearing
pending, dropping NEW chunks (seq >= currentSeq) from ongoing generation
that arrived via pub/sub during the async GET. Fix: selectively prune
only entries with seq < currentSeq (duplicates of earlyEventBuffer) and
flush remaining live entries.

P2: The 1-hour TTL on sequence keys could expire mid-stream during long
quiet periods (e.g., slow tool calls), restarting the counter from zero
and causing subscribers to silently drop all subsequent messages. Fix:
remove the Lua EVAL+EXPIRE script entirely and use plain Redis INCR —
no TTL. Keys are cleaned up explicitly by cleanup()/resetSequence()
when streams end. Orphaned keys from crashed processes are a few bytes
each, negligible compared to the production risk of TTL expiry.

- Update mock publisher helper: eval → incr
- Remove "unexpected eval type" tests (not applicable to incr)
- Update remaining eval error tests to reference incr
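
The resulting counter path, sketched with ioredis (key format
assumed):

  import Redis from 'ioredis';

  // INCR is atomic, so every replica draws from the same authoritative sequence.
  async function getNextSequence(redis: Redis, streamId: string): Promise<number> {
    const value = await redis.incr(`stream-seq:${streamId}`);
    return value - 1; // INCR returns the post-increment value; sequences start at 0
  }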

* fix: re-arm flush timeout after syncReorderBuffer when gaps remain

syncReorderBuffer clears flushTimeout before processing, but if pending
still has gaps after flushPendingMessages, no timeout is re-armed.
Without new messages arriving to trigger scheduleFlushTimeout, those
buffered entries sit indefinitely — stalling the stream after reconnect.

* chore: address final review — fix stale docs, rename clearPending, tighten tests

Fix all 10 findings from final review pass:

F1: Fix stale TTL comment in destroy() — no TTL exists after EVAL removal
F2: Fix stale EVAL reference in emitChunk JSDoc → INCR + PUBLISH
F3: Fix syncReorderBuffer JSDoc — describes old broken behavior, not
    the current selective prune
F4: Rename clearPending → pruneStaleEntries for clarity at call sites
F5: Add publish-not-called assertion to INCR failure test
F6: Replace Math.min(...spread) with explicit loop to avoid heap alloc
F7: Remove resetSequence from IEventTransport interface (no external
    callers; implementation detail only)
F8: Consolidate cleanup/resetSequence overlap — cleanup() owns the DEL,
    resetReorderBuffer() (now private) handles buffer state only
F9: Fix import order in reconnect test (value before type imports)
F10: Export MockPublisher interface from helpers/publisher.ts

* chore: fix stale resetSequence references and hadBufferReplay JSDoc

Two comments referenced resetSequence() which was removed in the F7/F8
refactor (now private resetReorderBuffer(), which doesn't DEL the key).
Only cleanup() deletes the Redis key. Also update hadBufferReplay JSDoc
to say "prune stale entries" instead of the pre-refactor "clear pending".

* chore: grammar

* fix: use earlyReplayCount as prune cutoff instead of Redis counter

The boolean pruneStaleEntries flag used currentSeq (from Redis GET) as
the prune threshold, but INCR can advance the counter past a live
chunk's seq during the GET window. Example: earlyEventBuffer held seqs
0-4, generation emits seq 5 during GET, INCR advances counter to 6,
GET returns 6, prune condition 5 < 6 deletes the live chunk.

Fix: pass the earlyEventBuffer replay count (5) as the prune cutoff.
Entries with seq < earlyReplayCount are true duplicates; entries at or
above are live regardless of what the Redis counter reads. After
pruning, the unified Math.min(currentSeq, minPending) logic handles
both same-replica and cross-replica paths correctly.
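
A sketch of that sync step (names assumed):

  function syncFrontier(
    pending: Map<number, unknown>,
    currentSeq: number, // from the async Redis GET
    earlyReplayCount: number, // events already delivered via buffer replay
  ): number {
    for (const seq of [...pending.keys()]) {
      if (seq < earlyReplayCount) pending.delete(seq); // true duplicate of the replay
    }
    let minPending = Infinity;
    for (const seq of pending.keys()) {
      if (seq < minPending) minPending = seq;
    }
    return Math.min(currentSeq, minPending); // next expected sequence
  }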

Add a targeted test that exercises this exact race: pauses GET, emits
seq 5 during the window, resolves GET with counter=6, asserts seq 5
is preserved (would have been dropped with the old boolean approach).

* fix: pass earlyReplayCount for skipBufferReplay path to prune pub/sub duplicates

When skipBufferReplay is true (resume scenario), earlyEventBuffer events
are delivered via the resume sync payload, not replayed directly. But
earlyReplayCount was left at 0, so syncReorderBuffer treated all pending
entries as live — meaning delayed pub/sub copies of those buffered
events could be delivered again, duplicating content.

Fix: capture earlyEventBuffer.length before the skip/replay branch so
the count is always passed to syncReorderBuffer regardless of delivery
method. Seqs 0..earlyReplayCount-1 are pruned as duplicates whether
they were replayed or delivered via sync payload.

* fix: keep nextSeq monotonic in syncReorderBuffer after async GET

handleOrderedChunk can deliver in-order messages and advance nextSeq
during the async GET window. If those messages leave pending empty,
the unconditional nextSeq = currentSeq could regress nextSeq below
its already-advanced value, reopening a delivered gap and causing
subsequent messages to be buffered until force-flush.

Fix: wrap both nextSeq assignments with Math.max(nextSeq, ...) so
syncReorderBuffer never moves the delivery frontier backward.
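
In sketch form (variables named as above):

  function advanceFrontier(nextSeq: number, currentSeq: number, minPending: number): number {
    return Math.max(nextSeq, Math.min(currentSeq, minPending)); // frontier only moves forward
  }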

* fix: cap nextSeq at earlyReplayCount, add 24h safety TTL, add post-GET race test

Finding 1 (CRITICAL): When earlyReplayCount > 0 and pending is empty,
nextSeq was set to currentSeq — but INCR can advance the counter past
a live chunk whose pub/sub hasn't arrived yet. Cap at earlyReplayCount
instead (what was actually delivered), so in-flight chunks are not
skipped. Adds test for the message-arrives-AFTER-GET-resolves scenario.

Finding 3: Add a 24-hour safety-net TTL set once on first INCR only
(val === 1), never refreshed. This caps orphan lifetime from crashed
processes without risking mid-stream counter resets.
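
Sketched with ioredis (the real change logs EXPIRE failures like
other fire-and-forget errors):

  import Redis from 'ioredis';

  async function nextSeqWithSafetyNet(redis: Redis, key: string): Promise<number> {
    const value = await redis.incr(key);
    if (value === 1) {
      // Armed only by the INCR that creates the key, so never refreshed mid-stream.
      redis.expire(key, 24 * 60 * 60).catch(() => {}); // real code logs this failure
    }
    return value - 1;
  }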

Finding 6: Replace setTimeout(100) in cleanup test with polling loop.

Finding 9: Fix variadic DEL mock + add expire mock for the new TTL.

Revert P2 fix (earlyReplayCount for skipBufferReplay path) — when
skipBufferReplay is true, the resume sync payload delivers everything
up to currentSeq, so syncReorderBuffer should trust the Redis counter
as the frontier, not the buffer length.

* chore: log expire failures consistently with other fire-and-forget errors
2026-04-09 09:57:54 -04:00
Danny Avila
632ffbcb87
🧬 fix: Merge Custom Endpoints by Name Instead of Replacing Entire Array (#12586)
* fix: Merge Custom Endpoints by Name Instead of Replacing Entire Array

The DB base config's `endpoints.custom` array was wholesale-replacing
the YAML-derived array, causing YAML endpoint additions to be silently
lost after the first admin panel save. Add path-aware array merging
to `deepMerge` so keyed arrays (matched by `name`) are merged item-by-item
instead of replaced.
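
A minimal sketch of the keyed merge, assuming shallow merging of
matched items (the real deepMerge recurses and applies UNSAFE_KEYS
sanitization):

  function mergeArrayByKey<T extends { name?: string }>(target: T[], source: T[]): T[] {
    const overrides = new Map<string, T>();
    for (const item of source) {
      if (item.name) overrides.set(item.name, item); // duplicate names: last write wins
    }
    const merged = target.map((item) =>
      item.name && overrides.has(item.name)
        ? { ...item, ...overrides.get(item.name)! } // merge matched items in place
        : { ...item }, // unmatched (including keyless) target items are preserved
    );
    const targetNames = new Set(target.map((item) => item.name));
    for (const [name, item] of overrides) {
      if (!targetNames.has(name)) merged.push({ ...item }); // append source-only items
    }
    return merged; // keyless source items never enter `overrides`, so they are dropped
  }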

* fix: Harden mergeArrayByKey — deduplicate, sanitize, and prevent mutation

- Remove redundant sourceOrder array; iterate Map.keys() instead to
  prevent duplicate entries when source contains repeated names.
- Sanitize override-only appended items through deepMerge to enforce
  UNSAFE_KEYS prototype-pollution protection on all code paths.
- Shallow-copy unmatched base items to prevent mutation leak-back.
- Add post-OVERRIDE_KEY_MAP remapping note to ARRAY_MERGE_KEYS JSDoc.
- Add tests: duplicate source names, base mutation safety, multi-priority
  sequential merging of the same custom endpoint.

* fix: Add inline comments and test for keyless source items

- Document keyless item drop behavior in mergeArrayByKey with inline
  comment and matching test case.
- Add last-write-wins comment to deduplication test assertion.
- Clarify path semantics in mergeArrayByKey target-iteration comment.

* fix: Relocate keyless-item comment and test target-side preserve

- Move keyless-item comment to the if-guard where the skip happens and
  clarify that target-side keyless items are preserved, not dropped.
- Add test verifying base items without a name field are kept in output.
2026-04-09 09:38:26 -04:00
github-actions[bot]
72cdeb0437
🌍 i18n: Update translation.json with latest translations (#12583)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-09 07:45:13 -04:00