LibreChat/api
Danny Avila d2cbd551b7
🤝 fix: Load Handoff Agents for Agents API (#12740)
* 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (#12726)

Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents
into a shared `discoverConnectedAgents` helper in `@librechat/api` and
wires it into the OpenAI-compatible `/v1/chat/completions` and Open
Responses `/v1/responses` controllers. These endpoints previously only
passed the primary agent config to `createRun` while keeping
`primaryConfig.edges` intact, which forced `MultiAgentGraph` into
multi-agent mode without loading the referenced sub-agents and caused
StateGraph to throw "Found edge ending at unknown node <id>".

The discovery helper also filters orphaned edges (deleted sub-agents or
those the caller lacks VIEW permission on), so API users see the same
graceful fallback the chat UI already had.

* 🧪 fix: use ServerRequest in discovery spec helpers

CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test
helpers typed `req` as `express.Request`, which is not assignable to
`DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose
`user` is `IUser`). Local jest passed because ts-jest is transpile-only,
but the CI typecheck uses the full compiler.

* 🪲 fix: drop orphan edges on both endpoints, not just `to`

Addresses the P1 codex finding on #12740: `filterOrphanedEdges`
previously only removed edges whose `to` referenced a skipped agent.
Edges whose `from` was a skipped agent — the symmetric case in a
bidirectional graph like `A <-> B` where `B` is deleted or the user
lacks VIEW on it — leaked through to `createRun` and re-triggered
`Found edge ending at unknown node <id>` at StateGraph compile time.

The filter now drops an edge if either endpoint references a skipped
id, and the existing `to`-only test cases were updated to reflect the
stricter behavior. Adds a bidirectional-graph regression test in
`discovery.spec.ts`.

* 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes

Addresses the second P1 codex finding on #12740: the OpenAI-compat
`/v1/chat/completions` and Open Responses `/v1/responses` routes gate
the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`),
but `discoverConnectedAgents` was checking handoff sub-agents against
the looser in-app `AGENT` resource type. That allowed a remote caller
who could reach the orchestrator but had only in-app visibility on a
sub-agent to invoke it via the API — bypassing the remote-sharing
boundary.

Adds an optional `resourceType` param to `discoverConnectedAgents`
(defaulting to `AGENT` for the chat UI path) and passes
`ResourceType.REMOTE_AGENT` from both API controllers so every
discovered sub-agent clears the same sharing boundary enforced at
route entry.

* 🧯 fix: enforce allowedProviders for discovered sub-agents

Addresses the third P1 codex finding on #12740: `discoverConnectedAgents`
forwarded the caller's `endpointOption` verbatim into `initializeAgent`,
but on the OpenAI-compat routes that option's `endpoint` is the primary
agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only
enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)`
is true, so handoff sub-agents silently bypassed the provider allowlist
configured under `endpoints.agents.allowedProviders`.

Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every
per-sub-agent init call. The primary agent still uses the caller's
endpointOption as before — this only affects the BFS-loaded handoff
targets. Regression test asserts the override.

* ✂️ fix: prune unreachable sub-agents after orphan-edge filtering

Addresses the fourth P1 codex finding on #12740: BFS eagerly initializes
every sub-agent referenced in the primary's edge scan, but once
`filterOrphanedEdges` drops edges whose endpoints were skipped, some of
those sub-agents end up disconnected from the primary. In an `A -> B ->
C` graph (edges stored directly on A) where B is skipped (missing or
no VIEW), both edges are filtered, but C was already loaded and would
still be passed to `createRun` — which flips into multi-agent mode on
`agents.length > 1` and turns C into an unintended parallel start node.

After filtering edges, compute the set of agent ids reachable from the
primary through the surviving edge set and prune `agentConfigs` to that
set. Two regression tests added: one for the pruning case, one that
confirms agents connected via surviving edges are still kept.

* 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback

Addresses the fifth P1 codex finding on #12740: `onAgentInitialized`
fires during BFS, BEFORE the helper prunes agents that become
disconnected once `filterOrphanedEdges` runs. Writing the sub-agent
straight into the outer `agentConfigs` there and then only additively
merging the pruned `discoveredConfigs` left stranded entries in the
outer map, and `AgentClient` would still hand them to `createRun` as
extra parallel start nodes (the exact failure mode the pass-4 prune
was meant to eliminate for the API controllers).

Drop the `agentConfigs.set` from the callback and replace the additive
merge with a direct copy from `discoveredConfigs`, which is now the
single authoritative source of what the run should see. The
per-agent tool context map is still populated during BFS — stale
entries there are harmless because they're only read by closure inside
`ON_TOOL_EXECUTE` and are unreachable once the agent is not in
`agentConfigs`.

* 🔬 fix: address audit findings on discovery helper

Resolves findings from a comprehensive external audit of #12740.

**Finding 1 (CRITICAL) — stale edges survive the reachability prune.**
The pass-4 prune removed unreachable agents from `agentConfigs` but left
matching edges in the return value. In an `A -> B -> C -> D` graph (all
edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B
and B->C but keeps C->D (neither endpoint is skipped). The caller then
sees `agentConfigs` without C/D but `edges` still references them,
flipping `createRun` into multi-agent mode with mismatched agents/edges
— the exact crash this PR is supposed to fix. Now filter the edge list
to the reachable set in the same pass, so the returned shape is
self-consistent: every edge endpoint is either the primary id or a key
of `agentConfigs`. New regression test covers A->B->C->D with B skipped.

**Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API
request.** The OpenAI-compat and Responses controllers called
`getModelsConfig(req)` and `discoverConnectedAgents` even when the
primary agent had no edges (the common single-agent API case). Gate
both behind `primaryConfig.edges?.length > 0` so single-agent runs
don't pay that cost.

**Finding 5 (MINOR) — silent mutation of caller's
`primaryConfig.userMCPAuthMap`.** The helper aliased that object and
then `Object.assign`'d sub-agent entries into it, changing the caller's
config in-place. Shallow-clone up front so the returned merged map is
the only destination.

**Finding 7 (NIT) — dead `?? []` coalescing.**
`filterOrphanedEdges` always returns a concrete array, so the
`discoveredEdges ?? []` fallback was never reached. Simplified the
`primaryConfig.edges = …` assignment.

Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not
mutated in-place.

* 🧹 chore: address audit NITs on discovery helper

Addresses two NIT findings from the post-fix audit:

**F1** — the shallow clone on `primaryConfig.userMCPAuthMap` was only
applied on the primary side; the `else` branch (hit when the primary
had no MCP auth and the first sub-agent seeds the map) assigned the
sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's
`Object.assign` mutated the first one's map in place. Harmless in
practice (per-request ephemeral objects) but asymmetric. Clone in the
else branch too. Test added.

**F2** — `initialize.js` had a defensive `if (agentConfigs.size > 0 &&
!edges) edges = []` normalizer. Pre-existing dead code: the helper now
always returns a concrete array from `filteredEdges.filter(...)`.
Removed for clarity.

* 🕸 fix: require all sources reachable when traversing fan-in edges

Addresses the seventh P1 codex finding on #12740: the reachability BFS
advanced through an edge as soon as any of its `from` endpoints matched
the current frontier node (`sources.includes(current)`), but the
subsequent edge filter required ALL sources to be reachable (`every`).
The two-semantics mismatch let a fan-in edge like `{from: ['A','B'],
to: 'C'}` mark C reachable purely via A even when B had no path from
the primary, then drop the edge itself at filter time. Result: C
survived in `agentConfigs` with no surviving edge connecting it to A,
so `createRun` flipped into multi-agent mode on `agents.length > 1`
and C ran as an unintended parallel root.

Replace the BFS with a fixed-point iteration keyed on the same
all-sources-reachable predicate used by the filter, so traversal and
filtering stay aligned and multi-source edges only fire once every
source is in the reachable set.

Two regression tests added:
- `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts
  neither B nor C leak into the result.
- `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires
  and D becomes reachable once both B and C are.

* 🔀 fix: match SDK OR semantics for multi-source edge reachability

Reverts the all-sources-required reachability gate from 4982f1c3b and
replaces it with an any-source-reachable model, which matches how
`@librechat/agents`'s `MultiAgentGraph.createWorkflow` actually wires
multi-source edges at runtime (per-source `builder.addEdge(source,
destination)`). With the previous `every` gate, a legitimate handoff
edge `{ from: ['A', 'B'], to: 'C' }` where B had no incoming path was
pruned along with C, regressing OR-semantics routing that the SDK
would otherwise handle correctly.

New behavior:

1. Reachability: an edge advances when ANY of its `from` endpoints is
   already reachable. Fixed-point iteration over `filteredEdges`.
2. Edge filter: keep an edge when it has at least one reachable source
   AND all destinations are reachable (a missing destination would
   still crash `StateGraph.compile` with `Found edge ending at unknown
   node`).
3. Agent prune: keep agents that are reachable OR referenced on any
   endpoint of a surviving edge. The second clause preserves co-sources
   in multi-source edges (B in `{ from: ['A','B'], to: 'C' }` when
   nothing else reaches B) so the SDK's per-source `addEdge` — and the
   `validateEdgeAgents` safety-net I added to the SDK in #111 — still
   finds B as a node.

The pass-audit A->B->C->D regression test continues to pass: with B
skipped, `filterOrphanedEdges` drops both B-adjacent edges, reachability
never expands past A, C->D has no reachable source so it gets filtered,
and C/D are pruned because they're neither reachable nor referenced.

* ✂️ fix: strip skipped co-members from multi-source/multi-dest edges

Addresses codex pass-9 P2 on #12740. `filterOrphanedEdges` previously
dropped an edge whenever any `from` id was skipped, which was correct
for scalar edges but over-aggressive for multi-source ones: the agents
SDK adds one `builder.addEdge(source, destination)` per source, so
`{ from: ['A','B'], to: 'C' }` with B skipped still has a valid
`A -> C` route that was being thrown away.

Now sanitize each endpoint:
- Scalar skipped → drop the whole edge (no route survives).
- Array with some skipped → strip the skipped ids, keep the edge with
  the surviving members. If the array empties out, drop the edge.

Symmetric handling for `to` covers multi-destination fan-out when one
co-destination is skipped. Tests updated/added:
- `strips skipped co-sources from multi-source edges…`
- `strips skipped co-destinations from multi-destination edges`
- `drops multi-member edges only when every member on a side is skipped`
- Discovery-side: `preserves valid routes when one co-source of a
  multi-source edge is skipped` asserts the end-to-end behavior —
  skipped co-source B gets stripped from the edge, A->C routing
  survives, and C remains in `agentConfigs`.

* 🔓 fix: respect SHARE-on-AGENT fallback for handoff ACL on API routes

Addresses codex pass-10 P1 on #12740. The API controllers were handing
`discoverConnectedAgents` a raw `PermissionService.checkPermission` call
against `ResourceType.REMOTE_AGENT`, but the route-level middleware
(`createCheckRemoteAgentAccess`) authorizes the primary agent via
`getRemoteAgentPermissions`, which first consults the AGENT ACL and
treats owners with the SHARE bit as remotely authorized even without
an explicit REMOTE_AGENT grant. The mismatch meant a user could open
the primary via `/v1/chat/completions` or `/v1/responses`, but their
own owned handoff sub-agents were silently skipped — breaking
multi-agent handoffs for the common "owner runs their own multi-agent
orchestrator" case.

Both controllers now pass `discoverConnectedAgents` a `checkPermission`
wrapper that delegates to `getRemoteAgentPermissions` (with
`getEffectivePermissions` injected from `PermissionService`) and
compares the returned bitmask against the required permission via
`hasPermissions`. Sub-agents are now authorized by the exact same
rules the route middleware applies to the primary.

* 🌱 fix: preserve user-defined parallel-start branches

Addresses codex pass-11 P2 on #12740. The post-filter reachability
prune seeded only from `primaryConfig.id`, which killed
`MultiAgentGraph`'s legitimate multi-start pattern — a user-defined
edge like `X -> Y` where X has no incoming path (X is an intentional
parallel starting node, run alongside the primary) was being dropped
because neither X nor Y was reachable from the primary.

Reconcile the tension with pass-4 ("prune accidental orphans when an
intermediate is skipped") by using pre-filter reachability as the
signal:

- An agent that WAS reachable from the primary via the original
  (pre-filter) edges but loses that path when `filterOrphanedEdges`
  runs is an accidental orphan (a skipped hop broke the chain) — prune.
- An agent that was NEVER reachable from the primary, even pre-filter,
  is an intentional parallel start — seed it into post-filter
  reachability so its component survives.

Surviving-edge endpoint references still keep an agent (co-sources in
multi-source edges). New test `preserves user-defined parallel-start
branches disconnected from the primary` covers the pass-11 scenario;
the existing `A->B->C->D, B skipped` regression test continues to
pass because C/D were pre-filter reachable through B and lose that
reachability after filtering.

* 🎯 fix: tighten parallel-start seed criterion to 'no pre-filter incoming edge'

Addresses codex pass-12 P1 on #12740. The pass-11 seed heuristic — 'agent
is in `agentConfigs` but was not pre-filter reachable from the primary' —
was too permissive. A downstream agent like Y in `X -> Y` where X gets
skipped (missing / no VIEW) was never pre-filter reachable from the
primary either, so the old rule promoted Y to a parallel start node and
discovery returned `agents: [primary, Y]` with no connecting edge. The
SDK then ran Y as an unintended parallel root — exactly the orphan
behavior pass-4 wanted to prevent.

Tighter criterion: seed a post-filter reachability root only when the
agent had NO incoming edge in the pre-filter graph. That matches
`MultiAgentGraph.analyzeGraph`'s "no-incoming-edge" definition of a
start node applied to the user's original declared topology, so:

- `A -> B` plus a user-defined `X -> Y` parallel branch: X has no
  incoming pre-filter → seeded → X and Y both survive.
- `A -> B` plus `X -> Y` with X skipped: Y had an incoming pre-filter
  (`X -> Y`) → NOT seeded → Y is pruned as the orphan it is.
- `A -> B -> C` with B skipped: C had an incoming pre-filter (`B -> C`)
  → NOT seeded → C is pruned.

New test `does not promote a downstream orphan to a parallel start when
its only upstream is skipped` locks in the pass-12 scenario. The pass-11
`preserves user-defined parallel-start branches` test continues to hold.

* 📁 fix: don't enforce AGENT-only file ACL on REMOTE_AGENT API callers

Addresses codex pass-13 P1 on #12740. When I refactored the API
controllers' DB-method bundle, I inadvertently started forwarding
`filterFilesByAgentAccess` into `initializeAgent`. That helper calls
`checkPermission` with `resourceType: ResourceType.AGENT`, but these
routes authorize callers through `REMOTE_AGENT` (via
`getRemoteAgentPermissions`). A user granted `REMOTE_AGENT_VIEWER` on
a shared agent but lacking direct `AGENT_VIEW` could invoke the agent
yet all its owner-attached context files would get silently filtered
out — breaking `file_search`/context retrieval for remote consumers.

Drop `filterFilesByAgentAccess` from the OpenAI-compat and Responses
controllers' `dbMethods` (and remove the now-unused import). The chat
UI's `initialize.js` keeps it since that path legitimately authorizes
at the AGENT level. No functional change inside the helper — passing
`undefined` simply tells `primeResources` to skip the per-file ACL
filter, restoring the pre-refactor API behavior.

* 🪓 fix: strip unreachable co-sources from surviving multi-source edges

Addresses codex pass-14 P1 on #12740. The earlier pass-8 fix kept any
agent referenced as an endpoint of a surviving edge (via a
`referencedByEdge` fallback) to avoid the SDK's `validateEdgeAgents`
failing on missing nodes. But that fallback propped up unreachable
co-sources too: with `[A -> C, X -> B, [B,C] -> D]` and X skipped,
`X -> B` gets filtered, the `[B,C] -> D` fan-in survives because C is
reachable, and B stays in `agentConfigs` solely because the fan-in
still lists it. `MultiAgentGraph.analyzeGraph` then sees B with no
incoming edge and runs it as an unintended parallel root.

Sanitize surviving edges instead: for a kept edge whose `from` is an
array, filter out any co-source that isn't reachable. The SDK's
per-source `addEdge` fires independently, so dropping an unreachable
co-source doesn't invalidate the remaining routes — in the scenario
above `[B,C] -> D` becomes `[C] -> D`, every endpoint of every
surviving edge is now reachable, and the agent prune collapses to a
strict `reachable.has(agentId)` check. No more referenced-by-edge
fallback.

Regression test added: `strips unreachable co-sources from surviving
multi-source edges (no stray parallel root)` — asserts B is absent
from every surviving edge endpoint and the fan-in's `from` is just
`['C']`. All 22 prior discovery tests still pass unchanged.
2026-04-20 02:20:43 -04:00
..
app 📎 fix: Route Unrecognized File Types via supportedMimeTypes Config (#12508) 2026-04-01 23:04:43 -04:00
cache 🚦 fix: ERR_ERL_INVALID_IP_ADDRESS and IPv6 Key Collisions in IP Rate Limiters (#12319) 2026-03-19 21:48:03 -04:00
config 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737) 2026-04-19 21:49:41 -07:00
db 🐛 fix: Resolve MeiliSearch Startup Sync Failure from Model Loading Order (#12397) 2026-03-25 14:02:44 -04:00
models 🗑️ chore: Remove Action Test Suite and Update Mock Implementations (#12268) 2026-03-21 14:28:55 -04:00
server 🤝 fix: Load Handoff Agents for Agents API (#12740) 2026-04-20 02:20:43 -04:00
strategies 🔐 feat: Admin Auth Support for SAML and Social OAuth Providers (#12472) 2026-03-30 22:49:44 -04:00
test 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737) 2026-04-19 21:49:41 -07:00
utils 🦉 feat: Claude Opus 4.7 Model Support (#12698) 2026-04-16 14:51:00 -04:00
jest.config.js 📏 refactor: Add File Size Limits to Conversation Imports (#12221) 2026-03-14 03:06:29 -04:00
jsconfig.json
package.json 📦 chore: npm audit & bump @librechat/agents to v3.1.67 (#12710) 2026-04-16 22:44:00 -04:00
typedefs.js 🪦 refactor: Remove Legacy Code (#10533) 2025-12-11 16:36:12 -05:00