LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-05-13 16:07:30 +00:00

History

Danny Avila d2cbd551b7 🤝 fix: Load Handoff Agents for Agents API (#12740 ) * 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (#12726) Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents into a shared `discoverConnectedAgents` helper in `@librechat/api` and wires it into the OpenAI-compatible `/v1/chat/completions` and Open Responses `/v1/responses` controllers. These endpoints previously only passed the primary agent config to `createRun` while keeping `primaryConfig.edges` intact, which forced `MultiAgentGraph` into multi-agent mode without loading the referenced sub-agents and caused StateGraph to throw "Found edge ending at unknown node <id>". The discovery helper also filters orphaned edges (deleted sub-agents or those the caller lacks VIEW permission on), so API users see the same graceful fallback the chat UI already had. * 🧪 fix: use ServerRequest in discovery spec helpers CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test helpers typed `req` as `express.Request`, which is not assignable to `DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose `user` is `IUser`). Local jest passed because ts-jest is transpile-only, but the CI typecheck uses the full compiler. * 🪲 fix: drop orphan edges on both endpoints, not just `to` Addresses the P1 codex finding on #12740: `filterOrphanedEdges` previously only removed edges whose `to` referenced a skipped agent. Edges whose `from` was a skipped agent — the symmetric case in a bidirectional graph like `A <-> B` where `B` is deleted or the user lacks VIEW on it — leaked through to `createRun` and re-triggered `Found edge ending at unknown node <id>` at StateGraph compile time. The filter now drops an edge if either endpoint references a skipped id, and the existing `to`-only test cases were updated to reflect the stricter behavior. Adds a bidirectional-graph regression test in `discovery.spec.ts`. * 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes Addresses the second P1 codex finding on #12740: the OpenAI-compat `/v1/chat/completions` and Open Responses `/v1/responses` routes gate the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`), but `discoverConnectedAgents` was checking handoff sub-agents against the looser in-app `AGENT` resource type. That allowed a remote caller who could reach the orchestrator but had only in-app visibility on a sub-agent to invoke it via the API — bypassing the remote-sharing boundary. Adds an optional `resourceType` param to `discoverConnectedAgents` (defaulting to `AGENT` for the chat UI path) and passes `ResourceType.REMOTE_AGENT` from both API controllers so every discovered sub-agent clears the same sharing boundary enforced at route entry. * 🧯 fix: enforce allowedProviders for discovered sub-agents Addresses the third P1 codex finding on #12740: `discoverConnectedAgents` forwarded the caller's `endpointOption` verbatim into `initializeAgent`, but on the OpenAI-compat routes that option's `endpoint` is the primary agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)` is true, so handoff sub-agents silently bypassed the provider allowlist configured under `endpoints.agents.allowedProviders`. Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every per-sub-agent init call. The primary agent still uses the caller's endpointOption as before — this only affects the BFS-loaded handoff targets. Regression test asserts the override. * ✂️ fix: prune unreachable sub-agents after orphan-edge filtering Addresses the fourth P1 codex finding on #12740: BFS eagerly initializes every sub-agent referenced in the primary's edge scan, but once `filterOrphanedEdges` drops edges whose endpoints were skipped, some of those sub-agents end up disconnected from the primary. In an `A -> B -> C` graph (edges stored directly on A) where B is skipped (missing or no VIEW), both edges are filtered, but C was already loaded and would still be passed to `createRun` — which flips into multi-agent mode on `agents.length > 1` and turns C into an unintended parallel start node. After filtering edges, compute the set of agent ids reachable from the primary through the surviving edge set and prune `agentConfigs` to that set. Two regression tests added: one for the pruning case, one that confirms agents connected via surviving edges are still kept. * 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback Addresses the fifth P1 codex finding on #12740: `onAgentInitialized` fires during BFS, BEFORE the helper prunes agents that become disconnected once `filterOrphanedEdges` runs. Writing the sub-agent straight into the outer `agentConfigs` there and then only additively merging the pruned `discoveredConfigs` left stranded entries in the outer map, and `AgentClient` would still hand them to `createRun` as extra parallel start nodes (the exact failure mode the pass-4 prune was meant to eliminate for the API controllers). Drop the `agentConfigs.set` from the callback and replace the additive merge with a direct copy from `discoveredConfigs`, which is now the single authoritative source of what the run should see. The per-agent tool context map is still populated during BFS — stale entries there are harmless because they're only read by closure inside `ON_TOOL_EXECUTE` and are unreachable once the agent is not in `agentConfigs`. * 🔬 fix: address audit findings on discovery helper Resolves findings from a comprehensive external audit of #12740. Finding 1 (CRITICAL) — stale edges survive the reachability prune. The pass-4 prune removed unreachable agents from `agentConfigs` but left matching edges in the return value. In an `A -> B -> C -> D` graph (all edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B and B->C but keeps C->D (neither endpoint is skipped). The caller then sees `agentConfigs` without C/D but `edges` still references them, flipping `createRun` into multi-agent mode with mismatched agents/edges — the exact crash this PR is supposed to fix. Now filter the edge list to the reachable set in the same pass, so the returned shape is self-consistent: every edge endpoint is either the primary id or a key of `agentConfigs`. New regression test covers A->B->C->D with B skipped. Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API request. The OpenAI-compat and Responses controllers called `getModelsConfig(req)` and `discoverConnectedAgents` even when the primary agent had no edges (the common single-agent API case). Gate both behind `primaryConfig.edges?.length > 0` so single-agent runs don't pay that cost. Finding 5 (MINOR) — silent mutation of caller's `primaryConfig.userMCPAuthMap`. The helper aliased that object and then `Object.assign`'d sub-agent entries into it, changing the caller's config in-place. Shallow-clone up front so the returned merged map is the only destination. Finding 7 (NIT) — dead `?? []` coalescing. `filterOrphanedEdges` always returns a concrete array, so the `discoveredEdges ?? []` fallback was never reached. Simplified the `primaryConfig.edges = …` assignment. Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not mutated in-place. * 🧹 chore: address audit NITs on discovery helper Addresses two NIT findings from the post-fix audit: F1 — the shallow clone on `primaryConfig.userMCPAuthMap` was only applied on the primary side; the `else` branch (hit when the primary had no MCP auth and the first sub-agent seeds the map) assigned the sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's `Object.assign` mutated the first one's map in place. Harmless in practice (per-request ephemeral objects) but asymmetric. Clone in the else branch too. Test added. F2 — `initialize.js` had a defensive `if (agentConfigs.size > 0 && !edges) edges = []` normalizer. Pre-existing dead code: the helper now always returns a concrete array from `filteredEdges.filter(...)`. Removed for clarity. * 🕸 fix: require all sources reachable when traversing fan-in edges Addresses the seventh P1 codex finding on #12740: the reachability BFS advanced through an edge as soon as any of its `from` endpoints matched the current frontier node (`sources.includes(current)`), but the subsequent edge filter required ALL sources to be reachable (`every`). The two-semantics mismatch let a fan-in edge like `{from: ['A','B'], to: 'C'}` mark C reachable purely via A even when B had no path from the primary, then drop the edge itself at filter time. Result: C survived in `agentConfigs` with no surviving edge connecting it to A, so `createRun` flipped into multi-agent mode on `agents.length > 1` and C ran as an unintended parallel root. Replace the BFS with a fixed-point iteration keyed on the same all-sources-reachable predicate used by the filter, so traversal and filtering stay aligned and multi-source edges only fire once every source is in the reachable set. Two regression tests added: - `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts neither B nor C leak into the result. - `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires and D becomes reachable once both B and C are. * 🔀 fix: match SDK OR semantics for multi-source edge reachability Reverts the all-sources-required reachability gate from `4982f1c3b` and replaces it with an any-source-reachable model, which matches how `@librechat/agents`'s `MultiAgentGraph.createWorkflow` actually wires multi-source edges at runtime (per-source `builder.addEdge(source, destination)`). With the previous `every` gate, a legitimate handoff edge `{ from: ['A', 'B'], to: 'C' }` where B had no incoming path was pruned along with C, regressing OR-semantics routing that the SDK would otherwise handle correctly. New behavior: 1. Reachability: an edge advances when ANY of its `from` endpoints is already reachable. Fixed-point iteration over `filteredEdges`. 2. Edge filter: keep an edge when it has at least one reachable source AND all destinations are reachable (a missing destination would still crash `StateGraph.compile` with `Found edge ending at unknown node`). 3. Agent prune: keep agents that are reachable OR referenced on any endpoint of a surviving edge. The second clause preserves co-sources in multi-source edges (B in `{ from: ['A','B'], to: 'C' }` when nothing else reaches B) so the SDK's per-source `addEdge` — and the `validateEdgeAgents` safety-net I added to the SDK in #111 — still finds B as a node. The pass-audit A->B->C->D regression test continues to pass: with B skipped, `filterOrphanedEdges` drops both B-adjacent edges, reachability never expands past A, C->D has no reachable source so it gets filtered, and C/D are pruned because they're neither reachable nor referenced. * ✂️ fix: strip skipped co-members from multi-source/multi-dest edges Addresses codex pass-9 P2 on #12740. `filterOrphanedEdges` previously dropped an edge whenever any `from` id was skipped, which was correct for scalar edges but over-aggressive for multi-source ones: the agents SDK adds one `builder.addEdge(source, destination)` per source, so `{ from: ['A','B'], to: 'C' }` with B skipped still has a valid `A -> C` route that was being thrown away. Now sanitize each endpoint: - Scalar skipped → drop the whole edge (no route survives). - Array with some skipped → strip the skipped ids, keep the edge with the surviving members. If the array empties out, drop the edge. Symmetric handling for `to` covers multi-destination fan-out when one co-destination is skipped. Tests updated/added: - `strips skipped co-sources from multi-source edges…` - `strips skipped co-destinations from multi-destination edges` - `drops multi-member edges only when every member on a side is skipped` - Discovery-side: `preserves valid routes when one co-source of a multi-source edge is skipped` asserts the end-to-end behavior — skipped co-source B gets stripped from the edge, A->C routing survives, and C remains in `agentConfigs`. * 🔓 fix: respect SHARE-on-AGENT fallback for handoff ACL on API routes Addresses codex pass-10 P1 on #12740. The API controllers were handing `discoverConnectedAgents` a raw `PermissionService.checkPermission` call against `ResourceType.REMOTE_AGENT`, but the route-level middleware (`createCheckRemoteAgentAccess`) authorizes the primary agent via `getRemoteAgentPermissions`, which first consults the AGENT ACL and treats owners with the SHARE bit as remotely authorized even without an explicit REMOTE_AGENT grant. The mismatch meant a user could open the primary via `/v1/chat/completions` or `/v1/responses`, but their own owned handoff sub-agents were silently skipped — breaking multi-agent handoffs for the common "owner runs their own multi-agent orchestrator" case. Both controllers now pass `discoverConnectedAgents` a `checkPermission` wrapper that delegates to `getRemoteAgentPermissions` (with `getEffectivePermissions` injected from `PermissionService`) and compares the returned bitmask against the required permission via `hasPermissions`. Sub-agents are now authorized by the exact same rules the route middleware applies to the primary. * 🌱 fix: preserve user-defined parallel-start branches Addresses codex pass-11 P2 on #12740. The post-filter reachability prune seeded only from `primaryConfig.id`, which killed `MultiAgentGraph`'s legitimate multi-start pattern — a user-defined edge like `X -> Y` where X has no incoming path (X is an intentional parallel starting node, run alongside the primary) was being dropped because neither X nor Y was reachable from the primary. Reconcile the tension with pass-4 ("prune accidental orphans when an intermediate is skipped") by using pre-filter reachability as the signal: - An agent that WAS reachable from the primary via the original (pre-filter) edges but loses that path when `filterOrphanedEdges` runs is an accidental orphan (a skipped hop broke the chain) — prune. - An agent that was NEVER reachable from the primary, even pre-filter, is an intentional parallel start — seed it into post-filter reachability so its component survives. Surviving-edge endpoint references still keep an agent (co-sources in multi-source edges). New test `preserves user-defined parallel-start branches disconnected from the primary` covers the pass-11 scenario; the existing `A->B->C->D, B skipped` regression test continues to pass because C/D were pre-filter reachable through B and lose that reachability after filtering. * 🎯 fix: tighten parallel-start seed criterion to 'no pre-filter incoming edge' Addresses codex pass-12 P1 on #12740. The pass-11 seed heuristic — 'agent is in `agentConfigs` but was not pre-filter reachable from the primary' — was too permissive. A downstream agent like Y in `X -> Y` where X gets skipped (missing / no VIEW) was never pre-filter reachable from the primary either, so the old rule promoted Y to a parallel start node and discovery returned `agents: [primary, Y]` with no connecting edge. The SDK then ran Y as an unintended parallel root — exactly the orphan behavior pass-4 wanted to prevent. Tighter criterion: seed a post-filter reachability root only when the agent had NO incoming edge in the pre-filter graph. That matches `MultiAgentGraph.analyzeGraph`'s "no-incoming-edge" definition of a start node applied to the user's original declared topology, so: - `A -> B` plus a user-defined `X -> Y` parallel branch: X has no incoming pre-filter → seeded → X and Y both survive. - `A -> B` plus `X -> Y` with X skipped: Y had an incoming pre-filter (`X -> Y`) → NOT seeded → Y is pruned as the orphan it is. - `A -> B -> C` with B skipped: C had an incoming pre-filter (`B -> C`) → NOT seeded → C is pruned. New test `does not promote a downstream orphan to a parallel start when its only upstream is skipped` locks in the pass-12 scenario. The pass-11 `preserves user-defined parallel-start branches` test continues to hold. * 📁 fix: don't enforce AGENT-only file ACL on REMOTE_AGENT API callers Addresses codex pass-13 P1 on #12740. When I refactored the API controllers' DB-method bundle, I inadvertently started forwarding `filterFilesByAgentAccess` into `initializeAgent`. That helper calls `checkPermission` with `resourceType: ResourceType.AGENT`, but these routes authorize callers through `REMOTE_AGENT` (via `getRemoteAgentPermissions`). A user granted `REMOTE_AGENT_VIEWER` on a shared agent but lacking direct `AGENT_VIEW` could invoke the agent yet all its owner-attached context files would get silently filtered out — breaking `file_search`/context retrieval for remote consumers. Drop `filterFilesByAgentAccess` from the OpenAI-compat and Responses controllers' `dbMethods` (and remove the now-unused import). The chat UI's `initialize.js` keeps it since that path legitimately authorizes at the AGENT level. No functional change inside the helper — passing `undefined` simply tells `primeResources` to skip the per-file ACL filter, restoring the pre-refactor API behavior. * 🪓 fix: strip unreachable co-sources from surviving multi-source edges Addresses codex pass-14 P1 on #12740. The earlier pass-8 fix kept any agent referenced as an endpoint of a surviving edge (via a `referencedByEdge` fallback) to avoid the SDK's `validateEdgeAgents` failing on missing nodes. But that fallback propped up unreachable co-sources too: with `[A -> C, X -> B, [B,C] -> D]` and X skipped, `X -> B` gets filtered, the `[B,C] -> D` fan-in survives because C is reachable, and B stays in `agentConfigs` solely because the fan-in still lists it. `MultiAgentGraph.analyzeGraph` then sees B with no incoming edge and runs it as an unintended parallel root. Sanitize surviving edges instead: for a kept edge whose `from` is an array, filter out any co-source that isn't reachable. The SDK's per-source `addEdge` fires independently, so dropping an unreachable co-source doesn't invalidate the remaining routes — in the scenario above `[B,C] -> D` becomes `[C] -> D`, every endpoint of every surviving edge is now reachable, and the agent prune collapses to a strict `reachable.has(agentId)` check. No more referenced-by-edge fallback. Regression test added: `strips unreachable co-sources from surviving multi-source edges (no stray parallel root)` — asserts B is absent from every surviving edge endpoint and the fan-in's `from` is just `['C']`. All 22 prior discovery tests still pass unchanged.		2026-04-20 02:20:43 -04:00
..
app	📎 fix: Route Unrecognized File Types via supportedMimeTypes Config (#12508 )	2026-04-01 23:04:43 -04:00
cache	🚦 fix: ERR_ERL_INVALID_IP_ADDRESS and IPv6 Key Collisions in IP Rate Limiters (#12319 )	2026-03-19 21:48:03 -04:00
config	🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737 )	2026-04-19 21:49:41 -07:00
db	🐛 fix: Resolve MeiliSearch Startup Sync Failure from Model Loading Order (#12397 )	2026-03-25 14:02:44 -04:00
models	🗑️ chore: Remove Action Test Suite and Update Mock Implementations (#12268 )	2026-03-21 14:28:55 -04:00
server	🤝 fix: Load Handoff Agents for Agents API (#12740 )	2026-04-20 02:20:43 -04:00
strategies	🔐 feat: Admin Auth Support for SAML and Social OAuth Providers (#12472 )	2026-03-30 22:49:44 -04:00
test	🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737 )	2026-04-19 21:49:41 -07:00
utils	🦉 feat: Claude Opus 4.7 Model Support (#12698 )	2026-04-16 14:51:00 -04:00
jest.config.js	📏 refactor: Add File Size Limits to Conversation Imports (#12221 )	2026-03-14 03:06:29 -04:00
jsconfig.json
package.json	📦 chore: npm audit & bump `@librechat/agents` to v3.1.67 (#12710 )	2026-04-16 22:44:00 -04:00
typedefs.js	🪦 refactor: Remove Legacy Code (#10533 )	2025-12-11 16:36:12 -05:00