CERBERUS/LibreChat - Gitora: Self-Hosted Future for Developers

CERBERUS/LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-06-27 01:41:19 +00:00

Author	SHA1	Message	Date
Danny Avila	6357ea10c1	🧭 feat: Scope Model Spec Skills (#13522 ) * feat: scope model spec skills * style: format skill catalog limit * fix: serialize model spec skill resolution * test: satisfy model spec load config typing * fix: apply model spec skills to added conversations * fix: support alwaysApply frontmatter alias * fix: address model spec skills review	2026-06-05 10:22:02 -04:00
Danny Avila	1da789bac0	🗂️ feat: Add Agent File Authoring Tools (#13435 ) * feat: add agent file authoring tools * style: format file authoring changes * style: satisfy file authoring prettier * test: fix file authoring initialization expectations * fix: complete skill file authoring flow * fix: pass skill authoring state on edit * test: mock missing bundled skill file * fix: harden agent file authoring gates * fix: preserve file authoring runtime context * test: fix authoring context mock typing * fix: preserve subagent skill primes * test: avoid array at in handler spec * refactor: deepen skill authoring runtime wiring * fix: address codex authoring review findings * test: fix authoring collision fixture type * test: add skill file authoring mock e2e * fix: Improve skill file authoring recovery * fix: Show file authoring args while running * fix: Clarify skill rename authoring errors * fix: Keep code-only file authoring schemas sandbox scoped * fix: Address skill authoring review findings * fix: Gate skill authoring on write access	2026-06-03 23:58:12 -04:00
Danny Avila	68eac104ad	🗂️ fix: Scope Handoff Agent Context Docs (#13167 ) * fix: Scope agent context docs to handoff agents * fix: Deduplicate scoped request context * refactor: Extract agent attachment helpers	2026-05-18 15:36:22 -04:00
Danny Avila	7631366f52	🪵 chore: Log Subagent Limit Hits (#13068 )	2026-05-11 09:25:08 -04:00
Danny Avila	70b6bb69d3	🧬 fix: Bound Subagent Expansion (#13064 ) * fix: Bound subagent expansion * fix: Preserve subagent path depth	2026-05-11 08:53:53 -04:00
Danny Avila	22890771cf	🧭 fix: Preserve Resend Files for Subagents (#13030 )	2026-05-08 17:21:52 -04:00
Danny Avila	d83cb84f59	🪆 feat: Subagent configuration in Agent Builder (#12725 ) * 🪆 feat: Subagents configuration (isolated-context child agents) Surfaces the new @librechat/agents `SubagentConfig` primitive in the Agent Builder. Subagents let a supervisor delegate a focused subtask to a child graph running in an isolated context window: verbose tool output stays in the child, only a filtered summary returns to the parent. Data model: new `subagents: { enabled, allowSelf, agent_ids }` on Agent, wired through the Zod, Mongoose, and form schemas plus a new `AgentCapabilities.subagents` capability (enabled by default). Backend: `initialize.js` loads explicit subagent configs alongside handoff agents, and drops subagent-only references from the parallel/handoff maps so they don't leak into the supervisor's graph. `run.ts` emits `SubagentConfig[]` on the primary `AgentInputs` — a self-spawn entry when `allowSelf` is enabled plus one entry per configured agent. UI: an "Advanced" panel section with an enable toggle, a self-spawn toggle, and an agent picker (capped at 10). Enabling without adding agents still yields self-spawn; disabling self-spawn with no agents shows a warning. A capability flag gates the whole section. * 🪆 feat: Stream subagent progress to UI (dialog + inline ticker) Pairs with the @librechat/agents SDK change that forwards child-graph events through the parent's handler registry (danny-avila/agents#107): - Self-spawn and explicit subagents can now use event-driven tools, because child `ON_TOOL_EXECUTE` dispatches reach our ToolService via the parent's registered handler. - The same forwarding path wraps the child's run_step / run_step_delta / run_step_completed / message_delta / reasoning_delta dispatches in a new `ON_SUBAGENT_UPDATE` envelope, with start/stop/error bookends. Backend: `callbacks.js` registers an `ON_SUBAGENT_UPDATE` handler that forwards each envelope straight to the SSE stream. Frontend: - `useStepHandler` consumes `ON_SUBAGENT_UPDATE` events and merges them into a per-tool_call Recoil atom (`subagentProgressByToolCallId`). First-seen `subagentRunId` claims the most-recent unclaimed `subagent` tool call in the active response message — a temporal mapping, no SDK wire-format change needed to correlate child runs with parent tool calls. - New `SubagentCall` part component replaces the default `ToolCall` rendering when `toolCall.name === Constants.SUBAGENT`: compact status ticker showing the 3 most recent update labels, clickable to open a dialog with the full activity log + final markdown-rendered result. - Adds `Constants.SUBAGENT`, `StepEvents.ON_SUBAGENT_UPDATE`, and `SubagentUpdateEvent` type in data-provider. Tests: - `packages/api npx jest run-summarization` — 23 pass - `api npx jest initialize` — 16 pass - `npm run build` — clean Dependency note: bumps `@librechat/agents` to `^3.1.67-dev.1` — requires the SDK PR (danny-avila/agents#107) to be merged to dev and published before this PR merges. `ON_SUBAGENT_UPDATE` is absent from dev.0, so the handler registration would be a no-op with the older SDK but would not crash. * 🪆 fix: address Codex review and review audit on subagents Stacks on top of the SDK change in danny-avila/agents#107 (bumped to `^3.1.67-dev.2`). - P1 (`initialize.js`): subagent-only agents were being deleted from both `agentConfigs` AND `agentToolContexts`. The tool-execute handler resolves execution context (agent, tool_resources, skill ACLs) from `agentToolContexts`, so explicit subagents would run without their configured resources and skip action tools. Now only `agentConfigs` is pruned — tool context stays intact. - P2 (`AgentSubagents.tsx`): toggling subagents off set the form field to `undefined`; `removeNullishValues` stripped it from the PATCH, leaving the server copy enabled. Now it persists an explicit `{ enabled: false, ... }` so the update actually clears state. - Finding 1 (MAJOR) — `agent_ids` Zod schema gains `.max()` via a new `MAX_SUBAGENTS` export from `data-provider` (shared with the UI cap). Crafted payloads can't trigger hundreds of `processAgent` calls. - Finding 2 (MAJOR) — `subagentProgressByToolCallId` atomFamily atoms are now tracked in a ref and reset from `clearStepMaps` via a `useRecoilCallback({ reset })`. No monotonic growth across a session. - Finding 3 (MAJOR) — early-arriving `ON_SUBAGENT_UPDATE` events whose parent `tool_call_id` is not yet mapped are now buffered in `pendingSubagentBuffer` (keyed by `subagentRunId`) and replayed in arrival order once correlation completes. Mirrors the existing `pendingDeltaBuffer` pattern. - Finding 4 (MAJOR) — switched to deterministic correlation via the new `parentToolCallId` that SDK `3.1.67-dev.2` threads through from `ToolRunnableConfig.toolCall.id`. Temporal fallback now iterates oldest-unclaimed-first (forward), matching tool-call creation order, so concurrent spawns map correctly. - Finding 6 (MINOR) — `agent_ids` are deduped on the backend via `new Set(...)` before the load loop. Duplicates no longer produce duplicate `SubagentConfig` entries visible to the LLM. - Finding 7 (MINOR) — events array inside each Recoil atom is capped at 200 entries. Long-running subagents no longer replay O(n) spreads on every update; the dialog log still shows the cap window. - Finding 8 (MINOR) — documented: subagents are loaded only for the primary agent this release (handoff children get self-spawn but not explicit sub-subagents). In-code comment added so the next maintainer doesn't wonder. - Finding 9 (NIT) — removed `{!isSubmitting && null}` dead code and the misleading announce-polite comment in `SubagentCall`. - New `validation.spec.ts` — 9 tests covering the cap on `agent_ids.length` at the subagent schema, agent-create, and agent-update layers. - `run-summarization` — 23 pass, `initialize` — 16 pass, total backend package: 103 pass across touched areas. Findings 5 (component tests) and 10 (micro-allocation) are tracked but deferred; the former needs a Recoil-RenderHook harness that isn't in this PR's scope, and the latter has negligible impact (one `Array.from` per subagent run). * 🧪 test: integration coverage for subagent correlation + backend loading Addresses the follow-up audit on #12725 with real-code tests (no mock handlers, only the existing setMessages/getMessages spies and the standard mongodb-memory-server harness). Six new tests under a dedicated `describe('subagent loading')`: - loads a configured subagent, populates `subagentAgentConfigs`, keeps it out of `agentConfigs` - P1 regression guard: drives the real `toolExecuteOptions.loadTools` closure with the subagent id and asserts `loadToolsForExecution` is called with `agent: <subagent>`, `tool_resources`, `actionsEnabled`. If anyone deletes `agentToolContexts` again, this fails. - dedup: three copies of the same id load the agent once - overlap: agent referenced both as handoff target and subagent stays in `agentConfigs` - capability gate: admin disabling `subagents` suppresses loading even when the agent has a config - per-agent disable: `subagents.enabled: false` skips loading entirely Five new tests under `describe('on_subagent_update event')` using a real `RecoilRoot` and a companion `useRecoilCallback` reader so writes from the hook are observable: - deterministic correlation via `parentToolCallId` (happy path with SDK dev.2+) - fallback: oldest-unclaimed tool call wins for concurrent spawns without `parentToolCallId` - early-arrival buffer: updates with no mapping get buffered and replayed once the tool call appears - event cap: 205 updates collapse to 200 retained, oldest dropped - `clearStepMaps` resets tracked atoms back to their null default - F2 — added explicit `// TODO` marker for handoff-subagent-loading extension (matches the comment that referenced it). - F3 — dropped the unnecessary `MAX_SUBAGENTS as MAX_SUBAGENTS_CAP` alias; just import the constant directly. - Bumped `@librechat/agents` to `^3.1.67-dev.3` to pick up the SDK's paired test additions. - `api/server/services/Endpoints/agents/initialize.spec.js` — 22 pass (6 new + 16 existing) - `packages/api/src/agents/validation.spec.ts` + `run-summarization.test.ts` — 103 pass - `client/src/hooks/SSE/__tests__/useStepHandler.spec.ts` — 48 pass (5 new + 43 existing) * 🪆 fix: strip parent run summary + discovered tools from subagent inputs Codex P1 on #12725: `buildSubagentConfigs` reused the shared `buildAgentInput` factory for each explicit child, and that factory always stamps the parent run's `initialSummary` (cross-run conversation summary) and `discoveredTools` (tool names the parent's LLM searched earlier) onto every `AgentInputs` it returns. When subagents were enabled on a conversation that had already been summarized, every child inherited that summary — silently defeating the isolated-context contract and burning extra tokens on unrelated prior chat. Fix in `run.ts`: after `buildAgentInput(child)`, explicitly clear `childInputs.initialSummary` and `childInputs.discoveredTools` before attaching to the `SubagentConfig`. The parent keeps both — that's how the supervisor receives cross-turn context — but the child starts fresh. Paired with danny-avila/agents#107 (bumped to `^3.1.67-dev.4`), which adds the equivalent strip inside `buildChildInputs` to cover the self-spawn path where the SDK clones parent `_sourceInputs` directly and LibreChat never sees the intermediate shape. Belt and suspenders. Regression test (new): - `does NOT leak the parent run initialSummary into an explicit child (Codex P1 regression)` — sets `initialSummary` on the run, enables subagents with an explicit child, asserts the parent still has the summary but `childConfig.agentInputs.initialSummary` is `undefined`. Same for `discoveredTools`. 24 pass. * 🪆 fix: capability gate applies to handoff agents + parallel subagent test ### Codex P2 — handoff agents kept `subagents` after capability disabled The endpoint-level `AgentCapabilities.subagents` gate only cleared `subagents` on `primaryConfig`. Handoff agents loaded into `agentConfigs` retained their persisted `subagents.enabled: true`, and because `run.ts` calls `buildSubagentConfigs` for every agent input, self-spawn would still fire on a handoff target even when the admin had disabled the capability globally. Fix in `initialize.js`: after the subagent loading block, when the capability is off, iterate `agentConfigs.values()` and clear `subagents` + `subagentAgentConfigs` on every loaded config. Regression test: `clears subagents on handoff agents too when capability is disabled (Codex P2 regression)` — seeds a handoff target with its own `subagents.enabled: true`, disables the capability at the endpoint, asserts both primary AND handoff have `subagents` undefined in the client args. 23 init tests pass. ### Parallel subagent correlation — user-requested verification Added `keeps parallel subagent streams independent when events interleave` to `useStepHandler.spec.ts`. Two `subagent` tool calls seeded side by side, 6 interleaved `ON_SUBAGENT_UPDATE` envelopes dispatched (a-start, b-start, a-step, b-step, a-stop, b-step), each carrying its own `parentToolCallId`. Asserts each `tool_call_id`'s Recoil bucket accumulates only its own run's events, statuses reflect each run independently (`call_a` → stop, `call_b` → run_step), no cross-contamination. 49 step-handler tests pass. * 🪆 fix: SubagentCall detects cancelled / errored states (Codex P2) Codex P2 on #12725: the old `running` check only consulted `initialProgress` and the subagent's phase. A user stop, dropped stream, or backend crash before a terminal `stop`/`error` envelope arrived would leave the ticker permanently stuck on "working…". Other Call components (ToolCall.tsx) already model this via `!isSubmitting && !finished` → cancelled. Mirror that pattern. Re-introduce `isSubmitting` on `SubagentCallProps` (the prop was dropped earlier as 'unused' — that was a bug) and resolve status as a tri-state: - `finished` — initialProgress >= 1, or subagent `stop`/`error` - `cancelled` — `!isSubmitting && !finished` - `running` — neither New locale keys `com_ui_subagent_cancelled` + `com_ui_subagent_errored` swap in the right header text per state. Tests: new `SubagentCall.test.tsx` covers all four states with a real `RecoilRoot` and a `useRecoilCallback` seeder — no mocked store — 5/5 pass. Includes an explicit P2 regression test that simulates the `isSubmitting=false, progress.status='run_step', initialProgress<1` scenario and asserts the cancelled label renders. 🪆 feat: semantic ticker + aggregated content-part dialog for subagents Two rounds of feedback on #12725: ### Ticker — user-readable lines, not raw event names The old ticker showed \`on_run_step\`, \`on_message_delta\`, etc. — not meaningful to users. Replaced with \`buildSubagentTickerLines\`, a pure helper that walks the \`SubagentUpdateEvent\` stream and emits: - message/reasoning deltas → a single live "Writing: <last 60 chars>" (or "Reasoning: …") line that updates in place as chunks arrive - run_step with tool_calls → "Using calculator(expression=4258)" for a single call, "Using tool: a, b" for parallel (args dropped when multiple so the line stays short) - run_step_completed → "calculator → 4258 = 2436" (output truncated to 48 chars; falls back to "Tool X complete" when output is empty) - error → "Error: <message>" - start / stop / run_step_delta → suppressed (too granular / lifecycle-only) Args and output pass through \`summarizeArgs\` / \`summarizeOutput\` which flatten JSON to \`key=value\` pairs and head-truncate long strings so a 200-line tool output never bloats the ticker. ### Dialog — aggregated content parts via leaf renderers \`aggregateSubagentContent\` folds the raw event stream into \`TMessageContentParts[]\` — text/reasoning delta streaks collapse into single \`TEXT\` / \`THINK\` parts, tool calls become \`TOOL_CALL\` parts, and \`run_step\` boundaries correctly break text runs around tool calls. The dialog iterates those parts through a \`SubagentDialogPart\` renderer that delegates to the existing \`Text\`, \`Reasoning\`, and \`ToolCall\` leaf components — the same sub-components \`<Part />\` uses — wrapped in a minimal \`MessageContext\` so reasoning expand state and cursor animation work. Leaf components are used directly rather than importing \`<Part />\` itself to avoid a module cycle (Part → Parts/index → SubagentCall → Part) and to sidestep a hypothetical nested-subagent rendering. ### Tests - \`subagentContent.test.ts\` — 19 pure-function tests covering the aggregator (text concat, reasoning concat, tool call lifecycle, interleaving, phase suppression, late-arriving completions) and the ticker builder (live preview truncation, args/output snippets, parallel-call handling, output truncation, i18n formatter override). - \`SubagentCall.test.tsx\` — 9 component tests: 5 status-resolution (existing) + 2 ticker (semantic text, delta collapse) + 2 dialog (aggregated parts routed to leaf renderers, raw-output fallback). ### Locale keys New: \`com_ui_subagent_ticker_writing\`, \`…_reasoning\`, \`…_error\`, \`…_using\`, \`…_using_with_args\`, \`…_tool_complete\`, \`…_tool_output\`. Preserves i18n at the display layer while the helper stays pure. * chore: drop unused com_ui_subagent_activity_log locale key The dialog no longer renders an "Activity log" section — the new content-parts renderer replaced it. Also tweaks the dialog description copy to match. * 🪆 fix: subagent dialog order, persistence, auto-scroll, width Follow-up pass addressing the four issues observed in real runs against a live subagent-using parent. ### Aggregator ordering (reasoning appearing after text it preceded) Reproducible pattern: LLM emits reasoning → text → tool call in that order, but the dialog rendered text BEFORE reasoning in the content array. Root cause: `aggregateSubagentContent` maintained `currentText` and `currentThink` buffers in parallel and only flushed them at a `run_step` boundary in a fixed (text, think) order, losing the actual arrival order. Fix: when a text chunk arrives, close any open think buffer first (pushes it into the content array right then); symmetric for think → text. Two new regression tests cover the exact reasoning → text → tool_call sequence from the screenshot and the repeated reasoning ↔ text flow across a turn. ### Content persists after completion (markdown not rendering when done) `clearStepMaps` was calling `resetSubagentAtoms()` at stream end, which wiped every `subagentProgressByToolCallId` entry. Once reset, `contentParts.length === 0` and the dialog fell back to rendering the raw `output` string with plain text — hence the literal `##`/`*` in the completed-state screenshot. Stopped resetting; the atoms are bounded per-call (200-event cap) and per-conversation (one per subagent spawn) so growth matches the rest of the conversation state. `resetSubagentAtoms` is kept for a future conversation-switch caller. Also: routed the raw-`output` fallback (older subagent runs recorded before the event forwarder existed) through the same `SubagentDialogPart` → `Text` leaf that content parts use, so its markdown renders the same way. ### Auto-scroll to bottom while running Added a `scrollRef` on the dialog body and a `useEffect` that pins `scrollTop = scrollHeight` while the dialog is open AND the subagent is running. Triggers on `contentParts.length` (new tool calls / part boundaries) and `events.length` (intra-part deltas) so the cursor tracks text streaming. Disabled post-completion so re-opening a finished run doesn't yank to the bottom. ### Wider dialog Went from `max-w-2xl` (42rem / 672px — too cramped on maximized laptop windows) to `w-[min(95vw,64rem)] max-w-[min(95vw,64rem)]`. Narrow on phones, scales up to 64rem on desktop, always leaves a bit of margin from the viewport edge. Bumped `max-h-[65vh]` on the scroll area to give the extra width room to breathe vertically too. ### Tests - `subagentContent.test.ts` — 21 pass (2 new ordering regressions). - `useStepHandler.spec.ts` — 49 pass (1 updated to assert atoms are preserved* on clearStepMaps). - `SubagentCall.test.tsx` — 9 pass (unchanged; aggregator-level tests cover the ordering). * 🪆 feat: persist subagent_content via SDK createContentAggregator Per-request map of createContentAggregator instances keyed by the parent's tool_call_id. ON_SUBAGENT_UPDATE handler feeds each event into the matching aggregator (phase → GraphEvent mapping); AgentClient harvests contentParts onto the subagent tool_call at message save so the child's reasoning / tool calls / final text survive a page refresh. Reusing the SDK's battle-tested aggregator instead of a bespoke one keeps the persisted shape identical to the parent graph's output and drops ~100 lines of custom aggregation code. * 🪆 fix: incremental subagent aggregation + dialog render parity Disappearing tool_calls: the Recoil atom trimmed events to a 200-long rolling window, so verbose subagents could shed the `run_step` that originally created a tool_call part — rebuilding content from the trimmed window then produced only the surviving text/reasoning. Fix: fold each envelope into `contentParts` incrementally in the atom as it arrives (new `foldSubagentEvent` + cursor state). Event trim window now affects only the ticker, never the dialog. Render parity: dialog now applies `groupSequentialToolCalls` and renders single parts through `Container` + grouped batches through `ToolCallGroup` — same spacing and "Used N tools" collapsing the main message view uses. Width: `min(96vw, 80rem)` — wider on big screens, still responsive. Labels: "Subagent: X" is jargon. Named subagents render as `Running "{name}" agent` / `Ran "{name}" agent` (past tense on completion); self-spawns use `Running subtask` / `Ran subtask` since `Running "self" agent` reads badly. * 🪆 polish: subagent dialog parity + agent avatar in header Labels: drop "subtask" framing. Self-spawn shows `Running agent` / `Ran agent` (past tense on completion); named subagents stay `Running "X" agent` / `Ran "X" agent`. Dialog render parity: stop wrapping every part in `Container`. TEXT keeps its `Container` (gap-3 + `mt-5` sibling margin), THINK and TOOL_CALL render bare so their own wrappers set the full-column width the regular message view gives them — matches the main `<Part>` dispatch. Outer scroll region now uses `px-4 py-3` padding and a `max-w-full flex-grow flex-col gap-0` inner wrapper, mirroring the `MessageParts` container the main conversation uses. Avatar: header icon now renders the subagent's configured avatar via `MessageIcon` when `useAgentsMapContext()` has the child agent, falling back to the `Users` SVG (which keeps its running-state pulse). Same icon-left-of-label pattern the tool UI uses. * 🪆 polish: subagent group label, ticker throttle + tail-ellipsis, scroll button Grouped label: ToolCallGroup now detects all-subagent batches and labels them "Running N agents" / "Ran N agents" instead of "Used N tools". Mixed batches keep the existing label. The tool-name summary is suppressed for all-subagent groups (every entry dedupes to "subagent", which adds nothing). Ticker width + tail-ellipsis: raise the preview cap to 300 chars so wide containers aren't half-empty, and flip the ticker `<li>` to `dir="rtl"` so `text-overflow: ellipsis` clips the oldest characters (visually the left edge) — the newest tokens stay pinned to the right regardless of container width. Bidi lays out the Latin text LTR internally, the rtl only affects which side gets the ellipsis. Throttle: `useThrottledValue` hook (trailing-edge, 1.2s) smooths the live `Writing: …` preview so tokens no longer strobe past the eye faster than they can be read. Ref-based internals (not `useState`) avoid infinite-update loops when the upstream value is a new-reference each render; `NEGATIVE_INFINITY` sentinel ensures the very first value passes through synchronously so tests and first paint aren't delayed. Scroll-to-bottom: dialog tracks `isAtBottom` with a 120px threshold; auto-scroll only engages when the user is already following along, and a persistent jump-to-latest button appears whenever they scroll up — no more fighting the auto-scroll to read back. * 🪆 polish: snappier ticker, prefix-safe labels, agents icon, readable lines Ticker lines are now incrementally aggregated in the atom — same pattern as contentParts. The raw-events rolling window is gone; event volume no longer caps what the ticker can display. Verbose subagents that used to drop early tool_call lines out of the window now keep the full 3-line history (using_tool, tool_complete, writing). Discriminated-union ticker lines split a constant prefix (e.g. "Writing:") from a tail-truncatable body. The prefix lives in a `shrink-0` span so it never gets clipped when the body overflows; the body uses `dir="rtl"` only on itself — scoped so non-streaming lines (e.g. "Waiting for first update…") can't get their trailing ellipsis flipped by bidi. Content-aware throttle: 800ms interval (down from 1200ms), skipped entirely while the live buffer is below 120 chars. Early tokens now appear immediately — no more "Reasoning: I" sitting blank for a full second before the next heartbeat. Once the preview is long enough to fill the container, throttling kicks in at the tighter interval. Header label is now a constant verb + optional muted sub-label. Base reads "Running agent" / "Ran agent" / "Cancelled agent" / "Agent errored" for every subagent; named subagents get the configured agent name rendered to the right in secondary text (self-spawns and unresolved names omit it — "Running self agent" is nonsense). ToolCallGroup now detects `allSubagents` and swaps `StackedToolIcons` for a single `Users` glyph — otherwise the group header shows a wrench ("tool") icon next to "Ran 5 agents", which reads wrong. * 🪆 feat: delimiter-aware tool labels in ticker + full-width tool lines New shared `parseToolName` helper in `client/src/utils/toolLabels.ts` — single source of truth for splitting `<tool>_mcp_<server>` ids and mapping native tool names (web_search, execute_code, …) to their friendly translation keys. `ToolCallGroup` drops its inline copy and pulls from this helper. Ticker tool lines now use the shared parser + a new `ToolIdentifier` sub-renderer so the live log reads like the main tool UI: - MCP tool → `<server> · <code-badge:tool>` (e.g. "github · `search_code`") - Native → friendly name from `TOOL_FRIENDLY_NAME_KEYS` - Unknown → bare `<code>` badge of the raw id The `using_tool` / `tool_complete` rows now render with a `flex w-full items-baseline gap-1 overflow-hidden` layout matching the writing/reasoning rows — they take the full container width instead of collapsing to content size. Output snippets on `tool_complete` get the same tail-side `dir="rtl"` ellipsis so the newest characters stay flush-right when the container is narrow. Dropped the now-unused template i18n keys (`com_ui_subagent_ticker_using_with_args`, `com_ui_subagent_ticker_tool_complete`, `com_ui_subagent_ticker_tool_output`) in favor of tokens the JSX composes structurally. Only English is touched per the project rule; other locales follow externally. * 🪆 fix: dialog scroll button + auto-scroll during streaming deltas Two race/trigger bugs in the dialog's scroll behavior: Button never showed: `addEventListener('scroll', …)` in a `useEffect` ran before Radix's portal had actually committed the scroll container, so `scrollRef.current` was still null — the listener never attached, `isAtBottom` stayed stuck at its initial `true`, and the jump-to-latest button was never rendered. Swap to React's `onScroll` prop on the element itself so the handler wires up as part of DOM commit, not a post-commit effect. Auto-scroll stalled during text streaming: the pin-to-bottom effect only re-fired on `contentParts.length` changes. Message/reasoning deltas extend the last TEXT/THINK part's `.text` without changing the array length — so the view would drift up as tokens piled in and never catch back up. Replace the length-dep effect with a `ResizeObserver` on the inner content div; every height change (new part or in-place growth) triggers a scroll-pin when the user is still at the bottom. * 🪆 fix: drop leading ellipsis from ticker body truncatePreview was prepending ... to the tail when the buffer exceeded 300 chars. The component's CSS already produces a left-side ellipsis for overflow via dir=rtl + text-overflow: ellipsis — stacking a data-level ellipsis on top renders a stray dot character right after the Writing: / Reasoning: label (Writing: .Sure!), which looks like a typo to the reader. Data now returns just the last 300 chars when truncating; CSS handles the visual cue whenever the body actually overflows its container. * 🪆 fix: Codex review — subagent isolation + concurrent-safe throttle Three findings from the @codex review pass, all valid: P1 — buildAgentInput leaks parent discovered-tool state into subagent children. `buildAgentInput` mutates `agent.toolRegistry` (`overrideDeferLoadingForDiscoveredTools` flips `defer_loading:true→false` on tools the parent previously searched for) and appends those tools' definitions to the returned `toolDefinitions` before the function returns. `buildSubagentConfigs` was clearing the reported `initialSummary` / `discoveredTools` fields on the returned AgentInputs, but that happened post-return — the registry writes and extra tool definitions persisted on the child, silently defeating context isolation and inflating the child's prompt. Fix: `buildAgentInput` now takes an `isSubagent` flag that gates the registry-mutation block and omits `initialSummary` / `discoveredTools` at the source. `buildSubagentConfigs` passes `{ isSubagent: true }` for every explicit child; no post-hoc cleanup needed. P2 — ToolCallGroup labels a finished subagent group as still running when the child returned no output. `getToolMeta` computed `hasOutput` as `!!tc.output`, which is `false` for a completed subagent that returned empty text (the UI already has an "empty result" fallback for that case). `allCompleted` would stay `false` and the group header stuck on "Running N agents" forever. Fix: treat `tc.progress === 1` as completion too — progress is the authoritative lifecycle signal, output is just content. P2 — useThrottledValue schedules `setTimeout` during render. Discarded renders under Strict Mode / Concurrent rendering would leave orphan timers firing against stale trees. Fix: move `setTimeout` into a `useEffect` keyed on `[value, intervalMs, enabled]`. Render-time still mutates refs (idempotent), but timer scheduling lives post-commit. Cleanup on unmount and on passthrough transitions is preserved. * 🪆 fix: Codex P2 — wipe subagent atoms on conversation switch `clearStepMaps()` intentionally doesn't reset `subagentProgressByToolCallId` so a user can reopen a completed subagent's dialog mid-conversation, but `resetSubagentAtoms` was defined and never exposed / called — so each completed run's aggregated `contentParts` + `tickerState` stayed resident in the `atomFamily` for the whole app session. Unbounded growth across multi-conversation sessions. Expose `resetSubagentAtoms` from `useStepHandler` and fire it from `useEventHandlers` whenever the URL's `conversationId` changes. That's the correct cleanup boundary: historical subagent dialogs rehydrate from persisted `subagent_content` on each `tool_call` at message-save time, so wiping live atoms on switch doesn't lose any viewable history — it just releases per-tool-call state that the old conversation's components no longer subscribe to. * 🪆 fix: Codex round 3 — subagent registry isolation + post-run label Two more valid findings. P1 — parent-order registry mutation leaks into subagent inputs. `overrideDeferLoadingForDiscoveredTools` mutates `agent.toolRegistry` in place (the Map and the LCTool objects inside it). When an agent appears both as a handoff target (normal graph node) AND an explicit subagent child, a subagent build that ran before the parent's build captures a reference to the same registry — the parent's later mutation leaks through to the child. Fix: for subagent children (`isSubagent`), clone the `toolRegistry` Map and shallow-clone each LCTool inside before returning the inputs. `defer_loading` flips on parent-graph registry mutations can't propagate across the clone boundary. `toolDefinitions` also gets a shallow-copy pass so the same isolation holds for definitions the child carries directly. P2 — "Running N agents" label stuck after cancel/error. ToolCallGroup's all-subagent label was gated only on `allCompleted`, which requires every child to have `hasOutput \|\| progress === 1`. A subagent that gets cancelled (stream ends, no `stop` phase, no output) never satisfies that — so even after `isSubmitting` flips false, the header stays on "Running N agents" while each individual card correctly shows "Cancelled agent". Fix: derive a `subagentsDone` flag as `allCompleted \|\| !isSubmitting` and gate the past-tense label on that. Matches the tri-state each SubagentCall card already resolves (finished / cancelled / running). * 🪆 fix: Codex P2 — ACL-check subagents.agent_ids on create/update Codex flagged that `subagents.agent_ids` was accepted as arbitrary strings on the create/update routes while `edges` got a `validateEdgeAgentAccess` pass — so users could save subagent references to agents they can't VIEW. At runtime `initializeClient`'s `processAgent` ACL gate silently drops those, so the persisted configuration and the actual behavior diverged in a way that is difficult to diagnose. Refactor: extract the id-set → unauthorized-ids check into a shared `collectUnauthorizedAgentIds`, wrap it with a dedicated `validateSubagentAccess`, and plumb the same 403-on-failure response the edge path already returns. Applied on both POST /agents and PATCH /agents/:id. * 🪆 fix: Codex round 5 — ACL-disable escape hatch + ticker order Two valid findings. P1 — can't disable subagents after losing access to a child. The subagent ACL check ran on every create/update that echoed back the `agent_ids` list, even when the user was explicitly disabling the feature. The UI keeps the list intact when toggling `enabled: false`, so a user who subsequently lost VIEW on any child would be locked in a 403 loop — every edit (including the one that turns subagents off) bounces. Fix: gate the ACL check on `subagents.enabled !== false` at both the POST /agents and PATCH /agents/:id handlers. Empty list stays a no-op. Disabling the feature is always permitted. P2 — ticker fold merges out-of-order previews across delta switches. `foldSubagentEventIntoTicker` carried `textLineIdx` / `thinkLineIdx` across a reasoning → text → reasoning transition, so the second reasoning chunk appended to the original reasoning line instead of starting a new chronological one. Fix: close the opposite buffer + cursor when a delta-type switch is detected (same rule the content-parts reducer already applies). Added a regression test. * 🪆 fix: Codex round 6 — preserve mid-stream atoms + honor sequential suppression Two valid findings. P2 — atom reset fires on initial chat URL assignment. `useEventHandlers` initialized `lastConversationIdRef` from the URL's current `paramId`, then reset subagent atoms whenever the ref and `paramId` disagreed. For a brand-new conversation the URL stamp goes from `undefined → "abc123"` while the first response is still streaming, which used to drop subagent ticker/content state mid-run and leave dialogs missing earlier updates. Fix: only reset when both the old and new IDs are non-null and differ — i.e. a user-initiated switch between two established conversations. The initial assignment passes through without clearing. P2 — ON_SUBAGENT_UPDATE bypassed `hide_sequential_outputs`. Every other streaming handler in `callbacks.js` (`ON_RUN_STEP`, `ON_MESSAGE_DELTA`, etc.) gates emission on `checkIfLastAgent` + `metadata?.hide_sequential_outputs`, but the subagent forwarder did an unconditional `emitEvent` — so intermediate agents in a sequential chain were leaking their children's activity to the client even when the chain was configured to suppress intermediates. Fix: accept `metadata` and apply the same `isLastAgent \|\| !hide_sequential_outputs` gate. Aggregation still runs regardless of visibility (persistence + dialog depend on it); only the SSE forward is suppressed. * 🪆 fix: Codex P2 — gate subagent ACL check on endpoint capability `validateSubagentAccess` ran on every create/update where `subagents.enabled !== false`, regardless of the endpoint-level `subagents` capability. When the capability is off at the appConfig level, `initializeClient` already strips the `subagents` block at runtime — so persisted `agent_ids` are inert — but the validation could still 403 on a legacy record whose referenced child is no longer viewable, blocking unrelated edits. Fix: add `isSubagentsCapabilityEnabled(req)` that reads the agents endpoint's capabilities from `req.config` and gate both the create and update ACL checks on it. Capability-off environments can update agents with stale `subagents` data freely; capability-on keeps the full ACL protection. * 🪆 fix: Codex P2 — reset subagent atoms on id→null navigation too Previous guard (both-established) skipped the reset whenever `paramId` became null/undefined, so navigating from an existing chat to a "new chat" route left stale subagent progress resident in the `atomFamily` until the user picked a specific different chat. Swap the both-established check for a one-time flag: skip only the very first `undefined → id` transition (the brand-new-chat URL stamp that happens mid-stream), then reset on any subsequent change — id→id, id→null, null→id-after-reset. If the user started on an established chat the flag is true at mount, so the guard is a no-op and every navigation resets normally. * 🪆 fix: Codex round 9 — subagent persistence gate + handoff children Two valid findings. P1 — hide_sequential_outputs also gates persistence. The previous fix gated the SSE forward on `isLastAgent \|\| !hide_sequential_outputs` but still ran the per-tool-call `createContentAggregator` aggregation unconditionally. `finalizeSubagentContent` would then attach the hidden intermediate agent's child reasoning / tool output to the saved message, so a page refresh could reveal activity that was intentionally suppressed live. Move the visibility gate to the top of the handler — hidden agents now skip both aggregation and emission, so "hide_sequential_outputs" is a consistent "don't record" rule for subagent traces. P2 — handoff agents' explicit subagents were silently dropped. `initializeClient` only resolved `subagentAgentConfigs` for the primary config, so an agent used via handoff that had its own `subagents.agent_ids` saved in the builder would get self-spawn only; every explicit child was quietly ignored, creating a saved-config / runtime mismatch the user couldn't diagnose. Extract the resolution into a shared `loadSubagentsFor(config)` helper and invoke it for the primary and every handoff agent in `agentConfigs`. The `edgeAgentIds` precomputation stays outside the helper (it's loop-invariant). Capability-off shortcuts return empty early so the existing strip-on-capability-off path still holds. * 🪆 fix: Codex P2 — recursive subagent build for multi-level delegation Previously only the outer `agents[]` loop attached `subagentConfigs` to its inputs, so a child used as a subagent (invoked via the `subagent` tool) lost every explicit spawn target of its own. A user-valid configuration like A → B → C would only run the top layer; B could never actually delegate to C from inside A's run. Recursively build `subagentConfigs` for each child inside `buildSubagentConfigs`, passing the child's freshly-constructed `childInputs` down so its own `subagents.enabled` children get resolved too. Added cycle protection via an `ancestors` Set — a configuration like A → B → A is safely cut off at the second encounter of A rather than recursing forever (the existing `child.id === agent.id` guard already prevents the direct self-loop). * 🪆 fix: Codex P2 — reset subagent atoms on useEventHandlers unmount The effect that resets subagent atoms only fired on `paramId` change, so unmounting the chat container (route change away from /c) never flushed the atoms. `knownSubagentAtomKeys` lives in a ref inside `useStepHandler` — once the hook unmounts the ref is gone, so a subsequent remount can't clean atoms it never registered. Added a second `useEffect` that only runs cleanup on unmount (empty deps aside from the stable `resetSubagentAtoms` callback). Keeps `atomFamily` bounded across full route teardowns too. * 🪆 fix: Codex round 13 — cyclic subagent guard + prefer persisted Two valid findings. P1 — cyclic subagent ref reloads the primary. A configuration like `A ↔ B` (B lists A as its own subagent) would send `loadSubagentsFor` down a path that couldn't find A in `agentConfigs` (the primary isn't stored there), so it called `processAgent(A)` a second time. That inserts a fresh config for the primary id, which downstream duplicates in `[primary, ...agentConfigs.values()]` and can replace the primary's tool context with the reloaded copy. Fix: short-circuit when a subagent ref points back at `primaryConfig.id` — reuse the already-loaded primary config. Primary is always an edge id so no pruning bookkeeping needed. P2 — live atom preferred over canonical persisted trace. The dialog picked `progress.contentParts` ahead of `persistedContent`, but the Recoil bucket is best-effort — after a disconnect/reconnect it can be stale or partial. The server's `subagent_content` on the `tool_call` is the canonical record refreshed on sync. Preferring live could hide completed tool/reasoning history that was actually persisted. Fix: flip the preference order. Persisted wins when it's non-empty; live covers the mid-stream window (before the parent message saves, persisted is empty) and the older-runs fallback. Updated the test that enforced the old order to lock the new semantics in (separate mid-stream live-fallback assertion kept). * 🪆 fix: Codex P2 — subagent atom reset rule simplified to 'leaving established id' The `hasEstablishedConversationRef` + check for initial undefined→id covered the first navigation but missed the equivalent mid-stream URL stamp when a user goes from an existing chat to a new chat and sends a message there (`id → null → newId`). The null → newId transition was still hitting the reset branch and wiping the in-flight subagent ticker/content for that first turn. Simpler rule: only reset when the PREVIOUS paramId is an established id. Every transition AWAY from an established chat clears (id→id2, id→null, id→undefined); every transition FROM null/undefined passes through (initial mount, new-chat URL stamp mid-stream). Drop the `hasEstablishedConversationRef` machinery in favor of that single condition. * 🪆 fix: Codex P2 — match runtime's strict subagent enable check in ACL Runtime (`initializeClient` + `run.ts`) treats `subagents?.enabled` as a truthy predicate — `undefined`, `null`, missing, and `false` all short-circuit. The ACL gate was using `!== false` which accepted `undefined` / missing as "enabled" and could 403 a payload whose subagent tool would be inert at runtime. Swap both create and update to `enabled === true`. Only a strictly- enabled payload triggers the ACL check; the disable path (`false`) still passes through so a user who lost VIEW on a child can still save the disable edit. * 🪆 fix: Codex P2 — reject missing subagent references with 400 `validateSubagentAccess` collapsed through `collectUnauthorizedAgentIds`, which returns an empty list for ids with no DB record — so typos and references to deleted agents passed validation silently, and `initializeClient` later dropped them at runtime. Saved config would then list spawn targets that the backend never honored, a hard-to- diagnose drift. Refactor the helper into `classifyAgentReferences(ids, …)` which returns `{ missing, unauthorized }` separately. `validateEdgeAgentAccess` keeps its old semantics (missing is intentional — a self-referential `from` names the agent being created). `validateSubagentReferences` surfaces both buckets so the create/update handlers can 400 on missing and 403 on unauthorized with distinct error messages and `agent_ids` lists. * 🪆 polish: tighten subagent dialog grid gap to gap-2 OGDialogContent's grid default is `gap-4`, which renders the title, description, and scroll area as three visually separated panels. Drop to `gap-2` so they read as one block. * 🪆 polish: swap Subagents above Handoffs in Advanced panel Subagents is the more common knob users reach for, so show it first. Handoffs keep the same Controller wiring, just move below.	2026-04-25 04:02:01 -04:00
Danny Avila	dd72b7b17e	🔄 chore: Consolidate agent model imports across middleware and tests from rebase - Updated imports for `createAgent` and `getAgent` to streamline access from a unified `~/models` path. - Enhanced test files to reflect the new import structure, ensuring consistency and maintainability across the codebase. - Improved clarity by removing redundant imports and aligning with the latest model updates.	2026-03-21 14:28:55 -04:00
Danny Avila	bcf45519bd	🪪 fix: Enforce VIEW ACL on Agent Edge References at Write and Runtime (#12246 ) * 🛡️ fix: Enforce ACL checks on handoff edge and added-convo agent loading Edge-linked agents and added-convo agents were fetched by ID via getAgent without verifying the requesting user's access permissions. This allowed an authenticated user to reference another user's private agent in edges or addedConvo and have it initialized at runtime. Add checkPermission(VIEW) gate in processAgent before initializing any handoff agent, and in processAddedConvo for non-ephemeral added agents. Unauthorized agents are logged and added to skippedAgentIds so orphaned-edge filtering removes them cleanly. * 🛡️ fix: Validate edge agent access at agent create/update time Reject agent create/update requests that reference agents in edges the requesting user cannot VIEW. This provides early feedback and prevents storing unauthorized agent references as defense-in-depth alongside the runtime ACL gate in processAgent. Add collectEdgeAgentIds utility to extract all unique agent IDs from an edge array, and validateEdgeAgentAccess helper in the v1 handler. * 🧪 test: Improve ACL gate test coverage and correctness - Add processAgent ACL gate tests for initializeClient (skip/allow handoff agents) - Fix addedConvo.spec.js to mock loadAddedAgent directly instead of getAgent - Seed permMap with ownedAgent VIEW bits in v1.spec.js update-403 test * 🧹 chore: Remove redundant addedConvo ACL gate (now in middleware) PR #12243 moved the addedConvo agent ACL check upstream into canAccessAgentFromBody middleware, making the runtime check in processAddedConvo and its spec redundant. * 🧪 test: Rewrite processAgent ACL test with real DB and minimal mocking Replace heavy mock-based test (12 mocks, Providers.XAI crash) with MongoMemoryServer-backed integration test that exercises real getAgent, checkPermission, and AclEntry — only external I/O (initializeAgent, ToolService, AgentClient) remains mocked. Load edge utilities directly from packages/api/src/agents/edges to sidestep the config.ts barrel. * 🧪 fix: Use requireActual spread for @librechat/agents and @librechat/api mocks The Providers.XAI crash was caused by mocking @librechat/agents with a minimal replacement object, breaking the @librechat/api initialization chain. Match the established pattern from client.test.js and recordCollectedUsage.spec.js: spread jest.requireActual for both packages, overriding only the functions under test.	2026-03-15 18:08:57 -04:00