mirror of
https://github.com/danny-avila/LibreChat.git
synced 2026-05-13 16:07:30 +00:00
186 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
24e29aa8cb
|
🌱 fix: Inject Code-Tool Files Into Graph Sessions on First Call (+ read_file Sandbox Fallback) (#12831)
* 🌱 fix: Seed Code Tool Files Into Graph Sessions on First Call
Files attached to an agent's `tool_resources.execute_code` (user uploads
or generated artifacts from a prior turn) were silently dropped on the
first `execute_code` invocation of a turn. The agents-side `ToolNode`
populates `_injected_files` only when its `sessions` map already has an
`EXECUTE_CODE` entry — but that entry is only written by a previous
successful execution, so call #1 had nothing to inject. CodeExecutor
then fell back to a `/files/{session_id}` fetch, but `session_id` was
also empty on call #1, leaving the sandbox without the primed files.
Mirror the existing skill-priming pattern (`primeInvokedSkills` →
`initialSessions`) for code-resource files: eagerly call `primeFiles`
before `createRun` and merge the result into `initialSessions` via a
new `seedCodeFilesIntoSessions` helper. Skill files and code-resource
files now share the same `EXECUTE_CODE` entry; the prior representative
`session_id` is preserved on merge.
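A minimal sketch of the merge described above, under stated assumptions: the `PrimedFile` and `CodeSession` shapes, the `"execute_code"` key string, and de-duplication by file id are illustrative guesses, not the actual `@librechat/agents` types. It only shows how primed files could join an existing `EXECUTE_CODE` entry while the earlier representative `session_id` is kept.

```typescript
// Hypothetical shapes; the real session types may differ.
interface PrimedFile {
  id: string;
  name: string;
  session_id: string;
}

interface CodeSession {
  session_id: string; // representative id for the whole entry
  files: PrimedFile[];
}

type ToolSessions = Map<string, CodeSession>;

const EXECUTE_CODE = "execute_code"; // assumed key name

function seedCodeFilesIntoSessions(
  sessions: ToolSessions,
  primed: PrimedFile[],
): ToolSessions {
  if (primed.length === 0) {
    return sessions; // nothing to seed, leave any skill seed untouched
  }
  const existing = sessions.get(EXECUTE_CODE);
  const merged: PrimedFile[] = [...(existing?.files ?? [])];
  const seen = new Set(merged.map((f) => f.id));
  for (const file of primed) {
    if (!seen.has(file.id)) {
      seen.add(file.id);
      merged.push(file);
    }
  }
  sessions.set(EXECUTE_CODE, {
    // Keep the prior representative id if one exists (e.g. from skill priming);
    // otherwise fall back to the first primed file's session.
    session_id: existing?.session_id ?? primed[0].session_id,
    files: merged,
  });
  return sessions;
}
```

Each file keeps its own `session_id`, so a consumer that injects per file is unaffected by which id ends up as the representative.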
* 🔬 chore: Add Diagnostic Logging for Code-Files Seeding
Temporary debug logs to diagnose why first-call file injection is not
firing in real agent runs. Logs `wantsCodeExec`, available tool-resource
keys, primed file count, and the seeded EXECUTE_CODE entry. Will revert
once the failure mode is identified.
* 🪛 refactor: Capture primedCodeFiles per-agent at init, merge across run
Replace the client.js eager `primeFiles` call with a per-agent capture at
initialization time so every agent in a multi-agent run (primary +
handoff + addedConvo) contributes its `tool_resources.execute_code`
files to the shared `Graph.sessions` seed.
- handleTools.js (eager loadTools): the `execute_code` factory closes
over a `primedCodeFiles` slot and surfaces it in the return.
- ToolService.js loadToolDefinitionsWrapper (event-driven): captures
`files` from the existing `primeCodeFiles` call (was dropping them
while only keeping `toolContext`) and surfaces them.
- packages/api initialize.ts: the loadTools callback contract now
includes `primedCodeFiles`, threaded onto `InitializedAgent`.
- client.js: iterate `[primary, ...agentConfigs.values()]` and merge
each agent's `primedCodeFiles` into `initialSessions`. Drop the
primary-only `primeCodeFiles` call and diagnostic logs from the prior
attempt — wrong layer (single-agent), wrong gate (`agent.tools`
contained Tool instances after init, so the `.includes("execute_code")`
string check always failed).
* 🔬 chore: Add per-agent diagnostic logs for code-files seeding
Logs `tool_resources` keys + file counts inside loadToolDefinitionsWrapper
and per-agent `primedCodeFiles` + final initialSessions inside
AgentClient. Will revert once the failure mode is confirmed.
* 🔬 chore: Add file-lookup diagnostics inside initializeAgent
Logs the inputs and intermediate counts of the conversation-file lookup
chain (convo file ids, thread message ids, code-generated and
user-code file counts) so we can pinpoint why `tool_resources.execute_code`
is arriving empty at `loadToolDefinitionsWrapper` despite the agent
having `execute_code` in its tools list.
* 🔬 chore: Probe execute_code files without messageId filter
Adds a relaxed `getFiles({conversationId, context: execute_code})` probe
that runs only when `getCodeGeneratedFiles` returns empty. Lists what's
actually in the DB for this conversation so we can confirm whether the
file is missing entirely or whether the messageId filter is rejecting it.
* 🔬 chore: Fix probe getFiles arg order (sort vs projection)
Probe was passing a projection object as the sort arg, which mongoose
rejected with `Invalid sort value`. Move it to the third arg
(selectFields) so the probe actually runs.
* 🪢 fix: Preserve Original messageId on Code-Output File Update
Each `processCodeOutput` call was overwriting the persisted file's
`messageId` with the *current* run's id. When a turn re-creates an
existing file (filename + conversationId match → `claimCodeFile`
returns the existing record, `isUpdate=true`), the file's link to
the assistant message that originally produced it gets clobbered.
`initializeAgent` later runs `getCodeGeneratedFiles({messageId: $in: <thread>})`
to seed `tool_resources.execute_code` from prior-turn artifacts. With a
stale `messageId` (e.g. from a failed read attempt that re-shelled the
same filename), the file no longer matches the parent-walk thread, so
`tool_resources` arrives empty at agent init, the new
`primedCodeFiles` channel has nothing to seed, and the LLM can't see
its own prior-turn artifacts on the next turn — defeating the
just-added Graph-sessions seeding fix.
Preserve the existing `claimed.messageId` on update; first-creation
behavior is unchanged. The runtime return value still includes the
current run's `messageId` (via `Object.assign(file, { messageId })`)
so the artifact is correctly attributed to the live tool_call.
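The split between the persisted `messageId` and the runtime one can be sketched as follows. This is an illustration only: `ClaimedFile`, `FileRecord`, and `buildCodeOutputRecords` are hypothetical names, and the real `processCodeOutput` does considerably more than this.

```typescript
// Hypothetical record shapes for illustration.
interface ClaimedFile {
  file_id: string;
  messageId?: string; // may be absent on legacy records
}

interface FileRecord {
  file_id: string;
  messageId: string;
}

function buildCodeOutputRecords(
  claimed: ClaimedFile,
  isUpdate: boolean,
  messageId: string, // id of the current run
): { persisted: FileRecord; runtime: FileRecord } {
  // On update, keep the original link to the message that first produced the file.
  const persistedMessageId = isUpdate ? (claimed.messageId ?? messageId) : messageId;
  const persisted: FileRecord = { file_id: claimed.file_id, messageId: persistedMessageId };
  // The value returned to the caller is attributed to the live run instead.
  const runtime: FileRecord = Object.assign({ ...persisted }, { messageId });
  return { persisted, runtime };
}
```

Copying `persisted` before `Object.assign` keeps the stored record from being mutated by the runtime attribution.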
* 🧹 chore: Remove diagnostic logs from code-files seeding path
Drops the temporary debug logs added to trace the empty-tool_resources
failure mode. Production code paths (loadToolDefinitionsWrapper,
client.js seed loop, initializeAgent file lookup) are left as the
permanent shape: capture primedCodeFiles, merge across agents, seed
initialSessions before run start.
* 🪛 feat: read_file Sandbox Fallback for /mnt/data + Non-Skill Paths
When the model called `read_file` with a code-execution path (e.g.
`/mnt/data/sentinel.txt`), the handler returned a misleading
`Use format: {skillName}/{path}` error. Adds a sandbox-aware fallback:
- Short-circuit `/mnt/data/...` (can never be a skill reference) →
route to a sandbox `cat` via the new host-provided `readSandboxFile`
callback, which POSTs to the codeapi `/exec` endpoint.
- Skip the skill resolver entirely when `accessibleSkillIds` is empty.
The resolved output of `resolveAgentScopedSkillIds` already
collapses the admin capability + ephemeral badge + persisted
`skills_enabled` chain, so an empty value is the authoritative
"skills aren't in scope for this agent" signal.
- For `{firstSegment}/...` paths, consult the catalog-derived
`activeSkillNames` Set (no DB read) to detect non-skill names and
fall through to the sandbox before the model has to retry with
`bash_tool`.
`activeSkillNames` is captured from `injectSkillCatalog`, threaded onto
`InitializedAgent`, into `agentToolContexts`, then through
`enrichWithSkillConfigurable` into `mergedConfigurable` for the handler.
The host implementation of `readSandboxFile` lives in
`api/server/services/Files/Code/process.js` and shells `cat <path>`
through the seeded sandbox session — `tc.codeSessionContext`
(emitted by ToolNode for `read_file` calls in `@librechat/agents`
v3.1.72+) provides the `session_id` + `_injected_files` so the read
lands in the same sandbox that holds prior-turn artifacts. When the
seeded context isn't available (older agents version, no codeapi
configured), the handler returns a model-visible error pointing at
`bash_tool` instead of silently failing.
Tests: 8 new `handleReadFileCall` cases cover the new short-circuits,
the skills-not-enabled gate, the activeSkillNames lookup, the
sandbox-fallback success path, and the bash_tool retry hint on
fallback failure. Existing `read_file` tests now opt into "skills are
in scope" via a `skillsInScope()` fixture (production wouldn't reach
the skill lookup with empty `accessibleSkillIds`).
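The three routing rules for `read_file` can be condensed into one decision function. This is a sketch under assumptions: `routeReadFile` and the `Route` type are hypothetical, and the real `handleReadFileCall` also handles the case where no sandbox is available (returning the `bash_tool` hint), which is omitted here.

```typescript
type Route = "sandbox" | "skill";

function routeReadFile(
  path: string,
  accessibleSkillIds: string[],
  activeSkillNames: Set<string>,
): Route {
  // 1. /mnt/data/... can never be a skill reference.
  if (path.startsWith("/mnt/data/")) {
    return "sandbox";
  }
  // 2. No skills in scope for this agent: skip the skill resolver entirely.
  if (accessibleSkillIds.length === 0) {
    return "sandbox";
  }
  // 3. {firstSegment}/... is a skill path only if the segment is a known skill name.
  const firstSegment = path.split("/")[0];
  return activeSkillNames.has(firstSegment) ? "skill" : "sandbox";
}
```

For example, with `activeSkillNames = new Set(["pdf"])`, the path `pdf/SKILL.md` goes to the skill resolver while `output/report.csv` falls through to the sandbox.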
* 🔧 chore: Update @librechat/agents dependency to version 3.1.72
Bumps the version of the @librechat/agents package across package-lock.json and relevant package.json files to ensure compatibility with the latest features and fixes.
* 🪛 refactor: Centralize Tool-Session Seed in buildInitialToolSessions Helper
Addresses review feedback on the per-agent merge in client.js:
- **Run-wide semantics, named explicitly.** The merge into a single
`Graph.sessions[EXECUTE_CODE]` was a deliberate match to the
agents-library design (`Graph.sessions` is shared across every
`ToolNode` in the run), but the inline `for (const a of agents)`
loop in `AgentClient.chatCompletion` made it look per-agent. Move
the logic to a TS helper `buildInitialToolSessions` that documents
the run-wide-by-design contract in one place. The CJS controller
now contains a single call site, no business logic.
- **Subagent walk (P2).** The previous loop only iterated
`[primary, ...agentConfigs.values()]`. Pure subagents are pruned
out of `agentConfigs` after init and retained on each parent's
`subagentAgentConfigs`, so their primed code files were silently
dropped from the seed. The helper now walks recursively, with a
visited-Set keyed on object identity that terminates safely on a
malformed agent graph (cycle).
- **`jest.setup.cjs` polyfill for undici `File`.** Reviewer hit
`ReferenceError: File is not defined` running the targeted spec on
WSL — a known Node 18 issue where `globalThis.File` from
`node:buffer` isn't auto-exposed. Polyfill it inside a Jest setup
file so the suite boots regardless of Node patch version.
Helper test coverage (8 new): skill-only / agent-only / both,
recursive subagent walk, cycle-safe walk, primary+subagent
deduplication, undefined/null entries in the agents iterable, and
representative session_id preservation across the merge.
16 tests pass total in `codeFilesSession.spec.ts` (8 prior + 8 new).
No behavior change vs. the previous commit for the existing
primary+agentConfigs case — subagent inclusion is the only new
behavior, and it matches what the existing seeding logic would have
done if subagents had been in `agentConfigs`.
* 🪛 fix: FIFO Walk Order in buildInitialToolSessions (P3 review)
The traversal used `Array.pop()` (LIFO), which visited the LAST
top-level agent first. The docstring says "primary first"; the code
contradicted it. When no skill seed exists the first-visited agent's
first file supplies the representative `session_id` written to
`Graph.sessions[EXECUTE_CODE]` — so a LIFO walk silently flipped which
agent that came from. `ToolNode` ultimately uses per-file `session_id`s
for runtime injection (so behavior was indistinguishable for current
callers), but the discrepancy was a footgun for any future consumer
that read the representative.
Switch to FIFO via `Array.shift()` to match both the docstring and the
existing `loadSubagentsFor` walk pattern in
`Endpoints/agents/initialize.js`. Add a regression test that asserts
the primary's `session_id` is the representative (and that all three
agents' files still contribute, with per-file `session_id`s preserved).
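The walk order and cycle guard can be illustrated with a small breadth-first traversal. The `AgentNode` shape and `walkAgentsFifo` name are assumptions for this sketch; only the `subagentAgentConfigs` field name, the `Array.shift()` queue, and the identity-keyed visited Set come from the description above.

```typescript
// Hypothetical agent shape for illustration.
interface AgentNode {
  id: string;
  subagentAgentConfigs?: AgentNode[];
}

function walkAgentsFifo(
  roots: Iterable<AgentNode | null | undefined>,
): AgentNode[] {
  const queue: AgentNode[] = [];
  for (const root of roots) {
    if (root) queue.push(root); // tolerate undefined/null entries
  }
  const visited = new Set<AgentNode>(); // keyed on object identity
  const order: AgentNode[] = [];
  while (queue.length > 0) {
    const agent = queue.shift()!; // FIFO: the primary is visited first
    if (visited.has(agent)) continue; // guards against cycles and duplicates
    visited.add(agent);
    order.push(agent);
    for (const child of agent.subagentAgentConfigs ?? []) {
      queue.push(child);
    }
  }
  return order;
}
```

With `pop()` in place of `shift()`, the last top-level agent would be visited first, which is the discrepancy the fix removes.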
* 🔬 test: Lock In Code-Files Bug Fixes Per Comprehensive Review
Addresses MAJOR + MINOR + NIT findings from the multi-pass review:
**Finding #4 (MINOR) — empty relativePath misses sandbox fallback.**
A model calling `read_file("output/")` where "output" isn't a skill
name dead-ended with `Missing file path after skill name` instead of
being routed to the sandbox like every other malformed-path branch.
Add the same `codeEnvAvailable → handleSandboxFileFallback` pattern,
plus two regression tests.
**Finding #7 (NIT) — duplicate `skillsInScope()` helper.**
Hoist the identical helper out of two nested describe blocks to
module scope. Single source of truth.
**Finding #1 (MAJOR) — `persistedMessageId` had zero test coverage.**
The fix preserves a file's original `messageId` on update so
`getCodeGeneratedFiles` can still match it on subsequent turns. A
regression in the `isUpdate ? (claimed.messageId ?? messageId) : messageId`
ternary would silently re-introduce the original cross-turn priming
bug. Five new tests cover:
- UPDATE preserves `claimed.messageId` in the persisted record
- UPDATE falls back to current run id when `claimed.messageId` is
absent (legacy records predating the field)
- CREATE uses current run id (no claimed record exists)
- The runtime return value uses the LIVE id (artifact attribution)
even when the persisted record kept the original
- The image branch follows the same contract (would silently regress
if the ternary diverged across the two file-build branches)
The tests use a `snapshotCreateFileArgs()` helper because
`processCodeOutput` mutates the file object after `createFile`
returns (`Object.assign(file, { messageId, toolCallId })`) and a
naive `createFile.mock.calls[0][0]` would reflect the post-mutation
state instead of what was actually persisted.
**Finding #2 (MAJOR) — `readSandboxFile` had no direct tests.**
The model-controlled `file_path` flows through a POSIX single-quote
escape into a shell `cat` command, making this a security boundary.
A quoting regression would let a malicious filename break out of the
quoted argument and inject arbitrary shell. 20 new tests across:
- Shell quoting (7): plain filenames, embedded `'`, `$()`, backticks,
newlines, shell metachars, multiple consecutive single-quotes
- Payload shape (6): /exec URL, bash language, conditional
session_id / files inclusion, dedicated keepAlive:false agents
- Response handling (6): `{content}` on success, null on missing
base URL or absent stdout, throws on stderr-only, partial-success
returns stdout, transport errors are logged then rethrown
- Timeout (1): matches processCodeOutput's 15s SLA
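The quoting boundary these tests protect can be sketched with the standard POSIX single-quote escape: close the quote, emit an escaped quote, reopen. The helper names `shellQuote` and `buildCatCommand` are hypothetical; the actual `readSandboxFile` also builds and POSTs the `/exec` payload, which is not shown.

```typescript
// Wrap an arbitrary string so a POSIX shell treats it as one literal argument.
// Inside single quotes nothing is special except the single quote itself,
// so each ' becomes '\'' (end quote, escaped quote, start quote).
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Build the command string for reading a model-supplied path.
function buildCatCommand(filePath: string): string {
  return `cat ${shellQuote(filePath)}`;
}
```

A path such as `/mnt/data/$(whoami).txt` stays inert because `$(...)`, backticks, and newlines are not expanded inside single quotes.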
Audited findings #5 (acknowledged tech debt — readSandboxFile in JS
workspace), #6 (pre-existing positional-args debt on
enrichWithSkillConfigurable), and #8 (cosmetic JSDoc style) — no
action taken per the reviewer's own assessment.
Audited finding #3 (walk order vs docstring) — already addressed in
commit
|
||
|
|
596f806f60 |
🛡️ fix: Strict Opt-In Skills Activation per Agent (#12823)
* 🛡️ fix: Strict opt-in skills activation per agent
Skills were activating on every agent run that had the capability +
RBAC enabled, regardless of whether the user (ephemeral) or author
(persisted) had opted in. `scopeSkillIds(undefined)` fell through to
"full accessible catalog" whenever `agent.skills` was unset, which is
the default state for any agent created before skills existed and for
every ephemeral agent.
Activation now requires an explicit signal:
- Ephemeral agent → per-conversation skills badge toggle.
- Persisted agent → new `skills_enabled` master switch on the agent
doc, surfaced as a toggle in the Agent Builder skills section.
Enabled + empty/undefined allowlist = full accessible catalog;
enabled + non-empty allowlist = narrow to those ids; disabled (or
undefined) = no skills available, even if an allowlist is set.
Centralised the predicate in `resolveAgentScopedSkillIds` so the
primary-agent path, handoff/discovery, the subagent loop, and both
OpenAI controllers all share one source of truth. Frontend `$`
popover scope mirrors the same logic so the UI never offers skills
the backend would refuse to activate.
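The activation predicate can be sketched as below. This is not the actual `resolveAgentScopedSkillIds` signature: the `ScopeInput` shape is hypothetical, `accessible` is assumed to already reflect the capability and RBAC checks, and applying the allowlist to ephemeral agents as well is a simplification.

```typescript
// Hypothetical input shape for illustration.
interface ScopeInput {
  isEphemeral: boolean;
  badgeEnabled?: boolean; // per-conversation skills badge (ephemeral agents)
  skillsEnabled?: boolean; // `skills_enabled` on the agent doc (persisted agents)
  allowlist?: string[]; // `agent.skills`
  accessible: string[]; // catalog the user can access
}

function resolveScopedSkillIds(input: ScopeInput): string[] {
  const active = input.isEphemeral
    ? input.badgeEnabled === true
    : input.skillsEnabled === true;
  if (!active) {
    return []; // disabled or undefined: no skills, even if an allowlist is set
  }
  if (!input.allowlist || input.allowlist.length === 0) {
    return input.accessible; // enabled + empty allowlist: full accessible catalog
  }
  const allowed = new Set(input.allowlist);
  return input.accessible.filter((id) => allowed.has(id)); // narrow to allowlist
}
```

Returning `[]` rather than `undefined` for the inactive case keeps callers from mistaking "not opted in" for "no restriction".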
* test: mock resolveAgentScopedSkillIds in agent controller specs
* refactor: address review findings on skills opt-in PR
- AgentConfig: associate skills label with toggle via htmlFor for
click/keyboard affordance; simplify Switch handler to Boolean(value).
- skills: mark scopeSkillIds as @internal so runtime callers continue
to route through resolveAgentScopedSkillIds and inherit the activation
predicate (ephemeral toggle, persisted skills_enabled).
* fix(agents): include skills_enabled in agent list projection
Without this field, agents loaded via the list endpoint hydrate into the
client agentsMap with skills_enabled === undefined, causing the `$`
skill popover to hide every skill on a fresh page load even when the
agent was saved with skills_enabled: true.
* fix(skills): fail closed for persisted agents during agentsMap hydration
Returning undefined while the agents map loads let the popover render the
full catalog for a persisted agent before we could read its
skills_enabled flag, so the user could pick a skill the backend would
then refuse for the turn. Match the strict opt-in contract by returning
[] until the map is authoritative.
* refactor(skills): extract skillsHintKey for readability
Replaces the nested ternary in the skills section JSX with a
pre-computed constant so the activation -> hint key mapping reads
top-down.
* refactor(skills): unflatten skillsHintKey to remove nested ternary
|
||
|
|
d83cb84f59 |
🪆 feat: Subagent configuration in Agent Builder (#12725)
* 🪆 feat: Subagents configuration (isolated-context child agents)
Surfaces the new @librechat/agents `SubagentConfig` primitive in the Agent Builder. Subagents let a supervisor delegate a focused subtask to a child graph running in an isolated context window: verbose tool output stays in the child, only a filtered summary returns to the parent.
Data model: new `subagents: { enabled, allowSelf, agent_ids }` on Agent, wired through the Zod, Mongoose, and form schemas plus a new `AgentCapabilities.subagents` capability (enabled by default).
Backend: `initialize.js` loads explicit subagent configs alongside handoff agents, and drops subagent-only references from the parallel/handoff maps so they don't leak into the supervisor's graph. `run.ts` emits `SubagentConfig[]` on the primary `AgentInputs`: a self-spawn entry when `allowSelf` is enabled plus one entry per configured agent.
UI: an "Advanced" panel section with an enable toggle, a self-spawn toggle, and an agent picker (capped at 10). Enabling without adding agents still yields self-spawn; disabling self-spawn with no agents shows a warning. A capability flag gates the whole section.
* 🪆 feat: Stream subagent progress to UI (dialog + inline ticker)
Pairs with the @librechat/agents SDK change that forwards child-graph events through the parent's handler registry (danny-avila/agents#107):
- Self-spawn and explicit subagents can now use event-driven tools, because child `ON_TOOL_EXECUTE` dispatches reach our ToolService via the parent's registered handler.
- The same forwarding path wraps the child's run_step / run_step_delta / run_step_completed / message_delta / reasoning_delta dispatches in a new `ON_SUBAGENT_UPDATE` envelope, with start/stop/error bookends.
Backend: `callbacks.js` registers an `ON_SUBAGENT_UPDATE` handler that forwards each envelope straight to the SSE stream.
Frontend: - `useStepHandler` consumes `ON_SUBAGENT_UPDATE` events and merges them into a per-tool_call Recoil atom (`subagentProgressByToolCallId`). First-seen `subagentRunId` claims the most-recent unclaimed `subagent` tool call in the active response message — a temporal mapping, no SDK wire-format change needed to correlate child runs with parent tool calls. - New `SubagentCall` part component replaces the default `ToolCall` rendering when `toolCall.name === Constants.SUBAGENT`: compact status ticker showing the 3 most recent update labels, clickable to open a dialog with the full activity log + final markdown-rendered result. - Adds `Constants.SUBAGENT`, `StepEvents.ON_SUBAGENT_UPDATE`, and `SubagentUpdateEvent` type in data-provider. Tests: - `packages/api npx jest run-summarization` — 23 pass - `api npx jest initialize` — 16 pass - `npm run build` — clean Dependency note: bumps `@librechat/agents` to `^3.1.67-dev.1` — requires the SDK PR (danny-avila/agents#107) to be merged to dev and published before this PR merges. `ON_SUBAGENT_UPDATE` is absent from dev.0, so the handler registration would be a no-op with the older SDK but would not crash. * 🪆 fix: address Codex review and review audit on subagents Stacks on top of the SDK change in danny-avila/agents#107 (bumped to `^3.1.67-dev.2`). - **P1 (`initialize.js`)**: subagent-only agents were being deleted from both `agentConfigs` AND `agentToolContexts`. The tool-execute handler resolves execution context (agent, tool_resources, skill ACLs) from `agentToolContexts`, so explicit subagents would run without their configured resources and skip action tools. Now only `agentConfigs` is pruned — tool context stays intact. - **P2 (`AgentSubagents.tsx`)**: toggling subagents off set the form field to `undefined`; `removeNullishValues` stripped it from the PATCH, leaving the server copy enabled. Now it persists an explicit `{ enabled: false, ... }` so the update actually clears state. 
- **Finding 1 (MAJOR)** — `agent_ids` Zod schema gains `.max()` via a new `MAX_SUBAGENTS` export from `data-provider` (shared with the UI cap). Crafted payloads can't trigger hundreds of `processAgent` calls. - **Finding 2 (MAJOR)** — `subagentProgressByToolCallId` atomFamily atoms are now tracked in a ref and reset from `clearStepMaps` via a `useRecoilCallback({ reset })`. No monotonic growth across a session. - **Finding 3 (MAJOR)** — early-arriving `ON_SUBAGENT_UPDATE` events whose parent `tool_call_id` is not yet mapped are now buffered in `pendingSubagentBuffer` (keyed by `subagentRunId`) and replayed in arrival order once correlation completes. Mirrors the existing `pendingDeltaBuffer` pattern. - **Finding 4 (MAJOR)** — switched to deterministic correlation via the new `parentToolCallId` that SDK `3.1.67-dev.2` threads through from `ToolRunnableConfig.toolCall.id`. Temporal fallback now iterates oldest-unclaimed-first (forward), matching tool-call creation order, so concurrent spawns map correctly. - **Finding 6 (MINOR)** — `agent_ids` are deduped on the backend via `new Set(...)` before the load loop. Duplicates no longer produce duplicate `SubagentConfig` entries visible to the LLM. - **Finding 7 (MINOR)** — events array inside each Recoil atom is capped at 200 entries. Long-running subagents no longer replay O(n) spreads on every update; the dialog log still shows the cap window. - **Finding 8 (MINOR)** — documented: subagents are loaded only for the primary agent this release (handoff children get self-spawn but not explicit sub-subagents). In-code comment added so the next maintainer doesn't wonder. - **Finding 9 (NIT)** — removed `{!isSubmitting && null}` dead code and the misleading announce-polite comment in `SubagentCall`. - New `validation.spec.ts` — 9 tests covering the cap on `agent_ids.length` at the subagent schema, agent-create, and agent-update layers. 
- `run-summarization` — 23 pass, `initialize` — 16 pass, total backend package: 103 pass across touched areas. Findings 5 (component tests) and 10 (micro-allocation) are tracked but deferred; the former needs a Recoil-RenderHook harness that isn't in this PR's scope, and the latter has negligible impact (one `Array.from` per subagent run). * 🧪 test: integration coverage for subagent correlation + backend loading Addresses the follow-up audit on #12725 with real-code tests (no mock handlers, only the existing setMessages/getMessages spies and the standard mongodb-memory-server harness). Six new tests under a dedicated `describe('subagent loading')`: - loads a configured subagent, populates `subagentAgentConfigs`, keeps it out of `agentConfigs` - **P1 regression guard**: drives the real `toolExecuteOptions.loadTools` closure with the subagent id and asserts `loadToolsForExecution` is called with `agent: <subagent>`, `tool_resources`, `actionsEnabled`. If anyone deletes `agentToolContexts` again, this fails. 
- dedup: three copies of the same id load the agent once - overlap: agent referenced both as handoff target and subagent stays in `agentConfigs` - capability gate: admin disabling `subagents` suppresses loading even when the agent has a config - per-agent disable: `subagents.enabled: false` skips loading entirely Five new tests under `describe('on_subagent_update event')` using a real `RecoilRoot` and a companion `useRecoilCallback` reader so writes from the hook are observable: - deterministic correlation via `parentToolCallId` (happy path with SDK dev.2+) - fallback: oldest-unclaimed tool call wins for concurrent spawns without `parentToolCallId` - early-arrival buffer: updates with no mapping get buffered and replayed once the tool call appears - event cap: 205 updates collapse to 200 retained, oldest dropped - `clearStepMaps` resets tracked atoms back to their null default - F2 — added explicit `// TODO` marker for handoff-subagent-loading extension (matches the comment that referenced it). - F3 — dropped the unnecessary `MAX_SUBAGENTS as MAX_SUBAGENTS_CAP` alias; just import the constant directly. - Bumped `@librechat/agents` to `^3.1.67-dev.3` to pick up the SDK's paired test additions. - `api/server/services/Endpoints/agents/initialize.spec.js` — 22 pass (6 new + 16 existing) - `packages/api/src/agents/validation.spec.ts` + `run-summarization.test.ts` — 103 pass - `client/src/hooks/SSE/__tests__/useStepHandler.spec.ts` — 48 pass (5 new + 43 existing) * 🪆 fix: strip parent run summary + discovered tools from subagent inputs Codex P1 on #12725: `buildSubagentConfigs` reused the shared `buildAgentInput` factory for each explicit child, and that factory always stamps the parent run's `initialSummary` (cross-run conversation summary) and `discoveredTools` (tool names the parent's LLM searched earlier) onto every `AgentInputs` it returns. 
When subagents were enabled on a conversation that had already been summarized, every child inherited that summary — silently defeating the isolated-context contract and burning extra tokens on unrelated prior chat. Fix in `run.ts`: after `buildAgentInput(child)`, explicitly clear `childInputs.initialSummary` and `childInputs.discoveredTools` before attaching to the `SubagentConfig`. The parent keeps both — that's how the supervisor receives cross-turn context — but the child starts fresh. Paired with danny-avila/agents#107 (bumped to `^3.1.67-dev.4`), which adds the equivalent strip inside `buildChildInputs` to cover the self-spawn path where the SDK clones parent `_sourceInputs` directly and LibreChat never sees the intermediate shape. Belt and suspenders. Regression test (new): - `does NOT leak the parent run initialSummary into an explicit child (Codex P1 regression)` — sets `initialSummary` on the run, enables subagents with an explicit child, asserts the parent still has the summary but `childConfig.agentInputs.initialSummary` is `undefined`. Same for `discoveredTools`. 24 pass. * 🪆 fix: capability gate applies to handoff agents + parallel subagent test ### Codex P2 — handoff agents kept `subagents` after capability disabled The endpoint-level `AgentCapabilities.subagents` gate only cleared `subagents` on `primaryConfig`. Handoff agents loaded into `agentConfigs` retained their persisted `subagents.enabled: true`, and because `run.ts` calls `buildSubagentConfigs` for every agent input, self-spawn would still fire on a handoff target even when the admin had disabled the capability globally. Fix in `initialize.js`: after the subagent loading block, when the capability is off, iterate `agentConfigs.values()` and clear `subagents` + `subagentAgentConfigs` on every loaded config. 
Regression test: `clears subagents on handoff agents too when capability is disabled (Codex P2 regression)` — seeds a handoff target with its own `subagents.enabled: true`, disables the capability at the endpoint, asserts both primary AND handoff have `subagents` undefined in the client args. 23 init tests pass. ### Parallel subagent correlation — user-requested verification Added `keeps parallel subagent streams independent when events interleave` to `useStepHandler.spec.ts`. Two `subagent` tool calls seeded side by side, 6 interleaved `ON_SUBAGENT_UPDATE` envelopes dispatched (a-start, b-start, a-step, b-step, a-stop, b-step), each carrying its own `parentToolCallId`. Asserts each `tool_call_id`'s Recoil bucket accumulates only its own run's events, statuses reflect each run independently (`call_a` → stop, `call_b` → run_step), no cross-contamination. 49 step-handler tests pass. * 🪆 fix: SubagentCall detects cancelled / errored states (Codex P2) Codex P2 on #12725: the old `running` check only consulted `initialProgress` and the subagent's phase. A user stop, dropped stream, or backend crash before a terminal `stop`/`error` envelope arrived would leave the ticker permanently stuck on "working…". Other *Call components (ToolCall.tsx) already model this via `!isSubmitting && !finished` → cancelled. Mirror that pattern. Re-introduce `isSubmitting` on `SubagentCallProps` (the prop was dropped earlier as 'unused' — that was a bug) and resolve status as a tri-state: - `finished` — initialProgress >= 1, or subagent `stop`/`error` - `cancelled` — `!isSubmitting && !finished` - `running` — neither New locale keys `com_ui_subagent_cancelled` + `com_ui_subagent_errored` swap in the right header text per state. Tests: new `SubagentCall.test.tsx` covers all four states with a real `RecoilRoot` and a `useRecoilCallback` seeder — no mocked store — 5/5 pass. 
Includes an explicit P2 regression test that simulates the `isSubmitting=false, progress.status='run_step', initialProgress<1` scenario and asserts the cancelled label renders.
* 🪆 feat: semantic ticker + aggregated content-part dialog for subagents
Two rounds of feedback on #12725:
### Ticker: user-readable lines, not raw event names
The old ticker showed `on_run_step`, `on_message_delta`, etc., which are not meaningful to users. Replaced with `buildSubagentTickerLines`, a pure helper that walks the `SubagentUpdateEvent` stream and emits:
- message/reasoning deltas → a single live "Writing: <last 60 chars>" (or "Reasoning: …") line that updates in place as chunks arrive
- run_step with tool_calls → "Using calculator(expression=42*58)" for a single call, "Using tool: a, b" for parallel (args dropped when multiple so the line stays short)
- run_step_completed → "calculator → 42*58 = 2436" (output truncated to 48 chars; falls back to "Tool X complete" when output is empty)
- error → "Error: <message>"
- start / stop / run_step_delta → suppressed (too granular / lifecycle-only)
Args and output pass through `summarizeArgs` / `summarizeOutput` which flatten JSON to `key=value` pairs and head-truncate long strings so a 200-line tool output never bloats the ticker.
### Dialog: aggregated content parts via leaf renderers
`aggregateSubagentContent` folds the raw event stream into `TMessageContentParts[]`: text/reasoning delta streaks collapse into single `TEXT` / `THINK` parts, tool calls become `TOOL_CALL` parts, and `run_step` boundaries correctly break text runs around tool calls. The dialog iterates those parts through a `SubagentDialogPart` renderer that delegates to the existing `Text`, `Reasoning`, and `ToolCall` leaf components (the same sub-components `<Part />` uses), wrapped in a minimal `MessageContext` so reasoning expand state and cursor animation work.
Leaf components are used directly rather than importing `<Part />` itself to avoid a module cycle (Part → Parts/index → SubagentCall → Part) and to sidestep a hypothetical nested-subagent rendering.
### Tests
- `subagentContent.test.ts`: 19 pure-function tests covering the aggregator (text concat, reasoning concat, tool call lifecycle, interleaving, phase suppression, late-arriving completions) and the ticker builder (live preview truncation, args/output snippets, parallel-call handling, output truncation, i18n formatter override).
- `SubagentCall.test.tsx`: 9 component tests: 5 status-resolution (existing) + 2 ticker (semantic text, delta collapse) + 2 dialog (aggregated parts routed to leaf renderers, raw-output fallback).
### Locale keys
New: `com_ui_subagent_ticker_writing`, `…_reasoning`, `…_error`, `…_using`, `…_using_with_args`, `…_tool_complete`, `…_tool_output`. Preserves i18n at the display layer while the helper stays pure.
* chore: drop unused com_ui_subagent_activity_log locale key
The dialog no longer renders an "Activity log" section; the new content-parts renderer replaced it. Also tweaks the dialog description copy to match.
* 🪆 fix: subagent dialog order, persistence, auto-scroll, width
Follow-up pass addressing the four issues observed in real runs against a live subagent-using parent.
### Aggregator ordering (reasoning appearing after text it preceded)
Reproducible pattern: LLM emits reasoning → text → tool call in that order, but the dialog rendered text BEFORE reasoning in the content array. Root cause: `aggregateSubagentContent` maintained `currentText` and `currentThink` buffers in parallel and only flushed them at a `run_step` boundary in a fixed (text, think) order, losing the actual arrival order. Fix: when a text chunk arrives, close any open think buffer first (pushes it into the content array right then); symmetric for think → text.
Two new regression tests cover the exact reasoning → text → tool_call sequence from the screenshot and the repeated reasoning ↔ text flow across a turn. ### Content persists after completion (markdown not rendering when done) `clearStepMaps` was calling `resetSubagentAtoms()` at stream end, which wiped every `subagentProgressByToolCallId` entry. Once reset, `contentParts.length === 0` and the dialog fell back to rendering the raw `output` string with plain text — hence the literal `##`/`**` in the completed-state screenshot. Stopped resetting; the atoms are bounded per-call (200-event cap) and per-conversation (one per subagent spawn) so growth matches the rest of the conversation state. `resetSubagentAtoms` is kept for a future conversation-switch caller. Also: routed the raw-`output` fallback (older subagent runs recorded before the event forwarder existed) through the same `SubagentDialogPart` → `Text` leaf that content parts use, so its markdown renders the same way. ### Auto-scroll to bottom while running Added a `scrollRef` on the dialog body and a `useEffect` that pins `scrollTop = scrollHeight` while the dialog is open AND the subagent is running. Triggers on `contentParts.length` (new tool calls / part boundaries) and `events.length` (intra-part deltas) so the cursor tracks text streaming. Disabled post-completion so re-opening a finished run doesn't yank to the bottom. ### Wider dialog Went from `max-w-2xl` (42rem / 672px — too cramped on maximized laptop windows) to `w-[min(95vw,64rem)] max-w-[min(95vw,64rem)]`. Narrow on phones, scales up to 64rem on desktop, always leaves a bit of margin from the viewport edge. Bumped `max-h-[65vh]` on the scroll area to give the extra width room to breathe vertically too. ### Tests - `subagentContent.test.ts` — 21 pass (2 new ordering regressions). - `useStepHandler.spec.ts` — 49 pass (1 updated to assert atoms are *preserved* on clearStepMaps). 
- `SubagentCall.test.tsx` — 9 pass (unchanged; aggregator-level tests cover the ordering). * 🪆 feat: persist subagent_content via SDK createContentAggregator Per-request map of createContentAggregator instances keyed by the parent's tool_call_id. ON_SUBAGENT_UPDATE handler feeds each event into the matching aggregator (phase → GraphEvent mapping); AgentClient harvests contentParts onto the subagent tool_call at message save so the child's reasoning / tool calls / final text survive a page refresh. Reusing the SDK's battle-tested aggregator instead of a bespoke one keeps the persisted shape identical to the parent graph's output and drops ~100 lines of custom aggregation code. * 🪆 fix: incremental subagent aggregation + dialog render parity **Disappearing tool_calls**: the Recoil atom trimmed events to a 200-long rolling window, so verbose subagents could shed the `run_step` that originally created a tool_call part — rebuilding content from the trimmed window then produced only the surviving text/reasoning. Fix: fold each envelope into `contentParts` incrementally in the atom as it arrives (new `foldSubagentEvent` + cursor state). Event trim window now affects only the ticker, never the dialog. **Render parity**: dialog now applies `groupSequentialToolCalls` and renders single parts through `Container` + grouped batches through `ToolCallGroup` — same spacing and "Used N tools" collapsing the main message view uses. **Width**: `min(96vw, 80rem)` — wider on big screens, still responsive. **Labels**: "Subagent: X" is jargon. Named subagents render as `Running "{name}" agent` / `Ran "{name}" agent` (past tense on completion); self-spawns use `Running subtask` / `Ran subtask` since `Running "self" agent` reads badly. * 🪆 polish: subagent dialog parity + agent avatar in header **Labels**: drop "subtask" framing. Self-spawn shows `Running agent` / `Ran agent` (past tense on completion); named subagents stay `Running "X" agent` / `Ran "X" agent`. 
**Dialog render parity**: stop wrapping every part in `Container`. TEXT keeps its `Container` (gap-3 + `mt-5` sibling margin), THINK and TOOL_CALL render bare so their own wrappers set the full-column width the regular message view gives them — matches the main `<Part>` dispatch. Outer scroll region now uses `px-4 py-3` padding and a `max-w-full flex-grow flex-col gap-0` inner wrapper, mirroring the `MessageParts` container the main conversation uses. **Avatar**: header icon now renders the subagent's configured avatar via `MessageIcon` when `useAgentsMapContext()` has the child agent, falling back to the `Users` SVG (which keeps its running-state pulse). Same icon-left-of-label pattern the tool UI uses. * 🪆 polish: subagent group label, ticker throttle + tail-ellipsis, scroll button **Grouped label**: ToolCallGroup now detects all-subagent batches and labels them "Running N agents" / "Ran N agents" instead of "Used N tools". Mixed batches keep the existing label. The tool-name summary is suppressed for all-subagent groups (every entry dedupes to "subagent", which adds nothing). **Ticker width + tail-ellipsis**: raise the preview cap to 300 chars so wide containers aren't half-empty, and flip the ticker `<li>` to `dir="rtl"` so `text-overflow: ellipsis` clips the *oldest* characters (visually the left edge) — the newest tokens stay pinned to the right regardless of container width. Bidi lays out the Latin text LTR internally, the rtl only affects which side gets the ellipsis. **Throttle**: `useThrottledValue` hook (trailing-edge, 1.2s) smooths the live `Writing: …` preview so tokens no longer strobe past the eye faster than they can be read. Ref-based internals (not `useState`) avoid infinite-update loops when the upstream value is a new-reference each render; `NEGATIVE_INFINITY` sentinel ensures the very first value passes through synchronously so tests and first paint aren't delayed. 
**Scroll-to-bottom**: dialog tracks `isAtBottom` with a 120px threshold; auto-scroll only engages when the user is already following along, and a persistent jump-to-latest button appears whenever they scroll up — no more fighting the auto-scroll to read back. * 🪆 polish: snappier ticker, prefix-safe labels, agents icon, readable lines **Ticker lines are now incrementally aggregated in the atom** — same pattern as contentParts. The raw-events rolling window is gone; event volume no longer caps what the ticker can display. Verbose subagents that used to drop early tool_call lines out of the window now keep the full 3-line history (using_tool, tool_complete, writing). **Discriminated-union ticker lines** split a constant prefix (e.g. "Writing:") from a tail-truncatable body. The prefix lives in a `shrink-0` span so it never gets clipped when the body overflows; the body uses `dir="rtl"` only on itself — scoped so non-streaming lines (e.g. "Waiting for first update…") can't get their trailing ellipsis flipped by bidi. **Content-aware throttle**: 800ms interval (down from 1200ms), skipped entirely while the live buffer is below 120 chars. Early tokens now appear immediately — no more "Reasoning: I" sitting blank for a full second before the next heartbeat. Once the preview is long enough to fill the container, throttling kicks in at the tighter interval. **Header label** is now a constant verb + optional muted sub-label. Base reads "Running agent" / "Ran agent" / "Cancelled agent" / "Agent errored" for every subagent; named subagents get the configured agent name rendered to the right in secondary text (self-spawns and unresolved names omit it — "Running self agent" is nonsense). **ToolCallGroup** now detects `allSubagents` and swaps `StackedToolIcons` for a single `Users` glyph — otherwise the group header shows a wrench ("tool") icon next to "Ran 5 agents", which reads wrong. 
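A rough sketch of the content-aware throttle rule described above, assuming a pure gate driven by an explicit clock. `createPreviewGate` and its parameter names are hypothetical; the actual `useThrottledValue` hook is a React hook with trailing-edge timers, which this sketch deliberately leaves out. Only the 800ms interval, the 120-char bypass, and the `NEGATIVE_INFINITY` sentinel come from the commit text.

```typescript
// Returns a function that decides which preview string to show at time `nowMs`.
function createPreviewGate(intervalMs = 800, minChars = 120) {
  // Sentinel so the very first long value passes through immediately.
  let lastEmitMs = Number.NEGATIVE_INFINITY;
  let shown = '';
  return (value: string, nowMs: number): string => {
    if (value.length < minChars) {
      // Short buffers bypass throttling so early tokens appear at once.
      shown = value;
      return shown;
    }
    if (nowMs - lastEmitMs >= intervalMs) {
      shown = value;
      lastEmitMs = nowMs;
    }
    // Otherwise keep showing the previously emitted value.
    return shown;
  };
}
```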
* 🪆 feat: delimiter-aware tool labels in ticker + full-width tool lines New shared `parseToolName` helper in `client/src/utils/toolLabels.ts` — single source of truth for splitting `<tool>_mcp_<server>` ids and mapping native tool names (web_search, execute_code, …) to their friendly translation keys. `ToolCallGroup` drops its inline copy and pulls from this helper. Ticker tool lines now use the shared parser + a new `ToolIdentifier` sub-renderer so the live log reads like the main tool UI: - MCP tool → `<server> · <code-badge:tool>` (e.g. "github · `search_code`") - Native → friendly name from `TOOL_FRIENDLY_NAME_KEYS` - Unknown → bare `<code>` badge of the raw id The `using_tool` / `tool_complete` rows now render with a `flex w-full items-baseline gap-1 overflow-hidden` layout matching the writing/reasoning rows — they take the full container width instead of collapsing to content size. Output snippets on `tool_complete` get the same tail-side `dir="rtl"` ellipsis so the newest characters stay flush-right when the container is narrow. Dropped the now-unused template i18n keys (`com_ui_subagent_ticker_using_with_args`, `com_ui_subagent_ticker_tool_complete`, `com_ui_subagent_ticker_tool_output`) in favor of tokens the JSX composes structurally. Only English is touched per the project rule; other locales follow externally. * 🪆 fix: dialog scroll button + auto-scroll during streaming deltas Two race/trigger bugs in the dialog's scroll behavior: **Button never showed**: `addEventListener('scroll', …)` in a `useEffect` ran before Radix's portal had actually committed the scroll container, so `scrollRef.current` was still null — the listener never attached, `isAtBottom` stayed stuck at its initial `true`, and the jump-to-latest button was never rendered. Swap to React's `onScroll` prop on the element itself so the handler wires up as part of DOM commit, not a post-commit effect. 
**Auto-scroll stalled during text streaming**: the pin-to-bottom effect only re-fired on `contentParts.length` changes. Message/reasoning deltas extend the last TEXT/THINK part's `.text` without changing the array length — so the view would drift up as tokens piled in and never catch back up. Replace the length-dep effect with a `ResizeObserver` on the inner content div; every height change (new part or in-place growth) triggers a scroll-pin when the user is still at the bottom. * 🪆 fix: drop leading ellipsis from ticker body truncatePreview was prepending ... to the tail when the buffer exceeded 300 chars. The component's CSS already produces a left-side ellipsis for overflow via dir=rtl + text-overflow: ellipsis — stacking a data-level ellipsis on top renders a stray dot character right after the Writing: / Reasoning: label (Writing: .Sure!), which looks like a typo to the reader. Data now returns just the last 300 chars when truncating; CSS handles the visual cue whenever the body actually overflows its container. * 🪆 fix: Codex review — subagent isolation + concurrent-safe throttle Three findings from the @codex review pass, all valid: **P1 — buildAgentInput leaks parent discovered-tool state into subagent children.** `buildAgentInput` mutates `agent.toolRegistry` (`overrideDeferLoadingForDiscoveredTools` flips `defer_loading:true→false` on tools the parent previously searched for) and appends those tools' definitions to the returned `toolDefinitions` before the function returns. `buildSubagentConfigs` was clearing the reported `initialSummary` / `discoveredTools` fields on the returned AgentInputs, but that happened post-return — the registry writes and extra tool definitions persisted on the child, silently defeating context isolation and inflating the child's prompt. Fix: `buildAgentInput` now takes an `isSubagent` flag that gates the registry-mutation block and omits `initialSummary` / `discoveredTools` at the source. 
`buildSubagentConfigs` passes `{ isSubagent: true }` for every explicit child; no post-hoc cleanup needed. **P2 — ToolCallGroup labels a finished subagent group as still running when the child returned no output.** `getToolMeta` computed `hasOutput` as `!!tc.output`, which is `false` for a completed subagent that returned empty text (the UI already has an "empty result" fallback for that case). `allCompleted` would stay `false` and the group header stuck on "Running N agents" forever. Fix: treat `tc.progress === 1` as completion too — progress is the authoritative lifecycle signal, output is just content. **P2 — useThrottledValue schedules `setTimeout` during render.** Discarded renders under Strict Mode / Concurrent rendering would leave orphan timers firing against stale trees. Fix: move `setTimeout` into a `useEffect` keyed on `[value, intervalMs, enabled]`. Render-time still mutates refs (idempotent), but timer scheduling lives post-commit. Cleanup on unmount and on passthrough transitions is preserved. * 🪆 fix: Codex P2 — wipe subagent atoms on conversation switch `clearStepMaps()` intentionally doesn't reset `subagentProgressByToolCallId` so a user can reopen a completed subagent's dialog mid-conversation, but `resetSubagentAtoms` was defined and never exposed / called — so each completed run's aggregated `contentParts` + `tickerState` stayed resident in the `atomFamily` for the whole app session. Unbounded growth across multi-conversation sessions. Expose `resetSubagentAtoms` from `useStepHandler` and fire it from `useEventHandlers` whenever the URL's `conversationId` changes. That's the correct cleanup boundary: historical subagent dialogs rehydrate from persisted `subagent_content` on each `tool_call` at message-save time, so wiping live atoms on switch doesn't lose any viewable history — it just releases per-tool-call state that the old conversation's components no longer subscribe to. 
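The `ToolCallGroup` completion fix from the review round above can be sketched as follows. The `SketchToolCall` shape and the function names are assumptions for illustration; only the `!!tc.output` / `tc.progress === 1` rule and the two labels come from the commit text.

```typescript
// Hypothetical minimal shape of a subagent tool call entry.
interface SketchToolCall {
  output?: string | null;
  progress?: number;
}

// A call counts as finished if it produced output OR its progress reached 1.
// Progress is the lifecycle signal; output is only content and may be empty.
function isToolCallComplete(tc: SketchToolCall): boolean {
  const hasOutput = !!tc.output;
  return hasOutput || tc.progress === 1;
}

function subagentGroupLabel(calls: SketchToolCall[]): string {
  const allCompleted = calls.length > 0 && calls.every(isToolCallComplete);
  return allCompleted ? `Ran ${calls.length} agents` : `Running ${calls.length} agents`;
}
```

With the old output-only check, a child that finished with empty text would keep the header on the running label.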
* 🪆 fix: Codex round 3 — subagent registry isolation + post-run label Two more valid findings. **P1 — parent-order registry mutation leaks into subagent inputs.** `overrideDeferLoadingForDiscoveredTools` mutates `agent.toolRegistry` in place (the Map *and* the LCTool objects inside it). When an agent appears both as a handoff target (normal graph node) AND an explicit subagent child, a subagent build that ran before the parent's build captures a reference to the same registry — the parent's later mutation leaks through to the child. Fix: for subagent children (`isSubagent`), clone the `toolRegistry` Map and shallow-clone each LCTool inside before returning the inputs. `defer_loading` flips on parent-graph registry mutations can't propagate across the clone boundary. `toolDefinitions` also gets a shallow-copy pass so the same isolation holds for definitions the child carries directly. **P2 — "Running N agents" label stuck after cancel/error.** ToolCallGroup's all-subagent label was gated only on `allCompleted`, which requires every child to have `hasOutput || progress === 1`. A subagent that gets cancelled (stream ends, no `stop` phase, no output) never satisfies that — so even after `isSubmitting` flips false, the header stays on "Running N agents" while each individual card correctly shows "Cancelled agent". Fix: derive a `subagentsDone` flag as `allCompleted || !isSubmitting` and gate the past-tense label on that. Matches the tri-state each SubagentCall card already resolves (finished / cancelled / running). * 🪆 fix: Codex P2 — ACL-check subagents.agent_ids on create/update Codex flagged that `subagents.agent_ids` was accepted as arbitrary strings on the create/update routes while `edges` got a `validateEdgeAgentAccess` pass — so users could save subagent references to agents they can't VIEW. 
At runtime `initializeClient`'s `processAgent` ACL gate silently drops those, so the persisted configuration and the actual behavior diverged in a way that is difficult to diagnose. Refactor: extract the id-set → unauthorized-ids check into a shared `collectUnauthorizedAgentIds`, wrap it with a dedicated `validateSubagentAccess`, and plumb the same 403-on-failure response the edge path already returns. Applied on both POST /agents and PATCH /agents/:id. * 🪆 fix: Codex round 5 — ACL-disable escape hatch + ticker order Two valid findings. **P1 — can't disable subagents after losing access to a child.** The subagent ACL check ran on every create/update that echoed back the `agent_ids` list, even when the user was explicitly disabling the feature. The UI keeps the list intact when toggling `enabled: false`, so a user who subsequently lost VIEW on any child would be locked in a 403 loop — every edit (including the one that turns subagents off) bounces. Fix: gate the ACL check on `subagents.enabled !== false` at both the POST /agents and PATCH /agents/:id handlers. Empty list stays a no-op. Disabling the feature is always permitted. **P2 — ticker fold merges out-of-order previews across delta switches.** `foldSubagentEventIntoTicker` carried `textLineIdx` / `thinkLineIdx` across a reasoning → text → reasoning transition, so the second reasoning chunk appended to the original reasoning line instead of starting a new chronological one. Fix: close the opposite buffer + cursor when a delta-type switch is detected (same rule the content-parts reducer already applies). Added a regression test. * 🪆 fix: Codex round 6 — preserve mid-stream atoms + honor sequential suppression Two valid findings. **P2 — atom reset fires on initial chat URL assignment.** `useEventHandlers` initialized `lastConversationIdRef` from the URL's current `paramId`, then reset subagent atoms whenever the ref and `paramId` disagreed. 
For a brand-new conversation the URL stamp goes from `undefined → "abc123"` while the first response is still streaming, which used to drop subagent ticker/content state mid-run and leave dialogs missing earlier updates. Fix: only reset when *both* the old and new IDs are non-null and differ — i.e. a user-initiated switch between two established conversations. The initial assignment passes through without clearing. **P2 — ON_SUBAGENT_UPDATE bypassed `hide_sequential_outputs`.** Every other streaming handler in `callbacks.js` (`ON_RUN_STEP`, `ON_MESSAGE_DELTA`, etc.) gates emission on `checkIfLastAgent` + `metadata?.hide_sequential_outputs`, but the subagent forwarder did an unconditional `emitEvent` — so intermediate agents in a sequential chain were leaking their children's activity to the client even when the chain was configured to suppress intermediates. Fix: accept `metadata` and apply the same `isLastAgent || !hide_sequential_outputs` gate. Aggregation still runs regardless of visibility (persistence + dialog depend on it); only the SSE forward is suppressed. * 🪆 fix: Codex P2 — gate subagent ACL check on endpoint capability `validateSubagentAccess` ran on every create/update where `subagents.enabled !== false`, regardless of the endpoint-level `subagents` capability. When the capability is off at the appConfig level, `initializeClient` already strips the `subagents` block at runtime — so persisted `agent_ids` are inert — but the validation could still 403 on a legacy record whose referenced child is no longer viewable, blocking unrelated edits. Fix: add `isSubagentsCapabilityEnabled(req)` that reads the agents endpoint's capabilities from `req.config` and gate both the create and update ACL checks on it. Capability-off environments can update agents with stale `subagents` data freely; capability-on keeps the full ACL protection. 
* 🪆 fix: Codex P2 — reset subagent atoms on id→null navigation too Previous guard (both-established) skipped the reset whenever `paramId` became null/undefined, so navigating from an existing chat to a "new chat" route left stale subagent progress resident in the `atomFamily` until the user picked a specific different chat. Swap the both-established check for a one-time flag: skip only the very first `undefined → id` transition (the brand-new-chat URL stamp that happens mid-stream), then reset on any subsequent change — id→id, id→null, null→id-after-reset. If the user started on an established chat the flag is true at mount, so the guard is a no-op and every navigation resets normally. * 🪆 fix: Codex round 9 — subagent persistence gate + handoff children Two valid findings. **P1 — hide_sequential_outputs also gates persistence.** The previous fix gated the SSE forward on `isLastAgent || !hide_sequential_outputs` but still ran the per-tool-call `createContentAggregator` aggregation unconditionally. `finalizeSubagentContent` would then attach the hidden intermediate agent's child reasoning / tool output to the saved message, so a page refresh could reveal activity that was intentionally suppressed live. Move the visibility gate to the top of the handler — hidden agents now skip both aggregation and emission, so "hide_sequential_outputs" is a consistent "don't record" rule for subagent traces. **P2 — handoff agents' explicit subagents were silently dropped.** `initializeClient` only resolved `subagentAgentConfigs` for the primary config, so an agent used via handoff that had its own `subagents.agent_ids` saved in the builder would get self-spawn only; every explicit child was quietly ignored, creating a saved-config / runtime mismatch the user couldn't diagnose. Extract the resolution into a shared `loadSubagentsFor(config)` helper and invoke it for the primary and every handoff agent in `agentConfigs`. 
The `edgeAgentIds` precomputation stays outside the helper (it's loop-invariant). Capability-off shortcuts return empty early so the existing strip-on-capability-off path still holds. * 🪆 fix: Codex P2 — recursive subagent build for multi-level delegation Previously only the outer `agents[]` loop attached `subagentConfigs` to its inputs, so a child used as a subagent (invoked via the `subagent` tool) lost every explicit spawn target of its own. A user-valid configuration like A → B → C would only run the top layer; B could never actually delegate to C from inside A's run. Recursively build `subagentConfigs` for each child inside `buildSubagentConfigs`, passing the child's freshly-constructed `childInputs` down so its own `subagents.enabled` children get resolved too. Added cycle protection via an `ancestors` Set — a configuration like A → B → A is safely cut off at the second encounter of A rather than recursing forever (the existing `child.id === agent.id` guard already prevents the direct self-loop). * 🪆 fix: Codex P2 — reset subagent atoms on useEventHandlers unmount The effect that resets subagent atoms only fired on `paramId` change, so unmounting the chat container (route change away from /c) never flushed the atoms. `knownSubagentAtomKeys` lives in a ref inside `useStepHandler` — once the hook unmounts the ref is gone, so a subsequent remount can't clean atoms it never registered. Added a second `useEffect` that only runs cleanup on unmount (empty deps aside from the stable `resetSubagentAtoms` callback). Keeps `atomFamily` bounded across full route teardowns too. * 🪆 fix: Codex round 13 — cyclic subagent guard + prefer persisted Two valid findings. **P1 — cyclic subagent ref reloads the primary.** A configuration like `A ↔ B` (B lists A as its own subagent) would send `loadSubagentsFor` down a path that couldn't find A in `agentConfigs` (the primary isn't stored there), so it called `processAgent(A)` a second time. 
That inserts a fresh config for the primary id, which downstream duplicates in `[primary, ...agentConfigs.values()]` and can replace the primary's tool context with the reloaded copy. Fix: short-circuit when a subagent ref points back at `primaryConfig.id` — reuse the already-loaded primary config. Primary is always an edge id so no pruning bookkeeping needed. **P2 — live atom preferred over canonical persisted trace.** The dialog picked `progress.contentParts` ahead of `persistedContent`, but the Recoil bucket is best-effort — after a disconnect/reconnect it can be stale or partial. The server's `subagent_content` on the `tool_call` is the canonical record refreshed on sync. Preferring live could hide completed tool/reasoning history that was actually persisted. Fix: flip the preference order. Persisted wins when it's non-empty; live covers the mid-stream window (before the parent message saves, persisted is empty) and the older-runs fallback. Updated the test that enforced the old order to lock the new semantics in (separate mid-stream live-fallback assertion kept). * 🪆 fix: Codex P2 — subagent atom reset rule simplified to 'leaving established id' The `hasEstablishedConversationRef` + check for initial undefined→id covered the first navigation but missed the equivalent mid-stream URL stamp when a user goes from an existing chat to a new chat and sends a message there (`id → null → newId`). The null → newId transition was still hitting the reset branch and wiping the in-flight subagent ticker/content for that first turn. Simpler rule: only reset when the PREVIOUS paramId is an established id. Every transition AWAY from an established chat clears (id→id2, id→null, id→undefined); every transition FROM null/undefined passes through (initial mount, new-chat URL stamp mid-stream). Drop the `hasEstablishedConversationRef` machinery in favor of that single condition. 
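The simplified reset rule can be written as a single predicate. This is a sketch under stated assumptions: `shouldResetSubagentAtoms` is a hypothetical name, and "established id" is taken here to mean any non-empty string, which may be narrower or broader than what the real hook checks.

```typescript
type ParamId = string | null | undefined;

// Reset only when leaving an established conversation id.
// Transitions FROM null/undefined (initial mount, new-chat URL stamp
// arriving mid-stream) pass through without clearing anything.
function shouldResetSubagentAtoms(previous: ParamId, next: ParamId): boolean {
  const previousIsEstablished = typeof previous === 'string' && previous.length > 0;
  return previousIsEstablished && previous !== next;
}
```

Under this rule the `id → null → newId` sequence clears once (on `id → null`) and then leaves the in-flight state for `newId` alone.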
* 🪆 fix: Codex P2 — match runtime's strict subagent enable check in ACL
Runtime (`initializeClient` + `run.ts`) treats `subagents?.enabled` as a truthy predicate — `undefined`, `null`, missing, and `false` all short-circuit. The ACL gate was using `!== false` which accepted `undefined` / missing as "enabled" and could 403 a payload whose subagent tool would be inert at runtime.
Swap both create and update to `enabled === true`. Only a strictly-enabled payload triggers the ACL check; the disable path (`false`) still passes through so a user who lost VIEW on a child can still save the disable edit.
* 🪆 fix: Codex P2 — reject missing subagent references with 400
`validateSubagentAccess` collapsed through `collectUnauthorizedAgentIds`, which returns an empty list for ids with no DB record — so typos and references to deleted agents passed validation silently, and `initializeClient` later dropped them at runtime. Saved config would then list spawn targets that the backend never honored, a hard-to-diagnose drift.
Refactor the helper into `classifyAgentReferences(ids, …)` which returns `{ missing, unauthorized }` separately. `validateEdgeAgentAccess` keeps its old semantics (missing is intentional — a self-referential `from` names the agent being created). `validateSubagentReferences` surfaces both buckets so the create/update handlers can 400 on missing and 403 on unauthorized with distinct error messages and `agent_ids` lists.
* 🪆 polish: tighten subagent dialog grid gap to gap-2
OGDialogContent's grid default is `gap-4`, which renders the title, description, and scroll area as three visually separated panels. Drop to `gap-2` so they read as one block.
* 🪆 polish: swap Subagents above Handoffs in Advanced panel
Subagents is the more common knob users reach for, so show it first. Handoffs keep the same Controller wiring, just move below. |
||
|
|
35bf04b26c |
🧰 refactor: Unify code-execution tools (#12767)
* 🛠️ feat: Add registerCodeExecutionTools helper Idempotently registers `bash_tool` + `read_file` in the run's tool registry and tool-definition list via a registry `.has()` dedupe. Sets up the single code-execution tool path shared by: - `initializeAgent` (when an agent has `execute_code` in its tools and the capability is enabled for the run) - `injectSkillCatalog` (when skills are active; unconditional read_file, bash_tool follows `codeEnvAvailable`) Both callers reach the helper in the same initialization sequence, so the second call becomes a no-op and exactly one copy of each tool reaches the LLM — no more double registration for agents that combine `execute_code` capability with active skills. Unit-tested on a fresh run, idempotence (second call, overlap with prior tooldefs, partial overlap), and the no-registry variant. * 🔀 refactor: Route injectSkillCatalog bash_tool + read_file through registerCodeExecutionTools The `skill` tool is still registered inline (it's skill-path-specific), but `bash_tool` + `read_file` now flow through the shared idempotent helper so a prior registration from the execute_code path doesn't produce a duplicate copy later in the same run. Behavior preserved: - `read_file` always registers when any active skill is in scope — manually-primed `disable-model-invocation: true` skills still need it to load `references/*` from storage. - `bash_tool` follows `codeEnvAvailable` exactly as before. Adds a test pinning the cross-call dedupe: when `injectSkillCatalog` runs AFTER `registerCodeExecutionTools` has already seeded the registry + tool definitions with bash_tool/read_file, the resulting `toolDefinitions` still contains exactly one copy of each. 
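The idempotent registration described above might look roughly like the following. This is a sketch, not the helper's actual signature: `SketchTool`, `registerCodeExecutionToolsSketch`, the placeholder descriptions, and the parameter order are all assumptions. Only the tool names and the `.has()`-based dedupe against the registry and prior tool definitions come from the commit text.

```typescript
// Hypothetical minimal tool shape; the real LCTool carries a JSON-schema `parameters` field.
interface SketchTool {
  name: string;
  description: string;
}

const CODE_TOOLS: SketchTool[] = [
  { name: 'bash_tool', description: 'placeholder description' },
  { name: 'read_file', description: 'placeholder description' },
];

// Adds each code tool at most once; returns the tools added by this call.
function registerCodeExecutionToolsSketch(
  toolDefinitions: SketchTool[],
  toolRegistry?: Map<string, SketchTool>,
): SketchTool[] {
  const known = new Set(toolDefinitions.map((tool) => tool.name));
  const added: SketchTool[] = [];
  for (const tool of CODE_TOOLS) {
    // Skip when either the registry or the definition list already has it.
    if (toolRegistry?.has(tool.name) || known.has(tool.name)) {
      continue;
    }
    toolRegistry?.set(tool.name, tool);
    toolDefinitions.push(tool);
    known.add(tool.name);
    added.push(tool);
  }
  return added;
}
```

Calling it from two code paths in the same run is then safe: the second call finds both names present and adds nothing.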
* 🪄 feat: Expand `execute_code` tool name into bash_tool + read_file at initialize-time When an agent's `tools` include `execute_code` and the `execute_code` capability is enabled for the run, `initializeAgent` now registers `bash_tool` + `read_file` via `registerCodeExecutionTools` before `injectSkillCatalog`. The legacy `execute_code` tool definition is no longer handed to the LLM — `execute_code` remains on the agent document as a capability-trigger marker, but the runtime expands it into the skill-flavored tool pair. Call ordering matters: the `execute_code` registration runs BEFORE `injectSkillCatalog`, so the skill path's own `registerCodeExecutionTools` call inside `injectSkillCatalog` becomes a no-op via the registry's `.has()` check. Exactly one copy of each tool reaches the LLM whether the agent has: - only `execute_code` (legacy path) - only skills - both No data migration needed — `agent.tools: ['execute_code']` stays in the DB unchanged; the expansion is a runtime operation. Three tests cover the matrix: execute_code + capability on → bash_tool + read_file registered; execute_code + capability off → neither registered; no execute_code + capability on → neither registered. * 🗑️ refactor: Drop CodeExecutionToolDefinition from the builtin registry Removes the legacy `execute_code` entry from `agentToolDefinitions` and the corresponding import. With the initialize-time expansion in place, nothing consults `getToolDefinition('execute_code')` for a tool schema any more — the capability gate still filters on the string `execute_code`, but the actual tool definitions the LLM sees come from `registerCodeExecutionTools` (i.e. `bash_tool` + `read_file`). `loadToolDefinitions` in `packages/api/src/tools/definitions.ts` silently drops `execute_code` when it no longer resolves in the registry — that's the expected path and is now covered by an updated test. No caller of `getToolDefinition('execute_code')` expects a non-undefined result after this change. 
* 🔌 refactor: Read CODE_API_KEY from env for primeCodeFiles + PTC
Finishes the Phase 4 server-env-keyed rollout on the two remaining `loadAuthValues({ authFields: [EnvVar.CODE_API_KEY] })` sites in `ToolService.js`:
- `primeCodeFiles` (user-attached file priming on execute_code agents)
- Programmatic Tool Calling (`createProgrammaticToolCallingTool`)
Both now read `process.env[EnvVar.CODE_API_KEY]` directly, matching `bash_tool`'s pattern. The per-user plugin-auth path is no longer consulted for code-env credentials anywhere in the hot path — the agents library owns the actual tool-call execution and also reads the env var internally.
Priming still fires for existing user-file workflows so the legacy `toolContextMap[execute_code]` hint ("files available at /mnt/data/...") stays in the prompt; only the key lookup changed.
* 🔧 fix: Type the pre-seeded dedupe-test tools as LCTool
CI TypeScript type checks caught `{ parameters: {} }` in the new cross-call dedupe test: `LCTool.parameters` is a `JsonSchemaType`, not `{}`. Use `{ type: 'object', properties: {} }` and type the local registry Map through the parameter-derived shape so the pre-seeded values match what `toolRegistry.set` expects.
* 🛡️ fix: Run execute_code expansion before GOOGLE_TOOL_CONFLICT gate
Codex review caught a latent regression: the original Phase 8 placement ran `registerCodeExecutionTools` after `hasAgentTools` was computed, so an execute-code-only agent on Google/Vertex with provider-specific `options.tools` populated would no longer trip `GOOGLE_TOOL_CONFLICT` — the legacy `CodeExecutionToolDefinition` used to populate `toolDefinitions` before the guard, but after dropping it from the registry, `toolDefinitions` stayed empty until my expansion ran downstream of the guard. Mixed provider + agent tools would silently flow through to the LLM.
Fix moves the `execute_code` expansion to BEFORE `hasAgentTools` computation. `bash_tool` + `read_file` now contribute to the check the same way the legacy `execute_code` def did.
Covered by a new test that pins the Google+execute_code+provider-tools scenario — the `rejects.toThrow(/google_tool_conflict/)` path would have silently passed on the prior placement.
* 🔗 fix: Thread codeEnvAvailable through handoff sub-agents
Round-2 codex review caught the other half of the execute_code expansion gap: `discoverConnectedAgents` omitted `codeEnvAvailable` from its forwarded `initializeAgent` params, so handoff sub-agents with `agent.tools: ['execute_code']` lost the `bash_tool` + `read_file` registration (pre-Phase 8 the legacy `CodeExecutionToolDefinition` would have landed in their `toolDefinitions` via the registry).
- Add `codeEnvAvailable?` to `DiscoverConnectedAgentsParams` and forward it verbatim on every sub-agent `initializeAgent` call.
- Update the three JS call sites that construct the primary's `codeEnvAvailable` (`services/Endpoints/agents/initialize.js`, `controllers/agents/openai.js`, `controllers/agents/responses.js`) to pass the same flag into `discoverConnectedAgents` — one authoritative source per request.
- Two regression tests in `discovery.spec.ts` pin the true/false passthrough so a future refactor that drops the param-forwarding surfaces immediately.
Left intentionally unchanged: `packages/api/src/agents/openai/service.ts` (public API helper with no in-repo caller). External consumers of `createAgentChatCompletion` who want code execution should pass a `codeEnvAvailable`-aware `initializeAgent` via `deps` — documenting the full public-API surface is out of scope for this Phase 8 PR.
* 🔗 fix: Thread codeEnvAvailable through addedConvo + memory-agent paths
Round-3 codex review caught the last two production `initializeAgent` callers missing the Phase-8 capability flag:
- `api/server/services/Endpoints/agents/addedConvo.js` (multi-convo parallel agent execution).
Added `codeEnvAvailable` to `processAddedConvo`'s destructured params and forwarded it into the per-added-agent `initializeAgent` call. Caller in `api/server/services/Endpoints/agents/initialize.js` passes the same `codeEnvAvailable` it computed for the primary.
- `api/server/controllers/agents/client.js` (`useMemory` — memory extraction agent). Computes its own `codeEnvAvailable` from `appConfig?.endpoints?.[EModelEndpoint.agents]?.capabilities` and forwards into `initializeAgent`. Memory agents rarely list `execute_code`, but if one does, pre-Phase 8 they got the legacy `execute_code` tool registered unconditionally — the passthrough restores parity.
With this, every production caller of `initializeAgent` explicitly resolves the capability: main chat flow (primary + handoff), OpenAI chat completions (primary + handoff), Responses API (primary + handoff), added convo parallel agents, and memory agents. The one remaining caller, `packages/api/src/agents/openai/service.ts::createAgentChatCompletion`, is a public API helper with no in-repo consumer (external callers must pass a capability-aware `initializeAgent` via `deps`).
* 🪤 fix: Remove duplicate appConfig declaration causing TDZ ReferenceError
The Responses API controller had TWO `const appConfig = req.config;` bindings inside `createResponse`: one at the top of the function (added by the Phase 4 `bash_tool` decouple) and one inside the try block (added by the polish PR #12760). Because `const` is block-scoped with a temporal dead zone, the inner redeclaration put `appConfig` in TDZ for the entire try block, so any earlier reference inside the try — notably `appConfig?.endpoints?.[EModelEndpoint.agents]?.allowedProviders` at line 348 — threw `ReferenceError: Cannot access 'appConfig' before initialization`. The error was silently swallowed by the outer try/catch, leaving `recordCollectedUsage` unreached and the six `responses.unit.spec.js` token-usage tests failing.
Removing the inner redeclaration fixes the six failing tests (verified: 11/11 pass locally post-fix, 0 regressions elsewhere). The outer function-scoped binding already provides `appConfig` to every downstream reference.
* 🔗 fix: Thread codeEnvAvailable through the OpenAI chat-completion public API
Round-4 codex review (legitimate on the type-safety angle, even though the runtime concern was already covered): the `createAgentChatCompletion` helper defines its own narrower `InitializeAgentParams` interface locally, and the type was missing `codeEnvAvailable`. External consumers who supply a capability-aware `deps.initializeAgent` couldn't route `codeEnvAvailable` through without a type-cast workaround.
- Widen the local `InitializeAgentParams` interface to include `codeEnvAvailable?: boolean` (matches the real `packages/api/src/agents/initialize.ts` type).
- Derive `codeEnvAvailable` inside `createAgentChatCompletion` from `deps.appConfig?.endpoints?.agents?.capabilities` (the same source the in-repo controllers use) and forward to `deps.initializeAgent`. Uses a string literal `'execute_code'` lookup so this file stays free of a `librechat-data-provider` import — keeping the dependency surface of the public helper minimal.
With this, external consumers of `createAgentChatCompletion` who pass `appConfig` with the agents capabilities get `bash_tool` + `read_file` registration automatically; consumers who don't pass `appConfig` retain the existing "explicit opt-in" semantics (the flag stays `undefined`, expansion is skipped).
* 🧹 chore: Review-driven polish — observability log, JSDoc DRY, test gaps, no-op allocation
Addresses the comprehensive review of PR #12767:
- **Finding #1** (MINOR, observability): `initializeAgent` now emits a debug log when an agent lists `execute_code` in its tools but the runtime gate is off (`params.codeEnvAvailable` !== true).
The event-driven `loadToolDefinitionsWrapper` path doesn't log capability-disabled warnings, so without this the tool silently vanishes from the LLM's definitions with zero trace. Operators debugging "why isn't code interpreter working?" now get a signal at the initialize layer.
- **Finding #5** (NIT, allocation): `registerCodeExecutionTools` now returns the input `toolDefinitions` array by reference on the no-op path (both tools already registered by a prior caller in the same run) instead of allocating a fresh spread array every time. The common dual-call scenario — `initializeAgent` then `injectSkillCatalog` — saves one O(n) copy per request.
- **Finding #4** (NIT, DRY): Collapsed the duplicated 6-line JSDoc comment in `openai.js`, `responses.js`, and `addedConvo.js` into either a one-line `@see DiscoverConnectedAgentsParams.codeEnvAvailable` pointer (the two JS call sites) or a compact 3-line block referring back to the canonical source (addedConvo's @param).
- **Finding #2** (MINOR, test gap): Added `api/server/services/Endpoints/agents/addedConvo.spec.js` with three cases covering `codeEnvAvailable=true`, `codeEnvAvailable=false`, and omitted (undefined) passthrough. A future refactor that drops the param from destructuring now surfaces here instead of silently regressing multi-convo parallel agents with `execute_code`.
- **Finding #3** (MINOR, test gap): Added `api/server/controllers/agents/__tests__/client.memory.spec.js` pinning the capability-flag derivation that `AgentClient::useMemory` uses — six cases covering present/absent/null/undefined config shapes plus an enum-literal pin (`'execute_code'` / `'agents'`). Catches enum renames or config-path shifts that would otherwise silently strip `bash_tool` + `read_file` from memory agents.
Finding #7 (jest.mock scoping, confidence 40) left as-is: the reviewer's own risk assessment noted `buildToolSet` doesn't touch the mocked exports, and restructuring a file-level `jest.mock` to `jest.doMock` + dynamic `import()` introduces more complexity than the speculative risk justifies. The existing mock is scoped to the test file and contains the same stubs the adjacent `skills.test.ts` already uses. Finding #6 (PR description commit count) addressed out-of-band via PR description update.
All existing tests pass, typecheck clean, lint clean across touched files. New tests: 9 cases across 2 new spec files.
* 🧽 refactor: Replace hardcoded 'execute_code' string with AgentCapabilities enum in service.ts
Follow-up review (conf 55) caught that `openai/service.ts`'s Phase 8 `codeEnvAvailable` derivation used the literal `'execute_code'` while every in-repo controller uses `AgentCapabilities.execute_code` from `librechat-data-provider`. The file deliberately uses local type interfaces to keep the public API helper's type surface small, but that pattern was never a ban on single-value imports from the data provider — `packages/api` already depends on it. Importing the enum value means a future rename of `AgentCapabilities.execute_code` propagates to this file automatically, matching the in-repo controllers' behavior.
Other follow-up findings left as-is per the reviewer's own verdict:
- #2 (memory spec mirrors the production expression rather than calling `AgentClient::useMemory` directly): reviewer flagged as "not blocking" / "design-philosophy observation." The test file's JSDoc already explicitly documents the tradeoff and pins the enum literals to catch the most likely drift vector. Standing up `AgentClient` + all its mocks for a one-line regression guard is disproportionate.
- #3 (`addedConvo.spec.js` mock signature vs. underlying `loadAddedAgent` arity): reviewer's own confidence 25 noted the mock matches the wrapper's actual call pattern in the production file.
Not a real gap.
- #4 was self-retracted as a false alarm.
* 🗑️ refactor: Fully deprecate CODE_API_KEY — remove all LibreChat-side references
The code-execution sandbox no longer authenticates via a per-run `CODE_API_KEY` (frontend or backend). Auth moved server-side into the agents library / sandbox service, so LibreChat drops every reference:
**Backend plumbing:**
- `api/server/services/Files/Code/crud.js`: `getCodeOutputDownloadStream`, `uploadCodeEnvFile`, `batchUploadCodeEnvFiles` no longer accept `apiKey` or send the `X-API-Key` header.
- `api/server/services/Files/Code/process.js`: `processCodeOutput`, `getSessionInfo`, `primeFiles` drop the `apiKey` param throughout.
- `api/server/services/ToolService.js`: stop reading `process.env[EnvVar.CODE_API_KEY]` for `primeCodeFiles` and PTC; the agents library handles auth internally. Remove the now-dead `loadAuthValues` + `EnvVar` imports. Drop the misleading "LIBRECHAT_CODE_API_KEY" hint from the bash_tool error log.
- `api/server/services/Files/process.js`: remove the `loadAuthValues` call around `uploadCodeEnvFile`.
- `api/server/routes/files/files.js`: code-env file download no longer fetches a per-user key.
- `api/server/controllers/tools.js`: `execute_code` is no longer a tool that needs verifyToolAuth with `[EnvVar.CODE_API_KEY]` — the endpoint always reports system-authenticated so the client skips the key-entry dialog. `processCodeOutput` called without `apiKey`.
- `api/server/controllers/agents/callbacks.js`: `processCodeOutput` invoked without the loadAuthValues round trip, for both LegacyHandler and Responses-API handlers.
- `api/app/clients/tools/util/handleTools.js`: `createCodeExecutionTool` called with just `user_id` + files.
**packages/api:**
- `packages/api/src/agents/skillFiles.ts`: `PrimeSkillFilesParams`, `PrimeInvokedSkillsDeps`, `primeSkillFiles`, `primeInvokedSkills` all drop the `apiKey` param; the gate is purely `codeEnvAvailable`.
- `packages/api/src/agents/handlers.ts`: `handleSkillToolCall` drops the `process.env[EnvVar.CODE_API_KEY]` read; skill-file priming is now gated solely on `codeEnvAvailable`. `ToolExecuteOptions` signatures drop apiKey from `batchUploadCodeEnvFiles` and `getSessionInfo`.
- `packages/api/src/agents/skillConfigurable.ts`: JSDoc no longer references the env var.
- `packages/api/src/tools/classification.ts`: PTC creation no longer gated on `loadAuthValues`; `buildToolClassification` drops the `loadAuthValues` dep entirely (no LibreChat-side callers need it for this path anymore).
- `packages/api/src/tools/definitions.ts`: `LoadToolDefinitionsDeps` drops the `loadAuthValues` field.
**Frontend:**
- Delete `client/src/hooks/Plugins/useAuthCodeTool.ts`, `useCodeApiKeyForm.ts`, and `client/src/components/SidePanel/Agents/Code/ApiKeyDialog.tsx` — the install/revoke dialogs for CODE_API_KEY are fully dead.
- `BadgeRowContext.tsx`: drop `codeApiKeyForm` from the context type and provider. `codeInterpreter` toggle treated as always authenticated (sandbox auth is server-side).
- `ToolsDropdown.tsx`, `ToolDialogs.tsx`, `CodeInterpreter.tsx`, `RunCode.tsx`, `SidePanel/Agents/Code/Action.tsx` + `Form.tsx`: all API-key dialog trigger refs, "Configure code interpreter" gear buttons, and auth-verification plumbing removed. The "Code Interpreter" toggle is now a plain `AgentCapabilities.execute_code` checkbox — no key-entry gate.
- `client/src/locales/en/translation.json`: drop the three `com_ui_librechat_code_api*` keys and `com_ui_add_code_interpreter_api_key`. Other locales are externally automated per CLAUDE.md.
**Config:**
- `.env.example`: remove the `# LIBRECHAT_CODE_API_KEY=your-key` section and its header.
**Tests:**
- `crud.spec.js`: assertions flipped to pin "no X-API-Key header" and "no apiKey param".
- `skillFiles.spec.ts`: removed env-var save/restore; tests now pin that the batch-upload path is gated solely on `codeEnvAvailable` and that no apiKey is threaded through.
- `handlers.spec.ts`: same — just the `codeEnvAvailable` gate pins remain.
- `classification.spec.ts`: remove the two tests that asserted `loadAuthValues` was (not) called for PTC.
- `definitions.spec.ts`: drop every `loadAuthValues: mockLoadAuthValues` entry from the deps shape.
- `process.spec.js`: strip the mock of `EnvVar.CODE_API_KEY`.
**Comment hygiene:**
- `tools.ts`, `initialize.ts`, `registry/definitions.ts`: shortened stale comment references to "legacy `execute_code` tool" without naming the retired env var.
Tests verified: 678 packages/api tests pass, 836 backend api tests pass. Typecheck clean, lint clean.
Only remaining CODE_API_KEY mentions in the code are two regression-guard assertions:
- `crud.spec.js`: pins "no X-API-Key header" stays absent.
- `skillConfigurable.spec.ts`: pins `configurable` never grows a `codeApiKey` field.
* 🧹 chore: Remove the last two CODE_API_KEY name mentions in LibreChat
Follow-up to the prior full deprecation commit: two tests still named the retired identifier in their regression-guard assertions.
- `packages/api/src/agents/skillConfigurable.spec.ts`: drop the "does not inject a codeApiKey key" test. The `codeApiKey` field is gone from the production configurable shape, so an absence-assertion naming it re-introduces the retired identifier in code.
- `api/server/services/Files/Code/crud.spec.js`: rename the "without an X-API-Key header" case back to "should request stream response from the correct URL" and drop the `expect(headers).not.toHaveProperty('X-API-Key')` assertion. The surrounding request-shape checks (URL, timeout, responseType) still pin the behavior; the explicit header-absence line was named after the deprecated contract.
Result: `grep -rn "CODE_API_KEY\|codeApiKey\|LIBRECHAT_CODE_API_KEY"` against the LibreChat source tree returns zero hits.
The only remaining `X-API-Key` strings in this repo are on unrelated OpenAPI Action + MCP server auth configurations, where the string is user-facing config, not a LibreChat-owned identifier.
Tests: 677 packages/api pass (2 pre-existing summarization e2e failures unrelated); 126 api-workspace controller/service tests pass. Typecheck and lint clean.
* 🎯 fix: Narrow codeEnvAvailable to per-agent (admin cap AND agent.tools)
Before this commit, `codeEnvAvailable` was computed in the three JS controllers as the admin-level capability flag only (`enabledCapabilities.has(AgentCapabilities.execute_code)`) and passed through `initializeAgent` → `injectSkillCatalog` / `primeInvokedSkills` / `enrichWithSkillConfigurable` unchanged. A skills-only agent whose `tools` array didn't include `execute_code` still got `bash_tool` registered (via `injectSkillCatalog`) and skill files re-primed to the sandbox on every turn — wrong, because the agent never opted in to code execution.
**Fix:** `initializeAgent` now computes the per-agent effective value once as `params.codeEnvAvailable === true && agent.tools.includes(Tools.execute_code)` and reuses the same boolean for:
1. The `execute_code` → `bash_tool + read_file` expansion gate (previously already consulted `agent.tools`; now shares the single `effectiveCodeEnvAvailable` binding).
2. The `injectSkillCatalog` call (previously got the raw admin flag).
3. The returned `InitializedAgent.codeEnvAvailable` field (new, typed as required boolean).
**Controllers (initialize.js, openai.js, responses.js):** store `primaryConfig.codeEnvAvailable` in `agentToolContexts.set(primaryId, ...)`, capture `config.codeEnvAvailable` in every handoff `onAgentInitialized` callback, and read it from the per-agent ctx inside the `toolExecuteOptions.loadTools` runtime closure. The hoisted `const codeEnvAvailable = enabledCapabilities.has(...)` locals in the two OpenAI-compat controllers are gone — they were shadowing the narrowed per-agent value.
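The per-agent narrowing above boils down to a single AND, which a small sketch makes concrete. The expression mirrors the one quoted in the commit message; the function name and the `EXECUTE_CODE` constant (a stand-in for `Tools.execute_code`) are hypothetical.

```typescript
// Stand-in for Tools.execute_code from the data provider (assumed value).
const EXECUTE_CODE = 'execute_code';

/**
 * Sketch of the narrowing: the admin capability must be on AND the agent
 * must have opted in by listing execute_code in its tools.
 */
function effectiveCodeEnvAvailable(
  adminFlag: boolean | undefined,
  agentTools: string[],
): boolean {
  return adminFlag === true && agentTools.includes(EXECUTE_CODE);
}

// The four quadrants the commit's test matrix describes.
const capOnAgentAsks = effectiveCodeEnvAvailable(true, ['execute_code']);
const capOnSkillsOnly = effectiveCodeEnvAvailable(true, ['some_other_tool']);
const capOffAgentAsks = effectiveCodeEnvAvailable(false, ['execute_code']);
const neither = effectiveCodeEnvAvailable(undefined, []);
```

Only the first quadrant yields `true`; a skills-only agent stays `false` even when the admin capability is enabled globally.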
**primeInvokedSkills:** `handlePrimeInvokedSkills` in `services/Endpoints/agents/initialize.js` now uses `primaryConfig.codeEnvAvailable` (per-agent, narrowed) instead of the raw admin flag. A skills-only primary agent won't re-prime historical skill files to the sandbox even when the admin enabled the capability globally.
**Efficiency:** one extra `&&` in `initializeAgent`. No runtime hot-path cost — the `includes()` scan on `agent.tools` was already happening for the `execute_code` expansion gate; it's now just bound to a local. Tool execution closures read `ctx.codeEnvAvailable === true` (property access + strict equality, O(1)).
**Ephemeral-agent note:** per-agent narrowing is authoritative for both persisted and ephemeral flows. The ephemeral toggle (`ephemeralAgent.execute_code`) is reconciled into `agent.tools` upstream in `packages/api/src/agents/added.ts`, so `agent.tools.includes('execute_code')` is the single source of truth by the time `initializeAgent` runs.
**Tests:** two new regression tests pin the narrowing contract:
- `initialize.test.ts` — four-quadrant matrix on `InitializedAgent.codeEnvAvailable` (cap on × agent asks, cap on × doesn't ask, cap off × asks, neither). Catches future refactors that drop either half of the AND.
- `skills.test.ts` — `injectSkillCatalog` with `codeEnvAvailable: false` against an active skill catalog must NOT register `bash_tool` even though it still registers `read_file` + `skill`. This is the state a skills-only agent gets post-narrowing.
All 191 affected packages/api tests pass + 836 backend api tests pass. Typecheck clean, lint clean.
* 🧽 refactor: Comprehensive-review polish — hoist tool defs, pin verifyToolAuth contract, doc appConfig
Addresses the comprehensive review of Phase 8.
Findings mapped:
**#1 (MINOR): `verifyToolAuth` unconditional auth for execute_code**
- Added doc comment explicitly stating the deployment contract (admin capability → reachable sandbox; no per-check health probe to keep UI-gate queries O(1)).
- New `api/server/controllers/__tests__/tools.verifyToolAuth.spec.js` with 4 regression tests pinning the contract:
  1. `authenticated: true` + `SYSTEM_DEFINED` for execute_code.
  2. 404 for unknown tool IDs.
  3. `loadAuthValues` is never consulted (catches a future revert that would resurface the per-user key-entry dialog).
  4. Response `message` is never `USER_PROVIDED`.
**#2 (MINOR): `openai/service.ts` undocumented `appConfig` dependency**
- Expanded the `ChatCompletionDependencies.appConfig` JSDoc to spell out that omitting it silently disables code execution for agents with `execute_code` in their tools. External consumers of `createAgentChatCompletion` now have the contract documented at the type boundary.
**#5 (NIT): `registerCodeExecutionTools` re-allocates tool defs**
- Hoisted `READ_FILE_DEF` and `BASH_TOOL_DEF` to module-level `Object.freeze`d constants. The shapes derive entirely from static `@librechat/agents` exports, so a single frozen object per tool is safe to share across every agent init. Eliminates the ~4-property allocations on every call (including the common second-call no-op path).
**#6 (NIT): Verbose history-priming comment in initialize.js**
- Trimmed the 16-line `handlePrimeInvokedSkills` block to a 5-line summary with `@see InitializedAgent.codeEnvAvailable` pointer. The canonical narrowing explanation lives on the type; the controller comment is just the ACL-vs-capability rationale.
**Skipped:**
- #3 (memory spec tests a mirror function): reviewer self-dismissed as a design tradeoff; the enum-literal pin already catches the highest-risk drift vector.
- #4 (cross-repo contract for `createCodeExecutionTool`): user will explicitly install the latest `@librechat/agents` dev version once the companion PR publishes, so the version pin will be authoritative.
- #7 (migration/deprecation note for self-hosters): out of scope per user direction — release notes handle this.
Tests verified: 679 packages/api + 840 backend api tests pass. Typecheck + lint clean.
* 🔧 chore: Update @librechat/agents version to 3.1.68-dev.1 across package-lock and package.json files
This commit updates the version of the `@librechat/agents` package from `3.1.68-dev.0` to `3.1.68-dev.1` in the `package-lock.json` and relevant `package.json` files. This change ensures consistency across the project and incorporates any updates or fixes from the new version. |
||
|
|
7581540ab6 |
🔌 refactor: Decouple bash_tool from Per-User CODE_API_KEY (#12712)
* 🔌 refactor: Decouple bash_tool from Per-User CODE_API_KEY
Phase 4 of Agent Skills umbrella (#12625): gate bash_tool and skill
file priming on the `execute_code` capability only. Thread a boolean
`codeEnvAvailable` through `enrichWithSkillConfigurable` and
`primeInvokedSkills` in place of the old per-user `codeApiKey` +
`loadAuthValues` plumbing. The sandbox API key is the LibreChat-
hosted service key — system-level, not a user secret — so the
per-user lookup was legacy; when needed, it's read directly from
`process.env[EnvVar.CODE_API_KEY]` inside the capability gate.
`handleSkillToolCall` and `primeInvokedSkills` gate sandbox uploads
on `codeEnvAvailable` first, preventing skill-file uploads to the
sandbox when an agent has `execute_code` disabled even if the env
var happens to be set. The agents library resolves the env key
itself for `bash_tool`, so `ToolService.js` drops the
`loadAuthValues` lookup and the "Code execution is not available"
placeholder tool in favor of a plain `createBashExecutionTool({})`
with a loud error log if the env var is missing.
Also fixes a pre-existing `appConfig`-undefined lint error in
`responses.js`/`createResponse` that surfaced when this file was
touched (declares `const appConfig = req.config` at function top,
matching the existing pattern in other controllers).
Preserves the `skillPrimedIdsByName` threading added by Phase 3/5/6
and all Phase 3/5/6 call-site signatures. Adds
`skillConfigurable.spec.ts` (5 cases pinning the new surface) and
`skillFiles.spec.ts` (4-way matrix of capability × env key for
`primeInvokedSkills`).
* 🧪 refactor: Address Codex Review Feedback
Resolves findings from the second codex review on #12712:
- MAJOR: `handlers.spec.ts` now covers the `codeEnvAvailable` gate in
`handleSkillToolCall` across three cases (gate off, gate on + env
set, gate on + env unset). The gate is the critical regression
prevention — a future edit that drops it would silently re-enable
sandbox uploads for agents with `execute_code` disabled.
- MINOR: Hoist `codeEnvAvailable` and `skillPrimedIdsByName` out of
`loadTools` closures in `openai.js` and `responses.js`. Both values
are fixed once `initializeAgent` resolves, so recomputing them on
every tool execution was wasted work. `responses.js` shares a single
pair between its streaming and non-streaming branches.
- MINOR: `skillFiles.spec.ts` now has a test that exercises the full
upload path end-to-end with real file records, asserting
`batchUploadCodeEnvFiles` is called with the env-sourced apiKey and
the correct file set (including the synthetic `SKILL.md`).
- NIT: Finish the `appConfig` extraction in `responses.js/createResponse`
— replaces the remaining `req.config` references with `appConfig` for
consistency with the pattern in other controllers.
No behavioral changes beyond what was already in place; this is
coverage and readability polish.
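As a rough illustration of the three gate cases the new `handlers.spec.ts` tests cover (gate off, gate on + env set, gate on + env unset), the decision order as of this commit can be sketched like this. Everything here is hypothetical: the function name, the decision labels, and the environment variable name are stand-ins, and skipping the upload when the key is missing is an assumption rather than something the commit message states.

```typescript
// Assumed variable name; stands in for whatever EnvVar.CODE_API_KEY resolves to.
const CODE_API_KEY_VAR = 'LIBRECHAT_CODE_API_KEY';

type UploadDecision = 'skip: capability off' | 'skip: no key' | 'upload';

/**
 * Sketch: the capability gate is checked FIRST, so a key that happens to be
 * set in the environment can never re-enable uploads for a disabled agent.
 */
function decideSkillFileUpload(
  codeEnvAvailable: boolean,
  env: Record<string, string | undefined>,
): UploadDecision {
  if (!codeEnvAvailable) {
    return 'skip: capability off';
  }
  // Only inside the gate is the key looked up (assumed behavior when absent: skip).
  const apiKey = env[CODE_API_KEY_VAR];
  return apiKey ? 'upload' : 'skip: no key';
}

const gateOff = decideSkillFileUpload(false, { [CODE_API_KEY_VAR]: 'key' });
const gateOnEnvSet = decideSkillFileUpload(true, { [CODE_API_KEY_VAR]: 'key' });
const gateOnEnvUnset = decideSkillFileUpload(true, {});
```

Passing the environment in as a parameter keeps the sketch easy to test without saving and restoring real process state.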
* 🧷 test: Tighten Spec Hygiene Per Codex Nit Feedback
Round-3 codex review flagged two NITs on the test code added in the
previous commit:
- Replace `_id: 'skill-id' as unknown as never` in the new
`makeSkillHandlerWithFiles` helper with a real `Types.ObjectId`,
matching the pattern used by the primed-skill tests further up in
the same file (and by `skillFiles.spec.ts`). The `never` cast
hides the fact that `_id` really is a string / ObjectId at runtime.
- Replace the ad-hoc `{ on, pipe, read }` stub with a real
`Readable.from(Buffer.from(''))` in the upload-path test. The stub
worked only because `batchUploadCodeEnvFiles` is mocked and never
iterates the stream; `Readable.from` satisfies the same contract
and is robust to any future partial-real replacement of the upload
function.
Pure test-hygiene improvements; no runtime code touched.
* 🧹 chore: Remove Duplicate appConfig Declaration After Rebase
The upstream `
|
||
|
|
89b6bffc46 | 🧼 fix: Missing Enum imports | ||
|
|
dfc3dfa57f |
📍 feat: always-apply frontmatter: auto-prime skills every turn (#12746)
* 🔁 refactor: Rebase always-apply work onto merged structured-frontmatter columns
Phase 6 (disable-model-invocation / user-invocable / allowed-tools) landed first on feat/agent-skills. Reconcile this branch with the new mainline:
- Thread alwaysApplySkillPrimes through unionPrimeAllowedTools alongside manualSkillPrimes, applying the combined MAX_PRIMED_SKILLS_PER_TURN ceiling before loading tools.
- Add `_id` to ResolvedAlwaysApplySkill to match Phase 6's ResolvedManualSkill shape (read_file name-collision protection).
- Register 'always-apply' in ALLOWED_FRONTMATTER_KEYS / FRONTMATTER_KIND so Phase 6's validator recognizes it.
- Drop frontmatter from the listSkillsByAccess projection; the backfill helper remains as defensive code but its read path is no longer exercised on summary rows (no legacy rows exist — the branch never shipped), saving ~200KB per page.
- Retire the corresponding "backfills legacy on summaries" test.
- Plumb listAlwaysApplySkills through the JS controllers + endpoint initializer so the always-apply resolver sees a real DB method.
* 🧹 fix: Dedupe manual/always-apply overlap, share YAML util, tidy comments
Addresses review findings:
- Cross-list dedup: when a user $-invokes a skill that is also marked always-apply, the always-apply copy is now dropped so the same SKILL.md body never primes twice in one turn. Manual wins (explicit intent, closer to the user message). Dedup runs in both initializeAgent (so persisted user-bubble pills stay in sync) and injectSkillPrimes (defense-in-depth at splice time). New test cases cover single-overlap, partial-overlap, and dedup-before-cap.
- DRY: extract stripYamlTrailingComment to packages/data-schemas/src/utils/yaml.ts; packages/api/src/skills/import.ts now imports the shared helper. Also drop the redundant inner stripYamlTrailingComment call inside parseBooleanScalar — the call site already strips.
- Mark injectManualSkillPrimes as @deprecated in favor of injectSkillPrimes (kept for external consumers of @librechat/api).
- Document SKILL_TRIGGER_MODEL as forward-looking plumbing for the model-invoked path rather than leaving it as a bare unused export.
- Replace the stale "frontmatter is included" comment on listSkillsByAccess with an accurate explanation of why it was intentionally excluded.
* 🔒 fix: Include always-apply primes in skillPrimedIdsByName + clear alwaysApply on body opt-out
Two bugs flagged by Codex review:
P1 (read_file): `manualSkillPrimedIdsByName` only carried manual-invocation primes, so an always-apply skill with `disable-model-invocation: true` was blocked from reading its own bundled files, and same-name collisions could resolve to a different doc than the one whose body got primed.
- Rename `buildManualSkillPrimedIdsByName` → `buildSkillPrimedIdsByName` (accepts both manual + always-apply prime arrays).
- Rename the configurable field `manualSkillPrimedIdsByName` → `skillPrimedIdsByName` throughout the plumbing (skillConfigurable.ts, handlers.ts, CJS callers, tests).
- Overlap resolution: manual wins on the rare edge case where the same name appears in both arrays (upstream dedup should prevent this, but defensive merging treats manual as authoritative).
- New tests: (1) gate-relaxation fires for always-apply primes, (2) `_id` pinning works for always-apply same-name collisions.
P2 (updateSkill): when a body update had no `always-apply:` key, `extractAlwaysApplyFromBody` returned `absent` and the column was left untouched. A skill that was once `alwaysApply: true` would keep auto-priming even after its SKILL.md no longer declared the flag.
- Treat `absent` as a positive "not always-apply" declaration when the body is explicitly submitted; flip the column to `false`.
- Explicit top-level `alwaysApply` still wins (three-source precedence unchanged).
- New tests: body removes key → false, body has no frontmatter at all → false, explicit + body-without-key → explicit wins.
* 🧵 refactor: Collapse duplicate prime types + tighten parse + test hygiene
Sanity-check review follow-ups:
- Collapse `ResolvedManualSkill` / `ResolvedAlwaysApplySkill` into a single `ResolvedSkillPrime` canonical interface with two backward-compatible type aliases. Both resolvers feed the same pipeline stages (injectSkillPrimes, unionPrimeAllowedTools, buildSkillPrimedIdsByName); the per-source distinction lives on `additional_kwargs.trigger`, not on the resolver output.
- Move the `always-apply` branch in `parseFrontmatter` to operate on the raw post-colon text. The outer `unquoteYaml` was fine today because it's idempotent on non-quoted strings, but running it twice (once per line, once after stripping the inline comment) would be fragile if the unquoter ever grows richer YAML-escape handling.
- Add the missing `alwaysApplyDedupedFromManual: 0` field to the `injectSkillPrimes` mocks in `openai.spec.js` and `responses.unit.spec.js` so they match the full `InjectSkillPrimesResult` contract.
- Insert the blank line between the `unionPrimeAllowedTools` and `resolveAlwaysApplySkills` describe blocks.
* 🔧 fix(tsc): Cast mock.calls via `unknown` for strict tuple destructure
`getSkillByName.mock.calls[0]` is typed as `[]` by jest's generic default; a direct cast to `[string, ..., ...]` fails TS2352 under `--noEmit` even though the runtime shape matches. Go through `as unknown as [...]` like the earlier test in the same file so CI's type-check step stays green.
* 🪢 fix: Propagate skillPrimedIdsByName into handoff agent tool context
Handoff agents go through the same `initializeAgent` flow as the primary (with `listAlwaysApplySkills` now plumbed), so they resolve their own `manualSkillPrimes` and `alwaysApplySkillPrimes` — but the `agentToolContexts.set(...)` for handoff agents didn't carry `skillPrimedIdsByName` into the per-agent context. That meant `handleReadFileCall` fell back to the full ACL set + a `prefer*` flag for handoff agents: same-name collisions could resolve to a different doc than the one whose body got primed, and a `disable-model-invocation: true` skill primed via manual `$` or always-apply inside the handoff flow would be blocked from reading its own bundled files.
Build the map via `buildSkillPrimedIdsByName(config.manualSkillPrimes, config.alwaysApplySkillPrimes)` for every handoff tool context so `read_file` behaves identically across primary and handoff agents. |
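The "manual wins" merge behind the name-to-id map can be sketched as below. This is an illustration only: the `SkillPrime` shape, the plain-object return type, and the `sketchSkillPrimedIdsByName` name are assumptions, not the signature of the real `buildSkillPrimedIdsByName`.

```typescript
// Hypothetical minimal prime shape; real primes carry more fields.
interface SkillPrime {
  _id: string;
  name: string;
}

/**
 * Sketch: merge both prime arrays into a name -> _id lookup.
 * Always-apply entries are written first so that manual entries,
 * written second, overwrite them on a same-name overlap.
 */
function sketchSkillPrimedIdsByName(
  manualPrimes: SkillPrime[],
  alwaysApplyPrimes: SkillPrime[],
): Record<string, string> {
  const idsByName: Record<string, string> = {};
  for (const prime of alwaysApplyPrimes) {
    idsByName[prime.name] = prime._id;
  }
  for (const prime of manualPrimes) {
    idsByName[prime.name] = prime._id; // manual is treated as authoritative
  }
  return idsByName;
}

const primedIds = sketchSkillPrimedIdsByName(
  [{ _id: 'manual-1', name: 'release-notes' }],
  [
    { _id: 'always-1', name: 'release-notes' },
    { _id: 'always-2', name: 'style-guide' },
  ],
);
```

Pinning the `_id` per name is what lets a file read resolve to the same document whose body was primed, even when two accessible skills share a name.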
||
|
|
82173f7b91 |
🛡️ feat: Persist & enforce disable-model-invocation / user-invocable / allowed-tools (#12745)
* 🧬 feat: Persist `disable-model-invocation` / `user-invocable` / `allowed-tools` Adds first-class columns mirroring the three runtime-enforced frontmatter fields, with a `deriveStructuredFrontmatterFields` helper that maps from frontmatter at create/update time and re-syncs (via `$unset`) when fields are removed. `listSkillsByAccess` projection includes them so the Phase 6 catalog filter and popover filter can both read off the summary row. Marks `invocationMode` as @deprecated on `TSkill` and the `InvocationMode` enum — the runtime now reads the persisted pair instead. * 🛡️ feat: Enforce frontmatter at runtime (catalog, skill tool, manual resolver, tool union) Wires the persisted columns into actual runtime behavior across all four invocation paths: - `injectSkillCatalog` excludes `disableModelInvocation: true` skills before catalog formatting — they cost zero context tokens and stay invisible to the model. - `handleSkillToolCall` rejects with a clear error when the model names a skill marked `disable-model-invocation: true` (defends against a stale-cache or hallucinated invocation getting past the catalog filter). - `resolveManualSkills` skips `userInvocable: false` skills with a warn log so an API-direct caller can't bypass the popover-side filter. - `unionPrimeAllowedTools` collects skill-declared `allowed-tools` minus what's already on the agent; `initialize.ts` re-runs `loadTools` for the extras and merges resulting `toolDefinitions` into the agent's effective set for the turn. Tool-name resolution is tolerant — unknown names silently drop with a debug log so cross-ecosystem skills referencing yet-to-be-implemented tools (Claude Code's `edit_file`, etc.) import without breaking. The agent document is never modified; the union is turn-scoped. Helper exports (`unionPrimeAllowedTools`) are structured so Phase 5's always-apply primes flow through the same union (combined `[...manualPrimes, ...alwaysApplyPrimes]`) once the resolver lands. 
Skill handler wire format gains the three fields so clients can render them on detail / list views. * 🎛️ feat: `$` popover reads `userInvocable` instead of UI-only `invocationMode` Replaces the phase-1 UI-only `invocationMode` check with the persisted `userInvocable` field (mirrors the `user-invocable` frontmatter). Skills authored with `user-invocable: false` no longer surface in the popover; the backend resolver enforces the same rule for defense-in-depth. Default-visible behavior is preserved: skills without an explicit `userInvocable` value (older rows, freshly imported skills that don't declare the field) stay visible — only an explicit `false` hides them. Test fixture updated to reflect the new field. * 🔧 fix: Address Phase 6 review findings Codex P2 + reviewer #1: Single `loadTools` call with the union of `agent.tools + allowed-tools`. The earlier two-call approach dropped `userMCPAuthMap` / `toolContextMap` / `actionsEnabled` from the skill-added pass — an MCP tool gained via `allowed-tools` would be visible to the model but fail at execution without per-user auth context. Resolution of `manualSkillPrimes` is hoisted before `loadTools` so the union can be computed up-front; the dropped-tools debug log now compares loaded vs. requested across the single call. Codex P3 + reviewer #2: `injectSkillCatalog.activeSkillIds` now includes `disable-model-invocation: true` skills. The runtime ACL check in `handleSkillToolCall` previously couldn't reach the explicit "cannot be invoked by the model" rejection because the broader access set excluded those skills. Catalog text and tool registration still gate on the visible subset (zero-context-token guarantee preserved); only the per-user `isActive` filter is a hard exclusion now. Reviewer #1 (try/catch around loadTools, MAJOR): A single bad `allowed-tools` entry from a shared skill could crash the entire turn. 
Now wrapped — on failure with extras, retry with just `agent.tools` and continue (the dropped-tools debug log surfaces what vanished). If the retry-without-extras still throws, propagate; the agent's own tools are the load-bearing surface. Reviewer #3 (integration tests, MAJOR): Added six tests in `initialize.test.ts` covering the full `allowed-tools` loading path: union pass-through, no-extras short-circuit, agent-baseline dedup, loadTools throw + retry, propagated throw without extras, and the empty-tools edge case. Smaller cleanups bundled in: - Reviewer #4: Moved `logger` import to the package-imports section (was wedged among local imports). - Reviewer #5: Removed unused index on `disableModelInvocation` (filtering happens application-side in `injectSkillCatalog`; index cost write overhead for zero query benefit). - Reviewer #6: Swapped order of `userInvocable` and body checks in `resolveManualSkills` so the more authoritative author-decision reason surfaces first when both apply. - Reviewer #8: Documented the `allowedTools` enforcement gap on the schema + type — model-invoked skills (mid-turn `skill` tool calls) do NOT trigger tool union, since adding tools after the graph starts would require a rebuild. Manual / always-apply (Phase 5) primes are the supported paths. - Reviewer #9: Renamed `dmi` / `ui` / `at` locals to `disableModelInvocationRaw` / `userInvocableRaw` / `allowedToolsRaw` in `deriveStructuredFrontmatterFields`. Reviewer #7 (DRY shared `getSkillByName` return type) deferred — field sets diverge meaningfully across the three call sites (handler needs `body + fileCount`; resolver needs `author + allowedTools + userInvocable`; the InitializeAgentDbMethods contract needs the superset). A `Pick<>`-based consolidation is a follow-up cleanup. * 🔧 fix: Address codex iter 2 — catalog quota + duplicate-name dedup P1: `injectSkillCatalog` cap now counts only model-visible skills, not the merged active set. 
The previous behavior let a tenant with many `disable-model-invocation: true` rows near the top of the cursor exhaust the 100-slot quota before any invocable skill got scanned — the catalog could end up empty even though invocable skills existed further down the paginated results. `MAX_CATALOG_PAGES` stays the ceiling on scan budget; only `visibleCount` drives the early-exit on quota fill. P2: When an invocable and a `disable-model-invocation: true` skill share a name, drop the disabled doc(s) from `activeSkillIds`. Without this dedup, `getSkillByName` (which sorts by `updatedAt` desc) could pick the disabled doc and every model call to the cataloged name would fail with "cannot be invoked by the model" instead of executing the visible skill. When ONLY a disabled doc exists for a name, it stays in `activeSkillIds` so the explicit-rejection error path still fires for hallucinated invocations. Tests: 3 new cases in `injectSkillCatalog` covering (a) cap counted on visible skills only, (b) same-name collision drops disabled doc, (c) sole-disabled-name case keeps the disabled doc. * 🔒 fix: Apply `disable-model-invocation` gate to read_file too (codex iter 3 P1) `activeSkillIds` is shared between the `skill` and `read_file` handlers. The skill-tool gate was applied last iteration, but `handleReadFileCall` authorized purely on `getSkillByName(..., accessibleIds)` — so a model that learned a hidden skill's name (stale catalog or hallucination) could still read its `SKILL.md` body or bundled files via `read_file`, defeating the contract. Same explicit rejection now fires from both handlers; no change needed to the ACL set itself (disabled docs stay in `activeSkillIds` so the explicit error path keeps firing). Two new tests in `handlers.spec.ts` cover the read_file gate and regression-protect the happy path. 
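The quota and dedup rules from iter 2 can be sketched roughly as below. This is an illustration under stated assumptions, not the actual `injectSkillCatalog`. The `SkillRow` shape, the `scanCatalogSketch` name, and the in-memory `pages` array standing in for cursor pagination are hypothetical.

```typescript
// Hypothetical row shape for the sketch.
interface SkillRow {
  id: string;
  name: string;
  disableModelInvocation?: boolean;
}

interface CatalogScan {
  visible: SkillRow[];
  activeIds: string[];
}

function scanCatalogSketch(pages: SkillRow[][], cap: number, maxPages: number): CatalogScan {
  const visible: SkillRow[] = [];
  const disabled: SkillRow[] = [];
  let scannedPages = 0;
  for (const page of pages) {
    // The page ceiling bounds scan work; only visible rows fill the quota.
    if (scannedPages >= maxPages || visible.length >= cap) {
      break;
    }
    scannedPages++;
    for (const row of page) {
      if (row.disableModelInvocation === true) {
        disabled.push(row); // never counts toward the cap
      } else if (visible.length < cap) {
        visible.push(row);
      }
    }
  }
  // Drop a disabled doc when a visible doc shares its name; keep it when it is
  // the only doc for that name so the explicit rejection path can still fire.
  const visibleNames = new Set(visible.map((row) => row.name));
  const activeIds = [
    ...visible.map((row) => row.id),
    ...disabled.filter((row) => !visibleNames.has(row.name)).map((row) => row.id),
  ];
  return { visible, activeIds };
}
```

With a first page made up entirely of disabled rows, the scan keeps going to later pages instead of returning an empty catalog.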
* 🔧 fix: Address codex iter 4 — manual-prime exception + legacy frontmatter backfill P1: Scope the `read_file` `disableModelInvocation` gate to AUTONOMOUS model probes only. A user-invoked `$` skill that is also marked `disable-model-invocation: true` had its bundled `references/*` / `scripts/*` files unreadable, leaving the manually-primed skill body referencing files the model couldn't load. Now the handler bypasses the gate when the skill name appears in `manualSkillNames` (the per-turn allowlist threaded from `manualSkillPrimes` → `agentToolContexts` → `enrichWithSkillConfigurable` → `mergedConfigurable`). Defense-in-depth: the bypass is scoped to the specific names in the allowlist; a different disabled skill name is still rejected. P2: Read-time fallback for legacy skills authored before Phase 6 landed the structured columns. `user-invocable: false` / `disable-model-invocation: true` set in `frontmatter` (the validator already accepted those keys) but with no derived column would incorrectly evaluate as "user-invocable / model-allowed" until a save backfilled the columns. New `backfillDerivedFromFrontmatter` helper fills undefined columns from frontmatter at read time in both `getSkillByName` and `listSkillsByAccess` — column wins when both are set, frontmatter fills the gap when only it's set. No DB writes; the next `updateSkill` naturally persists. `listSkillsByAccess` projection expanded to include `frontmatter` (bounded by validator, payload impact small) so summaries can also be backfilled. Sticky-primed disabled skills (ones invoked in prior turns of the same conversation) are not yet in the manual-prime allowlist — same- turn manual invocation is the load-bearing path codex flagged; the sticky-turn case is a known limitation tracked for a follow-up. Tests: 2 new in handlers.spec.ts (manual-prime allows + name-scoped block holds), 3 new in skill.spec.ts (legacy backfill via getSkillByName + listSkillsByAccess + column-wins precedence). 
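The read-time fallback from iter 4 ("column wins when both are set, frontmatter fills the gap") might look roughly like this. It is a sketch only: the `SkillDocSketch` shape and function name are hypothetical, and it assumes frontmatter values are already parsed into booleans. Like the helper described in the commit, it performs no DB writes but mutates its argument in place.

```typescript
// Hypothetical document shape; only the fields relevant to the fallback.
interface SkillDocSketch {
  frontmatter?: Record<string, unknown>;
  userInvocable?: boolean;
  disableModelInvocation?: boolean;
}

function backfillFromFrontmatterSketch(doc: SkillDocSketch): SkillDocSketch {
  const userInvocableRaw = doc.frontmatter?.['user-invocable'];
  const disableModelInvocationRaw = doc.frontmatter?.['disable-model-invocation'];

  // Only fill columns that are undefined; a persisted column always wins.
  if (doc.userInvocable === undefined && typeof userInvocableRaw === 'boolean') {
    doc.userInvocable = userInvocableRaw;
  }
  if (
    doc.disableModelInvocation === undefined &&
    typeof disableModelInvocationRaw === 'boolean'
  ) {
    doc.disableModelInvocation = disableModelInvocationRaw;
  }
  return doc;
}
```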
* 🔧 fix: Address codex iter 5 — propagate manualSkillNames + keep read_file P1: `enrichWithSkillConfigurable` is also called from `openai.js` and `responses.js` (the OpenAI Responses + completions endpoints). Both were ignoring the new `manualSkillNames` parameter, which meant the manual-prime exception in the `read_file` gate (iter 4) only worked on the agents endpoint. Now all three call sites pass `primaryConfig.manualSkillPrimes?.map(p => p.name)` so manual `$` invocations of disabled skills work consistently across endpoints. P2: When every accessible skill is `disable-model-invocation: true`, the catalog text and `skill` tool are correctly omitted (no model- reachable targets) — but `read_file` and `bash_tool` MUST still be registered. A user manually invoking such a skill gets its SKILL.md body primed into context; if the body references `references/foo.md` or `scripts/run.sh`, those reads need a registered tool. Restructured `injectSkillCatalog` so `skill` registration is gated on `catalogVisibleSkills.length > 0` while `read_file` (always) and `bash_tool` (when codeEnvAvailable) register whenever any active skill is in scope. Tests: existing all-disabled test rewritten to assert read_file IS registered + skill is NOT; new test confirms bash_tool joins it when codeEnvAvailable. * 🔧 fix: Address codex iter 6 — name-collision consistency via preferInvocable P2a (resolveManualSkills): a name collision between an older user-invocable doc and a newer non-user-invocable doc made manual `$` invocation silently no-op. The popover surfaced the older invocable doc; resolver looked it up by name; `getSkillByName` returned the newer non-invocable doc; resolver skipped on `userInvocable: false`. P2b (handler / runtime ACL): with same-name duplicates (e.g. older invocable + newer disabled), the manual prime resolved to one doc while later `read_file` / `skill` execution resolved a different doc through `activeSkillIds`. 
Model could follow one SKILL.md body while reading files from a different skill. Both root-cause: `getSkillByName` always returned the newest match and let the caller filter, but with collisions the newest can be something the caller didn't want. Fix: extend `getSkillByName` with `options.preferInvocable`. When true, prefer the newest doc satisfying BOTH `userInvocable !== false` AND `disableModelInvocation !== true` (with frontmatter backfill); fall back to the newest match otherwise. Fast path preserved when caller doesn't opt in. Callers passing `preferInvocable: true`: - `resolveManualSkills` — picks the popover-visible invocable doc even when a newer disabled / non-user-invocable duplicate exists. - `handleSkillToolCall` — keeps execution aligned with the catalog; falls back to the disabled doc only when no invocable variant exists (so the explicit "cannot be invoked by the model" gate still fires for the hallucinated-disabled-name case). - `handleReadFileCall` — same alignment, plus the manual-prime exception added in iter 4 still applies. Tests: - 2 new in skill.spec.ts (preferInvocable picks invocable when collision exists; falls back to newest when no clean-invocable exists). - 1 new in skills.test.ts (resolver passes preferInvocable through). - 2 new in handlers.spec.ts (skill tool + read_file pass it). - Existing initialize.test.ts assertion updated for the new option. * 🔧 fix: Address codex iter 7 — split preferInvocable into per-axis flags The previous unified `preferInvocable` filter required BOTH `userInvocable !== false` AND `disableModelInvocation !== true`. That was wrong for the model paths: `userInvocable: false` skills are model-only and remain valid `skill` / `read_file` invocation targets. A duplicate-name scenario where the newer cataloged doc was model- only would let the older user-invocable variant shadow it on every model call. Split the option into two independent axes: - `preferUserInvocable` — for manual paths (`$` popover). 
Skips docs with `userInvocable: false`. Disable-model-invocation status is irrelevant; iter 4 explicitly supports manual prime of disabled skills. - `preferModelInvocable` — for model paths (`skill` / `read_file` handlers). Skips docs with `disableModelInvocation: true`. User- invocable status is irrelevant; model-only skills are valid here. Both flags fall back to the newest match when no preferred doc exists, so the explicit-rejection error paths still fire correctly in the sole-disabled-name case. Callers updated: - `resolveManualSkills` → `preferUserInvocable: true` - `handleSkillToolCall` / `handleReadFileCall` → `preferModelInvocable: true` Tests: - New spec test for preferModelInvocable not filtering on userInvocable. - Existing preferInvocable test renamed/split to cover the new axes. - New test asserts preferUserInvocable still returns disabled docs (preserves iter 4 manual-disabled support). - Caller tests assert each path passes the right single flag and does NOT pass the wrong one. * 🔧 fix: TypeScript type-check failure in handlers.spec.ts (CI green) `jest.fn(async () => ...)` without explicit args infers an empty tuple for the call signature, so `mock.calls[0][2]` flagged as "Tuple type '[]' has no element at index '2'." Cast to `unknown[]` then narrow to the expected option shape. Behavior unchanged. Caught by the `Type check @librechat/api` CI step (.github/workflows/backend-review.yml). * 🔧 fix: Address codex iter 8 — undefined-result fallback + read_file alignment P1 (loadTools returning undefined): Production loaders (`createToolLoader` in `initialize.js` / `openai.js` / `responses.js`) wrap `loadAgentTools` in try/catch and return `undefined` on failure rather than throwing. Without explicit handling, my iter-1 try/catch only fired for thrown errors — a silent-failure on a skill-added tool would fall through to the empty fallback and silently DROP the agent's baseline tools for the turn (much worse than just losing the extras). 
Added an `undefined`-result branch that retries with just `agent.tools`, mirroring the throw branch. Test pins both behaviors. P2 (read_file alignment with manual prime): When a skill is in this turn's `manualSkillNames`, the `read_file` handler now uses `preferUserInvocable` instead of `preferModelInvocable`. Same name-collision rule as `resolveManualSkills`, so the doc whose files get read is the same doc whose body got primed. For autonomous probes (skill not in `manualSkillNames`), the handler keeps `preferModelInvocable` to align with the catalog the model saw. Two new tests cover both branches and regression-protect that the wrong flag isn't passed. * 🔧 fix: Address codex iter 9 — pin read_file lookup to primed skill _id P1 (manually-primed disabled IDs were dropped from activeSkillIds): The `executableSkills` dedup in `injectSkillCatalog` correctly drops `disable-model-invocation: true` duplicates when an invocable doc shares the name — but `resolveManualSkills` legitimately primes disabled docs (iter 4 supports manual `$` invocation of disabled skills). When the resolver primed a disabled doc, the read_file handler couldn't find it in the (deduped) `activeSkillIds` and either resolved a different same-name skill or returned not-found. Fix: `ResolvedManualSkill` now carries `_id`; the legacy `initialize.js` / `openai.js` / `responses.js` controllers build a `manualSkillPrimedIdsByName` map and `enrichWithSkillConfigurable` passes it into `mergedConfigurable`. `handleReadFileCall` now pins its lookup's `accessibleIds` to `[primedId]` whenever the requested skill is in the map. The constrained set guarantees the lookup returns the EXACT doc the resolver primed — body/files come from the same source even when same-name duplicates exist or the dedup removed the prime's id from `activeSkillIds`. 
Autonomous read_file probes (skill not in the manual-primed map) keep the full ACL set + `preferModelInvocable` so they align with the catalog the model saw and the disabled-only case still fires the explicit-rejection gate. Test fixture changes flow from `_id` becoming required on `ResolvedManualSkill`. `buildSkillPrimeContentParts` / `injectManualSkillPrimes` widen their param types to `Pick<...>` because they only read `name` / `body` and shouldn't force test literals to invent placeholder ids. * 🧹 fix: Address independent reviewer findings (DRY + types + tests + docs) Sanity-pass review surfaced 7 findings; addressed 6 (the 7th — DRY on inline `getSkillByName` return types — is acknowledged tech debt deferred to a follow-up). #1 [MAJOR, DRY]: The 4-line `manualSkillPrimedIdsByName` map construction was duplicated across 4 CJS call sites (openai.js, responses.js x2, initialize.js). Extracted `buildManualSkillPrimedIdsByName` helper in `skillDeps.js`; all four sites now call the helper. If `ResolvedManualSkill` ever renames `_id` or gains identifying fields, only the helper changes. #2 [MINOR, type safety]: `handleReadFileCall` was casting a hex string to `Types.ObjectId[]` via `as unknown as`, relying on mongoose's auto-cast in `$in` queries. Replaced with `new Types.ObjectId(...)` so any future consumer comparing with `.equals()` / `===` gets the correct value type. Imported `Types` as a value (was type-only). #5 [MINOR, test gap]: Added a test for the worst-case silent-failure path — both the union and base-only `loadTools` calls return undefined. The agent gets no tools but the turn doesn't crash hard; pinning that contract. #4 [MINOR, performance]: Added a TODO on the `listSkillsByAccess` projection noting the `frontmatter` field can be dropped once a write migration backfills all pre-Phase-6 skills' columns. ~2KB/skill × 100/page is wasted bandwidth post-backfill. 
#6 [NIT, docs]: `backfillDerivedFromFrontmatter` JSDoc said "Pure" right before "mutates its undefined fields in place". Replaced with "Side-effect-free w.r.t. the DB (no writes), but mutates its argument in place" which describes both halves accurately.
#7 [NIT, test determinism]: Replaced `await new Promise(r => setTimeout(r, 5))` in two same-name collision tests with explicit `updateOne` setting `updatedAt: new Date(Date.now() - 1000)` on the older doc. Removes the wall-clock race on fast CI runners. The pagination test (line 480) still uses setTimeout; that test is pre-existing and order is incidental, not load-bearing.
Existing test fixtures updated to use valid 24-char hex ObjectIds (required by the iter-9 test that constructs a real `ObjectId`).
#3 [MINOR, deferred]: Inline `getSkillByName` return type duplicated across `handlers.ts`, `initialize.ts`, `skills.ts`. Reviewer acknowledged this as deferred; field sets diverge across call sites (handler needs `fileCount`, resolver needs `author`/`allowedTools`). A `Pick<>`-based consolidation is a clean follow-up. |
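The per-axis preference flags introduced in iter 7 can be illustrated with a small in-memory selection function. This is a sketch of the selection rule only, assuming the candidate docs for a name are already loaded. The real `getSkillByName` queries the database, and the `SkillDocRef` shape and `pickSkillDocSketch` name are hypothetical.

```typescript
// Hypothetical candidate shape; `updatedAt` is a numeric timestamp here.
interface SkillDocRef {
  id: string;
  updatedAt: number;
  userInvocable?: boolean;
  disableModelInvocation?: boolean;
}

interface PickOptions {
  preferUserInvocable?: boolean;
  preferModelInvocable?: boolean;
}

function pickSkillDocSketch(
  matches: SkillDocRef[],
  options: PickOptions = {},
): SkillDocRef | undefined {
  // Newest first, mirroring the "sorts by updatedAt desc" behavior.
  const newestFirst = [...matches].sort((a, b) => b.updatedAt - a.updatedAt);
  const preferred = newestFirst.find(
    (doc) =>
      (!options.preferUserInvocable || doc.userInvocable !== false) &&
      (!options.preferModelInvocable || doc.disableModelInvocation !== true),
  );
  // Fall back to the newest match so explicit rejection paths can still fire.
  return preferred ?? newestFirst[0];
}
```

Each axis filters on its own field only, so a manual lookup (`preferUserInvocable`) can still return a `disable-model-invocation: true` doc, and a model lookup (`preferModelInvocable`) can still return a model-only doc.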
||
|
|
539c4c7e4d |
🎬 feat: Prime Manually-Invoked Skills via $ Popover (#12709)
* 🎬 feat: Prime Manually-Invoked Skills via $ Popover Lands the backend for manual skill invocation, making the $ popover deterministically prime SKILL.md before the LLM turn instead of leaving the model to discover the skill via the catalog. Flow: popover drains pendingManualSkillsByConvoId on submit, attaches names to the ask payload, controllers forward to initializeAgent, and initialize resolves each name to its body (ACL + active-state filtered, reusing the same rules as catalog injection). AgentClient splices the primes as meta HumanMessages before the user's current message. - Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts and reuse primeManualSkill inside handleSkillToolCall for a single shape source. - Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods and all three initializeAgent call sites (initialize.js, responses.js, openai.js). - Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages, shifting indexTokenCountMap so hydrate still fills fresh positions correctly. - Carry isMeta / source / skillName in additional_kwargs for downstream filtering. * 🛡️ fix: Scope manual skill primes to single-agent + cap resolver input Two follow-ups to the Phase 3 priming path flagged in Codex review. Multi-agent runs: skipping the splice when agentConfigs is non-empty. `initialMessages` is shared across every agent in `createRun`, so splicing a skill body there would bypass Phase 1's per-agent `scopeSkillIds` contract — a handoff / added-convo agent with a different skill scope would see content its configuration excludes. Warn + skip is the minimal correct behavior; lifting this to per-agent initial state is a follow-up. Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS` (10) after dedup, with a warn listing the dropped tail. 
Controllers only validate `Array.isArray(req.body.manualSkills)`, so a crafted payload could otherwise fan out into an unbounded `Promise.all` of concurrent `getSkillByName` DB lookups. Cap lives in the resolver so every caller (including future `always-apply` in Phase 5) inherits it. * 🧪 refactor: Testable Helpers + Payload Validation for Manual Skill Primes Follow-ups from the comprehensive review. No behavior change for the happy path — these are architectural and defensive improvements that shrink the JS surface in /api, tighten the request-body contract, and cover the delicate splice logic with proper unit tests. - Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts so the message-array splice and `indexTokenCountMap` shift are unit- testable in TS. client.js now calls the helper. Tests pin the `>=` vs `>` boundary condition — a regression here would silently corrupt token accounting for every message after the insertion point. - Extract `extractManualSkills(body)` and use in all three controllers (initialize.js, responses.js, openai.js). Replaces copy-pasted `Array.isArray(...) ? ... : undefined` with a helper that also filters non-string / empty elements — closes a type-safety gap where a crafted payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach `getSkillByName` and waste DB round-trips. - Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves three invocation modes (`$` popover, `always-apply`, model-invoked); the old name misled readers coming from `handleSkillToolCall`. - Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills` — defensive, since the atom has a synchronous `[]` default, but the previous `.contents` cast would have been unsound under loading/error. - Document why `resolveManualSkills` honors the active-state filter even for explicit `$` selections (Phase 2 popover filter + API-direct hardening). 
- Remove stray `void Types;` in initialize.test.ts — `Types` is already consumed elsewhere in that test. * 🔖 refactor: Single source for the skill-message source marker Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for the model-invoked tool path) and `injectManualSkillPrimes` (for the user-invoked splice path). Downstream filtering and telemetry read this marker, so the two paths must agree; keeping the literal in one place removes the risk of them drifting when Phase 5's `always-apply` adds a third caller. * ♻️ refactor: Drop Multi-Agent Guard + Review Polish - Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking primes to handoff / added-convo agents via shared `initialMessages` is the agents SDK's concern to scope; this layer should just inject and let the graph handle agent-scoped state. The guard was well-intended but produced a silent-drop UX where `$skill` in a multi-agent run did nothing. - Bound the `[resolveManualSkills] Truncating ...` warn output to the first 5 dropped names plus a count suffix. A malicious payload of 1000 names was previously spilling all ~990 names into the log line. - Remove dead `?? []` from the `hasValue`-guarded loadable read in `drainPendingManualSkills` — the atom always yields a string[] when resolved, so the nullish fallback was unreachable. - Reorder skills.ts imports to follow the style guide: value imports shortest-to-longest (`data-schemas` → `langchain/core/messages` → multi-line `@librechat/agents`), type imports longest-to-shortest. * 🧠 fix: Strip Skill Primes from Memory Window + Unbreak CI Mocks Two fixes after the last push. CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock `@librechat/api` and the mock didn't expose the new `extractManualSkills` symbol, so every test in those files crashed before reaching the `recordCollectedUsage` assertion. 
Added `extractManualSkills: jest.fn()` returning `undefined` to both mocks; the controllers now no-op on manualSkills as the tests expect. Codex P2: `runMemory` passes `messages` straight through to the memory processor, so after the splice in `injectManualSkillPrimes`, SKILL.md bodies ride along as if they were real user chat. That pollutes memory extraction with synthetic instruction content and crowds out real turns from the window. - Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts` — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker. - Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))` at the top of `runMemory` before the window-sizing logic. Keeps the primes visible to the LLM (they still ride in `initialMessages`) but invisible to the memory layer. - 5 new tests for the predicate covering marker-present, plain messages, different source, non-object inputs, and array filter integration. * 📜 feat: Show Skill-Loaded Cards for Manually-Invoked Skills The $ popover was priming SKILL.md bodies into the turn but leaving no visible trace on the assistant response — from the user's view it looked like the `$name ` cosmetic text did nothing. Now each manually-invoked skill renders the same "Skill X loaded" tool-call card that model-invoked skills already produce via PR #12684's SkillCall renderer. Approach: post-run prepend to `this.contentParts`. The aggregator owns per-step indices during the run, so pre-seeding collides; waiting until `await runAgents(...)` returns lets the graph settle before synthetic parts slot in at the front. - Export `buildSkillPrimeContentParts(primes, { runId })` from `packages/api/src/agents/skills.ts`. Returns completed tool_call parts (`progress: 1`, args JSON-encoded with `{skillName}`, output matching the model-invoked path's wording) that the existing `SkillCall.tsx` renderer draws identically. 
- In `AgentClient.chatCompletion`, prepend the built parts to `this.contentParts` immediately after `await runAgents`. Persistence and the final-event reconcile come for free — `sendCompletion` already reads `this.contentParts` verbatim. - Card ordering: skills appear first in the assistant message, reflecting that priming ran before the LLM's turn. Live-during-streaming cards are a separate follow-up — the graph's index-based aggregator makes that a bigger lift and this change delivers the core UX win without fighting the stream ordering. 6 new unit tests covering part shape, args JSON contract, output text, unique IDs, empty input, and startOffset ID differentiation. * ⚡ feat: Emit Optimistic Skill Cards + Wire Primes in OpenAI/Responses Two follow-ups from testing. Optimistic card emit: the main chat path was only showing "Skill X loaded" cards at final-reconcile time, so the user saw nothing happen until the stream finished. Now emit synthetic ON_RUN_STEP + ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same pattern the MCP OAuth flow uses in `ToolService` — so the cards appear immediately. The graph's content at index 0 may overwrite them during streaming, but the post-run `contentParts` prepend (unchanged) restores them on final reconcile. OpenAI + Responses parity: both controllers were resolving `manualSkillPrimes` via `initializeAgent` but never injecting them into `formattedMessages` before the run. Manual invocation silently did nothing on `/v1/chat/completions` and the Responses API path. Now both call `injectManualSkillPrimes` on the formatted messages so the model sees SKILL.md bodies on every path. LibreChat-style card SSE events don't apply to these OpenAI-shaped responses, so the live-emit is chat-path-only. - Export `buildSkillPrimeStepEvents(primes, { runId })` from `packages/api/src/agents/skills.ts`. 
Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID` by default so the frontend maps events to the in-flight preliminary response message, matching the OAuth emitter. - In `AgentClient.chatCompletion`, emit via `sendEvent` (or `GenerationJobManager.emitChunk` in resumable mode) after `injectManualSkillPrimes` runs, before the LLM turn begins. - Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after `formatAgentMessages`. Refactored the destructure to `let` on `indexTokenCountMap` so the injector's returned map is usable. - 8 new unit tests covering the step-event builder: pair cardinality, default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on completion, index ordering, stepId/toolCallId pairing, empty input. * 🎯 fix: Route Skill Prime Events to the Real Response + Sparse-Array Offset Two bugs in the optimistic-card emit from the last pass. 1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the MCP OAuth pattern), but OAuth emits DURING tool loading — before the real response messageId exists. By the time skill priming fires, the graph is about to emit with `this.responseMessageId`, so the PRELIM runId orphaned every card onto the client's placeholder response entry in `messageMap`, separate from the one the LLM's events were building. Net effect: cards never rendered mid-stream. Now passing `this.responseMessageId` — the same ID `createRun` receives — so synthetic and real steps land on the same `messageMap` entry. 2. Index 0 collision. With the runId fixed, card-at-0 would have hit `updateContent`'s type-mismatch guard when the LLM's text delta arrived at the same index, suppressing the whole text stream. New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE emit and the server-side `contentParts` assignment. Sparse array during streaming renders as `[llm_text, ..., card]` (skip-holes via `Array#filter` / `Array#map`). 
`filterMalformedContentParts` from `sendCompletion` compacts to dense `[text, card]` before persistence, so streaming UI and saved message agree on order, with no finalize reorder jank. Post-run switches from `contentParts.unshift` to `contentParts[OFFSET + i] = part` to mirror the live placement.
- Add `startIndex` option to `buildSkillPrimeStepEvents` with `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a new test for the explicit `startIndex` override.
* 🎗️ feat: Replace $skill-name Text with Pills on the User Message
The `$skill-name ` cosmetic text the popover was inserting into the textarea had two problems: it lingered in the user message forever (the card is a more meaningful marker), and it implied that free-form text invocation like "$foo help me" should work, which it doesn't, and supporting it would mean another parsing layer nobody asked for.
Dropped the textarea insertion. Visual confirmation after submit now comes from a compact `ManualSkillPills` row on the user bubble that self-extinguishes once the backend's live skill-card stream (`buildSkillPrimeStepEvents` from the last commit) populates the sibling assistant response. Multiple skills render as multiple pills: the atom was already a string array, so multi-select works for free.
- `SkillsCommand.tsx`: select handler no longer writes to the textarea. Still drops the trigger `$` via `removeCharIfLast`, still pushes to `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by user messageId. `useChatFunctions.ask` writes the drained skill list here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the message isn't a user message, when no skills are attached, or when the sibling assistant response already carries a `skill` tool_call content part (the live card took over). Reads messages via React Query so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is cleared of `$` instead of populated with `$name `.
5 new tests for `ManualSkillPills` covering empty state, non-user message guard, multi-skill rendering, the skill-card hide condition, and the text-only-content-doesn't-hide case.
* 🎛️ feat: Manual Skills as Persisted Message Field + Compose-Time Chips
Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and stayed there on reload) because the sparse index-100 offset put them after the model's content. Now back to `unshift`: cards at the top, same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card arrived, so users barely saw them. The live-emit channel also added meaningful complexity and relied on a per-message Recoil atom that had no clean cleanup story.
3. No visual cue at all during new-chat compose: the `$name ` text was removed, the submitted-message pills weren't there yet, and the popover closes after selection. User had no way to see what they'd queued up before sending.
New architecture: `manualSkills` is a first-class field on `TMessage`, persisted by the backend on the user message. `ManualSkillPills` reads straight from `message.manualSkills` (no atom, no sibling-lookup), so pills survive reload, show in history, and stay for the lifetime of the message. Compose-time chips above the textarea read the existing `pendingManualSkillsByConvoId` atom and let users × skills out before submitting.
Backend reverts: - `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored `this.contentParts.unshift(...primeParts)` so cards sit at the top of the persisted assistant response. - `skills.ts`: removed `buildSkillPrimeStepEvents` and `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`, `StepTypes`, and `Constants` imports went with them. Removed 8 tests. Field persistence: - `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`. - Mongoose message schema gains `manualSkills: { type: [String] }` with matching `IMessage` TS field. - `BaseClient.js` reads `req.body.manualSkills` on user-message save, filters to non-empty strings, pins onto `userMessage` before `saveMessageToDatabase`. Mirrors the existing `files` pattern right above it. Runtime resolution still reads top-level `req.body.manualSkills` — persistence and resolution are separate concerns. Frontend: - `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the drained atom value goes onto the message, not a separate atom. Removed the `attachSkillsToMessage` Recoil callback. - `ManualSkillPills`: pure render of `message.manualSkills`. No more `useQueryClient`, no sibling scan, no atom read. Loses the auto-hide-when-card-arrives behavior — pills stay on the user bubble, cards live on the assistant bubble, both are informative. - Dropped the `attachedSkillsByMessageId` atomFamily and its export. - New `PendingManualSkillsChips` above the textarea reads the compose-time atom and renders chips with × to remove. Mounted in `ChatForm` right after `TextareaHeader`. Naturally hides on submit when the atom drains. Tests: updated `ManualSkillPills` suite to the new field-based reads (5 passing). New `PendingManualSkillsChips` suite covering empty state, multi-chip render, single × removal, and full-clear (4 passing). Backend suite trimmed to 89 (was 97) from the step-events test removal — no regressions on the remaining helpers. 
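The "filters to non-empty strings" step on user-message save could look roughly like the sketch below. It is illustrative only, not the actual `BaseClient.js` code: the helper name `sanitizeManualSkills` and its `unknown` input type are assumptions.

```typescript
/**
 * Hypothetical helper (name and signature are assumptions): keeps only
 * non-empty string entries from an untrusted `manualSkills` payload,
 * such as `req.body.manualSkills`.
 */
function sanitizeManualSkills(input: unknown): string[] {
  // Anything that is not an array is treated as "no skills attached".
  if (!Array.isArray(input)) {
    return [];
  }
  // Drop non-strings and empty strings; order of the survivors is preserved.
  return input.filter(
    (entry): entry is string => typeof entry === 'string' && entry.length > 0,
  );
}
```

For example, `sanitizeManualSkills(['pdf', '', 42, 'brand'])` yields `['pdf', 'brand']`, and a non-array payload yields `[]`.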
* 🧪 feat: Assistant-Side Skill-Loading Chips + Pill Padding Two small UX fixes on top of the field-on-message architecture. Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to `py-1` on each chip and added `py-0.5` to the wrapper so the row breathes a little without feeling tall. Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent user message's `manualSkills` onto the assistant bubble as transient "Running X" chips while the real card is in flight. Renders above `ContentParts` in `MessageParts`. Hides itself when the assistant's own `content` grows a `skill` tool_call — the authoritative card from `buildSkillPrimeContentParts.unshift` is showing, so the placeholder steps aside. No SSE emit, no aggregator injection, no index collision with the LLM's streaming content: just a render slot keyed off the parent's field. Why not stream the cards live: whichever content index we'd choose either blocks the LLM's text stream (`updateContent` type-mismatch at index 0) or lands below the response after sparse compaction (index 100+). Mirroring the parent field sidesteps the aggregator entirely and gives the user an immediate "skill is loading" signal that naturally gives way to the real card at finalize. Covers the gap the user flagged: pills on the user message said "I asked for these" but nothing on the assistant side said "we're working on it" until the stream finished. 5 new tests for the indicator: user-msg guard, missing parent-field guard, multi-chip render, hides-on-card-landing, orphan-parent guard. * 🔁 fix: Indicator Visibility + Carry Manual Skills Through Regenerate/Edit Two bugs. 
Indicator never rendered: `InvokingSkillsIndicator` looked up the parent user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`, but on a new chat the React Query cache is keyed by `"new"` (the URL `paramId`) until the server assigns a real conversation ID — while `message.conversationId` on the assistant message is already the server ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched to `useChatContext().getMessages()`, which reads from the same `paramId` the rest of the UI uses, so new-chat and existing-chat cases both resolve to the correct message list. Regenerate / save-and-submit dropped manual skills: the compose-time `pendingManualSkillsByConvoId` atom is drained on the first submit, so replaying that turn later found an empty atom and sent `manualSkills: []`. The pills were still on the user bubble, so from the user's point of view the model was running primed — but the backend saw nothing and produced an unprimed response. - Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a reference message pass its persisted `manualSkills`; `useChatFunctions.ask` uses the override verbatim when present, otherwise falls back to the existing drain-or-empty logic. - `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills` — the user message being regenerated has the field persisted by the backend, so the second turn primes the same skills as the first. - `EditMessage.resubmitMessage` covers both edit branches: - User-message save-and-submit: forwards the edited message's own `manualSkills` so the new sibling turn primes identically. - Assistant-response edit: forwards the parent user message's `manualSkills` for the same reason. Indicator test suite converted from `@tanstack/react-query` harness to a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added a cache-miss case. 
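The override-or-drain rule for regenerate and edit can be sketched as follows. This is a simplified model under stated assumptions, not the real `useChatFunctions.ask`: `AskOptions`, `resolveManualSkills`, and the `drainPending` callback are hypothetical names standing in for `TOptions` and the Recoil atom drain.

```typescript
// Hypothetical stand-in for the relevant slice of `TOptions`.
interface AskOptions {
  overrideManualSkills?: string[];
}

/**
 * Assumed shape of the decision: an explicit override (from a persisted
 * reference message) is used verbatim; otherwise the compose-time queue
 * is drained, which yields an empty list once it has been consumed.
 */
function resolveManualSkills(
  options: AskOptions,
  drainPending: () => string[],
): string[] {
  if (options.overrideManualSkills !== undefined) {
    // Regenerate / save-and-submit: replay the skills stored on the message.
    return options.overrideManualSkills;
  }
  // Fresh submit: take whatever the user queued before sending.
  return drainPending();
}
```

In this sketch the pending queue is left untouched when an override is supplied, so replaying an old turn does not consume skills queued for the next new message.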
* 🧭 fix: Drive Mid-Stream Skill Chips from Submission Atom, Not Message Lookup Message-ID-keyed lookups kept racing the stream: the user message flips from its client-side intermediate UUID to the server-assigned ID mid-run, conversation IDs flip from the URL `paramId="new"` to the real convo ID on brand-new chats, and the React Query cache splits briefly between the two. Previous attempts — direct `queryClient.getQueryData` and then `useChatContext().getMessages()` — each missed a different window. `TSubmission.manualSkills` is already populated at `ask()` time and the submission atom (`store.submissionByIndex(index)`) is the single stable anchor across the whole lifecycle: set once at submit, lives through every SSE event, cleared when the stream ends. No ID lookups, no cache timing. - `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via Recoil. Shows chips when: • the message is assistant-side, • a submission is in flight with non-empty `manualSkills`, • the assistant's `parentMessageId` matches `submission.userMessage.messageId` (so chips appear only on the bubble for the current turn, never on siblings), • the assistant's own content doesn't yet carry a `skill` tool_call (real card takes over from the server's post-run `contentParts.unshift`). - Drops the `useChatContext().getMessages()` dependency and the `useQueryClient` dependency before that. No more lookups by conversationId or messageId. Test suite now mocks `useChatContext` to supply `index: 0` and seeds the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover user-side, no-submission history, empty `manualSkills`, multi-chip render, hides-on-card-landing, and wrong-turn guard. 
* 🌱 fix: Seed Response manualSkills in createdHandler, Indicator Becomes Pure The mid-stream indicator kept getting wired off state I don't own: first `queryClient.getQueryData` (raced the new-chat paramId flip), then `useChatContext().getMessages()` (same cache, same race), then `useRecoilValue(submissionByIndex)` (pulled every message into the submission subscription — re-renders all indicators on any submission change, exactly the "limit hooks in rendering" concern). Cleanest path is the one the user pointed at: the submission owns the data, `useSSE` / `useEventHandlers` owns the save points, so seed the field ONTO the response message at the save site and let the indicator be a pure prop-read. - `createdHandler` now writes `manualSkills` onto the initial response from `submission.manualSkills` at the moment the placeholder enters the messages array. The field rides through the normal mutation pipeline via spreads (`useStepHandler` response creation, `updateContent` result returns) — no special handling needed. - `InvokingSkillsIndicator` drops the Recoil / context / queryClient reads. Pure function of `message`: if assistant, has `manualSkills`, and `content` hasn't grown a `skill` tool_call yet, render chips. Only `useLocalize` left, which was already unavoidable for the i18n string. - Renders decouple: no single state change (`submissionByIndex` flip, React Query cache update) forces every indicator in the message list to re-render anymore. Only the message whose prop changed re-runs. Finalize story unchanged: server's `responseMessage` doesn't carry the frontend-only `manualSkills` field, so `finalHandler`'s replacement drops it — but by then the real `skill` tool_call is in `content` and the indicator's content-scan hides itself anyway. Test suite back to pure prop mocks: 7 cases covering user-guard, no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps, text-only-keeps, and missing message. 
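The "pure function of `message`" rule above can be written as a single predicate. The sketch below uses simplified, assumed types (`MessageLike`, `ContentPart`) rather than the real `TMessage`, and `shouldShowSkillChips` is a hypothetical name for the check the indicator performs.

```typescript
// Simplified, assumed content-part shapes for illustration only.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'tool_call'; tool_call: { name: string } };

// Assumed subset of the message fields the indicator reads.
interface MessageLike {
  isCreatedByUser: boolean;
  manualSkills?: string[];
  content?: ContentPart[];
}

/** Returns true while the transient skill chips should be visible. */
function shouldShowSkillChips(message?: MessageLike): boolean {
  // Missing message or user-side message: never show chips.
  if (!message || message.isCreatedByUser) {
    return false;
  }
  // No seeded skills on the response: nothing to show.
  if (!message.manualSkills || message.manualSkills.length === 0) {
    return false;
  }
  // Once a real `skill` tool_call is present, the authoritative card takes over.
  const hasSkillCard = (message.content ?? []).some(
    (part) => part.type === 'tool_call' && part.tool_call.name === 'skill',
  );
  return !hasSkillCard;
}
```

Text-only content and tool calls other than `skill` leave the chips visible, which mirrors the test cases listed for this commit.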
* 🪞 fix: Render Skill Indicator Inside ContentParts, Adjacent to Parts The indicator still wasn't showing because even though MessageParts mounted it as a sibling of ContentParts, ContentParts is a `memo`'d component that owns the only rendering path that refreshes in lockstep with content deltas. Mounting above it put the indicator one layer further out — reachable, but not exercised on the same render cycle that processes the streaming `message` prop. Moved the indicator into ContentParts itself, rendered at the top of both the sequential and parallel branches. Reads the `message` prop (newly threaded through as an optional prop alongside `content`), so: - Same render cycle as Parts — updates from the SSE pipeline flow through the same pathway. - Lives outside the `content.map`, so delta-driven content reshuffles never wipe it. - Still a pure prop-read inside the indicator itself (no Recoil, queryClient, context hooks). The only dep is `useLocalize`. Thread: - `ContentPartsProps` gains `message?: TMessage`. - `MessageParts` passes `message={message}` through, drops its own indicator mount + import. - `ContentParts` renders `<InvokingSkillsIndicator message={message} />` in both the parallel-content and sequential-content branches, right under `MemoryArtifacts` and before the empty-cursor / parts map. Companion data flow (unchanged): `createdHandler` seeds `initialResponse.manualSkills` from `submission.manualSkills`; the field rides through `useStepHandler` via spreads; indicator hides on `skill` tool_call landing in `content`. * 🔎 refactor: Narrow Skill Components to Scalar skills Prop, Kill Memo Churn Passing the full `message` object into presentational components busts `React.memo` shallow comparisons every time the message reference changes for unrelated reasons. 
Swap to scalar `skills?: string[]` throughout: - `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility logic (user-vs-assistant, skill tool_call arrival) now lives in the caller so this stays pure presentational. - `ManualSkillPills`: props-only (`skills?: string[]`). - `ContentParts`: takes `manualSkills?: string[]` scalar, computes `showInvokingSkills` once per render from `manualSkills` + content scan for the `skill` tool_call, then mounts the indicator with `skills=` prop in both parallel and sequential branches. - `MessageParts`: passes `manualSkills={message.manualSkills}` through to `ContentParts`. - `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`. - Tests updated to exercise the narrowed prop surface. * 📜 feat: Mid-Stream Skill Cards via SkillCall, Drop Custom Indicator Instead of a separate `InvokingSkillsIndicator` chip component, render pending skill placeholders through the existing `SkillCall` renderer — same component the backend's finalized prime part uses. The loading visual (`progress < 1` + empty output → pulsing "Running X") and the completed visual ("Ran X") now come from one source of truth. `ContentParts` computes `pendingSkillNames` from `manualSkills` minus any `skill` tool_call already in `content` (dedupe by `args.skillName` since the synthetic's id differs from the real one). Those names render through a separate slot ABOVE the Parts iteration — not prepended to the content array, which would shift React keys on every downstream streaming text / tool part and force unmount/remount mid-stream. When the real prime `tool_call` lands at finalize (backend unshifts to content[0..]), `collectExistingSkillNames` picks it up, the pending set empties, and the real part takes over rendering in the Parts iteration. Layout is identical either way because primes are always at the top of content. 
- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced) - `ContentParts.tsx` renders `<SkillCall .../>` directly for pending names, mirrors `Part.tsx`'s usage of the same component - `createdHandler` doc comment updated to reflect the new flow * ✂️ fix: Render Interim Skill Cards From manualSkills Only, Leave Content Untouched Previous revision read `content` to de-dupe pending cards against real `skill` tool_calls, so any optimistic skill part streamed from the backend would race our placeholder off the screen mid-turn — exactly the "getting overridden" symptom. Now: interim `SkillCall` cards are driven purely by the response message's `manualSkills` field. `content` is never inspected here, so no backend delta can pull the cards down. The field is now seeded directly onto the assistant placeholder in `useChatFunctions` (not only in `createdHandler`) so the cards appear from the first render, before the `created` SSE event round-trip. Lifecycle: - `useChatFunctions` puts `manualSkills` on the freshly-minted `initialResponse` — cards render the instant the placeholder lands. - `createdHandler` keeps its own re-seed (idempotent; safe) so a regenerate / save-and-submit flow that hits that path still works. - `useStepHandler` spread operations preserve the field through every content update. - `finalHandler` replaces the message with the server-backed `responseMessage` (no `manualSkills`) — cards disappear, and the real `skill` tool_call part in `content` takes over. ContentParts changes: - Drop `collectExistingSkillNames` / `parseJsonField` dedupe path. - `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`. - Simpler control flow — one boolean (`hasPendingSkills`) gates the early return, one function renders. 
* 🩹 fix: Codex Review Resolutions — Localization, Guards, Tests, Docs Addresses seven findings from comprehensive code review: Finding 1 (MAJOR) — Document sticky re-priming as intentional - `buildSkillPrimeContentParts`: expanded doc comment explaining synthetic `skill` tool_calls persist and get re-primed on every subsequent turn via `extractInvokedSkillsFromPayload` (shape parity with model-invoked skills). This matches the UX: the assistant skill card is a visible, persistent signal that the skill is active for the conversation. Not a bug — called out explicitly so future maintainers don't mistake it for one. Finding 2 (MAJOR) — Add ContentParts render tests - New `ContentParts.test.tsx` with 7 cases covering the interim skill card logic: assistant-only rendering, user-message suppression, undefined-content safety, parallel+sequential branch integration, progress<1 (pending) state. Child components mocked so the test exercises only the branching and prop wiring ContentParts owns. Finding 3 (MINOR) — Localize hardcoded aria-labels - Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys. - Reused existing `com_ui_remove_skill_var` for the remove-button aria-label. - `PendingManualSkillsChips` and `ManualSkillPills` now call `useLocalize()`. Test mocks updated to the label-echo pattern. Finding 4 (MINOR) — Max-length guard in `extractManualSkills` - New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from reaching `getSkillByName` / Mongo's query planner. Finding 5 (NIT) — `BaseClient.js` comment contradicted itself - Rewrote to call the filter what it is: defense-in-depth on top of Mongoose schema validation, not a redundant second layer. Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo` - Consistent with peer components (`PendingManualSkillsChips`, `ContentParts`). 
Rendered inside `Container`, which re-renders on every content update, so the memo is a real cycle savings. Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills` - Collapsed the duplicate null-check by computing `pendingSkills` as a `useMemo`'d array (`[]` when not applicable), and mapping directly. `hasPendingSkills` now derives from the array length — one source of truth, no redundant gate inside the render function. * 🔧 fix: Update ParallelContent to Handle Optional Content Prop Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures. * 🎯 fix: Thread manualSkills Through ContentRender — The Real Renderer This is why the interim skill cards never appeared across many rounds of iteration: `ContentRender.tsx` (the memo'd renderer used by most paths, including the agents endpoint) was calling `ContentParts` without the `manualSkills` prop. Only `MessageParts.tsx` had it wired up — and that's not the component that actually renders the assistant response in production. Two fixes: 1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call. 2. Extend the `areContentRenderPropsEqual` memo comparator to include `manualSkills.length`, otherwise a message update that adds the field (seeded by `useChatFunctions` on the initialResponse) would be bailed out by the memo and never re-render. Verified the two ContentParts call sites are now consistent; Container usages for `ManualSkillPills` on the user side were already correct. * 🧹 polish: Address Audit Follow-Up (F1/F3/F6) F1 — Clarify sticky re-priming opt-out path. 
The previous comment said "regenerate without the pick" as one opt-out, but `useChatFunctions.regenerate` forwards the original picks via `overrideManualSkills`, so regeneration alone keeps the skill sticky. Updated to: edit the originating message to remove the pills and resubmit, or start a new conversation. F3 — Add DOM-order assertions to the parallel + sequential tests. The two "alongside" tests verified both elements existed but didn't pin the ordering contract. Both now use `compareDocumentPosition` to assert the pending SkillCall precedes the real content, matching the backend semantic (`contentParts.unshift(...primeParts)` puts primes at the top). F6 — Fix package import order in PendingManualSkillsChips. `recoil` (58 chars) was listed before `lucide-react` (45 chars) which violates the "shortest to longest after react" rule in AGENTS.md. Swapped order; no behavior change. F2 / F4 / F5 from the audit were confirmed as non-issues (React-safe empty map, cosmetic test-mock artifact, accepted memo tradeoff) and require no change. * ✨ feat: Dedicated PendingSkillCall + Running→Ran Transition on Real Content UX polish on the interim skill card now that it's actually rendering: 1. New `PendingSkillCall` component (mirrors `SkillCall` visually but drops the expand affordance). `SkillCall`'s underlying `ProgressText` always renders a chevron + clickable button when any input is present, which on a card with empty output points at nothing — misleading cursor:pointer and a no-op toggle. The pending variant has only the icon + label, no button wrapper, no chevron. 2. "Running X" → "Ran X" transition when real content lands. `ContentParts` computes `hasRealContent` (any non-text part, or a text part with non-empty content — placeholder empty-text parts don't count) and passes `loaded={hasRealContent}` to `PendingSkillCall`. Matches what users see for model-invoked skills as they finish priming: pulsing shimmer → static icon. 3. 
Cleanup: - Dropped direct `SkillCall` import from `ContentParts` (replaced by `PendingSkillCall`). `SkillCall` is still used by `Part` for real `skill` tool_call content parts — no behavior change there. - Removed the now-redundant explicit `manualSkills` assignment in `createdHandler`. `useChatFunctions` seeds the field on `initialResponse` at construction, so the `...submission.initialResponse` spread already carries it through — the re-assignment was defensive belt-and-suspenders doing the same work twice. Comment rewritten to describe the actual lifecycle. Tests updated to the new component (12/12 pass): two new cases pin the loaded-state transition (unloaded when content has no real parts, flips to loaded once a non-empty text part lands). |
||
|
|
9225a279eb |
🎚️ feat: Per-User Skill Active/Inactive Toggle with Ownership-Aware Defaults (#12692)
* feat: per-user skill active/inactive toggle with ownership-aware defaults - Add `skillStates` map (Record<string, boolean>) to user schema for per-user active/inactive overrides on skills - Add `defaultActiveOnShare` to interface.skills config (default: false) so admins can control whether shared skills auto-activate - Add GET/POST /api/user/settings/skills/active endpoints with validation - Add React Query hooks with optimistic mutations for skill states - Add useSkillActiveState hook with ownership-aware resolution: owned skills default active, shared skills default inactive - Add toggle switch UI to SkillListItem and SkillDetail components - Filter inactive skills in injectSkillCatalog before agent injection - Add localization keys for active/inactive labels * fix: use Record instead of Map for IUser.skillStates Mongoose .lean() flattens Map to a plain object, causing type incompatibility with IUser in methods that return lean documents. * fix: address review findings for skill active states - Fail-closed when userId is absent: filter rejects all shared skills instead of passing them through unfiltered (Codex P1) - Validate Mongoose Map key characters (reject . 
and $) in controller to return 400 instead of a 500 from schema validation (Codex P2) - Block toggle while initial skill states query is loading to prevent overwriting server-side overrides with an empty snapshot (Codex P2) - Extract shared SkillToggle component, eliminating duplicate toggle markup in SkillListItem and SkillDetail (Finding #3) - Move skill state query/mutation hooks from Favorites.ts to Skills/queries.ts per feature-directory convention (Finding #4) - Fix hardcoded English aria-label in SkillListItem by passing the localized string from the parent SkillList (Finding #5) - Fix inline arrow in SkillList render loop: pass stable callback reference so SkillListItem memo() is not invalidated (Finding #1) - Extract toRecord() helper in controller to DRY the Map-to-Object conversion (Finding #6) - Remove Promise.resolve wrapping synchronous config read (Finding #8) - Remove unused TUpdateSkillStatesRequest type (Finding #12) * fix: forward tabIndex on SkillToggle to preserve list keyboard nav The original inline toggle had tabIndex={-1} so the row itself remained the sole tab target. The extraction into SkillToggle dropped this prop, making every list toggle a tab stop. Add an optional tabIndex prop and pass -1 from SkillListItem. 
* fix: plumb skillStates to all agent entry points, isolate toggle keydown - Add skillStates/defaultActiveOnShare loading to openai.js and responses.js controllers so shared-skill activation is respected across all agent entry points, not just initialize.js (Codex P1) - Stop keydown propagation on SkillToggle so Enter/Space does not bubble to the parent row's navigation handler (Codex P2) * fix: paginate catalog fetch and serialize toggle writes - Paginate listSkillsByAccess (up to 10 pages of 100) until the active catalog quota is filled, so inactive shared skills in recent positions do not starve active owned skills past the first page (Codex P1) - Extend listSkillsByAccess interface with cursor/has_more/after for catalog pagination - Serialize skill-state writes via a ref queue: one in-flight request at a time, with the latest desired state sent when the previous one settles. Prevents last-response-wins races where an older request overwrites newer toggles (Codex P2) * fix: share write queue across hook instances, block toggle on fetch error - Move the write queue from a per-instance useRef to a module-scoped object so every mount of useSkillActiveState (SkillList, SkillDetail, etc.) serializes against the same in-flight slot. Prior per-instance queues allowed two components to race full-map POSTs (Codex P1) - Extend the toggle guard beyond isLoading: also block when isError is true or data is undefined. Prevents a failed GET from seeding a toggle with an empty baseline that would wipe server-side overrides on the next successful POST (Codex P1) * fix: stale closure, orphan cleanup, and cap-error UX - Read toggle baseline from React Query cache via queryClient.getQueryData instead of the captured skillStates closure. 
The closure can be stale between onMutate's setQueryData and the next render, so rapid successive toggles would build on old state and drop earlier changes (Codex P1) - Surface the MAX_SKILL_STATES_EXCEEDED error code with a specific toast key (com_ui_skill_states_limit) so users understand the 200-cap rather than seeing a generic error - Prune orphaned entries (skillIds whose Skill doc no longer exists) on both GET and POST in SkillStatesController. Self-heals over time without needing cascade-delete hooks or a migration job. Uses one indexed Skill._id query per request * test: pin skill active-state precedence with unit tests Extract the active-state resolution logic from a closure inside injectSkillCatalog into an exported resolveSkillActive helper, then cover every branch of the precedence matrix: - Fails closed when userId is absent (even with defaultActiveOnShare=true) - Explicit override wins over ownership and config (both true and false) - Owned skills default to active when no override is set - Shared skills default to defaultActiveOnShare value - Undefined skillStates behaves identically to an empty object - defaultActiveOnShare defaults to false when omitted - Owned skills ignore defaultActiveOnShare entirely Closes Finding #2 from the pre-rebase comprehensive review. Mirrors the existing scopeSkillIds test style; injectSkillCatalog now calls resolveSkillActive instead of inlining the closure. * refactor: limit skill active toggle to detail header, drop label - Remove the per-row toggle from SkillListItem and the active-state plumbing (hook call, isSkillEnabled/onToggleEnabled/toggleAriaLabel props) from SkillList. The detail view is now the single place to change a skill's active state - Drop dim/muted styling for inactive skills in the sidebar: without a control there, the visual indication has nowhere to land - Resize SkillToggle to match neighbor buttons: outer h-9 container, h-6 w-11 track with size-5 knob, no label span. 
The 'Active' / 'Inactive' text that accompanied the detail-view toggle is removed - Remove the now-unused label prop and tabIndex prop (the tabIndex existed only for the list-row context) from SkillToggle. Drop the onKeyDown stopPropagation for the same reason - Remove now-orphaned com_ui_skill_active / com_ui_skill_inactive translation keys * style: shrink SkillToggle track to h-5 w-9 with size-4 knob Container stays at h-9 to match neighbor button heights. The toggle track itself drops from h-6 w-11 to h-5 w-9, with a size-4 knob travelling 1.125rem on activation. Visually lighter inside the row. * fix: remove redundant skillStates entries that match the resolved default When a toggle lands on the ownership/config default, delete the key from the map instead of persisting `{id: defaultValue}`. Without this, a user toggling a skill off and back on would leave `{id: true}` for an owned skill (whose default is already true), silently consuming a slot against the 200-entry cap. Repeated round-trip toggles could exhaust the quota with zero meaningful overrides (Codex P2). Preserves the exceptions-list invariant that the runtime-resolution design depends on. * fix: prune before enforcing skill-state cap; reject non-ObjectId keys Reorder the update controller so pruneOrphans runs before the 200-cap check. Without this, a user near the cap with some orphaned entries (skills deleted since their last GET) could send a payload that would pass after pruning but gets rejected by the raw-size check first. Add a sanity cap on raw payload size (2 * MAX_SKILL_STATES) so abusive inputs do not reach the DB query, and enforce the real cap on the pruned result instead. Harden pruneOrphans: the earlier early-return path could pass non-ObjectId keys through unchanged. Now only valid ObjectIds are returned, and the Skill-model-unavailable fallback filters by format. 
Also add isValidObjectIdString validation at the input boundary so
malformed (but otherwise non-Mongo-unsafe) keys never reach persistence
(Codex P2 x2).
* fix: enforce active filter at execute time, prune revoked shares, scope queue per user
P1: injectSkillCatalog now returns activeSkillIds (the filtered set that
appears in the catalog). initializeAgent uses that set as the stored
accessibleSkillIds on the initialized agent, so getSkillByName at
runtime cannot resolve a deactivated skill, even if the LLM
hallucinates a name or the user invokes by direct-invocation shorthand.
Previously the executor authorized against the full ACL set, bypassing
the active-state guarantee (Codex P1).
P2: pruneOrphans now checks user access via findAccessibleResources in
addition to skill existence. When a share is revoked, the user's
skillStates entry for that skill had no cleanup path and silently
consumed the 200-cap. Self-heals on both GET and POST. One extra ACL
query per settings read/write; scoped to a single user so no N-user
amplification (Codex P2).
P2: the write queue moves from a single module-scoped object to a Map
keyed by userId. Logout/login in the same tab can no longer flush the
previous user's pending snapshot under the new session's auth. Each
userId gets its own pending/inFlight slot; the in-flight request
retains its original auth via the cookie already attached when sent,
so the race window closes (Codex P2).
* refactor: extract skillStates helpers to packages/api; add tests; polish
Address the remaining valid findings from the comprehensive review:
- Extract toRecord, loadSkillStates, validateSkillStatesPayload, and
pruneOrphanSkillStates into packages/api/src/skills/skillStates.ts as
TypeScript.
The controller in /api shrinks to a ~90-line thin wrapper that builds
live dependency adapters for Mongoose + the permission service
(Review #2 DRY, #3 workspace boundary)
- Replace the triplicated 12-line skillStates loading block in
initialize.js, openai.js, and responses.js with a single call to
loadSkillStates from @librechat/api. One helper, three sites
- Swap console.error for the project logger in the controller (Review #7)
- Remove the redundant INVALID_KEY_PATTERN regex: a valid ObjectId
cannot contain . or $, so isValidObjectIdString already covers it
(Review #11)
- Parameterize the 200-cap error toast with {{0}} interpolation driven
by the error response's `limit` field, so future changes to
MAX_SKILL_STATES update the UI message automatically (Review #12)
- Add 24 unit tests for the new skillStates helpers (toRecord,
resolveDefaultActiveOnShare, loadSkillStates,
validateSkillStatesPayload, pruneOrphanSkillStates) covering success
paths, malformed input, cap boundaries, and parallel-query behavior
(Review #4)
- Add 10 tests for injectSkillCatalog pagination covering empty
accessible set, missing listSkillsByAccess, single-page filter,
owned-vs-shared defaults, explicit-override precedence, multi-page
collection, MAX_CATALOG_PAGES safety cap, early termination on
has_more=false, additional_instructions injection, and fail-closed
without userId (Review #5)
Total test count: 60 (was 26 on this surface).
* fix: rename skillStates ValidationError to avoid barrel-export collision
packages/api/src/types/error.ts already exports a ValidationError
(MongooseError extension). Re-exporting a different shape from
skills/skillStates.ts through the skills barrel caused TS2308 in CI
because the root index re-exports both. Rename to
SkillStatesValidationError to keep the exports disjoint.
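The active-state precedence that the resolveSkillActive and injectSkillCatalog tests pin (fail closed without a user, explicit override first, then ownership, then the share default) might be sketched as follows. The parameter object, its field names, and the function name are assumptions made for illustration; the log does not show the real helper's signature.

```typescript
// Hypothetical parameter shape; the real resolveSkillActive signature is not shown in the log.
interface ResolveSkillActiveParams {
  skillId: string;
  /** True when the current user authored the skill (assumed flag). */
  isOwner: boolean;
  userId?: string;
  /** Per-user overrides; only exceptions to the default are stored. */
  skillStates?: Record<string, boolean>;
  /** Config default applied to skills shared with the user. */
  defaultActiveOnShare?: boolean;
}

function resolveSkillActiveSketch(params: ResolveSkillActiveParams): boolean {
  // 1. Fail closed when there is no user, regardless of config.
  if (!params.userId) {
    return false;
  }
  // 2. An explicit override (true or false) beats ownership and config.
  const override = params.skillStates?.[params.skillId];
  if (typeof override === 'boolean') {
    return override;
  }
  // 3. Owned skills default to active and ignore defaultActiveOnShare.
  if (params.isOwner) {
    return true;
  }
  // 4. Shared skills follow the config default, which is false when omitted.
  return params.defaultActiveOnShare ?? false;
}
```

Because an absent `skillStates` and an empty object both yield no override, the two cases fall through to the same defaults, matching the "undefined behaves identically to an empty object" test case.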
* refactor: tighten tests and absorb caller guard into loadSkillStates
Address the followup review findings:
- Add optional `accessibleSkillIds` param to loadSkillStates so the
helper short-circuits to defaults when no skills are accessible. All
three controllers drop the residual 7-line conditional wrapper in
favor of a single destructured call (Review #2)
- Remove the unreachable `typeof key !== 'string'` check from
validateSkillStatesPayload: Object.entries always yields string keys
per the JS spec (Review #3)
- Replace the two `as unknown as` agent casts in the injectSkillCatalog
tests with a `makeAgent()` factory typed directly as the function's
parameter shape (Review #4)
- Tighten the MAX_CATALOG_PAGES assertion from
`toBeLessThanOrEqual(11)` to `toHaveBeenCalledTimes(10)`: the loop
deterministically makes exactly 10 page fetches before hitting the cap
(Review #1)
- Rewrite the parallel-execution test for pruneOrphanSkillStates using
deferred promises instead of microtask-order assertions. The test now
inspects `toHaveBeenCalledTimes(1)` on both mocks after a single
Promise.resolve() yield, pinning Promise.all usage without relying on
push-order into a shared array (Review #5)
- Evict stale writeQueue entries on user change via a module-scoped
`lastSeenUserId` sentinel. When a different user's toggle is the first
one after a logout/login, the previous user's queue entry is deleted.
Keeps the Map bounded without adding hook-instance effect cleanup
(Review #6)
* fix(test): mock loadSkillStates in openai and responses controller specs
The prior refactor replaced the inline 12-line skillStates loading
block with a call to loadSkillStates from @librechat/api. Both
controller spec files mock @librechat/api as a flat object, so any new
named import from that package is undefined in the test env.
Calling `await loadSkillStates(...)` threw before recordCollectedUsage
ran, surfacing as "undefined is not iterable" on the test's array
destructure of `mockRecordCollectedUsage.mock.calls[0]`. Add the
missing mock to both spec files alongside the existing scopeSkillIds
stub.
* fix: abandon stale skillStates write queues on user switch
Close the cross-session leak window where an in-flight flush loop still
holds a reference to a previous user's queue: it could fire its next
mutateAsync under the new session's auth cookies and persist the stale
snapshot to the new user's document (Codex P1).
Add an `abandoned` flag on `WriteQueue`. Three mechanisms cooperate:
- `getWriteQueue` marks every non-active queue abandoned when the user
differs from the last-seen identity (pre-existing eviction site, now
more aggressive).
- A `useEffect` on `userId` calls the same abandonment pass on every
render with a new active identity, covering the window between
logout/login and the new user's first toggle (when `getWriteQueue`
would otherwise not fire).
- The flush loop checks `!queue.abandoned` in its while condition so
the second and later iterations exit without firing another
`mutateAsync` after the session changes. The first iteration's
in-flight request (already dispatched under the original user's
cookies) still runs to completion or failure on its own; only the
subsequent iterations, which are the dangerous ones, are blocked. |
||
|
|
3e064c2f2b |
🎯 feat: Per-Agent Skill Selection in Builder and Runtime Scoping (#12689)
* feat: per-agent skill selection in builder and runtime scoping
Wire skills persistence on the Agent model and enable the skills
section in the agents builder panel. At runtime, scope the skill
catalog to only the skills configured on each agent (intersected
with user ACL). When no skills are configured, the full user catalog
is used as the default. The ephemeral chat toggle overrides per-agent
scoping to provide the full catalog.
* fix: add scopeSkillIds to @librechat/api mock in responses unit test
The test mocks @librechat/api but was missing the newly imported
scopeSkillIds, causing createResponse to throw before reaching the
assertions. Added a passthrough mock that returns the input array.
* fix: scope primeInvokedSkills by agent's configured skills
primeInvokedSkills was receiving the full unscoped accessibleSkillIds,
bypassing the per-agent skill scoping applied to initializeAgent. This
allowed previously invoked skills from message history to be resolved
and primed even when excluded from the agent's configured skill set.
Apply the same scopeSkillIds filtering to match the initializeAgent
calls, so skill resolution is consistent across catalog injection
and history priming.
* fix: preserve agent skills through form reset and union prime scope
Two related bugs in the per-agent skill selection flow:
1. resetAgentForm dropped the persisted skills array because the generic
fall-through at the end of the loop excludes object/array values.
Combined with composeAgentUpdatePayload always emitting skills, this
caused any save of a previously-configured agent to silently overwrite
skills with an empty array. Add an explicit case for skills mirroring
the agent_ids handling.
2. primeInvokedSkills processes the full conversation payload, including
prior handoff-agent invocations. Scoping it to only primaryAgent.skills
meant a skill invoked by a handoff agent in a prior turn could not be
resolved when the current primary agent had a different scope, leaving
message history reconstruction incomplete. Union the per-agent scoped
accessibleSkillIds across primary plus all loaded handoff agents so
any skill any active agent could invoke is resolvable from history.
* fix: mark inline skill removals as dirty
The inline X button on the skills list called setValue without
shouldDirty: true, so removing a skill via this control did not
mark the skills field as dirty in react-hook-form state. When a
user removed a skill with the X button and also staged an avatar
upload in the same save, isAvatarUploadOnlyDirty returned true and
onSubmit short-circuited to avatar-only upload, silently dropping
the PATCH that would persist the skill removal.
The dialog path (SkillSelectDialog) already passes shouldDirty: true
on add/remove; this aligns the inline control with that behavior.
* fix: restore full ACL scope for primeInvokedSkills history reconstruction
Reverting the earlier scoping of primeInvokedSkills to the active-agent
union. That change conflated runtime invocation scoping (which correctly
gates what the model can call now) with history reconstruction (which
restores bodies the model already saw in prior turns).
Per-agent scoping still applies at:
- Catalog injection (injectSkillCatalog via initializeAgent)
- Runtime invocation (handleSkillToolCall via enrichWithSkillConfigurable,
using each agent's scoped accessibleSkillIds in agentToolContexts)
History priming is a read of past context, not a grant of new capability.
Scoping it causes historical skill bodies to vanish from formatAgentMessages
when an agent's skills list is edited mid-conversation or when the ephemeral
toggle flips, which breaks message reconstruction and drops code-env file
continuity for /mnt/data/{skillName}/ references. The user's ACL-accessible
set is the correct and sufficient gate for history reconstruction.
* fix: close openai.js skill gap and pin undefined vs [] semantics
Three related gaps surfaced in review:
1. api/server/controllers/agents/openai.js was a third skill resolution
site alongside responses.js and initialize.js, but still used the old
activation gate (required ephemeralAgent.skills === true) and never
passed accessibleSkillIds through scopeSkillIds. Per-agent scoping
silently did not apply on this route. Mirror the same pattern used
in responses.js so all three routes behave identically.
2. scopeSkillIds previously collapsed undefined and [] into the same
"full catalog" fallback, making it impossible for a user to express
"this agent has no skills." Tighten the semantics before any data
is written under the old behavior:
- undefined / null = not configured, full catalog
- [] = explicitly none, returns []
- non-empty = intersection with ACL-accessible set
Update defaultAgentFormValues.skills from [] to undefined so a brand
new agent whose skills UI was never touched does not accidentally
persist "explicit none" on first save (removeNullishValues strips
undefined from the payload server side).
3. Add direct unit tests for scopeSkillIds covering all seven cases
(undefined, null, empty, disjoint, overlap, exact match, empty
accessible set). 16 tests total in skills.test.ts pass.
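The three-way semantics above (not configured, explicitly none, intersection) could be expressed roughly as below. This is a minimal sketch, assuming skill ids are plain strings and assuming this argument order; the actual scopeSkillIds signature and id type are not shown in the log.

```typescript
/**
 * Illustrative stand-in for scopeSkillIds.
 * accessibleIds: ids the user may access per ACL (assumed to be strings here).
 * agentSkills:   the agent's configured skills, if any.
 */
function scopeSkillIdsSketch(
  accessibleIds: string[],
  agentSkills?: string[] | null,
): string[] {
  // undefined / null: never configured, so the full ACL-accessible catalog applies.
  if (agentSkills == null) {
    return accessibleIds;
  }
  // []: the agent explicitly has no skills.
  if (agentSkills.length === 0) {
    return [];
  }
  // Non-empty: keep only configured skills the user can also access.
  const configured = new Set(agentSkills);
  return accessibleIds.filter((id) => configured.has(id));
}
```

Keeping `undefined` distinct from `[]` is what lets a brand-new agent (whose form default is `undefined`) keep the full catalog while still allowing "this agent has no skills" to be stored.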
* fix: add scopeSkillIds to @librechat/api mock in openai unit test
Same pattern as the earlier responses.unit.spec.js fix: the test mocks
@librechat/api with an explicit object, so each newly imported symbol
must be added to the mock. Without scopeSkillIds, OpenAIChatCompletion
controller throws on destructuring before reaching recordCollectedUsage,
causing the token usage assertions to fail.
|
||
|
|
64ec5f18b8 |
⚙️ feat: Skill runtime integration: catalog, tools, execution, file priming (#12649)
* feat: Skill runtime integration — catalog injection, tool registration, execute handler
Wires the @librechat/agents SkillTool primitive into LibreChat's agent runtime:
**Enums:**
- Add `skills` to AgentCapabilities + defaultAgentCapabilities
**Data layer:**
- Add `getSkillByName(name, accessibleIds)` — compound query that
combines name lookup + ACL check in one findOne
**Agent initialization (packages/api/src/agents/initialize.ts):**
- Accept `accessibleSkillIds` param and `listSkillsByAccess` db method
- Query accessible skills, format catalog via `formatSkillCatalog()`,
append to `additional_instructions` (appears in agent system prompt)
- Register `SkillToolDefinition` + `createSkillTool()` when catalog
is non-empty (tool appears in model's tool list)
- Store `accessibleSkillIds` and `skillCount` on InitializedAgent
**Execute handler (packages/api/src/agents/handlers.ts):**
- Add `getSkillByName` to `ToolExecuteOptions`
- `handleSkillToolCall()` intercepts `Constants.SKILL_TOOL`:
extracts skillName, loads body from DB with ACL check,
substitutes $ARGUMENTS, returns ToolExecuteResult with
injectedMessages (skill body as isMeta user message)
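As a rough illustration of the `$ARGUMENTS` substitution step, the sketch below replaces every literal `$ARGUMENTS` token in a skill body and wraps the result as a meta user message. The function names and the message shape are hypothetical; the log only states that the body is returned in `injectedMessages` as an isMeta user message, not how the fields are laid out or how the substitution is performed.

```typescript
// Hypothetical message shape, loosely following "skill body as isMeta user message".
interface InjectedMessageSketch {
  role: 'user';
  content: string;
  isMeta: true;
}

/** Replace every literal "$ARGUMENTS" token in the skill body with the call's arguments. */
function substituteArgumentsSketch(body: string, args: string): string {
  // split/join avoids the special "$&"-style patterns that String.prototype.replace
  // would interpret inside a replacement string.
  return body.split('$ARGUMENTS').join(args);
}

function buildSkillMessageSketch(body: string, args: string): InjectedMessageSketch {
  return { role: 'user', content: substituteArgumentsSketch(body, args), isMeta: true };
}
```

Using split/join rather than a string replacement keeps user-supplied arguments containing `$` sequences from being reinterpreted.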
**Caller wiring:**
- initialize.js: query skill IDs via findAccessibleResources,
pass to initializeAgent + store on agentToolContexts,
add getSkillByName to toolExecuteOptions,
pass accessibleSkillIds through loadTools configurable
- openai.js + responses.js: same pattern for their flows
Requires @librechat/agents >= 3.1.65 (PR #91 exports).
* feat: Skills toggle in tools menu + backend capability gating
Frontend:
- Add skills?: boolean to TEphemeralAgent type
- Add LAST_SKILLS_TOGGLE_ to LocalStorageKeys for persistence
- Add skillsEnabled to useAgentCapabilities hook
- Add skills useToolToggle to BadgeRowContext with localStorage init
- New Skills.tsx badge component (Scroll icon, cyan theme,
permission-gated via PermissionTypes.SKILLS)
- Add skills entry to ToolsDropdown with toggle + pin
- Render Skills badge in BadgeRow ephemeral section
Backend:
- Extract injectSkillCatalog() into packages/api/src/agents/skills.ts
(reduces initializeAgent module size, reusable helper)
- initializeAgent delegates to helper instead of inline block
- Capability-gate the findAccessibleResources query:
- Agents endpoint: checks AgentCapabilities.skills in admin config
- OpenAI/Responses controllers: checks ephemeralAgent.skills toggle
- ACL query runs once per run, result shared across all agents
* refactor: remove createSkillTool() instance from injectSkillCatalog
SkillTool is event-driven only. The tool definition in toolDefinitions
is sufficient for the LLM to see the tool schema. No tool instance is
needed since the host handler intercepts via ON_TOOL_EXECUTE before
tool.invoke() is ever called.
Removes tools from InjectSkillCatalogParams/Result, drops the
createSkillTool import.
* feat: skill file priming, bash tool, and invoked skills state
Multi-file skill support:
- New primeSkillFiles() helper (packages/api/src/agents/skillFiles.ts)
uploads skill files + SKILL.md body to code execution environment
- handleSkillToolCall primes files on invocation when skill.fileCount > 0,
returns session info as artifact so ToolNode stores the session
- Skill-primed files available to subsequent bash/code tool calls
Bash tool auto-registration:
- BashExecutionToolDefinition added alongside SkillToolDefinition when
skills are enabled, giving the model a bash tool for running scripts
Conversation state:
- Add invokedSkillIds field to conversation schema (Mongoose + Zod)
- handleSkillToolCall updates conversation with $addToSet on success
- Enables re-priming skill files on subsequent runs (future)
Dependency wiring:
- Pass listSkillFiles, getStrategyFunctions, uploadCodeEnvFile,
updateConversation through ToolExecuteOptions
- Pass req and codeApiKey through mergedConfigurable
- All three controller entry points wired (initialize.js, openai.js,
responses.js)
* fix: load bash_tool instance in loadToolsForExecution, remove file listing
- Add createBashExecutionTool to loadToolsForExecution alongside PTC/ToolSearch
pattern: loads CODE_API_KEY, creates bash tool instance on demand
- Add BASH_TOOL and SKILL_TOOL to specialToolNames set so they don't go
through the generic loadTools path (bash is created here, skill is
intercepted in handler before tool.invoke)
- Remove file name listing from skill content text — it's the skill
author's responsibility to disclose files in SKILL.md, not the framework
* feat: batch upload for skill files, replace sequential uploads
- Add batchUploadCodeEnvFiles() to crud.js: single POST to /upload/batch
with all files in one multipart request, returns shared session_id
- Rewrite primeSkillFiles to collect all streams (SKILL.md + bundled files)
then do one batch upload instead of N sequential uploads
- Replace uploadCodeEnvFile with batchUploadCodeEnvFiles across all callers
(handlers.ts, initialize.js, openai.js, responses.js)
* refactor: remove invokedSkillIds from conversation schema
Skills aren't re-loaded between runs, so conversation-level state for
invoked skills doesn't help. Skill state will live on messages instead
(like tool_search discoveredTools and summaries), enabling in-place
re-injection on follow-up runs.
Removes invokedSkillIds from: convo Mongoose schema, IConversation
interface, Zod schema, ToolExecuteOptions.updateConversation, and
all three caller wiring points.
* feat: smart skill file re-priming with session freshness checking
Schema:
- Add codeEnvIdentifier field to ISkillFile (type + Mongoose schema)
- Add updateSkillFileCodeEnvIds batch method (uses tenantSafeBulkWrite)
- Export checkIfActive from Code/process.js
Extraction:
- Add extractInvokedSkillsFromHistory() to run.ts — scans message
history for AIMessage tool_calls where name === 'skill', extracts
skillName args. Follows same pattern as extractDiscoveredToolsFromHistory.
Smart re-priming in primeSkillFiles:
- Before batch uploading, checks if existing codeEnvIdentifiers are
still active via getSessionInfo + checkIfActive (23h threshold)
- If session is still active, returns cached references (zero uploads)
- If stale or missing, batch-uploads everything and persists new
identifiers on SkillFile documents (fire-and-forget)
- Single session check covers all files (batch shares one session_id)
Wiring:
- Pass getSessionInfo, checkIfActive, updateSkillFileCodeEnvIds
through ToolExecuteOptions and all three controller entry points
* feat: wire skill file re-priming at run start via initialSessions
Flow:
1. initialize.js creates primeInvokedSkills callback with all deps
2. client.js calls it with message history before createRun
3. extractInvokedSkillsFromHistory scans for skill tool calls
4. For each invoked skill with files, primeSkillFiles uploads/checks
5. Returns initialSessions map passed to createRun
6. createRun passes initialSessions to Run.create (via RunConfig)
7. Run constructor seeds Graph.sessions, making skill files available
to subsequent bash/code tool calls via ToolNode session injection
Requires @librechat/agents with initialSessions on RunConfig (PR #94).
* refactor: use CODE_EXECUTION_TOOLS set for code tool checks
Import CODE_EXECUTION_TOOLS from @librechat/agents and replace inline
constant checks in handlers.ts and callbacks.js. Fixes missing bash
tool coverage in the session context injection (handlers.ts) and code
output processing (callbacks.js).
* refactor: move primeInvokedSkills to packages/api, add skill body re-injection
Moves primeInvokedSkills from an inline closure in initialize.js (with
dynamic requires) to a proper exported function in packages/api
skillFiles.ts with explicit typed dependencies.
Key changes:
- primeInvokedSkills now returns both initialSessions (for file priming)
AND injectedMessages (skill bodies for context continuity)
- createRun accepts invokedSkillMessages and appends skill bodies to
systemContent so the model retains skill instructions across runs
- initialize.js calls the packaged function with all deps passed explicitly
- client.js passes both initialSessions and injectedMessages to createRun
* fix: move dynamic requires to top-level module imports
Move primeInvokedSkills, getStrategyFunctions, batchUploadCodeEnvFiles,
getSessionInfo, and checkIfActive from inline requires to top-level
module requires where they belong.
* refactor: skill body reconstruction via formatAgentMessages, not systemContent
Replaces the lazy systemContent approach with proper message-level
reconstruction:
SDK (formatAgentMessages):
- New invokedSkillBodies param (Map<string, string>)
- Reconstructs HumanMessages after skill ToolMessages at the correct
position in the message sequence, matching where ToolNode originally
injected them
LibreChat:
- extractInvokedSkillsFromPayload replaces extractInvokedSkillsFromHistory
(works with raw TPayload before formatAgentMessages, not BaseMessage[])
- primeInvokedSkills now takes payload instead of messages, returns
skillBodies Map instead of injectedMessages
- client.js calls primeInvokedSkills BEFORE formatAgentMessages, passes
skillBodies through as the 4th param
- Removed invokedSkillMessages from createRun (no more systemContent hack)
- Single-pass: skill detection happens inside formatAgentMessages' existing
tool_call processing loop, zero extra message iterations
* refactor: rename skillBodies to skills for consistency with SDK param
* refactor: move auth loading into primeInvokedSkills, pass loadAuthValues as dep
The payload/accessibleSkillIds guard and CODE_API_KEY loading now live
inside primeInvokedSkills (packages/api) rather than in the CJS caller.
initialize.js passes loadAuthValues as a dependency and the callback
is only created when skillsCapabilityEnabled.
* feat: ReadFile tool + conditional bash registration + skill path namespacing
ReadFile tool (read_file):
- General-purpose file reader, event-driven (ON_TOOL_EXECUTE)
- Schema: { file_path: string } — "{skillName}/{path}" convention
- handleReadFileCall: resolves skill name from path, ACL check, reads
from DB cache or storage, binary detection, size limits (256KB),
lazy caching (512KB), line numbers in output
- SKILL.md special case: reads skill.body directly
- Dispatched alongside SKILL_TOOL in createToolExecuteHandler
- Added to specialToolNames in ToolService
Conditional tool registration:
- ReadFile + SkillTool: always registered when skills enabled
- BashTool: only registered when codeEnvAvailable === true
- codeEnvAvailable passed through InitializeAgentParams from caller
Skill file path namespacing:
- primeSkillFiles now uploads as "{skillName}/SKILL.md" and
"{skillName}/{relativePath}" instead of flat names
- Prevents file collisions when multiple skills are invoked
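The `{skillName}/{path}` convention used for uploads and for `read_file` paths could be handled with a pair of helpers like the ones below. Both function names are hypothetical, and the sketch assumes skill names never contain a slash.

```typescript
/** Build the namespaced upload name, e.g. "{skillName}/SKILL.md". */
function toNamespacedPathSketch(skillName: string, relativePath: string): string {
  return `${skillName}/${relativePath}`;
}

/**
 * Split "{skillName}/{path}" back into its parts.
 * Returns null when the path has no skill prefix or no file part.
 */
function splitNamespacedPathSketch(
  filePath: string,
): { skillName: string; relativePath: string } | null {
  const slash = filePath.indexOf('/');
  if (slash <= 0 || slash === filePath.length - 1) {
    return null;
  }
  return {
    skillName: filePath.slice(0, slash),
    relativePath: filePath.slice(slash + 1),
  };
}
```

Splitting on the first slash only keeps nested relative paths such as `scripts/run.py` intact after the skill prefix is removed.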
Wiring:
- getSkillFileByPath + updateSkillFileContent passed through
ToolExecuteOptions in all three callers
* feat: return images/PDFs as artifacts from read_file, tighten caching
Binary artifact support:
- Images (png, jpeg, gif, webp) returned as base64 in artifact.content
with type: 'image_url', processed by existing callback attachment flow
- PDFs returned as base64 artifact similarly
- Binary size limit: 10MB (MAX_BINARY_BYTES)
- Other binary files still return metadata + bash fallback
Caching:
- Text cached only on first read (file.content == null check)
- Binary flag cached only on first detection (file.isBinary == null)
- Skill files are immutable; no redundant cache writes
Registration:
- ReadFileToolDefinition now includes responseFormat: 'content_and_artifact'
* chore: update @librechat/agents to version 3.1.66-dev.0 and add peer dependencies in package-lock.json and package.json files
* fix: resolve review findings #1,#2,#4,#5,#6,#10,#13
Critical:
- #1: primeInvokedSkills now accumulates files across all skills into
one session entry instead of overwriting. Parallel processing via
Promise.allSettled.
- #2: codeEnvAvailable now computed and passed in openai.js and
responses.js (was missing, bash tool never registered in those flows)
Major:
- #4: relativePath in updateSkillFileCodeEnvIds now strips the
{skillName}/ prefix to match SkillFile documents. SKILL.md filter
uses endsWith instead of exact match.
- #5: File priming guarded on apiKey being non-empty (skip when not
configured instead of failing with auth error)
- #6: Skills processed in parallel via Promise.allSettled instead of
sequential for-of loop
Minor:
- #10: Use top-level imports in initialize.js instead of inline requires
- #13: Log warning when skill catalog reaches the 100-skill limit
* fix: resolve followup review findings N1,N2,N4
N1 (CRITICAL): Wire skill deps into responses.js non-streaming path.
Was completely missing getSkillByName, file strategy functions, etc.
N2 (MAJOR): Single batch upload for ALL skills' files. Resolves skills
in parallel (Phase 1), then collects all file streams across skills
and does ONE batchUploadCodeEnvFiles call (Phase 2). All files share
one session_id, eliminating cross-session isolation issues.
N4 (MINOR): Move inline require() to top-level in openai.js and
responses.js, consistent with initialize.js.
* fix: add mocks for new file strategy imports in controller tests
* fix: restore session freshness check, parallelize file lookups, add warnings
R1: Re-add session freshness check before batch upload. Checks any
existing codeEnvIdentifier via getSessionInfo + checkIfActive. If the
session is still active (23h window), returns cached file references
with zero re-uploads.
R2: listSkillFiles calls parallelized via Promise.all (were sequential
in the for-of loop).
R3: Log warning when skill record lookup fails during identifier
persistence (was a silent empty-string fallback).
* fix: guard freshness cache on single-session consistency
* fix: multi-session freshness check (code env handles mixed sessions natively)
The code execution environment fetches each file by its own
{session_id, fileId} pair independently — no single-session
requirement. Removed the sessionIds.size === 1 guard.
Now checks ALL distinct sessions for freshness. If every session
is still active (23h window), returns cached references with per-file
session_ids preserved. If any session expired, falls through to
re-upload everything in a single batch.
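A minimal sketch of the all-or-nothing freshness decision described above: probe each distinct session once, in parallel, and reuse cached references only if every session is still active. The `isSessionActive` callback stands in for the getSessionInfo + checkIfActive pair, and the file shape and function name are assumptions.

```typescript
// Hypothetical cached reference; field names are illustrative.
interface CachedFileSketch {
  session_id: string;
  id: string;
  name: string;
}

// Stand-in for getSessionInfo + checkIfActive (the 23h window lives behind this probe).
type SessionProbeSketch = (sessionId: string, fileId: string) => Promise<boolean>;

/**
 * Returns the cached files (per-file session_ids preserved) when every distinct
 * session is still active; returns null to signal "re-upload everything".
 */
async function resolveCachedFilesSketch(
  files: CachedFileSketch[],
  isSessionActive: SessionProbeSketch,
): Promise<CachedFileSketch[] | null> {
  if (files.length === 0) {
    return null;
  }
  // Pick one representative file per distinct session so each session is probed once.
  const bySession = new Map<string, CachedFileSketch>();
  for (const file of files) {
    if (!bySession.has(file.session_id)) {
      bySession.set(file.session_id, file);
    }
  }
  // Probe all sessions in parallel.
  const results = await Promise.all(
    Array.from(bySession.values()).map((f) => isSessionActive(f.session_id, f.id)),
  );
  return results.every(Boolean) ? files : null;
}
```

Treating a single expired session as a full cache miss trades some redundant uploads for a simpler invariant: the caller either reuses every reference or replaces every reference.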
* perf: parallelize session freshness checks via Promise.all
* fix: add optional chaining for session info retrieval in primeInvokedSkills
Updated the primeInvokedSkills function to use optional chaining for getSessionInfo and checkIfActive methods, ensuring safer access and preventing potential runtime errors when these methods are undefined.
* fix: address review findings #1-#9 + Codex P1/P2 + session probe
Critical:
- #1/Codex P1: Add codeApiKey loading to openai.js and responses.js
loadTools configurable (was missing, file priming broken in 2/3 paths)
- Codex P1: Fix cached file name prefix in primeSkillFiles cache path
(was sf.relativePath, now ${skill.name}/${sf.relativePath})
Major:
- Codex P2: Honor ephemeral skills toggle in agents endpoint
(check ephemeralAgent?.skills !== false alongside admin capability)
- #4: Early size check using file.bytes from DB before streaming
(prevents full-file buffer for oversized files)
Minor:
- #5: Replace Record<string, any> with Record<string, boolean | string>
- #6: Localize Pin/Unpin aria-labels with com_ui_pin/com_ui_unpin
- #8: Parallelize stream acquisition in primeSkillFiles via
Promise.allSettled
- #9: Log warning for partial batch upload failures with filenames
Performance:
- Session probe optimization: getSessionInfo now hits per-object
endpoint (GET /sessions/{sid}/objects/{fid}) instead of listing
entire session (GET /files/{sid}?detail=summary). O(1) stat vs
O(N) list + linear scan.
* refactor: extract shared skill wiring helper + add unit tests
DRY (#3):
- New skillDeps.js exports getSkillToolDeps() with all 9 skill-related
deps (getSkillByName, listSkillFiles, getStrategyFunctions, etc.)
- Replaces 5 identical copy-paste blocks across initialize.js, openai.js,
responses.js (streaming + non-streaming paths)
- One place to maintain when skill deps change
Tests (#2):
- 8 unit tests for extractInvokedSkillsFromPayload covering:
string args, object args, missing skill tool_calls, non-assistant
messages, malformed JSON, empty skillName, empty payload, dedup
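The cases listed for these tests suggest an extraction routine along the following lines. This is only a sketch: the payload and content-part shapes below are simplified assumptions rather than LibreChat's actual TPayload type, and the function name is illustrative. The `'skill'` tool name and the `skillName` argument come from the log.

```typescript
// Simplified, hypothetical payload shapes for illustration only.
interface ToolCallPartSketch {
  type: 'tool_call';
  tool_call: { name: string; args: string | Record<string, unknown> };
}
interface TextPartSketch {
  type: 'text';
  text: string;
}
interface PayloadMessageSketch {
  role: string;
  content: string | Array<ToolCallPartSketch | TextPartSketch>;
}

/** Collect the distinct skill names invoked by assistant messages. */
function extractInvokedSkillsSketch(payload: PayloadMessageSketch[]): Set<string> {
  const names = new Set<string>(); // a Set deduplicates repeated invocations
  for (const message of payload) {
    if (message.role !== 'assistant' || typeof message.content === 'string') {
      continue; // only assistant messages with structured content carry tool calls
    }
    for (const part of message.content) {
      if (part.type !== 'tool_call' || part.tool_call.name !== 'skill') {
        continue;
      }
      let args: unknown = part.tool_call.args;
      if (typeof args === 'string') {
        try {
          args = JSON.parse(args); // args may arrive as a JSON string
        } catch {
          continue; // malformed JSON: skip this call
        }
      }
      const skillName = (args as { skillName?: unknown } | null)?.skillName;
      if (typeof skillName === 'string' && skillName.length > 0) {
        names.add(skillName);
      }
    }
  }
  return names;
}
```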
* fix: remove @jest/globals import, use global jest env
* fix: resolve round 2 review findings R2-1 through R2-7
R2-1 (toggle semantics): openai.js + responses.js now check admin
capability (AgentCapabilities.skills) alongside ephemeral toggle.
Aligns with initialize.js.
R2-2 (swallowed error): primeInvokedSkills now logs
updateSkillFileCodeEnvIds failures (was .catch(() => {}))
R2-4 (test cast): Record<string, string> → Record<string, unknown>
R2-5 (DRY regression): Extract enrichWithSkillConfigurable() into
skillDeps.js. Replaces 4 identical loadAuthValues blocks.
Each loadTools callback is now a one-liner. JSDoc added (R2-6).
R2-7 (sequential streams): primeInvokedSkills now uses
Promise.allSettled for parallel stream acquisition.
* fix: require explicit skills toggle + treat partial cache as miss
- initialize.js: change ephemeralSkillsToggle !== false to === true
(unset toggle no longer enables skills)
- primeSkillFiles cache: require ALL files to have codeEnvIdentifier
before using cache (partial persistence = cache miss = re-upload)
- primeInvokedSkills cache: same check (allFilesWithIds.length must
equal total file count)
* fix: pass entity_id=skillId on batch upload, eliminates per-user cache thrashing
primeSkillFiles now passes entity_id: skill._id.toString() to
batchUploadCodeEnvFiles. This scopes the code env session to the
skill, not the user. All users sharing a skill share the same
uploaded files — no more cache thrashing from overwriting each
other's codeEnvIdentifier.
The stored codeEnvIdentifier now includes ?entity_id= suffix so
freshness checks pass the entity_id through to the per-object
stat endpoint. Both primeSkillFiles and primeInvokedSkills
store consistent identifier formats.
* fix: pass entity_id on multi-skill batch upload, consistent identifier format
* Revert "fix: pass entity_id on multi-skill batch upload, consistent identifier format"
This reverts commit
|
||
|
|
d2cbd551b7
|
🤝 fix: Load Handoff Agents for Agents API (#12740)
* 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (#12726)
Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents
into a shared `discoverConnectedAgents` helper in `@librechat/api` and
wires it into the OpenAI-compatible `/v1/chat/completions` and Open
Responses `/v1/responses` controllers. These endpoints previously only
passed the primary agent config to `createRun` while keeping
`primaryConfig.edges` intact, which forced `MultiAgentGraph` into
multi-agent mode without loading the referenced sub-agents and caused
StateGraph to throw "Found edge ending at unknown node <id>".
The discovery helper also filters orphaned edges (deleted sub-agents or
those the caller lacks VIEW permission on), so API users see the same
graceful fallback the chat UI already had.
* 🧪 fix: use ServerRequest in discovery spec helpers
CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test
helpers typed `req` as `express.Request`, which is not assignable to
`DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose
`user` is `IUser`). Local jest passed because ts-jest is transpile-only,
but the CI typecheck uses the full compiler.
* 🪲 fix: drop orphan edges on both endpoints, not just `to`
Addresses the P1 codex finding on #12740: `filterOrphanedEdges`
previously only removed edges whose `to` referenced a skipped agent.
Edges whose `from` was a skipped agent — the symmetric case in a
bidirectional graph like `A <-> B` where `B` is deleted or the user
lacks VIEW on it — leaked through to `createRun` and re-triggered
`Found edge ending at unknown node <id>` at StateGraph compile time.
The filter now drops an edge if either endpoint references a skipped
id, and the existing `to`-only test cases were updated to reflect the
stricter behavior. Adds a bidirectional-graph regression test in
`discovery.spec.ts`.
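A minimal sketch of the stricter filter, for illustration only. The `Edge` shape, the `asArray` helper, and the function name `filterOrphanedEdgesSketch` are assumptions and not the actual `@librechat/api` code; the only behavior taken from this commit is that an edge is dropped when either endpoint references a skipped id.

```typescript
// Hypothetical edge shape; the real type in LibreChat may differ.
type Edge = { from: string | string[]; to: string | string[] };

// Normalize a single id or a list of ids to a list.
const asArray = (value: string | string[]): string[] =>
  Array.isArray(value) ? value : [value];

/** Keeps only edges where neither `from` nor `to` references a skipped agent id. */
function filterOrphanedEdgesSketch(edges: Edge[], skipped: Set<string>): Edge[] {
  return edges.filter((edge) => {
    const endpoints = [...asArray(edge.from), ...asArray(edge.to)];
    return !endpoints.some((id) => skipped.has(id));
  });
}

// Bidirectional A <-> B plus A -> C, with B skipped (deleted or no VIEW).
const sampleEdges: Edge[] = [
  { from: 'A', to: 'B' },
  { from: 'B', to: 'A' },
  { from: 'A', to: 'C' },
];
const keptEdges = filterOrphanedEdgesSketch(sampleEdges, new Set(['B']));
```

In this example both halves of `A <-> B` are removed and only `A -> C` remains, which is the symmetric case the `to`-only filter missed.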
* 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes
Addresses the second P1 codex finding on #12740: the OpenAI-compat
`/v1/chat/completions` and Open Responses `/v1/responses` routes gate
the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`),
but `discoverConnectedAgents` was checking handoff sub-agents against
the looser in-app `AGENT` resource type. That allowed a remote caller
who could reach the orchestrator but had only in-app visibility on a
sub-agent to invoke it via the API — bypassing the remote-sharing
boundary.
Adds an optional `resourceType` param to `discoverConnectedAgents`
(defaulting to `AGENT` for the chat UI path) and passes
`ResourceType.REMOTE_AGENT` from both API controllers so every
discovered sub-agent clears the same sharing boundary enforced at
route entry.
* 🧯 fix: enforce allowedProviders for discovered sub-agents
Addresses the third P1 codex finding on #12740: `discoverConnectedAgents`
forwarded the caller's `endpointOption` verbatim into `initializeAgent`,
but on the OpenAI-compat routes that option's `endpoint` is the primary
agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only
enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)`
is true, so handoff sub-agents silently bypassed the provider allowlist
configured under `endpoints.agents.allowedProviders`.
Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every
per-sub-agent init call. The primary agent still uses the caller's
endpointOption as before — this only affects the BFS-loaded handoff
targets. Regression test asserts the override.
* ✂️ fix: prune unreachable sub-agents after orphan-edge filtering
Addresses the fourth P1 codex finding on #12740: BFS eagerly initializes
every sub-agent referenced in the primary's edge scan, but once
`filterOrphanedEdges` drops edges whose endpoints were skipped, some of
those sub-agents end up disconnected from the primary. In an `A -> B ->
C` graph (edges stored directly on A) where B is skipped (missing or
no VIEW), both edges are filtered, but C was already loaded and would
still be passed to `createRun` — which flips into multi-agent mode on
`agents.length > 1` and turns C into an unintended parallel start node.
After filtering edges, compute the set of agent ids reachable from the
primary through the surviving edge set and prune `agentConfigs` to that
set. Two regression tests added: one for the pruning case, one that
confirms agents connected via surviving edges are still kept.
* 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback
Addresses the fifth P1 codex finding on #12740: `onAgentInitialized`
fires during BFS, BEFORE the helper prunes agents that become
disconnected once `filterOrphanedEdges` runs. Writing the sub-agent
straight into the outer `agentConfigs` there and then only additively
merging the pruned `discoveredConfigs` left stranded entries in the
outer map, and `AgentClient` would still hand them to `createRun` as
extra parallel start nodes (the exact failure mode the pass-4 prune
was meant to eliminate for the API controllers).
Drop the `agentConfigs.set` from the callback and replace the additive
merge with a direct copy from `discoveredConfigs`, which is now the
single authoritative source of what the run should see. The
per-agent tool context map is still populated during BFS — stale
entries there are harmless because they're only read by closure inside
`ON_TOOL_EXECUTE` and are unreachable once the agent is not in
`agentConfigs`.
* 🔬 fix: address audit findings on discovery helper
Resolves findings from a comprehensive external audit of #12740.
**Finding 1 (CRITICAL) — stale edges survive the reachability prune.**
The pass-4 prune removed unreachable agents from `agentConfigs` but left
matching edges in the return value. In an `A -> B -> C -> D` graph (all
edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B
and B->C but keeps C->D (neither endpoint is skipped). The caller then
sees `agentConfigs` without C/D but `edges` still references them,
flipping `createRun` into multi-agent mode with mismatched agents/edges
— the exact crash this PR is supposed to fix. Now filter the edge list
to the reachable set in the same pass, so the returned shape is
self-consistent: every edge endpoint is either the primary id or a key
of `agentConfigs`. New regression test covers A->B->C->D with B skipped.
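The combined prune can be sketched roughly as below. This is an illustration under assumptions: the `Edge` shape, `pruneToReachable`, and the use of plain id arrays instead of the real `agentConfigs` map are all hypothetical, and traversal here advances when any `from` id is reachable.

```typescript
// Hypothetical edge shape; the real type in LibreChat may differ.
type Edge = { from: string | string[]; to: string | string[] };

const asArray = (value: string | string[]): string[] =>
  Array.isArray(value) ? value : [value];

/** Prunes agents and edges to what is reachable from the primary agent. */
function pruneToReachable(primaryId: string, agentIds: string[], edges: Edge[]) {
  const reachable = new Set<string>([primaryId]);
  const queue: string[] = [primaryId];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined) break;
    for (const edge of edges) {
      if (!asArray(edge.from).includes(current)) continue;
      for (const target of asArray(edge.to)) {
        if (!reachable.has(target)) {
          reachable.add(target);
          queue.push(target);
        }
      }
    }
  }
  return {
    // Agents and edges are filtered in the same pass so the result is self-consistent.
    agentIds: agentIds.filter((id) => reachable.has(id)),
    edges: edges.filter((edge) =>
      [...asArray(edge.from), ...asArray(edge.to)].every((id) => reachable.has(id)),
    ),
  };
}

// A -> B -> C -> D with B skipped: only C -> D survives the orphan filter,
// but C and D were already loaded during discovery.
const pruned = pruneToReachable('A', ['C', 'D'], [{ from: 'C', to: 'D' }]);
```

Here `pruned` ends up with no sub-agents and no edges, so the caller would see a plain single-agent run rather than a mismatched agents/edges pair.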
**Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API
request.** The OpenAI-compat and Responses controllers called
`getModelsConfig(req)` and `discoverConnectedAgents` even when the
primary agent had no edges (the common single-agent API case). Gate
both behind `primaryConfig.edges?.length > 0` so single-agent runs
don't pay that cost.
**Finding 5 (MINOR) — silent mutation of caller's
`primaryConfig.userMCPAuthMap`.** The helper aliased that object and
then `Object.assign`'d sub-agent entries into it, changing the caller's
config in-place. Shallow-clone up front so the returned merged map is
the only destination.
**Finding 7 (NIT) — dead `?? []` coalescing.**
`filterOrphanedEdges` always returns a concrete array, so the
`discoveredEdges ?? []` fallback was never reached. Simplified the
`primaryConfig.edges = …` assignment.
Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not
mutated in-place.
* 🧹 chore: address audit NITs on discovery helper
Addresses two NIT findings from the post-fix audit:
**F1** — the shallow clone on `primaryConfig.userMCPAuthMap` was only
applied on the primary side; the `else` branch (hit when the primary
had no MCP auth and the first sub-agent seeds the map) assigned the
sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's
`Object.assign` mutated the first one's map in place. Harmless in
practice (per-request ephemeral objects) but asymmetric. Clone in the
else branch too. Test added.
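A rough sketch of the clone-on-both-branches merge described in Finding 5 and F1. The `AuthMap` shape and the `mergeAuthMaps` name are assumptions made for illustration; the real helper merges inside the discovery loop.

```typescript
// Assumed shape: server name -> auth key/value pairs.
type AuthMap = Record<string, Record<string, string>>;

/** Merges sub-agent auth maps without mutating any input map. */
function mergeAuthMaps(
  primary: AuthMap | undefined,
  subAgentMaps: Array<AuthMap | undefined>,
): AuthMap | undefined {
  // Shallow-clone the primary map so the caller's object is never the merge target.
  let merged: AuthMap | undefined = primary ? { ...primary } : undefined;
  for (const subMap of subAgentMaps) {
    if (!subMap) continue;
    if (merged) {
      Object.assign(merged, subMap);
    } else {
      // Else branch: clone the first sub-agent map instead of aliasing it.
      merged = { ...subMap };
    }
  }
  return merged;
}

const firstSub: AuthMap = { serverA: { token: 'a' } };
const secondSub: AuthMap = { serverB: { token: 'b' } };
const mergedMap = mergeAuthMaps(undefined, [firstSub, secondSub]);
```

After the call, `mergedMap` holds both servers while `firstSub` still contains only `serverA`.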
**F2** — `initialize.js` had a defensive `if (agentConfigs.size > 0 &&
!edges) edges = []` normalizer. Pre-existing dead code: the helper now
always returns a concrete array from `filteredEdges.filter(...)`.
Removed for clarity.
* 🕸 fix: require all sources reachable when traversing fan-in edges
Addresses the seventh P1 codex finding on #12740: the reachability BFS
advanced through an edge as soon as any of its `from` endpoints matched
the current frontier node (`sources.includes(current)`), but the
subsequent edge filter required ALL sources to be reachable (`every`).
This mismatch in semantics let a fan-in edge like `{from: ['A','B'],
to: 'C'}` mark C reachable purely via A even when B had no path from
the primary, then drop the edge itself at filter time. Result: C
survived in `agentConfigs` with no surviving edge connecting it to A,
so `createRun` flipped into multi-agent mode on `agents.length > 1`
and C ran as an unintended parallel root.
Replace the BFS with a fixed-point iteration keyed on the same
all-sources-reachable predicate used by the filter, so traversal and
filtering stay aligned and multi-source edges only fire once every
source is in the reachable set.
Two regression tests added:
- `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts
neither B nor C leak into the result.
- `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires
and D becomes reachable once both B and C are.
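The fixed-point iteration described in this commit can be sketched as follows. The `Edge` shape and the `reachableAllSources` name are hypothetical; this only illustrates the all-sources-reachable predicate and is not the shipped helper.

```typescript
// Hypothetical edge shape; the real type in LibreChat may differ.
type Edge = { from: string | string[]; to: string | string[] };

const asArray = (value: string | string[]): string[] =>
  Array.isArray(value) ? value : [value];

/** Repeats passes over the edges until no new agent becomes reachable. */
function reachableAllSources(primaryId: string, edges: Edge[]): Set<string> {
  const reachable = new Set<string>([primaryId]);
  let changed = true;
  while (changed) {
    changed = false;
    for (const edge of edges) {
      // An edge only fires once every one of its sources is reachable.
      if (!asArray(edge.from).every((id) => reachable.has(id))) continue;
      for (const target of asArray(edge.to)) {
        if (!reachable.has(target)) {
          reachable.add(target);
          changed = true;
        }
      }
    }
  }
  return reachable;
}

// B has no incoming path, so the fan-in edge never fires.
const blocked = reachableAllSources('A', [{ from: ['A', 'B'], to: 'C' }]);

// B and C are both reachable from A, so the fan-in edge to D fires.
const fanIn = reachableAllSources('A', [
  { from: 'A', to: 'B' },
  { from: 'A', to: 'C' },
  { from: ['B', 'C'], to: 'D' },
]);
```

These two calls mirror the two regression cases listed above: `blocked` contains only `A`, and `fanIn` contains `A`, `B`, `C`, and `D`.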
* 🔀 fix: match SDK OR semantics for multi-source edge reachability
Reverts the all-sources-required reachability gate from
|
||
|
|
b5c097e5c7
|
⚗️ feat: Agent Context Compaction/Summarization (#12287)
* chore: imports/types
Add summarization config and package-level summarize handler contracts
Register summarize handlers across server controller paths
Port cursor dual-read/dual-write summary support and UI status handling
Selectively merge cursor branch files for BaseClient summary content
block detection (last-summary-wins), dual-write persistence, summary
block unit tests, and on_summarize_status SSE event handling with
started/completed/failed branches.
Co-authored-by: Cursor <cursoragent@cursor.com>
refactor: type safety
feat: add localization for summarization status messages
refactor: optimize summary block detection in BaseClient
Updated the logic for identifying existing summary content blocks to use a reverse loop for improved efficiency. Added a new test case to ensure the last summary content block is updated correctly when multiple summary blocks exist.
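A minimal sketch of the reverse-loop lookup, assuming a hypothetical `ContentPart` shape where summary blocks carry `type: 'summary'`; the actual BaseClient code and content types may differ.

```typescript
// Hypothetical content part shape for illustration.
type ContentPart = { type: string; text?: string };

/** Returns the index of the last summary block, or -1 when there is none. */
function findLastSummaryIndex(content: ContentPart[]): number {
  // Scanning from the end stops at the most recent summary (last-summary-wins).
  for (let i = content.length - 1; i >= 0; i--) {
    if (content[i]?.type === 'summary') {
      return i;
    }
  }
  return -1;
}

const parts: ContentPart[] = [
  { type: 'summary', text: 'old' },
  { type: 'text', text: 'hello' },
  { type: 'summary', text: 'new' },
];
const lastSummaryIndex = findLastSummaryIndex(parts);
```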
chore: add runName to chainOptions in AgentClient
refactor: streamline summarization configuration and handler integration
Removed the deprecated summarizeNotConfigured function and replaced it with a more flexible createSummarizeFn. Updated the summarization handler setup across various controllers to utilize the new function, enhancing error handling and configuration resolution. Improved overall code clarity and maintainability by consolidating summarization logic.
feat(summarization): add staged chunk-and-merge fallback
feat(usage): track summarization usage separately from messages
feat(summarization): resolve prompt from config in runtime
fix(endpoints): use @librechat/api provider config loader
refactor(agents): import getProviderConfig from @librechat/api
chore: code order
feat(app-config): auto-enable summarization when configured
feat: summarization config
refactor(summarization): streamline persist summary handling and enhance configuration validation
Removed the deprecated createDeferredPersistSummary function and integrated a new createPersistSummary function for MongoDB persistence. Updated summarization handlers across various controllers to utilize the new persistence method. Enhanced validation for summarization configuration to ensure provider, model, and prompt are properly set, improving error handling and overall robustness.
refactor(summarization): update event handling and remove legacy summarize handlers
Replaced the deprecated summarization handlers with new event-driven handlers for summarization start and completion across multiple controllers. This change enhances the clarity of the summarization process and improves the integration of summarization events in the application. Additionally, removed unused summarization functions and streamlined the configuration loading process.
refactor(summarization): standardize event names in handlers
Updated event names in the summarization handlers to use constants from GraphEvents for consistency and clarity. This change improves maintainability and reduces the risk of errors related to string literals in event handling.
feat(summarization): enhance usage tracking for summarization events
Added logic to track summarization usage in multiple controllers by checking the current node type. If the node indicates a summarization task, the usage type is set accordingly. This change improves the granularity of usage data collected during summarization processes.
feat(summarization): integrate SummarizationConfig into AppSummarizationConfig type
Enhanced the AppSummarizationConfig type by extending it with the SummarizationConfig type from librechat-data-provider. This change improves type safety and consistency in the summarization configuration structure.
test: add end-to-end tests for summarization functionality
Introduced a comprehensive suite of end-to-end tests for the summarization feature, covering the full LibreChat pipeline from message creation to summarization. This includes a new setup file for environment configuration and a Jest configuration specifically for E2E tests. The tests utilize real API keys and ensure proper integration with the summarization process, enhancing overall test coverage and reliability.
refactor(summarization): include initial summary in formatAgentMessages output
Updated the formatAgentMessages function to return an initial summary alongside messages and index token count map. This change is reflected in multiple controllers and the corresponding tests, enhancing the summarization process by providing additional context for each agent's response.
refactor: move hydrateMissingIndexTokenCounts to tokenMap utility
Extracted the hydrateMissingIndexTokenCounts function from the AgentClient and related tests into a new tokenMap utility file. This change improves code organization and reusability, allowing for better management of token counting logic across the application.
refactor(summarization): standardize step event handling and improve summary rendering
Refactored the step event handling in the useStepHandler and related components to utilize constants for event names, enhancing consistency and maintainability. Additionally, improved the rendering logic in the Summary component to conditionally display the summary text based on its availability, providing a better user experience during the summarization process.
feat(summarization): introduce baseContextTokens and reserveTokensRatio for improved context management
Added baseContextTokens to the InitializedAgent type to calculate the context budget based on agentMaxContextNum and maxOutputTokensNum. Implemented reserveTokensRatio in the createRun function to allow configurable context token management. Updated related tests to validate these changes and ensure proper functionality.
feat(summarization): add minReserveTokens, context pruning, and overflow recovery configurations
Introduced new configuration options for summarization, including minReserveTokens, context pruning settings, and overflow recovery parameters. Updated the createRun function to accommodate these new options and added a comprehensive test suite to validate their functionality and integration within the summarization process.
feat(summarization): add updatePrompt and reserveTokensRatio to summarization configuration
Introduced an updatePrompt field for updating existing summaries with new messages, enhancing the flexibility of the summarization process. Additionally, added reserveTokensRatio to the configuration schema, allowing for improved management of token allocation during summarization. Updated related tests to validate these new features.
feat(logging): add on_agent_log event handler for structured logging
Implemented an on_agent_log event handler in both the agents' callbacks and responses to facilitate structured logging of agent activities. This enhancement allows for better tracking and debugging of agent interactions by logging messages with associated metadata. Updated the summarization process to ensure proper handling of log events.
fix: remove duplicate IBalanceUpdate interface declaration
perf(usage): single-pass partition of collectedUsage
Replace two Array.filter() passes with a single for-of loop that
partitions message vs. summarization usages in one iteration.
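For illustration, a single-pass partition might look like the sketch below. The `UsageEntry` shape and the `usage_type` field name are assumptions, not the actual usage schema.

```typescript
// Hypothetical usage entry; field names are assumed for this sketch.
type UsageEntry = { usage_type?: string; input_tokens: number; output_tokens: number };

/** Splits collected usage into message and summarization buckets in one iteration. */
function partitionUsage(collected: UsageEntry[]) {
  const message: UsageEntry[] = [];
  const summarization: UsageEntry[] = [];
  for (const entry of collected) {
    if (entry.usage_type === 'summarization') {
      summarization.push(entry);
    } else {
      message.push(entry);
    }
  }
  return { message, summarization };
}

const buckets = partitionUsage([
  { input_tokens: 10, output_tokens: 5 },
  { usage_type: 'summarization', input_tokens: 100, output_tokens: 20 },
  { input_tokens: 7, output_tokens: 3 },
]);
```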
fix(BaseClient): shallow-copy message content before mutating and preserve string content
Avoid mutating the original message.content array in-place when
appending a summary block. Also convert string content to a text
content part instead of silently discarding it.
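A rough sketch of the copy-then-append behavior, under assumed types. `ContentPart`, `Message`, and `appendSummaryPart` are hypothetical names for illustration only.

```typescript
// Hypothetical shapes for illustration.
type ContentPart = { type: string; text?: string };
type Message = { content?: string | ContentPart[] };

/** Returns a new content array with the summary appended; the input is left untouched. */
function appendSummaryPart(message: Message, summary: ContentPart): ContentPart[] {
  const existing = message.content;
  const parts: ContentPart[] =
    typeof existing === 'string'
      ? [{ type: 'text', text: existing }] // preserve string content as a text part
      : [...(existing ?? [])]; // shallow copy instead of mutating in place
  parts.push(summary);
  return parts;
}

const originalMessage: Message = { content: [{ type: 'text', text: 'hi' }] };
const withSummary = appendSummaryPart(originalMessage, { type: 'summary', text: 's' });
const fromString = appendSummaryPart({ content: 'plain' }, { type: 'summary', text: 's' });
```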
fix(ui): fix Part.tsx indentation and useStepHandler summarize-complete handling
- Fix SUMMARY else-if branch indentation in Part.tsx to match chain level
- Guard ON_SUMMARIZE_COMPLETE with a didFinalize flag to avoid unnecessary
re-renders when no summarizing parts exist
- Protect against undefined completeData.summary instead of unsafe spread
fix(agents): use strict enabled check for summarization handlers
Change summarizationConfig?.enabled !== false to === true so handlers
are not registered when summarizationConfig is undefined.
chore: fix initializeClient JSDoc and move DEFAULT_RESERVE_RATIO to module scope
refactor(Summary): align collapse/expand behavior with Reasoning component
- Single render path instead of separate streaming vs completed branches
- Use useMessageContext for isSubmitting/isLatestMessage awareness so
the "Summarizing..." label only shows during active streaming
- Default to collapsed (matching Reasoning), user toggles to expand
- Add proper aria attributes (aria-hidden, role, aria-controls, contentId)
- Hide copy button while actively streaming
feat(summarization): default to self-summarize using agent's own provider/model
When no summarization config is provided (neither in librechat.yaml nor
on the agent), automatically enable summarization using the agent's own
provider and model. The agents package already provides default prompts,
so no prompt configuration is needed.
Also removes the dead resolveSummarizationLLMConfig in summarize.ts
(and its spec) — run.ts buildAgentContext is the single source of truth
for summarization config resolution. Removes the duplicate
RuntimeSummarizationConfig local type in favor of the canonical
SummarizationConfig from data-provider.
chore: schema and type cleanup for summarization
- Add trigger field to summarizationAgentOverrideSchema so per-agent
trigger overrides in librechat.yaml are not silently stripped by Zod
- Remove unused SummarizationStatus type from runs.ts
- Make AppSummarizationConfig.enabled non-optional to reflect the
invariant that loadSummarizationConfig always sets it
refactor(responses): extract duplicated on_agent_log handler
refactor(run): use agents package types for summarization config
Import SummarizationConfig, ContextPruningConfig, and
OverflowRecoveryConfig from @librechat/agents and use them to
type-check the translation layer in buildAgentContext. This ensures
the config object passed to the agent graph matches what it expects.
- Use `satisfies AgentSummarizationConfig` on the config object
- Cast contextPruningConfig and overflowRecoveryConfig to agents types
- Properly narrow trigger fields from DeepPartial to required shape
feat(config): add maxToolResultChars to base endpoint schema
Add maxToolResultChars to baseEndpointSchema so it can be configured
on any endpoint in librechat.yaml. Resolved during agent initialization
using getProviderConfig's endpoint resolution: custom endpoint config
takes precedence, then the provider-specific endpoint config, then the
shared `all` config.
Passed through to the agents package ToolNode, which uses it to cap
tool result length before it enters the context window. When not
configured, the agents package computes a sensible default from
maxContextTokens.
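The precedence order above can be sketched with a nullish-coalescing chain. The `EndpointConfig` type and `resolveMaxToolResultChars` function are hypothetical; the real resolution goes through getProviderConfig.

```typescript
// Hypothetical slice of an endpoint config for illustration.
type EndpointConfig = { maxToolResultChars?: number };

/** Custom endpoint config wins, then the provider-specific config, then the shared `all` config. */
function resolveMaxToolResultChars(
  custom?: EndpointConfig,
  provider?: EndpointConfig,
  all?: EndpointConfig,
): number | undefined {
  return custom?.maxToolResultChars ?? provider?.maxToolResultChars ?? all?.maxToolResultChars;
}

const fromCustom = resolveMaxToolResultChars({ maxToolResultChars: 5000 }, { maxToolResultChars: 8000 });
const fromAll = resolveMaxToolResultChars(undefined, {}, { maxToolResultChars: 12000 });
const unset = resolveMaxToolResultChars();
```

When the result is `undefined`, nothing is passed through and the agents package falls back to its own default.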
fix(summarization): forward agent model_parameters in self-summarize default
When no explicit summarization config exists, the self-summarize
default now forwards the agent's model_parameters as the
summarization parameters. This ensures provider-specific settings
(e.g. Bedrock region, credentials, endpoint host) are available
when the agents package constructs the summarization LLM.
fix(agents): register summarization handlers by default
Change the enabled gate from === true to !== false so handlers
register when no explicit summarization config exists. This aligns
with the self-summarize default where summarization is always on
unless explicitly disabled via enabled: false.
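The difference between the two gates shows up only when the config or its `enabled` field is missing. A small sketch, with a hypothetical `GateConfig` type:

```typescript
// Hypothetical config slice for illustration.
type GateConfig = { enabled?: boolean };

// Strict gate: handlers register only when enabled is explicitly true.
const strictGate = (config?: GateConfig): boolean => config?.enabled === true;

// Default-on gate: handlers register unless enabled is explicitly false.
const defaultOnGate = (config?: GateConfig): boolean => config?.enabled !== false;

const strictWhenMissing = strictGate(undefined);
const defaultOnWhenMissing = defaultOnGate(undefined);
const defaultOnWhenDisabled = defaultOnGate({ enabled: false });
```

With no config at all, the strict gate yields `false` and the default-on gate yields `true`, which is why the latter matches the self-summarize default.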
refactor(summarization): let agents package inherit clientOptions for self-summarize
Remove model_parameters forwarding from the self-summarize default.
The agents package now reuses the agent's own clientOptions when the
summarization provider matches the agent's provider, inheriting all
provider-specific settings (region, credentials, proxy, etc.)
automatically.
refactor(summarization): use MessageContentComplex[] for summary content
Unify summary content to always use MessageContentComplex[] arrays,
matching the pattern used by on_message_delta. No more string | array
unions — content is always an array of typed blocks ({ type: 'text',
text: '...' } for text, { type: 'reasoning_content', ... } for
reasoning).
Agents package:
- SummaryContentBlock.content: MessageContentComplex[] (was string)
- tokenCount now optional (not sent on deltas)
- Removed reasoning field — reasoning is now a content block type
- streamAndCollect normalizes all chunks to content block arrays
- Delta events pass content blocks directly
LibreChat:
- SummaryContentPart.content: Agents.MessageContentComplex[]
- Updated Part.tsx, Summary.tsx, useStepHandler.ts, BaseClient.js
- Summary.tsx derives display text from content blocks via useMemo
- Aggregator uses simple array spread
refactor(summarization): enhance summary handling and text extraction
- Updated BaseClient.js to improve summary text extraction, accommodating both legacy and new content formats.
- Modified summarization logic to ensure consistent handling of summary content across different message formats.
- Adjusted test cases in summarization.e2e.spec.js to utilize the new summary text extraction method.
- Refined SSE useStepHandler to initialize summary content as an array.
- Updated configuration schema by removing unused minReserveTokens field.
- Cleaned up SummaryContentPart type by removing rangeHash property.
These changes streamline the summarization process and ensure compatibility with various content structures.
refactor(summarization): streamline usage tracking and logging
- Removed direct checks for summarization nodes in ModelEndHandler and replaced them with a dedicated markSummarizationUsage function for better readability and maintainability.
- Updated OpenAIChatCompletionController and responses handlers to utilize the new markSummarizationUsage function for setting usage types.
- Enhanced logging functionality by ensuring the logger correctly handles different log levels.
- Introduced a new useCopyToClipboard hook in the Summary component to encapsulate clipboard copy logic, improving code reusability and clarity.
These changes improve the overall structure and efficiency of the summarization handling and logging processes.
refactor(summarization): update summary content block documentation
- Removed outdated comment regarding the last summary content block in BaseClient.js.
- Added a new comment to clarify the purpose of the findSummaryContentBlock method, ensuring consistency in documentation.
These changes enhance code clarity and maintainability by providing accurate descriptions of the summarization logic.
refactor(summarization): update summary content structure in tests
- Modified the summarization content structure in e2e tests to use an array format for text, aligning with recent changes in summary handling.
- Updated test descriptions to clarify the behavior of context token calculations, ensuring consistency and clarity in the tests.
These changes enhance the accuracy and maintainability of the summarization tests by reflecting the updated content structure.
refactor(summarization): remove legacy E2E test setup and configuration
- Deleted the e2e-setup.js and jest.e2e.config.js files, which contained legacy configurations for E2E tests using real API keys.
- Introduced a new summarization.e2e.ts file that implements comprehensive E2E backend integration tests for the summarization process, utilizing real AI providers and tracking summaries throughout the run.
These changes streamline the testing framework by consolidating E2E tests into a single, more robust file while removing outdated configurations.
refactor(summarization): enhance E2E tests and error handling
- Added a cleanup step to force exit after all tests to manage Redis connections.
- Updated the summarization model to 'claude-haiku-4-5-20251001' for consistency across tests.
- Improved error handling in the processStream function to capture and return processing errors.
- Enhanced logging for cross-run tests and tight context scenarios to provide better insights into test execution.
These changes improve the reliability and clarity of the E2E tests for the summarization process.
refactor(summarization): enhance test coverage for maxContextTokens behavior
- Updated run-summarization.test.ts to include a new test case ensuring that maxContextTokens does not exceed user-defined limits, even when calculated ratios suggest otherwise.
- Modified summarization.e2e.ts to replace legacy UsageMetadata type with a more appropriate type for collectedUsage, improving type safety and clarity in the test setup.
These changes improve the robustness of the summarization tests by validating context token constraints and refining type definitions.
feat(summarization): add comprehensive E2E tests for summarization process
- Introduced a new summarization.e2e.test.ts file that implements extensive end-to-end integration tests for the summarization pipeline, covering the full flow from LibreChat to agents.
- The tests utilize real AI providers and include functionality to track summaries during and between runs.
- Added necessary cleanup steps to manage Redis connections post-tests and ensure proper exit.
These changes enhance the testing framework by providing robust coverage for the summarization process, ensuring reliability and performance under real-world conditions.
fix(service): import logger from winston configuration
- Removed the import statement for logger from '@librechat/data-schemas' and replaced it with an import from '~/config/winston'.
- This change ensures that the logger is correctly sourced from the updated configuration, improving consistency in logging practices across the application.
refactor(summary): simplify Summary component and enhance token display
- Removed the unused `meta` prop from the `SummaryButton` component to streamline its interface.
- Updated the token display logic to use a localized string for better internationalization support.
- Adjusted the rendering of the `meta` information to improve its visibility within the `Summary` component.
These changes enhance the clarity and usability of the Summary component while ensuring better localization practices.
feat(summarization): add maxInputTokens configuration for summarization
- Introduced a new `maxInputTokens` property in the summarization configuration schema to control the amount of conversation context sent to the summarizer, with a default value of 10000.
- Updated the `createRun` function to utilize the new `maxInputTokens` setting, allowing for more flexible summarization based on agent context.
These changes enhance the summarization capabilities by providing better control over input token limits, improving the overall summarization process.
refactor(summarization): simplify maxInputTokens logic in createRun function
- Updated the logic for the `maxInputTokens` property in the `createRun` function to directly use the agent's base context tokens when the resolved summarization configuration does not specify a value.
- This change streamlines the configuration process and enhances clarity in how input token limits are determined for summarization.
These modifications improve the maintainability of the summarization configuration by reducing complexity in the token calculation logic.
feat(summary): enhance Summary component to display meta information
- Updated the SummaryContent component to accept an optional `meta` prop, allowing for additional contextual information to be displayed above the main content.
- Adjusted the rendering logic in the Summary component to utilize the new `meta` prop, improving the visibility of supplementary details.
These changes enhance the user experience by providing more context within the Summary component, making it clearer and more informative.
refactor(summarization): standardize reserveRatio configuration in summarization logic
- Replaced instances of `reserveTokensRatio` with `reserveRatio` in the `createRun` function and related tests to unify the terminology across the codebase.
- Updated the summarization configuration schema to reflect this change, ensuring consistency in how the reserve ratio is defined and utilized.
- Removed the per-agent override logic for summarization configuration, simplifying the overall structure and enhancing clarity.
These modifications improve the maintainability and readability of the summarization logic by standardizing the configuration parameters.
* fix: circular dependency of `~/models`
* chore: update logging scope in agent log handlers
Changed log scope from `[agentus:${data.scope}]` to `[agents:${data.scope}]` in both the callbacks and responses controllers to ensure consistent logging format across the application.
* feat: calibration ratio
* refactor(tests): update summarizationConfig tests to reflect changes in enabled property
Modified tests to check for the new `summarizationEnabled` property instead of the deprecated `enabled` field in the summarization configuration. This change ensures that the tests accurately validate the current configuration structure and behavior of the agents.
* feat(tests): add markSummarizationUsage mock for improved test coverage
Introduced a mock for the markSummarizationUsage function in the responses unit tests to enhance the testing of summarization usage tracking. This addition supports better validation of summarization-related functionalities and ensures comprehensive test coverage for the agents' response handling.
* refactor(tests): simplify event handler setup in createResponse tests
Removed redundant mock implementations for event handlers in the createResponse unit tests, streamlining the setup process. This change enhances test clarity and maintainability while ensuring that the tests continue to validate the correct behavior of usage tracking during on_chat_model_end events.
* refactor(agents): move calibration ratio capture to finally block
Reorganized the logic for capturing the calibration ratio in the AgentClient class to ensure it is executed in the finally block. This change guarantees that the ratio is captured even if the run is aborted, enhancing the reliability of the response message persistence. Removed redundant code and improved clarity in the handling of context metadata.
* refactor(agents): streamline bulk write logic in recordCollectedUsage function
Removed redundant bulk write operations and consolidated document handling in the recordCollectedUsage function. The logic now combines all documents into a single bulk write operation, improving efficiency and reducing error handling complexity. Updated logging to provide consistent error messages for bulk write failures.
* refactor(agents): enhance summarization configuration resolution in createRun function
Streamlined the summarization configuration logic by introducing a base configuration and allowing for overrides from agent-specific settings. This change improves clarity and maintainability, ensuring that the summarization configuration is consistently applied while retaining flexibility for customization. Updated the handling of summarization parameters to ensure proper integration with the agent's model and provider settings.
* refactor(agents): remove unused tokenCountMap and streamline calibration ratio handling
Eliminated the unused tokenCountMap variable from the AgentClient class to enhance code clarity. Additionally, streamlined the logic for capturing the calibration ratio by using optional chaining and a fallback value, ensuring that context metadata is consistently defined. This change improves maintainability and reduces potential confusion in the codebase.
* refactor(agents): extract agent log handler for improved clarity and reusability
Refactored the agent log handling logic by extracting it into a dedicated function, `agentLogHandler`, enhancing code clarity and reusability across different modules. Updated the event handlers in both the OpenAI and responses controllers to utilize the new handler, ensuring consistent logging behavior throughout the application.
* test: add summarization event tests for useStepHandler
Implemented a series of tests for the summarization events in the useStepHandler hook. The tests cover scenarios for ON_SUMMARIZE_START, ON_SUMMARIZE_DELTA, and ON_SUMMARIZE_COMPLETE events, ensuring proper handling of summarization logic, including message accumulation and finalization. This addition enhances test coverage and validates the correct behavior of the summarization process within the application.
* refactor(config): update summarizationTriggerSchema to use enum for type validation
Changed the type of the `type` field in the summarizationTriggerSchema from a string to an enum with a single value 'token_count'. This modification enhances type safety and ensures that only valid types are accepted in the configuration, improving overall clarity and maintainability of the schema.
* test(usage): add bulk write tests for message and summarization usage
Implemented tests for the bulk write functionality in the recordCollectedUsage function, covering scenarios for combined message and summarization usage, summarization-only usage, and message-only usage. These tests ensure correct document handling and token rollup calculations, enhancing test coverage and validating the behavior of the usage tracking logic.
* refactor(Chat): enhance clipboard copy functionality and type definitions in Summary component
Updated the Summary component to improve the clipboard copy functionality by handling clipboard permission errors. Refactored type definitions for SummaryProps to use a more specific type, enhancing type safety. Adjusted the SummaryButton and FloatingSummaryBar components to accept isCopied and onCopy props, promoting better separation of concerns and reusability.
* chore(translations): remove unused "Expand Summary" key from English translations
Deleted the "Expand Summary" key from the English translation file to streamline the localization resources and improve clarity in the user interface. This change helps maintain an organized and efficient translation structure.
* refactor: adjust token counting for Claude model to account for API discrepancies
Implemented a correction factor for token counting when using the Claude model, addressing discrepancies between Anthropic's API and local tokenizer results. This change ensures accurate token counts by applying a scaling factor, improving the reliability of token-related functionalities.
* refactor(agents): implement token count adjustment for Claude model messages
Added a method to adjust token counts for messages processed by the Claude model, applying a correction factor to align with API expectations. This enhancement improves the accuracy of token counting, ensuring reliable functionality when interacting with the Claude model.
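As a rough illustration of the correction described in the two entries above, a minimal sketch follows. The function name, the model-matching rule, the rounding choice, and the factor passed in are all assumptions for illustration; the log does not state the actual value or implementation.

```typescript
// Hypothetical sketch: the factor value, the model check, and the rounding
// are assumptions, not values taken from the LibreChat source.
function applyTokenCorrection(localCount: number, model: string, correction: number): number {
  // Only scale counts for Claude models; other models keep the local count.
  if (!/claude/i.test(model)) {
    return localCount;
  }
  // Round up so the corrected estimate does not undercount a partial token.
  return Math.ceil(localCount * correction);
}
```

Rounding up is a conservative choice here: overestimating slightly is usually safer than overflowing the context window.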
* refactor(agents): token counting for media content in messages
Introduced a new method to estimate token costs for image and document blocks in messages, improving the accuracy of token counting. This enhancement ensures that media content is properly accounted for, particularly for the Claude model, by integrating additional token estimation logic for various content types. Updated the token counting function to utilize this new method, enhancing overall reliability and functionality.
* chore: fix missing import
* fix(agents): clamp baseContextTokens and document reserve ratio change
Prevent negative baseContextTokens when maxOutputTokens exceeds the
context window (misconfigured models). Document the 10%→5% default
reserve ratio reduction introduced alongside summarization.
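The clamp can be pictured with a small sketch, assuming the base budget is simply the context window minus the output reservation. The function and parameter names are hypothetical and not the project's actual signature.

```typescript
// Hypothetical helper: the names and the exact formula are assumptions.
function computeBaseContextTokens(maxContextTokens: number, maxOutputTokens: number): number {
  // A misconfigured model may declare maxOutputTokens larger than its
  // context window; clamp so the budget never goes negative.
  return Math.max(0, maxContextTokens - maxOutputTokens);
}
```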
* fix(agents): include media tokens in hydrated token counts
Add estimateMediaTokensForMessage to createTokenCounter so the hydration
path (used by hydrateMissingIndexTokenCounts) matches the precomputed
path in AgentClient.getTokenCountForMessage. Without this, messages
containing images or documents were systematically undercounted during
hydration, risking context window overflow.
Add 34 unit tests covering all block-type branches of
estimateMediaTokensForMessage.
* fix(agents): include summarization output tokens in usage return value
The returned output_tokens from recordCollectedUsage now reflects all
billed LLM calls (message + summarization). Previously, summarization
completions were billed but excluded from the returned metadata, causing
a discrepancy between what users were charged and what the response
message reported.
* fix(tests): replace process.exit with proper Redis cleanup in e2e test
The summarization E2E test used process.exit(0) to work around a Redis
connection opened at import time, which killed the Jest runner and
bypassed teardown. Use ioredisClient.quit() and keyvRedisClient.disconnect()
for graceful cleanup instead.
* fix(tests): update getConvo imports in OpenAI and response tests
Refactor test files to import getConvo from the main models module instead of the Conversation submodule. This change ensures consistency across tests and simplifies the import structure, enhancing maintainability.
* fix(clients): improve summary text validation in BaseClient
Refactor the summary extraction logic to ensure that only non-empty summary texts are considered valid. This change enhances the robustness of the message processing by utilizing a dedicated method for summary text retrieval, improving overall reliability.
* fix(config): replace z.any() with explicit union in summarization schema
Model parameters (temperature, top_p, etc.) are constrained to
primitive types rather than the policy-violating z.any().
* refactor(agents): deduplicate CLAUDE_TOKEN_CORRECTION constant
Export from the TS source in packages/api and import in the JS client,
eliminating the static class property that could drift out of sync.
* refactor(agents): eliminate duplicate selfProvider in buildAgentContext
selfProvider and provider were derived from the same expression with
different type casts. Consolidated to a single provider variable.
* refactor(agents): extract shared SSE handlers and restrict log levels
- buildSummarizationHandlers() factory replaces triplicated handler
blocks across responses.js and openai.js
- agentLogHandlerObj exported from callbacks.js for consistent reuse
- agentLogHandler restricted to an allowlist of safe log levels
(debug, info, warn, error) instead of accepting arbitrary strings
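A minimal sketch of the allowlist idea in the last bullet, assuming the handler receives the level as an untrusted value. The helper name is hypothetical; only the four level names come from the entry above.

```typescript
type SafeLevel = 'debug' | 'info' | 'warn' | 'error';

// The four levels named in the commit message; anything else is rejected.
const SAFE_LEVELS: ReadonlySet<string> = new Set(['debug', 'info', 'warn', 'error']);

// Hypothetical helper: returns the level only if it is on the allowlist.
function resolveLogLevel(level: unknown): SafeLevel | undefined {
  if (typeof level === 'string' && SAFE_LEVELS.has(level)) {
    return level as SafeLevel;
  }
  return undefined;
}
```

Checking against a `Set` rather than indexing into the logger object avoids accidentally resolving inherited property names such as `constructor`.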
* fix(SSE): batch summarize deltas, add exhaustiveness check, conditional error announcement
- ON_SUMMARIZE_DELTA coalesces rapid-fire renders via requestAnimationFrame
instead of calling setMessages per chunk
- Exhaustive never-check on TStepEvent catches unhandled variants at
compile time when new StepEvents are added
- ON_SUMMARIZE_COMPLETE error announcement only fires when a summary
part was actually present and removed
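The exhaustiveness check in the second bullet follows a common TypeScript pattern, sketched below. The event union and its string values are assumptions for illustration; the real `TStepEvent` has different members.

```typescript
// Hypothetical event union standing in for TStepEvent.
type StepEventSketch =
  | { event: 'on_summarize_start' }
  | { event: 'on_summarize_delta'; text: string }
  | { event: 'on_summarize_complete'; summary?: string };

function describeStep(e: StepEventSketch): string {
  switch (e.event) {
    case 'on_summarize_start':
      return 'start';
    case 'on_summarize_delta':
      return `delta:${e.text}`;
    case 'on_summarize_complete':
      return e.summary ? 'complete' : 'complete:empty';
    default: {
      // If a new variant is added to the union and not handled above,
      // this assignment stops compiling.
      const unhandled: never = e;
      throw new Error(`Unhandled step event: ${JSON.stringify(unhandled)}`);
    }
  }
}
```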
* feat(agents): persist instruction overhead in contextMeta and seed across runs
Extend contextMeta with instructionOverhead and toolCount so the
provider-observed instruction overhead is persisted on the response message
and seeded into the pruner on subsequent runs. This enables the pruner to
use a calibrated budget from the first call instead of waiting for a
provider observation, preventing the ratio collapse caused by local
tokenizer overestimating tool schema tokens.
The seeded overhead is only used when encoding and tool count match
between runs, ensuring stale values from different configurations
are discarded.
* test(agents): enhance OpenAI test mocks for summarization handlers
Updated the OpenAI test suite to include additional mock implementations for summarization handlers, including buildSummarizationHandlers, markSummarizationUsage, and agentLogHandlerObj. This improves test coverage and ensures consistent behavior during testing.
* fix(agents): address review findings for summarization v2
Cancel rAF on unmount to prevent stale Recoil writes from dead
component context. Clear orphaned summarizing:true parts when
ON_SUMMARIZE_COMPLETE arrives without a summary payload. Add null
guard and safe spread to agentLogHandler. Handle Anthropic-format
base64 image/* documents in estimateMediaTokensForMessage. Use
role="region" for expandable summary content. Add .describe() to
contextMeta Zod fields. Extract duplicate usage loop into helper.
* refactor: simplify contextMeta to calibrationRatio + encoding only
Remove instructionOverhead and toolCount from cross-run persistence —
instruction tokens change too frequently between runs (prompt edits,
tool changes) for a persisted seed to be reliable. The intra-run
calibration in the pruner still self-corrects via provider observations.
contextMeta now stores only the tokenizer-bias ratio and encoding,
which are stable across instruction changes.
* test(SSE): enhance useStepHandler tests for ON_SUMMARIZE_COMPLETE behavior
Updated the test for ON_SUMMARIZE_COMPLETE to clarify that it finalizes the existing part with summarizing set to false when the summary is undefined. Added assertions to verify the correct behavior of message updates and the state of summary parts.
* refactor(BaseClient): remove handleContextStrategy and truncateToolCallOutputs functions
Eliminated the handleContextStrategy method from BaseClient to streamline message handling. Also removed the truncateToolCallOutputs function from the prompts module, simplifying the codebase and improving maintainability.
* refactor: add AGENT_DEBUG_LOGGING option and refactor token count handling in BaseClient
Introduced AGENT_DEBUG_LOGGING to .env.example for enhanced debugging capabilities. Refactored token count handling in BaseClient by removing the handleTokenCountMap method and simplifying token count updates. Updated AgentClient to log detailed token count recalculations and adjustments, improving traceability during message processing.
* chore: update dependencies in package-lock.json and package.json files
Bumped versions of several dependencies, including @librechat/agents to ^3.1.62 and various AWS SDK packages to their latest versions. This ensures compatibility and incorporates the latest features and fixes.
* chore: imports order
* refactor: extract summarization config resolution from buildAgentContext
* refactor: rename and simplify summarization configuration shaping function
* refactor: replace AgentClient token counting methods with single-pass pure utility
Extract getTokenCount() and getTokenCountForMessage() from AgentClient
into countFormattedMessageTokens(), a pure function in packages/api that
handles text, tool_call, image, and document content types in one loop.
- Decompose estimateMediaTokensForMessage into block-level helpers
(estimateImageDataTokens, estimateImageBlockTokens, estimateDocumentBlockTokens)
shared by both estimateMediaTokensForMessage and the new single-pass function
- Remove redundant per-call getEncoding() resolution (closure captures once)
- Remove deprecated gpt-3.5-turbo-0301 model branching
- Drop this.getTokenCount guard from BaseClient.sendMessage
* refactor: streamline token counting in createTokenCounter function
Simplified the createTokenCounter function by removing the media token estimation and directly calculating the token count. This change enhances clarity and performance by consolidating the token counting logic into a single pass, while maintaining compatibility with Claude's token correction.
* refactor: simplify summarization configuration types
Removed the AppSummarizationConfig type and directly used SummarizationConfig in the AppConfig interface. This change streamlines the type definitions and enhances consistency across the codebase.
* chore: import order
* fix: summarization event handling in useStepHandler
- Cancel pending summarizeDeltaRaf in clearStepMaps to prevent stale
frames firing after map reset or component unmount
- Move announcePolite('summarize_completed') inside the didFinalize
guard so screen readers only announce when finalization actually occurs
- Remove dead cleanup closure returned from stepHandler useCallback body
that was never invoked by any caller
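The frame-batching and cancellation behaviour described in these bullets can be sketched as follows. This is not the hook's actual code: the factory name is hypothetical, and the scheduler is injected so the sketch runs outside a browser.

```typescript
type Schedule = (cb: () => void) => number;
type Cancel = (handle: number) => void;

// Hypothetical batcher: coalesces chunks and flushes once per scheduled frame.
function createDeltaBatcher(flush: (text: string) => void, schedule: Schedule, cancel: Cancel) {
  let pending = '';
  let handle: number | null = null;
  return {
    push(chunk: string): void {
      pending += chunk;
      if (handle !== null) {
        return; // a frame is already queued; it will pick up this chunk
      }
      handle = schedule(() => {
        handle = null;
        const text = pending;
        pending = '';
        flush(text);
      });
    },
    // Call on reset or unmount so a stale frame never fires.
    cancel(): void {
      if (handle !== null) {
        cancel(handle);
        handle = null;
      }
      pending = '';
    },
  };
}
```

In a browser, `schedule` and `cancel` would be `requestAnimationFrame` and `cancelAnimationFrame`.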
* fix: estimate tokens for non-PDF/non-image base64 document blocks
Previously estimateDocumentBlockTokens returned 0 for unrecognized MIME
types (e.g. text/plain, application/json), silently underestimating
context budget. Fall back to character-based heuristic or countTokens.
* refactor: return cloned usage from markSummarizationUsage
Avoid mutating LangChain's internal usage_metadata object by returning
a shallow clone with the usage_type tag. Update all call sites in
callbacks, openai, and responses controllers to use the returned value.
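A minimal sketch of the clone-and-tag approach, assuming a simple usage shape. The interface, the helper name `tagUsage`, and the tag value used in the usage note are hypothetical; the real `markSummarizationUsage` signature is not shown in this log.

```typescript
// Assumed usage shape for illustration only.
interface UsageMetadataSketch {
  input_tokens: number;
  output_tokens: number;
}

interface TaggedUsageSketch extends UsageMetadataSketch {
  usage_type: string;
}

// Returns a shallow copy carrying the tag; the caller's object is untouched.
function tagUsage(usage: UsageMetadataSketch, usageType: string): TaggedUsageSketch {
  return { ...usage, usage_type: usageType };
}
```

Call sites then use the returned value, for example `const tagged = tagUsage(usage, 'summarization')`, instead of relying on in-place mutation.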
* refactor: consolidate debug logging loops in buildMessages
Merge the two sequential O(n) debug-logging passes over orderedMessages
into a single pass inside the map callback where all data is available.
* refactor: narrow SummaryContentPart.content type
Replace broad Agents.MessageContentComplex[] with the specific
Array<{ type: ContentTypes.TEXT; text: string }> that all producers
and consumers already use, improving compile-time safety.
* refactor: use single output array in recordCollectedUsage
Have processUsageGroup append to a shared array instead of returning
separate arrays that are spread into a third, reducing allocations.
* refactor: use for...in in hydrateMissingIndexTokenCounts
Replace Object.entries with for...in to avoid allocating an
intermediate tuple array during token map hydration.
|
||
|
|
dd72b7b17e
|
🔄 chore: Consolidate agent model imports across middleware and tests from rebase
- Updated imports for `createAgent` and `getAgent` to streamline access from a unified `~/models` path.
- Enhanced test files to reflect the new import structure, ensuring consistency and maintainability across the codebase.
- Improved clarity by removing redundant imports and aligning with the latest model updates.
|
||
|
|
8ba2bde5c1
|
📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830)
* chore: move database model methods to /packages/data-schemas
* chore: add TypeScript ESLint rule to warn on unused variables
* refactor: model imports to streamline access
- Consolidated model imports across various files to improve code organization and reduce redundancy.
- Updated imports for models such as Assistant, Message, Conversation, and others to a unified import path.
- Adjusted middleware and service files to reflect the new import structure, ensuring functionality remains intact.
- Enhanced test files to align with the new import paths, maintaining test coverage and integrity.
* chore: migrate database models to packages/data-schemas and refactor all direct Mongoose Model usage outside of data-schemas
* test: update agent model mocks in unit tests
- Added `getAgent` mock to `client.test.js` to enhance test coverage for agent-related functionality.
- Removed redundant `getAgent` and `getAgents` mocks from `openai.spec.js` and `responses.unit.spec.js` to streamline test setup and reduce duplication.
- Ensured consistency in agent mock implementations across test files.
* fix: update types in data-schemas
* refactor: enhance type definitions in transaction and spending methods
- Updated type definitions in `checkBalance.ts` to use specific request and response types.
- Refined `spendTokens.ts` to utilize a new `SpendTxData` interface for better clarity and type safety.
- Improved transaction handling in `transaction.ts` by introducing `TransactionResult` and `TxData` interfaces, ensuring consistent data structures across methods.
- Adjusted unit tests in `transaction.spec.ts` to accommodate new type definitions and enhance robustness.
* refactor: streamline model imports and enhance code organization
- Consolidated model imports across various controllers and services to a unified import path, improving code clarity and reducing redundancy.
- Updated multiple files to reflect the new import structure, ensuring all functionalities remain intact.
- Enhanced overall code organization by removing duplicate import statements and optimizing the usage of model methods.
* feat: implement loadAddedAgent and refactor agent loading logic
- Introduced `loadAddedAgent` function to handle loading agents from added conversations, supporting multi-convo parallel execution.
- Created a new `load.ts` file to encapsulate agent loading functionalities, including `loadEphemeralAgent` and `loadAgent`.
- Updated the `index.ts` file to export the new `load` module instead of the deprecated `loadAgent`.
- Enhanced type definitions and improved error handling in the agent loading process.
- Adjusted unit tests to reflect changes in the agent loading structure and ensure comprehensive coverage.
* refactor: enhance balance handling with new update interface
- Introduced `IBalanceUpdate` interface to streamline balance update operations across the codebase.
- Updated `upsertBalanceFields` method signatures in `balance.ts`, `transaction.ts`, and related tests to utilize the new interface for improved type safety.
- Adjusted type imports in `balance.spec.ts` to include `IBalanceUpdate`, ensuring consistency in balance management functionalities.
- Enhanced overall code clarity and maintainability by refining type definitions related to balance operations.
* feat: add unit tests for loadAgent functionality and enhance agent loading logic
- Introduced comprehensive unit tests for the `loadAgent` function, covering various scenarios including null and empty agent IDs, loading of ephemeral agents, and permission checks.
- Enhanced the `initializeClient` function by moving `getConvoFiles` to the correct position in the database method exports, ensuring proper functionality.
- Improved test coverage for agent loading, including handling of non-existent agents and user permissions.
* chore: reorder memory method exports for consistency
- Moved `deleteAllUserMemories` to the correct position in the exported memory methods, ensuring a consistent and logical order of method exports in `memory.ts`.
|
||
|
|
8e8fb01d18
|
🧱 fix: Enforce Agent Access Control on Context and OCR File Loading (#12253)
* 🔏 fix: Apply agent access control filtering to context/OCR resource loading
The context/OCR file path in primeResources fetched files by file_id
without applying filterFilesByAgentAccess, unlike the file_search and
execute_code paths. Add filterFiles dependency injection to primeResources
and invoke it after getFiles to enforce consistent access control.
* fix: Wire filterFilesByAgentAccess into all agent initialization callers
Pass the filterFilesByAgentAccess function from the JS layer into the TS
initializeAgent → primeResources chain via dependency injection, covering
primary, handoff, added-convo, and memory agent init paths.
* test: Add access control filtering tests for primeResources
Cover filterFiles invocation with context/OCR files, verify filtering
rejects inaccessible files, and confirm graceful fallback when filterFiles,
userId, or agentId are absent.
* fix: Guard filterFilesByAgentAccess against ephemeral agent IDs
Ephemeral agents have no DB document, so getAgent returns null and the
access map defaults to all-false, silently blocking all non-owned files.
Short-circuit with isEphemeralAgentId to preserve the pass-through
behavior for inline-built agents (memory, tool agents).
* fix: Clean up resources.ts and JS caller import order
Remove redundant optional chain on req.user.role inside user-guarded
block, update primeResources JSDoc with filterFiles and agentId params,
and reorder JS imports to longest-to-shortest per project conventions.
* test: Strengthen OCR assertion and add filterFiles error-path test
Use toHaveBeenCalledWith for the OCR filtering test to verify exact
arguments after the OCR→context merge step. Add test for filterFiles
rejection to verify graceful degradation (logs error, returns original
tool_resources).
* fix: Correct import order in addedConvo.js and initialize.js
Sort by total line length descending: loadAddedAgent (91) before
filterFilesByAgentAccess (84), loadAgentTools (91) before
filterFilesByAgentAccess (84).
* test: Add unit tests for filterFilesByAgentAccess and hasAccessToFilesViaAgent
Cover every branch in permissions.js: ephemeral agent guard, missing
userId/agentId/files early returns, all-owned short-circuit, mixed
owned + non-owned with VIEW/no-VIEW, agent-not-found fail-closed,
author path scoped to attached files, EDIT gate on delete, DB error
fail-closed, and agent with no tool_resources.
* test: Cover file.user undefined/null in permissions spec
Files with no user field fall into the non-owned path and get run
through hasAccessToFilesViaAgent. Add two cases: attached file with
no user field is returned, unattached file with no user field is
excluded.
|
||
|
|
6f87b49df8
|
🛂 fix: Enforce Actions Capability Gate Across All Event-Driven Tool Loading Paths (#12252)
* fix: gate action tools by actions capability in all code paths
Extract resolveAgentCapabilities helper to eliminate 3x-duplicated
capability resolution. Apply early action-tool filtering in both
loadToolDefinitionsWrapper and loadAgentTools non-definitions path.
Gate loadActionToolsForExecution in loadToolsForExecution behind an
actionsEnabled parameter with a cache-based fallback. Replace the late
capability guard in loadAgentTools with a hasActionTools check to avoid
unnecessary loadActionSets DB calls and duplicate warnings.
* fix: thread actionsEnabled through InitializedAgent type
Add actionsEnabled to the loadTools callback return type,
InitializedAgent, and the initializeAgent destructuring/return so
callers can forward the resolved value to loadToolsForExecution without
redundant getEndpointsConfig cache lookups.
* fix: pass actionsEnabled from callers to loadToolsForExecution
Thread actionsEnabled through the agentToolContexts map in initialize.js
(primary and handoff agents) and through primaryConfig in the openai.js
and responses.js controllers, avoiding per-tool-call capability
re-resolution on the hot path.
* test: add regression tests for action capability gating
Test the real exported functions (resolveAgentCapabilities,
loadAgentTools, loadToolsForExecution) with mocked dependencies instead
of shadow re-implementations. Covers definition filtering, execution
gating, actionsEnabled param forwarding, and fallback capability
resolution.
* test: use Constants.EPHEMERAL_AGENT_ID in ephemeral fallback test
Replaces a string guess with the canonical constant to avoid fragility
if the ephemeral detection heuristic changes.
* fix: populate agentToolContexts for addedConvo parallel agents
After processAddedConvo returns, backfill agentToolContexts for any
agents in agentConfigs not already present, so ON_TOOL_EXECUTE for
added-convo agents receives actionsEnabled instead of falling back to a
per-call cache lookup.
|
||
|
|
bcf45519bd
|
🪪 fix: Enforce VIEW ACL on Agent Edge References at Write and Runtime (#12246)
* 🛡️ fix: Enforce ACL checks on handoff edge and added-convo agent loading
Edge-linked agents and added-convo agents were fetched by ID via
getAgent without verifying the requesting user's access permissions.
This allowed an authenticated user to reference another user's private
agent in edges or addedConvo and have it initialized at runtime.
Add checkPermission(VIEW) gate in processAgent before initializing any
handoff agent, and in processAddedConvo for non-ephemeral added agents.
Unauthorized agents are logged and added to skippedAgentIds so
orphaned-edge filtering removes them cleanly.
* 🛡️ fix: Validate edge agent access at agent create/update time
Reject agent create/update requests that reference agents in edges the
requesting user cannot VIEW. This provides early feedback and prevents
storing unauthorized agent references as defense-in-depth alongside the
runtime ACL gate in processAgent.
Add collectEdgeAgentIds utility to extract all unique agent IDs from an
edge array, and validateEdgeAgentAccess helper in the v1 handler.
* 🧪 test: Improve ACL gate test coverage and correctness
- Add processAgent ACL gate tests for initializeClient (skip/allow handoff agents)
- Fix addedConvo.spec.js to mock loadAddedAgent directly instead of getAgent
- Seed permMap with ownedAgent VIEW bits in v1.spec.js update-403 test
* 🧹 chore: Remove redundant addedConvo ACL gate (now in middleware)
PR #12243 moved the addedConvo agent ACL check upstream into
canAccessAgentFromBody middleware, making the runtime check in
processAddedConvo and its spec redundant.
* 🧪 test: Rewrite processAgent ACL test with real DB and minimal mocking
Replace heavy mock-based test (12 mocks, Providers.XAI crash) with
MongoMemoryServer-backed integration test that exercises real getAgent,
checkPermission, and AclEntry; only external I/O (initializeAgent,
ToolService, AgentClient) remains mocked. Load edge utilities directly
from packages/api/src/agents/edges to sidestep the config.ts barrel.
* 🧪 fix: Use requireActual spread for @librechat/agents and @librechat/api mocks
The Providers.XAI crash was caused by mocking @librechat/agents with a
minimal replacement object, breaking the @librechat/api initialization
chain. Match the established pattern from client.test.js and
recordCollectedUsage.spec.js: spread jest.requireActual for both
packages, overriding only the functions under test.
|
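The `collectEdgeAgentIds` utility mentioned in this commit extracts the unique agent IDs from an edge array. A minimal sketch of that idea follows; the edge shape (`from`/`to` as a string or string array) and the `Sketch`-suffixed name are assumptions, not the project's actual types or implementation.

```typescript
// Assumed edge shape: the real LibreChat edge type may differ.
interface AgentEdgeSketch {
  from: string | string[];
  to: string | string[];
}

// Hypothetical stand-in for collectEdgeAgentIds.
function collectEdgeAgentIdsSketch(edges: AgentEdgeSketch[] = []): Set<string> {
  const ids = new Set<string>();
  for (const edge of edges) {
    for (const endpoint of [edge.from, edge.to]) {
      // Normalize single IDs and ID lists to one code path.
      const list = Array.isArray(endpoint) ? endpoint : [endpoint];
      for (const id of list) {
        ids.add(id);
      }
    }
  }
  return ids;
}
```

Each collected ID could then be checked for VIEW access before the create or update request is accepted.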
||
|
|
43ff3f8473
|
💸 fix: Model Identifier Edge Case in Agent Transactions (#11988)
* 🔧 fix: Add skippedAgentIds tracking in initializeClient error handling
- Enhanced error handling in the initializeClient function to track agent IDs that are skipped during processing. This addition improves the ability to monitor and debug issues related to agent initialization failures.
* 🔧 fix: Update model assignment in BaseClient to use instance model
- Modified the model assignment in BaseClient to use `this.model` instead of `responseMessage.model`, because when using agents `responseMessage.model` holds the agent ID rather than the model itself. This change improves code clarity and correctness in the context of agent usage.
* 🔧 test: Add tests for recordTokenUsage model assignment in BaseClient
- Introduced new test cases in BaseClient to ensure that the correct model is passed to the recordTokenUsage method, verifying that it uses this.model instead of the agent ID from responseMessage.model. This enhances the accuracy of token usage tracking in agent scenarios.
- Improved error handling in the initializeClient function to log errors when processing agents, ensuring that skipped agent IDs are tracked for better debugging.
|
||
|
|
6169d4f70b
|
🚦 fix: 404 JSON Responses for Unmatched API Routes (#11976)
* feat: Implement 404 JSON response for unmatched API routes
- Added middleware to return a 404 JSON response with a message for undefined API routes.
- Updated SPA fallback to serve index.html for non-API unmatched routes.
- Ensured the error handler is positioned correctly as the last middleware in the stack.
* fix: Enhance logging in BaseClient for better token usage tracking
- Updated `getTokenCountForResponse` to log the messageId of the response for improved debugging.
- Enhanced userMessage logging to include messageId, tokenCount, and conversationId for clearer context during token count mapping.
* chore: Improve logging in processAddedConvo for better debugging
- Updated the logging structure in the processAddedConvo function to provide clearer context when processing added conversations.
- Removed redundant logging and enhanced the output to include model, agent ID, and endpoint details for improved traceability.
* chore: Enhance logging in BaseClient for improved token usage tracking
- Added debug logging in the BaseClient to track response token usage, including messageId, model, promptTokens, and completionTokens for better debugging and traceability.
* chore: Enhance logging in MemoryAgent for improved context
- Updated logging in the MemoryAgent to include userId, conversationId, messageId, and provider details for better traceability during memory processing.
- Adjusted log messages to provide clearer context when content is returned or not, aiding in debugging efforts.
* chore: Refactor logging in initializeClient for improved clarity
- Consolidated multiple debug log statements into a single message that provides a comprehensive overview of the tool context being stored for the primary agent, including the number of tools and the size of the tool registry. This enhances traceability and debugging efficiency.
* feat: Implement centralized 404 handling for unmatched API routes
- Introduced a new middleware function `apiNotFound` to standardize 404 JSON responses for undefined API routes.
- Updated the server configuration to utilize the new middleware, enhancing code clarity and maintainability.
- Added tests to ensure correct 404 responses for various non-GET methods and the `/api` root path.
* fix: Enhance logging in apiNotFound middleware for improved safety
- Updated the `apiNotFound` function to sanitize the request path by replacing problematic characters and limiting its length, ensuring safer logging of 404 errors.
* refactor: Move apiNotFound middleware to a separate file for better organization
- Extracted the `apiNotFound` function from the error middleware into its own file, enhancing code organization and maintainability.
- Updated the index file to export the new `notFound` middleware, ensuring it is included in the middleware stack.
* docs: Add comment to clarify usage of unsafeChars regex in notFound middleware
- Included a comment in the notFound middleware file to explain that the unsafeChars regex is safe to reuse with .replace() at the module scope, as it does not retain lastIndex state.
|
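The path sanitization described for `apiNotFound` could look roughly like the sketch below. The character class, the replacement character, and the length cap are assumptions; the log names the `unsafeChars` regex and the technique, not the actual pattern or limit.

```typescript
// Assumed pattern: control characters that could forge or break log lines.
const unsafeChars = /[\u0000-\u001f\u007f]/g;

// Assumed cap; the real limit is not stated in the commit message.
const MAX_LOGGED_PATH_LENGTH = 200;

// Hypothetical helper for preparing a request path before logging a 404.
function sanitizePathForLog(path: string): string {
  // String.prototype.replace resets lastIndex on a global regex, so the
  // module-scope pattern can be reused safely across calls.
  return path.replace(unsafeChars, '_').slice(0, MAX_LOGGED_PATH_LENGTH);
}
```

The reuse caveat applies to `.test()` and `.exec()` on global regexes, which do carry `lastIndex` between calls; `.replace()` does not.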
||
|
|
b0a32b7d6d
|
👻 fix: Prevent Async Title Generation From Recreating Deleted Conversations (#11797)
* 🐛 fix: Prevent deleted conversations from being recreated by async title generation
When a user deletes a chat while auto-generated title is still in progress,
`saveConvo` with `upsert: true` recreates the deleted conversation as a ghost
entry with only a title and no messages. This adds a `noUpsert` metadata option
to `saveConvo` and uses it in both agent and assistant title generation paths,
so the title save is skipped if the conversation no longer exists.
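The effect of the `noUpsert` option can be illustrated with an in-memory sketch. The store, the types, and the `Sketch`-suffixed function are hypothetical stand-ins; the real `saveConvo` works against the database and has a different signature.

```typescript
// Assumed minimal conversation shape for illustration.
interface ConvoSketch {
  conversationId: string;
  title?: string;
}

// Hypothetical stand-in for saveConvo backed by a Map instead of a database.
function saveConvoSketch(
  store: Map<string, ConvoSketch>,
  update: ConvoSketch,
  options: { noUpsert?: boolean } = {},
): ConvoSketch | null {
  const existing = store.get(update.conversationId);
  if (!existing && options.noUpsert) {
    // The conversation was deleted while the title was being generated:
    // skip the write instead of recreating a ghost entry.
    return null;
  }
  const next: ConvoSketch = { ...existing, ...update };
  store.set(update.conversationId, next);
  return next;
}
```

Without the option the default upsert behaviour is unchanged, which matches the tests described in the next entry.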
* test: conversation creation logic with noUpsert option
Added new tests to validate the behavior of the `saveConvo` function with the `noUpsert` option. This includes scenarios where a conversation should not be created if it doesn't exist, updating an existing conversation when `noUpsert` is true, and ensuring that upsert behavior remains the default when `noUpsert` is not provided. These changes improve the flexibility and reliability of conversation management.
* test: Clean up Conversation.spec.js by removing commented-out code
Removed unnecessary comments from the Conversation.spec.js test file to improve readability and maintainability. This includes comments related to database verification and temporary conversation handling, streamlining the test cases for better clarity.
|
||
|
|
5eb0a3ad90
|
⚠️ chore: Remove Deprecated forcePrompt setting (#11622)
- Removed `forcePrompt` parameter from various configuration files including `librechat.example.yaml`, `initialize.js`, `values.yaml`, and `initialize.ts`.
- This change simplifies the configuration by eliminating unused options, enhancing clarity and maintainability across the codebase.
|
||
|
|
f34052c6bb
|
🌙 feat: Moonshot Provider Support (#11621)
* ✨ feat: Add Moonshot Provider Support
- Updated the `isKnownCustomProvider` function to include `Providers.MOONSHOT` in the list of recognized custom providers.
- Enhanced the `providerConfigMap` to initialize `MOONSHOT` with the custom initialization function.
- Introduced `MoonshotIcon` component for visual representation in the UI, integrated into the `UnknownIcon` component.
- Updated various files across the API and client to support the new `MOONSHOT` provider, including configuration and response handling.
This update expands the capabilities of the application by integrating support for the Moonshot provider, enhancing both backend and frontend functionalities.
* ✨ feat: Add Moonshot/Kimi Model Pricing and Tests
- Introduced new pricing configurations for Moonshot and Kimi models in `tx.js`, including various model variations and their respective prompt and completion values.
- Expanded unit tests in `tx.spec.js` and `tokens.spec.js` to validate pricing and token limits for the newly added Moonshot/Kimi models, ensuring accurate calculations and handling of model variations.
- Updated utility functions to support the new model structures and ensure compatibility with existing functionalities.
This update enhances the pricing model capabilities and improves test coverage for the Moonshot/Kimi integration.
* ✨ feat: Enhance Token Pricing Documentation and Configuration
- Added comprehensive documentation for token pricing configuration in `tx.js` and `tokens.ts`, emphasizing the importance of key ordering for pattern matching.
- Clarified the process for defining base and specific patterns to ensure accurate pricing retrieval based on model names.
- Improved code comments to guide future additions of model families, enhancing maintainability and understanding of the pricing structure.
This update improves the clarity and usability of the token pricing configuration, facilitating better integration and future enhancements.
* chore: import order
* chore: linting
|
||
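The key-ordering note in the Moonshot pricing commit can be illustrated with a minimal sketch. It assumes the lookup scans the table's keys from last to first and returns the first key contained in the model name, so specific patterns must be listed after their base pattern. The function name `matchPricingKey`, the table, and the rates are hypothetical and are not taken from `tx.js`.

```javascript
// Hypothetical pricing table: the values are placeholders, not real rates.
// Base pattern first, more specific patterns after it.
const examplePricing = {
  kimi: { prompt: 1, completion: 2 },
  'kimi-k2': { prompt: 3, completion: 4 },
  'kimi-k2-thinking': { prompt: 5, completion: 6 },
};

// Assumed matching rule: walk the keys in reverse insertion order and
// return the first key that appears inside the (lowercased) model name.
function matchPricingKey(modelName, table) {
  const name = modelName.toLowerCase();
  const keys = Object.keys(table);
  for (let i = keys.length - 1; i >= 0; i--) {
    if (name.includes(keys[i])) {
      return keys[i];
    }
  }
  return undefined;
}
```

Under this rule, placing `kimi` after `kimi-k2-thinking` would make the base pattern shadow the specific one, which is why the ordering matters.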
|
|
40c5804ed6
|
🗑️ chore: Remove Dev Artifacts for Deferred Tools Capability (#11601)
* chore: remove TOOL_CLASSIFICATION_AGENT_IDS env var blocking deferred tools
The TOOL_CLASSIFICATION_AGENT_IDS environment variable was gating Tool Search creation even when agents had deferred tools configured via the UI (agent tool_options). This caused agents with all MCP tools set to defer_loading to have no tools available, since the Tool Search tool wasn't being created.
- Remove isAgentAllowedForClassification function and its usage
- Remove early return that blocked classification features
- Update JSDoc comments to reflect current behavior
- Remove related tests from classification.spec.ts
Agent-level deferred_tools configuration now works correctly without requiring env var configuration.
* chore: streamline classification tests and remove unused functions
- Removed deprecated tests related to environment variable configurations for tool classification.
- Simplified the classification.spec.ts file by retaining only relevant tests for the current functionality.
- Updated imports and adjusted test cases to reflect the changes in the classification module.
- Enhanced clarity in the classification utility functions by removing unnecessary comments and code.
* refactor: update ToolService to use AgentConstants for tool identification
- Replaced direct references to Constants with AgentConstants in ToolService.js for better consistency and maintainability.
- Enhanced logging in loadToolsForExecution and initializeClient to include toolRegistry size, improving debugging capabilities.
- Updated import statements in run.ts to include Constants, ensuring proper tool name checks during execution.
* chore: reorganize imports and enhance classification tests
- Updated import statements in classification.spec.ts for better clarity and organization.
- Reintroduced the getServerNameFromTool function to improve tool classification logic.
- Removed unused imports and functions to streamline the test file, enhancing maintainability.
* feat: enhance tool registry creation with additional properties
- Added toolType property to tool definitions in buildToolRegistryFromAgentOptions for improved classification.
- Included serverName assignment in tool definitions to enhance tool identification and management. |
||
|
|
5af1342dbb
|
🦥 refactor: Event-Driven Lazy Tool Loading (#11588)
* refactor: json schema tools with lazy loading
- Added LocalToolExecutor class for lazy loading and caching of tools during execution.
- Introduced ToolExecutionContext and ToolExecutor interfaces for better type management.
- Created utility functions to generate tool proxies with JSON schema support.
- Added ExtendedJsonSchema type for enhanced schema definitions.
- Updated existing toolkits to utilize the new schema and executor functionalities.
- Introduced a comprehensive tool definitions registry for managing various tool schemas.
chore: update @librechat/agents to version 3.1.2
refactor: enhance tool loading optimization and classification
- Improved the loadAgentToolsOptimized function to utilize a proxy pattern for all tools, enabling deferred execution and reducing overhead.
- Introduced caching for tool instances and refined tool classification logic to streamline tool management.
- Updated the handling of MCP tools to improve logging and error reporting for missing tools in the cache.
- Enhanced the structure of tool definitions to support better classification and integration with existing tools.
refactor: modularize tool loading and enhance optimization
- Moved the loadAgentToolsOptimized function to a new service file for better organization and maintainability.
- Updated the ToolService to utilize the new service for optimized tool loading, improving code clarity.
- Removed legacy tool loading methods and streamlined the tool loading process to enhance performance and reduce complexity.
- Introduced feature flag handling for optimized tool loading, allowing for easier toggling of this functionality.
refactor: replace loadAgentToolsWithFlag with loadAgentTools in tool loader
refactor: enhance MCP tool loading with proxy creation and classification
refactor: optimize MCP tool loading by grouping tools by server
- Introduced a Map to group cached tools by server name, improving the organization of tool data.
- Updated the createMCPProxyTool function to accept server name directly, enhancing clarity.
- Refactored the logic for handling MCP tools, streamlining the process of creating proxy tools for classification.
refactor: enhance MCP tool loading and proxy creation
- Added functionality to retrieve MCP server tools and reinitialize servers if necessary, improving tool availability.
- Updated the tool loading logic to utilize a Map for organizing tools by server, enhancing clarity and performance.
- Refactored the createToolProxy function to ensure a default response format, streamlining tool creation.
refactor: update createToolProxy to ensure consistent response format
- Modified the createToolProxy function to await the executor's execution and validate the result format.
- Ensured that the function returns a default response structure when the result is not an array of two elements, enhancing reliability in tool proxy creation.
refactor: ToolExecutionContext with toolCall property
- Added toolCall property to ToolExecutionContext interface for improved context handling during tool execution.
- Updated LocalToolExecutor to include toolCall in the runnable configuration, allowing for more flexible tool invocation.
- Modified createToolProxy to pass toolCall from the configuration, ensuring consistent context across tool executions.
refactor: enhance event-driven tool execution and logging
- Introduced ToolExecuteOptions for improved handling of event-driven tool execution, allowing for parallel execution of tool calls.
- Updated getDefaultHandlers to include support for ON_TOOL_EXECUTE events, enhancing the flexibility of tool invocation.
- Added detailed logging in LocalToolExecutor to track tool loading and execution metrics, improving observability and debugging capabilities.
- Refactored initializeClient to integrate event-driven tool loading, ensuring compatibility with the new execution model.
chore: update @librechat/agents to version 3.1.21
refactor: remove legacy tool loading and executor components
- Eliminated the loadAgentToolsWithFlag function, simplifying the tool loading process by directly using loadAgentTools.
- Removed the LocalToolExecutor and related executor components to streamline the tool execution architecture.
- Updated ToolService and related files to reflect the removal of deprecated features, enhancing code clarity and maintainability.
refactor: enhance tool classification and definitions handling
- Updated the loadAgentTools function to return toolDefinitions alongside toolRegistry, improving the structure of tool data returned to clients.
- Removed the convertRegistryToDefinitions function from the initialize.js file, simplifying the initialization process.
- Adjusted the buildToolClassification function to ensure toolDefinitions are built and returned simultaneously with the toolRegistry, enhancing efficiency in tool management.
- Updated type definitions in initialize.ts to include toolDefinitions, ensuring consistency across the codebase.
refactor: implement event-driven tool execution handler
- Introduced createToolExecuteHandler function to streamline the handling of ON_TOOL_EXECUTE events, allowing for parallel execution of tool calls.
- Updated getDefaultHandlers to utilize the new handler, simplifying the event-driven architecture.
- Added handlers.ts file to encapsulate tool execution logic, improving code organization and maintainability.
- Enhanced OpenAI handlers to integrate the new tool execution capabilities, ensuring consistent event handling across the application.
refactor: integrate event-driven tool execution options
- Added toolExecuteOptions to support event-driven tool execution in OpenAI and responses controllers, enhancing flexibility in tool handling.
- Updated handlers to utilize createToolExecuteHandler, allowing for streamlined execution of tools during agent interactions.
- Refactored service dependencies to include toolExecuteOptions, ensuring consistent integration across the application.
refactor: enhance tool loading with definitionsOnly parameter
- Updated createToolLoader and loadAgentTools functions to include a definitionsOnly parameter, allowing for the retrieval of only serializable tool definitions in event-driven mode.
- Adjusted related interfaces and documentation to reflect the new parameter, improving clarity and flexibility in tool management.
- Ensured compatibility across various components by integrating the definitionsOnly option in the initialization process.
refactor: improve agent tool presence check in initialization
- Added a check for tool presence using a new hasAgentTools variable, which evaluates both structuredTools and toolDefinitions.
- Updated the conditional logic in the agent initialization process to utilize the hasAgentTools variable, enhancing clarity and maintainability in tool management.
refactor: enhance agent tool extraction to support tool definitions
- Updated the extractMCPServers function to handle both tool instances and serializable tool definitions, improving flexibility in agent tool management.
- Added a new property toolDefinitions to the AgentWithTools type for better integration of event-driven mode.
- Enhanced documentation to clarify the function's capabilities in extracting unique MCP server names from both tools and tool definitions.
refactor: enhance tool classification and registry building
- Added serverName property to ToolDefinition for improved tool identification.
- Introduced buildToolRegistry function to streamline the creation of tool registries based on MCP tool definitions and agent options.
- Updated buildToolClassification to utilize the new registry building logic, ensuring basic definitions are returned even when advanced classification features are not allowed.
- Enhanced documentation and logging for clarity in tool classification processes.
refactor: update @librechat/agents dependency to version 3.1.22
fix: expose loadTools function in ToolService
- Added loadTools function to the exported module in ToolService.js, enhancing the accessibility of tool loading functionality.
chore: remove configurable options from tool execute options in OpenAI controller
refactor: enhance tool loading mechanism to utilize agent-specific context
chore: update @librechat/agents dependency to version 3.1.23
fix: simplify result handling in createToolExecuteHandler
* refactor: loadToolDefinitions for efficient tool loading in event-driven mode
* refactor: replace legacy tool loading with loadToolsForExecution in OpenAI and responses controllers
- Updated OpenAIChatCompletionController and createResponse functions to utilize loadToolsForExecution for improved tool loading.
- Removed deprecated loadToolsLegacy references, streamlining the tool execution process.
- Enhanced tool loading options to include agent-specific context and configurations.
* refactor: enhance tool loading and execution handling
- Introduced loadActionToolsForExecution function to streamline loading of action tools, improving organization and maintainability.
- Updated loadToolsForExecution to handle both regular and action tools, optimizing the tool loading process.
- Added detailed logging for missing tools in createToolExecuteHandler, enhancing error visibility.
- Refactored tool definitions to normalize action tool names, improving consistency in tool management.
* refactor: enhance built-in tool definitions loading
- Updated loadToolDefinitions to include descriptions and parameters from the tool registry for built-in tools, improving the clarity and usability of tool definitions.
- Integrated getToolDefinition to streamline the retrieval of tool metadata, enhancing the overall tool management process.
* feat: add action tool definitions loading to tool service
- Introduced getActionToolDefinitions function to load action tool definitions based on agent ID and tool names, enhancing the tool loading process.
- Updated loadToolDefinitions to integrate action tool definitions, allowing for better management and retrieval of action-specific tools.
- Added comprehensive tests for action tool definitions to ensure correct loading and parameter handling, improving overall reliability and functionality.
* chore: update @librechat/agents dependency to version 3.1.26
* refactor: add toolEndCallback to handle tool execution results
* fix: tool definitions and execution handling
- Introduced native tools (execute_code, file_search, web_search) to the tool service, allowing for better integration and management of these tools.
- Updated isBuiltInTool function to include native tools in the built-in check, improving tool recognition.
- Added comprehensive tests for loading parameters of native tools, ensuring correct functionality and parameter handling.
- Enhanced tool definitions registry to include new agent tool definitions, streamlining tool retrieval and management.
* refactor: enhance tool loading and execution context
- Added toolRegistry to the context for OpenAIChatCompletionController and createResponse functions, improving tool management.
- Updated loadToolsForExecution to utilize toolRegistry for better integration of programmatic tools and tool search functionalities.
- Enhanced the initialization process to include toolRegistry in agent context, streamlining tool access and configuration.
- Refactored tool classification logic to support event-driven execution, ensuring compatibility with new tool definitions.
* chore: add request duration logging to OpenAI and Responses controllers
- Introduced logging for request start and completion times in OpenAIChatCompletionController and createResponse functions.
- Calculated and logged the duration of each request, enhancing observability and performance tracking.
- Improved debugging capabilities by providing detailed logs for both streaming and non-streaming responses.
* chore: update @librechat/agents dependency to version 3.1.27
* refactor: implement buildToolSet function for tool management
- Introduced buildToolSet function to streamline the creation of tool sets from agent configurations, enhancing tool management across various controllers.
- Updated AgentClient, OpenAIChatCompletionController, and createResponse functions to utilize buildToolSet, improving consistency in tool handling.
- Added comprehensive tests for buildToolSet to ensure correct functionality and edge case handling, enhancing overall reliability.
* refactor: update import paths for ToolExecuteOptions and createToolExecuteHandler
* fix: update GoogleSearch.js description for maximum search results
- Changed the default maximum number of search results from 10 to 5 in the Google Search JSON schema description, ensuring accurate documentation of the expected behavior.
* chore: remove deprecated Browser tool and associated assets
- Deleted the Browser tool definition from manifest.json, which included its name, plugin key, description, and authentication configuration.
- Removed the web-browser.svg asset as it is no longer needed following the removal of the Browser tool.
* fix: ensure tool definitions are valid before processing
- Added a check to verify the existence of tool definitions in the registry before accessing their properties, preventing potential runtime errors.
- Updated the loading logic for built-in tool definitions to ensure that only valid definitions are pushed to the built-in tool definitions array.
* fix: extend ExtendedJsonSchema to support 'null' type and nullable enums
- Updated the ExtendedJsonSchema type to include 'null' as a valid type option.
- Modified the enum property to accept an array of values that can include strings, numbers, booleans, and null, enhancing schema flexibility.
* test: add comprehensive tests for tool definitions loading and registry behavior
- Implemented tests to verify the handling of built-in tools without registry definitions, ensuring they are skipped correctly.
- Added tests to confirm that built-in tools include descriptions and parameters in the registry.
- Enhanced tests for action tools, checking for proper inclusion of metadata and handling of tools without parameters in the registry.
* test: add tests for mixed-type and number enum schema handling
- Introduced tests to validate the parsing of mixed-type enum values, including strings, numbers, booleans, and null.
- Added tests for number enum schema values to ensure correct parsing of numeric inputs, enhancing schema validation coverage.
* fix: update mock implementation for @librechat/agents
- Changed the mock for @librechat/agents to spread the actual module's properties, ensuring that all necessary functionalities are preserved in tests.
- This adjustment enhances the accuracy of the tests by reflecting the real structure of the module.
* fix: change max_results type in GoogleSearch schema from number to integer
- Updated the type of max_results in the Google Search JSON schema to 'integer' for better type accuracy and validation consistency.
* fix: update max_results description and type in GoogleSearch schema
- Changed the type of max_results from 'number' to 'integer' for improved type accuracy.
- Updated the description to reflect the new default maximum number of search results, changing it from 10 to 5.
* refactor: remove unused code and improve tool registry handling
- Eliminated outdated comments and conditional logic related to event-driven mode in the ToolService.
- Enhanced the handling of the tool registry by ensuring it is configurable for better integration during tool execution.
* feat: add definitionsOnly option to buildToolClassification for event-driven mode
- Introduced a new parameter, definitionsOnly, to the BuildToolClassificationParams interface to enable a mode that skips tool instance creation.
- Updated the buildToolClassification function to conditionally add tool definitions without instantiating tools when definitionsOnly is true.
- Modified the loadToolDefinitions function to pass definitionsOnly as true, ensuring compatibility with the new feature.
* test: add unit tests for buildToolClassification with definitionsOnly option
- Implemented tests to verify the behavior of buildToolClassification when definitionsOnly is set to true or false.
- Ensured that tool instances are not created when definitionsOnly is true, while still adding necessary tool definitions.
- Confirmed that loadAuthValues is called appropriately based on the definitionsOnly parameter, enhancing test coverage for this new feature. |
||
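The lazy, event-driven execution described in the lazy tool loading commit can be sketched roughly as follows. This is not the code from the PR: `createLazyToolExecutor`, the `loadTool` callback, and the shape of the call and result objects are all hypothetical. It only illustrates the three ideas the message names: tools are loaded on first use and cached, tool calls in one batch run in parallel, and a result that is not a two-element array is wrapped in a default `[content, artifact]` shape.

```javascript
// Hypothetical executor factory. `loadTool(name)` is an assumed callback that
// returns (or resolves to) an async function implementing the named tool.
function createLazyToolExecutor(loadTool) {
  const cache = new Map(); // tool name -> Promise resolving to the tool function

  return async function executeToolCalls(toolCalls) {
    // Run every call in the batch concurrently.
    return Promise.all(
      toolCalls.map(async (call) => {
        try {
          if (!cache.has(call.name)) {
            // Load on first use only; later calls reuse the cached promise.
            cache.set(call.name, Promise.resolve().then(() => loadTool(call.name)));
          }
          const tool = await cache.get(call.name);
          if (typeof tool !== 'function') {
            throw new Error(`Tool not found: ${call.name}`);
          }
          const result = await tool(call.args);
          // Normalize to a [content, artifact] pair when the tool returns anything else.
          const [content, artifact] =
            Array.isArray(result) && result.length === 2 ? result : [result, undefined];
          return { id: call.id, status: 'success', content, artifact };
        } catch (error) {
          // Drop the cache entry so a later call can retry loading.
          cache.delete(call.name);
          return { id: call.id, status: 'error', content: String(error.message) };
        }
      }),
    );
  };
}
```

Reporting failures as error results, instead of rejecting the whole batch, keeps one missing tool from discarding the output of the calls that succeeded; whether the real handler behaves this way is an assumption.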
|
|
75c02a1a18
|
🗂️ feat: Better Persistence for Code Execution Files Between Sessions (#11362)
* refactor: process code output files for re-use (WIP)
* feat: file attachment handling with additional metadata for downloads
* refactor: Update directory path logic for local file saving based on basePath
* refactor: file attachment handling to support TFile type and improve data merging logic
* feat: thread filtering of code-generated files
- Introduced parentMessageId parameter in addedConvo and initialize functions to enhance thread management.
- Updated related methods to utilize parentMessageId for retrieving messages and filtering code-generated files by conversation threads.
- Enhanced type definitions to include parentMessageId in relevant interfaces for better clarity and usage.
* chore: imports/params ordering
* feat: update file model to use messageId for filtering and processing
- Changed references from 'message' to 'messageId' in file-related methods for consistency.
- Added messageId field to the file schema and updated related types.
- Enhanced file processing logic to accommodate the new messageId structure.
* feat: enhance file retrieval methods to support user-uploaded execute_code files
- Added a new method `getUserCodeFiles` to retrieve user-uploaded execute_code files, excluding code-generated files.
- Updated existing file retrieval methods to improve filtering logic and handle edge cases.
- Enhanced thread data extraction to collect both message IDs and file IDs efficiently.
- Integrated `getUserCodeFiles` into relevant endpoints for better file management in conversations.
* chore: update @librechat/agents package version to 3.0.78 in package-lock.json and related package.json files
* refactor: file processing and retrieval logic
- Added a fallback mechanism for download URLs when files exceed size limits or cannot be processed locally.
- Implemented a deduplication strategy for code-generated files based on conversationId and filename to optimize storage.
- Updated file retrieval methods to ensure proper filtering by messageIds, preventing orphaned files from being included.
- Introduced comprehensive tests for new thread data extraction functionality, covering edge cases and performance considerations.
* fix: improve file retrieval tests and handling of optional properties
- Updated tests to safely access optional properties using non-null assertions.
- Modified test descriptions for clarity regarding the exclusion of execute_code files.
- Ensured that the retrieval logic correctly reflects the expected outcomes for file queries.
* test: add comprehensive unit tests for processCodeOutput functionality
- Introduced a new test suite for the processCodeOutput function, covering various scenarios including file retrieval, creation, and processing for both image and non-image files.
- Implemented mocks for dependencies such as axios, logger, and file models to isolate tests and ensure reliable outcomes.
- Validated behavior for existing files, new file creation, and error handling, including size limits and fallback mechanisms.
- Enhanced test coverage for metadata handling and usage increment logic, ensuring robust verification of file processing outcomes.
* test: enhance file size limit enforcement in processCodeOutput tests
- Introduced a configurable file size limit for tests to improve flexibility and coverage.
- Mocked the `librechat-data-provider` to allow dynamic adjustment of file size limits during tests.
- Updated the file size limit enforcement test to validate behavior when files exceed specified limits, ensuring proper fallback to download URLs.
- Reset file size limit after tests to maintain isolation for subsequent test cases. |
||
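A rough sketch of the thread filtering and deduplication described in the code execution files commit, under stated assumptions: messages form a chain through `parentMessageId`, each file record carries `messageId`, `conversationId`, and `filename`, and the input files are ordered oldest first. The helper names `getThreadMessageIds` and `selectThreadFiles` are hypothetical and do not come from the PR.

```javascript
// Collect the IDs of every message on the path from a leaf message up to the root.
function getThreadMessageIds(messages, leafMessageId) {
  const byId = new Map(messages.map((m) => [m.messageId, m]));
  const ids = new Set();
  let current = byId.get(leafMessageId);
  // The `ids.has` guard stops the walk if the data contains a cycle.
  while (current && !ids.has(current.messageId)) {
    ids.add(current.messageId);
    current = byId.get(current.parentMessageId);
  }
  return ids;
}

// Keep only files attached to messages in the thread, and keep one record
// per (conversationId, filename) pair. Assumes `files` is sorted oldest first,
// so a later record with the same key replaces the earlier one.
function selectThreadFiles(files, threadIds) {
  const latest = new Map();
  for (const file of files) {
    if (!threadIds.has(file.messageId)) {
      continue; // file belongs to another branch or to a deleted message
    }
    latest.set(`${file.conversationId}:${file.filename}`, file);
  }
  return [...latest.values()];
}
```

Filtering by the thread's message IDs is what excludes files generated on sibling branches of the same conversation.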
|
|
7c9c7e530b
|
⏲️ feat: Defer Loading MCP Tools (#11270)
* WIP: code ptc
* refactor: tool classification and calling logic
* 🔧 fix: Update @librechat/agents dependency to version 3.0.68
* chore: import order and correct renamed tool name for tool search
* refactor: streamline tool classification logic for local and programmatic tools
* feat: add per-tool configuration options for agents, including deferred loading and allowed callers
- Introduced `tool_options` in agent forms to manage tool behavior.
- Updated tool classification logic to prioritize agent-level configurations.
- Enhanced UI components to support tool deferral functionality.
- Added localization strings for new tool options and actions.
* feat: enhance agent schema with per-tool options for configuration
- Added `tool_options` schema to support per-tool configurations, including `defer_loading` and `allowed_callers`.
- Updated agent data model to incorporate new tool options, ensuring flexibility in tool behavior management.
- Modified type definitions to reflect the new `tool_options` structure for agents.
* feat: add tool_options parameter to loadTools and initializeAgent for enhanced agent configuration
* chore: update @librechat/agents dependency to version 3.0.71 and enhance agent tool loading logic
- Updated the @librechat/agents package to version 3.0.71 across multiple files.
- Added support for handling deferred loading of tools in agent initialization and execution processes.
- Improved the extraction of discovered tools from message history to optimize tool loading behavior.
* chore: update @librechat/agents dependency to version 3.0.72
* chore: update @librechat/agents dependency to version 3.0.75
* refactor: simplify tool defer loading logic in MCPTool component
- Removed local state management for deferred tools, relying on form state instead.
- Updated related functions to directly use form values for checking and toggling defer loading.
- Cleaned up code by eliminating unnecessary optimistic updates and local state dependencies.
* chore: remove deprecated localization strings for tool deferral in translation.json
- Eliminated unused strings related to deferred loading descriptions in the English translation file.
- Streamlined localization to reflect recent changes in tool loading logic.
* refactor: improve tool defer loading handling in MCPTool component
- Enhanced the logic for managing deferred loading of tools by simplifying the update process for tool options.
- Ensured that the state reflects the correct loading behavior based on the new deferred loading conditions.
- Cleaned up the code to remove unnecessary complexity in handling tool options.
* refactor: update agent mocks in callbacks test to use actual implementations
- Modified the agent mocks in the callbacks test to include actual implementations from the @librechat/agents module.
- This change enhances the accuracy of the tests by ensuring they reflect the real behavior of the agent functions.
|
||
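The per-tool `tool_options` described in the deferred MCP tools commit might be consumed along these lines. This is a sketch only: the tool names are invented, `partitionTools` is a hypothetical helper, and the `needsToolSearch` flag reflects the assumption that a tool search tool is required whenever at least one tool is deferred. `allowed_callers` is left out because the commit message does not list its accepted values.

```javascript
// Hypothetical agent configuration; only `defer_loading` is shown.
const exampleToolOptions = {
  search_docs: { defer_loading: true },
  create_ticket: { defer_loading: true },
  get_weather: { defer_loading: false },
};

// Split an agent's tools into those sent to the model up front and those
// that are only discoverable later.
function partitionTools(toolNames, toolOptions = {}) {
  const immediate = [];
  const deferred = [];
  for (const name of toolNames) {
    const isDeferred = toolOptions[name]?.defer_loading === true;
    (isDeferred ? deferred : immediate).push(name);
  }
  // Assumption: deferred tools are reachable only through a search tool,
  // so one must be created whenever anything is deferred.
  return { immediate, deferred, needsToolSearch: deferred.length > 0 };
}
```

A tool with no entry in `tool_options` is treated as immediate here, which is a default chosen for the sketch.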
|
|
36c5a88c4e
|
💰 fix: Multi-Agent Token Spending & Prevent Double-Spend (#11433)
* fix: Token Spending Logic for Multi-Agents on Abort Scenarios
  * Implemented logic to skip token spending if a conversation is aborted, preventing double-spending.
  * Introduced `spendCollectedUsage` function to handle token spending for multiple models during aborts, ensuring accurate accounting for parallel agents.
  * Updated `GenerationJobManager` to store and retrieve collected usage data for improved abort handling.
  * Added comprehensive tests for the new functionality, covering various scenarios including cache token handling and parallel agent usage.
* fix: Memory Context Handling for Multi-Agents
  * Refactored `buildMessages` method to pass memory context to parallel agents, ensuring they share the same user context.
  * Improved handling of memory context when no existing instructions are present for parallel agents.
  * Added comprehensive tests to verify memory context propagation and behavior under various scenarios, including cases with no memory available and empty agent configurations.
  * Enhanced logging for better traceability of memory context additions to agents.
* chore: Memory Context Documentation for Parallel Agents
  * Updated documentation in the `AgentClient` class to clarify the in-place mutation of agentConfig objects when passing memory context to parallel agents.
  * Added notes on the implications of mutating objects directly to ensure all parallel agents receive the correct memory context before execution.
* chore: UsageMetadata Interface docs for Token Spending
  * Expanded the UsageMetadata interface to support both OpenAI and Anthropic cache token formats.
  * Added detailed documentation for cache token properties, including mutually exclusive fields for different model types.
  * Improved clarity on how to access cache token details for accurate token spending tracking.
* fix: Enhance Token Spending Logic in Abort Middleware
  * Refactored `spendCollectedUsage` function to utilize Promise.all for concurrent token spending, improving performance and ensuring all operations complete before clearing the collectedUsage array.
  * Added documentation to clarify the importance of clearing the collectedUsage array to prevent double-spending in abort scenarios.
  * Updated tests to verify the correct behavior of the spending logic and the clearing of the array after spending operations. |
||
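The double-spend guard described in the token spending commit can be sketched as below. The function reuses the name `spendCollectedUsage` from the commit message, but its signature, the `spendTokens` callback, and the shape of the usage entries are assumptions made for illustration; this is not the repository's implementation.

```javascript
// Spend every collected usage entry concurrently, then clear the shared array
// so that a second caller (for example, the normal completion path running
// after an abort) finds nothing left to spend.
// `spendTokens(usage)` is an assumed async callback that records one entry.
async function spendCollectedUsage(collectedUsage, spendTokens) {
  if (collectedUsage.length === 0) {
    return 0; // already spent, or nothing was collected
  }
  const count = collectedUsage.length;
  // Wait for all spend operations to finish before clearing.
  await Promise.all(collectedUsage.map((usage) => spendTokens(usage)));
  // Clear in place so every holder of this array reference sees it as empty.
  collectedUsage.length = 0;
  return count;
}
```

Clearing in place, instead of reassigning a new array, matters in this sketch because the abort path and the completion path are assumed to share one array reference.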
|
|
083251508e
|
⏭️ fix: Skip Title Generation for Temporary Chats (#11282)
* Not generating titles for temporary chats
* Minor linter fix to prettify debug line
* Adding a test for skipping title generation for temporary chats |
||
|
|
791dab8f20
|
🫱🏼🫲🏽 refactor: Improve Agent Handoffs (#11172)
* fix: Tool Resources Dropped between Agent Handoffs
* fix: agent deletion process to remove handoff edges
- Added logic to the `deleteAgent` function to remove references to the deleted agent from other agents' handoff edges.
- Implemented error handling to log any issues encountered during the edge removal process.
- Introduced a new test case to verify that handoff edges are correctly removed when an agent is deleted, ensuring data integrity across agent relationships.
* fix: Improve agent loading process by handling orphaned references
- Added logic to track and log agents that fail to load during initialization, preventing errors from interrupting the process.
- Introduced a Set to store skipped agent IDs and updated edge filtering to exclude these orphaned references, enhancing data integrity in agent relationships.
* chore: Update @librechat/agents to version 3.0.62
* feat: Enhance agent initialization with edge collection and filtering
- Introduced new functions for edge collection and filtering orphaned edges, improving the agent loading process.
- Refactored the `initializeClient` function to utilize breadth-first search (BFS) for discovering connected agents, enabling transitive handoffs.
- Added a new module for edge-related utilities, including deduplication and participant extraction, to streamline edge management.
- Updated the agent configuration handling to ensure proper edge processing and integrity during initialization.
* refactor: primary agent ID selection for multi-agent conversations
- Added a new function `findPrimaryAgentId` to determine the primary agent ID from a set of agent IDs based on suffix rules.
- Updated `createMultiAgentMapper` to filter messages by primary agent for parallel agents and handle handoffs appropriately.
- Enhanced message processing logic to ensure correct inclusion of agent content based on group and agent ID presence.
- Improved documentation to clarify the distinctions between parallel execution and handoff scenarios.
* feat: Implement primary agent ID selection for multi-agent content filtering
* chore: Update @librechat/agents to version 3.0.63 in package.json and package-lock.json
* chore: Update @librechat/agents to version 3.0.64 in package.json and package-lock.json
* chore: Update @librechat/agents to version 3.0.65 in package.json and package-lock.json
* feat: Add optional agent name to run creation for improved identification
* chore: Update @librechat/agents to version 3.0.66 in package.json and package-lock.json
* test: Add unit tests for edge utilities including key generation, participant extraction, and orphaned edge filtering
- Implemented tests for `getEdgeKey`, `getEdgeParticipants`, `filterOrphanedEdges`, and `createEdgeCollector` functions.
- Ensured comprehensive coverage for various edge cases, including handling of arrays and default values.
- Verified correct behavior of edge filtering based on skipped agents and deduplication of edges. |
||
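The BFS discovery and orphaned-edge filtering described in the agent handoffs commit can be approximated with the sketch below. It simplifies edges to single `from` and `to` strings (the tests mentioned in the commit suggest the real edges may also hold arrays), and `discoverAgents`, `edgesByAgent`, and `canLoad` are hypothetical names, not the utilities added in the PR.

```javascript
// Breadth-first discovery of every agent reachable through handoff edges.
// `edgesByAgent` maps an agent ID to its outgoing edges; `canLoad(id)` is an
// assumed predicate that is false for deleted or otherwise unloadable agents.
function discoverAgents(primaryId, edgesByAgent, canLoad) {
  const visited = new Set([primaryId]);
  const skipped = new Set(); // agents that failed to load
  const seenEdges = new Set(); // "from->to" keys, for deduplication
  const edges = [];
  const queue = [primaryId];

  while (queue.length > 0) {
    const id = queue.shift();
    for (const edge of edgesByAgent[id] ?? []) {
      const key = `${edge.from}->${edge.to}`;
      if (!seenEdges.has(key)) {
        seenEdges.add(key);
        edges.push(edge);
      }
      if (visited.has(edge.to) || skipped.has(edge.to)) {
        continue;
      }
      if (!canLoad(edge.to)) {
        skipped.add(edge.to); // remember it instead of aborting the whole run
        continue;
      }
      visited.add(edge.to);
      queue.push(edge.to); // its own edges are explored later (transitive handoffs)
    }
  }

  // Drop edges that reference an agent that could not be loaded.
  const validEdges = edges.filter((e) => !skipped.has(e.from) && !skipped.has(e.to));
  return { agentIds: [...visited], edges: validEdges, skipped: [...skipped] };
}
```

Because the queue also receives agents found at depth two and beyond, a handoff chain such as A to B to C is discovered even though A has no direct edge to C.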
|
|
439bc98682
|
⏸ refactor: Improve UX for Parallel Streams (Multi-Convo) (#11096)
* 🌊 feat: Implement multi-conversation feature with added conversation context and payload adjustments
* refactor: Replace isSubmittingFamily with isSubmitting across message components for consistency
* feat: Add loadAddedAgent and processAddedConvo for multi-conversation agent execution
* refactor: Update ContentRender usage to conditionally render PlaceholderRow based on isLast and isSubmitting
* WIP: first pass, sibling index
* feat: Enhance multi-conversation support with agent tracking and display improvements
* refactor: Introduce isEphemeralAgentId utility and update related logic for agent handling
* refactor: Implement createDualMessageContent utility for sibling message display and enhance useStepHandler for added conversations
* refactor: duplicate tools for added agent if ephemeral and primary agent is also ephemeral
* chore: remove deprecated multimessage rendering
* refactor: enhance dual message content creation and agent handling for parallel rendering
* refactor: streamline message rendering and submission handling by removing unused state and optimizing conditional logic
* refactor: adjust content handling in parallel mode to utilize existing content for improved agent display
* refactor: update @librechat/agents dependency to version 3.0.53
* refactor: update @langchain/core and @librechat/agents dependencies to latest versions
* refactor: remove deprecated @langchain/core dependency from package.json
* chore: remove unused SearchToolConfig and GetSourcesParams types from web.ts
* refactor: remove unused message properties from Message component
* refactor: enhance parallel content handling with groupId support in ContentParts and useStepHandler
* refactor: implement parallel content styling in Message, MessageRender, and ContentRender components. use explicit model name
* refactor: improve agent ID handling in createDualMessageContent for dual message display
* refactor: simplify title generation in AddedConvo by removing unused sender and preset logic
* refactor: replace string interpolation with cn utility for className in HoverButtons component
* refactor: enhance agent ID handling by adding suffix management for parallel agents and updating related components
* refactor: enhance column ordering in ContentParts by sorting agents with suffix management
* refactor: update @librechat/agents dependency to version 3.0.55
* feat: implement parallel content rendering with metadata support
- Added `ParallelContentRenderer` and `ParallelColumns` components for rendering messages in parallel based on groupId and agentId.
- Introduced `contentMetadataMap` to store metadata for each content part, allowing efficient parallel content detection.
- Updated `Message` and `ContentRender` components to utilize the new metadata structure for rendering.
- Modified `useStepHandler` to manage content indices and metadata during message processing.
- Enhanced `IJobStore` interface and its implementations to support storing and retrieving content metadata.
- Updated data schemas to include `contentMetadataMap` for messages, enabling multi-agent and parallel execution scenarios.
* refactor: update @librechat/agents dependency to version 3.0.56
* refactor: remove unused EPHEMERAL_AGENT_ID constant and simplify agent ID check
* refactor: enhance multi-agent message processing and primary agent determination
* refactor: implement branch message functionality for parallel responses
* refactor: integrate added conversation retrieval into message editing and regeneration processes
* refactor: remove unused isCard and isMultiMessage props from MessageRender and ContentRender components
* refactor: update @librechat/agents dependency to version 3.0.60
* refactor: replace usage of EPHEMERAL_AGENT_ID constant with isEphemeralAgentId function for improved clarity and consistency
* refactor: standardize agent ID format in tests for consistency
* chore: move addedConvo property to the correct position in payload construction
* refactor: rename agent_id values in loadAgent tests for clarity
* chore: reorder props in ContentParts component for improved readability
* refactor: rename variable 'content' to 'result' for clarity in RedisJobStore tests
* refactor: streamline useMessageActions by removing duplicate handleFeedback assignment
* chore: revert placeholder rendering logic in MessageRender and ContentRender components to original
* refactor: implement useContentMetadata hook for optimized content metadata handling
* refactor: remove contentMetadataMap and related logic from the codebase and revert back to agentId/groupId in content parts
- Eliminated contentMetadataMap from various components and services, simplifying the handling of message content.
- Updated functions to directly access agentId and groupId from content parts instead of relying on a separate metadata map.
- Adjusted related hooks and components to reflect the removal of contentMetadataMap, ensuring consistent handling of message content.
- Updated tests and documentation to align with the new structure of message content handling.
* refactor: remove logging from groupParallelContent function to clean up output
* refactor: remove model parameter from TBranchMessageRequest type for simplification
* refactor: enhance branch message creation by stripping metadata for standalone content
* chore: streamline branch message creation by simplifying content filtering and removing unnecessary metadata checks
* refactor: include attachments in branch message creation for improved content handling
* refactor: streamline agent content processing by consolidating primary agent identification and filtering logic
* refactor: simplify multi-agent message processing by creating a dedicated mapping method and enhancing content filtering
* refactor: remove unused parameter from loadEphemeralAgent function for cleaner code
* refactor: update groupId handling in metadata to only set when provided by the server
|
||
|
|
0ae3b87b65
|
🌊 feat: Resumable LLM Streams with Horizontal Scaling (#10926)
* ✨ feat: Implement Resumable Generation Jobs with SSE Support
- Introduced GenerationJobManager to handle resumable LLM generation jobs independently of HTTP connections.
- Added support for subscribing to ongoing generation jobs via SSE, allowing clients to reconnect and receive updates without losing progress.
- Enhanced existing agent controllers and routes to integrate resumable functionality, including job creation, completion, and error handling.
- Updated client-side hooks to manage adaptive SSE streams, switching between standard and resumable modes based on user settings.
- Added UI components and settings for enabling/disabling resumable streams, improving user experience during unstable connections.
* WIP: resuming
* WIP: resumable stream
* feat: Enhance Stream Management with Abort Functionality
- Updated the abort endpoint to support aborting ongoing generation streams using either streamId or conversationId.
- Introduced a new mutation hook `useAbortStreamMutation` for client-side integration.
- Added `useStreamStatus` query to monitor stream status and facilitate resuming conversations.
- Enhanced `useChatHelpers` to incorporate abort functionality when stopping generation.
- Improved `useResumableSSE` to handle stream errors and token refresh seamlessly.
- Updated `useResumeOnLoad` to check for active streams and resume conversations appropriately.
* fix: Update query parameter handling in useChatHelpers
- Refactored the logic for determining the query parameter used in fetching messages to prioritize paramId from the URL, falling back to conversationId only if paramId is not available. This change ensures consistency with the ChatView component's expectations.
* fix: improve syncing when switching conversations
* fix: Prevent memory leaks in useResumableSSE by clearing handler maps on stream completion and cleanup
* fix: Improve content type mismatch handling in useStepHandler
- Enhanced the condition for detecting content type mismatches to include additional checks, ensuring more robust validation of content types before processing updates.
* fix: Allow dynamic content creation in useChatFunctions
- Updated the initial response handling to avoid pre-initializing content types, enabling dynamic creation of content parts based on incoming delta events. This change supports various content types such as think and text.
* fix: Refine response message handling in useStepHandler
- Updated logic to determine the appropriate response message based on the last message's origin, ensuring correct message replacement or appending based on user interaction. This change enhances the accuracy of message updates in the chat flow.
* refactor: Enhance GenerationJobManager with In-Memory Implementations
- Introduced InMemoryJobStore, InMemoryEventTransport, and InMemoryContentState for improved job management and event handling.
- Updated GenerationJobManager to utilize these new implementations, allowing for better separation of concerns and easier maintenance.
- Enhanced job metadata handling to support user messages and response IDs for resumable functionality.
- Improved cleanup and state management processes to prevent memory leaks and ensure efficient resource usage.
* refactor: Enhance GenerationJobManager with improved subscriber handling
- Updated RuntimeJobState to include allSubscribersLeftHandlers for managing client disconnections without affecting subscriber count.
- Refined createJob and subscribe methods to ensure generation starts only when the first real client connects.
- Added detailed documentation for methods and properties to clarify the synchronization of job generation with client readiness.
- Improved logging for subscriber checks and event handling to facilitate debugging and monitoring.
* chore: Adjust timeout for subscriber readiness in ResumableAgentController
- Reduced the timeout duration from 5000ms to 2500ms in the startGeneration function to improve responsiveness when waiting for subscriber readiness. This change aims to enhance the efficiency of the agent's background generation process.
* refactor: Update GenerationJobManager documentation and structure
- Enhanced the documentation for GenerationJobManager to clarify the architecture and pluggable service design.
- Updated comments to reflect the potential for Redis integration and the need for async refactoring.
- Improved the structure of the GenerationJob facade to emphasize the unified API while allowing for implementation swapping without affecting consumer code.
* refactor: Convert GenerationJobManager methods to async for improved performance
- Updated methods in GenerationJobManager and InMemoryJobStore to be asynchronous, enhancing the handling of job creation, retrieval, and management.
- Adjusted the ResumableAgentController and related routes to await job operations, ensuring proper flow and error handling.
- Increased timeout duration in ResumableAgentController's startGeneration function to 3500ms for better subscriber readiness management.
* refactor: Simplify initial response handling in useChatFunctions
- Removed unnecessary pre-initialization of content types in the initial response, allowing for dynamic content creation based on incoming delta events. This change enhances flexibility in handling various content types in the chat flow.
* refactor: Clarify content handling logic in useStepHandler
- Updated comments to better explain the handling of initialContent and existingContent in edit and resume scenarios.
- Simplified the logic for merging content, ensuring that initialContent is used directly when available, improving clarity and maintainability.
* refactor: Improve message handling logic in useStepHandler
- Enhanced the logic for managing messages in multi-tab scenarios, ensuring that the most up-to-date message history is utilized.
- Removed existing response placeholders and ensured user messages are included, improving the accuracy of message updates in the chat flow.
* fix: remove unnecessary content length logging in the chat stream response, simplifying the debug message while retaining essential information about run steps. This change enhances clarity in logging without losing critical context.
* refactor: Integrate streamId handling for improved resumable functionality for attachments
- Added streamId parameter to various functions to support resumable mode in tool loading and memory processing.
- Updated related methods to ensure proper handling of attachments and responses based on the presence of streamId, enhancing the overall streaming experience.
- Improved logging and attachment management to accommodate both standard and resumable modes.
* refactor: Streamline abort handling and integrate GenerationJobManager for improved job management
- Removed the abortControllers middleware and integrated abort handling directly into GenerationJobManager.
- Updated abortMessage function to utilize GenerationJobManager for aborting jobs by conversation ID, enhancing clarity and efficiency.
- Simplified cleanup processes and improved error handling during abort operations.
- Enhanced metadata management for jobs, including endpoint and model information, to facilitate better tracking and resource management.
* refactor: Unify streamId and conversationId handling for improved job management
- Updated ResumableAgentController and AgentController to generate conversationId upfront, ensuring it matches streamId for consistency.
- Simplified job creation and metadata management by removing redundant conversationId updates from callbacks.
- Refactored abortMiddleware and related methods to utilize the unified streamId/conversationId approach, enhancing clarity in job handling.
- Removed deprecated methods from GenerationJobManager and InMemoryJobStore, streamlining the codebase and improving maintainability.
* refactor: Enhance resumable SSE handling with improved UI state management and error recovery
- Added UI state restoration on successful SSE connection to indicate ongoing submission.
- Implemented detailed error handling for network failures, including retry logic with exponential backoff.
- Introduced abort event handling to reset UI state on intentional stream closure.
- Enhanced debugging capabilities for testing reconnection and clean close scenarios.
- Updated generation function to retry on network errors, improving resilience during submission processes.
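A retry loop with exponential backoff, as mentioned for network failures above, might look like the sketch below. The helper names (`retryWithBackoff`, `backoffDelay`), the option names, and the cap on the delay are assumptions made for illustration; the commit does not state the actual delays or retry counts.

```typescript
// Hypothetical options; the real retry counts and delays are not given in the commit.
interface BackoffOptions {
  retries: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// The delay doubles with every attempt and is capped at maxDelayMs.
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  return Math.min(opts.baseDelayMs * 2 ** attempt, opts.maxDelayMs);
}

async function retryWithBackoff<T>(
  task: () => Promise<T>,
  opts: BackoffOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt === opts.retries) break; // no retries left
      await sleep(backoffDelay(attempt, opts));
    }
  }
  throw lastError;
}
```

The injectable `sleep` parameter is only there so the loop can be exercised in tests without real waiting.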
* refactor: Consolidate content state management into IJobStore for improved job handling
- Removed InMemoryContentState and integrated its functionality into InMemoryJobStore, streamlining content state management.
- Updated GenerationJobManager to utilize jobStore for content state operations, enhancing clarity and reducing redundancy.
- Introduced RedisJobStore for horizontal scaling, allowing for efficient job management and content reconstruction from chunks.
- Updated IJobStore interface to reflect changes in content state handling, ensuring consistency across implementations.
* feat: Introduce Redis-backed stream services for enhanced job management
- Added createStreamServices function to configure job store and event transport, supporting both Redis and in-memory options.
- Updated GenerationJobManager to allow configuration with custom job stores and event transports, improving flexibility for different deployment scenarios.
- Refactored IJobStore interface to support asynchronous content retrieval, ensuring compatibility with Redis implementations.
- Implemented RedisEventTransport for real-time event delivery across instances, enhancing scalability and responsiveness.
- Updated InMemoryJobStore to align with new async patterns for content and run step retrieval, ensuring consistent behavior across storage options.
* refactor: Remove redundant debug logging in GenerationJobManager and RedisEventTransport
- Eliminated unnecessary debug statements in GenerationJobManager related to subscriber actions and job updates, enhancing log clarity.
- Removed debug logging in RedisEventTransport for subscription and subscriber disconnection events, streamlining the logging output.
- Cleaned up debug messages in RedisJobStore to focus on essential information, improving overall logging efficiency.
* refactor: Enhance job state management and TTL configuration in RedisJobStore
- Updated the RedisJobStore to allow customizable TTL values for job states, improving flexibility in job management.
- Refactored the handling of job expiration and cleanup processes to align with new TTL configurations.
- Simplified the response structure in the chat status endpoint by consolidating state retrieval, enhancing clarity and performance.
- Improved comments and documentation for better understanding of the changes made.
* refactor: Add cleanupOnComplete option to GenerationJobManager for flexible resource management
- Introduced a new configuration option, cleanupOnComplete, allowing immediate cleanup of event transport and job resources upon job completion.
- Updated completeJob and abortJob methods to respect the cleanupOnComplete setting, enhancing memory management.
- Improved cleanup logic in the cleanup method to handle orphaned resources effectively.
- Enhanced documentation and comments for better clarity on the new functionality.
* refactor: Update TTL configuration for completed jobs in InMemoryJobStore
- Changed the TTL for completed jobs from 5 minutes to 0, allowing for immediate cleanup.
- Enhanced cleanup logic to respect the new TTL setting, improving resource management.
- Updated comments for clarity on the behavior of the TTL configuration.
* refactor: Enhance RedisJobStore with local graph caching for improved performance
- Introduced a local cache for graph references using WeakRef to optimize reconnects for the same instance.
- Updated job deletion and cleanup methods to manage the local cache effectively, ensuring stale entries are removed.
- Enhanced content retrieval methods to prioritize local cache access, reducing Redis round-trips for same-instance reconnects.
- Improved documentation and comments for clarity on the caching mechanism and its benefits.
* feat: Add integration tests for GenerationJobManager, RedisEventTransport, and RedisJobStore, add Redis Cluster support
- Introduced comprehensive integration tests for GenerationJobManager, covering both in-memory and Redis modes to ensure consistent job management and event handling.
- Added tests for RedisEventTransport to validate pub/sub functionality, including cross-instance event delivery and error handling.
- Implemented integration tests for RedisJobStore, focusing on multi-instance job access, content reconstruction from chunks, and consumer group behavior.
- Enhanced test setup and teardown processes to ensure a clean environment for each test run, improving reliability and maintainability.
* fix: Improve error handling in GenerationJobManager for allSubscribersLeft handlers
- Enhanced the error handling logic when retrieving content parts for allSubscribersLeft handlers, ensuring that any failures are logged appropriately.
- Updated the promise chain to catch errors from getContentParts, improving robustness and clarity in error reporting.
* ci: Improve Redis client disconnection handling in integration tests
- Updated the afterAll cleanup logic in integration tests for GenerationJobManager, RedisEventTransport, and RedisJobStore to use `quit()` for graceful disconnection of the Redis client.
- Added fallback to `disconnect()` if `quit()` fails, enhancing robustness in resource management during test teardown.
- Improved comments for clarity on the disconnection process and error handling.
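The teardown pattern described here (graceful `quit()`, falling back to `disconnect()` on failure) can be sketched as below. `ClosableClient` and `closeClient` are hypothetical names; the interface only assumes a client that exposes these two methods, as the commit text implies.

```typescript
// Hypothetical minimal interface: any client exposing these two methods fits.
interface ClosableClient {
  quit(): Promise<unknown>;
  disconnect(): void;
}

// Try a graceful shutdown first; force-close only if that fails.
async function closeClient(client: ClosableClient): Promise<void> {
  try {
    await client.quit();
  } catch {
    client.disconnect();
  }
}
```

In a test suite this would typically be called from the `afterAll` hook so a failed `quit()` cannot leave an open handle behind.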
* refactor: Enhance GenerationJobManager and event transports for improved resource management
- Updated GenerationJobManager to prevent immediate cleanup of eventTransport upon job completion, allowing final events to transmit fully before cleanup.
- Added orphaned stream cleanup logic in GenerationJobManager to handle streams without corresponding jobs.
- Introduced getTrackedStreamIds method in both InMemoryEventTransport and RedisEventTransport for better management of orphaned streams.
- Improved comments for clarity on resource management and cleanup processes.
* refactor: Update GenerationJobManager and ResumableAgentController for improved event handling
- Modified GenerationJobManager to resolve readyPromise immediately, eliminating startup latency and allowing early event buffering for late subscribers.
- Enhanced event handling logic to replay buffered events when the first subscriber connects, ensuring no events are lost due to race conditions.
- Updated comments for clarity on the new event synchronization mechanism and its benefits in both Redis and in-memory modes.
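The early-buffering idea above can be illustrated with a small channel that stores events while nobody is listening and replays them to the first subscriber. `BufferedChannel` is a hypothetical class written for this sketch, not the `GenerationJobManager` API, and it ignores the Redis transport entirely.

```typescript
type Listener<E> = (event: E) => void;

// Hypothetical minimal channel illustrating buffer-then-replay.
class BufferedChannel<E> {
  private buffer: E[] = [];
  private listeners = new Set<Listener<E>>();

  emit(event: E): void {
    if (this.listeners.size === 0) {
      // Nobody is listening yet: keep the event for later replay.
      this.buffer.push(event);
      return;
    }
    this.listeners.forEach((listener) => listener(event));
  }

  // Returns an unsubscribe function.
  subscribe(listener: Listener<E>): () => void {
    const isFirst = this.listeners.size === 0;
    this.listeners.add(listener);
    if (isFirst) {
      // Replay everything emitted before the first subscriber arrived, in order.
      const pending = this.buffer;
      this.buffer = [];
      pending.forEach((event) => listener(event));
    }
    return () => {
      this.listeners.delete(listener);
    };
  }
}
```

Because the producer never waits for a subscriber, generation can start immediately; the cost is that early events are held in memory until someone connects.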
* fix: Update cache integration test command for stream to ensure proper execution
- Modified the test command for cache integration related to streams by adding the --forceExit flag to prevent hanging tests.
- This change enhances the reliability of the test suite by ensuring all tests complete as expected.
* feat: Add active job management for user and show progress in conversation list
- Implemented a new endpoint to retrieve active generation job IDs for the current user, enhancing user experience by allowing visibility of ongoing tasks.
- Integrated active job tracking in the Conversations component, displaying generation indicators based on active jobs.
- Optimized job management in the GenerationJobManager and InMemoryJobStore to support user-specific job queries, ensuring efficient resource handling and cleanup.
- Updated relevant components and hooks to utilize the new active jobs feature, improving overall application responsiveness and user feedback.
* feat: Implement active job tracking by user in RedisJobStore
- Added functionality to retrieve active job IDs for a specific user, enhancing user experience by allowing visibility of ongoing tasks.
- Implemented self-healing cleanup for stale job entries, ensuring accurate tracking of active jobs.
- Updated job creation, update, and deletion methods to manage user-specific job sets effectively.
- Enhanced integration tests to validate the new user-specific job management features.
* refactor: Simplify job deletion logic by removing user job cleanup from InMemoryJobStore and RedisJobStore
* WIP: Add backend inspect script for easier debugging in production
* refactor: title generation logic
- Changed the title generation endpoint from POST to GET, allowing for more efficient retrieval of titles based on conversation ID.
- Implemented exponential backoff for title fetching retries, improving responsiveness and reducing server load.
- Introduced a queuing mechanism for title generation, ensuring titles are generated only after job completion.
- Updated relevant components and hooks to utilize the new title generation logic, enhancing user experience and application performance.
* feat: Enhance updateConvoInAllQueries to support moving conversations to the top
* chore: temp. remove added multi convo
* refactor: Update active jobs query integration for optimistic updates on abort
- Introduced a new interface for active jobs response to standardize data handling.
- Updated query keys for active jobs to ensure consistency across components.
- Enhanced job management logic in hooks to properly reflect active job states, improving overall application responsiveness.
* refactor: Add useResumableStreamToggle hook to manage resumable streams for legacy/assistants endpoints
- Introduced a new hook, useResumableStreamToggle, to automatically toggle resumable streams off for assistants endpoints and restore the previous value when switching away.
- Updated ChatView component to utilize the new hook, enhancing the handling of streaming behavior based on endpoint type.
- Refactored imports in ChatView for better organization.
* refactor: streamline conversation title generation handling
- Removed unused type definition for TGenTitleMutation in mutations.ts to clean up the codebase.
- Integrated queueTitleGeneration call in useEventHandlers to trigger title generation for new conversations, enhancing the responsiveness of the application.
* feat: Add USE_REDIS_STREAMS configuration for stream job storage
- Introduced USE_REDIS_STREAMS to control Redis usage for resumable stream job storage, defaulting to true if USE_REDIS is enabled but not explicitly set.
- Updated cacheConfig to include USE_REDIS_STREAMS and modified createStreamServices to utilize this new configuration.
- Enhanced unit tests to validate the behavior of USE_REDIS_STREAMS under various environment settings, ensuring correct defaults and overrides.
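The defaulting rule for `USE_REDIS_STREAMS` can be sketched as a pure function over an environment map. `resolveUseRedisStreams` and `isEnabled` are hypothetical helpers for this illustration, and treating an explicit value as final regardless of `USE_REDIS` is an assumption; the commit only states the default.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: only the string "true" (case-insensitive) counts as enabled.
const isEnabled = (value: string | undefined): boolean =>
  value !== undefined && value.trim().toLowerCase() === "true";

function resolveUseRedisStreams(env: Env): boolean {
  const explicit = env["USE_REDIS_STREAMS"];
  if (explicit !== undefined && explicit.trim() !== "") {
    // An explicit setting wins over the default.
    return isEnabled(explicit);
  }
  // Not set: follow USE_REDIS.
  return isEnabled(env["USE_REDIS"]);
}
```

With this rule, a deployment that sets `USE_REDIS=true` can still keep stream jobs in memory by setting `USE_REDIS_STREAMS=false`.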
* fix: title generation queue management for assistants
- Introduced a queueListeners mechanism to notify changes in the title generation queue, improving responsiveness for non-resumable streams.
- Updated the useTitleGeneration hook to track queue changes with a queueVersion state, ensuring accurate updates when jobs complete.
- Refactored the queueTitleGeneration function to trigger listeners upon adding new conversation IDs, enhancing the overall title generation flow.
* refactor: streamline agent controller and remove legacy resumable handling
- Updated the AgentController to route all requests to ResumableAgentController, simplifying the logic.
- Deprecated the legacy non-resumable path, providing a clear migration path for future use.
- Adjusted setHeaders middleware to remove unnecessary checks for resumable mode.
- Cleaned up the useResumableSSE hook to eliminate redundant query parameters, enhancing clarity and performance.
* feat: Add USE_REDIS_STREAMS configuration to .env.example
- Updated .env.example to include USE_REDIS_STREAMS setting, allowing control over Redis usage for resumable LLM streams.
- Provided additional context on the behavior of USE_REDIS_STREAMS when not explicitly set, enhancing clarity for configuration management.
* refactor: remove unused setHeaders middleware from chat route
- Eliminated the setHeaders middleware from the chat route, streamlining the request handling process.
- This change contributes to cleaner code and improved performance by reducing unnecessary middleware checks.
* fix: Add streamId parameter for resumable stream handling across services (actions, mcp oauth)
* fix(flow): add immediate abort handling and fix intervalId initialization
- Add immediate abort handler that responds instantly to abort signal
- Declare intervalId before cleanup function to prevent 'Cannot access before initialization' error
- Consolidate cleanup logic into single function to avoid duplicate cleanup
- Properly remove abort event listener on cleanup
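The four fixes listed above fit a common polling pattern, sketched here under assumptions. `pollUntil` is a hypothetical helper, not the repository's flow manager: it declares `intervalId` before the cleanup function that reads it, keeps a single cleanup path, reacts to an abort immediately, and removes the abort listener on cleanup.

```typescript
// Hypothetical polling helper illustrating the abort and cleanup ordering.
function pollUntil<T>(
  check: () => T | undefined,
  signal: AbortSignal,
  intervalMs = 50,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // Declared before cleanup() so cleanup never reads an uninitialized binding.
    let intervalId: ReturnType<typeof setInterval> | undefined;

    const onAbort = (): void => {
      cleanup();
      reject(new Error("aborted"));
    };

    // Single cleanup path shared by success and abort.
    function cleanup(): void {
      if (intervalId !== undefined) clearInterval(intervalId);
      intervalId = undefined;
      signal.removeEventListener("abort", onAbort);
    }

    if (signal.aborted) {
      // Already aborted: respond immediately without starting the timer.
      onAbort();
      return;
    }
    signal.addEventListener("abort", onAbort, { once: true });

    intervalId = setInterval(() => {
      const value = check();
      if (value !== undefined) {
        cleanup();
        resolve(value);
      }
    }, intervalMs);
  });
}
```

Had `intervalId` been declared with `let` or `const` after a cleanup call could run, reading it would throw the "Cannot access before initialization" error mentioned in the commit.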
* fix(mcp): clean up OAuth flows on abort and simplify flow handling
- Add abort handler in reconnectServer to clean up mcp_oauth and mcp_get_tokens flows
- Update createAbortHandler to clean up both flow types on tool call abort
- Pass abort signal to createFlow in returnOnOAuth path
- Simplify handleOAuthRequired to always cancel existing flows and start fresh
- This ensures user always gets a new OAuth URL instead of waiting for stale flows
* fix(agents): handle 'new' conversationId and improve abort reliability
- Treat 'new' as placeholder that needs UUID in request controller
- Send JSON response immediately before tool loading for faster SSE connection
- Use job's abort controller instead of prelimAbortController
- Emit errors to stream if headers already sent
- Skip 'new' as valid ID in abort endpoint
- Add fallback to find active jobs by userId when conversationId is 'new'
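Handling the `'new'` placeholder can be sketched as a small normalization step. `isNewConversation` and `resolveConversationId` are hypothetical helpers for this illustration; treating `null`, `undefined`, and the empty string the same way as `'new'` is an assumption beyond what the commit states.

```typescript
import { randomUUID } from "node:crypto";

const NEW_CONVO = "new";

// Treat a missing ID or the literal placeholder as "no conversation yet".
function isNewConversation(id: string | null | undefined): boolean {
  return id == null || id === "" || id === NEW_CONVO;
}

// Keep a real ID as-is; mint a UUID for the placeholder.
function resolveConversationId(id: string | null | undefined): string {
  return isNewConversation(id) ? randomUUID() : (id as string);
}
```

An abort endpoint could apply the same `isNewConversation` check and fall back to a lookup by user ID instead of searching for a job named `new`.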
* fix(stream): detect early abort and prevent navigation to non-existent conversation
- Abort controller on job completion to signal pending operations
- Detect early abort (no content, no responseMessageId) in abortJob
- Set conversation and responseMessage to null for early aborts
- Add earlyAbort flag to final event for frontend detection
- Remove unused text field from AbortResult interface
- Frontend handles earlyAbort by staying on/navigating to new chat
* test(mcp): update test to expect signal parameter in createFlow
fix(agents): include 'new' conversationId in newConvo check for title generation
When frontend sends 'new' as conversationId, it should still trigger
title generation since it's a new conversation. Rename boolean variable for clarity
fix(agents): check abort state before completeJob for title generation
completeJob now triggers abort signal for cleanup, so we need to
capture the abort state beforehand to correctly determine if title
generation should run.
|
||
|
|
f9060fa25f
|
🔧 chore: Update ESLint Config & Run Linter (#10986) | ||
|
|
04a4a2aa44
|
🧵 refactor: Migrate Endpoint Initialization to TypeScript (#10794)
* refactor: move endpoint initialization methods to typescript
* refactor: move agent init to packages/api
- Introduced `initialize.ts` for agent initialization, including file processing and tool loading.
- Updated `resources.ts` to allow optional appConfig parameter.
- Enhanced endpoint configuration handling in various initialization files to support model parameters.
- Added new artifacts and prompts for React component generation.
- Refactored existing code to improve type safety and maintainability.
* refactor: streamline endpoint initialization and enhance type safety
- Updated initialization functions across various endpoints to use a consistent request structure, replacing `unknown` types with `ServerResponse`.
- Simplified request handling by directly extracting keys from the request body.
- Improved type safety by ensuring user IDs are safely accessed with optional chaining.
- Removed unnecessary parameters and streamlined model options handling for better clarity and maintainability.
* refactor: moved ModelService and extractBaseURL to packages/api
- Added comprehensive tests for the models fetching functionality, covering scenarios for OpenAI, Anthropic, Google, and Ollama models.
- Updated existing endpoint index to include the new models module.
- Enhanced utility functions for URL extraction and model data processing.
- Improved type safety and error handling across the models fetching logic.
* refactor: consolidate utility functions and remove unused files
- Merged `deriveBaseURL` and `extractBaseURL` into the `@librechat/api` module for better organization.
- Removed redundant utility files and their associated tests to streamline the codebase.
- Updated imports across various client files to utilize the new consolidated functions.
- Enhanced overall maintainability by reducing the number of utility modules.
* refactor: replace ModelService references with direct imports from @librechat/api and remove ModelService file
* refactor: move encrypt/decrypt methods and key db methods to data-schemas, use `getProviderConfig` from `@librechat/api`
* chore: remove unused 'res' from options in AgentClient
* refactor: file model imports and methods
- Updated imports in various controllers and services to use the unified file model from '~/models' instead of '~/models/File'.
- Consolidated file-related methods into a new file methods module in the data-schemas package.
- Added comprehensive tests for file methods including creation, retrieval, updating, and deletion.
- Enhanced the initializeAgent function to accept dependency injection for file-related methods.
- Improved error handling and logging in file methods.
* refactor: streamline database method references in agent initialization
* refactor: enhance file method tests and update type references to IMongoFile
* refactor: consolidate database method imports in agent client and initialization
* chore: remove redundant import of initializeAgent from @librechat/api
* refactor: move checkUserKeyExpiry utility to @librechat/api and update references across endpoints
* refactor: move updateUserPlugins logic to user.ts and simplify UserController
* refactor: update imports for user key management and remove UserService
* refactor: remove unused Anthropics and Bedrock endpoint files and clean up imports
* refactor: consolidate and update encryption imports across various files to use @librechat/data-schemas
* chore: update file model mock to use unified import from '~/models'
* chore: import order
* refactor: remove migrated to TS agent.js file and its associated logic from the endpoints
* chore: add reusable function to extract imports from source code in unused-packages workflow
* chore: enhance unused-packages workflow to include @librechat/api dependencies and improve dependency extraction
* chore: improve dependency extraction in unused-packages workflow with enhanced error handling and debugging output
* chore: add detailed debugging output to unused-packages workflow for better visibility into unused dependencies and exclusion lists
* chore: refine subpath handling in unused-packages workflow to correctly process scoped and non-scoped package imports
* chore: clean up unused debug output in unused-packages workflow and reorganize type imports in initialize.ts
|
||
|
|
69200623c2
|
🪨 feat: Add PROXY support for AWS Bedrock endpoints (#8871)
* feat: added PROXY support for AWS Bedrock endpoint
* chore: explicit install of new packages required for bedrock proxy
---------
Co-authored-by: Danny Avila <danny@librechat.ai> |
||
|
|
656e1abaea
|
🪦 refactor: Remove Legacy Code (#10533)
* 🗑️ chore: Remove unused Legacy Provider clients and related helpers
  * Deleted OpenAIClient and GoogleClient files along with their associated tests.
  * Removed references to these clients in the clients index file.
  * Cleaned up typedefs by removing the OpenAISpecClient export.
  * Updated chat controllers to use the OpenAI SDK directly instead of the removed client classes.
* chore/remove-openapi-specs
* 🗑️ chore: Remove unused mergeSort and misc utility functions
  * Deleted mergeSort.js and misc.js files as they are no longer needed.
  * Removed references to cleanUpPrimaryKeyValue in messages.js and adjusted related logic.
  * Updated mongoMeili.ts to eliminate local implementations of removed functions.
* chore: remove legacy endpoints
* chore: remove all plugins endpoint related code
* chore: remove unused prompt handling code and clean up imports
  * Deleted handleInputs.js and instructions.js files as they are no longer needed.
  * Removed references to these files in the prompts index.js.
  * Updated docker-compose.yml to simplify reverse proxy configuration.
* chore: remove unused LightningIcon import from Icons.tsx
* chore: clean up translation.json by removing deprecated and unused keys
* chore: update Jest configuration and remove unused mock file
  * Simplified the setupFiles array in jest.config.js by removing the fetchEventSource mock.
  * Deleted the fetchEventSource.js mock file as it is no longer needed.
* fix: simplify endpoint type check in Landing and ConversationStarters components
  * Updated the endpoint type check to use strict equality for better clarity and performance.
  * Ensured consistency in the handling of the azureOpenAI endpoint across both components.
* chore: remove unused dependencies from package.json and package-lock.json
* chore: remove legacy EditController, associated routes and imports
* chore: update banResponse logic to refine request handling for banned users
* chore: remove unused validateEndpoint middleware and its references
* chore: remove unused 'res' parameter from initializeClient in multiple endpoint files
* chore: remove unused 'isSmallScreen' prop from BookmarkNav and NewChat components; clean up imports in ArchivedChatsTable and useSetIndexOptions hooks; enhance localization in PromptVersions
* chore: remove unused import of Constants and TMessage from MobileNav; retain only necessary QueryKeys import
* chore: remove unused TResPlugin type and related references; clean up imports in types and schemas |
||
|
|
801c95a829
|
🦙 fix: Ollama Provider Handling (#10711)
* 🔧 fix: Correct URL Construction in fetchModels Function
Updated the URL construction in the fetchModels function to ensure proper formatting by removing trailing slashes from the base URL. This change prevents potential issues with API endpoint calls.
* 🔧 fix: Remove OLLAMA from Known Custom Providers
Updated the isKnownCustomProvider function and providerConfigMap to exclude OLLAMA as a known custom provider, streamlining the provider checks and configurations.
* 🔧 test: Enhance fetchModels Tests for URL Construction
Added new test cases to validate the URL construction in the fetchModels function, ensuring it handles trailing slashes correctly and appends query parameters as expected. This improves the robustness of the API endpoint calls.
* chore: remove ollama provider-specific handling
* chore: Refactor imports to use isUserProvided from @librechat/api |
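The trailing-slash fix described in this commit can be illustrated with a minimal sketch. It is not the actual `fetchModels` code: the helper name `buildModelsUrl`, its parameters, and the example host are assumptions made for illustration only.

```typescript
// Hypothetical helper (not LibreChat's real code): joins a base URL and a path
// so that "host/v1" and "host/v1/" both produce the same endpoint URL.
function buildModelsUrl(
  baseURL: string,
  path: string,
  params: Record<string, string> = {},
): string {
  // Remove any run of trailing slashes from the base URL.
  const base = baseURL.replace(/\/+$/, "");
  // Remove leading slashes from the path so exactly one slash joins the parts.
  const url = new URL(`${base}/${path.replace(/^\/+/, "")}`);
  // Append query parameters through the URL API so values are encoded correctly.
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Both calls yield the same URL regardless of the trailing slash.
console.log(buildModelsUrl("https://api.example.com/v1/", "models"));
console.log(buildModelsUrl("https://api.example.com/v1", "models", { user: "abc" }));
```

Normalizing once at the join point keeps callers free to configure the base URL with or without a trailing slash.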
||
|
|
35319c1354
|
🔧 fix: Remove Bedrock Config Transform introduced in #9931 (#10628)
* fix: Header and Environment Variable Handling Bug from #9931
* refactor: Remove warning log for missing tokens in extractOpenIDTokenInfo function
* feat: Enhance resolveNestedObject function for improved placeholder processing
- Added a new function `resolveNestedObject` to recursively process nested objects, replacing placeholders in string values while preserving the original structure.
- Updated `createTestUser` to use `IUser` type and modified user ID generation.
- Added comprehensive unit tests for `resolveNestedObject` to cover various scenarios, including nested structures, arrays, and custom user variables.
- Improved type handling in `processMCPEnv` to ensure correct processing of mixed numeric and placeholder values.
* refactor: Remove unnecessary manipulation of Bedrock options introduced in #9931
- Eliminated the resolveHeaders function call from the getOptions method in options.js, as it was no longer necessary for processing additional model request fields.
- This change simplifies the code and improves maintainability. |
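The recursive placeholder processing attributed to `resolveNestedObject` can be sketched roughly as follows. This is an illustration under assumptions, not the function from the commit: the name `resolvePlaceholders`, the `Json` type, and the `{{NAME}}` placeholder syntax (modeled on `{{LIBRECHAT_OPENID_TOKEN}}` from #9931) are all hypothetical here.

```typescript
// Hypothetical sketch: walk a nested value, replace {{NAME}} placeholders in
// strings, and leave the shape of objects and arrays unchanged.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function resolvePlaceholders(value: Json, vars: Record<string, string>): Json {
  if (typeof value === "string") {
    // Replace known placeholders; unknown ones are left as-is.
    return value.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
      Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match,
    );
  }
  if (Array.isArray(value)) {
    // Arrays keep their order and length; each item is resolved recursively.
    return value.map((item) => resolvePlaceholders(item, vars));
  }
  if (value !== null && typeof value === "object") {
    // Objects keep their keys; only the values are resolved.
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = resolvePlaceholders(child, vars);
    }
    return out;
  }
  // Numbers, booleans, and null pass through untouched.
  return value;
}

const resolved = resolvePlaceholders(
  { headers: { Authorization: "Bearer {{TOKEN}}" }, retries: 3, tags: ["{{ENV}}"] },
  { TOKEN: "abc", ENV: "prod" },
);
console.log(JSON.stringify(resolved));
```

Returning non-string primitives unchanged is one way to handle the mixed numeric and placeholder values the commit mentions; the real implementation may differ.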
||
|
|
ef3bf0a932
|
🆔 feat: Add OpenID Connect Federated Provider Token Support (#9931)
* feat: Add OpenID Connect federated provider token support
Implements support for passing federated provider tokens (Cognito, Azure AD, Auth0)
as variables in LibreChat's librechat.yaml configuration for both custom endpoints
and MCP servers.
Features:
- New LIBRECHAT_OPENID_* template variables for federated provider tokens
- JWT claims parsing from ID tokens without verification (for claim extraction)
- Token validation with expiration checking
- Support for multiple token storage locations (federatedTokens, openidTokens)
- Integration with existing template variable system
- Comprehensive test suite with Cognito-specific scenarios
- Provider-agnostic design supporting Cognito, Azure AD, Auth0, etc.
Security:
- Server-side only token processing
- Automatic token expiration validation
- Graceful fallbacks for missing/invalid tokens
- No client-side token exposure
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Add federated token propagation to OIDC authentication strategies
Adds federatedTokens object to user during authentication to enable
federated provider token template variables in LibreChat configuration.
Changes:
- OpenID JWT Strategy: Extract raw JWT from Authorization header and
attach as federatedTokens.access_token to enable {{LIBRECHAT_OPENID_TOKEN}}
placeholder resolution
- OpenID Strategy: Attach tokenset tokens as federatedTokens object to
standardize token access across both authentication strategies
This enables proper token propagation for custom endpoints and MCP
servers that require federated provider tokens for authorization.
Resolves missing token issue reported by @ramden in PR #9931
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Denis Ramic <denis.ramic@nfon.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* test: Add federatedTokens validation tests for OIDC strategies
Adds comprehensive test coverage for the federated token propagation
feature implemented in the authentication strategies.
Tests added:
- Verify federatedTokens object is attached to user with correct structure
(access_token, refresh_token, expires_at)
- Verify both tokenset and federatedTokens are present in user object
- Ensure tokens from OIDC provider are correctly propagated
Also fixes existing test suite by adding missing mocks:
- isEmailDomainAllowed function mock
- findOpenIDUser function mock
These tests validate the fix from commit
|
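The claim parsing and expiration check listed under "Features" in the commit above can be sketched as follows. This is a minimal illustration, not the code from #9931: the names `decodeJwtClaims` and `isExpired` are hypothetical, and the sketch assumes a Node.js runtime whose `Buffer` supports the `base64url` encoding. As in the commit description, the signature is not verified, so the decoded claims must not be treated as trusted.

```typescript
// Hypothetical sketch: read claims from a JWT payload WITHOUT verifying the
// signature, then check the standard `exp` claim (seconds since the epoch).
interface JwtClaims {
  exp?: number;
  [claim: string]: unknown;
}

function decodeJwtClaims(token: string): JwtClaims | null {
  // A compact JWT has three dot-separated parts: header.payload.signature.
  const parts = token.split(".");
  if (parts.length !== 3) {
    return null;
  }
  try {
    // The payload is base64url-encoded JSON.
    const json = Buffer.from(parts[1], "base64url").toString("utf8");
    const claims: unknown = JSON.parse(json);
    const isPlainObject =
      typeof claims === "object" && claims !== null && !Array.isArray(claims);
    return isPlainObject ? (claims as JwtClaims) : null;
  } catch {
    // Malformed payloads fall back to null rather than throwing.
    return null;
  }
}

function isExpired(
  claims: JwtClaims,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  // A token without a numeric `exp` is treated as non-expiring in this sketch.
  return typeof claims.exp === "number" && claims.exp <= nowSeconds;
}

// Usage with a locally built, unsigned example token.
const payload = Buffer.from(JSON.stringify({ sub: "user-1", exp: 1000 })).toString("base64url");
const claims = decodeJwtClaims(`header.${payload}.signature`);
console.log(claims, claims !== null && isExpired(claims));
```

Returning `null` for malformed input matches the "graceful fallbacks for missing/invalid tokens" goal stated in the commit; how the real code treats tokens without `exp` is not specified in the log.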
||
|
|
2524d33362
|
📂 refactor: Cleanup File Filtering Logic, Improve Validation (#10414)
* feat: add filterFilesByEndpointConfig to filter disabled file processing by provider
* chore: explicit define of endpointFileConfig for better debugging
* refactor: move `normalizeEndpointName` to data-provider as used app-wide
* chore: remove overrideEndpoint from useFileHandling
* refactor: improve endpoint file config selection
* refactor: update filterFilesByEndpointConfig to accept structured parameters and improve endpoint file config handling
* refactor: replace defaultFileConfig with getEndpointFileConfig for improved file configuration handling across components
* test: add comprehensive unit tests for getEndpointFileConfig to validate endpoint configuration handling
* refactor: streamline agent endpoint assignment and improve file filtering logic
* feat: add error handling for disabled file uploads in endpoint configuration
* refactor: update encodeAndFormat functions to accept structured parameters for provider and endpoint
* refactor: streamline requestFiles handling in initializeAgent function
* fix: getEndpointFileConfig partial config merging scenarios
* refactor: enhance mergeWithDefault function to support document-supported providers with comprehensive MIME types
* refactor: user-configured default file config in getEndpointFileConfig
* fix: prevent file handling when endpoint is disabled and file is dragged to chat
* refactor: move `getEndpointField` to `data-provider` and update usage across components and hooks
* fix: prioritize endpointType based on agent.endpoint in file filtering logic
* fix: prioritize agent.endpoint in file filtering logic and remove unnecessary endpointType defaulting |
||
|
|
8a4a5a4790
|
🤖 feat: Agent Handoffs (Routing) (#10176)
* feat: Add support for agent handoffs with edges in agent forms and schemas
chore: Mark `agent_ids` field as deprecated in favor of edges across various schemas and types
chore: Update dependencies for @langchain/core and @librechat/agents to latest versions
chore: Update peer dependency for @librechat/agents to version 3.0.0-rc2 in package.json
chore: Update @librechat/agents dependency to version 3.0.0-rc3 in package.json and package-lock.json
feat: first pass, multi-agent handoffs
fix: update output type to ToolMessage in memory handling functions
fix: improve type checking for graphConfig in createRun function
refactor: remove unused content filtering logic in AgentClient
chore: update @librechat/agents dependency to version 3.0.0-rc4 in package.json and package-lock.json
fix: update @langchain/core peer dependency version to ^0.3.72 in package.json and package-lock.json
fix: update @librechat/agents dependency to version 3.0.0-rc6 in package.json and package-lock.json; refactor stream rate handling in various endpoints
feat: Agent handoff UI
chore: update @librechat/agents dependency to version 3.0.0-rc8 in package.json and package-lock.json
fix: improve hasInfo condition and adjust UI element classes in AgentHandoff component
refactor: remove current fixed agent display from AgentHandoffs component due to redundancy
feat: enhance AgentHandoffs UI with localized beta label and improved layout
chore: update @librechat/agents dependency to version 3.0.0-rc10 in package.json and package-lock.json
feat: add `createSequentialChainEdges` function to add back agent chaining via multi-agents
feat: update `createSequentialChainEdges` call to only provide conversation context between agents
feat: deprecate Agent Chain functionality and update related methods for improved clarity
* chore: update @librechat/agents dependency to version 3.0.0-rc11 in package.json and package-lock.json
* refactor: remove unused addCacheControl function and related imports and import from @librechat/agents
* chore: remove unused i18n keys
* refactor: remove unused format export from index.ts
* chore: update @librechat/agents to v3.0.0-rc13
* chore: remove BEDROCK_LEGACY provider from Providers enum
* chore: update @librechat/agents to version 3.0.2 in package.json |
||
|
|
d904b281f1
|
🦙 fix: Ollama Custom Headers (#10314)
* 🦙 fix: Ollama Custom Headers
* chore: Correct import order for resolveHeaders in OllamaClient.js
* fix: Improve error logging for Ollama API model fetch failure
* ci: update Ollama model fetch tests
* ci: Add unit test for passing headers and user object to Ollama fetchModels
|
||
|
|
6adb425780
|
🔄 refactor: Max tokens handling in Agent Initialization (#10299)
* Refactored the logic for determining max output tokens in the agent initialization process.
* Changed variable names for clarity, updating from `maxTokens` to `maxOutputTokens` to better reflect their purpose.
* Adjusted calculations for `maxContextTokens` to use the new `maxOutputTokens` variable. |
||
|
|
e6aeec9f25
|
🎚️ feat: Reasoning Parameters for Custom Endpoints (#10297) | ||
|
|
33d6b337bc
|
📛 feat: Chat Badges via Model Specs (#10272)
* refactor: remove `useChatContext` from `useSelectMention`, explicitly pass `conversation` object
* feat: ephemeral agents via model specs
* refactor: Sync Jotai state with ephemeral agent state, also when Ephemeral Agent has no MCP servers selected
* refactor: move `useUpdateEphemeralAgent` to store and clean up imports
* refactor: reorder imports and invalidate queries for mcpConnectionStatus in event handler
* refactor: replace useApplyModelSpecEffects with useApplyModelSpecAgents and update event handlers to use new agent template logic
* ci: update useMCPSelect test to verify mcpValues sync with empty ephemeralAgent.mcp |
||
|
|
36f0365fd4
|
🧮 feat: Enhance Model Pricing Coverage and Pattern Matching (#10173)
* updated gpt-5-pro, which is available here and on OpenRouter: https://platform.openai.com/docs/models/gpt-5-pro
* feat: Add gpt-5-pro pricing
- Implemented handling for the new gpt-5-pro model in the getValueKey function.
- Updated tests to ensure correct behavior for gpt-5-pro across various scenarios.
- Adjusted token limits and multipliers for gpt-5-pro in the tokens utility files.
- Enhanced model matching functionality to include gpt-5-pro variations.
* refactor: optimize model pricing and validation logic
- Added new model pricing entries for llama2, llama3, and qwen variants in tx.js.
- Updated tokenValues to include additional models and their pricing structures.
- Implemented validation tests in tx.spec.js to ensure all models resolve correctly to pricing.
- Refactored getValueKey function to improve model matching and resolution efficiency.
- Removed outdated model entries from tokens.ts to streamline pricing management.
* fix: add missing pricing
* chore: update model pricing for qwen and gemma variants
* chore: update model pricing and add validation for context windows
- Removed outdated model entries from tx.js and updated tokenValues with new models.
- Added a test in tx.spec.js to ensure all models with pricing have corresponding context windows defined in tokens.ts.
- Introduced 'command-text' model pricing in tokens.ts to maintain consistency across model definitions.
* chore: update model names and pricing for AI21 and Amazon models
- Refactored model names in tx.js for AI21 and Amazon models to remove versioning and improve consistency.
- Updated pricing values in tokens.ts to reflect the new model names.
- Added comprehensive tests in tx.spec.js to validate pricing for both short and full model names across AI21 and Amazon models.
* feat: add pricing and validation for Claude Haiku 4.5 model
* chore: increase default max context tokens to 18000 for agents
* feat: add Qwen3 model pricing and validation tests
* chore: reorganize and update Qwen model pricing in tx.js and tokens.ts
---------
Co-authored-by: khfung <68192841+khfung@users.noreply.github.com> |
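The model-matching concern behind the `getValueKey` changes can be illustrated with a small sketch. The commit log does not show how `getValueKey` actually resolves names, so the longest-match strategy below, the helper name `findValueKey`, and the example keys are assumptions for illustration only.

```typescript
// Hypothetical sketch: resolve a full model name to a pricing key by picking
// the longest known key it contains, so that a specific variant such as
// "gpt-5-pro" is not shadowed by the shorter, more general "gpt-5".
function findValueKey(model: string, keys: readonly string[]): string | undefined {
  const name = model.toLowerCase();
  let best: string | undefined;
  for (const key of keys) {
    const matches = name.includes(key.toLowerCase());
    if (matches && (best === undefined || key.length > best.length)) {
      best = key;
    }
  }
  return best;
}

const knownKeys = ["gpt-5", "gpt-5-pro"];
console.log(findValueKey("openai/gpt-5-pro", knownKeys));
console.log(findValueKey("gpt-5-2025", knownKeys));
```

Preferring the longest match avoids depending on the order in which keys are declared, which is one plausible way to keep variant names resolving to their own pricing entries.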
||
|
|
dbe4dd96b4
|
🧹 chore: Cleanup Logger and Utility Imports (#9935)
* 🧹 chore: Update logger imports to use @librechat/data-schemas across multiple files and remove unused sleep function from queue.js (#9930)
* chore: Replace local isEnabled utility with @librechat/api import across multiple files, update test files
* chore: Replace local logger import with @librechat/data-schemas logger in countTokens.js and fork.js
* chore: Update logs volume path in docker-compose.yml to correct directory
* chore: import order of isEnabled in static.js |
||
|
|
f9aebeba92
|
🛡️ fix: Title Generation Skip Logic Based On Endpoint Config (#9811) | ||
|
|
c6ecf0095b
|
🎚️ feat: Anthropic Parameter Set Support via Custom Endpoints (#9415)
* refactor: modularize openai llm config logic into new getOpenAILLMConfig function (#9412)
* ✈️ refactor: Migrate Anthropic's getLLMConfig to TypeScript (#9413)
* refactor: move tokens.js over to packages/api and update imports
* refactor: port tokens.js to typescript
* refactor: move helpers.js over to packages/api and update imports
* refactor: port helpers.js to typescript
* refactor: move anthropic/llm.js over to packages/api and update imports
* refactor: port anthropic/llm.js to typescript with supporting types in types/anthropic.ts and updated tests in llm.spec.js
* refactor: move llm.spec.js over to packages/api and update import
* refactor: port llm.spec.js over to typescript
* 📝 Add Prompt Parameter Support for Anthropic Custom Endpoints (#9414)
feat: add anthropic llm config support for openai-like (custom) endpoints
* fix: missed compiler / type issues from addition of getAnthropicLLMConfig
* refactor: update tokens.ts to export constants and functions, enhance type definitions, and adjust default values
* WIP: first pass, decouple `llmConfig` from `configOptions`
* chore: update import path for OpenAI configuration from 'llm' to 'config'
* refactor: enhance type definitions for ThinkingConfig and update modelOptions in AnthropicConfigOptions
* refactor: cleanup type, introduce openai transform from alt provider
* chore: integrate removeNullishValues in Google llmConfig and update OpenAI exports
* chore: bump version of @librechat/api to 1.3.5 in package.json and package-lock.json
* refactor: update customParams type in OpenAIConfigOptions to use TConfig['customParams']
* refactor: enhance transformToOpenAIConfig to include fromEndpoint and improve config extraction
* refactor: conform userId field for anthropic/openai, cleanup anthropic typing
* ci: add backward compatibility tests for getOpenAIConfig with various endpoints and configurations
* ci: replace userId with user in clientOptions for getLLMConfig
* test: add Azure OpenAI endpoint tests for various configurations in getOpenAIConfig
* refactor: defaultHeaders retrieval for prompt caching for anthropic-based custom endpoint (litellm)
* test: add unit tests for getOpenAIConfig with various Anthropic model configurations
* test: enhance Anthropic compatibility tests with addParams and dropParams handling
* chore: update @librechat/agents dependency to version 2.4.78 in package.json and package-lock.json
* chore: update @librechat/agents dependency to version 2.4.79 in package.json and package-lock.json
---------
Co-authored-by: Danny Avila <danny@librechat.ai> |