Commit graph

722 commits

Author SHA1 Message Date
Danny Avila
9441563b95
🛡️ refactor: Scope allowedAddresses By Port (#13022)
* fix: Scope allowedAddresses by port

* test: Fix SSRF agent spec typing
2026-05-08 12:28:34 -04:00
Danny Avila
b39bf837a7
📦 chore: Update @librechat/agents to v3.1.79 (#13000) 2026-05-07 16:27:17 -04:00
Danny Avila
40a05bbf83
📦 chore: npm audit fixes and Mongoose 8.23 TypeScript follow-ups (#12996)
* chore: Update axios dependency to version 1.16.0 across multiple package files

* chore: Update express-rate-limit and ip-address dependencies to versions 8.5.1 and 10.2.0 in package-lock.json and package.json

* chore: Update mongoose and hono dependencies to versions 8.23.1 and 4.12.18 across multiple package files

* fix: Add type parameters to mongoose lean queries in accessRole and aclEntry methods

* fix: Add type parameters to mongoose lean queries in action, agent, and agentCategory methods

* chore: Update moduleResolution to 'bundler' in tsconfig.json for api and data-schemas packages

* fix: Update mongoose lean queries to include type parameters across various methods for improved type safety
2026-05-07 09:47:40 -04:00
Danny Avila
1bc2692a15
🌥️ feat: Add Optional Region-aware S3/CloudFront Storage Keys (#12987)
* feat(files): add optional region-aware storage keys

* test(files): fix region storage CI fixtures

* feat(files): finalize inline CloudFront asset namespaces

* fix(files): allow wildcard region CloudFront cookies

* fix(files): preserve legacy storage key compatibility

* fix(files): align CloudFront clear cookie cleanup

* fix(files): clear legacy CloudFront cookie scopes

* chore(files): clean up storage review nits

* fix(files): keep inline namespaces CloudFront-only
2026-05-06 23:16:56 -04:00
Danny Avila
ddf5879ccd
⏱️ fix: Align Auto-Refill Next Date (#12980)
* fix: Align auto-refill next date

* style: Fix auto-refill lint formatting

* refactor: Share auto-refill eligibility date

* refactor: Consolidate refill interval units

* fix: Guard malformed refill interval units

* fix: Preserve refill unit fallback label
2026-05-06 21:40:18 -04:00
Danny Avila
9c81792d25
🔐 feat: Add Signed CloudFront File Downloads (#12970)
* feat: add signed CloudFront downloads

* fix: preserve local IdP avatar paths

* fix: address signed download review findings

* fix: harden CloudFront cookie scope validation

* fix: preserve URL save API compatibility

* fix: store CDN SSO avatars under shared prefix

* fix: Harden CloudFront tenant file access

* fix: Preserve CloudFront download compatibility

* fix: Address CloudFront review follow-ups

* fix: Preserve file URL fallback user paths

* fix: Address download review hardening

* fix: Use file owner for S3 RAG cleanup

* fix: Address final download review nits

* fix: Clear stale avatar CloudFront cookies

* fix: Align download filename helpers with dev

* fix: Address final CloudFront review follow-ups

* fix: Stream S3 URL uploads

* fix: Set S3 stream upload length

* fix: Preserve download metadata filepath

* fix: Avoid remote content length for stream uploads

* fix: Use bounded multipart URL uploads

* fix: Harden S3 filename boundaries
2026-05-06 19:48:30 -04:00
Danny Avila
f2de3a219c
🌐 fix: Preserve Unicode Filenames (#12977)
* fix: Preserve unicode filenames

* fix: Cap unicode filenames by bytes

* fix: Preserve clean artifact directories

* fix: Disambiguate normalized artifact names
2026-05-06 14:57:38 -04:00
Danny Avila
56b87f70bd
🧮 feat: Add GPT-5.5 Token Definitions (#12973)
* fix: add gpt-5.5 token definitions

* fix: align gpt-5.5 context limit
2026-05-06 10:50:16 -04:00
Danny Avila
6c6c72def7
🚀 feat: Decouple File Attachment Persistence from Preview Rendering (#12957)
* 🗂️ feat: add `status` lifecycle to file records for two-phase previews

Schema and model foundation for decoupling the agent's final response
from CPU-heavy office-format HTML extraction.

- `MongoFile.status: 'pending' | 'ready' | 'failed'` (indexed) and
  `previewError?: string` mirror the lifecycle: phase-1 emits the file
  record at `pending` so the response is unblocked; phase-2 transitions
  to `ready` (with text/textFormat) or `failed` (with previewError) in
  the background. Absent for legacy records — clients treat that as
  `ready` for back-compat.
- Mirror types added to `TFile` in data-provider so frontend cache
  consumers see the new fields.
- New `sweepOrphanedPreviews(maxAgeMs)` method on the file model
  recovers stale `pending` records left behind by a process restart
  mid-extraction; transitions them to `failed` with
  `previewError: 'orphaned'`. Cheap because `status` is indexed.

*  feat: two-phase code-execution preview flow (unblocks final response)

The agent's final response no longer waits on CPU-heavy office HTML
extraction. Phase-1 (download + storage save + DB record at
`status: 'pending'`) is awaited as before; phase-2 (extract +
`updateFile`) runs in the background with a hard 60s ceiling.

Three flows, all funneling through `processCodeOutput` and updated to
the new `{ file, finalize? }` return shape:

- `callbacks.js` (chat-completions + Open Responses streaming): emit
  the phase-1 attachment immediately (carries `status: 'pending'` for
  office buckets so the UI shows "preparing preview…"), then
  fire-and-forget `finalize()`. If the SSE stream is still open when
  phase-2 lands, push an `attachment` update event with the same
  `file_id` so the client merges over the placeholder in place.

- `tools.js` direct endpoint: same split — return the phase-1
  metadata immediately, run extraction in the background. Client
  polls for the resolved record.

`finalize()` wraps the existing 12s per-render timeout in a 60s outer
`withTimeout`. The HTML-or-null contract from #12934 is preserved:
office types that fail extraction transition to `status: 'failed'`
with `previewError: 'parser-error' | 'timeout'` rather than falling
back to plain text (would be an XSS vector).

Promises continue running after the HTTP response closes (Node
doesn't kill them). The boot-time orphan sweep covers the only case
that loses progress — actual process restart mid-extraction.

`primeFiles` annotates the agent's `toolContext` line for prior-turn
files: `(preview not yet generated)` for pending, `(preview
unavailable: <reason>)` for failed. The model can volunteer "you can
still download it" instead of pretending the preview is fine.

`hasOfficeHtmlPath` exported from `@librechat/api` so `processCodeOutput`
can decide whether a file expects a preview at all.

* 🔍 feat: `GET /api/files/:file_id/preview` endpoint and boot orphan sweep

- New `GET /api/files/:file_id/preview` route returns
  `{ status, text?, textFormat?, previewError? }`. The frontend's
  `useFilePreview` React Query hook polls this while phase-2 is in
  flight, then auto-stops on terminal status. ACL identical to the
  download route (reuses `fileAccess` middleware). Defaults `status`
  to `'ready'` for legacy records so back-compat is implicit.
  `text` only included when `status === 'ready'` and non-null —
  preserves the HTML-or-null security contract from #12934.

- `sweepOrphanedPreviews()` invoked on boot in both `server/index.js`
  and `server/experimental.js`. Recovers any `pending` records left
  behind by a process restart mid-extraction (the only case the
  in-process two-phase flow can't handle on its own). Fire-and-forget
  so a transient sweep failure doesn't block startup.

* 🖥️ feat: frontend two-phase preview consumer (polling + UI states)

Wires the React side to the new lifecycle so the user sees what's
happening with their file while phase-2 extraction runs in the
background and after the response stream closes.

- `useAttachmentHandler` upserts by `file_id` (was append-only) so
  the phase-2 SSE update event merges over the pending placeholder
  in place. Lightweight attachments without a `file_id`
  (web_search / file_search citations) keep the legacy append path.

- `useFilePreview(file_id)` React Query hook with
  `refetchInterval: (data) => data?.status === 'pending' ? 2500 : false`
  so polling auto-stops on the first terminal response without the
  caller having to flip `enabled`.

- `useAttachmentPreviewSync(attachment)` bridges polled data into
  `messageAttachmentsMap`. Polling enabled iff
  `status === 'pending' && isAnySubmitting` — per the design ask:
  active polling while the LLM is still generating, then quiet.
  Process-restart and post-stream cases are covered by polling on
  the next interaction.

- `Attachment.tsx` renders a small `PreviewStatusIndicator` (spinner +
  "Preparing preview…" for pending, alert icon + "Preview unavailable"
  for failed) inside `FileAttachment`. Download button stays fully
  functional in both states. Two new English locale keys.

- Data-provider scaffolding: `TFilePreview` type, `endpoints.filePreview`,
  `dataService.getFilePreview`, `QueryKeys.filePreview`.

* 🧪 fix: stub `useAttachmentPreviewSync` in pre-existing Attachment test mocks

The new `useAttachmentPreviewSync` hook is called unconditionally inside
`FileAttachment` (added in the prior commit). Two pre-existing test
files mock `~/hooks` to provide `useLocalize` only — the un-mocked
preview hook reference resolved to undefined and crashed render with
`(0 , _hooks.useAttachmentPreviewSync) is not a function` on the
Ubuntu/Windows CI runners.

Fix is local to the test mocks: add a no-op stub that returns
`{ status: 'ready' }` so the component renders the legacy chip path.
The two-phase preview behavior itself has its own dedicated suites
(`useAttachmentHandler.spec.tsx`, `useAttachmentPreviewSync.spec.tsx`).

* 🐛 fix: route phase-2 attachment update to current-run messageId

Codex P1 review on PR #12957. `processCodeOutput` intentionally
preserves the original DB `messageId` across cross-turn filename reuse
so `getCodeGeneratedFiles` can still trace a file back to the
assistant message that originally produced it. The phase-1 SSE emit
already routes by the current run's messageId — `processCodeOutput`
runtime-overlays it via `Object.assign(file, { messageId, toolCallId })`
and the callback writes `result.file` directly.

Phase-2 was passing the raw `updateFile` return through
`attachmentFromFileMetadata`, which read `messageId` straight off the
DB record. On a turn-N run that re-emitted a filename from turn-1
(e.g. agent writes `output.csv` again), the phase-2 SSE update
routed to `turn-1-msg` instead of `turn-N-msg`. Frontend's
`useAttachmentHandler` upserts under the wrong messageAttachmentsMap
slot — turn-N's pending chip stays stuck at "preparing preview…"
while turn-1's already-resolved attachment gets re-merged.

Fix: thread `runtimeMessageId` through `attachmentFromFileMetadata`
and pass `metadata.run_id` from the phase-2 emit site. Mirrors how
phase-1 sources its messageId. Tests cover the cross-turn reuse case
plus the writableEnded / null-finalize / no-finalize paths to lock
in the broader phase-2 emit contract.

* 🛠️ refactor: address codex audit findings (wire-shape parity, DRY, defensive catch)

Comprehensive audit on PR #12957. Resolves all valid findings:

- **MAJOR #1 — Wire-shape parity**: phase-1 ships the full `fileMetadata`
  record over SSE; phase-2 was using a tight `attachmentFromFileMetadata`
  projection. Drop the projection and have phase-2 spread `{...updated,
  messageId, toolCallId}` so both events match the long-standing
  legacy phase-1 shape clients depend on.

- **MAJOR #2 — DRY**: extract `runPhase2Finalize({ finalize, fileId,
  onResolved })` into `process.js` (alongside `processCodeOutput` whose
  contract it pairs with). Both `callbacks.js` paths and `tools.js`
  now flow through it. Single catch path eliminates divergence
  surface — the fix landed in 01704d4f0 (cross-turn messageId routing)
  was a symptom of this duplication risk.

- **MINOR #3 — JSDoc accuracy**: `finalizePreview`'s buffer is bounded
  by `fileSizeLimit`, not the 1MB extractor cap. Updated and added a
  note about peak heap from queued buffers.

- **MINOR #4 — Defensive catch**: `runPhase2Finalize`'s catch attempts
  a best-effort `updateFile({ status: 'failed', previewError:
  'unexpected' })` for the file_id, so a programming bug in
  `finalizePreview` doesn't leave the record stuck `'pending'` until
  the next boot-time orphan sweep.

- **NIT #6 — Stale PR refs**: 12952 → 12957 in 3 places.

- **NIT #7 — Schema bound**: `previewError` capped at `maxlength: 200`
  to prevent a future codepath from accidentally persisting a stack
  trace.

Skipped per audit verdict (non-blocking):
- #5 (memory pressure): documented in JSDoc; impl change was reviewer's
  "consider", not actionable.
- #8 (double DB query per poll): low cost, indexed by_id, polling is
  gated narrow.
- #9 (TAttachment cast): the union type is intentional; the casts are
  safe widening, refactoring TAttachment is invasive and out of scope.

Tests: 11 new (7 `runPhase2Finalize` unit tests covering happy path,
null-finalize, throws, double-fail, no-fileId, no-onResolved; +4
wire-shape parity assertions in the existing cross-turn test). 328
backend tests pass; 528 frontend tests pass; lint and typecheck clean.

* 🛡️ refactor: address codex P1+P2 + rename to drop phase-1/2 jargon

Codex round 2 review on PR #12957 caught two race conditions and one
recovery gap, all triggered by cross-turn filename reuse (`claimCodeFile`
intentionally returns the same `file_id` for the same
`(filename, conversationId)` across turns). Plus naming cleanup the
user requested — internal "phase 1 / phase 2" vocabulary leaks across
sprints, replace it everywhere with terms describing what's actually
happening.

P1 — stale render overwrites newer revision (process.js)
  Two turns reusing `output.csv` share a `file_id`. If turn-1's
  background render resolves AFTER turn-2's persist step, the
  unconditional `updateFile` writes turn-1's stale text/status over
  turn-2's pending placeholder. Fix: stamp a fresh `previewRevision`
  UUID on every emit, thread it through `finalizePreview`, and make
  the commit conditional via a new optional `extraFilter` argument
  on `updateFile` (`{ previewRevision: <expected> }`). The defensive
  `updateFile` in `runPreviewFinalize`'s catch uses the same guard
  so a programming error from an older render also can't override a
  newer turn.

P1 — stale React Query cache on pending remount (queries.ts)
  Same root cause from the frontend side. Cache key
  `[QueryKeys.filePreview, file_id]` may hold a prior turn's `'ready'`
  payload; with `refetchOnMount: false` and the polling gate on
  `pending`, polling never starts for the new placeholder. Fix:
  `useAttachmentHandler` invalidates that query whenever an attachment
  with a `file_id` arrives. Both initial-emit and update events
  trigger invalidation — uniform gate.

P2 — quick-restart orphans skipped by boot sweep (files.js)
  Boot `sweepOrphanedPreviews` uses a 5-min cutoff for multi-instance
  safety. A crash + restart inside the cutoff leaves `pending` records
  that never get touched again. Fix: lazy sweep inside the preview
  endpoint — if a polled record is `pending` and `updatedAt` is older
  than 5 min, mark it `failed:orphaned` on the spot before responding.
  Conditional on the same `updatedAt` we observed so a concurrent
  legitimate update wins. Cheap, bounded by user activity.

Naming cleanup
  - `runPhase2Finalize` → `runPreviewFinalize`
  - `PHASE_TWO_TIMEOUT_MS` → `PREVIEW_FINALIZE_TIMEOUT_MS`
  - All `phase-1` / `phase-2` / `two-phase` prose replaced with
    "the immediate emit", "the deferred render", "the persist step",
    "the deferred preview", etc. Skill-feature `phase 1/2` references
    (different feature) left alone.

Tests: 10 new (4 lazy-sweep × preview endpoint, 3 cache-invalidation ×
useAttachmentHandler, 3 extraFilter × updateFile data-schemas).
Backend 332/332, frontend 531/531, data-schemas 37/37, lint clean.

* 🛠️ refactor: address comprehensive review (round 3) — stale-cache MAJOR + 3 minors

Comprehensive review on PR #12957 caught a P1 follow-on bug from the
prior `invalidateQueries` fix, plus 3 maintainability findings.

MAJOR: stale React Query cache not actually fixed by `invalidateQueries`
  The previous fix called `invalidateQueries` to flush stale cached
  preview data on cross-turn filename reuse. But `useFilePreview` had
  `refetchOnMount: false`, which made the new observer read the
  stale-marked 'ready' data without refetching. The polling
  `refetchInterval` then evaluated against stale 'ready' → returned
  `false` → polling never started → user stuck on stale content.

  Fix (belt-and-suspenders):
    a) `useAttachmentHandler` switched to `removeQueries` — drops the
       cache entry entirely so the next mount has nothing to read and
       must fetch.
    b) `useFilePreview` no longer sets `refetchOnMount: false`, so the
       React Query default (`true`) kicks in — second line of defense
       if any future codepath observes stale data before the handler
       has a chance to evict.

MINOR: `finalizePreview` JSDoc missing `previewRevision` param
  Added with explanation of the conditional update guard.

MINOR: asymmetric stream-writable guard between SSE protocols
  Chat-completions delegated the gate to `writeAttachmentUpdate`;
  Open Responses inlined `!res.writableEnded && res.headersSent`.
  Extracted `isStreamWritable(res, streamId)` predicate; both paths
  + `writeAttachmentUpdate` now share the single source of truth.

NIT: `(data as Partial<TFile>).file_id` cast repeated 4 times
  Extracted to a `fileId` local at the top of the handler.

Tests: existing 9 invalidate-tests rewritten as remove-tests; +1 new
lock-in test asserts removeQueries is called and invalidateQueries
is NOT (regression guard against round-3 finding). 332 backend pass,
532 frontend pass, lint clean.

Skipped findings (deferred / acceptable):
- MINOR: post-submission pending state has no auto-recovery — the
  `isAnySubmitting` polling gate was the user's explicit design;
  LLM context surfaces failed/pending so the model can volunteer.
  Worth a follow-up if real users hit it.
- NIT: double DB query per preview poll — reviewer marked acceptable;
  changing `fileAccess` middleware is out of scope.

* 🛡️ test: address comprehensive review NITs (initial-emit guard + isStreamWritable coverage)

NIT — chat-completions initial emit skips writableEnded check
  The Open Responses initial emit was switched to use the new
  `isStreamWritable` predicate in the round-3 commit, but the
  chat-completions initial emit kept the older narrower check
  (`streamId || res.headersSent`). On a client disconnect mid-stream
  (`writableEnded === true`) it would still hit `res.write` and
  raise `ERR_STREAM_WRITE_AFTER_END` — caught by the outer IIFE
  catch but logged as noise. Switch this site to `isStreamWritable`
  too so both initial-emit paths share the same gate as the
  deferred update emits.

NIT — `isStreamWritable` not directly unit-tested
  The predicate was only covered indirectly via the deferred-preview
  SSE tests (writableEnded skip, headersSent check). Export from
  `callbacks.js` and add 5 parametric tests pinning down each branch
  (streamId truthy, res null, !headersSent, writableEnded, happy
  path) so a future condition addition can't silently regress.

* 🐛 fix: stuck "Preparing preview…" + inline the chip subtitle

Two related fixes for a stuck-spinner bug a user reported in manual
testing of PR #12957.

**Stuck spinner (the bug)**
The deferred preview render can complete a few seconds AFTER the SSE
stream closes (typical case: PPTX render finishes ~3s after the LLM
emits FINAL). When that happens, the SSE update is silently dropped
(`isStreamWritable` returns false on a closed stream) and polling is
the only recovery path.

The earlier polling gate was `status === 'pending' && isAnySubmitting`,
which mirrored the original design intent ("only query while the LLM
is still generating"). But `isAnySubmitting` flips false the moment
the model emits FINAL — milliseconds before the deferred render
commits. Polling never runs, the chip stays "Preparing preview…"
forever even though the DB has `status: 'ready'` with valid HTML.

Drop the `isAnySubmitting` part of the gate. `useFilePreview`'s
`refetchInterval` is already a function-form that returns `false` on
the first terminal response, so polling auto-stops within one tick of
resolution. The server-side render ceiling (60s) plus the lazy sweep
in the preview endpoint cap the worst case to ~24 polls per pending
attachment. Polling itself never blocks UX — the gate's purpose was
"don't waste cycles", and capping by terminal status is the correct
expression of that.

**Inline the chip subtitle (the visual)**
The previous design rendered "Preparing preview…" as a loose-feeling
spinner+text BELOW the file chip. The chip itself looked done while a
floating annotation said it wasn't.

`FileContainer` gains an optional `subtitle?: ReactNode` prop that
overrides the default file-type label. `Attachment.tsx` passes a
`PreviewStatusSubtitle` (spinner + "Preparing preview…" / alert +
"Preview unavailable") into that slot when the file's preview is
pending or failed. The chip footprint stays identical to its `'ready'`
form — just the second row swaps from "PowerPoint Presentation" to
the status indicator. No floating element, no layout shift.

Tests: regression test pinning down "polling stays enabled after the
LLM finishes" so a future revert can't reintroduce the stuck-spinner
bug. Existing FileContainer tests pass unchanged (subtitle override
is opt-in). 522 frontend tests pass; lint clean.

* 🐛 fix: deferred-preview survives reload + matches artifact card chrome

Fixes the remaining stuck-pending case after the polling gate fix: on
a reloaded conversation, message.attachments come from the DB frozen at
the immediate-persist `status: 'pending'`, but `messageAttachmentsMap`
is empty because no SSE handler ever fired for that messageId. Polling
now INSERTS a new live entry when no record matches the file_id, and
`useAttachments` merges live entries onto DB entries by file_id so the
resolved text/textFormat reach `artifactTypeForAttachment` and the
chip routes through the proper PanelArtifact card.

Also replaces the small file chip used during the pending state with
a PreviewPlaceholderCard that mirrors ToolArtifactCard chrome, so the
transition to the resolved PanelArtifact no longer reshapes the UI.

*  feat: auto-open panel when deferred preview resolves pending→ready

The legacy auto-open path is gated only on `isSubmitting`, so an
office-file preview that resolves *after* the SSE stream closes would
render in place but never auto-open the panel — even though that's
exactly the moment the result becomes meaningful to the user. Adds a
per-file_id one-shot signal that `useAttachmentPreviewSync` flips on
the pending→ready edge; `ToolArtifactCard` consumes it on mount and
auto-opens regardless of submission state. The signal is *only* set on
the actual transition (history loads of pre-resolved files don't
trigger it) and is consumed once (panel close + reopen on the same
card stays user-controlled).

* 🐛 fix: drop placeholder Terminal overlay + scope auto-open to fresh resolutions

Two fixes for issues spotted in manual testing of the deferred-preview
auto-open feature:

1. PreviewPlaceholderCard was passing `file={attachment}` to FilePreview,
   which triggered SourceIcon's Terminal overlay (`metadata.fileIdentifier`
   is set on every code-execution file). The artifact card itself doesn't
   show that overlay; the placeholder shouldn't either, so the
   pending→resolved transition is visually seamless.

2. The `previewJustResolved` flag flipped on every pending→ready
   transition observed by the polling hook — including stale-pending
   DB records that resolve via the first poll on a *history load*.
   Conversations whose immediate-persist snapshot left attachments at
   `status: 'pending'` would yank the panel open every revisit.
   Adds `mountedDuringStreamRef` to the hook (mirroring ToolArtifactCard)
   so the flag fires only when the hook itself was mounted during an
   active turn — preserving the pre-PR contract that the panel only
   auto-opens for results the user is actively waiting on, never for
   history.

* 🐛 fix: don't downgrade preview to failed when only the SSE emit throws

Codex P2 finding on PR #12957: the original chain placed `.catch` after
`.then(onResolved)`, so a throw inside `onResolved` (transport-side
errors — SSE write race after stream close, an emitter listener
throwing) would propagate into the finalize catch and persist
`status: 'failed'` / `previewError: 'unexpected'`. That surfaced
"preview unavailable" in the UI for a perfectly valid file, and
degraded next-turn LLM context to reflect a non-existent failure.

Wraps `onResolved` in its own try/catch so emit errors are logged but
do not affect the file's persisted status. Extraction success and
emit success are now independent: if extraction succeeds and
`finalizePreview` writes the terminal status, the polling layer / next
page load surfaces the resolved preview even if this turn's SSE emit
didn't land.

* 🛡️ fix: run boot-time orphan sweep under system tenant context

Codex P2 finding on PR #12957: `File` is tenant-isolated, so under
`TENANT_ISOLATION_STRICT=true` the boot-time `sweepOrphanedPreviews`
threw `[TenantIsolation] Query attempted without tenant context in
strict mode` and the recovery path silently failed every restart.
Stale `status: 'pending'` records would be stuck until a user happened
to poll the preview endpoint and trigger the lazy sweep — which only
covers the file the user is currently looking at, not the bulk
candidate set the boot sweep is designed to recover.

Wraps the sweep in `runAsSystem(...)` in both boot paths
(`api/server/index.js` and `api/server/experimental.js`) and pins the
contract with regression tests in `file.spec.ts` — one test asserts
the bare call throws under strict mode, the other asserts the
`runAsSystem`-wrapped call succeeds.

* 🧹 chore: trim verbose comments from previous commit

* 🧹 chore: address review findings (dead branch, lazy-sweep cutoff, stale JSDoc)

- finalizePreview: drop unreachable !isOfficeBucket branch (caller
  already gates on hasOfficeHtmlPath, so this path is always office)
- preview endpoint: drop lazy-sweep cutoff from 5min to 2min — anything
  past the 60s render ceiling is definitively orphaned, and per-request
  sweep can be tighter than the per-instance boot sweep
- strip stale `isSubmitting` references from JSDoc in 3 spots (the
  client-side gate was removed in 9a65840)

Skipped: function-length (#3) and client-side polling cap (#4) —
refactors without correctness/perf wins; remaining NITs.

* 🧹 fix: trim 1 query off pending polls + clear stale lifecycle on cross-shape updates

- Preview endpoint: reuse fileAccess middleware's record for the
  lifecycle check; only re-fetch with text on the terminal ready
  response. Cuts the typical poll lifecycle from 2(N+1) to N+1
  queries, since the vast majority of polls hit while pending and
  don't need text at all.
- processCodeOutput non-office branch: explicitly null out status,
  previewError, previewRevision (codex P2). Without this, an update at
  the same (filename, conversationId) where the prior emit was an
  office file leaves stale lifecycle fields and the client renders
  the wrong state for the now non-office artifact.
- Tests: rewire preview.spec mocks for the new shape, add boundary
  test pinning the 2min cutoff, add regression test for the
  cross-shape update.

* 🐛 fix: keep polling on transient errors but cap permanently-broken endpoint

Codex P2: the previous `data?.status === 'pending' ? 2500 : false` gate
killed polling on the first transient error. With `retry: false`, a 500
left `data` undefined, the callback returned false, and the chip was
stuck "Preparing preview…" forever — exactly the bug the polling layer
was supposed to recover from.

Inverts the gate: stop on terminal success (`ready`/`failed`) or after
5 consecutive errors. Transient errors keep retrying; a permanently
broken endpoint caps at ~12.5s instead of polling forever. Predicate
extracted as `previewRefetchInterval` for direct unit testing without
fighting React Query's timer machinery.

*  feat: render pending-preview files in their own row

Pending deferred-preview chips now bucket into a separate row above
the resolved attachments — reads as "this is still happening" rather
than mixing with completed downloads. Once status flips to ready, the
chip re-buckets into panelArtifacts; failed re-buckets into the file
row alongside other downloads.

* 🎨 fix: render pending-preview chips in the panel-artifact row, not the file row

Previous bucketing put pending chips in the file row (since
`artifactTypeForAttachment` returns null for empty-text records). The
pending placeholder is a future panel artifact — sharing the row keeps
the chip in place when it resolves instead of jumping rows.

Plain files still get their own row.

* 🐛 fix: phase-1 SSE replay must not regress a resolved attachment

Codex P1: useEventHandlers.finalHandler iterates
responseMessage.attachments at stream end and dispatches each through
the attachment handler. Those records are the immediate-persist
snapshot (status:pending, text:null) — if a deferred update has
already moved the same file_id to ready/failed, the existing merge
let the pending fields win and downgraded the resolved record. Result:
chip flickers back to pending and polling restarts until the lazy
sweep corrects.

Pin the terminal lifecycle fields (status, text, textFormat,
previewError) when existing is ready/failed and incoming is pending.
Other field updates still go through.

* 🐛 fix: track preview-poll error cap outside React Query state

Codex P2: the previous cap relied on `query.state.fetchFailureCount`,
but React Query v4's reducer resets that to 0 on every fetch dispatch
(the `'fetch'` action). With `retry: false`, each failed poll left
count at 1 and the next dispatch reset it back to 0, so the `>= 5`
branch never fired and a permanently-broken endpoint polled forever.

Track consecutive errors in a module-level Map keyed by file_id,
incremented in a thin `fetchFilePreview` wrapper around the data
service call. The Map is cleared on success and on cap-stop, so
memory is bounded by in-flight pending file_ids per session.
2026-05-06 03:04:19 -04:00
Danny Avila
cf0657509c
🧵 feat: Enable Anthropic Tool Argument Streaming (#12962)
* fix: Enable Anthropic Tool Argument Streaming

* fix: Honor Anthropic clientOptions drops

* fix: Preserve custom Anthropic beta headers

* fix: Enable Bedrock Anthropic Tool Streaming
2026-05-06 01:09:14 -04:00
Danny Avila
f839a447e1
🧬 fix: Subagent MCP requestBody Propagation (bump @librechat/agents to 3.1.78 + cleanup) (#12959)
* 📦 chore: bump `@librechat/agents` to v3.1.78

v3.1.78 ships [danny-avila/agents#147](https://github.com/danny-avila/agents/pull/147),
which makes `SubagentExecutor` inherit the parent invocation's
`configurable` (with `thread_id`/`run_id`/`parent_run_id` scrubbed)
into the child workflow. Subagent tool dispatches through the parent's
`ON_TOOL_EXECUTE` handler now arrive with parent's `requestBody`,
`user`, `userMCPAuthMap`, etc. — so `{{LIBRECHAT_BODY_*}}` placeholder
substitution and per-user MCP connection lookup work for subagent
tool calls the same way they do for the parent agent.

Note: `package-lock.json` will need an `npm install` refresh once
v3.1.78 lands on the registry. The user/user_id injection added in
PR #12950 stays as defense-in-depth.

* 🗑️ refactor: drop redundant user/user_id injection from `loadToolsForExecution`

`@librechat/agents@3.1.78` (via danny-avila/agents#147) makes
`SubagentExecutor` forward the parent's `configurable` verbatim into
the child workflow. Subagent `ON_TOOL_EXECUTE` dispatches now arrive
with parent's `user` / `user_id` already in `data.configurable` —
making the host-side injection added in #12950 a no-op.

Removes:
- The conditional `user: createSafeUser(req.user); user_id: req.user.id`
  block in `loadToolsForExecution` (req.user.id-guarded so the
  `'api-user'` fallback in Responses/OpenAI controllers is preserved).
- The unused `createSafeUser` import.
- The 4 unit tests covering the now-deleted behavior.

The merge in `handlers.ts` (`{ ...configurable, ...toolConfigurable }`)
still produces a `mergedConfigurable` with the right user identity for
both parent and subagent paths — the values just come from
`configurable` (forwarded by the SDK) rather than `toolConfigurable`.

Other fixes from #12950 stay (IUser.id narrowing, the env.ts /
google/initialize.ts / remoteAgentAuth.ts TS-warning fixes) — they
were independent of the subagent identity propagation issue.

* 📦 chore: update `@librechat/agents` to v3.1.78

This update reflects the transition from the development version `3.1.78-dev.0` to the stable release `3.1.78`. The package-lock.json has been refreshed to ensure consistency with the new version, including updated integrity checks and resolved URLs for the package. This change is part of ongoing improvements to enhance the functionality and stability of the agents module.
2026-05-05 22:07:26 -04:00
Danny Avila
9efd61d57d
🔐 fix: Forward per-file entity_id through code-env priming (#12958)
* 🔐 fix: Forward per-file `entity_id` through code-env priming

Skill files and persisted code-env files now carry their `entity_id` on
the in-memory file refs that seed `Graph.sessions`. Without this, an
execute call that mixes a skill file (uploaded with `entity_id=skillId`)
and a user attachment (uploaded with no `entity_id`) collapses onto a
single request-level entity at the codeapi authorization step and one
side 403s. With per-file `entity_id`, codeapi resolves sessionKey per
file and both authorize.

- `primeSkillFiles` / `primeInvokedSkills`: thread `entity_id` through
  fresh-upload, cache-hit, and per-skill-batch paths in
  `packages/api/src/agents/skillFiles.ts`.
- `primeFiles` (Code/process.js): parse `entity_id` from the persisted
  `codeEnvIdentifier` query string once per iteration; forward through
  `pushFile`, including the reupload path which re-parses the fresh
  identifier returned by codeapi.
- Tests: extend `skillFiles.spec.ts` with two cases — fresh-upload
  propagation and cached-hot-path parsing.

Companion PRs in flight on `@librechat/agents` (forward `entity_id`
through `_injected_files`) and codeapi (per-file authorization). All
three are wire-back-compat: an absent `entity_id` falls back to the
existing request-level resolution.

* 🔧 chore: Update dependencies in package-lock.json and package.json

- Bump `@librechat/agents` to version `3.1.78-dev.0` across multiple package files.
- Upgrade `@langchain/langgraph-checkpoint` to version `1.0.2` and update its peer dependency for `@langchain/core` to `^1.1.44`.
- Update `axios` to version `1.16.0` and `follow-redirects` to version `1.16.0`.
- Add `@types/diff` as a new dependency at version `7.0.2` and include `diff` at version `9.0.0` in the `@librechat/agents` module.
- Introduce optional peer dependency `@anthropic-ai/sandbox-runtime` for `@librechat/agents` with metadata indicating it is optional.

* 🐛 fix: Make skill code-env cache persistence observable

Two changes to surface the skill-bundle re-upload issue without
behavioral changes to tenant scoping (root cause to be confirmed via
the new warn log):

1. `primeSkillFiles` now awaits `updateSkillFileCodeEnvIds` instead of
   firing-and-forgetting it. The prior shape could race with the next
   prime (read-before-write) even when the bulkWrite itself succeeds,
   producing a silent cache miss. Latency cost: ~10–50ms on first
   prime; in exchange every subsequent prime can rely on the
   identifier being persisted by the time it reads.

2. `updateSkillFileCodeEnvIds` now returns `{matchedCount, modifiedCount}`
   from the underlying bulkWrite. `primeSkillFiles` warn-logs when
   `modifiedCount < updates.length`, making any silent drop visible —
   whether the cause is tenant filtering, a `relativePath` mismatch,
   schema-plugin scoping, or something else. Prior shape returned
   `Promise<void>` so any zero-modification result was invisible.

Tests:
- `skill.spec.ts`: real-MongoDB happy path (counts match), no-match
  case (modifiedCount=0), and empty-input contract.
- `skillFiles.spec.ts`: deferred-promise harness proving the call
  site awaits the persist (prime stays pending until the persist
  resolves) and forwards partial-write counts.

Deliberately narrower than the original draft of this commit, which
also bypassed `tenantSafeBulkWrite` for the codeEnvIdentifier write
on the speculative diagnosis that tenant filtering was the cause.
That change was a behavior shift on tenant scoping without
confirmation; reverted pending real-world signal from the new warn
log.

* 🐛 fix: Justify await for skill code-env persistence under concurrency

The await on `updateSkillFileCodeEnvIds` isn't a defensive nicety —
it's load-bearing for cache effectiveness under concurrent priming.

Verified with an out-of-tree harness (`config/test-skill-cache.ts`,
not committed) that wires `primeSkillFiles` against a real codeapi
stack:

- With fire-and-forget (prior shape after this branch's revert):
  back-to-back primes for the same skill miss the cache. Call N+1
  reads SkillFile docs before Call N's write commits, sees no
  `codeEnvIdentifier`, re-uploads, fires its own forget that Call N+2
  also races. Steady-state stays in cache miss for the full burst.

- With await: the prime that does the upload commits its persist
  before resolving, so the next concurrent prime observes the cache
  pointer instead of racing the read. Latency cost ~10–50ms on the
  upload prime; subsequent concurrent primes save an entire batch
  upload.

In production with primes seconds apart this race is rare; at scale
with many users hitting the same skill in the same second it's the
difference between M and N×M uploads.

Updates the regression test to assert the await contract (deferred
persist promise → prime stays pending until persist resolves).
Comment in `skillFiles.ts` rewritten to document the concurrency
rationale rather than the weaker "race-with-next-prime" framing the
prior commit used.
2026-05-05 18:35:09 -04:00
Atef Bellaaj
187ab787da
🌩️ feat: CloudFront CDN File Strategy (#12193)
* 🌩️ feat: CloudFront CDN File Strategy + signed cookies

Squashed from PR #12193:
- feat(storage): add CloudFront CDN file strategy
- feat(auth): add CloudFront signed cookie support

Note: package.json/package-lock.json dependency additions are intentionally
omitted from this commit and will be re-added via `npm install` after rebase
to avoid lock-file merge conflicts. The two new peer deps that need to be
re-installed are:
  - @aws-sdk/client-cloudfront@^3.1032.0
  - @aws-sdk/cloudfront-signer@^3.1012.0

Also fixes 4 missing destructured names in AuthService.spec.js
(getUserById, generateToken, generateRefreshToken, createSession) that
were referenced in tests but not imported from the mocked '~/models'.

* 📦 chore: install CloudFront SDK deps for PR #12193

Adds the two AWS CloudFront packages required by the rebased
CloudFront CDN strategy:
  - @aws-sdk/client-cloudfront
  - @aws-sdk/cloudfront-signer

Following the @aws-sdk/client-s3 pattern:
  - api/package.json: regular dependency (runtime resolution)
  - packages/api/package.json: peerDependency

Generated by `npm install` against the freshly rebased lock file
to avoid the merge conflicts that came from the original PR's
lock-file edits being made against an older base of dev.

* 🐛 fix: CI failures + review findings on CloudFront PR #12193

CI fixes
- Rename packages/data-provider/src/__tests__/cloudfront-config.test.ts
  → src/cloudfront-config.spec.ts. Jest's default testMatch picks up
  __tests__/ directories even inside dist/, so the compiled .d.ts shell
  was being executed as an empty test suite. Moving to .spec.ts (matching
  the rest of the package) avoids the dist/ pickup.
- Add cookieExpiry: 1800 to CloudFront crud.test makeConfig: the schema
  applies a default so CloudFrontFullConfig requires it.

Review findings addressed
- #1 (Codex + comprehensive): Normalize CloudFront domain with /\/+$/
  regex (and key with /^\/+/ regex) in buildCloudFrontUrl, matching the
  cookie code so resource policy and file URLs stay aligned even when
  the configured domain has multiple trailing slashes. Added tests.
- #2: Move DEFAULT_BASE_PATH out of s3Config into shared
  packages/api/src/storage/constants.ts. ImageService no longer imports
  S3-specific config.
- #3: getCloudFrontConfig() returns Readonly<CloudFrontFullConfig> | null
  to discourage mutation of the cached signing config.
- #4: Add cross-field refinement tests for cloudfrontConfigSchema
  (invalidateOnDelete-without-distributionId,
  imageSigning="cookies"-without-cookieDomain).
- #6: Revert unrelated MCP comment re-indentation in
  librechat.example.yaml.
- #7: Add azure_blob to the strategy list comment.

Skipped
- #5 (extractKeyFromS3Url with CloudFront URLs): existing
  deleteFileFromCloudFront tests already cover the path-equivalence
  assumption; renaming the helper is real refactor work beyond this
  PR's scope.
- #8, #9 (NIT, low confidence): leaving for author judgement.

* 🧹 chore: drop dead DEFAULT_BASE_PATH from s3Config test mock

After moving DEFAULT_BASE_PATH to ~/storage/constants, crud.ts no longer
reads it from s3Config — so the entry in the s3Config jest mock was
misleading dead config. The tests still pass because the unmocked real
constants module provides the value.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-05 13:21:05 -04:00
Danny Avila
4583d5a926
refactor: Bound Concurrent Office-HTML Rendering for Code Artifacts (#12951)
* 🧰 feat: add `createConcurrencyLimiter` promise utility

Lightweight, dependency-free FIFO concurrency limiter for bounding
parallelism of expensive async work that fans out from a single
producer. Tasks queue rather than reject when the cap is reached;
slots release on both fulfillment and rejection so a single failure
cannot stall the queue. Each task runs inside a thunk so timeouts
and other side effects do not start until a slot is acquired.

*  perf: bound concurrent office-HTML rendering for code artifacts

A tool result with N office files (DOCX/XLSX/XLS/ODS/CSV/PPTX)
previously fanned out into N parallel mammoth/SheetJS/PPTX renders,
all CPU-bound and synchronous. Under bursty agent output this
competes with the still-running inference loop for event-loop time
and inflates end-of-run "finalize" waits in non-streaming flows
(BaseClient chat-completions, non-streaming Responses, the tools.js
direct endpoint) which all `await Promise.all(artifactPromises)`
before responding.

Cap the parser layer at 2 concurrent office-HTML renders process-
wide via a shared `createConcurrencyLimiter`. Tasks queue in FIFO
and the per-render timeout starts only after a slot is acquired so
queue waits do not consume the timeout budget. The HTML-or-null
contract (no text fallback for office types) is preserved.
2026-05-05 08:53:21 -04:00
Danny Avila
bff9bfea87
🐛 fix: Propagate User Identity to Subagent MCP Tool Calls (#12950)
* 🐛 fix: Propagate User Identity to Subagent MCP Tool Calls

The `@librechat/agents` SDK's `SubagentExecutor` invokes the child
workflow with a fresh configurable of `{ thread_id }` only — the
parent's `user` / `user_id` are dropped on the way into the child
graph. The child's `ToolNode` then dispatches `ON_TOOL_EXECUTE` to the
parent's handler, which merges `{ ...configurable, ...toolConfigurable }`,
but neither side carries user identity for subagents.

Downstream MCP tools read `config.configurable.user?.id || user_id` and
got `undefined`, so `MCPManager.getConnection` fell through to the
"No connection found for server X" error path — it can't reach the
user-connection lookup without a userId.

Re-inject `user` (via `createSafeUser`) and `user_id` from `req.user`
into the configurable returned by `loadToolsForExecution`. This is the
single point all controllers (chat, Responses API, OpenAI-compat) flow
through. For the parent agent it's a no-op (outer config already
carries the same values); for subagents it fills the gap so MCP
connection lookup, user-placeholder substitution, and tools that read
configurable.user all work correctly.

* 🐛 fix: Preserve `api-user` Fallback When Injecting Subagent Identity

Codex review pointed out that the prior commit unconditionally wrote
`user_id: req.user?.id` (and `user`) into `toolConfigurable`. The handler
merges via `{ ...configurable, ...toolConfigurable }` — `toolConfigurable`
wins — so when `req.user` is absent, this overwrote the outer config's
`'api-user'` fallback (set by `responses.js` / `openai.js` for the
unauthenticated API-key path) with `undefined`, breaking MCP connection
lookup for that path.

Only inject the keys when `req.user.id` is truthy. Omitting them lets
the merge preserve whatever the outer configurable already had. Tests
updated to assert key omission for `req.user` undefined / null / present
without `id`.

* 🩹 fix: Narrow `IUser.id` to required string

`IUser` extends mongoose `Document`, which types `id?: any` (the optional
virtual). At runtime `id` is always `_id.toString()` for a hydrated doc,
so narrow the type to a required string.

Closes two `@rollup/plugin-typescript` TS2322 warnings introduced by
PR #12450 (OIDC Bearer Token Authentication for Remote Agent API)
where `req.user = userResolution.user` and the
`(req: Request, res: Response, next: NextFunction)` signature both
failed against the project's local `Express.User` augmentation
(`{ [key: string]: any; id: string; }`) because `IUser.id` was
`any`/optional. Narrowing here fixes both at the source rather than
casting at every assignment site.

* 🩹 fix: Resolve TS Build Warnings Surfaced by `IUser.id` Narrowing

Three rollup TS plugin warnings surfaced after narrowing
`IUser.id` from `any` to `string`:

- `utils/env.ts:95` — `safeUser[field] = user[field]` failed strict
  checking because indexed write through a union-typed key collapses
  the LHS to the intersection of all field write types (i.e.,
  `undefined` when fields have mixed types). The previous `id?: any`
  on IUser had been masking this. Switch to `Object.assign(safeUser,
  { [field]: user[field] })` which widens the assignment.

- `endpoints/google/initialize.ts:35` — `getUserKey({ userId:
  req.user?.id, ... })` failed because `req.user?.id` is now
  `string | undefined` (no longer `any`). Match the pattern already
  used in `endpoints/openAI/initialize.ts:49`: `req.user?.id ?? ''`.

- `middleware/remoteAgentAuth.ts:465` — pre-existing, unrelated to
  the IUser change. The local (gitignored) `express.d.ts` augments
  `express.Request` but not `express-serve-static-core.Request`,
  so the explicit `(req: Request, ...)` annotation imported from
  `'express'` resolves to a Request whose `req.user` differs from
  the one `RequestHandler` expects internally. Type the closure as
  `RequestHandler` directly so TS infers params from the augmented
  type.

* 🩹 fix: Cast `RemoteAgentAuth` Closure to `RequestHandler`

My previous attempt removed the explicit `req: Request` annotation on
the closure to side-step the outer `RequestHandler` mismatch. That
shifted the error to every helper call site inside the closure
(`getConfigOptions(req)`, `runApiKeyAuth(req, ...)` at 467/474/493/
512/531), because the helpers annotate their params with
`express.Request` (which has the local `Request.user` augmentation),
while the unannotated closure inferred `req` as
`express-serve-static-core.Request` (no augmentation). Reproduced
locally by stubbing the gitignored `src/types/express.d.ts`.

Right approach: keep the explicit `req: Request` annotation so the
closure body matches the helpers' types, then cast at the return —
`RequestHandler`'s internal `Request` resolves through
`express-serve-static-core` and lacks the augmentation, so the cast
is the boundary that bridges the two views of `req.user`.

Verified against a build with the local express.d.ts stub: zero
warnings on `remoteAgentAuth.ts`, `env.ts`, and `google/initialize.ts`.
2026-05-05 12:19:50 +09:00
Danny Avila
f20419d0b7
📄 feat: Rich File Artifact Previews for DOCX, CSV, XLSX, PPTX (#12934)
* 📄 feat: Rich File Artifact Previews for DOCX, CSV, XLSX, PPTX

Render office files emitted by tools as interactive previews in the
artifact panel instead of raw extracted text. The backend produces a
sanitized HTML document via mammoth (DOCX), SheetJS (CSV/XLSX/XLS/ODS),
or yauzl-based slide extraction (PPTX) and ships it through the
existing SSE attachment payload; the client routes it through the
Sandpack `static` template's `index.html` slot — no new browser deps,
no client-side blob fetch, no React renderer components.

* 🔐 fix: Restrict data: URLs to <img> in office HTML sanitizer

Codex review on #12934 caught that `data:` lived in the global
`allowedSchemes`, which meant a smuggled `<a href="data:text/html,
<script>...</script>">` would survive sanitization. The Sandpack
iframe sandbox does not gate `target="_blank"` navigations, so a
click would open attacker-controlled HTML in a new tab.

Scope `data:` to `<img src>` only via `allowedSchemesByTag` (mammoth
inlines DOCX images as base64 `data:image/...` URIs — that path still
works). Add a regression suite (`sanitizeOfficeHtml security`) with
8 cases covering: <script> stripping, event-handler removal,
javascript:/data: rejection on anchors, data:image preservation in
<img>, http/https/mailto allowance, target=_blank rel=noopener
enforcement, and <iframe> stripping.

* 🔧 fix: Route extensionless office files by MIME alone

Codex review on #12934 caught that the office-render gate in
`extractCodeArtifactText` only fired when the extension was in
`OFFICE_HTML_EXTENSIONS` or the category was `document`/`pptx`. A tool
emitting `data` with `text/csv` (no extension) classifies as
`utf8-text`, so the gate was skipped and raw CSV text shipped to the
client — but the client routes by MIME to the SPREADSHEET bucket
expecting a full HTML document, so the panel rendered broken text.

Extract a shared `officeHtmlBucket(name, mime)` predicate from
`html.ts` (returns the bucket name or null). Both `bufferToOfficeHtml`
(the dispatcher) and the upstream gate in `extract.ts` now go through
this single source of truth, so they can never drift apart again. The
predicate already mirrors the dispatcher's extension/MIME logic
(extension wins; MIME is the fallback for extensionless inputs).

Adds:
- 14 cases for the new `officeHtmlBucket` predicate covering the
  positive paths (each bucket via extension OR MIME) and the negative
  paths (txt, py, json, jpg, pdf, zip, odt, plain noext).
- A direct regression test in `extract.spec.ts` for the Codex catch:
  `data` with `text/csv` + utf8-text category routes through the
  office HTML producer.
- Parameterized cases for extensionless DOCX/XLSX/XLS/ODS/PPTX files
  identified by MIME alone.

* 🛡️ fix: Enforce extension-wins precedence in officeHtmlBucket

Codex review on #12934 caught that the predicate's if-chain interleaved
extension and MIME checks for each bucket — e.g. CSV's branch was
`ext === 'csv' || CSV_MIME_PATTERN.test(mimeType)`. A `deck.pptx`
shipped with `text/csv` (sandboxed tools sometimes ship generic MIMEs)
matched the CSV branch BEFORE the PPTX extension branch was reached,
so a binary PPTX would have been handed to `csvToHtml` to parse as
text — yielding garbage or a parse exception.

Restructure to a strict two-pass dispatch: an exhaustive extension
table first (one lookup, all known extensions), then MIME-only
fallback for extensionless / unknown-ext inputs. The doc comment's
"extension wins" claim is now actually enforced by the implementation.

Add 7 regression cases covering the conflicting-MIME footgun for each
bucket: deck.pptx + text/csv → pptx; workbook.xlsx + text/csv →
spreadsheet; legacy.xls + pptx-MIME → spreadsheet; report.docx +
text/csv → docx; data.csv + docx-MIME → csv; etc.

* 🛡️ fix: Reject zip-bomb office files before in-process parsing (SEC)

Addresses pre-existing availability vulnerability validated by
SEC review (Codex finding 275344c5...) and made worse by this PR's
HTML rendering path. A sub-1MiB compressed XLSX/DOCX/PPTX (highly
compressed run-of-zeros) inflates to 200+ MiB of XML when handed
to mammoth/xlsx — blocking the Node event loop for 10+ seconds and
spiking RSS to ~1 GiB. The existing 8s `withTimeout` wrapper uses
`Promise.race`, which can only return early; it cannot interrupt
synchronous parser CPU/RAM consumption. PoC ran an authenticated
execute_code call to OOM the API process.

Add `assertSafeZipSize(buffer)` — a yauzl-based pre-flight that
streams every entry with mid-inflate byte counting and bails on
either a per-entry or total decompressed-size cap. Mid-inflate
counting cannot be bypassed by falsifying the central directory's
`uncompressedSize` field (the technique the PoC used). Defaults:
25 MiB per entry, 100 MiB total — generous headroom for legitimate
image-heavy office files, well below the attack profile.

Hook the check into every path that hands a buffer to mammoth/xlsx
/yauzl:
- New HTML producers (`wordDocToHtml`, `excelSheetToHtml`,
  `pptxToSlideListHtml`) — added by this PR
- Legacy RAG text extractors (`wordDocToText`, `excelSheetToText`
  in `crud.ts`) — pre-existing path, also vulnerable
Errors propagate as a tag-distinct `ZipBombError` so callers can
distinguish a refused bomb from generic parse failures. The outer
`extractCodeArtifactText` swallows the error and returns null,
falling back to the regular download UI.

`.xls` (BIFF/CFB binary, not ZIP) is detected by magic bytes and
skipped — yauzl would reject it as malformed anyway.

Adds 15 tests:
- `zipSafety.spec.ts` (9): benign passes, per-entry cap, total cap,
  ZipBombError type-tagging, malformed-zip distinction, directory-
  entry handling, named-error surfacing, and the SEC-PoC pattern
  (sub-1 MiB compressed → 50 MiB inflated rejected on default caps).
- `html.spec.ts` zip-bomb suite (5): each producer rejects a bomb;
  dispatcher propagates correctly; legitimate fixtures still render.
- `extract.spec.ts` (1): outer extractor swallows ZipBombError and
  returns null so the download UI fallback fires.

* 🧹 fix: Normalize MIME parameters; add legacy CSV MIME variant

Two related Codex catches on PR #12934 — both about MIME-routing
inconsistencies between backend and client that would cause
extensionless CSV files to render as broken (raw text under an HTML
slot) or skip the artifact panel entirely.

P2 — backend MIME normalization:
`officeHtmlBucket` matched MIME strings exactly, so a real-world
`text/csv; charset=utf-8` Content-Type slipped through and the
backend returned raw CSV text. The client's `baseMime` helper
strips parameters before its own MIME lookup, so it routed the
same file to the SPREADSHEET bucket expecting an HTML body that
never arrived. Mirror the client's normalization on the backend
(strip everything from `;` onward, lowercase) before bucket
matching.

P3 — client legacy CSV MIME:
Backend's `CSV_MIME_PATTERN` accepts three variants (`text/csv`,
`application/csv`, `text/comma-separated-values`); the client's
`MIME_TO_TOOL_ARTIFACT_TYPE` only had the first two. An
extensionless file with `text/comma-separated-values` would have
backend HTML produced but the client would skip the artifact
panel entirely. Add the missing variant.

Tests:
- 9 new parameterized-MIME cases on backend covering charset/
  boundary/case variants for every bucket.
- 1 new client routing case for `text/comma-separated-values`.

* 🩹 fix: Try office HTML before short-circuiting on category=other

Codex review on #12934 caught that the early `category === 'other'`
return short-circuited before `hasOfficeHtmlPath` was checked. The
classifier returns 'other' for inputs the new dispatcher can still
route — extensionless `application/csv` (CSV MIMEs aren't in the
classifier's text-MIME set and don't start with `text/`), and
extensionless office MIMEs with parameters like `application/vnd...
spreadsheetml.sheet; charset=binary` (the classifier's `isDocumentMime`
exact-matches these MIMEs without parameter normalization). Both would
route correctly through `officeHtmlBucket` but never reached it.

Move the office-HTML attempt above the 'other' early return, and drop
the `|| category === 'document' || category === 'pptx'` shortcut now
that `hasOfficeHtmlPath` covers the same surface (with parameter
normalization) and a wider one. ODT still routes through `extractDocument`
unchanged — `hasOfficeHtmlPath` returns false for it and the
`category === 'document'` branch below handles it.

Adds 3 regression tests:
- extensionless `application/csv` + category='other' → office HTML
- extensionless parameterized office MIME + category='other' → office HTML
- defense check: actual binary 'other' (image/jpeg) still returns null
  without invoking the office producer

* 🛡️ fix: Office types are HTML-or-null (no text fallback → XSS)

Codex P1 review on #12934 caught that when `renderOfficeHtml` failed
(timeout, malformed file, zip-bomb rejection) for an office type, the
extractor fell through to `extractDocument` and returned plain text.
The client routes by extension/MIME to the office preview buckets and
feeds `attachment.text` straight into the Sandpack iframe's
`index.html`. A spreadsheet cell or document body containing the
literal string `<script>alert(1)</script>` would have been injected
as executable markup — direct XSS.

The contract for office types is now HTML-or-null with no text
fallback. Failed render returns null, the client's empty-text gate
keeps the artifact off the panel, and the file falls back to the
regular download UI (matching what PPTX already did). PDF and ODT
still go through `extractDocument` because the client routes them to
PLAIN_TEXT (which the markdown viewer escapes) or no artifact at all,
so plain text is safe there.

Test reshuffle:
- `document` describe block now uses ODT/PDF for the legacy
  parseDocument-path tests (DOCX/XLSX/XLS/ODS bypass that path).
- New "does NOT call parseDocument for office HTML types" test locks
  in the SEC contract for all four office HTML buckets.
- "falls back to ..." tests rewritten as "returns null when ..." with
  explicit `parseDocumentCalls.length === 0` assertions to prove no
  text leaks back to the client.
- New XSS regression test for the XLSX failure path.
- Mock parseDocument failure-name match relaxed to `includes()` so
  ODT-named tests can use the same trigger.

* 🧽 chore: Address follow-up review findings on PR #12934

Wraps up the 10-finding follow-up review. Two MAJOR + four MINOR + two
NIT addressed; one NIT skipped after verifying it was a misread of the
package.json structure.

MAJOR
- #1: Rewrite `renderOfficeHtml` JSDoc to document the HTML-or-null
  contract explicitly. The pre-fix doc described a text-fallback path
  that was the original XSS vector (commit b06f08a). A future
  maintainer trusting the stale doc could reintroduce the fallback.
- #2: Replace byte-truncation of office HTML with a small "preview too
  large" banner document. Cutting at a UTF-8 boundary lands mid-tag
  (`<table><tr><td>con\n…[truncated]`) and ships malformed markup to
  the iframe — unpredictable rendering, occasional broken layouts on
  DOCX with embedded images / wide spreadsheets.

MINOR
- #4: Wrap `readSlidesFromZip`'s `zipfile.close()` in try/catch so a
  close-time exception (mid-flight stream) doesn't replace the
  original error. Mirrors the defensive pattern in zipSafety.ts.
- #5: Refactor PPTX extraction to use `yauzl.fromBuffer` directly,
  eliminating the temp-file write/unlink the safety pre-flight already
  proved unnecessary. Removes 4 unused imports (os, path, fs/promises,
  randomUUID).
- #6: Extract `isPreviewOnlyArtifact(type)` to `client/src/utils/
  artifacts.ts` so the membership check is unit-testable without
  mounting the full Artifacts component (Recoil + Sandpack + media
  query). 15 new test cases covering positive types, negative types,
  null/undefined, and unknown strings.

NIT
- #3: Remove dead `stripColorStyles` / `COLOR_PROPERTY_PATTERN` —
  unused (sanitizer's `allowedStyles` config handles color implicitly).
- #7: Remove dead `!_lc_csv_label` worksheet property write.
- #9: Remove no-op `exclusiveFilter: () => false` sanitize-html config.
- #10: Type-narrow `PREVIEW_ONLY_ARTIFACT_TYPES` to
  `ReadonlySet<ToolArtifactType>` so the membership table is
  compile-time checked against the enum.

SKIPPED
- #8: Reviewer flagged `sanitize-html` as duplicated in devDeps and
  dependencies. The package has no `dependencies` section — only
  `devDependencies` and `peerDependencies`. Existing convention
  (mammoth, xlsx, yauzl, pdfjs-dist) is to appear in BOTH. Removing
  the devDep entry would break local test runs.

Tests: packages/api 4406/4406, client artifacts 128/128.

* 🪞 chore: Fix isPreviewOnlyArtifact test description parameter order

Follow-up review nit on PR #12934. Jest's `it.each` substitutes `%s`
positionally, and the table rows were `[type, expected]` while the
description template read `'returns %s for type %s'` — outputting
"returns application/vnd.librechat.docx-preview for type true"
instead of the intended "type ... returns true".

Reorder the template to match the column order. Test runner output
now reads naturally: "type application/vnd.librechat.docx-preview
returns true". Pure cosmetic — runtime behavior unchanged.

*  feat: Improve DOCX rendering and surface filename in panel header

Two UX improvements based on hands-on use of the office preview pipeline.

DOCX rendering — mammoth strips the navy banners, cell shading, and
column layouts that direct-formatted docs apply (python-docx-style
output is a common case). The flat `<p><strong>X</strong></p>` and
bare `<table><tr><td>` it emits looks washed out next to the source.
Three targeted compensations:

- Style map promotes `Title`, `Subtitle`, `Heading 1` thru `Heading 6`,
  and `Quote` paragraphs to their semantic HTML equivalents (mammoth's
  default only handles Heading 1-6, missing Title/Subtitle/Quote).
- Extra CSS scoped to `.lc-docx` gives the first table row sticky-
  looking header styling regardless of `<thead>` (mammoth never emits
  `<thead>`), adds zebra striping, and treats the python-docx
  `<p><strong>X</strong></p>` section-heading idiom as a pseudo-h2 with
  a thin accent left border so document structure survives the round
  trip. Headings get a left accent or underline so they read as
  headings instead of just bold paragraphs.
- Sanitizer's `allowedAttributes` opens `class` on the heading and
  block tags the styleMap and CSS heuristics rely on. `<script>`,
  event handlers, javascript: URLs, etc. are still stripped — the
  existing security regression suite catches any drift.

Panel header — `Artifacts.tsx` showed a generic "Preview" pill for
preview-only artifacts. Single-tab Radio is a no-op; surfacing the
document filename there gives the user something useful in the chrome
without taking real estate. `displayFilename` handles the sandbox
dotfile suffix the upload pipeline applies.

Tests: html.spec.ts +1 (new CSS-emission lock), 71/71. Backend files
suite 428/428. Client 308/308.

*  feat: High-fidelity DOCX preview via docx-preview in iframe

Switch the default DOCX render path from server-side mammoth → flat
HTML to client-side `docx-preview` loaded inside the Sandpack iframe.
Mammoth becomes the fallback for files above the cap.

Why
---
The Sandpack iframe is a real browser DOM. Server-side rendering
ceiling for DOCX→HTML is well below the source's visual fidelity —
mammoth strips cell shading, run colors, banners, and column layouts
because Word's layout model doesn't fit HTML's flow model. Pushing the
render into the iframe lifts that ceiling without paying the
server-side cost of jsdom or LibreOffice.

What
----
- New `wordDocToHtmlViaCdn(buffer)` builds a self-contained HTML doc
  that embeds the binary as base64 and lets `docx-preview@0.3.7`
  render it on load. CSS preserves dark/light mode handoff via
  `prefers-color-scheme`. Bootstrap script falls back to a "preview
  unavailable, please download" message if the CDN is unreachable or
  the parse throws.
- `docx-preview` and its `jszip` peer dep are pinned to specific
  versions on jsdelivr with SRI sha384 integrity hashes and
  `crossorigin="anonymous"`. Refresh: re-fetch the file, run
  `openssl dgst -sha384 -binary FILE | openssl base64 -A`.
- CSP locked down on the iframe: `default-src 'none'`, scripts only
  from jsdelivr (no eval), `connect-src 'none'` so a parser bug in
  docx-preview can't be turned into exfiltration of the embedded
  document, `base-uri 'none'`, `form-action 'none'`. Defense in depth
  on top of the Sandpack cross-origin sandbox.
- `wordDocToHtml` dispatches by size: ≤ 350 KB binary → CDN path
  (high fidelity), larger → mammoth fallback (preserves the size cap
  on `attachment.text`). 350 KB chosen so worst-case base64-inflated
  output (~478 KB) plus wrapper overhead (~5 KB) fits under
  MAX_TEXT_CACHE_BYTES (512 KB) with 40 KB headroom.
- Internal renderers exported as `_internal` for tests. Public API
  unchanged — callers still go through `wordDocToHtml`.

PPTX intentionally NOT switched
-------------------------------
Surveyed the available client-side PPTX libraries:
- `pptx-preview@1.0.7` ships an ESM-only main entry plus a 1.36 MB
  UMD that references `require("stream"/"events"/"buffer"/"util")` —
  bundled for Node, not browser-clean. Could work but the runtime
  references to undefined Node globals are a fragility risk worth
  more validation than this PR can absorb.
- `pptxjs` is jQuery-era, requires four separate UMD scripts in a
  specific order, less actively maintained.
- The honest answer for PPTX is the LibreOffice sidecar (DOCX/XLSX/
  PPTX → PDF → PDF.js), which is the architecture every major
  product (Google Drive, Claude.ai, ChatGPT) effectively uses and
  the only path to ~5/5 fidelity for arbitrary user decks.

PPTX stays on the existing slide-list extraction for now. Open a
follow-up issue for the LibreOffice/Gotenberg sidecar.

Tests
-----
- 6 new in CDN-rendered describe block: wrapper structure, base64
  round-trip, SRI integrity + crossorigin, CSP locks
  (connect-src/eval/base-uri/form-action), fallback message wiring,
  size-threshold lock.
- Adjusted 2 existing tests that asserted on mammoth-path artifacts
  (literal document text in `<article class="lc-docx">`) — those
  assertions move to the mammoth-fallback test that calls
  `_internal.wordDocToHtmlViaMammoth` directly. Dispatcher tests now
  assert CDN-path signatures instead.

packages/api files: 434/434 , full unit suite 4473/4473 .

* 🧷 fix: Address Codex P1 (MIME aliases) + P2 (CDN dependency)

Two follow-up review findings on PR #12934, both real.

P1 — Spreadsheet MIME aliases on client
----------------------------------------
Backend's `officeHtmlBucket` uses the broad `excelMimeTypes` regex from
`librechat-data-provider` (covers `application/x-ms-excel`,
`application/x-msexcel`, `application/msexcel`, `application/x-excel`,
`application/x-dos_ms_excel`, `application/xls`, `application/x-xls`,
plus the canonical sheet MIMEs). The client's exact-match
`MIME_TO_TOOL_ARTIFACT_TYPE` only had three of those, so an
extensionless XLS upload with a legacy MIME would have backend HTML
produced but the client would fail to route the artifact at all —
preview chip never registers.

Fix: import the same regex on the client and add it as a fallback in
`detectArtifactTypeFromFile` after the exact-match map miss. Stays in
lock-step with the backend automatically.

7 new test cases — one per legacy alias.

P2 — Hard CDN dependency on jsdelivr
-------------------------------------
Air-gapped / corporate-filtered networks where jsdelivr is unreachable
would see DOCX previews permanently degrade to "Preview unavailable"
because the iframe could never load the renderer scripts. Mammoth was
sitting right there on the server but the dispatcher always preferred
the CDN path for files under 350 KB.

Fix: `OFFICE_PREVIEW_DISABLE_CDN` env var. When truthy (`1`, `true`,
`yes`, case-insensitive, whitespace-trimmed), `wordDocToHtml`
short-circuits to the mammoth path regardless of file size. Operators
on filtered networks set the env var; default behavior is unchanged.

Read at function-call time (not module load) so jest can flip it in
`beforeEach` without `jest.resetModules()`. The cost is one property
access per render.

12 new test cases: env-unset uses CDN (default), all five truthy
forms force mammoth, five non-truthy forms (`false`/`0`/`no`/empty/
arbitrary string) leave CDN active.

Tests
-----
packages/api/src/files: 446/446  (was 434, +12 from env-var matrix).
client artifact suites: 235/235  (was 228, +7 from MIME aliases).

*  feat: High-fidelity PPTX preview via pptx-preview in iframe

Mirrors the DOCX CDN architecture for PPTX: small files (≤350 KB
binary) embed as base64 and render via `pptx-preview` loaded from
jsdelivr inside the Sandpack iframe. Larger files and air-gapped
deployments fall back to the existing slide-list extraction.

Why
---
PPTX is the format where the gap between LibreChat's preview and
Claude.ai-style previews was most visible (slide-list of bullet
points vs. rendered slide layouts). LibreOffice → PDF → PDF.js is
still the eventual gold-standard answer for PPTX fidelity, but
client-side rendering inside the Sandpack iframe gets us a
meaningful intermediate step (~1.5/5 → ~3.5/5) without a sidecar.

What
----
- `pptx-preview@1.0.7` (ISC license, ~1.36 MB UMD bundle that
  includes its echarts/lodash/uuid/jszip/tslib deps inline). Pinned
  to a specific version on jsdelivr with SHA-384 SRI and
  `crossorigin="anonymous"`.
- `buildPptxCdnDocument` mirrors the DOCX wrapper: same CSP locks
  (`default-src 'none'`, `connect-src 'none'`, no eval, no base/form
  tampering), same `id="lc-doc-data"` base64 slot, same fallback
  message wiring (`typeof pptxPreview === 'undefined'` →
  "Preview unavailable").
- New public `pptxToHtml(buffer)` dispatcher; `bufferToOfficeHtml`
  switches its `'pptx'` case to call it. `pptxToSlideListHtml` stays
  exported as the slide-list-only path (still hit by tests directly
  and by the dispatcher fallback).
- `OFFICE_PREVIEW_DISABLE_CDN=true` env-var hatch applies to PPTX
  too — air-gapped operators get the slide-list path. Same env-var
  read at call time, same matrix of truthy values (`1` / `true` /
  `yes` / case-insensitive / whitespace-trimmed).
- `_internal` re-exports moved to after the PPTX section since the
  PPTX internals live further down in the file. Adds
  `pptxToHtmlViaCdn`, `MAX_PPTX_CDN_BINARY_BYTES`,
  `PPTX_PREVIEW_CDN`.

Honest caveats
--------------
- The 1.36 MB UMD bundle has `require("stream"/"events"/"buffer"/
  "util")` references in its outer wrapper. Those are bundled-dep
  artifacts (likely from `tslib` / Node-shim transforms) and don't
  appear to execute on the browser code paths, but I haven't done
  manual e2e on a wide range of decks. If a class of files turns up
  that breaks rendering, the iframe-side fallback message catches it
  and operators have `OFFICE_PREVIEW_DISABLE_CDN=true` as the bail.
- First-render CDN fetch is ~1.36 MB (browser-cached after).
- PPTX with embedded media easily exceeds the 350 KB binary cap;
  those files take the slide-list path. Lifting the cap is a
  follow-up (tied to the broader self-hosting work).

Tests
-----
11 new in two new describe blocks:
- `pptxToHtml dispatcher`: routing predicate (small → CDN, env-set
  → slide-list).
- `CDN-rendered path`: base64 round-trip, SRI integrity +
  crossorigin, CSP locks (connect/eval/base/form), fallback message,
  size-threshold lock at 350 KB.
- `OFFICE_PREVIEW_DISABLE_CDN escape hatch`: env-var matrix for
  truthy values.

packages/api/src/files: 457/457  (was 446, +11).

* 🪟 fix: DOCX preview fills the artifact panel width

docx-preview defaults to rendering at the document's native page
width (8.5in for letter, 21cm for A4). In a wide artifact panel
that left whitespace on either side; in a narrow one it forced
horizontal scroll.

Two changes:
- Pass `ignoreWidth: true` to `docx.renderAsync` so the library skips
  the document's pageSize width and uses its container's width.
- Defensive CSS overrides on `.docx-wrapper` and `.docx-wrapper > section.docx`
  in case a future library version regresses on the option, plus
  `padding: 0` on the wrapper to drop the page-edge whitespace
  docx-preview otherwise reserves.

`renderHeaders`/`renderFooters`/etc. stay enabled — those still
appear in the rendered output, just inside a container that fills
the panel instead of a fixed-width "page."

Tests unchanged (100/100); manual e2e ahead of merge.

* 🩹 fix: PPTX black screen — allow blob: workers + harden bootstrap

Manual e2e of the PPTX CDN renderer surfaced a black screen with
"Could not establish connection. Receiving end does not exist."
unhandled-rejection — characteristic of a Web Worker that couldn't
start.

Root cause: pptx-preview's bundled echarts dep spins up Web Workers
via blob: URLs for chart rendering. Our CSP had `default-src 'none'`
and no `worker-src`, so workers fell back to default → blocked. The
async failure deep inside echarts didn't surface through the outer
`previewer.preview()` promise, so my bootstrap's `.catch` never fired,
the loading state was removed, and the iframe sat with the body
background showing through (dark navy in dark mode = "black screen").

Three changes:
- Add `worker-src blob:` to the PPTX CSP. Allows blob:-only worker
  creation without permitting arbitrary worker URLs.
- Bootstrap: window-level `unhandledrejection` and `error` listeners
  so rejections from inside bundled-dep async pipelines surface as
  the user-facing "Preview unavailable" fallback instead of going
  silent.
- Bootstrap: 8-second timeout that checks `container.children.length`
  — if the renderer hasn't appended anything visible by then, assume
  silent failure and show the fallback.

Also wipe `container.innerHTML` when showing the fallback so a partial
render doesn't compete with the message.

DOCX wrapper unchanged: docx-preview doesn't use workers, so the
worker-src directive doesn't apply, and the existing fallback path
already covers its failure modes.

Tests
-----
- Existing PPTX CSP test now also asserts `worker-src blob:` is present.
- Existing fallback-message test extended to cover the new
  unhandledrejection/error/timeout listeners.

packages/api/src/files: 467/467 .

* 🔒 fix: gate office HTML routing on backend trust flag (textFormat)

Codex P1 review on PR #12934: routing .docx/.csv/.xlsx/.xls/.ods/.pptx
into the office preview buckets assumed `attachment.text` was already
sanitized full-document HTML, but that guarantee only existed for the
new code-output extractor path. Existing stored attachments and other
non-code paths can still carry plain extracted text — `useArtifactProps`
would then inject that as `index.html` inside the Sandpack iframe.

Adds a `textFormat: 'html' | 'text' | null` trust flag persisted on
the file record by the code-output extractor, surfaced over the SSE
attachment payload and the TFile API type. The client's routing in
`detectArtifactTypeFromFile` requires `textFormat === 'html'` before
landing on an office HTML bucket; everything else (legacy attachments,
RAG-extracted plain text from `parseDocument`, explicitly-marked
'text' entries) falls back to the PLAIN_TEXT bucket where the
markdown viewer escapes content rather than executing it.

Tests: new `getExtractedTextFormat` helper has 14 cases covering all
office paths, legacy XLS MIME aliases, parseDocument fallthroughs,
and null-input. Client `artifacts.test.ts` adds three security-gate
tests proving downgrade behavior for missing/null/'text' textFormat,
plus a `fileToArtifact` test that legacy office attachments without
the flag end up in PLAIN_TEXT with their content escaped.

* 🌐 fix: air-gapped DOCX preview — embed mammoth fallback in CDN doc

Codex P2 review on PR #12934: the CDN-rendered DOCX path always pulled
docx-preview + jszip from cdn.jsdelivr.net. Air-gapped or corporate-
filtered networks where jsdelivr is blocked would degrade to a static
"Preview unavailable" message even though the server already had a
local mammoth renderer that could produce readable output.

Now the dispatcher renders mammoth first and embeds the sanitized
output inside the CDN document as a hidden `#lc-fallback` block. The
iframe's existing `typeof docx === 'undefined'` check (which fires
when the CDN scripts can't load) un-hides the fallback so the user
sees a real preview. CDN-success path is unchanged: high-fidelity
docx-preview output owns the viewport, mammoth fallback stays hidden.

Two new safeguards in the dispatcher:
- Size budget: if base64(binary) + mammoth body + wrapper > 512 KB
  (the `attachment.text` cache cap), drop to mammoth-only so a giant
  document still renders. The `OFFICE_HTML_OUTPUT_CAP` constant
  mirrors `MAX_TEXT_CACHE_BYTES` from extract.ts (separate constant
  to avoid a circular import; pinned by a unit test).
- `lc-render` is hidden when fallback shows so the empty padded slot
  doesn't sit above the mammoth content.

Tests: existing CDN-path tests updated for the new
`wordDocToHtmlViaCdn(buffer, mammothBody)` signature; new test for
the embedded fallback structure (`#lc-fallback`, mammoth body
content, "High-fidelity renderer unavailable" notice, render-slot
hide); new constant pin and per-fixture cap-respect assertion.

* 🧪 feat: LibreOffice → PDF preview path (POC, opt-in via env)

Per the plan-mode discussion: prove out a LibreOffice subprocess
pipeline as an alternative to the docx-preview / pptx-preview CDN
renderers. LibreOffice handles every office format Microsoft and
LibreOffice itself can open (DOCX, PPTX, XLSX, ODT, ODP, ODS, RTF,
many more), produces a PDF, and the host browser's built-in PDF
viewer renders it inside the Sandpack iframe via a `data:` URI.
No client-side JS dependency, no CDN dependency, true high
fidelity for any feature LibreOffice supports.

Off by default. Operators opt in by setting both:
  - `OFFICE_PREVIEW_LIBREOFFICE=true`
  - LibreOffice (`soffice` or `libreoffice`) on the server's `$PATH`

When either is missing, the dispatcher falls through to the
existing CDN/mammoth/slide-list pipeline so a misconfiguration
doesn't break previews.

Hardening (`packages/api/src/files/documents/libreoffice.ts`):
- Fresh subprocess per call with isolated temp dir, stripped env
  (PATH/HOME/TMPDIR only), and `-env:UserInstallation` so concurrent
  conversions can't collide on shared `~/.config/libreoffice` locks
- 30-second wall-time cap; SIGKILL on timeout
- 50 MB PDF output cap to bound disk pressure
- 512 KB output cap on the wrapped HTML so the SSE/cache contract
  stays intact (base64 inflates ~33%, effective PDF cap ~380 KB)
- Macros disabled by default flags (`--norestore --invisible
  --nodefault --nofirststartwizard --nolockcheck`)
- Tag-distinct `LibreOfficeUnavailableError` /
  `LibreOfficeConversionError` so callers can swallow appropriately

Iframe wrapper (`buildPdfEmbedDocument`):
- Native browser PDF viewer via `<iframe src="data:application/pdf;
  base64,...">` — works in Chrome, Edge, Safari, Firefox
- CSP locks the iframe to `default-src 'none'; frame-src data:;
  connect-src 'none'; script-src 'unsafe-inline'` — no outbound
  network, no eval, no external scripts
- `#view=FitH` for first-paint sizing
- 4-second heuristic timer that swaps to a "Preview unavailable"
  fallback when the browser's PDF viewer is disabled (kiosk mode,
  Brave Shields, etc.)

Wired into `wordDocToHtml` and `pptxToHtml` as the first branch —
returns null when disabled / unavailable / oversized so the existing
pipeline takes over. XLSX intentionally NOT routed through this
path: SheetJS's HTML output is already excellent for spreadsheets
(sortable, sticky headers) and PDF rendering of sheets is awkward.

Tests (`libreoffice.spec.ts`, 30 cases — 25 always run, 5 conditional
on the binary): env-gating parser semantics matching
`OFFICE_PREVIEW_DISABLE_CDN`, fallthrough contract (never throws,
returns null on any failure), CSP lock-down, fallback structure,
binary probe caching + missing-binary path, error tagging, and
integration tests that engage when `soffice`/`libreoffice` is on
PATH (DOCX→PDF, PPTX→PDF, output-cap fallthrough). Integration
tests skip cleanly on bare CI.

* 🩹 fix: CI — preserve legacy download path for empty-text office attachments

Two regressions surfaced after the textFormat security gate landed.

1. **Client** (`LogContent.test.tsx` "falls back to the legacy download
   branch for an office file with no extracted text"):

   When the security gate downgraded an office type without
   `textFormat: 'html'` to PLAIN_TEXT, the lenient empty-text gate on
   PLAIN_TEXT then accepted a missing `text` field and rendered a
   half-empty panel card. The historical contract is "office type +
   no text → legacy download UI"; the downgrade should only fire when
   there's actual plain text that needs safe-escaping.

   Fix in `detectArtifactTypeFromFile`: short-circuit to null when the
   office type lands in the security-gate branch with no text. The
   PLAIN_TEXT downgrade still fires for legacy attachments that DO
   carry plain text.

2. **API** (`process.spec.js` + `process-traversal.spec.js`): the
   `@librechat/api` mocks didn't expose `getExtractedTextFormat`, so
   `processCodeOutput` called `undefined(...)` → TypeError → tests got
   undefined results. Added the helper to both mocks with a faithful
   default (returns 'text' for non-null extractor output, null
   otherwise).

Tests: new regression in `artifacts.test.ts` pinning the empty-text
+ no-textFormat → null contract for all four office types
(.docx/.csv/.xlsx/.pptx), so a future refactor can't silently
re-introduce the half-empty card.

* 🩹 fix: PPTX slides scale to fit panel width (no horizontal scroll)

Manual e2e on PR #12934: pptx-preview rendered slides at their native
init dimensions (960×540 default). The artifact panel is much narrower
than that, so the iframe got a horizontal scrollbar and only a corner
of each slide showed at any time — the user had to drag-scroll across
each slide to read it.

Fix: keep pptx-preview's init at 960×540 so its internal layout math
stays correct, then post-process each rendered slide:
- Cache the slide's native width/height on its dataset BEFORE
  applying any transform (so subsequent re-fits don't measure the
  already-transformed box).
- Wrap the slide in `.lc-slide-wrap` with explicit width/height set
  inline to the scaled dimensions; the wrap shrinks the layout space
  the slide occupies.
- Apply `transform: scale(panel_width / 960)` to the slide itself
  with `transform-origin: top left` so the rendered output shrinks
  from the top-left corner into the wrap.
- Cap the scale at 1.0 so small slides don't upscale and get blurry.

Streaming + resize:
- `MutationObserver` watches the container for slide insertions so
  streaming renders get scaled on arrival rather than waiting for
  the entire `previewer.preview` promise to settle.
- `ResizeObserver` re-fits all wrapped slides when the iframe
  resizes (panel drag, window resize).

Tests: new "bootstrap wraps + scales each slide" lock in the wrap
class, scale computation, observer setup, and native-size caching
so a future refactor can't silently re-introduce the overflow.

* 🩹 fix: PPTX wrap+scale runs after preview, not during streaming

Manual e2e on PR #12934: regenerated PPTX showed "Preview unavailable"
in the iframe. Root cause: the MutationObserver I added in the
previous commit fired during pptx-preview's render and moved slides
out from under the library's references. pptx-preview's async
pipeline raised an unhandled rejection, the iframe's window-level
listener caught it, and the fallback message replaced the partial
render.

Fix: drop the MutationObserver. Apply the wrap+scale ONCE in a
`finalize` step that runs:
  - On `previewer.preview().then` (the happy path)
  - On the 8-second timeout safety net IF the container has children
    (silent-failure path — pptx-preview emitted slides but never
    resolved its outer promise)

To prevent the user from seeing an unscaled flash while pptx-preview
renders into the 960px-wide canvas, the container is set to
`visibility: hidden` at init and only revealed inside `finalize`
after wrap+scale completes.

Resize handling stays via `ResizeObserver` on `document.body`,
installed AFTER the wrap pass so it doesn't fire during the wrap
itself.

Tests: regression assertion now also locks in:
  - `container.style.visibility = 'hidden' / 'visible'` (the flash-
    prevention contract)
  - Absence of MutationObserver (the bug we just removed — must NOT
    creep back in via a future "let's scale during streaming" idea)

* 🩹 fix: PPTX slides fill panel width (drop upscale cap, per-slide scale)

Manual e2e on PR #12934: slides rendered correctly but didn't fill the
artifact panel — whitespace on either side. Two issues:

1. The scale was capped at `Math.min(1, available / SLIDE_W)`. On
   panels wider than 960px, the cap clamped the scale to 1.0 and
   slides rendered at native size with whitespace on the sides
   instead of stretching.

2. The scale was computed against the constant `SLIDE_W = 960`, but
   pptx-preview can emit slides whose `offsetWidth` differs from the
   init param if the source PPTX has a non-16:9 layout. Per-slide
   division of `available / nativeW` handles that case.

Fix: replace `computeScale()` with two helpers — `availableWidth()`
returns the panel content-box width and `scaleFor(nativeW)` returns
the per-slide scale. No upscale cap. The slide content is rendered
by pptx-preview against its 960×540 canvas using vector text /
canvas — scaling up to e.g. 1500px doesn't visibly degrade quality.

Tests: regression now also asserts:
  - `availableWidth()` and `scaleFor()` exist by name
  - The exact scale formula `availableWidth() / (nativeW || SLIDE_W)`
  - Negative assertion that `Math.min(1, ...)` is NOT present, so a
    future "let's add an upscale cap" rewrite can't silently
    re-introduce the whitespace.

* 🩹 fix: PPTX preview fills panel height (no white gap below slides)

Manual e2e on PR #12934: PPTX preview filled the panel width but left
empty space below the last slide. DOCX didn't have this issue because
its content (mammoth-rendered HTML) flows naturally and either fits
exactly or overflows; PPTX slides are fixed-aspect 16:9 and don't
grow with the panel.

Two changes:

1. **Body fills the iframe viewport** — `html, body { min-height:
   100vh }` plus `body { display: flex; flex-direction: column }`
   and `#lc-render { flex: 1 0 auto }`. The dark theme bg now fills
   the iframe even when total slide content is shorter than the
   panel, so a single-slide deck never reveals a "white below" gap.

2. **Per-slide scale honors viewport height** — `scaleFor(nativeW,
   nativeH)` now returns `min(width-fit, height-fit)` (largest
   factor that fits without overflowing either dimension). On a
   tall artifact panel with a short deck, slides grow up to the
   full panel height instead of staying at the width-bound size.
   Existing height-fit was always considered correct conceptually
   but the previous implementation only used width-fit, leaving
   half the viewport unused per slide.

Tests: regression now also asserts `availableHeight()`, the
`Math.min(sw, sh)` formula, and `min-height: 100vh` are in the
bootstrap. Negative assertion for the old `Math.min(1, ...)` upscale
cap remains.

* 🩹 fix: revert body flex on PPTX bootstrap (caused black-screen render)

Manual e2e regression on PR #12934: the previous commit added
`body { display: flex; flex-direction: column }` plus
`#lc-render { flex: 1 0 auto }` to fill the panel height. Side effect:
pptx-preview's internal layout assumes block flow on its ancestor
elements; making body a flex container caused slides to render as
solid-black rectangles (sized correctly, but with no visible content
inside).

Fix: keep just `html, body { min-height: 100vh }` for the bg-fill
effect — that alone gives empty space below short decks the dark
theme bg without changing flow. Drop the body-flex and the
`#lc-render { flex: 1 0 auto }` directives.

The height-aware `scaleFor(nativeW, nativeH)` from the same commit
stays — it doesn't interact with pptx-preview's layout, just chooses
a per-slide scale. Each slide still grows to fit the viewport
contain-style.

Negative-assertion added to the regression test: `body { display:
flex }` must NOT appear in the bootstrap, so a future "let's flex
the body to make height work" rewrite can't silently re-introduce
this.

(Note: the user also flagged DOCX theming as faint body text; I'm
leaving that for now per their note that it may be pre-existing.
Not addressed in this commit.)

* 🩹 fix: revert PPTX height-fill changes; lock DOCX CDN to light scheme

Two fixes for separate manual e2e regressions on PR #12934.

**1. PPTX black screen (single slide rendering as solid black).**

The previous fix removed `body { display: flex }` thinking that was
the sole cause, but the regression persisted. Bisecting against the
last known-good commit (4e2d538b0, width-fit only), the actual culprit
is the COMBINATION of:
- `min-height: 100vh` on html/body
- `availableHeight()` reading viewport-derived dimensions
- `Math.min(sw, sh)` height-aware scale

pptx-preview's CSS injection step interacts unpredictably with
these. Reverting to width-only `scaleFor(nativeW)` and dropping the
viewport min-height restores reliable rendering. Vertical empty
space below short decks now shows the body's bg color (`var(--bg)`)
which still matches the panel theme — that's an acceptable trade-off
vs. the black-screen regression.

Negative assertions added: `Math.min(sw, sh)`, `availableHeight`,
`min-height: 100vh`, `body { display: flex }` must NOT appear in
the bootstrap. So a future "let's fill height" rewrite has to
demonstrate it doesn't break pptx-preview before it can land.

**2. DOCX body text rendering as faint / translucent grey.**

docx-preview emits page-style rendering with white pages and the
docs native text colors. The CDN doc declared
`color-scheme: light dark`, so on OS dark mode the iframes
inheritable `--fg` resolved to `#e5e7eb` (light grey). docx-preview
body text (no explicit color in the source DOCX) inherited that
light-grey on the white page bg → barely-visible "translucent"
rendering.

Fix: declare `color-scheme: light` only in `buildDocxCdnDocument`,
drop the dark-mode `@media` override. docx-preview is a light-mode-
only renderer; matching that produces correct contrast regardless
of OS theme. The mammoth-only `wrapAsDocument` path is unaffected
— it owns its own bg + text colors and continues to respect the
users OS scheme.

New regression test pins the lock: CDN doc must contain
`color-scheme: light`, must NOT contain `color-scheme: light dark`,
must NOT contain `prefers-color-scheme: dark`.

* 🩹 fix: relax connect-src to allow sourcemap fetches (silence CSP noise)

Manual e2e on PR #12934: every time DevTools is open while viewing a
DOCX or PPTX preview, the console fills with CSP violations like:

  Connecting to 'https://cdn.jsdelivr.net/npm/docx-preview@0.3.7/
  dist/docx-preview.min.js.map' violates the following Content
  Security Policy directive: "connect-src 'none'". The request has
  been blocked.

The actual rendering isn't affected (sourcemap fetches happen AFTER
the script has already loaded and executed via `script-src`), but
the noise is enough to make people suspect a real problem and
distracts from useful console output.

Fix: relax `connect-src` from `'none'` to `'self' https://cdn.
jsdelivr.net` in both DOCX and PPTX CDN docs. This allows:
  - Same-origin fetches (sandpack-static-server) — covers any
    bundler-embedded sourcemaps + same-origin runtime fetches the
    renderer might make
  - jsdelivr fetches — covers sourcemaps from the CDN where we
    loaded the script

Exfiltration risk stays minimal: the iframe is cross-origin to
LibreChat so an attacker can't read application data anyway, and
neither 'self' (sandpack-static-server) nor jsdelivr is a useful
target for exfiltrating slide content to a host the attacker
controls.

Tests updated: assertions for `connect-src 'none'` swapped to
`connect-src 'self' https://cdn.jsdelivr.net` for both DOCX + PPTX
CDN docs. Added negative assertion for wildcard `*` in connect-src
so a future "let's allow everything" rewrite can't widen the
exfiltration surface.

* 🩹 fix: surface PPTX/DOCX fallback reason (inline + console)

Manual e2e on PR #12934: "Preview unavailable" appears in the iframe
with no way to know what actually failed. The reason was tucked into
the fallback element's `title` attribute (hover-only tooltip) — easy
to miss and impossible to copy/paste.

Now surfaces three ways:
  1. Visible inline via a `<details>` element with the reason in
     monospace, folded so the friendly message stays primary but the
     diagnostic is one click away in the iframe itself.
  2. `title` attribute (preserved) for hover tooltip.
  3. `console.error('[pptx-preview] fallback fired:', reason)` so
     DevTools shows it in red — also the only reliable way to see
     the reason if the iframe is detached / re-mounted.

DOCX gets the same console mirror (as `console.warn` since the
fallback there is "high-fidelity unavailable, showing simplified
preview" — informational, not error). The DOCX fallback already
displays the mammoth-rendered content visibly, so no `<details>`
needed there.

Tests: regression assertions pin the diagnostic surfacing — the
`<details>` element, the `title` write, and the `console.error`
call must all be present in the bootstrap.

* 🩹 fix: PPTX CDN embeds slide-list fallback + detects empty renders

Manual e2e + DOM inspection on PR #12934: pptx-preview silently
produces empty `.pptx-preview-wrapper` placeholders for pptxgenjs-
generated decks. The library parses the file enough to create the
960×540 host element with a black bg, then fails to populate it.
The outer Promise resolves "successfully" — no throw, no rejection,
the bootstrap thinks rendering succeeded — and the user sees a black
rectangle with no content and no fallback message.

Fix mirrors the DOCX mammoth-fallback pattern from commit 0c0b0ce88:

1. **Server side**: `pptxToHtml` now renders the slide-list body
   (`<ol class="lc-pptx-list">...`) via the new `renderPptxSlidesBody`
   helper, then embeds it inside the CDN doc via the new
   `buildPptxCdnDocument(base64, slideListFallbackBody)` signature.
   Combined-doc size budget mirrors the DOCX pattern: if the CDN doc
   would exceed `OFFICE_HTML_OUTPUT_CAP` (512 KB), drop to slide-list
   only.

2. **Iframe bootstrap**: new `hasRenderedContent()` check after
   `wrapSlides()` walks each `.lc-slide-wrap` looking for actual
   child content inside pptx-preview's emitted slide nodes. If every
   wrap is empty, fires `showFallback('renderer-produced-empty-
   wrappers ...')` which reveals the embedded slide-list view
   instead of the previous static "Preview unavailable" message.

3. **CSS**: slide-list rules extracted to `PPTX_SLIDE_LIST_CSS`
   constant so they can be inlined into both the standalone slide-
   list document AND the CDN doc's `<style>` block (CSP `style-src`
   is `'unsafe-inline'` only — no external sheets).

`renderPptxSlidesHtml` now delegates to `renderPptxSlidesBody`
wrapped in `wrapAsDocument` — single source of truth for the slide
markup.

Tests (506 passing, +1 vs before): existing `pptxToHtmlViaCdn`
call sites updated for the new fallback-body argument; new
regression test pins `hasRenderedContent`, the
`renderer-produced-empty-wrappers` reason string, the embedded
fallback structure, and the inlined slide-list CSS.

* fix: Detect Empty PPTX Preview Slides

* 🩹 fix: LibreOffice PDF embed uses blob: URL (Chrome blocks data: PDFs)

Manual e2e on PR #12934: enabling `OFFICE_PREVIEW_LIBREOFFICE=true`
on a host with `soffice` installed surfaced "This page has been
blocked by Chrome" inside the PDF preview iframe.

Root cause: Chrome blocks `data:application/pdf;base64,...`
navigations inside sandboxed iframes (anti-phishing measure since
Chrome 76, see crbug.com/863001). The Sandpack iframe IS sandboxed
(its `sandbox="..."` attribute lacks `allow-top-navigation` for
data: URLs specifically), so when our inner `<iframe src="data:
application/pdf;...">` tries to navigate, Chrome's interstitial
fires and renders the "blocked" message.

Fix: switch from `data:` URL to `blob:` URL. The bootstrap now:
  1. Reads the base64 payload from a `<script type="application/
     octet-stream;base64">` data block (same pattern as the DOCX
     and PPTX wrappers).
  2. Decodes via `atob` + `Uint8Array.from`.
  3. Creates a `Blob` with `type: 'application/pdf'`.
  4. `URL.createObjectURL(blob)` produces a same-origin blob: URL.
  5. Sets `pdfFrame.src = url + '#view=FitH'` — Chrome treats blob:
     URLs as legitimate navigation and serves the built-in PDF
     viewer.

CSP updated: `frame-src blob:` (was `frame-src data:`). `data:` is
now explicitly NOT allowed in `frame-src` since Chrome would block
it anyway in our context — keeping it would be misleading
documentation.

Bonus: failure paths now log to `console.error` with a
`[libreoffice-pdf]` prefix so DevTools surfaces blob-creation
failures and PDF-viewer load timeouts in red.

Tests updated:
- "emits a complete sandboxed HTML document" now asserts the
  data-block + blob URL construction (not the old data: URL).
- New CSP test "allows blob: in frame-src (NOT data:)" with both
  positive and negative assertions to lock in the change.
- Integration test for `tryLibreOfficePreview` updated to look for
  the data block + `URL.createObjectURL` instead of the data: URL.
- Large-payload test now verifies the data block round-trip rather
  than data: URL escaping (base64 alphabet has no characters that
  break out of `<script>` anyway).

* 🩹 fix: LibreOffice PDF embed renders via pdf.js (Chrome blocks blob: PDFs too)

Manual e2e on PR #12934 round 2: switching from `data:` to `blob:`
URLs (commit d90f26c11) didn't fix the "This page has been blocked
by Chrome" interstitial. Chrome blocks BOTH data: AND blob: PDF
navigations inside sandboxed iframes — the built-in PDF viewer
requires a top-level browsing context. The Sandpack host iframe is
sandboxed, so neither approach works.

Fix: switch from native browser PDF viewer to pdf.js (Mozilla's
pdfjs-dist) loaded from CDN. pdf.js renders to `<canvas>` which
works in any context — no plugin, no privileged viewer, no
top-level requirement. ~1 MB CDN load is acceptable for a path
that's already opt-in via `OFFICE_PREVIEW_LIBREOFFICE=true`.

Implementation:
- Pin pdf.js v3.11.174 (single-file UMD; v4+ uses ES modules which
  complicate the load + SRI flow)
- Worker URL pointed at the same jsdelivr origin; CSP `worker-src
  https://cdn.jsdelivr.net blob:` allows it
- DPR-aware canvas rendering: scale based on `panelWidth /
  page.viewport.width * devicePixelRatio` so retina displays get
  crisp pixels
- Sequential page rendering (Promise chain) so a many-slide PDF
  doesn't spawn N parallel render jobs
- 15 s timeout safety net (was 4 s for the native viewer; pdf.js
  with DPR=2 on a many-page PDF can take longer)

CSP changes:
- Added `script-src https://cdn.jsdelivr.net 'unsafe-inline'` (was
  inline-only)
- Added `worker-src https://cdn.jsdelivr.net blob:`
- Removed `frame-src` entirely (no nested iframes)
- Removed `object-src` (no `<object>`/`<embed>` either)

Same diagnostic surfacing as the other CDN paths: failure reasons
shown via `<details>` disclosure inline + `console.error` to
DevTools.

Tests updated: PDF.js script presence, GlobalWorkerOptions setup,
canvas render path, all the new failure detection paths. Negative
assertions for both `data:application/pdf` and `blob:...application
/pdf` so a future "let's just try the native viewer again" rewrite
can't silently re-introduce the Chrome block.

SRI hashes intentionally omitted (unlike docx-preview / pptx-
preview) — operator opted in by setting the env flag and trusts
the LibreOffice render pipeline. Worth adding once the path is
proven in production.

* 🧹 cleanup: trim unused _internal exports + stale JSDoc references

After the LibreOffice + pdf.js path proved out, swept the office HTML
modules for dead code and stale documentation.

**Unused `_internal` exports removed (`html.ts`):**
  - `renderMammothBody` — only called within the file (by
    `wordDocToHtmlViaMammoth` and `wordDocToHtml`), never imported by
    tests.
  - `DOCX_PREVIEW_CDN` — internal config constant, never referenced.
  - `PPTX_PREVIEW_CDN` — same, never referenced.

The remaining `_internal` surface (`wordDocToHtmlViaCdn`,
`wordDocToHtmlViaMammoth`, `pptxToHtmlViaCdn`,
`MAX_DOCX_CDN_BINARY_BYTES`, `MAX_PPTX_CDN_BINARY_BYTES`,
`OFFICE_HTML_OUTPUT_CAP`) is all actively used by the spec file.

**Stale JSDoc fixed (`libreoffice.ts`):**

Module-level header still claimed we "embed the PDF as a base64
data:application/pdf URI" and "rely on the host browser's built-in
PDF viewer". Both untrue after the pdf.js switch in commit b2cc81ad8.
Updated to:
  - Describe the actual pipeline: PPTX → soffice → PDF → pdf.js → canvas
  - Document the dead-end iterations (data: blocked, blob: also blocked,
    pdf.js works) so future readers don't re-discover the same Chrome
    PDF-viewer-in-sandboxed-iframe limitation
  - Drop "(POC)" tag — the path is production-quality, just opt-in
  - Adjust disk footprint estimate (250-350 MB with
    `--no-install-recommends` is more accurate than the 500 MB original)

No production code changes; tests still 505 passing.

*  feat: per-format LibreOffice opt-in (env value accepts format list)

Manual e2e on PR #12934: enabling `OFFICE_PREVIEW_LIBREOFFICE=true`
forces both DOCX and PPTX through the LibreOffice path. DOCX renders
~instantly via docx-preview and rarely needs the LibreOffice
treatment; paying the ~2-3 s cold-start there hurts UX without
adding much.

Solution: extend the env var to accept three forms:
  - Truthy (`true`/`1`/`yes`): all formats — backwards compatible
    with the previous behavior
  - Falsy (`false`/`0`/`no`/empty/unset): no formats — default
  - Comma-separated list (`pptx`, `pptx,docx`): just those formats

Practical guidance documented in the module header: most operators
will set `OFFICE_PREVIEW_LIBREOFFICE=pptx` — pptx-preview chokes on
pptxgenjs decks and the slide-list fallback loses formatting, so
LibreOffice is the only path that produces a faithful PPTX preview.
DOCX is well-served by docx-preview's existing CDN renderer.

API:
- New `isLibreOfficeEnabledFor(format)` is the per-format gate, used
  by `tryLibreOfficePreview` to short-circuit before doing work.
- Existing `isLibreOfficeEnabled()` retained for "any format
  enabled" diagnostic checks (returns true if at least one format
  is opted in).
- Internal `parseLibreOfficeEnablement` returns `'all' | Set | null`
  — keeps the gate future-proof: adding a new format to the
  LibreOffice route doesnt require operators to re-enumerate their
  env value.

Edge cases handled:
- Whitespace-tolerant: `  pptx  ,  docx  ` works
- Case-insensitive on both env value AND format name
- Empty list entries dropped: `pptx, ,docx` enables pptx + docx
- Empty string treated as unset (not as a valid empty list)

Tests: 21 new cases pinning the parse semantics + per-format gate
(`pptx` env vs `docx` lookup → false, etc.). Existing
`isLibreOfficeEnabled` tests retained but renamed to clarify the
"any format" semantic.

Total file tests: 526 passing (+21 vs before).

* 🔒 fix: officeHtmlBucket only does MIME fallback when extension is empty

Codex P2 review on PR #12934: the server's `officeHtmlBucket` falls
back to MIME whenever the extension isn't an OFFICE extension. The
client's `detectArtifactTypeFromFile` is stricter — it routes by
extension first for ANY known extension (`.txt` → PLAIN_TEXT,
`.md` → MARKDOWN, `.py` → CODE, etc.), only falling back to MIME
when the extension is unknown.

Mismatch case: `notes.txt` shipped with `Content-Type: application/
vnd.openxmlformats-officedocument.wordprocessingml.document`. Server
runs `officeHtmlBucket` → extension `.txt` not office → MIME fallback
→ 'docx' → produces full HTML, sets `textFormat: 'html'`. Client
routes by extension to PLAIN_TEXT (extension wins), markdown viewer
escapes the HTML, user sees raw `<html>...` markup instead of the
rendered preview.

Fix: server only falls back to MIME when extension is genuinely empty
(extensionless filename). Symmetric with the client's "extension
wins for any known extension" semantic — neither will mis-route.

Trade-off: a true DOCX renamed to `myfile.bin` with the canonical
DOCX MIME no longer routes through office HTML on the server. The
client would have routed to the office bucket via MIME, then the
security gate (`textFormat !== 'html'`) would have downgraded to
PLAIN_TEXT anyway. So the user-visible outcome is the same (raw
bytes via PLAIN_TEXT) — the new behavior just avoids producing HTML
that the client would never use.

Long-term fix: share the extension routing table in data-provider
so both server and client query the same source of truth. Out of
scope for this PR.

Tests: new 8-case `it.each` block in `officeHtmlBucket predicate`
locks in the contract — `.txt`/`.md`/`.json`/`.py`/`.html`/`.css`
+ office MIME → null, and `.bin`/`.dat` + office MIME → null too.
Existing extension-wins tests still pass unchanged.

Total file tests: 534 (+8 vs before).
2026-05-05 12:06:10 +09:00
Artyom Bogachenko
5683706af5
🔐 feat: OIDC Bearer Token Authentication for Remote Agent API (#12450)
* Remote Agent Auth middleware

* consider migration and update user

* fix eslint errors

* add scope validation

* fix codex review errors

* add filter for use: sig

* add jwks-rsa deps

* Fix remote agent OIDC auth review findings

* Polish remote agent OIDC timeout coverage

* Reject remote OIDC tokens without subject

* Use tenant context for remote agent auth config

* Harden remote agent OIDC scope handling

* Polish remote agent OIDC cache and scope tests

* Resolve remote agent auth review comments

* Reuse OpenID email claim resolver for remote auth

* Skip empty OpenID email fallback claims

* Use pre-auth tenant context for remote auth config

* Downgrade expected OIDC fallback logging

* Require secure remote OIDC endpoints

* Polish remote agent auth edge cases

* Enforce unique balance records

* Bind remote OpenID users to issuer

* Fix issuer-scoped OpenID indexes

* Avoid unique balance index requirement

* Fix remote OpenID issuer normalization boundaries

* Require issuer-bound OpenID lookups

* Enforce tenant API key policy after auth

* Fix remote auth tenant policy types

* Normalize remote OIDC discovery issuer

* Allow normalized remote OIDC issuer validation

* Enforce resolved tenant OIDC policy

* Polish OpenID issuer and scope validation

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-04 17:06:35 -04:00
Danny Avila
f5dd053128
🛡️ refactor: Restrict User Tavily Endpoint URLs (#12946) 2026-05-05 05:09:11 +09:00
Danny Avila
3170bd8b22
📦 chore: bump @librechat/agents to v3.1.77 2026-05-03 23:54:46 -04:00
Danny Avila
2c4a78094a
🛂 refactor: Avoid Default Tavily Safe Search (#12939) 2026-05-04 12:01:23 +09:00
Yashwanth Alapati
3da1d8c961
🔍 feat: add Tavily as Search and Scraper Provider (#12581)
* feat: add Tavily integration as search provider and scraper provider

* chore:update tavily web search parameters

* chore:tavily paramer update

* chore:update data-schemas test for tavily

* fix: allow Tavily string option modes

* fix: align Tavily config options

* fix: scope Tavily scraper timeout

* fix: use resolved scraper provider timeout

* fix: widen Tavily search provider types

* fix: harden Tavily web search config

* fix: cap Tavily option timeouts

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-04 11:29:13 +09:00
Danny Avila
d6d70eeb26
📦 chore: bump @librechat/agents to v3.1.76 2026-05-03 22:25:12 -04:00
Danny Avila
4cce88be42
🪟 feat: Add allowedAddresses Exemption List For SSRF-Guarded Targets (#12933)
* 🪟 feat: Add allowedAddresses Exemption List For SSRF-Guarded Targets

LibreChat already blocks SSRF-prone targets (private IPs, loopback,
link-local, .internal/.local TLDs) at every server-side fetch site
that consumes user-controllable URLs — custom-endpoint baseURLs, MCP
servers, OpenAPI Actions, and OAuth endpoints. The only existing
escape hatch is `allowedDomains`, but that flips the field into a
strict whitelist: adding `127.0.0.1` to permit a self-hosted Ollama
also blocks every public destination that isn't in the list.

Introduce `allowedAddresses` as the orthogonal primitive: a private-
IP-space exemption list. When a hostname or its resolved IP appears
in the list, the SSRF block is bypassed for that target. Public
destinations remain reachable. Operators can now run self-hosted
LLMs / MCP servers / Action endpoints on private addresses without
weakening the default-deny posture for everything else.

Schema additions in `packages/data-provider/src/config.ts`:
- `endpoints.allowedAddresses` (new — gates `validateEndpointURL`)
- `mcpSettings.allowedAddresses` (parallel to `allowedDomains`)
- `actions.allowedAddresses` (parallel to `allowedDomains`)

Core changes in `packages/api/src/auth/`:
- New `isAddressAllowed(hostnameOrIP, allowedAddresses)` — pure,
  case-insensitive, bracket-stripped literal match.
- Threaded the list through `isSSRFTarget`, `resolveHostnameSSRF`,
  `isDomainAllowedCore`, `isActionDomainAllowed`, `isMCPDomainAllowed`,
  `isOAuthUrlAllowed`, and `validateEndpointURL`.
- Extended `createSSRFSafeAgents` and `createSSRFSafeUndiciConnect`
  to accept the list, building an SSRF-safe DNS lookup that exempts
  matching hostnames/IPs at TCP connect time (TOCTOU-safe).

Wiring:
- Custom and OpenAI endpoint initialize sites pass
  `endpoints.allowedAddresses` to `validateEndpointURL`.
- `MCPServersRegistry` stores `allowedAddresses` and exposes it via
  `getAllowedAddresses()`. The factory, connection class, manager,
  `UserConnectionManager`, and `ConnectionsRepository` all thread
  it through to the SSRF utilities.
- `MCPOAuthHandler.initiateOAuthFlow`, `refreshOAuthTokens`, and
  `validateOAuthUrl` accept the list and consult it on every URL
  validation along the OAuth chain.
- `ToolService`, `ActionService`, and the assistants/agents action
  routes pass `actions.allowedAddresses` to `isActionDomainAllowed`
  and to `createSSRFSafeAgents` for runtime action calls.
- `initializeMCPs.js` reads `mcpSettings.allowedAddresses` from the
  app config and forwards it to the registry constructor.

Documentation:
- `librechat.example.yaml` shows the new field next to each existing
  `allowedDomains` block, with a note clarifying that
  `allowedAddresses` is an exemption list (not a whitelist).

Tests:
- Unit tests for `isAddressAllowed` covering literal IPs, hostnames,
  IPv6 brackets, case insensitivity, and partial-match rejection.
- Exemption tests for every entry point: `isSSRFTarget`,
  `resolveHostnameSSRF`, `validateEndpointURL`, `isActionDomainAllowed`,
  `isMCPDomainAllowed`, `isOAuthUrlAllowed`.
- Existing tests updated to reflect the new optional parameter.

Default behavior is unchanged: omitted = empty list = no exemptions.

* 🩹 fix: Plumb allowedAddresses Through AppConfig endpoints Type

The initial PR added `endpoints.allowedAddresses` to the
data-provider config schema and consumed it in the endpoint
initialize sites, but the runtime `AppConfig.endpoints` shape in
`@librechat/data-schemas` was a hand-maintained subset that didn't
include the new field — so `tsc` rejected `appConfig.endpoints.allowedAddresses`.

Add the field to `AppConfig['endpoints']` in
`packages/data-schemas/src/types/app.ts` and forward it from the
loaded config in `packages/data-schemas/src/app/endpoints.ts` so the
runtime config carries the value.

Update `initializeMCPs.spec.js` to expect the third positional
argument (`allowedAddresses`) on the `createMCPServersRegistry` call.

* 🩹 fix: Enforce allowedDomains Before allowedAddresses In isOAuthUrlAllowed

The initial implementation checked the address exemption first, so a
URL whose hostname appeared in `allowedAddresses` would return true
even when the admin had configured `allowedDomains` as a strict bound
on OAuth endpoints. A malicious MCP server could advertise OAuth
metadata, token, or revocation URLs at any address the admin had
permitted for an unrelated reason (a self-hosted LLM at `127.0.0.1`,
for example) and pass validation, expanding SSRF reach beyond the
configured domain whitelist.

Reorder: when `allowedDomains` is set, treat it as authoritative —
return true only if the URL matches a domain entry, otherwise fall
through to false. The address exemption only applies when no
`allowedDomains` is configured (mirrors how the downstream SSRF check
in `validateOAuthUrl` consults `allowedAddresses`).

Add a regression test asserting that an `allowedAddresses` entry does
not broaden a configured `allowedDomains` list.

Reported by chatgpt-codex-connector on PR #12933.

* 🩹 fix: Forward allowedAddresses To Remaining OAuth Callers

Two `MCPOAuthHandler` callers still used the pre-feature signatures and
were silently dropping the new `allowedAddresses` argument:

- `api/server/routes/mcp.js` invoked `initiateOAuthFlow` with the old
  5-argument shape, so OAuth flows initiated through the route handler
  ignored the registry's `getAllowedAddresses()` and would reject any
  metadata/authorization/token URL on a permitted private host.
- `api/server/controllers/UserController.js#maybeUninstallOAuthMCP`
  invoked `revokeOAuthToken` without the address exemption, so
  uninstalling an OAuth-backed MCP server on a permitted private host
  would fail at the revocation step even though the rest of the MCP
  connection path now permits it.

Both sites now read `allowedAddresses` from the registry alongside
`allowedDomains` and forward it. Reported by Copilot on PR #12933.

* 🩹 fix: Update Test Mocks And Assertions For OAuth allowedAddresses

The previous commit started passing `allowedAddresses` to
`MCPOAuthHandler.initiateOAuthFlow` from `api/server/routes/mcp.js`
and to `MCPOAuthHandler.revokeOAuthToken` from
`api/server/controllers/UserController.js`, but the corresponding
test files mocked the registry without `getAllowedAddresses` (causing
`TypeError`s) and asserted the old positional shape on
`toHaveBeenCalledWith`.

Update the mocks and assertions to match the new arity:

- `api/server/routes/__tests__/mcp.spec.js`: add
  `getAllowedDomains`/`getAllowedAddresses` to the registry mock and
  expect the additional positional args on `initiateOAuthFlow`.
- `api/server/controllers/__tests__/maybeUninstallOAuthMCP.spec.js`:
  add a `getAllowedAddresses` mock alongside the existing
  `getAllowedDomains` and seed it in `setupOAuthServerFound`.
- `api/server/controllers/__tests__/UserController.mcpOAuth.spec.js`:
  add `getAllowedAddresses` to the registry mock and expect the
  trailing `null` arg on the three `revokeOAuthToken` assertions.

* 🛡️ fix: Address Comprehensive Review — Scope allowedAddresses To Private IP Space

Major findings from the comprehensive PR review (severity → fix):

**CRITICAL — `validateOAuthUrl` SSRF fallback bypass.** When `allowedDomains`
is configured and a URL fails the whitelist, the SSRF fallback in
`validateOAuthUrl` was still passing `allowedAddresses` to `isSSRFTarget` /
`resolveHostnameSSRF`, letting a malicious MCP server advertise OAuth
endpoints at any address the admin had permitted for an unrelated reason.
Suppress `allowedAddresses` in the fallback when `allowedDomains` is active —
the address exemption is opt-in for the no-whitelist mode only.

**MAJOR — WebSocket transport SSRF check ignored exemptions.** The
`constructTransport` WebSocket branch called `resolveHostnameSSRF(wsHostname)`
without `this.allowedAddresses`, so a permitted private MCP server would
pass `isMCPDomainAllowed` but be blocked at transport creation. Forward
the exemption.

**Scope `allowedAddresses` to private IP space only (operator directive).**
The exemption list is for permitting private/internal targets; it must not
be a back-door to broaden trust to public destinations.
- Schema (`packages/data-provider/src/config.ts`): new
  `allowedAddressesSchema` rejects URLs (`://`), paths/CIDR (`/`),
  whitespace, and public IPv4/IPv6 literals at config-load time. Wired
  into `endpoints`, `mcpSettings`, and `actions`.
- Runtime (`packages/api/src/auth/domain.ts`): `isAddressAllowed` now
  drops public-IP candidates and public-IP entries on the match path —
  defense in depth so a misconfigured runtime list never grants exemption.
- Hot path (`packages/api/src/auth/agent.ts`): `buildSSRFSafeLookup`
  pre-normalizes the list into a `Set<string>` once at construction and
  applies the same scoping filter, so the connect-time DNS lookup is an
  O(1) Set membership check instead of a full re-iterate-and-normalize on
  every outbound request.

**Test coverage for the connect-time and OAuth-fallback paths.**
- `agent.spec.ts`: new describe block exercising `buildSSRFSafeLookup` and
  `createSSRFSafe*` with `allowedAddresses` — hostname-literal exemption,
  resolved-IP exemption, public-IP scoping, URL/CIDR/whitespace rejection,
  and the default no-list block.
- `handler.allowedAddresses.test.ts` (new): integration tests for
  `validateOAuthUrl` — covers both the no-domains-set "permit private"
  path and the strict-bound regression where `allowedAddresses` must NOT
  bypass `allowedDomains`.

**Documentation & cleanup.**
- `connection.ts` redirect SSRF check: explicit comment that
  `allowedAddresses` is intentionally NOT consulted for redirect targets
  (server-controlled, must not inherit the admin's exemption).
- `MCPConnectionFactory.test.ts`: replaced an `eslint-disable` with a
  proper `import { getTenantId } from '@librechat/data-schemas'`. The
  disable was added to make a pre-existing `require()` quiet — the cleaner
  fix is to use the existing top-level import.

Updated `MCPConnectionSSRF.test.ts` WebSocket SSRF assertions to match the
new two-argument call shape (`hostname, allowedAddresses`).

* 🩹 fix: Require Absolute URL Before allowedAddresses Trust Bypass In isOAuthUrlAllowed

`parseDomainSpec` is lenient — it silently prepends `https://` to
schemeless inputs so it can match patterns like bare `example.com`.
That leniency leaked into `isOAuthUrlAllowed`'s new
`allowedAddresses` short-circuit: a value like `10.0.0.5/oauth` (no
scheme) would parse successfully via the prepended default, hit the
address-exemption path, return `true`, and skip `validateOAuthUrl`'s
strict `new URL(url)` parse-or-throw — only to fail later in OAuth
discovery with a less clear runtime error.

Add a strict `new URL(url)` gate at the top of `isOAuthUrlAllowed`.
Schemeless inputs now fall through to `validateOAuthUrl`'s explicit
"Invalid OAuth <field>" rejection. Tests added in both
`auth/domain.spec.ts` (unit) and the OAuth handler integration spec
(end-to-end).

Reported by chatgpt-codex-connector (P2) on PR #12933.

* 🛡️ fix: Address Follow-Up Comprehensive Review — Schema Tests, Shared Normalization, host:port

Auditing the second comprehensive review:

**F1 MAJOR — schema validation untested.** `allowedAddressesSchema` had
zero coverage, so a regression in the three refinement stages or the
three wiring locations (`endpoints` / `mcpSettings` / `actions`) would
silently let invalid entries reach the runtime. Added a dedicated
`describe('allowedAddressesSchema')` block in `config.spec.ts` covering:
valid private IPs (v4 + v6, including the previously-missed 192.0.0.0/24
range), accepted hostnames, all rejection categories (URLs, CIDR, paths,
whitespace tabs/newlines, host:port, public IP literals), and full
`configSchema.parse()` integration at each of the three nesting points.

**F2 MINOR — `isPrivateIPv4Literal` divergence.** The schema reimpl in
`packages/data-provider` was discarding the `c` octet, so the
`192.0.0.0/24` (RFC 5736 IETF protocol assignments) range that the
authoritative `isPrivateIPv4` accepts was being rejected with a
misleading "public IP" error. Destructure `c` and add the missing range
check; covered by the new schema tests.

**F3 MINOR — DRY violation across `domain.ts` and `agent.ts`.** Both
files had independent normalization implementations with a subtle
whitespace-check divergence (`/\s/` vs `.includes(' ')`). Extracted the
shared logic into a new `packages/api/src/auth/allowedAddresses.ts`
module that both consumers import:
  - `normalizeAddressEntry(entry)` — single-entry shape check
  - `looksLikeHostPort(entry)` — host:port detector (used by F4)
  - `normalizeAllowedAddressesSet(list)` — pre-normalized Set for the
    connect-time hot path
  - `isAddressInAllowedSet(candidate, set)` — membership check that
    enforces private-IP scoping on the candidate

Both `isAddressAllowed` (preflight) and `buildSSRFSafeLookup` (connect)
now go through the same primitives; the whitespace divergence is gone.

To break the import cycle (`allowedAddresses` needs `isPrivateIP`,
`domain` previously owned it), extracted IP private-range detection
into a leaf `auth/ip.ts` module. `domain.ts` re-exports `isPrivateIP`
for backward compatibility with existing call sites.

**F4 MINOR — `host:port` silently misclassified.** Entries like
`localhost:8080` previously slipped through the URL/path guard, were
mis-detected as IPv6, failed `isPrivateIP`, and were silently dropped
with a misleading "public IP" schema error. Added an explicit
`looksLikeHostPort` check with a clear error: "allowedAddresses
entries must not include a port — list the bare hostname or IP only."
Bare `::1`, `[::1]`, and other valid IPv6 literals are intentionally
not matched (regex distinguishes by colon count and the bracketed
`[ipv6]:port` form).

**F5 MINOR — hostname-trust documentation gap.** Hostname entries
short-circuit `resolveHostnameSSRF` before any DNS lookup — that's a
deliberate design (admin trusts the name) but it means the exemption
follows whatever the name resolves to at runtime. Added an explicit
note in `librechat.example.yaml` for both `mcpSettings.allowedAddresses`
and `endpoints.allowedAddresses`: "a hostname entry trusts whatever IP
that name resolves to. Only list hostnames whose DNS you control.
Prefer literal IPs when you can."

**F6** (8 positional params) is flagged for follow-up; refactor to an
options object is a breaking-API change deferred to a separate PR.
**F7** (redirect/WebSocket asymmetry, NIT, conf 40) — skipping; the
existing inline comment is sufficient.

* 🧹 chore: Address Follow-Up NITs — Import Order And Mirror-Function Naming

Three NITs from the latest comprehensive review:

**NIT #1 (conf 85) — local import order.** AGENTS.md requires local
imports sorted longest-to-shortest. Both `domain.ts` and `agent.ts`
had `./ip` (shorter) before `./allowedAddresses` (longer). Swapped.

**NIT #2 (conf 60) — missing cross-reference.** The schema-side
`isHostPortShape` in `packages/data-provider/src/config.ts` had no
note pointing at the canonical runtime mirror. Added a JSDoc paragraph
explaining the mirror relationship and why a local copy exists (the
data-provider package can't import from `@librechat/api` without
creating a circular dependency).

**NIT #3 (conf 50) — naming inconsistency.** Renamed
`isHostPortShape` → `looksLikeHostPort` so the schema mirror matches
the runtime helper exactly. Kept as a separate function (not a shared
import) for the same circular-dependency reason; the matching name
makes it obvious they should stay in lockstep.
2026-05-03 21:43:59 -04:00
Danny Avila
5013d6d35c
🧯 fix: Harden Code Env Filepath Uploads (#12936)
* fix: Harden code env filepath uploads

* test: Cover code env filepath edge cases

* fix: Scrub code env fallback filenames
2026-05-04 10:25:59 +09:00
Danny Avila
41a6d6d11c
🛡️ fix: Harden MCP Redirect SSRF Checks (#12931)
* fix: Harden MCP redirect SSRF checks

* fix: Address MCP redirect review feedback

* test: Tighten MCP SSRF redirect assertions
2026-05-04 09:58:47 +09:00
Danny Avila
1b79e0b785
🧬 chore: Align LibreChat With Agents LangChain Upgrade (#12922)
* 🔧 chore: Update dependencies in package-lock.json and package.json

- Bump version of @librechat/agents to 3.1.75-dev.0 in multiple package.json files.
- Upgrade various AWS SDK and Smithy dependencies to their latest versions in package-lock.json for improved stability and performance.

* 🔧 chore: Update AWS SDK and Smithy dependencies in package-lock.json

- Bump version of @aws-sdk/client-bedrock-runtime to 3.1041.0 and update related dependencies for improved performance and stability.
- Upgrade various AWS SDK and Smithy packages to their latest versions, ensuring compatibility and enhanced functionality.

* chore: Align LibreChat with agents LangChain upgrade

- Route LangChain imports through @librechat/agents facade exports
- Update @librechat/agents to 3.1.75-dev.1 and remove direct LangChain deps
- Normalize nullable agent model params and API key override typing
- Update Google thinking config typing for newer LangChain packages
- Refresh targeted audit-related dependency overrides

* chore: Add Jest types for API specs

* test: Fix LangChain upgrade CI specs

* test: Exercise agents env facade

* fix: Clean up TS preview diagnostics

* fix: Address Codex review feedback
2026-05-03 12:46:01 -04:00
Danny Avila
eb22bb6969
🧭 fix: Migrate Anthropic Long Context (#12911) 2026-05-02 22:14:19 +09:00
Danny Avila
f3e1201ae7
📌 fix: Stabilize Agent Prompt Cache Prefix (#12907)
* fix: stabilize agent prompt cache prefix

* chore: refresh agents sdk lockfile integrity

* test: format agent memory assertion

* test: type agent context fixtures

* fix: preserve MCP instruction precedence

* fix: reuse resolved conversation anchor

* fix: keep resumable startup immediate
2026-05-02 09:55:31 +09:00
Danny Avila
74307e6dcc
💭 feat: Require Explicit Auto-agent Enablement for Memories (#12886) 2026-05-01 23:56:08 +09:00
Nick Haffie-Emslie
3758380c61
🔌 fix: Follow 307/308 redirects in MCP streamable HTTP transport (#12850)
* 🔌 fix: Follow 307/308 redirects in MCP streamable HTTP transport

Some MCP servers (e.g. Coda) return 308 Permanent Redirect to route
doc-scoped tool calls to a different endpoint path. The fetch wrapper
used `redirect: 'manual'` for SSRF protection, which silently dropped
these redirects and caused tool calls to fail with empty error bodies.

Follow 307/308 redirects (method-preserving per RFC 7538) up to a
depth of 5. SSRF safety is preserved because the same undici Agent
with its SSRF-safe connect function validates redirect targets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🛡️ fix: Harden MCP 307/308 redirect handling against SSRF and credential leaks

- Validate every redirect target against `resolveHostnameSSRF` so allowlist
  deployments (which disable connect-time SSRF protection) still block hops to
  private/reserved IPs.
- Strip `Authorization`, `Cookie`, `mcp-session-id`, and any user-injected
  headers when a 307/308 crosses an origin boundary, mirroring browser/Fetch
  behavior so a redirecting MCP server can't exfiltrate credentials.
- Cancel the intermediate response body before each next hop so undici can
  reuse pooled sockets rather than holding them until GC.
- Restructure redirect test helpers to be same-origin (matching real-world
  Coda-style routing), drop dead setup code, fix the misleading "5 hops
  successfully" test, and add coverage for SSRF-blocked redirects, cross-
  origin credential stripping, and same-origin credential preservation.

* 🛡️ fix: Also strip `serverConfig.headers` on cross-origin MCP redirects

Previously only runtime `setRequestHeaders` keys were treated as secret on a
307/308 cross-origin hop. API keys baked into `serverConfig.headers` (passed
through `requestInit.headers` at transport construction time) survived
stripping, so a malicious MCP endpoint could exfiltrate them by returning a
cross-origin `Location`. Pass the configured header keys through to
`createFetchFunction` so both runtime and config secrets are stripped.

The cross-origin credential test now also configures `serverConfig.headers`
to lock in this behavior.

* 🧹 chore: Tighten MCP redirect-stripping coverage and helper duplication

- Add `proxy-authorization` to the cross-origin forbidden header set so a
  forward-proxy credential header would also be stripped on a cross-origin
  hop, matching the Fetch-spec list.
- Strengthen the cross-origin credential test with positive assertions that
  benign protocol headers (`accept`, `content-type`) survive the hop, so a
  regression that strips everything indiscriminately would now fail.
- Extract the duplicated MCP request handler / session-teardown logic from
  three test helpers into shared `createMCPRequestHandler` and
  `closeMCPSessions` utilities.

* 🛠️ fix: Handle `Request` inputs in MCP `customFetch` URL derivation

`customFetch` is typed to accept `UndiciRequestInfo` (`string | URL | Request`),
but `Request.prototype.toString()` returns `"[object Request]"`. The previous
implementation derived `originalOrigin` and the redirect base via
`url.toString()`, so a `Request` input would throw inside `new URL(...)` before
any network call — a regression even when no redirect was involved.

Add a `getRequestUrlString` helper that extracts the URL string for all three
shapes, track the URL string alongside the fetch input through the redirect
loop, and add parameterized tests that exercise `customFetch` with each shape.

* 🛠️ fix: Don't override `Request` input headers in MCP `buildFetchInit`

Previously `buildFetchInit` always set `headers` on the returned init —
even when neither `init.headers` nor runtime headers contributed anything.
Passing `headers: {}` to `undiciFetch` overrides the headers carried on a
`Request` input (auth tokens, MCP session, protocol negotiation), so
Request-based wrappers could fail authentication even without a redirect
in play. Skip the `headers` override entirely when there is nothing to
merge.

Adds a regression test that supplies `Authorization` and a custom header
on the `Request` itself and asserts both reach the target server.

* 🛠️ fix: Preserve `Request` method/body across MCP redirects + guard cross-origin strip

Two regressions surfaced by extending `customFetch` to accept `Request` inputs:

1. **307/308 method/body loss.** The redirect loop switches `url` to the
   new `Location` string, but the original `Request`'s method and body
   stayed bound to the (now-discarded) `Request` object. A redirected
   POST silently became a GET with no payload — the exact behavior the
   method-preserving codes are designed to prevent. Added a
   `resolveFetchInput` helper that runs once at the top of `customFetch`,
   extracts a `Request`'s method/body/headers into the shared init, and
   buffers the body via `arrayBuffer()` so 307/308 retries can replay it.

2. **Cross-origin strip crashed on absent headers.** After the previous
   fix that stopped `buildFetchInit` from setting `headers: {}`,
   `currentInit.headers` could legitimately be `undefined`. The
   cross-origin branch read it as a `Record` and called `Object.entries`
   on `undefined`, throwing `TypeError`. Guard the branch on
   `currentInit.headers != null` — when there are no headers there is
   nothing to strip.

Adds two regression tests: a POST-with-body `Request` that 308-redirects
cross-origin (asserts both method and body survive) and a no-headers
cross-origin redirect (asserts the strip path no longer crashes).

* 🛠️ fix: Forward `Request.signal` through MCP `customFetch` normalization

`resolveFetchInput` was copying method/body/headers off a `Request` input
but dropping `Request.signal` on the floor, so a caller that wired an
`AbortController` onto the `Request` for cancellation/timeouts lost that
wiring as soon as we re-shaped the input into the `(string, init)` pair
used by the redirect loop. Subsequent aborts no longer reached the
in-flight fetch — a regression from the pre-PR code, which forwarded
the original `Request` directly to undici.

Forward the signal alongside method/body/headers, with explicit
`init.signal` still winning per Fetch-spec semantics. Regression test
aborts a controller before calling \`customFetch\` with the wired
`Request` and asserts the call rejects.

* 🧪 test: Pin URL.origin contract for protocol-downgrade redirect handling

Audit follow-up. The cross-origin strip path keys off
`targetUrl.origin !== originalOrigin`, and `URL.origin` is defined as
`scheme + "://" + host + ":" + port`, so a same-host `https → http`
redirect produces a different origin and trips the strip path through
the existing logic — no separate code path needed.

Pin that contract with a small unit test so a future change to URL
semantics (or a refactor that swaps in a different comparison) doesn't
silently regress protocol-downgrade stripping. Standing up a TLS
fixture (self-signed cert, undici skip-verify, etc.) just to re-prove
the URL spec is wasted complexity.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-29 22:12:29 +09:00
Danny Avila
756530c2b8
🩹 fix: Polish code-execution attachment UX (#12870)
* 🧹 chore: Strip code-execution boilerplate from tool output

The bash executor in `@librechat/agents` appends two kinds of noise to
every successful run:

1. Trailing `Note:` paragraphs — long behavioral hints repeating
   rules already in the system prompt ("Files from previous executions
   are automatically available...", "Files in 'Available files' are
   inputs..."). Re-stating these on every tool call adds ~50 tokens of
   waste per call, which compounds across long agent traces.

2. Per-file `| <annotation>` suffixes on every line of `Generated
   files:` / `Available files (...):`. The two section headers already
   convey the new-vs-known distinction; the per-file annotations are
   redundant *and* phrased inconsistently ("downloaded by the user"
   vs. "displayed to the user" vs. "known to the user").

Strip both in a small `cleanCodeToolOutput` helper invoked from
`packages/api/src/agents/handlers.ts` for every tool listed in
`CODE_EXECUTION_TOOLS`. Non-code-execution tools pass through
unchanged. The cleaning happens *after* tool resolution but *before*
downstream consumers (model context, SSE forwarding, persistence) see
the content, so subsequent model turns get the lean output.

* 🩹 fix: Polish code-execution attachment rendering

Three rough edges visible in code-interpreter conversations:

1. **Sandbox-internal `.dirkeep` placeholders leak as file chips.** The
   bash executor creates `.dirkeep` inside any new directory so the
   stateless container preserves the folder across executions. After
   `sanitizeArtifactPath`'s `_` prefix and 6-hex collision suffix it
   surfaces as `_.dirkeep-<hash>` — a 0-byte chip with no value to the
   user, sometimes hiding the real artifact behind it. New
   `isInternalSandboxArtifact` helper filters them out of every
   routing path (`Attachment`, `AttachmentGroup`, `LogContent`).

2. **The `-<hash>` collision suffix is visible in chip labels.** The
   suffix is collision-avoidance machinery; users only need to see the
   canonical name. New `displayFilename` strips it for display while
   leaving the on-disk `attachment.filename` untouched so downloads
   resolve. Applied across `FileContainer`, `ToolArtifactCard`,
   `ToolMermaidArtifact`, and `LogContent`'s text-attachment label
   path.

3. **0-byte / placeholder files outrank real artifacts in render
   order.** Bucket sort by salience (non-empty before empty) sinks
   stragglers to the bottom. Stable sort preserves arrival order for
   peers.

Added regression tests cover the new helpers, the dirkeep filter
across buckets, and the within-bucket salience ordering.

* 🩹 fix: Don't auto-open artifact panel on history navigation

Navigating to a previous conversation full of code-execution artifacts
would auto-open the side panel and focus the most-recent artifact —
the same code path that fires for fresh streaming artifacts. Users
expect that "auto-open" behavior only when an artifact arrives via
SSE, not when they revisit an old chat.

Two-part gate:

1. `ToolArtifactCard`'s focus effect captures `isSubmitting` at first
   render via a ref. A card mounted *during* a stream means a new
   artifact arrived → steal panel focus (legacy behavior). A card
   mounted while `isSubmitting === false` is part of conversation
   history → leave focus alone.

2. `Presentation`'s panel-render condition gains `currentArtifactId
   != null`. With (1) keeping `currentArtifactId` null on history
   load, the panel stops rendering at all on navigation — even if
   `artifactsVisibility` was left `true` by a prior conversation.
   User clicks on a chip to re-open (the click handler is unchanged
   and unconditional).

Test seeds `isSubmittingFamily(0)` per case: existing tests opt into
streaming (default `true`) so legacy auto-focus assertions still hold;
new tests for history-load opt into `streaming: false` and verify
no auto-focus + click-to-open still works.

* 🩹 fix: Force panel visible on streaming artifact arrival

The previous commit gated `setCurrentArtifactId` on `isSubmitting` but
left `artifactsVisibility` untouched. When a user had explicitly
closed the panel earlier in the session, a fresh SSE artifact would
set `currentArtifactId` (so the chip read "click to close") but
`Presentation`'s render condition still required `visibility === true`
— net effect: the card claimed to be open, the panel stayed hidden.

Streaming arrivals now also call `setVisible(true)`, which is the
explicit "auto-open when first created" behavior the user asked for.
History mounts (`isSubmitting === false`) still leave both focus and
visibility alone, so navigating to an old conversation does not
re-open the panel.

Two regression tests added: one asserts streaming flips visibility on
even when seeded false, the other asserts history mounts leave a
seeded-false visibility alone.

* 🧹 chore: Tighten code-execution attachment polish per audit feedback

Resolves the eight actionable findings from the comprehensive audit:

- Scope `displayFilename` out of `FileContainer`: opt-in via a new
  `displayName` prop. User-uploaded chips (input area, persisted
  message files) keep their raw filename, eliminating the false-positive
  class where `report-abc123.pdf` was silently rewritten to `report.pdf`.
  Code-execution artifact paths in `Attachment.tsx` explicitly compute
  the de-suffixed name and pass it through.
- Tighten `TRAILING_NOTES_PATTERN` to anchor on the two known boilerplate
  openings (`Files from previous executions`, `Files in "Available files"`),
  so a user-authored `Note:` line preceded by a blank line in stdout no
  longer gets eaten along with everything after it.
- `ToolMermaidArtifact`: compute `visibleFilename` once and reuse for
  title, content, and the download `aria-label` (was using the raw
  `attachment.filename` for the aria-label, creating a screen-reader
  inconsistency).
- `ToolArtifactCard`: read `isSubmittingFamily(0)` once via a
  non-subscribing `useRecoilCallback`, instead of subscribing for the
  full lifetime to a value the ref only ever needs at first render.
- Extract `bySalience` and `byEntrySalience` comparators from
  `attachmentTypes.ts`, replacing the ten duplicated sort lambdas in
  `Attachment.tsx` and `LogContent.tsx`.
- Treat `attachmentSalience({ bytes: undefined })` as neutral (`0`)
  rather than empty (`1`); only an explicit `bytes === 0` sinks. Stops
  non-code-exec sources (web-search inline results, files where the
  schema omits the byte count) from silently sinking past real content.
- Pin the click-history test to the panel-open button by name instead
  of relying on `getByRole('button', { pressed: false })`, which
  matched by DOM order.
- Add the missing blank line between adjacent `it(...)` blocks.
- Drop the verbose narrating comments in `FileContainer` along with the
  removed `displayFilename` import.

Adds three regression tests for the new behavior (FileContainer raw
filename, artifact-context displayName flow, user-authored `Note:` line
preserved through cleanup) and updates the salience test for the new
neutral-undefined semantics.

* 🧹 chore: Drop redundant `@testing-library/jest-dom` import in FileContainer spec

`client/test/setupTests.js` already imports the matchers globally for every
Jest test in the client workspace, so the explicit import here was dead code.
Removing it brings the spec in line with the broader convention used by
`ArtifactRouting.test.tsx`, `LogContent.test.tsx`, and `attachmentTypes.test.ts`.

* 🛡️ fix: Narrow `.dirkeep`/`.gitkeep` filter to the sandbox-specific form

`isInternalSandboxArtifact` was filtering bare `.dirkeep` / `.gitkeep`
along with the post-sanitization form. Bare versions never originate
from the bash executor (the dotfile rewrite + disambiguator step in
`sanitizeArtifactPath` always produces `_.dirkeep-<6 hex>`), so the only
real-world source of a bare `.gitkeep` is project scaffolding the user
uploaded — silently hiding it from every attachment bucket meant the
file disappeared with no way to surface or download it.

Tightening to `^_\.(?:dirkeep|gitkeep)-[0-9a-f]{6}$` keeps the
sandbox-placeholder filter intact while letting user-uploaded markers
render normally. Tests inverted accordingly: bare forms now expected to
render; only the post-sanitization form is filtered.

* 🩹 fix: Address comprehensive-review findings on attachment helpers

Five findings from the latest pass:

- **MAJOR — `displayFilename` false-positive on extensionless 6-hex.**
  The previous regex `/-[0-9a-f]{6}(?=\.[^.]+$|$)/` stripped any leaf
  ending in `-XXXXXX` regardless of context, so a user-named
  `build-a1b2c3` (script-emitted hash artifact, no extension) lost its
  tail and rendered as `build`. Split into two narrower patterns:
  `COLLISION_SUFFIX_BEFORE_EXT` only matches when followed by an
  extension; `SANITIZED_DOTFILE_TRAILING_SUFFIX` only fires when the
  leaf starts with `_.` AND ends with `-XXXXXX` — the unambiguous
  fingerprint of `sanitizeArtifactPath`'s dotfile rewrite.

- **MINOR — `isInternalSandboxArtifact` filter too aggressive.**
  `(file.bytes ?? 0) > 0` treated undefined bytes as zero, falling
  through to the regex check. Tightened to `file.bytes !== 0`: only
  an *explicit* zero counts as the empty-placeholder shape worth
  hiding. Non-code-exec sources without `bytes` populated render
  normally now.

- **MINOR — `getValue()` could throw on a degenerate atom state.**
  Switched the snapshot read in `ToolArtifactCard` to
  `valueMaybe() ?? false` so a transient error / loading state on the
  upstream selector doesn't crash card mount. The `false` default is
  the right history-fallback (don't auto-open if we can't classify).

- **NIT — `attachmentSalience` / `bySalience` over-broad signature.**
  Removed the test-only `{ bytes?: number }` arm; functions now accept
  `TAttachment` directly. The internal `bytes` read still goes through
  a cast since not every TAttachment branch declares it. Tests updated
  to use the existing `baseAttachment(...)` helper.

- **MINOR — Missing regression test for extensionless 6-hex.**
  Added `'build-a1b2c3'` and `'out/blob-deadbe'` cases that pin the
  preservation behavior, plus an `isInternalSandboxArtifact` test that
  asserts undefined-bytes attachments are not filtered.

* 🩹 fix: Make code-file artifacts click-to-open only

Removes mount-time auto-open from `ToolArtifactCard`. Streaming
arrivals no longer hijack the panel — even a freshly-emitted SSE
artifact registers silently in `artifactsState` and waits for the
user to click. Combined with `Presentation`'s
`currentArtifactId != null` render gate, the panel stays closed
across history navigation, page reload, and SSE arrival.

Click is the only path that opens the panel. `handleOpen` is
unchanged: first click focuses + reveals, second click on the same
chip closes.

Dropped:
- `useRecoilCallback` snapshot read of `isSubmittingFamily(0)`
- `mountedDuringStreamRef` ref + lazy-init block
- The whole focus + visibility effect (was effect 3)
- `useRef` import (now unused)

Tests:
- `ArtifactRouting.test.tsx` rewritten to exercise the click path:
  registers-on-mount-without-focus, click-to-open-then-close, multi-
  card-no-auto-focus, click-when-visibility-was-false. The streaming
  state is no longer seeded; both `renderWith` and `renderWithProbe`
  collapsed back to plain `RecoilRoot`.
- `LogContent.test.tsx` flips its panel-routing assertions from
  `pressed: true` (which asserted auto-focus) to `pressed: false`
  with a chip-title check (which asserts the panel card rendered
  but stayed unfocused).

* Revert "🩹 fix: Make code-file artifacts click-to-open only"

This reverts commit 6761531287.

* 🩹 fix: Exclude CODE bucket from streaming auto-open

Narrows the previous-commit revert: rich-preview artifacts (HTML,
React, Markdown, plain text) keep the legacy SSE auto-open UX, but
the CODE bucket (`.py`, `.js`, `.cpp`, `Dockerfile`, `Makefile`, …)
stays click-to-open even on streaming.

Source-code artifacts are typically supporting helpers the agent
emits alongside a richer deliverable (a Python script that builds
the actual `.html` output, for example). Auto-opening every
helper's panel each time it gets written would shove the panel
in front of the user every tool call. The user explicitly opens
a code chip when they want to inspect it.

Implementation:
- Focus+open effect skips early when `artifact.type === CODE`.
- `artifact.type` added to the dep array so the gate re-evaluates
  if the type ever changes (it shouldn't, but the dep is honest).
- JSDoc updated to call out the carve-out.

Tests:
- New `does NOT auto-open a streaming CODE artifact (test.py is
  click-to-open)` — seeds isSubmitting=true, mounts a `.py`,
  asserts the artifact registers but currentArtifactId stays null.
- New `clicking a CODE artifact focuses it even though it skipped
  auto-open` — confirms the click path still surfaces a `.py`.
- All 25 prior auto-open tests for HTML/React/Markdown/plain-text
  buckets still pass unchanged: those types continue to auto-open
  on streaming.

* 🧹 chore: Address two NITs from the audit-fix follow-up review

- **NIT #1 (conf 60)**: Add a test for the dotfile-with-extension
  intersection (`_.config-abcdef.txt` → `.config.txt`). Both halves
  of the path were tested separately — extension-anchored suffix
  stripping and `_.` underscore restoration — but the combination
  wasn't pinned. Adds `expect(displayFilename('_.config-abcdef.txt'))
  .toBe('.config.txt')`.

- **NIT #2 (conf 25)**: Tighten the cast in `attachmentSalience` from
  the anonymous `{ bytes?: number }` shape to the concrete
  `TFile & TAttachmentMetadata` (the actual TAttachment branch that
  declares `bytes`). Same runtime behavior; a future retype of
  `TFile.bytes` will now surface here at compile time instead of
  being silently papered over.

* 🩹 fix: Stop stripping `-<6 hex>` suffixes from non-dotfile filenames

Codex's repeated P2 was correct: the `COLLISION_SUFFIX_BEFORE_EXT`
regex stripped any `-<6 hex>` immediately before an extension
regardless of context. That collapsed legitimate user-named files
like `report-deadbe.csv` and `report-beef01.csv` onto the same chip
label `report.csv`, silently merging distinct files in the UI.

The structural truth: only the dotfile shape (`_.foo-XXXXXX`) carries
an unambiguous discriminator (the leading `_.` that
`sanitizeArtifactPath` adds when rewriting a leading dot). The
extension-only case (`name-<hash>.ext`) has no such discriminator —
we can't distinguish a sanitized `report 1.csv` (which became
`report_1-<hash>.csv`) from a user-named `report-deadbe.csv` from
the filename alone.

Recovering the non-dotfile case cleanly would require a backend
`wasSanitized` metadata flag we don't have. Without it, the safer
choice is to leave non-dotfile names alone — uglier when the file
*was* sanitized, but never collapses distinct files onto a shared
label.

Changes:
- Drop `COLLISION_SUFFIX_BEFORE_EXT`. Replace
  `SANITIZED_DOTFILE_TRAILING_SUFFIX` with a unified
  `SANITIZED_DOTFILE_PATTERN` that handles both extensionless and
  with-extension dotfile shapes in one regex.
- Simplify `displayFilename` to a single match + reconstruct path.
- Update tests: drop the broad-stripping assertion
  (`output-deadbe.csv` → `output.csv`), add explicit codex-regression
  cases (`report-deadbe.csv` and `report-beef01.csv` preserve
  unchanged), document the deliberate non-recovery for sanitized
  non-dotfiles, update the AttachmentGroup→FileContainer integration
  test to reflect the narrower stripping (non-dotfile `archive-deadbe.zip`
  passes through; new dotfile `_.config-abcdef.zip` → `.config.zip`
  exercises the recoverable path).

* 🩹 fix: Scope code-tool annotation stripping to file-list sections

Codex was right: the previous global `.replace` would mutate any line
ending in one of the three annotation phrases — even legitimate
stdout. A user script doing
`echo "foo | File is already downloaded by the user"` had its output
silently scrubbed before being fed back into model context.

New `FILE_SECTION_PATTERN` captures `Generated files:` /
`Available files (...)` blocks (header + lines starting with `- /`).
Annotation stripping now only runs *within* the captured file-list
section via a nested `.replace`, so:

- Inside the section: per-file `| <ann>` suffixes still get stripped
  (line-per-file ≥ 4 files form, inline `, ` comma-separated ≤ 3
  files form — both already covered by existing patterns).
- Outside the section: stdout, stderr, blank lines, the trailing
  `Note:` paragraphs (handled by their own pattern), and any user
  text that coincidentally contains an annotation phrase pass
  through unchanged.

Tests:
- New `does NOT mutate stdout that legitimately contains an
  annotation phrase outside a file-list section` pins the codex
  regression: three coincidental phrases in stdout, no
  `Generated files:` header, all three preserved verbatim.
- New `strips annotations inside a file-list section but preserves
  identical phrases in stdout above it` covers the mixed case where
  the same phrase appears in both stdout and a file listing —
  stdout survives, listing gets cleaned, exactly one occurrence
  remains.
- All 9 prior tests still pass (file-section stripping behavior
  unchanged for both line-per-file and inline-comma layouts).
2026-04-29 08:53:10 -04:00
Danny Avila
4a5fc701d2
📂 fix: Preserve Nested Skill Paths in Code-Env Uploads (#12877)
* fix(code): preserve code env upload filepaths

* chore: Reorder import statements in crud.js
2026-04-29 08:07:46 -04:00
Danny Avila
915b30c60d
📦 chore: update @librechat/agents to v3.1.74 (#12869) 2026-04-29 10:20:52 +09:00
David Newman
1f37ec842a
🔌 fix: Prevent Repeated Idle Check Triggers for Users With Failed MCP Connections (#12853) 2026-04-29 09:19:35 +09:00
Daniel Lew
2503365c44
🚫 feat: Add Support for none Reranker Type in Web Search Config (#12765)
Most of the codebase already supports the concept of *not* using a reranker w/
web search, except there was no way to initially setup an absent reranker
component.

Now there's a special path for skipping the reranker auth when loading web
search config, which allows for skipping the reranker when using web search.
2026-04-29 09:17:04 +09:00
Danny Avila
89bf2ab7b4
💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12868)
* 💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12855)

Different providers report `usage_metadata.input_tokens` with different
semantics:

  - Anthropic / Bedrock: `input_tokens` EXCLUDES cache; cache reads/writes
    arrive separately and must be added to get the total prompt size.
  - Gemini / OpenAI: `input_tokens` ALREADY INCLUDES cached tokens
    (Google's `promptTokenCount`, OpenAI's `prompt_tokens`). Their
    `input_token_details.cache_*` are subsets of `input_tokens`.

`recordCollectedUsage` treated both schemes as additive, so for cache-hit
requests on Gemini/OpenAI it added cache tokens on top of an
`input_tokens` value that already contained them — overcharging users by
the cache_hit_rate (e.g., ~67% cache hit ≈ 1.67x overcharge). This
matches the issue reporter's GCP billing comparison.

Adds a small `splitUsage` helper that classifies the provider by model
name and computes `inputOnly` (the non-cached portion) plus the
all-inclusive `totalInput` for both the spend math and the returned
`input_tokens` summary. The helper defaults to additive semantics (the
historical behavior) so unknown providers are unaffected.

Updates existing OpenAI-shaped tests that previously asserted the buggy
additive math, and adds Gemini regression tests using the exact numbers
from the issue report (input=11125, cache_read=7441 → input=3684).

Anthropic / Bedrock paths remain bit-identical to before.

* 🔧 refactor: Classify Cache-Token Semantics by Provider, Not Model Name

Follows up the previous commit. Replaces a model-name regex
(`gemini|gpt|o[1-9]|chatgpt`) with an explicit `Providers` enum lookup
keyed off the `usage.provider` field — `UsageMetadata.provider` already
exists in `IJobStore.ts` but was never being populated.

  - `callbacks.js#ModelEndHandler` now attaches `usage.provider` from
    `agentContext.provider` alongside `usage.model`.
  - `usage.ts` uses a `SUBSET_PROVIDERS` set (`openAI`, `azureOpenAI`,
    `google`, `vertexai`, `xai`, `deepseek`, `openrouter`, `moonshot`)
    backed by the canonical `Providers` enum from
    `librechat-data-provider`.
  - `xai`, `deepseek`, `openrouter`, `moonshot` extend `ChatOpenAI` so
    they inherit subset semantics (verified in node_modules).
  - Defaults to additive when `usage.provider` is missing, so the title
    flow (which doesn't propagate provider) and any pre-this-PR usage
    entries keep their existing behavior.

Tests: switch fixtures from model-name signaling to explicit `provider`
field, plus a Vertex AI case and a "missing provider" fallback case.
2026-04-29 08:36:00 +09:00
Danny Avila
46a86d849f
🛂 fix: Skip Inherited / Mark Skill Files Read-Only in Code-Env Pipeline (#12866)
* 🛂 fix: Skip Re-Download of Inherited Code-Env Files (No More 403 Storms)

When a bash/code-interpreter call lists or operates on inputs the user
already owns (skill files primed via primeInvokedSkills, files inherited
from a prior session), codeapi echoes those files back in the tool
result with `inherited: true`. We were treating every entry as a
generated artifact and calling processCodeOutput on each, which:

1. Hit `/api/files/code/download/<session_id>/<file_id>` with the
   user's session key. Skill files are uploaded under the skill's
   entity_id, so every download 403'd — producing dozens of
   "Unauthorized download" log lines per turn.

2. Surfaced those inputs as ghost file chips in the UI even though
   they were never generated by the run.

3. Wasted a download round-trip even when no auth boundary was
   crossed — the file is already persisted at its origin.

Fix: skip files where `file.inherited === true` in all three
artifact-files loops (`tools.js`, `createToolEndCallback`, and
`createResponsesToolEndCallback`). Skill files remain available to
subsequent calls via primeInvokedSkills / session inheritance — we
just don't redundantly re-download them.

Pairs with codeapi-side change that adds the `inherited` flag.

* 🔒 feat: Mark Skill Files as `read_only` During Code-Env Priming

Pairs with codeapi `read_only` upload flag (ClickHouse/ai#1345). When
LibreChat primes a skill into the code-env, every file in the batch
(SKILL.md plus all bundled scripts/schemas/docs) is now uploaded with
`read_only: true`. Codeapi seals these inputs at the filesystem layer
(chmod 444) and the walker echoes the original refs as `inherited:
true` regardless of whether sandboxed code modified the bytes on disk.

Without this, the previous PR's `inherited` skip handled only the
unchanged case. A modified skill file (pip writing pyc near a .py, a
script accidentally truncating LICENSE.txt, etc.) still flowed through
the modified-input branch on codeapi, got a fresh user-owned file_id,
uploaded as a "generated" artifact, and surfaced in the UI as a chip
the user couldn't actually authorize a download for.

Changes:

- `api/server/services/Files/Code/crud.js`:
  `batchUploadCodeEnvFiles({ ..., read_only })` forwards the flag as
  a multipart form field. Default `false` preserves existing behavior
  for user-attached files and prior-session inheritance.

- `packages/api/src/agents/skillFiles.ts`: type signature gains
  `read_only?: boolean`; `primeSkillFiles` passes `true`.

- `packages/api/src/agents/skillFiles.spec.ts`: assert the upload call
  carries `read_only: true`.

The flag is intentionally not skill-specific. Any future
infrastructure-input flow (system fixtures, cached datasets, etc.) can
opt in the same way.
2026-04-29 08:26:25 +09:00
Danny Avila
c9dee962e7
📂 fix: Preserve Nested Folder Paths for Code-Execution Artifacts (#12848)
* 📂 fix: Preserve Nested Folder Paths for Code-Execution Artifacts

When codeapi reports a generated file at a nested path (`a/b/file.txt`),
`processCodeOutput` was running it through `sanitizeFilename` — which
calls `path.basename()` and then collapses `/` to `_`. The DB row ended
up with `filename: "file.txt"`, `primeFiles` shipped that flat name back
to the next sandbox session, and `cat /mnt/data/a/b/file.txt` 404'd.

Fix: split the sanitizer into two helpers in `packages/api/src/utils/files.ts`:

  - `sanitizeArtifactPath` — segment-wise sanitize while preserving
    `/`. Falls back to basename on `..` traversal, absolute paths, and
    other malformed inputs. The DB record uses this so the next prime()
    can recreate the nested path in the sandbox.

  - `flattenArtifactPath` — encode `/` as `__` for the local
    `saveBuffer` strategies, which key by single-component filename and
    would otherwise create unintended subdirectories under uploads/.

`process.js` is updated to use both: DB filename keeps the path, storage
key flattens. `claimCodeFile` is also keyed on `safeName` so the
(filename, conversationId) compound key stays consistent with the
record `createFile` writes.

Tests:
  +13 unit tests in `files.spec.ts` (sanitizeArtifactPath table,
  flattenArtifactPath round-trip).
  +1 integration test in `process.spec.js` asserting the DB-row vs
  storage-key split for a nested path.
  Updated `process-traversal.spec.js` to mock the new helpers.

64 pass / 0 fail across `Files/Code/`; 36 pass / 0 fail in
`packages/api/src/utils/files.spec.ts`.

Companion: ClickHouse/ai#1327 — the codeapi-side counterpart that stops
phantom file IDs from reaching this code path in the first place. These
two are independent but the matplotlib bug is most cleanly resolved when
both ship.

* 🛡️ fix: Re-add 255-char per-segment cap in sanitizeArtifactPath (codex review P2)

`sanitizeArtifactPath` dropped the 255-char basename cap that
`sanitizeFilename` enforces. Long artifact names then flowed unbounded
into `processCodeOutput`'s storage key (`${file_id}__${flatName}`) and
tripped `ENAMETOOLONG` on filesystems that enforce `NAME_MAX` —
saveBuffer fails, and the file falls back to a download URL instead of
persisting / priming. This was a regression specifically for flat
filenames that the original `sanitizeFilename` would have truncated
safely.

Re-add the cap as a per-path-component limit so it applies cleanly to
both flat and nested paths:

  - Leaf segment: extension-preserving truncation, matching
    `sanitizeFilename`'s shape (`<truncated-stem>-<6 hex>.<ext>`).
  - Non-leaf (directory) segments: plain truncate-and-disambiguate
    (`<truncated-name>-<6 hex>`); directory names don't carry semantic
    extensions worth preserving.
  - Defensive fallback when `path.extname` returns a pathologically long
    "extension" (e.g. `_.aaaa…aaa` after the dotfile underscore prefix
    rewrite turns a long hidden file into a non-dotfile with a 300-char
    "extension"): collapse to whole-segment truncation rather than
    leaving the cap unmet.

+6 unit tests covering: long leaf (regression case), long leaf under a
preserved directory, long non-leaf segment, deeply nested mixed-length,
exact-255 boundary (no truncation), and the dotfile + truncation
interaction.

* 🛡️ fix: Cap flattened storage key against NAME_MAX in processCodeOutput (codex review P1)

Per-segment caps on the path-preserving form aren't enough. Once segments
are joined with `__` for the storage key, deeply-nested or moderately
long paths can still produce a flat form that overflows once
`${file_id}__` is prepended — `${file_id}__a__b__c.csv` for a 3-level
100-char-each path is ~344 chars, well past filesystem NAME_MAX (255).
saveBuffer then trips ENAMETOOLONG and falls back to a download URL,
and the artifact never persists / primes.

`flattenArtifactPath` gets an optional `maxLength` parameter. When set,
the function truncates the flat form to fit, preserving the leaf
extension with the same disambiguating-hex-suffix shape sanitizeFilename
uses. Default (`undefined`) keeps existing call sites uncapped — the cap
is opt-in for callers that are actually building a filesystem key.
Pathologically long "extensions" from `path.extname` (e.g. `.aaaa…aaa`)
fall back to whole-key truncation rather than leaving the cap unmet.

processCodeOutput composes the storage key after `file_id` is known and
passes `255 - file_id.length - 2` as the budget so the full
`${file_id}__${flatName}` string fits in one filesystem path component.

+7 unit tests in files.spec.ts:
  - Pass-through when no maxLength supplied (cap is opt-in).
  - Pass-through when flat form fits within maxLength.
  - Truncation with leaf extension preserved (the regression case).
  - Leaf-only overflow with extension preservation.
  - Pathological long-extension fallback (whole-key truncation).
  - No-extension stem truncation.
  - Boundary equality (off-by-one guard).

+1 integration test in process.spec.js: processCodeOutput passes the
file_id-aware budget (`255 - file_id.length - 2`) to flattenArtifactPath.

114/114 across files.spec.ts + Files/Code (49 + 65).

* 🛡️ fix: Determinize + clamp artifact-path truncation (codex review P2 ×2)

Two follow-ups to Codex review on the path/flat-key cap:

1. **Deterministic truncation suffixes**. The previous helpers used
   `crypto.randomBytes(3)` for the disambiguator, mirroring
   `sanitizeFilename`'s shape. That made the truncated form non-
   deterministic: a re-upload of the same long filename would compute a
   *different* storage key, orphaning the previous on-disk file under
   the reused `file_id` returned by `claimCodeFile`.

   New `deterministicHexSuffix(input)` helper hashes the input with
   SHA-256 and takes the first 6 hex chars. Same input → same suffix
   (storage key stable across re-uploads); different inputs sharing a
   truncation prefix still get different suffixes (collision avoidance).
   24 bits ≈ 16M values is collision-safe for our scale (single-digit
   artifacts per turn per (filename, conversationId) bucket).

   Applied to `truncateLeafSegment`, `truncateDirSegment`, and
   `flattenArtifactPath` — every truncation site in the new helpers.
   `sanitizeFilename` (pre-existing) is intentionally left alone; its
   tests rely on the random-bytes mock and it's outside this PR's scope.

2. **Final clamp on flattenArtifactPath result**. The old `Math.max(1,
   maxLength - ext.length - 7)` floor could let the result slip past
   `maxLength` when the extension was nearly as large as the budget
   (e.g. `maxLength=5`, `ext=".txt"`: budget computed as 0, but result
   was `-<6 hex>.txt` = 11 chars). Drop the `Math.max(1, …)` floor and
   add a final `truncated.slice(0, maxLength)` so the contract holds
   for any input. Also short-circuit `maxLength <= 0` to `''` for
   pathological budgets.

Tests updated to compute the expected hash inline (the existing
`randomBytes` mock doesn't apply to the new code path), plus 4 new
regression tests:
  - sanitizeArtifactPath: same input → same output, different inputs →
    different outputs (determinism + collision avoidance).
  - flattenArtifactPath: same input → same output, different inputs
    sharing a truncation prefix → different outputs.
  - flattenArtifactPath: clamp holds when ext.length > maxLength - 7.
  - flattenArtifactPath: returns '' for maxLength <= 0.

53 unit tests pass. 65 integration tests pass.

* 🛡️ fix: Total-path cap + basename for classifier (codex P2 + comprehensive review)

Four follow-ups from the latest reviews on this PR:

1. **Codex P2: total-path cap in sanitizeArtifactPath**. Per-segment
   caps weren't enough — a deeply nested path (3+ at-cap segments) can
   still produce a joined form past Mongo's 1024-byte indexed-key limit
   (4.0 and earlier reject; later versions configurable). Added
   `ARTIFACT_PATH_TOTAL_MAX = 512` and a leaf-only fallback when the
   joined form exceeds it. Same shape as the absolute-path /
   `..`-traversal fallbacks above; the leaf is already segment-capped to
   ≤255, so the final result stays within bounds.

2. **Codex P2: pass basename to classifier/extractor in process.js**.
   With the path-preserving sanitizer, `safeName` can now be a nested
   string like `reports.v1/Makefile`. The classifier's `extensionOf`
   reads that as `v1/Makefile` (the slice after the dot in the directory
   name) and the bare-name branch rejects because it sees a `.`
   anywhere. Result: extensionless artifacts under dotted folders
   (Makefile, Dockerfile, etc.) get misclassified as `other` and skip
   text extraction. Pass `path.basename(safeName)` to both
   `classifyCodeArtifact` and `extractCodeArtifactText` so
   classification matches what the old flat-name flow produced.

3. **Review nit: drop dead `sanitizeFilename` mock in process.spec.js**.
   process.js no longer imports `sanitizeFilename`; the mock was
   misleading dead code.

4. **Review nit: rename misleading `'embedded parent traversal'` test**.
   `path.posix.normalize('a/../escape.txt')` resolves to `escape.txt`
   which goes through the normal segment-split path, not the
   `sanitizeFilename` fallback. Test name now says "resolves embedded
   parent traversal via path normalization" to match the actual code
   path.

+3 regression tests:
  - sanitizeArtifactPath falls back to leaf-only when joined > 512.
  - sanitizeArtifactPath keeps nested path within the 512 budget.
  - process.spec: passes basename (`Makefile` from `reports.v1/Makefile`)
    to classifyCodeArtifact + extractCodeArtifactText.

Existing "caps every segment in a deeply-nested path" test now uses 2
segments (not 3) so the joined form stays under the new total cap; the
3-segment scenario is covered by the new fallback test instead.

55 unit + 66 integration = 121/121 pass.

* 📝 docs: Correct sanitizeArtifactPath JSDoc to match actual schema index

Two doc-only fixes from the latest comprehensive review (both NIT):

1. **Index field list was wrong**. JSDoc claimed the compound unique
   index was `{ file_id, filename, conversationId, context }`. The
   actual index in `packages/data-schemas/src/schema/file.ts:92-95` is
   `{ filename, conversationId, context, tenantId }` with a partial
   filter for `context: FileContext.execute_code`. The cap rationale
   (Mongo 4.0 indexed-key limit) is correct and unchanged; just the
   field list was wrong. Added the schema file path so future readers
   can find the source of truth.

2. **Trade-off acknowledgement**. The reviewer noted that the
   leaf-only fallback loses directory structure, which means the
   model's `cat /mnt/data/<deep>/<path>/file.txt` would 404 on the
   pathological-depth case — partially re-introducing the original
   flat-name bug for >512-char paths. This is intentional (DB write
   failure is strictly worse than losing structure), but the trade-off
   wasn't called out explicitly in the JSDoc. Added a paragraph
   acknowledging it and noting that the cap is monotonically better
   than the pre-PR behavior, where ALL artifacts were treated this way
   regardless of depth.

No code or test changes — pure JSDoc correction. Tests still 55/0.

* 🛡️ fix: Disambiguate sanitized artifact names to keep claimCodeFile keys unique (codex P2)

`sanitizeArtifactPath` is not injective — multiple raw inputs can collapse
onto the same regex-and-normalize output. Codex's example:
`reports 2026/out.csv` and `reports_2026/out.csv` both sanitize to
`reports_2026/out.csv`. `claimCodeFile` is keyed on the schema's compound
unique `(filename, conversationId, context, tenantId)` index, so the
later upload silently matches the earlier record and overwrites the first
artifact's bytes via the reused `file_id` — a single conversation can
drop files when both names are valid in the sandbox.

This collision space isn't strictly new — pre-PR `sanitizeFilename`
(basename-only) had the same property — but the path-preserving form
gives us enough information to fix it for the first time.

**Fix.** When character-level sanitization changed something (regex
replacement, path normalization, dotfile prefix, empty-segment collapse),
embed a deterministic SHA-256 prefix of the **raw input** in the leaf
segment via the new `embedDisambiguatorInLeaf` helper. Same raw input →
same safe form (idempotent for re-uploads); different raw inputs that
would have collided → different safe forms.

**Why "character-level"** specifically:
- The disambiguator fires when `preCapJoined !== inputName` (post-regex
  + dotfile + empty-segment, BUT pre-truncation).
- Truncation alone is already disambiguated by `truncateLeafSegment`'s
  own seg-hash; firing the input-hash branch on truncation would just
  stack a second hash for no collision-avoidance benefit and clutter
  human-readable filenames.

**Three known collision shapes covered:**
1. `out 1.csv` vs `out_1.csv` (and `out@1.csv` vs `out#1.csv`, etc.)
2. `dir//file.txt` vs `dir/file.txt` (empty-segment collapse)
3. `.x` vs `_.x` (dotfile-prefix step)

**Disambiguator + truncation interaction:** for very long mutated leaves,
`truncateLeafSegment` caps at 255 first, then `embedDisambiguatorInLeaf`
re-trims to insert the input hash. The seg-hash from the first pass is
replaced by the input-hash from the second pass — that's intentional
(input-hash is the load-bearing collision-avoidance suffix; seg-hash was
only ever decorative once the input-hash exists). Final clamp ensures
the result never exceeds `ARTIFACT_PATH_SEGMENT_MAX` regardless of input.

**Disambiguator + total-cap fallback:** when joined > 512, we fall back
to the leaf-only form. The leaf has already had the disambiguator
embedded, so collision avoidance survives the pathological-depth case.

**`embedDisambiguatorInLeaf`** uses `dot <= 1` to detect "no real
extension" (covers extensionless names AND dotfile-prefixed leaves like
`_.hidden` — without this, `_.hidden` would split as stem `_` + ext
`.hidden` and produce the awkward `_-<hash>.hidden`).

**Updated 5 existing tests** that asserted the old collision-prone
outputs — they now verify the disambiguator-included form. The
character-level-only firing rule was load-bearing here: tests for
"clean inputs (no mutation)" and "long inputs (truncation only)" still
pass without any disambiguator clutter.

**+7 regression tests** in a new `collision avoidance (Codex review P2)`
describe block:
1. Different raw inputs sanitizing to the same form get distinct safes
2. Whitespace-vs-underscore in directory segment
3. Dotfile-prefix collision
4. Idempotency: same raw → same safe across calls
5. Clean inputs skip the disambiguator (cosmetic guarantee)
6. Disambiguator survives leaf truncation (long mutated leaf)
7. Disambiguator survives total-cap fallback (pathological depth)

62 unit + 66 integration = 128/128 pass.
2026-04-28 12:52:04 +09:00
Danny Avila
7070eb76aa
🔧 fix: Replace Literal NUL Bytes in handlers.spec Test Fixture + Normalize CRLF (#12852)
Two test-file hygiene fixes:

1. **Literal NUL bytes**. The `'rejects binary content (NUL bytes)
   post-fetch'` test embedded raw `\x00` bytes directly in the source
   string (`const binaryWithNul = '<3 NULs>\rIHDR<2 NULs>\x04'`).
   Embedding NULs in source files breaks editors, linters, ts-loader,
   and most git tooling — `grep` even classifies the file as binary
   ("Binary file matches"). Replace with `\x00` escape sequences in the
   string literal so the source is plain ASCII while the runtime string
   value is unchanged.

2. **CRLF line endings**. My earlier commits to this file picked up
   Windows-style `\r\n` from git's `core.autocrlf=true` checkout
   conversion, then staged them as `\r\n`. The diff against `dev`
   showed the entire file as changed even though only a few lines were
   touched semantically. Normalize the whole file back to LF so future
   diffs read clean.

The diff for this commit is large (~1248 lines marked changed) but
every change is one of: CRLF → LF, or the single `binaryWithNul`
escape-sequence rewrite. No semantic test changes.

Tests: 39/39 pass (unchanged behavior).
2026-04-28 11:20:26 +09:00
Danny Avila
2624c18633
🚫 fix: Reject Binary Files in read_file Sandbox Fallback (No More Mojibake) (#12851)
* 🚫 fix: Reject Binary Files in read_file Sandbox Fallback (No More Mojibake)

`read_file("/mnt/data/simple_graph.png")` was shelling `cat` through codeapi
`/exec` and shipping the result back to the model. codeapi's transport is
JSON, so `stdout` containing PNG bytes round-tripped through lossy UTF-8
replacement (every non-ASCII byte became U+FFFD), got line-numbered by
`addLineNumbers`, and arrived in the model's context as a multi-KB blob of
`�PNG\r\n  2 | ...`. The bytes were unrecoverable — and the same
codeapi sandbox logged the base64-style mojibake too — so the goal is
fail-fast, not retrieval.

Two guards in `handleSandboxFileFallback`, both bypassed by the existing
text path:

  1. **Extension precheck** (BEFORE any network call) for known-binary
     types: images, documents (pdf/docx/xlsx/etc), archives, audio, video,
     native libs, fonts, and a few other byte-soup formats. The message
     for image extensions points at the existing chat attachment ("the
     image is already shown to the user, use bash_tool for programmatic
     processing"); other binaries get the generic "use bash" hint.

  2. **NUL-byte sniff** (AFTER the read) on the first 8KB for unknown
     extensions or no-extension paths. The codeapi `/exec` JSON encoding
     mangles most non-UTF-8 bytes but a NUL terminator from a magic
     header survives the round-trip, so this catches novel binary
     formats without an extension precheck.

`lowercaseExtension` uses the basename to avoid false-triggering on
directory-name dots (e.g. `proj.v1/notes` has no extension, not `.v1/notes`).

+6 tests:
  - Image rejected by extension, never calls readSandboxFile, message
    points at the existing attachment.
  - Non-image binary (.zip) rejected with a different (bash-only) message.
  - Case-insensitive extension match (.PNG vs .png).
  - NUL-byte sniff catches unknown-extension binary post-fetch.
  - Text files with binary-adjacent extensions (.txt) still readable.
  - Dotted directory names don't false-trigger the extension match.

38/38 handlers.spec.ts pass.

The companion bash "command not found" issue from the same conversation
is a separate LLM mistake (writing raw Python as the bash command without
`python3 -c` / heredoc wrapper). Not coded here — flagged to the user.

* 🖼️ fix: Allow SVG read_file (XML text, no mojibake risk) — codex review P2

`.svg` was bucketed with raster-image extensions in
`BINARY_EXTENSIONS_NEVER_READABLE`, which made `handleSandboxFileFallback`
reject every SVG before calling readSandboxFile. SVG is an XML text
format — there's no mojibake risk for normal content, and the model has
legitimate reasons to inspect or edit a generated SVG (tweaking colors,
paths, viewBox, etc.). Block was a regression for valid read_file usage.

Remove `.svg` from both `BINARY_EXTENSIONS_NEVER_READABLE` (so it routes
through the normal sandbox read path) and `IMAGE_EXTENSIONS_FOR_HINT`
(now-dead entry — only used by the rejection-message picker). The
post-fetch NUL-byte sniff still catches anything that turns out to be
binary despite a `.svg` extension.

+1 regression test that an SVG with valid XML content reads through
successfully (`<svg>...<circle/>...</svg>` → status: 'success', content
contains `<svg`/`viewBox`).
2026-04-27 20:47:38 -04:00
Danny Avila
47f65fe39a
🪟 feat: Render Code-Execution Text Artifacts as Side-Panel Artifacts (#12832)
* 🪟 feat: Render Code-Execution Text Artifacts as Side-Panel Artifacts

Builds on PR #12829 (which populates `text` on code-execution file
attachments). When a tool-output file's extension/MIME maps to a
viewer we already have, route it through the artifact UI instead of
the inline `<pre>`:

- text/html, text/htm        → existing artifacts side panel (sandpack)
- App.jsx / App.tsx          → existing artifacts side panel (sandpack)
- *.md / *.markdown / *.mdx  → existing artifacts side panel (sandpack)
- *.mmd / *.mermaid          → standalone Mermaid component, inline
                                (no sandpack/react template)

The card and the mermaid header both expose a download button so the
underlying file is still reachable. Everything else (csv, py, json,
xls/docx/pptx, …) keeps PR #12829's inline behaviour — dedicated
viewers for csv/docx/xlsx/pptx will land in follow-ups.

Backend: `.mmd` and `.mermaid` added to UTF8_TEXT_EXTENSIONS so
mermaid sources reach the client with `text` populated.

Frontend changes:
- `client/src/utils/artifacts.ts` — `TOOL_ARTIFACT_TYPES` constant,
  `detectArtifactTypeFromFile`, `fileToArtifact` (id is derived from
  `file_id` so the same artifact across renders dedupes cleanly).
- `client/src/components/Chat/Messages/Content/Parts/ToolArtifactCard.tsx`
  — registers the artifact in `artifactsState`, renders an
  `ArtifactButton`-style trigger paired with a download button.
- `client/src/components/Chat/Messages/Content/Parts/ToolMermaidArtifact.tsx`
  — wraps the standalone Mermaid component with a filename + download
  header so the file stays reachable.
- `Attachment.tsx` and `LogContent.tsx` — gain panel-artifact and
  mermaid branches in the routing decision tree, ahead of the existing
  inline-text fallback. Existing branches untouched.

Test coverage: backend extension matrix (mmd/mermaid), frontend
predicates (`isPanelArtifact`, `isMermaidArtifact`,
`artifactTypeForAttachment`), `fileToArtifact`, and an RTL suite that
verifies each type routes to the right component (panel card / mermaid
render / inline pre / file chip).

* 🩹 fix: Address review on code-artifacts-panel routing

- ToolArtifactCard: defer artifact registration to the click handler so
  rendering a card never side-effects into `artifactsState`. With
  `artifactsVisibility` defaulting to `true`, eager mount-time
  registration would surface tool artifacts in the side panel without
  user intent — now matches ArtifactButton's pattern. Drop the
  redundant `artifacts` subscription (write-only via useSetRecoilState).
- LogContent.tsx: precompute `Artifact`s inside the existing useMemo
  bucket-sort so each render isn't producing fresh objects. Without
  this, missing updatedAt/createdAt fields would make `toLastUpdate`
  return `Date.now()` and churn Recoil state on every parent render.
- Attachment.tsx + LogContent.tsx: classify each attachment once via
  `artifactTypeForAttachment` and branch on the result, instead of
  calling `isMermaidArtifact` and `isPanelArtifact` back-to-back
  (each of which internally re-classified). AGENTS.md single-pass rule.
- artifacts.ts `detectArtifactTypeFromFile`: strip `;` parameters
  before the MIME comparison (so `text/html; charset=utf-8` is
  recognized) and add fallbacks for `application/vnd.react`,
  `application/vnd.ant.react`, and `application/vnd.mermaid`.
- ToolMermaidArtifact: drop the `id` prop entirely when `file_id` is
  undefined so we never pass an undefined DOM id through to mermaid.
- AttachmentGroup: keys derived from `file_id` (not bare index) so
  add/remove churn doesn't remount stable cards.
- Wrappers (PanelArtifact / MermaidArtifact / ToolMermaidArtifact)
  tightened from `Partial<TAttachment>` to `TAttachment` since the
  caller always passes a full attachment.
- fileToArtifact: drop dead `?? ''` on content (guarded by the
  preceding type check).
- Tests: new click-interaction suite verifying the deferred-registration
  invariant, click registers + opens panel, and second click toggles
  closed without losing the registered artifact.

* 🧹 chore: Address follow-up review NITs

- artifacts.test.ts: regression-pin baseMime() with charset/case
  variants for text/html, text/markdown, application/vnd.react.
- attachmentTypes.ts: drop the now-unused isMermaidArtifact and
  isPanelArtifact wrappers (the routing collapsed onto a single
  artifactTypeForAttachment call in the previous commit, so they
  were only kept alive by their own test). attachmentTypes.test.ts
  rewritten to exercise artifactTypeForAttachment branches directly.
- Attachment.tsx + LogContent.tsx: re-sort the local imports
  longest-to-shortest per AGENTS.md (~/utils/artifacts is 72 chars
  and was sitting after a 51-char import).

*  feat: Auto-open panel + route txt/docx/odt/pptx through artifacts

- artifacts.ts: add `text/plain` to TOOL_ARTIFACT_TYPES so plain-text
  documents (and the markdown-like ones we don't have rich viewers for
  yet) can route through the side panel. `useArtifactProps` already
  dispatches `text/plain` to the markdown-style template, so they
  render cleanly with no panel-side change.
- Extension map gains txt/docx/odt/pptx → text/plain. pptx is wired
  up speculatively — backend extraction is still deferred, so the
  routing fires the moment that lands. The MIME map gets the matching
  office MIME types for symmetry (extension wins, but it's nice to
  have the fallback when sniffing returns the canonical office MIME).
- ToolArtifactCard: register the artifact in `artifactsState` on
  mount again. With visibility defaulting to `true` and the panel's
  `useArtifacts` hook auto-selecting the latest artifact, this gives
  the auto-open behaviour that the legacy streaming artifacts have.
  Click handler is now just "focus + reveal" (registration already
  happened); a user who has explicitly closed the panel keeps it
  closed and uses the click to re-open.
- Tests: parameterised row for each new extension; ArtifactRouting
  invariant flipped from "no register on mount" to "registers on
  mount so panel can auto-open". Existing TextAttachment test that
  used `a.txt` switched to `a.csv` since `.txt` now panel-routes.

* 🐛 fix: Auto-focus latest tool artifact + self-heal after panel close

Two bugs in the previous commit's auto-open behaviour:

1. After closing the side panel, no artifact card could be reopened.
   `useArtifacts.ts` resets `artifactsState` in its unmount cleanup
   (line 50), which fires when visibility goes to `false`. The card's
   mount-only `useEffect` doesn't refire after that wipe, so the
   subsequent click set `currentArtifactId` to an id that was no
   longer in `artifactsState`, and `Presentation.tsx` then refused to
   render the panel because `Object.keys(artifacts).length === 0`.

   Fix: the registration `useEffect` now has no dependency array, so
   it self-heals after the wipe (the dedup check keeps it cheap when
   nothing actually needs writing).

2. Newly-arrived artifacts didn't steal focus from an already-selected
   one. `useArtifacts`'s fallback auto-select (line 64) only fires
   when `currentId` is null or no longer in the list — it deliberately
   protects an existing selection, while the streaming-specific effect
   that handles legacy focus-stealing is gated on `isSubmitting`.
   That gate doesn't apply to tool-output artifacts.

   Fix: a second `useEffect` keyed on `artifact.id` calls
   `setCurrentArtifactId(artifact.id)` whenever a new card mounts.
   Cards mount in attachment-array order, so the LAST-mounted card
   (the newest tool output) wins — matching the legacy "latest auto-
   opens" UX.

Tests: replace the now-stale "no register on mount" assertion with
"registers and auto-focuses on mount", flip the toggle test to start
from the auto-focused state, and add two regression tests covering
the close-then-reopen path and the latest-of-many auto-focus.

*  feat: Route pptx through artifact panel with placeholder content

Before this commit, pptx files fell through to a plain FileContainer
chip even though the extension was wired into the artifact map: backend
text extraction is still deferred for pptx, so `attachment.text` came
back null/empty and `detectArtifactTypeFromFile`'s strict text check
returned null. That meant docx/odt rendered as proper artifact cards
while pptx in the same message rendered as a tiny download chip.

`detectArtifactTypeFromFile` now allows empty text for the plain-text
and markdown buckets, since their viewers (the markdown template) handle
empty content gracefully. HTML / React / Mermaid still require real
content because sandpack and mermaid.js error on empty input.

`fileToArtifact` substitutes a markdown placeholder
("Preview not available yet — click Download to view the file.") when
the file routes through the panel without text. The panel renders the
placeholder via the markdown template; pptx (and any docx that fails
extraction) gets visual parity with its siblings, and the moment
backend extraction lands the placeholder is replaced by real content
without any frontend change.

Tests: split the "no text returns null" assertion into the strict
viewers (HTML/React/Mermaid) and the lenient ones (plain-text/markdown);
add a fileToArtifact case proving pptx without text gets the
placeholder, and another proving real text wins when present.

*  feat: Dedup duplicate tool-artifact cards across tool calls + messages

Two `ToolArtifactCard` instances for the same file_id (e.g. agent reads
back what it just wrote, or the same file is referenced in turns 1 and
5) now collapse to a single chip — the most recent mount wins, the
older sibling re-renders to `null`.

Implementation:

- New `toolArtifactClaim` atomFamily keyed by artifact id. Each card
  generates a unique component-instance key via `useId()`, claims the
  slot in a `useLayoutEffect` (synchronous before paint, no flicker),
  and releases it on unmount only if the claim is still ours. A later
  card with the same id overwrites the claim → earlier card subscribes
  via `useRecoilState` and renders `null`.

- Family-keyed (per artifact id) so adding/removing a claim for one
  file never re-renders cards for unrelated files. Addresses the
  "messages view re-renders frequently" concern: each card subscribes
  only to its own slice.

- `ToolMermaidArtifact` shares the same atom via the new exported
  `toolArtifactKey()` helper, so the same `.mmd` file can't double-
  render either.

- Latest content always wins for the panel because the eager
  `setArtifacts` registration is last-write-wins on `artifactsState`
  by id — independent of which card holds the claim. Updating a file
  refreshes the panel content even if the chip's visual location
  doesn't move.

Tests: two new cases asserting that duplicate panel and mermaid
attachments collapse to a single rendered card.

* 🧹 chore: Address comprehensive review on code-artifacts-panel

- ToolArtifactCard self-heal now subscribes to a per-id selector
  (`artifactByIdSelector`) instead of a no-deps `useEffect`. Effect
  deps are `(artifact, existingEntry, setArtifacts)` so it runs
  deterministically when the slice transitions to undefined (panel-
  unmount cleanup) or when artifact content drifts — not on every
  parent render. Each card subscribes only to its own slice via the
  selectorFamily, so unrelated state changes don't re-render.
- artifacts.ts: localize the empty-content placeholder via a new
  `fileToArtifact(attachment, options?)` signature. Callers in
  `Attachment.tsx` (PanelArtifact) and `LogContent.tsx` resolve
  `com_ui_artifact_preview_pending` from `useLocalize` and thread
  it in. Default is empty string when no placeholder is supplied.
- artifacts.ts: thread `preClassifiedType` through `fileToArtifact`
  so the routing decision tree's `artifactTypeForAttachment` call
  is the only classification — previously `fileToArtifact` re-ran
  `detectArtifactTypeFromFile` after the routing already had the
  answer. Bucket type updated to `Array<{ attachment, type }>`.
- artifacts.ts: drop bare `text/plain` from `MIME_TO_TOOL_ARTIFACT_TYPE`.
  The extension map handles `.txt` explicitly; routing every
  unrecognized-extension `text/plain` file (extensionless scripts,
  `.env`, etc.) through the panel was a wider catch than the PR
  scope intended.
- artifacts.ts: stable `toLastUpdate` fallback of `0` (was
  `Date.now()`). `useArtifacts` sorts by `lastUpdateTime`, so a fresh
  timestamp on every call would re-sort entries non-deterministically
  across renders.
- artifacts.ts: drop dead `toolArtifactId = toolArtifactKey` alias.
  Add `filepath` to the key-derivation fallback chain so two
  unnamed-and-unidentified files don't collide on the literal
  `tool-artifact-unknown` key.
- ToolArtifactCard import order: package types before local types.
- store/artifacts.ts: JSDoc on `toolArtifactClaim` documenting the
  atomFamily-entries-persist-after-unmount trade-off (entries reset
  to null on card unmount; total cost is one key + a null per
  artifact — fine at typical session scale).
- Tests:
  - Updated existing `fileToArtifact` placeholder assertion to use
    the caller-provided string.
  - New: panel routing skips re-classification when
    `preClassifiedType` is provided.
  - New: bare `text/plain` MIME with unrecognized extension does
    NOT route through the panel.
  - New `LogContent.test.tsx` (6 cases) — HTML→panel, mermaid→
    inline, CSV→inline `<pre>`, archive→download chip, pptx→
    placeholder card, mixed split.
  - Dedup tests rewritten to use two AttachmentGroups (matching
    the real per-tool-call render) instead of a same-array
    duplicate that triggered React's duplicate-key warning.

* 🩹 fix: Address codex review + comprehensive review NITs

codex (P2):
- artifacts.ts: switch placeholder fallback to nullish coalescing.
  Empty string is now preserved as legitimate content (a 0-byte `.md`
  or `.txt` is a valid artifact, not "extraction unavailable") —
  only `null`/`undefined` triggers the deferred-extraction placeholder.
- Attachment.tsx: derive React keys via a new `renderKey` helper that
  combines `file_id` with the array index. Prevents duplicate keys
  when the same file_id appears twice in one bucket (rare but
  possible — a tool call writing the same path twice). Without
  unique keys, React's reconciler could reuse the wrong card
  instance, undermining the latest-mention dedup.

comprehensive review NITs:
- Attachment.tsx: hoist `import type { ToolArtifactType }` up into
  the type-import section per AGENTS.md.
- artifacts.ts `fileToArtifact`: defense-in-depth empty-text guard
  for the `preClassifiedType` path. Mirrors the gate in
  `detectArtifactTypeFromFile` so a future caller that bypasses
  classification can't hand sandpack/mermaid an empty buffer.
  Plain-text and markdown remain tolerated empty.

Tests:
- New: empty `.md` content passes through unchanged when a
  placeholder is also supplied.
- New: sibling cards with the same file_id in one group render
  without React key-collision warnings.
- Updated existing placeholder test to use `text: null` (the case
  where the placeholder is actually meant to fire).
- Three parameterized cases pinning the new
  preClassifiedType-with-empty-text safety guard.

* 🩹 fix: Address codex P1/P2 review on code-artifacts-panel

- P1 (stale artifacts leak across conversations): Add a top-level
  `useResetArtifactsOnConversationChange` hook in `Presentation.tsx`
  that wipes `artifactsState` / `currentArtifactId` on every
  conversation switch, regardless of panel visibility. Without this
  guard, ToolArtifactCard's self-heal effect would re-register the
  previous conversation's artifacts after panel close, leaking them
  into the next conversation's panel on open.

- P2 (expiresAt skipped on panel-routed entries): Restore the legacy
  expiry gate in `LogContent` ahead of panel/mermaid bucket-sort, so
  expired pptx/html/etc. attachments fall back to the
  "download expired" message instead of rendering as a clickable
  artifact card backed by a dead link.

Includes regression coverage for both paths.

* 🧹 chore: Share renderAttachmentKey across Attachment + LogContent

Hoist the per-occurrence React-key helper from `Attachment.tsx` into
`attachmentTypes.ts` so `LogContent` can use the same pattern. Apply
it to LogContent's panel/mermaid/text/image/nonInline buckets — the
prior keys (e.g. `mermaid-${file_id ?? index}`, `file.filepath ?? ...`)
would have collided if the same file_id appeared twice in one render,
even though that's astronomically rare for a single tool call.

Also drops the unused `file_id` field on `MermaidEntry` since the key
no longer needs it.

* 🩹 fix: Loosen artifacts util input types to match runtime fallbacks

`fileToArtifact`, `detectArtifactTypeFromFile`, `toolArtifactKey`, and
`toLastUpdate` all read every picked field with a nullish fallback —
their inputs were nonetheless typed as required `Pick<TFile, ...>`.
That mismatch made every realistic fixture (and several call sites
that lack a stable `filepath`) fail typecheck for fields the
implementations never strictly need.

Wrap the picks in `Partial<>` so the type matches the contract.

* 🩹 fix: Gate tool-artifact registration on claim winner

When two `ToolArtifactCard` instances mount for the same `artifact.id`
with divergent content (a code-execution file overwritten across
turns reuses its file_id), both effects subscribe to `existingEntry`
through `artifactByIdSelector`. Each card detects the other's write as
drift and overwrites it back, ping-ponging `artifactsState` between
old and new content and causing render churn / panel flicker.

Gate the self-heal registration on `isMyClaim` so only the latest
(claim-holding) card writes. The non-winner still subscribes to the
slice but short-circuits before calling `setArtifacts`, breaking the
loop. Adds a regression test that fails (loop / wrong final content)
without the gate.
2026-04-28 06:08:32 +09:00
Danny Avila
24e29aa8cb
🌱 fix: Inject Code-Tool Files Into Graph Sessions on First Call (+ read_file Sandbox Fallback) (#12831)
* 🌱 fix: Seed Code Tool Files Into Graph Sessions on First Call

Files attached to an agent's `tool_resources.execute_code` (user uploads
or generated artifacts from a prior turn) were silently dropped on the
first `execute_code` invocation of a turn. The agents-side `ToolNode`
populates `_injected_files` only when its `sessions` map already has an
`EXECUTE_CODE` entry — but that entry is only written by a previous
successful execution, so call #1 had nothing to inject. CodeExecutor
then fell back to a `/files/{session_id}` fetch, but `session_id` was
also empty on call #1, leaving the sandbox without the primed files.

Mirror the existing skill-priming pattern (`primeInvokedSkills` →
`initialSessions`) for code-resource files: eagerly call `primeFiles`
before `createRun` and merge the result into `initialSessions` via a
new `seedCodeFilesIntoSessions` helper. Skill files and code-resource
files now share the same `EXECUTE_CODE` entry; the prior representative
`session_id` is preserved on merge.

* 🔬 chore: Add Diagnostic Logging for Code-Files Seeding

Temporary debug logs to diagnose why first-call file injection is not
firing in real agent runs. Logs `wantsCodeExec`, available tool-resource
keys, primed file count, and the seeded EXECUTE_CODE entry. Will revert
once the failure mode is identified.

* 🪛 refactor: Capture primedCodeFiles per-agent at init, merge across run

Replace the client.js eager `primeFiles` call with a per-agent capture at
initialization time so every agent in a multi-agent run (primary +
handoff + addedConvo) contributes its `tool_resources.execute_code`
files to the shared `Graph.sessions` seed.

- handleTools.js (eager loadTools): the `execute_code` factory closes
  over a `primedCodeFiles` slot and surfaces it in the return.
- ToolService.js loadToolDefinitionsWrapper (event-driven): captures
  `files` from the existing `primeCodeFiles` call (was dropping them
  while only keeping `toolContext`) and surfaces them.
- packages/api initialize.ts: the loadTools callback contract now
  includes `primedCodeFiles`, threaded onto `InitializedAgent`.
- client.js: iterate `[primary, ...agentConfigs.values()]` and merge
  each agent's `primedCodeFiles` into `initialSessions`. Drop the
  primary-only `primeCodeFiles` call and diagnostic logs from the prior
  attempt — wrong layer (single-agent), wrong gate (`agent.tools`
  contained Tool instances after init, so the `.includes("execute_code")`
  string check always failed).

* 🔬 chore: Add per-agent diagnostic logs for code-files seeding

Logs `tool_resources` keys + file counts inside loadToolDefinitionsWrapper
and per-agent `primedCodeFiles` + final initialSessions inside
AgentClient. Will revert once the failure mode is confirmed.

* 🔬 chore: Add file-lookup diagnostics inside initializeAgent

Logs the inputs and intermediate counts of the conversation-file lookup
chain (convo file ids, thread message ids, code-generated and
user-code file counts) so we can pinpoint why `tool_resources.execute_code`
is arriving empty at `loadToolDefinitionsWrapper` despite the agent
having `execute_code` in its tools list.

* 🔬 chore: Probe execute_code files without messageId filter

Adds a relaxed `getFiles({conversationId, context: execute_code})` probe
that runs only when `getCodeGeneratedFiles` returns empty. Lists what's
actually in the DB for this conversation so we can confirm whether the
file is missing entirely or whether the messageId filter is rejecting it.

* 🔬 chore: Fix probe getFiles arg order (sort vs projection)

Probe was passing a projection object as the sort arg, which mongoose
rejected with `Invalid sort value`. Move it to the third arg
(selectFields) so the probe actually runs.

* 🪢 fix: Preserve Original messageId on Code-Output File Update

Each `processCodeOutput` call was overwriting the persisted file's
`messageId` with the *current* run's id. When a turn re-creates an
existing file (filename + conversationId match → `claimCodeFile`
returns the existing record, `isUpdate=true`), the file's link to
the assistant message that originally produced it gets clobbered.

`initializeAgent` later runs `getCodeGeneratedFiles({messageId: $in: <thread>})`
to seed `tool_resources.execute_code` from prior-turn artifacts. With a
stale `messageId` (e.g. from a failed read attempt that re-shelled the
same filename), the file no longer matches the parent-walk thread, so
`tool_resources` arrives empty at agent init, the new
`primedCodeFiles` channel has nothing to seed, and the LLM can't see
its own prior-turn artifacts on the next turn — defeating the
just-added Graph-sessions seeding fix.

Preserve the existing `claimed.messageId` on update; first-creation
behavior is unchanged. The runtime return value still includes the
current run's `messageId` (via `Object.assign(file, { messageId })`)
so the artifact is correctly attributed to the live tool_call.

* 🧹 chore: Remove diagnostic logs from code-files seeding path

Drops the temporary debug logs added to trace the empty-tool_resources
failure mode. Production code paths (loadToolDefinitionsWrapper,
client.js seed loop, initializeAgent file lookup) are left as the
permanent shape: capture primedCodeFiles, merge across agents, seed
initialSessions before run start.

* 🪛 feat: read_file Sandbox Fallback for /mnt/data + Non-Skill Paths

When the model called `read_file` with a code-execution path (e.g.
`/mnt/data/sentinel.txt`), the handler returned a misleading
`Use format: {skillName}/{path}` error. Adds a sandbox-aware fallback:

- Short-circuit `/mnt/data/...` (can never be a skill reference) →
  route to a sandbox `cat` via the new host-provided `readSandboxFile`
  callback, which POSTs to the codeapi `/exec` endpoint.
- Skip the skill resolver entirely when `accessibleSkillIds` is empty
  — the resolved-output of `resolveAgentScopedSkillIds` already
  collapses the admin capability + ephemeral badge + persisted
  `skills_enabled` chain, so an empty value is the authoritative
  "skills aren't in scope for this agent" signal.
- For `{firstSegment}/...` paths, consult the catalog-derived
  `activeSkillNames` Set (no DB read) to detect non-skill names and
  fall through to the sandbox before the model has to retry with
  `bash_tool`.

`activeSkillNames` is captured from `injectSkillCatalog`, threaded onto
`InitializedAgent`, into `agentToolContexts`, then through
`enrichWithSkillConfigurable` into `mergedConfigurable` for the handler.

The host implementation of `readSandboxFile` lives in
`api/server/services/Files/Code/process.js` and shells `cat <path>`
through the seeded sandbox session — `tc.codeSessionContext`
(emitted by ToolNode for `read_file` calls in `@librechat/agents`
v3.1.72+) provides the `session_id` + `_injected_files` so the read
lands in the same sandbox that holds prior-turn artifacts. When the
seeded context isn't available (older agents version, no codeapi
configured), the handler returns a model-visible error pointing at
`bash_tool` instead of silently failing.

Tests: 8 new `handleReadFileCall` cases cover the new short-circuits,
the skills-not-enabled gate, the activeSkillNames lookup, the
sandbox-fallback success path, and the bash_tool retry hint on
fallback failure. Existing `read_file` tests now opt into "skills are
in scope" via a `skillsInScope()` fixture (production wouldn't reach
the skill lookup with empty `accessibleSkillIds`).

* 🔧 chore: Update @librechat/agents dependency to version 3.1.72

Bumps the version of the @librechat/agents package across package-lock.json and relevant package.json files to ensure compatibility with the latest features and fixes.

* 🪛 refactor: Centralize Tool-Session Seed in buildInitialToolSessions Helper

Addresses review feedback on the per-agent merge in client.js:

- **Run-wide semantics, named explicitly.** The merge into a single
  `Graph.sessions[EXECUTE_CODE]` was a deliberate match to the
  agents-library design (`Graph.sessions` is shared across every
  `ToolNode` in the run), but the inline `for (const a of agents)`
  loop in `AgentClient.chatCompletion` made it look per-agent. Move
  the logic to a TS helper `buildInitialToolSessions` that documents
  the run-wide-by-design contract in one place. The CJS controller
  now contains a single call site, no business logic.

- **Subagent walk (P2).** The previous loop only iterated
  `[primary, ...agentConfigs.values()]`. Pure subagents are pruned
  out of `agentConfigs` after init and retained on each parent's
  `subagentAgentConfigs`, so their primed code files were silently
  dropped from the seed. The helper now walks recursively, with a
  visited-Set keyed on object identity that terminates safely on a
  malformed agent graph (cycle).

- **`jest.setup.cjs` polyfill for undici `File`.** Reviewer hit
  `ReferenceError: File is not defined` running the targeted spec on
  WSL — a known Node 18 issue where `globalThis.File` from
  `node:buffer` isn't auto-exposed. Polyfill it inside a Jest setup
  file so the suite boots regardless of Node patch version.

Helper test coverage (8 new): skill-only / agent-only / both,
recursive subagent walk, cycle-safe walk, primary+subagent
deduplication, undefined/null entries in the agents iterable, and
representative session_id preservation across the merge.

16 tests pass total in `codeFilesSession.spec.ts` (8 prior + 8 new).
No behavior change vs. the previous commit for the existing
primary+agentConfigs case — subagent inclusion is the only new
behavior, and it matches what the existing seeding logic would have
done if subagents had been in `agentConfigs`.

* 🪛 fix: FIFO Walk Order in buildInitialToolSessions (P3 review)

The traversal used `Array.pop()` (LIFO), which visited the LAST
top-level agent first. The docstring says "primary first"; the code
contradicted it. When no skill seed exists the first-visited agent's
first file supplies the representative `session_id` written to
`Graph.sessions[EXECUTE_CODE]` — so a LIFO walk silently flipped which
agent that came from. `ToolNode` ultimately uses per-file `session_id`s
for runtime injection (so behavior was indistinguishable for current
callers), but the discrepancy was a footgun for any future consumer
that read the representative.

Switch to FIFO via `Array.shift()` to match both the docstring and the
existing `loadSubagentsFor` walk pattern in
`Endpoints/agents/initialize.js`. Add a regression test that asserts
the primary's `session_id` is the representative (and that all three
agents' files still contribute, with per-file `session_id`s preserved).

* 🔬 test: Lock In Code-Files Bug Fixes Per Comprehensive Review

Addresses MAJOR + MINOR + NIT findings from the multi-pass review:

**Finding #4 (MINOR) — empty relativePath misses sandbox fallback.**
A model calling `read_file("output/")` where "output" isn't a skill
name dead-ended with `Missing file path after skill name` instead of
being routed to the sandbox like every other malformed-path branch.
Add the same `codeEnvAvailable → handleSandboxFileFallback` pattern,
plus two regression tests.

**Finding #7 (NIT) — duplicate `skillsInScope()` helper.**
Hoist the identical helper out of two nested describe blocks to
module scope. Single source of truth.

**Finding #1 (MAJOR) — `persistedMessageId` had zero test coverage.**
The fix preserves a file's original `messageId` on update so
`getCodeGeneratedFiles` can still match it on subsequent turns. A
regression in the `isUpdate ? (claimed.messageId ?? messageId) : messageId`
ternary would silently re-introduce the original cross-turn priming
bug. Five new tests cover:
- UPDATE preserves `claimed.messageId` in the persisted record
- UPDATE falls back to current run id when `claimed.messageId` is
  absent (legacy records predating the field)
- CREATE uses current run id (no claimed record exists)
- The runtime return value uses the LIVE id (artifact attribution)
  even when the persisted record kept the original
- The image branch follows the same contract (would silently regress
  if the ternary diverged across the two file-build branches)

The tests use a `snapshotCreateFileArgs()` helper because
`processCodeOutput` mutates the file object after `createFile`
returns (`Object.assign(file, { messageId, toolCallId })`) and a
naive `createFile.mock.calls[0][0]` would reflect the post-mutation
state instead of what was actually persisted.

**Finding #2 (MAJOR) — `readSandboxFile` had no direct tests.**
The model-controlled `file_path` flows through a POSIX single-quote
escape into a shell `cat` command, making this a security boundary.
A quoting regression would let a malicious filename break out of the
quoted argument and inject arbitrary shell. 20 new tests across:
- Shell quoting (7): plain filenames, embedded `'`, `$()`, backticks,
  newlines, shell metachars, multiple consecutive single-quotes
- Payload shape (6): /exec URL, bash language, conditional
  session_id / files inclusion, dedicated keepAlive:false agents
- Response handling (6): `{content}` on success, null on missing
  base URL or absent stdout, throws on stderr-only, partial-success
  returns stdout, transport errors are logged then rethrown
- Timeout (1): matches processCodeOutput's 15s SLA

Audited findings #5 (acknowledged tech debt — readSandboxFile in JS
workspace), #6 (pre-existing positional-args debt on
enrichWithSkillConfigurable), and #8 (cosmetic JSDoc style) — no
action taken per the reviewer's own assessment.

Audited finding #3 (walk order vs docstring) — already addressed in
commit 007f32341 which converted to FIFO via `queue.shift()` plus a
regression test. The audit was performed against an earlier PR head.

Tests: 152 packages/api + 195 api JS = 347 pass. Typecheck clean.

* 🪛 fix: Pure-Subagent codeEnv + Primed-Skill Routing + ToolService Early Returns

Three findings from the second-pass review:

**P2 — Pure subagents missed `codeEnvAvailable`** (initialize.js).
The pure-subagent init path didn't forward the endpoint-level
`codeEnvAvailable` flag to `initializeAgent`, unlike the primary,
handoff, and addedConvo paths. A code-enabled subagent loaded only
through `subagentAgentConfigs` initialized with
`codeEnvAvailable: false`, so even though the recursive seed walk
found its primed code files, the subagent's own `bash_tool` /
`read_file` sandbox fallback were silently gated off. Forward the
flag and add `codeEnvAvailable: config.codeEnvAvailable` to the
`agentToolContexts.set` for symmetry with the other paths.

**P2 — Primed skills outside the catalog cap were misrouted to
sandbox** (handlers.ts). Manual ($-popover) and always-apply primes
are intentionally resolved off the wider `accessibleSkillIds` ACL
set BEFORE catalog injection — see `resolveManualSkills` for why a
skill outside the `SKILL_CATALOG_LIMIT` cap can still be authorized
for direct manual invocation. The `activeSkillNames` shortcut ran
before reading `skillPrimedIdsByName`, so a primed skill not in the
catalog would fall through to the sandbox instead of resolving via
the pinned `_id`. Read the primed map first and bypass the shortcut
for primed names. New regression test asserts a primed-but-not-
cataloged skill resolves through the existing skill path with
`getSkillByName` invoked and `readSandboxFile` NOT called.

**P3 — `loadAgentTools` early returns dropped `primedCodeFiles`**
(ToolService.js). The non-`definitionsOnly` path captures the field
correctly, but two early-return branches (no-action-tools fast path,
no-action-sets fast path) omitted it. Any traditional
`loadAgentTools(..., definitionsOnly: false)` caller using
execute_code without action tools would have its first-call session
seed silently empty. Add `primedCodeFiles` to both early returns
for consistency with the final return shape.

Tests: 153 packages/api + 195 api JS = 348 pass.

* 🧹 chore: Document jest.mock arrow-indirection pattern in process.spec.js

Per the second-pass review's Finding #2 (NIT, "would help future
readers"): the mock setup mixes direct `jest.fn()` references with
arrow-function indirection (`(...args) => mockX(...args)`). The
indirection isn't stylistic — it's required because `jest.mock(...)`
is hoisted above the outer `const` declarations at parse time, so a
direct reference would capture `undefined`. Inline comment explains
the pattern so the next reader doesn't have to reverse-engineer it
or accidentally "simplify" the mocks and break per-test
`mockReturnValueOnce` / `mockImplementationOnce` overrides.

* 🪛 fix: Five Issues from Pass-N + Codex Review (incl. 404 root cause)

Five real bugs surfaced by another review pass + Codex PR comments
+ the codeapi-side logs we collected during manual testing:

**1) `processCodeOutput` 404 root cause (`callbacks.js`).**
The codeapi worker emits TWO distinct `session_id`s on a tool result:
  - `artifact.session_id` is the EXEC session — the sandbox VM that
    ran the bash command. Files don't live there; it's torn down
    post-execution.
  - `file.session_id` is the STORAGE session — the file-server
    bucket prefix where artifacts actually live.
`callbacks.js` was passing the EXEC id to `processCodeOutput`, which
builds `/download/{session_id}/{id}` and 404s because the file-server
doesn't know about that path. This explains every "Error
downloading/processing code environment file" we saw during testing.
Use `file.session_id ?? output.artifact.session_id` (per-file id with
artifact-level fallback for older worker payloads).

**2) `primeFiles` reupload pushed STALE sandbox ids (`process.js`).**
When `getSessionInfo` returns null (file expired/missing in sandbox),
`reuploadFile` re-uploads via `handleFileUpload`, gets a NEW
`fileIdentifier`, and persists it on the DB record. But `pushFile`
was a closure capturing the OLD `(session_id, id)` parsed at the top
of the loop, so the in-memory `files[]` array (now consumed by
`buildInitialToolSessions` to seed `Graph.sessions`) silently
referenced a sandbox object that no longer existed. The first tool
call would 404 trying to mount it; only the next turn's metadata
re-read would correct course. Parameterize `pushFile` with optional
`(session_id, id)` overrides; in `reuploadFile` parse the new
identifier and pass through. 2 regression tests.

**3) Codex P2 — Cap sandbox fallback output before line-numbering
(`handlers.ts`).** The new `handleSandboxFileFallback` returned
`addLineNumbers(result.content)` without a size guard, so reading a
multi-MB `/mnt/data/*` artifact materialized the file twice in
memory (raw + line-numbered) before downstream truncation. Match the
existing skill-file path's `MAX_READABLE_BYTES` (256KB): truncate
raw first, then number, surface the truncation to the model so it
can use `bash_tool` (`head` / `tail`) for the rest. 2 tests
(oversized truncates with hint, in-cap doesn't).

**4) Codex P2 — Dedupe seeded code files by `(session_id, id)`
(`codeFilesSession.ts`).** Multiple agents in a run commonly carry
the same primed execute-code resources (shared conversation files);
without dedupe, `_injected_files` grows proportionally to agent
count and bloats every `/exec` POST. Use a `(session_id, id)`
identity key so first-seen wins (preserves source ordering); name
alone isn't sufficient because two distinct primed uploads can
share a filename across different sessions. 4 tests covering dedup
across iterations, against pre-existing entries, name-collision
distinct-session preservation, and the multi-agent realistic case
in `buildInitialToolSessions`.

**5) Pass-N P2 — Polyfill `globalThis.File` in api Jest setup
(`api/test/jestSetup.js`).** `packages/api/jest.setup.cjs` had the
polyfill; the legacy api workspace's Jest config has its own
`setupFiles` that didn't, so on Node 18 / WSL the api focused tests
still failed at import time with `ReferenceError: File is not
defined` from undici. Mirror the polyfill.

Tests: 159 packages/api + 206 api JS = 365 pass. Typecheck clean.

* 🔧 chore: Update @librechat/agents dependency to version 3.1.73

Bumps the version of the @librechat/agents package across package-lock.json and relevant package.json files to ensure compatibility with the latest features and fixes.
2026-04-27 08:56:39 +09:00
Danny Avila
f7e47f6012
🪢 feat: Enable Tool-Output References for Bash Tool (#12830)
* chore: Update `@librechat/agents` to v3.1.71-dev.0 across package-lock and package.json files

This commit updates the version of the `@librechat/agents` package from `3.1.70` to `3.1.71-dev.0` in the `package-lock.json` and relevant `package.json` files. Additionally, it marks several dependencies as peer dependencies, ensuring better compatibility and integration across the project.

* 🔗 feat: Enable Tool-Output References for bash_tool when codeenv is on

Wires `@librechat/agents`' `RunConfig.toolOutputReferences` into
`createRun()` and the bash tool's LLM-facing description, gated by the
per-agent `effectiveCodeEnvAvailable` flag. The feature auto-activates
for any run where the bash tool is actually registered; SDK defaults
(~400 KB per output, 5 MB total) match the shell-safe budget. No new
env var or yaml capability — piggybacks on the existing `execute_code`
gate.

- `tools.ts`: replace the module-level `BASH_TOOL_DEF` constant with a
  per-call `buildBashToolDef` that wraps `buildBashExecutionToolDescription`.
  Description now includes the `{{tool<idx>turn<turn>}}` reference syntax
  guide iff the new `enableToolOutputReferences` param is true.
- `initialize.ts`: pass `enableToolOutputReferences: effectiveCodeEnvAvailable`
  into `registerCodeExecutionTools`.
- `run.ts`: add `codeEnvAvailable?: boolean` to `RunAgent`, compute the
  flag from `agents[*].codeEnvAvailable`, and conditionally spread
  `toolOutputReferences: { enabled: true }` into `Run.create`.

* 🧪 test: Cover tool-output references gating end-to-end

- `tools.spec.ts`: 3 new cases asserting `bash_tool.description`
  contains `{{tool<idx>turn<turn>}}` iff `enableToolOutputReferences` is
  true (and unset → false).
- `run-summarization.test.ts`: 4 new cases asserting `Run.create` is
  invoked with `toolOutputReferences: { enabled: true }` iff at least
  one `RunAgent.codeEnvAvailable === true`. Covers the present /
  absent / unset / multi-agent-OR cases.
- `initialize.test.ts` + `skills.test.ts`: extend the existing
  `@librechat/agents` jest mocks with a `buildBashExecutionToolDescription`
  stub so suites stay green when the on-disk SDK lags the published
  3.1.71-dev.0 export.

* chore: Update `@librechat/agents` version to `3.1.71-dev.1` in package-lock and package.json files

This commit updates the version of the `@librechat/agents` package from `3.1.71-dev.0` to `3.1.71-dev.1` across the relevant package files. This change ensures consistency and incorporates any updates or fixes from the new version.

* 🪢 fix: Walk Subagents in toolOutputReferences run-level gate

Codex P2 review on PR #12830: the run-level
`enableToolOutputReferences` flag only inspected the top-level
`agents` array. A parent agent without `execute_code` that spawns a
subagent that *does* have it left the SDK's tool-output reference
registry inactive for the run, so the subagent's `bash_tool` calls
saw `{{tool<idx>turn<turn>}}` placeholders pass through to the
shell unsubstituted.

Replace `agents.some(a => a.codeEnvAvailable === true)` with a
recursive `anyAgentHasCodeEnv` helper that walks
`subagentAgentConfigs` transitively. Cycle-safe via a `visited` set,
mirroring the existing `buildSubagentConfigs.ancestors` pattern in
the same module. The bash tool *description* stays per-agent in
`initializeAgent` (only agents with bash actually registered learn
the `{{…}}` syntax), so broadening the run-level gate doesn't
broaden the model-facing surface — it just lets the SDK's shared
registry serve every `ToolNode` the run compiles, which is exactly
the contract the SDK already implements.

Tests cover three new cases: parent-off / subagent-on, parent-off /
child-off / grandchild-on (transitive descent past one level), and
a cyclic A↔B tree with neither codeenv-enabled (asserts both
termination and absence of `toolOutputReferences`). Existing
single-agent and multi-agent tests stay valid since the new helper
returns `true` whenever the previous `.some(...)` did.

* chore: Update `@librechat/agents` version to `3.1.71` in package-lock and package.json files

This commit updates the version of the `@librechat/agents` package from `3.1.71-dev.1` to `3.1.71` across the relevant package files. This change ensures consistency and incorporates any updates or fixes from the stable release.

* review: address audit findings on tool-output references PR

Two findings from comprehensive PR review on #12830:

#1 (MINOR) — `injectSkillCatalog` omitted `enableToolOutputReferences`
when calling `registerCodeExecutionTools`, so its resulting
`bash_tool` description always lacked the `{{tool<idx>turn<turn>}}`
guide. Today this is a no-op because `initializeAgent` registers
first and the registry `.has()` check makes the skills-path call a
dedupe-only operation. But if call order ever flips (skills-first),
the missing flag would silently ship a `bash_tool` without the
syntax guide, and the `initializeAgent` pass would itself become
the no-op — the feature would silently break with no visible error.
Forward `enableToolOutputReferences: codeEnvAvailable === true` so
both call sites produce identical tool definitions regardless of
firing order. Defense-in-depth, not a current bug. Added a test in
`skills.test.ts` that asserts the bash description contains the
`{{tool<idx>turn<turn>}}` marker when `codeEnvAvailable` is on,
exercising the skills caller end-to-end.

#2 (NIT) — `buildBashToolDef` allocated + froze a fresh object on
every agent init. Replaced with two module-level frozen singletons
(`BASH_TOOL_DEF_WITH_OUTPUT_REFS`, `BASH_TOOL_DEF_WITHOUT_OUTPUT_REFS`)
built once at module load via a `createBashToolDef` helper. The
factory now picks the right cached reference instead of building.
Restores the no-allocation intent of the original `BASH_TOOL_DEF`
constant while keeping the per-agent gate behavior. Two new tests
in `tools.spec.ts` pin the contract: identical-flag calls return
reference-equal `bash_tool` defs across registries; opposite-flag
calls return distinct frozen objects with the expected description
content.
2026-04-26 02:06:23 -07:00
Danny Avila
8c073b4400
📄 feat: Auto-render Text-Based Code Execution Artifacts Inline (#12829)
* 📄 feat: Auto-render Text-Based Code Execution Artifacts Inline

Eagerly extract text content from non-image artifacts produced by code
execution tools and render it inline in the message instead of behind a
click-to-download file card. Reuses the SkillFiles binary-detection
helper and the existing parseDocument dispatcher so docx, xlsx, csv,
html, code, and other text-renderable formats land directly under the
tool call.

PPTX is intentionally classified but not yet extracted — follow-up.

* 🌐 chore: Remove unused com_download_expires locale key

Removed in en/translation.json so the detect-unused-i18n-keys CI check
passes. The only reference was a commented-out localize() call in
LogContent.tsx that was deleted in the previous commit.

* 🩹 fix: Address PR review on code artifact text extraction

- extract.ts: build the temp document path from a randomUUID and pass
  path.basename(name) as originalname so a malicious artifact name
  cannot escape os.tmpdir() (P1 traversal flagged by codex/Copilot).
- process.js: classify and extract using safeName, not the raw name —
  defense in depth alongside the temp-path fix.
- classify.ts: add a bare-name lookup so extensionless text artifacts
  (Makefile, Dockerfile, …) classify as utf8-text instead of falling
  through to other.
- Attachment.tsx: wire aria-expanded / aria-controls on the show-all
  toggle for screen reader support.
- LogContent.tsx: restore a download chip (LogLink) on inline-text
  attachments so users can still pull down the underlying file.
- Tests: cover extensionless filenames and the temp-path traversal
  invariant.

* 🩹 fix: Address comprehensive PR review on code artifact extraction

- extract.ts: walk back to a UTF-8 code-point boundary before truncating
  so cuts cannot land mid-multibyte and emit U+FFFD (CJK/emoji concern).
  truncate() now accepts the original buffer to skip a redundant encode.
- extract.ts: add an 8s timeout around parseDocument via Promise.race so
  a pathological docx/xlsx cannot stall the response path.
- process.js: always set `text` (string or null) on the file payload —
  createFile uses findOneAndUpdate with $set semantics, so omitting the
  field leaves a stale value behind when an artifact's content changes.
- Attachment.tsx: switch the show-all toggle from char-count threshold
  to a useLayoutEffect ref measurement on scrollHeight, and use
  overflow-hidden when collapsed (overflow-auto when expanded) so the
  collapsed box has a single clear interaction model.
- Attachment.tsx + LogContent.tsx: lift `isImageAttachment` /
  `isTextAttachment` into a shared attachmentTypes module. LogContent
  keeps its looser image check (no width/height required) because the
  legacy log surface receives attachments without dimensions.
- Tests: cover multi-byte boundary, the always-set-text contract on
  updates, and the new shared predicates.

* 🧪 test: Component test for TextAttachment + direct withTimeout coverage

- Attachment.tsx: re-order local imports longest-to-shortest per
  AGENTS.md (attachmentTypes ahead of FileContainer/Image).
- extract.ts: export withTimeout so it can be unit-tested directly
  (it's also used internally — exporting carries no runtime cost).
- extract.spec.ts: three small unit tests on withTimeout that cover
  resolve, propagated rejection, and timeout rejection paths with
  real timers.
- TextAttachment.test.tsx: ten cases for the new React component —
  text rendering in <pre>, download chip presence/absence, ref-based
  collapse measurement (with scrollHeight stubbed via prototype),
  aria-expanded toggle, fall-through to FileAttachment for missing
  and empty text, and AttachmentGroup routing.

* 🩹 fix: Canonicalize document MIME by extension before parseDocument

When the classifier puts a file on the document path via its extension
(.docx, .xlsx, …) but the buffer sniffer returned a generic value like
application/zip or application/octet-stream, we previously forwarded
that generic MIME to parseDocument, which dispatches strictly by MIME
and silently rejected it — exactly defeating the extension-first
classification this PR added.

extractDocument now remaps the MIME from the extension (falling back
to the original sniffed MIME if the extension is unrecognized, so files
that reached the document branch via MIME detection still work). Adds
a parameterized test across docx/xlsx/xls/ods/odt against zip/octet
sniffs to guard the regression.

* 🩹 fix: Reuse existing withTimeout from utils/promise

The previous commit's local withTimeout export collided with the
already-exported `withTimeout` from `~/utils/promise`, breaking the
@librechat/api tsc job (TS2308 ambiguous re-export).

Drops the duplicate, imports from `~/utils/promise`, and removes the
now-redundant unit tests (the helper has its own coverage in
utils/promise.spec.ts). The third argument shifts from a label to the
fully-formed timeout error message that the existing helper expects.

* 🧹 chore: TextAttachment test polish (NITs)

- Use the conventional `import Attachment, { AttachmentGroup }` form
  rather than `default as Attachment`.
- Save the original `scrollHeight` property descriptor and restore it
  in afterAll, so the prototype patch never leaks past this suite.
2026-04-26 02:04:00 -07:00
Danny Avila
596f806f60 🛡️ fix: Strict Opt-In Skills Activation per Agent (#12823)
* 🛡️ fix: Strict opt-in skills activation per agent

Skills were activating on every agent run that had the capability +
RBAC enabled, regardless of whether the user (ephemeral) or author
(persisted) had opted in. `scopeSkillIds(undefined)` fell through to
"full accessible catalog" whenever `agent.skills` was unset, which is
the default state for any agent created before skills existed and for
every ephemeral agent.

Activation now requires an explicit signal:
- Ephemeral agent → per-conversation skills badge toggle.
- Persisted agent → new `skills_enabled` master switch on the agent
  doc, surfaced as a toggle in the Agent Builder skills section.
  Enabled + empty/undefined allowlist = full accessible catalog;
  enabled + non-empty allowlist = narrow to those ids; disabled (or
  undefined) = no skills available, even if an allowlist is set.

Centralised the predicate in `resolveAgentScopedSkillIds` so the
primary-agent path, handoff/discovery, the subagent loop, and both
OpenAI controllers all share one source of truth. Frontend `$`
popover scope mirrors the same logic so the UI never offers skills
the backend would refuse to activate.

* test: mock resolveAgentScopedSkillIds in agent controller specs

* refactor: address review findings on skills opt-in PR

- AgentConfig: associate skills label with toggle via htmlFor for
  click/keyboard affordance; simplify Switch handler to Boolean(value).
- skills: mark scopeSkillIds as @internal so runtime callers continue
  to route through resolveAgentScopedSkillIds and inherit the activation
  predicate (ephemeral toggle, persisted skills_enabled).

* fix(agents): include skills_enabled in agent list projection

Without this field, agents loaded via the list endpoint hydrate into the
client agentsMap with skills_enabled === undefined, causing the `$`
skill popover to hide every skill on a fresh page load even when the
agent was saved with skills_enabled: true.

* fix(skills): fail closed for persisted agents during agentsMap hydration

Returning undefined while the agents map loads let the popover render the
full catalog for a persisted agent before we could read its
skills_enabled flag, so the user could pick a skill the backend would
then refuse for the turn. Match the strict opt-in contract by returning
[] until the map is authoritative.

* refactor(skills): extract skillsHintKey for readability

Replaces the nested ternary in the skills section JSX with a
pre-computed constant so the activation -> hint key mapping reads
top-down.

* refactor(skills): unflatten skillsHintKey to remove nested ternary
2026-04-25 04:02:01 -04:00
Danny Avila
7f3d41024a 📦 chore: Update @librechat/agents to v3.1.70
This commit updates the version of the `@librechat/agents` package from `3.1.68-dev.1` to `3.1.70` in the `package-lock.json` and relevant `package.json` files. This change ensures consistency across the project and incorporates any updates or fixes from the new version.
2026-04-25 04:02:01 -04:00
Danny Avila
d83cb84f59 🪆 feat: Subagent configuration in Agent Builder (#12725)
* 🪆 feat: Subagents configuration (isolated-context child agents)

Surfaces the new @librechat/agents `SubagentConfig` primitive in the Agent
Builder. Subagents let a supervisor delegate a focused subtask to a child
graph running in an isolated context window: verbose tool output stays in
the child, only a filtered summary returns to the parent.

Data model: new `subagents: { enabled, allowSelf, agent_ids }` on Agent,
wired through the Zod, Mongoose, and form schemas plus a new
`AgentCapabilities.subagents` capability (enabled by default).

Backend: `initialize.js` loads explicit subagent configs alongside handoff
agents, and drops subagent-only references from the parallel/handoff maps
so they don't leak into the supervisor's graph. `run.ts` emits
`SubagentConfig[]` on the primary `AgentInputs` — a self-spawn entry when
`allowSelf` is enabled plus one entry per configured agent.

UI: an "Advanced" panel section with an enable toggle, a self-spawn
toggle, and an agent picker (capped at 10). Enabling without adding
agents still yields self-spawn; disabling self-spawn with no agents shows
a warning. A capability flag gates the whole section.

* 🪆 feat: Stream subagent progress to UI (dialog + inline ticker)

Pairs with the @librechat/agents SDK change that forwards child-graph
events through the parent's handler registry (danny-avila/agents#107):

- Self-spawn and explicit subagents can now use event-driven tools,
  because child `ON_TOOL_EXECUTE` dispatches reach our ToolService via
  the parent's registered handler.
- The same forwarding path wraps the child's run_step / run_step_delta
  / run_step_completed / message_delta / reasoning_delta dispatches in
  a new `ON_SUBAGENT_UPDATE` envelope, with start/stop/error bookends.

Backend: `callbacks.js` registers an `ON_SUBAGENT_UPDATE` handler that
forwards each envelope straight to the SSE stream.

Frontend:
- `useStepHandler` consumes `ON_SUBAGENT_UPDATE` events and merges them
  into a per-tool_call Recoil atom (`subagentProgressByToolCallId`).
  First-seen `subagentRunId` claims the most-recent unclaimed `subagent`
  tool call in the active response message — a temporal mapping, no SDK
  wire-format change needed to correlate child runs with parent tool
  calls.
- New `SubagentCall` part component replaces the default `ToolCall`
  rendering when `toolCall.name === Constants.SUBAGENT`: compact status
  ticker showing the 3 most recent update labels, clickable to open a
  dialog with the full activity log + final markdown-rendered result.
- Adds `Constants.SUBAGENT`, `StepEvents.ON_SUBAGENT_UPDATE`, and
  `SubagentUpdateEvent` type in data-provider.

Tests:
- `packages/api npx jest run-summarization` — 23 pass
- `api npx jest initialize` — 16 pass
- `npm run build` — clean

Dependency note: bumps `@librechat/agents` to `^3.1.67-dev.1` — requires
the SDK PR (danny-avila/agents#107) to be merged to dev and published
before this PR merges. `ON_SUBAGENT_UPDATE` is absent from dev.0, so the
handler registration would be a no-op with the older SDK but would not
crash.

* 🪆 fix: address Codex review and review audit on subagents

Stacks on top of the SDK change in danny-avila/agents#107 (bumped to
`^3.1.67-dev.2`).

- **P1 (`initialize.js`)**: subagent-only agents were being deleted from
  both `agentConfigs` AND `agentToolContexts`. The tool-execute handler
  resolves execution context (agent, tool_resources, skill ACLs) from
  `agentToolContexts`, so explicit subagents would run without their
  configured resources and skip action tools. Now only `agentConfigs`
  is pruned — tool context stays intact.
- **P2 (`AgentSubagents.tsx`)**: toggling subagents off set the form
  field to `undefined`; `removeNullishValues` stripped it from the
  PATCH, leaving the server copy enabled. Now it persists an explicit
  `{ enabled: false, ... }` so the update actually clears state.

- **Finding 1 (MAJOR)** — `agent_ids` Zod schema gains `.max()` via a
  new `MAX_SUBAGENTS` export from `data-provider` (shared with the UI
  cap). Crafted payloads can't trigger hundreds of `processAgent`
  calls.
- **Finding 2 (MAJOR)** — `subagentProgressByToolCallId` atomFamily
  atoms are now tracked in a ref and reset from `clearStepMaps` via a
  `useRecoilCallback({ reset })`. No monotonic growth across a session.
- **Finding 3 (MAJOR)** — early-arriving `ON_SUBAGENT_UPDATE` events
  whose parent `tool_call_id` is not yet mapped are now buffered in
  `pendingSubagentBuffer` (keyed by `subagentRunId`) and replayed in
  arrival order once correlation completes. Mirrors the existing
  `pendingDeltaBuffer` pattern.
- **Finding 4 (MAJOR)** — switched to deterministic correlation via
  the new `parentToolCallId` that SDK `3.1.67-dev.2` threads through
  from `ToolRunnableConfig.toolCall.id`. Temporal fallback now iterates
  oldest-unclaimed-first (forward), matching tool-call creation order,
  so concurrent spawns map correctly.
- **Finding 6 (MINOR)** — `agent_ids` are deduped on the backend via
  `new Set(...)` before the load loop. Duplicates no longer produce
  duplicate `SubagentConfig` entries visible to the LLM.
- **Finding 7 (MINOR)** — events array inside each Recoil atom is
  capped at 200 entries. Long-running subagents no longer replay O(n)
  spreads on every update; the dialog log still shows the cap window.
- **Finding 8 (MINOR)** — documented: subagents are loaded only for
  the primary agent this release (handoff children get self-spawn but
  not explicit sub-subagents). In-code comment added so the next
  maintainer doesn't wonder.
- **Finding 9 (NIT)** — removed `{!isSubmitting && null}` dead code
  and the misleading announce-polite comment in `SubagentCall`.

- New `validation.spec.ts` — 9 tests covering the cap on
  `agent_ids.length` at the subagent schema, agent-create, and
  agent-update layers.
- `run-summarization` — 23 pass, `initialize` — 16 pass, total backend
  package: 103 pass across touched areas.

Findings 5 (component tests) and 10 (micro-allocation) are tracked
but deferred; the former needs a Recoil-RenderHook harness that isn't
in this PR's scope, and the latter has negligible impact (one `Array.from`
per subagent run).

* 🧪 test: integration coverage for subagent correlation + backend loading

Addresses the follow-up audit on #12725 with real-code tests (no mock
handlers, only the existing setMessages/getMessages spies and the
standard mongodb-memory-server harness).

Six new tests under a dedicated `describe('subagent loading')`:
- loads a configured subagent, populates `subagentAgentConfigs`, keeps
  it out of `agentConfigs`
- **P1 regression guard**: drives the real `toolExecuteOptions.loadTools`
  closure with the subagent id and asserts `loadToolsForExecution` is
  called with `agent: <subagent>`, `tool_resources`, `actionsEnabled`.
  If anyone deletes `agentToolContexts` again, this fails.
- dedup: three copies of the same id load the agent once
- overlap: agent referenced both as handoff target and subagent stays in
  `agentConfigs`
- capability gate: admin disabling `subagents` suppresses loading even
  when the agent has a config
- per-agent disable: `subagents.enabled: false` skips loading entirely

Five new tests under `describe('on_subagent_update event')` using a
real `RecoilRoot` and a companion `useRecoilCallback` reader so writes
from the hook are observable:
- deterministic correlation via `parentToolCallId` (happy path with
  SDK dev.2+)
- fallback: oldest-unclaimed tool call wins for concurrent spawns
  without `parentToolCallId`
- early-arrival buffer: updates with no mapping get buffered and
  replayed once the tool call appears
- event cap: 205 updates collapse to 200 retained, oldest dropped
- `clearStepMaps` resets tracked atoms back to their null default

- F2 — added explicit `// TODO` marker for handoff-subagent-loading
  extension (matches the comment that referenced it).
- F3 — dropped the unnecessary `MAX_SUBAGENTS as MAX_SUBAGENTS_CAP`
  alias; just import the constant directly.
- Bumped `@librechat/agents` to `^3.1.67-dev.3` to pick up the SDK's
  paired test additions.

- `api/server/services/Endpoints/agents/initialize.spec.js` — 22 pass
  (6 new + 16 existing)
- `packages/api/src/agents/validation.spec.ts` +
  `run-summarization.test.ts` — 103 pass
- `client/src/hooks/SSE/__tests__/useStepHandler.spec.ts` — 48 pass
  (5 new + 43 existing)

* 🪆 fix: strip parent run summary + discovered tools from subagent inputs

Codex P1 on #12725: `buildSubagentConfigs` reused the shared
`buildAgentInput` factory for each explicit child, and that factory
always stamps the parent run's `initialSummary` (cross-run conversation
summary) and `discoveredTools` (tool names the parent's LLM searched
earlier) onto every `AgentInputs` it returns. When subagents were
enabled on a conversation that had already been summarized, every
child inherited that summary — silently defeating the isolated-context
contract and burning extra tokens on unrelated prior chat.

Fix in `run.ts`: after `buildAgentInput(child)`, explicitly clear
`childInputs.initialSummary` and `childInputs.discoveredTools` before
attaching to the `SubagentConfig`. The parent keeps both — that's how
the supervisor receives cross-turn context — but the child starts
fresh.

Paired with danny-avila/agents#107 (bumped to `^3.1.67-dev.4`), which
adds the equivalent strip inside `buildChildInputs` to cover the
self-spawn path where the SDK clones parent `_sourceInputs` directly
and LibreChat never sees the intermediate shape. Belt and suspenders.

Regression test (new):
- `does NOT leak the parent run initialSummary into an explicit child
  (Codex P1 regression)` — sets `initialSummary` on the run, enables
  subagents with an explicit child, asserts the parent still has the
  summary but `childConfig.agentInputs.initialSummary` is `undefined`.
  Same for `discoveredTools`. 24 pass.

* 🪆 fix: capability gate applies to handoff agents + parallel subagent test

### Codex P2 — handoff agents kept `subagents` after capability disabled
The endpoint-level `AgentCapabilities.subagents` gate only cleared
`subagents` on `primaryConfig`. Handoff agents loaded into
`agentConfigs` retained their persisted `subagents.enabled: true`,
and because `run.ts` calls `buildSubagentConfigs` for every agent
input, self-spawn would still fire on a handoff target even when the
admin had disabled the capability globally.

Fix in `initialize.js`: after the subagent loading block, when the
capability is off, iterate `agentConfigs.values()` and clear
`subagents` + `subagentAgentConfigs` on every loaded config.

Regression test: `clears subagents on handoff agents too when
capability is disabled (Codex P2 regression)` — seeds a handoff target
with its own `subagents.enabled: true`, disables the capability at
the endpoint, asserts both primary AND handoff have `subagents`
undefined in the client args. 23 init tests pass.

### Parallel subagent correlation — user-requested verification
Added `keeps parallel subagent streams independent when events
interleave` to `useStepHandler.spec.ts`. Two `subagent` tool calls
seeded side by side, 6 interleaved `ON_SUBAGENT_UPDATE` envelopes
dispatched (a-start, b-start, a-step, b-step, a-stop, b-step), each
carrying its own `parentToolCallId`. Asserts each `tool_call_id`'s
Recoil bucket accumulates only its own run's events, statuses reflect
each run independently (`call_a` → stop, `call_b` → run_step), no
cross-contamination. 49 step-handler tests pass.

* 🪆 fix: SubagentCall detects cancelled / errored states (Codex P2)

Codex P2 on #12725: the old `running` check only consulted
`initialProgress` and the subagent's phase. A user stop, dropped
stream, or backend crash before a terminal `stop`/`error` envelope
arrived would leave the ticker permanently stuck on "working…". Other
*Call components (ToolCall.tsx) already model this via
`!isSubmitting && !finished` → cancelled.

Mirror that pattern. Re-introduce `isSubmitting` on `SubagentCallProps`
(the prop was dropped earlier as 'unused' — that was a bug) and resolve
status as a tri-state:

- `finished`  — initialProgress >= 1, or subagent `stop`/`error`
- `cancelled` — `!isSubmitting && !finished`
- `running`   — neither

New locale keys `com_ui_subagent_cancelled` + `com_ui_subagent_errored`
swap in the right header text per state.

Tests: new `SubagentCall.test.tsx` covers all four states with a real
`RecoilRoot` and a `useRecoilCallback` seeder — no mocked store — 5/5
pass. Includes an explicit P2 regression test that simulates the
`isSubmitting=false, progress.status='run_step', initialProgress<1`
scenario and asserts the cancelled label renders.

* 🪆 feat: semantic ticker + aggregated content-part dialog for subagents

Two rounds of feedback on #12725:

### Ticker — user-readable lines, not raw event names

The old ticker showed \`on_run_step\`, \`on_message_delta\`, etc. — not
meaningful to users. Replaced with \`buildSubagentTickerLines\`, a pure
helper that walks the \`SubagentUpdateEvent\` stream and emits:

- message/reasoning deltas → a single live "Writing: <last 60 chars>"
  (or "Reasoning: …") line that updates in place as chunks arrive
- run_step with tool_calls → "Using calculator(expression=42*58)" for
  a single call, "Using tool: a, b" for parallel (args dropped when
  multiple so the line stays short)
- run_step_completed → "calculator → 42*58 = 2436" (output truncated
  to 48 chars; falls back to "Tool X complete" when output is empty)
- error → "Error: <message>"
- start / stop / run_step_delta → suppressed (too granular / lifecycle-only)

Args and output pass through \`summarizeArgs\` / \`summarizeOutput\`
which flatten JSON to \`key=value\` pairs and head-truncate long
strings so a 200-line tool output never bloats the ticker.

### Dialog — aggregated content parts via leaf renderers

\`aggregateSubagentContent\` folds the raw event stream into
\`TMessageContentParts[]\` — text/reasoning delta streaks collapse into
single \`TEXT\` / \`THINK\` parts, tool calls become \`TOOL_CALL\` parts,
and \`run_step\` boundaries correctly break text runs around tool
calls. The dialog iterates those parts through a \`SubagentDialogPart\`
renderer that delegates to the existing \`Text\`, \`Reasoning\`, and
\`ToolCall\` leaf components — the same sub-components \`<Part />\` uses
— wrapped in a minimal \`MessageContext\` so reasoning expand state and
cursor animation work.

Leaf components are used directly rather than importing \`<Part />\`
itself to avoid a module cycle (Part → Parts/index → SubagentCall →
Part) and to sidestep a hypothetical nested-subagent rendering.

### Tests

- \`subagentContent.test.ts\` — 19 pure-function tests covering the
  aggregator (text concat, reasoning concat, tool call lifecycle,
  interleaving, phase suppression, late-arriving completions) and the
  ticker builder (live preview truncation, args/output snippets,
  parallel-call handling, output truncation, i18n formatter override).
- \`SubagentCall.test.tsx\` — 9 component tests: 5 status-resolution
  (existing) + 2 ticker (semantic text, delta collapse) + 2 dialog
  (aggregated parts routed to leaf renderers, raw-output fallback).

### Locale keys

New: \`com_ui_subagent_ticker_writing\`, \`…_reasoning\`, \`…_error\`,
\`…_using\`, \`…_using_with_args\`, \`…_tool_complete\`,
\`…_tool_output\`. Preserves i18n at the display layer while the
helper stays pure.

* chore: drop unused com_ui_subagent_activity_log locale key

The dialog no longer renders an "Activity log" section — the new
content-parts renderer replaced it. Also tweaks the dialog description
copy to match.

* 🪆 fix: subagent dialog order, persistence, auto-scroll, width

Follow-up pass addressing the four issues observed in real runs
against a live subagent-using parent.

### Aggregator ordering (reasoning appearing after text it preceded)

Reproducible pattern: LLM emits reasoning → text → tool call in that
order, but the dialog rendered text BEFORE reasoning in the content
array. Root cause: `aggregateSubagentContent` maintained `currentText`
and `currentThink` buffers in parallel and only flushed them at a
`run_step` boundary in a fixed (text, think) order, losing the actual
arrival order.

Fix: when a text chunk arrives, close any open think buffer first
(pushes it into the content array right then); symmetric for think →
text. Two new regression tests cover the exact reasoning → text →
tool_call sequence from the screenshot and the repeated
reasoning ↔ text flow across a turn.

### Content persists after completion (markdown not rendering when done)

`clearStepMaps` was calling `resetSubagentAtoms()` at stream end,
which wiped every `subagentProgressByToolCallId` entry. Once reset,
`contentParts.length === 0` and the dialog fell back to rendering the
raw `output` string with plain text — hence the literal `##`/`**` in
the completed-state screenshot. Stopped resetting; the atoms are
bounded per-call (200-event cap) and per-conversation (one per
subagent spawn) so growth matches the rest of the conversation state.
`resetSubagentAtoms` is kept for a future conversation-switch caller.

Also: routed the raw-`output` fallback (older subagent runs recorded
before the event forwarder existed) through the same
`SubagentDialogPart` → `Text` leaf that content parts use, so its
markdown renders the same way.

### Auto-scroll to bottom while running

Added a `scrollRef` on the dialog body and a `useEffect` that pins
`scrollTop = scrollHeight` while the dialog is open AND the subagent
is running. Triggers on `contentParts.length` (new tool calls / part
boundaries) and `events.length` (intra-part deltas) so the cursor
tracks text streaming. Disabled post-completion so re-opening a
finished run doesn't yank to the bottom.

### Wider dialog

Went from `max-w-2xl` (42rem / 672px — too cramped on maximized
laptop windows) to `w-[min(95vw,64rem)] max-w-[min(95vw,64rem)]`.
Narrow on phones, scales up to 64rem on desktop, always leaves a bit
of margin from the viewport edge. Bumped `max-h-[65vh]` on the scroll
area to give the extra width room to breathe vertically too.

### Tests

- `subagentContent.test.ts` — 21 pass (2 new ordering regressions).
- `useStepHandler.spec.ts` — 49 pass (1 updated to assert atoms are
  *preserved* on clearStepMaps).
- `SubagentCall.test.tsx` — 9 pass (unchanged; aggregator-level tests
  cover the ordering).

* 🪆 feat: persist subagent_content via SDK createContentAggregator

Per-request map of createContentAggregator instances keyed by the
parent's tool_call_id. ON_SUBAGENT_UPDATE handler feeds each event
into the matching aggregator (phase → GraphEvent mapping); AgentClient
harvests contentParts onto the subagent tool_call at message save so
the child's reasoning / tool calls / final text survive a page refresh.

Reusing the SDK's battle-tested aggregator instead of a bespoke one
keeps the persisted shape identical to the parent graph's output and
drops ~100 lines of custom aggregation code.

* 🪆 fix: incremental subagent aggregation + dialog render parity

**Disappearing tool_calls**: the Recoil atom trimmed events to a 200-long
rolling window, so verbose subagents could shed the `run_step` that
originally created a tool_call part — rebuilding content from the trimmed
window then produced only the surviving text/reasoning. Fix: fold each
envelope into `contentParts` incrementally in the atom as it arrives
(new `foldSubagentEvent` + cursor state). Event trim window now affects
only the ticker, never the dialog.

**Render parity**: dialog now applies `groupSequentialToolCalls` and
renders single parts through `Container` + grouped batches through
`ToolCallGroup` — same spacing and "Used N tools" collapsing the main
message view uses.

**Width**: `min(96vw, 80rem)` — wider on big screens, still responsive.

**Labels**: "Subagent: X" is jargon. Named subagents render as
`Running "{name}" agent` / `Ran "{name}" agent` (past tense on
completion); self-spawns use `Running subtask` / `Ran subtask` since
`Running "self" agent` reads badly.

* 🪆 polish: subagent dialog parity + agent avatar in header

**Labels**: drop "subtask" framing. Self-spawn shows `Running agent` /
`Ran agent` (past tense on completion); named subagents stay
`Running "X" agent` / `Ran "X" agent`.

**Dialog render parity**: stop wrapping every part in `Container`.
TEXT keeps its `Container` (gap-3 + `mt-5` sibling margin), THINK and
TOOL_CALL render bare so their own wrappers set the full-column width
the regular message view gives them — matches the main `<Part>` dispatch.
Outer scroll region now uses `px-4 py-3` padding and a
`max-w-full flex-grow flex-col gap-0` inner wrapper, mirroring the
`MessageParts` container the main conversation uses.

**Avatar**: header icon now renders the subagent's configured avatar
via `MessageIcon` when `useAgentsMapContext()` has the child agent,
falling back to the `Users` SVG (which keeps its running-state pulse).
Same icon-left-of-label pattern the tool UI uses.

* 🪆 polish: subagent group label, ticker throttle + tail-ellipsis, scroll button

**Grouped label**: ToolCallGroup now detects all-subagent batches and
labels them "Running N agents" / "Ran N agents" instead of "Used N
tools". Mixed batches keep the existing label. The tool-name summary
is suppressed for all-subagent groups (every entry dedupes to
"subagent", which adds nothing).

**Ticker width + tail-ellipsis**: raise the preview cap to 300 chars so
wide containers aren't half-empty, and flip the ticker `<li>` to
`dir="rtl"` so `text-overflow: ellipsis` clips the *oldest* characters
(visually the left edge) — the newest tokens stay pinned to the right
regardless of container width. Bidi lays out the Latin text LTR
internally, the rtl only affects which side gets the ellipsis.

**Throttle**: `useThrottledValue` hook (trailing-edge, 1.2s) smooths
the live `Writing: …` preview so tokens no longer strobe past the
eye faster than they can be read. Ref-based internals (not `useState`)
avoid infinite-update loops when the upstream value is a new-reference
each render; `NEGATIVE_INFINITY` sentinel ensures the very first value
passes through synchronously so tests and first paint aren't delayed.

**Scroll-to-bottom**: dialog tracks `isAtBottom` with a 120px
threshold; auto-scroll only engages when the user is already following
along, and a persistent jump-to-latest button appears whenever they
scroll up — no more fighting the auto-scroll to read back.

* 🪆 polish: snappier ticker, prefix-safe labels, agents icon, readable lines

**Ticker lines are now incrementally aggregated in the atom** — same
pattern as contentParts. The raw-events rolling window is gone; event
volume no longer caps what the ticker can display. Verbose subagents
that used to drop early tool_call lines out of the window now keep the
full 3-line history (using_tool, tool_complete, writing).

**Discriminated-union ticker lines** split a constant prefix (e.g.
"Writing:") from a tail-truncatable body. The prefix lives in a
`shrink-0` span so it never gets clipped when the body overflows; the
body uses `dir="rtl"` only on itself — scoped so non-streaming lines
(e.g. "Waiting for first update…") can't get their trailing ellipsis
flipped by bidi.

**Content-aware throttle**: 800ms interval (down from 1200ms), skipped
entirely while the live buffer is below 120 chars. Early tokens now
appear immediately — no more "Reasoning: I" sitting blank for a full
second before the next heartbeat. Once the preview is long enough to
fill the container, throttling kicks in at the tighter interval.

**Header label** is now a constant verb + optional muted sub-label.
Base reads "Running agent" / "Ran agent" / "Cancelled agent" / "Agent
errored" for every subagent; named subagents get the configured agent
name rendered to the right in secondary text (self-spawns and
unresolved names omit it — "Running self agent" is nonsense).

**ToolCallGroup** now detects `allSubagents` and swaps `StackedToolIcons`
for a single `Users` glyph — otherwise the group header shows a wrench
("tool") icon next to "Ran 5 agents", which reads wrong.

* 🪆 feat: delimiter-aware tool labels in ticker + full-width tool lines

New shared `parseToolName` helper in `client/src/utils/toolLabels.ts`
— single source of truth for splitting `<tool>_mcp_<server>` ids and
mapping native tool names (web_search, execute_code, …) to their
friendly translation keys. `ToolCallGroup` drops its inline copy and
pulls from this helper.

Ticker tool lines now use the shared parser + a new `ToolIdentifier`
sub-renderer so the live log reads like the main tool UI:

  - MCP tool  → `<server> · <code-badge:tool>` (e.g. "github · `search_code`")
  - Native   → friendly name from `TOOL_FRIENDLY_NAME_KEYS`
  - Unknown  → bare `<code>` badge of the raw id

The `using_tool` / `tool_complete` rows now render with a
`flex w-full items-baseline gap-1 overflow-hidden` layout matching
the writing/reasoning rows — they take the full container width
instead of collapsing to content size. Output snippets on
`tool_complete` get the same tail-side `dir="rtl"` ellipsis so the
newest characters stay flush-right when the container is narrow.

Dropped the now-unused template i18n keys
(`com_ui_subagent_ticker_using_with_args`,
`com_ui_subagent_ticker_tool_complete`,
`com_ui_subagent_ticker_tool_output`) in favor of tokens the JSX
composes structurally. Only English is touched per the project rule;
other locales follow externally.

* 🪆 fix: dialog scroll button + auto-scroll during streaming deltas

Two race/trigger bugs in the dialog's scroll behavior:

**Button never showed**: `addEventListener('scroll', …)` in a
`useEffect` ran before Radix's portal had actually committed the
scroll container, so `scrollRef.current` was still null — the listener
never attached, `isAtBottom` stayed stuck at its initial `true`, and
the jump-to-latest button was never rendered. Swap to React's
`onScroll` prop on the element itself so the handler wires up as part
of DOM commit, not a post-commit effect.

**Auto-scroll stalled during text streaming**: the pin-to-bottom
effect only re-fired on `contentParts.length` changes. Message/reasoning
deltas extend the last TEXT/THINK part's `.text` without changing the
array length — so the view would drift up as tokens piled in and
never catch back up. Replace the length-dep effect with a
`ResizeObserver` on the inner content div; every height change (new
part or in-place growth) triggers a scroll-pin when the user is still
at the bottom.

* 🪆 fix: drop leading ellipsis from ticker body

truncatePreview was prepending ... to the tail when the buffer
exceeded 300 chars. The component's CSS already produces a left-side
ellipsis for overflow via dir=rtl + text-overflow: ellipsis — stacking
a data-level ellipsis on top renders a stray dot character right after
the Writing: / Reasoning: label (Writing: .Sure!), which looks like a
typo to the reader.

Data now returns just the last 300 chars when truncating; CSS handles
the visual cue whenever the body actually overflows its container.

* 🪆 fix: Codex review — subagent isolation + concurrent-safe throttle

Three findings from the @codex review pass, all valid:

**P1 — buildAgentInput leaks parent discovered-tool state into subagent
children.** `buildAgentInput` mutates `agent.toolRegistry`
(`overrideDeferLoadingForDiscoveredTools` flips `defer_loading:true→false`
on tools the parent previously searched for) and appends those tools'
definitions to the returned `toolDefinitions` before the function
returns. `buildSubagentConfigs` was clearing the reported
`initialSummary` / `discoveredTools` fields on the returned
AgentInputs, but that happened post-return — the registry writes and
extra tool definitions persisted on the child, silently defeating
context isolation and inflating the child's prompt.

Fix: `buildAgentInput` now takes an `isSubagent` flag that gates the
registry-mutation block and omits `initialSummary` /
`discoveredTools` at the source. `buildSubagentConfigs` passes
`{ isSubagent: true }` for every explicit child; no post-hoc cleanup
needed.

**P2 — ToolCallGroup labels a finished subagent group as still
running when the child returned no output.** `getToolMeta` computed
`hasOutput` as `!!tc.output`, which is `false` for a completed
subagent that returned empty text (the UI already has an "empty
result" fallback for that case). `allCompleted` would stay `false`
and the group header stuck on "Running N agents" forever.

Fix: treat `tc.progress === 1` as completion too — progress is the
authoritative lifecycle signal, output is just content.

**P2 — useThrottledValue schedules `setTimeout` during render.**
Discarded renders under Strict Mode / Concurrent rendering would
leave orphan timers firing against stale trees.

Fix: move `setTimeout` into a `useEffect` keyed on `[value, intervalMs,
enabled]`. Render-time still mutates refs (idempotent), but timer
scheduling lives post-commit. Cleanup on unmount and on passthrough
transitions is preserved.

* 🪆 fix: Codex P2 — wipe subagent atoms on conversation switch

`clearStepMaps()` intentionally doesn't reset `subagentProgressByToolCallId`
so a user can reopen a completed subagent's dialog mid-conversation,
but `resetSubagentAtoms` was defined and never exposed / called —
so each completed run's aggregated `contentParts` + `tickerState`
stayed resident in the `atomFamily` for the whole app session.
Unbounded growth across multi-conversation sessions.

Expose `resetSubagentAtoms` from `useStepHandler` and fire it from
`useEventHandlers` whenever the URL's `conversationId` changes.
That's the correct cleanup boundary: historical subagent dialogs
rehydrate from persisted `subagent_content` on each `tool_call` at
message-save time, so wiping live atoms on switch doesn't lose any
viewable history — it just releases per-tool-call state that the
old conversation's components no longer subscribe to.

* 🪆 fix: Codex round 3 — subagent registry isolation + post-run label

Two more valid findings.

**P1 — parent-order registry mutation leaks into subagent inputs.**
`overrideDeferLoadingForDiscoveredTools` mutates `agent.toolRegistry`
in place (the Map *and* the LCTool objects inside it). When an agent
appears both as a handoff target (normal graph node) AND an explicit
subagent child, a subagent build that ran before the parent's build
captures a reference to the same registry — the parent's later mutation
leaks through to the child.

Fix: for subagent children (`isSubagent`), clone the `toolRegistry`
Map and shallow-clone each LCTool inside before returning the inputs.
`defer_loading` flips on parent-graph registry mutations can't
propagate across the clone boundary. `toolDefinitions` also gets a
shallow-copy pass so the same isolation holds for definitions the
child carries directly.

**P2 — "Running N agents" label stuck after cancel/error.**
ToolCallGroup's all-subagent label was gated only on `allCompleted`,
which requires every child to have `hasOutput || progress === 1`.
A subagent that gets cancelled (stream ends, no `stop` phase, no
output) never satisfies that — so even after `isSubmitting` flips
false, the header stays on "Running N agents" while each individual
card correctly shows "Cancelled agent".

Fix: derive a `subagentsDone` flag as `allCompleted || !isSubmitting`
and gate the past-tense label on that. Matches the tri-state each
SubagentCall card already resolves (finished / cancelled / running).

* 🪆 fix: Codex P2 — ACL-check subagents.agent_ids on create/update

Codex flagged that `subagents.agent_ids` was accepted as arbitrary
strings on the create/update routes while `edges` got a
`validateEdgeAgentAccess` pass — so users could save subagent
references to agents they can't VIEW. At runtime `initializeClient`'s
`processAgent` ACL gate silently drops those, so the persisted
configuration and the actual behavior diverged in a way that is
difficult to diagnose.

Refactor: extract the id-set → unauthorized-ids check into a shared
`collectUnauthorizedAgentIds`, wrap it with a dedicated
`validateSubagentAccess`, and plumb the same 403-on-failure response
the edge path already returns. Applied on both POST /agents and
PATCH /agents/:id.

* 🪆 fix: Codex round 5 — ACL-disable escape hatch + ticker order

Two valid findings.

**P1 — can't disable subagents after losing access to a child.**
The subagent ACL check ran on every create/update that echoed back
the `agent_ids` list, even when the user was explicitly disabling
the feature. The UI keeps the list intact when toggling `enabled:
false`, so a user who subsequently lost VIEW on any child would be
locked in a 403 loop — every edit (including the one that turns
subagents off) bounces.

Fix: gate the ACL check on `subagents.enabled !== false` at both
the POST /agents and PATCH /agents/:id handlers. Empty list stays
a no-op. Disabling the feature is always permitted.

**P2 — ticker fold merges out-of-order previews across delta
switches.** `foldSubagentEventIntoTicker` carried `textLineIdx` /
`thinkLineIdx` across a reasoning → text → reasoning transition,
so the second reasoning chunk appended to the original reasoning
line instead of starting a new chronological one.

Fix: close the opposite buffer + cursor when a delta-type switch
is detected (same rule the content-parts reducer already applies).
Added a regression test.

* 🪆 fix: Codex round 6 — preserve mid-stream atoms + honor sequential suppression

Two valid findings.

**P2 — atom reset fires on initial chat URL assignment.**
`useEventHandlers` initialized `lastConversationIdRef` from the URL's
current `paramId`, then reset subagent atoms whenever the ref and
`paramId` disagreed. For a brand-new conversation the URL stamp goes
from `undefined → "abc123"` while the first response is still
streaming, which used to drop subagent ticker/content state mid-run
and leave dialogs missing earlier updates.

Fix: only reset when *both* the old and new IDs are non-null and
differ — i.e. a user-initiated switch between two established
conversations. The initial assignment passes through without
clearing.

**P2 — ON_SUBAGENT_UPDATE bypassed `hide_sequential_outputs`.**
Every other streaming handler in `callbacks.js`
(`ON_RUN_STEP`, `ON_MESSAGE_DELTA`, etc.) gates emission on
`checkIfLastAgent` + `metadata?.hide_sequential_outputs`, but the
subagent forwarder did an unconditional `emitEvent` — so intermediate
agents in a sequential chain were leaking their children's activity
to the client even when the chain was configured to suppress
intermediates.

Fix: accept `metadata` and apply the same `isLastAgent ||
!hide_sequential_outputs` gate. Aggregation still runs regardless of
visibility (persistence + dialog depend on it); only the SSE forward
is suppressed.

* 🪆 fix: Codex P2 — gate subagent ACL check on endpoint capability

`validateSubagentAccess` ran on every create/update where
`subagents.enabled !== false`, regardless of the endpoint-level
`subagents` capability. When the capability is off at the appConfig
level, `initializeClient` already strips the `subagents` block at
runtime — so persisted `agent_ids` are inert — but the validation
could still 403 on a legacy record whose referenced child is no
longer viewable, blocking unrelated edits.

Fix: add `isSubagentsCapabilityEnabled(req)` that reads the agents
endpoint's capabilities from `req.config` and gate both the create
and update ACL checks on it. Capability-off environments can update
agents with stale `subagents` data freely; capability-on keeps the
full ACL protection.

* 🪆 fix: Codex P2 — reset subagent atoms on id→null navigation too

Previous guard (both-established) skipped the reset whenever
`paramId` became null/undefined, so navigating from an existing
chat to a "new chat" route left stale subagent progress resident
in the `atomFamily` until the user picked a specific different
chat.

Swap the both-established check for a one-time flag: skip only the
very first `undefined → id` transition (the brand-new-chat URL
stamp that happens mid-stream), then reset on any subsequent
change — id→id, id→null, null→id-after-reset. If the user started
on an established chat the flag is true at mount, so the guard is
a no-op and every navigation resets normally.

* 🪆 fix: Codex round 9 — subagent persistence gate + handoff children

Two valid findings.

**P1 — hide_sequential_outputs also gates persistence.**
The previous fix gated the SSE forward on
`isLastAgent || !hide_sequential_outputs` but still ran the
per-tool-call `createContentAggregator` aggregation unconditionally.
`finalizeSubagentContent` would then attach the hidden intermediate
agent's child reasoning / tool output to the saved message, so a
page refresh could reveal activity that was intentionally suppressed
live. Move the visibility gate to the top of the handler — hidden
agents now skip both aggregation and emission, so
"hide_sequential_outputs" is a consistent "don't record" rule for
subagent traces.

**P2 — handoff agents' explicit subagents were silently dropped.**
`initializeClient` only resolved `subagentAgentConfigs` for the
primary config, so an agent used via handoff that had its own
`subagents.agent_ids` saved in the builder would get self-spawn
only; every explicit child was quietly ignored, creating a
saved-config / runtime mismatch the user couldn't diagnose.

Extract the resolution into a shared `loadSubagentsFor(config)`
helper and invoke it for the primary and every handoff agent in
`agentConfigs`. The `edgeAgentIds` precomputation stays outside
the helper (it's loop-invariant). Capability-off shortcuts return
empty early so the existing strip-on-capability-off path still
holds.

* 🪆 fix: Codex P2 — recursive subagent build for multi-level delegation

Previously only the outer `agents[]` loop attached `subagentConfigs`
to its inputs, so a child used as a subagent (invoked via the
`subagent` tool) lost every explicit spawn target of its own. A
user-valid configuration like A → B → C would only run the top
layer; B could never actually delegate to C from inside A's run.

Recursively build `subagentConfigs` for each child inside
`buildSubagentConfigs`, passing the child's freshly-constructed
`childInputs` down so its own `subagents.enabled` children get
resolved too. Added cycle protection via an `ancestors` Set — a
configuration like A → B → A is safely cut off at the second
encounter of A rather than recursing forever (the existing
`child.id === agent.id` guard already prevents the direct self-loop).

* 🪆 fix: Codex P2 — reset subagent atoms on useEventHandlers unmount

The effect that resets subagent atoms only fired on `paramId` change,
so unmounting the chat container (route change away from /c) never
flushed the atoms. `knownSubagentAtomKeys` lives in a ref inside
`useStepHandler` — once the hook unmounts the ref is gone, so a
subsequent remount can't clean atoms it never registered.

Added a second `useEffect` that only runs cleanup on unmount (empty
deps aside from the stable `resetSubagentAtoms` callback). Keeps
`atomFamily` bounded across full route teardowns too.

* 🪆 fix: Codex round 13 — cyclic subagent guard + prefer persisted

Two valid findings.

**P1 — cyclic subagent ref reloads the primary.** A configuration
like `A ↔ B` (B lists A as its own subagent) would send
`loadSubagentsFor` down a path that couldn't find A in
`agentConfigs` (the primary isn't stored there), so it called
`processAgent(A)` a second time. That inserts a fresh config for
the primary id, which downstream duplicates in
`[primary, ...agentConfigs.values()]` and can replace the primary's
tool context with the reloaded copy.

Fix: short-circuit when a subagent ref points back at
`primaryConfig.id` — reuse the already-loaded primary config.
Primary is always an edge id so no pruning bookkeeping needed.

**P2 — live atom preferred over canonical persisted trace.** The
dialog picked `progress.contentParts` ahead of
`persistedContent`, but the Recoil bucket is best-effort — after
a disconnect/reconnect it can be stale or partial. The server's
`subagent_content` on the `tool_call` is the canonical record
refreshed on sync. Preferring live could hide completed
tool/reasoning history that was actually persisted.

Fix: flip the preference order. Persisted wins when it's
non-empty; live covers the mid-stream window (before the parent
message saves, persisted is empty) and the older-runs fallback.
Updated the test that enforced the old order to lock the new
semantics in (separate mid-stream live-fallback assertion kept).

* 🪆 fix: Codex P2 — subagent atom reset rule simplified to 'leaving established id'

The `hasEstablishedConversationRef` + check for initial undefined→id
covered the first navigation but missed the equivalent mid-stream
URL stamp when a user goes from an existing chat to a new chat and
sends a message there (`id → null → newId`). The null → newId
transition was still hitting the reset branch and wiping the
in-flight subagent ticker/content for that first turn.

Simpler rule: only reset when the PREVIOUS paramId is an established
id. Every transition AWAY from an established chat clears (id→id2,
id→null, id→undefined); every transition FROM null/undefined passes
through (initial mount, new-chat URL stamp mid-stream). Drop the
`hasEstablishedConversationRef` machinery in favor of that single
condition.

* 🪆 fix: Codex P2 — match runtime's strict subagent enable check in ACL

Runtime (`initializeClient` + `run.ts`) treats `subagents?.enabled`
as a truthy predicate — `undefined`, `null`, missing, and `false`
all short-circuit. The ACL gate was using `!== false` which
accepted `undefined` / missing as "enabled" and could 403 a payload
whose subagent tool would be inert at runtime.

Swap both create and update to `enabled === true`. Only a strictly-
enabled payload triggers the ACL check; the disable path (`false`)
still passes through so a user who lost VIEW on a child can still
save the disable edit.

* 🪆 fix: Codex P2 — reject missing subagent references with 400

`validateSubagentAccess` collapsed through `collectUnauthorizedAgentIds`,
which returns an empty list for ids with no DB record — so typos and
references to deleted agents passed validation silently, and
`initializeClient` later dropped them at runtime. Saved config would
then list spawn targets that the backend never honored, a hard-to-
diagnose drift.

Refactor the helper into `classifyAgentReferences(ids, …)` which
returns `{ missing, unauthorized }` separately. `validateEdgeAgentAccess`
keeps its old semantics (missing is intentional — a self-referential
`from` names the agent being created). `validateSubagentReferences`
surfaces both buckets so the create/update handlers can 400 on
missing and 403 on unauthorized with distinct error messages and
`agent_ids` lists.

* 🪆 polish: tighten subagent dialog grid gap to gap-2

OGDialogContent's grid default is `gap-4`, which renders the title,
description, and scroll area as three visually separated panels.
Drop to `gap-2` so they read as one block.

* 🪆 polish: swap Subagents above Handoffs in Advanced panel

Subagents is the more common knob users reach for, so show it first.
Handoffs keep the same Controller wiring, just move below.
2026-04-25 04:02:01 -04:00
Danny Avila
35bf04b26c 🧰 refactor: Unify code-execution tools (#12767)
* 🛠️ feat: Add registerCodeExecutionTools helper

Idempotently registers `bash_tool` + `read_file` in the run's tool
registry and tool-definition list via a registry `.has()` dedupe. Sets
up the single code-execution tool path shared by:

- `initializeAgent` (when an agent has `execute_code` in its tools and
  the capability is enabled for the run)
- `injectSkillCatalog` (when skills are active; unconditional read_file,
  bash_tool follows `codeEnvAvailable`)

Both callers reach the helper in the same initialization sequence, so
the second call becomes a no-op and exactly one copy of each tool
reaches the LLM — no more double registration for agents that combine
`execute_code` capability with active skills.

Unit-tested on a fresh run, idempotence (second call, overlap with
prior tooldefs, partial overlap), and the no-registry variant.

* 🔀 refactor: Route injectSkillCatalog bash_tool + read_file through registerCodeExecutionTools

The `skill` tool is still registered inline (it's skill-path-specific),
but `bash_tool` + `read_file` now flow through the shared idempotent
helper so a prior registration from the execute_code path doesn't
produce a duplicate copy later in the same run. Behavior preserved:

- `read_file` always registers when any active skill is in scope —
  manually-primed `disable-model-invocation: true` skills still need it
  to load `references/*` from storage.
- `bash_tool` follows `codeEnvAvailable` exactly as before.

Adds a test pinning the cross-call dedupe: when `injectSkillCatalog`
runs AFTER `registerCodeExecutionTools` has already seeded the registry
+ tool definitions with bash_tool/read_file, the resulting
`toolDefinitions` still contains exactly one copy of each.

* 🪄 feat: Expand `execute_code` tool name into bash_tool + read_file at initialize-time

When an agent's `tools` include `execute_code` and the `execute_code`
capability is enabled for the run, `initializeAgent` now registers
`bash_tool` + `read_file` via `registerCodeExecutionTools` before
`injectSkillCatalog`. The legacy `execute_code` tool definition is no
longer handed to the LLM — `execute_code` remains on the agent
document as a capability-trigger marker, but the runtime expands it
into the skill-flavored tool pair.

Call ordering matters: the `execute_code` registration runs BEFORE
`injectSkillCatalog`, so the skill path's own `registerCodeExecutionTools`
call inside `injectSkillCatalog` becomes a no-op via the registry's
`.has()` check. Exactly one copy of each tool reaches the LLM whether
the agent has:

- only `execute_code` (legacy path)
- only skills
- both

No data migration needed — `agent.tools: ['execute_code']` stays in
the DB unchanged; the expansion is a runtime operation.

Three tests cover the matrix: execute_code + capability on →
bash_tool + read_file registered; execute_code + capability off →
neither registered; no execute_code + capability on → neither
registered.

* 🗑️ refactor: Drop CodeExecutionToolDefinition from the builtin registry

Removes the legacy `execute_code` entry from `agentToolDefinitions` and
the corresponding import. With the initialize-time expansion in place,
nothing consults `getToolDefinition('execute_code')` for a tool schema
any more — the capability gate still filters on the string
`execute_code`, but the actual tool definitions the LLM sees come from
`registerCodeExecutionTools` (i.e. `bash_tool` + `read_file`).

`loadToolDefinitions` in `packages/api/src/tools/definitions.ts`
silently drops `execute_code` when it no longer resolves in the
registry — that's the expected path and is now covered by an updated
test. No caller of `getToolDefinition('execute_code')` expects a
non-undefined result after this change.

* 🔌 refactor: Read CODE_API_KEY from env for primeCodeFiles + PTC

Finishes the Phase 4 server-env-keyed rollout on the two remaining
`loadAuthValues({ authFields: [EnvVar.CODE_API_KEY] })` sites in
`ToolService.js`:

- `primeCodeFiles` (user-attached file priming on execute_code agents)
- Programmatic Tool Calling (`createProgrammaticToolCallingTool`)

Both now read `process.env[EnvVar.CODE_API_KEY]` directly, matching
`bash_tool`'s pattern. The per-user plugin-auth path is no longer
consulted for code-env credentials anywhere in the hot path — the
agents library owns the actual tool-call execution and also reads the
env var internally.

Priming still fires for existing user-file workflows so the legacy
`toolContextMap[execute_code]` hint ("files available at /mnt/data/...")
stays in the prompt; only the key lookup changed.

* 🔧 fix: Type the pre-seeded dedupe-test tools as LCTool

CI TypeScript type checks caught `{ parameters: {} }` in the new
cross-call dedupe test: `LCTool.parameters` is a `JsonSchemaType`,
not `{}`. Use `{ type: 'object', properties: {} }` and type the
local registry Map through the parameter-derived shape so the
pre-seeded values match what `toolRegistry.set` expects.

* 🛡️ fix: Run execute_code expansion before GOOGLE_TOOL_CONFLICT gate

Codex review caught a latent regression: the original Phase 8 placement
ran `registerCodeExecutionTools` after `hasAgentTools` was computed,
so an execute-code-only agent on Google/Vertex with provider-specific
`options.tools` populated would no longer trip `GOOGLE_TOOL_CONFLICT`
— the legacy `CodeExecutionToolDefinition` used to populate
`toolDefinitions` before the guard, but after dropping it from the
registry, `toolDefinitions` stayed empty until my expansion ran
downstream of the guard. Mixed provider + agent tools would silently
flow through to the LLM.

Fix moves the `execute_code` expansion to BEFORE `hasAgentTools`
computation. `bash_tool` + `read_file` now contribute to the check
the same way the legacy `execute_code` def did. Covered by a new
test that pins the Google+execute_code+provider-tools scenario —
the `rejects.toThrow(/google_tool_conflict/)` path would have
silently passed on the prior placement.

* 🔗 fix: Thread codeEnvAvailable through handoff sub-agents

Round-2 codex review caught the other half of the execute_code
expansion gap: `discoverConnectedAgents` omitted `codeEnvAvailable`
from its forwarded `initializeAgent` params, so handoff sub-agents
with `agent.tools: ['execute_code']` lost the `bash_tool` + `read_file`
registration (pre-Phase 8 the legacy `CodeExecutionToolDefinition`
would have landed in their `toolDefinitions` via the registry).

- Add `codeEnvAvailable?` to `DiscoverConnectedAgentsParams` and
  forward it verbatim on every sub-agent `initializeAgent` call.
- Update the three JS call sites that construct the primary's
  `codeEnvAvailable` (`services/Endpoints/agents/initialize.js`,
  `controllers/agents/openai.js`, `controllers/agents/responses.js`)
  to pass the same flag into `discoverConnectedAgents` — one
  authoritative source per request.
- Two regression tests in `discovery.spec.ts` pin the true/false
  passthrough so a future refactor that drops the param-forwarding
  surfaces immediately.

Left intentionally unchanged: `packages/api/src/agents/openai/service.ts`
(public API helper with no in-repo caller). External consumers of
`createAgentChatCompletion` who want code execution should pass a
`codeEnvAvailable`-aware `initializeAgent` via `deps` — documenting
the full public-API surface is out of scope for this Phase 8 PR.

* 🔗 fix: Thread codeEnvAvailable through addedConvo + memory-agent paths

Round-3 codex review caught the last two production `initializeAgent`
callers missing the Phase-8 capability flag:

- `api/server/services/Endpoints/agents/addedConvo.js` (multi-convo
  parallel agent execution). Added `codeEnvAvailable` to
  `processAddedConvo`'s destructured params and forwarded it into
  the per-added-agent `initializeAgent` call. Caller in
  `api/server/services/Endpoints/agents/initialize.js` passes the
  same `codeEnvAvailable` it computed for the primary.
- `api/server/controllers/agents/client.js` (`useMemory` — memory
  extraction agent). Computes its own `codeEnvAvailable` from
  `appConfig?.endpoints?.[EModelEndpoint.agents]?.capabilities` and
  forwards into `initializeAgent`. Memory agents rarely list
  `execute_code`, but if one does, pre-Phase 8 they got the legacy
  `execute_code` tool registered unconditionally — the passthrough
  restores parity.

With this, every production caller of `initializeAgent` explicitly
resolves the capability: main chat flow (primary + handoff), OpenAI
chat completions (primary + handoff), Responses API (primary + handoff),
added convo parallel agents, and memory agents. The one remaining
caller, `packages/api/src/agents/openai/service.ts::createAgentChatCompletion`,
is a public API helper with no in-repo consumer (external callers
must pass a capability-aware `initializeAgent` via `deps`).

* 🪤 fix: Remove duplicate appConfig declaration causing TDZ ReferenceError

The Responses API controller had TWO `const appConfig = req.config;`
bindings inside `createResponse`: one at the top of the function
(added by the Phase 4 `bash_tool` decouple) and one inside the try
block (added by the polish PR #12760). Because `const` is block-scoped
with a temporal dead zone, the inner redeclaration put `appConfig` in
TDZ for the entire try block, so any earlier reference inside the
try — notably `appConfig?.endpoints?.[EModelEndpoint.agents]?.allowedProviders`
at line 348 — threw `ReferenceError: Cannot access 'appConfig'
before initialization`. The error was silently swallowed by the
outer try/catch, leaving `recordCollectedUsage` unreached and the
six `responses.unit.spec.js` token-usage tests failing.

Removing the inner redeclaration fixes the six failing tests
(verified: 11/11 pass locally post-fix, 0 regressions elsewhere).
The outer function-scoped binding already provides `appConfig` to
every downstream reference.

* 🔗 fix: Thread codeEnvAvailable through the OpenAI chat-completion public API

Round-4 codex review (legitimate on the type-safety angle, even though the
runtime concern was already covered): the `createAgentChatCompletion`
helper defines its own narrower `InitializeAgentParams` interface locally,
and the type was missing `codeEnvAvailable`. External consumers who
supply a capability-aware `deps.initializeAgent` couldn't route
`codeEnvAvailable` through without a type-cast workaround.

- Widen the local `InitializeAgentParams` interface to include
  `codeEnvAvailable?: boolean` (matches the real
  `packages/api/src/agents/initialize.ts` type).
- Derive `codeEnvAvailable` inside `createAgentChatCompletion` from
  `deps.appConfig?.endpoints?.agents?.capabilities` (the same source
  the in-repo controllers use) and forward to `deps.initializeAgent`.
  Uses a string literal `'execute_code'` lookup so this file stays free
  of a `librechat-data-provider` import — keeping the dependency surface
  of the public helper minimal.

With this, external consumers of `createAgentChatCompletion` who pass
`appConfig` with the agents capabilities get `bash_tool` + `read_file`
registration automatically; consumers who don't pass `appConfig` retain
the existing "explicit opt-in" semantics (the flag stays `undefined`,
expansion is skipped).

* 🧹 chore: Review-driven polish — observability log, JSDoc DRY, test gaps, no-op allocation

Addresses the comprehensive review of PR #12767:

- **Finding #1** (MINOR, observability): `initializeAgent` now emits a
  debug log when an agent lists `execute_code` in its tools but the
  runtime gate is off (`params.codeEnvAvailable` !== true). The
  event-driven `loadToolDefinitionsWrapper` path doesn't log
  capability-disabled warnings, so without this the tool silently
  vanishes from the LLM's definitions with zero trace. Operators
  debugging "why isn't code interpreter working?" now get a signal at
  the initialize layer.

- **Finding #5** (NIT, allocation): `registerCodeExecutionTools` now
  returns the input `toolDefinitions` array by reference on the no-op
  path (both tools already registered by a prior caller in the same
  run) instead of allocating a fresh spread array every time. The
  common dual-call scenario — `initializeAgent` then
  `injectSkillCatalog` — saves one O(n) copy per request.

- **Finding #4** (NIT, DRY): Collapsed the duplicated 6-line JSDoc
  comment in `openai.js`, `responses.js`, and `addedConvo.js` into
  either a one-line `@see DiscoverConnectedAgentsParams.codeEnvAvailable`
  pointer (the two JS call sites) or a compact 3-line block referring
  back to the canonical source (addedConvo's @param).

- **Finding #2** (MINOR, test gap): Added
  `api/server/services/Endpoints/agents/addedConvo.spec.js` with three
  cases covering `codeEnvAvailable=true`, `codeEnvAvailable=false`,
  and omitted (undefined) passthrough. A future refactor that drops
  the param from destructuring now surfaces here instead of silently
  regressing multi-convo parallel agents with `execute_code`.

- **Finding #3** (MINOR, test gap): Added
  `api/server/controllers/agents/__tests__/client.memory.spec.js`
  pinning the capability-flag derivation that `AgentClient::useMemory`
  uses — six cases covering present/absent/null/undefined config shapes
  plus an enum-literal pin (`'execute_code'` / `'agents'`). Catches
  enum renames or config-path shifts that would otherwise silently
  strip `bash_tool` + `read_file` from memory agents.

Finding #7 (jest.mock scoping, confidence 40) left as-is: the
reviewer's own risk assessment noted `buildToolSet` doesn't touch
the mocked exports, and restructuring a file-level `jest.mock` to
`jest.doMock` + dynamic `import()` introduces more complexity than
the speculative risk justifies. The existing mock is scoped to the
test file and contains the same stubs the adjacent
`skills.test.ts` already uses.

Finding #6 (PR description commit count) addressed out-of-band via
PR description update.

All existing tests pass, typecheck clean, lint clean across touched
files. New tests: 9 cases across 2 new spec files.

* 🧽 refactor: Replace hardcoded 'execute_code' string with AgentCapabilities enum in service.ts

Follow-up review (conf 55) caught that `openai/service.ts`'s Phase 8
`codeEnvAvailable` derivation used the literal `'execute_code'` while
every in-repo controller uses `AgentCapabilities.execute_code` from
`librechat-data-provider`. The file deliberately uses local type
interfaces to keep the public API helper's type surface small, but
that pattern was never a ban on single-value imports from the data
provider — `packages/api` already depends on it. Importing the enum
value means a future rename of `AgentCapabilities.execute_code`
propagates to this file automatically, matching the in-repo
controllers' behavior.

Other follow-up findings left as-is per the reviewer's own verdict:

- #2 (memory spec mirrors the production expression rather than
  calling `AgentClient::useMemory` directly): reviewer flagged as
  "not blocking" / "design-philosophy observation." The test file's
  JSDoc already explicitly documents the tradeoff and pins the enum
  literals to catch the most likely drift vector. Standing up
  `AgentClient` + all its mocks for a one-line regression guard is
  disproportionate.
- #3 (`addedConvo.spec.js` mock signature vs. underlying
  `loadAddedAgent` arity): reviewer's own confidence 25 noted the
  mock matches the wrapper's actual call pattern in the production
  file. Not a real gap.
- #4 was self-retracted as a false alarm.

* 🗑️ refactor: Fully deprecate CODE_API_KEY — remove all LibreChat-side references

The code-execution sandbox no longer authenticates via a per-run
`CODE_API_KEY` (frontend or backend). Auth moved server-side into the
agents library / sandbox service, so LibreChat drops every reference:

**Backend plumbing:**
- `api/server/services/Files/Code/crud.js`: `getCodeOutputDownloadStream`,
  `uploadCodeEnvFile`, `batchUploadCodeEnvFiles` no longer accept
  `apiKey` or send the `X-API-Key` header.
- `api/server/services/Files/Code/process.js`: `processCodeOutput`,
  `getSessionInfo`, `primeFiles` drop the `apiKey` param throughout.
- `api/server/services/ToolService.js`: stop reading
  `process.env[EnvVar.CODE_API_KEY]` for `primeCodeFiles` and PTC; the
  agents library handles auth internally. Remove the now-dead
  `loadAuthValues` + `EnvVar` imports. Drop the misleading
  "LIBRECHAT_CODE_API_KEY" hint from the bash_tool error log.
- `api/server/services/Files/process.js`: remove the `loadAuthValues`
  call around `uploadCodeEnvFile`.
- `api/server/routes/files/files.js`: code-env file download no longer
  fetches a per-user key.
- `api/server/controllers/tools.js`: `execute_code` is no longer a
  tool that needs verifyToolAuth with `[EnvVar.CODE_API_KEY]` — the
  endpoint always reports system-authenticated so the client skips
  the key-entry dialog. `processCodeOutput` called without `apiKey`.
- `api/server/controllers/agents/callbacks.js`: `processCodeOutput`
  invoked without the loadAuthValues round trip, for both LegacyHandler
  and Responses-API handlers.
- `api/app/clients/tools/util/handleTools.js`: `createCodeExecutionTool`
  called with just `user_id` + files.

**packages/api:**
- `packages/api/src/agents/skillFiles.ts`: `PrimeSkillFilesParams`,
  `PrimeInvokedSkillsDeps`, `primeSkillFiles`, `primeInvokedSkills` all
  drop the `apiKey` param; the gate is purely `codeEnvAvailable`.
- `packages/api/src/agents/handlers.ts`: `handleSkillToolCall` drops
  the `process.env[EnvVar.CODE_API_KEY]` read; skill-file priming is
  now gated solely on `codeEnvAvailable`. `ToolExecuteOptions`
  signatures drop apiKey from `batchUploadCodeEnvFiles` and
  `getSessionInfo`.
- `packages/api/src/agents/skillConfigurable.ts`: JSDoc no longer
  references the env var.
- `packages/api/src/tools/classification.ts`: PTC creation no longer
  gated on `loadAuthValues`; `buildToolClassification` drops the
  `loadAuthValues` dep entirely (no LibreChat-side callers need it for
  this path anymore).
- `packages/api/src/tools/definitions.ts`: `LoadToolDefinitionsDeps`
  drops the `loadAuthValues` field.

**Frontend:**
- Delete `client/src/hooks/Plugins/useAuthCodeTool.ts`,
  `useCodeApiKeyForm.ts`, and
  `client/src/components/SidePanel/Agents/Code/ApiKeyDialog.tsx` —
  the install/revoke dialogs for CODE_API_KEY are fully dead.
- `BadgeRowContext.tsx`: drop `codeApiKeyForm` from the context type and
  provider. `codeInterpreter` toggle treated as always authenticated
  (sandbox auth is server-side).
- `ToolsDropdown.tsx`, `ToolDialogs.tsx`, `CodeInterpreter.tsx`,
  `RunCode.tsx`, `SidePanel/Agents/Code/Action.tsx` +`Form.tsx`: all
  API-key dialog trigger refs, "Configure code interpreter" gear
  buttons, and auth-verification plumbing removed. The
  "Code Interpreter" toggle is now a plain `AgentCapabilities.execute_code`
  checkbox — no key-entry gate.
- `client/src/locales/en/translation.json`: drop the three
  `com_ui_librechat_code_api*` keys and `com_ui_add_code_interpreter_api_key`.
  Other locales are externally automated per CLAUDE.md.

**Config:**
- `.env.example`: remove the `# LIBRECHAT_CODE_API_KEY=your-key` section
  and its header.

**Tests:**
- `crud.spec.js`: assertions flipped to pin "no X-API-Key header" and
  "no apiKey param".
- `skillFiles.spec.ts`: removed env-var save/restore; tests now pin
  that the batch-upload path is gated solely on `codeEnvAvailable` and
  that no apiKey is threaded through.
- `handlers.spec.ts`: same — just the `codeEnvAvailable` gate pins
  remain.
- `classification.spec.ts`: remove the two tests that asserted
  `loadAuthValues` was (not) called for PTC.
- `definitions.spec.ts`: drop every `loadAuthValues: mockLoadAuthValues`
  entry from the deps shape.
- `process.spec.js`: strip the mock of `EnvVar.CODE_API_KEY`.

**Comment hygiene:**
- `tools.ts`, `initialize.ts`, `registry/definitions.ts`: shortened
  stale comment references to "legacy `execute_code` tool" without
  naming the retired env var.

Tests verified: 678 packages/api tests pass, 836 backend api tests
pass. Typecheck clean, lint clean. Only remaining CODE_API_KEY
mentions in the code are two regression-guard assertions:
- `crud.spec.js`: pins "no X-API-Key header" stays absent.
- `skillConfigurable.spec.ts`: pins `configurable` never grows a
  `codeApiKey` field.

* 🧹 chore: Remove the last two CODE_API_KEY name mentions in LibreChat

Follow-up to the prior full deprecation commit: two tests still named
the retired identifier in their regression-guard assertions.

- `packages/api/src/agents/skillConfigurable.spec.ts`: drop the
  "does not inject a codeApiKey key" test. The `codeApiKey` field is
  gone from the production configurable shape, so an absence-assertion
  naming it re-introduces the retired identifier in code.
- `api/server/services/Files/Code/crud.spec.js`: rename the
  "without an X-API-Key header" case back to "should request stream
  response from the correct URL" and drop the
  `expect(headers).not.toHaveProperty('X-API-Key')` assertion. The
  surrounding request-shape checks (URL, timeout, responseType) still
  pin the behavior; the explicit header-absence line was named-after
  the deprecated contract.

Result: `grep -rn "CODE_API_KEY\|codeApiKey\|LIBRECHAT_CODE_API_KEY"`
against the LibreChat source tree returns zero hits. The only
remaining `X-API-Key` strings in this repo are on unrelated OpenAPI
Action + MCP server auth configurations, where the string is
user-facing config, not a LibreChat-owned identifier.

Tests: 677 packages/api pass (2 pre-existing summarization e2e failures
unrelated); 126 api-workspace controller/service tests pass.
Typecheck and lint clean.

* 🎯 fix: Narrow codeEnvAvailable to per-agent (admin cap AND agent.tools)

Before this commit, `codeEnvAvailable` was computed in the three JS
controllers as the admin-level capability flag only
(`enabledCapabilities.has(AgentCapabilities.execute_code)`) and passed
through `initializeAgent` → `injectSkillCatalog` / `primeInvokedSkills` /
`enrichWithSkillConfigurable` unchanged. A skills-only agent whose
`tools` array didn't include `execute_code` still got `bash_tool`
registered (via `injectSkillCatalog`) and skill files re-primed to the
sandbox on every turn — wrong, because the agent never opted in to
code execution.

**Fix:** `initializeAgent` now computes the per-agent effective value
once as `params.codeEnvAvailable === true && agent.tools.includes(Tools.execute_code)`,
reuses the same boolean for:

1. The `execute_code` → `bash_tool + read_file` expansion gate
   (previously already consulted `agent.tools`; now shares the single
   `effectiveCodeEnvAvailable` binding).
2. The `injectSkillCatalog` call (previously got the raw admin flag).
3. The returned `InitializedAgent.codeEnvAvailable` field (new, typed as
   required boolean).

**Controllers (initialize.js, openai.js, responses.js):** store
`primaryConfig.codeEnvAvailable` in `agentToolContexts.set(primaryId, ...)`,
capture `config.codeEnvAvailable` in every handoff `onAgentInitialized`
callback, and read it from the per-agent ctx inside the
`toolExecuteOptions.loadTools` runtime closure. The hoisted
`const codeEnvAvailable = enabledCapabilities.has(...)` locals in the
two OpenAI-compat controllers are gone — they were shadowing the
narrowed per-agent value.

**primeInvokedSkills:** `handlePrimeInvokedSkills` in
`services/Endpoints/agents/initialize.js` now uses
`primaryConfig.codeEnvAvailable` (per-agent, narrowed) instead of the
raw admin flag. A skills-only primary agent won't re-prime historical
skill files to the sandbox even when the admin enabled the capability
globally.

**Efficiency:** one extra `&&` in `initializeAgent`. No runtime hot-path
cost — the `includes()` scan on `agent.tools` was already happening for
the `execute_code` expansion gate; it's now just bound to a local. Tool
execution closures read `ctx.codeEnvAvailable === true` (property
access + strict equality, O(1)).

**Ephemeral-agent note:** per-agent narrowing is authoritative for both
persisted and ephemeral flows. The ephemeral toggle
(`ephemeralAgent.execute_code`) is reconciled into `agent.tools`
upstream in `packages/api/src/agents/added.ts`, so
`agent.tools.includes('execute_code')` is the single source of truth
by the time `initializeAgent` runs.

**Tests:** two new regression tests pin the narrowing contract:

- `initialize.test.ts` — four-quadrant matrix on
  `InitializedAgent.codeEnvAvailable` (cap on × agent asks, cap on ×
  doesn't ask, cap off × asks, neither). Catches future refactors that
  drop either half of the AND.
- `skills.test.ts` — `injectSkillCatalog` with `codeEnvAvailable: false`
  against an active skill catalog must NOT register `bash_tool` even
  though it still registers `read_file` + `skill`. This is the state
  a skills-only agent gets post-narrowing.

All 191 affected packages/api tests pass + 836 backend api tests pass.
Typecheck clean, lint clean.

* 🧽 refactor: Comprehensive-review polish — hoist tool defs, pin verifyToolAuth contract, doc appConfig

Addresses the comprehensive review of Phase 8. Findings mapped:

**#1 (MINOR): `verifyToolAuth` unconditional auth for execute_code**
- Added doc comment explicitly stating the deployment contract
  (admin capability → reachable sandbox; no per-check health probe
  to keep UI-gate queries O(1)).
- New `api/server/controllers/__tests__/tools.verifyToolAuth.spec.js`
  with 4 regression tests pinning the contract:
  1. `authenticated: true` + `SYSTEM_DEFINED` for execute_code.
  2. 404 for unknown tool IDs.
  3. `loadAuthValues` is never consulted (catches a future revert
     that would resurface the per-user key-entry dialog).
  4. Response `message` is never `USER_PROVIDED`.

**#2 (MINOR): `openai/service.ts` undocumented `appConfig` dependency**
- Expanded the `ChatCompletionDependencies.appConfig` JSDoc to spell
  out that omitting it silently disables code execution for agents
  with `execute_code` in their tools. External consumers of
  `createAgentChatCompletion` now have the contract documented at
  the type boundary.

**#5 (NIT): `registerCodeExecutionTools` re-allocates tool defs**
- Hoisted `READ_FILE_DEF` and `BASH_TOOL_DEF` to module-level
  `Object.freeze`d constants. The shapes derive entirely from
  static `@librechat/agents` exports, so a single frozen object per
  tool is safe to share across every agent init. Eliminates the
  ~4-property allocations on every call (including the common
  second-call no-op path).

**#6 (NIT): Verbose history-priming comment in initialize.js**
- Trimmed the 16-line `handlePrimeInvokedSkills` block to a 5-line
  summary with `@see InitializedAgent.codeEnvAvailable` pointer.
  The canonical narrowing explanation lives on the type; the
  controller comment is just the ACL-vs-capability rationale.

**Skipped:**

- #3 (memory spec tests a mirror function): reviewer self-dismissed
  as a design tradeoff; the enum-literal pin already catches the
  highest-risk drift vector.
- #4 (cross-repo contract for `createCodeExecutionTool`): user will
  explicitly install the latest `@librechat/agents` dev version
  once the companion PR publishes, so the version pin will be
  authoritative.
- #7 (migration/deprecation note for self-hosters): out of scope
  per user direction — release notes handle this.

Tests verified: 679 packages/api + 840 backend api tests pass.
Typecheck + lint clean.

* 🔧 chore: Update @librechat/agents version to 3.1.68-dev.1 across package-lock and package.json files

This commit updates the version of the `@librechat/agents` package from `3.1.68-dev.0` to `3.1.68-dev.1` in the `package-lock.json` and relevant `package.json` files. This change ensures consistency across the project and incorporates any updates or fixes from the new version.
2026-04-25 04:02:01 -04:00
Danny Avila
ac913aa886 🔐 chore: Skills Permissions Housekeeping, Reachable Admin Dialog + Defaults Tests (#12766)
* 🔐 chore: Skills permissions housekeeping — reachable admin dialog + defaults tests

Phase 9 housekeeping pass. Skills was already gated on `PermissionTypes.SKILLS`
(seeded from `interface.skills`) and `AgentCapabilities.skills` everywhere it
matters, but two smaller parity gaps with Prompts/Memory/MCP remained:

- The skills admin settings dialog had no UI entry point. The only mount was
  inside an unused `FilterSkills` component, so admins had no way to reach the
  role-permissions editor for skills. Mounted it in `SkillsAccordion` gated on
  `SystemRoles.ADMIN`, matching the `PromptsAccordion` pattern.
- No regression lock on skill permission defaults. `roles.spec.ts` asserted
  structural completeness but not the specific shape — a future refactor
  could silently flip ADMIN's `USE/CREATE/SHARE/SHARE_PUBLIC` to false or
  drop SKILLS from USER defaults without failing. Added explicit Skills
  assertions for both roles.
- No lock on `AgentCapabilities.skills` being in `defaultAgentCapabilities`.
  Added an assertion in `endpoints.spec.ts`.

* 🩹 fix: Remove duplicate `const appConfig` in Responses createResponse

The Skills polish commit (#12760) added `const appConfig = req.config;` at
line 381 inside the try block of `createResponse`, without noticing that
the earlier drive-by fix (2463b6acd) already declared it at the function
top (line 283). The second `const` creates a new block-scoped binding
inside the try, so earlier references within the same block (e.g.
line 348's `appConfig?.endpoints?.[EModelEndpoint.agents]?.allowedProviders`)
now hit the TDZ instead of the outer binding and throw
`ReferenceError: Cannot access 'appConfig' before initialization` —
which the outer try/catch then swallows into a generic 500.

This surfaced as all six token-usage tests in
`api/server/controllers/agents/__tests__/responses.unit.spec.js` failing
with `mockRecordCollectedUsage` never being called (because the throw
skips past the `recordCollectedUsage(...)` call).

Dropping the inner re-declaration restores the full control flow. All 11
tests in the file pass again.

* 🧹 refactor: Address review nits on Phase 9 housekeeping

- Move the `defaultAgentCapabilities` regression test out of the
  `createEndpointsConfigService` describe block and into its own
  top-level describe. It tests a module constant and has no relationship
  to the service factory; nesting was misleading and made it easier to
  accidentally drop if the service tests are later restructured.
- Re-order local imports in `SkillsAccordion.tsx` longest-to-shortest
  per AGENTS.md convention (`SkillsSidePanel` 48 chars before
  `useAuthContext` 41 chars).
2026-04-25 04:02:01 -04:00
Danny Avila
7581540ab6 🔌 refactor: Decouple bash_tool from Per-User CODE_API_KEY (#12712)
* 🔌 refactor: Decouple bash_tool from Per-User CODE_API_KEY

Phase 4 of Agent Skills umbrella (#12625): gate bash_tool and skill
file priming on the `execute_code` capability only. Thread a boolean
`codeEnvAvailable` through `enrichWithSkillConfigurable` and
`primeInvokedSkills` in place of the old per-user `codeApiKey` +
`loadAuthValues` plumbing. The sandbox API key is the LibreChat-
hosted service key — system-level, not a user secret — so the
per-user lookup was legacy; when needed, it's read directly from
`process.env[EnvVar.CODE_API_KEY]` inside the capability gate.

`handleSkillToolCall` and `primeInvokedSkills` gate sandbox uploads
on `codeEnvAvailable` first, preventing skill-file uploads to the
sandbox when an agent has `execute_code` disabled even if the env
var happens to be set. The agents library resolves the env key
itself for `bash_tool`, so `ToolService.js` drops the
`loadAuthValues` lookup and the "Code execution is not available"
placeholder tool in favor of a plain `createBashExecutionTool({})`
with a loud error log if the env var is missing.

Also fixes a pre-existing `appConfig`-undefined lint error in
`responses.js`/`createResponse` that surfaced when this file was
touched (declares `const appConfig = req.config` at function top,
matching the existing pattern in other controllers).

Preserves the `skillPrimedIdsByName` threading added by Phase 3/5/6
and all Phase 3/5/6 call-site signatures. Adds
`skillConfigurable.spec.ts` (5 cases pinning the new surface) and
`skillFiles.spec.ts` (4-way matrix of capability × env key for
`primeInvokedSkills`).

* 🧪 refactor: Address Codex Review Feedback

Resolves findings from the second codex review on #12712:

- MAJOR: `handlers.spec.ts` now covers the `codeEnvAvailable` gate in
  `handleSkillToolCall` across three cases (gate off, gate on + env
  set, gate on + env unset). The gate is the critical regression
  prevention — a future edit that drops it would silently re-enable
  sandbox uploads for agents with `execute_code` disabled.

- MINOR: Hoist `codeEnvAvailable` and `skillPrimedIdsByName` out of
  `loadTools` closures in `openai.js` and `responses.js`. Both values
  are fixed once `initializeAgent` resolves, so recomputing them on
  every tool execution was wasted work. `responses.js` shares a single
  pair between its streaming and non-streaming branches.

- MINOR: `skillFiles.spec.ts` now has a test that exercises the full
  upload path end-to-end with real file records, asserting
  `batchUploadCodeEnvFiles` is called with the env-sourced apiKey and
  the correct file set (including the synthetic `SKILL.md`).

- NIT: Finish the `appConfig` extraction in `responses.js/createResponse`
  — replaces the remaining `req.config` references with `appConfig` for
  consistency with the pattern in other controllers.

No behavioral changes beyond what was already in place; this is
coverage and readability polish.

* 🧷 test: Tighten Spec Hygiene Per Codex Nit Feedback

Round-3 codex review flagged two NITs on the test code added in the
previous commit:

- Replace `_id: 'skill-id' as unknown as never` in the new
  `makeSkillHandlerWithFiles` helper with a real `Types.ObjectId`,
  matching the pattern used by the primed-skill tests further up in
  the same file (and by `skillFiles.spec.ts`). The `never` cast
  hides the fact that `_id` really is a string / ObjectId at runtime.

- Replace the ad-hoc `{ on, pipe, read }` stub with a real
  `Readable.from(Buffer.from(''))` in the upload-path test. The stub
  worked only because `batchUploadCodeEnvFiles` is mocked and never
  iterates the stream; `Readable.from` satisfies the same contract
  and is robust to any future partial-real replacement of the upload
  function.

Pure test-hygiene improvements; no runtime code touched.

* 🧹 chore: Remove Duplicate appConfig Declaration After Rebase

The upstream `2463b6acd` fix declared `const appConfig = req.config`
inside the try block (line 381) to patch the same `no-undef` error
I fixed in this PR at the top of `createResponse` (line 283). The
rebase kept both declarations side-by-side. Drops the inner one —
the outer binding covers every downstream reference already.
2026-04-25 04:02:01 -04:00