Commit graph

250 commits

Author SHA1 Message Date
Danny Avila
9f8b6d92c0
🤖 feat: Add Claude Sonnet 5 Support (#14042)
*  feat: Add Claude Sonnet 5 Support

Wire up the claude-sonnet-5 model across token, pricing, and model-list
config:

- Context window (1M) and max output (128K) in @librechat/api token maps
- Standard pricing ($3/$15 per MTok) and cache rates in data-schemas tx
- 128K output-token carve-out in anthropicSettings (the family-wide 64K
  rule capped Sonnet 5 below its real limit); Bedrock/Vertex thinking and
  1M-context detection already cover sonnet major >= 5 generically
- Add to shared Anthropic, Bedrock, and Vertex default model lists, plus
  the .env.example examples
- Tests for context/output/pricing/matching across the affected packages

*  test: Align Sonnet 5+ maxOutputTokens defaults with 128K spec

getLLMConfig defaults flow from anthropicSettings.maxOutputTokens.reset(),
which now returns 128K for Sonnet 5+. Update the future-proofing assertions
in llm.spec.ts (Sonnet 5.x and 6-9.x) that still expected the old family-wide
64K cap. Haiku stays 64K; Opus stays 128K.

* 🎚️ fix: Gate Sonnet 5 capability behaviors (sampling, thinking)

Adding claude-sonnet-5 to the default list exposed it without the Anthropic
capability gates, all confirmed against the live API:

- omitsSamplingParameters: Sonnet 5 returns 400 on non-default temperature/
  top_p/top_k ('deprecated for this model'); now dropped so selecting the
  model with saved sampling settings no longer fails.
- requiresExplicitThinkingDisabled: omitting 'thinking' runs adaptive ON by
  default on Sonnet 5, so disabling thinking now sends { type: 'disabled' }
  (verified: 200, no thinking block) instead of omitting the field.
- omitsThinkingByDefault: thinking.display defaults to omitted (empty thinking
  blocks); the display resolver now returns 'summarized' for Sonnet 5+ so the
  Thoughts UI keeps working (verified: 757-char summary returned).

Gates apply to both the direct Anthropic and Bedrock paths. Tests added in
bedrock.spec and llm.spec.

* 🩹 fix: Sonnet 5 Bedrock availability + thinking-off persistence

Round-2 Codex review (all verified against the live API / Anthropic docs):

- Sonnet 5 is NOT available on the legacy Bedrock InvokeModel/Converse surface
  (Anthropic docs: 'use Claude in Amazon Bedrock or Claude Platform on AWS'),
  which is what LibreChat's ChatBedrockConverse uses. Removed it from the
  default Bedrock model lists (config + .env.example). Opus 4.8/4.7/Fable 5
  stay — those ARE reachable via InvokeModel. Sonnet 5 remains on the direct
  Anthropic API and Vertex, where it works.
- Reverted the Bedrock-side explicit-disabled thinking handling added last
  round: with Sonnet 5 off Bedrock, no Bedrock model needs { type: 'disabled' },
  so that path (and its round-trip concern) no longer applies.
- Direct Anthropic path: a persisted { type: 'disabled' } thinking object now
  normalizes to a boolean flag in getLLMConfig, so a user's Sonnet 5
  'thinking off' setting stays off across the model_parameters round trip
  instead of flipping back to adaptive (a truthy object skipped the disabled
  branch).

* ↩️ fix: Restore Sonnet 5 on Bedrock (Converse) — verified live

Reverses the round-2 removal: Sonnet 5 IS available on AWS Bedrock. Tested
live via the Converse API:

- global.anthropic.claude-sonnet-5 returns a normal response
- bare anthropic.claude-sonnet-5 needs an inference profile — but that's
  identical to the already-shipping Opus 4.8 / Fable 5 / Sonnet 4.6 entries,
  which all fail bare on-demand the same way
- temperature=0.5 -> 400 'deprecated for this model'; thinking {type:disabled}
  suppresses reasoning — same as the direct API

The 'legacy' Bedrock docs page that claimed Sonnet 5 wasn't on the surface is
stale. Restored:
- anthropic.claude-sonnet-5 in bedrockModels + .env.example
- the Bedrock explicit-disabled thinking handling (requiresExplicitThinkingDisabled
  -> { type: 'disabled' })
- the Finding 4 round-trip fix in bedrockInputSchema (coerce a persisted
  disabled AMRF.thinking to thinking=false instead of !!thinking -> true), with
  an end-to-end schema->parser test proving 'thinking off' stays sticky.

Direct-path round-trip fix (getLLMConfig thinkingFlag) is unchanged.

* 💵 fix: Sonnet 5 intro pricing + sticky disabled thinking on Bedrock reload

Round-4 Codex review (both verified):

- Pricing: Anthropic lists Sonnet 5 at introductory $2/$10 per MTok (cache
  $2.50/$0.20) through 2026-08-31, reverting to $3/$15 ($3.75/$0.30) on
  Sep 1 (confirmed on platform.claude.com/pricing). The static tx multiplier
  table is used for real balance transactions, so the post-intro rates were
  overcharging ~50% during the launch window. Switched to the intro rates with
  a revert comment on both the token and cache entries.

- Bedrock disabled-thinking persistence: initializeBedrock feeds persisted
  model_parameters straight through bedrockInputParser (NOT bedrockInputSchema),
  where additionalModelRequestFields is a known key — so a prior
  thinking:{type:'disabled'} was ignored and rebuilt as adaptive on reload.
  bedrockInputParser now surfaces a persisted disabled AMRF.thinking as
  thinking=false so it re-emits {type:'disabled'}. Verified end-to-end against
  the real initializeBedrock call path.
2026-06-30 19:26:33 -04:00
Ravi Kumar L
a0529c9af7
🪭 feat: Add opt-in Langfuse fanout gateway + collector (#13872)
* feat: add opt-in Langfuse fanout collector

* feat: fan out Langfuse feedback scores

* docs: prepare Langfuse fanout for OSS setup

* fix: clarify Langfuse fanout collector config

* test: stabilize librechat suite

* test: fix upload dialog import order

* fix: omit empty Langfuse tenant fields

* fix: gate tenant Langfuse fanout

* test: cover central Langfuse env fallback

* style: format Langfuse fanout config

* feat: route langfuse fanout by destination

* docs: clarify langfuse compose destination scope

* test: remove unrelated suite stabilization

* style: sort agent imports

* fix: treat blank tenant fanout toggle as disabled

* fix: rename tenant fanout emergency toggle

* test: guard langfuse fanout collector config drift

* feat: tune langfuse fanout batching

* test: render fanout helm tests without dependencies

* fix: narrow remote agent run config

* refactor: share string normalization helper

* fix: align langfuse fanout env parsing

* fix(langfuse): align score fanout toggles with traces

* fix(langfuse): keep central fanout config collector-only

* fix(langfuse): type fanout collector config

* fix(langfuse): harden tenant fanout config

* feat(langfuse): support media fanout gateway

* fix(langfuse): route tenant fanout through destination URL

* fix(langfuse): harden fanout routing checks

* ci(langfuse): test fanout gateway changes

* ci(langfuse): check fanout go formatting

* fix(langfuse): satisfy api typecheck
2026-06-26 11:26:39 -04:00
Danny Avila
562bd8ec5f
🐛 fix: Prevent Infinite Render Loop on Code-Execution File Preview (#13922)
* 🐛 fix: Prevent Infinite Render Loop on Code-Execution File Preview

Loading a conversation that contains a large (>1MB) code-execution
office file crashed the whole app with React error #185 ("Maximum
update depth exceeded") on hard refresh.

Root cause (client-only): the terminal-write effect in
useAttachmentPreviewSync writes the resolved preview record back into
messageAttachmentsMap with a fresh object identity on every run, and
`attachment` is in the effect's dependency array. useAttachments
re-derives `attachment` ({...db, ...liveEntry}) with a new identity on
every map write, so once polling resolves (pending -> ready on a loaded
conversation) the effect ping-pongs forever:
setAttachmentsMap -> re-derive -> effect -> setAttachmentsMap.

Only files large/slow enough to defer extraction are persisted at
status: 'pending', which is why small documents never triggered it.

Fix: an idempotency gate that bails before setAttachmentsMap when the
merged attachment already carries the resolved status/text/textFormat/
previewError. The write happens once and then settles.

Tests:
- useAttachmentPreviewSync.loop.spec.tsx wires the real
  useAttachments -> hook feedback to reproduce the loop (verified to
  throw #185 without the gate, settle with it).
- e2e/specs/mock/attachment-preview-loop.spec.ts loads a conversation
  with a pending code-exec attachment whose preview resolves ready and
  asserts the app does not crash.

Closes #13916

* 🔧 feat: Make Office Preview Extraction Cap Configurable (default 2MB)

The inline code-execution preview extraction ceiling was a hardcoded 1MB
constant (MAX_TEXT_EXTRACT_BYTES). Office/text artifacts over that skip
the inline preview and resolve to "Preview unavailable" (download-only).

Make it configurable via FILE_PREVIEW_MAX_EXTRACT_BYTES and raise the
default to 2MB so larger documents get an inline preview out of the box.
The rendered HTML remains independently capped at MAX_TEXT_CACHE_BYTES
(512KB), so image-heavy files over that still fall back to the existing
"preview too large" banner rather than rendering unbounded output.

- resolveMaxTextExtractBytes(env) parses the override, falling back to
  2MB on missing/non-numeric/non-positive values (warns on invalid).
- Documented in .env.example next to the other file-size limits.
- Unit tests cover default, valid override, fractional flooring, and
  invalid fallback.

* 🐛 fix: Guard sub-byte preview cap from flooring to zero

A fractional FILE_PREVIEW_MAX_EXTRACT_BYTES in (0, 1) passed the
positive-number check then floored to 0, making MAX_TEXT_EXTRACT_BYTES
zero and treating every non-empty artifact as oversized. Floor first,
then require the result to be >= 1 byte before accepting it; otherwise
fall back to the 2 MB default. Adds coverage for the sub-byte case.

*  test: Make exported-ceiling assertion env-independent

The "exported ceiling" assertion compared MAX_TEXT_EXTRACT_BYTES to a
literal 2 MB, but that const is initialized from
FILE_PREVIEW_MAX_EXTRACT_BYTES at module load — so the suite would
falsely fail when run with the override set. Assert the export tracks
resolveMaxTextExtractBytes(env) for the current environment instead; the
undefined-case test continues to pin the 2 MB default.
2026-06-23 16:34:43 -04:00
Danny Avila
cb2bafb457
🧂 chore: Require an Operator-Supplied Admin Panel Session Secret (#13902)
* fix: require admin panel session secret

* 🩹 fix: Plain-Expand Admin SESSION_SECRET So Compose Maintenance Commands Run

The `${VAR:?}` required form fails interpolation for every deploy-compose
subcommand (down/pull/config), breaking `npm run update:deployed` for installs
whose .env predates ADMIN_PANEL_SESSION_SECRET. Plain expansion keeps those
commands working; the admin-panel image fail-fasts on an empty secret, so the
panel still refuses to start without it.
2026-06-23 08:43:54 -04:00
Danny Avila
77983dbd42
🐳 feat: Bundle Admin Panel in Docker Compose Stacks (#13876)
* 🐳 feat: Bundle ClickHouse Admin Panel in Docker Compose Stacks

* chore: route admin-panel image through Scarf gateway

* chore: route admin-panel image through Scarf gateway (deploy-compose)

* 📝 docs: Add Admin Panel to README Features

* 🏷️ chore: Rename Admin Panel Container to admin-panel

* 🔐 fix: Fall Back Admin Panel SESSION_SECRET to CREDS_KEY

* 📝 docs: Reference Open-Source code-interpreter Repo in README
2026-06-22 16:59:08 -04:00
Danny Avila
e515063ffe
🔗 feat: Snapshot Files for Shared-Link Attachments (#13740)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* 🔗 feat: Snapshot Files for Shared-Link Attachments

Shared-link viewers could read a shared conversation snapshot but not its
attachments: file preview/download still went through the owner-scoped file
ACL (the /api/files router sits behind requireJwtAuth + owner/agent checks),
so anonymous viewers got 401s and authenticated non-owners got 403s — the
repeated `[fileAccess] denied` warnings seen for the preview poller.

Capture an immutable per-share file snapshot (embedded on the SharedLink
document, referencing the original stored object — no byte copy) at share
create/update, and serve those files through new share-scoped routes
authorized by the existing shared-link view permission (public/ACL) plus
snapshot membership, never the owner's live file ACL.

- data-schemas: fileSnapshots on the share doc; capture in create/update;
  read-time rewrite of filepath/preview to /api/share/:id/files/:fileId;
  getSharedLinkFile + lazy backfillSharedLinkFiles for legacy links
- api: GET /api/share/:shareId/files/:file_id[/download|/preview]; route
  context added to fileAccess denial logs
- packages/api: isFileSnapshotEnabled resolver (env + yaml)
- data-provider: interface.sharedLinks.snapshotFiles (default on) + client
  endpoints/services
- client: ShareContext.shareId wired to Image, preview hook, and downloads
- config: SHARED_LINKS_SNAPSHOT_FILES env override (default on)

* 🔒 fix: Address Codex review on shared-link file snapshots

Triage of the Codex review on PR #13740 (2 P1, 7 P2 — all valid):

- P1 (cross-user access): scope the snapshot lookup to the sharing user's own
  files so a message referencing another user's file_id can't widen access.
- P1 (stored XSS): the inline share-file route now serves only safe preview
  types inline (raster images/pdf); everything else is forced to attachment with
  X-Content-Type-Options: nosniff.
- Stream shared downloads by default; redirect to a signed URL only on
  ?direct=true (blob/XHR callers work without bucket CORS).
- Read preview status live from the file record (always current for deferred
  previews) and stop embedding extracted text in the share doc (16MB-limit risk).
- Only lazily backfill when the fileSnapshots field is absent (legacy), not on
  every snapshot miss.
- Backfill legacy shares before rewriting message URLs, and gate URL rewriting
  to public shares so non-public (ACL) shares keep prior behavior (img/anchor
  can't carry the bearer token).
- Frontend: only route a download through the share path when the file was
  actually snapshotted (rewritten href / filepath), else fall back.

* 🔑 feat: Authorize shared-link files for non-public shares via cookie

Extends shared-link file access to non-public (ACL) shares (Codex finding 5).
`<img>`/anchor requests can't carry the bearer access token, so non-public
shares previously 401'd on file loads. Add an optional cookie-auth fallback on
the share file routes that resolves the viewer from the `refreshToken` cookie
(or signed `openid_user_id` cookie) — the same mechanism secure image links use
(validateImageRequest) — then let canAccessSharedLink run the viewer's ACL check.

- new middleware optionalShareFileAuth (+ unit spec); applied to the three
  share file routes after optionalJwtAuth
- URL rewriting in getSharedMessages is no longer gated to public shares (the
  route now authorizes header-less requests), so files work uniformly across
  public and non-public shares; revert the now-unused req.sharePublic plumbing

* 🔒 fix: Second Codex pass on shared-link file snapshots

Addresses the follow-up Codex findings on PR #13740:

- Don't snapshot transient text-source files: FileSources.text filepaths are
  Multer temp paths the upload route deletes, so they can't be streamed —
  removed from the streamable allowlist.
- Unset stale snapshots on a disabled-feature update: updateSharedLink now
  $unsets fileSnapshots when snapshotFiles is false, so an opted-out update
  can't keep serving file ids the update dropped.
- Load tenant config after share resolution: configMiddleware now runs after
  canAccessSharedLink (which enters the share's tenant ALS context), so
  per-tenant interface.sharedLinks.snapshotFiles overrides apply to anonymous
  public views.
- Return a clean 404 when the snapshotted object is gone: resolveShareFile now
  requires the live file record and 404s if it's been deleted/expired, instead
  of letting the stream error after headers are sent (ENOENT / 500).

(The re-flagged P1 about private-viewer rewriting was already fixed in the prior
commit's cookie-auth change.)

* 🔒 fix: Third Codex pass on shared-link file snapshots

Addresses the third Codex review pass on PR #13740:

- P1: keep shared previews/files pinned to the snapshotted version. Snapshot the
  small previewRevision; resolveShareFile 404s when the live file's revision no
  longer matches (file_id reused/overwritten by a later turn), so old links can't
  surface post-share content — covers both preview text and streamed bytes.
- Honor the toggle as a kill switch: resolveShareFile 404s when snapshotFiles is
  disabled, instead of only skipping backfill, so disabling stops serving
  already-snapshotted file URLs.
- Lazy-sweep orphaned 'pending' previews to 'failed' in the share preview route
  (mirrors the owner route) so the client poller reaches a terminal state.
- Resolve the cookie-fallback user in runAsSystem so strict tenant isolation
  doesn't throw before canAccessSharedLink establishes the share tenant context.

*  feat: Per-link "share files" checkbox for shared links

Add a checkbox to the share-link dialog (checked by default) letting the user
choose whether to include the conversation's files in the shared link, with
copy explaining images/files won't be visible to viewers otherwise. Opting out
skips snapshot creation/serving for that link.

- client: ShareButton renders the checkbox gated on the new
  startupConfig.sharedLinksSnapshotFilesEnabled flag; state threads through
  SharedLinkButton into the create/update mutations as `snapshotFiles`.
- data-provider: createSharedLink/updateSharedLink send `snapshotFiles` in the
  body; TStartupConfig gains `sharedLinksSnapshotFilesEnabled`.
- api: POST/PATCH /api/share compute snapshotFiles as
  isFileSnapshotEnabled(req.config) && body.snapshotFiles !== false (admin gate
  AND per-link opt-out); config.js exposes the effective enabled flag to clients.
- en locale: com_ui_share_files (+ _description).

* 🐛 fix: Make the "share files" opt-out actually hide files

Unchecking "share files" at creation didn't hide anything: the shared message
JSON still carried each file's original (e.g. static-served) path, and because
opting out only meant "no fileSnapshots field" — indistinguishable from a legacy
link — getSharedMessages would backfill snapshots on first view whenever the
admin feature was on, re-enabling files entirely.

Fix by persisting and honoring the per-link choice:
- Store `snapshotFiles` (boolean) on the SharedLink so opt-out is distinct from a
  legacy link; set it on create and update.
- getSharedMessages computes includeFiles = adminEnabled && link not opted out;
  when excluded it strips files/attachments from the payload (no original-path
  leak) and never backfills the opted-out link.
- Surface the stored choice via getSharedLink so the dialog checkbox reflects an
  existing link's actual setting instead of always defaulting to checked.

Note: changing the checkbox on an already-created link still applies only when
the link is refreshed (which regenerates the URL) — a UX follow-up.

* 🔒 fix: Close remaining shared-link file opt-out leaks (Codex)

Follow-up to the per-link opt-out, addressing the third Codex pass:

- Honor the opt-out on the file route too: getSharedLinkFile now returns the
  link's `optedOut` choice; resolveShareFile 404s (and never backfills) an
  opted-out link, so a direct /files/:id request can't re-create snapshots.
- Make read/serve viewer-independent: the gate no longer uses the viewer's
  resolved config (isFileSnapshotEnabled(req.config)) — it uses the link's stored
  choice plus a global env-only kill switch (isFileSnapshotKillSwitchActive). A
  viewer's own interface.sharedLinks.snapshotFiles can no longer hide a link's
  files. Create/update still use the creator's config to set the per-link choice.
- Neutralize render URLs for non-snapshotted files: applyShareFileRoute now
  strips filepath/preview for any file/attachment not in the snapshot, so the
  owner's original (e.g. static) path can't be loaded through the share.

* 🔒 fix: Harden shared-file version pinning and local path handling (Codex)

- Refuse reused/overwritten file snapshots more broadly: resolveShareFile now
  refuses to serve when either previewRevision OR `bytes` changed vs the
  snapshot. `bytes` catches non-office reused outputs (e.g. code-exec
  same-filename images that lack previewRevision) and is stable across S3 URL
  refresh and the pending->ready transition. Same-size content swaps remain a
  best-effort gap inherent to the no-byte-copy design.
- Strip cache-busting query strings before local streaming: code-output images
  add `?v=...` to filepath; the share route now splits it off so getLocalFileStream
  resolves the real filename instead of a literal `*.png?v=...` path.

* 💬 fix: Clarify that file-sharing changes apply on link refresh

For an already-created shared link, changing the "share files" checkbox only
takes effect when the link is refreshed (which regenerates the snapshot). Add a
note under the checkbox, shown only when a link already exists, so the behavior
isn't surprising: "Refresh the link to apply this change — files are snapshotted
when the link is refreshed."
2026-06-20 23:05:13 -04:00
Ravi Kumar L
bc5a3f502f
📡 refactor: Gate Noisy Redis OTEL Instrumentation (#13764)
* fix(telemetry): gate noisy database instrumentation

* test(telemetry): assert opt-in instrumentation baseline

* style(telemetry): sort imports

* fix(telemetry): preserve mongoose internal tracing by default

* fix(telemetry): limit database tracing flag to redis
2026-06-15 12:48:20 -04:00
Danny Avila
98704f28c1
🌐 fix: Centralize Outbound Proxy Handling (#13726)
* fix: centralize outbound proxy handling

* chore: sort proxy imports

* test: update proxy helper mocks

* fix: honor proxy bypasses consistently

* fix: support http axios proxy targets
2026-06-14 10:47:49 -04:00
Danny Avila
731a7c57c1
🥇 fix: Send First OpenID Audience on Authorization Requests (#13694)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
2026-06-11 13:21:54 -04:00
Ravi Kumar L
346ebea2d9
⚙️ refactor: brotli asset serving behind a feature toggle (#13641) 2026-06-10 08:49:50 -04:00
Danny Avila
a7f16911b2
fix: Extend and Decouple MCP OAuth Flow Timeouts (#13622)
*  fix: Extend and decouple MCP OAuth flow timeouts

The OAuth auth button disappeared after 2 minutes (the internal OAuth
handling timeout) while the flow state lived for 3 minutes, leaving users
who didn't click immediately stuck in an unrecoverable re-auth loop. The
handling timeouts also reused the connection/init timeout, so a short
initTimeout would shrink the OAuth window further.

- Add MCP_OAUTH_HANDLING_TIMEOUT (10m) and MCP_OAUTH_FLOW_TTL (15m) to mcpConfig
- Decouple the reactive/proactive OAuth waits from initTimeout/connectionTimeout
- Use OAUTH_FLOW_TTL for the FlowStateManager TTL and the UI status window
- Ensure the flow TTL outlives the handling timeout, fixing the
  "Flow state not found" race
- Remove dead FLOW_TTL constant and document new env vars

Fixes #13615

*  fix: Coordinate OAuth pending window with handling timeout

Address Codex review: the extended OAuth wait was still capped by other
timeouts that were not updated.

- Align PENDING_STALE_MS (button validity + pending-flow reuse window)
  with MCP_OAUTH_HANDLING_TIMEOUT so a flow stays reusable for the full
  wait instead of 2 minutes (Finding 3)
- Clamp MCP_OAUTH_FLOW_TTL to never fall below the handling timeout so a
  callback near the deadline still finds its flow state (Finding 2)
- Floor attemptToConnect's timeout to the handling window for OAuth
  servers so the reactive in-connect OAuth wait is not killed by the
  30s connection timeout (Finding 1)
- Update flow staleness tests to reference the threshold symbolically

*  fix: Align OAuth window across status, action flows, and client polling

Address Codex round 2: extending the server wait exposed three more
windows that were still capped or now over-extended.

- checkOAuthFlowStatus reports a PENDING flow as active only within the
  usable PENDING_STALE_MS window, not the longer Keyv retention TTL, so
  the connect button reappears instead of a stuck 'connecting' state
- Give Action (custom tool) OAuth its own FlowStateManager on the prior
  3-minute TTL so the longer MCP OAuth TTL can't leave an action tool
  call waiting up to 15 minutes
- Extend the MCP server-card client polling to the 10-minute handling
  window so a user who completes OAuth after 3 minutes is still picked up

* 🧪 test: Make stale-flow CSRF test track PENDING_STALE_MS

The CSRF-fallback stale-flow test hardcoded a 3-minute age, which is now
within the 10-minute PENDING_STALE_MS window and was wrongly treated as
active. Derive the age from PENDING_STALE_MS so it tracks the constant.

*  fix: Add grace buffers and surface OAuth timeout to the client

Address Codex round 3 (near-deadline edges):

- Clamp MCP_OAUTH_FLOW_TTL to handling timeout + 60s grace (not equality),
  so flow state outlives the wait instead of expiring at the same instant
- Extend attemptToConnect's OAuth floor by a 60s grace so a user who
  authorizes near the deadline still gets the post-OAuth reconnect
- Surface OAUTH_HANDLING_TIMEOUT on the connection-status response and
  have the client poll for the configured window instead of a hardcoded
  10 minutes, so a tuned server deadline isn't capped on the client

*  fix: Refresh client OAuth timeout from the first status refetch

If the connection-status cache is empty when polling starts, the client
captured the 10-minute fallback and never picked up a tuned oauthTimeout.
Re-read it after each refetch so a longer configured deadline is honored
even on a cold cache.

* 📝 refactor: Type oauthTimeout on MCPConnectionStatusResponse

Declare the oauthTimeout field on the shared response type in
data-provider instead of an ad-hoc inline cast in the client hook, and
replace the pre-existing 'as any' on the status query read with the
typed getQueryData. Type-level only; no runtime change.
2026-06-09 17:50:02 -04:00
Danny Avila
2aea5f4a3a
📖 feat: Add Claude Fable 5 Support (#13628)
* 📖 feat: Add Claude Fable 5 Support

Claude Fable 5 (`claude-fable-5`) is Anthropic's most capable widely
released model (GA 2026-06-09). Its naming drops the opus/sonnet/haiku
tier, so LibreChat's name-parsing helpers miss it; this teaches them the
Mythos-class family (Fable / Mythos) and registers the model.

- Add `parseMythosClassVersion` and route Fable/Mythos through
  `supportsAdaptiveThinking`, `omitsThinkingByDefault`,
  `omitsSamplingParameters`, and `supportsContext1m`
- Extend the Bedrock detection regexes (beta headers + adaptive-thinking
  branch) and `checkPromptCacheSupport` to match `claude-(fable|mythos)`
- Return 128K max output for Fable/Mythos in `maxOutputTokens.reset`/`set`
- Register `claude-fable-5` in shared Anthropic + Bedrock model lists,
  1M context / 128K output token maps, and $10/$50 pricing with 12.5/1
  cache rates (`claude-mythos-5` added to token + pricing maps only,
  since it is limited-availability)
- Update `.env.example` and the Vertex `librechat.example.yaml` examples
- Add parallel tests across tokens, Anthropic llm config, the Bedrock
  parser, and tx pricing

* 🧹 refactor: Centralize Mythos-class detection; address review feedback

- Add `isMythosClassModel` + `MYTHOS_CLASS_FAMILIES` in schemas.ts as the
  single source of truth for the Fable/Mythos family; route every gate
  (adaptive thinking, omit-thinking, omit-sampling, 1M context, prompt cache,
  128K max-output reset/set) through it. A future sibling class is now a
  one-line edit.
- [Codex P2] Exclude Mythos-class from getBedrockAnthropicBetaHeaders: Fable/
  Mythos ship 128K output + fine-grained tool streaming by default, and the
  legacy output-128k-2025-02-19 beta is 3.7-Sonnet-only on Bedrock and risks
  request rejection. They still get adaptive thinking + effort.
- [Copilot] Add Mythos 5 test parity (name variations, cache rates, pinned
  $10/$50) in tx.spec; add Mythos context/max-output/name-match in tokens.spec;
  fix the stale claude-3-7-sonnet-only comment in bedrock.ts.
- Add isMythosClassModel unit tests covering all declared families.

* 📝 docs: Clarify Mythos-class Bedrock requirements; correct beta-omit rationale

Verified live against Bedrock (acct 951834775723, us-west-2):
- anthropic.claude-fable-5 IS a real Bedrock catalog model, INFERENCE_PROFILE-only
  exactly like the existing anthropic.claude-opus-4-7/4-8 and claude-sonnet-4-6
  default entries (refutes the "invalid model id" review claim).
- Mythos-class also requires opting into Anthropic data sharing (Bedrock Data
  Retention API) before invocation.

Changes:
- .env.example: note that Mythos-class (Fable/Mythos) is inference-profile-only on
  Bedrock and needs the data-sharing opt-in.
- bedrock.ts: reword the beta-omit comment to the verified rationale — output-128k /
  fine-grained-tool-streaming are built-in/no-op for the 4.7+ generation, so omitting
  them is lossless (dropped the unverified "Bedrock may reject" wording).

* 🔄 refactor: Reorganize imports in schemas.ts and tx.spec.ts

- Moved `TFeedback` and `Tools` imports to the top of `schemas.ts` for better readability.
- Adjusted import order in `tx.spec.ts` to maintain consistency and improve clarity.
2026-06-09 16:22:39 -04:00
Peter Boers
98822341ed
feat: Make OpenID Token Reuse Window Configurable (#13546)
* feat: make OpenID token reuse window configurable via OPENID_REUSE_MAX_SESSION_AGE_MS

The OpenID session-token reuse window in AuthController was a hardcoded 15-minute
constant, forcing /api/auth/refresh to perform a real refreshTokenGrant against the
IdP every 15 minutes even when the current access token is still valid. IdPs that
rotate and revoke the previous access token on refresh then invalidate a token that
is still in use by downstream consumers of the reused OpenID token (e.g. MCP servers
that receive {{LIBRECHAT_OPENID_TOKEN}} and introspect the bearer), producing
~15-minute 401 cycles regardless of the access token's actual lifetime.

Read the window from process.env.OPENID_REUSE_MAX_SESSION_AGE_MS via the existing
math() helper, so it accepts an arithmetic expression like SESSION_EXPIRY (e.g.
60 * 60 * 24 * 1000), defaulting to the existing 15 minutes so behavior is unchanged
unless explicitly configured. The existing 30s-before-expiry guard still forces a
refresh before genuine expiry, so a larger window remains safe.

* fix: extend OpenID reuse session lifetime

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-06-06 15:15:58 -04:00
Danny Avila
2c8d54e18c
🗂️ feat: Add Deployment Skill Directory (#13523)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* feat: Add deployment skill directory

* chore: Address deployment skill review feedback

* fix: Include deployment skill file metadata

* test: Add deployment skills e2e smoke test
2026-06-05 10:24:28 -04:00
Danny Avila
83d8ac0682
🪜 feat: Add OpenID Role Sync (#13415)
* Shared Role-Sync Core

* Environment Configuration

* Browser OpenID Wiring & improved shared component

* API Auth Wiring

* Improved Role Lookup

* added example for sync env

* small simplification

* protect existing manual assigned ADMIN Roles

* fix: Apply OpenID role-sync fallback for present-but-empty claims

Both role-sync call sites skipped on a falsy `openIdRoleValues`, treating an
empty claim string ('') the same as a missing claim and returning before
`selectOpenIdRole` could apply the configured fallback role. An IdP emitting
an empty roles claim for a user with no mapped groups left the stale local
role in place instead of the authoritative fallback.

Skip only when the helper returns `undefined` (missing/invalid), letting an
empty string flow through to fallback selection — consistent with how an
empty array is already handled. Adds regression coverage on both the OpenID
strategy and the remote-agent API auth paths.

* refactor: Address OpenID role-sync review feedback

- role.ts: reuse the shared escapeRegExp util instead of a local escapeRegex
  duplicate, matching prompt/skill/user/userGroup methods (Copilot).
- openidStrategy.js / remoteAgentAuth.ts: make the tenantStorage.run callbacks
  async so the documented ALS contract is satisfied and tenant context cannot
  be lost during Mongoose execution; the wrapped lookups/updates are already
  async, so behavior is unchanged (codex P2).

* fix: Harden OpenID role-sync claim and fallback handling

Addresses the second Codex review cycle (P2 findings):

- Apply fallback when the claim is absent: getOpenIdRolesForOpenIdSync now
  returns an empty list (not undefined) when the token source exists but has
  no usable claim value, so callers still run selection and assign the
  configured fallback instead of leaving a stale elevated role. A truly
  unavailable source still returns undefined and skips sync.
- Resolve group overage for access tokens too: the _claim_names/_claim_sources
  overage path previously only ran for claimSource 'id'; Entra also moves an
  oversized groups claim into access tokens, so 'access'+'groups' (the only
  source supported by remote-agent API sync) now resolves overage as well.
- Allow system fallback roles for tenant users: getLibreChatRolesForOpenIdSync
  treats SystemRoles (e.g. USER) as always-available canonical names, since
  they are provisioned globally at startup and a tenant-scoped lookup may not
  return them — preventing a spurious 'configured roles do not exist: USER'.

Adds unit and strategy-level coverage for all three.

* fix: Tighten OpenID role-sync tenant scoping and config validation

Addresses the third Codex review cycle:

- Constrain base-user role lookups to base roles (P2): findRolesByNames now
  filters to roles with an unset tenantId when no tenant ALS context is active,
  so a base user cannot match — and be assigned — a role that only exists within
  a tenant. Tenant-scoped lookups remain controlled by the isolation plugin.
- Re-enforce tenant login policy after role sync (P2): when role sync changes a
  tenant user's role, the OpenID strategy re-resolves the tenant appConfig and
  re-checks allowedDomains, so a token cannot complete login under the previous
  role's looser policy.
- Skip role-sync-specific validation when disabled (P3): getOpenIdRoleSyncOptions
  returns disabled options before validating role-sync settings, so a stale or
  mistyped value no longer breaks OpenID login while the feature is off.

Adds unit and strategy-level coverage for all three.

* fix: Run base role lookups under system context for strict isolation

Follow-up to the base-role scoping fix (Codex P1). With TENANT_ISOLATION_STRICT=true,
the tenant-isolation pre('find') hook throws on a context-less query before the manual
tenantId filter is honored, so base OpenID/remote-agent auth would 500 instead of
validating base roles. findRolesByNames now runs the no-context lookup inside
runAsSystem (SYSTEM_TENANT_ID), bypassing strict-mode injection while still applying an
explicit base-role (tenantId unset) filter. Adds a strict-mode regression test.

---------

Co-authored-by: Peter Rothlaender <peter.rothlaender@ginkgo.com>
2026-06-02 14:00:56 -04:00
Ravi Kumar L
a86e504a57
📡 feat: Add Authenticated Proxy Mode for Browser RUM Telemetry (#13464) 2026-06-01 21:11:35 -04:00
Danny Avila
444d923e29
✂️ chore: Strip Session JWT Forwarding from Browser RUM (#13414)
* fix: disable RUM user JWT auth

* fix: remove stale RUM bootstrap import
2026-05-30 10:34:44 -04:00
Ravi Kumar L
71a7c9ce7b
📡 feat: Add Configurable HyperDX Browser Real User Monitoring (#13287) 2026-05-29 11:04:26 -07:00
Danny Avila
62dff69300
🧠 feat: Add Claude Opus 4.8 Support (#13380)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* feat: add Claude Opus 4.8 support

* fix: omit sampling params for Claude Opus 4.8

* fix: flatten Bedrock beta header merge

* fix: strip Bedrock sampling params for Opus 4.8
2026-05-28 13:50:39 -07:00
Serhii Zghama
9cd2fc2ed6
🧩 fix: Add REDIS_CLUSTER_SAFE_DELETE Flag for ElastiCache Serverless CROSSSLOT Errors (#13275)
* fix(redis): add REDIS_CLUSTER_SAFE_DELETE for ElastiCache Serverless CROSSSLOT errors

ElastiCache Serverless and similar managed Redis services present a single-node
connection endpoint but shard keys internally. When USE_REDIS_CLUSTER=false (as
required for single-endpoint services), batchDeleteKeys() uses multi-key DEL
commands that fail with CROSSSLOT errors because the managed cluster rejects
cross-slot operations.

Adds REDIS_CLUSTER_SAFE_DELETE=true which forces per-key deletion (the same
cluster-safe path) without changing the connection mode. This makes the delete
strategy independent of the connection topology.

Closes #13261

* test(cache): add REDIS_CLUSTER_SAFE_DELETE config tests

* fix: Avoid nested Redis delete mode ternary

* docs: Add Redis cluster-safe delete env example

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-24 14:49:40 -04:00
Dustin Healy
53e7c41033
🪙 feat: Add AWS Bedrock API key support (#8690)
Some checks are pending
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
* feat: Add Bedrock API key support

* fix: Respect Bedrock credential mode

* fix: Support mixed Bedrock credential forms

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-23 12:41:38 -04:00
Maxence Dominici
294bf7c87d
🛂 feat: Add AWS Profile Support for Bedrock Credentials (#10504)
- Add BEDROCK_AWS_PROFILE environment variable support
  - Implement AWS SDK credential provider chain for automatic refresh
  - Update credential loading logic to support profiles, static env vars, and user-provided credentials
  - Add logging for credential source transparency
  - Update .env.example with profile configuration documentation

  Follows S3 implementation pattern for credential handling.
  Enables users to configure AWS profiles with optional credential_process for automatic token refresh.

Co-authored-by: Maxence - Meca.lu <contact@meca.lu>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-23 10:39:11 -04:00
eleite93
5d393ad79a
🪪 fix: Support OpenID PKCE Without Client Secret (#12364)
* fix: allow OpenID PKCE authentication without client secret

* Linting

* Strategy fix

* fix(openid): trim secret gates and add PKCE client metadata tests

* chore(openid): normalize spec line endings

*  perf: Short-Circuit Config Override Resolution for Empty Principals (#12549)

Skip the getApplicableConfigs DB query when buildPrincipals returns
an empty array, since there are no principals to match against.

*  perf: Separate Error Handling for Principal Resolution vs Config Overrides (#12550)

Distinguish between buildPrincipals and getApplicableConfigs failures
so the uncached fallback to baseConfig is intentional and logged
separately from config override errors.

* Revert " perf: Separate Error Handling for Principal Resolution vs Config Overrides (#12550)"

This reverts commit 1729378a65.

* Revert " perf: Short-Circuit Config Override Resolution for Empty Principals (#12549)"

This reverts commit a100aa5738.

---------

Co-authored-by: CMF\e-leite <EduardoLeite@criticalmanufacturing.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-23 08:57:59 -04:00
Danny Avila
05a3d1ed81
🛣️ feat: Add MCP Remote Proxy Support (#13076)
* feat: add MCP remote proxy support

* fix: Harden MCP Proxy Review Findings

* fix: Honor MCP Proxy Env Precedence

* fix: Harden MCP proxy routing

* fix: Align MCP proxy bypass semantics

* test: Pin MCP proxy admin scope
2026-05-21 15:28:54 -04:00
Danny Avila
f34150e8e8
🧵 chore: Raise MCP SSE Line Default (#13224)
Some checks are pending
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
2026-05-21 09:06:02 -04:00
Danny Avila
1ed84ee4eb
🦣 fix: Response Size Limits for Streamable HTTP MCP Responses (#13219)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* fix: implement response size limits for streamable HTTP MCP responses

- Added environment variables for maximum response and line sizes in streamable HTTP responses.
- Introduced functions to handle response size validation and error logging.
- Updated MCP connection logic to enforce these limits, ensuring safe handling of large responses.

* fix: address MCP response guard review findings

* fix: satisfy logger spy typings

* fix: clarify blocked MCP response errors

* fix: harden MCP guard review edge cases

* test: cover MCP oversized SSE call failures
2026-05-21 01:29:43 -04:00
Danny Avila
830d124e4d
🪪 fix: Add Admin Panel SSO URL Config (#13220)
* fix: Add admin panel URL Helm configuration

* fix: Clarify admin panel URL configuration

* fix: Avoid duplicate admin panel URL env
2026-05-21 00:54:57 -04:00
Danny Avila
799a080479
🗂️ feat: Allow Disabling File Log Transports (#13215)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* fix: allow disabling file log transports

* fix: defer log directory setup when file logging disabled
2026-05-20 23:16:56 -04:00
Danny Avila
909329a7e8
🍪 feat: Add Session Cookie Secure Override (#13189)
* fix: add session cookie secure override

* chore: remove empty whitespace
2026-05-19 09:44:14 -04:00
Danny Avila
050b7fd43a
📡 feat: Add Backend OpenTelemetry Tracing (#12909)
* feat: add backend OpenTelemetry tracing

* fix: address telemetry type checks

* fix: mark aborted telemetry requests as errors

* fix: record telemetry identity after auth

* fix: avoid forced telemetry signal exit

* fix: harden telemetry request attribution

* fix: record telemetry errors on request span

* chore: order imports and reorganize middleware usage

* fix: reduce telemetry startup overhead

* fix: preserve live telemetry controller state

* fix: redact telemetry URL attributes
2026-05-14 09:08:55 -04:00
Danny Avila
947bfa4c40
📚 docs: Add Skills, Subagents, and CloudFront References (#13096) 2026-05-12 21:41:09 -04:00
Danny Avila
0a7255b234
🎭 feat: Support OpenID Audience On Refresh Grants (#13077) 2026-05-11 17:40:30 -04:00
Danny Avila
c3ec23f9b8
🌐 feat: Support Vertex AI Multi-Region Endpoints (#13044)
* feat: support Vertex AI multi-region endpoints

* fix: sync Vertex endpoint with final location
2026-05-10 13:41:58 -04:00
Danny Avila
93c4ef4ba8
🧱 refactor: typed CodeEnvRef + kind discriminator + principal-aware sandbox cache (#12960)
* 🧱 refactor: typed CodeEnvRef + kind discriminator + tenant-aware sandbox cache

Final cutover for the LibreChat ↔ codeapi sandbox file identity. Replaces
the magic string `${session_id}/${file_id}?entity_id=...` with a typed,
discriminated `CodeEnvRef`. Pre-release lockstep deploy with codeapi
#1455 and agents #148; no legacy aliases retained.

## Final shape

```ts
type CodeEnvRef =
  | { kind: 'skill'; id: string; storage_session_id: string; file_id: string; version: number }
  | { kind: 'agent'; id: string; storage_session_id: string; file_id: string }
  | { kind: 'user';  id: string; storage_session_id: string; file_id: string };
```

`kind` drives codeapi's sessionKey: `<tenant>:<kind>:<id>[✌️<version>]`
for shared kinds, `<tenant>:user:<userId>` for user-private (auth context
provides `userId`). `version` is statically required for `kind: 'skill'`
and forbidden otherwise via discriminated union — constraint holds at
compile time on every consumer, not just codeapi's runtime validator.

`id` is sessionKey-meaningful for `'skill'` / `'agent'`; informational
only for `'user'` (codeapi resolves user identity from auth context).

## What changed

- `packages/data-provider/src/codeEnvRef.ts` — discriminated union +
  `CODE_ENV_KINDS` const-tuple keeps the runtime list and TS union
  locked together.
- Schemas: `metadata.codeEnvRef` and `SkillFile.codeEnvRef` enums
  tightened to `['skill', 'agent', 'user']`.
- `primeSkillFiles` writes `kind: 'skill'`, `id: skill._id`,
  `version: skill.version`. Cache-hit path reads `codeEnvRef`
  directly. Bumping `skill.version` on edit naturally invalidates
  the prior cache entry under the new sessionKey.
- `processCodeOutput` writes `kind: 'user'`, `id: req.user.id`. Output
  bucket is always user-scoped, regardless of which skill the
  execution invoked. New regression test pins the asymmetry.
- `primeFiles` reupload preserves `kind`/`id`/`version?` from the
  existing ref so a skill-cache-miss reupload doesn't silently demote
  to user bucket.
- `crud.js` upload functions (`uploadCodeEnvFile` /
  `batchUploadCodeEnvFiles`) thread `kind`/`id`/`version?` to the
  multipart form (codeapi #1455 option α). Without these on the wire,
  codeapi falls back to user bucketing and skill-cache invalidation
  never fires. Client-side validation mirrors codeapi's validator.
- `Files/process.js` — chat attachments use `kind: 'user'`; agent
  setup files use `kind: 'agent'`.
- Drops `entity_id` everywhere (struct, schema sub-docs, write paths,
  upload form fields). Drops `'system'` from the kind enum (no emitter
  ever existed).

## Test plan

- [x] `cd packages/data-provider && npx jest src/codeEnvRef.spec` — 4 / 4
- [x] `cd packages/data-schemas && npx jest` — 1447 / 1447
- [x] `cd packages/api && npx jest src/agents` — 81 / 81 in skillFiles +
  handlers + resources
- [x] `cd api && npx jest server/services/Files server/controllers/agents` —
  436 / 436
- [x] `cd api && npx jest server/services/Files/Code` — 98 / 98 (incl.
  new "outputs are user-scoped regardless of which skill the execution
  invoked" regression and "reupload forwards kind/id/version from
  existing ref")
- [x] `npx tsc --noEmit -p packages/data-{provider,schemas}/tsconfig.json
  && npx tsc --noEmit -p packages/api/tsconfig.json` — clean (only
  pre-existing unrelated dev errors in storage/balance, untouched here)

## Deploy notes

- **24h cache-miss burst** on first deploy. Inputs (skill caches re-prime
  under new sessionKey shape) and outputs (any pre-Phase C skill-output
  cached files become unreadable). Bounded by codeapi's 24h TTL.
- **Lockstep with codeapi #1455 and agents #148.** Either repo can land
  first since no aliases to drain, but the three deploys must overlap
  within the same maintenance window.
- **`@librechat/agents` bump to `3.1.79-dev.0`** required after agents
  #148 lands and is published.

## What this enables

Auth bridge work (JWT-based tenant/user identity between LC and codeapi)
— codeapi now derives sessionKey purely from `req.codeApiAuthContext.{
tenantId, userId}`, so the next chapter is replacing the header-asserted
user identity with a verified-claim path.

* 🩹 fix: persist execute_code uploads under codeEnvRef metadata key

Codex review P1 (chatgpt-codex-connector). `Files/process.js` was
storing the upload result under `metadata.fileIdentifier` even though:
- `uploadCodeEnvFile` now returns `{ storage_session_id, file_id }`,
  not the legacy magic string.
- The post-cutover schema (`File.metadata.codeEnvRef`) only declares
  `codeEnvRef` — mongoose strict mode silently strips unknown keys.
- All readers (`primeFiles`, `getCodeFilesByIds`,
  `categorizeFileForToolResources`, controller filtering) check
  `metadata.codeEnvRef`.

Net effect of the bug: chat-attached and agent-setup execute_code files
would lose their sandbox reference on save, and primeFiles would skip
them on subsequent code-execution turns — the file blob would still be
available locally but never re-mounted in the sandbox.

Fix: construct the full `CodeEnvRef` (`{ kind, id, storage_session_id,
file_id }`) at the write site and persist under `metadata.codeEnvRef`.
`BaseClient`'s "is this a code-env file" presence check accepts the new
shape alongside the legacy `fileIdentifier` for back-compat with any
pre-cutover records still in the database. Mirrors the same change in
`processAttachments.spec.ts` (which re-implements the BaseClient logic
for testability).

New regression tests in `process.spec.js` cover three cases:
- chat attachments (`messageAttachment=true`) → `kind: 'user'`
- agent setup (`messageAttachment=false`) → `kind: 'agent'`
- legacy `fileIdentifier` key is NOT persisted (would be schema-stripped)

* 🩹 fix: read storage_session_id on primed file refs (Codex P1)

Codex review (chatgpt-codex-connector). After Phase B's per-file
`session_id` → `storage_session_id` rename, `primeFiles` emits the
new field — but `seedCodeFilesIntoSessions` was still reading
`files[0].session_id` for the representative session and `f.session_id`
for the dedupe key. In runs with only primed attachments (no skill
seed), `representativeSessionId` was `undefined`, the function
returned the unchanged map, and `seedCodeFilesIntoSessions` silently
dropped the entire batch. The first `execute_code` call then started
without `_injected_files` and the agent couldn't see prior-turn
artifacts.

Fix:
- `codeFilesSession.ts`: read `f.storage_session_id` for both the
  dedupe key and the representative session id. JSDoc updated to
  match the new field name.
- `callbacks.js`: the two output-file persistence paths read
  `file.session_id` to pass to `processCodeOutput` — switch to
  `file.storage_session_id`. The original comment explicitly says
  this should be the STORAGE session, which is exactly the field
  Phase B renamed.
- `codeFilesSession.spec.ts`: fixture builder uses `storage_session_id`
  and `kind: 'user'` to match the post-cutover `CodeEnvFile` shape.

Lockstep coordination: this matches the post-bump shape of
`@librechat/agents` 3.1.79+. CI tsc errors against the currently-pinned
3.1.78 are expected and resolve when the dep bumps in this PR before
merge.

* 📦 chore: Bump `@librechat/agents` to version 3.1.80-dev.0 in package-lock and package.json files

* 🪪 fix: thread kind/id/version through codeapi /download URLs (Phase C α)

Symmetric fix for the upload-side wire change in 537725a. Codeapi's
`sessionAuth` middleware now requires `kind`/`id`/`version?` on every
download/freshness URL — without them it 400s with "kind must be one
of: skill, agent, user" before serving the file.

Three sites construct codeapi-side URLs that go through `sessionAuth`:

- `processCodeOutput` (`Files/Code/process.js`): `/download/<sess>/<id>`
  for freshly-generated sandbox outputs. Always `kind: 'user'` +
  `id: req.user.id` — code-output files are always user-private,
  regardless of which skill the run invoked.
- `getSessionInfo` (`Files/Code/process.js`): `/sessions/<sess>/objects/<id>`
  for the 23h freshness check. Pulls kind/id/version straight off the
  `codeEnvRef` already in scope — skill files stay skill-bucketed,
  user files stay user-bucketed.
- `/code/download/:session_id/:fileId` LC route (`routes/files/files.js`):
  proxies to codeapi for manual downloads. Code-output files only on
  this route, so `kind: 'user'` + `id: req.user.id`.

The `getCodeOutputDownloadStream` helper in `crud.js` now takes an
`identity` param, validated by a `buildCodeEnvDownloadQuery` helper
that mirrors `appendCodeEnvFileIdentity`'s shape rules: kind required
from the closed `{skill, agent, user}` set, version required for
'skill' and forbidden otherwise. Bad callers fail fast on the client
instead of round-tripping a 400.

Also cleans up two log-noise sources reported alongside the 400:

- `logAxiosError` in `packages/api/src/utils/axios.ts` was dumping
  `error.response.data` raw. With `responseType: 'arraybuffer'` that's
  a `Buffer` (~4 chars per byte after JSON-serialization); with
  `responseType: 'stream'` it's a `Readable` whose internal state
  serializes the entire ring buffer + socket. New `renderResponseData`
  decodes small buffers as UTF-8 (truncated past 2KB) and stubs streams
  as `'[stream]'`. Diagnostics stay useful, log lines stop being
  megabytes.
- `/code/download` route's catch was bare `logger.error('...', error)`,
  bypassing the redactor. Switched to `logAxiosError` so it benefits
  from the same buffer/stream handling.

Tests updated to match the new contract:
- crud.spec: `getCodeOutputDownloadStream` fixtures pass `userIdentity`;
  new cases cover skill identity (with version), bad kind rejection,
  skill-without-version rejection.
- process.spec: `getSessionInfo` test passes a full `codeEnvRef` object.

* ♻️ refactor: extract codeEnv identity helpers into packages/api

Per the project convention that new backend code lives in TypeScript
under `packages/api`, moves `appendCodeEnvFileIdentity` and
`buildCodeEnvDownloadQuery` from `api/server/services/Files/Code/crud.js`
into a new `packages/api/src/files/code/identity.ts` module.

Both helpers are pure validators that mirror codeapi's
`parseUploadSessionKeyInput` server-side rules (closed kind set,
`version` required for `'skill'` and forbidden otherwise) — they
deserve TS support and a dedicated spec rather than living as
JSDoc-typed helpers in the legacy `/api` workspace. The new module:

- Exports a `CodeEnvIdentity` interface using the
  `librechat-data-provider` `CodeEnvKind` discriminated union.
- Adds 13 unit tests in `identity.spec.ts` covering the validation
  matrix (skill+version, agent, user, and every rejection path) plus
  URL encoding for the download query.
- Re-exported from `packages/api/src/files/code/index.ts` alongside
  `classify`, `extract`, and `form`.

Consumer updates:
- `api/server/services/Files/Code/crud.js`: drops the local helpers
  and imports them from `@librechat/api`. Net -64 lines.
- `api/server/services/Files/Code/process.js`: same.
- Test mocks for `@librechat/api` in three spec files now stub the
  helpers' validation behavior locally rather than pulling them
  through `requireActual` (which would drag in provider-config
  init-time side effects). The package's `exports` field only
  surfaces the root barrel, so leaf imports aren't reachable from
  legacy `/api` test setup.

No runtime behavior change. Identity validation rules and emitted
form/query shapes are byte-for-byte identical pre/post.

* 🪪 fix: emit resource_id alongside id on _injected_files (skill 403 fix)

Companion to codeapi #1455 fix and agents 3.1.80-dev.1 — the wire
shape for shared-kind files now requires `resource_id` distinct from
the storage `id`. Without this LC change, codeapi's sessionKey
re-derivation on every shared-kind /exec rejects with 403
session_key_mismatch:

    cached:  legacy:skill:69dcf561...✌️59  (signed at upload, skill _id)
    derived: legacy:skill:ysPwEURuPk-...✌️59  (storage nanoid)

Emit sites updated:

- `primeInvokedSkills` cache-hit path: `resource_id: ref.id` (the
  persisted skill `_id` from `codeEnvRef.id`); `id: ref.file_id`
  unchanged (storage uuid).
- `primeInvokedSkills` fresh-upload path: `resource_id: skill._id.toString()`
  on every primed file (the `allPrimedFiles` builder type now carries
  the field).
- `processCodeOutput`'s `pushFile` (Code/process.js): `resource_id: ref.id`
  — for `kind: 'user'` this is informational (codeapi derives
  sessionKey from auth context) but emitted for shape uniformity
  with shared kinds.

Bumps `@librechat/agents` to `^3.1.80-dev.1` (the version that
ships the matching `CodeEnvFile.resource_id` field).

## Test plan

- [x] `cd packages/api && npx jest src/agents` — 67 / 67 pass
  (skillFiles fixtures updated to assert `resource_id` on the
  emitted CodeSessionContext.files).
- [x] `cd api && npx jest server/services/Files server/controllers/agents` —
  445 / 445 pass (process.spec fixtures updated for the reupload
  + cache-hit emission).
- [x] `npx tsc --noEmit -p packages/api/tsconfig.json` — clean.

* fix(skill-tool-call): carry resource_id through primeSkillFiles → artifact

Codeapi was 400ing every /exec following a `handle_skill` tool call
with `resource_id is invalid` (`type: 'undefined'`). Both code paths
in `primeSkillFiles` (cache-hit + fresh-upload) returned files
without `resource_id`/`kind`/`version`, and the artifact in
`handlers.ts` forwarded the stripped shape into
`tc.codeSessionContext.files` → `_injected_files`.

`primeInvokedSkills` (the NL-detected loader) had already been fixed
end-to-end; this commit aligns the tool-invoked path with the same
contract: `resource_id` = `skill._id.toString()`, `kind: 'skill'`,
`version` = the skill's monotonic counter.

Tests added to `skillFiles.spec.ts` lock the contract on
`primeSkillFiles` directly so future refactors can't silently drop
the resource identity again.

* fix(handlers.spec): align session_id → storage_session_id rename + kind discriminator

Pre-existing TS errors against the post-rename `CodeEnvFile` shape:
the test file still used `session_id` on per-file objects (renamed to
`storage_session_id` in agents Phase B/C) and was missing the `kind`
discriminator the discriminated union requires. Both inputs and the
matching `expect.toEqual(...)` mirrors updated together so the
runtime equality check still holds.

Lines 723-732 stay as-is — they sit behind `as unknown as
ToolCallRequest` and TS already skipped them.

* chore: fix `@librechat/agents`, correct version to 3.1.80-dev.0 in package.json files

* chore: bump `@librechat/agents` to version 3.1.80-dev.1 in package.json and package-lock.json

* chore: bump `@librechat/agents` to version 3.1.80-dev.2

* feat(observability): trace file priming chain from primeCodeFiles to _injected_files

Diagnosing the user-upload "files=[] on first /exec" bug requires
seeing where in the LC chain a file ref disappears. Prior to this
patch the chain (primeCodeFiles → primedCodeFiles → initialSessions
→ CodeSessionContext → _injected_files) was opaque end-to-end:
  - primeCodeFiles silently dropped files without `metadata.codeEnvRef`
  - reuploadFile catches all errors and continues with no signal
  - the handlers.ts handoff to codeapi never logged what it was sending

After this patch, a single grep on `[primeCodeFiles]` plus
`[code-env:inject]` shows the full per-file path:

  [primeCodeFiles] in: file_ids=N resourceFiles=M
  [primeCodeFiles] file=<id> path=skip reason=no-codeenvref filename=...
  [primeCodeFiles] file=<id> path=cache-hit-by-session storage_session_id=...
  [primeCodeFiles] file=<id> path=reupload reason=no-uploadtime ...
  [primeCodeFiles] file=<id> path=reupload reason=stale ...
  [primeCodeFiles] file=<id> path=reupload-success oldSession=... newSession=... newFileId=...
  [primeCodeFiles] file=<id> path=reupload-failed session=...
  [primeCodeFiles] file=<id> path=fresh-active storage_session_id=...
  [primeCodeFiles] out: returned=N skippedNoRef=M reuploadFailures=K

  [code-env:inject] tool=<name> files=N missingResourceId=K     (debug)
  [code-env:inject] M/N files missing resource_id ...           (warn)
  [code-env:inject] tool=<name> _injected_files=0 ...           (warn)

The boundary log warns when LC sends zero injected files on a
code-execution tool call — that's the user's actual symptom showing
up at the LC side instead of having to correlate against codeapi's
`Request received { files: [] }`.

Tag chosen as `[code-env:inject]` rather than `[handoff:exec]` to
avoid collision with the app-level "handoff" semantic (subagent
handoff workflow).

Structural cleanup in primeFiles: replaced the `if (ref) { ... }`
nesting with an early `if (!ref) continue` so the per-path
instrumentation hooks land at top-level scope instead of indented
inside a conditional. Behavior unchanged; pushFile / reuploadFile
identical.

Spec fixtures (handlers.spec.ts, codeFilesSession.spec.ts) updated
to include `resource_id` on `CodeEnvFile` literals — required by
the post-3.1.80-dev.2 type now installed.

## Test plan

- [x] `cd packages/api && npx jest src/agents/handlers.spec.ts src/agents/codeFilesSession.spec.ts src/agents/skillFiles.spec.ts` — 69/69 pass
- [x] `cd api && npx jest server/services/Files/Code/process.spec.js` — 84/84 pass
- [x] `npx tsc --noEmit -p packages/api` — clean
- [x] `npx eslint` on all four touched files — clean

* chore: add CONSOLE_JSON_STRING_LENGTH to .env.example for JSON log string length configuration

* fix(files): align codeapi upload filename with LC's sanitized DB filename

User-attached files for code execution were uploading to codeapi
under `file.originalname` (raw upload filename, may contain spaces /
special chars) while LC's DB record stored the sanitized form
(`sanitizeFilename(file.originalname)`, underscores). Codeapi
preserves whatever filename the upload sent, so the sandbox saw
`/mnt/data/<originalname>` while LC's `primeFiles` toolContext text
+ `_injected_files.name` referenced `file.filename` (sanitized).

Visible failure: agent gets system prompt saying

    /mnt/data/librechat_code_api_-_active_customer_-_2025-11-05.xlsx

…tries that path, hits `FileNotFoundError`, then notices the
sandbox's actual `Available files` line says

    /mnt/data/librechat code api - active customer - 2025-11-05.xlsx

…retries with spaces, succeeds. Wastes a tool call per upload and
leaks raw filenames into model context.

Fix: sanitize once and use the sanitized form in both the codeapi
upload AND the LC DB record. Sandbox path = LC toolContext text =
in-memory ref name. No drift.

Reupload path (`Code/process.js` line 867 `filename: file.filename`)
already uses the sanitized DB name, so it stays consistent with the
fresh-upload path after this change.

## Test plan

- [x] `cd api && npx jest server/services/Files/process` — 32/32 pass
- [x] `npx eslint` on the touched file — clean

* chore: bump `@librechat/agents` to version 3.1.80-dev.3 in package.json and package-lock.json
2026-05-08 12:29:43 -04:00
Yashwanth Alapati
3da1d8c961
🔍 feat: add Tavily as Search and Scraper Provider (#12581)
* feat: add Tavily integration as search provider and scraper provider

* chore:update tavily web search parameters

* chore:tavily paramer update

* chore:update data-schemas test for tavily

* fix: allow Tavily string option modes

* fix: align Tavily config options

* fix: scope Tavily scraper timeout

* fix: use resolved scraper provider timeout

* fix: widen Tavily search provider types

* fix: harden Tavily web search config

* fix: cap Tavily option timeouts

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-04 11:29:13 +09:00
Danny Avila
eb22bb6969
🧭 fix: Migrate Anthropic Long Context (#12911) 2026-05-02 22:14:19 +09:00
Danny Avila
35bf04b26c 🧰 refactor: Unify code-execution tools (#12767)
* 🛠️ feat: Add registerCodeExecutionTools helper

Idempotently registers `bash_tool` + `read_file` in the run's tool
registry and tool-definition list via a registry `.has()` dedupe. Sets
up the single code-execution tool path shared by:

- `initializeAgent` (when an agent has `execute_code` in its tools and
  the capability is enabled for the run)
- `injectSkillCatalog` (when skills are active; unconditional read_file,
  bash_tool follows `codeEnvAvailable`)

Both callers reach the helper in the same initialization sequence, so
the second call becomes a no-op and exactly one copy of each tool
reaches the LLM — no more double registration for agents that combine
`execute_code` capability with active skills.

Unit-tested on a fresh run, idempotence (second call, overlap with
prior tooldefs, partial overlap), and the no-registry variant.

* 🔀 refactor: Route injectSkillCatalog bash_tool + read_file through registerCodeExecutionTools

The `skill` tool is still registered inline (it's skill-path-specific),
but `bash_tool` + `read_file` now flow through the shared idempotent
helper so a prior registration from the execute_code path doesn't
produce a duplicate copy later in the same run. Behavior preserved:

- `read_file` always registers when any active skill is in scope —
  manually-primed `disable-model-invocation: true` skills still need it
  to load `references/*` from storage.
- `bash_tool` follows `codeEnvAvailable` exactly as before.

Adds a test pinning the cross-call dedupe: when `injectSkillCatalog`
runs AFTER `registerCodeExecutionTools` has already seeded the registry
+ tool definitions with bash_tool/read_file, the resulting
`toolDefinitions` still contains exactly one copy of each.

* 🪄 feat: Expand `execute_code` tool name into bash_tool + read_file at initialize-time

When an agent's `tools` include `execute_code` and the `execute_code`
capability is enabled for the run, `initializeAgent` now registers
`bash_tool` + `read_file` via `registerCodeExecutionTools` before
`injectSkillCatalog`. The legacy `execute_code` tool definition is no
longer handed to the LLM — `execute_code` remains on the agent
document as a capability-trigger marker, but the runtime expands it
into the skill-flavored tool pair.

Call ordering matters: the `execute_code` registration runs BEFORE
`injectSkillCatalog`, so the skill path's own `registerCodeExecutionTools`
call inside `injectSkillCatalog` becomes a no-op via the registry's
`.has()` check. Exactly one copy of each tool reaches the LLM whether
the agent has:

- only `execute_code` (legacy path)
- only skills
- both

No data migration needed — `agent.tools: ['execute_code']` stays in
the DB unchanged; the expansion is a runtime operation.

Three tests cover the matrix: execute_code + capability on →
bash_tool + read_file registered; execute_code + capability off →
neither registered; no execute_code + capability on → neither
registered.

* 🗑️ refactor: Drop CodeExecutionToolDefinition from the builtin registry

Removes the legacy `execute_code` entry from `agentToolDefinitions` and
the corresponding import. With the initialize-time expansion in place,
nothing consults `getToolDefinition('execute_code')` for a tool schema
any more — the capability gate still filters on the string
`execute_code`, but the actual tool definitions the LLM sees come from
`registerCodeExecutionTools` (i.e. `bash_tool` + `read_file`).

`loadToolDefinitions` in `packages/api/src/tools/definitions.ts`
silently drops `execute_code` when it no longer resolves in the
registry — that's the expected path and is now covered by an updated
test. No caller of `getToolDefinition('execute_code')` expects a
non-undefined result after this change.

* 🔌 refactor: Read CODE_API_KEY from env for primeCodeFiles + PTC

Finishes the Phase 4 server-env-keyed rollout on the two remaining
`loadAuthValues({ authFields: [EnvVar.CODE_API_KEY] })` sites in
`ToolService.js`:

- `primeCodeFiles` (user-attached file priming on execute_code agents)
- Programmatic Tool Calling (`createProgrammaticToolCallingTool`)

Both now read `process.env[EnvVar.CODE_API_KEY]` directly, matching
`bash_tool`'s pattern. The per-user plugin-auth path is no longer
consulted for code-env credentials anywhere in the hot path — the
agents library owns the actual tool-call execution and also reads the
env var internally.

Priming still fires for existing user-file workflows so the legacy
`toolContextMap[execute_code]` hint ("files available at /mnt/data/...")
stays in the prompt; only the key lookup changed.

* 🔧 fix: Type the pre-seeded dedupe-test tools as LCTool

CI TypeScript type checks caught `{ parameters: {} }` in the new
cross-call dedupe test: `LCTool.parameters` is a `JsonSchemaType`,
not `{}`. Use `{ type: 'object', properties: {} }` and type the
local registry Map through the parameter-derived shape so the
pre-seeded values match what `toolRegistry.set` expects.

* 🛡️ fix: Run execute_code expansion before GOOGLE_TOOL_CONFLICT gate

Codex review caught a latent regression: the original Phase 8 placement
ran `registerCodeExecutionTools` after `hasAgentTools` was computed,
so an execute-code-only agent on Google/Vertex with provider-specific
`options.tools` populated would no longer trip `GOOGLE_TOOL_CONFLICT`
— the legacy `CodeExecutionToolDefinition` used to populate
`toolDefinitions` before the guard, but after dropping it from the
registry, `toolDefinitions` stayed empty until my expansion ran
downstream of the guard. Mixed provider + agent tools would silently
flow through to the LLM.

Fix moves the `execute_code` expansion to BEFORE `hasAgentTools`
computation. `bash_tool` + `read_file` now contribute to the check
the same way the legacy `execute_code` def did. Covered by a new
test that pins the Google+execute_code+provider-tools scenario —
the `rejects.toThrow(/google_tool_conflict/)` path would have
silently passed on the prior placement.

* 🔗 fix: Thread codeEnvAvailable through handoff sub-agents

Round-2 codex review caught the other half of the execute_code
expansion gap: `discoverConnectedAgents` omitted `codeEnvAvailable`
from its forwarded `initializeAgent` params, so handoff sub-agents
with `agent.tools: ['execute_code']` lost the `bash_tool` + `read_file`
registration (pre-Phase 8 the legacy `CodeExecutionToolDefinition`
would have landed in their `toolDefinitions` via the registry).

- Add `codeEnvAvailable?` to `DiscoverConnectedAgentsParams` and
  forward it verbatim on every sub-agent `initializeAgent` call.
- Update the three JS call sites that construct the primary's
  `codeEnvAvailable` (`services/Endpoints/agents/initialize.js`,
  `controllers/agents/openai.js`, `controllers/agents/responses.js`)
  to pass the same flag into `discoverConnectedAgents` — one
  authoritative source per request.
- Two regression tests in `discovery.spec.ts` pin the true/false
  passthrough so a future refactor that drops the param-forwarding
  surfaces immediately.

Left intentionally unchanged: `packages/api/src/agents/openai/service.ts`
(public API helper with no in-repo caller). External consumers of
`createAgentChatCompletion` who want code execution should pass a
`codeEnvAvailable`-aware `initializeAgent` via `deps` — documenting
the full public-API surface is out of scope for this Phase 8 PR.

* 🔗 fix: Thread codeEnvAvailable through addedConvo + memory-agent paths

Round-3 codex review caught the last two production `initializeAgent`
callers missing the Phase-8 capability flag:

- `api/server/services/Endpoints/agents/addedConvo.js` (multi-convo
  parallel agent execution). Added `codeEnvAvailable` to
  `processAddedConvo`'s destructured params and forwarded it into
  the per-added-agent `initializeAgent` call. Caller in
  `api/server/services/Endpoints/agents/initialize.js` passes the
  same `codeEnvAvailable` it computed for the primary.
- `api/server/controllers/agents/client.js` (`useMemory` — memory
  extraction agent). Computes its own `codeEnvAvailable` from
  `appConfig?.endpoints?.[EModelEndpoint.agents]?.capabilities` and
  forwards into `initializeAgent`. Memory agents rarely list
  `execute_code`, but if one does, pre-Phase 8 they got the legacy
  `execute_code` tool registered unconditionally — the passthrough
  restores parity.

With this, every production caller of `initializeAgent` explicitly
resolves the capability: main chat flow (primary + handoff), OpenAI
chat completions (primary + handoff), Responses API (primary + handoff),
added convo parallel agents, and memory agents. The one remaining
caller, `packages/api/src/agents/openai/service.ts::createAgentChatCompletion`,
is a public API helper with no in-repo consumer (external callers
must pass a capability-aware `initializeAgent` via `deps`).

* 🪤 fix: Remove duplicate appConfig declaration causing TDZ ReferenceError

The Responses API controller had TWO `const appConfig = req.config;`
bindings inside `createResponse`: one at the top of the function
(added by the Phase 4 `bash_tool` decouple) and one inside the try
block (added by the polish PR #12760). Because `const` is block-scoped
with a temporal dead zone, the inner redeclaration put `appConfig` in
TDZ for the entire try block, so any earlier reference inside the
try — notably `appConfig?.endpoints?.[EModelEndpoint.agents]?.allowedProviders`
at line 348 — threw `ReferenceError: Cannot access 'appConfig'
before initialization`. The error was silently swallowed by the
outer try/catch, leaving `recordCollectedUsage` unreached and the
six `responses.unit.spec.js` token-usage tests failing.

Removing the inner redeclaration fixes the six failing tests
(verified: 11/11 pass locally post-fix, 0 regressions elsewhere).
The outer function-scoped binding already provides `appConfig` to
every downstream reference.

* 🔗 fix: Thread codeEnvAvailable through the OpenAI chat-completion public API

Round-4 codex review (legitimate on the type-safety angle, even though the
runtime concern was already covered): the `createAgentChatCompletion`
helper defines its own narrower `InitializeAgentParams` interface locally,
and the type was missing `codeEnvAvailable`. External consumers who
supply a capability-aware `deps.initializeAgent` couldn't route
`codeEnvAvailable` through without a type-cast workaround.

- Widen the local `InitializeAgentParams` interface to include
  `codeEnvAvailable?: boolean` (matches the real
  `packages/api/src/agents/initialize.ts` type).
- Derive `codeEnvAvailable` inside `createAgentChatCompletion` from
  `deps.appConfig?.endpoints?.agents?.capabilities` (the same source
  the in-repo controllers use) and forward to `deps.initializeAgent`.
  Uses a string literal `'execute_code'` lookup so this file stays free
  of a `librechat-data-provider` import — keeping the dependency surface
  of the public helper minimal.

With this, external consumers of `createAgentChatCompletion` who pass
`appConfig` with the agents capabilities get `bash_tool` + `read_file`
registration automatically; consumers who don't pass `appConfig` retain
the existing "explicit opt-in" semantics (the flag stays `undefined`,
expansion is skipped).

* 🧹 chore: Review-driven polish — observability log, JSDoc DRY, test gaps, no-op allocation

Addresses the comprehensive review of PR #12767:

- **Finding #1** (MINOR, observability): `initializeAgent` now emits a
  debug log when an agent lists `execute_code` in its tools but the
  runtime gate is off (`params.codeEnvAvailable` !== true). The
  event-driven `loadToolDefinitionsWrapper` path doesn't log
  capability-disabled warnings, so without this the tool silently
  vanishes from the LLM's definitions with zero trace. Operators
  debugging "why isn't code interpreter working?" now get a signal at
  the initialize layer.

- **Finding #5** (NIT, allocation): `registerCodeExecutionTools` now
  returns the input `toolDefinitions` array by reference on the no-op
  path (both tools already registered by a prior caller in the same
  run) instead of allocating a fresh spread array every time. The
  common dual-call scenario — `initializeAgent` then
  `injectSkillCatalog` — saves one O(n) copy per request.

- **Finding #4** (NIT, DRY): Collapsed the duplicated 6-line JSDoc
  comment in `openai.js`, `responses.js`, and `addedConvo.js` into
  either a one-line `@see DiscoverConnectedAgentsParams.codeEnvAvailable`
  pointer (the two JS call sites) or a compact 3-line block referring
  back to the canonical source (addedConvo's @param).

- **Finding #2** (MINOR, test gap): Added
  `api/server/services/Endpoints/agents/addedConvo.spec.js` with three
  cases covering `codeEnvAvailable=true`, `codeEnvAvailable=false`,
  and omitted (undefined) passthrough. A future refactor that drops
  the param from destructuring now surfaces here instead of silently
  regressing multi-convo parallel agents with `execute_code`.

- **Finding #3** (MINOR, test gap): Added
  `api/server/controllers/agents/__tests__/client.memory.spec.js`
  pinning the capability-flag derivation that `AgentClient::useMemory`
  uses — six cases covering present/absent/null/undefined config shapes
  plus an enum-literal pin (`'execute_code'` / `'agents'`). Catches
  enum renames or config-path shifts that would otherwise silently
  strip `bash_tool` + `read_file` from memory agents.

Finding #7 (jest.mock scoping, confidence 40) left as-is: the
reviewer's own risk assessment noted `buildToolSet` doesn't touch
the mocked exports, and restructuring a file-level `jest.mock` to
`jest.doMock` + dynamic `import()` introduces more complexity than
the speculative risk justifies. The existing mock is scoped to the
test file and contains the same stubs the adjacent
`skills.test.ts` already uses.

Finding #6 (PR description commit count) addressed out-of-band via
PR description update.

All existing tests pass, typecheck clean, lint clean across touched
files. New tests: 9 cases across 2 new spec files.

* 🧽 refactor: Replace hardcoded 'execute_code' string with AgentCapabilities enum in service.ts

Follow-up review (conf 55) caught that `openai/service.ts`'s Phase 8
`codeEnvAvailable` derivation used the literal `'execute_code'` while
every in-repo controller uses `AgentCapabilities.execute_code` from
`librechat-data-provider`. The file deliberately uses local type
interfaces to keep the public API helper's type surface small, but
that pattern was never a ban on single-value imports from the data
provider — `packages/api` already depends on it. Importing the enum
value means a future rename of `AgentCapabilities.execute_code`
propagates to this file automatically, matching the in-repo
controllers' behavior.

Other follow-up findings left as-is per the reviewer's own verdict:

- #2 (memory spec mirrors the production expression rather than
  calling `AgentClient::useMemory` directly): reviewer flagged as
  "not blocking" / "design-philosophy observation." The test file's
  JSDoc already explicitly documents the tradeoff and pins the enum
  literals to catch the most likely drift vector. Standing up
  `AgentClient` + all its mocks for a one-line regression guard is
  disproportionate.
- #3 (`addedConvo.spec.js` mock signature vs. underlying
  `loadAddedAgent` arity): reviewer's own confidence 25 noted the
  mock matches the wrapper's actual call pattern in the production
  file. Not a real gap.
- #4 was self-retracted as a false alarm.

* 🗑️ refactor: Fully deprecate CODE_API_KEY — remove all LibreChat-side references

The code-execution sandbox no longer authenticates via a per-run
`CODE_API_KEY` (frontend or backend). Auth moved server-side into the
agents library / sandbox service, so LibreChat drops every reference:

**Backend plumbing:**
- `api/server/services/Files/Code/crud.js`: `getCodeOutputDownloadStream`,
  `uploadCodeEnvFile`, `batchUploadCodeEnvFiles` no longer accept
  `apiKey` or send the `X-API-Key` header.
- `api/server/services/Files/Code/process.js`: `processCodeOutput`,
  `getSessionInfo`, `primeFiles` drop the `apiKey` param throughout.
- `api/server/services/ToolService.js`: stop reading
  `process.env[EnvVar.CODE_API_KEY]` for `primeCodeFiles` and PTC; the
  agents library handles auth internally. Remove the now-dead
  `loadAuthValues` + `EnvVar` imports. Drop the misleading
  "LIBRECHAT_CODE_API_KEY" hint from the bash_tool error log.
- `api/server/services/Files/process.js`: remove the `loadAuthValues`
  call around `uploadCodeEnvFile`.
- `api/server/routes/files/files.js`: code-env file download no longer
  fetches a per-user key.
- `api/server/controllers/tools.js`: `execute_code` is no longer a
  tool that needs verifyToolAuth with `[EnvVar.CODE_API_KEY]` — the
  endpoint always reports system-authenticated so the client skips
  the key-entry dialog. `processCodeOutput` called without `apiKey`.
- `api/server/controllers/agents/callbacks.js`: `processCodeOutput`
  invoked without the loadAuthValues round trip, for both LegacyHandler
  and Responses-API handlers.
- `api/app/clients/tools/util/handleTools.js`: `createCodeExecutionTool`
  called with just `user_id` + files.

**packages/api:**
- `packages/api/src/agents/skillFiles.ts`: `PrimeSkillFilesParams`,
  `PrimeInvokedSkillsDeps`, `primeSkillFiles`, `primeInvokedSkills` all
  drop the `apiKey` param; the gate is purely `codeEnvAvailable`.
- `packages/api/src/agents/handlers.ts`: `handleSkillToolCall` drops
  the `process.env[EnvVar.CODE_API_KEY]` read; skill-file priming is
  now gated solely on `codeEnvAvailable`. `ToolExecuteOptions`
  signatures drop apiKey from `batchUploadCodeEnvFiles` and
  `getSessionInfo`.
- `packages/api/src/agents/skillConfigurable.ts`: JSDoc no longer
  references the env var.
- `packages/api/src/tools/classification.ts`: PTC creation no longer
  gated on `loadAuthValues`; `buildToolClassification` drops the
  `loadAuthValues` dep entirely (no LibreChat-side callers need it for
  this path anymore).
- `packages/api/src/tools/definitions.ts`: `LoadToolDefinitionsDeps`
  drops the `loadAuthValues` field.

**Frontend:**
- Delete `client/src/hooks/Plugins/useAuthCodeTool.ts`,
  `useCodeApiKeyForm.ts`, and
  `client/src/components/SidePanel/Agents/Code/ApiKeyDialog.tsx` —
  the install/revoke dialogs for CODE_API_KEY are fully dead.
- `BadgeRowContext.tsx`: drop `codeApiKeyForm` from the context type and
  provider. `codeInterpreter` toggle treated as always authenticated
  (sandbox auth is server-side).
- `ToolsDropdown.tsx`, `ToolDialogs.tsx`, `CodeInterpreter.tsx`,
  `RunCode.tsx`, `SidePanel/Agents/Code/Action.tsx` +`Form.tsx`: all
  API-key dialog trigger refs, "Configure code interpreter" gear
  buttons, and auth-verification plumbing removed. The
  "Code Interpreter" toggle is now a plain `AgentCapabilities.execute_code`
  checkbox — no key-entry gate.
- `client/src/locales/en/translation.json`: drop the three
  `com_ui_librechat_code_api*` keys and `com_ui_add_code_interpreter_api_key`.
  Other locales are externally automated per CLAUDE.md.

**Config:**
- `.env.example`: remove the `# LIBRECHAT_CODE_API_KEY=your-key` section
  and its header.

**Tests:**
- `crud.spec.js`: assertions flipped to pin "no X-API-Key header" and
  "no apiKey param".
- `skillFiles.spec.ts`: removed env-var save/restore; tests now pin
  that the batch-upload path is gated solely on `codeEnvAvailable` and
  that no apiKey is threaded through.
- `handlers.spec.ts`: same — just the `codeEnvAvailable` gate pins
  remain.
- `classification.spec.ts`: remove the two tests that asserted
  `loadAuthValues` was (not) called for PTC.
- `definitions.spec.ts`: drop every `loadAuthValues: mockLoadAuthValues`
  entry from the deps shape.
- `process.spec.js`: strip the mock of `EnvVar.CODE_API_KEY`.

**Comment hygiene:**
- `tools.ts`, `initialize.ts`, `registry/definitions.ts`: shortened
  stale comment references to "legacy `execute_code` tool" without
  naming the retired env var.

Tests verified: 678 packages/api tests pass, 836 backend api tests
pass. Typecheck clean, lint clean. Only remaining CODE_API_KEY
mentions in the code are two regression-guard assertions:
- `crud.spec.js`: pins "no X-API-Key header" stays absent.
- `skillConfigurable.spec.ts`: pins `configurable` never grows a
  `codeApiKey` field.

* 🧹 chore: Remove the last two CODE_API_KEY name mentions in LibreChat

Follow-up to the prior full deprecation commit: two tests still named
the retired identifier in their regression-guard assertions.

- `packages/api/src/agents/skillConfigurable.spec.ts`: drop the
  "does not inject a codeApiKey key" test. The `codeApiKey` field is
  gone from the production configurable shape, so an absence-assertion
  naming it re-introduces the retired identifier in code.
- `api/server/services/Files/Code/crud.spec.js`: rename the
  "without an X-API-Key header" case back to "should request stream
  response from the correct URL" and drop the
  `expect(headers).not.toHaveProperty('X-API-Key')` assertion. The
  surrounding request-shape checks (URL, timeout, responseType) still
  pin the behavior; the explicit header-absence line was named-after
  the deprecated contract.

Result: `grep -rn "CODE_API_KEY\|codeApiKey\|LIBRECHAT_CODE_API_KEY"`
against the LibreChat source tree returns zero hits. The only
remaining `X-API-Key` strings in this repo are on unrelated OpenAPI
Action + MCP server auth configurations, where the string is
user-facing config, not a LibreChat-owned identifier.

Tests: 677 packages/api pass (2 pre-existing summarization e2e failures
unrelated); 126 api-workspace controller/service tests pass.
Typecheck and lint clean.

* 🎯 fix: Narrow codeEnvAvailable to per-agent (admin cap AND agent.tools)

Before this commit, `codeEnvAvailable` was computed in the three JS
controllers as the admin-level capability flag only
(`enabledCapabilities.has(AgentCapabilities.execute_code)`) and passed
through `initializeAgent` → `injectSkillCatalog` / `primeInvokedSkills` /
`enrichWithSkillConfigurable` unchanged. A skills-only agent whose
`tools` array didn't include `execute_code` still got `bash_tool`
registered (via `injectSkillCatalog`) and skill files re-primed to the
sandbox on every turn — wrong, because the agent never opted in to
code execution.

**Fix:** `initializeAgent` now computes the per-agent effective value
once as `params.codeEnvAvailable === true && agent.tools.includes(Tools.execute_code)`,
reuses the same boolean for:

1. The `execute_code` → `bash_tool + read_file` expansion gate
   (previously already consulted `agent.tools`; now shares the single
   `effectiveCodeEnvAvailable` binding).
2. The `injectSkillCatalog` call (previously got the raw admin flag).
3. The returned `InitializedAgent.codeEnvAvailable` field (new, typed as
   required boolean).

**Controllers (initialize.js, openai.js, responses.js):** store
`primaryConfig.codeEnvAvailable` in `agentToolContexts.set(primaryId, ...)`,
capture `config.codeEnvAvailable` in every handoff `onAgentInitialized`
callback, and read it from the per-agent ctx inside the
`toolExecuteOptions.loadTools` runtime closure. The hoisted
`const codeEnvAvailable = enabledCapabilities.has(...)` locals in the
two OpenAI-compat controllers are gone — they were shadowing the
narrowed per-agent value.

**primeInvokedSkills:** `handlePrimeInvokedSkills` in
`services/Endpoints/agents/initialize.js` now uses
`primaryConfig.codeEnvAvailable` (per-agent, narrowed) instead of the
raw admin flag. A skills-only primary agent won't re-prime historical
skill files to the sandbox even when the admin enabled the capability
globally.

**Efficiency:** one extra `&&` in `initializeAgent`. No runtime hot-path
cost — the `includes()` scan on `agent.tools` was already happening for
the `execute_code` expansion gate; it's now just bound to a local. Tool
execution closures read `ctx.codeEnvAvailable === true` (property
access + strict equality, O(1)).

**Ephemeral-agent note:** per-agent narrowing is authoritative for both
persisted and ephemeral flows. The ephemeral toggle
(`ephemeralAgent.execute_code`) is reconciled into `agent.tools`
upstream in `packages/api/src/agents/added.ts`, so
`agent.tools.includes('execute_code')` is the single source of truth
by the time `initializeAgent` runs.

**Tests:** two new regression tests pin the narrowing contract:

- `initialize.test.ts` — four-quadrant matrix on
  `InitializedAgent.codeEnvAvailable` (cap on × agent asks, cap on ×
  doesn't ask, cap off × asks, neither). Catches future refactors that
  drop either half of the AND.
- `skills.test.ts` — `injectSkillCatalog` with `codeEnvAvailable: false`
  against an active skill catalog must NOT register `bash_tool` even
  though it still registers `read_file` + `skill`. This is the state
  a skills-only agent gets post-narrowing.

All 191 affected packages/api tests pass + 836 backend api tests pass.
Typecheck clean, lint clean.

* 🧽 refactor: Comprehensive-review polish — hoist tool defs, pin verifyToolAuth contract, doc appConfig

Addresses the comprehensive review of Phase 8. Findings mapped:

**#1 (MINOR): `verifyToolAuth` unconditional auth for execute_code**
- Added doc comment explicitly stating the deployment contract
  (admin capability → reachable sandbox; no per-check health probe
  to keep UI-gate queries O(1)).
- New `api/server/controllers/__tests__/tools.verifyToolAuth.spec.js`
  with 4 regression tests pinning the contract:
  1. `authenticated: true` + `SYSTEM_DEFINED` for execute_code.
  2. 404 for unknown tool IDs.
  3. `loadAuthValues` is never consulted (catches a future revert
     that would resurface the per-user key-entry dialog).
  4. Response `message` is never `USER_PROVIDED`.

**#2 (MINOR): `openai/service.ts` undocumented `appConfig` dependency**
- Expanded the `ChatCompletionDependencies.appConfig` JSDoc to spell
  out that omitting it silently disables code execution for agents
  with `execute_code` in their tools. External consumers of
  `createAgentChatCompletion` now have the contract documented at
  the type boundary.

**#5 (NIT): `registerCodeExecutionTools` re-allocates tool defs**
- Hoisted `READ_FILE_DEF` and `BASH_TOOL_DEF` to module-level
  `Object.freeze`d constants. The shapes derive entirely from
  static `@librechat/agents` exports, so a single frozen object per
  tool is safe to share across every agent init. Eliminates the
  ~4-property allocations on every call (including the common
  second-call no-op path).

**#6 (NIT): Verbose history-priming comment in initialize.js**
- Trimmed the 16-line `handlePrimeInvokedSkills` block to a 5-line
  summary with `@see InitializedAgent.codeEnvAvailable` pointer.
  The canonical narrowing explanation lives on the type; the
  controller comment is just the ACL-vs-capability rationale.

**Skipped:**

- #3 (memory spec tests a mirror function): reviewer self-dismissed
  as a design tradeoff; the enum-literal pin already catches the
  highest-risk drift vector.
- #4 (cross-repo contract for `createCodeExecutionTool`): user will
  explicitly install the latest `@librechat/agents` dev version
  once the companion PR publishes, so the version pin will be
  authoritative.
- #7 (migration/deprecation note for self-hosters): out of scope
  per user direction — release notes handle this.

Tests verified: 679 packages/api + 840 backend api tests pass.
Typecheck + lint clean.

* 🔧 chore: Update @librechat/agents version to 3.1.68-dev.1 across package-lock and package.json files

This commit updates the version of the `@librechat/agents` package from `3.1.68-dev.0` to `3.1.68-dev.1` in the `package-lock.json` and relevant `package.json` files. This change ensures consistency across the project and incorporates any updates or fixes from the new version.
2026-04-25 04:02:01 -04:00
Danny Avila
e2e3284713
🦉 feat: Claude Opus 4.7 Model Support (#12698)
* 🦉 feat: Claude Opus 4.7 Model Support

- Add `claude-opus-4-7` to shared Anthropic models and `anthropic.claude-opus-4-7` to Bedrock models
- Register 1M context window and 128K max output in anthropic token maps
- Add token pricing ($5/$25), cache rates (6.25/0.5), and premium tier ($10/$37.50 above 200K) in tx.ts
- Update `.env.example` with Opus 4.7 IDs in `ANTHROPIC_MODELS` and `BEDROCK_AWS_MODELS` examples
- Add parallel Opus 4.7 test cases for token/cache/premium rates, context length, max output, name-variation matching, and 1M-context qualification

* feat: Add `xhigh` Effort Level for Opus 4.7

- Add `xhigh` variant to `AnthropicEffort` enum between `high` and `max`
- Expose `xhigh` in `anthropicSettings.effort.options` and the UI slider `enumMappings`
- Reuse existing `com_ui_xhigh` translation key

* test: Cover `xhigh` Effort and Exact Opus 4.7 Premium Rates

- Assert `xhigh` position (between high and max), inclusion in
  `anthropicSettings.effort.options`, zod acceptance, and rejection of
  unknown values in schemas.spec.ts
- Verify bedrockInputParser emits `output_config: { effort: 'xhigh' }`
  for adaptive `anthropic.claude-opus-4-7`
- Verify getLLMConfig sets adaptive thinking and `output_config.effort =
  'xhigh'` for `claude-opus-4-7`
- Pin Opus 4.7 premium pricing to exact threshold/prompt/completion
  values (200000 / 10 / 37.5) so silent rate drift fails the test
2026-04-16 14:51:00 -04:00
Ivan
f6b73938af
📝 docs: add RAG_API_URL to .env.example (#12665) 2026-04-15 09:46:29 -04:00
Danny Avila
b5c097e5c7
⚗️ feat: Agent Context Compaction/Summarization (#12287)
* chore: imports/types

Add summarization config and package-level summarize handler contracts

Register summarize handlers across server controller paths

Port cursor dual-read/dual-write summary support and UI status handling

Selectively merge cursor branch files for BaseClient summary content
block detection (last-summary-wins), dual-write persistence, summary
block unit tests, and on_summarize_status SSE event handling with
started/completed/failed branches.

Co-authored-by: Cursor <cursoragent@cursor.com>

refactor: type safety

feat: add localization for summarization status messages

refactor: optimize summary block detection in BaseClient

Updated the logic for identifying existing summary content blocks to use a reverse loop for improved efficiency. Added a new test case to ensure the last summary content block is updated correctly when multiple summary blocks exist.

chore: add runName to chainOptions in AgentClient

refactor: streamline summarization configuration and handler integration

Removed the deprecated summarizeNotConfigured function and replaced it with a more flexible createSummarizeFn. Updated the summarization handler setup across various controllers to utilize the new function, enhancing error handling and configuration resolution. Improved overall code clarity and maintainability by consolidating summarization logic.

feat(summarization): add staged chunk-and-merge fallback

feat(usage): track summarization usage separately from messages

feat(summarization): resolve prompt from config in runtime

fix(endpoints): use @librechat/api provider config loader

refactor(agents): import getProviderConfig from @librechat/api

chore: code order

feat(app-config): auto-enable summarization when configured

feat: summarization config

refactor(summarization): streamline persist summary handling and enhance configuration validation

Removed the deprecated createDeferredPersistSummary function and integrated a new createPersistSummary function for MongoDB persistence. Updated summarization handlers across various controllers to utilize the new persistence method. Enhanced validation for summarization configuration to ensure provider, model, and prompt are properly set, improving error handling and overall robustness.

refactor(summarization): update event handling and remove legacy summarize handlers

Replaced the deprecated summarization handlers with new event-driven handlers for summarization start and completion across multiple controllers. This change enhances the clarity of the summarization process and improves the integration of summarization events in the application. Additionally, removed unused summarization functions and streamlined the configuration loading process.

refactor(summarization): standardize event names in handlers

Updated event names in the summarization handlers to use constants from GraphEvents for consistency and clarity. This change improves maintainability and reduces the risk of errors related to string literals in event handling.

feat(summarization): enhance usage tracking for summarization events

Added logic to track summarization usage in multiple controllers by checking the current node type. If the node indicates a summarization task, the usage type is set accordingly. This change improves the granularity of usage data collected during summarization processes.

feat(summarization): integrate SummarizationConfig into AppSummarizationConfig type

Enhanced the AppSummarizationConfig type by extending it with the SummarizationConfig type from librechat-data-provider. This change improves type safety and consistency in the summarization configuration structure.

test: add end-to-end tests for summarization functionality

Introduced a comprehensive suite of end-to-end tests for the summarization feature, covering the full LibreChat pipeline from message creation to summarization. This includes a new setup file for environment configuration and a Jest configuration specifically for E2E tests. The tests utilize real API keys and ensure proper integration with the summarization process, enhancing overall test coverage and reliability.

refactor(summarization): include initial summary in formatAgentMessages output

Updated the formatAgentMessages function to return an initial summary alongside messages and index token count map. This change is reflected in multiple controllers and the corresponding tests, enhancing the summarization process by providing additional context for each agent's response.

refactor: move hydrateMissingIndexTokenCounts to tokenMap utility

Extracted the hydrateMissingIndexTokenCounts function from the AgentClient and related tests into a new tokenMap utility file. This change improves code organization and reusability, allowing for better management of token counting logic across the application.

refactor(summarization): standardize step event handling and improve summary rendering

Refactored the step event handling in the useStepHandler and related components to utilize constants for event names, enhancing consistency and maintainability. Additionally, improved the rendering logic in the Summary component to conditionally display the summary text based on its availability, providing a better user experience during the summarization process.

feat(summarization): introduce baseContextTokens and reserveTokensRatio for improved context management

Added baseContextTokens to the InitializedAgent type to calculate the context budget based on agentMaxContextNum and maxOutputTokensNum. Implemented reserveTokensRatio in the createRun function to allow configurable context token management. Updated related tests to validate these changes and ensure proper functionality.

feat(summarization): add minReserveTokens, context pruning, and overflow recovery configurations

Introduced new configuration options for summarization, including minReserveTokens, context pruning settings, and overflow recovery parameters. Updated the createRun function to accommodate these new options and added a comprehensive test suite to validate their functionality and integration within the summarization process.

feat(summarization): add updatePrompt and reserveTokensRatio to summarization configuration

Introduced an updatePrompt field for updating existing summaries with new messages, enhancing the flexibility of the summarization process. Additionally, added reserveTokensRatio to the configuration schema, allowing for improved management of token allocation during summarization. Updated related tests to validate these new features.

feat(logging): add on_agent_log event handler for structured logging

Implemented an on_agent_log event handler in both the agents' callbacks and responses to facilitate structured logging of agent activities. This enhancement allows for better tracking and debugging of agent interactions by logging messages with associated metadata. Updated the summarization process to ensure proper handling of log events.

fix: remove duplicate IBalanceUpdate interface declaration

perf(usage): single-pass partition of collectedUsage

Replace two Array.filter() passes with a single for-of loop that
partitions message vs. summarization usages in one iteration.

fix(BaseClient): shallow-copy message content before mutating and preserve string content

Avoid mutating the original message.content array in-place when
appending a summary block. Also convert string content to a text
content part instead of silently discarding it.

fix(ui): fix Part.tsx indentation and useStepHandler summarize-complete handling

- Fix SUMMARY else-if branch indentation in Part.tsx to match chain level
- Guard ON_SUMMARIZE_COMPLETE with didFinalize flag to avoid unnecessary
  re-renders when no summarizing parts exist
- Protect against undefined completeData.summary instead of unsafe spread

fix(agents): use strict enabled check for summarization handlers

Change summarizationConfig?.enabled !== false to === true so handlers
are not registered when summarizationConfig is undefined.

chore: fix initializeClient JSDoc and move DEFAULT_RESERVE_RATIO to module scope

refactor(Summary): align collapse/expand behavior with Reasoning component

- Single render path instead of separate streaming vs completed branches
- Use useMessageContext for isSubmitting/isLatestMessage awareness so
  the "Summarizing..." label only shows during active streaming
- Default to collapsed (matching Reasoning), user toggles to expand
- Add proper aria attributes (aria-hidden, role, aria-controls, contentId)
- Hide copy button while actively streaming

feat(summarization): default to self-summarize using agent's own provider/model

When no summarization config is provided (neither in librechat.yaml nor
on the agent), automatically enable summarization using the agent's own
provider and model. The agents package already provides default prompts,
so no prompt configuration is needed.

Also removes the dead resolveSummarizationLLMConfig in summarize.ts
(and its spec) — run.ts buildAgentContext is the single source of truth
for summarization config resolution. Removes the duplicate
RuntimeSummarizationConfig local type in favor of the canonical
SummarizationConfig from data-provider.

chore: schema and type cleanup for summarization

- Add trigger field to summarizationAgentOverrideSchema so per-agent
  trigger overrides in librechat.yaml are not silently stripped by Zod
- Remove unused SummarizationStatus type from runs.ts
- Make AppSummarizationConfig.enabled non-optional to reflect the
  invariant that loadSummarizationConfig always sets it

refactor(responses): extract duplicated on_agent_log handler

refactor(run): use agents package types for summarization config

Import SummarizationConfig, ContextPruningConfig, and
OverflowRecoveryConfig from @librechat/agents and use them to
type-check the translation layer in buildAgentContext. This ensures
the config object passed to the agent graph matches what it expects.

- Use `satisfies AgentSummarizationConfig` on the config object
- Cast contextPruningConfig and overflowRecoveryConfig to agents types
- Properly narrow trigger fields from DeepPartial to required shape

feat(config): add maxToolResultChars to base endpoint schema

Add maxToolResultChars to baseEndpointSchema so it can be configured
on any endpoint in librechat.yaml. Resolved during agent initialization
using getProviderConfig's endpoint resolution: custom endpoint config
takes precedence, then the provider-specific endpoint config, then the
shared `all` config.

Passed through to the agents package ToolNode, which uses it to cap
tool result length before it enters the context window. When not
configured, the agents package computes a sensible default from
maxContextTokens.

fix(summarization): forward agent model_parameters in self-summarize default

When no explicit summarization config exists, the self-summarize
default now forwards the agent's model_parameters as the
summarization parameters. This ensures provider-specific settings
(e.g. Bedrock region, credentials, endpoint host) are available
when the agents package constructs the summarization LLM.

fix(agents): register summarization handlers by default

Change the enabled gate from === true to !== false so handlers
register when no explicit summarization config exists. This aligns
with the self-summarize default where summarization is always on
unless explicitly disabled via enabled: false.

refactor(summarization): let agents package inherit clientOptions for self-summarize

Remove model_parameters forwarding from the self-summarize default.
The agents package now reuses the agent's own clientOptions when the
summarization provider matches the agent's provider, inheriting all
provider-specific settings (region, credentials, proxy, etc.)
automatically.

refactor(summarization): use MessageContentComplex[] for summary content

Unify summary content to always use MessageContentComplex[] arrays,
matching the pattern used by on_message_delta. No more string | array
unions — content is always an array of typed blocks ({ type: 'text',
text: '...' } for text, { type: 'reasoning_content', ... } for
reasoning).

Agents package:
- SummaryContentBlock.content: MessageContentComplex[] (was string)
- tokenCount now optional (not sent on deltas)
- Removed reasoning field — reasoning is now a content block type
- streamAndCollect normalizes all chunks to content block arrays
- Delta events pass content blocks directly

LibreChat:
- SummaryContentPart.content: Agents.MessageContentComplex[]
- Updated Part.tsx, Summary.tsx, useStepHandler.ts, BaseClient.js
- Summary.tsx derives display text from content blocks via useMemo
- Aggregator uses simple array spread

refactor(summarization): enhance summary handling and text extraction

- Updated BaseClient.js to improve summary text extraction, accommodating both legacy and new content formats.
- Modified summarization logic to ensure consistent handling of summary content across different message formats.
- Adjusted test cases in summarization.e2e.spec.js to utilize the new summary text extraction method.
- Refined SSE useStepHandler to initialize summary content as an array.
- Updated configuration schema by removing unused minReserveTokens field.
- Cleaned up SummaryContentPart type by removing rangeHash property.

These changes streamline the summarization process and ensure compatibility with various content structures.

refactor(summarization): streamline usage tracking and logging

- Removed direct checks for summarization nodes in ModelEndHandler and replaced them with a dedicated markSummarizationUsage function for better readability and maintainability.
- Updated OpenAIChatCompletionController and responses handlers to utilize the new markSummarizationUsage function for setting usage types.
- Enhanced logging functionality by ensuring the logger correctly handles different log levels.
- Introduced a new useCopyToClipboard hook in the Summary component to encapsulate clipboard copy logic, improving code reusability and clarity.

These changes improve the overall structure and efficiency of the summarization handling and logging processes.

refactor(summarization): update summary content block documentation

- Removed outdated comment regarding the last summary content block in BaseClient.js.
- Added a new comment to clarify the purpose of the findSummaryContentBlock method, ensuring consistency in documentation.

These changes enhance code clarity and maintainability by providing accurate descriptions of the summarization logic.

refactor(summarization): update summary content structure in tests

- Modified the summarization content structure in e2e tests to use an array format for text, aligning with recent changes in summary handling.
- Updated test descriptions to clarify the behavior of context token calculations, ensuring consistency and clarity in the tests.

These changes enhance the accuracy and maintainability of the summarization tests by reflecting the updated content structure.

refactor(summarization): remove legacy E2E test setup and configuration

- Deleted the e2e-setup.js and jest.e2e.config.js files, which contained legacy configurations for E2E tests using real API keys.
- Introduced a new summarization.e2e.ts file that implements comprehensive E2E backend integration tests for the summarization process, utilizing real AI providers and tracking summaries throughout the run.

These changes streamline the testing framework by consolidating E2E tests into a single, more robust file while removing outdated configurations.

refactor(summarization): enhance E2E tests and error handling

- Added a cleanup step to force exit after all tests to manage Redis connections.
- Updated the summarization model to 'claude-haiku-4-5-20251001' for consistency across tests.
- Improved error handling in the processStream function to capture and return processing errors.
- Enhanced logging for cross-run tests and tight context scenarios to provide better insights into test execution.

These changes improve the reliability and clarity of the E2E tests for the summarization process.

refactor(summarization): enhance test coverage for maxContextTokens behavior

- Updated run-summarization.test.ts to include a new test case ensuring that maxContextTokens does not exceed user-defined limits, even when calculated ratios suggest otherwise.
- Modified summarization.e2e.ts to replace legacy UsageMetadata type with a more appropriate type for collectedUsage, improving type safety and clarity in the test setup.

These changes improve the robustness of the summarization tests by validating context token constraints and refining type definitions.

feat(summarization): add comprehensive E2E tests for summarization process

- Introduced a new summarization.e2e.test.ts file that implements extensive end-to-end integration tests for the summarization pipeline, covering the full flow from LibreChat to agents.
- The tests utilize real AI providers and include functionality to track summaries during and between runs.
- Added necessary cleanup steps to manage Redis connections post-tests and ensure proper exit.

These changes enhance the testing framework by providing robust coverage for the summarization process, ensuring reliability and performance under real-world conditions.

fix(service): import logger from winston configuration

- Removed the import statement for logger from '@librechat/data-schemas' and replaced it with an import from '~/config/winston'.
- This change ensures that the logger is correctly sourced from the updated configuration, improving consistency in logging practices across the application.

refactor(summary): simplify Summary component and enhance token display

- Removed the unused `meta` prop from the `SummaryButton` component to streamline its interface.
- Updated the token display logic to use a localized string for better internationalization support.
- Adjusted the rendering of the `meta` information to improve its visibility within the `Summary` component.

These changes enhance the clarity and usability of the Summary component while ensuring better localization practices.

feat(summarization): add maxInputTokens configuration for summarization

- Introduced a new `maxInputTokens` property in the summarization configuration schema to control the amount of conversation context sent to the summarizer, with a default value of 10000.
- Updated the `createRun` function to utilize the new `maxInputTokens` setting, allowing for more flexible summarization based on agent context.

These changes enhance the summarization capabilities by providing better control over input token limits, improving the overall summarization process.

refactor(summarization): simplify maxInputTokens logic in createRun function

- Updated the logic for the `maxInputTokens` property in the `createRun` function to directly use the agent's base context tokens when the resolved summarization configuration does not specify a value.
- This change streamlines the configuration process and enhances clarity in how input token limits are determined for summarization.

These modifications improve the maintainability of the summarization configuration by reducing complexity in the token calculation logic.

feat(summary): enhance Summary component to display meta information

- Updated the SummaryContent component to accept an optional `meta` prop, allowing for additional contextual information to be displayed above the main content.
- Adjusted the rendering logic in the Summary component to utilize the new `meta` prop, improving the visibility of supplementary details.

These changes enhance the user experience by providing more context within the Summary component, making it clearer and more informative.

refactor(summarization): standardize reserveRatio configuration in summarization logic

- Replaced instances of `reserveTokensRatio` with `reserveRatio` in the `createRun` function and related tests to unify the terminology across the codebase.
- Updated the summarization configuration schema to reflect this change, ensuring consistency in how the reserve ratio is defined and utilized.
- Removed the per-agent override logic for summarization configuration, simplifying the overall structure and enhancing clarity.

These modifications improve the maintainability and readability of the summarization logic by standardizing the configuration parameters.

* fix: circular dependency of `~/models`

* chore: update logging scope in agent log handlers

Changed log scope from `[agentus:${data.scope}]` to `[agents:${data.scope}]` in both the callbacks and responses controllers to ensure consistent logging format across the application.

* feat: calibration ratio

* refactor(tests): update summarizationConfig tests to reflect changes in enabled property

Modified tests to check for the new `summarizationEnabled` property instead of the deprecated `enabled` field in the summarization configuration. This change ensures that the tests accurately validate the current configuration structure and behavior of the agents.

* feat(tests): add markSummarizationUsage mock for improved test coverage

Introduced a mock for the markSummarizationUsage function in the responses unit tests to enhance the testing of summarization usage tracking. This addition supports better validation of summarization-related functionalities and ensures comprehensive test coverage for the agents' response handling.

* refactor(tests): simplify event handler setup in createResponse tests

Removed redundant mock implementations for event handlers in the createResponse unit tests, streamlining the setup process. This change enhances test clarity and maintainability while ensuring that the tests continue to validate the correct behavior of usage tracking during on_chat_model_end events.

* refactor(agents): move calibration ratio capture to finally block

Reorganized the logic for capturing the calibration ratio in the AgentClient class to ensure it is executed in the finally block. This change guarantees that the ratio is captured even if the run is aborted, enhancing the reliability of the response message persistence. Removed redundant code and improved clarity in the handling of context metadata.

* refactor(agents): streamline bulk write logic in recordCollectedUsage function

Removed redundant bulk write operations and consolidated document handling in the recordCollectedUsage function. The logic now combines all documents into a single bulk write operation, improving efficiency and reducing error handling complexity. Updated logging to provide consistent error messages for bulk write failures.

* refactor(agents): enhance summarization configuration resolution in createRun function

Streamlined the summarization configuration logic by introducing a base configuration and allowing for overrides from agent-specific settings. This change improves clarity and maintainability, ensuring that the summarization configuration is consistently applied while retaining flexibility for customization. Updated the handling of summarization parameters to ensure proper integration with the agent's model and provider settings.

* refactor(agents): remove unused tokenCountMap and streamline calibration ratio handling

Eliminated the unused tokenCountMap variable from the AgentClient class to enhance code clarity. Additionally, streamlined the logic for capturing the calibration ratio by using optional chaining and a fallback value, ensuring that context metadata is consistently defined. This change improves maintainability and reduces potential confusion in the codebase.

* refactor(agents): extract agent log handler for improved clarity and reusability

Refactored the agent log handling logic by extracting it into a dedicated function, `agentLogHandler`, enhancing code clarity and reusability across different modules. Updated the event handlers in both the OpenAI and responses controllers to utilize the new handler, ensuring consistent logging behavior throughout the application.

* test: add summarization event tests for useStepHandler

Implemented a series of tests for the summarization events in the useStepHandler hook. The tests cover scenarios for ON_SUMMARIZE_START, ON_SUMMARIZE_DELTA, and ON_SUMMARIZE_COMPLETE events, ensuring proper handling of summarization logic, including message accumulation and finalization. This addition enhances test coverage and validates the correct behavior of the summarization process within the application.

* refactor(config): update summarizationTriggerSchema to use enum for type validation

Changed the type of the `type` field in the summarizationTriggerSchema from a string to an enum with a single value 'token_count'. This modification enhances type safety and ensures that only valid types are accepted in the configuration, improving overall clarity and maintainability of the schema.

* test(usage): add bulk write tests for message and summarization usage

Implemented tests for the bulk write functionality in the recordCollectedUsage function, covering scenarios for combined message and summarization usage, summarization-only usage, and message-only usage. These tests ensure correct document handling and token rollup calculations, enhancing test coverage and validating the behavior of the usage tracking logic.

* refactor(Chat): enhance clipboard copy functionality and type definitions in Summary component

Updated the Summary component to improve the clipboard copy functionality by handling clipboard permission errors. Refactored type definitions for SummaryProps to use a more specific type, enhancing type safety. Adjusted the SummaryButton and FloatingSummaryBar components to accept isCopied and onCopy props, promoting better separation of concerns and reusability.

* chore(translations): remove unused "Expand Summary" key from English translations

Deleted the "Expand Summary" key from the English translation file to streamline the localization resources and improve clarity in the user interface. This change helps maintain an organized and efficient translation structure.

* refactor: adjust token counting for Claude model to account for API discrepancies

Implemented a correction factor for token counting when using the Claude model, addressing discrepancies between Anthropic's API and local tokenizer results. This change ensures accurate token counts by applying a scaling factor, improving the reliability of token-related functionalities.

* refactor(agents): implement token count adjustment for Claude model messages

Added a method to adjust token counts for messages processed by the Claude model, applying a correction factor to align with API expectations. This enhancement improves the accuracy of token counting, ensuring reliable functionality when interacting with the Claude model.

* refactor(agents): token counting for media content in messages

Introduced a new method to estimate token costs for image and document blocks in messages, improving the accuracy of token counting. This enhancement ensures that media content is properly accounted for, particularly for the Claude model, by integrating additional token estimation logic for various content types. Updated the token counting function to utilize this new method, enhancing overall reliability and functionality.

* chore: fix missing import

* fix(agents): clamp baseContextTokens and document reserve ratio change

Prevent negative baseContextTokens when maxOutputTokens exceeds the
context window (misconfigured models). Document the 10%→5% default
reserve ratio reduction introduced alongside summarization.

* fix(agents): include media tokens in hydrated token counts

Add estimateMediaTokensForMessage to createTokenCounter so the hydration
path (used by hydrateMissingIndexTokenCounts) matches the precomputed
path in AgentClient.getTokenCountForMessage. Without this, messages
containing images or documents were systematically undercounted during
hydration, risking context window overflow.

Add 34 unit tests covering all block-type branches of
estimateMediaTokensForMessage.

* fix(agents): include summarization output tokens in usage return value

The returned output_tokens from recordCollectedUsage now reflects all
billed LLM calls (message + summarization). Previously, summarization
completions were billed but excluded from the returned metadata, causing
a discrepancy between what users were charged and what the response
message reported.

* fix(tests): replace process.exit with proper Redis cleanup in e2e test

The summarization E2E test used process.exit(0) to work around a Redis
connection opened at import time, which killed the Jest runner and
bypassed teardown. Use ioredisClient.quit() and keyvRedisClient.disconnect()
for graceful cleanup instead.

* fix(tests): update getConvo imports in OpenAI and response tests

Refactor test files to import getConvo from the main models module instead of the Conversation submodule. This change ensures consistency across tests and simplifies the import structure, enhancing maintainability.

* fix(clients): improve summary text validation in BaseClient

Refactor the summary extraction logic to ensure that only non-empty summary texts are considered valid. This change enhances the robustness of the message processing by utilizing a dedicated method for summary text retrieval, improving overall reliability.

* fix(config): replace z.any() with explicit union in summarization schema

Model parameters (temperature, top_p, etc.) are constrained to
primitive types rather than the policy-violating z.any().

* refactor(agents): deduplicate CLAUDE_TOKEN_CORRECTION constant

Export from the TS source in packages/api and import in the JS client,
eliminating the static class property that could drift out of sync.

* refactor(agents): eliminate duplicate selfProvider in buildAgentContext

selfProvider and provider were derived from the same expression with
different type casts. Consolidated to a single provider variable.

* refactor(agents): extract shared SSE handlers and restrict log levels

- buildSummarizationHandlers() factory replaces triplicated handler
  blocks across responses.js and openai.js
- agentLogHandlerObj exported from callbacks.js for consistent reuse
- agentLogHandler restricted to an allowlist of safe log levels
  (debug, info, warn, error) instead of accepting arbitrary strings

* fix(SSE): batch summarize deltas, add exhaustiveness check, conditional error announcement

- ON_SUMMARIZE_DELTA coalesces rapid-fire renders via requestAnimationFrame
  instead of calling setMessages per chunk
- Exhaustive never-check on TStepEvent catches unhandled variants at
  compile time when new StepEvents are added
- ON_SUMMARIZE_COMPLETE error announcement only fires when a summary
  part was actually present and removed

* feat(agents): persist instruction overhead in contextMeta and seed across runs

Extend contextMeta with instructionOverhead and toolCount so the
provider-observed instruction overhead is persisted on the response message
and seeded into the pruner on subsequent runs. This enables the pruner to
use a calibrated budget from the first call instead of waiting for a
provider observation, preventing the ratio collapse caused by local
tokenizer overestimating tool schema tokens.

The seeded overhead is only used when encoding and tool count match
between runs, ensuring stale values from different configurations
are discarded.

* test(agents): enhance OpenAI test mocks for summarization handlers

Updated the OpenAI test suite to include additional mock implementations for summarization handlers, including buildSummarizationHandlers, markSummarizationUsage, and agentLogHandlerObj. This improves test coverage and ensures consistent behavior during testing.

* fix(agents): address review findings for summarization v2

Cancel rAF on unmount to prevent stale Recoil writes from dead
component context. Clear orphaned summarizing:true parts when
ON_SUMMARIZE_COMPLETE arrives without a summary payload. Add null
guard and safe spread to agentLogHandler. Handle Anthropic-format
base64 image/* documents in estimateMediaTokensForMessage. Use
role="region" for expandable summary content. Add .describe() to
contextMeta Zod fields. Extract duplicate usage loop into helper.

* refactor: simplify contextMeta to calibrationRatio + encoding only

Remove instructionOverhead and toolCount from cross-run persistence —
instruction tokens change too frequently between runs (prompt edits,
tool changes) for a persisted seed to be reliable. The intra-run
calibration in the pruner still self-corrects via provider observations.
contextMeta now stores only the tokenizer-bias ratio and encoding,
which are stable across instruction changes.

* test(SSE): enhance useStepHandler tests for ON_SUMMARIZE_COMPLETE behavior

Updated the test for ON_SUMMARIZE_COMPLETE to clarify that it finalizes the existing part with summarizing set to false when the summary is undefined. Added assertions to verify the correct behavior of message updates and the state of summary parts.

* refactor(BaseClient): remove handleContextStrategy and truncateToolCallOutputs functions

Eliminated the handleContextStrategy method from BaseClient to streamline message handling. Also removed the truncateToolCallOutputs function from the prompts module, simplifying the codebase and improving maintainability.

* refactor: add AGENT_DEBUG_LOGGING option and refactor token count handling in BaseClient

Introduced AGENT_DEBUG_LOGGING to .env.example for enhanced debugging capabilities. Refactored token count handling in BaseClient by removing the handleTokenCountMap method and simplifying token count updates. Updated AgentClient to log detailed token count recalculations and adjustments, improving traceability during message processing.

* chore: update dependencies in package-lock.json and package.json files

Bumped versions of several dependencies, including @librechat/agents to ^3.1.62 and various AWS SDK packages to their latest versions. This ensures compatibility and incorporates the latest features and fixes.

* chore: imports order

* refactor: extract summarization config resolution from buildAgentContext

* refactor: rename and simplify summarization configuration shaping function

* refactor: replace AgentClient token counting methods with single-pass pure utility

Extract getTokenCount() and getTokenCountForMessage() from AgentClient
into countFormattedMessageTokens(), a pure function in packages/api that
handles text, tool_call, image, and document content types in one loop.

- Decompose estimateMediaTokensForMessage into block-level helpers
  (estimateImageDataTokens, estimateImageBlockTokens, estimateDocumentBlockTokens)
  shared by both estimateMediaTokensForMessage and the new single-pass function
- Remove redundant per-call getEncoding() resolution (closure captures once)
- Remove deprecated gpt-3.5-turbo-0301 model branching
- Drop this.getTokenCount guard from BaseClient.sendMessage

* refactor: streamline token counting in createTokenCounter function

Simplified the createTokenCounter function by removing the media token estimation and directly calculating the token count. This change enhances clarity and performance by consolidating the token counting logic into a single pass, while maintaining compatibility with Claude's token correction.

* refactor: simplify summarization configuration types

Removed the AppSummarizationConfig type and directly used SummarizationConfig in the AppConfig interface. This change streamlines the type definitions and enhances consistency across the codebase.

* chore: import order

* fix: summarization event handling in useStepHandler

- Cancel pending summarizeDeltaRaf in clearStepMaps to prevent stale
  frames firing after map reset or component unmount
- Move announcePolite('summarize_completed') inside the didFinalize
  guard so screen readers only announce when finalization actually occurs
- Remove dead cleanup closure returned from stepHandler useCallback body
  that was never invoked by any caller

* fix: estimate tokens for non-PDF/non-image base64 document blocks

Previously estimateDocumentBlockTokens returned 0 for unrecognized MIME
types (e.g. text/plain, application/json), silently underestimating
context budget. Fall back to character-based heuristic or countTokens.

* refactor: return cloned usage from markSummarizationUsage

Avoid mutating LangChain's internal usage_metadata object by returning
a shallow clone with the usage_type tag. Update all call sites in
callbacks, openai, and responses controllers to use the returned value.

* refactor: consolidate debug logging loops in buildMessages

Merge the two sequential O(n) debug-logging passes over orderedMessages
into a single pass inside the map callback where all data is available.

* refactor: narrow SummaryContentPart.content type

Replace broad Agents.MessageContentComplex[] with the specific
Array<{ type: ContentTypes.TEXT; text: string }> that all producers
and consumers already use, improving compile-time safety.

* refactor: use single output array in recordCollectedUsage

Have processUsageGroup append to a shared array instead of returning
separate arrays that are spread into a third, reducing allocations.

* refactor: use for...in in hydrateMissingIndexTokenCounts

Replace Object.entries with for...in to avoid allocating an
intermediate tuple array during token map hydration.
2026-03-21 14:28:56 -04:00
mfish911
4e5ae28fa9
📡 feat: Support Unauthenticated SMTP Relays (#12322)
* allow smtp server that does not have authentication

* fix: align checkEmailConfig with optional SMTP credentials and add tests

Remove EMAIL_USERNAME/EMAIL_PASSWORD requirements from the hasSMTPConfig
predicate in checkEmailConfig() so the rest of the codebase (login,
startup checks, invite-user) correctly recognizes unauthenticated SMTP
as a valid email configuration.

Add a warning when only one of the two credential env vars is set,
in both sendEmail.js and checkEmailConfig(), to catch partial
misconfigurations early.

Add test coverage for both the transporter auth assembly in sendEmail.js
and the checkEmailConfig predicate in packages/api.

Document in .env.example that credentials are optional for
unauthenticated SMTP relays.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-03-20 13:07:39 -04:00
Airam Hernández Hernández
96f6976e00
🪂 fix: Automatic logout_hint Fallback for Oversized OpenID Token URLs (#12326)
* fix: automatic logout_hint fallback for long OpenID tokens

Implements OIDC RP-Initiated Logout cascading strategy to prevent errors when id_token_hint makes logout URL too long.

Automatically detects URLs exceeding configurable length and falls back to logout_hint only when URL is too long, preserving previous behavior when token is missing. Adds OPENID_MAX_LOGOUT_URL_LENGTH environment variable. Comprehensive test coverage with 20 tests. Works with any OpenID provider.

* fix: address review findings for OIDC logout URL length fallback

- Replace two-boolean tri-state (useIdTokenHint/urlTooLong) with a single
  string discriminant ('use_token'|'too_long'|'no_token') for clarity
- Fix misleading warning: differentiate 'url too long + no client_id' from
  'no token + no client_id' so operators get actionable advice
- Strict env var parsing: reject partial numeric strings like '500abc' that
  Number.parseInt silently accepted; use regex + Number() instead
- Pre-compute projected URL length from base URL + token length (JWT chars
  are URL-safe), eliminating the set-then-delete mutation pattern
- Extract parseMaxLogoutUrlLength helper for validation and early return
- Add tests: invalid env values, url-too-long + missing OPENID_CLIENT_ID,
  boundary condition (exact max vs max+1), cookie-sourced long token
- Remove redundant try/finally in 'respects custom limit' test
- Use empty value in .env.example to signal optional config (default: 2000)

---------

Co-authored-by: Airam Hernández Hernández <airam.hernandez@intelequia.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-03-20 12:46:57 -04:00
Danny Avila
fcb344da47
🛂 fix: MCP OAuth Race Conditions, CSRF Fallback, and Token Expiry Handling (#12171)
* fix: Implement race conditions in MCP OAuth flow

- Added connection mutex to coalesce concurrent `getUserConnection` calls, preventing multiple simultaneous attempts.
- Enhanced flow state management to retry once when a flow state is missing, improving resilience against race conditions.
- Introduced `ReauthenticationRequiredError` for better error handling when access tokens are expired or missing.
- Updated tests to cover new race condition scenarios and ensure proper handling of OAuth flows.

* fix: Stale PENDING flow detection and OAuth URL re-issuance

PENDING flows in handleOAuthRequired now check createdAt age — flows
older than 2 minutes are treated as stale and replaced instead of
joined. Fixes the case where a leftover PENDING flow from a previous
session blocks new OAuth initiation.

authorizationUrl is now stored in MCPOAuthFlowMetadata so that when a
second caller joins an active PENDING flow (e.g., the SSE-emitting path
in ToolService), it can re-issue the URL to the user via oauthStart.

* fix: CSRF fallback via active PENDING flow in OAuth callback

When the OAuth callback arrives without CSRF or session cookies (common
in the chat/SSE flow where cookies can't be set on streaming responses),
fall back to validating that a PENDING flow exists for the flowId. This
is safe because the flow was created server-side after JWT authentication
and the authorization code is PKCE-protected.

* test: Extract shared OAuth test server helpers

Move MockKeyv, getFreePort, trackSockets, and createOAuthMCPServer into
a shared helpers/oauthTestServer module. Enhance the test server with
refresh token support, token rotation, metadata discovery, and dynamic
client registration endpoints. Add InMemoryTokenStore for token storage
tests.

Refactor MCPOAuthRaceCondition.test.ts to import from shared helpers.

* test: Add comprehensive MCP OAuth test modules

MCPOAuthTokenStorage — 21 tests for storeTokens/getTokens with
InMemoryTokenStore: encrypt/decrypt round-trips, expiry calculation,
refresh callback wiring, ReauthenticationRequiredError paths.

MCPOAuthFlow — 10 tests against real HTTP server: token refresh with
stored client info, refresh token rotation, metadata discovery, dynamic
client registration, full store/retrieve/expire/refresh lifecycle.

MCPOAuthConnectionEvents — 5 tests for MCPConnection OAuth event cycle
with real OAuth-gated MCP server: oauthRequired emission on 401,
oauthHandled reconnection, oauthFailed rejection, token expiry detection.

MCPOAuthTokenExpiry — 12 tests for the token expiry edge case: refresh
success/failure paths, ReauthenticationRequiredError, PENDING flow CSRF
fallback, authorizationUrl metadata storage, full re-auth cycle after
refresh failure, concurrent expired token coalescing, stale PENDING
flow detection.

* test: Enhance MCP OAuth connection tests with cooldown reset

Added a `beforeEach` hook to clear the cooldown for `MCPConnection` before each test, ensuring a clean state. Updated the race condition handling in the tests to properly clear the timeout, improving reliability in the event data retrieval process.

* refactor: PENDING flow management and state recovery in MCP OAuth

- Introduced a constant `PENDING_STALE_MS` to define the age threshold for PENDING flows, improving the handling of stale flows.
- Updated the logic in `MCPConnectionFactory` and `FlowStateManager` to check the age of PENDING flows before joining or reusing them.
- Modified the `completeFlow` method to return false when the flow state is deleted, ensuring graceful handling of race conditions.
- Enhanced tests to validate the new behavior and ensure robustness against state recovery issues.

* refactor: MCP OAuth flow management and testing

- Updated the `completeFlow` method to log warnings when a tool flow state is not found during completion, improving error handling.
- Introduced a new `normalizeExpiresAt` function to standardize expiration timestamp handling across the application.
- Refactored token expiration checks in `MCPConnectionFactory` to utilize the new normalization function, ensuring consistent behavior.
- Added a comprehensive test suite for OAuth callback CSRF fallback logic, validating the handling of PENDING flows and their staleness.
- Enhanced existing tests to cover new expiration normalization logic and ensure robust flow state management.

* test: Add CSRF fallback tests for active PENDING flows in MCP OAuth

- Introduced new tests to validate CSRF fallback behavior when a fresh PENDING flow exists without cookies, ensuring successful OAuth callback handling.
- Added scenarios to reject requests when no PENDING flow exists, when only a COMPLETED flow is present, and when a PENDING flow is stale, enhancing the robustness of flow state management.
- Improved overall test coverage for OAuth callback logic, reinforcing the handling of CSRF validation failures.

* chore: imports order

* refactor: Update UserConnectionManager to conditionally manage pending connections

- Modified the logic in `UserConnectionManager` to only set pending connections if `forceNew` is false, preventing unnecessary overwrites.
- Adjusted the cleanup process to ensure pending connections are only deleted when not forced, enhancing connection management efficiency.

* refactor: MCP OAuth flow state management

- Introduced a new method `storeStateMapping` in `MCPOAuthHandler` to securely map the OAuth state parameter to the flow ID, improving callback resolution and security against forgery.
- Updated the OAuth initiation and callback handling in `mcp.js` to utilize the new state mapping functionality, ensuring robust flow management.
- Refactored `MCPConnectionFactory` to store state mappings during flow initialization, enhancing the integrity of the OAuth process.
- Adjusted comments to clarify the purpose of state parameters in authorization URLs, reinforcing code readability.

* refactor: MCPConnection with OAuth recovery handling

- Added `oauthRecovery` flag to manage OAuth recovery state during connection attempts.
- Introduced `decrementCycleCount` method to reduce the circuit breaker's cycle count upon successful reconnection after OAuth recovery.
- Updated connection logic to reset the `oauthRecovery` flag after handling OAuth, improving state management and connection reliability.

* chore: Add debug logging for OAuth recovery cycle count decrement

- Introduced a debug log statement in the `MCPConnection` class to track the decrement of the cycle count after a successful reconnection during OAuth recovery.
- This enhancement improves observability and aids in troubleshooting connection issues related to OAuth recovery.

* test: Add OAuth recovery cycle management tests

- Introduced new tests for the OAuth recovery cycle in `MCPConnection`, validating the decrement of cycle counts after successful reconnections.
- Added scenarios to ensure that the cycle count is not decremented on OAuth failures, enhancing the robustness of connection management.
- Improved test coverage for OAuth reconnect scenarios, ensuring reliable behavior under various conditions.

* feat: Implement circuit breaker configuration in MCP

- Added circuit breaker settings to `.env.example` for max cycles, cycle window, and cooldown duration.
- Refactored `MCPConnection` to utilize the new configuration values from `mcpConfig`, enhancing circuit breaker management.
- Improved code maintainability by centralizing circuit breaker parameters in the configuration file.

* refactor: Update decrementCycleCount method for circuit breaker management

- Changed the visibility of the `decrementCycleCount` method in `MCPConnection` from private to public static, allowing it to be called with a server name parameter.
- Updated calls to `decrementCycleCount` in `MCPConnectionFactory` to use the new static method, improving clarity and consistency in circuit breaker management during connection failures and OAuth recovery.
- Enhanced the handling of circuit breaker state by ensuring the method checks for the existence of the circuit breaker before decrementing the cycle count.

* refactor: cycle count decrement on tool listing failure

- Added a call to `MCPConnection.decrementCycleCount` in the `MCPConnectionFactory` to handle cases where unauthenticated tool listing fails, improving circuit breaker management.
- This change ensures that the cycle count is decremented appropriately, maintaining the integrity of the connection recovery process.

* refactor: Update circuit breaker configuration and logic

- Enhanced circuit breaker settings in `.env.example` to include new parameters for failed rounds and backoff strategies.
- Refactored `MCPConnection` to utilize the updated configuration values from `mcpConfig`, improving circuit breaker management.
- Updated tests to reflect changes in circuit breaker logic, ensuring accurate validation of connection behavior under rapid reconnect scenarios.

* feat: Implement state mapping deletion in MCP flow management

- Added a new method `deleteStateMapping` in `MCPOAuthHandler` to remove orphaned state mappings when a flow is replaced, preventing old authorization URLs from resolving after a flow restart.
- Updated `MCPConnectionFactory` to call `deleteStateMapping` during flow cleanup, ensuring proper management of OAuth states.
- Enhanced test coverage for state mapping functionality to validate the new deletion logic.
2026-03-10 21:15:01 -04:00
Lionel Ringenbach
6d0938be64
🔒 refactor: Set ALLOW_SHARED_LINKS_PUBLIC to false by Default (#12100)
* fix: default ALLOW_SHARED_LINKS_PUBLIC to false for security

Shared links were publicly accessible by default when
ALLOW_SHARED_LINKS_PUBLIC was not explicitly set, which could lead to
unintentional data exposure. Users may assume their authentication
settings protect shared links when they do not.

This changes the default behavior so shared links require JWT
authentication unless ALLOW_SHARED_LINKS_PUBLIC is explicitly set to
true.

* Document ALLOW_SHARED_LINKS_PUBLIC in .env.example

Add comment explaining ALLOW_SHARED_LINKS_PUBLIC setting.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Danny Avila <danacordially@gmail.com>
2026-03-06 19:05:56 -05:00
Danny Avila
a2a09b556a
🤖 feat: gemini-3.1-flash-lite-preview Window & Pricing (#12043)
* 🤖 feat: `gemini-3.1-flash-lite-preview` Window & Pricing

- Updated `.env.example` to include `gemini-3.1-flash-lite-preview` in the list of available models.
- Enhanced `tx.js` to define token values for `gemini-3.1-flash-lite`.
- Adjusted `tokens.ts` to allocate input tokens for `gemini-3.1-flash-lite`.
- Modified `config.ts` to include `gemini-3.1-flash-lite-preview` in the default models list.

* chore: testing for `gemini-3.1-flash-lite` model, comments

- Updated `tx.js` to include cache token values for `gemini-3.1-flash-lite` with specific write and read rates.
- Enhanced `tx.spec.js` to include tests for the new `gemini-3.1-flash-lite-preview` model, ensuring correct rate retrieval for both prompt and completion token types.
2026-03-03 13:47:16 -05:00
Juri Kuehn
13df8ed67c
🪪 feat: Add OPENID_EMAIL_CLAIM for Configurable OpenID User Identifier (#11699)
* Allow setting the claim field to be used when OpenID login is configured

* fix(openid): harden getOpenIdEmail and expand test coverage

Guard against non-string claim values in getOpenIdEmail to prevent a
TypeError crash in isEmailDomainAllowed when domain restrictions are
configured. Improve warning messages to name the fallback chain
explicitly and distinguish missing vs. non-string claim values.

Fix the domain-block error log to record the resolved identifier rather
than userinfo.email, which was misleading when OPENID_EMAIL_CLAIM
resolved to a different field (e.g. upn).

Fix a latent test defect in openIdJwtStrategy.spec.js where the
~/server/services/Config mock exported getCustomConfig instead of
getAppConfig, the symbol actually consumed by openidStrategy.js.

Add refreshController tests covering the OPENID_EMAIL_CLAIM paths,
which were previously untested despite being a stated fix target.
Expand JWT strategy tests with null-payload, empty/whitespace
OPENID_EMAIL_CLAIM, migration-via-preferred_username, and call-order
assertions for the findUser lookup sequence.

* test(auth): enhance AuthController and openIdJwtStrategy tests for openidId updates

Added a new test in AuthController to verify that the openidId is updated correctly when a migration is triggered during the refresh process. Expanded the openIdJwtStrategy tests to include assertions for the updateUser function, ensuring that the correct parameters are passed when a user is found with a legacy email. This improves test coverage for OpenID-related functionality.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-02-25 22:31:03 -05:00
Danny Avila
a0f9782e60
🪣 fix: Prevent Memory Retention from AsyncLocalStorage Context Propagation (#11942)
* fix: store hide_sequential_outputs before processStream clears config

processStream now clears config.configurable after completion to break
memory retention chains. Save hide_sequential_outputs to a local
variable before calling runAgents so the post-stream filter still works.

* feat: memory diagnostics

* chore: expose garbage collection in backend inspect command

Updated the backend inspect command in package.json to include the --expose-gc flag, enabling garbage collection diagnostics for improved memory management during development.

* chore: update @librechat/agents dependency to version 3.1.52

Bumped the version of @librechat/agents in package.json and package-lock.json to ensure compatibility and access to the latest features and fixes.

* fix: clear heavy config state after processStream to prevent memory leaks

Break the reference chain from LangGraph's internal __pregel_scratchpad
through @langchain/core RunTree.extra[lc:child_config] into the
AsyncLocalStorage context captured by timers and I/O handles.

After stream completion, null out symbol-keyed scratchpad properties
(currentTaskInput), config.configurable, and callbacks. Also call
Graph.clearHeavyState() to release config, signal, content maps,
handler registry, and tool sessions.

* chore: fix imports for memory utils

* chore: add circular dependency check in API build step

Enhanced the backend review workflow to include a check for circular dependencies during the API build process. If a circular dependency is detected, an error message is displayed, and the process exits with a failure status.

* chore: update API build step to include circular dependency detection

Modified the backend review workflow to rename the API package installation step to reflect its new functionality, which now includes detection of circular dependencies during the build process.

* chore: add memory diagnostics option to .env.example

Included a commented-out configuration option for enabling memory diagnostics in the .env.example file, which logs heap and RSS snapshots every 60 seconds when activated.

* chore: remove redundant agentContexts cleanup in disposeClient function

Streamlined the disposeClient function by eliminating duplicate cleanup logic for agentContexts, ensuring efficient memory management during client disposal.

* refactor: move runOutsideTracing utility to utils and update its usage

Refactored the runOutsideTracing function by relocating it to the utils module for better organization. Updated the tool execution handler to utilize the new import, ensuring consistent tracing behavior during tool execution.

* refactor: enhance connection management and diagnostics

Added a method to ConnectionsRepository for retrieving the active connection count. Updated UserConnectionManager to utilize this new method for app connection count reporting. Refined the OAuthReconnectionTracker's getStats method to improve clarity in diagnostics. Introduced a new tracing utility in the utils module to streamline tracing context management. Additionally, added a safeguard in memory diagnostics to prevent unnecessary snapshot collection for very short intervals.

* refactor: enhance tracing utility and add memory diagnostics tests

Refactored the runOutsideTracing function to improve warning logic when the AsyncLocalStorage context is missing. Added tests for memory diagnostics and tracing utilities to ensure proper functionality and error handling. Introduced a new test suite for memory diagnostics, covering snapshot collection and garbage collection behavior.
2026-02-25 17:41:23 -05:00
Danny Avila
f3eb197675
💎 fix: Gemini Image Gen Tool Vertex AI Auth and File Storage (#11923)
* chore: saveToCloudStorage function and enhance error handling

- Removed unnecessary parameters and streamlined the logic for saving images to cloud storage.
- Introduced buffer handling for base64 image data and improved the integration with file strategy functions.
- Enhanced error handling during local image saving to ensure robustness.
- Updated the createGeminiImageTool function to reflect changes in the saveToCloudStorage implementation.

* refactor: streamline image persistence logic in GeminiImageGen

- Consolidated image saving functionality by renaming and refactoring the saveToCloudStorage function to persistGeneratedImage.
- Improved error handling and logging for image persistence operations.
- Enhanced the replaceUnwantedChars function to better sanitize input strings.
- Updated createGeminiImageTool to reflect changes in image handling and ensure consistent behavior across storage strategies.

* fix: clean up GeminiImageGen by removing unused functions and improving logging

- Removed the getSafeFormat and persistGeneratedImage functions to streamline image handling.
- Updated logging in createGeminiImageTool for clarity and consistency.
- Consolidated imports by eliminating unused dependencies, enhancing code maintainability.

* chore: update environment configuration and manifest for unused GEMINI_VERTEX_ENABLED

- Removed the Vertex AI configuration option from .env.example to simplify setup.
- Updated the manifest.json to reflect the removal of the Vertex AI dependency in the authentication field.
- Cleaned up the createGeminiImageTool function by eliminating unused fields related to Vertex AI, streamlining the code.

* fix: update loadAuthValues call in loadTools function for GeminiImageGen tool

- Modified the loadAuthValues function call to include throwError: false, preventing exceptions on authentication failures.
- Removed the unused processFileURL parameter from the tool context object, streamlining the code.

* refactor: streamline GoogleGenAI initialization in GeminiImageGen

- Removed unused file system access check for Google application credentials, simplifying the environment setup.
- Added googleAuthOptions to the GoogleGenAI instantiation, enhancing the configuration for authentication.

* fix: update Gemini API Key label and description in manifest.json

- Changed the label to indicate that the Gemini API Key is optional.
- Revised the description to clarify usage with Vertex AI and service accounts, enhancing user guidance.

* fix: enhance abort signal handling in createGeminiImageTool

- Introduced derivedSignal to manage abort events during image generation, improving responsiveness to cancellation requests.
- Added an abortHandler to log when image generation is aborted, enhancing debugging capabilities.
- Ensured proper cleanup of event listeners in the finally block to prevent memory leaks.

* fix: update authentication handling for plugins to support optional fields

- Added support for optional authentication fields in the manifest and PluginAuthForm.
- Updated the checkPluginAuth function to correctly validate plugins with optional fields.
- Enhanced tests to cover scenarios with optional authentication fields, ensuring accurate validation logic.
2026-02-24 08:21:02 -05:00
Danny Avila
7692fa837e
🪣 fix: S3 path-style URL support for MinIO, R2, and custom endpoints (#11894)
* 🪣 fix: S3 path-style URL support for MinIO, R2, and custom endpoints

`extractKeyFromS3Url` now uses `AWS_BUCKET_NAME` to automatically detect and
strip the bucket prefix from path-style URLs, fixing `NoSuchKey` errors on URL
refresh for any S3-compatible provider using a custom endpoint (MinIO, Cloudflare
R2, Hetzner, Backblaze B2, etc.). No additional configuration required — the
bucket name is already a required env var for S3 to function.

`initializeS3` now passes `forcePathStyle: true` to the S3Client constructor
when `AWS_FORCE_PATH_STYLE=true` is set. Required for providers whose SSL
certificates do not support virtual-hosted-style bucket subdomains (e.g. Hetzner
Object Storage), which previously caused 401 / SignatureDoesNotMatch on upload.

Additional fixes:
- Suppress error log noise in `extractKeyFromS3Url` catch path: plain S3 keys
  no longer log as errors, only inputs that start with http(s):// do
- Fix test env var ordering so module-level constants pick up `AWS_BUCKET_NAME`
  and `S3_URL_EXPIRY_SECONDS` correctly before the module is required
- Add missing `deleteRagFile` mock and assertion in `deleteFileFromS3` tests
- Add `AWS_BUCKET_NAME` cleanup to `afterEach` to prevent cross-test pollution
- Add `initializeS3` unit tests covering endpoint, forcePathStyle, credentials,
  singleton, and IRSA code paths
- Document `AWS_FORCE_PATH_STYLE` in `.env.example`, `dotenv.mdx`, and `s3.mdx`

* 🪣 fix: Enhance S3 URL key extraction for custom endpoints

Updated `extractKeyFromS3Url` to support precise key extraction when using custom endpoints with path-style URLs. The logic now accounts for the `AWS_ENDPOINT_URL` and `AWS_FORCE_PATH_STYLE` environment variables, ensuring correct key handling for various S3-compatible providers.

Added unit tests to verify the new functionality, including scenarios for endpoints with base paths. This improves compatibility and reduces potential errors when interacting with S3-like services.
2026-02-21 18:36:48 -05:00
Danny Avila
7a1d2969b8
🤖 feat: Gemini 3.1 Pricing and Context Window (#11884)
- Added support for the new Gemini 3.1 models, including 'gemini-3.1-pro-preview' and 'gemini-3.1-pro-preview-customtools'.
- Updated pricing logic to apply standard and premium rates based on token usage thresholds for the new models.
- Enhanced tests to validate pricing behavior for both standard and premium scenarios.
- Modified configuration files to include Gemini 3.1 models in the default model lists and token value mappings.
- Updated environment example file to reflect the new model options.
2026-02-20 16:21:32 -05:00