Commit graph

742 commits

Author SHA1 Message Date
Marco Beretta
a4f095a856
feat(client): add updateUserLocation data-provider wiring 2026-06-15 18:45:13 +02:00
Marco Beretta
fdca6184ba
feat(config): add location feature flag and geocoder config
Adds a `location` section to librechat.yaml config with enabled/geocoder
fields, wires loadLocationConfig into AppService, and surfaces the resolved
flag in the startup-config endpoint so the UI can gate location controls.
2026-06-15 18:41:42 +02:00
Marco Beretta
955674db44
feat(types): add TUserLocation to user personalization 2026-06-15 01:44:39 +02:00
Danny Avila
9efe4878e7
🅰️ feat: Native Anthropic Provider for Custom Endpoints (#13748)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
* 🅰️ feat: Native Anthropic provider for Custom Endpoints

Let a custom endpoint declare `provider: anthropic` to use the native Anthropic
`/v1/messages` client (the agents SDK's ChatAnthropic) against its own
`baseURL`/`apiKey`/`headers`, instead of being forced through the
OpenAI-compatible client. Enables Anthropic itself and Anthropic-compatible
gateways (AI gateways, OpenCode Zen, etc.) as custom endpoints — including for
agents and role-scoped model access.

Closes #10655 (Option 1: explicit provider).

- Schema: add optional `provider` (currently `anthropic`) to the custom
  `endpointSchema` in data-provider.
- Routing: `getProviderConfig` maps a custom endpoint with `provider: anthropic`
  to `Providers.ANTHROPIC` (was always `Providers.OPENAI`).
- Config: `initializeCustom` builds the native Anthropic config via the Anthropic
  `getLLMConfig` (custom baseURL/apiKey/headers) and returns `provider: anthropic`;
  `useLegacyContent` is left unset to match the built-in Anthropic endpoint. The
  OpenAI-compatible path is unchanged for endpoints without `provider`.
- Summarization: `resolveSummarizationProvider` builds an Anthropic config for a
  cross-endpoint native-Anthropic summarization target (self-summarize already
  reuses the agent's client options).

Title generation already resolves via `agent.endpoint`, and provider-specific
handling (tool conflicts, content/PDF validation, token counting, streamUsage)
already branches on `Providers.ANTHROPIC`, so it applies automatically.

Note: model auto-fetch (`models.fetch`) uses the OpenAI `/models` convention and
is not used for this provider — list models explicitly under `models.default`.

* 🅰️ fix: Anthropic custom-endpoint param parity (Codex review)

Address Codex P2 findings — the native Anthropic path must match the
OpenAI-compatible path's parameter handling:

- UI param set: `loadCustomEndpointsConfig` now surfaces `provider` as the
  client `customParams.defaultParamsEndpoint`, so the Agents model panel shows
  Anthropic fields (`maxOutputTokens`/`thinking`) instead of OpenAI `max_tokens`
  (which the native initializer ignored). An explicit non-default
  `defaultParamsEndpoint` still wins.
- Provider override: `getProviderConfig` re-applies `provider: anthropic` after
  all `customEndpointConfig` resolution, so it also wins when the endpoint name
  collides with a known custom provider (e.g. `openrouter`) — fixing the
  token/context budget derived from `overrideProvider`.
- Default params: the native path (and cross-endpoint Anthropic summarization)
  now apply `customParams.paramDefinitions` defaults via `extractDefaultParams`,
  matching what `getOpenAIConfig` does for the OpenAI-compatible path.

Adds tests for each.
2026-06-14 18:23:48 -04:00
Danny Avila
2350ebb24a
📨 feat: Custom Headers on Built-in Provider Endpoints (#13742)
* 📨 feat: Custom Headers on Built-in Provider Endpoints

Add a `headers` config option to the built-in `openAI`, `anthropic`, and
`google` endpoints (incl. Anthropic/Google Vertex), mirroring the custom
endpoint header mechanism. Values support the same placeholder resolution
(env vars, `{{LIBRECHAT_USER_*}}`, `{{LIBRECHAT_BODY_CONVERSATIONID}}`) and
are resolved at request time so dynamic values like conversationId resolve
against the live request — without losing provider-native request shaping.

Closes #13082. Covers #13713: forwarding conversationId to a reverse proxy
is now `X-Conversation-Id: '{{LIBRECHAT_BODY_CONVERSATIONID}}'` — an unknown
header is ignored by the native Anthropic API, so no 400 and no metadata
gating needed.

- Schema: `headers` on `baseEndpointSchema` (openAI/google/anthropic/all).
- New `mergeHeaders`/`resolveConfigHeaders` utils centralize the per-provider
  header locations (`configuration.defaultHeaders`, Anthropic
  `clientOptions.defaultHeaders`, Google `customHeaders`); provider-managed
  headers (auth, `anthropic-beta`) always win on collision.
- Each initializer threads configured headers (endpoint over `all`) into the
  right place; request-time resolution runs across all locations in the main
  and title flows.

* 🩹 fix: Cast endpoints.all to TEndpoint for headers DeepPartial widening

Adding `headers` (a Record) to `baseEndpointSchema` makes `DeepPartial<TCustomConfig>`
widen its value type to `string | undefined`, which is not assignable to the
concrete `TEndpoint['headers']: Record<string, string>` at the `loadedEndpoints.all`
assignment. Cast at the assignment site, mirroring the existing
`anthropicConfig as TAnthropicEndpoint` cast in the same function.

* 🛡️ fix: Harden built-in endpoint custom headers (Codex review)

Address Codex P2 findings on the custom-headers feature:

- Anthropic title requests: `omitTitleOptions` strips the `clientOptions`
  carrier, which dropped its `defaultHeaders`. Preserve just the header carrier
  so gateway/reverse-proxy metadata still reaches title generation.
- mergeHeaders: match header names case-insensitively so an override (e.g. a
  provider-managed `Authorization`/`anthropic-beta`) replaces/uniones a
  case-variant from the base instead of emitting two names a client may collapse.
- OpenAI: withhold admin-configured headers when the user supplies the base URL
  (`user_provided`), since values may carry `${SECRET}`/token placeholders that
  must not reach a user-controlled endpoint — mirrors the custom-endpoint guard.
- Azure: honor global `endpoints.all` headers (same OpenAI carrier) while keeping
  Azure-managed `api-key`/version headers authoritative.

Adds tests for each.

* 🔐 fix: Resolve-once + provider-managed header safety (Codex review round 2)

Address Codex P2 findings:

- Azure: keep global `endpoints.all` headers unresolved at init and let
  request-time `resolveConfigHeaders` resolve them once, avoiding a
  second-order env expansion of already-substituted user values.
- Google: `resolveConfigHeaders` no longer template-resolves the
  provider-managed `Authorization` header (built from a possibly user-provided
  key), so a user key like `${ENV}` can't leak server environment values.
- Model fetches: thread configured headers (endpoint over `all`) + user object
  through `getOpenAIModels`/`getAnthropicModels` → `fetchModels`, so a
  gateway-fronted built-in provider receives the header on `/models` too. Fixed
  `fetchModels` to merge custom headers for Anthropic instead of overwriting
  them (managed `x-api-key`/version still win).

Adds/updates tests for each.

* 🧯 fix: Header provenance, memory/title coverage, idempotency (Codex round 3)

Address Codex P2 findings, including two regressions from the prior round:

- Google auth (findings 6 & 8): move native Google header resolution to init
  (`initializeGoogle`), resolving admin templates BEFORE the key-derived auth
  header is built. resolveConfigHeaders no longer touches Google `customHeaders`,
  so admin `Authorization` templates resolve again (fixes the round-2 regression)
  while the SDK auth header (possibly a user-provided key) is never env-expanded.
- Memory runs: memory extraction now calls `resolveConfigHeaders`, so native
  Anthropic (and OpenAI) headers resolve for memory requests too.
- Vertex titles: restore the ORIGINAL `clientOptions` object reference (not a
  copy) when preserving headers across `omitTitleOptions`, so the Vertex
  `createClient` closure and the resolved headers stay on the same object.
- Reuse: `resolveConfigHeaders` is now idempotent (resolve-once per header map),
  preventing a second pass from env-expanding values already substituted with
  user/body data when an agent object flows through buildAgentInput twice.

Adds/updates tests for each.
2026-06-14 17:02:04 -04:00
Danny Avila
7cf2877b45
🪙 feat: Context Gauge UX, Hover Snapshot, Click Breakdown, Currency, Cost-On-By-Default (#13739)
* 🪙 feat: Default Context Cost On + Configurable Display Currency

Flip interface.contextCost to default-on (schema default true, resolved per-field
in loadDefaultInterface so it applies unless an admin explicitly sets false).

Add interface.currency { code, rate }: an ISO-4217 code and a static USD→local
multiplier so non-USD communities (EUR, JPY, CNY, BRL, ZAR, …) can show costs in
their currency. Inner fields are required (no nested defaults) to keep zod
input/output identical; loadDefaultInterface passes it through. Display-only —
model prices stay USD server-side.

* 🪙 feat: Currency-Aware Context Cost Formatting

formatCost(usd, currency?) applies the static rate (usd × rate) and formats via
a cached Intl.NumberFormat keyed by currency code — locale-correct symbol and
per-currency decimals, falling back to USD on a malformed code. The USD default
(code USD, rate 1) is byte-identical to the prior output.

* 💄 feat: Gauge Hover Snapshot, Click-to-Open Breakdown, Hide Until Data

Replace the hover-only HoverCard with: a compact hover snapshot tooltip
("Context 341.7k / 1.0M (34%)" + cost when enabled) via the existing Tooltip
primitive, and a click-opened Ariakit popover for the full breakdown that
dismisses on outside-click/Escape/blur. Gate visibility on usedTokens > 0 so a
fresh, message-less chat shows nothing, with an animate-in fade as the first
tokens land. Thread the display currency into the breakdown + snapshot.

* 🧪 test: Gauge Interaction + Visibility E2E

Switch the breakdown specs from hover to click, and add a test that the gauge is
absent on a new chat, surfaces the snapshot tooltip on hover, opens the breakdown
on click, and dismisses on Escape and outside-click.

* 🪙 fix: Harden Currency Resolution + Layer Breakdown Above Tooltip

Address Codex review on the currency display:
- Unsupported currency code now falls back to USD AND rate 1, so a typo like
  { code: 'EURO', rate: 0.92 } no longer shows a converted amount under a $
  symbol (was $9.20 for a $10 cost; now $10.00).
- A non-finite/negative rate (e.g. a partial admin override that set code before
  rate) falls back to rate 1, so a cost never renders as NaN.
- Fraction digits derive from the currency's own defaults, so zero-decimal
  currencies (JPY) render ¥5, not ¥5.00, and extra sub-unit precision applies
  only to currencies that have minor units. USD output is unchanged.
- Raise the click breakdown popover to z-[200] so it always sits above the
  z-150 hover tooltip when both briefly coexist.

* 🪙 fix: Validate ISO-4217 Codes + Derive Tiny Threshold from Minor Unit

Address Codex review on currency formatting:
- Intl.NumberFormat accepts any well-formed 3-letter code (EUU, RMB) without
  throwing, so the previous construct-based check missed typos/non-ISO codes and
  applied the rate under a bogus label. Validate against Intl.supportedValuesOf
  ('currency') (the ISO-4217 set); unsupported codes fall back to USD + rate 1.
  Codes are normalized to upper-case; graceful fallback if the runtime lacks
  supportedValuesOf.
- The tiny-amount threshold now derives from the currency's minor unit
  (10^-fractionDigits): 0.01 for 2-decimal, 0.001 for 3-decimal (KWD/BHD/JOD),
  1 for zero-decimal — instead of a hard-coded 0.01. Sub-unit precision trims to
  each currency's own scale. USD output unchanged.
2026-06-14 13:38:27 -04:00
Danny Avila
b03b2a0a29
💾 feat: Persist Context Breakdown & Branch/Total Usage Cost (#13734)
* 💾 feat: Persist Context Breakdown & Branch/Total Usage Cost

Persist the granular context breakdown and per-response usage/cost on the
response message metadata, and re-derive branch + total usage/cost from a
per-message index so the popover survives reloads and is branch-aware live.

- Add aggregateEmittedUsage + buildPersistedContextUsage helpers in
  packages/api; capture the latest visible snapshot and every emitted
  on_token_usage payload via contextUsageSink/usageEmitSink.
- Attach metadata.contextUsage (Part A) and metadata.usage (Part B) on the
  agents response message in sendCompletion.
- Carry per-message usage on the token index; add sumTotalUsage/setEntryUsage
  and branch-scoped usage on sumBranch.
- Repurpose the session accumulator into a single in-flight pending holder;
  flush it into the index at finalize; hydrate breakdowns on load.
- Render branch cost with a conditional all-branches total in the breakdown.

* 🧹 chore: Remove orphaned com_ui_session_cost i18n key

* 🩹 fix: Address Codex review — normalize usage server-side, fix reload deltas

- Persist per-event-normalized display units in metadata.usage (TResponseUsage)
  so reloaded mixed-provider turns match the live session; client reads them
  directly instead of re-normalizing with a single stamped provider (P2).
- Persist completedOutputTokens (final call output) on metadata.contextUsage so
  a reloaded multi-call turn adds the post-snapshot delta, not the full
  tokenCount the snapshot already counts (P2).
- buildIndex preserves a prior entry's immutable usage when a rebuilt cache
  message lacks metadata.usage, so a mid-session rebuild (regenerate) keeps a
  sibling branch's flushed cost (fixes the e2e regenerate failure).
- Track costKnown so turns saved with contextCost off don't render $0.00 when
  cost display is later enabled (P3).
- Use an epsilon for the all-branches cost comparison to avoid a spurious total
  row from float summation order (P3).
- Update unit/integration/e2e tests for the new shapes; regenerate e2e asserts
  the all-branches total after reload (deterministic via persisted metadata).

* 🩹 fix: Address Codex round 2 — pending leak, cost coverage, reload delta

- Clear the in-flight pending usage on terminal abort/error (resetLive), so a
  stopped generation's tokens no longer merge into the next response (P2).
- costKnown now means COMPLETE coverage (ANDed): a branch mixing cost-bearing
  and cost-less turns is flagged incomplete and the cost row is hidden rather
  than rendering an under-reported total (P2).
- Drop the tokenCount fallback for completedOutputTokens on reload: only the
  persisted post-snapshot delta is used, so a multi-call turn whose provider
  emitted no usage_metadata no longer double-counts earlier output (P2).
- Update tokens.spec for AND coverage semantics + incomplete-cost case.

* 🩹 fix: Address Codex round 3 — no-usage snapshots, total coverage, provider-less cache

- Skip persisting metadata.contextUsage when the response emitted no primary
  usage event: without a known post-snapshot output the granular gauge would
  undercount the reply on reload, so fall back to the coarse per-message
  estimate instead (P2).
- Gate the all-branches cost row on totalUsage.costKnown so an incomplete total
  (a sibling saved without cost) never renders an under-reported figure (P2).
- aggregateEmittedUsage/finalCallOutputTokens now normalize per-event with the
  client's magnitude fallback (normalizeEventUnits) instead of billing
  splitUsage, so provider-less cached events match live on reload (P2).
- Add backend test for the provider-less cached case.

* 🩹 fix: Address Codex round 4 — abort attribution, complete cost coverage

- aggregateEmittedUsage persists cost only when EVERY call was priced; a partial
  pricing failure now omits cost so the client treats coverage as unknown rather
  than reading an under-reported sum as authoritative (P2).
- finalizeUsage flushes pending into the response entry only when events were
  folded this session (eventCount > 0), so a late/second resumable subscriber
  carrying persisted metadata.usage keeps it instead of being overwritten with
  an empty pending record (P2).
- On user stop, attribute the in-flight pending usage to the partial response
  (new attributePending handler) instead of discarding it in resetLive — the
  stopped reply's billed tokens are kept and still can't leak into the next
  response; resetLive's discard remains for the error path (P2).

* 🐛 fix: Persist branch cost across branch switches via sticky usage history

Branch cost vanished on switching to a sibling branch (until a new turn) — the
cost analog of the granularity bug. buildIndex rebuilds the token index from the
messages cache; a sibling generated this session whose cache message lacks
metadata.usage (and is transiently dropped from the cache during regenerate)
lost its live-flushed usage, so sumBranch found none and the cost row hid.

Fix: a sticky per-response usage map (conversationId → messageId → usage),
written by setEntryUsage and never rebuilt from the cache — the usage counterpart
of snapshotsByAnchorFamily for the breakdown. buildIndex/upsertEntries restore an
entry's usage from it when the message carries none; cleared on convo switch and
migrated with the index. Add unit coverage for the drop-then-readd regression and
an e2e assertion that branch cost survives a branch switch.

* 🐛 fix: Re-index on branch switch so branch cost survives the switch

The sticky usage history alone didn't fix the reported branch-switch cost drop:
on a branch switch no cache `updated` event fires, so the index subscriber never
re-ran, and the post-regenerate rebuild was skipped while `isSubmitting` was
still true — leaving the index stale and missing the now-viewed branch's
response entirely (sticky can only restore entries present in a rebuild).

Re-index from the messages cache on every tail change (created/finalize AND
branch switch), not just while submitting. The cache holds the full message set
at switch time, so the viewed branch's response is re-added and its usage
restored from metadata.usage or the sticky history → sumBranch finds it and the
branch cost renders. Verified locally: the branch-switch e2e now passes (the
cost section shows both the branch row and the all-branches total). Also fixed
that e2e assertion to target a single cost value (strict-mode safe).

* 🩹 fix: Handle stopped-stream usage — reset pending + persist abort metadata

Codex round (stop/abort edges):
- Resumable explicit-stop (intentional SSE close) reset UI state but never
  cleared pendingUsageFamily, so usage folded before the stop leaked into the
  next response in the conversation. Discard pending on intentional close
  (resetLive); a resume re-folds via backfillUsage, so nothing is lost.
- The abort save path (abortMiddleware) persisted the stopped response without
  metadata.usage/contextUsage, so its cost + breakdown vanished on reload.
  Rebuild both from the job's persisted tokenUsage (emitted payloads incl. cost)
  and contextUsage snapshot — parity with the normal sendCompletion path;
  breakdown gated on a primary usage event like buildResponseMetadata.

Deferred (per scope decision): mid-stream branch-switch transiently shows the
streaming branch's pending on the viewed sibling (cosmetic, until finalize).

* 🩹 fix: Persist abort metadata on the real agents route + tighten snapshot gate

Codex round (corrects last round's wrong-path fixes):
- Stopped AGENTS responses are saved by routes/agents/index.js (/chat/abort),
  not abortMiddleware — so last round's metadata fix never ran for them. Moved
  the rollup/snapshot builder into packages/api as buildAbortedResponseMetadata
  (shared, unit-tested) and applied it in BOTH abort save paths, so a stopped
  agent reply keeps its cost + breakdown on reload.
- Persist the breakdown only when the FINAL visible call emitted usage: track a
  per-response snapshot count and require primaryUsageCount >= snapshotCount.
  Previously any earlier primary usage event passed the gate, so a multi-call
  turn whose final call emitted no usage_metadata used an earlier call's output
  as completedOutputTokens (already counted by the latest snapshot) → reload
  over-reported. Now it falls back to the coarse estimate.

Resumable stop pending-reset (prior round, 3cde6fe035) already flows through
clearAllSubmissions → SSE close → the intentional-close handler's resetLive.
Deferred per scope: mid-stream branch-switch pending attribution (tracked).

* 🩹 fix: Abort breakdown over-count + resume re-fold after pending discard

Codex round (on the re-applied abort/snapshot work):
- buildAbortedResponseMetadata now persists ONLY the usage/cost rollup, not the
  context breakdown. The abort path can't tell whether the final call emitted
  usage (the job stores only the latest snapshot, not a count), so persisting
  the breakdown risked reusing an earlier call's output as completedOutputTokens
  (already in the snapshot) → reload over-count. Stopped/incomplete responses
  now fall back to the coarse gauge estimate, which is safe and apt.
- resetLive now also forgets the conversation's folded usage-event identities
  (clearUsageFolded). Discarding pending on a terminal/intentional close left
  the folded keys set, so a later resume's backfillUsage saw the persisted
  events as duplicates and never rebuilt pending — leaving the response's usage
  missing until a full reload. Clearing them lets the resume re-fold.
2026-06-14 10:48:07 -04:00
Danny Avila
db7011d567
📊 feat: Real-Time Context Window & Token Usage Tracking (#13670)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* 📊 feat: Real-Time Context Window & Token Usage Tracking

* 🧪 fix: Align Pricing Spec Dep Signatures with TxDeps

* 🩹 fix: Resolve Codex Findings for Context Usage Tracking

* 📊 feat: Granular Tool Token Breakdown with Deferred Splits

* 🧪 test: Cover Session Cost in Mock E2E and Scope Usage Selectors

* 🧪 test: Live Host-Pipeline Usage Verification (Env-Gated)

* 🧪 test: Local Real-Provider Multi-Turn E2E Harness

* 🪙 fix: Keep Tagged Usage Buckets Out of the Live Context Estimate

* 🩹 fix: Scoped Token-Config Fallback and Sequential Visibility for Usage Events

* 🩹 fix: Address Usage Review Findings — Cost Timing, Scoped Caches, Finalized Output

- carry the post-snapshot output estimate into the context snapshot at
  finalize so the gauge keeps the last response after live resets
- accumulate per-rate billable units and price the session cost at
  render, so usage events arriving before the token-config load still
  count once it resolves
- pass user-scoped token-config cache keys through loadConfigModels
  fetches and drop the controller's unscoped fallback to prevent serving
  another user's resolved config
- tag emitted usage events with a per-run seq so resume dedupe never
  drops a distinct call with an identical payload
- admit the static tokenConfig override in the custom endpoint schema so
  it survives zod parsing into req.config

* 🩹 fix: Align Client Usage Accounting with Backend Cost Semantics

- classify cache tokens by provider (shared inputTokensIncludesCache from
  data-provider, consumed by both the backend billing path and the client)
  instead of a magnitude heuristic, so Anthropic/Bedrock turns where cache
  is smaller than uncached input no longer under-bill input
- mirror resolveCompletionTokens on the client so Vertex-style hidden
  thinking tokens are reflected in the Output row and session cost
- prefer endpoint pricing over adapter-provider pricing so a custom
  endpoint can price a known model name without built-in rates shadowing it
- carry static cacheRead/cacheWrite overrides through the tokenConfig
  schema and buildTokenConfigMap

* 🩹 fix: Honor Static Token Config in Billing; Tighten Usage Freshness

- initializeCustom now uses a static endpoint tokenConfig as the agent's
  endpointTokenConfig (billing + balance checks), not just the advertised
  UI config — previously the gauge showed admin rates while the agent
  billed against built-in tables
- invalidate the token-config query alongside models on user-key add/
  revoke so context windows and pricing refresh without a reload
- include maxContextTokens in ChatForm's stabilized conversation memo so
  the gauge reflects a changed context-window setting immediately
- feed the live output estimate from the legacy content path (direct and
  assistants streams), setting from cumulative part text rather than
  accumulating deltas

* 🩹 fix: Resume Usage Dedup, Agent Pricing, and Partial Override Billing

- fold usage events idempotently by (runId, seq) so resume backfill no
  longer resets the conversation totals — a mid-stream reconnect keeps the
  usage of prompts already completed earlier in the session
- tap replayed pending message/reasoning/content events so output streamed
  past the resume snapshot reaches the live estimate, not just the message
- resolve cost against the agent's backing endpoint (Agents conversations
  report endpoint `agents` / provider `openAI`, neither of which keys a
  custom endpoint's tokenConfig)
- getMultiplier/getCacheMultiplier fall back to the standard tables for
  models absent from a partial endpointTokenConfig, so a partial static
  override no longer bills non-listed models at defaultRate while the UI
  shows the correct pattern rate

* 🩹 fix: Repaired Output in Gauge, Cache-Rate Keys, Config Gate, Usage Cleanup

- live/completed gauge counts the repaired completion (normalized output),
  so under-reporting providers don't drop the response from used context
- translate static tokenConfig cacheWrite/cacheRead onto the write/read
  keys getCacheMultiplier reads, so cache tokens bill at the configured
  rate instead of the prompt-rate fallback
- clear the token index and usage atoms when leaving a conversation, so
  visited histories don't accumulate in memory for the tab's lifetime
- wait for startupConfig before mounting the gauge, so a deployment with
  contextUsage disabled never briefly mounts it or fires the token-config
  query on first load

* 🩹 fix: Move Token-Config Resolution to TS; Key Live Usage by Created Convo

- extract the token-config resolution (override gathering + cache lookup +
  buildTokenConfigMap) into resolveTokenConfigMap in packages/api, leaving
  the /api controller a thin request-scoped wrapper (CLAUDE.md TS rule)
- getConvoKey prefers the user message's real conversationId once the
  `created` event stamps it, so a new chat's first-response live gauge and
  totals land under the id TokenUsage subscribes to instead of NEW_CONVO

* 🩹 fix: Clear Stale Redis Job Usage; Live-Tap Legacy Streams; Share Fetched Config

- DEL the Redis job hash before re-creating it so a reused streamId can't
  inherit a prior run's contextUsage/tokenUsage and backfill stale usage
- tap the legacy {message,text} stream branch (non-agent OpenAI/Anthropic
  streams) into the live estimate, not just the content path
- copy a deduped fetch's token config to every sibling endpoint sharing the
  baseURL/key/headers, so /token-config resolves each by its own name

*  revert: Don't DEL Redis job hash in createJob (breaks cross-replica resume)

createJob is an idempotent join — a second replica calls it for the same
streamId to share an in-flight stream's state. DELeting the hash wiped the
prior replica's persisted created/usage state, so a joining replica missed
the created event (GenerationJobManager cross-replica integration test).
Reverts the F1 change from 2bfce0c34b; the stale-usage concern doesn't
arise in practice (streamId is unique per generation).

* 🩹 fix: Best-Effort Usage Emit; Tag Hidden Sequential-Agent Usage

- wrap the ModelEndHandler usage emit in try/catch so a failed telemetry
  delivery (closed SSE / Redis publish error) can't abort the handler
  before thought-signature capture, which would break resumed tool calls
- tag hidden sequential-agent usage as 'sequential' (non-primary) so the
  client folds it into session cost/totals but not the live context gauge,
  instead of letting an undefined usage_type inflate the visible gauge

* 🩹 fix: Refetch Stale Token Config on Mount; Normalize Vertex for Lookup

- useTokenConfigQuery refetches on mount when stale, so a user-key change
  that invalidates tokenConfig while the gauge is unmounted takes effect on
  return instead of serving the prior key's resolved config
- normalize a Vertex-backed agent's provider (vertexai) to the google
  token-config key, so Gemini context windows and rates resolve instead of
  showing unknown context / $0 cost

*  feat: Server-Side Per-Event Cost (Authoritative Pricing for the Gauge)

Move usage-cost pricing to the single source of truth. The backend prices
each model call with the same billing functions (premium tiers via
getMultiplier(inputTokenCount), cache rates) and emits the USD cost on
on_token_usage when interface.contextCost is enabled; the client sums
emitted costs instead of re-deriving from base token-config rates.

- computeUsageCostUSD reuses prepareTokenSpend/prepareStructuredTokenSpend
  so the emitted cost matches what is billed (incl. premium thresholds)
- getDefaultHandlers gains a usageCost pricing context; initialize.js wires
  db.getMultiplier/getCacheMultiplier gated on contextCost (agents path)
- client UsageTotals carries a summed costUSD; retire the client-side rate
  lookups (costFromUnits/calcUsageCost) that drifted from backend pricing
  and produced the provider-keying / cache-key / Vertex / premium findings
- keep normalizeUsageUnits for the displayed token counts; token-config is
  still used for the context-window meter

Fixes the premium-tier session-cost under-report (gpt-5.x / gemini-3.1
above their input thresholds).

* 🩹 fix: Branch-Accurate Usage Snapshot + Clearer Gauge Track Contrast

- re-anchor the context snapshot from the user message to the response
  message at finalize. Regenerating a response branches off a shared user
  message, so anchoring on it made the snapshot read as "active" on both
  branches — switching to the sibling branch showed the wrong (other
  branch's) context. The response message is branch-unique, so sibling
  branches now correctly fall back to their own per-branch totals.
- raise the gauge ring's track/fill contrast (muted track, prominent fill)
  so the used portion reads clearly as a fill-level indicator

* 🩹 fix: Tag Sequential Usage in Billing; Emit Subagent Cost; Reset Live on Resume Errors

- tag hidden sequential-agent usage `usage_type: 'sequential'` on the
  COLLECTED usage (not just the emit), and treat it as non-primary in
  recordCollectedUsage (billed, excluded from the reported output total) so
  hidden intermediate output stops inflating the parent's tokenCount/pruning
- emit on_token_usage from the subagent usage sink (tagged `subagent`, with
  authoritative cost when contextCost is on) so the gauge's session
  cost/totals include billed subagent usage; it stays out of the live meter
- call resetLive on the resumable 404 and max-retry terminal branches so the
  gauge doesn't keep counting stale in-flight tokens after the stream ends

* 🎨 fix: Contrast the Popup Context Bar; Revert Ring Restyle

- raise the popup breakdown's context progressbar contrast (muted
  surface-tertiary track, prominent text-primary fill) — that's the bar the
  contrast feedback was about
- revert the gauge ring restyle (kept its original border-heavy track /
  text-secondary fill); the ring wasn't the element in question

* 🩹 fix: Stop Snapshot Granularity Leaking Across Branches; Revert Tree Memo

- a null-anchor context snapshot was treated as active on every branch,
  leaking one generation's granular breakdown onto sibling branches. Require
  a non-null (response-message) anchor on the viewed branch instead, so
  siblings without a matching snapshot fall back to their own totals.
- revert the buildTree WeakMap memo in messages.ts. buildTree is pure (builds
  from shallow copies) so the memo was behaviorally identical, but it was the
  feature's only change to core branch-navigation selectors — removing it
  matches upstream and rules it out of branch-navigation debugging.

* 🪙 fix: Thread Endpoint Token Config to Agent Billing, Cost, and Context Limits

Custom-endpoint agents resolve an endpointTokenConfig during agent init but
it never reached the AgentClient, so spending, emitted cost, and runtime
max-token resolution all fell back to default rates for those agents.

- Surface options.endpointTokenConfig on the returned InitializedAgent.
- Pass it to the AgentClient (this.options.endpointTokenConfig) so the
  spending path bills at configured rates.
- Thread it through usageCost to computeUsageCostUSD so emitted per-event
  cost matches billing.
- getModelMaxTokens/getModelMaxOutputTokens fall back to the built-in map
  for models absent from a partial override (matches buildTokenConfigMap);
  consolidates the duplicated fallback in pricing.ts.

* 🪙 fix: Preserve Granular Breakdown Across Branch Switches

The granular context breakdown lives only in the live on_context_usage
snapshot — a single per-conversation slot, anchored to the latest response
and overwritten by each generation. Switching to a branch generated earlier
this session lost its tool/skill/system rows and fell back to coarse totals.

Retain each generation's finalized snapshot in a per-conversation map keyed
by its branch-unique response id (snapshotsByAnchorFamily). When the live
snapshot is off the viewed branch, walk the branch tail for its deepest
stored anchor and render that breakdown. Bounded by generation count and
cleared on conversation switch; the live/just-generated path is unchanged.

* 🪙 fix: Harden Resume Seeding and Subagent Usage Emission

- useResumableSSE: skip the trailing-output live seed when the resume
  carries a context snapshot; the snapshot's messageTokens already counts
  produced output, so seeding it again inflated usage until the next reset.
- AgentClient subagent emitter: await GenerationJobManager.emitChunk like
  every other caller (it persists before publishing), so a floating promise
  can't race job cleanup and a Redis/publish failure is caught by the
  emitter's try/catch instead of surfacing as an unhandled rejection.

* 🧪 test: Playwright Coverage for Context Breakdown Granularity

Add a test-only data-testid distinguishing the granular snapshot breakdown
(context-breakdown) from the coarse message-history estimate
(context-estimate), then assert granularity in the mock e2e harness:

- renders the granular breakdown from the live on_context_usage snapshot
  (guards that the snapshot event actually reaches the popover, not just the
  usage totals).
- preserves the granular breakdown after switching branches — regenerate to
  overwrite the single live snapshot, switch back, and confirm the rows
  survive via the per-anchor snapshot history map.

Branch regenerate/sibling selectors mirror the existing chat.spec branch test.
All three usage specs pass against the mock pipeline.

* 🪙 fix: Correct Resume Live-Seed, Fallback Re-index, and Subagent Emit Flush

Codex round on the prior commit:

- countTrailingOutputChars now counts only output at the very END of the
  aggregated content (0 when the model paused at a tool call), and the resume
  path always seeds it. The earlier skip-trailing-tool-parts behavior plus the
  skip-seed-when-snapshot gate together over- or under-counted in-flight
  output on resume; one rule fixes both — pre-invoke snapshot budget is never
  double-counted, and genuine in-flight output is no longer dropped.
- useTokenUsage re-indexes from the messages cache on tail change while
  submitting. The cache subscriber is muted during streaming, so without a
  context snapshot (non-agent streams) sumBranch missed the created tail and
  dropped history + prompt until finalize. Bounded — tailId only shifts on
  created/finalize/branch-switch.
- AgentClient tracks subagent usage emit promises and flushes them in
  chatCompletion's finally. The sink fires the emitter without awaiting, and
  resume reads the usage emitChunk persists (HSET), so cleanup must not race
  it or resumed clients miss billed subagent usage.
2026-06-13 19:38:28 -04:00
Michael Harvey
05eb986097
💬 feat: Conversation Starters for Model Specs (#13710)
* 💬 feat: Conversation Starters for Model Specs

Adds an optional conversation_starters field to model specs in
librechat.yaml. When the active conversation uses a spec that defines
starters (and no agent/assistant starters apply), the chat landing
renders clickable starter prompts between the landing content and the
chat input; clicking one submits it as the first message.

- data-provider: add conversation_starters to TModelSpec and
  tModelSpecSchema so the field survives strict config parsing
- client: ConversationStarters falls back to the active spec's
  starters via getModelSpec; entity (agent/assistant) starters
  take precedence; starter cards are centered, size to content,
  wrap at word boundaries, stagger their fade-in, and gain a
  focus-visible ring
- sanitizeModelSpecs passes the field through (denylist); covered
  by a new unit test
- e2e: mock spec + tests for rendering, absence, click-to-submit,
  and the MAX_CONVO_STARTERS cap

Closes #3619

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* chore: Sort ChatView imports

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-06-13 11:38:49 -04:00
Lacy
dea71c8396
🪟 fix: Cross-Platform Absolute-Path Check in tsdown neverBundle Predicates (#13700)
The deps.neverBundle predicates in the four package tsdown configs detect
first-party (resolved) module ids with !id.startsWith('/'). On Windows,
resolved ids are absolute paths like C:\..., which never match, so every
project module is externalized. Builds still exit 0 but emit near-empty
bundles — e.g. packages/client dist/index.mjs drops from ~276 kB to
~2.7 kB and dist/style.css is never produced, breaking the client dev
server with "Failed to resolve import @librechat/client/style.css".

Replace the startsWith('/') check with path.isAbsolute(id), which is
behavior-identical on POSIX and correct on Windows.

Co-authored-by: phoenixtekk <phoenixtekk@users.noreply.github.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 11:04:46 -04:00
Danny Avila
197a1dc4e2
🧬 feat: Add GitHub Skill Sync (#13293)
Some checks failed
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
Publish `librechat-data-provider` to NPM / pack (push) Has been cancelled
Publish `librechat-data-provider` to NPM / publish-npm (push) Has been cancelled
* feat: Add GitHub skill sync

* fix: Address GitHub skill sync CI

* fix: Harden GitHub skill sync review paths

* fix: Prevent overlapping skill sync runs

* fix: Address GitHub skill sync review findings

* fix: Satisfy Git ref lint rule

* fix: Address GitHub sync review follow-ups

* fix: Match skill frontmatter closing fence

* fix: Address GitHub sync review cycle

* fix: Address GitHub sync review follow-ups

* fix: Harden GitHub skill sync worker

* fix: Format GitHub sync rollback log

* fix: Address GitHub sync review feedback

* fix: Format skill import parse handling

* fix: Coerce scalar skill frontmatter and correct scheduler timer clear

- parse: coerce numeric/boolean name and description scalars to strings instead of dropping them to empty (restores pre-refactor behavior; preserves absent-vs-empty distinction for the when-to-use fallback)
- scheduler: clear the setTimeout handle with clearTimeout rather than clearInterval
- test: cover non-string scalar frontmatter coercion

* fix: Tolerate trailing whitespace after SKILL.md opening frontmatter fence

extractFrontmatterBlock required the opening fence to be exactly '---\n', so an opener with trailing spaces/tabs (e.g. '---   \n') silently dropped all frontmatter even though the closing-fence regex already tolerates it. Match the opener with /^---[ \t]*\n/ for symmetry. Addresses Codex P3 (parse.ts:24).

* feat: Run GitHub skill sync under a per-source tenant context

Under TENANT_ISOLATION_STRICT, the sync ran with no async tenant context, so the tenant-isolation mongoose hooks threw on every Skill/SkillFile/AclEntry operation; in non-strict mode synced skills were written tenant-less and never matched tenant-scoped reads. Add an optional per-source tenantId to the skillSync config; when set, each source sync runs inside tenantStorage.run({ tenantId }) so skills, files, and public ACL grants are created and listed within that tenant, and the skill row is stamped with the tenantId for correct dedup. Sources without tenantId keep the prior single-tenant behavior. Avoids runAsSystem. Addresses Codex P2 (sync.js:70).

Lock/status/credential bookkeeping stays outside the tenant context (those collections are intentionally global).

* test: Restore dropped tenant-context coverage for GitHub skill sync

The prior commit shipped the getTenantId import in github.spec.ts without the tenant tests that use it (lost in an interrupted edit), which failed the eslint --max-warnings=0 CI job on an unused import. Restore both github.spec.ts tenant tests (tenant-scoped run stamps tenantId and executes inside the tenant ALS context; no-tenant run stays ambient) and the two config-schemas tenant tests (accepts tenantId, rejects __SYSTEM__).

* test: Restore dropped github.spec tenant-context tests

The previous commit's github.spec.ts edit did not apply (anchor mismatch), so the getTenantId import remained unused and failed eslint --max-warnings=0. Add the two tenant tests that use it: a tenant-scoped run stamps tenantId and executes inside the tenant ALS context, and a no-tenant run stays ambient.

* feat: Scope synced skill author to tenant and harden tenant-context sync

Addresses the latest Codex review on the per-source tenant change:
- makeSourceAuthorId now folds tenantId into the synthetic author hash so the
  same source mirrored into different tenants gets distinct author ids (clearer
  audits, no cross-tenant author collisions). Single-tenant author ids stay
  stable (suffix omitted when tenantId is absent).
- syncSourceInTenantContext uses an async callback per the tenant-context
  contract so the ALS store propagates across awaited Mongoose calls.
- Tests: same-source/different-tenant yields distinct authors; mirror cleanup
  is scoped to the source and deletes only its absent-upstream skills.

* fix: Repair tsc error and guard external edits in github skill sync

- Fix TS2352 in github.spec mirror-cleanup test: build the existing-skill mock via makeSkill with authorName instead of an under-typed 'as CreateSkillInput' cast (this was the failing TypeScript CI check on f00ce3c5a).
- 808: commitExistingRemoteSkillAfterFileSync re-reads to clear our own file-sync version bumps, but now compares refreshed content against the pre-sync snapshot (body/name/description/always-apply) and throws SKILL_CONFLICT on a concurrent external edit instead of overwriting it.

* docs: Note skillSync source tenantId is effectively immutable

Changing/adding/removing a source's tenantId orphans previously mirrored skills in the old tenant (a tenant-scoped sync cannot clean another tenant's data without runAsSystem, which is intentionally avoided).

* fix: Key GitHub skill upstream identity on source id and path only

Addresses Codex finding (github.ts:217): makeUpstreamId previously included owner/repo, so repointing a source to a renamed or replacement repository (same source id) changed the upstreamId, made findSkillBySourceIdentity miss the existing mirror, and then collided on the (name, author, tenantId) uniqueness constraint — leaving the source stuck failing. Identity now keys on the stable source id + root path only. The feature is unreleased, so there is no stored-id migration. Updated spec upstreamId fixtures to the new format; the existing ref-independent identity test now also covers repo moves.

* fix: Scope GitHub skill mirror deletion to the source tenant

Addresses Codex P1 (github.ts:1047/1057): an ambient source (no tenantId) runs listSkillsBySource without tenant context, which under non-strict isolation returns github-synced skills across all tenants. The mirror-deletion pass then treated other tenants' skills as absent-upstream and could delete them. Filter existingSyncedSkills to rows whose tenantId matches the source's configured tenantId (absent = its own ambient bucket) before deleting, so a sync never removes another tenant's mirrored skills. Covered by a test where an ambient run leaves a tenant-b-owned skill untouched.

* fix: Apply tenant-scoped mirror deletion implementation

The prior commit (75ccfa3fc) added the test but the source change to github.ts was lost in an interrupted edit, leaving a failing test with no implementation. This adds the actual guard: the mirror-deletion pass skips skills whose tenantId does not match the source's configured tenantId (absent = ambient bucket), so an ambient source whose listSkillsBySource returns cross-tenant rows under non-strict isolation cannot delete another tenant's mirrored skills.

* fix: Resolve global access role outside tenant context for synced skill grants

Addresses Codex P2 (github.ts:1166): default access roles (incl. skill_viewer) are seeded globally with no tenantId under runAsSystem, but a tenant-scoped sync wraps ensurePublicViewer in the source's tenant context. The PermissionService grantPermission resolved the role via a tenant-isolated AccessRole query, so the global role did not match and tenant-scoped syncs failed with 'Role skill_viewer not found'. The sync adapter now resolves the role inside runAsSystem (matching the global seed) and writes the ACL entry in the active tenant context, so the AclEntry is tenant-scoped (visible to tenant users) while the role lookup still succeeds. Covered by service tests for the resolve-vs-write split and the missing-role failure.

* fix: Strip placeholder frontmatter booleans and check skill conflict before file sync

- 1083 (github.ts:759): toCleanFrontmatter now drops a non-boolean always-apply (e.g. the 'always-apply:' / 'always-apply: # TODO' placeholder, which js-yaml yields as null). The boolean is already captured in the dedicated alwaysApply field; persisting null left ambiguous frontmatter on the synced skill.
- 1080 (github.ts:1057): for an existing mirrored skill, check for an external content edit (via getSkillById + hasExternalSkillEdit) BEFORE syncSkillFiles mutates the bundled files, so a concurrently edited skill fails fast with SKILL_CONFLICT without partial file rewrites. The post-file-sync check still guards edits that land during the file sync window.
Tests: placeholder always-apply is dropped from synced frontmatter; concurrent-edit conflict leaves files unmutated (no upsert/delete).

* fix: Harden GitHub skill sync review paths

* fix: Reuse moved GitHub skill mirrors

* fix: Scope GitHub sync identity conflicts

* test: Fix GitHub sync conflict mock typing

* fix: Support nested env-backed skill sync

* fix: Keep skill sync config base-only

* fix: Scope GitHub skill identity lookup by tenant

* fix: Harden GitHub skill sync admin gates

* fix: Guard existing skill sync permission grants

* feat: Trigger skill sync from resolved config

* fix: Scope resolved skill sync by tenant

* test: Allow manual skill sync status tenant scoping

* refactor: Extract skill sync trigger orchestrator

* test: Complete orchestrator status fixture

* chore: Bump data provider version

* fix: Restrict skill sync server credentials

* test: Complete admin skill sync status fixtures

* fix: tighten skill sync trigger safeguards

* fix: preserve alwaysApply skill sync alias

* chore: sort skill sync imports

* fix: preserve skill sync request scope

* fix: harden skill sync review edges

* refactor: move skill sync admin access to api package

* fix: add skill sync declaration return types

* fix: satisfy skill sync type checks

* fix: resolve codex skill sync review findings

* fix: harden skill sync review edges

* fix: resolve codex skill sync edge findings

* fix: satisfy API declaration build after rebase
2026-06-10 21:05:54 -04:00
Danny Avila
470be2395f
feat: Surface Model Spec Branding on Landing and Selector (#13662)
Adds an opt-in showOnLanding flag to model specs. When set, the chat
landing shows the spec's label and description in place of the
time-of-day greeting; specs without the flag are unaffected, so existing
deployments see no behavior change. HTML-valued descriptions (inline
icons + markup) render sanitized via the shared config-HTML sanitizer
with a new media tag/attribute allowlist, both on the landing and in
model selector items. Excludes e2e specs from the typed client lint
block so staged e2e files no longer fail pre-commit with 'file not
found in project'.
2026-06-10 21:02:22 -04:00
Danny Avila
a52c82489e
🚷 fix: Reject Client-Supplied Subagent Configuration (#13660) 2026-06-10 16:10:59 -04:00
Dustin Healy
5867f1a065
🛡️ feat: Configurable Message PII Filter (#13602)
* 🛡️ feat: Reject chat messages matching configured credential patterns

Adds an opt-in `messagePiiFilter` middleware mounted on the agent
chat route ahead of `moderateText`. When the configured patterns
match the user's input the request is refused with 400, so the
credential never reaches OpenAI moderation, the model, or MongoDB.
Three starter patterns ship by default and operators can subset
them or add their own regex via `customPatterns` in librechat.yaml.

* 🧪 test: Memoize compiled patterns + add middleware spec

Memoize the compiled pattern array via a WeakMap keyed by the
messagePiiFilter config object so repeat requests against the same
config skip the per-request RegExp construction. Cache entries are
released automatically when the config object itself rotates.

Adds packages/api/src/middleware/messagePiiFilter.spec.ts covering
the default-starter rejections, the starterPatterns subset and
empty-array semantics, customPatterns matching layered on top of and
in place of the starters, the no-config and empty-text pass-through
paths, and a memoization regression check.

* 🛡️ fix: Skip invalid customPattern regexes instead of crashing the request

Admin DB overrides for `messagePiiFilter.customPatterns` reach
`req.config` via `mergeConfigOverrides`, which deep-merges raw
override values without re-running `configSchema`. A typo'd regex
like `(` would slip past the YAML-load validation and throw inside
`new RegExp(...)` during `compile()`, returning 500 for every chat
request until the operator rolled the override back.

Wrapped the per-pattern compile in a try/catch that logs the
invalid pattern id + reason and skips it, so other valid patterns
(starters and other custom entries) keep filtering. Added a
regression test alongside the existing spec.

* 🛡️ feat: Extend PII filter to OpenAI-compatible and Responses agent APIs

The chat-route middleware operates on `req.body.text`, but the remote
agent API endpoints (`/api/agents/v1/chat/completions`,
`/api/agents/v1/responses`) accept the same prompt content as a
`messages` array or an `input` field. A caller using their API key
could send a credential-shaped value through either route and bypass
the configured PII filter even though they share the same agent and
model backbone the middleware is meant to guard.

Factored out `findPiiMatchInMessages`, a tolerant walker that handles
both `content: string` and `content: ContentPart[]` user-message
shapes against the same compiled, cached pattern list. Wired it into
the OpenAI-compat controller after agent lookup and into the
Responses controller right after `convertToInternalMessages`. Each
returns the endpoint's native 400 error shape
(`sendErrorResponse` / `sendResponsesErrorResponse`) with the
`message_pii_filter_block` code when a user message matches.

* 🩹 test: Add findPiiMatchInMessages to OpenAI + Responses controller mocks

The OpenAI-compat and Responses controller specs mock `@librechat/api`
with a hand-listed object. The new `findPiiMatchInMessages` export
wired into both controllers in 3ea35af9a was missing from those
mocks, so the production lookup returned undefined and the controllers
threw at request time under jest. Added the missing entries (default
mock: returns null so the handlers fall through to the existing happy
paths). All 278 agents-controller tests pass locally.

* 🧹 refactor: Namespace messagePiiFilter under messageFilter.pii + fix import order

Renames the yaml field `messagePiiFilter` to `messageFilter.pii`, the
module to `messageFilterPii`, the factory to `createMessageFilterPii`,
the type to `MessageFilterPiiConfig`, and the error code to
`message_filter_pii_block`. The wrapper `messageFilter` namespace
gives future safety filters (e.g. `messageFilter.toxicity`) a place
to plug in without restructuring the config later. The
`findPiiMatchInMessages` helper kept its name because it already
describes what it does at the value level.

Also fixes import order Danny flagged on the OpenAI-compatible and
Responses controllers: `findPiiMatchInMessages` was appended at the
bottom of two `require('@librechat/api')` destructures rather than
placed in the length-sorted slot the house style expects.

* 🧹 chore: Length-sort the general require destructure in responses.js

Reorders the general sub-group inside the `require('@librechat/api')`
destructure shortest to longest so the whole block conforms to the
length-sort rule the file's `// Responses API` sub-group already
follows. Pure reorder, no other changes.

* 🧹 chore: Length-sort the defaultConfig block in AppService

Reorders the `defaultConfig` keys in `packages/data-schemas/src/app/service.ts`
shortest-line to longest-line, with the explicit-value entries
(`mcpConfig`, `fileStrategies`, `cloudfront`) trailing the shorthand
ones. Pure reorder, no behavior change.
2026-06-10 09:03:05 -04:00
Danny Avila
f074bd9e09
📦 chore: Bump jest-junit to v17.0.0 2026-06-09 20:38:30 -04:00
Danny Avila
ca26a2dc9c
🛰️ feat: Add GPT-5.5 + Frontier OpenAI Models, Drop Deprecated Defaults (#13636)
* 🛰️ feat: Add GPT-5.5 + Frontier OpenAI Models, Drop Deprecated Defaults

* 🛰️ fix: Address Codex Review on OpenAI Model Refresh

- Replace nonexistent gpt-5.5-chat-latest with the actual chat-latest
  alias; register its context window, output cap, pricing, and cache
  rates, and pin explicit rates for legacy gpt-5.x-chat-latest aliases
  so the new chat-latest key cannot out-match their cheaper pricing
- Add long-context premium tiers (>272K input) for gpt-5.5 and gpt-5.4
- Disable streaming for pro reasoning models (o1-pro, gpt-5.x-pro),
  which OpenAI does not support, with spec coverage

* 🛰️ fix: Address Codex Round-2 Review and CI Spec Failure

- Allow chat-latest through the official OpenAI fetched-model filter
- Export isProReasoningModel and drop unsupported sampling parameters
  for versioned pro models (gpt-5.4-pro, gpt-5.5-pro), which the
  versioned-model exemption previously let through
- Honor the pro-model streaming disable in both agent chat-completions
  routes, which decide SSE from model_parameters before llmConfig exists
- Update models.spec default-list assertions for the refreshed defaults
  and cover chat-latest filter retention

* 🛰️ fix: Address Codex Round-3 Review

- Convert max_tokens for chat-latest, which the gpt-[5-9] guard missed
- Drop snake_case sampling params (top_p, logit_bias, penalties) in the
  reasoning-model exclusion list so addParams-sourced values are removed
- Add createOpenAIAggregatorHandlers and wire them into the agent
  chat-completions service's non-streaming branch, which previously ran
  with no handlers and always returned an empty aggregated response

* 🛰️ ci: Fix Import Order Drift and Controller Spec Mock

- Sort type import first in service.spec.ts per import-order convention
- Register isProReasoningModel in the openai controller spec's
  @librechat/api mock factory, whose enumerated exports left the new
  helper undefined and broke the non-streaming flow under test

* 🛰️ chore: Trim Scope to Model Catalog Changes

Revert the OpenAI endpoint and agent handler changes (pro-model
streaming, sampling exclusions, non-streaming aggregation) — that
surface is moving out of LibreChat into the agents SDK and belongs
in its own change. Keep the model list, token windows, pricing, and
the fetched-model filter for chat-latest.

* 🛰️ fix: Correct GPT-5.4 Context Windows and Pro Long-Context Pricing

- Set gpt-5.4 and gpt-5.4-pro context to the documented 1,050,000
  window — 272K is the long-context pricing breakpoint, not the cap,
  and using it truncated prompts before they could reach that tier
- Add gpt-5.4-pro long-context premium rates ($60/$270 above 272K)
  per its model page; gpt-5.5-pro documents no long-context tier

* 🛰️ fix: Add gpt-5.4-nano and gpt-5.5-pro Long-Context Pricing

- Register gpt-5.4-nano ($0.20/$1.25, cached $0.02, 400K context) in
  the model list, pricing, cache, and token maps — the longest-match
  fallback billed it at gpt-5.4's $2.50/$15
- Add gpt-5.5-pro long-context premium rates ($60/$270 above 272K);
  the pricing table lists the tier even though the model page omits it
2026-06-09 20:12:31 -04:00
Danny Avila
a7f16911b2
fix: Extend and Decouple MCP OAuth Flow Timeouts (#13622)
*  fix: Extend and decouple MCP OAuth flow timeouts

The OAuth auth button disappeared after 2 minutes (the internal OAuth
handling timeout) while the flow state lived for 3 minutes, leaving users
who didn't click immediately stuck in an unrecoverable re-auth loop. The
handling timeouts also reused the connection/init timeout, so a short
initTimeout would shrink the OAuth window further.

- Add MCP_OAUTH_HANDLING_TIMEOUT (10m) and MCP_OAUTH_FLOW_TTL (15m) to mcpConfig
- Decouple the reactive/proactive OAuth waits from initTimeout/connectionTimeout
- Use OAUTH_FLOW_TTL for the FlowStateManager TTL and the UI status window
- Ensure the flow TTL outlives the handling timeout, fixing the
  "Flow state not found" race
- Remove dead FLOW_TTL constant and document new env vars

Fixes #13615

*  fix: Coordinate OAuth pending window with handling timeout

Address Codex review: the extended OAuth wait was still capped by other
timeouts that were not updated.

- Align PENDING_STALE_MS (button validity + pending-flow reuse window)
  with MCP_OAUTH_HANDLING_TIMEOUT so a flow stays reusable for the full
  wait instead of 2 minutes (Finding 3)
- Clamp MCP_OAUTH_FLOW_TTL to never fall below the handling timeout so a
  callback near the deadline still finds its flow state (Finding 2)
- Floor attemptToConnect's timeout to the handling window for OAuth
  servers so the reactive in-connect OAuth wait is not killed by the
  30s connection timeout (Finding 1)
- Update flow staleness tests to reference the threshold symbolically

*  fix: Align OAuth window across status, action flows, and client polling

Address Codex round 2: extending the server wait exposed three more
windows that were still capped or now over-extended.

- checkOAuthFlowStatus reports a PENDING flow as active only within the
  usable PENDING_STALE_MS window, not the longer Keyv retention TTL, so
  the connect button reappears instead of a stuck 'connecting' state
- Give Action (custom tool) OAuth its own FlowStateManager on the prior
  3-minute TTL so the longer MCP OAuth TTL can't leave an action tool
  call waiting up to 15 minutes
- Extend the MCP server-card client polling to the 10-minute handling
  window so a user who completes OAuth after 3 minutes is still picked up

* 🧪 test: Make stale-flow CSRF test track PENDING_STALE_MS

The CSRF-fallback stale-flow test hardcoded a 3-minute age, which is now
within the 10-minute PENDING_STALE_MS window and was wrongly treated as
active. Derive the age from PENDING_STALE_MS so it tracks the constant.

*  fix: Add grace buffers and surface OAuth timeout to the client

Address Codex round 3 (near-deadline edges):

- Clamp MCP_OAUTH_FLOW_TTL to handling timeout + 60s grace (not equality),
  so flow state outlives the wait instead of expiring at the same instant
- Extend attemptToConnect's OAuth floor by a 60s grace so a user who
  authorizes near the deadline still gets the post-OAuth reconnect
- Surface OAUTH_HANDLING_TIMEOUT on the connection-status response and
  have the client poll for the configured window instead of a hardcoded
  10 minutes, so a tuned server deadline isn't capped on the client

*  fix: Refresh client OAuth timeout from the first status refetch

If the connection-status cache is empty when polling starts, the client
captured the 10-minute fallback and never picked up a tuned oauthTimeout.
Re-read it after each refetch so a longer configured deadline is honored
even on a cold cache.

* 📝 refactor: Type oauthTimeout on MCPConnectionStatusResponse

Declare the oauthTimeout field on the shared response type in
data-provider instead of an ad-hoc inline cast in the client hook, and
replace the pre-existing 'as any' on the status query read with the
typed getQueryData. Type-level only; no runtime change.
2026-06-09 17:50:02 -04:00
Danny Avila
2aea5f4a3a
📖 feat: Add Claude Fable 5 Support (#13628)
* 📖 feat: Add Claude Fable 5 Support

Claude Fable 5 (`claude-fable-5`) is Anthropic's most capable widely
released model (GA 2026-06-09). Its naming drops the opus/sonnet/haiku
tier, so LibreChat's name-parsing helpers miss it; this teaches them the
Mythos-class family (Fable / Mythos) and registers the model.

- Add `parseMythosClassVersion` and route Fable/Mythos through
  `supportsAdaptiveThinking`, `omitsThinkingByDefault`,
  `omitsSamplingParameters`, and `supportsContext1m`
- Extend the Bedrock detection regexes (beta headers + adaptive-thinking
  branch) and `checkPromptCacheSupport` to match `claude-(fable|mythos)`
- Return 128K max output for Fable/Mythos in `maxOutputTokens.reset`/`set`
- Register `claude-fable-5` in shared Anthropic + Bedrock model lists,
  1M context / 128K output token maps, and $10/$50 pricing with 12.5/1
  cache rates (`claude-mythos-5` added to token + pricing maps only,
  since it is limited-availability)
- Update `.env.example` and the Vertex `librechat.example.yaml` examples
- Add parallel tests across tokens, Anthropic llm config, the Bedrock
  parser, and tx pricing

* 🧹 refactor: Centralize Mythos-class detection; address review feedback

- Add `isMythosClassModel` + `MYTHOS_CLASS_FAMILIES` in schemas.ts as the
  single source of truth for the Fable/Mythos family; route every gate
  (adaptive thinking, omit-thinking, omit-sampling, 1M context, prompt cache,
  128K max-output reset/set) through it. A future sibling class is now a
  one-line edit.
- [Codex P2] Exclude Mythos-class from getBedrockAnthropicBetaHeaders: Fable/
  Mythos ship 128K output + fine-grained tool streaming by default, and the
  legacy output-128k-2025-02-19 beta is 3.7-Sonnet-only on Bedrock and risks
  request rejection. They still get adaptive thinking + effort.
- [Copilot] Add Mythos 5 test parity (name variations, cache rates, pinned
  $10/$50) in tx.spec; add Mythos context/max-output/name-match in tokens.spec;
  fix the stale claude-3-7-sonnet-only comment in bedrock.ts.
- Add isMythosClassModel unit tests covering all declared families.

* 📝 docs: Clarify Mythos-class Bedrock requirements; correct beta-omit rationale

Verified live against Bedrock (acct 951834775723, us-west-2):
- anthropic.claude-fable-5 IS a real Bedrock catalog model, INFERENCE_PROFILE-only
  exactly like the existing anthropic.claude-opus-4-7/4-8 and claude-sonnet-4-6
  default entries (refutes the "invalid model id" review claim).
- Mythos-class also requires opting into Anthropic data sharing (Bedrock Data
  Retention API) before invocation.

Changes:
- .env.example: note that Mythos-class (Fable/Mythos) is inference-profile-only on
  Bedrock and needs the data-sharing opt-in.
- bedrock.ts: reword the beta-omit comment to the verified rationale — output-128k /
  fine-grained-tool-streaming are built-in/no-op for the 4.7+ generation, so omitting
  them is lossless (dropped the unverified "Bedrock may reject" wording).

* 🔄 refactor: Reorganize imports in schemas.ts and tx.spec.ts

- Moved `TFeedback` and `Tools` imports to the top of `schemas.ts` for better readability.
- Adjusted import order in `tx.spec.ts` to maintain consistency and improve clarity.
2026-06-09 16:22:39 -04:00
Danny Avila
8fc2314208
🧠 fix: Bound Memory Agent Input (#13606) 2026-06-09 14:38:21 -04:00
Danny Avila
753e53eddd
🛬 fix: Coalesce Auth Recovery into a Single Refresh Flight (#13618)
* fix auth recovery singleflight

* add auth recovery e2e coverage

* handle invalid auth redirect timestamp
2026-06-09 12:04:12 -04:00
Danny Avila
2a956f143d
🪞 fix: Preserve Model Spec Icons Across Stream Resume and Abort (#13603)
Some checks are pending
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
2026-06-08 17:14:21 -04:00
Danny Avila
fb87abe773
🧩 feat: Enable Model Spec Subagents (#13598)
Some checks failed
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
Publish `@librechat/client` to NPM / pack (push) Has been cancelled
Publish `librechat-data-provider` to NPM / pack (push) Has been cancelled
Publish `@librechat/client` to NPM / publish-npm (push) Has been cancelled
Publish `librechat-data-provider` to NPM / publish-npm (push) Has been cancelled
2026-06-08 11:43:08 -04:00
Danny Avila
fbc51f6e66
refactor: Migrate data-provider Build to tsdown (split tsc dts) (#13597)
Replace the Rollup + `rollup-plugin-typescript2` build with a split
pipeline: tsdown (rolldown) bundles the JS in ~0.2s, and plain `tsc`
emits the declarations to `dist/types` (~2s). Full cold build drops from
~9.2s to ~2.5s (~3.6x) with zero source changes.

Unlike data-schemas, the fast oxc/isolated-declarations dts path isn't
viable here: the package's 78 exported zod schemas produce 374
`isolatedDeclarations` errors (TS9013/TS9038) and a `z.ZodType<T>`
annotation would break the 76 downstream `.extend`/`.shape`/`.pick`
usages. Plain `tsc` keeps the rich zod types intact, and since dts was
never the bottleneck (rollup-plugin-typescript2 was), the win stands.

- dts stays unbundled in `dist/types/` — identical to the prior output,
  so the existing deep `dist/types` imports and the exports `types`
  paths are unchanged.
- ESM output renamed `index.es.js` -> `index.mjs` (via the exports map;
  no consumer hardcodes the old path). cjs/types paths unchanged.
- `./react-query` now emits a real cjs build + types — the exports map
  already promised them, but Rollup only ever built the esm file.
- Kept `rollup` + the plugins used by `server-rollup.config.js`
  (the `rollup:api` server-bundle smoke test in backend-review.yml);
  removed only the deps used solely by the deleted `rollup.config.js`.
- Repointed CI build-cache keys from `rollup.config.js` to
  `tsdown.config.mjs`.
2026-06-08 11:09:16 -04:00
Danny Avila
6edbafd09d
⬆️ chore: Bump TypeScript to 5.9.3 (+ typescript-eslint 8.60.1) (#13584)
Bumps typescript 5.3.3 -> 5.9.3 across all workspaces. typescript-eslint must move 8.24.0 -> 8.60.1 too: 8.24's typescript peer was capped at <5.8.0; 8.60.1 widens it to <6.1.0.

Two errors surfaced by the newer compiler are fixed:
- api/src/rum/proxy.ts: TS 5.9 made `Buffer` generic (`Buffer<ArrayBufferLike>`), which no longer structurally matches `BodyInit`; cast the fetch body (Node's fetch accepts a Buffer at runtime).
- client usePresetIndexOptions.ts: drop a dead `|| {}` on an object spread (always truthy — flagged by the new TS2872 check).

All four package typecheck jobs + the client app typecheck pass under 5.9.3; builds (tsdown + rollup) and the rum proxy tests are unaffected.
2026-06-07 22:20:44 -04:00
Danny Avila
192703e041
perf: Migrate data-schemas Build to tsdown with isolatedDeclarations (#13578)
*  perf: Migrate data-schemas Build to tsdown with isolatedDeclarations

Replace Rollup with tsdown (rolldown + oxc) for @librechat/data-schemas. With the source made isolatedDeclarations-clean, oxc emits .d.ts without tsc, dropping the package build from ~5.8s to ~0.8s (~7x).

- Annotate exported model/method factories for isolatedDeclarations (TypeScript's fixMissingTypeAnnotationOnExports codefix plus hand-authored interfaces); type the ~44 mongoose `any`s and add an explicit PromptMethods interface (previously its declaration was silently dropped by the Rollup build).
- Repoint package.json exports/main/module/types to tsdown output; drop rollup config.
- Config lives in tsdown.config.mjs (native ESM) so CI without a TS-config loader can build it; bundle `dotenv` so the package stays self-contained for its env-loading side effect.
- Fix a latent token `metadata` mismatch the accurate types surfaced: widen TokenCreate/UpdateData inputs to accept plain objects, flatten OAuthMetadata at the api boundary.
- Update mongoMeili/aclEntry specs to the precise model types; drop redundant terser minification from data-provider's library build.

All data-schemas tests pass; api builds clean against the new output.

* 🔧 chore: Hash tsdown.config.mjs in data-schemas CI build-cache keys

The data-schemas build switched from rollup to tsdown, but the build-data-schemas / build-api cache keys in backend-review, config-review, and playwright-mock still hashed the (now-deleted) rollup.config.js. Hash tsdown.config.mjs instead so a config-only change invalidates the cached dist/api builds. (Found by Codex review.)

* 🔧 chore: Replace deprecated tsdown `external` with `deps.neverBundle`

tsdown 0.22 deprecated the top-level `external` option in favor of `deps.neverBundle`. Migrate the data-schemas config and set `deps.onlyBundle: false` to silence the (intentional) dotenv bundling hint. Build output and externalization are unchanged — dotenv bundled, all peers external.
2026-06-07 21:40:48 -04:00
René Mulder
c80c54eefb
🔐 fix: Resolve Env Variables in MCP OAuth URL Fields (#13573)
* fix: resolve env variables in MCP OAuth URL fields before validation

Apply the extractEnvVariable transform to authorization_url, token_url,
redirect_uri, and revocation_endpoint in OAuthOptionsBaseSchema. Without
this, ${ENV_VAR} syntax in these fields caused a Zod URL validation error
at startup before any env substitution could happen.

The same .transform().pipe() pattern is already used on all transport url
fields (SSE, WebSocket, StreamableHTTP) and ProxyUrlSchema.

Closes #13572

* fix: block env var expansion in user OAuth URL fields

Override redirect_uri and revocation_endpoint in UserOAuthOptionsSchema
with userOAuthEndpointUrlSchema, matching the existing overrides for
authorization_url and token_url. Without this, user-submitted configs
could inherit the extractEnvVariable transform added to the base schema
and resolve env vars like ${OPENAI_API_KEY} in those fields.

Add envVarPattern rejection to userOAuthEndpointUrlSchema so that
valid-URL-shaped payloads containing ${VAR} patterns are also blocked,
not just bare non-URL strings. Move envVarPattern declaration above the
schema to make it available at module evaluation time.

Add regression tests for all four OAuth URL fields on the user path,
using structurally valid URLs with embedded ${VAR} patterns to confirm
it is the env var guard — not URL shape — that rejects them.
2026-06-07 21:39:44 -04:00
Danny Avila
cb1d536874
📻 fix: Replay MCP OAuth Prompts for Coalesced Connections (#13565)
* fix: Replay MCP OAuth URL for Joined Connections

* chore: Sort MCP OAuth Imports

* test: Restore MCP OAuth Registry Spies

* fix: Replay pending MCP OAuth prompts

* fix: Replay MCP OAuth on Stream Resume

* fix: Preserve MCP OAuth Replay Context

* chore: Format MCP OAuth Replay Context

* test: Expect MCP OAuth Replay Expiry

* fix: Render pending MCP OAuth prompts

* chore: Clean MCP OAuth Replay Type Narrowing

* fix: Stabilize new MCP OAuth chats

* fix: Re-emit cached MCP OAuth prompts

* fix: Replay pending OAuth for selected MCP tools

* fix: Avoid stalling pending MCP OAuth replay

* test: Clean MCP OAuth review findings

* test: Restore MCP OAuth registry spy

* fix: Resolve OAuth Typecheck Regressions

* fix: Harden MCP OAuth replay edge cases

* test: Cover MCP OAuth joined prompt expiry

* test: Mark joined OAuth replay fixture

* test: Use OAuth fixture for joined replay expiry

* fix: Anchor resumed MCP OAuth prompts

* fix: Seed resumable turn metadata before MCP init

* test: Format resume metadata regression

* fix: Prioritize resumable stream routes

* fix: Preserve MCP OAuth resume message tree

* test: Fix MCP OAuth Resume Test Types

* fix: Replay MCP OAuth Regenerate Prompts

* fix: Skip OAuth-only Abort Persistence

* fix: Stabilize OAuth Resume Replay

* fix: Target Non-Tail Regenerate Responses

* fix: Scope Regenerate Step Updates

* fix: Clean Up OAuth Abort State

* fix: Preserve Regenerate Branch Siblings

* fix: Preserve OAuth Resume Branch State

* fix: Preserve OAuth Branch Resume State

* chore: Sort OAuth Resume Imports

* fix: Address OAuth Resume Review Findings

* test: Fix Abort Fixture Typing
2026-06-07 10:45:54 -04:00
Danny Avila
265d660076
👷 ci: Type-check the Client Workspace (#13560)
The `client/` workspace was never type-checked: the existing typecheck
job only covered `packages/` and `api/`, and Vite/esbuild transpiles
without type-checking, so type errors shipped through every CI gate.

- Add a `typecheck` job to frontend-review.yml running `tsc --noEmit`
  over `client/` (zero tolerance), reusing the data-provider +
  client-package build artifacts. Triggers on `client/**`,
  `packages/client/**`, `packages/data-provider/**`.
- Fix all 168 pre-existing client type errors this surfaced (source +
  tests), including genuine latent bugs:
  - `getFileConfig()` was typed as merged `FileConfig`, but the server
    returns the raw config that `mergeFileConfig()` consumes (`TFileConfig`).
  - SidePanel/Agents `Retrieval`/`ImageVision` were bound to `AgentForm`
    but use the assistants `Capabilities` enum → `AssistantForm`.
  - `useSearchResultsByTurn` read a `sources` field its type lacked.
  - Removed orphaned dead code: `Artifacts/Mermaid.tsx` (imported a
    never-installed dep) and dead barrel re-exports (`./Plugins`, `./MCPAuth`).
- Narrow `client/tsconfig.json` to the client app (drop `../e2e` and
  `../config/translations`, which reference backend/tooling modules) so
  the gate's scope matches its trigger.

No `any`/`@ts-ignore`/`as unknown as`. Localized newly-surfaced strings.
2026-06-06 18:40:31 -04:00
Danny Avila
83bdd3d65d
🌱 feat: Support Soft Default Model Spec (#13554)
* feat: add soft default model spec

* chore: sort ChatRoute imports
2026-06-06 14:20:59 -04:00
Danny Avila
2c8d54e18c
🗂️ feat: Add Deployment Skill Directory (#13523)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* feat: Add deployment skill directory

* chore: Address deployment skill review feedback

* fix: Include deployment skill file metadata

* test: Add deployment skills e2e smoke test
2026-06-05 10:24:28 -04:00
Danny Avila
6357ea10c1
🧭 feat: Scope Model Spec Skills (#13522)
* feat: scope model spec skills

* style: format skill catalog limit

* fix: serialize model spec skill resolution

* test: satisfy model spec load config typing

* fix: apply model spec skills to added conversations

* fix: support alwaysApply frontmatter alias

* fix: address model spec skills review
2026-06-05 10:22:02 -04:00
Danny Avila
baa23a8e24
🗂️ feat: Add Private Chat Projects (#13467)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
* feat: Add private chat projects

* fix: Format project files

* fix: Address project review findings

* fix: Resolve project review follow-ups

* fix: Handle project stats and cache edge cases

* style: align projects UI with sidebar patterns

* fix: resolve projects UI lint issues

* style: Align project menus and composer

* fix: Avoid project placeholder shadowing

* fix: Handle project search and stale ids

* fix: Polish project sidebar behavior

* fix: Preserve new chat stream after creation

* fix: Stabilize project sidebar sections

* fix: Smooth project sidebar organization

* fix: stabilize project chat entry

* fix: keep project workspace outside chat context

* fix: show default model on project workspace

* fix: fallback project workspace model label

* fix: preserve project scope during draft hydration

* fix: include route project in new chat submission

* fix: persist project id in agent chat saves

* fix: refine project sidebar and creation UX

* fix: export chat project method types

* fix: polish project landing context

* fix: refine project navigation affordances

* feat: rework projects UX — coexisting sidebar sections + URL-driven scope

Sidebar
- Replace the chronological/by-project mode toggle with coexisting
  Projects + Chats sections (both always visible)
- Remove ProjectConversations (927 lines), the org-mode Header, and types
- Add ProjectsSection: collapsible project rows that unfurl chats inline
  (full-size rows), with per-project new chat and an open/rename/delete menu
- Lift the marketplace/favorites shortcuts above the Projects section

Chat scope
- Derive a new chat's project strictly from the URL ?projectId, so the
  global New Chat no longer stays stuck in a project after a project chat

Surfaces
- Chat landing: subtle, clickable project chip instead of the floating badge
- Project workspace: modest header, composer-style entry, chats list
- All-projects grid: Claude-style cards with pluralized chat counts

* chore: prune unused i18n keys; fix project chat-count pluralization

* fix: project new-chat keeps model spec; sidebar header + row polish

- newConversation: ignore a chatProjectId-only template when deciding to
  apply the default model spec, so starting a chat in a project no longer
  strips the conversation `spec`
- useSelectMention: the Model Selector and @ command now retain the active
  project across endpoint/spec/preset switches; other new-chat paths still
  clear it
- Chats header now matches the Projects header (inline chevron + a new-chat
  icon button) and starts a non-project chat
- Project rows: use the new-chat icon for the per-project add button, render
  at text-sm to match the chat list, and align the row actions + hover color
  with conversation rows

* fix: read project scope from router params; align sidebar header icons

- useSelectMention now reads the active project from React Router's search
  params instead of window.location, which can drift out of sync because
  new-chat params are written to the URL via raw history.pushState; the
  Model Selector and @ command now reliably keep the project on switch
- Move the Chats section header out of the virtualized list so it renders
  in the same context as the Projects header and isn't shifted by the
  list scrollbar
- Inset header action icons (pr-2) so Projects/Chats header icons line up
  with the project-row and conversation-row trailing actions
- Extract getRouteChatProjectId into utils for the submit path

* fix: preserve chatProjectId through the new-chat template reduction

The param-endpoint guard in newConversation reduced a new chat's template to
{ endpoint } only, dropping the chatProjectId injected by the Model Selector /
@ switch — so switching models cleared the project scope. Keep chatProjectId
in the reduced template.

* style: align chat-history panel top padding; improve projects page contrast

- Add pt-2 to the chat-history panel so its top spacing matches the other
  side panels (agent builder, skills, files, etc.)
- Projects grid + workspace now use the darkest surface for the page
  (surface-primary) with cards, inputs, and the composer one step lighter
  (surface-secondary) and tertiary on hover, so cards read as elevated
  rather than darker than the background

* feat: interactive project landing chip + gallery icon for all-projects

- All-projects sidebar button uses the gallery-vertical-end icon
- The project landing chip is now interactive: click it to switch projects
  via a searchable combobox (ControlCombobox), or the trailing × to drop the
  project scope. Both update the draft conversation and the ?projectId search
  param in place, so the typed message and selected model are preserved

* test: fix Conversations unit test for refactored sidebar; add projects e2e

- Update Conversations.test.tsx mocks for the inline Chats header
  (useNewConvo, useQueryClient, conversation atom, NewChatIcon, TooltipAnchor),
  drop the removed chatsHeaderControls prop, and remove the mock for the
  deleted ../Header module — fixes the failing frontend Jest job
- Add e2e/specs/mock/projects.spec.ts covering project creation, the
  project-scoped new-chat landing + interactive chip (switch/remove), and
  listing projects on /projects
- Give the landing chip combobox a stable selectId for reliable targeting

* fix: refresh project stats after project-chat activity; stabilize e2e

- useEventHandlers: when a project chat is created/updated, invalidate the
  live [projects] query (gated on chatProjectId) instead of the now-unused
  projectConversations key, so the sidebar + all-projects stats refresh
  after a streamed reply (addresses a Codex finding)
- projects e2e: assert the reliable project-landing behavior (chip, scoped
  composer, accepted send) rather than the /c/:id transition, which the
  mock LLM harness doesn't complete

* test: verify a project chat saves and is filed under its project (e2e)

- Switch to a mock endpoint before sending so the message streams without a
  real API key (the default model failed with "No key found", so no chat was
  saved and the page never left /c/new); this also asserts the project chip
  survives the model switch
- Restore the reply + /c/:id transition assertions and add a check that the
  chat is listed under the expanded project in the sidebar
- Add data-testid="project-chats-<id>" to the inline project chat list

* fix: address Codex review findings (project scope edge cases)

- useSelectMention: fall back to the conversation's chatProjectId when the
  URL has no projectId, so switching model/spec inside an existing project
  chat (/c/:id) keeps the project assignment
- Conversations: include chatProjectId in the MemoizedConvo comparator so a
  sidebar row's project menu doesn't stay stale after a reassignment
- useDeleteProjectMutation: clear the active conversation's chatProjectId
  when its project is deleted (mirrors the assignment mutation); drop the
  now-dead projectConversations invalidation
- useQueryParams: carry the project into the new conversation when applying
  URL settings, so /c/new?projectId=...&<settings> stays scoped

* fix: project stats pagination + archived-chat edge cases (data-schemas)

- listChatProjects: include the null lastConversationAt bucket in the desc
  cursor so empty projects paginate (a $lt:<date> predicate excluded nulls,
  hiding chat-less projects from "Load more")
- saveConvo: recompute project stats instead of the incremental fast path
  when the saved conversation is itself archived/temporary/expired, so a
  project's lastConversationAt/Id no longer points at a hidden chat

* test: cover chat-less project pagination across the dated→null boundary

* fix: validate project ownership in bulkSaveConvos

Bulk paths (import/duplicate/fork) persisted whatever chatProjectId the
payload carried; an id that does not belong to the user created an orphan
assignment hidden from both the project and the unassigned sidebar. Validate
ownership like saveConvo and strip un-owned project ids before persisting,
refreshing stats only for owned projects.

* fix(projects): preserve chatProjectId on continuation, basename-safe delete redirect, project-detail invalidation

* fix(projects): navigate project workspace chats via useNavigateToConvo to avoid stale conversation state

* fix(projects): include projectConversations cache when resolving deleted chat's project for detail invalidation

* fix(projects): refresh both projects when a save or bulk write moves a chat between them

* style(projects): use Folders icon for the sidebar Projects header

* fix(projects): require id on ProjectUser so ProjectRequest extends Express Request cleanly

* style(projects): taller project chip with hover-revealed remove button, upward combobox; sort en translations

* style(projects): show endpoint/agent icon for project workspace chat rows
2026-06-03 15:29:18 -04:00
Atef Bellaaj
86fe79c37d
🔗 feat: Add Granular Access Control to Shared Links via ACL System (#13051)
* feat: Add granular access control to shared links via ACL system

* fix(shared-links): preserve isPublic on failed migration grants

Transient ACL failures during auto-migration permanently stranded
links — $unset ran unconditionally, removing the legacy flag that
triggers retry. Now only $unset isPublic after all grants succeed.

* fix(config): skip isPublic unset for failed ACL grants

Bulk migration unconditionally removed isPublic from all links,
even those whose ACL writes failed. Failed links then lost the
legacy marker needed for auto-migration retry. Now tracks failed
link IDs per-batch and excludes them from the $unset step.

Also adds sharedLink to AccessRole resourceType schema enum —
was missing, only worked because seedDefaultRoles uses
findOneAndUpdate which bypasses validation.

* ci(config): add jest config and PR workflow for migration tests

config/__tests__/ specs depend on api/jest.config.js module
mappings but had no dedicated runner. Adds config/jest.config.js
extending api config with absolutized paths, npm test:config
script, and a GitHub Actions workflow triggered by changes to
config/, api/models/, api/db/, or packages/ ACL code.

* fix(permissions): honor boolean sharedLinks config

SHARED_LINKS has no USE permission, so boolean config produced
an empty update payload — gate conditions only matched object
form, making `sharedLinks: false` a no-op on existing perms.

* fix(share): resolve role before creating shared link

Role lookup between create and grant left an orphaned link
without ACL entries if getRoleByName threw — retry then hit "Share already exists" with no recovery path.

* fix: Restore Public ACL Access Checks

* fix: Type Public ACL Lookup

* fix: Preserve Private Legacy Shared Links

* chore: Promote Shared Link Permission Migration

* fix: Address Shared Link Review Findings

* fix: Repair Shared Link CI Follow-Up

* fix: Narrow Shared Link Mongoose Test Mock

* fix: Address Shared Link Review Follow-Ups

* fix: Close Shared Link Review Gaps

* fix: Guard Missing Shared Link Permission Backfill

* test: Add Shared Link Mock E2E

* test: Stabilize Shared Link Mock E2E

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-06-03 14:17:17 -04:00
Danny Avila
2ef7bdfbc2
feat: Immediate Conversation Title Generation (#13395)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
*  feat: Immediate Conversation Title Generation

Generate conversation titles as soon as the request is made (in parallel
with the response, from the user's first message) as the new default,
fixing the #13318 race where a transient /gen_title 404 left new chats
stuck on "New Chat".

- Add per-endpoint `titleTiming` ('immediate' | 'final') to baseEndpointSchema;
  `endpoints.all` acts as the global default, unset = immediate. Resolve via
  a new `resolveTitleTiming` helper (`all` takes precedence).
- Fire title generation in parallel with `sendMessage`; `titleConvo` waits
  (bounded, abortable) for the agent run and titles from the user input only.
  Persist after the conversation row exists; defer `disposeClient` until the
  title settles.
- Expose `titleGenerationTiming` via startup config; `useTitleGeneration`
  fetches eagerly in immediate mode with a bounded 404 retry and never treats
  a transient 404 as final. Skip title queueing for temporary conversations.
- Supersedes #13329 while incorporating its bounded 404-retry.

* 🩹 fix: Address Copilot review findings on title timing

- Guard against an undefined conversationId in addTitle (skip + warn) so the
  gen_title cache key can't collide as `userId-undefined` and saveConvo is
  never called without a conversationId.
- Gate the title `useQueries` on `enabled` so no /gen_title request fires while
  unauthenticated (e.g. after logout) even if the module queue holds IDs.
- Drop the stale `conversationId` param from the titleConvo JSDoc.
- Add a regression test for the undefined-conversationId guard.

* 🧵 fix: Harden immediate-title edge cases from codex review

- Cancel in-flight immediate title generation when the request aborts: thread
  job.abortController.signal through addTitle so pressing Stop on a new chat
  neither consumes the title model nor surfaces a title for a cancelled turn.
- Preserve a locally-applied title when the final SSE event's conversation
  carries no title yet (built before the title was saved), so long immediate-mode
  responses no longer revert the chat to "New Chat" until reload.
- Guarantee one full post-completion gen_title fetch cycle before giving up, so a
  `final`-mode title (generated only after the stream ends) is still fetched under
  a global `immediate` default instead of being stranded.
- Add regression tests for the abort propagation and the undefined-conversationId guard.

* 🔁 fix: Correct title abort, post-completion refetch, and replacement ordering

Follow-up to codex review of the immediate-title fixes:

- Use a dedicated title AbortController instead of `job.abortController`. The
  latter is also aborted by `completeJob` on *successful* completion, which
  cancelled any title slower than a short response. The title is now cancelled
  only on a real user Stop or when the stream is replaced; a completed-then-
  aborted title is discarded (no save, cache cleared) rather than persisted.
- Reset (not remove) the post-completion title query: `resetQueries` refetches
  the mounted observer with a fresh retry budget, whereas `removeQueries` left it
  stuck in its error state, so the promised post-completion cycle never ran.
- Run the job-replacement check before resolving `convoReady`, and on a replaced
  stream cancel/discard the stale title so a discarded prompt can't persist a title.

* 🧷 fix: Tighten title abort ordering and endpoint-level timing resolution

Follow-up to codex review:

- Abort the title controller before resolving `convoReady` on a stopped turn, so
  the title task can't resume and persist before the later abort.
- Cancel the title and unblock its waits on ANY send failure (not just user
  aborts): a preflight/quota failure before the run exists otherwise hangs
  `_waitForRun`, deferring client disposal until the 45s title timeout.
- Resolve `titleTiming` for custom endpoints via `getCustomEndpointConfig`
  (their config lives under `endpoints.custom[]`, not `endpoints[endpoint]`).
- Derive the startup `titleGenerationTiming` via `resolveTitleTiming` for the
  agents endpoint so an endpoint-level `final` (without `endpoints.all`) is honored
  client-side instead of defaulting to immediate and burning eager gen_title polls.

* 🪢 fix: Per-agent title timing and safer abort/replacement handling

Follow-up to codex review:

- Resolve `titleTiming` from the agent's actual endpoint after initialization, so a
  per-endpoint `final` override on a custom/provider endpoint backing an (ephemeral)
  agent is honored instead of always using the `agents` endpoint's value.
- Don't preserve a locally-fetched title on a stopped (unfinished) turn: the server
  cancels and discards that title, so keeping it client-side would diverge from
  server state and leave the stopped chat titled until reload.
- On abort/replacement, only delete the cached title if it still holds THIS task's
  value — a replacement stream shares the `userId-conversationId` key and may have
  already cached its own valid title that must not be removed.

* 🪞 fix: Mirror AgentClient title-config resolution for titleTiming

Per maintainer guidance, keep titleTiming resolution identical to how
`AgentClient#titleConvo` already resolves the endpoint config — `endpoints.all`
is the intended global override and the agent's actual provider endpoint is used:

- Resolve via `endpoints.all ?? endpoints[endpoint] ?? getProviderConfig(endpoint)
  .customEndpointConfig` (was using `getCustomEndpointConfig` directly). Going
  through `getProviderConfig` picks up its case-insensitive fallback for normalized
  provider names (e.g. `openrouter` → `OpenRouter`), so a custom endpoint's
  `titleTiming` is honored like its other title settings.
- Add `titleTiming` to the Azure endpoint schema `.pick()` so
  `endpoints.azureOpenAI.titleTiming` is no longer silently stripped by Zod.

Note: per-endpoint title settings being skipped when `endpoints.all` is present is
the existing, intended global-override behavior — not changed here.

* 🧪 test: Cover useTitleGeneration effect logic (integration)

Adds a deterministic white-box integration test that drives the real hook's
React effects with a controllable react-query surface, locking down the
stateful decisions that previously had no coverage:

- immediate mode fetches a queued conversation while its stream is still active
- final mode gates until the stream completes, then becomes eligible
- success applies the fetched title to the conversation caches
- a 404 while active defers (removeQueries) instead of giving up
- a 404 after completion forces a fresh fetch via resetQueries (post-completion remount)

* feat: Stream immediate title events

* style: Format title SSE handler

* test: Preserve data-provider exports in OAuth mock

* test: Isolate OAuth route API mock

* test: Keep OAuth callback factory capture

* fix: Replay streamed title events on resume

* fix: Honor agents title timing precedence

* style: Format title timing fixes
2026-06-02 16:40:57 -04:00
Danny Avila
8ba0249f1e
🗃️ feat: Retain Agent Files During All-Data Retention (#13477)
* feat: add agent file retention exemption

* refactor: centralize agent file retention policy
2026-06-02 15:04:10 -04:00
jcbartle
268f095c1a
🔒 feat: Add On-Behalf-Of (OBO) token exchange support for MCP Servers (#13429)
Some checks failed
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
Publish `librechat-data-provider` to NPM / pack (push) Has been cancelled
Publish `@librechat/data-schemas` to NPM / pack (push) Has been cancelled
Publish `librechat-data-provider` to NPM / publish-npm (push) Has been cancelled
Publish `@librechat/data-schemas` to NPM / publish-npm (push) Has been cancelled
* Add OBO (On-Behalf-Of) token exchange support for MCP server connections

Enables transparent authentication to Entra ID-backed MCP servers using the logged-in user's federated token via the OAuth 2.0 jwt-bearer grant. Configured via obo.scopes in librechat.yaml server config.

- Extract generic OboTokenService from GraphTokenService (jwt-bearer grant + cache)
- Refactor GraphTokenService to thin wrapper delegating to OboTokenService
- Add obo schema field to BaseOptionsSchema in data-provider
- Add resolveOboToken in packages/api/src/mcp/oauth/obo.ts (validates federated token, calls resolver, returns MCPOAuthTokens)
- Wire oboTokenResolver through MCPConnectionFactory, MCPManager, UserConnectionManager
- OBO tokens injected via request headers (not OAuth transport), refreshed on each tool call
- Explicit error on OBO failure (no fallthrough to standard OAuth redirect)
- Add unit tests for both resolveOboToken (9 tests) and exchangeOboToken (14 tests)

* Add OBO authentication option to MCP server UI configuration

  Enable users to configure On-Behalf-Of (OBO) token exchange for MCP servers created via the UI (MongoDB-stored), in addition to the existing YAML-based configuration.

  - Add "On-Behalf-Of (OBO)" radio option to MCP server auth section with scopes input field
  - Remove obo from omitServerManagedFields so the field passes UI schema validation
  - Add OBO to AuthTypeEnum, obo_scopes to AuthConfig, and OBO handling in form defaults and submission
  - Add .min(1) validation on obo.scopes to reject empty strings
  - Add English localization keys: com_ui_obo, com_ui_obo_scopes, com_ui_obo_scopes_description
  - Add 5 schema validation tests for OBO field acceptance, transport compatibility, and edge cases

* 🧊 fix: Add obo to safe properties in redactServerSecrets. Fixes the OBO configuration not showing up in the MCP UI after app restart

* Address linter errors

* 🧊 fix: fail closed on OBO refresh errors and retry transient token exchange failures

- stop tool calls from falling back to stale Authorization headers when per-call OBO refresh fails
- add one-time retry for transient Entra OBO exchange failures (network/429/5xx)
- preserve structured OBO failure reasons and retryability in resolveOboToken
- improve OBO auth error messaging for connection setup and tool execution
- add tests for transient vs permanent OBO failure paths

* Addressing linting errors / warnings

* 🧊 fix: isolate OBO MCP auth to user-scoped connections

- block OBO-enabled servers from app-level shared MCP connections
- bypass shared connection lookup for OBO servers in MCPManager.getConnection
- add regressions covering OBO connection scoping and preserve non-OBO app connection reuse

* 🛠️ refactor: centralize MCP user-scoped connection policy

- add shared requiresUserScopedConnection helper for OAuth, OBO, and customUserVars
- use the shared predicate in MCPManager and ConnectionsRepository
- add utils coverage for user-scoped connection policy

* 🧊 fix: restrict MCP OBO config to header-capable transports

- Move OBO configuration out of the shared MCP base options schema and allow it
only on SSE and streamable-http transports, where request headers are applied.
- Explicitly reject OBO on stdio and websocket configs to avoid accepted-but-
nonfunctional server definitions. Add schema coverage for admin/config parsing
and user-input websocket validation.

* 🧊 fix: single-flight concurrent OBO token exchanges

Concurrent tool calls that arrive on a cache miss were each issuing
their own jwt-bearer request to the IdP. Under that fan-out, Entra
intermittently returned errors that the retry classifier saw as
non-retryable, surfacing as:

  "The identity provider rejected the OBO token exchange.
   Cannot execute tool <name>. Re-authenticate the user or
   verify the configured OBO scopes and retry."

A user retry then hit the populated cache and succeeded, which matches
the observed flakiness — the cache was empty at the moment of fan-out
but populated by the time the user clicked retry.

- Coalesce concurrent exchanges in `OboTokenService.exchangeOboToken`
keyed by `${openidId}:${scopes}`. Callers that arrive while an exchange
is in flight share the same upstream request and receive the same
result. `fromCache=false` continues to force a fresh, independent
exchange (and is not joined by `fromCache=true` callers). The IdP
call, single-retry path, and cache write are unchanged — they were
moved into a `performOboExchange` helper so the coalescing wrapper
stays small.
- Tests cover: coalescing on the same key, isolation between different
keys, cleanup on success, cleanup on failure, and the
`fromCache=false` bypass.

* 🔒 feat: gate MCP OBO config behind MCP_SERVERS.CONFIGURE_OBO permission

OBO silently mints per-user delegated tokens from the caller's federated
access token and forwards them to whatever URL the server config points at.
Previously, anyone with MCP_SERVERS.CREATE could configure obo.scopes — so
if server creation is ever delegated beyond admins, a user could stand up
an attacker-controlled server, attach it to a shared agent, and exfiltrate
other users' downstream tokens on tool invocation.

Add a dedicated MCP_SERVERS.CONFIGURE_OBO permission (ADMIN: true, USER:
false by default) and enforce it at three layers so the safety property
no longer depends on CREATE staying admin-only:

- Create/update: POST/PATCH /api/mcp/servers returns 403 when the body
  carries `obo` and the caller's role lacks the permission.
- Runtime fail-closed: for DB-sourced configs, MCPConnectionFactory and
  MCPManager.callTool re-check the original author's role before each
  OBO exchange. If the author has been downgraded, the exchange is
  skipped (factory) or refused (callTool) — retained configs lose their
  privileges automatically.
- UI: the OBO option is hidden in the MCP server dialog for users
  without the permission; a CONFIGURE_OBO toggle is exposed in the MCP
  admin role editor.

Existing role docs receive the new sub-key via the permission backfill
in updateInterfacePermissions on next startup, preserving any
operator-set values. YAML/Config-sourced server configs are unaffected
since they're admin-controlled at the deployment level.

* 🧊 fix: wire OBO machinery for servers with requiresOAuth: false

The discovery and user-connection paths gated OAuth wiring (flow
manager, token methods, oboTokenResolver, oboTrustChecker) behind
isOAuthServer(), which only considers requiresOAuth/oauth fields.
A DB-stored OBO server with requiresOAuth: false therefore landed in
the non-OAuth branch, never received an oboTokenResolver, and the
factory's usesObo getter evaluated to false — sending a bare request
that the upstream rejected with invalid_token.

Add requiresOAuthMachinery() (OAuth OR OBO) and use it at those two
gates. isOAuthServer remains for the OAuth-handshake-only check
(shouldInitiateOAuthBeforeConnect), where OBO must not initiate a
handshake. Plumb the OBO resolver/trust-checker through
ToolDiscoveryOptions so reinitMCPServer can pass them on the
discovery path.

* 🧊 fix: lock all OBO-target fields (URL, proxy, headers, auth) without CONFIGURE_OBO

The CONFIGURE_OBO permission was meant to gate control of the endpoint
that receives OBO-minted per-user delegated tokens and the scopes that
are requested. The previous frontend lock + backend gate only covered
obo.scopes and the auth section, leaving url/proxy/headers/etc. editable
by anyone with UPDATE — meaning a non-permission user could still
redirect an existing OBO server's token flow to an attacker endpoint.

Switch to an allowlist policy: when editing an OBO server without
CONFIGURE_OBO, only title/description/iconPath are mutable. Backend
rejects any other field change with 403; frontend disables the
non-allowlist sections (URL, transport, auth, trust) via fieldset.
The comparison surface (MCP_USER_INPUT_FIELDS) is derived from
MCPServerUserInputSchema's union members so it stays in sync with the
schema. New schema fields land in the locked set by default — adding to
the allowlist is the only way to unlock them, which preserves the
security-review boundary.

* 🧊 fix: skip unauthenticated MCP inspection for OBO-only servers

MCPServerInspector.inspectServer() ran an unauthenticated temp connection
unless the config had requiresOAuth or customUserVars set. For OBO-only
servers without standard MCP OAuth advertisement, this caused
MCPConnectionFactory.create to attempt the connection without a user or
oboTokenResolver — failing on servers that reject the MCP initialize
handshake without a valid bearer token, which surfaced as
MCP_INSPECTION_FAILED on create/update.

Add `obo` to the skip list alongside requiresOAuth and customUserVars,
matching the existing pattern for user-scoped auth modes.

* Addressed linting error: watchedTitle is declared but never referenced (the auto-fill logic at line 156 uses getValues('title') instead). Deleted constant.
2026-06-01 22:36:18 -04:00
Ravi Kumar L
a86e504a57
📡 feat: Add Authenticated Proxy Mode for Browser RUM Telemetry (#13464) 2026-06-01 21:11:35 -04:00
Danny Avila
2ab432bd0a
💭 fix: Preserve Custom Endpoint Reasoning Params (#13447)
* fix: Preserve custom endpoint reasoning params

* fix: Address custom reasoning review cases

* fix: Format configured reasoning defaults

* fix: Honor dropped reasoning params

* fix: Configure custom reasoning response key
2026-06-01 18:20:20 -04:00
Marco Beretta
730878bc5a
🔐 feat: Use SecretInput for Sensitive Fields (#12955)
* feat: use SecretInput for sensitive fields

* fix: align auth SecretInput styles

* chore: remove unused password i18n keys

* fix: align SecretInput controls

* fix: use SecretInput for dynamic credentials

* fix: reveal SecretInput controls on hover

* fix: align SecretInput eye icon and modernize controls

The wrapper was a flex container, so passing 'mb-2' on the input made it
contribute its margin to the wrapper's cross-axis size — the controls overlay
spanned the inflated height and centered the toggle 4px below the input's
true center. Switching the wrapper to a plain relative block collapses height
back to the input.

Also tightens the toggle/copy buttons (size-7 rounded-md with hover:bg-surface-hover)
and adds a focus ring on the input. Auth pages still override className/buttonClassName
so login/register styling is unchanged.

* fix: remove focus ring from SecretInput

* fix: keep green focus border on auth secret inputs

SecretInput's modernized default uses focus-visible:border-border-heavy and
hover:border-border-medium, which Tailwind emits after the auth pages' focus:
rules and overrides them. Auth pages now also declare focus-visible:border-green-500
and hover:border-border-light so cn()/twMerge resolves them as the winners
when classes are concatenated.

* feat: add optional sensitive flag to MCP customUserVars

Dynamic MCP credential fields all rendered as masked SecretInputs, which
also hid non-secret setup values like usernames, project keys, and URLs.

Add an optional `sensitive` flag to customUserVars and the plugin auth
config. It defaults to masked when omitted, so existing configs keep the
safe-by-default behavior; set `sensitive: false` to render a field as
plain text. The flag is display-only — values remain encrypted at rest.
2026-06-01 18:14:12 -04:00
Danny Avila
fb282a2afa
🐳 chore: Upgrade Docker Builds To Node 24 (#13448)
* chore: upgrade docker builds to node 24

* test: avoid array at in telemetry spec
2026-06-01 10:03:18 -04:00
Danny Avila
566e20b613
v0.8.6 (#13302)
Some checks failed
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Publish `librechat-data-provider` to NPM / pack (push) Waiting to run
Publish `librechat-data-provider` to NPM / publish-npm (push) Blocked by required conditions
Publish `@librechat/data-schemas` to NPM / pack (push) Waiting to run
Publish `@librechat/data-schemas` to NPM / publish-npm (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Publish `@librechat/client` to NPM / pack (push) Has been cancelled
Publish `@librechat/client` to NPM / publish-npm (push) Has been cancelled
2026-05-31 17:36:47 -04:00
Danny Avila
5bfef51ed2
🏟️ fix: Restrict MCP OAuth Audience in User-Managed Configs (#13418)
Some checks are pending
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
2026-05-30 14:39:54 -04:00
ChrisJr404
6db059b8a9
🔒 fix: Strip post-login fields from unauthenticated /api/config response (#13102)
* 🔒 fix: Strip post-login fields from unauthenticated /api/config response

Follow-up to #12490 reported in #12688.

The unauthenticated /api/config response still included fields that are
only consumed after login (helpAndFaqURL, sharedLinksEnabled,
publicSharedLinksEnabled, showBirthdayIcon, analyticsGtmId,
openidReuseTokens, allowAccountDeletion, customFooter, cloudFront).
None of these are read by the auth pages (Login, Registration,
RequestPasswordReset, ResetPassword, VerifyEmail, TwoFactorScreen,
AuthLayout, Footer, SocialLoginRender).

Split buildSharedPayload into two helpers:

- buildPreLoginPayload returns only the fields the unauthenticated auth
  pages need (appTitle, server domain, social-login flags, OpenID/SAML
  labels and image URLs, registration/email/password-reset flags,
  minPasswordLength, ldap).
- buildPostLoginPayload returns the post-login informational fields and
  is merged into the response only when req.user is present.

Also move buildCloudFrontStartupConfig into the authenticated branch:
useAppStartup is the only consumer and it runs after login.

Tests updated: existing CloudFront and allowAccountDeletion assertions
move to the authenticated context, and two new assertions cover the
stripped fields (one for the post-login informational fields, one for
cloudFront) in the unauthenticated context.

Signed-off-by: ChrisJr404 <chris@hacknow.com>

* fix: Request share-context startup config

* fix: Pass share startup config into footer

---------

Signed-off-by: ChrisJr404 <chris@hacknow.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-30 09:51:21 -07:00
Danny Avila
444d923e29
✂️ chore: Strip Session JWT Forwarding from Browser RUM (#13414)
* fix: disable RUM user JWT auth

* fix: remove stale RUM bootstrap import
2026-05-30 10:34:44 -04:00
Freudator86
c6a6f2e3ae
🪪 feat: MCP OAuth - Support audience parameter for Auth0/Cognito-style providers (#13402)
* feat(mcp/oauth): support audience parameter for Auth0/Cognito-style providers

LibreChat already follows RFC 9728 (Protected Resource Metadata discovery)
and RFC 8707 (resource indicators on /authorize). However, authorization
servers that pre-date RFC 8707 — most prominently Auth0 — issue
API-scoped access tokens only when an Auth0-specific 'audience' parameter
is supplied on /authorize and /token. Without it, refresh_token responses
strip the API audience and the next MCP call 401s.

This change adds an optional 'audience' field to OAuthOptionsSchema and
forwards it on:
  * pre-configured authorize URL build
  * discovered (DCR + RFC 9728) authorize URL build
  * refresh_token grant body

'resource' (RFC 8707) is left untouched and remains the
standards-conformant route; 'audience' covers providers that ignore
'resource'. The two are independent — providers may accept either, both,
or neither, so we forward whichever the operator configures.

Schema tests added; no behavioral change for existing configs (field is
optional with no default).

Refs: MCP Authorization Spec 2025-06-18, RFC 9728, RFC 8707.

* ci: build audience-fix branch image to ghcr.io/freudator86/librechat:audience-fix

* Revert "ci: build audience-fix branch image to ghcr.io/freudator86/librechat:audience-fix"

This reverts commit 7b3dfa6cd7.

* tests: assert audience param in authorize URL + refresh body; tighten schema (.min(1)); refine comment to reflect actual code paths

Adresses PR review:
- audience: z.string().min(1).optional() rejects empty strings
- schema comment now precisely lists the two code paths (authorize + refresh_token grant); explicitly notes the authorization_code exchange intentionally does not receive audience because Auth0 binds it from the initial /authorize request
- new MCPOAuthAudience.test.ts: 4 cases — authorize URL with/without audience, refresh body with/without audience — using a local recording HTTP server (no shared helper changes)
- new schema test: empty-string audience is rejected

* style: inline two logger.debug calls (prettier)

* style: inline third audience-debug log (prettier)

* feat(mcp/oauth): add forward_audience_on_refresh opt-out for strict token endpoints (Cognito)

Addresses Codex review P2 'Avoid sending audience on refresh grants':
the previous behavior forwarded audience on every refresh_token grant,
which is correct for Auth0 (strips the audience claim otherwise) but is
non-standard for Cognito and other strict OAuth 2.0 token endpoints that
document refresh as grant_type + client_id + refresh_token only.

New optional boolean 'forward_audience_on_refresh' (default: true)
preserves the existing Auth0-friendly default while letting operators
of strict tenants opt out cleanly. Schema + handler tests cover both
cases.

No behavioral change for existing configs.

* style: format MCP OAuth refresh audience log

---------

Co-authored-by: Tim Freudenthal <tim@allesknut.de>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-30 06:59:39 -07:00
Ravi Kumar L
71a7c9ce7b
📡 feat: Add Configurable HyperDX Browser Real User Monitoring (#13287) 2026-05-29 11:04:26 -07:00
Danny Avila
06a6c42435
feat: Model-Aware Max Output Tokens for Google/Gemini (#13390)
* 📤 feat: Model-Aware Max Output Tokens for Google/Gemini

Resolves #13384.

Current Gemini text models (2.5 and 3+, including Gemini 3.5 Flash)
support 64K output tokens, but LibreChat defaulted every Google model
to the legacy 8K value — most visibly in the Agents model-parameter
panel.

- Add model-aware `reset`/`set` to `googleSettings.maxOutputTokens`,
  mirroring the Anthropic pattern: Gemini 2.5/3+ -> 65536, legacy
  (2.0 and earlier) and Gemma -> 8192.
- Resolve the default server-side in `getGoogleConfig` and in the
  Agents, preset, and standard Google settings panels via a shared
  `applyModelAwareDefaults` helper.
- Make `compactGoogleSchema` and `generateGoogleSchema` model-aware so
  explicit user values are preserved and not overwritten.

* 🛡️ fix: Cap Google max output at Vertex-safe limits

Addresses Codex review (P1) on #13390. Vertex AI caps current Gemini
text models at 65,535 output tokens (vs 65,536 on AI Studio) and image
models at 32,768, so an unconditional 65,536 default could make
otherwise-default Vertex requests fail validation.

- Lower the modern text default/ceiling to 65535 (valid on both Vertex
  and AI Studio).
- Resolve Gemini image models (e.g. gemini-2.5-flash-image) to 32768.
- Add reset/set + getGoogleConfig tests for image models and the Vertex
  default path.

* 🧮 fix: Respect configured Google defaults and legacy image caps

Addresses Codex review round 2 on #13390 (one P2, two P3).

- P2 (llm.ts): apply the model-aware maxOutputTokens default as the final
  fallback instead of pre-filling it, so an explicit value, `defaultParams`,
  and `addParams` all take precedence and `dropParams` is honored. Empty-string
  values stay stripped (preserves prior Gemini empty-payload handling).
- P3 (panels): pass the resolved params endpoint (`overriddenEndpointKey`) to
  `applyModelAwareDefaults`, so custom endpoints with
  `defaultParamsEndpoint: 'google'` also surface the model-aware default.
- P3 (schemas): nest the image-model check inside the 2.5+/3+ version check, so
  legacy image IDs (e.g. gemini-2.0-flash-preview-image-generation) keep the 8K
  cap instead of being treated as 32K models.
- Add tests for defaultParams precedence, dropParams, legacy image models, and
  the Vertex default path.

* 🧭 fix: Base Google defaults on final model and configured overrides

Addresses Codex review round 3 on #13390 (two P2).

- llm.ts: resolve the model-aware maxOutputTokens default from the final
  `llmConfig.model` (after defaultParams/addParams) instead of the model
  captured from modelOptions, so a model forced via addParams/paramDefinitions
  on a Google-compatible custom endpoint gets its correct limit.
- Panels: apply model-aware defaults to the built-in settings first, then
  overlay `customParams.paramDefinitions`, so an admin-configured
  maxOutputTokens default wins in the UI (consistent with backend precedence).
- Add parameterSettings.spec for applyModelAwareDefaults (incl. override
  precedence) and a getGoogleConfig final-model test.
2026-05-29 08:09:32 -07:00
Danny Avila
786dae41bd
🪨 fix: Preserve Bedrock Guardrail Config (#13381) 2026-05-28 21:38:53 -07:00
Danny Avila
62dff69300
🧠 feat: Add Claude Opus 4.8 Support (#13380)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* feat: add Claude Opus 4.8 support

* fix: omit sampling params for Claude Opus 4.8

* fix: flatten Bedrock beta header merge

* fix: strip Bedrock sampling params for Opus 4.8
2026-05-28 13:50:39 -07:00
Danny Avila
5071b2b617
🧷 fix: Pin MCP OAuth Client Secrets (#13276)
* fix: Pin MCP OAuth client secrets

* fix: Require MCP OAuth client id for secrets
2026-05-23 16:51:45 -04:00