* 🛰️ feat: Add GPT-5.5 + Frontier OpenAI Models, Drop Deprecated Defaults
* 🛰️ fix: Address Codex Review on OpenAI Model Refresh
- Replace nonexistent gpt-5.5-chat-latest with the actual chat-latest
alias; register its context window, output cap, pricing, and cache
rates, and pin explicit rates for legacy gpt-5.x-chat-latest aliases
so the new chat-latest key cannot out-match their cheaper pricing
- Add long-context premium tiers (>272K input) for gpt-5.5 and gpt-5.4
- Disable streaming for pro reasoning models (o1-pro, gpt-5.x-pro),
which OpenAI does not support, with spec coverage
* 🛰️ fix: Address Codex Round-2 Review and CI Spec Failure
- Allow chat-latest through the official OpenAI fetched-model filter
- Export isProReasoningModel and drop unsupported sampling parameters
for versioned pro models (gpt-5.4-pro, gpt-5.5-pro), which the
versioned-model exemption previously let through
- Honor the pro-model streaming disable in both agent chat-completions
routes, which decide SSE from model_parameters before llmConfig exists
- Update models.spec default-list assertions for the refreshed defaults
and cover chat-latest filter retention
* 🛰️ fix: Address Codex Round-3 Review
- Convert max_tokens for chat-latest, which the gpt-[5-9] guard missed
- Add snake_case sampling params (top_p, logit_bias, penalties) to the
reasoning-model exclusion list so addParams-sourced values are removed
- Add createOpenAIAggregatorHandlers and wire them into the agent
chat-completions service's non-streaming branch, which previously ran
with no handlers and always returned an empty aggregated response
* 🛰️ ci: Fix Import Order Drift and Controller Spec Mock
- Sort type import first in service.spec.ts per import-order convention
- Register isProReasoningModel in the openai controller spec's
@librechat/api mock factory, whose enumerated exports left the new
helper undefined and broke the non-streaming flow under test
* 🛰️ chore: Trim Scope to Model Catalog Changes
Revert the OpenAI endpoint and agent handler changes (pro-model
streaming, sampling exclusions, non-streaming aggregation) — that
surface is moving out of LibreChat into the agents SDK and belongs
in its own change. Keep the model list, token windows, pricing, and
the fetched-model filter for chat-latest.
* 🛰️ fix: Correct GPT-5.4 Context Windows and Pro Long-Context Pricing
- Set gpt-5.4 and gpt-5.4-pro context to the documented 1,050,000
window — 272K is the long-context pricing breakpoint, not the cap,
and using it truncated prompts before they could reach that tier
- Add gpt-5.4-pro long-context premium rates ($60/$270 above 272K)
per its model page; gpt-5.5-pro documents no long-context tier
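The distinction between the pricing breakpoint and the context cap can be sketched as below. This is a minimal illustration, not LibreChat's pricing code: `selectRates`, `TierRates`, and `LONG_CONTEXT_BREAKPOINT` are hypothetical names, and reading "272K" as 272,000 input tokens is an assumption. Base rates are passed in by the caller because this commit only quotes the premium figures.

```typescript
// Hypothetical sketch of long-context tier selection; names are illustrative.
type TierRates = { prompt: number; completion: number };

// Assumption: "272K" means 272,000 input tokens.
const LONG_CONTEXT_BREAKPOINT = 272_000;

// Pick the premium tier only when the prompt exceeds the breakpoint and a
// premium tier is defined for the model; otherwise bill at the base rates.
function selectRates(inputTokens: number, base: TierRates, premium?: TierRates): TierRates {
  if (premium !== undefined && inputTokens > LONG_CONTEXT_BREAKPOINT) {
    return premium;
  }
  return base;
}
```

Note that the breakpoint only switches the rate. A prompt can grow past it, up to the model's context window, which is why capping the window at 272K kept prompts from ever reaching the premium tier.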
* 🛰️ fix: Add gpt-5.4-nano and gpt-5.5-pro Long-Context Pricing
- Register gpt-5.4-nano ($0.20/$1.25, cached $0.02, 400K context) in
the model list, pricing, cache, and token maps — the longest-match
fallback billed it at gpt-5.4's $2.50/$15
- Add gpt-5.5-pro long-context premium rates ($60/$270 above 272K);
the pricing table lists the tier even though the model page omits it
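The longest-match fallback described above can be illustrated roughly as follows. This is a sketch under assumptions: `findRates` and the two tables are hypothetical, and matching by substring is a guess at the lookup's behavior. Only the dollar figures come from the commit message.

```typescript
// Hypothetical sketch of a longest-match pricing lookup; names are illustrative.
type Rates = { prompt: number; completion: number };

// USD per 1M tokens, using the figures quoted in the commit message.
const rates: Record<string, Rates> = {
  'gpt-5.4': { prompt: 2.5, completion: 15 },
  'gpt-5.4-nano': { prompt: 0.2, completion: 1.25 },
};

// The table as it stood before gpt-5.4-nano was registered.
const ratesWithoutNano: Record<string, Rates> = {
  'gpt-5.4': { prompt: 2.5, completion: 15 },
};

// Return the rates of the longest registered key contained in the model name.
function findRates(model: string, table: Record<string, Rates>): Rates | undefined {
  let bestKey: string | undefined;
  for (const key of Object.keys(table)) {
    if (model.includes(key) && (bestKey === undefined || key.length > bestKey.length)) {
      bestKey = key;
    }
  }
  return bestKey === undefined ? undefined : table[bestKey];
}
```

With the explicit entry present, `findRates('gpt-5.4-nano', rates)` resolves to the nano rates. Against `ratesWithoutNano`, the same call matches only the `gpt-5.4` key and bills at $2.50/$15.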
* fix: Resolve MCP Runtime User Placeholders
* fix: Harden MCP Runtime Placeholder Connections
* fix: Update MCP Source Tag Test Expectations
* fix: Complete MCP Runtime Placeholder Reinit
* fix: Harden MCP Request Scoped Runtime Configs
* fix: Align MCP OAuth Tests With Domain Policy
* fix: Harden MCP Runtime Resolution Edges
* fix: Avoid MCP Runtime Reprocessing Pitfalls
* fix: Reuse MCP Request Scoped Tool Discovery
* fix: Validate MCP Body Runtime Fields
* 🛡️ refactor: Harden runtime placeholder edges from review
- Warn at inspection when a trusted server URL contains runtime
placeholders but no domain allowlist restricts the resolved target
- Document the three resolution sites that must stay in sync so the
validated config always matches the connected one
- Note the per-call connect cost of ephemeral GRAPH/BODY connections
- Drop the no-op removeUserConnection in callTool's ephemeral cleanup;
ephemeral connections are never stored, and removing the entry could
orphan a still-connected cached connection after a config change
* 🪪 fix: Cover oauth_headers, Graph URL gating, and request-scoped reconnects
Address Codex review:
- Resolve runtime placeholders in oauth_headers (processMCPEnv + Graph
pre-pass) and include the field in placeholder detection, so OAuth
discovery/token requests no longer send literals; consolidate the
detection field lists into one helper
- Defer the early domain gate when the URL still carries a Graph
placeholder (resolved async later); the authoritative
assertResolvedRuntimeConfigAllowed check still enforces policy
- Bypass the 10s reconnect throttle for request-scoped servers, which
re-fetch tool definitions on every message by design
* ⏳ fix: Extend and decouple MCP OAuth flow timeouts
The OAuth auth button disappeared after 2 minutes (the internal OAuth
handling timeout) while the flow state lived for 3 minutes, leaving users
who didn't click immediately stuck in an unrecoverable re-auth loop. The
handling timeouts also reused the connection/init timeout, so a short
initTimeout would shrink the OAuth window further.
- Add MCP_OAUTH_HANDLING_TIMEOUT (10m) and MCP_OAUTH_FLOW_TTL (15m) to mcpConfig
- Decouple the reactive/proactive OAuth waits from initTimeout/connectionTimeout
- Use OAUTH_FLOW_TTL for the FlowStateManager TTL and the UI status window
- Ensure the flow TTL outlives the handling timeout, fixing the
"Flow state not found" race
- Remove dead FLOW_TTL constant and document new env vars
Fixes #13615
* ⏳ fix: Coordinate OAuth pending window with handling timeout
Address Codex review: the extended OAuth wait was still capped by other
timeouts that were not updated.
- Align PENDING_STALE_MS (button validity + pending-flow reuse window)
with MCP_OAUTH_HANDLING_TIMEOUT so a flow stays reusable for the full
wait instead of 2 minutes (Finding 3)
- Clamp MCP_OAUTH_FLOW_TTL to never fall below the handling timeout so a
callback near the deadline still finds its flow state (Finding 2)
- Floor attemptToConnect's timeout to the handling window for OAuth
servers so the reactive in-connect OAuth wait is not killed by the
30s connection timeout (Finding 1)
- Update flow staleness tests to reference the threshold symbolically
* ⏳ fix: Align OAuth window across status, action flows, and client polling
Address Codex round 2: extending the server wait exposed three more
windows that were still capped or now over-extended.
- checkOAuthFlowStatus reports a PENDING flow as active only within the
usable PENDING_STALE_MS window, not the longer Keyv retention TTL, so
the connect button reappears instead of a stuck 'connecting' state
- Give Action (custom tool) OAuth its own FlowStateManager on the prior
3-minute TTL so the longer MCP OAuth TTL can't leave an action tool
call waiting up to 15 minutes
- Extend the MCP server-card client polling to the 10-minute handling
window so a user who completes OAuth after 3 minutes is still picked up
* 🧪 test: Make stale-flow CSRF test track PENDING_STALE_MS
The CSRF-fallback stale-flow test hardcoded a 3-minute age, which is now
within the 10-minute PENDING_STALE_MS window and was wrongly treated as
active. Derive the age from PENDING_STALE_MS so it tracks the constant.
* ⏳ fix: Add grace buffers and surface OAuth timeout to the client
Address Codex round 3 (near-deadline edges):
- Clamp MCP_OAUTH_FLOW_TTL to handling timeout + 60s grace (not equality),
so flow state outlives the wait instead of expiring at the same instant
- Extend attemptToConnect's OAuth floor by a 60s grace so a user who
authorizes near the deadline still gets the post-OAuth reconnect
- Surface OAUTH_HANDLING_TIMEOUT on the connection-status response and
have the client poll for the configured window instead of a hardcoded
10 minutes, so a tuned server deadline isn't capped on the client
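The clamping rules from this round can be summarized in a short sketch. The function and constant names here are hypothetical and all values are in milliseconds. Only the 60s grace and the "never fall below the handling timeout" rule come from the commit text.

```typescript
// Hypothetical sketch of the OAuth timeout clamps; names are illustrative.
const OAUTH_GRACE_MS = 60_000; // the 60s grace described in the commit message

// Flow state must outlive the handling wait, so clamp the TTL upward.
function resolveFlowTtl(configuredTtlMs: number, handlingTimeoutMs: number): number {
  return Math.max(configuredTtlMs, handlingTimeoutMs + OAUTH_GRACE_MS);
}

// For OAuth servers, floor the connect timeout to the handling window plus
// grace so the in-connect OAuth wait is not cut short. Other servers keep
// their configured timeout.
function resolveConnectTimeout(
  connectTimeoutMs: number,
  handlingTimeoutMs: number,
  isOAuthServer: boolean,
): number {
  if (!isOAuthServer) {
    return connectTimeoutMs;
  }
  return Math.max(connectTimeoutMs, handlingTimeoutMs + OAUTH_GRACE_MS);
}
```

With the documented defaults (10m handling, 15m TTL) the clamp is a no-op. It only takes effect when an operator sets the TTL below the handling timeout plus grace.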
* ⏳ fix: Refresh client OAuth timeout from the first status refetch
If the connection-status cache is empty when polling starts, the client
captured the 10-minute fallback and never picked up a tuned oauthTimeout.
Re-read it after each refetch so a longer configured deadline is honored
even on a cold cache.
* 📝 refactor: Type oauthTimeout on MCPConnectionStatusResponse
Declare the oauthTimeout field on the shared response type in
data-provider instead of an ad-hoc inline cast in the client hook, and
replace the pre-existing 'as any' on the status query read with the
typed getQueryData. Type-level only; no runtime change.
When a skill is primed fresh this turn (manual $-popover or always-apply) AND
also appears in history as a `skill` tool_call, its SKILL.md body was injected
twice — once by injectSkillPrimes and once reconstructed by formatAgentMessages.
- add `collectFreshSkillPrimeNames` helper (packages/api) — union of manual +
always-apply prime names
- client.js: pass the set as `skipSkillBodyNames` to formatAgentMessages for
both the initialMessages and memoryMessages paths so the body reconstructs
once. Names not primed this turn still reconstruct (sticky manual re-prime).
Requires `@librechat/agents` with `skipSkillBodyNames` support; the published
dist silently ignores the unknown option until upgraded.
* 📖 feat: Add Claude Fable 5 Support
Claude Fable 5 (`claude-fable-5`) is Anthropic's most capable widely
released model (GA 2026-06-09). Its naming drops the opus/sonnet/haiku
tier, so LibreChat's name-parsing helpers miss it; this teaches them the
Mythos-class family (Fable / Mythos) and registers the model.
- Add `parseMythosClassVersion` and route Fable/Mythos through
`supportsAdaptiveThinking`, `omitsThinkingByDefault`,
`omitsSamplingParameters`, and `supportsContext1m`
- Extend the Bedrock detection regexes (beta headers + adaptive-thinking
branch) and `checkPromptCacheSupport` to match `claude-(fable|mythos)`
- Return 128K max output for Fable/Mythos in `maxOutputTokens.reset`/`set`
- Register `claude-fable-5` in shared Anthropic + Bedrock model lists,
1M context / 128K output token maps, and $10/$50 pricing with 12.5/1
cache rates (`claude-mythos-5` added to token + pricing maps only,
since it is limited-availability)
- Update `.env.example` and the Vertex `librechat.example.yaml` examples
- Add parallel tests across tokens, Anthropic llm config, the Bedrock
parser, and tx pricing
* 🧹 refactor: Centralize Mythos-class detection; address review feedback
- Add `isMythosClassModel` + `MYTHOS_CLASS_FAMILIES` in schemas.ts as the
single source of truth for the Fable/Mythos family; route every gate
(adaptive thinking, omit-thinking, omit-sampling, 1M context, prompt cache,
128K max-output reset/set) through it. A future sibling class is now a
one-line edit.
- [Codex P2] Exclude Mythos-class from getBedrockAnthropicBetaHeaders: Fable/
Mythos ship 128K output + fine-grained tool streaming by default, and the
legacy output-128k-2025-02-19 beta is 3.7-Sonnet-only on Bedrock and risks
request rejection. They still get adaptive thinking + effort.
- [Copilot] Add Mythos 5 test parity (name variations, cache rates, pinned
$10/$50) in tx.spec; add Mythos context/max-output/name-match in tokens.spec;
fix the stale claude-3-7-sonnet-only comment in bedrock.ts.
- Add isMythosClassModel unit tests covering all declared families.
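A single-source-of-truth helper of this kind might look like the sketch below. The names mirror those in the commit message, but the body is an assumption: the real implementation in schemas.ts may match differently. The regex shape follows the `claude-(fable|mythos)` pattern mentioned for the Bedrock detection.

```typescript
// Hypothetical sketch; the actual helper in schemas.ts may differ.
const MYTHOS_CLASS_FAMILIES = ['fable', 'mythos'] as const;

// Built from the family list, so adding a sibling family is a one-line edit.
const mythosClassPattern = new RegExp(`claude-(${MYTHOS_CLASS_FAMILIES.join('|')})`, 'i');

// True for any model id containing a Mythos-class family name, including
// provider-prefixed ids such as Bedrock's `anthropic.` prefix.
function isMythosClassModel(model: string): boolean {
  return mythosClassPattern.test(model);
}
```

Each feature gate can then call `isMythosClassModel(model)` rather than keeping its own regex.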
* 📝 docs: Clarify Mythos-class Bedrock requirements; correct beta-omit rationale
Verified live against Bedrock (acct 951834775723, us-west-2):
- anthropic.claude-fable-5 IS a real Bedrock catalog model, INFERENCE_PROFILE-only
exactly like the existing anthropic.claude-opus-4-7/4-8 and claude-sonnet-4-6
default entries (refutes the "invalid model id" review claim).
- Mythos-class also requires opting into Anthropic data sharing (Bedrock Data
Retention API) before invocation.
Changes:
- .env.example: note that Mythos-class (Fable/Mythos) is inference-profile-only on
Bedrock and needs the data-sharing opt-in.
- bedrock.ts: reword the beta-omit comment to the verified rationale — output-128k /
fine-grained-tool-streaming are built-in/no-op for the 4.7+ generation, so omitting
them is lossless (dropped the unverified "Bedrock may reject" wording).
* 🔄 refactor: Reorganize imports in schemas.ts and tx.spec.ts
- Moved `TFeedback` and `Tools` imports to the top of `schemas.ts` for better readability.
- Adjusted import order in `tx.spec.ts` to maintain consistency and improve clarity.
* 🧰 fix: Flatten union schemas for Gemini/Vertex MCP tool compatibility
`@langchain/google-common`'s `zod_to_gemini_parameters` throws "Gemini cannot
handle union types" on any genuine `anyOf`/`oneOf` (e.g. discriminated unions),
so MCP tools shipping union-typed schemas crash on the Google endpoint while
working fine on OpenAI/Claude.
Add `flattenJsonSchemaUnions` (packages/api) to collapse unions to their first
non-null member and multi-entry `type` arrays to a single nullable type, and
apply it in `createToolInstance`'s existing `isGoogle` branch so only the
Google/Vertex path is affected. Lossy by design, mirroring the existing
empty-object fallback.
Closes #13612
* 🩹 fix: Address Codex review — preserve fields, strip null enums, cover definitions path
- Preserve parent-level `properties`/`required` when collapsing a union: merge the
chosen branch into the parent instead of overwriting, so args declared outside the
union (e.g. always-required fields) still reach Gemini.
- Drop the `null` member from `enum` when a union/type-array makes a field nullable,
keeping Gemini's required homogeneous-enum invariant.
- Propagate the Google-flattened schema to the definitions/deferred-tool path:
thread `provider` into `loadToolDefinitions` and flatten there, and store the
flattened schema on `mcpJsonSchema` so `extractMCPToolDefinition` no longer emits
raw unions on Google/Vertex.
* 🎨 style: Sort imports in tools/definitions per import-order check
* ♊ feat: Broaden union flatten into a full Gemini schema sanitizer
The union flatten alone wasn't enough — real GitHub MCP tools on Gemini also 400
with `Invalid value ... (TYPE_STRING), true`, because Gemini's function-calling
Schema (https://ai.google.dev/api/caching#Schema) accepts only a restricted JSON
Schema subset, and `enum` is `Type.STRING`-only.
Rename `flattenJsonSchemaUnions` → `sanitizeGeminiSchema` and broaden it (one pass,
Gemini-gated) to cover the documented subset:
- Keep only string `enum` values; drop the keyword for non-string types (fixes the
reported boolean-enum 400, incl. boolean `const` normalized to `enum: [true]`).
- `const` → single-value string enum, or drop if non-string.
- Merge `allOf` intersections; fold `exclusiveMinimum`/`exclusiveMaximum` into
`minimum`/`maximum`.
- Strip unsupported keywords: `additionalProperties`, `default`, `$schema`, `$id`.
- (Existing) collapse `anyOf`/`oneOf`, multi-entry `type` arrays, nullable.
Grounded in Google's Schema docs rather than reverse-engineered from 400s. Verified
end-to-end against the real `@langchain/google-common` converter. Complements
danny-avila/agents#232 (langchain bump), which defers schema flattening to LibreChat.
* 🩹 fix: Gate enum retention on the effective (collapsed) type
Codex review: a mixed-type enum like `type: ['integer','string'], enum: [1,'auto']`
collapsed the type to `integer` but still kept the string value `'auto'`, yielding
`{type:'integer', enum:['auto']}` — a non-string type with an enum, which Gemini
rejects. Keep `enum` only when the effective collapsed type is string (or unset),
and stamp `type: 'string'` on a surviving typeless enum (e.g. a string `const`
discriminator) so it satisfies Gemini's Type.STRING enum requirement.
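The enum gating can be sketched in isolation as follows. This is a simplified illustration, not the real `sanitizeGeminiSchema`: `collapseType` and `sanitizeEnum` are hypothetical helpers, and nullable handling, unions, and nested properties are deliberately left out.

```typescript
// Hypothetical sketch of enum gating on the collapsed type; names are illustrative.
type JsonSchema = { type?: string | string[]; enum?: unknown[]; [key: string]: unknown };

// Collapse a multi-entry `type` array to its first non-null member.
function collapseType(type: string | string[] | undefined): string | undefined {
  if (!Array.isArray(type)) {
    return type;
  }
  return type.find((t) => t !== 'null');
}

// Keep `enum` only when the effective type is string (or unset), and stamp
// `type: 'string'` on any enum that survives.
function sanitizeEnum(schema: JsonSchema): JsonSchema {
  const { enum: values, type, ...rest } = schema;
  const effectiveType = collapseType(type);
  const out: JsonSchema = { ...rest };
  if (effectiveType !== undefined) {
    out.type = effectiveType;
  }
  if (!Array.isArray(values)) {
    return out;
  }
  if (effectiveType !== undefined && effectiveType !== 'string') {
    return out; // non-string effective type: drop the enum entirely
  }
  const strings = values.filter((v): v is string => typeof v === 'string');
  if (strings.length > 0) {
    out.enum = strings;
    out.type = 'string';
  }
  return out;
}
```

For the mixed-type case from the review, `sanitizeEnum({ type: ['integer', 'string'], enum: [1, 'auto'] })` yields `{ type: 'integer' }` with no enum, and a typeless `{ enum: ['a'] }` gains `type: 'string'`.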
Keep the Google Gen AI SDK aligned with the latest 2.x release. Updates the
declared range in both backend manifests (api, packages/api) and regenerates
the lockfile to resolve @google/genai to 2.8.0.
No application code changes: the sole consumer
(api/app/clients/tools/structured/GeminiImageGen.js) uses the stable
`GoogleGenAI` constructor and `models.generateContent` API, and the upstream
changelog records no breaking changes to those between 2.0 and 2.8.
Closes #13551
* 👷 ci: Add API runtime smoke (boot the production image) to docker-smoke
The docker-smoke workflow only built the `client-package-build` stage and
never booted the runtime, so it couldn't catch the class of regression that
recently took production down: the api tsdown bundle externalizes runtime
deps that, after `npm ci --omit=dev`, were missing from the image
(`Cannot find module 'get-stream'`).
- Add an `api-runtime-smoke` job that builds the real production image
(final `api-build` stage, `npm ci --omit=dev`), then:
1. loads the @librechat/api bundle's full require graph in the pruned
image (deterministic, no DB) — fails on any missing/ESM-incompatible
runtime dependency.
2. boots the actual entrypoint and asserts no module-load crash (the
server loads its require graph before connecting to Mongo, so this
surfaces without a database).
- Expand triggers to include `packages/api/**`, `packages/data-schemas/**`,
and `api/package.json` (previously a packages/api change only triggered
this via a root lockfile change, and even then only built the client stage).
- Add gha build cache + concurrency cancellation to bound CI cost.
* 👷 ci: Address Codex review — boot smoke against real Mongo + crash detection
- Boot the production image against a real MongoDB container with the env
the server needs, so the *entire* require graph loads. `api/db/connect.js`
throws at module scope without `MONGO_URI` and is imported before
models/services/routes, so the previous no-env boot exercised almost none
of the legacy API graph. (Codex finding 2)
- Gate on `/health` returning 200 AND the container staying alive, failing on
any container exit. A non-module startup crash (ReferenceError, SyntaxError,
bad config) now fails the smoke instead of slipping past a missing-module
grep. (Codex finding 3)
- Expand trigger from `api/package.json` to `api/**`, since the image copies
the whole `api/` tree and runs `node server/index.js`. (Codex finding 1)
* 👷 ci: Address Codex round 2 — poll /readyz + cover all image inputs
- Poll /readyz instead of /health. /health returns 200 at app.listen, but
initializeMCPs() and checkMigrations() run *after* listen and process.exit(1)
on failure; /readyz only returns 200 once serverReady is set after those
complete. So post-listen startup crashes now fail the smoke too. (finding A)
- Expand triggers to every source tree copied into the production image:
client/**, config/**, skill/** (the final stage copies client/dist, config,
and skill). (finding B)
The tsdown migration (#13595) externalizes all third-party imports
(Rollup had inlined them), so several modules the api source imports must
be present at runtime. Six were not, causing production (`npm ci --omit=dev`)
to crash on boot with `Cannot find module 'get-stream'`, and then on the
next missing module once that one was supplied.
Fixed following the package's existing convention — packages/api declares
runtime libs as `peerDependencies`, and the `/api` app provides them as
real `dependencies` (how express/mongoose/sharp already resolve):
- `api/package.json` (the prod app, the provider): add the 3 that were
missing — `get-stream`, `jszip`, `mongodb`. (`dedent`/`lodash`/`nanoid`
were already provided by /api.)
- `packages/api/package.json`: add all 6 to `peerDependencies` (the
contract) and to `devDependencies` (workspace build/tests), matching
the existing `mammoth`/`pdfjs-dist`/`sanitize-html` dev+peer pattern.
`jszip`/`mongodb` move out of dev-only (were pruned in production).
Pinned to CJS-compatible majors (get-stream@6, nanoid@3). Verified the
built bundle has zero undeclared externals and the 3 newly-provided deps
are production (non-dev) in the lockfile, so they survive `--omit=dev`.
* ⚡ refactor: Migrate @librechat/client build from Rollup to tsdown
Mirrors the data-schemas migration. Replaces Rollup (rpt2 + postcss) with
tsdown (rolldown + oxc); the package build drops from tens of seconds to ~0.3s.
- Emit isolated-declaration .d.ts via oxc (dts.oxc) and enforce
isolatedDeclarations in tsconfig for editor DX (source made clean: explicit
export type annotations added across src, no `any`).
- Extract component CSS to dist/style.css so the CJS output stays valid
CommonJS (the prior postcss runtime-injection produced an ESM import in the
CJS bundle that breaks jest/require). Imported once in the client app entry;
Vite bundles it for the app.
- Repoint package.json to dual .mjs/.cjs + .d.mts/.d.cts and add ./style.css
and ./package.json exports.
- Update CI build-cache keys to hash tsdown.config.mjs; remove rollup.config.js.
* 🔧 chore: address Codex review on client tsdown migration
- Add tsdown.config.mjs to turbo.json build `inputs` so changes to the new
bundler config invalidate the Turbo cache (the shared inputs only listed the
rollup configs). Also covers the already-migrated data-schemas.
- Name the memoized default export (ControlComboboxMemo) instead of the
codefix-generated `_default_1`, for clearer stack traces / grepping.
Replace the Rollup + `rollup-plugin-typescript2` build with a split
pipeline: tsdown (rolldown) bundles the JS in ~0.2s, and plain `tsc`
emits the declarations to `dist/types` (~2s). Full cold build drops from
~9.2s to ~2.5s (~3.6x) with zero source changes.
Unlike data-schemas, the fast oxc/isolated-declarations dts path isn't
viable here: the package's 78 exported zod schemas produce 374
`isolatedDeclarations` errors (TS9013/TS9038) and a `z.ZodType<T>`
annotation would break the 76 downstream `.extend`/`.shape`/`.pick`
usages. Plain `tsc` keeps the rich zod types intact, and since dts was
never the bottleneck (rollup-plugin-typescript2 was), the win stands.
- dts stays unbundled in `dist/types/` — identical to the prior output,
so the existing deep `dist/types` imports and the exports `types`
paths are unchanged.
- ESM output renamed `index.es.js` -> `index.mjs` (via the exports map;
no consumer hardcodes the old path). cjs/types paths unchanged.
- `./react-query` now emits a real cjs build + types — the exports map
already promised them, but Rollup only ever built the esm file.
- Kept `rollup` + the plugins used by `server-rollup.config.js`
(the `rollup:api` server-bundle smoke test in backend-review.yml);
removed only the deps used solely by the deleted `rollup.config.js`.
- Repointed CI build-cache keys from `rollup.config.js` to
`tsdown.config.mjs`.
* ⚡️ refactor: Migrate @librechat/api build to tsdown
Replace Rollup with tsdown (rolldown + oxc isolated-declarations) for the
@librechat/api package build, mirroring the merged data-schemas migration.
- Add tsdown.config.mjs (cjs output, oxc dts, externalize all bare deps,
bundle first-party `~/` + relative imports)
- Annotate exports for isolatedDeclarations (codefix-driven). Collapse the
tokens.ts model->token maps to Record<string, Record<string, number>> and
switch validation.ts's runtime `files` field from z.any() to z.unknown()
so no explicit `any` is introduced
- Repoint package.json main/types/exports to tsdown's .cjs/.d.cts output
- Add src/telemetry.ts entry shim so the two index.ts entries don't collide
in oxc's flat dts output (stable dist/telemetry.{cjs,d.cts})
- Delete rollup.config.js
Build time ~36s -> ~0.5s. No runtime behavior change: 5712 unit tests pass,
both entries load via require(), legacy /api consumes them unchanged.
* 👷 ci: Hash packages/api/tsdown.config.mjs in build-api cache keys
The build-api cache keys hashed `packages/api/server-rollup.config.js`,
which never existed (api used `rollup.config.js`, now removed) — a copy-paste
artifact from the data-provider key that matched no file. Replace it with the
new `packages/api/tsdown.config.mjs` so edits to the build config (entry,
format, externals) bust the api build cache, matching the data-schemas key.
Enable `isolatedDeclarations` (and `declaration`) in
packages/data-schemas/tsconfig.json so any exported declaration missing
an explicit type annotation is flagged directly in editors and the CI
typecheck, instead of only surfacing during the tsdown/oxc dts emit at
build time.
The package is already fully annotated, so this is a zero-error,
enforcement-only change that keeps data-schemas eligible for the fast
oxc-based declaration emit going forward.
Bumps typescript 5.3.3 -> 5.9.3 across all workspaces. typescript-eslint must move 8.24.0 -> 8.60.1 too: 8.24's typescript peer was capped at <5.8.0; 8.60.1 widens it to <6.1.0.
Two errors surfaced by the newer compiler are fixed:
- api/src/rum/proxy.ts: TS 5.9 made `Buffer` generic (`Buffer<ArrayBufferLike>`), which no longer structurally matches `BodyInit`; cast the fetch body (Node's fetch accepts a Buffer at runtime).
- client usePresetIndexOptions.ts: drop a dead `|| {}` on an object spread (always truthy — flagged by the new TS2872 check).
All four package typecheck jobs + the client app typecheck pass under 5.9.3; builds (tsdown + rollup) and the rum proxy tests are unaffected.
* ⚡ perf: Migrate data-schemas Build to tsdown with isolatedDeclarations
Replace Rollup with tsdown (rolldown + oxc) for @librechat/data-schemas. With the source made isolatedDeclarations-clean, oxc emits .d.ts without tsc, dropping the package build from ~5.8s to ~0.8s (~7x).
- Annotate exported model/method factories for isolatedDeclarations (TypeScript's fixMissingTypeAnnotationOnExports codefix plus hand-authored interfaces); type the ~44 mongoose `any`s and add an explicit PromptMethods interface (previously its declaration was silently dropped by the Rollup build).
- Repoint package.json exports/main/module/types to tsdown output; drop rollup config.
- Config lives in tsdown.config.mjs (native ESM) so CI without a TS-config loader can build it; bundle `dotenv` so the package stays self-contained for its env-loading side effect.
- Fix a latent token `metadata` mismatch the accurate types surfaced: widen TokenCreate/UpdateData inputs to accept plain objects, flatten OAuthMetadata at the api boundary.
- Update mongoMeili/aclEntry specs to the precise model types; drop redundant terser minification from data-provider's library build.
All data-schemas tests pass; api builds clean against the new output.
* 🔧 chore: Hash tsdown.config.mjs in data-schemas CI build-cache keys
The data-schemas build switched from rollup to tsdown, but the build-data-schemas / build-api cache keys in backend-review, config-review, and playwright-mock still hashed the (now-deleted) rollup.config.js. Hash tsdown.config.mjs instead so a config-only change invalidates the cached dist/api builds. (Found by Codex review.)
* 🔧 chore: Replace deprecated tsdown `external` with `deps.neverBundle`
tsdown 0.22 deprecated the top-level `external` option in favor of `deps.neverBundle`. Migrate the data-schemas config and set `deps.onlyBundle: false` to silence the (intentional) dotenv bundling hint. Build output and externalization are unchanged — dotenv bundled, all peers external.
* fix: resolve env variables in MCP OAuth URL fields before validation
Apply the extractEnvVariable transform to authorization_url, token_url,
redirect_uri, and revocation_endpoint in OAuthOptionsBaseSchema. Without
this, ${ENV_VAR} syntax in these fields caused a Zod URL validation error
at startup before any env substitution could happen.
The same .transform().pipe() pattern is already used on all transport url
fields (SSE, WebSocket, StreamableHTTP) and ProxyUrlSchema.
Closes #13572
* fix: block env var expansion in user OAuth URL fields
Override redirect_uri and revocation_endpoint in UserOAuthOptionsSchema
with userOAuthEndpointUrlSchema, matching the existing overrides for
authorization_url and token_url. Without this, user-submitted configs
could inherit the extractEnvVariable transform added to the base schema
and resolve env vars like ${OPENAI_API_KEY} in those fields.
Add envVarPattern rejection to userOAuthEndpointUrlSchema so that
valid-URL-shaped payloads containing ${VAR} patterns are also blocked,
not just bare non-URL strings. Move envVarPattern declaration above the
schema to make it available at module evaluation time.
Add regression tests for all four OAuth URL fields on the user path,
using structurally valid URLs with embedded ${VAR} patterns to confirm
it is the env var guard — not URL shape — that rejects them.
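The guard on the user path can be approximated without zod as shown below. This is a sketch under assumptions: the regex is a plausible shape for `envVarPattern`, not the one in the codebase, and `validateUserOAuthUrl` is a hypothetical stand-in for `userOAuthEndpointUrlSchema`.

```typescript
// Hypothetical sketch; the real schema uses zod and its own envVarPattern.
const envVarPattern = /\$\{[^}]+\}/;

type UrlCheck = { ok: true; url: string } | { ok: false; reason: string };

// Reject env variable references first, then check URL shape, so a
// structurally valid URL with an embedded ${VAR} is still blocked.
function validateUserOAuthUrl(value: string): UrlCheck {
  if (envVarPattern.test(value)) {
    return { ok: false, reason: 'env variable references are not allowed' };
  }
  try {
    new URL(value);
  } catch {
    return { ok: false, reason: 'invalid URL' };
  }
  return { ok: true, url: value };
}
```

Running the env var check before URL parsing matches the intent of the regression tests: the rejection reason is the placeholder itself rather than the URL's shape.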
Render assistant markdown as independently memoized top-level blocks instead of a
single ReactMarkdown that re-parses and re-highlights the entire message on every
streamed token. Once a block's source slice is stable it skips re-parse/re-render;
only the final, still-growing block re-parses.
- splitMarkdown: split a message into top-level blocks via mdast-util-from-markdown
(+ gfm/directive/math extensions) using node source offsets; also report per-block
executable-code and artifact index counts.
- MarkdownBlocks: render each block memoized on its raw slice, each wrapped in its
own CodeBlock/Artifact providers seeded with prefix-summed base indices, so the
document-order indices used to match code-execution results stay stable under
memoization (verified by OLD-vs-NEW parity tests across direct + streamed renders).
- CodeBlockContext/ArtifactContext: add optional baseIndex (default 0, fully
backward compatible) so per-block providers continue the running index.
- markdownConfig: extract the shared remark/rehype plugins + components map.
- deps: declare mdast-util-from-markdown, mdast-util-gfm/math/directive and the
micromark gfm/math/directive extensions as direct client dependencies (previously
resolved transitively via react-markdown).
- Tests: splitter unit tests; index parity + DOM equivalence vs the whole-message
renderer; rendering smoke tests.
- Bench (MarkdownBlocks.bench.tsx, outside __tests__ so the default jest run skips
it): ~88% fewer code-block renders and ~2.3x faster cumulative render across a
simulated stream.
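The prefix-summed base indices can be sketched as follows. `BlockSummary` and `computeBaseIndices` are hypothetical names for illustration. The real splitter reports counts for both executable code and artifacts, while this sketch tracks a single counter.

```typescript
// Hypothetical sketch of prefix-summed base indices; names are illustrative.
type BlockSummary = { raw: string; codeBlockCount: number };

// Each block's base index is the number of code blocks in all earlier blocks,
// so per-block providers continue the document-order index.
function computeBaseIndices(blocks: BlockSummary[]): number[] {
  const bases: number[] = [];
  let running = 0;
  for (const block of blocks) {
    bases.push(running);
    running += block.codeBlockCount;
  }
  return bases;
}
```

Because a block's base depends only on the blocks before it, appending streamed content to the final block leaves every earlier base unchanged, which is what lets memoized blocks keep stable indices.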
* feat: surface message feedback (thumbs up/down) as Langfuse scores
When Langfuse tracing is enabled, the message feedback endpoint now posts a
boolean `user-feedback` score (1/0 + tag/comment) to Langfuse for the
assistant message's trace; clearing feedback deletes the score. Fire-and-
forget, so the feedback UX never blocks on Langfuse.
Linking is lookup-free: the run opts into deterministic Langfuse trace ids
(`langfuse.deterministicTraceId`, passed to the agents Run), so the trace id
is sha256(messageId)[:32]. The feedback route recomputes the same id and
scores by it.
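The lookup-free linking can be sketched as follows. This assumes the id is the first 32 hexadecimal characters of the SHA-256 digest, which is one reading of `sha256(messageId)[:32]`. The actual helper lives in api/server/utils/langfuseTrace.js and may differ in detail.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical sketch of a deterministic trace id derived from a message id.
// Assumption: the id is the first 32 hex characters of the SHA-256 digest.
function traceIdForMessage(messageId: string): string {
  return createHash('sha256').update(messageId).digest('hex').slice(0, 32);
}
```

Since the function is pure, the run and the feedback route can each compute the id independently, with no stored mapping between messages and traces.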
- api/server/services/Langfuse.js: POST/DELETE /api/public/scores (env-gated)
- api/server/utils/langfuseTrace.js: traceIdForMessage(messageId)
- api/server/routes/messages.js: fire feedback score after the Mongo write
- packages/api: pass langfuse.deterministicTraceId to the run
- bump @librechat/agents to ^3.2.21 (adds LangfuseConfig.deterministicTraceId)
Closes #13537
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: match Langfuse trace environment for feedback scores
@librechat/agents passes no environment to its Langfuse tracer, so
@langfuse/otel falls back to LANGFUSE_TRACING_ENVIRONMENT and otherwise to
Langfuse's "default". The score helper instead fell back to NODE_ENV, so a
deployment with only NODE_ENV=production filed scores under "production" while
the trace stayed on "default" — the score never landed on the trace.
Use LANGFUSE_TRACING_ENVIRONMENT only, and omit `environment` when unset so
Langfuse defaults both score and trace to "default".
Addresses Codex review on #13544.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: don't require LANGFUSE_BASE_URL to post feedback scores
The agent tracer emits traces with just the public/secret keys (defaulting to
Langfuse Cloud, or via the legacy LANGFUSE_BASEURL alias), but the score helper
disabled itself unless LANGFUSE_BASE_URL was set — so an otherwise-traced
deployment silently posted no scores. Resolve the base URL the same way the
tracer does (LANGFUSE_BASE_URL -> LANGFUSE_BASEURL -> Cloud) and gate enablement
on the credentials only.
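The resolution order can be sketched as below. The function names are hypothetical, the environment is passed in as a plain record to keep the sketch self-contained, and both the Cloud URL constant and the credential variable names (`LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`) are assumptions rather than values taken from this commit.

```typescript
// Hypothetical sketch of base URL resolution and enablement; names are illustrative.
type Env = Record<string, string | undefined>;

// Assumption: the Langfuse Cloud default endpoint.
const LANGFUSE_CLOUD_URL = 'https://cloud.langfuse.com';

// Resolve in the order described: LANGFUSE_BASE_URL, then the legacy
// LANGFUSE_BASEURL alias, then Langfuse Cloud.
function resolveLangfuseBaseUrl(env: Env): string {
  return env.LANGFUSE_BASE_URL || env.LANGFUSE_BASEURL || LANGFUSE_CLOUD_URL;
}

// Enablement depends on the credentials only, not on a base URL being set.
function isScoringEnabled(env: Env): boolean {
  return Boolean(env.LANGFUSE_PUBLIC_KEY && env.LANGFUSE_SECRET_KEY);
}
```

A deployment that sets only the two keys is therefore enabled and posts scores to the Cloud default.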
Addresses Codex review on #13544.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: only post feedback scores for agent-endpoint messages
The feedback route is shared by all message types, but deterministic Langfuse
trace IDs are only enabled for agent runs. Rating a message from a non-agent
endpoint (with Langfuse configured) posted a user-feedback score for
sha256(messageId) that no trace will ever match, leaving orphan scores.
Gate scoring on isAgentsEndpoint(message.endpoint); `updateMessage` now returns
`endpoint` so the route can check it.
Addresses Codex review on #13544.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: gate feedback scoring by !isAssistantsEndpoint, not isAgentsEndpoint
The previous gate used isAgentsEndpoint, which only matches the literal
`agents` endpoint. But provider endpoints (anthropic, openai, custom, …) run
through the agents runtime as ephemeral agents and DO emit deterministic
AgentRun traces, so isAgentsEndpoint('anthropic') === false suppressed scoring
for the common case. Only the OpenAI/Azure Assistants endpoints use a separate
runtime with no agent trace, so gate on !isAssistantsEndpoint instead.
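The difference between the two gates can be illustrated with a small sketch. `isAssistantsEndpointSketch` and `shouldScoreFeedback` are hypothetical stand-ins, not the real helpers, and the endpoint strings `assistants` and `azureAssistants` are assumed values.

```typescript
// Assumed endpoint identifiers for the OpenAI/Azure Assistants runtimes.
const ASSISTANTS_ENDPOINTS = new Set<string>(["assistants", "azureAssistants"]);

// Hypothetical stand-in for isAssistantsEndpoint.
function isAssistantsEndpointSketch(endpoint?: string | null): boolean {
  return endpoint != null && ASSISTANTS_ENDPOINTS.has(endpoint);
}

/**
 * Score every message except those from the Assistants runtimes, since
 * provider endpoints run through the agents runtime and emit traces.
 */
function shouldScoreFeedback(endpoint?: string | null): boolean {
  return !isAssistantsEndpointSketch(endpoint);
}
```

With this shape, `anthropic`, `openai`, and `agents` all pass the gate, which an allow-list on the literal `agents` endpoint would not permit.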
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* style: sort message method imports
* fix: honor Langfuse tracing gates for feedback scores
* refactor: move Langfuse feedback logic to api package
* fix: support Langfuse host for feedback scores
* test: type Langfuse feedback fetch mock
* chore: compact Langfuse feedback comment
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
Immediate title generation discarded an already-generated title when the
user stopped the turn, both in the backend (skipped saveConvo) and the
frontend (rolled back the streamed title), leaving the chat as "Untitled"
in the interim and "New Chat" after refresh.
Split the title abort into two signals: `signal` still cancels an in-flight
title model call on Stop, while a new `discardSignal` discards an
already-generated title only when the stream is superseded by a newer run
or the turn fails. A plain user Stop now persists and keeps the title.
The frontend no longer rolls back a real, already-applied title on an
aborted final event.
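A minimal sketch of the two-signal split, assuming standard `AbortSignal` semantics. `shouldPersistTitle`, `generateTitle`, `callModel`, and `save` are hypothetical names, not the actual implementation.

```typescript
/** After a title exists, only discardSignal can throw it away. */
function shouldPersistTitle(discardSignal: AbortSignal): boolean {
  return !discardSignal.aborted;
}

async function generateTitle(opts: {
  signal: AbortSignal; // user Stop: cancels an in-flight title model call
  discardSignal: AbortSignal; // superseded run or failed turn: discards a finished title
  callModel: (signal: AbortSignal) => Promise<string>;
  save: (title: string) => Promise<void>;
}): Promise<string | null> {
  let title: string;
  try {
    title = await opts.callModel(opts.signal);
  } catch {
    // The call was cancelled or failed, so there is no title to keep.
    return null;
  }
  if (!shouldPersistTitle(opts.discardSignal)) {
    return null;
  }
  // A plain Stop that arrives after generation still reaches this save.
  await opts.save(title);
  return title;
}
```

The point of the design is that `signal.aborted` is never consulted once the model call has returned, so stopping the turn cannot erase a title that already exists.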
* 🩹 fix: Bump GitNexus to 1.6.5 and Fail-Soft the PR Index Job
The GitNexus Index workflow began failing on most PRs with
"Analysis failed: Maximum call stack size exceeded". Root cause is in
the pinned gitnexus@1.5.3 CLI: pipeline.js does
`deferredWorkerCalls.push(...chunkWorkerData.calls)`, and once a chunk
yields more extracted calls than V8's argument-count limit (~125k on
this repo) the spread-push throws a RangeError. It is deterministic on
repo size, not flaky — LibreChat simply grew past the threshold, so it
fails "more often" as more branches cross it. Stack-size flags don't
help; it's an arg-count limit, not stack depth.
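The failure mode comes from `push(...items)` passing every element as a separate call argument. The sketch below shows one way to append without spreading; `appendAll` is a hypothetical helper for illustration and is not a description of how gitnexus 1.6.5 restructured its pipeline.

```typescript
/**
 * Appends every element of `source` to `target` one call at a time,
 * so the number of arguments per call never grows with the input size.
 */
function appendAll<T>(target: T[], source: readonly T[]): void {
  for (const item of source) {
    target.push(item);
  }
}

// Risky pattern for very large arrays (may throw a RangeError):
//   target.push(...source);
```

Usage is a drop-in replacement for the spread form, for example `appendAll(deferredCalls, chunkCalls)` where both arguments are plain arrays.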
gitnexus@1.6.5 refactored that code path (the .calls spread-pushes are
gone) and indexes this repo cleanly. Bump the indexer, the deploy image
tag/build-arg, and the Dockerfile default in lockstep (an index written
by 1.6.5 must be served by a 1.6.5 server), and move the co-pinned
@ladybugdb/core to 0.16.1 to match.
Also make the index job fail-soft on pull_request events so a future
tool-internal crash degrades gracefully instead of red-X'ing PRs. Push,
dispatch, and /gitnexus command runs still fail loudly, keeping the
deploy-gating and completion-comment logic correct.
* 🐳 fix: Unbreak the GitNexus Deploy Image for 1.6.5
Addresses two issues in the deploy image surfaced after the 1.6.5 bump:
- The image build's lbug-adapter patch grepped
dist/mcp/core/lbug-adapter.js for "LOAD EXTENSION fts", but in 1.6.5
that file is a shim re-export and the FTS load moved to
dist/core/lbug/lbug-adapter.js. The grep would fail the build on the
next image rebuild. The patch is also obsolete: 1.6.5 loads the vector
extension itself via loadVectorExtension. Removed the patch step.
- The image installed only gitnexus, letting @ladybugdb/core resolve
freely via gitnexus's ^0.16.1 range while the index workflow pins
0.16.1 exactly. Pin the native DB in the image too (nested under
gitnexus so install-extensions.js keeps resolving it), restoring the
intended indexer/server lockstep.
The `endpointsConfig` fixture in `EndpointIcon.test.tsx` casts an object whose
values are `{}` to `TEndpointsConfig` (`Record<EModelEndpoint | string, TConfig | null | undefined>`).
`TConfig.order` is required, so `{}` doesn't overlap `TConfig` and the direct
assertion is a TS2352 error under a fresh `tsc --noEmit` over the client
workspace (the type-check job added in #13560), when `librechat-data-provider`
is built from source (the test was added in #13563):
Conversion of type '{ agents: {}; google: {}; }' to type 'TEndpointsConfig'
may be a mistake because neither type sufficiently overlaps with the other.
Give the fixture entries the required `order` field so they're valid `TConfig`
values. This keeps the plain `as TEndpointsConfig` assertion type-checking the
fixture shape, rather than blanking it out with `as unknown as`.
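The fix can be pictured with simplified stand-in types. The `TConfig` and `TEndpointsConfig` definitions below are reduced assumptions that keep only the required `order` field, not the real `librechat-data-provider` types, and the `order` values are arbitrary.

```typescript
// Simplified stand-ins; the real types carry many more fields.
type TConfig = { order: number; type?: string };
type TEndpointsConfig = Record<string, TConfig | null | undefined>;

// Per the commit message, entries of `{}` make the direct assertion fail
// with TS2352 against the real types because `order` is required.
// Supplying `order` makes each entry a valid TConfig, so the plain
// assertion still checks the fixture shape.
const endpointsConfig = {
  agents: { order: 0 },
  google: { order: 1 },
} as TEndpointsConfig;
```

Compared with `as unknown as TEndpointsConfig`, this keeps the compiler involved: a future required field on `TConfig` would surface in the fixture again.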
The `client/` workspace was never type-checked: the existing typecheck
job only covered `packages/` and `api/`, and Vite/esbuild transpiles
without type-checking, so type errors shipped through every CI gate.
- Add a `typecheck` job to frontend-review.yml running `tsc --noEmit`
over `client/` (zero tolerance), reusing the data-provider +
client-package build artifacts. Triggers on `client/**`,
`packages/client/**`, `packages/data-provider/**`.
- Fix all 168 pre-existing client type errors this surfaced (source +
tests), including genuine latent bugs:
- `getFileConfig()` was typed as merged `FileConfig`, but the server
returns the raw config that `mergeFileConfig()` consumes (`TFileConfig`).
- SidePanel/Agents `Retrieval`/`ImageVision` were bound to `AgentForm`
but use the assistants `Capabilities` enum → `AssistantForm`.
- `useSearchResultsByTurn` read a `sources` field its type lacked.
- Removed orphaned dead code: `Artifacts/Mermaid.tsx` (imported a
never-installed dep) and dead barrel re-exports (`./Plugins`, `./MCPAuth`).
- Narrow `client/tsconfig.json` to the client app (drop `../e2e` and
`../config/translations`, which reference backend/tooling modules) so
the gate's scope matches its trigger.
No `any`/`@ts-ignore`/`as unknown as`. Localized newly-surfaced strings.