Commit graph

105 commits

Author SHA1 Message Date
Danny Avila
21607ba3d7
📎 fix: Preserve Provider Document Uploads (#13550)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* fix: Preserve provider document uploads

* test: Add provider upload e2e coverage
2026-06-06 10:03:32 -04:00
Danny Avila
5b66196f58
🪪 fix: Scope Message Conversation Access (#13183)
* fix: Scope message conversation access

* style: Format message route query
2026-05-18 17:34:30 -04:00
Danny Avila
93c4ef4ba8
🧱 refactor: typed CodeEnvRef + kind discriminator + principal-aware sandbox cache (#12960)
* 🧱 refactor: typed CodeEnvRef + kind discriminator + tenant-aware sandbox cache

Final cutover for the LibreChat ↔ codeapi sandbox file identity. Replaces
the magic string `${session_id}/${file_id}?entity_id=...` with a typed,
discriminated `CodeEnvRef`. Pre-release lockstep deploy with codeapi
#1455 and agents #148; no legacy aliases retained.

## Final shape

```ts
type CodeEnvRef =
  | { kind: 'skill'; id: string; storage_session_id: string; file_id: string; version: number }
  | { kind: 'agent'; id: string; storage_session_id: string; file_id: string }
  | { kind: 'user';  id: string; storage_session_id: string; file_id: string };
```

`kind` drives codeapi's sessionKey: `<tenant>:<kind>:<id>[✌️<version>]`
for shared kinds, `<tenant>:user:<userId>` for user-private (auth context
provides `userId`). `version` is statically required for `kind: 'skill'`
and forbidden otherwise via discriminated union — constraint holds at
compile time on every consumer, not just codeapi's runtime validator.

`id` is sessionKey-meaningful for `'skill'` / `'agent'`; informational
only for `'user'` (codeapi resolves user identity from auth context).

## What changed

- `packages/data-provider/src/codeEnvRef.ts` — discriminated union +
  `CODE_ENV_KINDS` const-tuple keeps the runtime list and TS union
  locked together.
- Schemas: `metadata.codeEnvRef` and `SkillFile.codeEnvRef` enums
  tightened to `['skill', 'agent', 'user']`.
- `primeSkillFiles` writes `kind: 'skill'`, `id: skill._id`,
  `version: skill.version`. Cache-hit path reads `codeEnvRef`
  directly. Bumping `skill.version` on edit naturally invalidates
  the prior cache entry under the new sessionKey.
- `processCodeOutput` writes `kind: 'user'`, `id: req.user.id`. Output
  bucket is always user-scoped, regardless of which skill the
  execution invoked. New regression test pins the asymmetry.
- `primeFiles` reupload preserves `kind`/`id`/`version?` from the
  existing ref so a skill-cache-miss reupload doesn't silently demote
  to user bucket.
- `crud.js` upload functions (`uploadCodeEnvFile` /
  `batchUploadCodeEnvFiles`) thread `kind`/`id`/`version?` to the
  multipart form (codeapi #1455 option α). Without these on the wire,
  codeapi falls back to user bucketing and skill-cache invalidation
  never fires. Client-side validation mirrors codeapi's validator.
- `Files/process.js` — chat attachments use `kind: 'user'`; agent
  setup files use `kind: 'agent'`.
- Drops `entity_id` everywhere (struct, schema sub-docs, write paths,
  upload form fields). Drops `'system'` from the kind enum (no emitter
  ever existed).

## Test plan

- [x] `cd packages/data-provider && npx jest src/codeEnvRef.spec` — 4 / 4
- [x] `cd packages/data-schemas && npx jest` — 1447 / 1447
- [x] `cd packages/api && npx jest src/agents` — 81 / 81 in skillFiles +
  handlers + resources
- [x] `cd api && npx jest server/services/Files server/controllers/agents` —
  436 / 436
- [x] `cd api && npx jest server/services/Files/Code` — 98 / 98 (incl.
  new "outputs are user-scoped regardless of which skill the execution
  invoked" regression and "reupload forwards kind/id/version from
  existing ref")
- [x] `npx tsc --noEmit -p packages/data-{provider,schemas}/tsconfig.json
  && npx tsc --noEmit -p packages/api/tsconfig.json` — clean (only
  pre-existing unrelated dev errors in storage/balance, untouched here)

## Deploy notes

- **24h cache-miss burst** on first deploy. Inputs (skill caches re-prime
  under new sessionKey shape) and outputs (any pre-Phase C skill-output
  cached files become unreadable). Bounded by codeapi's 24h TTL.
- **Lockstep with codeapi #1455 and agents #148.** Either repo can land
  first since no aliases to drain, but the three deploys must overlap
  within the same maintenance window.
- **`@librechat/agents` bump to `3.1.79-dev.0`** required after agents
  #148 lands and is published.

## What this enables

Auth bridge work (JWT-based tenant/user identity between LC and codeapi)
— codeapi now derives sessionKey purely from `req.codeApiAuthContext.{
tenantId, userId}`, so the next chapter is replacing the header-asserted
user identity with a verified-claim path.

* 🩹 fix: persist execute_code uploads under codeEnvRef metadata key

Codex review P1 (chatgpt-codex-connector). `Files/process.js` was
storing the upload result under `metadata.fileIdentifier` even though:
- `uploadCodeEnvFile` now returns `{ storage_session_id, file_id }`,
  not the legacy magic string.
- The post-cutover schema (`File.metadata.codeEnvRef`) only declares
  `codeEnvRef` — mongoose strict mode silently strips unknown keys.
- All readers (`primeFiles`, `getCodeFilesByIds`,
  `categorizeFileForToolResources`, controller filtering) check
  `metadata.codeEnvRef`.

Net effect of the bug: chat-attached and agent-setup execute_code files
would lose their sandbox reference on save, and primeFiles would skip
them on subsequent code-execution turns — the file blob would still be
available locally but never re-mounted in the sandbox.

Fix: construct the full `CodeEnvRef` (`{ kind, id, storage_session_id,
file_id }`) at the write site and persist under `metadata.codeEnvRef`.
`BaseClient`'s "is this a code-env file" presence check accepts the new
shape alongside the legacy `fileIdentifier` for back-compat with any
pre-cutover records still in the database. Mirrors the same change in
`processAttachments.spec.ts` (which re-implements the BaseClient logic
for testability).

New regression tests in `process.spec.js` cover three cases:
- chat attachments (`messageAttachment=true`) → `kind: 'user'`
- agent setup (`messageAttachment=false`) → `kind: 'agent'`
- legacy `fileIdentifier` key is NOT persisted (would be schema-stripped)

* 🩹 fix: read storage_session_id on primed file refs (Codex P1)

Codex review (chatgpt-codex-connector). After Phase B's per-file
`session_id` → `storage_session_id` rename, `primeFiles` emits the
new field — but `seedCodeFilesIntoSessions` was still reading
`files[0].session_id` for the representative session and `f.session_id`
for the dedupe key. In runs with only primed attachments (no skill
seed), `representativeSessionId` was `undefined`, the function
returned the unchanged map, and `seedCodeFilesIntoSessions` silently
dropped the entire batch. The first `execute_code` call then started
without `_injected_files` and the agent couldn't see prior-turn
artifacts.

Fix:
- `codeFilesSession.ts`: read `f.storage_session_id` for both the
  dedupe key and the representative session id. JSDoc updated to
  match the new field name.
- `callbacks.js`: the two output-file persistence paths read
  `file.session_id` to pass to `processCodeOutput` — switch to
  `file.storage_session_id`. The original comment explicitly says
  this should be the STORAGE session, which is exactly the field
  Phase B renamed.
- `codeFilesSession.spec.ts`: fixture builder uses `storage_session_id`
  and `kind: 'user'` to match the post-cutover `CodeEnvFile` shape.

Lockstep coordination: this matches the post-bump shape of
`@librechat/agents` 3.1.79+. CI tsc errors against the currently-pinned
3.1.78 are expected and resolve when the dep bumps in this PR before
merge.

* 📦 chore: Bump `@librechat/agents` to version 3.1.80-dev.0 in package-lock and package.json files

* 🪪 fix: thread kind/id/version through codeapi /download URLs (Phase C α)

Symmetric fix for the upload-side wire change in 537725a. Codeapi's
`sessionAuth` middleware now requires `kind`/`id`/`version?` on every
download/freshness URL — without them it 400s with "kind must be one
of: skill, agent, user" before serving the file.

Three sites construct codeapi-side URLs that go through `sessionAuth`:

- `processCodeOutput` (`Files/Code/process.js`): `/download/<sess>/<id>`
  for freshly-generated sandbox outputs. Always `kind: 'user'` +
  `id: req.user.id` — code-output files are always user-private,
  regardless of which skill the run invoked.
- `getSessionInfo` (`Files/Code/process.js`): `/sessions/<sess>/objects/<id>`
  for the 23h freshness check. Pulls kind/id/version straight off the
  `codeEnvRef` already in scope — skill files stay skill-bucketed,
  user files stay user-bucketed.
- `/code/download/:session_id/:fileId` LC route (`routes/files/files.js`):
  proxies to codeapi for manual downloads. Code-output files only on
  this route, so `kind: 'user'` + `id: req.user.id`.

The `getCodeOutputDownloadStream` helper in `crud.js` now takes an
`identity` param, validated by a `buildCodeEnvDownloadQuery` helper
that mirrors `appendCodeEnvFileIdentity`'s shape rules: kind required
from the closed `{skill, agent, user}` set, version required for
'skill' and forbidden otherwise. Bad callers fail fast on the client
instead of round-tripping a 400.

Also cleans up two log-noise sources reported alongside the 400:

- `logAxiosError` in `packages/api/src/utils/axios.ts` was dumping
  `error.response.data` raw. With `responseType: 'arraybuffer'` that's
  a `Buffer` (~4 chars per byte after JSON-serialization); with
  `responseType: 'stream'` it's a `Readable` whose internal state
  serializes the entire ring buffer + socket. New `renderResponseData`
  decodes small buffers as UTF-8 (truncated past 2KB) and stubs streams
  as `'[stream]'`. Diagnostics stay useful, log lines stop being
  megabytes.
- `/code/download` route's catch was bare `logger.error('...', error)`,
  bypassing the redactor. Switched to `logAxiosError` so it benefits
  from the same buffer/stream handling.

Tests updated to match the new contract:
- crud.spec: `getCodeOutputDownloadStream` fixtures pass `userIdentity`;
  new cases cover skill identity (with version), bad kind rejection,
  skill-without-version rejection.
- process.spec: `getSessionInfo` test passes a full `codeEnvRef` object.

* ♻️ refactor: extract codeEnv identity helpers into packages/api

Per the project convention that new backend code lives in TypeScript
under `packages/api`, moves `appendCodeEnvFileIdentity` and
`buildCodeEnvDownloadQuery` from `api/server/services/Files/Code/crud.js`
into a new `packages/api/src/files/code/identity.ts` module.

Both helpers are pure validators that mirror codeapi's
`parseUploadSessionKeyInput` server-side rules (closed kind set,
`version` required for `'skill'` and forbidden otherwise) — they
deserve TS support and a dedicated spec rather than living as
JSDoc-typed helpers in the legacy `/api` workspace. The new module:

- Exports a `CodeEnvIdentity` interface using the
  `librechat-data-provider` `CodeEnvKind` discriminated union.
- Adds 13 unit tests in `identity.spec.ts` covering the validation
  matrix (skill+version, agent, user, and every rejection path) plus
  URL encoding for the download query.
- Re-exported from `packages/api/src/files/code/index.ts` alongside
  `classify`, `extract`, and `form`.

Consumer updates:
- `api/server/services/Files/Code/crud.js`: drops the local helpers
  and imports them from `@librechat/api`. Net -64 lines.
- `api/server/services/Files/Code/process.js`: same.
- Test mocks for `@librechat/api` in three spec files now stub the
  helpers' validation behavior locally rather than pulling them
  through `requireActual` (which would drag in provider-config
  init-time side effects). The package's `exports` field only
  surfaces the root barrel, so leaf imports aren't reachable from
  legacy `/api` test setup.

No runtime behavior change. Identity validation rules and emitted
form/query shapes are byte-for-byte identical pre/post.

* 🪪 fix: emit resource_id alongside id on _injected_files (skill 403 fix)

Companion to codeapi #1455 fix and agents 3.1.80-dev.1 — the wire
shape for shared-kind files now requires `resource_id` distinct from
the storage `id`. Without this LC change, codeapi's sessionKey
re-derivation on every shared-kind /exec rejects with 403
session_key_mismatch:

    cached:  legacy:skill:69dcf561...✌️59  (signed at upload, skill _id)
    derived: legacy:skill:ysPwEURuPk-...✌️59  (storage nanoid)

Emit sites updated:

- `primeInvokedSkills` cache-hit path: `resource_id: ref.id` (the
  persisted skill `_id` from `codeEnvRef.id`); `id: ref.file_id`
  unchanged (storage uuid).
- `primeInvokedSkills` fresh-upload path: `resource_id: skill._id.toString()`
  on every primed file (the `allPrimedFiles` builder type now carries
  the field).
- `processCodeOutput`'s `pushFile` (Code/process.js): `resource_id: ref.id`
  — for `kind: 'user'` this is informational (codeapi derives
  sessionKey from auth context) but emitted for shape uniformity
  with shared kinds.

Bumps `@librechat/agents` to `^3.1.80-dev.1` (the version that
ships the matching `CodeEnvFile.resource_id` field).

## Test plan

- [x] `cd packages/api && npx jest src/agents` — 67 / 67 pass
  (skillFiles fixtures updated to assert `resource_id` on the
  emitted CodeSessionContext.files).
- [x] `cd api && npx jest server/services/Files server/controllers/agents` —
  445 / 445 pass (process.spec fixtures updated for the reupload
  + cache-hit emission).
- [x] `npx tsc --noEmit -p packages/api/tsconfig.json` — clean.

* fix(skill-tool-call): carry resource_id through primeSkillFiles → artifact

Codeapi was 400ing every /exec following a `handle_skill` tool call
with `resource_id is invalid` (`type: 'undefined'`). Both code paths
in `primeSkillFiles` (cache-hit + fresh-upload) returned files
without `resource_id`/`kind`/`version`, and the artifact in
`handlers.ts` forwarded the stripped shape into
`tc.codeSessionContext.files` → `_injected_files`.

`primeInvokedSkills` (the NL-detected loader) had already been fixed
end-to-end; this commit aligns the tool-invoked path with the same
contract: `resource_id` = `skill._id.toString()`, `kind: 'skill'`,
`version` = the skill's monotonic counter.

Tests added to `skillFiles.spec.ts` lock the contract on
`primeSkillFiles` directly so future refactors can't silently drop
the resource identity again.

* fix(handlers.spec): align session_id → storage_session_id rename + kind discriminator

Pre-existing TS errors against the post-rename `CodeEnvFile` shape:
the test file still used `session_id` on per-file objects (renamed to
`storage_session_id` in agents Phase B/C) and was missing the `kind`
discriminator the discriminated union requires. Both inputs and the
matching `expect.toEqual(...)` mirrors updated together so the
runtime equality check still holds.

Lines 723-732 stay as-is — they sit behind `as unknown as
ToolCallRequest` and TS already skipped them.

* chore: fix `@librechat/agents`, correct version to 3.1.80-dev.0 in package.json files

* chore: bump `@librechat/agents` to version 3.1.80-dev.1 in package.json and package-lock.json

* chore: bump `@librechat/agents` to version 3.1.80-dev.2

* feat(observability): trace file priming chain from primeCodeFiles to _injected_files

Diagnosing the user-upload "files=[] on first /exec" bug requires
seeing where in the LC chain a file ref disappears. Prior to this
patch the chain (primeCodeFiles → primedCodeFiles → initialSessions
→ CodeSessionContext → _injected_files) was opaque end-to-end:
  - primeCodeFiles silently dropped files without `metadata.codeEnvRef`
  - reuploadFile catches all errors and continues with no signal
  - the handlers.ts handoff to codeapi never logged what it was sending

After this patch, a single grep on `[primeCodeFiles]` plus
`[code-env:inject]` shows the full per-file path:

  [primeCodeFiles] in: file_ids=N resourceFiles=M
  [primeCodeFiles] file=<id> path=skip reason=no-codeenvref filename=...
  [primeCodeFiles] file=<id> path=cache-hit-by-session storage_session_id=...
  [primeCodeFiles] file=<id> path=reupload reason=no-uploadtime ...
  [primeCodeFiles] file=<id> path=reupload reason=stale ...
  [primeCodeFiles] file=<id> path=reupload-success oldSession=... newSession=... newFileId=...
  [primeCodeFiles] file=<id> path=reupload-failed session=...
  [primeCodeFiles] file=<id> path=fresh-active storage_session_id=...
  [primeCodeFiles] out: returned=N skippedNoRef=M reuploadFailures=K

  [code-env:inject] tool=<name> files=N missingResourceId=K     (debug)
  [code-env:inject] M/N files missing resource_id ...           (warn)
  [code-env:inject] tool=<name> _injected_files=0 ...           (warn)

The boundary log warns when LC sends zero injected files on a
code-execution tool call — that's the user's actual symptom showing
up at the LC side instead of having to correlate against codeapi's
`Request received { files: [] }`.

Tag chosen as `[code-env:inject]` rather than `[handoff:exec]` to
avoid collision with the app-level "handoff" semantic (subagent
handoff workflow).

Structural cleanup in primeFiles: replaced the `if (ref) { ... }`
nesting with an early `if (!ref) continue` so the per-path
instrumentation hooks land at top-level scope instead of indented
inside a conditional. Behavior unchanged; pushFile / reuploadFile
identical.

Spec fixtures (handlers.spec.ts, codeFilesSession.spec.ts) updated
to include `resource_id` on `CodeEnvFile` literals — required by
the post-3.1.80-dev.2 type now installed.

## Test plan

- [x] `cd packages/api && npx jest src/agents/handlers.spec.ts src/agents/codeFilesSession.spec.ts src/agents/skillFiles.spec.ts` — 69/69 pass
- [x] `cd api && npx jest server/services/Files/Code/process.spec.js` — 84/84 pass
- [x] `npx tsc --noEmit -p packages/api` — clean
- [x] `npx eslint` on all four touched files — clean

* chore: add CONSOLE_JSON_STRING_LENGTH to .env.example for JSON log string length configuration

* fix(files): align codeapi upload filename with LC's sanitized DB filename

User-attached files for code execution were uploading to codeapi
under `file.originalname` (raw upload filename, may contain spaces /
special chars) while LC's DB record stored the sanitized form
(`sanitizeFilename(file.originalname)`, underscores). Codeapi
preserves whatever filename the upload sent, so the sandbox saw
`/mnt/data/<originalname>` while LC's `primeFiles` toolContext text
+ `_injected_files.name` referenced `file.filename` (sanitized).

Visible failure: agent gets system prompt saying

    /mnt/data/librechat_code_api_-_active_customer_-_2025-11-05.xlsx

…tries that path, hits `FileNotFoundError`, then notices the
sandbox's actual `Available files` line says

    /mnt/data/librechat code api - active customer - 2025-11-05.xlsx

…retries with spaces, succeeds. Wastes a tool call per upload and
leaks raw filenames into model context.

Fix: sanitize once and use the sanitized form in both the codeapi
upload AND the LC DB record. Sandbox path = LC toolContext text =
in-memory ref name. No drift.

Reupload path (`Code/process.js` line 867 `filename: file.filename`)
already uses the sanitized DB name, so it stays consistent with the
fresh-upload path after this change.

## Test plan

- [x] `cd api && npx jest server/services/Files/process` — 32/32 pass
- [x] `npx eslint` on the touched file — clean

* chore: bump `@librechat/agents` to version 3.1.80-dev.3 in package.json and package-lock.json
2026-05-08 12:29:43 -04:00
Danny Avila
f3e1201ae7
📌 fix: Stabilize Agent Prompt Cache Prefix (#12907)
* fix: stabilize agent prompt cache prefix

* chore: refresh agents sdk lockfile integrity

* test: format agent memory assertion

* test: type agent context fixtures

* fix: preserve MCP instruction precedence

* fix: reuse resolved conversation anchor

* fix: keep resumable startup immediate
2026-05-02 09:55:31 +09:00
Danny Avila
91cd3f7b7c 🧽 refactor: Skills polish: precedence-aware body validation, controller drop logs, SkillPills rename (#12760)
Post-merge sanity-review cleanup on top of #12746:

- `createSkill` / `updateSkill` now parse SKILL.md body's always-apply
  status once and reuse it for both validation and derivation (was
  parsing the same YAML block twice per call).
- Body-inline `always-apply:` validation becomes precedence-aware: a
  caller sending an explicit top-level `alwaysApply` or a structured
  `frontmatter['always-apply']` no longer gets rejected for a typo in
  the body — the body value is never consulted at derivation time when
  a higher-precedence source wins. New tests cover the three relevant
  interactions (explicit+body-typo, frontmatter+body-typo, body-only
  typo still rejects).
- OpenAI and Responses controllers now emit a `logger.warn` when
  `injectSkillPrimes` drops always-apply primes to stay under
  `MAX_PRIMED_SKILLS_PER_TURN`. `injectSkillPrimes` already logs
  internally; the controller-level warn adds endpoint context so
  operators can identify which path hit the cap at a glance. Mirrors
  AgentClient's existing log.
- Rename `ManualSkillPills` → `SkillPills` (component + type + file +
  test + all JSDoc references). The component handles both manual and
  always-apply pills now; the original name was carried over from the
  manual-only Phase 3 and misleads new readers.
- Drive-by fix: declare `appConfig = req.config` at the top of
  `createResponse` in `responses.js` — it was used unqualified on
  lines 381/396, which silently evaluated to `undefined` (via optional
  chaining) and disabled the skills-capability check on the Responses
  endpoint. Pre-existing, surfaced by lint on the touched file.
2026-04-25 04:02:01 -04:00
Danny Avila
dfc3dfa57f 📍 feat: always-apply frontmatter: auto-prime skills every turn (#12746)
* 🔁 refactor: Rebase always-apply work onto merged structured-frontmatter columns

Phase 6 (disable-model-invocation / user-invocable / allowed-tools)
landed first on feat/agent-skills. Reconcile this branch with the new
mainline:

- Thread alwaysApplySkillPrimes through unionPrimeAllowedTools alongside
  manualSkillPrimes, applying the combined MAX_PRIMED_SKILLS_PER_TURN
  ceiling before loading tools.
- Add `_id` to ResolvedAlwaysApplySkill to match Phase 6's
  ResolvedManualSkill shape (read_file name-collision protection).
- Register 'always-apply' in ALLOWED_FRONTMATTER_KEYS / FRONTMATTER_KIND
  so Phase 6's validator recognizes it.
- Drop frontmatter from the listSkillsByAccess projection; the backfill
  helper remains as defensive code but its read path is no longer
  exercised on summary rows (no legacy rows exist — the branch never
  shipped), saving ~200KB per page.
- Retire the corresponding "backfills legacy on summaries" test.
- Plumb listAlwaysApplySkills through the JS controllers + endpoint
  initializer so the always-apply resolver sees a real DB method.

* 🧹 fix: Dedupe manual/always-apply overlap, share YAML util, tidy comments

Addresses review findings:

- Cross-list dedup: when a user $-invokes a skill that is also marked
  always-apply, the always-apply copy is now dropped so the same
  SKILL.md body never primes twice in one turn. Manual wins (explicit
  intent, closer to the user message). Dedup runs in both
  initializeAgent (so persisted user-bubble pills stay in sync) and
  injectSkillPrimes (defense-in-depth at splice time). New test cases
  cover single-overlap, partial-overlap, and dedup-before-cap.
- DRY: extract stripYamlTrailingComment to
  packages/data-schemas/src/utils/yaml.ts; packages/api/src/skills/import.ts
  now imports the shared helper. Also drop the redundant inner
  stripYamlTrailingComment call inside parseBooleanScalar — the call
  site already strips.
- Mark injectManualSkillPrimes as @deprecated in favor of
  injectSkillPrimes (kept for external consumers of @librechat/api).
- Document SKILL_TRIGGER_MODEL as forward-looking plumbing for the
  model-invoked path rather than leaving it as a bare unused export.
- Replace the stale "frontmatter is included" comment on
  listSkillsByAccess with an accurate explanation of why it was
  intentionally excluded.

* 🔒 fix: Include always-apply primes in skillPrimedIdsByName + clear alwaysApply on body opt-out

Two bugs flagged by Codex review:

P1 (read_file): `manualSkillPrimedIdsByName` only carried manual-invocation
primes, so an always-apply skill with `disable-model-invocation: true`
was blocked from reading its own bundled files, and same-name collisions
could resolve to a different doc than the one whose body got primed.
- Rename `buildManualSkillPrimedIdsByName` → `buildSkillPrimedIdsByName`
  (accepts both manual + always-apply prime arrays).
- Rename the configurable field `manualSkillPrimedIdsByName` →
  `skillPrimedIdsByName` throughout the plumbing (skillConfigurable.ts,
  handlers.ts, CJS callers, tests).
- Overlap resolution: manual wins on the rare edge case where the same
  name appears in both arrays (upstream dedup should prevent this, but
  defensive merging treats manual as authoritative).
- New tests: (1) gate-relaxation fires for always-apply primes, (2) `_id`
  pinning works for always-apply same-name collisions.

P2 (updateSkill): when a body update had no `always-apply:` key,
`extractAlwaysApplyFromBody` returned `absent` and the column was left
untouched. A skill that was once `alwaysApply: true` would keep
auto-priming even after its SKILL.md no longer declared the flag.
- Treat `absent` as a positive "not always-apply" declaration when the
  body is explicitly submitted; flip the column to `false`.
- Explicit top-level `alwaysApply` still wins (three-source precedence
  unchanged).
- New tests: body removes key → false, body has no frontmatter at all →
  false, explicit + body-without-key → explicit wins.

* 🧵 refactor: Collapse duplicate prime types + tighten parse + test hygiene

Sanity-check review follow-ups:

- Collapse `ResolvedManualSkill` / `ResolvedAlwaysApplySkill` into a
  single `ResolvedSkillPrime` canonical interface with two backward-
  compatible type aliases. Both resolvers feed the same pipeline stages
  (injectSkillPrimes, unionPrimeAllowedTools, buildSkillPrimedIdsByName);
  the per-source distinction lives on `additional_kwargs.trigger`, not
  on the resolver output.
- Move the `always-apply` branch in `parseFrontmatter` to operate on the
  raw post-colon text. The outer `unquoteYaml` was fine today because
  it's idempotent on non-quoted strings, but running it twice (once per
  line, once after stripping the inline comment) would be fragile if the
  unquoter ever grows richer YAML-escape handling.
- Add the missing `alwaysApplyDedupedFromManual: 0` field to the
  `injectSkillPrimes` mocks in `openai.spec.js` and `responses.unit.spec.js`
  so they match the full `InjectSkillPrimesResult` contract.
- Insert the blank line between the `unionPrimeAllowedTools` and
  `resolveAlwaysApplySkills` describe blocks.

* 🔧 fix(tsc): Cast mock.calls via `unknown` for strict tuple destructure

`getSkillByName.mock.calls[0]` is typed as `[]` by jest's generic default;
a direct cast to `[string, ..., ...]` fails TS2352 under `--noEmit` even
though the runtime shape matches. Go through `as unknown as [...]` like
the earlier test in the same file so CI's type-check step stays green.

* 🪢 fix: Propagate skillPrimedIdsByName into handoff agent tool context

Handoff agents go through the same `initializeAgent` flow as the primary
(with `listAlwaysApplySkills` now plumbed), so they resolve their own
`manualSkillPrimes` and `alwaysApplySkillPrimes` — but the
`agentToolContexts.set(...)` for handoff agents didn't carry
`skillPrimedIdsByName` into the per-agent context.

That meant `handleReadFileCall` fell back to the full ACL set + a
`prefer*` flag for handoff agents: same-name collisions could resolve to
a different doc than the one whose body got primed, and a
`disable-model-invocation: true` skill primed via manual `$` or
always-apply inside the handoff flow would be blocked from reading its
own bundled files.

Build the map via `buildSkillPrimedIdsByName(config.manualSkillPrimes,
config.alwaysApplySkillPrimes)` for every handoff tool context so
`read_file` behaves identically across primary and handoff agents.
2026-04-25 04:02:00 -04:00
Danny Avila
539c4c7e4d 🎬 feat: Prime Manually-Invoked Skills via $ Popover (#12709)
* 🎬 feat: Prime Manually-Invoked Skills via $ Popover

Lands the backend for manual skill invocation, making the $ popover
deterministically prime SKILL.md before the LLM turn instead of leaving
the model to discover the skill via the catalog.

Flow: popover drains pendingManualSkillsByConvoId on submit, attaches
names to the ask payload, controllers forward to initializeAgent, and
initialize resolves each name to its body (ACL + active-state filtered,
reusing the same rules as catalog injection). AgentClient splices the
primes as meta HumanMessages before the user's current message.

- Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts
  and reuse primeManualSkill inside handleSkillToolCall for a single shape source.
- Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods
  and all three initializeAgent call sites (initialize.js, responses.js, openai.js).
- Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages,
  shifting indexTokenCountMap so hydrate still fills fresh positions correctly.
- Carry isMeta / source / skillName in additional_kwargs for downstream filtering.

* 🛡️ fix: Scope manual skill primes to single-agent + cap resolver input

Two follow-ups to the Phase 3 priming path flagged in Codex review.

Multi-agent runs: skipping the splice when agentConfigs is non-empty.
`initialMessages` is shared across every agent in `createRun`, so splicing
a skill body there would bypass Phase 1's per-agent `scopeSkillIds`
contract — a handoff / added-convo agent with a different skill scope
would see content its configuration excludes. Warn + skip is the minimal
correct behavior; lifting this to per-agent initial state is a follow-up.

Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS`
(10) after dedup, with a warn listing the dropped tail. Controllers only
validate `Array.isArray(req.body.manualSkills)`, so a crafted payload
could otherwise fan out into an unbounded `Promise.all` of concurrent
`getSkillByName` DB lookups. Cap lives in the resolver so every caller
(including future `always-apply` in Phase 5) inherits it.

* 🧪 refactor: Testable Helpers + Payload Validation for Manual Skill Primes

Follow-ups from the comprehensive review. No behavior change for the
happy path — these are architectural and defensive improvements that
shrink the JS surface in /api, tighten the request-body contract, and
cover the delicate splice logic with proper unit tests.

- Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts
  so the message-array splice and `indexTokenCountMap` shift are unit-
  testable in TS. client.js now calls the helper. Tests pin the `>=`
  vs `>` boundary condition — a regression here would silently corrupt
  token accounting for every message after the insertion point.
- Extract `extractManualSkills(body)` and use in all three controllers
  (initialize.js, responses.js, openai.js). Replaces copy-pasted
  `Array.isArray(...) ? ... : undefined` with a helper that also filters
  non-string / empty elements — closes a type-safety gap where a crafted
  payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach
  `getSkillByName` and waste DB round-trips.
- Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves
  three invocation modes (`$` popover, `always-apply`, model-invoked);
  the old name misled readers coming from `handleSkillToolCall`.
- Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills`
  — defensive, since the atom has a synchronous `[]` default, but the
  previous `.contents` cast would have been unsound under loading/error.
- Document why `resolveManualSkills` honors the active-state filter even
  for explicit `$` selections (Phase 2 popover filter + API-direct
  hardening).
- Remove stray `void Types;` in initialize.test.ts — `Types` is already
  consumed elsewhere in that test.

* 🔖 refactor: Single source for the skill-message source marker

Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction
paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for
the model-invoked tool path) and `injectManualSkillPrimes` (for the
user-invoked splice path). Downstream filtering and telemetry read this
marker, so the two paths must agree; keeping the literal in one place
removes the risk of them drifting when Phase 5's `always-apply` adds a
third caller.

* ♻️ refactor: Drop Multi-Agent Guard + Review Polish

- Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking
  primes to handoff / added-convo agents via shared `initialMessages` is
  the agents SDK's concern to scope; this layer should just inject and
  let the graph handle agent-scoped state. The guard was well-intended
  but produced a silent-drop UX where `$skill` in a multi-agent run did
  nothing.
- Bound the `[resolveManualSkills] Truncating ...` warn output to the
  first 5 dropped names plus a count suffix. A malicious payload of
  1000 names was previously spilling all ~990 names into the log line.
- Remove dead `?? []` from the `hasValue`-guarded loadable read in
  `drainPendingManualSkills` — the atom always yields a string[] when
  resolved, so the nullish fallback was unreachable.
- Reorder skills.ts imports to follow the style guide: value imports
  shortest-to-longest (`data-schemas` → `langchain/core/messages` →
  multi-line `@librechat/agents`), type imports longest-to-shortest.

* 🧠 fix: Strip Skill Primes from Memory Window + Unbreak CI Mocks

Two fixes after the last push.

CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock
`@librechat/api` and the mock didn't expose the new `extractManualSkills`
symbol, so every test in those files crashed before reaching the
`recordCollectedUsage` assertion. Added `extractManualSkills: jest.fn()`
returning `undefined` to both mocks; the controllers now no-op on
manualSkills as the tests expect.

Codex P2: `runMemory` passes `messages` straight through to the memory
processor, so after the splice in `injectManualSkillPrimes`, SKILL.md
bodies ride along as if they were real user chat. That pollutes memory
extraction with synthetic instruction content and crowds out real turns
from the window.

- Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts`
  — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker.
- Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))`
  at the top of `runMemory` before the window-sizing logic. Keeps the
  primes visible to the LLM (they still ride in `initialMessages`) but
  invisible to the memory layer.
- 5 new tests for the predicate covering marker-present, plain messages,
  different source, non-object inputs, and array filter integration.

* 📜 feat: Show Skill-Loaded Cards for Manually-Invoked Skills

The $ popover was priming SKILL.md bodies into the turn but leaving no
visible trace on the assistant response — from the user's view it looked
like the `$name ` cosmetic text did nothing. Now each manually-invoked
skill renders the same "Skill X loaded" tool-call card that model-invoked
skills already produce via PR #12684's SkillCall renderer.

Approach: post-run prepend to `this.contentParts`. The aggregator owns
per-step indices during the run, so pre-seeding collides; waiting until
`await runAgents(...)` returns lets the graph settle before synthetic
parts slot in at the front.

- Export `buildSkillPrimeContentParts(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Returns completed tool_call parts
  (`progress: 1`, args JSON-encoded with `{skillName}`, output matching
  the model-invoked path's wording) that the existing `SkillCall.tsx`
  renderer draws identically.
- In `AgentClient.chatCompletion`, prepend the built parts to
  `this.contentParts` immediately after `await runAgents`. Persistence
  and the final-event reconcile come for free — `sendCompletion` already
  reads `this.contentParts` verbatim.
- Card ordering: skills appear first in the assistant message, reflecting
  that priming ran before the LLM's turn.

Live-during-streaming cards are a separate follow-up — the graph's
index-based aggregator makes that a bigger lift and this change delivers
the core UX win without fighting the stream ordering.

6 new unit tests covering part shape, args JSON contract, output text,
unique IDs, empty input, and startOffset ID differentiation.

*  feat: Emit Optimistic Skill Cards + Wire Primes in OpenAI/Responses

Two follow-ups from testing.

Optimistic card emit: the main chat path was only showing "Skill X
loaded" cards at final-reconcile time, so the user saw nothing happen
until the stream finished. Now emit synthetic ON_RUN_STEP +
ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same
pattern the MCP OAuth flow uses in `ToolService` — so the cards appear
immediately. The graph's content at index 0 may overwrite them during
streaming, but the post-run `contentParts` prepend (unchanged) restores
them on final reconcile.

OpenAI + Responses parity: both controllers were resolving
`manualSkillPrimes` via `initializeAgent` but never injecting them into
`formattedMessages` before the run. Manual invocation silently did
nothing on `/v1/chat/completions` and the Responses API path. Now both
call `injectManualSkillPrimes` on the formatted messages so the model
sees SKILL.md bodies on every path. LibreChat-style card SSE events
don't apply to these OpenAI-shaped responses, so the live-emit is
chat-path-only.

- Export `buildSkillPrimeStepEvents(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID`
  by default so the frontend maps events to the in-flight preliminary
  response message, matching the OAuth emitter.
- In `AgentClient.chatCompletion`, emit via `sendEvent` (or
  `GenerationJobManager.emitChunk` in resumable mode) after
  `injectManualSkillPrimes` runs, before the LLM turn begins.
- Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after
  `formatAgentMessages`. Refactored the destructure to `let` on
  `indexTokenCountMap` so the injector's returned map is usable.
- 8 new unit tests covering the step-event builder: pair cardinality,
  default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on
  completion, index ordering, stepId/toolCallId pairing, empty input.

* 🎯 fix: Route Skill Prime Events to the Real Response + Sparse-Array Offset

Two bugs in the optimistic-card emit from the last pass.

1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the
   MCP OAuth pattern), but OAuth emits DURING tool loading — before the
   real response messageId exists. By the time skill priming fires, the
   graph is about to emit with `this.responseMessageId`, so the PRELIM
   runId orphaned every card onto the client's placeholder response
   entry in `messageMap`, separate from the one the LLM's events were
   building. Net effect: cards never rendered mid-stream.

   Now passing `this.responseMessageId` — the same ID `createRun`
   receives — so synthetic and real steps land on the same `messageMap`
   entry.

2. Index 0 collision. With the runId fixed, card-at-0 would have hit
   `updateContent`'s type-mismatch guard when the LLM's text delta
   arrived at the same index, suppressing the whole text stream.

   New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE
   emit and the server-side `contentParts` assignment. Sparse array
   during streaming renders as `[llm_text, ..., card]` (skip-holes via
   `Array#filter` / `Array#map`). `filterMalformedContentParts` from
   `sendCompletion` compacts to dense `[text, card]` before persistence,
   so streaming UI and saved message agree on order — no finalize
   reorder jank. Post-run switches from `contentParts.unshift` to
   `contentParts[OFFSET + i] = part` to mirror the live placement.

- Add `startIndex` option to `buildSkillPrimeStepEvents` with
  `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from
  `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a
  new test for the explicit `startIndex` override.

* 🎗️ feat: Replace \$skill-name Text with Pills on the User Message

The `$skill-name ` cosmetic text the popover was inserting into the
textarea had two problems: it lingered in the user message forever (the
card is a more meaningful marker), and it implied that free-form text
invocation like \"\$foo help me\" should work — which it doesn't, and
supporting it would mean another parsing layer nobody asked for.

Dropped the textarea insertion. Visual confirmation after submit now
comes from a compact `ManualSkillPills` row on the user bubble that
self-extinguishes once the backend's live skill-card stream
(`buildSkillPrimeStepEvents` from the last commit) populates the sibling
assistant response. Multiple skills render as multiple pills — the atom
was already a string array, so multi-select works for free.

- `SkillsCommand.tsx`: select handler no longer writes to the textarea.
  Still drops the trigger `$` via `removeCharIfLast`, still pushes to
  `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by
  user messageId. `useChatFunctions.ask` writes the drained skill list
  here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the
  message isn't a user message, when no skills are attached, or when
  the sibling assistant response already carries a `skill` tool_call
  content part (the live card took over). Reads messages via React Query
  so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel
  to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is
  cleared of `$` instead of populated with `\$name `. 5 new tests for
  `ManualSkillPills` covering empty state, non-user message guard,
  multi-skill rendering, the skill-card hide condition, and the
  text-only-content-doesn't-hide case.

* 🎛️ feat: Manual Skills as Persisted Message Field + Compose-Time Chips

Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and
   stayed there on reload) because the sparse index-100 offset put them
   after the model's content. Now back to `unshift` — cards at the top,
   same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card
   arrived, so users barely saw them. The live-emit channel also added
   meaningful complexity and relied on a per-message Recoil atom that
   had no clean cleanup story.
3. No visual cue at all during new-chat compose — the `$name ` text was
   removed, the submitted-message pills weren't there yet, and the
   popover closes after selection. User had no way to see what they'd
   queued up before sending.

New architecture: `manualSkills` is a first-class field on `TMessage`,
persisted by the backend on the user message. `ManualSkillPills` reads
straight from `message.manualSkills` — no atom, no sibling-lookup — so
pills survive reload, show in history, and stay for the lifetime of the
message. Compose-time chips above the textarea read the existing
`pendingManualSkillsByConvoId` atom and let users × skills out before
submitting.

Backend reverts:
- `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored
  `this.contentParts.unshift(...primeParts)` so cards sit at the top of
  the persisted assistant response.
- `skills.ts`: removed `buildSkillPrimeStepEvents` and
  `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`,
  `StepTypes`, and `Constants` imports went with them. Removed 8 tests.

Field persistence:
- `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`.
- Mongoose message schema gains `manualSkills: { type: [String] }` with
  matching `IMessage` TS field.
- `BaseClient.js` reads `req.body.manualSkills` on user-message save,
  filters to non-empty strings, pins onto `userMessage` before
  `saveMessageToDatabase`. Mirrors the existing `files` pattern right
  above it. Runtime resolution still reads top-level `req.body.manualSkills`
  — persistence and resolution are separate concerns.

Frontend:
- `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the
  drained atom value goes onto the message, not a separate atom.
  Removed the `attachSkillsToMessage` Recoil callback.
- `ManualSkillPills`: pure render of `message.manualSkills`. No more
  `useQueryClient`, no sibling scan, no atom read. Loses the
  auto-hide-when-card-arrives behavior — pills stay on the user
  bubble, cards live on the assistant bubble, both are informative.
- Dropped the `attachedSkillsByMessageId` atomFamily and its export.
- New `PendingManualSkillsChips` above the textarea reads the
  compose-time atom and renders chips with × to remove. Mounted in
  `ChatForm` right after `TextareaHeader`. Naturally hides on submit
  when the atom drains.

Tests: updated `ManualSkillPills` suite to the new field-based reads
(5 passing). New `PendingManualSkillsChips` suite covering empty state,
multi-chip render, single × removal, and full-clear (4 passing).
Backend suite trimmed to 89 (was 97) from the step-events test
removal — no regressions on the remaining helpers.

* 🧪 feat: Assistant-Side Skill-Loading Chips + Pill Padding

Two small UX fixes on top of the field-on-message architecture.

Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to
`py-1` on each chip and added `py-0.5` to the wrapper so the row
breathes a little without feeling tall.

Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent
user message's `manualSkills` onto the assistant bubble as transient
"Running X" chips while the real card is in flight. Renders above
`ContentParts` in `MessageParts`. Hides itself when the assistant's
own `content` grows a `skill` tool_call — the authoritative card from
`buildSkillPrimeContentParts.unshift` is showing, so the placeholder
steps aside. No SSE emit, no aggregator injection, no index
collision with the LLM's streaming content: just a render slot keyed
off the parent's field.

Why not stream the cards live: whichever content index we'd choose
either blocks the LLM's text stream (`updateContent` type-mismatch at
index 0) or lands below the response after sparse compaction (index
100+). Mirroring the parent field sidesteps the aggregator entirely
and gives the user an immediate "skill is loading" signal that
naturally gives way to the real card at finalize.

Covers the gap the user flagged: pills on the user message said "I
asked for these" but nothing on the assistant side said "we're
working on it" until the stream finished. 5 new tests for the
indicator: user-msg guard, missing parent-field guard, multi-chip
render, hides-on-card-landing, orphan-parent guard.

* 🔁 fix: Indicator Visibility + Carry Manual Skills Through Regenerate/Edit

Two bugs.

Indicator never rendered: `InvokingSkillsIndicator` looked up the parent
user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`,
but on a new chat the React Query cache is keyed by `"new"` (the URL
`paramId`) until the server assigns a real conversation ID — while
`message.conversationId` on the assistant message is already the server
ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched
to `useChatContext().getMessages()`, which reads from the same
`paramId` the rest of the UI uses, so new-chat and existing-chat cases
both resolve to the correct message list.

Regenerate / save-and-submit dropped manual skills: the compose-time
`pendingManualSkillsByConvoId` atom is drained on the first submit,
so replaying that turn later found an empty atom and sent `manualSkills: []`.
The pills were still on the user bubble, so from the user's point of
view the model was running primed — but the backend saw nothing and
produced an unprimed response.

- Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a
  reference message pass its persisted `manualSkills`; `useChatFunctions.ask`
  uses the override verbatim when present, otherwise falls back to the
  existing drain-or-empty logic.
- `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills`
  — the user message being regenerated has the field persisted by the
  backend, so the second turn primes the same skills as the first.
- `EditMessage.resubmitMessage` covers both edit branches:
  - User-message save-and-submit: forwards the edited message's own
    `manualSkills` so the new sibling turn primes identically.
  - Assistant-response edit: forwards the parent user message's
    `manualSkills` for the same reason.

Indicator test suite converted from `@tanstack/react-query` harness to
a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added
a cache-miss case.

* 🧭 fix: Drive Mid-Stream Skill Chips from Submission Atom, Not Message Lookup

Message-ID-keyed lookups kept racing the stream: the user message flips
from its client-side intermediate UUID to the server-assigned ID mid-run,
conversation IDs flip from the URL `paramId="new"` to the real convo
ID on brand-new chats, and the React Query cache splits briefly between
the two. Previous attempts — direct `queryClient.getQueryData` and then
`useChatContext().getMessages()` — each missed a different window.

`TSubmission.manualSkills` is already populated at `ask()` time and the
submission atom (`store.submissionByIndex(index)`) is the single stable
anchor across the whole lifecycle: set once at submit, lives through
every SSE event, cleared when the stream ends. No ID lookups, no cache
timing.

- `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via
  Recoil. Shows chips when:
    • the message is assistant-side,
    • a submission is in flight with non-empty `manualSkills`,
    • the assistant's `parentMessageId` matches
      `submission.userMessage.messageId` (so chips appear only on the
      bubble for the current turn, never on siblings),
    • the assistant's own content doesn't yet carry a `skill`
      tool_call (real card takes over from the server's post-run
      `contentParts.unshift`).
- Drops the `useChatContext().getMessages()` dependency and the
  `useQueryClient` dependency before that. No more lookups by
  conversationId or messageId.

Test suite now mocks `useChatContext` to supply `index: 0` and seeds
the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover
user-side, no-submission history, empty `manualSkills`, multi-chip
render, hides-on-card-landing, and wrong-turn guard.

* 🌱 fix: Seed Response manualSkills in createdHandler, Indicator Becomes Pure

The mid-stream indicator kept getting wired off state I don't own: first
`queryClient.getQueryData` (raced the new-chat paramId flip), then
`useChatContext().getMessages()` (same cache, same race), then
`useRecoilValue(submissionByIndex)` (pulled every message into the
submission subscription — re-renders all indicators on any submission
change, exactly the "limit hooks in rendering" concern).

Cleanest path is the one the user pointed at: the submission owns the
data, `useSSE` / `useEventHandlers` owns the save points, so seed the
field ONTO the response message at the save site and let the indicator
be a pure prop-read.

- `createdHandler` now writes `manualSkills` onto the initial response
  from `submission.manualSkills` at the moment the placeholder enters
  the messages array. The field rides through the normal mutation
  pipeline via spreads (`useStepHandler` response creation,
  `updateContent` result returns) — no special handling needed.
- `InvokingSkillsIndicator` drops the Recoil / context / queryClient
  reads. Pure function of `message`: if assistant, has `manualSkills`,
  and `content` hasn't grown a `skill` tool_call yet, render chips.
  Only `useLocalize` left, which was already unavoidable for the i18n
  string.
- Renders decouple: no single state change (`submissionByIndex` flip,
  React Query cache update) forces every indicator in the message list
  to re-render anymore. Only the message whose prop changed re-runs.

Finalize story unchanged: server's `responseMessage` doesn't carry the
frontend-only `manualSkills` field, so `finalHandler`'s replacement
drops it — but by then the real `skill` tool_call is in `content`
and the indicator's content-scan hides itself anyway.

Test suite back to pure prop mocks: 7 cases covering user-guard,
no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps,
text-only-keeps, and missing message.

* 🪞 fix: Render Skill Indicator Inside ContentParts, Adjacent to Parts

The indicator still wasn't showing because even though MessageParts
mounted it as a sibling of ContentParts, ContentParts is a `memo`'d
component that owns the only rendering path that refreshes in lockstep
with content deltas. Mounting above it put the indicator one layer
further out — reachable, but not exercised on the same render cycle
that processes the streaming `message` prop.

Moved the indicator into ContentParts itself, rendered at the top of
both the sequential and parallel branches. Reads the `message` prop
(newly threaded through as an optional prop alongside `content`), so:

- Same render cycle as Parts — updates from the SSE pipeline flow
  through the same pathway.
- Lives outside the `content.map`, so delta-driven content reshuffles
  never wipe it.
- Still a pure prop-read inside the indicator itself (no Recoil,
  queryClient, context hooks). The only dep is `useLocalize`.

Thread:
- `ContentPartsProps` gains `message?: TMessage`.
- `MessageParts` passes `message={message}` through, drops its own
  indicator mount + import.
- `ContentParts` renders `<InvokingSkillsIndicator message={message} />`
  in both the parallel-content and sequential-content branches, right
  under `MemoryArtifacts` and before the empty-cursor / parts map.

Companion data flow (unchanged): `createdHandler` seeds
`initialResponse.manualSkills` from `submission.manualSkills`; the
field rides through `useStepHandler` via spreads; indicator hides on
`skill` tool_call landing in `content`.

* 🔎 refactor: Narrow Skill Components to Scalar skills Prop, Kill Memo Churn

Passing the full `message` object into presentational components busts
`React.memo` shallow comparisons every time the message reference changes
for unrelated reasons. Swap to scalar `skills?: string[]` throughout:

- `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility
  logic (user-vs-assistant, skill tool_call arrival) now lives in the
  caller so this stays pure presentational.
- `ManualSkillPills`: props-only (`skills?: string[]`).
- `ContentParts`: takes `manualSkills?: string[]` scalar, computes
  `showInvokingSkills` once per render from `manualSkills` + content scan
  for the `skill` tool_call, then mounts the indicator with `skills=`
  prop in both parallel and sequential branches.
- `MessageParts`: passes `manualSkills={message.manualSkills}` through
  to `ContentParts`.
- `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`.
- Tests updated to exercise the narrowed prop surface.

* 📜 feat: Mid-Stream Skill Cards via SkillCall, Drop Custom Indicator

Instead of a separate `InvokingSkillsIndicator` chip component, render
pending skill placeholders through the existing `SkillCall` renderer —
same component the backend's finalized prime part uses. The loading
visual (`progress < 1` + empty output → pulsing "Running X") and the
completed visual ("Ran X") now come from one source of truth.

`ContentParts` computes `pendingSkillNames` from `manualSkills` minus
any `skill` tool_call already in `content` (dedupe by `args.skillName`
since the synthetic's id differs from the real one). Those names
render through a separate slot ABOVE the Parts iteration — not
prepended to the content array, which would shift React keys on
every downstream streaming text / tool part and force unmount/remount
mid-stream.

When the real prime `tool_call` lands at finalize (backend unshifts to
content[0..]), `collectExistingSkillNames` picks it up, the pending
set empties, and the real part takes over rendering in the Parts
iteration. Layout is identical either way because primes are always
at the top of content.

- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced)
- `ContentParts.tsx` renders `<SkillCall .../>` directly for pending
  names, mirrors `Part.tsx`'s usage of the same component
- `createdHandler` doc comment updated to reflect the new flow

* ✂️ fix: Render Interim Skill Cards From manualSkills Only, Leave Content Untouched

Previous revision read `content` to de-dupe pending cards against real
`skill` tool_calls, so any optimistic skill part streamed from the
backend would race our placeholder off the screen mid-turn — exactly
the "getting overridden" symptom.

Now: interim `SkillCall` cards are driven purely by the response
message's `manualSkills` field. `content` is never inspected here,
so no backend delta can pull the cards down. The field is now seeded
directly onto the assistant placeholder in `useChatFunctions` (not
only in `createdHandler`) so the cards appear from the first render,
before the `created` SSE event round-trip.

Lifecycle:
- `useChatFunctions` puts `manualSkills` on the freshly-minted
  `initialResponse` — cards render the instant the placeholder lands.
- `createdHandler` keeps its own re-seed (idempotent; safe) so a
  regenerate / save-and-submit flow that hits that path still works.
- `useStepHandler` spread operations preserve the field through every
  content update.
- `finalHandler` replaces the message with the server-backed
  `responseMessage` (no `manualSkills`) — cards disappear, and the
  real `skill` tool_call part in `content` takes over.

ContentParts changes:
- Drop `collectExistingSkillNames` / `parseJsonField` dedupe path.
- `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`.
- Simpler control flow — one boolean (`hasPendingSkills`) gates the
  early return, one function renders.

* 🩹 fix: Codex Review Resolutions — Localization, Guards, Tests, Docs

Addresses seven findings from comprehensive code review:

Finding 1 (MAJOR) — Document sticky re-priming as intentional
- `buildSkillPrimeContentParts`: expanded doc comment explaining
  synthetic `skill` tool_calls persist and get re-primed on every
  subsequent turn via `extractInvokedSkillsFromPayload` (shape parity
  with model-invoked skills). This matches the UX: the assistant
  skill card is a visible, persistent signal that the skill is active
  for the conversation. Not a bug — called out explicitly so future
  maintainers don't mistake it for one.

Finding 2 (MAJOR) — Add ContentParts render tests
- New `ContentParts.test.tsx` with 7 cases covering the interim skill
  card logic: assistant-only rendering, user-message suppression,
  undefined-content safety, parallel+sequential branch integration,
  progress<1 (pending) state. Child components mocked so the test
  exercises only the branching and prop wiring ContentParts owns.

Finding 3 (MINOR) — Localize hardcoded aria-labels
- Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys.
- Reused existing `com_ui_remove_skill_var` for the remove-button
  aria-label.
- `PendingManualSkillsChips` and `ManualSkillPills` now call
  `useLocalize()`. Test mocks updated to the label-echo pattern.

Finding 4 (MINOR) — Max-length guard in `extractManualSkills`
- New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a
  crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from
  reaching `getSkillByName` / Mongo's query planner.

Finding 5 (NIT) — `BaseClient.js` comment contradicted itself
- Rewrote to call the filter what it is: defense-in-depth on top of
  Mongoose schema validation, not a redundant second layer.

Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo`
- Consistent with peer components (`PendingManualSkillsChips`,
  `ContentParts`). Rendered inside `Container`, which re-renders on
  every content update, so the memo is a real cycle savings.

Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills`
- Collapsed the duplicate null-check by computing `pendingSkills` as
  a `useMemo`'d array (`[]` when not applicable), and mapping
  directly. `hasPendingSkills` now derives from the array length —
  one source of truth, no redundant gate inside the render function.

* 🔧 fix: Update ParallelContent to Handle Optional Content Prop

Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures.

* 🎯 fix: Thread manualSkills Through ContentRender — The Real Renderer

This is why the interim skill cards never appeared across many rounds of
iteration: `ContentRender.tsx` (the memo'd renderer used by most paths,
including the agents endpoint) was calling `ContentParts` without the
`manualSkills` prop. Only `MessageParts.tsx` had it wired up — and
that's not the component that actually renders the assistant response
in production.

Two fixes:
1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call.
2. Extend the `areContentRenderPropsEqual` memo comparator to include
   `manualSkills.length`, otherwise a message update that adds the
   field (seeded by `useChatFunctions` on the initialResponse) would
   be bailed out by the memo and never re-render.

Verified the two ContentParts call sites are now consistent; Container
usages for `ManualSkillPills` on the user side were already correct.

* 🧹 polish: Address Audit Follow-Up (F1/F3/F6)

F1 — Clarify sticky re-priming opt-out path.
  The previous comment said "regenerate without the pick" as one
  opt-out, but `useChatFunctions.regenerate` forwards the original
  picks via `overrideManualSkills`, so regeneration alone keeps the
  skill sticky. Updated to: edit the originating message to remove
  the pills and resubmit, or start a new conversation.

F3 — Add DOM-order assertions to the parallel + sequential tests.
  The two "alongside" tests verified both elements existed but
  didn't pin the ordering contract. Both now use
  `compareDocumentPosition` to assert the pending SkillCall
  precedes the real content, matching the backend semantic
  (`contentParts.unshift(...primeParts)` puts primes at the top).

F6 — Fix package import order in PendingManualSkillsChips.
  `recoil` (58 chars) was listed before `lucide-react` (45 chars)
  which violates the "shortest to longest after react" rule in
  AGENTS.md. Swapped order; no behavior change.

F2 / F4 / F5 from the audit were confirmed as non-issues
(React-safe empty map, cosmetic test-mock artifact, accepted
memo tradeoff) and require no change.

*  feat: Dedicated PendingSkillCall + Running→Ran Transition on Real Content

UX polish on the interim skill card now that it's actually rendering:

1. New `PendingSkillCall` component (mirrors `SkillCall` visually but
   drops the expand affordance). `SkillCall`'s underlying `ProgressText`
   always renders a chevron + clickable button when any input is
   present, which on a card with empty output points at nothing —
   misleading cursor:pointer and a no-op toggle. The pending variant
   has only the icon + label, no button wrapper, no chevron.

2. "Running X" → "Ran X" transition when real content lands.
   `ContentParts` computes `hasRealContent` (any non-text part, or a
   text part with non-empty content — placeholder empty-text parts
   don't count) and passes `loaded={hasRealContent}` to
   `PendingSkillCall`. Matches what users see for model-invoked skills
   as they finish priming: pulsing shimmer → static icon.

3. Cleanup:
   - Dropped direct `SkillCall` import from `ContentParts` (replaced
     by `PendingSkillCall`). `SkillCall` is still used by `Part` for
     real `skill` tool_call content parts — no behavior change there.
   - Removed the now-redundant explicit `manualSkills` assignment
     in `createdHandler`. `useChatFunctions` seeds the field on
     `initialResponse` at construction, so the `...submission.initialResponse`
     spread already carries it through — the re-assignment was
     defensive belt-and-suspenders doing the same work twice. Comment
     rewritten to describe the actual lifecycle.

Tests updated to the new component (12/12 pass): two new cases pin
the loaded-state transition (unloaded when content has no real parts,
flips to loaded once a non-empty text part lands).
2026-04-25 04:02:00 -04:00
Danny Avila
6ecd1b510f
📎 fix: Route Unrecognized File Types via supportedMimeTypes Config (#12508)
* fix: check supportedMimeTypes before routing unrecognized file types

In processAttachments, files not matching the hardcoded mime type
categories (image, PDF, video, audio) were silently dropped. Now
resolves the endpoint's file config and checks the file type against
supportedMimeTypes before routing to the documents pipeline. Files
not matching any config are still skipped (original behavior).

Closes #12482

* feat: encode generic document types for supported providers

Remove restrictive mime type filter in encodeAndFormatDocuments that
only allowed PDFs and application/* types. Add a generic encoding
path for non-PDF, non-Bedrock files using the provider's native
format (Anthropic base64 document, OpenAI file block, Google media
block). Files are already validated upstream by supportedMimeTypes.

* fix: guard file.type and cache file config in processAttachments

- Add file.type truthiness check before checkType to prevent
  coercion of null/undefined to string 'null'/'undefined'
- Cache mergedFileConfig and endpointFileConfig on the instance
  so addPreviousAttachments doesn't recompute per message

* refactor: harden generic document encoding with validation and tests

- Extract formatDocumentBlock helper to eliminate ~30 lines of
  duplicate provider-dispatch code between PDF and generic paths
- Add size validation in generic encoding path using
  configuredFileSizeLimit (was fetched but unused)
- Guard Bedrock from generic path — non-bedrockDocumentFormats
  types are now skipped instead of silently tracking metadata
- Only push metadata to result.files when a document block was
  actually created, preventing silent inconsistent state
- Enable Anthropic citations for text/plain, text/html,
  text/markdown (supported by Anthropic's document API)
- Fix != to !== for Providers.AZURE comparison
- Add 9 tests covering all four provider branches, Bedrock
  exclusion, size limit enforcement, and unhandled provider

* fix: resolve filename type mismatch in formatDocumentBlock

filename parameter is string | undefined but OpenAIFileBlock and
OpenAIInputFileBlock require string. Default to 'document' when
filename is undefined.

* fix: use endpoint name for file config lookup in processAttachments

Agent runs can have agent.provider set to a base provider (e.g.,
openAI) while agent.endpoint is a custom endpoint name. Using
provider for the getEndpointFileConfig lookup bypassed custom
endpoint supportedMimeTypes config. Now uses agent.endpoint,
matching the pattern in addDocuments.

* perf: filter non-Bedrock files before fetching streams

Bedrock only supports types in bedrockDocumentFormats. Previously,
getFileStream was called for all files and unsupported types were
discarded after download. Now pre-filters the file list for Bedrock
to avoid unnecessary network and memory overhead for large
unsupported attachments.

* refactor: clean up processAttachments file config handling

- Remove redundant ?? null intermediaries; use instance properties
  directly in the else-if condition
- Add JSDoc @type annotations for _mergedFileConfig and
  _endpointFileConfig in the constructor

* refactor: harden document encoding and add routing tests

- Hoist configuredFileSizeLimit above the loop to avoid recomputing
  mergeFileConfig per file
- Replace Buffer.from decode with base64 length formula in the
  generic size check to avoid unnecessary heap allocation
- Use nullish coalescing (??) for filename fallback
- Clean up test: remove unnecessary type cast, use createMockRequest
  helper for size-limit test
- Add 14 tests for processAttachments categorization logic covering
  supportedMimeTypes routing, null/undefined guards, standard type
  passthrough, and edge cases

* fix: use optional chaining for checkType in routing tests

FileConfig.checkType is typed as optional. Use optional chaining
to satisfy strict type checking.

* fix: skip stream fetches for unsupported providers, block Bedrock generic routing

- Return early from encodeAndFormatDocuments when the provider is
  neither document-supported nor Bedrock, avoiding unnecessary
  getFileStream calls for providers that would discard all results
- Add !isBedrock guard to the supportedMimeTypes fallback branch in
  processAttachments so permissive patterns like '.*' don't route
  non-Bedrock types into documents that would be silently dropped
- Add test for Bedrock + non-Bedrock-document-type skipping

* fix: respect supportedMimeTypes config for Bedrock endpoints

Remove !isBedrock guard from the generic supportedMimeTypes routing
branch. If a user configures permissive supportedMimeTypes for a
Bedrock endpoint, the upload validation already accepted the file.
The encoding layer pre-filters to Bedrock-supported types before
fetching streams, so unsupported types are handled there without
silently dropping files the user explicitly allowed.
2026-04-01 23:04:43 -04:00
Danny Avila
fd01dfc083
💰 fix: Lazy-Initialize Balance Record at Check Time for Overrides (#12474)
* fix: Lazy-initialize balance record when missing at check time

When balance is configured via admin panel DB overrides, users with
existing sessions never pass through the login middleware that creates
their balance record. This causes checkBalanceRecord to find no record
and return balance: 0, blocking the user.

Add optional balanceConfig and upsertBalanceFields deps to
CheckBalanceDeps. When no balance record exists but startBalance is
configured, lazily create the record instead of returning canSpend: false.

Pass the new deps from BaseClient, chatV1, and chatV2 callers.

* test: Add checkBalance lazy initialization tests

Cover lazy balance init scenarios: successful init with startBalance,
insufficient startBalance, missing config fallback, undefined
startBalance, missing upsertBalanceFields dep, and startBalance of 0.

* fix: Address review findings for lazy balance initialization

- Use canonical BalanceConfig and IBalanceUpdate types from
  @librechat/data-schemas instead of inline type definitions
- Include auto-refill fields (autoRefillEnabled, refillIntervalValue,
  refillIntervalUnit, refillAmount, lastRefill) during lazy init,
  mirroring the login middleware's buildUpdateFields logic
- Add try/catch around upsertBalanceFields with graceful fallback to
  canSpend: false on DB errors
- Read balance from DB return value instead of raw startBalance constant
- Fix misleading test names to describe observable throw behavior
- Add tests: upsertBalanceFields rejection, auto-refill field inclusion,
  DB-returned balance value, and logViolation assertions

* fix: Address second review pass findings

- Fix import ordering: package type imports before local type imports
- Remove misleading comment on DB-fallback test, rename for clarity
- Add logViolation assertion to insufficient-balance lazy-init test
- Add test for partial auto-refill config (autoRefillEnabled without
  required dependent fields)

* refactor: Replace createMockReqRes factory with describe-scoped consts

Replace zero-argument factory with plain const declarations using
direct type casts instead of double-cast through unknown.

* fix: Sort local type imports longest-first, add missing logViolation assertion

- Reorder local type imports in spec file per AGENTS.md (longest to
  shortest within sub-group)
- Add logViolation assertion to startBalance: 0 test for consistent
  violation payload coverage across all throw paths
2026-03-30 22:51:07 -04:00
Danny Avila
1455f15b7b
📄 feat: Model-Aware Bedrock Document Size Validation (#12467)
* 📄 fix: Model-Aware Bedrock Document Size Validation

Remove the hard 4.5MB clamp on Bedrock document uploads so that
Claude 4+ (PDF) and Nova (PDF/DOCX) models can accept larger files
per AWS documentation. The default 4.5MB limit is preserved for
other models/formats, and fileConfig can now override it in either
direction—consistent with every other provider.

* address review: restore Math.min for non-exempt docs, tighten regexes, add tests

- Restore Math.min clamp for non-exempt Bedrock documents (fileConfig can
  only lower the hard 4.5 MB API limit, not raise it); only exempt models
  (Claude 4+ PDF, Nova PDF/DOCX) use ?? to allow fileConfig override
- Replace copied isBedrockClaude4Plus regex with cleaner anchored pattern
  that correctly handles multi-digit version numbers (e.g. sonnet-40)
  and removes dead Alt 1 branch matching no real Bedrock model IDs
- Tighten isBedrockNova from includes() to startsWith() to prevent
  substring matching in unexpected positions
- Add integration test verifying model is threaded to validateBedrockDocument
- Add boundary tests for exempt + low configuredFileSizeLimit, non-exempt
  + high configuredFileSizeLimit, and exempt model accepting files up to 32 MB
- Revert two tests that were incorrectly inverted to prove wrong behavior
- Fix inaccurate JSDoc and misleading test name

* simplify: allow fileConfig to override Bedrock limit in either direction

Make Bedrock consistent with all other providers — fileConfig sets the
effective limit unconditionally via ?? rather than clamping with Math.min.
The model-aware defaults (4.5 MB for non-exempt, 32 MB for exempt) remain
as sensible fallbacks when no fileConfig is set.

* fix: handle cross-region inference profile IDs in Bedrock model matchers

Bedrock cross-region inference profiles prepend a region code to the
model ID (e.g. "us.amazon.nova-pro-v1:0", "eu.anthropic.claude-sonnet-4-...").
Both isBedrockNova and isBedrockClaude4Plus would miss these prefixed IDs,
silently falling back to the 4.5 MB default for eligible models.

Switch both matchers to use (?:^|\.) to anchor the vendor segment so the
pattern matches with or without a leading region prefix.
2026-03-30 16:50:10 -04:00
Danny Avila
0d94881c2d
🧹 refactor: Tighten Config Schema Typing and Remove Deprecated Fields (#12452)
* refactor: Remove deprecated and unused fields from endpoint schemas

- Remove summarize, summaryModel from endpointSchema and azureEndpointSchema
- Remove plugins from azureEndpointSchema
- Remove customOrder from endpointSchema and azureEndpointSchema
- Remove baseURL from all and agents endpoint schemas
- Type paramDefinitions with full SettingDefinition-based schema
- Clean up summarize/summaryModel references in initialize.ts and config.spec.ts

* refactor: Improve MCP transport schema typing

- Add defaults to transport type discriminators (stdio, websocket, sse)
- Type stderr field as IOType union instead of z.any()

* refactor: Add narrowed preset schema for model specs

- Create tModelSpecPresetSchema omitting system/DB/deprecated fields
- Update tModelSpecSchema to use the narrowed preset schema

* test: Add explicit type field to MCP test fixtures

Add transport type discriminator to test objects that construct
MCPOptions/ParsedServerConfig directly, required after type field
changed from optional to default in schema definitions.

* chore: Bump librechat-data-provider to 0.8.404

* refactor: Tighten z.record(z.any()) fields to precise value types

- Type headers fields as z.record(z.string()) in endpoint, assistant, and azure schemas
- Type addParams as z.record(z.union([z.string(), z.number(), z.boolean(), z.null()]))
- Type azure additionalHeaders as z.record(z.string())
- Type memory model_parameters as z.record(z.union([z.string(), z.number(), z.boolean()]))
- Type firecrawl changeTrackingOptions.schema as z.record(z.string())

* refactor: Type supportedMimeTypes schema as z.array(z.string())

Replace z.array(z.any()).refine() with z.array(z.string()) since config
input is always strings that get converted to RegExp via
convertStringsToRegex() after parsing. Destructure supportedMimeTypes
from spreads to avoid string[]/RegExp[] type mismatch.

* refactor: Tighten enum, role, and numeric constraint schemas

- Type engineSTT as enum ['openai', 'azureOpenAI']
- Type engineTTS as enum ['openai', 'azureOpenAI', 'elevenlabs', 'localai']
- Constrain playbackRate to 0.25–4 range
- Type titleMessageRole as enum ['system', 'user', 'assistant']
- Add int().nonnegative() to MCP timeout and firecrawl timeout

* chore: Bump librechat-data-provider to 0.8.405

* fix: Accept both string and RegExp in supportedMimeTypes schema

The schema must accept both string[] (config input) and RegExp[]
(post-merge runtime) since tests validate merged output against the
schema. Use z.union([z.string(), z.instanceof(RegExp)]) to handle both.

* refactor: Address review findings for schema tightening PR

- Revert changeTrackingOptions.schema to z.record(z.unknown()) (JSON Schema is nested, not flat strings)
- Remove dead contextStrategy code from BaseClient.js and cleanup.js
- Extract paramDefinitionSchema to named exported constant
- Add .int() constraint to columnSpan and columns
- Apply consistent .int().nonnegative() to initTimeout, sseReadTimeout, scraperTimeout
- Update stale stderr JSDoc to match actual accepted types
- Add comprehensive tests for paramDefinitionSchema, tModelSpecPresetSchema,
  endpointSchema deprecated field stripping, and azureEndpointSchema

* fix: Address second review pass findings

- Revert supportedMimeTypesSchema to z.array(z.string()) and remove
  as string[] casts — fix tests to not validate merged RegExp[] output
  against the config input schema
- Remove unused tModelSpecSchema import from test file
- Consolidate duplicate '../src/schemas' imports
- Add expiredAt coverage to tModelSpecPresetSchema test
- Assert plugins is absent in azureEndpointSchema test
- Add sync comments for engineSTT/engineTTS enum literals

* refactor: Omit preset-management fields from tModelSpecPresetSchema

Omit conversationId, presetId, title, defaultPreset, and order from the
model spec preset schema — these are preset-management fields that don't
belong in model spec configuration.
2026-03-29 01:10:57 -04:00
Danny Avila
f277b32030
📸 fix: Snapshot Options to Prevent Mid-Await Client Disposal Crash (#12398)
* 🐛 fix: Prevent crash when token balance exhaustion races with client disposal

Add null guard in saveMessageToDatabase to handle the case where
disposeClient nullifies this.options while a userMessagePromise is
still pending from a prior async save operation.

* 🐛 fix: Snapshot this.options to prevent mid-await disposal crash

The original guard at function entry was insufficient — this.options is
always valid at entry. The crash occurs after the first await
(db.saveMessage) when disposeClient nullifies client.options while the
promise is suspended.

Fix: capture this.options into a local const before any await. The local
reference is immune to client.options = null set by disposeClient.

Also add .catch on userMessagePromise in sendMessage to prevent
unhandled rejections when checkBalance throws before the promise is
awaited, and add two regression tests.
2026-03-25 14:18:32 -04:00
Danny Avila
b5c097e5c7
⚗️ feat: Agent Context Compaction/Summarization (#12287)
* chore: imports/types

Add summarization config and package-level summarize handler contracts

Register summarize handlers across server controller paths

Port cursor dual-read/dual-write summary support and UI status handling

Selectively merge cursor branch files for BaseClient summary content
block detection (last-summary-wins), dual-write persistence, summary
block unit tests, and on_summarize_status SSE event handling with
started/completed/failed branches.

Co-authored-by: Cursor <cursoragent@cursor.com>

refactor: type safety

feat: add localization for summarization status messages

refactor: optimize summary block detection in BaseClient

Updated the logic for identifying existing summary content blocks to use a reverse loop for improved efficiency. Added a new test case to ensure the last summary content block is updated correctly when multiple summary blocks exist.

chore: add runName to chainOptions in AgentClient

refactor: streamline summarization configuration and handler integration

Removed the deprecated summarizeNotConfigured function and replaced it with a more flexible createSummarizeFn. Updated the summarization handler setup across various controllers to utilize the new function, enhancing error handling and configuration resolution. Improved overall code clarity and maintainability by consolidating summarization logic.

feat(summarization): add staged chunk-and-merge fallback

feat(usage): track summarization usage separately from messages

feat(summarization): resolve prompt from config in runtime

fix(endpoints): use @librechat/api provider config loader

refactor(agents): import getProviderConfig from @librechat/api

chore: code order

feat(app-config): auto-enable summarization when configured

feat: summarization config

refactor(summarization): streamline persist summary handling and enhance configuration validation

Removed the deprecated createDeferredPersistSummary function and integrated a new createPersistSummary function for MongoDB persistence. Updated summarization handlers across various controllers to utilize the new persistence method. Enhanced validation for summarization configuration to ensure provider, model, and prompt are properly set, improving error handling and overall robustness.

refactor(summarization): update event handling and remove legacy summarize handlers

Replaced the deprecated summarization handlers with new event-driven handlers for summarization start and completion across multiple controllers. This change enhances the clarity of the summarization process and improves the integration of summarization events in the application. Additionally, removed unused summarization functions and streamlined the configuration loading process.

refactor(summarization): standardize event names in handlers

Updated event names in the summarization handlers to use constants from GraphEvents for consistency and clarity. This change improves maintainability and reduces the risk of errors related to string literals in event handling.

feat(summarization): enhance usage tracking for summarization events

Added logic to track summarization usage in multiple controllers by checking the current node type. If the node indicates a summarization task, the usage type is set accordingly. This change improves the granularity of usage data collected during summarization processes.

feat(summarization): integrate SummarizationConfig into AppSummarizationConfig type

Enhanced the AppSummarizationConfig type by extending it with the SummarizationConfig type from librechat-data-provider. This change improves type safety and consistency in the summarization configuration structure.

test: add end-to-end tests for summarization functionality

Introduced a comprehensive suite of end-to-end tests for the summarization feature, covering the full LibreChat pipeline from message creation to summarization. This includes a new setup file for environment configuration and a Jest configuration specifically for E2E tests. The tests utilize real API keys and ensure proper integration with the summarization process, enhancing overall test coverage and reliability.

refactor(summarization): include initial summary in formatAgentMessages output

Updated the formatAgentMessages function to return an initial summary alongside messages and index token count map. This change is reflected in multiple controllers and the corresponding tests, enhancing the summarization process by providing additional context for each agent's response.

refactor: move hydrateMissingIndexTokenCounts to tokenMap utility

Extracted the hydrateMissingIndexTokenCounts function from the AgentClient and related tests into a new tokenMap utility file. This change improves code organization and reusability, allowing for better management of token counting logic across the application.

refactor(summarization): standardize step event handling and improve summary rendering

Refactored the step event handling in the useStepHandler and related components to utilize constants for event names, enhancing consistency and maintainability. Additionally, improved the rendering logic in the Summary component to conditionally display the summary text based on its availability, providing a better user experience during the summarization process.

feat(summarization): introduce baseContextTokens and reserveTokensRatio for improved context management

Added baseContextTokens to the InitializedAgent type to calculate the context budget based on agentMaxContextNum and maxOutputTokensNum. Implemented reserveTokensRatio in the createRun function to allow configurable context token management. Updated related tests to validate these changes and ensure proper functionality.

feat(summarization): add minReserveTokens, context pruning, and overflow recovery configurations

Introduced new configuration options for summarization, including minReserveTokens, context pruning settings, and overflow recovery parameters. Updated the createRun function to accommodate these new options and added a comprehensive test suite to validate their functionality and integration within the summarization process.

feat(summarization): add updatePrompt and reserveTokensRatio to summarization configuration

Introduced an updatePrompt field for updating existing summaries with new messages, enhancing the flexibility of the summarization process. Additionally, added reserveTokensRatio to the configuration schema, allowing for improved management of token allocation during summarization. Updated related tests to validate these new features.

feat(logging): add on_agent_log event handler for structured logging

Implemented an on_agent_log event handler in both the agents' callbacks and responses to facilitate structured logging of agent activities. This enhancement allows for better tracking and debugging of agent interactions by logging messages with associated metadata. Updated the summarization process to ensure proper handling of log events.

fix: remove duplicate IBalanceUpdate interface declaration

perf(usage): single-pass partition of collectedUsage

Replace two Array.filter() passes with a single for-of loop that
partitions message vs. summarization usages in one iteration.

fix(BaseClient): shallow-copy message content before mutating and preserve string content

Avoid mutating the original message.content array in-place when
appending a summary block. Also convert string content to a text
content part instead of silently discarding it.

fix(ui): fix Part.tsx indentation and useStepHandler summarize-complete handling

- Fix SUMMARY else-if branch indentation in Part.tsx to match chain level
- Guard ON_SUMMARIZE_COMPLETE with didFinalize flag to avoid unnecessary
  re-renders when no summarizing parts exist
- Protect against undefined completeData.summary instead of unsafe spread

fix(agents): use strict enabled check for summarization handlers

Change summarizationConfig?.enabled !== false to === true so handlers
are not registered when summarizationConfig is undefined.

chore: fix initializeClient JSDoc and move DEFAULT_RESERVE_RATIO to module scope

refactor(Summary): align collapse/expand behavior with Reasoning component

- Single render path instead of separate streaming vs completed branches
- Use useMessageContext for isSubmitting/isLatestMessage awareness so
  the "Summarizing..." label only shows during active streaming
- Default to collapsed (matching Reasoning), user toggles to expand
- Add proper aria attributes (aria-hidden, role, aria-controls, contentId)
- Hide copy button while actively streaming

feat(summarization): default to self-summarize using agent's own provider/model

When no summarization config is provided (neither in librechat.yaml nor
on the agent), automatically enable summarization using the agent's own
provider and model. The agents package already provides default prompts,
so no prompt configuration is needed.

Also removes the dead resolveSummarizationLLMConfig in summarize.ts
(and its spec) — run.ts buildAgentContext is the single source of truth
for summarization config resolution. Removes the duplicate
RuntimeSummarizationConfig local type in favor of the canonical
SummarizationConfig from data-provider.

chore: schema and type cleanup for summarization

- Add trigger field to summarizationAgentOverrideSchema so per-agent
  trigger overrides in librechat.yaml are not silently stripped by Zod
- Remove unused SummarizationStatus type from runs.ts
- Make AppSummarizationConfig.enabled non-optional to reflect the
  invariant that loadSummarizationConfig always sets it

refactor(responses): extract duplicated on_agent_log handler

refactor(run): use agents package types for summarization config

Import SummarizationConfig, ContextPruningConfig, and
OverflowRecoveryConfig from @librechat/agents and use them to
type-check the translation layer in buildAgentContext. This ensures
the config object passed to the agent graph matches what it expects.

- Use `satisfies AgentSummarizationConfig` on the config object
- Cast contextPruningConfig and overflowRecoveryConfig to agents types
- Properly narrow trigger fields from DeepPartial to required shape

feat(config): add maxToolResultChars to base endpoint schema

Add maxToolResultChars to baseEndpointSchema so it can be configured
on any endpoint in librechat.yaml. Resolved during agent initialization
using getProviderConfig's endpoint resolution: custom endpoint config
takes precedence, then the provider-specific endpoint config, then the
shared `all` config.

Passed through to the agents package ToolNode, which uses it to cap
tool result length before it enters the context window. When not
configured, the agents package computes a sensible default from
maxContextTokens.

fix(summarization): forward agent model_parameters in self-summarize default

When no explicit summarization config exists, the self-summarize
default now forwards the agent's model_parameters as the
summarization parameters. This ensures provider-specific settings
(e.g. Bedrock region, credentials, endpoint host) are available
when the agents package constructs the summarization LLM.

fix(agents): register summarization handlers by default

Change the enabled gate from === true to !== false so handlers
register when no explicit summarization config exists. This aligns
with the self-summarize default where summarization is always on
unless explicitly disabled via enabled: false.

refactor(summarization): let agents package inherit clientOptions for self-summarize

Remove model_parameters forwarding from the self-summarize default.
The agents package now reuses the agent's own clientOptions when the
summarization provider matches the agent's provider, inheriting all
provider-specific settings (region, credentials, proxy, etc.)
automatically.

refactor(summarization): use MessageContentComplex[] for summary content

Unify summary content to always use MessageContentComplex[] arrays,
matching the pattern used by on_message_delta. No more string | array
unions — content is always an array of typed blocks ({ type: 'text',
text: '...' } for text, { type: 'reasoning_content', ... } for
reasoning).

Agents package:
- SummaryContentBlock.content: MessageContentComplex[] (was string)
- tokenCount now optional (not sent on deltas)
- Removed reasoning field — reasoning is now a content block type
- streamAndCollect normalizes all chunks to content block arrays
- Delta events pass content blocks directly

LibreChat:
- SummaryContentPart.content: Agents.MessageContentComplex[]
- Updated Part.tsx, Summary.tsx, useStepHandler.ts, BaseClient.js
- Summary.tsx derives display text from content blocks via useMemo
- Aggregator uses simple array spread

refactor(summarization): enhance summary handling and text extraction

- Updated BaseClient.js to improve summary text extraction, accommodating both legacy and new content formats.
- Modified summarization logic to ensure consistent handling of summary content across different message formats.
- Adjusted test cases in summarization.e2e.spec.js to utilize the new summary text extraction method.
- Refined SSE useStepHandler to initialize summary content as an array.
- Updated configuration schema by removing unused minReserveTokens field.
- Cleaned up SummaryContentPart type by removing rangeHash property.

These changes streamline the summarization process and ensure compatibility with various content structures.

refactor(summarization): streamline usage tracking and logging

- Removed direct checks for summarization nodes in ModelEndHandler and replaced them with a dedicated markSummarizationUsage function for better readability and maintainability.
- Updated OpenAIChatCompletionController and responses handlers to utilize the new markSummarizationUsage function for setting usage types.
- Enhanced logging functionality by ensuring the logger correctly handles different log levels.
- Introduced a new useCopyToClipboard hook in the Summary component to encapsulate clipboard copy logic, improving code reusability and clarity.

These changes improve the overall structure and efficiency of the summarization handling and logging processes.

refactor(summarization): update summary content block documentation

- Removed outdated comment regarding the last summary content block in BaseClient.js.
- Added a new comment to clarify the purpose of the findSummaryContentBlock method, ensuring consistency in documentation.

These changes enhance code clarity and maintainability by providing accurate descriptions of the summarization logic.

refactor(summarization): update summary content structure in tests

- Modified the summarization content structure in e2e tests to use an array format for text, aligning with recent changes in summary handling.
- Updated test descriptions to clarify the behavior of context token calculations, ensuring consistency and clarity in the tests.

These changes enhance the accuracy and maintainability of the summarization tests by reflecting the updated content structure.

refactor(summarization): remove legacy E2E test setup and configuration

- Deleted the e2e-setup.js and jest.e2e.config.js files, which contained legacy configurations for E2E tests using real API keys.
- Introduced a new summarization.e2e.ts file that implements comprehensive E2E backend integration tests for the summarization process, utilizing real AI providers and tracking summaries throughout the run.

These changes streamline the testing framework by consolidating E2E tests into a single, more robust file while removing outdated configurations.

refactor(summarization): enhance E2E tests and error handling

- Added a cleanup step to force exit after all tests to manage Redis connections.
- Updated the summarization model to 'claude-haiku-4-5-20251001' for consistency across tests.
- Improved error handling in the processStream function to capture and return processing errors.
- Enhanced logging for cross-run tests and tight context scenarios to provide better insights into test execution.

These changes improve the reliability and clarity of the E2E tests for the summarization process.

refactor(summarization): enhance test coverage for maxContextTokens behavior

- Updated run-summarization.test.ts to include a new test case ensuring that maxContextTokens does not exceed user-defined limits, even when calculated ratios suggest otherwise.
- Modified summarization.e2e.ts to replace legacy UsageMetadata type with a more appropriate type for collectedUsage, improving type safety and clarity in the test setup.

These changes improve the robustness of the summarization tests by validating context token constraints and refining type definitions.

feat(summarization): add comprehensive E2E tests for summarization process

- Introduced a new summarization.e2e.test.ts file that implements extensive end-to-end integration tests for the summarization pipeline, covering the full flow from LibreChat to agents.
- The tests utilize real AI providers and include functionality to track summaries during and between runs.
- Added necessary cleanup steps to manage Redis connections post-tests and ensure proper exit.

These changes enhance the testing framework by providing robust coverage for the summarization process, ensuring reliability and performance under real-world conditions.

fix(service): import logger from winston configuration

- Removed the import statement for logger from '@librechat/data-schemas' and replaced it with an import from '~/config/winston'.
- This change ensures that the logger is correctly sourced from the updated configuration, improving consistency in logging practices across the application.

refactor(summary): simplify Summary component and enhance token display

- Removed the unused `meta` prop from the `SummaryButton` component to streamline its interface.
- Updated the token display logic to use a localized string for better internationalization support.
- Adjusted the rendering of the `meta` information to improve its visibility within the `Summary` component.

These changes enhance the clarity and usability of the Summary component while ensuring better localization practices.

feat(summarization): add maxInputTokens configuration for summarization

- Introduced a new `maxInputTokens` property in the summarization configuration schema to control the amount of conversation context sent to the summarizer, with a default value of 10000.
- Updated the `createRun` function to utilize the new `maxInputTokens` setting, allowing for more flexible summarization based on agent context.

These changes enhance the summarization capabilities by providing better control over input token limits, improving the overall summarization process.

refactor(summarization): simplify maxInputTokens logic in createRun function

- Updated the logic for the `maxInputTokens` property in the `createRun` function to directly use the agent's base context tokens when the resolved summarization configuration does not specify a value.
- This change streamlines the configuration process and enhances clarity in how input token limits are determined for summarization.

These modifications improve the maintainability of the summarization configuration by reducing complexity in the token calculation logic.

feat(summary): enhance Summary component to display meta information

- Updated the SummaryContent component to accept an optional `meta` prop, allowing for additional contextual information to be displayed above the main content.
- Adjusted the rendering logic in the Summary component to utilize the new `meta` prop, improving the visibility of supplementary details.

These changes enhance the user experience by providing more context within the Summary component, making it clearer and more informative.

refactor(summarization): standardize reserveRatio configuration in summarization logic

- Replaced instances of `reserveTokensRatio` with `reserveRatio` in the `createRun` function and related tests to unify the terminology across the codebase.
- Updated the summarization configuration schema to reflect this change, ensuring consistency in how the reserve ratio is defined and utilized.
- Removed the per-agent override logic for summarization configuration, simplifying the overall structure and enhancing clarity.

These modifications improve the maintainability and readability of the summarization logic by standardizing the configuration parameters.

* fix: circular dependency of `~/models`

* chore: update logging scope in agent log handlers

Changed log scope from `[agentus:${data.scope}]` to `[agents:${data.scope}]` in both the callbacks and responses controllers to ensure consistent logging format across the application.

* feat: calibration ratio

* refactor(tests): update summarizationConfig tests to reflect changes in enabled property

Modified tests to check for the new `summarizationEnabled` property instead of the deprecated `enabled` field in the summarization configuration. This change ensures that the tests accurately validate the current configuration structure and behavior of the agents.

* feat(tests): add markSummarizationUsage mock for improved test coverage

Introduced a mock for the markSummarizationUsage function in the responses unit tests to enhance the testing of summarization usage tracking. This addition supports better validation of summarization-related functionalities and ensures comprehensive test coverage for the agents' response handling.

* refactor(tests): simplify event handler setup in createResponse tests

Removed redundant mock implementations for event handlers in the createResponse unit tests, streamlining the setup process. This change enhances test clarity and maintainability while ensuring that the tests continue to validate the correct behavior of usage tracking during on_chat_model_end events.

* refactor(agents): move calibration ratio capture to finally block

Reorganized the logic for capturing the calibration ratio in the AgentClient class to ensure it is executed in the finally block. This change guarantees that the ratio is captured even if the run is aborted, enhancing the reliability of the response message persistence. Removed redundant code and improved clarity in the handling of context metadata.

* refactor(agents): streamline bulk write logic in recordCollectedUsage function

Removed redundant bulk write operations and consolidated document handling in the recordCollectedUsage function. The logic now combines all documents into a single bulk write operation, improving efficiency and reducing error handling complexity. Updated logging to provide consistent error messages for bulk write failures.

* refactor(agents): enhance summarization configuration resolution in createRun function

Streamlined the summarization configuration logic by introducing a base configuration and allowing for overrides from agent-specific settings. This change improves clarity and maintainability, ensuring that the summarization configuration is consistently applied while retaining flexibility for customization. Updated the handling of summarization parameters to ensure proper integration with the agent's model and provider settings.

* refactor(agents): remove unused tokenCountMap and streamline calibration ratio handling

Eliminated the unused tokenCountMap variable from the AgentClient class to enhance code clarity. Additionally, streamlined the logic for capturing the calibration ratio by using optional chaining and a fallback value, ensuring that context metadata is consistently defined. This change improves maintainability and reduces potential confusion in the codebase.

* refactor(agents): extract agent log handler for improved clarity and reusability

Refactored the agent log handling logic by extracting it into a dedicated function, `agentLogHandler`, enhancing code clarity and reusability across different modules. Updated the event handlers in both the OpenAI and responses controllers to utilize the new handler, ensuring consistent logging behavior throughout the application.

* test: add summarization event tests for useStepHandler

Implemented a series of tests for the summarization events in the useStepHandler hook. The tests cover scenarios for ON_SUMMARIZE_START, ON_SUMMARIZE_DELTA, and ON_SUMMARIZE_COMPLETE events, ensuring proper handling of summarization logic, including message accumulation and finalization. This addition enhances test coverage and validates the correct behavior of the summarization process within the application.

* refactor(config): update summarizationTriggerSchema to use enum for type validation

Changed the type of the `type` field in the summarizationTriggerSchema from a string to an enum with a single value 'token_count'. This modification enhances type safety and ensures that only valid types are accepted in the configuration, improving overall clarity and maintainability of the schema.

* test(usage): add bulk write tests for message and summarization usage

Implemented tests for the bulk write functionality in the recordCollectedUsage function, covering scenarios for combined message and summarization usage, summarization-only usage, and message-only usage. These tests ensure correct document handling and token rollup calculations, enhancing test coverage and validating the behavior of the usage tracking logic.

* refactor(Chat): enhance clipboard copy functionality and type definitions in Summary component

Updated the Summary component to improve the clipboard copy functionality by handling clipboard permission errors. Refactored type definitions for SummaryProps to use a more specific type, enhancing type safety. Adjusted the SummaryButton and FloatingSummaryBar components to accept isCopied and onCopy props, promoting better separation of concerns and reusability.

* chore(translations): remove unused "Expand Summary" key from English translations

Deleted the "Expand Summary" key from the English translation file to streamline the localization resources and improve clarity in the user interface. This change helps maintain an organized and efficient translation structure.

* refactor: adjust token counting for Claude model to account for API discrepancies

Implemented a correction factor for token counting when using the Claude model, addressing discrepancies between Anthropic's API and local tokenizer results. This change ensures accurate token counts by applying a scaling factor, improving the reliability of token-related functionalities.

* refactor(agents): implement token count adjustment for Claude model messages

Added a method to adjust token counts for messages processed by the Claude model, applying a correction factor to align with API expectations. This enhancement improves the accuracy of token counting, ensuring reliable functionality when interacting with the Claude model.

* refactor(agents): token counting for media content in messages

Introduced a new method to estimate token costs for image and document blocks in messages, improving the accuracy of token counting. This enhancement ensures that media content is properly accounted for, particularly for the Claude model, by integrating additional token estimation logic for various content types. Updated the token counting function to utilize this new method, enhancing overall reliability and functionality.

* chore: fix missing import

* fix(agents): clamp baseContextTokens and document reserve ratio change

Prevent negative baseContextTokens when maxOutputTokens exceeds the
context window (misconfigured models). Document the 10%→5% default
reserve ratio reduction introduced alongside summarization.

* fix(agents): include media tokens in hydrated token counts

Add estimateMediaTokensForMessage to createTokenCounter so the hydration
path (used by hydrateMissingIndexTokenCounts) matches the precomputed
path in AgentClient.getTokenCountForMessage. Without this, messages
containing images or documents were systematically undercounted during
hydration, risking context window overflow.

Add 34 unit tests covering all block-type branches of
estimateMediaTokensForMessage.

* fix(agents): include summarization output tokens in usage return value

The returned output_tokens from recordCollectedUsage now reflects all
billed LLM calls (message + summarization). Previously, summarization
completions were billed but excluded from the returned metadata, causing
a discrepancy between what users were charged and what the response
message reported.

* fix(tests): replace process.exit with proper Redis cleanup in e2e test

The summarization E2E test used process.exit(0) to work around a Redis
connection opened at import time, which killed the Jest runner and
bypassed teardown. Use ioredisClient.quit() and keyvRedisClient.disconnect()
for graceful cleanup instead.

* fix(tests): update getConvo imports in OpenAI and response tests

Refactor test files to import getConvo from the main models module instead of the Conversation submodule. This change ensures consistency across tests and simplifies the import structure, enhancing maintainability.

* fix(clients): improve summary text validation in BaseClient

Refactor the summary extraction logic to ensure that only non-empty summary texts are considered valid. This change enhances the robustness of the message processing by utilizing a dedicated method for summary text retrieval, improving overall reliability.

* fix(config): replace z.any() with explicit union in summarization schema

Model parameters (temperature, top_p, etc.) are constrained to
primitive types rather than the policy-violating z.any().

* refactor(agents): deduplicate CLAUDE_TOKEN_CORRECTION constant

Export from the TS source in packages/api and import in the JS client,
eliminating the static class property that could drift out of sync.

* refactor(agents): eliminate duplicate selfProvider in buildAgentContext

selfProvider and provider were derived from the same expression with
different type casts. Consolidated to a single provider variable.

* refactor(agents): extract shared SSE handlers and restrict log levels

- buildSummarizationHandlers() factory replaces triplicated handler
  blocks across responses.js and openai.js
- agentLogHandlerObj exported from callbacks.js for consistent reuse
- agentLogHandler restricted to an allowlist of safe log levels
  (debug, info, warn, error) instead of accepting arbitrary strings

* fix(SSE): batch summarize deltas, add exhaustiveness check, conditional error announcement

- ON_SUMMARIZE_DELTA coalesces rapid-fire renders via requestAnimationFrame
  instead of calling setMessages per chunk
- Exhaustive never-check on TStepEvent catches unhandled variants at
  compile time when new StepEvents are added
- ON_SUMMARIZE_COMPLETE error announcement only fires when a summary
  part was actually present and removed

* feat(agents): persist instruction overhead in contextMeta and seed across runs

Extend contextMeta with instructionOverhead and toolCount so the
provider-observed instruction overhead is persisted on the response message
and seeded into the pruner on subsequent runs. This enables the pruner to
use a calibrated budget from the first call instead of waiting for a
provider observation, preventing the ratio collapse caused by local
tokenizer overestimating tool schema tokens.

The seeded overhead is only used when encoding and tool count match
between runs, ensuring stale values from different configurations
are discarded.

* test(agents): enhance OpenAI test mocks for summarization handlers

Updated the OpenAI test suite to include additional mock implementations for summarization handlers, including buildSummarizationHandlers, markSummarizationUsage, and agentLogHandlerObj. This improves test coverage and ensures consistent behavior during testing.

* fix(agents): address review findings for summarization v2

Cancel rAF on unmount to prevent stale Recoil writes from dead
component context. Clear orphaned summarizing:true parts when
ON_SUMMARIZE_COMPLETE arrives without a summary payload. Add null
guard and safe spread to agentLogHandler. Handle Anthropic-format
base64 image/* documents in estimateMediaTokensForMessage. Use
role="region" for expandable summary content. Add .describe() to
contextMeta Zod fields. Extract duplicate usage loop into helper.

* refactor: simplify contextMeta to calibrationRatio + encoding only

Remove instructionOverhead and toolCount from cross-run persistence —
instruction tokens change too frequently between runs (prompt edits,
tool changes) for a persisted seed to be reliable. The intra-run
calibration in the pruner still self-corrects via provider observations.
contextMeta now stores only the tokenizer-bias ratio and encoding,
which are stable across instruction changes.

* test(SSE): enhance useStepHandler tests for ON_SUMMARIZE_COMPLETE behavior

Updated the test for ON_SUMMARIZE_COMPLETE to clarify that it finalizes the existing part with summarizing set to false when the summary is undefined. Added assertions to verify the correct behavior of message updates and the state of summary parts.

* refactor(BaseClient): remove handleContextStrategy and truncateToolCallOutputs functions

Eliminated the handleContextStrategy method from BaseClient to streamline message handling. Also removed the truncateToolCallOutputs function from the prompts module, simplifying the codebase and improving maintainability.

* refactor: add AGENT_DEBUG_LOGGING option and refactor token count handling in BaseClient

Introduced AGENT_DEBUG_LOGGING to .env.example for enhanced debugging capabilities. Refactored token count handling in BaseClient by removing the handleTokenCountMap method and simplifying token count updates. Updated AgentClient to log detailed token count recalculations and adjustments, improving traceability during message processing.

* chore: update dependencies in package-lock.json and package.json files

Bumped versions of several dependencies, including @librechat/agents to ^3.1.62 and various AWS SDK packages to their latest versions. This ensures compatibility and incorporates the latest features and fixes.

* chore: imports order

* refactor: extract summarization config resolution from buildAgentContext

* refactor: rename and simplify summarization configuration shaping function

* refactor: replace AgentClient token counting methods with single-pass pure utility

Extract getTokenCount() and getTokenCountForMessage() from AgentClient
into countFormattedMessageTokens(), a pure function in packages/api that
handles text, tool_call, image, and document content types in one loop.

- Decompose estimateMediaTokensForMessage into block-level helpers
  (estimateImageDataTokens, estimateImageBlockTokens, estimateDocumentBlockTokens)
  shared by both estimateMediaTokensForMessage and the new single-pass function
- Remove redundant per-call getEncoding() resolution (closure captures once)
- Remove deprecated gpt-3.5-turbo-0301 model branching
- Drop this.getTokenCount guard from BaseClient.sendMessage

* refactor: streamline token counting in createTokenCounter function

Simplified the createTokenCounter function by removing the media token estimation and directly calculating the token count. This change enhances clarity and performance by consolidating the token counting logic into a single pass, while maintaining compatibility with Claude's token correction.

* refactor: simplify summarization configuration types

Removed the AppSummarizationConfig type and directly used SummarizationConfig in the AppConfig interface. This change streamlines the type definitions and enhances consistency across the codebase.

* chore: import order

* fix: summarization event handling in useStepHandler

- Cancel pending summarizeDeltaRaf in clearStepMaps to prevent stale
  frames firing after map reset or component unmount
- Move announcePolite('summarize_completed') inside the didFinalize
  guard so screen readers only announce when finalization actually occurs
- Remove dead cleanup closure returned from stepHandler useCallback body
  that was never invoked by any caller

* fix: estimate tokens for non-PDF/non-image base64 document blocks

Previously estimateDocumentBlockTokens returned 0 for unrecognized MIME
types (e.g. text/plain, application/json), silently underestimating
context budget. Fall back to character-based heuristic or countTokens.

* refactor: return cloned usage from markSummarizationUsage

Avoid mutating LangChain's internal usage_metadata object by returning
a shallow clone with the usage_type tag. Update all call sites in
callbacks, openai, and responses controllers to use the returned value.

* refactor: consolidate debug logging loops in buildMessages

Merge the two sequential O(n) debug-logging passes over orderedMessages
into a single pass inside the map callback where all data is available.

* refactor: narrow SummaryContentPart.content type

Replace broad Agents.MessageContentComplex[] with the specific
Array<{ type: ContentTypes.TEXT; text: string }> that all producers
and consumers already use, improving compile-time safety.

* refactor: use single output array in recordCollectedUsage

Have processUsageGroup append to a shared array instead of returning
separate arrays that are spread into a third, reducing allocations.

* refactor: use for...in in hydrateMissingIndexTokenCounts

Replace Object.entries with for...in to avoid allocating an
intermediate tuple array during token map hydration.
2026-03-21 14:28:56 -04:00
Danny Avila
8ba2bde5c1
📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830)
* chore: move database model methods to /packages/data-schemas

* chore: add TypeScript ESLint rule to warn on unused variables

* refactor: model imports to streamline access

- Consolidated model imports across various files to improve code organization and reduce redundancy.
- Updated imports for models such as Assistant, Message, Conversation, and others to a unified import path.
- Adjusted middleware and service files to reflect the new import structure, ensuring functionality remains intact.
- Enhanced test files to align with the new import paths, maintaining test coverage and integrity.

* chore: migrate database models to packages/data-schemas and refactor all direct Mongoose Model usage outside of data-schemas

* test: update agent model mocks in unit tests

- Added `getAgent` mock to `client.test.js` to enhance test coverage for agent-related functionality.
- Removed redundant `getAgent` and `getAgents` mocks from `openai.spec.js` and `responses.unit.spec.js` to streamline test setup and reduce duplication.
- Ensured consistency in agent mock implementations across test files.

* fix: update types in data-schemas

* refactor: enhance type definitions in transaction and spending methods

- Updated type definitions in `checkBalance.ts` to use specific request and response types.
- Refined `spendTokens.ts` to utilize a new `SpendTxData` interface for better clarity and type safety.
- Improved transaction handling in `transaction.ts` by introducing `TransactionResult` and `TxData` interfaces, ensuring consistent data structures across methods.
- Adjusted unit tests in `transaction.spec.ts` to accommodate new type definitions and enhance robustness.

* refactor: streamline model imports and enhance code organization

- Consolidated model imports across various controllers and services to a unified import path, improving code clarity and reducing redundancy.
- Updated multiple files to reflect the new import structure, ensuring all functionalities remain intact.
- Enhanced overall code organization by removing duplicate import statements and optimizing the usage of model methods.

* feat: implement loadAddedAgent and refactor agent loading logic

- Introduced `loadAddedAgent` function to handle loading agents from added conversations, supporting multi-convo parallel execution.
- Created a new `load.ts` file to encapsulate agent loading functionalities, including `loadEphemeralAgent` and `loadAgent`.
- Updated the `index.ts` file to export the new `load` module instead of the deprecated `loadAgent`.
- Enhanced type definitions and improved error handling in the agent loading process.
- Adjusted unit tests to reflect changes in the agent loading structure and ensure comprehensive coverage.

* refactor: enhance balance handling with new update interface

- Introduced `IBalanceUpdate` interface to streamline balance update operations across the codebase.
- Updated `upsertBalanceFields` method signatures in `balance.ts`, `transaction.ts`, and related tests to utilize the new interface for improved type safety.
- Adjusted type imports in `balance.spec.ts` to include `IBalanceUpdate`, ensuring consistency in balance management functionalities.
- Enhanced overall code clarity and maintainability by refining type definitions related to balance operations.

* feat: add unit tests for loadAgent functionality and enhance agent loading logic

- Introduced comprehensive unit tests for the `loadAgent` function, covering various scenarios including null and empty agent IDs, loading of ephemeral agents, and permission checks.
- Enhanced the `initializeClient` function by moving `getConvoFiles` to the correct position in the database method exports, ensuring proper functionality.
- Improved test coverage for agent loading, including handling of non-existent agents and user permissions.

* chore: reorder memory method exports for consistency

- Moved `deleteAllUserMemories` to the correct position in the exported memory methods, ensuring a consistent and logical order of method exports in `memory.ts`.
2026-03-21 14:28:53 -04:00
Danny Avila
43ff3f8473
💸 fix: Model Identifier Edge Case in Agent Transactions (#11988)
* 🔧 fix: Add skippedAgentIds tracking in initializeClient error handling

- Enhanced error handling in the initializeClient function to track agent IDs that are skipped during processing. This addition improves the ability to monitor and debug issues related to agent initialization failures.

* 🔧 fix: Update model assignment in BaseClient to use instance model

- Modified the model assignment in BaseClient to use `this.model` instead of `responseMessage.model`, clarifying that when using agents, the model refers to the agent ID rather than the model itself. This change improves code clarity and correctness in the context of agent usage.

* 🔧 test: Add tests for recordTokenUsage model assignment in BaseClient

- Introduced new test cases in BaseClient to ensure that the correct model is passed to the recordTokenUsage method, verifying that it uses this.model instead of the agent ID from responseMessage.model. This enhances the accuracy of token usage tracking in agent scenarios.
- Improved error handling in the initializeClient function to log errors when processing agents, ensuring that skipped agent IDs are tracked for better debugging.
2026-02-28 09:06:32 -05:00
Danny Avila
8b159079f5
🪙 feat: Add messageId to Transactions (#11987)
* feat: Add messageId to transactions

* chore: field order

* feat: Enhance token usage tracking by adding messageId parameter

- Updated `recordTokenUsage` method in BaseClient to accept a new `messageId` parameter for improved tracking.
- Propagated `messageId` in the AgentClient when recording usage.
- Added tests to ensure `messageId` is correctly passed and handled in various scenarios, including propagation across multiple usage entries.

* chore: Correct field order in createGeminiImageTool function

- Moved the conversationId field to the correct position in the object being passed to the recordTokenUsage method, ensuring proper parameter alignment for improved functionality.

* refactor: Update OpenAIChatCompletionController and createResponse to use responseId instead of requestId

- Replaced instances of requestId with responseId in the OpenAIChatCompletionController for improved clarity in logging and tracking.
- Updated createResponse to include responseId in the requestBody, ensuring consistency across the handling of message identifiers.

* test: Add messageId to agent client tests

- Included messageId in the agent client tests to ensure proper handling and propagation of message identifiers during transaction recording.
- This update enhances the test coverage for scenarios involving messageId, aligning with recent changes in the tracking of message identifiers.

* fix: Update OpenAIChatCompletionController to use requestId for context

- Changed the context object in OpenAIChatCompletionController to use `requestId` instead of `responseId` for improved clarity and consistency in handling request identifiers.

* chore: field order
2026-02-27 23:50:13 -05:00
Danny Avila
6169d4f70b
🚦 fix: 404 JSON Responses for Unmatched API Routes (#11976)
* feat: Implement 404 JSON response for unmatched API routes

- Added middleware to return a 404 JSON response with a message for undefined API routes.
- Updated SPA fallback to serve index.html for non-API unmatched routes.
- Ensured the error handler is positioned correctly as the last middleware in the stack.

* fix: Enhance logging in BaseClient for better token usage tracking

- Updated `getTokenCountForResponse` to log the messageId of the response for improved debugging.
- Enhanced userMessage logging to include messageId, tokenCount, and conversationId for clearer context during token count mapping.

* chore: Improve logging in processAddedConvo for better debugging

- Updated the logging structure in the processAddedConvo function to provide clearer context when processing added conversations.
- Removed redundant logging and enhanced the output to include model, agent ID, and endpoint details for improved traceability.

* chore: Enhance logging in BaseClient for improved token usage tracking

- Added debug logging in the BaseClient to track response token usage, including messageId, model, promptTokens, and completionTokens for better debugging and traceability.

* chore: Enhance logging in MemoryAgent for improved context

- Updated logging in the MemoryAgent to include userId, conversationId, messageId, and provider details for better traceability during memory processing.
- Adjusted log messages to provide clearer context when content is returned or not, aiding in debugging efforts.

* chore: Refactor logging in initializeClient for improved clarity

- Consolidated multiple debug log statements into a single message that provides a comprehensive overview of the tool context being stored for the primary agent, including the number of tools and the size of the tool registry. This enhances traceability and debugging efficiency.

* feat: Implement centralized 404 handling for unmatched API routes

- Introduced a new middleware function `apiNotFound` to standardize 404 JSON responses for undefined API routes.
- Updated the server configuration to utilize the new middleware, enhancing code clarity and maintainability.
- Added tests to ensure correct 404 responses for various non-GET methods and the `/api` root path.

* fix: Enhance logging in apiNotFound middleware for improved safety

- Updated the `apiNotFound` function to sanitize the request path by replacing problematic characters and limiting its length, ensuring safer logging of 404 errors.

* refactor: Move apiNotFound middleware to a separate file for better organization

- Extracted the `apiNotFound` function from the error middleware into its own file, enhancing code organization and maintainability.
- Updated the index file to export the new `notFound` middleware, ensuring it is included in the middleware stack.

* docs: Add comment to clarify usage of unsafeChars regex in notFound middleware

- Included a comment in the notFound middleware file to explain that the unsafeChars regex is safe to reuse with .replace() at the module scope, as it does not retain lastIndex state.
2026-02-27 22:49:54 -05:00
marbence101
3a079b980a
📌 fix: Populate userMessage.files Before First DB Save (#11939)
* fix: populate userMessage.files before first DB save

* fix: ESLint error fixed

* fix: deduplicate file-population logic and add test coverage

Extract `buildMessageFiles` helper into `packages/api/src/utils/message`
to replace three near-identical loops in BaseClient and both agent
controllers. Fixes set poisoning from undefined file_id entries, moves
file population inside the skipSaveUserMessage guard to avoid wasted
work, and adds full unit test coverage for the new behavior.

* chore: reorder import statements in openIdJwtStrategy.js for consistency

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-02-26 09:16:45 -05:00
Dustin Healy
1d0a4c501f
🪨 feat: AWS Bedrock Document Uploads (#11912)
* feat: add aws bedrock upload to provider support

* chore: address copilot comments

* feat: add shared Bedrock document format types and MIME mapping

Bedrock Converse API accepts 9 document formats beyond PDF. Add
BedrockDocumentFormat union type, MIME-to-format mapping, and helpers
in data-provider so both client and backend can reference them.

* refactor: generalize Bedrock PDF validation to support all document types

Rename validateBedrockPdf to validateBedrockDocument with MIME-aware
logic: 4.5MB hard limit applies to all types, PDF header check only
runs for application/pdf. Adds test coverage for non-PDF documents.

* feat: support all Bedrock document formats in encoding pipeline

Widen file type gates to accept csv, doc, docx, xls, xlsx, html, txt,
md for Bedrock. Uses shared MIME-to-format map instead of hardcoded
'pdf'. Other providers' PDF-only paths remain unchanged.

* feat: expand Bedrock file upload UI to accept all document types

Add 'image_document_extended' upload type for Bedrock with accept
filters for all 9 supported formats. Update drag-and-drop validation
to use isBedrockDocumentType helper.

* fix: route Bedrock document types through provider pipeline
2026-02-23 22:32:44 -05:00
Danny Avila
e452c1a8d9
🔀 refactor: Conditional Mapping Support for Multi-Convo (Parallel) Messages (#11180)
* refactor: message handling with addedConvo support

- Introduced `addedConvo` property in message schema to track conversation additions.
- Updated `BaseClient` to conditionally include `addedConvo` in saved messages based on request body.
- Enhanced `AgentClient` to apply mapping logic for messages with the `addedConvo` flag, improving message processing.
- Updated documentation to reflect new optional `mapCondition` parameter for message mapping functions, enhancing flexibility in message handling.

* test: Add comprehensive tests for getMessagesForConversation method

- Introduced a suite of tests for the `getMessagesForConversation` method in the `AgentClient` to validate mapping logic based on `mapMethod` and `mapCondition`.
- Covered various scenarios including applying mapping to all messages, conditional mapping based on `addedConvo`, handling of empty messages, and preserving message order.
- Ensured robust handling of edge cases such as null `mapMethod` and undefined `mapCondition`, enhancing overall test coverage and reliability of message processing.
2026-01-02 19:42:54 -05:00
Danny Avila
439bc98682
⏸ refactor: Improve UX for Parallel Streams (Multi-Convo) (#11096)
* 🌊 feat: Implement multi-conversation feature with added conversation context and payload adjustments

* refactor: Replace isSubmittingFamily with isSubmitting across message components for consistency

* feat: Add loadAddedAgent and processAddedConvo for multi-conversation agent execution

* refactor: Update ContentRender usage to conditionally render PlaceholderRow based on isLast and isSubmitting

* WIP: first pass, sibling index

* feat: Enhance multi-conversation support with agent tracking and display improvements

* refactor: Introduce isEphemeralAgentId utility and update related logic for agent handling

* refactor: Implement createDualMessageContent utility for sibling message display and enhance useStepHandler for added conversations

* refactor: duplicate tools for added agent if ephemeral and primary agent is also ephemeral

* chore: remove deprecated multimessage rendering

* refactor: enhance dual message content creation and agent handling for parallel rendering

* refactor: streamline message rendering and submission handling by removing unused state and optimizing conditional logic

* refactor: adjust content handling in parallel mode to utilize existing content for improved agent display

* refactor: update @librechat/agents dependency to version 3.0.53

* refactor: update @langchain/core and @librechat/agents dependencies to latest versions

* refactor: remove deprecated @langchain/core dependency from package.json

* chore: remove unused SearchToolConfig and GetSourcesParams types from web.ts

* refactor: remove unused message properties from Message component

* refactor: enhance parallel content handling with groupId support in ContentParts and useStepHandler

* refactor: implement parallel content styling in Message, MessageRender, and ContentRender components. use explicit model name

* refactor: improve agent ID handling in createDualMessageContent for dual message display

* refactor: simplify title generation in AddedConvo by removing unused sender and preset logic

* refactor: replace string interpolation with cn utility for className in HoverButtons component

* refactor: enhance agent ID handling by adding suffix management for parallel agents and updating related components

* refactor: enhance column ordering in ContentParts by sorting agents with suffix management

* refactor: update @librechat/agents dependency to version 3.0.55

* feat: implement parallel content rendering with metadata support

- Added `ParallelContentRenderer` and `ParallelColumns` components for rendering messages in parallel based on groupId and agentId.
- Introduced `contentMetadataMap` to store metadata for each content part, allowing efficient parallel content detection.
- Updated `Message` and `ContentRender` components to utilize the new metadata structure for rendering.
- Modified `useStepHandler` to manage content indices and metadata during message processing.
- Enhanced `IJobStore` interface and its implementations to support storing and retrieving content metadata.
- Updated data schemas to include `contentMetadataMap` for messages, enabling multi-agent and parallel execution scenarios.

* refactor: update @librechat/agents dependency to version 3.0.56

* refactor: remove unused EPHEMERAL_AGENT_ID constant and simplify agent ID check

* refactor: enhance multi-agent message processing and primary agent determination

* refactor: implement branch message functionality for parallel responses

* refactor: integrate added conversation retrieval into message editing and regeneration processes

* refactor: remove unused isCard and isMultiMessage props from MessageRender and ContentRender components

* refactor: update @librechat/agents dependency to version 3.0.60

* refactor: replace usage of EPHEMERAL_AGENT_ID constant with isEphemeralAgentId function for improved clarity and consistency

* refactor: standardize agent ID format in tests for consistency

* chore: move addedConvo property to the correct position in payload construction

* refactor: rename agent_id values in loadAgent tests for clarity

* chore: reorder props in ContentParts component for improved readability

* refactor: rename variable 'content' to 'result' for clarity in RedisJobStore tests

* refactor: streamline useMessageActions by removing duplicate handleFeedback assignment

* chore: revert placeholder rendering logic MessageRender and ContentRender components to original

* refactor: implement useContentMetadata hook for optimized content metadata handling

* refactor: remove contentMetadataMap and related logic from the codebase and revert back to agentId/groupId in content parts

- Eliminated contentMetadataMap from various components and services, simplifying the handling of message content.
- Updated functions to directly access agentId and groupId from content parts instead of relying on a separate metadata map.
- Adjusted related hooks and components to reflect the removal of contentMetadataMap, ensuring consistent handling of message content.
- Updated tests and documentation to align with the new structure of message content handling.

* refactor: remove logging from groupParallelContent function to clean up output

* refactor: remove model parameter from TBranchMessageRequest type for simplification

* refactor: enhance branch message creation by stripping metadata for standalone content

* chore: streamline branch message creation by simplifying content filtering and removing unnecessary metadata checks

* refactor: include attachments in branch message creation for improved content handling

* refactor: streamline agent content processing by consolidating primary agent identification and filtering logic

* refactor: simplify multi-agent message processing by creating a dedicated mapping method and enhancing content filtering

* refactor: remove unused parameter from loadEphemeralAgent function for cleaner code

* refactor: update groupId handling in metadata to only set when provided by the server
2025-12-25 01:43:54 -05:00
Danny Avila
b5ab32c5ae
🎯 refactor: Centralize Agent Model Handling Across Conversation Lifecycle (#10956)
* refactor: Implement clearModelForNonEphemeralAgent utility for improved agent handling

- Introduced clearModelForNonEphemeralAgent function to manage model state for non-ephemeral agents across various components.
- Updated ModelSelectorContext to initialize model based on agent type.
- Enhanced useNavigateToConvo, useQueryParams, and useSelectMention hooks to clear model for non-ephemeral agents.
- Refactored buildDefaultConvo and endpoints utility to ensure proper handling of agent_id and model state.
- Improved overall conversation logic and state management for better performance and reliability.

* refactor: Enhance useNewConvo hook to improve agent conversation handling

- Added logic to skip access checks for existing agent conversations, utilizing localStorage to restore conversations after refresh.
- Improved handling of default endpoints for agents based on user access and existing conversation state, ensuring more reliable conversation initialization.

* refactor: Update ChatRoute and useAuthRedirect to include user roles

- Enhanced ChatRoute to utilize user roles for improved conversation state management.
- Modified useAuthRedirect to return user roles alongside authentication status, ensuring roles are available for conditional logic in ChatRoute.
- Adjusted conversation initialization logic to depend on the loaded roles, enhancing the reliability of the conversation setup.

* refactor: Update BaseClient to handle non-ephemeral agents in conversation logic

- Added a check for non-ephemeral agents in BaseClient, modifying the exceptions set to include 'model' when applicable.
- Enhanced conversation handling to improve flexibility based on agent type.

* test: Add mock for clearModelForNonEphemeralAgent in useQueryParams tests

- Introduced a mock for clearModelForNonEphemeralAgent to enhance testing of query parameters related to non-ephemeral agents.
- This addition supports improved test coverage and ensures proper handling of model state in relevant scenarios.

* refactor: Simplify mocks in useQueryParams tests for improved clarity

- Updated the mocking strategy for utilities in useQueryParams tests to use actual implementations where possible, while still suppressing test output for the logger.
- Enhanced the mock for tQueryParamsSchema to minimize complexity and avoid unnecessary validation during tests, improving test reliability and maintainability.

* refactor: Enhance agent identification logic in BaseClient for improved clarity

* chore: Import Constants in families.ts for enhanced functionality
2025-12-13 08:29:15 -05:00
Danny Avila
04a4a2aa44
🧵 refactor: Migrate Endpoint Initialization to TypeScript (#10794)
* refactor: move endpoint initialization methods to typescript

* refactor: move agent init to packages/api

- Introduced `initialize.ts` for agent initialization, including file processing and tool loading.
- Updated `resources.ts` to allow optional appConfig parameter.
- Enhanced endpoint configuration handling in various initialization files to support model parameters.
- Added new artifacts and prompts for React component generation.
- Refactored existing code to improve type safety and maintainability.

* refactor: streamline endpoint initialization and enhance type safety

- Updated initialization functions across various endpoints to use a consistent request structure, replacing `unknown` types with `ServerResponse`.
- Simplified request handling by directly extracting keys from the request body.
- Improved type safety by ensuring user IDs are safely accessed with optional chaining.
- Removed unnecessary parameters and streamlined model options handling for better clarity and maintainability.

* refactor: moved ModelService and extractBaseURL to packages/api

- Added comprehensive tests for the models fetching functionality, covering scenarios for OpenAI, Anthropic, Google, and Ollama models.
- Updated existing endpoint index to include the new models module.
- Enhanced utility functions for URL extraction and model data processing.
- Improved type safety and error handling across the models fetching logic.

* refactor: consolidate utility functions and remove unused files

- Merged `deriveBaseURL` and `extractBaseURL` into the `@librechat/api` module for better organization.
- Removed redundant utility files and their associated tests to streamline the codebase.
- Updated imports across various client files to utilize the new consolidated functions.
- Enhanced overall maintainability by reducing the number of utility modules.

* refactor: replace ModelService references with direct imports from @librechat/api and remove ModelService file

* refactor: move encrypt/decrypt methods and key db methods to data-schemas, use `getProviderConfig` from `@librechat/api`

* chore: remove unused 'res' from options in AgentClient

* refactor: file model imports and methods

- Updated imports in various controllers and services to use the unified file model from '~/models' instead of '~/models/File'.
- Consolidated file-related methods into a new file methods module in the data-schemas package.
- Added comprehensive tests for file methods including creation, retrieval, updating, and deletion.
- Enhanced the initializeAgent function to accept dependency injection for file-related methods.
- Improved error handling and logging in file methods.

* refactor: streamline database method references in agent initialization

* refactor: enhance file method tests and update type references to IMongoFile

* refactor: consolidate database method imports in agent client and initialization

* chore: remove redundant import of initializeAgent from @librechat/api

* refactor: move checkUserKeyExpiry utility to @librechat/api and update references across endpoints

* refactor: move updateUserPlugins logic to user.ts and simplify UserController

* refactor: update imports for user key management and remove UserService

* refactor: remove unused Anthropics and Bedrock endpoint files and clean up imports

* refactor: consolidate and update encryption imports across various files to use @librechat/data-schemas

* chore: update file model mock to use unified import from '~/models'

* chore: import order

* refactor: remove migrated to TS agent.js file and its associated logic from the endpoints

* chore: add reusable function to extract imports from source code in unused-packages workflow

* chore: enhance unused-packages workflow to include @librechat/api dependencies and improve dependency extraction

* chore: improve dependency extraction in unused-packages workflow with enhanced error handling and debugging output

* chore: add detailed debugging output to unused-packages workflow for better visibility into unused dependencies and exclusion lists

* chore: refine subpath handling in unused-packages workflow to correctly process scoped and non-scoped package imports

* chore: clean up unused debug output in unused-packages workflow and reorganize type imports in initialize.ts
2025-12-11 16:37:16 -05:00
Danny Avila
8bdc808074
refactor: Optimize & Standardize Tokenizer Usage (#10777)
* refactor: Token Limit Processing with Enhanced Efficiency

- Added a new test suite for `processTextWithTokenLimit`, ensuring comprehensive coverage of various scenarios including under, at, and exceeding token limits.
- Refactored the `processTextWithTokenLimit` function to utilize a ratio-based estimation method, significantly reducing the number of token counting function calls compared to the previous binary search approach.
- Improved handling of edge cases and variable token density, ensuring accurate truncation and performance across diverse text inputs.
- Included direct comparisons with the old implementation to validate correctness and efficiency improvements.

* refactor: Remove Tokenizer Route and Related References

- Deleted the tokenizer route from the server and removed its references from the routes index and server files, streamlining the API structure.
- This change simplifies the routing configuration by eliminating unused endpoints.

* refactor: Migrate countTokens Utility to API Module

- Removed the local countTokens utility and integrated it into the @librechat/api module for centralized access.
- Updated various files to reference the new countTokens import from the API module, ensuring consistent usage across the application.
- Cleaned up unused references and imports related to the previous countTokens implementation.

* refactor: Centralize escapeRegExp Utility in API Module

- Moved the escapeRegExp function from local utility files to the @librechat/api module for consistent usage across the application.
- Updated imports in various files to reference the new centralized escapeRegExp function, ensuring cleaner code and reducing redundancy.
- Removed duplicate implementations of escapeRegExp from multiple files, streamlining the codebase.

* refactor: Enhance Token Counting Flexibility in Text Processing

- Updated the `processTextWithTokenLimit` function to accept both synchronous and asynchronous token counting functions, improving its versatility.
- Introduced a new `TokenCountFn` type to define the token counting function signature.
- Added comprehensive tests to validate the behavior of `processTextWithTokenLimit` with both sync and async token counting functions, ensuring consistent results.
- Implemented a wrapper to track call counts for the `countTokens` function, optimizing performance and reducing unnecessary calls.
- Enhanced existing tests to compare the performance of the new implementation against the old one, demonstrating significant improvements in efficiency.

* chore: documentation for Truncation Safety Buffer in Token Processing

- Added a safety buffer multiplier to the character position estimates during text truncation to prevent overshooting token limits.
- Updated the `processTextWithTokenLimit` function to utilize the new `TRUNCATION_SAFETY_BUFFER` constant, enhancing the accuracy of token limit processing.
- Improved documentation to clarify the rationale behind the buffer and its impact on performance and efficiency in token counting.
2025-12-02 12:22:04 -05:00
Zihao Zhou
5b8f0cba04
📄 refactor: Add Provider Fallback for Media Encoding using Client Endpoint (#10656)
When using direct endpoints (e.g., Google) instead of Agents, `this.options.agent` is undefined, causing provider detection to fail. This resulted in "Unknown content type document" errors for Google/Gemini PDF uploads.

Added `?? this.options.endpoint` fallback in addDocuments(), addVideos(), and addAudios() methods to ensure correct provider detection for all endpoint types.
2025-11-25 17:07:37 -05:00
Danny Avila
c0cb48256e
🤖 refactor: Improve Agent Handoff Context Tracking (#10553)
* chore: update @librechat/agents dependency to version 3.0.18

* refactor: add optional metadata field to message schema and types

* chore: update @librechat/agents to v3.0.19

* refactor: update return type of sendCompletion method to include metadata

* chore: linting

* chore: update @librechat/agents dependency to v3.0.20

* refactor: implement agent labeling for conversation history in multi-agent scenarios

* refactor: improve error handling for capturing agent ID map in AgentClient

* refactor: clear agentIdMap and related properties during client disposal to prevent memory leaks

* chore: update sendCompletion method for FakeClient to return an object with completion and metadata fields
2025-11-17 16:57:51 -05:00
Danny Avila
2524d33362
📂 refactor: Cleanup File Filtering Logic, Improve Validation (#10414)
* feat: add filterFilesByEndpointConfig to filter disabled file processing by provider

* chore: explicit define of endpointFileConfig for better debugging

* refactor: move `normalizeEndpointName` to data-provider as used app-wide

* chore: remove overrideEndpoint from useFileHandling

* refactor: improve endpoint file config selection

* refactor: update filterFilesByEndpointConfig to accept structured parameters and improve endpoint file config handling

* refactor: replace defaultFileConfig with getEndpointFileConfig for improved file configuration handling across components

* test: add comprehensive unit tests for getEndpointFileConfig to validate endpoint configuration handling

* refactor: streamline agent endpoint assignment and improve file filtering logic

* feat: add error handling for disabled file uploads in endpoint configuration

* refactor: update encodeAndFormat functions to accept structured parameters for provider and endpoint

* refactor: streamline requestFiles handling in initializeAgent function

* fix: getEndpointFileConfig partial config merging scenarios

* refactor: enhance mergeWithDefault function to support document-supported providers with comprehensive MIME types

* refactor: user-configured default file config in getEndpointFileConfig

* fix: prevent file handling when endpoint is disabled and file is dragged to chat

* refactor: move `getEndpointField` to `data-provider` and update usage across components and hooks

* fix: prioritize endpointType based on agent.endpoint in file filtering logic

* fix: prioritize agent.endpoint in file filtering logic and remove unnecessary endpointType defaulting
2025-11-10 19:05:30 -05:00
Danny Avila
07d0abc9fd
🖼️ fix: Extract File Context & Persist Attachments (#10069)
- problem: `addImageUrls` had a side effect that was being leveraged before to populate both the `ocr` message field, now `fileContext`, and `client.options.attachments`, which would record the user's uploaded message attachments to the user message when saved to the database and returned at the end of the request lifecycle
- solution: created dedicated handling for file context, and made sure to populate `allFiles` with non-provider attachments
2025-10-10 05:35:37 -04:00
Danny Avila
bcd97aad2f
📎 feat: Direct Provider Attachment Support for Multimodal Content (#9994)
* 📎 feat: Direct Provider Attachment Support for Multimodal Content

* 📑 feat: Anthropic Direct Provider Upload (#9072)

* feat: implement Anthropic native PDF support with document preservation

- Add comprehensive debug logging throughout PDF processing pipeline
- Refactor attachment processing to separate image and document handling
- Create distinct addImageURLs(), addDocuments(), and processAttachments() methods
- Fix critical bugs in stream handling and parameter passing
- Add streamToBuffer utility for proper stream-to-buffer conversion
- Remove api/agents submodule from repository

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: remove out of scope formatting changes

* fix: stop duplication of file in chat on end of response stream

* chore: bring back file search and ocr options

* chore: localize upload to provider string in file menu

* refactor: change createMenuItems args to fit new pattern introduced by anthropic-native-pdf-support

* feat: add cache point for pdfs processed by anthropic endpoint since they are unlikely to change and should benefit from caching

* feat: combine Upload Image into Upload to Provider since they both perform direct upload and change provider upload icon to reflect multimodal upload

* feat: add citations support according to docs

* refactor: remove redundant 'document' check since documents are handled properly by formatMessage in the agents repo now

* refactor: change upload logic so anthropic endpoint isn't exempted from normal upload path using Agents for consistency with the rest of the upload logic

* fix: include width and height in return from uploadLocalFile so images are correctly identified when going through an AgentUpload in addImageURLs

* chore: remove client specific handling since the direct provider stuff is handled by the agent client

* feat: handle documents in AgentClient so no need for change to agents repo

* chore: removed unused changes

* chore: remove auto generated comments from OG commit

* feat: add logic for agents to use direct to provider uploads if supported (currently just anthropic)

* fix: reintroduce role check to fix render error because of undefined value for Content Part

* fix: actually fix render bug by using proper isCreatedByUser check and making sure our mutation of formattedMessage.content is consistent

---------

Co-authored-by: Andres Restrepo <andres@thelinuxkid.com>
Co-authored-by: Claude <noreply@anthropic.com>

📁 feat: Send Attachments Directly to Provider (OpenAI) (#9098)

* refactor: change references from direct upload to direct attach to better reflect functionality

since we are just using base64 encoding strategy now rather than Files/File API for sending our attachments directly to the provider, the upload nomenclature no longer makes sense. direct_attach better describes the different methods of sending attachments to providers anyways even if we later introduce direct upload support

* feat: add upload to provider option for openai (and agent) ui

* chore: move anthropic pdf validator over to packages/api

* feat: simple pdf validation according to openai docs

* feat: add provider agnostic validatePdf logic to start handling multiple endpoints

* feat: add handling for openai specific documentPart formatting

* refactor: move require statement to proper place at top of file

* chore: add in openAI endpoint for the rest of the document handling logic

* feat: add direct attach support for azureOpenAI endpoint and agents

* feat: add pdf validation for azureOpenAI endpoint

* refactor: unify all the endpoint checks with isDocumentSupportedEndpoint

* refactor: consolidate Upload to Provider vs Upload image logic for clarity

* refactor: remove anthropic from anthropic_multimodal fileType since we support multiple providers now

🗂️ feat: Send Attachments Directly to Provider (Google) (#9100)

* feat: add validation for google PDFs and add google endpoint as a document supporting endpoint

* feat: add proper pdf formatting for google endpoints (requires PR #14 in agents)

* feat: add multimodal support for google endpoint attachments

* feat: add audio file svg

* fix: refactor attachments logic so multi-attachment messages work properly

* feat: add video file svg

* fix: allows for followup questions of uploaded multimodal attachments

* fix: remove incorrect final message filtering that was breaking Attachment component rendering

fix: manualy rename 'documents' to 'Documents' in git since it wasn't picked up due to case insensitivity in dir name

fix: add logic so filepicker for a google agent has proper filetype filtering

🛫 refactor: Move Encoding Logic to packages/api (#9182)

* refactor: move audio encode over to TS

* refactor: audio encoding now functional in LC again

* refactor: move video encode over to TS

* refactor: move document encode over to TS

* refactor: video encoding now functional in LC again

* refactor: document encoding now functional in LC again

* fix: extend file type options in AttachFileMenu to include 'google_multimodal' and update dependency array to include agent?.provider

* feat: only accept pdfs if responses api is enabled for openai convos

chore: address ESLint comments

chore: add missing audio mimetype

* fix: type safety for message content parts and improve null handling

* chore: reorder AttachFileMenuProps for consistency and clarity

* chore: import order in AttachFileMenu

* fix: improve null handling for text parts in parseTextParts function

* fix: remove no longer used unsupported capability error message for file uploads

* fix: OpenAI Direct File Attachment Format

* fix: update encodeAndFormatDocuments to support  OpenAI responses API and enhance document result types

* refactor: broaden providers supported for documents

* feat: enhance DragDrop context and modal to support document uploads based on provider capabilities

* fix: reorder import statements for consistency in video encoding module

---------

Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>
2025-10-06 17:30:16 -04:00
Danny Avila
9a210971f5
🛜 refactor: Streamline App Config Usage (#9234)
* WIP: app.locals refactoring

WIP: appConfig

fix: update memory configuration retrieval to use getAppConfig based on user role

fix: update comment for AppConfig interface to clarify purpose

🏷️ refactor: Update tests to use getAppConfig for endpoint configurations

ci: Update AppService tests to initialize app config instead of app.locals

ci: Integrate getAppConfig into remaining tests

refactor: Update multer storage destination to use promise-based getAppConfig and improve error handling in tests

refactor: Rename initializeAppConfig to setAppConfig and update related tests

ci: Mock getAppConfig in various tests to provide default configurations

refactor: Update convertMCPToolsToPlugins to use mcpManager for server configuration and adjust related tests

chore: rename `Config/getAppConfig` -> `Config/app`

fix: streamline OpenAI image tools configuration by removing direct appConfig dependency and using function parameters

chore: correct parameter documentation for imageOutputType in ToolService.js

refactor: remove `getCustomConfig` dependency in config route

refactor: update domain validation to use appConfig for allowed domains

refactor: use appConfig registration property

chore: remove app parameter from AppService invocation

refactor: update AppConfig interface to correct registration and turnstile configurations

refactor: remove getCustomConfig dependency and use getAppConfig in PluginController, multer, and MCP services

refactor: replace getCustomConfig with getAppConfig in STTService, TTSService, and related files

refactor: replace getCustomConfig with getAppConfig in Conversation and Message models, update tempChatRetention functions to use AppConfig type

refactor: update getAppConfig calls in Conversation and Message models to include user role for temporary chat expiration

ci: update related tests

refactor: update getAppConfig call in getCustomConfigSpeech to include user role

fix: update appConfig usage to access allowedDomains from actions instead of registration

refactor: enhance AppConfig to include fileStrategies and update related file strategy logic

refactor: update imports to use normalizeEndpointName from @librechat/api and remove redundant definitions

chore: remove deprecated unused RunManager

refactor: get balance config primarily from appConfig

refactor: remove customConfig dependency for appConfig and streamline loadConfigModels logic

refactor: remove getCustomConfig usage and use app config in file citations

refactor: consolidate endpoint loading logic into loadEndpoints function

refactor: update appConfig access to use endpoints structure across various services

refactor: implement custom endpoints configuration and streamline endpoint loading logic

refactor: update getAppConfig call to include user role parameter

refactor: streamline endpoint configuration and enhance appConfig usage across services

refactor: replace getMCPAuthMap with getUserMCPAuthMap and remove unused getCustomConfig file

refactor: add type annotation for loadedEndpoints in loadEndpoints function

refactor: move /services/Files/images/parse to TS API

chore: add missing FILE_CITATIONS permission to IRole interface

refactor: restructure toolkits to TS API

refactor: separate manifest logic into its own module

refactor: consolidate tool loading logic into a new tools module for startup logic

refactor: move interface config logic to TS API

refactor: migrate checkEmailConfig to TypeScript and update imports

refactor: add FunctionTool interface and availableTools to AppConfig

refactor: decouple caching and DB operations from AppService, make part of consolidated `getAppConfig`

WIP: fix tests

* fix: rebase conflicts

* refactor: remove app.locals references

* refactor: replace getBalanceConfig with getAppConfig in various strategies and middleware

* refactor: replace appConfig?.balance with getBalanceConfig in various controllers and clients

* test: add balance configuration to titleConvo method in AgentClient tests

* chore: remove unused `openai-chat-tokens` package

* chore: remove unused imports in initializeMCPs.js

* refactor: update balance configuration to use getAppConfig instead of getBalanceConfig

* refactor: integrate configMiddleware for centralized configuration handling

* refactor: optimize email domain validation by removing unnecessary async calls

* refactor: simplify multer storage configuration by removing async calls

* refactor: reorder imports for better readability in user.js

* refactor: replace getAppConfig calls with req.config for improved performance

* chore: replace getAppConfig calls with req.config in tests for centralized configuration handling

* chore: remove unused override config

* refactor: add configMiddleware to endpoint route and replace getAppConfig with req.config

* chore: remove customConfig parameter from TTSService constructor

* refactor: pass appConfig from request to processFileCitations for improved configuration handling

* refactor: remove configMiddleware from endpoint route and retrieve appConfig directly in getEndpointsConfig if not in `req.config`

* test: add mockAppConfig to processFileCitations tests for improved configuration handling

* fix: pass req.config to hasCustomUserVars and call without await after synchronous refactor

* fix: type safety in useExportConversation

* refactor: retrieve appConfig using getAppConfig in PluginController and remove configMiddleware from plugins route, to avoid always retrieving when plugins are cached

* chore: change `MongoUser` typedef to `IUser`

* fix: Add `user` and `config` fields to ServerRequest and update JSDoc type annotations from Express.Request to ServerRequest

* fix: remove unused setAppConfig mock from Server configuration tests
2025-08-26 12:10:18 -04:00
Danny Avila
e0ebb7097e
fix: AbortSignal Cleanup Logic for New Chats (#9177)
* chore: import paths for isEnabled and logger in title.js

*  fix: `AbortSignal` Cleanup Logic for New Chats

* test: Add `isNewConvo` parameter to onStart expectation in BaseClient tests
2025-08-20 14:56:07 -04:00
Danny Avila
d7d02766ea
🏷️ feat: Request Placeholders for Custom Endpoint & MCP Headers (#9095)
* feat: Add conversation ID support to custom endpoint headers

- Add LIBRECHAT_CONVERSATION_ID to customUserVars when provided
- Pass conversation ID to header resolution for dynamic headers
- Add comprehensive test coverage

Enables custom endpoints to access conversation context using {{LIBRECHAT_CONVERSATION_ID}} placeholder.

* fix: filter out unresolved placeholders from headers (thanks @MrunmayS)

* feat: add support for request body placeholders in custom endpoint headers

- Add {{LIBRECHAT_BODY_*}} placeholders for conversationId, parentMessageId, messageId
- Update tests to reflect new body placeholder functionality

* refactor resolveHeaders

* style: minor styling cleanup

* fix: type error in unit test

* feat: add body to other endpoints

* feat: add body for mcp tool calls

* chore: remove changes that unnecessarily increase scope after clarification of requirements

* refactor: move http.ts to packages/api and have RequestBody intersect with Express request body

* refactor: processMCPEnv now uses single object argument pattern

* refactor: update processMCPEnv to use 'options' parameter and align types across MCP connection classes

* feat: enhance MCP connection handling with dynamic request headers to pass request body fields

---------

Co-authored-by: Gopal Sharma <gopalsharma@gopal.sharma1>
Co-authored-by: s10gopal <36487439+s10gopal@users.noreply.github.com>
Co-authored-by: Dustin Healy <dustinhealy1@gmail.com>
2025-08-16 20:45:55 -04:00
Danny Avila
35d8ef50f4
🪙 fix: Use Fallback Token Transaction if No Collected Usage (#8503) 2025-07-16 17:58:15 -04:00
Danny Avila
e370a87ebe
♻️ fix: Correct Message ID Assignment Logic (#8439)
* fix: Add `isRegenerate` flag to chat payload to avoid saving temporary response IDs

* fix: Remove unused `isResubmission` flag

* ci: Add tests for responseMessageId regeneration logic in BaseClient
2025-07-14 00:57:20 -04:00
Danny Avila
434289fe92
🔀 feat: Save & Submit Message Content Parts (#8171)
* 🐛 fix: Enhance provider validation and error handling in getProviderConfig function

* WIP: edit text part

* refactor: Allow updating of both TEXT and THINK content types in message updates

* WIP: first pass, save & submit

* chore: remove legacy generation user message field

* feat: merge edited content

* fix: update placeholder and description for bedrock setting

* fix: remove unsupported warning message for AI resubmission
2025-07-01 15:43:10 -04:00
Danny Avila
01e9b196bc
🤖 feat: Streamline Endpoints to Agent Framework (#8013)
* refactor(buildEndpointOption): Improve error logging in middleware, consolidate `isAgents` builder logic, remove adding `modelsConfig` to `endpointOption`

* refactor: parameter extraction and organization in agent services, minimize redundancy of shared fields across objects, make clear distinction of parameters processed uniquely by LibreChat vs LLM Provider Configs

* refactor(createPayload): streamline all endpoints to agent route

* fix: add `modelLabel` to response sender options for agent initialization

* chore: correct log message context in EditController abort controller cleanup

* chore: remove unused abortRequest hook

* chore: remove unused addToCache module and its dependencies

* refactor: remove AskController and related routes, update endpoint URLs (now all streamlined to agents route)

* chore: remove unused bedrock route and its related imports

* refactor: simplify response sender logic for Google endpoint

* chore: add `modelDisplayLabel` handling for agents endpoint

* feat: add file search capability to ephemeral agents, update code interpreter selection based of file upload, consolidate main upload menu for all endpoints

* feat: implement useToolToggle hook for managing tool toggle state, refactor CodeInterpreter and WebSearch components to utilize new hook

* feat: add ToolsDropdown component to BadgeRow for enhanced tool options

* feat: introduce BadgeRowContext and BadgeRowProvider for managing conversation state, refactor related components to utilize context

* feat: implement useMCPSelect hook for managing MCP selection state, refactor MCPSelect component to utilize new hook

* feat: enhance BadgeRowContext with MCPSelect and tool toggle functionality, refactor related components to utilize updated context and hooks

* refactor: streamline useToolToggle hook by integrating setEphemeralAgent directly into toggle logic and removing redundant setValue function

* refactor: consolidate codeApiKeyForm and searchApiKeyForm from CodeInterpreter and WebSearch to utilize new context properties

* refactor: update CheckboxButton to support controlled state and enhance ToolsDropdown with permission-based toggles for web search and code interpreter

* refactor: conditionally render CheckboxButton in CodeInterpreter and WebSearch components for improved UI responsiveness

* chore: add jotai dependency to package.json and package-lock.json

* chore: update brace-expansion package to version 2.0.2 in package-lock.json due to CVE-2025-5889

* Revert "chore: add jotai dependency to package.json and package-lock.json"

This reverts commit 69b6997396.

* refactor: add pinning functionality to CodeInterpreter and WebSearch components, and enhance ToolsDropdown with pin toggle for web search and code interpreter

* chore: move MCPIcon to correct location, remove duplicate

* fix: update MCP import to use type-only import from librechat-data-provider

* feat: implement MCPSubMenu component and integrate pinning functionality into ToolsDropdown

* fix: cycling to submenu by using parent menu context

* feat: add FileSearch component and integrate it into BadgeRow and ToolsDropdown

* chore: import order

* chore: remove agent specific logic that would block functionality for streamlined endpoints

* chore: linting for `createContextHandlers`

* chore: ensure ToolsDropdown doesn't show up for agents

* chore: ensure tool resource is selected when dragged to UI

* chore: update file search behavior to simulate legacy functionality

* feat: ToolDialogs with multiple trigger references, add settings to tool dropdown

* refactor: simplify web search and code interpreter settings checks

* chore: simplify local storage key for pinned state in useToolToggle

* refactor: reinstate agent check in AttachFileChat component, as individual providers will ahve different file configurations

* ci: increase timeout for MongoDB connection in Agent tests
2025-06-23 09:59:05 -04:00
Danny Avila
66093b1eb3
💬 refactor: MCP Chat Visibility Option, Google Rates, Remove OpenAPI Plugins (#7286)
* fix: Update Gemini 2.5 Pro Preview Model Name in Token Values

* refactor: Update DeleteButton to close menu when deletion is successful

* refactor: Add unmountOnHide prop to DropdownPopup in multiple components

* chore: linting

* chore: linting

* feat: Add `chatMenu` option for MCP Servers to control visibility in MCPSelect dropdown

* refactor: Update loadManifestTools to return combined tool manifest with MCP tools first

* chore: remove deprecated openapi plugins

* chore: linting

* chore(AgentClient): linting, remove unnecessary `checkVisionRequest` logger

* refactor(AuthService): change logoutUser logging from error to debug level

* chore: new Gemini models token values and rates

* chore(AskController): linting
2025-05-08 12:12:36 -04:00
Danny Avila
37964975c1
🤖 refactor: Improve Agents Memory Usage, Bump Keyv, Grok 3 (#6850)
* chore: remove unused redis file

* chore: bump keyv dependencies, and update related imports

* refactor: Implement IoRedis client for rate limiting across middleware, as node-redis via keyv not compatible

* fix: Set max listeners to expected amount

* WIP: memory improvements

* refactor: Simplify getAbortData assignment in createAbortController

* refactor: Update getAbortData to use WeakRef for content management

* WIP: memory improvements in agent chat requests

* refactor: Enhance memory management with finalization registry and cleanup functions

* refactor: Simplify domainParser calls by removing unnecessary request parameter

* refactor: Update parameter types for action tools and agent loading functions to use minimal configs

* refactor: Simplify domainParser tests by removing unnecessary request parameter

* refactor: Simplify domainParser call by removing unnecessary request parameter

* refactor: Enhance client disposal by nullifying additional properties to improve memory management

* refactor: Improve title generation by adding abort controller and timeout handling, consolidate request cleanup

* refactor: Update checkIdleConnections to skip current user when checking for idle connections if passed

* refactor: Update createMCPTool to derive userId from config and handle abort signals

* refactor: Introduce createTokenCounter function and update tokenCounter usage; enhance disposeClient to reset Graph values

* refactor: Update getMCPManager to accept userId parameter for improved idle connection handling

* refactor: Extract logToolError function for improved error handling in AgentClient

* refactor: Update disposeClient to clear handlerRegistry and graphRunnable references in client.run

* refactor: Extract createHandleNewToken function to streamline token handling in initializeClient

* chore: bump @librechat/agents

* refactor: Improve timeout handling in addTitle function for better error management

* refactor: Introduce createFetch instead of using class method

* refactor: Enhance client disposal and request data handling in AskController and EditController

* refactor: Update import statements for AnthropicClient and OpenAIClient to use specific paths

* refactor: Use WeakRef for response handling in SplitStreamHandler to prevent memory leaks

* refactor: Simplify client disposal and rename getReqData to processReqData in AskController and EditController

* refactor: Improve logging structure and parameter handling in OpenAIClient

* refactor: Remove unused GraphEvents and improve stream event handling in AnthropicClient and OpenAIClient

* refactor: Simplify client initialization in AskController and EditController

* refactor: Remove unused mock functions and implement in-memory store for KeyvMongo

* chore: Update dependencies in package-lock.json to latest versions

* refactor: Await token usage recording in OpenAIClient to ensure proper async handling

* refactor: Remove handleAbort route from multiple endpoints and enhance client disposal logic

* refactor: Enhance abort controller logic by managing abortKey more effectively

* refactor: Add newConversation handling in useEventHandlers for improved conversation management

* fix: dropparams

* refactor: Use optional chaining for safer access to request properties in BaseClient

* refactor: Move client disposal and request data processing logic to cleanup module for better organization

* refactor: Remove aborted request check from addTitle function for cleaner logic

* feat: Add Grok 3 model pricing and update tests for new models

* chore: Remove trace warnings and inspect flags from backend start script used for debugging

* refactor: Replace user identifier handling with userId for consistency across controllers, use UserId in clientRegistry

* refactor: Enhance client disposal logic to prevent memory leaks by clearing additional references

* chore: Update @librechat/agents to version 2.4.14 in package.json and package-lock.json
2025-04-12 18:46:36 -04:00
Danny Avila
910c73359b
🔦 feat: MCP Support for Non-Agent Endpoints (#6775)
* wip: mcp select

* refactor: Update useAvailableToolsQuery to support generic data types

* feat: Enhance MCPSelect to dynamically load server options and improve MultiSelect component styling

* WIP: ephemeral agents

* wip: Add null check for MCPSelect and improve MultiSelect focus handling

* feat: Pass conversationId prop to MCPSelect in BadgeRow to optimize badge rendering

* feat: useApplyNewAgentTemplate hook to manage ephemeral agent upon conversation creation

* WIP: eph. agent payload

* refactor(OpenAIClient): streamline message processing by replacing content handling with parseTextParts function

* feat: enhance applyAgentTemplate function to accept source conversation ID for improved template application

* feat(parsers): add skipReasoning parameter to parseTextParts for conditional reasoning handling

* WIP: first pass, ephemeral agent backend processing

* chore: import order

* feat: update loadEphemeralAgent and loadAgent functions to accept model_parameters for enhanced agent configuration

* feat: add showMCPServers prop to BadgeRow for conditional rendering of MCPSelect, fix react rule violation

* feat: enhance MCPSelect with localized placeholder and custom icon, add renderSelectedValues callback

* feat: simplify message processing in AnthropicClient by replacing content handling with parseTextParts function

* feat: implement useLocalStorage hook for managing MCP values and update MCPSelect to utilize it

* chore: remove chatGPTBrowserSchema from endpoint schemas and update types for improved schema management

* chore: remove compactChatGPTSchema from endpoint schemas and update types for better schema management

* refactor: rename schemas for clarity and improve schema management

* feat: extend model detection to include 'codestral' alongside 'mistral'

* feat: add endpointType parameter to buildOptions and initializeClient functions

* fix: update condition for handling completion in BaseClient to include agents client

* refactor: simplify payload parsing logic in AgentClient and remove unused providerParsers

* refactor: change useSetRecoilState to useRecoilState for better state management in MCPSelect component

* refactor: streamline chat route handlers by consolidating middleware and improving endpoint structure

* style: update MCPSelect and MultiSelect components for improved layout in mobile view

* v0.7.790

* feat: add getMessageMapMethod to process message text and content in GoogleClient

* chore: include LAST_MCP_ key prefix in clearLocalStorage function for proper teardown on logout
2025-04-07 19:16:56 -04:00
Marco Beretta
7f29f2f676
🎨 feat: UI Refresh for Enhanced UX (#6346)
*  feat: Add Expand Chat functionality and improve UI components

*  feat: Introduce Chat Badges feature with editing capabilities and UI enhancements

*  feat: re-implement file attachment functionality with new components and improved UI

*  feat: Enhance BadgeRow component with drag-and-drop functionality and add animations for better user experience

*  feat: Add useChatBadges hook and enhance Badge component with animations and toggle functionality

* feat: Improve Add/Delete Badges + style and bug fixes

*  feat: Refactor EditBadges component and optimize useChatBadges hook for improved performance and readability

*  feat: Add type definition for LucideIcon in EditBadges component

* refactor: Clean up BadgeRow component by removing outdated comment and improving code readability

* refactor: Rename app-icon class to badge-icon for consistency and improve badge styling

* feat: Add Center Chat Input toggle and update related components for improved UI/UX

* refactor: Simplify ChatView and MessagesView components for improved readability and performance

* refactor: Improve layout and positioning of scroll button in MessagesView component

* refactor: Adjust scroll button position in MessagesView component for better visibility

* refactor: Remove redundant background class from Badge component for cleaner styling

* feat: disable chat badges

* refactor: adjust positioning of scroll button and popover for improved layout

* refactor: simplify class names in ChatForm and RemoveFile components for cleaner code

* refactor: move Switcher to HeaderOptions from SidePanel

* fix(Landing): duplicate description

* feat: add SplitText component for animated text display and update Landing component to use it

* feat(Chat): add ConversationStarters component and integrate it into ChatView; remove ConvoStarter component

* feat(Chat): enhance Message component layout and styling for improved readability

* feat(ControlCombobox, Select): enhance styling and add animation for improved UI experience

* feat(Chat): update Header and HeaderNewChat components for improved layout and styling

* feat(Chat): add ModelDropdown (now includes both endpoint and model) and refactor Menu components for improved UI

* feat(ModelDropdown): add Agent Select; removed old AgentSwitcher components

* feat(ModelDropdown): add settings button for user key configuration

* fix(ModelDropdown): the model dropdown wasn't opening automatically when opening the endpoint one

* refactor(Chat): remove unused EndpointsMenu and related components to streamline codebase

* feat: enhance greeting message and improve accessibility fro ModelDropdown

* refactor(Endpoints): add new hooks and components for endpoint management

* feat(Endpoint): add support for modelSpecs

* feat(Endpoints): add mobile support

* fix: type issues

* fix(modelSpec): type issue

* fix(EndpointMenuDropdown): double overflow scroller in mobile model list

* fix: search model on mobile

* refactor: Endpoint/Model/modelSpec dropdown

* refactor: reorganize imports in Endpoint components

* refactor: remove unused translation keys from English locale

* BREAKING: moving to ariakit with new CustomMenu

* refactor: remove unnecessary comments

* refactor: remove EndpointItem, ModelDropdownButton, SpecIcon, and SpecItem components

* 🔧 fix: AI Icon bump when regenerating message

* wip: chat UI refactoring, fix issues

* chore: add recent update to useAutoSave

* feat: add access control for agent permissions in useMentions hook

* refactor: streamline ModelSelector by removing unused endpoints logic

* refactor: enhance ModelSelector and context by integrating endpointsConfig and improving type usage

* feat: update ModelSelectorContext to utilize conversation data for initial state

* feat: add selector effects for synced endpoint handling

* feat: add guard clause for conversation endpoint in useSelectorEffects hook

* fix: safely call onSelectMention and add autofocus to mention input

* chore: typing

* refactor: ModelSelector to streamline key dialog handling and improve endpoint rendering

* refactor: extract SettingsButton component for cleaner endpoint item rendering

* wip: first pass, expand set api key

* wip: first pass, expanding set key

* refactor: update EndpointItem styles for improved layout and hover effects

* refactor: adjust padding in EndpointItem for improved layout consistency

* refactor: update preset structure in useSelectMention to include spec as null

* refactor: rename setKeyDialogOpen to onOpenChange for clarity and consistency, bring focus back to button that opened dialog

* feat: add SpecIcon component for dynamic model spec icons in menu, adjust icon styling

* refactor: update getSelectedIcon to accept additional parameters and improve icon rendering logic

* fix: adjust padding in MessageRender for improved layout

* refactor: remove inline style for menu width in CustomMenu component

* refactor: enhance layout and styling in ModelSpecItem component for better responsiveness

* refactor: update getDefaultModelSpec to accept startupConfig and improve model spec retrieval logic

* refactor: improve key management and default values in ModelSelector and related components

* refactor: adjust menu width and improve responsiveness in CustomMenu and EndpointItem components

* refactor: enhance focus styles and responsiveness in EndpointItem component

* refactor: improve layout and spacing in Header and ModelSelector components for better responsiveness

* refactor: adjust button styles for consistency and improved layout in AddMultiConvo and PresetsMenu components

* fix: initial fix of assistant names

* fix: assistants handling

* chore: update version of librechat-data-provider to 0.7.75 and add 'spec' to excludedKeys

* fix: improve endpoint filtering logic based on interface configuration and access rights

* fix: remove unused HeaderOptions import and set spec to null in presets and mentions

* fix: ensure currentExample is always an object when updating examples

* fix: update interfaceConfig checks to ensure modelSelect is considered for rendering components

* fix: update model selection logic to consider interface configuration when prioritizing model specs

* fix: add missing localizations

* fix: remove unused agent and assistant selection translations

* fix: implement debounced state updates for selected values in useSelectorEffects

* style: minor style changes related to the ModelSelector

* fix: adjust maximum height for popover and set fixed height for model item

* fix: update placeholders for model and endpoint search inputs

* fix: refactor MessageRender and ContentRender components to better match each other

* fix: remove convo fallback for iconURL in MessageRender and ContentRender components

* fix: update handling of spec, iconURL, and modelLabel in conversation presets, to allow better interchangeability

* fix: replace chatGptLabel with modelLabel in OpenAI settings configuration (fully deprecate chatGptLabel)

* fix: remove console log for assistantNames in useEndpoints hook

* refactor: add cleanInput and cleanOutput options to default conversation handling

* chore: update bun.lockb

* fix: set default value for showIconInHeader in getSelectedIcon function

* refactor: enhance error handling in message processing when latest message has existing content blocks

* chore: allow import/no-cycle for messages

* fix: adjust flex properties in BookmarkMenu for better layout

* feat: support both 'prompt' and 'q' as query parameters in useQueryParams hook

* feat: re-enable Badges components

* refactor: disable edit badge component

* chore: rename assistantMap to assistantsMap for consistency

* chore: rename assistantMap to assistantsMap for consistency in Mention component

* feat: set staleTime for various queries to improve data freshness

* feat: add spec field to tQueryParamsSchema for model specification

* feat: enhance useQueryParams to handle model specs

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2025-03-25 18:50:58 -04:00
Danny Avila
842b68fc32
🏗️ fix: Agents Token Spend Race Conditions, Add Auto-refill Tx, Add Relevant Tests (#6480)
* 🏗️ refactor: Improve spendTokens logic to handle zero completion tokens and enhance test coverage

* 🏗️ test: Add tests to ensure balance does not go below zero when spending tokens

* 🏗️ fix: Ensure proper continuation in AgentClient when handling errors

* fix: spend token race conditions

* 🏗️ test: Add test for handling multiple concurrent transactions with high balance

* fix: Handle Omni models prompt prefix handling for user messages with array content in OpenAIClient

* refactor: Update checkBalance import paths to use new balanceMethods module

* refactor: Update checkBalance imports and implement updateBalance function for atomic balance updates

* fix: import from replace method

* feat: Add createAutoRefillTransaction method to handle non-balance updating transactions

* refactor: Move auto-refill logic to balanceMethods and enhance checkBalance functionality

* feat: Implement logging for auto-refill transactions in balance checks

* refactor: Remove logRefill calls from multiple client and handler files

* refactor: Move balance checking and auto-refill logic to balanceMethods for improved structure

* refactor: Simplify balance check calls by removing unnecessary balanceRecord assignments

* fix: Prevent negative rawAmount in spendTokens when promptTokens is zero

* fix: Update balanceMethods to use Balance model for findOneAndUpdate

* chore: import order

* refactor: remove unused txMethods file to streamline codebase

* feat: enhance updateBalance and createAutoRefillTransaction methods to support additional parameters for improved balance management
2025-03-22 17:54:25 -04:00
Ruben Talstra
3a62a2633d
💵 feat: Add Automatic Balance Refill (#6452)
* 🚀 feat: Add automatic refill settings to balance schema

* 🚀 feat: Refactor balance feature to use global interface configuration

* 🚀 feat: Implement auto-refill functionality for balance management

* 🚀 feat: Enhance auto-refill logic and configuration for balance management

* 🚀 chore: Bump version to 0.7.74 in package.json and package-lock.json

* 🚀 chore: Bump version to 0.0.5 in package.json and package-lock.json

* 🚀 docs: Update comment for balance settings in librechat.example.yaml

* chore: space in `.env.example`

* 🚀 feat: Implement balance configuration loading and refactor related components

* 🚀 test: Refactor tests to use custom config for balance feature

* 🚀 fix: Update balance response handling in Transaction.js to use Balance model

* 🚀 test: Update AppService tests to include balance configuration in mock setup

* 🚀 test: Enhance AppService tests with complete balance configuration scenarios

* 🚀 refactor: Rename balanceConfig to balance and update related tests for clarity

* 🚀 refactor: Remove loadDefaultBalance and update balance handling in AppService

* 🚀 test: Update AppService tests to reflect new balance structure and defaults

* 🚀 test: Mock getCustomConfig in BaseClient tests to control balance configuration

* 🚀 test: Add get method to mockCache in OpenAIClient tests for improved cache handling

* 🚀 test: Mock getCustomConfig in OpenAIClient tests to control balance configuration

* 🚀 test: Remove mock for getCustomConfig in OpenAIClient tests to streamline configuration handling

* 🚀 fix: Update balance configuration reference in config.js for consistency

* refactor: Add getBalanceConfig function to retrieve balance configuration

* chore: Comment out example balance settings in librechat.example.yaml

* refactor: Replace getCustomConfig with getBalanceConfig for balance handling

* fix: tests

* refactor: Replace getBalanceConfig call with balance from request locals

* refactor: Update balance handling to use environment variables for configuration

* refactor: Replace getBalanceConfig calls with balance from request locals

* refactor: Simplify balance configuration logic in getBalanceConfig

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2025-03-21 17:48:11 -04:00
Danny Avila
efb616d600
🔧 fix: Update Token Calculations/Mapping, MCP env Initialization (#6406)
* fix: Enhance MCP initialization to process environment variables

* fix: only build tokenCountMap with messages that are being used in the payload

* fix: Adjust maxContextTokens calculation to account for maxOutputTokens

* refactor: Make processMCPEnv optional in MCPManager initialization

* chore: Bump version of librechat-data-provider to 0.7.73
2025-03-18 23:16:45 -04:00
Danny Avila
d6a17784dc
🔗 feat: Agent Chain (Mixture-of-Agents) (#6374)
* wip: first pass, dropdown for selecting sequential agents

* refactor: Improve agent selection logic and enhance performance in SequentialAgents component

* wip: seq. agents working ideas

* wip: sequential agents style change

* refactor: move agent form options/submission outside of AgentConfig

* refactor: prevent repeating code

* refactor: simplify current agent display in SequentialAgents component

* feat: persist  form value handling in AgentSelect component for agent_ids

* feat: first pass, sequential agnets agent update

* feat: enhance message display with agent updates and empty text handling

* chore: update Icon component to use EModelEndpoint for agent endpoints

* feat: update content type checks in BaseClient to use constants for better readability

* feat: adjust max context tokens calculation to use 90% of the model's max tokens

* feat: first pass, agent run message pruning

* chore: increase max listeners for abort controller to prevent memory leaks

* feat: enhance runAgent function to include current index count map for improved token tracking

* chore: update @librechat/agents dependency to version 2.2.5

* feat: update icons and style of SequentialAgents component for improved UI consistency

* feat: add AdvancedButton and AdvancedPanel components for enhanced agent settings navigation, update styling for agent form

* chore: adjust minimum height of AdvancedPanel component for better layout consistency

* chore: update @librechat/agents dependency to version 2.2.6

* feat: enhance message formatting by incorporating tool set into agent message processing, in order to allow better mix/matching of agents (as tool calls for tools not found in set will be stringified)

* refactor: reorder components in AgentConfig for improved readability and maintainability

* refactor: enhance layout of AgentUpdate component for improved visual structure

* feat: add DeepSeek provider to Bedrock settings and schemas

* feat: enhance link styling in mobile.css for better visibility and accessibility

* fix: update banner model import in update banner script; export Banner model

* refactor: `duplicateAgentHandler` to include tool_resources only for OCR context files

* feat: add 'qwen-vl' to visionModels for enhanced model support

* fix: change image format from JPEG to PNG in DALLE3 response

* feat: reorganize Advanced components and add localizations

* refactor: simplify JSX structure in AgentChain component to defer container styling to parent

* feat: add FormInput component for reusable input handling

* feat: make agent recursion limit configurable from builder

* feat: add support for agent capabilities chain in AdvancedPanel and update data-provider version

* feat: add maxRecursionLimit configuration for agents and update related documentation

* fix: update CONFIG_VERSION to 1.2.3 in data provider configuration

* feat: replace recursion limit input with MaxAgentSteps component and enhance input handling

* feat: enhance AgentChain component with hover card for additional information and update related labels

* fix: pass request and response objects to `createActionTool` when using assistant actions to prevent auth error

* feat: update AgentChain component layout to include agent count display

* feat: increase default max listeners and implement capability check function for agent chain

* fix: update link styles in mobile.css for better visibility in dark mode

* chore: temp. remove agents package while bumping shared packages

* chore: update @langchain/google-genai package to version 0.1.11

* chore: update @langchain/google-vertexai package to version 0.2.2

* chore: add @librechat/agents package at version 2.2.8

* feat: add deepseek.r1 model with token rate and context values for bedrock
2025-03-17 16:43:44 -04:00
Danny Avila
ded3cd8876
🔍 feat: Mistral OCR API / Upload Files as Text (#6274)
* refactor: move `loadAuthValues` to `~/services/Tools/credentials`

* feat: add createAxiosInstance function to configure axios with proxy support

* WIP: First pass mistral ocr

* refactor: replace getConvoFiles with getToolFiles for improved file retrieval logic

* refactor: improve document formatting in encodeAndFormat function

* refactor: remove unused resendFiles parameter from buildOptions function (this option comes from the agent config)

* fix: update getFiles call to include files with `text` property as well

* refactor: move file handling to `initializeAgentOptions`

* refactor: enhance addImageURLs method to handle OCR text and improve message formatting

* refactor: update message formatting to handle OCR text in various content types

* refactor: remove unused resendFiles property from compactAgentsSchema

* fix: add error handling for Mistral OCR document upload and logging

* refactor: integrate OCR capability into file upload options and configuration

* refactor: skip processing for text source files in delete request, as they are directly tied to database

* feat: add metadata field to ExtendedFile type and update PanelColumns and PanelTable components for localization and metadata handling

* fix: source icon styling

* wip: first pass, frontend file context agent resources

* refactor: add hover card with contextual information for File Context (OCR) in FileContext component

* feat: enhance file processing by integrating file retrieval for OCR resources in agent initialization

* feat: implement OCR config; fix: agent resource deletion for ocr files

* feat: enhance agent initialization by adding OCR capability check in resource priming

* ci: fix `~/config` module mock

* ci: add OCR property expectation in AppService tests

* refactor: simplify OCR config loading by removing environment variable extraction, to be done when OCR is actually performed

* ci: add unit test to ensure environment variable references are not parsed in OCR config

* refactor: disable base64 image inclusion in OCR request

* refactor: enhance OCR configuration handling by validating environment variables and providing defaults

* refactor: use file stream from disk for mistral ocr api
2025-03-10 17:23:46 -04:00
Danny Avila
be280004cf
🔧 refactor: Improve Params Handling, Remove Legacy Items, & Update Configs (#6074)
* chore: include all assets for service worker, remove unused tsconfig.node.json, eslint ignore vite config

* chore: exclude image files from service worker caching

* refactor: simplify googleSchema transformation and error handling

* fix: max output tokens cap for 3.7 models

* fix: skip index fixing in CI, development, and test environments

* ci: add maxOutputTokens handling tests for Claude models

* refactor: drop top_k and top_p parameters for claude-3.7 in AnthropicClient and add tests for new behavior

* refactor: conditionally include top_k and top_p parameters for non-claude-3.7 models

* ci: add unit tests for getLLMConfig function with various model options

* chore: remove all OPENROUTER_API_KEY legacy logic

* refactor: optimize stream chunk handling

* feat: reset model parameters button

* refactor: remove unused examples field from convoSchema and presetSchema

* chore: update librechat-data-provider version to 0.7.6993

* refactor: move excludedKeys set to data-provider for better reusability

* feat: enhance saveMessageToDatabase to handle unset fields and fetched conversation state

* feat: add 'iconURL' and 'greeting' to excludedKeys in data provider config

* fix: add optional chaining to user ID retrieval in getConvo call
2025-02-26 15:02:03 -05:00
Danny Avila
63afb317c6
🚀 fix: Resolve Google Client Issues, CDN Screenshots, Update Models (#5703)
* 🤖 refactor: streamline model selection logic for title model in GoogleClient

* refactor: add options for empty object schemas in convertJsonSchemaToZod

* refactor: add utility function to check for empty object schemas in convertJsonSchemaToZod

* fix: Google MCP Tool errors, and remove Object Unescaping as Google fixed this

* fix: google safetySettings

* feat: add safety settings exclusion via GOOGLE_EXCLUDE_SAFETY_SETTINGS environment variable

* fix: rename environment variable for console JSON string length

* fix: disable portal for dropdown in ExportModal component

* fix: screenshot functionality to use image placeholder for remote images

* feat: add visionMode property to BaseClient and initialize in GoogleClient to fix resendFiles issue

* fix: enhance formatMessages to include image URLs in message content for Vertex AI

* fix: safety settings for titleChatCompletion

* fix: remove deprecated model assignment in GoogleClient and streamline title model retrieval

* fix: remove unused image preloading logic in ScreenshotContext

* chore: update default google models to latest models shared by vertex ai and gen ai

* refactor: enhance Google error messaging

* fix: update token values and model limits for Gemini models

* ci: fix model matching

* chore: bump version of librechat-data-provider to 0.7.699
2025-02-06 18:13:18 -05:00
Danny Avila
591a019766
🏄‍♂️ refactor: Optimize Reasoning UI & Token Streaming (#5546)
*  feat: Implement Show Thinking feature; refactor: testing thinking render optimizations

*  feat: Refactor Thinking component styles and enhance Markdown rendering

* chore: add back removed code, revert type changes

* chore: Add back resetCounter effect to Markdown component for improved code block indexing

* chore: bump @librechat/agents and google langchain packages

* WIP: reasoning type updates

* WIP: first pass, reasoning content blocks

* chore: revert code

* chore: bump @librechat/agents

* refactor: optimize reasoning tag handling

* style: ul indent padding

* feat: add Reasoning component to handle reasoning display

* feat: first pass, content reasoning part styling

* refactor: add content placeholder for endpoints using new stream handler

* refactor: only cache messages when requesting stream audio

* fix: circular dep.

* fix: add default param

* refactor: tts, only request after message stream, fix chrome autoplay

* style: update label for submitting state and add localization for 'Thinking...'

* fix: improve global audio pause logic and reset active run ID

* fix: handle artifact edge cases

* fix: remove unnecessary console log from artifact update test

* feat: add support for continued message handling with new streaming method

---------

Co-authored-by: Marco Beretta <81851188+berry-13@users.noreply.github.com>
2025-01-29 19:46:58 -05:00
Danny Avila
4110209494
♻️ fix: Prevent Instructions from Removal when nearing Max Context (#5516)
* refactor: getMessagesWithinTokenLimit to accept params object

* refactor: always include instructions in payload if provided

* ci: remove obsolete test

* refactor: update logoutUser to accept request object and handle session destruction

* test: enhance getMessagesWithinTokenLimit tests for instruction handling
2025-01-27 20:37:38 -05:00
Danny Avila
d6b4d83b68
🔥 feat: deepseek-reasoner Thought Streaming (#5379)
* 🔧 refactor: Remove unused penalties and enhance reasoning token handling in OpenAIClient

* 🔧 refactor: `addInstructions` default to adding instructions at index 0, flag for legacy behavior

* chore: remove long placeholder

* chore: update localization strings across multiple languages

* ci: adjust tests for new `addInstructions` behavior
2025-01-20 18:21:18 -05:00