Commit graph

12 commits

Author SHA1 Message Date
Danny Avila
6c6c72def7
🚀 feat: Decouple File Attachment Persistence from Preview Rendering (#12957)
* 🗂️ feat: add `status` lifecycle to file records for two-phase previews

Schema and model foundation for decoupling the agent's final response
from CPU-heavy office-format HTML extraction.

- `MongoFile.status: 'pending' | 'ready' | 'failed'` (indexed) and
  `previewError?: string` mirror the lifecycle: phase-1 emits the file
  record at `pending` so the response is unblocked; phase-2 transitions
  to `ready` (with text/textFormat) or `failed` (with previewError) in
  the background. Absent for legacy records — clients treat that as
  `ready` for back-compat.
- Mirror types added to `TFile` in data-provider so frontend cache
  consumers see the new fields.
- New `sweepOrphanedPreviews(maxAgeMs)` method on the file model
  recovers stale `pending` records left behind by a process restart
  mid-extraction; transitions them to `failed` with
  `previewError: 'orphaned'`. Cheap because `status` is indexed.

*  feat: two-phase code-execution preview flow (unblocks final response)

The agent's final response no longer waits on CPU-heavy office HTML
extraction. Phase-1 (download + storage save + DB record at
`status: 'pending'`) is awaited as before; phase-2 (extract +
`updateFile`) runs in the background with a hard 60s ceiling.

Three flows, all funneling through `processCodeOutput` and updated to
the new `{ file, finalize? }` return shape:

- `callbacks.js` (chat-completions + Open Responses streaming): emit
  the phase-1 attachment immediately (carries `status: 'pending'` for
  office buckets so the UI shows "preparing preview…"), then
  fire-and-forget `finalize()`. If the SSE stream is still open when
  phase-2 lands, push an `attachment` update event with the same
  `file_id` so the client merges over the placeholder in place.

- `tools.js` direct endpoint: same split — return the phase-1
  metadata immediately, run extraction in the background. Client
  polls for the resolved record.

`finalize()` wraps the existing 12s per-render timeout in a 60s outer
`withTimeout`. The HTML-or-null contract from #12934 is preserved:
office types that fail extraction transition to `status: 'failed'`
with `previewError: 'parser-error' | 'timeout'` rather than falling
back to plain text (would be an XSS vector).

Promises continue running after the HTTP response closes (Node
doesn't kill them). The boot-time orphan sweep covers the only case
that loses progress — actual process restart mid-extraction.

`primeFiles` annotates the agent's `toolContext` line for prior-turn
files: `(preview not yet generated)` for pending, `(preview
unavailable: <reason>)` for failed. The model can volunteer "you can
still download it" instead of pretending the preview is fine.

`hasOfficeHtmlPath` exported from `@librechat/api` so `processCodeOutput`
can decide whether a file expects a preview at all.

* 🔍 feat: `GET /api/files/:file_id/preview` endpoint and boot orphan sweep

- New `GET /api/files/:file_id/preview` route returns
  `{ status, text?, textFormat?, previewError? }`. The frontend's
  `useFilePreview` React Query hook polls this while phase-2 is in
  flight, then auto-stops on terminal status. ACL identical to the
  download route (reuses `fileAccess` middleware). Defaults `status`
  to `'ready'` for legacy records so back-compat is implicit.
  `text` only included when `status === 'ready'` and non-null —
  preserves the HTML-or-null security contract from #12934.

- `sweepOrphanedPreviews()` invoked on boot in both `server/index.js`
  and `server/experimental.js`. Recovers any `pending` records left
  behind by a process restart mid-extraction (the only case the
  in-process two-phase flow can't handle on its own). Fire-and-forget
  so a transient sweep failure doesn't block startup.

* 🖥️ feat: frontend two-phase preview consumer (polling + UI states)

Wires the React side to the new lifecycle so the user sees what's
happening with their file while phase-2 extraction runs in the
background and after the response stream closes.

- `useAttachmentHandler` upserts by `file_id` (was append-only) so
  the phase-2 SSE update event merges over the pending placeholder
  in place. Lightweight attachments without a `file_id`
  (web_search / file_search citations) keep the legacy append path.

- `useFilePreview(file_id)` React Query hook with
  `refetchInterval: (data) => data?.status === 'pending' ? 2500 : false`
  so polling auto-stops on the first terminal response without the
  caller having to flip `enabled`.

- `useAttachmentPreviewSync(attachment)` bridges polled data into
  `messageAttachmentsMap`. Polling enabled iff
  `status === 'pending' && isAnySubmitting` — per the design ask:
  active polling while the LLM is still generating, then quiet.
  Process-restart and post-stream cases are covered by polling on
  the next interaction.

- `Attachment.tsx` renders a small `PreviewStatusIndicator` (spinner +
  "Preparing preview…" for pending, alert icon + "Preview unavailable"
  for failed) inside `FileAttachment`. Download button stays fully
  functional in both states. Two new English locale keys.

- Data-provider scaffolding: `TFilePreview` type, `endpoints.filePreview`,
  `dataService.getFilePreview`, `QueryKeys.filePreview`.

* 🧪 fix: stub `useAttachmentPreviewSync` in pre-existing Attachment test mocks

The new `useAttachmentPreviewSync` hook is called unconditionally inside
`FileAttachment` (added in the prior commit). Two pre-existing test
files mock `~/hooks` to provide `useLocalize` only — the un-mocked
preview hook reference resolved to undefined and crashed render with
`(0 , _hooks.useAttachmentPreviewSync) is not a function` on the
Ubuntu/Windows CI runners.

Fix is local to the test mocks: add a no-op stub that returns
`{ status: 'ready' }` so the component renders the legacy chip path.
The two-phase preview behavior itself has its own dedicated suites
(`useAttachmentHandler.spec.tsx`, `useAttachmentPreviewSync.spec.tsx`).

* 🐛 fix: route phase-2 attachment update to current-run messageId

Codex P1 review on PR #12957. `processCodeOutput` intentionally
preserves the original DB `messageId` across cross-turn filename reuse
so `getCodeGeneratedFiles` can still trace a file back to the
assistant message that originally produced it. The phase-1 SSE emit
already routes by the current run's messageId — `processCodeOutput`
runtime-overlays it via `Object.assign(file, { messageId, toolCallId })`
and the callback writes `result.file` directly.

Phase-2 was passing the raw `updateFile` return through
`attachmentFromFileMetadata`, which read `messageId` straight off the
DB record. On a turn-N run that re-emitted a filename from turn-1
(e.g. agent writes `output.csv` again), the phase-2 SSE update
routed to `turn-1-msg` instead of `turn-N-msg`. Frontend's
`useAttachmentHandler` upserts under the wrong messageAttachmentsMap
slot — turn-N's pending chip stays stuck at "preparing preview…"
while turn-1's already-resolved attachment gets re-merged.

Fix: thread `runtimeMessageId` through `attachmentFromFileMetadata`
and pass `metadata.run_id` from the phase-2 emit site. Mirrors how
phase-1 sources its messageId. Tests cover the cross-turn reuse case
plus the writableEnded / null-finalize / no-finalize paths to lock
in the broader phase-2 emit contract.

* 🛠️ refactor: address codex audit findings (wire-shape parity, DRY, defensive catch)

Comprehensive audit on PR #12957. Resolves all valid findings:

- **MAJOR #1 — Wire-shape parity**: phase-1 ships the full `fileMetadata`
  record over SSE; phase-2 was using a tight `attachmentFromFileMetadata`
  projection. Drop the projection and have phase-2 spread `{...updated,
  messageId, toolCallId}` so both events match the long-standing
  legacy phase-1 shape clients depend on.

- **MAJOR #2 — DRY**: extract `runPhase2Finalize({ finalize, fileId,
  onResolved })` into `process.js` (alongside `processCodeOutput` whose
  contract it pairs with). Both `callbacks.js` paths and `tools.js`
  now flow through it. Single catch path eliminates divergence
  surface — the fix landed in 01704d4f0 (cross-turn messageId routing)
  was a symptom of this duplication risk.

- **MINOR #3 — JSDoc accuracy**: `finalizePreview`'s buffer is bounded
  by `fileSizeLimit`, not the 1MB extractor cap. Updated and added a
  note about peak heap from queued buffers.

- **MINOR #4 — Defensive catch**: `runPhase2Finalize`'s catch attempts
  a best-effort `updateFile({ status: 'failed', previewError:
  'unexpected' })` for the file_id, so a programming bug in
  `finalizePreview` doesn't leave the record stuck `'pending'` until
  the next boot-time orphan sweep.

- **NIT #6 — Stale PR refs**: 12952 → 12957 in 3 places.

- **NIT #7 — Schema bound**: `previewError` capped at `maxlength: 200`
  to prevent a future codepath from accidentally persisting a stack
  trace.

Skipped per audit verdict (non-blocking):
- #5 (memory pressure): documented in JSDoc; impl change was reviewer's
  "consider", not actionable.
- #8 (double DB query per poll): low cost, indexed by_id, polling is
  gated narrow.
- #9 (TAttachment cast): the union type is intentional; the casts are
  safe widening, refactoring TAttachment is invasive and out of scope.

Tests: 11 new (7 `runPhase2Finalize` unit tests covering happy path,
null-finalize, throws, double-fail, no-fileId, no-onResolved; +4
wire-shape parity assertions in the existing cross-turn test). 328
backend tests pass; 528 frontend tests pass; lint and typecheck clean.

* 🛡️ refactor: address codex P1+P2 + rename to drop phase-1/2 jargon

Codex round 2 review on PR #12957 caught two race conditions and one
recovery gap, all triggered by cross-turn filename reuse (`claimCodeFile`
intentionally returns the same `file_id` for the same
`(filename, conversationId)` across turns). Plus naming cleanup the
user requested — internal "phase 1 / phase 2" vocabulary leaks across
sprints, replace it everywhere with terms describing what's actually
happening.

P1 — stale render overwrites newer revision (process.js)
  Two turns reusing `output.csv` share a `file_id`. If turn-1's
  background render resolves AFTER turn-2's persist step, the
  unconditional `updateFile` writes turn-1's stale text/status over
  turn-2's pending placeholder. Fix: stamp a fresh `previewRevision`
  UUID on every emit, thread it through `finalizePreview`, and make
  the commit conditional via a new optional `extraFilter` argument
  on `updateFile` (`{ previewRevision: <expected> }`). The defensive
  `updateFile` in `runPreviewFinalize`'s catch uses the same guard
  so a programming error from an older render also can't override a
  newer turn.

P1 — stale React Query cache on pending remount (queries.ts)
  Same root cause from the frontend side. Cache key
  `[QueryKeys.filePreview, file_id]` may hold a prior turn's `'ready'`
  payload; with `refetchOnMount: false` and the polling gate on
  `pending`, polling never starts for the new placeholder. Fix:
  `useAttachmentHandler` invalidates that query whenever an attachment
  with a `file_id` arrives. Both initial-emit and update events
  trigger invalidation — uniform gate.

P2 — quick-restart orphans skipped by boot sweep (files.js)
  Boot `sweepOrphanedPreviews` uses a 5-min cutoff for multi-instance
  safety. A crash + restart inside the cutoff leaves `pending` records
  that never get touched again. Fix: lazy sweep inside the preview
  endpoint — if a polled record is `pending` and `updatedAt` is older
  than 5 min, mark it `failed:orphaned` on the spot before responding.
  Conditional on the same `updatedAt` we observed so a concurrent
  legitimate update wins. Cheap, bounded by user activity.

Naming cleanup
  - `runPhase2Finalize` → `runPreviewFinalize`
  - `PHASE_TWO_TIMEOUT_MS` → `PREVIEW_FINALIZE_TIMEOUT_MS`
  - All `phase-1` / `phase-2` / `two-phase` prose replaced with
    "the immediate emit", "the deferred render", "the persist step",
    "the deferred preview", etc. Skill-feature `phase 1/2` references
    (different feature) left alone.

Tests: 10 new (4 lazy-sweep × preview endpoint, 3 cache-invalidation ×
useAttachmentHandler, 3 extraFilter × updateFile data-schemas).
Backend 332/332, frontend 531/531, data-schemas 37/37, lint clean.

* 🛠️ refactor: address comprehensive review (round 3) — stale-cache MAJOR + 3 minors

Comprehensive review on PR #12957 caught a P1 follow-on bug from the
prior `invalidateQueries` fix, plus 3 maintainability findings.

MAJOR: stale React Query cache not actually fixed by `invalidateQueries`
  The previous fix called `invalidateQueries` to flush stale cached
  preview data on cross-turn filename reuse. But `useFilePreview` had
  `refetchOnMount: false`, which made the new observer read the
  stale-marked 'ready' data without refetching. The polling
  `refetchInterval` then evaluated against stale 'ready' → returned
  `false` → polling never started → user stuck on stale content.

  Fix (belt-and-suspenders):
    a) `useAttachmentHandler` switched to `removeQueries` — drops the
       cache entry entirely so the next mount has nothing to read and
       must fetch.
    b) `useFilePreview` no longer sets `refetchOnMount: false`, so the
       React Query default (`true`) kicks in — second line of defense
       if any future codepath observes stale data before the handler
       has a chance to evict.

MINOR: `finalizePreview` JSDoc missing `previewRevision` param
  Added with explanation of the conditional update guard.

MINOR: asymmetric stream-writable guard between SSE protocols
  Chat-completions delegated the gate to `writeAttachmentUpdate`;
  Open Responses inlined `!res.writableEnded && res.headersSent`.
  Extracted `isStreamWritable(res, streamId)` predicate; both paths
  + `writeAttachmentUpdate` now share the single source of truth.

NIT: `(data as Partial<TFile>).file_id` cast repeated 4 times
  Extracted to a `fileId` local at the top of the handler.

Tests: existing 9 invalidate-tests rewritten as remove-tests; +1 new
lock-in test asserts removeQueries is called and invalidateQueries
is NOT (regression guard against round-3 finding). 332 backend pass,
532 frontend pass, lint clean.

Skipped findings (deferred / acceptable):
- MINOR: post-submission pending state has no auto-recovery — the
  `isAnySubmitting` polling gate was the user's explicit design;
  LLM context surfaces failed/pending so the model can volunteer.
  Worth a follow-up if real users hit it.
- NIT: double DB query per preview poll — reviewer marked acceptable;
  changing `fileAccess` middleware is out of scope.

* 🛡️ test: address comprehensive review NITs (initial-emit guard + isStreamWritable coverage)

NIT — chat-completions initial emit skips writableEnded check
  The Open Responses initial emit was switched to use the new
  `isStreamWritable` predicate in the round-3 commit, but the
  chat-completions initial emit kept the older narrower check
  (`streamId || res.headersSent`). On a client disconnect mid-stream
  (`writableEnded === true`) it would still hit `res.write` and
  raise `ERR_STREAM_WRITE_AFTER_END` — caught by the outer IIFE
  catch but logged as noise. Switch this site to `isStreamWritable`
  too so both initial-emit paths share the same gate as the
  deferred update emits.

NIT — `isStreamWritable` not directly unit-tested
  The predicate was only covered indirectly via the deferred-preview
  SSE tests (writableEnded skip, headersSent check). Export from
  `callbacks.js` and add 5 parametric tests pinning down each branch
  (streamId truthy, res null, !headersSent, writableEnded, happy
  path) so a future condition addition can't silently regress.

* 🐛 fix: stuck "Preparing preview…" + inline the chip subtitle

Two related fixes for a stuck-spinner bug a user reported in manual
testing of PR #12957.

**Stuck spinner (the bug)**
The deferred preview render can complete a few seconds AFTER the SSE
stream closes (typical case: PPTX render finishes ~3s after the LLM
emits FINAL). When that happens, the SSE update is silently dropped
(`isStreamWritable` returns false on a closed stream) and polling is
the only recovery path.

The earlier polling gate was `status === 'pending' && isAnySubmitting`,
which mirrored the original design intent ("only query while the LLM
is still generating"). But `isAnySubmitting` flips false the moment
the model emits FINAL — milliseconds before the deferred render
commits. Polling never runs, the chip stays "Preparing preview…"
forever even though the DB has `status: 'ready'` with valid HTML.

Drop the `isAnySubmitting` part of the gate. `useFilePreview`'s
`refetchInterval` is already a function-form that returns `false` on
the first terminal response, so polling auto-stops within one tick of
resolution. The server-side render ceiling (60s) plus the lazy sweep
in the preview endpoint cap the worst case to ~24 polls per pending
attachment. Polling itself never blocks UX — the gate's purpose was
"don't waste cycles", and capping by terminal status is the correct
expression of that.

**Inline the chip subtitle (the visual)**
The previous design rendered "Preparing preview…" as a loose-feeling
spinner+text BELOW the file chip. The chip itself looked done while a
floating annotation said it wasn't.

`FileContainer` gains an optional `subtitle?: ReactNode` prop that
overrides the default file-type label. `Attachment.tsx` passes a
`PreviewStatusSubtitle` (spinner + "Preparing preview…" / alert +
"Preview unavailable") into that slot when the file's preview is
pending or failed. The chip footprint stays identical to its `'ready'`
form — just the second row swaps from "PowerPoint Presentation" to
the status indicator. No floating element, no layout shift.

Tests: regression test pinning down "polling stays enabled after the
LLM finishes" so a future revert can't reintroduce the stuck-spinner
bug. Existing FileContainer tests pass unchanged (subtitle override
is opt-in). 522 frontend tests pass; lint clean.

* 🐛 fix: deferred-preview survives reload + matches artifact card chrome

Fixes the remaining stuck-pending case after the polling gate fix: on
a reloaded conversation, message.attachments come from the DB frozen at
the immediate-persist `status: 'pending'`, but `messageAttachmentsMap`
is empty because no SSE handler ever fired for that messageId. Polling
now INSERTS a new live entry when no record matches the file_id, and
`useAttachments` merges live entries onto DB entries by file_id so the
resolved text/textFormat reach `artifactTypeForAttachment` and the
chip routes through the proper PanelArtifact card.

Also replaces the small file chip used during the pending state with
a PreviewPlaceholderCard that mirrors ToolArtifactCard chrome, so the
transition to the resolved PanelArtifact no longer reshapes the UI.

*  feat: auto-open panel when deferred preview resolves pending→ready

The legacy auto-open path is gated only on `isSubmitting`, so an
office-file preview that resolves *after* the SSE stream closes would
render in place but never auto-open the panel — even though that's
exactly the moment the result becomes meaningful to the user. Adds a
per-file_id one-shot signal that `useAttachmentPreviewSync` flips on
the pending→ready edge; `ToolArtifactCard` consumes it on mount and
auto-opens regardless of submission state. The signal is *only* set on
the actual transition (history loads of pre-resolved files don't
trigger it) and is consumed once (panel close + reopen on the same
card stays user-controlled).

* 🐛 fix: drop placeholder Terminal overlay + scope auto-open to fresh resolutions

Two fixes for issues spotted in manual testing of the deferred-preview
auto-open feature:

1. PreviewPlaceholderCard was passing `file={attachment}` to FilePreview,
   which triggered SourceIcon's Terminal overlay (`metadata.fileIdentifier`
   is set on every code-execution file). The artifact card itself doesn't
   show that overlay; the placeholder shouldn't either, so the
   pending→resolved transition is visually seamless.

2. The `previewJustResolved` flag flipped on every pending→ready
   transition observed by the polling hook — including stale-pending
   DB records that resolve via the first poll on a *history load*.
   Conversations whose immediate-persist snapshot left attachments at
   `status: 'pending'` would yank the panel open every revisit.
   Adds `mountedDuringStreamRef` to the hook (mirroring ToolArtifactCard)
   so the flag fires only when the hook itself was mounted during an
   active turn — preserving the pre-PR contract that the panel only
   auto-opens for results the user is actively waiting on, never for
   history.

* 🐛 fix: don't downgrade preview to failed when only the SSE emit throws

Codex P2 finding on PR #12957: the original chain placed `.catch` after
`.then(onResolved)`, so a throw inside `onResolved` (transport-side
errors — SSE write race after stream close, an emitter listener
throwing) would propagate into the finalize catch and persist
`status: 'failed'` / `previewError: 'unexpected'`. That surfaced
"preview unavailable" in the UI for a perfectly valid file, and
degraded next-turn LLM context to reflect a non-existent failure.

Wraps `onResolved` in its own try/catch so emit errors are logged but
do not affect the file's persisted status. Extraction success and
emit success are now independent: if extraction succeeds and
`finalizePreview` writes the terminal status, the polling layer / next
page load surfaces the resolved preview even if this turn's SSE emit
didn't land.

* 🛡️ fix: run boot-time orphan sweep under system tenant context

Codex P2 finding on PR #12957: `File` is tenant-isolated, so under
`TENANT_ISOLATION_STRICT=true` the boot-time `sweepOrphanedPreviews`
threw `[TenantIsolation] Query attempted without tenant context in
strict mode` and the recovery path silently failed every restart.
Stale `status: 'pending'` records would be stuck until a user happened
to poll the preview endpoint and trigger the lazy sweep — which only
covers the file the user is currently looking at, not the bulk
candidate set the boot sweep is designed to recover.

Wraps the sweep in `runAsSystem(...)` in both boot paths
(`api/server/index.js` and `api/server/experimental.js`) and pins the
contract with regression tests in `file.spec.ts` — one test asserts
the bare call throws under strict mode, the other asserts the
`runAsSystem`-wrapped call succeeds.

* 🧹 chore: trim verbose comments from previous commit

* 🧹 chore: address review findings (dead branch, lazy-sweep cutoff, stale JSDoc)

- finalizePreview: drop unreachable !isOfficeBucket branch (caller
  already gates on hasOfficeHtmlPath, so this path is always office)
- preview endpoint: drop lazy-sweep cutoff from 5min to 2min — anything
  past the 60s render ceiling is definitively orphaned, and per-request
  sweep can be tighter than the per-instance boot sweep
- strip stale `isSubmitting` references from JSDoc in 3 spots (the
  client-side gate was removed in 9a65840)

Skipped: function-length (#3) and client-side polling cap (#4) —
refactors without correctness/perf wins; remaining NITs.

* 🧹 fix: trim 1 query off pending polls + clear stale lifecycle on cross-shape updates

- Preview endpoint: reuse fileAccess middleware's record for the
  lifecycle check; only re-fetch with text on the terminal ready
  response. Cuts the typical poll lifecycle from 2(N+1) to N+1
  queries, since the vast majority of polls hit while pending and
  don't need text at all.
- processCodeOutput non-office branch: explicitly null out status,
  previewError, previewRevision (codex P2). Without this, an update at
  the same (filename, conversationId) where the prior emit was an
  office file leaves stale lifecycle fields and the client renders
  the wrong state for the now non-office artifact.
- Tests: rewire preview.spec mocks for the new shape, add boundary
  test pinning the 2min cutoff, add regression test for the
  cross-shape update.

* 🐛 fix: keep polling on transient errors but cap permanently-broken endpoint

Codex P2: the previous `data?.status === 'pending' ? 2500 : false` gate
killed polling on the first transient error. With `retry: false`, a 500
left `data` undefined, the callback returned false, and the chip was
stuck "Preparing preview…" forever — exactly the bug the polling layer
was supposed to recover from.

Inverts the gate: stop on terminal success (`ready`/`failed`) or after
5 consecutive errors. Transient errors keep retrying; a permanently
broken endpoint caps at ~12.5s instead of polling forever. Predicate
extracted as `previewRefetchInterval` for direct unit testing without
fighting React Query's timer machinery.

*  feat: render pending-preview files in their own row

Pending deferred-preview chips now bucket into a separate row above
the resolved attachments — reads as "this is still happening" rather
than mixing with completed downloads. Once status flips to ready, the
chip re-buckets into panelArtifacts; failed re-buckets into the file
row alongside other downloads.

* 🎨 fix: render pending-preview chips in the panel-artifact row, not the file row

Previous bucketing put pending chips in the file row (since
`artifactTypeForAttachment` returns null for empty-text records). The
pending placeholder is a future panel artifact — sharing the row keeps
the chip in place when it resolves instead of jumping rows.

Plain files still get their own row.

* 🐛 fix: phase-1 SSE replay must not regress a resolved attachment

Codex P1: useEventHandlers.finalHandler iterates
responseMessage.attachments at stream end and dispatches each through
the attachment handler. Those records are the immediate-persist
snapshot (status:pending, text:null) — if a deferred update has
already moved the same file_id to ready/failed, the existing merge
let the pending fields win and downgraded the resolved record. Result:
chip flickers back to pending and polling restarts until the lazy
sweep corrects.

Pin the terminal lifecycle fields (status, text, textFormat,
previewError) when existing is ready/failed and incoming is pending.
Other field updates still go through.

* 🐛 fix: track preview-poll error cap outside React Query state

Codex P2: the previous cap relied on `query.state.fetchFailureCount`,
but React Query v4's reducer resets that to 0 on every fetch dispatch
(the `'fetch'` action). With `retry: false`, each failed poll left
count at 1 and the next dispatch reset it back to 0, so the `>= 5`
branch never fired and a permanently-broken endpoint polled forever.

Track consecutive errors in a module-level Map keyed by file_id,
incremented in a thin `fetchFilePreview` wrapper around the data
service call. The Map is cleared on success and on cap-stop, so
memory is bounded by in-flight pending file_ids per session.
2026-05-06 03:04:19 -04:00
Danny Avila
963068b112 🧬 feat: Scaffold Skills CRUD with ACL Sharing and File Schema (#12613)
* 🧬 feat: Scaffold Skills CRUD with ACL Sharing and File Schema

Adds Skills as a new first-class resource modeled on Anthropic's Agent
Skills, reusing the existing Prompt ACL stack for sharing. Lays the
groundwork for multi-file skills (SkillFile schema + metadata routes)
without wiring upload processing — single-file skills (inline SKILL.md
body) work end-to-end, multi-file uploads are stubbed for phase 2.

* 🔬 fix: Wire Skill Cleanup, AccessRole Enum, and Express 5 Path Params

CI surfaced four follow-ups from the initial Skills scaffolding commit
that local builds missed:

- AccessRole's resourceType field had a hardcoded enum that didn't
  include `'skill'`, blocking SKILL_OWNER/EDITOR/VIEWER role creation
  in every test that hit the AccessRole model.
- The seedDefaultRoles assertion in accessRole.spec.ts hard-listed the
  expected role IDs and needed the new SKILL_* entries.
- deleteUserController had no cleanup for skills, and the
  deleteUserResourceCoverage guard test enforces every ResourceType has
  a documented handler — wired in db.deleteUserSkills(user._id) and
  added the entry to HANDLED_RESOURCE_TYPES.
- Express 5's path-to-regexp v6 rejects the legacy `(*)` named-group
  glob syntax. The two skill file routes now use a plain `:relativePath`
  param; the client already encodeURIComponents the path, so a single
  param is sufficient and decoded server-side.

* 🪡 fix: Make Skill Name Uniqueness Application-Level

Resolve three more CI failures from the Skills scaffolding PR:

- Mongoose creates indexes asynchronously and mongodb-memory-server
  tests can race ahead of the unique (name, author, tenantId) index
  being built, so the duplicate-name uniqueness test was flaky.
  Added an explicit findOne pre-check inside createSkill that throws
  with code 11000 (mimicking the index violation), giving deterministic
  behavior. The unique index stays as the persistent guarantee.
- The deleteUser.spec.js and UserController.spec.js suites mock the
  ~/models module directly and were missing deleteUserSkills, causing
  deleteUserController to throw and return 500 instead of 200.
- Removed two doc-comment claims that the SKILL_NAME_MAX_LENGTH and
  SKILL_DESCRIPTION_MAX_LENGTH constants "match Anthropic's API". The
  values themselves are reasonable but the comments were misleading
  about who enforces them.

* 🪢 fix: Address Code Review Findings on Skills Scaffolding

Resolve all 15 findings from the comprehensive PR review:

Critical:
- Rollback the created skill when grantPermission throws so a transient
  ACL failure cannot leave an orphaned, inaccessible skill in the DB.
- Fix infinite query cache corruption in useUpdateSkillMutation helpers.
  setQueriesData([QueryKeys.skills]) matches useSkillsInfiniteQuery's
  InfiniteData cache entries, which have { pages, pageParams } shape —
  spreading data.skills on those would throw. Added an isInfiniteSkillData
  guard and per-page transform so both flat and infinite caches update
  correctly.

Major:
- Fix TUpdateSkillContext type: the public type declared previousListData
  but onMutate actually returns previousListSnapshots (a [key, value]
  tuple array). Updated the type + added TSkillCacheEntry as a shared
  export from data-provider.
- Add cancelQueries calls before optimistic update in onMutate so
  in-flight refetches cannot clobber the optimistic state.
- Parallelize deleteUserSkills ACL removal via Promise.allSettled instead
  of a sequential await loop — O(1) round-trip vs O(n).
- Stub mockDeleteUserSkills in stubDeletionMocks() and assert it's called
  with user.id in the deleteUser.spec.js happy-path test.
- Add idResolver: getSkillById to the SKILL branch in accessPermissions.js
  so GET /api/permissions/skill/<missing-id> returns 404 instead of 403.

Minor:
- Reuse resolved skill from req.resourceAccess.resourceInfo in getHandler
  to eliminate a redundant getSkillById call per GET /api/skills/:id.
- Reject PATCH /api/skills/:id requests whose body contains only
  expectedVersion — previously they silently bumped version with no
  changes, triggering spurious 409s for collaborators.
- Make TSkill.frontmatter optional (wire type) and add serializeFrontmatter
  / serializeSourceMetadata helpers that return undefined for empty
  objects instead of casting incomplete data to SkillFrontmatter.
- Standardize deleteUserSkills to accept string | ObjectId and convert
  internally, matching deleteUserPrompts's signature; UserController now
  passes user.id consistently.
- Replace bumpSkillVersionAndRecount (read-then-write, racy) with
  bumpSkillVersionAndAdjustFileCount using atomic $inc. upsertSkillFile
  pre-checks existence to distinguish insert (+1) from replace (0).
- Add DELETE /api/skills/:id/files/:relativePath integration tests
  covering success, 404, and 403 paths.

Nits:
- Drop trivial resolveSkillId wrapper — pass getSkillById directly.
- Remove dead staleTime: 1000 * 10 from useListSkillsQuery since all
  refetch triggers are already disabled.

* 🧭 fix: Resolve Second Skills Review Pass — Cache, Gate, TOCTOU

Address 13 of 14 findings from the second code review; reject #13 as
misread of the AGENTS.md import-order rule (package types correctly
precede local types regardless of length).

Major:
- Fix addSkillToCachedLists closure bug: a hoisted `prepended` flag
  was shared across every cache entry matched by setQueriesData, so
  concurrent flat + infinite caches would silently drop the prepend
  on whichever was processed second. Replaced the shared helper with
  three per-entry inline updaters that handle InfiniteData at the
  page level (page 0 only for prepend, all pages for replace/remove).
- Tighten patchHandler's expectedVersion validation: NaN passes
  `typeof === 'number'` and would previously leak current skill state
  via a misleading 409. Now requires finite positive integer and
  returns 400 otherwise.
- Guard decodeURIComponent in deleteFileHandler with try/catch —
  malformed percent encoding now returns 400 instead of 500.
- Add PermissionTypes.SKILLS + skillPermissionsSchema +
  TSkillPermissions in data-provider; seed default SKILLS permissions
  for ADMIN (all true) and USER (use + create only); wire
  checkSkillAccess / checkSkillCreate via generateCheckAccess onto
  the skills router mirroring the prompts pattern. Skills route now
  enforces role-based capability gates alongside per-resource ACLs.
  Test suite adds a mocked getRoleByName returning permissive SKILLS.
- Fix upsertSkillFile TOCTOU: replaced the pre-check + upsert pair
  with a single `findOneAndUpdate({ new: false, upsert: true })` call
  that atomically returns the pre-update doc (null ⇒ insert) so
  fileCount delta can't double-count on concurrent same-path uploads.

Minor:
- Add `sourceMetadata` to listSkillsByAccess .select() so summaries
  no longer silently drop the field for GitHub/Notion-synced skills.
- Include `cursor` in useListSkillsQuery's query key so manual
  pagination doesn't alias across pages.
- Clean up TSkillSummary to `Omit<TSkill, 'body' | 'frontmatter'>`
  matching what serializeSkillSummary actually emits; drop the
  Omit-then-re-add noise.
- Skip getPublicSkillIdSet in createHandler; a newly-created skill
  cannot have a PUBLIC ACL entry, so pass an empty set directly
  instead of paying a DB round-trip.
- Trim SkillMethods public surface: drop internal helpers
  countSkillFiles / deleteSkillFilesBySkillId / getSkillFile from the
  return object; inline the file cascade into deleteSkill.
- Use TSkillConflictResponse at the PATCH 409 call site instead of
  an inline ad-hoc object literal.
- Drop the now-unused EXPECTED_VERSION_ERROR module constant.

* 🧩 fix: Extend Role Schema + Types with SKILLS PermissionType

CI type-check and unit test failures from the PermissionTypes.SKILLS
addition surfaced three unrelated places that all hardcode the
permission-type set:

- IRole.permissions in data-schemas/types/role.ts enumerates every
  PermissionTypes key as an optional field. Adding SKILLS to the enum
  without updating the interface caused TS7053 'expression of type
  PermissionTypes can't be used to index type' errors in
  role.methods.spec.ts (lines 407-408, 477-478) because
  Object.values(PermissionTypes) now yielded a value the interface
  didn't cover.
- schema/role.ts rolePermissionsSchema mirrors the interface at the
  Mongoose layer; also needed SKILLS added so the persisted role
  document can actually store skill permissions.
- data-provider/roles.spec.ts has a guard test that every permission
  type carrying CREATE/SHARE/SHARE_PUBLIC must be explicitly "tracked"
  either in RESOURCE_PERMISSION_TYPES or in the PROMPTS/AGENTS/MEMORIES
  exemption list. Added SKILLS to the exemption list since skills
  follow the same default model as prompts/agents (USE + CREATE on for
  USER, SHARE / SHARE_PUBLIC off).

All three are additive pass-throughs with no behavior change.

* 🏷️ refactor: Introduce ISkillSummary for Narrow List Projection

Follow-up NITs from the second review pass on the Skills PR:

- Define ISkillSummary = Omit<ISkill, 'body' | 'frontmatter'> and use
  it as the element type in ListSkillsByAccessResult. The list query's
  .select() intentionally omits body and frontmatter for payload size,
  but the previous type claimed both fields were present — a type lie
  that would mislead future readers even though serializeSkillSummary
  never touches those fields at runtime. handlers.ts's signature for
  serializeSkillSummary now accepts ISkillSummary too.
- Document the intentional second-round-trip `findOne` in
  upsertSkillFile. Switching to `findOneAndUpdate({ new: false })`
  was required for TOCTOU-safe insert-vs-replace detection, which
  means the handler needs a follow-up query to return the post-upsert
  document. A comment now explains the tradeoff so future readers
  don't silently "optimize" it away.

No behavior change.

* 🌐 fix: Wire SKILL into SHARE_PUBLIC Resource Maps

Address codex comment #1 — making a skill public was blocked on two
hardcoded resource→permission-type maps that didn't know about SKILL:

- api/server/middleware/checkSharePublicAccess.js's
  resourceToPermissionType map was missing ResourceType.SKILL, so
  PUT /api/permissions/skill/:id with { public: true } would fall
  through to the 400 "Unsupported resource type for public sharing"
  path even though PermissionTypes.SKILLS exists and ADMIN has
  SHARE_PUBLIC configured. Added the mapping.
- client/src/hooks/Sharing/useCanSharePublic.ts has an identical
  client-side map used to gate the "Make Public" UI toggle. Without
  the SKILL mapping the hook returned false for everyone, so the
  toggle wouldn't render for skills once the sharing UI lands in
  phase 2. Added the mapping.

Codex comment #2 (create/update cache writes inject skills into
unrelated filtered lists) is invalid — it flags a pattern that
mirrors useUpdatePromptGroup (which the PR description explicitly
cites as the model) and is a deliberate optimistic-update tradeoff.
Trying to match each cache key's embedded filter would couple the
mutation callback to query-key internals, which is exactly what
setQueriesData is designed to avoid. No change there.

* 🧪 feat: Frontmatter Validation, Reserved-Name Fixes, Coaching Warnings

Address the follow-up review notes on the Skills PR. This commit closes
the gap between the wire-type promise and what the backend actually
enforces, tightens the reserved-name rules, and adds a non-blocking
coaching tier for validators.

Frontmatter validation (new):
- Add `validateSkillFrontmatter` in data-schemas/methods/skill.ts with
  strict mode — unknown keys are rejected so expanding the allowed set
  is an intentional code change. Known keys are type-checked against a
  `FrontmatterKind` table derived from Anthropic's Agent Skills spec
  (name, description, when-to-use, allowed-tools, arguments,
  argument-hint, user-invocable, disable-model-invocation, model,
  effort, context, agent, paths, shell, hooks, version, metadata).
- `hooks` and `metadata` get a shallow JSON-safety check (max depth 4,
  max string 2000, max array 100) instead of a full schema, since their
  full shapes live outside this module.
- Wired into BOTH createSkill AND updateSkill so the PATCH path can't
  smuggle invalid frontmatter past the validator.

Validation warning tier (new):
- Add optional `severity: 'error' | 'warning'` to `ValidationIssue`
  (defaults to error). `partitionIssues` splits an issue list into
  blocking errors and non-blocking warnings.
- `createSkill` / `updateSkill` filter on errors for the throw check
  and return warnings in a new `warnings: ValidationIssue[]` field on
  their result objects (`CreateSkillResult` / `UpdateSkillResult`).
- `validateSkillDescription` now emits a `TOO_SHORT` warning for
  descriptions under 20 chars — the primary triggering field, so a
  little coaching goes a long way.
- `createHandler` / `patchHandler` in packages/api surface the warnings
  via a new `attachWarnings` helper that decorates the serialized
  response with a `warnings?: TSkillWarning[]` field.
- `TSkill` gains an optional `warnings?: TSkillWarning[]` field
  documented as "present on POST/PATCH, never on GET".

Reserved-name filter (tightened):
- Replace the substring match (`.includes('anthropic')`) with prefix
  matching on `anthropic-` and `claude-` plus exact-match rejection of
  CLI slash-command collisions (help, clear, compact, model, exit,
  quit, settings, plus the bare `anthropic` / `claude` words). Both
  the pure validator (`methods/skill.ts`) and the Mongoose schema
  validator (`schema/skill.ts`) updated in lockstep; comments on
  each reference the other to prevent drift.
- `research-anthropic-helper` and `about-claude` are now allowed;
  `anthropic-helper`, `claude-bot`, and `settings` are still rejected.

Documentation:
- Add docstrings on `ISkill`, `schema/skill.ts`, and `TSkill` explaining
  the semantics of `name` (Claude-visible identifier, kebab-case,
  stable), `displayTitle` (UI-only cosmetic label, NOT sent to Claude),
  `description` (highest-leverage trigger field), and `source` /
  `sourceMetadata` (reserved for phase 2+ external sync).
- Add a detailed consistency comment on `bumpSkillVersionAndAdjustFileCount`
  explaining that it runs as a separate MongoDB operation from
  upsertSkillFile/deleteSkillFile, so `fileCount` can drift if the
  second op fails — options listed, tradeoff documented, phase 1
  risk window noted as closed because upload is still stubbed.

Tests:
- data-schemas skill.spec.ts: destructure `{ skill, warnings }` from
  createSkill at every call site; add a TOO_SHORT warning test, a
  frontmatter strict-mode test, reserved-prefix tests (including
  positive cases for substring names that should pass), CLI reserved
  word tests, and a full `validateSkillFrontmatter` describe block
  covering unknown keys, type mismatches, and deep-nesting rejection.
- api/server/routes/skills.test.js: bump default test description
  above the 20-char threshold, add a warning-emission test, add
  reserved-prefix + reserved-CLI-word tests, add an unknown-frontmatter-
  key test asserting the 400 response carries `issues` with `UNKNOWN_KEY`.

* 📦 fix: Export CreateSkillResult from data-schemas Methods Index

`CreateSkillResult` was defined in `methods/skill.ts` and consumed by
`packages/api/src/skills/handlers.ts` but never re-exported from the
methods barrel, so the type-check job failed with TS2724
"'@librechat/data-schemas' has no exported member named 'CreateSkillResult'".

Rollup's bundle-mode build picked up the type via its internal resolver,
but the standalone `tsc --noEmit` type-check ran against the package's
public entrypoint and couldn't see it. Added the type import + export
alongside the existing `UpdateSkillResult` export, which fixes the
CI type-check without any runtime change.
2026-04-25 04:01:59 -04:00
Danny Avila
738003b220
🛡️ fix: Prevent silent crash from unhandled MCP OAuth reconnect rejections (#12812)
* 🛡️ fix: Install global `unhandledRejection` handler

Node 15+ terminates the process by default when a promise rejection goes
unhandled. Under MCP OAuth reconnect storms and streamable-HTTP transport
resets, fire-and-forget async paths can emit transient rejections (ECONNRESET,
token refresh races) that would otherwise silently kill the server — no
uncaught exception log, no OOM signal. Register a listener so these paths log
and the process keeps serving other requests.

Refs: #12078

* 🔧 fix: Guard MCP OAuth reconnect fire-and-forget calls

`OAuthReconnectionManager.tryReconnect` awaits `getServerConfig` outside its
inner try/catch, so a rejection from the registry (or any throw before the
guarded block) would escape the fire-and-forget `void` call sites and
propagate as an unhandled rejection — the failure mode behind the silent
crashes reported in #12078. Route both call sites through a `safeTryReconnect`
wrapper that attaches a terminal `.catch` so unexpected rejections are
surfaced via the logger instead.

Refs: #12078

* 🧹 fix: Address review findings on MCP OAuth reconnect crash fix

- Move `getServerConfig` inside `tryReconnect`'s try/catch so the registry
  rejection path is handled by the inner cleanup (the structural root cause
  behind the silent crash). The outer `safeTryReconnect` wrapper remains as
  defense-in-depth.
- Extract the failed-reconnect cleanup as a private `cleanupOnFailedReconnect`
  method and invoke it from `safeTryReconnect`'s catch as well, so any
  rejection that does escape the inner try (e.g. a future regression) still
  resets tracker state instead of leaving the server stuck in `active` for
  the full `RECONNECTION_TIMEOUT_MS` window.
- Update the regression test to assert tracker state is cleaned up
  (`isActive` cleared, `isFailed` set, `disconnectUserConnection` called) so
  it can detect the stale-state failure mode it was meant to guard against.
- Forward non-Error rejection reasons as-is in the global handler so
  structured payloads like `{ code: "ECONNRESET", errno: -104 }` survive
  instead of being collapsed to "[object Object]" by `String()`.

Refs: #12078, review of #12812

* 🚑 fix: Restore fail-fast on boot rejection in primary server entry

`startServer()` was invoked bare in `api/server/index.js`. Before installing
the global `unhandledRejection` handler, a startup rejection (`connectDb`,
`getAppConfig`, `performStartupChecks`) terminated the process via Node's
default — Kubernetes / the orchestrator restarted the pod immediately.

After the handler was added, the same rejection was caught and logged, then
the process kept running half-initialized (no HTTP listener) until the
liveness probe eventually timed out — slow, indirect recovery instead of a
fast restart.

Wrap `startServer()` with the same `.catch(() => process.exit(1))` pattern
already used in `experimental.js` so boot failures fail-fast.

Refs: #12078, codex review of #12812

* 🚑 fix: Fail-fast on post-listen init failure in both server entries

The `app.listen` callback in `index.js` and `experimental.js` is async and
awaits `initializeMCPs`, `initializeOAuthReconnectManager`, and
`checkMigrations`. The callback's promise is detached from
`startServer().catch(...)` (the outer catch only sees errors that occurred
before `app.listen` was called), so without explicit handling those init
rejections used to terminate the process via Node's default and now would
be swallowed by the new `unhandledRejection` handler — leaving the HTTP
server listening (and passing liveness probes) while MCP / OAuth / migration
state is broken.

Wrap the post-listen init block in a try/catch that logs and calls
`process.exit(1)` so initialization failures stay fail-fast.

Refs: #12078, codex review of #12812
2026-04-24 23:18:49 -07:00
Danny Avila
2e706ebcb3
⚖️ refactor: Split Config Route into Unauthenticated and Authenticated Paths (#12490)
* refactor: split /api/config into unauthenticated and authenticated response paths

- Replace preAuthTenantMiddleware with optionalJwtAuth on the /api/config
  route so the handler can detect whether the request is authenticated
- When unauthenticated: call getAppConfig({ baseOnly: true }) for zero DB
  queries, return only login-relevant fields (social logins, turnstile,
  privacy policy / terms of service from interface config)
- When authenticated: call getAppConfig({ role, userId, tenantId }) to
  resolve per-user DB overrides (USER + ROLE + GROUP + PUBLIC principals),
  return full payload including modelSpecs, balance, webSearch, etc.
- Extract buildSharedPayload() and addWebSearchConfig() helpers to avoid
  duplication between the two code paths
- Fixes per-user balance overrides not appearing in the frontend because
  userId was never passed to getAppConfig (follow-up to #12474)

* test: rewrite config route tests for unauthenticated vs authenticated paths

- Replace the previously-skipped supertest tests with proper mocked tests
- Cover unauthenticated path: baseOnly config call, minimal payload,
  interface subset (privacyPolicy/termsOfService only), exclusion of
  authenticated-only fields
- Cover authenticated path: getAppConfig called with userId, full payload
  including modelSpecs/balance/webSearch, per-user balance override merging

* fix: address review findings — restore multi-tenant support, improve tests

- Chain preAuthTenantMiddleware back before optionalJwtAuth on /api/config
  so unauthenticated requests in multi-tenant deployments still get
  tenant-scoped config via X-Tenant-Id header (Finding #1)
- Use getAppConfig({ tenantId }) instead of getAppConfig({ baseOnly: true })
  when a tenant context is present; fall back to baseOnly for single-tenant
- Fix @type annotation: unauthenticated payload is Partial<TStartupConfig>
- Refactor addWebSearchConfig into pure buildWebSearchConfig that returns a
  value instead of mutating the payload argument
- Hoist isBirthday() to module level
- Remove inline narration comments
- Assert tenantId propagation in tests, including getTenantId fallback and
  user.tenantId preference
- Add error-path tests for both unauthenticated and authenticated branches
- Expand afterEach env var cleanup for proper test isolation

* test: fix mock isolation and add tenant-scoped response test

- Replace jest.clearAllMocks() with jest.resetAllMocks() so
  mockReturnValue implementations don't leak between tests
- Add test verifying tenant-scoped socialLogins and turnstile are
  correctly mapped in the unauthenticated response

* fix: add optionalJwtAuth to /api/config in experimental.js

Without this middleware, req.user is never populated in the experimental
cluster entrypoint, so authenticated users always receive the minimal
unauthenticated config payload.
2026-03-31 19:22:51 -04:00
Danny Avila
8ba2bde5c1
📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830)
* chore: move database model methods to /packages/data-schemas

* chore: add TypeScript ESLint rule to warn on unused variables

* refactor: model imports to streamline access

- Consolidated model imports across various files to improve code organization and reduce redundancy.
- Updated imports for models such as Assistant, Message, Conversation, and others to a unified import path.
- Adjusted middleware and service files to reflect the new import structure, ensuring functionality remains intact.
- Enhanced test files to align with the new import paths, maintaining test coverage and integrity.

* chore: migrate database models to packages/data-schemas and refactor all direct Mongoose Model usage outside of data-schemas

* test: update agent model mocks in unit tests

- Added `getAgent` mock to `client.test.js` to enhance test coverage for agent-related functionality.
- Removed redundant `getAgent` and `getAgents` mocks from `openai.spec.js` and `responses.unit.spec.js` to streamline test setup and reduce duplication.
- Ensured consistency in agent mock implementations across test files.

* fix: update types in data-schemas

* refactor: enhance type definitions in transaction and spending methods

- Updated type definitions in `checkBalance.ts` to use specific request and response types.
- Refined `spendTokens.ts` to utilize a new `SpendTxData` interface for better clarity and type safety.
- Improved transaction handling in `transaction.ts` by introducing `TransactionResult` and `TxData` interfaces, ensuring consistent data structures across methods.
- Adjusted unit tests in `transaction.spec.ts` to accommodate new type definitions and enhance robustness.

* refactor: streamline model imports and enhance code organization

- Consolidated model imports across various controllers and services to a unified import path, improving code clarity and reducing redundancy.
- Updated multiple files to reflect the new import structure, ensuring all functionalities remain intact.
- Enhanced overall code organization by removing duplicate import statements and optimizing the usage of model methods.

* feat: implement loadAddedAgent and refactor agent loading logic

- Introduced `loadAddedAgent` function to handle loading agents from added conversations, supporting multi-convo parallel execution.
- Created a new `load.ts` file to encapsulate agent loading functionalities, including `loadEphemeralAgent` and `loadAgent`.
- Updated the `index.ts` file to export the new `load` module instead of the deprecated `loadAgent`.
- Enhanced type definitions and improved error handling in the agent loading process.
- Adjusted unit tests to reflect changes in the agent loading structure and ensure comprehensive coverage.

* refactor: enhance balance handling with new update interface

- Introduced `IBalanceUpdate` interface to streamline balance update operations across the codebase.
- Updated `upsertBalanceFields` method signatures in `balance.ts`, `transaction.ts`, and related tests to utilize the new interface for improved type safety.
- Adjusted type imports in `balance.spec.ts` to include `IBalanceUpdate`, ensuring consistency in balance management functionalities.
- Enhanced overall code clarity and maintainability by refining type definitions related to balance operations.

* feat: add unit tests for loadAgent functionality and enhance agent loading logic

- Introduced comprehensive unit tests for the `loadAgent` function, covering various scenarios including null and empty agent IDs, loading of ephemeral agents, and permission checks.
- Enhanced the `initializeClient` function by moving `getConvoFiles` to the correct position in the database method exports, ensuring proper functionality.
- Improved test coverage for agent loading, including handling of non-existent agents and user permissions.

* chore: reorder memory method exports for consistency

- Moved `deleteAllUserMemories` to the correct position in the exported memory methods, ensuring a consistent and logical order of method exports in `memory.ts`.
2026-03-21 14:28:53 -04:00
Danny Avila
6169d4f70b
🚦 fix: 404 JSON Responses for Unmatched API Routes (#11976)
* feat: Implement 404 JSON response for unmatched API routes

- Added middleware to return a 404 JSON response with a message for undefined API routes.
- Updated SPA fallback to serve index.html for non-API unmatched routes.
- Ensured the error handler is positioned correctly as the last middleware in the stack.

* fix: Enhance logging in BaseClient for better token usage tracking

- Updated `getTokenCountForResponse` to log the messageId of the response for improved debugging.
- Enhanced userMessage logging to include messageId, tokenCount, and conversationId for clearer context during token count mapping.

* chore: Improve logging in processAddedConvo for better debugging

- Updated the logging structure in the processAddedConvo function to provide clearer context when processing added conversations.
- Removed redundant logging and enhanced the output to include model, agent ID, and endpoint details for improved traceability.

* chore: Enhance logging in BaseClient for improved token usage tracking

- Added debug logging in the BaseClient to track response token usage, including messageId, model, promptTokens, and completionTokens for better debugging and traceability.

* chore: Enhance logging in MemoryAgent for improved context

- Updated logging in the MemoryAgent to include userId, conversationId, messageId, and provider details for better traceability during memory processing.
- Adjusted log messages to provide clearer context when content is returned or not, aiding in debugging efforts.

* chore: Refactor logging in initializeClient for improved clarity

- Consolidated multiple debug log statements into a single message that provides a comprehensive overview of the tool context being stored for the primary agent, including the number of tools and the size of the tool registry. This enhances traceability and debugging efficiency.

* feat: Implement centralized 404 handling for unmatched API routes

- Introduced a new middleware function `apiNotFound` to standardize 404 JSON responses for undefined API routes.
- Updated the server configuration to utilize the new middleware, enhancing code clarity and maintainability.
- Added tests to ensure correct 404 responses for various non-GET methods and the `/api` root path.

* fix: Enhance logging in apiNotFound middleware for improved safety

- Updated the `apiNotFound` function to sanitize the request path by replacing problematic characters and limiting its length, ensuring safer logging of 404 errors.

* refactor: Move apiNotFound middleware to a separate file for better organization

- Extracted the `apiNotFound` function from the error middleware into its own file, enhancing code organization and maintainability.
- Updated the index file to export the new `notFound` middleware, ensuring it is included in the middleware stack.

* docs: Add comment to clarify usage of unsafeChars regex in notFound middleware

- Included a comment in the notFound middleware file to explain that the unsafeChars regex is safe to reuse with .replace() at the module scope, as it does not retain lastIndex state.
2026-02-27 22:49:54 -05:00
Danny Avila
6279ea8dd7
🛸 feat: Remote Agent Access with External API Support (#11503)
* 🪪 feat: Microsoft Graph Access Token Placeholder for MCP Servers (#10867)

* feat: MCP Graph Token env var

* Addressing copilot remarks

* Addressed Copilot review remarks

* Fixed graphtokenservice mock in MCP test suite

* fix: remove unnecessary type check and cast in resolveGraphTokensInRecord

* ci: add Graph Token integration tests in MCPManager

* refactor: update user type definitions to use Partial<IUser> in multiple functions

* test: enhance MCP tests for graph token processing and user placeholder resolution

- Added comprehensive tests to validate the interaction between preProcessGraphTokens and processMCPEnv.
- Ensured correct resolution of graph tokens and user placeholders in various configurations.
- Mocked OIDC utilities to facilitate testing of token extraction and validation.
- Verified that original options remain unchanged after processing.

* chore: import order

* chore: imports

---------

Co-authored-by: Danny Avila <danny@librechat.ai>

* WIP: OpenAI-compatible API for LibreChat agents

- Added OpenAIChatCompletionController for handling chat completions.
- Introduced ListModelsController and GetModelController for listing and retrieving agent details.
- Created routes for OpenAI API endpoints, including /v1/chat/completions and /v1/models.
- Developed event handlers for streaming responses in OpenAI format.
- Implemented request validation and error handling for API interactions.
- Integrated content aggregation and response formatting to align with OpenAI specifications.

This commit establishes a foundational API for interacting with LibreChat agents in a manner compatible with OpenAI's chat completion interface.

* refactor: OpenAI-spec content aggregation for improved performance and clarity

* fix: OpenAI chat completion controller with safe user handling for correct tool loading

* refactor: Remove conversation ID from OpenAI response context and related handlers

* refactor: OpenAI chat completion handling with streaming support

- Introduced a lightweight tracker for streaming responses, allowing for efficient tracking of emitted content and usage metadata.
- Updated the OpenAIChatCompletionController to utilize the new tracker, improving the handling of streaming and non-streaming responses.
- Refactored event handlers to accommodate the new streaming logic, ensuring proper management of tool calls and content aggregation.
- Adjusted response handling to streamline error reporting during streaming sessions.

* WIP: Open Responses API with core service, types, and handlers

- Added Open Responses API module with comprehensive types and enums.
- Implemented core service for processing requests, including validation and input conversion.
- Developed event handlers for streaming responses and non-streaming aggregation.
- Established response building logic and error handling mechanisms.
- Created detailed types for input and output content, ensuring compliance with Open Responses specification.

* feat: Implement response storage and retrieval in Open Responses API

- Added functionality to save user input messages and assistant responses to the database when the `store` flag is set to true.
- Introduced a new endpoint to retrieve stored responses by ID, allowing users to access previous interactions.
- Enhanced the response creation process to include database operations for conversation and message storage.
- Implemented tests to validate the storage and retrieval of responses, ensuring correct behavior for both existing and non-existent response IDs.

* refactor: Open Responses API with additional token tracking and validation

- Added support for tracking cached tokens in response usage, improving token management.
- Updated response structure to include new properties for top log probabilities and detailed usage metrics.
- Enhanced tests to validate the presence and types of new properties in API responses, ensuring compliance with updated specifications.
- Refactored response handling to accommodate new fields and improve overall clarity and performance.

* refactor: Update reasoning event handlers and types for consistency

- Renamed reasoning text events to simplify naming conventions, changing `emitReasoningTextDelta` to `emitReasoningDelta` and `emitReasoningTextDone` to `emitReasoningDone`.
- Updated event types in the API to reflect the new naming, ensuring consistency across the codebase.
- Added `logprobs` property to output events for enhanced tracking of log probabilities.

* feat: Add validation for streaming events in Open Responses API tests

* feat: Implement response.created event in Open Responses API

- Added emitResponseCreated function to emit the response.created event as the first event in the streaming sequence, adhering to the Open Responses specification.
- Updated createResponse function to emit response.created followed by response.in_progress.
- Enhanced tests to validate the order of emitted events, ensuring response.created is triggered before response.in_progress.

* feat: Responses API with attachment event handling

- Introduced `createResponsesToolEndCallback` to handle attachment events in the Responses API, emitting `librechat:attachment` events as per the Open Responses extension specification.
- Updated the `createResponse` function to utilize the new callback for processing tool outputs and emitting attachments during streaming.
- Added helper functions for writing attachment events and defined types for attachment data, ensuring compatibility with the Open Responses protocol.
- Enhanced tests to validate the integration of attachment events within the Responses API workflow.

* WIP: remote agent auth

* fix: Improve loading state handling in AgentApiKeys component

- Updated the rendering logic to conditionally display loading spinner and API keys based on the loading state.
- Removed unnecessary imports and streamlined the component for better readability.

* refactor: Update API key access handling in routes

- Replaced `checkAccess` with `generateCheckAccess` for improved access control.
- Consolidated access checks into a single `checkApiKeyAccess` function, enhancing code readability and maintainability.
- Streamlined route definitions for creating, listing, retrieving, and deleting API keys.

* fix: Add permission handling for REMOTE_AGENT resource type

* feat: Enhance permission handling for REMOTE_AGENT resources

- Updated the deleteAgent and deleteUserAgents functions to handle permissions for both AGENT and REMOTE_AGENT resource types.
- Introduced new functions to enrich REMOTE_AGENT principals and backfill permissions for AGENT owners.
- Modified createAgentHandler and duplicateAgentHandler to grant permissions for REMOTE_AGENT alongside AGENT.
- Added utility functions for retrieving effective permissions for REMOTE_AGENT resources, ensuring consistent access control across the application.

* refactor: Rename and update roles for remote agent access

- Changed role name from API User to Editor in translation files for clarity.
- Updated default editor role ID from REMOTE_AGENT_USER to REMOTE_AGENT_EDITOR in resource configurations.
- Adjusted role localization to reflect the new Editor role.
- Modified access permissions to align with the updated role definitions across the application.

* feat: Introduce remote agent permissions and update access handling

- Added support for REMOTE_AGENTS in permission schemas, including use, create, share, and share_public permissions.
- Updated the interface configuration to include remote agent settings.
- Modified middleware and API key access checks to align with the new remote agent permission structure.
- Enhanced role defaults to incorporate remote agent permissions, ensuring consistent access control across the application.

* refactor: Update AgentApiKeys component and permissions handling

- Refactored the AgentApiKeys component to improve structure and readability, including the introduction of ApiKeysContent for better separation of concerns.
- Updated CreateKeyDialog to accept an onKeyCreated callback, enhancing its functionality.
- Adjusted permission checks in Data component to use REMOTE_AGENTS and USE permissions, aligning with recent permission schema changes.
- Enhanced loading state handling and dialog management for a smoother user experience.

* refactor: Update remote agent access checks in API routes

- Replaced existing access checks with `generateCheckAccess` for remote agents in the API keys and agents routes.
- Introduced specific permission checks for creating, listing, retrieving, and deleting API keys, enhancing access control.
- Improved code structure by consolidating permission handling for remote agents across multiple routes.

* fix: Correct query parameters in ApiKeysContent component

- Updated the useGetAgentApiKeysQuery call to include an object for the enabled parameter, ensuring proper functionality when the component is open.
- This change improves the handling of API key retrieval based on the component's open state.

* feat: Implement remote agents permissions and update API routes

- Added new API route for updating remote agents permissions, enhancing role management capabilities.
- Introduced remote agents permissions handling in the AgentApiKeys component, including a dedicated settings dialog.
- Updated localization files to include new remote agents permission labels for better user experience.
- Refactored data provider to support remote agents permissions updates, ensuring consistent access control across the application.

* feat: Add remote agents permissions to role schema and interface

- Introduced new permissions for REMOTE_AGENTS in the role schema, including USE, CREATE, SHARE, and SHARE_PUBLIC.
- Updated the IRole interface to reflect the new remote agents permissions structure, enhancing role management capabilities.

* feat: Add remote agents settings button to API keys dialog

* feat: Update AgentFooter to include remote agent sharing permissions

- Refactored access checks to incorporate permissions for sharing remote agents.
- Enhanced conditional rendering logic to allow sharing by users with remote agent permissions.
- Improved loading state handling for remote agent permissions, ensuring a smoother user experience.

* refactor: Update API key creation access check and localization strings

- Replaced the access check for creating API keys to use the existing remote agents access check.
- Updated localization strings to correct the descriptions for remote agent permissions, ensuring clarity in user interface.

* fix: resource permission mapping to include remote agents

- Changed the resourceToPermissionMap to use a Partial<Record> for better flexibility.
- Added mapping for REMOTE_AGENT permissions, enhancing the sharing capabilities for remote agents.

* feat: Implement remote access checks for agent models

- Enhanced ListModelsController and GetModelController to include checks for user permissions on remote agents.
- Integrated findAccessibleResources to filter agents based on VIEW permission for REMOTE_AGENT.
- Updated response handling to ensure users can only access agents they have permissions for, improving security and access control.

* fix: Update user parameter type in processUserPlaceholders function

- Changed the user parameter type in the processUserPlaceholders function from Partial<Partial<IUser>> to Partial<IUser> for improved type clarity and consistency.

* refactor: Simplify integration test structure by removing conditional describe

- Replaced conditional describeWithApiKey with a standard describe for all integration tests in responses.spec.js.
- This change enhances test clarity and ensures all tests are executed consistently, regardless of the SKIP_INTEGRATION_TESTS flag.

* test: Update AgentFooter tests to reflect new grant access dialog ID

- Changed test IDs for the grant access dialog in AgentFooter tests to include the resource type, ensuring accurate identification in the test cases.
- This update improves test clarity and aligns with recent changes in the component's implementation.

* test: Enhance integration tests for Open Responses API

- Updated integration tests in responses.spec.js to utilize an authRequest helper for consistent authorization handling across all test cases.
- Introduced a test user and API key creation to improve test setup and ensure proper permission checks for remote agents.
- Added checks for existing access roles and created necessary roles if they do not exist, enhancing test reliability and coverage.

* feat: Extend accessRole schema to include remoteAgent resource type

- Updated the accessRole schema to add 'remoteAgent' to the resourceType enum, enhancing the flexibility of role assignments and permissions management.

* test: refactored test setup to create a minimal Express app for responses routes, enhancing test structure and maintainability.

* test: Enhance abort.spec.js by mocking additional modules for improved test isolation

- Updated the test setup in abort.spec.js to include actual implementations of '@librechat/data-schemas' and '@librechat/api' while maintaining mock functionality.
- This change improves test reliability and ensures that the tests are more representative of the actual module behavior.

* refactor: Update conversation ID generation to use UUID

- Replaced the nanoid with uuidv4 for generating conversation IDs in the createResponse function, enhancing uniqueness and consistency in ID generation.

* test: Add remote agent access roles to AccessRole model tests

- Included additional access roles for remote agents (REMOTE_AGENT_EDITOR, REMOTE_AGENT_OWNER, REMOTE_AGENT_VIEWER) in the AccessRole model tests to ensure comprehensive coverage of role assignments and permissions management.

* chore: Add deletion of user agent API keys in user deletion process

- Updated the user deletion process in UserController and delete-user.js to include the removal of user agent API keys, ensuring comprehensive cleanup of user data upon account deletion.

* test: Add remote agents permissions to permissions.spec.ts

- Enhanced the permissions tests by including comprehensive permission settings for remote agents across various scenarios, ensuring accurate validation of access controls for remote agent roles.

* chore: Update remote agents translations for clarity and consistency

- Removed outdated remote agents translation entries and added revised entries to improve clarity on API key creation and sharing permissions for remote agents. This enhances user understanding of the available functionalities.

* feat: Add indexing and TTL for agent API keys

- Introduced an index on the `key` field for improved query performance.
- Added a TTL index on the `expiresAt` field to enable automatic cleanup of expired API keys, ensuring efficient management of stored keys.

* chore: Update API route documentation for clarity

- Revised comments in the agents route file to clarify the handling of API key authentication.
- Removed outdated endpoint listings to streamline the documentation and focus on current functionality.

---------

Co-authored-by: Max Sanna <max@maxsanna.com>
2026-01-28 17:44:33 -05:00
Danny Avila
52e6796635
📦 chore: Bump Express.js to v5 (#10671)
* chore: update express to version 5.1.0 in package.json

* chore: update express-rate-limit to version 8.2.1 in package.json and package-lock.json

* fix: Enhance server startup error handling in experimental and index files

* Added error handling for server startup in both experimental.js and index.js to log errors and exit the process if the server fails to start.
* Updated comments in openidStrategy.js to clarify the purpose of the CustomOpenIDStrategy class and its relation to Express version changes.

* chore: Implement rate limiting for all POST routes excluding /speech, required for express v5

* Added middleware to apply IP and user rate limiters to all POST requests, ensuring that the /speech route remains unaffected.
* Enhanced code clarity with comments explaining the new rate limiting logic.

* chore: Enable writable req.query for mongoSanitize compatibility in Express 5

* chore: Ensure req.body exists in multiple middleware and route files for Express 5 compatibility
2025-12-11 16:36:15 -05:00
Danny Avila
656e1abaea
🪦 refactor: Remove Legacy Code (#10533)
* 🗑️ chore: Remove unused Legacy Provider clients and related helpers

* Deleted OpenAIClient and GoogleClient files along with their associated tests.
* Removed references to these clients in the clients index file.
* Cleaned up typedefs by removing the OpenAISpecClient export.
* Updated chat controllers to use the OpenAI SDK directly instead of the removed client classes.

* chore/remove-openapi-specs

* 🗑️ chore: Remove unused mergeSort and misc utility functions

* Deleted mergeSort.js and misc.js files as they are no longer needed.
* Removed references to cleanUpPrimaryKeyValue in messages.js and adjusted related logic.
* Updated mongoMeili.ts to eliminate local implementations of removed functions.

* chore: remove legacy endpoints

* chore: remove all plugins endpoint related code

* chore: remove unused prompt handling code and clean up imports

* Deleted handleInputs.js and instructions.js files as they are no longer needed.
* Removed references to these files in the prompts index.js.
* Updated docker-compose.yml to simplify reverse proxy configuration.

* chore: remove unused LightningIcon import from Icons.tsx

* chore: clean up translation.json by removing deprecated and unused keys

* chore: update Jest configuration and remove unused mock file

    * Simplified the setupFiles array in jest.config.js by removing the fetchEventSource mock.
    * Deleted the fetchEventSource.js mock file as it is no longer needed.

* fix: simplify endpoint type check in Landing and ConversationStarters components

    * Updated the endpoint type check to use strict equality for better clarity and performance.
    * Ensured consistency in the handling of the azureOpenAI endpoint across both components.

* chore: remove unused dependencies from package.json and package-lock.json

* chore: remove legacy EditController, associated routes and imports

* chore: update banResponse logic to refine request handling for banned users

* chore: remove unused validateEndpoint middleware and its references

* chore: remove unused 'res' parameter from initializeClient in multiple endpoint files

* chore: remove unused 'isSmallScreen' prop from BookmarkNav and NewChat components; clean up imports in ArchivedChatsTable and useSetIndexOptions hooks; enhance localization in PromptVersions

* chore: remove unused import of Constants and TMessage from MobileNav; retain only necessary QueryKeys import

* chore: remove unused TResPlugin type and related references; clean up imports in types and schemas
2025-12-11 16:36:12 -05:00
Danny Avila
8bdc808074
refactor: Optimize & Standardize Tokenizer Usage (#10777)
* refactor: Token Limit Processing with Enhanced Efficiency

- Added a new test suite for `processTextWithTokenLimit`, ensuring comprehensive coverage of various scenarios including under, at, and exceeding token limits.
- Refactored the `processTextWithTokenLimit` function to utilize a ratio-based estimation method, significantly reducing the number of token counting function calls compared to the previous binary search approach.
- Improved handling of edge cases and variable token density, ensuring accurate truncation and performance across diverse text inputs.
- Included direct comparisons with the old implementation to validate correctness and efficiency improvements.

* refactor: Remove Tokenizer Route and Related References

- Deleted the tokenizer route from the server and removed its references from the routes index and server files, streamlining the API structure.
- This change simplifies the routing configuration by eliminating unused endpoints.

* refactor: Migrate countTokens Utility to API Module

- Removed the local countTokens utility and integrated it into the @librechat/api module for centralized access.
- Updated various files to reference the new countTokens import from the API module, ensuring consistent usage across the application.
- Cleaned up unused references and imports related to the previous countTokens implementation.

* refactor: Centralize escapeRegExp Utility in API Module

- Moved the escapeRegExp function from local utility files to the @librechat/api module for consistent usage across the application.
- Updated imports in various files to reference the new centralized escapeRegExp function, ensuring cleaner code and reducing redundancy.
- Removed duplicate implementations of escapeRegExp from multiple files, streamlining the codebase.

* refactor: Enhance Token Counting Flexibility in Text Processing

- Updated the `processTextWithTokenLimit` function to accept both synchronous and asynchronous token counting functions, improving its versatility.
- Introduced a new `TokenCountFn` type to define the token counting function signature.
- Added comprehensive tests to validate the behavior of `processTextWithTokenLimit` with both sync and async token counting functions, ensuring consistent results.
- Implemented a wrapper to track call counts for the `countTokens` function, optimizing performance and reducing unnecessary calls.
- Enhanced existing tests to compare the performance of the new implementation against the old one, demonstrating significant improvements in efficiency.

* chore: documentation for Truncation Safety Buffer in Token Processing

- Added a safety buffer multiplier to the character position estimates during text truncation to prevent overshooting token limits.
- Updated the `processTextWithTokenLimit` function to utilize the new `TRUNCATION_SAFETY_BUFFER` constant, enhancing the accuracy of token limit processing.
- Improved documentation to clarify the rationale behind the buffer and its impact on performance and efficiency in token counting.
2025-12-02 12:22:04 -05:00
Danny Avila
01413eea3d
🛡️ feat: Add Middleware for JSON Parsing and Prompt Group Updates (#10757)
* 🗨️ fix: Safe Validation for Prompt Updates

- Added `safeValidatePromptGroupUpdate` function to validate and sanitize prompt group update requests, ensuring only allowed fields are processed and sensitive fields are stripped.
- Updated the `patchPromptGroup` route to utilize the new validation function, returning appropriate error messages for invalid requests.
- Introduced comprehensive tests for the validation logic, covering various scenarios including allowed and disallowed fields, enhancing overall request integrity and security.
- Created a new schema file for prompt group updates, defining validation rules and types for better maintainability.

* 🔒 feat: Add JSON parse error handling middleware
2025-12-02 00:10:30 -05:00
Danny Avila
9f2fc25bde
🔬 refactor: Prevent Automatic MCP Server UI Deselection (#10588)
* chore: Add experimental backend server for multi-pod simulation

* Introduced a new backend script (`experimental.js`) to manage a clustered server environment with Redis cache flushing on startup.
* Updated `package.json` to include a new script command for the experimental backend.
* This setup aims to enhance scalability and performance for production environments.

* refactor: Remove server disconnection handling logic from useMCPServerManager
2025-11-19 17:10:25 -05:00