LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-07-04 13:21:17 +00:00

Author	SHA1	Message	Date
Marco Beretta	45d0d34c9e	chore: remove translation keys orphaned by the tool library redesign	2026-07-03 02:10:15 +02:00
Marco Beretta	a213b2e10f	feat: anchor the favorite star at the card's right edge Swap the ToolCard action-bar order so the star sits rightmost with the configure/info icon to its left. Every card can be favorited but only some are configurable, so anchoring the star keeps it in a consistent position across the grid.	2026-07-03 01:49:06 +02:00
Marco Beretta	d5cccb5f18	feat: add favorites for marketplace tools, MCP servers, and skills Reintroduce the favorite star from the old skill picker, generalized to every marketplace item kind except per-agent actions. Cards in the Tool Library and Skills dialogs get a hover-revealed star (always visible once favorited), and the existing Favorites views in both dialogs now filter to starred items. Favorites persist in a dedicated ToolFavorite collection, one document per (user, itemType, itemId) with a unique compound index, exposed through atomic per-item PUT/DELETE endpoints under /api/user/settings/favorites/tools. Per-item writes are idempotent and race-free across tabs/devices (the unique index backstops concurrent toggles), reads are a single index-backed query capped at 100 favorites per user, and the client keeps React Query as the source of truth with optimistic updates. Handlers live in @librechat/api with a thin route wrapper; methods follow the data-schemas factory pattern with tenant isolation. The favorites filter now matches on compound kind:id keys instead of bare ids, closing a cross-kind collision where a tool and a skill sharing an id would both match. The skill-favorites data-service stubs and the reserved TUserFavorite.skillId field are replaced by the new tool-favorites service.	2026-07-03 01:12:31 +02:00
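The cross-kind collision described in d5cccb5f18 can be illustrated with a minimal sketch. The type names, the `favoriteKey` helper, and the sample data are hypothetical, since the log does not show the real ToolFavorite schema or filter code; only the idea of matching on a compound `kind:id` key comes from the commit message.

```typescript
// Hypothetical shapes for illustration only; not the actual LibreChat types.
type ItemKind = 'tool' | 'mcp' | 'skill';

interface MarketplaceItem {
  kind: ItemKind;
  id: string;
  name: string;
}

interface Favorite {
  itemType: ItemKind;
  itemId: string;
}

/** Builds the compound key, e.g. "skill:search". */
function favoriteKey(kind: ItemKind, id: string): string {
  return `${kind}:${id}`;
}

/** Keeps only the items whose (kind, id) pair is starred. */
function filterFavorites(items: MarketplaceItem[], favorites: Favorite[]): MarketplaceItem[] {
  const starred = new Set(favorites.map((f) => favoriteKey(f.itemType, f.itemId)));
  return items.filter((item) => starred.has(favoriteKey(item.kind, item.id)));
}

// A tool and a skill share the id "search"; only the skill is starred.
const items: MarketplaceItem[] = [
  { kind: 'tool', id: 'search', name: 'Search tool' },
  { kind: 'skill', id: 'search', name: 'Search skill' },
];
const favorites: Favorite[] = [{ itemType: 'skill', itemId: 'search' }];

const visible = filterFavorites(items, favorites);
```

Matching on the bare id would have returned both entries here; the compound key returns only the starred skill.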
Marco Beretta	f6eccee9c7	fix: scope tooltip elevation to dialogs and restore dialog close button size Tooltips go back to z-150 globally; inside a dialog they now borrow the depth-aware popover z-index so they still clear nested dialogs (the Tool Library item dialog) without outranking freshly opened modals everywhere else. The default dialog close icon returns to its original size, and the lc-field pointer-focus suppression ships with the package next to Input and Textarea so external consumers get the whole mechanism from @librechat/client.	2026-07-02 21:06:42 +02:00
Marco Beretta	5c1e0b831a	fix: restore MCP attach semantics and confirmations in the tools marketplace Connecting an MCP server from the item dialog now enables all of its tools once the connection settles, deselect-all keeps the server attached via its placeholder token instead of detaching it, adding a server writes the token so a zero-tool attachment survives a save, and removing a server from the tools list asks for confirmation again. Consume-only servers are excluded from the catalog, matching the old select dialog. Also share the catalog/selection pipeline between ToolsSection and the marketplace through useAgentItems, hoist NEW_ACTION_ID next to ActionItem, drop unused status/view union members and stale TranslationKeys casts, document the phase-2 Favorites/Made-by-you views, fix the needs-setup dot semantics and card focus suppression, remove the redundant close button in CreateSkillDialog, move useInputModality into @librechat/client so external consumers can mount it, and delete dead files and orphaned translation keys.	2026-07-02 21:06:33 +02:00
Marco Beretta	9fa19a71d2	feat: make the skills create button a compact icon button	2026-07-02 17:12:57 +02:00
Marco Beretta	77d1cb58ae	feat: show a saving spinner and allow cancelling credential edits Drive the tool credential Save button from the real mutation state so it shows a spinner while the request is in flight, and add a Cancel button when re-editing already-saved credentials so the edit can be dismissed.	2026-07-02 16:28:52 +02:00
Marco Beretta	cec2b92459	feat: match Code Interpreter file upload to the File Search dropzone Swap Code Interpreter's thin btn-neutral bar for the same dashed dropzone (DropzoneContent + dropzoneClassName) File Search already uses, so the two capabilities' upload UIs are consistent.	2026-07-02 16:28:52 +02:00
Marco Beretta	7d18dd6dc4	feat: smoothly animate auth field changes in the MCP server dialog	2026-07-02 16:28:52 +02:00
Marco Beretta	d189f126d1	fix: vertically center OAuth dialog title against the MCP icon	2026-07-02 16:28:52 +02:00
Marco Beretta	2782af34c8	feat: show MCP server icon in OAuth dialog title	2026-07-02 16:28:52 +02:00
Marco Beretta	999301f10b	feat: cross-fade MCP tools between loading, list, and empty states	2026-07-02 16:28:51 +02:00
Marco Beretta	da551558fa	feat: smoothly collapse MCP connect button once connected	2026-07-02 16:28:51 +02:00
Marco Beretta	b31dbfa0c6	refactor: streamline MCP OAuth dialog - Remove the Cancel button (the flow auto-closes on connect / times out) - Show the URL in a read-only single-line scrollable input (cursor moves through it, not fully visible) with the shared CopyButton's smooth Copy/Check icon swap, matching the OAuth callback-URL field - Put the primary Continue with OAuth action (icon trailing) and an icon-only QR toggle together in a row at the bottom, below the URL - The QR reveals between the description and the URL with a smooth height animation (grid-rows 0fr to 1fr, matching MCPToolItem's reveal)	2026-07-02 16:28:51 +02:00
Marco Beretta	6c53a83a9e	feat: refine agent tools picker (skills, MCP connect/OAuth, web search) - Skills picker: per-card visibility (public) and shared-author badges, category filtering, and an in-place Create skill flow that auto-attaches the new skill without leaving the builder - MCP: inline Connect button in the first dialog plus a dedicated OAuth dialog (continue, copyable URL, QR code) shown only when OAuth is required - Web search: auth-aware affordance, settings cog when user-provided and an info icon when system-defined - Remove orphaned com_ui_unavailable/com_ui_initializing keys and the dead Tools/MCPToolItem component	2026-07-02 16:28:51 +02:00
Marco Beretta	71eb21d80c	feat: restore Memory capability toggle in agent builder tools catalog	2026-07-02 16:28:51 +02:00
Marco Beretta	178be2d4c7	feat: refine agent builder tools, actions, and MCP sections	2026-07-02 16:28:51 +02:00
Marco Beretta	e3ec9544ef	feat: redesign the agent builder tools, skills, and advanced panels Replace the stacked capability/MCP/skill/tool/action form sections with a unified tools marketplace, per-item configuration dialogs, and a consolidated Advanced panel. - unified tools marketplace (catalog, sidebar, polymorphic cards/rows) covering built-in capabilities, plugins, MCP servers, and actions, each with a detail/config dialog - dedicated Skills picker and a Tools section with selected-item summaries and empty states - redesigned action editor and authentication dialog (method cards, segmented controls) - rebuilt Advanced panel: orchestration hub (subagents, handoffs, chain), max steps, skills kill-switch, copyable agent id - restyled version history (timeline, tool/capability counts, in-app restore confirmation) - shared component updates (Radio, Input/Textarea, dropdown z-index, dialog primitives) and keyboard-only focus rings via useInputModality - format-hint placeholders for tool credential fields - sanitize numeric parameter inputs to prevent comma truncation	2026-07-02 16:28:51 +02:00
Danny Avila	477ee3439c	🧪 ci: De-flake ConversationsSection memoization spec on slow runners (#14071 ) waitFor resolves as soon as BookmarkNav's data hook has fired once, but the Suspense resolution commit can still have a trailing render pass pending on slow runners (Windows CI shards). The first stream tick then flushes that leftover pass alongside the tick, inflating the memoized children's render counts past the captured baseline (Expected: 1, Received: 2). Flush pending commits with an empty async act() before capturing baselines.	2026-07-02 10:11:43 -04:00
Danny Avila	0eef64344d	🌍 i18n: Update translation.json with latest translations (#14070)	2026-07-02 09:19:43 -04:00
Danny Avila	e452a130e9	⚡ perf: Minimize group membership query in principal resolution (#14055 ) getUserPrincipals resolves a user's ACL principals on nearly every authenticated request. It fetched full group documents (including entire memberIds arrays) only to read each group _id, and always issued a separate User lookup for idOnTheSource. - Project { _id: 1 } on the memberIds group query so it returns only ids and can be served from the { memberIds: 1 } index instead of fetching and decoding whole group docs. - Accept role and idOnTheSource from the already-loaded request user and thread them from the capability middleware, collapsing the hot path to a single indexed group query (idOnTheSource: null means known-local).	2026-07-02 08:44:40 -04:00
Danny Avila	6b049c2eed	🧠 fix: Default Bedrock thinking `maxTokens` to model max output (#14058) * 🧠 fix: Default Bedrock thinking maxTokens to model max output Thinking tokens share the maxTokens output budget with tool-call arguments (e.g. a create_file content), so the low Bedrock defaults (8192 for enabled thinking, ~4096 server-side for adaptive when unset) truncated large authored files mid-argument, surfacing as OutputTruncationError once reasoning actually emits. Default maxTokens to the model's full max output via anthropicSettings.maxOutputTokens.reset(model), mirroring the direct-Anthropic path. Explicit maxTokens/maxOutputTokens are respected. * fix: canonicalize number-first Claude aliases before resolving max output	2026-07-01 18:29:45 -04:00
Danny Avila	8683eccbbc	🧠 fix: Apply Bedrock thinking config to bare inference-profile model IDs (#14054) * 🧠 fix: Apply Bedrock thinking config to bare inference-profile model IDs The Bedrock request parser gated thinking config, sampling handling, and the anthropic_beta headers on the model ID literally containing `anthropic.`. When a deployment uses an application inference profile, the LibreChat model ID is a bare `claude-` (e.g. `claude-sonnet-5`) that maps to the profile ARN, so the gate never matched, no `thinking` config was sent, and reasoning models returned empty thinking blocks (most visibly: Claude Sonnet 5 never streamed reasoning, while `us.anthropic.claude-opus-4-8` did). Match on the `claude` family token instead of the `anthropic.` prefix so prefixed (`anthropic.`, `us.`, `global.`) and bare inference-profile IDs are handled identically. Verified e2e against live Bedrock via the agents SDK: a bare `claude-sonnet-5` now sends `{type:'adaptive', display:'summarized'}` and streams reasoning. 
Non-Claude Bedrock models (llama/cohere) and pre-thinking Claude (3.5 sonnet) are unaffected. 🧹 fix: Strip stale thinking fields for non-thinking Claude Bedrock IDs Follow-up to the bare-ID matching change: broadening the anthropic guard to match bare `claude-` meant a non-thinking Claude profile (e.g. a bare `claude-3-5-sonnet` inference profile) took the Claude cleanup branch, which kept persisted `thinking`/`anthropic_beta`/`output_config` from a previously-selected thinking model — leaking unsupported fields after a model switch. Extract `isThinkingModel` and, in the Claude cleanup branch, strip the thinking fields when the model isn't thinking-capable. Also fixes the pre-existing prefixed `anthropic.claude-3-5-sonnet` case (which already kept stale thinking). Thinking-capable models (sonnet-5, 3.7-sonnet) still keep their config. 🩹 fix: Preserve user anthropic_beta on non-thinking Claude cleanup The non-thinking stale-cleanup deleted amrf.anthropic_beta, but that is the generic Bedrock Anthropic beta field and may carry a user opt-in (e.g. max-tokens-3-5-sonnet-2024-07-15 for extended output on Claude 3.5). Strip only the thinking-specific fields (thinking/thinkingBudget/effort/output_config) and leave anthropic_beta intact. * fix: clear persisted AMRF (output_config, thinking, generated betas) on bare Bedrock profiles * fix: preserve persisted effort on resume + strip stale thinking/betas across bare profiles * fix: normalize string/comma-delimited anthropic_beta before stripping generated betas	2026-07-01 14:19:34 -04:00
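The gate change in 8683eccbbc can be sketched as two predicates. Both function names are hypothetical and the real Bedrock request parser is not shown in the log; the Llama ID is only an illustrative non-Claude example. The sketch shows why a bare inference-profile ID slipped past the old check.

```typescript
/** Old gate (as described): only IDs that literally contain "anthropic.". */
function matchesAnthropicPrefix(modelId: string): boolean {
  return modelId.includes('anthropic.');
}

/** New gate (sketch): any ID that carries the "claude" family token. */
function matchesClaudeFamily(modelId: string): boolean {
  return modelId.toLowerCase().includes('claude');
}

// IDs taken from the commit message, plus one illustrative non-Claude ID.
const bareProfileId = 'claude-sonnet-5';
const prefixedId = 'us.anthropic.claude-opus-4-8';
const nonClaudeId = 'meta.llama3-70b-instruct-v1:0';

const gateResults = [bareProfileId, prefixedId, nonClaudeId].map((id) => ({
  id,
  oldGate: matchesAnthropicPrefix(id),
  newGate: matchesClaudeFamily(id),
}));
```

Under these assumptions the bare ID fails the old gate and passes the new one, while the non-Claude ID fails both. The follow-up fixes in the same commit (stripping stale thinking fields for non-thinking Claude IDs) address the cases this broader match newly admits.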
Danny Avila	53ee82fe5d	🩹 fix: Coerce Stringified `edit_file` Edits (JSON-in-JSON) (#14056 ) Models sometimes pass edit_file's `edits` as a JSON-encoded string (or stringify individual edit entries) instead of a real array. That failed validation with "Provide old_text and new_text, or a non-empty edits array" and forced a full retry round-trip. normalizeEditArgs now JSON-parses a stringified `edits` value (and stringified entries) before validating. Non-strings and unparseable strings are left untouched, so the existing explicit errors still fire.	2026-07-01 12:41:26 -04:00
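The JSON-in-JSON coercion from 53ee82fe5d could look roughly like the sketch below. `coerceEdits` and `tryParseJson` are hypothetical names; the actual normalizeEditArgs implementation is not shown in the log, so this only mirrors the behavior the message describes: parse a stringified `edits` value and stringified entries, and leave everything else untouched so the existing validation errors still fire.

```typescript
/** Parses a string as JSON; returns the input unchanged if it is not a string or not valid JSON. */
function tryParseJson(value: unknown): unknown {
  if (typeof value !== 'string') {
    return value;
  }
  try {
    return JSON.parse(value);
  } catch {
    return value;
  }
}

/** Coerces a stringified `edits` value, and any stringified entries, before validation. */
function coerceEdits(edits: unknown): unknown {
  const parsed = tryParseJson(edits);
  if (!Array.isArray(parsed)) {
    // Not an array even after parsing: hand it to the validator as-is.
    return parsed;
  }
  return parsed.map((entry) => tryParseJson(entry));
}

// The whole array arrives as one JSON string.
const fromString = coerceEdits('[{"old_text":"a","new_text":"b"}]');
// The array is real, but each entry was stringified.
const fromEntries = coerceEdits(['{"old_text":"a","new_text":"b"}']);
// Unparseable input is passed through untouched.
const untouched = coerceEdits('not json');
```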
Arjun Vijay	89931baf22	🚪 fix: Support Admin Redirect Detection for Same-Origin Subpaths (#14040)	2026-07-01 11:40:02 -04:00
Danny Avila	e6f5b6e70a	🌍 i18n: Update translation.json with latest translations (#14053)	2026-07-01 11:18:00 -04:00
Danny Avila	bb7d99d56c	🫷 feat: Exclude File Authoring Tools From Eager Execution (#14051 ) * feat: exclude create_file/edit_file from eager execution Side-effecting host file-authoring tools should not be speculatively eager-executed: a write can land before the turn commits, and the eager path's incrementally-streamed args can diverge from the final tool call, tripping the SDK's 'changed after eager execution' guard so the model is told the write failed and loops (observed with create_file writing a large file to /mnt/data). Pass excludeToolNames so these tools run on the normal ToolNode path with the final args. Requires @librechat/agents with eager-exclusion support; older versions ignore the field. * chore: Bump `@librechat/agents` to v3.2.56 * refactor: reorder imports in run.ts for clarity * fix: also exclude execute_code/bash_tool from eager execution The eager 'changed after eager execution' corruption isn't specific to file authoring — any tool with a large free-form streamed arg is exposed. Observed live: a bash_tool heredoc (a full Python script in `command`) tripped the guard and the write never landed. execute_code (`code`) and bash_tool (`command`) carry large args and run code (side effects), so exclude them from eager alongside create_file/edit_file. * feat: wire codeSessionToolNames so create_file/edit_file share the code sandbox Activates the agents#283 capability: pass create_file/edit_file as codeSessionToolNames so their exec session/files fold into the shared code session and a file they write is visible to later execute_code/bash_tool calls (and the existing session is injected into their requests). No-op until @librechat/agents ships codeSessionToolNames (agents#283). 
* test: guard code-tool eager/session wiring in createRun Asserts createRun passes excludeToolNames (create_file/edit_file/execute_code/bash_tool) and codeSessionToolNames (create_file/edit_file) to Run.create, the wiring the create_file->bash_tool sandbox-sharing chain depends on, which was silently missing before. Guards against a future edit dropping it. Mirrors the run-summarization test harness (mocks Run.create). The full create_file->bash_tool chain runs through the real code sandbox and can't run in the mock CI harness; the SDK mechanism is covered by @librechat/agents unit tests, and this guards the LibreChat wiring. * style: fix prettier formatting in run-codeTools test * chore: Bump `@librechat/agents` to v3.2.57	2026-07-01 11:07:30 -04:00
Danny Avila	f5c64a4d6d	📐 test: Guard Web Search Description Length (#14044) * test: Guard web search description length * test: Check registered web search description * style: Format web search description assertion	2026-07-01 10:42:53 -04:00
Malte Polley	e88f7a8f19	📧 fix: Add .eml (message/rfc822) Support to File Upload (#13989) * fix: add .eml (message/rfc822) support to file upload * chore: restore package lock metadata --------- Co-authored-by: Malte Polley <ahabsfriend@posteo.de> Co-authored-by: Danny Avila <danny@librechat.ai>	2026-07-01 09:21:14 -04:00
matt burnett	b20abb2593	fix: bound peak memory of concurrent base64 attachment encoding (#14023) * fix: bound peak memory of concurrent base64 attachment encoding * chore: sort encode imports --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-07-01 08:22:16 -04:00
matt burnett	38ab4add3d	fix: preserve role SHARE permissions across boot in initializeRoles (#14022) * fix: preserve role SHARE permissions across boot in initializeRoles * chore: sort role method spec imports --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-07-01 08:21:46 -04:00
Jaka Centa	68bb533083	🔥 fix: Firebase CDN Initialization Under tsdown CJS Interop (#14046 ) The `@librechat/api` build migrated to tsdown (rolldown/oxc) in #13595. tsdown externalizes third-party deps and uses strict CJS interop, so a default import of the Firebase v9+ modular SDK — whose CJS entry is `__esModule`-marked with only named exports and no `default` — resolves to `undefined`. `firebase.initializeApp(...)` then throws: TypeError: Cannot read properties of undefined (reading 'initializeApp') crashing startup whenever the Firebase file strategy is configured (`fileStrategy: firebase` or a granular `fileStrategies` entry). Switch to the idiomatic modular named import (`initializeApp`) and use the already-imported `FirebaseApp` type for the return annotation.	2026-07-01 08:20:40 -04:00
Marco Beretta	6576688f1b	📥 fix: Download Original File From Artifact Preview Panel for Office Documents (#14026 ) * fix: download original file from artifact preview panel for office documents The preview panel download button serialized the rendered HTML preview instead of the original binary for office artifacts (pptx/xlsx/docx) produced by the code interpreter, so users got an `index.html` text scrape rather than the file. The inline chat card was unaffected because it downloads the real file via `useAttachmentLink`. Thread the original-file download metadata (filepath/file_id/source/user) through `fileToArtifact` onto the Artifact, and update `DownloadArtifact` to fetch the original file through that same path for preview-only office artifacts. Text, source, and markdown artifacts keep the blob path so their in-panel content (and edits) still download as-is. Closes #14002 * fix: require a usable route before downloading the original artifact file A shared link to a non-snapshotted code-execution office artifact strips source/user and deletes filepath while keeping file_id (share sanitization + applyShareFileRoute). The preview-panel download gate treated that lone file_id as sufficient, so it routed to an empty useCodeOutputDownload fetch and downloaded nothing instead of falling back to the preview-content blob. Take the original-file branch only when useAttachmentLink can actually fetch: a non-empty filepath (http target, share route, or code-output URL) or full local-file metadata (isLocallyStoredSource + file_id + user). Export isLocallyStoredSource from LogLink so the panel reuses the same predicate. * fix: only show artifact download success after the file is delivered useAttachmentLink swallows fetch errors (an expired code-output URL or a 404 share download) and resolves without throwing, so the preview-panel download button flipped to the success checkmark even when no file was downloaded. 
Return a boolean from handleDownload (true once a download is initiated, false on error/empty response) and only mark the artifact download as succeeded when a file was actually delivered. The return value is ignored by the existing onClick callers.	2026-07-01 08:19:34 -04:00
matt burnett	c00fb2d73d	fix: `stripHeavyErrorFields` Winston format (defense-in-depth) (#14018)	2026-06-30 20:35:51 -04:00
matt burnett	84329ab0ff	fix: use `logAxiosError` at the RAG file_search/context call sites (#14014)	2026-06-30 20:35:01 -04:00
Danny Avila	954caef3a3	🔄 chore: Bump `@librechat/agents` to v3.2.55	2026-06-30 20:28:40 -04:00
Alexey Korepanov	ac759ef2f7	🥷 feat: Add `showInMenu` Option to Model Specs (#14034 ) Add an optional `showInMenu` flag to model specs. When set to false, the spec is dropped from the model selector menu and from the client startup config (GET /api/config), but remains resolvable server-side by name — a request that sends `spec: "<name>"` still works, since server-side resolution uses the full, unfiltered list. Unlike `showIconInMenu` (which only hides the icon), this hides the whole entry. The flag is optional and defaults to listed, so existing specs are unaffected. Adds an `excludeHiddenModelSpecs()` helper (applied before `sanitizeModelSpecs`) plus unit tests.	2026-06-30 19:32:59 -04:00
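The `showInMenu` behavior from ac759ef2f7 can be sketched as a filter applied only to the client-facing list, with name resolution still running against the full list. The `ModelSpec` shape and both function names here are simplified, hypothetical stand-ins; this is not the `excludeHiddenModelSpecs()` helper itself, whose source is not shown in the log.

```typescript
// Simplified, hypothetical spec shape: only the fields this sketch needs.
interface ModelSpec {
  name: string;
  showInMenu?: boolean;
}

/** Drops specs explicitly marked showInMenu: false; an unset flag means "listed". */
function excludeHidden(specs: ModelSpec[]): ModelSpec[] {
  return specs.filter((spec) => spec.showInMenu !== false);
}

/** Server-side lookup by name against the full, unfiltered list. */
function resolveSpecByName(allSpecs: ModelSpec[], name: string): ModelSpec | undefined {
  return allSpecs.find((spec) => spec.name === name);
}

const allSpecs: ModelSpec[] = [
  { name: 'general' },
  { name: 'internal-eval', showInMenu: false },
  { name: 'coding', showInMenu: true },
];

// What the model selector menu and the client startup config would see.
const menuSpecs = excludeHidden(allSpecs);
// A request that sends spec: "internal-eval" still resolves.
const hiddenButResolvable = resolveSpecByName(allSpecs, 'internal-eval');
```

Comparing against `false` rather than testing truthiness is what keeps existing specs, which never set the flag, in the menu.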
Danny Avila	927a5957cb	📏 fix: Make `create_file` Missing-Content Error Truncation-Aware (#14043 ) A large SKILL.md/file authored in one create_file call can exceed the model's max output token budget; the response is cut off before the content argument finishes, the arg is dropped, and the handler returns a bare 'content is required'. The model reads that as a forgotten field and retries the same oversized write, looping. Make the error actionable: tell the model the response may have been truncated and to keep the main file lean, moving bulky sections into separate files written in their own calls.	2026-06-30 19:31:48 -04:00
Danny Avila	9f8b6d92c0	🤖 feat: Add Claude Sonnet 5 Support (#14042) * ✨ feat: Add Claude Sonnet 5 Support Wire up the claude-sonnet-5 model across token, pricing, and model-list config: - Context window (1M) and max output (128K) in @librechat/api token maps - Standard pricing ($3/$15 per MTok) and cache rates in data-schemas tx - 128K output-token carve-out in anthropicSettings (the family-wide 64K rule capped Sonnet 5 below its real limit); Bedrock/Vertex thinking and 1M-context detection already cover sonnet major >= 5 generically - Add to shared Anthropic, Bedrock, and Vertex default model lists, plus the .env.example examples - Tests for context/output/pricing/matching across the affected packages * ✅ test: Align Sonnet 5+ maxOutputTokens defaults with 128K spec getLLMConfig defaults flow from anthropicSettings.maxOutputTokens.reset(), which now returns 128K for Sonnet 5+. Update the future-proofing assertions in llm.spec.ts (Sonnet 5.x and 6-9.x) that still expected the old family-wide 64K cap. Haiku stays 64K; Opus stays 128K. * 🎚️ fix: Gate Sonnet 5 capability behaviors (sampling, thinking) Adding claude-sonnet-5 to the default list exposed it without the Anthropic capability gates, all confirmed against the live API: - omitsSamplingParameters: Sonnet 5 returns 400 on non-default temperature/top_p/top_k ('deprecated for this model'); now dropped so selecting the model with saved sampling settings no longer fails. - requiresExplicitThinkingDisabled: omitting 'thinking' runs adaptive ON by default on Sonnet 5, so disabling thinking now sends { type: 'disabled' } (verified: 200, no thinking block) instead of omitting the field. - omitsThinkingByDefault: thinking.display defaults to omitted (empty thinking blocks); the display resolver now returns 'summarized' for Sonnet 5+ so the Thoughts UI keeps working (verified: 757-char summary returned). Gates apply to both the direct Anthropic and Bedrock paths. Tests added in bedrock.spec and llm.spec. 
* 🩹 fix: Sonnet 5 Bedrock availability + thinking-off persistence Round-2 Codex review (all verified against the live API / Anthropic docs): - Sonnet 5 is NOT available on the legacy Bedrock InvokeModel/Converse surface (Anthropic docs: 'use Claude in Amazon Bedrock or Claude Platform on AWS'), which is what LibreChat's ChatBedrockConverse uses. Removed it from the default Bedrock model lists (config + .env.example). Opus 4.8/4.7/Fable 5 stay — those ARE reachable via InvokeModel. Sonnet 5 remains on the direct Anthropic API and Vertex, where it works. - Reverted the Bedrock-side explicit-disabled thinking handling added last round: with Sonnet 5 off Bedrock, no Bedrock model needs { type: 'disabled' }, so that path (and its round-trip concern) no longer applies. - Direct Anthropic path: a persisted { type: 'disabled' } thinking object now normalizes to a boolean flag in getLLMConfig, so a user's Sonnet 5 'thinking off' setting stays off across the model_parameters round trip instead of flipping back to adaptive (a truthy object skipped the disabled branch). * ↩️ fix: Restore Sonnet 5 on Bedrock (Converse) — verified live Reverses the round-2 removal: Sonnet 5 IS available on AWS Bedrock. Tested live via the Converse API: - global.anthropic.claude-sonnet-5 returns a normal response - bare anthropic.claude-sonnet-5 needs an inference profile — but that's identical to the already-shipping Opus 4.8 / Fable 5 / Sonnet 4.6 entries, which all fail bare on-demand the same way - temperature=0.5 -> 400 'deprecated for this model'; thinking {type:disabled} suppresses reasoning — same as the direct API The 'legacy' Bedrock docs page that claimed Sonnet 5 wasn't on the surface is stale. 
Restored: - anthropic.claude-sonnet-5 in bedrockModels + .env.example - the Bedrock explicit-disabled thinking handling (requiresExplicitThinkingDisabled -> { type: 'disabled' }) - the Finding 4 round-trip fix in bedrockInputSchema (coerce a persisted disabled AMRF.thinking to thinking=false instead of !!thinking -> true), with an end-to-end schema->parser test proving 'thinking off' stays sticky. Direct-path round-trip fix (getLLMConfig thinkingFlag) is unchanged. * 💵 fix: Sonnet 5 intro pricing + sticky disabled thinking on Bedrock reload Round-4 Codex review (both verified): - Pricing: Anthropic lists Sonnet 5 at introductory $2/$10 per MTok (cache $2.50/$0.20) through 2026-08-31, reverting to $3/$15 ($3.75/$0.30) on Sep 1 (confirmed on platform.claude.com/pricing). The static tx multiplier table is used for real balance transactions, so the post-intro rates were overcharging ~50% during the launch window. Switched to the intro rates with a revert comment on both the token and cache entries. - Bedrock disabled-thinking persistence: initializeBedrock feeds persisted model_parameters straight through bedrockInputParser (NOT bedrockInputSchema), where additionalModelRequestFields is a known key — so a prior thinking:{type:'disabled'} was ignored and rebuilt as adaptive on reload. bedrockInputParser now surfaces a persisted disabled AMRF.thinking as thinking=false so it re-emits {type:'disabled'}. Verified end-to-end against the real initializeBedrock call path.	2026-06-30 19:26:33 -04:00
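The "~50%" overcharge figure in the pricing fix above follows directly from the two rate sets quoted in the message. The sketch below just redoes that arithmetic; the `Rates` shape and function name are hypothetical, and the cache pair is kept in the order the log lists it because the message does not say which value is write and which is read.

```typescript
// Rates in USD per million tokens, as quoted in the commit message.
interface Rates {
  input: number;
  output: number;
  cache: [number, number]; // order as listed in the log; write/read split not stated
}

const introRates: Rates = { input: 2, output: 10, cache: [2.5, 0.2] };
const standardRates: Rates = { input: 3, output: 15, cache: [3.75, 0.3] };

/** Percentage by which `charged` exceeds `actual`, rounded to a whole percent. */
function overchargePercent(charged: number, actual: number): number {
  return Math.round((charged / actual - 1) * 100);
}

const overcharge = {
  input: overchargePercent(standardRates.input, introRates.input),
  output: overchargePercent(standardRates.output, introRates.output),
  cacheFirst: overchargePercent(standardRates.cache[0], introRates.cache[0]),
  cacheSecond: overchargePercent(standardRates.cache[1], introRates.cache[1]),
};
// Every entry comes out to 50: billing at the standard rates during the
// intro window charges 1.5x the listed intro price.
```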
Danny Avila	6523a5add6	🗺️ refactor: Light Up In-Viewport Ribs at Rest in Message Nav (#14041 ) At rest (no hover/focus) the rail now reads like a minimap: only the in-viewport message ribs are at full opacity while the rest fade to 40%, updating live as you scroll. Replaces the blanket nav opacity-30 with per-rib opacity driven by the existing viewport-visibility highlight, so hover/focus still brings every rib to full opacity for the fisheye.	2026-06-30 16:49:01 -04:00
Danny Avila	dd8a4558f1	🪗 feat: Dock-Style Fisheye Nav Rail With Instant Hover Preview (#14021) * ✨ feat: Dock-Fisheye Message Nav Rail with Instant Hover Preview * 🎚️ refactor: Uniform resting ribs + clickable cursor for message nav * 🧹 fix: One rib per message in nav rail (dedupe nested message-render) * 🎯 fix: Accurate fisheye focus + click-anywhere-to-jump in message nav - Measure rib centers relative to the column (getBoundingClientRect) instead of offsetTop, which was relative to the positioned <nav> and shifted the pointer->rib mapping by the chevron height (hovered line wasn't the peak, preview showed an earlier message). - Column-level click jumps to the focused rib, so clicking anywhere the preview is showing works even when the pointer is off the thin line. - Restore @librechat/client jest stub to keep the unit isolated. * 💡 fix: Highlight only the hovered rib white in message nav * 🫥 style: Transparent message nav (drop pill background) * ♿ feat: Keyboard focus mirrors hover (magnify + highlight + preview) in message nav Tabbing to / Shift+Alt+M focusing a rib now drives the same fisheye pipeline as pointer hover via onFocus/onBlur on the column: the focused rib magnifies, highlights white, and shows the shared preview. Also addresses Codex finding on keyboard-focus previews. * 🩹 fix: Live tooltip preview + legacy media-query fallback in message nav - Derive the shared preview text from entryById at render time instead of snapshotting it into tip state, so a streaming/updating message refreshes the open tooltip without leaving and re-entering the rail. 
- Feature-detect MediaQueryList.addEventListener and fall back to addListener/removeListener so the reduced-motion watcher no longer throws (and breaks the nav) on Safari/iOS < 14. Addresses both Codex findings on review 4601236141.	2026-06-30 14:21:22 -04:00
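The legacy media-query fallback mentioned above is a common feature-detection pattern and might be sketched as follows. `watchMediaQuery` and the structural `MediaQueryLike` type are hypothetical names introduced so the sketch runs without a DOM; the actual message-nav code is not shown in the log. In a browser, the object passed in would typically come from `window.matchMedia('(prefers-reduced-motion: reduce)')`.

```typescript
type ChangeListener = () => void;

// Structural stand-in for MediaQueryList so the sketch does not need DOM typings.
interface MediaQueryLike {
  addEventListener?: (type: 'change', listener: ChangeListener) => void;
  removeEventListener?: (type: 'change', listener: ChangeListener) => void;
  addListener?: (listener: ChangeListener) => void; // legacy API
  removeListener?: (listener: ChangeListener) => void; // legacy API
}

/**
 * Subscribes to media-query changes, preferring addEventListener and falling
 * back to the legacy addListener/removeListener pair. Returns an unsubscribe function.
 */
function watchMediaQuery(mql: MediaQueryLike, listener: ChangeListener): () => void {
  if (typeof mql.addEventListener === 'function') {
    mql.addEventListener('change', listener);
    return () => mql.removeEventListener?.('change', listener);
  }
  // Older engines expose only the legacy pair.
  mql.addListener?.(listener);
  return () => mql.removeListener?.(listener);
}
```

Returning the unsubscribe function keeps setup and teardown on the same code path, which fits naturally into a React effect cleanup.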
Marco Beretta	e5d5018d7f	⚡ perf: memoize FavoritesList and BookmarkNav to prevent re-renders during streaming (#14011 ) * perf: memoize FavoritesList and BookmarkNav to prevent streaming re-renders ConversationsSection re-renders during message streaming as its conversation-list query and title generation update the cache. Its FavoritesList and BookmarkNav children were not memoized, so they re-rendered on every parent commit despite their props and subscriptions never changing during a stream. Wrap both in React.memo to insulate them from the parent cascade. Their props (toggleNav, isSmallScreen, tags, setTags) are referentially stable, so memo fully decouples them. Add a regression test asserting FavoritesList does not re-run when its parent re-renders with stable props. * test: verify ConversationsSection insulates Favorites/Bookmarks from streaming re-renders Renders the real ConversationsSection (mocking only data hooks) and forces repeated re-renders via a subscription it depends on, mirroring the conversation-list/title-generation cache churn during streaming. Asserts FavoritesList and BookmarkNav do not re-render, proving the parent passes referentially stable props so React.memo holds in the real render path (not just with hand-fed stable props).	2026-06-30 11:30:04 -04:00
Danny Avila	8545af91f2	📦 chore: bump `@librechat/agents` to v3.2.54 (#14035)	2026-06-30 10:42:57 -04:00
Danny Avila	6dbf9d5ad3	🪝 feat: Human-in-the-Loop Runtime - Tool Approval + Ask-User-Question (Slice B) (#13942) * chore: add @langchain/langgraph-checkpoint-mongodb for HITL durable resume * feat: HITL tool approval runtime — backend (Slice B) - endpoints.agents.checkpointer config + durable Mongo checkpointer (seam over the app connection; SDK MemorySaver fallback) with a TTL index + deleteThread pruning - HITL run wiring (PreToolUse policy hook + humanInTheLoop) attached in createRun, fully inert when toolApproval.enabled is off - interrupt gate (pause job -> requires_action + emit on_pending_action) and a resume route that rebuilds the run from the durable checkpoint and run.resume()s it - atomic single-winner resolve; agent-consistency guard; expireStaleApprovals terminal event; checkpoint pruned on every non-paused completion (thread_id == conversationId) * feat: HITL tool approval UI — frontend (Slice B) approve/reject/edit/respond + ask-user controls in the tool card (OAuth-button precedent), batch-aware single submit, live + reconnect (resumeState.pendingAction) wiring, and resume mutations posting to /agents/chat/resume. * fix(hitl): decouple ApprovalProvider from chat context ApprovalProvider is now pure state (safe to mount in provider-less / shared / test renders); the context-dependent submit moved to a useResumeSubmit hook the cards call. Part imports getAskUserQuestionPart from ~/utils/approval directly so suites that partial-mock ~/utils render Part without throwing.
* fix(hitl): address Codex review — backend - P1: enforce per-tool allowed_decisions on resume (reject a crafted decision the policy disallows) via findDisallowedDecisions - prune the durable checkpoint on user-abort of a paused run, and before a fresh HITL turn, so a new turn cannot rehydrate an expired/aborted interrupt (thread_id is the stable conversationId) - persist + use isTemporary and the original parentMessageId on resume (temporary chats stay temporary; initializeAgent scopes thread files off the right parent) - generate a deferred first-turn title BEFORE completeJob so its event reaches the client and the final event carries the real title - moderateText: skip when there is no text (tool-approval resume) and moderate the ask-user answer, instead of denying on an empty input * fix(hitl): address Codex review — frontend - render ToolApproval for ANY paused agent tool card (bash/code/file/etc.), not just the generic ToolCall, by wrapping the tool-card branch in Part (moved the rendering out of ToolCall) - findPendingActionMessageIndex only matches an assistant message, never the user message (the underscore-strip could target the user bubble before the assistant placeholder exists) * fix(hitl): address Codex re-review - title eligibility checks the user message’s parent (first turn), not the response’s parent — the previous check could never be true and skipped title generation - use client.buildResponseMetadata() for the resumed message so contextUsage / thoughtSignatures survive (the abort-only helper dropped them) - moderate decisions[].responseText (the respond action’s user text) - give /chat/abort req.config (configMiddleware) so the HITL checkpoint prune on abort actually runs - read resume state BEFORE setContentParts so the in-memory store does not lose the pre-pause seed content - count resumes against LIMIT_CONCURRENT_MESSAGES (increment/decrement) so paused-then-resumed turns cannot bypass the limit - require actionId on resume so a body
without it cannot resolve the current action * fix(hitl): address Codex re-review (round 3) — resume fidelity Bring the lean resume path to parity with sendMessage for things it bypassed: - carry userMCPAuthMap into the rebuilt run so approved MCP tools keep the user's creds - seed initialSessions (buildInitialToolSessions) so approved code/file/skill tools have the pre-pause uploaded-file context (esp. cross-replica / after restart) - await client.artifactPromises and persist them as response attachments (else tool artifacts created after the pause vanish on reload / for late subscribers) - merge metadata: cumulative usage (+ summary marker) from the job, contextUsage / thoughtSignatures from the client — fixes the round-2 regression that underreported post-resume cost * fix(hitl): address Codex re-review (round 4) — resume hardening - resume: require an EXACT paused agent_id match (reject omitted/ephemeral agent_id, not just a different one) and reject an endpoint mismatch, so a request can't rebuild the claimed checkpoint on a different graph - moderateText: also moderate a tool-approval decision's reject `reason` and stringified `editedArguments`, not just `responseText` - request: re-mark the paused response `unfinished:true` after BaseClient saves it as completed, so an expired / never-resumed approval doesn't leave a "finished" response in history; the resume path overwrites it on success * test(hitl): route-level integration test for the resume controller Adds api/server/controllers/agents/__tests__/resume.spec.js, a supertest integration test that drives the real ResumeAgentController over the full pause -> approve -> resume -> finalize lifecycle with the SDK run, durable checkpointer, Mongo, and concurrency cache mocked. The pure decision/liveness helpers run for real via requireActual, so the guard ladder is exercised end to end rather than stubbed. 
25 cases covering: - the authorization / staleness / agent-and-endpoint / actionId guard ladder - tool_approval validation (undecided tool call, policy-disallowed decision) - ask_user_question answer requirement - the concurrency gate (429) and the atomic single-winner claim (409) - the happy path: ACK, run reconstruction, decision->SDK mapping, finalize (save the now-finished response, emit done, complete job, prune checkpoint) - first-turn title generation before stream completion - re-pause (no double finalize), abort-during-resume (no double finalize), and the resume-failure terminal path (emitError + completeJob + prune) * test(hitl): strengthen resume coverage + add approval util tests Acts on a self-audit of the new resume integration test. resume.spec.js (25 -> 32 cases): - replace the tautological emitDone assertion (it only checked the hardcoded `final: true`) with a structural check of the finalEvent payload — responseMessage content/id/unfinished, requestMessage identity, title - cover the previously-unwalked finalize branches: tool-artifact attachments (null-filtered), the aggregatedContent fallback when live content is empty, and client response-metadata attachment - add guard cases: unsupported pending-action type (400) and the pre-multi-tenancy null-tenantId pass-through (must not 403) - add error-path cases: first-turn title generation throwing must still finalize, and a completeJob failure during a resume error must force a terminal job state via the last-resort updateJob client/src/utils/approval.spec.ts (new, 15 cases): - applyPendingAction tool_approval: join by tool_call_id not position, skip completed calls, default allowed_decisions to [], referential stability when nothing changes - applyPendingAction ask_user_question: append, idempotent replace on replay, non-array content coercion - getAskUserQuestionPart type guard; findPendingActionMessageIndex assistant-only resolution (never resolves to the user bubble) * fix(hitl): address Codex 
re-review (round 5) Five findings verified against the code before fixing: - resume: require an EXACT endpoint match (like agent_id) — a resume that OMITS endpoint must not fall through, since the shared chat middleware treats a missing/non-agents endpoint as the ephemeral agent and could rebuild the claimed checkpoint on a different graph - resume: filter malformed content parts before saving the finished response, matching the normal AgentClient path (a resumed turn could otherwise persist an empty/invalid tool_call part that breaks reload/rendering) - resume: accumulate tool artifacts across pause segments — persist them on re-pause and MERGE (not overwrite) at finalize, so artifacts produced before a second approval pause aren't dropped by the next rebuilt client - approval (client): findPendingActionMessageIndex returns -1 when a provided responseMessageId isn't found, so the caller retries instead of attaching the prompt/approval to a prior assistant reply; fall back to the last assistant only when no responseMessageId is given - RedisJobStore: make appendChunk extend-only (XADD + EXPIRE-if-shorter via a single eval) so the on_pending_action chunk emitted after a pause can't reset the chunk-stream TTL back to the running window and evict pre-pause content before the approval is resolved Tests: +endpoint-omitted/unsupported-type/malformed-filter/attachment-merge/ re-pause-persist cases in resume.spec.js (36); ask-retry -1 semantics in approval.spec.ts (16); extend-only TTL assertion in the RedisJobStore Redis integration spec. 
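The extend-only rule behind the RedisJobStore appendChunk fix above can be illustrated with a small pure function. This is a sketch under assumptions, not the Lua script from the commit: `extendOnlyTtl` is a hypothetical name, `requestedTtl` is assumed to be a positive number of seconds, and the 20-minute and 24-hour figures are only illustrative values for a running window and an approval window.

```typescript
// Hypothetical helper: decides which TTL (in seconds) to apply after appending a chunk.
// `currentTtl` follows the Redis TTL convention: -1 means the key has no expiry,
// -2 means the key is missing, and any other value is the seconds remaining.
function extendOnlyTtl(currentTtl: number, requestedTtl: number): number | null {
  // Apply the requested TTL only when it would lengthen the key's lifetime.
  // Returning null means "leave the existing expiry alone".
  return currentTtl < requestedTtl ? requestedTtl : null;
}

// A fresh stream with no expiry yet receives the running window (20 minutes here).
console.log(extendOnlyTtl(-1, 20 * 60));
// A stream already held for a longer window (24 hours here) is not shortened
// by a later append that asks for the running window.
console.log(extendOnlyTtl(24 * 60 * 60, 20 * 60));
```

In Redis itself the read of the current TTL and the conditional EXPIRE would have to happen in one script so no other writer can interleave, which is why the commit describes a single eval.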
* test(hitl): mongodb-memory-server integration test for the checkpointer seam The checkpointer unit spec covers config/selection with no DB connection; this exercises the durable Mongo seam against a real (in-memory) MongoDB — the part correctness actually depends on: - getAgentCheckpointer builds a real MongoDBSaver when Mongo is connected and setup() creates the TTL index (expireAfterSeconds) on the checkpoint collection - memory type returns undefined (SDK MemorySaver fallback) even when connected - saver is memoized per resolved config - deleteAgentCheckpoint prunes a thread's persisted checkpoint (the cross-turn isolation guarantee: turn N+1 on the same conversationId can't rehydrate it) - pruning is thread-scoped — deleting one conversation leaves others intact - undefined threadId is a no-op * fix(hitl): address Codex re-review (round 6) Four findings verified against the code before fixing: - messageFilterPii: scan the resume payload's user-authored text (ask-user `answer`, and a tool-approval decision's `respond` text, `reject` reason, and edited tool arguments) — the shared /resume route ran through the PII filter but it only inspected req.body.text, so a blocked token rode the resume payload back into the model/tool (mirrors the earlier moderateText fix) - resume: re-prime skill files invoked in the pre-pause segment before rebuilding the run, so an approved code/file-backed tool keeps the injected skill-file session refs instead of running without them (mirrors the normal path's primeInvokedSkills; the pre-pause content stands in for the message payload) - hitl: pin the graph identity. Persist a fingerprint of the graph-determining request fields (endpoint, agent_id, model, spec, ephemeralAgent — normalized) on the pending action at pause, and reject a resume whose recomputed fingerprint differs. 
This closes the ephemeral-agent gap, where agent_id is undefined so the id guard can't tell two ephemeral configs apart - resume: reject incomplete edit/respond decisions (findIncompleteDecisions) — an `edit` without an object editedArguments or a `respond` without non-empty responseText is 400'd before mapping, rather than defaulting to {} / '' and resuming with behavior the user never approved Tests: incomplete-decision + fingerprint match/mismatch cases in resume.spec.js (41); findIncompleteDecisions + computeAgentRequestFingerprint unit tests; and resume-field PII cases in messageFilterPii.spec.ts. * fix(hitl): address Codex re-review (round 7) Four findings verified against the code before fixing: - RedisJobStore: clear `agent_id` on createJob (add it to staleHitlFields). The job hash is keyed by conversationId and reused across turns; updateMetadata only writes agent_id when truthy, so a conversation that switched from a saved agent to an ephemeral/no-agent turn kept the old id and the resume guard rejected the valid pause as a different agent. (real correctness bug) - fingerprint: include `promptPrefix` in computeAgentRequestFingerprint, and re-send it on resume (ResumeAgentFields + buildResumeFields). Ephemeral agents derive their system instructions from promptPrefix, so a resume changing it previously passed the pin and rebuilt different instructions. (completes the round-6 fingerprint) - resume: the re-pause branch now persists the segment's accumulated CONTENT (filtered), not just artifacts, so an approval that expires/reaps without a final resume no longer loses everything streamed during the resumed segment. - request: carry `manualSkills`/`alwaysAppliedSkills` on the persisted user message so a resumed turn's reconstructed requestMessage keeps its skill pills instead of dropping them until a full reload. 
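The request fingerprint introduced in round 6 and extended with `promptPrefix` in round 7 can be sketched as a hash over normalized graph-determining fields. This is only an illustration: the log does not show how computeAgentRequestFingerprint normalizes or hashes its input, so `requestFingerprint`, `stableStringify`, the `GraphRequestFields` type, and the choice of SHA-256 are all assumptions.

```typescript
import { createHash } from 'node:crypto';

// Illustrative field set taken from the commit text; the real type is not shown in this log.
interface GraphRequestFields {
  endpoint?: string;
  agent_id?: string;
  model?: string;
  spec?: string;
  promptPrefix?: string;
  ephemeralAgent?: Record<string, unknown>;
}

// Serialize with sorted keys so logically equal objects produce equal strings.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj)
      .filter((key) => obj[key] !== undefined)
      .sort();
    return `{${keys.map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`).join(',')}}`;
  }
  return JSON.stringify(value ?? null);
}

function requestFingerprint(fields: GraphRequestFields): string {
  // Normalize missing values to null so "omitted" and "undefined" hash identically.
  const normalized = {
    endpoint: fields.endpoint ?? null,
    agent_id: fields.agent_id ?? null,
    model: fields.model ?? null,
    spec: fields.spec ?? null,
    promptPrefix: fields.promptPrefix ?? null,
    ephemeralAgent: fields.ephemeralAgent ?? null,
  };
  return createHash('sha256').update(stableStringify(normalized)).digest('hex');
}
```

A resume handler would store the value computed at pause time and compare it with the value recomputed from the resume request, rejecting the request when the two differ.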
Deferred (narrow, no safe contained fix yet — see PR thread replies): - resume rebuild without `addedConvo` for a multi-conversation/added-agent pane - cross-replica re-prime of manually-selected (not model-invoked) skill files Tests: stale-agent createJob clearing (Redis integration), promptPrefix fingerprint match/mismatch (resume.spec.js + policy.spec.ts), re-pause content persistence (resume.spec.js). * fix(hitl): address Codex re-review (round 8) Five findings verified against the code before fixing; the headline is a durable-resume correctness fix (the fingerprint had surfaced it as a 403): - resume durability (the important one): persist the graph-determining request fields (endpoint, agent_id, model, spec, promptPrefix, ephemeralAgent) on the pending action as `resumeContext`, and REPLAY them onto the resume request via a router-level middleware that runs before buildEndpointOption. The client can't reconstruct the ephemeral-agent config after a reload/cross-session, so the round-6/7 fingerprint would 403 a valid durable resume — and even without it the rebuilt agent would lose its tools. Replaying server-side rebuilds the SAME graph regardless of client state (and a crafted resume can't swap it; the fingerprint still matches because the body is restored first). - RedisJobStore: also clear `isTemporary` on createJob (same class as agent_id): a prior temporary turn's flag would otherwise survive a reused conversation hash and a later non-temporary resume would save its response as temporary. - resume: persist `contextMeta` (context-window calibration) onto the saved response like BaseClient does, so the next turn can seed its pruner. - request: carry manualSkills/alwaysAppliedSkills into the onStart metadata update (not just the preliminary one it overwrites), so a resumed turn's requestMessage keeps its skill pills.
Deferred (narrow — see thread reply): - saved-agent edited WHILE a run is paused: agent_id matches but the definition changed; needs an agent version/config hash, which is a larger change for a narrow window. Tests: resumeContext pick/apply + round-trip (policy.spec.ts), contextMeta + manualSkills-on-requestMessage (resume.spec.js), isTemporary clearing (Redis integration). * style(hitl): prettier line-wrap in policy.spec.ts (R8 lint fix) * fix(hitl): address Codex re-review (round 9) Five findings, all fixed (addedConvo — deferred in rounds 7/8 — is now trivial thanks to the round-8 replay): - replay addedConvo: add it to RESUME_CONTEXT_KEYS so the resume middleware restores the parallel/secondary-agent config from the paused request; the client can't reconstruct it, and it determines the rebuilt graph. - skill pills (the real fix this time): the round-8 onStart metadata write was overwritten by trackUserMessage (the authoritative userMessage writer). Carry manualSkills/alwaysAppliedSkills in the emitted `created` message and persist them in trackUserMessage; widen UserMessageMeta + SerializableJobData.userMessage. - execute-code files on resume: seed the paused user message's own files onto req.body.files before initializeClient — they're excluded from the parent-walk code-session rebuild, so an approved code/read-file tool would otherwise resume without them. - in-memory pending-action UI: route ApprovalEvents.ON_PENDING_ACTION in the resume replay/pending-event loops to applyPendingActionToMessages (mirror the live handler), so a pause that lands in the snapshot window still renders its approval controls instead of sitting paused with no UI. - abort isTemporary: the /chat/abort partial-save now sources isTemporary from the job metadata, not req.body (the stop button posts only conversationId), so aborting a paused temporary chat no longer persists an orphaned partial. 
Tests: addedConvo in pickResumeContext (policy.spec.ts), file-restore on resume (resume.spec.js), abort-from-job-isTemporary (abort.spec.js). * fix(hitl): address Codex re-review (round 10) — resume/expiry races Three concurrency/coherence findings, verified against the code before fixing: - expiry-sweep CAS scope: both stale-approval sweeps (GenerationJobManager expireStaleApprovals and the RedisJobStore requires_action cleanup) called expire()/transitionStatus WITHOUT the observed pendingAction.actionId, so the CAS only checked status===requires_action. Between the read and the CAS a user could resolve the observed action and the run re-pause on a FRESH action; the stale sweep would then abort that valid new pause. Now both pass the observed actionId as expectActionId, so the CAS only fires for the action read as stale (a re-paused action has a different id → no-op). - resume graph cache: resumeCompletion cached the rebuilt graph (created with messages:[]) via setGraph; RedisJobStore.getContentParts prefers a cached graph over reconstructing from the chunk log, so a same-replica reload/status poll mid-resume returned aggregatedContent missing the pre-pause content. Skip setGraph on resume so introspection falls back to the complete chunk reconstruction (setContentParts still seeds the in-memory store). - pending-action UI: applyPendingActionToMessages scheduled a SINGLE animation-frame retry then dropped the pending action; Recoil/React updates can take several frames under load, leaving a valid requires_action run with no approval controls. Retry across frames (bounded at 120) until the target message commits. Test: expire() with a mismatched expectedActionId no-ops while the matching id expires (pendingAction.spec.ts). 
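The action-scoped compare-and-set described in the expiry-sweep finding above can be sketched with an in-memory job object. This is a simplified illustration, not the store code: `SketchJob` and `expireIfStillPending` are hypothetical names, and a single-threaded check-then-write stands in for what would need an atomic operation in a shared store.

```typescript
type JobStatus = 'running' | 'requires_action' | 'aborted';

// Hypothetical, simplified job shape for illustration only.
interface SketchJob {
  status: JobStatus;
  pendingAction?: { actionId: string };
}

/**
 * Expires the job only if it is still paused on the action the sweeper observed.
 * Returns true when the transition happened, false when it was a no-op.
 */
function expireIfStillPending(job: SketchJob, expectedActionId: string): boolean {
  if (job.status !== 'requires_action') {
    return false; // already resolved, aborted, or running again
  }
  if (job.pendingAction?.actionId !== expectedActionId) {
    return false; // re-paused on a fresh action: the observed one is gone
  }
  job.status = 'aborted';
  job.pendingAction = undefined;
  return true;
}

// The sweeper observed action "a1", but the run was resolved and re-paused on "a2".
const job: SketchJob = { status: 'requires_action', pendingAction: { actionId: 'a2' } };
console.log(expireIfStillPending(job, 'a1')); // stale observation, job left untouched
console.log(expireIfStillPending(job, 'a2')); // matching action, job is expired
```

Checking only the status, as the pre-fix sweeps did, would let the first call abort the valid new pause.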
* chore(deps): update @librechat/agents to version 3.2.53 and @langchain/langgraph to version 1.4.7 in package-lock.json and related package.json files * refactor(hitl): add resolveToolApprovalPolicy seam for layered policy Extract the single point where tool-approval policy is resolved for a turn (`resolveToolApprovalPolicy`) and route the run call site through it instead of reading `endpoints.agents.toolApproval` inline. Behaviour-preserving: only the `endpoint` layer is wired today, so the result is identical to reading the app policy directly. The `agent` and `skills` layers are reserved seams with documented precedence (endpoint owns the `enabled` kill switch; agent overrides mode/allow/deny/ask/reason; skills may only tighten), so future per-agent and per-skill policy plumbing lands in one function rather than at the `createRun` site. Adds focused unit tests. * fix(hitl): address Codex re-review (round 11) — resume hardening F1 (P2, security) — applyResumeContext now DELETES any RESUME_CONTEXT_KEY absent from the persisted context, so the resume body carries exactly the graph-determining fields the pause had. Previously only defined keys were overwritten, leaving a client-supplied `addedConvo` (which the request fingerprint does not cover) in place — a crafted resume could rebuild a single-agent checkpoint as a different multi-agent graph/tool set. F3 (P2) — the resume route ACKs (res.json) before initializeClient, so a post-ACK getMCPRequestContext(req, res) saw the response as finished and returned undefined, leaving the resumed run without its run-scoped MCP connection store (approved MCP / OAuth-overlay tools then ran without their request-scoped connections). Pre-seed the store with a null res + cleanupOnResponse:false before the ACK and tear it down in the finally, mirroring the normal stream path (request.js). userMCPAuthMap was already preserved separately, so credentials were not lost — only the connection store. 
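The delete-absent behavior described in F1 above can be sketched as follows. This is a minimal illustration and not the function from the commit: `applyResumeContextSketch` is a hypothetical name, the key list is an illustrative subset based on the fields this log mentions, and a plain record stands in for the request body.

```typescript
// Illustrative key list; the real RESUME_CONTEXT_KEYS constant is not shown in this log.
const SKETCH_RESUME_CONTEXT_KEYS = [
  'endpoint',
  'agent_id',
  'model',
  'spec',
  'promptPrefix',
  'ephemeralAgent',
  'addedConvo',
] as const;

type Body = Record<string, unknown>;

/**
 * Returns a copy of the resume body in which every graph-determining key
 * matches the persisted pause context exactly: present keys are overwritten,
 * and keys absent from the persisted context are deleted.
 */
function applyResumeContextSketch(body: Body, persisted: Body): Body {
  const next: Body = { ...body };
  for (const key of SKETCH_RESUME_CONTEXT_KEYS) {
    if (persisted[key] !== undefined) {
      next[key] = persisted[key];
    } else {
      delete next[key]; // a client-supplied value must not survive
    }
  }
  return next;
}

// A crafted resume tries to add a secondary agent and swap the agent id.
const restored = applyResumeContextSketch(
  { actionId: 'x', agent_id: 'other', addedConvo: { agent_id: 'injected' } },
  { endpoint: 'agents', agent_id: 'agent_123' },
);
console.log(restored);
```

Overwriting only the defined keys, as the earlier version did, would have left the injected `addedConvo` in place.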
Declined: the ApprovalContext NEW_CONVO guard (P2) is a false positive — the `created` SSE event updates the conversation atom before any pause renders, so the id is concrete by click time (details in the PR thread). Tests: policy.spec (absent-key delete) + resume.spec (MCP context pre-seed/cleanup order). * fix(hitl): address Codex re-review (round 12) — resume fidelity + multi-tool UI F4 (P2) — temporal prompt vars: resume rebuilt the agent without restoring req.conversationCreatedAt or req.body.timezone, so {{current_datetime}}-style vars compiled a different system prompt than the paused graph (resume wall-clock, unzoned). Add 'timezone' to RESUME_CONTEXT_KEYS (persisted at pause, replayed by the resume middleware) and restore conversationCreatedAt from the convo before initializeClient — mirroring the normal path's resolveConversationCreatedAt. F5 (P2) — multi-tool approval: applyPendingActionToMessages stopped retrying once ANY tool-call part was tagged, so siblings that rendered on later frames never got approval controls and the resume route 400'd the partial batch. Add countTaggedApprovalParts and keep the bounded RAF retry going until every action_request is tagged (ask_user_question unchanged — one synthetic part). F6 (P3) — Edit accepted `null`/`[]` (valid JSON, non-object), enabling Submit for a value the resume route rejects via findIncompleteDecisions. Mirror the server's plain-object check in the client (store + editIsValid) so Submit only enables for an accepted value. Tests: policy.spec (timezone round-trip), resume.spec (conversationCreatedAt restore), approval.spec (countTaggedApprovalParts). * fix(hitl): address Codex re-review (round 13) — recurse into subagent approvals F9 (P2) — a tool paused INSIDE a subagent has its tool_call_id in the parent subagent tool_call's nested `subagent_content`, not as a top-level message part. 
applyToolApproval and countTaggedApprovalParts only scanned top-level content, so the approval never attached and the round-12 retry loop counted 0 tagged parts and spun to its frame cap with no controls. Both now recurse into `subagent_content` (immutably, so React refs update): the nested call gets tagged and is counted, so the retry terminates. Added approval.spec cases for the nested tag + count. Note: surfacing the interactive approve/reject controls inside the subagent view is a deliberate follow-up — ToolApproval -> useResumeSubmit -> useChatContext crashes when rendered in the portaled subagent dialog (outside the chat/approval providers), so that needs the controls scoped to the in-provider inline render (or the dialog wrapped with the providers). This commit fixes the data/traversal layer only. F7 (discovered-tool history on resume) and F8 (redis chunk TTL pause race) were verified false positives — see the PR threads. * fix(hitl): address Codex re-review (round 14) — resume fidelity + expiry relay F13 (P2) — manualSkills are graph-determining (skill allowed-tools union into the tool set before tools load) but weren't replayed, so a reload lost the skill tools and a crafted resume could inject a different skill past the fingerprint. Add 'manualSkills' to RESUME_CONTEXT_KEYS (same replay-only pattern as timezone/ addedConvo; the delete-absent half blocks injection). Not alwaysAppliedSkills — that's resolved server-side from the DB, not req.body. F12 (P2) — the resume final SSE built requestMessage from job.metadata.userMessage (persisted without files), so attachments vanished from the user bubble on resume. Spread the already-restored req.body.files onto it, matching the normal path. 
F11 (P2) — multi-replica approval expiry: RedisJobStore.cleanupRequiresActionIndex on another replica can win the requires_action->aborted CAS (it sets the hash error but has no event transport), and the local sweep then skips because the job is no longer requires_action, so a client subscribed here never gets the terminal error until the reap path. expireStaleApprovals now relays APPROVAL_EXPIRED_ERROR for a locally-subscribed job already aborted FOR approval expiry (error-string gated, idempotent via the errorEvent flag). emitError already publishes cross-replica. Tests: policy.spec (manualSkills round-trip + inject-drop), resume.spec (final requestMessage carries restored files). * fix(hitl): render approval controls for subagent-nested tool pauses (F10) Round-13 made applyToolApproval/countTaggedApprovalParts recurse into subagent_content (data), but SubagentDialogPart rendered nested TOOL_CALL parts with <ToolCall> only and never mounted <ToolApproval>, so a tool paused inside a subagent showed no controls and the run was unresolvable. Render <ToolApproval> in SubagentDialogPart's TOOL_CALL branch when the nested tool_call carries an approval and isn't yet resolved, mirroring the top-level Part.tsx render. The subagent dialog portals (OGDialog → ReactDOM.createPortal), but React context flows through the React tree, not the DOM tree, so ToolApproval resolves ApprovalProvider/ChatContext and the controls work + submit. Also harden useResumeSubmit: read ChatContext via useContext (non-throwing) instead of the throwing useChatContext wrapper, so the cards never crash when rendered outside a ChatContext.Provider (e.g. a search/citation render that passes chat context as a prop) — they degrade to inert (buildResumeFields returns null). 
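The recursion into `subagent_content` that round 13 added and F10 relies on can be sketched as an immutable tree update. This is an illustration under assumptions: `SketchPart`, `tagApproval`, and `countTagged` are hypothetical names, and the `approval` field shape is invented for the example; only `tool_call_id` and `subagent_content` come from the commit text.

```typescript
// Hypothetical, simplified content-part shape for illustration only.
interface SketchPart {
  type: string;
  tool_call_id?: string;
  approval?: { actionId: string };
  subagent_content?: SketchPart[];
}

/**
 * Tags the part whose tool_call_id matches, searching nested subagent_content.
 * Returns new arrays and objects along the changed path only, and returns the
 * original array when nothing changed so React references stay stable.
 */
function tagApproval(parts: SketchPart[], toolCallId: string, actionId: string): SketchPart[] {
  let changed = false;
  const next = parts.map((part) => {
    if (part.tool_call_id === toolCallId) {
      if (part.approval?.actionId === actionId) {
        return part; // already tagged: keep the same reference
      }
      changed = true;
      return { ...part, approval: { actionId } };
    }
    if (part.subagent_content) {
      const nested = tagApproval(part.subagent_content, toolCallId, actionId);
      if (nested !== part.subagent_content) {
        changed = true;
        return { ...part, subagent_content: nested };
      }
    }
    return part;
  });
  return changed ? next : parts;
}

/** Counts tagged parts at every nesting level. */
function countTagged(parts: SketchPart[]): number {
  return parts.reduce(
    (sum, part) =>
      sum + (part.approval ? 1 : 0) + (part.subagent_content ? countTagged(part.subagent_content) : 0),
    0,
  );
}
```

Because the count also walks nested content, a retry loop that waits for every requested action to be tagged can terminate once the nested call is found.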
* style(hitl): re-sort run.ts imports after dev rebase * fix(hitl): address Codex re-review (round 15) — resume content fidelity F14 (P2) — hide_sequential_outputs was applied in chatCompletion before saving/emitting content but not on resume, so a sequential-agent chain that pauses for HITL and resumes persisted/emitted intermediate outputs the setting is meant to hide. Extracted the filter into applyHideSequentialOutputsFilter() and call it from both chatCompletion and resumeCompletion (after handleRunInterrupt, covering the finalize + re-pause reads of client.contentParts). F16 (P2) — on a reloaded HITL pause, the DB already holds the paused user row + partial assistant row; useResumeOnLoad fed those as submission.messages, then finalHandler/createdHandler appended the same pair via requestMessage/responseMessage, duplicating the turn (buildTree doesn't dedupe children by messageId). buildSubmissionFromResumeState now strips the paused user/response rows (by messageId, incl. the padded/unpadded response id) from submission.messages — they're re-supplied by the placeholders + final event. Frontend-only; live (non-reload) pause path untouched. Deferred: F15 (collapsed-card subagent approval registration/visibility) — see thread. Tests: client.test (filter keeps last + tool_call parts / no-op when off), useResumeOnLoad.spec (paused pair stripped from submission.messages). * fix(hitl): address Codex re-review (round 16) — chunk TTL, slot, job replacement F17 (P2) — chunk-stream TTL on pause-before-chunk. CHUNK_APPEND_LUA derived its ceiling only from the chunk key's current TTL, so when the chunks key didn't exist at pause (fire-and-forget append in flight, or an ask-user pause before any chunk), the on_pending_action append created the stream with only the 20m running TTL while the approval window is 24h — content evicted before resume.
The Lua now also reads the job key (KEYS[2]); when status == requires_action it takes max(running, TTL(jobKey)) (the approval window transitionStatus set), else the running TTL. Extend-only preserved; gated on paused status so normal runs never inflate. Both keys share {streamId} (cluster-safe). F19 (P2) — with LIMIT_CONCURRENT_MESSAGES, the approval prompt was emitted before the original request released its slot, so a fast Approve got /resume 429'd. handleRunInterrupt now releases the slot (idempotent via pendingRequestReleased) right after the pause, before the prompt; the request.js pause branch and resume.js finally only release if it didn't (no double-release). F20 (P2) — finalizeResumedTurn never checked the job wasn't replaced before emitDone/ completeJob/saveMessage, so a stale resume could clobber a newer turn that reused the conversationId. Added the createdAt guard the normal request path uses (skip finalization when the live job's createdAt != the paused job's). Deferred: F18 (subagent_content not reconstructed on Redis resume) — joins the subagent cluster (F15). See thread. Tests: RedisJobStore integration (pause-before-chunk gets approval TTL; running stays short), resume.spec (skip finalization on replacement; no double slot release on re-pause). * 🛡️ fix: Guard HITL terminal side-effects against job replacement Jobs are keyed by streamId == conversationId, so a new request REPLACES the running one on the same conversation. The replaced generation's tail must not clobber the live generation's state. Each path now re-reads the live job and compares createdAt against the generation's captured identity before acting. - Thread the generation's createdAt onto the client (request.js + resume.js) as client.jobCreatedAt — the identity every guard compares against. - handleRunInterrupt: skip approvals.pause when this run is no longer the live job, so a stale interrupt can't flip the NEWER job to requires_action. 
- chatCompletion finally: skip the checkpoint prune when replaced, so an older run's late finally can't delete the newer run's resume checkpoint. - resume catch-path: gate emitError/completeJob/prune behind a stillLive check (fail-open if the read throws), mirroring finalizeResumedTurn's success guard. - Persist the turn's uploaded files on job.metadata.userMessage (authoritative trackUserMessage writer) and prefer them on resume over the user DB row, whose save can still be racing a fast /resume. Tests: 13 guard-predicate cases in jobReplacement.spec.js. * 🔁 fix: Harden HITL resume — ownership re-check, file seeding, deferred-tool replay Three follow-ups to the round-17 job-replacement guards (Codex review 4594099963): - G1 (resume.js): the success-path ownership guard runs at the START of finalizeResumedTurn, but saveMessage + first-turn title generation await long enough for a new request to replace the job on the same conversationId. Re-read the live job immediately before emitDone/completeJob/prune so the terminal writes can't tear down the REPLACEMENT job — mirrors the catch-path guard. - G2 (request.js): onStart's metadata/chunk writes that persist the turn's files are fire-and-forget, so a fast approval could read job.metadata.userMessage before files landed. Seed files into getPreliminaryUserMessage instead — that write is AWAITED before the run starts, so files are durable before any interrupt can emit. - G3 (run.ts + client.js + resume.js + IJobStore.ts): the resumed graph is rebuilt with messages: [], so createRun's tool_search-discovery scan finds nothing. A deferred tool discovered earlier in the turn (and targeted by the paused call) was therefore absent from the rebuilt schema-only toolMap — resume would throw "unknown tool" (no loadRuntimeTools fallback is wired). 
Capture discovered tool names at pause via extractDiscoveredToolsFromHistory(run.getRunMessages()), persist them on job.metadata.discoveredTools, and replay them into createRun's new discoveredToolNames input (merged with message-extracted names, gated on hasAnyDeferredTools — inert otherwise). A new createRun test proves the deferred tool is promoted with the replay and absent without it (reproducing the bug). Tests: real createRun deferred-replay suite (run-summarization.test.ts) + G1/G2/G3 guard predicates (jobReplacement.spec.js). Full suite green. * 🔒 fix: Close HITL resume metadata + file-substitution + pause-race gaps Four findings on the round-18 commit (Codex review 4594430222): - H1 (P1, regression in round-18 G3): the discoveredTools captured at pause never reached resume — three metadata allowlists dropped it: GenerationJobManager.updateMetadata, RedisJobStore.deserializeJob, and buildJobFacade (plus the GenerationJobMetadata type). Added discoveredTools to all four, so the deferred-tool replay actually works end-to-end (in-memory store already kept it via Object.assign). - H2 (P2, security): /resume honored a client-supplied `files` array, letting a crafted client resume an approved code/read-file tool against a DIFFERENT file set than the one approved (files aren't in the resume fingerprint/context). Resume now ALWAYS sources files from the paused job (metadata → DB row), clearing any client-supplied set. - H3 (P2, ephemeral fidelity): non-default model parameters (temperature, max tokens, custom endpoint params) were lost on resume — ephemeral agents derive them from the request body, which the resume payload omits. Capture the resolved model_parameters in resumeContext at pause and replay them onto the body on resume (excluding `model`, which is replayed via the fingerprinted RESUME_CONTEXT_KEYS path). Saved agents already source these from the DB.
- H4 (P2, Redis race): a pause landing between the resume snapshot and the Pub/Sub subscription reached neither resumeState.pendingAction nor (Redis) pendingEvents, and approval events aren't persisted to replayEvents — the client attached to a paused job with no approval UI. subscribeWithResume now re-reads the live job AFTER subscribing and surfaces the pending action if the snapshot missed it (live read, no staleness). Tests: discoveredTools metadata round-trip + subscribeWithResume re-read (pendingAction.spec.ts); client-file substitution rejection (resume.spec.js); model-parameter replay predicate (jobReplacement.spec.js). * 🧹 fix: Clear stale discovered tools, release slot on claim error, extend run-step TTL Three follow-ups on the round-19 commit (Codex review 4594783691): - I1 (P2): the round-19 discoveredTools field wasn't cleared on Redis streamId reuse. HSET only overwrites listed fields and handleRunInterrupt only writes discoveredTools when THIS turn discovers a deferred tool — so a replacement turn that pauses without its own discovery inherited the prior run's tool names and force-loaded undiscovered deferred tools on resume. Added discoveredTools to createJob's staleHitlFields HDEL list (the in-memory store already builds a fresh object, so it was Redis-only). - I2 (P2): with LIMIT_CONCURRENT_MESSAGES, approvals.resolve runs after the slot increment but before the run's try/finally, so a store/Redis error there leaked the slot until the counter TTL expired (spurious 429s on retry of the still-paused approval). Wrapped the claim in try/catch that decrements the slot and returns 500. - I3 (P3): saveRunSteps did SET ... EX running unconditionally, resetting the run-steps key to the 20-min running TTL even while the job is paused for the longer approval window — a reload after that window lost the tool timeline. 
Now uses a paused-window TTL script mirroring the chunk-stream no-shrink behavior (extends to the approval window when the job hash is requires_action). Also fixes a latent strict-tsc cast error in the round-19 pendingAction test. Tests: claim-throws-releases-slot (resume.spec.js); discoveredTools cleared on reuse + saveRunSteps preserves the paused TTL (RedisJobStore integration, USE_REDIS). * 🛡️ fix: Guard fast-resume save race, gate HITL to resumable routes, expire on stale submit Three findings on the round-20 commit (Codex review 4595045652): - J2 (P1): a fast /resume can claim + finalize the COMPLETED response while the original request's pause branch is still awaiting `response.databasePromise`; the later unfinished-save then overwrites the completed content. Re-check the job is still paused on THIS generation's action (a claim leaves requires_action; a replacement bumps createdAt) before marking the row unfinished; fail open on a read error. - J3 (P1): the tool-approval wiring (humanInTheLoop + PreToolUse hook + checkpointer) was applied to EVERY createRun caller when toolApproval.enabled, but the OpenAI-compatible and Responses controllers never inspect run.getInterrupt() or persist a pending action — an approval-gated tool would pause there with no approval surface or resume endpoint and the route would emit a normal final response / [DONE] with the tool call dangling. Gate the wiring on a new createRun `hitlCapable` flag, set only by AgentClient (chat + resume). - J4 (P2): a stale-action 409 on submit returned without driving expiry, leaving the job requires_action with a dead action until the periodic sweeper ran — any attached SSE client got no terminal event and the stream appeared to hang. Extracted GenerationJobManager.expireApproval(streamId, actionId) (expire CAS + terminal SSE, shared with the sweeper) and call it from the resume route when the observed action is stale. 
J1 (nested subagent approval controls not mounting while the details dialog is closed) is a valid frontend issue in the deferred subagent-HITL path — tracked separately (replied on the thread) since the fix touches the shared dialog primitive and needs UI verification. Tests: HITL-gate both directions (run-summarization.test.ts); expire-on-stale-submit (resume.spec.js); fast-resume unfinished-save guard predicate (jobReplacement.spec.js). * 💄 style: Wrap captureAgents signature to satisfy prettier (CI lint)	2026-06-29 16:56:41 -04:00
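The job-replacement guards described in the commit above (re-read the live job before terminal writes, treat a bumped createdAt as a replacement, fail open if the read throws) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `JobSnapshot`, `ReadJob`, `ownsJob`, and `stillLive` are hypothetical names, and treating a missing job as "not live" is an assumption.

```typescript
// Hypothetical job shape; the real job record has more fields.
interface JobSnapshot {
  streamId: string;
  createdAt: number;
}

// Hypothetical store read; a real store (for example Redis) may throw.
type ReadJob = (streamId: string) => Promise<JobSnapshot | null>;

// Pure predicate: does the run that captured `mine` still own the live job?
// A replacement request on the same streamId is assumed to bump createdAt.
function ownsJob(mine: JobSnapshot, live: JobSnapshot | null): boolean {
  // Assumption: a missing job means there is nothing left for this run to finalize.
  return live !== null && live.createdAt === mine.createdAt;
}

// Guard to call immediately before terminal writes (emit done, complete, prune).
async function stillLive(mine: JobSnapshot, readJob: ReadJob): Promise<boolean> {
  try {
    return ownsJob(mine, await readJob(mine.streamId));
  } catch {
    // Fail open: if the read itself fails, proceed rather than leave the job hanging.
    return true;
  }
}
```

Keeping the comparison in a pure predicate makes it easy to cover with table-style unit tests, while the async wrapper carries only the fail-open policy.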
Danny Avila	186b738d2d	🪟 fix: Re-measure Sidebar Chat List on Width Change to Fix Date-Group Spacing (#13981) * 🪟 fix: Re-measure sidebar chat list on width change to fix date-group spacing When the sidebar is expanded from a collapsed reload, virtualized rows first measure mid-animation at a narrow width, so date-group headers wrap and cache an inflated height. CellMeasurerCache(fixedWidth) keys heights by row, not width, so the stale height persists once full width is reached — leaving gaps under headers. Invalidate the measurement cache and recompute row heights whenever the measured list width changes. Adds a Playwright mock e2e (seeds backdated convos across date groups via a new db helper) that fails without the fix and passes with it. * 🧪 test: Harden sidebar e2e (runtime-env path, midnight-safe seed, convo isolation) Addresses Codex review on PR #13981: - db.ts honors E2E_RUNTIME_ENV_PATH when locating the runtime Mongo URI. - Seed timestamps anchor on local noon so the Today group stays in-day near midnight. - Clear the shared user's conversations before seeding so later date-group headers are not pushed below the virtualized viewport by other specs' leftover chats.	2026-06-26 13:43:03 -04:00
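The stale-height problem in the commit above comes from a cache keyed by row only. A minimal, library-free sketch of the fix idea (drop every cached height when the measured width changes) might look like this; `RowHeightCache` and `syncWidth` are hypothetical names, not the react-virtualized API the commit refers to.

```typescript
// Hypothetical stand-in for a row-keyed measurement cache.
class RowHeightCache {
  private heights = new Map<number, number>();
  private width: number | null = null;

  get(row: number): number | undefined {
    return this.heights.get(row);
  }

  set(row: number, height: number): void {
    this.heights.set(row, height);
  }

  // Call with the currently measured list width.
  // Returns true when cached heights were invalidated, which is the
  // caller's cue to recompute row heights.
  syncWidth(width: number): boolean {
    const changed = this.width !== null && this.width !== width;
    if (changed) {
      // Heights measured at the old width may be wrong (e.g. wrapped headers).
      this.heights.clear();
    }
    this.width = width;
    return changed;
  }
}
```

For example, a height cached while the sidebar was mid-animation at a narrow width is discarded as soon as `syncWidth` sees the full width, so the next measurement replaces the inflated value.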
Danny Avila	c948606a8c	🛗 perf: Fetch Pinned Agents Directly Past the Global Agents Map (#13972) * 🚀 perf: Decouple Pinned Agents from Global Agents Map in Sidebar Pinned/favorite agents in the sidebar waited for the full global agents map (useListAgentsQuery, which walks every pagination cursor) before rendering. In environments with many agents this left pinned items in a loading state even though their IDs were already known. FavoritesList now fetches pinned agent IDs directly via getAgentById when the global map is still loading, and falls back to filtering only missing IDs once the map is available. The loading state tracks just the small set of pinned-agent queries instead of the entire catalog, so pinned agents appear as soon as their own data resolves. Closes #13967 * 🩹 fix: Address Codex review on pinned-agent decoupling - Stop caching the {found, agent} wrapper under the shared [QueryKeys.agent, id] key; direct fetches now return a plain Agent like useGetAgentByIdQuery, so opening/selecting a pinned agent within the stale window can no longer read a wrapper as an agent. Missing (404/403) agents are detected via the query error state. - Gate the direct fetches on the agents endpoint being enabled, so pinned agents the endpoint list intentionally hides are not fetched, rendered, or cleaned up when the endpoint is disabled. - Keep the loading skeleton while a direct fetch fails with a transient (non-404/403) error and the global agents map is still loading, so a pinned agent no longer disappears on a momentary 500/network error during startup. - Remove the now-unused AgentQueryResult type. 
* 🩹 fix: Address Codex round 2 on pinned-agent decoupling - Keep the loading skeleton (not an empty/collapsed row) while the endpoints query is still loading. The endpoint gate previously treated the default empty config as disabled, so pinned-agent favorites rendered an empty row that could be measured and cached by the CellMeasurer before the config arrived. isAgentsLoading now stays true while isEndpointsLoading is true. - Replace the blanket retry:false on direct pinned-agent fetches with a predicate that skips missing-agent (404/403) errors but still retries transient 500/network failures, restoring the prior default-retry resilience on the fast path. - Add data-testid to the favorite skeleton and a regression test for the endpoints-loading window. * 🛡️ fix: Don't delete pinned favorites on a global agents 403 GET /api/agents/:id runs the role-level AGENTS.USE check (checkAgentAccess) before the per-agent VIEW ACL, so a temporarily revoked role returns 403 for every agent. Because direct fetches now run while the agents map is undefined, treating those 403s as missing agents made the cleanup effect persist reorderFavorites and wipe all pinned agent favorites. staleAgentIdsKey now returns early while agentsMap is undefined, restoring the original invariant that favorite cleanup only runs once the global map has loaded successfully (which also proves AGENTS.USE is granted). Rendering of pinned agents while the map loads is unaffected; only deletion is deferred.	2026-06-26 13:07:09 -04:00
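The retry change in the commit above (skip retries for missing-agent 404/403 errors, keep them for transient failures) fits the `(failureCount, error) => boolean` shape that React Query accepts for its `retry` option. A rough sketch, assuming the error exposes a numeric `status` field and a cap of three retries; both are assumptions, and `shouldRetryPinnedAgentFetch` is a hypothetical name:

```typescript
// Assumed error shape; the real client error type may differ.
interface HttpLikeError {
  status?: number;
}

// Assumed cap, chosen to mirror a typical default retry count.
const MAX_RETRIES = 3;

function shouldRetryPinnedAgentFetch(failureCount: number, error: unknown): boolean {
  const status = (error as HttpLikeError | null | undefined)?.status;
  // 404/403 mean the agent is missing or not viewable: retrying will not help.
  if (status === 404 || status === 403) {
    return false;
  }
  // Anything else (500s, network errors with no status) is treated as transient.
  return failureCount < MAX_RETRIES;
}
```

Passed as `retry: shouldRetryPinnedAgentFetch` in the query options, this keeps the fast path resilient to momentary failures without hammering the server for agents that are gone.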
Danny Avila	4fa86be424	📂 fix: Mount Skill Files Under `skills/` in Code Interpreter (#13961) (#13975) Skill files were primed into the sandbox at `/mnt/data/{skillName}/...`, but the read_file/create_file/edit_file tool descriptions and the read_file bash-fallback hints all assume the `skills/{skillName}/...` namespace (sandbox cwd is `/mnt/data`). Agents therefore reached for `./skills/my-skill/...` in bash and missed ~100% of the time. - Add shared `SKILL_FILE_PREFIX` to agents/skills.ts (moved out of handlers.ts; single source of truth across the three layers). - Prefix the prime upload filenames and session names with `skills/` in skillFiles.ts so the physical mount matches the model-facing namespace; recover the bare relativePath by stripping `skills/{name}/`. - Canonicalize the read_file bash-fallback hints to `/mnt/data/skills/{skillName}/{relativePath}` so the implicit `{name}/...` addressing form is corrected too. Closes #13961	2026-06-26 12:22:06 -04:00
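The prefixing and stripping described in the commit above can be illustrated with a small round-trip helper pair. This is only a sketch: the value of `SKILL_FILE_PREFIX` is assumed to be `skills/`, and `toSandboxName` and `toRelativePath` are hypothetical helper names, not functions from the repository.

```typescript
// Assumed value; the commit only states that the shared constant exists.
const SKILL_FILE_PREFIX = 'skills/';

// Build the sandbox-relative name so the physical mount matches the
// model-facing `skills/{skillName}/...` namespace.
function toSandboxName(skillName: string, relativePath: string): string {
  return `${SKILL_FILE_PREFIX}${skillName}/${relativePath}`;
}

// Recover the bare relative path by stripping `skills/{skillName}/`.
// Returns null when the name does not belong to that skill.
function toRelativePath(skillName: string, sandboxName: string): string | null {
  const prefix = `${SKILL_FILE_PREFIX}${skillName}/`;
  return sandboxName.startsWith(prefix) ? sandboxName.slice(prefix.length) : null;
}
```

With the sandbox working directory at `/mnt/data`, a file named `skills/my-skill/docs/a.md` is reachable both as `./skills/my-skill/docs/a.md` and as `/mnt/data/skills/my-skill/docs/a.md`.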
Danny Avila	12fea693bb	🦥 perf: Lazy-Load Agent Version History in Editor (#13977) Opening the agent editor fetched the full `versions` array (each a complete config snapshot) alongside the agent, so agents with large histories were slow to open. Version history is now loaded only when the user opens it. - Add `getAgentWithVersionCount` (aggregation: version count, no versions array) and `getAgentVersions` data-schemas methods. - `getAgentHandler` returns the version count without the heavy array; add `GET /agents/:id/versions` (EDIT-gated) for lazy retrieval. - Add `useGetAgentVersionsQuery`; VersionPanel reads current config from the cached expanded query and fetches versions on open. Revert keeps the expanded cache and versions query in sync.	2026-06-26 12:19:54 -04:00
Danny Avila	b15d40e3e4	🪣 refactor: Rate-Limit Token Routes and Cap Remote File Downloads (#13978) * harden token and remote file handling * sort s3 storage imports * split token submission rate limits	2026-06-26 12:19:03 -04:00
Danny Avila	3790bdcff2	📇 fix: Index sharedlinks updatedAt for Cosmos DB Sorts (#13979) The getSharedLink query sorts by updatedAt, but the sharedlinks collection had no updatedAt index. Azure Cosmos DB for MongoDB (RU-based) rejects sorts on non-indexed fields, causing an immediate 500 on GET /api/share/link/:conversationId whenever a conversation is opened. Standard MongoDB is unaffected.	2026-06-26 12:18:12 -04:00


4668 commits