Mirror of https://github.com/danny-avila/LibreChat.git
Synced 2026-05-13 07:46:47 +00:00 · 4059 commits
Commit history (SHA1 · message):

9ccc8d9bef · ✨ v0.8.5 (#12727)

997f765d6b · 🌍 i18n: Update translation.json with latest translations (#12783)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

181d705579 · 🧹 fix: Clean Up Orphaned Agent File Stubs After Deletion (#12781)
* 🧹 fix: Prune Orphaned File References on File Deletion
Deleting a file via the Manage Files tab left its file_id in every agent's tool_resources.*.file_ids. Stubs accumulate until the frontend dedupe keys them as duplicates and blocks all new uploads (issue #12776).
- Add removeAgentResourceFilesFromAllAgents in packages/data-schemas: a single updateMany/$pullAll across every EToolResources category.
- Invoke it from processDeleteRequest after db.deleteFiles so every referencing agent is cleaned up, not just the one passed in req.body.
- Wrap the cleanup in try/catch so a stale agent update cannot mask a successful file deletion.
* 🧼 fix: Prune Orphaned File References on Agent Update
Already-affected agents would stay broken even after the delete-time fix: the stubs sit on the agent document until something strips them. Heal them on the next save (issue #12776).
- Add collectToolResourceFileIds + stripFileIdsFromToolResources helpers in @librechat/api — centralizing the tool_resources traversal used by the controller and the follow-up migration script.
- In updateAgentHandler, check the effective tool_resources against the files collection. When orphans are found, either strip them from the incoming tool_resources (if the update sets them) or run the bulk cleanup (if the update leaves tool_resources untouched).
* 🧰 chore: Add Migration to Clean Up Orphaned Agent File References
Complements the delete-time and save-time fixes by healing agents that already accumulated orphan stubs before the upgrade (issue #12776). The script is idempotent — re-running it on a clean database is a no-op.
- Add config/migrate-orphaned-agent-files.js following the existing migrate-*.js convention: --dry-run by default omitted (writes by default) and --batch-size= tuning knob. Streams agents via cursor.
- Register migrate:orphaned-agent-files and :dry-run npm scripts.
- Reuse collectToolResourceFileIds from @librechat/api so migration and runtime share the same traversal logic.
* 🩹 fix: Address Codex/Copilot Review on Orphaned Agent File Cleanup
Refines the #12776 fix series based on automated review feedback.
- Scope save-time pruning to the current agent only. When a PATCH carries tool_resources, strip orphans from the incoming payload and pay the DB round-trip only then. Removes the collection-wide updateMany previously triggered when tool_resources was absent (Codex P2 / Copilot).
- Wrap the orphan check in try/catch so a transient db.getFiles failure can't turn a good save into a 500 (comprehensive review #1).
- Replace Object.values(EToolResources) casts with an explicit list of agent-side categories in both orphans.ts and agent.ts. code_interpreter belongs to the Assistants API and isn't a key of AgentToolResources — including it was a type lie and generated dead MongoDB clauses (comprehensive review #3, #8).
- Export TOOL_RESOURCE_KEYS from @librechat/api and consume it in the migration script, dropping one duplicated definition (#4).
- Cap migration results.details at 50 sample entries so the memory footprint stays bounded on deployments with thousands of corrupted agents (Codex P3).
- Add migrate:orphaned-agent-files:batch npm script to match the convention set by migrate-agent-permissions / migrate-prompt-permissions (#7).
- Add controller-level tests covering the three orphan-pruning paths: strip from incoming tool_resources, leave alone when tool_resources is absent, swallow db.getFiles errors and still save (#6).
- Back pre-existing "should validate tool_resources in updates" test's file_ids with real File docs — the new pruning would otherwise strip them, and that test is about OCR conversion / schema filtering, not file existence. Register the File model in beforeAll so the fixture works.
* 🩹 fix: Tighten TOOL_RESOURCE_KEYS Type and Align Migration Sample Output
Two follow-ups from the second review pass.
- Type data-schemas' TOOL_RESOURCE_KEYS as ReadonlyArray<keyof AgentToolResources> instead of readonly string[]. Data-schemas depends on data-provider, so the import is clean. Catches typos and aligns with the matching export in @librechat/api — doesn't guarantee exhaustiveness, but that's a TypeScript limitation, not a workspace one.
- Align the migration's console output with DETAIL_SAMPLE_LIMIT: print every collected detail (up to 50) and, when more agents were affected than the sample size allowed, show a truncation notice. The old hard cap of 25 meant affected agents in the 26-50 range were collected but never shown.
* ✅ test: Add Integration Coverage for Orphan Cleanup Paths (#12776)
Exercise the delete-time and migration paths end-to-end against a real in-memory Mongo. Catches integration bugs the isolated unit tests on each layer couldn't.
- api/server/services/Files/process.integration.spec.js — the primary repro: seed an Agent + File, call processDeleteRequest, assert the file_id disappears from every referencing agent's tool_resources while unrelated agents stay untouched. Also covers the no-op case and confirms a failure in the new cleanup step cannot roll back the file deletion itself.
- api/test/migrate-orphaned-agent-files.spec.js — drives the migration module: --dry-run reports without writing, apply mode prunes across every tool_resource category, re-running is idempotent, and DETAIL_SAMPLE_LIMIT caps the in-memory sample on wide corruption. Mocks only the connect helper (the spec owns the mongoose instance) so the real migration code path — cursor, $pullAll, reduce — runs.
* 🔒 fix: Run Orphan Cleanup Migration in System Tenant Context
Codex P2 catch: under TENANT_ISOLATION_STRICT=true, the migration throws on the very first Agent.countDocuments() because the tenant isolation plugin fail-closes on queries without tenant context — which makes migrate:orphaned-agent-files unusable on the exact deployments most likely to have accumulated corruption.
- Wrap the scan/prune body in runAsSystem so queries bypass the tenant filter (SYSTEM_TENANT_ID sentinel). The migration legitimately needs cross-tenant visibility — this is the same pattern seedDatabase and the S3 refresh job already use.
- Add a regression test that spies on Agent.countDocuments() and asserts the active tenantStorage context is SYSTEM_TENANT_ID during the call. Pins the wrap against future regressions without the brittleness of toggling the strict-mode env var (which caches on first read).
Note: the delete-time and save-time paths already run inside an authenticated HTTP request where tenantStorage.run is set by auth middleware, so the cleanup naturally scopes to the current tenant — which is the correct behavior there since file ownership is tenant-scoped.
* 🧹 chore: Drop Unused path Import From Process Integration Spec
Leftover from an earlier iteration that resolved the migration path via path.resolve before I switched to a relative require. The import does nothing now — removing it.
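
A rough sketch of the single-pass cleanup this series describes, assuming a Mongoose Agent model and an illustrative category list (the real key list is the TOOL_RESOURCE_KEYS export mentioned above); this is not the repository's actual implementation:

```typescript
import { Model } from 'mongoose';

// Illustrative agent-side categories; the real list is TOOL_RESOURCE_KEYS in @librechat/api.
const TOOL_RESOURCE_KEYS = ['file_search', 'execute_code', 'ocr', 'context'] as const;

/**
 * Strip the given file_ids from every agent's tool_resources with one
 * updateMany/$pullAll, regardless of which agent the delete request named.
 */
async function removeAgentResourceFilesFromAllAgents(
  AgentModel: Model<unknown>,
  fileIds: string[],
): Promise<number> {
  if (fileIds.length === 0) {
    return 0;
  }
  // One $pullAll clause per tool_resources category.
  const pullAll: Record<string, string[]> = {};
  for (const key of TOOL_RESOURCE_KEYS) {
    pullAll[`tool_resources.${key}.file_ids`] = fileIds;
  }
  const result = await AgentModel.updateMany({}, { $pullAll: pullAll });
  return result.modifiedCount ?? 0;
}
```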

40742f9191 · 🔏 fix: Prevent Browser Autofill From Silently Dropping MCP CustomUserVars on Save (#12770)
* fix: prevent browser autofill from silently dropping MCP customUserVars on save
The customUserVars form in CustomUserVarsSection renders each credential as a plain `<input type="text">` with no autofill guards. This caused two user-visible problems:
1. Browser password managers (Chrome's built-in, 1Password, LastPass, Bitwarden) treat the fields as savable and offer to store values, which undermines the per-user credential model -- the whole point of customUserVars is that each user's own secrets stay in their own session and backend-encrypted, not cached by the browser.
2. When autofill fills a field, it does so via DOM mutation that does NOT fire React's synthetic `onChange`. react-hook-form's Controller therefore never sees the value, the form state stays `""`, and on submit the backend receives an empty string for every affected field. The user's typed credentials are silently dropped -- LibreChat stores 16 encrypted-empty-string rows in pluginauths, tool calls subsequently fail with "API key not set" errors, and the UI shows an "Unset" pill next to fields the user is certain they filled in.
The fix matches the pattern already used in `SidePanel/Builder/ActionsAuth.tsx` for the analogous actions credential input: type="new-password" autoComplete="new-password" plus vendor-specific ignore attributes for LastPass and 1Password: data-lpignore="true" data-1p-ignore="true" (Modern browsers ignore `autocomplete="off"` for password-shaped fields, so the vendor attributes are the reliable defence against LastPass and 1Password, which have their own heuristics.)
No behavioural change beyond preventing autofill: the input still accepts typed or pasted values, react-hook-form still collects them, submit still writes to the backend. The difference is that the values now actually reach the submit handler.
* test: add regression guard for MCP customUserVars autofill prevention
Condenses the rationale comment on the credential `<Input>` and adds a render test asserting that the autofill-prevention attributes (`type`, `autoComplete`, `data-lpignore`, `data-1p-ignore`) remain on the rendered input. The underlying bug (browser DOM mutations bypassing React's synthetic onChange) is severe enough that a future refactor accidentally dropping these attributes would silently re-introduce credential data loss.
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
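
A minimal sketch of the guarded credential input described above, with illustrative component and prop names (the real code lives in CustomUserVarsSection and uses the project's own Input component):

```tsx
import React from 'react';

/** Illustrative credential field carrying the autofill guards from this PR. */
export function CredentialInput({
  value,
  onChange,
}: {
  value: string;
  onChange: (next: string) => void;
}) {
  return (
    <input
      // Browsers fall back to a text input for unknown types, while password
      // managers treat "new-password" as a do-not-save / do-not-fill signal.
      type="new-password"
      autoComplete="new-password"
      // Vendor-specific opt-outs for LastPass and 1Password heuristics.
      data-lpignore="true"
      data-1p-ignore="true"
      value={value}
      onChange={(e) => onChange(e.target.value)}
    />
  );
}
```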

24d32f28f0 · 📅 feat: Support text/calendar (iCalendar) in Code Outputs (#12758)
Registers text/calendar across the MIME allowlists (fullMimeTypesList, codeInterpreterMimeTypesList, textMimeTypes regex) and maps the .ics, .ical, .ifb, .icalendar extensions in codeTypeMapping, so iCalendar files produced by the code interpreter are accepted as valid output and rendered as downloadable attachments.
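
An illustrative sketch of the kind of registration described above; the names codeTypeMapping and textMimeTypes come from the commit message, but the shapes and the regex alternation here are assumptions rather than the real allowlists:

```typescript
const calendarMimeType = 'text/calendar';

// Extension → MIME mapping so interpreter outputs named *.ics / *.ical etc. resolve correctly.
const codeTypeMapping: Record<string, string> = {
  ics: calendarMimeType,
  ical: calendarMimeType,
  ifb: calendarMimeType,
  icalendar: calendarMimeType,
};

// A text-MIME allowlist regex that now also matches text/calendar.
const textMimeTypes = /^text\/(plain|markdown|csv|calendar)/;

console.log(textMimeTypes.test('text/calendar')); // true
console.log(codeTypeMapping['ics']);              // 'text/calendar'
```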

f5c4deac28 · 🔐 fix: Prefer WWW-Authenticate resource_metadata Hint for MCP OAuth (#12763)
* 📦 chore: Bump @modelcontextprotocol/sdk to v1.29.0
* ♻️ refactor: Extract WWW-Authenticate Probe Helper for MCP OAuth
* 🔐 fix: Prefer WWW-Authenticate resource_metadata Hint for MCP OAuth
Per RFC 9728 §5.1, the `resource_metadata=<url>` parameter in a 401 `WWW-Authenticate: Bearer` challenge is the authoritative protected-resource metadata source. Path-aware `.well-known` discovery was winning over the hint, so split deployments that serve valid-but-wrong metadata at the path-aware endpoint stranded OAuth at defunct authorization servers.
Threads the hint through `discoverOAuthProtectedResourceMetadata` via `opts.resourceMetadataUrl` in both startup detection and the OAuth handler, matching the behavior of Claude Desktop, the MCP Inspector, OpenAI tooling, and Microsoft Copilot Studio. Fixes #12761.
* 🧵 fix: Thread OAuth-Aware fetchFn Through Resource-Metadata Probe
Without this, admin-configured `oauthHeaders` (e.g. a gateway API key that fronts the MCP endpoint) were stripped from the probe, causing the gateway to 401 for the wrong reason and masking the real `WWW-Authenticate` hint. The helper now accepts a FetchLike and defaults to global fetch, so the startup detection path is unchanged while the handler passes its OAuth-aware wrapper through.
* 🧹 refactor: Address MCP OAuth Probe Review Findings
- Thread `fetchFn` through `probeResourceMetadataHint` so admin-configured `oauthHeaders` reach the probe (a gateway API key that fronts the MCP endpoint would otherwise 401 us for the wrong reason and hide the real Bearer challenge).
- Skip the redundant HEAD request in `checkAuthErrorFallback` when the probe already observed a 401/403; fall back to a fresh HEAD only when every probe attempt threw (transient network error).
- Narrow the oauth barrel: drop `export * from './resourceHint'` so the helper stays an internal module.
- Add `scope` extraction coverage (`Bearer scope="read write"`) and a 403-only observation path; isolate `MCP_OAUTH_ON_AUTH_ERROR=true` in a dedicated suite so precise-outcome tests aren't muddied by the safety net.
* ✅ fix: Use Zod Schema in MCP Reconnection-Storm Test Tool
MCP SDK 1.28 tightened `McpServer.tool()` to require Zod schemas instead of plain JSON-Schema objects. Swap the `{ message: { type: 'string' } }` shape for `z.string()` so the fixture server spins up under SDK 1.29.
* 🛡️ fix: Harden MCP OAuth resource_metadata Hint Against SSRF
The `resource_metadata` URL is echoed from an untrusted MCP server, so handing it straight to the SDK lets a malicious server redirect discovery at private IPs, the cloud metadata service, or any host the admin did not intend to reach. Caught by the Copilot review on #12763.
- `handler.ts`: run the hint through the same `validateOAuthUrl` / `allowedDomains` gate that already guards the authorization-server URL; drop it and fall back to path-aware discovery on rejection.
- `detectOAuth.ts`: no admin-scoped allowedDomains here, so apply a strict `isSSRFTarget` + DNS resolution check and silently discard any hint pointing at a private/loopback/metadata address.
- Tests cover both the hostname-list and DNS-resolution rejection paths and assert the SDK falls back to path-aware discovery unharmed.
* 🧪 test: Mock ~/auth in fallback Suite for Consistency
Matches the main `detectOAuth.test.ts` mock so the SSRF guards added in the previous commit don't touch the real `~/auth` module at test time.
* 🔍 fix: Scope OAuth Fallback to HEAD + Parse Multi-Scheme WWW-Authenticate
Two codex findings on #12763:
- **P1**: the merged `authChallenge` flag was letting POST-only 401/403 flip the `MCP_OAUTH_ON_AUTH_ERROR` fallback, misclassifying WAF/CSRF-hardened endpoints (HEAD 200 + POST 403) as OAuth-required. Rename to `headAuthChallenge` and derive it only from the HEAD probe, matching the legacy fallback's HEAD-only semantics. Add a regression test.
- **P2**: the SDK's `extractWWWAuthenticateParams` only inspects the first scheme token, so multi-scheme headers like `Basic realm="api", Bearer resource_metadata="..."` silently dropped the authoritative Bearer hint. Fall back to a regex across the full header when the SDK returns nothing but Bearer is present. Add a regression test covering the multi-scheme case.
* 🧽 refactor: Tighten MCP OAuth Probe Semantics
Addresses the second external review pass plus codex P2:
- Merge the two stacked JSDoc blocks on `probeResourceMetadataHint` into one with a proper `@returns` section.
- Only short-circuit HEAD when it delivered the `resource_metadata` hint itself — a Bearer-without-params HEAD now lets POST run, since some servers surface their hint only on POST and we were missing it.
- Drop the unused `scope` field from `ResourceHintProbeResult`; no caller read it, and YAGNI beats a reserved field.
- Remove the redundant `OAUTH_ON_AUTH_ERROR` guard inside `checkAuthErrorFallback` — the only call site already gates on it.
- Codex P2: signal "HEAD status unknown" via `null` when the HEAD probe threw and POST returned non-auth. Previously that combination leaked a `{headAuthChallenge: false}` result and silently skipped the fallback's retry HEAD, which could misclassify OAuth-required servers after a transient HEAD failure.
Regression tests cover every path: Bearer-no-hint-on-HEAD + hint-on-POST, multi-scheme `Basic + Bearer` headers, HEAD-threw + POST-200 retry, and the WAF/CSRF-only POST 403 case.
* 🪥 polish: Tighten Probe Null-Guard Ordering + Add Malformed-Hint Test
Two NITs from the follow-up review:
- Move `bearerChallenge` computation after the `!wwwAuth` guard so the variable is only derived when it can be meaningfully `true`. The early-return path is now a clean unconditional exit.
- Add a regression test that asserts `Bearer resource_metadata="not-a-url"` yields `resourceMetadataUrl: undefined` without throwing, locking in the try/catch safety net in `extractHintFromHeader` and the SDK parser alike.
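
A rough sketch of the multi-scheme fallback described above, under an illustrative helper name: recover a Bearer resource_metadata hint with a regex over the full header and discard malformed URLs instead of throwing.

```typescript
/**
 * Extract a resource_metadata hint from a WWW-Authenticate header even when
 * multiple schemes are present, e.g. 'Basic realm="api", Bearer resource_metadata="..."'.
 * Returns undefined when no Bearer challenge or no parseable URL is found.
 */
function extractResourceMetadataHint(wwwAuthenticate: string): string | undefined {
  if (!/\bBearer\b/i.test(wwwAuthenticate)) {
    return undefined;
  }
  const match = wwwAuthenticate.match(/\bresource_metadata="([^"]+)"/i);
  if (!match) {
    return undefined;
  }
  try {
    // Reject malformed hints (e.g. resource_metadata="not-a-url") without throwing.
    return new URL(match[1]).toString();
  } catch {
    return undefined;
  }
}

console.log(
  extractResourceMetadataHint(
    'Basic realm="api", Bearer resource_metadata="https://rs.example.com/.well-known/oauth-protected-resource"',
  ),
);
```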

2427edef90 · 🌍 i18n: Update translation.json with latest translations (#12711)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

d1573a8493 · 🔒 fix: Validate MCP OAuth Protected Resource Metadata binding (#12755)
* 🔒 fix: Validate MCP OAuth Protected Resource Metadata binding (GHSA-gvpj-vm2f-2m23)
RFC 9728 §3.3/§7.3 requires clients to verify that the `resource` identifier advertised by an OAuth Protected Resource Metadata document matches the URL used to fetch it. Without this check, a malicious MCP server can serve metadata pointing at a legitimate server's `authorization_servers`, causing LibreChat to obtain an access token for the real server and send it to the attacker in subsequent API calls.
Validation happens at discovery time so the entire metadata document is discarded on mismatch — `authorization_servers` on a spoofed document is equally untrustworthy and is the primary theft vector in the PoC.
Uses the MCP SDK's own `checkResourceAllowed` (origin + path-prefix) for semantic parity with `selectResourceURL`, an SDK code path LibreChat bypasses. This is looser than RFC-strict equality (handles common cases like `/mcp/sse` server vs `/mcp` advertised resource, or trailing-slash normalization) while still rejecting cross-origin spoofs and same-origin sibling-path confusion.
* 🛡️ fix: Re-validate Resource Binding at MCP OAuth Token Exchange
Adds a defense-in-depth re-assertion of the RFC 9728 §3.3 resource/server binding inside `completeOAuthFlow`. Flows have a 10-minute TTL, so a flow initiated under pre-fix (vulnerable) code could still be in-flight at upgrade time carrying unvalidated resource metadata. Re-checking here closes that window without requiring operators to flush flow state on deploy. Also guards against future regressions that might re-introduce unvalidated paths into the flow-metadata pipeline (GHSA-gvpj-vm2f-2m23).
* ✅ test: Address Review Findings on MCP OAuth Resource Validation
Follow-up to GHSA-gvpj-vm2f-2m23 fix. Resolves three reviewer findings:
- Assert `failFlow` is called when `completeOAuthFlow` rejects at re-validation — locks in the "no stuck PENDING entry" guarantee that the catch block already provides in production code.
- Update the "no resource metadata" warning in `initiateOAuthFlow`: post-fix, that branch is only reachable when PRM discovery returned nothing (404, network error, server without RFC 9728). A document with a missing `resource` field now throws earlier in `assertResourceBoundToServer`, so the old "missing 'resource' property" phrasing described a case that can no longer reach this branch.
- Add a test for an unparseable `resource` string triggering the error-wrapping path in `assertResourceBoundToServer` (verifies the wrapper surfaces a descriptive message instead of leaking a raw `TypeError: Invalid URL` from the SDK's `new URL()` call).
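
A sketch of an origin + path-prefix binding check in the spirit of the SDK's checkResourceAllowed; the function name and normalization details here are assumptions, not the SDK's exact semantics:

```typescript
/** True when the MCP server URL is bound to the advertised resource identifier. */
function isResourceBoundToServer(serverUrl: string, advertisedResource: string): boolean {
  const server = new URL(serverUrl);
  const resource = new URL(advertisedResource);
  if (server.origin !== resource.origin) {
    return false; // cross-origin spoof: metadata points at a different host
  }
  // Normalize trailing slashes so `/mcp` and `/mcp/` compare equal.
  const serverPath = server.pathname.replace(/\/+$/, '') || '/';
  const resourcePath = resource.pathname.replace(/\/+$/, '') || '/';
  // The server URL must equal the advertised resource or live under it
  // (e.g. `/mcp/sse` under `/mcp`); same-origin sibling paths are rejected.
  return (
    resourcePath === '/' ||
    serverPath === resourcePath ||
    serverPath.startsWith(resourcePath + '/')
  );
}

console.log(isResourceBoundToServer('https://mcp.example.com/mcp/sse', 'https://mcp.example.com/mcp')); // true
console.log(isResourceBoundToServer('https://mcp.example.com/mcp', 'https://mcp.example.com/other'));   // false
```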

7cf8b84778 · 📦 chore: Update @librechat/agents to v3.1.68 (#12752)
- Bumped the version of @librechat/agents from 3.1.67 to 3.1.68 in multiple package.json files to ensure consistency and access to the latest features and fixes.
- Updated package-lock.json to reflect the new version and maintain dependency integrity.

8e866e6010 · 🗺️ fix: Resolve Custom-Endpoint Providers for Summarization (#12739)
* 🔧 fix: resolve custom-endpoint providers for summarization
When `summarization.provider` in `librechat.yaml` is set to a custom-endpoint name (e.g. `Ollama`), the string was passed verbatim to the agents SDK, which only knows a fixed set of provider names and threw `Unsupported LLM provider: Ollama`.
Before shaping the summarization config for the SDK, resolve the provider through `getProviderConfig`: custom-endpoint labels are remapped to the underlying SDK provider (e.g. `openAI`) and the endpoint's baseURL/apiKey are injected into `parameters` so the summarization model reaches the right backend, even when summarization targets a different custom endpoint than the main agent.
Unknown names and names that appear with no matching endpoint flow through unchanged so the SDK can surface a clear error. User-provided credentials and unresolved env-var references are skipped rather than forwarded, letting the SDK's self-summarize path reuse the agent's own clientOptions.
Ref: LibreChat Discussion #12614
* address: widen unresolved-env-var guard, fix test naming
- Reject summarization overrides when the extracted baseURL/apiKey still contains any `${...}` placeholder, including prefix/suffix patterns like `https://${UNSET}.example.com` that `envVarRegex` (exact-match) missed.
- Rename the "case-insensitive" test to reflect that only `Ollama` is normalized via `normalizeEndpointName`; add coverage proving other custom-endpoint names match case-sensitively.
* address: use req.config in responses.js; forward full endpoint options
- `responses.js` relied on a module-level `appConfig` set via `setAppConfig`, which is never called anywhere. Use `req.config` directly so the summarization provider resolver actually runs on the responses route.
- Route the custom endpoint config through `getOpenAIConfig` so summarization inherits the same `headers`, `defaultQuery`, `addParams`/`dropParams`, and `customParams` transforms (Anthropic/Google/etc.) that `initializeCustom` applies for the main agent flow. Strip the stale `model`/`modelName` defaults so `summarization.model` still wins.
* address: skip overrides when summarization matches agent endpoint
When `summarization.provider` resolves to the same custom endpoint as the main agent, rely on the SDK's self-summarize path (which reuses `agentContext.clientOptions` unchanged) rather than injecting overrides. Otherwise the shallow spread of `clientOverrides.configuration` would replace the agent's request-resolved state (dynamic headers, proxy/fetch options) with yaml-only config.
Only applies when summarization targets a *different* endpoint from the agent; the yaml config is all we have in that case, so overrides still flow through.
* address: preserve raw provider when overrides cannot be built
When summarization points at a different custom endpoint than the agent and we can't resolve the endpoint's credentials (user_provided, or a still-unresolved `${VAR}` after env extraction), remapping to `openAI` without overrides would silently route summaries to the default OpenAI client. Preserve the raw provider name so the SDK raises a clear "Unsupported LLM provider" error (now also logged, via the agents SDK defense-in-depth fix) instead of sending traffic to the wrong backend.
* address: resolve endpoint headers and forward PROXY to summarization
- Custom-endpoint `headers` now flow through `resolveHeaders` before reaching `getOpenAIConfig`, matching the main agent path. This ensures templated values like `${PORTKEY_API_KEY}` or `{{LIBRECHAT_BODY_...}}` are substituted for summarization requests instead of being forwarded literally.
- `PROXY` env var is now passed into `getOpenAIConfig` so cross-endpoint summarization honors outbound proxy dispatchers configured for the rest of the deployment.
* address: user summarization parameters win over endpoint defaults
Flip the merge order so `summarization.parameters` from yaml override `clientOverrides` defaults (which come from `getOpenAIConfig` and always include `streaming: true` etc.). A user who sets `parameters.streaming: false` in their config should still see non-streaming summarization for providers that require it.
* address: review feedback (logging, dead code, DRY, types, deep-merge)
- Log error in the resolveSummarizationProvider catch-all so programming bugs in getProviderConfig/getOpenAIConfig/resolveHeaders surface in operator logs instead of falling through silently.
- Drop dead `setAppConfig`/`appConfig` infrastructure in responses.js and fix adjacent `allowedProviders` reference that also relied on the never-initialized module-level appConfig. Uses `req.config` directly.
- Import canonical `normalizeEndpointName` from librechat-data-provider instead of duplicating it locally.
- Replace `SummarizationClientOverrides = Record<string, unknown>` with an explicit interface covering the known fields.
- Deep-merge `configuration` when user-supplied `summarization.parameters.configuration` overlaps the resolved endpoint configuration, so user additions (e.g. `defaultQuery`) don't wipe out `baseURL`/`defaultHeaders`.
- Wrap `process.env` mutations in test in `try/finally` so a failed assertion doesn't leak env state into subsequent tests.
- Drop `as unknown as AppConfig` in test helper; fixture now matches the `AppConfig` shape directly using a `Partial<TEndpoint>` union.
- Trim JSDoc that restated the name it was attached to.
* address: review nits — import order, local test type, conflict test
- Move `import { logger }` up into the package value-imports section so it no longer sits between `import type` blocks.
- Replace `as unknown as SummarizationConfig['parameters']` in the deep-merge test with a named `TestSummarizationParameters` type and a single narrowing cast at the call site, making intent explicit.
- Add a test proving that user-supplied `configuration.baseURL` wins over the resolved endpoint baseURL, locking in the deep-merge's user-wins-on-conflict semantics that the previous suite only exercised additively.
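
A small sketch of the widened unresolved-placeholder guard described above (illustrative names; the real guard lives in the summarization resolver):

```typescript
// Matches any ${...} reference anywhere in the string, not just exact-match values.
const unresolvedEnvVar = /\$\{[^}]+\}/;

/** True when a baseURL/apiKey still carries an unresolved env-var placeholder. */
function hasUnresolvedPlaceholder(value?: string): boolean {
  return typeof value === 'string' && unresolvedEnvVar.test(value);
}

console.log(hasUnresolvedPlaceholder('${OLLAMA_BASE_URL}'));               // true (exact match)
console.log(hasUnresolvedPlaceholder('https://${UNSET}.example.com'));     // true (prefix/suffix pattern)
console.log(hasUnresolvedPlaceholder('https://ollama.internal:11434/v1')); // false
```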

d2cbd551b7 · 🤝 fix: Load Handoff Agents for Agents API (#12740)
* 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (#12726)
Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents
into a shared `discoverConnectedAgents` helper in `@librechat/api` and
wires it into the OpenAI-compatible `/v1/chat/completions` and Open
Responses `/v1/responses` controllers. These endpoints previously only
passed the primary agent config to `createRun` while keeping
`primaryConfig.edges` intact, which forced `MultiAgentGraph` into
multi-agent mode without loading the referenced sub-agents and caused
StateGraph to throw "Found edge ending at unknown node <id>".
The discovery helper also filters orphaned edges (deleted sub-agents or
those the caller lacks VIEW permission on), so API users see the same
graceful fallback the chat UI already had.
* 🧪 fix: use ServerRequest in discovery spec helpers
CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test
helpers typed `req` as `express.Request`, which is not assignable to
`DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose
`user` is `IUser`). Local jest passed because ts-jest is transpile-only,
but the CI typecheck uses the full compiler.
* 🪲 fix: drop orphan edges on both endpoints, not just `to`
Addresses the P1 codex finding on #12740: `filterOrphanedEdges`
previously only removed edges whose `to` referenced a skipped agent.
Edges whose `from` was a skipped agent — the symmetric case in a
bidirectional graph like `A <-> B` where `B` is deleted or the user
lacks VIEW on it — leaked through to `createRun` and re-triggered
`Found edge ending at unknown node <id>` at StateGraph compile time.
The filter now drops an edge if either endpoint references a skipped
id, and the existing `to`-only test cases were updated to reflect the
stricter behavior. Adds a bidirectional-graph regression test in
`discovery.spec.ts`.
* 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes
Addresses the second P1 codex finding on #12740: the OpenAI-compat
`/v1/chat/completions` and Open Responses `/v1/responses` routes gate
the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`),
but `discoverConnectedAgents` was checking handoff sub-agents against
the looser in-app `AGENT` resource type. That allowed a remote caller
who could reach the orchestrator but had only in-app visibility on a
sub-agent to invoke it via the API — bypassing the remote-sharing
boundary.
Adds an optional `resourceType` param to `discoverConnectedAgents`
(defaulting to `AGENT` for the chat UI path) and passes
`ResourceType.REMOTE_AGENT` from both API controllers so every
discovered sub-agent clears the same sharing boundary enforced at
route entry.
* 🧯 fix: enforce allowedProviders for discovered sub-agents
Addresses the third P1 codex finding on #12740: `discoverConnectedAgents`
forwarded the caller's `endpointOption` verbatim into `initializeAgent`,
but on the OpenAI-compat routes that option's `endpoint` is the primary
agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only
enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)`
is true, so handoff sub-agents silently bypassed the provider allowlist
configured under `endpoints.agents.allowedProviders`.
Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every
per-sub-agent init call. The primary agent still uses the caller's
endpointOption as before — this only affects the BFS-loaded handoff
targets. Regression test asserts the override.
* ✂️ fix: prune unreachable sub-agents after orphan-edge filtering
Addresses the fourth P1 codex finding on #12740: BFS eagerly initializes
every sub-agent referenced in the primary's edge scan, but once
`filterOrphanedEdges` drops edges whose endpoints were skipped, some of
those sub-agents end up disconnected from the primary. In an `A -> B ->
C` graph (edges stored directly on A) where B is skipped (missing or
no VIEW), both edges are filtered, but C was already loaded and would
still be passed to `createRun` — which flips into multi-agent mode on
`agents.length > 1` and turns C into an unintended parallel start node.
After filtering edges, compute the set of agent ids reachable from the
primary through the surviving edge set and prune `agentConfigs` to that
set. Two regression tests added: one for the pruning case, one that
confirms agents connected via surviving edges are still kept.
* 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback
Addresses the fifth P1 codex finding on #12740: `onAgentInitialized`
fires during BFS, BEFORE the helper prunes agents that become
disconnected once `filterOrphanedEdges` runs. Writing the sub-agent
straight into the outer `agentConfigs` there and then only additively
merging the pruned `discoveredConfigs` left stranded entries in the
outer map, and `AgentClient` would still hand them to `createRun` as
extra parallel start nodes (the exact failure mode the pass-4 prune
was meant to eliminate for the API controllers).
Drop the `agentConfigs.set` from the callback and replace the additive
merge with a direct copy from `discoveredConfigs`, which is now the
single authoritative source of what the run should see. The
per-agent tool context map is still populated during BFS — stale
entries there are harmless because they're only read by closure inside
`ON_TOOL_EXECUTE` and are unreachable once the agent is not in
`agentConfigs`.
* 🔬 fix: address audit findings on discovery helper
Resolves findings from a comprehensive external audit of #12740.
**Finding 1 (CRITICAL) — stale edges survive the reachability prune.**
The pass-4 prune removed unreachable agents from `agentConfigs` but left
matching edges in the return value. In an `A -> B -> C -> D` graph (all
edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B
and B->C but keeps C->D (neither endpoint is skipped). The caller then
sees `agentConfigs` without C/D but `edges` still references them,
flipping `createRun` into multi-agent mode with mismatched agents/edges
— the exact crash this PR is supposed to fix. Now filter the edge list
to the reachable set in the same pass, so the returned shape is
self-consistent: every edge endpoint is either the primary id or a key
of `agentConfigs`. New regression test covers A->B->C->D with B skipped.
**Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API
request.** The OpenAI-compat and Responses controllers called
`getModelsConfig(req)` and `discoverConnectedAgents` even when the
primary agent had no edges (the common single-agent API case). Gate
both behind `primaryConfig.edges?.length > 0` so single-agent runs
don't pay that cost.
**Finding 5 (MINOR) — silent mutation of caller's
`primaryConfig.userMCPAuthMap`.** The helper aliased that object and
then `Object.assign`'d sub-agent entries into it, changing the caller's
config in-place. Shallow-clone up front so the returned merged map is
the only destination.
**Finding 7 (NIT) — dead `?? []` coalescing.**
`filterOrphanedEdges` always returns a concrete array, so the
`discoveredEdges ?? []` fallback was never reached. Simplified the
`primaryConfig.edges = …` assignment.
Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not
mutated in-place.
* 🧹 chore: address audit NITs on discovery helper
Addresses two NIT findings from the post-fix audit:
**F1** — the shallow clone on `primaryConfig.userMCPAuthMap` was only
applied on the primary side; the `else` branch (hit when the primary
had no MCP auth and the first sub-agent seeds the map) assigned the
sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's
`Object.assign` mutated the first one's map in place. Harmless in
practice (per-request ephemeral objects) but asymmetric. Clone in the
else branch too. Test added.
**F2** — `initialize.js` had a defensive `if (agentConfigs.size > 0 &&
!edges) edges = []` normalizer. Pre-existing dead code: the helper now
always returns a concrete array from `filteredEdges.filter(...)`.
Removed for clarity.
* 🕸 fix: require all sources reachable when traversing fan-in edges
Addresses the seventh P1 codex finding on #12740: the reachability BFS
advanced through an edge as soon as any of its `from` endpoints matched
the current frontier node (`sources.includes(current)`), but the
subsequent edge filter required ALL sources to be reachable (`every`).
The two-semantics mismatch let a fan-in edge like `{from: ['A','B'],
to: 'C'}` mark C reachable purely via A even when B had no path from
the primary, then drop the edge itself at filter time. Result: C
survived in `agentConfigs` with no surviving edge connecting it to A,
so `createRun` flipped into multi-agent mode on `agents.length > 1`
and C ran as an unintended parallel root.
Replace the BFS with a fixed-point iteration keyed on the same
all-sources-reachable predicate used by the filter, so traversal and
filtering stay aligned and multi-source edges only fire once every
source is in the reachable set.
Two regression tests added:
- `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts
neither B nor C leak into the result.
- `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires
and D becomes reachable once both B and C are.
* 🔀 fix: match SDK OR semantics for multi-source edge reachability
Reverts the all-sources-required reachability gate from
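
A rough sketch of the orphan-edge filter plus reachability prune described in this series, with illustrative shapes and names; fan-in edges are treated with OR semantics here to match the final revert, and the real helper in @librechat/api differs in detail:

```typescript
/** Illustrative edge shape: a fan-in edge lists several `from` ids. */
interface AgentEdge {
  from: string | string[];
  to: string;
}

function sourcesOf(edge: AgentEdge): string[] {
  return Array.isArray(edge.from) ? edge.from : [edge.from];
}

/**
 * Drop edges that touch skipped agents (deleted or not visible to the caller),
 * then keep only agents/edges reachable from the primary so the returned pair
 * is self-consistent for createRun.
 */
function pruneDiscoveredGraph(
  primaryId: string,
  edges: AgentEdge[],
  skipped: Set<string>,
): { edges: AgentEdge[]; reachable: Set<string> } {
  // 1. Orphan-edge filter: either endpoint skipped drops the edge.
  const filtered = edges.filter(
    (e) => !skipped.has(e.to) && !sourcesOf(e).some((s) => skipped.has(s)),
  );
  // 2. Fixed-point reachability from the primary (OR semantics on fan-in edges).
  const reachable = new Set<string>([primaryId]);
  let changed = true;
  while (changed) {
    changed = false;
    for (const e of filtered) {
      if (!reachable.has(e.to) && sourcesOf(e).some((s) => reachable.has(s))) {
        reachable.add(e.to);
        changed = true;
      }
    }
  }
  // 3. Keep only edges whose target and at least one source survived.
  const kept = filtered.filter(
    (e) => reachable.has(e.to) && sourcesOf(e).some((s) => reachable.has(s)),
  );
  return { edges: kept, reachable };
}
```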

48c3d31db3 · 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels (#12737)
* 🔊 fix: Preserve Log Metadata on Console for Warn/Error Levels
The default console formatter discarded every structured metadata key on the winston info object — only `CONSOLE_JSON=true` preserved it. That meant failures emitted by the agents SDK (e.g. "Summarization LLM call failed") reached stdout without the provider, model, or underlying error attached, leaving users unable to diagnose the root cause.
- Add `formatConsoleMeta` helper to serialize non-reserved metadata as a compact JSON trailer, with per-value string truncation and safe handling of circular references.
- Append the metadata trailer to warn/error console lines; info/debug behavior is unchanged.
- Relax `debugTraverse`'s debug-only gate so warn/error messages routed through the debug formatter also surface their metadata.
- Add a `formatConsoleMeta` stub to the shared logger mock so existing tests keep working.
* 🔐 fix: Also Redact Sensitive Patterns on Warn Console Lines
The warn-level console output now includes a metadata trailer that may contain provider-returned error strings with embedded tokens or keys (e.g. `Bearer ...`, `sk-...`). Apply `redactMessage` to warn lines in addition to error, matching the new surface area.
* 🔐 fix: Redact Sensitive Tokens Embedded in JSON Metadata
Two gaps in the existing console redaction that became user-visible once warn/error lines started emitting structured metadata:
1. The OpenAI-key regex (`/^(sk-)[^\s]+/`) was anchored to start-of-line, so keys embedded inside JSON payloads (e.g. `{"apiKey":"sk-..."}`) were never redacted. Every console line begins with a timestamp, so the anchor effectively made this pattern dead code.
2. `formatConsoleMeta` stringified metadata values verbatim; a sensitive string value was only redacted by the whole-line regex pass, which missed the anchored `sk-` case above.
Fix:
- Drop the `^` anchor; add `/g` so every occurrence is redacted, not just the first.
- Also exclude `"` and `'` from the token body so JSON-embedded values terminate at the closing quote rather than chewing into the next field.
- Simplify `redactMessage` to apply patterns directly (dropping the `getMatchingSensitivePatterns` filter) — the filter used `.test()` which has stateful behavior on `/g` regexes and is no longer needed.
- `formatConsoleMeta` now runs `redactMessage` over every string value before JSON serialization, so the metadata trailer is safe even on the warn path.
- Add regression tests covering both fixes.
Reviewed-by: Codex (P1 finding on PR #12737, commit
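
An illustrative sketch of the un-anchored, global, quote-terminated key pattern described above; the exact pattern set and replacement token in LibreChat's logger config may differ:

```typescript
// No start-of-line anchor, global flag, and the token body stops at quotes so
// JSON-embedded values terminate at the closing quote.
const openAIKeyPattern = /(sk-)[^\s"']+/g;

function redactMessage(line: string): string {
  return line.replace(openAIKeyPattern, '$1[REDACTED]');
}

console.log(redactMessage('warn: provider error {"apiKey":"sk-abc123","model":"gpt-4o"}'));
// warn: provider error {"apiKey":"sk-[REDACTED]","model":"gpt-4o"}
```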

ccf3a6c670 · 📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types (#12735)
* 🐛 fix: accept documented `summarization.trigger.type` values
The Zod schema for `summarization.trigger.type` only accepted `'token_count'`, but:
- the documentation lists `token_ratio`, `remaining_tokens`, and `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and silently no-ops on anything else
The result was a double failure: any user following the docs hit a startup Zod error, and anyone who matched the schema by using `token_count` got a silent no-op at runtime where summarization never fired. Align the schema with the documented, runtime-supported trigger types.
Closes #12721
* 🧹 fix: bound `token_ratio` trigger value to (0, 1]
Per Codex review: the previous schema accepted `value: z.number().positive()` for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }` (presumably meant as "80%") passed validation and then silently never fired — because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always false. That is exactly the silent-no-op pattern this PR is trying to eliminate.
Switch to a discriminated union so each trigger type has its own value constraint:
- `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense
- `remaining_tokens`: positive — token counts can be large
- `messages_to_refine`: positive — message counts can be > 1
Added tests for the upper-bound rejection and the inclusive upper bound (`value: 1` still accepted as a valid "fire at 100%" extreme).
* 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range
Per Codex review: `.positive()` rejected `value: 0`, but the docs describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins who copy the documented lower bound into their YAML would fail schema validation at startup. Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme) setting — the agents SDK's `usedRatio >= 0` check will fire as soon as there is anything to refine, which is a legitimate "always summarize when pruning happens" configuration.
`remaining_tokens` and `messages_to_refine` keep `.positive()`: both are counts, and `0` there produces no meaningful behavior (the SDK has an early return for `messagesToRefineCount <= 0`).
* 🐛 fix: preserve `token_ratio` trigger when `value: 0`
Per Codex review: now that the schema accepts `token_ratio: 0`, `shapeSummarizationConfig` would silently drop it because of a truthy check on `config?.trigger?.value`. The trigger would disappear and the runtime would fall back to "no trigger configured" — which fires on any pruning rather than honoring the explicit ratio. Switch to `typeof value === 'number'`, which preserves `0` while still rejecting `undefined`/`null`.
Added a regression test that asserts `{ type: 'token_ratio', value: 0 }` survives the shaping function untouched.
* 🧹 fix: reject non-finite trigger values at schema level
Per Codex review: `z.number().positive()` still accepts `Infinity` and `NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but the agents SDK guards every trigger path with `Number.isFinite(...)` and silently returns `false` — summarization never fires while the server starts cleanly. That is the exact schema/runtime split this PR is trying to eliminate.
Add `.finite()` to every trigger value. `token_ratio` already had an implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps the intent obvious and catches `NaN` (which `.max(1)` does not).
* 🧹 fix: integer counts + targeted token_count migration warning
Two findings from the comprehensive review:
1. `remaining_tokens` and `messages_to_refine` are token/message counts and are always integers in the runtime (`Number.isFinite(...)` guards already assume integer semantics). `z.number().positive()` accepted fractional values like `2.5`, which was semantically confusing and would round oddly against the runtime's `>=` / `<=` comparisons. Add `.int()` to both count-based branches; `token_ratio` stays fractional.
2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got the generic "Invalid summarization config" warning plus a flattened Zod error. Detect that specific case in `loadSummarizationConfig` and emit a migration-friendly message that names the three valid replacements. Export the function so the behavior is unit-testable.
Also added a parameterized passthrough test covering `remaining_tokens` and `messages_to_refine` shaping, complementing the existing `token_ratio` coverage.
* 🧹 fix: accurate fallback wording + bare-string trigger test
Two nits from the follow-up audit:
1. The legacy-`token_count` warning claimed "Summarization will be disabled," but `shapeSummarizationConfig` treats a missing summarization config as self-summarize mode (fires on every pruning event using the agent's own provider/model). "Disabled" would mislead an admin into stopping investigation. Reword to describe the actual fallback and assert the new wording in the spec.
2. Add a regression test for the `trigger: 'bare-string'` YAML case, so the `typeof raw.trigger === 'object'` guard is exercised rather than implied.
3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message is safe in every terminal/aggregator regardless of UTF-8 handling.
* 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union
CI TS check failed: after the schema tightening, `raw.trigger.type` is narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine" | undefined`, so the runtime comparison to `"token_count"` is a TS2367 ("no overlap") error even though that's exactly the comparison we want for the migration guard. Widen just that one access via `as { type?: unknown }` so the migration check reads runtime-shaped YAML input without the type system folding it back into the narrowed union.
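
A minimal sketch of the resulting discriminated union, assuming zod and illustrative field names (the real schema lives in librechat-data-provider):

```typescript
import { z } from 'zod';

const summarizationTriggerSchema = z.discriminatedUnion('type', [
  // Fraction of the context window already used: 0.0–1.0 inclusive, finite.
  z.object({ type: z.literal('token_ratio'), value: z.number().finite().min(0).max(1) }),
  // Token / message counts: positive finite integers.
  z.object({ type: z.literal('remaining_tokens'), value: z.number().finite().int().positive() }),
  z.object({ type: z.literal('messages_to_refine'), value: z.number().finite().int().positive() }),
]);

console.log(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 0.8 }).success);  // true
console.log(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 80 }).success);   // false (not a fraction)
console.log(summarizationTriggerSchema.safeParse({ type: 'token_count', value: 1000 }).success); // false (legacy type)
```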

2358d07b64 · 📝 fix: Preserve Raw Markdown Formatting on Upload as Text (#12734)
* 🐛 fix: Preserve Raw Markdown on `Upload as Text`
When `RAG_API_URL` is configured, `.md` uploads were sent to the RAG API `/text` endpoint, which routes Markdown through `UnstructuredMarkdownLoader` and strips formatting (`#`, `**`, lists, blockquotes). Users expect `Upload as Text` to preserve raw content - identical bytes in a `.txt` file round-trip verbatim, while the `.md` came back stripped.
Short-circuit the RAG API call for Markdown files (by MIME type or `.md` / `.markdown` extension) and read the file verbatim via `parseTextNative`. Non-Markdown paths are unaffected, and the embedding path (`/embed`) keeps its existing loader so vector search quality is unchanged.
* 🐛 fix: normalize markdown MIME and accept `text/md`
Addressing review feedback on the `Upload as Text` short-circuit:
- Accept `text/md` in the markdown MIME set (LibreChat treats it as a valid markdown type elsewhere, e.g. the artifact-rendering prompt).
- Normalize the incoming MIME type (lowercase + strip parameters) before the set lookup so parameterized values like `text/markdown; charset=utf-8` and uppercase `TEXT/MARKDOWN` still short-circuit. Extensionless uploads relying only on the `Content-Type` header would otherwise fall through to the RAG `/text` endpoint and lose their markdown formatting.
Extend `text.spec.ts` parametrized cases with `text/md`, parameterized MIME, uppercase, and whitespace-padded variants.
* 🧹 chore: Address Code Review Follow-ups on `Upload as Text` fix
Addressing comprehensive review feedback:
- Debug log now includes filename and MIME type so operators can identify which upload triggered the short-circuit without having to correlate other logs.
- Expand markdown extension detection beyond `.md` / `.markdown` to cover `.mdown`, `.mkdn`, `.mkd`, `.mdwn` (case-insensitive regex).
- Tighten `normalizeMimeType` parameter type from `string | undefined` to `string` to match the actual Express.Multer.File type. The falsy-check still protects against empty strings at runtime.
- Extend parametrized tests with the most common real-world shapes: `text/plain` + `.md` (the MIME most browsers/servers assign), the new rare extensions, and empty MIME + `.md` (pure extension fallback path).
- Add a positive assertion that `readFileAsString` was called with the expected arguments on every short-circuit case, so tests fail loudly if the native-parse path ever regresses.
* 🧪 test: Cover `.mdwn` regex branch in Markdown short-circuit
Every other alternation in `MARKDOWN_EXTENSIONS_RE` has at least one test case (`md`, `markdown`, `mdown`, `mkdn`, `mkd`) but `mdwn` did not, leaving a typo in that branch undetectable.
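
A small sketch of the short-circuit check described above; the constants and helper names mirror the commit message, but the exact values are assumptions:

```typescript
const MARKDOWN_MIME_TYPES = new Set(['text/markdown', 'text/md']);
const MARKDOWN_EXTENSIONS_RE = /\.(md|markdown|mdown|mkdn|mkd|mdwn)$/i;

/** Lowercase and strip parameters, e.g. 'TEXT/MARKDOWN; charset=utf-8' -> 'text/markdown'. */
function normalizeMimeType(mimetype: string): string {
  return mimetype.split(';')[0].trim().toLowerCase();
}

/** True when an "Upload as Text" file should skip the RAG /text endpoint and be read verbatim. */
function isMarkdownUpload(filename: string, mimetype: string): boolean {
  return (
    MARKDOWN_MIME_TYPES.has(normalizeMimeType(mimetype)) || MARKDOWN_EXTENSIONS_RE.test(filename)
  );
}

console.log(isMarkdownUpload('notes.md', 'text/plain'));                  // true (extension fallback)
console.log(isMarkdownUpload('README', 'text/markdown; charset=utf-8'));  // true (parameterized MIME)
console.log(isMarkdownUpload('report.txt', 'text/plain'));                // false
```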

3bd2681272 · 🪐 fix: Replace $bitsAllSet ACL Queries for Azure Cosmos DB Compatibility (#12736)
* 🐛 fix: replace `$bitsAllSet` ACL queries for Cosmos DB compatibility
Azure Cosmos DB for MongoDB API does not implement the `$bitsAllSet` query operator, so every permission check against Cosmos DB threw. The five read paths in `aclEntry.ts` (`hasPermission`, `findAccessibleResources`, `findPublicResourceIds`, and two sites in `getSoleOwnedResourceIds`) now fetch candidate entries and apply the bitwise mask in application code. This matches the existing FerretDB-compatible pattern. Fixes #12729.
* 🐛 fix: delegate `findPubliclyAccessibleResources` to fixed DB method
`AccessControlService.findPubliclyAccessibleResources` inlined the same `$bitsAllSet` query as the data-schemas layer, which fails on Azure Cosmos DB for MongoDB. Delegate to `_dbMethods.findPublicResourceIds` so a single implementation carries the Cosmos-compatible bitwise logic. Refs #12729.
* 🐛 fix: move `$bitsAllSet` filter out of remote-agent aggregation
`enrichRemoteAgentPrincipals` used `$bitsAllSet` inside an aggregation `$match`, which Azure Cosmos DB for MongoDB does not implement. Project `permBits` through the pipeline and filter for `PermissionBits.SHARE` in application code. The extra documents fetched are bounded by ACL entries on a single agent resource, so the cost is negligible. Refs #12729.
* 🧪 test: rename misleading public-dedup test and add real dedup coverage
The test previously named "returns deduplicated IDs even if the public principal has multiple entries" only set up a single ACL entry, so it did not actually exercise deduplication. Split into two tests: one for the happy path (single entry with required bits), and one that bypasses `grantPermission`'s upsert via `AclEntry.create` to confirm the application-layer dedup Map handles genuine duplicates. Refs #12729.
* 🧪 test: cover SHARE-bit filter in `enrichRemoteAgentPrincipals`
The `$bitsAllSet` match stage in `enrichRemoteAgentPrincipals` previously guaranteed every aggregation row had SHARE; the Cosmos DB fix moved that check into a JS `continue` branch with no direct coverage. Add a dependency-injected unit test that stubs the aggregation with mixed SHARE / non-SHARE / zero-bit rows and asserts only SHARE holders are enriched and queued for backfill. Also includes a regression guard that the `$match` pipeline stage no longer contains a `permBits` filter. Refs #12729.
* ♻️ refactor: extract `filterByBitsAndDedup` helper for ACL reads
`findAccessibleResources` and `findPublicResourceIds` each inlined the same bitmask-filter + `Map`-based dedup loop. Lift it into a private `filterByBitsAndDedup(entries, requiredBits)` helper so the Cosmos-DB compatible pattern lives in one place. Pure rename/extract — no behavior change. Refs #12729.
* 📝 docs: fix stale `$bitsAllSet` references in FerretDB spec
The describe block and header comment in the FerretDB parity spec still referenced `$bitsAllSet queries` after the Cosmos DB compatibility fix moved the bit mask into application code. Update the title to "Bitwise permission queries" and rewrite the header comment to describe the application-layer behavior being validated. Refs #12729.
* ⚡ perf: push permission-mask filter back to the database via `$in`
The original fix for #12729 moved `$bitsAllSet` filtering into application code, which meant every ACL read fetched the full set of rows for a principal/resource and filtered in JS. For tenants with large ACL collections this inflates wire transfer and heap. Replace the JS filter with `permBits: { $in: permissionBitSupersets(X) }`. For the 4-bit `PermissionBits` enum the `$in` list is at most 16 values (8 for a single-bit mask like SHARE). `$in` is indexable and supported by Azure Cosmos DB for MongoDB, so the filter runs on the server again — restoring `.distinct('resourceId')` and `findOne()` semantics. `permissionBitSupersets(requiredBits)` is memoized and exported from `@librechat/data-schemas`.
Callers restored:
- `hasPermission`: back to `findOne` short-circuit
- `findAccessibleResources` / `findPublicResourceIds`: back to `.distinct()`
- `getSoleOwnedResourceIds`: back to the `$match` + `$group` aggregation
- `enrichRemoteAgentPrincipals`: bit filter back in `$match`, JS `continue` removed
Refs #12729.
* 🧪 test: add `$bitsAllSet` vs `$in` parity + perf spec
Introduces `aclEntry.parity.spec.ts` — a side-by-side spec that runs the legacy `$bitsAllSet` query and the current `$in`-based query against the same `mongodb-memory-server` fixture and asserts identical output sets for every affected method (`hasPermission`, `findAccessibleResources`, `findPublicResourceIds`, `getSoleOwnedResourceIds`) across all 7 meaningful permBits combinations. Also logs median wall-clock time for the two query paths over 20 runs on an 800-entry fixture, with a loose 3x guard against catastrophic regressions. Initial local numbers: 1.05 ms vs 1.07 ms (findAccessibleResources), 1.10 ms vs 1.05 ms (findPublicResourceIds). Refs #12729.
* 🔒 hardening: freeze `permissionBitSupersets` cache + enum-shape guard
Two defensive changes from the comprehensive audit:
* Cached superset arrays are now `Object.freeze`d and the return type is `readonly number[]`. Previously the cached arrays were returned by mutable reference, so a caller that mutated the result would silently corrupt the process-wide cache for every subsequent permission check. `Object.freeze` turns that into a loud `TypeError` at the mutation site. All existing call sites pass the result directly to Mongoose's `$in`, which does not mutate.
* Added a module-load guard `if (MAX_PERM_BITS === 0) throw`. If `PermissionBits` is ever refactored to a `const` object or string enum, `Object.values(...).filter(isNumber)` would return `[]` and `MAX_PERM_BITS` would silently become 0, making every query match no rows and breaking every permission check. The guard fails loudly instead.
Also collapsed four identical JSDoc lines across `hasPermission`, `findAccessibleResources`, `findPublicResourceIds`, and `getSoleOwnedResourceIds` into a single `{@link permissionBitSupersets}` reference. Refs #12729.
* 🧪 test: add focused unit tests for `permissionBitSupersets`
The helper is the single point of correctness for every ACL read path (every query uses `permBits: { $in: permissionBitSupersets(X) }`), so it warrants direct coverage independent of the higher-level parity and behavior specs. Six cases added:
* `requiredBits=0` returns all 16 values
* `requiredBits=15` returns `[15]` only
* every returned value is a bitwise superset of `requiredBits`
* full parity against the `$bitsAllSet` definition for every `required` in 0..15
* memoization: repeat calls return the same frozen reference
* frozen result throws `TypeError` on mutation attempts
Refs #12729.
* 🧪 test: tighten parity perf guard and document fixture constants
The `expect(currentMs).toBeLessThan(legacyMs * 3 + 50)` form was dominated by the `+ 50` additive term at typical sub-ms query latencies — at legacy=1ms, a 50x regression would still pass. Replace with `Math.max(legacyMs * 5, 50)` so the multiplicative ceiling is intact once the new path climbs out of the fixed noise floor. Also added inline rationale for the `FIXTURE_SIZE = 800` and `PERF_ITERATIONS = 20` constants. Refs #12729.
* 🧹 chore: remove stale perf-guard comment, hoist rationale to describe
Commit
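
A minimal sketch of the superset enumeration described above for a 4-bit mask, so that `permBits: { $in: permissionBitSupersets(required) }` matches exactly the documents `$bitsAllSet` would; the real helper in @librechat/data-schemas additionally memoizes results and guards the enum shape at module load:

```typescript
const MAX_PERM_BITS = 0b1111; // VIEW | EDIT | DELETE | SHARE

/** Every 4-bit value whose bits include all of requiredBits. */
function permissionBitSupersets(requiredBits: number): readonly number[] {
  const supersets: number[] = [];
  for (let bits = 0; bits <= MAX_PERM_BITS; bits++) {
    if ((bits & requiredBits) === requiredBits) {
      supersets.push(bits);
    }
  }
  // Frozen so callers cannot mutate a shared/cached array in place.
  return Object.freeze(supersets);
}

console.log(permissionBitSupersets(0b1000).length); // 8 values contain the SHARE bit
console.log(permissionBitSupersets(0).length);      // 16 (no constraint)
console.log(permissionBitSupersets(0b1111));        // [ 15 ]
```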

8f39f31c87 · 🤝 fix: Normalize Empty Handoff Fields to Restore Default Fallback (#12707)
* fix(agents): normalize handoff prompt and parameter values to ensure default instructions
* fix(agents): address lint errors and codex review comments

b579390287 · 📦 chore: npm audit & bump @librechat/agents to v3.1.67 (#12710)
* chore: Update package-lock.json with new dependencies and version upgrades
- Added new dependencies for @langchain/anthropic and @langchain/core, including @anthropic-ai/sdk and fast-xml-parser.
- Updated existing dependencies for @librechat/agents, @opentelemetry/api-logs, @opentelemetry/core, and related packages to their latest versions.
- Enhanced integrity checks and licensing information for new and updated packages.
* chore: Update @librechat/agents dependency to version 3.1.66 in package.json and package-lock.json
- Bumped the version of @librechat/agents from 3.1.65 to 3.1.66 across multiple package.json files to ensure consistency and access to the latest features and fixes.
* chore: Update dompurify and fast-xml-parser dependencies to version 3.4.0 and 5.6.0 respectively
- Bumped the version of dompurify across multiple package.json files to ensure consistency and access to the latest features and security fixes.
- Updated fast-xml-parser to the latest version in relevant package.json files for improved functionality.
* chore: Update @librechat/agents dependency to version 3.1.67 in package.json and package-lock.json
- Bumped the version of @librechat/agents from 3.1.66 to 3.1.67 across multiple package.json files to ensure consistency and access to the latest features and fixes.

034b672d0c · 🫧 feat: Claude Opus 4.7 Reasoning Visibility (#12701)
* 🫧 fix: Restore Claude Opus 4.7 Reasoning Visibility
Claude Opus 4.7 omits `thinking` content from Messages API responses by default — empty thinking blocks still stream, but the `thinking` field is blank unless the caller passes `display: "summarized"` in the adaptive thinking config. This silenced the LibreChat "Thoughts" UI for Anthropic (and Anthropic-on-Bedrock) adaptive models.
- Extend `ThinkingConfigAdaptive` in `packages/api/src/types/anthropic.ts` with an optional `display: 'summarized' | 'omitted'` field
- Emit `{ type: 'adaptive', display: 'summarized' }` from `configureReasoning` in `packages/api/src/endpoints/anthropic/helpers.ts`
- Emit `{ type: 'adaptive', display: 'summarized' }` from `bedrockInputParser` in `packages/data-provider/src/bedrock.ts` and update the local `ThinkingConfig` union
- Update existing adaptive-thinking assertions to include the new field
- Add dedicated tests asserting `display: 'summarized'` flows through both the Anthropic endpoint and the Bedrock parser
See https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7#thinking-content-omitted-by-default
* refactor: Gate `display: summarized` on Opus 4.7+
Narrow the reasoning-visibility opt-in to the models that actually omit thinking content by default, instead of applying it to every adaptive model. Pre-Opus-4.7 adaptive models (Opus 4.6, Sonnet 4.6) already return summaries, so sending the field is unnecessary noise.
- Add `omitsThinkingByDefault(model)` in `packages/data-provider/src/bedrock.ts` that returns true only for Opus 4.7+ (including future majors like Opus 5+)
- Bedrock parser now only attaches `display: 'summarized'` when the helper matches, keeping the adaptive object unchanged for older models
- Anthropic endpoint `configureReasoning` uses the same helper so its emit path matches the Bedrock one
- Tests: replace the blanket `display: 'summarized'` assertions with model-specific ones (Opus 4.7 gets it, Opus 4.6 / Sonnet 4.6 do not), add a dedicated `omitsThinkingByDefault` suite covering naming variants and future versions
* feat: Configurable Thought Visibility for Anthropic Adaptive Models
Expose the Anthropic `thinking.display` API field as a user-facing parameter so users can override the `auto` default (which stays as the Opus-4.7+ opt-in added earlier in this PR). Also fixes the CI type error by widening the adaptive thinking type assignment via a resolver helper that returns a properly-typed object.
- Add `ThinkingDisplay` enum (`auto` | `summarized` | `omitted`) and matching zod schema in `packages/data-provider/src/schemas.ts`
- Add `thinkingDisplay` to `tConversationSchema`, `anthropicSettings`, and the pick lists for Bedrock input/parser + Anthropic agent params
- Add `resolveThinkingDisplay(model, explicit)` helper in `packages/data-provider/src/bedrock.ts` that returns the wire value or undefined (auto → model default, explicit → always honored)
- `bedrockInputParser` now reads `thinkingDisplay` from input and emits `display` only when the resolver returns a value; strips the field on non-adaptive-model branches so it does not leak
- `configureReasoning` in the Anthropic endpoint threads `thinkingDisplay` through, uses the resolver, and casts the adaptive config to `AnthropicClientOptions['thinking']` so the widened shape compiles against the stale installed SDK types
- Add UI slider for `thinkingDisplay` in `parameterSettings.ts` next to `effort`, with three-position `com_ui_auto` / `com_ui_summarized` / `com_ui_omitted` labels
- Add translation keys `com_endpoint_anthropic_thinking_display`, `com_endpoint_anthropic_thinking_display_desc`, `com_ui_summarized`, `com_ui_omitted`
- Add tests: `resolveThinkingDisplay` suite (5 cases covering auto / explicit / unknown input), parser round-trip tests for all three modes on Opus 4.6 and Opus 4.7, Anthropic endpoint tests for explicit summarized/omitted overrides
* fix: Drop `thinkingDisplay` When Adaptive Thinking Is Disabled
If a user turns adaptive thinking off but had previously selected a `thinkingDisplay` value, the stale field was left in `additionalFields` and ended up merged into the Bedrock request's `additionalModelRequestFields`. That leaks a non-Bedrock key into the payload and can round-trip back into `llmConfig`.
- Delete `additionalFields.thinkingDisplay` alongside `thinking` and `thinkingBudget` in the `thinking === false` branch of `bedrockInputParser`
- Add a regression test asserting `thinking`, `thinkingBudget`, and `thinkingDisplay` are all absent when adaptive thinking is disabled on an Opus 4.7 request
Reported by chatgpt-codex-connector on PR #12701.
* refactor: Consolidate `ThinkingDisplay` Types and Preserve Persisted Display
Address review findings on PR #12701:
- [Codex P2] `bedrockInputSchema.transform` now extracts `thinking.display` from persisted `additionalModelRequestFields` back into the top-level `thinkingDisplay` field so explicit `'omitted'` round-trips through storage instead of being silently reverted to `'summarized'` on the next parse.
- [Codex P2] `getLLMConfig` in the Anthropic endpoint now reads `.display` from a persisted `thinking` object (agents store the full Anthropic shape) and uses it as the fallback for `thinkingDisplay` when no top-level override is present.
- [Audit #2] Collapse the three parallel wire-value types into a single `ThinkingDisplayWireValue = Exclude<ThinkingDisplay, 'auto'>` exported from `schemas.ts`; remove the duplicate `ThinkingDisplay` alias in `packages/api/src/types/anthropic.ts` (which collided with the enum name) and the `ThinkingDisplayValue` alias in `bedrock.ts`.
- [Audit #3] Add `thinkingDisplay` to the `TEndpointOption` pick list next to `effort`.
- [Audit #4] Add a TODO comment next to the `as AnthropicClientOptions['thinking']` cast explaining the stale `@librechat/agents` SDK types that require it.
- Add tests: four round-trip cases asserting `bedrockInputSchema` recovers `display` from persisted AMRF (Opus 4.7 omitted, pre-4.7 summarized, unknown-value ignore, explicit top-level wins), and two `getLLMConfig` cases asserting the Anthropic endpoint preserves and overrides persisted `thinking.display`.
* fix: Preserve Persisted `thinking.display` in bedrockInputParser
The parser constructed a fresh adaptive thinking config without looking at any `display` already embedded in the incoming `additionalModelRequestFields.thinking`. On round-trip through `initializeBedrock`, a persisted user choice of `'omitted'` on Opus 4.7+ was silently reverted to `'summarized'` by the auto fallback.
- Extract `extractPersistedDisplay` helper and reuse it in both the schema transform (form-state round-trip) and the parser (wire-request round-trip)
- `bedrockInputParser` now feeds the persisted display as the resolver's explicit value when no top-level `thinkingDisplay` override is set
- Add regression tests: parser preserves `display: 'omitted'` for persisted Opus 4.7 AMRF, and top-level `thinkingDisplay` still wins over persisted AMRF display
Reported by chatgpt-codex-connector (P1) on PR #12701.
|
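In rough TypeScript terms, the gating described above amounts to the sketch below. The helper names `omitsThinkingByDefault` and `resolveThinkingDisplay` come from the commit message, but the bodies (including the version-parsing regex) are illustrative assumptions, not the actual implementation in `packages/data-provider/src/bedrock.ts`.

```typescript
type ThinkingDisplay = 'auto' | 'summarized' | 'omitted';
type ThinkingDisplayWireValue = Exclude<ThinkingDisplay, 'auto'>;

/** True only for models that omit thinking content by default (Opus 4.7 and later). */
function omitsThinkingByDefault(model: string): boolean {
  // Match "claude-opus-4-7", "anthropic.claude-opus-4-7", future "opus-5", etc.
  const match = model.match(/opus-(\d+)(?:-(\d+))?/i);
  if (!match) {
    return false;
  }
  const major = parseInt(match[1], 10);
  const minor = match[2] ? parseInt(match[2], 10) : 0;
  return major > 4 || (major === 4 && minor >= 7);
}

/** Returns the wire value to send, or undefined to keep the model's default. */
function resolveThinkingDisplay(
  model: string,
  explicit?: ThinkingDisplay,
): ThinkingDisplayWireValue | undefined {
  if (explicit && explicit !== 'auto') {
    return explicit; // an explicit user override always wins
  }
  // `auto`: only opt in on models that would otherwise hide thinking content
  return omitsThinkingByDefault(model) ? 'summarized' : undefined;
}
```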
||
|
|
e2e3284713
|
🦉 feat: Claude Opus 4.7 Model Support (#12698)
* 🦉 feat: Claude Opus 4.7 Model Support
- Add `claude-opus-4-7` to shared Anthropic models and `anthropic.claude-opus-4-7` to Bedrock models
- Register 1M context window and 128K max output in anthropic token maps
- Add token pricing ($5/$25), cache rates (6.25/0.5), and premium tier ($10/$37.50 above 200K) in tx.ts
- Update `.env.example` with Opus 4.7 IDs in `ANTHROPIC_MODELS` and `BEDROCK_AWS_MODELS` examples
- Add parallel Opus 4.7 test cases for token/cache/premium rates, context length, max output, name-variation matching, and 1M-context qualification
* feat: Add `xhigh` Effort Level for Opus 4.7
- Add `xhigh` variant to `AnthropicEffort` enum between `high` and `max`
- Expose `xhigh` in `anthropicSettings.effort.options` and the UI slider `enumMappings`
- Reuse existing `com_ui_xhigh` translation key
* test: Cover `xhigh` Effort and Exact Opus 4.7 Premium Rates
- Assert `xhigh` position (between high and max), inclusion in
`anthropicSettings.effort.options`, zod acceptance, and rejection of
unknown values in schemas.spec.ts
- Verify bedrockInputParser emits `output_config: { effort: 'xhigh' }`
for adaptive `anthropic.claude-opus-4-7`
- Verify getLLMConfig sets adaptive thinking and `output_config.effort =
'xhigh'` for `claude-opus-4-7`
- Pin Opus 4.7 premium pricing to exact threshold/prompt/completion
values (200000 / 10 / 37.5) so silent rate drift fails the test
|
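The tiered pricing registered above can be pictured with a short sketch. The numbers are the ones quoted in the commit (base 5/25, premium 10/37.5 past a 200000-token threshold); the record shape and helper are illustrative and not the actual `tx.ts` structures.

```typescript
interface ModelRates {
  prompt: number;
  completion: number;
  premium?: { threshold: number; prompt: number; completion: number };
}

// Rates as quoted in the commit message; units follow however tx.ts quotes them.
const opus47Rates: ModelRates = {
  prompt: 5,
  completion: 25,
  premium: { threshold: 200000, prompt: 10, completion: 37.5 },
};

/** Pick the effective rate pair based on how much context the request uses. */
function getEffectiveRates(rates: ModelRates, promptTokens: number) {
  if (rates.premium && promptTokens > rates.premium.threshold) {
    return { prompt: rates.premium.prompt, completion: rates.premium.completion };
  }
  return { prompt: rates.prompt, completion: rates.completion };
}
```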
||
|
|
49f228de78
|
🔼 refactor: Improve UX for Command Popovers (#12677)
* refactor: Improve UX for Command Popovers
* Added loading state handling in Mention and PromptsCommand components to display a spinner when data is being fetched.
* Refactored onFocus logic to clear the textarea and set the search value based on command character input.
* Introduced a new `isLoading` state in the useMentions hook to manage loading indicators across multiple data queries.
* Added unit tests for the useHandleKeyUp hook to ensure command triggering works correctly under various conditions.
* ci: useHandleKeyUp tests for command navigation
* Added tests to ensure that the command popovers do not trigger when the cursor is mid-text after pressing ArrowLeft or Delete.
* Updated the shouldTriggerCommand function to refine the conditions under which commands are triggered based on cursor position.
* Improved agent query handling in useMentions hook for better performance and clarity.
* refactor: Optimize Mention and PromptsCommand Components
* Refactored Mention and PromptsCommand components to utilize Recoil state for popover visibility, improving state management and reducing prop drilling.
* Simplified onFocus logic to enhance user experience when interacting with command inputs.
* Added unit tests for useHandleKeyUp to ensure proper command handling and popover visibility based on user input.
* Improved performance by memoizing popover state and reducing unnecessary re-renders.
* fix: Address review findings for command popover refactor
- Fix endpointType regression: add effectiveEndpointByIndex selector that returns endpointType ?? endpoint, matching the original ChatForm guard for custom endpoints proxying assistants
- Extract duplicated initInputRef callback into shared useInitPopoverInput hook, used by both Mention and PromptsCommand
- Add navigation keys (ArrowLeft, ArrowRight, ArrowDown, Home, End, Delete) to invalidKeys to prevent false popover triggers
- Add endpoint gating tests for assistants/azureAssistants blocking the + command
- Remove unused _index param from MentionContent
|
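A minimal sketch of the cursor-position gating described above. `shouldTriggerCommand` and `invalidKeys` are named in the commit, but the signature and the caret checks below are assumptions; the real key list and hook wiring in `useHandleKeyUp` may differ.

```typescript
// Keys listed in the commit; the real invalidKeys set may contain more entries.
const invalidKeys = new Set(['ArrowLeft', 'ArrowRight', 'ArrowDown', 'Home', 'End', 'Delete']);

/**
 * Only open a command popover when the command character sits directly before
 * the caret (i.e. it was just typed), never when a navigation key moved the
 * caret into the middle of existing text.
 */
function shouldTriggerCommand(
  textarea: HTMLTextAreaElement,
  key: string,
  commandChar: '@' | '+' | '/',
): boolean {
  if (invalidKeys.has(key)) {
    return false;
  }
  const caret = textarea.selectionStart ?? 0;
  const value = textarea.value;
  if (value[caret - 1] !== commandChar) {
    return false;
  }
  // The command char must start the input or follow whitespace.
  const charBefore = value[caret - 2];
  return caret === 1 || charBefore === ' ' || charBefore === '\n';
}
```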
||
|
|
dd26a2fda5
|
🧩 style: Agent Side Panel Layout and Consistency Fixes (#12676)
* style: Update padding in ActionsPanel, ModelPanel, and AdvancedPanel for improved layout
* Adjusted padding in ActionsPanel, ModelPanel, and AdvancedPanel components to enhance visual consistency and layout.
* Changed `py-4` to `pt-2` in the main container of each panel to reduce vertical spacing and improve overall design aesthetics.
* style: Update text size in MCPTool component for improved accessibility
* Changed text size in the MCPTool component to `text-sm` for better readability and consistency across the UI.
* This adjustment enhances the user experience by ensuring that text is appropriately sized for various display settings.
* style: Enhance layout and accessibility in ActionsInput, ActionsPanel, and AgentPanel components
* Updated the layout in ActionsInput to improve flex properties and ensure better responsiveness.
* Refined the structure of ActionsPanel for a more consistent visual hierarchy and added accessibility features.
* Adjusted the AgentPanel form layout for improved usability and streamlined component integration.
* Changed text size in Dropdown component to `text-sm` for better readability across the UI.
* style: Refactor layout in ActionsInput component for improved responsiveness
* Updated the layout of the ActionsInput component to enhance flex properties and ensure better responsiveness.
* Adjusted the textarea styling for improved usability and consistency in design.
* Removed commented-out code related to example functionality to clean up the component structure.
* refactor: Simplify single line code detection in MarkdownComponents
* Introduced a new utility function `isSingleLineCode` to streamline the logic for determining if code is a single line.
* Updated references in the `MarkdownCode` and `MarkdownCodeNoExecution` components to use the new utility function for improved readability and maintainability.
* Enhanced the `processChildren` function in the Markdown editor to handle non-code elements more effectively.
|
||
|
|
edd4c6d60c
|
💎 fix: Handle usage_metadata in Title Transaction for Gemini Models (#12386)
* fix: handle `usage_metadata` in title transaction for Gemini models
Gemini models return token usage via `usage_metadata` instead of `usage` or `tokenUsage`. The `collectedUsage` mapping in `titleConvo` only handled the latter two, causing title generation transactions to be silently skipped for Gemini. Adds an `else if (item.usage_metadata)` branch to extract `input_tokens`/`output_tokens`.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove trailing whitespace in usage_metadata handler
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
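A minimal sketch of the added branch, assuming simplified shapes for the three usage variants; the actual `collectedUsage` mapping in `titleConvo` carries more context than shown here.

```typescript
// Simplified shape of a usage item collected during title generation.
interface CollectedUsageItem {
  usage?: { input_tokens: number; output_tokens: number };
  tokenUsage?: { promptTokens: number; completionTokens: number };
  usage_metadata?: { input_tokens: number; output_tokens: number };
}

function extractTitleTokens(item: CollectedUsageItem) {
  if (item.usage) {
    return { input: item.usage.input_tokens, output: item.usage.output_tokens };
  } else if (item.tokenUsage) {
    return { input: item.tokenUsage.promptTokens, output: item.tokenUsage.completionTokens };
  } else if (item.usage_metadata) {
    // Gemini models report usage here instead of `usage` or `tokenUsage`
    return { input: item.usage_metadata.input_tokens, output: item.usage_metadata.output_tokens };
  }
  return null; // no usage reported, so the transaction is skipped
}
```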
||
|
|
6183303653
|
🔉 fix: Normalize audio MIME types in STT format validation (#12674)
* fix: normalize audio MIME types in STT format validation
Use getFileExtensionFromMime() to normalize non-standard MIME types (e.g. audio/x-m4a, audio/x-wav, audio/x-flac) before checking against the accepted formats list in azureOpenAIProvider. This is the same class of bug as #12608 (text/x-markdown), but for STT audio validation. Only audio/ and video/ MIME prefixes are normalized to prevent non-audio types from matching via the webm default fallback. Export getFileExtensionFromMime for testability.
Fixes #12632
* fix: reject unknown audio subtypes in STT format validation
Use MIME_TO_EXTENSION_MAP for normalization instead of getFileExtensionFromMime() which falls back to 'webm' for unrecognized types. Gate raw subtype matching on audio/video prefix to prevent non-audio types (e.g. text/webm) from passing validation. Resolves Codex review comment about unknown subtypes silently passing.
---------
Co-authored-by: Tobias Jonas <t.jonas@innfactory.de>
|
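A minimal sketch of the final validation shape described above. Both the abbreviated `MIME_TO_EXTENSION_MAP` and the accepted-format list are illustrative; the real tables in the codebase are larger.

```typescript
// Illustrative subset of the normalization map.
const MIME_TO_EXTENSION_MAP: Record<string, string> = {
  'audio/x-m4a': 'm4a',
  'audio/x-wav': 'wav',
  'audio/x-flac': 'flac',
};

// Illustrative accepted-format list for the STT provider.
const acceptedFormats = new Set(['mp3', 'm4a', 'wav', 'webm', 'flac']);

function isAcceptedAudioFormat(mimeType: string): boolean {
  const [prefix, subtype] = mimeType.split('/');
  // Only audio/video MIME types may fall back to raw-subtype matching,
  // so types like text/webm cannot sneak through.
  if (prefix !== 'audio' && prefix !== 'video') {
    return false;
  }
  const normalized = MIME_TO_EXTENSION_MAP[mimeType] ?? subtype;
  return acceptedFormats.has(normalized);
}
```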
||
|
|
76e9543f99
|
🧹 chore: Cap PR Indexes at 3 and Add Delete-Before-Sync (#12672)
* fix: add docker system prune before image pull to prevent disk exhaustion
The 60GB droplet filled up after ~40 deploys because each docker compose pull leaves the previous image's layers as dangling/unused. The gitnexus image is ~700MB, so ~40 stale copies ≈ 28GB of dead layers. Combined with indexes, OS, and Docker's build cache, the disk hits 100% and the next pull fails with 'no space left on device'.
Add a docker system prune -af --volumes BEFORE pulling the new image on every deploy. This removes stopped containers, unused networks, all images not referenced by a running container, and build cache. Running containers are never touched. Typically frees 1-2GB per deploy (the previous image's layers).
Also add a hard 2GB free-space guard after prune so the deploy fails with a clear error instead of letting docker pull attempt a 700MB extract onto a near-full disk.
* fix: cap PR indexes at 3 + delete-before-sync for 10GB disk
The 10GB droplet has ~2GB free. Each index is ~130MB, so 7 PR indexes (~900MB) plus main+dev (~260MB) plus the ~700MB Docker image leaves almost nothing for image pulls. The deploy failed with 'no space left on device' during docker compose pull.
Three changes:
1. Cap PR indexes at MAX_PR_INDEXES=3. The resolve step now sorts PR artifacts by created_at descending and only keeps the 3 most recent. Older PR indexes are logged as evicted and their droplet folders get cleaned by the prune step.
2. Prune BEFORE sync (was after). Freeing disk space from evicted indexes before rsyncing new data is critical on a tight disk. The old order (sync then prune) could briefly hold both old evicted indexes and newly-uploaded ones simultaneously.
3. Delete-before-sync for every index, including main/dev. Instead of rsync --delete (which transfers new files then removes extras), rm -rf the target folder before rsync so the disk never holds both old and new copies of the same index (~260MB saved per index). Main/dev are only deleted when a fresh artifact is about to replace them — never evicted between deploys.
Budget on 10GB disk:
OS + Docker engine: ~4.0 GB
Docker image (running): ~0.7 GB
main + dev indexes: ~0.26 GB
3 PR indexes: ~0.39 GB
Docker prune headroom: ~0.7 GB (for image pull)
Free: ~3.9 GB
* refine: restrict automatic PR indexing to danny-avila authored PRs
With 200+ open PRs and a 10GB disk capped at 3 served PR indexes, auto-indexing every contributor PR burns CI minutes for artifacts that will mostly be evicted before anyone queries them. Narrow the pull_request auto-trigger to PRs authored by danny-avila only. Other contributors' PRs can still be indexed on demand via /gitnexus index (contributor-gated comment command) or manual workflow_dispatch — both arrive as workflow_dispatch events and bypass the pull_request filter entirely.
* fix: drop --volumes from docker system prune to preserve Caddy TLS state
The deploy workflow explicitly handles a caddy-not-running state later in the same step. If Caddy is stopped when the prune runs, --volumes deletes the caddy-data and caddy-config volumes (TLS certs + ACME account keys), forcing a Let's Encrypt re-issuance on next start. LE rate-limits to 5 certs per domain per week, so repeated wipes could brick HTTPS for days.
docker system prune -af (without --volumes) still removes stopped containers, unused networks, all dangling/unreferenced images, and build cache — which is where the disk savings come from. Named volumes are left untouched.
* fix: rsync-then-swap instead of delete-before-sync
The delete-before-sync pattern removed the live index BEFORE rsync ran. If rsync failed (SSH timeout, disk pressure, network error), the index was already gone — production served nothing for that repo until a later deploy succeeded.
Replace with rsync-then-swap: upload to a .new temp directory, and only rm + mv into place after rsync succeeds. On rsync failure, the .new temp is cleaned up and the old index stays live. The cost is ~130MB of extra disk while both old and new coexist, but the prune step runs first and frees evicted PR indexes, so this fits comfortably on the 10GB disk.
* fix: fail deploy on main/dev rsync failure, soft-fail PRs only
The rsync-then-swap pattern downgraded ALL failures to a warning, so the deploy continued even when LibreChat or LibreChat-dev failed to sync. The job would pull the new image, restart the container, and report success while serving stale or missing core indexes.
Split by criticality: main/dev rsync failures now exit 1 (aborting the deploy before the container restart). PR index failures remain soft-fail with a warning — a missing PR index is inconvenient but shouldn't take the whole server down.
|
||
|
|
f6b73938af
|
📝 docs: add RAG_API_URL to .env.example (#12665)
|
||
|
|
a613caced3
|
🧊 fix: In-Memory Endpoint Token Config Cache Isolation (#12673)
* fix: endpoint token config not using shared cache in same process (initializing clients)
* refactor: Update default max context tokens for agent initialization
- Introduced a constant `DEFAULT_MAX_CONTEXT_TOKENS` set to 32000.
- Updated the `initializeAgent` function to use this constant instead of hardcoded values for maximum context tokens, improving maintainability and clarity.
* refactor: shared caching mechanism for token configuration
- Introduced a memoized in-memory cache for Keyv instances to ensure shared access across the same namespace, improving cache efficiency.
- Updated the `standardCache` function to utilize the new in-memory cache for the TOKEN_CONFIG namespace.
- Refactored the `initializeCustom` function to use the `tokenConfigCache` for better cache management.
- Removed redundant tokenCache parameter from `fetchModels` to streamline the function signature.
* fix: match TOKEN_CONFIG TTL and add memoization tests
Pass Time.THIRTY_MINUTES to tokenConfigCache() to match the TTL used by getLogStores.js, preventing load-order-dependent expiry behavior. Add 7 automated tests covering in-memory memoization: referential identity, cross-call-site data sharing, namespace isolation, first-caller TTL semantics, fallbackStore bypass, and tokenConfigCache parity with direct standardCache access.
* fix: export DEFAULT_MAX_CONTEXT_TOKENS, address review nits
- Export the constant so tests (and future consumers) reference it directly instead of hardcoding the numeric value.
- Add independent TTL assertion for tokenConfigCache (R-1 nit).
- Add tokenConfigCache mock to custom/initialize.spec.ts.
|
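A minimal sketch of per-namespace memoization for cache instances, as described above. The function names and the way the TTL is threaded through are assumptions; the real `standardCache`/`tokenConfigCache` implementations live in the cache factory.

```typescript
// Generic per-namespace memoization: whatever store factory is passed in,
// every caller asking for the same namespace gets the same instance back,
// so data written by one call site is visible to all the others.
function createNamespaceMemo<T>(factory: (namespace: string, ttl?: number) => T) {
  const instances = new Map<string, T>();
  return (namespace: string, ttl?: number): T => {
    let instance = instances.get(namespace);
    if (!instance) {
      instance = factory(namespace, ttl); // first caller's TTL wins
      instances.set(namespace, instance);
    }
    return instance;
  };
}

// Usage sketch (assuming a Keyv-style constructor):
// const getSharedCache = createNamespaceMemo((namespace, ttl) => new Keyv({ namespace, ttl }));
// const tokenConfigCache = () => getSharedCache('TOKEN_CONFIG', Time.THIRTY_MINUTES);
```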
||
|
|
b40e8be7c8
|
🖼️ fix: Hide Duplicate Image Placeholder During Image Generation (#12654)
* fix: Hide duplicate image placeholder during image generation
* test: Update OpenAIImageGen tests for conditional Image rendering
|
||
|
|
5d108df665
|
🌍 i18n: Update translation.json with latest translations (#12646)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> |
||
|
|
60cee6eec7
|
🔍 fix: Anthropic Web Search Multi-Turn Issue and Attachment Results (#12651)
* 🔍 fix: Improve WebSearch Progress Handling Based on Attachment Results
- Adjusted progress handling in the WebSearch component to treat searches as complete if attachments contain results, addressing issues with server tool calls not receiving completion signals.
- Introduced `effectiveProgress` to reflect the actual state of progress based on the presence of results, enhancing the accuracy of cancellation and completion states.
- Updated related logic to ensure proper handling of search completion and finalization states based on the new progress calculations.
* fix: only override progress when not streaming
During streaming (isSubmitting=true), use actual progress so the
searching/processing/reading states display correctly. Only override
to 1 after streaming completes to prevent the cancelled check from
hiding the component.
* chore: Update @librechat/agents and mathjs dependencies to latest versions
* chore: Upgrade mathjs dependency to version 15.2.0 across package-lock and package.json files
|
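A minimal sketch of the `effectiveProgress` override described above; the parameter names and the cancellation predicate are simplifications of the actual component logic.

```typescript
// While the response is still streaming, trust the reported progress so the
// searching/processing/reading states display correctly. Once streaming ends,
// a turn that produced results is treated as complete even if no explicit
// completion signal arrived for the server tool call.
function getEffectiveProgress(
  progress: number,
  isSubmitting: boolean,
  hasResults: boolean,
): number {
  if (isSubmitting) {
    return progress;
  }
  return hasResults ? 1 : progress;
}

// A turn only reads as cancelled when streaming is over and nothing completed.
const isCancelled = (effectiveProgress: number, isSubmitting: boolean) =>
  !isSubmitting && effectiveProgress < 1;
```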
||
|
|
5cc783b8e8
|
🎯 fix: Preserve Selected Artifact When Clicking Artifact Button (#12601)
* fix: preserve selected artifact when clicking artifact button
* fix: preserve artifact selection on click and during streaming
* fix: remove broken streaming guard, add JSDoc and test
- Remove `userHasManualSelection` guard from effect #3: it cannot distinguish manual clicks from system auto-selection, blocking auto-advancement to new artifacts during streaming.
- Add JSDoc on `currentArtifactIdRef` explaining why it must not be added to effect deps (toggle-close regression).
- Add test verifying auto-advancement during streaming.
* style: use standard multi-line JSDoc format for ref comment
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
|
||
|
|
c4bb41137d
|
🔑 fix: Clear Stale Client Registration on invalid_client During OAuth Token Refresh (#12643)
* fix: clear stale client registration on invalid_client during token refresh
When a token refresh fails with `invalid_client`, the stored DCR client registration is no longer valid on the authorization server. The existing error handler only checked for `unauthorized_client` and returned null, leaving the stale client_id cached in the database permanently. Every subsequent token refresh attempt would fail with the same error.
Now when `invalid_client` is detected during refresh:
1. The stale client registration is deleted from the database
2. A `ReauthenticationRequiredError` is thrown to trigger a fresh OAuth flow with new dynamic client registration
Also passes `deleteTokens` from MCPConnectionFactory to getTokens() so the cleanup has access to the token deletion method.
* fix: address review findings for stale client cleanup on token refresh
- Delete stale refresh token alongside client registration on invalid_client (Finding 1)
- Add tests for all new code paths: cleanup, warning, case-insensitivity, cleanup failure (Finding 2)
- Detect all vendor-specific client rejection patterns (client_id mismatch, client not found, unknown client) with case-insensitive matching (Finding 3)
- Use else-if for mutually exclusive error branches (Finding 4)
- Log warning when deleteTokens is not available on client rejection (Finding 6)
- Fix log message to say "attempting to clear" before async cleanup (Finding 7)
- Extract isClientRejectionMessage to shared utility, refactor MCPConnectionFactory.isClientRejection to use it
* fix: address followup review findings
- Extract isInvalidClientMessage (4 stale-client patterns) from isClientRejectionMessage to eliminate pattern duplication between utils.ts and tokens.ts (Finding 1)
- Remove redundant staleIdentifier variable, reuse identifier already in scope (Finding 2)
- Separate await from .then() on Promise.allSettled for readability (Finding 3)
- Add dedicated unit tests for isInvalidClientMessage and isClientRejectionMessage (Finding 4)
- Assert error message content in primary invalid_client test (Finding 5)
- Add JSDoc on deleteTokens in GetTokensParams (Finding 6)
---------
Co-authored-by: Mani Japra <mani@muonspace.com>
|
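A minimal sketch of the detection-and-cleanup flow described above. The pattern list comes from the review notes; the handler's signature, the `deleteTokens` callback shape, and the plain `Error` stand-in for `ReauthenticationRequiredError` are assumptions for illustration only.

```typescript
// Stale-client patterns named in the review notes; matching is case-insensitive.
const INVALID_CLIENT_PATTERNS = ['invalid_client', 'client_id mismatch', 'client not found', 'unknown client'];

export const isInvalidClientMessage = (message: string): boolean => {
  const lower = message.toLowerCase();
  return INVALID_CLIENT_PATTERNS.some((pattern) => lower.includes(pattern));
};

/** Sketch of the refresh-failure path: clear the stale registration, then force re-auth. */
async function onTokenRefreshError(
  error: Error,
  deleteTokens?: (kind: 'client' | 'refresh_token') => Promise<void>,
): Promise<never> {
  if (isInvalidClientMessage(error.message)) {
    if (deleteTokens) {
      // The DCR registration and its refresh token are both unusable now.
      await Promise.allSettled([deleteTokens('client'), deleteTokens('refresh_token')]);
    } else {
      console.warn('deleteTokens unavailable; stale client registration left in place');
    }
    throw new Error('Reauthentication required: stale OAuth client registration cleared');
  }
  throw error;
}
```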
||
|
|
7b48203906
|
🗂️ feat: Sidebar Icon Toggle & New Chat History Switch (#12642)
* 🗂️ feat: Sidebar Icon Toggle & New Chat History Switch
Add collapse-on-active-click for sidebar icons (VSCode-style) and optionally switch to Chat History panel when creating a new chat.
* fix: Address review findings — extract DEFAULT_PANEL constant, add tests
Export DEFAULT_PANEL from ActivePanelContext and use it in ExpandedPanel
instead of hardcoding 'conversations'. Add ExpandedPanel tests covering
NavIconButton collapse toggle and NewChatButton panel switch behaviors.
* fix: Address review — prop-drill setActive, test disabled setting, strengthen assertions
Pass setActive as a prop to NewChatButton instead of subscribing to
ActivePanelContext, avoiding wasted re-renders on every panel switch.
Add negative-path test for switchToHistory=false. Add positive panel
assertions to inactive-icon click tests. Fix import order.
|
||
|
|
9b9a86d17d
|
🔀 fix: Resolve Action Tools by Exact Name to Prevent Multi-Action Domain Collision (#12594)
* 🐛 fix: resolve Action tools by exact tool name to prevent multi-action collision
When two OpenAPI Actions on the same Agent share a hostname, the second action's entry overwrote the first in the encoded-domain Map and one action's tools silently disappeared from the LLM payload. The buggy resolution loop also used substring matching, which caused similar shadowing for any encoded-domain prefix overlap.
This change builds a Map keyed on the full tool name (`<operationId>_action_<encoded-domain>`) directly, mirroring the exact lookup pattern that getActionToolDefinitions already uses. Each function in an action's spec gets its own slot, so two actions sharing a hostname no longer collide. Both the new and the legacy domain encodings are registered for each function so agents whose stored tool names predate the current encoding still resolve.
Applied at all three call sites that had the buggy pattern:
- processRequiredActions (assistants/threads path)
- loadAgentTools (agent build path)
- loadActionToolsForExecution (agent execution path)
Adds three regression tests covering both ordering directions and the execution path. Tests fail without the fix and pass with it.
* 🐛 fix: Normalize action tool name at lookup + cover assistants path
Follow-up to the multi-action domain collision fix. Addresses PR #12594 review feedback:
**Must-fix #1 — short-hostname lookup mismatch.** The toolToAction map is keyed on the `_`-collapsed domain, but `agent.tools` and `currentAction.tool` persist the raw `domainParser(..., true)` output, which for hostnames ≤ ENCODED_DOMAIN_LENGTH is a `---`-separated string (e.g. `medium---com`). Exact-match `Map.get()` missed those keys and silently dropped the tool. Fix: normalize every incoming tool name through a new `normalizeActionToolName` helper before the lookup in `loadAgentTools`, `processRequiredActions`, and `loadActionToolsForExecution`.
**Must-fix #2 — assistants path coverage.** `processRequiredActions` received the same structural rewrite but had zero tests. Added a regression test under `multi-action domain collision regression` that drives two shared-hostname actions through the assistants path and asserts each tool reaches its own request builder.
**Must-fix #3 — legacy encoding branch coverage.** The `if (legacyNormalized !== normalizedDomain)` registration was never exercised by any test. Added a test where `agent.tools` stores the legacy-format name and asserts it still resolves.
**Should-fix #4 — DRY the registration loop.** Extracted `registerActionTools({ toolToAction, functionSignatures, normalizedDomain, legacyNormalized, makeEntry })`. All three call sites now share the same key-building logic; the key template lives in one place.
**Should-fix #5 — remove stale optional chaining.** In `loadActionToolsForExecution`, `functionSignature?.description ?? ''` became `functionSignature.description` — `sig` is always defined by the iterator, matching the style of `loadAgentTools`.
**Should-fix #6 — drop unreachable `!requestBuilder` guard.** Entries in `processRequiredActions` are now pre-built with `requestBuilder: requestBuilders[sig.name]`, which `openapiToFunction` always produces alongside the signature, so the guard is dead.
**Should-fix #7 — unwrap `actionSetsData`.** It now holds a bare `Map` instead of `{ toolToAction }`; the sentinel `!actionSetsData` check still works because `new Map()` is truthy.
Also added a short-hostname regression test (`loadAgentTools resolves raw ---separated tool names`) that reproduces Must-fix #1: it fails against the previous commit (0 create calls) and passes with the normalization in place. 41 tests, all passing. The 3 new regression tests are under `multi-action domain collision regression` and cover the assistants path, the legacy encoding branch, and the short-hostname lookup path.
* 🐛 fix: Tighten registerActionTools key handling and assistants test
Follow-up to
|
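A minimal sketch of the exact-name registration and lookup described above. `registerActionTools` and `normalizeActionToolName` are named in the commit, but the bodies below are illustrative: the real registration takes full function signatures rather than bare names, and the real normalization rules for the legacy `---`-separated form may differ.

```typescript
/** Illustrative only: the real helper normalizes the legacy short-hostname form. */
const normalizeActionToolName = (name: string): string => name.replace(/---/g, '_');

interface ActionEntry {
  action: unknown;
  requestBuilder: unknown;
}

// Each function from an action's OpenAPI spec gets its own Map slot, keyed on
// the full tool name, so two actions sharing a hostname can no longer
// overwrite each other's entries.
function registerActionTools(params: {
  toolToAction: Map<string, ActionEntry>;
  functionNames: string[];
  normalizedDomain: string;
  legacyNormalized: string;
  makeEntry: (name: string) => ActionEntry;
}): void {
  const { toolToAction, functionNames, normalizedDomain, legacyNormalized, makeEntry } = params;
  for (const name of functionNames) {
    toolToAction.set(`${name}_action_${normalizedDomain}`, makeEntry(name));
    if (legacyNormalized !== normalizedDomain) {
      // Agents whose stored tool names predate the current encoding still resolve.
      toolToAction.set(`${name}_action_${legacyNormalized}`, makeEntry(name));
    }
  }
}

// Lookup side: normalize the incoming name first, then do an exact Map.get
// instead of substring matching.
const resolveActionTool = (toolToAction: Map<string, ActionEntry>, toolName: string) =>
  toolToAction.get(normalizeActionToolName(toolName));
```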
||
|
|
277fdd2b43
|
🪪 feat: Optimized Entra ID Group Sync with Auto-Creation (#12606)
* feat: implement optimized Entra group sync with auto-creation
## Changes
### MUST FIX (Critical Issues) - RESOLVED
1. **BUG FIX: Prevent unintended user removal from existing groups**
- ISSUE: db.syncUserEntraGroups() was called with only missing groups, causing removal
from all existing Entra groups (full bidirectional sync behavior)
- SOLUTION: Replaced with db.upsertGroupByExternalId() for each missing group followed
by single bulkUpdateGroups() to add memberships (race-safe, idempotent)
- BENEFIT: User memberships correctly maintained for mix of existing + new groups
2. **JSDoc @throws contradiction**
- ISSUE: JSDoc declared function throws, but implementation catches all errors
- SOLUTION: Removed @throws from JSDoc - function is best-effort
- BENEFIT: Prevents unnecessary try/catch in caller code
3. **Missing test for group creation flow**
- ISSUE: Auto-creating missing Entra groups had no test coverage
- SOLUTION: Added regression test for mix of existing + new groups scenario
- BENEFIT: Prevents future regressions on critical path
### SHOULD FIX (Important Improvements) - RESOLVED
4. **E11000 race condition handling**
- SOLUTION: Upserts are idempotent and race-safe by design
- BENEFIT: Concurrent logins no longer race each other
5. **Direct Mongoose access instead of db layer**
- SOLUTION: Added findGroupsByExternalIds() helper to userGroup.ts
- BENEFIT: Centralized data access, easier to add tenant scoping
6. **Serial DB round-trips on login path**
- ISSUE: 40+ queries for user with 20 new groups
- SOLUTION: Promise.all() for parallel upserts + single bulkUpdate
- BENEFIT: ~10x performance improvement
7. **Graph API 429/503 throttling unhandled**
- SOLUTION: Retry logic with exponential backoff (1s, 2s delays)
- BENEFIT: Temporary API issues no longer cause permanent membership loss
8. **Sequential batch requests slow**
- ISSUE: 200 groups = 10 batches × 200ms = ~2s sequential
- SOLUTION: Promise.all() with concurrency limit (5 parallel batches)
- BENEFIT: ~400ms total time
## Minor Fixes
- Removed dead code check
- PII removal: user._id instead of user.email in logs
- ES6 shorthand fixes
- Style consistency (blank lines)
- Projection optimization
## Verification
✅ npm run build - success
✅ npm run test:api - 61/61 passing (+ new regression test)
✅ npm run lint - no errors
✅ All feedback from danny-avila resolved
* docs: better JSDoc for the syncUserEntraGroupMemberships method
---------
Co-authored-by: Airam Hernández Hernández <airam.hernandez@intelequia.com>
|
||
|
|
55840286d4
|
🔬 fix: Scope Web Search Results to Own Turn (#12631)
* 🔬 fix: Scope Web Search Results to Own Turn in WebSearch Component
Fix duplicated search results when multiple web_search tool calls occur
in a single response. Each WebSearch instance now displays only its own
turn's results instead of aggregating all turns.
* test: Add WebSearch turn-scoping tests
Cover the regression-prone turn isolation logic: verify each WebSearch
instance renders only its own turn's sources when sharing a SearchContext,
test the attachments-first priority path and the searchResults fallback,
and assert component states (cancelled, error, streaming, complete).
* fix: Use getAllByText for duplicate sr-only + visible text in tests
WebSearch renders status text in both an sr-only aria-live region and
a visible span, causing getByText to fail with multiple matches.
* refactor: Make ownTurn a string to avoid repeated conversions
|
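A minimal sketch of the turn scoping described above, with heavily simplified types; the real component reads results from a shared SearchContext and also handles streaming, cancelled, and error states.

```typescript
interface TurnResult {
  turn: string;
  sources: unknown[];
}

// Each WebSearch instance keeps its own turn key and filters the shared
// results down to matching entries, preferring attachment-delivered results
// over streamed ones.
function selectOwnTurnResults(
  ownTurn: string, // kept as a string to avoid repeated number/string conversions
  attachmentResults: TurnResult[],
  streamedResults: TurnResult[],
): unknown[] {
  const fromAttachments = attachmentResults.filter((r) => r.turn === ownTurn);
  if (fromAttachments.length > 0) {
    return fromAttachments.flatMap((r) => r.sources);
  }
  return streamedResults.filter((r) => r.turn === ownTurn).flatMap((r) => r.sources);
}
```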
||
|
|
546f006e42
|
💬 feat: Serialize GitNexus Deploys and Post Completion Comments on PR Commands (#12623)
Three related changes that tighten the GitNexus CI/CD loop.
Serialized deploys
- Previous concurrency group was keyed by head ref with cancel-in-progress, which let deploys targeting different refs (e.g. main push + PR command) run in parallel. That's a data race: the prune-stale-indexes step computes active_names up front, so deploy A rsyncing /opt/gitnexus/indexes/LibreChat-pr-12580 can collide with deploy B pruning the same folder based on a pre-rsync view of the active set.
- Collapse to a single global group gitnexus-deploy with cancel-in-progress: false. All deploys queue behind one another. A rsync/docker-compose restart is never killed mid-operation. The 20-minute job timeout bounds queue depth.
PR completion feedback
- Add an "index complete" comment step in gitnexus-index.yml that fires only when inputs.pr_number is set (i.e. the run came via the /gitnexus command). Posts success or failure with a link to the run and whether embeddings were generated.
- Add a "deploy complete" comment step in gitnexus-deploy.yml that handles both trigger paths: workflow_run from a native PR auto-index (PR number recovered from the matrix entry whose runId matches the trigger run), and workflow_dispatch from the index workflow's bot-fallback path (PR number passed through as a new inputs.pr_number).
- Plumb inputs.pr_number through the bot-fallback dispatch in gitnexus-index.yml so the deploy workflow knows where to comment for command-triggered runs.
- Only comments on the PR that asked for the index, never broadcasts.
Workflow rename
- Drop the "DigitalOcean" suffix from the deploy workflow's display name and filename. The platform is still DO (.do/gitnexus/ still holds the compose + caddy config) but the workflow itself is platform-agnostic in form and the suffix was visual noise.
- File renamed gitnexus-deploy-do.yml -> gitnexus-deploy.yml.
- Concurrency group and all cross-references updated in lock-step.
- permissions at deploy job level now includes pull-requests: write so the completion comment can post.
|
||
|
|
8cb5c62fa1
|
🔗 chore: Dispatch GitNexus Deploy When Index Is Bot-Triggered (#12621)
* fix: dispatch deploy from index when triggered by github-actions[bot]
GitHub Actions suppresses workflow_run events for workflow runs whose triggering actor is GITHUB_TOKEN (to prevent recursive chains). This means when gitnexus-pr-command.yml uses `gh api workflow_dispatch` to kick off gitnexus-index.yml, the downstream gitnexus-deploy-do.yml workflow_run trigger never fires — the PR command indexes the PR but the new artifact never makes it onto the droplet.
Add a final step in gitnexus-index.yml that dispatches the deploy workflow directly via API, but ONLY when the triggering actor is github-actions[bot]. User-triggered runs (push, pull_request, manual workflow_dispatch from the UI) continue to rely on workflow_run as before, so we don't double-deploy. Requires a new actions:write permission at the workflow level for this dispatch. contents:read is unchanged.
* fix: resolve main/dev indexes by artifact name, not branch run query
The resolve step was querying listWorkflowRuns filtered by branch=main and branch=dev, then assuming the latest successful run on each branch produced the expected gitnexus-index-main / gitnexus-index-dev artifact. That assumption breaks for /gitnexus index command runs: the PR command workflow dispatches gitnexus-index.yml with ref=main (because that's where the workflow file lives) and an input pr_number. The resulting run has head_branch='main' but uploads its artifact as gitnexus-index-pr-<N>, not gitnexus-index-main. listWorkflowRuns returns that run as the "latest success on main", the download step tries to fetch gitnexus-index-main from it, and the API returns "no artifact matches any of the names or patterns provided".
Fix: resolve all indexes (main, dev, and PRs) through the same listArtifactsForRepo path the PR discovery already uses. Looks up the freshest non-expired artifact by name directly, so the run's head_branch and event type don't matter — if the artifact exists, we find it; if not, we warn and move on. Side benefit: the resolution logic is now shorter and consistent across branches and PRs.
* fix: paginate open PRs and parallelize artifact lookups
The resolve step was capped at 100 open PRs by github.rest.pulls.list's per_page ceiling — LibreChat has 200+ open at any given time, so the tail of the PR queue was silently skipped. On top of that, the inner artifact lookup loop was serial, so even after pagination the resolve step would take 40-60 seconds on a busy repo (one API call per PR).
- Replace the single-page rest.pulls.list call with github.paginate, which follows the Link header across pages and returns the full open-PR set regardless of count.
- Drop the 100-PR truncation warning that was a known-limitation notice for exactly this case.
- Batch the per-PR artifact lookups into groups of 10 via Promise.all. 200 PRs now take ~10 seconds instead of ~60, and the burst stays well within the authenticated rate limit (5000/hr).
- Add a final core.info summary showing how many of the open PRs actually had a servable index artifact, so the log is useful for debugging why a specific PR isn't showing up on the droplet.
|
||
|
|
990763cbee
|
🧠 feat: Enable GitNexus Embeddings for Dev Branch and PR Indexes (#12620)
* feat: auto-enable embeddings for dev and PR indexes too
Previously only main branch pushes got --embeddings; dev and contributor PRs ran graph-only and relied on BM25 search. Semantic search on those indexes silently returned empty, which defeats the whole point of serving them to MCP clients.
New logic: every automatic trigger (push to main/dev, pull_request from contributors) enables --embeddings. Only workflow_dispatch still respects the explicit input toggle, so operators can run a fast graph-only re-index when they don't need fresh vectors.
Cost: adds ~3-5 minutes per index run. Acceptable tradeoff for having semantic search work across all served branches + open PRs instead of just main.
* refine: gate PR embeddings on unit-test path relevance
Previous version auto-enabled --embeddings on every contributor PR, which cost ~3-5 min of extra CI per index even on PRs that couldn't benefit from semantic code search (docs, config, workflow files, i18n strings, etc.).
New logic mirrors the backend-review.yml and frontend-review.yml path filters — if a PR doesn't touch api/, client/, or packages/ it won't trigger unit tests and it doesn't need embeddings. The check queries the GitHub API for the PR's changed file list via `gh api repos/.../pulls/<N>/files` (paginated for very large PRs) and enables embeddings only when at least one path matches.
main/dev pushes still always embed. workflow_dispatch still respects the explicit input toggle, which also covers the /gitnexus index [embeddings] PR command. The contributor gate at the job level is unchanged — non-contributor PRs are still skipped entirely regardless of paths.
* feat: /gitnexus command works for non-contributor and fork PRs
The command workflow already gated on the commenter's author association (not the PR author's), so a contributor commenting /gitnexus index on an outside contributor's PR passes the auth check. But the downstream index workflow checked out the PR's raw head SHA, which only exists in the fork for cross-repo PRs — actions/checkout fetches from the base repo's origin and fails.
Switch the command workflow to dispatch with refs/pull/<N>/head instead of the SHA. GitHub mirrors every PR's head into the base repo as this ref regardless of whether the PR is from a fork, so the checkout always resolves.
End result: a contributor can type `/gitnexus index embeddings` on any PR — including one opened by a first-time contributor from a fork — and the index (with embeddings, if requested) is built and served. The contributor takes responsibility for the trust boundary by typing the command.
Updated the relevant header/inline comments in both workflows so the next maintainer understands the refs/pull/<N>/head choice and the commenter-based gating.
* refine: /gitnexus index defaults to embeddings on
A contributor typing the command has already chosen to spend ~5 minutes of CI on a full re-index; they wouldn't invoke the command just to get a BM25-only result. Flip the default so the short form `/gitnexus index` produces an embeddings-enabled index.
Modifier semantics:
/gitnexus index -> embeddings ON (new default)
/gitnexus index embeddings -> embeddings ON (explicit, no-op alias)
/gitnexus index fast -> embeddings OFF (opt-out)
/gitnexus index graph-only -> embeddings OFF (alias)
/gitnexus index no-embeddings -> embeddings OFF (alias)
The previous `embeddings` modifier is preserved as a no-op alias so anyone who learned the earlier form still gets what they expected.
|
||
|
|
39fb93f6c4
|
🏗️ chore: Set Up Docker Buildx for GitNexus Image GHA Cache Export (#12619)
The first GHCR build on main failed with: ERROR: Cache export is not supported for the docker driver. The default docker driver on ubuntu-latest runners can't export cache to type=gha. docker/setup-buildx-action@v3 without a driver argument defaults to docker-container, which supports both cache-from and cache-to. Gated on the same condition as the build step so it only runs when an image rebuild is actually needed. |
||
|
|
8eab39bc8f
|
🌊 feat: Add GitNexus DigitalOcean Pipeline with PR Index Serving (#12612)
* feat: migrate GitNexus deployment from Fly.io to DigitalOcean droplet
Fly.io's 1GB machine was pegged at ~900MB memory with load spiking to
2.7 under even modest query load. Moving to a 2GB+ DO droplet that can
take advantage of existing credits.
Architecture change: indexes no longer baked into the image. Instead,
a long-lived image (built only when .do/gitnexus/ changes) is pulled
from GHCR, and the deploy workflow rsyncs .gitnexus/ data into
/opt/gitnexus/indexes/<name>/ on the droplet and restarts only the
gitnexus container. Caddy stays running so TLS certs don't churn.
- Add .do/gitnexus/Dockerfile (same native-addon + extension patch
layers as the Fly variant, but no COPY indexes/ step)
- Add .do/gitnexus/docker-compose.yml with gitnexus + caddy services
on an internal bridge network, 1.8GB memory limit, healthcheck
- Add .do/gitnexus/Caddyfile with automatic HTTPS for the configured
subdomain and bearer token auth for all routes except /health
- Add .do/gitnexus/entrypoint.sh that registers every index mounted
at /indexes/<name>/.gitnexus at container start, then runs
gitnexus serve bound to 0.0.0.0 (internal docker network only)
- Add .do/gitnexus/install-extensions.js for LadybugDB FTS/vector
extension pre-install (workaround for upstream bug)
- Add .github/workflows/gitnexus-deploy-do.yml that builds the image
only on Dockerfile/entrypoint changes, pushes to GHCR, rsyncs the
index artifacts to the droplet, and restarts the gitnexus container
- Remove .fly/gitnexus/ and .github/workflows/gitnexus-deploy.yml —
Fly app will be destroyed after DO deploy is verified working
Required new secrets: DO_HOST, DO_USER, DO_SSH_KEY. GITNEXUS_DOMAIN
and API_TOKEN live in /opt/gitnexus/.env on the droplet itself.
* refactor: prefix deploy secrets with GITNEXUS_ for namespace isolation
Rename DO_HOST -> GITNEXUS_DO_HOST, DO_USER -> GITNEXUS_DO_USER, and
DO_SSH_KEY -> GITNEXUS_DO_SSH_KEY so the secrets are clearly scoped
to the gitnexus deploy and don't collide with any other DigitalOcean
secrets LibreChat might add later.
* feat: serve PR indexes alongside main/dev and add /gitnexus command
The index workflow was already building and uploading per-PR indexes
(gitnexus-index-pr-<N>) for contributor PRs, but the deploy workflow
only consumed main and dev artifacts. PR indexes were sitting in
storage doing nothing. This wires them all the way through to the
live MCP server, with proper cleanup when PRs close.
Deploy workflow changes:
- Drop the branches filter on workflow_run so PR index completions
also trigger deploys (PR indexes are already contributor-gated
upstream in gitnexus-index.yml via author_association)
- Resolve all open PRs via the GitHub API, look up each one's latest
non-expired gitnexus-index-pr-<N> artifact, and serve whichever
ones exist. PRs without an index artifact are skipped — we don't
retroactively index anything.
- Per-ref concurrency group so rapid pushes to the same PR coalesce
but different refs still deploy in parallel
- After rsyncing active indexes, prune any /opt/gitnexus/indexes/
folder that isn't in the active set. Safety net for missed PR
close events.
New workflow: gitnexus-cleanup-pr.yml
- Fires on pull_request closed (merged or not)
- SSHs to the droplet, removes /opt/gitnexus/indexes/LibreChat-pr-<N>,
restarts the gitnexus container
New workflow: gitnexus-pr-command.yml
- Listens for issue_comment events where body starts with /gitnexus
- Contributor gated via author_association
- Supports: /gitnexus index — index with defaults
/gitnexus index embeddings — index with --embeddings
- Dispatches gitnexus-index.yml with the PR number and head SHA,
reacts to the comment with a rocket emoji
Index workflow changes:
- New dispatch inputs pr_number and pr_ref for command-driven runs
- Checkout step uses inputs.pr_ref when set so the PR's head commit
is analyzed instead of the default branch
- Artifact naming falls back through pr_number -> pull_request number
-> ref_name, keeping existing behavior for push/PR events
- Concurrency group switches to pr-<N> when dispatched by the command
so re-runs on the same PR debounce correctly
* chore: remove Fly variant reference from Dockerfile header
The Fly variant no longer exists in the repo, so the comparison
comment is meaningless. Rewritten as a standalone description of
the image's design.
* review: resolve 15 findings from review audit
Critical
- Drop the unused caddy binary from the gitnexus image. Caddy runs in
its own container in this architecture; installing it inside the
gitnexus image added ~40-60MB for no reason and contradicted the
Dockerfile header comment.
Major
- Replace ssh-keyscan TOFU with a GITNEXUS_DO_KNOWN_HOST secret.
Both deploy and cleanup workflows now pin the droplet's host key
from a stored value instead of silently trusting whatever the host
presents at deploy time. Fails the workflow if the secret is empty
so no one accidentally regresses to TOFU.
- Gate gitnexus-cleanup-pr.yml to same-repo PRs via
github.event.pull_request.head.repo.full_name == github.repository.
Fork PR closes no longer produce failed runs when secrets are
withheld by GitHub. The deploy workflow's stale-folder prune step
remains the safety net for any fork-contributor indexes.
- Fail fast in entrypoint.sh when main/dev index registration errors.
Previously `|| echo WARN` swallowed failures so a broken index
passed the docker healthcheck and the deploy was marked green
while queries returned empty. PR indexes stay best-effort
(a corrupt PR index shouldn't take the whole server down).
- Authenticate the droplet with GHCR on every deploy using
GITHUB_TOKEN, so private GHCR packages work without documentation
detours or manual docker login on the host. Bootstrap comments
explain the flow.
- Switch docker-compose caddy.depends_on from the short-form
(service_started) to service_healthy so Caddy doesn't route to a
gitnexus container that's still starting (500ms-60s window after
recreation).
Minor
- Guard the HEAD~1 diff with `git rev-parse --verify HEAD~1` so the
first-commit and workflow_run-from-PR cases default to rebuild
instead of silently skipping a legitimately-changed image.
- Move `packages: write` off the workflow-level permissions and onto
the build-image job. deploy no longer inherits unnecessary GHCR
write access.
- Skip the SSH session in cleanup-pr.yml when no gitnexus-index-pr-<N>
artifact ever existed for the PR. Eliminates ~95% of no-op SSH
round-trips on a busy repo (docs-only PRs, paths-ignored PRs, etc).
- Reload Caddy in-place after config upload with `caddy reload`,
falling back to force-recreate on reload failure and `compose up`
on first-time bootstrap. Picks up Caddyfile or env changes without
losing TLS certs.
- Replace `sleep 5` post-deploy with a real readiness poll against
docker's health status. Fails the workflow if gitnexus doesn't
report healthy within 120s, so a broken startup surfaces in CI
instead of being silently marked green.
- Warn when listPulls hits the 100-item per_page ceiling so a future
growth spurt past 100 open PRs doesn't silently drop indexes.
Nit
- Tighten NODE_OPTIONS --max-old-space-size from 1536 to 1280MB,
giving KuzuDB's C++ heap ~512MB of room under the 1792MB cgroup
limit instead of ~256MB.
- Rewrite the stale "headroom for Caddy" comment in entrypoint.sh
(Caddy lives in a separate container now).
- Restore load-bearing comments in install-extensions.js explaining
the @ladybugdb/core path layout and the throwaway-db cache-priming
pattern.
- Parameterize the docker-compose image reference as
${GITNEXUS_IMAGE:-ghcr.io/danny-avila/librechat-gitnexus:latest}
so forks or pinned version tags can override via /opt/gitnexus/.env.
Deferred
- Finding 12 (memory headroom) addressed partially via the heap cap
reduction; full profiling of KuzuDB C++ allocations under query
load deferred to post-deploy monitoring.
* review: resolve 8 follow-up findings from second review pass
Security
- F1: pipe GHCR token via SSH stdin instead of expanding it into the
remote command string. Previously `"echo '$GH_TOKEN' | docker login"`
expanded the token locally before SSH sent it as an argument, so the
live token was briefly visible in /proc/<pid>/cmdline on the droplet
to any process running as deploy or root. New form uses
`printf '%s' "$GH_TOKEN" | ssh ... "docker login --password-stdin"`
so the token only travels through the encrypted SSH stdin pipe.
Reliability
- F2: add json-file log rotation (50m x 3 files) via a YAML anchor
shared by both services. Default Docker logging is unbounded and
would eventually fill the 60GB droplet disk.
- F4: set memswap_limit=1792m to match mem_limit. Without this, Docker
lets the container silently spill onto host swap when KuzuDB's C++
heap overruns the 1792m RAM budget, turning sub-second graph queries
into multi-second ones with no alert. Hard OOM-kill is preferable —
unless-stopped restarts the container, the deploy health poll
catches it, the failure is explicit.
- F5: extend the post-deploy health poll from 24 iterations (120s) to
36 iterations (180s) so it clears Docker's own unhealthy-detection
ceiling (start_period 60s + retries 3 * interval 30s = 150s). A
container that legitimately takes 125s to warm up would previously
fail the deploy at 120s while Docker would still report it as
"starting".
Operability
- F3: document the `--no-deps` escape hatch in the compose file header
so an operator can restart Caddy during a gitnexus outage without
being trapped by the service_healthy dependency (e.g. emergency
Caddyfile fix while gitnexus is thrashing).
- F7: rewrite the misleading service_healthy comment. The old text
said it prevents 502s "after a restart", implying continuous
protection. Clarified that depends_on only governs initial compose
up ordering — during force-recreates Caddy briefly routes to a
starting gitnexus and the deploy's health poll is the actual guard.
- F6: add `shopt -s nullglob` before the prune loop so an empty
/opt/gitnexus/indexes directory is an explicit no-op instead of
relying on the quirk that `rm -rf "*"` (with literal "*") silently
succeeds. Next reader won't have to recognize the bash default.
- F8: soft-fail PR artifact downloads when the artifact disappeared
between resolve and download. Main/dev artifact failures stay fatal
because a missing main/dev index is a real deploy failure, but a
deleted PR artifact no longer aborts the whole deploy.
* review: resolve 3 NITs from third review pass
- F1: rewrite the printf '%s' comment. The previous version claimed
docker login --password-stdin rejects trailing newlines, which is
inaccurate — docker login strips whitespace. The real reason for
printf over echo is byte-exact output and portability, and the
token-in-process-table security rationale is already documented
in the preceding sentences.
- F2: when a PR artifact download soft-fails, the PR's name stays
in active_names so the prune step keeps the droplet's existing
copy instead of wiping it (stale > empty). Make this transition
visible by spelling it out in the ⚠️ warning message.
- F3: fencepost fix in the health poll. The previous loop ran 36
iterations and claimed "180s" in the comment, but the final
iteration exits without a trailing sleep, so the real ceiling
was 35 * 5s = 175s. Extended to 37 iterations (36 sleeps * 5s
= 180s) so the comment matches reality.
|
||
|
|
b1fee80de4
|
📑 fix: Alias Mimetype text/x-markdown to text/markdown (#12608)
text/x-markdown is a deprecated markdown MIME type, but we still see some clients send it. This change allows such files to be uploaded as text/markdown. |
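A minimal sketch of the aliasing described above; the actual alias handling may cover other deprecated MIME types as well.

```typescript
const MIME_ALIASES: Record<string, string> = {
  'text/x-markdown': 'text/markdown',
};

// Map deprecated MIME types onto their modern equivalents before validation.
const normalizeMimeType = (mimeType: string): string =>
  MIME_ALIASES[mimeType] ?? mimeType;

// normalizeMimeType('text/x-markdown') === 'text/markdown'
```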
||
|
|
70c91f8afd
|
🩹 chore: Correct lbug-adapter Path for GitNexus Vector Extension Patch (#12609)
The published gitnexus npm package compiles pool-adapter.ts to dist/mcp/core/lbug-adapter.js, not dist/core/lbug/pool-adapter.js as I guessed in the previous PR. The sed patch failed the Docker build because the target file didn't exist. Point the patch at the correct compiled path and add a grep -c verification after sed to confirm the replacement actually landed. |
||
|
|
711747a5a0
|
🔎 fix: Install LadybugDB Extensions and Patch Vector Load for GitNexus Search (#12607)
Two upstream GitNexus 1.5.3 bugs combine to break query() in serve mode:
1. pool-adapter.ts calls LOAD EXTENSION fts but never INSTALL fts. The CI-produced .gitnexus/ artifact doesn't include the extension cache (~/.kuzu/extension/), so LOAD silently fails in the fresh container and all BM25/FTS searches return empty.
2. pool-adapter.ts only loads the FTS extension — it never loads the vector extension. Every CALL QUERY_VECTOR_INDEX in semanticSearch fails, so hybrid search's semantic leg returns empty too.
Combined, query() returns empty even for exact function names because both the BM25 and semantic legs of RRF merging have zero inputs. Meanwhile context()/cypher()/route_map() still work because they use plain Cypher with no extensions.
Workaround:
- Add install-extensions.js that runs INSTALL fts and INSTALL vector against a throwaway database during Docker build, populating ~/.kuzu/extension/ with the cached extension binaries
- Sed-patch pool-adapter.js at build time to also LOAD EXTENSION vector alongside the existing FTS load, wrapped in try/catch
- Update deploy workflow to copy install-extensions.js into the build context
|
||
|
|
a121ae3dea
|
🩺 fix: Capture GitNexus Serve Output and Add Startup Health Check (#12599)
* fix: capture gitnexus serve output and add startup health check
gitnexus serve was backgrounded with no log capture, so crash reasons were invisible in Fly logs. Now pipes output to stdout and waits up to 30s for the server to be ready before starting Caddy, with early exit if the process dies.
* fix: install newer libstdc++ for LadybugDB native addon
@ladybugdb/core prebuilt binary requires GLIBCXX_3.4.32 (GCC 13+) but node:24-slim ships Bookworm's libstdc++6 which only has 3.4.31. Pull libstdc++6 from Debian Trixie to satisfy the runtime dependency.
* fix: separate build tools from libstdc++ upgrade to avoid conflict
Installing Trixie's libstdc++6 alongside Bookworm's g++ fails because the Trixie libc6 transitive dep conflicts with Bookworm's libc6-dev. Split into two RUN steps: compile native addons first, remove g++, then upgrade libstdc++ from Trixie with no conflicting packages.
* fix: name repo as LibreChat and add dev branch deploy support
- Use REPO_NAME build arg (default: LibreChat) as the WORKDIR so gitnexus registers the index with a proper name instead of "repo"
- Deploy workflow now triggers on both main and dev branch index runs
- Dev branch registers as "LibreChat-dev", main as "LibreChat"
- workflow_dispatch gains a branch selector input
* feat: serve both main and dev indexes from one container + auto-embed main
- Dockerfile now copies multiple indexes from indexes/<name>/.gitnexus and registers each with gitnexus, so one server handles both branches
- Deploy workflow downloads latest successful main + dev artifacts in parallel and bundles them into a single deploy
- list_repos returns LibreChat and LibreChat-dev; queries target either via the repo parameter
- Main branch pushes auto-enable --embeddings for semantic search; dev and PRs remain graph-only for speed (opt-in via dispatch input)
- Bump index job timeout to 25m to account for embedding generation
* fix: raise Fly machine memory to 1GB + add swap + cap Node heap
The 512MB machine was getting OOM-killed when gitnexus serve's default --max-old-space-size=8192 over-committed memory during query spikes.
- Bump VM memory 512mb -> 1gb
- Add 512MB swap file to absorb transient spikes
- Cap Node heap at 768MB via NODE_OPTIONS so it stays within the machine's real capacity and leaves headroom for Caddy and the OS
|
||
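The first sub-commit above (capture output, wait for readiness, bail out early) is a small supervision loop. A rough sketch of the same logic in Node terms follows; the real entrypoint is presumably a shell script, and the port, readiness URL, and Caddy invocation here are assumptions.

```ts
// start.ts: illustrative supervisor, not the actual entrypoint. Assumes gitnexus
// serve listens on 127.0.0.1:3000; adjust to the real port and readiness path.
import { spawn } from 'node:child_process';

const serve = spawn('gitnexus', ['serve'], { stdio: 'inherit' }); // output flows to Fly logs
let exited = false;
serve.on('exit', (code) => {
  exited = true;
  console.error(`gitnexus serve exited early with code ${code}`);
});

async function waitForReady(timeoutMs = 30_000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (exited) throw new Error('gitnexus serve died before becoming ready');
    try {
      const res = await fetch('http://127.0.0.1:3000/');
      if (res.ok) return; // server is up
    } catch {
      // not listening yet; keep polling
    }
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
  throw new Error('gitnexus serve not ready within 30s');
}

waitForReady()
  .then(() => spawn('caddy', ['run'], { stdio: 'inherit' })) // start the proxy only once upstream is up
  .catch((err) => {
    console.error(err);
    process.exit(1);
  });
```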
|
|
4f133f8955
|
✨ v0.8.5-rc1 (#12569) | ||
|
|
f128587bb7
|
📦 chore: bump axios, @librechat/agents (#12598)
* chore: bump @librechat/agents to v3.1.64
* chore: update axios to version 1.15.0 across multiple packages |
||
|
|
46b529a86a
|
📩 fix: Restore Primary Action Button Visibility in Light Mode (#12591)
The default <Button> variant (bg-primary / text-primary-foreground) renders invisible in light mode on the Agent Marketplace 'Start Chat' button and the Grant Access dialog 'Save Changes' button, making these primary actions undiscoverable without hovering.

Switch these three affirmative-action buttons to variant='submit', which uses the hardcoded 'bg-surface-submit text-white' combination already defined in the Button component specifically to avoid the contrast issues of the default variant (see the comment above the 'submit' variant in packages/client/src/components/Button.tsx).

No visual change in dark mode; in light mode the buttons now render as the green submit color with white text, matching the semantic intent of the action.

Co-authored-by: Timothy Look <timothy.look@pmv.eu> |
||
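For illustration, the variant swap described above is a one-attribute change at each call site. A hedged sketch follows; the component path, props, and call-site names are assumptions rather than the actual Marketplace code.

```tsx
// Hypothetical call site; the real Marketplace component and import path differ.
import { Button } from '~/components/Button';

// Before: <Button onClick={onStartChat}>Start Chat</Button>
//   default variant (bg-primary / text-primary-foreground), invisible in light mode.
// After: the 'submit' variant hardcodes bg-surface-submit text-white for reliable contrast.
export function StartChatAction({ onStartChat }: { onStartChat: () => void }) {
  return (
    <Button variant="submit" onClick={onStartChat}>
      Start Chat
    </Button>
  );
}
```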
|
|
1a83f36cda
|
📌 feat: Add Pin Support for Model Specs (#11219)
* feat: Enhance favorites functionality to support model specs

* refactor(FavoritesList): reorder imports based on length

* feat: improve favorite modelSpec controller; refactor: useIsActiveItem hook

* refactor: consolidate Favorite type, harden controller, add tests

- Add canonical TUserFavorite type in data-provider, replace three duplicate definitions (data-service, data-schemas, store/favorites)
- Consolidate FavoritesController spec validation into single block, add return on 500 paths, add maxlength to mongoose sub-schema
- Fix import order in FavoritesList.tsx, merge namespace type imports in FavoriteItem.tsx and FavoritesList.tsx
- Add focus-visible:ring-inset on pin buttons to prevent ring clipping
- Add explicit return type and JSDoc on useIsActiveItem hook
- Use props.type for narrowing consistency in FavoriteItem getTypeLabel
- Add 22 backend tests for FavoritesController (spec validation, typeCount exclusivity, persistence, GET path)
- Add 40 frontend tests: useFavorites spec methods, useIsActiveItem observer lifecycle, ModelSpecItem pin button, FavoriteItem all three type branches, FavoritesList spec rendering

* fix: address PR review findings for pin model specs

- Harden backend validation to reject partial cross-type fields (e.g. spec+endpoint, agentId+model without endpoint)
- Add stale-spec auto-cleanup in FavoritesList mirroring agent cleanup
- Add type="button" to pin buttons in ModelSpecItem/EndpointModelItem
- Fix import order violations in EndpointModelItem and ModelSpecItem
- Remove hollow test, dead key prop, inline trivial helpers
- Fix misleading test description, add onSelectSpec to test mock
- Add return to controller success responses for consistency
- Add 6 backend tests for partial cross-type field validation

* fix: guard stale-spec cleanup against unloaded startupConfig

Prevents race condition where spec favorites are incorrectly deleted on cold start before startupConfig has loaded. Mirrors the existing agentsMap === undefined guard pattern used for stale agent cleanup. Also adds tests for stale-spec cleanup persistence and fixes namespace import pattern in FavoritesList.spec.tsx.

* fix: replace nested ternaries with if/else in FavoriteItem

Resolves ESLint no-nested-ternary warnings for name and typeLabel derivations.

---------

Co-authored-by: Danny Avila <danny@librechat.ai> |
||
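The "exactly one favorite type" validation described above is easiest to picture as a discriminated shape. The sketch below is a hypothetical rendering of that rule; the real TUserFavorite lives in the data-provider package and its field names may differ.

```ts
// Hypothetical shape: illustrative only, not the actual data-provider type.
type TUserFavorite =
  | { user: string; spec: string }                     // pinned model spec
  | { user: string; agentId: string }                  // pinned agent
  | { user: string; endpoint: string; model: string }; // pinned endpoint/model pair

// Controller-side exclusivity check: a payload mixing fields across branches
// (e.g. spec + endpoint, or agentId + model without endpoint) is rejected.
function favoriteTypeCount(
  body: Partial<Record<'spec' | 'agentId' | 'endpoint' | 'model', string>>,
): number {
  return [
    body.spec !== undefined,
    body.agentId !== undefined,
    body.endpoint !== undefined || body.model !== undefined,
  ].filter(Boolean).length;
}

// Valid only when exactly one favorite type is present and it is complete.
const isValid = (body: Parameters<typeof favoriteTypeCount>[0]): boolean =>
  favoriteTypeCount(body) === 1 && (body.model === undefined || body.endpoint !== undefined);
```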
|
|
daa8f0ea6b
|
📂 fix: Respect supportedMimeTypes Config in File Picker Accept Filter (#12596)
* fix: respect supportedMimeTypes config in file picker accept filter

The browser file picker's accept attribute was hardcoded by provider identity, ignoring the endpoint's supportedMimeTypes from fileConfig. Users who configured permissive MIME types (e.g., '.*') still saw a restrictive filter in the upload dialog.

Add isPermissiveMimeConfig utility that detects wildcard patterns in the endpoint's supportedMimeTypes. When permissive, the file picker accept attribute is set to empty (unrestricted). Non-permissive configs retain the existing provider-based defaults.

Closes #12589

* fix: address review findings for isPermissiveMimeConfig

- Use non-standard MIME namespace probe (x-librechat/x-probe) so category-wildcard patterns like ^application\/.*$ no longer false-positive as permissive
- Add single-line JSDoc to isPermissiveMimeConfig
- Use !== undefined instead of != null (fileType is never null)
- Add endpointFileConfig to dropdownItems useMemo deps to prevent stale closure when config changes without endpoint change
- Add tests for broad application and multi-category patterns

* fix: wrap handleUploadClick in useCallback to satisfy exhaustive-deps

handleUploadClick is captured inside the dropdownItems useMemo but was not in its dependency array. Wrap it in useCallback with endpointFileConfig.supportedMimeTypes as the sole dependency, then reference the stable callback in the useMemo deps. |
||
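A sketch of the probe-based check described in the review-findings bullets above; the utility name comes from the entry, but the signature and the assumption that supportedMimeTypes arrives as an array of RegExp are not taken from the code.

```ts
// Sketch of isPermissiveMimeConfig; the config shape (RegExp[]) is an assumption.
function isPermissiveMimeConfig(supportedMimeTypes?: RegExp[]): boolean {
  if (!supportedMimeTypes?.length) {
    return false;
  }
  // Probe with a non-standard MIME namespace: only a true wildcard like /.*/ matches,
  // so category-wide patterns such as /^application\/.*$/ are not treated as permissive.
  return supportedMimeTypes.some((pattern) => pattern.test('x-librechat/x-probe'));
}

// Example: with a permissive config the file input's accept attribute is left
// empty (unrestricted); otherwise a provider-based default stays in place.
const accept = isPermissiveMimeConfig([/.*/]) ? '' : 'image/*,application/pdf';
```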
|
|
81275ff0e0
|
⏱️ refactor: User Job Tracking TTL and Proactive Cleanup to Redis Job Store (#12595)
* refactor: Add user job tracking TTL to RedisJobStore

- Introduced a new TTL for per-user job tracking sets, set to 24 hours, to enhance job management.
- Updated RedisJobStoreOptions interface to include userJobsSetTtl for configuration.
- Modified job creation and deletion methods to manage user job sets effectively, ensuring proper expiration and cleanup.
- Enhanced comments for clarity on the new TTL functionality and its implications for user job tracking.

* fix: Address review findings for user job tracking TTL

- Remove redundant `del(userJobsKey)` in `getActiveJobIdsByUser` that raced with concurrent `createJob` on other replicas (Redis auto-deletes empty Sets after SREM)
- Guard `userJobsSetTtl: 0` from silently destroying tracking sets (`EXPIRE key 0` deletes the key on Redis 7.0+)
- Extract `deleteJobInternal` so `cleanup()` reuses the already-fetched userId instead of issuing a redundant HGETALL per stale job
- Add integration tests for TTL behavior, proactive SREM, configurable userJobsSetTtl, and TTL refresh on repeated createJob

* fix: Address follow-up review findings for RedisJobStore

- Use deleteJobInternal in cleanup() terminal-but-in-running-set path to ensure userJobsKey SREM is not skipped
- Clear local caches in deleteJob before the fallible getJob call so they are cleaned even on transient Redis errors
- Add proactive SREM tests for aborted and error terminal statuses
- Add test for tenant-qualified user tracking key format

* fix: Preserve completedTtl for non-running jobs in cleanup()

The cleanup() terminal-status branch should only remove tracking set membership, not delete the job hash. deleteJobInternal bypasses the completedTtl window that updateJob already applied, causing clients polling for final status to lose the job data early. |
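The tracking-set bookkeeping described above (SADD plus EXPIRE on create, proactive SREM on delete, and the EXPIRE-0 guard) can be sketched roughly as follows with ioredis; the key names and surrounding structure are assumptions, not the actual RedisJobStore class.

```ts
// Sketch of the per-user job tracking bookkeeping; key format and TTL handling
// are illustrative assumptions, not the real RedisJobStore internals.
import Redis from 'ioredis';

const redis = new Redis();
const USER_JOBS_SET_TTL = 24 * 60 * 60; // 24 hours, in seconds

async function trackJobForUser(userId: string, jobId: string, ttl = USER_JOBS_SET_TTL): Promise<void> {
  const userJobsKey = `jobs:user:${userId}`;
  await redis.sadd(userJobsKey, jobId);
  // Guard ttl = 0: EXPIRE key 0 deletes the key outright on Redis 7.0+,
  // which would silently destroy the tracking set.
  if (ttl > 0) {
    await redis.expire(userJobsKey, ttl); // refreshed on every createJob
  }
}

async function untrackJobForUser(userId: string, jobId: string): Promise<void> {
  // Proactive SREM on deletion; Redis removes the Set automatically once it is empty,
  // so no explicit DEL is needed (and a DEL would race with concurrent createJob).
  await redis.srem(`jobs:user:${userId}`, jobId);
}
```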