CERBERUS/LibreChat - Gitora: Self-Hosted Future for Developers

CERBERUS/LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-07-02 04:12:36 +00:00

Author	SHA1	Message	Date
Danny Avila	49f4b659f6	🔐 fix: Honor Admin-Panel MCP Allowlist Overrides Without Restart (#13814 ) * 🔐 fix: Honor Admin-Panel MCP Allowlist Overrides Without Restart MCPServersRegistry was built once at boot from getAppConfig({ baseOnly: true }), freezing allowedDomains/allowedAddresses to YAML. Admin-panel mcpSettings overrides were ignored by both inspection (addServer/ reinspectServer/updateServer/lazyInitConfigServer) and runtime connection enforcement (assertResolvedRuntimeConfigAllowed), so a domain allowed only via the panel failed inspection and never connected. Make the registry's effective allowlists mutable and refresh them from the merged admin-panel config: seed at boot, and re-apply on every config mutation via invalidateConfigCaches -> clearMcpConfigCache. Both inspection and connection paths read the same getters, so both honor overrides without a restart. Fail-safe: current allowlists are preserved when the merged read fails. * 🛡️ fix: Scope MCP allowlist refresh to global config, fail-safe on DB error Address Codex P1 review findings on the allowlist-refresh path: - Tenant-scoped config mutations no longer push one tenant's merged mcpSettings into the process-wide registry singleton (read by all MCP connection paths), which would leak allowlists across tenants. Only global (non-tenant) mutations refresh the registry; tenant mutations still evict the config-server cache. - The refresh read now uses strictOverrides:true so a transient DB error throws instead of silently returning YAML base config — preserving the last-known allowlists rather than overwriting them with fallback values. Adds the strictOverrides option to getAppConfig (default off, no behavior change for existing callers). * ♻️ refactor: Resolve MCP allowlists per-request (tenant-scoped) instead of a global singleton Supersedes the prior global-mutation approach. MCP allowlists live in mcpSettings, which is tenant/principal-scoped admin config, so a process-wide singleton value is the wrong model — it caused cross-tenant bleed and stale reads. Instead, inject a resolver (from the app layer, where the merged config lives) that the registry calls per inspection and per connection. It reads the ALS tenant context via getAppConfig and accepts the acting user so user/role-scoped overrides resolve; config-source inspection (no user) resolves at tenant scope. Falls back to the YAML base allowlists when no resolver is set or the lookup fails, so a transient error fails to the operator baseline rather than disabling the allowlist. Removes the now-unnecessary setAllowlists / boot-seed / invalidateConfigCaches refresh / getAppConfig.strictOverrides machinery. * 🔒 fix: Scope config-source cache by allowlist; resolve OAuth allowlists per-request Address Codex review of the per-request resolver: - Config-source cache key now folds in the resolved allowlists, not just the raw-config hash. Inspection results became allowlist-dependent, so without this a tenant whose allowlist rejects a URL could poison the shared key with an inspectionFailed stub for a tenant that allows it (and vice versa). The tenant-scoped allowlist is resolved once per ensureConfigServers pass and threaded through the cache key + inspection. - The two remaining request-time OAuth allowlist reads now use the merged config instead of the YAML base getters: the fallback OAuth-initiate path (routes/mcp.js) via resolveAllowlists, and OAuth revocation (UserController.maybeUninstallOAuthMCP) via the request's already-merged appConfig.mcpSettings. Without this, an OAuth endpoint allowed only by an admin-panel override was rejected while inspection/connection allowed it. * ✅ test: Update MCP OAuth registry/config mocks for per-request allowlists CI fix for the Finding-12 change. The OAuth-initiate route now calls registry.resolveAllowlists() and the revocation path reads the merged appConfig.mcpSettings, so the affected specs' mocks were asserting the old base-getter values: - routes/__tests__/mcp.spec.js: add resolveAllowlists to the registry mock. - UserController.mcpOAuth.spec.js: provide mcpSettings on the getAppConfig mock so revokeOAuthToken still receives the expected allowlists. * 🧪 test: e2e proof that admin-panel MCP allowlist override takes effect Adds a Playwright mock-harness spec for #13809. A URL-based MCP fixture (e2e-http, streamable-http SDK server) boots inspectionFailed because its origin is omitted from the YAML mcpSettings.allowedDomains; the spec adds that origin via an admin config override (PUT /api/admin/config/user/:id) and asserts the server reinitializes — exercising the real resolver path through the backend + DB. Before the fix, reinspection used the frozen YAML allowlist and the server stayed unreachable. - e2e/setup/fake-mcp-http-server.js: streamable-HTTP MCP fixture (health GET /). - e2e/playwright.config.mock.ts: boot the fixture as a second webServer. - e2e/config/librechat.e2e.yaml: mcpSettings.allowedDomains (excludes 127.0.0.1) + the e2e-http server. - e2e/specs/mock/mcp-allowlist-override.spec.ts: login → baseline reinit fails → apply override → reinit succeeds.	2026-06-17 20:14:53 -04:00
Danny Avila	c27d6b85a4	🤫 refactor: Silent MCP OAuth Refresh on Mid-Session 401 (#13369 ) * 🤫 fix: Silent MCP OAuth Refresh on Mid-Session 401 Avoids the hourly interactive re-auth prompt when an MCP server (e.g. Azure Entra ID) returns 401 mid-session by attempting a refresh token exchange first, and only falling back to the interactive OAuth flow when no refresh token is stored or the refresh server rejects it. Resolves #13364. * fix: Use distinct flow type for silent token refresh to avoid cache hit Addresses the Codex review on PR #13369: `attemptSilentTokenRefresh` was reusing the `'mcp_get_tokens'` flow type, so `FlowStateManager.createFlowWithHandler` would short-circuit and return the same tokens cached by an earlier `getOAuthTokens` call — the very tokens the server just rejected — without executing the forced-refresh handler. Switch silent refresh to the distinct `'mcp_force_refresh_tokens'` flow type so coalescing still works but stale `mcp_get_tokens` cache entries are not reused. After a successful refresh, invalidate the `mcp_get_tokens` flow cache so the next `getOAuthTokens` call reads the freshly persisted tokens from storage rather than the stale cached value. Add a regression test that simulates the real `FlowStateManager.createFlowWithHandler` cache-hit behavior for `mcp_get_tokens` and verifies the silent refresh handler still runs and returns the freshly refreshed tokens. * fix: Address Codex round-2 review on silent MCP OAuth refresh Three follow-up findings from Codex on PR #13369: 1. The new `mcp_force_refresh_tokens` flow type was itself cached by `FlowStateManager.createFlowWithHandler`, so a subsequent 401 within the refreshed token's `expires_at` could re-serve the just-rejected token without ever re-running the refresh handler. 2. The factory's `oauthRequired` listener was removed immediately after the initial `attemptToConnect` succeeded, so a real mid-session 401 emitted by `MCPConnection.connectClient` during transport recovery had no listener — the OAuth handled-promise would simply time out instead of triggering the silent refresh. 3. Routing the silent refresh through a distinct flow type broke coalescing with the `mcp_get_tokens` lock used by `getOAuthTokens`, letting two paths concurrently redeem the same stored refresh token. For providers that rotate refresh tokens (e.g. Azure Entra) the second redemption is rejected, kicking the user back into interactive OAuth despite a successful refresh elsewhere. Resolution: - Drop `FlowStateManager` from the silent-refresh path entirely. Replace with a process-local `inflightSilentRefreshes` Map keyed by `userId:serverName` that holds only the in-flight Promise (no cached result), so every fresh 401 after settlement triggers a fresh redemption while concurrent 401s for the same user/server still share one redemption. - Stop calling `cleanupOAuthHandlers()` on successful initial connect, keeping the OAuth handler attached for the connection's lifetime so mid-session 401s actually reach `attemptSilentTokenRefresh`. - Add a regression test reproducing the stale-cache scenario by faking the `mcp_get_tokens` cache hit and asserting silent refresh still runs against storage and returns the fresh tokens. - Add a coalescing test asserting two concurrent oauthRequired events for the same user/server result in a single `forceRefreshTokens` call. - Clear `inflightSilentRefreshes` in `beforeEach` to prevent cross-test leakage; switch the silent-refresh test mocks to `mockResolvedValueOnce` / `mockImplementationOnce` so leftover mock state cannot leak into later test cases. Acknowledged remaining gap: the silent refresh still races `getOAuthTokens`'s `mcp_get_tokens` flow when both run concurrently (narrow window when an existing connection's local `expires_at` is still valid but the server invalidated the token, and a new connection is being created in parallel). The race is self-healing on the next 401 and documented inline. * fix: Address Codex round-3 review on silent MCP OAuth refresh Three more findings from Codex on PR #13369: 1. The in-flight silent-refresh promise was unbounded. If `forceRefreshTokens()` ever hung (slow provider, dropped TCP), the `inflightSilentRefreshes` lock stayed occupied forever and every later 401 for the same user/server joined the stuck promise instead of starting a fresh attempt or falling back to interactive OAuth. 2. The interactive-OAuth fallback didn't invalidate the `mcp_get_tokens` flow cache after persisting fresh tokens. For providers that don't issue refresh tokens (so silent refresh returns null), the old cache could still feed stale access tokens to the next `getOAuthTokens` call until its TTL expired — causing an immediate reconnect with the same just-rejected token. 3. When silent refresh failed, the handler fell through to `handleOAuthRequired()` whose recent-completion fast path can reuse a COMPLETED `mcp_oauth` flow within `PENDING_STALE_MS`. Those cached tokens are exactly the ones the server just rejected, so the connection would keep adopting them and looping on 401s until the cache aged out. Resolution: - Wrap `runSilentRefresh()` with a 60-second `withTimeout` (well under `connectClient`'s 120s OAuth timeout). On timeout the `.catch` resolves to null and the `finally` clears the in-flight entry, so the next 401 starts fresh and falls through to interactive OAuth. - Extract two helpers — `invalidateGetTokensFlow` and `invalidateCompletedOAuthFlow` — and call them from the right branches: clear `mcp_get_tokens` after silent-refresh success AND after interactive-OAuth `storeTokens`; clear the COMPLETED `mcp_oauth` state (plus its CSRF mapping) before falling through to interactive OAuth so the fast-reuse path can't re-serve the rejected tokens. - Add three regression tests: hung refresh release-the-lock under fake timers, completed-OAuth cache invalidation pre-fallback, and `mcp_get_tokens` invalidation after interactive token store. * fix: Address Codex round-4 review on silent MCP OAuth refresh Three more findings from Codex on PR #13369: 1. (P1) The silent-refresh in-flight lock keyed only by `userId:serverName`. In multi-tenant setups where two tenants share a userId (e.g. username-based IDs) and the same MCP server name, a concurrent mid-session 401 from tenant B would join tenant A's in-flight refresh and adopt tenant A's freshly minted tokens onto a tenant-B connection — a cross-tenant credential leak. 2. (P2) `invalidateGetTokensFlow` deleted the `mcp_get_tokens` flow state regardless of its status. When another connection was currently in `getOAuthTokens()` (PENDING flow) and joiners were monitoring it, the unconditional delete made those waiters see "Flow state not found" and unnecessarily fall back to interactive OAuth — even though fresh tokens were already being written. 3. (P2) The 60s `withTimeout` wrapping `runSilentRefresh()` only races the promise; it does not cancel the underlying `forceRefreshTokens` / refresh-token HTTP request. If the request returned after a subsequent interactive OAuth had stored newer tokens, the late completion would `storeTokens` over the newer state. This requires a provider that doesn't rotate refresh tokens AND a refresh slower than 60s AND a successful interactive OAuth in that window — narrow but real. Resolution: - Capture `getTenantId()` into a new `factory.tenantId` field at factory construction time (before the OAuth handler closes over it outside the original request's async context) and include it in the silent-refresh lock key as `tenantId:userId:serverName`. - `invalidateGetTokensFlow` now calls `getFlowState` first and only deletes when `status === 'COMPLETED'`. PENDING lookups are left alone so concurrent `getOAuthTokens` waiters via `monitorFlow` can still settle. - For (3), document the race as a known limitation inline. Fully closing it requires threading an `AbortSignal` through `MCPTokenStorage.forceRefreshTokens` and the OAuth refresh handler to skip the late `storeTokens` after timeout — out of scope for this PR's surgical change. - Add `getTenantId` to the `MCPOAuthConnectionEvents` test's `@librechat/data-schemas` mock so the factory constructor doesn't blow up under that suite. - Add three regression tests: per-tenant lock isolation, PENDING-state preservation under `invalidateGetTokensFlow`, and (reused) the existing interactive-store invalidation test now driven through `getFlowState` returning the COMPLETED state. * fix: Address silent MCP OAuth refresh review Restore captured tenant context around token storage and OAuth fallback paths so mid-session callbacks do not lose tenant scope. Thread AbortSignal through forced refresh and OAuth token requests, cap silent refresh by the connection OAuth timeout, and prevent timed-out refreshes from writing stale credentials after fallback. Complete pending mcp_get_tokens flows with fresh tokens, add missing FlowState createdAt test fixtures, and cover the new tenant/abort/cache behaviors. * fix: Tighten tenant-scoped MCP token refresh Cap silent refresh by both the factory connect timeout and the connection OAuth wait timeout so fallback OAuth wins before the outer connect attempt expires. Tenant-scope mcp_get_tokens flow ids for both token lookup and refresh invalidation, preventing cross-tenant flow completion or cache deletion when tenants share user ids and server names. Add regression tests for the omitted initTimeout budget and tenant-prefixed token flow locks. * fix: Reserve MCP OAuth fallback budget * fix: Harden MCP OAuth refresh races * fix: Keep MCP OAuth fallback route-compatible * test: Add SDK MCP OAuth refresh repro * fix: Address MCP OAuth refresh review findings * fix: Address MCP OAuth tenant review findings * fix: Close MCP OAuth route tenant gaps * fix: Preserve MCP OAuth refresh flow guards * fix: Avoid reprocessing MCP OAuth reauth config * fix: Release timed-out MCP refresh locks * fix: Release MCP OAuth request callbacks * fix: Tenant-scope remaining MCP OAuth flow lookups * ci: Sort imports in MCP OAuth test suites	2026-06-10 13:12:42 -04:00
Danny Avila	9dd062e42e	🧯 fix: Harden Data Retention Semantics (#13049 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details GitNexus Index / index (push) Waiting to run Details GitNexus Index / post-index (push) Blocked by required conditions Details * feat: support data retention for normal chats Add retentionMode config variable supporting "all" and "temporary" values. When "all" is set, data retention applies to all chats, not just temporary ones. Adds isTemporary field to conversations for proper filtering. Adapted to new TS method files in packages/data-schemas since upstream moved models out of api/models/. Based on danny-avila/LibreChat#10532 Co-Authored-By: WhammyLeaf <233105313+WhammyLeaf@users.noreply.github.com> (cherry picked from commit `30109e90b0`) * feat: extend data retention to files, tool calls, and shared links Add expiredAt field and TTL indexes to file, toolCall, and share schemas. Set expiredAt on tool calls, shared links, and file uploads when retentionMode is "all" or chat is temporary. (cherry picked from commit `48973752d3`) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: lint/test (cherry picked from commit `310c514e6a`) * fix: address code review feedback for data retention PR Critical: - Fix BookmarkMenu crash: restore optional chaining on conversation - Fix migration hazard: backward-compatible sidebar filter that also checks expiredAt for documents without isTemporary field Major: - Add logging to getRetentionExpiry error path, align with tools.js - Add tests for retentionMode: ALL in saveConvo and saveMessage - Fix share route: apply expiredAt for temporary chats too by querying the conversation's isTemporary flag server-side - Add assertions for getRetentionExpiry mocks in process tests Minor: - Fix ChatRoute isTemporaryChat to be strictly boolean via Boolean() - Fix stale test description (expired -> temporary) - Comment out retentionMode default in example yaml - Simplify verbose if/else to isTemporary === true - Add compound index on { user: 1, isTemporary: 1 } - Remove narrating comment from process.spec.js Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> (cherry picked from commit `6bad535f90`) * chore: fix typescript (cherry picked from commit `826527a46b`) * fix: lint (cherry picked from commit `77817e80ea`) * fix: use mockSanitizeArtifactPath in retention test The 'getRetentionExpiry is called with the request object' test referenced an undefined `mockSanitizeFilename` identifier, breaking both lint (no-undef) and the test suite. Use the existing `mockSanitizeArtifactPath` mock that the surrounding tests already use, since `processCodeOutput` calls `sanitizeArtifactPath` (not `sanitizeFilename`) before invoking `getRetentionExpiry`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit `52ea2da66d`) * fix: forward isTemporary from client for retention on file uploads and tool calls Server-side `getRetentionExpiry` (file uploads) and the tool-call controller both read `req.body.isTemporary`, but the file upload multipart form and the tool-call payload did not include that field. In `retentionMode: temporary` (default), files uploaded and tool calls created from temporary chats were therefore retained indefinitely. Forward the Recoil `isTemporary` flag in both client paths so the existing server checks can fire correctly. `ToolParams` gains an optional `isTemporary` field. Addresses Codex P1 review feedback on PR #29. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit `7e937df05a`) * test: stub store.isTemporary in useFileHandling test mocks Previous commit added `useRecoilValue(store.isTemporary)` to the hook. The test file mocks `~/store` with only `ephemeralAgentByConvoId` and does not stub `useRecoilValue`, so all 7 cases threw "Invalid argument to useRecoilValue: expected an atom or selector but got undefined". Add a stub default export with `isTemporary` and a `useRecoilValue` mock returning `false`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit `eb1609537d`) * fix: harden data retention semantics * fix: provide sweep request context for expired files * fix: preserve temporary flags in all-retention updates * fix: honor assistant versions in retention sweeps * fix: retain non-temporary flags in all mode * fix: hide expired retained records * fix: propagate retained conversation expiry * fix: refresh meili retention cutoff * fix: prevent overlapping file sweeps * fix: show legacy retained conversations * fix: index legacy retained records * fix: harden retention cleanup edge cases * fix: count failed file storage sweeps * fix: preserve legacy temporary retention * fix: assign retention sweep worker deterministically * fix: hide expired shared links on reads * fix: prevent retention refresh after parent expiry * fix: break code output retention import cycle * fix: harden retention review findings * fix: ignore expired share duplicates * fix: reject expired retained share creation * fix: harden retention review edge cases * fix: address retention audit findings * fix: enforce expired conversation shares in all retention * fix: scope temporary upload flag to chat files * fix: address retention review findings * fix: address codex retention review findings * fix: tighten missing storage detection * test: remove unused file process spec bindings --------- Co-authored-by: WhammyLeaf <233105313+WhammyLeaf@users.noreply.github.com> Co-authored-by: Aron Gates <aron@muonspace.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-19 21:58:42 -04:00
Danny Avila	4cce88be42	🪟 feat: Add allowedAddresses Exemption List For SSRF-Guarded Targets (#12933 ) * 🪟 feat: Add allowedAddresses Exemption List For SSRF-Guarded Targets LibreChat already blocks SSRF-prone targets (private IPs, loopback, link-local, .internal/.local TLDs) at every server-side fetch site that consumes user-controllable URLs — custom-endpoint baseURLs, MCP servers, OpenAPI Actions, and OAuth endpoints. The only existing escape hatch is `allowedDomains`, but that flips the field into a strict whitelist: adding `127.0.0.1` to permit a self-hosted Ollama also blocks every public destination that isn't in the list. Introduce `allowedAddresses` as the orthogonal primitive: a private- IP-space exemption list. When a hostname or its resolved IP appears in the list, the SSRF block is bypassed for that target. Public destinations remain reachable. Operators can now run self-hosted LLMs / MCP servers / Action endpoints on private addresses without weakening the default-deny posture for everything else. Schema additions in `packages/data-provider/src/config.ts`: - `endpoints.allowedAddresses` (new — gates `validateEndpointURL`) - `mcpSettings.allowedAddresses` (parallel to `allowedDomains`) - `actions.allowedAddresses` (parallel to `allowedDomains`) Core changes in `packages/api/src/auth/`: - New `isAddressAllowed(hostnameOrIP, allowedAddresses)` — pure, case-insensitive, bracket-stripped literal match. - Threaded the list through `isSSRFTarget`, `resolveHostnameSSRF`, `isDomainAllowedCore`, `isActionDomainAllowed`, `isMCPDomainAllowed`, `isOAuthUrlAllowed`, and `validateEndpointURL`. - Extended `createSSRFSafeAgents` and `createSSRFSafeUndiciConnect` to accept the list, building an SSRF-safe DNS lookup that exempts matching hostnames/IPs at TCP connect time (TOCTOU-safe). Wiring: - Custom and OpenAI endpoint initialize sites pass `endpoints.allowedAddresses` to `validateEndpointURL`. - `MCPServersRegistry` stores `allowedAddresses` and exposes it via `getAllowedAddresses()`. The factory, connection class, manager, `UserConnectionManager`, and `ConnectionsRepository` all thread it through to the SSRF utilities. - `MCPOAuthHandler.initiateOAuthFlow`, `refreshOAuthTokens`, and `validateOAuthUrl` accept the list and consult it on every URL validation along the OAuth chain. - `ToolService`, `ActionService`, and the assistants/agents action routes pass `actions.allowedAddresses` to `isActionDomainAllowed` and to `createSSRFSafeAgents` for runtime action calls. - `initializeMCPs.js` reads `mcpSettings.allowedAddresses` from the app config and forwards it to the registry constructor. Documentation: - `librechat.example.yaml` shows the new field next to each existing `allowedDomains` block, with a note clarifying that `allowedAddresses` is an exemption list (not a whitelist). Tests: - Unit tests for `isAddressAllowed` covering literal IPs, hostnames, IPv6 brackets, case insensitivity, and partial-match rejection. - Exemption tests for every entry point: `isSSRFTarget`, `resolveHostnameSSRF`, `validateEndpointURL`, `isActionDomainAllowed`, `isMCPDomainAllowed`, `isOAuthUrlAllowed`. - Existing tests updated to reflect the new optional parameter. Default behavior is unchanged: omitted = empty list = no exemptions. * 🩹 fix: Plumb allowedAddresses Through AppConfig endpoints Type The initial PR added `endpoints.allowedAddresses` to the data-provider config schema and consumed it in the endpoint initialize sites, but the runtime `AppConfig.endpoints` shape in `@librechat/data-schemas` was a hand-maintained subset that didn't include the new field — so `tsc` rejected `appConfig.endpoints.allowedAddresses`. Add the field to `AppConfig['endpoints']` in `packages/data-schemas/src/types/app.ts` and forward it from the loaded config in `packages/data-schemas/src/app/endpoints.ts` so the runtime config carries the value. Update `initializeMCPs.spec.js` to expect the third positional argument (`allowedAddresses`) on the `createMCPServersRegistry` call. * 🩹 fix: Enforce allowedDomains Before allowedAddresses In isOAuthUrlAllowed The initial implementation checked the address exemption first, so a URL whose hostname appeared in `allowedAddresses` would return true even when the admin had configured `allowedDomains` as a strict bound on OAuth endpoints. A malicious MCP server could advertise OAuth metadata, token, or revocation URLs at any address the admin had permitted for an unrelated reason (a self-hosted LLM at `127.0.0.1`, for example) and pass validation, expanding SSRF reach beyond the configured domain whitelist. Reorder: when `allowedDomains` is set, treat it as authoritative — return true only if the URL matches a domain entry, otherwise fall through to false. The address exemption only applies when no `allowedDomains` is configured (mirrors how the downstream SSRF check in `validateOAuthUrl` consults `allowedAddresses`). Add a regression test asserting that an `allowedAddresses` entry does not broaden a configured `allowedDomains` list. Reported by chatgpt-codex-connector on PR #12933. * 🩹 fix: Forward allowedAddresses To Remaining OAuth Callers Two `MCPOAuthHandler` callers still used the pre-feature signatures and were silently dropping the new `allowedAddresses` argument: - `api/server/routes/mcp.js` invoked `initiateOAuthFlow` with the old 5-argument shape, so OAuth flows initiated through the route handler ignored the registry's `getAllowedAddresses()` and would reject any metadata/authorization/token URL on a permitted private host. - `api/server/controllers/UserController.js#maybeUninstallOAuthMCP` invoked `revokeOAuthToken` without the address exemption, so uninstalling an OAuth-backed MCP server on a permitted private host would fail at the revocation step even though the rest of the MCP connection path now permits it. Both sites now read `allowedAddresses` from the registry alongside `allowedDomains` and forward it. Reported by Copilot on PR #12933. * 🩹 fix: Update Test Mocks And Assertions For OAuth allowedAddresses The previous commit started passing `allowedAddresses` to `MCPOAuthHandler.initiateOAuthFlow` from `api/server/routes/mcp.js` and to `MCPOAuthHandler.revokeOAuthToken` from `api/server/controllers/UserController.js`, but the corresponding test files mocked the registry without `getAllowedAddresses` (causing `TypeError`s) and asserted the old positional shape on `toHaveBeenCalledWith`. Update the mocks and assertions to match the new arity: - `api/server/routes/__tests__/mcp.spec.js`: add `getAllowedDomains`/`getAllowedAddresses` to the registry mock and expect the additional positional args on `initiateOAuthFlow`. - `api/server/controllers/__tests__/maybeUninstallOAuthMCP.spec.js`: add a `getAllowedAddresses` mock alongside the existing `getAllowedDomains` and seed it in `setupOAuthServerFound`. - `api/server/controllers/__tests__/UserController.mcpOAuth.spec.js`: add `getAllowedAddresses` to the registry mock and expect the trailing `null` arg on the three `revokeOAuthToken` assertions. * 🛡️ fix: Address Comprehensive Review — Scope allowedAddresses To Private IP Space Major findings from the comprehensive PR review (severity → fix): CRITICAL — `validateOAuthUrl` SSRF fallback bypass. When `allowedDomains` is configured and a URL fails the whitelist, the SSRF fallback in `validateOAuthUrl` was still passing `allowedAddresses` to `isSSRFTarget` / `resolveHostnameSSRF`, letting a malicious MCP server advertise OAuth endpoints at any address the admin had permitted for an unrelated reason. Suppress `allowedAddresses` in the fallback when `allowedDomains` is active — the address exemption is opt-in for the no-whitelist mode only. MAJOR — WebSocket transport SSRF check ignored exemptions. The `constructTransport` WebSocket branch called `resolveHostnameSSRF(wsHostname)` without `this.allowedAddresses`, so a permitted private MCP server would pass `isMCPDomainAllowed` but be blocked at transport creation. Forward the exemption. Scope `allowedAddresses` to private IP space only (operator directive). The exemption list is for permitting private/internal targets; it must not be a back-door to broaden trust to public destinations. - Schema (`packages/data-provider/src/config.ts`): new `allowedAddressesSchema` rejects URLs (`://`), paths/CIDR (`/`), whitespace, and public IPv4/IPv6 literals at config-load time. Wired into `endpoints`, `mcpSettings`, and `actions`. - Runtime (`packages/api/src/auth/domain.ts`): `isAddressAllowed` now drops public-IP candidates and public-IP entries on the match path — defense in depth so a misconfigured runtime list never grants exemption. - Hot path (`packages/api/src/auth/agent.ts`): `buildSSRFSafeLookup` pre-normalizes the list into a `Set<string>` once at construction and applies the same scoping filter, so the connect-time DNS lookup is an O(1) Set membership check instead of a full re-iterate-and-normalize on every outbound request. Test coverage for the connect-time and OAuth-fallback paths. - `agent.spec.ts`: new describe block exercising `buildSSRFSafeLookup` and `createSSRFSafe` with `allowedAddresses` — hostname-literal exemption, resolved-IP exemption, public-IP scoping, URL/CIDR/whitespace rejection, and the default no-list block. - `handler.allowedAddresses.test.ts` (new): integration tests for `validateOAuthUrl` — covers both the no-domains-set "permit private" path and the strict-bound regression where `allowedAddresses` must NOT bypass `allowedDomains`. Documentation & cleanup.* - `connection.ts` redirect SSRF check: explicit comment that `allowedAddresses` is intentionally NOT consulted for redirect targets (server-controlled, must not inherit the admin's exemption). - `MCPConnectionFactory.test.ts`: replaced an `eslint-disable` with a proper `import { getTenantId } from '@librechat/data-schemas'`. The disable was added to make a pre-existing `require()` quiet — the cleaner fix is to use the existing top-level import. Updated `MCPConnectionSSRF.test.ts` WebSocket SSRF assertions to match the new two-argument call shape (`hostname, allowedAddresses`). * 🩹 fix: Require Absolute URL Before allowedAddresses Trust Bypass In isOAuthUrlAllowed `parseDomainSpec` is lenient — it silently prepends `https://` to schemeless inputs so it can match patterns like bare `example.com`. That leniency leaked into `isOAuthUrlAllowed`'s new `allowedAddresses` short-circuit: a value like `10.0.0.5/oauth` (no scheme) would parse successfully via the prepended default, hit the address-exemption path, return `true`, and skip `validateOAuthUrl`'s strict `new URL(url)` parse-or-throw — only to fail later in OAuth discovery with a less clear runtime error. Add a strict `new URL(url)` gate at the top of `isOAuthUrlAllowed`. Schemeless inputs now fall through to `validateOAuthUrl`'s explicit "Invalid OAuth <field>" rejection. Tests added in both `auth/domain.spec.ts` (unit) and the OAuth handler integration spec (end-to-end). Reported by chatgpt-codex-connector (P2) on PR #12933. * 🛡️ fix: Address Follow-Up Comprehensive Review — Schema Tests, Shared Normalization, host:port Auditing the second comprehensive review: F1 MAJOR — schema validation untested. `allowedAddressesSchema` had zero coverage, so a regression in the three refinement stages or the three wiring locations (`endpoints` / `mcpSettings` / `actions`) would silently let invalid entries reach the runtime. Added a dedicated `describe('allowedAddressesSchema')` block in `config.spec.ts` covering: valid private IPs (v4 + v6, including the previously-missed 192.0.0.0/24 range), accepted hostnames, all rejection categories (URLs, CIDR, paths, whitespace tabs/newlines, host:port, public IP literals), and full `configSchema.parse()` integration at each of the three nesting points. F2 MINOR — `isPrivateIPv4Literal` divergence. The schema reimpl in `packages/data-provider` was discarding the `c` octet, so the `192.0.0.0/24` (RFC 5736 IETF protocol assignments) range that the authoritative `isPrivateIPv4` accepts was being rejected with a misleading "public IP" error. Destructure `c` and add the missing range check; covered by the new schema tests. F3 MINOR — DRY violation across `domain.ts` and `agent.ts`. Both files had independent normalization implementations with a subtle whitespace-check divergence (`/\s/` vs `.includes(' ')`). Extracted the shared logic into a new `packages/api/src/auth/allowedAddresses.ts` module that both consumers import: - `normalizeAddressEntry(entry)` — single-entry shape check - `looksLikeHostPort(entry)` — host:port detector (used by F4) - `normalizeAllowedAddressesSet(list)` — pre-normalized Set for the connect-time hot path - `isAddressInAllowedSet(candidate, set)` — membership check that enforces private-IP scoping on the candidate Both `isAddressAllowed` (preflight) and `buildSSRFSafeLookup` (connect) now go through the same primitives; the whitespace divergence is gone. To break the import cycle (`allowedAddresses` needs `isPrivateIP`, `domain` previously owned it), extracted IP private-range detection into a leaf `auth/ip.ts` module. `domain.ts` re-exports `isPrivateIP` for backward compatibility with existing call sites. F4 MINOR — `host:port` silently misclassified. Entries like `localhost:8080` previously slipped through the URL/path guard, were mis-detected as IPv6, failed `isPrivateIP`, and were silently dropped with a misleading "public IP" schema error. Added an explicit `looksLikeHostPort` check with a clear error: "allowedAddresses entries must not include a port — list the bare hostname or IP only." Bare `::1`, `[::1]`, and other valid IPv6 literals are intentionally not matched (regex distinguishes by colon count and the bracketed `[ipv6]:port` form). F5 MINOR — hostname-trust documentation gap. Hostname entries short-circuit `resolveHostnameSSRF` before any DNS lookup — that's a deliberate design (admin trusts the name) but it means the exemption follows whatever the name resolves to at runtime. Added an explicit note in `librechat.example.yaml` for both `mcpSettings.allowedAddresses` and `endpoints.allowedAddresses`: "a hostname entry trusts whatever IP that name resolves to. Only list hostnames whose DNS you control. Prefer literal IPs when you can." F6 (8 positional params) is flagged for follow-up; refactor to an options object is a breaking-API change deferred to a separate PR. F7 (redirect/WebSocket asymmetry, NIT, conf 40) — skipping; the existing inline comment is sufficient. * 🧹 chore: Address Follow-Up NITs — Import Order And Mirror-Function Naming Three NITs from the latest comprehensive review: NIT #1 (conf 85) — local import order. AGENTS.md requires local imports sorted longest-to-shortest. Both `domain.ts` and `agent.ts` had `./ip` (shorter) before `./allowedAddresses` (longer). Swapped. NIT #2 (conf 60) — missing cross-reference. The schema-side `isHostPortShape` in `packages/data-provider/src/config.ts` had no note pointing at the canonical runtime mirror. Added a JSDoc paragraph explaining the mirror relationship and why a local copy exists (the data-provider package can't import from `@librechat/api` without creating a circular dependency). NIT #3 (conf 50) — naming inconsistency. Renamed `isHostPortShape` → `looksLikeHostPort` so the schema mirror matches the runtime helper exactly. Kept as a separate function (not a shared import) for the same circular-dependency reason; the matching name makes it obvious they should stay in lockstep.	2026-05-03 21:43:59 -04:00
Danny Avila	4e45e8e17c	🧹 fix: Clear MCP OAuth Tokens On Revoke Fixes #12912.\n\n- Clear stored MCP OAuth tokens and flow state on revoke cleanup-only paths.\n- Keep provider revocation best-effort when token and client metadata are available.\n- Add controller and function coverage for stale metadata, missing config, and cleanup failure paths.	2026-05-03 02:52:47 +09:00