Commit graph

4141 commits

Author SHA1 Message Date
Marco Beretta
f4814359cb
fix: keep green focus border on auth secret inputs
SecretInput's modernized default uses focus-visible:border-border-heavy and
hover:border-border-medium, which Tailwind emits after the auth pages' focus:
rules, so they override them. The auth pages now also declare focus-visible:border-green-500
and hover:border-border-light so that cn()/twMerge resolves them as the winners
when classes are concatenated.
2026-05-07 22:05:20 +02:00
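The "last conflicting class wins" behavior this fix relies on can be sketched with a hand-rolled stand-in for twMerge (illustrative only — the real cn()/twMerge understands Tailwind's full utility taxonomy; this toy groups by variant prefix plus first utility segment, so it cannot distinguish e.g. border-width from border-color):

```typescript
// Toy stand-in for twMerge's conflict resolution: classes sharing the
// same variant prefix and utility group conflict, and the class listed
// later wins. Non-conflicting classes all survive.
function mergeClasses(...classes: string[]): string {
  const winners = new Map<string, string>(); // group key -> winning class
  for (const cls of classes.join(' ').split(/\s+/).filter(Boolean)) {
    const splitAt = cls.lastIndexOf(':');
    const variant = splitAt === -1 ? '' : cls.slice(0, splitAt);
    const utility = splitAt === -1 ? cls : cls.slice(splitAt + 1);
    // "focus-visible:border-border-heavy" and
    // "focus-visible:border-green-500" land in the same group.
    const group = `${variant}|${utility.split('-')[0]}`;
    winners.set(group, cls); // later entries overwrite earlier ones
  }
  return [...winners.values()].join(' ');
}
```

So concatenating the component default before the auth page's class leaves the auth page's green border as the winner.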
Marco Beretta
a2c6614cbc
fix: remove focus ring from SecretInput 2026-05-07 19:04:08 +02:00
Marco Beretta
964b5ffd32
fix: align SecretInput eye icon and modernize controls
The wrapper was a flex container, so passing 'mb-2' on the input made it
contribute its margin to the wrapper's cross-axis size — the controls overlay
spanned the inflated height and centered the toggle 4px below the input's
true center. Switching the wrapper to a plain relative block collapses height
back to the input.

Also tightens the toggle/copy buttons (size-7 rounded-md with hover:bg-surface-hover)
and adds a focus ring on the input. Auth pages still override className/buttonClassName
so login/register styling is unchanged.
2026-05-07 17:18:33 +02:00
Marco Beretta
57a43531d4
fix: reveal SecretInput controls on hover 2026-05-06 16:06:41 +02:00
Marco Beretta
c001681b41
fix: use SecretInput for dynamic credentials 2026-05-06 00:13:49 +02:00
Marco Beretta
779fe28ddb
fix: align SecretInput controls 2026-05-06 00:05:04 +02:00
Marco Beretta
d9b7eb9148
chore: remove unused password i18n keys 2026-05-05 23:57:22 +02:00
Marco Beretta
6eaa8595c1
fix: align auth SecretInput styles 2026-05-05 23:45:23 +02:00
Marco Beretta
1eeccd6c5d
feat: use SecretInput for sensitive fields 2026-05-05 16:27:50 +02:00
Artyom Bogachenko
5683706af5
🔐 feat: OIDC Bearer Token Authentication for Remote Agent API (#12450)
* Remote Agent Auth middleware

* consider migration and update user

* fix eslint errors

* add scope validation

* fix codex review errors

* add filter for use: sig

* add jwks-rsa deps

* Fix remote agent OIDC auth review findings

* Polish remote agent OIDC timeout coverage

* Reject remote OIDC tokens without subject

* Use tenant context for remote agent auth config

* Harden remote agent OIDC scope handling

* Polish remote agent OIDC cache and scope tests

* Resolve remote agent auth review comments

* Reuse OpenID email claim resolver for remote auth

* Skip empty OpenID email fallback claims

* Use pre-auth tenant context for remote auth config

* Downgrade expected OIDC fallback logging

* Require secure remote OIDC endpoints

* Polish remote agent auth edge cases

* Enforce unique balance records

* Bind remote OpenID users to issuer

* Fix issuer-scoped OpenID indexes

* Avoid unique balance index requirement

* Fix remote OpenID issuer normalization boundaries

* Require issuer-bound OpenID lookups

* Enforce tenant API key policy after auth

* Fix remote auth tenant policy types

* Normalize remote OIDC discovery issuer

* Allow normalized remote OIDC issuer validation

* Enforce resolved tenant OIDC policy

* Polish OpenID issuer and scope validation

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-04 17:06:35 -04:00
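Two of the steps above — "add filter for use: sig" and "add scope validation" — can be sketched as pure helpers (shapes and names are illustrative; the actual middleware uses jwks-rsa and the tenant's OIDC config, and `agents:invoke` is a hypothetical scope):

```typescript
// Illustrative JWK shape; real JWKS documents carry more fields.
interface Jwk {
  kid: string;
  kty: string;
  use?: string;
}

// Mirrors the "filter for use: sig" step: only signature keys are
// eligible for bearer-token verification.
function filterSigningKeys(keys: Jwk[]): Jwk[] {
  return keys.filter((k) => k.use === 'sig');
}

// Scope validation: every required scope must appear in the token's
// space-delimited `scope` claim.
function hasRequiredScopes(scopeClaim: string, required: string[]): boolean {
  const granted = new Set(scopeClaim.split(/\s+/).filter(Boolean));
  return required.every((s) => granted.has(s));
}
```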
Danny Avila
f5dd053128
🛡️ refactor: Restrict User Tavily Endpoint URLs (#12946) 2026-05-05 05:09:11 +09:00
Danny Avila
3170bd8b22
📦 chore: bump @librechat/agents to v3.1.77 2026-05-03 23:54:46 -04:00
Danny Avila
2c4a78094a
🛂 refactor: Avoid Default Tavily Safe Search (#12939) 2026-05-04 12:01:23 +09:00
Yashwanth Alapati
3da1d8c961
🔍 feat: add Tavily as Search and Scraper Provider (#12581)
* feat: add Tavily integration as search provider and scraper provider

* chore: update Tavily web search parameters

* chore: Tavily parameter update

* chore: update data-schemas test for Tavily

* fix: allow Tavily string option modes

* fix: align Tavily config options

* fix: scope Tavily scraper timeout

* fix: use resolved scraper provider timeout

* fix: widen Tavily search provider types

* fix: harden Tavily web search config

* fix: cap Tavily option timeouts

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-04 11:29:13 +09:00
Danny Avila
d6d70eeb26
📦 chore: bump @librechat/agents to v3.1.76 2026-05-03 22:25:12 -04:00
github-actions[bot]
9b376178a6
🌍 i18n: Update translation.json with latest translations (#12916)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-05-04 11:18:20 +09:00
Danny Avila
619f28d76d
🛡️ fix: Sanitize HTML In Admin Banner And MCP Config Dialog (#12927)
Two `dangerouslySetInnerHTML` sites rendered admin-supplied HTML
without sanitization:

- `Banner.tsx` rendered `banner.message` directly.
- `MCPConfigDialog.tsx` rendered each `customUserVars` description.

Wrap both with DOMPurify, allowing only the inline tags needed for
formatting (links, emphasis, line breaks). Hardens against compromised
admin or yaml supply-chain scenarios. Pattern matches the existing
`CustomUserVarsSection.tsx` and `Tooltip.tsx` sanitizer setup.
2026-05-04 11:17:44 +09:00
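The fix itself wraps both render sites with DOMPurify (e.g. `DOMPurify.sanitize(html, { ALLOWED_TAGS: [...] })`). As a rough illustration of the allowlist idea only — NOT a real sanitizer, and no substitute for DOMPurify — a naive stand-in that escapes any tag outside an inline allowlist might look like:

```typescript
// NOT a real sanitizer — the actual fix uses DOMPurify. This toy only
// illustrates tag allowlisting: tags outside the inline set are escaped
// so they render as text instead of executing. (It ignores attributes,
// which a real sanitizer must also police.)
const ALLOWED_TAGS = new Set(['a', 'b', 'i', 'em', 'strong', 'br']);

function escapeDisallowedTags(html: string): string {
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9-]*)[^>]*>/g, (tag, name: string) =>
    ALLOWED_TAGS.has(name.toLowerCase())
      ? tag
      : tag.replace(/</g, '&lt;').replace(/>/g, '&gt;'),
  );
}
```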
Danny Avila
37429e8a3e
🚦 feat: Make URL Auto-Submit Configurable (#12929)
`/c/new?prompt=…&submit=true` previously auto-submitted the prompt
unconditionally. For deployments where users may receive crafted
links from external sources, an authenticated victim's click can
trigger an immediate, attacker-controlled prompt against a memory- or
tool-enabled model — providing a 1-click vector for prompt-injection
exfiltration via markdown image rendering.

Add `interface.autoSubmitFromUrl` (default `true` to preserve current
behavior). Operators handling sensitive memory/tool data can set it
to `false` so URL-supplied prompts only pre-fill the composer; the
user must press Send explicitly.
2026-05-04 11:17:19 +09:00
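In librechat.yaml the new flag might be set like this (the field name comes from the commit; its placement under the standard `interface` block is an assumption):

```yaml
interface:
  # Default is true, preserving current behavior. Set false so a
  # URL-supplied prompt only pre-fills the composer and the user
  # must press Send explicitly.
  autoSubmitFromUrl: false
```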
Danny Avila
c7f38d9621
🛡️ fix: Validate Avatar URL Before Fetch (#12928)
`resizeAvatar` previously called `node-fetch` on any string input with
no validation. When OIDC providers surface a user-controllable
`picture` claim, this could be used to make blind SSRF requests to
internal services on every social login.

Wrap the URL fetch with:
- An allowlist on the URL protocol (http/https only).
- The shared `createSSRFSafeAgents` utility, which blocks resolution to
  private, loopback, and link-local IPs at TCP connect time
  (TOCTOU-safe; works equally for hostname targets that DNS-resolve
  privately and for IP-literal targets, since Node's `net.Socket`
  always dispatches through the agent's `lookup` hook).
- `redirect: 'error'` so a public-IP redirect target cannot be used to
  bypass the agent check on a subsequent hop.
- A 5-second total request budget (node-fetch v2's `timeout` covers
  request initiation through full body receipt, bounding slow-loris
  exposure rather than just the TCP connect).
- A 10 MB response cap (`size` option + `Content-Length` pre-check +
  post-read length assertion) so a hostile payload cannot exhaust
  memory before `sharp()` rejects it.

Fetch the canonicalized `parsed.href` rather than the raw input string
to eliminate any future parser-differential between `new URL()` and
the underlying fetch implementation.

Per-call agent construction is intentional: the avatar path runs once
per social login per user, so pooling adds complexity without a
measurable benefit. Documented inline.

Comprehensive test coverage in `avatar.spec.js`:
- Rejects malformed URLs, non-http(s) schemes (file://, data:,
  javascript:).
- Asserts the happy-path canonicalization (`fetch` is called with
  `parsed.href`) and the SSRF-safe agent factory routing
  (https→httpsAgent, http→httpAgent).
- Rejects non-2xx HTTP status.
- Rejects an oversized Content-Length before reading the body, and
  asserts `.buffer()` is never invoked in that case.
- Rejects an oversized body even when the server lies about / omits
  Content-Length.
- Surfaces ESSRF, redirect, and `size` overflow errors thrown by the
  fetch layer.
- Confirms Buffer inputs bypass the fetcher entirely.
2026-05-04 11:16:40 +09:00
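The preflight portion of the validation — protocol allowlist plus canonicalization — can be sketched as a pure function (the name and error strings are illustrative, not LibreChat's actual API; the SSRF-safe agent, timeout, and size caps happen at the fetch call itself):

```typescript
// Validate an avatar URL before fetching: parse-or-throw, allow only
// http/https, and return the canonicalized href (never the raw input)
// to eliminate parser differentials between new URL() and the fetch layer.
function validateAvatarUrl(input: string): string {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    throw new Error('Invalid avatar URL');
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Disallowed avatar URL protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}
```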
Danny Avila
4cce88be42
🪟 feat: Add allowedAddresses Exemption List For SSRF-Guarded Targets (#12933)
* 🪟 feat: Add allowedAddresses Exemption List For SSRF-Guarded Targets

LibreChat already blocks SSRF-prone targets (private IPs, loopback,
link-local, .internal/.local TLDs) at every server-side fetch site
that consumes user-controllable URLs — custom-endpoint baseURLs, MCP
servers, OpenAPI Actions, and OAuth endpoints. The only existing
escape hatch is `allowedDomains`, but that flips the field into a
strict whitelist: adding `127.0.0.1` to permit a self-hosted Ollama
also blocks every public destination that isn't in the list.

Introduce `allowedAddresses` as the orthogonal primitive: a private-
IP-space exemption list. When a hostname or its resolved IP appears
in the list, the SSRF block is bypassed for that target. Public
destinations remain reachable. Operators can now run self-hosted
LLMs / MCP servers / Action endpoints on private addresses without
weakening the default-deny posture for everything else.

Schema additions in `packages/data-provider/src/config.ts`:
- `endpoints.allowedAddresses` (new — gates `validateEndpointURL`)
- `mcpSettings.allowedAddresses` (parallel to `allowedDomains`)
- `actions.allowedAddresses` (parallel to `allowedDomains`)

Core changes in `packages/api/src/auth/`:
- New `isAddressAllowed(hostnameOrIP, allowedAddresses)` — pure,
  case-insensitive, bracket-stripped literal match.
- Threaded the list through `isSSRFTarget`, `resolveHostnameSSRF`,
  `isDomainAllowedCore`, `isActionDomainAllowed`, `isMCPDomainAllowed`,
  `isOAuthUrlAllowed`, and `validateEndpointURL`.
- Extended `createSSRFSafeAgents` and `createSSRFSafeUndiciConnect`
  to accept the list, building an SSRF-safe DNS lookup that exempts
  matching hostnames/IPs at TCP connect time (TOCTOU-safe).

Wiring:
- Custom and OpenAI endpoint initialize sites pass
  `endpoints.allowedAddresses` to `validateEndpointURL`.
- `MCPServersRegistry` stores `allowedAddresses` and exposes it via
  `getAllowedAddresses()`. The factory, connection class, manager,
  `UserConnectionManager`, and `ConnectionsRepository` all thread
  it through to the SSRF utilities.
- `MCPOAuthHandler.initiateOAuthFlow`, `refreshOAuthTokens`, and
  `validateOAuthUrl` accept the list and consult it on every URL
  validation along the OAuth chain.
- `ToolService`, `ActionService`, and the assistants/agents action
  routes pass `actions.allowedAddresses` to `isActionDomainAllowed`
  and to `createSSRFSafeAgents` for runtime action calls.
- `initializeMCPs.js` reads `mcpSettings.allowedAddresses` from the
  app config and forwards it to the registry constructor.

Documentation:
- `librechat.example.yaml` shows the new field next to each existing
  `allowedDomains` block, with a note clarifying that
  `allowedAddresses` is an exemption list (not a whitelist).

Tests:
- Unit tests for `isAddressAllowed` covering literal IPs, hostnames,
  IPv6 brackets, case insensitivity, and partial-match rejection.
- Exemption tests for every entry point: `isSSRFTarget`,
  `resolveHostnameSSRF`, `validateEndpointURL`, `isActionDomainAllowed`,
  `isMCPDomainAllowed`, `isOAuthUrlAllowed`.
- Existing tests updated to reflect the new optional parameter.

Default behavior is unchanged: omitted = empty list = no exemptions.
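The core check described above — pure, case-insensitive, bracket-stripped literal match — can be sketched as follows (the later review commits in this PR additionally scope entries to private IP space; that refinement is omitted here, and the hostname in the test is made up):

```typescript
// Normalize one entry or candidate: trim, lowercase, and strip IPv6
// brackets so "[::1]" matches "::1".
function normalizeAddress(entry: string): string {
  const trimmed = entry.trim().toLowerCase();
  return trimmed.startsWith('[') && trimmed.endsWith(']')
    ? trimmed.slice(1, -1)
    : trimmed;
}

// Exemption check: literal match only — no substring or CIDR semantics.
// Omitted list = empty list = no exemptions (the default-deny posture).
function isAddressAllowed(hostnameOrIP: string, allowed: string[] = []): boolean {
  const candidate = normalizeAddress(hostnameOrIP);
  return allowed.some((entry) => normalizeAddress(entry) === candidate);
}
```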

* 🩹 fix: Plumb allowedAddresses Through AppConfig endpoints Type

The initial PR added `endpoints.allowedAddresses` to the
data-provider config schema and consumed it in the endpoint
initialize sites, but the runtime `AppConfig.endpoints` shape in
`@librechat/data-schemas` was a hand-maintained subset that didn't
include the new field — so `tsc` rejected `appConfig.endpoints.allowedAddresses`.

Add the field to `AppConfig['endpoints']` in
`packages/data-schemas/src/types/app.ts` and forward it from the
loaded config in `packages/data-schemas/src/app/endpoints.ts` so the
runtime config carries the value.

Update `initializeMCPs.spec.js` to expect the third positional
argument (`allowedAddresses`) on the `createMCPServersRegistry` call.

* 🩹 fix: Enforce allowedDomains Before allowedAddresses In isOAuthUrlAllowed

The initial implementation checked the address exemption first, so a
URL whose hostname appeared in `allowedAddresses` would return true
even when the admin had configured `allowedDomains` as a strict bound
on OAuth endpoints. A malicious MCP server could advertise OAuth
metadata, token, or revocation URLs at any address the admin had
permitted for an unrelated reason (a self-hosted LLM at `127.0.0.1`,
for example) and pass validation, expanding SSRF reach beyond the
configured domain whitelist.

Reorder: when `allowedDomains` is set, treat it as authoritative —
return true only if the URL matches a domain entry, otherwise fall
through to false. The address exemption only applies when no
`allowedDomains` is configured (mirrors how the downstream SSRF check
in `validateOAuthUrl` consults `allowedAddresses`).

Add a regression test asserting that an `allowedAddresses` entry does
not broaden a configured `allowedDomains` list.

Reported by chatgpt-codex-connector on PR #12933.

* 🩹 fix: Forward allowedAddresses To Remaining OAuth Callers

Two `MCPOAuthHandler` callers still used the pre-feature signatures and
were silently dropping the new `allowedAddresses` argument:

- `api/server/routes/mcp.js` invoked `initiateOAuthFlow` with the old
  5-argument shape, so OAuth flows initiated through the route handler
  ignored the registry's `getAllowedAddresses()` and would reject any
  metadata/authorization/token URL on a permitted private host.
- `api/server/controllers/UserController.js#maybeUninstallOAuthMCP`
  invoked `revokeOAuthToken` without the address exemption, so
  uninstalling an OAuth-backed MCP server on a permitted private host
  would fail at the revocation step even though the rest of the MCP
  connection path now permits it.

Both sites now read `allowedAddresses` from the registry alongside
`allowedDomains` and forward it. Reported by Copilot on PR #12933.

* 🩹 fix: Update Test Mocks And Assertions For OAuth allowedAddresses

The previous commit started passing `allowedAddresses` to
`MCPOAuthHandler.initiateOAuthFlow` from `api/server/routes/mcp.js`
and to `MCPOAuthHandler.revokeOAuthToken` from
`api/server/controllers/UserController.js`, but the corresponding
test files mocked the registry without `getAllowedAddresses` (causing
`TypeError`s) and asserted the old positional shape on
`toHaveBeenCalledWith`.

Update the mocks and assertions to match the new arity:

- `api/server/routes/__tests__/mcp.spec.js`: add
  `getAllowedDomains`/`getAllowedAddresses` to the registry mock and
  expect the additional positional args on `initiateOAuthFlow`.
- `api/server/controllers/__tests__/maybeUninstallOAuthMCP.spec.js`:
  add a `getAllowedAddresses` mock alongside the existing
  `getAllowedDomains` and seed it in `setupOAuthServerFound`.
- `api/server/controllers/__tests__/UserController.mcpOAuth.spec.js`:
  add `getAllowedAddresses` to the registry mock and expect the
  trailing `null` arg on the three `revokeOAuthToken` assertions.

* 🛡️ fix: Address Comprehensive Review — Scope allowedAddresses To Private IP Space

Major findings from the comprehensive PR review (severity → fix):

**CRITICAL — `validateOAuthUrl` SSRF fallback bypass.** When `allowedDomains`
is configured and a URL fails the whitelist, the SSRF fallback in
`validateOAuthUrl` was still passing `allowedAddresses` to `isSSRFTarget` /
`resolveHostnameSSRF`, letting a malicious MCP server advertise OAuth
endpoints at any address the admin had permitted for an unrelated reason.
Suppress `allowedAddresses` in the fallback when `allowedDomains` is active —
the address exemption is opt-in for the no-whitelist mode only.

**MAJOR — WebSocket transport SSRF check ignored exemptions.** The
`constructTransport` WebSocket branch called `resolveHostnameSSRF(wsHostname)`
without `this.allowedAddresses`, so a permitted private MCP server would
pass `isMCPDomainAllowed` but be blocked at transport creation. Forward
the exemption.

**Scope `allowedAddresses` to private IP space only (operator directive).**
The exemption list is for permitting private/internal targets; it must not
be a back-door to broaden trust to public destinations.
- Schema (`packages/data-provider/src/config.ts`): new
  `allowedAddressesSchema` rejects URLs (`://`), paths/CIDR (`/`),
  whitespace, and public IPv4/IPv6 literals at config-load time. Wired
  into `endpoints`, `mcpSettings`, and `actions`.
- Runtime (`packages/api/src/auth/domain.ts`): `isAddressAllowed` now
  drops public-IP candidates and public-IP entries on the match path —
  defense in depth so a misconfigured runtime list never grants exemption.
- Hot path (`packages/api/src/auth/agent.ts`): `buildSSRFSafeLookup`
  pre-normalizes the list into a `Set<string>` once at construction and
  applies the same scoping filter, so the connect-time DNS lookup is an
  O(1) Set membership check instead of a full re-iterate-and-normalize on
  every outbound request.

**Test coverage for the connect-time and OAuth-fallback paths.**
- `agent.spec.ts`: new describe block exercising `buildSSRFSafeLookup` and
  `createSSRFSafe*` with `allowedAddresses` — hostname-literal exemption,
  resolved-IP exemption, public-IP scoping, URL/CIDR/whitespace rejection,
  and the default no-list block.
- `handler.allowedAddresses.test.ts` (new): integration tests for
  `validateOAuthUrl` — covers both the no-domains-set "permit private"
  path and the strict-bound regression where `allowedAddresses` must NOT
  bypass `allowedDomains`.

**Documentation & cleanup.**
- `connection.ts` redirect SSRF check: explicit comment that
  `allowedAddresses` is intentionally NOT consulted for redirect targets
  (server-controlled, must not inherit the admin's exemption).
- `MCPConnectionFactory.test.ts`: replaced an `eslint-disable` with a
  proper `import { getTenantId } from '@librechat/data-schemas'`. The
  disable was added to make a pre-existing `require()` quiet — the cleaner
  fix is to use the existing top-level import.

Updated `MCPConnectionSSRF.test.ts` WebSocket SSRF assertions to match the
new two-argument call shape (`hostname, allowedAddresses`).

* 🩹 fix: Require Absolute URL Before allowedAddresses Trust Bypass In isOAuthUrlAllowed

`parseDomainSpec` is lenient — it silently prepends `https://` to
schemeless inputs so it can match patterns like bare `example.com`.
That leniency leaked into `isOAuthUrlAllowed`'s new
`allowedAddresses` short-circuit: a value like `10.0.0.5/oauth` (no
scheme) would parse successfully via the prepended default, hit the
address-exemption path, return `true`, and skip `validateOAuthUrl`'s
strict `new URL(url)` parse-or-throw — only to fail later in OAuth
discovery with a less clear runtime error.

Add a strict `new URL(url)` gate at the top of `isOAuthUrlAllowed`.
Schemeless inputs now fall through to `validateOAuthUrl`'s explicit
"Invalid OAuth <field>" rejection. Tests added in both
`auth/domain.spec.ts` (unit) and the OAuth handler integration spec
(end-to-end).

Reported by chatgpt-codex-connector (P2) on PR #12933.
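The strict gate reduces to WHATWG URL parse-or-throw semantics — a schemeless value like `10.0.0.5/oauth` must not silently gain a default scheme (helper name is illustrative):

```typescript
// Strict absolute-URL gate: new URL() throws on schemeless input, so
// "10.0.0.5/oauth" falls through to the explicit rejection downstream
// instead of being rescued by a lenient https:// default.
function isAbsoluteUrl(input: string): boolean {
  try {
    new URL(input);
    return true;
  } catch {
    return false;
  }
}
```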

* 🛡️ fix: Address Follow-Up Comprehensive Review — Schema Tests, Shared Normalization, host:port

Auditing the second comprehensive review:

**F1 MAJOR — schema validation untested.** `allowedAddressesSchema` had
zero coverage, so a regression in the three refinement stages or the
three wiring locations (`endpoints` / `mcpSettings` / `actions`) would
silently let invalid entries reach the runtime. Added a dedicated
`describe('allowedAddressesSchema')` block in `config.spec.ts` covering:
valid private IPs (v4 + v6, including the previously-missed 192.0.0.0/24
range), accepted hostnames, all rejection categories (URLs, CIDR, paths,
whitespace tabs/newlines, host:port, public IP literals), and full
`configSchema.parse()` integration at each of the three nesting points.

**F2 MINOR — `isPrivateIPv4Literal` divergence.** The schema reimpl in
`packages/data-provider` was discarding the `c` octet, so the
`192.0.0.0/24` (RFC 5736 IETF protocol assignments) range that the
authoritative `isPrivateIPv4` accepts was being rejected with a
misleading "public IP" error. Destructure `c` and add the missing range
check; covered by the new schema tests.

**F3 MINOR — DRY violation across `domain.ts` and `agent.ts`.** Both
files had independent normalization implementations with a subtle
whitespace-check divergence (`/\s/` vs `.includes(' ')`). Extracted the
shared logic into a new `packages/api/src/auth/allowedAddresses.ts`
module that both consumers import:
  - `normalizeAddressEntry(entry)` — single-entry shape check
  - `looksLikeHostPort(entry)` — host:port detector (used by F4)
  - `normalizeAllowedAddressesSet(list)` — pre-normalized Set for the
    connect-time hot path
  - `isAddressInAllowedSet(candidate, set)` — membership check that
    enforces private-IP scoping on the candidate

Both `isAddressAllowed` (preflight) and `buildSSRFSafeLookup` (connect)
now go through the same primitives; the whitespace divergence is gone.

To break the import cycle (`allowedAddresses` needs `isPrivateIP`,
`domain` previously owned it), extracted IP private-range detection
into a leaf `auth/ip.ts` module. `domain.ts` re-exports `isPrivateIP`
for backward compatibility with existing call sites.

**F4 MINOR — `host:port` silently misclassified.** Entries like
`localhost:8080` previously slipped through the URL/path guard, were
mis-detected as IPv6, failed `isPrivateIP`, and were silently dropped
with a misleading "public IP" schema error. Added an explicit
`looksLikeHostPort` check with a clear error: "allowedAddresses
entries must not include a port — list the bare hostname or IP only."
Bare `::1`, `[::1]`, and other valid IPv6 literals are intentionally
not matched (regex distinguishes by colon count and the bracketed
`[ipv6]:port` form).

**F5 MINOR — hostname-trust documentation gap.** Hostname entries
short-circuit `resolveHostnameSSRF` before any DNS lookup — that's a
deliberate design (admin trusts the name) but it means the exemption
follows whatever the name resolves to at runtime. Added an explicit
note in `librechat.example.yaml` for both `mcpSettings.allowedAddresses`
and `endpoints.allowedAddresses`: "a hostname entry trusts whatever IP
that name resolves to. Only list hostnames whose DNS you control.
Prefer literal IPs when you can."

**F6** (8 positional params) is flagged for follow-up; refactor to an
options object is a breaking-API change deferred to a separate PR.
**F7** (redirect/WebSocket asymmetry, NIT, conf 40) — skipping; the
existing inline comment is sufficient.
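The F2 and F4 helpers can be sketched as pure functions (a rough approximation, not the authoritative implementations: only the ranges named above are checked, and octet range validation is omitted for brevity):

```typescript
// F2: private-IPv4 literal check, including the 192.0.0.0/24
// (IETF protocol assignments) range the schema reimplementation missed
// by discarding the third octet.
function isPrivateIPv4Literal(ip: string): boolean {
  const m = ip.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const [a, b, c] = [Number(m[1]), Number(m[2]), Number(m[3])];
  if (a === 10) return true;                        // 10.0.0.0/8
  if (a === 127) return true;                       // loopback
  if (a === 169 && b === 254) return true;          // link-local
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true;          // 192.168.0.0/16
  if (a === 192 && b === 0 && c === 0) return true; // 192.0.0.0/24
  return false;
}

// F4: host:port detector. Exactly one colon with a numeric tail means
// host:port ("localhost:8080"); bracketed "[ipv6]:port" is matched
// explicitly; bare IPv6 literals like "::1" have multiple colons and
// are intentionally not matched.
function looksLikeHostPort(entry: string): boolean {
  if (/^\[[^\]]+\]:\d+$/.test(entry)) return true;
  const colons = entry.split(':').length - 1;
  return colons === 1 && /:\d+$/.test(entry);
}
```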

* 🧹 chore: Address Follow-Up NITs — Import Order And Mirror-Function Naming

Three NITs from the latest comprehensive review:

**NIT #1 (conf 85) — local import order.** AGENTS.md requires local
imports sorted longest-to-shortest. Both `domain.ts` and `agent.ts`
had `./ip` (shorter) before `./allowedAddresses` (longer). Swapped.

**NIT #2 (conf 60) — missing cross-reference.** The schema-side
`isHostPortShape` in `packages/data-provider/src/config.ts` had no
note pointing at the canonical runtime mirror. Added a JSDoc paragraph
explaining the mirror relationship and why a local copy exists (the
data-provider package can't import from `@librechat/api` without
creating a circular dependency).

**NIT #3 (conf 50) — naming inconsistency.** Renamed
`isHostPortShape` → `looksLikeHostPort` so the schema mirror matches
the runtime helper exactly. Kept as a separate function (not a shared
import) for the same circular-dependency reason; the matching name
makes it obvious they should stay in lockstep.
2026-05-03 21:43:59 -04:00
Danny Avila
85fa881e3c
🔐 fix: Avoid Logging Password On Login Validation Error (#12926)
The Passport local strategy validation error logged the entire request
body (including the password) into error logs. Replace it with the
email only, matching the metadata shape used by sibling log calls in
the same function.
2026-05-03 21:27:41 -04:00

Danny Avila
5013d6d35c
🧯 fix: Harden Code Env Filepath Uploads (#12936)
* fix: Harden code env filepath uploads

* test: Cover code env filepath edge cases

* fix: Scrub code env fallback filenames
2026-05-04 10:25:59 +09:00
Danny Avila
41a6d6d11c
🛡️ fix: Harden MCP Redirect SSRF Checks (#12931)
* fix: Harden MCP redirect SSRF checks

* fix: Address MCP redirect review feedback

* test: Tighten MCP SSRF redirect assertions
2026-05-04 09:58:47 +09:00
Danny Avila
5b567f5ff4
🧷 fix: Pin GitNexus Native Dependency (#12937) 2026-05-04 09:58:21 +09:00
Danny Avila
59e90a1878
🛡️ fix: Harden GitNexus Index Workflow (#12935)
* fix: Harden GitNexus index workflow

* fix: Resolve GitNexus flags before checkout
2026-05-03 18:05:45 -04:00
Danny Avila
1b79e0b785
🧬 chore: Align LibreChat With Agents LangChain Upgrade (#12922)
* 🔧 chore: Update dependencies in package-lock.json and package.json

- Bump version of @librechat/agents to 3.1.75-dev.0 in multiple package.json files.
- Upgrade various AWS SDK and Smithy dependencies to their latest versions in package-lock.json for improved stability and performance.

* 🔧 chore: Update AWS SDK and Smithy dependencies in package-lock.json

- Bump version of @aws-sdk/client-bedrock-runtime to 3.1041.0 and update related dependencies for improved performance and stability.
- Upgrade various AWS SDK and Smithy packages to their latest versions, ensuring compatibility and enhanced functionality.

* chore: Align LibreChat with agents LangChain upgrade

- Route LangChain imports through @librechat/agents facade exports
- Update @librechat/agents to 3.1.75-dev.1 and remove direct LangChain deps
- Normalize nullable agent model params and API key override typing
- Update Google thinking config typing for newer LangChain packages
- Refresh targeted audit-related dependency overrides

* chore: Add Jest types for API specs

* test: Fix LangChain upgrade CI specs

* test: Exercise agents env facade

* fix: Clean up TS preview diagnostics

* fix: Address Codex review feedback
2026-05-03 12:46:01 -04:00
Danny Avila
4e45e8e17c
🧹 fix: Clear MCP OAuth Tokens On Revoke
Fixes #12912.

- Clear stored MCP OAuth tokens and flow state on revoke cleanup-only paths.
- Keep provider revocation best-effort when token and client metadata are available.
- Add controller and function coverage for stale metadata, missing config, and cleanup failure paths.
2026-05-03 02:52:47 +09:00
Danny Avila
eb22bb6969
🧭 fix: Migrate Anthropic Long Context (#12911) 2026-05-02 22:14:19 +09:00
Danny Avila
f3e1201ae7
📌 fix: Stabilize Agent Prompt Cache Prefix (#12907)
* fix: stabilize agent prompt cache prefix

* chore: refresh agents sdk lockfile integrity

* test: format agent memory assertion

* test: type agent context fixtures

* fix: preserve MCP instruction precedence

* fix: reuse resolved conversation anchor

* fix: keep resumable startup immediate
2026-05-02 09:55:31 +09:00
Danny Avila
5b5e2b0286
🛡️ fix: Handle MCP Tool Cache Lookup Failures (#12910)
* Handle MCP tool cache lookup failures

* Harden MCP cached tool lookup

* Cover full MCP tool cache outage

* Guard MCP tool cache store lookup
2026-05-02 09:21:28 +09:00
Danny Avila
74307e6dcc
💭 feat: Require Explicit Auto-agent Enablement for Memories (#12886) 2026-05-01 23:56:08 +09:00
ethanlaj
781bfb857d
🩹 fix: Sync ControlCombobox popover width with trigger after layout changes (#12887)
* 🩹 fix: Sync ControlCombobox popover width with trigger after layout changes

The popover width was measured once on mount via offsetWidth. When the agent builder side panel opens after a page reload with the sidebar collapsed, the trigger button is initially measured during the layout transition (~26px) and never re-measured, leaving the agent select dropdown rendered at the far left with no options fully visible.

Use a ResizeObserver to keep buttonWidth in sync with the trigger's actual width whenever it resizes, then disconnect on unmount.

* test: cover ControlCombobox isCollapsed, no-ResizeObserver, and zero-width branches

Address review feedback:
- Use button.offsetWidth as the ResizeObserver fallback instead of
  entry.contentRect.width to avoid a content-box vs border-box mismatch in
  pre-2022 browsers that ship ResizeObserver without borderBoxSize.
- Add tests for the three previously-untested branches: isCollapsed=true
  (no observation of the trigger), ResizeObserver unavailable (sync-only
  measurement), and zero-width entries (state unchanged).
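The chosen fallback semantics can be sketched as a pure helper (name and shapes are illustrative; in the component this runs inside the ResizeObserver callback):

```typescript
// Prefer borderBoxSize when the browser provides it; otherwise fall
// back to the element's offsetWidth (border-box) rather than
// entry.contentRect.width (content-box), which diverges whenever the
// trigger has borders or padding.
function resolveObservedWidth(
  entry: {
    borderBoxSize?: ReadonlyArray<{ inlineSize: number }>;
    contentRect: { width: number };
  },
  button: { offsetWidth: number },
): number {
  const boxSize = entry.borderBoxSize?.[0];
  return boxSize ? boxSize.inlineSize : button.offsetWidth;
}
```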

* test: lock the button.offsetWidth fallback against revert

Add a test that drives the ResizeObserver callback with borderBoxSize
absent and divergent contentRect.width vs offsetWidth (251 vs 275).
The fix would silently revert to entry.contentRect.width without this
test failing, so this pins the chosen fallback semantics.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-30 15:02:16 +09:00
Danny Avila
65990a33e9
📥 fix: Resolve Imported-Conversation Default Model From Runtime modelsConfig (#12885)
* 📥 fix: Use Endpoint-Aware Default Model on Imported Conversations

Claude conversations imported from claude.ai's data export display
"gpt-4o-mini" in the chat UI until the page is refreshed, and any
attempt to send a message before refreshing fails with "The model
'gpt-4o-mini' is not available for Anthropic."

Root cause: ImportBatchBuilder.finishConversation() unconditionally
defaulted the saved conversation's `model` field to
openAISettings.model.default, regardless of `this.endpoint`. Claude
exports don't carry a model name, so every imported Claude conversation
landed with endpoint=anthropic but model=gpt-4o-mini.

Fix: pick the default based on `this.endpoint` via a small lookup
(openAI -> gpt-4o-mini, anthropic -> claude-3-5-sonnet-latest), keeping
the existing OpenAI default as the fallback for unknown endpoints.

Fixes #12844

* 🪄 refactor: Resolve Import Default Model From `modelsConfig`

Replace the hardcoded per-endpoint default lookup added in the previous
commit with a runtime resolver that consults the same models config the
chat UI uses (`getModelsConfig` in ModelController -> `loadDefaultModels`
+ `loadConfigModels`). This way an imported conversation defaults to a
model the LibreChat instance has actually configured / discovered for
the endpoint, instead of a hardcoded constant that may not exist on this
deployment.

Resolution order:
1. First non-empty model in `modelsConfig[endpoint]`.
2. Per-endpoint hardcoded fallback (anthropic/openAI settings) if the
   runtime config is empty for the endpoint or `getModelsConfig` throws.
3. `openAISettings.model.default` if even the per-endpoint fallback is
   missing (unknown endpoint).

`importBatchBuilder.finishConversation` now accepts an optional
`defaultModel` argument; each importer resolves it once at the top via
`resolveImportDefaultModel({ endpoint, requestUserId, userRole })` and
threads it through. ChatGPT message-level model selection also falls
back to the resolved default before the hardcoded gpt-4o-mini.
2026-04-30 00:43:04 -04:00
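The three-step resolution order can be sketched as follows (signature simplified — the real resolver takes `{ endpoint, requestUserId, userRole }` and consults `getModelsConfig`; the fallback model names below are the ones stated in the commit):

```typescript
// Per-endpoint hardcoded fallbacks (step 2) and the global default (step 3).
const ENDPOINT_FALLBACKS: Record<string, string> = {
  openAI: 'gpt-4o-mini',
  anthropic: 'claude-3-5-sonnet-latest',
};
const GLOBAL_DEFAULT = 'gpt-4o-mini';

// Step 1: first non-empty model the runtime modelsConfig lists for the
// endpoint; then step 2's per-endpoint fallback; then step 3's global.
function resolveImportDefaultModel(
  endpoint: string,
  modelsConfig: Record<string, string[]> | undefined,
): string {
  const runtimeModel = modelsConfig?.[endpoint]?.find((m) => m && m.trim() !== '');
  if (runtimeModel) return runtimeModel;
  return ENDPOINT_FALLBACKS[endpoint] ?? GLOBAL_DEFAULT;
}
```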
Nick Haffie-Emslie
3758380c61
🔌 fix: Follow 307/308 redirects in MCP streamable HTTP transport (#12850)
* 🔌 fix: Follow 307/308 redirects in MCP streamable HTTP transport

Some MCP servers (e.g. Coda) return 308 Permanent Redirect to route
doc-scoped tool calls to a different endpoint path. The fetch wrapper
used `redirect: 'manual'` for SSRF protection, which silently dropped
these redirects and caused tool calls to fail with empty error bodies.

Follow 307/308 redirects (method-preserving per RFC 7538) up to a
depth of 5. SSRF safety is preserved because the same undici Agent
with its SSRF-safe connect function validates redirect targets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🛡️ fix: Harden MCP 307/308 redirect handling against SSRF and credential leaks

- Validate every redirect target against `resolveHostnameSSRF` so allowlist
  deployments (which disable connect-time SSRF protection) still block hops to
  private/reserved IPs.
- Strip `Authorization`, `Cookie`, `mcp-session-id`, and any user-injected
  headers when a 307/308 crosses an origin boundary, mirroring browser/Fetch
  behavior so a redirecting MCP server can't exfiltrate credentials.
- Cancel the intermediate response body before each next hop so undici can
  reuse pooled sockets rather than holding them until GC.
- Restructure redirect test helpers to be same-origin (matching real-world
  Coda-style routing), drop dead setup code, fix the misleading "5 hops
  successfully" test, and add coverage for SSRF-blocked redirects, cross-
  origin credential stripping, and same-origin credential preservation.
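The cross-origin stripping rule reduces to a sketch like this (helper names are illustrative; the real implementation lives in the MCP fetch wrapper). Credential and session headers, plus any user-injected keys, are dropped when the 307/308 target's origin differs from the original request's origin:

```typescript
const FORBIDDEN_CROSS_ORIGIN = new Set(['authorization', 'cookie', 'mcp-session-id']);

function headersForRedirect(
  headers: Record<string, string>,
  originalOrigin: string,
  targetUrl: string,
  userInjectedKeys: Set<string>,
): Record<string, string> {
  if (new URL(targetUrl).origin === originalOrigin) {
    return headers; // same-origin hops keep credentials
  }
  const kept: Record<string, string> = {};
  for (const [key, value] of Object.entries(headers)) {
    const lower = key.toLowerCase();
    if (!FORBIDDEN_CROSS_ORIGIN.has(lower) && !userInjectedKeys.has(lower)) {
      kept[key] = value; // benign protocol headers survive the hop
    }
  }
  return kept;
}
```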

* 🛡️ fix: Also strip `serverConfig.headers` on cross-origin MCP redirects

Previously only runtime `setRequestHeaders` keys were treated as secret on a
307/308 cross-origin hop. API keys baked into `serverConfig.headers` (passed
through `requestInit.headers` at transport construction time) survived
stripping, so a malicious MCP endpoint could exfiltrate them by returning a
cross-origin `Location`. Pass the configured header keys through to
`createFetchFunction` so both runtime and config secrets are stripped.

The cross-origin credential test now also configures `serverConfig.headers`
to lock in this behavior.

* 🧹 chore: Tighten MCP redirect-stripping coverage and helper duplication

- Add `proxy-authorization` to the cross-origin forbidden header set so a
  forward-proxy credential header would also be stripped on a cross-origin
  hop, matching the Fetch-spec list.
- Strengthen the cross-origin credential test with positive assertions that
  benign protocol headers (`accept`, `content-type`) survive the hop, so a
  regression that strips everything indiscriminately would now fail.
- Extract the duplicated MCP request handler / session-teardown logic from
  three test helpers into shared `createMCPRequestHandler` and
  `closeMCPSessions` utilities.

* 🛠️ fix: Handle `Request` inputs in MCP `customFetch` URL derivation

`customFetch` is typed to accept `UndiciRequestInfo` (`string | URL | Request`),
but `Request.prototype.toString()` returns `"[object Request]"`. The previous
implementation derived `originalOrigin` and the redirect base via
`url.toString()`, so a `Request` input would throw inside `new URL(...)` before
any network call — a regression even when no redirect was involved.

Add a `getRequestUrlString` helper that extracts the URL string for all three
shapes, track the URL string alongside the fetch input through the redirect
loop, and add parameterized tests that exercise `customFetch` with each shape.
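The helper's shape is simple; the sketch below uses a plain `{ url }` object to stand in for the Fetch `Request` class so the example stays self-contained, while the real helper reads the same `.url` property off an actual `Request`:

```typescript
function getRequestUrlString(input: string | URL | { url: string }): string {
  if (typeof input === 'string') {
    return input;
  }
  if (input instanceof URL) {
    return input.toString();
  }
  // Request.prototype.toString() yields "[object Request]", so read .url.
  return input.url;
}
```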

* 🛠️ fix: Don't override `Request` input headers in MCP `buildFetchInit`

Previously `buildFetchInit` always set `headers` on the returned init —
even when neither `init.headers` nor runtime headers contributed anything.
Passing `headers: {}` to `undiciFetch` overrides the headers carried on a
`Request` input (auth tokens, MCP session, protocol negotiation), so
Request-based wrappers could fail authentication even without a redirect
in play. Skip the `headers` override entirely when there is nothing to
merge.

Adds a regression test that supplies `Authorization` and a custom header
on the `Request` itself and asserts both reach the target server.
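Distilled to a sketch (names are illustrative), the rule is: only emit a `headers` key when something actually contributes one, because `headers: {}` would replace the headers carried on a `Request` input:

```typescript
type MiniInit = { headers?: Record<string, string> };

function buildFetchInit(init: MiniInit | undefined, runtimeHeaders?: Record<string, string>): MiniInit {
  if (init?.headers == null && runtimeHeaders == null) {
    return { ...init }; // nothing to merge: leave Request-borne headers alone
  }
  return { ...init, headers: { ...(init?.headers ?? {}), ...(runtimeHeaders ?? {}) } };
}
```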

* 🛠️ fix: Preserve `Request` method/body across MCP redirects + guard cross-origin strip

Two regressions surfaced by extending `customFetch` to accept `Request` inputs:

1. **307/308 method/body loss.** The redirect loop switches `url` to the
   new `Location` string, but the original `Request`'s method and body
   stayed bound to the (now-discarded) `Request` object. A redirected
   POST silently became a GET with no payload — the exact behavior the
   method-preserving codes are designed to prevent. Added a
   `resolveFetchInput` helper that runs once at the top of `customFetch`,
   extracts a `Request`'s method/body/headers into the shared init, and
   buffers the body via `arrayBuffer()` so 307/308 retries can replay it.

2. **Cross-origin strip crashed on absent headers.** After the previous
   fix that stopped `buildFetchInit` from setting `headers: {}`,
   `currentInit.headers` could legitimately be `undefined`. The
   cross-origin branch read it as a `Record` and called `Object.entries`
   on `undefined`, throwing `TypeError`. Guard the branch on
   `currentInit.headers != null` — when there are no headers there is
   nothing to strip.

Adds two regression tests: a POST-with-body `Request` that 308-redirects
cross-origin (asserts both method and body survive) and a no-headers
cross-origin redirect (asserts the strip path no longer crashes).

* 🛠️ fix: Forward `Request.signal` through MCP `customFetch` normalization

`resolveFetchInput` was copying method/body/headers off a `Request` input
but dropping `Request.signal` on the floor, so a caller that wired an
`AbortController` onto the `Request` for cancellation/timeouts lost that
wiring as soon as we re-shaped the input into the `(string, init)` pair
used by the redirect loop. Subsequent aborts no longer reached the
in-flight fetch — a regression from the pre-PR code, which forwarded
the original `Request` directly to undici.

Forward the signal alongside method/body/headers, with explicit
`init.signal` still winning per Fetch-spec semantics. Regression test
aborts a controller before calling `customFetch` with the wired
`Request` and asserts the call rejects.
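The precedence rule fits in one line; sketch only, with an explicit `init.signal` still winning and the Request-borne signal forwarded otherwise:

```typescript
function resolveSignal(
  requestSignal: AbortSignal | undefined,
  initSignal: AbortSignal | undefined,
): AbortSignal | undefined {
  return initSignal ?? requestSignal;
}
```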

* 🧪 test: Pin URL.origin contract for protocol-downgrade redirect handling

Audit follow-up. The cross-origin strip path keys off
`targetUrl.origin !== originalOrigin`, and `URL.origin` is defined as
`scheme + "://" + host + ":" + port`, so a same-host `https → http`
redirect produces a different origin and trips the strip path through
the existing logic — no separate code path needed.

Pin that contract with a small unit test so a future change to URL
semantics (or a refactor that swaps in a different comparison) doesn't
silently regress protocol-downgrade stripping. Standing up a TLS
fixture (self-signed cert, undici skip-verify, etc.) just to re-prove
the URL spec is wasted complexity.
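The contract being pinned, as a sketch: the scheme is part of the origin and default ports are omitted from the serialization, so a same-host `https → http` hop is cross-origin:

```typescript
function isCrossOrigin(from: string, to: string): boolean {
  return new URL(from).origin !== new URL(to).origin;
}
```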

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-29 22:12:29 +09:00
Danny Avila
756530c2b8
🩹 fix: Polish code-execution attachment UX (#12870)
* 🧹 chore: Strip code-execution boilerplate from tool output

The bash executor in `@librechat/agents` appends two kinds of noise to
every successful run:

1. Trailing `Note:` paragraphs — long behavioral hints repeating
   rules already in the system prompt ("Files from previous executions
   are automatically available...", "Files in 'Available files' are
   inputs..."). Re-stating these on every tool call adds ~50 tokens of
   waste per call, which compounds across long agent traces.

2. Per-file `| <annotation>` suffixes on every line of `Generated
   files:` / `Available files (...):`. The two section headers already
   convey the new-vs-known distinction; the per-file annotations are
   redundant *and* phrased inconsistently ("downloaded by the user"
   vs. "displayed to the user" vs. "known to the user").

Strip both in a small `cleanCodeToolOutput` helper invoked from
`packages/api/src/agents/handlers.ts` for every tool listed in
`CODE_EXECUTION_TOOLS`. Non-code-execution tools pass through
unchanged. The cleaning happens *after* tool resolution but *before*
downstream consumers (model context, SSE forwarding, persistence) see
the content, so subsequent model turns get the lean output.

* 🩹 fix: Polish code-execution attachment rendering

Three rough edges visible in code-interpreter conversations:

1. **Sandbox-internal `.dirkeep` placeholders leak as file chips.** The
   bash executor creates `.dirkeep` inside any new directory so the
   stateless container preserves the folder across executions. After
   `sanitizeArtifactPath`'s `_` prefix and 6-hex collision suffix it
   surfaces as `_.dirkeep-<hash>` — a 0-byte chip with no value to the
   user, sometimes hiding the real artifact behind it. New
   `isInternalSandboxArtifact` helper filters them out of every
   routing path (`Attachment`, `AttachmentGroup`, `LogContent`).

2. **The `-<hash>` collision suffix is visible in chip labels.** The
   suffix is collision-avoidance machinery; users only need to see the
   canonical name. New `displayFilename` strips it for display while
   leaving the on-disk `attachment.filename` untouched so downloads
   resolve. Applied across `FileContainer`, `ToolArtifactCard`,
   `ToolMermaidArtifact`, and `LogContent`'s text-attachment label
   path.

3. **0-byte / placeholder files outrank real artifacts in render
   order.** Bucket sort by salience (non-empty before empty) sinks
   stragglers to the bottom. Stable sort preserves arrival order for
   peers.

Added regression tests cover the new helpers, the dirkeep filter
across buckets, and the within-bucket salience ordering.
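The bucketing in point 3 can be sketched as below (a later commit in this PR refines the empty test to an explicit `bytes === 0`; that final form is used here, and the type is a reduced stand-in). `Array.prototype.sort` is stable, so peers keep arrival order:

```typescript
type MiniAttachment = { filename: string; bytes?: number };

function attachmentSalience(a: MiniAttachment): number {
  return a.bytes === 0 ? 1 : 0; // 0 = content-bearing or unknown, 1 = empty
}

function bySalience(a: MiniAttachment, b: MiniAttachment): number {
  return attachmentSalience(a) - attachmentSalience(b);
}
```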

* 🩹 fix: Don't auto-open artifact panel on history navigation

Navigating to a previous conversation full of code-execution artifacts
would auto-open the side panel and focus the most-recent artifact —
the same code path that fires for fresh streaming artifacts. Users
expect that "auto-open" behavior only when an artifact arrives via
SSE, not when they revisit an old chat.

Two-part gate:

1. `ToolArtifactCard`'s focus effect captures `isSubmitting` at first
   render via a ref. A card mounted *during* a stream means a new
   artifact arrived → steal panel focus (legacy behavior). A card
   mounted while `isSubmitting === false` is part of conversation
   history → leave focus alone.

2. `Presentation`'s panel-render condition gains `currentArtifactId
   != null`. With (1) keeping `currentArtifactId` null on history
   load, the panel stops rendering at all on navigation — even if
   `artifactsVisibility` was left `true` by a prior conversation.
   User clicks on a chip to re-open (the click handler is unchanged
   and unconditional).

Test seeds `isSubmittingFamily(0)` per case: existing tests opt into
streaming (default `true`) so legacy auto-focus assertions still hold;
new tests for history-load opt into `streaming: false` and verify
no auto-focus + click-to-open still works.

* 🩹 fix: Force panel visible on streaming artifact arrival

The previous commit gated `setCurrentArtifactId` on `isSubmitting` but
left `artifactsVisibility` untouched. When a user had explicitly
closed the panel earlier in the session, a fresh SSE artifact would
set `currentArtifactId` (so the chip read "click to close") but
`Presentation`'s render condition still required `visibility === true`
— net effect: the card claimed to be open, the panel stayed hidden.

Streaming arrivals now also call `setVisible(true)`, which is the
explicit "auto-open when first created" behavior the user asked for.
History mounts (`isSubmitting === false`) still leave both focus and
visibility alone, so navigating to an old conversation does not
re-open the panel.

Two regression tests added: one asserts streaming flips visibility on
even when seeded false, the other asserts history mounts leave a
seeded-false visibility alone.

* 🧹 chore: Tighten code-execution attachment polish per audit feedback

Resolves the eight actionable findings from the comprehensive audit:

- Scope `displayFilename` out of `FileContainer`: opt-in via a new
  `displayName` prop. User-uploaded chips (input area, persisted
  message files) keep their raw filename, eliminating the false-positive
  class where `report-abc123.pdf` was silently rewritten to `report.pdf`.
  Code-execution artifact paths in `Attachment.tsx` explicitly compute
  the de-suffixed name and pass it through.
- Tighten `TRAILING_NOTES_PATTERN` to anchor on the two known boilerplate
  openings (`Files from previous executions`, `Files in "Available files"`),
  so a user-authored `Note:` line preceded by a blank line in stdout no
  longer gets eaten along with everything after it.
- `ToolMermaidArtifact`: compute `visibleFilename` once and reuse for
  title, content, and the download `aria-label` (was using the raw
  `attachment.filename` for the aria-label, creating a screen-reader
  inconsistency).
- `ToolArtifactCard`: read `isSubmittingFamily(0)` once via a
  non-subscribing `useRecoilCallback`, instead of subscribing for the
  full lifetime to a value the ref only ever needs at first render.
- Extract `bySalience` and `byEntrySalience` comparators from
  `attachmentTypes.ts`, replacing the ten duplicated sort lambdas in
  `Attachment.tsx` and `LogContent.tsx`.
- Treat `attachmentSalience({ bytes: undefined })` as neutral (`0`)
  rather than empty (`1`); only an explicit `bytes === 0` sinks. Stops
  non-code-exec sources (web-search inline results, files where the
  schema omits the byte count) from silently sinking past real content.
- Pin the click-history test to the panel-open button by name instead
  of relying on `getByRole('button', { pressed: false })`, which
  matched by DOM order.
- Add the missing blank line between adjacent `it(...)` blocks.
- Drop the verbose narrating comments in `FileContainer` along with the
  removed `displayFilename` import.

Adds three regression tests for the new behavior (FileContainer raw
filename, artifact-context displayName flow, user-authored `Note:` line
preserved through cleanup) and updates the salience test for the new
neutral-undefined semantics.

* 🧹 chore: Drop redundant `@testing-library/jest-dom` import in FileContainer spec

`client/test/setupTests.js` already imports the matchers globally for every
Jest test in the client workspace, so the explicit import here was dead code.
Removing it brings the spec in line with the broader convention used by
`ArtifactRouting.test.tsx`, `LogContent.test.tsx`, and `attachmentTypes.test.ts`.

* 🛡️ fix: Narrow `.dirkeep`/`.gitkeep` filter to the sandbox-specific form

`isInternalSandboxArtifact` was filtering bare `.dirkeep` / `.gitkeep`
along with the post-sanitization form. Bare versions never originate
from the bash executor (the dotfile rewrite + disambiguator step in
`sanitizeArtifactPath` always produces `_.dirkeep-<6 hex>`), so the only
real-world source of a bare `.gitkeep` is project scaffolding the user
uploaded — silently hiding it from every attachment bucket meant the
file disappeared with no way to surface or download it.

Tightening to `^_\.(?:dirkeep|gitkeep)-[0-9a-f]{6}$` keeps the
sandbox-placeholder filter intact while letting user-uploaded markers
render normally. Tests inverted accordingly: bare forms now expected to
render; only the post-sanitization form is filtered.
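Restated as a standalone sketch using the regex quoted above:

```typescript
const SANDBOX_PLACEHOLDER = /^_\.(?:dirkeep|gitkeep)-[0-9a-f]{6}$/;

function isSandboxPlaceholder(filename: string): boolean {
  return SANDBOX_PLACEHOLDER.test(filename);
}
```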

* 🩹 fix: Address comprehensive-review findings on attachment helpers

Five findings from the latest pass:

- **MAJOR — `displayFilename` false-positive on extensionless 6-hex.**
  The previous regex `/-[0-9a-f]{6}(?=\.[^.]+$|$)/` stripped any leaf
  ending in `-XXXXXX` regardless of context, so a user-named
  `build-a1b2c3` (script-emitted hash artifact, no extension) lost its
  tail and rendered as `build`. Split into two narrower patterns:
  `COLLISION_SUFFIX_BEFORE_EXT` only matches when followed by an
  extension; `SANITIZED_DOTFILE_TRAILING_SUFFIX` only fires when the
  leaf starts with `_.` AND ends with `-XXXXXX` — the unambiguous
  fingerprint of `sanitizeArtifactPath`'s dotfile rewrite.

- **MINOR — `isInternalSandboxArtifact` filter too aggressive.**
  `(file.bytes ?? 0) > 0` treated undefined bytes as zero, falling
  through to the regex check. Tightened to `file.bytes !== 0`: only
  an *explicit* zero counts as the empty-placeholder shape worth
  hiding. Non-code-exec sources without `bytes` populated render
  normally now.

- **MINOR — `getValue()` could throw on a degenerate atom state.**
  Switched the snapshot read in `ToolArtifactCard` to
  `valueMaybe() ?? false` so a transient error / loading state on the
  upstream selector doesn't crash card mount. The `false` default is
  the right history-fallback (don't auto-open if we can't classify).

- **NIT — `attachmentSalience` / `bySalience` over-broad signature.**
  Removed the test-only `{ bytes?: number }` arm; functions now accept
  `TAttachment` directly. The internal `bytes` read still goes through
  a cast since not every TAttachment branch declares it. Tests updated
  to use the existing `baseAttachment(...)` helper.

- **MINOR — Missing regression test for extensionless 6-hex.**
  Added `'build-a1b2c3'` and `'out/blob-deadbe'` cases that pin the
  preservation behavior, plus an `isInternalSandboxArtifact` test that
  asserts undefined-bytes attachments are not filtered.

* 🩹 fix: Make code-file artifacts click-to-open only

Removes mount-time auto-open from `ToolArtifactCard`. Streaming
arrivals no longer hijack the panel — even a freshly-emitted SSE
artifact registers silently in `artifactsState` and waits for the
user to click. Combined with `Presentation`'s
`currentArtifactId != null` render gate, the panel stays closed
across history navigation, page reload, and SSE arrival.

Click is the only path that opens the panel. `handleOpen` is
unchanged: first click focuses + reveals, second click on the same
chip closes.

Dropped:
- `useRecoilCallback` snapshot read of `isSubmittingFamily(0)`
- `mountedDuringStreamRef` ref + lazy-init block
- The whole focus + visibility effect (was effect 3)
- `useRef` import (now unused)

Tests:
- `ArtifactRouting.test.tsx` rewritten to exercise the click path:
  registers-on-mount-without-focus, click-to-open-then-close, multi-
  card-no-auto-focus, click-when-visibility-was-false. The streaming
  state is no longer seeded; both `renderWith` and `renderWithProbe`
  collapsed back to plain `RecoilRoot`.
- `LogContent.test.tsx` flips its panel-routing assertions from
  `pressed: true` (which asserted auto-focus) to `pressed: false`
  with a chip-title check (which asserts the panel card rendered
  but stayed unfocused).

* Revert "🩹 fix: Make code-file artifacts click-to-open only"

This reverts commit 6761531287.

* 🩹 fix: Exclude CODE bucket from streaming auto-open

Narrows the previous-commit revert: rich-preview artifacts (HTML,
React, Markdown, plain text) keep the legacy SSE auto-open UX, but
the CODE bucket (`.py`, `.js`, `.cpp`, `Dockerfile`, `Makefile`, …)
stays click-to-open even on streaming.

Source-code artifacts are typically supporting helpers the agent
emits alongside a richer deliverable (a Python script that builds
the actual `.html` output, for example). Auto-opening every
helper's panel each time it gets written would shove the panel
in front of the user every tool call. The user explicitly opens
a code chip when they want to inspect it.

Implementation:
- Focus+open effect skips early when `artifact.type === CODE`.
- `artifact.type` added to the dep array so the gate re-evaluates
  if the type ever changes (it shouldn't, but the dep is honest).
- JSDoc updated to call out the carve-out.
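The gate, expressed as a pure function (the real logic is a React effect in `ToolArtifactCard`; the names and the `'code'` literal here are illustrative):

```typescript
function shouldAutoOpen(mountedDuringStream: boolean, artifactType: string): boolean {
  if (artifactType === 'code') {
    return false; // CODE bucket stays click-to-open even while streaming
  }
  return mountedDuringStream; // history mounts never steal panel focus
}
```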

Tests:
- New `does NOT auto-open a streaming CODE artifact (test.py is
  click-to-open)` — seeds isSubmitting=true, mounts a `.py`,
  asserts the artifact registers but currentArtifactId stays null.
- New `clicking a CODE artifact focuses it even though it skipped
  auto-open` — confirms the click path still surfaces a `.py`.
- All 25 prior auto-open tests for HTML/React/Markdown/plain-text
  buckets still pass unchanged: those types continue to auto-open
  on streaming.

* 🧹 chore: Address two NITs from the audit-fix follow-up review

- **NIT #1 (conf 60)**: Add a test for the dotfile-with-extension
  intersection (`_.config-abcdef.txt` → `.config.txt`). Both halves
  of the path were tested separately — extension-anchored suffix
  stripping and `_.` underscore restoration — but the combination
  wasn't pinned. Adds `expect(displayFilename('_.config-abcdef.txt'))
  .toBe('.config.txt')`.

- **NIT #2 (conf 25)**: Tighten the cast in `attachmentSalience` from
  the anonymous `{ bytes?: number }` shape to the concrete
  `TFile & TAttachmentMetadata` (the actual TAttachment branch that
  declares `bytes`). Same runtime behavior; a future retype of
  `TFile.bytes` will now surface here at compile time instead of
  being silently papered over.

* 🩹 fix: Stop stripping `-<6 hex>` suffixes from non-dotfile filenames

Codex's repeated P2 was correct: the `COLLISION_SUFFIX_BEFORE_EXT`
regex stripped any `-<6 hex>` immediately before an extension
regardless of context. That collapsed legitimate user-named files
like `report-deadbe.csv` and `report-beef01.csv` onto the same chip
label `report.csv`, silently merging distinct files in the UI.

The structural truth: only the dotfile shape (`_.foo-XXXXXX`) carries
an unambiguous discriminator (the leading `_.` that
`sanitizeArtifactPath` adds when rewriting a leading dot). The
extension-only case (`name-<hash>.ext`) has no such discriminator —
we can't distinguish a sanitized `report 1.csv` (which became
`report_1-<hash>.csv`) from a user-named `report-deadbe.csv` from
the filename alone.

Recovering the non-dotfile case cleanly would require a backend
`wasSanitized` metadata flag we don't have. Without it, the safer
choice is to leave non-dotfile names alone — uglier when the file
*was* sanitized, but never collapses distinct files onto a shared
label.

Changes:
- Drop `COLLISION_SUFFIX_BEFORE_EXT`. Replace
  `SANITIZED_DOTFILE_TRAILING_SUFFIX` with a unified
  `SANITIZED_DOTFILE_PATTERN` that handles both extensionless and
  with-extension dotfile shapes in one regex.
- Simplify `displayFilename` to a single match + reconstruct path.
- Update tests: drop the broad-stripping assertion
  (`output-deadbe.csv` → `output.csv`), add explicit codex-regression
  cases (`report-deadbe.csv` and `report-beef01.csv` preserve
  unchanged), document the deliberate non-recovery for sanitized
  non-dotfiles, update the AttachmentGroup→FileContainer integration
  test to reflect the narrower stripping (non-dotfile `archive-deadbe.zip`
  passes through; new dotfile `_.config-abcdef.zip` → `.config.zip`
  exercises the recoverable path).
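The unified dotfile-only rule reduces to one match plus path reconstruction; the regex below is illustrative (the shipped pattern may differ in detail), and only the unambiguous `_.name-<6 hex>[.ext]` leaf shape is rewritten back to `.name[.ext]`:

```typescript
const SANITIZED_DOTFILE = /^_(\.[^/]+?)-[0-9a-f]{6}(\.[^./]+)?$/;

function displayFilename(filename: string): string {
  const slash = filename.lastIndexOf('/');
  const dir = slash === -1 ? '' : filename.slice(0, slash + 1);
  const match = filename.slice(slash + 1).match(SANITIZED_DOTFILE);
  // Non-matching names pass through untouched, even if they were sanitized.
  return match == null ? filename : `${dir}${match[1]}${match[2] ?? ''}`;
}
```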

* 🩹 fix: Scope code-tool annotation stripping to file-list sections

Codex was right: the previous global `.replace` would mutate any line
ending in one of the three annotation phrases — even legitimate
stdout. A user script doing
`echo "foo | File is already downloaded by the user"` had its output
silently scrubbed before being fed back into model context.

New `FILE_SECTION_PATTERN` captures `Generated files:` /
`Available files (...)` blocks (header + lines starting with `- /`).
Annotation stripping now only runs *within* the captured file-list
section via a nested `.replace`, so:

- Inside the section: per-file `| <ann>` suffixes still get stripped
  (line-per-file ≥ 4 files form, inline `, ` comma-separated ≤ 3
  files form — both already covered by existing patterns).
- Outside the section: stdout, stderr, blank lines, the trailing
  `Note:` paragraphs (handled by their own pattern), and any user
  text that coincidentally contains an annotation phrase pass
  through unchanged.
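An approximate sketch of the section-scoped approach (the shipped patterns live in packages/api and handle more layouts, such as the inline comma-separated form). The outer regex captures a file-list section, and the annotation strip runs only inside that capture:

```typescript
const FILE_SECTION = /((?:Generated files|Available files \([^)]*\)):\n(?:- \/[^\n]*\n?)+)/g;
const ANNOTATION = / \| [^\n]*$/gm;

function stripFileAnnotations(output: string): string {
  // Nested replace: annotations are only touched inside a captured section.
  return output.replace(FILE_SECTION, (section) => section.replace(ANNOTATION, ''));
}
```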

Tests:
- New `does NOT mutate stdout that legitimately contains an
  annotation phrase outside a file-list section` pins the codex
  regression: three coincidental phrases in stdout, no
  `Generated files:` header, all three preserved verbatim.
- New `strips annotations inside a file-list section but preserves
  identical phrases in stdout above it` covers the mixed case where
  the same phrase appears in both stdout and a file listing —
  stdout survives, listing gets cleaned, exactly one occurrence
  remains.
- All 9 prior tests still pass (file-section stripping behavior
  unchanged for both line-per-file and inline-comma layouts).
2026-04-29 08:53:10 -04:00
Danny Avila
4a5fc701d2
📂 fix: Preserve Nested Skill Paths in Code-Env Uploads (#12877)
* fix(code): preserve code env upload filepaths

* chore: Reorder import statements in crud.js
2026-04-29 08:07:46 -04:00
Danny Avila
915b30c60d
📦 chore: update @librechat/agents to v3.1.74 (#12869) 2026-04-29 10:20:52 +09:00
Helge Wiethoff
61b9b1daa7
🩹 fix(SSE): Treat responseCode === 0 as Transport Failure, Not Server Error (#12834)
* fix(sse): treat responseCode===0 as transport failure, not server error

When a long-running model response (e.g. gpt-5.4 with web_search:true)
takes longer than the browser's idle connection timeout, the SSE transport
drops and sse.js fires an error event with responseCode=0 and e.data set
to the raw response buffer (non-JSON SSE text).

The previous guard `!responseCode` is truthy for both 0 (transport drop)
and undefined (genuine server-sent error event), so the client incorrectly
entered the server-error branch, tried to JSON.parse raw SSE text, logged
"Failed to parse server error", and showed the user a red error banner --
even though the backend continued processing and delivered the final answer
seconds later.

Fix 1 (client): change guard from `!responseCode` to `responseCode == null`
so that only undefined/null (no HTTP status at all) triggers the server-error
parse path. responseCode===0 now correctly falls through to the reconnect path.

Fix 2 (backend): after res.flushHeaders() the response is already committed
as SSE. The fallback branch that wrote res.status(404).json() was an HTTP/SSE
protocol violation. Replace with an SSE-conformant event:error frame + res.end().
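Fix 1 distilled into a sketch: `responseCode === 0` means the transport dropped with no HTTP status at all, so only null/undefined routes to the server-error parse path:

```typescript
function isServerSentError(responseCode: number | null | undefined): boolean {
  // Old guard: `!responseCode`, which was also true for 0.
  return responseCode == null;
}
```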

* fix(sse): use onError helper on subscribe failure + add regression tests

Replace silent res.end() with onError('Failed to subscribe to stream')
so the client receives a parseable SSE error event instead of a stream
that closes with no signal. The previous res.end() left the UI stuck
in "submitting" state because no error/abort/final event ever fired.

Also adds two missing test cases for the responseCode guard change:
- responseCode === 0 with raw SSE buffer data must NOT call errorHandler
  (transport failure should reconnect, not display garbage)
- responseCode == null with JSON error data MUST call errorHandler
  (server-sent error events should still surface to the user)

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-04-29 10:05:51 +09:00
ethanlaj
85894c11c7
🧜‍♂️ fix: Preserve Mermaid foreignObject HTML in Sanitized SVG (#12819) 2026-04-29 09:37:38 +09:00
David Newman
1f37ec842a
🔌 fix: Prevent Repeated Idle Check Triggers for Users With Failed MCP Connections (#12853) 2026-04-29 09:19:35 +09:00
Daniel Lew
2503365c44
🚫 feat: Add Support for none Reranker Type in Web Search Config (#12765)
Most of the codebase already supports the concept of *not* using a reranker
with web search, but there was no way to initially set up an absent reranker
component.

Now there's a special path for skipping the reranker auth when loading web
search config, which allows for skipping the reranker when using web search.
2026-04-29 09:17:04 +09:00
Gaurav Dubey
84043432a5
🧹 fix: Graceful MCP OAuth Revoke Cleanup When Tokens Are Missing (#12825)
* fix: graceful MCP OAuth revoke cleanup when tokens are missing (#12754)

`maybeUninstallOAuthMCP` in `api/server/controllers/UserController.js`
aborts before the DB-delete and flow-state cleanup steps whenever
`MCPTokenStorage.getTokens` throws `ReauthenticationRequiredError` —
which is exactly what happens when a user clicks "Revoke" on an MCP
server whose backend is already dead and whose refresh token is gone.
The resulting error is both surfaced to the log as a red line and, more
importantly, leaks the DB token row and OAuth flow state.

Wrap the token retrieval in try/catch following the same best-effort
pattern already used for the two `revokeOAuthToken` calls. On
`ReauthenticationRequiredError`, skip revocation silently (info log)
and continue to the cleanup steps. On any other unexpected error, log
a warning and continue — cleanup must always run.
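The best-effort shape in miniature (simplified sketch; the real function also guards on key type, OAuth config, and client info before reaching this point):

```typescript
class ReauthenticationRequiredError extends Error {}

async function revokeThenCleanup(
  getTokens: () => Promise<string[]>,
  revoke: (token: string) => Promise<void>,
  cleanup: () => Promise<void>,
  log: (level: 'info' | 'warn', msg: string) => void,
): Promise<void> {
  try {
    for (const token of await getTokens()) {
      // Each revocation call is itself best-effort.
      await revoke(token).catch((e) => log('warn', `revoke failed: ${String(e)}`));
    }
  } catch (error) {
    if (error instanceof ReauthenticationRequiredError) {
      log('info', 'tokens already gone; skipping revocation');
    } else {
      log('warn', `unexpected error retrieving tokens: ${String(error)}`);
    }
  }
  await cleanup(); // DB delete + flow-state cleanup must always run
}
```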

Exported `maybeUninstallOAuthMCP` for direct unit testing and added
`api/server/controllers/__tests__/maybeUninstallOAuthMCP.spec.js` with
8 cases: early-return guards (non-MCP key, non-OAuth server, missing
client info), happy path (both tokens revoked + cleanup), both
failure-to-retrieve paths (ReauthenticationRequiredError and arbitrary
error — cleanup still runs in both), single-token path, and
revocation-call failures (cleanup still runs).

Fixes #12754.

* test: use instanceof check against real ReauthenticationRequiredError

Follow-up to the previous commit on this branch. Two changes:

1. `UserController.maybeUninstallOAuthMCP` now checks
   `error instanceof ReauthenticationRequiredError` using the real
   class imported from `@librechat/api`, instead of comparing
   `error?.name === 'ReauthenticationRequiredError'`. The name-string
   check matched any unrelated error that happened to have the same
   `.name`; the `instanceof` check is a proper identity test.

2. The accompanying spec's jest mock for `@librechat/api` now
   exposes a `ReauthenticationRequiredError` class, and the test
   imports it from that mock so the `instanceof` comparison in the
   production code holds during the test. Without this, the two
   "skips revocation ... still runs cleanup" tests threw
   `TypeError: Right-hand side of 'instanceof' is not an object`
   because the mock left the class undefined.

All 8 tests in the spec pass.
2026-04-29 09:11:12 +09:00
Yorgos K
f2df0ea62b
🛡️ fix: Filter user_provided Sentinel in Tool Credential Loading (#12840)
When GOOGLE_KEY=user_provided is set as an endpoint config, the
loadAuthValues() function in credentials.js would pass the literal
string 'user_provided' to tools via the || fallback chain. This caused
Gemini Image Tools to fail at runtime with an invalid API key error,
as initializeGeminiClient() received the sentinel value instead of a
real key.

The fix aligns loadAuthValues() with checkPluginAuth() in format.ts,
which already correctly excludes user_provided and empty/whitespace
values. Now loadAuthValues() skips these values and continues to the
next field in the fallback chain or falls through to user DB values.
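The aligned check, as a simplified sketch (the real loadAuthValues also walks alternate env fields and per-user DB credentials through its own plumbing):

```typescript
function isUsableCredential(value: string | undefined): value is string {
  return value != null && value.trim() !== '' && value !== 'user_provided';
}

function resolveCredential(
  envCandidates: Array<string | undefined>,
  dbValue: string | undefined,
): string | undefined {
  for (const candidate of envCandidates) {
    if (isUsableCredential(candidate)) {
      return candidate;
    }
  }
  return dbValue; // fall through to the user's stored key
}
```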

Added regression tests covering:
- user_provided sentinel is skipped, DB value used instead
- Fallback chain continues past user_provided to next field
- Empty and whitespace env values are skipped
- Real env values are returned correctly
- Optional fields with sentinel values handled gracefully
2026-04-29 09:09:54 +09:00
Danny Avila
89bf2ab7b4
💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12868)
* 💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12855)

Different providers report `usage_metadata.input_tokens` with different
semantics:

  - Anthropic / Bedrock: `input_tokens` EXCLUDES cache; cache reads/writes
    arrive separately and must be added to get the total prompt size.
  - Gemini / OpenAI: `input_tokens` ALREADY INCLUDES cached tokens
    (Google's `promptTokenCount`, OpenAI's `prompt_tokens`). Their
    `input_token_details.cache_*` are subsets of `input_tokens`.

`recordCollectedUsage` treated both schemes as additive, so for cache-hit
requests on Gemini/OpenAI it added cache tokens on top of an
`input_tokens` value that already contained them — overcharging users by
the cache_hit_rate (e.g., ~67% cache hit ≈ 1.67x overcharge). This
matches the issue reporter's GCP billing comparison.

Adds a small `splitUsage` helper that classifies the provider by model
name and computes `inputOnly` (the non-cached portion) plus the
all-inclusive `totalInput` for both the spend math and the returned
`input_tokens` summary. The helper defaults to additive semantics (the
historical behavior) so unknown providers are unaffected.

Updates existing OpenAI-shaped tests that previously asserted the buggy
additive math, and adds Gemini regression tests using the exact numbers
from the issue report (input=11125, cache_read=7441 → input=3684).

Anthropic / Bedrock paths remain bit-identical to before.
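The two schemes side by side, using the issue's numbers; the helper name mirrors the commit, but the field names and internals here are illustrative:

```typescript
type MiniUsage = { input_tokens: number; cache_read?: number; cache_write?: number };

function splitUsage(usage: MiniUsage, cacheIsSubsetOfInput: boolean) {
  const cache = (usage.cache_read ?? 0) + (usage.cache_write ?? 0);
  if (cacheIsSubsetOfInput) {
    // Gemini/OpenAI: input_tokens already includes the cached tokens.
    return { inputOnly: usage.input_tokens - cache, totalInput: usage.input_tokens };
  }
  // Anthropic/Bedrock and unknown providers: additive (historical behavior).
  return { inputOnly: usage.input_tokens, totalInput: usage.input_tokens + cache };
}
```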

* 🔧 refactor: Classify Cache-Token Semantics by Provider, Not Model Name

Follows up the previous commit. Replaces a model-name regex
(`gemini|gpt|o[1-9]|chatgpt`) with an explicit `Providers` enum lookup
keyed off the `usage.provider` field — `UsageMetadata.provider` already
exists in `IJobStore.ts` but was never being populated.

  - `callbacks.js#ModelEndHandler` now attaches `usage.provider` from
    `agentContext.provider` alongside `usage.model`.
  - `usage.ts` uses a `SUBSET_PROVIDERS` set (`openAI`, `azureOpenAI`,
    `google`, `vertexai`, `xai`, `deepseek`, `openrouter`, `moonshot`)
    backed by the canonical `Providers` enum from
    `librechat-data-provider`.
  - `xai`, `deepseek`, `openrouter`, `moonshot` extend `ChatOpenAI` so
    they inherit subset semantics (verified in node_modules).
  - Defaults to additive when `usage.provider` is missing, so the title
    flow (which doesn't propagate provider) and any pre-this-PR usage
    entries keep their existing behavior.

Tests: switch fixtures from model-name signaling to explicit `provider`
field, plus a Vertex AI case and a "missing provider" fallback case.
2026-04-29 08:36:00 +09:00
Danny Avila
46a86d849f
🛂 fix: Skip Inherited / Mark Skill Files Read-Only in Code-Env Pipeline (#12866)
* 🛂 fix: Skip Re-Download of Inherited Code-Env Files (No More 403 Storms)

When a bash/code-interpreter call lists or operates on inputs the user
already owns (skill files primed via primeInvokedSkills, files inherited
from a prior session), codeapi echoes those files back in the tool
result with `inherited: true`. We were treating every entry as a
generated artifact and calling processCodeOutput on each, which:

1. Hit `/api/files/code/download/<session_id>/<file_id>` with the
   user's session key. Skill files are uploaded under the skill's
   entity_id, so every download 403'd — producing dozens of
   "Unauthorized download" log lines per turn.

2. Surfaced those inputs as ghost file chips in the UI even though
   they were never generated by the run.

3. Wasted a download round-trip even when no auth boundary was
   crossed — the file is already persisted at its origin.

Fix: skip files where `file.inherited === true` in all three
artifact-files loops (`tools.js`, `createToolEndCallback`, and
`createResponsesToolEndCallback`). Skill files remain available to
subsequent calls via primeInvokedSkills / session inheritance — we
just don't redundantly re-download them.

Pairs with codeapi-side change that adds the `inherited` flag.
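The guard itself is a one-liner in each loop; a hedged sketch (the surrounding loop shape and function name are illustrative, not the actual code):

```javascript
// Sketch of the guard added to each artifact-files loop.
function collectGeneratedArtifacts(files, processCodeOutput) {
  const artifacts = [];
  for (const file of files) {
    if (file.inherited === true) {
      // Input the user already owns (primed skill file / prior-session file):
      // no re-download attempt, no ghost chip in the UI.
      continue;
    }
    artifacts.push(processCodeOutput(file));
  }
  return artifacts;
}
```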

* 🔒 feat: Mark Skill Files as `read_only` During Code-Env Priming

Pairs with codeapi `read_only` upload flag (ClickHouse/ai#1345). When
LibreChat primes a skill into the code-env, every file in the batch
(SKILL.md plus all bundled scripts/schemas/docs) is now uploaded with
`read_only: true`. Codeapi seals these inputs at the filesystem layer
(chmod 444) and the walker echoes the original refs as `inherited:
true` regardless of whether sandboxed code modified the bytes on disk.

Without this, the previous PR's `inherited` skip handled only the
unchanged case. A modified skill file (pip writing pyc near a .py, a
script accidentally truncating LICENSE.txt, etc.) still flowed through
the modified-input branch on codeapi, got a fresh user-owned file_id,
uploaded as a "generated" artifact, and surfaced in the UI as a chip
the user couldn't actually authorize a download for.

Changes:

- `api/server/services/Files/Code/crud.js`:
  `batchUploadCodeEnvFiles({ ..., read_only })` forwards the flag as
  a multipart form field. Default `false` preserves existing behavior
  for user-attached files and prior-session inheritance.

- `packages/api/src/agents/skillFiles.ts`: type signature gains
  `read_only?: boolean`; `primeSkillFiles` passes `true`.

- `packages/api/src/agents/skillFiles.spec.ts`: assert the upload call
  carries `read_only: true`.

The flag is intentionally not skill-specific. Any future
infrastructure-input flow (system fixtures, cached datasets, etc.) can
opt in the same way.
2026-04-29 08:26:25 +09:00
Danny Avila
f69e8e26f8
🪟 feat: Render Source-Code Artifacts in the Side Panel (#12854)
* 🪟 feat: Render Source-Code Artifacts in the Side Panel (CODE bucket)

PR #12832 wired markdown / mermaid / html / .jsx-tsx tool outputs through
the side-panel artifact pipeline but explicitly punted on code files:

> Everything else (csv, py, json, xls/docx/pptx, …) keeps PR #12829's
> inline behaviour — dedicated viewers will land in follow-ups.

This adds the code-file viewer. A `simple_graph.py` (and every other
common source file) now opens in the side panel alongside markdown,
mermaid, html, and react artifacts instead of falling back to the
inline `<pre>` rendering.

**Design.** New `CODE: 'application/vnd.code'` bucket reuses the
static-markdown sandpack template — `useArtifactProps` pre-wraps the source as
a fenced code block (` ```python\n...\n``` `) before handing it to
`getMarkdownFiles`. The fence carries a `language-<x>` class through
`marked`, so a future highlighter swap-in (e.g. drop `highlight.js`
into the markdown template) picks up syntax colors automatically. The
`react-ts` (sandpack) template's React boot cost is avoided since
source files don't need it.

**Single source of truth for languages.** New `CODE_EXTENSION_TO_LANGUAGE`
map drives BOTH:
  - `EXTENSION_TO_TOOL_ARTIFACT_TYPE` routing (presence in this map =
    code file). Adding a new language is one entry.
  - The fenced-block language hint (exported as `languageForFilename`).

Identifiers follow the GitHub / `highlight.js` convention so the future
highlighter pickup is automatic.

**Scope.** Programming languages + stylesheets + shell + sql/graphql +
build files (Dockerfile/Makefile/HCL). Pure data formats
(CSV/TSV/JSON/JSONL/NDJSON/XML/YAML/TOML) and config dotfiles
(`.env`/`.ini`/`.conf`/`.cfg`) are intentionally NOT routed in this
pass — they're better served by dedicated viewers (CSV table view,
etc.) or remain inline. Adding them later is a one-entry change in
the map.

**JSX/TSX kept on the React (sandpack) bucket.** They're React component
sources; the existing live-preview should win over the static CODE
bucket. Plain `.js`/`.ts` source goes through CODE.

**MIME-type fallback.** The codeapi backend serves `text/x-python`,
`text/x-typescript`, etc. as `Content-Type` for source files, so a
file whose extension was stripped/renamed upstream still routes to
CODE via the MIME map.

**Empty-text gate.** CODE joins MARKDOWN/PLAIN_TEXT in the empty-text
exception (an empty `.py` is still a Python file). HTML/REACT/MERMAID
still require text — their viewers (sandpack/mermaid.js) error on
empty input.

**Files changed:**
- `client/src/utils/artifacts.ts` — `CODE` bucket constant,
  `CODE_EXTENSION_TO_LANGUAGE` map, exported `isCodeExtension` and
  `languageForFilename` helpers, extension/MIME routing additions,
  template + dependencies entries, empty-text gate exception, helper
  hoisting (extensionOf / baseMime moved up so the language map can
  reference them).
- `client/src/hooks/Artifacts/useArtifactProps.ts` — exported
  `wrapAsFencedCodeBlock`, CODE branch that wraps the source then
  routes through `getMarkdownFiles`.

**Tests (+22):**
- 8 parameterized routing cases (.py, .js, .go, .rs, .css, .sh, .sql,
  .kt) verify the CODE bucket fires.
- Extension wins when MIME is generic octet-stream (Python has no
  magic bytes; common case).
- Regression: jsx/tsx STAY on REACT bucket (no live-preview regression).
- Regression: data formats (CSV/JSON/YAML/TOML) and config dotfiles
  (.env/.ini) do NOT route to CODE.
- Empty-text exception for CODE (empty Python file is still a Python file).
- `useArtifactProps`: CODE → content.md / static template, fenced-block
  shape, language hint, unknown-extension fallback to raw extension,
  no-extension empty hint, index.html via markdown template.
- `wrapAsFencedCodeBlock`: language hint, empty hint, single-trailing-
  newline trim, multi-newline preservation, empty-source emit.

87/87 in artifact-impacted tests; 155/155 across the broader artifact
suite. No regressions in pre-existing markdown/mermaid/HTML/REACT/text
behavior.

* 🛡️ fix: Bare-filename routing + adaptive fence delimiter (codex P2 ×2)

Two follow-ups from Codex review on the CODE bucket:

1. **Bare-filename routing for extensionless build files (Codex P2).**
   `Dockerfile`, `Makefile`, `Gemfile`, `Rakefile`, `Vagrantfile`,
   `Brewfile` have no `.` in their basename — `extensionOf` returns
   `''` and the extension map can't match, so they fell through to
   inline rendering despite being in `CODE_EXTENSION_TO_LANGUAGE`.

   New `bareNameOf` helper returns the lowercased basename for
   extensionless filenames (returns `''` for files with a `.` so the
   extension and bare-name paths don't double-match). Both
   `detectArtifactTypeFromFile` and `languageForFilename` consult it as
   a second lookup against the same `CODE_EXTENSION_TO_LANGUAGE` map,
   so adding a new build file is one entry. Path-aware: takes the
   basename so `proj/Dockerfile` (path-preserving sanitizer output)
   still routes correctly.

   Added the four extra Ruby build-script names while I was here.

2. **Adaptive fence delimiter (Codex P2).** A hardcoded ` ``` ` fence
   breaks when the source contains a line starting with ` ``` ` —
   for example, a JS file containing a markdown-shaped template
   literal:
       const md = `
       ```
       hello
       ```
       `;
    CommonMark closes a fence on a line whose backtick run matches or
    exceeds the opener, so `marked` would close the outer fence at
   the inner `\`\`\`` and the rest of the file would render as
   markdown — corrupting the artifact and potentially altering
   formatting / links outside `<code>`.

   New `longestLeadingBacktickRun(source)` scans for the longest
   start-of-line backtick run in the payload. Fence length =
   `max(3, longest + 1)` — strictly more than any internal run, so
   `marked` can never close the outer fence early. Only escalates
   when needed; the common case still uses a triple-backtick fence.

   Inline backticks (mid-line) don't count — they're not fence
   delimiters. Only column-zero runs trigger escalation, so e.g.
   a Python file with ` `inline ``` here` ` keeps the 3-fence.

+11 regression tests:
  - 8 parameterized cases: `Dockerfile`/`Makefile`/`Gemfile`/etc. route
    to CODE via bare-name fallback (case-insensitive on basename).
  - Path-aware: `proj/Dockerfile` recognized.
  - No double-match: `dockerfile.dev` (with extension) returns null.
  - Unknown extensionless files (`README`, `LICENSE`) stay null.
  - 4-backtick fence when source has ` ``` ` at start-of-line.
  - 5-backtick fence when source has ` ```` ` at start-of-line.
  - 3-backtick fence (default) for ordinary code.
  - Inline backticks don't escalate.
  - Source starting with backtick run at offset 0.

Plus 6 new `languageForFilename` tests covering bare-name fallback
and path-awareness.

108/108 in artifact-impacted tests (was 87, +21 tests). No regressions.
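The bare-name fallback from point 1 might be sketched as (a hedged approximation; the map shown is an illustrative subset of the real `CODE_EXTENSION_TO_LANGUAGE`):

```javascript
// Illustrative subset — the real language map is much larger.
const CODE_EXTENSION_TO_LANGUAGE = { dockerfile: 'dockerfile', makefile: 'makefile', py: 'python' };

// Returns the lowercased basename for extensionless filenames; returns ''
// when the basename contains a '.', so the extension and bare-name lookup
// paths can never double-match the same file.
function bareNameOf(filename) {
  const base = filename.slice(filename.lastIndexOf('/') + 1); // path-aware: basename only
  if (base.includes('.')) return '';
  return base.toLowerCase();
}
```

A miss in the map (e.g. `README`) simply leaves the file on its existing inline path.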

* 🛡️ fix: Indented fence detection + basename-scoped extensionOf (codex P2/P3)

Two follow-ups from the latest Codex review on the CODE bucket:

1. **Indented backtick runs (Codex P2).** `longestLeadingBacktickRun`
   was scanning `^(`+)` — column 0 only. CommonMark allows fence
   closers to be indented up to 3 spaces, so a JS source containing
   an indented `\`\`\`` (e.g. inside a template literal embedded in a
   class method) would still terminate our outer fence and the
   remainder would render as markdown.

   Updated regex to `^ {0,3}(`+)`. Tabs are not allowed in fence
   indentation (CommonMark expands them to 4 spaces, which is over
   the 3-space limit), so spaces alone suffice. Backticks indented
   4+ spaces are CommonMark "indented code blocks" — they can't
   terminate a fence, so we correctly don't escalate for them.

2. **`extensionOf` path-laden output (Codex P3).** `extensionOf` took
   `lastIndexOf('.')` across the FULL path string, so
   `pkg.v1/Dockerfile` yielded the nonsensical "extension"
   `v1/dockerfile`. `languageForFilename` returned that as the language
   hint (broken `language-v1/dockerfile` class on the fenced block),
   AND the routing's bare-name fallback couldn't fire because the
   extension lookup returned non-empty.

   New `basenameOf` helper strips path separators; `extensionOf` and
   `bareNameOf` both go through it. After the fix:
     - `pkg.v1/Dockerfile` → `extensionOf` returns `''` → `bareNameOf`
       returns `dockerfile` → routes to CODE with correct language.
     - `pkg.v1/main.go` → `extensionOf` returns `go` → routes correctly.
     - `pkg.v1/script.py` → `extensionOf` returns `py` → routes correctly.

+10 regression tests:
  - 5 parameterized cases covering 1-3 space indent at fence lengths
    3, 4, 5 (escalation kicks in correctly).
  - 4-space indent does NOT escalate (CommonMark indented-code-block
    territory; can't close a fence).
  - `pkg.v1/Dockerfile` and `a.b.c/Makefile` route to CODE +
    `languageForFilename` returns `dockerfile`/`makefile`.
  - Dotted-directory files (`pkg.v1/main.go`, `a.b.c/script.py`) still
    route correctly via the basename-scoped extension parse.

118/118 in artifact-impacted tests (was 108, +10 tests). No regressions.
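Assembled from this commit and its predecessor, the fence and basename helpers might look roughly like this (a hedged sketch of the described behavior, not the actual source):

```javascript
// Fence closers may be indented up to 3 spaces (CommonMark); 4+ spaces is an
// indented code block and cannot close a fence, so only 0-3 space runs count.
function longestLeadingBacktickRun(source) {
  let longest = 0;
  for (const line of source.split('\n')) {
    const m = /^ {0,3}(`+)/.exec(line);
    if (m && m[1].length > longest) longest = m[1].length;
  }
  return longest;
}

function wrapAsFencedCodeBlock(source, lang = '') {
  // Strictly longer than any internal run, never shorter than 3.
  const fence = '`'.repeat(Math.max(3, longestLeadingBacktickRun(source) + 1));
  return `${fence}${lang}\n${source}\n${fence}`;
}

// Parse extensions from the basename only, so dotted directory names
// ('pkg.v1/Dockerfile') don't yield bogus extensions like 'v1/dockerfile'.
function basenameOf(p) {
  return p.slice(p.lastIndexOf('/') + 1);
}
function extensionOf(filename) {
  const base = basenameOf(filename);
  const dot = base.lastIndexOf('.');
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : '';
}
```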

* 🛡️ fix: Comprehensive review polish + MIME-derived language hint (codex P3)

Resolves all 8 valid findings from the comprehensive review and the
follow-up Codex P3 on the same PR. None are user-visible bugs; the set
spans correctness guards, dead-code removal, organization, and test
coverage.

**Comprehensive review #1 — Remove dead `isCodeExtension` export.**
Function was exported with zero callers anywhere in the codebase.

**Comprehensive review #2 — Guard the for-loop against silent overwrites.**
The `for (ext of CODE_EXTENSION_TO_LANGUAGE)` loop blindly assigned
each language extension to the CODE bucket. If a future contributor
added `jsx` or `tsx` to the language map (a natural mistake — they
ARE source code), the loop would silently overwrite the REACT bucket
entries and break the sandpack live-preview with no compile-time or
runtime error. Added `if (ext in EXTENSION_TO_TOOL_ARTIFACT_TYPE) continue`
so explicit map entries always win.
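The guarded loop might look like this (bucket IDs and map contents are illustrative stand-ins, not the real constants):

```javascript
// Illustrative constants — real bucket IDs and maps differ.
const CODE = 'application/vnd.code';
const REACT = 'application/vnd.react';
const EXTENSION_TO_TOOL_ARTIFACT_TYPE = { jsx: REACT, tsx: REACT };
const CODE_EXTENSION_TO_LANGUAGE = { py: 'python', go: 'go', jsx: 'jsx' }; // jsx added by mistake

for (const ext of Object.keys(CODE_EXTENSION_TO_LANGUAGE)) {
  if (ext in EXTENSION_TO_TOOL_ARTIFACT_TYPE) {
    continue; // explicit map entries always win — jsx/tsx stay on REACT
  }
  EXTENSION_TO_TOOL_ARTIFACT_TYPE[ext] = CODE;
}
```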

**Comprehensive review #3 — Add `fileToArtifact` end-to-end test for CODE.**
Routing was tested via `detectArtifactTypeFromFile`; full Artifact
construction (id / type / title / content / messageId / language) for
CODE was not. Added 5 new `fileToArtifact` cases.

**Comprehensive review #4 — Move pure utilities out of the hook file.**
`wrapAsFencedCodeBlock` and `longestLeadingBacktickRun` are pure
string transformations with no React dependencies. Moved both to
`utils/artifacts.ts`. Test files updated to import from the new
location.

**Comprehensive review #5 — Correct the MIME-map "mirrors" comment.**
Comment claimed the MIME map mirrored `CODE_EXTENSION_TO_LANGUAGE`, but
covered ~21 of ~60 entries. Reworded to "best-effort COMMON-CASE list,
not an exhaustive mirror" with the rationale (extension routing is
primary; MIME is a stripped-filename fallback).

**Comprehensive review #6 — Drop `lang ? lang : ''` ternary.**
`lang` is typed `string`; the only falsy value is `''`. Removed.
(Replaced via the MIME-fallback rewrite of `wrapAsFencedCodeBlock`,
where `lang` is now used directly without the ternary.)

**Comprehensive review #7 — Avoid double `basenameOf` computation.**
`extensionOf(filename)` and `bareNameOf(filename)` both internally
called `basenameOf` — when the extension lookup missed,
`detectArtifactTypeFromFile` paid for two parses of the same path.
Split into private `extensionFromBasename` / `bareNameFromBasename`
helpers; the caller computes `basenameOf` once and threads it through.

**Comprehensive review #8 — Trim verbose Dockerfile/Makefile comment.**
Inline comment block in the language map duplicated `bareNameOf`'s
JSDoc. Replaced with a one-line pointer.

**Codex P3 — MIME fallback for the CODE language hint.**
`detectArtifactTypeFromFile` routes `{ filename: 'noext', type:
'text/x-python' }` to CODE via the MIME bucket map, but then
`useArtifactProps` derived the language hint from `artifact.title`
ONLY — and `noext` has no extension, so `languageForFilename` returned
empty and the fenced block emitted with no `language-` class. The
future highlighter swap-in would lose syntax-color metadata for these
files.

  - New `MIME_TO_LANGUAGE` map covering the language MIMEs codeapi
    actually emits.
  - `languageForFilename(filename, mime?)` now takes an optional MIME
    second arg and falls back to it after the extension and bare-name
    paths.
  - `fileToArtifact` resolves the language at construction time
    (using both filename AND `attachment.type`) and stores it on
    `artifact.language`. The hook reads `artifact.language` directly
    rather than re-deriving from `title` alone, so the MIME signal
    survives end-to-end.
  - Title-derived fallback in the hook covers older callers that
    don't populate `language`.
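The three-step lookup (extension → bare name → MIME) might be sketched as follows; the map contents here are small illustrative subsets, and `text/x-go` is an assumed MIME, not necessarily one codeapi emits:

```javascript
// Illustrative subsets — the real maps are larger.
const CODE_EXTENSION_TO_LANGUAGE = { py: 'python', go: 'go', dockerfile: 'dockerfile' };
const MIME_TO_LANGUAGE = { 'text/x-python': 'python', 'text/x-go': 'go' };

function languageForFilename(filename, mime) {
  const base = filename.slice(filename.lastIndexOf('/') + 1).toLowerCase();
  const dot = base.lastIndexOf('.');
  const ext = dot > 0 ? base.slice(dot + 1) : '';
  if (ext && CODE_EXTENSION_TO_LANGUAGE[ext]) return CODE_EXTENSION_TO_LANGUAGE[ext]; // 1. extension
  if (!ext && CODE_EXTENSION_TO_LANGUAGE[base]) return CODE_EXTENSION_TO_LANGUAGE[base]; // 2. bare name
  if (mime) {
    const baseMime = mime.split(';')[0].trim().toLowerCase(); // strip ';charset=...' parameters
    if (MIME_TO_LANGUAGE[baseMime]) return MIME_TO_LANGUAGE[baseMime]; // 3. MIME fallback
  }
  return '';
}
```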

Tests:
  +10 cases for the comprehensive review findings (CODE end-to-end
  via `fileToArtifact`, language storage, non-CODE language
  un-set).
  +6 cases for the MIME fallback (`languageForFilename(name, mime)`
  ordering, MIME parameter stripping, extension/bare-name vs MIME
  precedence, empty signal).
  +2 hook tests for `artifact.language` pre-resolved vs title-fallback.

131/131 in directly-impacted files (was 118, +13).
199/199 across the broader artifact suite. No regressions.

Pre-existing TypeScript errors in `a11y/`, `Agents/`, `Auth/`,
`Mermaid.tsx`, etc. are unrelated to this PR (verified by checking
`tsc --noEmit` on origin/dev — same errors).
2026-04-28 19:07:19 -04:00
Danny Avila
c9dee962e7
📂 fix: Preserve Nested Folder Paths for Code-Execution Artifacts (#12848)
* 📂 fix: Preserve Nested Folder Paths for Code-Execution Artifacts

When codeapi reports a generated file at a nested path (`a/b/file.txt`),
`processCodeOutput` was running it through `sanitizeFilename` — which
calls `path.basename()` and then collapses `/` to `_`. The DB row ended
up with `filename: "file.txt"`, `primeFiles` shipped that flat name back
to the next sandbox session, and `cat /mnt/data/a/b/file.txt` 404'd.

Fix: split the sanitizer into two helpers in `packages/api/src/utils/files.ts`:

  - `sanitizeArtifactPath` — segment-wise sanitize while preserving
    `/`. Falls back to basename on `..` traversal, absolute paths, and
    other malformed inputs. The DB record uses this so the next prime()
    can recreate the nested path in the sandbox.

  - `flattenArtifactPath` — encode `/` as `__` for the local
    `saveBuffer` strategies, which key by single-component filename and
    would otherwise create unintended subdirectories under uploads/.

`process.js` is updated to use both: DB filename keeps the path, storage
key flattens. `claimCodeFile` is also keyed on `safeName` so the
(filename, conversationId) compound key stays consistent with the
record `createFile` writes.

Tests:
  +13 unit tests in `files.spec.ts` (sanitizeArtifactPath table,
  flattenArtifactPath round-trip).
  +1 integration test in `process.spec.js` asserting the DB-row vs
  storage-key split for a nested path.
  Updated `process-traversal.spec.js` to mock the new helpers.

64 pass / 0 fail across `Files/Code/`; 36 pass / 0 fail in
`packages/api/src/utils/files.spec.ts`.

Companion: ClickHouse/ai#1327 — the codeapi-side counterpart that stops
phantom file IDs from reaching this code path in the first place. These
two are independent but the matplotlib bug is most cleanly resolved when
both ship.

* 🛡️ fix: Re-add 255-char per-segment cap in sanitizeArtifactPath (codex review P2)

`sanitizeArtifactPath` dropped the 255-char basename cap that
`sanitizeFilename` enforces. Long artifact names then flowed unbounded
into `processCodeOutput`'s storage key (`${file_id}__${flatName}`) and
tripped `ENAMETOOLONG` on filesystems that enforce `NAME_MAX` —
saveBuffer fails, and the file falls back to a download URL instead of
persisting / priming. This was a regression specifically for flat
filenames that the original `sanitizeFilename` would have truncated
safely.

Re-add the cap as a per-path-component limit so it applies cleanly to
both flat and nested paths:

  - Leaf segment: extension-preserving truncation, matching
    `sanitizeFilename`'s shape (`<truncated-stem>-<6 hex>.<ext>`).
  - Non-leaf (directory) segments: plain truncate-and-disambiguate
    (`<truncated-name>-<6 hex>`); directory names don't carry semantic
    extensions worth preserving.
  - Defensive fallback when `path.extname` returns a pathologically long
    "extension" (e.g. `_.aaaa…aaa` after the dotfile underscore prefix
    rewrite turns a long hidden file into a non-dotfile with a 300-char
    "extension"): collapse to whole-segment truncation rather than
    leaving the cap unmet.

+6 unit tests covering: long leaf (regression case), long leaf under a
preserved directory, long non-leaf segment, deeply nested mixed-length,
exact-255 boundary (no truncation), and the dotfile + truncation
interaction.

* 🛡️ fix: Cap flattened storage key against NAME_MAX in processCodeOutput (codex review P1)

Per-segment caps on the path-preserving form aren't enough. Once segments
are joined with `__` for the storage key, deeply-nested or moderately
long paths can still produce a flat form that overflows once
`${file_id}__` is prepended — `${file_id}__a__b__c.csv` for a 3-level
100-char-each path is ~344 chars, well past filesystem NAME_MAX (255).
saveBuffer then trips ENAMETOOLONG and falls back to a download URL,
and the artifact never persists / primes.

`flattenArtifactPath` gets an optional `maxLength` parameter. When set,
the function truncates the flat form to fit, preserving the leaf
extension with the same disambiguating-hex-suffix shape sanitizeFilename
uses. Default (`undefined`) keeps existing call sites uncapped — the cap
is opt-in for callers that are actually building a filesystem key.
Pathologically long "extensions" from `path.extname` (e.g. `.aaaa…aaa`)
fall back to whole-key truncation rather than leaving the cap unmet.

processCodeOutput composes the storage key after `file_id` is known and
passes `255 - file_id.length - 2` as the budget so the full
`${file_id}__${flatName}` string fits in one filesystem path component.

+7 unit tests in files.spec.ts:
  - Pass-through when no maxLength supplied (cap is opt-in).
  - Pass-through when flat form fits within maxLength.
  - Truncation with leaf extension preserved (the regression case).
  - Leaf-only overflow with extension preservation.
  - Pathological long-extension fallback (whole-key truncation).
  - No-extension stem truncation.
  - Boundary equality (off-by-one guard).

+1 integration test in process.spec.js: processCodeOutput passes the
file_id-aware budget (`255 - file_id.length - 2`) to flattenArtifactPath.

114/114 across files.spec.ts + Files/Code (49 + 65).

* 🛡️ fix: Determinize + clamp artifact-path truncation (codex review P2 ×2)

Two follow-ups to Codex review on the path/flat-key cap:

1. **Deterministic truncation suffixes**. The previous helpers used
   `crypto.randomBytes(3)` for the disambiguator, mirroring
   `sanitizeFilename`'s shape. That made the truncated form
   non-deterministic: a re-upload of the same long filename would compute a
   *different* storage key, orphaning the previous on-disk file under
   the reused `file_id` returned by `claimCodeFile`.

   New `deterministicHexSuffix(input)` helper hashes the input with
   SHA-256 and takes the first 6 hex chars. Same input → same suffix
   (storage key stable across re-uploads); different inputs sharing a
   truncation prefix still get different suffixes (collision avoidance).
   24 bits ≈ 16M values is collision-safe for our scale (single-digit
   artifacts per turn per (filename, conversationId) bucket).

   Applied to `truncateLeafSegment`, `truncateDirSegment`, and
   `flattenArtifactPath` — every truncation site in the new helpers.
   `sanitizeFilename` (pre-existing) is intentionally left alone; its
   tests rely on the random-bytes mock and it's outside this PR's scope.

2. **Final clamp on flattenArtifactPath result**. The old `Math.max(1,
   maxLength - ext.length - 7)` floor could let the result slip past
   `maxLength` when the extension was nearly as large as the budget
   (e.g. `maxLength=5`, `ext=".txt"`: budget computed as 0, but result
   was `-<6 hex>.txt` = 11 chars). Drop the `Math.max(1, …)` floor and
   add a final `truncated.slice(0, maxLength)` so the contract holds
   for any input. Also short-circuit `maxLength <= 0` to `''` for
   pathological budgets.

Tests updated to compute the expected hash inline (the existing
`randomBytes` mock doesn't apply to the new code path), plus 4 new
regression tests:
  - sanitizeArtifactPath: same input → same output, different inputs →
    different outputs (determinism + collision avoidance).
  - flattenArtifactPath: same input → same output, different inputs
    sharing a truncation prefix → different outputs.
  - flattenArtifactPath: clamp holds when ext.length > maxLength - 7.
  - flattenArtifactPath: returns '' for maxLength <= 0.

53 unit tests pass. 65 integration tests pass.

* 🛡️ fix: Total-path cap + basename for classifier (codex P2 + comprehensive review)

Four follow-ups from the latest reviews on this PR:

1. **Codex P2: total-path cap in sanitizeArtifactPath**. Per-segment
   caps weren't enough — a deeply nested path (3+ at-cap segments) can
   still produce a joined form past Mongo's 1024-byte indexed-key limit
   (4.0 and earlier reject; later versions configurable). Added
   `ARTIFACT_PATH_TOTAL_MAX = 512` and a leaf-only fallback when the
   joined form exceeds it. Same shape as the absolute-path /
   `..`-traversal fallbacks above; the leaf is already segment-capped to
   ≤255, so the final result stays within bounds.

2. **Codex P2: pass basename to classifier/extractor in process.js**.
   With the path-preserving sanitizer, `safeName` can now be a nested
   string like `reports.v1/Makefile`. The classifier's `extensionOf`
   reads that as `v1/Makefile` (the slice after the dot in the directory
   name) and the bare-name branch rejects because it sees a `.`
   anywhere. Result: extensionless artifacts under dotted folders
   (Makefile, Dockerfile, etc.) get misclassified as `other` and skip
   text extraction. Pass `path.basename(safeName)` to both
   `classifyCodeArtifact` and `extractCodeArtifactText` so
   classification matches what the old flat-name flow produced.

3. **Review nit: drop dead `sanitizeFilename` mock in process.spec.js**.
   process.js no longer imports `sanitizeFilename`; the mock was
   misleading dead code.

4. **Review nit: rename misleading `'embedded parent traversal'` test**.
   `path.posix.normalize('a/../escape.txt')` resolves to `escape.txt`
   which goes through the normal segment-split path, not the
   `sanitizeFilename` fallback. Test name now says "resolves embedded
   parent traversal via path normalization" to match the actual code
   path.

+3 regression tests:
  - sanitizeArtifactPath falls back to leaf-only when joined > 512.
  - sanitizeArtifactPath keeps nested path within the 512 budget.
  - process.spec: passes basename (`Makefile` from `reports.v1/Makefile`)
    to classifyCodeArtifact + extractCodeArtifactText.

Existing "caps every segment in a deeply-nested path" test now uses 2
segments (not 3) so the joined form stays under the new total cap; the
3-segment scenario is covered by the new fallback test instead.

55 unit + 66 integration = 121/121 pass.

* 📝 docs: Correct sanitizeArtifactPath JSDoc to match actual schema index

Two doc-only fixes from the latest comprehensive review (both NIT):

1. **Index field list was wrong**. JSDoc claimed the compound unique
   index was `{ file_id, filename, conversationId, context }`. The
   actual index in `packages/data-schemas/src/schema/file.ts:92-95` is
   `{ filename, conversationId, context, tenantId }` with a partial
   filter for `context: FileContext.execute_code`. The cap rationale
   (Mongo 4.0 indexed-key limit) is correct and unchanged; just the
   field list was wrong. Added the schema file path so future readers
   can find the source of truth.

2. **Trade-off acknowledgement**. The reviewer noted that the
   leaf-only fallback loses directory structure, which means the
   model's `cat /mnt/data/<deep>/<path>/file.txt` would 404 on the
   pathological-depth case — partially re-introducing the original
   flat-name bug for >512-char paths. This is intentional (DB write
   failure is strictly worse than losing structure), but the trade-off
   wasn't called out explicitly in the JSDoc. Added a paragraph
   acknowledging it and noting that the cap is monotonically better
   than the pre-PR behavior, where ALL artifacts were treated this way
   regardless of depth.

No code or test changes — pure JSDoc correction. Tests still 55/0.

* 🛡️ fix: Disambiguate sanitized artifact names to keep claimCodeFile keys unique (codex P2)

`sanitizeArtifactPath` is not injective — multiple raw inputs can collapse
onto the same regex-and-normalize output. Codex's example:
`reports 2026/out.csv` and `reports_2026/out.csv` both sanitize to
`reports_2026/out.csv`. `claimCodeFile` is keyed on the schema's compound
unique `(filename, conversationId, context, tenantId)` index, so the
later upload silently matches the earlier record and overwrites the first
artifact's bytes via the reused `file_id` — a single conversation can
drop files when both names are valid in the sandbox.

This collision space isn't strictly new — pre-PR `sanitizeFilename`
(basename-only) had the same property — but the path-preserving form
gives us enough information to fix it for the first time.

**Fix.** When character-level sanitization changed something (regex
replacement, path normalization, dotfile prefix, empty-segment collapse),
embed a deterministic SHA-256 prefix of the **raw input** in the leaf
segment via the new `embedDisambiguatorInLeaf` helper. Same raw input →
same safe form (idempotent for re-uploads); different raw inputs that
would have collided → different safe forms.

**Why "character-level"** specifically:
- The disambiguator fires when `preCapJoined !== inputName` (post-regex
  + dotfile + empty-segment, BUT pre-truncation).
- Truncation alone is already disambiguated by `truncateLeafSegment`'s
  own seg-hash; firing the input-hash branch on truncation would just
  stack a second hash for no collision-avoidance benefit and clutter
  human-readable filenames.

**Three known collision shapes covered:**
1. `out 1.csv` vs `out_1.csv` (and `out@1.csv` vs `out#1.csv`, etc.)
2. `dir//file.txt` vs `dir/file.txt` (empty-segment collapse)
3. `.x` vs `_.x` (dotfile-prefix step)

**Disambiguator + truncation interaction:** for very long mutated leaves,
`truncateLeafSegment` caps at 255 first, then `embedDisambiguatorInLeaf`
re-trims to insert the input hash. The seg-hash from the first pass is
replaced by the input-hash from the second pass — that's intentional
(input-hash is the load-bearing collision-avoidance suffix; seg-hash was
only ever decorative once the input-hash exists). Final clamp ensures
the result never exceeds `ARTIFACT_PATH_SEGMENT_MAX` regardless of input.

**Disambiguator + total-cap fallback:** when joined > 512, we fall back
to the leaf-only form. The leaf has already had the disambiguator
embedded, so collision avoidance survives the pathological-depth case.

**`embedDisambiguatorInLeaf`** uses `dot <= 1` to detect "no real
extension" (covers extensionless names AND dotfile-prefixed leaves like
`_.hidden` — without this, `_.hidden` would split as stem `_` + ext
`.hidden` and produce the awkward `_-<hash>.hidden`).

**Updated 5 existing tests** that asserted the old collision-prone
outputs — they now verify the disambiguator-included form. The
character-level-only firing rule was load-bearing here: tests for
"clean inputs (no mutation)" and "long inputs (truncation only)" still
pass without any disambiguator clutter.

**+7 regression tests** in a new `collision avoidance (Codex review P2)`
describe block:
1. Different raw inputs sanitizing to the same form get distinct safes
2. Whitespace-vs-underscore in directory segment
3. Dotfile-prefix collision
4. Idempotency: same raw → same safe across calls
5. Clean inputs skip the disambiguator (cosmetic guarantee)
6. Disambiguator survives leaf truncation (long mutated leaf)
7. Disambiguator survives total-cap fallback (pathological depth)

62 unit + 66 integration = 128/128 pass.
2026-04-28 12:52:04 +09:00
Danny Avila
7070eb76aa
🔧 fix: Replace Literal NUL Bytes in handlers.spec Test Fixture + Normalize CRLF (#12852)
Two test-file hygiene fixes:

1. **Literal NUL bytes**. The `'rejects binary content (NUL bytes)
   post-fetch'` test embedded raw `\x00` bytes directly in the source
   string (`const binaryWithNul = '<3 NULs>\rIHDR<2 NULs>\x04'`).
   Embedding NULs in source files breaks editors, linters, ts-loader,
   and most git tooling — `grep` even classifies the file as binary
   ("Binary file matches"). Replace with `\x00` escape sequences in the
   string literal so the source is plain ASCII while the runtime string
   value is unchanged.

2. **CRLF line endings**. My earlier commits to this file picked up
   Windows-style `\r\n` from git's `core.autocrlf=true` checkout
   conversion, then staged them as `\r\n`. The diff against `dev`
   showed the entire file as changed even though only a few lines were
   touched semantically. Normalize the whole file back to LF so future
   diffs read clean.
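
The equivalence in (1) is easy to pin down in isolation (plain TypeScript, no project code involved):

```typescript
// The source file now contains four plain ASCII characters (\ x 0 0),
// but the evaluated string is a single NUL byte — the runtime value the
// test exercises is unchanged.
const viaEscape = '\x00';
const viaCharCode = String.fromCharCode(0);

console.log(viaEscape.length);          // 1 — one character, not four
console.log(viaEscape === viaCharCode); // true — same runtime value
```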

The diff for this commit is large (~1248 lines marked changed) but
every change is one of: CRLF → LF, or the single `binaryWithNul`
escape-sequence rewrite. No semantic test changes.

Tests: 39/39 pass (unchanged behavior).
2026-04-28 11:20:26 +09:00
Danny Avila
2624c18633
🚫 fix: Reject Binary Files in read_file Sandbox Fallback (No More Mojibake) (#12851)
* 🚫 fix: Reject Binary Files in read_file Sandbox Fallback (No More Mojibake)

`read_file("/mnt/data/simple_graph.png")` was shelling `cat` through codeapi
`/exec` and shipping the result back to the model. codeapi's transport is
JSON, so `stdout` containing PNG bytes round-tripped through lossy UTF-8
replacement (every non-ASCII byte became U+FFFD), got line-numbered by
`addLineNumbers`, and arrived in the model's context as a multi-KB blob of
`�PNG\r\n  2 | ...`. The bytes were unrecoverable — and the same
codeapi sandbox logged the base64-style mojibake too — so the goal is
fail-fast, not retrieval.

Two guards in `handleSandboxFileFallback`, both bypassed by the existing
text path:

  1. **Extension precheck** (BEFORE any network call) for known-binary
     types: images, documents (pdf/docx/xlsx/etc), archives, audio, video,
     native libs, fonts, and a few other byte-soup formats. The message
     for image extensions points at the existing chat attachment ("the
     image is already shown to the user, use bash_tool for programmatic
     processing"); other binaries get the generic "use bash" hint.

  2. **NUL-byte sniff** (AFTER the read) on the first 8KB for unknown
     extensions or no-extension paths. The codeapi `/exec` JSON encoding
     mangles most non-UTF-8 bytes but a NUL terminator from a magic
     header survives the round-trip, so this catches novel binary
     formats without an extension precheck.

`lowercaseExtension` uses the basename to avoid false-triggering on
directory-name dots (e.g. `proj.v1/notes` has no extension, not `.v1/notes`).
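
A condensed sketch of the two guards plus the basename-aware extension helper (the set contents and helper names here are illustrative; the real `handleSandboxFileFallback` also builds the user-facing rejection messages):

```typescript
// Illustrative subset — the real list covers documents, archives, audio,
// video, native libs, fonts, etc.
const BINARY_EXTENSIONS_NEVER_READABLE = new Set(['png', 'jpg', 'pdf', 'zip', 'mp3']);

/** Extension from the basename only, so 'proj.v1/notes' yields '' — not 'v1/notes'. */
function lowercaseExtension(path: string): string {
  const base = path.slice(path.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : '';
}

/** Guard 1: known-binary extension — reject before any network call. */
function isKnownBinaryExtension(path: string): boolean {
  return BINARY_EXTENSIONS_NEVER_READABLE.has(lowercaseExtension(path));
}

/** Guard 2: NUL sniff on the first 8KB — catches extensionless binaries post-read. */
function looksBinary(content: string): boolean {
  return content.slice(0, 8192).includes('\x00');
}
```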

+6 tests:
  - Image rejected by extension, never calls readSandboxFile, message
    points at the existing attachment.
  - Non-image binary (.zip) rejected with a different (bash-only) message.
  - Case-insensitive extension match (.PNG vs .png).
  - NUL-byte sniff catches unknown-extension binary post-fetch.
  - Text files with binary-adjacent extensions (.txt) still readable.
  - Dotted directory names don't false-trigger the extension match.

38/38 handlers.spec.ts pass.

The companion bash "command not found" issue from the same conversation
is a separate LLM mistake (writing raw Python as the bash command without
`python3 -c` / heredoc wrapper). Not coded here — flagged to the user.

* 🖼️ fix: Allow SVG read_file (XML text, no mojibake risk) — codex review P2

`.svg` was bucketed with raster-image extensions in
`BINARY_EXTENSIONS_NEVER_READABLE`, which made `handleSandboxFileFallback`
reject every SVG before calling readSandboxFile. SVG is an XML text
format — there's no mojibake risk for normal content, and the model has
legitimate reasons to inspect or edit a generated SVG (tweaking colors,
paths, viewBox, etc.). The block was a regression for valid read_file usage.

Remove `.svg` from both `BINARY_EXTENSIONS_NEVER_READABLE` (so it routes
through the normal sandbox read path) and `IMAGE_EXTENSIONS_FOR_HINT`
(now-dead entry — only used by the rejection-message picker). The
post-fetch NUL-byte sniff still catches anything that turns out to be
binary despite a `.svg` extension.

+1 regression test that an SVG with valid XML content reads through
successfully (`<svg>...<circle/>...</svg>` → status: 'success', content
contains `<svg`/`viewBox`).
2026-04-27 20:47:38 -04:00
Danny Avila
47f65fe39a
🪟 feat: Render Code-Execution Text Artifacts as Side-Panel Artifacts (#12832)
* 🪟 feat: Render Code-Execution Text Artifacts as Side-Panel Artifacts

Builds on PR #12829 (which populates `text` on code-execution file
attachments). When a tool-output file's extension/MIME maps to a
viewer we already have, route it through the artifact UI instead of
the inline `<pre>`:

- text/html, text/htm        → existing artifacts side panel (sandpack)
- App.jsx / App.tsx          → existing artifacts side panel (sandpack)
- *.md / *.markdown / *.mdx  → existing artifacts side panel (sandpack)
- *.mmd / *.mermaid          → standalone Mermaid component, inline
                                (no sandpack/react template)

The card and the mermaid header both expose a download button so the
underlying file is still reachable. Everything else (csv, py, json,
xls/docx/pptx, …) keeps PR #12829's inline behaviour — dedicated
viewers for csv/docx/xlsx/pptx will land in follow-ups.

Backend: `.mmd` and `.mermaid` added to UTF8_TEXT_EXTENSIONS so
mermaid sources reach the client with `text` populated.
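
A sketch of the extension-based routing described above (subset only; the real tables live in `client/src/utils/artifacts.ts`, and MIME fallbacks apply when the extension is unrecognized):

```typescript
// Illustrative extension → viewer routing. A null result means "no panel
// viewer" and the attachment keeps the inline <pre> / file-chip behaviour.
const EXTENSION_TO_TOOL_ARTIFACT_TYPE: Record<string, string> = {
  html: 'html', htm: 'html',
  jsx: 'react', tsx: 'react',
  md: 'markdown', markdown: 'markdown', mdx: 'markdown',
  mmd: 'mermaid', mermaid: 'mermaid',
};

function detectArtifactTypeFromFilename(filename: string): string | null {
  const dot = filename.lastIndexOf('.');
  const ext = dot > 0 ? filename.slice(dot + 1).toLowerCase() : '';
  return EXTENSION_TO_TOOL_ARTIFACT_TYPE[ext] ?? null;
}
```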

Frontend changes:
- `client/src/utils/artifacts.ts` — `TOOL_ARTIFACT_TYPES` constant,
  `detectArtifactTypeFromFile`, `fileToArtifact` (id is derived from
  `file_id` so the same artifact across renders dedupes cleanly).
- `client/src/components/Chat/Messages/Content/Parts/ToolArtifactCard.tsx`
  — registers the artifact in `artifactsState`, renders an
  `ArtifactButton`-style trigger paired with a download button.
- `client/src/components/Chat/Messages/Content/Parts/ToolMermaidArtifact.tsx`
  — wraps the standalone Mermaid component with a filename + download
  header so the file stays reachable.
- `Attachment.tsx` and `LogContent.tsx` — gain panel-artifact and
  mermaid branches in the routing decision tree, ahead of the existing
  inline-text fallback. Existing branches untouched.

Test coverage: backend extension matrix (mmd/mermaid), frontend
predicates (`isPanelArtifact`, `isMermaidArtifact`,
`artifactTypeForAttachment`), `fileToArtifact`, and an RTL suite that
verifies each type routes to the right component (panel card / mermaid
render / inline pre / file chip).

* 🩹 fix: Address review on code-artifacts-panel routing

- ToolArtifactCard: defer artifact registration to the click handler so
  rendering a card never side-effects into `artifactsState`. With
  `artifactsVisibility` defaulting to `true`, eager mount-time
  registration would surface tool artifacts in the side panel without
  user intent — now matches ArtifactButton's pattern. Drop the
  redundant `artifacts` subscription (write-only via useSetRecoilState).
- LogContent.tsx: precompute `Artifact`s inside the existing useMemo
  bucket-sort so each render isn't producing fresh objects. Without
  this, missing updatedAt/createdAt fields would make `toLastUpdate`
  return `Date.now()` and churn Recoil state on every parent render.
- Attachment.tsx + LogContent.tsx: classify each attachment once via
  `artifactTypeForAttachment` and branch on the result, instead of
  calling `isMermaidArtifact` and `isPanelArtifact` back-to-back
  (each of which internally re-classified). AGENTS.md single-pass rule.
- artifacts.ts `detectArtifactTypeFromFile`: strip `;` parameters
  before the MIME comparison (so `text/html; charset=utf-8` is
  recognized) and add fallbacks for `application/vnd.react`,
  `application/vnd.ant.react`, and `application/vnd.mermaid`.
- ToolMermaidArtifact: drop the `id` prop entirely when `file_id` is
  undefined so we never pass an undefined DOM id through to mermaid.
- AttachmentGroup: keys derived from `file_id` (not bare index) so
  add/remove churn doesn't remount stable cards.
- Wrappers (PanelArtifact / MermaidArtifact / ToolMermaidArtifact)
  tightened from `Partial<TAttachment>` to `TAttachment` since the
  caller always passes a full attachment.
- fileToArtifact: drop dead `?? ''` on content (guarded by the
  preceding type check).
- Tests: new click-interaction suite verifying the deferred-registration
  invariant, click registers + opens panel, and second click toggles
  closed without losing the registered artifact.
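
The `;`-parameter stripping and vendor-MIME fallbacks reduce to a small normalizer — a sketch (the mapping is a subset, and the artifact-type tags are assumed labels):

```typescript
/** Strip ';' parameters and lowercase, so 'text/html; charset=utf-8' matches 'text/html'. */
function baseMime(mime?: string): string {
  return (mime ?? '').split(';')[0].trim().toLowerCase();
}

// Illustrative fallback mapping — extension wins in the real routing;
// MIME is the backup when sniffing returns a canonical type.
const MIME_TO_TOOL_ARTIFACT_TYPE: Record<string, string> = {
  'text/html': 'html',
  'text/markdown': 'markdown',
  'application/vnd.react': 'react',
  'application/vnd.ant.react': 'react',
  'application/vnd.mermaid': 'mermaid',
};
```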

* 🧹 chore: Address follow-up review NITs

- artifacts.test.ts: regression-pin baseMime() with charset/case
  variants for text/html, text/markdown, application/vnd.react.
- attachmentTypes.ts: drop the now-unused isMermaidArtifact and
  isPanelArtifact wrappers (the routing collapsed onto a single
  artifactTypeForAttachment call in the previous commit, so they
  were only kept alive by their own test). attachmentTypes.test.ts
  rewritten to exercise artifactTypeForAttachment branches directly.
- Attachment.tsx + LogContent.tsx: re-sort the local imports
  longest-to-shortest per AGENTS.md (~/utils/artifacts is 72 chars
  and was sitting after a 51-char import).

* feat: Auto-open panel + route txt/docx/odt/pptx through artifacts

- artifacts.ts: add `text/plain` to TOOL_ARTIFACT_TYPES so plain-text
  documents (and the markdown-like ones we don't have rich viewers for
  yet) can route through the side panel. `useArtifactProps` already
  dispatches `text/plain` to the markdown-style template, so they
  render cleanly with no panel-side change.
- Extension map gains txt/docx/odt/pptx → text/plain. pptx is wired
  up speculatively — backend extraction is still deferred, so the
  routing fires the moment that lands. The MIME map gets the matching
  office MIME types for symmetry (extension wins, but it's nice to
  have the fallback when sniffing returns the canonical office MIME).
- ToolArtifactCard: register the artifact in `artifactsState` on
  mount again. With visibility defaulting to `true` and the panel's
  `useArtifacts` hook auto-selecting the latest artifact, this gives
  the auto-open behaviour that the legacy streaming artifacts have.
  Click handler is now just "focus + reveal" (registration already
  happened); a user who has explicitly closed the panel keeps it
  closed and uses the click to re-open.
- Tests: parameterised row for each new extension; ArtifactRouting
  invariant flipped from "no register on mount" to "registers on
  mount so panel can auto-open". The existing TextAttachment test that
  used `a.txt` was switched to `a.csv`, since `.txt` now panel-routes.

* 🐛 fix: Auto-focus latest tool artifact + self-heal after panel close

Two bugs in the previous commit's auto-open behaviour:

1. After closing the side panel, no artifact card could be reopened.
   `useArtifacts.ts` resets `artifactsState` in its unmount cleanup
   (line 50), which fires when visibility goes to `false`. The card's
   mount-only `useEffect` doesn't refire after that wipe, so the
   subsequent click set `currentArtifactId` to an id that was no
   longer in `artifactsState`, and `Presentation.tsx` then refused to
   render the panel because `Object.keys(artifacts).length === 0`.

   Fix: the registration `useEffect` now has no dependency array, so
   it self-heals after the wipe (the dedup check keeps it cheap when
   nothing actually needs writing).

2. Newly-arrived artifacts didn't steal focus from an already-selected
   one. `useArtifacts`'s fallback auto-select (line 64) only fires
   when `currentId` is null or no longer in the list — it deliberately
   protects an existing selection, while the streaming-specific effect
   that handles legacy focus-stealing is gated on `isSubmitting`.
   That gate doesn't apply to tool-output artifacts.

   Fix: a second `useEffect` keyed on `artifact.id` calls
   `setCurrentArtifactId(artifact.id)` whenever a new card mounts.
   Cards mount in attachment-array order, so the LAST-mounted card
   (the newest tool output) wins — matching the legacy "latest auto-
   opens" UX.

Tests: replace the now-stale "no register on mount" assertion with
"registers and auto-focuses on mount", flip the toggle test to start
from the auto-focused state, and add two regression tests covering
the close-then-reopen path and the latest-of-many auto-focus.

* feat: Route pptx through artifact panel with placeholder content

Before this commit, pptx files fell through to a plain FileContainer
chip even though the extension was wired into the artifact map: backend
text extraction is still deferred for pptx, so `attachment.text` came
back null/empty and `detectArtifactTypeFromFile`'s strict text check
returned null. That meant docx/odt rendered as proper artifact cards
while pptx in the same message rendered as a tiny download chip.

`detectArtifactTypeFromFile` now allows empty text for the plain-text
and markdown buckets, since their viewers (the markdown template) handle
empty content gracefully. HTML / React / Mermaid still require real
content because sandpack and mermaid.js error on empty input.

`fileToArtifact` substitutes a markdown placeholder
("Preview not available yet — click Download to view the file.") when
the file routes through the panel without text. The panel renders the
placeholder via the markdown template; pptx (and any docx that fails
extraction) gets visual parity with its siblings, and the moment
backend extraction lands the placeholder is replaced by real content
without any frontend change.

Tests: split the "no text returns null" assertion into the strict
viewers (HTML/React/Mermaid) and the lenient ones (plain-text/markdown);
add a fileToArtifact case proving pptx without text gets the
placeholder, and another proving real text wins when present.

* feat: Dedup duplicate tool-artifact cards across tool calls + messages

Two `ToolArtifactCard` instances for the same file_id (e.g. agent reads
back what it just wrote, or the same file is referenced in turns 1 and
5) now collapse to a single chip — the most recent mount wins, the
older sibling re-renders to `null`.

Implementation:

- New `toolArtifactClaim` atomFamily keyed by artifact id. Each card
  generates a unique component-instance key via `useId()`, claims the
  slot in a `useLayoutEffect` (synchronous before paint, no flicker),
  and releases it on unmount only if the claim is still ours. A later
  card with the same id overwrites the claim → earlier card subscribes
  via `useRecoilState` and renders `null`.

- Family-keyed (per artifact id) so adding/removing a claim for one
  file never re-renders cards for unrelated files. Addresses the
  "messages view re-renders frequently" concern: each card subscribes
  only to its own slice.

- `ToolMermaidArtifact` shares the same atom via the new exported
  `toolArtifactKey()` helper, so the same `.mmd` file can't double-
  render either.

- Latest content always wins for the panel because the eager
  `setArtifacts` registration is last-write-wins on `artifactsState`
  by id — independent of which card holds the claim. Updating a file
  refreshes the panel content even if the chip's visual location
  doesn't move.

Tests: two new cases asserting that duplicate panel and mermaid
attachments collapse to a single rendered card.
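
Stripping away Recoil and `useId()`, the claim protocol is last-write-wins with a guarded release — modeled here as a plain map (names hypothetical; the real slot lives in the `toolArtifactClaim` atomFamily):

```typescript
// One claim slot per artifact id; the value is the claiming card's instance key.
const claims = new Map<string, string | null>();

/** A later mount simply overwrites — last claim wins. */
function claimSlot(artifactId: string, instanceKey: string): void {
  claims.set(artifactId, instanceKey);
}

/** On unmount, release only if the claim is still ours, so a stale release never clobbers a newer card's claim. */
function releaseSlot(artifactId: string, instanceKey: string): void {
  if (claims.get(artifactId) === instanceKey) {
    claims.set(artifactId, null);
  }
}

/** A card renders its chip only while it holds the claim; otherwise it renders null. */
function holdsClaim(artifactId: string, instanceKey: string): boolean {
  return claims.get(artifactId) === instanceKey;
}
```

Card A mounts and claims; card B mounts with the same file_id and overwrites; card A's later unmount-release is a no-op.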

* 🧹 chore: Address comprehensive review on code-artifacts-panel

- ToolArtifactCard self-heal now subscribes to a per-id selector
  (`artifactByIdSelector`) instead of a no-deps `useEffect`. Effect
  deps are `(artifact, existingEntry, setArtifacts)` so it runs
  deterministically when the slice transitions to undefined (panel-
  unmount cleanup) or when artifact content drifts — not on every
  parent render. Each card subscribes only to its own slice via the
  selectorFamily, so unrelated state changes don't re-render.
- artifacts.ts: localize the empty-content placeholder via a new
  `fileToArtifact(attachment, options?)` signature. Callers in
  `Attachment.tsx` (PanelArtifact) and `LogContent.tsx` resolve
  `com_ui_artifact_preview_pending` from `useLocalize` and thread
  it in. Default is empty string when no placeholder is supplied.
- artifacts.ts: thread `preClassifiedType` through `fileToArtifact`
  so the routing decision tree's `artifactTypeForAttachment` call
  is the only classification — previously `fileToArtifact` re-ran
  `detectArtifactTypeFromFile` after the routing already had the
  answer. Bucket type updated to `Array<{ attachment, type }>`.
- artifacts.ts: drop bare `text/plain` from `MIME_TO_TOOL_ARTIFACT_TYPE`.
  The extension map handles `.txt` explicitly; routing every
  unrecognized-extension `text/plain` file (extensionless scripts,
  `.env`, etc.) through the panel was a wider catch than the PR
  scope intended.
- artifacts.ts: stable `toLastUpdate` fallback of `0` (was
  `Date.now()`). `useArtifacts` sorts by `lastUpdateTime`, so a fresh
  timestamp on every call would re-sort entries non-deterministically
  across renders.
- artifacts.ts: drop dead `toolArtifactId = toolArtifactKey` alias.
  Add `filepath` to the key-derivation fallback chain so two
  unnamed-and-unidentified files don't collide on the literal
  `tool-artifact-unknown` key.
- ToolArtifactCard import order: package types before local types.
- store/artifacts.ts: JSDoc on `toolArtifactClaim` documenting the
  atomFamily-entries-persist-after-unmount trade-off (entries reset
  to null on card unmount; total cost is one key + a null per
  artifact — fine at typical session scale).
- Tests:
  - Updated existing `fileToArtifact` placeholder assertion to use
    the caller-provided string.
  - New: panel routing skips re-classification when
    `preClassifiedType` is provided.
  - New: bare `text/plain` MIME with unrecognized extension does
    NOT route through the panel.
  - New `LogContent.test.tsx` (6 cases) — HTML→panel, mermaid→
    inline, CSV→inline `<pre>`, archive→download chip, pptx→
    placeholder card, mixed split.
  - Dedup tests rewritten to use two AttachmentGroups (matching
    the real per-tool-call render) instead of a same-array
    duplicate that triggered React's duplicate-key warning.

* 🩹 fix: Address codex review + comprehensive review NITs

codex (P2):
- artifacts.ts: switch placeholder fallback to nullish coalescing.
  Empty string is now preserved as legitimate content (a 0-byte `.md`
  or `.txt` is a valid artifact, not "extraction unavailable") —
  only `null`/`undefined` triggers the deferred-extraction placeholder.
- Attachment.tsx: derive React keys via a new `renderKey` helper that
  combines `file_id` with the array index. Prevents duplicate keys
  when the same file_id appears twice in one bucket (rare but
  possible — a tool call writing the same path twice). Without
  unique keys, React's reconciler could reuse the wrong card
  instance, undermining the latest-mention dedup.
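
A minimal sketch of the `renderKey` idea (exact signature and fallback string assumed):

```typescript
// Combine file_id (stable identity) with the array index (occurrence) so the
// same file_id appearing twice in one bucket still yields unique React keys.
function renderKey(fileId: string | undefined, index: number): string {
  return `${fileId ?? 'unknown'}-${index}`;
}
```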

comprehensive review NITs:
- Attachment.tsx: hoist `import type { ToolArtifactType }` up into
  the type-import section per AGENTS.md.
- artifacts.ts `fileToArtifact`: defense-in-depth empty-text guard
  for the `preClassifiedType` path. Mirrors the gate in
  `detectArtifactTypeFromFile` so a future caller that bypasses
  classification can't hand sandpack/mermaid an empty buffer.
  Plain-text and markdown remain tolerated empty.

Tests:
- New: empty `.md` content passes through unchanged when a
  placeholder is also supplied.
- New: sibling cards with the same file_id in one group render
  without React key-collision warnings.
- Updated existing placeholder test to use `text: null` (the case
  where the placeholder is actually meant to fire).
- Three parameterized cases pinning the new
  preClassifiedType-with-empty-text safety guard.

* 🩹 fix: Address codex P1/P2 review on code-artifacts-panel

- P1 (stale artifacts leak across conversations): Add a top-level
  `useResetArtifactsOnConversationChange` hook in `Presentation.tsx`
  that wipes `artifactsState` / `currentArtifactId` on every
  conversation switch, regardless of panel visibility. Without this
  guard, ToolArtifactCard's self-heal effect would re-register the
  previous conversation's artifacts after panel close, leaking them
  into the next conversation's panel on open.

- P2 (expiresAt skipped on panel-routed entries): Restore the legacy
  expiry gate in `LogContent` ahead of panel/mermaid bucket-sort, so
  expired pptx/html/etc. attachments fall back to the
  "download expired" message instead of rendering as a clickable
  artifact card backed by a dead link.

Includes regression coverage for both paths.

* 🧹 chore: Share renderAttachmentKey across Attachment + LogContent

Hoist the per-occurrence React-key helper from `Attachment.tsx` into
`attachmentTypes.ts` so `LogContent` can use the same pattern. Apply
it to LogContent's panel/mermaid/text/image/nonInline buckets — the
prior keys (e.g. `mermaid-${file_id ?? index}`, `file.filepath ?? ...`)
would have collided if the same file_id appeared twice in one render,
even though that's astronomically rare for a single tool call.

Also drops the unused `file_id` field on `MermaidEntry` since the key
no longer needs it.

* 🩹 fix: Loosen artifacts util input types to match runtime fallbacks

`fileToArtifact`, `detectArtifactTypeFromFile`, `toolArtifactKey`, and
`toLastUpdate` all read every picked field with a nullish fallback —
their inputs were nonetheless typed as required `Pick<TFile, ...>`.
That mismatch made every realistic fixture (and several call sites
that lack a stable `filepath`) fail typecheck for fields the
implementations never strictly need.

Wrap the picks in `Partial<>` so the type matches the contract.

* 🩹 fix: Gate tool-artifact registration on claim winner

When two `ToolArtifactCard` instances mount for the same `artifact.id`
with divergent content (a code-execution file overwritten across
turns reuses its file_id), both effects subscribe to `existingEntry`
through `artifactByIdSelector`. Each card detects the other's write as
drift and overwrites it back, ping-ponging `artifactsState` between
old and new content and causing render churn / panel flicker.

Gate the self-heal registration on `isMyClaim` so only the latest
(claim-holding) card writes. The non-winner still subscribes to the
slice but short-circuits before calling `setArtifacts`, breaking the
loop. Adds a regression test that fails (loop / wrong final content)
without the gate.
2026-04-28 06:08:32 +09:00