mirror of https://github.com/danny-avila/LibreChat.git (synced 2026-05-13 16:07:30 +00:00)
🧱 refactor: typed CodeEnvRef + kind discriminator + principal-aware sandbox cache (#12960)
* 🧱 refactor: typed CodeEnvRef + kind discriminator + tenant-aware sandbox cache

Final cutover for the LibreChat ↔ codeapi sandbox file identity. Replaces the magic string `${session_id}/${file_id}?entity_id=...` with a typed, discriminated `CodeEnvRef`. Pre-release lockstep deploy with codeapi #1455 and agents #148; no legacy aliases retained.

## Final shape

```ts
type CodeEnvRef =
  | { kind: 'skill'; id: string; storage_session_id: string; file_id: string; version: number }
  | { kind: 'agent'; id: string; storage_session_id: string; file_id: string }
  | { kind: 'user'; id: string; storage_session_id: string; file_id: string };
```

`kind` drives codeapi's sessionKey: `<tenant>:<kind>:<id>[:v:<version>]` for shared kinds, `<tenant>:user:<userId>` for user-private (the auth context provides `userId`). `version` is statically required for `kind: 'skill'` and forbidden otherwise via the discriminated union — the constraint holds at compile time for every consumer, not just in codeapi's runtime validator. `id` is sessionKey-meaningful for `'skill'` / `'agent'`; informational only for `'user'` (codeapi resolves user identity from the auth context).

## What changed

- `packages/data-provider/src/codeEnvRef.ts` — discriminated union + a `CODE_ENV_KINDS` const-tuple that keeps the runtime list and the TS union locked together.
- Schemas: `metadata.codeEnvRef` and `SkillFile.codeEnvRef` enums tightened to `['skill', 'agent', 'user']`.
- `primeSkillFiles` writes `kind: 'skill'`, `id: skill._id`, `version: skill.version`. The cache-hit path reads `codeEnvRef` directly. Bumping `skill.version` on edit naturally invalidates the prior cache entry under the new sessionKey.
- `processCodeOutput` writes `kind: 'user'`, `id: req.user.id`. The output bucket is always user-scoped, regardless of which skill the execution invoked. A new regression test pins the asymmetry.
- `primeFiles` reupload preserves `kind`/`id`/`version?` from the existing ref, so a skill-cache-miss reupload doesn't silently demote to the user bucket.
- `crud.js` upload functions (`uploadCodeEnvFile` / `batchUploadCodeEnvFiles`) thread `kind`/`id`/`version?` to the multipart form (codeapi #1455 option α). Without these on the wire, codeapi falls back to user bucketing and skill-cache invalidation never fires. Client-side validation mirrors codeapi's validator.
- `Files/process.js` — chat attachments use `kind: 'user'`; agent setup files use `kind: 'agent'`.
- Drops `entity_id` everywhere (struct, schema sub-docs, write paths, upload form fields). Drops `'system'` from the kind enum (no emitter ever existed).

## Test plan

- [x] `cd packages/data-provider && npx jest src/codeEnvRef.spec` — 4 / 4
- [x] `cd packages/data-schemas && npx jest` — 1447 / 1447
- [x] `cd packages/api && npx jest src/agents` — 81 / 81 in skillFiles + handlers + resources
- [x] `cd api && npx jest server/services/Files server/controllers/agents` — 436 / 436
- [x] `cd api && npx jest server/services/Files/Code` — 98 / 98 (incl. the new "outputs are user-scoped regardless of which skill the execution invoked" regression and "reupload forwards kind/id/version from existing ref")
- [x] `npx tsc --noEmit -p packages/data-{provider,schemas}/tsconfig.json && npx tsc --noEmit -p packages/api/tsconfig.json` — clean (only pre-existing unrelated dev errors in storage/balance, untouched here)

## Deploy notes

- **24h cache-miss burst** on first deploy, for both inputs (skill caches re-prime under the new sessionKey shape) and outputs (any pre-Phase C skill-output cached files become unreadable). Bounded by codeapi's 24h TTL.
- **Lockstep with codeapi #1455 and agents #148.** Either repo can land first since there are no aliases to drain, but the three deploys must overlap within the same maintenance window.
- **`@librechat/agents` bump to `3.1.79-dev.0`** required after agents #148 lands and is published.
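The sessionKey scheme and the version-bump invalidation described in this commit can be sketched as follows. This is a hypothetical model, not codeapi's actual code: `AuthContext` and `deriveSessionKey` are illustrative names, and the real derivation lives in codeapi (#1455).

```typescript
// Sketch of codeapi-side sessionKey derivation from a CodeEnvRef plus the
// request's auth context. Hypothetical helper, assumed from the commit notes.
type CodeEnvRef =
  | { kind: 'skill'; id: string; storage_session_id: string; file_id: string; version: number }
  | { kind: 'agent'; id: string; storage_session_id: string; file_id: string }
  | { kind: 'user'; id: string; storage_session_id: string; file_id: string };

interface AuthContext {
  tenantId: string;
  userId: string;
}

function deriveSessionKey(ref: CodeEnvRef, auth: AuthContext): string {
  switch (ref.kind) {
    case 'skill':
      // Shared kind with a version component: bumping skill.version changes
      // the key, which naturally invalidates the prior cache entry.
      return `${auth.tenantId}:skill:${ref.id}:v:${ref.version}`;
    case 'agent':
      return `${auth.tenantId}:agent:${ref.id}`;
    case 'user':
      // User-private: identity comes from the auth context, not ref.id,
      // so ref.id is informational only for this kind.
      return `${auth.tenantId}:user:${auth.userId}`;
  }
}
```

Because the key for `kind: 'skill'` embeds the version, the "24h cache-miss burst" in the deploy notes follows directly: every pre-cutover entry sits under a key shape that no post-cutover derivation can produce.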
## What this enables

Auth bridge work (JWT-based tenant/user identity between LC and codeapi) — codeapi now derives the sessionKey purely from `req.codeApiAuthContext.{ tenantId, userId }`, so the next chapter is replacing the header-asserted user identity with a verified-claim path.

* 🩹 fix: persist execute_code uploads under codeEnvRef metadata key

Codex review P1 (chatgpt-codex-connector). `Files/process.js` was storing the upload result under `metadata.fileIdentifier` even though:

- `uploadCodeEnvFile` now returns `{ storage_session_id, file_id }`, not the legacy magic string.
- The post-cutover schema (`File.metadata.codeEnvRef`) only declares `codeEnvRef` — mongoose strict mode silently strips unknown keys.
- All readers (`primeFiles`, `getCodeFilesByIds`, `categorizeFileForToolResources`, controller filtering) check `metadata.codeEnvRef`.

Net effect of the bug: chat-attached and agent-setup execute_code files would lose their sandbox reference on save, and `primeFiles` would skip them on subsequent code-execution turns — the file blob would still be available locally but never re-mounted in the sandbox.

Fix: construct the full `CodeEnvRef` (`{ kind, id, storage_session_id, file_id }`) at the write site and persist it under `metadata.codeEnvRef`. `BaseClient`'s "is this a code-env file" presence check accepts the new shape alongside the legacy `fileIdentifier`, for back-compat with any pre-cutover records still in the database. The same change is mirrored in `processAttachments.spec.ts` (which re-implements the BaseClient logic for testability).

New regression tests in `process.spec.js` cover three cases:

- chat attachments (`messageAttachment=true`) → `kind: 'user'`
- agent setup (`messageAttachment=false`) → `kind: 'agent'`
- the legacy `fileIdentifier` key is NOT persisted (it would be schema-stripped)

* 🩹 fix: read storage_session_id on primed file refs (Codex P1)

Codex review (chatgpt-codex-connector).
After Phase B's per-file `session_id` → `storage_session_id` rename, `primeFiles` emits the new field — but `seedCodeFilesIntoSessions` was still reading `files[0].session_id` for the representative session and `f.session_id` for the dedupe key. In runs with only primed attachments (no skill seed), `representativeSessionId` was `undefined`, the function returned the unchanged map, and `seedCodeFilesIntoSessions` silently dropped the entire batch. The first `execute_code` call then started without `_injected_files` and the agent couldn't see prior-turn artifacts.

Fix:

- `codeFilesSession.ts`: read `f.storage_session_id` for both the dedupe key and the representative session id. JSDoc updated to match the new field name.
- `callbacks.js`: the two output-file persistence paths read `file.session_id` to pass to `processCodeOutput` — switched to `file.storage_session_id`. The original comment explicitly says this should be the STORAGE session, which is exactly the field Phase B renamed.
- `codeFilesSession.spec.ts`: the fixture builder uses `storage_session_id` and `kind: 'user'` to match the post-cutover `CodeEnvFile` shape.

Lockstep coordination: this matches the post-bump shape of `@librechat/agents` 3.1.79+. CI tsc errors against the currently-pinned 3.1.78 are expected and resolve when the dep bumps in this PR before merge.

* 📦 chore: Bump `@librechat/agents` to version 3.1.80-dev.0 in package-lock and package.json files

* 🪪 fix: thread kind/id/version through codeapi /download URLs (Phase C α)

Symmetric fix for the upload-side wire change in 537725a. Codeapi's `sessionAuth` middleware now requires `kind`/`id`/`version?` on every download/freshness URL — without them it 400s with "kind must be one of: skill, agent, user" before serving the file.

Three sites construct codeapi-side URLs that go through `sessionAuth`:

- `processCodeOutput` (`Files/Code/process.js`): `/download/<sess>/<id>` for freshly-generated sandbox outputs.
  Always `kind: 'user'` + `id: req.user.id` — code-output files are always user-private, regardless of which skill the run invoked.
- `getSessionInfo` (`Files/Code/process.js`): `/sessions/<sess>/objects/<id>` for the 23h freshness check. Pulls kind/id/version straight off the `codeEnvRef` already in scope — skill files stay skill-bucketed, user files stay user-bucketed.
- The `/code/download/:session_id/:fileId` LC route (`routes/files/files.js`): proxies to codeapi for manual downloads. Only code-output files go through this route, so `kind: 'user'` + `id: req.user.id`.

The `getCodeOutputDownloadStream` helper in `crud.js` now takes an `identity` param, validated by a `buildCodeEnvDownloadQuery` helper that mirrors `appendCodeEnvFileIdentity`'s shape rules: kind required from the closed `{skill, agent, user}` set, version required for 'skill' and forbidden otherwise. Bad callers fail fast on the client instead of round-tripping a 400.

Also cleans up two log-noise sources reported alongside the 400:

- `logAxiosError` in `packages/api/src/utils/axios.ts` was dumping `error.response.data` raw. With `responseType: 'arraybuffer'` that's a `Buffer` (~4 chars per byte after JSON serialization); with `responseType: 'stream'` it's a `Readable` whose internal state serializes the entire ring buffer + socket. The new `renderResponseData` decodes small buffers as UTF-8 (truncated past 2KB) and stubs streams as `'[stream]'`. Diagnostics stay useful, and log lines stop being megabytes.
- The `/code/download` route's catch was a bare `logger.error('...', error)`, bypassing the redactor. Switched to `logAxiosError` so it benefits from the same buffer/stream handling.

Tests updated to match the new contract:

- crud.spec: `getCodeOutputDownloadStream` fixtures pass `userIdentity`; new cases cover skill identity (with version), bad-kind rejection, and skill-without-version rejection.
- process.spec: the `getSessionInfo` test passes a full `codeEnvRef` object.
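The `renderResponseData` idea described above can be sketched as follows. This is a minimal illustration assuming the behavior stated in the commit message (UTF-8 decode for small buffers, 2KB truncation, `'[stream]'` stub); it is not the actual `packages/api` implementation.

```typescript
import { Readable } from 'node:stream';

// 2KB truncation threshold, per the commit message. Assumed name/value.
const MAX_LOG_BYTES = 2048;

// Sketch: render axios response data for logging without serializing a
// Buffer byte-by-byte or dumping a stream's internal state.
function renderResponseData(data: unknown): string {
  if (Buffer.isBuffer(data)) {
    const text = data.subarray(0, MAX_LOG_BYTES).toString('utf8');
    return data.length > MAX_LOG_BYTES ? `${text}… [truncated]` : text;
  }
  if (data instanceof Readable) {
    // Never walk a live stream for a log line.
    return '[stream]';
  }
  try {
    return JSON.stringify(data);
  } catch {
    return String(data);
  }
}
```

The design point is that the redaction happens at render time, so every caller of the error logger benefits without opting in, which is exactly why routing the `/code/download` catch through the same helper fixes both noise sources at once.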
* ♻️ refactor: extract codeEnv identity helpers into packages/api

Per the project convention that new backend code lives in TypeScript under `packages/api`, this moves `appendCodeEnvFileIdentity` and `buildCodeEnvDownloadQuery` from `api/server/services/Files/Code/crud.js` into a new `packages/api/src/files/code/identity.ts` module. Both helpers are pure validators that mirror codeapi's `parseUploadSessionKeyInput` server-side rules (closed kind set, `version` required for `'skill'` and forbidden otherwise) — they deserve TS support and a dedicated spec rather than living as JSDoc-typed helpers in the legacy `/api` workspace.

The new module:

- Exports a `CodeEnvIdentity` interface using the `librechat-data-provider` `CodeEnvKind` discriminated union.
- Adds 13 unit tests in `identity.spec.ts` covering the validation matrix (skill+version, agent, user, and every rejection path) plus URL encoding for the download query.
- Is re-exported from `packages/api/src/files/code/index.ts` alongside `classify`, `extract`, and `form`.

Consumer updates:

- `api/server/services/Files/Code/crud.js`: drops the local helpers and imports them from `@librechat/api`. Net -64 lines.
- `api/server/services/Files/Code/process.js`: same.
- Test mocks for `@librechat/api` in three spec files now stub the helpers' validation behavior locally rather than pulling them through `requireActual` (which would drag in provider-config init-time side effects). The package's `exports` field only surfaces the root barrel, so leaf imports aren't reachable from legacy `/api` test setup.

No runtime behavior change. Identity validation rules and emitted form/query shapes are byte-for-byte identical pre/post.

* 🪪 fix: emit resource_id alongside id on _injected_files (skill 403 fix)

Companion to the codeapi #1455 fix and agents 3.1.80-dev.1 — the wire shape for shared-kind files now requires a `resource_id` distinct from the storage `id`.
Without this LC change, codeapi's sessionKey re-derivation on every shared-kind /exec rejects with 403 session_key_mismatch:

    cached:  legacy:skill:69dcf561...:v:59 (signed at upload, skill _id)
    derived: legacy:skill:ysPwEURuPk-...:v:59 (storage nanoid)

Emit sites updated:

- `primeInvokedSkills` cache-hit path: `resource_id: ref.id` (the persisted skill `_id` from `codeEnvRef.id`); `id: ref.file_id` unchanged (storage uuid).
- `primeInvokedSkills` fresh-upload path: `resource_id: skill._id.toString()` on every primed file (the `allPrimedFiles` builder type now carries the field).
- `processCodeOutput`'s `pushFile` (Code/process.js): `resource_id: ref.id` — for `kind: 'user'` this is informational (codeapi derives the sessionKey from the auth context) but emitted for shape uniformity with shared kinds.

Bumps `@librechat/agents` to `^3.1.80-dev.1` (the version that ships the matching `CodeEnvFile.resource_id` field).

## Test plan

- [x] `cd packages/api && npx jest src/agents` — 67 / 67 pass (skillFiles fixtures updated to assert `resource_id` on the emitted CodeSessionContext.files).
- [x] `cd api && npx jest server/services/Files server/controllers/agents` — 445 / 445 pass (process.spec fixtures updated for the reupload + cache-hit emission).
- [x] `npx tsc --noEmit -p packages/api/tsconfig.json` — clean.

* fix(skill-tool-call): carry resource_id through primeSkillFiles → artifact

Codeapi was 400ing every /exec following a `handle_skill` tool call with `resource_id is invalid` (`type: 'undefined'`). Both code paths in `primeSkillFiles` (cache-hit + fresh-upload) returned files without `resource_id`/`kind`/`version`, and the artifact in `handlers.ts` forwarded the stripped shape into `tc.codeSessionContext.files` → `_injected_files`.

`primeInvokedSkills` (the NL-detected loader) had already been fixed end-to-end; this commit aligns the tool-invoked path with the same contract: `resource_id` = `skill._id.toString()`, `kind: 'skill'`, `version` = the skill's monotonic counter.
Tests added to `skillFiles.spec.ts` lock the contract on `primeSkillFiles` directly so future refactors can't silently drop the resource identity again.

* fix(handlers.spec): align session_id → storage_session_id rename + kind discriminator

Pre-existing TS errors against the post-rename `CodeEnvFile` shape: the test file still used `session_id` on per-file objects (renamed to `storage_session_id` in agents Phase B/C) and was missing the `kind` discriminator the discriminated union requires. Both the inputs and the matching `expect.toEqual(...)` mirrors were updated together so the runtime equality check still holds. Lines 723-732 stay as-is — they sit behind `as unknown as ToolCallRequest` and TS already skipped them.

* chore: fix `@librechat/agents`, correct version to 3.1.80-dev.0 in package.json files

* chore: bump `@librechat/agents` to version 3.1.80-dev.1 in package.json and package-lock.json

* chore: bump `@librechat/agents` to version 3.1.80-dev.2

* feat(observability): trace file priming chain from primeCodeFiles to _injected_files

Diagnosing the user-upload "files=[] on first /exec" bug requires seeing where in the LC chain a file ref disappears. Prior to this patch the chain (primeCodeFiles → primedCodeFiles → initialSessions → CodeSessionContext → _injected_files) was opaque end-to-end:

- primeCodeFiles silently dropped files without `metadata.codeEnvRef`
- reuploadFile catches all errors and continues with no signal
- the handlers.ts handoff to codeapi never logged what it was sending

After this patch, a single grep on `[primeCodeFiles]` plus `[code-env:inject]` shows the full per-file path:

    [primeCodeFiles] in: file_ids=N resourceFiles=M
    [primeCodeFiles] file=<id> path=skip reason=no-codeenvref filename=...
    [primeCodeFiles] file=<id> path=cache-hit-by-session storage_session_id=...
    [primeCodeFiles] file=<id> path=reupload reason=no-uploadtime ...
    [primeCodeFiles] file=<id> path=reupload reason=stale ...
    [primeCodeFiles] file=<id> path=reupload-success oldSession=... newSession=... newFileId=...
    [primeCodeFiles] file=<id> path=reupload-failed session=...
    [primeCodeFiles] file=<id> path=fresh-active storage_session_id=...
    [primeCodeFiles] out: returned=N skippedNoRef=M reuploadFailures=K
    [code-env:inject] tool=<name> files=N missingResourceId=K (debug)
    [code-env:inject] M/N files missing resource_id ... (warn)
    [code-env:inject] tool=<name> _injected_files=0 ... (warn)

The boundary log warns when LC sends zero injected files on a code-execution tool call — that's the user's actual symptom showing up on the LC side, instead of having to correlate against codeapi's `Request received { files: [] }`.

The tag was chosen as `[code-env:inject]` rather than `[handoff:exec]` to avoid collision with the app-level "handoff" semantic (subagent handoff workflow).

Structural cleanup in primeFiles: replaced the `if (ref) { ... }` nesting with an early `if (!ref) continue`, so the per-path instrumentation hooks land at top-level scope instead of indented inside a conditional. Behavior unchanged; pushFile / reuploadFile identical.

Spec fixtures (handlers.spec.ts, codeFilesSession.spec.ts) updated to include `resource_id` on `CodeEnvFile` literals — required by the post-3.1.80-dev.2 type now installed.
## Test plan

- [x] `cd packages/api && npx jest src/agents/handlers.spec.ts src/agents/codeFilesSession.spec.ts src/agents/skillFiles.spec.ts` — 69/69 pass
- [x] `cd api && npx jest server/services/Files/Code/process.spec.js` — 84/84 pass
- [x] `npx tsc --noEmit -p packages/api` — clean
- [x] `npx eslint` on all four touched files — clean

* chore: add CONSOLE_JSON_STRING_LENGTH to .env.example for JSON log string length configuration

* fix(files): align codeapi upload filename with LC's sanitized DB filename

User-attached files for code execution were uploading to codeapi under `file.originalname` (the raw upload filename, which may contain spaces / special chars) while LC's DB record stored the sanitized form (`sanitizeFilename(file.originalname)`, underscores). Codeapi preserves whatever filename the upload sent, so the sandbox saw `/mnt/data/<originalname>` while LC's `primeFiles` toolContext text + `_injected_files.name` referenced `file.filename` (sanitized).

Visible failure: the agent gets a system prompt saying

    /mnt/data/librechat_code_api_-_active_customer_-_2025-11-05.xlsx

…tries that path, hits `FileNotFoundError`, then notices the sandbox's actual `Available files` line says

    /mnt/data/librechat code api - active customer - 2025-11-05.xlsx

…retries with spaces, and succeeds. This wastes a tool call per upload and leaks raw filenames into model context.

Fix: sanitize once and use the sanitized form in both the codeapi upload AND the LC DB record. Sandbox path = LC toolContext text = in-memory ref name. No drift.

The reupload path (`Code/process.js` line 867, `filename: file.filename`) already uses the sanitized DB name, so it stays consistent with the fresh-upload path after this change.

## Test plan

- [x] `cd api && npx jest server/services/Files/process` — 32/32 pass
- [x] `npx eslint` on the touched file — clean

* chore: bump `@librechat/agents` to version 3.1.80-dev.3 in package.json and package-lock.json
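The filename invariant from the last fix — sanitize once, then use that one value everywhere — can be illustrated with a sketch. `sanitizeFilename` here is a stand-in (LC's real sanitizer lives elsewhere and its exact character rules are not specified above); the point is the single-source-of-truth shape.

```typescript
// Stand-in sanitizer, assumed behavior: non-safe characters become underscores.
// Not LC's actual sanitizeFilename implementation.
function sanitizeFilename(name: string): string {
  return name.replace(/[^A-Za-z0-9._-]/g, '_');
}

interface PreparedUpload {
  uploadFilename: string; // sent to codeapi → becomes /mnt/data/<name> in the sandbox
  dbFilename: string;     // persisted on LC's File record and echoed in toolContext
}

// Sanitize exactly once so the sandbox path, LC's toolContext text, and
// _injected_files.name can never drift apart.
function prepareCodeEnvUpload(originalname: string): PreparedUpload {
  const sanitized = sanitizeFilename(originalname);
  return { uploadFilename: sanitized, dbFilename: sanitized };
}
```

With the pre-fix behavior (`uploadFilename = originalname`, `dbFilename = sanitized`), any filename containing a space produced two different `/mnt/data` paths, which is exactly the wasted-tool-call failure the commit describes.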
This commit is contained in:
parent 9441563b95
commit 93c4ef4ba8
40 changed files with 1937 additions and 456 deletions
@@ -40,6 +40,15 @@ jest.mock('@librechat/api', () => {
      * inline (non-finalize) path so existing assertions on a single
      * createFile call hold. */
     hasOfficeHtmlPath: jest.fn(() => false),
+    /* Identity-helper stub mirroring `packages/api/src/files/code/identity.ts`.
+     * `processCodeOutput` calls this for every output download URL;
+     * traversal cases don't care about the query shape, just that it
+     * returns something concatable. */
+    buildCodeEnvDownloadQuery: jest.fn(({ kind, id, version }) => {
+      const params = new URLSearchParams({ kind, id });
+      if (version != null) params.set('version', String(version));
+      return `?${params.toString()}`;
+    }),
     codeServerHttpAgent: new http.Agent({ keepAlive: false }),
     codeServerHttpsAgent: new https.Agent({ keepAlive: false }),
   };
@@ -7,6 +7,8 @@ const {
   createAxiosInstance,
   codeServerHttpAgent,
   codeServerHttpsAgent,
+  appendCodeEnvFileIdentity,
+  buildCodeEnvDownloadQuery,
 } = require('@librechat/api');
 
 const axios = createAxiosInstance();
@@ -16,16 +18,22 @@ const MAX_FILE_SIZE = 150 * 1024 * 1024;
 /**
  * Retrieves a download stream for a specified file.
  * @param {string} fileIdentifier - The identifier for the file (e.g., "session_id/fileId").
+ * @param {{ kind: 'skill' | 'agent' | 'user'; id: string; version?: number }} identity
+ *   Resource identity required by codeapi's `sessionAuth` to derive the
+ *   matching sessionKey. For code-output downloads this is always
+ *   `kind: 'user', id: <userId>`; for skill/agent re-downloads pass
+ *   the kind+id (+version for skill) from the file's `metadata.codeEnvRef`.
  * @returns {Promise<AxiosResponse>} A promise that resolves to a readable stream of the file content.
  * @throws {Error} If there's an error during the download process.
  */
-async function getCodeOutputDownloadStream(fileIdentifier) {
+async function getCodeOutputDownloadStream(fileIdentifier, identity) {
   try {
     const baseURL = getCodeBaseURL();
+    const query = buildCodeEnvDownloadQuery(identity);
     /** @type {import('axios').AxiosRequestConfig} */
     const options = {
       method: 'get',
-      url: `${baseURL}/download/${fileIdentifier}`,
+      url: `${baseURL}/download/${fileIdentifier}${query}`,
       responseType: 'stream',
       headers: {
         'User-Agent': 'LibreChat/1.0',
@@ -49,20 +57,31 @@ async function getCodeOutputDownloadStream(fileIdentifier) {
 
 /**
  * Uploads a file to the Code Environment server.
  *
+ * `kind`/`id`/`version?` are required so codeapi can route the upload to
+ * the correct sessionKey bucket — `<tenant>:<kind>:<id>[:v:<version>]`
+ * for shared kinds, `<tenant>:user:<authContext.userId>` for `user`.
+ * Without these, codeapi falls back to user-scoped bucketing regardless
+ * of the resource the file belongs to, so skill-cache invalidation
+ * (driven by the version bump on edit) never fires. See codeapi #1455.
+ *
  * @param {Object} params - The params object.
  * @param {ServerRequest} params.req - The request object from Express. It should have a `user` property with an `id` representing the user
  * @param {import('fs').ReadStream | import('stream').Readable} params.stream - The read stream for the file.
  * @param {string} params.filename - The name of the file.
- * @param {string} [params.entity_id] - Optional entity ID for the file.
- * @returns {Promise<string>}
+ * @param {'skill' | 'agent' | 'user'} params.kind - Resource kind that owns this file's storage session.
+ * @param {string} params.id - Resource id (skillId / agentId / userId). Codeapi
+ *   ignores this for `kind: 'user'` (auth context provides userId), but it's
+ *   sent uniformly for shape symmetry with the discriminated union.
+ * @param {number} [params.version] - Required when `kind === 'skill'`; absent otherwise.
+ * @returns {Promise<{ storage_session_id: string; file_id: string }>}
+ *   The codeapi storage location of the uploaded file.
  * @throws {Error} If there's an error during the upload process.
  */
-async function uploadCodeEnvFile({ req, stream, filename, entity_id = '' }) {
+async function uploadCodeEnvFile({ req, stream, filename, kind, id, version }) {
   try {
     const form = new FormData();
-    if (entity_id.length > 0) {
-      form.append('entity_id', entity_id);
-    }
+    appendCodeEnvFileIdentity(form, { kind, id, version });
     appendCodeEnvFile(form, stream, filename);
 
     const baseURL = getCodeBaseURL();
@@ -83,18 +102,16 @@ async function uploadCodeEnvFile({ req, stream, filename, entity_id = '' }) {
 
     const response = await axios.post(`${baseURL}/upload`, form, options);
 
-    /** @type {{ message: string; session_id: string; files: Array<{ fileId: string; filename: string }> }} */
+    /** @type {{ message: string; storage_session_id: string; files: Array<{ fileId: string; filename: string }> }} */
     const result = response.data;
     if (result.message !== 'success') {
       throw new Error(`Error uploading file: ${result.message}`);
     }
 
-    const fileIdentifier = `${result.session_id}/${result.files[0].fileId}`;
-    if (entity_id.length === 0) {
-      return fileIdentifier;
-    }
-
-    return `${fileIdentifier}?entity_id=${entity_id}`;
+    return {
+      storage_session_id: result.storage_session_id,
+      file_id: result.files[0].fileId,
+    };
   } catch (error) {
     throw new Error(
       logAxiosError({
@@ -109,25 +126,28 @@ async function uploadCodeEnvFile({ req, stream, filename, entity_id = '' }) {
  * Uploads multiple files to the code execution environment in a single request.
  * Uses the /upload/batch endpoint which shares one session_id across all files.
  *
+ * `kind`/`id`/`version?` carry the resource identity for codeapi's sessionKey
+ * derivation — see `uploadCodeEnvFile` for the full motivation.
+ *
  * @param {object} params
  * @param {import('express').Request & { user: { id: string } }} params.req - The request object.
  * @param {Array<{ stream: NodeJS.ReadableStream; filename: string }>} params.files - Files to upload.
- * @param {string} [params.entity_id] - Optional entity ID.
+ * @param {'skill' | 'agent' | 'user'} params.kind - Resource kind that owns the batch's storage session.
+ * @param {string} params.id - Resource id (skillId / agentId / userId).
+ * @param {number} [params.version] - Required when `kind === 'skill'`; absent otherwise.
  * @param {boolean} [params.read_only] - When true, codeapi tags every file in
  *   the batch as infrastructure (e.g. skill files). The flag is persisted as
  *   MinIO object metadata (`X-Amz-Meta-Read-Only`) and travels with the file
  *   through subsequent download/walk passes — sandboxed-code modifications
  *   are dropped on the floor and the original ref is echoed back as
  *   `inherited: true`, never as a generated artifact.
- * @returns {Promise<{ session_id: string; files: Array<{ fileId: string; filename: string }> }>}
+ * @returns {Promise<{ storage_session_id: string; files: Array<{ fileId: string; filename: string }> }>}
  * @throws {Error} If the batch upload fails entirely.
  */
-async function batchUploadCodeEnvFiles({ req, files, entity_id = '', read_only = false }) {
+async function batchUploadCodeEnvFiles({ req, files, kind, id, version, read_only = false }) {
   try {
     const form = new FormData();
-    if (entity_id.length > 0) {
-      form.append('entity_id', entity_id);
-    }
+    appendCodeEnvFileIdentity(form, { kind, id, version });
     if (read_only) {
       form.append('read_only', 'true');
     }
@@ -153,12 +173,12 @@ async function batchUploadCodeEnvFiles({ req, files, entity_id = '', read_only =
 
     const response = await axios.post(`${baseURL}/upload/batch`, form, options);
 
-    /** @type {{ message: string; session_id: string; files: Array<{ status: string; fileId?: string; filename: string; error?: string }>; succeeded: number; failed: number }} */
+    /** @type {{ message: string; storage_session_id: string; files: Array<{ status: string; fileId?: string; filename: string; error?: string }>; succeeded: number; failed: number }} */
     const result = response.data;
     if (
       !result ||
       typeof result !== 'object' ||
-      !result.session_id ||
+      !result.storage_session_id ||
       !Array.isArray(result.files)
     ) {
       throw new Error(`Unexpected batch upload response: ${JSON.stringify(result).slice(0, 200)}`);
@@ -179,7 +199,7 @@ async function batchUploadCodeEnvFiles({ req, files, entity_id = '', read_only =
       .filter((f) => f.status === 'success' && f.fileId)
       .map((f) => ({ fileId: f.fileId, filename: f.filename }));
 
-    return { session_id: result.session_id, files: successFiles };
+    return { storage_session_id: result.storage_session_id, files: successFiles };
   } catch (error) {
     throw new Error(
       logAxiosError({
@@ -190,4 +210,8 @@ async function batchUploadCodeEnvFiles({ req, files, entity_id = '', read_only =
   }
 }
 
-module.exports = { getCodeOutputDownloadStream, uploadCodeEnvFile, batchUploadCodeEnvFiles };
+module.exports = {
+  getCodeOutputDownloadStream,
+  uploadCodeEnvFile,
+  batchUploadCodeEnvFiles,
+};
@@ -9,6 +9,26 @@ jest.mock('@librechat/agents', () => ({
   getCodeBaseURL: jest.fn(() => 'https://code-api.example.com'),
 }));
 
+/* Inline the identity helpers' validation rules instead of pulling
+ * them through `@librechat/api`'s root barrel (which has init-time
+ * provider-config side effects that don't matter here) or its leaf
+ * module (the package's `exports` field only surfaces the root).
+ * The real implementation lives in `packages/api/src/files/code/identity.ts`
+ * and has its own dedicated `identity.spec.ts` covering the validation
+ * matrix; this stub just mirrors enough behavior for the surrounding
+ * crud tests to exercise the upload/download flow. */
+const VALID_KINDS = new Set(['skill', 'agent', 'user']);
+const validateIdentity = ({ kind, id, version }, label) => {
+  if (!kind || !VALID_KINDS.has(kind)) throw new Error(`${label}: invalid kind "${kind}"`);
+  if (!id) throw new Error(`${label}: missing id for kind "${kind}"`);
+  if (kind === 'skill' && version == null) {
+    throw new Error(`${label}: kind "skill" requires a numeric version`);
+  }
+  if (kind !== 'skill' && version != null) {
+    throw new Error(`${label}: version is only valid for kind "skill"`);
+  }
+};
+
 jest.mock('@librechat/api', () => {
   const http = require('http');
   const https = require('https');
@@ -16,6 +36,18 @@ jest.mock('@librechat/api', () => {
     appendCodeEnvFile: jest.fn((form, stream, filename) => {
       form.append('file', stream, { filename });
     }),
+    appendCodeEnvFileIdentity: jest.fn((form, identity) => {
+      validateIdentity(identity, 'appendCodeEnvFileIdentity');
+      form.append('kind', identity.kind);
+      form.append('id', identity.id);
+      if (identity.version != null) form.append('version', String(identity.version));
+    }),
+    buildCodeEnvDownloadQuery: jest.fn((identity) => {
+      validateIdentity(identity, 'buildCodeEnvDownloadQuery');
+      const params = new URLSearchParams({ kind: identity.kind, id: identity.id });
+      if (identity.version != null) params.set('version', String(identity.version));
+      return `?${params.toString()}`;
+    }),
     logAxiosError: jest.fn(({ message }) => message),
     createAxiosInstance: jest.fn(() => mockAxios),
     codeServerHttpAgent: new http.Agent({ keepAlive: false }),
@@ -32,11 +64,17 @@ describe('Code CRUD', () => {
   });
 
   describe('getCodeOutputDownloadStream', () => {
+    /* Code-output downloads always carry `kind: 'user'` + `id: <userId>`
+     * — codeapi's `sessionAuth` rejects without them post-Phase C. The
+     * fixture mirrors what `processCodeOutput` and the `/code/download`
+     * route pass in production. */
+    const userIdentity = { kind: 'user', id: 'user-123' };
+
     it('should pass dedicated keepAlive:false agents to axios', async () => {
       const mockResponse = { data: Readable.from(['chunk']) };
       mockAxios.mockResolvedValue(mockResponse);
 
-      await getCodeOutputDownloadStream('session-1/file-1');
+      await getCodeOutputDownloadStream('session-1/file-1', userIdentity);
 
       const callConfig = mockAxios.mock.calls[0][0];
       expect(callConfig.httpAgent).toBe(codeServerHttpAgent);
@@ -50,18 +88,52 @@ describe('Code CRUD', () => {
    it('should request stream response from the correct URL', async () => {
      mockAxios.mockResolvedValue({ data: Readable.from(['chunk']) });

      await getCodeOutputDownloadStream('session-1/file-1');
      await getCodeOutputDownloadStream('session-1/file-1', userIdentity);

      const callConfig = mockAxios.mock.calls[0][0];
      expect(callConfig.url).toBe('https://code-api.example.com/download/session-1/file-1');
      /* URL carries `?kind=user&id=<userId>` so codeapi's `sessionAuth`
       * can reconstruct the matching `<tenant>:user:<userId>` sessionKey
       * (Phase C / option α). */
      expect(callConfig.url).toBe(
        'https://code-api.example.com/download/session-1/file-1?kind=user&id=user-123',
      );
      expect(callConfig.responseType).toBe('stream');
      expect(callConfig.timeout).toBe(15000);
    });

    it('forwards skill identity (kind/id/version) when re-downloading a primed skill file', async () => {
      mockAxios.mockResolvedValue({ data: Readable.from(['chunk']) });

      await getCodeOutputDownloadStream('session-2/file-x', {
        kind: 'skill',
        id: 'skill-abc',
        version: 7,
      });

      const callConfig = mockAxios.mock.calls[0][0];
      expect(callConfig.url).toBe(
        'https://code-api.example.com/download/session-2/file-x?kind=skill&id=skill-abc&version=7',
      );
    });

    it('rejects skill identity without a version (mirrors codeapi validator)', async () => {
      await expect(
        getCodeOutputDownloadStream('s/f', { kind: 'skill', id: 'skill-abc' }),
      ).rejects.toThrow(/skill.*version/);
      expect(mockAxios).not.toHaveBeenCalled();
    });

    it('rejects unknown kind without dispatching to codeapi', async () => {
      await expect(getCodeOutputDownloadStream('s/f', { kind: 'system', id: 'x' })).rejects.toThrow(
        /invalid kind/,
      );
      expect(mockAxios).not.toHaveBeenCalled();
    });

    it('should throw on network error', async () => {
      mockAxios.mockRejectedValue(new Error('ECONNREFUSED'));

      await expect(getCodeOutputDownloadStream('s/f')).rejects.toThrow();
      await expect(getCodeOutputDownloadStream('s/f', userIdentity)).rejects.toThrow();
    });
  });
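The PR notes describe the sessionKey shapes these query params feed: `<tenant>:<kind>:<id>` (with a version segment for skills) for shared kinds, and `<tenant>:user:<userId>` for user-private buckets. The sketch below is hypothetical, derived only from that description; the separator before the version segment is assumed to be `:` (the character is garbled in the PR text), and codeapi's actual derivation may differ.

```javascript
// Hypothetical sketch of sessionKey derivation per the PR description.
// Not codeapi's implementation; segment separators are assumptions.
function deriveSessionKey(tenantId, ref, authUserId) {
  switch (ref.kind) {
    case 'skill':
      // Bumping skill.version changes the key, invalidating the old cache entry.
      return `${tenantId}:skill:${ref.id}:${ref.version}`;
    case 'agent':
      return `${tenantId}:agent:${ref.id}`;
    case 'user':
      // `ref.id` is informational here; identity comes from auth context.
      return `${tenantId}:user:${authUserId}`;
    default:
      throw new Error(`invalid kind "${ref.kind}"`);
  }
}
```

This is why the tests above insist on `?kind=...&id=...`: without them, `sessionAuth` cannot rebuild the key to match against the bucket.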
@@ -70,13 +142,15 @@ describe('Code CRUD', () => {
      req: { user: { id: 'user-123' } },
      stream: Readable.from(['file-content']),
      filename: 'data.csv',
      kind: 'user',
      id: 'user-123',
    };

    it('should pass dedicated keepAlive:false agents to axios', async () => {
      mockAxios.post.mockResolvedValue({
        data: {
          message: 'success',
          session_id: 'sess-1',
          storage_session_id: 'sess-1',
          files: [{ fileId: 'fid-1', filename: 'data.csv' }],
        },
      });
@@ -96,7 +170,7 @@ describe('Code CRUD', () => {
      mockAxios.post.mockResolvedValue({
        data: {
          message: 'success',
          session_id: 'sess-1',
          storage_session_id: 'sess-1',
          files: [{ fileId: 'fid-1', filename: 'data.csv' }],
        },
      });
@@ -107,35 +181,106 @@ describe('Code CRUD', () => {
      expect(callConfig.timeout).toBe(120000);
    });

    it('should return fileIdentifier on success', async () => {
    it('should return { storage_session_id, file_id } on success', async () => {
      mockAxios.post.mockResolvedValue({
        data: {
          message: 'success',
          session_id: 'sess-1',
          storage_session_id: 'sess-1',
          files: [{ fileId: 'fid-1', filename: 'data.csv' }],
        },
      });

      const result = await uploadCodeEnvFile(baseUploadParams);
      expect(result).toBe('sess-1/fid-1');
      expect(result).toEqual({ storage_session_id: 'sess-1', file_id: 'fid-1' });
    });

    it('should append entity_id query param when provided', async () => {
      mockAxios.post.mockResolvedValue({
    /* Phase C / option α (codeapi #1455): the upload wire carries the
     * resource identity codeapi uses for sessionKey derivation. Without
     * these on the form, codeapi falls back to user bucketing for every
     * upload and skill-cache invalidation never fires. Validation runs
     * client-side too so a bad caller fails fast instead of round-tripping
     * a 400. */
    describe('codeapi resource identity (kind/id/version)', () => {
      const FormData = require('form-data');
      const successResponse = {
        data: {
          message: 'success',
          session_id: 'sess-1',
          storage_session_id: 'sess-1',
          files: [{ fileId: 'fid-1', filename: 'data.csv' }],
        },
      };
      let appendSpy;

      beforeEach(() => {
        /* Spying on the prototype lets us assert form fields without
         * materializing the multipart body — `form.getBuffer()` would
         * fail on the file-stream entry, but we don't care about the
         * stream here, only the identity fields that ride beside it. */
        appendSpy = jest.spyOn(FormData.prototype, 'append');
      });

      const result = await uploadCodeEnvFile({ ...baseUploadParams, entity_id: 'agent-42' });
      expect(result).toBe('sess-1/fid-1?entity_id=agent-42');
      afterEach(() => {
        appendSpy.mockRestore();
      });

      const fieldsAppended = () =>
        appendSpy.mock.calls
          .filter((call) => typeof call[1] === 'string' || typeof call[1] === 'number')
          .reduce((acc, [name, value]) => ({ ...acc, [name]: value }), {});

      it('forwards kind, id, and (when skill) version on the multipart form', async () => {
        mockAxios.post.mockResolvedValue(successResponse);

        await uploadCodeEnvFile({
          ...baseUploadParams,
          kind: 'skill',
          id: 'skill-42',
          version: 7,
        });

        expect(fieldsAppended()).toEqual({ kind: 'skill', id: 'skill-42', version: '7' });
      });

      it('omits version on the form for non-skill kinds', async () => {
        mockAxios.post.mockResolvedValue(successResponse);

        await uploadCodeEnvFile({ ...baseUploadParams, kind: 'agent', id: 'agent-9' });

        const fields = fieldsAppended();
        expect(fields).toEqual({ kind: 'agent', id: 'agent-9' });
        expect(fields).not.toHaveProperty('version');
      });

      it('rejects unknown kind without dispatching to codeapi', async () => {
        await expect(
          uploadCodeEnvFile({ ...baseUploadParams, kind: 'system', id: 'x' }),
        ).rejects.toThrow(/invalid kind/);
        expect(mockAxios.post).not.toHaveBeenCalled();
      });

      it('rejects skill upload without a version (mirrors codeapi validator)', async () => {
        await expect(
          uploadCodeEnvFile({ ...baseUploadParams, kind: 'skill', id: 'skill-42' }),
        ).rejects.toThrow(/skill.*version/);
        expect(mockAxios.post).not.toHaveBeenCalled();
      });

      it('rejects version on non-skill kinds (mirrors codeapi validator)', async () => {
        await expect(
          uploadCodeEnvFile({
            ...baseUploadParams,
            kind: 'agent',
            id: 'agent-9',
            version: 3,
          }),
        ).rejects.toThrow(/version.*skill/);
        expect(mockAxios.post).not.toHaveBeenCalled();
      });
    });

    it('should throw when server returns non-success message', async () => {
      mockAxios.post.mockResolvedValue({
        data: { message: 'quota_exceeded', session_id: 's', files: [] },
        data: { message: 'quota_exceeded', storage_session_id: 's', files: [] },
      });

      await expect(uploadCodeEnvFile(baseUploadParams)).rejects.toThrow('quota_exceeded');
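The `fieldsAppended` helper in the tests above can be read in isolation: from the recorded `append(name, value)` calls it keeps only primitive fields (the identity values) and drops the stream entry (the file body). A standalone sketch, with a plain array standing in for `appendSpy.mock.calls`:

```javascript
// Sketch of the fieldsAppended reducer from the tests above.
// `appendCalls` is an array of [name, value] pairs, as Jest records them.
function collectFields(appendCalls) {
  return appendCalls
    .filter((call) => typeof call[1] === 'string' || typeof call[1] === 'number')
    .reduce((acc, [name, value]) => ({ ...acc, [name]: value }), {});
}
```

The primitive-type filter is what lets the spy coexist with the file-stream entry that `form.getBuffer()` would choke on.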
@@ -16,6 +16,7 @@ const {
  extractCodeArtifactText,
  getExtractedTextFormat,
  getStorageMetadata,
  buildCodeEnvDownloadQuery,
} = require('@librechat/api');
const {
  Tools,
@@ -286,7 +287,7 @@ const runPreviewFinalize = ({ finalize, fileId, previewRevision, onResolved }) =
/**
 * Process code execution output files — downloads and saves both images
 * and non-image files. All files are saved to local storage with
 * `fileIdentifier` metadata for code env re-upload.
 * `codeEnvRef` metadata for code env re-upload.
 *
 * Returns a two-part shape so callers can ship the attachment to the
 * client immediately and run preview extraction in the background:
@@ -334,9 +335,15 @@ const processCodeOutput = async ({

  try {
    const formattedDate = currentDate.toISOString();
    /* Code-output files are always user-private — no skill execution
     * produces a skill-scoped output bucket. The download URL must
     * carry `?kind=user&id=<userId>` so codeapi's `sessionAuth`
     * resolves the matching `<tenant>:user:<userId>` sessionKey. See
     * codeapi #1455 / Phase C. */
    const downloadQuery = buildCodeEnvDownloadQuery({ kind: 'user', id: req.user.id });
    const response = await axios({
      method: 'get',
      url: `${baseURL}/download/${session_id}/${id}`,
      url: `${baseURL}/download/${session_id}/${id}${downloadQuery}`,
      responseType: 'arraybuffer',
      headers: {
        'User-Agent': 'LibreChat/1.0',
@@ -366,7 +373,15 @@ const processCodeOutput = async ({
    };
  }

  const fileIdentifier = `${session_id}/${id}`;
  /* Code-output files belong to the user who ran the execution.
   * SessionKey on codeapi will be `<tenant>:user:<userId>` for these,
   * so cache and access stay user-private. */
  const codeEnvRef = {
    kind: 'user',
    id: req.user.id,
    storage_session_id: session_id,
    file_id: id,
  };

  /* `safeName` keeps the directory structure (`a/b/file.txt` -> `a/b/file.txt`)
   * so the next prime() can place the file at the same nested path in the
@@ -444,7 +459,7 @@ const processCodeOutput = async ({
    updatedAt: formattedDate,
    source: appConfig.fileStrategy,
    context: FileContext.execute_code,
    metadata: { fileIdentifier },
    metadata: { codeEnvRef },
  };
  await createFile(file, true);
  return { file: Object.assign(file, { messageId, toolCallId }) };
@@ -542,7 +557,7 @@ const processCodeOutput = async ({
    tenantId: req.user.tenantId,
    bytes: buffer.length,
    updatedAt: formattedDate,
    metadata: { fileIdentifier },
    metadata: { codeEnvRef },
    source: appConfig.fileStrategy,
    context: FileContext.execute_code,
    usage: isUpdate ? (claimed.usage ?? 0) + 1 : 1,
@@ -651,26 +666,31 @@ function checkIfActive(dateString) {
/**
 * Retrieves the `lastModified` time string for a specified file from Code Execution Server.
 *
 * @param {string} fileIdentifier - The identifier for the file (e.g., "session_id/fileId").
 * @param {import('librechat-data-provider').CodeEnvRef} ref - Typed pointer
 *   into codeapi storage. Carries kind/id/storage_session_id/file_id;
 *   codeapi resolves the sessionKey from the request's auth context.
 *
 * @returns {Promise<string|null>}
 *   A promise that resolves to the `lastModified` time string of the file if successful, or null if there is an
 *   error in initialization or fetching the info.
 */
async function getSessionInfo(fileIdentifier) {
async function getSessionInfo(ref) {
  try {
    const baseURL = getCodeBaseURL();
    const [path, queryString] = fileIdentifier.split('?');
    const [session_id, fileId] = path.split('/');
    let queryParams = {};
    if (queryString) {
      queryParams = Object.fromEntries(new URLSearchParams(queryString).entries());
    }

    /* `/sessions/.../objects/...` is gated by codeapi's `sessionAuth`
     * middleware (post-Phase C). The middleware reconstructs the
     * sessionKey from the URL query (`kind`/`id`/`version?`) plus the
     * requester's auth context, then matches it against the cached
     * sessionKey on the storage bucket. We have the full `codeEnvRef`
     * here, so pass kind+id (+version when skill) directly. */
    const query = buildCodeEnvDownloadQuery({
      kind: ref.kind,
      id: ref.id,
      ...(ref.kind === 'skill' ? { version: ref.version } : {}),
    });
    const response = await axios({
      method: 'get',
      url: `${baseURL}/sessions/${session_id}/objects/${fileId}`,
      params: queryParams,
      url: `${baseURL}/sessions/${ref.storage_session_id}/objects/${ref.file_id}${query}`,
      headers: {
        'User-Agent': 'LibreChat/1.0',
      },
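The hunk above references `checkIfActive(dateString)` but its body falls outside the diff. A minimal sketch of such a freshness gate, assuming a hypothetical 10-minute sandbox-session window (the project's actual cutoff is not shown here and may differ):

```javascript
// Hypothetical freshness gate; the 10-minute window is an assumption,
// not the project's actual value.
const SESSION_ACTIVE_WINDOW_MS = 10 * 60 * 1000;

function checkIfActive(dateString, nowMs = Date.now()) {
  const uploaded = Date.parse(dateString);
  if (Number.isNaN(uploaded)) return false; // unparsable -> treat as stale
  return nowMs - uploaded < SESSION_ACTIVE_WINDOW_MS;
}
```

Whatever the real window is, the caller's contract is the same: a falsy result routes the file into `reuploadFile()` below.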
@@ -706,6 +726,15 @@ const primeFiles = async (options) => {
  const agentResourceIds = new Set(file_ids);
  const resourceFiles = tool_resources?.[EToolResources.execute_code]?.files ?? [];

  /* Step 1 of the priming trace: input volume. Pair with the
   * per-file `[primeCodeFiles] file=...` lines and the final
   * `[primeCodeFiles] returned=...` line below to locate which
   * layer drops a file the sandbox doesn't end up seeing. */
  logger.debug(
    `[primeCodeFiles] in: file_ids=${file_ids.length} resourceFiles=${resourceFiles.length}`,
    { agentId, file_ids, resourceFileIds: resourceFiles.map((f) => f?.file_id) },
  );

  // Get all files first
  const allFiles = (await getFiles({ file_id: { $in: file_ids } }, null, { text: 0 })) ?? [];
@@ -728,146 +757,195 @@ const primeFiles = async (options) => {
  const sessions = new Map();
  let toolContext = '';

  /* Per-file path counters — emitted at the bottom so a single
   * grep on `[primeCodeFiles]` shows the input volume, the per-file
   * paths taken, and the final dispatch summary in one trace. */
  let skippedNoRef = 0;
  let reuploadFailures = 0;

  for (let i = 0; i < dbFiles.length; i++) {
    const file = dbFiles[i];
    if (!file) {
      continue;
    }

    if (file.metadata.fileIdentifier) {
      const [path, queryString] = file.metadata.fileIdentifier.split('?');
      const [session_id, id] = path.split('/');

      let queryParams = {};
      if (queryString) {
        queryParams = Object.fromEntries(new URLSearchParams(queryString).entries());
      }

      /**
       * `pushFile` accepts optional overrides so the reupload path can
       * push the FRESH `(session_id, id, entity_id)` parsed off the new
       * `fileIdentifier`. Without these overrides, the closure would
       * capture the stale pre-reupload refs from the outer loop and
       * the in-memory `files` array (now consumed by
       * `buildInitialToolSessions` to seed `Graph.sessions`) would
       * point at a sandbox object that no longer exists. The DB record
       * gets the new identifier via `updateFile`, but the seed would
       * still inject the old one — bash_tool / read_file would 404
       * trying to mount the file until the next turn re-reads metadata.
       *
       * `entity_id` is forwarded so codeapi can resolve sessionKey
       * per-file, allowing one execute to mix files uploaded under
       * different entities (e.g. a skill bundle plus a user attachment).
       */
      const pushFile = (overrideSessionId, overrideId, overrideEntityId) => {
        if (!toolContext) {
          toolContext = `- Note: The following files are available in the "${Tools.execute_code}" tool environment:`;
        }

        let fileSuffix = '';
        if (!agentResourceIds.has(file.file_id)) {
          fileSuffix =
            file.context === FileContext.execute_code
              ? ' (from previous code execution)'
              : ' (attached by user)';
        }

        const entity_id = overrideEntityId ?? queryParams.entity_id;

        /* Surface the preview lifecycle so the LLM knows when a
         * prior-turn artifact's rich preview didn't materialize. The
         * file blob is always available (`processCodeOutput` persists
         * it before returning), so the model can still tell the user
         * "you can download it" even when the preview never resolved.
         * Absent status means legacy or non-office — render normally. */
        let previewSuffix = '';
        if (file.status === 'pending') {
          previewSuffix = ' (preview not yet generated)';
        } else if (file.status === 'failed') {
          previewSuffix = file.previewError
            ? ` (preview unavailable: ${file.previewError})`
            : ' (preview unavailable)';
        }

        toolContext += `\n\t- /mnt/data/${file.filename}${fileSuffix}${previewSuffix}`;
        files.push({
          id: overrideId ?? id,
          session_id: overrideSessionId ?? session_id,
          name: file.filename,
          ...(entity_id ? { entity_id } : {}),
        });
      };

      if (sessions.has(session_id)) {
        pushFile();
        continue;
      }

      const reuploadFile = async () => {
        try {
          const { getDownloadStream } = getStrategyFunctions(file.source);
          const { handleFileUpload: uploadCodeEnvFile } = getStrategyFunctions(
            FileSources.execute_code,
          );
          const stream = await getDownloadStream(options.req, file.filepath);
          const fileIdentifier = await uploadCodeEnvFile({
            req: options.req,
            stream,
            filename: file.filename,
            entity_id: queryParams.entity_id,
          });

          // Preserve existing metadata when adding fileIdentifier
          const updatedMetadata = {
            ...file.metadata, // Preserve existing metadata (like S3 storage info)
            fileIdentifier, // Add fileIdentifier
          };

          await updateFile({
            file_id: file.file_id,
            metadata: updatedMetadata,
          });
          /**
           * Parse the FRESH fileIdentifier returned by the reupload and
           * route it through both the dedupe Map and the in-memory
           * `files` list. The original `(session_id, id)` parsed at the
           * top of this iteration refer to the old, expired/missing
           * sandbox object — using them here would silently re-introduce
           * the bug `Graph.sessions` seeding is supposed to fix.
           *
           * `entity_id` survives the round-trip: the upload was tagged
           * with `queryParams.entity_id` above, so the new identifier
           * carries the same scope.
           */
          const [newPath, newQuery] = fileIdentifier.split('?');
          const [newSessionId, newId] = newPath.split('/');
          const newQueryParams = newQuery
            ? Object.fromEntries(new URLSearchParams(newQuery).entries())
            : {};
          sessions.set(newSessionId, true);
          pushFile(newSessionId, newId, newQueryParams.entity_id);
        } catch (error) {
          logger.error(
            `Error re-uploading file ${id} in session ${session_id}: ${error.message}`,
            error,
          );
        }
      };
      const uploadTime = await getSessionInfo(file.metadata.fileIdentifier);
      if (!uploadTime) {
        logger.warn(`Failed to get upload time for file ${id} in session ${session_id}`);
        await reuploadFile();
        continue;
      }
      if (!checkIfActive(uploadTime)) {
        await reuploadFile();
        continue;
      }
      sessions.set(session_id, true);
      pushFile();
    const ref = file.metadata?.codeEnvRef;
    if (!ref) {
      skippedNoRef += 1;
      logger.debug(
        `[primeCodeFiles] file=${file.file_id} path=skip reason=no-codeenvref filename=${file.filename}`,
      );
      continue;
    }
    const session_id = ref.storage_session_id;
    const id = ref.file_id;

    /**
     * `pushFile` accepts optional overrides so the reupload path can
     * push the FRESH `(storage_session_id, file_id)` from the new
     * `codeEnvRef`. Without these overrides, the closure would
     * capture the stale pre-reupload refs from the outer loop and
     * the in-memory `files` array (now consumed by
     * `buildInitialToolSessions` to seed `Graph.sessions`) would
     * point at a sandbox object that no longer exists. The DB record
     * gets the new ref via `updateFile`, but the seed would still
     * inject the old one — bash_tool / read_file would 404 trying to
     * mount the file until the next turn re-reads metadata.
     *
     * `kind`, `id`, `version` are preserved on the in-memory ref so
     * codeapi can resolve sessionKey per-file (kind switch +
     * tenant prefix from auth context).
     */
    const pushFile = (overrideSessionId, overrideId) => {
      if (!toolContext) {
        toolContext = `- Note: The following files are available in the "${Tools.execute_code}" tool environment:`;
      }

      let fileSuffix = '';
      if (!agentResourceIds.has(file.file_id)) {
        fileSuffix =
          file.context === FileContext.execute_code
            ? ' (from previous code execution)'
            : ' (attached by user)';
      }

      /* Surface the preview lifecycle so the LLM knows when a
       * prior-turn artifact's rich preview didn't materialize. The
       * file blob is always available (`processCodeOutput` persists
       * it before returning), so the model can still tell the user
       * "you can download it" even when the preview never resolved.
       * Absent status means legacy or non-office — render normally. */
      let previewSuffix = '';
      if (file.status === 'pending') {
        previewSuffix = ' (preview not yet generated)';
      } else if (file.status === 'failed') {
        previewSuffix = file.previewError
          ? ` (preview unavailable: ${file.previewError})`
          : ' (preview unavailable)';
      }

      toolContext += `\n\t- /mnt/data/${file.filename}${fileSuffix}${previewSuffix}`;
      /* `id` is the storage file_id (drives codeapi's upload-key
       * existence check), `resource_id` is the entity that owns
       * the storage session (drives sessionKey re-derivation). For
       * code-output files this is `kind: 'user'` and `resource_id`
       * is informational (codeapi ignores it for user kind), but
       * we still send it for shape uniformity with shared kinds. */
      files.push({
        id: overrideId ?? id,
        resource_id: ref.id,
        storage_session_id: overrideSessionId ?? session_id,
        name: file.filename,
        kind: ref.kind,
        ...(ref.kind === 'skill' ? { version: ref.version } : {}),
      });
    };

    if (sessions.has(session_id)) {
      logger.debug(
        `[primeCodeFiles] file=${file.file_id} path=cache-hit-by-session storage_session_id=${session_id}`,
      );
      pushFile();
      continue;
    }

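The tool-context line that `pushFile` assembles above can be isolated for clarity: the first suffix depends on whether the file was an agent resource, the second on its preview status. The helper below is illustrative only (not a function in the codebase) and mirrors the branches shown in the diff:

```javascript
// Illustrative restatement of the suffix logic in pushFile above.
// `file.context` is compared against the raw 'execute_code' string here,
// standing in for FileContext.execute_code.
function fileContextLine(file, isAgentResource) {
  let fileSuffix = '';
  if (!isAgentResource) {
    fileSuffix =
      file.context === 'execute_code'
        ? ' (from previous code execution)'
        : ' (attached by user)';
  }
  let previewSuffix = '';
  if (file.status === 'pending') {
    previewSuffix = ' (preview not yet generated)';
  } else if (file.status === 'failed') {
    previewSuffix = file.previewError
      ? ` (preview unavailable: ${file.previewError})`
      : ' (preview unavailable)';
  }
  return `\n\t- /mnt/data/${file.filename}${fileSuffix}${previewSuffix}`;
}
```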
    const reuploadFile = async () => {
      try {
        const { getDownloadStream } = getStrategyFunctions(file.source);
        const { handleFileUpload: uploadCodeEnvFile } = getStrategyFunctions(
          FileSources.execute_code,
        );
        const stream = await getDownloadStream(options.req, file.filepath);
        /* Reupload preserves the resource identity from the existing
         * ref so codeapi re-buckets under the same sessionKey shape
         * (skill stays skill, user stays user). Without this, a
         * skill-cache-miss reupload would land in the user bucket
         * and would never again be shareable cross-user. */
        const uploaded = await uploadCodeEnvFile({
          req: options.req,
          stream,
          filename: file.filename,
          kind: ref.kind,
          id: ref.id,
          ...(ref.kind === 'skill' ? { version: ref.version } : {}),
        });

        /**
         * Use the FRESH `(storage_session_id, file_id)` from the
         * reupload response and route it through the dedupe Map, the
         * persisted record, and the in-memory `files` list. The
         * original ref captured at the top of this iteration refers
         * to the old, expired/missing sandbox object — using it here
         * would silently re-introduce the bug `Graph.sessions`
         * seeding is supposed to fix.
         *
         * `kind`, `id`, `version` survive the round-trip: the
         * upload preserves the resource identity, only the storage
         * pointer changes.
         */
        const newRef = {
          kind: ref.kind,
          id: ref.id,
          storage_session_id: uploaded.storage_session_id,
          file_id: uploaded.file_id,
          ...(ref.kind === 'skill' ? { version: ref.version } : {}),
        };

        const updatedMetadata = {
          ...file.metadata,
          codeEnvRef: newRef,
        };

        await updateFile({
          file_id: file.file_id,
          metadata: updatedMetadata,
        });
        sessions.set(newRef.storage_session_id, true);
        pushFile(newRef.storage_session_id, newRef.file_id);
        logger.debug(
          `[primeCodeFiles] file=${file.file_id} path=reupload-success ` +
            `oldSession=${session_id} newSession=${newRef.storage_session_id} newFileId=${newRef.file_id}`,
        );
      } catch (error) {
        reuploadFailures += 1;
        logger.error(
          `[primeCodeFiles] file=${file.file_id} path=reupload-failed session=${session_id}: ${error.message}`,
          error,
        );
      }
    };
    const uploadTime = await getSessionInfo(ref);
    if (!uploadTime) {
      logger.debug(
        `[primeCodeFiles] file=${file.file_id} path=reupload reason=no-uploadtime ` +
          `storage_session_id=${session_id}`,
      );
      await reuploadFile();
      continue;
    }
    if (!checkIfActive(uploadTime)) {
      logger.debug(
        `[primeCodeFiles] file=${file.file_id} path=reupload reason=stale ` +
          `uploadTime=${uploadTime} storage_session_id=${session_id}`,
      );
      await reuploadFile();
      continue;
    }
    sessions.set(session_id, true);
    logger.debug(
      `[primeCodeFiles] file=${file.file_id} path=fresh-active storage_session_id=${session_id}`,
    );
    pushFile();
  }

  /* Dispatch summary — emitted unconditionally so a single grep on
   * `[primeCodeFiles] out` always shows the final state, not only
   * the per-path trail leading up to it. */
  logger.debug(
    `[primeCodeFiles] out: returned=${files.length} ` +
      `skippedNoRef=${skippedNoRef} reuploadFailures=${reuploadFailures}`,
  );

  return { files, toolContext };
};
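The reupload invariant above (the storage pointer is replaced by the fresh upload response while the resource identity is preserved) can be stated as a small pure function. This sketch restates the `newRef` construction from the diff; the helper name is illustrative, not part of the codebase:

```javascript
// Sketch of the newRef construction in reuploadFile: identity
// (kind / id / version-if-skill) comes from the old ref, the storage
// pointer from the upload response. A skill stays a skill.
function mergeReuploadRef(ref, uploaded) {
  return {
    kind: ref.kind,
    id: ref.id,
    storage_session_id: uploaded.storage_session_id,
    file_id: uploaded.file_id,
    ...(ref.kind === 'skill' ? { version: ref.version } : {}),
  };
}
```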
@@ -85,6 +85,16 @@ jest.mock('@librechat/api', () => {
   * if a case needs to assert the 'html' value. */
  getExtractedTextFormat: (...args) => mockGetExtractedTextFormat(...args),
  getStorageMetadata: jest.fn(() => ({})),
  /* Identity helpers mirror codeapi's validator. The real impl
   * lives in `packages/api/src/files/code/identity.ts` with its
   * own dedicated `identity.spec.ts`; here we just stub the
   * download-query builder since `processCodeOutput` calls it on
   * every output download. */
  buildCodeEnvDownloadQuery: jest.fn(({ kind, id, version }) => {
    const params = new URLSearchParams({ kind, id });
    if (version != null) params.set('version', String(version));
    return `?${params.toString()}`;
  }),
  codeServerHttpAgent: new http.Agent({ keepAlive: false }),
  codeServerHttpsAgent: new https.Agent({ keepAlive: false }),
};
@@ -708,17 +718,68 @@ describe('Code Process', () => {
  });

  describe('metadata and file properties', () => {
    it('should include fileIdentifier in metadata', async () => {
    it('should include codeEnvRef in metadata with kind: user', async () => {
      const smallBuffer = Buffer.alloc(100);
      mockAxios.mockResolvedValue({ data: smallBuffer });

      const { file: result } = await processCodeOutput(baseParams);

      expect(result.metadata).toEqual({
        fileIdentifier: 'session-123/file-id-123',
        codeEnvRef: {
          kind: 'user',
          id: 'user-123',
          storage_session_id: 'session-123',
          file_id: 'file-id-123',
        },
      });
    });

    /* Phase C lock-in: outputs are ALWAYS user-scoped, never skill-scoped.
     * Even when an execution turn invoked a skill (so input files were
     * `kind: 'skill'` shared cross-user), the resulting output bucket
     * tags `kind: 'user'` with the requesting user's id. This prevents
     * cross-user leakage of artifacts a skill may have generated for
     * one user — each user gets their own output sessionKey on codeapi.
     *
     * Drift hazard: someone reading the simple user-derivation may
     * later think "we should respect input kind for outputs too" and
     * widen output scope to match input scope. This test pins the
     * intentional asymmetry so that change requires updating the test
     * (and re-reading the rationale). */
    it('outputs are user-scoped regardless of which skill the execution invoked', async () => {
      const smallBuffer = Buffer.alloc(100);
      mockAxios.mockResolvedValue({ data: smallBuffer });

      const userA = { ...mockReq, user: { id: 'user-A' } };
      const userB = { ...mockReq, user: { id: 'user-B' } };

      const { file: outputA } = await processCodeOutput({ ...baseParams, req: userA });
      const { file: outputB } = await processCodeOutput({ ...baseParams, req: userB });

      // Each user's output ref is keyed by their own user id. The
      // `id` field tracks the requesting user, never the skill.
      expect(outputA.metadata.codeEnvRef).toEqual({
        kind: 'user',
        id: 'user-A',
        storage_session_id: 'session-123',
        file_id: 'file-id-123',
      });
      expect(outputB.metadata.codeEnvRef).toEqual({
        kind: 'user',
        id: 'user-B',
        storage_session_id: 'session-123',
        file_id: 'file-id-123',
      });

      // No skill identity leaks into the output ref under any property.
      const refA = outputA.metadata.codeEnvRef;
      const refB = outputB.metadata.codeEnvRef;
      expect(refA.kind).not.toBe('skill');
      expect(refB.kind).not.toBe('skill');
      expect(refA).not.toHaveProperty('version');
      expect(refB).not.toHaveProperty('version');
    });

    it('should set correct context for code-generated files', async () => {
      const smallBuffer = Buffer.alloc(100);
      mockAxios.mockResolvedValue({ data: smallBuffer });
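The scoping rule those tests pin reduces to a single derivation: output refs come from the requesting user and the execution's storage pointer, never from the skill that ran. A minimal restatement as a hypothetical helper (not the project's actual function name):

```javascript
// Illustrative restatement of the output-scoping rule: kind is always
// 'user', id is always the requesting user's id.
function deriveOutputRef(userId, storageSessionId, fileId) {
  return {
    kind: 'user',
    id: userId,
    storage_session_id: storageSessionId,
    file_id: fileId,
  };
}
```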
@@ -934,7 +995,12 @@ describe('Code Process', () => {
      data: [{ name: 'sess/fid', lastModified: new Date().toISOString() }],
    });

    await getSessionInfo('sess/fid', 'api-key');
    await getSessionInfo({
      kind: 'user',
      id: 'user-1',
      storage_session_id: 'sess',
      file_id: 'fid',
    });

    const callConfig = mockAxios.mock.calls[0][0];
    expect(callConfig.httpAgent).toBe(codeServerHttpAgent);
@@ -1511,8 +1577,8 @@ describe('Code Process', () => {
   * `getStrategyFunctions(FileSources.execute_code)` for the code-env
   * upload — both go through the same factory in production.
   */
  function setupReuploadMocks(newFileIdentifier) {
    const handleFileUpload = jest.fn().mockResolvedValue(newFileIdentifier);
  function setupReuploadMocks(newRef) {
    const handleFileUpload = jest.fn().mockResolvedValue(newRef);
    const getDownloadStream = jest.fn().mockResolvedValue('mock-stream');
    getStrategyFunctions.mockImplementation((source) => {
      if (source === 'execute_code') return { handleFileUpload };
@@ -1526,7 +1592,7 @@ describe('Code Process', () => {
      return { handleFileUpload, getDownloadStream };
    }

-   it('seed receives FRESH session_id + id parsed off the new fileIdentifier on reupload', async () => {
+   it('seed receives FRESH (storage_session_id, file_id) from the reupload response', async () => {
      const dbFile = {
        file_id: 'librechat-file-id',
        filename: 'sentinel.txt',
@@ -1535,12 +1601,17 @@ describe('Code Process', () => {
        context: 'execute_code',
        metadata: {
-         fileIdentifier: 'OLD_SESSION/OLD_ID',
+         /* Stale sandbox ref — this is what `getSessionInfo` will 404 on. */
+         codeEnvRef: {
+           kind: 'user',
+           id: 'user-123',
+           storage_session_id: 'OLD_SESSION',
+           file_id: 'OLD_ID',
+         },
        },
      };
      getFiles.mockResolvedValue([dbFile]);

-     setupReuploadMocks('NEW_SESSION/NEW_ID');
+     setupReuploadMocks({ storage_session_id: 'NEW_SESSION', file_id: 'NEW_ID' });

      const result = await primeFiles({
        req: { user: { id: 'user-123', role: 'USER' } },
@@ -1553,22 +1624,82 @@ describe('Code Process', () => {
      // The seed list (consumed by buildInitialToolSessions) MUST carry
      // the post-reupload ids — not the stale pre-reupload ones.
      expect(result.files).toEqual([
-       { id: 'NEW_ID', session_id: 'NEW_SESSION', name: 'sentinel.txt' },
+       {
+         id: 'NEW_ID',
+         /* `resource_id` carries the codeEnvRef.id (= original
+          * userId for kind: 'user'), threaded onto the in-memory
+          * file ref for codeapi's sessionKey re-derivation. */
+         resource_id: 'user-123',
+         storage_session_id: 'NEW_SESSION',
+         name: 'sentinel.txt',
+         kind: 'user',
+       },
      ]);
    });

-   it('persists the new fileIdentifier on the DB record (existing behavior, regression-locked)', async () => {
+   /* Phase C / option α (codeapi #1455): reupload preserves the
+    * resource identity from the existing ref so codeapi re-buckets
+    * under the same sessionKey shape. Without this, a skill-cache-miss
+    * reupload lands in the user bucket and is no longer cross-user
+    * shareable. */
+   it('reupload forwards kind/id (and version when skill) from the existing ref', async () => {
      const dbFile = {
        file_id: 'librechat-file-id',
        filename: 'sentinel.txt',
        filepath: '/uploads/sentinel.txt',
        source: 'local',
        context: 'execute_code',
-       metadata: { fileIdentifier: 'OLD_SESSION/OLD_ID' },
+       metadata: {
+         codeEnvRef: {
+           kind: 'skill',
+           id: 'skill-99',
+           storage_session_id: 'OLD_SESSION',
+           file_id: 'OLD_ID',
+           version: 4,
+         },
+       },
      };
      getFiles.mockResolvedValue([dbFile]);

-     setupReuploadMocks('NEW_SESSION/NEW_ID');
+     const { handleFileUpload } = setupReuploadMocks({
+       storage_session_id: 'NEW_SESSION',
+       file_id: 'NEW_ID',
+     });

      await primeFiles({
        req: { user: { id: 'user-123', role: 'USER' } },
+       tool_resources: {
+         execute_code: { file_ids: ['librechat-file-id'], files: [] },
+       },
+       agentId: 'agent-id',
+     });
+
+     expect(handleFileUpload).toHaveBeenCalledTimes(1);
+     const uploadArgs = handleFileUpload.mock.calls[0][0];
+     expect(uploadArgs.kind).toBe('skill');
+     expect(uploadArgs.id).toBe('skill-99');
+     expect(uploadArgs.version).toBe(4);
+   });
+
+   it('persists fresh codeEnvRef (kind/id preserved) on the DB record after reupload', async () => {
+     const dbFile = {
+       file_id: 'librechat-file-id',
+       filename: 'sentinel.txt',
+       filepath: '/uploads/sentinel.txt',
+       source: 'local',
+       context: 'execute_code',
+       metadata: {
+         codeEnvRef: {
+           kind: 'user',
+           id: 'user-123',
+           storage_session_id: 'OLD_SESSION',
+           file_id: 'OLD_ID',
+         },
+       },
+     };
+     getFiles.mockResolvedValue([dbFile]);
+
+     setupReuploadMocks({ storage_session_id: 'NEW_SESSION', file_id: 'NEW_ID' });
+
+     await primeFiles({
+       req: { user: { id: 'user-123', role: 'USER' } },
@@ -1581,10 +1712,62 @@ describe('Code Process', () => {
      expect(updateFile).toHaveBeenCalledWith(
        expect.objectContaining({
          file_id: 'librechat-file-id',
-         metadata: expect.objectContaining({ fileIdentifier: 'NEW_SESSION/NEW_ID' }),
+         metadata: expect.objectContaining({
+           codeEnvRef: {
+             kind: 'user',
+             id: 'user-123',
+             storage_session_id: 'NEW_SESSION',
+             file_id: 'NEW_ID',
+           },
+         }),
        }),
      );
    });

+   it('reads codeEnvRef directly when present (skipping reupload)', async () => {
+     const dbFile = {
+       file_id: 'librechat-file-id',
+       filename: 'sentinel.txt',
+       filepath: '/uploads/sentinel.txt',
+       source: 'local',
+       context: 'execute_code',
+       metadata: {
+         codeEnvRef: {
+           kind: 'user',
+           id: 'user-123',
+           storage_session_id: 'STRUCT_SESSION',
+           file_id: 'STRUCT_ID',
+         },
+       },
+     };
+     getFiles.mockResolvedValue([dbFile]);
+     filterFilesByAgentAccess.mockImplementation(({ files }) => Promise.resolve(files));
+     // getSessionInfo returns a fresh timestamp so reupload is skipped.
+     mockAxios.mockResolvedValue({ data: { lastModified: new Date().toISOString() } });
+
+     const result = await primeFiles({
+       req: { user: { id: 'user-123', role: 'USER' } },
+       tool_resources: {
+         execute_code: { file_ids: ['librechat-file-id'], files: [] },
+       },
+       agentId: 'agent-id',
+     });
+
+     expect(updateFile).not.toHaveBeenCalled();
+     expect(result.files).toEqual([
+       {
+         id: 'STRUCT_ID',
+         /* `resource_id` from the persisted codeEnvRef.id — for
+          * `kind: 'user'` this is informational (codeapi derives
+          * sessionKey from auth context) but threaded for shape
+          * uniformity with shared kinds. */
+         resource_id: 'user-123',
+         storage_session_id: 'STRUCT_SESSION',
+         name: 'sentinel.txt',
+         kind: 'user',
+       },
+     ]);
+   });
  });

  describe('primeFiles toolContext surfaces preview status to the LLM', () => {
@@ -1606,7 +1789,14 @@ describe('Code Process', () => {
        filepath: `/uploads/${overrides.status ?? 'ready'}.xlsx`,
        source: 'local',
        context: 'execute_code',
-       metadata: { fileIdentifier: 'CURRENT_SESSION/CURRENT_ID' },
+       metadata: {
+         codeEnvRef: {
+           kind: 'user',
+           id: 'user-123',
+           storage_session_id: 'CURRENT_SESSION',
+           file_id: 'CURRENT_ID',
+         },
+       },
        ...overrides,
      };
    }
@@ -569,13 +569,44 @@ const processAgentFileUpload = async ({ req, res, metadata }) => {
    }
    const { handleFileUpload: uploadCodeEnvFile } = getStrategyFunctions(FileSources.execute_code);
    const stream = fs.createReadStream(file.path);
-   const fileIdentifier = await uploadCodeEnvFile({
+   /* Resource identity for codeapi's sessionKey:
+    * - chat attachments (messageAttachment=true): `kind: 'user'`, codeapi
+    *   buckets under `<tenant>:user:<authContext.userId>` regardless of `id`.
+    * - agent setup files (messageAttachment=false): `kind: 'agent'`, shared
+    *   per agent identity. `id` carries the agent id. */
+   const codeKind = messageAttachment === true ? 'user' : 'agent';
+   const codeId = messageAttachment === true ? req.user.id : agent_id;
+   /* Upload under the same sanitized filename LC stores in its DB
+    * (`fileInfo.filename` below uses `sanitizeFilename(originalname)`).
+    * Codeapi/file_server use this as the on-disk name in the sandbox
+    * — `/mnt/data/<filename>` — and `primeFiles`'s `toolContext` text
+    * + `_injected_files.name` both reference `file.filename`. Sending
+    * the unsanitized `file.originalname` here makes the sandbox path
+    * (with spaces / special chars) drift from what LC tells the model
+    * is available, causing FileNotFoundError on the first reference. */
+   const sandboxFilename = sanitizeFilename(file.originalname);
+   const uploaded = await uploadCodeEnvFile({
      req,
      stream,
-     filename: file.originalname,
-     entity_id,
+     filename: sandboxFilename,
+     kind: codeKind,
+     id: codeId,
    });
-   fileInfoMetadata = { fileIdentifier };
+   /* Persist under the structured `codeEnvRef` shape — the only key the
+    * post-cutover schema (`metadata.codeEnvRef`) and downstream readers
+    * (`primeFiles`, `getCodeFilesByIds`, `categorizeFileForToolResources`,
+    * controller filtering) accept. Storing under the legacy
+    * `fileIdentifier` key would be silently dropped by mongoose strict
+    * mode and the file would lose its sandbox reference on subsequent
+    * priming turns. */
+   fileInfoMetadata = {
+     codeEnvRef: {
+       kind: codeKind,
+       id: codeId,
+       storage_session_id: uploaded.storage_session_id,
+       file_id: uploaded.file_id,
+     },
+   };
  } else if (tool_resource === EToolResources.file_search) {
    const isFileSearchEnabled = await checkCapability(req, AgentCapabilities.file_search);
    if (!isFileSearchEnabled) {
@@ -346,6 +346,128 @@ describe('processAgentFileUpload', () => {
      ).resolves.not.toThrow();
    });
  });

+ /* Phase C / option α regression: the upload must persist its sandbox
+  * pointer under `metadata.codeEnvRef` (the post-cutover schema). The
+  * legacy `metadata.fileIdentifier` key is silently stripped by mongoose
+  * strict mode and downstream readers (`primeFiles`, `getCodeFilesByIds`,
+  * `categorizeFileForToolResources`, controller filtering) only check
+  * `codeEnvRef`. Storing under the legacy key would orphan the file —
+  * priming would skip it on subsequent code-execution turns and the
+  * sandbox copy would never re-mount. */
+ describe('execute_code uploads persist codeEnvRef metadata', () => {
+   const fs = require('fs');
+   const { Readable } = require('stream');
+   let createReadStreamSpy;
+
+   beforeEach(() => {
+     /* `processAgentFileUpload` opens the multer-staged temp file via
+      * `fs.createReadStream`. The test fixture path doesn't exist, so
+      * stub it to a tiny in-memory stream. */
+     createReadStreamSpy = jest
+       .spyOn(fs, 'createReadStream')
+       .mockImplementation(() => Readable.from(Buffer.from('')));
+   });
+
+   afterEach(() => {
+     createReadStreamSpy.mockRestore();
+   });
+
+   const setupCodeEnvUpload = (uploaded) => {
+     /* `processAgentFileUpload` calls `getStrategyFunctions` twice:
+      * once with `execute_code` for the codeapi upload, then again with
+      * the on-disk strategy (`local`) for the standard storage step that
+      * runs in the same flow. Both must return a working
+      * `handleFileUpload`. */
+     const codeEnvUpload = jest.fn().mockResolvedValue(uploaded);
+     const localUpload = jest.fn().mockResolvedValue({
+       bytes: 0,
+       filename: 'upload.bin',
+       filepath: '/uploads/upload.bin',
+     });
+     getStrategyFunctions.mockImplementation((src) =>
+       src === FileSources.execute_code
+         ? { handleFileUpload: codeEnvUpload }
+         : { handleFileUpload: localUpload, saveBuffer: jest.fn() },
+     );
+     return codeEnvUpload;
+   };
+
+   it('persists kind:user codeEnvRef for chat attachments (messageAttachment=true)', async () => {
+     setupCodeEnvUpload({ storage_session_id: 'sess-1', file_id: 'fid-1' });
+     const req = makeReq();
+     await processAgentFileUpload({
+       req,
+       res: mockRes,
+       metadata: {
+         agent_id: 'agent-abc',
+         tool_resource: EToolResources.execute_code,
+         file_id: 'file-uuid',
+         message_file: true,
+       },
+     });
+
+     expect(db.createFile).toHaveBeenCalledWith(
+       expect.objectContaining({
+         metadata: {
+           codeEnvRef: {
+             kind: 'user',
+             id: 'user-123',
+             storage_session_id: 'sess-1',
+             file_id: 'fid-1',
+           },
+         },
+       }),
+       true,
+     );
+   });
+
+   it('persists kind:agent codeEnvRef for agent setup files (messageAttachment=false)', async () => {
+     setupCodeEnvUpload({ storage_session_id: 'sess-2', file_id: 'fid-2' });
+     const req = makeReq();
+     await processAgentFileUpload({
+       req,
+       res: mockRes,
+       metadata: {
+         agent_id: 'agent-abc',
+         tool_resource: EToolResources.execute_code,
+         file_id: 'file-uuid',
+       },
+     });
+
+     expect(db.createFile).toHaveBeenCalledWith(
+       expect.objectContaining({
+         metadata: {
+           codeEnvRef: {
+             kind: 'agent',
+             id: 'agent-abc',
+             storage_session_id: 'sess-2',
+             file_id: 'fid-2',
+           },
+         },
+       }),
+       true,
+     );
+   });
+
+   it('does not persist legacy fileIdentifier key (mongoose strict drops it)', async () => {
+     setupCodeEnvUpload({ storage_session_id: 'sess-3', file_id: 'fid-3' });
+     const req = makeReq();
+     await processAgentFileUpload({
+       req,
+       res: mockRes,
+       metadata: {
+         agent_id: 'agent-abc',
+         tool_resource: EToolResources.execute_code,
+         file_id: 'file-uuid',
+         message_file: true,
+       },
+     });
+
+     const persisted = db.createFile.mock.calls[0][0];
+     expect(persisted.metadata).not.toHaveProperty('fileIdentifier');
+   });
+ });
  });

describe('processFileURL', () => {