🛂 fix: Skip Inherited / Mark Skill Files Read-Only in Code-Env Pipeline (#12866)

* 🛂 fix: Skip Re-Download of Inherited Code-Env Files (No More 403 Storms)

When a bash/code-interpreter call lists or operates on inputs the user
already owns (skill files primed via primeInvokedSkills, files inherited
from a prior session), codeapi echoes those files back in the tool
result with `inherited: true`. We were treating every entry as a
generated artifact and calling processCodeOutput on each, which:

1. Hit `/api/files/code/download/<session_id>/<file_id>` with the
   user's session key. Skill files are uploaded under the skill's
   entity_id, so every download 403'd — producing dozens of
   "Unauthorized download" log lines per turn.

2. Surfaced those inputs as ghost file chips in the UI even though
   they were never generated by the run.

3. Wasted a download round-trip even when no auth boundary was
   crossed — the file is already persisted at its origin.

Fix: skip files where `file.inherited` is truthy in all three
artifact-files loops (`callTool` in `tools.js`, `createToolEndCallback`,
and `createResponsesToolEndCallback`). Skill files remain available to
subsequent calls via primeInvokedSkills / session inheritance; we
just no longer re-download them.
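
A minimal sketch of the skip, assuming the artifact entries look like
`{ id, name, inherited? }`. `selectGeneratedFiles` is a hypothetical
helper used only for illustration; the commit itself inlines a
`continue` in each loop rather than adding a shared function.

```javascript
// Hypothetical helper, not part of this commit: returns only the files
// the run actually generated, dropping anything codeapi flagged as an
// inherited passthrough.
function selectGeneratedFiles(files) {
  return (files ?? []).filter((file) => !file.inherited);
}

// Sample tool-result payload (made-up ids and names).
const artifactFiles = [
  { id: 'f1', name: 'SKILL.md', inherited: true },
  { id: 'f2', name: 'plot.png' },
  { id: 'f3', name: 'report.csv', inherited: false },
];

const generated = selectGeneratedFiles(artifactFiles);
console.log(generated.map((f) => f.name)); // [ 'plot.png', 'report.csv' ]
```

Entries without the flag are treated as generated, so results from a
codeapi build that does not yet emit `inherited` behave as before.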

Pairs with the codeapi-side change that adds the `inherited` flag.

* 🔒 feat: Mark Skill Files as `read_only` During Code-Env Priming

Pairs with codeapi `read_only` upload flag (ClickHouse/ai#1345). When
LibreChat primes a skill into the code-env, every file in the batch
(SKILL.md plus all bundled scripts/schemas/docs) is now uploaded with
`read_only: true`. Codeapi seals these inputs at the filesystem layer
(chmod 444) and the walker echoes the original refs as `inherited:
true` regardless of whether sandboxed code modified the bytes on disk.
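
For reference, a small sketch of what `chmod 444` means on a POSIX
filesystem. This is an illustration in Node, not codeapi's sealing
code, which is not shown in this commit; `sealFile` is a hypothetical
name.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: set read-only permission bits (r--r--r--) and
// return the resulting permission bits for inspection.
function sealFile(filePath) {
  fs.chmodSync(filePath, 0o444);
  return fs.statSync(filePath).mode & 0o777;
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'seal-'));
const target = path.join(dir, 'SKILL.md');
fs.writeFileSync(target, '# skill\n');

console.log(sealFile(target).toString(8)); // 444 on POSIX
```

Permission bits alone are not a guarantee: a process running as root,
or one that unlinks and recreates the file in a writable directory,
can still change the bytes. That is why the walker echoing the
original refs matters in addition to the chmod.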

Without this, the previous PR's `inherited` skip handled only the
unchanged case. A modified skill file (pip writing pyc near a .py, a
script accidentally truncating LICENSE.txt, etc.) still flowed through
the modified-input branch on codeapi, got a fresh user-owned file_id,
uploaded as a "generated" artifact, and surfaced in the UI as a chip
the user couldn't actually authorize a download for.

Changes:

- `api/server/services/Files/Code/crud.js`:
  `batchUploadCodeEnvFiles({ ..., read_only })` forwards the flag as
  a multipart form field. Default `false` preserves existing behavior
  for user-attached files and prior-session inheritance.

- `packages/api/src/agents/skillFiles.ts`: type signature gains
  `read_only?: boolean`; `primeSkillFiles` passes `true`.

- `packages/api/src/agents/skillFiles.spec.ts`: assert the upload call
  carries `read_only: true`.

The flag is intentionally not skill-specific. Any future
infrastructure-input flow (system fixtures, cached datasets, etc.) can
opt in the same way.
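
A rough sketch of how the flag travels as a multipart field. It uses
the global `FormData` and `Blob` from Node 18+ with in-memory content
instead of the streams used in `crud.js`; `buildUploadForm` and the
`content` property are assumptions made for this example.

```javascript
// Hypothetical stand-in for the form-building part of
// batchUploadCodeEnvFiles. The field names match the diff.
function buildUploadForm({ files, entity_id = '', read_only = false }) {
  const form = new FormData();
  if (entity_id.length > 0) {
    form.append('entity_id', entity_id);
  }
  // Only sent when opted in, so existing callers produce the same form.
  if (read_only) {
    form.append('read_only', 'true');
  }
  for (const file of files) {
    form.append('file', new Blob([file.content]), file.filename);
  }
  return form;
}

const skillForm = buildUploadForm({
  files: [
    { filename: 'SKILL.md', content: '# skill\n' },
    { filename: 'run.py', content: 'print("hi")\n' },
  ],
  entity_id: 'skill-123', // made-up id
  read_only: true,
});

console.log(skillForm.get('read_only')); // true
console.log(skillForm.getAll('file').length); // 2
```

Omitting the field entirely when `read_only` is false, rather than
sending `'false'`, keeps the request identical to what older callers
already send.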
Danny Avila 2026-04-29 08:26:25 +09:00 committed by GitHub
parent f69e8e26f8
commit 46a86d849f
GPG key ID: B5690EEEBB952194
5 changed files with 53 additions and 1 deletions


@ -543,6 +543,15 @@ function createToolEndCallback({ req, res, artifactPromises, streamId = null })
}
for (const file of output.artifact.files) {
/* `inherited` files are unchanged passthroughs of inputs the caller
* already owns (skill files, prior session inputs, inherited
* .dirkeep markers). Skip post-processing: re-downloading with the
* user's session key 403s when the file is entity-scoped, and the
* input is already persisted at its origin. They remain available
* to subsequent calls via primeInvokedSkills / session inheritance. */
if (file.inherited) {
continue;
}
const { id, name } = file;
artifactPromises.push(
(async () => {
@ -760,6 +769,15 @@ function createResponsesToolEndCallback({ req, res, tracker, artifactPromises })
}
for (const file of output.artifact.files) {
/* `inherited` files are unchanged passthroughs of inputs the caller
* already owns (skill files, prior session inputs, inherited
* .dirkeep markers). Skip post-processing: re-downloading with the
* user's session key 403s when the file is entity-scoped, and the
* input is already persisted at its origin. They remain available
* to subsequent calls via primeInvokedSkills / session inheritance. */
if (file.inherited) {
continue;
}
const { id, name } = file;
artifactPromises.push(
(async () => {

tools.js

@ -180,6 +180,15 @@ const callTool = async (req, res) => {
const artifactPromises = [];
for (const file of artifact.files) {
/* Files flagged `inherited` by codeapi are unchanged passthroughs of
* inputs the caller already owns (skill files, prior downloaded inputs,
* inherited .dirkeep markers). Re-downloading them is wasted work and
* 403s when the file is scoped to a different entity (e.g. skill
* entity_id) than the user's session key. They remain available for
* subsequent tool calls via primeInvokedSkills / session inheritance. */
if (file.inherited) {
continue;
}
const { id, name } = file;
artifactPromises.push(
(async () => {

api/server/services/Files/Code/crud.js

@ -112,15 +112,24 @@ async function uploadCodeEnvFile({ req, stream, filename, entity_id = '' }) {
* @param {import('express').Request & { user: { id: string } }} params.req - The request object.
* @param {Array<{ stream: NodeJS.ReadableStream; filename: string }>} params.files - Files to upload.
* @param {string} [params.entity_id] - Optional entity ID.
* @param {boolean} [params.read_only] - When true, codeapi tags every file in
* the batch as infrastructure (e.g. skill files). The flag is persisted as
* MinIO object metadata (`X-Amz-Meta-Read-Only`) and travels with the file
* through subsequent download/walk passes. Sandboxed-code modifications
* are discarded and the original ref is echoed back as
* `inherited: true`, never as a generated artifact.
* @returns {Promise<{ session_id: string; files: Array<{ fileId: string; filename: string }> }>}
* @throws {Error} If the batch upload fails entirely.
*/
-async function batchUploadCodeEnvFiles({ req, files, entity_id = '' }) {
+async function batchUploadCodeEnvFiles({ req, files, entity_id = '', read_only = false }) {
try {
const form = new FormData();
if (entity_id.length > 0) {
form.append('entity_id', entity_id);
}
if (read_only) {
form.append('read_only', 'true');
}
for (const file of files) {
form.append('file', file.stream, file.filename);
}

packages/api/src/agents/skillFiles.spec.ts

@ -106,6 +106,10 @@ describe('primeInvokedSkills — execute_code capability gate', () => {
auth internally. */
expect(uploadArgs).not.toHaveProperty('apiKey');
expect(uploadArgs.entity_id).toBe(SKILL_ID.toString());
/* Skill files are infrastructure inputs; the read_only flag tells codeapi
to seal them so any sandboxed-code modifications are dropped instead
of surfaced as ghost generated artifacts. */
expect(uploadArgs.read_only).toBe(true);
/* One uploaded file per `fileRecords` entry plus the synthetic
SKILL.md that `primeSkillFiles` always prepends. */
expect(uploadArgs.files).toHaveLength(fileRecords.length + 1);

packages/api/src/agents/skillFiles.ts

@ -27,6 +27,10 @@ export interface PrimeSkillFilesParams {
req: ServerRequest;
files: Array<{ stream: NodeJS.ReadableStream; filename: string }>;
entity_id?: string;
/** When true, codeapi tags every file in the batch as infrastructure
* (read-only inputs that must never surface as generated artifacts,
* even if sandboxed code mutates the bytes on disk). */
read_only?: boolean;
}) => Promise<{
session_id: string;
files: Array<{ fileId: string; filename: string }>;
@ -161,6 +165,14 @@ export async function primeSkillFiles(
req,
files: filesToUpload,
entity_id: entityId,
/* Skill files are infrastructure: SKILL.md + bundled scripts/schemas/
* docs that the agent reads but should never edit. Tag the upload as
* read-only so codeapi seals the inputs (chmod 444 in-sandbox) and the
* walker echoes the original refs as `inherited: true` even if some
* sandboxed code path mutates bytes on disk. Without this, modified
* skill files surface as ghost generated artifacts the user has no
* authority to download. */
read_only: true,
});
// Exclude SKILL.md from the returned files array — it is uploaded to disk
// for bash access but has no codeEnvIdentifier (cannot be cached). Omitting