mirror of
https://github.com/danny-avila/LibreChat.git
synced 2026-05-13 16:07:30 +00:00
* 🛠️ feat: Add registerCodeExecutionTools helper

Idempotently registers `bash_tool` + `read_file` in the run's tool registry and tool-definition list via a registry `.has()` dedupe. Sets up the single code-execution tool path shared by:
- `initializeAgent` (when an agent has `execute_code` in its tools and the capability is enabled for the run)
- `injectSkillCatalog` (when skills are active; unconditional read_file, bash_tool follows `codeEnvAvailable`)

Both callers reach the helper in the same initialization sequence, so the second call becomes a no-op and exactly one copy of each tool reaches the LLM — no more double registration for agents that combine `execute_code` capability with active skills. Unit-tested on a fresh run, idempotence (second call, overlap with prior tooldefs, partial overlap), and the no-registry variant.

* 🔀 refactor: Route injectSkillCatalog bash_tool + read_file through registerCodeExecutionTools

The `skill` tool is still registered inline (it's skill-path-specific), but `bash_tool` + `read_file` now flow through the shared idempotent helper so a prior registration from the execute_code path doesn't produce a duplicate copy later in the same run. Behavior preserved:
- `read_file` always registers when any active skill is in scope — manually-primed `disable-model-invocation: true` skills still need it to load `references/*` from storage.
- `bash_tool` follows `codeEnvAvailable` exactly as before.

Adds a test pinning the cross-call dedupe: when `injectSkillCatalog` runs AFTER `registerCodeExecutionTools` has already seeded the registry + tool definitions with bash_tool/read_file, the resulting `toolDefinitions` still contains exactly one copy of each.

* 🪄 feat: Expand `execute_code` tool name into bash_tool + read_file at initialize-time

When an agent's `tools` include `execute_code` and the `execute_code` capability is enabled for the run, `initializeAgent` now registers `bash_tool` + `read_file` via `registerCodeExecutionTools` before `injectSkillCatalog`. The legacy `execute_code` tool definition is no longer handed to the LLM — `execute_code` remains on the agent document as a capability-trigger marker, but the runtime expands it into the skill-flavored tool pair.

Call ordering matters: the `execute_code` registration runs BEFORE `injectSkillCatalog`, so the skill path's own `registerCodeExecutionTools` call inside `injectSkillCatalog` becomes a no-op via the registry's `.has()` check. Exactly one copy of each tool reaches the LLM whether the agent has:
- only `execute_code` (legacy path)
- only skills
- both

No data migration needed — `agent.tools: ['execute_code']` stays in the DB unchanged; the expansion is a runtime operation. Three tests cover the matrix: execute_code + capability on → bash_tool + read_file registered; execute_code + capability off → neither registered; no execute_code + capability on → neither registered.

* 🗑️ refactor: Drop CodeExecutionToolDefinition from the builtin registry

Removes the legacy `execute_code` entry from `agentToolDefinitions` and the corresponding import. With the initialize-time expansion in place, nothing consults `getToolDefinition('execute_code')` for a tool schema any more — the capability gate still filters on the string `execute_code`, but the actual tool definitions the LLM sees come from `registerCodeExecutionTools` (i.e. `bash_tool` + `read_file`).

`loadToolDefinitions` in `packages/api/src/tools/definitions.ts` silently drops `execute_code` when it no longer resolves in the registry — that's the expected path and is now covered by an updated test. No caller of `getToolDefinition('execute_code')` expects a non-undefined result after this change.

* 🔌 refactor: Read CODE_API_KEY from env for primeCodeFiles + PTC

Finishes the Phase 4 server-env-keyed rollout on the two remaining `loadAuthValues({ authFields: [EnvVar.CODE_API_KEY] })` sites in `ToolService.js`:
- `primeCodeFiles` (user-attached file priming on execute_code agents)
- Programmatic Tool Calling (`createProgrammaticToolCallingTool`)

Both now read `process.env[EnvVar.CODE_API_KEY]` directly, matching `bash_tool`'s pattern. The per-user plugin-auth path is no longer consulted for code-env credentials anywhere in the hot path — the agents library owns the actual tool-call execution and also reads the env var internally. Priming still fires for existing user-file workflows so the legacy `toolContextMap[execute_code]` hint ("files available at /mnt/data/...") stays in the prompt; only the key lookup changed.

* 🔧 fix: Type the pre-seeded dedupe-test tools as LCTool

CI TypeScript type checks caught `{ parameters: {} }` in the new cross-call dedupe test: `LCTool.parameters` is a `JsonSchemaType`, not `{}`. Use `{ type: 'object', properties: {} }` and type the local registry Map through the parameter-derived shape so the pre-seeded values match what `toolRegistry.set` expects.

* 🛡️ fix: Run execute_code expansion before GOOGLE_TOOL_CONFLICT gate

Codex review caught a latent regression: the original Phase 8 placement ran `registerCodeExecutionTools` after `hasAgentTools` was computed, so an execute-code-only agent on Google/Vertex with provider-specific `options.tools` populated would no longer trip `GOOGLE_TOOL_CONFLICT` — the legacy `CodeExecutionToolDefinition` used to populate `toolDefinitions` before the guard, but after dropping it from the registry, `toolDefinitions` stayed empty until my expansion ran downstream of the guard. Mixed provider + agent tools would silently flow through to the LLM.

Fix moves the `execute_code` expansion to BEFORE `hasAgentTools` computation. `bash_tool` + `read_file` now contribute to the check the same way the legacy `execute_code` def did. Covered by a new test that pins the Google+execute_code+provider-tools scenario — the `rejects.toThrow(/google_tool_conflict/)` path would have silently passed on the prior placement.
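To make the dedupe contract above concrete, here is a minimal, hypothetical sketch — the function name, the registry/array shapes, and the empty JSON-schema parameters are illustrative stand-ins, not the real `registerCodeExecutionTools` signature:

```js
// Hedged sketch of the dedupe contract (not the actual helper): a second call
// in the same run finds both names already in the registry and becomes a no-op,
// so exactly one copy of each tool ever reaches the LLM.
function registerCodeExecutionToolsSketch(toolRegistry, toolDefinitions) {
  for (const name of ['bash_tool', 'read_file']) {
    if (toolRegistry.has(name)) {
      continue; // already seeded by an earlier caller (initializeAgent or injectSkillCatalog)
    }
    const def = { name, parameters: { type: 'object', properties: {} } };
    toolRegistry.set(name, def);
    toolDefinitions.push(def);
  }
  return toolDefinitions;
}
```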
* 🔗 fix: Thread codeEnvAvailable through handoff sub-agents

Round-2 codex review caught the other half of the execute_code expansion gap: `discoverConnectedAgents` omitted `codeEnvAvailable` from its forwarded `initializeAgent` params, so handoff sub-agents with `agent.tools: ['execute_code']` lost the `bash_tool` + `read_file` registration (pre-Phase 8 the legacy `CodeExecutionToolDefinition` would have landed in their `toolDefinitions` via the registry).
- Add `codeEnvAvailable?` to `DiscoverConnectedAgentsParams` and forward it verbatim on every sub-agent `initializeAgent` call.
- Update the three JS call sites that construct the primary's `codeEnvAvailable` (`services/Endpoints/agents/initialize.js`, `controllers/agents/openai.js`, `controllers/agents/responses.js`) to pass the same flag into `discoverConnectedAgents` — one authoritative source per request.
- Two regression tests in `discovery.spec.ts` pin the true/false passthrough so a future refactor that drops the param-forwarding surfaces immediately.

Left intentionally unchanged: `packages/api/src/agents/openai/service.ts` (public API helper with no in-repo caller). External consumers of `createAgentChatCompletion` who want code execution should pass a `codeEnvAvailable`-aware `initializeAgent` via `deps` — documenting the full public-API surface is out of scope for this Phase 8 PR.

* 🔗 fix: Thread codeEnvAvailable through addedConvo + memory-agent paths

Round-3 codex review caught the last two production `initializeAgent` callers missing the Phase-8 capability flag:
- `api/server/services/Endpoints/agents/addedConvo.js` (multi-convo parallel agent execution). Added `codeEnvAvailable` to `processAddedConvo`'s destructured params and forwarded it into the per-added-agent `initializeAgent` call. Caller in `api/server/services/Endpoints/agents/initialize.js` passes the same `codeEnvAvailable` it computed for the primary.
- `api/server/controllers/agents/client.js` (`useMemory` — memory extraction agent). Computes its own `codeEnvAvailable` from `appConfig?.endpoints?.[EModelEndpoint.agents]?.capabilities` and forwards into `initializeAgent`. Memory agents rarely list `execute_code`, but if one does, pre-Phase 8 they got the legacy `execute_code` tool registered unconditionally — the passthrough restores parity.

With this, every production caller of `initializeAgent` explicitly resolves the capability: main chat flow (primary + handoff), OpenAI chat completions (primary + handoff), Responses API (primary + handoff), added convo parallel agents, and memory agents. The one remaining caller, `packages/api/src/agents/openai/service.ts::createAgentChatCompletion`, is a public API helper with no in-repo consumer (external callers must pass a capability-aware `initializeAgent` via `deps`).

* 🪤 fix: Remove duplicate appConfig declaration causing TDZ ReferenceError

The Responses API controller had TWO `const appConfig = req.config;` bindings inside `createResponse`: one at the top of the function (added by the Phase 4 `bash_tool` decouple) and one inside the try block (added by the polish PR #12760). Because `const` is block-scoped with a temporal dead zone, the inner redeclaration put `appConfig` in TDZ for the entire try block, so any earlier reference inside the try — notably `appConfig?.endpoints?.[EModelEndpoint.agents]?.allowedProviders` at line 348 — threw `ReferenceError: Cannot access 'appConfig' before initialization`. The error was silently swallowed by the outer try/catch, leaving `recordCollectedUsage` unreached and the six `responses.unit.spec.js` token-usage tests failing.

Removing the inner redeclaration fixes the six failing tests (verified: 11/11 pass locally post-fix, 0 regressions elsewhere). The outer function-scoped binding already provides `appConfig` to every downstream reference.

* 🔗 fix: Thread codeEnvAvailable through the OpenAI chat-completion public API

Round-4 codex review (legitimate on the type-safety angle, even though the runtime concern was already covered): the `createAgentChatCompletion` helper defines its own narrower `InitializeAgentParams` interface locally, and the type was missing `codeEnvAvailable`. External consumers who supply a capability-aware `deps.initializeAgent` couldn't route `codeEnvAvailable` through without a type-cast workaround.
- Widen the local `InitializeAgentParams` interface to include `codeEnvAvailable?: boolean` (matches the real `packages/api/src/agents/initialize.ts` type).
- Derive `codeEnvAvailable` inside `createAgentChatCompletion` from `deps.appConfig?.endpoints?.agents?.capabilities` (the same source the in-repo controllers use) and forward to `deps.initializeAgent`. Uses a string literal `'execute_code'` lookup so this file stays free of a `librechat-data-provider` import — keeping the dependency surface of the public helper minimal.

With this, external consumers of `createAgentChatCompletion` who pass `appConfig` with the agents capabilities get `bash_tool` + `read_file` registration automatically; consumers who don't pass `appConfig` retain the existing "explicit opt-in" semantics (the flag stays `undefined`, expansion is skipped).

* 🧹 chore: Review-driven polish — observability log, JSDoc DRY, test gaps, no-op allocation

Addresses the comprehensive review of PR #12767:
- **Finding #1** (MINOR, observability): `initializeAgent` now emits a debug log when an agent lists `execute_code` in its tools but the runtime gate is off (`params.codeEnvAvailable !== true`). The event-driven `loadToolDefinitionsWrapper` path doesn't log capability-disabled warnings, so without this the tool silently vanishes from the LLM's definitions with zero trace. Operators debugging "why isn't code interpreter working?" now get a signal at the initialize layer.
- **Finding #5** (NIT, allocation): `registerCodeExecutionTools` now returns the input `toolDefinitions` array by reference on the no-op path (both tools already registered by a prior caller in the same run) instead of allocating a fresh spread array every time. The common dual-call scenario — `initializeAgent` then `injectSkillCatalog` — saves one O(n) copy per request.
- **Finding #4** (NIT, DRY): Collapsed the duplicated 6-line JSDoc comment in `openai.js`, `responses.js`, and `addedConvo.js` into either a one-line `@see DiscoverConnectedAgentsParams.codeEnvAvailable` pointer (the two JS call sites) or a compact 3-line block referring back to the canonical source (addedConvo's @param).
- **Finding #2** (MINOR, test gap): Added `api/server/services/Endpoints/agents/addedConvo.spec.js` with three cases covering `codeEnvAvailable=true`, `codeEnvAvailable=false`, and omitted (undefined) passthrough. A future refactor that drops the param from destructuring now surfaces here instead of silently regressing multi-convo parallel agents with `execute_code`.
- **Finding #3** (MINOR, test gap): Added `api/server/controllers/agents/__tests__/client.memory.spec.js` pinning the capability-flag derivation that `AgentClient::useMemory` uses — six cases covering present/absent/null/undefined config shapes plus an enum-literal pin (`'execute_code'` / `'agents'`). Catches enum renames or config-path shifts that would otherwise silently strip `bash_tool` + `read_file` from memory agents.

Finding #7 (jest.mock scoping, confidence 40) left as-is: the reviewer's own risk assessment noted `buildToolSet` doesn't touch the mocked exports, and restructuring a file-level `jest.mock` to `jest.doMock` + dynamic `import()` introduces more complexity than the speculative risk justifies. The existing mock is scoped to the test file and contains the same stubs the adjacent `skills.test.ts` already uses. Finding #6 (PR description commit count) addressed out-of-band via PR description update.

All existing tests pass, typecheck clean, lint clean across touched files. New tests: 9 cases across 2 new spec files.
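A rough illustration of the capability derivation these commits describe (the helper name is hypothetical; the config path mirrors the `deps.appConfig?.endpoints?.agents?.capabilities` lookup quoted above, and in-repo callers read `AgentCapabilities.execute_code` from `librechat-data-provider` rather than the bare string):

```js
// Hedged sketch, not the actual controller code: resolve the admin-level
// code-execution flag from the agents endpoint capabilities.
function deriveCodeEnvAvailable(appConfig) {
  const capabilities = appConfig?.endpoints?.agents?.capabilities ?? [];
  return capabilities.includes('execute_code');
}

// Omitting appConfig keeps the public helper's "explicit opt-in" semantics:
// the flag stays falsy and the bash_tool + read_file expansion is skipped.
```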
* 🧽 refactor: Replace hardcoded 'execute_code' string with AgentCapabilities enum in service.ts

Follow-up review (conf 55) caught that `openai/service.ts`'s Phase 8 `codeEnvAvailable` derivation used the literal `'execute_code'` while every in-repo controller uses `AgentCapabilities.execute_code` from `librechat-data-provider`. The file deliberately uses local type interfaces to keep the public API helper's type surface small, but that pattern was never a ban on single-value imports from the data provider — `packages/api` already depends on it. Importing the enum value means a future rename of `AgentCapabilities.execute_code` propagates to this file automatically, matching the in-repo controllers' behavior.

Other follow-up findings left as-is per the reviewer's own verdict:
- #2 (memory spec mirrors the production expression rather than calling `AgentClient::useMemory` directly): reviewer flagged as "not blocking" / "design-philosophy observation." The test file's JSDoc already explicitly documents the tradeoff and pins the enum literals to catch the most likely drift vector. Standing up `AgentClient` + all its mocks for a one-line regression guard is disproportionate.
- #3 (`addedConvo.spec.js` mock signature vs. underlying `loadAddedAgent` arity): reviewer's own confidence 25 noted the mock matches the wrapper's actual call pattern in the production file. Not a real gap.
- #4 was self-retracted as a false alarm.

* 🗑️ refactor: Fully deprecate CODE_API_KEY — remove all LibreChat-side references

The code-execution sandbox no longer authenticates via a per-run `CODE_API_KEY` (frontend or backend). Auth moved server-side into the agents library / sandbox service, so LibreChat drops every reference:

**Backend plumbing:**
- `api/server/services/Files/Code/crud.js`: `getCodeOutputDownloadStream`, `uploadCodeEnvFile`, `batchUploadCodeEnvFiles` no longer accept `apiKey` or send the `X-API-Key` header.
- `api/server/services/Files/Code/process.js`: `processCodeOutput`, `getSessionInfo`, `primeFiles` drop the `apiKey` param throughout.
- `api/server/services/ToolService.js`: stop reading `process.env[EnvVar.CODE_API_KEY]` for `primeCodeFiles` and PTC; the agents library handles auth internally. Remove the now-dead `loadAuthValues` + `EnvVar` imports. Drop the misleading "LIBRECHAT_CODE_API_KEY" hint from the bash_tool error log.
- `api/server/services/Files/process.js`: remove the `loadAuthValues` call around `uploadCodeEnvFile`.
- `api/server/routes/files/files.js`: code-env file download no longer fetches a per-user key.
- `api/server/controllers/tools.js`: `execute_code` is no longer a tool that needs verifyToolAuth with `[EnvVar.CODE_API_KEY]` — the endpoint always reports system-authenticated so the client skips the key-entry dialog. `processCodeOutput` called without `apiKey`.
- `api/server/controllers/agents/callbacks.js`: `processCodeOutput` invoked without the loadAuthValues round trip, for both LegacyHandler and Responses-API handlers.
- `api/app/clients/tools/util/handleTools.js`: `createCodeExecutionTool` called with just `user_id` + files.

**packages/api:**
- `packages/api/src/agents/skillFiles.ts`: `PrimeSkillFilesParams`, `PrimeInvokedSkillsDeps`, `primeSkillFiles`, `primeInvokedSkills` all drop the `apiKey` param; the gate is purely `codeEnvAvailable`.
- `packages/api/src/agents/handlers.ts`: `handleSkillToolCall` drops the `process.env[EnvVar.CODE_API_KEY]` read; skill-file priming is now gated solely on `codeEnvAvailable`. `ToolExecuteOptions` signatures drop apiKey from `batchUploadCodeEnvFiles` and `getSessionInfo`.
- `packages/api/src/agents/skillConfigurable.ts`: JSDoc no longer references the env var.
- `packages/api/src/tools/classification.ts`: PTC creation no longer gated on `loadAuthValues`; `buildToolClassification` drops the `loadAuthValues` dep entirely (no LibreChat-side callers need it for this path anymore).
- `packages/api/src/tools/definitions.ts`: `LoadToolDefinitionsDeps` drops the `loadAuthValues` field.

**Frontend:**
- Delete `client/src/hooks/Plugins/useAuthCodeTool.ts`, `useCodeApiKeyForm.ts`, and `client/src/components/SidePanel/Agents/Code/ApiKeyDialog.tsx` — the install/revoke dialogs for CODE_API_KEY are fully dead.
- `BadgeRowContext.tsx`: drop `codeApiKeyForm` from the context type and provider. `codeInterpreter` toggle treated as always authenticated (sandbox auth is server-side).
- `ToolsDropdown.tsx`, `ToolDialogs.tsx`, `CodeInterpreter.tsx`, `RunCode.tsx`, `SidePanel/Agents/Code/Action.tsx` + `Form.tsx`: all API-key dialog trigger refs, "Configure code interpreter" gear buttons, and auth-verification plumbing removed. The "Code Interpreter" toggle is now a plain `AgentCapabilities.execute_code` checkbox — no key-entry gate.
- `client/src/locales/en/translation.json`: drop the three `com_ui_librechat_code_api*` keys and `com_ui_add_code_interpreter_api_key`. Other locales are externally automated per CLAUDE.md.

**Config:**
- `.env.example`: remove the `# LIBRECHAT_CODE_API_KEY=your-key` section and its header.

**Tests:**
- `crud.spec.js`: assertions flipped to pin "no X-API-Key header" and "no apiKey param".
- `skillFiles.spec.ts`: removed env-var save/restore; tests now pin that the batch-upload path is gated solely on `codeEnvAvailable` and that no apiKey is threaded through.
- `handlers.spec.ts`: same — just the `codeEnvAvailable` gate pins remain.
- `classification.spec.ts`: remove the two tests that asserted `loadAuthValues` was (not) called for PTC.
- `definitions.spec.ts`: drop every `loadAuthValues: mockLoadAuthValues` entry from the deps shape.
- `process.spec.js`: strip the mock of `EnvVar.CODE_API_KEY`.

**Comment hygiene:**
- `tools.ts`, `initialize.ts`, `registry/definitions.ts`: shortened stale comment references to "legacy `execute_code` tool" without naming the retired env var.

Tests verified: 678 packages/api tests pass, 836 backend api tests pass. Typecheck clean, lint clean. Only remaining CODE_API_KEY mentions in the code are two regression-guard assertions:
- `crud.spec.js`: pins "no X-API-Key header" stays absent.
- `skillConfigurable.spec.ts`: pins `configurable` never grows a `codeApiKey` field.
* 🧹 chore: Remove the last two CODE_API_KEY name mentions in LibreChat

Follow-up to the prior full deprecation commit: two tests still named the retired identifier in their regression-guard assertions.
- `packages/api/src/agents/skillConfigurable.spec.ts`: drop the "does not inject a codeApiKey key" test. The `codeApiKey` field is gone from the production configurable shape, so an absence-assertion naming it re-introduces the retired identifier in code.
- `api/server/services/Files/Code/crud.spec.js`: rename the "without an X-API-Key header" case back to "should request stream response from the correct URL" and drop the `expect(headers).not.toHaveProperty('X-API-Key')` assertion. The surrounding request-shape checks (URL, timeout, responseType) still pin the behavior; the explicit header-absence line was named after the deprecated contract.

Result: `grep -rn "CODE_API_KEY\|codeApiKey\|LIBRECHAT_CODE_API_KEY"` against the LibreChat source tree returns zero hits. The only remaining `X-API-Key` strings in this repo are on unrelated OpenAPI Action + MCP server auth configurations, where the string is user-facing config, not a LibreChat-owned identifier.

Tests: 677 packages/api pass (2 pre-existing summarization e2e failures unrelated); 126 api-workspace controller/service tests pass. Typecheck and lint clean.

* 🎯 fix: Narrow codeEnvAvailable to per-agent (admin cap AND agent.tools)

Before this commit, `codeEnvAvailable` was computed in the three JS controllers as the admin-level capability flag only (`enabledCapabilities.has(AgentCapabilities.execute_code)`) and passed through `initializeAgent` → `injectSkillCatalog` / `primeInvokedSkills` / `enrichWithSkillConfigurable` unchanged. A skills-only agent whose `tools` array didn't include `execute_code` still got `bash_tool` registered (via `injectSkillCatalog`) and skill files re-primed to the sandbox on every turn — wrong, because the agent never opted in to code execution.

**Fix:** `initializeAgent` now computes the per-agent effective value once as `params.codeEnvAvailable === true && agent.tools.includes(Tools.execute_code)`, and reuses the same boolean for:
1. The `execute_code` → `bash_tool + read_file` expansion gate (previously already consulted `agent.tools`; now shares the single `effectiveCodeEnvAvailable` binding).
2. The `injectSkillCatalog` call (previously got the raw admin flag).
3. The returned `InitializedAgent.codeEnvAvailable` field (new, typed as required boolean).

**Controllers (initialize.js, openai.js, responses.js):** store `primaryConfig.codeEnvAvailable` in `agentToolContexts.set(primaryId, ...)`, capture `config.codeEnvAvailable` in every handoff `onAgentInitialized` callback, and read it from the per-agent ctx inside the `toolExecuteOptions.loadTools` runtime closure. The hoisted `const codeEnvAvailable = enabledCapabilities.has(...)` locals in the two OpenAI-compat controllers are gone — they were shadowing the narrowed per-agent value.

**primeInvokedSkills:** `handlePrimeInvokedSkills` in `services/Endpoints/agents/initialize.js` now uses `primaryConfig.codeEnvAvailable` (per-agent, narrowed) instead of the raw admin flag. A skills-only primary agent won't re-prime historical skill files to the sandbox even when the admin enabled the capability globally.

**Efficiency:** one extra `&&` in `initializeAgent`. No runtime hot-path cost — the `includes()` scan on `agent.tools` was already happening for the `execute_code` expansion gate; it's now just bound to a local. Tool execution closures read `ctx.codeEnvAvailable === true` (property access + strict equality, O(1)).

**Ephemeral-agent note:** per-agent narrowing is authoritative for both persisted and ephemeral flows. The ephemeral toggle (`ephemeralAgent.execute_code`) is reconciled into `agent.tools` upstream in `packages/api/src/agents/added.ts`, so `agent.tools.includes('execute_code')` is the single source of truth by the time `initializeAgent` runs.

**Tests:** two new regression tests pin the narrowing contract:
- `initialize.test.ts` — four-quadrant matrix on `InitializedAgent.codeEnvAvailable` (cap on × agent asks, cap on × doesn't ask, cap off × asks, neither). Catches future refactors that drop either half of the AND.
- `skills.test.ts` — `injectSkillCatalog` with `codeEnvAvailable: false` against an active skill catalog must NOT register `bash_tool` even though it still registers `read_file` + `skill`. This is the state a skills-only agent gets post-narrowing.

All 191 affected packages/api tests pass + 836 backend api tests pass. Typecheck clean, lint clean.

* 🧽 refactor: Comprehensive-review polish — hoist tool defs, pin verifyToolAuth contract, doc appConfig

Addresses the comprehensive review of Phase 8. Findings mapped:

**#1 (MINOR): `verifyToolAuth` unconditional auth for execute_code**
- Added doc comment explicitly stating the deployment contract (admin capability → reachable sandbox; no per-check health probe to keep UI-gate queries O(1)).
- New `api/server/controllers/__tests__/tools.verifyToolAuth.spec.js` with 4 regression tests pinning the contract:
  1. `authenticated: true` + `SYSTEM_DEFINED` for execute_code.
  2. 404 for unknown tool IDs.
  3. `loadAuthValues` is never consulted (catches a future revert that would resurface the per-user key-entry dialog).
  4. Response `message` is never `USER_PROVIDED`.

**#2 (MINOR): `openai/service.ts` undocumented `appConfig` dependency**
- Expanded the `ChatCompletionDependencies.appConfig` JSDoc to spell out that omitting it silently disables code execution for agents with `execute_code` in their tools. External consumers of `createAgentChatCompletion` now have the contract documented at the type boundary.

**#5 (NIT): `registerCodeExecutionTools` re-allocates tool defs**
- Hoisted `READ_FILE_DEF` and `BASH_TOOL_DEF` to module-level `Object.freeze`d constants. The shapes derive entirely from static `@librechat/agents` exports, so a single frozen object per tool is safe to share across every agent init. Eliminates the ~4-property allocations on every call (including the common second-call no-op path).

**#6 (NIT): Verbose history-priming comment in initialize.js**
- Trimmed the 16-line `handlePrimeInvokedSkills` block to a 5-line summary with `@see InitializedAgent.codeEnvAvailable` pointer. The canonical narrowing explanation lives on the type; the controller comment is just the ACL-vs-capability rationale.

**Skipped:**
- #3 (memory spec tests a mirror function): reviewer self-dismissed as a design tradeoff; the enum-literal pin already catches the highest-risk drift vector.
- #4 (cross-repo contract for `createCodeExecutionTool`): user will explicitly install the latest `@librechat/agents` dev version once the companion PR publishes, so the version pin will be authoritative.
- #7 (migration/deprecation note for self-hosters): out of scope per user direction — release notes handle this.

Tests verified: 679 packages/api + 840 backend api tests pass. Typecheck + lint clean.

* 🔧 chore: Update @librechat/agents version to 3.1.68-dev.1 across package-lock and package.json files

This commit updates the version of the `@librechat/agents` package from `3.1.68-dev.0` to `3.1.68-dev.1` in the `package-lock.json` and relevant `package.json` files. This change ensures consistency across the project and incorporates any updates or fixes from the new version.
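The per-agent narrowing in the 🎯 commit reduces to a single AND over the admin flag and the agent's own tool list; a minimal, hypothetical sketch (the helper name and the bare `'execute_code'` literal are illustrative stand-ins for the real `initializeAgent` code and the `Tools.execute_code` enum):

```js
// Hedged sketch of the narrowing gate (not the actual initializeAgent code):
// both halves must hold — the admin enabled the capability for the run AND
// the agent itself lists execute_code in its tools.
function narrowCodeEnvAvailable(adminCodeEnvAvailable, agentTools = []) {
  return adminCodeEnvAvailable === true && agentTools.includes('execute_code');
}

// A skills-only agent (no 'execute_code' in tools) therefore still gets read_file
// for skill references, but never bash_tool, even when the admin capability is on.
```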
1063 lines
34 KiB
JavaScript
const fs = require('fs');
const path = require('path');
const mime = require('mime');
const { v4 } = require('uuid');
const {
  isUUID,
  megabyte,
  FileContext,
  FileSources,
  imageExtRegex,
  EModelEndpoint,
  EToolResources,
  mergeFileConfig,
  AgentCapabilities,
  checkOpenAIStorage,
  removeNullishValues,
  isAssistantsEndpoint,
  getEndpointFileConfig,
  documentParserMimeTypes,
} = require('librechat-data-provider');
const { logger } = require('@librechat/data-schemas');
const { sanitizeFilename, parseText, processAudioFile } = require('@librechat/api');
const {
  convertImage,
  resizeAndConvert,
  resizeImageBuffer,
} = require('~/server/services/Files/images');
const { addResourceFileId, deleteResourceFileId } = require('~/server/controllers/assistants/v2');
const { getOpenAIClient } = require('~/server/controllers/assistants/helpers');
const { loadAuthValues } = require('~/server/services/Tools/credentials');
const { getFileStrategy } = require('~/server/utils/getFileStrategy');
const { checkCapability } = require('~/server/services/Config');
const { LB_QueueAsyncCall } = require('~/server/utils/queue');
const { getStrategyFunctions } = require('./strategies');
const { determineFileType } = require('~/server/utils');
const { STTService } = require('./Audio/STTService');
const db = require('~/models');

/**
 * Creates a modular file upload wrapper that ensures filename sanitization
 * across all storage strategies. This prevents storage-specific implementations
 * from having to handle sanitization individually.
 *
 * @param {Function} uploadFunction - The storage strategy's upload function
 * @returns {Function} - Wrapped upload function with sanitization
 */
const createSanitizedUploadWrapper = (uploadFunction) => {
  return async (params) => {
    const { req, file, file_id, ...restParams } = params;

    // Create a modified file object with sanitized original name
    // This ensures consistent filename handling across all storage strategies
    const sanitizedFile = {
      ...file,
      originalname: sanitizeFilename(file.originalname),
    };

    return uploadFunction({ req, file: sanitizedFile, file_id, ...restParams });
  };
};
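
// Illustrative usage (hedged sketch, not part of the original module): the wrapper
// sits between a caller and any storage strategy's upload function, e.g.
//
//   const { handleFileUpload } = getStrategyFunctions(source);
//   const sanitizedUploadFn = createSanitizedUploadWrapper(handleFileUpload);
//   const result = await sanitizedUploadFn({ req, file, file_id });
//
// so `file.originalname` passes through `sanitizeFilename` before any strategy
// (local, S3, Firebase, etc.) sees it — exactly how processFileUpload uses it below.
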
/**
 * Enqueues the delete operation to the leaky bucket queue if necessary, or adds it directly to promises.
 *
 * @param {object} params - The passed parameters.
 * @param {ServerRequest} params.req - The express request object.
 * @param {MongoFile} params.file - The file object to delete.
 * @param {Function} params.deleteFile - The delete file function.
 * @param {Promise[]} params.promises - The array of promises to await.
 * @param {string[]} params.resolvedFileIds - The array of resolved file IDs.
 * @param {OpenAI | undefined} [params.openai] - If an OpenAI file, the initialized OpenAI client.
 */
function enqueueDeleteOperation({ req, file, deleteFile, promises, resolvedFileIds, openai }) {
  if (checkOpenAIStorage(file.source)) {
    // Enqueue to leaky bucket
    promises.push(
      new Promise((resolve, reject) => {
        LB_QueueAsyncCall(
          () => deleteFile(req, file, openai),
          [],
          (err, result) => {
            if (err) {
              logger.error('Error deleting file from OpenAI source', err);
              reject(err);
            } else {
              resolvedFileIds.push(file.file_id);
              resolve(result);
            }
          },
        );
      }),
    );
  } else {
    // Add directly to promises
    promises.push(
      deleteFile(req, file)
        .then(() => resolvedFileIds.push(file.file_id))
        .catch((err) => {
          logger.error('Error deleting file', err);
          return Promise.reject(err);
        }),
    );
  }
}

// TODO: refactor as currently only image files can be deleted this way
// as other filetypes will not reside in public path
/**
 * Deletes a list of files from the server filesystem and the database.
 *
 * @param {Object} params - The params object.
 * @param {MongoFile[]} params.files - The file objects to delete.
 * @param {ServerRequest} params.req - The express request object.
 * @param {DeleteFilesBody} params.req.body - The request body.
 * @param {string} [params.req.body.agent_id] - The agent ID if file uploaded is associated to an agent.
 * @param {string} [params.req.body.assistant_id] - The assistant ID if file uploaded is associated to an assistant.
 * @param {string} [params.req.body.tool_resource] - The tool resource if assistant file uploaded is associated to a tool resource.
 *
 * @returns {Promise<void>}
 */
const processDeleteRequest = async ({ req, files }) => {
  const appConfig = req.config;
  const resolvedFileIds = [];
  const deletionMethods = {};
  const promises = [];

  /** @type {Record<string, OpenAI | undefined>} */
  const client = { [FileSources.openai]: undefined, [FileSources.azure]: undefined };
  const initializeClients = async () => {
    if (appConfig.endpoints?.[EModelEndpoint.assistants]) {
      const openAIClient = await getOpenAIClient({
        req,
        overrideEndpoint: EModelEndpoint.assistants,
      });
      client[FileSources.openai] = openAIClient.openai;
    }

    if (!appConfig.endpoints?.[EModelEndpoint.azureOpenAI]?.assistants) {
      return;
    }

    const azureClient = await getOpenAIClient({
      req,
      overrideEndpoint: EModelEndpoint.azureAssistants,
    });
    client[FileSources.azure] = azureClient.openai;
  };

  if (req.body.assistant_id !== undefined) {
    await initializeClients();
  }

  const agentFiles = [];

  for (const file of files) {
    const source = file.source ?? FileSources.local;
    if (req.body.agent_id && req.body.tool_resource) {
      agentFiles.push({
        tool_resource: req.body.tool_resource,
        file_id: file.file_id,
      });
    }

    if (source === FileSources.text) {
      resolvedFileIds.push(file.file_id);
      continue;
    }

    if (checkOpenAIStorage(source) && !client[source]) {
      await initializeClients();
    }

    const openai = client[source];

    if (req.body.assistant_id && req.body.tool_resource) {
      promises.push(
        deleteResourceFileId({
          req,
          openai,
          file_id: file.file_id,
          assistant_id: req.body.assistant_id,
          tool_resource: req.body.tool_resource,
        }),
      );
    } else if (req.body.assistant_id) {
      promises.push(openai.beta.assistants.files.del(req.body.assistant_id, file.file_id));
    }

    if (deletionMethods[source]) {
      enqueueDeleteOperation({
        req,
        file,
        deleteFile: deletionMethods[source],
        promises,
        resolvedFileIds,
        openai,
      });
      continue;
    }

    const { deleteFile } = getStrategyFunctions(source);
    if (!deleteFile) {
      throw new Error(`Delete function not implemented for ${source}`);
    }

    deletionMethods[source] = deleteFile;
    enqueueDeleteOperation({ req, file, deleteFile, promises, resolvedFileIds, openai });
  }

  if (agentFiles.length > 0) {
    promises.push(
      db.removeAgentResourceFiles({
        agent_id: req.body.agent_id,
        files: agentFiles,
      }),
    );
  }

  await Promise.allSettled(promises);
  await db.deleteFiles(resolvedFileIds);

  if (resolvedFileIds.length > 0) {
    try {
      await db.removeAgentResourceFilesFromAllAgents({ file_ids: resolvedFileIds });
    } catch (error) {
      logger.error('Error cleaning up orphaned agent file references', error);
    }
  }
};

/**
 * Processes a file URL using a specified file handling strategy. This function accepts a strategy name,
 * fetches the corresponding file processing functions (for saving and retrieving file URLs), and then
 * executes these functions in sequence. It first saves the file using the provided URL and then retrieves
 * the URL of the saved file. If any error occurs during this process, it logs the error and throws an
 * exception with an appropriate message.
 *
 * @param {Object} params - The parameters object.
 * @param {FileSources} params.fileStrategy - The file handling strategy to use.
 *   Must be a value from the `FileSources` enum, which defines different file
 *   handling strategies (like saving to Firebase, local storage, etc.).
 * @param {string} params.userId - The user's unique identifier. Used for creating user-specific paths or
 *   references in the file handling process.
 * @param {string} params.URL - The URL of the file to be processed.
 * @param {string} params.fileName - The name that will be used to save the file (including extension)
 * @param {string} params.basePath - The base path or directory where the file will be saved or retrieved from.
 * @param {FileContext} params.context - The context of the file (e.g., 'avatar', 'image_generation', etc.)
 * @returns {Promise<MongoFile>} A promise that resolves to the DB representation (MongoFile)
 *  of the processed file. It throws an error if the file processing fails at any stage.
 */
const processFileURL = async ({ fileStrategy, userId, URL, fileName, basePath, context }) => {
  const { saveURL, getFileURL } = getStrategyFunctions(fileStrategy);
  try {
    const {
      bytes = 0,
      type = '',
      dimensions = {},
    } = (await saveURL({ userId, URL, fileName, basePath })) || {};
    const filepath = await getFileURL({ fileName: `${userId}/${fileName}`, basePath });
    return await db.createFile(
      {
        user: userId,
        file_id: v4(),
        bytes,
        filepath,
        filename: fileName,
        source: fileStrategy,
        type,
        context,
        width: dimensions.width,
        height: dimensions.height,
      },
      true,
    );
  } catch (error) {
    logger.error(`Error while processing the image with ${fileStrategy}:`, error);
    throw new Error(`Failed to process the image with ${fileStrategy}. ${error.message}`);
  }
};

/**
 * Applies the current strategy for image uploads.
 * Saves file metadata to the database with an expiry TTL.
 *
 * @param {Object} params - The parameters object.
 * @param {ServerRequest} params.req - The Express request object.
 * @param {Express.Response} [params.res] - The Express response object.
 * @param {ImageMetadata} params.metadata - Additional metadata for the file.
 * @param {boolean} params.returnFile - Whether to return the file metadata or return response as normal.
 * @returns {Promise<void>}
 */
const processImageFile = async ({ req, res, metadata, returnFile = false }) => {
  const { file } = req;
  const appConfig = req.config;
  const source = getFileStrategy(appConfig, { isImage: true });
  const { handleImageUpload } = getStrategyFunctions(source);
  const { file_id, temp_file_id, endpoint } = metadata;

  const { filepath, bytes, width, height } = await handleImageUpload({
    req,
    file,
    file_id,
    endpoint,
  });

  const result = await db.createFile(
    {
      user: req.user.id,
      file_id,
      temp_file_id,
      bytes,
      filepath,
      filename: file.originalname,
      context: FileContext.message_attachment,
      source,
      type: `image/${appConfig.imageOutputType}`,
      width,
      height,
    },
    true,
  );

  if (returnFile) {
    return result;
  }
  res.status(200).json({ message: 'File uploaded and processed successfully', ...result });
};

/**
 * Applies the current strategy for image uploads and
 * returns minimal file metadata, without saving to the database.
 *
 * @param {Object} params - The parameters object.
 * @param {ServerRequest} params.req - The Express request object.
 * @param {FileContext} params.context - The context of the file (e.g., 'avatar', 'image_generation', etc.)
 * @param {boolean} [params.resize=true] - Whether to resize and convert the image to target format. Default is `true`.
 * @param {{ buffer: Buffer, width: number, height: number, bytes: number, filename: string, type: string, file_id: string }} [params.metadata] - Required metadata for the file if resize is false.
 * @returns {Promise<{ filepath: string, filename: string, source: string, type: string}>}
 */
const uploadImageBuffer = async ({ req, context, metadata = {}, resize = true }) => {
  const appConfig = req.config;
  const source = getFileStrategy(appConfig, { isImage: true });
  const { saveBuffer } = getStrategyFunctions(source);
  let { buffer, width, height, bytes, filename, file_id, type } = metadata;
  if (resize) {
    file_id = v4();
    type = `image/${appConfig.imageOutputType}`;
    ({ buffer, width, height, bytes } = await resizeAndConvert({
      inputBuffer: buffer,
      desiredFormat: appConfig.imageOutputType,
    }));
    filename = `${path.basename(req.file.originalname, path.extname(req.file.originalname))}.${
      appConfig.imageOutputType
    }`;
  }
  const fileName = `${file_id}-${filename}`;
  const filepath = await saveBuffer({ userId: req.user.id, fileName, buffer });
  return await db.createFile(
    {
      user: req.user.id,
      file_id,
      bytes,
      filepath,
      filename,
      context,
      source,
      type,
      width,
      height,
    },
    true,
  );
};

/**
 * Applies the current strategy for file uploads.
 * Saves file metadata to the database with an expiry TTL.
 * Files must be deleted from the server filesystem manually.
 *
 * @param {Object} params - The parameters object.
 * @param {ServerRequest} params.req - The Express request object.
 * @param {Express.Response} params.res - The Express response object.
 * @param {FileMetadata} params.metadata - Additional metadata for the file.
 * @returns {Promise<void>}
 */
const processFileUpload = async ({ req, res, metadata }) => {
  const appConfig = req.config;
  const isAssistantUpload = isAssistantsEndpoint(metadata.endpoint);
  const assistantSource =
    metadata.endpoint === EModelEndpoint.azureAssistants ? FileSources.azure : FileSources.openai;
  // Use the configured file strategy for regular file uploads (not vectordb)
  const source = isAssistantUpload ? assistantSource : appConfig.fileStrategy;
  const { handleFileUpload } = getStrategyFunctions(source);
  const { file_id, temp_file_id = null } = metadata;

  /** @type {OpenAI | undefined} */
  let openai;
  if (checkOpenAIStorage(source)) {
    ({ openai } = await getOpenAIClient({ req }));
  }

  const { file } = req;
  const sanitizedUploadFn = createSanitizedUploadWrapper(handleFileUpload);
  const {
    id,
    bytes,
    filename,
    filepath: _filepath,
    embedded,
    height,
    width,
  } = await sanitizedUploadFn({
    req,
    file,
    file_id,
    openai,
  });

  if (isAssistantUpload && !metadata.message_file && !metadata.tool_resource) {
    await openai.beta.assistants.files.create(metadata.assistant_id, {
      file_id: id,
    });
  } else if (isAssistantUpload && !metadata.message_file) {
    await addResourceFileId({
      req,
      openai,
      file_id: id,
      assistant_id: metadata.assistant_id,
      tool_resource: metadata.tool_resource,
    });
  }

  let filepath = isAssistantUpload ? `${openai.baseURL}/files/${id}` : _filepath;
  if (isAssistantUpload && file.mimetype.startsWith('image')) {
    const result = await processImageFile({
      req,
      file,
      metadata: { file_id: v4() },
      returnFile: true,
    });
    filepath = result.filepath;
  }

  const result = await db.createFile(
    {
      user: req.user.id,
      file_id: id ?? file_id,
      temp_file_id,
      bytes,
      filepath,
      filename: filename ?? sanitizeFilename(file.originalname),
      context: isAssistantUpload ? FileContext.assistants : FileContext.message_attachment,
      model: isAssistantUpload ? req.body.model : undefined,
      type: file.mimetype,
      embedded,
      source,
      height,
      width,
    },
    true,
  );
  res.status(200).json({ message: 'File uploaded and processed successfully', ...result });
};

/**
 * Applies the current strategy for file uploads.
 * Saves file metadata to the database with an expiry TTL.
 * Files must be deleted from the server filesystem manually.
 *
 * @param {Object} params - The parameters object.
 * @param {ServerRequest} params.req - The Express request object.
 * @param {Express.Response} params.res - The Express response object.
 * @param {FileMetadata} params.metadata - Additional metadata for the file.
 * @returns {Promise<void>}
 */
const processAgentFileUpload = async ({ req, res, metadata }) => {
  const { file } = req;
  const appConfig = req.config;
  const { agent_id, tool_resource, file_id, temp_file_id = null } = metadata;

  let messageAttachment = !!metadata.message_file;

  if (agent_id && !tool_resource && !messageAttachment) {
    throw new Error('No tool resource provided for agent file upload');
  }

  if (tool_resource === EToolResources.file_search && file.mimetype.startsWith('image')) {
    throw new Error('Image uploads are not supported for file search tool resources');
  }

  if (!messageAttachment && !agent_id) {
    throw new Error('No agent ID provided for agent file upload');
  }

  const isImage = file.mimetype.startsWith('image');
  let fileInfoMetadata;
  const entity_id = messageAttachment === true ? undefined : agent_id;
  const basePath = mime.getType(file.originalname)?.startsWith('image') ? 'images' : 'uploads';
  if (tool_resource === EToolResources.execute_code) {
    const isCodeEnabled = await checkCapability(req, AgentCapabilities.execute_code);
    if (!isCodeEnabled) {
      throw new Error('Code execution is not enabled for Agents');
    }
    const { handleFileUpload: uploadCodeEnvFile } = getStrategyFunctions(FileSources.execute_code);
    const stream = fs.createReadStream(file.path);
    const fileIdentifier = await uploadCodeEnvFile({
      req,
      stream,
      filename: file.originalname,
      entity_id,
    });
    fileInfoMetadata = { fileIdentifier };
  } else if (tool_resource === EToolResources.file_search) {
    const isFileSearchEnabled = await checkCapability(req, AgentCapabilities.file_search);
    if (!isFileSearchEnabled) {
      throw new Error('File search is not enabled for Agents');
    }
    // Note: File search processing continues to dual storage logic below
  } else if (tool_resource === EToolResources.context) {
    const { file_id, temp_file_id = null } = metadata;

    /**
     * @param {object} params
     * @param {string} params.text
     * @param {number} params.bytes
     * @param {string} params.filepath
     * @param {string} params.type
     * @return {Promise<void>}
     */
    const createTextFile = async ({ text, bytes, filepath, type = 'text/plain' }) => {
      const textBytes = Buffer.byteLength(text, 'utf8');
      if (textBytes > 15 * megabyte) {
        throw new Error(
          `Extracted text from "${file.originalname}" exceeds the 15MB storage limit (${Math.round(textBytes / megabyte)}MB). Try a shorter document.`,
        );
      }
      const fileInfo = removeNullishValues({
        text,
        bytes,
        file_id,
        temp_file_id,
        user: req.user.id,
        type,
        filepath: filepath ?? file.path,
        source: FileSources.text,
        filename: file.originalname,
        model: messageAttachment ? undefined : req.body.model,
        context: messageAttachment ? FileContext.message_attachment : FileContext.agents,
      });

      if (!messageAttachment && tool_resource) {
        await db.addAgentResourceFile({
          file_id,
          agent_id,
          tool_resource,
          updatingUserId: req?.user?.id,
        });
      }
      const result = await db.createFile(fileInfo, true);
      return res
        .status(200)
        .json({ message: 'Agent file uploaded and processed successfully', ...result });
    };

    const fileConfig = mergeFileConfig(appConfig.fileConfig);

    const shouldUseConfiguredOCR =
      appConfig?.ocr != null &&
      fileConfig.checkType(file.mimetype, fileConfig.ocr?.supportedMimeTypes || []);

    const shouldUseDocumentParser =
      !shouldUseConfiguredOCR && documentParserMimeTypes.some((regex) => regex.test(file.mimetype));

    const shouldUseOCR = shouldUseConfiguredOCR || shouldUseDocumentParser;

    const resolveDocumentText = async () => {
      if (shouldUseConfiguredOCR) {
        try {
          const ocrStrategy = appConfig?.ocr?.strategy ?? FileSources.document_parser;
          const { handleFileUpload } = getStrategyFunctions(ocrStrategy);
          return await handleFileUpload({ req, file, loadAuthValues });
        } catch (err) {
          logger.error(
            `[processAgentFileUpload] Configured OCR failed for "${file.originalname}", falling back to document_parser:`,
            err,
          );
        }
      }
      try {
        const { handleFileUpload } = getStrategyFunctions(FileSources.document_parser);
        return await handleFileUpload({ req, file, loadAuthValues });
      } catch (err) {
        logger.error(
          `[processAgentFileUpload] Document parser failed for "${file.originalname}":`,
          err,
        );
      }
    };

    if (shouldUseConfiguredOCR && !(await checkCapability(req, AgentCapabilities.ocr))) {
      throw new Error('OCR capability is not enabled for Agents');
    }

    if (shouldUseOCR) {
      const ocrResult = await resolveDocumentText();
      if (ocrResult) {
        const { text, bytes, filepath: ocrFileURL } = ocrResult;
        return await createTextFile({ text, bytes, filepath: ocrFileURL });
      }
      throw new Error(
        `Unable to extract text from "${file.originalname}". The document may be image-based and requires an OCR service to process.`,
      );
    }

    const shouldUseSTT = fileConfig.checkType(
      file.mimetype,
      fileConfig.stt?.supportedMimeTypes || [],
    );

    if (shouldUseSTT) {
      const sttService = await STTService.getInstance();
      const { text, bytes } = await processAudioFile({ req, file, sttService });
      return await createTextFile({ text, bytes });
    }

    const shouldUseText = fileConfig.checkType(
      file.mimetype,
      fileConfig.text?.supportedMimeTypes || [],
    );

    if (!shouldUseText) {
      throw new Error(`File type ${file.mimetype} is not supported for text parsing.`);
    }

    const { text, bytes } = await parseText({ req, file, file_id });
    return await createTextFile({ text, bytes, type: file.mimetype });
  }

  // Dual storage pattern for RAG files: Storage + Vector DB
  let storageResult, embeddingResult;
  const isImageFile = file.mimetype.startsWith('image');
  const source = getFileStrategy(appConfig, { isImage: isImageFile });

  if (tool_resource === EToolResources.file_search) {
    // FIRST: Upload to Storage for permanent backup (S3/local/etc.)
    const { handleFileUpload } = getStrategyFunctions(source);
    const sanitizedUploadFn = createSanitizedUploadWrapper(handleFileUpload);
    storageResult = await sanitizedUploadFn({
      req,
      file,
      file_id,
      basePath,
      entity_id,
    });

    // SECOND: Upload to Vector DB
    const { uploadVectors } = require('./VectorDB/crud');

    embeddingResult = await uploadVectors({
      req,
      file,
      file_id,
      entity_id,
    });

    // Vector status will be stored at root level, no need for metadata
    fileInfoMetadata = {};
  } else {
    // Standard single storage for non-RAG files
    const { handleFileUpload } = getStrategyFunctions(source);
    const sanitizedUploadFn = createSanitizedUploadWrapper(handleFileUpload);
    storageResult = await sanitizedUploadFn({
      req,
      file,
      file_id,
      basePath,
      entity_id,
    });
  }

  let { bytes, filename, filepath: _filepath, height, width } = storageResult;
  // For RAG files, use embedding result; for others, use storage result
  let embedded = storageResult.embedded;
  if (tool_resource === EToolResources.file_search) {
    embedded = embeddingResult?.embedded;
    filename = embeddingResult?.filename || filename;
  }

  let filepath = _filepath;

  if (!messageAttachment && tool_resource) {
    await db.addAgentResourceFile({
      file_id,
      agent_id,
      tool_resource,
      updatingUserId: req?.user?.id,
    });
  }

  if (isImage) {
    const result = await processImageFile({
      req,
      file,
      metadata: { file_id: v4() },
      returnFile: true,
    });
    filepath = result.filepath;
  }

  const fileInfo = removeNullishValues({
    user: req.user.id,
    file_id,
    temp_file_id,
    bytes,
    filepath,
    filename: filename ?? sanitizeFilename(file.originalname),
    context: messageAttachment ? FileContext.message_attachment : FileContext.agents,
    model: messageAttachment ? undefined : req.body.model,
    metadata: fileInfoMetadata,
    type: file.mimetype,
    embedded,
    source,
    height,
    width,
  });

  const result = await db.createFile(fileInfo, true);

  res.status(200).json({ message: 'Agent file uploaded and processed successfully', ...result });
};

/**
 * @param {object} params - The params object.
 * @param {OpenAI} params.openai - The OpenAI client instance.
 * @param {string} params.file_id - The ID of the file to retrieve.
 * @param {string} params.userId - The user ID.
 * @param {string} [params.filename] - The name of the file. `undefined` for `file_citation` annotations.
 * @param {boolean} [params.saveFile=false] - Whether to save the file metadata to the database.
 * @param {boolean} [params.updateUsage=false] - Whether to update file usage in database.
 */
const processOpenAIFile = async ({
  openai,
  file_id,
  userId,
  filename,
  saveFile = false,
  updateUsage = false,
}) => {
  const _file = await openai.files.retrieve(file_id);
  const originalName = filename ?? (_file.filename ? path.basename(_file.filename) : undefined);
  const filepath = `${openai.baseURL}/files/${userId}/${file_id}${
    originalName ? `/${originalName}` : ''
  }`;
  const type = mime.getType(originalName ?? file_id);
  const source =
    openai.req.body.endpoint === EModelEndpoint.azureAssistants
      ? FileSources.azure
      : FileSources.openai;
  const file = {
    ..._file,
    type,
    file_id,
    filepath,
    usage: 1,
    user: userId,
    context: _file.purpose,
    source,
    model: openai.req.body.model,
    filename: originalName ?? file_id,
  };

  if (saveFile) {
    await db.createFile(file, true);
  } else if (updateUsage) {
    try {
      await db.updateFileUsage({ file_id });
    } catch (error) {
      logger.error('Error updating file usage', error);
    }
  }

  return file;
};

/**
 * Process OpenAI image files, convert to target format, save and return file metadata.
 * @param {object} params - The params object.
 * @param {ServerRequest} params.req - The Express request object.
 * @param {Buffer} params.buffer - The image buffer.
 * @param {string} params.file_id - The file ID.
 * @param {string} params.filename - The filename.
 * @param {string} params.fileExt - The file extension.
 * @returns {Promise<MongoFile>} The file metadata.
 */
const processOpenAIImageOutput = async ({ req, buffer, file_id, filename, fileExt }) => {
  const currentDate = new Date();
  const formattedDate = currentDate.toISOString();
  const appConfig = req.config;
  const _file = await convertImage(req, buffer, undefined, `${file_id}${fileExt}`);

  // Create only one file record with the correct information
  const file = {
    ..._file,
    usage: 1,
    user: req.user.id,
    type: mime.getType(fileExt),
    createdAt: formattedDate,
    updatedAt: formattedDate,
    source: getFileStrategy(appConfig, { isImage: true }),
    context: FileContext.assistants_output,
    file_id,
    filename,
  };
  db.createFile(file, true);
  return file;
};

/**
 * Retrieves and processes an OpenAI file based on its type.
 *
 * @param {Object} params - The params passed to the function.
 * @param {OpenAIClient} params.openai - The OpenAI client instance.
 * @param {RunClient} params.client - The LibreChat client instance: either refers to `openai` or `streamRunManager`.
 * @param {string} params.file_id - The ID of the file to retrieve.
 * @param {string} [params.basename] - The basename of the file (if image); e.g., 'image.jpg'. `undefined` for `file_citation` annotations.
 * @param {boolean} [params.unknownType] - Whether the file type is unknown.
 * @returns {Promise<{file_id: string, filepath: string, source: string, bytes?: number, width?: number, height?: number} | null>}
 * - Returns null if `file_id` is not defined; else, the file metadata if successfully retrieved and processed.
 */
async function retrieveAndProcessFile({
  openai,
  client,
  file_id,
  basename: _basename,
  unknownType,
}) {
  if (!file_id) {
    return null;
  }

  let basename = _basename;
  const processArgs = { openai, file_id, filename: basename, userId: client.req.user.id };

  // If no basename provided, return only the file metadata
  if (!basename) {
    return await processOpenAIFile({ ...processArgs, saveFile: true });
  }

  const fileExt = path.extname(basename);
  if (client.attachedFileIds?.has(file_id) || client.processedFileIds?.has(file_id)) {
    return processOpenAIFile({ ...processArgs, updateUsage: true });
  }

  /**
   * @returns {Promise<Buffer>} The file data buffer.
   */
  const getDataBuffer = async () => {
    const response = await openai.files.content(file_id);
    const arrayBuffer = await response.arrayBuffer();
    return Buffer.from(arrayBuffer);
  };

  let dataBuffer;
  if (unknownType || !fileExt || imageExtRegex.test(basename)) {
    try {
      dataBuffer = await getDataBuffer();
    } catch (error) {
      logger.error('Error downloading file from OpenAI:', error);
      dataBuffer = null;
    }
  }

  if (!dataBuffer) {
    return await processOpenAIFile({ ...processArgs, saveFile: true });
  }

  // If the filetype is unknown, inspect the file
  if (dataBuffer && (unknownType || !fileExt)) {
    const detectedExt = await determineFileType(dataBuffer);
    const isImageOutput = detectedExt && imageExtRegex.test('.' + detectedExt);

    if (!isImageOutput) {
      return await processOpenAIFile({ ...processArgs, saveFile: true });
    }

    return await processOpenAIImageOutput({
      file_id,
      req: client.req,
      buffer: dataBuffer,
      filename: basename,
      fileExt: detectedExt,
    });
  } else if (dataBuffer && imageExtRegex.test(basename)) {
    return await processOpenAIImageOutput({
      file_id,
      req: client.req,
      buffer: dataBuffer,
      filename: basename,
      fileExt,
    });
  } else {
    logger.debug(`[retrieveAndProcessFile] Non-image file type detected: ${basename}`);
    return await processOpenAIFile({ ...processArgs, saveFile: true });
  }
}

/**
 * Converts a base64 data URL string to a buffer and MIME type.
 * @param {string} base64String
 * @returns {{ buffer: Buffer, type: string }}
 */
function base64ToBuffer(base64String) {
  try {
    const typeMatch = base64String.match(/^data:([A-Za-z-+/]+);base64,/);
    const type = typeMatch ? typeMatch[1] : '';

    const base64Data = base64String.replace(/^data:([A-Za-z-+/]+);base64,/, '');

    if (!base64Data) {
      throw new Error('Invalid base64 string');
    }

    return {
      buffer: Buffer.from(base64Data, 'base64'),
      type,
    };
  } catch (error) {
    throw new Error(`Failed to convert base64 to buffer: ${error.message}`);
  }
}
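
// Example input (hedged, illustrative only — not part of the original module):
// a data URL such as 'data:image/png;base64,iVBORw0KGgo...' yields
// { buffer: <decoded PNG bytes>, type: 'image/png' }, which saveBase64Image
// below uses to pick a file extension via mime.getExtension(type).
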
async function saveBase64Image(
  url,
  { req, file_id: _file_id, filename: _filename, endpoint, context, resolution },
) {
  const appConfig = req.config;
  const effectiveResolution = resolution ?? appConfig.fileConfig?.imageGeneration ?? 'high';
  const file_id = _file_id ?? v4();
  let filename = `${file_id}-${_filename}`;
  const { buffer: inputBuffer, type } = base64ToBuffer(url);
  if (!path.extname(_filename)) {
    const extension = mime.getExtension(type);
    if (extension) {
      filename += `.${extension}`;
    } else {
      throw new Error(`Could not determine file extension from MIME type: ${type}`);
    }
  }

  const image = await resizeImageBuffer(inputBuffer, effectiveResolution, endpoint);
  const source = getFileStrategy(appConfig, { isImage: true });
  const { saveBuffer } = getStrategyFunctions(source);
  const filepath = await saveBuffer({
    userId: req.user.id,
    fileName: filename,
    buffer: image.buffer,
  });
  return await db.createFile(
    {
      type,
      source,
      context,
      file_id,
      filepath,
      filename,
      user: req.user.id,
      bytes: image.bytes,
      width: image.width,
      height: image.height,
    },
    true,
  );
}

/**
 * Filters a file based on its size and the endpoint origin.
 *
 * @param {Object} params - The parameters for the function.
 * @param {ServerRequest} params.req - The request object from Express.
 * @param {string} [params.req.endpoint]
 * @param {string} [params.req.file_id]
 * @param {number} [params.req.width]
 * @param {number} [params.req.height]
 * @param {number} [params.req.version]
 * @param {boolean} [params.image] - Whether the file expected is an image.
 * @param {boolean} [params.isAvatar] - Whether the file expected is a user or entity avatar.
 * @returns {void}
 *
 * @throws {Error} If a file exception is caught (invalid file size or type, lack of metadata).
 */
function filterFile({ req, image, isAvatar }) {
  const { file } = req;
  const { endpoint, endpointType, file_id, width, height } = req.body;

  if (!file_id && !isAvatar) {
    throw new Error('No file_id provided');
  }

  if (file.size === 0) {
    throw new Error('Empty file uploaded');
  }

  /* parse to validate api call, throws error on fail */
  if (!isAvatar) {
    isUUID.parse(file_id);
  }

  if (!endpoint && !isAvatar) {
    throw new Error('No endpoint provided');
  }

  const appConfig = req.config;
  const fileConfig = mergeFileConfig(appConfig.fileConfig);

  const endpointFileConfig = getEndpointFileConfig({
    endpoint,
    fileConfig,
    endpointType,
  });
  const fileSizeLimit =
    isAvatar === true ? fileConfig.avatarSizeLimit : endpointFileConfig.fileSizeLimit;

  if (file.size > fileSizeLimit) {
    throw new Error(
      `File size limit of ${fileSizeLimit / megabyte} MB exceeded for ${
        isAvatar ? 'avatar upload' : `${endpoint} endpoint`
      }`,
    );
  }

  const isSupportedMimeType = fileConfig.checkType(
    file.mimetype,
    endpointFileConfig.supportedMimeTypes,
  );

  if (!isSupportedMimeType) {
    throw new Error('Unsupported file type');
  }

  if (!image || isAvatar === true) {
    return;
  }

  if (!width) {
    throw new Error('No width provided');
  }

  if (!height) {
    throw new Error('No height provided');
  }
}

module.exports = {
  filterFile,
  processFileURL,
  saveBase64Image,
  processImageFile,
  uploadImageBuffer,
  processFileUpload,
  processDeleteRequest,
  processAgentFileUpload,
  retrieveAndProcessFile,
};