mirror of
https://github.com/danny-avila/LibreChat.git
synced 2026-06-30 03:12:11 +00:00
* 💾 feat: Persist Context Breakdown & Branch/Total Usage Cost

  Persist the granular context breakdown and per-response usage/cost on the response message metadata, and re-derive branch + total usage/cost from a per-message index so the popover survives reloads and is branch-aware live.

  - Add aggregateEmittedUsage + buildPersistedContextUsage helpers in packages/api; capture the latest visible snapshot and every emitted on_token_usage payload via contextUsageSink/usageEmitSink.
  - Attach metadata.contextUsage (Part A) and metadata.usage (Part B) on the agents response message in sendCompletion.
  - Carry per-message usage on the token index; add sumTotalUsage/setEntryUsage and branch-scoped usage on sumBranch.
  - Repurpose the session accumulator into a single in-flight pending holder; flush it into the index at finalize; hydrate breakdowns on load.
  - Render branch cost with a conditional all-branches total in the breakdown.

* 🧹 chore: Remove orphaned com_ui_session_cost i18n key

* 🩹 fix: Address Codex review — normalize usage server-side, fix reload deltas

  - Persist per-event-normalized display units in metadata.usage (TResponseUsage) so reloaded mixed-provider turns match the live session; client reads them directly instead of re-normalizing with a single stamped provider (P2).
  - Persist completedOutputTokens (final call output) on metadata.contextUsage so a reloaded multi-call turn adds the post-snapshot delta, not the full tokenCount the snapshot already counts (P2).
  - buildIndex preserves a prior entry's immutable usage when a rebuilt cache message lacks metadata.usage, so a mid-session rebuild (regenerate) keeps a sibling branch's flushed cost (fixes the e2e regenerate failure).
  - Track costKnown so turns saved with contextCost off don't render $0.00 when cost display is later enabled (P3).
  - Use an epsilon for the all-branches cost comparison to avoid a spurious total row from float summation order (P3).
  - Update unit/integration/e2e tests for the new shapes; regenerate e2e asserts the all-branches total after reload (deterministic via persisted metadata).

* 🩹 fix: Address Codex round 2 — pending leak, cost coverage, reload delta

  - Clear the in-flight pending usage on terminal abort/error (resetLive), so a stopped generation's tokens no longer merge into the next response (P2).
  - costKnown now means COMPLETE coverage (ANDed): a branch mixing cost-bearing and cost-less turns is flagged incomplete and the cost row is hidden rather than rendering an under-reported total (P2).
  - Drop the tokenCount fallback for completedOutputTokens on reload: only the persisted post-snapshot delta is used, so a multi-call turn whose provider emitted no usage_metadata no longer double-counts earlier output (P2).
  - Update tokens.spec for AND coverage semantics + incomplete-cost case.

* 🩹 fix: Address Codex round 3 — no-usage snapshots, total coverage, provider-less cache

  - Skip persisting metadata.contextUsage when the response emitted no primary usage event: without a known post-snapshot output the granular gauge would undercount the reply on reload, so fall back to the coarse per-message estimate instead (P2).
  - Gate the all-branches cost row on totalUsage.costKnown so an incomplete total (a sibling saved without cost) never renders an under-reported figure (P2).
  - aggregateEmittedUsage/finalCallOutputTokens now normalize per-event with the client's magnitude fallback (normalizeEventUnits) instead of billing splitUsage, so provider-less cached events match live on reload (P2).
  - Add backend test for the provider-less cached case.

* 🩹 fix: Address Codex round 4 — abort attribution, complete cost coverage

  - aggregateEmittedUsage persists cost only when EVERY call was priced; a partial pricing failure now omits cost so the client treats coverage as unknown rather than reading an under-reported sum as authoritative (P2).
  - finalizeUsage flushes pending into the response entry only when events were folded this session (eventCount > 0), so a late/second resumable subscriber carrying persisted metadata.usage keeps it instead of being overwritten with an empty pending record (P2).
  - On user stop, attribute the in-flight pending usage to the partial response (new attributePending handler) instead of discarding it in resetLive — the stopped reply's billed tokens are kept and still can't leak into the next response; resetLive's discard remains for the error path (P2).

* 🐛 fix: Persist branch cost across branch switches via sticky usage history

  Branch cost vanished on switching to a sibling branch (until a new turn) — the cost analog of the granularity bug. buildIndex rebuilds the token index from the messages cache; a sibling generated this session whose cache message lacks metadata.usage (and is transiently dropped from the cache during regenerate) lost its live-flushed usage, so sumBranch found none and the cost row hid.

  Fix: a sticky per-response usage map (conversationId → messageId → usage), written by setEntryUsage and never rebuilt from the cache — the usage counterpart of snapshotsByAnchorFamily for the breakdown. buildIndex/upsertEntries restore an entry's usage from it when the message carries none; cleared on convo switch and migrated with the index.

  Add unit coverage for the drop-then-readd regression and an e2e assertion that branch cost survives a branch switch.

* 🐛 fix: Re-index on branch switch so branch cost survives the switch

  The sticky usage history alone didn't fix the reported branch-switch cost drop: on a branch switch no cache `updated` event fires, so the index subscriber never re-ran, and the post-regenerate rebuild was skipped while `isSubmitting` was still true — leaving the index stale and missing the now-viewed branch's response entirely (sticky can only restore entries present in a rebuild).

  Re-index from the messages cache on every tail change (created/finalize AND branch switch), not just while submitting. The cache holds the full message set at switch time, so the viewed branch's response is re-added and its usage restored from metadata.usage or the sticky history → sumBranch finds it and the branch cost renders.

  Verified locally: the branch-switch e2e now passes (the cost section shows both the branch row and the all-branches total). Also fixed that e2e assertion to target a single cost value (strict-mode safe).

* 🩹 fix: Handle stopped-stream usage — reset pending + persist abort metadata

  Codex round (stop/abort edges):

  - Resumable explicit-stop (intentional SSE close) reset UI state but never cleared pendingUsageFamily, so usage folded before the stop leaked into the next response in the conversation. Discard pending on intentional close (resetLive); a resume re-folds via backfillUsage, so nothing is lost.
  - The abort save path (abortMiddleware) persisted the stopped response without metadata.usage/contextUsage, so its cost + breakdown vanished on reload. Rebuild both from the job's persisted tokenUsage (emitted payloads incl. cost) and contextUsage snapshot — parity with the normal sendCompletion path; breakdown gated on a primary usage event like buildResponseMetadata.

  Deferred (per scope decision): mid-stream branch-switch transiently shows the streaming branch's pending on the viewed sibling (cosmetic, until finalize).

* 🩹 fix: Persist abort metadata on the real agents route + tighten snapshot gate

  Codex round (corrects last round's wrong-path fixes):

  - Stopped AGENTS responses are saved by routes/agents/index.js (/chat/abort), not abortMiddleware — so last round's metadata fix never ran for them. Moved the rollup/snapshot builder into packages/api as buildAbortedResponseMetadata (shared, unit-tested) and applied it in BOTH abort save paths, so a stopped agent reply keeps its cost + breakdown on reload.
  - Persist the breakdown only when the FINAL visible call emitted usage: track a per-response snapshot count and require primaryUsageCount >= snapshotCount. Previously any earlier primary usage event passed the gate, so a multi-call turn whose final call emitted no usage_metadata used an earlier call's output as completedOutputTokens (already counted by the latest snapshot) → reload over-reported. Now it falls back to the coarse estimate.

  Resumable stop pending-reset (prior round, 3cde6fe035) already flows through clearAllSubmissions → SSE close → the intentional-close handler's resetLive.

  Deferred per scope: mid-stream branch-switch pending attribution (tracked).

* 🩹 fix: Abort breakdown over-count + resume re-fold after pending discard

  Codex round (on the re-applied abort/snapshot work):

  - buildAbortedResponseMetadata now persists ONLY the usage/cost rollup, not the context breakdown. The abort path can't tell whether the final call emitted usage (the job stores only the latest snapshot, not a count), so persisting the breakdown risked reusing an earlier call's output as completedOutputTokens (already in the snapshot) → reload over-count. Stopped/incomplete responses now fall back to the coarse gauge estimate, which is safe and apt.
  - resetLive now also forgets the conversation's folded usage-event identities (clearUsageFolded). Discarding pending on a terminal/intentional close left the folded keys set, so a later resume's backfillUsage saw the persisted events as duplicates and never rebuilt pending — leaving the response's usage missing until a full reload. Clearing them lets the resume re-fold.
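The commit notes above describe two rules for the cost rows: `costKnown` is ANDed across every turn, and the all-branches total is compared to the branch cost with an epsilon. The following is a minimal sketch of how those two rules could fit together. The names `sumUsage`, `shouldShowAllBranchesRow`, and `COST_EPSILON`, the entry shape, and the epsilon value are all hypothetical and chosen for illustration; they are not the helpers or types used in the repository.

```javascript
// Hypothetical tolerance; the actual value used by the project is not stated in the notes.
const COST_EPSILON = 1e-9;

/**
 * Sums per-response usage entries (hypothetical shape: { input, output, cost? }).
 * `costKnown` is ANDed: one entry without a numeric cost marks the whole sum incomplete.
 */
function sumUsage(entries) {
  const total = { input: 0, output: 0, cost: 0, costKnown: entries.length > 0 };
  for (const entry of entries) {
    total.input += entry.input;
    total.output += entry.output;
    if (typeof entry.cost === 'number') {
      total.cost += entry.cost;
    } else {
      total.costKnown = false;
    }
  }
  return total;
}

/**
 * Shows the all-branches row only when both sums have complete cost coverage
 * and the total differs from the branch cost by more than float noise.
 */
function shouldShowAllBranchesRow(branchUsage, totalUsage) {
  if (!branchUsage.costKnown || !totalUsage.costKnown) {
    return false;
  }
  return Math.abs(totalUsage.cost - branchUsage.cost) > COST_EPSILON;
}
```

With this shape, a branch that mixes priced and unpriced turns reports `costKnown: false`, so a caller would hide the cost row instead of rendering a partial sum, and two sums over the same turns in a different order would not produce a separate total row.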
946 lines
34 KiB
JavaScript
const { logger } = require('@librechat/data-schemas');
const { createContentAggregator } = require('@librechat/agents');
const {
  loadSkillStates,
  initializeAgent,
  primeInvokedSkills,
  validateAgentModel,
  extractManualSkills,
  GenerationJobManager,
  getCustomEndpointConfig,
  discoverConnectedAgents,
  resolveAgentScopedSkillIds,
  resolveModelSpecSkillIds,
  buildAgentContextAttachmentsByAgentId,
} = require('@librechat/api');
const {
  ResourceType,
  EModelEndpoint,
  PermissionBits,
  MAX_SUBAGENT_DEPTH,
  isAgentsEndpoint,
  getResponseSender,
  AgentCapabilities,
  MAX_SUBAGENT_GRAPH_NODES,
  isEphemeralAgentId,
} = require('librechat-data-provider');
const {
  createToolEndCallback,
  getDefaultHandlers,
} = require('~/server/controllers/agents/callbacks');
const { loadAgentTools, loadToolsForExecution } = require('~/server/services/ToolService');
const { filterFilesByAgentAccess } = require('~/server/services/Files/permissions');
const {
  getSkillToolDeps,
  getSkillDbMethods,
  canAuthorSkillFiles,
  withDeploymentSkillIds,
  buildAgentToolContext,
  enrichLoadedToolsWithAgentContext,
} = require('./skillDeps');
const { getModelsConfig } = require('~/server/controllers/ModelController');
const { checkPermission, findAccessibleResources } = require('~/server/services/PermissionService');
const AgentClient = require('~/server/controllers/agents/client');
const { processAddedConvo } = require('./addedConvo');
const { logViolation } = require('~/cache');
const db = require('~/models');

/**
 * Creates a tool loader function for the agent.
 * @param {AbortSignal} signal - The abort signal
 * @param {string | null} [streamId] - The stream ID for resumable mode
 * @param {boolean} [definitionsOnly=false] - When true, returns only serializable
 * tool definitions without creating full tool instances (for event-driven mode)
 */
function createToolLoader(signal, streamId = null, definitionsOnly = false) {
  /**
   * @param {object} params
   * @param {ServerRequest} params.req
   * @param {ServerResponse} params.res
   * @param {string} params.agentId
   * @param {string[]} params.tools
   * @param {string} params.provider
   * @param {string} params.model
   * @param {AgentToolResources} params.tool_resources
   * @returns {Promise<{
   *   tools?: StructuredTool[],
   *   toolContextMap: Record<string, unknown>,
   *   toolDefinitions?: import('@librechat/agents').LCTool[],
   *   userMCPAuthMap?: Record<string, Record<string, string>>,
   *   toolRegistry?: import('@librechat/agents').LCToolRegistry
   * } | undefined>}
   */
  return async function loadTools({
    req,
    res,
    tools,
    model,
    agentId,
    provider,
    tool_options,
    tool_resources,
  }) {
    const agent = { id: agentId, tools, provider, model, tool_options };
    try {
      return await loadAgentTools({
        req,
        res,
        agent,
        signal,
        streamId,
        tool_resources,
        definitionsOnly,
      });
    } catch (error) {
      logger.error('Error loading tools for agent ' + agentId, error);
    }
  };
}

/**
 * Initializes the AgentClient for a given request/response cycle.
 * @param {Object} params
 * @param {Express.Request} params.req
 * @param {Express.Response} params.res
 * @param {AbortSignal} params.signal
 * @param {Object} params.endpointOption
 */
const initializeClient = async ({ req, res, signal, endpointOption }) => {
  if (!endpointOption) {
    throw new Error('Endpoint option not provided');
  }
  const appConfig = req.config;

  /** @type {string | null} */
  const streamId = req._resumableStreamId || null;

  /** @type {Array<UsageMetadata>} */
  const collectedUsage = [];
  /**
   * Vertex Gemini 3 thought signatures captured from `chat_model_end` events,
   * keyed by `tool_call_id`. Persisted on
   * `responseMessage.metadata.thoughtSignatures` so subsequent conversation
   * turns can restore each signature onto the right reconstructed AIMessage's
   * `additional_kwargs.signatures` and avoid 400s when resuming after a tool
   * round-trip without a final text reply. Always allocated; capture path
   * is a no-op for providers that don't emit signatures (OpenAI, Anthropic,
   * Bedrock, etc.).
   * @type {Record<string, string>}
   */
  const collectedThoughtSignatures = {};
  /** @type {ArtifactPromises} */
  const artifactPromises = [];
  const { contentParts, aggregateContent } = createContentAggregator();
  const toolEndCallback = createToolEndCallback({ req, res, artifactPromises, streamId });

  /** Query accessible skill IDs once per run (shared across all agents).
   *  Skills activate under strict opt-in semantics — see
   *  `resolveAgentScopedSkillIds` for the per-agent activation predicate:
   *  - Ephemeral agent → model-spec `skills` config first, otherwise the
   *    per-conversation skills badge toggle (full catalog).
   *  - Persisted agent → `agent.skills_enabled === true`. Optional
   *    `agent.skills` allowlist narrows the catalog; empty/undefined
   *    allowlist with the toggle on = full accessible catalog. */
  const enabledCapabilities = new Set(appConfig?.endpoints?.[EModelEndpoint.agents]?.capabilities);
  const skillsCapabilityEnabled = enabledCapabilities.has(AgentCapabilities.skills);
  const codeEnvAvailable = enabledCapabilities.has(AgentCapabilities.execute_code);
  const ephemeralSkillsToggle = req.body?.ephemeralAgent?.skills === true;
  const skillDbMethods = getSkillDbMethods();

  const accessibleSkillIds = skillsCapabilityEnabled
    ? withDeploymentSkillIds(
        await findAccessibleResources({
          userId: req.user.id,
          role: req.user.role,
          resourceType: ResourceType.SKILL,
          requiredPermissions: PermissionBits.VIEW,
        }),
      )
    : [];
  const editableSkillIds = skillsCapabilityEnabled
    ? await findAccessibleResources({
        userId: req.user.id,
        role: req.user.role,
        resourceType: ResourceType.SKILL,
        requiredPermissions: PermissionBits.EDIT,
      })
    : [];
  const skillCreateAllowed = skillsCapabilityEnabled
    ? await getSkillToolDeps().canCreateSkill({ req })
    : false;

  const { skillStates, defaultActiveOnShare } = await loadSkillStates({
    userId: req.user.id,
    appConfig,
    getUserById: db.getUserById,
    accessibleSkillIds,
  });

  /**
   * Agent context store - populated after initialization, accessed by callback via closure.
   * Maps agentId -> { userMCPAuthMap, agent, tool_resources, toolRegistry, openAIApiKey }
   * @type {Map<string, {
   *   userMCPAuthMap?: Record<string, Record<string, string>>,
   *   agent?: object,
   *   tool_resources?: object,
   *   toolRegistry?: import('@librechat/agents').LCToolRegistry,
   *   requestScopedConnections?: import('@librechat/api').RequestScopedMCPConnectionStore,
   *   openAIApiKey?: string
   * }>}
   */
  const agentToolContexts = new Map();

  const toolExecuteOptions = {
    loadTools: async (toolNames, agentId) => {
      const ctx = agentToolContexts.get(agentId) ?? {};
      logger.debug(`[ON_TOOL_EXECUTE] ctx found: ${!!ctx.userMCPAuthMap}, agent: ${ctx.agent?.id}`);
      logger.debug(`[ON_TOOL_EXECUTE] toolRegistry size: ${ctx.toolRegistry?.size ?? 'undefined'}`);

      const result = await loadToolsForExecution({
        req,
        res,
        signal,
        streamId,
        toolNames,
        agent: ctx.agent,
        toolRegistry: ctx.toolRegistry,
        mcpAvailableTools: ctx.mcpAvailableTools,
        requestScopedConnections: ctx.requestScopedConnections,
        userMCPAuthMap: ctx.userMCPAuthMap,
        tool_resources: ctx.tool_resources,
        actionsEnabled: ctx.actionsEnabled,
      });

      logger.debug(`[ON_TOOL_EXECUTE] loaded ${result.loadedTools?.length ?? 0} tools`);
      /** Per-agent narrowed flag (admin capability AND agent.tools
       *  includes execute_code), captured in `agentToolContexts` when
       *  the agent initialized. Falls back to `false` on any stray
       *  ctx miss so a skills-only agent never gains sandbox access
       *  even if capability lookup somehow skips. */
      return enrichLoadedToolsWithAgentContext({
        result,
        req,
        ctx,
      });
    },
    toolEndCallback,
    ...getSkillToolDeps(),
  };

  const summarizationOptions =
    appConfig?.summarization?.enabled === false ? { enabled: false } : { enabled: true };

  /**
   * Per-request map of per-subagent `createContentAggregator` instances
   * keyed by the parent's `tool_call_id`. The handler in `callbacks.js`
   * lazily creates an aggregator for each distinct `parentToolCallId`
   * and folds every `ON_SUBAGENT_UPDATE` event into it as they stream
   * in. `AgentClient` pulls each aggregator's `contentParts` at message
   * save time and attaches them to the matching `subagent` tool_call so
   * the child's reasoning / tool calls / final text survive a page
   * refresh — the client-side Recoil atom is best-effort live-only.
   */
  const subagentAggregatorsByToolCallId = new Map();

  /** Backend prices each model call authoritatively (premium tiers, cache
   *  rates) and emits the cost on on_token_usage when contextCost is on, so
   *  the gauge sums real costs instead of re-deriving from base rates.
   *  `endpointTokenConfig` is filled in once `primaryConfig` resolves below so
   *  custom-endpoint agents price with their configured rates, not defaults. */
  const usageCost = {
    enabled: appConfig?.interfaceConfig?.contextCost === true,
    pricing: { getMultiplier: db.getMultiplier, getCacheMultiplier: db.getCacheMultiplier },
  };

  /** Latest visible context snapshot + every emitted usage payload for this
   *  response, captured by the handlers and persisted on the response message's
   *  metadata so the breakdown and branch/total cost survive a reload.
   *  @type {{ latest: import('librechat-data-provider').TContextUsageEvent | null, count: number }} */
  const contextUsageSink = { latest: null, count: 0 };
  /** @type {Array<import('librechat-data-provider').TTokenUsageEvent>} */
  const usageEmitSink = [];

  const eventHandlers = getDefaultHandlers({
    res,
    toolExecuteOptions,
    summarizationOptions,
    aggregateContent,
    toolEndCallback,
    collectedUsage,
    collectedThoughtSignatures,
    streamId,
    subagentAggregatorsByToolCallId,
    usageCost,
    contextUsageSink,
    usageEmitSink,
  });

  if (!endpointOption.agent) {
    throw new Error('No agent promise provided');
  }

  const primaryAgent = await endpointOption.agent;
  delete endpointOption.agent;
  if (!primaryAgent) {
    throw new Error('Agent not found');
  }

  const modelsConfig = await getModelsConfig(req);
  const validationResult = await validateAgentModel({
    req,
    res,
    modelsConfig,
    logViolation,
    agent: primaryAgent,
  });

  if (!validationResult.isValid) {
    throw new Error(validationResult.error?.message);
  }

  const agentConfigs = new Map();
  const allowedProviders = new Set(appConfig?.endpoints?.[EModelEndpoint.agents]?.allowedProviders);

  /** Event-driven mode: only load tool definitions, not full instances */
  const loadTools = createToolLoader(signal, streamId, true);
  /** @type {Array<MongoFile>} */
  const requestFiles = req.body.files ?? [];
  /** @type {string} */
  const conversationId = req.body.conversationId;
  /** @type {string | undefined} */
  const parentMessageId = req.body.parentMessageId;
  /**
   * Skill names the user invoked via the `$` popover for this turn. Only flows
   * to the primary agent — handoff agents are follow-up turns that don't see
   * the user's per-submission `$` selections. `extractManualSkills` also
   * drops non-string / empty elements so a crafted payload can't reach the
   * `getSkillByName` DB query with nonsense values.
   * @type {string[] | undefined}
   */
  const manualSkills = extractManualSkills(req.body);

  const selectedModelSpec =
    endpointOption.spec && Array.isArray(appConfig?.modelSpecs?.list)
      ? appConfig.modelSpecs.list.find((modelSpec) => modelSpec.name === endpointOption.spec)
      : null;

  if (
    primaryAgent &&
    isEphemeralAgentId(primaryAgent.id) &&
    selectedModelSpec &&
    Object.hasOwn(selectedModelSpec, 'skills')
  ) {
    if (selectedModelSpec.skills === true) {
      primaryAgent.skills_enabled = true;
      delete primaryAgent.skills;
    } else if (selectedModelSpec.skills === false) {
      primaryAgent.skills_enabled = false;
      primaryAgent.skills = [];
    } else if (Array.isArray(selectedModelSpec.skills)) {
      const resolvedSkillIds = await resolveModelSpecSkillIds({
        names: selectedModelSpec.skills,
        accessibleSkillIds,
        getSkillByName: db.getSkillByName,
      });
      primaryAgent.skills_enabled = true;
      primaryAgent.skills = resolvedSkillIds.map((id) => id.toString());
    }
  }

  const primaryScopedSkillIds = resolveAgentScopedSkillIds({
    agent: primaryAgent,
    accessibleSkillIds,
    skillsCapabilityEnabled,
    ephemeralSkillsToggle,
  });
  const primaryScopedEditableSkillIds = resolveAgentScopedSkillIds({
    agent: primaryAgent,
    accessibleSkillIds: editableSkillIds,
    skillsCapabilityEnabled,
    ephemeralSkillsToggle,
  });
  const primarySkillAuthoringAvailable = canAuthorSkillFiles({
    agent: primaryAgent,
    scopedEditableSkillIds: primaryScopedEditableSkillIds,
    skillCreateAllowed,
    skillsCapabilityEnabled,
    ephemeralSkillsToggle,
  });

  const primaryConfig = await initializeAgent(
    {
      req,
      res,
      loadTools,
      requestFiles,
      conversationId,
      parentMessageId,
      agent: primaryAgent,
      endpointOption,
      allowedProviders,
      isInitialAgent: true,
      accessibleSkillIds: primaryScopedSkillIds,
      skillAuthoringAvailable: primarySkillAuthoringAvailable,
      codeEnvAvailable,
      skillStates,
      defaultActiveOnShare,
      manualSkills,
    },
    {
      getFiles: db.getFiles,
      getUserKey: db.getUserKey,
      getMessages: db.getMessages,
      getConvoFiles: db.getConvoFiles,
      updateFilesUsage: db.updateFilesUsage,
      getUserKeyValues: db.getUserKeyValues,
      getUserCodeFiles: db.getUserCodeFiles,
      getToolFilesByIds: db.getToolFilesByIds,
      getCodeGeneratedFiles: db.getCodeGeneratedFiles,
      filterFilesByAgentAccess,
      listSkillsByAccess: skillDbMethods.listSkillsByAccess,
      listAlwaysApplySkills: skillDbMethods.listAlwaysApplySkills,
      getSkillByName: skillDbMethods.getSkillByName,
    },
  );

  /** Price emitted usage with the primary agent's resolved endpoint config so
   *  custom-endpoint agents reflect configured rates (mirrors the AgentClient
   *  spending path, which reads the same config). */
  usageCost.endpointTokenConfig = primaryConfig.endpointTokenConfig;

  logger.debug(
    `[initializeClient] Storing tool context for ${primaryConfig.id}: ${primaryConfig.toolDefinitions?.length ?? 0} tools, registry size: ${primaryConfig.toolRegistry?.size ?? '0'}`,
  );
  agentToolContexts.set(
    primaryConfig.id,
    buildAgentToolContext({ agent: primaryAgent, config: primaryConfig }),
  );

  const {
    agentConfigs: discoveredConfigs,
    edges: discoveredEdges,
    userMCPAuthMap: discoveredMCPAuthMap,
    skippedAgentIds: discoveredSkippedIds,
  } = await discoverConnectedAgents(
    {
      req,
      res,
      primaryConfig,
      agent_ids: primaryConfig.agent_ids,
      endpointOption,
      allowedProviders,
      modelsConfig,
      loadTools,
      requestFiles,
      conversationId,
      parentMessageId,
      computeAccessibleSkillIds: (agent) =>
        resolveAgentScopedSkillIds({
          agent,
          accessibleSkillIds,
          skillsCapabilityEnabled,
          ephemeralSkillsToggle,
        }),
      computeSkillAuthoringAvailable: (agent) =>
        canAuthorSkillFiles({
          agent,
          scopedEditableSkillIds: resolveAgentScopedSkillIds({
            agent,
            accessibleSkillIds: editableSkillIds,
            skillsCapabilityEnabled,
            ephemeralSkillsToggle,
          }),
          skillCreateAllowed,
          skillsCapabilityEnabled,
          ephemeralSkillsToggle,
        }),
      skillStates,
      defaultActiveOnShare,
      codeEnvAvailable,
    },
    {
      getAgent: db.getAgent,
      checkPermission,
      logViolation,
      db: {
        getFiles: db.getFiles,
        getUserKey: db.getUserKey,
        getMessages: db.getMessages,
        getConvoFiles: db.getConvoFiles,
        updateFilesUsage: db.updateFilesUsage,
        getUserKeyValues: db.getUserKeyValues,
        getUserCodeFiles: db.getUserCodeFiles,
        getToolFilesByIds: db.getToolFilesByIds,
        getCodeGeneratedFiles: db.getCodeGeneratedFiles,
        filterFilesByAgentAccess,
        listSkillsByAccess: skillDbMethods.listSkillsByAccess,
        listAlwaysApplySkills: skillDbMethods.listAlwaysApplySkills,
        getSkillByName: skillDbMethods.getSkillByName,
      },
      // The callback fires during BFS, before the helper prunes agents
      // whose edges end up filtered. Don't populate `agentConfigs` here —
      // `discoveredConfigs` (returned below) is the authoritative pruned
      // set. The per-agent tool context map is OK to keep populated even
      // for pruned ids: it's only read by closure in ON_TOOL_EXECUTE,
      // stale entries are unreachable at runtime.
      onAgentInitialized: (agentId, agent, config) => {
        agentToolContexts.set(agentId, buildAgentToolContext({ agent, config }));
      },
      // Pass through the `@librechat/api` exports so that tests which
      // `jest.mock('@librechat/api')` can override the initializer/validator.
      initializeAgent,
      validateAgentModel,
    },
  );

  // Copy the pruned discovery result into the outer map. Anything the
  // helper dropped (skipped or unreachable after edge filtering) is
  // intentionally absent. `processAddedConvo` below may still add more
  // entries for parallel multi-convo execution.
  for (const [agentId, config] of discoveredConfigs) {
    agentConfigs.set(agentId, config);
  }

  let userMCPAuthMap = discoveredMCPAuthMap;
  let edges = discoveredEdges;

  /** Multi-Convo: Process addedConvo for parallel agent execution */
  const { userMCPAuthMap: updatedMCPAuthMap } = await processAddedConvo({
    req,
    res,
    loadTools,
    logViolation,
    modelsConfig,
    requestFiles,
    agentConfigs,
    primaryAgent,
    endpointOption,
    userMCPAuthMap,
    conversationId,
    parentMessageId,
    allowedProviders,
    primaryAgentId: primaryConfig.id,
    accessibleSkillIds,
    editableSkillIds,
    skillsCapabilityEnabled,
    ephemeralSkillsToggle,
    skillCreateAllowed,
    skillStates,
    defaultActiveOnShare,
    codeEnvAvailable,
  });

  if (updatedMCPAuthMap) {
    userMCPAuthMap = updatedMCPAuthMap;
  }

  for (const [agentId, config] of agentConfigs) {
    if (agentToolContexts.has(agentId)) {
      continue;
    }
    agentToolContexts.set(agentId, buildAgentToolContext({ agent: config, config }));
  }

  // `discoverConnectedAgents` always returns a concrete array, so no
  // further normalization is needed before handing this to `createRun`.
  primaryConfig.edges = edges;

  // Subagents: load any explicit subagent configs. Subagents run in isolated
  // context windows and are invoked via a dedicated spawn tool (not handoff
  // edges). An agent that is ONLY referenced as a subagent is dropped from
  // `agentConfigs` so the LangGraph pipeline doesn't treat it as a
  // parallel/handoff node, but it is KEPT in `agentToolContexts` — the child's
  // `ON_TOOL_EXECUTE` dispatches resolve tool execution context (agent,
  // tool_resources, skill ACLs, ...) from that map, so removing it would leave
  // action tools skipped and resource-scoped tools running without their
  // configured resources.
  const subagentsCapabilityEnabled = enabledCapabilities.has(AgentCapabilities.subagents);
  /** Track skipped ids locally so repeated failures short-circuit within
   *  the subagent loading loop. Seeded from the discovery helper's skip
   *  list so agents that already failed handoff loading don't get retried. */
  const skippedAgentIds = new Set(discoveredSkippedIds ?? []);

  /** All agent ids referenced on any edge (source OR target). Used by
   *  `loadSubagentsFor` to decide whether an agent that's only a subagent
   *  can be safely dropped from `agentConfigs` — LangGraph doesn't treat
   *  pure subagents as parallel/handoff nodes. */
  const edgeAgentIds = new Set([primaryConfig.id]);
  for (const edge of edges ?? []) {
    const sources = Array.isArray(edge.from) ? edge.from : [edge.from];
    const targets = Array.isArray(edge.to) ? edge.to : [edge.to];
    for (const id of sources) {
      if (typeof id === 'string') edgeAgentIds.add(id);
    }
    for (const id of targets) {
      if (typeof id === 'string') edgeAgentIds.add(id);
    }
  }

  /** Lazy per-id agent loader used for subagents that weren't reachable
   *  via the handoff edge graph (so `discoverConnectedAgents` didn't
   *  initialize them). Mirrors the helper's internal `processAgent`:
   *  DB lookup + VIEW check + `initializeAgent`, then inserts into
   *  `agentConfigs` and `agentToolContexts`. Returns `null` on any
   *  failure so the caller can skip gracefully. */
  const loadAgentById = async (agentId) => {
    if (skippedAgentIds.has(agentId)) return null;
    const existing = agentConfigs.get(agentId);
    if (existing) return existing;

    try {
      const agent = await db.getAgent({ id: agentId });
      if (!agent) {
        skippedAgentIds.add(agentId);
        return null;
      }
      const userId = req.user?.id;
      if (!userId) {
        skippedAgentIds.add(agentId);
        return null;
      }
      const hasAccess = await checkPermission({
        userId,
        role: req.user?.role,
        resourceType: ResourceType.AGENT,
        resourceId: agent._id,
        requiredPermission: PermissionBits.VIEW,
      });
      if (!hasAccess) {
        logger.warn(
          `[processAgent] User ${userId} lacks VIEW access to subagent ${agentId}, skipping`,
        );
        skippedAgentIds.add(agentId);
        return null;
      }
      const validation = await validateAgentModel({
        req,
        res,
        agent,
        modelsConfig,
        logViolation,
      });
      if (!validation.isValid) {
        logger.warn(
          `[processAgent] Subagent ${agentId} failed model validation: ${validation.error?.message}`,
        );
        skippedAgentIds.add(agentId);
        return null;
      }
      const scopedSkillIds = resolveAgentScopedSkillIds({
        agent,
        accessibleSkillIds,
        skillsCapabilityEnabled,
        ephemeralSkillsToggle,
      });
      const scopedEditableSkillIds = resolveAgentScopedSkillIds({
        agent,
        accessibleSkillIds: editableSkillIds,
        skillsCapabilityEnabled,
        ephemeralSkillsToggle,
      });
      const config = await initializeAgent(
        {
          req,
          res,
          agent,
          loadTools,
          requestFiles,
          conversationId,
          parentMessageId,
          endpointOption: { ...endpointOption, endpoint: EModelEndpoint.agents },
          allowedProviders,
          accessibleSkillIds: scopedSkillIds,
          skillAuthoringAvailable: canAuthorSkillFiles({
            agent,
            scopedEditableSkillIds,
            skillCreateAllowed,
            skillsCapabilityEnabled,
            ephemeralSkillsToggle,
          }),
          /** Match the primary / handoff / addedConvo paths: forward the
           * endpoint-level admin flag so `initializeAgent` can compute the
           * per-agent narrowing (admin AND agent.tools includes
           * execute_code) into `InitializedAgent.codeEnvAvailable`. Without
           * this, a code-enabled subagent loaded only through
           * `subagentAgentConfigs` initializes with `codeEnvAvailable:
           * false`, so `bash_tool` / `read_file` sandbox fallback are
           * silently gated off even though the seed walk found it. */
          codeEnvAvailable,
          skillStates,
          defaultActiveOnShare,
        },
        {
          getFiles: db.getFiles,
          getUserKey: db.getUserKey,
          getMessages: db.getMessages,
          getConvoFiles: db.getConvoFiles,
          updateFilesUsage: db.updateFilesUsage,
          getUserKeyValues: db.getUserKeyValues,
          getUserCodeFiles: db.getUserCodeFiles,
          getToolFilesByIds: db.getToolFilesByIds,
          getCodeGeneratedFiles: db.getCodeGeneratedFiles,
          filterFilesByAgentAccess,
          listSkillsByAccess: skillDbMethods.listSkillsByAccess,
          listAlwaysApplySkills: skillDbMethods.listAlwaysApplySkills,
          getSkillByName: skillDbMethods.getSkillByName,
        },
      );
      agentConfigs.set(agentId, config);
      agentToolContexts.set(agentId, buildAgentToolContext({ agent, config }));
      return config;
    } catch (err) {
      logger.error(`[processAgent] Error processing subagent ${agentId}:`, err);
      skippedAgentIds.add(agentId);
      return null;
    }
  };

  /** Collected during resolution; applied to `agentConfigs` only after
   * every config has had its subagents resolved. Eager pruning would
   * hide pure-subagent ids from the subsequent `loadSubagentsFor`
   * loop, which would leave *their* `subagentAgentConfigs` empty and
   * silently break nested delegation like A → B → C where B is only
   * a subagent of A. */
  const pureSubagentIds = new Set();
  const subagentGraphIds = new Set();
  const loadedSubagentConfigIds = new Set();

  const assertSubagentGraphRoom = (agentId) => {
    if (subagentGraphIds.has(agentId)) {
      return;
    }
    if (subagentGraphIds.size >= MAX_SUBAGENT_GRAPH_NODES) {
      logger.warn('[initializeClient] Subagent graph node limit exceeded', {
        agentId,
        primaryAgentId: primaryConfig.id,
        loadedSubagentCount: subagentGraphIds.size,
        maxSubagentGraphNodes: MAX_SUBAGENT_GRAPH_NODES,
      });
      throw new Error(
        `Subagent graph exceeds the maximum of ${MAX_SUBAGENT_GRAPH_NODES} unique agents.`,
      );
    }
  };

  /**
   * Loads `subagentAgentConfigs` for a single agent config. Shared
   * between the primary agent and handoff-target agents (and pure
   * subagents, transitively) so an agent used via handoff or
   * nested-subagent that has its own explicit `subagents.agent_ids`
   * gets them honored at runtime. Self-spawn works regardless (no DB
   * lookup needed). Pruning decisions are deferred to `pureSubagentIds`.
   */
  const loadSubagentsFor = async (config, depth = 0) => {
    const sub = config.subagents;
    if (!subagentsCapabilityEnabled || !sub?.enabled) {
      config.subagentAgentConfigs = [];
      return;
    }

    if (loadedSubagentConfigIds.has(config.id)) {
      if ((config.subagentAgentConfigs?.length ?? 0) > 0 && depth >= MAX_SUBAGENT_DEPTH) {
        logger.warn('[initializeClient] Subagent graph depth limit exceeded', {
          agentId: config.id,
          primaryAgentId: primaryConfig.id,
          depth,
          maxSubagentDepth: MAX_SUBAGENT_DEPTH,
          childCount: config.subagentAgentConfigs.length,
        });
        throw new Error(
          `Subagent graph exceeds the maximum depth of ${MAX_SUBAGENT_DEPTH} at agent ${config.id}.`,
        );
      }
      return;
    }

    /** Dedupe and filter in one pass — a crafted payload could
     * legitimately include the same ID twice; the backend shouldn't
     * create duplicate SubagentConfig entries for the LLM to see as
     * separate spawn targets. */
    const explicitSubagentIds = Array.from(
      new Set(
        Array.isArray(sub.agent_ids)
          ? sub.agent_ids.filter((id) => typeof id === 'string' && id && id !== config.id)
          : [],
      ),
    );

    if (explicitSubagentIds.length > 0 && depth >= MAX_SUBAGENT_DEPTH) {
      logger.warn('[initializeClient] Subagent graph depth limit exceeded', {
        agentId: config.id,
        primaryAgentId: primaryConfig.id,
        depth,
        maxSubagentDepth: MAX_SUBAGENT_DEPTH,
        childCount: explicitSubagentIds.length,
      });
      throw new Error(
        `Subagent graph exceeds the maximum depth of ${MAX_SUBAGENT_DEPTH} at agent ${config.id}.`,
      );
    }

    loadedSubagentConfigIds.add(config.id);

    /** @type {Array<Object>} */
    const resolved = [];
    for (const subagentId of explicitSubagentIds) {
      if (skippedAgentIds.has(subagentId)) continue;

      /** Cycle guard: a configuration like A ↔ B (B lists A as its
       * subagent) would otherwise trigger `loadAgentById` on the
       * primary — inserting a second config for the same primary id,
       * which downstream duplicates in the agent array. Reuse the
       * existing primary config when a subagent ref points back at it. */
      if (subagentId === primaryConfig.id) {
        resolved.push(primaryConfig);
        continue;
      }

      assertSubagentGraphRoom(subagentId);
      const subagentConfig = await loadAgentById(subagentId);
      if (!subagentConfig) continue;

      subagentGraphIds.add(subagentConfig.id ?? subagentId);
      resolved.push(subagentConfig);

      if (!edgeAgentIds.has(subagentId)) {
        pureSubagentIds.add(subagentId);
      }
    }

    config.subagentAgentConfigs = resolved;
  };

  const maxResolvedDepthByConfigId = new Map();

  /** BFS across subagent trees so nested chains like A → B → C get
   * resolved before any pruning. Agent configs are loaded once, but
   * overlapping roots can still be revisited at deeper path depths so
   * the depth guard observes the deepest reachable subagent path. */
  const resolveSubagentTrees = async (rootConfigs) => {
    const pending = rootConfigs.map((cfg) => ({ cfg, depth: 0 }));
    for (let index = 0; index < pending.length; index++) {
      const { cfg, depth } = pending[index];
      if (!cfg?.id) continue;
      const previousDepth = maxResolvedDepthByConfigId.get(cfg.id);
      if (previousDepth != null && previousDepth >= depth) continue;
      maxResolvedDepthByConfigId.set(cfg.id, depth);
      await loadSubagentsFor(cfg, depth);
      for (const child of cfg.subagentAgentConfigs ?? []) {
        const childDepth = depth + 1;
        const previousChildDepth = child?.id ? maxResolvedDepthByConfigId.get(child.id) : undefined;
        if (child?.id && (previousChildDepth == null || previousChildDepth < childDepth)) {
          pending.push({ cfg: child, depth: childDepth });
        }
      }
    }
  };

  await resolveSubagentTrees([primaryConfig, ...agentConfigs.values()]);

  /** Drop pure-subagent entries now that every reachable config has
   * had its subagents resolved. They stay in `agentToolContexts` so
   * their tools still execute with the right scoping. */
  for (const id of pureSubagentIds) {
    agentConfigs.delete(id);
  }

  primaryConfig.subagents = subagentsCapabilityEnabled ? primaryConfig.subagents : undefined;

  /** If the capability is off at the endpoint level, strip `subagents` on
   * every loaded config — not just the primary. `run.ts` calls
   * `buildSubagentConfigs` for every agent in the array, so a handoff
   * agent with `subagents.enabled: true` persisted on its document would
   * otherwise still expose self-spawn at runtime even though the admin
   * has disabled the capability globally. */
  if (!subagentsCapabilityEnabled) {
    for (const config of agentConfigs.values()) {
      config.subagents = undefined;
      config.subagentAgentConfigs = undefined;
    }
  }

  const agentContextAttachmentsByAgentId = buildAgentContextAttachmentsByAgentId([
    primaryConfig,
    ...agentConfigs.values(),
  ]);

  let endpointConfig = appConfig.endpoints?.[primaryConfig.endpoint];
  if (!isAgentsEndpoint(primaryConfig.endpoint) && !endpointConfig) {
    try {
      endpointConfig = getCustomEndpointConfig({
        endpoint: primaryConfig.endpoint,
        appConfig,
      });
    } catch (err) {
      logger.error(
        '[api/server/controllers/agents/client.js #titleConvo] Error getting custom endpoint config',
        err,
      );
    }
  }

  const sender =
    primaryAgent.name ??
    getResponseSender({
      ...endpointOption,
      model: endpointOption.model_parameters.model,
      modelDisplayLabel: endpointConfig?.modelDisplayLabel,
      modelLabel: endpointOption.model_parameters.modelLabel,
    });

  /** History priming uses the user's full ACL-accessible skill set (not
   * per-agent scoped) because prior turns may reference skills no longer
   * in any active agent's scope; the ACL check is the security gate.
   * `codeEnvAvailable` comes from `primaryConfig` — @see
   * `InitializedAgent.codeEnvAvailable` for the per-agent narrowing. */
  const handlePrimeInvokedSkills = skillsCapabilityEnabled
    ? (payload) =>
        primeInvokedSkills({
          req,
          payload,
          accessibleSkillIds,
          codeEnvAvailable: primaryConfig.codeEnvAvailable === true,
          ...getSkillToolDeps(),
        })
    : undefined;

  const client = new AgentClient({
    req,
    res,
    sender,
    contentParts,
    agentConfigs,
    eventHandlers,
    collectedUsage,
    collectedThoughtSignatures,
    aggregateContent,
    artifactPromises,
    primeInvokedSkills: handlePrimeInvokedSkills,
    agent: primaryConfig,
    spec: endpointOption.spec,
    iconURL: endpointOption.iconURL,
    chatProjectId: endpointOption.chatProjectId,
    attachments: primaryConfig.requestAttachments ?? primaryConfig.attachments,
    agentContextAttachmentsByAgentId,
    endpointType: endpointOption.endpointType,
    resendFiles: primaryConfig.resendFiles ?? true,
    maxContextTokens: primaryConfig.maxContextTokens,
    endpoint: isEphemeralAgentId(primaryConfig.id) ? primaryConfig.endpoint : EModelEndpoint.agents,
    subagentAggregatorsByToolCallId,
    /** Resolved endpoint token/pricing config so spending and cost reflect
     * configured rates for custom-endpoint agents instead of defaults. */
    endpointTokenConfig: primaryConfig.endpointTokenConfig,
    /** Capture sinks the handlers fill during the run; `sendCompletion` reads
     * them to persist the breakdown + usage rollup on the response message. */
    contextUsageSink,
    usageEmitSink,
  });

  if (streamId) {
    GenerationJobManager.setCollectedUsage(streamId, collectedUsage);
  }

  return { client, userMCPAuthMap };
};

module.exports = { initializeClient };