LibreChat/api/app/clients/BaseClient.js
Danny Avila 5eb1c2c107
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
🖇️ feat: Reference Selected Chat Text with Multi-Quote Popup (#13868)
* 🖇️ feat: Reference Selected Chat Text with Multi-Quote Popup

Add a ChatGPT/Codex-style quote feature: selecting text in any message shows
an 'Add to chat' popup that accumulates removable quote chips above the
composer. On submit, the excerpts are merged into the user message text as
Markdown blockquotes (counted in the user message token count, not a system
message) and persisted on the message so they render on the user bubble and
survive reload.

- packages/api: add getReferencedQuotes + mergeQuotedText helpers (blockquote merge, length/count caps) with unit tests
- BaseClient.sendMessage: temporarily merge req.body.quotes into userMessage.text before buildMessages, restore clean text, persist quotes array
- data-schemas + data-provider: add optional quotes field to message schema/type
- client: pendingQuotesByConvoId atom, QuoteButton selection popup, PendingQuoteChips composer row, MessageQuotes persistent display
- useChatFunctions: drain pending quotes onto the message, carry forward on regenerate
- add localization keys and component/integration tests

* 🧪 test: Add Playwright e2e for chat quote feature

Add e2e/specs/mock/quotes.spec.ts covering select -> 'Add to chat' popup ->
chip -> send -> persistent reference block -> reload, plus multi-select
accumulation and chip removal. Selection is driven programmatically (real DOM
Range + dispatched mouseup) to summon the popup deterministically.

Add data-testid hooks (add-to-chat-button, pending-quote-chips, message-quotes)
to the quote components for stable selectors.

* 🛡️ fix: Address Codex review on quote feature

- Run PII filter + OpenAI moderation over req.body.quotes (P1): quoted excerpts
  are merged into the model-facing user message, so they must clear the same
  filters; a crafted quotes payload could otherwise bypass them. Adds tests.
- Carry quotes through edit/save-and-submit replays (overrideQuotes in
  EditMessage), mirroring overrideManualSkills, so edited turns keep context.
- Hide the quote UI for Assistants endpoints (which bypass BaseClient merge),
  so users can't queue quotes the assistant never receives.
- Clear pending quote/skill queues by resolved conversationId in useClearStates,
  not the UI index, so queued-but-unsent selections don't linger in Recoil.
- Cap queued quotes client-side at 10 to match the backend QUOTE_MAX_COUNT, so
  the composer never shows more quotes than are actually sent.

* 🧵 fix: Durably re-merge quotes + Codex round 2

Address Codex's re-review of the quote feature:

- Durable history re-merge (per maintainer decision): quotes are no longer
  merged at request time and stripped; instead each user message's persisted
  message.quotes is merged into its formatted content in AgentClient.buildMessages
  (new prependQuotes helper) for current AND historical turns. The model
  receives the referenced context on every prompt and the token count stays
  consistent with what was persisted; stored text stays clean for display.
- Attach normalized quotes to the user message in handleStartMethods (before
  getReqData/onStart) so the optimistic bubble, resumable abort metadata, and
  saved row all carry them (fixes the abort-metadata gap).
- Skip the quote drain entirely for Assistants endpoints in useChatFunctions,
  leaving the pending atom intact (UI is already hidden there).
- Normalize req.body.quotes via getReferencedQuotes before moderation/PII so
  only the trimmed/truncated/capped excerpts the model will receive are checked.
- Tests: prependQuotes unit tests; BaseClient quote tests assert early
  attachment + clean text; e2e now verifies the model receives the merged
  blockquote on the current turn and re-merged from history on a later turn
  (new E2E_ASSERT_QUOTE mock marker).

* 🔗 fix: Quote share/memo/abort/PII gaps (Codex round 3)

- Shared links: include quotes in the anonymized projection + SharedMessage
  type (+test) so the /share view renders the same reference blocks as the
  owner, mirroring manualSkills/alwaysAppliedSkills.
- MessageRender memo: compare quotes length so a server/resume copy whose only
  change is the quote list re-renders (the block no longer goes stale/missing).
- Resumable job metadata: include quotes in the userMessage written to
  GenerationJobManager so a reload/reconnect mid-stream reconstructs the chips.
- PII + moderation: also scan the merged blockquote+text exactly as the model
  receives it, so a secret split across a quote and the typed body (each clean
  alone) is caught (+cross-boundary test).
- e2e: make quote-add robust against the auto-scroll-dismisses-selection race
  via a retried select+click helper.

* 🛑 fix: Keep quotes on aborted turn's request message (Codex round 4)

abortMiddleware reconstructs finalEvent.requestMessage from jobData.userMessage
but only copied ids + text; include quotes so a stopped quoted turn keeps its
MessageQuotes in the UI and a regenerate-before-reload still sends the
referenced context. Completes the resumable-metadata fix from the prior round.

* 🧮 fix: Quote recount + preliminary abort metadata (Codex round 5)

- Force a canonical token recount for messages carrying quotes in
  AgentClient.buildMessages, so a plain text-only Save edit (which recomputes
  tokenCount from text alone) can't leave a stale, quote-excluding count that
  undercounts context on later turns — recount from the quote-merged copy
  self-heals it.
- Seed normalized quotes into the preliminary userMessage metadata
  (getPreliminaryUserMessage), so an abort during init/tool-loading (before
  onStart) still reconstructs the stopped turn's MessageQuotes.

*  fix: Add getReferencedQuotes to controller test mocks (CI)

request.js's getPreliminaryUserMessage now calls getReferencedQuotes; the
agents controller specs mock @librechat/api wholesale, so the mock must export
it or the call throws and cascades. Added a faithful mock (normalize/cap,
null when empty) to request.resumeMetadata.spec.js and jobReplacement.spec.js.

* 📐 fix: Quotes in context projection + resumable metadata (Codex round 6)

- Context-usage projection (resolveContextProjection): select message.quotes,
  prepend them into the projected user text, and recount quoted messages so the
  context gauge counts the same prompt the model receives (a text-only Save edit
  no longer makes the gauge undercount / over-report remaining budget).
- Resumable job metadata: trackUserMessage (created-event rewrite) and abortJob
  (final requestMessage) now carry quotes; SerializableJobData.userMessage and
  CreatedEvent.message gained an optional quotes field. With the cross-replica
  created-event spread, stopping/reconnecting a quoted turn after the created
  event keeps its MessageQuotes.

* 💬 feat: Collapse multi-select quotes into one chip with hover popup

Composer feedback: the quote chip area now shows a single chip — the excerpt
text for one selection, or a collapsed "{n} selections" pill for multiple,
with a hover popup (HoverCard) listing every excerpt and a per-item remove. The
chip is taller (py-1.5/text-sm) to read less skinny. Adds com_ui_quote_selections
and com_ui_remove_all_quotes; updates unit + e2e tests (e2e drives the count via
a data-quote-count hook and exercises the hover popup).

*  fix: Make multi-selection quote popup keyboard accessible

The collapsed "{n} selections" pill used a HoverCard, which Radix only opens on
pointer hover — its interactive content was unreachable by keyboard. Replaced it
with a Popover: the trigger is a real button that opens on click / Enter / Space
(focus moves into the list, each excerpt's × is tab-navigable, Escape closes and
restores focus), with hover-open preserved for mouse via controlled open state +
a close grace period. Hover-initiated opens skip auto-focus so they don't pull
focus off the composer. Adds an e2e asserting keyboard open/close.

* 📐 fix: Clamp the Add-to-chat button within the viewport (Codex round 7)

The floating selection button positioned via translate(-50%,-100%) (bottom-center
anchor) but clamped top/left as if they were its top-left, so a selection near
the viewport top or sides could render the button partly/fully offscreen. Now it
measures the button (ref + useLayoutEffect) and computes an on-screen top-left —
clamping by the full width within side margins and flipping below the selection
when there's no room above — with no transform, and stays hidden until measured
so it never flashes at an unclamped spot.

* ↩️ fix: Restore pending quotes on early-abort draft (Codex round 8)

When a turn is stopped before the created event (e.g. during tool/MCP init), the
final handler restores requestMessage.text to the draft, but the pending-quote
atom was already drained on submit — so a retry sent no quotes. The abort
requestMessage now carries quotes (preliminary metadata + abort fixes), so the
three early-abort/no-response draft-restore paths in useEventHandlers now also
re-queue pendingQuotesByConvoId from requestMessage.quotes.

*  fix: Use Ariakit Popover for quote selections (keyboard focus)

The multi-selection popup used a hand-rolled Radix Popover with Popover.Anchor +
a manual button, so Radix had no trigger to return focus to — Escape dumped
focus to the page top. Refactored to Ariakit (the codebase's popover primitive,
per DropdownPopup/Fork): the `PopoverDisclosure` is the real trigger, so Escape
closes and returns focus to the composer instead of the top of the page. Keyboard
opens (Enter/Space) autofocus into the list and tab through each excerpt's remove;
hover opens for mouse with autofocus suppressed so it never pulls focus off the
composer. e2e asserts the keyboard open/navigate/Escape flow keeps focus on a
real control (never BODY).
2026-06-21 08:33:11 -04:00

1392 lines
46 KiB
JavaScript

const crypto = require('crypto');
const fetch = require('node-fetch');
const { logger } = require('@librechat/data-schemas');
const {
countTokens,
checkBalance,
getBalanceConfig,
buildMessageFiles,
extractFileContext,
getReferencedQuotes,
encodeAndFormatAudios,
encodeAndFormatVideos,
encodeAndFormatDocuments,
} = require('@librechat/api');
const {
Constants,
FileSources,
ContentTypes,
excludedKeys,
EModelEndpoint,
mergeFileConfig,
isParamEndpoint,
isAgentsEndpoint,
isEphemeralAgentId,
supportsBalanceCheck,
isBedrockDocumentType,
getEndpointFileConfig,
} = require('librechat-data-provider');
const { getStrategyFunctions } = require('~/server/services/Files/strategies');
const { logViolation } = require('~/cache');
const TextStream = require('./TextStream');
const db = require('~/models');
class BaseClient {
constructor(apiKey, options = {}) {
this.apiKey = apiKey;
this.sender = options.sender ?? 'AI';
this.currentDateString = new Date().toLocaleDateString('en-us', {
year: 'numeric',
month: 'long',
day: 'numeric',
});
/** @type {boolean} */
this.skipSaveConvo = false;
/** @type {boolean} */
this.skipSaveUserMessage = false;
/** @type {string} */
this.user;
/** @type {string} */
this.conversationId;
/** @type {string} */
this.responseMessageId;
/** @type {string} */
this.parentMessageId;
/** @type {TAttachment[]} */
this.attachments;
/** The key for the usage object's input tokens
* @type {string} */
this.inputTokensKey = 'prompt_tokens';
/** The key for the usage object's output tokens
* @type {string} */
this.outputTokensKey = 'completion_tokens';
/** @type {Set<string>} */
this.savedMessageIds = new Set();
/**
* Flag to determine if the client re-submitted the latest assistant message.
* @type {boolean | undefined} */
this.continued;
/**
* Flag to determine if the client has already fetched the conversation while saving new messages.
* @type {boolean | undefined} */
this.fetchedConvo;
/** @type {TMessage[]} */
this.currentMessages = [];
/** @type {import('librechat-data-provider').VisionModes | undefined} */
this.visionMode;
/** @type {import('librechat-data-provider').FileConfig | undefined} */
this._mergedFileConfig;
/** @type {import('librechat-data-provider').EndpointFileConfig | undefined} */
this._endpointFileConfig;
}
setOptions() {
throw new Error("Method 'setOptions' must be implemented.");
}
async getCompletion() {
throw new Error("Method 'getCompletion' must be implemented.");
}
/** @type {sendCompletion} */
async sendCompletion() {
throw new Error("Method 'sendCompletion' must be implemented.");
}
getSaveOptions() {
throw new Error('Subclasses must implement getSaveOptions');
}
async buildMessages() {
throw new Error('Subclasses must implement buildMessages');
}
async summarizeMessages() {
throw new Error('Subclasses attempted to call summarizeMessages without implementing it');
}
/**
* @returns {string}
*/
getResponseModel() {
if (isAgentsEndpoint(this.options.endpoint) && this.options.agent && this.options.agent.id) {
return this.options.agent.id;
}
return this.modelOptions?.model ?? this.model;
}
/**
* Abstract method to get the token count for a message. Subclasses must implement this method.
* @param {TMessage} responseMessage
* @returns {number}
*/
getTokenCountForResponse(responseMessage) {
logger.debug('[BaseClient] `recordTokenUsage` not implemented.', {
messageId: responseMessage?.messageId,
});
}
/**
* Abstract method to record token usage. Subclasses must implement this method.
* If a correction to the token usage is needed, the method should return an object with the corrected token counts.
* Should only be used if `recordCollectedUsage` was not used instead.
* @param {string} [model]
* @param {AppConfig['balance']} [balance]
* @param {number} promptTokens
* @param {number} completionTokens
* @param {string} [messageId]
* @returns {Promise<void>}
*/
async recordTokenUsage({ model, balance, promptTokens, completionTokens, messageId }) {
logger.debug('[BaseClient] `recordTokenUsage` not implemented.', {
model,
balance,
messageId,
promptTokens,
completionTokens,
});
}
/**
* Makes an HTTP request and logs the process.
*
* @param {RequestInfo} url - The URL to make the request to. Can be a string or a Request object.
* @param {RequestInit} [init] - Optional init options for the request.
* @returns {Promise<Response>} - A promise that resolves to the response of the fetch request.
*/
async fetch(_url, init) {
let url = _url;
if (this.options.directEndpoint) {
url = this.options.reverseProxyUrl;
}
logger.debug(`Making request to ${url}`);
if (typeof Bun !== 'undefined') {
return await fetch(url, init);
}
return await fetch(url, init);
}
getBuildMessagesOptions() {
throw new Error('Subclasses must implement getBuildMessagesOptions');
}
async generateTextStream(text, onProgress, options = {}) {
const stream = new TextStream(text, options);
await stream.processTextStream(onProgress);
}
/**
* @returns {[string|undefined, string|undefined]}
*/
processOverideIds() {
/** @type {Record<string, string | undefined>} */
let { overrideConvoId, overrideUserMessageId } = this.options?.req?.body ?? {};
if (overrideConvoId) {
const [conversationId, index] = overrideConvoId.split(Constants.COMMON_DIVIDER);
overrideConvoId = conversationId;
if (index !== '0') {
this.skipSaveConvo = true;
}
}
if (overrideUserMessageId) {
const [userMessageId, index] = overrideUserMessageId.split(Constants.COMMON_DIVIDER);
overrideUserMessageId = userMessageId;
if (index !== '0') {
this.skipSaveUserMessage = true;
}
}
return [overrideConvoId, overrideUserMessageId];
}
async setMessageOptions(opts = {}) {
if (opts && opts.replaceOptions) {
this.setOptions(opts);
}
const [overrideConvoId, overrideUserMessageId] = this.processOverideIds();
const { isEdited, isContinued } = opts;
const user = opts.user ?? null;
this.user = user;
const saveOptions = this.getSaveOptions();
this.abortController = opts.abortController ?? new AbortController();
const requestConvoId = overrideConvoId ?? opts.conversationId;
const conversationId = requestConvoId ?? crypto.randomUUID();
const parentMessageId = opts.parentMessageId ?? Constants.NO_PARENT;
const userMessageId =
overrideUserMessageId ?? opts.overrideParentMessageId ?? crypto.randomUUID();
let responseMessageId = opts.responseMessageId ?? crypto.randomUUID();
let head = isEdited ? responseMessageId : parentMessageId;
this.currentMessages = (await this.loadHistory(conversationId, head)) ?? [];
this.conversationId = conversationId;
if (isEdited && !isContinued) {
responseMessageId = crypto.randomUUID();
head = responseMessageId;
this.currentMessages[this.currentMessages.length - 1].messageId = head;
}
if (opts.isRegenerate && responseMessageId.endsWith('_')) {
responseMessageId = crypto.randomUUID();
}
this.responseMessageId = responseMessageId;
return {
...opts,
user,
head,
saveOptions,
userMessageId,
requestConvoId,
conversationId,
parentMessageId,
responseMessageId,
};
}
createUserMessage({ messageId, parentMessageId, conversationId, text }) {
return {
messageId,
parentMessageId,
conversationId,
sender: 'User',
text,
isCreatedByUser: true,
};
}
async handleStartMethods(message, opts) {
const {
user,
head,
saveOptions,
userMessageId,
requestConvoId,
conversationId,
parentMessageId,
responseMessageId,
} = await this.setMessageOptions(opts);
const userMessage = opts.isEdited
? this.currentMessages[this.currentMessages.length - 2]
: this.createUserMessage({
messageId: userMessageId,
parentMessageId,
conversationId,
text: message,
});
/**
* Attach quoted excerpts (the "Add to chat" selections from `req.body.quotes`)
* before `getReqData`/`onStart` fire, so the optimistic bubble, resumable job
* metadata, and the saved row all carry them. Only on fresh turns — edits
* replay an existing message that already has its quotes. The excerpts are
* merged into the model-facing text later, per message, in `buildMessages`,
* keeping the stored `text` clean while the count stays consistent.
*/
if (!opts.isEdited) {
const referencedQuotes = getReferencedQuotes(this.options.req?.body?.quotes);
if (referencedQuotes != null) {
userMessage.quotes = referencedQuotes;
}
}
if (typeof opts?.getReqData === 'function') {
opts.getReqData({
userMessage,
conversationId,
responseMessageId,
sender: this.sender,
});
}
if (typeof opts?.onStart === 'function') {
const isNewConvo = !requestConvoId && parentMessageId === Constants.NO_PARENT;
opts.onStart(userMessage, responseMessageId, isNewConvo);
}
return {
...opts,
user,
head,
conversationId,
responseMessageId,
saveOptions,
userMessage,
};
}
/**
* Adds instructions to the messages array. If the instructions object is empty or undefined,
* the original messages array is returned. Otherwise, the instructions are added to the messages
* array either at the beginning (default) or preserving the last message at the end.
*
* @param {Array} messages - An array of messages.
* @param {Object} instructions - An object containing instructions to be added to the messages.
* @param {boolean} [beforeLast=false] - If true, adds instructions before the last message; if false, adds at the beginning.
* @returns {Array} An array containing messages and instructions, or the original messages if instructions are empty.
*/
addInstructions(messages, instructions, beforeLast = false) {
if (!instructions || Object.keys(instructions).length === 0) {
return messages;
}
if (!beforeLast) {
return [instructions, ...messages];
}
// Legacy behavior: add instructions before the last message
const payload = [];
if (messages.length > 1) {
payload.push(...messages.slice(0, -1));
}
payload.push(instructions);
if (messages.length > 0) {
payload.push(messages[messages.length - 1]);
}
return payload;
}
concatenateMessages(messages) {
return messages.reduce((acc, message) => {
const nameOrRole = message.name ?? message.role;
return acc + `${nameOrRole}:\n${message.content}\n\n`;
}, '');
}
/**
* This method processes an array of messages and returns a context of messages that fit within a specified token limit.
* It iterates over the messages from newest to oldest, adding them to the context until the token limit is reached.
* If the token limit would be exceeded by adding a message, that message is not added to the context and remains in the original array.
* The method uses `push` and `pop` operations for efficient array manipulation, and reverses the context array at the end to maintain the original order of the messages.
*
* @param {Object} params
* @param {TMessage[]} params.messages - An array of messages, each with a `tokenCount` property. The messages should be ordered from oldest to newest.
* @param {number} [params.maxContextTokens] - The max number of tokens allowed in the context. If not provided, defaults to `this.maxContextTokens`.
* @param {{ role: 'system', content: text, tokenCount: number }} [params.instructions] - Instructions already added to the context at index 0.
* @returns {Promise<{
* context: TMessage[],
* remainingContextTokens: number,
* messagesToRefine: TMessage[],
* }>} An object with three properties: `context`, `remainingContextTokens`, and `messagesToRefine`.
* `context` is an array of messages that fit within the token limit.
* `remainingContextTokens` is the number of tokens remaining within the limit after adding the messages to the context.
* `messagesToRefine` is an array of messages that were not added to the context because they would have exceeded the token limit.
*/
async getMessagesWithinTokenLimit({ messages: _messages, maxContextTokens, instructions }) {
// Every reply is primed with <|start|>assistant<|message|>, so we
// start with 3 tokens for the label after all messages have been counted.
let currentTokenCount = 3;
const instructionsTokenCount = instructions?.tokenCount ?? 0;
let remainingContextTokens =
(maxContextTokens ?? this.maxContextTokens) - instructionsTokenCount;
const messages = [..._messages];
const context = [];
if (currentTokenCount < remainingContextTokens) {
while (messages.length > 0 && currentTokenCount < remainingContextTokens) {
if (messages.length === 1 && instructions) {
break;
}
const poppedMessage = messages.pop();
const { tokenCount } = poppedMessage;
if (poppedMessage && currentTokenCount + tokenCount <= remainingContextTokens) {
context.push(poppedMessage);
currentTokenCount += tokenCount;
} else {
messages.push(poppedMessage);
break;
}
}
}
if (instructions) {
context.push(_messages[0]);
messages.shift();
}
const prunedMemory = messages;
remainingContextTokens -= currentTokenCount;
return {
context: context.reverse(),
remainingContextTokens,
messagesToRefine: prunedMemory,
};
}
async sendMessage(message, opts = {}) {
const appConfig = this.options.req?.config;
/** @type {Promise<TMessage>} */
let userMessagePromise;
const { user, head, isEdited, conversationId, responseMessageId, saveOptions, userMessage } =
await this.handleStartMethods(message, opts);
if (opts.progressCallback) {
opts.onProgress = opts.progressCallback.call(null, {
...(opts.progressOptions ?? {}),
parentMessageId: userMessage.messageId,
messageId: responseMessageId,
});
}
const { editedContent } = opts;
// It's not necessary to push to currentMessages
// depending on subclass implementation of handling messages
// When this is an edit, all messages are already in currentMessages, both user and response
if (isEdited) {
let latestMessage = this.currentMessages[this.currentMessages.length - 1];
if (!latestMessage) {
latestMessage = {
messageId: responseMessageId,
conversationId,
parentMessageId: userMessage.messageId,
isCreatedByUser: false,
model: this.modelOptions?.model ?? this.model,
sender: this.sender,
};
this.currentMessages.push(userMessage, latestMessage);
} else if (editedContent != null) {
// Handle editedContent for content parts
if (editedContent && latestMessage.content && Array.isArray(latestMessage.content)) {
const { index, text, type } = editedContent;
if (index >= 0 && index < latestMessage.content.length) {
const contentPart = latestMessage.content[index];
if (type === ContentTypes.THINK && contentPart.type === ContentTypes.THINK) {
contentPart[ContentTypes.THINK] = text;
} else if (type === ContentTypes.TEXT && contentPart.type === ContentTypes.TEXT) {
contentPart[ContentTypes.TEXT] = text;
}
}
}
}
this.continued = true;
} else {
this.currentMessages.push(userMessage);
}
/**
* When the userMessage is pushed to currentMessages, the parentMessage is the userMessageId.
* this only matters when buildMessages is utilizing the parentMessageId, and may vary on implementation
*/
const parentMessageId = isEdited ? head : userMessage.messageId;
this.parentMessageId = parentMessageId;
let {
prompt: payload,
tokenCountMap,
promptTokens,
} = await this.buildMessages(
this.currentMessages,
parentMessageId,
this.getBuildMessagesOptions(opts),
opts,
);
if (tokenCountMap && tokenCountMap[userMessage.messageId]) {
userMessage.tokenCount = tokenCountMap[userMessage.messageId];
logger.debug('[BaseClient] userMessage', {
messageId: userMessage.messageId,
tokenCount: userMessage.tokenCount,
conversationId: userMessage.conversationId,
});
}
if (!isEdited && !this.skipSaveUserMessage) {
const reqFiles = this.options.req?.body?.files;
if (reqFiles && Array.isArray(this.options.attachments)) {
const files = buildMessageFiles(reqFiles, this.options.attachments);
if (files.length > 0) {
userMessage.files = files;
}
delete userMessage.image_urls;
}
/**
* Persist the user's manual skill picks onto the user message so the
* frontend `SkillPills` component can render them in history
* after reload. UI-only metadata — the runtime skill resolution
* pipeline reads the top-level `req.body.manualSkills` separately.
* Filter is defense-in-depth on top of Mongoose schema validation:
* keeps the DB row free of empty/non-string entries even if a
* crafted payload slips past schema checks upstream.
*/
const rawManualSkills = this.options.req?.body?.manualSkills;
if (Array.isArray(rawManualSkills) && rawManualSkills.length > 0) {
const skills = rawManualSkills.filter((s) => typeof s === 'string' && s.length > 0);
if (skills.length > 0) {
userMessage.manualSkills = skills;
}
}
/**
* Persist the names of skills auto-primed this turn via `always-apply`
* frontmatter so `SkillPills` can render pinned-variant badges
* on the user bubble that survive reload and history render. Frozen
* at turn time (not reconstructed from `Skill.alwaysApply` at render
* time) because the flag is mutable — historical turns must keep
* their audit trail even if an admin flips `alwaysApply` off later.
*/
const alwaysApplySkillPrimes = this.options.agent?.alwaysApplySkillPrimes;
if (Array.isArray(alwaysApplySkillPrimes) && alwaysApplySkillPrimes.length > 0) {
const names = alwaysApplySkillPrimes
.map((p) => p?.name)
.filter((n) => typeof n === 'string' && n.length > 0);
if (names.length > 0) {
userMessage.alwaysAppliedSkills = names;
}
}
userMessagePromise = this.saveMessageToDatabase(userMessage, saveOptions, user).catch(
(err) => {
logger.error('[BaseClient] Failed to save user message:', err);
return {};
},
);
this.savedMessageIds.add(userMessage.messageId);
if (typeof opts?.getReqData === 'function') {
opts.getReqData({
userMessagePromise,
});
}
}
const balanceConfig = getBalanceConfig(appConfig);
if (
balanceConfig?.enabled &&
supportsBalanceCheck[this.options.endpointType ?? this.options.endpoint]
) {
await checkBalance(
{
req: this.options.req,
res: this.options.res,
txData: {
user: this.user,
tokenType: 'prompt',
amount: promptTokens,
endpoint: this.options.endpoint,
model: this.modelOptions?.model ?? this.model,
endpointTokenConfig: this.options.endpointTokenConfig,
},
},
{
logViolation,
getMultiplier: db.getMultiplier,
findBalanceByUser: db.findBalanceByUser,
createAutoRefillTransaction: db.createAutoRefillTransaction,
balanceConfig,
upsertBalanceFields: db.upsertBalanceFields,
},
);
}
const { completion, metadata } = await this.sendCompletion(payload, opts);
if (this.abortController) {
this.abortController.requestCompleted = true;
}
/** @type {TMessage} */
const responseMessage = {
messageId: responseMessageId,
conversationId,
parentMessageId: userMessage.messageId,
isCreatedByUser: false,
isEdited,
model: this.getResponseModel(),
sender: this.sender,
promptTokens,
iconURL: this.options.iconURL,
endpoint: this.options.endpoint,
...(this.metadata ?? {}),
metadata: Object.keys(metadata ?? {}).length > 0 ? metadata : undefined,
};
if (typeof completion === 'string') {
responseMessage.text = completion;
} else if (
Array.isArray(completion) &&
(this.clientName === EModelEndpoint.agents ||
isParamEndpoint(this.options.endpoint, this.options.endpointType))
) {
responseMessage.text = '';
if (!opts.editedContent || this.currentMessages.length === 0) {
responseMessage.content = completion;
} else {
const latestMessage = this.currentMessages[this.currentMessages.length - 1];
if (!latestMessage?.content) {
responseMessage.content = completion;
} else {
const existingContent = [...latestMessage.content];
const { type: editedType } = opts.editedContent;
responseMessage.content = this.mergeEditedContent(
existingContent,
completion,
editedType,
);
}
}
} else if (Array.isArray(completion)) {
responseMessage.text = completion.join('');
}
if (tokenCountMap && this.recordTokenUsage && this.getTokenCountForResponse) {
let completionTokens;
/**
* Metadata about input/output costs for the current message. The client
* should provide a function to get the current stream usage metadata; if not,
* use the legacy token estimations.
* @type {StreamUsage | null} */
const usage = this.getStreamUsage != null ? this.getStreamUsage() : null;
if (usage != null && Number(usage[this.outputTokensKey]) > 0) {
responseMessage.tokenCount = usage[this.outputTokensKey];
completionTokens = responseMessage.tokenCount;
} else {
responseMessage.tokenCount = this.getTokenCountForResponse(responseMessage);
completionTokens = responseMessage.tokenCount;
await this.recordTokenUsage({
usage,
promptTokens,
completionTokens,
balance: balanceConfig,
/** Note: When using agents, responseMessage.model is the agent ID, not the model */
model: this.model,
messageId: this.responseMessageId,
});
}
logger.debug('[BaseClient] Response token usage', {
messageId: responseMessage.messageId,
model: responseMessage.model,
promptTokens,
completionTokens,
});
}
if (userMessagePromise) {
await userMessagePromise;
}
if (
this.contextMeta?.calibrationRatio > 0 &&
this.contextMeta.calibrationRatio !== 1 &&
userMessage.tokenCount > 0
) {
const calibrated = Math.round(userMessage.tokenCount * this.contextMeta.calibrationRatio);
if (calibrated !== userMessage.tokenCount) {
logger.debug('[BaseClient] Calibrated user message tokenCount', {
messageId: userMessage.messageId,
raw: userMessage.tokenCount,
calibrated,
ratio: this.contextMeta.calibrationRatio,
});
userMessage.tokenCount = calibrated;
await this.updateMessageInDatabase({
messageId: userMessage.messageId,
tokenCount: calibrated,
});
}
}
if (this.artifactPromises) {
responseMessage.attachments = (await Promise.all(this.artifactPromises)).filter((a) => a);
}
if (this.options.attachments) {
try {
saveOptions.files = this.options.attachments.map((attachments) => attachments.file_id);
} catch (error) {
logger.error('[BaseClient] Error mapping attachments for conversation', error);
}
}
if (this.contextMeta) {
responseMessage.contextMeta = this.contextMeta;
}
responseMessage.databasePromise = this.saveMessageToDatabase(
responseMessage,
saveOptions,
user,
);
this.savedMessageIds.add(responseMessage.messageId);
return responseMessage;
}
async loadHistory(conversationId, parentMessageId = null) {
logger.debug('[BaseClient] Loading history:', { conversationId, parentMessageId });
const messages = (await db.getMessages({ conversationId, user: this.user })) ?? [];
if (messages.length === 0) {
return [];
}
let mapMethod = null;
if (this.getMessageMapMethod) {
mapMethod = this.getMessageMapMethod();
}
let _messages = this.constructor.getMessagesForConversation({
messages,
parentMessageId,
mapMethod,
});
_messages = await this.addPreviousAttachments(_messages);
if (!this.shouldSummarize) {
return _messages;
}
for (let i = _messages.length - 1; i >= 0; i--) {
const msg = _messages[i];
if (!msg) {
continue;
}
const summaryBlock = BaseClient.findSummaryContentBlock(msg);
if (summaryBlock) {
this.previous_summary = {
...msg,
summary: BaseClient.getSummaryText(summaryBlock),
summaryTokenCount: summaryBlock.tokenCount,
};
break;
}
if (msg.summary) {
this.previous_summary = msg;
break;
}
}
if (this.previous_summary) {
const { messageId, summary, tokenCount, summaryTokenCount } = this.previous_summary;
logger.debug('[BaseClient] Previous summary:', {
messageId,
summary,
tokenCount,
summaryTokenCount,
});
}
return _messages;
}
/**
* Save a message to the database.
* @param {TMessage} message
* @param {Partial<TConversation>} endpointOptions
* @param {string | null} user
*/
async saveMessageToDatabase(message, endpointOptions, user = null) {
// Snapshot options before any await; disposeClient may set client.options = null
// while this method is suspended at an I/O boundary, but the local reference
// remains valid (disposeClient nulls the property, not the object itself).
const options = this.options;
if (!options) {
logger.error('[BaseClient] saveMessageToDatabase: client disposed before save, skipping');
return {};
}
if (this.user && user !== this.user) {
throw new Error('User mismatch.');
}
const hasAddedConvo = options?.req?.body?.addedConvo != null;
const reqCtx = {
userId: options?.req?.user?.id,
isTemporary: options?.req?.body?.isTemporary,
interfaceConfig: options?.req?.config?.interfaceConfig,
};
const savedMessage = await db.saveMessage(
reqCtx,
{
...message,
endpoint: options.endpoint,
unfinished: false,
user,
...(hasAddedConvo && { addedConvo: true }),
},
{ context: 'api/app/clients/BaseClient.js - saveMessageToDatabase #saveMessage' },
);
if (this.skipSaveConvo) {
return { message: savedMessage };
}
const fieldsToKeep = {
conversationId: message.conversationId,
endpoint: options.endpoint,
endpointType: options.endpointType,
...endpointOptions,
};
const conversationCreatedAt = options?.req?.conversationCreatedAt;
const createdAtOnInsert =
conversationCreatedAt != null ? new Date(conversationCreatedAt) : undefined;
const validCreatedAtOnInsert =
createdAtOnInsert && !Number.isNaN(createdAtOnInsert.getTime())
? createdAtOnInsert
: undefined;
const req = options?.req;
const skippedExistingConvoLookup = this.fetchedConvo === true;
const hasResolvedConversation =
req != null && Object.prototype.hasOwnProperty.call(req, 'resolvedConversation');
let existingConvo = null;
if (!skippedExistingConvoLookup && hasResolvedConversation) {
existingConvo = req.resolvedConversation;
} else if (!skippedExistingConvoLookup) {
existingConvo = await db.getConvo(req?.user?.id, message.conversationId);
}
if (hasResolvedConversation) {
delete req.resolvedConversation;
}
const shouldSetCreatedAtOnInsert = !skippedExistingConvoLookup && existingConvo == null;
const unsetFields = {};
const exceptions = new Set(['spec', 'iconURL']);
const hasNonEphemeralAgent =
isAgentsEndpoint(options.endpoint) &&
endpointOptions?.agent_id &&
!isEphemeralAgentId(endpointOptions.agent_id);
if (hasNonEphemeralAgent) {
exceptions.add('model');
}
if (existingConvo != null) {
this.fetchedConvo = true;
for (const key in existingConvo) {
if (!key) {
continue;
}
if (excludedKeys.has(key) && !exceptions.has(key)) {
continue;
}
if (endpointOptions?.[key] === undefined) {
unsetFields[key] = 1;
}
}
}
const conversation = await db.saveConvo(reqCtx, fieldsToKeep, {
context: 'api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo',
unsetFields,
createdAtOnInsert: shouldSetCreatedAtOnInsert ? validCreatedAtOnInsert : undefined,
});
return { message: savedMessage, conversation };
}
/**
* Update a message in the database.
* @param {Partial<TMessage>} message
*/
async updateMessageInDatabase(message) {
await db.updateMessage(this.options?.req?.user?.id, message);
}
/** Extracts text from a summary block (handles both legacy `text` field and new `content` array format). */
static getSummaryText(summaryBlock) {
if (Array.isArray(summaryBlock.content)) {
return summaryBlock.content.map((b) => b.text ?? '').join('');
}
if (typeof summaryBlock.content === 'string') {
return summaryBlock.content;
}
return summaryBlock.text ?? '';
}
/** Finds the last summary content block in a message's content array (last-summary-wins). */
static findSummaryContentBlock(message) {
if (!Array.isArray(message?.content)) {
return null;
}
let lastSummary = null;
for (const part of message.content) {
if (
part?.type === ContentTypes.SUMMARY &&
BaseClient.getSummaryText(part).trim().length > 0
) {
lastSummary = part;
}
}
return lastSummary;
}
/**
* Iterate through messages, building an array based on the parentMessageId.
*
* This function constructs a conversation thread by traversing messages from a given parentMessageId up to the root message.
* It handles cyclic references by ensuring that a message is not processed more than once.
* If the 'summary' option is set to true and a message has a 'summary' property:
* - The message's 'role' is set to 'system'.
* - The message's 'text' is set to its 'summary'.
* - If the message has a 'summaryTokenCount', the message's 'tokenCount' is set to 'summaryTokenCount'.
* The traversal stops at the message with the 'summary' property.
*
* Each message object should have an 'id' or 'messageId' property and may have a 'parentMessageId' property.
* The 'parentMessageId' is the ID of the message that the current message is a reply to.
* If 'parentMessageId' is not present, null, or is Constants.NO_PARENT,
* the message is considered a root message.
*
* @param {Object} options - The options for the function.
* @param {TMessage[]} options.messages - An array of message objects. Each object should have either an 'id' or 'messageId' property, and may have a 'parentMessageId' property.
* @param {string} options.parentMessageId - The ID of the parent message to start the traversal from.
* @param {Function} [options.mapMethod] - An optional function to map over the ordered messages. Applied conditionally based on mapCondition.
* @param {(message: TMessage) => boolean} [options.mapCondition] - An optional function to determine whether mapMethod should be applied to a given message. If not provided and mapMethod is set, mapMethod applies to all messages.
* @param {boolean} [options.summary=false] - If set to true, the traversal modifies messages with 'summary' and 'summaryTokenCount' properties and stops at the message with a 'summary' property.
* @returns {TMessage[]} An array containing the messages in the order they should be displayed, starting with the most recent message with a 'summary' property if the 'summary' option is true, and ending with the message identified by 'parentMessageId'.
*/
static getMessagesForConversation({
messages,
parentMessageId,
mapMethod = null,
mapCondition = null,
summary = false,
}) {
if (!messages || messages.length === 0) {
return [];
}
const orderedMessages = [];
let currentMessageId = parentMessageId;
const visitedMessageIds = new Set();
while (currentMessageId) {
if (visitedMessageIds.has(currentMessageId)) {
break;
}
const message = messages.find((msg) => {
const messageId = msg.messageId ?? msg.id;
return messageId === currentMessageId;
});
visitedMessageIds.add(currentMessageId);
if (!message) {
break;
}
let resolved = message;
let hasSummary = false;
if (summary) {
const summaryBlock = BaseClient.findSummaryContentBlock(message);
if (summaryBlock) {
const summaryText = BaseClient.getSummaryText(summaryBlock);
resolved = {
...message,
role: 'system',
content: [{ type: ContentTypes.TEXT, text: summaryText }],
tokenCount: summaryBlock.tokenCount,
};
hasSummary = true;
} else if (message.summary) {
resolved = {
...message,
role: 'system',
content: [{ type: ContentTypes.TEXT, text: message.summary }],
tokenCount: message.summaryTokenCount ?? message.tokenCount,
};
hasSummary = true;
}
}
const shouldMap = mapMethod != null && (mapCondition != null ? mapCondition(resolved) : true);
const processedMessage = shouldMap ? mapMethod(resolved) : resolved;
orderedMessages.push(processedMessage);
if (hasSummary) {
break;
}
currentMessageId =
message.parentMessageId === Constants.NO_PARENT ? null : message.parentMessageId;
}
orderedMessages.reverse();
return orderedMessages;
}
/**
* Algorithm adapted from "6. Counting tokens for chat API calls" of
* https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
*
* An additional 3 tokens need to be added for assistant label priming after all messages have been counted.
* In our implementation, this is accounted for in the getMessagesWithinTokenLimit method.
*
* The content parts example was adapted from the following example:
* https://github.com/openai/openai-cookbook/pull/881/files
*
* Note: image token calculation is to be done elsewhere where we have access to the image metadata
*
* @param {Object} message
*/
getTokenCountForMessage(message) {
// Note: gpt-3.5-turbo and gpt-4 may update over time. Use default for these as well as for unknown models
let tokensPerMessage = 3;
let tokensPerName = 1;
const model = this.modelOptions?.model ?? this.model;
if (model === 'gpt-3.5-turbo-0301') {
tokensPerMessage = 4;
tokensPerName = -1;
}
const processValue = (value) => {
if (Array.isArray(value)) {
for (let item of value) {
if (
!item ||
!item.type ||
item.type === ContentTypes.THINK ||
item.type === ContentTypes.ERROR ||
item.type === ContentTypes.IMAGE_URL
) {
continue;
}
if (item.type === ContentTypes.TOOL_CALL && item.tool_call != null) {
const toolName = item.tool_call?.name || '';
if (toolName != null && toolName && typeof toolName === 'string') {
numTokens += this.getTokenCount(toolName);
}
const args = item.tool_call?.args || '';
if (args != null && args && typeof args === 'string') {
numTokens += this.getTokenCount(args);
}
const output = item.tool_call?.output || '';
if (output != null && output && typeof output === 'string') {
numTokens += this.getTokenCount(output);
}
continue;
}
const nestedValue = item[item.type];
if (!nestedValue) {
continue;
}
processValue(nestedValue);
}
} else if (typeof value === 'string') {
numTokens += this.getTokenCount(value);
} else if (typeof value === 'number') {
numTokens += this.getTokenCount(value.toString());
} else if (typeof value === 'boolean') {
numTokens += this.getTokenCount(value.toString());
}
};
let numTokens = tokensPerMessage;
for (let [key, value] of Object.entries(message)) {
processValue(value);
if (key === 'name') {
numTokens += tokensPerName;
}
}
return numTokens;
}
/**
* Merges completion content with existing content when editing TEXT or THINK types
* @param {Array} existingContent - The existing content array
* @param {Array} newCompletion - The new completion content
* @param {string} editedType - The type of content being edited
* @returns {Array} The merged content array
*/
mergeEditedContent(existingContent, newCompletion, editedType) {
if (!newCompletion.length) {
return existingContent.concat(newCompletion);
}
if (editedType !== ContentTypes.TEXT && editedType !== ContentTypes.THINK) {
return existingContent.concat(newCompletion);
}
const lastIndex = existingContent.length - 1;
const lastExisting = existingContent[lastIndex];
const firstNew = newCompletion[0];
if (lastExisting?.type !== firstNew?.type || firstNew?.type !== editedType) {
return existingContent.concat(newCompletion);
}
const mergedContent = [...existingContent];
if (editedType === ContentTypes.TEXT) {
mergedContent[lastIndex] = {
...mergedContent[lastIndex],
[ContentTypes.TEXT]:
(mergedContent[lastIndex][ContentTypes.TEXT] || '') + (firstNew[ContentTypes.TEXT] || ''),
};
} else {
mergedContent[lastIndex] = {
...mergedContent[lastIndex],
[ContentTypes.THINK]:
(mergedContent[lastIndex][ContentTypes.THINK] || '') +
(firstNew[ContentTypes.THINK] || ''),
};
}
// Add remaining completion items
return mergedContent.concat(newCompletion.slice(1));
}
async sendPayload(payload, opts = {}) {
if (opts && typeof opts === 'object') {
this.setOptions(opts);
}
return await this.sendCompletion(payload, opts);
}
async addDocuments(message, attachments) {
const documentResult = await encodeAndFormatDocuments(
this.options.req,
attachments,
{
provider: this.options.agent?.provider ?? this.options.endpoint,
endpoint: this.options.agent?.endpoint ?? this.options.endpoint,
useResponsesApi: this.options.agent?.model_parameters?.useResponsesApi,
model: this.modelOptions?.model ?? this.model,
},
getStrategyFunctions,
);
message.documents =
documentResult.documents && documentResult.documents.length
? documentResult.documents
: undefined;
return documentResult.files;
}
async addVideos(message, attachments) {
const videoResult = await encodeAndFormatVideos(
this.options.req,
attachments,
{
provider: this.options.agent?.provider ?? this.options.endpoint,
endpoint: this.options.agent?.endpoint ?? this.options.endpoint,
},
getStrategyFunctions,
);
message.videos =
videoResult.videos && videoResult.videos.length ? videoResult.videos : undefined;
return videoResult.files;
}
async addAudios(message, attachments) {
const audioResult = await encodeAndFormatAudios(
this.options.req,
attachments,
{
provider: this.options.agent?.provider ?? this.options.endpoint,
endpoint: this.options.agent?.endpoint ?? this.options.endpoint,
},
getStrategyFunctions,
);
message.audios =
audioResult.audios && audioResult.audios.length ? audioResult.audios : undefined;
return audioResult.files;
}
/**
* Extracts text context from attachments and sets it on the message.
* This handles text that was already extracted from files (OCR, transcriptions, document text, etc.)
* @param {TMessage} message - The message to add context to
* @param {MongoFile[]} attachments - Array of file attachments
* @returns {Promise<void>}
*/
async addFileContextToMessage(message, attachments) {
const fileContext = await extractFileContext({
attachments,
req: this.options?.req,
tokenCountFn: (text) => countTokens(text),
});
if (fileContext) {
message.fileContext = fileContext;
}
}
async processAttachments(message, attachments) {
const categorizedAttachments = {
images: [],
videos: [],
audios: [],
documents: [],
};
const allFiles = [];
const provider = this.options.agent?.provider ?? this.options.endpoint;
const isBedrock = provider === EModelEndpoint.bedrock;
if (!this._mergedFileConfig) {
this._mergedFileConfig = mergeFileConfig(this.options.req?.config?.fileConfig);
const endpoint = this.options.agent?.endpoint ?? this.options.endpoint;
this._endpointFileConfig = getEndpointFileConfig({
fileConfig: this._mergedFileConfig,
endpoint,
endpointType: this.options.endpointType,
});
}
for (const file of attachments) {
/** @type {FileSources} */
const source = file.source ?? FileSources.local;
if (source === FileSources.text) {
allFiles.push(file);
continue;
}
if (
file.embedded === true ||
file.metadata?.codeEnvRef != null ||
file.metadata?.fileIdentifier != null
) {
allFiles.push(file);
continue;
}
if (file.type.startsWith('image/')) {
categorizedAttachments.images.push(file);
} else if (file.type === 'application/pdf') {
categorizedAttachments.documents.push(file);
allFiles.push(file);
} else if (isBedrock && isBedrockDocumentType(file.type)) {
categorizedAttachments.documents.push(file);
allFiles.push(file);
} else if (file.type.startsWith('video/')) {
categorizedAttachments.videos.push(file);
allFiles.push(file);
} else if (file.type.startsWith('audio/')) {
categorizedAttachments.audios.push(file);
allFiles.push(file);
} else if (
file.type &&
this._mergedFileConfig &&
this._endpointFileConfig?.supportedMimeTypes &&
this._mergedFileConfig.checkType(file.type, this._endpointFileConfig.supportedMimeTypes)
) {
categorizedAttachments.documents.push(file);
allFiles.push(file);
}
}
const [imageFiles] = await Promise.all([
categorizedAttachments.images.length > 0
? this.addImageURLs(message, categorizedAttachments.images)
: Promise.resolve([]),
categorizedAttachments.documents.length > 0
? this.addDocuments(message, categorizedAttachments.documents)
: Promise.resolve([]),
categorizedAttachments.videos.length > 0
? this.addVideos(message, categorizedAttachments.videos)
: Promise.resolve([]),
categorizedAttachments.audios.length > 0
? this.addAudios(message, categorizedAttachments.audios)
: Promise.resolve([]),
]);
allFiles.push(...imageFiles);
const seenFileIds = new Set();
const uniqueFiles = [];
for (const file of allFiles) {
if (file.file_id && !seenFileIds.has(file.file_id)) {
seenFileIds.add(file.file_id);
uniqueFiles.push(file);
} else if (!file.file_id) {
uniqueFiles.push(file);
}
}
return uniqueFiles;
}
/**
* @param {TMessage[]} _messages
* @returns {Promise<TMessage[]>}
*/
async addPreviousAttachments(_messages) {
if (!this.options.resendFiles) {
return _messages;
}
const seen = new Set();
const attachmentsProcessed =
this.options.attachments && !(this.options.attachments instanceof Promise);
if (attachmentsProcessed) {
for (const attachment of this.options.attachments) {
seen.add(attachment.file_id);
}
}
/**
*
* @param {TMessage} message
*/
const processMessage = async (message) => {
if (!this.message_file_map) {
/** @type {Record<string, MongoFile[]> */
this.message_file_map = {};
}
const fileIds = [];
for (const file of message.files) {
if (seen.has(file.file_id)) {
continue;
}
fileIds.push(file.file_id);
seen.add(file.file_id);
}
if (fileIds.length === 0) {
return message;
}
const files = await db.getFiles(
{
file_id: { $in: fileIds },
},
{},
{},
);
await this.addFileContextToMessage(message, files);
await this.processAttachments(message, files);
this.message_file_map[message.messageId] = files;
return message;
};
const promises = [];
for (const message of _messages) {
if (!message.files) {
promises.push(message);
continue;
}
promises.push(processMessage(message));
}
const messages = await Promise.all(promises);
this.checkVisionRequest(Object.values(this.message_file_map ?? {}).flat());
return messages;
}
}
module.exports = BaseClient;