LibreChat/api/server/controllers/agents/client.test.js
Danny Avila 6dbf9d5ad3
🪝 feat: Human-in-the-Loop Runtime - Tool Approval + Ask-User-Question (Slice B) (#13942)
* chore: add @langchain/langgraph-checkpoint-mongodb for HITL durable resume

* feat: HITL tool approval runtime — backend (Slice B)

- endpoints.agents.checkpointer config + durable Mongo checkpointer (seam over the app
  connection; SDK MemorySaver fallback) with a TTL index + deleteThread pruning
- HITL run wiring (PreToolUse policy hook + humanInTheLoop) attached in createRun, fully
  inert when toolApproval.enabled is off
- interrupt gate (pause job -> requires_action + emit on_pending_action) and a resume
  route that rebuilds the run from the durable checkpoint and run.resume()s it
- atomic single-winner resolve; agent-consistency guard; expireStaleApprovals terminal
  event; checkpoint pruned on every non-paused completion (thread_id == conversationId)

* feat: HITL tool approval UI — frontend (Slice B)

approve/reject/edit/respond + ask-user controls in the tool card (OAuth-button precedent),
batch-aware single submit, live + reconnect (resumeState.pendingAction) wiring, and resume
mutations posting to /agents/chat/resume.

* fix(hitl): decouple ApprovalProvider from chat context

ApprovalProvider is now pure state (safe to mount in provider-less / shared / test
renders); the context-dependent submit moved to a useResumeSubmit hook the cards call.
Part imports getAskUserQuestionPart from ~/utils/approval directly so suites that
partial-mock ~/utils render Part without throwing.

* fix(hitl): address Codex review — backend

- P1: enforce per-tool allowed_decisions on resume (reject a crafted decision the
  policy disallows) via findDisallowedDecisions
- prune the durable checkpoint on user-abort of a paused run, and before a fresh
  HITL turn, so a new turn cannot rehydrate an expired/aborted interrupt (thread_id
  is the stable conversationId)
- persist + use isTemporary and the original parentMessageId on resume (temporary
  chats stay temporary; initializeAgent scopes thread files off the right parent)
- generate a deferred first-turn title BEFORE completeJob so its event reaches the
  client and the final event carries the real title
- moderateText: skip when there is no text (tool-approval resume) and moderate the
  ask-user answer, instead of denying on an empty input

* fix(hitl): address Codex review — frontend

- render ToolApproval for ANY paused agent tool card (bash/code/file/etc.), not just
  the generic ToolCall, by wrapping the tool-card branch in Part (moved the rendering
  out of ToolCall)
- findPendingActionMessageIndex only matches an assistant message, never the user
  message (the underscore-strip could target the user bubble before the assistant
  placeholder exists)

* fix(hitl): address Codex re-review

- title eligibility checks the user message’s parent (first turn), not the response’s
  parent — the previous check could never be true and skipped title generation
- use client.buildResponseMetadata() for the resumed message so contextUsage /
  thoughtSignatures survive (the abort-only helper dropped them)
- moderate decisions[].responseText (the respond action’s user text)
- give /chat/abort req.config (configMiddleware) so the HITL checkpoint prune on abort
  actually runs
- read resume state BEFORE setContentParts so the in-memory store does not lose the
  pre-pause seed content
- count resumes against LIMIT_CONCURRENT_MESSAGES (increment/decrement) so paused-then-
  resumed turns cannot bypass the limit
- require actionId on resume so a body without it cannot resolve the current action

* fix(hitl): address Codex re-review (round 3) — resume fidelity

Bring the lean resume path to parity with sendMessage for things it bypassed:
- carry userMCPAuthMap into the rebuilt run so approved MCP tools keep the user's creds
- seed initialSessions (buildInitialToolSessions) so approved code/file/skill tools have
  the pre-pause uploaded-file context (esp. cross-replica / after restart)
- await client.artifactPromises and persist them as response attachments (else tool
  artifacts created after the pause vanish on reload / for late subscribers)
- merge metadata: cumulative usage (+ summary marker) from the job, contextUsage /
  thoughtSignatures from the client — fixes the round-2 regression that underreported
  post-resume cost

* fix(hitl): address Codex re-review (round 4) — resume hardening

- resume: require an EXACT paused agent_id match (reject omitted/ephemeral
  agent_id, not just a different one) and reject an endpoint mismatch, so a
  request can't rebuild the claimed checkpoint on a different graph
- moderateText: also moderate a tool-approval decision's reject `reason` and
  stringified `editedArguments`, not just `responseText`
- request: re-mark the paused response `unfinished:true` after BaseClient saves
  it as completed, so an expired / never-resumed approval doesn't leave a
  "finished" response in history; the resume path overwrites it on success

* test(hitl): route-level integration test for the resume controller

Adds api/server/controllers/agents/__tests__/resume.spec.js, a supertest
integration test that drives the real ResumeAgentController over the full
pause -> approve -> resume -> finalize lifecycle with the SDK run, durable
checkpointer, Mongo, and concurrency cache mocked. The pure decision/liveness
helpers run for real via requireActual, so the guard ladder is exercised end to
end rather than stubbed.

25 cases covering:
- the authorization / staleness / agent-and-endpoint / actionId guard ladder
- tool_approval validation (undecided tool call, policy-disallowed decision)
- ask_user_question answer requirement
- the concurrency gate (429) and the atomic single-winner claim (409)
- the happy path: ACK, run reconstruction, decision->SDK mapping, finalize
  (save the now-finished response, emit done, complete job, prune checkpoint)
- first-turn title generation before stream completion
- re-pause (no double finalize), abort-during-resume (no double finalize),
  and the resume-failure terminal path (emitError + completeJob + prune)

* test(hitl): strengthen resume coverage + add approval util tests

Acts on a self-audit of the new resume integration test.

resume.spec.js (25 -> 32 cases):
- replace the tautological emitDone assertion (it only checked the hardcoded
  `final: true`) with a structural check of the finalEvent payload —
  responseMessage content/id/unfinished, requestMessage identity, title
- cover the previously-unwalked finalize branches: tool-artifact attachments
  (null-filtered), the aggregatedContent fallback when live content is empty,
  and client response-metadata attachment
- add guard cases: unsupported pending-action type (400) and the
  pre-multi-tenancy null-tenantId pass-through (must not 403)
- add error-path cases: first-turn title generation throwing must still
  finalize, and a completeJob failure during a resume error must force a
  terminal job state via the last-resort updateJob

client/src/utils/approval.spec.ts (new, 15 cases):
- applyPendingAction tool_approval: join by tool_call_id not position,
  skip completed calls, default allowed_decisions to [], referential
  stability when nothing changes
- applyPendingAction ask_user_question: append, idempotent replace on replay,
  non-array content coercion
- getAskUserQuestionPart type guard; findPendingActionMessageIndex
  assistant-only resolution (never resolves to the user bubble)

* fix(hitl): address Codex re-review (round 5)

Five findings verified against the code before fixing:

- resume: require an EXACT endpoint match (like agent_id) — a resume that OMITS
  endpoint must not fall through, since the shared chat middleware treats a
  missing/non-agents endpoint as the ephemeral agent and could rebuild the
  claimed checkpoint on a different graph
- resume: filter malformed content parts before saving the finished response,
  matching the normal AgentClient path (a resumed turn could otherwise persist
  an empty/invalid tool_call part that breaks reload/rendering)
- resume: accumulate tool artifacts across pause segments — persist them on
  re-pause and MERGE (not overwrite) at finalize, so artifacts produced before
  a second approval pause aren't dropped by the next rebuilt client
- approval (client): findPendingActionMessageIndex returns -1 when a provided
  responseMessageId isn't found, so the caller retries instead of attaching the
  prompt/approval to a prior assistant reply; fall back to the last assistant
  only when no responseMessageId is given
- RedisJobStore: make appendChunk extend-only (XADD + EXPIRE-if-shorter via a
  single eval) so the on_pending_action chunk emitted after a pause can't reset
  the chunk-stream TTL back to the running window and evict pre-pause content
  before the approval is resolved

Tests: +endpoint-omitted/unsupported-type/malformed-filter/attachment-merge/
re-pause-persist cases in resume.spec.js (36); ask-retry -1 semantics in
approval.spec.ts (16); extend-only TTL assertion in the RedisJobStore Redis
integration spec.
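
A minimal sketch of the extend-only rule described for appendChunk above, written as a pure function rather than the actual Lua script. The helper name `shouldSetExpiry` and the choice to read the TTL before the append are assumptions for illustration; only the "EXPIRE only if the current TTL is shorter" behaviour comes from the text.

```javascript
// Hypothetical model of the extend-only decision (not the real CHUNK_APPEND_LUA).
// `currentTtl` follows Redis TTL semantics: -2 = key missing, -1 = no expiry set.
function shouldSetExpiry(currentTtl, requestedTtl) {
  if (currentTtl === -2) return true; // the append is about to create the key
  if (currentTtl === -1) return false; // persistent key: a finite TTL would shorten it
  return currentTtl < requestedTtl; // only ever extend, never shorten
}

// A paused job already holds a long approval-window TTL on its chunk stream;
// a late on_pending_action append asking for the short running TTL is a no-op.
const approvalWindow = 24 * 60 * 60;
const runningWindow = 20 * 60;
console.log(shouldSetExpiry(approvalWindow, runningWindow)); // false
```

In Redis itself the TTL read and the conditional EXPIRE would have to happen in one script call, as the text says, so that no other writer can interleave between them.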

* test(hitl): mongodb-memory-server integration test for the checkpointer seam

The checkpointer unit spec covers config/selection with no DB connection; this
exercises the durable Mongo seam against a real (in-memory) MongoDB — the part
correctness actually depends on:

- getAgentCheckpointer builds a real MongoDBSaver when Mongo is connected and
  setup() creates the TTL index (expireAfterSeconds) on the checkpoint collection
- memory type returns undefined (SDK MemorySaver fallback) even when connected
- saver is memoized per resolved config
- deleteAgentCheckpoint prunes a thread's persisted checkpoint (the cross-turn
  isolation guarantee: turn N+1 on the same conversationId can't rehydrate it)
- pruning is thread-scoped — deleting one conversation leaves others intact
- undefined threadId is a no-op

* fix(hitl): address Codex re-review (round 6)

Four findings verified against the code before fixing:

- messageFilterPii: scan the resume payload's user-authored text (ask-user
  `answer`, and a tool-approval decision's `respond` text, `reject` reason, and
  edited tool arguments) — the shared /resume route ran through the PII filter
  but it only inspected req.body.text, so a blocked token rode the resume
  payload back into the model/tool (mirrors the earlier moderateText fix)
- resume: re-prime skill files invoked in the pre-pause segment before rebuilding
  the run, so an approved code/file-backed tool keeps the injected skill-file
  session refs instead of running without them (mirrors the normal path's
  primeInvokedSkills; the pre-pause content stands in for the message payload)
- hitl: pin the graph identity. Persist a fingerprint of the graph-determining
  request fields (endpoint, agent_id, model, spec, ephemeralAgent — normalized)
  on the pending action at pause, and reject a resume whose recomputed
  fingerprint differs. This closes the ephemeral-agent gap, where agent_id is
  undefined so the id guard can't tell two ephemeral configs apart
- resume: reject incomplete edit/respond decisions (findIncompleteDecisions) —
  an `edit` without an object editedArguments or a `respond` without non-empty
  responseText is 400'd before mapping, rather than defaulting to {} / '' and
  resuming with behavior the user never approved

Tests: incomplete-decision + fingerprint match/mismatch cases in resume.spec.js
(41); findIncompleteDecisions + computeAgentRequestFingerprint unit tests; and
resume-field PII cases in messageFilterPii.spec.ts.
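
The graph-identity pin above can be sketched roughly as follows. This is not computeAgentRequestFingerprint itself: the function names, the exact normalization, and returning the canonical string instead of a hash of it are all assumptions. Only the field list (endpoint, agent_id, model, spec, ephemeralAgent) is taken from the text.

```javascript
// Fields named in the round-6 description; later rounds extend the list.
const FINGERPRINT_FIELDS = ['endpoint', 'agent_id', 'model', 'spec', 'ephemeralAgent'];

// Hypothetical normalization: sort object keys and map undefined to null so
// that key order and omitted-vs-undefined fields cannot change the result.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value && typeof value === 'object') {
    return Object.keys(value)
      .sort()
      .reduce((acc, key) => {
        acc[key] = canonicalize(value[key]);
        return acc;
      }, {});
  }
  return value === undefined ? null : value;
}

// Returns a canonical string; a real implementation would likely hash it.
function fingerprintRequest(body) {
  const picked = {};
  for (const field of FINGERPRINT_FIELDS) picked[field] = canonicalize(body[field]);
  return JSON.stringify(picked);
}

const paused = fingerprintRequest({ endpoint: 'agents', ephemeralAgent: { web: true, code: false } });
const resumed = fingerprintRequest({ endpoint: 'agents', ephemeralAgent: { code: false, web: true } });
console.log(paused === resumed); // true
```

Two ephemeral requests with the same configuration compare equal regardless of key order, while any change to a listed field yields a different string, which is what lets the resume guard tell two ephemeral configs apart when agent_id is undefined.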

* fix(hitl): address Codex re-review (round 7)

Four findings verified against the code before fixing:

- RedisJobStore: clear `agent_id` on createJob (add it to staleHitlFields). The
  job hash is keyed by conversationId and reused across turns; updateMetadata
  only writes agent_id when truthy, so a conversation that switched from a saved
  agent to an ephemeral/no-agent turn kept the old id and the resume guard
  rejected the valid pause as a different agent. (real correctness bug)
- fingerprint: include `promptPrefix` in computeAgentRequestFingerprint, and
  re-send it on resume (ResumeAgentFields + buildResumeFields). Ephemeral agents
  derive their system instructions from promptPrefix, so a resume changing it
  previously passed the pin and rebuilt different instructions. (completes the
  round-6 fingerprint)
- resume: the re-pause branch now persists the segment's accumulated CONTENT
  (filtered), not just artifacts, so an approval that expires/reaps without a
  final resume no longer loses everything streamed during the resumed segment.
- request: carry `manualSkills`/`alwaysAppliedSkills` on the persisted user
  message so a resumed turn's reconstructed requestMessage keeps its skill pills
  instead of dropping them until a full reload.

Deferred (narrow, no safe contained fix yet — see PR thread replies):
- resume rebuild without `addedConvo` for a multi-conversation/added-agent pane
- cross-replica re-prime of manually-selected (not model-invoked) skill files

Tests: stale-agent createJob clearing (Redis integration), promptPrefix
fingerprint match/mismatch (resume.spec.js + policy.spec.ts), re-pause content
persistence (resume.spec.js).

* fix(hitl): address Codex re-review (round 8)

Five findings verified against the code before fixing; the headline is a durable-
resume correctness fix (the fingerprint had surfaced it as a 403):

- resume durability (the important one): persist the graph-determining request
  fields (endpoint, agent_id, model, spec, promptPrefix, ephemeralAgent) on the
  pending action as `resumeContext`, and REPLAY them onto the resume request via
  a router-level middleware that runs before buildEndpointOption. The client
  can't reconstruct the ephemeral-agent config after a reload/cross-session, so
  the round-6/7 fingerprint would 403 a valid durable resume — and even without
  it the rebuilt agent would lose its tools. Replaying server-side rebuilds the
  SAME graph regardless of client state (and a crafted resume can't swap it; the
  fingerprint still matches because the body is restored first).
- RedisJobStore: also clear `isTemporary` on createJob (same class as agent_id):
  a prior temporary turn's flag would otherwise survive a reused conversation
  hash and a later non-temporary resume would save its response as temporary.
- resume: persist `contextMeta` (context-window calibration) onto the saved
  response like BaseClient does, so the next turn can seed its pruner.
- request: carry manualSkills/alwaysAppliedSkills into the onStart metadata
  update (not just the preliminary one it overwrites), so a resumed turn's
  requestMessage keeps its skill pills.

Deferred (narrow — see thread reply):
- saved-agent edited WHILE a run is paused: agent_id matches but the definition
  changed; needs an agent version/config hash, which is a larger change for a
  narrow window.

Tests: resumeContext pick/apply + round-trip (policy.spec.ts), contextMeta +
manualSkills-on-requestMessage (resume.spec.js), isTemporary clearing (Redis
integration).

* style(hitl): prettier line-wrap in policy.spec.ts (R8 lint fix)

* fix(hitl): address Codex re-review (round 9)

Five findings, all fixed (addedConvo — deferred in rounds 7/8 — is now trivial
thanks to the round-8 replay):

- replay addedConvo: add it to RESUME_CONTEXT_KEYS so the resume middleware
  restores the parallel/secondary-agent config from the paused request; the
  client can't reconstruct it, and it determines the rebuilt graph.
- skill pills (the real fix this time): the round-8 onStart metadata write was
  overwritten by trackUserMessage (the authoritative userMessage writer). Carry
  manualSkills/alwaysAppliedSkills in the emitted `created` message and persist
  them in trackUserMessage; widen UserMessageMeta + SerializableJobData.userMessage.
- execute-code files on resume: seed the paused user message's own files onto
  req.body.files before initializeClient — they're excluded from the
  parent-walk code-session rebuild, so an approved code/read-file tool would
  otherwise resume without them.
- in-memory pending-action UI: route ApprovalEvents.ON_PENDING_ACTION in the
  resume replay/pending-event loops to applyPendingActionToMessages (mirror the
  live handler), so a pause that lands in the snapshot window still renders its
  approval controls instead of sitting paused with no UI.
- abort isTemporary: the /chat/abort partial-save now sources isTemporary from
  the job metadata, not req.body (the stop button posts only conversationId), so
  aborting a paused temporary chat no longer persists an orphaned partial.

Tests: addedConvo in pickResumeContext (policy.spec.ts), file-restore on resume
(resume.spec.js), abort-from-job-isTemporary (abort.spec.js).

* fix(hitl): address Codex re-review (round 10) — resume/expiry races

Three concurrency/coherence findings, verified against the code before fixing:

- expiry-sweep CAS scope: both stale-approval sweeps (GenerationJobManager
  expireStaleApprovals and the RedisJobStore requires_action cleanup) called
  expire()/transitionStatus WITHOUT the observed pendingAction.actionId, so the
  CAS only checked status===requires_action. Between the read and the CAS a user
  could resolve the observed action and the run re-pause on a FRESH action; the
  stale sweep would then abort that valid new pause. Now both pass the observed
  actionId as expectActionId, so the CAS only fires for the action read as stale
  (a re-paused action has a different id → no-op).
- resume graph cache: resumeCompletion cached the rebuilt graph (created with
  messages:[]) via setGraph; RedisJobStore.getContentParts prefers a cached
  graph over reconstructing from the chunk log, so a same-replica reload/status
  poll mid-resume returned aggregatedContent missing the pre-pause content. Skip
  setGraph on resume so introspection falls back to the complete chunk
  reconstruction (setContentParts still seeds the in-memory store).
- pending-action UI: applyPendingActionToMessages scheduled a SINGLE
  animation-frame retry then dropped the pending action; Recoil/React updates can
  take several frames under load, leaving a valid requires_action run with no
  approval controls. Retry across frames (bounded at 120) until the target
  message commits.

Test: expire() with a mismatched expectedActionId no-ops while the matching id
expires (pendingAction.spec.ts).
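
To illustrate the narrowed CAS scope, here is a sketch against an in-memory Map standing in for the job store. The function name, the job shape, and the error string are hypothetical; the parameter name follows the text's `expectActionId`. The point is only that the sweep acts on the action it actually observed.

```javascript
// Hypothetical in-memory stand-in for the expiry compare-and-set.
// Returns true only when the job is still paused on the SAME action
// the sweep read as stale.
function expireIfStillPending(jobs, streamId, expectActionId) {
  const job = jobs.get(streamId);
  if (!job || job.status !== 'requires_action') return false;
  if (job.pendingAction?.actionId !== expectActionId) return false; // re-paused on a fresh action
  job.status = 'aborted';
  job.error = 'approval expired';
  return true;
}

const jobs = new Map();
// The sweep read action "a1" as stale, but the user resolved it and the run
// re-paused on "a2" before the CAS ran.
jobs.set('convo-1', { status: 'requires_action', pendingAction: { actionId: 'a2' } });
console.log(expireIfStillPending(jobs, 'convo-1', 'a1')); // false
console.log(jobs.get('convo-1').status); // requires_action
```

In a Redis-backed store the same comparison would need to run atomically with the status transition; the Map version only shows the predicate.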

* chore(deps): update @librechat/agents to version 3.2.53 and @langchain/langgraph to version 1.4.7 in package-lock.json and related package.json files

* refactor(hitl): add resolveToolApprovalPolicy seam for layered policy

Extract the single point where tool-approval policy is resolved for a turn
(`resolveToolApprovalPolicy`) and route the run call site through it instead
of reading `endpoints.agents.toolApproval` inline.

Behaviour-preserving: only the `endpoint` layer is wired today, so the result
is identical to reading the app policy directly. The `agent` and `skills`
layers are reserved seams with documented precedence (endpoint owns the
`enabled` kill switch; agent overrides mode/allow/deny/ask/reason; skills may
only tighten), so future per-agent and per-skill policy plumbing lands in one
function rather than at the `createRun` site. Adds focused unit tests.
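
The documented precedence could look something like the sketch below once the reserved layers are wired. It is speculative: only the endpoint layer exists today per the text, and the function name, the layer shapes, the `mode` values, and modelling "tighten" as a union of `deny` lists are all assumptions made for illustration.

```javascript
// Hypothetical layered resolution; not the real resolveToolApprovalPolicy.
const AGENT_OVERRIDABLE = ['mode', 'allow', 'deny', 'ask', 'reason'];

function resolvePolicySketch({ endpoint, agent, skills = [] }) {
  // The endpoint owns the kill switch: no other layer can turn approval on.
  if (!endpoint?.enabled) return { enabled: false };

  const resolved = { ...endpoint };
  for (const key of AGENT_OVERRIDABLE) {
    if (agent?.[key] !== undefined) resolved[key] = agent[key];
  }

  // Assumption: "skills may only tighten" is modelled as adding deny entries.
  const deny = new Set(resolved.deny ?? []);
  for (const skill of skills) {
    for (const tool of skill.deny ?? []) deny.add(tool);
  }
  resolved.deny = [...deny];
  return resolved;
}

const policy = resolvePolicySketch({
  endpoint: { enabled: true, mode: 'ask', deny: ['shell'] },
  agent: { mode: 'strict', enabled: false }, // `enabled` here is ignored
  skills: [{ deny: ['delete_file'] }],
});
console.log(policy.enabled, policy.mode, policy.deny);
```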

* fix(hitl): address Codex re-review (round 11) — resume hardening

F1 (P2, security) — applyResumeContext now DELETES any RESUME_CONTEXT_KEY
absent from the persisted context, so the resume body carries exactly the
graph-determining fields the pause had. Previously only defined keys were
overwritten, leaving a client-supplied `addedConvo` (which the request
fingerprint does not cover) in place — a crafted resume could rebuild a
single-agent checkpoint as a different multi-agent graph/tool set.

F3 (P2) — the resume route ACKs (res.json) before initializeClient, so a
post-ACK getMCPRequestContext(req, res) saw the response as finished and
returned undefined, leaving the resumed run without its run-scoped MCP
connection store (approved MCP / OAuth-overlay tools then ran without their
request-scoped connections). Pre-seed the store with a null res +
cleanupOnResponse:false before the ACK and tear it down in the finally,
mirroring the normal stream path (request.js). userMCPAuthMap was already
preserved separately, so credentials were not lost — only the connection store.

Declined: the ApprovalContext NEW_CONVO guard (P2) is a false positive — the
`created` SSE event updates the conversation atom before any pause renders, so
the id is concrete by click time (details in the PR thread).

Tests: policy.spec (absent-key delete) + resume.spec (MCP context pre-seed/cleanup order).
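
The F1 delete-absent behaviour can be sketched as below. The key list shown is the one implied by rounds 8 and 9 and is illustrative only; the function name and the copy-instead-of-mutate style are assumptions, not the actual applyResumeContext.

```javascript
// Illustrative key list (rounds 8-9); later rounds add more keys.
const RESUME_CONTEXT_KEYS = [
  'endpoint',
  'agent_id',
  'model',
  'spec',
  'promptPrefix',
  'ephemeralAgent',
  'addedConvo',
];

// Hypothetical sketch: after this runs, every context key on the body holds
// exactly what was persisted at pause, and keys the pause did not have are
// removed even if the client supplied them.
function applyResumeContextSketch(body, persisted) {
  const next = { ...body };
  for (const key of RESUME_CONTEXT_KEYS) {
    if (persisted[key] !== undefined) next[key] = persisted[key];
    else delete next[key];
  }
  return next;
}

const crafted = { actionId: 'a1', endpoint: 'agents', addedConvo: { agent_id: 'other' } };
const restored = applyResumeContextSketch(crafted, { endpoint: 'agents', model: 'gpt-x' });
console.log('addedConvo' in restored); // false
```

Fields outside the context list, such as the resume decision payload, pass through untouched.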

* fix(hitl): address Codex re-review (round 12) — resume fidelity + multi-tool UI

F4 (P2) — temporal prompt vars: resume rebuilt the agent without restoring
req.conversationCreatedAt or req.body.timezone, so {{current_datetime}}-style
vars compiled a different system prompt than the paused graph (resume wall-clock,
unzoned). Add 'timezone' to RESUME_CONTEXT_KEYS (persisted at pause, replayed by
the resume middleware) and restore conversationCreatedAt from the convo before
initializeClient — mirroring the normal path's resolveConversationCreatedAt.

F5 (P2) — multi-tool approval: applyPendingActionToMessages stopped retrying once
ANY tool-call part was tagged, so siblings that rendered on later frames never got
approval controls and the resume route 400'd the partial batch. Add
countTaggedApprovalParts and keep the bounded RAF retry going until every
action_request is tagged (ask_user_question unchanged — one synthetic part).

F6 (P3) — Edit accepted `null`/`[]` (valid JSON, non-object), enabling Submit for
a value the resume route rejects via findIncompleteDecisions. Mirror the server's
plain-object check in the client (store + editIsValid) so Submit only enables for
an accepted value.

Tests: policy.spec (timezone round-trip), resume.spec (conversationCreatedAt
restore), approval.spec (countTaggedApprovalParts).
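
For F6, the plain-object check mirrored on the client might look like this. The helper names are hypothetical (the text names only editIsValid and findIncompleteDecisions); the sketch just shows why `null` and `[]` parse as valid JSON yet must not enable Submit.

```javascript
// Hypothetical helpers; a plain object is a non-null, non-array object.
function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Returns the parsed arguments when the edit is acceptable, otherwise null.
function parseEditedArguments(text) {
  try {
    const parsed = JSON.parse(text);
    return isPlainObject(parsed) ? parsed : null;
  } catch {
    return null; // not JSON at all
  }
}

console.log(parseEditedArguments('null')); // null
console.log(parseEditedArguments('[]')); // null
console.log(parseEditedArguments('{"path":"/tmp/a"}')); // { path: '/tmp/a' }
```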

* fix(hitl): address Codex re-review (round 13) — recurse into subagent approvals

F9 (P2) — a tool paused INSIDE a subagent has its tool_call_id in the parent
subagent tool_call's nested `subagent_content`, not as a top-level message part.
applyToolApproval and countTaggedApprovalParts only scanned top-level content, so
the approval never attached and the round-12 retry loop counted 0 tagged parts and
spun to its frame cap with no controls. Both now recurse into `subagent_content`
(immutably, so React refs update): the nested call gets tagged and is counted, so
the retry terminates. Added approval.spec cases for the nested tag + count.

Note: surfacing the interactive approve/reject controls inside the subagent view is
a deliberate follow-up — ToolApproval -> useResumeSubmit -> useChatContext crashes
when rendered in the portaled subagent dialog (outside the chat/approval providers),
so that needs the controls scoped to the in-provider inline render (or the dialog
wrapped with the providers). This commit fixes the data/traversal layer only.

F7 (discovered-tool history on resume) and F8 (redis chunk TTL pause race) were
verified false positives — see the PR threads.
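
A rough sketch of the F9 traversal: tag and count approvals recursively through `subagent_content` without mutating the input. The part shape (`{ type, tool_call: { id, subagent_content } }`), the `approval` field, and both function names are assumptions for illustration and not the actual applyToolApproval / countTaggedApprovalParts.

```javascript
// Hypothetical immutable tagger. Returns the SAME array reference when
// nothing changed, and new objects along the changed path otherwise.
function tagApproval(parts, approvals) {
  let changed = false;
  const next = parts.map((part) => {
    const call = part.tool_call;
    if (!call) return part;
    let updated = call;
    if (approvals.has(call.id) && !call.approval) {
      updated = { ...updated, approval: approvals.get(call.id) };
    }
    if (Array.isArray(call.subagent_content)) {
      const nested = tagApproval(call.subagent_content, approvals);
      if (nested !== call.subagent_content) updated = { ...updated, subagent_content: nested };
    }
    if (updated === call) return part;
    changed = true;
    return { ...part, tool_call: updated };
  });
  return changed ? next : parts;
}

// Counts tagged tool calls at every nesting depth.
function countTagged(parts) {
  return parts.reduce((total, part) => {
    const call = part.tool_call;
    if (!call) return total;
    const nested = Array.isArray(call.subagent_content) ? countTagged(call.subagent_content) : 0;
    return total + (call.approval ? 1 : 0) + nested;
  }, 0);
}

const parts = [
  { type: 'tool_call', tool_call: { id: 'parent', subagent_content: [{ type: 'tool_call', tool_call: { id: 'child' } }] } },
];
const tagged = tagApproval(parts, new Map([['child', { allowed_decisions: ['approve', 'reject'] }]]));
console.log(countTagged(tagged)); // 1
```

Because the nested call is now counted, a bounded retry loop that waits for every action_request to be tagged can terminate instead of running to its frame cap.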

* fix(hitl): address Codex re-review (round 14) — resume fidelity + expiry relay

F13 (P2) — manualSkills are graph-determining (skill allowed-tools union into the
tool set before tools load) but weren't replayed, so a reload lost the skill tools
and a crafted resume could inject a different skill past the fingerprint. Add
'manualSkills' to RESUME_CONTEXT_KEYS (same replay-only pattern as timezone/
addedConvo; the delete-absent half blocks injection). Not alwaysAppliedSkills —
that's resolved server-side from the DB, not req.body.

F12 (P2) — the resume final SSE built requestMessage from job.metadata.userMessage
(persisted without files), so attachments vanished from the user bubble on resume.
Spread the already-restored req.body.files onto it, matching the normal path.

F11 (P2) — multi-replica approval expiry: RedisJobStore.cleanupRequiresActionIndex
on another replica can win the requires_action->aborted CAS (it sets the hash error
but has no event transport), and the local sweep then skips because the job is no
longer requires_action, so a client subscribed here never gets the terminal error
until the reap path. expireStaleApprovals now relays APPROVAL_EXPIRED_ERROR for a
locally-subscribed job already aborted FOR approval expiry (error-string gated,
idempotent via the errorEvent flag). emitError already publishes cross-replica.

Tests: policy.spec (manualSkills round-trip + inject-drop), resume.spec (final
requestMessage carries restored files).

* fix(hitl): render approval controls for subagent-nested tool pauses (F10)

Round-13 made applyToolApproval/countTaggedApprovalParts recurse into
subagent_content (data), but SubagentDialogPart rendered nested TOOL_CALL parts
with <ToolCall> only and never mounted <ToolApproval>, so a tool paused inside a
subagent showed no controls and the run was unresolvable.

Render <ToolApproval> in SubagentDialogPart's TOOL_CALL branch when the nested
tool_call carries an approval and isn't yet resolved, mirroring the top-level
Part.tsx render. The subagent dialog portals (OGDialog → ReactDOM.createPortal),
but React context flows through the React tree, not the DOM tree, so ToolApproval
resolves ApprovalProvider/ChatContext and the controls work + submit.

Also harden useResumeSubmit: read ChatContext via useContext (non-throwing)
instead of the throwing useChatContext wrapper, so the cards never crash when
rendered outside a ChatContext.Provider (e.g. a search/citation render that passes
chat context as a prop) — they degrade to inert (buildResumeFields returns null).

* style(hitl): re-sort run.ts imports after dev rebase

* fix(hitl): address Codex re-review (round 15) — resume content fidelity

F14 (P2) — hide_sequential_outputs was applied in chatCompletion before
saving/emitting content but not on resume, so a sequential-agent chain that
pauses for HITL and resumes persisted/emitted intermediate outputs the setting
is meant to hide. Extracted the filter into applyHideSequentialOutputsFilter()
and call it from both chatCompletion and resumeCompletion (after handleRunInterrupt,
covering the finalize + re-pause reads of client.contentParts).

F16 (P2) — on a reloaded HITL pause, the DB already holds the paused user row +
partial assistant row; useResumeOnLoad fed those as submission.messages, then
finalHandler/createdHandler appended the same pair via requestMessage/responseMessage,
duplicating the turn (buildTree doesn't dedupe children by messageId).
buildSubmissionFromResumeState now strips the paused user/response rows (by messageId, incl. the
padded/unpadded response id) from submission.messages — they're re-supplied by the
placeholders + final event. Frontend-only; live (non-reload) pause path untouched.

Deferred: F15 (collapsed-card subagent approval registration/visibility) — see thread.

Tests: client.test (filter keeps last + tool_call parts / no-op when off),
useResumeOnLoad.spec (paused pair stripped from submission.messages).

* fix(hitl): address Codex re-review (round 16) — chunk TTL, slot, job replacement

F17 (P2) — chunk-stream TTL on pause-before-chunk. CHUNK_APPEND_LUA derived its
ceiling only from the chunk key's current TTL, so when the chunks key didn't exist
at pause (fire-and-forget append in flight, or an ask-user pause before any chunk),
the on_pending_action append created the stream with only the 20m running TTL while
the approval window is 24h — content evicted before resume. The Lua now also reads
the job key (KEYS[2]); when status == requires_action it takes max(running, TTL(jobKey))
(the approval window transitionStatus set), else the running TTL. Extend-only preserved;
gated on paused status so normal runs never inflate. Both keys share {streamId} (cluster-safe).

F19 (P2) — with LIMIT_CONCURRENT_MESSAGES, the approval prompt was emitted before the
original request released its slot, so a fast Approve got /resume 429'd. handleRunInterrupt
now releases the slot (idempotent via pendingRequestReleased) right after the pause, before
the prompt; the request.js pause branch and resume.js finally only release if it didn't
(no double-release).

F20 (P2) — finalizeResumedTurn never checked the job wasn't replaced before emitDone/
completeJob/saveMessage, so a stale resume could clobber a newer turn that reused the
conversationId. Added the createdAt guard the normal request path uses (skip finalization
when the live job's createdAt != the paused job's).

Deferred: F18 (subagent_content not reconstructed on Redis resume) — joins the subagent
cluster (F15). See thread.

Tests: RedisJobStore integration (pause-before-chunk gets approval TTL; running stays short),
resume.spec (skip finalization on replacement; no double slot release on re-pause).
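
The F19 idempotent release can be illustrated with a flag-guarded helper. Where `pendingRequestReleased` actually lives and the real decrement call are not stated in the text, so placing the flag on a `client` object and passing a synchronous `decrement` callback are assumptions.

```javascript
// Hypothetical sketch: release the concurrency slot at most once per request,
// no matter how many code paths (pause, request finally, resume finally) try.
function releasePendingRequest(client, decrement) {
  if (client.pendingRequestReleased) return false;
  client.pendingRequestReleased = true;
  decrement();
  return true;
}

let activeRequests = 1;
const client = {};
releasePendingRequest(client, () => { activeRequests -= 1; }); // right after the pause
releasePendingRequest(client, () => { activeRequests -= 1; }); // later finally: no-op
console.log(activeRequests); // 0
```

Setting the flag before calling `decrement` keeps a second caller from releasing again even if the first release is still in progress.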

* 🛡️ fix: Guard HITL terminal side-effects against job replacement

Jobs are keyed by streamId == conversationId, so a new request REPLACES the
running one on the same conversation. The replaced generation's tail must not
clobber the live generation's state. Each path now re-reads the live job and
compares createdAt against the generation's captured identity before acting.

- Thread the generation's createdAt onto the client (request.js + resume.js)
  as client.jobCreatedAt — the identity every guard compares against.
- handleRunInterrupt: skip approvals.pause when this run is no longer the live
  job, so a stale interrupt can't flip the NEWER job to requires_action.
- chatCompletion finally: skip the checkpoint prune when replaced, so an older
  run's late finally can't delete the newer run's resume checkpoint.
- resume catch-path: gate emitError/completeJob/prune behind a stillLive check
  (fail-open if the read throws), mirroring finalizeResumedTurn's success guard.
- Persist the turn's uploaded files on job.metadata.userMessage (authoritative
  trackUserMessage writer) and prefer them on resume over the user DB row, whose
  save can still be racing a fast /resume.

Tests: 13 guard-predicate cases in jobReplacement.spec.js.
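
The guard predicate shared by these paths can be sketched as below. It is synchronous for brevity (the real job read is presumably asynchronous), the function name is hypothetical, and treating a missing job as "not live" is an assumption; the createdAt comparison and the fail-open behaviour on a throwing read come from the text.

```javascript
// Hypothetical guard: is the generation that captured `jobCreatedAt`
// still the live job for this stream?
function isStillLive(readJob, streamId, jobCreatedAt) {
  try {
    const live = readJob(streamId);
    if (!live) return false; // assumption: a missing job counts as replaced/gone
    return live.createdAt === jobCreatedAt;
  } catch {
    return true; // fail-open: a failed read must not block terminal cleanup
  }
}

const store = new Map([['convo-1', { createdAt: 2000 }]]);
const readJob = (id) => store.get(id);

// An older generation (createdAt 1000) finishing late must skip its side effects.
console.log(isStillLive(readJob, 'convo-1', 1000)); // false
console.log(isStillLive(readJob, 'convo-1', 2000)); // true
```

Each terminal side effect (pause, prune, emitError, completeJob) would be wrapped in a check like this, re-read immediately before acting rather than once at the start.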

* 🔁 fix: Harden HITL resume — ownership re-check, file seeding, deferred-tool replay

Three follow-ups to the round-17 job-replacement guards (Codex review 4594099963):

- G1 (resume.js): the success-path ownership guard runs at the START of
  finalizeResumedTurn, but saveMessage + first-turn title generation await long
  enough for a new request to replace the job on the same conversationId. Re-read
  the live job immediately before emitDone/completeJob/prune so the terminal writes
  can't tear down the REPLACEMENT job — mirrors the catch-path guard.

- G2 (request.js): onStart's metadata/chunk writes that persist the turn's files
  are fire-and-forget, so a fast approval could read job.metadata.userMessage before
  files landed. Seed files into getPreliminaryUserMessage instead — that write is
  AWAITED before the run starts, so files are durable before any interrupt can emit.

- G3 (run.ts + client.js + resume.js + IJobStore.ts): the resumed graph is rebuilt
  with messages: [], so createRun's tool_search-discovery scan finds nothing. A
  deferred tool discovered earlier in the turn (and targeted by the paused call) was
  therefore absent from the rebuilt schema-only toolMap — resume would throw "unknown
  tool" (no loadRuntimeTools fallback is wired). Capture discovered tool names at
  pause via extractDiscoveredToolsFromHistory(run.getRunMessages()), persist them on
  job.metadata.discoveredTools, and replay them into createRun's new discoveredToolNames
  input (merged with message-extracted names, gated on hasAnyDeferredTools — inert
  otherwise). A new createRun test proves the deferred tool is promoted with the replay
  and absent without it (reproducing the bug).

Tests: real createRun deferred-replay suite (run-summarization.test.ts) + G1/G2/G3
guard predicates (jobReplacement.spec.js). Full suite green.
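
As a rough illustration of the G3 replay, the merge of message-extracted and replayed tool names might be sketched like this. `resolveDiscoveredTools` and the tool name `deferred_search` are hypothetical; only the `hasAnyDeferredTools` gate and the merge behavior come from the description above.

```javascript
// Hypothetical helper: combine names found in message history with names
// replayed from the paused job's metadata. Not the actual createRun code.
function resolveDiscoveredTools({ fromMessages = [], replayed = [], hasAnyDeferredTools }) {
  if (!hasAnyDeferredTools) {
    return new Set(); // inert when no tool is deferred
  }
  // A Set removes names that appear in both sources.
  return new Set([...fromMessages, ...replayed]);
}

// On resume the graph is rebuilt with no messages, so only the replay contributes.
const names = resolveDiscoveredTools({
  fromMessages: [],
  replayed: ['deferred_search'],
  hasAnyDeferredTools: true,
});
console.log([...names]); // [ 'deferred_search' ]
```

Without the `replayed` input the resulting set is empty, which corresponds to the "unknown tool" failure the fix addresses.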

* 🔒 fix: Close HITL resume metadata + file-substitution + pause-race gaps

Four findings on the round-18 commit (Codex review 4594430222):

- H1 (P1, regression in round-18 G3): the discoveredTools captured at pause never
  reached resume — three metadata allowlists dropped it: GenerationJobManager
  .updateMetadata, RedisJobStore.deserializeJob, and buildJobFacade (plus the
  GenerationJobMetadata type). Added discoveredTools to all four, so the deferred-tool
  replay actually works end-to-end (in-memory store already kept it via Object.assign).

- H2 (P2, security): /resume honored a client-supplied `files` array, letting a crafted
  client resume an approved code/read-file tool against a DIFFERENT file set than the one
  approved (files aren't in the resume fingerprint/context). Resume now ALWAYS sources
  files from the paused job (metadata → DB row), clearing any client-supplied set.

- H3 (P2, ephemeral fidelity): non-default model parameters (temperature, max tokens,
  custom endpoint params) were lost on resume — ephemeral agents derive them from the
  request body, which the resume payload omits. Capture the resolved model_parameters in
  resumeContext at pause and replay them onto the body on resume (excluding `model`, which
  is replayed via the fingerprinted RESUME_CONTEXT_KEYS path). Saved agents already source
  these from the DB.

- H4 (P2, Redis race): a pause landing between the resume snapshot and the Pub/Sub
  subscription reached neither resumeState.pendingAction nor (Redis) pendingEvents, and
  approval events aren't persisted to replayEvents — the client attached to a paused job
  with no approval UI. subscribeWithResume now re-reads the live job AFTER subscribing and
  surfaces the pending action if the snapshot missed it (live read, no staleness).

Tests: discoveredTools metadata round-trip + subscribeWithResume re-read (pendingAction
.spec.ts); client-file substitution rejection (resume.spec.js); model-parameter replay
predicate (jobReplacement.spec.js).

* 🧹 fix: Clear stale discovered tools, release slot on claim error, extend run-step TTL

Three follow-ups on the round-19 commit (Codex review 4594783691):

- I1 (P2): the round-19 discoveredTools field wasn't cleared on Redis streamId reuse.
  HSET only overwrites listed fields and handleRunInterrupt only writes discoveredTools
  when THIS turn discovers a deferred tool — so a replacement turn that pauses without its
  own discovery inherited the prior run's tool names and force-loaded undiscovered deferred
  tools on resume. Added discoveredTools to createJob's staleHitlFields HDEL list (the
  in-memory store already builds a fresh object, so it was Redis-only).

- I2 (P2): with LIMIT_CONCURRENT_MESSAGES, approvals.resolve runs after the slot increment
  but before the run's try/finally, so a store/Redis error there leaked the slot until the
  counter TTL expired (spurious 429s on retry of the still-paused approval). Wrapped the
  claim in try/catch that decrements the slot and returns 500.

- I3 (P3): saveRunSteps did SET ... EX running unconditionally, resetting the run-steps key
  to the 20-min running TTL even while the job is paused for the longer approval window —
  a reload after that window lost the tool timeline. Now uses a paused-window TTL script
  mirroring the chunk-stream no-shrink behavior (extends to the approval window when the
  job hash is requires_action).

Also fixes a latent strict-tsc cast error in the round-19 pendingAction test.

Tests: claim-throws-releases-slot (resume.spec.js); discoveredTools cleared on reuse +
saveRunSteps preserves the paused TTL (RedisJobStore integration, USE_REDIS).
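
The I2 fix follows a common acquire/release shape. A minimal sketch, assuming injected `incrementSlot`, `decrementSlot`, and `resolveApproval` callbacks (all hypothetical names, not the route's real helpers):

```javascript
// Hypothetical fragment of a resume handler. The slot is taken first, so a
// failure while claiming the approval has to give the slot back explicitly.
async function claimWithSlot({ incrementSlot, decrementSlot, resolveApproval }) {
  await incrementSlot();
  try {
    const claim = await resolveApproval();
    // On success the slot stays held; in the flow described above the run's
    // own try/finally releases it later.
    return { status: 200, claim };
  } catch (err) {
    // Without this decrement the slot would leak until the counter TTL expired.
    await decrementSlot();
    return { status: 500, error: 'Failed to claim approval' };
  }
}
```

Returning a status object keeps the sketch independent of any HTTP framework; a real handler would translate it into the response.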

* 🛡️ fix: Guard fast-resume save race, gate HITL to resumable routes, expire on stale submit

Three findings on the round-20 commit (Codex review 4595045652):

- J2 (P1): a fast /resume can claim + finalize the COMPLETED response while the original
  request's pause branch is still awaiting `response.databasePromise`; the later
  unfinished-save then overwrites the completed content. Re-check the job is still paused on
  THIS generation's action (a claim leaves requires_action; a replacement bumps createdAt)
  before marking the row unfinished; fail open on a read error.

- J3 (P1): the tool-approval wiring (humanInTheLoop + PreToolUse hook + checkpointer) was
  applied to EVERY createRun caller when toolApproval.enabled, but the OpenAI-compatible and
  Responses controllers never inspect run.getInterrupt() or persist a pending action — an
  approval-gated tool would pause there with no approval surface or resume endpoint and the
  route would emit a normal final response / [DONE] with the tool call dangling. Gate the
  wiring on a new createRun `hitlCapable` flag, set only by AgentClient (chat + resume).

- J4 (P2): a stale-action 409 on submit returned without driving expiry, leaving the job
  requires_action with a dead action until the periodic sweeper ran — any attached SSE client
  got no terminal event and the stream appeared to hang. Extracted GenerationJobManager
  .expireApproval(streamId, actionId) (expire CAS + terminal SSE, shared with the sweeper) and
  call it from the resume route when the observed action is stale.

J1 (nested subagent approval controls not mounting while the details dialog is closed) is a
valid frontend issue in the deferred subagent-HITL path — tracked separately (replied on the
thread) since the fix touches the shared dialog primitive and needs UI verification.

Tests: HITL-gate both directions (run-summarization.test.ts); expire-on-stale-submit
(resume.spec.js); fast-resume unfinished-save guard predicate (jobReplacement.spec.js).
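
The J3 gate amounts to requiring two independent conditions before any approval wiring is attached. A sketch under stated assumptions: `buildHitlOptions` and the returned option keys are invented for the example, and only `toolApproval.enabled` and `hitlCapable` come from the description above.

```javascript
// Hypothetical option builder; the real createRun signature is not shown in this message.
function buildHitlOptions({ toolApproval, hitlCapable = false }) {
  const enabled = toolApproval?.enabled === true && hitlCapable === true;
  if (!enabled) {
    // No hook, no checkpointer: routes without an approval surface never pause.
    return {};
  }
  // Placeholder keys standing in for the hook, interrupt, and checkpointer wiring.
  return { humanInTheLoop: true, preToolUseHook: true, checkpointer: true };
}

// An OpenAI-compatible route leaves hitlCapable unset, so nothing is wired.
console.log(buildHitlOptions({ toolApproval: { enabled: true } })); // {}
```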

* 💄 style: Wrap captureAgents signature to satisfy prettier (CI lint)
2026-06-29 16:56:41 -04:00


const { Providers } = require('@librechat/agents');
const { Constants, ContentTypes, EModelEndpoint } = require('librechat-data-provider');
const AgentClient = require('./client');
jest.mock('@librechat/agents', () => ({
...jest.requireActual('@librechat/agents'),
createMetadataAggregator: () => ({
handleLLMEnd: jest.fn(),
collected: [],
}),
}));
jest.mock('@librechat/api', () => ({
...jest.requireActual('@librechat/api'),
checkAccess: jest.fn(),
countFormattedMessageTokens: jest.fn(() => 42),
countTokens: jest.fn((text) => Math.ceil(String(text ?? '').length / 4)),
initializeAgent: jest.fn(),
createMemoryProcessor: jest.fn(),
isMemoryAgentEnabled: jest.fn((config) => {
if (!config || config.disabled === true) return false;
const agent = config.agent;
if (agent?.enabled !== true) return false;
return Boolean(agent.id || (agent.provider && agent.model));
}),
loadAgent: jest.fn(),
}));
jest.mock('~/server/services/Config', () => ({
getMCPServerTools: jest.fn(),
}));
jest.mock('~/server/services/MCP', () => ({
resolveConfigServers: jest.fn().mockResolvedValue({}),
}));
jest.mock('~/models', () => ({
getAgent: jest.fn(),
getRoleByName: jest.fn(),
getFormattedMemories: jest.fn(),
}));
// Mock getMCPManager
const mockFormatInstructions = jest.fn();
jest.mock('~/config', () => ({
getMCPManager: jest.fn(() => ({
formatInstructionsForContext: mockFormatInstructions,
})),
}));
describe('AgentClient - applyHideSequentialOutputsFilter', () => {
const textPart = (text) => ({ type: ContentTypes.TEXT, text });
const toolCallPart = (id) => ({ type: ContentTypes.TOOL_CALL, tool_call: { id } });
it('keeps only the last part + tool_call parts when hide_sequential_outputs is on', () => {
const ctx = {
options: { agent: { hide_sequential_outputs: true } },
contentParts: [
textPart('intermediate'),
toolCallPart('tc1'),
textPart('reasoning'),
textPart('final'),
],
};
AgentClient.prototype.applyHideSequentialOutputsFilter.call(ctx);
expect(ctx.contentParts).toEqual([toolCallPart('tc1'), textPart('final')]);
});
it('is a no-op when hide_sequential_outputs is off', () => {
const parts = [textPart('a'), textPart('b')];
const ctx = { options: { agent: { hide_sequential_outputs: false } }, contentParts: parts };
AgentClient.prototype.applyHideSequentialOutputsFilter.call(ctx);
expect(ctx.contentParts).toEqual([textPart('a'), textPart('b')]);
});
});
describe('AgentClient - titleConvo', () => {
let client;
let mockRun;
let mockReq;
let mockRes;
let mockAgent;
let mockOptions;
beforeEach(() => {
// Reset all mocks
jest.clearAllMocks();
// Mock run object
mockRun = {
generateTitle: jest.fn().mockResolvedValue({
title: 'Generated Title',
}),
};
// Mock agent - with both endpoint and provider
mockAgent = {
id: 'agent-123',
endpoint: EModelEndpoint.openAI, // Use a valid provider as endpoint for getProviderConfig
provider: EModelEndpoint.openAI, // Add provider property
model_parameters: {
model: 'gpt-4',
},
};
// Mock request and response
mockReq = {
user: {
id: 'user-123',
},
body: {
model: 'gpt-4',
endpoint: EModelEndpoint.openAI,
key: null,
},
config: {
endpoints: {
[EModelEndpoint.openAI]: {
// Match the agent endpoint
titleModel: 'gpt-3.5-turbo',
titlePrompt: 'Custom title prompt',
titleMethod: 'structured',
titlePromptTemplate: 'Template: {{content}}',
},
},
},
};
mockRes = {};
// Mock options
mockOptions = {
req: mockReq,
res: mockRes,
agent: mockAgent,
endpointTokenConfig: {},
};
// Create client instance
client = new AgentClient(mockOptions);
client.run = mockRun;
client.responseMessageId = 'response-123';
client.conversationId = 'convo-123';
client.contentParts = [{ type: 'text', text: 'Test content' }];
client.recordCollectedUsage = jest.fn().mockResolvedValue(); // Mock as async function that resolves
});
describe('titleConvo method', () => {
it('should throw error if run is not initialized', async () => {
client.run = null;
await expect(
client.titleConvo({ text: 'Test', abortController: new AbortController() }),
).rejects.toThrow('Run not initialized');
});
it('waits for the run in immediate mode instead of throwing', async () => {
client.run = null;
const abortController = new AbortController();
const titlePromise = client.titleConvo({ text: 'Test', abortController, immediate: true });
// Simulate `chatCompletion` assigning the run (client.js: `this.run = run`).
client.run = mockRun;
client._resolveRun(mockRun);
await titlePromise;
expect(mockRun.generateTitle).toHaveBeenCalled();
});
it('passes empty contentParts in immediate mode (title from the user input only)', async () => {
client.contentParts = [{ type: 'text', text: 'Streaming response so far' }];
const abortController = new AbortController();
await client.titleConvo({ text: 'Hello there', abortController, immediate: true });
const call = mockRun.generateTitle.mock.calls[0][0];
expect(call.contentParts).toEqual([]);
expect(call.inputText).toBe('Hello there');
});
it('uses live contentParts in non-immediate (final) mode', async () => {
client.contentParts = [{ type: 'text', text: 'Full response' }];
const abortController = new AbortController();
await client.titleConvo({ text: 'Hello there', abortController });
const call = mockRun.generateTitle.mock.calls[0][0];
expect(call.contentParts).toEqual([{ type: 'text', text: 'Full response' }]);
});
it('rejects promptly when aborted before the run initializes in immediate mode', async () => {
client.run = null;
const abortController = new AbortController();
abortController.abort();
await expect(
client.titleConvo({ text: 'Test', abortController, immediate: true }),
).rejects.toThrow('Aborted before run initialization');
expect(mockRun.generateTitle).not.toHaveBeenCalled();
});
it('should use titlePrompt from endpoint config', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titlePrompt: 'Custom title prompt',
}),
);
});
it('should use titlePromptTemplate from endpoint config', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titlePromptTemplate: 'Template: {{content}}',
}),
);
});
it('should use titleMethod from endpoint config', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
provider: Providers.OPENAI,
titleMethod: 'structured',
}),
);
});
it('should use titleModel from endpoint config when provided', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Check that generateTitle was called with correct clientOptions
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('gpt-3.5-turbo');
});
it('preserves Anthropic custom headers on title requests despite omitTitleOptions', async () => {
const prevKey = process.env.ANTHROPIC_API_KEY;
process.env.ANTHROPIC_API_KEY = 'sk-ant-test';
try {
const req = {
user: { id: 'user-123' },
body: { model: 'claude-sonnet-4-5', endpoint: EModelEndpoint.anthropic, key: null },
config: {
endpoints: {
[EModelEndpoint.anthropic]: {
headers: { 'X-Conversation-Id': '{{LIBRECHAT_BODY_CONVERSATIONID}}' },
},
},
},
};
const agent = {
id: 'agent-anthropic',
endpoint: EModelEndpoint.anthropic,
provider: EModelEndpoint.anthropic,
model_parameters: { model: 'claude-sonnet-4-5' },
};
const anthropicClient = new AgentClient({ req, res: {}, agent, endpointTokenConfig: {} });
anthropicClient.run = mockRun;
anthropicClient.responseMessageId = 'response-123';
anthropicClient.conversationId = 'convo-123';
anthropicClient.contentParts = [{ type: 'text', text: 'Test content' }];
anthropicClient.recordCollectedUsage = jest.fn().mockResolvedValue();
await anthropicClient.titleConvo({ text: 'Hello', abortController: new AbortController() });
const defaultHeaders =
mockRun.generateTitle.mock.calls[0][0].clientOptions?.clientOptions?.defaultHeaders;
// Custom header survives the `omitTitleOptions` strip and resolves the conversationId
expect(defaultHeaders?.['X-Conversation-Id']).toBe('convo-123');
// Provider-managed beta header is preserved alongside it
expect(defaultHeaders?.['anthropic-beta']).toBeDefined();
} finally {
if (prevKey === undefined) {
delete process.env.ANTHROPIC_API_KEY;
} else {
process.env.ANTHROPIC_API_KEY = prevKey;
}
}
});
it('should handle missing endpoint config gracefully', async () => {
// Remove endpoint config
mockReq.config = { endpoints: {} };
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titlePrompt: undefined,
titlePromptTemplate: undefined,
titleMethod: undefined,
}),
);
});
it('should use agent model when titleModel is not provided', async () => {
// Remove titleModel from config
mockReq.config = {
endpoints: {
[EModelEndpoint.openAI]: {
titlePrompt: 'Custom title prompt',
titleMethod: 'structured',
titlePromptTemplate: 'Template: {{content}}',
// titleModel is omitted
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('gpt-4'); // Should use agent's model
});
it('should not use titleModel when it equals CURRENT_MODEL constant', async () => {
mockReq.config = {
endpoints: {
[EModelEndpoint.openAI]: {
titleModel: Constants.CURRENT_MODEL,
titlePrompt: 'Custom title prompt',
titleMethod: 'structured',
titlePromptTemplate: 'Template: {{content}}',
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('gpt-4'); // Should use agent's model
});
it('should pass all required parameters to generateTitle', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
expect(mockRun.generateTitle).toHaveBeenCalledWith({
provider: expect.any(String),
inputText: text,
contentParts: client.contentParts,
clientOptions: expect.objectContaining({
model: 'gpt-3.5-turbo',
}),
titlePrompt: 'Custom title prompt',
titlePromptTemplate: 'Template: {{content}}',
titleMethod: 'structured',
chainOptions: expect.objectContaining({
signal: abortController.signal,
}),
});
});
it('should record collected usage after title generation', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
expect(client.recordCollectedUsage).toHaveBeenCalledWith({
model: 'gpt-3.5-turbo',
context: 'title',
collectedUsage: expect.any(Array),
balance: {
enabled: false,
},
transactions: {
enabled: true,
},
messageId: 'response-123',
});
});
it('should return the generated title', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
expect(result).toBe('Generated Title');
});
it('should sanitize the generated title by removing think blocks', async () => {
const titleWithThinkBlock = '<think>reasoning about the title</think> User Hi Greeting';
mockRun.generateTitle.mockResolvedValue({
title: titleWithThinkBlock,
});
const text = 'Test conversation text';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
// Should remove the <think> block and return only the clean title
expect(result).toBe('User Hi Greeting');
expect(result).not.toContain('<think>');
expect(result).not.toContain('</think>');
});
it('should return fallback title when sanitization results in empty string', async () => {
const titleOnlyThinkBlock = '<think>only reasoning no actual title</think>';
mockRun.generateTitle.mockResolvedValue({
title: titleOnlyThinkBlock,
});
const text = 'Test conversation text';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
// Should return the fallback title since sanitization would result in empty string
expect(result).toBe('Untitled Conversation');
});
it('should handle errors gracefully and return undefined', async () => {
mockRun.generateTitle.mockRejectedValue(new Error('Title generation failed'));
const text = 'Test conversation text';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
expect(result).toBeUndefined();
});
it('should skip title generation when titleConvo is set to false', async () => {
// Set titleConvo to false in endpoint config
mockReq.config = {
endpoints: {
[EModelEndpoint.openAI]: {
titleConvo: false,
titleModel: 'gpt-3.5-turbo',
titlePrompt: 'Custom title prompt',
titleMethod: 'structured',
titlePromptTemplate: 'Template: {{content}}',
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
// Should return undefined without generating title
expect(result).toBeUndefined();
// generateTitle should NOT have been called
expect(mockRun.generateTitle).not.toHaveBeenCalled();
// recordCollectedUsage should NOT have been called
expect(client.recordCollectedUsage).not.toHaveBeenCalled();
});
it('should skip title generation for temporary chats', async () => {
// Set isTemporary to true
mockReq.body.isTemporary = true;
const text = 'Test temporary chat';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
// Should return undefined without generating title
expect(result).toBeUndefined();
// generateTitle should NOT have been called
expect(mockRun.generateTitle).not.toHaveBeenCalled();
// recordCollectedUsage should NOT have been called
expect(client.recordCollectedUsage).not.toHaveBeenCalled();
});
it('should skip title generation when titleConvo is false in all config', async () => {
// Set titleConvo to false in "all" config
mockReq.config = {
endpoints: {
all: {
titleConvo: false,
titleModel: 'gpt-4o-mini',
titlePrompt: 'All config title prompt',
titleMethod: 'completion',
titlePromptTemplate: 'All config template',
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
// Should return undefined without generating title
expect(result).toBeUndefined();
// generateTitle should NOT have been called
expect(mockRun.generateTitle).not.toHaveBeenCalled();
// recordCollectedUsage should NOT have been called
expect(client.recordCollectedUsage).not.toHaveBeenCalled();
});
it('should skip title generation when titleConvo is false for custom endpoint scenario', async () => {
// This test validates the behavior when customEndpointConfig (retrieved via
// getProviderConfig for custom endpoints) has titleConvo: false.
//
// The code path is:
// 1. endpoints?.all is checked (undefined in this test)
// 2. endpoints?.[endpoint] is checked (our test config)
// 3. Would fall back to titleProviderConfig.customEndpointConfig (for real custom endpoints)
//
// We simulate a custom endpoint scenario using a dynamically named endpoint config.
// First, create a unique endpoint name that represents a custom endpoint.
const customEndpointName = 'customEndpoint';
// Configure the endpoint to have titleConvo: false
// This simulates what would be in customEndpointConfig for a real custom endpoint
mockReq.config = {
endpoints: {
// No 'all' config - so it will check endpoints[endpoint]
// This config represents what customEndpointConfig would contain
[customEndpointName]: {
titleConvo: false,
titleModel: 'custom-model-v1',
titlePrompt: 'Custom endpoint title prompt',
titleMethod: 'completion',
titlePromptTemplate: 'Custom template: {{content}}',
baseURL: 'https://api.custom-llm.com/v1',
apiKey: 'test-custom-key',
// Additional custom endpoint properties
models: {
default: ['custom-model-v1', 'custom-model-v2'],
},
},
},
};
// Set up agent to use our custom endpoint
// Use openAI as base but override with custom endpoint name for this test
mockAgent.endpoint = EModelEndpoint.openAI;
mockAgent.provider = EModelEndpoint.openAI;
// Override the endpoint in the config to point to our custom config
mockReq.config.endpoints[EModelEndpoint.openAI] =
mockReq.config.endpoints[customEndpointName];
delete mockReq.config.endpoints[customEndpointName];
const text = 'Test custom endpoint conversation';
const abortController = new AbortController();
const result = await client.titleConvo({ text, abortController });
// Should return undefined without generating title because titleConvo is false
expect(result).toBeUndefined();
// generateTitle should NOT have been called
expect(mockRun.generateTitle).not.toHaveBeenCalled();
// recordCollectedUsage should NOT have been called
expect(client.recordCollectedUsage).not.toHaveBeenCalled();
});
it('should pass titleEndpoint configuration to generateTitle', async () => {
// Mock the API key just for this test
const originalApiKey = process.env.ANTHROPIC_API_KEY;
process.env.ANTHROPIC_API_KEY = 'test-api-key';
try {
// Add titleEndpoint to the config
mockReq.config = {
endpoints: {
[EModelEndpoint.openAI]: {
titleModel: 'gpt-3.5-turbo',
titleEndpoint: EModelEndpoint.anthropic,
titleMethod: 'structured',
titlePrompt: 'Custom title prompt',
titlePromptTemplate: 'Custom template',
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify generateTitle was called with the custom configuration
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titleMethod: 'structured',
provider: Providers.ANTHROPIC,
titlePrompt: 'Custom title prompt',
titlePromptTemplate: 'Custom template',
}),
);
} finally {
// Restore the original API key even if an assertion above throws
if (originalApiKey === undefined) {
delete process.env.ANTHROPIC_API_KEY;
} else {
process.env.ANTHROPIC_API_KEY = originalApiKey;
}
}
});
it('should use all config when endpoint config is missing', async () => {
// Set 'all' config without endpoint-specific config
mockReq.config = {
endpoints: {
all: {
titleModel: 'gpt-4o-mini',
titlePrompt: 'All config title prompt',
titleMethod: 'completion',
titlePromptTemplate: 'All config template: {{content}}',
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify generateTitle was called with 'all' config values
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titleMethod: 'completion',
titlePrompt: 'All config title prompt',
titlePromptTemplate: 'All config template: {{content}}',
}),
);
// Check that the model was set from 'all' config
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('gpt-4o-mini');
});
it('should prioritize all config over endpoint config for title settings', async () => {
// Set both endpoint and 'all' config
mockReq.config = {
endpoints: {
[EModelEndpoint.openAI]: {
titleModel: 'gpt-3.5-turbo',
titlePrompt: 'Endpoint title prompt',
titleMethod: 'structured',
// titlePromptTemplate is omitted to test fallback
},
all: {
titleModel: 'gpt-4o-mini',
titlePrompt: 'All config title prompt',
titleMethod: 'completion',
titlePromptTemplate: 'All config template',
},
},
};
const text = 'Test conversation text';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify 'all' config takes precedence over endpoint config
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titleMethod: 'completion',
titlePrompt: 'All config title prompt',
titlePromptTemplate: 'All config template',
}),
);
// Check that the model was set from 'all' config
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('gpt-4o-mini');
});
it('should use all config with titleEndpoint and verify provider switch', async () => {
// Mock the API key for the titleEndpoint provider
const originalApiKey = process.env.ANTHROPIC_API_KEY;
process.env.ANTHROPIC_API_KEY = 'test-anthropic-key';
try {
// Set comprehensive 'all' config with all new title options
mockReq.config = {
endpoints: {
all: {
titleConvo: true,
titleModel: 'claude-3-haiku-20240307',
titleMethod: 'completion', // Testing the new default method
titlePrompt: 'Generate a concise, descriptive title for this conversation',
titlePromptTemplate: 'Conversation summary: {{content}}',
titleEndpoint: EModelEndpoint.anthropic, // Should switch provider to Anthropic
},
},
};
const text = 'Test conversation about AI and machine learning';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify all config values were used
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
provider: Providers.ANTHROPIC, // Critical: Verify provider switched to Anthropic
titleMethod: 'completion',
titlePrompt: 'Generate a concise, descriptive title for this conversation',
titlePromptTemplate: 'Conversation summary: {{content}}',
inputText: text,
contentParts: client.contentParts,
}),
);
// Verify the model was set from 'all' config
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('claude-3-haiku-20240307');
// Verify other client options are set correctly
expect(generateTitleCall.clientOptions).toMatchObject({
model: 'claude-3-haiku-20240307',
// Note: Anthropic's getOptions may set its own maxTokens value
});
} finally {
// Restore the original API key even if an assertion above throws
if (originalApiKey === undefined) {
delete process.env.ANTHROPIC_API_KEY;
} else {
process.env.ANTHROPIC_API_KEY = originalApiKey;
}
}
});
it('should test all titleMethod options from all config', async () => {
// Test each titleMethod: 'completion', 'functions', 'structured'
const titleMethods = ['completion', 'functions', 'structured'];
for (const method of titleMethods) {
// Clear previous calls
mockRun.generateTitle.mockClear();
// Set 'all' config with specific titleMethod
mockReq.config = {
endpoints: {
all: {
titleModel: 'gpt-4o-mini',
titleMethod: method,
titlePrompt: `Testing ${method} method`,
titlePromptTemplate: `Template for ${method}: {{content}}`,
},
},
};
const text = `Test conversation for ${method} method`;
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify the correct titleMethod was used
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
titleMethod: method,
titlePrompt: `Testing ${method} method`,
titlePromptTemplate: `Template for ${method}: {{content}}`,
}),
);
}
});
describe('Azure-specific title generation', () => {
let originalEnv;
beforeEach(() => {
// Reset mocks
jest.clearAllMocks();
// Save original environment variables
originalEnv = { ...process.env };
// Mock Azure API keys
process.env.AZURE_OPENAI_API_KEY = 'test-azure-key';
process.env.AZURE_API_KEY = 'test-azure-key';
process.env.EASTUS_API_KEY = 'test-eastus-key';
process.env.EASTUS2_API_KEY = 'test-eastus2-key';
});
afterEach(() => {
// Restore environment variables
process.env = originalEnv;
});
it('should use OPENAI provider for Azure serverless endpoints', async () => {
// Set up Azure endpoint with serverless config
mockAgent.endpoint = EModelEndpoint.azureOpenAI;
mockAgent.provider = EModelEndpoint.azureOpenAI;
mockReq.config = {
endpoints: {
[EModelEndpoint.azureOpenAI]: {
titleConvo: true,
titleModel: 'grok-3',
titleMethod: 'completion',
titlePrompt: 'Azure serverless title prompt',
streamRate: 35,
modelGroupMap: {
'grok-3': {
group: 'Azure AI Foundry',
deploymentName: 'grok-3',
},
},
groupMap: {
'Azure AI Foundry': {
apiKey: '${AZURE_API_KEY}',
baseURL: 'https://test.services.ai.azure.com/models',
version: '2024-05-01-preview',
serverless: true,
models: {
'grok-3': {
deploymentName: 'grok-3',
},
},
},
},
},
},
};
mockReq.body.endpoint = EModelEndpoint.azureOpenAI;
mockReq.body.model = 'grok-3';
const text = 'Test Azure serverless conversation';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify provider was switched to OPENAI for serverless
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
provider: Providers.OPENAI, // Should be OPENAI for serverless
titleMethod: 'completion',
titlePrompt: 'Azure serverless title prompt',
}),
);
});
it('should use AZURE provider for Azure endpoints with instanceName', async () => {
// Set up Azure endpoint
mockAgent.endpoint = EModelEndpoint.azureOpenAI;
mockAgent.provider = EModelEndpoint.azureOpenAI;
mockReq.config = {
endpoints: {
[EModelEndpoint.azureOpenAI]: {
titleConvo: true,
titleModel: 'gpt-4o',
titleMethod: 'structured',
titlePrompt: 'Azure instance title prompt',
streamRate: 35,
modelGroupMap: {
'gpt-4o': {
group: 'eastus',
deploymentName: 'gpt-4o',
},
},
groupMap: {
eastus: {
apiKey: '${EASTUS_API_KEY}',
instanceName: 'region-instance',
version: '2024-02-15-preview',
models: {
'gpt-4o': {
deploymentName: 'gpt-4o',
},
},
},
},
},
},
};
mockReq.body.endpoint = EModelEndpoint.azureOpenAI;
mockReq.body.model = 'gpt-4o';
const text = 'Test Azure instance conversation';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify provider remains AZURE with instanceName
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
provider: Providers.AZURE,
titleMethod: 'structured',
titlePrompt: 'Azure instance title prompt',
}),
);
});
it('should handle Azure titleModel with CURRENT_MODEL constant', async () => {
// Set up Azure endpoint
mockAgent.endpoint = EModelEndpoint.azureOpenAI;
mockAgent.provider = EModelEndpoint.azureOpenAI;
mockAgent.model_parameters.model = 'gpt-4o-latest';
mockReq.config = {
endpoints: {
[EModelEndpoint.azureOpenAI]: {
titleConvo: true,
titleModel: Constants.CURRENT_MODEL,
titleMethod: 'functions',
streamRate: 35,
modelGroupMap: {
'gpt-4o-latest': {
group: 'region-eastus',
deploymentName: 'gpt-4o-mini',
version: '2024-02-15-preview',
},
},
groupMap: {
'region-eastus': {
apiKey: '${EASTUS2_API_KEY}',
instanceName: 'test-instance',
version: '2024-12-01-preview',
models: {
'gpt-4o-latest': {
deploymentName: 'gpt-4o-mini',
version: '2024-02-15-preview',
},
},
},
},
},
},
};
mockReq.body.endpoint = EModelEndpoint.azureOpenAI;
mockReq.body.model = 'gpt-4o-latest';
const text = 'Test Azure current model';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify it uses the correct model when titleModel is CURRENT_MODEL
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
// When CURRENT_MODEL is used with Azure, the model gets mapped to the deployment name
// In this case, 'gpt-4o-latest' is mapped to 'gpt-4o-mini' deployment
expect(generateTitleCall.clientOptions.model).toBe('gpt-4o-mini');
// Also verify that CURRENT_MODEL constant was not passed as the model
expect(generateTitleCall.clientOptions.model).not.toBe(Constants.CURRENT_MODEL);
});
it('should handle Azure with multiple model groups', async () => {
// Set up Azure endpoint
mockAgent.endpoint = EModelEndpoint.azureOpenAI;
mockAgent.provider = EModelEndpoint.azureOpenAI;
mockReq.config = {
endpoints: {
[EModelEndpoint.azureOpenAI]: {
titleConvo: true,
titleModel: 'o1-mini',
titleMethod: 'completion',
streamRate: 35,
modelGroupMap: {
'gpt-4o': {
group: 'eastus',
deploymentName: 'gpt-4o',
},
'o1-mini': {
group: 'region-eastus',
deploymentName: 'o1-mini',
},
'codex-mini': {
group: 'codex-mini',
deploymentName: 'codex-mini',
},
},
groupMap: {
eastus: {
apiKey: '${EASTUS_API_KEY}',
instanceName: 'region-eastus',
version: '2024-02-15-preview',
models: {
'gpt-4o': {
deploymentName: 'gpt-4o',
},
},
},
'region-eastus': {
apiKey: '${EASTUS2_API_KEY}',
instanceName: 'region-eastus2',
version: '2024-12-01-preview',
models: {
'o1-mini': {
deploymentName: 'o1-mini',
},
},
},
'codex-mini': {
apiKey: '${AZURE_API_KEY}',
baseURL: 'https://example.cognitiveservices.azure.com/openai/',
version: '2025-04-01-preview',
serverless: true,
models: {
'codex-mini': {
deploymentName: 'codex-mini',
},
},
},
},
},
},
};
mockReq.body.endpoint = EModelEndpoint.azureOpenAI;
mockReq.body.model = 'o1-mini';
const text = 'Test Azure multi-group conversation';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify correct model and provider are used
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
provider: Providers.AZURE,
titleMethod: 'completion',
}),
);
const generateTitleCall = mockRun.generateTitle.mock.calls[0][0];
expect(generateTitleCall.clientOptions.model).toBe('o1-mini');
expect(generateTitleCall.clientOptions.maxTokens).toBeUndefined(); // o1 models shouldn't have maxTokens
});
it('should use all config as fallback for Azure endpoints', async () => {
// Set up Azure endpoint with minimal config
mockAgent.endpoint = EModelEndpoint.azureOpenAI;
mockAgent.provider = EModelEndpoint.azureOpenAI;
mockReq.body.endpoint = EModelEndpoint.azureOpenAI;
mockReq.body.model = 'gpt-4';
// Set 'all' config as fallback with a serverless Azure config
mockReq.config = {
endpoints: {
all: {
titleConvo: true,
titleModel: 'gpt-4',
titleMethod: 'structured',
titlePrompt: 'Fallback title prompt from all config',
titlePromptTemplate: 'Template: {{content}}',
modelGroupMap: {
'gpt-4': {
group: 'default-group',
deploymentName: 'gpt-4',
},
},
groupMap: {
'default-group': {
apiKey: '${AZURE_API_KEY}',
baseURL: 'https://default.openai.azure.com/',
version: '2024-02-15-preview',
serverless: true,
models: {
'gpt-4': {
deploymentName: 'gpt-4',
},
},
},
},
},
},
};
const text = 'Test Azure with all config fallback';
const abortController = new AbortController();
await client.titleConvo({ text, abortController });
// Verify all config is used
expect(mockRun.generateTitle).toHaveBeenCalledWith(
expect.objectContaining({
provider: Providers.OPENAI, // Should be OPENAI: the fallback group is serverless (baseURL, no instanceName)
titleMethod: 'structured',
titlePrompt: 'Fallback title prompt from all config',
titlePromptTemplate: 'Template: {{content}}',
}),
);
});
});
});
describe('getOptions method - GPT-5+ model handling', () => {
let mockReq;
let mockRes;
let mockAgent;
let mockOptions;
beforeEach(() => {
jest.clearAllMocks();
mockAgent = {
id: 'agent-123',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
model_parameters: {
model: 'gpt-5',
},
};
mockReq = {
app: {
locals: {},
},
user: {
id: 'user-123',
},
};
mockRes = {};
mockOptions = {
req: mockReq,
res: mockRes,
agent: mockAgent,
};
client = new AgentClient(mockOptions);
});
it('should move maxTokens to modelKwargs.max_completion_tokens for GPT-5 models', () => {
const clientOptions = {
model: 'gpt-5',
maxTokens: 2048,
temperature: 0.7,
};
// Simulate the getOptions logic that handles GPT-5+ models
if (/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) && clientOptions.maxTokens != null) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
clientOptions.modelKwargs.max_completion_tokens = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
expect(clientOptions.maxTokens).toBeUndefined();
expect(clientOptions.modelKwargs).toBeDefined();
expect(clientOptions.modelKwargs.max_completion_tokens).toBe(2048);
expect(clientOptions.temperature).toBe(0.7); // Other options should remain
});
it('should move maxTokens to modelKwargs.max_output_tokens for GPT-5 models with useResponsesApi', () => {
const clientOptions = {
model: 'gpt-5',
maxTokens: 2048,
temperature: 0.7,
useResponsesApi: true,
};
// Simulate the getOptions logic, including the Responses API parameter name
if (/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) && clientOptions.maxTokens != null) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
const paramName =
clientOptions.useResponsesApi === true ? 'max_output_tokens' : 'max_completion_tokens';
clientOptions.modelKwargs[paramName] = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
expect(clientOptions.maxTokens).toBeUndefined();
expect(clientOptions.modelKwargs).toBeDefined();
expect(clientOptions.modelKwargs.max_output_tokens).toBe(2048);
expect(clientOptions.temperature).toBe(0.7); // Other options should remain
});
it('should handle GPT-5+ models with existing modelKwargs', () => {
const clientOptions = {
model: 'gpt-6',
maxTokens: 1500,
temperature: 0.8,
modelKwargs: {
customParam: 'value',
},
};
// Simulate the getOptions logic
if (/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) && clientOptions.maxTokens != null) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
clientOptions.modelKwargs.max_completion_tokens = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
expect(clientOptions.maxTokens).toBeUndefined();
expect(clientOptions.modelKwargs).toEqual({
customParam: 'value',
max_completion_tokens: 1500,
});
});
it('should not modify maxTokens for non-GPT-5+ models', () => {
const clientOptions = {
model: 'gpt-4',
maxTokens: 2048,
temperature: 0.7,
};
// Simulate the getOptions logic
if (/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) && clientOptions.maxTokens != null) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
clientOptions.modelKwargs.max_completion_tokens = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
// Should not be modified since it's GPT-4
expect(clientOptions.maxTokens).toBe(2048);
expect(clientOptions.modelKwargs).toBeUndefined();
});
it('should handle various GPT-5+ model formats', () => {
const testCases = [
{ model: 'gpt-5.1', shouldTransform: true },
{ model: 'gpt-5.1-chat-latest', shouldTransform: true },
{ model: 'gpt-5.1-codex', shouldTransform: true },
{ model: 'gpt-5', shouldTransform: true },
{ model: 'gpt-5-turbo', shouldTransform: true },
{ model: 'gpt-6', shouldTransform: true },
{ model: 'gpt-7-preview', shouldTransform: true },
{ model: 'gpt-8', shouldTransform: true },
{ model: 'gpt-9-mini', shouldTransform: true },
{ model: 'gpt-4', shouldTransform: false },
{ model: 'gpt-4o', shouldTransform: false },
{ model: 'gpt-3.5-turbo', shouldTransform: false },
{ model: 'claude-3', shouldTransform: false },
];
testCases.forEach(({ model, shouldTransform }) => {
const clientOptions = {
model,
maxTokens: 1000,
};
// Simulate the getOptions logic
if (
/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) &&
clientOptions.maxTokens != null
) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
clientOptions.modelKwargs.max_completion_tokens = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
if (shouldTransform) {
expect(clientOptions.maxTokens).toBeUndefined();
expect(clientOptions.modelKwargs?.max_completion_tokens).toBe(1000);
} else {
expect(clientOptions.maxTokens).toBe(1000);
expect(clientOptions.modelKwargs).toBeUndefined();
}
});
});
it('should not swap max token param for older models when using useResponsesApi', () => {
const testCases = [
{ model: 'gpt-5.1', shouldTransform: true },
{ model: 'gpt-5.1-chat-latest', shouldTransform: true },
{ model: 'gpt-5.1-codex', shouldTransform: true },
{ model: 'gpt-5', shouldTransform: true },
{ model: 'gpt-5-turbo', shouldTransform: true },
{ model: 'gpt-6', shouldTransform: true },
{ model: 'gpt-7-preview', shouldTransform: true },
{ model: 'gpt-8', shouldTransform: true },
{ model: 'gpt-9-mini', shouldTransform: true },
{ model: 'gpt-4', shouldTransform: false },
{ model: 'gpt-4o', shouldTransform: false },
{ model: 'gpt-3.5-turbo', shouldTransform: false },
{ model: 'claude-3', shouldTransform: false },
];
testCases.forEach(({ model, shouldTransform }) => {
const clientOptions = {
model,
maxTokens: 1000,
useResponsesApi: true,
};
// Simulate the getOptions logic, including the Responses API parameter name
if (
/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) &&
clientOptions.maxTokens != null
) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
const paramName =
clientOptions.useResponsesApi === true ? 'max_output_tokens' : 'max_completion_tokens';
clientOptions.modelKwargs[paramName] = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
if (shouldTransform) {
expect(clientOptions.maxTokens).toBeUndefined();
expect(clientOptions.modelKwargs?.max_output_tokens).toBe(1000);
} else {
expect(clientOptions.maxTokens).toBe(1000);
expect(clientOptions.modelKwargs).toBeUndefined();
}
});
});
it('should not transform if maxTokens is null or undefined, but should transform 0', () => {
const testCases = [
{ model: 'gpt-5', maxTokens: null, shouldTransform: false },
{ model: 'gpt-5', maxTokens: undefined, shouldTransform: false },
{ model: 'gpt-6', maxTokens: 0, shouldTransform: true }, // 0 is not nullish, so it is still moved
];
testCases.forEach(({ model, maxTokens, shouldTransform }) => {
const clientOptions = {
model,
maxTokens,
temperature: 0.7,
};
// Simulate the getOptions logic
if (
/\bgpt-[5-9](?:\.\d+)?\b/i.test(clientOptions.model) &&
clientOptions.maxTokens != null
) {
clientOptions.modelKwargs = clientOptions.modelKwargs ?? {};
clientOptions.modelKwargs.max_completion_tokens = clientOptions.maxTokens;
delete clientOptions.maxTokens;
}
if (shouldTransform) {
// 0 case: moved to modelKwargs like any other non-nullish value
expect(clientOptions.maxTokens).toBeUndefined();
expect(clientOptions.modelKwargs?.max_completion_tokens).toBe(0);
} else {
// null or undefined cases: options are left untouched
expect(clientOptions.maxTokens).toBe(maxTokens);
expect(clientOptions.modelKwargs).toBeUndefined();
}
});
});
});
describe('buildMessages with MCP server instructions', () => {
let client;
let mockReq;
let mockRes;
let mockAgent;
let mockOptions;
beforeEach(() => {
jest.clearAllMocks();
// Reset the mock to default behavior
mockFormatInstructions.mockResolvedValue(
'# MCP Server Instructions\n\nTest MCP instructions here',
);
const { DynamicStructuredTool } = require('@librechat/agents/langchain/tools');
// Create mock MCP tools with the delimiter pattern
const mockMCPTool1 = new DynamicStructuredTool({
name: `tool1${Constants.mcp_delimiter}server1`,
description: 'Test MCP tool 1',
schema: {},
func: async () => 'result',
});
const mockMCPTool2 = new DynamicStructuredTool({
name: `tool2${Constants.mcp_delimiter}server2`,
description: 'Test MCP tool 2',
schema: {},
func: async () => 'result',
});
mockAgent = {
id: 'agent-123',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
instructions: 'Base agent instructions',
model_parameters: {
model: 'gpt-4',
},
tools: [mockMCPTool1, mockMCPTool2],
};
mockReq = {
user: {
id: 'user-123',
},
body: {
endpoint: EModelEndpoint.openAI,
},
config: {},
};
mockRes = {};
mockOptions = {
req: mockReq,
res: mockRes,
agent: mockAgent,
endpoint: EModelEndpoint.agents,
};
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
client.shouldSummarize = false;
client.maxContextTokens = 4096;
});
it('should await MCP instructions and not include [object Promise] in agent instructions', async () => {
// Set specific return value for this test
mockFormatInstructions.mockResolvedValue(
'# MCP Server Instructions\n\nUse these tools carefully',
);
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
});
// Verify formatInstructionsForContext was called with correct server names
expect(mockFormatInstructions).toHaveBeenCalledWith(['server1', 'server2'], {});
// Verify the instructions do NOT contain [object Promise]
expect(client.options.agent.instructions).not.toContain('[object Promise]');
// Verify the instructions DO contain the MCP instructions
expect(client.options.agent.instructions).toContain('# MCP Server Instructions');
expect(client.options.agent.instructions).toContain('Use these tools carefully');
// Verify the base instructions are also included (from agent config, not buildOptions)
expect(client.options.agent.instructions).toContain('Base agent instructions');
});
it('should handle MCP instructions with ephemeral agent', async () => {
// Set specific return value for this test
mockFormatInstructions.mockResolvedValue(
'# Ephemeral MCP Instructions\n\nSpecial ephemeral instructions',
);
// Set up ephemeral agent with MCP servers
mockReq.body.ephemeralAgent = {
mcp: ['ephemeral-server1', 'ephemeral-server2'],
};
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Test ephemeral',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: 'Ephemeral instructions',
additional_instructions: null,
});
// Verify formatInstructionsForContext was called with ephemeral server names
expect(mockFormatInstructions).toHaveBeenCalledWith(
['ephemeral-server1', 'ephemeral-server2'],
{},
);
// Verify no [object Promise] in instructions
expect(client.options.agent.instructions).not.toContain('[object Promise]');
// Verify ephemeral MCP instructions are included
expect(client.options.agent.instructions).toContain('# Ephemeral MCP Instructions');
expect(client.options.agent.instructions).toContain('Special ephemeral instructions');
});
it('should handle empty MCP instructions gracefully', async () => {
// Set empty return value for this test
mockFormatInstructions.mockResolvedValue('');
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: 'Base instructions only',
additional_instructions: null,
});
// Verify the instructions still work without MCP content (from agent config, not buildOptions)
expect(client.options.agent.instructions).toBe('Base agent instructions');
expect(client.options.agent.instructions).not.toContain('[object Promise]');
});
it('should handle MCP instructions error gracefully', async () => {
// Make the instructions formatter reject for this test
mockFormatInstructions.mockRejectedValue(new Error('MCP error'));
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
// Should not throw
await client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
});
// Should still have base instructions without MCP content (from agent config, not buildOptions)
expect(client.options.agent.instructions).toContain('Base agent instructions');
expect(client.options.agent.instructions).not.toContain('[object Promise]');
});
});
describe('buildMessages with request and agent-scoped context attachments', () => {
let client;
let mockReq;
let mockRes;
let mockAgent;
const makeTextFile = (file_id, filename, text) => ({
user: 'user-123',
file_id,
filename,
filepath: `/uploads/${filename}`,
object: 'file',
type: 'text/plain',
bytes: text.length,
embedded: false,
usage: 0,
source: 'text',
text,
});
const makeUploadedFile = (file_id, filename, type) => ({
user: 'user-123',
file_id,
filename,
filepath: `/uploads/${filename}`,
object: 'file',
type,
bytes: 128,
embedded: false,
usage: 0,
source: 'local',
});
beforeEach(() => {
jest.clearAllMocks();
mockFormatInstructions.mockResolvedValue('');
require('@librechat/api').countFormattedMessageTokens.mockImplementation(() => 42);
mockAgent = {
id: 'primary-agent',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
instructions: 'Primary instructions',
model_parameters: {
model: 'gpt-4',
},
tools: [],
};
mockReq = {
user: {
id: 'user-123',
personalization: {
memories: true,
},
},
body: {
endpoint: EModelEndpoint.openAI,
fileTokenLimit: 1000,
},
config: {
memory: {
disabled: true,
},
},
};
mockRes = {};
client = new AgentClient({
req: mockReq,
res: mockRes,
agent: mockAgent,
endpoint: EModelEndpoint.agents,
});
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
client.shouldSummarize = false;
client.maxContextTokens = 4096;
client.useMemory = jest.fn().mockResolvedValue(undefined);
});
it.each([
['CSV', 'csv-file', 'sample.csv', 'text/csv'],
[
'XLSX',
'xlsx-file',
'sample.xlsx',
'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
],
])(
'routes default-supported provider uploads like %s as request documents without custom file config',
async (_label, file_id, filename, type) => {
const currentFile = makeUploadedFile(file_id, filename, type);
const message = {
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: `Read this ${filename}.`,
isCreatedByUser: true,
};
client.addDocuments = jest.fn(async (targetMessage, attachments) => {
targetMessage.documents = attachments.map((file) => ({
type: 'input_file',
filename: file.filename,
file_data: `data:${file.type};base64,Y29sMQox`,
}));
return attachments;
});
const files = await client.processAttachments(message, [currentFile]);
expect(client.addDocuments).toHaveBeenCalledWith(message, [currentFile]);
expect(message.documents).toEqual([
expect.objectContaining({
type: 'input_file',
filename,
}),
]);
expect(files).toEqual([currentFile]);
},
);
it('places request context inline and applies each agent context doc only once', async () => {
const requestFile = makeTextFile('request-file', 'request.txt', 'Shared request context');
const primaryContext = makeTextFile(
'primary-context',
'primary.txt',
'Primary private context',
);
const handoffContext = makeTextFile(
'handoff-context',
'handoff.txt',
'Handoff private context',
);
const handoffAgent = {
id: 'handoff-agent',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
instructions: 'Handoff instructions',
model_parameters: {
model: 'gpt-4',
},
tools: [],
};
client.options.attachments = [requestFile];
client.options.agentContextAttachmentsByAgentId = new Map([
['primary-agent', [primaryContext]],
['handoff-agent', [handoffContext]],
]);
client.agentConfigs = new Map([['handoff-agent', handoffAgent]]);
const result = await client.buildMessages(
[
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Use the available context.',
isCreatedByUser: true,
},
],
'msg-1',
{},
);
expect(result.prompt[0].content).toContain('Shared request context');
expect(mockAgent.additional_instructions).toContain('Primary private context');
expect(mockAgent.additional_instructions).not.toContain('Shared request context');
expect(mockAgent.additional_instructions).not.toContain('Handoff private context');
expect(handoffAgent.additional_instructions).toContain('Handoff private context');
expect(handoffAgent.additional_instructions).not.toContain('Shared request context');
expect(handoffAgent.additional_instructions).not.toContain('Primary private context');
});
it('places current request file context on the latest user message', async () => {
const currentFile = makeTextFile('current-file', 'current.txt', 'Current turn file body');
const previousFileContext =
'Attached document(s):\n```md\n# "previous.txt"\nPrevious turn file body\n```';
client.options.attachments = [currentFile];
const result = await client.buildMessages(
[
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'What is written here?',
isCreatedByUser: true,
fileContext: previousFileContext,
},
{
messageId: 'msg-2',
parentMessageId: 'msg-1',
sender: 'Assistant',
text: 'It describes the previous file.',
isCreatedByUser: false,
},
{
messageId: 'msg-3',
parentMessageId: 'msg-2',
sender: 'User',
text: 'What is written here?',
isCreatedByUser: true,
},
],
'msg-3',
{},
);
expect(result.prompt[0].content).toContain('Previous turn file body');
expect(result.prompt[2].content).toContain('Current turn file body');
expect(result.prompt[2].content).toContain('What is written here?');
expect(result.prompt[2].content).not.toContain('Previous turn file body');
expect(client.memoryPayload[2].content).toContain('What is written here?');
expect(client.memoryPayload[2].content).not.toContain('Current turn file body');
expect(mockAgent.additional_instructions ?? '').not.toContain('Current turn file body');
expect(result.prompt[2].content.indexOf('Current turn file body')).toBeLessThan(
result.prompt[2].content.indexOf('What is written here?'),
);
});
it('persists canonical token counts while counting request file context for the prompt', async () => {
const { countFormattedMessageTokens } = require('@librechat/api');
const currentFile = makeTextFile('current-file', 'current.txt', 'Current turn file body');
countFormattedMessageTokens.mockImplementation(({ content }) => {
const text = Array.isArray(content)
? content.map((part) => part.text ?? part[ContentTypes.TEXT] ?? '').join('\n')
: String(content ?? '');
return text.includes('Current turn file body') ? 200 : 20;
});
client.options.attachments = [currentFile];
const result = await client.buildMessages(
[
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'What is written here?',
isCreatedByUser: true,
},
],
'msg-1',
{},
);
expect(result.prompt[0].content).toContain('Current turn file body');
expect(result.tokenCountMap['msg-1']).toBe(20);
expect(result.promptTokens).toBe(200);
expect(client.indexTokenCountMap[0]).toBe(200);
expect(client.memoryPayload[0].content).toBe('What is written here?');
});
it('does not duplicate a file that is both request context and scoped context', async () => {
const sharedFile = makeTextFile('shared-file', 'shared.txt', 'Shared duplicate context');
client.options.attachments = [sharedFile];
client.options.agentContextAttachmentsByAgentId = new Map([['primary-agent', [sharedFile]]]);
client.agentConfigs = new Map();
const result = await client.buildMessages(
[
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Use the available context.',
isCreatedByUser: true,
},
],
'msg-1',
{},
);
const inlineOccurrences = (result.prompt[0].content.match(/Shared duplicate context/g) ?? [])
.length;
expect(inlineOccurrences).toBe(1);
expect(mockAgent.additional_instructions ?? '').not.toContain('Shared duplicate context');
});
it('keeps direct chats with context-doc agents working without request attachments', async () => {
const primaryContext = makeTextFile(
'primary-context',
'primary.txt',
'Direct primary context',
);
client.options.agentContextAttachmentsByAgentId = new Map([
['primary-agent', [primaryContext]],
]);
client.agentConfigs = new Map();
await client.buildMessages(
[
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Answer from your context.',
isCreatedByUser: true,
},
],
'msg-1',
{},
);
expect(mockAgent.additional_instructions).toContain('Direct primary context');
});
});
describe('runMemory method', () => {
let client;
let mockReq;
let mockRes;
let mockAgent;
let mockOptions;
let mockProcessMemory;
beforeEach(() => {
jest.clearAllMocks();
mockAgent = {
id: 'agent-123',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
model_parameters: {
model: 'gpt-4',
},
};
mockReq = {
user: {
id: 'user-123',
personalization: {
memories: true,
},
},
};
// Provide the app config on req.config for memory tests (sets the message window size)
mockReq.config = {
memory: {
messageWindowSize: 3,
},
};
mockRes = {};
mockOptions = {
req: mockReq,
res: mockRes,
agent: mockAgent,
};
mockProcessMemory = jest.fn().mockResolvedValue([]);
client = new AgentClient(mockOptions);
client.processMemory = mockProcessMemory;
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
});
it('should filter out image URLs from message content', async () => {
const { HumanMessage, AIMessage } = require('@librechat/agents/langchain/messages');
const messages = [
new HumanMessage({
content: [
{
type: 'text',
text: 'What is in this image?',
},
{
type: 'image_url',
image_url: {
url: 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg==',
detail: 'auto',
},
},
],
}),
new AIMessage('I can see a small red pixel in the image.'),
new HumanMessage({
content: [
{
type: 'text',
text: 'What about this one?',
},
{
type: 'image_url',
image_url: {
url: 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEAYABgAAD/',
detail: 'high',
},
},
],
}),
];
await client.runMemory(messages);
expect(mockProcessMemory).toHaveBeenCalledTimes(1);
const processedMessage = mockProcessMemory.mock.calls[0][0][0];
// Verify the buffer message was created
expect(processedMessage.constructor.name).toBe('HumanMessage');
expect(processedMessage.content).toContain('# Current Chat:');
// Verify that image URLs are not in the buffer string
expect(processedMessage.content).not.toContain('image_url');
expect(processedMessage.content).not.toContain('data:image');
expect(processedMessage.content).not.toContain('base64');
// Verify text content is preserved
expect(processedMessage.content).toContain('What is in this image?');
expect(processedMessage.content).toContain('I can see a small red pixel in the image.');
expect(processedMessage.content).toContain('What about this one?');
});
it('should handle messages with only text content', async () => {
const { HumanMessage, AIMessage } = require('@librechat/agents/langchain/messages');
const messages = [
new HumanMessage('Hello, how are you?'),
new AIMessage('I am doing well, thank you!'),
new HumanMessage('That is great to hear.'),
];
await client.runMemory(messages);
expect(mockProcessMemory).toHaveBeenCalledTimes(1);
const processedMessage = mockProcessMemory.mock.calls[0][0][0];
expect(processedMessage.content).toContain('Hello, how are you?');
expect(processedMessage.content).toContain('I am doing well, thank you!');
expect(processedMessage.content).toContain('That is great to hear.');
});
it('should handle mixed content types correctly', async () => {
const { HumanMessage } = require('@librechat/agents/langchain/messages');
const { ContentTypes } = require('librechat-data-provider');
const messages = [
new HumanMessage({
content: [
{
type: 'text',
text: 'Here is some text',
},
{
type: ContentTypes.IMAGE_URL,
image_url: {
url: 'https://example.com/image.png',
},
},
{
type: 'text',
text: ' and more text',
},
],
}),
];
await client.runMemory(messages);
expect(mockProcessMemory).toHaveBeenCalledTimes(1);
const processedMessage = mockProcessMemory.mock.calls[0][0][0];
// Should contain text parts but not image URLs
expect(processedMessage.content).toContain('Here is some text');
expect(processedMessage.content).toContain('and more text');
expect(processedMessage.content).not.toContain('example.com/image.png');
expect(processedMessage.content).not.toContain('IMAGE_URL');
});
it('should preserve original messages without mutation', async () => {
const { HumanMessage } = require('@librechat/agents/langchain/messages');
const originalContent = [
{
type: 'text',
text: 'Original text',
},
{
type: 'image_url',
image_url: {
url: 'data:image/png;base64,ABC123',
},
},
];
const messages = [
new HumanMessage({
content: [...originalContent],
}),
];
await client.runMemory(messages);
// Verify original message wasn't mutated
expect(messages[0].content).toHaveLength(2);
expect(messages[0].content[1].type).toBe('image_url');
expect(messages[0].content[1].image_url.url).toBe('data:image/png;base64,ABC123');
});
it('should handle message window size correctly', async () => {
const { HumanMessage, AIMessage } = require('@librechat/agents/langchain/messages');
const messages = [
new HumanMessage('Message 1'),
new AIMessage('Response 1'),
new HumanMessage('Message 2'),
new AIMessage('Response 2'),
new HumanMessage('Message 3'),
new AIMessage('Response 3'),
];
// Window size is set to 3 in mockReq
await client.runMemory(messages);
expect(mockProcessMemory).toHaveBeenCalledTimes(1);
const processedMessage = mockProcessMemory.mock.calls[0][0][0];
// Should only include last 3 messages due to window size
expect(processedMessage.content).toContain('Message 3');
expect(processedMessage.content).toContain('Response 3');
expect(processedMessage.content).not.toContain('Message 1');
expect(processedMessage.content).not.toContain('Response 1');
});
it('should cap memory input tokens and preserve recent content', async () => {
const { HumanMessage, AIMessage } = require('@librechat/agents/langchain/messages');
mockReq.config.memory.maxInputTokens = 12;
const messages = [
new HumanMessage(`OLDER_CONTENT ${'a'.repeat(600)}`),
new AIMessage('Intermediate response'),
new HumanMessage('Please remember LATEST_MEMORY_MARKER'),
];
await client.runMemory(messages);
expect(mockProcessMemory).toHaveBeenCalledTimes(1);
const processedMessage = mockProcessMemory.mock.calls[0][0][0];
expect(processedMessage.content).toContain('LATEST_MEMORY_MARKER');
expect(processedMessage.content).not.toContain('OLDER_CONTENT');
expect(Math.ceil(processedMessage.content.length / 4)).toBeLessThanOrEqual(12);
});
it('should return early if processMemory is not set', async () => {
const { HumanMessage } = require('@librechat/agents/langchain/messages');
client.processMemory = null;
const result = await client.runMemory([new HumanMessage('Test')]);
expect(result).toBeUndefined();
expect(mockProcessMemory).not.toHaveBeenCalled();
});
});
describe('getMessagesForConversation - mapMethod and mapCondition', () => {
const createMessage = (id, parentId, text, extras = {}) => ({
messageId: id,
parentMessageId: parentId,
text,
isCreatedByUser: false,
...extras,
});
it('should apply mapMethod to all messages when mapCondition is not provided', () => {
const messages = [
createMessage('msg-1', null, 'First message'),
createMessage('msg-2', 'msg-1', 'Second message'),
createMessage('msg-3', 'msg-2', 'Third message'),
];
const mapMethod = jest.fn((msg) => ({ ...msg, mapped: true }));
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-3',
mapMethod,
});
expect(result).toHaveLength(3);
expect(mapMethod).toHaveBeenCalledTimes(3);
result.forEach((msg) => {
expect(msg.mapped).toBe(true);
});
});
it('should apply mapMethod only to messages where mapCondition returns true', () => {
const messages = [
createMessage('msg-1', null, 'First message', { addedConvo: false }),
createMessage('msg-2', 'msg-1', 'Second message', { addedConvo: true }),
createMessage('msg-3', 'msg-2', 'Third message', { addedConvo: true }),
createMessage('msg-4', 'msg-3', 'Fourth message', { addedConvo: false }),
];
const mapMethod = jest.fn((msg) => ({ ...msg, mapped: true }));
const mapCondition = (msg) => msg.addedConvo === true;
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-4',
mapMethod,
mapCondition,
});
expect(result).toHaveLength(4);
expect(mapMethod).toHaveBeenCalledTimes(2);
expect(result[0].mapped).toBeUndefined();
expect(result[1].mapped).toBe(true);
expect(result[2].mapped).toBe(true);
expect(result[3].mapped).toBeUndefined();
});
it('should not apply mapMethod when mapCondition returns false for all messages', () => {
const messages = [
createMessage('msg-1', null, 'First message', { addedConvo: false }),
createMessage('msg-2', 'msg-1', 'Second message', { addedConvo: false }),
];
const mapMethod = jest.fn((msg) => ({ ...msg, mapped: true }));
const mapCondition = (msg) => msg.addedConvo === true;
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-2',
mapMethod,
mapCondition,
});
expect(result).toHaveLength(2);
expect(mapMethod).not.toHaveBeenCalled();
result.forEach((msg) => {
expect(msg.mapped).toBeUndefined();
});
});
it('should not call mapMethod when mapMethod is null', () => {
const messages = [
createMessage('msg-1', null, 'First message'),
createMessage('msg-2', 'msg-1', 'Second message'),
];
const mapCondition = jest.fn(() => true);
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-2',
mapMethod: null,
mapCondition,
});
expect(result).toHaveLength(2);
expect(mapCondition).not.toHaveBeenCalled();
});
it('should handle mapCondition with complex logic', () => {
const messages = [
createMessage('msg-1', null, 'User message', { isCreatedByUser: true, addedConvo: true }),
createMessage('msg-2', 'msg-1', 'Assistant response', { addedConvo: true }),
createMessage('msg-3', 'msg-2', 'Another user message', { isCreatedByUser: true }),
createMessage('msg-4', 'msg-3', 'Another response', { addedConvo: true }),
];
const mapMethod = jest.fn((msg) => ({ ...msg, processed: true }));
const mapCondition = (msg) => msg.addedConvo === true && !msg.isCreatedByUser;
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-4',
mapMethod,
mapCondition,
});
expect(result).toHaveLength(4);
expect(mapMethod).toHaveBeenCalledTimes(2);
expect(result[0].processed).toBeUndefined();
expect(result[1].processed).toBe(true);
expect(result[2].processed).toBeUndefined();
expect(result[3].processed).toBe(true);
});
it('should preserve message order after applying mapMethod with mapCondition', () => {
const messages = [
createMessage('msg-1', null, 'First', { addedConvo: true }),
createMessage('msg-2', 'msg-1', 'Second', { addedConvo: false }),
createMessage('msg-3', 'msg-2', 'Third', { addedConvo: true }),
];
const mapMethod = (msg) => ({ ...msg, text: `[MAPPED] ${msg.text}` });
const mapCondition = (msg) => msg.addedConvo === true;
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-3',
mapMethod,
mapCondition,
});
expect(result[0].text).toBe('[MAPPED] First');
expect(result[1].text).toBe('Second');
expect(result[2].text).toBe('[MAPPED] Third');
});
it('should work with summary option alongside mapMethod and mapCondition', () => {
const messages = [
createMessage('msg-1', null, 'First', { addedConvo: false }),
createMessage('msg-2', 'msg-1', 'Second', {
summary: 'Summary of conversation',
addedConvo: true,
}),
createMessage('msg-3', 'msg-2', 'Third', { addedConvo: true }),
createMessage('msg-4', 'msg-3', 'Fourth', { addedConvo: false }),
];
const mapMethod = jest.fn((msg) => ({ ...msg, mapped: true }));
const mapCondition = (msg) => msg.addedConvo === true;
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-4',
mapMethod,
mapCondition,
summary: true,
});
/** Traversal walks back from msg-4 and stops at msg-2 (has summary); the result is ordered oldest-first: msg-2 -> msg-3 -> msg-4 */
expect(result).toHaveLength(3);
expect(result[0].content).toEqual([{ type: 'text', text: 'Summary of conversation' }]);
expect(result[0].role).toBe('system');
expect(result[0].mapped).toBe(true);
expect(result[1].mapped).toBe(true);
expect(result[2].mapped).toBeUndefined();
});
it('should handle empty messages array', () => {
const mapMethod = jest.fn();
const mapCondition = jest.fn();
const result = AgentClient.getMessagesForConversation({
messages: [],
parentMessageId: 'msg-1',
mapMethod,
mapCondition,
});
expect(result).toHaveLength(0);
expect(mapMethod).not.toHaveBeenCalled();
expect(mapCondition).not.toHaveBeenCalled();
});
it('should handle undefined mapCondition explicitly', () => {
const messages = [
createMessage('msg-1', null, 'First'),
createMessage('msg-2', 'msg-1', 'Second'),
];
const mapMethod = jest.fn((msg) => ({ ...msg, mapped: true }));
const result = AgentClient.getMessagesForConversation({
messages,
parentMessageId: 'msg-2',
mapMethod,
mapCondition: undefined,
});
expect(result).toHaveLength(2);
expect(mapMethod).toHaveBeenCalledTimes(2);
result.forEach((msg) => {
expect(msg.mapped).toBe(true);
});
});
});
describe('buildMessages - memory context for parallel agents', () => {
let client;
let mockReq;
let mockRes;
let mockAgent;
let mockOptions;
beforeEach(() => {
jest.clearAllMocks();
mockAgent = {
id: 'primary-agent',
name: 'Primary Agent',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
instructions: 'Primary agent instructions',
model_parameters: {
model: 'gpt-4',
},
tools: [],
};
mockReq = {
user: {
id: 'user-123',
personalization: {
memories: true,
},
},
body: {
endpoint: EModelEndpoint.openAI,
},
config: {
memory: {
disabled: false,
},
},
};
mockRes = {};
mockOptions = {
req: mockReq,
res: mockRes,
agent: mockAgent,
endpoint: EModelEndpoint.agents,
};
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
client.shouldSummarize = false;
client.maxContextTokens = 4096;
});
it('should only pass memory context to the primary agent by default', async () => {
const memoryContent = 'User prefers dark mode. User is a software developer.';
client.useMemory = jest
.fn()
.mockResolvedValue({ withKeys: memoryContent, withoutKeys: memoryContent });
const parallelAgent1 = {
id: 'parallel-agent-1',
name: 'Parallel Agent 1',
instructions: 'Parallel agent 1 instructions',
provider: EModelEndpoint.openAI,
};
const parallelAgent2 = {
id: 'parallel-agent-2',
name: 'Parallel Agent 2',
instructions: 'Parallel agent 2 instructions',
provider: EModelEndpoint.anthropic,
};
client.agentConfigs = new Map([
['parallel-agent-1', parallelAgent1],
['parallel-agent-2', parallelAgent2],
]);
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
});
expect(client.useMemory).toHaveBeenCalled();
expect(client.options.agent.instructions).toContain('Primary agent instructions');
expect(client.options.agent.instructions).not.toContain(memoryContent);
expect(client.options.agent.additional_instructions).toContain(memoryContent);
expect(parallelAgent1.instructions).toContain('Parallel agent 1 instructions');
expect(parallelAgent1.instructions).not.toContain(memoryContent);
expect(parallelAgent1.additional_instructions ?? '').not.toContain(memoryContent);
expect(parallelAgent2.instructions).toContain('Parallel agent 2 instructions');
expect(parallelAgent2.instructions).not.toContain(memoryContent);
expect(parallelAgent2.additional_instructions ?? '').not.toContain(memoryContent);
});
it('should pass memory context to parallel agents when automatic memory updates are enabled', async () => {
const memoryContent = 'User prefers dark mode. User is a software developer.';
client.useMemory = jest
.fn()
.mockResolvedValue({ withKeys: memoryContent, withoutKeys: memoryContent });
mockReq.config.memory.agent = {
enabled: true,
id: 'memory-agent',
};
const parallelAgent = {
id: 'parallel-agent-1',
name: 'Parallel Agent 1',
instructions: 'Parallel agent instructions',
provider: EModelEndpoint.openAI,
};
client.agentConfigs = new Map([['parallel-agent-1', parallelAgent]]);
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
});
expect(client.options.agent.instructions).toContain('Primary agent instructions');
expect(client.options.agent.instructions).not.toContain(memoryContent);
expect(client.options.agent.additional_instructions).toContain(memoryContent);
expect(parallelAgent.instructions).toContain('Parallel agent instructions');
expect(parallelAgent.instructions).not.toContain(memoryContent);
expect(parallelAgent.additional_instructions).toContain(memoryContent);
});
it('should not modify parallel agents when no memory context is available', async () => {
client.useMemory = jest.fn().mockResolvedValue(undefined);
const parallelAgent = {
id: 'parallel-agent-1',
name: 'Parallel Agent 1',
instructions: 'Original parallel instructions',
provider: EModelEndpoint.openAI,
};
client.agentConfigs = new Map([['parallel-agent-1', parallelAgent]]);
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
});
expect(parallelAgent.instructions).toBe('Original parallel instructions');
});
it('should handle parallel agents without existing instructions when memory stays primary-only', async () => {
const memoryContent = 'User is a data scientist.';
client.useMemory = jest
.fn()
.mockResolvedValue({ withKeys: memoryContent, withoutKeys: memoryContent });
const parallelAgentNoInstructions = {
id: 'parallel-agent-no-instructions',
name: 'Parallel Agent No Instructions',
provider: EModelEndpoint.openAI,
};
client.agentConfigs = new Map([
['parallel-agent-no-instructions', parallelAgentNoInstructions],
]);
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await client.buildMessages(messages, null, {
instructions: null,
additional_instructions: null,
});
expect(client.options.agent.additional_instructions).toContain(memoryContent);
expect(parallelAgentNoInstructions.instructions).toBeUndefined();
expect(parallelAgentNoInstructions.additional_instructions ?? '').not.toContain(
memoryContent,
);
});
it('should not modify agentConfigs when none exist', async () => {
const memoryContent = 'User prefers concise responses.';
client.useMemory = jest
.fn()
.mockResolvedValue({ withKeys: memoryContent, withoutKeys: memoryContent });
client.agentConfigs = null;
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await expect(
client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
}),
).resolves.not.toThrow();
expect(client.options.agent.additional_instructions).toContain(memoryContent);
});
it('should handle empty agentConfigs map', async () => {
const memoryContent = 'User likes detailed explanations.';
client.useMemory = jest
.fn()
.mockResolvedValue({ withKeys: memoryContent, withoutKeys: memoryContent });
client.agentConfigs = new Map();
const messages = [
{
messageId: 'msg-1',
parentMessageId: null,
sender: 'User',
text: 'Hello',
isCreatedByUser: true,
},
];
await expect(
client.buildMessages(messages, null, {
instructions: 'Base instructions',
additional_instructions: null,
}),
).resolves.not.toThrow();
expect(client.options.agent.additional_instructions).toContain(memoryContent);
});
});
describe('useMemory method - prelimAgent assignment', () => {
let client;
let mockReq;
let mockRes;
let mockAgent;
let mockOptions;
let mockCheckAccess;
let mockLoadAgent;
let mockInitializeAgent;
let mockCreateMemoryProcessor;
let mockGetFormattedMemories;
beforeEach(() => {
jest.clearAllMocks();
mockAgent = {
id: 'agent-123',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
instructions: 'Test instructions',
model: 'gpt-4',
model_parameters: {
model: 'gpt-4',
},
};
mockReq = {
user: {
id: 'user-123',
personalization: {
memories: true,
},
},
config: {
memory: {
agent: {
enabled: true,
id: 'agent-123',
},
},
endpoints: {
[EModelEndpoint.agents]: {
allowedProviders: [EModelEndpoint.openAI],
},
},
},
};
mockRes = {};
mockOptions = {
req: mockReq,
res: mockRes,
agent: mockAgent,
};
mockCheckAccess = require('@librechat/api').checkAccess;
mockLoadAgent = require('@librechat/api').loadAgent;
mockInitializeAgent = require('@librechat/api').initializeAgent;
mockCreateMemoryProcessor = require('@librechat/api').createMemoryProcessor;
mockGetFormattedMemories = require('~/models').getFormattedMemories;
mockGetFormattedMemories.mockResolvedValue({
withKeys: '',
withoutKeys: '',
totalTokens: 0,
});
});
it('should use current agent when memory config agent.id matches current agent id', async () => {
mockCheckAccess.mockResolvedValue(true);
mockInitializeAgent.mockResolvedValue({
...mockAgent,
provider: EModelEndpoint.openAI,
});
mockCreateMemoryProcessor.mockResolvedValue([undefined, jest.fn()]);
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
await client.useMemory();
expect(mockLoadAgent).not.toHaveBeenCalled();
expect(mockInitializeAgent).toHaveBeenCalledWith(
expect.objectContaining({
agent: mockAgent,
}),
expect.any(Object),
);
});
it('should load different agent when memory config agent.id differs from current agent id', async () => {
const differentAgentId = 'different-agent-456';
const differentAgent = {
id: differentAgentId,
provider: EModelEndpoint.openAI,
model: 'gpt-4',
instructions: 'Different agent instructions',
};
mockReq.config.memory.agent.id = differentAgentId;
mockCheckAccess.mockResolvedValue(true);
mockLoadAgent.mockResolvedValue(differentAgent);
mockInitializeAgent.mockResolvedValue({
...differentAgent,
provider: EModelEndpoint.openAI,
});
mockCreateMemoryProcessor.mockResolvedValue([undefined, jest.fn()]);
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
await client.useMemory();
expect(mockLoadAgent).toHaveBeenCalledWith(
expect.objectContaining({
agent_id: differentAgentId,
}),
expect.any(Object),
);
expect(mockInitializeAgent).toHaveBeenCalledWith(
expect.objectContaining({
agent: differentAgent,
}),
expect.any(Object),
);
});
it('should return existing memories without auto-processing when memory agent is not enabled', async () => {
mockReq.config.memory = {
personalize: true,
};
mockCheckAccess.mockResolvedValue(true);
mockGetFormattedMemories.mockResolvedValue({
withKeys: 'food: likes pasta',
withoutKeys: 'likes pasta',
totalTokens: 3,
});
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
const result = await client.useMemory();
expect(result).toEqual({ withKeys: 'food: likes pasta', withoutKeys: 'likes pasta' });
expect(mockGetFormattedMemories).toHaveBeenCalledWith({ userId: 'user-123' });
expect(mockInitializeAgent).not.toHaveBeenCalled();
expect(mockCreateMemoryProcessor).not.toHaveBeenCalled();
expect(client.processMemory).toBeUndefined();
});
it('should not initialize auto-processing when no memories exist', async () => {
mockReq.config.memory = {
personalize: true,
};
mockCheckAccess.mockResolvedValue(true);
mockGetFormattedMemories.mockResolvedValue({
withKeys: '',
withoutKeys: '',
totalTokens: 0,
});
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
const result = await client.useMemory();
expect(result).toEqual({ withKeys: '', withoutKeys: '' });
expect(mockGetFormattedMemories).toHaveBeenCalledWith({ userId: 'user-123' });
expect(mockInitializeAgent).not.toHaveBeenCalled();
expect(mockCreateMemoryProcessor).not.toHaveBeenCalled();
expect(client.processMemory).toBeUndefined();
});
it('should return existing memories without auto-processing when memory agent config lacks explicit enablement', async () => {
mockReq.config.memory.agent = {
id: 'agent-123',
};
mockCheckAccess.mockResolvedValue(true);
mockGetFormattedMemories.mockResolvedValue({
withKeys: 'tone: concise',
withoutKeys: 'prefers concise answers',
totalTokens: 4,
});
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
const result = await client.useMemory();
expect(result).toEqual({ withKeys: 'tone: concise', withoutKeys: 'prefers concise answers' });
expect(mockLoadAgent).not.toHaveBeenCalled();
expect(mockInitializeAgent).not.toHaveBeenCalled();
expect(mockCreateMemoryProcessor).not.toHaveBeenCalled();
});
it('should return undefined when loading memories fails without auto-processing', async () => {
const { logger } = require('@librechat/data-schemas');
const errorSpy = jest.spyOn(logger, 'error').mockImplementation(() => logger);
mockReq.config.memory = {
personalize: true,
};
mockCheckAccess.mockResolvedValue(true);
mockGetFormattedMemories.mockRejectedValue(new Error('DB connection failed'));
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
const result = await client.useMemory();
expect(result).toBeUndefined();
expect(mockGetFormattedMemories).toHaveBeenCalledWith({ userId: 'user-123' });
expect(mockInitializeAgent).not.toHaveBeenCalled();
expect(mockCreateMemoryProcessor).not.toHaveBeenCalled();
expect(client.processMemory).toBeUndefined();
expect(errorSpy).toHaveBeenCalledWith(
'[api/server/controllers/agents/client.js #useMemory] Error loading memories',
expect.any(Error),
);
});
it('should create ephemeral agent when no id but model and provider are specified', async () => {
mockReq.config.memory = {
agent: {
enabled: true,
model: 'gpt-4',
provider: EModelEndpoint.openAI,
},
};
mockCheckAccess.mockResolvedValue(true);
mockInitializeAgent.mockResolvedValue({
id: Constants.EPHEMERAL_AGENT_ID,
model: 'gpt-4',
provider: EModelEndpoint.openAI,
});
mockCreateMemoryProcessor.mockResolvedValue([undefined, jest.fn()]);
client = new AgentClient(mockOptions);
client.conversationId = 'convo-123';
client.responseMessageId = 'response-123';
await client.useMemory();
expect(mockLoadAgent).not.toHaveBeenCalled();
expect(mockInitializeAgent).toHaveBeenCalledWith(
expect.objectContaining({
agent: expect.objectContaining({
id: Constants.EPHEMERAL_AGENT_ID,
model: 'gpt-4',
provider: EModelEndpoint.openAI,
}),
}),
expect.any(Object),
);
});
});
});
describe('AgentClient - finalizeSubagentContent', () => {
/** Verifies the backend persistence path: per-subagent
* `createContentAggregator` instances (populated by the
* `ON_SUBAGENT_UPDATE` handler from `./callbacks`) have their `contentParts`
* harvested onto the matching parent `subagent` tool_call at message-save
* time, so a page refresh shows the same activity the user saw live. */
const { GraphEvents } = jest.requireActual('@librechat/agents');
const { getDefaultHandlers } = require('./callbacks');
const makeClient = (subagentAggregatorsByToolCallId) =>
new AgentClient({
req: { user: { id: 'u' }, body: {}, config: { endpoints: {} } },
res: {},
agent: {
id: 'agent',
endpoint: EModelEndpoint.openAI,
provider: EModelEndpoint.openAI,
model_parameters: { model: 'gpt-4' },
},
contentParts: [],
subagentAggregatorsByToolCallId,
});
const event = (phase, data, parentToolCallId = 'call_sub') => ({
runId: 'parent-run',
subagentRunId: 'child-run',
subagentType: 'self',
subagentAgentId: 'child',
parentToolCallId,
phase,
data,
timestamp: '2026-04-17T00:00:00Z',
});
/** Feeds a SubagentUpdateEvent sequence through the real
* `ON_SUBAGENT_UPDATE` handler so we exercise the same get-or-create
* aggregator logic the live request uses, rather than constructing
* aggregators directly in the test. */
const runSubagentEvents = async (events) => {
const map = new Map();
const handlers = getDefaultHandlers({
res: { write: jest.fn(), writableEnded: false },
aggregateContent: jest.fn(),
toolEndCallback: jest.fn(),
collectedUsage: [],
subagentAggregatorsByToolCallId: map,
});
const handler = handlers[GraphEvents.ON_SUBAGENT_UPDATE];
for (const e of events) {
await handler.handle(GraphEvents.ON_SUBAGENT_UPDATE, e);
}
return map;
};
it('attaches aggregated subagent_content to the matching subagent tool_call part', async () => {
const buffer = await runSubagentEvents([
event('run_step', {
id: 'step_msg',
index: 0,
stepDetails: { type: 'message_creation' },
}),
event('message_delta', {
id: 'step_msg',
delta: { content: [{ type: 'text', text: 'Hello ' }] },
}),
event('message_delta', {
id: 'step_msg',
delta: { content: [{ type: 'text', text: 'world!' }] },
}),
event('run_step', {
id: 'step_tool',
index: 1,
stepDetails: {
type: 'tool_calls',
tool_calls: [{ id: 'inner_1', name: 'calculator', args: '{}' }],
},
}),
event('run_step_completed', {
id: 'step_tool',
index: 1,
result: {
id: 'step_tool',
type: 'tool_call',
tool_call: {
id: 'inner_1',
name: 'calculator',
output: '4',
progress: 1,
},
},
}),
]);
const client = makeClient(buffer);
client.contentParts = [
{
type: 'tool_call',
tool_call: {
id: 'call_sub',
name: Constants.SUBAGENT,
args: '{}',
output: 'final text',
progress: 1,
},
},
];
client.finalizeSubagentContent();
const attached = client.contentParts[0].tool_call.subagent_content;
expect(Array.isArray(attached)).toBe(true);
expect(attached).toHaveLength(2);
expect(attached[0].type).toBe('text');
expect(attached[0].text).toBe('Hello world!');
expect(attached[1].type).toBe('tool_call');
expect(attached[1].tool_call.name).toBe('calculator');
expect(attached[1].tool_call.output).toBe('4');
/** The buffer is drained so that a second call (e.g. a resumable retry)
* doesn't double-append. */
expect(buffer.size).toBe(0);
});
it('ignores tool_call parts whose name is not SUBAGENT', async () => {
const buffer = await runSubagentEvents([
event(
'run_step',
{
id: 'step_msg',
index: 0,
stepDetails: { type: 'message_creation' },
},
'call_regular',
),
event(
'message_delta',
{
id: 'step_msg',
delta: { content: [{ type: 'text', text: 'x' }] },
},
'call_regular',
),
]);
const client = makeClient(buffer);
client.contentParts = [
{
type: 'tool_call',
tool_call: { id: 'call_regular', name: 'calculator', args: '{}' },
},
];
client.finalizeSubagentContent();
expect(client.contentParts[0].tool_call.subagent_content).toBeUndefined();
});
it('is a safe no-op when the aggregator map is empty or missing', () => {
const client = makeClient(undefined);
client.contentParts = [
{
type: 'tool_call',
tool_call: { id: 'call_sub', name: Constants.SUBAGENT, args: '{}' },
},
];
expect(() => client.finalizeSubagentContent()).not.toThrow();
expect(client.contentParts[0].tool_call.subagent_content).toBeUndefined();
});
it('discards aggregators keyed by a tool_call_id not present in contentParts', async () => {
const buffer = await runSubagentEvents([
event(
'run_step',
{
id: 'step_msg',
index: 0,
stepDetails: { type: 'message_creation' },
},
'call_missing',
),
event(
'message_delta',
{
id: 'step_msg',
delta: { content: [{ type: 'text', text: 'x' }] },
},
'call_missing',
),
]);
const client = makeClient(buffer);
client.contentParts = [
{
type: 'tool_call',
tool_call: { id: 'call_other', name: Constants.SUBAGENT, args: '{}' },
},
];
client.finalizeSubagentContent();
expect(client.contentParts[0].tool_call.subagent_content).toBeUndefined();
});
it('keeps per-parent tool_call aggregators isolated for parallel subagents', async () => {
const buffer = await runSubagentEvents([
event(
'run_step',
{
id: 'step_a',
index: 0,
stepDetails: { type: 'message_creation' },
},
'call_a',
),
event(
'message_delta',
{ id: 'step_a', delta: { content: [{ type: 'text', text: 'A' }] } },
'call_a',
),
event(
'run_step',
{
id: 'step_b',
index: 0,
stepDetails: { type: 'message_creation' },
},
'call_b',
),
event(
'message_delta',
{ id: 'step_b', delta: { content: [{ type: 'text', text: 'B' }] } },
'call_b',
),
]);
const client = makeClient(buffer);
client.contentParts = [
{ type: 'tool_call', tool_call: { id: 'call_a', name: Constants.SUBAGENT, args: '{}' } },
{ type: 'tool_call', tool_call: { id: 'call_b', name: Constants.SUBAGENT, args: '{}' } },
];
client.finalizeSubagentContent();
expect(client.contentParts[0].tool_call.subagent_content).toEqual([
expect.objectContaining({ type: 'text', text: 'A' }),
]);
expect(client.contentParts[1].tool_call.subagent_content).toEqual([
expect.objectContaining({ type: 'text', text: 'B' }),
]);
});
});