🛟 fix: persist Vertex Gemini 3 thoughtSignatures across DB round-trips (#13026)

When a tool round-trip is interrupted between the tool result and the
model's text reply (user aborted, network drop, pod restart, ...) and
LibreChat persists the partial assistant message, the next conversation
turn reconstructs an `AIMessage` from `formatAgentMessages` that has
`tool_calls` populated but no `additional_kwargs.signatures`. Vertex
Gemini 3 rejects the resumed request with 400 because the most recent
historical functionCall has no `thought_signature`.

## Storage shape

Capture as `Record<tool_call_id, signature>` rather than a flat array.
This addresses the codex P1 review:

  > When an assistant turn contains multiple sequential tool-call batches,
  > this restoration path writes all persisted thoughtSignatures onto only
  > the last tool-bearing AIMessage. Vertex/Gemini validates signatures
  > for each step in the current tool-calling turn, so earlier
  > functionCall steps reconstructed without their signature can still
  > fail with 400.

A single agent run can fire multiple `chat_model_end` events when the
loop cycles the LLM with intervening tool results — each cycle owns a
distinct `tool_call_id`. Per-id storage maps each signature back onto
the right reconstructed `AIMessage`, not just the last one.
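
The per-id accumulation can be sketched as follows (a standalone toy with names borrowed from the PR; the real handler lives in `callbacks.js`):

```javascript
// Shared per-run map: every chat_model_end cycle merges its own
// tool_call_id -> signature entries, so a later cycle never
// overwrites an earlier cycle's captures.
const collectedThoughtSignatures = {};

function onChatModelEnd(output, collected) {
  // Non-empty signatures line up 1:1 with tool_calls in order.
  const sigs = (output.additional_kwargs?.signatures ?? []).filter((s) => s);
  const calls = output.tool_calls ?? [];
  sigs.forEach((sig, i) => {
    const id = calls[i]?.id;
    if (id) {
      collected[id] = sig;
    }
  });
}

// Two LLM cycles in the same tool-using turn, each with its own id:
onChatModelEnd(
  { tool_calls: [{ id: 'tc_step1' }], additional_kwargs: { signatures: ['SIG_step1'] } },
  collectedThoughtSignatures,
);
onChatModelEnd(
  { tool_calls: [{ id: 'tc_step2' }], additional_kwargs: { signatures: ['SIG_step2'] } },
  collectedThoughtSignatures,
);
// collectedThoughtSignatures now holds both steps' entries.
```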

## Mapping

`additional_kwargs.signatures` is a flat array indexed by *response part*
(text + functionCall interleaved). `tool_calls` is just the function
calls in their original order. Non-empty signatures correspond 1:1 with
tool_calls in order — see `partsToSignatures` in
`@langchain/google-common`. Single-pass walk maps `signatures[i]` (when
non-empty) onto the i-th `tool_call.id`.
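
The single-pass walk amounts to the following (a minimal standalone version of the capture logic; the function name is illustrative):

```javascript
// Map positionally indexed signatures onto tool_call ids.
// `signatures` is indexed by response part (text parts contribute
// empty strings); non-empty entries pair 1:1 with `toolCalls` in order.
function mapSignaturesToToolCalls(signatures, toolCalls) {
  const byId = {};
  let toolIdx = 0;
  for (const sig of signatures) {
    if (typeof sig !== 'string' || sig.length === 0) {
      continue; // text part: no signature to attach
    }
    if (toolIdx >= toolCalls.length) {
      break; // more signatures than tool calls; ignore the excess
    }
    const id = toolCalls[toolIdx++]?.id;
    if (id) {
      byId[id] = sig;
    }
  }
  return byId;
}
```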

## Pipeline

| Stage | File | Change |
|---|---|---|
| Capture | callbacks.js | `ModelEndHandler` accepts a `Record<string,string>` map; walks signatures + tool_calls in tandem to record per-id. Gated on the map being provided; non-Vertex flows are a no-op even when it is, since they don't emit signatures. |
| Plumbing | initialize.js | Allocate `collectedThoughtSignatures = {}`, share with handler + client. Always allocated; the JSDoc explicitly documents that it stays empty for non-Vertex providers. |
| Surface | client.js | `sendCompletion` returns `metadata.thoughtSignatures` when the map has entries; falls through unchanged when empty. |
| Persist | (existing BaseClient.handleRespCompletion) | Writes `metadata` from `sendCompletion` onto `responseMessage.metadata`. Mongoose `Mixed` — no migration. |
| Restore | formatMessages.js | Track every tool-bearing AIMessage produced from a TMessage. For each, build a position-aligned `additional_kwargs.signatures` array (empty placeholders for tool_calls without a stored sig). Agents' `fixThoughtSignatures` dispatches non-empty entries to functionCall parts in order. |
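
The restore stage can be sketched roughly like this (hedged: the helper name is mine, and the real `formatMessages.js` path also dispatches via the agents-side `fixThoughtSignatures`):

```javascript
// Rebuild a position-aligned additional_kwargs.signatures array for a
// reconstructed AIMessage: one slot per tool_call, with empty-string
// placeholders for tool_calls that have no persisted signature.
function restoreSignatures(toolCalls, thoughtSignatures) {
  return toolCalls.map((tc) => thoughtSignatures[tc.id] ?? '');
}

// `stored` stands in for responseMessage.metadata.thoughtSignatures:
const stored = { tc_step1: 'SIG_step1' };
const signatures = restoreSignatures(
  [{ id: 'tc_step1' }, { id: 'tc_other' }],
  stored,
);
```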

## Live verification

- **Single-step:** real Vertex `gemini-3.1-flash-lite-preview` resume-after-tool case. With the fix: succeeds; without: 400.
- **Multi-step (codex case):** real two-step agent loop (list /tmp → echo done). Each step's signature attaches to its own reconstructed AIMessage. With the fix: succeeds; without: 400.
- **Cross-provider:** Anthropic Claude haiku-4.5 + OpenAI gpt-5-mini accept the persisted/restored shape unchanged.

## Tests

`modelEndHandler.spec.js` (new) — 6 tests:
- maps non-empty signatures onto tool_call_ids in order
- accumulates per-id across multiple `model_end` events (multi-step)
- no-op when `collectedThoughtSignatures` is null
- no-op when `signatures` field missing (non-Vertex)
- no-op when `tool_calls` missing
- preserves existing `collectedUsage` array contract

`formatAgentMessages.spec.js` — 6 new tests:
- restores onto the AIMessage that owns the tool_call
- per-step attachment for multi-step turns (codex review case)
- preserves tool_call ordering when signatures are partial
- no-op when metadata.thoughtSignatures absent
- no-op when assistant has no tool_calls
- no-op when stored ids don't match any current tool_call

37 passing across 3 suites; 15 existing formatAgentMessages tests unchanged.

## Compatibility

- Backward-compatible — restore gated on `metadata.thoughtSignatures` being a populated object; capture gated on the map being provided.
- No schema migration — uses `Message.metadata: Mixed` already in place.
- Cross-provider safe — non-Vertex providers tolerate the field (verified live against Anthropic + OpenAI converters).
- Pairs with [agents#159](https://github.com/danny-avila/agents/pull/159) for full coverage on histories that mix plain-text and tool-call AIMessages.
Danny Avila 2026-05-08 18:51:34 -04:00 committed by GitHub
parent a565a61a23
commit d90567204e
6 changed files with 405 additions and 3 deletions

@@ -0,0 +1,156 @@
jest.mock('@librechat/data-schemas', () => ({
logger: { error: jest.fn(), debug: jest.fn() },
}));
jest.mock('@librechat/api', () => ({
sendEvent: jest.fn(),
emitEvent: jest.fn(),
createToolExecuteHandler: jest.fn(),
markSummarizationUsage: (usage) => usage,
}));
jest.mock('~/server/services/Files/Citations', () => ({
processFileCitations: jest.fn(),
}));
jest.mock('~/server/services/Files/Code/process', () => ({
processCodeOutput: jest.fn(),
runPreviewFinalize: jest.fn(),
}));
jest.mock('~/server/services/Files/process', () => ({
saveBase64Image: jest.fn(),
}));
const { ModelEndHandler } = require('../callbacks');
const buildGraph = () => ({
getAgentContext: () => ({
provider: 'vertexai',
clientOptions: { model: 'gemini-3.1-flash-lite-preview' },
}),
});
describe('ModelEndHandler — Vertex thoughtSignature capture (issue #13006 follow-up)', () => {
it('maps non-empty signatures onto tool_call_ids in order', async () => {
const collectedUsage = [];
const collectedThoughtSignatures = {};
const handler = new ModelEndHandler(collectedUsage, collectedThoughtSignatures);
await handler.handle(
'on_chat_model_end',
{
output: {
usage_metadata: { input_tokens: 10, output_tokens: 5, total_tokens: 15 },
tool_calls: [
{ id: 'tc_a', name: 'a', args: {} },
{ id: 'tc_b', name: 'b', args: {} },
],
additional_kwargs: { signatures: ['SIG_A', '', 'SIG_B'] },
},
},
{ ls_model_name: 'gemini-3.1-flash-lite-preview', user_id: 'u1' },
buildGraph(),
);
expect(collectedThoughtSignatures).toEqual({ tc_a: 'SIG_A', tc_b: 'SIG_B' });
expect(collectedUsage).toHaveLength(1);
});
it('accumulates per-id across multiple model_end events (multi-step tool turn)', async () => {
const collectedUsage = [];
const collectedThoughtSignatures = {};
const handler = new ModelEndHandler(collectedUsage, collectedThoughtSignatures);
await handler.handle(
'on_chat_model_end',
{
output: {
usage_metadata: { input_tokens: 5, output_tokens: 5, total_tokens: 10 },
tool_calls: [{ id: 'tc_step1', name: 'a', args: {} }],
additional_kwargs: { signatures: ['SIG_step1'] },
},
},
{ ls_model_name: 'g', user_id: 'u' },
buildGraph(),
);
await handler.handle(
'on_chat_model_end',
{
output: {
usage_metadata: { input_tokens: 5, output_tokens: 5, total_tokens: 10 },
tool_calls: [{ id: 'tc_step2', name: 'b', args: {} }],
additional_kwargs: { signatures: ['SIG_step2'] },
},
},
{ ls_model_name: 'g', user_id: 'u' },
buildGraph(),
);
expect(collectedThoughtSignatures).toEqual({
tc_step1: 'SIG_step1',
tc_step2: 'SIG_step2',
});
});
it('is a no-op for signatures when collectedThoughtSignatures is null', async () => {
const collectedUsage = [];
const handler = new ModelEndHandler(collectedUsage, null);
await handler.handle(
'on_chat_model_end',
{
output: {
usage_metadata: { input_tokens: 5, output_tokens: 5, total_tokens: 10 },
tool_calls: [{ id: 'tc1', name: 'a', args: {} }],
additional_kwargs: { signatures: ['SIG'] },
},
},
{ ls_model_name: 'g', user_id: 'u' },
buildGraph(),
);
expect(collectedUsage).toHaveLength(1);
});
it('does not store anything when signatures field is missing (non-Vertex providers)', async () => {
const collectedUsage = [];
const collectedThoughtSignatures = {};
const handler = new ModelEndHandler(collectedUsage, collectedThoughtSignatures);
await handler.handle(
'on_chat_model_end',
{
output: {
usage_metadata: { input_tokens: 5, output_tokens: 5, total_tokens: 10 },
tool_calls: [{ id: 'tc1', name: 'a', args: {} }],
additional_kwargs: {},
},
},
{ ls_model_name: 'gpt-4', user_id: 'u' },
buildGraph(),
);
expect(collectedThoughtSignatures).toEqual({});
});
it('does not store anything when tool_calls is missing', async () => {
const collectedUsage = [];
const collectedThoughtSignatures = {};
const handler = new ModelEndHandler(collectedUsage, collectedThoughtSignatures);
await handler.handle(
'on_chat_model_end',
{
output: {
usage_metadata: { input_tokens: 5, output_tokens: 5, total_tokens: 10 },
additional_kwargs: { signatures: ['SIG_orphan'] },
},
},
{ ls_model_name: 'g', user_id: 'u' },
buildGraph(),
);
expect(collectedThoughtSignatures).toEqual({});
});
it('throws when collectedUsage is not an array (existing contract)', () => {
expect(() => new ModelEndHandler(null)).toThrow('collectedUsage must be an array');
});
});

@@ -21,12 +21,24 @@ const { saveBase64Image } = require('~/server/services/Files/process');
class ModelEndHandler {
/**
* @param {Array<UsageMetadata>} collectedUsage
* @param {Record<string, string> | null} [collectedThoughtSignatures] Map of
* `tool_call_id → thoughtSignature` accumulated across `chat_model_end`
* events. Used to persist Vertex Gemini 3 thought signatures across DB
* round-trips so resumed conversations don't 400 on the next API call.
* Each `model_end` may emit multiple tool calls (one per LLM cycle in a
* tool-using turn); per-id storage preserves the mapping so each tool
* call's signature can be restored onto the right reconstructed
* AIMessage rather than being concentrated on the last one.
* Optional; when `null`, the handler is a no-op for signatures. Non-Vertex
* providers don't emit `additional_kwargs.signatures`, so capture is also
* a no-op for them even when the map is provided.
*/
constructor(collectedUsage) {
constructor(collectedUsage, collectedThoughtSignatures = null) {
if (!Array.isArray(collectedUsage)) {
throw new Error('collectedUsage must be an array');
}
this.collectedUsage = collectedUsage;
this.collectedThoughtSignatures = collectedThoughtSignatures;
}
finalize(errorMessage) {
@@ -82,6 +94,30 @@ class ModelEndHandler {
const taggedUsage = markSummarizationUsage(usage, metadata);
this.collectedUsage.push(taggedUsage);
/**
* `additional_kwargs.signatures` is a flat array indexed by response
* part position (text + functionCall interleaved). `tool_calls` is
* just the function calls in their original order. Non-empty
* signatures correspond 1:1 with `tool_calls` in order; see
* `partsToSignatures` in `@langchain/google-common`. Walk both in a
* single pass to map each signature onto the right `tool_call.id`.
*/
const signatures = data?.output?.additional_kwargs?.signatures;
const toolCalls = data?.output?.tool_calls;
if (
this.collectedThoughtSignatures &&
Array.isArray(signatures) &&
Array.isArray(toolCalls)
) {
let toolIdx = 0;
for (const sig of signatures) {
if (typeof sig !== 'string' || sig.length === 0) continue;
if (toolIdx >= toolCalls.length) break;
const id = toolCalls[toolIdx++]?.id;
if (id) this.collectedThoughtSignatures[id] = sig;
}
}
} catch (error) {
logger.error('Error handling model end event:', error);
return this.finalize(errorMessage);
@@ -183,6 +219,7 @@ function getDefaultHandlers({
aggregateContent,
toolEndCallback,
collectedUsage,
collectedThoughtSignatures = null,
streamId = null,
toolExecuteOptions = null,
summarizationOptions = null,
@@ -194,7 +231,7 @@
);
}
const handlers = {
[GraphEvents.CHAT_MODEL_END]: new ModelEndHandler(collectedUsage),
[GraphEvents.CHAT_MODEL_END]: new ModelEndHandler(collectedUsage, collectedThoughtSignatures),
[GraphEvents.TOOL_END]: new ToolEndHandler(toolEndCallback, logger),
[GraphEvents.ON_RUN_STEP]: {
/**
@@ -1023,6 +1060,7 @@ function buildSummarizationHandlers({ isStreaming, res }) {
}
module.exports = {
ModelEndHandler,
agentLogHandler,
agentLogHandlerObj,
getDefaultHandlers,

@@ -82,6 +82,7 @@ class AgentClient extends BaseClient {
agentConfigs,
contentParts,
collectedUsage,
collectedThoughtSignatures,
artifactPromises,
maxContextTokens,
subagentAggregatorsByToolCallId,
@@ -94,6 +95,12 @@
this.contentParts = contentParts;
/** @type {Array<UsageMetadata>} */
this.collectedUsage = collectedUsage;
/** Vertex Gemini 3 thought signatures captured during the run, keyed by
* `tool_call_id`. Persisted on `responseMessage.metadata.thoughtSignatures`
* and restored as `additional_kwargs.signatures` on subsequent turns to
* keep tool round-trips valid across DB reconstruction.
* @type {Record<string, string> | undefined} */
this.collectedThoughtSignatures = collectedThoughtSignatures;
/** @type {ArtifactPromises} */
this.artifactPromises = artifactPromises;
/** Per-request map of `createContentAggregator` instances keyed by
@@ -722,7 +729,11 @@
});
const completion = filterMalformedContentParts(this.contentParts);
return { completion };
const signatures = this.collectedThoughtSignatures;
if (!signatures || Object.keys(signatures).length === 0) {
return { completion };
}
return { completion, metadata: { thoughtSignatures: signatures } };
}
/**

@@ -109,6 +109,18 @@ const initializeClient = async ({ req, res, signal, endpointOption }) => {
/** @type {Array<UsageMetadata>} */
const collectedUsage = [];
/**
* Vertex Gemini 3 thought signatures captured from `chat_model_end` events,
* keyed by `tool_call_id`. Persisted on
* `responseMessage.metadata.thoughtSignatures` so subsequent conversation
* turns can restore each signature onto the right reconstructed AIMessage's
* `additional_kwargs.signatures` and avoid 400s when resuming after a tool
* round-trip without a final text reply. Always allocated; capture path
* is a no-op for providers that don't emit signatures (OpenAI, Anthropic,
* Bedrock, etc.).
* @type {Record<string, string>}
*/
const collectedThoughtSignatures = {};
/** @type {ArtifactPromises} */
const artifactPromises = [];
const { contentParts, aggregateContent } = createContentAggregator();
@@ -215,6 +227,7 @@
aggregateContent,
toolEndCallback,
collectedUsage,
collectedThoughtSignatures,
streamId,
subagentAggregatorsByToolCallId,
});
@@ -780,6 +793,7 @@
agentConfigs,
eventHandlers,
collectedUsage,
collectedThoughtSignatures,
aggregateContent,
artifactPromises,
primeInvokedSkills: handlePrimeInvokedSkills,