LibreChat/e2e/specs/mock/helpers.ts
Danny Avila 397ddc5366
🧠 feat: Add Memory as an Agent Capability with Inline Tools and Ephemeral Badge (#13869)
* 🧠 feat: Memory Agent Capability with Inline Tools and Ephemeral Badge

Add `AgentCapabilities.memory`, which expands into the inline set_memory/delete_memory tool pair (mirroring the execute_code expansion via registerMemoryTools) when a run-level memoryAvailable gate holds: capability enabled, memory configured, MEMORIES.USE permission, and personalization not opted out. Surfaces the memory artifact as an attachment in the agents tool-end callback.

Adds the ephemeral path (TEphemeralAgent.memory, load/added agent tool injection), a fully-gated memory badge plus tools-dropdown entry, the agent-builder Memory toggle with form round-trip, and a mock e2e test asserting the badge reaches the request payload. Additive to and independent of the existing post-turn memory extraction agent.

* 🩹 fix: Address Codex review on memory capability (gating, validKeys, usage guard)

- Strip the memory capability from the served agents capabilities when memory is not configured/enabled, so the badge, tools dropdown, agent-builder toggle, and backend capability gate stay consistent instead of exposing an inert toggle on default installs (where MEMORIES.USE defaults true).
- Surface configured memory.validKeys in the inline tool definitions so the model is told the allowed keys up front, matching the runtime createMemoryTool schema.
- Append a strict explicit-request usage guard to the agent instructions when inline memory tools are registered, preserving the memory-agent's privacy behavior.
- Add AppService tests covering memory-capability stripping.
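
The stripping rule in the first bullet can be pictured with a minimal sketch. The `MemoryConfig` shape and the `stripMemoryCapability` name are assumptions for illustration, not the AppService code; only the `memory` capability name comes from the notes above.

```typescript
// Hypothetical shape: a memory block counts as usable when it exists and is not disabled.
type MemoryConfig = { disabled?: boolean };

/** Return the capability list to serve, dropping 'memory' when memory is unusable. */
function stripMemoryCapability(
  capabilities: readonly string[],
  memory?: MemoryConfig,
): string[] {
  const memoryEnabled = memory !== undefined && memory.disabled !== true;
  return memoryEnabled
    ? capabilities.slice()
    : capabilities.filter((capability) => capability !== 'memory');
}
```

Filtering the served list in one place is what keeps the badge, dropdown, builder toggle, and backend gate in agreement, since all of them read the same capabilities.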

* test: Update AppService capability snapshots for memory strip

AppService now strips the memory capability from the served agents defaults when no memory block is configured; update the spec's expected capability lists to defaultAgentCapabilitiesWithoutMemory for the no-memory-config cases.

* 🛡️ fix: Address Codex re-review on memory capability (round 2)

- Strip the memory capability from the FINAL served agents config, not just defaults; loadEndpoints reparses any endpoints.agents block, so memory was still exposed in that common shape (packages/data-schemas/src/app/service.ts) + regression test.
- Re-check the full memory gate (config, opt-out, MEMORIES.USE) inside handleTools before constructing set_memory/delete_memory, so an unsolicited tool call from a model/custom endpoint can't bypass the runtime gates (api/app/clients/tools/util/handleTools.js).
- Restore the persisted memory toggle for model-spec conversations via applyModelSpecEphemeralAgent (client/src/utils/endpoints.ts).
- Clear LAST_MEMORY_TOGGLE_ on logout and clear-all-chats so a stale memory preference can't leak across users on a shared browser (client/src/utils/localStorage.ts).

* 🧠 fix: Address Codex re-review on memory capability (round 3)

- Serialize set_memory writes and advance a running token total inside createMemoryTool, so parallel batched calls in one event-driven turn can't each pass the limit check against a stale total and collectively exceed memory.tokenLimit (packages/api/src/agents/memory.ts) + tests.
- Inject the keyed memory context (withKeys) instead of withoutKeys when the running agent has the inline memory capability, so delete_memory has a visible key to target (api/server/controllers/agents/client.js).
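
A rough sketch of the serialization idea in the first bullet, assuming a promise-chain queue and a crude token estimate. `createSerializedWriter`, `estimateTokens`, and `persist` are hypothetical names; this is not the code in packages/api/src/agents/memory.ts.

```typescript
type WriteResult = { ok: true; total: number } | { ok: false; reason: string };

/** Crude token estimate (assumption: roughly 4 characters per token). */
const estimateTokens = (value: string): number => Math.ceil(value.length / 4);

function createSerializedWriter(params: {
  tokenLimit: number;
  initialTotal: number;
  persist: (key: string, value: string) => Promise<void>;
}) {
  let runningTotal = params.initialTotal;
  // Each write chains onto the previous one, so limit checks never overlap.
  let queue: Promise<unknown> = Promise.resolve();

  return function write(key: string, value: string): Promise<WriteResult> {
    const task = queue.then(async (): Promise<WriteResult> => {
      const cost = estimateTokens(value);
      if (runningTotal + cost > params.tokenLimit) {
        return { ok: false, reason: `would exceed the limit of ${params.tokenLimit} tokens` };
      }
      await params.persist(key, value);
      // Advance the total only after the write has succeeded.
      runningTotal += cost;
      return { ok: true, total: runningTotal };
    });
    // Keep the chain usable even if one write throws.
    queue = task.catch(() => undefined);
    return task;
  };
}
```

With a limit of 10 tokens, two parallel 6-token writes would both pass a check against a stale total of 0; chained this way, the second write sees the advanced total and is refused.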

* 🔐 fix: Address Codex re-review on memory capability (round 4)

- Detect inline memory by tool NAME (set_memory/delete_memory) across an initialized agent's tools + toolDefinitions, since the 'memory' marker is expanded at init and the prior string check never matched; inject the keyed memory context for any primary OR sub-agent that carries the inline memory tools (api/server/controllers/agents/client.js).
- Enforce memory WRITE permissions in the inline tool gate: set_memory requires CREATE+UPDATE and delete_memory requires UPDATE (matching the REST memory routes), so a USE-only role can't mutate/delete memories via agent tool calls (api/app/clients/tools/util/handleTools.js).
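
The permission rule in the second bullet can be sketched as a lookup table. The type and function names are hypothetical, and treating `USE` as a baseline requirement for both tools is an assumption drawn from the earlier gating notes.

```typescript
type MemoryPermission = 'USE' | 'CREATE' | 'UPDATE';
type InlineMemoryTool = 'set_memory' | 'delete_memory';

// Assumed mapping: USE is the baseline, plus the write permissions named above.
const REQUIRED_PERMISSIONS: Record<InlineMemoryTool, readonly MemoryPermission[]> = {
  set_memory: ['USE', 'CREATE', 'UPDATE'],
  delete_memory: ['USE', 'UPDATE'],
};

/** True only when every permission the tool needs has been granted. */
function canRunMemoryTool(
  tool: InlineMemoryTool,
  granted: ReadonlySet<MemoryPermission>,
): boolean {
  return REQUIRED_PERMISSIONS[tool].every((permission) => granted.has(permission));
}
```

A USE-only role fails both checks, which is the case the bullet is guarding against.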

* 🔒 fix: Address Codex re-review on memory capability (round 5)

- Gate inline memory registration (memoryAvailable) on the memory WRITE permissions (USE+CREATE+UPDATE), so a read-only-memory role no longer has set_memory/delete_memory shown to the model only for the runtime loader to refuse them (api/server/services/Endpoints/agents/initialize.js).
- Enforce the per-agent memory opt-in at execution: handleTools now refuses to construct set_memory/delete_memory unless the agent actually declared them (toolDefinitions/tools), blocking hallucinated/undeclared memory tool calls from mutating memory.
- Fail closed when getFormattedMemories errors with a configured tokenLimit, instead of writing as if storage were empty and bypassing the cap (api/app/clients/tools/util/handleTools.js).

* 🩹 fix: Address Codex re-review on memory capability (round 6)

- Fix a P1 regression from the prior round: the execution-context agent keeps the raw 'memory' capability marker (not the expanded set_memory/delete_memory names), so the opt-in check now matches the marker. This restores memory writes/deletes AND avoids hijacking an MCP tool that merely shares the set_memory/delete_memory name (api/app/clients/tools/util/handleTools.js).
- Count repeated set_memory writes to the same key as replacements, not additions, against tokenLimit — set_memory upserts, so a same-key rewrite swaps its prior token contribution instead of double-counting (packages/api/src/agents/memory.ts) + test.
- Gate the memory badge, tools dropdown, and agent-builder toggle on the full memory write permissions (USE+CREATE+UPDATE) via a shared useHasMemoryAccess hook, so a read-only-memory role no longer sees an enabled Memory control the backend would refuse to wire up.
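
The upsert accounting in the second bullet, as a minimal sketch: a same-key write swaps the key's old cost for the new one before the limit check. `projectedTotal` and `tryUpsert` are hypothetical helpers, and per-key costs are assumed to be tracked in a `Map`.

```typescript
/** Project the token total after an upsert of `key` costing `newCost`. */
function projectedTotal(usage: Map<string, number>, key: string, newCost: number): number {
  let total = 0;
  usage.forEach((cost, existingKey) => {
    // Skip the key being rewritten: its old cost is replaced, not kept.
    if (existingKey !== key) {
      total += cost;
    }
  });
  return total + newCost;
}

/** Apply the upsert only if the projected total stays within the limit. */
function tryUpsert(
  usage: Map<string, number>,
  key: string,
  newCost: number,
  tokenLimit: number,
): boolean {
  if (projectedTotal(usage, key, newCost) > tokenLimit) {
    return false;
  }
  usage.set(key, newCost);
  return true;
}
```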

* 🧷 fix: Address Codex re-review on memory capability (round 7)

- Recognize inline memory across both execution-context agent shapes: initializeAgent now sets a LibreChat-only memoryToolsRegistered flag on the InitializedAgent, and the opt-in/detection checks accept that flag OR the raw 'memory' marker. Fixes memory failing for processAddedConvo agents (which store the initialized config, marker already expanded) while staying MCP-name-collision-safe (api/app/clients/tools/util/handleTools.js, packages/api/src/agents/initialize.ts, api/server/controllers/agents/client.js).
- Scope keyed memory context to memory-enabled agents only: useMemory now returns both keyed and unkeyed contexts, and buildMessages injects the keyed one (memory keys + token metadata) only to agents that can call delete_memory, while the primary/post-turn path keeps the unkeyed values — so a primary without memory tools no longer sees memory keys it doesn't need.

* 🔏 fix: Address Codex re-review on memory capability (round 8)

- Enforce memory size limits on inline writes: createMemoryTool now rejects keys over 1000 chars and values over memory.charLimit, matching the REST memory routes, so an inline-memory agent can't persist blobs the memory UI/API would reject (packages/api/src/agents/memory.ts, api/app/clients/tools/util/handleTools.js) + test.
- Recheck the agents 'memory' endpoint capability at execution time, so a stale/hallucinated set_memory/delete_memory call can't mutate memory after an admin removes the capability while the agent document still carries the marker (api/app/clients/tools/util/handleTools.js).
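
A minimal sketch of the size checks in the first bullet. The 1000-character key limit comes from the note above; `validateMemoryWrite` and its result shape are hypothetical, and `charLimit` stands in for the configured `memory.charLimit`.

```typescript
const MAX_KEY_LENGTH = 1000;

type Validation = { ok: true } | { ok: false; error: string };

/** Reject oversized keys or values before anything is persisted. */
function validateMemoryWrite(key: string, value: string, charLimit: number): Validation {
  if (key.length > MAX_KEY_LENGTH) {
    return { ok: false, error: `key exceeds ${MAX_KEY_LENGTH} characters` };
  }
  if (value.length > charLimit) {
    return { ok: false, error: `value exceeds ${charLimit} characters` };
  }
  return { ok: true };
}
```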

* ♻️ refactor: Move inline-memory backend logic into packages/api + share memory load

Workspace boundary: the inline-memory gating/detection logic that had crept into /api now lives in packages/api/src/agents/memory.ts (TS), with /api kept as thin wrappers.

- Add agentHasInlineMemoryTools, isMemoryToolAllowed, and buildInlineMemoryTool to packages/api; handleTools.js now calls buildInlineMemoryTool instead of constructing/gating the tools inline, and client.js imports agentHasInlineMemoryTools instead of redefining it.
- Optimize repeated memory loads: getRequestMemories memoizes getFormattedMemories per request (WeakMap keyed by req), so the run's memory-context load and every memory-enabled agent's set_memory token-usage load share a single DB fetch instead of one per agent.

* 🧠 fix: Invalidate request memory cache after inline writes

Inline set_memory/delete_memory now invalidate the request-scoped
getFormattedMemories cache on a successful write, so a later tool round
in the same response is seeded with the post-write usage total instead
of the stale pre-write one (multi-round writes no longer collectively
exceed tokenLimit, and a set after a delete is not over-counted). The
within-round sharing across multiple memory-enabled agents is preserved.
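
The request-scoped cache described in the two notes above could look roughly like this: a `WeakMap` keyed by the request object, with explicit invalidation after a write. `createRequestCache` and its method names are hypothetical, and `Req` is a stand-in for the real request type.

```typescript
type Req = object; // stand-in for the HTTP request object

function createRequestCache<T>(load: (req: Req) => Promise<T>) {
  // Entries disappear on their own once the request object is garbage collected.
  const cache = new WeakMap<Req, Promise<T>>();

  return {
    /** Share one in-flight or completed load per request. */
    get(req: Req): Promise<T> {
      let pending = cache.get(req);
      if (!pending) {
        pending = load(req);
        cache.set(req, pending);
      }
      return pending;
    },
    /** Drop the cached load so the next `get` sees post-write state. */
    invalidate(req: Req): void {
      cache.delete(req);
    },
  };
}
```

Caching the promise rather than the resolved value lets concurrent callers in the same round share a single fetch, while `invalidate` after a successful write forces the next round to reload.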

* 🧠 fix: Persist memory capability on saved agents; honor registration flag

- Add Tools.memory to the v1 systemTools allowlist so filterAuthorizedTools
  no longer silently drops the memory marker when an agent with the Memory
  capability is created/updated/duplicated through the builder (previously
  the capability only worked for ephemeral chats, not persisted agents).
- agentHasInlineMemoryTools now honors an explicit memoryToolsRegistered
  boolean before falling back to the raw `memory` marker, so an initialized
  config whose registration was denied (memoryAvailable false) is not given
  keyed memory context just because the marker survives in tools.
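
The precedence in the second bullet, sketched under assumptions: an explicit boolean flag wins, and the raw `memory` marker is consulted only when the flag is absent. `AgentLike` and `hasInlineMemory` are hypothetical; the real `agentHasInlineMemoryTools` signature is not shown in this commit message.

```typescript
type AgentLike = {
  tools?: string[];
  memoryToolsRegistered?: boolean;
};

/** Explicit registration flag first, raw 'memory' marker as the fallback. */
function hasInlineMemory(agent: AgentLike): boolean {
  if (typeof agent.memoryToolsRegistered === 'boolean') {
    return agent.memoryToolsRegistered;
  }
  return (agent.tools ?? []).some((tool) => tool === 'memory');
}
```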

* 🧩 fix: Bring memory tool to parity with other ephemeral tools

- Add `memory` to the model-spec schema/type and honor `modelSpec.memory`
  in both ephemeral paths (load.ts, added.ts) and the frontend spec
  application, so admins can pre-enable Memory from a model spec exactly
  like webSearch/fileSearch/executeCode.
- Add LAST_MEMORY_TOGGLE_ to the timestamped-storage cleanup list so stale
  per-conversation memory toggles are purged on startup like the others.
- Hide the agent-builder Memory toggle for users who disabled memory in
  personalization (memories === false), mirroring the chat badge's opt-out
  gate, so the setting isn't shown as inert/misleading.

* test: Cover memory in applyModelSpecEphemeralAgent spec defaults

Update the exact-object assertions to include the new `memory` field and
add positive coverage that `modelSpec.memory` maps to the ephemeral
agent's `memory` flag. Fixes the shard 2/4 failure from 672a03b05.
2026-06-24 17:14:13 -04:00

import { expect } from '@playwright/test';
import type { Page, Response } from '@playwright/test';

/** Substring of the reply emitted by the mock LLM server. */
export const MOCK_REPLY_TEXT = 'E2E mock reply';

/** Custom endpoints defined in e2e/config/librechat.e2e.yaml. */
export const MOCK_ENDPOINTS = [
  { label: 'Mock Provider A', model: 'mock-model-a' },
  { label: 'Mock Provider B', model: 'mock-model-b' },
] as const;

export type MockEndpoint = { label: string; model: string };

export const NEW_CHAT_PATH = '/c/new';

type RefreshTokenBody = {
  token?: string;
};

/** Alias of `isAgentGenerationStart`. */
export function isAgentsStream(response: Response) {
  return isAgentGenerationStart(response);
}

/** True for a 200 response to a `POST` on `/api/agents/chat` or a sub-path, excluding `/abort`. */
export function isAgentGenerationStart(response: Response) {
  const { pathname } = new URL(response.url());
  const isAgentsChat = pathname === '/api/agents/chat' || pathname.startsWith('/api/agents/chat/');
  return (
    response.request().method() === 'POST' &&
    isAgentsChat &&
    !pathname.endsWith('/abort') &&
    response.status() === 200
  );
}

const modelSelectorTrigger = (page: Page) =>
  page.getByRole('button', { name: 'Select a model' }).first();

export const escapeRegExp = (value: string) => value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

/** Open the model selector, choose an endpoint, then its model (committed on the model click). */
export async function selectMockEndpoint(page: Page, endpoint: MockEndpoint) {
  const trigger = modelSelectorTrigger(page);
  await trigger.click();
  await page.getByRole('option', { name: endpoint.label }).click();
  const modelOption = page.getByRole('option', { name: endpoint.model, exact: true });
  // `locator.isVisible()` returns immediately and ignores `timeout`, so wait explicitly
  // for up to 1s; the model option may not be rendered at all.
  const hasModelOption = await modelOption
    .waitFor({ state: 'visible', timeout: 1000 })
    .then(() => true)
    .catch(() => false);
  if (hasModelOption) {
    await modelOption.click();
  }
  await expect(trigger).not.toHaveText('Select a model');
}

/** Open the model selector and choose a configured model spec by label. */
export async function selectModelSpec(page: Page, label: string) {
  const trigger = modelSelectorTrigger(page);
  await expect(trigger).toBeVisible();
  if ((await trigger.textContent())?.includes(label)) {
    return;
  }
  await trigger.click();
  await page.getByRole('option', { name: new RegExp(`(^|\\s)${escapeRegExp(label)}\\b`) }).click();
  await expect(trigger).toContainText(label);
}

/** Enable the ephemeral Skills capability from the composer tool menu. */
export async function enableSkills(page: Page) {
  await page.getByRole('button', { name: 'Tools Options' }).click();
  await page.getByTestId('tools-menu-skills').click();
  await page.keyboard.press('Escape');
  await expect(page.getByRole('button', { name: 'Skills' })).toBeVisible();
}

/** Enable the ephemeral Memory capability from the composer tool menu. */
export async function enableMemory(page: Page) {
  await page.getByRole('button', { name: 'Tools Options' }).click();
  await page.getByTestId('tools-menu-memory').click();
  await page.keyboard.press('Escape');
  await expect(page.getByRole('checkbox', { name: 'Memory' })).toBeVisible();
}

/** The conversation messages container. */
export const messagesView = (page: Page) => page.getByTestId('messages-view');

/** Build the mock-model reply trigger and its expected rendered text for a label. */
export const replyPrompt = (label: string) => `E2E_REPLY:${label}`;
export const replyText = (label: string) => `E2E reply ${label}`;

/** The mock reply as rendered in the conversation, scoped to the messages view. */
export function mockReply(page: Page) {
  return messagesView(page).getByText(new RegExp(MOCK_REPLY_TEXT, 'i'));
}

/** Type a message, send it, and wait for the `POST /api/agents/chat` response that starts the stream. */
export async function sendMessage(page: Page, text: string): Promise<Response> {
  const input = page.getByRole('textbox', { name: 'Message input' });
  await input.click();
  await input.fill(text);
  // Register the response wait before pressing Enter so the response cannot be missed.
  const [response] = await Promise.all([
    page.waitForResponse(isAgentsStream, { timeout: 30000 }),
    input.press('Enter'),
  ]);
  return response;
}

/** Request an access token from `/api/auth/refresh` inside the page, sending its cookies. */
export async function getAccessToken(page: Page): Promise<string> {
  const result = await page.evaluate(async () => {
    const response = await fetch('/api/auth/refresh', {
      method: 'POST',
      credentials: 'include',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({}),
    });
    const text = await response.text();
    // Parse defensively: an empty or non-JSON body yields `null` instead of throwing.
    let json: unknown = null;
    try {
      json = text ? JSON.parse(text) : null;
    } catch {
      json = null;
    }
    return { ok: response.ok, status: response.status, text, json };
  });
  if (!result.ok) {
    throw new Error(
      `Expected /api/auth/refresh to return 2xx, got ${result.status}: ${result.text}`,
    );
  }
  const body = result.json as RefreshTokenBody | null;
  if (!body?.token) {
    throw new Error(`Expected /api/auth/refresh to return a token, got: ${result.text}`);
  }
  return body.token;
}

/**
 * Call a JSON endpoint from inside the page with a bearer token (default method: GET).
 * Throws on a non-2xx status; resolves to `null` (cast to `T`) for an empty or non-JSON body.
 */
export async function requestJson<T>(
  page: Page,
  params: {
    path: string;
    token: string;
    method?: string;
    body?: unknown;
  },
): Promise<T> {
  const result = await page.evaluate(
    async ({ accessToken, body, method, urlPath }) => {
      const headers: Record<string, string> = {
        Authorization: `Bearer ${accessToken}`,
      };
      const init: RequestInit = {
        method,
        credentials: 'include',
        headers,
      };
      // Only send a JSON body (and its Content-Type) when one was provided.
      if (body !== undefined) {
        headers['Content-Type'] = 'application/json';
        init.body = JSON.stringify(body);
      }
      const response = await fetch(urlPath, init);
      const text = await response.text();
      let json: unknown = null;
      try {
        json = text ? JSON.parse(text) : null;
      } catch {
        json = null;
      }
      return { ok: response.ok, status: response.status, text, json };
    },
    {
      accessToken: params.token,
      body: params.body,
      method: params.method ?? 'GET',
      urlPath: params.path,
    },
  );
  if (!result.ok) {
    throw new Error(
      `Expected ${params.method ?? 'GET'} ${params.path} to return 2xx, got ${result.status}: ${result.text}`,
    );
  }
  return result.json as T;
}

/** GET shorthand for `requestJson`. */
export async function fetchJson<T>(page: Page, path: string, token: string): Promise<T> {
  return requestJson<T>(page, { path, token });
}