LibreChat/api/server/routes/files/preview.spec.js
Danny Avila 9dd062e42e
🧯 fix: Harden Data Retention Semantics (#13049)
* feat: support data retention for normal chats

Add retentionMode config variable supporting "all" and "temporary" values.
When "all" is set, data retention applies to all chats, not just temporary ones.
Adds isTemporary field to conversations for proper filtering.

Adapted to new TS method files in packages/data-schemas since upstream
moved models out of api/models/.

Based on danny-avila/LibreChat#10532

Co-Authored-By: WhammyLeaf <233105313+WhammyLeaf@users.noreply.github.com>
(cherry picked from commit 30109e90b0)

* feat: extend data retention to files, tool calls, and shared links

Add expiredAt field and TTL indexes to file, toolCall, and share schemas.
Set expiredAt on tool calls, shared links, and file uploads when
retentionMode is "all" or chat is temporary.

(cherry picked from commit 48973752d3)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: lint/test

(cherry picked from commit 310c514e6a)

* fix: address code review feedback for data retention PR

Critical:
- Fix BookmarkMenu crash: restore optional chaining on conversation
- Fix migration hazard: backward-compatible sidebar filter that also
  checks expiredAt for documents without isTemporary field

Major:
- Add logging to getRetentionExpiry error path, align with tools.js
- Add tests for retentionMode: ALL in saveConvo and saveMessage
- Fix share route: apply expiredAt for temporary chats too by
  querying the conversation's isTemporary flag server-side
- Add assertions for getRetentionExpiry mocks in process tests

Minor:
- Fix ChatRoute isTemporaryChat to be strictly boolean via Boolean()
- Fix stale test description (expired -> temporary)
- Comment out retentionMode default in example yaml
- Simplify verbose if/else to isTemporary === true
- Add compound index on { user: 1, isTemporary: 1 }
- Remove narrating comment from process.spec.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
(cherry picked from commit 6bad535f90)

* chore: fix typescript

(cherry picked from commit 826527a46b)

* fix: lint

(cherry picked from commit 77817e80ea)

* fix: use mockSanitizeArtifactPath in retention test

The 'getRetentionExpiry is called with the request object' test
referenced an undefined `mockSanitizeFilename` identifier, breaking
both lint (no-undef) and the test suite. Use the existing
`mockSanitizeArtifactPath` mock that the surrounding tests already
use, since `processCodeOutput` calls `sanitizeArtifactPath` (not
`sanitizeFilename`) before invoking `getRetentionExpiry`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 52ea2da66d)

* fix: forward isTemporary from client for retention on file uploads and tool calls

Server-side `getRetentionExpiry` (file uploads) and the tool-call
controller both read `req.body.isTemporary`, but the file upload
multipart form and the tool-call payload did not include that field.
In `retentionMode: temporary` (default), files uploaded and tool
calls created from temporary chats were therefore retained
indefinitely.

Forward the Recoil `isTemporary` flag in both client paths so the
existing server checks can fire correctly. `ToolParams` gains an
optional `isTemporary` field.

Addresses Codex P1 review feedback on PR #29.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7e937df05a)

* test: stub store.isTemporary in useFileHandling test mocks

Previous commit added `useRecoilValue(store.isTemporary)` to the
hook. The test file mocks `~/store` with only `ephemeralAgentByConvoId`
and does not stub `useRecoilValue`, so all 7 cases threw
"Invalid argument to useRecoilValue: expected an atom or selector but
got undefined". Add a stub default export with `isTemporary` and a
`useRecoilValue` mock returning `false`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit eb1609537d)

* fix: harden data retention semantics

* fix: provide sweep request context for expired files

* fix: preserve temporary flags in all-retention updates

* fix: honor assistant versions in retention sweeps

* fix: retain non-temporary flags in all mode

* fix: hide expired retained records

* fix: propagate retained conversation expiry

* fix: refresh meili retention cutoff

* fix: prevent overlapping file sweeps

* fix: show legacy retained conversations

* fix: index legacy retained records

* fix: harden retention cleanup edge cases

* fix: count failed file storage sweeps

* fix: preserve legacy temporary retention

* fix: assign retention sweep worker deterministically

* fix: hide expired shared links on reads

* fix: prevent retention refresh after parent expiry

* fix: break code output retention import cycle

* fix: harden retention review findings

* fix: ignore expired share duplicates

* fix: reject expired retained share creation

* fix: harden retention review edge cases

* fix: address retention audit findings

* fix: enforce expired conversation shares in all retention

* fix: scope temporary upload flag to chat files

* fix: address retention review findings

* fix: address codex retention review findings

* fix: tighten missing storage detection

* test: remove unused file process spec bindings

---------

Co-authored-by: WhammyLeaf <233105313+WhammyLeaf@users.noreply.github.com>
Co-authored-by: Aron Gates <aron@muonspace.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-19 21:58:42 -04:00

396 lines
14 KiB
JavaScript

/**
 * Coverage for the new GET /files/:file_id/preview endpoint.
 *
 * Deferred-preview code-execution flow: the immediate persist step
 * emits a file record at `status: 'pending'`; the background render
 * transitions it to `'ready'` (with text) or `'failed'` (with
 * previewError). The frontend polls this endpoint until status is
 * terminal. This suite asserts the response shape across all four
 * states (pending, ready, failed, legacy/back-compat) and the auth
 * boundary (404 vs 403).
 */
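/**
 * Illustrative sketch only (not the route handler under test): one way the
 * four response shapes described above could be derived from a file record.
 * `sketchPreviewPayload` and its parameter names are hypothetical; the field
 * names (`status`, `text`, `textFormat`, `previewError`) come from the
 * assertions in this suite.
 */
function sketchPreviewPayload(record, textRecord) {
  /* Legacy records carry no `status`; the suite expects them to read as ready. */
  const status = record.status ?? 'ready';
  const payload = { file_id: record.file_id, status };
  if (status === 'failed' && record.previewError != null) {
    payload.previewError = record.previewError;
  }
  /* Text is attached only for ready records, and only when it exists. */
  if (status === 'ready' && textRecord != null && textRecord.text != null) {
    payload.text = textRecord.text;
    if (textRecord.textFormat != null) {
      payload.textFormat = textRecord.textFormat;
    }
  }
  return payload;
}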
jest.mock('@librechat/data-schemas', () => ({
  logger: { warn: jest.fn(), debug: jest.fn(), error: jest.fn(), info: jest.fn() },
  SystemCapabilities: {},
}));
jest.mock('@librechat/api', () => ({
  refreshS3FileUrls: jest.fn(),
  resolveUploadErrorMessage: jest.fn(),
  verifyAgentUploadPermission: jest.fn(),
}));

const mockFindFileById = jest.fn();
const mockGetFiles = jest.fn();
const mockUpdateFile = jest.fn();
const mockGetAgents = jest.fn().mockResolvedValue([]);
jest.mock('~/models', () => ({
  findFileById: (...args) => mockFindFileById(...args),
  getFiles: (...args) => mockGetFiles(...args),
  updateFile: (...args) => mockUpdateFile(...args),
  getAgents: (...args) => mockGetAgents(...args),
  batchUpdateFiles: jest.fn(),
}));
jest.mock('~/server/services/Files/process', () => ({
  filterFile: jest.fn(),
  processFileUpload: jest.fn(),
  processDeleteRequest: jest.fn().mockResolvedValue({ deletedFileIds: [], failedFileIds: [] }),
  processAgentFileUpload: jest.fn(),
}));
jest.mock('~/server/services/Files/strategies', () => ({
  getStrategyFunctions: jest.fn(() => ({})),
}));
jest.mock('~/server/controllers/assistants/helpers', () => ({
  getOpenAIClient: jest.fn(),
}));
jest.mock('~/server/middleware/roles/capabilities', () => ({
  hasCapability: jest.fn(() => (_req, _res, next) => next()),
}));
jest.mock('~/server/services/PermissionService', () => ({
  checkPermission: jest.fn(() => (_req, _res, next) => next()),
  getEffectivePermissions: jest.fn().mockResolvedValue(0),
}));
jest.mock('~/server/services/Files', () => ({
  hasAccessToFilesViaAgent: jest.fn(),
}));
jest.mock('~/server/utils/files', () => ({
  cleanFileName: (name) => name,
  getContentDisposition: (name) => `attachment; filename="${name}"`,
}));
jest.mock('~/cache', () => ({
  getLogStores: jest.fn(() => ({ get: jest.fn(), set: jest.fn() })),
}));
const express = require('express');
const request = require('supertest');
const filesRouter = require('./files');

/**
 * Mount the router with a per-request user injector so we can simulate
 * a logged-in user without spinning up the full auth stack.
 */
function buildApp({ user = { id: 'user-123', role: 'user' } } = {}) {
  const app = express();
  app.use(express.json());
  app.use((req, _res, next) => {
    req.user = user;
    req.config = { fileStrategy: 'local' };
    next();
  });
  app.use('/files', filesRouter);
  return app;
}

const OWNER_USER_ID = 'user-123';
describe('GET /files/:file_id/preview', () => {
  beforeEach(() => {
    mockFindFileById.mockReset();
    mockGetFiles.mockReset();
    mockUpdateFile.mockReset();
    mockGetAgents.mockReset();
    mockGetAgents.mockResolvedValue([]);
  });

  it('returns 404 when the file does not exist (auth check fails first via fileAccess)', async () => {
    /* `fileAccess` middleware does its own getFiles lookup and returns
     * 404 before our handler ever runs. This test asserts the boundary
     * lives there, not that the handler duplicates the check. */
    mockGetFiles.mockResolvedValueOnce([]);
    const res = await request(buildApp()).get('/files/missing-id/preview');
    expect(res.status).toBe(404);
    expect(res.body).toMatchObject({ error: 'Not Found' });
    expect(mockFindFileById).not.toHaveBeenCalled();
  });

  it('returns 403 when the requester does not own the file and has no agent-based access', async () => {
    /* fileAccess returns 403 — the file exists but belongs to someone
     * else and no agent grants access. The preview handler should
     * never run. */
    mockGetFiles.mockResolvedValueOnce([
      { file_id: 'someone-elses', user: 'other-user', filename: 'x.xlsx' },
    ]);
    const res = await request(buildApp()).get('/files/someone-elses/preview');
    expect(res.status).toBe(403);
    expect(mockFindFileById).not.toHaveBeenCalled();
  });

  it('does not disclose preview text through an attacker-authored agent file reference', async () => {
    mockGetFiles.mockResolvedValueOnce([
      { file_id: 'victim-file', user: 'victim-user', filename: 'secret.xlsx', status: 'ready' },
    ]);
    mockGetAgents.mockResolvedValueOnce([
      {
        id: 'agent-attacker',
        author: 'attacker-user',
        tool_resources: { execute_code: { file_ids: ['victim-file'] } },
      },
    ]);
    const res = await request(buildApp({ user: { id: 'attacker-user', role: 'user' } })).get(
      '/files/victim-file/preview',
    );
    expect(res.status).toBe(403);
    expect(mockFindFileById).not.toHaveBeenCalled();
  });

  it('returns status:pending without text/textFormat while the deferred render is in flight', async () => {
    mockGetFiles.mockResolvedValueOnce([
      {
        file_id: 'fid-pending',
        user: OWNER_USER_ID,
        filename: 'data.xlsx',
        status: 'pending',
      },
    ]);
    const res = await request(buildApp()).get('/files/fid-pending/preview');
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ file_id: 'fid-pending', status: 'pending' });
    /* Pending must NOT leak `text` and must NOT trigger the text re-fetch. */
    expect(res.body).not.toHaveProperty('text');
    expect(mockFindFileById).not.toHaveBeenCalled();
  });

  it('returns status:ready with text + textFormat when the deferred render succeeded', async () => {
    mockGetFiles.mockResolvedValueOnce([
      { file_id: 'fid-ready', user: OWNER_USER_ID, filename: 'data.xlsx', status: 'ready' },
    ]);
    /* Text is fetched only on the terminal ready response. */
    mockFindFileById.mockResolvedValueOnce({
      file_id: 'fid-ready',
      text: '<table><tr><td>1</td></tr></table>',
      textFormat: 'html',
    });
    const res = await request(buildApp()).get('/files/fid-ready/preview');
    expect(res.status).toBe(200);
    expect(res.body).toEqual({
      file_id: 'fid-ready',
      status: 'ready',
      text: '<table><tr><td>1</td></tr></table>',
      textFormat: 'html',
    });
  });

  it('returns status:failed with previewError when the deferred render errored', async () => {
    mockGetFiles.mockResolvedValueOnce([
      {
        file_id: 'fid-failed',
        user: OWNER_USER_ID,
        filename: 'data.xlsx',
        status: 'failed',
        previewError: 'parser-error',
      },
    ]);
    const res = await request(buildApp()).get('/files/fid-failed/preview');
    expect(res.status).toBe(200);
    expect(res.body).toEqual({
      file_id: 'fid-failed',
      status: 'failed',
      previewError: 'parser-error',
    });
    expect(mockFindFileById).not.toHaveBeenCalled();
  });

  it('defaults to status:ready for legacy records with no status field (back-compat)', async () => {
    mockGetFiles.mockResolvedValueOnce([
      {
        file_id: 'fid-legacy',
        user: OWNER_USER_ID,
        filename: 'old.csv',
        // status intentionally absent
      },
    ]);
    mockFindFileById.mockResolvedValueOnce({
      file_id: 'fid-legacy',
      text: 'csv,header\n1,2',
      textFormat: 'text',
    });
    const res = await request(buildApp()).get('/files/fid-legacy/preview');
    expect(res.status).toBe(200);
    expect(res.body).toEqual({
      file_id: 'fid-legacy',
      status: 'ready',
      text: 'csv,header\n1,2',
      textFormat: 'text',
    });
  });

  it('returns status:ready with no text when the record is ready but text is null (binary/oversized)', async () => {
    mockGetFiles.mockResolvedValueOnce([
      { file_id: 'fid-binary', user: OWNER_USER_ID, filename: 'image.bin' },
    ]);
    mockFindFileById.mockResolvedValueOnce({
      file_id: 'fid-binary',
      text: null,
      textFormat: null,
    });
    const res = await request(buildApp()).get('/files/fid-binary/preview');
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ file_id: 'fid-binary', status: 'ready' });
  });

  it('returns ready with no text when ready record was deleted between fileAccess and text fetch', async () => {
    /* `fileAccess` saw the record but the concurrent delete removed it
     * before the text fetch. Surface ready-without-text rather than
     * 500 — the client routes to download-only and stops polling. */
    mockGetFiles.mockResolvedValueOnce([
      { file_id: 'fid-race', user: OWNER_USER_ID, filename: 'data.xlsx', status: 'ready' },
    ]);
    mockFindFileById.mockResolvedValueOnce(null);
    const res = await request(buildApp()).get('/files/fid-race/preview');
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ file_id: 'fid-race', status: 'ready' });
  });

  it('returns 500 with a stable shape if the text fetch throws unexpectedly', async () => {
    mockGetFiles.mockResolvedValueOnce([
      { file_id: 'fid-boom', user: OWNER_USER_ID, filename: 'data.xlsx', status: 'ready' },
    ]);
    mockFindFileById.mockRejectedValueOnce(new Error('mongo down'));
    const res = await request(buildApp()).get('/files/fid-boom/preview');
    expect(res.status).toBe(500);
    expect(res.body).toMatchObject({ error: 'Internal Server Error' });
  });

  describe('lazy sweep for stale pending records', () => {
    /* The boot-time `sweepOrphanedPreviews` only runs once at startup
     * with a 5-min cutoff. A backend crash + quick restart can leave
     * `pending` records younger than 5 min that never get touched
     * again. This endpoint sweeps them on the spot whenever a polling
     * request lands on one — the user is exactly the consumer who
     * cares, so on-demand sweep is the right shape. (Codex P2 review
     * on PR #12957.) */
    const STALE_MS = 6 * 60 * 1000;
    const FRESH_MS = 30 * 1000;
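
    /* Illustrative sketch only (not the handler under test): the lazy sweep
     * described above, written as a standalone function. `sketchLazySweep`,
     * `SKETCH_CUTOFF_MS`, and the injected `updateFile` parameter are
     * hypothetical names. The 2-minute cutoff is an assumption taken from
     * the test titles in this block, and the (update, filter) argument
     * order mirrors the `toHaveBeenCalledWith` assertion in the first test. */
    const SKETCH_CUTOFF_MS = 2 * 60 * 1000;
    async function sketchLazySweep(record, updateFile, now = Date.now()) {
      const isStalePending =
        record.status === 'pending' &&
        record.updatedAt instanceof Date &&
        now - record.updatedAt.getTime() > SKETCH_CUTOFF_MS;
      if (!isStalePending) {
        return record;
      }
      /* Conditional update: the filter pins `status` and `updatedAt`, so a
       * render that finished in the meantime makes this a no-op (null). */
      const swept = await updateFile(
        { file_id: record.file_id, status: 'failed', previewError: 'orphaned' },
        { status: 'pending', updatedAt: record.updatedAt },
      );
      /* Losing the race falls through to the record as originally read. */
      return swept ?? record;
    }
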
    it('marks a stale pending record as failed:orphaned and returns the swept state', async () => {
      const updatedAt = new Date(Date.now() - STALE_MS);
      mockGetFiles.mockResolvedValueOnce([
        {
          file_id: 'fid-stale',
          user: OWNER_USER_ID,
          filename: 'data.xlsx',
          status: 'pending',
          updatedAt,
        },
      ]);
      mockUpdateFile.mockResolvedValueOnce({
        file_id: 'fid-stale',
        status: 'failed',
        previewError: 'orphaned',
      });
      const res = await request(buildApp()).get('/files/fid-stale/preview');
      expect(mockUpdateFile).toHaveBeenCalledWith(
        { file_id: 'fid-stale', status: 'failed', previewError: 'orphaned' },
        { status: 'pending', updatedAt },
      );
      expect(res.status).toBe(200);
      expect(res.body).toEqual({
        file_id: 'fid-stale',
        status: 'failed',
        previewError: 'orphaned',
      });
    });

    it('does NOT sweep a fresh pending record (within the cutoff window)', async () => {
      mockGetFiles.mockResolvedValueOnce([
        {
          file_id: 'fid-fresh',
          user: OWNER_USER_ID,
          filename: 'data.xlsx',
          status: 'pending',
          updatedAt: new Date(Date.now() - FRESH_MS),
        },
      ]);
      const res = await request(buildApp()).get('/files/fid-fresh/preview');
      expect(mockUpdateFile).not.toHaveBeenCalled();
      expect(res.status).toBe(200);
      expect(res.body).toEqual({ file_id: 'fid-fresh', status: 'pending' });
    });

    it('sweeps a record past the 2min cutoff but below the 5min boot-sweep threshold', async () => {
      /* Pins the cutoff change from 5min to 2min — without this, a
       * future revert wouldn't fail the suite. */
      const updatedAt = new Date(Date.now() - 3 * 60 * 1000);
      mockGetFiles.mockResolvedValueOnce([
        {
          file_id: 'fid-mid',
          user: OWNER_USER_ID,
          filename: 'data.xlsx',
          status: 'pending',
          updatedAt,
        },
      ]);
      mockUpdateFile.mockResolvedValueOnce({
        file_id: 'fid-mid',
        status: 'failed',
        previewError: 'orphaned',
      });
      const res = await request(buildApp()).get('/files/fid-mid/preview');
      expect(mockUpdateFile).toHaveBeenCalled();
      expect(res.body).toEqual({
        file_id: 'fid-mid',
        status: 'failed',
        previewError: 'orphaned',
      });
    });

    it('does NOT sweep a stale ready record (only pending qualifies)', async () => {
      mockGetFiles.mockResolvedValueOnce([
        {
          file_id: 'fid-ready',
          user: OWNER_USER_ID,
          filename: 'data.xlsx',
          status: 'ready',
          updatedAt: new Date(Date.now() - STALE_MS),
        },
      ]);
      mockFindFileById.mockResolvedValueOnce({
        file_id: 'fid-ready',
        text: 'final',
        textFormat: 'html',
      });
      const res = await request(buildApp()).get('/files/fid-ready/preview');
      expect(mockUpdateFile).not.toHaveBeenCalled();
      expect(res.body).toMatchObject({ status: 'ready', text: 'final' });
    });

    it('falls through to the original pending payload if the conditional sweep loses the race', async () => {
      const updatedAt = new Date(Date.now() - STALE_MS);
      mockGetFiles.mockResolvedValueOnce([
        {
          file_id: 'fid-race',
          user: OWNER_USER_ID,
          filename: 'data.xlsx',
          status: 'pending',
          updatedAt,
        },
      ]);
      mockUpdateFile.mockResolvedValueOnce(null);
      const res = await request(buildApp()).get('/files/fid-race/preview');
      expect(mockUpdateFile).toHaveBeenCalled();
      expect(res.status).toBe(200);
      expect(res.body).toEqual({ file_id: 'fid-race', status: 'pending' });
    });
  });
});
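
/**
 * Illustrative sketch only: the header comment says the frontend polls this
 * endpoint until the status is terminal. A minimal client-side loop under
 * that description might look like the function below. `sketchPollPreview`,
 * `fetchPreview`, `intervalMs`, `maxAttempts`, and the `timedOut` flag are
 * all hypothetical names, not part of LibreChat's client code.
 */
async function sketchPollPreview(fetchPreview, { intervalMs = 1000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const payload = await fetchPreview();
    /* `ready` and `failed` are terminal; only `pending` keeps polling. */
    if (payload.status !== 'pending') {
      return payload;
    }
    /* Wait between attempts, but not after the final one. */
    if (attempt < maxAttempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  /* Give up without inventing a server state; the caller decides what to show. */
  return { status: 'pending', timedOut: true };
}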