mirror of https://github.com/danny-avila/LibreChat.git (synced 2026-06-28 18:31:24 +00:00)
* 🛡️ feat: Audit log backend for SystemGrants changes
Add an AuditLog Mongoose collection that records every grant assign/revoke as an append-only entry capturing the actor, target principal, capability, timestamp, and tenant scope. Wire the entry-write into the existing admin assignGrant and revokeGrant handlers so the admin panel's audit-log tab populates as grants happen.
The data-schemas package gains the IAuditLog type, a Mongoose schema with tenant + target compound indexes for keyset pagination, a model factory wired through createModels, and an AuditLog methods factory exposing recordAuditEntry, listAuditLogPage (cursor-paginated, faceted, search-aware), findAuditLogEntry, and streamAuditLogEntries.
The packages/api admin layer adds createAdminAuditLogHandlers with three handlers backing the routes the admin panel already consumes: GET /api/admin/audit-log returns paginated entries, GET /api/admin/audit-log/:id returns a single entry for the permalink drawer, and GET /api/admin/audit-log/export.csv streams CSV with formula-injection defang plus UTF-8 BOM.
The Express layer mounts the new router at /api/admin/audit-log behind requireJwtAuth and the ACCESS_ADMIN capability, matching the existing admin route pattern. An audit emission failure is logged via logger.error but never rolls back the grant.
* 🧹 chore: Audit log backend cleanup — offset pagination, name-based filters, type tightening
Switch listAuditLogPage from cursor-based to offset-based pagination with skip().limit() + parallel countDocuments, returning { entries, total } instead of { entries, nextCursor }; the cursor encode and decode helpers are no longer needed and have been removed.
Interpret the actorId and targetPrincipalId filter parameters as case-insensitive partial regex against the denormalized actorName and targetName fields rather than exact-match against the underlying ObjectId. Admin panel users naturally filter by human name, not by Mongo identifier.
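A minimal sketch of how a partial, case-insensitive name filter of this kind might be built. `escapeRegex` and `buildNameFilter` are hypothetical names chosen for illustration and are not taken from the repository. The escaping step matters because the query text comes from the request and would otherwise be interpreted as a pattern.

```javascript
// Hypothetical helpers; the repository's own implementation may differ.

/** Escape regex metacharacters so user input is matched literally. */
function escapeRegex(input) {
  return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

/** Build a MongoDB-style filter for a case-insensitive substring match. */
function buildNameFilter(field, query) {
  const trimmed = typeof query === 'string' ? query.trim() : '';
  if (trimmed.length === 0) {
    // No usable query: contribute nothing to the overall filter.
    return {};
  }
  return { [field]: { $regex: escapeRegex(trimmed), $options: 'i' } };
}

console.log(buildNameFilter('actorName', 'a.b (admin)'));
```

A substring regex like this cannot be anchored to an index prefix, so it is best combined with a tenant scope or date window.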
Replace the broad Record<string, unknown> casts on req.query with a typed AuditLogQuery shape, drop two unused exported types and the now-unused mongoose Types import, and fix the streamAuditLogEntries Omit literal to match the interface and the offset-based design.
* 🛠️ fix: Address audit log review feedback (CI typecheck, ISO offsets, no-op revoke, deps surface, schema, backpressure, tests)
Resolve the duplicate AuditAction export that broke the data-schemas TypeScript check by importing the canonical declaration from types/admin instead of re-declaring it in types/auditLog.
Accept timezone-offset ISO 8601 timestamps such as 2026-05-01T09:30:00+02:00 in the from and to filter params and reject local-time strings without a zone so every request resolves to an unambiguous instant.
Skip the audit emission on no-op revokes: revokeCapability now returns deletedCount so the admin handler can omit the grant_removed entry when the target grant did not exist, keeping the audit trail factually accurate. Mocks in the existing grants.spec.ts updated to the new return shape.
Drop the required recordAuditEntry from AdminAuditLogDeps since the audit-log handler factory never consumes it; the grants handler factory keeps its optional dep for the write path.
Tighten the tenantId validator on the audit log schema to require a non-empty trimmed string, and rewrite the listing-index comment to describe deterministic offset sort instead of keyset pagination.
Stream the CSV export with explicit backpressure (await drain when res.write returns false) and abort on client disconnect so a cancelled download no longer pins a Mongo cursor or buffers unbounded data in memory.
Add packages/data-schemas/src/methods/auditLog.spec.ts covering tenant and platform scoping, single and multi action filtering, partial-name filtering for actor and capability, the createdAt window, offset pagination with total, ObjectId and date stringification on the wire, regex-metacharacter escape, and streaming completeness.
* 🛠️ fix: Address P1 audit-log review findings (cursor cancel, drain race, filter naming, type dedupe, tenant scope, log enrichment)
The CSV stream handler kept draining Mongo batches after the client
disconnected because the `for await` loop only honored its abort flag
inside `onEntry`. Thread an `isCancelled` callback into
`streamAuditLogEntries` so the methods layer closes the cursor as soon
as the handler sees `close`/`aborted`; a `finally` block guarantees
release on throw. The drain promise in `writeChunk` now races against
the response's `close` event so a destroyed socket cannot strand the
handler on a `drain` that will never fire.
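The drain-versus-close race can be sketched roughly as follows. The name `writeChunk` comes from the message above, but this body is an assumption about its shape, not the repository's code. It resolves to `true` when writing may continue and `false` when the response closed first.

```javascript
// Sketch only: `res` is any Node.js writable (for example an HTTP response).
function writeChunk(res, chunk) {
  if (res.destroyed) {
    // The socket is already gone; neither 'drain' nor 'close' would fire again.
    return Promise.resolve(false);
  }
  if (res.write(chunk)) {
    // Internal buffer has room; no need to wait.
    return Promise.resolve(true);
  }
  return new Promise((resolve) => {
    const cleanup = () => {
      res.off('drain', onDrain);
      res.off('close', onClose);
    };
    const onDrain = () => {
      cleanup();
      resolve(true);
    };
    const onClose = () => {
      cleanup();
      resolve(false);
    };
    // Whichever event fires first settles the promise; the other listener is removed.
    res.once('drain', onDrain);
    res.once('close', onClose);
  });
}
```

A caller would stop iterating its cursor as soon as `writeChunk` resolves to `false`.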
The HTTP filter keys `actorId` and `targetPrincipalId` always did
case-insensitive substring matches on the denormalized `actorName` /
`targetName` columns, never on ObjectIds — a client passing a real id
silently got zero rows. Renamed the wire-level keys to `actorQuery` /
`targetQuery` (matching what the matcher actually does) and kept the
old names as deprecated aliases for one release so the sibling
admin-panel PR can migrate without breaking; each legacy use logs a
deprecation warning. Renamed the corresponding fields in
`AuditLogFilters` too.
`AdminAuditLogEntryWire` duplicated `AdminAuditLogEntry` from
`types/admin.ts` field-for-field, violating the no-duplicate-types
rule. Deleted the duplicate, hoisted `AuditLogPage`,
`RecordAuditEntryInput`, and `AuditLogFilters` from
`methods/auditLog.ts` into `types/auditLog.ts`, and updated the
handler, method factory, and re-exports accordingly.
`tenantFilter` treated `''` as a valid tenant scope, producing a
`{ tenantId: '' }` query that silently returned nothing while the
schema validator rejected `''` on writes. Switched to a strict
`typeof tenantId === 'string' && tenantId.trim().length > 0` check so
reads agree with writes, with new spec coverage for empty and
whitespace-only inputs.
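A small sketch of the strict check, assuming the platform-level scope is expressed with the `{ tenantId: { $exists: false } }` convention mentioned later in this message. `hasTenant` is a hypothetical helper name, and the function body is illustrative only.

```javascript
/** True only for a non-empty, non-whitespace string. */
function hasTenant(tenantId) {
  return typeof tenantId === 'string' && tenantId.trim().length > 0;
}

/** Build the tenant portion of a read filter. */
function tenantFilter(tenantId) {
  // Blank, whitespace-only, or missing values fall back to platform scope
  // instead of producing a `{ tenantId: '' }` query that matches nothing.
  return hasTenant(tenantId) ? { tenantId } : { tenantId: { $exists: false } };
}

console.log(tenantFilter('tenant-a'), tenantFilter('   '));
```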
Audit-write failures now log the full forensic payload (action,
capability, tenantId, actorId, target metadata) inside a single meta
object so winston's standard signature surfaces it correctly; a comment
on the catch block explains why the failure mode stays silent (it must
never block a privileged operation).
Stronger filter parsing: invalid `action` values and unknown
`targetPrincipalType` now return 400 instead of silently dropping.
Extracted `MAX_LIMIT` to a constant. Replaced the
`Record<string, Date>` cast in `buildFilter` with a typed local.
Switched the stream cursor to `lean<IAuditLog[]>()` and removed the
`as IAuditLog` cast inside the loop.
* ✅ test: Cover admin audit-log handler with unit tests for auth, validation, tenant isolation, CSV output, and abort
The sibling admin handlers (grants, groups, roles, users) all have
handler specs; this one was missing. The new suite covers 401 on a
missing `req.user`, 400 on malformed ISO `from` / `to`, 400 on
limit > 500, 400 on negative offset, 400 on an unknown action or
`targetPrincipalType`, 400 on a non-ObjectId `:id`, 404 when the
methods layer returns null, that the caller's `tenantId` (not a
forged query-string `tenantId`) is the one passed to the methods
layer, that `actorQuery` / `targetQuery` round-trip, that the
deprecated `actorId` / `targetPrincipalId` aliases still map through,
that the CSV stream emits the BOM as the first chunk with CRLF line
endings and the expected header labels, that quotes, commas, and
newlines are properly escaped, that the formula-injection prefixes
(`=` `+` `-` `@` tab CR) are defanged, that an `isCancelled` callback
reaches the methods layer and flips to true on client `close`, and
that `res.end` is skipped when the client disconnected mid-stream.
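The CSV behaviour these tests describe can be illustrated with a short sketch. The helper names (`csvCell`, `csvRow`, `csvHeader`) and the choice of a leading single quote as the defang prefix are assumptions for illustration; the handler may neutralize formula cells differently.

```javascript
const CSV_BOM = '\uFEFF';
// Leading characters that spreadsheet applications may treat as a formula.
const FORMULA_PREFIXES = ['=', '+', '-', '@', '\t', '\r'];

/** Render one cell: defang formulas first, then apply CSV quoting. */
function csvCell(value) {
  let text = value == null ? '' : String(value);
  if (FORMULA_PREFIXES.some((prefix) => text.startsWith(prefix))) {
    // Assumed defang strategy: prefix with a single quote so the cell is read as text.
    text = `'${text}`;
  }
  if (/[",\r\n]/.test(text)) {
    // Quote the cell and double any embedded quotes.
    text = `"${text.replace(/"/g, '""')}"`;
  }
  return text;
}

/** Join cells into one CRLF-terminated row. */
function csvRow(cells) {
  return cells.map(csvCell).join(',') + '\r\n';
}

/** First chunk of the export: BOM followed by the header row. */
function csvHeader(labels) {
  return CSV_BOM + csvRow(labels);
}
```

Note that defanging a leading `-` also affects negative numbers, which is usually acceptable for an audit export where cells are names and identifiers.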
* 🛡️ feat: Enforce append-only AuditLog at the schema level
Every field is now marked `immutable: true`, and pre-hooks on the
schema reject `updateOne`, `updateMany`, `findOneAndUpdate`,
`findOneAndReplace`, `replaceOne`, `deleteOne`, `deleteMany`,
`findOneAndDelete`, plus any `save()` against an existing document.
`timestamps` is reduced to `{ createdAt: true, updatedAt: false }`
since a mutable timestamp would imply mutation is allowed, and
`updatedAt` is dropped from `AuditLog` / `IAuditLog`. The methods
spec resets state between tests via the raw driver (`AuditLog.collection.deleteMany`),
which bypasses the pre-hooks; new specs assert that the model-level
update / delete / re-save paths reject with the append-only error and
that `updatedAt` is not stamped on new documents.
* ♻️ refactor: Share MAX_AUDIT_LOG_LIMIT between methods and handler
Renamed the methods-layer constant from the generic `MAX_LIMIT` to
`MAX_AUDIT_LOG_LIMIT`, exported it through `@librechat/data-schemas`,
and consumed it from the handler instead of duplicating `500` there.
Now the limit is single-sourced; bumping it once updates both the
clamp inside `listAuditLogPage` and the 400-error boundary the
handler returns to clients.
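As a rough illustration of one constant serving both layers, the sketch below validates at the handler boundary and clamps in the methods layer. `parsePagination`, `clampLimit`, the default limit of 50, and the rejection of limits below 1 are assumptions; only the value 500 and the constant name come from the message above.

```javascript
const MAX_AUDIT_LOG_LIMIT = 500;
const DEFAULT_LIMIT = 50; // assumed default, not taken from the source

/** Handler side: reject out-of-range values so the caller gets a 400. */
function parsePagination(query) {
  const limit = query.limit === undefined ? DEFAULT_LIMIT : Number(query.limit);
  const offset = query.offset === undefined ? 0 : Number(query.offset);
  if (!Number.isInteger(limit) || limit < 1 || limit > MAX_AUDIT_LOG_LIMIT) {
    return { error: `limit must be an integer between 1 and ${MAX_AUDIT_LOG_LIMIT}` };
  }
  if (!Number.isInteger(offset) || offset < 0) {
    return { error: 'offset must be a non-negative integer' };
  }
  return { limit, offset };
}

/** Methods side: clamp defensively in case a caller skips validation. */
function clampLimit(limit) {
  return Math.min(Math.max(1, limit), MAX_AUDIT_LOG_LIMIT);
}

console.log(parsePagination({ limit: '25', offset: '50' }));
```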
* 🛡️ feat: Gate audit-log routes on a dedicated `READ_AUDIT_LOG` capability
The audit-log routes were gated on `ACCESS_ADMIN`, which conflates "can log
into the admin panel" with "can see who granted what to whom." Anyone with
`ACCESS_ADMIN + READ_CONFIGS` (a config reviewer with no people-management
authority) could read the grant history of every user, group, and role —
information they have no need to know.
`READ_AUDIT_LOG` ('read:audit_log') is now an explicit, separately grantable
read capability with no MANAGE counterpart, matching the append-only nature
of the collection. `seedSystemGrants` iterates `Object.values(SystemCapabilities)`
so existing ADMIN-role seeds pick it up automatically on next startup.
This also makes an "auditor" persona possible: hold `ACCESS_ADMIN + READ_AUDIT_LOG`
without any MANAGE_* grants and you can review history without modifying anything.
* ♻️ refactor: Share AUDIT_ACTIONS, tighten audit dep types, document route order
Exports a runtime AUDIT_ACTIONS array from packages/data-schemas alongside the
AuditAction type so the Mongoose schema enum and the HTTP handler's whitelist
consume one source of truth instead of duplicating the literal pair.
Switches the grants handler's recordAuditEntry dep typing from a duplicated
inline object literal returning Promise<unknown> to the published
RecordAuditEntryInput type returning Promise<void>, and tightens the local
emitAudit args to AuditAction. Replaces the local ParsedFilters interface in
the audit-log handler with Omit<AuditLogFilters, 'offset' | 'limit'> to drop
the duplicate definition.
Drops the optional marker on AuditLog.createdAt. Mongoose always sets it at
insert time, so callers treating it as nullable were guarding against a state
the schema does not produce.
Adds a comment on api/server/routes/admin/audit.js noting that /export.csv
must precede /:id so a future contributor does not accidentally reorder them
into a 404 trap.
* 🛡️ feat: Resolve audit names without extra DB round-trips
For the actor name, JWT-authenticated `req.user` already carries `name`,
`username`, and `email`. `resolveUser` now derives the actor display name
from `req.user` directly and threads it through the caller context, so
every grant assign and revoke no longer triggers a separate `getUserById`
lookup.
For the target name, replaces the previous always-store-the-principalId
behavior (which buried opaque ObjectId strings in immutable audit rows
for USER and GROUP targets) with a `resolveTargetName` dep. ROLE
principals continue to use `principalId` directly because the SystemGrant
model stores role names there. USER and GROUP principals route through
the new dep, which in `api/server/routes/admin/grants.js` calls
`db.getUserById` or `db.findGroupById` respectively and falls back to
the principalId on miss or error so the audit row stays intelligible.
Drops the misleading "display name lookup happens in a later iteration"
comment.
* ✅ test: Cover audit emission, scope emitAudit to today's ROLE-only surface
Fixes a misleading test that claimed to verify "idempotent even if the grant
does not exist" while mocking deletedCount: 1 (the grant DID exist). Replaces
it with the actual no-op scenario (deletedCount: 0) and adds an assertion
that recordAuditEntry is NOT called, since the whole point of the
deletedCount > 0 gate is to avoid fictitious revocation rows.
Adds a dedicated audit emission describe block covering: grant_assigned
emission with the actor name resolved from req.user, grant_removed
emission when deletedCount is positive, and the no-emission fallback when
recordAuditEntry is not configured. The actor-name assertions exercise the
name / username / email fallback chain in resolveUser.
The previous commit also added a `resolveTargetName` dep and an
emitAudit branch for USER/GROUP targets. The grants surface is ROLE-only
today (MANAGE_CAPABILITY_BY_TYPE has only PrincipalType.ROLE), so that
code path is unreachable from the handler. Removed the dep and the
branch; the audit row uses principalId as the target name, which is the
human-readable role name for ROLE principals. A comment in emitAudit
flags where to plumb resolveTargetName back in once USER and GROUP
grants are enabled.
* 🛠️ fix: Inclusive `to` date filter and reject inverted ranges
A `?to=2025-01-15` filter previously stopped at midnight UTC of that
day, silently excluding everything that happened on January 15. The
`parseIsoDate` helper now widens a bare `YYYY-MM-DD` to 23:59:59.999Z
when called with the `end` boundary. Full ISO timestamps are honored
exactly, so callers that want minute-precision can still get it.
Also rejects inverted ranges (`from` later than `to`) with a 400 so
operators see a clear error instead of a silent empty result.
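A sketch of the boundary handling, under stated assumptions: the two-argument signature for `parseIsoDate`, the `parseRange` wrapper, and the regular expressions are illustrative and may not match the repository. This version does not validate calendar days, so an out-of-range day is not rejected here.

```javascript
const DATE_ONLY = /^\d{4}-\d{2}-\d{2}$/;
// Full timestamps must carry an explicit zone (Z or a numeric offset).
const FULL_ISO = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,3})?(Z|[+-]\d{2}:\d{2})$/;

/** Parse a filter value; `boundary` is 'start' or 'end'. Returns null when invalid. */
function parseIsoDate(raw, boundary) {
  let date = null;
  if (DATE_ONLY.test(raw)) {
    // A bare date covers the whole UTC day when used as the end boundary.
    const suffix = boundary === 'end' ? 'T23:59:59.999Z' : 'T00:00:00.000Z';
    date = new Date(raw + suffix);
  } else if (FULL_ISO.test(raw)) {
    // Full timestamps are honored exactly.
    date = new Date(raw);
  }
  return date && !Number.isNaN(date.getTime()) ? date : null;
}

/** Parse both ends and reject inverted ranges. */
function parseRange(fromRaw, toRaw) {
  const from = fromRaw ? parseIsoDate(fromRaw, 'start') : undefined;
  const to = toRaw ? parseIsoDate(toRaw, 'end') : undefined;
  if (from === null || to === null) {
    return { error: 'invalid ISO 8601 date' };
  }
  if (from && to && from.getTime() > to.getTime()) {
    return { error: '`from` must not be later than `to`' };
  }
  return { from, to };
}

console.log(parseRange('2025-01-01', '2025-01-15'));
```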
* 🛡️ feat: Cap audit-log CSV exports at 100k rows; cover stream error path
Introduces MAX_AUDIT_EXPORT_ROWS (100k) and threads a `maxRows` option
through streamAuditLogEntries. The handler now passes the cap into the
stream so a careless admin script or a hostile auditor cannot pin a
Node worker and a Mongo cursor by exporting unbounded result sets.
Beyond 100k rows, callers should slice exports by from / to date.
Adds a methods-layer spec for the cap behavior, a handler-layer spec
that asserts the option is plumbed through, and a handler-layer spec
that exercises the streamAuditLogEntries-throws-after-headers-sent path
(catch block falls through to res.end instead of attempting JSON).
Documents on buildFilter that case-insensitive substring regex filters
(actorName, targetName, capability, search) cannot use a B-tree index
and degrade to a tenant-scoped partition scan, so deployments with
hundreds of thousands of audit rows per tenant should constrain those
queries with a date window.
* 🧹 chore: Spell CSV_BOM as `\uFEFF` and drop a gratuitous optional chain
`revokeCapability` is typed `Promise<{ deletedCount: number }>` so the
`?.` on `revokeResult?.deletedCount` only obscured that the value cannot
be nullish.
`CSV_BOM` was a literal U+FEFF character invisible in most editors. Now
spelled as the Unicode escape so readers can see the constant; the test
that asserts on the first emitted chunk uses the same escape.
* 🔧 chore: Allowlist AuditLog in the tenant-isolation coverage guard
The AuditLog collection carries a tenantId field but scopes tenancy manually
inside listAuditLogPage / streamAuditLogEntries / recordAuditEntry using the
same $exists: false convention as SystemGrant. The tenant-isolation plugin
coverage spec now allows that and asserts it stays accurate.
* 🛠️ fix: Normalize blank tenantId before persisting audit entries
The `recordAuditEntry` write path was treating any non-null tenantId as a
real string, so empty or whitespace-only values reached the schema validator,
failed the non-empty-string check, and silently dropped the audit row. The
read-side `tenantFilter` already treats those values as platform-level scope,
so the write path now mirrors it: blank or whitespace-only tenantId becomes
an omitted field, which matches `{ tenantId: { $exists: false } }` queries
and clears validation. Added a regression test that records two entries with
blank and whitespace tenantId and asserts both persist with the tenantId
field absent.
* 🎨 style: collapse expect.objectContaining onto one line to satisfy prettier
* 🔒 fix: block document-level deleteOne/updateOne on AuditLog
Mongoose registers deleteOne and updateOne pre-hooks as query middleware
by default. The query-level append-only block on AuditLog therefore did
not cover Document.prototype.deleteOne() or Document.prototype.updateOne(),
leaving a path where a caller that had already loaded an audit row via
findOne could call .deleteOne() or .updateOne() on the instance and bypass
the schema contract.
Explicit { document: true, query: false } registrations close the holes,
and the spec now covers both code paths against a real in-memory Mongo.
* 🔒 fix: require ACCESS_ADMIN on audit-log routes
Every other admin router (config, grants, users, roles, groups, auth)
enforces requireJwtAuth followed by requireCapability(ACCESS_ADMIN) before
any feature-specific capability check. The audit-log router only required
READ_AUDIT_LOG, which is independent of ACCESS_ADMIN in CapabilityImplications,
so a role delegated only READ_AUDIT_LOG without ACCESS_ADMIN could read or
CSV-export the audit trail and bypass the admin boundary.
Aligned the middleware chain with the rest of the admin surface so
ACCESS_ADMIN gates entry and READ_AUDIT_LOG gates the feature within it.
* 🎨 chore: re-sort imports after dev rebase
Post-rebase sort-imports against the merge target — six audit-log files
landed with stale import ordering relative to the current scripts/sort-imports.mts
rules on dev. CI's import-order job flagged the drift; running the script
locally rewrites them in place. No semantic changes.
* 🔧 fix: explicit type annotations on audit-log model + schema exports
Dev migrated packages/data-schemas builds from rollup to tsdown with
--isolatedDeclarations enabled, which requires every exported function to
declare its return type and every exported variable to declare its type.
Two of our audit-log exports got swept up:
TS9007 models/auditLog.ts:12 createAuditLogModel return type
TS9010 schema/auditLog.ts:12 auditLogSchema variable type
Added Model<t.IAuditLog> on the factory and Schema<IAuditLog> on the
schema variable, matching the sibling SystemGrant convention. No runtime
behavior change.
* 🔧 fix: align revokeCapability type annotation with implementation
The rebase auto-merge of systemGrant.ts kept dev's outer type annotation
(`revokeCapability: ... => Promise<void>`) but our implementation returns
`Promise<{ deletedCount: number }>` (added during the bot-review loop to
let the audit emitter distinguish a real revoke from a no-op against a
nonexistent grant). The mismatch surfaced as TS2719 on the methods record
return at line 520. Updated the type annotation to match the impl.
The caller at packages/api/src/admin/grants.ts:444 reads
`revokeResult.deletedCount` to gate the audit emit, so the wider return
type is what the rest of the code already assumes.
* 🔧 fix: explicit factory return type on createAdminAuditLogHandlers
Same tsdown --isolatedDeclarations migration that hit packages/data-schemas
also applies to packages/api; the audit-log handler factory's inferred
return type tripped TS9013 against the new build pipeline. Annotated the
factory with explicit handler signatures matching the sibling
createAdminGrantsHandlers convention. Used Promise<Response | void> for
the export handler because its final res.end() path returns undefined,
unlike the other two handlers which always return a Response.
* 🛡️ feat: Generalize audit log into a tamper-evident, extensible event substrate
Reworks the SystemGrant-only audit log into a general-purpose, append-only
compliance substrate designed to absorb future event classes (agent runs,
tool/MCP calls, config + permission changes, approvals) without reshaping the
record. Nothing has shipped yet, so this replaces the grant-specific wire
shape rather than layering aliases.
Schema / record shape (packages/data-schemas):
- schemaVersion + two-level taxonomy: category + namespaced action
(grant.assigned/grant.removed), first-class outcome and severity.
- Structured actor{type,id,name} supporting non-user actors (system, agent,
service, schedule, webhook, api); generic target{type,id,name}; open
metadata map; request context{requestId,ip,userAgent,sessionId}.
Tamper-evidence (hash chain):
- Per-tenant chain keyed by chainKey with seq/prevHash/hash. Appends link to
the previous hash; a unique {chainKey,seq} index serializes concurrent
writes (dup-key retry) so the chain can never fork. createdAt is explicit so
it's covered by the hash.
- verifyAuditChain() walks a chain and detects modification, deletion, and
forged links; exposed via GET /api/admin/audit-log/verify.
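The chain idea can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: SHA-256 as the hash, an all-zero genesis `prevHash`, the set of hashed fields, and the function names `hashEntry`, `appendEntry`, and `verifyChain`. The real implementation persists entries in Mongo and relies on the unique `{chainKey, seq}` index to serialize writers.

```javascript
const { createHash } = require('node:crypto');

const GENESIS_PREV_HASH = '0'.repeat(64); // assumed sentinel for the first entry

/** Hash a fixed, ordered list of fields so the digest is deterministic. */
function hashEntry(entry) {
  const payload = JSON.stringify([
    entry.chainKey,
    entry.seq,
    entry.prevHash,
    entry.createdAt, // explicit timestamp, so it is covered by the hash
    entry.action,
  ]);
  return createHash('sha256').update(payload).digest('hex');
}

/** Append an entry that links to the previous entry's hash. */
function appendEntry(chain, fields) {
  const prev = chain[chain.length - 1];
  const entry = {
    ...fields,
    seq: prev ? prev.seq + 1 : 1,
    prevHash: prev ? prev.hash : GENESIS_PREV_HASH,
  };
  entry.hash = hashEntry(entry);
  chain.push(entry);
  return entry;
}

/** Walk the chain from genesis and report the first inconsistency. */
function verifyChain(chain) {
  let expectedSeq = 1;
  let expectedPrev = GENESIS_PREV_HASH;
  for (const entry of chain) {
    if (entry.seq !== expectedSeq) {
      return { ok: false, seq: entry.seq, reason: 'gap' };
    }
    if (entry.prevHash !== expectedPrev) {
      return { ok: false, seq: entry.seq, reason: 'bad link' };
    }
    if (hashEntry(entry) !== entry.hash) {
      return { ok: false, seq: entry.seq, reason: 'modified' };
    }
    expectedSeq += 1;
    expectedPrev = entry.hash;
  }
  return { ok: true };
}
```

A walk like this detects edits and interior deletions within the range it covers; removal of the newest entries needs an external anchor such as a stored head hash.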
Other best-practice gaps from the review:
- Keyset (cursor) pagination over seq alongside offset; stable under
concurrent appends. nextCursor in the page payload.
- Retention: purgeAuditLogEntries() privileged prefix-purge with a confirm
latch, returns a checkpoint; verify tolerates a purged prefix.
- Fail-closed option (AUDIT_LOG_FAIL_CLOSED) so a failed audit write can fail
the grant request instead of being swallowed; default stays fail-open.
- Grant handlers now capture request context and emit the new shape.
CSV export updated for the new columns (incl. seq/hash). data-schemas bumped
to 0.0.54 for the sibling admin-panel consumer. Tests rewritten: 28
methods-layer cases (chain genesis/linking, tamper detection, keyset, purge)
and the handler/grants specs updated for the new shape, fail-closed, and the
verify endpoint.
* 🛠️ fix: Address Codex review on the audit-log substrate
- F1 (fail-closed atomicity): assign/revoke now compensate (rollback grant /
restore grant) when a fail-closed audit write fails, so a 5xx never leaves an
unaudited mutation.
- F5: only emit grant.assigned for a real change — skip the audit when the role
already holds the capability (idempotent re-assert).
- F7: verifyAuditChain no longer silently trusts a non-genesis start; a purged
prefix must be authorized by a trusted checkpoint (purge now returns
{throughSeq, prevHash}), else verification fails as tampering.
- F4: block Model.bulkWrite on AuditLog (would bypass the append-only middleware).
- F3: CSV export appends an explicit TRUNCATED marker + logs when the row cap is hit.
- F6: reject out-of-range date-only filters (2025-02-31) instead of normalizing.
- F2: regenerate package-lock.json for the 0.0.54 data-schemas bump.
Tests: +1 methods (bulkWrite) +2 verify (deleted-prefix / checkpoint mismatch),
updated purge test for checkpoint flow; +4 api (re-assert skip, assign/revoke
fail-closed rollback, date reject, CSV truncation marker).
* 🛠️ fix: Address Codex round-2 on the audit-log substrate
- R2-1/R2-5 (P1/P2): base the grant.assigned audit decision on the atomic
upsert result. grantCapability now returns { grant, created } via
includeResultMetadata; the handler audits only when created. Removes the racy
pre-read, which also mis-handled inherited platform grants vs a new
tenant-scoped insert and concurrent double-assign.
- R2-2 (P2): namespace tenant chain keys (tenant:<id>) so a tenant whose id is
literally the platform sentinel can't share the platform audit chain.
- R2-4 (P2): validate literal calendar tokens for full ISO timestamps too, so
2025-02-31T00:00:00Z is rejected instead of normalizing to March 3.
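One way to check literal calendar tokens is a round trip through `Date.UTC`, sketched below. `isRealCalendarDate` and `hasValidCalendarTokens` are hypothetical names; the repository may validate differently.

```javascript
/** True when year-month-day names a day that exists on the calendar. */
function isRealCalendarDate(year, month, day) {
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC normalizes overflow (Feb 31 becomes Mar 3), so a mismatch
  // after the round trip means the input tokens were out of range.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

/** Validate the leading YYYY-MM-DD tokens of a date-only or full ISO string. */
function hasValidCalendarTokens(raw) {
  const match = /^(\d{4})-(\d{2})-(\d{2})/.exec(raw);
  if (!match) {
    return false;
  }
  return isRealCalendarDate(Number(match[1]), Number(match[2]), Number(match[3]));
}

console.log(hasValidCalendarTokens('2025-02-31T00:00:00Z'));
```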
Tests updated for the grantCapability { grant, created } contract (systemGrant +
grants specs) and the namespaced chain key (auditChainKey helper); +1 api date
case. data-schemas 141, api grants/audit 107 green.
R2-3 (deprecated actorId/targetPrincipalId aliases): not reinstating — the
surface is pre-release and its only consumer (admin-panel PR) migrates to the new
shape in lockstep, so there are no legacy clients to support.
R2-6 (role-deletion cascade emits no grant.removed): valid but a separate
workflow in roles.ts; tracked as a follow-up to keep this PR scoped.
* 🛠️ fix: Address Codex round-3 on the audit-log substrate
- R3-3 (P2): make a grant re-assert a true no-op — move grantedAt/grantedBy to
$setOnInsert so an existing grant is never silently mutated when the audit is
skipped (created:false now means nothing changed). grantedAt/grantedBy record
the original grant.
- R3-2 (P2): report CSV export truncation exactly. streamAuditLogEntries returns
{ count, truncated }; truncated is true only when rows existed beyond the cap,
so an exact-cap export is no longer falsely marked truncated.
- R3-5 (P2): block AuditLog.insertMany (another bulk path that skips the save
hook and could inject forged seq/prevHash/hash and poison the chain).
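The exact-cap distinction in R3-2 can be sketched with a synchronous stand-in for the cursor loop. `streamWithCap` is a hypothetical name, and the real `streamAuditLogEntries` iterates a Mongo cursor asynchronously; only the `{ count, truncated }` return shape is taken from the message above.

```javascript
/**
 * Emit at most `maxRows` rows. `truncated` is set only when a further row
 * is actually seen after the cap, so an exact-cap result is not flagged.
 */
function streamWithCap(rows, maxRows, onEntry) {
  let count = 0;
  let truncated = false;
  for (const row of rows) {
    if (count >= maxRows) {
      truncated = true;
      break;
    }
    onEntry(row);
    count += 1;
  }
  return { count, truncated };
}

console.log(streamWithCap(['a', 'b', 'c'], 3, () => {}));
```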
Tests: +insertMany rejection, +exact-cap vs truncated stream cases, +exact-cap
export-not-truncated handler case. ds 142, api 108 green.
R3-1 (deprecated query aliases) and R3-4 (role-deletion cascade audit) are
re-flags of R2-3/R2-6 — holding the prior decisions (pre-release surface; separate
roles.ts workflow tracked as a follow-up), pending maintainer direction.
* 🛡️ feat: Audit grant removals from the role-deletion cascade
Closes the forensic gap Codex flagged (R2-6/R3-4): deleting a role removed its
SystemGrants with no audit entries. `deleteGrantsForPrincipal` now returns the
removed grants, and the role-deletion handler emits a `grant.removed` audit entry
per removed grant (actor = caller, target = role, metadata.capability, request
context), matching the explicit revoke endpoint. Fail-open — the role is already
deleted, so a failed audit is logged, not propagated; sequential to keep the
per-tenant hash chain ordered.
Extracted `buildAuditContext` to admin/context.ts (shared by grants + roles).
Tests: role-deletion emits one entry per grant / none when no grants; ds 110,
api admin 202 green.
* 🛠️ fix: Address Codex round-4 on the audit-log substrate
- R4-1 (P2): don't silently drop an audit row under heavy append contention.
recordAuditEntry now retries duplicate-key seq collisions up to 12× with
jittered backoff (was 5, no backoff), so realistic bursts of parallel admin
writes resolve; the failClosed escape still applies on true exhaustion.
- R4-3 (P2): purge a contiguous seq prefix, not a date range. createdAt is
app-generated, so under multi-instance clock skew a later seq can carry an
earlier timestamp; a raw date delete could remove an interior row and break
verification. purgeAuditLogEntries now resolves the date to the first retained
seq and deletes only strictly-lower seqs, keeping the remaining chain contiguous.
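The retry in R4-1 might look roughly like the sketch below. `appendWithRetry`, the `tryAppend` callback, and the specific backoff curve are assumptions; the limit of 12 attempts comes from the message, and 11000 is MongoDB's duplicate-key error code.

```javascript
const DUPLICATE_KEY_CODE = 11000; // MongoDB duplicate-key error
const MAX_ATTEMPTS = 12;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Call `tryAppend` until it succeeds, retrying only duplicate-key collisions.
 * `tryAppend` is expected to re-read the chain head and insert at head.seq + 1.
 */
async function appendWithRetry(tryAppend, { maxAttempts = MAX_ATTEMPTS, baseDelayMs = 5 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await tryAppend(attempt);
    } catch (err) {
      const isDuplicate = Boolean(err) && err.code === DUPLICATE_KEY_CODE;
      if (!isDuplicate || attempt === maxAttempts) {
        // Unrelated errors and true exhaustion propagate to the caller.
        throw err;
      }
      // Randomized, exponentially growing delay so colliding writers spread out.
      const ceiling = baseDelayMs * 2 ** Math.min(attempt, 6);
      await sleep(Math.random() * ceiling);
    }
  }
  throw new Error('appendWithRetry: maxAttempts must be at least 1');
}
```

The caller decides what exhaustion means: under a fail-closed policy the thrown error fails the request, otherwise it is logged.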
Tests: +clock-skew purge case (no gap created). ds auditLog 33 green.
R4-2 (role-deletion grant audit) is a re-flag of R2-6/R3-4, already implemented
in 15472127d6 (roles.ts emitGrantRemovals + route wiring + tests); the finding's
cited line numbers predate that commit.
* 🛠️ fix: Address Codex round-5 on the audit-log substrate
- R5-1 (P2): scope each cascade grant.removed entry to the removed grant's own
tenant, not the caller's. A platform admin deleting a role can remove
tenant-scoped grants; those removals now land in the affected tenant's chain.
- R5-2 (P2): only return a purge checkpoint when rows were actually deleted. A
no-op confirmed purge no longer mints a trust boundary that could legitimize a
prefix it didn't authorize.
- R5-3 (P2): ensure the unique { chainKey, seq } index exists before appending
(memoized createIndexes), so serialization doesn't depend on a background build
— closes a silent chain-fork window under MONGO_AUTO_INDEX=false or at startup.
Tests: +per-grant-tenant cascade audit, +no-op-purge-no-checkpoint,
+index-built-before-append. ds auditLog 35, api roles 95 green.
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
437 lines
15 KiB
JavaScript
const telemetry = require('./telemetry');
|
|
const fs = require('fs');
|
|
const path = require('path');
|
|
require('module-alias')({ base: path.resolve(__dirname, '..') });
|
|
const cors = require('cors');
|
|
const axios = require('axios');
|
|
const express = require('express');
|
|
const passport = require('passport');
|
|
const compression = require('compression');
|
|
const cookieParser = require('cookie-parser');
|
|
const mongoSanitize = require('express-mongo-sanitize');
|
|
const { logger, runAsSystem } = require('@librechat/data-schemas');
|
|
const {
|
|
isEnabled,
|
|
apiNotFound,
|
|
createMetrics,
|
|
ErrorController,
|
|
memoryDiagnostics,
|
|
performStartupChecks,
|
|
handleJsonParseError,
|
|
GenerationJobManager,
|
|
QUERY_DEVTOOLS_HEADER,
|
|
createStreamServices,
|
|
initializeFileStorage,
|
|
initializeDeploymentSkills,
|
|
maybeInjectQueryDevtoolsBootstrap,
|
|
preAuthTenantMiddleware,
|
|
setupGracefulShutdown,
|
|
updateInterfacePermissions,
|
|
} = require('@librechat/api');
|
|
const { connectDb, indexSync } = require('~/db');
|
|
const {
|
|
updateAccessPermissions,
|
|
sweepOrphanedPreviews,
|
|
getRoleByName,
|
|
seedDatabase,
|
|
} = require('~/models');
|
|
const initializeOAuthReconnectManager = require('./services/initializeOAuthReconnectManager');
|
|
const { capabilityContextMiddleware } = require('./middleware/roles/capabilities');
|
|
const createValidateImageRequest = require('./middleware/validateImageRequest');
|
|
const { startExpiredFileSweep } = require('./services/Files/process');
|
|
const { initializeGitHubSkillSync } = require('./services/Skills/sync');
|
|
const { jwtLogin, ldapLogin, passportLogin } = require('~/strategies');
|
|
const { checkMigrations } = require('./services/start/migration');
|
|
const optionalJwtAuth = require('./middleware/optionalJwtAuth');
|
|
const initializeMCPs = require('./services/initializeMCPs');
|
|
const configureSocialLogins = require('./socialLogins');
|
|
const createSpaFallback = require('./utils/fallback');
|
|
const { getAppConfig } = require('./services/Config');
|
|
const staticCache = require('./utils/staticCache');
|
|
const noIndex = require('./middleware/noIndex');
|
|
const routes = require('./routes');
|
|
|
|
const { PORT, HOST, ALLOW_SOCIAL_LOGIN, DISABLE_COMPRESSION, TRUST_PROXY } = process.env ?? {};
|
|
|
|
// Allow PORT=0 to be used for automatic free port assignment
|
|
const port = isNaN(Number(PORT)) ? 3080 : Number(PORT);
|
|
const host = HOST || 'localhost';
|
|
const trusted_proxy = Number(TRUST_PROXY) || 1; /* trust first proxy by default */
|
|
|
|
const app = express();
|
|
let serverReady = false;
|
|
|
|
const SERVER_NOT_READY_CODE = 'SERVER_NOT_READY';
|
|
const CHAT_START_RETRY_AFTER_SECONDS = '1';
|
|
|
|
const rejectChatStartsUntilReady = (req, res, next) => {
|
|
if (serverReady || req.method !== 'POST' || req.path === '/abort') {
|
|
return next();
|
|
}
|
|
|
|
res.set('Retry-After', CHAT_START_RETRY_AFTER_SECONDS);
|
|
return res.status(503).json({
|
|
code: SERVER_NOT_READY_CODE,
|
|
error: 'Server is still starting. Please retry shortly.',
|
|
});
|
|
};
|
|
|
|
const configureGenerationStreams = () => {
|
|
const streamServices = createStreamServices();
|
|
GenerationJobManager.configure({
|
|
...streamServices,
|
|
cleanupOnComplete: !isEnabled(process.env.STREAM_KEEP_COMPLETED_JOBS),
|
|
});
|
|
GenerationJobManager.initialize();
|
|
};
|
|
|
|
const startServer = async () => {
  const { metricsMiddleware, metricsRouter } = createMetrics();
  if (!process.env.METRICS_SECRET) {
    logger.warn('[metrics] METRICS_SECRET is not set - /metrics will return 401 for all requests');
  }

  if (typeof Bun !== 'undefined') {
    axios.defaults.headers.common['Accept-Encoding'] = 'gzip';
  }
  await connectDb();

  logger.info('Connected to MongoDB');
  indexSync().catch((err) => {
    logger.error('[indexSync] Background sync failed:', err);
  });

  app.disable('x-powered-by');
  app.set('trust proxy', trusted_proxy);

  if (isEnabled(process.env.TENANT_ISOLATION_STRICT)) {
    logger.warn(
      '[Security] TENANT_ISOLATION_STRICT is active. Ensure your reverse proxy strips or sets ' +
        'the X-Tenant-Id header — untrusted clients must not be able to set it directly.',
    );
  }

  await runAsSystem(seedDatabase);
  /* Recover stuck `status: 'pending'` records from a crash mid-render.
   * `runAsSystem` is required: `File` is tenant-isolated and strict
   * mode rejects unscoped queries. Lazy sweep in the preview endpoint
   * covers anything younger than the boot cutoff. */
  runAsSystem(sweepOrphanedPreviews).catch((err) => {
    logger.error('[sweepOrphanedPreviews] Background sweep failed:', err);
  });
  const appConfig = await getAppConfig({ baseOnly: true });
  initializeFileStorage(appConfig);
  await initializeDeploymentSkills({ projectRoot: path.resolve(__dirname, '../..') });
  initializeGitHubSkillSync(appConfig);
  startExpiredFileSweep({ appConfig, loadAppConfig: getAppConfig });
  await runAsSystem(async () => {
    await performStartupChecks(appConfig);
    await updateInterfacePermissions({ appConfig, getRoleByName, updateAccessPermissions });
  });

  const indexPath = path.join(appConfig.paths.dist, 'index.html');
  let indexHTML = fs.readFileSync(indexPath, 'utf8');

  // To support serving the application from a sub-directory, update the
  // base href when DOMAIN_CLIENT is specified and its path is not the root.
  if (process.env.DOMAIN_CLIENT) {
    const clientUrl = new URL(process.env.DOMAIN_CLIENT);
    const baseHref = clientUrl.pathname.endsWith('/')
      ? clientUrl.pathname
      : `${clientUrl.pathname}/`;
    if (baseHref !== '/') {
      logger.info(`Setting base href to ${baseHref}`);
      indexHTML = indexHTML.replace(/base href="\/"/, `base href="${baseHref}"`);
    }
  }

  const sendIndexHtml = (req, res) => {
    res.set({
      'Cache-Control': process.env.INDEX_CACHE_CONTROL || 'no-cache, no-store, must-revalidate',
      Pragma: process.env.INDEX_PRAGMA || 'no-cache',
      Expires: process.env.INDEX_EXPIRES || '0',
    });
    res.vary(QUERY_DEVTOOLS_HEADER);

    const lang = req.cookies.lang || req.headers['accept-language']?.split(',')[0] || 'en-US';
    // Escape double quotes so the value cannot break out of the lang attribute
    const saneLang = lang.replace(/"/g, '&quot;');
    let updatedIndexHtml = indexHTML.replace(/lang="en-US"/g, `lang="${saneLang}"`);
    updatedIndexHtml = maybeInjectQueryDevtoolsBootstrap(updatedIndexHtml, req);

    res.type('html');
    res.send(updatedIndexHtml);
  };

  app.get('/health', (_req, res) => res.status(200).send('OK'));
  app.get('/livez', (_req, res) => res.status(200).send('OK'));
  app.get('/readyz', (_req, res) => {
    if (!serverReady) {
      return res.status(503).send('NOT_READY');
    }
    return res.status(200).send('OK');
  });

  /* Middleware */
  app.use(metricsMiddleware);
  app.use(noIndex);
  app.use(express.json({ limit: '3mb' }));
  app.use(express.urlencoded({ extended: true, limit: '3mb' }));
  app.use(handleJsonParseError);

  /**
   * Express 5 Compatibility: Make req.query writable for mongoSanitize
   * In Express 5, req.query is read-only by default, but express-mongo-sanitize needs to modify it
   */
  app.use((req, _res, next) => {
    Object.defineProperty(req, 'query', {
      ...Object.getOwnPropertyDescriptor(req, 'query'),
      value: req.query,
      writable: true,
    });
    next();
  });

  app.use(mongoSanitize());
  app.use(cors());
  app.use(cookieParser());

  if (!isEnabled(DISABLE_COMPRESSION)) {
    app.use(compression());
  } else {
    console.warn('Response compression has been disabled via DISABLE_COMPRESSION.');
  }

  app.get('/index.html', sendIndexHtml);
  app.use(staticCache(appConfig.paths.dist));
  app.use(staticCache(appConfig.paths.fonts));
  app.use(staticCache(appConfig.paths.assets));

  if (telemetry.enabled) {
    app.use(telemetry.telemetryMiddleware);
  }

  if (!ALLOW_SOCIAL_LOGIN) {
    console.warn('Social logins are disabled. Set ALLOW_SOCIAL_LOGIN=true to enable them.');
  }

  /* OAUTH */
  app.use(passport.initialize());
  passport.use(jwtLogin());
  passport.use(passportLogin());

  /* LDAP Auth */
  if (process.env.LDAP_URL && process.env.LDAP_USER_SEARCH_BASE) {
    passport.use(ldapLogin);
  }

  if (isEnabled(ALLOW_SOCIAL_LOGIN)) {
    await configureSocialLogins(app);
  }

  /* Per-request capability cache: must be registered before any route that calls hasCapability */
  app.use(capabilityContextMiddleware);

  /* Pre-auth tenant context for unauthenticated routes that need tenant scoping.
   * The reverse proxy / auth gateway sets `X-Tenant-Id` header for multi-tenant deployments. */
  app.use('/oauth', preAuthTenantMiddleware, routes.oauth);
  /* API Endpoints */
  app.use('/api/auth', preAuthTenantMiddleware, routes.auth);
  app.use('/api/admin', routes.adminAuth);
  app.use('/api/admin/config', routes.adminConfig);
  app.use('/api/admin/grants', routes.adminGrants);
  app.use('/api/admin/groups', routes.adminGroups);
  app.use('/api/admin/roles', routes.adminRoles);
  app.use('/api/admin/skills', routes.adminSkills);
  app.use('/api/admin/users', routes.adminUsers);
  app.use('/api/admin/audit-log', routes.adminAuditLog);
  app.use('/api/actions', routes.actions);
  app.use('/api/keys', routes.keys);
  app.use('/api/api-keys', routes.apiKeys);
  app.use('/api/user', routes.user);
  app.use('/api/search', routes.search);
  app.use('/api/messages', routes.messages);
  app.use('/api/convos', routes.convos);
  app.use('/api/presets', routes.presets);
  app.use('/api/projects', routes.projects);
  app.use('/api/prompts', routes.prompts);
  app.use('/api/skills', routes.skills);
  app.use('/api/categories', routes.categories);
  app.use('/api/endpoints', routes.endpoints);
  app.use('/api/balance', routes.balance);
  app.use('/api/models', routes.models);
  app.use('/api/config', preAuthTenantMiddleware, optionalJwtAuth, routes.config);
  app.use('/api/assistants', routes.assistants);
  app.use('/api/files', await routes.files.initialize());
  app.use('/images/', createValidateImageRequest(appConfig.secureImageLinks), routes.staticRoute);
  app.use('/api/share', preAuthTenantMiddleware, routes.share);
  app.use('/api/roles', routes.roles);
  app.use('/api/agents/chat', rejectChatStartsUntilReady);
  app.use('/api/agents', routes.agents);
  app.use('/api/banner', routes.banner);
  app.use('/api/memories', routes.memories);
  app.use('/api/permissions', routes.accessPermissions);

  app.use('/api/tags', routes.tags);
  app.use('/api/mcp', routes.mcp);
  app.use('/api/rum', routes.rum);

  app.use('/metrics', metricsRouter);

  /** 404 for unmatched API routes */
  app.use('/api', apiNotFound);

  /** SPA fallback - serve index.html for all unmatched routes */
  app.use(createSpaFallback(sendIndexHtml));

  /** Record trace errors before the final error controller. */
  if (telemetry.enabled) {
    app.use(telemetry.telemetryErrorMiddleware);
  }
  /** Error handler (must be last - Express identifies error middleware by its 4-arg signature) */
  app.use(ErrorController);

  configureGenerationStreams();

  const server = app.listen(port, host, async (err) => {
    if (err) {
      logger.error('Failed to start server:', err);
      process.exit(1);
    }

    if (host === '0.0.0.0') {
      logger.info(
        `Server listening on all interfaces at port ${port}. Use http://localhost:${port} to access it`,
      );
    } else {
      logger.info(`Server listening at http://${host}:${port}`);
    }

    /**
     * The listen callback is async, so any rejection from these awaits would
     * otherwise be detached from `startServer().catch(...)` (which only
     * catches errors that happen before `app.listen`). Without explicit
     * handling, the global `unhandledRejection` handler would swallow init
     * failures and leave the server listening but only partially
     * initialized, passing liveness checks while serving broken requests.
     */
    try {
      await runAsSystem(async () => {
        await initializeMCPs();
        await initializeOAuthReconnectManager();
      });
      await checkMigrations();

      const inspectFlags = process.execArgv.some((arg) => arg.startsWith('--inspect'));
      if (inspectFlags || isEnabled(process.env.MEM_DIAG)) {
        memoryDiagnostics.start();
      }
      serverReady = true;
      logger.info('Server readiness checks passing.');
    } catch (initErr) {
      serverReady = false;
      logger.error('Post-listen initialization failed:', initErr);
      process.exit(1);
    }
  });

  setupGracefulShutdown(server);
};

/**
 * Boot rejections (e.g. `connectDb`, `getAppConfig`, `performStartupChecks`)
 * must remain fail-fast: a half-initialized process with no listening HTTP
 * server should die immediately so the orchestrator restarts it, instead of
 * being kept alive by the `unhandledRejection` handler below until the
 * liveness probe eventually times out. Mirrors the pattern in
 * `experimental.js`.
 */
startServer().catch((err) => {
  logger.error('Failed to start server:', err);
  process.exit(1);
});

let messageCount = 0;
process.on('uncaughtException', (err) => {
  if (!err.message.includes('fetch failed')) {
    logger.error('There was an uncaught error:', err);
  }

  if (err.message && err.message?.toLowerCase()?.includes('abort')) {
    logger.warn('There was an uncatchable abort error.');
    return;
  }

  if (err.message.includes('GoogleGenerativeAI')) {
    logger.warn(
      '\n\n`GoogleGenerativeAI` errors cannot be caught due to an upstream issue, see: https://github.com/google-gemini/generative-ai-js/issues/303',
    );
    return;
  }

  if (err.message.includes('fetch failed')) {
    if (messageCount === 0) {
      logger.warn('Meilisearch error, search will be disabled');
      messageCount++;
    }

    return;
  }

  if (err.message.includes('OpenAIError') || err.message.includes('ChatCompletionMessage')) {
    logger.error(
      '\n\nAn Uncaught `OpenAIError` error may be due to your reverse-proxy setup or stream configuration, or a bug in the `openai` node package.',
    );
    return;
  }

  if (err.stack && err.stack.includes('@librechat/agents')) {
    logger.error(
      '\n\nAn error occurred in the agents system. The error has been logged and the app will continue running.',
      {
        message: err.message,
        stack: err.stack,
      },
    );
    return;
  }

  if (isEnabled(process.env.CONTINUE_ON_UNCAUGHT_EXCEPTION)) {
    logger.error('Unhandled error encountered. The app will continue running.', {
      name: err?.name,
      message: err?.message,
      stack: err?.stack,
    });
    return;
  }

  process.exit(1);
});

/**
 * Unhandled promise rejection handler.
 *
 * Node 15+ terminates the process by default when a promise rejection is
 * unhandled. MCP OAuth reconnect storms and streamable-HTTP transport resets
 * can produce transient fire-and-forget rejections (ECONNRESET, token refresh
 * races) that are recoverable; the server should log and keep serving other
 * requests rather than silently crash under load.
 *
 * Non-Error reasons are forwarded as-is so structured payloads (e.g.
 * `{ code: "ECONNRESET", errno: -104 }`) survive instead of being collapsed to
 * "[object Object]" by `String()`.
 */
process.on('unhandledRejection', (reason) => {
  if (reason instanceof Error) {
    logger.error('Unhandled promise rejection. The app will continue running.', {
      name: reason.name,
      message: reason.message,
      stack: reason.stack,
      cause: reason.cause,
    });
    return;
  }
  logger.error('Unhandled promise rejection. The app will continue running.', { reason });
});

/** Export app for easier testing purposes */
module.exports = app;
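
The readiness gate in the file above (`serverReady`, `/readyz`, and `rejectChatStartsUntilReady`) can be tried out without Express. The following is a minimal standalone sketch, not the project's code: `createReadinessGate`, `fakeRes`, and `runGate` are hypothetical names introduced here for illustration, and the readiness flag is injected as a function instead of being read from a module-level variable.

```javascript
// Hypothetical helper: builds a middleware that behaves like
// rejectChatStartsUntilReady, with the readiness check injected.
function createReadinessGate(isReady, retryAfterSeconds = '1') {
  return function readinessGate(req, res, next) {
    // Pass through once ready; never block non-POST or abort requests.
    if (isReady() || req.method !== 'POST' || req.path === '/abort') {
      return next();
    }
    res.set('Retry-After', retryAfterSeconds);
    return res.status(503).json({
      code: 'SERVER_NOT_READY',
      error: 'Server is still starting. Please retry shortly.',
    });
  };
}

// Hypothetical stand-in for an Express response that records what was set.
function fakeRes() {
  const res = { headers: {}, statusCode: 200, body: undefined };
  res.set = (name, value) => {
    res.headers[name] = value;
    return res;
  };
  res.status = (code) => {
    res.statusCode = code;
    return res;
  };
  res.json = (payload) => {
    res.body = payload;
    return res;
  };
  return res;
}

// Sends one fake request through the gate and reports whether next() ran.
function runGate(gate, req) {
  const res = fakeRes();
  let passed = false;
  gate(req, res, () => {
    passed = true;
  });
  return { passed, res };
}

let ready = false;
const gate = createReadinessGate(() => ready);

const blocked = runGate(gate, { method: 'POST', path: '/' });
const abort = runGate(gate, { method: 'POST', path: '/abort' });
ready = true;
const allowed = runGate(gate, { method: 'POST', path: '/' });

console.log(blocked.res.statusCode, blocked.res.headers['Retry-After']);
console.log(abort.passed, allowed.passed);
```

Injecting the flag as a function keeps the sketch testable in isolation; the file itself closes over the module-level `serverReady`, which the listen callback sets only after post-listen initialization succeeds.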