Commit graph

41 commits

Author SHA1 Message Date
Atef Bellaaj
86fe79c37d
🔗 feat: Add Granular Access Control to Shared Links via ACL System (#13051)
* feat: Add granular access control to shared links via ACL system

* fix(shared-links): preserve isPublic on failed migration grants

Transient ACL failures during auto-migration permanently stranded
links — $unset ran unconditionally, removing the legacy flag that
triggers retry. Now only $unset isPublic after all grants succeed.

* fix(config): skip isPublic unset for failed ACL grants

Bulk migration unconditionally removed isPublic from all links,
even those whose ACL writes failed. Failed links then lost the
legacy marker needed for auto-migration retry. Now tracks failed
link IDs per-batch and excludes them from the $unset step.

Also adds sharedLink to AccessRole resourceType schema enum —
was missing, only worked because seedDefaultRoles uses
findOneAndUpdate which bypasses validation.

* ci(config): add jest config and PR workflow for migration tests

config/__tests__/ specs depend on api/jest.config.js module
mappings but had no dedicated runner. Adds config/jest.config.js
extending api config with absolutized paths, npm test:config
script, and a GitHub Actions workflow triggered by changes to
config/, api/models/, api/db/, or packages/ ACL code.

* fix(permissions): honor boolean sharedLinks config

SHARED_LINKS has no USE permission, so boolean config produced
an empty update payload — gate conditions only matched object
form, making `sharedLinks: false` a no-op on existing perms.

* fix(share): resolve role before creating shared link

Role lookup between create and grant left an orphaned link
without ACL entries if getRoleByName threw — retry then hit "Share already exists" with no recovery path.

* fix: Restore Public ACL Access Checks

* fix: Type Public ACL Lookup

* fix: Preserve Private Legacy Shared Links

* chore: Promote Shared Link Permission Migration

* fix: Address Shared Link Review Findings

* fix: Repair Shared Link CI Follow-Up

* fix: Narrow Shared Link Mongoose Test Mock

* fix: Address Shared Link Review Follow-Ups

* fix: Close Shared Link Review Gaps

* fix: Guard Missing Shared Link Permission Backfill

* test: Add Shared Link Mock E2E

* test: Stabilize Shared Link Mock E2E

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-06-03 14:17:17 -04:00
Danny Avila
1fa28ec45b
🛰️ feat: Add Auth Fallback Observability (#13488)
* feat: add auth fallback observability

* refactor: move auth log helpers to api package

* test: harden auth log context handling

* fix: keep auth logs low cardinality

* fix: render auth log context in debug messages

* fix: lower plain jwt auth failures to debug
2026-06-03 13:45:46 -04:00
Danny Avila
83d8ac0682
🪜 feat: Add OpenID Role Sync (#13415)
* Shared Role-Sync Core

* Environment Configuration

* Browser OpenID Wiring & improved shared component

* API Auth Wiring

* Improved Role Lookup

* added example for sync env

* small simplification

* protect existing manual assigned ADMIN Roles

* fix: Apply OpenID role-sync fallback for present-but-empty claims

Both role-sync call sites skipped on a falsy `openIdRoleValues`, treating an
empty claim string ('') the same as a missing claim and returning before
`selectOpenIdRole` could apply the configured fallback role. An IdP emitting
an empty roles claim for a user with no mapped groups left the stale local
role in place instead of the authoritative fallback.

Skip only when the helper returns `undefined` (missing/invalid), letting an
empty string flow through to fallback selection — consistent with how an
empty array is already handled. Adds regression coverage on both the OpenID
strategy and the remote-agent API auth paths.

* refactor: Address OpenID role-sync review feedback

- role.ts: reuse the shared escapeRegExp util instead of a local escapeRegex
  duplicate, matching prompt/skill/user/userGroup methods (Copilot).
- openidStrategy.js / remoteAgentAuth.ts: make the tenantStorage.run callbacks
  async so the documented ALS contract is satisfied and tenant context cannot
  be lost during Mongoose execution; the wrapped lookups/updates are already
  async, so behavior is unchanged (codex P2).

* fix: Harden OpenID role-sync claim and fallback handling

Addresses the second Codex review cycle (P2 findings):

- Apply fallback when the claim is absent: getOpenIdRolesForOpenIdSync now
  returns an empty list (not undefined) when the token source exists but has
  no usable claim value, so callers still run selection and assign the
  configured fallback instead of leaving a stale elevated role. A truly
  unavailable source still returns undefined and skips sync.
- Resolve group overage for access tokens too: the _claim_names/_claim_sources
  overage path previously only ran for claimSource 'id'; Entra also moves an
  oversized groups claim into access tokens, so 'access'+'groups' (the only
  source supported by remote-agent API sync) now resolves overage as well.
- Allow system fallback roles for tenant users: getLibreChatRolesForOpenIdSync
  treats SystemRoles (e.g. USER) as always-available canonical names, since
  they are provisioned globally at startup and a tenant-scoped lookup may not
  return them — preventing a spurious 'configured roles do not exist: USER'.

Adds unit and strategy-level coverage for all three.

* fix: Tighten OpenID role-sync tenant scoping and config validation

Addresses the third Codex review cycle:

- Constrain base-user role lookups to base roles (P2): findRolesByNames now
  filters to roles with an unset tenantId when no tenant ALS context is active,
  so a base user cannot match — and be assigned — a role that only exists within
  a tenant. Tenant-scoped lookups remain controlled by the isolation plugin.
- Re-enforce tenant login policy after role sync (P2): when role sync changes a
  tenant user's role, the OpenID strategy re-resolves the tenant appConfig and
  re-checks allowedDomains, so a token cannot complete login under the previous
  role's looser policy.
- Skip role-sync-specific validation when disabled (P3): getOpenIdRoleSyncOptions
  returns disabled options before validating role-sync settings, so a stale or
  mistyped value no longer breaks OpenID login while the feature is off.

Adds unit and strategy-level coverage for all three.

* fix: Run base role lookups under system context for strict isolation

Follow-up to the base-role scoping fix (Codex P1). With TENANT_ISOLATION_STRICT=true,
the tenant-isolation pre('find') hook throws on a context-less query before the manual
tenantId filter is honored, so base OpenID/remote-agent auth would 500 instead of
validating base roles. findRolesByNames now runs the no-context lookup inside
runAsSystem (SYSTEM_TENANT_ID), bypassing strict-mode injection while still applying an
explicit base-role (tenantId unset) filter. Adds a strict-mode regression test.

---------

Co-authored-by: Peter Rothlaender <peter.rothlaender@ginkgo.com>
2026-06-02 14:00:56 -04:00
Danny Avila
e0c346c0a4
🤫 chore: Quiet Repetitive Log Noise from Balance, CloudFront, and Capability Paths (#13461)
* chore: reduce auth and balance operational noise

* chore: tighten balance and capability noise handling

* chore: avoid balance 404s when disabled

* chore: use response locals for balance handoff
2026-06-01 20:40:16 -04:00
Danny Avila
479e9d59b7
🧠 refactor: Memoize MCP Permission Checks Per Request (#13419) 2026-05-30 18:32:06 -04:00
Danny Avila
83c7d637c3
🎨 chore: prettier --write all workspaces (#13281)
Run `prettier --write` over the source trees of every workspace to align
with the repo's own `.prettierrc` (`printWidth: 100`, `singleQuote: true`,
`trailingComma: 'all'`, etc.). **19 files reformatted total** — purely
whitespace and line-wrap changes, no functional edits and no API changes.

Scope:
- `packages/api/src/**/*.{ts,tsx}` — 14 files
- `packages/client/src/**/*.{ts,tsx}` — 1 file
- `packages/data-schemas/src/**/*.{ts,tsx}` — 4 files
- `api/**`, `client/**`, `packages/data-provider/**` — already prettier-clean

Most of the drift is in argument-list / type-annotation wrapping where
the formatted form fits within `printWidth` but the current source keeps
a hand-wrapped multi-line shape. Example:

  // before
  function countWebSearchDefinitions(
    toolDefinitions: Array<{ name: string }> | undefined,
  ): number { … }

  // after (still well under 100 cols)
  function countWebSearchDefinitions(toolDefinitions: Array<{ name: string }> | undefined): number { … }

`npx prettier --check` across all workspaces is now clean. The local
pre-commit hook (`lint-staged` → `prettier --write`) would have produced
the same result on any future edit to these files.

There are no prettier-checking workflows in CI today, so drift like this
can re-appear if PRs are merged with the hook bypassed. Companion PR
#13282 adds a `prettier --check` step to `eslint-ci.yml` so future
drift gets caught.
2026-05-23 17:52:58 -04:00
Danny Avila
1693b586d4
🏣 fix: Reject System Tenant In Auth Context (#13278) 2026-05-23 16:55:32 -04:00
Danny Avila
c342e2345b
🪪 fix: Resolve Group-Scoped Config Overrides (#13176)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* fix: resolve group-scoped config overrides

* test: fix endpoint config request mock typing

* fix: keep remote agent preauth config tenant-scoped

* test: align config scoping expectations

* test: reproduce group endpoint override resolution
2026-05-18 10:16:20 -04:00
Danny Avila
7f58e4c2ed
🧾 feat: Add Structured Logging Context (#13110)
* feat: add structured logging context

* fix: reduce cloudfront disabled logging

* fix: preserve strict reject logging context

* chore: format auth middleware test

* fix: omit system tenant from log context

* fix: type parser spec formatter info

* fix: normalize tenant guard before reject checks
2026-05-13 19:17:39 -04:00
Danny Avila
8735c1763c
🧵 fix: Preserve Upload Context Across Multipart Routes (#13072)
* fix skill multipart imports under strict isolation

* fix file upload context after multipart parsing

* fix skill upload tenant resolution

* fix rejected upload cleanup
2026-05-11 15:46:48 -04:00
Danny Avila
0449c423a2
🗝️ fix: Enforce Skill Share Role Permission (#13062)
* fix: enforce skill share role permission

* fix: preserve share capability bypass

* refactor: move share policy middleware to api package

* style: order share middleware imports

* fix: satisfy share middleware type checks

* test: cover share policy resource types
2026-05-11 09:39:58 -04:00
Danny Avila
52ccb1379b
🪪 refactor: Require Remote OIDC Audience for Agents API OAuth (#13066) 2026-05-11 08:38:13 -04:00
Danny Avila
40a05bbf83
📦 chore: npm audit fixes and Mongoose 8.23 TypeScript follow-ups (#12996)
* chore: Update axios dependency to version 1.16.0 across multiple package files

* chore: Update express-rate-limit and ip-address dependencies to versions 8.5.1 and 10.2.0 in package-lock.json and package.json

* chore: Update mongoose and hono dependencies to versions 8.23.1 and 4.12.18 across multiple package files

* fix: Add type parameters to mongoose lean queries in accessRole and aclEntry methods

* fix: Add type parameters to mongoose lean queries in action, agent, and agentCategory methods

* chore: Update moduleResolution to 'bundler' in tsconfig.json for api and data-schemas packages

* fix: Update mongoose lean queries to include type parameters across various methods for improved type safety
2026-05-07 09:47:40 -04:00
Danny Avila
ddf5879ccd
⏱️ fix: Align Auto-Refill Next Date (#12980)
* fix: Align auto-refill next date

* style: Fix auto-refill lint formatting

* refactor: Share auto-refill eligibility date

* refactor: Consolidate refill interval units

* fix: Guard malformed refill interval units

* fix: Preserve refill unit fallback label
2026-05-06 21:40:18 -04:00
Danny Avila
bff9bfea87
🐛 fix: Propagate User Identity to Subagent MCP Tool Calls (#12950)
* 🐛 fix: Propagate User Identity to Subagent MCP Tool Calls

The `@librechat/agents` SDK's `SubagentExecutor` invokes the child
workflow with a fresh configurable of `{ thread_id }` only — the
parent's `user` / `user_id` are dropped on the way into the child
graph. The child's `ToolNode` then dispatches `ON_TOOL_EXECUTE` to the
parent's handler, which merges `{ ...configurable, ...toolConfigurable }`,
but neither side carries user identity for subagents.

Downstream MCP tools read `config.configurable.user?.id || user_id` and
got `undefined`, so `MCPManager.getConnection` fell through to the
"No connection found for server X" error path — it can't reach the
user-connection lookup without a userId.

Re-inject `user` (via `createSafeUser`) and `user_id` from `req.user`
into the configurable returned by `loadToolsForExecution`. This is the
single point all controllers (chat, Responses API, OpenAI-compat) flow
through. For the parent agent it's a no-op (outer config already
carries the same values); for subagents it fills the gap so MCP
connection lookup, user-placeholder substitution, and tools that read
configurable.user all work correctly.

* 🐛 fix: Preserve `api-user` Fallback When Injecting Subagent Identity

Codex review pointed out that the prior commit unconditionally wrote
`user_id: req.user?.id` (and `user`) into `toolConfigurable`. The handler
merges via `{ ...configurable, ...toolConfigurable }` — `toolConfigurable`
wins — so when `req.user` is absent, this overwrote the outer config's
`'api-user'` fallback (set by `responses.js` / `openai.js` for the
unauthenticated API-key path) with `undefined`, breaking MCP connection
lookup for that path.

Only inject the keys when `req.user.id` is truthy. Omitting them lets
the merge preserve whatever the outer configurable already had. Tests
updated to assert key omission for `req.user` undefined / null / present
without `id`.

* 🩹 fix: Narrow `IUser.id` to required string

`IUser` extends mongoose `Document`, which types `id?: any` (the optional
virtual). At runtime `id` is always `_id.toString()` for a hydrated doc,
so narrow the type to a required string.

Closes two `@rollup/plugin-typescript` TS2322 warnings introduced by
PR #12450 (OIDC Bearer Token Authentication for Remote Agent API)
where `req.user = userResolution.user` and the
`(req: Request, res: Response, next: NextFunction)` signature both
failed against the project's local `Express.User` augmentation
(`{ [key: string]: any; id: string; }`) because `IUser.id` was
`any`/optional. Narrowing here fixes both at the source rather than
casting at every assignment site.

* 🩹 fix: Resolve TS Build Warnings Surfaced by `IUser.id` Narrowing

Three rollup TS plugin warnings surfaced after narrowing
`IUser.id` from `any` to `string`:

- `utils/env.ts:95` — `safeUser[field] = user[field]` failed strict
  checking because indexed write through a union-typed key collapses
  the LHS to the intersection of all field write types (i.e.,
  `undefined` when fields have mixed types). The previous `id?: any`
  on IUser had been masking this. Switch to `Object.assign(safeUser,
  { [field]: user[field] })` which widens the assignment.

- `endpoints/google/initialize.ts:35` — `getUserKey({ userId:
  req.user?.id, ... })` failed because `req.user?.id` is now
  `string | undefined` (no longer `any`). Match the pattern already
  used in `endpoints/openAI/initialize.ts:49`: `req.user?.id ?? ''`.

- `middleware/remoteAgentAuth.ts:465` — pre-existing, unrelated to
  the IUser change. The local (gitignored) `express.d.ts` augments
  `express.Request` but not `express-serve-static-core.Request`,
  so the explicit `(req: Request, ...)` annotation imported from
  `'express'` resolves to a Request whose `req.user` differs from
  the one `RequestHandler` expects internally. Type the closure as
  `RequestHandler` directly so TS infers params from the augmented
  type.

* 🩹 fix: Cast `RemoteAgentAuth` Closure to `RequestHandler`

My previous attempt removed the explicit `req: Request` annotation on
the closure to side-step the outer `RequestHandler` mismatch. That
shifted the error to every helper call site inside the closure
(`getConfigOptions(req)`, `runApiKeyAuth(req, ...)` at 467/474/493/
512/531), because the helpers annotate their params with
`express.Request` (which has the local `Request.user` augmentation),
while the unannotated closure inferred `req` as
`express-serve-static-core.Request` (no augmentation). Reproduced
locally by stubbing the gitignored `src/types/express.d.ts`.

Right approach: keep the explicit `req: Request` annotation so the
closure body matches the helpers' types, then cast at the return —
`RequestHandler`'s internal `Request` resolves through
`express-serve-static-core` and lacks the augmentation, so the cast
is the boundary that bridges the two views of `req.user`.

Verified against a build with the local express.d.ts stub: zero
warnings on `remoteAgentAuth.ts`, `env.ts`, and `google/initialize.ts`.
2026-05-05 12:19:50 +09:00
Artyom Bogachenko
5683706af5
🔐 feat: OIDC Bearer Token Authentication for Remote Agent API (#12450)
* Remote Agent Auth middleware

* consider migration and update user

* fix eslint errors

* add scope validation

* fix codex review errors

* add filter for use: sig

* add jwks-rsa deps

* Fix remote agent OIDC auth review findings

* Polish remote agent OIDC timeout coverage

* Reject remote OIDC tokens without subject

* Use tenant context for remote agent auth config

* Harden remote agent OIDC scope handling

* Polish remote agent OIDC cache and scope tests

* Resolve remote agent auth review comments

* Reuse OpenID email claim resolver for remote auth

* Skip empty OpenID email fallback claims

* Use pre-auth tenant context for remote auth config

* Downgrade expected OIDC fallback logging

* Require secure remote OIDC endpoints

* Polish remote agent auth edge cases

* Enforce unique balance records

* Bind remote OpenID users to issuer

* Fix issuer-scoped OpenID indexes

* Avoid unique balance index requirement

* Fix remote OpenID issuer normalization boundaries

* Require issuer-bound OpenID lookups

* Enforce tenant API key policy after auth

* Fix remote auth tenant policy types

* Normalize remote OIDC discovery issuer

* Allow normalized remote OIDC issuer validation

* Enforce resolved tenant OIDC policy

* Polish OpenID issuer and scope validation

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-05-04 17:06:35 -04:00
Dustin Healy
3d1b883e9d
👨‍👨‍👦‍👦 feat: Admin Users API Endpoints (#12446)
* feat: add admin user management endpoints

Add /api/admin/users with list, search, and delete handlers gated by
ACCESS_ADMIN + READ_USERS/MANAGE_USERS system grants. Handler factory
in packages/api uses findUsers, countUsers, and deleteUserById from
data-schemas.

* fix: address convention violations in admin users handlers

* fix: add pagination, self-deletion guard, and DB-level search limit

- listUsers now uses parsePagination + countUsers for proper pagination
  matching the roles/groups pattern
- findUsers extended with optional limit/offset options
- deleteUser returns 403 when caller tries to delete own account
- searchUsers passes limit to DB query instead of fetching all and
  slicing in JS
- Fix import ordering per CLAUDE.md, complete logger mock
- Replace fabricated date fallback with undefined

* fix: deterministic sort, null-safe pagination, consistent search filter

- Add sort option to findUsers; listUsers sorts by createdAt desc for
  deterministic pagination
- Use != null guards for offset/limit to handle zero values correctly
- Remove username from search filter since it is not in the projection
  or AdminUserSearchResult response type

* fix: last-admin deletion guard and search query max-length

- Prevent deleting the last admin user (look up target role, count
  admins, reject with 400 if count <= 1)
- Cap search query at 200 characters to prevent regex DoS
- Add tests for both guards

* fix: include missing capability name in 403 Forbidden response

* fix: cascade user deletion cleanup, search username, parallel capability checks

- Cascade Config, AclEntry, and SystemGrant cleanup on user deletion
  (matching the pattern in roles/groups handlers)
- Add username to admin search $or filter for parity with searchUsers
- Parallelize READ_* capability checks in listAllGrants with Promise.all

* fix: TOCTOU safety net, capability info leak, DRY/style cleanup, data-layer tests

- Add post-delete admin recount with CRITICAL log if race leaves 0 admins
- Revert capability name from 403 response to server-side log only
- Document thin deleteUserById limitation (full cascade is a future task)
- DRY: extract query.trim() to local variable in searchUsersHandler
- Add username to search projection, response type, and AdminUserSearchResult
- Functional filter/map in grants.ts parallel capability check
- Consistent null guards and limit>0 guard in findUsers options
- Fallback for empty result.message on delete response
- Fix mockUser() to generate unique _id per call
- Break long destructuring across multiple lines
- Assert countUsers filter and non-admin skip in delete tests
- Add data-layer tests for findUsers limit, offset, sort, and pagination

* chore: comment out admin delete user endpoint (out of scope)

* fix: cast USER principalId to ObjectId for ACL entry cleanup

ACL entries store USER principalId as ObjectId (via grantPermission casting),
but deleteAclEntries is a raw deleteMany that passes the filter through.
Passing a string won't match stored ObjectIds, leaving orphaned entries.

* chore: comment out unused requireManageUsers alongside disabled delete route

* fix: add missing logger.warn mock in capabilities test

* fix: harden admin users handlers — type safety, response consistency, test coverage

- Unify response shape: AdminUserSearchResult.userId → id, add AdminUserListItem type
- Fix unsafe req.query type assertion in searchUsersHandler (typeof guards)
- Anchor search regex with ^ for prefix matching (enables index usage)
- Add total/capped to search response for truncation signaling
- Add parseInt radix, remove redundant new Date() wrap
- Add tests: countUsers throw, countUsers call args, array query param, capped flag

* fix: scope deleteGrantsForPrincipal to tenant, deterministic search sort, align test mocks

- Add tenantId option to AdminUsersDeps.deleteGrantsForPrincipal and
  pass req.user.tenantId at the call site, matching the pattern already
  used by the roles and groups handlers
- Add sort: { name: 1 } to searchUsersHandler for deterministic results
- Align test mock deleteUserById messages with production output
  ('User was deleted successfully.')
- Make capped-results test explicitly set limit: '20' instead of
  relying on the implicit default

* test: add tenantId propagation test for deleteGrantsForPrincipal

Add tenantId to createReqRes user type and test that a non-undefined
tenantId is threaded through to deleteGrantsForPrincipal.

* test: remove redundant deleteUserById override in tenantId test

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-03-30 23:06:50 -04:00
Danny Avila
fd01dfc083
💰 fix: Lazy-Initialize Balance Record at Check Time for Overrides (#12474)
* fix: Lazy-initialize balance record when missing at check time

When balance is configured via admin panel DB overrides, users with
existing sessions never pass through the login middleware that creates
their balance record. This causes checkBalanceRecord to find no record
and return balance: 0, blocking the user.

Add optional balanceConfig and upsertBalanceFields deps to
CheckBalanceDeps. When no balance record exists but startBalance is
configured, lazily create the record instead of returning canSpend: false.

Pass the new deps from BaseClient, chatV1, and chatV2 callers.

* test: Add checkBalance lazy initialization tests

Cover lazy balance init scenarios: successful init with startBalance,
insufficient startBalance, missing config fallback, undefined
startBalance, missing upsertBalanceFields dep, and startBalance of 0.

* fix: Address review findings for lazy balance initialization

- Use canonical BalanceConfig and IBalanceUpdate types from
  @librechat/data-schemas instead of inline type definitions
- Include auto-refill fields (autoRefillEnabled, refillIntervalValue,
  refillIntervalUnit, refillAmount, lastRefill) during lazy init,
  mirroring the login middleware's buildUpdateFields logic
- Add try/catch around upsertBalanceFields with graceful fallback to
  canSpend: false on DB errors
- Read balance from DB return value instead of raw startBalance constant
- Fix misleading test names to describe observable throw behavior
- Add tests: upsertBalanceFields rejection, auto-refill field inclusion,
  DB-returned balance value, and logViolation assertions

* fix: Address second review pass findings

- Fix import ordering: package type imports before local type imports
- Remove misleading comment on DB-fallback test, rename for clarity
- Add logViolation assertion to insufficient-balance lazy-init test
- Add test for partial auto-refill config (autoRefillEnabled without
  required dependent fields)

* refactor: Replace createMockReqRes factory with describe-scoped consts

Replace zero-argument factory with plain const declarations using
direct type casts instead of double-cast through unknown.

* fix: Sort local type imports longest-first, add missing logViolation assertion

- Reorder local type imports in spec file per AGENTS.md (longest to
  shortest within sub-group)
- Add logViolation assertion to startBalance: 0 test for consistent
  violation payload coverage across all throw paths
2026-03-30 22:51:07 -04:00
Dustin Healy
a4a17ac771
⛩️ feat: Admin Grants API Endpoints (#12438)
* feat: add System Grants handler factory with tests

Handler factory with 4 endpoints: getEffectiveCapabilities (expanded
capability set for authenticated user), getPrincipalGrants (list grants
for a specific principal), assignGrant, and revokeGrant. Write ops
dynamically check MANAGE_ROLES/GROUPS/USERS based on target principal
type. 31 unit tests covering happy paths, validation, 403, and errors.

* feat: wire System Grants REST routes

Mount /api/admin/grants with requireJwtAuth + ACCESS_ADMIN gate.
Add barrel export for createAdminGrantsHandlers and AdminGrantsDeps.

* fix: cascade grant cleanup on role deletion

Add deleteGrantsForPrincipal to AdminRolesDeps and call it in
deleteRoleHandler via Promise.allSettled after successful deletion,
matching the groups cleanup pattern. 3 tests added for cleanup call,
skip on 404, and resilience to cleanup failure.

* fix: simplify cascade grant cleanup on role deletion

Replace Promise.allSettled wrapper with a direct try/catch for the
single deleteGrantsForPrincipal call.

* fix: harden grant handlers with auth, validation, types, and RESTful revoke

- Add per-handler auth checks (401) and granular capability gates
  (READ_* for getPrincipalGrants, possession check for assignGrant)
- Extract validatePrincipal helper; rewrite validateGrantBody to use
  direct type checks instead of unsafe `as string` casts
- Align DI types with data layer (ResolvedPrincipal.principalType
  widened to string, getUserPrincipals role made optional)
- Switch revoke route from DELETE body to RESTful URL params
- Return 201 for assignGrant to match roles/groups create convention
- Handle null grantCapability return with 500
- Add comprehensive test coverage for new auth/validation paths

* fix: deduplicate ResolvedPrincipal, typed body, defensive auth checks

- Remove duplicate ResolvedPrincipal from capabilities.ts; import the
  canonical export from grants.ts
- Replace Record<string, unknown> with explicit GrantRequestBody interface
- Add defensive 403 when READ_CAPABILITY_BY_TYPE lookup misses
- Document revoke asymmetry (no possession check) with JSDoc
- Use _id only in resolveUser (avoid Mongoose virtual reliance)
- Improve null-grant error message
- Complete logger mock in tests

* refactor: move ResolvedPrincipal to shared types to fix circular dep

Extract ResolvedPrincipal from admin/grants.ts to types/principal.ts
so middleware/capabilities.ts imports from shared types rather than
depending upward on the admin handler layer.

* chore: remove dead re-export, align logger mocks across admin tests

- Remove unused ResolvedPrincipal re-export from grants.ts (canonical
  source is types/principal.ts)
- Align logger mocks in roles.spec.ts and groups.spec.ts to include
  all log levels (error, warn, info, debug) matching grants.spec.ts

* fix: cascade Config and AclEntry cleanup on role deletion

Add deleteConfig and deleteAclEntries to role deletion cascade,
matching the group deletion pattern. Previously only grants were
cleaned up, leaving orphaned config overrides and ACL entries.

* perf: single-query batch for getEffectiveCapabilities

Add getCapabilitiesForPrincipals (plural) to the data layer — a single
$or query across all principals instead of N+1 parallel queries. Wire
it into the grants handler so getEffectiveCapabilities hits the DB once
regardless of how many principals the user has.

* fix: defer SystemCapabilities access to factory call time

Move all SystemCapabilities usage (VALID_CAPABILITIES,
MANAGE_CAPABILITY_BY_TYPE, READ_CAPABILITY_BY_TYPE) inside the
createAdminGrantsHandlers factory. External test suites that mock
@librechat/data-schemas without providing SystemCapabilities crashed
at import time when grants.ts was loaded transitively.

* test: add data-layer and handler test coverage for review findings

- Add 6 mongodb-memory-server tests for getCapabilitiesForPrincipals:
  multi-principal batch, empty array, filtering, tenant scoping
- Add handler test: all principals filtered (only PUBLIC)
- Add handler test: granting an implied capability succeeds
- Add handler test: all cascade cleanup operations fail simultaneously
- Document platform-scope-only tenantId behavior in JSDoc

* fix: resolveUser fallback to user.id, early-return empty principals

- Match capabilities middleware pattern: _id?.toString() ?? user.id
  to handle JWT-deserialized users without Mongoose _id
- Move empty-array guard before principals.map() to skip unnecessary
  normalizePrincipalId calls
- Add comment explaining VALID_PRINCIPAL_TYPES module-scope asymmetry

* refactor: derive VALID_PRINCIPAL_TYPES from capability maps

Make MANAGE_CAPABILITY_BY_TYPE and READ_CAPABILITY_BY_TYPE
non-Partial Records over a shared GrantPrincipalType union, then
derive VALID_PRINCIPAL_TYPES from the map keys. This makes divergence
between the three data structures structurally impossible.

* feat: add GET /api/admin/grants list-all-grants endpoint

Add listAllGrants data-layer method and handler so the admin panel
can fetch all grants in a single request instead of fanning out
N+M calls per role and group. Response is filtered to only include
grants for principal types the caller has read access to.

* fix: update principalType to use GrantPrincipalType for consistency in grants handling

- Refactor principalType in createAdminGrantsHandlers to use GrantPrincipalType instead of PrincipalType for better type accuracy.
- Ensure type consistency across the grants handling logic in the API.

* fix: address admin grants review findings — tenantId propagation, capability validation, pagination, and test coverage

Propagate tenantId through all grant operations for multi-tenancy support.
Extract isValidCapability to accept full SystemCapability union (base, section,
assign) and reuse it in both Mongoose schema validation and handler input checks.
Replace listAllGrants with paginated listGrants + countGrants. Filter PUBLIC
principals from getCapabilitiesForPrincipals queries. Export getCachedPrincipals
from ALS store for fast-path principal resolution. Move DELETE capability param
to query string to avoid colon-in-URL issues. Remove dead code and add
comprehensive handler and data-layer test coverage.

* refactor: harden admin grants — FilterQuery types, auth-first ordering, DELETE path param, isValidCapability tests

Replace Record<string, unknown> with FilterQuery<ISystemGrant> across all
data-layer query filters. Refactor buildTenantFilter to a pure tenantCondition
function that returns a composable FilterQuery fragment, eliminating the $or
collision between tenant and principal queries. Move auth check before input
validation in getPrincipalGrantsHandler, assignGrantHandler, and
revokeGrantHandler to avoid leaking valid type names to unauthenticated callers.
Switch DELETE route from query param back to path param (/:capability) with
encodeURIComponent per project conventions. Add compound index for listGrants
sort. Type VALID_PRINCIPAL_TYPES as Set<GrantPrincipalType>. Remove unused
GetCachedPrincipalsFn type export. Add dedicated isValidCapability unit tests
and revokeGrant idempotency test.

* refactor: batch capability checks in listGrantsHandler via getHeldCapabilities

Replace 3 parallel hasCapabilityForPrincipals DB calls with a single
getHeldCapabilities query that returns the subset of capabilities any
principal holds. Also: defensive limit(0) clamp, parallelized assignGrant
auth checks, principalId type-vs-required error split, tenantCondition
hoisted to factory top, JSDoc on cascade deps, DELETE route encoding note.

* fix: normalize principalId and filter undefined in getHeldCapabilities

Add normalizePrincipalId + null guard to getHeldCapabilities, matching
the contract of getCapabilitiesForPrincipals. Simplify allCaps build
with flatMap, add no-tenantId cross-check and undefined-principalId
test cases.

* refactor: use concrete types in GrantRequestBody, rename encoding test

Replace unknown fields with explicit string types in GrantRequestBody,
matching the established pattern in roles/groups/config handlers. Rename
misleading 'encoded' test to 'with colons' since Express auto-decodes
req.params.

* fix: support hierarchical parent capabilities in possession checks

hasCapabilityForPrincipals and getHeldCapabilities now resolve parent
base capabilities for section/assignment grants. An admin holding
manage:configs can now grant manage:configs:<section> and transitively
read:configs:<section>. Fixes anti-escalation 403 blocking config
capability delegation.

* perf: use getHeldCapabilities in assignGrant to halve DB round-trips

assignGrantHandler was making two parallel hasCapabilityForPrincipals
calls to check manage + capability possession. getHeldCapabilities was
introduced in this PR specifically for this pattern. Replace with a
single batched call. Update corresponding spec assertions.

* fix: validate role existence before granting capabilities

Grants for non-existent role names were silently persisted, creating
orphaned grants that could surprise-activate if a role with that name
was later created. Add optional checkRoleExists dep to assignGrant and
wire it to getRoleByName in the route file.

* refactor: tighten principalType typing and use grantCapability in tests

Narrow getCapabilitiesForPrincipals parameter from string to
PrincipalType, removing the redundant cast. Replace direct
SystemGrant.create() calls in getCapabilitiesForPrincipals tests with
methods.grantCapability() to honor the schema's normalization invariant.
Add getHeldCapabilities extended capability tests.

* test: rename misleading cascade cleanup test name

The test only injects failure into deleteGrantsForPrincipal, not all
cascade operations. Rename from 'cascade cleanup fails' to 'grant
cleanup fails' to match the actual scope.

* fix: reorder role check after permission guard, add tenantId to index

Move checkRoleExists after the getHeldCapabilities permission check so
that a sub-MANAGE_ROLES admin cannot probe role name existence via
400 vs 403 response codes.

Add tenantId to the { principalType, capability } index so listGrants
queries in multi-tenant deployments can use a covering index instead
of post-scanning for tenant condition.

Add missing test for checkRoleExists throwing.

* fix: scope deleteGrantsForPrincipal to tenant on role deletion

deleteGrantsForPrincipal previously filtered only on principalType +
principalId, deleting grants across all tenants. Since the role schema
supports multi-tenancy (compound unique index on name + tenantId), two
tenants can share a role name like 'editor'. Deleting that role in one
tenant would wipe grants for identically-named roles in other tenants.

Add optional tenantId parameter to deleteGrantsForPrincipal. When
provided, scopes the delete to that tenant plus platform-level grants.
Propagate req.user.tenantId through the role deletion cascade.

* fix: scope grant cleanup to tenant on group deletion

Same cross-tenant gap as the role deletion path: deleteGroupHandler
called deleteGrantsForPrincipal without tenantId, so deleting a group
would wipe its grants across all tenants. Extract req.user.tenantId
and pass it through.

* test: add HTTP integration test for admin grants routes

Supertest-based test with real MongoMemoryServer exercising the full
Express wiring: route registration, injected auth middleware, handler
DI deps, and real DB round-trips.

Covers GET /, GET /effective, POST / + DELETE / lifecycle, role
existence validation, and 401 for unauthenticated callers.

Also documents the expandImplications scope: the /effective endpoint
returns base-level capabilities only; section-level resolution is
handled at authorization check time by getParentCapabilities.

* fix: use exact tenant match in deleteGrantsForPrincipal, normalize principalId, harden API

CRITICAL: deleteGrantsForPrincipal was using tenantCondition (a
read-query helper) for deleteMany, which includes the
{ tenantId: { $exists: false } } arm. This silently destroyed
platform-level grants when a tenant-scoped role/group deletion
occurred. Replace with exact { tenantId } match for deletes so
platform-level grants survive tenant-scoped cascade cleanup.

Refactor deleteGrantsForPrincipal signature from fragile positional
overload (sessionOrTenantId union + maybeSession) to a clean options
object: { tenantId?, session? }. Update all callers and test assertions.

Add normalizePrincipalId to hasCapabilityForPrincipals to match the
pattern already used by getHeldCapabilities — prevents string/ObjectId
type mismatch on USER/GROUP principal queries.

Also: export GrantPrincipalType from barrel, add upper-bound cap to
listGrants, document GROUP/USER existence check trade-off, add
integration tests for tenant-isolation property of deleteGrantsForPrincipal.

* fix: forward tenantId to getUserPrincipals in resolvePrincipals

resolvePrincipals had tenantId available from the caller but only
forwarded it to getCachedPrincipals (cache lookup). The DB fallback
via getUserPrincipals omitted it. While the Group schema's
applyTenantIsolation Mongoose plugin handles scoping via
AsyncLocalStorage in HTTP request context, explicitly passing tenantId
makes the contract visible and prevents silent cross-tenant group
resolution if called outside request context.

* fix: remove unused import and add assertion to 401 integration test

Remove unused SystemCapabilities import flagged by ESLint. Add explicit
body assertion to the 401 test so it has a jest expect() call.

* chore: hoist grant limit constants to scope, remove dead isolateModules

Move GRANTS_DEFAULT_LIMIT / GRANTS_MAX_LIMIT from inside listGrants
function body to createSystemGrantMethods scope so they are evaluated
once at module load. Remove dead jest.isolateModules + jest.doMock
block in integration test — the ~/models mock was never exercised
since handlers are built with explicit DI deps.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-03-30 16:49:23 -04:00
Danny Avila
fda1bfc3cc
🔬 ci: Add TypeScript Type Checks to Backend Workflow and Fix All Type Errors (#12451)
* fix(data-schemas): resolve TypeScript strict type check errors in source files

- Constrain ConfigSection to string keys via `string & keyof TCustomConfig`
- Replace broken `z` import from data-provider with TCustomConfig derivation
- Add `_id: Types.ObjectId` to IUser matching other Document interfaces
- Add `federatedTokens` and `openidTokens` optional fields to IUser
- Type mongoose model accessors as `Model<IRole>` and `Model<IUser>`
- Widen `getPremiumRate` param to accept `number | null`
- Widen `bulkWriteAclEntries` ops to untyped `AnyBulkWriteOperation[]`
- Fix `getUserPrincipals` return type to use `PrincipalType` enum
- Add non-null assertions for `connection.db` in migration files
- Import DailyRotateFile constructor directly instead of relying on
  broken module augmentation across mismatched node_modules trees
- Add winston-daily-rotate-file as devDependency for type resolution

* fix(data-schemas): resolve TypeScript type errors in test files

- Replace arbitrary test keys with valid TCustomConfig properties in config.spec
- Use non-null assertions for permission objects in role.methods.spec
- Replace `.SHARED_GLOBAL` access with `.not.toHaveProperty()` for legacy field
- Add non-null assertions for balance, writeRate, readRate in spendTokens.spec
- Update mock user _id to use ObjectId in user.test
- Remove unused Schema import in tenantIndexes.spec

* fix(api): resolve TypeScript strict type check errors across source and test files

- Widen getUserPrincipals dep type in capabilities middleware
- Fix federatedTokens type in createSafeUser return
- Use proper mock req type for read-only properties in preAuthTenant.spec
- Replace `as IUser` casts with ObjectId-typed mocks in openid/oidc specs
- Use TokenExchangeMethodEnum values instead of string literals in MCP specs
- Fix SessionStore type compatibility in sessionCache specs
- Replace `catch (error: any)` with `(error as Error)` in redis specs
- Remove invalid properties from test data in initialize and MCP specs
- Add String.prototype.isWellFormed declaration for sanitizeTitle spec

* fix(client): resolve TypeScript type errors in shared client components

- Add default values for destructured bindings in OGDialogTemplate
- Replace broken ExtendedFile import with inline type in FileIcon

* ci: add TypeScript type-check job to backend review workflow

Add a `typecheck` job that runs `tsc --noEmit` on all four TypeScript
workspaces (data-provider, data-schemas, @librechat/api, @librechat/client)
after the build step. Catches type errors that rollup builds may miss.

* fix(data-schemas): add local type declaration for DailyRotateFile transport

The `winston-daily-rotate-file` package ships a module augmentation for
`winston/lib/winston/transports`, but it fails when winston and
winston-daily-rotate-file resolve from different node_modules trees
(which happens in this monorepo due to npm hoisting).

Add a local `.d.ts` declaration that augments the same module path from
within data-schemas' compilation unit, so `tsc --noEmit` passes while
keeping the original runtime pattern (`new winston.transports.DailyRotateFile`).

* fix: address code review findings from PR #12451

- Restore typed `AnyBulkWriteOperation<AclEntry>[]` on bulkWriteAclEntries,
  cast to untyped only at the tenantSafeBulkWrite call site (Finding 1)
- Type `findUser` model accessor consistently with `findUsers` (Finding 2)
- Replace inline `import('mongoose').ClientSession` with top-level import type
- Use `toHaveLength` for spy assertions in playwright-expect spec file
- Replace numbered Record casts with `.not.toHaveProperty()` in
  role.methods.spec for SHARED_GLOBAL assertions
- Use per-test ObjectIds instead of shared testUserId in openid.spec
- Replace inline `import()` type annotations with top-level SessionData
  import in sessionCache spec
- Remove extraneous blank line in user.ts searchUsers

* refactor: address remaining review findings (4–7)

- Extract OIDCTokens interface in user.ts; deduplicate across IUser fields
  and oidc.ts FederatedTokens (Finding 4)
- Move String.isWellFormed declaration from spec file to project-level
  src/types/es2024-string.d.ts (Finding 5)
- Replace verbose `= undefined` defaults in OGDialogTemplate with null
  coalescing pattern (Finding 6)
- Replace `Record<string, unknown>` TestConfig with named interface
  containing explicit test fields (Finding 7)
2026-03-28 21:06:39 -04:00
Danny Avila
877c2efc85
🏗️ feat: bulkWrite isolation, pre-auth context, strict-mode fixes (#12445)
* fix: wrap seedDatabase() in runAsSystem() for strict tenant mode

seedDatabase() was called without tenant context at startup, causing
every Mongoose operation inside it to throw when
TENANT_ISOLATION_STRICT=true. Wrapping in runAsSystem() gives it the
SYSTEM_TENANT_ID sentinel so the isolation plugin skips filtering,
matching the pattern already used for performStartupChecks and
updateInterfacePermissions.

* fix: chain tenantContextMiddleware in optionalJwtAuth

optionalJwtAuth populated req.user but never established ALS tenant
context, unlike requireJwtAuth which chains tenantContextMiddleware
after successful auth. Authenticated users hitting routes with
optionalJwtAuth (e.g. /api/banner) had no tenant isolation.

* feat: tenant-safe bulkWrite wrapper and call-site migration

Mongoose's bulkWrite() does not trigger schema-level middleware hooks,
so the applyTenantIsolation plugin cannot intercept it. This adds a
tenantSafeBulkWrite() utility that injects the current ALS tenant
context into every operation's filter/document before delegating to
native bulkWrite.

Migrates all 8 runtime bulkWrite call sites:
- agentCategory (seedCategories, ensureDefaultCategories)
- conversation (bulkSaveConvos)
- message (bulkSaveMessages)
- file (batchUpdateFiles)
- conversationTag (updateTagsForConversation, bulkIncrementTagCounts)
- aclEntry (bulkWriteAclEntries)

systemGrant.seedSystemGrants is intentionally not migrated — it uses
explicit tenantId: { $exists: false } filters and is exempt from the
isolation plugin.

* feat: pre-auth tenant middleware and tenant-scoped config cache

Adds preAuthTenantMiddleware that reads X-Tenant-Id from the request
header and wraps downstream in tenantStorage ALS context. Wired onto
/oauth, /api/auth, /api/config, and /api/share — unauthenticated
routes that need tenant scoping before JWT auth runs.

The /api/config cache key is now tenant-scoped
(STARTUP_CONFIG:${tenantId}) so multi-tenant deployments serve the
correct login page config per tenant.

The middleware is intentionally minimal — no subdomain parsing, no
OIDC claim extraction. The private fork's reverse proxy or auth
gateway sets the header.

* feat: accept optional tenantId in updateInterfacePermissions

When tenantId is provided, the function re-enters inside
tenantStorage.run({ tenantId }) so all downstream Mongoose queries
target that tenant's roles instead of the system context. This lets
the private fork's tenant provisioning flow call
updateInterfacePermissions per-tenant after creating tenant-scoped
ADMIN/USER roles.

* fix: tenant-filter $lookup in getPromptGroup aggregation

The $lookup stage in getPromptGroup() queried the prompts collection
without tenant filtering. While the outer PromptGroup aggregate is
protected by the tenantIsolation plugin's pre('aggregate') hook,
$lookup runs as an internal MongoDB operation that bypasses Mongoose
hooks entirely.

Converts from simple field-based $lookup to pipeline-based $lookup
with an explicit tenantId match when tenant context is active.

* fix: replace field-level unique indexes with tenant-scoped compounds

Field-level unique:true creates a globally-unique single-field index in
MongoDB, which would cause insert failures across tenants sharing the
same ID values.

- agent.id: removed field-level unique, added { id, tenantId } compound
- convo.conversationId: removed field-level unique (compound at line 50
  already exists: { conversationId, user, tenantId })
- message.messageId: removed field-level unique (compound at line 165
  already exists: { messageId, user, tenantId })
- preset.presetId: removed field-level unique, added { presetId, tenantId }
  compound

* fix: scope MODELS_CONFIG, ENDPOINT_CONFIG, PLUGINS, TOOLS caches by tenant

These caches store per-tenant configuration (available models, endpoint
settings, plugin availability, tool definitions) but were using global
cache keys. In multi-tenant mode, one tenant's cached config would be
served to all tenants.

Appends :${tenantId} to cache keys when tenant context is active.
Falls back to the unscoped key when no tenant context exists (backward
compatible for single-tenant OSS deployments).

Covers all read, write, and delete sites:
- ModelController.js: get/set MODELS_CONFIG
- PluginController.js: get/set PLUGINS, get/set TOOLS
- getEndpointsConfig.js: get/set/delete ENDPOINT_CONFIG
- app.js: delete ENDPOINT_CONFIG (clearEndpointConfigCache)
- mcp.js: delete TOOLS (updateMCPTools, mergeAppTools)
- importers.js: get ENDPOINT_CONFIG

* fix: add getTenantId to PluginController spec mock

The data-schemas mock was missing getTenantId, causing all
PluginController tests to throw when the controller calls
getTenantId() for tenant-scoped cache keys.

* fix: address review findings — migration, strict-mode, DRY, types

Addresses all CRITICAL, MAJOR, and MINOR review findings:

F1 (CRITICAL): Add agents, conversations, messages, presets to
SUPERSEDED_INDEXES in tenantIndexes.ts so dropSupersededTenantIndexes()
drops the old single-field unique indexes that block multi-tenant inserts.

F2 (CRITICAL): Unknown bulkWrite op types now throw in strict mode
instead of silently passing through without tenant injection.

F3 (MAJOR): Replace wildcard export with named export for
tenantSafeBulkWrite, hiding _resetBulkWriteStrictCache from the
public package API.

F5 (MAJOR): Restore AnyBulkWriteOperation<IAclEntry>[] typing on
bulkWriteAclEntries — the unparameterized wrapper accepts parameterized
ops as a subtype.

F7 (MAJOR): Fix config.js tenant precedence — JWT-derived
req.user.tenantId now takes priority over the X-Tenant-Id header for
authenticated requests.

F8 (MINOR): Extract scopedCacheKey() helper into tenantContext.ts and
replace all 11 inline occurrences across 7 files.

F9 (MINOR): Use simple localField/foreignField $lookup for the
non-tenant getPromptGroup path (more efficient index seeks).

F12 (NIT): Remove redundant BulkOp type alias.
F13 (NIT): Remove debug log that leaked raw tenantId.

* fix: add new superseded indexes to tenantIndexes test fixture

The test creates old indexes to verify the migration drops them.
Missing fixture entries for agents.id_1, conversations.conversationId_1,
messages.messageId_1, and presets.presetId_1 caused the count assertion
to fail (expected 22, got 18).

* fix: restore logger.warn for unknown bulk op types in non-strict mode

* fix: block SYSTEM_TENANT_ID sentinel from external header input

CRITICAL: preAuthTenantMiddleware accepted any string as X-Tenant-Id,
including '__SYSTEM__'. The tenantIsolation plugin treats SYSTEM_TENANT_ID
as an explicit bypass — skipping ALL query filters. A client sending
X-Tenant-Id: __SYSTEM__ to pre-auth routes (/api/share, /api/config,
/api/auth, /oauth) would execute Mongoose operations without tenant
isolation.

Fixes:
- preAuthTenantMiddleware rejects SYSTEM_TENANT_ID in header
- scopedCacheKey returns the base key (not key:__SYSTEM__) in system
  context, preventing stale cache entries during runAsSystem()
- updateInterfacePermissions guards tenantId against SYSTEM_TENANT_ID
- $lookup pipeline separates $expr join from constant tenantId match
  for better index utilization
- Regression test for sentinel rejection in preAuthTenant.spec.ts
- Remove redundant getTenantId() call in config.js

* test: add missing deleteMany/replaceOne coverage, fix vacuous ALS assertions

bulkWrite spec:
- deleteMany: verifies tenant-scoped deletion leaves other tenants untouched
- replaceOne: verifies tenantId injected into both filter and replacement
- replaceOne overwrite: verifies a conflicting tenantId in the replacement
  document is overwritten by the ALS tenant (defense-in-depth)
- empty ops array: verifies graceful handling

preAuthTenant spec:
- All negative-case tests now use the capturedNext pattern to verify
  getTenantId() inside the middleware's execution context, not the
  test runner's outer frame (which was always undefined regardless)

* feat: tenant-isolate MESSAGES cache, FLOWS cache, and GenerationJobManager

MESSAGES cache (streamAudio.js):
- Cache key now uses scopedCacheKey(messageId) to prefix with tenantId,
  preventing cross-tenant message content reads during TTS streaming.

FLOWS cache (FlowStateManager):
- getFlowKey() now generates ${type}:${tenantId}:${flowId} when tenant
  context is active, isolating OAuth flow state per tenant.

GenerationJobManager:
- tenantId added to SerializableJobData and GenerationJobMetadata
- createJob() captures the current ALS tenant context (excluding
  SYSTEM_TENANT_ID) and stores it in job metadata
- SSE subscription endpoint validates job.metadata.tenantId matches
  req.user.tenantId, blocking cross-tenant stream access
- Both InMemoryJobStore and RedisJobStore updated to accept tenantId

* fix: add getTenantId and SYSTEM_TENANT_ID to MCP OAuth test mocks

FlowStateManager.getFlowKey() now calls getTenantId() for tenant-scoped
flow keys. The 4 MCP OAuth test files mock @librechat/data-schemas
without these exports, causing TypeError at runtime.

* fix: correct import ordering per AGENTS.md conventions

Package imports sorted shortest to longest line length, local imports
sorted longest to shortest — fixes ordering violations introduced by
our new imports across 8 files.

* fix: deserialize tenantId in RedisJobStore — cross-tenant SSE guard was no-op in Redis mode

serializeJob() writes tenantId to the Redis hash via Object.entries,
but deserializeJob() manually enumerates fields and omitted tenantId.
Every getJob() from Redis returned tenantId: undefined, causing the
SSE route's cross-tenant guard to short-circuit (undefined && ... → false).

* test: SSE tenant guard, FlowStateManager key consistency, ALS scope docs

SSE stream tenant tests (streamTenant.spec.js):
- Cross-tenant user accessing another tenant's stream → 403
- Same-tenant user accessing own stream → allowed
- OSS mode (no tenantId on job) → tenant check skipped

FlowStateManager tenant tests (manager.tenant.spec.ts):
- completeFlow finds flow created under same tenant context
- completeFlow does NOT find flow under different tenant context
- Unscoped flows are separate from tenant-scoped flows

Documentation:
- JSDoc on getFlowKey documenting ALS context consistency requirement
- Comment on streamAudio.js scopedCacheKey capture site

* fix: SSE stream tests hang on success path, remove internal fork references

The success-path tests entered the SSE streaming code which never
closes, causing timeout. Mock subscribe() to end the response
immediately. Restructured assertions to verify non-403/non-404.

Removed "private fork" and "OSS" references from code and test
descriptions — replaced with "deployment layer", "multi-tenant
deployments", and "single-tenant mode".

* fix: address review findings — test rigor, tenant ID validation, docs

F1: SSE stream tests now mock subscribe() with correct signature
(streamId, writeEvent, onDone, onError) and assert 200 status,
verifying the tenant guard actually allows through same-tenant users.

F2: completeFlow logs the attempted key and ALS tenantId when flow
is not found, so reverse proxy misconfiguration (missing X-Tenant-Id
on OAuth callback) produces an actionable warning.

F3/F10: preAuthTenantMiddleware validates tenant ID format — rejects
colons, special characters, and values exceeding 128 chars. Trims
whitespace. Prevents cache key collisions via crafted headers.

F4: Documented cache invalidation scope limitation in
clearEndpointConfigCache — only the calling tenant's key is cleared;
other tenants expire via TTL.

F7: getFlowKey JSDoc now lists all 8 methods requiring consistent
ALS context.

F8: Added dedicated scopedCacheKey unit tests — base key without
context, base key in system context, scoped key with tenant, no
ALS leakage across scope boundaries.

* fix: revert flow key tenant scoping, fix SSE test timing

FlowStateManager: Reverts tenant-scoped flow keys. OAuth callbacks
arrive without tenant ALS context (provider redirects don't carry
X-Tenant-Id), so completeFlow/failFlow would never find flows
created under tenant context. Flow IDs are random UUIDs with no
collision risk, and flow data is ephemeral (TTL-bounded).

SSE tests: Use process.nextTick for onDone callback so Express
response headers are flushed before res.write/res.end are called.

* fix: restore getTenantId import for completeFlow diagnostic log

* fix: correct completeFlow warning message, add missing flow test

The warning referenced X-Tenant-Id header consistency which was only
relevant when flow keys were tenant-scoped (since reverted). Updated
to list actual causes: TTL expiry, missing flow, or routing to a
different instance without shared Keyv storage.

Removed the getTenantId() call and import — no longer needed since
flow keys are unscoped.

Added test for the !flowState branch in completeFlow — verifies
return false and logger.warn on nonexistent flow ID.

* fix: add explicit return type to recursive updateInterfacePermissions

The recursive call (tenantId branch calls itself without tenantId)
causes TypeScript to infer circular return type 'any'. Adding
explicit Promise<void> satisfies the rollup typescript plugin.

* fix: update MCPOAuthRaceCondition test to match new completeFlow warning

* fix: clearEndpointConfigCache deletes both scoped and unscoped keys

Unauthenticated /api/endpoints requests populate the unscoped
ENDPOINT_CONFIG key. Admin config mutations clear only the
tenant-scoped key, leaving the unscoped entry stale indefinitely.
Now deletes both when in tenant context.

* fix: tenant guard on abort/status endpoints, warn logs, test coverage

F1: Add tenant guard to /chat/status/:conversationId and /chat/abort
matching the existing guard on /chat/stream/:streamId. The status
endpoint exposes aggregatedContent (AI response text) which requires
tenant-level access control.

F2: preAuthTenantMiddleware now logs warn for rejected __SYSTEM__
sentinel and malformed tenant IDs, providing observability for
bypass probing attempts.

F3: Abort fallback path (getActiveJobIdsForUser) now has tenant
check after resolving the job.

F4: Test for strict mode + SYSTEM_TENANT_ID — verifies runAsSystem
bypasses tenantSafeBulkWrite without throwing in strict mode.

F5: Test for job with tenantId + user without tenantId → 403.

F10: Regex uses idiomatic hyphen-at-start form.

F11: Test descriptions changed from "rejects" to "ignores" since
middleware calls next() (not 4xx).

Also fixes MCPOAuthRaceCondition test assertion to match updated
completeFlow warning message.

* fix: test coverage for logger.warn, status/abort guards, consistency

A: preAuthTenant spec now mocks logger and asserts warn calls for
__SYSTEM__ sentinel, malformed characters, and oversized headers.

B: streamTenant spec expanded with status and abort endpoint tests —
cross-tenant status returns 403, same-tenant returns 200 with body,
cross-tenant abort returns 403.

C: Abort endpoint uses req.user.tenantId (not req.user?.tenantId)
matching stream/status pattern — requireJwtAuth guarantees req.user.

D: Malformed header warning now includes ip in log metadata,
matching the sentinel warning for consistent SOC correlation.

* fix: assert ip field in malformed header warn tests

* fix: parallelize cache deletes, document tenant guard, fix import order

- clearEndpointConfigCache uses Promise.all for independent cache
  deletes instead of sequential awaits
- SSE stream tenant guard has inline comment explaining backward-compat
  behavior for untenanted legacy jobs
- conversation.ts local imports reordered longest-to-shortest per
  AGENTS.md

* fix: tenant-qualify userJobs keys, document tenant guard backward-compat

Job store userJobs keys now include tenantId when available:
- Redis: stream:user:{tenantId:userId}:jobs (falls back to
  stream:user:{userId}:jobs when no tenant)
- InMemory: composite key tenantId:userId in userJobMap

getActiveJobIdsByUser/getActiveJobIdsForUser accept optional tenantId
parameter, threaded through from req.user.tenantId at all call sites
(/chat/active and /chat/abort fallback).

Added inline comments on all three SSE tenant guards explaining the
backward-compat design: untenanted legacy jobs remain accessible
when the userId check passes.

* fix: parallelize cache deletes, document tenant guard, fix import order

Fix InMemoryJobStore.getActiveJobIdsByUser empty-set cleanup to use
the tenant-qualified userKey instead of bare userId — prevents
orphaned empty Sets accumulating in userJobMap for multi-tenant users.

Document cross-tenant staleness in clearEndpointConfigCache JSDoc —
other tenants' scoped keys expire via TTL, not active invalidation.

* fix: cleanup userJobMap leak, startup warning, DRY tenant guard, docs

F1: InMemoryJobStore.cleanup() now removes entries from userJobMap
before calling deleteJob, preventing orphaned empty Sets from
accumulating with tenant-qualified composite keys.

F2: Startup warning when TENANT_ISOLATION_STRICT is active — reminds
operators to configure reverse proxy to control X-Tenant-Id header.

F3: mergeAppTools JSDoc documents that tenant-scoped TOOLS keys are
not actively invalidated (matching clearEndpointConfigCache pattern).

F5: Abort handler getActiveJobIdsForUser call uses req.user.tenantId
(not req.user?.tenantId) — consistent with stream/status handlers.

F6: updateInterfacePermissions JSDoc clarifies SYSTEM_TENANT_ID
behavior — falls through to caller's ALS context.

F7: Extracted hasTenantMismatch() helper, replacing three identical
inline tenant guard blocks across stream/status/abort endpoints.

F9: scopedCacheKey JSDoc documents both passthrough cases (no context
and SYSTEM_TENANT_ID context).

* fix: clean userJobMap in evictOldest — same leak as cleanup()
2026-03-28 16:43:50 -04:00
Danny Avila
9f6d8c6e93
🧵 feat: ALS Context Middleware, Tenant Threading, and Config Cache Invalidation (#12407)
* feat: add tenant context middleware for ALS-based isolation

Introduces tenantContextMiddleware that propagates req.user.tenantId
into AsyncLocalStorage, activating the Mongoose applyTenantIsolation
plugin for all downstream DB queries within a request.

- Strict mode (TENANT_ISOLATION_STRICT=true) returns 403 if no tenantId
- Non-strict mode passes through for backward compatibility
- No-op for unauthenticated requests
- Includes 6 unit tests covering all paths

* feat: register tenant middleware and wrap startup/auth in runAsSystem()

- Register tenantContextMiddleware in Express app after capability middleware
- Wrap server startup initialization in runAsSystem() for strict mode compat
- Wrap auth strategy getAppConfig() calls in runAsSystem() since they run
  before user context is established (LDAP, SAML, OpenID, social login, AuthService)

* feat: thread tenantId through all getAppConfig callers

Pass tenantId from req.user to getAppConfig() across all callers that
have request context, ensuring correct per-tenant cache key resolution.

Also fixes getBaseConfig admin endpoint to scope to requesting admin's
tenant instead of returning the unscoped base config.

Files updated:
- Controllers: UserController, PluginController
- Middleware: checkDomainAllowed, balance
- Routes: config
- Services: loadConfigModels, loadDefaultModels, getEndpointsConfig, MCP
- Audio services: TTSService, STTService, getVoices, getCustomConfigSpeech
- Admin: getBaseConfig endpoint

* feat: add config cache invalidation on admin mutations

- Add clearOverrideCache(tenantId?) to flush per-principal override caches
  by enumerating Keyv store keys matching _OVERRIDE_: prefix
- Add invalidateConfigCaches() helper that clears base config, override
  caches, tool caches, and endpoint config cache in one call
- Wire invalidation into all 5 admin config mutation handlers
  (upsert, patch, delete field, delete overrides, toggle active)
- Add strict mode warning when __default__ tenant fallback is used
- Add 3 new tests for clearOverrideCache (all/scoped/base-preserving)

* chore: update getUserPrincipals comment to reflect ALS-based tenant filtering

The TODO(#12091) about missing tenantId filtering is resolved by the
tenant context middleware + applyTenantIsolation Mongoose plugin.
Group queries are now automatically scoped by tenantId via ALS.

* fix: replace runAsSystem with baseOnly for pre-tenant code paths

App configs are tenant-owned — runAsSystem() would bypass tenant
isolation and return cross-tenant DB overrides. Instead, add
baseOnly option to getAppConfig() that returns YAML-derived config
only, with zero DB queries.

All startup code, auth strategies, and MCP initialization now use
getAppConfig({ baseOnly: true }) to get the YAML config without
touching the Config collection.

* fix: address PR review findings — middleware ordering, types, cache safety

- Chain tenantContextMiddleware inside requireJwtAuth after passport auth
  instead of global app.use() where req.user is always undefined (Finding 1)
- Remove global tenantContextMiddleware registration from index.js
- Update BalanceMiddlewareOptions to include tenantId, remove redundant cast (Finding 4)
- Add warning log when clearOverrideCache cannot enumerate keys on Redis (Finding 3)
- Use startsWith instead of includes for cache key filtering (Finding 12)
- Use generator loop instead of Array.from for key enumeration (Finding 3)
- Selective barrel export — exclude _resetTenantMiddlewareStrictCache (Finding 5)
- Move isMainThread check to module level, remove per-request check (Finding 9)
- Move mid-file require to top of app.js (Finding 8)
- Parallelize invalidateConfigCaches with Promise.all (Finding 10)
- Remove clearOverrideCache from public app.js exports (internal only)
- Strengthen getUserPrincipals comment re: ALS dependency (Finding 2)

* fix: restore runAsSystem for startup DB ops, consolidate require, clarify baseOnly

- Restore runAsSystem() around performStartupChecks, updateInterfacePermissions,
  initializeMCPs, and initializeOAuthReconnectManager — these make Mongoose
  queries that need system context in strict tenant mode (NEW-3)
- Consolidate duplicate require('@librechat/api') in requireJwtAuth.js (NEW-1)
- Document that baseOnly ignores role/userId/tenantId in JSDoc (NEW-2)

* test: add requireJwtAuth tenant chaining + invalidateConfigCaches tests

- requireJwtAuth: 5 tests verifying ALS tenant context is set after
  passport auth, isolated between concurrent requests, and not set
  when user has no tenantId (Finding 6)
- invalidateConfigCaches: 4 tests verifying all four caches are cleared,
  tenantId is threaded through, partial failure is handled gracefully,
  and operations run in parallel via Promise.all (Finding 11)

* fix: address Copilot review — passport errors, namespaced cache keys, /base scoping

- Forward passport errors in requireJwtAuth before entering tenant
  middleware — prevents silent auth failures from reaching handlers (P1)
- Account for Keyv namespace prefix in clearOverrideCache — stored keys
  are namespaced as "APP_CONFIG:_OVERRIDE_:..." not "_OVERRIDE_:...",
  so override caches were never actually matched/cleared (P2)
- Remove role from getBaseConfig — /base should return tenant-scoped
  base config, not role-merged config that drifts per admin role (P2)
- Return tenantStorage.run() for cleaner async semantics
- Update mock cache in service.spec.ts to simulate Keyv namespacing

* fix: address second review — cache safety, code quality, test reliability

- Decouple cache invalidation from mutation response: fire-and-forget
  with logging so DB mutation success is not masked by cache failures
- Extract clearEndpointConfigCache helper from inline IIFE
- Move isMainThread check to lazy once-per-process guard (no import
  side effect)
- Memoize process.env read in overrideCacheKey to avoid per-request
  env lookups and log flooding in strict mode
- Remove flaky timer-based parallelism assertion, use structural check
- Merge orphaned double JSDoc block on getUserPrincipals
- Fix stale [getAppConfig] log prefix → [ensureBaseConfig]
- Fix import order in tenant.spec.ts (package types before local values)
- Replace "Finding 1" reference with self-contained description
- Use real tenantStorage primitives in requireJwtAuth spec mock

* fix: move JSDoc to correct function after clearEndpointConfigCache extraction

* refactor: remove Redis SCAN from clearOverrideCache, rely on TTL expiry

Redis SCAN causes 60s+ stalls under concurrent load (see #12410).
APP_CONFIG defaults to FORCED_IN_MEMORY_CACHE_NAMESPACES, so the
in-memory store.keys() path handles the standard case. When APP_CONFIG
is Redis-backed, overrides expire naturally via overrideCacheTtl (60s
default) — an acceptable window for admin config mutations.

* fix: remove return from tenantStorage.run to satisfy void middleware signature

* fix: address second review — cache safety, code quality, test reliability

- Switch invalidateConfigCaches from Promise.all to Promise.allSettled
  so partial failures are logged individually instead of producing one
  undifferentiated error (Finding 3)
- Gate overrideCacheKey strict-mode warning behind a once-per-process
  flag to prevent log flooding under load (Finding 4)
- Add test for passport error forwarding in requireJwtAuth — the
  if (err) { return next(err) } branch now has coverage (Finding 5)
- Add test for real partial failure in invalidateConfigCaches where
  clearAppConfigCache rejects (not just the swallowed endpoint error)

* chore: reorder imports in index.js and app.js for consistency

- Moved logger and runAsSystem imports to maintain a consistent import order across files.
- Improved code readability by ensuring related imports are grouped together.
2026-03-26 17:35:00 -04:00
Danny Avila
4b6d68b3b5
🎛️ feat: DB-Backed Per-Principal Config System (#12354)
*  feat: Add Config schema, model, and methods for role-based DB config overrides

Add the database foundation for principal-based configuration overrides
(user, group, role) in data-schemas. Includes schema with tenantId and
tenant isolation, CRUD methods, and barrel exports.

* 🔧 fix: Add shebang and enforce LF line endings for git hooks

The pre-commit hook was missing #!/bin/sh, and core.autocrlf=true was
converting it to CRLF, both causing "Exec format error" on Windows.
Add .gitattributes to force LF for .husky/* and *.sh files.

*  feat: Add admin config API routes with section-level capability checks

Add /api/admin/config endpoints for managing per-principal config
overrides (user, group, role). Handlers in @librechat/api use DI pattern
with section-level hasConfigCapability checks for granular access control.

Supports full overrides replacement, per-field PATCH via dot-paths, field
deletion, toggle active, and listing.

* 🐛 fix: Move deleteConfigField fieldPath from URL param to request body

The path-to-regexp wildcard syntax (:fieldPath(*)) is not supported by
the version used in Express. Send fieldPath in the DELETE request body
instead, which also avoids URL-encoding issues with dotted paths.

*  feat: Wire config resolution into getAppConfig with override caching

Add mergeConfigOverrides utility in data-schemas for deep-merging DB
config overrides into base AppConfig by priority order.

Update getAppConfig to query DB for applicable configs when role/userId
is provided, with short-TTL caching and a hasAnyConfigs feature flag
for zero-cost when no DB configs exist.

Also: add unique compound index on Config schema, pass userId from
config middleware, and signal config changes from admin API handlers.

* 🔄 refactor: Extract getAppConfig logic into packages/api as TS service

Move override resolution, caching strategy, and signalConfigChange from
api/server/services/Config/app.js into packages/api/src/app/appConfigService.ts
using the DI factory pattern (createAppConfigService). The JS file becomes
a thin wiring layer injecting loadBaseConfig, cache, and DB dependencies.

* 🧹 chore: Rename configResolution.ts to resolution.ts

*  feat: Move admin types & capabilities to librechat-data-provider

Move SystemCapabilities, CapabilityImplications, and utility functions
(hasImpliedCapability, expandImplications) from data-schemas to
data-provider so they are available to external consumers like the
admin panel without a data-schemas dependency.

Add API-friendly admin types: TAdminConfig, TAdminSystemGrant,
TAdminAuditLogEntry, TAdminGroup, TAdminMember, TAdminUserSearchResult,
TCapabilityCategory, and CAPABILITY_CATEGORIES.

data-schemas re-exports these from data-provider and extends with
config-schema-derived types (ConfigSection, SystemCapability union).

Bump version to 0.8.500.

* feat: Add JSON-serializable admin config API response types to data-schemas

Add AdminConfig, AdminConfigListResponse, AdminConfigResponse, and
AdminConfigDeleteResponse types so both LibreChat API handlers and the
admin panel can share the same response contract. Bump version to 0.0.41.

* refactor: Move admin capabilities & types from data-provider to data-schemas

SystemCapabilities, CapabilityImplications, utility functions,
CAPABILITY_CATEGORIES, and admin API response types should not be in
data-provider as it gets compiled into the frontend bundle, exposing
the capability surface. Moved everything to data-schemas (server-only).

All consumers already import from @librechat/data-schemas, so no
import changes needed elsewhere. Consolidated duplicate AdminConfig
type (was in both config.ts and admin.ts).

* chore: Bump @librechat/data-schemas to 0.0.42

* refactor: Reorganize admin capabilities into admin/ and types/admin.ts

Split systemCapabilities.ts following data-schemas conventions:
- Types (BaseSystemCapability, SystemCapability, AdminConfig, etc.)
  → src/types/admin.ts
- Runtime code (SystemCapabilities, CapabilityImplications, utilities)
  → src/admin/capabilities.ts

Revert data-provider version to 0.8.401 (no longer modified).

* chore: Fix import ordering, rename appConfigService to service

- Rename app/appConfigService.ts → app/service.ts (directory provides context)
- Fix import order in admin/config.ts, types/admin.ts, types/config.ts
- Add naming convention to AGENTS.md

* feat: Add DB base config support (role/__base__)

- Add BASE_CONFIG_PRINCIPAL_ID constant for reserved base config doc
- getApplicableConfigs always includes __base__ in queries
- getAppConfig queries DB even without role/userId when DB configs exist
- Bump @librechat/data-schemas to 0.0.43

* fix: Address PR review issues for admin config

- Add listAllConfigs method; listConfigs endpoint returns all active
  configs instead of only __base__
- Normalize principalId to string in all config methods to prevent
  ObjectId vs string mismatch on user/group lookups
- Block __proto__ and all dunder-prefixed segments in field path
  validation to prevent prototype pollution
- Fix configVersion off-by-one: default to 0, guard pre('save') with
  !isNew, use $inc on findOneAndUpdate
- Remove unused getApplicableConfigs from admin handler deps

* fix: Enable tree-shaking for data-schemas, bump packages

- Switch data-schemas Rollup output to preserveModules so each source
  file becomes its own chunk; consumers (admin panel) can now import
  just the modules they need without pulling in winston/mongoose/etc.
- Add sideEffects: false to data-schemas package.json
- Bump data-schemas to 0.0.44, data-provider to 0.8.402

* feat: add capabilities subpath export to data-schemas

Adds `@librechat/data-schemas/capabilities` subpath export so browser
consumers can import BASE_CONFIG_PRINCIPAL_ID and capability constants
without pulling in Node.js-only modules (winston, async_hooks, etc.).

Bump version to 0.0.45.

* fix: include dist/ in data-provider npm package

Add explicit files field so npm includes dist/types/ in the published
package. Without this, the root .gitignore exclusion of dist/ causes
npm to omit type declarations, breaking TypeScript consumers.

* chore: bump librechat-data-provider to 0.8.403

* feat: add GET /api/admin/config/base for raw AppConfig

Returns the full AppConfig (YAML + DB base merged) so the admin panel
can display actual config field values and structure. The startup config
endpoint (/api/config) returns TStartupConfig which is a different shape
meant for the frontend app.

* chore: imports order

* fix: address code review findings for admin config

Critical:
- Fix clearAppConfigCache: was deleting from wrong cache store (CONFIG_STORE
  instead of APP_CONFIG), now clears BASE and HAS_DB_CONFIGS keys
- Eliminate race condition: patchConfigField and deleteConfigField now use
  atomic MongoDB $set/$unset with dot-path notation instead of
  read-modify-write cycles, removing the lost-update bug entirely
- Add patchConfigFields and unsetConfigField atomic DB methods

Major:
- Reorder cache check before principal resolution in getAppConfig so
  getUserPrincipals DB query only fires on cache miss
- Replace '' as ConfigSection with typed BROAD_CONFIG_ACCESS constant
- Parallelize capability checks with Promise.all instead of sequential
  awaits in for loops
- Use loose equality (== null) for cache miss check to handle both null
  and undefined returns from cache implementations
- Set HAS_DB_CONFIGS_KEY to true on successful config fetch

Minor:
- Remove dead pre('save') hook from config schema (all writes use
  findOneAndUpdate which bypasses document hooks)
- Consolidate duplicate type imports in resolution.ts
- Remove dead deepGet/deepSet/deepUnset functions (replaced by atomic ops)
- Add .sort({ priority: 1 }) to getApplicableConfigs query
- Rename _impliedBy to impliedByMap

* fix: self-referencing BROAD_CONFIG_ACCESS constant

* fix: replace type-cast sentinel with proper null parameter

Update hasConfigCapability to accept ConfigSection | null where null
means broad access check (MANAGE_CONFIGS or READ_CONFIGS only).
Removes the '' as ConfigSection type lie from admin config handlers.

* fix: remaining review findings + add tests

- listAllConfigs accepts optional { isActive } filter so admin listing
  can show inactive configs (#9)
- Standardize session application to .session(session ?? null) across
  all config DB methods (#15)
- Export isValidFieldPath and getTopLevelSection for testability
- Add 38 tests across 3 spec files:
  - config.spec.ts (api): path validation, prototype pollution rejection
  - resolution.spec.ts: deep merge, priority ordering, array replacement
  - config.spec.ts (data-schemas): full CRUD, ObjectId normalization,
    atomic $set/$unset, configVersion increment, toggle, __base__ query

* fix: address second code review findings

- Fix cross-user cache contamination: overrideCacheKey now handles
  userId-without-role case with its own cache key (#1)
- Add broad capability check before DB lookup in getConfig to prevent
  config existence enumeration (#2/#3)
- Move deleteConfigField fieldPath from request body to query parameter
  for proxy/load balancer compatibility (#5)
- Derive BaseSystemCapability from SystemCapabilities const instead of
  manual string union (#6)
- Return 201 on upsert creation, 200 on update (#11)
- Remove inline narration comments per AGENTS.md (#12)
- Type overrides as Partial<TCustomConfig> in DB methods and handler
  deps (#13)
- Replace double as-unknown-as casts in resolution.ts with generic
  deepMerge<T> (#14)
- Make override cache TTL injectable via AppConfigServiceDeps (#16)
- Add exhaustive never check in principalModel switch (#17)

* fix: remaining review findings — tests, rename, semantics

- Rename signalConfigChange → markConfigsDirty with JSDoc documenting
  the stale-window tradeoff and overrideCacheTtl knob
- Fix DEFAULT_OVERRIDE_CACHE_TTL naming convention
- Add createAppConfigService tests (14 cases): cache behavior, feature
  flag, cross-user key isolation, fallback on error, markConfigsDirty
- Add admin handler integration tests (13 cases): auth ordering,
  201/200 on create/update, fieldPath from query param, markConfigsDirty
  calls, capability checks

* fix: global flag corruption + empty overrides auth bypass

- Remove HAS_DB_CONFIGS_KEY=false optimization: a scoped query returning
  no configs does not mean no configs exist globally. Setting the flag
  false from a per-principal query short-circuited all subsequent users.
- Add broad manage capability check before section checks in
  upsertConfigOverrides: empty overrides {} no longer bypasses auth.

* test: add regression and invariant tests for config system

Regression tests:
- Bug 1: User A's empty result does not short-circuit User B's overrides
- Bug 2: Empty overrides {} returns 403 without MANAGE_CONFIGS

Invariant tests (applied across ALL handlers):
- All 5 mutation handlers call markConfigsDirty on success
- All 5 mutation handlers return 401 without auth
- All 5 mutation handlers return 403 without capability
- All 3 read handlers return 403 without capability

* fix: third review pass — all findings addressed

Service (service.ts):
- Restore HAS_DB_CONFIGS=false for base-only queries (no role/userId)
  so deployments with zero DB configs skip DB queries (#1)
- Resolve cache once at factory init instead of per-invocation (#8)
- Use BASE_CONFIG_PRINCIPAL_ID constant in overrideCacheKey (#10)
- Add JSDoc to clearAppConfigCache documenting stale-window (#4)
- Fix log message to not say "from YAML" (#14)

Admin handlers (config.ts):
- Use configVersion===1 for 201 vs 200, eliminating TOCTOU race (#2)
- Add Array.isArray guard on overrides body (#5)
- Import CapabilityUser from capabilities.ts, remove duplicate (#6)
- Replace as-unknown-as cast with targeted type assertion (#7)
- Add MAX_PATCH_ENTRIES=100 cap on entries array (#15)
- Reorder deleteConfigField to validate principalType first (#12)
- Export CapabilityUser from middleware/capabilities.ts

DB methods (config.ts):
- Remove isActive:true from patchConfigFields to prevent silent
  reactivation of disabled configs (#3)

Schema (config.ts):
- Change principalId from Schema.Types.Mixed to String (#11)

Tests:
- Add patchConfigField unsafe fieldPath rejection test (#9)
- Add base-only HAS_DB_CONFIGS=false test (#1)
- Update 201/200 tests to use configVersion instead of findConfig (#2)

* fix: add read handler 401 invariant tests + document flag behavior

- Add invariant: all 3 read handlers return 401 without auth
- Document on markConfigsDirty that HAS_DB_CONFIGS stays true after
  all configs are deleted until clearAppConfigCache or restart

* fix: remove HAS_DB_CONFIGS false optimization entirely

getApplicableConfigs([]) only queries for __base__, not all configs.
A deployment with role/group configs but no __base__ doc gets the
flag poisoned to false by a base-only query, silently ignoring all
scoped overrides. The optimization is not safe without a comprehensive
Config.exists() check, which adds its own DB cost. Removed entirely.

The flag is now write-once-true (set when configs are found or by
markConfigsDirty) and only cleared by clearAppConfigCache/restart.

* chore: reorder import statements in app.js for clarity

* refactor: remove HAS_DB_CONFIGS_KEY machinery entirely

The three-state flag (false/null/true) was the source of multiple bugs
across review rounds. Every attempt to safely set it to false was
defeated by getApplicableConfigs querying only a subset of principals.

Removed: HAS_DB_CONFIGS_KEY constant, all reads/writes of the flag,
markConfigsDirty (now a no-op concept), notifyChange wrapper, and all
tests that seeded false manually.

The per-user/role TTL cache (overrideCacheTtl, default 60s) is the
sole caching mechanism. On cache miss, getApplicableConfigs queries
the DB. This is one indexed query per user per TTL window — acceptable
for the config override use case.

* docs: rewrite admin panel remaining work with current state

* perf: cache empty override results to avoid repeated DB queries

When getApplicableConfigs returns no configs for a principal, cache
baseConfig under their override key with TTL. Without this, every
user with no per-principal overrides hits MongoDB on every request
after the 60s cache window expires.

* fix: add tenantId to cache keys + reject PUBLIC principal type

- Include tenantId in override cache keys to prevent cross-tenant
  config contamination. Single-tenant deployments (tenantId undefined)
  use '_' as placeholder — no behavior change for them.
- Reject PrincipalType.PUBLIC in admin config validation — PUBLIC has
  no PrincipalModel and is never resolved by getApplicableConfigs,
  so config docs for it would be dead data.
- Config middleware passes req.user.tenantId to getAppConfig.

* fix: fourth review pass findings

DB methods (config.ts):
- findConfigByPrincipal accepts { includeInactive } option so admin
  GET can retrieve inactive configs (#5)
- upsertConfig catches E11000 duplicate key on concurrent upserts and
  retries without upsert flag (#2)
- unsetConfigField no longer filters isActive:true, consistent with
  patchConfigFields (#11)
- Typed filter objects replace Record<string, unknown> (#12)

Admin handlers (config.ts):
- patchConfigField: serial broad capability check before Promise.all
  to pre-warm ALS principal cache, preventing N parallel DB calls (#3)
- isValidFieldPath rejects leading/trailing dots and consecutive
  dots (#7)
- Duplicate fieldPaths in patch entries return 400 (#8)
- DEFAULT_PRIORITY named constant replaces hardcoded 10 (#14)
- Admin getConfig and patchConfigField pass includeInactive to
  findConfigByPrincipal (#5)
- Route import uses barrel instead of direct file path (#13)

Resolution (resolution.ts):
- deepMerge has MAX_MERGE_DEPTH=10 guard to prevent stack overflow
  from crafted deeply nested configs (#4)

* fix: final review cleanup

- Remove ADMIN_PANEL_REMAINING.md (local dev notes with Windows paths)
- Add empty-result caching regression test
- Add tenantId to AdminConfigDeps.getAppConfig type
- Restore exhaustive never check in principalModel switch
- Standardize toggleConfigActive session handling to options pattern

* fix: validate priority in patchConfigField handler

Add the same non-negative number validation for priority that
upsertConfigOverrides already has. Without this, invalid priority
values could be stored via PATCH and corrupt merge ordering.

* chore: remove planning doc from PR

* fix: correct stale cache key strings in service tests

* fix: clean up service tests and harden tenant sentinel

- Remove no-op cache delete lines from regression tests
- Change no-tenant sentinel from '_' to '__default__' to avoid
  collision with a real tenant ID when multi-tenancy is enabled
- Remove unused CONFIG_STORE from AppConfigServiceDeps

* chore: bump @librechat/data-schemas to 0.0.46

* fix: block prototype-poisoning keys in deepMerge

Skip __proto__, constructor, and prototype keys during config merge
to prevent prototype pollution via PUT /api/admin/config overrides.
2026-03-25 19:39:29 -04:00
Danny Avila
9e0592a236
📜 feat: Implement System Grants for Capability-Based Authorization (#11896)
* feat: Implement System Grants for Role-Based Capabilities

- Added a new `systemGrant` model and associated methods to manage role-based capabilities within the application.
- Introduced middleware functions `hasCapability` and `requireCapability` to check user permissions based on their roles.
- Updated the database seeding process to include system grants for the ADMIN role, ensuring all necessary capabilities are assigned on startup.
- Enhanced type definitions and schemas to support the new system grant functionality, improving overall type safety and clarity in the codebase.

* test: Add unit tests for capabilities middleware and system grant methods

- Introduced comprehensive unit tests for the capabilities middleware, including `hasCapability` and `requireCapability`, ensuring proper permission checks based on user roles.
- Added tests for the `SystemGrant` methods, verifying the seeding of system grants, capability granting, and revocation processes.
- Enhanced test coverage for edge cases, including idempotency of grant operations and handling of unexpected errors in middleware.
- Utilized mocks for database interactions to isolate tests and improve reliability.

* refactor: Transition to Capability-Based Access Control

- Replaced role-based access checks with capability-based checks across various middleware and routes, enhancing permission management.
- Introduced `hasCapability` and `requireCapability` functions to streamline capability verification for user actions.
- Updated relevant routes and middleware to utilize the new capability system, ensuring consistent permission enforcement.
- Enhanced type definitions and added tests for the new capability functions, improving overall code reliability and maintainability.

* test: Enhance capability-based access tests for ADMIN role

- Updated tests to reflect the new capability-based access control, specifically for the ADMIN role.
- Modified test descriptions to clarify that users with the MANAGE_AGENTS capability can bypass permission checks.
- Seeded capabilities for the ADMIN role in multiple test files to ensure consistent permission checks across different routes and middleware.
- Improved overall test coverage for capability verification, ensuring robust permission management.

* test: Update capability tests for MCP server access

- Renamed test to reflect the correct capability for bypassing permission checks, changing from MANAGE_AGENTS to MANAGE_MCP_SERVERS.
- Updated seeding of capabilities for the ADMIN role to align with the new capability structure.
- Ensured consistency in capability definitions across tests and middleware for improved permission management.

* feat: Add hasConfigCapability for enhanced config access control

- Introduced `hasConfigCapability` function to check user permissions for managing or reading specific config sections.
- Updated middleware to export the new capability function, ensuring consistent access control across the application.
- Enhanced unit tests to cover various scenarios for the new capability, improving overall test coverage and reliability.

* fix: Update tenantId filter in createSystemGrantMethods

- Added a condition to set tenantId filter to { $exists: false } when tenantId is null, ensuring proper handling of cases where tenantId is not provided.
- This change improves the robustness of the system grant methods by explicitly managing the absence of tenantId in the filter logic.

* fix: account deletion capability check

- Updated the `canDeleteAccount` middleware to ensure that the `hasManageUsers` capability check only occurs if a user is present, preventing potential errors when the user object is undefined.
- This change improves the robustness of the account deletion logic by ensuring proper handling of user permissions.

* refactor: Optimize seeding of system grants for ADMIN role

- Replaced sequential capability granting with parallel execution using Promise.all in the seedSystemGrants function.
- This change improves performance and efficiency during the initialization of system grants, ensuring all capabilities are granted concurrently.

* refactor: Simplify systemGrantSchema index definition

- Removed the sparse option from the unique index on principalType, principalId, capability, and tenantId in the systemGrantSchema.
- This change streamlines the index definition, potentially improving query performance and clarity in the schema design.

* refactor: Reorganize role capability check in roles route

- Moved the capability check for reading roles to occur after parsing the roleName, improving code clarity and structure.
- This change ensures that the authorization logic is consistently applied before fetching role details, enhancing overall permission management.

* refactor: Remove unused ISystemGrant interface from systemCapabilities.ts

- Deleted the ISystemGrant interface as it was no longer needed, streamlining the code and improving clarity.
- This change helps reduce clutter in the file and focuses on relevant capabilities for the system.

* refactor: Migrate SystemCapabilities to data-schemas

- Replaced imports of SystemCapabilities from 'librechat-data-provider' with imports from '@librechat/data-schemas' across multiple files.
- This change centralizes the management of system capabilities, improving code organization and maintainability.

* refactor: Update account deletion middleware and capability checks

- Modified the `canDeleteAccount` middleware to ensure that the account deletion permission is only granted to users with the `MANAGE_USERS` capability, improving security and clarity in permission management.
- Enhanced error logging for unauthorized account deletion attempts, providing better insights into permission issues.
- Updated the `capabilities.ts` file to ensure consistent handling of user authentication checks, improving robustness in capability verification.
- Refined type definitions in `systemGrant.ts` and `systemGrantMethods.ts` to utilize the `PrincipalType` enum, enhancing type safety and code clarity.

* refactor: Extract principal ID normalization into a separate function

- Introduced `normalizePrincipalId` function to streamline the normalization of principal IDs based on their type, enhancing code clarity and reusability.
- Updated references in `createSystemGrantMethods` to utilize the new normalization function, improving maintainability and reducing code duplication.

* test: Add unit tests for principalId normalization in systemGrant

- Introduced tests for the `grantCapability`, `revokeCapability`, and `getCapabilitiesForPrincipal` methods to verify correct handling of principalId normalization between string and ObjectId formats.
- Enhanced the `capabilities.ts` middleware to utilize the `PrincipalType` enum for improved type safety.
- Added a new utility function `normalizePrincipalId` to streamline principal ID normalization logic, ensuring consistent behavior across the application.

* feat: Introduce capability implications and enhance system grant methods

- Added `CapabilityImplications` to define relationships between broader and implied capabilities, allowing for more intuitive permission checks.
- Updated `createSystemGrantMethods` to expand capability queries to include implied capabilities, improving authorization logic.
- Enhanced `systemGrantSchema` to include an `expiresAt` field for future TTL enforcement of grants, and added validation to ensure `tenantId` is not set to null.
- Documented authorization requirements for prompt group and prompt deletion methods to clarify access control expectations.

* test: Add unit tests for canDeleteAccount middleware

- Introduced unit tests for the `canDeleteAccount` middleware to verify account deletion permissions based on user roles and capabilities.
- Covered scenarios for both allowed and blocked account deletions, including checks for ADMIN users with the `MANAGE_USERS` capability and handling of undefined user cases.
- Enhanced test structure to ensure clarity and maintainability of permission checks in the middleware.

* fix: Add principalType enum validation to SystemGrant schema

Without enum validation, any string value was accepted for principalType
and silently stored. Invalid documents would never match capability
queries, creating phantom grants impossible to diagnose without raw DB
inspection. All other ACL models in the codebase validate this field.

* fix: Replace seedSystemGrants Promise.all with bulkWrite for concurrency safety

When two server instances start simultaneously (K8s rolling deploy, PM2
cluster), both call seedSystemGrants. With Promise.all + findOneAndUpdate
upsert, both instances may attempt to insert the same documents, causing
E11000 duplicate key errors that crash server startup.

bulkWrite with ordered:false handles concurrent upserts gracefully and
reduces 17 individual round trips to a single network call. The returned
documents (previously discarded) are no longer fetched.

* perf: Add AsyncLocalStorage per-request cache for capability checks

Every hasCapability call previously required 2 DB round trips
(getUserPrincipals + SystemGrant.exists) — replacing what were O(1)
string comparisons. Routes like patchPromptGroup triggered this twice,
and hasConfigCapability's fallback path resolved principals twice.

This adds a per-request AsyncLocalStorage cache that:
- Caches resolved principals (same for all checks within one request)
- Caches capability check results (same user+cap = same answer)
- Automatically scoped to request lifetime (no stale grants)
- Falls through to DB when no store exists (background jobs, tests)
- Requires no signature changes to hasCapability

The capabilityContextMiddleware is registered at the app level before
all routes, initializing a fresh store per request.

* fix: Add error handling for inline hasCapability calls

canDeleteAccount, fetchAssistants, and validateAuthor all call
hasCapability without try-catch. These were previously O(1) string
comparisons that could never throw. Now they hit the database and can
fail on connection timeout or transient errors.

Wrap each call in try-catch, defaulting to deny (false) on error.
This ensures a DB hiccup returns a clean 403 instead of an unhandled
500 with a stack trace.

* test: Add canDeleteAccount DB-error resilience test

Tests that hasCapability rejection (e.g., DB timeout) results in a clean
403 rather than an unhandled exception. Validates the error handling
added in the previous commit.

* refactor: Use barrel import for hasCapability in validateAuthor

Import from ~/server/middleware barrel instead of directly from
~/server/middleware/roles/capabilities for consistency with other
non-middleware consumers. Files within the middleware barrel itself
must continue using direct imports to avoid circular requires.

* refactor: Remove misleading pre('save') hook from SystemGrant schema

The pre('save') hook normalized principalId for USER/GROUP principals,
but the primary write path (grantCapability) uses findOneAndUpdate —
which does not trigger save hooks. The normalization was already handled
explicitly in grantCapability itself. The hook created a false impression
of schema-level enforcement that only covered save()/create() paths.

Replace with a comment documenting that all writes must go through
grantCapability.

* feat: Add READ_ASSISTANTS capability to complete manage/read pair

Every other managed resource had a paired READ_X / MANAGE_X capability
except assistants. This adds READ_ASSISTANTS and registers the
MANAGE_ASSISTANTS → READ_ASSISTANTS implication in CapabilityImplications,
enabling future read-only assistant visibility grants.

* chore: Reorder systemGrant methods for clarity

Moved hasCapabilityForPrincipals to a more logical position in the returned object of createSystemGrantMethods, improving code readability. This change also maintains the inclusion of seedSystemGrants in the export, ensuring all necessary methods are available.

* fix: Wrap seedSystemGrants in try-catch to avoid blocking startup

Seeding capabilities is idempotent and will succeed on the next restart.
A transient DB error during seeding should not prevent the server from
starting — log the error and continue.

* refactor: Improve capability check efficiency and add audit logging

Move hasCapability calls after cheap early-exits in validateAuthor and
fetchAssistants so the DB check only runs when its result matters. Add
logger.debug on every capability bypass grant across all 7 call sites
for auditability, and log errors in catch blocks instead of silently
swallowing them.

* test: Add integration tests for AsyncLocalStorage capability caching

Exercises the full vertical — ALS context, generateCapabilityCheck,
real getUserPrincipals, real hasCapabilityForPrincipals, real MongoDB
via MongoMemoryServer. Covers per-request caching, cross-context
isolation, concurrent request isolation, negative caching, capability
implications, tenant scoping, group-based grants, and requireCapability
middleware.

* test: Add systemGrant data-layer and ALS edge-case integration tests

systemGrant.spec.ts (51 tests): Full integration tests for all
systemGrant methods against real MongoDB — grant/revoke lifecycle,
principalId normalization (string→ObjectId for USER/GROUP, string for
ROLE), capability implications (both directions), tenant scoping,
schema validation (null tenantId, invalid enum, required fields,
unique compound index).

capabilities.integration.spec.ts (27 tests): Adds ALS edge cases —
missing context degrades gracefully with no caching (background jobs,
child processes), nested middleware creates independent inner context,
optional-chaining safety when store is undefined, mid-request grant
changes are invisible due to result caching, requireCapability works
without ALS, and interleaved concurrent contexts maintain isolation.

* fix: Add worker thread guards to capability ALS usage

Detect when hasCapability or capabilityContextMiddleware is called from
a worker thread (where ALS context does not propagate from the parent).
hasCapability logs a warn-once per factory instance; the middleware logs
an error since mounting Express middleware in a worker is likely a
misconfiguration. Both continue to function correctly — the guard is
observability, not a hard block.

* fix: Include tenantId in ALS principal cache key for tenant isolation

The principal cache key was user.id:user.role, which would reuse
cached principals across tenants for the same user within a request.
When getUserPrincipals gains tenant-scoped group resolution, principals
from tenant-a would incorrectly serve tenant-b checks. Changed to
user.id:user.role:user.tenantId to prevent cross-tenant cache hits.

Adds integration test proving separate principal lookups per tenantId.

* test: Remove redundant mocked capabilities.spec.js

The JS wrapper test (7 tests, all mocked) is a strict subset of
capabilities.integration.spec.ts (28 tests, real MongoDB). Every
scenario it covered — hasCapability true/false, tenantId passthrough,
requireCapability 403/500, error handling — is tested with higher
fidelity in the integration suite.

* test: Replace mocked canDeleteAccount tests with real MongoDB integration

Remove hasCapability mock — tests now exercise the full capability
chain against real MongoDB (getUserPrincipals, hasCapabilityForPrincipals,
SystemGrant collection). Only mocks remaining are logger and cache.

Adds new coverage: admin role without grant is blocked, user-level
grant bypasses deletion restriction, null user handling.

* test: Add comprehensive tests for ACL entry management and user group methods

Introduces new tests for `deleteAclEntries`, `bulkWriteAclEntries`, and `findPublicResourceIds` in `aclEntry.spec.ts`, ensuring proper functionality for deleting and bulk managing ACL entries. Additionally, enhances `userGroup.spec.ts` with tests for finding groups by ID and name pattern, including external ID matching and source filtering. These changes improve coverage and validate the integrity of ACL and user group operations against real MongoDB interactions.

* refactor: Update capability checks and logging for better clarity and error handling

Replaced `MANAGE_USERS` with `ACCESS_ADMIN` in the `canDeleteAccount` middleware and related tests to align with updated permission structure. Enhanced logging in various middleware functions to use `logger.warn` for capability check failures, providing clearer error messages. Additionally, refactored capability checks in the `patchPromptGroup` and `validateAuthor` functions to improve readability and maintainability. This commit also includes adjustments to the `systemGrant` methods to implement retry logic for transient failures during capability seeding, ensuring robustness in the face of database errors.

* refactor: Enhance logging and retry logic in seedSystemGrants method

Updated the logging format in the seedSystemGrants method to include error messages for better clarity. Improved the retry mechanism by explicitly mocking multiple failures in tests, ensuring robust error handling during transient database issues. Additionally, refined imports in the systemGrant schema for better type management.

* refactor: Consolidate imports in canDeleteAccount middleware

Merged logger and SystemCapabilities imports from the data-schemas module into a single line for improved readability and maintainability of the code. This change streamlines the import statements in the canDeleteAccount middleware.

* test: Enhance systemGrant tests for error handling and capability validation

Added tests to the systemGrant methods to handle various error scenarios, including E11000 race conditions, invalid ObjectId strings for USER and GROUP principals, and invalid capability strings. These enhancements improve the robustness of the capability granting and revoking logic, ensuring proper error propagation and validation of inputs.

* fix: Wrap hasCapability calls in deny-by-default try-catch at remaining sites

canAccessResource, files.js, and roles.js all had hasCapability inside
outer try-catch blocks that returned 500 on DB failure instead of
falling through to the regular ACL check. This contradicts the
deny-by-default pattern used everywhere else.

Also removes raw error.message from the roles.js 500 response to
prevent internal host/connection info leaking to clients.

* fix: Normalize user ID in canDeleteAccount before passing to hasCapability

requireCapability normalizes req.user.id via _id?.toString() fallback,
but canDeleteAccount passed raw req.user directly. If req.user.id is
absent (some auth layers only populate _id), getUserPrincipals received
undefined, silently returning empty principals and blocking the bypass.

* fix: Harden systemGrant schema and type safety

- Reject empty string tenantId in schema validator (was only blocking
  null; empty string silently orphaned documents)
- Fix reverseImplications to use BaseSystemCapability[] instead of
  string[], preserving the narrow discriminated type
- Document READ_ASSISTANTS as reserved/unenforced

* test: Use fake timers for seedSystemGrants retry tests and add tenantId validation

- Switch retry tests to jest.useFakeTimers() to eliminate 3+ seconds
  of real setTimeout delays per test run
- Add regression test for empty-string tenantId rejection

* docs: Add TODO(#12091) comments for tenant-scoped capability gaps

In multi-tenant mode, platform-level grants (no tenantId) won't match
tenant-scoped queries, breaking admin access. getUserPrincipals also
returns cross-tenant group memberships. Both need fixes in #12091.
2026-03-21 14:28:54 -04:00
Danny Avila
8ba2bde5c1
📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830)
* chore: move database model methods to /packages/data-schemas

* chore: add TypeScript ESLint rule to warn on unused variables

* refactor: model imports to streamline access

- Consolidated model imports across various files to improve code organization and reduce redundancy.
- Updated imports for models such as Assistant, Message, Conversation, and others to a unified import path.
- Adjusted middleware and service files to reflect the new import structure, ensuring functionality remains intact.
- Enhanced test files to align with the new import paths, maintaining test coverage and integrity.

* chore: migrate database models to packages/data-schemas and refactor all direct Mongoose Model usage outside of data-schemas

* test: update agent model mocks in unit tests

- Added `getAgent` mock to `client.test.js` to enhance test coverage for agent-related functionality.
- Removed redundant `getAgent` and `getAgents` mocks from `openai.spec.js` and `responses.unit.spec.js` to streamline test setup and reduce duplication.
- Ensured consistency in agent mock implementations across test files.

* fix: update types in data-schemas

* refactor: enhance type definitions in transaction and spending methods

- Updated type definitions in `checkBalance.ts` to use specific request and response types.
- Refined `spendTokens.ts` to utilize a new `SpendTxData` interface for better clarity and type safety.
- Improved transaction handling in `transaction.ts` by introducing `TransactionResult` and `TxData` interfaces, ensuring consistent data structures across methods.
- Adjusted unit tests in `transaction.spec.ts` to accommodate new type definitions and enhance robustness.

* refactor: streamline model imports and enhance code organization

- Consolidated model imports across various controllers and services to a unified import path, improving code clarity and reducing redundancy.
- Updated multiple files to reflect the new import structure, ensuring all functionalities remain intact.
- Enhanced overall code organization by removing duplicate import statements and optimizing the usage of model methods.

* feat: implement loadAddedAgent and refactor agent loading logic

- Introduced `loadAddedAgent` function to handle loading agents from added conversations, supporting multi-convo parallel execution.
- Created a new `load.ts` file to encapsulate agent loading functionalities, including `loadEphemeralAgent` and `loadAgent`.
- Updated the `index.ts` file to export the new `load` module instead of the deprecated `loadAgent`.
- Enhanced type definitions and improved error handling in the agent loading process.
- Adjusted unit tests to reflect changes in the agent loading structure and ensure comprehensive coverage.

* refactor: enhance balance handling with new update interface

- Introduced `IBalanceUpdate` interface to streamline balance update operations across the codebase.
- Updated `upsertBalanceFields` method signatures in `balance.ts`, `transaction.ts`, and related tests to utilize the new interface for improved type safety.
- Adjusted type imports in `balance.spec.ts` to include `IBalanceUpdate`, ensuring consistency in balance management functionalities.
- Enhanced overall code clarity and maintainability by refining type definitions related to balance operations.

* feat: add unit tests for loadAgent functionality and enhance agent loading logic

- Introduced comprehensive unit tests for the `loadAgent` function, covering various scenarios including null and empty agent IDs, loading of ephemeral agents, and permission checks.
- Enhanced the `initializeClient` function by moving `getConvoFiles` to the correct position in the database method exports, ensuring proper functionality.
- Improved test coverage for agent loading, including handling of non-existent agents and user permissions.

* chore: reorder memory method exports for consistency

- Moved `deleteAllUserMemories` to the correct position in the exported memory methods, ensuring a consistent and logical order of method exports in `memory.ts`.
2026-03-21 14:28:53 -04:00
Danny Avila
58f128bee7
🗑️ chore: Remove Deprecated Project Model and Associated Fields (#11773)
* chore: remove projects and projectIds usage

* chore: empty line linting

* chore: remove isCollaborative property across agent models and related tests

- Removed the isCollaborative property from agent models, controllers, and tests, as it is deprecated in favor of ACL permissions.
- Updated related validation schemas and data provider types to reflect this change.
- Ensured all references to isCollaborative were stripped from the codebase to maintain consistency and clarity.
2026-03-21 14:28:53 -04:00
Danny Avila
6169d4f70b
🚦 fix: 404 JSON Responses for Unmatched API Routes (#11976)
* feat: Implement 404 JSON response for unmatched API routes

- Added middleware to return a 404 JSON response with a message for undefined API routes.
- Updated SPA fallback to serve index.html for non-API unmatched routes.
- Ensured the error handler is positioned correctly as the last middleware in the stack.

* fix: Enhance logging in BaseClient for better token usage tracking

- Updated `getTokenCountForResponse` to log the messageId of the response for improved debugging.
- Enhanced userMessage logging to include messageId, tokenCount, and conversationId for clearer context during token count mapping.

* chore: Improve logging in processAddedConvo for better debugging

- Updated the logging structure in the processAddedConvo function to provide clearer context when processing added conversations.
- Removed redundant logging and enhanced the output to include model, agent ID, and endpoint details for improved traceability.

* chore: Enhance logging in BaseClient for improved token usage tracking

- Added debug logging in the BaseClient to track response token usage, including messageId, model, promptTokens, and completionTokens for better debugging and traceability.

* chore: Enhance logging in MemoryAgent for improved context

- Updated logging in the MemoryAgent to include userId, conversationId, messageId, and provider details for better traceability during memory processing.
- Adjusted log messages to provide clearer context when content is returned or not, aiding in debugging efforts.

* chore: Refactor logging in initializeClient for improved clarity

- Consolidated multiple debug log statements into a single message that provides a comprehensive overview of the tool context being stored for the primary agent, including the number of tools and the size of the tool registry. This enhances traceability and debugging efficiency.

* feat: Implement centralized 404 handling for unmatched API routes

- Introduced a new middleware function `apiNotFound` to standardize 404 JSON responses for undefined API routes.
- Updated the server configuration to utilize the new middleware, enhancing code clarity and maintainability.
- Added tests to ensure correct 404 responses for various non-GET methods and the `/api` root path.

* fix: Enhance logging in apiNotFound middleware for improved safety

- Updated the `apiNotFound` function to sanitize the request path by replacing problematic characters and limiting its length, ensuring safer logging of 404 errors.

* refactor: Move apiNotFound middleware to a separate file for better organization

- Extracted the `apiNotFound` function from the error middleware into its own file, enhancing code organization and maintainability.
- Updated the index file to export the new `notFound` middleware, ensuring it is included in the middleware stack.

* docs: Add comment to clarify usage of unsafeChars regex in notFound middleware

- Included a comment in the notFound middleware file to explain that the unsafeChars regex is safe to reuse with .replace() at the module scope, as it does not retain lastIndex state.
2026-02-27 22:49:54 -05:00
Danny Avila
3fa94e843c
⚛️ refactor: Redis Scalability Improvements for High-Throughput Deployments (#11840)
* fix: Redis scalability improvements for high-throughput deployments

  Replace INCR+check+DECR race in concurrency middleware with atomic Lua
  scripts. The old approach allowed 3-4 concurrent requests through a
  limit of 2 at 300 req/s because another request could slip between the
  INCR returning and the DECR executing. The Lua scripts run atomically
  on the Redis server, eliminating the race window entirely.

  Add exponential backoff with jitter to all three Redis retry strategies
  (ioredis single-node, cluster, keyv). Previously all instances retried
  at the same millisecond after an outage, causing a connection storm.

  Batch the RedisJobStore cleanup loop into parallel chunks of 50. With
  1000 stale jobs, this reduces cleanup from ~20s of sequential calls to
  ~2s. Also pipeline appendChunk (xadd + expire) into a single round-trip
  and refresh TTL on every chunk instead of only the first, preventing
  TTL expiry during long-running streams.

  Propagate publish errors in RedisEventTransport.emitDone and emitError
  so callers can detect dropped completion/error events. emitChunk is left
  as swallow-and-log because its callers fire-and-forget without await.

  Add jest.config.js for the API package with babel TypeScript support and
  path alias resolution. Fix existing stream integration tests that were
  silently broken due to missing USE_REDIS_CLUSTER=false env var.

* chore: Migrate Jest configuration from jest.config.js to jest.config.mjs

Removed the old jest.config.js file and integrated the Jest configuration into jest.config.mjs, adding Babel TypeScript support and path alias resolution. This change streamlines the configuration for the API package.

* fix: Ensure Redis retry delays do not exceed maximum configured delay

Updated the delay calculation in Redis retry strategies to enforce a maximum delay defined in the configuration. This change prevents excessive delays during reconnection attempts, improving overall connection stability and performance.

* fix: Update RedisJobStore cleanup to handle job failures gracefully

Changed the cleanup process in RedisJobStore to use Promise.allSettled instead of Promise.all, allowing for individual job failures to be logged without interrupting the entire cleanup operation. This enhances error handling and provides better visibility into issues during job cleanup.
2026-02-18 00:04:33 -05:00
Danny Avila
b6af884dd2
🔐 feat: Admin Auth. Routes with Secure Cross-Origin Token Exchange (#11297)
* feat: implement admin authentication with OpenID & Local Auth proxy support

* feat: implement admin OAuth exchange flow with caching support

- Added caching for admin OAuth exchange codes with a short TTL.
- Introduced new endpoints for generating and exchanging admin OAuth codes.
- Updated relevant controllers and routes to handle admin panel redirects and token exchanges.
- Enhanced logging for better traceability of OAuth operations.

* refactor: enhance OpenID strategy mock to support multiple verify callbacks

- Updated the OpenID strategy mock to store and retrieve verify callbacks by strategy name.
- Improved backward compatibility by maintaining a method to get the last registered callback.
- Adjusted tests to utilize the new callback retrieval methods, ensuring clarity in the verification process for the 'openid' strategy.

* refactor: reorder import statements for better organization

* refactor: admin OAuth flow with improved URL handling and validation

- Added a utility function to retrieve the admin panel URL, defaulting to a local development URL if not set in the environment.
- Updated the OAuth exchange endpoint to include validation for the authorization code format.
- Refactored the admin panel redirect logic to handle URL parsing more robustly, ensuring accurate origin comparisons.
- Removed redundant local URL definitions from the codebase for better maintainability.

* refactor: remove deprecated requireAdmin middleware and migrate to TypeScript

- Deleted the old requireAdmin middleware file and its references in the middleware index.
- Introduced a new TypeScript version of the requireAdmin middleware with enhanced error handling and logging.
- Updated routes to utilize the new requireAdmin middleware, ensuring consistent access control for admin routes.

* feat: add requireAdmin middleware for admin role verification

- Introduced requireAdmin middleware to enforce admin role checks for authenticated users.
- Implemented comprehensive error handling and logging for unauthorized access attempts.
- Added unit tests to validate middleware functionality and ensure proper behavior for different user roles.
- Updated middleware index to include the new requireAdmin export.
2026-01-28 17:44:31 -05:00
Danny Avila
76e17ba701
🔧 refactor: Permission handling for Resource Sharing (#11283)
* 🔧 refactor: permission handling for public sharing

- Updated permission keys from SHARED_GLOBAL to SHARE across various files for consistency.
- Added public access configuration in librechat.example.yaml.
- Adjusted related tests and components to reflect the new permission structure.

* chore: Update default SHARE permission to false

* fix: Update SHARE permissions in tests and implementation

- Added SHARE permission handling for user and admin roles in permissions.spec.ts and permissions.ts.
- Updated expected permissions in tests to reflect new SHARE permission values for various permission types.

* fix: Handle undefined values in PeoplePickerAdminSettings component

- Updated the checked and value props of the Switch component to handle undefined values gracefully by defaulting to false. This ensures consistent behavior when the field value is not set.

* feat: Add CREATE permission handling for prompts and agents

- Introduced CREATE permission for user and admin roles in permissions.spec.ts and permissions.ts.
- Updated expected permissions in tests to include CREATE permission for various permission types.

* 🔧 refactor: Enhance permission handling for sharing dialog usability

* refactor: public sharing permissions for resources

- Added middleware to check SHARE_PUBLIC permissions for agents, prompts, and MCP servers.
- Updated interface configuration in librechat.example.yaml to include public sharing options.
- Enhanced components and hooks to support public sharing functionality.
- Adjusted tests to validate new permission handling for public sharing across various resource types.

* refactor: update Share2Icon styling in GenericGrantAccessDialog

* refactor: update Share2Icon size in GenericGrantAccessDialog for consistency

* refactor: improve layout and styling of Share2Icon in GenericGrantAccessDialog

* refactor: update Share2Icon size in GenericGrantAccessDialog for improved consistency

* chore: remove redundant public sharing option from People Picker

* refactor: add SHARE_PUBLIC permission handling in updateInterfacePermissions tests
2026-01-10 14:02:56 -05:00
Danny Avila
a7aa4dc91b
🚦 refactor: Concurrent Request Limiter for Resumable Streams (#11167)
* feat: Implement concurrent request handling in ResumableAgentController

- Introduced a new concurrency management system by adding `checkAndIncrementPendingRequest` and `decrementPendingRequest` functions to manage user request limits.
- Replaced the previous `concurrentLimiter` middleware with a more integrated approach directly within the `ResumableAgentController`.
- Enhanced violation logging and request denial for users exceeding their concurrent request limits.
- Removed the obsolete `concurrentLimiter` middleware file and updated related imports across the codebase.

* refactor: Simplify error handling in ResumableAgentController and enhance SSE error management

- Removed the `denyRequest` middleware and replaced it with a direct response for concurrent request violations in the ResumableAgentController.
- Improved error handling in the `useResumableSSE` hook to differentiate between network errors and other error types, ensuring more informative error responses are sent to the error handler.

* test: Enhance MCP server configuration tests with new mocks and improved logging

- Added mocks for MCP server registry and manager in `index.spec.js` to facilitate testing of server configurations.
- Updated debug logging in `initializeMCPs.spec.js` to simplify messages regarding server configurations, improving clarity in test outputs.

* refactor: Enhance concurrency management in request handling

- Updated `checkAndIncrementPendingRequest` and `decrementPendingRequest` functions to utilize Redis for atomic request counting, improving concurrency control.
- Added error handling for Redis operations to ensure requests can proceed even during Redis failures.
- Streamlined cache key generation for both Redis and in-memory fallback, enhancing clarity and performance in managing pending requests.
- Improved comments and documentation for better understanding of the concurrency logic and its implications.

* refactor: Improve atomicity in Redis operations for pending request management

- Updated `checkAndIncrementPendingRequest` to utilize Redis pipelines for atomic INCR and EXPIRE operations, enhancing concurrency control and preventing edge cases.
- Added error handling for pipeline execution failures to ensure robust request management.
- Improved comments for clarity on the concurrency logic and its implications.
2026-01-01 11:10:56 -05:00
Danny Avila
01413eea3d
🛡️ feat: Add Middleware for JSON Parsing and Prompt Group Updates (#10757)
* 🗨️ fix: Safe Validation for Prompt Updates

- Added `safeValidatePromptGroupUpdate` function to validate and sanitize prompt group update requests, ensuring only allowed fields are processed and sensitive fields are stripped.
- Updated the `patchPromptGroup` route to utilize the new validation function, returning appropriate error messages for invalid requests.
- Introduced comprehensive tests for the validation logic, covering various scenarios including allowed and disallowed fields, enhancing overall request integrity and security.
- Created a new schema file for prompt group updates, defining validation rules and types for better maintainability.

* 🔒 feat: Add JSON parse error handling middleware
2025-12-02 00:10:30 -05:00
Danny Avila
838fb53208
🔃 refactor: Decouple Effects from AppService, move to data-schemas (#9974)
* chore: linting for `loadCustomConfig`

* refactor: decouple CDN init and variable/health checks from AppService

* refactor: move AppService to packages/data-schemas

* chore: update AppConfig import path to use data-schemas

* chore: update JsonSchemaType import path to use data-schemas

* refactor: update UserController to import webSearchKeys and redefine FunctionTool typedef

* chore: remove AppService.js

* refactor: update AppConfig interface to use Partial<TCustomConfig> and make paths and fileStrategies optional

* refactor: update checkConfig function to accept Partial<TCustomConfig>

* chore: fix types

* refactor: move handleRateLimits to startup checks as is an effect

* test: remove outdated rate limit tests from AppService.spec and add new handleRateLimits tests in checks.spec
2025-10-05 06:37:57 -04:00
Danny Avila
9a210971f5
🛜 refactor: Streamline App Config Usage (#9234)
* WIP: app.locals refactoring

WIP: appConfig

fix: update memory configuration retrieval to use getAppConfig based on user role

fix: update comment for AppConfig interface to clarify purpose

🏷️ refactor: Update tests to use getAppConfig for endpoint configurations

ci: Update AppService tests to initialize app config instead of app.locals

ci: Integrate getAppConfig into remaining tests

refactor: Update multer storage destination to use promise-based getAppConfig and improve error handling in tests

refactor: Rename initializeAppConfig to setAppConfig and update related tests

ci: Mock getAppConfig in various tests to provide default configurations

refactor: Update convertMCPToolsToPlugins to use mcpManager for server configuration and adjust related tests

chore: rename `Config/getAppConfig` -> `Config/app`

fix: streamline OpenAI image tools configuration by removing direct appConfig dependency and using function parameters

chore: correct parameter documentation for imageOutputType in ToolService.js

refactor: remove `getCustomConfig` dependency in config route

refactor: update domain validation to use appConfig for allowed domains

refactor: use appConfig registration property

chore: remove app parameter from AppService invocation

refactor: update AppConfig interface to correct registration and turnstile configurations

refactor: remove getCustomConfig dependency and use getAppConfig in PluginController, multer, and MCP services

refactor: replace getCustomConfig with getAppConfig in STTService, TTSService, and related files

refactor: replace getCustomConfig with getAppConfig in Conversation and Message models, update tempChatRetention functions to use AppConfig type

refactor: update getAppConfig calls in Conversation and Message models to include user role for temporary chat expiration

ci: update related tests

refactor: update getAppConfig call in getCustomConfigSpeech to include user role

fix: update appConfig usage to access allowedDomains from actions instead of registration

refactor: enhance AppConfig to include fileStrategies and update related file strategy logic

refactor: update imports to use normalizeEndpointName from @librechat/api and remove redundant definitions

chore: remove deprecated unused RunManager

refactor: get balance config primarily from appConfig

refactor: remove customConfig dependency for appConfig and streamline loadConfigModels logic

refactor: remove getCustomConfig usage and use app config in file citations

refactor: consolidate endpoint loading logic into loadEndpoints function

refactor: update appConfig access to use endpoints structure across various services

refactor: implement custom endpoints configuration and streamline endpoint loading logic

refactor: update getAppConfig call to include user role parameter

refactor: streamline endpoint configuration and enhance appConfig usage across services

refactor: replace getMCPAuthMap with getUserMCPAuthMap and remove unused getCustomConfig file

refactor: add type annotation for loadedEndpoints in loadEndpoints function

refactor: move /services/Files/images/parse to TS API

chore: add missing FILE_CITATIONS permission to IRole interface

refactor: restructure toolkits to TS API

refactor: separate manifest logic into its own module

refactor: consolidate tool loading logic into a new tools module for startup logic

refactor: move interface config logic to TS API

refactor: migrate checkEmailConfig to TypeScript and update imports

refactor: add FunctionTool interface and availableTools to AppConfig

refactor: decouple caching and DB operations from AppService, make part of consolidated `getAppConfig`

WIP: fix tests

* fix: rebase conflicts

* refactor: remove app.locals references

* refactor: replace getBalanceConfig with getAppConfig in various strategies and middleware

* refactor: replace appConfig?.balance with getBalanceConfig in various controllers and clients

* test: add balance configuration to titleConvo method in AgentClient tests

* chore: remove unused `openai-chat-tokens` package

* chore: remove unused imports in initializeMCPs.js

* refactor: update balance configuration to use getAppConfig instead of getBalanceConfig

* refactor: integrate configMiddleware for centralized configuration handling

* refactor: optimize email domain validation by removing unnecessary async calls

* refactor: simplify multer storage configuration by removing async calls

* refactor: reorder imports for better readability in user.js

* refactor: replace getAppConfig calls with req.config for improved performance

* chore: replace getAppConfig calls with req.config in tests for centralized configuration handling

* chore: remove unused override config

* refactor: add configMiddleware to endpoint route and replace getAppConfig with req.config

* chore: remove customConfig parameter from TTSService constructor

* refactor: pass appConfig from request to processFileCitations for improved configuration handling

* refactor: remove configMiddleware from endpoint route and retrieve appConfig directly in getEndpointsConfig if not in `req.config`

* test: add mockAppConfig to processFileCitations tests for improved configuration handling

* fix: pass req.config to hasCustomUserVars and call without await after synchronous refactor

* fix: type safety in useExportConversation

* refactor: retrieve appConfig using getAppConfig in PluginController and remove configMiddleware from plugins route, to avoid always retrieving when plugins are cached

* chore: change `MongoUser` typedef to `IUser`

* fix: Add `user` and `config` fields to ServerRequest and update JSDoc type annotations from Express.Request to ServerRequest

* fix: remove unused setAppConfig mock from Server configuration tests
2025-08-26 12:10:18 -04:00
Danny Avila
50b7bd6643
🔄 fix: Ensure lastRefill Date for Existing Users & Refactor Balance Middleware (#9086)
- Deleted setBalanceConfig middleware and its associated file.
- Introduced createSetBalanceConfig factory function to create middleware for synchronizing user balance settings.
- Updated auth and oauth routes to use the new balance configuration middleware.
- Added comprehensive tests for the new balance middleware functionality.
- Updated package versions and dependencies in package.json and package-lock.json.
- Added balance types and updated middleware index to export new balance middleware.
2025-08-15 17:02:49 -04:00
Danny Avila
e4e25aaf2b
🔎 feat: Add Prompt and Agent Permissions Migration Checks (#9063)
* chore: fix mock typing in packages/api tests

* chore: improve imports, type handling and method signatures for MCPServersRegistry

* chore: use enum in migration scripts

* chore: ParsedServerConfig type to enhance server configuration handling

* feat: Implement agent permissions migration check and logging

* feat: Integrate migration checks into server initialization process

* feat: Add prompt permissions migration check and logging to server initialization

* chore: move prompt formatting functions to dedicated prompts dir
2025-08-14 17:20:00 -04:00
“Praneeth
949682ef0f
🏪 feat: Agent Marketplace
bugfix: Enhance Agent and AgentCategory schemas with new fields for category, support contact, and promotion status

refactored and moved agent category methods and schema to data-schema package

🔧 fix: Merge and Rebase Conflicts

- Move AgentCategory from api/models to @packages/data-schemas structure
  - Add schema, types, methods, and model following codebase conventions
  - Implement auto-seeding of default categories during AppService startup
  - Update marketplace controller to use new data-schemas methods
  - Remove old model file and standalone seed script

refactor: unify agent marketplace to single endpoint with cursor pagination

  - Replace multiple marketplace routes with unified /marketplace endpoint
  - Add query string controls: category, search, limit, cursor, promoted, requiredPermission
  - Implement cursor-based pagination replacing page-based system
  - Integrate ACL permissions for proper access control
  - Fix ObjectId constructor error in Agent model
  - Update React components to use unified useGetMarketplaceAgentsQuery hook
  - Enhance type safety and remove deprecated useDynamicAgentQuery
  - Update tests for new marketplace architecture
  -Known issues:
  see more button after category switching + Unit tests

feat: add icon property to ProcessedAgentCategory interface

- Add useMarketplaceAgentsInfiniteQuery and useGetAgentCategoriesQuery to client/src/data-provider/Agents/
  - Replace manual pagination in AgentGrid with infinite query pattern
  - Update imports to use local data provider instead of librechat-data-provider
  - Add proper permission handling with PERMISSION_BITS.VIEW/EDIT constants
  - Improve agent access control by adding requiredPermission validation in backend
  - Remove manual cursor/state management in favor of infinite query built-ins
  - Maintain existing search and category filtering functionality

refactor: consolidate agent marketplace endpoints into main agents API and improve data management consistency

  - Remove dedicated marketplace controller and routes, merging functionality into main agents v1 API
  - Add countPromotedAgents function to Agent model for promoted agents count
  - Enhance getListAgents handler with marketplace filtering (category, search, promoted status)
  - Move getAgentCategories from marketplace to v1 controller with same functionality
  - Update agent mutations to invalidate marketplace queries and handle multiple permission levels
  - Improve cache management by updating all agent query variants (VIEW/EDIT permissions)
  - Consolidate agent data access patterns for better maintainability and consistency
  - Remove duplicate marketplace route definitions and middleware

selected view only agents injected in the drop down

fix: remove minlength validation for support contact name in agent schema

feat: add validation and error messages for agent name in AgentConfig and AgentPanel

fix: update agent permission check logic in AgentPanel to simplify condition

Fix linting WIP

Fix Unit tests WIP

ESLint fixes

eslint fix

refactor: enhance isDuplicateVersion function in Agent model for improved comparison logic

- Introduced handling for undefined/null values in array and object comparisons.
- Normalized array comparisons to treat undefined/null as empty arrays.
- Added deep comparison for objects and improved handling of primitive values.
- Enhanced projectIds comparison to ensure consistent MongoDB ObjectId handling.

refactor: remove redundant properties from IAgent interface in agent schema

chore: update localization for agent detail component and clean up imports

ci: update access middleware tests

chore: remove unused PermissionTypes import from Role model

ci: update AclEntry model tests

ci: update button accessibility labels in AgentDetail tests

refactor: update exhaustive dep. lint warning

🔧 fix: Fixed agent actions access

feat: Add role-level permissions for agent sharing people picker

  - Add PEOPLE_PICKER permission type with VIEW_USERS and VIEW_GROUPS permissions
  - Create custom middleware for query-aware permission validation
  - Implement permission-based type filtering in PeoplePicker component
  - Hide people picker UI when user lacks permissions, show only public toggle
  - Support granular access: users-only, groups-only, or mixed search modes

refactor: Replace marketplace interface config with permission-based system

  - Add MARKETPLACE permission type to handle marketplace access control
  - Update interface configuration to use role-based marketplace settings (admin/user)
  - Replace direct marketplace boolean config with permission-based checks
  - Modify frontend components to use marketplace permissions instead of interface config
  - Update agent query hooks to use marketplace permissions for determining permission levels
  - Add marketplace configuration structure similar to peoplePicker in YAML config
  - Backend now sets MARKETPLACE permissions based on interface configuration
  - When marketplace enabled: users get agents with EDIT permissions in dropdown lists  (builder mode)
  - When marketplace disabled: users get agents with VIEW permissions  in dropdown lists (browse mode)

🔧 fix: Redirect to New Chat if No Marketplace Access and Required Agent Name Placeholder (#8213)

* Fix: Fix the redirect to new chat page if access to marketplace is denied

* Fixed the required agent name placeholder

---------

Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>

chore: fix tests, remove unnecessary imports

refactor: Implement permission checks for file access via agents

- Updated `hasAccessToFilesViaAgent` to utilize permission checks for VIEW and EDIT access.
- Replaced project-based access validation with permission-based checks.
- Enhanced tests to cover new permission logic and ensure proper access control for files associated with agents.
- Cleaned up imports and initialized models in test files for consistency.

refactor: Enhance test setup and cleanup for file access control

- Introduced modelsToCleanup array to track models added during tests for proper cleanup.
- Updated afterAll hooks in test files to ensure all collections are cleared and only added models are deleted.
- Improved consistency in model initialization across test files.
- Added comments for clarity on cleanup processes and test data management.

chore: Update Jest configuration and test setup for improved timeout handling

- Added a global test timeout of 30 seconds in jest.config.js.
- Configured jest.setTimeout in jestSetup.js to allow individual test overrides if needed.
- Enhanced test reliability by ensuring consistent timeout settings across all tests.

refactor: Implement file access filtering based on agent permissions

- Introduced `filterFilesByAgentAccess` function to filter files based on user access through agents.
- Updated `getFiles` and `primeFiles` functions to utilize the new filtering logic.
- Moved `hasAccessToFilesViaAgent` function from the File model to permission services, adjusting imports accordingly
- Enhanced tests to ensure proper access control and filtering behavior for files associated with agents.

fix: make support_contact field a nested object rather than a sub-document

refactor: Update support_contact field initialization in agent model

- Removed handling for empty support_contact object in createAgent function.
- Changed default value of support_contact in agent schema to undefined.

test: Add comprehensive tests for support_contact field handling and versioning

refactor: remove unused avatar upload mutation field and add informational toast for success

chore: add missing SidePanelProvider for AgentMarketplace and organize imports

fix: resolve agent selection race condition in marketplace HandleStartChat
- Set agent in localStorage before newConversation to prevent useSelectorEffects from auto-selecting previous agent

fix: resolve agent dropdown showing raw ID instead of agent info from URL

  - Add proactive agent fetching when agent_id is present in URL parameters
  - Inject fetched agent into agents cache so dropdowns display proper name/avatar
  - Use useAgentsMap dependency to ensure proper cache initialization timing
  - Prevents raw agent IDs from showing in UI when visiting shared agent links

Fix: Agents endpoint renamed to "My Agent" for less confusion with the Marketplace agents.

chore: fix ESLint issues and Test Mocks

ci: update permissions structure in loadDefaultInterface tests

- Refactored permissions for MEMORY and added new permissions for MARKETPLACE and PEOPLE_PICKER.
- Ensured consistent structure for permissions across different types.

feat:  support_contact validation to allow empty email strings
2025-08-13 16:24:18 -04:00
Danny Avila
1ccac58403
🔒 fix: Provider Validation for Social, OpenID, SAML, and LDAP Logins (#8999)
* fix: social login provider crossover

* feat: Enhance OpenID login handling and add tests for provider validation

* refactor: authentication error handling to use ErrorTypes.AUTH_FAILED enum

* refactor: update authentication error handling in LDAP and SAML strategies to use ErrorTypes.AUTH_FAILED enum

* ci: Add validation for login with existing email and different provider in SAML strategy

chore: Add logging for existing users with different providers in LDAP, SAML, and Social Login strategies
2025-08-11 18:51:46 -04:00
Danny Avila
33834cd484
🧹 feat: Automatic File Cleanup for Mistral OCR Uploads (#8827)
* chore: Handle optional token_endpoint in OAuth metadata discovery

* chore: Simplify permission typing logic in checkAccess function

* feat: Implement `deleteMistralFile` function and integrate file cleanup in `uploadMistralOCR`
2025-08-03 17:11:14 -04:00
Danny Avila
91a2df4759
🔧 refactor: Change Permissions Check from some to every for Stricter Access Validation (#8270)
* 🔧 refactor: Change Permissions Check from `some` to `every` for Stricter Access Validation

* 🧪 ci: Add comprehensive tests for access middleware functions

* fix: custom provider check logic in `getProviderConfig` function
2025-07-05 15:53:08 -04:00
Danny Avila
33b4a97b42
🔒 fix: Agents Config/Permission Checks after Streamline Change (#8089)
* refactor: access control logic to TypeScript

* chore: Change EndpointURLs to a constant object for improved type safety

* 🐛 fix: Enhance agent access control by adding skipAgentCheck functionality

* 🐛 fix: Add endpointFileConfig prop to AttachFileMenu and update file handling logic

* 🐛 fix: Update tool handling logic to support optional groupedTools and improve null checks, add dedicated tool dialog for Assistants

* chore: Export Accordion component from UI index for improved modularity

* feat: Add ActivePanelContext for managing active panel state across components

* chore: Replace string IDs with EModelEndpoint constants for assistants and agents in useSideNavLinks

* fix: Integrate access checks for agent creation and deletion routes in actions.js
2025-06-26 18:53:05 -04:00