LibreChat/packages/api/src/acl
Danny Avila 3bd2681272
🪐 fix: Replace $bitsAllSet ACL Queries for Azure Cosmos DB Compatibility (#12736)
* 🐛 fix: replace `$bitsAllSet` ACL queries for Cosmos DB compatibility

Azure Cosmos DB for MongoDB API does not implement the `$bitsAllSet`
query operator, so every permission check against Cosmos DB threw. The
five read paths in `aclEntry.ts` (`hasPermission`, `findAccessibleResources`,
`findPublicResourceIds`, and two sites in `getSoleOwnedResourceIds`) now
fetch candidate entries and apply the bitwise mask in application code.
This matches the existing FerretDB-compatible pattern.

Fixes #12729.

* 🐛 fix: delegate `findPubliclyAccessibleResources` to fixed DB method

`AccessControlService.findPubliclyAccessibleResources` inlined the same
`$bitsAllSet` query as the data-schemas layer, which fails on Azure
Cosmos DB for MongoDB. Delegate to `_dbMethods.findPublicResourceIds`
so a single implementation carries the Cosmos-compatible bitwise logic.

Refs #12729.

* 🐛 fix: move `$bitsAllSet` filter out of remote-agent aggregation

`enrichRemoteAgentPrincipals` used `$bitsAllSet` inside an aggregation
`$match`, which Azure Cosmos DB for MongoDB does not implement. Project
`permBits` through the pipeline and filter for `PermissionBits.SHARE` in
application code. The extra documents fetched are bounded by ACL entries
on a single agent resource, so the cost is negligible.

Refs #12729.

* 🧪 test: rename misleading public-dedup test and add real dedup coverage

The test previously named "returns deduplicated IDs even if the public
principal has multiple entries" only set up a single ACL entry, so it
did not actually exercise deduplication. Split into two tests: one for
the happy path (single entry with required bits), and one that bypasses
`grantPermission`'s upsert via `AclEntry.create` to confirm the
application-layer dedup Map handles genuine duplicates.

Refs #12729.

* 🧪 test: cover SHARE-bit filter in `enrichRemoteAgentPrincipals`

The `$bitsAllSet` match stage in `enrichRemoteAgentPrincipals` previously
guaranteed every aggregation row had SHARE; the Cosmos DB fix moved that
check into a JS `continue` branch with no direct coverage. Add a
dependency-injected unit test that stubs the aggregation with mixed
SHARE / non-SHARE / zero-bit rows and asserts only SHARE holders are
enriched and queued for backfill. Also includes a regression guard that
the `$match` pipeline stage no longer contains a `permBits` filter.

Refs #12729.

* ♻️ refactor: extract `filterByBitsAndDedup` helper for ACL reads

`findAccessibleResources` and `findPublicResourceIds` each inlined the
same bitmask-filter + `Map`-based dedup loop. Lift it into a private
`filterByBitsAndDedup(entries, requiredBits)` helper so the Cosmos-DB
compatible pattern lives in one place. Pure rename/extract — no
behavior change.

Refs #12729.

* 📝 docs: fix stale `\$bitsAllSet` references in FerretDB spec

The describe block and header comment in the FerretDB parity spec still
referenced `\$bitsAllSet queries` after the Cosmos DB compatibility fix
moved the bit mask into application code. Update the title to
\"Bitwise permission queries\" and rewrite the header comment to
describe the application-layer behavior being validated.

Refs #12729.

*  perf: push permission-mask filter back to the database via `$in`

The original fix for #12729 moved `$bitsAllSet` filtering into application
code, which meant every ACL read fetched the full set of rows for a
principal/resource and filtered in JS. For tenants with large ACL
collections this inflates wire transfer and heap.

Replace the JS filter with `permBits: { $in: permissionBitSupersets(X) }`.
For the 4-bit `PermissionBits` enum the `$in` list is at most 16 values
(8 for a single-bit mask like SHARE). `$in` is indexable and supported by
Azure Cosmos DB for MongoDB, so the filter runs on the server again —
restoring `.distinct('resourceId')` and `findOne()` semantics.

`permissionBitSupersets(requiredBits)` is memoized and exported from
`@librechat/data-schemas`. Callers restored:
  - `hasPermission`: back to `findOne` short-circuit
  - `findAccessibleResources` / `findPublicResourceIds`: back to `.distinct()`
  - `getSoleOwnedResourceIds`: back to the `$match` + `$group` aggregation
  - `enrichRemoteAgentPrincipals`: bit filter back in `$match`, JS `continue` removed

Refs #12729.

* 🧪 test: add `\$bitsAllSet` vs `\$in` parity + perf spec

Introduces `aclEntry.parity.spec.ts` — a side-by-side spec that runs the
legacy `\$bitsAllSet` query and the current `\$in`-based query against the
same `mongodb-memory-server` fixture and asserts identical output sets for
every affected method (`hasPermission`, `findAccessibleResources`,
`findPublicResourceIds`, `getSoleOwnedResourceIds`) across all 7 meaningful
permBits combinations.

Also logs median wall-clock time for the two query paths over 20 runs on
an 800-entry fixture, with a loose 3x guard against catastrophic
regressions. Initial local numbers: 1.05 ms vs 1.07 ms
(findAccessibleResources), 1.10 ms vs 1.05 ms (findPublicResourceIds).

Refs #12729.

* 🔒 hardening: freeze `permissionBitSupersets` cache + enum-shape guard

Two defensive changes from the comprehensive audit:

* Cached superset arrays are now `Object.freeze`d and the return type is
  `readonly number[]`. Previously the cached arrays were returned by
  mutable reference, so a caller that mutated the result would silently
  corrupt the process-wide cache for every subsequent permission check.
  `Object.freeze` turns that into a loud `TypeError` at the mutation
  site. All existing call sites pass the result directly to Mongoose's
  `$in`, which does not mutate.
* Added a module-load guard `if (MAX_PERM_BITS === 0) throw`. If
  `PermissionBits` is ever refactored to a `const` object or string
  enum, `Object.values(...).filter(isNumber)` would return `[]` and
  `MAX_PERM_BITS` would silently become 0, making every query match no
  rows and breaking every permission check. The guard fails loudly
  instead.

Also collapsed four identical JSDoc lines across `hasPermission`,
`findAccessibleResources`, `findPublicResourceIds`, and
`getSoleOwnedResourceIds` into a single `{@link permissionBitSupersets}`
reference.

Refs #12729.

* 🧪 test: add focused unit tests for `permissionBitSupersets`

The helper is the single point of correctness for every ACL read path
(every query uses `permBits: { $in: permissionBitSupersets(X) }`), so it
warrants direct coverage independent of the higher-level parity and
behavior specs. Six cases added:

* `requiredBits=0` returns all 16 values
* `requiredBits=15` returns `[15]` only
* every returned value is a bitwise superset of `requiredBits`
* full parity against the `$bitsAllSet` definition for every
  `required` in 0..15
* memoization: repeat calls return the same frozen reference
* frozen result throws `TypeError` on mutation attempts

Refs #12729.

* 🧪 test: tighten parity perf guard and document fixture constants

The `expect(currentMs).toBeLessThan(legacyMs * 3 + 50)` form was
dominated by the `+ 50` additive term at typical sub-ms query
latencies — at legacy=1ms, a 50x regression would still pass. Replace
with `Math.max(legacyMs * 5, 50)` so the multiplicative ceiling is
intact once the new path climbs out of the fixed noise floor.

Also added inline rationale for the `FIXTURE_SIZE = 800` and
`PERF_ITERATIONS = 20` constants.

Refs #12729.

* 🧹 chore: remove stale perf-guard comment, hoist rationale to describe

Commit a75f122 left a "Sanity check: A 3x multiplier" comment block
above the first perf guard, and 21852ac layered a second
"Multiplicative-dominant guard" block directly below it. The stale
block described the `legacyMs * 3 + 50` formula that no longer exists,
so both blocks coexisted and contradicted each other.

Delete the stale block from the first test, remove the redundant copy
from the second test, and lift the (now-single) rationale to the
enclosing `describe('performance')` block. Each test is now one
`expect` line — the rationale lives once, at the scope it applies to.

Refs #12729.

* 📝 docs: sharpen MAX_PERM_BITS guard message + test-sort consistency

Two NITs from the follow-up audit:

* The `MAX_PERM_BITS === 0` guard now names the specific refactors that
  could cause it (const object / string enum / all-string shape) and
  gives two concrete remediation paths (rewrite the reducer, or give
  `permissionBitSupersets` an explicit bit-width parameter). Previously
  the message just said "update permissionBitSupersets before
  continuing", which was vague.
* The `requiredBits=15` unit test now applies the same
  `[...result].sort((a, b) => a - b)` normalization as the other tests
  so the set-equality assertions are uniform. The function happens to
  return values in ascending order, but the helper's JSDoc does not
  promise ordering, so sorting before comparing is the correct
  defensive pattern.

Refs #12729.

* 🔒 fix: enforce `permBits <= MAX_PERM_BITS` at the schema level

Codex flagged a real silent regression: `permissionBitSupersets`
enumerates `$in` candidates only in `[0, MAX_PERM_BITS]` (currently 15),
so any ACL row with bits above the enum — e.g. `permBits = 31` from a
future role or a manual DB write — would be excluded from reads that
`$bitsAllSet` would have matched. Current write paths only produce
values in `[0, 15]` via the `PermissionBits` / `RoleBits` enums, so
no live data is affected, but the schema did not prevent out-of-range
writes, so the divergence was reachable.

Fix: add `min: 0`, `max: MAX_PERM_BITS`, and an `isInteger` validator to
the `permBits` field in the AclEntry schema. `MAX_PERM_BITS` is derived
from `PermissionBits` the same way `permissionBitSupersets` computes it,
so when a new bit is added to the enum both the read-side enumeration
and the write-side bound expand together.

Tests: four new cases cover the upper bound, over-limit, negative, and
non-integer inputs, each asserting `Mongoose.Error.ValidationError`.
Plus updated JSDoc on `permissionBitSupersets` to document the
invariant that the schema now enforces.

Refs #12729.

* ♻️ refactor: hoist `MAX_PERM_BITS` + enum-shape/ceiling guards to shared util

Addresses one MINOR DRY finding and one NIT defensive-guard finding from
the latest audit. `MAX_PERM_BITS` was duplicated in `methods/aclEntry.ts`
and `schema/aclEntry.ts` with identical computations; they could silently
diverge. Move the constant plus both module-load guards to
`src/common/permissions.ts`:

* `MAX_PERM_BITS === 0` guard — fail loudly if `PermissionBits` is ever
  refactored to a `const` object or string enum (applied in both sites
  now, not just methods).
* `MAX_PERM_BITS > 255` guard — circuit-breaker for the `$in` enumeration
  strategy. At 4 bits the list tops out at 16; at 8 bits 256. Beyond
  that the approach degrades, so fail at module load rather than emit an
  unusably-large `$in` list.

Both the schema and the methods file now import from the single
`src/common/permissions.ts`. `readonly number[]` return type and
`Object.freeze` cache-value protection from the prior commits are
preserved.

Refs #12729.

* 🔒 fix: reject out-of-range `requiredBits` without caching (DoS fix)

Codex flagged a real unbounded-cache DoS vector:
`api/server/controllers/agents/v1.js` parses `req.query.requiredPermission`
via `parseInt` and forwards it unvalidated to `findAccessibleResources`,
which eventually calls `permissionBitSupersets(requiredBits)`. Because
the old code memoized *every* distinct input in a process-global Map,
an attacker could flood the cache with unique integers and grow memory
without bound.

Fix: when `requiredBits` is not an integer, is negative, or has any bits
set above `MAX_PERM_BITS`, return a single shared frozen empty array and
do NOT touch the cache. An empty `$in` list correctly matches zero rows
(rows cannot satisfy a bit we do not support), so the response is also
semantically correct — previously it "worked" only because invalid
inputs coincidentally expanded to `[]` too, but at the cost of a cache
entry each time.

Five new tests exercise the rejection path: upper-bound overflow, mixed
in-range+out-of-range bits, shared-instance identity across rejected
inputs, a 2000-unique-value cache-growth probe, and a regression check
that legitimate in-range inputs still get memoized normally.

Refs #12729.

* 🧪 test: probe `Map.prototype.set` to prove cache doesn't grow on rejection

Strengthens the DoS cache-growth test the review pass flagged: reference
identity alone allows a hypothetical regression like
`supersetCache.set(requiredBits, EMPTY_SUPERSETS); return EMPTY_SUPERSETS;`
to pass while still leaking one entry per attacker request.

Add a second test that spies on `Map.prototype.set`, records the global
call count before and after a burst of 1500 rejected inputs (500 each of
over-MAX integers, negatives, and non-integers), and asserts the delta
is zero. Manually verified the test has teeth: injecting the
hypothetical regression in the implementation produced exactly 1500
extra `Map.set` calls and the new test failed as expected; reverting
restored the clean state.

Renames the existing identity-based test to `(reference identity)` so
its scope is clear alongside the new `(Map-write probe)` companion.

Refs #12729.
2026-04-19 22:28:48 -04:00
..
accessControlService.spec.ts
accessControlService.ts 🪐 fix: Replace $bitsAllSet ACL Queries for Azure Cosmos DB Compatibility (#12736) 2026-04-19 22:28:48 -04:00