Commit graph

4 commits

Author SHA1 Message Date
Danny Avila
963068b112 🧬 feat: Scaffold Skills CRUD with ACL Sharing and File Schema (#12613)
* 🧬 feat: Scaffold Skills CRUD with ACL Sharing and File Schema

Adds Skills as a new first-class resource modeled on Anthropic's Agent
Skills, reusing the existing Prompt ACL stack for sharing. Lays the
groundwork for multi-file skills (SkillFile schema + metadata routes)
without wiring upload processing — single-file skills (inline SKILL.md
body) work end-to-end, multi-file uploads are stubbed for phase 2.

* 🔬 fix: Wire Skill Cleanup, AccessRole Enum, and Express 5 Path Params

CI surfaced four follow-ups from the initial Skills scaffolding commit
that local builds missed:

- AccessRole's resourceType field had a hardcoded enum that didn't
  include `'skill'`, blocking SKILL_OWNER/EDITOR/VIEWER role creation
  in every test that hit the AccessRole model.
- The seedDefaultRoles assertion in accessRole.spec.ts hard-listed the
  expected role IDs and needed the new SKILL_* entries.
- deleteUserController had no cleanup for skills, and the
  deleteUserResourceCoverage guard test enforces every ResourceType has
  a documented handler — wired in db.deleteUserSkills(user._id) and
  added the entry to HANDLED_RESOURCE_TYPES.
- Express 5's path-to-regexp v6 rejects the legacy `(*)` named-group
  glob syntax. The two skill file routes now use a plain `:relativePath`
  param; the client already encodeURIComponents the path, so a single
  param is sufficient and decoded server-side.

* 🪡 fix: Make Skill Name Uniqueness Application-Level

Resolve three more CI failures from the Skills scaffolding PR:

- Mongoose creates indexes asynchronously and mongodb-memory-server
  tests can race ahead of the unique (name, author, tenantId) index
  being built, so the duplicate-name uniqueness test was flaky.
  Added an explicit findOne pre-check inside createSkill that throws
  with code 11000 (mimicking the index violation), giving deterministic
  behavior. The unique index stays as the persistent guarantee.
- The deleteUser.spec.js and UserController.spec.js suites mock the
  ~/models module directly and were missing deleteUserSkills, causing
  deleteUserController to throw and return 500 instead of 200.
- Removed two doc-comment claims that the SKILL_NAME_MAX_LENGTH and
  SKILL_DESCRIPTION_MAX_LENGTH constants "match Anthropic's API". The
  values themselves are reasonable but the comments were misleading
  about who enforces them.

* 🪢 fix: Address Code Review Findings on Skills Scaffolding

Resolve all 15 findings from the comprehensive PR review:

Critical:
- Rollback the created skill when grantPermission throws so a transient
  ACL failure cannot leave an orphaned, inaccessible skill in the DB.
- Fix infinite query cache corruption in useUpdateSkillMutation helpers.
  setQueriesData([QueryKeys.skills]) matches useSkillsInfiniteQuery's
  InfiniteData cache entries, which have { pages, pageParams } shape —
  spreading data.skills on those would throw. Added an isInfiniteSkillData
  guard and per-page transform so both flat and infinite caches update
  correctly.

Major:
- Fix TUpdateSkillContext type: the public type declared previousListData
  but onMutate actually returns previousListSnapshots (a [key, value]
  tuple array). Updated the type + added TSkillCacheEntry as a shared
  export from data-provider.
- Add cancelQueries calls before optimistic update in onMutate so
  in-flight refetches cannot clobber the optimistic state.
- Parallelize deleteUserSkills ACL removal via Promise.allSettled instead
  of a sequential await loop — O(1) round-trip vs O(n).
- Stub mockDeleteUserSkills in stubDeletionMocks() and assert it's called
  with user.id in the deleteUser.spec.js happy-path test.
- Add idResolver: getSkillById to the SKILL branch in accessPermissions.js
  so GET /api/permissions/skill/<missing-id> returns 404 instead of 403.

Minor:
- Reuse resolved skill from req.resourceAccess.resourceInfo in getHandler
  to eliminate a redundant getSkillById call per GET /api/skills/:id.
- Reject PATCH /api/skills/:id requests whose body contains only
  expectedVersion — previously they silently bumped version with no
  changes, triggering spurious 409s for collaborators.
- Make TSkill.frontmatter optional (wire type) and add serializeFrontmatter
  / serializeSourceMetadata helpers that return undefined for empty
  objects instead of casting incomplete data to SkillFrontmatter.
- Standardize deleteUserSkills to accept string | ObjectId and convert
  internally, matching deleteUserPrompts's signature; UserController now
  passes user.id consistently.
- Replace bumpSkillVersionAndRecount (read-then-write, racy) with
  bumpSkillVersionAndAdjustFileCount using atomic $inc. upsertSkillFile
  pre-checks existence to distinguish insert (+1) from replace (0).
- Add DELETE /api/skills/:id/files/:relativePath integration tests
  covering success, 404, and 403 paths.

Nits:
- Drop trivial resolveSkillId wrapper — pass getSkillById directly.
- Remove dead staleTime: 1000 * 10 from useListSkillsQuery since all
  refetch triggers are already disabled.

* 🧭 fix: Resolve Second Skills Review Pass — Cache, Gate, TOCTOU

Address 13 of 14 findings from the second code review; reject #13 as
misread of the AGENTS.md import-order rule (package types correctly
precede local types regardless of length).

Major:
- Fix addSkillToCachedLists closure bug: a hoisted `prepended` flag
  was shared across every cache entry matched by setQueriesData, so
  concurrent flat + infinite caches would silently drop the prepend
  on whichever was processed second. Replaced the shared helper with
  three per-entry inline updaters that handle InfiniteData at the
  page level (page 0 only for prepend, all pages for replace/remove).
- Tighten patchHandler's expectedVersion validation: NaN passes
  `typeof === 'number'` and would previously leak current skill state
  via a misleading 409. Now requires finite positive integer and
  returns 400 otherwise.
- Guard decodeURIComponent in deleteFileHandler with try/catch —
  malformed percent encoding now returns 400 instead of 500.
- Add PermissionTypes.SKILLS + skillPermissionsSchema +
  TSkillPermissions in data-provider; seed default SKILLS permissions
  for ADMIN (all true) and USER (use + create only); wire
  checkSkillAccess / checkSkillCreate via generateCheckAccess onto
  the skills router mirroring the prompts pattern. Skills route now
  enforces role-based capability gates alongside per-resource ACLs.
  Test suite adds a mocked getRoleByName returning permissive SKILLS.
- Fix upsertSkillFile TOCTOU: replaced the pre-check + upsert pair
  with a single `findOneAndUpdate({ new: false, upsert: true })` call
  that atomically returns the pre-update doc (null ⇒ insert) so
  fileCount delta can't double-count on concurrent same-path uploads.

Minor:
- Add `sourceMetadata` to listSkillsByAccess .select() so summaries
  no longer silently drop the field for GitHub/Notion-synced skills.
- Include `cursor` in useListSkillsQuery's query key so manual
  pagination doesn't alias across pages.
- Clean up TSkillSummary to `Omit<TSkill, 'body' | 'frontmatter'>`
  matching what serializeSkillSummary actually emits; drop the
  Omit-then-re-add noise.
- Skip getPublicSkillIdSet in createHandler; a newly-created skill
  cannot have a PUBLIC ACL entry, so pass an empty set directly
  instead of paying a DB round-trip.
- Trim SkillMethods public surface: drop internal helpers
  countSkillFiles / deleteSkillFilesBySkillId / getSkillFile from the
  return object; inline the file cascade into deleteSkill.
- Use TSkillConflictResponse at the PATCH 409 call site instead of
  an inline ad-hoc object literal.
- Drop the now-unused EXPECTED_VERSION_ERROR module constant.

* 🧩 fix: Extend Role Schema + Types with SKILLS PermissionType

CI type-check and unit test failures from the PermissionTypes.SKILLS
addition surfaced three unrelated places that all hardcode the
permission-type set:

- IRole.permissions in data-schemas/types/role.ts enumerates every
  PermissionTypes key as an optional field. Adding SKILLS to the enum
  without updating the interface caused TS7053 'expression of type
  PermissionTypes can't be used to index type' errors in
  role.methods.spec.ts (lines 407-408, 477-478) because
  Object.values(PermissionTypes) now yielded a value the interface
  didn't cover.
- schema/role.ts rolePermissionsSchema mirrors the interface at the
  Mongoose layer; also needed SKILLS added so the persisted role
  document can actually store skill permissions.
- data-provider/roles.spec.ts has a guard test that every permission
  type carrying CREATE/SHARE/SHARE_PUBLIC must be explicitly "tracked"
  either in RESOURCE_PERMISSION_TYPES or in the PROMPTS/AGENTS/MEMORIES
  exemption list. Added SKILLS to the exemption list since skills
  follow the same default model as prompts/agents (USE + CREATE on for
  USER, SHARE / SHARE_PUBLIC off).

All three are additive pass-throughs with no behavior change.

* 🏷️ refactor: Introduce ISkillSummary for Narrow List Projection

Follow-up NITs from the second review pass on the Skills PR:

- Define ISkillSummary = Omit<ISkill, 'body' | 'frontmatter'> and use
  it as the element type in ListSkillsByAccessResult. The list query's
  .select() intentionally omits body and frontmatter for payload size,
  but the previous type claimed both fields were present — a type lie
  that would mislead future readers even though serializeSkillSummary
  never touches those fields at runtime. handlers.ts's signature for
  serializeSkillSummary now accepts ISkillSummary too.
- Document the intentional second-round-trip `findOne` in
  upsertSkillFile. Switching to `findOneAndUpdate({ new: false })`
  was required for TOCTOU-safe insert-vs-replace detection, which
  means the handler needs a follow-up query to return the post-upsert
  document. A comment now explains the tradeoff so future readers
  don't silently "optimize" it away.

No behavior change.

* 🌐 fix: Wire SKILL into SHARE_PUBLIC Resource Maps

Address codex comment #1 — making a skill public was blocked on two
hardcoded resource→permission-type maps that didn't know about SKILL:

- api/server/middleware/checkSharePublicAccess.js's
  resourceToPermissionType map was missing ResourceType.SKILL, so
  PUT /api/permissions/skill/:id with { public: true } would fall
  through to the 400 "Unsupported resource type for public sharing"
  path even though PermissionTypes.SKILLS exists and ADMIN has
  SHARE_PUBLIC configured. Added the mapping.
- client/src/hooks/Sharing/useCanSharePublic.ts has an identical
  client-side map used to gate the "Make Public" UI toggle. Without
  the SKILL mapping the hook returned false for everyone, so the
  toggle wouldn't render for skills once the sharing UI lands in
  phase 2. Added the mapping.

Codex comment #2 (create/update cache writes inject skills into
unrelated filtered lists) is invalid — it flags a pattern that
mirrors useUpdatePromptGroup (which the PR description explicitly
cites as the model) and is a deliberate optimistic-update tradeoff.
Trying to match each cache key's embedded filter would couple the
mutation callback to query-key internals, which is exactly what
setQueriesData is designed to avoid. No change there.

* 🧪 feat: Frontmatter Validation, Reserved-Name Fixes, Coaching Warnings

Address the follow-up review notes on the Skills PR. This commit closes
the gap between the wire-type promise and what the backend actually
enforces, tightens the reserved-name rules, and adds a non-blocking
coaching tier for validators.

Frontmatter validation (new):
- Add `validateSkillFrontmatter` in data-schemas/methods/skill.ts with
  strict mode — unknown keys are rejected so expanding the allowed set
  is an intentional code change. Known keys are type-checked against a
  `FrontmatterKind` table derived from Anthropic's Agent Skills spec
  (name, description, when-to-use, allowed-tools, arguments,
  argument-hint, user-invocable, disable-model-invocation, model,
  effort, context, agent, paths, shell, hooks, version, metadata).
- `hooks` and `metadata` get a shallow JSON-safety check (max depth 4,
  max string 2000, max array 100) instead of a full schema, since their
  full shapes live outside this module.
- Wired into BOTH createSkill AND updateSkill so the PATCH path can't
  smuggle invalid frontmatter past the validator.

Validation warning tier (new):
- Add optional `severity: 'error' | 'warning'` to `ValidationIssue`
  (defaults to error). `partitionIssues` splits an issue list into
  blocking errors and non-blocking warnings.
- `createSkill` / `updateSkill` filter on errors for the throw check
  and return warnings in a new `warnings: ValidationIssue[]` field on
  their result objects (`CreateSkillResult` / `UpdateSkillResult`).
- `validateSkillDescription` now emits a `TOO_SHORT` warning for
  descriptions under 20 chars — the primary triggering field, so a
  little coaching goes a long way.
- `createHandler` / `patchHandler` in packages/api surface the warnings
  via a new `attachWarnings` helper that decorates the serialized
  response with a `warnings?: TSkillWarning[]` field.
- `TSkill` gains an optional `warnings?: TSkillWarning[]` field
  documented as "present on POST/PATCH, never on GET".

Reserved-name filter (tightened):
- Replace the substring match (`.includes('anthropic')`) with prefix
  matching on `anthropic-` and `claude-` plus exact-match rejection of
  CLI slash-command collisions (help, clear, compact, model, exit,
  quit, settings, plus the bare `anthropic` / `claude` words). Both
  the pure validator (`methods/skill.ts`) and the Mongoose schema
  validator (`schema/skill.ts`) updated in lockstep; comments on
  each reference the other to prevent drift.
- `research-anthropic-helper` and `about-claude` are now allowed;
  `anthropic-helper`, `claude-bot`, and `settings` are still rejected.

Documentation:
- Add docstrings on `ISkill`, `schema/skill.ts`, and `TSkill` explaining
  the semantics of `name` (Claude-visible identifier, kebab-case,
  stable), `displayTitle` (UI-only cosmetic label, NOT sent to Claude),
  `description` (highest-leverage trigger field), and `source` /
  `sourceMetadata` (reserved for phase 2+ external sync).
- Add a detailed consistency comment on `bumpSkillVersionAndAdjustFileCount`
  explaining that it runs as a separate MongoDB operation from
  upsertSkillFile/deleteSkillFile, so `fileCount` can drift if the
  second op fails — options listed, tradeoff documented, phase 1
  risk window noted as closed because upload is still stubbed.

Tests:
- data-schemas skill.spec.ts: destructure `{ skill, warnings }` from
  createSkill at every call site; add a TOO_SHORT warning test, a
  frontmatter strict-mode test, reserved-prefix tests (including
  positive cases for substring names that should pass), CLI reserved
  word tests, and a full `validateSkillFrontmatter` describe block
  covering unknown keys, type mismatches, and deep-nesting rejection.
- api/server/routes/skills.test.js: bump default test description
  above the 20-char threshold, add a warning-emission test, add
  reserved-prefix + reserved-CLI-word tests, add an unknown-frontmatter-
  key test asserting the 400 response carries `issues` with `UNKNOWN_KEY`.

* 📦 fix: Export CreateSkillResult from data-schemas Methods Index

`CreateSkillResult` was defined in `methods/skill.ts` and consumed by
`packages/api/src/skills/handlers.ts` but never re-exported from the
methods barrel, so the type-check job failed with TS2724
"'@librechat/data-schemas' has no exported member named 'CreateSkillResult'".

Rollup's bundle-mode build picked up the type via its internal resolver,
but the standalone `tsc --noEmit` type-check ran against the package's
public entrypoint and couldn't see it. Added the type import + export
alongside the existing `UpdateSkillResult` export, which fixes the
CI type-check without any runtime change.
2026-04-25 04:01:59 -04:00
Atef Bellaaj
a0fed6173c
🗂️ refactor: Migrate S3 Storage to TypeScript in packages/api (#11947)
* Migrate S3 storage module with unit and integration tests

  - Migrate S3 CRUD and image operations to packages/api/src/storage/s3/
  - Add S3ImageService class with dependency injection
  - Add unit tests using aws-sdk-client-mock
  - Add integration tests with real s3 bucket (condition presence of  AWS_TEST_BUCKET_NAME)

* AI Review Findings Fixes

* chore: tests and refactor S3 storage types

- Added mock implementations for the 'sharp' library in various test files to improve image processing testing.
- Updated type references in S3 storage files from MongoFile to TFile for consistency and type safety.
- Refactored S3 CRUD operations to ensure proper handling of file types and improve code clarity.
- Enhanced integration tests to validate S3 file operations and error handling more effectively.

* chore: rename test file

* Remove duplicate import of refreshS3Url

* chore: imports order

* fix: remove duplicate imports for S3 URL handling in UserController

* fix: remove duplicate import of refreshS3FileUrls in files.js

* test: Add mock implementations for 'sharp' and '@librechat/api' in UserController tests

- Introduced mock functions for the 'sharp' library to facilitate image processing tests, including metadata retrieval and buffer conversion.
- Enhanced mocking for '@librechat/api' to ensure consistent behavior in tests, particularly for the needsRefresh and getNewS3URL functions.

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2026-03-21 14:28:55 -04:00
Danny Avila
8ba2bde5c1
📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830)
* chore: move database model methods to /packages/data-schemas

* chore: add TypeScript ESLint rule to warn on unused variables

* refactor: model imports to streamline access

- Consolidated model imports across various files to improve code organization and reduce redundancy.
- Updated imports for models such as Assistant, Message, Conversation, and others to a unified import path.
- Adjusted middleware and service files to reflect the new import structure, ensuring functionality remains intact.
- Enhanced test files to align with the new import paths, maintaining test coverage and integrity.

* chore: migrate database models to packages/data-schemas and refactor all direct Mongoose Model usage outside of data-schemas

* test: update agent model mocks in unit tests

- Added `getAgent` mock to `client.test.js` to enhance test coverage for agent-related functionality.
- Removed redundant `getAgent` and `getAgents` mocks from `openai.spec.js` and `responses.unit.spec.js` to streamline test setup and reduce duplication.
- Ensured consistency in agent mock implementations across test files.

* fix: update types in data-schemas

* refactor: enhance type definitions in transaction and spending methods

- Updated type definitions in `checkBalance.ts` to use specific request and response types.
- Refined `spendTokens.ts` to utilize a new `SpendTxData` interface for better clarity and type safety.
- Improved transaction handling in `transaction.ts` by introducing `TransactionResult` and `TxData` interfaces, ensuring consistent data structures across methods.
- Adjusted unit tests in `transaction.spec.ts` to accommodate new type definitions and enhance robustness.

* refactor: streamline model imports and enhance code organization

- Consolidated model imports across various controllers and services to a unified import path, improving code clarity and reducing redundancy.
- Updated multiple files to reflect the new import structure, ensuring all functionalities remain intact.
- Enhanced overall code organization by removing duplicate import statements and optimizing the usage of model methods.

* feat: implement loadAddedAgent and refactor agent loading logic

- Introduced `loadAddedAgent` function to handle loading agents from added conversations, supporting multi-convo parallel execution.
- Created a new `load.ts` file to encapsulate agent loading functionalities, including `loadEphemeralAgent` and `loadAgent`.
- Updated the `index.ts` file to export the new `load` module instead of the deprecated `loadAgent`.
- Enhanced type definitions and improved error handling in the agent loading process.
- Adjusted unit tests to reflect changes in the agent loading structure and ensure comprehensive coverage.

* refactor: enhance balance handling with new update interface

- Introduced `IBalanceUpdate` interface to streamline balance update operations across the codebase.
- Updated `upsertBalanceFields` method signatures in `balance.ts`, `transaction.ts`, and related tests to utilize the new interface for improved type safety.
- Adjusted type imports in `balance.spec.ts` to include `IBalanceUpdate`, ensuring consistency in balance management functionalities.
- Enhanced overall code clarity and maintainability by refining type definitions related to balance operations.

* feat: add unit tests for loadAgent functionality and enhance agent loading logic

- Introduced comprehensive unit tests for the `loadAgent` function, covering various scenarios including null and empty agent IDs, loading of ephemeral agents, and permission checks.
- Enhanced the `initializeClient` function by moving `getConvoFiles` to the correct position in the database method exports, ensuring proper functionality.
- Improved test coverage for agent loading, including handling of non-existent agents and user permissions.

* chore: reorder memory method exports for consistency

- Moved `deleteAllUserMemories` to the correct position in the exported memory methods, ensuring a consistent and logical order of method exports in `memory.ts`.
2026-03-21 14:28:53 -04:00
Danny Avila
38521381f4
🐘 feat: FerretDB Compatibility (#11769)
* feat: replace unsupported MongoDB aggregation operators for FerretDB compatibility

Replace $lookup, $unwind, $sample, $replaceRoot, and $addFields aggregation
stages which are unsupported on FerretDB v2.x (postgres-documentdb backend).

- Prompt.js: Replace $lookup/$unwind/$project pipelines with find().select().lean()
  + attachProductionPrompts() batch helper. Replace $group/$replaceRoot/$sample
  in getRandomPromptGroups with distinct() + Fisher-Yates shuffle.
- Agent/Prompt migration scripts: Replace $lookup anti-join pattern with
  distinct() + $nin two-step queries for finding un-migrated resources.

All replacement patterns verified against FerretDB v2.7.0.

* fix: use $pullAll for simple array removals, fix memberIds type mismatches

Replace $pull with $pullAll for exact-value scalar array removals. Both
operators work on MongoDB and FerretDB, but $pullAll is more explicit for
exact matching (no condition expressions).

Fix critical type mismatch bugs where ObjectId values were used against
String[] memberIds arrays in Group queries:
- config/delete-user.js: use string uid instead of ObjectId user._id
- e2e/setup/cleanupUser.ts: convert userId.toString() before query

Harden PermissionService.bulkUpdateResourcePermissions abort handling to
prevent crash when abortTransaction is called after commitTransaction.

All changes verified against FerretDB v2.7.0 and MongoDB Memory Server.

* fix: harden transaction support probe for FerretDB compatibility

Commit the transaction before aborting in supportsTransactions probe, and
wrap abortTransaction in try-catch to prevent crashes when abort is called
after a successful commit (observed behavior on FerretDB).

* feat: add FerretDB compatibility test suite, retry utilities, and CI config

Add comprehensive FerretDB integration test suite covering:
- $pullAll scalar array operations
- $pull with subdocument conditions
- $lookup replacement (find + manual join)
- $sample replacement (distinct + Fisher-Yates)
- $bit and $bitsAllSet operations
- Migration anti-join pattern
- Multi-tenancy (useDb, scaling, write amplification)
- Sharding proof-of-concept
- Production operations (backup/restore, schema migration, deadlock retry)

Add production retryWithBackoff utility for deadlock recovery during
concurrent index creation on FerretDB/DocumentDB backends.

Add UserController.spec.js tests for deleteUserController (runs in CI).

Configure jest and eslint to isolate FerretDB tests from CI pipelines:
- packages/data-schemas/jest.config.mjs: ignore misc/ directory
- eslint.config.mjs: ignore packages/data-schemas/misc/

Include Docker Compose config for local FerretDB v2.7 + postgres-documentdb,
dedicated jest/tsconfig for the test files, and multi-tenancy findings doc.

* style: brace formatting in aclEntry.ts modifyPermissionBits

* refactor: reorganize retry utilities and update imports

- Moved retryWithBackoff utility to a new file `retry.ts` for better structure.
- Updated imports in `orgOperations.ferretdb.spec.ts` to reflect the new location of retry utilities.
- Removed old import statement for retryWithBackoff from index.ts to streamline exports.

* test: add $pullAll coverage for ConversationTag and PermissionService

Add integration tests for deleteConversationTag verifying $pullAll
removes tags from conversations correctly, and for
syncUserEntraGroupMemberships verifying $pullAll removes user from
non-matching Entra groups while preserving local group membership.

---------
2026-03-21 14:28:49 -04:00