LibreChat/packages/data-schemas
Danny Avila 376370d610
♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint (#13953)
* ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint

The /api/endpoints/context-projection endpoint re-fetched a conversation's
messages from Mongo and re-tokenized them to project the context gauge for
snapshot-less branches. The browser already holds those messages and their
per-message tokenCounts, so this duplicated work on the request path (an
unbounded read + server-side BPE tokenization until it was later capped).

Move the snapshot-less estimate fully client-side, from the in-memory index:

- sumBranch accumulates an uncalibrated char/4 estimate (estTokens) for
  count-less messages (imports / pre-feature) under the same summary cutoff
- useTokenUsage folds estTokens (calibrated via the existing calibrationFamily
  ratio) into the existing fallback; known per-message counts render unchanged
- delete the endpoint, controller, rate limiter, route, the getMessageTextStats
  data-schemas method, and the data-provider surface (endpoint/key/type/service/query)

No DB read, no server tokenization, no rate-limit knobs; the gauge recomputes
reactively from the index. Net -793 lines.

* 🩹 fix: Count quotes and object-form content in client context estimate

Address Codex review on the client-side context estimate:

- messageChars now reads object-form content text (part.text.value), not
  only string text/think, so imported / pre-feature messages whose body
  lives in content parts are no longer estimated as zero.
- Count-less user messages include their merged quote excerpts in the
  estimate, mirroring what the send path prepends into the prompt.

* 🩹 fix: Cap over-window estimate and surface estimated tokens in breakdown

Address remaining Codex review on the client-side context estimate:

- Clamp the snapshot-less estimate's displayed usedTokens to maxTokens. The
  send path prunes an over-window branch before calling the model, so the
  gauge never actually exceeds the window; this avoids impossible values
  (e.g. 50k / 8k) without re-introducing client-side pruning.
- Surface the calibrated count-less estimate as its own "Estimated" row in
  the breakdown popover, so a branch of only count-less imported / pre-feature
  messages is no longer shown as Input 0 / Output 0 under a non-zero header.

* 🩹 fix: Refine client context estimate per Codex re-review

- Drop calibration from the snapshot-less estimate. The removed projection
  never actually calibrated (the client never sent a ratio), and a ratio
  inflated by provider-injected context over-estimates visible imported text.
- Exclude reasoning (think) / error parts from the estimate; the send path
  strips them, so they are not part of the next call's context.
- Fold quote text into the estimate even when a tokenCount is present, since
  the edit route recounts tokenCount from text only and drops the merged quote.

* 🩹 fix: Recount quoted user turns instead of topping up the stored count

The previous round added quote chars on top of a quoted message's stored
tokenCount, which double-counts the common (unedited) case where the count
already includes the merged quote prompt. Match the removed projection
instead: for quoted user turns, ignore the stored count and estimate the
full merged text. This both avoids the double-count and still corrects the
stale text-only count an edit leaves behind.

* 🩹 fix: Trust stored counts for quoted turns; count tool-call parts

- Quoted user turns: revert to trusting a present tokenCount. The send path's
  stored count already includes the merged quote (and any calibration), and
  the client's char/4 path is coarser, so recounting regressed normal turns.
  Only count-less messages estimate quotes from text.
- Count tool-call name/args/output for count-less assistant messages; the
  formatter sends them back as context, so omitting them under-reported
  imported branches with tool history.

* 🩹 fix: Exclude in-flight tail from estimate to avoid resume double-count

On resume the live path seeds liveTokens from the partial response and also
writes that content into the messages cache, where the count-less response
is estimated into estTokens too — double-counting the in-flight output on the
snapshot-less estimate path. sumBranch now exposes the tail message's own
estimate (tailEstTokens); the estimate path drops it while a stream is live,
so the in-flight response is counted once (via liveTokens). The breakdown's
Estimated row uses the same in-flight-adjusted value.

* 🩹 fix: Recount quoted user turns in context estimate (match send path)

A quoted user turn's stored tokenCount is unreliable for the gauge: a
text-only Save edit recomputes it from text alone, and the send path
(needsCanonicalTokenCount in agents/client.js) recounts the quote-merged
prompt every turn regardless of the stored value. Mirror that on the client
— estimate quoted turns from the merged text+quotes and ignore the stored
count — so snapshot-less branches don't under-report by the quote block.
Reverts the earlier "trust the count" assumption, which the server disproves.

* 🧹 chore: Route useResumableSSE diagnostics through the frontend logger

Convert the [ResumableSSE]/[Debug] console.log and console.error diagnostics
to the gated frontend `logger` (client/src/utils/logger), splitting the tag
from the message so object arguments are passed through as real args (logged
expandably, not stringified) and the logs stay tag-filterable and off the
production console unless explicitly enabled. All log statements preserved;
nothing removed.

* 🩹 fix: Prefer content over text when estimating count-less messages

A stopped agent response is saved with both a `text` field and a structured
`content` array, and the send path formats from content. messageChars
early-returned on `text`, dropping the content array (and the tool-call tokens
it carries) from the snapshot-less estimate — also making the tool_call
handling dead for such messages. Prefer content when present, fall back to text.
2026-06-25 15:29:31 -04:00
..
misc/ferretdb 📦 chore: Update TypeScript Config for TS v7 (#12794) 2026-04-23 12:51:03 -04:00
src ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint (#13953) 2026-06-25 15:29:31 -04:00
.gitignore
babel.config.cjs
jest.config.mjs 🐘 feat: FerretDB Compatibility (#11769) 2026-03-21 14:28:49 -04:00
LICENSE
package.json v0.8.7 (#13907) 2026-06-24 14:49:32 -04:00
README.md 🔐 feat: Granular Role-based Permissions + Entra ID Group Discovery (#7804) 2025-08-13 16:24:17 -04:00
tsconfig.build.json 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
tsconfig.json 🔧 chore: Enforce isolatedDeclarations in data-schemas tsconfig (#13593) 2026-06-08 09:55:20 -04:00
tsconfig.spec.json 📦 chore: Update TypeScript Config for TS v7 (#12794) 2026-04-23 12:51:03 -04:00
tsdown.config.mjs 🪟 fix: Cross-Platform Absolute-Path Check in tsdown neverBundle Predicates (#13700) 2026-06-13 11:04:46 -04:00

LibreChat Data Schemas Package

This package provides the database schemas, models, types, and methods for LibreChat using Mongoose ODM.

📁 Package Structure

packages/data-schemas/
├── src/
│   ├── schema/         # Mongoose schema definitions
│   ├── models/         # Model factory functions
│   ├── types/          # TypeScript type definitions
│   ├── methods/        # Database operation methods
│   ├── common/         # Shared constants and enums
│   ├── config/         # Configuration files (winston, etc.)
│   └── index.ts        # Main package exports

🏗️ Architecture Patterns

1. Schema Files (src/schema/)

Schema files define the Mongoose schema structure. They follow these conventions:

  • Naming: Use lowercase filenames (e.g., user.ts, accessRole.ts)
  • Imports: Import types from ~/types for TypeScript support
  • Exports: Export only the schema as default

Example:

import { Schema } from 'mongoose';
import type { IUser } from '~/types';

const userSchema = new Schema<IUser>(
  {
    name: { type: String },
    email: { type: String, required: true },
    // ... other fields
  },
  { timestamps: true }
);

export default userSchema;

2. Type Definitions (src/types/)

Type files define TypeScript interfaces and types. They follow these conventions:

  • Base Type: Define a plain type without Mongoose Document properties
  • Document Interface: Extend the base type with Document and _id
  • Enums/Constants: Place related enums in the type file or common/ if shared

Example:

import type { Document, Types } from 'mongoose';

export type User = {
  name?: string;
  email: string;
  // ... other fields
};

export type IUser = User &
  Document & {
    _id: Types.ObjectId;
  };

3. Model Factory Functions (src/models/)

Model files create Mongoose models using factory functions. They follow these conventions:

  • Function Name: create[EntityName]Model
  • Singleton Pattern: Check if model exists before creating
  • Type Safety: Use the corresponding interface from types

Example:

import userSchema from '~/schema/user';
import type * as t from '~/types';

export function createUserModel(mongoose: typeof import('mongoose')) {
  return mongoose.models.User || mongoose.model<t.IUser>('User', userSchema);
}

4. Database Methods (src/methods/)

Method files contain database operations for each entity. They follow these conventions:

  • Function Name: create[EntityName]Methods
  • Return Type: Export a type for the methods object
  • Operations: Include CRUD operations and entity-specific queries

Example:

import type { Model } from 'mongoose';
import type { IUser } from '~/types';

export function createUserMethods(mongoose: typeof import('mongoose')) {
  async function findUserById(userId: string): Promise<IUser | null> {
    const User = mongoose.models.User as Model<IUser>;
    return await User.findById(userId).lean();
  }

  async function createUser(userData: Partial<IUser>): Promise<IUser> {
    const User = mongoose.models.User as Model<IUser>;
    return await User.create(userData);
  }

  return {
    findUserById,
    createUser,
    // ... other methods
  };
}

export type UserMethods = ReturnType<typeof createUserMethods>;

5. Main Exports (src/index.ts)

The main index file exports:

  • createModels() - Factory function for all models
  • createMethods() - Factory function for all methods
  • Type exports from ~/types
  • Shared utilities and constants

🚀 Adding a New Entity

To add a new entity to the data-schemas package, follow these steps:

Step 1: Create the Type Definition

Create src/types/[entityName].ts:

import type { Document, Types } from 'mongoose';

export type EntityName = {
  /** Field description */
  fieldName: string;
  // ... other fields
};

export type IEntityName = EntityName &
  Document & {
    _id: Types.ObjectId;
  };

Step 2: Update Types Index

Add to src/types/index.ts:

export * from './entityName';

Step 3: Create the Schema

Create src/schema/[entityName].ts:

import { Schema } from 'mongoose';
import type { IEntityName } from '~/types';

const entityNameSchema = new Schema<IEntityName>(
  {
    fieldName: { type: String, required: true },
    // ... other fields
  },
  { timestamps: true }
);

export default entityNameSchema;

Step 4: Create the Model Factory

Create src/models/[entityName].ts:

import entityNameSchema from '~/schema/entityName';
import type * as t from '~/types';

export function createEntityNameModel(mongoose: typeof import('mongoose')) {
  return (
    mongoose.models.EntityName || 
    mongoose.model<t.IEntityName>('EntityName', entityNameSchema)
  );
}

Step 5: Update Models Index

Add to src/models/index.ts:

  1. Import the factory function:
import { createEntityNameModel } from './entityName';
  1. Add to the return object in createModels():
EntityName: createEntityNameModel(mongoose),

Step 6: Create Database Methods

Create src/methods/[entityName].ts:

import type { Model, Types } from 'mongoose';
import type { IEntityName } from '~/types';

export function createEntityNameMethods(mongoose: typeof import('mongoose')) {
  async function findEntityById(id: string | Types.ObjectId): Promise<IEntityName | null> {
    const EntityName = mongoose.models.EntityName as Model<IEntityName>;
    return await EntityName.findById(id).lean();
  }

  // ... other methods

  return {
    findEntityById,
    // ... other methods
  };
}

export type EntityNameMethods = ReturnType<typeof createEntityNameMethods>;

Step 7: Update Methods Index

Add to src/methods/index.ts:

  1. Import the methods:
import { createEntityNameMethods, type EntityNameMethods } from './entityName';
  1. Add to the return object in createMethods():
...createEntityNameMethods(mongoose),
  1. Add to the AllMethods type:
export type AllMethods = UserMethods &
  // ... other methods
  EntityNameMethods;

📝 Best Practices

  1. Consistent Naming: Use lowercase for filenames, PascalCase for types/interfaces
  2. Type Safety: Always use TypeScript types, avoid any
  3. JSDoc Comments: Document complex fields and methods
  4. Indexes: Define database indexes in schema files for query performance
  5. Validation: Use Mongoose schema validation for data integrity
  6. Lean Queries: Use .lean() for read operations when you don't need Mongoose document methods

🔧 Common Patterns

Enums and Constants

Place shared enums in src/common/:

// src/common/permissions.ts
export enum PermissionBits {
  VIEW = 1,
  EDIT = 2,
  DELETE = 4,
  SHARE = 8,
}

Compound Indexes

For complex queries, add compound indexes:

schema.index({ field1: 1, field2: 1 });
schema.index(
  { uniqueField: 1 },
  { 
    unique: true, 
    partialFilterExpression: { uniqueField: { $exists: true } }
  }
);

Virtual Properties

Add computed properties using virtuals:

schema.virtual('fullName').get(function() {
  return `${this.firstName} ${this.lastName}`;
});

🧪 Testing

When adding new entities, ensure:

  • Types compile without errors
  • Models can be created successfully
  • Methods handle edge cases (null checks, validation)
  • Indexes are properly defined for query patterns

📚 Resources