mirror of https://github.com/danny-avila/LibreChat.git synced 2026-07-01 20:01:35 +00:00

History

Danny Avila 376370d610 ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint (#13953 ) * ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint The /api/endpoints/context-projection endpoint re-fetched a conversation's messages from Mongo and re-tokenized them to project the context gauge for snapshot-less branches. The browser already holds those messages and their per-message tokenCounts, so this duplicated work on the request path (an unbounded read + server-side BPE tokenization until it was later capped). Move the snapshot-less estimate fully client-side, from the in-memory index: - sumBranch accumulates an uncalibrated char/4 estimate (estTokens) for count-less messages (imports / pre-feature) under the same summary cutoff - useTokenUsage folds estTokens (calibrated via the existing calibrationFamily ratio) into the existing fallback; known per-message counts render unchanged - delete the endpoint, controller, rate limiter, route, the getMessageTextStats data-schemas method, and the data-provider surface (endpoint/key/type/service/query) No DB read, no server tokenization, no rate-limit knobs; the gauge recomputes reactively from the index. Net -793 lines. * 🩹 fix: Count quotes and object-form content in client context estimate Address Codex review on the client-side context estimate: - messageChars now reads object-form content text (part.text.value), not only string text/think, so imported / pre-feature messages whose body lives in content parts are no longer estimated as zero. - Count-less user messages include their merged quote excerpts in the estimate, mirroring what the send path prepends into the prompt. * 🩹 fix: Cap over-window estimate and surface estimated tokens in breakdown Address remaining Codex review on the client-side context estimate: - Clamp the snapshot-less estimate's displayed usedTokens to maxTokens. The send path prunes an over-window branch before calling the model, so the gauge never actually exceeds the window; this avoids impossible values (e.g. 50k / 8k) without re-introducing client-side pruning. - Surface the calibrated count-less estimate as its own "Estimated" row in the breakdown popover, so a branch of only count-less imported / pre-feature messages is no longer shown as Input 0 / Output 0 under a non-zero header. * 🩹 fix: Refine client context estimate per Codex re-review - Drop calibration from the snapshot-less estimate. The removed projection never actually calibrated (the client never sent a ratio), and a ratio inflated by provider-injected context over-estimates visible imported text. - Exclude reasoning (think) / error parts from the estimate; the send path strips them, so they are not part of the next call's context. - Fold quote text into the estimate even when a tokenCount is present, since the edit route recounts tokenCount from text only and drops the merged quote. * 🩹 fix: Recount quoted user turns instead of topping up the stored count The previous round added quote chars on top of a quoted message's stored tokenCount, which double-counts the common (unedited) case where the count already includes the merged quote prompt. Match the removed projection instead: for quoted user turns, ignore the stored count and estimate the full merged text. This both avoids the double-count and still corrects the stale text-only count an edit leaves behind. * 🩹 fix: Trust stored counts for quoted turns; count tool-call parts - Quoted user turns: revert to trusting a present tokenCount. The send path's stored count already includes the merged quote (and any calibration), and the client's char/4 path is coarser, so recounting regressed normal turns. Only count-less messages estimate quotes from text. - Count tool-call name/args/output for count-less assistant messages; the formatter sends them back as context, so omitting them under-reported imported branches with tool history. * 🩹 fix: Exclude in-flight tail from estimate to avoid resume double-count On resume the live path seeds liveTokens from the partial response and also writes that content into the messages cache, where the count-less response is estimated into estTokens too — double-counting the in-flight output on the snapshot-less estimate path. sumBranch now exposes the tail message's own estimate (tailEstTokens); the estimate path drops it while a stream is live, so the in-flight response is counted once (via liveTokens). The breakdown's Estimated row uses the same in-flight-adjusted value. * 🩹 fix: Recount quoted user turns in context estimate (match send path) A quoted user turn's stored tokenCount is unreliable for the gauge: a text-only Save edit recomputes it from text alone, and the send path (needsCanonicalTokenCount in agents/client.js) recounts the quote-merged prompt every turn regardless of the stored value. Mirror that on the client — estimate quoted turns from the merged text+quotes and ignore the stored count — so snapshot-less branches don't under-report by the quote block. Reverts the earlier "trust the count" assumption, which the server disproves. * 🧹 chore: Route useResumableSSE diagnostics through the frontend logger Convert the [ResumableSSE]/[Debug] console.log and console.error diagnostics to the gated frontend `logger` (client/src/utils/logger), splitting the tag from the message so object arguments are passed through as real args (logged expandably, not stringified) and the logs stay tag-filterable and off the production console unless explicitly enabled. All log statements preserved; nothing removed. * 🩹 fix: Prefer content over text when estimating count-less messages A stopped agent response is saved with both a `text` field and a structured `content` array, and the send path formats from content. messageChars early-returned on `text`, dropping the content array (and the tool-call tokens it carries) from the snapshot-less estimate — also making the tool_call handling dead for such messages. Prefer content when present, fall back to text.		2026-06-25 15:29:31 -04:00
..
misc/ferretdb	📦 chore: Update TypeScript Config for TS v7 (#12794 )	2026-04-23 12:51:03 -04:00
src	♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint (#13953 )	2026-06-25 15:29:31 -04:00
.gitignore
babel.config.cjs
jest.config.mjs	🐘 feat: FerretDB Compatibility (#11769 )	2026-03-21 14:28:49 -04:00
LICENSE
package.json	✨ v0.8.7 (#13907 )	2026-06-24 14:49:32 -04:00
README.md	🔐 feat: Granular Role-based Permissions + Entra ID Group Discovery (#7804 )	2025-08-13 16:24:17 -04:00
tsconfig.build.json	📦 refactor: Consolidate DB models, encapsulating Mongoose usage in `data-schemas` (#11830 )	2026-03-21 14:28:53 -04:00
tsconfig.json	🔧 chore: Enforce isolatedDeclarations in data-schemas tsconfig (#13593 )	2026-06-08 09:55:20 -04:00
tsconfig.spec.json	📦 chore: Update TypeScript Config for TS v7 (#12794 )	2026-04-23 12:51:03 -04:00
tsdown.config.mjs	🪟 fix: Cross-Platform Absolute-Path Check in tsdown neverBundle Predicates (#13700 )	2026-06-13 11:04:46 -04:00

README.md

LibreChat Data Schemas Package

This package provides the database schemas, models, types, and methods for LibreChat using Mongoose ODM.

📁 Package Structure

packages/data-schemas/
├── src/
│   ├── schema/         # Mongoose schema definitions
│   ├── models/         # Model factory functions
│   ├── types/          # TypeScript type definitions
│   ├── methods/        # Database operation methods
│   ├── common/         # Shared constants and enums
│   ├── config/         # Configuration files (winston, etc.)
│   └── index.ts        # Main package exports

🏗️ Architecture Patterns

1. Schema Files (`src/schema/`)

Schema files define the Mongoose schema structure. They follow these conventions:

Naming: Use lowercase filenames (e.g., user.ts, accessRole.ts)
Imports: Import types from ~/types for TypeScript support
Exports: Export only the schema as default

Example:

import { Schema } from 'mongoose';
import type { IUser } from '~/types';

const userSchema = new Schema<IUser>(
  {
    name: { type: String },
    email: { type: String, required: true },
    // ... other fields
  },
  { timestamps: true }
);

export default userSchema;

2. Type Definitions (`src/types/`)

Type files define TypeScript interfaces and types. They follow these conventions:

Base Type: Define a plain type without Mongoose Document properties
Document Interface: Extend the base type with Document and _id
Enums/Constants: Place related enums in the type file or common/ if shared

Example:

import type { Document, Types } from 'mongoose';

export type User = {
  name?: string;
  email: string;
  // ... other fields
};

export type IUser = User &
  Document & {
    _id: Types.ObjectId;
  };

3. Model Factory Functions (`src/models/`)

Model files create Mongoose models using factory functions. They follow these conventions:

Function Name: create[EntityName]Model
Singleton Pattern: Check if model exists before creating
Type Safety: Use the corresponding interface from types

Example:

import userSchema from '~/schema/user';
import type * as t from '~/types';

export function createUserModel(mongoose: typeof import('mongoose')) {
  return mongoose.models.User || mongoose.model<t.IUser>('User', userSchema);
}

4. Database Methods (`src/methods/`)

Method files contain database operations for each entity. They follow these conventions:

Function Name: create[EntityName]Methods
Return Type: Export a type for the methods object
Operations: Include CRUD operations and entity-specific queries

Example:

import type { Model } from 'mongoose';
import type { IUser } from '~/types';

export function createUserMethods(mongoose: typeof import('mongoose')) {
  async function findUserById(userId: string): Promise<IUser | null> {
    const User = mongoose.models.User as Model<IUser>;
    return await User.findById(userId).lean();
  }

  async function createUser(userData: Partial<IUser>): Promise<IUser> {
    const User = mongoose.models.User as Model<IUser>;
    return await User.create(userData);
  }

  return {
    findUserById,
    createUser,
    // ... other methods
  };
}

export type UserMethods = ReturnType<typeof createUserMethods>;

5. Main Exports (`src/index.ts`)

The main index file exports:

createModels() - Factory function for all models
createMethods() - Factory function for all methods
Type exports from ~/types
Shared utilities and constants

🚀 Adding a New Entity

To add a new entity to the data-schemas package, follow these steps:

Step 1: Create the Type Definition

Create src/types/[entityName].ts:

import type { Document, Types } from 'mongoose';

export type EntityName = {
  /** Field description */
  fieldName: string;
  // ... other fields
};

export type IEntityName = EntityName &
  Document & {
    _id: Types.ObjectId;
  };

Step 2: Update Types Index

Add to src/types/index.ts:

export * from './entityName';

Step 3: Create the Schema

Create src/schema/[entityName].ts:

import { Schema } from 'mongoose';
import type { IEntityName } from '~/types';

const entityNameSchema = new Schema<IEntityName>(
  {
    fieldName: { type: String, required: true },
    // ... other fields
  },
  { timestamps: true }
);

export default entityNameSchema;

Step 4: Create the Model Factory

Create src/models/[entityName].ts:

import entityNameSchema from '~/schema/entityName';
import type * as t from '~/types';

export function createEntityNameModel(mongoose: typeof import('mongoose')) {
  return (
    mongoose.models.EntityName || 
    mongoose.model<t.IEntityName>('EntityName', entityNameSchema)
  );
}

Step 5: Update Models Index

Add to src/models/index.ts:

Import the factory function:

import { createEntityNameModel } from './entityName';

Add to the return object in createModels():

EntityName: createEntityNameModel(mongoose),

Step 6: Create Database Methods

Create src/methods/[entityName].ts:

import type { Model, Types } from 'mongoose';
import type { IEntityName } from '~/types';

export function createEntityNameMethods(mongoose: typeof import('mongoose')) {
  async function findEntityById(id: string | Types.ObjectId): Promise<IEntityName | null> {
    const EntityName = mongoose.models.EntityName as Model<IEntityName>;
    return await EntityName.findById(id).lean();
  }

  // ... other methods

  return {
    findEntityById,
    // ... other methods
  };
}

export type EntityNameMethods = ReturnType<typeof createEntityNameMethods>;

Step 7: Update Methods Index

Add to src/methods/index.ts:

Import the methods:

import { createEntityNameMethods, type EntityNameMethods } from './entityName';

Add to the return object in createMethods():

...createEntityNameMethods(mongoose),

Add to the AllMethods type:

export type AllMethods = UserMethods &
  // ... other methods
  EntityNameMethods;

📝 Best Practices

Consistent Naming: Use lowercase for filenames, PascalCase for types/interfaces
Type Safety: Always use TypeScript types, avoid any
JSDoc Comments: Document complex fields and methods
Indexes: Define database indexes in schema files for query performance
Validation: Use Mongoose schema validation for data integrity
Lean Queries: Use .lean() for read operations when you don't need Mongoose document methods

🔧 Common Patterns

Enums and Constants

Place shared enums in src/common/:

// src/common/permissions.ts
export enum PermissionBits {
  VIEW = 1,
  EDIT = 2,
  DELETE = 4,
  SHARE = 8,
}

Compound Indexes

For complex queries, add compound indexes:

schema.index({ field1: 1, field2: 1 });
schema.index(
  { uniqueField: 1 },
  { 
    unique: true, 
    partialFilterExpression: { uniqueField: { $exists: true } }
  }
);

Virtual Properties

Add computed properties using virtuals:

schema.virtual('fullName').get(function() {
  return `${this.firstName} ${this.lastName}`;
});

🧪 Testing

When adding new entities, ensure:

Types compile without errors
Models can be created successfully
Methods handle edge cases (null checks, validation)
Indexes are properly defined for query patterns

README.md

LibreChat Data Schemas Package

📁 Package Structure

🏗️ Architecture Patterns

1. Schema Files (src/schema/)

2. Type Definitions (src/types/)

3. Model Factory Functions (src/models/)

4. Database Methods (src/methods/)

5. Main Exports (src/index.ts)

🚀 Adding a New Entity

Step 1: Create the Type Definition

Step 2: Update Types Index

Step 3: Create the Schema

Step 4: Create the Model Factory

Step 5: Update Models Index

Step 6: Create Database Methods

Step 7: Update Methods Index

📝 Best Practices

🔧 Common Patterns

Enums and Constants

Compound Indexes

Virtual Properties

🧪 Testing

📚 Resources

1. Schema Files (`src/schema/`)

2. Type Definitions (`src/types/`)

3. Model Factory Functions (`src/models/`)

4. Database Methods (`src/methods/`)

5. Main Exports (`src/index.ts`)