LibreChat/api/server/routes
Danny Avila 376370d610
♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint (#13953)
* ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint

The /api/endpoints/context-projection endpoint re-fetched a conversation's
messages from Mongo and re-tokenized them to project the context gauge for
snapshot-less branches. The browser already holds those messages and their
per-message tokenCounts, so this duplicated work on the request path (an
unbounded read + server-side BPE tokenization until it was later capped).

Move the snapshot-less estimate fully client-side, from the in-memory index:

- sumBranch accumulates an uncalibrated char/4 estimate (estTokens) for
  count-less messages (imports / pre-feature) under the same summary cutoff
- useTokenUsage folds estTokens (calibrated via the existing calibrationFamily
  ratio) into the existing fallback; known per-message counts render unchanged
- delete the endpoint, controller, rate limiter, route, the getMessageTextStats
  data-schemas method, and the data-provider surface (endpoint/key/type/service/query)

No DB read, no server tokenization, no rate-limit knobs; the gauge recomputes
reactively from the index. Net -793 lines.
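
A minimal sketch of the accumulation described above, assuming a simplified message shape. The `BranchMessage` and `BranchTotals` types, the `knownTokens` field, and the round-up in `estimateTokens` are illustrative assumptions, not the actual implementation; the summary cutoff and the calibration step are left out.

```typescript
// Hypothetical message shape; only the fields needed for the estimate are modeled.
interface BranchMessage {
  text: string;
  tokenCount?: number; // absent on imported / pre-feature messages
}

interface BranchTotals {
  knownTokens: number; // sum of stored per-message counts
  estTokens: number; // uncalibrated char/4 estimate for count-less messages
}

// Assumption: round up so a short non-empty message never estimates to zero.
const estimateTokens = (chars: number): number => Math.ceil(chars / 4);

function sumBranch(messages: BranchMessage[]): BranchTotals {
  const totals: BranchTotals = { knownTokens: 0, estTokens: 0 };
  for (const message of messages) {
    if (typeof message.tokenCount === 'number') {
      // Known counts pass through unchanged.
      totals.knownTokens += message.tokenCount;
    } else {
      // Count-less messages fall back to the char/4 heuristic.
      totals.estTokens += estimateTokens(message.text.length);
    }
  }
  return totals;
}

const totals = sumBranch([
  { text: 'Hello there', tokenCount: 3 },
  { text: 'abcdefgh' }, // 8 chars, no stored count
]);
```

Keeping the two sums separate lets the caller render stored counts as they are and treat only `estTokens` as approximate.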

* 🩹 fix: Count quotes and object-form content in client context estimate

Address Codex review on the client-side context estimate:

- messageChars now reads object-form content text (part.text.value), not
  only string text/think, so imported / pre-feature messages whose body
  lives in content parts are no longer estimated as zero.
- Count-less user messages include their merged quote excerpts in the
  estimate, mirroring what the send path prepends into the prompt.

* 🩹 fix: Cap over-window estimate and surface estimated tokens in breakdown

Address remaining Codex review on the client-side context estimate:

- Clamp the snapshot-less estimate's displayed usedTokens to maxTokens. The
  send path prunes an over-window branch before calling the model, so the
  gauge never actually exceeds the window; this avoids impossible values
  (e.g. 50k / 8k) without re-introducing client-side pruning.
- Surface the calibrated count-less estimate as its own "Estimated" row in
  the breakdown popover, so a branch of only count-less imported / pre-feature
  messages is no longer shown as Input 0 / Output 0 under a non-zero header.
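
The clamp in the first bullet can be sketched as follows. `clampDisplayedUsage` is a hypothetical helper name, and the guard for a missing window is an assumption; only the `usedTokens` / `maxTokens` relationship comes from the commit text.

```typescript
// Hypothetical helper: caps the displayed estimate at the context window.
function clampDisplayedUsage(estimatedTokens: number, maxTokens: number): number {
  // Assumption: with no usable window there is nothing to clamp against.
  if (!Number.isFinite(maxTokens) || maxTokens <= 0) {
    return estimatedTokens;
  }
  return Math.min(estimatedTokens, maxTokens);
}

// The "50k / 8k" case from the commit message is displayed as 8k / 8k.
const usedTokens = clampDisplayedUsage(50_000, 8_000);
```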

* 🩹 fix: Refine client context estimate per Codex re-review

- Drop calibration from the snapshot-less estimate. The removed projection
  never actually calibrated (the client never sent a ratio), and a ratio
  inflated by provider-injected context over-estimates visible imported text.
- Exclude reasoning (think) / error parts from the estimate; the send path
  strips them, so they are not part of the next call's context.
- Fold quote text into the estimate even when a tokenCount is present, since
  the edit route recounts tokenCount from text only and drops the merged quote.

* 🩹 fix: Recount quoted user turns instead of topping up the stored count

The previous round added quote chars on top of a quoted message's stored
tokenCount, which double-counts the common (unedited) case where the count
already includes the merged quote prompt. Match the removed projection
instead: for quoted user turns, ignore the stored count and estimate the
full merged text. This both avoids the double-count and still corrects the
stale text-only count an edit leaves behind.

* 🩹 fix: Trust stored counts for quoted turns; count tool-call parts

- Quoted user turns: revert to trusting a present tokenCount. The send path's
  stored count already includes the merged quote (and any calibration), and
  the client's char/4 path is coarser, so recounting regressed normal turns.
  Only count-less messages estimate quotes from text.
- Count tool-call name/args/output for count-less assistant messages; the
  formatter sends them back as context, so omitting them under-reported
  imported branches with tool history.

* 🩹 fix: Exclude in-flight tail from estimate to avoid resume double-count

On resume the live path seeds liveTokens from the partial response and also
writes that content into the messages cache. There the count-less response
is estimated into estTokens as well, so the in-flight output was counted twice
on the snapshot-less estimate path. sumBranch now exposes the tail message's
own estimate (tailEstTokens); the estimate path drops it while a stream is
live, so the in-flight response is counted once (via liveTokens). The
breakdown's Estimated row uses the same in-flight-adjusted value.
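
A rough sketch of the adjustment, using the `estTokens`, `tailEstTokens`, and `liveTokens` names from the text. The `BranchEstimate` type, the `snapshotlessEstimate` function, and the choice to add `liveTokens` only while streaming are assumptions made for illustration.

```typescript
interface BranchEstimate {
  estTokens: number; // estimate over all count-less messages, tail included
  tailEstTokens: number; // the tail message's own share of estTokens
}

// Hypothetical combination step: while a stream is live, the tail's estimate
// is dropped so the in-flight response is counted only through liveTokens.
function snapshotlessEstimate(
  branch: BranchEstimate,
  liveTokens: number,
  isStreaming: boolean,
): number {
  if (!isStreaming) {
    return branch.estTokens;
  }
  const settled = Math.max(0, branch.estTokens - branch.tailEstTokens);
  return settled + liveTokens;
}

const branch: BranchEstimate = { estTokens: 120, tailEstTokens: 20 };
const whileStreaming = snapshotlessEstimate(branch, 25, true); // 100 settled + 25 live
const afterStream = snapshotlessEstimate(branch, 0, false); // full 120
```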

* 🩹 fix: Recount quoted user turns in context estimate (match send path)

A quoted user turn's stored tokenCount is unreliable for the gauge: a
text-only Save edit recomputes it from text alone, and the send path
(needsCanonicalTokenCount in agents/client.js) recounts the quote-merged
prompt every turn regardless of the stored value. Mirror that on the client
— estimate quoted turns from the merged text+quotes and ignore the stored
count — so snapshot-less branches don't under-report by the quote block.
Reverts the earlier "trust the count" assumption, which the server disproves.
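
As an illustration of this rule, a quoted user turn could be estimated as below. The `UserTurn` shape, the `quotes` field name, and `userTurnTokens` are hypothetical, and the sketch ignores any separator text the real merged prompt may add around quotes.

```typescript
interface UserTurn {
  text: string;
  quotes?: string[]; // hypothetical field holding the quoted excerpts
  tokenCount?: number;
}

function userTurnTokens(turn: UserTurn): number {
  const quotes = turn.quotes ?? [];
  // Unquoted turns keep their stored count when one exists.
  if (quotes.length === 0 && typeof turn.tokenCount === 'number') {
    return turn.tokenCount;
  }
  // Quoted (or count-less) turns are estimated from text plus quotes,
  // ignoring any stored count.
  const mergedChars = quotes.reduce((sum, quote) => sum + quote.length, turn.text.length);
  return Math.ceil(mergedChars / 4);
}

// The stale text-only count of 1 is ignored: (4 + 8) chars / 4 = 3.
const quotedTurn = userTurnTokens({ text: 'abcd', quotes: ['efghijkl'], tokenCount: 1 });
```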

* 🧹 chore: Route useResumableSSE diagnostics through the frontend logger

Convert the [ResumableSSE]/[Debug] console.log and console.error diagnostics
to the gated frontend `logger` (client/src/utils/logger), splitting the tag
from the message so object arguments are passed through as real args (logged
expandably, not stringified) and the logs stay tag-filterable and off the
production console unless explicitly enabled. All log statements preserved;
nothing removed.

* 🩹 fix: Prefer content over text when estimating count-less messages

A stopped agent response is saved with both a `text` field and a structured
`content` array, and the send path formats from content. messageChars
early-returned on `text`, dropping the content array (and the tool-call tokens
it carries) from the snapshot-less estimate — also making the tool_call
handling dead for such messages. Prefer content when present, fall back to text.
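
A minimal sketch of the content-first character count, folding in the earlier rules from this commit series (object-form text, tool-call parts counted, think / error parts skipped). The `ContentPart` and `CountlessMessage` types are simplified assumptions and do not reproduce the project's real type definitions.

```typescript
type TextValue = string | { value: string };

// Simplified, hypothetical part shapes.
type ContentPart =
  | { type: 'text'; text: TextValue }
  | { type: 'think'; think: TextValue }
  | { type: 'error'; error: string }
  | { type: 'tool_call'; tool_call: { name?: string; args?: string; output?: string } };

interface CountlessMessage {
  text?: string;
  content?: ContentPart[];
}

const textLength = (text: TextValue): number =>
  typeof text === 'string' ? text.length : text.value.length;

function messageChars(message: CountlessMessage): number {
  const parts = message.content;
  // Fall back to text only when there is no structured content.
  if (!parts || parts.length === 0) {
    return message.text?.length ?? 0;
  }
  let chars = 0;
  for (const part of parts) {
    switch (part.type) {
      case 'text':
        chars += textLength(part.text);
        break;
      case 'tool_call': {
        const { name = '', args = '', output = '' } = part.tool_call;
        chars += name.length + args.length + output.length;
        break;
      }
      default:
        // think / error parts are not sent back as context, so they are skipped.
        break;
    }
  }
  return chars;
}

// A stopped agent response carrying both text and content: content wins.
const stopped = messageChars({
  text: 'partial',
  content: [
    { type: 'text', text: 'abcd' },
    { type: 'tool_call', tool_call: { name: 'search', args: '{"q":"x"}', output: 'ok' } },
    { type: 'think', think: 'hidden' },
  ],
});
```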
2026-06-25 15:29:31 -04:00
__test-utils__ 🔗 feat: Add Granular Access Control to Shared Links via ACL System (#13051) 2026-06-03 14:17:17 -04:00
__tests__ 🔄 feat: Continue Shared Conversations as Personal Copies (#13714) 2026-06-24 16:27:01 -04:00
admin 📒 feat: Audit Log Backend for SystemGrant Assign and Revoke Events (#13087) 2026-06-18 15:42:33 -04:00
agents 🪝 feat: HITL Tool Approval Scaffolding (Slice A) (#12938) 2026-06-24 16:47:16 -04:00
assistants 🧭 fix: Tighten Action OAuth Endpoint Validation (#13142) 2026-05-15 14:53:41 -04:00
files 🗜️ fix: Support Windows ZIP MIME Uploads (#13794) 2026-06-16 11:19:06 -04:00
types WIP: Update UI to match Official Style; Vision and Assistants 👷🏽 (#1190) 2023-11-16 10:42:24 -05:00
accessPermissions.js 🔗 feat: Add Granular Access Control to Shared Links via ACL System (#13051) 2026-06-03 14:17:17 -04:00
accessPermissions.sharePolicy.test.js 🔗 feat: Add Granular Access Control to Shared Links via ACL System (#13051) 2026-06-03 14:17:17 -04:00
accessPermissions.test.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
actions.js fix: Extend and Decouple MCP OAuth Flow Timeouts (#13622) 2026-06-09 17:50:02 -04:00
apiKeys.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
auth.2fa-ratelimit.test.js 🚦 fix: Guard Auth Continuation with Dedicated Limiter (#13555) 2026-06-06 14:21:28 -04:00
auth.cloudfront.test.js 🚦 fix: Guard Auth Continuation with Dedicated Limiter (#13555) 2026-06-06 14:21:28 -04:00
auth.js 🚦 fix: Guard Auth Continuation with Dedicated Limiter (#13555) 2026-06-06 14:21:28 -04:00
balance.js 🤫 chore: Quiet Repetitive Log Noise from Balance, CloudFront, and Capability Paths (#13461) 2026-06-01 20:40:16 -04:00
banner.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
categories.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
config.js 🔐 fix: Gate Shared Startup Config By Link Access (#13897) 2026-06-23 08:28:37 -04:00
convos.js 🔖 feat: Add Pinned Conversations (#13492) 2026-06-17 20:26:55 -04:00
endpoints.js ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint (#13953) 2026-06-25 15:29:31 -04:00
index.js 📒 feat: Audit Log Backend for SystemGrant Assign and Revoke Events (#13087) 2026-06-18 15:42:33 -04:00
keys.js 🔱 chore: Harden API Routes Against IDOR and DoS Attacks (#11760) 2026-02-12 18:08:24 -05:00
mcp.js 🔐 fix: Honor Admin-Panel MCP Allowlist Overrides Without Restart (#13814) 2026-06-17 20:14:53 -04:00
memories.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
messages.js 📋 refactor: Attach Message Context to Langfuse Feedback Scores (#13604) 2026-06-08 15:54:01 -04:00
models.js 🛠️ refactor: Model Loading and Custom Endpoint Error Handling (#1849) 2024-02-20 12:57:58 -05:00
oauth.js 🩻 refactor: Replace Opaque OAuth Errors with Structured Failure Diagnostics (#13471) 2026-06-02 15:06:42 -04:00
oauth.test.js feat: Immediate Conversation Title Generation (#13395) 2026-06-02 16:40:57 -04:00
presets.js 🧹 chore: Cleanup Logger and Utility Imports (#9935) 2025-10-01 23:30:47 -04:00
projects.js 🗂️ feat: Add Private Chat Projects (#13467) 2026-06-03 15:29:18 -04:00
prompts.js 📁 refactor: Prompts UI (#11570) 2026-03-22 16:56:22 -04:00
prompts.test.js 📁 refactor: Prompts UI (#11570) 2026-03-22 16:56:22 -04:00
roles.js 📜 feat: Skills UI + Initial E2E CRUD / Sharing (#12580) 2026-04-25 04:02:00 -04:00
rum.js 📈 fix: Isolate RUM Telemetry Proxy Auth from App Auth (#13765) 2026-06-15 12:49:44 -04:00
search.js 🧹 chore: Cleanup Logger and Utility Imports (#9935) 2025-10-01 23:30:47 -04:00
settings.js 🎚️ feat: Per-User Skill Active/Inactive Toggle with Ownership-Aware Defaults (#12692) 2026-04-25 04:02:00 -04:00
share.js 🔄 feat: Continue Shared Conversations as Personal Copies (#13714) 2026-06-24 16:27:01 -04:00
skills.js 🧬 feat: Add GitHub Skill Sync (#13293) 2026-06-10 21:05:54 -04:00
skills.tenant.test.js 🧵 fix: Preserve Upload Context Across Multipart Routes (#13072) 2026-05-11 15:46:48 -04:00
skills.test.js 🧬 feat: Add GitHub Skill Sync (#13293) 2026-06-10 21:05:54 -04:00
static.js 🧹 chore: Cleanup Logger and Utility Imports (#9935) 2025-10-01 23:30:47 -04:00
tags.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
user.js 📌 feat: Pin Agents and Models in the Sidebar (#10634) 2025-12-11 16:38:20 -05:00