LibreChat/api/utils
Danny Avila 9f8b6d92c0
🤖 feat: Add Claude Sonnet 5 Support (#14042)
* feat: Add Claude Sonnet 5 Support

Wire up the claude-sonnet-5 model across token, pricing, and model-list
config:

- Context window (1M) and max output (128K) in @librechat/api token maps
- Standard pricing ($3/$15 per MTok) and cache rates in data-schemas tx
- 128K output-token carve-out in anthropicSettings (the family-wide 64K
  rule capped Sonnet 5 below its real limit); Bedrock/Vertex thinking and
  1M-context detection already cover sonnet major >= 5 generically
- Add to shared Anthropic, Bedrock, and Vertex default model lists, plus
  the .env.example examples
- Tests for context/output/pricing/matching across the affected packages
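
The 128K carve-out described in the list above can be sketched roughly as follows. This is an illustration only, not the actual anthropicSettings code: the function names and the exact numeric values chosen for "64K" and "128K" are assumptions, and only the newer `claude-sonnet-N` naming is handled.

```typescript
// Hypothetical sketch; names and constants are assumptions, not LibreChat's real code.
const SONNET_LEGACY_MAX_OUTPUT = 64000; // assumed numeric value for "64K"
const SONNET_5_MAX_OUTPUT = 128000; // assumed numeric value for "128K"

/** Extracts the Sonnet major version from names like "claude-sonnet-5". */
function sonnetMajor(model: string): number | null {
  const match = /claude-sonnet-(\d+)/.exec(model);
  return match ? Number(match[1]) : null;
}

/** Default output cap for Sonnet models; null for names this sketch does not recognize. */
function sonnetMaxOutputTokens(model: string): number | null {
  const major = sonnetMajor(model);
  if (major === null) {
    return null;
  }
  // Sonnet 5 and later get the higher cap; earlier Sonnet majors keep the family-wide one.
  return major >= 5 ? SONNET_5_MAX_OUTPUT : SONNET_LEGACY_MAX_OUTPUT;
}
```

Because the regex is not anchored, prefixed IDs such as `global.anthropic.claude-sonnet-5` resolve the same way as the bare model name.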

* test: Align Sonnet 5+ maxOutputTokens defaults with 128K spec

getLLMConfig defaults flow from anthropicSettings.maxOutputTokens.reset(),
which now returns 128K for Sonnet 5+. Update the future-proofing assertions
in llm.spec.ts (Sonnet 5.x and 6-9.x) that still expected the old family-wide
64K cap. Haiku stays 64K; Opus stays 128K.

* 🎚️ fix: Gate Sonnet 5 capability behaviors (sampling, thinking)

Adding claude-sonnet-5 to the default list exposed it without the Anthropic
capability gates, all confirmed against the live API:

- omitsSamplingParameters: Sonnet 5 returns 400 on non-default temperature/
  top_p/top_k ('deprecated for this model'); now dropped so selecting the
  model with saved sampling settings no longer fails.
- requiresExplicitThinkingDisabled: omitting 'thinking' runs adaptive ON by
  default on Sonnet 5, so disabling thinking now sends { type: 'disabled' }
  (verified: 200, no thinking block) instead of omitting the field.
- omitsThinkingByDefault: thinking.display defaults to omitted on Sonnet 5,
  which yields empty thinking blocks; the display resolver now returns
  'summarized' for Sonnet 5+ so the Thoughts UI keeps working (verified:
  757-char summary returned).

Gates apply to both the direct Anthropic and Bedrock paths. Tests added in
bedrock.spec and llm.spec.
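
A minimal sketch of how the three gates above might combine when building a request. The types, the `buildRequest` helper, and the `{ type: 'adaptive', display: 'summarized' }` shape for the thinking-on case are assumptions made for illustration; only `{ type: 'disabled' }` and the 'summarized' display value come from the notes above. Thinking for older models is out of scope here.

```typescript
// Hypothetical sketch; not the actual LibreChat request builder.
interface ChatParams {
  model: string;
  thinkingEnabled: boolean;
  temperature?: number;
  top_p?: number;
  top_k?: number;
}

type ThinkingField = { type: 'disabled' } | { type: 'adaptive'; display: 'summarized' };

interface RequestBody {
  model: string;
  temperature?: number;
  top_p?: number;
  top_k?: number;
  thinking?: ThinkingField;
}

function isSonnet5Plus(model: string): boolean {
  const match = /claude-sonnet-(\d+)/.exec(model);
  return match !== null && Number(match[1]) >= 5;
}

function buildRequest(params: ChatParams): RequestBody {
  const body: RequestBody = { model: params.model };

  if (isSonnet5Plus(params.model)) {
    // Gate 1: saved sampling settings are dropped, since the API rejects them.
    // Gate 2: "off" must be sent explicitly; omitting the field leaves thinking on.
    // Gate 3: "on" asks for a summarized display so thinking blocks are not empty.
    body.thinking = params.thinkingEnabled
      ? { type: 'adaptive', display: 'summarized' } // assumed shape
      : { type: 'disabled' };
    return body;
  }

  // Older models: pass sampling parameters through unchanged.
  if (params.temperature !== undefined) body.temperature = params.temperature;
  if (params.top_p !== undefined) body.top_p = params.top_p;
  if (params.top_k !== undefined) body.top_k = params.top_k;
  return body;
}
```

For example, `buildRequest({ model: 'claude-sonnet-5', thinkingEnabled: false, temperature: 0.5 })` produces a body with no `temperature` key and an explicit disabled thinking field.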

* 🩹 fix: Sonnet 5 Bedrock availability + thinking-off persistence

Round-2 Codex review (all verified against the live API / Anthropic docs):

- Sonnet 5 is NOT available on the legacy Bedrock InvokeModel/Converse surface
  (Anthropic docs: 'use Claude in Amazon Bedrock or Claude Platform on AWS'),
  which is what LibreChat's ChatBedrockConverse uses. Removed it from the
  default Bedrock model lists (config + .env.example). Opus 4.8/4.7/Fable 5
  stay — those ARE reachable via InvokeModel. Sonnet 5 remains on the direct
  Anthropic API and Vertex, where it works.
- Reverted the Bedrock-side explicit-disabled thinking handling added last
  round: with Sonnet 5 off Bedrock, no Bedrock model needs { type: 'disabled' },
  so that path (and its round-trip concern) no longer applies.
- Direct Anthropic path: a persisted { type: 'disabled' } thinking object now
  normalizes to a boolean flag in getLLMConfig, so a user's Sonnet 5
  'thinking off' setting stays off across the model_parameters round trip
  instead of flipping back to adaptive (a truthy object skipped the disabled
  branch).
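
The truthy-object pitfall in the last item above can be illustrated with a small normalizer. The helper name `toThinkingFlag` and the fallback default are assumptions for this sketch; the real handling lives in getLLMConfig and may differ.

```typescript
// Hypothetical sketch of coercing a persisted thinking value to a boolean.
type PersistedThinking = boolean | { type: string } | null | undefined;

function toThinkingFlag(value: PersistedThinking, fallback: boolean = true): boolean {
  if (value === null || value === undefined) {
    return fallback; // nothing persisted: use the caller's default
  }
  if (typeof value === 'boolean') {
    return value;
  }
  // An object is always truthy, so inspect its type instead of using !!value.
  return value.type !== 'disabled';
}

const persisted = { type: 'disabled' };
const naive = !!persisted; // true: the bug, "off" silently reads as "on"
const normalized = toThinkingFlag(persisted); // false: "off" stays off
```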

* ↩️ fix: Restore Sonnet 5 on Bedrock (Converse) — verified live

Reverses the round-2 removal: Sonnet 5 IS available on AWS Bedrock. Tested
live via the Converse API:

- global.anthropic.claude-sonnet-5 returns a normal response
- bare anthropic.claude-sonnet-5 needs an inference profile — but that's
  identical to the already-shipping Opus 4.8 / Fable 5 / Sonnet 4.6 entries,
  which all fail bare on-demand the same way
- temperature=0.5 -> 400 'deprecated for this model'; thinking {type:disabled}
  suppresses reasoning — same as the direct API

The 'legacy' Bedrock docs page that claimed Sonnet 5 wasn't on the surface is
stale. Restored:
- anthropic.claude-sonnet-5 in bedrockModels + .env.example
- the Bedrock explicit-disabled thinking handling (requiresExplicitThinkingDisabled
  -> { type: 'disabled' })
- the Finding 4 round-trip fix in bedrockInputSchema (coerce a persisted
  disabled AMRF.thinking to thinking=false instead of !!thinking -> true), with
  an end-to-end schema->parser test proving 'thinking off' stays sticky.

Direct-path round-trip fix (getLLMConfig thinkingFlag) is unchanged.
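
To make the Bedrock round trip concrete, here is a rough sketch of "thinking off" surviving a save and reload. The parameter shape, both helper names, the assumed default of thinking on, and the `'adaptive'` type for the on case are hypothetical; the real bedrockInputSchema is more involved.

```typescript
// Hypothetical sketch; shapes and names are assumptions, not the real schema.
interface ThinkingRequestField {
  type: string;
}

interface PersistedBedrockParams {
  thinking?: boolean;
  additionalModelRequestFields?: { thinking?: ThinkingRequestField };
}

/** Reads the user's intent back out of persisted parameters. */
function resolveThinking(params: PersistedBedrockParams): boolean {
  if (typeof params.thinking === 'boolean') {
    return params.thinking;
  }
  const persisted = params.additionalModelRequestFields?.thinking;
  if (persisted === undefined) {
    return true; // assumed default: thinking on
  }
  // Coerce by type; !!persisted would turn a disabled object into true.
  return persisted.type !== 'disabled';
}

/** Builds the request fields to send (and persist) for a given intent. */
function toRequestFields(thinking: boolean): { thinking: ThinkingRequestField } {
  return { thinking: { type: thinking ? 'adaptive' : 'disabled' } };
}

// Round trip: save "off", reload, and rebuild the request fields.
const saved: PersistedBedrockParams = { additionalModelRequestFields: toRequestFields(false) };
const reloaded = toRequestFields(resolveThinking(saved));
```

In this sketch `reloaded.thinking.type` is still `'disabled'`, which is the sticky behavior the end-to-end test is meant to prove.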

* 💵 fix: Sonnet 5 intro pricing + sticky disabled thinking on Bedrock reload

Round-4 Codex review (both verified):

- Pricing: Anthropic lists Sonnet 5 at introductory $2/$10 per MTok (cache
  $2.50/$0.20) through 2026-08-31, reverting to $3/$15 ($3.75/$0.30) on
  Sep 1 (confirmed on platform.claude.com/pricing). The static tx multiplier
  table is used for real balance transactions, so the post-intro rates were
  overcharging ~50% during the launch window. Switched to the intro rates with
  a revert comment on both the token and cache entries.

- Bedrock disabled-thinking persistence: initializeBedrock feeds persisted
  model_parameters straight through bedrockInputParser (NOT bedrockInputSchema),
  where additionalModelRequestFields is a known key — so a prior
  thinking:{type:'disabled'} was ignored and rebuilt as adaptive on reload.
  bedrockInputParser now surfaces a persisted disabled AMRF.thinking as
  thinking=false so it re-emits {type:'disabled'}. Verified end-to-end against
  the real initializeBedrock call path.
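
The ~50% figure in the pricing item above follows directly from the two rate cards. The sketch below is only arithmetic on the numbers quoted there, not the tx multiplier table itself; the `RateCard` type and `costUsd` helper are hypothetical, and reading the cache pair as write/read is an assumption.

```typescript
// Hypothetical sketch: USD per million tokens, using the rates quoted above.
interface RateCard {
  input: number;
  output: number;
  cacheWrite: number; // assumed to be the first number of the quoted cache pair
  cacheRead: number; // assumed to be the second number of the quoted cache pair
}

interface Usage {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

const INTRO_RATES: RateCard = { input: 2, output: 10, cacheWrite: 2.5, cacheRead: 0.2 };
const STANDARD_RATES: RateCard = { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 };

/** Cost in USD for the given token counts under a rate card. */
function costUsd(usage: Usage, rates: RateCard): number {
  const total =
    usage.input * rates.input +
    usage.output * rates.output +
    usage.cacheWrite * rates.cacheWrite +
    usage.cacheRead * rates.cacheRead;
  return total / 1e6;
}

// One million input tokens and one million output tokens, no caching.
const usage: Usage = { input: 1e6, output: 1e6, cacheWrite: 0, cacheRead: 0 };
const introCost = costUsd(usage, INTRO_RATES); // 12
const standardCost = costUsd(usage, STANDARD_RATES); // 18
const overchargeRatio = standardCost / introCost; // 1.5
```

Every standard rate is 1.5 times its intro counterpart, so billing at the standard rates during the intro window charges 50% more regardless of the token mix.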
2026-06-30 19:26:33 -04:00
logger.js 🗂️ feat: Allow Disabling File Log Transports (#13215) 2026-05-20 23:16:56 -04:00
LoggingSystem.js feat: Logins log for Fail2Ban (#986) 2023-09-24 12:18:10 -04:00
tokens.spec.js 🤖 feat: Add Claude Sonnet 5 Support (#14042) 2026-06-30 19:26:33 -04:00