mirror of https://github.com/danny-avila/LibreChat.git synced 2026-06-26 17:31:27 +00:00

Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active. https://librechat.ai/

ai anthropic artifacts aws azure chatgpt chatgpt-clone claude clone deepseek gemini google gpt-5 librechat mcp o1 openai responses-api vision webui

Danny Avila d18d62e7c1 🪙 refactor: Reconcile Context Gauge to Actual Provider Tokens (#13780)

* 🪙 fix: Reconcile Context Gauge to Actual Provider Tokens

The context gauge could read several× too high (e.g. 213K when the real prompt was 56K) and stay there across reloads.

Root cause: the SDK's calibrationRatio is `cumulativeProviderReported / cumulativeRawSent`, but a provider's server-side web search injects large fetched content into the prompt that the SDK never sent or counted, pinning the ratio at its cap (5) and multiplying every later message estimate, including post-summary ones. The gauge rendered (and persisted) that inflated estimate, never the provider's actual token count.

Fix: reconcile the snapshot to the call's ACTUAL prompt tokens (input + cache), which already arrive in on_token_usage. Only messageTokens is calibration-scaled (instructions/summary are raw tiktoken), so keep those and set messageTokens to the remainder, recomputing free space. Shared `promptTokensFromUsage` + `reconcileContextUsage` in data-provider; applied server-side in buildPersistedContextUsage (reload-stable) and client-side in useUsageHandler on each primary usage (corrects at turn-end, no follow-up needed). Also drop the summary double-count from the Breakdown Messages row.

Deferred (separate agents PR): the SDK over-calibration also fires summarization prematurely; fixing it needs decoupling real-content estimation from server-side injection headroom without weakening pruning-overflow safety.

* 🪙 fix: Harden Token Reconciliation for Provider-less + Resume Paths

Codex review on the reconciliation:

- promptTokensFromUsage: when the provider is absent (custom/OpenAI-compatible payloads), fall back to the same magnitude heuristic normalizeUsageUnits uses (cache ≤ input ⇒ already included) so cached events aren't re-inflated.
- Resume: backfillUsage restores a primary call's usage without replaying a live on_token_usage (Redis mode), so the live reconcile never ran and a reconnected session stayed on the inflated estimate. New reconcileBackfill reconciles the restored snapshot from the final primary call after contextHandler installs it.

* 🪙 fix: Reconcile Resume Snapshot Server-Side, Not via Backfill

Codex: the client reconcileBackfill scanned the resumed run's collectedUsage and applied the final primary to the latest snapshot, but on a mid-call resume that usage belongs to an EARLIER call, corrupting the restored gauge.

Move the resume reconciliation server-side: GenerationJobManager.persistTokenUsage reconciles the stored contextUsage to a primary usage's actual prompt tokens as it arrives. That usage is the post-invoke truth for the call the latest stored snapshot precedes (no snapshot is captured between a call's pre-invoke dispatch and its usage), so it's correct by construction and run-matched. A mid-call resume (no usage yet) keeps the raw snapshot instead of mis-applying an earlier call's tokens; it reconciles once the call completes.

Removed client reconcileBackfill; the live-path reconcile (non-resume) stays.

* 🪙 fix: Guard Reconciliation Against Replays and Snapshot Races

Two Codex concurrency findings on the reconciliation:

- Client: reconcile only on a NEWLY folded primary usage. A replayed duplicate (folded=false on resume) can be an earlier tool-loop call sharing the run id, which would overwrite the latest snapshot with an earlier, smaller prompt. Moved the reconcile after the folded guard.
- Server: serialize the context-usage write through the same per-stream queue as the token-usage write. persistTokenUsage reconciles the stored snapshot (read-modify-write); an unserialized trackContextUsage could store a newer snapshot between the read and write, or a stale reconciled write could land after a newer snapshot, clobbering the newer run's gauge when calls interleave. FIFO keeps each call's snapshot ahead of its own usage and behind the next.

* chore: import order in GenerationJobManager.ts

2026-06-16 11:05:44 -04:00
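
The reconciliation arithmetic described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: the function names `promptTokensFromUsage` and `reconcileContextUsage` come from the commit message, but their signatures, the `Usage` and `ContextUsage` shapes, every field name, and the `cacheReportedSeparately` flag are assumptions made for this example. The sample numbers reuse the 213K versus 56K figures from the commit message; the 400K context window is invented.

```typescript
// Hypothetical shapes: the real LibreChat types are not shown in the source.
interface Usage {
  inputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
  // Assumed flag: true when the provider reports cache tokens outside inputTokens.
  // Left undefined when the provider is unknown (custom/OpenAI-compatible payloads).
  cacheReportedSeparately?: boolean;
}

interface ContextUsage {
  maxContextTokens: number;
  instructionTokens: number; // raw count, not calibration-scaled
  summaryTokens: number; // raw count, not calibration-scaled
  messageTokens: number; // the only calibration-scaled part
  freeTokens: number;
}

// Actual prompt tokens for one call: input plus cache, without double-counting.
function promptTokensFromUsage(usage: Usage): number {
  const cache = (usage.cacheReadTokens ?? 0) + (usage.cacheWriteTokens ?? 0);
  // Provider unknown: magnitude heuristic, cache <= input means already included.
  const separate = usage.cacheReportedSeparately ?? (cache > usage.inputTokens);
  return separate ? usage.inputTokens + cache : usage.inputTokens;
}

// Keep the raw instruction/summary counts, assign the remainder of the actual
// prompt to messages, then recompute free space.
function reconcileContextUsage(snapshot: ContextUsage, actualPromptTokens: number): ContextUsage {
  const fixed = snapshot.instructionTokens + snapshot.summaryTokens;
  const messageTokens = Math.max(0, actualPromptTokens - fixed);
  const freeTokens = Math.max(0, snapshot.maxContextTokens - (fixed + messageTokens));
  return { ...snapshot, messageTokens, freeTokens };
}

// Inflated snapshot: 4K + 2K + 207K = 213K shown on the gauge.
const inflated: ContextUsage = {
  maxContextTokens: 400_000,
  instructionTokens: 4_000,
  summaryTokens: 2_000,
  messageTokens: 207_000,
  freeTokens: 187_000,
};

// Provider reports 50K input plus 6K cached separately: 56K actual prompt tokens.
const actualPromptTokens = promptTokensFromUsage({
  inputTokens: 50_000,
  cacheReadTokens: 6_000,
  cacheReportedSeparately: true,
});

const reconciled = reconcileContextUsage(inflated, actualPromptTokens);

// Provider-less payload: cache (6K) <= input (56K), so cache is treated as included.
const providerLess = promptTokensFromUsage({ inputTokens: 56_000, cacheReadTokens: 6_000 });
```

In this sketch the reconciled snapshot no longer depends on the calibration ratio at all: whatever estimate `messageTokens` held before, it is replaced by the provider-reported total minus the raw instruction and summary counts.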
.devcontainer	🐳 chore: Upgrade Docker Builds To Node 24 (#13448 )	2026-06-01 10:03:18 -04:00
.do/gitnexus	⏫ ci: Bump GitNexus to 1.6.7 to Fix Embeddings Index Timeout (#13658 )	2026-06-10 14:05:54 -04:00
.github	📈 fix: Isolate RUM Telemetry Proxy Auth from App Auth (#13765 )	2026-06-15 12:49:44 -04:00
.husky	🔧 chore: Update ESLint config, Import Sorting script, Test Sharding, Bump `@librechat/agents` (#13552 )	2026-06-06 12:31:55 -04:00
.vscode	🔐 feat: Granular Role-based Permissions + Entra ID Group Discovery (#7804 )	2025-08-13 16:24:17 -04:00
api	🪙 refactor: Reconcile Context Gauge to Actual Provider Tokens (#13780 )	2026-06-16 11:05:44 -04:00
client	🪙 refactor: Reconcile Context Gauge to Actual Provider Tokens (#13780 )	2026-06-16 11:05:44 -04:00
config	🔗 feat: Add Granular Access Control to Shared Links via ACL System (#13051 )	2026-06-03 14:17:17 -04:00
e2e	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
helm	📊 chore: Bump Helm chart version to 2.0.6	2026-06-15 13:14:12 -04:00
packages	🪙 refactor: Reconcile Context Gauge to Actual Provider Tokens (#13780 )	2026-06-16 11:05:44 -04:00
redis-config	🔄 refactor: Migrate Cache Logic to TypeScript (#9771 )	2025-10-02 09:33:58 -04:00
scripts	🔧 chore: Update ESLint config, Import Sorting script, Test Sharding, Bump `@librechat/agents` (#13552 )	2026-06-06 12:31:55 -04:00
skill	🗂️ feat: Add Deployment Skill Directory (#13523 )	2026-06-05 10:24:28 -04:00
src/tests	🆔 feat: Add OpenID Connect Federated Provider Token Support (#9931 )	2025-11-21 09:51:11 -05:00
utils	🐳 chore: Update image registry references in Docker/Helm configurations (#12026 )	2026-03-02 22:14:50 -05:00
.dockerignore	🐳 : Further Docker build Cleanup & Docs Update (#1502 )	2024-01-06 11:59:08 -05:00
.env.example	📡 refactor: Gate Noisy Redis OTEL Instrumentation (#13764 )	2026-06-15 12:48:20 -04:00
.gitattributes	🎛️ feat: DB-Backed Per-Principal Config System (#12354 )	2026-03-25 19:39:29 -04:00
.gitignore	🎭 feat: Add Credential-Free Playwright Smoke Suite with a Local Mock LLM (#13472 )	2026-06-02 16:36:39 -04:00
.nvmrc	🐳 chore: Upgrade Docker Builds To Node 24 (#13448 )	2026-06-01 10:03:18 -04:00
.prettierrc	🧹 chore: Migrate to Flat ESLint Config & Update Prettier Settings (#5737 )	2025-02-09 12:15:20 -05:00
AGENTS.md	📋 chore: Move project instructions from AGENTS.md to CLAUDE.md	2026-03-31 21:50:38 -04:00
bun.lock	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
CLAUDE.md	🐳 chore: Upgrade Docker Builds To Node 24 (#13448 )	2026-06-01 10:03:18 -04:00
deploy-compose.yml	🌐 fix: Centralize Outbound Proxy Handling (#13726 )	2026-06-14 10:47:49 -04:00
docker-compose.override.yml.example	🐳 chore: Update image registry references in Docker/Helm configurations (#12026 )	2026-03-02 22:14:50 -05:00
docker-compose.yml	🌐 fix: Centralize Outbound Proxy Handling (#13726 )	2026-06-14 10:47:49 -04:00
Dockerfile	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
Dockerfile.multi	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
eslint.config.mjs	✨ feat: Surface Model Spec Branding on Landing and Selector (#13662 )	2026-06-10 21:02:22 -04:00
librechat.example.yaml	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
LICENSE	🗒️ docs: Update LICENSE.md Year: 2025 -> 2026 (#12554 )	2026-04-08 09:12:44 -04:00
package-lock.json	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
package.json	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
rag.yml	🐳 chore: Update image registry references in Docker/Helm configurations (#12026 )	2026-03-02 22:14:50 -05:00
README.md	📚 docs: Add Skills, Subagents, and CloudFront References (#13096 )	2026-05-12 21:41:09 -04:00
README.zh.md	✨ v0.8.7-rc1 (#13592 )	2026-06-15 13:10:30 -04:00
turbo.json	📦 chore: Update Turbo package to v2.9.17	2026-06-10 15:34:53 -04:00

README.md

LibreChat

English · 中文

✨ Features

🖥️ UI & Experience inspired by ChatGPT with enhanced design and features
🤖 AI Model Selection:
- Anthropic (Claude), AWS Bedrock, OpenAI, Azure OpenAI, Google, Vertex AI, OpenAI Responses API (incl. Azure)
- Custom Endpoints: Use any OpenAI-compatible API with LibreChat, no proxy required
- Compatible with Local & Remote AI Providers:
  - Ollama, Groq, Cohere, Mistral AI, Apple MLX, koboldcpp, together.ai,
  - OpenRouter, Helicone, Perplexity, ShuttleAI, DeepSeek, Qwen, and more
🔧 Code Interpreter API:
- Secure, Sandboxed Execution in Python, Node.js (JS/TS), Go, C/C++, Java, PHP, Rust, and Fortran
- Seamless File Handling: Upload, process, and download files directly
- No Privacy Concerns: Fully isolated and secure execution
🔦 Agents & Tools Integration:
- LibreChat Agents:
  - No-Code Custom Assistants: Build specialized, AI-driven helpers
  - Agent Marketplace: Discover and deploy community-built agents
  - Collaborative Sharing: Share agents with specific users and groups
  - Flexible & Extensible: Use MCP Servers, tools, file search, code execution, and more
  - Skills: Create reusable SKILL.md instruction bundles for manual, automatic, or always-on agent workflows
  - Subagents: Delegate focused work to isolated child agent runs with their own context windows
  - Compatible with Custom Endpoints, OpenAI, Azure, Anthropic, AWS Bedrock, Google, Vertex AI, Responses API, and more
  - Model Context Protocol (MCP) Support for Tools
🔍 Web Search:
- Search the internet and retrieve relevant information to enhance your AI context
- Combines search providers, content scrapers, and result rerankers for optimal results
- Customizable Jina Reranking: Configure custom Jina API URLs for reranking services
- Learn More →
🪄 Generative UI with Code Artifacts:
- Code Artifacts let you create React components, HTML pages, and Mermaid diagrams directly in chat
🎨 Image Generation & Editing
- Text-to-image and image-to-image with GPT-Image-1
- Text-to-image with DALL-E (3/2), Stable Diffusion, Flux, or any MCP server
- Produce stunning visuals from prompts or refine existing images with a single instruction
💾 Presets & Context Management:
- Create, Save, & Share Custom Presets
- Switch between AI Endpoints and Presets mid-chat
- Edit, Resubmit, and Continue Messages with Conversation branching
- Create and share prompts with specific users and groups
- Fork Messages & Conversations for Advanced Context control
💬 Multimodal & File Interactions:
- Upload and analyze images with Claude 3, GPT-4.5, GPT-4o, o1, Llama-Vision, and Gemini 📸
- Chat with Files using Custom Endpoints, OpenAI, Azure, Anthropic, AWS Bedrock, & Google 🗃️
🌎 Multilingual UI:
- English, 中文 (简体), 中文 (繁體), العربية, Deutsch, Español, Français, Italiano
- Polski, Português (PT), Português (BR), Русский, 日本語, Svenska, 한국어, Tiếng Việt
- Türkçe, Nederlands, עברית, Català, Čeština, Dansk, Eesti, فارسی
- Suomi, Magyar, Հայերեն, Bahasa Indonesia, ქართული, Latviešu, ไทย, ئۇيغۇرچە
🧠 Reasoning UI:
- Dynamic Reasoning UI for Chain-of-Thought/Reasoning AI models like DeepSeek-R1
🎨 Customizable Interface:
- Customizable Dropdown & Interface that adapts to both power users and newcomers
🌊 Resumable Streams:
- Never lose a response: AI responses automatically reconnect and resume if your connection drops
- Multi-Tab & Multi-Device Sync: Open the same chat in multiple tabs or pick up on another device
- Production-Ready: Works from single-server setups to horizontally scaled deployments with Redis
🗣️ Speech & Audio:
- Chat hands-free with Speech-to-Text and Text-to-Speech
- Automatically send and play Audio
- Supports OpenAI, Azure OpenAI, and ElevenLabs
📥 Import & Export Conversations:
- Import Conversations from LibreChat, ChatGPT, Chatbot UI
- Export conversations as screenshots, Markdown, text, or JSON
🔍 Search & Discovery:
- Search all messages/conversations
👥 Multi-User & Secure Access:
- Multi-User, Secure Authentication with OAuth2, LDAP, & Email Login Support
- Built-in Moderation and Token Spend tools
⚙️ Configuration & Deployment:
- Configure Proxy, Reverse Proxy, Docker, & many Deployment options
- Use S3 with CloudFront for stable media links, edge delivery, signed cookies, and secured downloads
- Run completely locally or deploy in the cloud
📖 Open-Source & Community:
- Completely Open-Source & Built in Public
- Community-driven development, support, and feedback
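
As a rough illustration of what "OpenAI-compatible" means in the Custom Endpoints item above, the sketch below builds a chat completion request in the widely used OpenAI request shape: a `POST` to `<baseURL>/chat/completions` with a bearer token and a JSON body containing `model` and `messages`. This is not LibreChat's own code. The helper `buildChatRequest`, the placeholder API key, and the model name `llama3` are assumptions for the example; the base URL assumes a local Ollama instance serving its OpenAI-compatible API on the default port.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds (but does not send) an OpenAI-style chat request.
function buildChatRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  // Strip trailing slashes so the path is joined exactly once.
  const url = `${baseURL.replace(/\/+$/, '')}/chat/completions`;
  return {
    url,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Assumed values: a local Ollama server and a placeholder key and model name.
const request = buildChatRequest('http://localhost:11434/v1/', 'sk-placeholder', 'llama3', [
  { role: 'user', content: 'Hello' },
]);

// To actually send it: const response = await fetch(request.url, request.init);
```

Any provider that accepts this request shape at its base URL can, in principle, sit behind the same client code, which is why a single custom endpoint mechanism can cover many of the providers listed above.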

For a thorough review of our features, see our docs here 📚

🪶 All-In-One AI Conversations with LibreChat

LibreChat is a self-hosted AI chat platform that unifies all major AI providers in a single, privacy-focused interface.

Beyond chat, LibreChat provides AI Agents, Model Context Protocol (MCP) support, Artifacts, Code Interpreter, custom actions, conversation search, and enterprise-ready multi-user authentication.

Open source, actively developed, and built for anyone who values control over their AI infrastructure.

🌐 Resources

GitHub Repo:

RAG API: github.com/danny-avila/rag_api
Website: github.com/LibreChat-AI/librechat.ai

Other:

Website: librechat.ai
Documentation: librechat.ai/docs
Blog: librechat.ai/blog

📝 Changelog

Keep up with the latest updates by visiting the releases page and notes:

⚠️ Please consult the changelog for breaking changes before updating.

✨ Contributions

Contributions, suggestions, bug reports and fixes are welcome!

For new features, components, or extensions, please open an issue and discuss before sending a PR.

If you'd like to help translate LibreChat into your language, we'd love your contribution! Improving our translations not only makes LibreChat more accessible to users around the world but also enhances the overall user experience. Please check out our Translation Guide.

💖 This project exists in its current state thanks to all the people who contribute

🎉 Special Thanks

We thank Locize for their translation management tools that support multiple languages in LibreChat.