Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma, and other models. https://ollama.com

Ollama

Start building with open models.

Download

macOS

curl -fsSL https://ollama.com/install.sh | sh

or download manually

Windows

irm https://ollama.com/install.ps1 | iex

or download manually

Linux

curl -fsSL https://ollama.com/install.sh | sh

Manual install instructions

Docker

The official Ollama Docker image ollama/ollama is available on Docker Hub.
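A typical way to start the container, following the image's published usage: a named volume keeps downloaded models across restarts, and port 11434 exposes the API. The volume name is illustrative; add the appropriate GPU flags for your setup if you want accelerated inference.

```shell
# Start the Ollama server in the background, persisting models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Run a model inside the container
docker exec -it ollama ollama run gemma3
```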

Libraries

Community

Get started

ollama

You'll be prompted to run a model or connect Ollama to your existing agents or applications, such as Claude Code, OpenClaw, OpenCode, Codex, Copilot, and more.

Coding

To launch a specific integration:

ollama launch claude

Supported integrations include Claude Code, Codex, Copilot CLI, Droid, and OpenCode.

AI assistant

Use OpenClaw to turn Ollama into a personal AI assistant across WhatsApp, Telegram, Slack, Discord, and more:

ollama launch openclaw

Chat with a model

Run and chat with Gemma 3:

ollama run gemma3

See ollama.com/library for the full list.

See the quickstart guide for more details.

REST API

Ollama has a REST API for running and managing models.

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [{
    "role": "user",
    "content": "Why is the sky blue?"
  }],
  "stream": false
}'

See the API documentation for all endpoints.
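For clients that assemble the request body programmatically, the same /api/chat payload shown above can be built with the standard library before sending it to the server. `build_chat_request` is an illustrative helper, not part of Ollama's API:

```python
import json

def build_chat_request(model: str, prompt: str, stream: bool = False) -> str:
    """Assemble the JSON body expected by Ollama's POST /api/chat endpoint."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })

# Produces the same body as the curl example above
body = build_chat_request("gemma3", "Why is the sky blue?")
print(body)
```

The resulting string can be POSTed to http://localhost:11434/api/chat with any HTTP client; with `"stream": false` the server returns a single JSON object rather than a stream of chunks.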

Python

pip install ollama
from ollama import chat

response = chat(model='gemma3', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response.message.content)

JavaScript

npm i ollama
import ollama from "ollama";

const response = await ollama.chat({
  model: "gemma3",
  messages: [{ role: "user", content: "Why is the sky blue?" }],
});
console.log(response.message.content);

Supported backends

  • llama.cpp project founded by Georgi Gerganov.

Documentation

Community Integrations

Want to add your project? Open a pull request.

Chat Interfaces

Web

Desktop

  • Dify.AI - LLM app development platform
  • AnythingLLM - All-in-one AI app for Mac, Windows, and Linux
  • Maid - Cross-platform mobile and desktop client
  • Witsy - AI desktop app for Mac, Windows, and Linux
  • Cherry Studio - Multi-provider desktop client
  • Ollama App - Multi-platform client for desktop and mobile
  • PyGPT - AI desktop assistant for Linux, Windows, and Mac
  • Alpaca - GTK4 client for Linux and macOS
  • SwiftChat - Cross-platform including iOS, Android, and Apple Vision Pro
  • Enchanted - Native macOS and iOS client
  • RWKV-Runner - Multi-model desktop runner
  • Ollama Grid Search - Evaluate and compare models
  • macai - macOS client for Ollama and ChatGPT
  • AI Studio - Multi-provider desktop IDE
  • Reins - Parameter tuning and reasoning model support
  • ConfiChat - Privacy-focused with optional encryption
  • LLocal.in - Electron desktop client
  • MindMac - AI chat client for Mac
  • Msty - Multi-model desktop client
  • BoltAI for Mac - AI chat client for Mac
  • IntelliBar - AI-powered assistant for macOS
  • Kerlig AI - AI writing assistant for macOS
  • Hillnote - Markdown-first AI workspace
  • Perfect Memory AI - Productivity AI personalized by screen and meeting history

Mobile

SwiftChat, Enchanted, Maid, Ollama App, Reins, and ConfiChat listed above also support mobile platforms.

Code Editors & Development

Libraries & SDKs

Frameworks & Agents

RAG & Knowledge Bases

  • RAGFlow - RAG engine based on deep document understanding
  • R2R - Open-source RAG engine
  • MaxKB - Ready-to-use RAG chatbot
  • Minima - On-premises or fully local RAG
  • Chipper - AI interface with Haystack RAG
  • ARGO - RAG and deep research on Mac/Windows/Linux
  • Archyve - RAG-enabling document library
  • Casibase - AI knowledge base with RAG and SSO
  • BrainSoup - Native client with RAG and multi-agent automation

Bots & Messaging

Terminal & CLI

Productivity & Apps

Observability & Monitoring

  • Opik - Debug, evaluate, and monitor LLM applications
  • OpenLIT - OpenTelemetry-native monitoring for Ollama and GPUs
  • Lunary - LLM observability with analytics and PII masking
  • Langfuse - Open source LLM observability
  • HoneyHive - AI observability and evaluation for agents
  • MLflow Tracing - Open source LLM observability

Database & Embeddings

Infrastructure & Deployment

Cloud

Package Managers