Mirror of https://github.com/ollama/ollama.git, synced 2026-07-05 15:27:25 +00:00
Added cached prompt token counts to Ollama responses and compatibility usage fields. This carries local `llama-server` `cache_n` and MLX cache hits through `/api/generate`, `/api/chat`, OpenAI-compatible endpoints, and Anthropic-compatible `/v1/messages`. Cloud responses are passed through as-is, so cache counts will show up there once Cloud starts returning them.

| Name |
|---|
| examples |
| client.go |
| client_test.go |
| types.go |
| types_test.go |
| types_typescript_test.go |
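
The commit message above mentions carrying a cached prompt token count into the OpenAI-compatible usage fields. The following is a minimal sketch of what such a mapping could look like, not the code in this repository. The `nativeCounts` type, its field names, and the `toOpenAIUsage` helper are hypothetical and invented for illustration; only the JSON keys (`prompt_tokens`, `completion_tokens`, `total_tokens`, `prompt_tokens_details.cached_tokens`) follow the publicly documented OpenAI usage object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nativeCounts is a hypothetical stand-in for the token counts a local
// runner might report. The names are assumptions, not Ollama's types.
type nativeCounts struct {
	PromptTokens     int // all prompt tokens, cached ones included
	CachedTokens     int // prompt tokens that were served from the cache
	CompletionTokens int // tokens generated in the response
}

// openAIUsage follows the shape of the OpenAI usage object, where the
// cached count is nested under prompt_tokens_details.
type openAIUsage struct {
	PromptTokens        int `json:"prompt_tokens"`
	CompletionTokens    int `json:"completion_tokens"`
	TotalTokens         int `json:"total_tokens"`
	PromptTokensDetails struct {
		CachedTokens int `json:"cached_tokens"`
	} `json:"prompt_tokens_details"`
}

// toOpenAIUsage (hypothetical helper) copies the counts across and
// derives the total from prompt plus completion tokens.
func toOpenAIUsage(c nativeCounts) openAIUsage {
	var u openAIUsage
	u.PromptTokens = c.PromptTokens
	u.CompletionTokens = c.CompletionTokens
	u.TotalTokens = c.PromptTokens + c.CompletionTokens
	u.PromptTokensDetails.CachedTokens = c.CachedTokens
	return u
}

func main() {
	u := toOpenAIUsage(nativeCounts{PromptTokens: 120, CachedTokens: 90, CompletionTokens: 30})
	out, err := json.Marshal(u)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In this sketch the cached count is reported alongside the full prompt count rather than subtracted from it, matching the OpenAI convention that `prompt_tokens` includes cached tokens. Whether the native Ollama fields and the Anthropic-compatible usage object follow the same convention is not stated in the commit message.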