Mirror of https://github.com/ollama/ollama.git, synced 2026-05-13 14:27:00 +00:00
* mlx: add laguna model support
* convert: support fp8 safetensors import

  Decode HF F8_E4M3 safetensors with block scale companions into GGUF-supported tensor types, and record which output tensors came from FP8 source weights. Use that source-precision metadata during create quantization: default FP8-sourced GGUFs to Q8_0, keep non-FP8 tensors at their original precision for Q8_0, and promote non-FP8 quantizable tensors to Q8_0 for Q4_K requests.
* ggml: add laguna model support
* server: preserve generate logprobs with builtin parsers

  Generate requests were dropping logprob-only chunks whenever a builtin parser buffered visible content. Chat already handled this case, but generate only forwarded chunks with visible response, thinking, or tool-call output. Keep generate chunks that carry logprobs even when the builtin parser has not flushed visible content yet, and add a regression test that exercises the behavior with a generic thinking parser.
* review comments - perf improvements
* ggml: implement nemotron 3 nano omni
* add poolside integration
* update poolside doc
* adapt to new cache setup
* fix test
* fix test

---------

Co-authored-by: Eva Ho <hoyyeva@gmail.com>

| Name |
|---|
| api |
| capabilities |
| images |
| integrations |
| tools/extract-examples |
| api.md |
| cli.mdx |
| cloud.mdx |
| context-length.mdx |
| development.md |
| docker.mdx |
| docs.json |
| examples.md |
| faq.mdx |
| favicon-dark.svg |
| favicon.svg |
| gpu.mdx |
| import.mdx |
| index.mdx |
| linux.mdx |
| logo.svg |
| macos.mdx |
| modelfile.mdx |
| ollama-logo.svg |
| ollama.png |
| openapi.yaml |
| quickstart.mdx |
| README.md |
| styling.css |
| template.mdx |
| troubleshooting.mdx |
| windows.mdx |

Documentation
Getting Started
- Quickstart
- Examples
- Importing models
- macOS Documentation
- Linux Documentation
- Windows Documentation
- Docker Documentation