ollama/cmd
Daniel Hiltgen 87288ced4f
New models (#15861)
* mlx: add laguna model support

* convert: support fp8 safetensors import

Decode HF F8_E4M3 safetensors with block scale companions into GGUF-supported tensor types, and record which output tensors came from FP8 source weights.

Use that source-precision metadata during create quantization: default FP8-sourced GGUFs to Q8_0, keep non-FP8 tensors at their original precision for Q8_0, and promote non-FP8 quantizable tensors to Q8_0 for Q4_K requests.
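
The quantization rules in this entry can be sketched as a small per-tensor decision function. This is only an illustration of the policy as worded above, not the code from the commit: `tensorInfo`, `effectiveQuant`, and `targetType` are hypothetical names, the type strings are plain labels, and the choice to give FP8-sourced tensors the requested type under a Q4_K request is an assumption the message does not spell out.

```go
package main

import (
	"fmt"
	"strings"
)

// tensorInfo is a hypothetical record for one output tensor; the real
// converter's data structures are not shown in the commit message.
type tensorInfo struct {
	Name        string
	Type        string // current tensor type label, e.g. "BF16"
	FromFP8     bool   // true if the tensor was decoded from F8_E4M3 weights
	Quantizable bool   // false for tensors that are never quantized
}

// effectiveQuant applies the stated default: an FP8-sourced model with no
// explicit quantization request is treated as a Q8_0 request.
func effectiveQuant(requested string, anyFP8 bool) string {
	if requested == "" && anyFP8 {
		return "Q8_0"
	}
	return requested
}

// targetType picks the output type for one tensor under the stated rules.
func targetType(t tensorInfo, requested string) string {
	if requested == "" || !t.Quantizable {
		return t.Type // nothing requested, or tensor is left alone
	}
	if t.FromFP8 {
		// Assumption: FP8-sourced tensors simply follow the request.
		return requested
	}
	switch {
	case requested == "Q8_0":
		return t.Type // non-FP8 tensors keep their original precision
	case strings.HasPrefix(requested, "Q4_K"):
		return "Q8_0" // non-FP8 quantizable tensors are promoted to Q8_0
	}
	// Assumption: other request types are outside the rules quoted above.
	return requested
}

func main() {
	tensors := []tensorInfo{
		{Name: "blk.0.attn_q.weight", Type: "BF16", FromFP8: true, Quantizable: true},
		{Name: "blk.0.attn_norm.weight", Type: "F32", FromFP8: false, Quantizable: false},
		{Name: "token_embd.weight", Type: "BF16", FromFP8: false, Quantizable: true},
	}
	requested := effectiveQuant("", true)
	for _, t := range tensors {
		fmt.Printf("%s: %s -> %s\n", t.Name, t.Type, targetType(t, requested))
	}
}
```

The tensor names in `main` are placeholders chosen for the demo.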

* ggml: add laguna model support

* server: preserve generate logprobs with builtin parsers

Generate requests were dropping logprob-only chunks whenever a builtin parser buffered visible content. Chat already handled this case, but generate only forwarded chunks with visible response, thinking, or tool-call output.

Keep generate chunks that carry logprobs even when the builtin parser has not flushed visible content yet, and add a regression test that exercises the behavior with a generic thinking parser.
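
The forwarding rule described in this entry amounts to widening one predicate. A minimal sketch, assuming a simplified chunk shape: `chunk` and `shouldForward` are hypothetical names for illustration and do not mirror the server's actual types.

```go
package main

import "fmt"

// chunk is a hypothetical stand-in for one streamed generate chunk after the
// builtin parser has processed it. Field names are illustrative only.
type chunk struct {
	Response  string    // visible response text flushed by the parser
	Thinking  string    // visible thinking text flushed by the parser
	ToolCalls []string  // parsed tool calls, if any
	Logprobs  []float64 // log probabilities attached to this chunk
}

// shouldForward reports whether a chunk is sent to the client. Before the
// fix, only the hasVisible condition applied, so a chunk that carried
// logprobs while the parser was still buffering content was dropped.
func shouldForward(c chunk) bool {
	hasVisible := c.Response != "" || c.Thinking != "" || len(c.ToolCalls) > 0
	return hasVisible || len(c.Logprobs) > 0
}

func main() {
	// The parser is buffering, so no visible text yet, but logprobs exist.
	buffered := chunk{Logprobs: []float64{-0.25}}
	fmt.Println("forward buffered logprob chunk:", shouldForward(buffered))

	// A chunk with nothing in it is still skipped.
	fmt.Println("forward empty chunk:", shouldForward(chunk{}))
}
```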

* review comments - perf improvements

* ggml: implement nemotron 3 nano omni

* add poolside integration

* update poolside doc

* adapt to new cache setup

* fix test

* fix test

---------

Co-authored-by: Eva Ho <hoyyeva@gmail.com>
2026-04-28 11:50:12 -07:00
bench Add support for gemma4 (#15214) 2026-04-02 11:33:33 -07:00
config cmd: refactor tui and launch (#14609) 2026-03-12 18:39:06 -07:00
internal/fileutil cmd: refactor tui and launch (#14609) 2026-03-12 18:39:06 -07:00
launch New models (#15861) 2026-04-28 11:50:12 -07:00
runner Runner for Ollama engine 2025-02-13 17:09:26 -08:00
tui launch: add hermes (#15569) 2026-04-15 12:00:23 -07:00
background_unix.go cmd: ollama menu and launch improvements (#14038) 2026-02-09 11:30:16 -08:00
background_windows.go cmd: ollama menu and launch improvements (#14038) 2026-02-09 11:30:16 -08:00
cmd.go api: accept "max" as a think value (#15787) 2026-04-24 01:49:39 -07:00
cmd_launcher_test.go cmd: ollama launch vscode (#15060) 2026-03-25 16:37:02 -04:00
cmd_test.go modelfiles: fix /save command and add shortname for safetensors based models (#15413) 2026-04-08 21:05:39 -07:00
editor_unix.go feature: add ctrl-g to allow users to use an editor to edit their prompt (#14197) 2026-02-11 13:04:41 -08:00
editor_windows.go feature: add ctrl-g to allow users to use an editor to edit their prompt (#14197) 2026-02-11 13:04:41 -08:00
interactive.go modelfiles: fix /save command and add shortname for safetensors based models (#15413) 2026-04-08 21:05:39 -07:00
interactive_test.go Add support for gemma4 (#15214) 2026-04-02 11:33:33 -07:00
start.go nolintlint 2024-06-04 11:13:30 -07:00
start_darwin.go cmd: ollama launch improvements (#14099) 2026-02-05 15:08:17 -08:00
start_default.go lint 2024-08-01 17:06:06 -07:00
start_windows.go spawn desktop quickly (#11011) 2025-06-08 09:34:52 -07:00
warn_thinking_test.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00