ollama/x/create
Daniel Hiltgen 87288ced4f
New models (#15861)
* mlx: add laguna model support

* convert: support fp8 safetensors import

Decode HF F8_E4M3 safetensors with block scale companions into GGUF-supported tensor types, and record which output tensors came from FP8 source weights.

Use that source-precision metadata during create quantization: default FP8-sourced GGUFs to Q8_0; for Q8_0 requests, keep non-FP8 tensors at their original precision; and for Q4_K requests, promote non-FP8 quantizable tensors to Q8_0.
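
The selection rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `tensorInfo` type, its fields, and the function names are hypothetical. Two behaviors are guesses the commit message does not confirm: that FP8-sourced tensors receive the requested Q4_K type, and that non-quantizable tensors always keep their source type.

```go
package main

import "fmt"

// tensorInfo is a hypothetical record of the metadata the commit describes.
type tensorInfo struct {
	FromFP8     bool   // tensor was decoded from FP8 source weights
	Quantizable bool   // tensor is eligible for quantization at all
	SourceType  string // precision after import, e.g. "BF16"
}

// defaultQuantization picks Q8_0 when nothing was requested and the model
// came from FP8 weights; otherwise it returns the request unchanged.
func defaultQuantization(requested string, modelFromFP8 bool) string {
	if requested == "" && modelFromFP8 {
		return "Q8_0"
	}
	return requested
}

// targetType returns the output type for one tensor under a request.
func targetType(t tensorInfo, requested string) string {
	if !t.Quantizable {
		return t.SourceType // assumption: never touched
	}
	switch requested {
	case "Q8_0":
		if t.FromFP8 {
			return "Q8_0"
		}
		return t.SourceType // non-FP8 tensors keep their original precision
	case "Q4_K":
		if t.FromFP8 {
			return "Q4_K" // assumption: FP8-sourced tensors get the requested type
		}
		return "Q8_0" // non-FP8 quantizable tensors are promoted to Q8_0
	default:
		return t.SourceType
	}
}

func main() {
	fp8 := tensorInfo{FromFP8: true, Quantizable: true, SourceType: "BF16"}
	bf16 := tensorInfo{FromFP8: false, Quantizable: true, SourceType: "BF16"}
	req := defaultQuantization("", true)
	fmt.Println(req, targetType(fp8, req), targetType(bf16, req))
	fmt.Println(targetType(fp8, "Q4_K"), targetType(bf16, "Q4_K"))
}
```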

* ggml: add laguna model support

* server: preserve generate logprobs with builtin parsers

Generate requests were dropping logprob-only chunks whenever a builtin parser buffered visible content. Chat already handled this case, but generate only forwarded chunks with visible response, thinking, or tool-call output.

Keep generate chunks that carry logprobs even when the builtin parser has not flushed visible content yet, and add a regression test that exercises the behavior with a generic thinking parser.
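
The forwarding rule described above amounts to one extra condition in the check that decides whether a chunk is sent. The sketch below is a simplified stand-in, not the server code: the `chunk` type and `shouldForward` are hypothetical names, and only the conditions named in this commit message are modeled.

```go
package main

import "fmt"

// chunk is a hypothetical, reduced view of a generate response chunk.
type chunk struct {
	Response  string    // visible content flushed by the parser
	Thinking  string    // thinking content flushed by the parser
	ToolCalls []string  // parsed tool calls, if any
	Logprobs  []float64 // per-token log probabilities, if requested
}

// shouldForward reports whether a chunk carries anything worth sending.
// Before the fix, the logprobs condition was effectively missing for
// generate, so logprob-only chunks were dropped while the parser buffered.
func shouldForward(c chunk) bool {
	return c.Response != "" ||
		c.Thinking != "" ||
		len(c.ToolCalls) > 0 ||
		len(c.Logprobs) > 0
}

func main() {
	buffered := chunk{Logprobs: []float64{-0.12}} // parser has not flushed yet
	empty := chunk{}
	fmt.Println(shouldForward(buffered), shouldForward(empty))
}
```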

* review comments - perf improvements

* ggml: implement nemotron 3 nano omni

* add poolside integration

* update poolside doc

* adapt to new cache setup

* fix test

* fix test

---------

Co-authored-by: Eva Ho <hoyyeva@gmail.com>
2026-04-28 11:50:12 -07:00
client New models (#15861) 2026-04-28 11:50:12 -07:00
create.go New models (#15861) 2026-04-28 11:50:12 -07:00
create_test.go mlx: Support NVIDIA TensorRT Model Optimizer import (#15566) 2026-04-27 18:28:10 -07:00
dtype.go mlx: Support NVIDIA TensorRT Model Optimizer import (#15566) 2026-04-27 18:28:10 -07:00
gemma4.go Keep Gemma4 router projection in source precision (#15613) 2026-04-15 15:04:23 -07:00
gemma4_test.go Keep Gemma4 router projection in source precision (#15613) 2026-04-15 15:04:23 -07:00
imagegen.go create: Clean up experimental paths, fix create from existing safetensor model (#14679) 2026-04-07 08:12:57 -07:00
laguna.go New models (#15861) 2026-04-28 11:50:12 -07:00
laguna_test.go New models (#15861) 2026-04-28 11:50:12 -07:00
qwen35.go create: Clean up experimental paths, fix create from existing safetensor model (#14679) 2026-04-07 08:12:57 -07:00