ollama

mirror of https://github.com/ollama/ollama.git synced 2026-07-08 17:02:17 +00:00

Daniel Hiltgen 87288ced4f New models (#15861)	2026-04-28 11:50:12 -07:00

* mlx: add laguna model support
* convert: support fp8 safetensors import
  Decode HF F8_E4M3 safetensors with block scale companions into GGUF-supported tensor types, and record which output tensors came from FP8 source weights.
  Use that source-precision metadata during create quantization: default FP8-sourced GGUFs to Q8_0, keep non-FP8 tensors at their original precision for Q8_0, and promote non-FP8 quantizable tensors to Q8_0 for Q4_K requests.
* ggml: add laguna model support
* server: preserve generate logprobs with builtin parsers
  Generate requests were dropping logprob-only chunks whenever a builtin parser buffered visible content. Chat already handled this case, but generate only forwarded chunks with visible response, thinking, or tool-call output.
  Keep generate chunks that carry logprobs even when the builtin parser has not flushed visible content yet, and add a regression test that exercises the behavior with a generic thinking parser.
* review comments - perf improvements
* ggml: implement nemotron 3 nano omni
* add poolside integration
* update poolside doc
* adapt to new cache setup
* fix test
* fix test

---------

Co-authored-by: Eva Ho <hoyyeva@gmail.com>
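
The "preserve generate logprobs" item in the commit message above describes a forwarding rule that can be sketched in a few lines of Go. This is only an illustration of the rule as the message states it, not the code in routes.go: the `generateChunk` type, its fields, and both function names are hypothetical and chosen for this example.

```go
package main

import "fmt"

// generateChunk is a hypothetical, simplified stand-in for one streamed
// generate chunk. It is NOT the actual type used by the server.
type generateChunk struct {
	Response  string    // visible response text flushed by the parser
	Thinking  string    // visible thinking text flushed by the parser
	ToolCalls []string  // parsed tool calls (simplified to names)
	Logprobs  []float64 // log probabilities attached to this chunk
}

// hasVisibleOutput models the earlier generate behavior described in the
// commit message: only chunks with response, thinking, or tool-call
// output were forwarded.
func hasVisibleOutput(c generateChunk) bool {
	return c.Response != "" || c.Thinking != "" || len(c.ToolCalls) > 0
}

// shouldForward models the behavior after the fix: a chunk is also kept
// when it carries logprobs, even if the parser is still buffering
// visible content.
func shouldForward(c generateChunk) bool {
	return hasVisibleOutput(c) || len(c.Logprobs) > 0
}

func main() {
	// A logprob-only chunk, as produced while a builtin parser buffers.
	buffered := generateChunk{Logprobs: []float64{-0.12}}
	fmt.Println(hasVisibleOutput(buffered)) // false: dropped under the old rule
	fmt.Println(shouldForward(buffered))    // true: kept under the new rule
}
```

A chunk with neither visible output nor logprobs is still skipped under both predicates, so the sketch only widens what gets forwarded.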
..
internal	docs: fix typos in repository documentation (#10683)	2025-11-15 20:22:29 -08:00
auth.go	server: reject unexpected auth hosts (#13738)	2026-01-16 14:10:36 -05:00
auth_test.go	server: reject unexpected auth hosts (#13738)	2026-01-16 14:10:36 -05:00
cloud_proxy.go	cloud_proxy: for the web_search legacy path, flush on newlines (#14897)	2026-03-17 13:30:17 -07:00
cloud_proxy_test.go	cloud_proxy: for the web_search legacy path, flush on newlines (#14897)	2026-03-17 13:30:17 -07:00
create.go	New models (#15861)	2026-04-28 11:50:12 -07:00
create_test.go	Clean up the manifest and modelpath (#13807)	2026-01-21 11:46:17 -08:00
download.go	Clean up the manifest and modelpath (#13807)	2026-01-21 11:46:17 -08:00
fixblobs.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
fixblobs_test.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
gemma4_test.go	gemma4: render differently based on model size	2026-04-15 14:37:16 -07:00
images.go	create: avoid gc race with create (#15628)	2026-04-16 13:29:16 -07:00
images_test.go	create: avoid gc race with create (#15628)	2026-04-16 13:29:16 -07:00
inference_request_log.go	add ability to turn on debug request logging (#14106)	2026-03-19 17:08:17 -07:00
laguna_quantization_test.go	New models (#15861)	2026-04-28 11:50:12 -07:00
logprob.go	logprob: add bytes to logprobs (#13068)	2025-11-13 13:49:25 -08:00
model.go	create: Clean up experimental paths, fix create from existing safetensor model (#14679)	2026-04-07 08:12:57 -07:00
model_resolver.go	Reapply "don't require pulling stubs for cloud models" again (#14608)	2026-03-06 14:27:47 -08:00
model_resolver_test.go	Reapply "don't require pulling stubs for cloud models" again (#14608)	2026-03-06 14:27:47 -08:00
prompt.go	gemma4: render differently based on model size	2026-04-15 14:37:16 -07:00
prompt_test.go	gemma4: render differently based on model size	2026-04-15 14:37:16 -07:00
quantization.go	New models (#15861)	2026-04-28 11:50:12 -07:00
quantization_test.go	New models (#15861)	2026-04-28 11:50:12 -07:00
renderer_resolution.go	gemma4: render differently based on model size	2026-04-15 14:37:16 -07:00
routes.go	New models (#15861)	2026-04-28 11:50:12 -07:00
routes_cloud_test.go	revert context length warnings change (#15121)	2026-03-28 16:43:59 -07:00
routes_create_test.go	New models (#15861)	2026-04-28 11:50:12 -07:00
routes_debug_test.go	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
routes_delete_test.go	Reapply "don't require pulling stubs for cloud models" again (#14608)	2026-03-06 14:27:47 -08:00
routes_generate_renderer_test.go	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
routes_generate_test.go	New models (#15861)	2026-04-28 11:50:12 -07:00
routes_harmony_streaming_test.go	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
routes_list_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
routes_options_test.go	server: use tiered VRAM-based default context length	2026-02-02 10:47:09 -08:00
routes_request_log_test.go	add ability to turn on debug request logging (#14106)	2026-03-19 17:08:17 -07:00
routes_test.go	modelfiles: fix /save command and add shortname for safetensors based models (#15413)	2026-04-08 21:05:39 -07:00
routes_web_experimental_test.go	cloud_proxy: send ollama client version (#14769)	2026-03-10 15:53:25 -07:00
sched.go	New models (#15861)	2026-04-28 11:50:12 -07:00
sched_test.go	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
sparse_common.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
sparse_windows.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
test_home_test.go	add ability to disable cloud (#14221)	2026-02-12 15:47:00 -08:00
upload.go	Clean up the manifest and modelpath (#13807)	2026-01-21 11:46:17 -08:00