ollama/x
Daniel Hiltgen 1e1b34dada
mlx: refined model push behavior (#15431)
* mlx: refined model push behavior

Refine the algorithm for parallel push of safetensors-based models to get
better reliability and throughput.

* review comments, hardening, and performance tuning for slow links

* review comments
2026-05-08 14:25:30 -07:00
agent | x/cmd: enable web search and web fetch with flag (#13690) | 2026-01-12 13:59:40 -08:00
cmd | Reapply "don't require pulling stubs for cloud models" again (#14608) | 2026-03-06 14:27:47 -08:00
create | mlx: Gemma4 MTP speculative decoding (#15980) | 2026-05-05 08:55:04 -07:00
imagegen | mlx: partial cleanup of imagegen layout (#15435) | 2026-05-05 14:15:30 -07:00
internal/mlxthread | Update MLX and MLX-C with threading fixes (#15845) | 2026-05-03 10:03:14 -07:00
mlxrunner | mlx: Gemma4 MTP speculative decoding (#15980) | 2026-05-05 08:55:04 -07:00
models | mlx: Gemma4 MTP speculative decoding (#15980) | 2026-05-05 08:55:04 -07:00
safetensors | mlx: Support NVIDIA TensorRT Model Optimizer import (#15566) | 2026-04-27 18:28:10 -07:00
server | mlx: Support NVIDIA TensorRT Model Optimizer import (#15566) | 2026-04-27 18:28:10 -07:00
tokenizer | New models (#15861) | 2026-04-28 11:50:12 -07:00
tools | add ability to disable cloud (#14221) | 2026-02-12 15:47:00 -08:00
transfer | mlx: refined model push behavior (#15431) | 2026-05-08 14:25:30 -07:00