Mirror of https://github.com/ollama/ollama.git, synced 2026-07-05 15:27:25 +00:00
mlx: rework the MLX sampler

Replace the MLX sampler transform chain with an explicit distribution pipeline that applies:

1. penalties
2. top-k
3. temperature/softmax
4. top-p
5. min-p
6. normalize
7. categorical

The common top_k path now keeps sparse [B,K] token ids/probabilities on GPU instead of carrying full-vocab scores, and sampled MTP reuses those draft/target distributions for acceptance, bonus, and residual sampling.

This change also fixes the seed parameter so that temperature sampling and sampled MTP are reproducible.

| Name |
|---|
| agent |
| cmd |
| create |
| imagegen |
| internal/mlxthread |
| mlxrunner |
| models |
| safetensors |
| server |
| tokenizer |
| tools |
| transfer |
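
The stage order named in the commit message above can be illustrated with a small, CPU-only sketch. This is not the repository's MLX code: the function names (`sparse_distribution`, `sample_token`), the parameter names, the specific repetition-penalty rule, and the choice to measure min-p against the largest surviving probability are all assumptions made for illustration. It handles a single sequence rather than a [B,K] batch, and it requires `temperature > 0`.

```python
import math
import random


def sparse_distribution(logits, history=(), repeat_penalty=1.0, top_k=0,
                        temperature=1.0, top_p=1.0, min_p=0.0):
    """Return (token_ids, probs) for one sequence. Hypothetical sketch only."""
    if temperature <= 0:
        raise ValueError("this sketch assumes temperature > 0")

    # 1. penalties: one common repetition-penalty rule (an assumption here).
    scores = list(logits)
    for t in set(history):
        if scores[t] > 0:
            scores[t] /= repeat_penalty
        else:
            scores[t] *= repeat_penalty

    # 2. top-k: keep only K token ids, sorted by descending score.
    ids = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if top_k > 0:
        ids = ids[:top_k]

    # 3. temperature/softmax over the surviving ids (max-subtracted for stability).
    peak = scores[ids[0]]
    weights = [math.exp((scores[i] - peak) / temperature) for i in ids]
    total = sum(weights)
    probs = [w / total for w in weights]

    # 4. top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    if top_p < 1.0:
        cumulative, keep = 0.0, 0
        for p in probs:
            cumulative += p
            keep += 1
            if cumulative >= top_p:
                break
        ids, probs = ids[:keep], probs[:keep]

    # 5. min-p: drop ids below min_p times the largest probability (assumed rule).
    if min_p > 0.0:
        threshold = min_p * probs[0]
        kept = [(i, p) for i, p in zip(ids, probs) if p >= threshold]
        ids = [i for i, _ in kept]
        probs = [p for _, p in kept]

    # 6. normalize so the surviving probabilities sum to 1.
    total = sum(probs)
    probs = [p / total for p in probs]
    return ids, probs


def sample_token(logits, rng, **kwargs):
    """7. categorical: draw one id from the sparse distribution."""
    ids, probs = sparse_distribution(logits, **kwargs)
    return rng.choices(ids, weights=probs, k=1)[0]
```

In this sketch, passing a seeded generator such as `random.Random(42)` as `rng` makes repeated runs draw the same token sequence, which is the kind of reproducibility the commit message attributes to the fixed seed parameter. Returning the `(ids, probs)` pair separately from the draw mirrors the idea of reusing a sparse distribution for later steps, though how the real sampler shares it with MTP is not shown here.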