ollama/x/mlxrunner/sample
Jesse Gross f93efe2809 mlxrunner: apply in-flight drafts to proposal penalty history
Sampler.Distribution built row i as if draftTokens[:i] had been appended, so
a single-row proposal call saw no draft history. The drafter therefore skipped
the repeat/presence penalties that the target's validation applies and
re-proposed penalized tokens. Align rows with the end of the draft chain
instead: the final row sees every draft token, and each earlier row sees one
fewer than the row after it.
2026-06-22 15:25:45 -07:00
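The row alignment described in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the code in sample.go: the helper name `penaltyHistories`, the `int32` token type, and the `rows` parameter are all assumptions made for the example. It only shows which tokens each row's penalty history would contain when rows are aligned with the end of the draft chain.

```go
package main

// penaltyHistories is a hypothetical helper (not part of mlxrunner). For each
// of `rows` output rows it returns the token history that the row's
// repeat/presence penalties would see: the committed history followed by a
// prefix of the in-flight draft tokens.
func penaltyHistories(history, draftTokens []int32, rows int) [][]int32 {
	out := make([][]int32, rows)
	for i := 0; i < rows; i++ {
		// Align rows with the END of the draft chain: the final row
		// (i == rows-1) sees every draft token, each earlier row one fewer.
		// The behavior described as buggy would use n = i here instead,
		// which gives a single-row call (rows == 1) no draft tokens.
		n := len(draftTokens) - (rows - 1 - i)
		if n < 0 {
			n = 0 // more rows than draft tokens: earliest rows see none
		}
		h := make([]int32, 0, len(history)+n)
		h = append(h, history...)
		h = append(h, draftTokens[:n]...)
		out[i] = h
	}
	return out
}
```

Under these assumptions, a single-row call with history `{1, 2}` and drafts `{7, 8}` yields one row containing `{1, 2, 7, 8}`, so the drafter is penalized for tokens it has already proposed. A three-row call with the same inputs yields `{1, 2}`, `{1, 2, 7}`, and `{1, 2, 7, 8}`.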
logprob_test.go mlxrunner: batch the sampler across multiple sequences 2026-04-25 09:53:53 -07:00
sample.go mlxrunner: apply in-flight drafts to proposal penalty history 2026-06-22 15:25:45 -07:00
sample_test.go mlxrunner: apply in-flight drafts to proposal penalty history 2026-06-22 15:25:45 -07:00