ollama

mirror of https://github.com/ollama/ollama.git synced 2026-07-07 08:11:40 +00:00

Patrick Devine 964ea42c09 mlx: x/create rewrite (#16919)	2026-07-03 18:30:45 -07:00
create.go	mlx: x/create rewrite (#16919)	2026-07-03 18:30:45 -07:00
create_test.go	mlx: x/create rewrite (#16919)	2026-07-03 18:30:45 -07:00

This is a rewrite of the create functionality for the MLX engine.

The core idea behind the create functionality is to break the import/convert process into a pipeline of distinct phases:

* Read (scan the safetensors directory for the various bits of metadata)
* Classify (determine the import type)
* Plan (determine any transforms that need to be done)
* Write (transform any data as necessary and write out the blobs)
* Create the manifest
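
The phases above can be pictured as an ordered list of steps that share one piece of state. The following is a minimal sketch under that assumption, not the code in create.go: `importState`, `phase`, `runPipeline`, and `defaultPhases` are hypothetical names, and the phase bodies are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// importState is a hypothetical carrier for what each phase produces.
type importState struct {
	Dir      string   // safetensors directory being imported
	Kind     string   // filled in by the classify phase
	Steps    []string // transforms chosen by the plan phase
	Blobs    []string // blob names produced by the write phase
	Manifest string   // filled in by the final phase
}

// phase is one named step of the pipeline.
type phase struct {
	name string
	run  func(*importState) error
}

// runPipeline executes the phases in order and stops at the first error,
// wrapping it with the name of the phase that failed.
func runPipeline(st *importState, phases []phase) error {
	for _, p := range phases {
		if err := p.run(st); err != nil {
			return fmt.Errorf("%s: %w", p.name, err)
		}
	}
	return nil
}

// defaultPhases returns placeholder phases in the order listed above.
func defaultPhases() []phase {
	return []phase{
		{"read", func(st *importState) error {
			if st.Dir == "" {
				return errors.New("no directory given")
			}
			return nil
		}},
		{"classify", func(st *importState) error { st.Kind = "bf16"; return nil }},
		{"plan", func(st *importState) error { st.Steps = []string{"copy"}; return nil }},
		{"write", func(st *importState) error { st.Blobs = []string{"blob-0"}; return nil }},
		{"manifest", func(st *importState) error {
			st.Manifest = fmt.Sprintf("%s:%d blobs", st.Kind, len(st.Blobs))
			return nil
		}},
	}
}
```

Because each phase only reads and writes the shared state, a failure is reported with the phase name attached, and later phases never run on incomplete input.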

Each architecture has a "policy" which determines how to convert the model correctly. Several safetensors formats are supported, including:

* nvfp4 (two formats: model optimized, torch)
* fp8 datatypes (convert to mxfp8)
* standard bf16-based weights
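
One way to picture a per-architecture policy is a registry keyed by architecture name that maps each source format to the format written out. This is a hedged sketch only: `policy`, `policies`, `lookupTarget`, and the `example-arch` entry are hypothetical. The fp8 to mxfp8 mapping comes from the list above, while the pass-through entries for bf16 and nvfp4 are assumptions.

```go
package main

import "fmt"

// policy is a hypothetical per-architecture conversion policy.
type policy struct {
	// targetFor maps a source weight format to the format written out.
	targetFor map[string]string
}

// policies is a hypothetical registry keyed by architecture name.
var policies = map[string]policy{
	"example-arch": {targetFor: map[string]string{
		"fp8":   "mxfp8", // stated above: fp8 datatypes are converted to mxfp8
		"bf16":  "bf16",  // assumption: kept as-is
		"nvfp4": "nvfp4", // assumption: kept as-is
	}},
}

// lookupTarget returns the output format for an architecture and source format.
func lookupTarget(arch, source string) (string, error) {
	p, ok := policies[arch]
	if !ok {
		return "", fmt.Errorf("no policy for architecture %q", arch)
	}
	target, ok := p.targetFor[source]
	if !ok {
		return "", fmt.Errorf("architecture %q: unsupported source format %q", arch, source)
	}
	return target, nil
}
```

Keeping the decision in a lookup table means an unknown architecture or format fails early with a clear error instead of being converted incorrectly.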

A number of cleanups/simplifications have been made, including:

* using the baked-in tensor names instead of munging them into something else
* unified 3D expert tensors (instead of separate per-expert tensors)
* fewer unnecessary transforms to the various tensors in a model (keep a model as close to the source as possible)
* unified capability checking
* draft model handling (for MTP) is done on the same path
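
To illustrate the unified 3D expert tensor item above: per-expert matrices of identical shape can be stacked into a single row-major [experts, rows, cols] buffer. This is a sketch under assumed types, not the representation used in create.go. `tensor2D`, `tensor3D`, `stackExperts`, and `at` are hypothetical, and float32 stands in for whatever dtype the weights actually use.

```go
package main

import (
	"errors"
	"fmt"
)

// tensor2D is a row-major matrix holding one expert's weights.
type tensor2D struct {
	Rows, Cols int
	Data       []float32
}

// tensor3D is a row-major [Experts, Rows, Cols] tensor.
type tensor3D struct {
	Experts, Rows, Cols int
	Data                []float32
}

// stackExperts concatenates same-shaped per-expert matrices into one 3D
// tensor; expert i occupies Data[i*Rows*Cols : (i+1)*Rows*Cols].
func stackExperts(experts []tensor2D) (tensor3D, error) {
	if len(experts) == 0 {
		return tensor3D{}, errors.New("no expert tensors given")
	}
	rows, cols := experts[0].Rows, experts[0].Cols
	out := tensor3D{
		Experts: len(experts),
		Rows:    rows,
		Cols:    cols,
		Data:    make([]float32, 0, len(experts)*rows*cols),
	}
	for i, e := range experts {
		if e.Rows != rows || e.Cols != cols || len(e.Data) != rows*cols {
			return tensor3D{}, fmt.Errorf("expert %d: shape mismatch", i)
		}
		out.Data = append(out.Data, e.Data...)
	}
	return out, nil
}

// at returns the element at [e, r, c].
func (t tensor3D) at(e, r, c int) float32 {
	return t.Data[(e*t.Rows+r)*t.Cols+c]
}
```

With two 2x2 experts, for example, the second expert's element at row 0, column 1 sits at flat index 5 of the stacked buffer.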

Image generation has been intentionally removed.
