Mirror of https://github.com/ollama/ollama.git, synced 2026-07-04 23:02:07 +00:00
This is a rewrite of the create functionality for the MLX engine. The core idea is to break the import/convert process into a pipeline of distinct phases:

* Read (scan the safetensors directory for the various bits of metadata)
* Classify (determine the import type)
* Plan (determine any transforms that need to be done)
* Write (transform any data as necessary and write out the blobs)
* Create the manifest

Each architecture has a "policy" which determines how to convert the model correctly.

A number of different formats for safetensors are supported, including:

* nvfp4 (two formats: model optimized, torch)
* fp8 datatypes (converted to mxfp8)
* standard bf16-based weights

A number of cleanups/simplifications have been done, including:

* using the baked-in names for the tensors instead of munging them into something else
* unified 3D expert tensors (instead of separate per-expert tensors)
* fewer unnecessary transforms to the various tensors in a model (keeping a model as close to the source as possible)
* unified capability checking
* draft model handling (for MTP) is done on the same path

Image generation has been intentionally removed.
| Name |
|---|
| batch |
| cache |
| mlx |
| model |
| sample |
| cache.go |
| cache_test.go |
| cache_trie.go |
| cache_trie_test.go |
| client.go |
| imports.go |
| mtp.go |
| mtp_test.go |
| pipeline.go |
| runner.go |
| server.go |
| speculate.go |
| speculate_depth.go |
| speculate_depth_test.go |
| speculate_stats.go |
| status_memory.go |
| status_memory_test.go |
| utf8_buffer.go |
| utf8_buffer_test.go |