ollama/integration

Integration Tests

This directory contains integration tests that exercise Ollama end-to-end to verify its behavior.

By default, these tests are disabled, so go test ./... exercises only unit tests. To run the integration tests, you must pass the integration tag:

go test -tags=integration ./...

Some tests require additional tags, which allows scoped testing and keeps the duration reasonable. For example, testing a broad set of models requires -tags=integration,models and a longer timeout (about 60m or more, depending on the speed of your GPU). To view the current set of tag combinations, run:

find integration -type f | xargs grep "go:build"
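As a rough illustration of how the tags and timeout combine on the command line, the sketch below composes the command as text instead of running it. The helper name integration_cmd is hypothetical and not part of the repository; only the go test flags -tags and -timeout come from the Go toolchain, and the 60m value simply mirrors the estimate above.

```shell
# Hypothetical helper (not part of the repository): prints, but does not run,
# the go test command for an optional timeout and extra build tags.
# Usage: integration_cmd [timeout] [extra_tag...]
integration_cmd() {
    timeout="${1:-}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    # The integration tag is always required; extra tags are appended to it.
    tags="integration"
    for t in "$@"; do
        tags="$tags,$t"
    done
    if [ -n "$timeout" ]; then
        echo "go test -tags=$tags -timeout $timeout ./..."
    else
        echo "go test -tags=$tags ./..."
    fi
}

integration_cmd              # default tag only
integration_cmd 60m models   # broad model coverage with a longer timeout
```

Printing the command first makes it easy to check which tags are in effect before committing to a long run.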

The integration tests have two modes of operation:

  1. By default, on Unix systems, they start the server on a random port, run the tests, and then shut down the server. On Windows you must ALWAYS run the server on OLLAMA_HOST for the tests to work.
  2. If OLLAMA_TEST_EXISTING is set to a non-empty string, the tests run against an existing running server, which can be remote, depending on your OLLAMA_HOST environment variable.
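
The two modes above can be summarized with a small sketch that reports which one the current environment would select on a Unix system. The function name test_mode is hypothetical and only mirrors the documented rule (a non-empty OLLAMA_TEST_EXISTING selects the existing server); it is not how the test harness itself is implemented.

```shell
# Hypothetical helper (not part of the repository): reports which mode the
# documented rules select on Unix, based on the current environment.
test_mode() {
    if [ -n "${OLLAMA_TEST_EXISTING:-}" ]; then
        # Any non-empty value selects the existing server named by OLLAMA_HOST.
        echo "existing server (OLLAMA_HOST=${OLLAMA_HOST:-unset})"
    else
        # Unset or empty: the harness starts and stops its own server.
        echo "managed server on a random port"
    fi
}

test_mode
```

Running test_mode before go test is a quick way to confirm that a leftover OLLAMA_TEST_EXISTING value in the shell is not silently redirecting the tests to another server.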

Set OLLAMA_TEST_LOG_SERVER=1 to print the managed server log after each test run, even when the tests pass. This only applies when the integration test harness starts the server.

Important

Before running the tests locally without the "test existing" setting, compile ollama from the top of the source tree with go build . and, if applicable on your platform, build GPU support with cmake. The integration tests expect to find an ollama binary at the top of the tree.
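
A minimal preflight sketch for the requirement above, assuming a Unix system where the binary is named ollama (on Windows it would normally carry an .exe suffix). The function have_ollama_binary is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical preflight check (not part of the repository): succeeds only if
# an executable file named ollama exists in the given directory.
have_ollama_binary() {
    root="${1:-.}"
    [ -f "$root/ollama" ] && [ -x "$root/ollama" ]
}

# Run from the top of the source tree.
if have_ollama_binary .; then
    echo "found ./ollama"
else
    echo "no ollama binary at the top of the tree; run: go build ."
fi
```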

Testing a New Model

When implementing a new model architecture, use OLLAMA_TEST_MODEL to run the integration suite against your model.

# Build the binary first
go build .

# Run integration tests against it
OLLAMA_TEST_MODEL=mymodel go test -tags integration -v -count 1 -timeout 15m ./integration/