mirror of
https://github.com/danny-avila/LibreChat.git
synced 2026-06-26 01:16:24 +00:00
👷 ci: Add API runtime smoke (boot the production image) to docker-smoke (#13605)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
* 👷 ci: Add API runtime smoke (boot the production image) to docker-smoke

The docker-smoke workflow only built the `client-package-build` stage and never booted the runtime, so it couldn't catch the class of regression that recently took production down: the api tsdown bundle externalizes runtime deps that, after `npm ci --omit=dev`, were missing from the image (`Cannot find module 'get-stream'`).

- Add an `api-runtime-smoke` job that builds the real production image (final `api-build` stage, `npm ci --omit=dev`), then:
  1. loads the @librechat/api bundle's full require graph in the pruned image (deterministic, no DB) — fails on any missing/ESM-incompatible runtime dependency.
  2. boots the actual entrypoint and asserts no module-load crash (the server loads its require graph before connecting to Mongo, so this surfaces without a database).
- Expand triggers to include `packages/api/**`, `packages/data-schemas/**`, and `api/package.json` (previously a packages/api change only triggered this via a root lockfile change, and even then only built the client stage).
- Add gha build cache + concurrency cancellation to bound CI cost.

* 👷 ci: Address Codex review — boot smoke against real Mongo + crash detection

- Boot the production image against a real MongoDB container with the env the server needs, so the *entire* require graph loads. `api/db/connect.js` throws at module scope without `MONGO_URI` and is imported before models/services/routes, so the previous no-env boot exercised almost none of the legacy API graph. (Codex finding 2)
- Gate on `/health` returning 200 AND the container staying alive, failing on any container exit. A non-module startup crash (ReferenceError, SyntaxError, bad config) now fails the smoke instead of slipping past a missing-module grep. (Codex finding 3)
- Expand trigger from `api/package.json` to `api/**`, since the image copies the whole `api/` tree and runs `node server/index.js`. (Codex finding 1)

* 👷 ci: Address Codex round 2 — poll /readyz + cover all image inputs

- Poll /readyz instead of /health. /health returns 200 at app.listen, but initializeMCPs() and checkMigrations() run *after* listen and process.exit(1) on failure; /readyz only returns 200 once serverReady is set after those complete. So post-listen startup crashes now fail the smoke too. (finding A)
- Expand triggers to every source tree copied into the production image: client/**, config/**, skill/** (the final stage copies client/dist, config, and skill). (finding B)
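The trigger expansion described above can be approximated locally with a small shell check. This is only a sketch: `triggers_smoke` is a hypothetical helper, it covers only the path prefixes this commit message says were added (not the workflow's full `paths` list), and a shell `case` pattern is a rough stand-in for GitHub's `**` glob matching rather than an exact reimplementation.

```shell
# Hypothetical helper (not part of the workflow): succeed when a changed
# path falls under one of the trigger prefixes named in the commit message.
# In a `case` pattern, `*` also matches `/`, so `api/*` behaves roughly
# like the workflow's `api/**` filter.
triggers_smoke() {
  case "$1" in
    api/*|client/*|config/*|skill/*) return 0 ;;
    packages/api/*|packages/data-schemas/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: report which of a few sample paths would be covered.
for path in api/server/index.js packages/api/src/index.ts docs/README.md; do
  if triggers_smoke "$path"; then
    echo "covered:     $path"
  else
    echo "not covered: $path"
  fi
done
```

The prefixes end in `/*` so that a sibling directory whose name merely starts with `api` does not match.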
parent 2a956f143d
commit 0bd1a7350f
1 changed file with 92 additions and 0 deletions
.github/workflows/docker-smoke.yml (92 additions)
@@ -9,12 +9,22 @@ on:
      - 'Dockerfile.multi'
      - 'package.json'
      - 'package-lock.json'
      - 'api/**'
      - 'client/**'
      - 'config/**'
      - 'skill/**'
      - 'packages/api/**'
      - 'packages/client/**'
      - 'packages/data-provider/**'
      - 'packages/data-schemas/**'

permissions:
  contents: read

concurrency:
  group: docker-smoke-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  client-package-target:
    name: Build Docker client package target
@@ -34,3 +44,85 @@ jobs:
          platforms: linux/amd64
          push: false
          target: client-package-build

  api-runtime-smoke:
    name: API runtime smoke (production image boots)
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      # Build the real production image (final `api-build` stage), which installs
      # with `npm ci --omit=dev` — the same prune that, in prod, exposed runtime
      # dependencies the tsdown bundle externalizes but were never declared.
      - name: Build production image
        uses: docker/build-push-action@v5
        with:
          context: .
          file: Dockerfile.multi
          platforms: linux/amd64
          push: false
          load: true
          tags: librechat-api-smoke:ci
          cache-from: type=gha,scope=docker-smoke-api
          cache-to: type=gha,mode=max,scope=docker-smoke-api

      # Loads the entire externalized require graph of the built @librechat/api
      # bundle inside the pruned production image. A missing or ESM-incompatible
      # runtime dependency (e.g. the `get-stream` regression) fails here with a
      # non-zero exit — deterministically, with no database required.
      - name: Verify production image resolves all runtime modules
        run: |
          docker run --rm librechat-api-smoke:ci \
            node -e "require('@librechat/api'); require('@librechat/api/telemetry'); console.log('module resolution OK')"

      # Boot the real entrypoint against a real MongoDB so the *entire* server
      # require graph loads (api/db throws at module scope without MONGO_URI, and
      # is imported before models/services/routes), then gate on /readyz AND the
      # container staying alive. /readyz only returns 200 after the post-listen
      # startup (initializeMCPs + checkMigrations) sets serverReady, and those
      # steps process.exit(1) on failure — so ANY startup crash (missing module,
      # ReferenceError, bad config, post-listen failure) fails the smoke.
      - name: Boot production image against MongoDB and poll /readyz
        run: |
          set -u
          docker network create lc-smoke
          docker run -d --name lc-mongo --network lc-smoke mongo:8.0.20
          docker run -d --name lc-api --network lc-smoke -p 3080:3080 \
            -e HOST=0.0.0.0 -e PORT=3080 \
            -e NODE_ENV=production \
            -e MONGO_URI=mongodb://lc-mongo:27017/LibreChat \
            -e CREDS_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \
            -e CREDS_IV=0123456789abcdef0123456789abcdef \
            -e JWT_SECRET=docker-smoke-jwt-secret \
            -e JWT_REFRESH_SECRET=docker-smoke-jwt-refresh-secret \
            -e SEARCH=false \
            librechat-api-smoke:ci

          healthy=""
          for i in $(seq 1 60); do
            if [ "$(docker inspect -f '{{.State.Running}}' lc-api 2>/dev/null)" != "true" ]; then
              echo "::error::API container exited during startup (exit code $(docker inspect -f '{{.State.ExitCode}}' lc-api 2>/dev/null))"
              break
            fi
            if [ "$(curl -sS -o /dev/null -w '%{http_code}' http://localhost:3080/readyz 2>/dev/null || true)" = "200" ]; then
              healthy="yes"
              echo "/readyz returned 200 — server fully booted (post-listen startup complete)."
              break
            fi
            sleep 2
          done

          echo "----- last 100 lines of api container logs -----"
          docker logs lc-api 2>&1 | tail -100 || true
          echo "------------------------------------------------"
          docker rm -f lc-api lc-mongo >/dev/null 2>&1 || true
          docker network rm lc-smoke >/dev/null 2>&1 || true

          if [ -z "$healthy" ]; then
            echo "::error::Production image failed to reach a ready /readyz within timeout"
            exit 1
          fi
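The boot step gates on two conditions at once: the container must stay alive, and /readyz must return 200. As a minimal sketch of that pattern outside the workflow, the loop can be factored into a function that takes the two probes as arguments. `wait_ready`, `api_alive`, and `api_ready` are hypothetical names introduced here for illustration; the container name, port, and endpoint are the ones used in the workflow step.

```shell
# Hypothetical helper: wait_ready ATTEMPTS DELAY ALIVE_CMD READY_CMD
# Returns 0 once READY_CMD succeeds, 2 as soon as ALIVE_CMD fails
# (process died during startup), and 1 if all attempts are used up.
# ALIVE_CMD and READY_CMD are expected to be single command or function names.
wait_ready() {
  attempts=$1
  delay=$2
  alive_cmd=$3
  ready_cmd=$4
  i=1
  while [ "$i" -le "$attempts" ]; do
    if ! "$alive_cmd"; then
      echo "process exited during startup" >&2
      return 2
    fi
    if "$ready_cmd"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for readiness" >&2
  return 1
}

# Probes mirroring the workflow step (assumes a container named lc-api
# publishing port 3080, as in the step above).
api_alive() {
  [ "$(docker inspect -f '{{.State.Running}}' lc-api 2>/dev/null)" = "true" ]
}
api_ready() {
  [ "$(curl -sS -o /dev/null -w '%{http_code}' http://localhost:3080/readyz 2>/dev/null || true)" = "200" ]
}
```

With the containers running, `wait_ready 60 2 api_alive api_ready` would reproduce the same 60 attempts at 2-second intervals. Unlike the workflow step, which exits 1 in both failure cases, this sketch returns separate codes for an early exit and a timeout so a caller can tell them apart.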