- extract the token-config resolution (override gathering + cache lookup +
buildTokenConfigMap) into resolveTokenConfigMap in packages/api, leaving
the /api controller a thin request-scoped wrapper (CLAUDE.md TS rule)
- getConvoKey prefers the user message's real conversationId once the
`created` event stamps it, so a new chat's first-response live gauge and
totals land under the id TokenUsage subscribes to instead of NEW_CONVO
- carry the post-snapshot output estimate into the context snapshot at
finalize so the gauge keeps the last response after live resets
- accumulate per-rate billable units and price the session cost at
render, so usage events arriving before the token-config load still
count once it resolves
- pass user-scoped token-config cache keys through loadConfigModels
fetches and drop the controller's unscoped fallback to prevent serving
another user's resolved config
- tag emitted usage events with a per-run seq so resume dedupe never
drops a distinct call with an identical payload
- admit the static tokenConfig override in the custom endpoint schema so
it survives zod parsing into req.config