⚡ feat: Immediate Conversation Title Generation (#13395)

* ⚡ feat: Immediate Conversation Title Generation Generate conversation titles as soon as the request is made (in parallel with the response, from the user's first message) as the new default, fixing the #13318 race where a transient /gen_title 404 left new chats stuck on "New Chat". - Add per-endpoint `titleTiming` ('immediate' | 'final') to baseEndpointSchema; `endpoints.all` acts as the global default, unset = immediate. Resolve via a new `resolveTitleTiming` helper (`all` takes precedence). - Fire title generation in parallel with `sendMessage`; `titleConvo` waits (bounded, abortable) for the agent run and titles from the user input only. Persist after the conversation row exists; defer `disposeClient` until the title settles. - Expose `titleGenerationTiming` via startup config; `useTitleGeneration` fetches eagerly in immediate mode with a bounded 404 retry and never treats a transient 404 as final. Skip title queueing for temporary conversations. - Supersedes #13329 while incorporating its bounded 404-retry. * 🩹 fix: Address Copilot review findings on title timing - Guard against an undefined conversationId in addTitle (skip + warn) so the gen_title cache key can't collide as `userId-undefined` and saveConvo is never called without a conversationId. - Gate the title `useQueries` on `enabled` so no /gen_title request fires while unauthenticated (e.g. after logout) even if the module queue holds IDs. - Drop the stale `conversationId` param from the titleConvo JSDoc. - Add a regression test for the undefined-conversationId guard. * 🧵 fix: Harden immediate-title edge cases from codex review - Cancel in-flight immediate title generation when the request aborts: thread job.abortController.signal through addTitle so pressing Stop on a new chat neither consumes the title model nor surfaces a title for a cancelled turn. - Preserve a locally-applied title when the final SSE event's conversation carries no title yet (built before the title was saved), so long immediate-mode responses no longer revert the chat to "New Chat" until reload. - Guarantee one full post-completion gen_title fetch cycle before giving up, so a `final`-mode title (generated only after the stream ends) is still fetched under a global `immediate` default instead of being stranded. - Add regression tests for the abort propagation and the undefined-conversationId guard. * 🔁 fix: Correct title abort, post-completion refetch, and replacement ordering Follow-up to codex review of the immediate-title fixes: - Use a dedicated title AbortController instead of `job.abortController`. The latter is also aborted by `completeJob` on *successful* completion, which cancelled any title slower than a short response. The title is now cancelled only on a real user Stop or when the stream is replaced; a completed-then- aborted title is discarded (no save, cache cleared) rather than persisted. - Reset (not remove) the post-completion title query: `resetQueries` refetches the mounted observer with a fresh retry budget, whereas `removeQueries` left it stuck in its error state, so the promised post-completion cycle never ran. - Run the job-replacement check before resolving `convoReady`, and on a replaced stream cancel/discard the stale title so a discarded prompt can't persist a title. * 🧷 fix: Tighten title abort ordering and endpoint-level timing resolution Follow-up to codex review: - Abort the title controller before resolving `convoReady` on a stopped turn, so the title task can't resume and persist before the later abort. - Cancel the title and unblock its waits on ANY send failure (not just user aborts): a preflight/quota failure before the run exists otherwise hangs `_waitForRun`, deferring client disposal until the 45s title timeout. - Resolve `titleTiming` for custom endpoints via `getCustomEndpointConfig` (their config lives under `endpoints.custom[]`, not `endpoints[endpoint]`). - Derive the startup `titleGenerationTiming` via `resolveTitleTiming` for the agents endpoint so an endpoint-level `final` (without `endpoints.all`) is honored client-side instead of defaulting to immediate and burning eager gen_title polls. * 🪢 fix: Per-agent title timing and safer abort/replacement handling Follow-up to codex review: - Resolve `titleTiming` from the agent's actual endpoint after initialization, so a per-endpoint `final` override on a custom/provider endpoint backing an (ephemeral) agent is honored instead of always using the `agents` endpoint's value. - Don't preserve a locally-fetched title on a stopped (unfinished) turn: the server cancels and discards that title, so keeping it client-side would diverge from server state and leave the stopped chat titled until reload. - On abort/replacement, only delete the cached title if it still holds THIS task's value — a replacement stream shares the `userId-conversationId` key and may have already cached its own valid title that must not be removed. * 🪞 fix: Mirror AgentClient title-config resolution for titleTiming Per maintainer guidance, keep titleTiming resolution identical to how `AgentClient#titleConvo` already resolves the endpoint config — `endpoints.all` is the intended global override and the agent's actual provider endpoint is used: - Resolve via `endpoints.all ?? endpoints[endpoint] ?? getProviderConfig(endpoint) .customEndpointConfig` (was using `getCustomEndpointConfig` directly). Going through `getProviderConfig` picks up its case-insensitive fallback for normalized provider names (e.g. `openrouter` → `OpenRouter`), so a custom endpoint's `titleTiming` is honored like its other title settings. - Add `titleTiming` to the Azure endpoint schema `.pick()` so `endpoints.azureOpenAI.titleTiming` is no longer silently stripped by Zod. Note: per-endpoint title settings being skipped when `endpoints.all` is present is the existing, intended global-override behavior — not changed here. * 🧪 test: Cover useTitleGeneration effect logic (integration) Adds a deterministic white-box integration test that drives the real hook's React effects with a controllable react-query surface, locking down the stateful decisions that previously had no coverage: - immediate mode fetches a queued conversation while its stream is still active - final mode gates until the stream completes, then becomes eligible - success applies the fetched title to the conversation caches - a 404 while active defers (removeQueries) instead of giving up - a 404 after completion forces a fresh fetch via resetQueries (post-completion remount) * feat: Stream immediate title events * style: Format title SSE handler * test: Preserve data-provider exports in OAuth mock * test: Isolate OAuth route API mock * test: Keep OAuth callback factory capture * fix: Replay streamed title events on resume * fix: Honor agents title timing precedence * style: Format title timing fixes
2026-06-09 17:31:19 +00:00 · 2026-06-02 16:40:57 -04:00 · 2026-06-02 16:40:57 -04:00 · 2ef7bdfbc2
commit 2ef7bdfbc2
parent b45e4aeae5
22 changed files with 1437 additions and 52 deletions
--- a/api/server/controllers/agents/client.js
+++ b/api/server/controllers/agents/client.js
@ -82,6 +82,14 @@ class AgentClient extends BaseClient {
    /** @type {AgentRun} */
    this.run;

+    /** Resolves with the agent run once `chatCompletion` initializes it (or
+     *  `null` if initialization fails), letting immediate-mode title generation
+     *  await the run instead of throwing when fired before the run exists.
+     *  @type {Promise<AgentRun | null> | null} */
+    this._runReady = null;
+    /** @type {((run: AgentRun | null) => void) | null} */
+    this._resolveRun = null;
+
    const {
      agentConfigs,
      contentParts,
@ -1039,6 +1047,10 @@ class AgentClient extends BaseClient {
        }

        this.run = run;
+        if (this._resolveRun) {
+          this._resolveRun(run);
+          this._resolveRun = null;
+        }

        const streamId = this.options.req?._resumableStreamId;
        if (streamId && run.Graph) {
@ -1170,6 +1182,10 @@ class AgentClient extends BaseClient {
          err,
        );
      }
+      if (this._resolveRun) {
+        this._resolveRun(this.run ?? null);
+        this._resolveRun = null;
+      }
      run = null;
      config = null;
      memoryPromise = null;
@ -1177,14 +1193,58 @@ class AgentClient extends BaseClient {
  }

  /**
-   *
+   * Resolves with the agent run once it is initialized, or `null` if
+   * initialization fails. Lets immediate-mode title generation await the run
+   * instead of throwing when fired before `chatCompletion` assigns `this.run`.
+   * Rejects promptly if the provided signal aborts before the run is ready.
+   * @param {AbortSignal} [signal]
+   * @returns {Promise<AgentRun | null>}
+   */
+  _waitForRun(signal) {
+    if (this.run) {
+      return Promise.resolve(this.run);
+    }
+    if (!this._runReady) {
+      this._runReady = new Promise((resolve) => {
+        this._resolveRun = resolve;
+      });
+    }
+    if (!signal) {
+      return this._runReady;
+    }
+    if (signal.aborted) {
+      return Promise.reject(new Error('Aborted before run initialization'));
+    }
+    return new Promise((resolve, reject) => {
+      const onAbort = () => reject(new Error('Aborted before run initialization'));
+      signal.addEventListener('abort', onAbort, { once: true });
+      this._runReady.then((run) => {
+        signal.removeEventListener('abort', onAbort);
+        resolve(run);
+      });
+    });
+  }
+
+  /**
   * @param {Object} params
   * @param {string} params.text
-   * @param {string} params.conversationId
+   * @param {AbortController} params.abortController
+   * @param {boolean} [params.immediate] When true, the title is generated as soon
+   *   as the request is made — the run is awaited (instead of throwing) and the
+   *   title derives from the user's input only (`contentParts` is empty).
   */
-  async titleConvo({ text, abortController }) {
+  async titleConvo({ text, abortController, immediate = false }) {
    if (!this.run) {
-      throw new Error('Run not initialized');
+      if (!immediate) {
+        throw new Error('Run not initialized');
+      }
+      await this._waitForRun(abortController?.signal);
+      if (!this.run) {
+        logger.debug(
+          '[api/server/controllers/agents/client.js #titleConvo] Run unavailable for immediate title generation',
+        );
+        return;
+      }
    }
    const { handleLLMEnd, collected: collectedMetadata } = createMetadataAggregator();
    const { req, agent } = this.options;
@ -1324,7 +1384,7 @@ class AgentClient extends BaseClient {
        provider,
        clientOptions,
        inputText: text,
-        contentParts: this.contentParts,
+        contentParts: immediate ? [] : this.contentParts,
        titleMethod: endpointConfig?.titleMethod,
        titlePrompt: endpointConfig?.titlePrompt,
        titlePromptTemplate: endpointConfig?.titlePromptTemplate,
--- a/api/server/controllers/agents/client.test.js
+++ b/api/server/controllers/agents/client.test.js
@ -128,6 +128,52 @@ describe('AgentClient - titleConvo', () => {
      ).rejects.toThrow('Run not initialized');
    });

+    it('waits for the run in immediate mode instead of throwing', async () => {
+      client.run = null;
+      const abortController = new AbortController();
+
+      const titlePromise = client.titleConvo({ text: 'Test', abortController, immediate: true });
+
+      // Simulate `chatCompletion` assigning the run (client.js: `this.run = run`).
+      client.run = mockRun;
+      client._resolveRun(mockRun);
+
+      await titlePromise;
+      expect(mockRun.generateTitle).toHaveBeenCalled();
+    });
+
+    it('passes empty contentParts in immediate mode (title from the user input only)', async () => {
+      client.contentParts = [{ type: 'text', text: 'Streaming response so far' }];
+      const abortController = new AbortController();
+
+      await client.titleConvo({ text: 'Hello there', abortController, immediate: true });
+
+      const call = mockRun.generateTitle.mock.calls[0][0];
+      expect(call.contentParts).toEqual([]);
+      expect(call.inputText).toBe('Hello there');
+    });
+
+    it('uses live contentParts in non-immediate (final) mode', async () => {
+      client.contentParts = [{ type: 'text', text: 'Full response' }];
+      const abortController = new AbortController();
+
+      await client.titleConvo({ text: 'Hello there', abortController });
+
+      const call = mockRun.generateTitle.mock.calls[0][0];
+      expect(call.contentParts).toEqual([{ type: 'text', text: 'Full response' }]);
+    });
+
+    it('rejects promptly when aborted before the run initializes in immediate mode', async () => {
+      client.run = null;
+      const abortController = new AbortController();
+      abortController.abort();
+
+      await expect(
+        client.titleConvo({ text: 'Test', abortController, immediate: true }),
+      ).rejects.toThrow('Aborted before run initialization');
+      expect(mockRun.generateTitle).not.toHaveBeenCalled();
+    });
+
    it('should use titlePrompt from endpoint config', async () => {
      const text = 'Test conversation text';
      const abortController = new AbortController();
--- a/api/server/controllers/agents/request.js
+++ b/api/server/controllers/agents/request.js
@ -4,6 +4,7 @@ const {
  sendEvent,
  getViolationInfo,
  buildMessageFiles,
+  resolveTitleTiming,
  GenerationJobManager,
  decrementPendingRequest,
  sanitizeMessageForTransmit,
@ -93,6 +94,12 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit

  const userId = req.user.id;

+  /** When to generate the conversation title. `immediate` (default) fires title
+   *  generation in parallel with the response, from the user's first message;
+   *  `final` defers it until the full response completes (legacy behavior).
+   *  Resolved from the agent's actual endpoint once the client is initialized. */
+  let titleTiming = 'immediate';
+
  const { allowed, pendingRequests, limit } = await checkAndIncrementPendingRequest(userId);
  if (!allowed) {
    const violationInfo = getViolationInfo(pendingRequests, limit);
@ -213,6 +220,13 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit

    client = result.client;

+    // Resolve title timing from the public agents endpoint first, then fall
+    // back to the agent's actual backing provider/custom endpoint.
+    titleTiming = resolveTitleTiming({
+      appConfig: req.config,
+      endpoint: [endpointOption?.endpoint, client?.options?.agent?.endpoint],
+    });
+
    if (client?.sender) {
      GenerationJobManager.updateMetadata(streamId, { sender: client.sender });
    }
@ -243,6 +257,56 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
        );
      }

+      /** Immediate-mode title generation runs in parallel with the response, so
+       *  the conversation row may not exist when the title resolves. `convoReady`
+       *  resolves once the response (and thus the conversation) has been saved,
+       *  gating the title's `saveConvo`. Declared here so both the success tail
+       *  and the catch block can settle it and gate `disposeClient` on the title. */
+      let immediateTitlePromise = null;
+      let titleEventPromise = null;
+      let acceptsTitleEvents = true;
+      let resolveConvoReady;
+      const convoReady = new Promise((resolve) => {
+        resolveConvoReady = resolve;
+      });
+      /** Dedicated controller so a user Stop (or a replaced stream) cancels the
+       *  in-flight title — kept separate from `job.abortController`, which
+       *  `completeJob` also aborts on *successful* completion and would otherwise
+       *  cancel a title that is merely slower than a short response. */
+      const titleAbortController = new AbortController();
+      const abortTitleOnJobAbort = () => titleAbortController.abort();
+      if (job.abortController.signal.aborted) {
+        titleAbortController.abort();
+      } else {
+        job.abortController.signal.addEventListener('abort', abortTitleOnJobAbort, { once: true });
+      }
+      const titleEligible =
+        addTitle && parentMessageId === Constants.NO_PARENT && isNewConvo && !req.body?.isTemporary;
+      const emitTitleEvent = ({ conversationId: titleConversationId, title }) => {
+        titleEventPromise = (async () => {
+          if (!acceptsTitleEvents || titleAbortController.signal.aborted) {
+            return;
+          }
+          const currentJob = await GenerationJobManager.getJob(streamId);
+          if (!currentJob || currentJob.createdAt !== jobCreatedAt) {
+            return;
+          }
+          if (titleAbortController.signal.aborted) {
+            return;
+          }
+          await GenerationJobManager.emitChunk(streamId, {
+            event: 'title',
+            data: {
+              conversationId: titleConversationId,
+              title,
+            },
+          });
+        })().catch((err) => {
+          logger.error('[ResumableAgentController] Error emitting title event', err);
+        });
+        return titleEventPromise;
+      };
+
      try {
        const onStart = (userMsg, respMsgId, _isNewConvo) => {
          userMessage = userMsg;
@ -289,7 +353,23 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
          },
        };

-        const response = await client.sendMessage(text, messageOptions);
+        const sendPromise = client.sendMessage(text, messageOptions);
+
+        if (titleEligible && titleTiming === 'immediate') {
+          immediateTitlePromise = addTitle(req, {
+            text,
+            conversationId,
+            client,
+            immediate: true,
+            convoReady,
+            signal: titleAbortController.signal,
+            onTitleGenerated: emitTitleEvent,
+          }).catch((err) => {
+            logger.error('[ResumableAgentController] Error in immediate title generation', err);
+          });
+        }
+
+        const response = await sendPromise;

        const messageId = response.messageId;
        const endpoint = endpointOption.endpoint;
@ -355,11 +435,45 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
            originalCreatedAt: jobCreatedAt,
            currentCreatedAt: currentJob?.createdAt,
          });
+          // Discard the stale title from this replaced stream: cancel it and
+          // unblock its persistence wait without letting it save (the newer job
+          // owns the conversation now).
+          titleAbortController.abort();
+          job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
+          acceptsTitleEvents = false;
+          resolveConvoReady();
          // Still decrement pending request since we incremented at start
          await decrementPendingRequest(userId);
+          if (immediateTitlePromise) {
+            immediateTitlePromise.finally(() => {
+              if (client) {
+                disposeClient(client);
+              }
+            });
+          } else if (client) {
+            disposeClient(client);
+          }
          return;
        }

+        // If the user stopped this turn, cancel the title BEFORE unblocking its
+        // persistence wait — otherwise resolving `convoReady` lets the title task
+        // resume and save before the later abort runs.
+        if (wasAbortedBeforeComplete) {
+          titleAbortController.abort();
+        } else {
+          job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
+        }
+
+        // The conversation row now exists and this stream is authoritative; allow
+        // any in-flight immediate title generation to persist (saveConvo uses noUpsert).
+        resolveConvoReady();
+        acceptsTitleEvents = false;
+
+        if (titleEventPromise) {
+          await titleEventPromise;
+        }
+
        if (!wasAbortedBeforeComplete) {
          const finalEvent = {
            final: true,
@ -402,7 +516,20 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
          await decrementPendingRequest(userId);
        }

-        if (shouldGenerateTitle) {
+        if (titleTiming === 'immediate') {
+          // Title was fired in parallel above (if eligible); a stopped turn already
+          // aborted it before `resolveConvoReady`. Defer disposal until it settles
+          // so the run/req aren't torn down mid-generation.
+          if (immediateTitlePromise) {
+            immediateTitlePromise.finally(() => {
+              if (client) {
+                disposeClient(client);
+              }
+            });
+          } else if (client) {
+            disposeClient(client);
+          }
+        } else if (shouldGenerateTitle) {
          addTitle(req, {
            text,
            response: { ...response },
@ -422,6 +549,15 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
          }
        }
      } catch (error) {
+        // Any failure (user Stop, or a preflight/quota failure before the run is
+        // even created) must cancel the title and unblock its waits: the title's
+        // `_waitForRun` would otherwise never resolve, deferring client disposal
+        // until the 45s title timeout, and no title should persist for a failed turn.
+        titleAbortController.abort();
+        job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
+        acceptsTitleEvents = false;
+        resolveConvoReady();
+
        // Check if this was an abort (not a real error)
        const wasAborted = job.abortController.signal.aborted || error.message?.includes('abort');

@ -436,7 +572,14 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit

        await decrementPendingRequest(userId);

-        if (client) {
+        // Defer disposal until any immediate title settles (it holds the run/req).
+        if (immediateTitlePromise) {
+          immediateTitlePromise.finally(() => {
+            if (client) {
+              disposeClient(client);
+            }
+          });
+        } else if (client) {
          disposeClient(client);
        }

--- a/api/server/routes/config.js
+++ b/api/server/routes/config.js
@ -4,9 +4,10 @@ const {
  getBalanceConfig,
  getCloudFrontConfig,
  resolveBuildInfo,
+  resolveTitleTiming,
  sanitizeModelSpecs,
 } = require('@librechat/api');
-const { defaultSocialLogins } = require('librechat-data-provider');
+const { EModelEndpoint, defaultSocialLogins } = require('librechat-data-provider');
 const { logger, getTenantId, SystemCapabilities } = require('@librechat/data-schemas');
 const { hasCapability } = require('~/server/middleware/roles/capabilities');
 const { getLdapConfig } = require('~/server/services/Config/ldap');
@ -258,6 +259,10 @@ router.get('/', async function (req, res) {
      ...buildPostLoginPayload(),
      socialLogins: appConfig?.registration?.socialLogins ?? defaultSocialLogins,
      interface: appConfig?.interfaceConfig,
+      titleGenerationTiming: resolveTitleTiming({
+        appConfig,
+        endpoint: EModelEndpoint.agents,
+      }),
      turnstile: appConfig?.turnstileConfig,
      modelSpecs: sanitizeModelSpecs(appConfig?.modelSpecs),
      balance: balanceConfig,
--- a/api/server/routes/oauth.test.js
+++ b/api/server/routes/oauth.test.js
@ -69,17 +69,13 @@ jest.mock('librechat-data-provider', () => ({
  },
 }));

-jest.mock(
-  '@librechat/api',
-  () => ({
-    buildOAuthFailureLog: (...args) => mockBuildOAuthFailureLog(...args),
-    createOpenIDCallbackAuthenticator: (...args) => mockCreateOpenIDCallbackAuthenticator(...args),
-    createSetBalanceConfig: jest.fn(() => (_req, _res, next) => next()),
-    getOAuthFailureMessage: (...args) => mockGetOAuthFailureMessage(...args),
-    redirectToAuthFailure: (...args) => mockRedirectToAuthFailure(...args),
-  }),
-  { virtual: true },
-);
+jest.mock('@librechat/api', () => ({
+  buildOAuthFailureLog: (...args) => mockBuildOAuthFailureLog(...args),
+  createOpenIDCallbackAuthenticator: (...args) => mockCreateOpenIDCallbackAuthenticator(...args),
+  createSetBalanceConfig: jest.fn(() => (_req, _res, next) => next()),
+  getOAuthFailureMessage: (...args) => mockGetOAuthFailureMessage(...args),
+  redirectToAuthFailure: (...args) => mockRedirectToAuthFailure(...args),
+}));

 jest.mock('~/server/middleware', () => ({
  checkDomainAllowed: jest.fn((_req, _res, next) => next()),
--- a/api/server/services/Endpoints/agents/title.js
+++ b/api/server/services/Endpoints/agents/title.js
@ -5,9 +5,40 @@ const getLogStores = require('~/cache/getLogStores');
 const { saveConvo } = require('~/models');

 /**
- * Add title to conversation in a way that avoids memory retention
+ * Add title to conversation in a way that avoids memory retention.
+ *
+ * @param {ServerRequest} req
+ * @param {Object} params
+ * @param {string} params.text - The user's first message.
+ * @param {TMessage} [params.response] - The assistant response (legacy/`final` timing only).
+ * @param {AgentClient} params.client
+ * @param {string} [params.conversationId] - Required for `immediate` timing, where
+ *   `response` is not yet available; falls back to `response.conversationId`.
+ * @param {boolean} [params.immediate] - When true, the title is generated in parallel
+ *   with the response (from the user's first message) and persisted to the conversation
+ *   only after `convoReady` resolves (the conversation row must exist for `noUpsert`).
+ * @param {Promise<void>} [params.convoReady] - Resolves once the conversation has been
+ *   persisted; awaited before saving the title in `immediate` mode.
+ * @param {AbortSignal} [params.signal] - When aborted (e.g. the user stops an
+ *   immediate-mode generation), cancels the in-flight title model call so a
+ *   cancelled turn neither consumes the title model nor surfaces a title.
+ * @param {(params: { conversationId: string, title: string }) => Promise<void>|void} [params.onTitleGenerated]
+ *   Called after the title is cached and before persistence waits for the
+ *   conversation row. Used by live streams to push the title immediately.
 */
-const addTitle = async (req, { text, response, client }) => {
+const addTitle = async (
+  req,
+  {
+    text,
+    response,
+    client,
+    conversationId,
+    immediate = false,
+    convoReady,
+    signal,
+    onTitleGenerated,
+  },
+) => {
  const { TITLE_CONVO = true } = process.env ?? {};
  if (!isEnabled(TITLE_CONVO)) {
    return;
@ -22,8 +53,14 @@ const addTitle = async (req, { text, response, client }) => {
    return;
  }

+  const convoId = conversationId ?? response?.conversationId;
+  if (!convoId) {
+    logger.warn('[addTitle] Missing conversationId; skipping title generation');
+    return;
+  }
+
  const titleCache = getLogStores(CacheKeys.GEN_TITLE);
-  const key = `${req.user.id}-${response.conversationId}`;
+  const key = `${req.user.id}-${convoId}`;
  /** @type {NodeJS.Timeout} */
  let timeoutId;
  try {
@ -35,12 +72,22 @@ const addTitle = async (req, { text, response, client }) => {

    let titlePromise;
    let abortController = new AbortController();
+    /** Propagate a request abort (Stop) to the title generation so a cancelled
+     *  turn does not consume the title model or surface a title. */
+    if (signal) {
+      if (signal.aborted) {
+        abortController.abort();
+      } else {
+        signal.addEventListener('abort', () => abortController.abort(), { once: true });
+      }
+    }
    if (client && typeof client.titleConvo === 'function') {
      titlePromise = Promise.race([
        client
          .titleConvo({
            text,
            abortController,
+            immediate,
          })
          .catch((error) => {
            logger.error('Client title error:', error);
@ -65,6 +112,37 @@ const addTitle = async (req, { text, response, client }) => {
    }

    await titleCache.set(key, title, 120000);
+
+    if (!signal?.aborted && typeof onTitleGenerated === 'function') {
+      try {
+        await onTitleGenerated({ conversationId: convoId, title });
+      } catch (error) {
+        logger.error('Error emitting generated title:', error);
+      }
+    }
+
+    /** In immediate mode the title is generated in parallel with the response,
+     *  so the conversation row may not exist yet. `saveConvo` with `noUpsert`
+     *  is a silent no-op when the row is missing, which would drop the title
+     *  from the database (the cache above still serves the live UI). Wait for
+     *  the controller to signal the conversation has been persisted. */
+    if (convoReady) {
+      await convoReady;
+    }
+
+    if (signal?.aborted) {
+      // The turn was stopped, or this stream was replaced, after the title had
+      // already been generated — discard it instead of persisting a title for a
+      // cancelled/discarded response. Only clear the cache if it still holds THIS
+      // task's title: a replacement stream shares the `userId-conversationId` key
+      // and may have already cached its own (valid) title that we must not remove.
+      const cached = await titleCache.get(key);
+      if (cached === title) {
+        await titleCache.delete(key);
+      }
+      return;
+    }
+
    await saveConvo(
      {
        userId: req?.user?.id,
@ -72,7 +150,7 @@ const addTitle = async (req, { text, response, client }) => {
        interfaceConfig: req?.config?.interfaceConfig,
      },
      {
-        conversationId: response.conversationId,
+        conversationId: convoId,
        title,
      },
      { context: 'api/server/services/Endpoints/agents/title.js', noUpsert: true },
--- a/api/server/services/Endpoints/agents/title.test.js
+++ b/api/server/services/Endpoints/agents/title.test.js
@ -0,0 +1,258 @@
+/** Backing store so `get` reflects prior `set`/`delete` — addTitle reads the cache
+ *  back to avoid clobbering a replacement stream's title on abort. */
+const mockCacheStore = new Map();
+const mockCache = {
+  get: jest.fn((key) => mockCacheStore.get(key)),
+  set: jest.fn((key, value) => mockCacheStore.set(key, value)),
+  delete: jest.fn((key) => mockCacheStore.delete(key)),
+};
+const mockSaveConvo = jest.fn();
+
+jest.mock('@librechat/api', () => ({
+  isEnabled: (val) => val === true || val === 'true',
+  sanitizeTitle: (title) => title,
+}));
+
+jest.mock('@librechat/data-schemas', () => ({
+  logger: { debug: jest.fn(), info: jest.fn(), warn: jest.fn(), error: jest.fn() },
+}));
+
+jest.mock('librechat-data-provider', () => ({
+  CacheKeys: { GEN_TITLE: 'GEN_TITLE' },
+}));
+
+jest.mock('~/cache/getLogStores', () => jest.fn(() => mockCache));
+
+jest.mock('~/models', () => ({
+  saveConvo: (...args) => mockSaveConvo(...args),
+}));
+
+const addTitle = require('./title');
+
+const flush = () => new Promise((resolve) => setImmediate(resolve));
+
+const makeClient = (title = 'Generated Title') => ({
+  options: { titleConvo: true },
+  titleConvo: jest.fn().mockResolvedValue(title),
+});
+
+const makeReq = () => ({ user: { id: 'user-1' }, body: {}, config: {} });
+
+describe('agents addTitle', () => {
+  beforeEach(() => {
+    jest.clearAllMocks();
+    mockCacheStore.clear();
+  });
+
+  it('uses the explicit conversationId for the cache key and saveConvo (immediate mode)', async () => {
+    const client = makeClient('My Title');
+
+    await addTitle(makeReq(), {
+      text: 'hello',
+      client,
+      conversationId: 'cid-immediate',
+      immediate: true,
+      convoReady: Promise.resolve(),
+    });
+
+    expect(mockCache.set).toHaveBeenCalledWith(
+      'user-1-cid-immediate',
+      'My Title',
+      expect.any(Number),
+    );
+    expect(mockSaveConvo).toHaveBeenCalledWith(
+      expect.anything(),
+      expect.objectContaining({ conversationId: 'cid-immediate', title: 'My Title' }),
+      expect.objectContaining({ noUpsert: true }),
+    );
+  });
+
+  it('passes immediate:true through to client.titleConvo', async () => {
+    const client = makeClient();
+
+    await addTitle(makeReq(), {
+      text: 'hello',
+      client,
+      conversationId: 'cid',
+      immediate: true,
+      convoReady: Promise.resolve(),
+    });
+
+    expect(client.titleConvo).toHaveBeenCalledWith(expect.objectContaining({ immediate: true }));
+  });
+
+  it('falls back to response.conversationId in legacy (final) mode', async () => {
+    const client = makeClient('Legacy Title');
+
+    await addTitle(makeReq(), {
+      text: 'hi',
+      client,
+      response: { conversationId: 'resp-cid' },
+    });
+
+    expect(mockCache.set).toHaveBeenCalledWith(
+      'user-1-resp-cid',
+      'Legacy Title',
+      expect.any(Number),
+    );
+    expect(mockSaveConvo).toHaveBeenCalledWith(
+      expect.anything(),
+      expect.objectContaining({ conversationId: 'resp-cid', title: 'Legacy Title' }),
+      expect.objectContaining({ noUpsert: true }),
+    );
+    expect(client.titleConvo).toHaveBeenCalledWith(expect.objectContaining({ immediate: false }));
+  });
+
+  it('caches the title immediately but defers saveConvo until convoReady resolves', async () => {
+    const client = makeClient('Deferred Title');
+    let resolveConvo;
+    const convoReady = new Promise((resolve) => {
+      resolveConvo = resolve;
+    });
+
+    const pending = addTitle(makeReq(), {
+      text: 'hello',
+      client,
+      conversationId: 'cid-defer',
+      immediate: true,
+      convoReady,
+    });
+
+    await flush();
+
+    // Title is cached for the live UI, but persistence waits for the row to exist.
+    expect(mockCache.set).toHaveBeenCalledWith(
+      'user-1-cid-defer',
+      'Deferred Title',
+      expect.any(Number),
+    );
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+
+    resolveConvo();
+    await pending;
+
+    expect(mockSaveConvo).toHaveBeenCalledWith(
+      expect.anything(),
+      expect.objectContaining({ conversationId: 'cid-defer', title: 'Deferred Title' }),
+      expect.objectContaining({ noUpsert: true }),
+    );
+  });
+
+  it('notifies when the title is cached before waiting for convoReady', async () => {
+    const order = [];
+    const client = makeClient('Streamed Title');
+    const onTitleGenerated = jest.fn(async () => {
+      order.push('title-event');
+    });
+    let resolveConvo;
+    const convoReady = new Promise((resolve) => {
+      resolveConvo = resolve;
+    });
+
+    mockCache.set.mockImplementationOnce((key, value) => {
+      order.push('cache');
+      mockCacheStore.set(key, value);
+    });
+    mockSaveConvo.mockImplementationOnce(async () => {
+      order.push('save');
+    });
+
+    const pending = addTitle(makeReq(), {
+      text: 'hello',
+      client,
+      conversationId: 'cid-stream',
+      immediate: true,
+      convoReady,
+      onTitleGenerated,
+    });
+
+    await flush();
+
+    expect(onTitleGenerated).toHaveBeenCalledWith({
+      conversationId: 'cid-stream',
+      title: 'Streamed Title',
+    });
+    expect(order).toEqual(['cache', 'title-event']);
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+
+    resolveConvo();
+    await pending;
+
+    expect(order).toEqual(['cache', 'title-event', 'save']);
+  });
+
+  it('skips generation when the endpoint disables titleConvo', async () => {
+    const client = makeClient();
+    client.options.titleConvo = false;
+
+    await addTitle(makeReq(), { text: 'hi', client, conversationId: 'cid', immediate: true });
+
+    expect(client.titleConvo).not.toHaveBeenCalled();
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+  });
+
+  it('skips generation for temporary conversations', async () => {
+    const client = makeClient();
+    const req = makeReq();
+    req.body.isTemporary = true;
+
+    await addTitle(req, { text: 'hi', client, conversationId: 'cid', immediate: true });
+
+    expect(client.titleConvo).not.toHaveBeenCalled();
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+  });
+
+  it('skips generation when neither conversationId nor response is provided', async () => {
+    const client = makeClient();
+
+    await addTitle(makeReq(), { text: 'hi', client });
+
+    expect(client.titleConvo).not.toHaveBeenCalled();
+    expect(mockCache.set).not.toHaveBeenCalled();
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+  });
+
+  it('propagates an aborted request signal and discards the title without persisting', async () => {
+    const client = makeClient();
+    const ac = new AbortController();
+    const onTitleGenerated = jest.fn();
+    ac.abort();
+
+    await addTitle(makeReq(), {
+      text: 'hi',
+      client,
+      conversationId: 'cid',
+      immediate: true,
+      convoReady: Promise.resolve(),
+      signal: ac.signal,
+      onTitleGenerated,
+    });
+
+    const { abortController } = client.titleConvo.mock.calls[0][0];
+    expect(abortController.signal.aborted).toBe(true);
+    expect(onTitleGenerated).not.toHaveBeenCalled();
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+    expect(mockCache.delete).toHaveBeenCalledWith('user-1-cid');
+  });
+
+  it("does not delete a replacement stream's cached title when aborted", async () => {
+    const client = makeClient('Stale Title');
+    const ac = new AbortController();
+    ac.abort();
+    // Simulate a replacement stream having cached its own (newer) title under the
+    // shared `userId-conversationId` key by the time this stale task re-reads it.
+    mockCache.get.mockImplementationOnce(() => 'Newer Title');
+
+    await addTitle(makeReq(), {
+      text: 'hi',
+      client,
+      conversationId: 'cid',
+      immediate: true,
+      convoReady: Promise.resolve(),
+      signal: ac.signal,
+    });
+
+    expect(mockCache.delete).not.toHaveBeenCalled();
+    expect(mockSaveConvo).not.toHaveBeenCalled();
+  });
+});
--- a/client/src/data-provider/SSE/tests/queries.test.ts
+++ b/client/src/data-provider/SSE/tests/queries.test.ts
@ -0,0 +1,149 @@
+/**
+ * `~/utils` re-exports from `@librechat/client`, which pulls in framer-motion (an
+ * external peer not present in the jsdom test env). Provide a minimal mock with the
+ * two symbols `queries.ts` uses. `isNotFoundError` mirrors the real axios-only
+ * implementation in `~/utils/errors`.
+ */
+jest.mock('~/utils', () => {
+  const isNotFoundError = (error: unknown): boolean => {
+    if (error != null && typeof error === 'object') {
+      const response = (error as { response?: { status?: number } }).response;
+      return response?.status === 404;
+    }
+    return false;
+  };
+  return { isNotFoundError, updateConvoInAllQueries: jest.fn() };
+});
+
+jest.mock('librechat-data-provider', () => ({
+  apiBaseUrl: () => '',
+  QueryKeys: { conversation: 'conversation', activeJobs: 'activeJobs' },
+  request: { get: jest.fn() },
+  dataService: { genTitle: jest.fn(), getActiveJobs: jest.fn() },
+}));
+
+jest.mock('@tanstack/react-query', () => ({
+  useQuery: jest.fn(() => ({ data: undefined })),
+  useQueries: jest.fn(() => []),
+  useQueryClient: jest.fn(() => ({ setQueryData: jest.fn(), removeQueries: jest.fn() })),
+}));
+
+/** `queries.ts` imports `useGetStartupConfig` from `../Endpoints` (i.e.
+ *  `data-provider/Endpoints`); from this test file that resolves to `../../Endpoints`.
+ *  Mock it so the module under test does not pull in the full data-provider barrel. */
+jest.mock('../../Endpoints', () => ({
+  useGetStartupConfig: jest.fn(() => ({ data: undefined })),
+}));
+
+import { genTitleQueryKey, queueTitleGeneration } from '../queries';
+
+/** Build a minimal Axios-shaped error with a given HTTP status. */
+function makeAxiosError(status: number): Error {
+  const err = new Error(`HTTP ${status}`) as Error & {
+    isAxiosError: boolean;
+    response: { status: number };
+  };
+  err.isAxiosError = true;
+  err.response = { status };
+  return err;
+}
+
+describe('genTitleQueryKey', () => {
+  it('returns a two-element tuple with the conversationId', () => {
+    expect(genTitleQueryKey('abc-123')).toEqual(['genTitle', 'abc-123']);
+  });
+
+  it('returns different keys for different conversation IDs', () => {
+    expect(genTitleQueryKey('conv-1')).not.toEqual(genTitleQueryKey('conv-2'));
+  });
+});
+
+describe('queueTitleGeneration', () => {
+  it('runs without throwing for a new conversation ID', () => {
+    expect(() => queueTitleGeneration('new-conv-queue-1')).not.toThrow();
+  });
+
+  it('is safe to call multiple times for the same conversation ID', () => {
+    expect(() => {
+      queueTitleGeneration('new-conv-queue-2');
+      queueTitleGeneration('new-conv-queue-2');
+    }).not.toThrow();
+  });
+});
+
+/**
+ * The title-fetch retry policy in `useTitleGeneration`:
+ *
+ *   retry: (failureCount, error) => isNotFoundError(error) && failureCount < 3
+ *   retryDelay: () => 5_000
+ *
+ * The server `/gen_title` route waits up to ~15.5s before returning 404 while the
+ * title is still generating. Retrying ONLY on 404 (never on 401/403/5xx/network)
+ * means a transient "still generating" response is never treated as final (#13318),
+ * while genuine errors stay terminal.
+ *
+ * These tests pin the classification contract and the failure cap so any future
+ * change to either is caught.
+ */
+describe('title fetch retry policy — error classification', () => {
+  const isNotFoundError = (error: unknown): boolean => {
+    if (error != null && typeof error === 'object') {
+      const response = (error as { response?: { status?: number } }).response;
+      return response?.status === 404;
+    }
+    return false;
+  };
+
+  it('returns true for a 404 (server still generating the title)', () => {
+    expect(isNotFoundError(makeAxiosError(404))).toBe(true);
+  });
+
+  it('returns false for a 401 (auth failure — do not retry)', () => {
+    expect(isNotFoundError(makeAxiosError(401))).toBe(false);
+  });
+
+  it('returns false for a 500 (server failure — do not retry)', () => {
+    expect(isNotFoundError(makeAxiosError(500))).toBe(false);
+  });
+
+  it('returns false for a plain network Error', () => {
+    expect(isNotFoundError(new Error('Network Error'))).toBe(false);
+  });
+
+  it('returns false for null/undefined', () => {
+    expect(isNotFoundError(null)).toBe(false);
+    expect(isNotFoundError(undefined)).toBe(false);
+  });
+});
+
+describe('title fetch retry policy — failure cap', () => {
+  const isNotFoundError = (error: unknown): boolean => {
+    if (error != null && typeof error === 'object') {
+      const response = (error as { response?: { status?: number } }).response;
+      return response?.status === 404;
+    }
+    return false;
+  };
+
+  const retryPredicate = (failureCount: number, error: unknown): boolean =>
+    isNotFoundError(error) && failureCount < 3;
+
+  const notFound = makeAxiosError(404);
+
+  it('retries the first three 404 failures', () => {
+    expect(retryPredicate(0, notFound)).toBe(true);
+    expect(retryPredicate(1, notFound)).toBe(true);
+    expect(retryPredicate(2, notFound)).toBe(true);
+  });
+
+  it('stops after the third 404 failure', () => {
+    expect(retryPredicate(3, notFound)).toBe(false);
+  });
+
+  it('never retries a non-404 regardless of attempt count', () => {
+    const authErr = makeAxiosError(401);
+    for (let i = 0; i < 5; i++) {
+      expect(retryPredicate(i, authErr)).toBe(false);
+    }
+  });
+});
--- a/client/src/data-provider/SSE/tests/useTitleGeneration.test.ts
+++ b/client/src/data-provider/SSE/tests/useTitleGeneration.test.ts
@ -0,0 +1,178 @@
+/**
+ * White-box integration test for `useTitleGeneration`'s effect logic.
+ *
+ * Complements the helper unit tests in `queries.test.ts` by driving the REAL
+ * hook (real React state/effects) with a controllable react-query surface, so
+ * the stateful decisions — immediate-vs-final eligibility, success application,
+ * defer-while-active, and the post-completion `resetQueries` remount — are
+ * deterministically locked down without timer/async flakiness.
+ *
+ * `~/utils` re-exports from `@librechat/client` (framer-motion peer, absent in
+ * jsdom); mocked to the two symbols the hook uses. react-query is mocked so we
+ * control `activeJobIds`, the per-conversation query results, and spy the
+ * QueryClient — the hook's own React effects still run for real.
+ */
+let mockActiveJobIds: string[] = [];
+let mockTiming: 'immediate' | 'final' = 'immediate';
+let mockQueriesResults: Array<{
+  isSuccess?: boolean;
+  isError?: boolean;
+  data?: { title: string };
+  error?: unknown;
+}> = [];
+let mockCapturedQueries: Array<{ queryKey: unknown[] }> = [];
+
+const mockSetQueryData = jest.fn();
+const mockRemoveQueries = jest.fn();
+const mockResetQueries = jest.fn();
+const mockUpdateConvoInAllQueries = jest.fn();
+
+jest.mock('@tanstack/react-query', () => ({
+  useQuery: jest.fn(() => ({ data: { activeJobIds: mockActiveJobIds } })),
+  useQueries: jest.fn(({ queries }: { queries: Array<{ queryKey: unknown[] }> }) => {
+    mockCapturedQueries = queries;
+    return mockQueriesResults;
+  }),
+  useQueryClient: jest.fn(() => ({
+    setQueryData: mockSetQueryData,
+    removeQueries: mockRemoveQueries,
+    resetQueries: mockResetQueries,
+  })),
+}));
+
+jest.mock('../../Endpoints', () => ({
+  useGetStartupConfig: () => ({ data: { titleGenerationTiming: mockTiming } }),
+}));
+
+jest.mock('~/utils', () => ({
+  isNotFoundError: (error: unknown): boolean => {
+    if (error != null && typeof error === 'object') {
+      return (error as { response?: { status?: number } }).response?.status === 404;
+    }
+    return false;
+  },
+  updateConvoInAllQueries: (...args: unknown[]) => mockUpdateConvoInAllQueries(...args),
+}));
+
+jest.mock('librechat-data-provider', () => ({
+  apiBaseUrl: () => '',
+  QueryKeys: { conversation: 'conversation', activeJobs: 'activeJobs' },
+  request: { get: jest.fn() },
+  dataService: { genTitle: jest.fn(), getActiveJobs: jest.fn() },
+}));
+
+import { renderHook, act } from '@testing-library/react';
+import {
+  useTitleGeneration,
+  genTitleQueryKey,
+  queueTitleGeneration,
+  markTitleGenerationProcessed,
+} from '../queries';
+
+const notFound = { response: { status: 404 } };
+
+/** queryKeys passed to the latest `useQueries` call (i.e. the ready-to-fetch set). */
+const eligibleKeys = () => mockCapturedQueries.map((q) => JSON.stringify(q.queryKey));
+const isEligible = (id: string) => eligibleKeys().includes(JSON.stringify(genTitleQueryKey(id)));
+
+beforeEach(() => {
+  mockActiveJobIds = [];
+  mockTiming = 'immediate';
+  mockQueriesResults = [];
+  mockCapturedQueries = [];
+  jest.clearAllMocks();
+});
+
+describe('useTitleGeneration — eligibility', () => {
+  it('immediate mode: fetches a queued conversation while its stream is still active', () => {
+    mockTiming = 'immediate';
+    mockActiveJobIds = ['conv-imm'];
+
+    renderHook(() => useTitleGeneration(true));
+    act(() => queueTitleGeneration('conv-imm'));
+
+    expect(isEligible('conv-imm')).toBe(true);
+  });
+
+  it('final mode: gates a queued conversation until its stream completes', () => {
+    mockTiming = 'final';
+    mockActiveJobIds = ['conv-fin'];
+
+    const { rerender } = renderHook(() => useTitleGeneration(true));
+    act(() => queueTitleGeneration('conv-fin'));
+    expect(isEligible('conv-fin')).toBe(false);
+
+    // Stream completes — the conversation leaves the active set.
+    mockActiveJobIds = [];
+    rerender();
+    expect(isEligible('conv-fin')).toBe(true);
+  });
+
+  it('stops polling when a title is completed by an SSE event', () => {
+    mockTiming = 'immediate';
+    mockActiveJobIds = ['conv-sse-title'];
+
+    const { rerender } = renderHook(() => useTitleGeneration(true));
+    act(() => queueTitleGeneration('conv-sse-title'));
+    expect(isEligible('conv-sse-title')).toBe(true);
+
+    act(() => markTitleGenerationProcessed('conv-sse-title'));
+    rerender();
+
+    expect(isEligible('conv-sse-title')).toBe(false);
+  });
+});
+
+describe('useTitleGeneration — result handling', () => {
+  it('applies the fetched title to the conversation caches on success', () => {
+    mockTiming = 'immediate';
+    mockActiveJobIds = ['conv-ok'];
+
+    const { rerender } = renderHook(() => useTitleGeneration(true));
+    act(() => queueTitleGeneration('conv-ok'));
+
+    mockQueriesResults = [{ isSuccess: true, isError: false, data: { title: 'Quantum Chat' } }];
+    rerender();
+
+    expect(mockSetQueryData).toHaveBeenCalledWith(
+      ['conversation', 'conv-ok'],
+      expect.any(Function),
+    );
+    expect(mockUpdateConvoInAllQueries).toHaveBeenCalled();
+
+    const call = mockSetQueryData.mock.calls.find(
+      ([key]) => JSON.stringify(key) === JSON.stringify(['conversation', 'conv-ok']),
+    );
+    const updater = call?.[1] as (c?: { title?: string }) => { title?: string };
+    expect(updater({ title: 'New Chat' })).toEqual(
+      expect.objectContaining({ title: 'Quantum Chat' }),
+    );
+  });
+
+  it('a 404 while the stream is active defers (removeQueries), not giving up', () => {
+    mockTiming = 'immediate';
+    mockActiveJobIds = ['conv-active404'];
+
+    const { rerender } = renderHook(() => useTitleGeneration(true));
+    act(() => queueTitleGeneration('conv-active404'));
+
+    mockQueriesResults = [{ isError: true, isSuccess: false, error: notFound }];
+    rerender();
+
+    expect(mockRemoveQueries).toHaveBeenCalledWith(genTitleQueryKey('conv-active404'));
+    expect(mockResetQueries).not.toHaveBeenCalled();
+  });
+
+  it('a 404 after the stream completes forces a fresh fetch via resetQueries', () => {
+    mockTiming = 'immediate';
+    mockActiveJobIds = []; // stream already complete
+
+    const { rerender } = renderHook(() => useTitleGeneration(true));
+    act(() => queueTitleGeneration('conv-done404'));
+
+    mockQueriesResults = [{ isError: true, isSuccess: false, error: notFound }];
+    rerender();
+
+    expect(mockResetQueries).toHaveBeenCalledWith(genTitleQueryKey('conv-done404'));
+  });
+});
--- a/client/src/data-provider/SSE/queries.ts
+++ b/client/src/data-provider/SSE/queries.ts
@ -2,7 +2,8 @@ import { useEffect, useMemo, useState } from 'react';
 import { apiBaseUrl, QueryKeys, request, dataService } from 'librechat-data-provider';
 import { useQuery, useQueries, useQueryClient } from '@tanstack/react-query';
 import type { Agents, TConversation } from 'librechat-data-provider';
-import { updateConvoInAllQueries } from '~/utils';
+import { isNotFoundError, updateConvoInAllQueries } from '~/utils';
+import { useGetStartupConfig } from '../Endpoints';

 export interface StreamStatusResponse {
  active: boolean;
@ -45,6 +46,12 @@ export interface ActiveJobsResponse {
 const titleQueue = new Set<string>();
 const processedTitles = new Set<string>();

+/** Conversations whose eager (immediate-mode) title fetch 404'd while the stream
+ * was still active. They wait for stream completion before fetching again instead
+ * of busy-looping — covers a per-endpoint `final` override under a global
+ * `immediate` default. */
+const deferredTitles = new Set<string>();
+
 /** Listeners to notify when queue changes (for non-resumable streams like assistants) */
 const queueListeners = new Set<() => void>();

@ -56,13 +63,31 @@ export function queueTitleGeneration(conversationId: string) {
  }
 }

+export function markTitleGenerationProcessed(conversationId: string) {
+  processedTitles.add(conversationId);
+  titleQueue.delete(conversationId);
+  deferredTitles.delete(conversationId);
+  queueListeners.forEach((listener) => listener());
+}
+
 /**
 * Hook to process the title generation queue.
- * Only fetches titles AFTER the job completes (not in activeJobIds).
+ *
+ * Timing is driven by the server's effective default (`titleGenerationTiming`):
+ * - `immediate` (default): fetch the title in parallel with the active stream so
+ *   it appears while the response is still streaming.
+ * - `final` (legacy): fetch only after the stream completes.
+ *
+ * The title query retries on 404 (server still generating) so a transient
+ * not-ready response is never treated as final (#13318).
 * Place this high in the component tree (e.g., Nav.tsx).
 */
 export function useTitleGeneration(enabled = true) {
  const queryClient = useQueryClient();
+  const { data: startupConfig } = useGetStartupConfig();
+  /** Defaults to immediate until startup config loads. */
+  const timing = startupConfig?.titleGenerationTiming ?? 'immediate';
+
  const [queueVersion, setQueueVersion] = useState(0);
  const [readyToFetch, setReadyToFetch] = useState<string[]>([]);

@ -82,33 +107,54 @@ export function useTitleGeneration(enabled = true) {

  useEffect(() => {
    const activeSet = new Set(activeJobIds);
-    const completedJobs: string[] = [];
+    const eligible: string[] = [];

    for (const conversationId of titleQueue) {
-      if (!activeSet.has(conversationId) && !processedTitles.has(conversationId)) {
-        completedJobs.push(conversationId);
+      if (processedTitles.has(conversationId)) {
+        continue;
+      }
+      const eager = timing === 'immediate' && !deferredTitles.has(conversationId);
+      if (eager || !activeSet.has(conversationId)) {
+        eligible.push(conversationId);
      }
    }

-    if (completedJobs.length > 0) {
-      setReadyToFetch((prev) => [...new Set([...prev, ...completedJobs])]);
+    if (eligible.length > 0) {
+      setReadyToFetch((prev) => [...new Set([...prev, ...eligible])]);
    }
-  }, [activeJobIds, queueVersion]);
+  }, [activeJobIds, queueVersion, timing]);

-  // Fetch titles for ready conversations
+  useEffect(() => {
+    setReadyToFetch((prev) => {
+      const next = prev.filter((id) => !processedTitles.has(id));
+      return next.length === prev.length ? prev : next;
+    });
+  }, [queueVersion]);
+
+  // Fetch titles for ready conversations.
  const titleQueries = useQueries({
    queries: readyToFetch.map((conversationId) => ({
      queryKey: genTitleQueryKey(conversationId),
      queryFn: () => dataService.genTitle({ conversationId }),
+      // Gate on `enabled` so no /gen_title request fires while unauthenticated
+      // (e.g. after logout) even if the module-level queue still holds IDs.
+      enabled,
      staleTime: Infinity,
-      retry: false,
+      /** Retry only on 404 (title still generating server-side) so a transient
+       * not-ready response is never treated as final. All other errors are
+       * terminal. Bounded retry adapted from PR #13329. */
+      retry: (failureCount: number, error: unknown) => isNotFoundError(error) && failureCount < 3,
+      retryDelay: () => 5_000,
    })),
  });

  useEffect(() => {
+    const activeSet = new Set(activeJobIds);
    titleQueries.forEach((titleQuery, index) => {
      const conversationId = readyToFetch[index];
-      if (!conversationId || processedTitles.has(conversationId)) return;
+      if (!conversationId || processedTitles.has(conversationId)) {
+        return;
+      }

      if (titleQuery.isSuccess && titleQuery.data) {
        const { title } = titleQuery.data;
@ -121,17 +167,34 @@ export function useTitleGeneration(enabled = true) {
        if (window.location.pathname.includes(conversationId)) {
          document.title = title;
        }
-        processedTitles.add(conversationId);
-        titleQueue.delete(conversationId);
+        markTitleGenerationProcessed(conversationId);
        setReadyToFetch((prev) => prev.filter((id) => id !== conversationId));
      } else if (titleQuery.isError) {
-        // Mark as processed even on error to avoid infinite retries
-        processedTitles.add(conversationId);
-        titleQueue.delete(conversationId);
-        setReadyToFetch((prev) => prev.filter((id) => id !== conversationId));
+        // Retries are exhausted here (the query only retries on 404). A title may
+        // still be generated *after* the stream completes (final mode generates
+        // only once the response ends), so don't treat the first 404 as final —
+        // guarantee one fresh, full-budget fetch cycle that runs post-completion.
+        if (activeSet.has(conversationId)) {
+          // Failed while still streaming: drop and clear so the completion
+          // transition re-promotes a fresh fetch (instead of busy-looping).
+          deferredTitles.add(conversationId);
+          queryClient.removeQueries(genTitleQueryKey(conversationId));
+          setReadyToFetch((prev) => prev.filter((id) => id !== conversationId));
+        } else if (!deferredTitles.has(conversationId)) {
+          // First failure at/after completion without a prior deferral: grant one
+          // fresh cycle. Polling has stopped (no re-promotion), so reset the query
+          // in place — `resetQueries` refetches active observers with a fresh retry
+          // budget, unlike `removeQueries`, which leaves the observer in error state.
+          deferredTitles.add(conversationId);
+          queryClient.resetQueries(genTitleQueryKey(conversationId));
+        } else {
+          // The post-completion fetch also failed — the title is genuinely absent.
+          markTitleGenerationProcessed(conversationId);
+          setReadyToFetch((prev) => prev.filter((id) => id !== conversationId));
+        }
      }
    });
-  }, [titleQueries, readyToFetch, queryClient]);
+  }, [titleQueries, readyToFetch, queryClient, activeJobIds]);
 }

 /**
--- a/client/src/hooks/SSE/tests/useResumableSSE.spec.ts
+++ b/client/src/hooks/SSE/tests/useResumableSSE.spec.ts
@ -80,6 +80,8 @@ jest.mock('~/data-provider', () => ({

 const mockErrorHandler = jest.fn();
 const mockCreatedHandler = jest.fn();
+const mockStepHandler = jest.fn();
+const mockTitleHandler = jest.fn();
 const mockSetIsSubmitting = jest.fn();
 const mockClearStepMaps = jest.fn();

@ -89,7 +91,8 @@ jest.mock('~/hooks/SSE/useEventHandlers', () =>
    finalHandler: jest.fn(),
    createdHandler: mockCreatedHandler,
    attachmentHandler: jest.fn(),
-    stepHandler: jest.fn(),
+    stepHandler: mockStepHandler,
+    titleHandler: mockTitleHandler,
    contentHandler: jest.fn(),
    resetContentHandler: jest.fn(),
    syncStepMessage: jest.fn(),
@ -177,6 +180,8 @@ describe('useResumableSSE - 404 error path', () => {
    localStorage.clear();
    mockErrorHandler.mockClear();
    mockCreatedHandler.mockClear();
+    mockStepHandler.mockClear();
+    mockTitleHandler.mockClear();
    mockClearStepMaps.mockClear();
    mockSetIsSubmitting.mockClear();
    mockSetQueryData.mockClear();
@ -430,6 +435,67 @@ describe('useResumableSSE - 404 error path', () => {
    unmount();
  });

+  it('routes title stream events to the title handler', async () => {
+    const submission = buildSubmission();
+    const chatHelpers = buildChatHelpers();
+
+    const { unmount } = renderHook(() => useResumableSSE(submission, chatHelpers));
+
+    await act(async () => {
+      await Promise.resolve();
+    });
+
+    const titleEvent = {
+      event: 'title',
+      data: {
+        conversationId: CONV_ID,
+        title: 'Streamed Title',
+      },
+    };
+    const sse = getLastSSE();
+    await act(async () => {
+      sse._emit('message', { data: JSON.stringify(titleEvent) });
+    });
+
+    expect(mockTitleHandler).toHaveBeenCalledWith(titleEvent);
+    expect(mockStepHandler).not.toHaveBeenCalled();
+    unmount();
+  });
+
+  it('replays title events from resume state sync', async () => {
+    const submission = buildSubmission();
+    const chatHelpers = buildChatHelpers();
+
+    const { unmount } = renderHook(() => useResumableSSE(submission, chatHelpers));
+
+    await act(async () => {
+      await Promise.resolve();
+    });
+
+    const titleEvent = {
+      event: 'title',
+      data: {
+        conversationId: CONV_ID,
+        title: 'Resumed Title',
+      },
+    };
+    const sse = getLastSSE();
+    await act(async () => {
+      sse._emit('message', {
+        data: JSON.stringify({
+          sync: true,
+          resumeState: {
+            runSteps: [],
+            titleEvent,
+          },
+        }),
+      });
+    });
+
+    expect(mockTitleHandler).toHaveBeenCalledWith(titleEvent);
+    unmount();
+  });
+
  it.each([undefined, 500, 503])(
    'does not call errorHandler for responseCode %s (reconnect path)',
    async (responseCode) => {
--- a/client/src/hooks/SSE/useEventHandlers.ts
+++ b/client/src/hooks/SSE/useEventHandlers.ts
@ -33,7 +33,11 @@ import {
  removeConvoFromAllQueries,
  findConversationInInfinite,
 } from '~/utils';
-import { startupConfigKey, queueTitleGeneration } from '~/data-provider';
+import {
+  startupConfigKey,
+  queueTitleGeneration,
+  markTitleGenerationProcessed,
+} from '~/data-provider';
 import useAttachmentHandler from '~/hooks/SSE/useAttachmentHandler';
 import useContentHandler from '~/hooks/SSE/useContentHandler';
 import useStepHandler from '~/hooks/SSE/useStepHandler';
@ -53,6 +57,17 @@ type TSyncData = {
  conversationId: string;
 };

+type TTitleEvent = {
+  event: 'title';
+  data?: {
+    conversationId?: string;
+    title?: string;
+  };
+};
+
+const hasRealTitle = (title?: string | null): title is string =>
+  title != null && title !== '' && title !== 'New Chat';
+
 export type EventHandlerParams = {
  isAddedRequest?: boolean;
  setCompleted: React.Dispatch<React.SetStateAction<Set<unknown>>>;
@ -467,6 +482,42 @@ export default function useEventHandlers({
    ],
  );

+  const titleHandler = useCallback(
+    (event: TTitleEvent) => {
+      const { conversationId, title } = event.data ?? {};
+      if (!conversationId || !hasRealTitle(title)) {
+        return;
+      }
+
+      queryClient.setQueryData<TConversation>([QueryKeys.conversation, conversationId], (convo) =>
+        convo ? { ...convo, title } : convo,
+      );
+      updateConvoInAllQueries(queryClient, conversationId, (convo) => ({ ...convo, title }));
+      markTitleGenerationProcessed(conversationId);
+
+      if (location.pathname.includes(conversationId)) {
+        document.title = title;
+      }
+
+      if (setConversation && !isAddedRequest) {
+        setConversation((prevState) => {
+          if (!prevState) {
+            return prevState;
+          }
+          if (prevState.conversationId && prevState.conversationId !== conversationId) {
+            return prevState;
+          }
+          return {
+            ...prevState,
+            conversationId,
+            title,
+          };
+        });
+      }
+    },
+    [queryClient, location.pathname, setConversation, isAddedRequest],
+  );
+
  const finalHandler = useCallback(
    (data: TFinalResData, submission: EventSubmission) => {
      const { requestMessage, responseMessage, conversation, runMessages } = data;
@ -523,7 +574,8 @@ export default function useEventHandlers({

        const isNewConvo = conversation.conversationId !== submissionConvo.conversationId;

-        if (isNewConvo && conversation.conversationId) {
+        // Skip temporary conversations — the server never generates titles for them.
+        if (isNewConvo && conversation.conversationId && !_isTemporary) {
          queueTitleGeneration(conversation.conversationId);
        }

@ -600,6 +652,28 @@ export default function useEventHandlers({
          removeConvoFromAllQueries(queryClient, submissionConvo.conversationId);
        }

+        /** A title applied locally (e.g. an immediate-mode title fetched while the
+         *  response was still streaming) must survive the final event, whose
+         *  `conversation` was built before the title was saved and so carries no
+         *  title yet — otherwise the chat reverts to "New Chat" until reload.
+         *  Skip preservation for a stopped (unfinished) turn: the server cancels
+         *  and discards that title, so the local one would diverge from server state. */
+        const titlePreservable = responseMessage?.unfinished !== true;
+        const finalConversationId = conversation.conversationId;
+        const shouldRollbackStreamedTitle =
+          !titlePreservable && finalConversationId && !hasRealTitle(serverConversation.title);
+
+        if (shouldRollbackStreamedTitle && finalConversationId) {
+          updateConvoInAllQueries(queryClient, finalConversationId, (convo) => ({
+            ...convo,
+            title: null,
+          }));
+          if (location.pathname.includes(finalConversationId)) {
+            const startupConfig = queryClient.getQueryData<TStartupConfig>(startupConfigKey(true));
+            document.title = startupConfig?.appTitle ?? 'LibreChat';
+          }
+        }
+
        if (setConversation && isAddedRequest !== true) {
          setConversation((prevState) => {
            const update = {
@ -609,14 +683,28 @@ export default function useEventHandlers({
            if (prevState?.model != null && prevState.model !== submissionConvo.model) {
              update.model = prevState.model;
            }
+            const prevTitle = prevState?.title;
+            if (titlePreservable && !hasRealTitle(conversation.title) && hasRealTitle(prevTitle)) {
+              update.title = prevTitle;
+            }
            if (conversation.conversationId) {
              queryClient.setQueryData<TConversation>(
                [QueryKeys.conversation, conversation.conversationId],
-                (cachedConvo) =>
-                  ({
+                (cachedConvo) => {
+                  const merged = {
                    ...cachedConvo,
                    ...serverConversation,
-                  }) as TConversation,
+                  } as TConversation;
+                  const cachedTitle = cachedConvo?.title;
+                  if (
+                    titlePreservable &&
+                    !hasRealTitle(serverConversation.title) &&
+                    hasRealTitle(cachedTitle)
+                  ) {
+                    merged.title = cachedTitle;
+                  }
+                  return merged;
+                },
              );
            }
            return update;
@ -890,6 +978,7 @@ export default function useEventHandlers({
    messageHandler,
    contentHandler,
    createdHandler,
+    titleHandler,
    syncStepMessage,
    attachmentHandler,
    abortConversation,
--- a/client/src/hooks/SSE/useResumableSSE.ts
+++ b/client/src/hooks/SSE/useResumableSSE.ts
@ -211,6 +211,7 @@ export default function useResumableSSE(
    messageHandler,
    contentHandler,
    createdHandler,
+    titleHandler,
    syncStepMessage,
    attachmentHandler,
    resetContentHandler,
@ -317,6 +318,11 @@ export default function useResumableSSE(
            return;
          }

+          if (data.event === 'title') {
+            titleHandler(data);
+            return;
+          }
+
          if (data.event != null) {
            stepHandler(data, { ...currentSubmission, userMessage } as EventSubmission);
            return;
@ -398,11 +404,17 @@ export default function useResumableSSE(
              }
            }

+            if (data.resumeState?.titleEvent) {
+              titleHandler(data.resumeState.titleEvent);
+            }
+
            if (data.pendingEvents?.length > 0) {
              console.log(`[ResumableSSE] Replaying ${data.pendingEvents.length} pending events`);
              const submission = { ...currentSubmission, userMessage } as EventSubmission;
              for (const pendingEvent of data.pendingEvents) {
-                if (pendingEvent.event != null) {
+                if (pendingEvent.event === 'title') {
+                  titleHandler(pendingEvent);
+                } else if (pendingEvent.event != null) {
                  stepHandler(pendingEvent, submission);
                } else if (pendingEvent.type != null) {
                  contentHandler({ data: pendingEvent, submission });
@ -670,6 +682,7 @@ export default function useResumableSSE(
      finalHandler,
      createdHandler,
      attachmentHandler,
+      titleHandler,
      stepHandler,
      contentHandler,
      resetContentHandler,
@ -801,9 +814,11 @@ export default function useResumableSSE(
          setStreamId(newStreamId);
          // Optimistically add to active jobs
          addActiveJob(newStreamId);
-          // Queue title generation if this is a new conversation (first message)
+          // Queue title generation if this is a new conversation (first message).
+          // Skip temporary conversations — the server never generates titles for
+          // them, so polling would 404 indefinitely.
          const isNewConvo = submission.userMessage?.parentMessageId === Constants.NO_PARENT;
-          if (isNewConvo) {
+          if (isNewConvo && !submission.isTemporary) {
            queueTitleGeneration(newStreamId);
          }
          if (isInitialNewConversation(submission)) {
--- a/client/src/hooks/SSE/useSSE.ts
+++ b/client/src/hooks/SSE/useSSE.ts
@ -42,6 +42,7 @@ export default function useSSE(
    messageHandler,
    contentHandler,
    createdHandler,
+    titleHandler,
    attachmentHandler,
    abortConversation,
  } = useEventHandlers({
@ -113,6 +114,8 @@ export default function useSSE(
        };

        createdHandler(data, { ...submission, userMessage } as EventSubmission);
+      } else if (data.event === 'title') {
+        titleHandler(data);
      } else if (data.event != null) {
        stepHandler(data, { ...submission, userMessage } as EventSubmission);
      } else if (data.sync != null) {
--- a/librechat.example.yaml
+++ b/librechat.example.yaml
@ -362,6 +362,12 @@ endpoints:
  #   maxRecursionLimit: 100
  #   # (optional) Disable the builder interface for agents
  #   disableBuilder: false
+  #   # (optional) When conversation titles are generated:
+  #   #   immediate (default): generate as soon as the request is made, in parallel
+  #   #     with the response, from the user's first message (title appears within ~1-2s).
+  #   #   final: defer generation until the full response completes (legacy behavior).
+  #   # Set under `endpoints.all` instead to apply as the global default for all endpoints.
+  #   titleTiming: immediate
  #   # (optional) Maximum total citations to include in agent responses, defaults to 30
  #   maxCitations: 30
  #   # (optional) Maximum citations per file to include in agent responses, defaults to 7
--- a/packages/api/src/endpoints/config/providers.spec.ts
+++ b/packages/api/src/endpoints/config/providers.spec.ts
@ -1,7 +1,7 @@
 import { Providers } from '@librechat/agents';
 import { EModelEndpoint } from 'librechat-data-provider';
 import type { AppConfig } from '@librechat/data-schemas';
-import { getProviderConfig, providerConfigMap } from './providers';
+import { getProviderConfig, providerConfigMap, resolveTitleTiming } from './providers';

 const buildAppConfig = (
  customEndpoints: Array<{ name: string; baseURL?: string; apiKey?: string }>,
@ -97,3 +97,94 @@ describe('getProviderConfig', () => {
    ).toThrow('Provider openrouter not supported');
  });
 });
+
+describe('resolveTitleTiming', () => {
+  const withEndpoints = (endpoints: Record<string, unknown>): AppConfig =>
+    ({ endpoints }) as unknown as AppConfig;
+
+  it("defaults to 'immediate' when no config is provided", () => {
+    expect(resolveTitleTiming({})).toBe('immediate');
+  });
+
+  it("defaults to 'immediate' when endpoints is missing", () => {
+    expect(resolveTitleTiming({ appConfig: {} as AppConfig })).toBe('immediate');
+  });
+
+  it("defaults to 'immediate' when no titleTiming is set", () => {
+    const appConfig = withEndpoints({ [EModelEndpoint.agents]: { titleConvo: true } });
+    expect(resolveTitleTiming({ appConfig, endpoint: EModelEndpoint.agents })).toBe('immediate');
+  });
+
+  it("returns 'final' from the global `all` config", () => {
+    const appConfig = withEndpoints({ all: { titleTiming: 'final' } });
+    expect(resolveTitleTiming({ appConfig, endpoint: EModelEndpoint.agents })).toBe('final');
+  });
+
+  it("returns 'final' from the per-endpoint config when `all` is unset", () => {
+    const appConfig = withEndpoints({ [EModelEndpoint.agents]: { titleTiming: 'final' } });
+    expect(resolveTitleTiming({ appConfig, endpoint: EModelEndpoint.agents })).toBe('final');
+  });
+
+  it('lets `all` take precedence over the per-endpoint value', () => {
+    const appConfig = withEndpoints({
+      all: { titleTiming: 'immediate' },
+      [EModelEndpoint.agents]: { titleTiming: 'final' },
+    });
+    expect(resolveTitleTiming({ appConfig, endpoint: EModelEndpoint.agents })).toBe('immediate');
+  });
+
+  it('does not let unrelated `all` config block a per-endpoint value', () => {
+    const appConfig = withEndpoints({
+      all: { titleConvo: true },
+      [EModelEndpoint.agents]: { titleTiming: 'final' },
+    });
+    expect(resolveTitleTiming({ appConfig, endpoint: EModelEndpoint.agents })).toBe('final');
+  });
+
+  it('checks endpoint candidates in order before provider fallback', () => {
+    const appConfig = withEndpoints({
+      [EModelEndpoint.agents]: { titleTiming: 'final' },
+      [EModelEndpoint.openAI]: { titleTiming: 'immediate' },
+    });
+    expect(
+      resolveTitleTiming({
+        appConfig,
+        endpoint: [EModelEndpoint.agents, EModelEndpoint.openAI],
+      }),
+    ).toBe('final');
+  });
+
+  it('falls back to backing provider timing when agents has no titleTiming', () => {
+    const appConfig = withEndpoints({
+      [EModelEndpoint.agents]: { titleConvo: true },
+      [EModelEndpoint.openAI]: { titleTiming: 'final' },
+    });
+    expect(
+      resolveTitleTiming({
+        appConfig,
+        endpoint: [EModelEndpoint.agents, EModelEndpoint.openAI],
+      }),
+    ).toBe('final');
+  });
+
+  it("returns 'immediate' for an endpoint with no override and no `all`", () => {
+    const appConfig = withEndpoints({ [EModelEndpoint.openAI]: { titleTiming: 'final' } });
+    expect(resolveTitleTiming({ appConfig, endpoint: EModelEndpoint.agents })).toBe('immediate');
+  });
+
+  it("resolves 'final' from a custom endpoint config (endpoints.custom[])", () => {
+    const appConfig = withEndpoints({
+      [EModelEndpoint.custom]: [{ name: 'MyProvider', titleTiming: 'final' }],
+    });
+    expect(resolveTitleTiming({ appConfig, endpoint: 'MyProvider' })).toBe('final');
+  });
+
+  it('resolves a normalized custom provider name (openrouter -> OpenRouter)', () => {
+    const appConfig = withEndpoints({
+      [EModelEndpoint.custom]: [
+        { name: 'OpenRouter', baseURL: 'https://openrouter.ai/api/v1', titleTiming: 'final' },
+      ],
+    });
+    expect(resolveTitleTiming({ appConfig, endpoint: 'openrouter' })).toBe('final');
+  });
+});
--- a/packages/api/src/endpoints/config/providers.ts
+++ b/packages/api/src/endpoints/config/providers.ts
@ -49,6 +49,69 @@ export const providerConfigMap: Record<string, InitializeFn> = {
  [EModelEndpoint.anthropic]: initializeAnthropic,
 };

+export type TitleTiming = 'immediate' | 'final';
+
+/**
+ * Resolves when conversation titles are generated for a given endpoint.
+ *
+ * `endpoints.all.titleTiming`, when present, is the global override. Otherwise,
+ * endpoint candidates are checked in order so the public endpoint (for example
+ * `agents`) can override the backing provider, with provider/custom config used
+ * as a fallback. Resolving custom providers via `getProviderConfig` picks up its
+ * case-insensitive fallback for normalized provider names (e.g. `openrouter` →
+ * `OpenRouter`). Defaults to `immediate`.
+ */
+export function resolveTitleTiming({
+  appConfig,
+  endpoint,
+}: {
+  appConfig?: AppConfig;
+  endpoint?: string | Array<string | undefined>;
+}): TitleTiming {
+  const endpoints = appConfig?.endpoints;
+  const resolveConfiguredTiming = (config?: Partial<TEndpoint>): TitleTiming | undefined =>
+    config?.titleTiming === 'final' || config?.titleTiming === 'immediate'
+      ? config.titleTiming
+      : undefined;
+
+  const globalTiming = resolveConfiguredTiming(endpoints?.all);
+  if (globalTiming) {
+    return globalTiming;
+  }
+
+  const endpointCandidates = (Array.isArray(endpoint) ? endpoint : [endpoint]).filter(
+    (value): value is string => !!value,
+  );
+
+  for (const endpointCandidate of endpointCandidates) {
+    const endpointConfig = endpoints?.[endpointCandidate as keyof NonNullable<typeof endpoints>] as
+      | Partial<TEndpoint>
+      | undefined;
+    const endpointTiming = resolveConfiguredTiming(endpointConfig);
+    if (endpointTiming) {
+      return endpointTiming;
+    }
+  }
+
+  for (const endpointCandidate of endpointCandidates) {
+    if (!appConfig) {
+      continue;
+    }
+    try {
+      const providerTiming = resolveConfiguredTiming(
+        getProviderConfig({ provider: endpointCandidate, appConfig }).customEndpointConfig,
+      );
+      if (providerTiming) {
+        return providerTiming;
+      }
+    } catch {
+      // Unsupported providers fall back to the default timing.
+    }
+  }
+
+  return 'immediate';
+}
+
 /**
 * Result from getProviderConfig
 */
--- a/packages/api/src/stream/GenerationJobManager.ts
+++ b/packages/api/src/stream/GenerationJobManager.ts
@ -961,6 +961,7 @@ class GenerationJobManagerClass {
    this.jobStore.recordActivity?.(streamId);

    await this.trackUserMessage(streamId, event);
+    await this.trackTitleEvent(streamId, event);

    // For Redis mode, persist chunk for later reconstruction (fire-and-forget for resumability)
    if (this._isRedis) {
@ -1043,6 +1044,21 @@ class GenerationJobManagerClass {
    }
  }

+  /**
+   * Persist the last title event so resume sync can replay it. Content
+   * aggregation only reconstructs message parts, so UI-only events need their
+   * own metadata slot.
+   */
+  private async trackTitleEvent(streamId: string, event: t.ServerSentEvent): Promise<void> {
+    if (!('event' in event) || event.event !== 'title') {
+      return;
+    }
+
+    await this.jobStore.updateJob(streamId, {
+      titleEvent: JSON.stringify(event),
+    });
+  }
+
  /**
   * Persist user message metadata from the created event.
   * Awaited in emitChunk so the HSET commits before the PUBLISH,
@ -1152,6 +1168,14 @@ class GenerationJobManagerClass {
    const result = await this.jobStore.getContentParts(streamId);
    const aggregatedContent = result?.content ?? [];
    const runSteps = await this.jobStore.getRunSteps(streamId);
+    let titleEvent: t.ResumeState['titleEvent'];
+    if (jobData.titleEvent) {
+      try {
+        titleEvent = JSON.parse(jobData.titleEvent) as t.ResumeState['titleEvent'];
+      } catch {
+        // Ignore malformed persisted title events.
+      }
+    }

    logger.debug(`[GenerationJobManager] getResumeState:`, {
      streamId,
@ -1166,6 +1190,7 @@ class GenerationJobManagerClass {
      responseMessageId: jobData.responseMessageId,
      conversationId: jobData.conversationId,
      sender: jobData.sender,
+      titleEvent,
    };
  }

--- a/packages/api/src/stream/tests/GenerationJobManager.stream_integration.spec.ts
+++ b/packages/api/src/stream/tests/GenerationJobManager.stream_integration.spec.ts
@ -1356,6 +1356,28 @@ describe('GenerationJobManager Integration Tests', () => {
      await manager.destroy();
    });

+    test('should include emitted title event in resume state', async () => {
+      const manager = createInMemoryManager();
+      const streamId = `title-resume-${Date.now()}`;
+      await manager.createJob(streamId, 'user-1', streamId);
+
+      const titleEvent = {
+        event: 'title',
+        data: {
+          conversationId: streamId,
+          title: 'Resumed Title',
+        },
+      } satisfies ServerSentEvent;
+
+      await manager.emitChunk(streamId, titleEvent);
+
+      const resumeState = await manager.getResumeState(streamId);
+
+      expect(resumeState?.titleEvent).toEqual(titleEvent);
+
+      await manager.destroy();
+    });
+
    test('should replay buffer by default when no options are passed', async () => {
      const manager = createInMemoryManager();
      const streamId = `replay-buf-${Date.now()}`;
--- a/packages/api/src/stream/interfaces/IJobStore.ts
+++ b/packages/api/src/stream/interfaces/IJobStore.ts
@ -39,6 +39,9 @@ export interface SerializableJobData {
  /** Serialized final event for replay */
  finalEvent?: string;

+  /** Serialized title event for replay during active-stream resume */
+  titleEvent?: string;
+
  /** Endpoint metadata for abort handling - avoids storing functions */
  endpoint?: string;
  iconURL?: string;
@ -139,6 +142,13 @@ export interface ResumeState {
  responseMessageId?: string;
  conversationId?: string;
  sender?: string;
+  titleEvent?: {
+    event: 'title';
+    data?: {
+      conversationId?: string;
+      title?: string;
+    };
+  };
 }

 /**
--- a/packages/data-provider/src/config.ts
+++ b/packages/data-provider/src/config.ts
@ -403,6 +403,13 @@ export const baseEndpointSchema = z.object({
    .optional(),
  titleEndpoint: z.string().optional(),
  titlePromptTemplate: z.string().optional(),
+  /**
+   * When conversation titles are generated. `immediate` (default) generates the
+   * title as soon as the request is made, in parallel with the response, from the
+   * user's first message. `final` defers generation until the full response
+   * completes (legacy behavior).
+   */
+  titleTiming: z.union([z.literal('immediate'), z.literal('final')]).optional(),
  /** Maximum characters allowed in a single tool result before truncation. */
  maxToolResultChars: z.number().positive().optional(),
 });
@ -662,6 +669,7 @@ export const azureEndpointSchema = z
        titleMethod: true,
        titleModel: true,
        titlePrompt: true,
+        titleTiming: true,
        titlePromptTemplate: true,
      })
      .partial(),
@ -1132,6 +1140,10 @@ export type TStartupConfig = {
  modelDescriptions?: Record<string, Record<string, string>>;
  sharedLinksEnabled: boolean;
  publicSharedLinksEnabled: boolean;
+  /** Effective default timing for when conversation titles become fetchable.
+   * `immediate` = fetch in parallel with the active stream (default);
+   * `final` = fetch only after the stream completes (legacy). */
+  titleGenerationTiming?: 'immediate' | 'final';
  analyticsGtmId?: string;
  rum?: TRumConfig;
  bundlerURL?: string;
--- a/packages/data-provider/src/types/agents.ts
+++ b/packages/data-provider/src/types/agents.ts
@ -217,6 +217,13 @@ export namespace Agents {
    responseMessageId?: string;
    conversationId?: string;
    sender?: string;
+    titleEvent?: {
+      event: 'title';
+      data?: {
+        conversationId?: string;
+        title?: string;
+      };
+    };
  }
  /**
   * Represents a run step delta i.e. any changed fields on a run step during