mtproto_proxy/doc/migration-flow.md
Sergey Prokhorov 121d8b7413
docs: split-mode setup guide, architecture diagrams, cert script, build
README:
- New 'Split-mode setup' section: motivation, firewall rules, step-by-step
  instructions for both VPN tunnel and TLS distribution options
- Split-mode bullet added to Features list
- Notes on DPI-resistant tunnels (Shadowsocks, VLESS/XRay, Hysteria2) for
  Russian deployment; standard VPN protocols (WireGuard, OpenVPN) may be blocked
- Install instructions updated to use `make init-config` (copies templates,
  auto-detects public IP) instead of manual cp; ROLE= documented throughout
- Split-mode Step 4 uses `make ROLE=back/front` so template-change detection
  works correctly after `git pull`

Makefile:
- ROLE ?= both variable selects config templates (both/front/back)
- Config prereq rules use $(SYS_CONFIG_SRC) / $(VM_ARGS_SRC) based on ROLE
- New `init-config` target: force-copies templates, auto-detects public IP,
  prints edit reminder; replaces manual cp in install workflow
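A sketch of how such a ROLE switch is typically wired in a Makefile (only `ROLE`, `SYS_CONFIG_SRC`, and `VM_ARGS_SRC` are named in this commit; everything else here is an assumption, not the project's actual Makefile):

```makefile
# Hypothetical sketch -- illustrates the mechanism, not the real rules.
ROLE ?= both                                   # both | front | back
SYS_CONFIG_SRC = config/sys.config.$(ROLE).example
VM_ARGS_SRC    = config/vm.args.$(ROLE).example

# Template-change detection: the config is rebuilt when the selected
# example file changes (e.g. after `git pull`).
config/sys.config: $(SYS_CONFIG_SRC)
	cp $< $@
```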

scripts/gen_dist_certs.sh:
- Two-step workflow: `init <dir>` on back server (CA + back cert),
  `add-node <dir> <name>` per front server (cert signed by existing CA)
- Generates per-node ssl_dist.<name>.conf with paths substituted (no
  NODE_NAME placeholder to edit manually)
- ssl_dist.<name>.conf is now used directly (no rename to ssl_dist.conf);
  vm.args examples and README updated to match
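The two-step flow can be illustrated with plain `openssl` (a minimal sketch of what such a script typically automates; the real script's flags, filenames, and certificate subjects may differ):

```shell
set -eu
dir=$(mktemp -d)

# Step 1: 'init <dir>' -- create a CA and the back node's certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=mtproto-dist-ca" \
    -keyout "$dir/ca.key" -out "$dir/ca.crt"
openssl req -newkey rsa:2048 -nodes -subj "/CN=back" \
    -keyout "$dir/back.key" -out "$dir/back.csr"
openssl x509 -req -in "$dir/back.csr" -CA "$dir/ca.crt" \
    -CAkey "$dir/ca.key" -CAcreateserial -days 365 -out "$dir/back.crt"

# Step 2: 'add-node <dir> front1' -- a per-front cert signed by the same CA.
openssl req -newkey rsa:2048 -nodes -subj "/CN=front1" \
    -keyout "$dir/front1.key" -out "$dir/front1.csr"
openssl x509 -req -in "$dir/front1.csr" -CA "$dir/ca.crt" \
    -CAkey "$dir/ca.key" -CAcreateserial -days 365 -out "$dir/front1.crt"

# Both leaf certs must chain to the shared CA.
openssl verify -CAfile "$dir/ca.crt" "$dir/back.crt" "$dir/front1.crt"
```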

config/vm.args.{front,back}.example:
- -ssl_dist_optfile points to role-specific filename (ssl_dist.front.conf /
  ssl_dist.back.conf) so cert files can be copied as-is without renaming
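For illustration, the relevant lines of a front node's `vm.args` might look like this (the path is a hypothetical example; `-proto_dist inet_tls` and `-ssl_dist_optfile` are the standard Erlang/OTP flags for TLS distribution):

```
-proto_dist inet_tls
-ssl_dist_optfile /etc/mtproto-proxy/ssl_dist.front.conf
```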

AGENTS.md:
- Role-overview Mermaid flowchart showing front/back/both process split
- Data-plane section replaced with links to doc/ (no duplication)
- Supervision tree, key interactions, split-mode config keys updated

doc/handler-downstream-flow.md, doc/migration-flow.md:
- Mermaid box grouping to visually separate FRONT and BACK node participants
- erpc:call reference corrected (was rpc:call)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-12 00:34:45 +02:00


# Transparent client migration on DC connection death

Telegram periodically closes the TCP connection to the proxy ("DC connection
rotation", typically every 30–70 s). Instead of dropping all clients
multiplexed on that connection, the proxy remaps each idle client to a
surviving (or freshly started) DC connection transparently.

**Key actors:**
- `mtp_down_conn (old)` — the dying downstream connection process
- `mtp_dc_pool` — pool managing all downstream connections for one DC
- `mtp_handler` — one process per connected Telegram client
- `mtp_down_conn (new)` — replacement downstream spawned by the pool

**Split-mode note:** in `front/back` split mode, `mtp_handler` lives on the
front node and `mtp_dc_pool` / `mtp_down_conn` live on the back node. Every
message in the diagram below that crosses the front↔back boundary (the
`migrate` cast, `upstream_new` cast, `Pool->>Handler` reply, etc.) is carried
transparently by Erlang distribution — no code changes are needed because
Erlang PIDs and `gen_server` calls work across nodes unchanged. Process
monitors also fire on node disconnection, so a back-node restart causes all
affected front-node handlers to exit cleanly.
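As a minimal illustration (a hypothetical snippet, not the proxy's actual code): a `gen_server:call/2` on a pid works identically whether `Pool` lives on the local node or on the back node, and a monitor on a remote pid fires with reason `noconnection` when the node link drops:

```erlang
%% Hypothetical sketch -- variable names and message shapes are assumptions.
%% Pool may be a local pid or a pid on the back node; the call is the same.
NewDown = gen_server:call(Pool, {migrate, OldDown, self(), Opts}),

%% Monitors also work across nodes: if the back node disconnects, the
%% monitor fires with reason noconnection and the handler can stop.
MRef = erlang:monitor(process, NewDown),
receive
    {'DOWN', MRef, process, NewDown, noconnection} ->
        exit({shutdown, backend_lost})
after 0 ->
    ok  %% back node still reachable; continue proxying
end
```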
```mermaid
sequenceDiagram
participant TG as Telegram
box LightGreen "BACK node"
participant OldDown as mtp_down_conn (old)
participant Pool as mtp_dc_pool
participant NewDown as mtp_down_conn (new)
end
box LightBlue "FRONT node"
participant Handler as mtp_handler
end
TG->>OldDown: TCP close
OldDown->>Pool: downstream_closing(self()) [sync]
Pool-->>Pool: remove OldDown from ds_store + monitors
Pool-->>NewDown: spawn & connect (maybe_restart_connection)
Pool-->>OldDown: ok
OldDown->>Handler: migrate(OldDown) [cast, to all known upstreams]
Note over OldDown: drain_mailbox(5000)
alt upstream_new in mailbox
Note over Pool,OldDown: Race: pool processed a {get} call just before<br/>downstream_closing — upstream_new cast already queued
Pool-->>OldDown: upstream_new(Handler2, Opts) [cast, queued]
OldDown->>Handler2: migrate(OldDown) [cast, immediately]
end
alt Handler was blocked in down_send
Handler-->>OldDown: {send, Data} [call, in mailbox]
OldDown-->>Handler: {error, migrating}
Note over Handler: metric[mid_send] → stop<br/>(client reconnects and resends)
else Handler was idle
Handler->>Pool: migrate(OldDown, self(), Opts) [sync]
Pool-->>Pool: remove Handler from upstreams map
Pool->>NewDown: upstream_new(Handler, Opts) [cast]
Pool-->>Handler: NewDown pid
Note over Handler: down = NewDown<br/>metric[ok]
end
Note over OldDown: stop {shutdown, downstream_migrated}
```
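The old downstream's drain loop and the idle-handler branch can be sketched in Erlang (hypothetical function bodies inferred from the message flow in the diagram; the project's real code differs in detail):

```erlang
%% Hypothetical sketch of drain_mailbox/1: after casting migrate to all
%% known upstreams, the dying downstream keeps serving its mailbox for a
%% grace period so racing messages are not silently lost.
drain_mailbox(Timeout) ->
    receive
        %% Race: the pool handed us a new upstream just before
        %% downstream_closing -- redirect it to migrate immediately.
        {'$gen_cast', {upstream_new, Handler, _Opts}} ->
            gen_server:cast(Handler, {migrate, self()}),
            drain_mailbox(Timeout);
        %% A handler was blocked in down_send: fail its call so the
        %% client reconnects and resends.
        {'$gen_call', From, {send, _Data}} ->
            gen_server:reply(From, {error, migrating}),
            drain_mailbox(Timeout)
    after Timeout ->
        ok
    end.

%% Hypothetical idle-handler branch (handler side): synchronously re-home
%% on the pool, then swap the downstream pid held in state.
handle_cast({migrate, Down}, #state{down = Down, pool = Pool,
                                    opts = Opts} = St) ->
    NewDown = gen_server:call(Pool, {migrate, Down, self(), Opts}),
    {noreply, St#state{down = NewDown}}.
```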