From dfe8ebf034bee8e131ae1ae0cbf33dc72fa74809 Mon Sep 17 00:00:00 2001 From: Sergey Prokhorov Date: Tue, 7 Apr 2026 13:46:39 +0200 Subject: [PATCH] Send TLS decode_error alert on malformed ClientHello and missing SNI MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Scanners probe for fake-TLS proxies by sending structurally malformed ClientHellos (e.g. ExtensionsLen=0 with trailing extension bytes). A real TLS server responds with a fatal decode_error alert; previously the proxy crashed the handler process silently, making it detectable. Changes: - mtp_fake_tls: add TLS_REC_ALERT, TLS_ALERT_FATAL, TLS_ALERT_DECODE_ERROR macros; export tls_decode_error_alert/0 which builds the 7-byte alert frame from macros - mtp_fake_tls: add second clause to parse_client_hello/1 that throws {protocol_error, tls_bad_client_hello, bad_client_hello} instead of letting a bare function_clause propagate - mtp_fake_tls: tighten parse_sni/1 catch to match the specific tagged error rather than a catch-all error:_ - mtp_handler: add attempt_fronting clauses for tls_bad_client_hello and tls_no_sni — both send the decode_error alert before closing - mtp_handler: effective_secret/2 now raises tls_bad_client_hello (not tls_invalid_digest) when per_sni_secrets=on and the ClientHello has no SNI, so it also gets the alert treatment - single_dc_SUITE: new malformed_tls_hello_decode_error_case/1 verifies the alert bytes are sent and the metric is incremented - AGENTS.md: document test organisation, process architecture diagram, and upstream/downstream naming note Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- AGENTS.md | 86 ++++++++++++++++++++++++++++++++++++++++ src/mtp_fake_tls.erl | 18 ++++++++- src/mtp_handler.erl | 10 ++++- test/single_dc_SUITE.erl | 48 +++++++++++++++++++++- 4 files changed, 158 insertions(+), 4 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index fbe14f1..d316add 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -36,6 +36,66 @@ Dockerfile Docker image build | `mtp_metric` | Metrics/telemetry | | `mtp_session_storage` | Replay-attack protection (session deduplication) | +### Process architecture + +``` +OTP supervision tree +──────────────────────────────────────────────────────────────────── +mtproto_proxy_sup (one_for_one) + ├── mtp_config (gen_server, singleton) + ├── mtp_session_storage (gen_server, singleton) + ├── mtp_dc_pool_sup (supervisor, simple_one_for_one) + │ └── mtp_dc_pool (gen_server, one per DC id, permanent) + ├── mtp_down_conn_sup (supervisor, simple_one_for_one) + │ └── mtp_down_conn (gen_server, one per Telegram TCP conn, temporary) + └── Ranch listeners (one per configured port: mtp_ipv4, mtp_ipv6, …) + └── mtp_handler (gen_server, one per client TCP conn, transient) + + +Data-plane message flow (one client connection) +──────────────────────────────────────────────────────────────────── + + Telegram client Telegram server + │ │ + │ TCP (fake-TLS / obfuscated / secure) │ + ▼ ▼ + ┌─────────────┐ gen_server:call({send,Data}) ┌──────────────┐ raw TCP ┌──────────────┐ + │ mtp_handler │ ──────────────────────────────► │ mtp_down_conn│ ────────────► │ Telegram DC │ + │ (upstream) │ │ (downstream) │ ◄──────────── │ (middle srv) │ + │ │ ◄────────────────────────────── │ │ └──────────────┘ + └──────┬──────┘ gen_server:cast({proxy_ans}) └──────────────┘ + │ + │ on first data: mtp_config:get_downstream_safe/2 → picks (pool, down_conn) + │ on disconnect: cast({return, self()}) → releases slot + ▼ + ┌─────────────┐ + │ mtp_dc_pool │ — spawns mtp_down_conn via mtp_down_conn_sup when pool is empty + │ (per DC) │ + └─────────────┘ + +> **Naming note:** the terms "upstream" and "downstream" in the current code are the +> opposite of what one might expect: +> `upstream` = the client-side connection (`mtp_handler`), +> `downstream` = the Telegram-server-side connection (`mtp_down_conn`). +> This will be renamed in a future refactor. + + +Key interactions +──────────────────────────────────────────────────────────────────── +mtp_handler → mtp_config : get_downstream_safe/2 — resolves DC id to + a (pool_pid, down_conn_pid) pair on first + upstream data packet +mtp_handler → mtp_down_conn : send/2 (sync call) — forward client data +mtp_down_conn → mtp_handler : cast {proxy_ans, …} — forward Telegram reply +mtp_down_conn → mtp_handler : cast {close_ext, …} — Telegram closed stream +mtp_handler → mtp_dc_pool : return/2 (cast) — release slot on disconnect +mtp_dc_pool → mtp_down_conn : upstream_new/upstream_closed (cast) +mtp_dc_pool → mtp_down_conn_sup: start_conn/2 — spawn new TCP conn to Telegram +mtp_down_conn → mtp_config : get_netloc/1, get_secret/0 — read DC address + and proxy secret for RPC handshake +mtp_config → mtp_dc_pool_sup : start_pool/1 — create pool when new DC seen +``` + ## Build Requires Erlang/OTP 25+. @@ -79,6 +139,32 @@ Individual steps: Always run `make test` before committing. Fix all xref warnings and dialyzer errors — they are treated as errors. +### Test organisation — where to add new tests + +There are three kinds of tests, each with a clear home: + +| Kind | Files | When to add | +|------|-------|-------------| +| **EUnit** (unit) | `src/*.erl`, `-ifdef(TEST)` blocks | Pure functions with no I/O: codec encode/decode round-trips, packet parsing helpers, crypto primitives | +| **PropEr** (property-based) | `test/prop_mtp_.erl` | Codec/parser properties that should hold for *arbitrary* inputs — e.g. encode→decode identity, parser accepts all valid inputs, parser never crashes on random bytes | +| **Common Test** (integration) | `test/single_dc_SUITE.erl` | End-to-end behaviour involving a real listener + fake DC: protocol negotiation, policy enforcement, error handling visible at the TCP level (alerts sent, connections closed), domain fronting, replay protection | + +**Rule of thumb:** if the behaviour is observable only over a TCP socket or requires a running application, it belongs in `single_dc_SUITE`. If it is a property of a pure function, add a PropEr property in the matching `prop_mtp_.erl`. If it is a targeted unit case for a specific input, use EUnit. + +**What changes need new tests:** + +- **New codec or protocol module** → PropEr round-trip property in `prop_mtp_.erl` + a CT `echo_*_case` in `single_dc_SUITE` +- **New protocol error path** → CT case that sends the triggering byte sequence over TCP and asserts the exact response (alert bytes, metric counter, connection close) +- **New policy or config option** → CT case that sets the env, exercises the path, resets env in `{post, Cfg}` +- **New parser clause or binary pattern** → PropEr property verifying the clause accepts all valid inputs and a targeted EUnit/PropEr case for boundary/malformed inputs +- **Security-critical paths** (replay detection, session storage, digest validation) → CT case; also consider PropEr for the pure crypto/comparison functions + +**Naming conventions:** + +- CT cases: `_case/1` — auto-discovered by `all/0` +- PropEr properties: `prop_/0` (or `/1` with a `doc` clause) +- Each CT case must implement `{pre, Cfg}` / `{post, Cfg}` / `Cfg when is_list(Cfg)` clauses and call `setup_single` / `stop_single` to avoid resource leaks + ### Debugging CT failures When `rebar3 ct` (or `make test`) reports failures, **do not rely on the terminal output** — it is truncated and shows only the last error. Instead, go straight to the HTML logs: diff --git a/src/mtp_fake_tls.erl b/src/mtp_fake_tls.erl index b5938ed..a31fb3d 100644 --- a/src/mtp_fake_tls.erl +++ b/src/mtp_fake_tls.erl @@ -16,6 +16,7 @@ -export([from_client_hello/2, derive_sni_secret/3, parse_sni/1, + tls_decode_error_alert/0, new/0, try_decode_packet/2, decode_all/2, @@ -54,9 +55,13 @@ -define(TLS_12_VERSION, 3, 3). -define(TLS_13_VERSION, 3, 4). -define(TLS_REC_CHANGE_CIPHER, 20). +-define(TLS_REC_ALERT, 21). -define(TLS_REC_HANDSHAKE, 22). -define(TLS_REC_DATA, 23). +-define(TLS_ALERT_FATAL, 2). +-define(TLS_ALERT_DECODE_ERROR, 50). + -define(TLS_12_DATA, ?TLS_REC_DATA, ?TLS_12_VERSION). -define(DIGEST_POS, 11). @@ -156,10 +161,17 @@ parse_sni(Data) -> {error, no_sni} end catch - error:_ -> + error:{protocol_error, tls_bad_client_hello, _} -> {error, bad_hello} end. +%% TLS fatal decode_error alert (RFC 8446 §6). +%% Sent to clients whose ClientHello is structurally invalid or lacks an SNI, +%% making the proxy behave like a real TLS server rather than silently dropping. +-spec tls_decode_error_alert() -> binary(). +tls_decode_error_alert() -> + <>. + %% Derive a per-SNI 16-byte secret from the base secret, SNI domain and a salt. %% Derivation: SHA256(salt || hex32(base_secret) || sni_domain)[0:16] %% @@ -194,7 +206,9 @@ parse_client_hello(< + error({protocol_error, tls_bad_client_hello, bad_client_hello}). parse_suites(Bin) -> [Suite || <> <= Bin]. diff --git a/src/mtp_handler.erl b/src/mtp_handler.erl index 51b2d86..77e4044 100644 --- a/src/mtp_handler.erl +++ b/src/mtp_handler.erl @@ -516,6 +516,14 @@ attempt_fronting(replay_session_detected, SniDomain, Config -> do_front(SniDomain, Config, Acc, Ip, Listener, S) end; +attempt_fronting(tls_bad_client_hello, _Extra, + #state{sock = Sock, transport = Transport} = _S) -> + _ = Transport:send(Sock, mtp_fake_tls:tls_decode_error_alert()), + skip; +attempt_fronting(tls_no_sni, _Extra, + #state{sock = Sock, transport = Transport} = _S) -> + _ = Transport:send(Sock, mtp_fake_tls:tls_decode_error_alert()), + skip; attempt_fronting(_Type, _Extra, _S) -> skip. @@ -544,7 +552,7 @@ effective_secret(Data, Secret) -> mtp_fake_tls:derive_sni_secret(Secret, Sni, Salt); {error, Reason} -> %% No SNI — not a valid fake-TLS MTP connection; fail fast. - error({protocol_error, tls_invalid_digest, Reason}) + error({protocol_error, tls_bad_client_hello, Reason}) end end. diff --git a/test/single_dc_SUITE.erl b/test/single_dc_SUITE.erl index 239d7e6..3eec673 100644 --- a/test/single_dc_SUITE.erl +++ b/test/single_dc_SUITE.erl @@ -26,7 +26,8 @@ domain_fronting_fragmented_case/1, domain_fronting_replay_case/1, per_sni_secrets_on_case/1, - per_sni_secrets_wrong_secret_case/1 + per_sni_secrets_wrong_secret_case/1, + malformed_tls_hello_decode_error_case/1 ]). -export([set_env/2, @@ -735,6 +736,51 @@ per_sni_secrets_wrong_secret_case(Cfg) when is_list(Cfg) -> 1, mtp_test_metric:get_tags( count, [?APP, protocol_error, total], [?FUNCTION_NAME, tls_invalid_digest])). +%% @doc A structurally malformed ClientHello (ExtensionsLen=0 but data follows) must cause +%% the proxy to send a TLS fatal decode_error alert and then close the connection, +%% rather than crashing silently. +malformed_tls_hello_decode_error_case({pre, Cfg}) -> + setup_single(?FUNCTION_NAME, 10000 + ?LINE, #{}, Cfg); +malformed_tls_hello_decode_error_case({post, Cfg}) -> + stop_single(Cfg); +malformed_tls_hello_decode_error_case(Cfg) when is_list(Cfg) -> + Host = ?config(mtp_host, Cfg), + Port = ?config(mtp_port, Cfg), + %% Build a ClientHello that is structurally valid at the TLS record layer + %% (correct lengths, version bytes) but lies about ExtensionsLen=0 while + %% trailing bytes follow — this is the exact pattern seen from real scanners. + TlsPacketLen = 512, + HelloLen = 508, % TlsPacketLen - 4 (hello type + hello len field) + Random = crypto:strong_rand_bytes(32), + SessId = crypto:strong_rand_bytes(32), + CipherSuites = <<19, 1>>, % TLS_AES_128_GCM_SHA256, 2 bytes + %% Padding fills the rest of the TLS frame after ExtensionsLen to hit TlsPacketLen exactly. + %% Consumed so far inside the frame: hello_type(1)+hello_len(3)+version(2)+random(32) + %% +sessid_len(1)+sessid(32)+cs_len(2)+cs(2)+comp_len(1)+comp(1)+ext_len(2) = 79 + PaddingLen = TlsPacketLen - 79, + Padding = binary:copy(<<0>>, PaddingLen), + MalformedHello = <<22, 3, 1, TlsPacketLen:16, % TLS record header (handshake, TLS1.0) + 1, HelloLen:24, % ClientHello type + length + 3, 3, % legacy version (TLS1.2) + Random/binary, % 32-byte random + 32, SessId/binary, % session ID + 2:16, CipherSuites/binary, % cipher suites + 1, 0, % compression methods + 0:16, % ExtensionsLen = 0 (lie) + Padding/binary>>, % trailing bytes that should be extensions + {ok, Sock} = gen_tcp:connect(Host, Port, [binary, {active, false}], 2000), + ok = gen_tcp:send(Sock, MalformedHello), + %% Proxy must send back a TLS fatal decode_error alert (21, 3, 3, 0, 2, 2, 50) + ExpectedAlert = mtp_fake_tls:tls_decode_error_alert(), + {ok, Response} = gen_tcp:recv(Sock, byte_size(ExpectedAlert), 5000), + ?assertEqual(ExpectedAlert, Response), + %% Then close the connection + ?assertEqual({error, closed}, gen_tcp:recv(Sock, 0, 2000)), + gen_tcp:close(Sock), + ?assertEqual( + 1, mtp_test_metric:get_tags( + count, [?APP, protocol_error, total], [?FUNCTION_NAME, tls_bad_client_hello])). + setup_single(Name, MtpPort, DcCfg0, Cfg) -> setup_single(Name, "127.0.0.1", MtpPort, DcCfg0, Cfg).