Tunneling Investigation
This runbook is the family-specific continuation of Critical Alert Triage for any alert whose evidence points at the Domain Name System (DNS) protocol being used as a covert data channel. It covers the two trigger surfaces the DNS Tunneling Detection family produces — a per-query DNS Tunneling Suspicious Alert on the Hunt page, and the sibling DNS Tunneling Hourly aggregation that identifies the parent domain driving sustained per-query volume. Either surface routes here from the step-6 hand-off in the generic runbook, and the steps below take the analyst from "one or more DNS queries look like tunneling" to "a triage decision recorded in the incident-management system".
This runbook is written for Tier 1 Security Operations Center (SOC) analysts performing first-response triage on DNS anomalies, Tier 2 analysts resolving escalations when tunneling corroborates signals from another family, and threat hunters pivoting from a suspicious DNS lead into a broader compromise investigation. It assumes the analyst has already executed Critical Alert Triage through step 5 when the lead originated from an IOC-auto-escalated Critical alert, so the affected asset, the supporting sidebar evidence, the repetition history, and any same-flow corroboration are already in hand. For Medium or Low DNS Tunneling alerts triaged directly from the Hunt page queue, the analyst records the affected asset and its criticality tier before executing the runbook below.
First-use acronym expansions in this runbook: SOC (Security Operations Center), IOC (Indicator of Compromise), DNS (Domain Name System), TLS (Transport Layer Security), SNI (Server Name Indication), HTTP (Hypertext Transfer Protocol), HTTPS (Hypertext Transfer Protocol Secure), DoH (DNS over HTTPS), DoT (DNS over TLS), C2 (command-and-control), IP (Internet Protocol), RFC-1918 (the Internet Engineering Task Force standard reserving private IP ranges), NXDOMAIN (Non-Existent Domain DNS response), TXT (DNS text record type), NULL (DNS null record type), CNAME (DNS canonical-name record type), MX (DNS mail-exchange record type), TTL (Time-to-Live), FQDN (Fully Qualified Domain Name), TLD (Top-Level Domain), ASN (Autonomous System Number), GeoIP (geographical IP lookup), TIDB (Threat Intelligence Database), REPDB (Reputation Database), EDR (endpoint detection and response), VPN (Virtual Private Network), IR (incident response), PCAP (packet capture), DGA (Domain Generation Algorithm), MVP (Minimum Viable Product), FRD (Functional Requirements Document).
Trigger Scenario
An analyst reaches this runbook from one of two starting points. In either case the lead describes one or more DNS queries whose shape matches the tunneling pipeline's per-query indicator set, or a parent domain whose hourly query profile matches the pipeline's aggregate signature.
- Primary: DNS Tunneling Suspicious Alert on the Hunt page. The analyst opens the Hunt page, navigates to the All Alerts bucket, and selects the DNS Tunneling Suspicious Alert sub-tab. A row carries alert_type: "dns_tunneling", a severity label (Critical under IOC auto-escalation, High at suspicion score three or four, Medium at score two, Low at score one), the full query_name, the record_type, the query_length in characters, the subdomain_count, the suspicion_score on the 1–4 scale, and the per-indicator boolean flags (is_long_query, has_many_subdomains, is_unusual_type, has_large_response, has_encoded_label).
- Secondary: DNS Tunneling Hourly aggregation alert. The tunneling pipeline also outputs an hourly aggregate (dns_tunneling_hourly) grouped by parent domain over a one-hour tumbling window. A row on the aggregation surface carries total_queries, unique_subdomains, unique_sources, the counts of each per-query indicator class (long_query_count, many_subdomain_count, unusual_type_count, large_response_count), and the summary booleans has_high_unique_ratio (unique subdomains divided by total queries greater than 0.8) and has_high_query_rate (total queries greater than 100). The aggregation emits for any parent domain with more than five queries per hour, so a Low-signal parent can still surface here when analysts review the queue for noise and concentration.
The two trigger surfaces converge in this runbook because the investigation is the same: DNS traffic to one parent domain is carrying data the way a tunneling tool does, and the analyst needs to decide whether the channel is a covert exfiltration or command path, a legitimate service-discovery or telemetry protocol that happens to use long encoded subdomains, or a benign misconfiguration emitting unusual query patterns. A per-query alert points at the specific moment the pipeline scored a query as suspicious; the hourly aggregate points at the parent domain driving the volume. A lead on either surface prompts the analyst to look for the other — a per-query alert without the parent's hourly profile is thin evidence, and an hourly aggregate without drilling into its constituent queries leaves the actual payload shape unread.
Prerequisites
Before executing this runbook the analyst confirms the following.
- Steps 1 through 5 of Critical Alert Triage are complete when the lead is a Critical-severity alert. The affected asset, its owner and criticality tier, the sidebar evidence summary, and any repetition or same-flow correlation already live in the incident ticket. For Medium or Low DNS Tunneling alerts triaged directly, the analyst still records the affected asset and its criticality tier before executing the runbook.
- A Hunt session open on the originating DNS Tunneling Suspicious Alert sub-tab (for the primary trigger) or on the originating DNS Tunneling Hourly aggregate row (for the secondary trigger), with the detail sidebar visible.
- Access to the organization's asset inventory for the internal host named as the DNS client. On a DNS Tunneling alert the internal host is the querying source; the destination is either the organization's recursive resolver or, when the client bypassed the resolver, a direct outbound name server.
- A working mental model of the organization's DNS egress pattern — which resolver or resolvers the corporate network enforces, whether DNS over HTTPS (DoH) or DNS over TLS (DoT) is permitted or blocked, which internal subnets are allowed to issue direct outbound DNS to the public internet, and which service-discovery protocols running in the environment legitimately produce long-label or high-volume query profiles (for example, some anti-malware cloud-lookup clients, some content-delivery cache-busting services, some monitoring or analytics services).
- Access to the incident-management system used by the SOC and, when available, read access to the organization's EDR console so emergent host-level behavior following the tunneling activity can be corroborated against endpoint telemetry.
Missing any of these is a hard stop. A DNS tunneling investigation run without asset context tends to misread the direction of the activity; run without a DNS-egress baseline, it tends to over-escalate legitimate cloud-lookup traffic that encodes state into subdomain labels; run without a sense of the organization's service-discovery inventory, it tends to miss legitimate long-label protocols that resemble tunneling.
Investigation Steps
Each step is numbered, adds a distinct piece of evidence, and feeds the decision tree at the end. The analyst executes steps in order even when the verdict looks obvious early; a tunneling disposition is never safely recorded on the raw suspicion score alone because a score of three is reached in benign telemetry nearly as often as in active tunneling — the evidence that determines disposition lives in the query string shape, the parent-domain profile, the source host's DNS egress baseline, and the correlation against other detection families.
1. Open the alert sidebar and capture the dns_tunneling block
On the Hunt page, the analyst clicks the alert row to open the detail sidebar. The DNS Tunneling Suspicious section renders for any record carrying a dns_tunneling block and lists the full per-query payload; on the hourly aggregate surface, the parallel DNS Tunneling Hourly section lists the aggregate fields. The analyst records the following fields from the per-query alert before any pivot.
- dns_tunneling.query_name: the full queried Fully Qualified Domain Name (FQDN). Captured in full, character-for-character, so the encoding analysis in step 3 has the exact string to read.
- dns_tunneling.record_type: the DNS record type (A, AAAA, TXT, NULL, CNAME, MX, ANY, and so on). TXT, NULL, and CNAME are the types flagged by the is_unusual_type indicator because tunneling tools favor high-payload-capacity types.
- dns_tunneling.query_length: the length of the query name in characters. The is_long_query indicator fires when this value exceeds 50.
- dns_tunneling.subdomain_count: the number of subdomain levels in the query. The has_many_subdomains indicator fires when this value exceeds 3.
- dns_tunneling.first_subdomain: the leftmost subdomain label, captured separately because the has_encoded_label indicator fires when this label exceeds 25 characters and because encoded payloads concentrate in the leftmost label.
- dns_tunneling.answer_length: the total character length of the response answer field. The has_large_response indicator fires when this value exceeds 200 characters.
- dns_tunneling.suspicion_score: the integer sum of the five per-indicator booleans, on a 1–4 scale (the score on the alert surface is always at least 1 because alerts only emit when suspicion_score ≥ 1; a score of 4 is possible when four indicators coincide but the fifth does not).
- dns_tunneling.is_long_query, has_many_subdomains, is_unusual_type, has_large_response, has_encoded_label: the five per-indicator booleans. The pattern of which indicators fired is more diagnostic than the aggregate score (see step 2).
- dns_tunneling.malicious_iocs and dns_tunneling.iocs_checked: the IOC enrichment counters. A truthy malicious_iocs field is the auto-escalation signal read in step 7.
For an hourly aggregation alert the analyst captures the parallel field set from the DNS Tunneling Hourly section: the parent domain, total_queries, unique_subdomains, unique_sources, per-indicator aggregate counts (long_query_count, many_subdomain_count, unusual_type_count, large_response_count), has_high_unique_ratio (unique subdomains divided by total queries greater than 0.8 — the strongest aggregate-level tunneling signal, because benign telemetry reuses subdomains while tunneling emits mostly fresh ones), and has_high_query_rate (total queries greater than 100 per hour).
Above the dns_tunneling block the base network section carries src_ip, src_port, dest_ip (the recursive resolver), dest_port (almost always 53, but 853 on DNS over TLS and 443 on DNS over HTTPS), proto (UDP or TCP), app_proto (dns), and the community_id correlator. The analyst captures the full base record so subsequent pivots have a reference point. "DNS Tunneling Suspicious alert on 10.40.12.88 (finance-workstation-412, Tier 2), query YWNrLWVuY29kZWQtcGF5bG9hZC1ibG9iLXYy.c2.evil-tunnel.net (55 chars), record type TXT, subdomain_count 4, suspicion_score 4, community_id 1:abc..." is the shape of a capture note that survives handoff.
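The per-query thresholds listed in this step can be sanity-checked by hand with a few lines of Python. This is a minimal sketch, not the pipeline's code: the QueryFields class and both function names are hypothetical, the thresholds are the ones stated above, and the score is a plain sum of the booleans with no attempt to model how the pipeline bounds it to the 1–4 scale.

```python
from dataclasses import dataclass

# Record types the runbook names as flagged by is_unusual_type.
UNUSUAL_TYPES = {"TXT", "NULL", "CNAME"}


@dataclass
class QueryFields:
    """Hypothetical container for the fields recorded from the sidebar."""
    query_length: int
    subdomain_count: int
    record_type: str
    first_subdomain: str
    answer_length: int


def indicator_flags(q: QueryFields) -> dict[str, bool]:
    """Recompute the five per-indicator booleans from the recorded fields."""
    return {
        "is_long_query": q.query_length > 50,
        "has_many_subdomains": q.subdomain_count > 3,
        "is_unusual_type": q.record_type.upper() in UNUSUAL_TYPES,
        "has_large_response": q.answer_length > 200,
        "has_encoded_label": len(q.first_subdomain) > 25,
    }


def suspicion_score(flags: dict[str, bool]) -> int:
    """Plain sum of the booleans (assumption: no capping is modelled)."""
    return sum(flags.values())
```

Comparing the recomputed flags against the booleans shown in the sidebar is a quick way to catch a transcription error in the capture note before it reaches the ticket.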
2. Read the suspicion score and indicator pattern against the confidence ladder
The raw suspicion_score on the 1–4 scale is informative but not self-interpreting; the pattern of which indicators fired carries the real signal. The analyst locates the alert on the ladder from Behavioral Detections and, separately, reads which indicator combination fired.
| Score | Severity | Confidence | Interpretive Note |
|---|---|---|---|
| IOC auto-escalation fires (any feed hit on the query domain or source). | Critical | 0.99 | Treated as IOC-Critical per the auto-escalation rule. Confidence is not derived from the score at this band. |
| 4 — four indicators coincide. | High | 0.85 | Extreme signal. Rare in benign traffic; nearly always worth escalation on its own. |
| 3 — three indicators coincide. | High | 0.85 | Strong signal. Common in real tunneling; occasional in cloud-lookup telemetry with encoded state. |
| 2 — two indicators coincide. | Medium | 0.70 | Moderate signal. Appears in both tunneling and a handful of benign services; the parent-domain profile in step 4 is the disambiguating lever. |
| 1 — one indicator fires. | Low | 0.50 | Weak signal. Most queries at this tier are corroborating context rather than standalone priorities; large-volume concentration of Low alerts on one parent domain is itself a signal that the hourly aggregate should surface. |
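The ladder above can be expressed as a small lookup, which is useful when checking that a recorded severity matches the score. The sketch below is illustrative only: severity_for is a hypothetical name, and the mapping is copied from the table rather than taken from the pipeline.

```python
def severity_for(score: int, ioc_hit: bool) -> tuple[str, float]:
    """Map a suspicion score and IOC state to (severity, confidence) per the table."""
    if ioc_hit:
        # IOC auto-escalation overrides the score-derived band.
        return ("Critical", 0.99)
    if score in (3, 4):
        return ("High", 0.85)
    if score == 2:
        return ("Medium", 0.70)
    if score == 1:
        return ("Low", 0.50)
    raise ValueError("score is outside the 1-4 ladder described in the table")
```

Checking the IOC state first mirrors the table's note that confidence is not derived from the score in the Critical band.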
Beyond the aggregate score, the analyst reads the indicator combination. Certain combinations are more diagnostic of tunneling than others.
- is_long_query + has_encoded_label together. A long query whose length is concentrated in the leftmost label is the classic tunneling shape: the encoded payload rides in the first subdomain. This pair with a third indicator (unusual type, many subdomains, large response) is the strongest per-query tunneling signal.
- is_unusual_type + has_large_response together. A TXT or NULL query with a response over 200 characters is the signature of response-channel tunneling (the server carries the payload back to the client).
- has_many_subdomains + has_encoded_label together. A deeply-nested query whose labels look encoded is a subdomain-chunking fingerprint; some tunneling tools split long payloads across multiple labels.
- is_long_query alone or has_many_subdomains alone. Either on its own is weak evidence; many legitimate services produce long queries (cloud-lookup clients, content-delivery cache-busting) or deep subdomains (service-discovery protocols).
The analyst records the score and the indicator pattern explicitly in the ticket. "Suspicion score 4 (long_query + encoded_label + unusual_type + large_response; TXT record, 55-char query, 36-char leftmost label, 412-char response; three-indicator tunneling fingerprint plus the response-channel signal)" is the shape of a note that survives handoff and reveals the disposition weight at a glance.
3. Examine the query_name string for visible encoding patterns
This is the fastest-to-read, highest-signal step in the runbook when the alert is per-query. The query_name is captured in full on the row and the sidebar; the analyst reads it character-by-character for the shapes tunneling tools produce.
- Base64-like leftmost label. Characters are A–Z, a–z, 0–9, plus + (standard Base64) or - and _ (URL-safe Base64 substitutions). No pronounceable structure. Example: YWNrLWVuY29kZWQtcGF5bG9hZC1ibG9iLXYy. A label that looks like Base64 is strong evidence of an encoded payload.
- Hexadecimal leftmost label. Characters are 0–9 and a–f. Common in simpler tunneling tools that hex-encode payloads. Example: 4869206d6f6d2068657265206973206d792074756e6e656c. A hex-only long label is almost never legitimate.
- Hyphen-delimited chunked encoding. Some tools split a payload into hyphen-separated segments of fixed length. Example: ack-encoded-payload-blob-v2. The chunk pattern is a subdomain-chunking fingerprint.
- Domain-Generation-Algorithm (DGA) shape. The label looks algorithmic but does not appear encoded: consonant-vowel patterns with no pronounceable structure, fixed-length random-looking strings. Example: xz8kqt4p7vg2rx. A DGA shape in the leftmost label is weak evidence of tunneling on its own but is a DGA-family signal that may route the lead to C2 Beacon Investigation via the DGA Detection alert.
- Record-type mismatch. A TXT or NULL record type on a short, pronounceable label is anomalous in a different way: legitimate TXT records exist (Sender Policy Framework, DomainKeys Identified Mail, site-verification tokens) but they are typically issued by mail servers or verification clients, not by end-user workstations. A TXT query from a workstation is itself a context signal even before the label shape is read.
- Pronounceable, short, and structured. If the leftmost label is a word or a short abbreviation (for example, www, mail, api, cdn, prod, login), the encoding-shape case is weak. The alert may still be interesting for record-type or answer-length reasons, but the payload-encoding hypothesis is not supported by the string itself.
The analyst records the shape finding explicitly. "Leftmost label YWNrLWVuY29kZWQtcGF5bG9hZC1ibG9iLXYy — 36 characters, Base64 alphabet (no non-Base64 characters, no obvious padding), no pronounceable structure, consistent with an encoded payload of approximately 27 decoded bytes" is the kind of finding that determines whether the evidence stands on its own or needs corroboration to justify escalation.
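The shape reading in this step can be approximated with a small classifier, which also reproduces the decoded-length estimate in the note above. This is a rough sketch under stated assumptions: classify_label and decoded_length are hypothetical helpers, the 26-character floor simply mirrors the has_encoded_label threshold, and the mixed-case-plus-digit test for Base64 is a heuristic of this sketch rather than anything the pipeline is known to do. It does not attempt to detect the DGA shape.

```python
import base64
import re

HEX_RE = re.compile(r"[0-9a-f]+")
B64_RE = re.compile(r"[A-Za-z0-9+_-]+")
COMMON_LABELS = {"www", "mail", "api", "cdn", "prod", "login"}
ENCODED_MIN_LEN = 26  # assumption: mirrors "label exceeds 25 characters"


def classify_label(label: str) -> str:
    """Return a coarse shape name for the leftmost label (heuristic only)."""
    if label.lower() in COMMON_LABELS:
        return "short_structured"
    long_enough = len(label) >= ENCODED_MIN_LEN
    # Hex is checked first because hex strings also fit the Base64 alphabet.
    if long_enough and HEX_RE.fullmatch(label):
        return "hex"
    mixed = (
        any(c.isupper() for c in label)
        and any(c.islower() for c in label)
        and any(c.isdigit() for c in label)
    )
    if long_enough and mixed and B64_RE.fullmatch(label):
        return "base64_like"
    parts = label.split("-")
    if len(parts) >= 3 and all(p.isalnum() for p in parts):
        return "hyphen_chunked"
    return "unclassified"


def decoded_length(label: str) -> int:
    """Byte length of the label when decoded as (URL-safe) Base64."""
    padded = label + "=" * (-len(label) % 4)
    return len(base64.urlsafe_b64decode(padded))
```

For the example label YWNrLWVuY29kZWQtcGF5bG9hZC1ibG9iLXYy the classifier returns base64_like and decoded_length returns 27, matching the estimate recorded in the finding. An "unclassified" result means the heuristics did not match, not that the label is benign.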
4. Pivot to the parent domain and read the hourly aggregate
A per-query alert shows one query; the parent-domain hourly aggregate shows whether that query is a one-off or part of a sustained pattern. This is the single strongest disambiguation step in the runbook between tunneling and benign telemetry. Tunneling concentrates on a small number of parent domains and generates high unique-subdomain variety per parent per hour; benign telemetry reuses the same subdomain set and produces low variety.
From the alert sidebar the analyst right-clicks the parent domain (the second-level and top-level domain components — for example, evil-tunnel.net on the query YWNrLWVuY29kZWQtcGF5bG9hZC1ibG9iLXYy.c2.evil-tunnel.net) and selects Hunt all events for this domain (or, equivalently, opens a new tab filtered on the parent). A separate pivot reads the DNS Tunneling Hourly aggregate for the same parent when one exists on the DNS Tunneling Hourly aggregate sub-tab.
- total_queries for the parent over the last hour. A parent driving over 100 queries per hour satisfies has_high_query_rate; over 1,000 queries per hour is the scale at which tunneling tools operate for interactive sessions or bulk exfiltration. A parent with fewer than 20 queries per hour is weak aggregate evidence regardless of the per-query indicator pattern.
- unique_subdomains and the has_high_unique_ratio boolean. The ratio of unique subdomains to total queries is the most diagnostic aggregate field. Legitimate telemetry reuses subdomains (the same endpoint is queried many times); tunneling emits a fresh subdomain per query (each subdomain carries a different payload chunk). A ratio above 0.8 (four out of five queries to a unique subdomain) is the pipeline's threshold and matches tunneling; a ratio under 0.3 matches legitimate telemetry.
- unique_sources. A single source driving all the traffic is a host-scoped lead: one compromised or misconfigured workstation. Multiple sources on the same parent are either a coordinated compromise (rare but decisive) or a legitimate shared service the whole organization uses.
- Per-indicator aggregate counts (long_query_count, many_subdomain_count, unusual_type_count, large_response_count). These fields show which indicators dominate the parent's hourly profile. A parent whose hourly profile is dominated by unusual_type_count (mostly TXT records) and long_query_count is a strong tunneling fingerprint; a parent whose profile is dominated by many_subdomain_count alone with no label-length contribution is more likely a deeply-nested service-discovery protocol.
The analyst records the aggregate reading explicitly. "Parent evil-tunnel.net, last hour: 847 total queries from 1 source, 841 unique subdomains (0.993 unique ratio — far above threshold), 834 long-query matches, 814 unusual-type matches, 421 encoded-label matches; has_high_unique_ratio: true, has_high_query_rate: true — tunneling-scale aggregate from a single host" is the shape of a finding that carries nearly the full weight of the disposition on its own. "Parent telemetry.vendor-cloud.example, last hour: 312 total queries from 84 sources, 12 unique subdomains (0.038 unique ratio), query concentration on three endpoints — benign telemetry profile despite per-query alerts firing at score 2" is the shape of a finding that weighs strongly toward close or tune.
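The two ratios quoted in the example notes can be reproduced directly from total_queries and unique_subdomains. The sketch below is a worked check of that arithmetic; aggregate_flags and ratio_reading are hypothetical names, and the 0.8, 0.3, and 100 cut-offs are the ones stated in this step.

```python
def aggregate_flags(total_queries: int, unique_subdomains: int) -> dict:
    """Recompute the unique ratio and the two summary booleans."""
    ratio = unique_subdomains / total_queries if total_queries else 0.0
    return {
        "unique_ratio": round(ratio, 3),
        "has_high_unique_ratio": ratio > 0.8,
        "has_high_query_rate": total_queries > 100,
    }


def ratio_reading(ratio: float) -> str:
    """Place a unique ratio on the bands described in this step."""
    if ratio > 0.8:
        return "matches tunneling"
    if ratio < 0.3:
        return "matches legitimate telemetry"
    return "ambiguous"


tunnel = aggregate_flags(847, 841)      # unique_ratio 0.993
telemetry = aggregate_flags(312, 12)    # unique_ratio 0.038
```

Note that the telemetry example still sets has_high_query_rate (312 is greater than 100) while its unique ratio stays far below the threshold, which is why the ratio, not the rate, carries the disambiguation.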
5. Check the source host's DNS egress — corporate resolver or direct outbound
The dest_ip on the alert is the DNS server the client queried. Whether that server is the organization's sanctioned recursive resolver or an unsanctioned external name server is a high-signal context field that the per-query indicators do not capture.
- Corporate recursive resolver as dest_ip. The normal pattern for compliant clients. A tunneling alert where the query rode the corporate resolver indicates the tunneling is either riding the resolver transparently (the attacker's authoritative server resolves the encoded subdomain, and the resolver forwards the answer in the usual way) or the pipeline flagged a benign long-label query the resolver answered honestly. Neither case rules tunneling out; the resolver is not a filter for tunneling content, only a forwarder.
- Direct outbound to a public recursive resolver (for example, 8.8.8.8, 1.1.1.1, 9.9.9.9) or to an attacker-controlled name server. Clients that bypass the corporate resolver for a named external resolver are policy-violating on most networks and are a meaningful signal on their own. Clients that bypass the resolver entirely and issue direct outbound queries to an arbitrary external IP on port 53 are strongly suspicious, because tunneling tools routinely hard-code the operator's authoritative name server as the destination to avoid caching behavior at intermediate resolvers.
- DNS over HTTPS (DoH) or DNS over TLS (DoT) destination. Queries on dest_port 443 (DoH) or 853 (DoT) to a public DoH or DoT provider indicate a client configured to encrypt DNS to an external endpoint. Where DoH and DoT are not sanctioned on the network, this is a policy finding in its own right and the tunneling alert may be a sub-signal of a broader configuration drift.
- Internal egress concentration. If many sources concentrate on the same non-corporate resolver, the investigation shifts from per-host to environment-level: a policy push went wrong, a software rollout hard-coded the wrong resolver, or an attacker established a shared egress point.
The analyst records the egress reading. "Source 10.40.12.88 queried 203.0.113.53 directly on UDP/53 — not the corporate resolver 10.0.0.53 — policy-violating egress in its own right" is a disposition-moving finding. "Source queried corporate resolver 10.0.0.53 normally; tunneling rode the resolver as a benign forwarder" is a finding that shifts weight to the parent-domain profile in step 4.
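The egress categories in this step can be sketched as a small classifier over dest_ip and dest_port. Everything here is an assumption for illustration: classify_egress is a hypothetical helper, the corporate resolver address 10.0.0.53 comes from the example note above, the public-resolver set holds only the three addresses named in this step, and port 443 is treated as DoH only because the record is already a DNS alert.

```python
import ipaddress

# Assumption: the organization's sanctioned resolvers (from the example note).
CORPORATE_RESOLVERS = {ipaddress.ip_address("10.0.0.53")}
# Assumption: only the public resolvers named in this step.
PUBLIC_RESOLVERS = {
    ipaddress.ip_address(a) for a in ("8.8.8.8", "1.1.1.1", "9.9.9.9")
}


def classify_egress(dest_ip: str, dest_port: int) -> str:
    """Bucket a DNS alert's destination into the egress categories of step 5."""
    ip = ipaddress.ip_address(dest_ip)
    if ip in CORPORATE_RESOLVERS:
        return "corporate_resolver"
    if dest_port == 853:
        return "dot_external"
    if dest_port == 443:
        return "doh_external"
    if ip in PUBLIC_RESOLVERS:
        return "public_resolver_bypass"
    return "direct_external_nameserver"
```

With the values from the example note, classify_egress("203.0.113.53", 53) falls into the direct-external bucket, the strongest of the egress findings described above.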
6. Pull related DNS events for the same source over the prior 6 hours
A per-query alert is a single event; the prior several hours of the source host's DNS activity reveal whether the alerting query is isolated (a single anomaly) or part of a sustained stream (an active tunnel). The six-hour window is a reasonable default — long enough to capture a short interactive session or the ramp of a multi-hour exfiltration, short enough to stay within first-response triage time budgets.
The analyst right-clicks the internal host's IP and selects Hunt all events from this IP. A new All Events tab opens on the Hunt Page filtered to the host; the analyst widens the time range to Last 6 hours and narrows the bucket to Network Sessions → DNS.
- How many queries fired in the window, total. A host that normally issues tens of DNS queries per hour for daily use that now issues thousands of queries per hour is sharply contrasted against its own baseline — the volume alone is a disposition lever regardless of the per-query indicator pattern.
- How many queries hit the alerting parent domain. The concentration ratio (queries to the alerting parent divided by total queries from the source) is decisive. A source whose DNS egress is dominated by one parent domain (say, 80 percent of queries) with encoded-label and unusual-type patterns is running a tunnel on that parent. A source whose queries to the alerting parent are a handful amid normal traffic is ambiguous — the parent may still be tunneling, but this source is not driving it.
- Temporal cadence. Tunneling queries cluster tightly — bursts of hundreds of queries within seconds or minutes, followed by pauses. The cadence is visible by reading the timestamps on consecutive rows. A regular inter-query interval (every 250 milliseconds, every second, every five seconds) driven by a single source is a timer-driven tunneling heartbeat and is a strong signal.
- Record-type distribution. A host whose queries in the window are overwhelmingly TXT, NULL, or CNAME has a record-type profile characteristic of tunneling. A host whose queries are overwhelmingly A / AAAA with a handful of unusual-type alerts is not, on this evidence, running an unusual-type tunnel.
- NXDOMAIN ratio. A high NXDOMAIN ratio alongside tunneling indicators is a DGA-plus-tunneling pattern; the DGA Detection alert described in Behavioral Detections may have fired in parallel on the same source. When that happens, the analyst hands off to C2 Beacon Investigation because DGA-plus-tunneling is frequently a two-stage rendezvous-then-channel pattern.
The analyst records the six-hour reading. "Source 10.40.12.88, last 6 hours: 2,414 DNS queries total, 2,187 to parent evil-tunnel.net (90.6 percent concentration), all TXT or NULL record type, average inter-query interval 9.2 seconds with low variance, 0 NXDOMAINs — sustained single-parent tunneling profile over the entire window" is the shape of a finding that closes nearly every disposition ambiguity. "Source last 6 hours: 47 DNS queries total, 3 to the alerting parent (6.4 percent concentration), mixed record types, sporadic cadence — isolated per-query alerts against a normal baseline" is the shape that weighs toward monitor or close.
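When the six-hour result set is exported from the Hunt tab, the readings in this step reduce to a few counts and one interval calculation. The sketch below assumes a hypothetical export format: a list of dicts with ts (epoch seconds), query_name, record_type, and rcode keys. Those key names and the source_profile function are illustrative, not a documented export schema.

```python
from collections import Counter
from statistics import mean, pstdev


def source_profile(events: list[dict], parent: str) -> dict:
    """Summarize one source's DNS events against the alerting parent domain."""
    total = len(events)
    to_parent = [
        e for e in events
        if e["query_name"] == parent or e["query_name"].endswith("." + parent)
    ]
    # Inter-query gaps on the parent only; low spread suggests a timer.
    ts = sorted(e["ts"] for e in to_parent)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    nxdomain = sum(e["rcode"] == "NXDOMAIN" for e in events)
    return {
        "total_queries": total,
        "parent_queries": len(to_parent),
        "concentration": len(to_parent) / total if total else 0.0,
        "mean_gap_s": mean(gaps) if gaps else None,
        "gap_stdev_s": pstdev(gaps) if gaps else None,
        "record_types": Counter(e["record_type"] for e in events),
        "nxdomain_ratio": nxdomain / total if total else 0.0,
    }
```

A high concentration combined with a small gap_stdev_s relative to mean_gap_s corresponds to the "low variance" cadence in the first example note; the numbers are inputs to the analyst's judgment, not a verdict.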
7. Check for IOC enrichment on the query domain
The IOC auto-escalation check produces the same disposition pressure it does in every other family-specific runbook. The alert sidebar carries the C2 Enrichment section whenever any C2 feed match fired on the query name or any resolved IP; the Insights Enrichment section carries parallel matches from the OPSWAT InSights Threat Intelligence Database (TIDB) and Reputation Database (REPDB) feeds.
- Any c2.matches[] entry present on the query domain, the parent domain, or any resolved IP. The alert is IOC-Critical by construction. Severity is Critical, alert-level confidence is 0.99, and the match payload identifies the threat actor or malware family the indicator belongs to. A DNS Tunneling alert whose parent domain is a C2-flagged entity is a two-source corroborated finding and is sufficient for escalation to incident response on its own.
- Pipeline malicious_iocs > 0 with iocs_checked > 0. The tunneling pipeline's independent IOC check observed a feed hit on the query domain during evaluation. The pipeline auto-escalates its own severity to Critical and confidence to 0.99 under the same auto-escalation rule.
- Parallel InSights TIDB match on the query domain or its parent. See InSights TIDB and REPDB for the feed semantics. A TIDB hit promotes the alert to Critical under the IOC auto-escalation rule.
- Parallel InSights REPDB match on the query domain or its parent. A REPDB hit is lower-confidence than a TIDB hit but is still material and also promotes the alert to Critical; it carries the most weight when the REPDB category names a class consistent with tunneling (dynamic DNS providers, newly-registered domain services, known tunneling-as-a-service operators).
- No IOC match on any feed. The alert is threshold-severity on behavioral evidence alone. The case for escalation now depends on the indicator pattern in step 2, the query-string shape in step 3, the parent-domain profile in step 4, the egress posture in step 5, and the six-hour source profile in step 6.
The analyst records the IOC state explicitly. "IOC match present on parent evil-tunnel.net via C2 feed entry naming 'Storm-1877 dnscat2 operator'" is a disposition-moving finding. "No IOC match on the query, the parent, or any resolved IP — the case rests on the aggregate profile" shifts the disposition weight to the other evidence.
8. Check TLS and HTTPS traffic from the same source for a multi-protocol tunnel
Some tunneling tools operate DNS-only, some operate HTTPS-only, and some maintain both channels concurrently — DNS as a low-bandwidth signaling or fallback channel and HTTPS as the high-bandwidth data channel. A DNS tunneling alert with concurrent HTTPS traffic to the same parent domain (or to a related domain under the same operator) is a multi-protocol tunneling signature that escalates the disposition above any single-protocol read.
The analyst widens the same Hunt all events from this IP tab used in step 6 (or opens a new one) and switches the bucket to Network Sessions → TLS and Network Sessions → HTTP successively, holding the Last 6 hours time range.
- TLS sessions to the alerting parent domain or related operator-controlled domains. A TLS session whose SNI matches the alerting parent (or a related domain whose registration, ASN, or resolved IP ties it to the same operator) is concurrent evidence of an HTTPS leg on the tunnel. The analyst captures the SNI, the JA3 / JA4 client fingerprint, the certificate subject and issuer, and the session byte counts for correlation.
- HTTP sessions to unresolved IPs with no SNI. Plaintext HTTP sessions with a Host header matching the alerting parent, or HTTP sessions to bare IPs with unusual User-Agent strings (toolkit defaults like python-requests, Go-http-client, or hand-rolled transports), are the other shape of a parallel data channel.
- Flow-level byte asymmetry on TLS or HTTPS sessions to the parent. When the concurrent HTTPS leg carries substantially more bytes in one direction than the other over the same window, the session is potentially the data leg of the tunnel and the DNS alerts are the signaling leg. A concurrent Data Exfiltration alert fires when the upload volume crosses the pipeline's threshold, giving the investigation a second independent signal.
- No concurrent TLS or HTTPS traffic to the parent or related operators. The tunnel, if present, is DNS-only. This is the common case for tools like iodine and dnscat2 in their default configurations; it does not weaken the DNS tunneling hypothesis but closes off one corroboration path.
The analyst records the multi-protocol reading. "No TLS or HTTPS traffic from this source to evil-tunnel.net or any related parent in the last 6 hours — DNS-only tunnel" is a finding. "TLS sessions from this source to cdn.evil-tunnel.net (same parent, different subdomain) over the same window, 47 MB uploaded versus 2 MB downloaded, JA3 fingerprint a1b2c3d4... not matching any recognized client — concurrent HTTPS upload channel, multi-protocol tunnel signature" is a finding that strongly weighs toward escalation and cross-cuts into a data exfiltration investigation on the same source.
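The byte-asymmetry reading in the second example note is a sum over the TLS sessions whose SNI falls under the alerting parent. The sketch below assumes a hypothetical session export with sni, bytes_to_server, and bytes_to_client keys; those names and the parent_byte_totals helper are illustrative, and no escalation threshold is implied because the pipeline's Data Exfiltration threshold is not stated here.

```python
def parent_byte_totals(sessions: list[dict], parent: str) -> tuple[int, int]:
    """Sum upload and download bytes for sessions whose SNI is under the parent."""
    up = down = 0
    for s in sessions:
        sni = s.get("sni") or ""  # SNI may be absent or hidden by ECH
        if sni == parent or sni.endswith("." + parent):
            up += s["bytes_to_server"]
            down += s["bytes_to_client"]
    return up, down


MB = 1_000_000
sessions = [
    {"sni": "cdn.evil-tunnel.net", "bytes_to_server": 47 * MB, "bytes_to_client": 2 * MB},
    {"sni": "www.example.com", "bytes_to_server": 1 * MB, "bytes_to_client": 30 * MB},
    {"sni": None, "bytes_to_server": 5 * MB, "bytes_to_client": 5 * MB},
]
up, down = parent_byte_totals(sessions, "evil-tunnel.net")
upload_ratio = up / max(down, 1)  # 23.5 for the figures in the example note
```

Sessions without an SNI are skipped by this sketch, so a zero total does not prove the absence of an HTTPS leg; that gap is what the PCAP request is for.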
When session evidence across DNS, TLS, and HTTP is insufficient — the DNS answers were not recorded in full, the TLS SNI was absent or encrypted via Encrypted Client Hello, or the HTTP transactions lacked header detail — the analyst requests a PCAP for the alert's time window through the organization's PCAP request workflow. PCAP availability is configuration-dependent and selective on MetaDefender NDR (see Alert, Flow, and PCAP Pivoting); when one is available, the analyst inspects the raw DNS payload for the bytes the sidebar cannot surface (full response record data, truncated-response flags, unusual response codes). When a PCAP is unavailable, the investigation concludes on the evidence already gathered, and the analyst explicitly records the gap in the ticket.
Decision Tree
The analyst records one of four outcomes. Each branch lists the minimum artifacts captured before the ticket is closed.
- Escalate — confirmed or probable tunneling; contain source and investigate compromise. The disposition when any of the following holds: the alert is IOC-Critical (C2, TIDB, or REPDB match on the query domain or its parent) on any severity tier; the per-query suspicion score is 3 or 4 and the parent-domain hourly aggregate shows has_high_unique_ratio: true or has_high_query_rate: true; the six-hour source profile in step 6 shows a sustained single-parent concentration with a cadence-driven inter-query interval; a concurrent TLS or HTTPS leg to the parent identifies a multi-protocol tunnel; or the source's DNS egress bypasses the corporate resolver to an attacker-controlled name server. The analyst opens an incident response (IR) ticket, requests endpoint isolation for the affected host, hands the host off to forensic recovery and compromise assessment, blocks the parent domain at the organization's DNS firewall or proxy if policy allows indicator-based blocking, and transfers the evidence record and pivot tabs into the incident. When a concurrent Data Exfiltration or Beaconing alert fires on the same source, the analyst hands off in parallel to Data Exfiltration Investigation or C2 Beacon Investigation. The Escalate branch is where this runbook ends and incident-response procedures take over.
- Monitor — protocol misuse without confirmed exfiltration. The disposition when the evidence is partial: a Medium- or Low-severity alert on a Tier 2 or Tier 3 asset with no IOC enrichment, a parent-domain profile that is unusual but does not clear the has_high_unique_ratio threshold, a six-hour source profile that shows isolated alerts rather than sustained concentration, no multi-protocol leg, and no policy-violating egress posture. The analyst records the observation in the ticket, leaves the Hunt tabs open, schedules a follow-up review (typically within twenty-four hours for Medium and forty-eight hours for Low), and watches for recurrence on the same source, for a belated corroborating signal from another family, or for the hourly aggregate to exceed thresholds. When the pattern recurs or the aggregate trips, the analyst re-enters at step 1 with the accumulated context and re-evaluates; when neither surfaces, the disposition moves to close as benign.
- Close as benign — identified legitimate explanation. The disposition when the evidence positively identifies a legitimate driver: the parent domain belongs to a recognized telemetry or cloud-lookup vendor (anti-malware cloud reputation services, content-delivery cache-busting, corporate anti-phishing protection, vendor licensing check-ins, monitoring or analytics services), the per-query encoding is known vendor behavior rather than a payload, the parent-domain hourly aggregate shows the low unique-subdomain ratio characteristic of telemetry rather than the high ratio characteristic of tunneling, and no other detection family has corroborated on the same source. The analyst records the specific fields that justify the conclusion in the ticket so later readers can reopen the conclusion when one of those fields changes. The shape of a defensible close-note is: "parent lookup.vendor-av.example belongs to the organization's anti-malware vendor's cloud-reputation lookup service (vendor documentation URL, corporate asset inventory entry), queries encode a per-file hash into the first subdomain, parent hourly aggregate has_high_unique_ratio: true is expected per-file uniqueness, has_high_query_rate: false, no concurrent HTTPS data leg, no IOC enrichment, closing as benign".
- Tune the rule — recurring benign service-discovery or telemetry pattern with a scoped policy fix. The disposition when the same benign pattern has been confirmed multiple times across the same population, the source of the noise is understood (a specific vendor's telemetry client producing long encoded subdomains, a specific service-discovery protocol producing deeply-nested names, a specific corporate cache-busting service producing random-looking labels), and the Policy surface for the DNS Tunneling family supports a scoped exclusion. The tuning lever documented in Behavioral detections — DNS Tunneling — Tuning Considerations is the parent-domain allowlist: the analyst identifies the benign parent from the dns_tunneling_hourly aggregate and adds it to Policy so the pipeline skips the family of queries under that parent without changing the per-query thresholds. Raising the per-query indicator thresholds themselves (the 50-character query-length threshold, the 3-subdomain threshold, the 25-character leftmost-label threshold, the 200-character answer-length threshold) is explicitly discouraged per the detection chapter: an attacker that stays just under a raised threshold is still tunneling, and the raised threshold reduces coverage for that attacker's traffic. Tuning is a detection-engineering action, not a triage shortcut: the analyst opens a follow-up task against the DNS Tunneling Policy definition rather than tuning inline, submits the change through the organization's peer-review workflow, and keeps the alert live until the tuned policy deploys. The default disposition for a single confirmed-benign alert is close, not tune: a single occurrence does not justify allowlisting a parent domain, and the allowlist reduces coverage for future variants of the matched pattern.
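The four branches above can be sketched as a single function over the evidence the analyst has gathered. This is a hedged illustration only: the `Evidence` field names are hypothetical labels for the readings described in the branches, and the ordering (escalation triggers checked first, then benign identification, with Monitor as the fallback) is an assumption of this sketch rather than documented product logic.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical evidence summary; every field name is an assumption."""
    ioc_critical: bool = False              # C2, TIDB, or REPDB match
    suspicion_score: int = 0                # per-query suspicion score
    has_high_unique_ratio: bool = False     # parent hourly aggregate flag
    has_high_query_rate: bool = False       # parent hourly aggregate flag
    sustained_single_parent: bool = False   # six-hour profile, cadence-driven
    multi_protocol_leg: bool = False        # concurrent TLS/HTTPS to parent
    resolver_bypass: bool = False           # egress bypasses corporate resolver
    benign_driver_confirmed: bool = False   # vendor identity positively confirmed
    other_family_corroboration: bool = False
    recurring_benign_pattern: bool = False  # confirmed benign multiple times


def disposition(e: Evidence) -> str:
    aggregate_tripped = e.has_high_unique_ratio or e.has_high_query_rate
    # Any single escalation trigger is sufficient.
    if (
        e.ioc_critical
        or (e.suspicion_score in (3, 4) and aggregate_tripped)
        or e.sustained_single_parent
        or e.multi_protocol_leg
        or e.resolver_bypass
    ):
        return "escalate"
    # A benign close needs a positively identified driver and no corroboration.
    if e.benign_driver_confirmed and not e.other_family_corroboration:
        # "tune" means: open a follow-up policy task; a single benign
        # occurrence defaults to close, not tune.
        return "tune" if e.recurring_benign_pattern else "close_as_benign"
    # Partial evidence: schedule a follow-up review.
    return "monitor"
```

For example, `disposition(Evidence(suspicion_score=3, has_high_query_rate=True))` returns `"escalate"`, while `disposition(Evidence(suspicion_score=2))` falls through to `"monitor"`. The function does not replace the analyst's reading of each field; it only makes the precedence between branches explicit.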
Every branch is recorded in the incident-management system. The runbook reference, the disposition, the asset context, the suspicion score and indicator pattern, the query-string encoding reading, the parent-domain aggregate reading, the egress-posture reading, the six-hour source profile, the IOC state, and the pivot tab references form the minimum record. When the disposition is Escalate, the record hands off directly into the incident.
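A completeness check for the minimum record can be sketched as follows. The key names are hypothetical labels for the ten items listed above; the incident-management system's actual field names are not specified in this runbook.

```python
# Hypothetical keys mirroring the minimum record listed above.
REQUIRED_FIELDS = (
    "runbook_reference",
    "disposition",
    "asset_context",
    "suspicion_score_and_indicators",
    "query_encoding_reading",
    "parent_aggregate_reading",
    "egress_posture_reading",
    "six_hour_source_profile",
    "ioc_state",
    "pivot_tab_references",
)


def missing_record_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in `record`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


draft = {
    "runbook_reference": "DNS Tunneling Investigation",
    "disposition": "Monitor",
    "asset_context": "",
}
gaps = missing_record_fields(draft)
```

Here `gaps` lists `asset_context` and the seven fields the draft omits, which is the set the analyst would fill in before closing the ticket.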
Common False-Positive Patterns
A material share of DNS Tunneling alerts at the Medium and Low tiers have benign explanations. Recognizing the patterns saves investigation time and keeps analysts from over-escalating legitimate services that rely on long-label DNS.
- Anti-malware cloud-lookup telemetry. Many endpoint anti-malware products perform cloud reputation lookups by encoding a file hash, a URL hash, or a behavioral fingerprint into the leftmost subdomain label of a vendor-controlled parent domain. The result is a high-volume, high-unique-subdomain profile that trips has_high_unique_ratio and has_high_query_rate on the parent aggregate, and is_long_query plus has_encoded_label on individual queries. The parent domain is owned by the vendor and usually well-documented. Disposition: close when the parent identity is confirmed; tune via parent-domain allowlist when the volume sustains across the endpoint population.
- Content-delivery cache-busting. Some content-delivery networks and analytics platforms encode per-request identifiers into the first subdomain label for cache-busting, per-user tracking, or session routing. The encoding shape resembles Base64 or hex and trips has_encoded_label, is_long_query, and occasionally has_many_subdomains. Disposition: close when the parent identity is confirmed against the vendor inventory; tune via allowlist when the volume sustains.
- Domain-validation and anti-phishing services. Corporate anti-phishing gateways, link-protection services, and domain-reputation providers perform in-line DNS lookups against custom vendor parents that encode the inspected URL into the subdomain. Trips is_long_query, has_encoded_label, and sometimes is_unusual_type (TXT records for policy metadata). Disposition: close when the security-stack inventory confirms the vendor; tune via allowlist when the volume is cross-population.
- Service-discovery protocols and DNS-SD. Some service-discovery protocols (DNS Service Discovery with PTR, SRV, and TXT records, multicast DNS variants, cloud-service-mesh discovery) produce deeply-nested DNS names and TXT record queries. Trips has_many_subdomains and is_unusual_type. Disposition: close when the query pattern matches documented service-discovery usage; tune via allowlist only when the volume is sustained and the protocol identity is unambiguous.
- Long-label branded or marketing domains. Some domains register long, unusual, or marketing-encoded names that trip is_long_query on A record queries. Common examples are vanity domains, campaign-tracking URLs, and some enterprise-registered domains with descriptive subdomain schemes. Trips is_long_query alone, rarely with a second indicator. Disposition: close on the weak-signal argument; tune is rarely appropriate because a single indicator at the Low tier is already near the noise floor and does not justify policy changes.
- DNS-based monitoring probes. External uptime or synthetic-monitoring services (and internal network-monitoring variants) query specific probe-named subdomains under a vendor parent on a fixed cadence to measure DNS egress health. Trips is_long_query or has_many_subdomains when probe names are descriptive, but typically produces a low unique-subdomain ratio because the probe set is small and reused. Disposition: close when the monitor identity is confirmed; tune via allowlist when the monitor produces sustained volume across the population.
- Misconfigured applications querying dead hostnames. Applications with stale configuration, legacy DNS clients with bad cache eviction, or configuration-drift bugs can issue repeated queries to the same set of odd-shaped names. Trips individual indicators (is_long_query, is_unusual_type, depending on the misconfiguration) without clearing the aggregate thresholds because the unique-subdomain count stays low. Disposition: close once the misconfiguration is identified and the owner is notified; the parallel DGA Detection alert will also fire and shares the same remedy.
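The contrast the patterns above rely on (per-query shape versus per-parent unique-subdomain ratio) can be sketched as follows. The numeric thresholds are the ones this runbook cites for the per-query indicators; everything else is an assumption made for illustration: the strict greater-than comparison, the `long_leftmost_label` and `long_answer` names, and the naive "last two labels" parent extraction. The aggregate thresholds behind has_high_unique_ratio are not stated here, so the sketch only computes the ratio and leaves the comparison to the reader.

```python
from collections import defaultdict

# Per-query thresholds cited in this runbook; strict ">" is an assumption.
QUERY_LEN = 50
SUBDOMAIN_COUNT = 3
LEFTMOST_LABEL_LEN = 25
ANSWER_LEN = 200


def parent_of(name: str) -> str:
    # Naive parent extraction (last two labels); a real pipeline would
    # consult a public-suffix list.
    labels = name.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def query_indicators(qname: str, answer: str = "") -> dict:
    """Shape indicators for one query; two key names are hypothetical."""
    name = qname.rstrip(".")
    labels = name.split(".")
    subdomain_labels = labels[:-2]  # everything left of the naive parent
    return {
        "is_long_query": len(name) > QUERY_LEN,
        "has_many_subdomains": len(subdomain_labels) > SUBDOMAIN_COUNT,
        "long_leftmost_label": len(labels[0]) > LEFTMOST_LABEL_LEN,
        "long_answer": len(answer) > ANSWER_LEN,
    }


def unique_subdomain_ratio(qnames) -> dict:
    """Unique query names divided by total queries, grouped by parent."""
    per_parent = defaultdict(list)
    for q in qnames:
        per_parent[parent_of(q)].append(q.rstrip(".").lower())
    return {p: len(set(qs)) / len(qs) for p, qs in per_parent.items()}


# A small, reused probe set yields a low ratio ...
probes = ["probe1.mon.example"] * 3 + ["probe2.mon.example"]
# ... while per-file hash lookups are unique on every query.
hashes = [f"{i:064x}.lookup.vendor-av.example" for i in range(4)]
ratios = unique_subdomain_ratio(probes + hashes)
```

With these inputs the probe parent `mon.example` has a ratio of 0.5 and the hash-lookup parent `vendor-av.example` has a ratio of 1.0, which is why a high ratio alone does not separate vendor telemetry from tunneling and the parent identity still has to be confirmed.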
Closing on a false-positive pattern still requires the runbook's evidence record. "Looks like cloud telemetry" is not a disposition; "parent lookup.vendor-av.example belongs to the organization's anti-malware vendor's cloud reputation service (vendor documentation URL, corporate asset inventory entry), query encodes a per-file SHA-256 hash into the leftmost label, parent hourly aggregate shows 2,412 queries from 1,847 sources with has_high_unique_ratio: true (per-file uniqueness is expected for this service), no IOC enrichment, no concurrent HTTPS data leg; closing as benign" is.
See Also
- Critical Alert Triage — the generic first-response runbook that hands off to this one.
- C2 Beacon Investigation — companion runbook when a DGA Detection or Beaconing Detection alert fires on the same source, which is common when a DGA-based implant uses tunneling for its rendezvous-and-channel pattern.
- Data Exfiltration Investigation — companion runbook when a Data Exfiltration Detection alert fires on a concurrent HTTPS leg on the same source, identifying the multi-protocol tunnel's high-bandwidth data channel.
- ML Anomaly Investigation — companion runbook when an ML Random Cut Forest Anomaly fires on a DNS event from the same source, corroborating the tunneling pattern with an independent anomaly-detection signal.
- Alert, Flow, and PCAP Pivoting — the pivot-mechanics meta-runbook referenced by steps 4, 6, and 8.
- Behavioral Detections, DNS Tunneling — background on the DNS Tunneling Detection family, including the per-query indicator definitions, the suspicion-score ladder, the hourly aggregation fields, the severity and confidence ladders, and the Policy-managed parent-domain allowlist referenced in the Tune the rule branch.
- Behavioral Detections — background on the companion DGA Detection referenced in steps 3 and 6.
- C2 Beacon Investigation — background on the C2 feed whose matches drive the IOC auto-escalation in step 7.
- InSights TIDB and REPDB — parallel intelligence feeds referenced in step 7.
- Detection Overview — unified severity scale, confidence scale, and the IOC auto-escalation rule.
- Hunt Page — tabs, sidebar, right-click pivots (Hunt all events from this IP, Hunt all events for this domain, Show all events with this community id, Show related events), the DNS Tunneling Suspicious Alert sub-tab, and the Network Sessions → DNS / TLS / HTTP buckets used throughout this runbook.
- Manager Configuration — Policy management surface referenced in the Tune the rule branch.