Detecting TOR Communication in Network Traffic

Erik Hjelmvik

Saturday, 06 April 2013 20:55:00 (UTC/GMT)

Detecting TOR Communication in Network Traffic

The anonymity network Tor is often misused by hackers and criminals in order to remotely control hacked computers. In this blog post we explain why Tor is so well suited for such malicious purposes, but also how incident responders can detect Tor traffic in their networks.

Yellow onions with cross section. Photo taken by Andrew c

The privacy network Tor (originally short for The Onion Router) is often used by activists and whistleblowers, who wish to preserve their anonymity online. Tor is also used by citizens of countries with censored Internet (like in China, Saudi Arabia and Belarus), in order to evade the online censorship and surveillance systems. Authorities in repressive regimes are therefore actively trying to detect and block Tor traffic, which makes research on Tor protocol detection a sensitive subject.

Tor is, however, not only used for good; a great deal of the traffic in the Tor networks is in fact port scans, hacking attempts, exfiltration of stolen data and other forms of online criminality. Additionally, in December last year researchers at Rapid7 revealed a botnet called “SkyNet” that used Tor for its Command-and-Control (C2) communication. Here is what they wrote about the choice of running the C2 over Tor:

"Common botnets generally host their Command & Control (C&C) infrastructure on hacked, bought or rented servers, possibly registering domains to resolve the IP addresses of their servers. This approach exposes the botnet from being taken down or hijacked. The security industry generally will try to take the C&C servers offline and/or takeover the associated domains
[...]
What the Skynet botnet creator realized, is that he could build a much stronger infrastructure at no cost just by utilizing Tor as the internal communication protocol, and by using the Hidden Services functionality that Tor provides."

Tor disguised as HTTPS

Tor doesn't just provide encryption, it is also designed to look like normal HTTPS traffic. This makes Tor channels blend in quite well with normal web surfing traffic, which makes Tor communication difficult to identify even for experienced incident responders. As an example, here is how tshark interprets a Tor session to port TCP 443:

$ tshark -nr tbot_2E1814CCCF0.218EB916.pcap | head
1 0.000000 172.16.253.130 -> 86.59.21.38 TCP 62 1565 > 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
2 0.126186 86.59.21.38 -> 172.16.253.130 TCP 60 443 > 1565 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460
3 0.126212 172.16.253.130 -> 86.59.21.38 TCP 54 1565 > 443 [ACK] Seq=1 Ack=1 Win=64240 Len=0
4 0.127964 172.16.253.130 -> 86.59.21.38 SSL 256 Client Hello
5 0.128304 86.59.21.38 -> 172.16.253.130 TCP 60 443 > 1565 [ACK] Seq=1 Ack=203 Win=64240 Len=0
6 0.253035 86.59.21.38 -> 172.16.253.130 TLSv1 990 Server Hello, Certificate, Server Key Exchange, Server Hello Done
7 0.259231 172.16.253.130 -> 86.59.21.38 TLSv1 252 Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
8 0.259408 86.59.21.38 -> 172.16.253.130 TCP 60 443 > 1565 [ACK] Seq=937 Ack=401 Win=64240 Len=0
9 0.379712 86.59.21.38 -> 172.16.253.130 TLSv1 113 Change Cipher Spec, Encrypted Handshake Message
10 0.380009 172.16.253.130 -> 86.59.21.38 TLSv1 251 Encrypted Handshake Message

A Tor session to TCP port 443, decoded by tshark as if it was HTTPS

The thsark output above looks no different from when a real HTTPS session is being analyzed. So in order to detect Tor traffic one will need to apply some sort of traffic classification or application identification. However, most implementations for protocol identification rely on either port number inspection or protocol specification validation. But Tor often communicate over TCP 443 and it also follows the TLS protocol spec (RFC 2246), because of this most products for intrusion detection and deep packet inspection actually fail at identifying Tor traffic. A successful method for detecting Tor traffic is to instead utilize statistical analysis of the communication protocol in order to tell different SSL implementations apart. One of the very few tools that has support for protocol identification via statistical analysis is CapLoader.

CapLoader provides the ability to differentiate between different types of SSL traffic without relying on port numbers. This means that Tor sessions can easily be identified in a network full of HTTPS traffic.

Analyzing the tbot PCAPs from Contagio

@snowfl0w provides some nice analysis of the SkyNet botnet (a.k.a. Trojan.Tbot) at the Contagio malware dump, where she also provides PCAP files with the network traffic generated by the botnet.

The following six PCAP files are provided via Contagio:

tbot_191B26BAFDF58397088C88A1B3BAC5A6.pcap (7.55 MB)
tbot_23AAB9C1C462F3FDFDDD98181E963230.pcap (3.24 MB)
tbot_2E1814CCCF0C3BB2CC32E0A0671C0891.pcap (4.08 MB)
tbot_5375FB5E867680FFB8E72D29DB9ABBD5.pcap (5.19 MB)
tbot_A0552D1BC1A4897141CFA56F75C04857.pcap (3.97 MB) [only outgoing packets]
tbot_FC7C3E087789824F34A9309DA2388CE5.pcap (7.43 MB)

Unfortunately the file “tbot_A055[...]” only contains outgoing network traffic. This was likely caused by an incorrect sniffer setup, such as a misconfigured switch monitor port (aka SPAN port) or failure to capture the traffic from both monitor ports on a non-aggregating network tap (we recommend using aggregation taps in order to avoid these types of problems, see our sniffing tutorial for more details). The analysis provided here is therefore based on the other five pcap files provided by Contagio.

Here is a timeline with relative timestamps (the frame timestamps in the provided PCAP files were way of anyway, we noticed an offset of over 2 months!):

0 seconds : Victim boots up and requests an IP via DHCP
5 seconds : Victim perform a DNS query for time.windows.com
6 seconds : Victim gets time via NTP
---{malware most likely gets executed here somewhere}---
22 seconds : Victim performs DNS query for checkip.dyndns.org
22 seconds : Victim gets its external IP via an HTTP GET request to checkip.dyndns.org
23 seconds : Victim connects to the Tor network, typically on port TCP 9001 or 443
---{lots of Tor traffic from here on}---

This is what it looks like when one of the tbot pcap files has been loaded into CapLoader with the “Identify protocols” feature activated:
CapLoader detecting Tor protocol
CapLoader with protocol detection in action - see “TOR” in the “Sub_Protocol” column

Notice how the flows to TCP ports 80, 9101 and 443 are classified as Tor? The statistical method for protocol detection in CapLoader is so effective that CapLoader actually ignores port numbers altogether when identifying the protocol. The speed with which CapLoader parses PCAP files also enables analysis of very large capture files. A simple way to detect Tor traffic in large volumes of network traffic is therefore to load a capture file into CapLoader (with “Identify protocols” activated), sort the flows on the “Sub_Protocol” column, and scroll down to the flows classified as Tor protocol.

Beware of more Tor backdoors

Most companies and organizations allow traffic on TCP 443 to pass through their firewalls without content inspection. The privacy provided by Tor additionally makes it easy for a botnet herder to control infected machines without risking his identity to be revealed. These two factors make Tor a perfect fit for hackers and online criminals who need to control infected machines remotely.

Here is what Claudio Guarnieri says about the future use of Tor for botnets in his Rapid7 blog post:

“The most important factor is certainly the adoption of Tor as the main communication channel and the use of Hidden Services for protecting the backend infrastructure. While it’s surprising that not more botnets adopt the same design, we can likely expect more to follow the lead in the future.”

Incident responders will therefore need to learn how to detect Tor traffic in their networks, not just in order to deal with insiders or rogue users, but also in order to counter malware using it as part of their command-and-control infrastructure. However, as I've shown in this blog post, telling Tor apart from normal SSL traffic is difficult. But making use of statistical protocol detection, such as the Port Independent Protocol Identification (PIPI) feature provided with CapLoader, is in fact an effective method to detect Tor traffic in your networks.

Posted by Erik Hjelmvik on Saturday, 06 April 2013 20:55:00 (UTC/GMT)

Tags: #CapLoader #TOR #Protocol Identification #SSL #TLS #HTTPS #PCAP #PIPI