Keyword Search in PCAP files

Sherlock Holmes and Magnifying Glass via Inside Croydon A new function in the free version of CapLoader 1.2 is the "Find Keyword" feature. This keyword search functionality makes it possible to seek large capture files for a string or byte pattern super fast!

You might say, so what? PCAP string search can already be done with tools like tcpflow, ngrep and even Wireshark; what's the benefit of adding yet another tool to this list? One benefit is that CapLoader doesn't just give you the packet or content that matched the keyword, it will instead extract the whole TCP or UDP flow that contained the match. CapLoader also supports many different encodings, which is demonstrated in this blog post.

Here are a few quick wins with CapLoader's keyword search feature:

  • Track User-Agent - Search for a specific user agent string to extract all the HTTP traffic from a particular browser or malware.
  • Track Domain Name - Search for a particular domain name to get all DNS lookups as well as web traffic relating to that domain (including HTTP "referer" field matches).
  • Extract Messages - Search for a keyword in e-mail or chat traffic to get the whole e-mail or conversation, not just the single packet that matched.
  • Extract Files - Search for a unique string or byte sequence in a file (such as a piece of malware) to enable extraction of the complete file transfer.

EXAMPLE: DigitalCorpora M57

As an example, let's search the digital corpora file net-2009-12-06-11:59.pcap (149 MB) for the keyword "immortal". Follow these steps in order to veify our analysis using the free edition of CapLoader.

  1. Start CapLoader and select File -> Open URL, enter:
    http://digitalcorpora.org/corp/nps/scenarios/2009-m57-patents/net/net-2009-12-06-11:59.pcap.gz
  2. Edit -> Find Keyword (or Ctrl+F), enter "immortal" CapLoader Find Keyword Form
  3. Click the "Find and Select All Matching Flows" button
  4. One TCP flow is now selected (Flow_ID 5469, 192.168.1.104:2592 -> 192.168.1.1:25) CapLoader with one selected flow
  5. Right click the selected flow (ID 5469) and select "Flow Transcript"
CapLoader Flow Transcript of SMTP email attachment

CapLoader transcript of SMTP email flow

Looks as if an email has been sent with an attachment named "microscope1.jpg". However, the string "immortal" cannot be seen anywhere in the transcript view. The match that CapLoader found was actually in the contents of the attachment, which has been base64 encoded in the SMTP transfer in accordance with RFC 2045 (MIME).

The email attachment can easily be extracted from the PCAP file using NetworkMiner. However, to keep things transparent, let's just do a simple manual verification of the matched data. The first three lines of the email attachment are:

/9j/4AAQSkZJRgABAQEAkACQAAD/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
AQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEB
AQEBAQEBAQEBAQEBAQEBAQEBAQFwYXNzd29yZD1pbW1vcnRhbAEBAQEBAQEBAQEBAQH/wAAR
Decoding this with base64 gives us:
0000000: ffd8 ffe0 0010 4a46 4946 0001 0101 0090 ......JFIF......
0000010: 0090 0000 ffdb 0043 0001 0101 0101 0101 .......C........
0000020: 0101 0101 0101 0101 0101 0101 0101 0101 ................
0000030: 0101 0101 0101 0101 0101 0101 0101 0101 ................
0000040: 0101 0101 0101 0101 0101 0101 0101 0101 ................
0000050: 0101 0101 0101 0101 01ff db00 4301 0101 ............C...
0000060: 0101 0101 0101 0101 0101 0101 0101 0101 ................
0000070: 0101 0101 0101 0101 0101 0101 0101 0101 ................
0000080: 7061 7373 776f 7264 3d69 6d6d 6f72 7461 password=immorta
0000090: 6c01 0101 0101 0101 0101 0101 0101 ffc0 l...............

Tools like ngrep, tcpflow and Wireshark won't find any match for the string "immortal" since they don't support searching in base64 encoded data. CapLoader, on the other hand, supports lots of encodings.

Supported Text Encodings

CapLoader currently supports fast searching of text strings in any of the following encodings:

  • ASCII
  • Base64 (used in email attachments and HTTP POST's)
  • DNS label encoding (RFC 1035)
  • HTML
  • Quoted Printable (used in body of email messages)
  • Unicode
  • URL encoding
  • UTF8

CapLoader also supports several local character sets, including the following code pages:

  • 437 MS-DOS Latin US
  • 850 MS-DOS Latin 1
  • 932 Japanese
  • 936 Simplified Chinese
  • 949 Korean
  • 1251 Windows Cyrillic (Slavic)
  • 1256 Windows Arabic

Having all these encodings also makes it possible to search network traffic for words like хакер, القراصنة, ハッカー, 黑客 or 해커.

The Art of War by Sun Tzu

Getting CapLoader

CapLoader is a commercial tool that also comes in a free trial edition. The search feature is available in both versions, so feel free to download CapLoader and try it your self!

CapLoader is available from the following URL:
http://www.netresec.com/?page=CapLoader

Posted by Erik Hjelmvik on Wednesday, 02 April 2014 13:15:00 (UTC/GMT)

Tags: #search#find#keyword#flow#stream#PCAP#SMTP#transcript#free#network

Short URL: https://netresec.com/?b=1447A3D

X / twitter

NETRESEC on X / Twitter: @netresec

Mastodon

NETRESEC on Mastodon: @netresec@infosec.exchange