Passive DNS mining from PCAP with dpkt & Python

Update 04/14: A friend pointed me to dnssnarf, a project that appears to have been written at a DojoSec meeting by Christopher McBee and updated a bit later by Grant Stavely. It uses Scapy (which I hear is really neat, if you haven’t played with it). Check out Grant’s blog post about dnssnarf.

So, here is another quickie in case anyone out there in the Intertubes needs it. Say you have a .pcap file, or many .pcap files, and you want to mine the DNS responses out of them so you can build up a passive DNS database and track malicious resolutions to build a list of ban-able IP addresses. This script parses a given .pcap file (tcpdump/Wireshark libpcap format) and prints the answers for the record types you are interested in (CNAME, A and PTR as written).
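
For anyone curious what the libpcap container actually looks like, here is a minimal standard-library sketch of walking its records. It is not how dpkt does it internally; `iter_pcap_records` is a hypothetical helper of mine, and it assumes the classic microsecond-resolution format only (no pcapng, no nanosecond variant). It stops quietly if the file ends in the middle of a record.

```python
import io
import struct

def iter_pcap_records(fileobj):
    """Yield (timestamp, packet_bytes) from a classic libpcap stream.

    Stops quietly at end of file or at a truncated trailing record.
    """
    header = fileobj.read(24)  # global header is 24 bytes
    if len(header) < 24:
        raise ValueError("not a pcap file: global header too short")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # written on a big-endian machine
    else:
        raise ValueError("unrecognized pcap magic number")
    while True:
        rec = fileobj.read(16)  # per-record header: ts_sec, ts_usec, incl_len, orig_len
        if len(rec) < 16:
            return  # clean EOF or truncated record header
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
        data = fileobj.read(incl_len)
        if len(data) < incl_len:
            return  # truncated packet body
        yield ts_sec + ts_usec / 1e6, data

# Made-up capture: one complete record followed by a truncated one.
capture = struct.pack("<IHHiIII", 0xa1b2c3d4, 2, 4, 0, 0, 65535, 1)
capture += struct.pack("<IIII", 1271200000, 500000, 4, 4) + b"abcd"
capture += struct.pack("<IIII", 1271200001, 0, 10, 10) + b"xy"
records = list(iter_pcap_records(io.BytesIO(capture)))
print(records)
```

Each yielded pair has the same shape as the `ts, buf` pairs that the script gets from `dpkt.pcap.Reader`.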

This script is built around dpkt, a tool by Dug Song, and the contents are heavily inspired by the tutorials on Jon Oberheide’s site (he is also a developer of dpkt). Honestly, most of the time writing this was spent understanding how dpkt handles its internal data structures and how to get to the data. The documentation on dpkt is not the most mature, but the source is pretty readable if you keep the references I mention in the comments at hand. Also, this script was only tested with Python 2.6 and dpkt 1.7 on Linux. It was confirmed not to work on Windows, as dpkt appears to have some serious problems with Windows at the moment.
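
To make the layering concrete, here is a rough standard-library sketch of the same checks the script below performs through dpkt: EtherType 2048 (0x0800, IPv4), IP protocol 17 (UDP), port 53, and then the QR bit, opcode, rcode and answer count from the DNS header. This is a hand-rolled illustration and not dpkt's implementation; `is_dns_answer` and `build_frame` are hypothetical names, and the sketch assumes untagged Ethernet, IPv4 and unfragmented UDP.

```python
import struct

ETH_TYPE_IP = 0x0800   # 2048 in the script
IP_PROTO_UDP = 17
DNS_PORT = 53

def is_dns_answer(frame):
    """True if a raw Ethernet frame holds a successful DNS response with answers."""
    if len(frame) < 14:
        return False
    (eth_type,) = struct.unpack("!H", frame[12:14])
    if eth_type != ETH_TYPE_IP:
        return False
    ip = frame[14:]
    if len(ip) < 20:
        return False
    ihl = (struct.unpack("!B", ip[0:1])[0] & 0x0F) * 4  # IP header length in bytes
    proto = struct.unpack("!B", ip[9:10])[0]
    if ihl < 20 or proto != IP_PROTO_UDP:
        return False
    udp = ip[ihl:]
    if len(udp) < 8:
        return False
    sport, dport = struct.unpack("!HH", udp[:4])
    if sport != DNS_PORT and dport != DNS_PORT:
        return False
    dns = udp[8:]
    if len(dns) < 12:
        return False
    _id, flags, _qd, an, _ns, _ar = struct.unpack("!HHHHHH", dns[:12])
    qr = flags >> 15              # 1 = response
    opcode = (flags >> 11) & 0xF  # 0 = standard QUERY, also set in responses
    rcode = flags & 0xF           # 0 = no error
    return qr == 1 and opcode == 0 and rcode == 0 and an >= 1

def build_frame(flags, ancount, sport=53, dport=40000):
    """Hypothetical helper: a header-only test frame (no question or answer bodies)."""
    eth = b"\x00" * 12 + struct.pack("!H", ETH_TYPE_IP)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, IP_PROTO_UDP, 0,
                     b"\xc0\xa8\x01\x01", b"\xc0\xa8\x01\x05")
    udp = struct.pack("!HHHH", sport, dport, 20, 0)
    dns = struct.pack("!HHHHHH", 0x1234, flags, 1, ancount, 0, 0)
    return eth + ip + udp + dns

print(is_dns_answer(build_frame(0x8180, 1)))  # response, NOERROR, one answer
print(is_dns_answer(build_frame(0x0100, 0)))  # plain query
```

The opcode check looks odd at first, but a response carries the same opcode as the query it answers, so a normal lookup has opcode QUERY in both directions and only the QR bit tells them apart.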

#!/usr/bin/env python

import dpkt, socket, sys

if len(sys.argv) != 2:
 print "Usage:\n", sys.argv[0], "filename.pcap"
 sys.exit(1)

# open in binary mode: a pcap is not a text file
f = open(sys.argv[1], 'rb')
pcap = dpkt.pcap.Reader(f)

for ts, buf in pcap:
 # make sure we are dealing with IP traffic
 # ref: http://www.iana.org/assignments/ethernet-numbers
 try: eth = dpkt.ethernet.Ethernet(buf)
 except Exception: continue
 if eth.type != 2048: continue
 # make sure we are dealing with UDP
 # ref: http://www.iana.org/assignments/protocol-numbers/
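 # (eth.data is left as a raw string if dpkt could not decode it as IP)
 ip = eth.data
 if not isinstance(ip, dpkt.ip.IP) or ip.p != 17: continue
 # filter on UDP assigned ports for DNS
 # ref: http://www.iana.org/assignments/port-numbers
 # (ip.data is left as a raw string if dpkt could not decode it as UDP)
 udp = ip.data
 if not isinstance(udp, dpkt.udp.UDP): continue
 if udp.sport != 53 and udp.dport != 53: continue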
 # make the dns object out of the udp data and check that it is a response
 # (QR bit set) with opcode QUERY (counter-intuitive, I know, but a response
 # carries the same opcode as the query it answers)
 try: dns = dpkt.dns.DNS(udp.data)
 except Exception: continue
 if dns.qr != dpkt.dns.DNS_R: continue
 if dns.opcode != dpkt.dns.DNS_QUERY: continue
 if dns.rcode != dpkt.dns.DNS_RCODE_NOERR: continue
 if len(dns.an) < 1: continue
 # now we're going to process and spit out responses based on record type
 # ref: http://en.wikipedia.org/wiki/List_of_DNS_record_types
 for answer in dns.an:
   if answer.type == 5:
     print "CNAME request", answer.name, "\tresponse", answer.cname
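   # an A record's rdata should be 4 bytes; skip anything else so inet_ntoa cannot raise
   elif answer.type == 1 and len(answer.rdata) == 4:
     print "A request", answer.name, "\tresponse", socket.inet_ntoa(answer.rdata)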
   elif answer.type == 12:
     print "PTR request", answer.name, "\tresponse", answer.ptrname
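
If you would rather collect the answers than print them, here is a minimal sketch of the "passive DNS database" idea as an in-memory dictionary that tracks first seen, last seen and a hit count per resolution. The names `record_resolution` and `ips_for` are hypothetical, the sample names and addresses are made up, and a real deployment would presumably want persistent storage; this only shows the shape of the data.

```python
def record_resolution(db, ts, rtype, name, value):
    """Fold one observed DNS answer into an in-memory passive DNS table.

    db maps (rtype, name, value) -> [first_seen, last_seen, count].
    Names are lower-cased and stripped of a trailing dot before use as keys.
    """
    key = (rtype, name.lower().rstrip("."), value)
    entry = db.get(key)
    if entry is None:
        db[key] = [ts, ts, 1]
    else:
        entry[0] = min(entry[0], ts)
        entry[1] = max(entry[1], ts)
        entry[2] += 1

def ips_for(db, bad_names):
    """Return the sorted, de-duplicated A-record values seen for the given names."""
    wanted = set(n.lower().rstrip(".") for n in bad_names)
    return sorted(set(value for (rtype, name, value) in db
                      if rtype == "A" and name in wanted))

# Made-up observations; in the script these would come from the answer loop,
# using the packet timestamp ts, answer.name and the decoded rdata.
db = {}
record_resolution(db, 1271200000.0, "A", "evil.example.com", "203.0.113.7")
record_resolution(db, 1271203600.0, "A", "evil.example.com", "203.0.113.7")
record_resolution(db, 1271203700.0, "A", "EVIL.example.com.", "203.0.113.9")
record_resolution(db, 1271203800.0, "A", "www.example.org", "198.51.100.4")
print(ips_for(db, ["evil.example.com"]))
```

Keying on the (type, name, value) triple keeps every distinct resolution of a name, which is what lets you go from a known-bad hostname to the list of addresses it has pointed at.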

6 Comments

  1. Posted June 1, 2010 at 6:32 am

    What are the mining for dns

    • Posted June 6, 2010 at 9:06 am

      I’m not sure I understand your question, Vincent. Could you please be more specific? I’d be glad to help, if I can.

  2. Posted August 19, 2010 at 1:43 am

    Without doing too much investigating, it seems that not all A responses come back with 4 bytes, making socket.inet_ntoa throw an error on some pcaps.

    Adding:

    if len(answer.rdata) == 4:

    before attempting to print the output appears to fix it.

    • Posted September 12, 2010 at 6:26 pm

      Lou,

      First, sorry for the delay in response.

      Thanks for the tip. I’m trying to figure out why an A record wouldn’t be 4-bytes. Do you have a sample you can show me? I haven’t found this on our network at work, which is pretty large, but that’s not saying it couldn’t happen.

      The Wikipedia page at http://en.wikipedia.org/wiki/List_of_DNS_record_types mentions some RFCs that might explain the issue. Notably, it describes the A record this way: “Returns a 32-bit IPv4 address, most commonly used to map hostnames to an IP address of the host, but also used for DNSBLs, storing subnet masks in RFC 1101, etc.” A subnet mask should still be 4 bytes. But in reading RFC 1035, I came across this in section 3.4.1:

      Hosts that have multiple Internet addresses will have multiple A
      records.

      Maybe you’re seeing packets with multiple A record addresses in one response? I haven’t tested that. I might have to iterate over answer.rdata; maybe it comes back as an array if there are multiple responses, or something like that. I will test it and report back here.

      An example of this:

      8 0.658817 192.168.1.1 192.168.1.5 DNS Standard query response CNAME www.l.google.com A 72.14.204.147 A 72.14.204.104 A 72.14.204.99 A 72.14.204.103

      And:

      31 242.110444 192.168.1.1 192.168.1.5 DNS Standard query response A 74.200.243.253 A 76.74.255.123 A 72.233.2.58 A 72.233.2.59 A 76.74.254.123 A 74.200.243.251

      Whereas:

      21 106.408197 192.168.1.1 192.168.1.5 DNS Standard query response A 64.85.164.40

      Thanks again!

  3. alfonso
    Posted December 3, 2010 at 6:11 pm

    Traceback (most recent call last):
      File "./parserdns.py", line 12, in <module>
        for ts, buf in pcap:
      File "build/bdist.linux-i686/egg/dpkt/pcap.py", line 141, in __iter__
      File "build/bdist.linux-i686/egg/dpkt/dpkt.py", line 75, in __init__
    dpkt.dpkt.NeedData

    • alfonso
      Posted December 3, 2010 at 6:13 pm

      12 for ts, buf in pcap:

