Update 04/14: A friend pointed me to dnssnarf, a project that looks like it was written at a DojoSec meeting by Christopher McBee and then updated a bit later by Grant Stavely. It uses Scapy (which I hear is really neat, if you haven’t played with it). Check out Grant’s blog post about dnssnarf.
So, here is another quickie in case anyone out there in the Intertubes needs it. Say you have a .pcap file, or many .pcap files, and you want to mine the DNS responses out of them so you can build up a passive DNS database and track malicious resolutions to build a list of ban-able IP addresses. This script parses a given .pcap file (tcpdump/wireshark libpcap format) and prints the answers for the record types you are interested in (A, CNAME, and PTR as written).
This script is built around dpkt, a tool by Dug Song, and the contents are heavily inspired by the tutorials at Jon Oberheide’s site (he is also a developer of dpkt). Honestly, most of the time writing this was spent understanding how dpkt handles its internal data structures and how to get to the data. The documentation on dpkt is not the most mature, but the source is pretty readable if you keep the references I mention in the comments at hand. Also, this script was only tested with Python 2.6 and dpkt 1.7 on Linux. It was confirmed not to work on Windows, as dpkt appears to have some serious problems with Windows at the moment.
#!/usr/bin/env python
import dpkt, socket, sys

if len(sys.argv) != 2:
    print "Usage:\n", sys.argv[0], "filename.pcap"
    sys.exit()

# open in binary mode, a pcap file is not text
f = open(sys.argv[1], "rb")
pcap = dpkt.pcap.Reader(f)

for ts, buf in pcap:
    # make sure we are dealing with IP traffic (2048 == 0x0800, IPv4)
    # ref: http://www.iana.org/assignments/ethernet-numbers
    try: eth = dpkt.ethernet.Ethernet(buf)
    except Exception: continue
    if eth.type != 2048: continue

    # make sure we are dealing with UDP (protocol number 17)
    # ref: http://www.iana.org/assignments/protocol-numbers/
    try: ip = eth.data
    except Exception: continue
    if ip.p != 17: continue

    # filter on UDP assigned ports for DNS
    # ref: http://www.iana.org/assignments/port-numbers
    try: udp = ip.data
    except Exception: continue
    if udp.sport != 53 and udp.dport != 53: continue

    # make the dns object out of the udp data and check for it being a RR (answer)
    # and for opcode QUERY (I know, counter-intuitive)
    try: dns = dpkt.dns.DNS(udp.data)
    except Exception: continue
    if dns.qr != dpkt.dns.DNS_R: continue
    if dns.opcode != dpkt.dns.DNS_QUERY: continue
    if dns.rcode != dpkt.dns.DNS_RCODE_NOERR: continue
    if len(dns.an) < 1: continue

    # now we're going to process and spit out responses based on record type
    # ref: http://en.wikipedia.org/wiki/List_of_DNS_record_types
    for answer in dns.an:
        if answer.type == 5:
            print "CNAME request", answer.name, "\tresponse", answer.cname
        elif answer.type == 1:
            print "A request", answer.name, "\tresponse", socket.inet_ntoa(answer.rdata)
        elif answer.type == 12:
            print "PTR request", answer.name, "\tresponse", answer.ptrname
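To get from this output to the passive DNS database mentioned at the top, the printed lines could be folded into a small in-memory store. What follows is only a sketch under assumptions: it is written for Python 3 rather than the Python 2.6 the script was tested with, the helper names (parse_line, build_store, banned_ips) and the list of bad names are hypothetical, and it assumes the exact "TYPE request NAME \tresponse VALUE" layout that the print statements above produce.

```python
from collections import defaultdict


def parse_line(line):
    """Parse one output line, e.g. 'A request example.com \tresponse 192.0.2.1'.

    Returns (record_type, name, value), or None if the line does not match.
    """
    left, sep, right = line.partition("\tresponse")
    fields = left.split()
    if not sep or len(fields) != 3 or fields[1] != "request":
        return None
    value = right.strip()
    if not value:
        return None
    return fields[0], fields[2].lower(), value


def build_store(lines):
    """Map each record name to {(record_type, value): times_seen}."""
    store = defaultdict(lambda: defaultdict(int))
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # ignore anything that is not script output
        rtype, name, value = parsed
        store[name][(rtype, value)] += 1
    return store


def banned_ips(store, bad_names):
    """Collect A-record addresses seen for names on a hypothetical blocklist."""
    ips = set()
    for name in bad_names:
        for rtype, value in store.get(name.lower(), {}):
            if rtype == "A":
                ips.add(value)
    return sorted(ips)
```

One thing to keep in mind with this approach: answer.name on an A record is the owner of that record, which after a CNAME is the alias target rather than the name that was originally queried, so the CNAME lines are needed to tie the two together.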
6 Comments
What are the mining for dns
I’m not sure I understand your question, Vincent. Could you please be more specific? I’d be glad to help, if I can.
Without doing too much investigating, it seems that not all A responses come back with 4 bytes, making socket.inet_ntoa throw an error on some pcaps.
Adding:
if len(answer.rdata) == 4:
before attempting to print the output appears to fix it.
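A minimal sketch of that guard, pulled out into a helper so the caller can decide what to do with odd records. The name format_a_rdata is hypothetical, and the sketch is written for Python 3; it returns None instead of raising when the rdata is not exactly 4 bytes.

```python
import socket


def format_a_rdata(rdata):
    """Return dotted-quad text for 4-byte rdata, or None for any other length."""
    if len(rdata) != 4:
        return None  # socket.inet_ntoa would raise on this input
    return socket.inet_ntoa(rdata)
```

In the script, the A branch could then skip (or log) any answer for which the helper returns None.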
Lou,
First, sorry for the delay in response.
Thanks for the tip. I’m trying to figure out why an A record wouldn’t be 4 bytes. Do you have a sample you can show me? I haven’t found this on our network at work, which is pretty large, but that’s not saying it couldn’t happen.
The Wikipedia page at http://en.wikipedia.org/wiki/List_of_DNS_record_types mentions some RFCs that might explain the issue. Notably, it describes the A record as: “Returns a 32-bit IPv4 address, most commonly used to map hostnames to an IP address of the host, but also used for DNSBLs, storing subnet masks in RFC 1101, etc.” A subnet mask should still be 4 bytes. But in reading RFC 1035, I came across this in section 3.4.1:
Hosts that have multiple Internet addresses will have multiple A
records.
Maybe you’re seeing packets with multiple A record addresses in one response? I haven’t tested that yet. I might have to iterate over answer.rdata; maybe it comes back as an array if there are multiple addresses. I will test it and report back here.
An example of this:
8 0.658817 192.168.1.1 192.168.1.5 DNS Standard query response CNAME www.l.google.com A 72.14.204.147 A 72.14.204.104 A 72.14.204.99 A 72.14.204.103
And:
31 242.110444 192.168.1.1 192.168.1.5 DNS Standard query response A 74.200.243.253 A 76.74.255.123 A 72.233.2.58 A 72.233.2.59 A 76.74.254.123 A 74.200.243.251
Whereas:
21 106.408197 192.168.1.1 192.168.1.5 DNS Standard query response A 64.85.164.40
Thanks again!
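For what it is worth, the RFC 1035 wire format gives every resource record its own RDLENGTH and RDATA, so a response carrying several addresses should contain several answer records, each with 4 bytes of rdata, rather than one record with a list. The sketch below illustrates that with the standard library only (no dpkt), written for Python 3. The helpers build_response, skip_name, and a_records are hypothetical names, the response is hand-built rather than captured, and this does not by itself explain the non-4-byte rdata reported above.

```python
import socket
import struct


def encode_name(name):
    """Encode 'www.example.com' as length-prefixed DNS labels."""
    out = b""
    for label in name.split("."):
        out += struct.pack("!B", len(label)) + label.encode("ascii")
    return out + b"\x00"


def build_response(name, addresses):
    """Build a response with one question and one A record per address."""
    header = struct.pack("!6H", 0x1234, 0x8180, 1, len(addresses), 0, 0)
    question = encode_name(name) + struct.pack("!HH", 1, 1)
    answers = b""
    for addr in addresses:
        # 0xC00C is a compression pointer to the name at offset 12
        answers += struct.pack("!HHHIH", 0xC00C, 1, 1, 300, 4)
        answers += socket.inet_aton(addr)
    return header + question + answers


def skip_name(msg, offset):
    """Return the offset just past the (possibly compressed) name."""
    while True:
        length = struct.unpack_from("!B", msg, offset)[0]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # a pointer is two bytes and ends the name
            return offset + 2
        offset += 1 + length


def a_records(msg):
    """Yield the rdata of every A record in the answer section."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack_from("!6H", msg, 0)
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4  # skip QTYPE and QCLASS
    for _ in range(ancount):
        offset = skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack_from("!HHIH", msg, offset)
        offset += 10
        rdata = msg[offset:offset + rdlength]
        offset += rdlength
        if rtype == 1:
            yield rdata


msg = build_response("example.com", ["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for rdata in a_records(msg):
    print(len(rdata), socket.inet_ntoa(rdata))
```

Each address comes out as its own record with a 4-byte rdata, which matches how the script's loop over dns.an already treats the answers one at a time.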
Traceback (most recent call last):
  File "./parserdns.py", line 12, in <module>
    for ts, buf in pcap:
  File "build/bdist.linux-i686/egg/dpkt/pcap.py", line 141, in __iter__
  File "build/bdist.linux-i686/egg/dpkt/dpkt.py", line 75, in __init__
dpkt.dpkt.NeedData
12 for ts, buf in pcap:
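The comment does not say what the capture looked like, but a NeedData raised while iterating the reader is what one would expect from a file that ends partway through a packet record, for example a capture copied while tcpdump was still writing it. That cause is an assumption. The sketch below uses only the standard library to show the idea of stopping quietly at a cut-off tail; iter_pcap is a hypothetical helper written for Python 3, and it handles only the classic microsecond libpcap format. With dpkt itself, wrapping the loop in a try/except for dpkt.NeedData would presumably have the same effect.

```python
import io
import struct


def iter_pcap(f):
    """Yield (timestamp, packet_bytes), stopping quietly at a truncated tail."""
    header = f.read(24)  # global header of a classic libpcap file
    if len(header) < 24:
        return
    magic = struct.unpack("<I", header[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic libpcap file")
    while True:
        rec = f.read(16)  # ts_sec, ts_usec, captured length, original length
        if len(rec) < 16:
            return  # clean end of file, or a cut-off record header
        sec, usec, caplen, _origlen = struct.unpack(endian + "IIII", rec)
        buf = f.read(caplen)
        if len(buf) < caplen:
            return  # packet body was cut off
        yield sec + usec / 1000000.0, buf


# a tiny in-memory capture: one complete record followed by 5 stray bytes
file_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 1, 500000, 3, 3) + b"abc"
packets = list(iter_pcap(io.BytesIO(file_header + record + b"\x00" * 5)))
```

Here packets ends up holding the single complete record, and the stray bytes at the end are ignored instead of raising.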