Le Zeek, C’est Chic: Using an NSM for Offense
In one of my many former lives (and occasionally in this one) I played "defense", wading through network traffic, logs, etc. for Bad Things™. Outside of the standard FOSS (and even commercial) tools for doing that, I grew to have a real fondness for Zeek, which is often the cornerstone for other network security monitoring (NSM) products and platforms. These days, I use Zeek primarily for NSM purposes and profiling of IoT (and other embedded) devices we at Atredis are either testing or researching.
However, some people may not be aware of the potential for using Zeek in red team or network penetration testing capacities. In this post, I'll touch briefly on Zeek's capabilities and then get into a few examples of using Zeek to help guide/inform testing efforts.
What is Zeek?
From the Zeek docs (a.k.a. "Book of Zeek"):
Zeek is a passive, open-source network traffic analyzer. Many operators use Zeek as a network security monitor (NSM) to support investigations of suspicious or malicious activity. Zeek also supports a wide range of traffic analysis tasks beyond the security domain, including performance measurement and troubleshooting.
First created in 1995, it was originally known as "Bro" (as in "Big Brother", a nod to George Orwell's 1984). Zeek consists of a very powerful pipeline for processing packets, assembling them into streams, analyzing fields/contents, extracting metadata/files, writing output to various destinations/formats, etc. Zeek is also a core component of platforms like Security Onion, Malcolm, Corelight, Bricata, etc.
Why a "defensive" tool?
You might be asking yourself -- er, rather me, but rhetorically -- this question. The reason is simple: using tcpdump, Wireshark, and their ilk in offensive operations is not altogether different. In fact, SANS SEC503 ("Intrusion Detection In-Depth") covers using these tools for their intended, non-offensive purposes. The other reason is that while full-content captures are great, you don't always need them. Moreover, these tools all complement each other (e.g., use Zeek for broader analysis and statistics, and keep tcpdump and Wireshark for more thorough, full-content analysis).
Installation and Setup
I'm not going to cover "how to install Zeek" in this post, as it's very well-documented in the Book of Zeek. However, there are a couple of things to enable for the purposes of the examples herein.
Zeek JSON Logs
The default format for Zeek logs is tab-delimited. However, I prefer Zeek's JSON-formatted logs for easier parsing with tools like jq. JSON log output is easy to enable by adding (or uncommenting) the following line in local.zeek:
@load policy/tuning/json-logs.zeek
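With JSON output enabled, each log is written as one JSON object per line, so it can also be consumed from a script when jq alone is not enough. The following is a minimal sketch in Python using only the standard library; the helper name `parse_zeek_json` and the inline sample record are illustrative and not something shipped with Zeek.

```python
import json


def parse_zeek_json(lines):
    """Yield one dict per non-empty line of a Zeek JSON log."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# A sample line standing in for an open log file such as conn.log,
# followed by a blank line that the parser skips.
sample = [
    '{"ts": 1619634510.803026, "id.orig_h": "172.18.0.253", '
    '"id.resp_h": "1.1.1.1", "id.resp_p": 53, "proto": "udp"}',
    "",
]

records = list(parse_zeek_json(sample))
for rec in records:
    print(rec["id.orig_h"], "->", rec["id.resp_h"], rec["id.resp_p"], rec["proto"])
```

In practice the same generator can be handed an open file object instead of a list, since iterating a file yields its lines.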
MAC Address Logging
Although this isn't totally pertinent to the examples later on, I find MAC address logging hugely helpful for host/device identification. Turning on the following option in local.zeek will add layer 2 source/destination fields to entries in conn.log:
@load policy/protocols/conn/mac-logging
{
"ts": 1619634510.803026,
"uid": "CGgSqRTXbeiqDz71l",
"id.orig_h": "172.18.0.253",
"id.orig_p": 26820,
"id.resp_h": "1.1.1.1",
"id.resp_p": 53,
"proto": "udp",
"service": "dns",
...
"orig_l2_addr": "88:dc:96:6e:13:5c",
"resp_l2_addr": "0a:e8:4c:68:1d:60"
}
Zeek Logs
The Book of Zeek has a more thorough explanation of each log type, but a quick rundown is as follows:
Log/File Name | Description |
---|---|
conn.log | Hosts, ports, bytes transferred, transport layer protocols, etc. |
dns.log | Queries, query types, answers |
http.log | Hostnames, URIs, HTTP verbs, etc. |
files.log | File types, filenames, hashes, etc. |
ftp.log | Users, commands, paths, etc. |
ssl.log | SSL/TLS versions, ciphers, hostnames, server ports, etc. |
x509.log | Cert versions, cert subjects, cert issuers, dates, etc. |
smtp.log | Senders, recipients, subjects, message bodies, routes/paths, etc. |
ssh.log | Client/server versions, algorithms, pubkey fingerprints, etc. |
pe.log | Architectures, OSes, PE sections, debug info |
dhcp.log | Message types, assigned addresses, MAC addresses, hostnames, etc. |
ntp.log | Times, versions, strata, offsets, clients/servers, etc. |
SMB Logs (plus DCE-RPC, Kerberos, NTLM) | SMB share mappings, DCE-RPC call info, Kerberos KDC interactions, etc. |
irc.log | Commands, nicks/users, etc. |
rdp.log | Hosts, security protocols, cookies, etc. |
traceroute.log | Source/dest, protocols, ports |
tunnel.log | (Typically Teredo) tunnel types, actions, hosts, etc. |
dpd.log | Used for reporting problems with Dynamic Protocol Detection |
known_*.log and software.log | Which ports/hosts and software (versions) were observed |
weird.log and notice.log | Issues where protocols deviated from the norm |
capture_loss.log and reporter.log | Diagnostic |
Of course, there are other logs specific to other protocols, such as modbus.log, dnp3.log, mqtt.log, etc.
Log Correlation
Log entries are also assigned IDs (uid) for correlation across different log types. For example, a connection (in conn.log) might correspond to an HTTP request (http.log). That HTTP request may have downloaded a file (files.log), which was a Portable Executable (PE) (whose analysis shows up in pe.log). This is seen in the following example. First, we'll start with conn.log:
{
"ts": 1616187600.203065,
"uid": "C3R4Ar79TjjOQZDk1",
"id.orig_h": "192.168.0.132",
"id.orig_p": 50395,
"id.resp_h": "142.250.34.2",
"id.resp_p": 80,
"proto": "tcp",
"service": "http",
"duration": 17.525580167770386,
"orig_bytes": 339,
"resp_bytes": 2778935,
"conn_state": "RSTO",
"local_orig": true,
"local_resp": false,
"missed_bytes": 2525951,
"history": "ShADadcgcgcgR",
"orig_pkts": 102,
"orig_ip_bytes": 4431,
"resp_pkts": 179,
"resp_ip_bytes": 260156,
"orig_l2_addr": "34:41:5d:9f:0d:8f",
"resp_l2_addr": "02:42:c0:a8:00:02"
}
Connection entry in conn.log
Note the uid value of C3R4Ar79TjjOQZDk1, which is seen in the following HTTP request in http.log:
{
"ts": 1616187600.226666,
"uid": "C3R4Ar79TjjOQZDk1",
"id.orig_h": "192.168.0.132",
"id.orig_p": 50395,
"id.resp_h": "142.250.34.2",
"id.resp_p": 80,
"trans_depth": 1,
"method": "GET",
"host": "edgedl.gvt1.com",
"uri": "/chrome_updater.exe",
"version": "1.1",
"user_agent": "Google Update/1.3.36.72;winhttp",
"request_body_len": 0,
"response_body_len": 2778496,
"status_code": 200,
"status_msg": "OK",
"tags": [],
"resp_fuids": [
"FnFzCVkm11eShPHLb"
],
"resp_mime_types": [
"application/x-dosexec"
]
}
HTTP request in http.log
In the above log entry, we see a few additional fields, such as the uri, method, host, etc. -- all items specific to HTTP. Additionally, the value in resp_fuids (FnFzCVkm11eShPHLb) corresponds to a unique ID for the file associated with this request. This value is observed in the fuid field of the files.log entry shown below:
{
"ts": 1616187600.257684,
"fuid": "FnFzCVkm11eShPHLb",
"tx_hosts": [
"142.250.34.2"
],
"rx_hosts": [
"192.168.0.132"
],
"conn_uids": [
"C3R4Ar79TjjOQZDk1"
],
"source": "HTTP",
"depth": 0,
"analyzers": [
"MD5",
"SHA1",
"PE"
],
"mime_type": "application/x-dosexec",
"duration": 0.34926891326904297,
"local_orig": false,
"is_orig": false,
"seen_bytes": 252545,
"total_bytes": 2778496,
"missing_bytes": 2525951,
"overflow_bytes": 0,
"timedout": false
}
Finally, as this was a PE, it was examined by Zeek's PE analyzer. In the following pe.log entry, we see FnFzCVkm11eShPHLb in the id field, along with additional information about the binary:
{
"ts": 1616187600.273825,
"id": "FnFzCVkm11eShPHLb",
"machine": "AMD64",
"compile_ts": 1615499290,
"os": "Windows XP x64 or Server 2003",
"subsystem": "WINDOWS_GUI",
"is_exe": true,
"is_64bit": true,
"uses_aslr": true,
"uses_dep": true,
"uses_code_integrity": false,
"uses_seh": true,
"has_import_table": true,
"has_export_table": false,
"has_cert_table": true,
"has_debug_data": true,
"section_names": [
".text",
".rdata",
".data",
".pdata",
".00cfg",
".rsrc",
".reloc"
]
}
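The chain above (uid in conn.log and http.log, resp_fuids in http.log, fuid in files.log, id in pe.log) can also be walked programmatically. The following is a rough sketch in Python, assuming JSON logs; the inline records are abbreviated copies of the entries shown above, and the helper names `index_by` and `correlate` are made up for illustration.

```python
import json

# Abbreviated copies of the log entries shown above, one JSON object per line.
CONN = ['{"uid": "C3R4Ar79TjjOQZDk1", "id.orig_h": "192.168.0.132", '
        '"id.resp_h": "142.250.34.2", "id.resp_p": 80}']
HTTP = ['{"uid": "C3R4Ar79TjjOQZDk1", "host": "edgedl.gvt1.com", '
        '"uri": "/chrome_updater.exe", "resp_fuids": ["FnFzCVkm11eShPHLb"]}']
FILES = ['{"fuid": "FnFzCVkm11eShPHLb", "conn_uids": ["C3R4Ar79TjjOQZDk1"], '
         '"mime_type": "application/x-dosexec"}']
PE = ['{"id": "FnFzCVkm11eShPHLb", "machine": "AMD64", "is_64bit": true}']


def index_by(lines, key):
    """Parse JSON lines and index the resulting records by one field."""
    return {rec[key]: rec for rec in map(json.loads, lines) if key in rec}


def correlate(conn, http, files, pe):
    """Follow uid -> resp_fuids -> fuid -> pe id; yield one summary per file."""
    conns = index_by(conn, "uid")
    files_by_fuid = index_by(files, "fuid")
    pe_by_id = index_by(pe, "id")
    for line in http:
        req = json.loads(line)
        for fuid in req.get("resp_fuids", []):
            yield {
                "client": conns.get(req["uid"], {}).get("id.orig_h"),
                "url": req.get("host", "") + req.get("uri", ""),
                "mime_type": files_by_fuid.get(fuid, {}).get("mime_type"),
                "machine": pe_by_id.get(fuid, {}).get("machine"),
            }


results = list(correlate(CONN, HTTP, FILES, PE))
for row in results:
    print(row)
```

Missing links in the chain (for example, a file with no pe.log entry) simply show up as None in the summary rather than raising an error.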
With some of these high-level basics out of the way, I'll now go into some more specific examples.
The Scenario
On a recent attack simulation project, our team was dropped onto a customer's highly critical OT/ICS network, with the directive of being extremely diligent to avoid any sort of disruption of controllers, supervisory systems, management systems, etc. Rules around scanning, discovery, and enumeration activities were very prohibitive. However, we were provided access to a monitoring/SPAN port which mirrored traffic from certain network segments. This was a perfect source of data to analyze with Zeek, and helped further guide our active testing efforts while respecting the customer's constraints.
For the following examples, we'll be using jq to parse Zeek's various logs, with commands of the form jq [query] [log file].
Extracting DNS queries from dns.log
Perhaps the simplest -- and maybe most obvious -- example is using Zeek's dns.log to gather information on DNS queries.
$ jq '. | {client: ."id.orig_h", server: ."id.resp_h", query: .query, type: .qtype_name, answers: .answers}' dns.log
{
"client": "192.168.11.198",
"server": "192.168.102.1",
"query": "dci.sophosupd.net",
"type": "A",
"answers": [
"d27v6ck90qm3ay.cloudfront.net",
"99.84.106.91",
"99.84.106.109",
"99.84.106.129",
"99.84.106.76"
]
}
{
"client": "192.168.11.30",
"server": "192.168.102.1",
"query": "ping3.teamviewer.com",
"type": "A",
"answers": [
"188.172.214.62",
"213.227.173.158",
"162.220.222.190",
"162.250.5.94",
"162.250.6.158"
]
}
{
"client": "192.168.11.113",
"server": "192.168.11.255",
"query": "FILESERVER02",
"type": "NB",
"answers": null
}
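The same DNS data can be flipped around into a passive "IP address to hostname" map, which is handy for labeling addresses that show up in other logs. A small Python sketch of that idea follows; it assumes JSON logs, the sample records are abbreviated from the output above, and `names_by_address` is a hypothetical helper name.

```python
import ipaddress
import json
from collections import defaultdict

# Abbreviated dns.log records based on the output above.
DNS_LOG = [
    '{"id.orig_h": "192.168.11.198", "query": "dci.sophosupd.net", '
    '"answers": ["d27v6ck90qm3ay.cloudfront.net", "99.84.106.91", "99.84.106.109"]}',
    '{"id.orig_h": "192.168.11.30", "query": "ping3.teamviewer.com", '
    '"answers": ["188.172.214.62", "213.227.173.158"]}',
    '{"id.orig_h": "192.168.11.113", "query": "FILESERVER02"}',
]


def is_ip(value):
    """Return True if the answer is an IP address rather than, e.g., a CNAME."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def names_by_address(lines):
    """Map each answered IP address to the set of names that resolved to it."""
    names = defaultdict(set)
    for line in lines:
        rec = json.loads(line)
        for answer in rec.get("answers") or []:  # unanswered queries have no answers
            if is_ip(answer):
                names[answer].add(rec.get("query", "?"))
    return names


names = names_by_address(DNS_LOG)
for address in sorted(names):
    print(address, ", ".join(sorted(names[address])))
```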
Finding listening services (or "scanning without scanning")
In lieu of sending traffic to the target network(s), we let Zeek do the heavy lifting in analyzing which hosts are likely listening on which ports, and which application-layer protocols are observed on those ports.
Command
$ jq '{host: .host, port: .port_num, proto: .port_proto, service: .service}' known_services.log
Example Output
{
"host": "192.168.11.196",
"port": 5900,
"proto": "tcp",
"service": [
"RFB"
]
}
{
"host": "192.168.10.52",
"port": 502,
"proto": "tcp",
"service": [
"MODBUS"
]
}
{
"host": "192.168.102.1",
"port": 53,
"proto": "udp",
"service": [
"DNS"
]
}
{
"host": "192.168.11.195",
"port": 135,
"proto": "tcp",
"service": [
"DCE_RPC"
]
}
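To turn this into something closer to a scan report, the entries can be grouped per host. The sketch below is one way to do that in Python, assuming JSON logs; the sample lines mirror the output above using the field names from known_services.log, and `target_list` is an illustrative name.

```python
import json
from collections import defaultdict

# Sample known_services.log records mirroring the output above.
KNOWN_SERVICES = [
    '{"host": "192.168.11.196", "port_num": 5900, "port_proto": "tcp", "service": ["RFB"]}',
    '{"host": "192.168.10.52", "port_num": 502, "port_proto": "tcp", "service": ["MODBUS"]}',
    '{"host": "192.168.102.1", "port_num": 53, "port_proto": "udp", "service": ["DNS"]}',
    '{"host": "192.168.11.195", "port_num": 135, "port_proto": "tcp", "service": ["DCE_RPC"]}',
]


def target_list(lines):
    """Group observed (port, protocol, service) tuples by host, de-duplicated."""
    ports = defaultdict(set)
    for line in lines:
        rec = json.loads(line)
        label = ",".join(rec.get("service", [])) or "unknown"
        ports[rec["host"]].add((rec["port_num"], rec["port_proto"], label))
    return {host: sorted(entries) for host, entries in ports.items()}


targets = target_list(KNOWN_SERVICES)
for host, entries in sorted(targets.items()):
    print(host, " ".join(f"{port}/{proto}({label})" for port, proto, label in entries))
```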
Hosts with access to other subnets
In this example, we query the connection log (conn.log) for connections where either endpoint is a 192.168.11.x address, to see which hosts are talking across subnets. This is useful when trying to identify possible pivots.
Command
$ jq '. | select((."id.resp_h" | startswith("192.168.11.")) or (."id.orig_h" | startswith("192.168.11."))) | {src: ."id.orig_h", dst: ."id.resp_h"}' conn.log
Example Output
{
"src": "192.168.9.15",
"dst": "192.168.11.1"
}
{
"src": "192.168.9.109",
"dst": "192.168.11.140"
}
{
"src": "192.168.9.12",
"dst": "192.168.11.1"
}
Hosts with access to other subnets and respective destination ports
We can take the above example a step further and also query for the ports associated with the conversation(s) to get even more insight about the relationships between hosts/devices.
Command
$ jq '. | select((."id.resp_h" | startswith("192.168.11.")) or (."id.orig_h" | startswith("192.168.11."))) | {src: ."id.orig_h", srcport: ."id.orig_p", dst: ."id.resp_h", dstport: ."id.resp_p"}' conn.log
Example Output
{
"src": "192.168.9.21",
"srcport": 52433,
"dst": "192.168.11.1",
"dstport": 88
}
{
"src": "192.168.9.109",
"srcport": 61067,
"dst": "192.168.11.140",
"dstport": 80
}
{
"src": "192.168.9.21",
"srcport": 52432,
"dst": "192.168.11.1",
"dstport": 445
}
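String prefix matching is quick, but it cannot tell whether both endpoints sit in the same subnet. A slightly stricter take is sketched below in Python with the standard ipaddress module. It assumes /24 subnets, which is a guess for illustration and not something stated about the customer network; the fourth sample record is hypothetical, and `cross_subnet_flows` is a made-up helper name.

```python
import ipaddress
import json

# The first three records mirror the output above; the last one is a
# hypothetical same-subnet connection that should be filtered out.
CONN_LOG = [
    '{"id.orig_h": "192.168.9.21", "id.orig_p": 52433, "id.resp_h": "192.168.11.1", "id.resp_p": 88}',
    '{"id.orig_h": "192.168.9.109", "id.orig_p": 61067, "id.resp_h": "192.168.11.140", "id.resp_p": 80}',
    '{"id.orig_h": "192.168.9.21", "id.orig_p": 52432, "id.resp_h": "192.168.11.1", "id.resp_p": 445}',
    '{"id.orig_h": "192.168.11.5", "id.orig_p": 40000, "id.resp_h": "192.168.11.1", "id.resp_p": 53}',
]


def cross_subnet_flows(lines, prefix_len=24):
    """Yield unique (src, dst, dst_port) tuples whose endpoints are in different subnets.

    prefix_len is an assumed subnet size; adjust it to the real network layout.
    """
    seen = set()
    for line in lines:
        rec = json.loads(line)
        src = ipaddress.ip_address(rec["id.orig_h"])
        dst = ipaddress.ip_address(rec["id.resp_h"])
        if src.version != 4 or dst.version != 4:
            continue  # keep the sketch IPv4-only
        src_net = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
        flow = (str(src), str(dst), rec["id.resp_p"])
        if dst not in src_net and flow not in seen:
            seen.add(flow)
            yield flow


flows = list(cross_subnet_flows(CONN_LOG))
for src, dst, port in flows:
    print(f"{src} -> {dst}:{port}")
```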
Cleartext FTP passwords
Note: password logging needs to be enabled first by adding the following line to local.zeek:
redef FTP::default_capture_password = T;
In this example, we query ftp.log for very simple values: usernames and passwords.
Command
$ jq '. | {host: ."id.resp_h", port: ."id.resp_p", username: .user, password: .password}' ftp.log
Example Output
{
"host": "192.168.11.196",
"port": 21,
"username": "upload",
"password": "upload123"
}
Session IDs in URLs
Zeek's HTTP analyzer will extract elements from HTTP requests, including the method, URI, User-Agent, etc. In the following example, we query for any uri field containing the string sessionID (with a case-insensitive match).
Command
$ jq '. | select((.uri // "") | test("sessionID"; "i")) | {host: ."id.resp_h", port: ."id.resp_p", uri: .uri}' http.log
Example Output
{
"host": "192.168.11.196",
"port": 8080,
"uri": "/login.jsp;JSESSIONID=D7E73C21F471E6488CE00B50FD0E5186?client=client"
}
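Once matching URIs are found, pulling out the token itself is a small extra step. The following Python sketch shows one way, assuming JSON logs; the regular expression is a loose heuristic of my own for "parameter whose name contains session", the second sample record is hypothetical, and `session_tokens` is an illustrative name.

```python
import json
import re

# Matches name=value pairs whose name contains "session" (case-insensitive),
# e.g. ";JSESSIONID=..." path parameters or "?sessionid=..." query parameters.
SESSION_RE = re.compile(r"\b(\w*session\w*)=([^;?&#/\s]+)", re.IGNORECASE)

# The first record mirrors the output above; the second is a hypothetical
# request without any session identifier.
HTTP_LOG = [
    '{"id.resp_h": "192.168.11.196", "id.resp_p": 8080, '
    '"uri": "/login.jsp;JSESSIONID=D7E73C21F471E6488CE00B50FD0E5186?client=client"}',
    '{"id.resp_h": "192.168.11.196", "id.resp_p": 8080, "uri": "/index.html"}',
]


def session_tokens(lines):
    """Yield (host, port, parameter name, token) for session IDs seen in URIs."""
    for line in lines:
        rec = json.loads(line)
        for match in SESSION_RE.finditer(rec.get("uri") or ""):
            yield (rec["id.resp_h"], rec["id.resp_p"], match.group(1), match.group(2))


tokens = list(session_tokens(HTTP_LOG))
for host, port, name, value in tokens:
    print(f"{host}:{port} {name}={value}")
```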
Software/version inventory
Zeek's software.log can be used to identify which applications/services and their respective versions (where available) are observed, including both clients and servers, as shown in the following example.
Command
$ jq '. | {host: .host, port: .host_p, software: .unparsed_version}' software.log
Example Output
{
"host": "192.168.9.140",
"port": 80,
"software": "GoAhead-Webs"
}
{
"host": "192.168.9.13",
"port": 8080,
"software": "Apache-Coyote/1.1"
}
{
"host": "192.168.9.13",
"port": null,
"software": "PH.Framework.Communication.SshNet.SshClient.0.0.1"
}
VNC Port and Desktop/Display Name
The VNC (or, rather, "RFB") analyzer can pull additional information about VNC servers and display names. In the following example, we query rfb.log to identify which VNC servers were observed.
Command
$ jq '. | {host: ."id.resp_h", port: ."id.resp_p", title: .desktop_name}' rfb.log
Example Output
{
"host": "192.168.9.140",
"port": 5900,
"title": "PanelView VNC Server"
}
{
"host": "192.168.10.61",
"port": 5900,
"title": "admin-pc ( 192.168.10.61 ) - service mode"
}
Correlating from an HTTP request to an extracted file
Here we have a longer, albeit distilled, example to demonstrate correlating an HTTP request down to an extracted file. In this case, we wanted to identify XML files containing configuration data, such as credentials. First we'll look in http.log for any (plaintext) HTTP requests that fetched an XML file.
Filtering for specific MIME types in http.log
Command
$ jq '. | select(any(.resp_mime_types[]?; test("xml"))) | {host: .host, uri: .uri, fuids: .resp_fuids, mime_type: .resp_mime_types}' http.log
Example Output
{
"host": "192.168.10.110",
"uri": "/config.xml",
"fuids": [
"F7Hil53SZhP7kZbkm4"
],
"mime_type": [
"application/xml"
]
}
As an identifier for a file (fuid) was returned, we know there was a file associated with this. So, we then want to identify the name of the extracted file by querying files.log.
Filtering for extracted files in files.log
Command
$ jq '. | select(.fuid=="F7Hil53SZhP7kZbkm4") | .extracted' files.log
Example Output
"extract-1619800042.170101-HTTP-F7Hil53SZhP7kZbkm4"
Finally, we can simply cat the extracted file on disk.
Command
$ cat /opt/zeek/logs/current/extract_files/extract-1619800042.170101-HTTP-F7Hil53SZhP7kZbkm4
Example XML file with credentials
<?xml version="1.0" encoding="UTF-8"?>
<connectionStrings>
<add name="ud_DEV" connectionString="connectDB=uDB; uid=db2admin; pwd=password; dbalias=uDB;" providerName="System.Data.Odbc" />
</connectionStrings>
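When many such files have been extracted, reading each one by hand gets tedious. As a rough sketch, the credentials could be pulled out with Python's standard XML parser; this assumes the files follow the same connectionStrings layout as the example above, and `connection_credentials` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The example file shown above, as it would be read from disk in binary mode.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<connectionStrings>
<add name="ud_DEV" connectionString="connectDB=uDB; uid=db2admin; pwd=password; dbalias=uDB;" providerName="System.Data.Odbc" />
</connectionStrings>"""


def connection_credentials(xml_bytes):
    """Return (name, uid, pwd) for every <add> element with a connectionString."""
    found = []
    root = ET.fromstring(xml_bytes)
    for node in root.iter("add"):
        pairs = {}
        # Connection strings are "key=value" pairs separated by semicolons.
        for part in node.get("connectionString", "").split(";"):
            key, sep, value = part.partition("=")
            if sep:
                pairs[key.strip()] = value.strip()
        found.append((node.get("name"), pairs.get("uid"), pairs.get("pwd")))
    return found


creds = connection_credentials(XML)
for name, user, password in creds:
    print(f"{name}: {user} / {password}")
```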
Conclusion
This post probably does very little justice to just how powerful Zeek truly is, and barely scratches the surface of its usefulness for both defense and offense. Shuttling Zeek logs into something like Elasticsearch can provide tremendous awareness about network activity, but that's not always possible (or reasonable) in an offensive operation. Combined with a tool like jq -- and a source of network traffic, of course -- Zeek's capabilities can be quickly and easily leveraged to gain more insight into the target network and hosts/devices.
For anyone interested in doing more with Zeek from either angle, here are a few recommended resources: