Cyber Threat Hunting with Splunk — Part 1: Reconnaissance

Author: Allen Ace
Lab: Home DFIR Lab (Oracle VirtualBox + BOTS v2)
Date: 2025-09-25

Executive summary

This document describes a hands-on threat hunting exercise using Splunk and the Boss of the SOC (BOTS) v2 dataset. The focus is detecting reconnaissance against public-facing webservers carried out via a non-standard browser. Key findings include a suspicious user agent (NaenaraBrowser) connecting via an ExpressVPN IP and downloading company_contacts.xlsx.

Goals & Objectives

Practice Splunk-based threat hunting workflows
Detect reconnaissance behaviors and anomalous user agents
Pivot from user agents to source IPs and accessed resources
Produce IOCs and reproducible hunt queries

Lab Setup

VM used: Windows VM1 (Splunk instance)
Dataset: Boss of the SOC v2 (Attack Only)
Ingest steps (summary):

Download Attack-Only dataset from https://github.com/splunk/botsv2
Extract with 7-Zip (decompress then unarchive)
Move botsv2_data_set into C:\Program Files\Splunk\etc\apps
Restart Splunk (if required) and verify ingestion with:

index=botsv2 sourcetype=stream:smtp

Set time picker to 01 Aug 2017 → 31 Aug 2017.
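As a complementary check, a per-sourcetype event count shows at a glance whether the sourcetypes used later in this hunt (such as stream:http) were ingested. This is only a sketch: it assumes the data landed in the botsv2 index, and the inline earliest/latest bounds are an alternative to the time picker rather than part of the original steps.

```spl
index=botsv2 earliest="08/01/2017:00:00:00" latest="09/01/2017:00:00:00"
| stats count by sourcetype
| sort - count
```

If stream:http is missing from the results or shows a count of zero, the ingest or the time range is the first thing to recheck.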

Threat Intel Excerpt (Hunt seed)

“The unknown adversary is conducting reconnaissance of public-facing webservers using a non-standard browser over port 80.”

Hunt hypothesis: An adversary probed www.froth.ly in August 2017 using a non-standard browser. We expect to find anomalous user agent strings and associated source IPs that accessed sensitive assets.

Our first query searches the stream:http sourcetype to see which fields are available.

index=botsv2 sourcetype=stream:http

As seen below, there are fields for the user agent string ("http_user_agent") and the website ("site") that we can use to determine which user agents accessed our organization's website, froth.ly.
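The field list can also be produced as a table instead of being read from the fields sidebar. The sketch below uses the fieldsummary command on a sample of events; the sample size of 10000 is an arbitrary assumption chosen to keep the search fast.

```spl
index=botsv2 sourcetype=stream:http
| head 10000
| fieldsummary
| table field count distinct_count
| sort - count
```

Fields such as http_user_agent and site should appear in the output along with how many distinct values each one holds.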

We can now write a query that lists every user agent string that accessed our website, www.froth.ly, along with the number of times each was seen.

index=botsv2 sourcetype=stream:http site="www.froth.ly" | stats count by http_user_agent | sort + count
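A possible refinement, offered as a sketch rather than as part of the original hunt, is to also count the distinct source addresses per user agent. A rare user agent that comes from a single source is usually a better lead than one spread across many clients. The unique_sources field name is an assumption introduced here for illustration.

```spl
index=botsv2 sourcetype=stream:http site="www.froth.ly"
| stats count dc(src) as unique_sources by http_user_agent
| sort count
```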

Using the free online parser linked below, we can learn more about an unfamiliar user agent string.

https://explore.whatismybrowser.com/useragents/parse/

As seen above, the Naenara browser is being run from a Fedora Linux system. A Linux client is not uncommon, but the browser deserves more research. A web search shows that Naenara is a North Korean web browser.

While this is informative, we should not jump to conclusions about attribution. We can now pivot from the user agent string to discover more, such as the source IP address.

We can pivot by clicking the user agent string and selecting View events. This adds the user agent string to our query.

Now we can add a stats count by the source and destination IP addresses to determine which IP address the Naenara browser sessions used to connect to our website.

index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4" | stats count by src dest

As seen below, all the sessions came from a single IP address, 85.203.47.86. This is the address the user agent used to connect to www.froth.ly.

We can then research the IP address using open source intelligence. IPinfo.io offers lookups with a free account.

https://ipinfo.io/

As seen below, 85.203.47.86 is part of the ExpressVPN service.

Continue Hunting

At this point we know that a suspicious user agent accessed our public-facing website. We need to continue our analysis to determine what information was accessed. This may give us insight into the intent of the attacker.

Rerunning the previous query without the stats command returns the raw events, so we can review the other available fields and the kind of information they hold.

index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4"

As seen below, there is a field named http_content_type, which records the type of content that was accessed. The most interesting content type is a spreadsheet.
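To see every content type this client requested in one view, the values can be counted directly. This is a sketch: the wildcard match on NaenaraBrowser is a shorthand assumed here in place of the full user agent string, and it would also match any other Naenara version present in the data.

```spl
index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="*NaenaraBrowser*"
| stats count by http_content_type
| sort - count
```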

By adding the interesting content type to our search and reviewing the uri_path field, we can see the name of the spreadsheet is company_contacts.xlsx.

Putting the results in a table presents the information in a more useful format.

index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4" http_content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" | table _time, src, dest, uri_path, url

A spreadsheet with company contacts is the kind of information an adversary would look for during reconnaissance. It could provide them with a target list for social engineering or phishing.

Below is a diagram of what we know.

Findings

Anomalous user agent: NaenaraBrowser/3.5b4 (appears to be Naenara, a DPRK browser)
Source IP: 85.203.47.86 (identified as an ExpressVPN endpoint via IPinfo)
Accessed resource: /path/to/company_contacts.xlsx (download of spreadsheet)
Conclusion: Adversary-level reconnaissance likely aimed at harvesting contact lists for spear-phishing or follow-on attacks. Attribution is not conclusive due to VPN usage; focus on IOCs and response.

IOCs (Indicators of Compromise)

- User agent: Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) ... NaenaraBrowser/3.5b4
- IP: 85.203.47.86 (ExpressVPN endpoint)
- URI: /company_contacts.xlsx
- Time range: August 2017 (see timeline or screenshots)

Save these to ioc.yml or iocs.csv for ingestion by detection tools.
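One way to produce the CSV directly from Splunk is sketched below. It assumes the searching user has permission to write lookup files and that iocs.csv is an acceptable lookup name in the current app. The first_seen and last_seen field names are introduced here for illustration.

```spl
index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="*NaenaraBrowser*"
| stats min(_time) as first_seen max(_time) as last_seen count by src http_user_agent uri_path
| convert ctime(first_seen) ctime(last_seen)
| outputlookup iocs.csv
```

Each row then holds one source, user agent, and URI combination with its first and last sighting, which can be reviewed before handing the file to a detection tool.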

How to reproduce (short)

- Ingest the BOTS v2 Attack-Only dataset into Splunk (place under etc/apps/ and restart).
- Set time picker to Aug 2017.
- Run the queries listed above.
- Pivot on anomalous UAs and follow the src field to get IPs.

Next steps (Part 2: Initial Access)

- Investigate whether the attacker returned using the same or different UAs/IPs.
- Hunt for suspicious POSTs, form submissions, or exploit attempts in server logs.
- Check web server access logs for HTTP referrers and potential exploit strings.
- Expand hunting to DNS logs and firewall logs for related hostnames/IPs.
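As a starting point for the first item, the sketch below pivots on the known source IP instead of the user agent. It lists every site that address contacted, with the user agents it presented and the first and last time it was seen. It assumes the same stream:http fields used earlier, and the output field names are illustrative.

```spl
index=botsv2 sourcetype=stream:http src="85.203.47.86"
| stats count min(_time) as first_seen max(_time) as last_seen values(http_user_agent) as user_agents by site
| convert ctime(first_seen) ctime(last_seen)
```

Because the address belongs to a shared VPN service, any additional user agents found this way may belong to unrelated users and should be treated as leads, not confirmed adversary activity.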

References

- Splunk BOTS v2: https://github.com/splunk/botsv2
- WhatIsMyBrowser UA parser: https://explore.whatismybrowser.com/useragents/parse/
- IP enrichment: https://ipinfo.io/

🔗 Related Investigations

➡️ Part 2: Initial Access

This investigation continues into the next phase of the attack lifecycle, focusing on how the adversary gains initial access after reconnaissance.

License

This repository is licensed under the MIT License. See LICENSE for details.
