0x0allenace/Threat-Hunting-Recon

Cyber Threat Hunting with Splunk — Part 1: Reconnaissance

Author: Allen Ace
Lab: Home DFIR Lab (Oracle VirtualBox + BOTS v2)
Date: 2025-09-25

Executive summary

This document describes a hands-on threat hunting exercise using Splunk and the Boss of the SOC (BOTS) v2 dataset. The focus is detecting reconnaissance against public-facing webservers carried out via a non-standard browser. Key findings include a suspicious user agent (NaenaraBrowser) connecting via an ExpressVPN IP and downloading company_contacts.xlsx.

Goals & Objectives

  • Practice Splunk-based threat hunting workflows
  • Detect reconnaissance behaviors and anomalous user agents
  • Pivot from user agents to source IPs and accessed resources
  • Produce IOCs and reproducible hunt queries

Lab Setup

VM used: Windows VM1 (Splunk instance)
Dataset: Boss of the SOC v2 (Attack Only)
Ingest steps (summary):

  1. Download Attack-Only dataset from https://github.com/splunk/botsv2
  2. Extract with 7-Zip (decompress then unarchive)
  3. Move botsv2_data_set into C:\Program Files\Splunk\etc\apps
  4. Restart Splunk (if required) and verify ingestion with:

index=botsv2 sourcetype=stream:smtp

Set time picker to 01 Aug 2017 → 31 Aug 2017.

[Screenshot: Time picker]

Threat Intel Excerpt (Hunt seed)

“The unknown adversary is conducting reconnaissance of public-facing webservers using a non-standard browser over port 80.”

Hunt hypothesis: An adversary probed www.froth.ly in August 2017 using a non-standard browser. We expect to find anomalous user agent strings and associated source IPs that accessed sensitive assets.

Our first query uses the stream:http sourcetype to see which fields are available.

index=botsv2 sourcetype=stream:http

As seen below, there are fields for the user agent string (http_user_agent) and the website (site) that we can use to determine which user agents accessed our organization's website, froth.ly.

[Screenshot: Source type stream]

We can now write a query that shows every user agent string that accessed www.froth.ly and the number of times each was seen, sorted so the least frequent appear first.

index=botsv2 sourcetype=stream:http site="www.froth.ly" | stats count by http_user_agent | sort + count

[Screenshot: Source type by count]
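The same stack-counting idea can be sketched outside Splunk. This minimal Python example assumes the events have been exported as dictionaries carrying the site and http_user_agent fields. The function name and the sample events are hypothetical and only illustrate counting user agents and listing the rarest first.

```python
from collections import Counter


def rare_user_agents(events, site):
    """Count user agents seen for one site and return them rarest first."""
    counts = Counter(
        e["http_user_agent"]
        for e in events
        if e.get("site") == site and e.get("http_user_agent")
    )
    # Ascending by count, then by name so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# Hypothetical sample events, not taken from the BOTS v2 dataset.
sample_events = [
    {"site": "www.froth.ly", "http_user_agent": "CommonBrowser/1.0"},
    {"site": "www.froth.ly", "http_user_agent": "CommonBrowser/1.0"},
    {"site": "www.froth.ly", "http_user_agent": "RareBrowser/0.1"},
    {"site": "other.example", "http_user_agent": "CommonBrowser/1.0"},
]

for agent, count in rare_user_agents(sample_events, "www.froth.ly"):
    print(count, agent)
```

Sorting in ascending order puts low-count user agents at the top, which is where a one-off non-standard browser would be expected to appear.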

One user agent string stands out as unusual. Using a public user agent parser, linked below, we can learn more about it.

https://explore.whatismybrowser.com/useragents/parse/

[Screenshot: User agent parser website]

As seen above, the Naenara browser is being run from a Fedora Linux system. A Linux system is not unusual; however, the browser deserves more research. A web search shows that Naenara is a North Korean web browser.

[Screenshot: Naenara browser]
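As a rough illustration of how a non-standard browser can be flagged automatically, the sketch below splits a user agent string into its product tokens and reports any token that is not on an allowlist. The allowlist contents, the regular expression, and the function name are assumptions made for this example, not part of the original hunt. A real allowlist would need tuning for the environment.

```python
import re

# Hypothetical allowlist of product tokens expected in mainstream user agents.
COMMON_TOKENS = {
    "Mozilla", "AppleWebKit", "Chrome", "Safari", "Firefox",
    "Gecko", "Edge", "Version", "Fedora",
}

# Matches product/version pairs such as "Gecko/20130508".
PRODUCT_RE = re.compile(r"([A-Za-z][A-Za-z0-9_-]*)/[^\s()]+")


def unusual_tokens(user_agent):
    """Return product tokens in the user agent that are not on the allowlist."""
    # Drop parenthesised comments so only product/version pairs remain.
    without_comments = re.sub(r"\([^)]*\)", " ", user_agent)
    tokens = PRODUCT_RE.findall(without_comments)
    return [t for t in tokens if t not in COMMON_TOKENS]


ua = (
    "Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 "
    "Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4"
)
print(unusual_tokens(ua))
```

An allowlist is used here because the set of common browser tokens is small, while the set of possible unusual ones is open-ended.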

While this is informative, we should not jump to conclusions when it comes to attribution. We can now pivot from the user agent string to discover more, such as the source IP address.

We can pivot by clicking the user agent string and selecting View events. This adds the user agent string to our query.

[Screenshot: User agent pivot]

Now we can add a stats count by source and destination IP address (src and dest) to determine which IP address the Naenara browser sessions used to connect to our website.

index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4" | stats count by src dest

As seen below, all the sessions were from a single IP address, 85.203.47.86.

[Screenshot: Sessions]

Now that we know the IP address the user agent used to connect to www.froth.ly, we can research it using open source intelligence. IPinfo.io, which offers a free account, provides details on 85.203.47.86.

https://ipinfo.io/

As seen below, 85.203.47.86 is part of the ExpressVPN service.

[Screenshot: ExpressVPN]

Continue Hunting

At this point we know that a suspicious user agent accessed our public-facing website. We need to continue our analysis to determine what information was accessed. This may give us insight into the intent of the attacker.

By rerunning our previous search without the stats command, we can look at the other available fields and what kind of information they hold.

index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4"

As seen below, there is a field named http_content_type. That field shows the type of content that was accessed. The most interesting content type is a spreadsheet.

[Screenshot: HTTP content type]

By adding the interesting content type to our search and reviewing the uri_path field, we can see the name of the spreadsheet is company_contacts.xlsx.

[Screenshot: Company contacts]

Creating a table presents the information in a more useful format.

index=botsv2 sourcetype=stream:http site="www.froth.ly" http_user_agent="Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4" http_content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" | table _time, src, dest, uri_path, url

[Screenshot: Table format]
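The filter-then-tabulate step can also be sketched in Python for events exported from Splunk. The field names match those used in the query, but the function name, the destination addresses, the timestamps, and the URI paths in the sample data are hypothetical placeholders, not values from the dataset.

```python
NAENARA_UA = (
    "Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) Gecko/20130508 "
    "Fedora/1.9.1-2.5.rs3.0 NaenaraBrowser/3.5b4"
)
XLSX_TYPE = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def spreadsheet_downloads(events, user_agent, content_type=XLSX_TYPE):
    """Return (_time, src, dest, uri_path) rows for matching events, oldest first."""
    rows = [
        (e["_time"], e["src"], e["dest"], e["uri_path"])
        for e in events
        if e.get("http_user_agent") == user_agent
        and e.get("http_content_type") == content_type
    ]
    # ISO 8601 timestamps sort chronologically as plain strings.
    return sorted(rows)


# Hypothetical sample events; only the source IP and user agent come from the hunt.
sample_events = [
    {"_time": "2017-08-01T00:00:05", "src": "85.203.47.86", "dest": "192.0.2.10",
     "uri_path": "/index.html", "http_user_agent": NAENARA_UA,
     "http_content_type": "text/html"},
    {"_time": "2017-08-01T00:00:09", "src": "85.203.47.86", "dest": "192.0.2.10",
     "uri_path": "/example/contacts.xlsx", "http_user_agent": NAENARA_UA,
     "http_content_type": XLSX_TYPE},
    {"_time": "2017-08-01T00:01:00", "src": "198.51.100.7", "dest": "192.0.2.10",
     "uri_path": "/example/other.xlsx", "http_user_agent": "CommonBrowser/1.0",
     "http_content_type": XLSX_TYPE},
]

for row in spreadsheet_downloads(sample_events, NAENARA_UA):
    print(row)
```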

A spreadsheet with company contacts is the kind of information an adversary would search for during reconnaissance. It could provide them with a target list for social engineering or phishing.

Below is a diagram of what we know.

[Diagram: Recon diagram]

Findings

  • Anomalous user agent: NaenaraBrowser/3.5b4 (appears to be Naenara, a DPRK browser)
  • Source IP: 85.203.47.86 (identified as an ExpressVPN endpoint via IPinfo)
  • Accessed resource: /path/to/company_contacts.xlsx (download of spreadsheet)
  • Conclusion: Adversary-level reconnaissance likely aimed at harvesting contact lists for spear-phishing or follow-on attacks. Attribution is not conclusive due to VPN usage; focus on IOCs and response.

IOCs (Indicators of Compromise)

  • User agent: Mozilla/5.0 (X11; U; Linux i686; ko-KP; rv: 19.1br) ... NaenaraBrowser/3.5b4
  • IP: 85.203.47.86 (ExpressVPN endpoint)
  • URI: /company_contacts.xlsx
  • Time range: August 2017 (see timeline or screenshots)

Save these to ioc.yml or iocs.csv for ingestion by detection tools.
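A minimal sketch of writing the IOCs to CSV with the Python standard library follows. The column names (type, value, note) and the function name are assumptions. Detection tools differ in the schema they expect, so the layout would need adjusting to the target tool.

```python
import csv
import io

# IOC values copied from the list above; the column layout is an assumption.
IOCS = [
    {"type": "user_agent", "value": "NaenaraBrowser/3.5b4", "note": "Non-standard browser"},
    {"type": "ip", "value": "85.203.47.86", "note": "ExpressVPN endpoint"},
    {"type": "uri", "value": "/company_contacts.xlsx", "note": "Downloaded spreadsheet"},
]


def iocs_to_csv(iocs):
    """Serialize IOC dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["type", "value", "note"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(iocs)
    return buffer.getvalue()


csv_text = iocs_to_csv(IOCS)
print(csv_text)

# To save to disk instead of printing:
# with open("iocs.csv", "w", newline="") as handle:
#     handle.write(csv_text)
```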

How to reproduce (short)

- Ingest BOTS v2 Attack dataset into Splunk (place under etc/apps/ and restart).
- Set time picker to Aug 2017.
- Run the queries listed above.
- Pivot on anomalous UAs and follow the src field to get IPs.

Next steps (Part 2: Initial Access)

- Investigate whether the attacker returned using the same or different UAs/IPs.
- Hunt for suspicious POSTs, form submissions, or exploit attempts in server logs.
- Check web server access logs for HTTP referrers and potential exploit strings.
- Expand hunting to DNS logs and firewall logs for related hostnames/IPs.

References

- Splunk BOTS v2: https://github.com/splunk/botsv2
- WhatIsMyBrowser UA parser: https://explore.whatismybrowser.com/useragents/parse/
- IP enrichment: https://ipinfo.io/

🔗 Related Investigations

This investigation continues into the next phase of the attack lifecycle, focusing on how the adversary gains initial access after reconnaissance.

License

This repository is licensed under the MIT License. See LICENSE for details.
