Threat Hunting for Uncategorized Proxy Events
Attackers rely on the abstraction provided between domains and IP addresses to make their infrastructure more resilient. A domain name can be registered in a matter of minutes, and multiple domains can be configured to point to the same host. This allows attackers to quickly switch between domains and subdomains to avoid detection. One trick experienced hunters use is to rely on the immature nature of these domains and hunt for malicious activity with that in mind. In this post, I’ll discuss HTTP proxy categorization and demonstrate how you can use Sqrrl to hunt for malware using previously unseen domains.
Most organizations deploy HTTP proxies as both an efficiency and security mechanism. A proxy is generally a hardware device that all clients must connect through to access the internet. In doing so, commonly visited pages are cached, delivering a performance boost to the organization's users and saving bandwidth. For the security practitioner, the proxy also creates a single point of protection and detection through its ability to log web browsing and block or alert on entries appearing in blacklists.
Figure 1: HTTP proxies serve as a choke point for web browsing
These blacklists are a subset of the proxy provider’s URL categorization service. By crawling and categorizing sites on the internet, proxies can be configured to deny access to websites meeting certain criteria. For example, many workplaces choose to block sites classified as pornography, and public schools often block gaming sites. Sites that have been associated with malware are often categorized accordingly and blocked by default with most proxies.
Proxy vendors have gotten good at automating their categorization using web crawlers and automated analysis of submissions from existing customers. That means that most of the URLs you encounter will already be categorized. But what does it mean when you encounter a URL that isn’t categorized?
First, the URL could be relatively new. It might also mean that it isn’t being indexed by search engines, or that it serves a very specific purpose and has gone undiscovered as a result. Of course, that purpose could be to direct you to a system hosting malware. Many times, it is likely a combination of these things. The important takeaway is that domains registered by malicious actors and used for things like phishing attempts and malware C2 often spend a bit more time as uncategorized sites as far as your proxy is concerned. That means we can hunt for them!
Hunting for Uncategorized Domains
Finding uncategorized domains only requires that you have access to the data feed generated by your proxy, that you’re logging URL categorization, and that those logs are collected somewhere that’s searchable. With that in mind, we ask the question “Did any system on my network make an HTTP request to an uncategorized site?”
This question can be answered with a simple query using Sqrrl Query Language as I’ve done here:
SELECT * FROM Sqrrl_ProxySG WHERE sc_filter_category = 'Unknown'
This query returns all HTTP proxy records (Sqrrl_ProxySG) matching a proxy category (sc_filter_category) of Unknown.
Figure 2: A list of all HTTP proxy records for uncategorized sites
If there are only a small number of results, this might be a fine place to start. In large networks or where a lot of unknowns exist, you may be better off performing an aggregation and sorting the visited sites by how many times they were requested, as I’ve done here:
SELECT COUNT(*),cs_uri_stem FROM Sqrrl_ProxySG WHERE sc_filter_category = 'Unknown' GROUP BY cs_uri_stem ORDER BY COUNT(*) DESC LIMIT 100
This query selects the URI stem field (cs_uri_stem) from the HTTP proxy data source where the category is unknown and groups and counts all unique entries for that field. The results are sorted by the count with only the top 100 values shown.
Figure 3: A list of all uncategorized sites grouped by URI stem
A couple of things to keep in mind. First, URL categorization is sometimes an optional feature or paid add-on provided by proxy vendors. You’ll need to ensure that service is enabled. Second, there is no standard naming convention for what proxy vendors call uncategorized URLs. For example, some vendors will refer to these as “Unknown” whereas others refer to them as “Uncategorized”. You’ll need to edit the above query to account for whatever your proxy identifies these sites as.
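If your proxy logs are also available outside Sqrrl, the same aggregation can be approximated in a few lines of Python. This is a minimal sketch, not a Sqrrl feature: it assumes each log record has already been parsed into a dictionary keyed by the field names used in the queries above, and the set of labels and the sample records are purely illustrative.

```python
from collections import Counter

# Labels that mean "uncategorized". These two come from the examples above;
# extend the set to match whatever your proxy vendor actually emits.
UNCATEGORIZED_LABELS = {"unknown", "uncategorized"}


def is_uncategorized(category):
    """Return True when a category label marks an uncategorized URL."""
    return category.strip().lower() in UNCATEGORIZED_LABELS


def top_uncategorized(records, limit=100):
    """Count requests per URI stem for uncategorized records, most frequent first."""
    counts = Counter(
        r["cs_uri_stem"]
        for r in records
        if is_uncategorized(r.get("sc_filter_category", ""))
    )
    return counts.most_common(limit)


# Hypothetical records, for illustration only.
sample = [
    {"cs_uri_stem": "/gate.php", "sc_filter_category": "Unknown"},
    {"cs_uri_stem": "/gate.php", "sc_filter_category": "unknown"},
    {"cs_uri_stem": "/index.html", "sc_filter_category": "News/Media"},
    {"cs_uri_stem": "/update", "sc_filter_category": "Uncategorized"},
]

print(top_uncategorized(sample))
```

Normalizing the label before comparing it means one function can cover both the “Unknown” and “Uncategorized” conventions without rewriting the filter for each vendor.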
Using Risk Triggers with Uncategorized Domains
Uncategorized domains aren’t rare enough to be used for alert-based detection, which is why they are well suited as a hunting input. However, these events are also useful in the context of other investigations. You can use Sqrrl’s risk trigger functionality to decorate entities with context indicating they’ve connected to an uncategorized site.
The risk trigger below will accomplish this goal.
Figure 4: A Risk Trigger for uncategorized site visits
Risk triggers are run once a day across the previous 24 hours of data. Should this risk trigger match an entity, it will assign a risk score to that entity. Those risk scores can be examined during an investigation, with higher-risk items bubbling up to the main Sqrrl dashboard.
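The logic of such a trigger can be illustrated outside the product. The sketch below is not Sqrrl’s implementation or configuration syntax; it only mirrors the behavior described above (look back 24 hours, score any entity that matched). The score value, the `timestamp` key, and the use of `c_ip` as the entity identifier are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical score; not a Sqrrl default.
UNCATEGORIZED_RISK = 10


def score_entities(records, now, window=timedelta(hours=24), score=UNCATEGORIZED_RISK):
    """Assign a flat risk score to each client that hit an uncategorized site in the window."""
    cutoff = now - window
    risk = {}
    for r in records:
        in_window = cutoff <= r["timestamp"] <= now
        if in_window and r["sc_filter_category"] == "Unknown":
            # One score per entity, regardless of how many requests matched.
            risk[r["c_ip"]] = score
    return risk


# Hypothetical records, for illustration only.
now = datetime(2017, 1, 2, 12, 0)
records = [
    {"c_ip": "10.0.0.5", "sc_filter_category": "Unknown", "timestamp": now - timedelta(hours=1)},
    {"c_ip": "10.0.0.6", "sc_filter_category": "Unknown", "timestamp": now - timedelta(hours=30)},
    {"c_ip": "10.0.0.7", "sc_filter_category": "News/Media", "timestamp": now - timedelta(hours=2)},
]

print(score_entities(records, now))
```

Only the first client is scored here: the second falls outside the 24-hour window and the third visited a categorized site.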
Tips for Investigating Uncategorized Domains
The same strategy you use to investigate any communication with a potentially hostile domain should apply to investigating suspicious uncategorized domains. When you’ve found one, start by performing a Google search on the domain to see if you can quickly determine that it is legitimate and rule out the finding. If a determination can’t be made, you can examine the relationships existing between your internal hosts and the domain.
You should consider examining the following relationships:
- Suspicious files downloaded from the domain. This could indicate the download of malware.
- Unsolicited outbound communication to the hostile domain with no referrer. This could indicate malware command and control.
- The reputation of the domain. This would clue you in if someone else discovered the domain is attacker-controlled.
Figure 5: An uncategorized site reveals interesting relationships
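The second relationship in the list, outbound requests with no referrer, is easy to check against raw proxy records. The following is a minimal sketch under stated assumptions: the `cs_host`, `cs_referer`, and `c_ip` keys are hypothetical names for the requested host, the referrer header, and the client address, and treating "-" as an empty value is a guess about the log format that you should verify against your own data.

```python
# Values treated as "no referrer". The "-" placeholder is an assumption
# about how empty fields appear in your proxy logs.
EMPTY_REFERRER = {"", "-"}


def clients_without_referrer(records, domain):
    """Return the sorted client IPs that requested `domain` with no referrer."""
    clients = {
        r["c_ip"]
        for r in records
        if r["cs_host"] == domain
        and (r.get("cs_referer") or "").strip() in EMPTY_REFERRER
    }
    return sorted(clients)


# Hypothetical records, for illustration only.
records = [
    {"c_ip": "10.0.0.5", "cs_host": "example.test", "cs_referer": "-"},
    {"c_ip": "10.0.0.9", "cs_host": "example.test", "cs_referer": "http://portal.example.test/"},
    {"c_ip": "10.0.0.7", "cs_host": "example.test", "cs_referer": ""},
    {"c_ip": "10.0.0.5", "cs_host": "other.test", "cs_referer": "-"},
]

print(clients_without_referrer(records, "example.test"))
```

A missing referrer is not proof of command and control on its own, since bookmarks and typed URLs produce the same pattern, so treat the resulting hosts as leads to examine rather than confirmed infections.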
This type of hunting exercise is most likely to find new malware, newer variants of existing malware, or phishing-related domains.
HTTP proxy domain categorization has several uses. Hunters can take advantage of rapid URL classification by searching for communication with as-yet-uncategorized domains to find evidence of malicious activity based on domains that are new or used for a specific purpose. This is a simple technique, but one I’ve used countless times to find malware that other tools haven’t been able to find. By searching for uncategorized domains in Sqrrl or creating an uncategorized domain risk trigger, you’ll be able to enrich your investigations similarly.