When Scanning Isn’t Enough: Practical Tips for Log4j Vulnerability Detection
The critical Log4j vulnerability (CVE-2021-44228) is being actively exploited and is a major concern for organisations worldwide. CyberCX is helping hundreds of customers respond to and recover from this incident.
In this post, we explain why traditional vulnerability scanning will not find all vulnerable instances of Log4j and what methods blue teams can use instead.
What’s Log4j and how is it exploited?
It’s entirely likely that you have Log4j present somewhere in your environment, even if you’d never heard the term “Log4j” before it appeared in headlines recently. That’s because Log4j is an extremely common open source library. Its flexible licence means that it’s used by a huge number of Java applications. Sometimes those applications themselves are packaged and used by other applications.
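Versions of Log4j 2 prior to 2.15.0 have a vulnerability where certain malicious user-provided input is processed in a way that lets an attacker execute code on the machine. The fix in 2.15.0 was later found to be incomplete in some non-default configurations, which is why 2.16.0 is also relevant. Once an attacker has code execution, they can conduct malicious activities such as: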
- collecting credentials
- installing ransomware and crypto miners
- exfiltrating data, or
- using the access to pivot further into the victim environment.
Why won’t vulnerability scanning find all vulnerable instances of Log4j?
Most remotely exploitable software vulnerabilities have signatures or characteristics that can be identified by vulnerability scanners run by both attackers and defenders. For example, to test for the 2014 Heartbleed vulnerability, a specially crafted data packet is sent to the target server. If the server responds in a certain way, it is vulnerable. If it does not respond that way, it is not vulnerable.
Unfortunately, Log4j does not work like this. Applications only pass input to the vulnerable Log4j logger under certain circumstances. What input is provided, and when, is up to the developer of the particular application. It might log a lot of input or it might only log limited and specific parameters. Alternatively, input might be logged only when the application is in a certain state. Logs may be sent to dedicated logging servers, or they might be batched up and sent to the logger well after the input is received from the user.
This creates a fundamental problem: if you ran a vulnerability scanner and the result was negative, is that because you’re not vulnerable, or because the scanner didn’t know that the application only sends user input to the vulnerable logger under certain conditions?
Vulnerability scanning is still a good place to start
To be clear – vulnerability scanners are useful in dealing with this situation. A positive detection can give you a simple, high-signal confirmation that you have exploitable Log4j in your environment. If a vulnerability scanner only identifies 50% of the vulnerable Log4j instances you have littered around your environment, you have still cut the problem space in half. But a negative scan result does not mean you can rest easy. There is still every chance that it was a false negative.
How can blue teams look for Log4Shell from the inside?
So what can be done?
First, many of the scanning problems can be worked around with carefully implemented scanning approaches. The problem of only certain data fields being logged can be addressed by brute force: stuffing a payload into every data field possible. For a scanner to know which data fields to try, it needs to be aware of the application's context. Running the scan through an intercepting proxy such as Burp Suite Pro or ZAP has proved to be a successful approach.
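The "payload in every field" idea can be sketched as below. This is a minimal illustration, not a scanner: it only builds the probe strings and sends nothing. The callback domain, the field list and the function name `build_probes` are all hypothetical. It assumes you control a DNS or LDAP listener on that domain and have authorisation to test the target. Each field gets its own random token so that a later callback can be traced back to the field that triggered it, even if the application logs the value long after the request was made.

```python
import uuid

# Hypothetical callback domain that you control and monitor (an assumption).
CALLBACK_DOMAIN = "callback.example.internal"


def build_probes(field_names, callback_domain=CALLBACK_DOMAIN):
    """Build one uniquely tagged lookup string per input field.

    Returns (probes, token_index):
      probes      maps field name -> probe string to place in that field
      token_index maps token      -> field name, for correlating callbacks
    """
    probes = {}
    token_index = {}
    for name in field_names:
        token = uuid.uuid4().hex[:12]  # random tag, unique per field
        probes[name] = "${jndi:ldap://" + token + "." + callback_domain + "/a}"
        token_index[token] = name
    return probes, token_index


# Illustrative field names only; a real list comes from the application itself,
# for example from requests captured by an intercepting proxy.
fields = ["User-Agent", "X-Forwarded-For", "Referer", "username", "search"]
probes, token_index = build_probes(fields)

for name, value in probes.items():
    print(f"{name}: {value}")
```

If the listener later records a lookup for one of the tokens, `token_index` tells you which field reached the logger. The absence of a callback still proves nothing, for the reasons described above.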
Second, the blue team has a big advantage that attackers don’t have. We don’t need to rely on external scanning; we can look inside the box.
At first glance, this appears to be an easy problem to solve. The Log4j library files have certain signatures. If we just look for those file signatures on the file systems of our servers, we should be able to identify all potentially vulnerable instances of the library, right?
But this can be challenging. Log4j could be embedded in a library. It could be embedded in a library in a third-party app. It could be embedded in a library in a third-party app embedded in another third-party app. In a container. In a Kubernetes cluster.
Like the scanning problem, most of these detection complications can be worked around by some cleverly implemented detection engines that can recursively look inside containers such as ZIP files, JAR files, Docker containers and so forth.
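A recursive search of this kind might look like the following sketch. It uses the presence of the `JndiLookup.class` file as the signature and descends into archives nested inside other archives. The function names, the suffix list and the depth limit are illustrative assumptions. It handles ZIP-based formats only (JAR, WAR, EAR, ZIP); Docker image layers are tar archives and would need separate handling. Finding the class shows that log4j-core is present, not that the version in use is vulnerable, so treat hits as leads to check.

```python
import io
import zipfile

# Class file used here as the signature for log4j-core.
MARKER = "org/apache/logging/log4j/core/lookup/JndiLookup.class"
# ZIP-based archive types worth descending into (illustrative list).
ARCHIVE_SUFFIXES = (".jar", ".war", ".ear", ".zip")


def find_marker(data, path, max_depth=8):
    """Return the nested paths at which MARKER occurs inside archive bytes."""
    hits = []
    if max_depth < 0:  # stop runaway nesting
        return hits
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return hits  # not a ZIP-based archive, nothing to inspect
    with archive:
        for name in archive.namelist():
            inner = f"{path}!/{name}"
            if name.endswith(MARKER):
                hits.append(inner)
            elif name.lower().endswith(ARCHIVE_SUFFIXES):
                # Read the nested archive into memory and recurse into it.
                hits.extend(find_marker(archive.read(name), inner, max_depth - 1))
    return hits


def make_zip(entries):
    """Build an in-memory ZIP from {name: bytes}; used only for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in entries.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Demo: a stub JAR nested inside a WAR, mimicking a bundled dependency.
inner_jar = make_zip({MARKER: b"stub"})
war = make_zip({"WEB-INF/lib/log4j-core-2.14.1.jar": inner_jar,
                "index.html": b"<html/>"})
print(find_marker(war, "app.war"))
```

In practice you would feed `find_marker` the bytes of each candidate file found while walking a file system. The harder part, getting access to those file systems, remains.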
The most difficult part is reaching the files to run these detections. Most of the time, the files we’re interested in checking are on application servers and in containers spread throughout your infrastructure. For sound security design reasons, the file systems of these servers and containers are not designed to be easily accessible from any single point in your network.
Designing for the Future
To steal a line from a former US Secretary of Defense, “You respond to cyber incidents with the infrastructure you have, not the infrastructure you might want or wish you had at a later time”.
While vulnerability scanning and file system-based signature detection are useful in this case, neither is guaranteed to identify all instances of vulnerable Log4j. Like the glitter from that kids’ birthday party you went to three months ago, we’re all going to be finding vulnerable instances of Log4j for a long time to come.
Rather than treating Log4j in isolation, we need to plan now for similar incidents in future. We can’t prevent vulnerabilities in third-party software. But by thinking long-term about how security principles and design patterns integrate into system architecture, we can make a big difference to our ability to respond.
The Log4j vulnerability has demonstrated this for a lot of organisations. Those that have invested in deployment pipelines for infrastructure can use these pipelines to integrate checks for known-vulnerable software components, dramatically lowering response times for newly published vulnerabilities. Organisations that have prioritised the ability to instrument and query running infrastructure using tooling like osquery are able to identify instances of the vulnerable software quickly and with more confidence.
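As a rough illustration of a pipeline check for known-vulnerable components, the sketch below compares a build's dependency list against a table of minimum acceptable versions. It assumes dependencies are available as simple `group:artifact:version` strings and that 2.16.0 is the floor for log4j-core. The table, the function names and the `com.example` entry are hypothetical. A real pipeline would more likely rely on a dedicated dependency-scanning tool fed by a vulnerability database.

```python
import re

# Assumed policy: minimum acceptable version per component (illustrative).
MINIMUM_SAFE = {"org.apache.logging.log4j:log4j-core": (2, 16, 0)}


def parse_version(text):
    """Turn '2.14.1' or '2.0-beta9' into a three-part integer tuple."""
    numbers = re.findall(r"\d+", text.split("-")[0])  # ignore any qualifier
    parts = [int(n) for n in numbers][:3]
    return tuple(parts + [0] * (3 - len(parts)))


def flag_dependencies(coordinates, minimum_safe=MINIMUM_SAFE):
    """Return the coordinates whose version is below the configured minimum."""
    flagged = []
    for coord in coordinates:
        group, artifact, version = coord.strip().split(":")
        floor = minimum_safe.get(f"{group}:{artifact}")
        if floor is not None and parse_version(version) < floor:
            flagged.append(coord)
    return flagged


# Example dependency list, as a build tool might export it.
deps = [
    "org.apache.logging.log4j:log4j-core:2.14.1",
    "org.apache.logging.log4j:log4j-api:2.14.1",
    "org.apache.logging.log4j:log4j-core:2.17.0",
    "com.example:billing-service:1.4.0",  # hypothetical internal component
]

for coord in flag_dependencies(deps):
    print("below minimum version:", coord)
```

A build step could fail whenever `flag_dependencies` returns a non-empty list. When the next vulnerability is published, the response then starts with adding one line to the table.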
Ultimately, the Log4j vulnerability neatly illustrates many shortcomings in the approach to security architecture and application security commonly followed in our industry. Rather than relying on scanning and searching for vulnerable files, ideally systems would be designed so that detection of known-vulnerable components is performed automatically.
In 2022, CyberCX will be launching a series of application security-focussed blog posts that take this strategic view of security and look at topics like “pushing security left”, infrastructure as code, integrating Software Composition Analysis (SCA) tools into pipelines and more.
How can CyberCX help in the short term?
CyberCX is currently engaged with customers to detect, respond to and recover from this incident. If you need assistance, we can help:
- triage your environment
- identify and apply mitigations
- explain the technical issues to non-technical management
- look for indicators of compromise
- respond to compromised systems.
For urgent assistance, contact us:
UK: +44 (0) 1865 504 032
US: +1 212 364 5192
How can CyberCX help in the long term?
CyberCX offers consulting services to help our customers address long-term security needs, including:
- secure software development life cycle maturity reviews
- DevOps security reviews and consulting
- security tooling integration services
- threat modelling
- cyber intelligence
- incident response retainers
- cyber security strategy reviews and updates.