Why You Need Static Analysis, Dynamic Analysis, and Machine Learning
Point solutions in security are just that: they intervene at a single point in the attack lifecycle. Even if such a solution has a 90 percent success rate, that still leaves a 1 in 10 chance that it will fail to stop an attack from progressing past that point. To improve the odds of stopping cyberattacks, organizations cannot rely on point solutions. There must be layers of defenses, covering multiple points of interception. Stacking effective techniques increases the overall effectiveness of the security solutions, providing the opportunity to break the attack lifecycle at multiple points.
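The arithmetic behind stacking can be sketched in a few lines. This is an illustration only: it assumes each layer fails independently of the others, which real defenses rarely do, so the result is best read as an optimistic upper bound. The function names and the three 90 percent layers are hypothetical.

```python
from math import prod

def miss_probability(detection_rates):
    """Chance that every layer misses the attack.

    Assumes layers fail independently (a simplification).
    """
    return prod(1.0 - rate for rate in detection_rates)

def combined_detection_rate(detection_rates):
    """Chance that at least one layer stops the attack."""
    return 1.0 - miss_probability(detection_rates)

# One 90 percent layer versus three stacked 90 percent layers.
single = combined_detection_rate([0.90])
layered = combined_detection_rate([0.90, 0.90, 0.90])

print(f"one layer:    {single:.3f}")   # 0.900
print(f"three layers: {layered:.3f}")  # 0.999
```

Under that independence assumption, the 1 in 10 chance of a miss shrinks to roughly 1 in 1,000 with three layers.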
Below are the three threat identification methods that, working in conjunction, can prevent successful cyberattacks:
Dynamic Analysis
The Only Tool That Can Detect a Zero-Day Threat
With dynamic analysis, a suspected file is detonated in a virtual machine, such as a malware analysis environment, and analyzed to see what it does. The file is graded on what it does upon execution, rather than relying on signatures for identification of threats. This enables dynamic analysis to identify threats that are unlike anything that has ever been seen before.
For the most accurate results, the sample should have full access to the internet, just like an average endpoint on a corporate network would, as threats often require command and control to fully unwrap themselves. As a prevention mechanism, a malware analysis environment can block the sample from reaching out to the internet and fake the responses in an attempt to trick the threat into revealing itself, but this can be unreliable and is not a true replacement for internet access.
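Grading a file on what it does, rather than on a signature, might look something like the following sketch. The behavior names, weights, and threshold are all hypothetical and chosen for illustration; a real analysis engine would use far richer signals than this.

```python
# Hypothetical behaviors and weights, invented for this example.
BEHAVIOR_WEIGHTS = {
    "modifies_autorun_registry_key": 30,
    "contacts_unknown_host": 20,
    "encrypts_many_user_files": 50,
    "disables_security_service": 40,
    "reads_own_config_file": 0,
}

MALICIOUS_THRESHOLD = 50  # assumed cutoff, not a real product setting

def grade(observed_behaviors):
    """Score a detonation report by summing the weights of observed behaviors."""
    score = sum(BEHAVIOR_WEIGHTS.get(b, 0) for b in observed_behaviors)
    verdict = "malicious" if score >= MALICIOUS_THRESHOLD else "benign"
    return score, verdict

# Behaviors recorded while the sample ran in the analysis environment.
report = ["contacts_unknown_host", "modifies_autorun_registry_key"]
print(grade(report))  # (50, 'malicious')

# A sample that does nothing observable scores zero.
print(grade([]))      # (0, 'benign')
```

Because the verdict depends only on observed behavior, a never-before-seen file can still be flagged, while a sample that stays inert during detonation produces nothing to grade.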
Malware Analysis Environments Are Recognizable and the Process Is Time-Consuming
To evade detection, attackers will try to identify if the attack is being run in a malware analysis environment by profiling the network. They will search for indicators that the malware is in a virtual environment, such as being detonated at similar times or by the same IP addresses, a lack of valid user activity like keystrokes or mouse movement, or artifacts of virtualization technology like unusually large amounts of disk space. If the attack is determined to be running in a malware analysis environment, the attacker will stop running it. This means that the results are susceptible to any failure in the analysis. For example, if the sample phones home during the detonation process, but the command-and-control operation is down because the attacker identified the malware analysis environment, the sample will not do anything malicious, and the analysis will not identify any threat. Similarly, if the threat requires a specific version of a particular piece of software to run and that version is not present, it will not do anything identifiably malicious in the malware analysis environment.
It can take several minutes to bring up a virtual machine, drop the file in it, see what it does, tear the machine down and analyze the results. While dynamic analysis is the most expensive and time-consuming method, it is also the only tool that can effectively detect unknown or zero-day threats.
Static Analysis
Swift Results and No Requirements for Analysis
Unlike dynamic analysis, static analysis looks at the contents of a specific file as it exists on a disk, rather than as it is detonated. It parses data, extracting patterns, attributes and artifacts, and flags anomalies.
Static analysis is resilient to the issues that dynamic analysis presents. It is extremely efficient – taking only a fraction of a second – and much more cost-effective. Static analysis can also work for any file because there are no specific requirements, environments that need to be tailored, or outgoing communications needed from the file for analysis to happen.
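As a rough illustration of extracting attributes and artifacts from a file without running it, the sketch below hashes the bytes, pulls out printable strings, and checks a couple of simple properties. The feature names and the sample bytes are made up for this example; production static analysis parses file formats in far more depth.

```python
import hashlib
import re

def static_features(data: bytes, min_len: int = 4) -> dict:
    """Extract a few simple attributes from raw file bytes without executing them."""
    # Runs of printable ASCII at least min_len long, similar to the `strings` utility.
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    strings = [s.decode("ascii") for s in re.findall(pattern, data)]
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "has_mz_header": data[:2] == b"MZ",  # Windows executables start with "MZ"
        "strings": strings,
        "contains_url": any(s.startswith(("http://", "https://")) for s in strings),
    }

# Invented bytes standing in for a file on disk.
sample = (
    b"MZ" + b"\x00" * 16
    + b"http://example.invalid/payload" + b"\x00\x01"
    + b"cmd.exe /c" + b"\x00"
)

features = static_features(sample)
print(features["strings"])       # ['http://example.invalid/payload', 'cmd.exe /c']
print(features["contains_url"])  # True
```

Nothing here needs a tailored environment or network access, which is why this kind of inspection can finish in a fraction of a second.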
Packed Files Result in Lost Visibility
However, static analysis can be evaded relatively easily if the file is packed. While packed files work fine in dynamic analysis, visibility into the actual file is lost during static analysis, as packing the sample turns the entire file into noise. What can be extracted statically is next to nothing.
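One common way to see this "noise" is byte entropy, a heuristic that is not part of the description above but is widely used: compressed or encrypted content approaches 8 bits per byte, while ordinary text and code sit much lower. In this sketch, seeded pseudo-random bytes stand in for a packed payload, which is an assumption made purely for the demonstration.

```python
import math
import random
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Readable content, such as the text found in an unpacked executable.
plain = b"This program cannot be run in DOS mode. " * 100

# Stand-in for packed content: seeded pseudo-random bytes (an assumption,
# not a real packer's output).
rng = random.Random(0)
packed_like = bytes(rng.getrandbits(8) for _ in range(4096))

print(f"plain:       {shannon_entropy(plain):.2f} bits/byte")
print(f"packed-like: {shannon_entropy(packed_like):.2f} bits/byte")
```

High entropy can hint that a file is packed, but it says little about what the file will do once unpacked, which is the visibility static analysis loses.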
Machine Learning
New Versions of Threats Clustered With Known Threats Based on Behavior
Rather than doing specific pattern-matching or detonating a file, machine learning parses the file and extracts thousands of features. These features, collected into what is called a feature vector, are run through a classifier to identify if the file is good or bad based on known identifiers. Rather than looking for something specific, if the features of the file resemble those of any previously assessed cluster of files, the machine will mark that file as part of the cluster. For good machine learning, training sets of good and bad verdicts are required, and adding new data or features will improve the process and reduce false positive rates.
Machine learning compensates for what dynamic and static analysis lack. A sample that is inert, doesn’t detonate, is crippled by a packer, has command and control down, or is not reliable can still be identified as malicious with machine learning. If numerous versions of a given threat have been seen and clustered together, and a sample has features like those in the cluster, the machine will assume the sample belongs to the cluster and mark it as malicious in seconds.
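The clustering idea can be illustrated with a toy nearest-centroid classifier. This is a minimal sketch, not the model any particular product uses: the three features (byte entropy, import count, presence of network strings) and every training value are invented, and real feature vectors hold thousands of scaled entries.

```python
from math import dist

def centroid(vectors):
    """Average each feature across a cluster of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def classify(sample, centroids):
    """Return the label of the cluster whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: dist(sample, centroids[label]))

# Invented training data: [byte_entropy, import_count, has_network_strings]
known_malicious = [[7.8, 3, 1], [7.5, 5, 1], [7.9, 2, 1]]
known_benign = [[5.1, 40, 0], [4.8, 55, 1], [5.5, 35, 0]]

centroids = {
    "malicious": centroid(known_malicious),
    "benign": centroid(known_benign),
}

# A new variant never seen before, but with features like the known cluster.
print(classify([7.6, 4, 1], centroids))   # malicious
print(classify([5.0, 45, 0], centroids))  # benign
```

The sample never has to run: its features alone place it next to a known cluster. The same property means a sample always lands in whichever existing cluster is nearest, however poorly it fits.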
Only Able to Find More of What Is Already Known
Like the other two methods, machine learning should be looked at as a tool with many advantages, but also some disadvantages. Namely, machine learning trains the model based on only known identifiers. Unlike dynamic analysis, machine learning will never find anything truly original or unknown. If it comes across a threat that looks nothing like anything it has seen before, the machine will not flag it, as it is only trained to find more of what is already known.
Layered Techniques in a Platform
To thwart whatever advanced adversaries can throw at you, you need more than one piece of the puzzle. You need layered techniques – a concept that used to be a multivendor solution. While defense in depth is still appropriate and relevant, it needs to progress beyond multivendor point solutions to a platform that integrates static analysis, dynamic analysis and machine learning. All three working together can actualize defense in depth through layers of integrated solutions.
Palo Alto Networks® Next-Generation Security Platform integrates with WildFire® cloud-based threat analysis service to feed components contextual, actionable threat intelligence, providing safe enablement across the network, endpoint and cloud. WildFire combines a custom-built dynamic analysis engine, static analysis, machine learning and bare metal analysis for advanced threat prevention techniques. While many malware analysis environments leverage open-source technology, WildFire has removed all open-source virtualization within the dynamic analysis engine and replaced it with a virtual environment built from the ground up. Attackers must create entirely unique threats to evade detection in WildFire, separate from the techniques used against other cybersecurity vendors. For the small percentage of attacks that could evade WildFire’s first three layers of defenses – dynamic analysis, static analysis and machine learning – files displaying evasive behavior are dynamically steered into a bare metal environment for full hardware execution.
Within the platform, these techniques work together nonlinearly. If one technique identifies a file as malicious, it is noted as such across the entire platform for a multilayered approach that improves the security of all other functions.
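The idea that one technique's verdict becomes visible everywhere can be pictured as a shared store keyed by file hash. The class below is purely hypothetical and says nothing about how WildFire is actually built; the engine names and the placeholder hash are assumptions for the example.

```python
class VerdictStore:
    """Hypothetical shared store: any engine's malicious verdict marks the file."""

    def __init__(self):
        self._verdicts = {}  # sha256 -> {engine name: malicious?}

    def report(self, sha256: str, engine: str, malicious: bool) -> None:
        """Record one engine's verdict for a file."""
        self._verdicts.setdefault(sha256, {})[engine] = malicious

    def is_malicious(self, sha256: str) -> bool:
        """A file is malicious if at least one engine said so."""
        return any(self._verdicts.get(sha256, {}).values())

store = VerdictStore()
file_hash = "ab" * 32  # placeholder standing in for a real SHA-256 digest

# Engines may finish in any order; none has to wait for another.
store.report(file_hash, "static", False)
store.report(file_hash, "machine_learning", True)
store.report(file_hash, "dynamic", False)

print(store.is_malicious(file_hash))  # True
```

Every other function that consults the store then sees the file as malicious, regardless of which technique caught it.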