Malware Obfuscation: Techniques, Definition & Detection
What is Malware Obfuscation?
Malware obfuscation is the act of making the code of a program hard to discover or understand—by both humans and computers—but without changing how the program works. The goal is not just to make a program unreadable, but to hide its presence completely.
Compression, encryption, and encoding are some of the most common obfuscation methods used by threat actors. Multiple methods are often used in tandem to evade a wider variety of cybersecurity tools at the initial point of intrusion.
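To illustrate why layering these methods defeats simple pattern matching, here is a minimal Python sketch. The marker string, the single-byte XOR key (a toy stand-in for real encryption), and the function names are all assumptions made for this example, and the "payload" is a harmless string.

```python
import base64
import zlib

# Hypothetical byte pattern that a signature-based scanner looks for.
SIGNATURE = b"DEMO_MARKER"
# Harmless stand-in for a payload that contains the pattern.
PAYLOAD = b"echo DEMO_MARKER: this harmless string stands in for a payload"
# Toy single-byte key; real malware would use stronger or custom schemes.
XOR_KEY = 0x5A


def obfuscate(data: bytes) -> bytes:
    """Layer compression, a toy XOR 'encryption', and Base64 encoding."""
    compressed = zlib.compress(data)
    encrypted = bytes(b ^ XOR_KEY for b in compressed)
    return base64.b64encode(encrypted)


def deobfuscate(blob: bytes) -> bytes:
    """Undo the layers in reverse order to recover the original bytes."""
    encrypted = base64.b64decode(blob)
    compressed = bytes(b ^ XOR_KEY for b in encrypted)
    return zlib.decompress(compressed)


def naive_scan(data: bytes) -> bool:
    """Return True if the raw signature bytes appear in the data."""
    return SIGNATURE in data


blob = obfuscate(PAYLOAD)
print(naive_scan(PAYLOAD))            # True: plain payload matches the signature
print(naive_scan(blob))               # False: layered blob hides the pattern
print(deobfuscate(blob) == PAYLOAD)   # True: behavior-relevant bytes are unchanged
```

The scan misses the obfuscated blob even though the original bytes are fully recoverable, which is why defenders often need to unpack or observe a sample at runtime rather than rely on static signatures alone.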
Such is the power of obfuscation that attackers were able to deploy Cobalt Strike, a commercial penetration testing tool, inside victim networks during the SolarWinds breach in 2020, one of the biggest cybersecurity breaches of the 21st century, which exposed thousands of organizations around the world.
All it took was for hackers to hide the malware in an inconspicuously labeled image (e.g., gracious_truth.jpg, festive_computer.jpg) and encrypt it with a simple rotating XOR cipher. Once inside the target network, the payload was then activated to download and install additional malware components.
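A rotating XOR cipher can be sketched in a few lines of Python. This is a generic repeating-key XOR under stated assumptions, not the routine used in the SolarWinds incident: the key, the sample bytes, and the function name are hypothetical.

```python
from itertools import cycle


def rotating_xor(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key bytes, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical key and harmless sample data, chosen only for illustration.
KEY = b"\x5a\xc3\x11"
plain = b"example payload bytes"

hidden = rotating_xor(plain, KEY)       # unreadable without the key
recovered = rotating_xor(hidden, KEY)   # the same call reverses the cipher

print(recovered == plain)  # True
```

Because XOR is its own inverse, one routine both hides and recovers the data. That also helps analysts: once the key is extracted from a loader, the same operation decodes the hidden payload.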
Malware Obfuscation Detection Tools
Detecting and defending against obfuscated malware can be tricky because many obfuscation techniques exploit native tools and features, such as HTML tag attributes and built-in compilers, which cannot simply be blocked or flagged as malicious because legitimate software depends on them.
Because obfuscated code is hard to uncover before it runs, adopting a defense-in-depth approach that includes behavioral analysis of all assets and users in your environment is a good way to ensure that strangely behaving applications are discovered, even if the malicious code isn't caught before execution. Start by deploying multiple types of defenses, namely the three in the SOC visibility triad: network detection and response (NDR), endpoint detection and response (EDR), and security information and event management (SIEM) systems.
History of Malware Obfuscation
Though it is unclear when digital obfuscation started being developed seriously, we can point to a few milestones over the last 40 years. Much like early viruses, many early applications of obfuscation were not malicious.
1984 saw the creation of the International Obfuscated C Code Contest, which was the first competition in the world to see who could write the most obfuscated C program. Though it was more of an academic exercise to push the boundaries of obfuscation, it also revealed the power of obfuscation through many mind-boggling creations over the years.
Things picked up in the 1990s and 2000s as digital watermarks, a form of steganography, were used to identify copies of illegally distributed music and movies. This coincided with the passing of the Digital Millennium Copyright Act (DMCA) in 1998, which was used by the music and movie industries to combat piracy.
The 2000s also saw prominent instances of obfuscated malware. In 2005, the PoisonIvy remote access trojan (RAT) hid part of its code to evade signature-based detection tools. Another RAT, Hydraq, used spaghetti code in 2009 as a means of obfuscation. Hydraq rearranged its code blocks so that they could not be followed linearly, then used jump instructions to execute them in the right order.
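The spaghetti-code idea can be illustrated with a short Python sketch. Python has no jump instruction, so each block here returns the label of the block to run next. This is an illustration of the concept only, not Hydraq's actual code, and the arithmetic steps and names are made up for the example.

```python
def run_linear(x: int) -> int:
    """Readable version: three steps in source order."""
    x = x + 3
    x = x * 2
    x = x - 1
    return x


def run_scrambled(x: int) -> int:
    """Same logic, with blocks stored out of order and chained by 'jumps'."""
    # Each block returns (new value, label of next block); None means stop.
    blocks = {
        "c": lambda v: (v - 1, None),  # third step, then stop
        "a": lambda v: (v + 3, "b"),   # first step, jump to "b"
        "b": lambda v: (v * 2, "c"),   # second step, jump to "c"
    }
    label = "a"  # entry point is not the first block in the source
    while label is not None:
        x, label = blocks[label](x)
    return x


print(run_linear(5))     # 15
print(run_scrambled(5))  # 15
```

Both functions compute the same result, but someone reading the scrambled version top to bottom sees the steps in the wrong order and has to trace every jump to reconstruct the real flow.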
Notably, the MITRE ATT&CK entry on obfuscated files or information is relatively new, having only been created on 31 May 2017. Few of the procedure examples in its database date from before 2015, suggesting an explosion of interest in obfuscation in recent years.
More recently, we see signs of maturation and commercialization in the marketplace. In 2020, researchers found a number of vendors providing obfuscation-as-a-service for Android applications, with prices starting at $20 per APK. Impressively, this off-the-shelf service reduced payload detection rates by nearly 50%.