I never heard of this lab before, but they stumbled upon this really neat malware and did a great writeup! They also have a Python script to scrape the CnC servers.
Source: http://blog.accuvantlabs.com/blog/dgrif/anatomy-targeted-attack
Anatomy of a Targeted Attack
Stage zero—malware dropper
We constantly deal with targeted attacks, and sometimes we are lucky enough to find the initial command and control mechanisms still live. During one malware response, we found a piece of malware querying a website that acted as a command and control (CnC) server. We were able to mirror the entire site and reverse engineer the control mechanisms within the malware. The following deep dive discusses some of our findings.
The CnC server that was sending out instructions was designed to appear like a simple website. The HTML on the site rendered as a normal webpage. However, we discovered that the commands were hidden within the HTML comments of the webpage. When we visited the landing page on the site, we were greeted with the following under-construction message.
If we view the source of this web page, we see the following normal-looking HTML:
The comment at the top, while fairly inconspicuous, is part of how the attacker controls the malware on the infected machine. The directory we downloaded contained numerous MRTG files, logs, and a Microsoft CAB file. The MRTG reports also appear legitimate and harmless, just like the under-construction page, as you can see in the following figure.
If we scan the contents of the HTML files for the string “DOCHTML”, we find it in several files:
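A scan like this is easy to reproduce against a local mirror of the site. The snippet below is a minimal sketch, not the script from the original writeup: the function name `find_marker_files` and the choice of file extensions are assumptions made for illustration, and only the marker string “DOCHTML” comes from the analysis.

```python
from pathlib import Path

MARKER = b"DOCHTML"  # marker string named in the writeup


def find_marker_files(root, marker=MARKER, patterns=("*.html", "*.htm")):
    """Return sorted paths of HTML files under root whose bytes contain marker."""
    hits = set()
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            # Read raw bytes so odd encodings in the mirrored pages cannot break the scan.
            if path.is_file() and marker in path.read_bytes():
                hits.add(path)
    return sorted(hits)
```

Calling `find_marker_files("mirror/")` on a hypothetical mirror directory would list every HTML file that carries the marker.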
Within the binary, we find a function responsible for enumerating API addresses from urlmon.dll and wininet.dll. The APIs are called indirectly to avoid having entries in the import table. In addition, there is a function responsible for decoding the operations, which adds another layer of obfuscation to the binary. The network-related APIs are used to pull an index.html file, which is then parsed for instructions. The primary instructions are to sleep, terminate, or download and execute a specified file.
The CnC mechanism parses the first line of the HTML file it downloads and looks for the leading “<!--DOCHTML” followed by the trailing “-->”. What lies in between is the actual command, which is translated to a 1, 2, or 3 and returned to determine whether the binary should download and execute, sleep, or die. To verify this, we can package a copy of calc.exe into a CAB file and force the malware to download and execute it on a remote system, as seen below.
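The parsing step described above can be sketched from the analyst's side as follows. This is an illustration only: the command keywords (`download`, `sleep`, `die`) are hypothetical, since the writeup does not give the real tokens, and the assignment of 1, 2, and 3 simply follows the order in which the actions are listed, which is also an assumption.

```python
import re

# Leading marker and trailing terminator described in the writeup.
COMMENT_RE = re.compile(r"^\s*<!--DOCHTML(.*?)-->")

# Hypothetical command tokens: the real keywords are not given in the writeup.
# The numeric codes follow the order the actions are listed in, which is an assumption.
COMMAND_CODES = {"download": 1, "sleep": 2, "die": 3}


def parse_command(html):
    """Return (code, argument) parsed from the first line of html, or None."""
    lines = html.splitlines()
    first_line = lines[0] if lines else ""
    match = COMMENT_RE.match(first_line)
    if match is None:
        return None
    parts = match.group(1).split(None, 1)
    if not parts:
        return None
    code = COMMAND_CODES.get(parts[0].lower())
    if code is None:
        return None
    argument = parts[1].strip() if len(parts) > 1 else ""
    return code, argument
```

Only the first line is inspected, matching the behavior described for the binary, so a marker comment further down the page is ignored.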
Simple enough for a CnC mechanism, although the CAB file we pulled from the malicious site was a remote access tool instead of calc.exe. Additionally, while reversing the malware sample we wrote an IDAPython snippet that emulates the string decoding routine and automatically comments all references to those strings in the IDB. This way you don’t have to decode each encoded string by hand. The decode function was fairly simple: the first and last bytes of the array are XOR’d together to form a key, which is then used to XOR each of the remaining bytes, as shown below.
The malware is simple yet extremely effective, hiding in plain sight.
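In plain Python, outside of IDA, the decoding scheme described here might look like the sketch below. It is not the authors' IDAPython snippet. It assumes the first and last bytes serve only as key material and are not part of the decoded string, and `encode_string` is a hypothetical helper added purely so the round trip can be checked.

```python
def decode_string(blob):
    """XOR-decode blob: key = first byte ^ last byte, applied to the bytes in between."""
    data = bytes(blob)
    if len(data) < 2:
        raise ValueError("need at least a first and a last byte")
    key = data[0] ^ data[-1]
    # Assumption: the two framing bytes are dropped from the decoded output.
    return bytes(b ^ key for b in data[1:-1])


def encode_string(plain, first, last):
    """Inverse helper for testing only; first and last are arbitrary framing bytes."""
    key = first ^ last
    return bytes([first]) + bytes(b ^ key for b in plain) + bytes([last])
```

Because XOR is its own inverse, `decode_string(encode_string(s, a, b))` returns `s` for any pair of framing bytes `a` and `b`.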
Immunity’s Python API for ImmDbg has a “disasmBackward” function that the IDC language appears to lack. Since that functionality was extremely useful for enumerating the parameters passed to functions, I emulated it in IDAPython. This is effective and useful in some situations, but string operations can be expensive, so keep that in mind should you choose to reuse the function elsewhere. A link to the script file is below.
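As an illustration of the idea, and not the script referenced above, a backward walk over instruction heads can be sketched as follows. The function name `disasm_backward` and its signature are hypothetical. The lookup is passed in as `prev_head` so the sketch runs outside IDA; inside IDA 7.x, `idc.prev_head` could presumably be supplied for that parameter. Unlike Immunity's version, this returns an address rather than an instruction object, and it does not use string operations.

```python
def disasm_backward(ea, count, prev_head, min_ea=0):
    """Walk back `count` instruction heads from ea; return the address reached, or None."""
    current = ea
    for _ in range(count):
        previous = prev_head(current, min_ea)
        # IDA signals "no earlier item" with BADADDR, which is larger than any
        # valid address; a plain stand-in function may return None instead.
        if previous is None or previous >= current:
            return None
        current = previous
    return current
```

To enumerate the arguments of a call, one would start at the call instruction's address and step back one head at a time, inspecting each instruction reached.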