In an earlier post titled “Growing Your Malware Corpus”, I outlined methods for building a comprehensive test corpus of malware for detection engineering. It covers using sources like VX-Underground for malware samples and details how to organize and unzip these files using Python scripts.
In today’s post we’re going to cover using Python to apply a standard naming methodology to all our malware samples.
Depending on where you curate your samples from, they could be named by their hash, or as they were identified during investigation, like invoice.exe. Depending on the size of your collection, I’d surmise it’s highly unlikely that they have a consistent naming format.
I don’t know about you, but a title that indicates the malware family and platform is a lot more useful to me than a hash value when perusing the corpus for a juicy malware sample. We can rename all our malware files using Python and the command line utility for Windows Defender.
Step 1: You’ll need to install Python on a Windows box that has Windows Defender.
Install Python
If you don’t have Python installed on your Windows machine, download the installer from python.org or install it from the Windows Store.
Windows Store installer for Python versions 3.7 to 3.12
Directory Exclusion
Within the Windows Defender Virus & Threat protection settings, add an exclusion for the directory you’re going to be using with the malware. Make sure the exclusion is in place before connecting the drive with the malware so it doesn’t get nuked.
Note: Doing this assumes you’ve evaluated the potential risks associated with handling malware, even in controlled settings, and have taken safety precautions. This is not an exercise to be conducted on your corporate workstation.
Screenshot of the D:\Malware Directory being excluded from Windows Defender.
Automatic Sample submission
It’s up to you if you want to disable Automatic Sample Submission. If you do, you may still get prompted to send some.
Automatic Sample Submission turned off in Windows Defender Configuration.
Windows Defender requesting to send samples to Microsoft for further analysis.
Rename_Malware.py
The star of this show is the Python script that was shared on Twitter by vx-underground.
The post walks through various options for the Windows Defender command line utility, MpCmdRun.exe. Using that information, a Python script was developed to loop through the contents of a directory, analyze those files with Windows Defender, and then rename each file based on its malware identification.
Python code for rename_malware.py in VS Code.
You can grab the code from the linked post, or a copy on my GitHub here.
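If you'd like a feel for the approach before downloading anything, here is a minimal sketch of the scan, parse, and rename loop. It is not the vx-underground script. The install path in `MPCMDRUN`, the helper names, and the naming scheme (threat name prefixed to the original file name) are my own assumptions for illustration, and the parser assumes Defender reports each detection on a line beginning with `Threat`.

```python
import re
import subprocess
from pathlib import Path

# Assumption: default install location of the Defender command-line tool.
MPCMDRUN = r"C:\Program Files\Windows Defender\MpCmdRun.exe"

# Assumption: the scan report has one "Threat : <name>" line per detection.
THREAT_LINE = re.compile(r"^Threat\s*:\s*(\S+)", re.MULTILINE)


def parse_threat(scan_output: str):
    """Return the first threat name in the scan output, or None if clean."""
    match = THREAT_LINE.search(scan_output)
    return match.group(1) if match else None


def safe_name(threat: str) -> str:
    """Replace characters Windows rejects in file names (':' and '/') with dots."""
    return re.sub(r"[^A-Za-z0-9._!-]+", ".", threat).strip(".")


def scan_file(path: Path) -> str:
    """Custom-scan one file; -DisableRemediation keeps Defender from removing it."""
    result = subprocess.run(
        [MPCMDRUN, "-Scan", "-ScanType", "3", "-File", str(path),
         "-DisableRemediation"],
        capture_output=True, text=True,
    )
    return result.stdout


def rename_tree(root: Path) -> None:
    """Hypothetical driver: rename every detected file under root."""
    # Build the list first so renames don't disturb the directory walk.
    for path in [p for p in root.rglob("*") if p.is_file()]:
        threat = parse_threat(scan_file(path))
        if threat is None:
            continue  # no detection, leave the name alone
        # Keep the old name as a suffix so two samples never collide.
        target = path.with_name(f"{safe_name(threat)}_{path.name}")
        if not target.exists():
            path.rename(target)
```

Calling `rename_tree(Path(r"D:\Malware"))` would walk the excluded directory. Try it on a copy of the corpus first.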
Once you’ve got Python installed, the directory exclusion configured, and a pocketful of kryptonite (malware), you’re ready to go.
python rename_malware.py D:\Malware
The Windows Defender command line utility scans each file, and the script renames it based on the detection.
The script recursively renames the analyzed files.
I’m running this on a copy of my malware corpus of 30,000+ malware samples.
Counting the Corpus
A bit of handy PowerShell math. I counted the files before and after the process to make sure the antivirus didn’t remove any. I also excluded PDFs from the count, as many of the samples in my corpus have accompanying write-ups.
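If you’d rather stay in Python, the same before-and-after check can be sketched in a few lines. This is an equivalent I’m assuming for illustration, not the PowerShell I ran, and `count_samples` is a made-up helper name.

```python
from pathlib import Path


def count_samples(root) -> int:
    """Count files under root recursively, skipping PDF write-ups."""
    return sum(
        1
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() != ".pdf"
    )
```

Run `count_samples(r"D:\Malware")` before and after the rename. The two numbers should match if nothing was quarantined.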
The script continues recursively renaming the analyzed files.
Energizer Rabbit. “Still Going!”
Finally… not begrudgingly at all considering over 30,000 samples were analyzed, the script has reached the end of the samples.
Script has reached the end of the files.
If we do a directory listing of the malware directory, we see that the majority of the files have been renamed based on their malware identification.
File listing showing malware files named Trojan.Powershell… Trojan.Script… etc.
Hooray!
That makes it much easier to search and query through the malware repository.
If you’re like me and have your favorite forensic tools for Linux, and your favorite tools for Windows, you can run them both on the same machine without having to diminish resources with the use of a virtual machine. You can do this by installing SIFT (SANS Investigative Forensic Toolkit) within WSL (Windows Subsystem for Linux).
Note: this article assumes that WSL is already installed. If not, GTS.
Start off by grabbing Ubuntu 22.04 from the Windows Store or, if you prefer, from the command line.
Finally, install the server mode version of SIFT. Server mode only installs the SIFT command line applications, which is most appropriate for running under WSL.
If all goes right you’ll see a wall of text that concludes (after a few minutes) with ‘salt-call completed successfully.’
My go-to test for SIFT installations has always been to run Volatility (-h for help).
vol.py -h
If you’re seeing output, the mission was a success.
Besides saving the resources needed for a full VM, you also don’t have to worry about duplicating copies of evidence items as both Windows and Ubuntu are running on the same machine.
Now get yourself familiar with the Linux tools of the SIFT Workstation and enjoy running them in parallel with your favorite Windows forensic applications.
I’ve been participating in the MAGNET-sponsored Capture the Flag (CTF) events since before being happily employed there. In a way you could say that one helped facilitate the other, but that’s a story for another time. This blog actually started back in 2020 to, among other things, share my write-ups of that year’s CTF.
The 2024 CTF event was part of the Virtual Summit that ran from February 27th to March 7th. There were more than 50 presentations about topics like mobile forensics, artificial intelligence, eDiscovery, malware, ransomware, digital evidence review, video forensics, and live Q&A sessions.
The CTF questions were divided into three groups: iOS, Android, and Ciphers. The evidence sources included a full file system extraction of an iPhone 14, a logical extraction of an Android phone, a Facebook ‘Download Your Data’ export and an export of Discord messages. I focused almost entirely on the iOS questions, and even had a few of those left on the table when the 3 hours allotted for the challenge were up. The numbers in parentheses represent the point value, which is intended to align with question difficulty. I processed the iOS extraction with AXIOM Cyber and iLEAPP.
MVS 2024 CTF: iOS
Why are your messages green? (5)
For this one we’ll use MAGNET Axiom, specifically the Conversation View. In the message thread below, we can determine from the conversation that the first time the two persons met was December 17, 2003.
The question title suggests (not so surreptitiously) that we’re going to be dealing with an image file. In the MEDIA > Photos Media Information we see a picture of a store shelf of a pain relief gel. (I know the feeling. Take care of yourself young forensicators; and don’t forget the sunscreen.) The price of the item was $10.99.
Answer the call (5)
In the Refined Results for Web Chat URLs we see the user visiting a Discord server with the guild ID of 136986169563938816.
Don’t ghost me (5)
To solve this one we’re first going to need to know what MYAI refers to. Running a global search for MYAI shows that it’s a Snapchat “Artificial Intelligence” bot. Again we’ll switch to Conversation View. Once we do so we can see that Chadwick was annoyed with MYAI on December 26th at 11:27:45 UTC.
Build me up, buttercup (5)
For this question I found it easier to produce the result from the iLEAPP report. What I found interesting is identifying all the other locations where the build ID of the device may be captured, like in the user’s YouTube playback history.
Warning Signs (5)
In order to get this flag we need to combine two iOS iMessage events. We see that the user joined Boost Mobile on November 29th. The warning about reaching maximum data usage was received on December 27. There are 18 days between those dates.
One is The Loneliest Number (10)
The answer for this one can be found in the iOS snapshots on the device. This is often an interesting artifact for me as you get a glimpse (literally) into the applications that have been used on the device. These snapshot images are recorded whenever a user switches from one application to another, and are what produce the carousel-like view when switching apps. It looks like Chad’s feeling a little short on friends. I can sympathize at times. Meanwhile, the advice from ChatGPT is good advice for making and maintaining connections in the DFIR community as well.
For when I can’t Find My gear (10)
Drilling into the Cached Locations and examining them in World map view, we see a cluster of activity around Neptune Mountaineering. (You’ll also be able to find that Chad connected to their Guest Wi-Fi when he was visiting the store.)
Just a couple steps away (10)
Apple Health Steps is one of the artifacts found under Connected Devices. If we apply a filter for just events on 12/3, we see four values recorded. Add the four together and you get the total steps for the day.
I hear Stanley cups are all the rage (25)
While perusing the photos I saw that there was one captured at a hockey game on December 22. In the image we can see that the game took place at the Ball Arena.
My sports knowledge is on par with my cooking abilities – not good. I decided to ‘phone a friend’ to help with this one, the Google Bard (now Gemini) AI.
Can anyone Kelp? (25)
If you filter out the applications from Apple (com.apple…) there aren’t too many remaining, and of those only a few are games. Of those I can only see one dealing with greens.
The name of the application, Terrarium, was not accepted as an answer. Checking iLEAPP to see if there was another application that I had missed, I saw the full name of the game is Terrarium: Garden Idle. It’s a good idea to always validate your evidence with at least one tool in addition to your primary.
The easy way or the hard way (25)
Again looking at the chat history, we have a conversation between Chad and Rocco. The last message sent was on December 21, 2023 at 06:29:36 UTC.
Follow the Breadcrumbs (50)
This answer was easier to grab from iLEAPP as there’s a specific entry for Biome Text Input Sessions. Filtering for amazon, we see 4 entries. 2 of those occurred on December 24.
Season’s Greetings (75)
Start off with a search for Susan and we can see there’s an iMessage chat history. Chadwick’s first message to Susan says “Christmas Susan! 🪴 how have you been?”
MVS 2024 CTF: Ciphers
While working through the iOS questions I diverted my attention to a few of the Cipher questions when I needed to give my brain a change of pace. I only did a few of them.
Have you ever tried reading the alphabet in reverse? (5)
For this one we’ll throw the sample text into dcode.fr. Doing so suggests it is an Atbash Cipher.
“Atbash (Mirror code), a substitution cipher replacing the first letter of the alphabet with the last, the second with the penultimate etc.”
That sounds to me like a backwards alphabet. Decode the text using the Atbash Cipher on dcode.fr.
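The mirror substitution is small enough to sketch in Python if you want to check dcode.fr’s work. The `atbash` function name and the sample string are made up for illustration. The sample is not the challenge text.

```python
import string

# Map a->z, b->y, ... for both cases; anything else passes through unchanged.
_ATBASH = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_lowercase[::-1] + string.ascii_uppercase[::-1],
)


def atbash(text: str) -> str:
    """Apply the Atbash cipher; the same call both encodes and decodes."""
    return text.translate(_ATBASH)


print(atbash("Uozt"))  # Flag
```

Because the alphabet is simply mirrored, running the function twice returns the original text.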
Why did the bicycle fall over? It was tired of all the ROTation! (5)
From the clue we can be pretty sure this is a ROT cipher. Using CyberChef we can try the ROT13 Brute Force. Scanning through the output, we see that a rotation of 2 produces a legible result, which is the answer for the challenge.
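The brute force amounts to trying every rotation and eyeballing the results. Here’s a rough Python sketch of that idea, assuming a plain Caesar rotation over A-Z. The function names and the sample ciphertext are hypothetical, not the challenge input.

```python
def rot(text: str, shift: int) -> str:
    """Rotate ASCII letters forward by shift places, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def brute_force(text: str) -> dict:
    """Return every non-trivial rotation, keyed by the shift amount."""
    return {shift: rot(text, shift) for shift in range(1, 26)}


# Made-up sample: print all 25 candidates and look for the readable one.
for shift, candidate in brute_force("zgawajc").items():
    print(shift, candidate)
```

For this sample the readable line is the one for a shift of 2.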
VIGorous ENcrypting? Embrace the Riddle’s Essence, it’s “essential”! (10)
A quick Googling on VIG and cipher and we learn there’s a Vigenère cipher.
Off to CyberChef. Choose the Vigenère cipher recipe, enter the input provided in the question, QshprMzepw, and use the key “essential”. The decoded text is MapleTrees.
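To see what the recipe is doing, here’s a short Python sketch of Vigenère decoding. It assumes the usual convention where the key only advances on letters and case is preserved. The function name is my own.

```python
from itertools import cycle


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract each key letter's position (a=0 ... z=25) from the ciphertext letter."""
    shifts = cycle(ord(k) - ord("a") for k in key.lower())
    out = []
    for ch in ciphertext:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base - next(shifts)) % 26 + base))
        else:
            out.append(ch)  # non-letters pass through and don't consume key letters
    return "".join(out)


print(vigenere_decrypt("QshprMzepw", "essential"))  # MapleTrees
```

The key repeats once it runs out, so the tenth letter, w, is decoded with the e at the start of the key.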
That’s all for me. Thanks to Jessica Hyde and her team at Hexordia and the students at Champlain College who put all the effort into coming up with the challenges. Also thanks to the winningest Kevin, who took the year off from competition to join the CTF creation team.
As always it was a lot of fun, and I learned a couple things along the way.
I had an older MacBook Pro (15-inch, 2.53GHz, Mid 2009) that had been unused for a while as it was no longer getting updates from Apple. It’s one of the Intel chip ones and last ran Monterey. I pulled it out of the closet and decided to give it a refresh by installing REMnux on it. The process was pretty straightforward, but there were a couple things noted along the way I thought I’d share.
Start off by downloading the Ubuntu 20.04.6 AMD64 Desktop ISO. Yes, 20.04. Later installations aren’t supported by the REMnux installer.
Next you’ll want to burn the image to a flash drive, and make it bootable, using Rufus (Windows) or Balena Etcher (Mac). This model MacBook has USB-A ports, which seem like a relic compared to the current Macs. You’ll need at least an 8GB flash drive for the Ubuntu image. The first free one I could find was 32GB so I used that.
With the bootable USB drive inserted, power up the MacBook and hold the option key until you see the different hard drives listed.
The flash drive is the one that shows as EFI Boot. Select it and hit return/enter.
Once everything is booted up you’ll get to the Try or Install Ubuntu menu. We’ll choose install.
Specify options as needed for timezone, keyboard, etc. For the username we’ll use remnux and the password malware as that’s the default. After the installation you can set the password for the remnux user as you wish.
At the Installation type we’ll choose Erase disk and install Ubuntu.
Sorry for the wavy resolution. Tough to get good screenshots during bare-metal OS installations.
Once the installation completes, hit Restart Now.
When I first logged in I was getting an error, “Activation of network connection failed”, when trying to authenticate to the wireless network. Disabling IPv6 for that network fixed it.
Now that we’ve got connectivity, we can grab any available Ubuntu updates.
sudo apt-get update && sudo apt-get upgrade
If at any point you’re prompted to do a distribution upgrade (a version of Ubuntu later than 20.04), choose Don’t Upgrade.
Once you’ve done all the OS updates and rebooted, we can start the REMnux installation. We’ll be following the Install from Scratch instructions at remnux.org.
The first time I ran the installer it failed as curl wasn’t installed. So take care of that before starting the install.
sudo apt-get install curl
At this point we’re ready to run the installation. The one deviation I’m choosing here is that rather than the standard install, I’m choosing the ‘cloud mode.’
If you’re deploying REMnux in a remote cloud environment and will need to keep the SSH daemon enabled for remotely accessing the system, use the following command instead to avoid disabling the SSH daemon. Remember to harden the system after it installs to avoid unauthorized logins.
remnux.org
In my case I plan to be ssh’ing into the box from within my own network more often than actual hands on keyboard, hence the cloud mode.
sudo remnux install --mode=cloud
At this point grab a coffee, walk the dog, or find something to do while the wall of text streams by.
Note: if the install fails the first time, don’t be afraid to re-run the install command a second time.
Finally, when it’s done, reboot.
There you go. A shiny, happy, malware analysis machine.