Peeling back the layers of a batch script ransomware

Our sample today comes from Ahmet Payaslioglu AT_Computeus7 in this Twitter thread.

I was tagged along with a few other people that may be interested in the sample.

The main file was run on AnyRun Here, and that is where I downloaded it from.

I also downloaded the 2 pastebin files that were referenced before they had a chance to disappear.

Let’s take a look at the main file downloaded.


There are several things we see here. I’ve been calling these values environment-variable-like because they are surrounded by 2 “%” percent signs, like this: %jVElq:~20,1%. The technical term for that form is variable substring expansion: take 1 character starting at offset 20 of the variable jVElq. The script also turns on Delayed Expansion.


We can see where it is enabled here.


The next thing is the set values.

These are basically alphabets. The numeric values index into them and return a char or a string, depending on the offset and length given.


Here we can see how each of those parts work to build a string.
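To make the indexing concrete, here is a minimal Python sketch of the lookup. It is not the decoder from this post (that one is written in VB.NET); the function name, the alphabet, and the sample line are all made up for the illustration.

```python
import re

def expand_substrings(script: str, name: str, alphabet: str) -> str:
    """Replace every %name:~start,length% token with the matching slice of alphabet."""
    pattern = re.compile("%" + re.escape(name) + r":~(\d+),(\d+)%")

    def lookup(match: re.Match) -> str:
        start, length = int(match.group(1)), int(match.group(2))
        return alphabet[start:start + length]

    return pattern.sub(lookup, script)

# Hypothetical alphabet and obfuscated line, invented for this example.
alphabet = "abcdefghijklmnopqrstuvwxyz"
line = "%demo:~4,1%%demo:~2,1%%demo:~7,1%%demo:~14,1% hi"
print(expand_substrings(line, "demo", alphabet))  # echo hi
```

Each token is just an offset and a length into the value that was set for that name, so one pass of this per name peels one alphabet.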


Here we can see a couple of things. There are several “@” symbols showing on the left side spread out and the matches for the variable name seem to stop at the “@” symbol.

That can be an indication there are multiple levels of encoding/obfuscation at work here.


Here we can see the result of each layer getting decoded to produce the strings/ Alphabets to decode the next layer.

There were a total of 10 layers to get this to fully decode.

The tools I had were close but not close enough to decode these layers so I wrote a new one.

I experimented with Reg-X to make it easier to find the “Matching” values.


Here is what I used.    %jVElq:~[0-9]+,[0-9]+%

In my decoder I had to allow for changing the name so that looks like this.

Dim Pattern As String = "%" & Name & ":~[0-9]+,[0-9]+%"


Here we can see it did a pretty good job of finding and replacing the values.

This tool will only work on 1 name/ value at a time so you have to start at the top and work your way down on the name/Alphabet.  You also have to copy the result from the output back to the input to get it fully decoded. Yes I got lazy and didn’t add a button to do that.


Once we get the first layer decoded we see the second and that is where my tool failed.

So what was going on?


Take a closer look. Now do you see it?

If you take a close look at the variable name “xuunL” and then at the highlighted values, they are using mixed-case versions of the name.

The way my Reg-X was built, it would not pick up on the different case variations. I also could not find a way to make it “Ignore Case”.

My next step was to build a function to “Normalize” the case to what it was originally set at. It searches 1 step at a time thru the input and matches the value then replaces it with the proper case.

So when it would find   %XuUNL:  it would replace it with %xuunL:  throughout the input data.
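For comparison, a Python sketch can skip the normalize step because its re module has an IGNORECASE flag. This is only an illustration of the idea, not the VB.NET tool from this post; the function name and the sample data are hypothetical.

```python
import re

def expand_ignoring_case(script: str, name: str, alphabet: str) -> str:
    """Expand %name:~start,length% tokens, matching the name in any letter case."""
    pattern = re.compile(
        "%" + re.escape(name) + r":~(\d+),(\d+)%",
        re.IGNORECASE,  # batch variable names are not case sensitive
    )

    def lookup(match: re.Match) -> str:
        start, length = int(match.group(1)), int(match.group(2))
        return alphabet[start:start + length]

    return pattern.sub(lookup, script)

# Made-up alphabet and line using two different spellings of the same name.
print(expand_ignoring_case("%XuUNL:~0,1%%xuunl:~1,1%", "xuunL", "ok"))  # ok
```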

This is still a lot of work if you have 10 rounds but you can control and work with the output.


After decoding the 10 layers we see where it is setting a file extension in different folders. But it does not appear to be encrypting anything, unless I’ve missed it so far.

We can also see it is downloading 2 files from pastebin.


Here we see a note claiming the files were encrypted.


Looking at the renamed files on Anyrun suggests that it only renames files and does not encrypt them.

Although this does not appear to be encrypting anything (right now), it is going thru and disabling security products and dumping data.


Disabling taskmgr, exiting if a listed username is found ? Sandbox or developer username?

Looking at the decoded file from the “EZ88t5c1” pastebin link we can see links to telegram, Discord, Google Docs.



Looking at the cleaned up version of the pastebin link “bLnD8FWX” we can see that it is shutting down any security products it finds.

There is a lot to process in this piece of malware.

That is it for this one.



Link to Twitter thread.

Link to Anyrun for the main script.

Link to my github for the tool used here. (password is “clean”)


What’s the difference and why should I care ?

On occasion I go hunting in various sandboxes by scrolling down the list of submissions to look for something interesting to look at.

I don’t normally see that many PowerPoint samples, so I took an interest in this one that was marked as suspicious at the time.

This file Here on InQuest Labs.


We see some vba code and a bitly[.]com link, so definitely suspicious.

We also see there is a distinctive file path in the code but it gets picked up as a domain name.




So as of this post there are now 58 hits for the path string.

So I chose a file that was listed as malicious to see what it looked like.

This file Here on InQuest Labs.


Scrolling thru this file we now see it has an auto open function, so why the difference?

Normally I will use 7Zip to decompress the document and then navigate to the ppt folder.



Generally you would see a filename similar to VBAProject.bin but in this case we see it as “dsjhfsfhsjfh.c”


If we open this with a hex editor we can see it has the header bytes of “D0 CF 11 E0” telling us we are in the right place.

Now 7Zip does not like this file extension so if we change it to “.bin” we can use it to decompress this data further.



Now this is the file marked as “suspicious”.



We notice something is definitely different here.


This is the malicious one. We can see it has a different file extension also.


We see the same filenames and sizes as the other one.



We can see there is more information here.

My next step is usually to copy paste the bytes into my decompress tool.


This tool uses the decompression function from oledump that decompresses the macro code.

I converted the python to vb dot net.

As you see here with the malicious version we have no problem extracting the vba.

There is nothing to extract from the suspicious one.

Let’s look at the suspicious project data again.


We can see here that there is an auto open function but it is not showing up in the decompressed file.

I next looked at the 2 files in oledump.


As you would expect both vba macros are listed as upper case “M” that usually tells us that they contain Macro Code.


Now looking at the suspicious version we see that one of the macrocode is listed as a small “m”.


This tool works by searching from the “Attribut” string and then getting the start to ending of this section.

To be honest even though I got it to work doing the conversion I’m still not 100 % sure how it works.

I can read thru the compression and see what it is doing but it is not the same as decompressing it for possible odd code segments.

So my next step was to ping Didier Stevens @DidierStevens with some screenshots and a question about the small “m” in this case giving a false sense.

After some time he released a brilliant ISC Diary on this type of sample Here .

So Armed with this information about the file format I can better extract the data.


Here is the entire “Stream” from that vba code.

We learn from the diary that byte 2 and 3 contain the length parameter and a flag for the data after the null byte.


So from the diary we need to read bytes 2 and 3 in little-endian order.

Read that way the value is B0 20. The B is a flag and the length is 0x020, but that is much shorter than what we see here. So we need to change the value to B1 59, which is written back into the file as 59 B1.
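As a rough sketch of that header, here is how the two bytes can be unpacked and rebuilt in Python. The bit layout (12 bits of size, 3 signature bits, 1 compressed flag) is my reading of the chunk header in the MS-OVBA specification, where the size field holds the chunk size minus 3. The function names are made up for the example.

```python
import struct

def parse_chunk_header(two_bytes: bytes) -> dict:
    """Split the 16 bit little-endian chunk header into its three fields."""
    (value,) = struct.unpack("<H", two_bytes)
    return {
        "size_field": value & 0x0FFF,      # low 12 bits: chunk size minus 3
        "signature": (value >> 12) & 0x7,  # next 3 bits, expected to be 0b011
        "compressed": (value >> 15) & 0x1, # top bit: 1 means compressed data
    }

def build_chunk_header(size_field: int, signature: int = 0b011, compressed: int = 1) -> bytes:
    """Pack the fields back into the two bytes as they sit in the file."""
    value = (compressed << 15) | (signature << 12) | (size_field & 0x0FFF)
    return struct.pack("<H", value)

# The header as found in the "cleaned" stream: file bytes 20 B0, value 0xB020.
print(parse_chunk_header(bytes.fromhex("20B0")))
# The repaired header: value 0xB159 goes back into the file as 59 B1.
print(build_chunk_header(0x159).hex())  # 59b1
```

Seen this way, the B nibble is the compressed flag plus the 0b011 signature, and only the low 12 bits change when the length is repaired.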


So now that we fixed that length we can decompress it .

Now that we answered the question of what the difference is let’s look at the question of why I/We should care.

Even though these files get “Cleaned” we still need to know that they exist and may get overlooked as an IOC because they are no longer able to be malicious.

What if the file did its “Bad” first before it got “Cleaned”?

We would still need to extract some information that may give us further IOC’s.

That is it for this one. I hope you learned something too.



InQuest Labs:
Suspicious file Here
Malicious File Here

Didier Stevens:
ISC Diary Here


Peeling away the layers of obfuscation from Excel VBA to dll

When I first saw this Tweet here by FileScan.IO @filescan_itsec I thought this would be an easy target for deobfuscation. I was wrong. The layers just kept peeling away.


Looking at the Twitter link you can get a pretty good idea what this is doing and where it is going.

If you can find all of the parts in the thread, I have multiple screenshots that show more detail but not all of it.

So here I will go thru in more detail the multiple steps required to get to the final malware.

Part 1 The document

Due to the many layers, I’ll have to break this up into parts to keep from getting lost on the way.

We first look at the doc and see there is a warning for disabled macros.


If we attempt to view it with the VBA IDE it throws an error telling us it is unviewable.


So let’s see what oledump will tell us.


This tells us there is 1 VBA Macro called ‘ThisWorkbook’.

We can continue to dump the code with oledump or we can use an alternate method that I normally use.


With either type of file (header PK or 0xD0CF11E0) we can use 7Zip to decompress it. Navigate to the XL folder, where we find the vbaproject.bin.

Next we can decompress that file too with 7Zip.


We next navigate and find the special compressed “ThisWorkbook” file. This tool here will take all of the bytes from the file and locate the header and extract the macrocode without running the file. It uses code converted from Didier Stevens @DidierStevens python code for decompression.


In the screenshot above we can see we have a check for the country setting of 81.

If we take a look here we see that the value comes from the control panel setting.


So whatever Country setting you are using is what would show up here. There are some references where people believe that this value is the same as the telephone country code. I think it may come from the file WINNLS.H.


Here is a built in list of country codes.


Here we can see that 81 is for Japan. So that tells us this sample is targeting only systems with that setting.

One other clue we have is the code page value as seen here on Iris-H.


In this case  932 – ANSI/OEM Japanese; Japanese (Shift-JIS)

So now we understand in order for this to run we would need to have our sandbox running under this Location ID.

Now let’s go back to the script.

After studying the script we see that all of the hex data gets reassembled and then, using 4 chars at a time, converted to a string.
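A small Python sketch of that conversion is below. It assumes each group of 4 hex chars is one character code, which is my reading of the description and not something copied from the macro. The function name and the sample input are hypothetical.

```python
def hex_quads_to_text(hex_string: str) -> str:
    """Turn a hex string into text, treating every 4 hex chars as one char code."""
    chars = []
    for i in range(0, len(hex_string), 4):
        chars.append(chr(int(hex_string[i:i + 4], 16)))
    return "".join(chars)

# Made-up input: 0x0048 is "H" and 0x0069 is "i".
print(hex_quads_to_text("00480069"))  # Hi
```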


Notepad++ will allow you to search using regular expressions. With a little trial and error I was able to come up with this Reg-X to clean up the hex string and reassemble it.


This is after. It is a lot quicker than doing this line by line.



And the Bottom section.

Next we need to assemble the cleaned hex together and then decode to string.


After decoding the hex to string we now find that we are faced with a PowerShell script that is obfuscated with Invoke Dosfuscation.


This version sets variables at the top of the script then assembles them at the bottom.


After reassembling we see we have a layer of Invoke Obfuscation now.


Now we decompress the base64 string to another Invoke Obfuscation encoded PowerShell script.


Now we are down to our last layer for this section.


At the top of this script we see a list of 4 files to download. As of 2021-12-06 I was still able to download the first 2 picture files; the last failed to download.

Back down towards the bottom we have a for loop to attempt to download each file.

Once the file is downloaded it will use the function to extract each pixel and its B and G color values, do the math, then output the result as a char.
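The exact math is not written out here, so the Python sketch below is an assumption: it uses the low nibble scheme known from Invoke-PSImage style loaders, where the low 4 bits of B become the high half of the char code and the low 4 bits of G become the low half. The function name and the pixel values are made up.

```python
def pixels_to_text(pixels) -> str:
    """Rebuild text from (R, G, B) tuples using the low 4 bits of B and G.

    Assumed scheme: char code = (B & 0x0F) * 16 + (G & 0x0F).
    """
    out = []
    for _r, g, b in pixels:
        out.append(chr(((b & 0x0F) << 4) | (g & 0x0F)))
    return "".join(out)

# Hypothetical pixels that carry "Hi" (0x48, 0x69) in their low nibbles.
sample_pixels = [(0, 0x38, 0x14), (255, 0xA9, 0xF6)]
print(pixels_to_text(sample_pixels))  # Hi
```

Because only the low bits of two channels change, the picture still looks normal to the eye, which fits with two different picture files carrying the same hidden string.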

After that it will pass a base64 string that was decoded from the pixels and the result of ([System.Version]).”nAME” which evaluates to “Version”. This value is used as a key in the decryption.

The decryption function will use the first 32 bytes of the base64 Encoded data in the Derive key function. It will also use the passed key value.

The HMAC section is not really required to properly decrypt the data.

We then use RijndaelManaged decryption.

If the data was properly decrypted, the first byte is 0x1F, which tells us, without even looking at the rest of the code, that the output is GZip compressed data.

While still in the decryption function it will GZip Decompress the data before returning another base64 string.


Here we see the result of decoding the Picture file.


Here we Decrypt the passed data and it is output as GZip Compressed.


After GZip decompressing we see yet another base64 string. This time with a 3 byte, byte order mark at the beginning.
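The unwrap after decryption can be sketched in Python with only the standard library. The decryption itself is left out. The sketch assumes the 3 byte mark is the UTF-8 byte order mark (EF BB BF), and the function name and test data are hypothetical.

```python
import base64
import gzip

UTF8_BOM = b"\xef\xbb\xbf"  # assumption: the 3 byte mark is the UTF-8 BOM

def unwrap_layer(decrypted: bytes) -> bytes:
    """GZip decompress the decrypted data, drop the BOM, then base64 decode."""
    if decrypted[:2] != b"\x1f\x8b":  # GZip magic bytes
        raise ValueError("not GZip data, the decryption probably failed")
    text = gzip.decompress(decrypted)
    if text.startswith(UTF8_BOM):
        text = text[len(UTF8_BOM):]
    return base64.b64decode(text)

# Made-up stand-in for the decrypted output of one layer.
sample = gzip.compress(UTF8_BOM + base64.b64encode(b"next layer"))
print(unwrap_layer(sample))  # b'next layer'
```

Checking the magic bytes first gives a quick yes or no on whether the key was right before any more work is done.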



As we can see we have another base64 string, but this one gets a little more done with it.

If we check the hash of the 2 downloaded picture files we find they are different.

BlSALZQZ_o.png  MD5 : 86CEAF2709B769A29A40DCC4822D8F5E

o7h7NeV.png  MD5 : 59B3D9169EBC1C1EF6221A41609F8316


Although the picture files are different, the extracted base64 strings are the same in both.

Part 2 The second decryption

In this section we are going to start where we left off on the last Base64 decoded string.

If we look closely at the decoding function for this layer, we are using the same decryption function but with a different key and encoded data.


Looking at the first line we see $Fghg  will get set to a value.

The first value is 102 shifted left by 2, which is 408. Next we need to find the LCID.

An Internet search will land us on This page for the LCID Enumeration.


As we can see it is a defined ID for the Locale and Language code.


We have already determined this is targeting Japan so we scroll down to get the LCID for that Locale. Which is 1041.


Looking at line 1, it will get the 2 numeric values and add them together. Line 2 turns the numeric value into a string, which is passed as the key in line 3.

$r44r=Ottass -Igaa $MmUz -Pcxc $Fghg;


Here we can see the values of the large base64 string and the key of  “1449” passed into our decrypt function to output  GZip compressed data.
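The key arithmetic is small enough to check by hand, but here it is as a few lines of Python. The variable names are made up; the numbers are the ones from the script and the LCID table.

```python
# 102 shifted left by 2 bits is the same as 102 * 4.
shifted = 102 << 2           # 408
lcid_japan = 1041            # LCID for the Japanese locale
key = str(shifted + lcid_japan)

print(key)  # 1449
```

This also means the key only comes out right on a system whose LCID is 1041, so the locale check and the decryption key are tied together.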


After Decrypting to GZip compressed we had base64 with a Byte order mark again.

Upon base64 decoding that we end up with this PowerShell script.


As we continue to scroll down we can see we appear to have some loader code obfuscated with Invoke Obfuscation again.

Using Kahu Security’s new version of PSUnveil_v0.2 found Here, we find that the normal “mode” selections fail to decode the script, but the new “Clean-Up” option does a pretty good job of cleaning up the obfuscation. It still leaves a few obfuscated items, including the top part.



Using the older version of PSUnveil  we start cleaning the top section using the Arrow to move each layer over to the left as it is deobfuscated.



When attempting to clean the last layer it throws an error with both versions of the tool.


Using a tool I wrote some time back I was able to get the last layer cleaned up.


This shows it takes 4 rounds of deobfuscation just to clean up this section.

Part 3 The second download


After formatting the script by hand to be a little more readable, we see again that it is attempting to download picture files. As of 2021-12-07 the first two files are still live.



0q0WQuZj_o.png  MD5 : 277F4ABC0E387BEF97DBB2B34F223C7F

cf2262W.png  MD5 : B2EB068210417E7056F4640AD1F2B36C

As we can clearly see from the bin Diff and the file hashes, these two are different, like the first two pictures we downloaded. Also like the first two, the extracted base64 strings are the same.


Once we base64 decode the string extracted from the picture file we find that this time it is still encoded / encrypted in some form.


Going back to our PowerShell Steggo script we see that the result of the extraction is passed to the value  ${MAGG}.


Going back to the last script from part 2 we look for the value and find it here in this Xor function.

It will first get the LCID of the system and convert the numeric value to string.

It then converts the string to a Char code array.

Next we pass in the base64 string extracted from the picture file and Base64 decode it to a byte array.

Finally it will Xor the data by the LCID (“1041”).
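The steps above can be sketched in Python as follows. It assumes the 4 key bytes repeat over the whole buffer, which is the usual way a short Xor key is applied but is an assumption about this script. The function name and the sample data are hypothetical.

```python
import base64
from itertools import cycle

def xor_with_lcid(b64_text: str, lcid: int = 1041) -> bytes:
    """Base64 decode the extracted string, then Xor it with the LCID digits."""
    key = str(lcid).encode("ascii")  # "1041" becomes bytes 0x31 0x30 0x34 0x31
    data = base64.b64decode(b64_text)
    # Assumption: the key is reused cyclically across the data.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# Made-up round trip: Xor some text with "1041", encode it, then undo it.
plain = b"H4sIAAAA"
encoded = base64.b64encode(bytes(b ^ k for b, k in zip(plain, cycle(b"1041")))).decode("ascii")
print(xor_with_lcid(encoded))  # b'H4sIAAAA'
```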


After we Xor the data it outputs a GZip Base64 Encoded String.


Now after GZunzip / base64 decode we see we have another base64 string that starts with “TVpQ”. That base64 decodes to “MZP”, which tells us we are now looking at a base64 encoded executable.
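That check is easy to script. The short Python sketch below decodes only the first 4 base64 chars and looks for the “MZ” magic; the function name and the sample string are made up.

```python
import base64

def looks_like_pe(b64_text: str) -> bool:
    """True when the first 4 base64 chars decode to bytes starting with 'MZ'."""
    return base64.b64decode(b64_text[:4])[:2] == b"MZ"

print(base64.b64decode("TVpQ"))      # b'MZP'
print(looks_like_pe("TVpQAAIAAAA"))  # True
```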


Here we can see this is a well known binary on VirusTotal and has been around for a few years.

Here on FileScan.IO it also has a BlackEnergy tag.

That is it for this one. I hope you enjoyed the journey thru the layers.


Link to FileScan.IO Twitter thread.
Link to FileScan.IO of the Document.
Link to FileScan.IO of the Extracted Final Binary.

Link to xlCountrySetting page.
Link to LCID page

Link to Iris-H for document.

Link to Kahu Security tools download page.

Link to my GitHub with extracted files and my tools used.


Excel 4 macro code obfuscation

This sample comes from a Twitter thread located Here by Frost @fr0s7_ and appears to be  “BazarLoader”

Since this is a Xlsb file I usually just open it up in my Office 2010 Pro sandbox and then convert to Xlsm and unzip it so I can just view as xml.

The first thing I always do is take a quick look with a hex editor looking for anything of interest.


As we can see from the first 2 bytes we have a “PK” or zip file format.

Once we “UnZip” the file and navigate to the xl folder we can verify this is a binary file and it also contains an Excel 4 macro folder named “macrosheets”.



If we look at the SharedStrings.bin file we can see that strings are in a Unicode format and not that easy to see where they split up at.


Looking at sheet1.bin in the macrosheets folder we can see it is not human readable. 

This is the point where I usually convert the file.


Here we can see we still have a “PK” file but you can clearly see the data is presented a little differently.


Once we unzip and navigate to the xl folder here it now looks a little different.



And now if we look at the SharedStrings.xml file it is a little different.

By the counts there are 34 indexed shared strings. Each appears to be randomly generated strings.


I wrote a tool to aid in extracting and indexing the shared string from the xml file.

When I first parsed the shared strings I ended up with 0-37 index values instead of 0-33.
Turns out the tool stumbled on a rare random Char value I was using to split on.
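One way around splitting on a character is to let an XML parser do the work. The Python sketch below is not the tool from this post, just an illustration using the standard library. The function name and the sample XML are hypothetical, and phonetic runs are not filtered out.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

def read_shared_strings(xml_text: str) -> list:
    """Return the shared strings in index order, one entry per <si> element."""
    root = ET.fromstring(xml_text)
    strings = []
    for si in root.findall("{%s}si" % MAIN_NS):
        # An <si> holds either one <t> or several rich text runs with their own <t>.
        parts = [t.text or "" for t in si.iter("{%s}t" % MAIN_NS)]
        strings.append("".join(parts))
    return strings

# Made-up sharedStrings.xml content with a char that a naive split could trip on.
sample_xml = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" '
    'count="2" uniqueCount="2">'
    "<si><t>ab|cd</t></si>"
    "<si><r><t>x</t></r><r><t>y</t></r></si>"
    "</sst>"
)
print(read_shared_strings(sample_xml))  # ['ab|cd', 'xy']
```

Since the index comes from the position of each si element, a stray separator char inside a string can no longer shift the count.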


Here we see the xml version of the macro code. Like the shared strings it is hard to see thru all of the xml tags what is there so I wrote a parser for those too.


This tool is designed to extract values to aid in better viewing what is happening without all of the xml tags. In this case some are left.


Here we see what the values are.


If we look at the highlighted values in green we see that it is looking up the string in cell ‘E11’, then going to the char at the index and taking so many chars: “MID(E11,12,1)”. In vbs the index starts at 1 but in this the index starts at 0.

So now we know the first char code was converted to “S”, and now we see the first extracted letter is “h”, the next letter is “e”, and the next 2 are at the same index and are “l”.

Now we have the word “Shell” extracted.

This would be a pain to do by hand, but now that we understand how it works, what else is available to extract this data?

The Answer is “XLMMacroDeobfuscator” located here .


As we can see here this tool does a great job of presenting us with the deobfuscated strings.

The version I’m using here is from October 3rd 2021 before it was updated several more times. The version number stayed the same so you need to verify by the install/ file date.

Using the latest version as of November 12th 2021 it only returned the eval result. Also notice in the screen shot that showed the data it is a “Partial Evaluation” where in the updated version it is a “Full Evaluation”.

I have not looked at the byte format for the Macro sheet data but I have looked at the shared strings in the binary format.

Due to the lack of information that I can find on the file format let’s take a quick look at the data in this file as shown below. Notice the patterns.


In the original sample that I wrote an extraction tool for, we can see it is laid out slightly differently.



Although the file in my original sample was labeled qut.xml it was not an xml file at all. So you can not count on a file name or extension for searches.


And here is what it looks like in the Hex editor.



Let’s take a look at the format for this sample, then we will go back and look at the one from the beginning.

We can see the first 3 bytes of the data appear to be a fixed Header value.

The next 4 bytes are the “Count”. If I understand correctly, it is the total times the string/chars are referenced.

The next 4 bytes are the “Unique Count”. These should be the total number of strings shown in the cells.

Next it gets interesting.

The first byte is always 0x13. Next we have 1 or 2 bytes (unknown). Perhaps it is a data type? It appears that it could be 1 or 2 bytes then a null byte, depending on the string.

Next we have the length of the string as displayed in the cell. It uses at least 2 bytes.
So if the first string is only 1 char, then the value is 0x0100, or in reverse order 0x0001.

After that we have 2 null bytes. Then finally the Unicode bytes for the string.

Now let’s go back to our first file that we extracted from this sample.


Notice how everything is aligned but the area in the red box.


If we look at the string under index 16 we see it is 531 characters long.

531 = 0x0213 and our value in the data is 0x1302.
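The byte swap can be confirmed with a couple of lines of Python. Only the two length bytes are handled here; the rest of the record layout is still partly unknown, so this is not a parser for the file.

```python
import struct

# The two length bytes as they appear in the file, low byte first.
(length,) = struct.unpack("<H", bytes.fromhex("1302"))

print(length)       # 531
print(hex(length))  # 0x213
```

So the 0x13 in the red box is simply the low byte of the length, which happens to have the same value as the marker byte that starts each record.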


Now everything lines up.

Here we see the first byte 0x13, then 2 unknown bytes, then a null byte, then 2 bytes for the length, then a double null, and finally the start of our Unicode string values.

So in this sample we have an extra 0x13 in a place that will break the tool.

At this point the tool will work on a few but will need a total rewrite based on this new information.

There have been plenty of samples that I have looked at where you did not even need to look at the VBA or macro code. All you needed to do was extract the shared strings to get the urls or paths used.

That is it for this one. I hope you learned from this as much as I did.



Link to Twitter thread
Link to Sample on InQuest Labs
Link to Sample on Iris-H

Link to XLMMacroDeobfuscator

Link to my tools on GitHub


A deeper look at Office documents flat style

Over the last few years I have seen some samples that use the xml style of Word Documents with base64 encoded ActiveMime data.

What started this was a recent Twitter post by HunterMaor @bit_dam Here where he was not able to get the final payload to download.

Let’s take a closer look at this one.


At the top we can see it is an xml format.


The next thing we notice is the base64 string with a name of “editdata.mso”. Not all samples I’ve looked at use that name but it does seem that most all of them do.

Finally we scroll all the way to the end just past the base64 encoded picture file that gets displayed.


If we look closely here we can see that there appears to be an Xml encoded Html/HTA page here.

Using a couple of tools I wrote to aid in getting information out of large xml files and word xml/html encoding  we can get a cleaned up version.



If we clean the output up we can see we do have a html page.


If we scroll down a bit we see that the base64 string at the top will be split on the string “aGkh”.


If we split on the string, that will give us 2 base64 strings and then the script control string.

Here we will Concentrate on the larger Base64 string.


If we base64 decode we can see that the output is reversed. When we flip it around we can see we have a downloader script.
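Here is a short Python sketch of the split, decode, and flip. As a side note, “aGkh” is itself the base64 encoding of “hi!”. The function names and the sample blob are made up, and the blob only imitates the layout described above.

```python
import base64

def split_blob(blob: str, separator: str = "aGkh") -> list:
    """Split the combined string into its parts on the separator."""
    return blob.split(separator)

def decode_reversed(part: str) -> str:
    """Base64 decode one part, then reverse the resulting text."""
    return base64.b64decode(part).decode("latin-1")[::-1]

# Hypothetical blob: reversed script, separator, second string, separator, control string.
script = "var x = 1;"
blob = (
    base64.b64encode(script[::-1].encode("ascii")).decode("ascii")
    + "aGkh"
    + base64.b64encode(b"second").decode("ascii")
    + "aGkh"
    + "control"
)
pieces = split_blob(blob)
print(len(pieces))                 # 3
print(decode_reversed(pieces[0]))  # var x = 1;
```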


If we “JS” format this we can see it will call out to a long url, and if it gets a response of 200 then it will write a file with a jpg extension to the path.


If we look at the smaller base64 decoded string we find it is not a jpg file but a file being loaded by “regsvr32”.


Going back to the “editdata.mso” base64 string. Once we base64 decode to hex(bytes) we see the header tells us this is an ActiveMime file.


If we look at the bytes at offset 0x32 and 0x33 those are 2 bytes for a Zlib header.

You can extract from here down and then use a Zlib Library to decompress this part. Or skip those 2 bytes and use a Dot Net Decompress function to decompress this to a new byte array.
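Both options can be sketched with Python's zlib module. The offset 0x32 is the one seen in this sample and may not hold for every ActiveMime file. The function names and the test blob are made up.

```python
import zlib

ZLIB_OFFSET = 0x32  # where the 2 byte Zlib header sits in this sample

def inflate_with_header(blob: bytes) -> bytes:
    """Option 1: hand the data from the Zlib header onward to a Zlib library."""
    return zlib.decompressobj().decompress(blob[ZLIB_OFFSET:])

def inflate_raw(blob: bytes) -> bytes:
    """Option 2: skip the 2 header bytes and inflate the raw deflate stream."""
    raw = zlib.decompressobj(-zlib.MAX_WBITS)  # negative wbits means no header
    return raw.decompress(blob[ZLIB_OFFSET + 2:])

# Made-up stand-in for an ActiveMime blob: padded header, then compressed data.
payload = bytes.fromhex("d0cf11e0a1b11ae1") + b"\x00" * 64
blob = b"ActiveMime" + b"\x00" * (ZLIB_OFFSET - 10) + zlib.compress(payload)
print(inflate_with_header(blob)[:4].hex())  # d0cf11e0
print(inflate_raw(blob) == payload)         # True
```

The second form mirrors what a raw deflate routine such as the Dot Net one expects, since it does not understand the 2 byte Zlib header.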



After we decompress that we now have a “DOCFILE” file that we can use 7Zip to decompress to a folder view.


So in this case our ActiveMime data is a vba project.


And we have 3 scripts in it.


Here are the 3 combined scripts extracted. We can see it will build and run a HTA.

That brings us to the next section of finding more samples.

Looking for more samples in my repository requires me to use yara to parse the files due to the size of my repository now.

Cleaning up my original test rule leaves me this to find all files with the base64 encoded string “ActiveMime”


This rule found 28 files in my repository. The bulk of them were older Emotet samples that did not use the script at the end like this one did. They just used highly obfuscated VBA.

Note: This rule is very broad and will catch anything that uses the base64 encoded “MimeType” not just what we are looking for with these samples.
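To show where a search string like that comes from, here is a Python sketch that works out the base64 form of a marker. It is not the rule from this post. It assumes the marker sits at the very start of the encoded data, and the function names are made up.

```python
import base64

def stable_b64_prefix(marker: bytes) -> str:
    """Base64 of marker, keeping only the chars that do not depend on later bytes."""
    encoded = base64.b64encode(marker).decode("ascii").rstrip("=")
    if len(marker) % 3:
        # The last char also carries bits of whatever byte follows, so drop it.
        encoded = encoded[:-1]
    return encoded

def contains_marker(document_text: str, marker: bytes = b"ActiveMime") -> bool:
    """True when the encoded marker shows up anywhere in the text."""
    return stable_b64_prefix(marker) in document_text

print(stable_b64_prefix(b"ActiveMime"))  # QWN0aXZlTWltZ
```

A string this short will hit on any embedded ActiveMime blob, which matches how broad the rule turned out to be.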

I also ran the rule against Hybrid Analysis . At last check it has 3 pages of found files. All of the files I checked were of this new type though.

The newer versions began being logged on HA in January 2021.

Let’s take a look at a different version that is using obfuscation to hide the script better.

This sample can be found Here on

This starts out the same as the other sample.




But when we get down to where the xml script was in the first sample we see something different.


We can still extract the data from the xml the same way. It is still encoded but how ?


After several years of experience, we can make an educated guess on how this is encoded without even looking at the VBA code.

If we scroll all of the way to the bottom we can see right after the “>” the string “tqdkj”, and if we highlight every instance of it we can see a pattern beginning to emerge in between.

This is just a builder artifact that will exist no matter what characters are used.

So we now know that it is just a simple string replacement (Removal).
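In Python the removal is a single replace call. The sample text below is made up; only the filler string “tqdkj” comes from the document.

```python
def strip_filler(text: str, filler: str = "tqdkj") -> str:
    """Remove every copy of the filler string to get the hidden text back."""
    return text.replace(filler, "")

# Hypothetical obfuscated fragment.
print(strip_filler("tqdkjvatqdkjr tqdkjx"))  # var x
```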



Now we have the script.


Base64 decode the first string to get the url it is calling out to.


Now we can see where it is calling out to.

Now that we did this the “Hard Way” let’s take a look at what it looks like in Word.


Here we can see something is there that won’t show up unless you are able to highlight it.


Here we see the font color is white and the size is 1.


Once we increase the font and change the color then we can better see what is happening.

Just copy paste the text and then do the deobfuscation to get the script.

While trying to find any sample in a sandbox that was able to download the picture file (dll), I found some other information on the url.

(Note: none of the sandboxes were able to download the dll. I checked @HybridAnalysis, @anyrun_app, and @hatching_io.)

We find this is TA551, as tagged in this report Here.


Multiple samples also mapped to the urls tagged as TA551, so that leads me to believe that this format/builder belongs to them.

That is pretty much it for this one.

Further Reading:

Link to a 2015 SANS ISC Diary
Link to a 2015 Trustwave Post
Link to a “Insecure” archive page with links to other articles including the Trustwave one.

Links From Post:

Link to first document
Link to second document
Link to IOC database.


More on Yara And Building Rules

I’ve been learning how to build and modify yara rules lately but my biggest pain was getting the formatting correct.

In a recent Twitter thread Here James @James_inthe_box  posted where asyncrat was using pastebin  to host their encoded rat.

My repository is now getting large enough with similar samples I will need more than just my simple single string search utility to search with.

I also need a way to standardize how I write the rules.

While we were all going through the sample on Twitter, Nadav Lorber @LNadav from Morphisec released a blog post Here that started with the vbs dropper that led to the pastebin links.

I just finished downloading all of the vbs hashes that I could find on either “ANY.RUN @anyrun_app” or “Hybrid Analysis @HybridAnalysis” . I don’t have access to download from VirusTotal.

All of the files I could not find in the other two locations were located on VirusTotal.

There appears to have been 51 hashes to search for. The last 7 that I found wrote to a bat file for the next stage instead of PowerShell.

Of the remaining ones I found they used various forms of obfuscation from xor’ing with a long “random” string to various layouts of chr(number) . They would be mixed case and even Chrw() for wide char/ Unicode even though the decimal values were in ascii decimal range.

Let’s take a look at one and see what we are going to run into.


Here we have a get object which turns out to be the class ID of Shell.

Also in this screenshot we have a large number of “cHR(“ values with math functions.

The math function could change drastically so we can not count on those.

At the bottom we have a few possible things we can use for a rule.


For this sample I’m going to go with this group of strings.


I’m choosing the CLSID because it is distinctive and the sleep as an extra value, while the “&cHR(“ in multiples will tell me they are trying to hide something.

So let’s take a look at the Yara Builder.


As you can see here it is just a simple fill in the blank and click a button.

So let’s fill in the blanks and see what we get.


You may notice an extra empty $s3 = “” in there too.

With the exception of the strings section, all of the text boxes take the input string and just put those values in the formatted output. If you leave a box empty it will put an empty string in, like the $s3 value.

For the strings, it will take each string in the box, using “CRLF” for the new line to split them, then number each string and output them to the formatted strings section.
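The same idea can be sketched in Python. This is not the Yara Builder itself. The function name, the rule name, and the sample strings are all hypothetical, and unlike the tool this sketch skips empty lines instead of writing an empty string entry.

```python
def build_rule(name: str, description: str, strings_box: str,
               condition: str = "all of them") -> str:
    """Format a yara rule from a rule name and a CRLF separated box of strings."""
    entries = [s for s in strings_box.split("\r\n") if s]  # drop empty lines
    out = [
        "rule %s" % name,
        "{",
        "    meta:",
        '        description = "%s"' % description,
        "    strings:",
    ]
    for number, text in enumerate(entries, start=1):
        # Escape backslashes and double quotes so the string stays valid.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        out.append('        $s%d = "%s"' % (number, escaped))
    out += ["    condition:", "        %s" % condition, "}"]
    return "\n".join(out)

# Made-up input, as if typed into the strings box with a trailing new line.
print(build_rule("vbs_chr_obfuscation", "demo", "&cHR(\r\nWScript.Sleep\r\n"))
```

Numbering the strings in code keeps the $s names consistent, which is most of what makes the formatting easy to get wrong by hand.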

And just in case everyone was wondering what that large group of char codes decode to we have this.


More Char codes and powerShell, go figure.

So our yara output now looks like this.


Now that is a decent start to get our formatting but what can we do to improve it with the limited amount of usable code available.


On a test on Hybrid Analysis this version throws an error. Can you see it ?

I left a space between “with” and “CLSID”, so now we know HA does not like spaces in the rule name either.

The space has been fixed in the final version.

And what does it return ?


Two of the files already on our target list.

After looking at several of the other files downloaded, we see the Char( spaced differently. I’m not sure if there is an easier way yet to do this, so we have the 4 different versions.

If we wanted to catch the ChrW versions we would also need to add that to our rule.

A few things kept messing with me. One was when I tried to put a dash in the rule name. Yara does not like that, but underscores are ok.

Another thing is the lower case section names and the keywords.

Every time I mistakenly uppercased them then it would throw an error.
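A small pre-check could catch the naming mistakes before a rule is submitted. The Python sketch below follows the identifier rules as I understand them from the yara documentation: letters, digits, and underscores, not starting with a digit, at most 128 chars. It does not check for reserved keywords, and the function names are made up.

```python
import re

# Letters, digits, and underscores; first char not a digit; 128 chars at most.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")

def is_valid_rule_name(name: str) -> bool:
    """True when the name fits the identifier pattern above."""
    return IDENTIFIER.fullmatch(name) is not None

def to_rule_name(label: str) -> str:
    """Swap spaces, dashes, and other odd chars for underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", label)
    return "_" + cleaned if cleaned[:1].isdigit() else cleaned

print(is_valid_rule_name("vbs with CLSID"))  # False
print(to_rule_name("vbs with-CLSID"))        # vbs_with_CLSID
```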

That is it for this one.

I hope it is helpful to someone.


Link to Twitter thread

Link to the blog post

Link to my GitHub with the tool.


SunCrypt, PowerShell obfuscation, shellcode and more yara

This didn’t start as a blog post. It started as a conversation with Hari Charan @grep_security about something they were looking at called SunCrypt ransomware.

Looking up the name I ran across a couple of interesting blog posts, one by Sapphire here and one by Acronis here. Seeing that this was obfuscated PowerShell, it piqued my interest.

Searching for some samples to work with also revealed that you can do a tag search on tria.ge of “family: suncrypt” (without the space).


The PowerShell loader we are going to use here is the one from the Acronis blog post with a hash of MD5: d87fcd8d2bf450b0056a151e9a116f72. There are multiple copies out there for that hash. There are 3 copies on tria.ge here.

Hari Charan @grep_security also pointed me to a couple of  open source yara rules to search for the PowerShell loaders.

This one appears as though it will search for the ransomware binary here and this one will search for the PowerShell script here .

Let’s take a look at some of the encoding.



If we look at this part it takes 3 values, assembles them, and then base64 decodes the result to bytes.

But it will also do something to the strings before it reassembles them.


We can see the first string is passed to a function that will read it right to left, basically just reversing the string.


If we look at the second string, it is getting a substring of what is there, starting at index 16 and taking 2000 characters.


The encoded string is actually 2032 characters long before we get the substring.

The final string is just another reversed string.

Then we just have a long base 64 string after reassembling the pieces.
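The three steps (reverse, substring, reverse) are easy to reproduce outside of PowerShell. Here is a minimal Python sketch, assuming the three pieces have already been copied out of the script as plain strings. The function and parameter names are mine, not from the sample, and the defaults of 16 and 2000 are the values seen in this one loader.

```python
import base64

def reassemble(part1, part2, part3, start=16, length=2000):
    """Rebuild the base64 text: reverse, substring, reverse, then join."""
    first = part1[::-1]                     # first piece is stored reversed
    middle = part2[start:start + length]    # same idea as Substring(16, 2000)
    last = part3[::-1]                      # last piece is reversed as well
    return first + middle + last

def decode_payload(part1, part2, part3, start=16, length=2000):
    """Base64 decode the reassembled text into raw bytes."""
    return base64.b64decode(reassemble(part1, part2, part3, start, length))
```

The characters cut off by the substring (the 2032 versus 2000 difference) act as padding junk on both sides of the real data.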


Remember we still have to convert this to bytes, and it will get loaded into memory using VirtualAlloc.



Looking at the bytes in a hex editor we cannot see anything that makes any sense.

The next step is to drop this into CyberChef here and view the assembly.


This is also where I hinted on Twitter of a “Somewhat useful tool” which will be on my Github.

If we look down further we see more API calls.


And even further down we see a different type of string building using a “push pop”. I have not made a tool for that yet.


Although doing this statically we cannot tell for sure how this is used, the API calls can give some clues as to what it will be doing.

What started all of this was when I was trying to write a yara rule to find more samples to test this tool with and look for any outliers that would break it or not be what I was looking for.


I’m still learning yara and this version just looked for the format of the “MOV BYTE PTR”.
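To show what the “MOV BYTE PTR” format looks like at the byte level, here is a rough Python sketch that pulls strings out of it. It is not the tool mentioned above. It assumes the common EBP-relative form with an 8-bit displacement (bytes C6 45 xx yy), which is only one of several encodings shellcode can use, and the function name is hypothetical.

```python
def extract_mov_byte_strings(code, min_len=4):
    """Collect ASCII strings built by runs of MOV BYTE PTR [EBP+disp8], imm8."""
    results = []
    chars = []

    def flush():
        # Keep the collected characters only if the run is long enough.
        if len(chars) >= min_len:
            results.append("".join(chars))
        chars.clear()

    i = 0
    while i + 4 <= len(code):
        # C6 45 <disp8> <imm8> encodes MOV BYTE PTR [EBP+disp8], imm8
        if code[i] == 0xC6 and code[i + 1] == 0x45:
            value = code[i + 3]
            if 0x20 <= value < 0x7F:
                chars.append(chr(value))
            else:
                flush()          # a null or non-printable byte ends the string
            i += 4
        else:
            flush()              # any other instruction breaks the run
            i += 1
    flush()
    return results
```

Random data will also contain C6 45 byte pairs, so a minimum length and a printable check like this are the kind of thing that helps cut down the garbage strings.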

I ended up with over 552 hits for this and many false positives. I knew I needed to find something to rule out the values that did not return strings or would return either encoded or garbage-looking strings.

After several hours of trial and error I ended up with this.


That reduced it down to 214 hits. It ended up being shellcode and binary samples that used that format. I’m sure there are a few more samples in that mix that would be false positives but it was good enough for what I wanted.

After going through that exercise I wanted to try and find a way to let the obfuscated PowerShell self decode. So I started by looking for a way to just let it reassemble the base64 string and then write that to a file.


The template part is the path variable and the pipe out to file. But you have to remember to remove the “[Byte[]]” part and the “[System.Convert]::FromBase64String” from each one you wanted to rebuild and just dump to a text file for further processing of the base64 string.

So I then went back and searched for how to just output to a binary file since that is what we ultimately wanted anyway.


The variable for the path can be the same but instead of pipe to write file / text we add the line with the System IO and make sure we have the variable name the same as in the extracted PowerShell.

Moving on to the large base64 string.


Using Notepad++ we notice the highlighted area is all 1 section. You may also notice the extra parameter name right after the join.


Searching for that value we find it all the way up right after the code for the shellcode reassembling.

So when we go to use the self decode trick we need from here all of the way to the end of the highlighted area to be sure we have all of the needed parameters to rebuild the base64 string before it gets decoded to hex/binary data.

Once we drop this into our wrapper and verify we have the proper output name set we can then just input it into the PowerShell ISE and run it and it will output our binary file for the next step.



Now the first four bytes of this output appear to be the length of the remaining bytes in the output. These will need to be removed for the next step.
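Removing the header can be scripted as well. Here is a short Python sketch that assumes the four bytes are an unsigned little-endian integer (the byte order is my assumption, it is not confirmed above) and returns the declared length so it can be compared against what is left.

```python
import struct

def strip_length_prefix(data):
    """Drop a 4-byte length header and return (declared_length, payload)."""
    if len(data) < 4:
        raise ValueError("input is shorter than the 4-byte header")
    # Assumption: the length is an unsigned 32-bit little-endian integer.
    (declared,) = struct.unpack_from("<I", data, 0)
    payload = data[4:]
    return declared, payload
```

If the declared length does not match `len(payload)`, the header is probably something other than a simple length.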


Here we see it is a 32 bit binary with a Timestamp of 9/18/2020, although the created date shows today because that is when the file was assembled.

If we look at the Unicode strings we can see that file extension strings are not obfuscated or hashed like the other blog post showed.


One of the next things I was looking for is how to extract the ransom Note.

The other blog post gives us clues about what we are looking for, so let’s look at the file in a hex editor.


There is a very distinctive string that begins with “11”. As it turns out, “0x11” is the xor key.

One of the other samples used 0x13 for the xor key.

If we scroll down to the end we can see clearly where this section will end.



If we keep scrolling down while we still have multiple “11” values we get to this.


If we xor that by 0x11 we get this.
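For reference, a single byte xor is only a couple of lines in Python. This sketch uses 0x11 as the default key because that is what this sample used; the function name is just for illustration.

```python
def xor_bytes(data, key=0x11):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)
```

A null byte xored with 0x11 comes out as 0x11, which is why long runs of “11” in the hex editor point at the key.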


Next I uploaded this to Anyrun here because I could not figure out at the time where the IP was coming from.


One of the last pieces of this puzzle is that it does a post request with some encoded data.


If we look at the data that gets dumped from the packet we see this.


So as a guess I checked to see if it had a single byte xor key and to my surprise it did.


The same one as the rest to decode with, 0x11.
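Checking for a single byte xor key does not have to be done by hand. Here is a hedged Python sketch that tries all 256 keys and keeps the one that produces the most letters, digits, and spaces. The scoring is an assumption that only suits text-like data, and the function name is hypothetical.

```python
_LIKELY = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ")

def guess_xor_key(data):
    """Return the single-byte key whose output looks most like plain text."""
    def score(key):
        # Count decoded bytes that are letters, digits, or spaces.
        return sum(1 for b in data if (b ^ key) in _LIKELY)
    return max(range(256), key=score)
```

This is only a heuristic. Short inputs or binary data can score well under the wrong key, so the result should always be checked by eye.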


Does this passed hex value look familiar? It is from the section where the IP was extracted.

What is it? I do not know. If someone does please let me know.

One other thing: while I was not initially able to find the IP, I dropped this into IDA to see if I could figure out how it worked.

Seeing this ..


And this..


That was still no help in figuring out what was passed.

I’m sure the IDA experts could tease out the information quickly, but that is something else I still need to learn.

While working on this and needing more samples to compare I also wrote a yara rule to detect the obfuscation format. The open source one  will detect the base 64 encoding method.


This first version will search for substring as a string and only has to be found once since the value is “11” in the string.


This version will search for the “Substring” string  as bytes but allow for multiple possible values in the start point for the substring.

Well that is pretty much as far as I can go on this.

Possible future research.

Set up a vm with Sysmon and PowerShell logging enabled as suggested by Lee Holmes here and run the sample to see what the logs will show me.

Take a closer look and learn how the encryption works.



Link to Acronis Blog post
Link to Sapphire Blog post

Link to Anyrun for the extracted ransomware
Link to Anyrun for PowerShell sample
Link to tria.ge Search

Link to my Github for Files

Link for open source  yara rule for the binary
Link for open source  yara rule for finding the PowerShell script

Link for working with CyberChef Assembly


Ursa Loader and the many rabbit holes

On August 4th 2020 JAMESWT @JAMESWT_MHT posted on Twitter here about malware spam hitting Italy using ursa loader.

I mainly look at the obfuscation and this vbscript looked rather interesting. Little did I know what I was in for.

So I start by downloading the vbscript bypassing the extraction from the msi file and find this.


Let’s format this a little to be easier to read.


I also renamed some parameters to make it easier to follow since the names were too similar.

We have 3 values that get calculated and used in the decoding function. It will also take the first letter and subtract the value of val-1 from it to use later in the calculations.

After a bit of trial and error I was able to work out how these strings decoded.


After formatting the output and removing the extra “:”  which I believe are being used as a new line split point we see this.



Next we need to locate a sample that we can continue to follow along with, so we can search ANY.RUN @anyrun_app for one.

Mikhail Kasimov @500mk500 posted a link here on anyrun where you can see the request of q=1.


Then you get a response of another large encoded string.


Once decoded you get this.


For those with eagle eyes, you will notice the encoded string is different in those two screenshots but the decoded output is almost exactly the same.

You may notice the encoded section at the bottom of the screenshot labeled “wCnfg”.

After decoding it we see this.


If we format it a little better we can see the same names get repeated several times.


One other thing we will see as we scroll down the script is that it uses WMI to get various bits of system information and uses that to test if it is running in a VM of some sort.


At this point I stopped looking into it. I got the script decoded and passed on the information.

On August 7th 2020 JAMESWT @JAMESWT_MHT posted again here that it was again hitting Italy .

Then I started digging in deeper to try and understand it better. The trip down the rabbit hole was about to get rough. We have to find a sample and start from the beginning.

NOTE: There are a few sample runs that have a very explicit NSFW picture, which will be in the “no threats detected” ones because they mainly just return the picture.

So we have to start with a sample that has the MSI that gets downloaded here .

After downloading and extracting the sections using 7Zip we see the original vbscript that gets run, which we have already seen.


Next we have to try and find something that picks up where the first one left off. I chose this one here.


To get an idea of what order things get done in we need to download the pcap and follow along in order.


If we just do a filter of “http.request or http.response” we still have a lot of background noise. So let’s build a new filter just for this.
(http.request or http.response) and ip.addr ==


Now we can see which urls are getting called from the first large script.


In packet 34 we see the request and in packet 38 we see the response with a short encoded string.


Here are my decoding notes. We have the encoded, the decoded, and some notes about the use.


Our next request is packet 504 for a file named “lp1a1.bd2” and the response is in packet 7016. Looking at it we have a PK / Zip file being downloaded.


Once we decompress this we see this.


The file appears to be encoded in some way. Back to the main script.

After doing some string replacements and following the trail of the url that was called we end up with this decoding function.


So after careful study of the function I built a Windows application with 2 textboxes.

After fixing an off by 1 bug (I was adding 255 instead of 256), I finally got something besides an error or garbage.


If you look closely at the output you can see the first 2 bytes decoded to 0x4D5A (MZ).

The size of the encoded file is 0x6A6801 (6,973,441) bytes, a very large file. Due to the size it can take over 15 minutes to decode a file because of all of the string manipulation that needs to be done. So I wrote a new one that will take a file as input and write the decoded file to the same folder. This one works almost instantly.


It looks like this extracted file is a Delphi C++ with 2 exe embedded in the resources.


I’m not sure what it does.

Sha1 : 06D2E4EC20053ABDBE76E94F71966235BB9FAA56
Sha 256 : 58EA17C1572275B930A56FE1EBBF4156B84932C7F89E883994B941A6B6F7DD44
MD5 : 77ACA543DBD3D3C32A2A335975A5FB1E

Our next request is in packet 7271 and the response in packet 7273.

Another short encoded string.


Our next request is at packet 10045 for  /lp1asq.bd2 and the response is in packet 12594.


This one is encoded also. After decoding we get this.


Another PK / Zip file. And after unzipping it we get.


It appears to be an OpenSSL library.


Our next request is in packet 15083 for /lp1asl.bd2 and the response is in packet 15945


Another encoded file.


Decoded is another PK / Zip file.



Looks like another OpenSSL library.

Our next request is in packet 16013 for /lp1ass.bd2 and the response is in packet 16270.


Another Encoded file.


Another PK / Zip file after decoding.



And yet another OpenSSL library.

Our last request we have is in packet 16402 for /lp1aai.bd2 and the response is in packet 16996.


Another Encoded file.


Another Compressed file.



Now this last one in the pcap is really interesting. As you can see it is the AutoIt executable for running AutoIt scripts and compiled binaries.

This pcap seemed to be the one with the most packets. I can only assume the reason there are not more is because the sandbox ran out of time to process everything.

Although we have extracted everything we can from the pcap, we still have not gone back to map the requests/responses to the script that called them. I’ll leave that for a later exercise.

And for those interested here is the list of unique UA strings found in this pcap.

UA = Index Location: 0xE31
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko  UA End

UA = Index Location: 0xC7DD
Microsoft-CryptoAPI/6.1  UA End

UA = Index Location: 0x4D4EA7
Mozilla/5.0 (Windows NT 6.1; rv:68.0) Gecko/20100101 Firefox/68.0  UA End
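
A list like the one above can be produced by scanning the raw pcap bytes for the header name. Here is a minimal Python sketch of that idea. It is not the tool that produced the output above, the function name is hypothetical, and it only sees unencrypted HTTP. It records the offset of the header itself, which may differ from how the index locations above were calculated.

```python
def find_user_agents(raw):
    """Return {user_agent: first_offset} for each User-Agent header in the bytes."""
    marker = b"User-Agent: "
    found = {}
    pos = raw.find(marker)
    while pos != -1:
        start = pos + len(marker)
        end = raw.find(b"\r\n", start)
        if end == -1:
            end = len(raw)
        ua = raw[start:end].decode("latin-1")
        # Keep only the first offset at which each unique string was seen.
        found.setdefault(ua, pos)
        pos = raw.find(marker, end)
    return found
```

Reading the file with `open(path, "rb").read()` and printing each entry with `hex(offset)` gives output in the same spirit as the list above.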

Now that that is done let’s crawl out of this rabbit hole and look into another.

On August 7th 2020 lc4m @luc4m had created their own decoder for the scripts and had extracted the urls in the vb script.

A search of “hxxp://104.44.143.]28/m/” resulted in finding an open directory. See the twitter thread here and another part of it here (that will give away what I find).

They were kind enough to send me a copy.

Looking at one of the many files in the directory, we see this.


This appears to be encoded in some way.  So on a guess I tried the tool to decode these like I did from the packets with the same extension.


And it did indeed decode to a PK / Zip file.


At first I thought that this might be encrypted in some other way and posted a screenshot on Twitter, and marc ochsenmeier @ochsenmeier suggested here that it looked like an AutoIt file. I didn’t even notice the AU3 on the second line.

I then tried the AutoIt decompilers I had and they would not work. So back to Google search for a few hours.

I finally stumbled on to this post that listed the AutoIt header bytes, located here, by doing a search for the first bytes of the file.

It turns out this file is of type A3X binary format. It is a standalone compiled file that can be run with the AutoIt3.exe binary. We saw that downloaded last in the section above.

So the next question is how do we decompile this? After more time searching I found this project called “myaut_contrib” located here on github.

After downloading and extracting this to my vm and then installing it for a right click menu (which only works with an exe file extension), I opened up myAutToExe.exe, dragged and dropped the .A3X file into it, clicked on the scan file at the top, and then ran it in auto mode.

Here is the output.


We can see several files it dropped but most importantly it dropped a .au3 decompiled Script file.



After cleaning the hex string and dropping it into a hex editor we can now save the file.
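The cleaning step can also be done in a few lines of Python instead of a hex editor. This is a sketch under the assumption that the hex string may contain “0x” prefixes, quotes, commas, or line breaks. The exact layout in the decompiled .au3 is not shown here, and the function name is hypothetical.

```python
import re

def hex_text_to_bytes(text):
    """Strip prefixes, separators, and whitespace from hex text and return bytes."""
    text = re.sub(r"0[xX]", "", text)            # drop any 0x prefixes
    digits = re.sub(r"[^0-9A-Fa-f]", "", text)   # keep hex digits only
    if len(digits) % 2:
        raise ValueError("odd number of hex digits after cleaning")
    return bytes.fromhex(digits)
```

The result can be written straight to disk with `open("out.bin", "wb").write(...)` and checked for an MZ header.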


Here is what we end up with. There was no detection on VT for this since it gets loaded to memory so I uploaded a copy here .  For those that don’t have VTI access I also uploaded it to Malshare here .

That is about as far as we can go with this rabbit hole without digging into the file itself.

Let’s climb out of this one and find another one.

Googling for this loader I’m not finding a lot of information about it.

Going back to anyrun we do a search using the tag “ursa” and we find there are 3 pages of files to look at.


My interest here is to see how far back it goes and what differences there are between the older version and the new version.


So we go to the last page and scroll to the bottom and see that it was run on August 31st of 2018. So this has been around at least that long. So we will go to that here and see what we have.


This one appears to start out with an executable so let’s download this and take a closer look.


We can see here this is a dot net program. So we can decompile it. I did, and dumped it as a project to be able to do better searches for information.


If we look at form1 we see encoded strings like in the first part from the vbscript files. Decoding this we see that the vbscript was encoded and put into the dot net binary to be run from there. Strange but ok, it works.

So we next need to go to the pcap and see what comes next.

Like the first one we will set a filter of

“(http.request or http.response) and ip.addr ==” to narrow down on just the information we want.


As we can see there is not a whole lot going on in this one.

Let’s start with the first packet at 90 which is a post.


In the first example it posted q = 1 and here it is c = 55 . Is this a possible campaign ID ?

So we get the response in packet  151 and get this large encoded blob.



After decoding this we see something similar to what we have already decoded in the first part but all of the string data here was encoded also.

If we continue on to packet 157 we see a request for /njy4rs33/ny3a.php but there was no response so we move to the next request in packet 160 for /njy4rs33/m/ny337.aj6 .

Notice here the file extension is different than the more recent first one we looked at.


In this case it is a PK / Zip file.


Once we unzip it we have an executable.



I’m going to stop at this point due to the length already.

One more observation I made was that the file extension does not tell you whether a file is encoded or compressed.

Going through various samples and decoding them from the pcaps I was able to get, a file with the same extension may also be a plain executable.

So don’t trust a file extension.
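
A quick way to act on that is to check the first bytes instead of the extension. Here is a small Python sketch; the function name is hypothetical, and the “AU3!” check is an assumption based on the AU3 marker noticed near the top of the compiled AutoIt file earlier.

```python
def sniff_type(data):
    """Guess a file type from its leading bytes instead of its extension."""
    if data[:2] == b"MZ":
        return "Windows executable (MZ header)"
    if data[:4] == b"PK\x03\x04":
        return "Zip archive (PK header)"
    # Assumption: compiled AutoIt scripts carry an "AU3!" marker near the start.
    if b"AU3!" in data[:64]:
        return "AutoIt compiled script (AU3 marker)"
    return "unknown (possibly encoded)"
```

Anything that comes back as unknown is a candidate for the decoder before trying to unzip or run it.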

That’s it for this one.

I hope you learned as much as I have.


PowerShell Steganography

Any programming language that can have access to the pixels of a picture file can do a form of byte and pixel modification to hide data within the pixel bytes.

The less you modify the pixel data, the less chance that the modified file will be noticed as hiding some form of data.

To me this is more of true steganography than the types that just append an exe to the end of the picture data because it is modifying the pixel data.

The downside is you have to have some program or script to decode and extract the data which will point directly to the picture file used.

These types of picture files do not automatically run the data within, whereas those with embedded shellcode or exe files can be run by certain programs when viewed.

There are many ways that the hidden data can be obfuscated and stored in the picture file but at some point it still has to be extracted and that leaves a trail of instructions how it is done.

The first time I ran into this was in November of 2018 in this Twitter thread

So let’s just take a closer look at the part that decodes the picture file.


Here on the first and second line we see it is creating a new object for working with bitmaps and then opening the file from the internet instead of downloading then opening it.

On the next line it is getting each pixel from 0-427.

If we look at the properties of the downloaded picture we see the width is 428 pixels wide.


It will next extract the RGB values from the pixels and then do the math.

The “B” would be the “B” value and the “G” would be the “G” value of the RGB in this case.

If we take a look at the “screenshot” of the picture file it is nothing special and there is no real indication that it is hiding anything. (I didn’t want to add the real encoded file here.)


So we need to open the file, extract each pixel, decode them using the function in the PowerShell, and then output the decoded string. I have seen several different ways of encoding the pixel data; this is only 1 of them.

As usual I have built a tool to do this the easy way.

One more thing we need is the string length from the output so we are not also outputting the extra garbage data. We can get that from the GetString call, which uses a range of 0 to 1907.
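To make the pixel math concrete, here is a hedged Python sketch of one common scheme: the low 4 bits of the blue value become the high half of a byte and the low 4 bits of the green value become the low half. That exact formula is my assumption about this sample, since the screenshot of the PowerShell is not reproduced here. The function works on a plain list of (R, G, B) tuples, so reading the image itself (with an imaging library, for example) is left out.

```python
def decode_pixels(pixels, out_len):
    """Rebuild hidden bytes from (R, G, B) pixels, one byte per pixel."""
    data = bytearray()
    for _r, g, b in pixels:
        # Assumed scheme: low nibble of blue is the high half of the byte,
        # low nibble of green is the low half.
        data.append(((b & 0x0F) << 4) | (g & 0x0F))
    # Trim the garbage that pads out the rest of the image.
    return bytes(data[:out_len])
```

Only the low bits of two channels change, which is why the picture still looks normal to the eye.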

Select the file, Input the output length and click a button.


Dealing with the output is another matter.

This sample uses a function that will reverse a string, then it will do several char replacements before the final decoding.

Here it is after the reverse string.


This is what most of the samples I’ve looked at do. They have more layers of encoding usually from Invoke-Obfuscation or a similar tool.

The next question is where did this picture encoding come from ?

It came from here. We also find an entry in the MITRE ATT&CK framework here.

Although the code to decode the picture file remains mostly the same the variables are usually all different including the height and width of the picture file and the variable names for the function calls.

The tool to extract the data can be found on my Github here

That’s it for this one.


Extracting Shellcode from VBA to PowerShell

This post will revolve around using my tools to extract the vba code, clean a base64 string that is exploded into multiple lines, decode it to a PowerShell script, extract the shellcode from the script, and get the IP/Url from the shellcode.

The Twitter link where this came from can be found Here . The file we will be looking at is found Here.

The first thing we need to do is get a copy of the vba from the site.


We can click on the copy content button in the upper right hand corner to copy it to the clipboard, then we can paste it into our favorite text editor.


Just by this it appears to build a PowerShell script.


Here at the bottom of the script we can see that “stringFinal” is the rebuilt powershell script that will base 64 decode to “Something”. It will run the powershell with shell.

The next question is how do we easily rebuild this base64 string.

In this Link to twitter I was asking people about a Reg-X solution. There were several replies and even a method to make some changes and let it extract itself.

This post is aimed at statically decoding with  just my tools. It is just a way to demonstrate how the tools work.

So since the strings are reassembled in order , rather than reassembling by hand we can use Reg-X to clean the base64 string to be able to decode it without having to run it.

If you view the link above you can see part of the thread where Malwrologist @DissectMalware has some screenshots on how to reassemble the base64 string using Notepad ++ and 2 different regular expressions to do the job which was my original goal.


There are also several different suggestions in that thread.

Recently I had built a new tool that does Reg-X replace for a script. The twitter link for that is Here.

There they used Reg-X to decode strings.

So let’s try this new tool using the 2 step process.


Using a combination of Reg-X patterns we start with 

string[0-9]{1,} = \”

It will clear the name with the number through the first “.


So now we take the output from the first Reg-X replace and put it into the input for the next round using this pattern.


That will clean the end “ and the newlines thus reassembling the base64 string. It will leave the final “ in the string so that will need to be removed before inputting this in to the base64 decoder.
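The same two step replace can be sketched in Python for anyone without the tool. The first pattern is the one given above. The second pattern is my assumption (a closing quote followed by a newline), since the one actually used is only shown in the screenshot. The function name is hypothetical.

```python
import base64
import re

def rebuild_base64(vba_text):
    """Two regex passes that join split VBA string assignments into one string."""
    # Pass 1 (pattern from the post): remove 'stringN = "' before each piece.
    step1 = re.sub(r'string[0-9]{1,} = "', "", vba_text)
    # Pass 2 (assumed pattern): remove the closing quote together with the newline.
    step2 = re.sub(r'"\r?\n', "", step1)
    # The final closing quote has no newline after it, so strip it separately.
    return step2.rstrip('"')
```

The cleaned string can then go to `base64.b64decode()` without ever running the macro.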


Here they are still using PowerShell to load the Hex encoded shellcode.


Finally we can highlight and copy paste just the hex encoded shellcode, click a button, and if nothing goes wrong we get the IP/Url it is calling out to.
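
For shellcode that stores its host as plain text, a rough equivalent can be written in Python. This is a sketch and not how my tool works internally. It assumes the hex is either plain pairs or “0x..” values separated by commas, and it only looks for printable IPv4 addresses and http(s) URLs. The function name is hypothetical.

```python
import re

def find_network_indicators(hex_shellcode):
    """Decode hex shellcode and pull out printable IPv4 addresses and URLs."""
    pairs = re.findall(r"(?:0x)?([0-9A-Fa-f]{2})", hex_shellcode)
    raw = bytes(int(p, 16) for p in pairs)
    # Printable ASCII runs of 4+ characters, similar to a strings utility.
    runs = [m.group().decode("ascii") for m in re.finditer(rb"[\x20-\x7e]{4,}", raw)]
    ipv4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
    url = re.compile(r"https?://\S+")
    hits = []
    for run in runs:
        hits += ipv4.findall(run)
        hits += url.findall(run)
    return hits
```

Shellcode that encodes or packs the address as raw bytes will not show up this way, so an empty result does not mean there is no callout.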

Note: This tool does not work on the types that call and load calc.exe or an executable. Those are a different format.

We can also check to see what api’s are found in the shellcode.

Notice the checkboxes up top, those can be unchecked to clean up the not found output.


With some practice this can be extracted and decoded within a few minutes.

That’s it for this one.

Thanks for reading if you got this far.



Link to original Twitter message.
Link to file.

Link to Twitter thread about Reg-X.

Link to Twitter about script the Reg-X tool was built for.

Link to Github for the Reg-X tool.
Link to Github for Remaining tools for getting the IP and API’s used.
