More on Yara And Building Rules

I’ve been learning how to build and modify YARA rules lately, but my biggest pain was getting the formatting correct.

In a recent Twitter thread here, James @James_inthe_box posted that AsyncRAT was using Pastebin to host its encoded RAT.

My repository of similar samples is now large enough that I will need more than my simple single-string search utility to search it.

I also need a way to standardize how I write the rules.

While we were all going through the sample on Twitter, Nadav Lorber @LNadav from Morphisec released a blog post here that started with the vbs dropper that led to the Pastebin links.

I just finished downloading all of the vbs hashes that I could find on either ANY.RUN (@anyrun_app) or Hybrid Analysis (@HybridAnalysis). I don’t have access to download from VirusTotal.

All of the files I could not find in the other two locations were located on VirusTotal.

There appear to have been 51 hashes to search for. The last 7 that I found wrote to a bat file for the next stage instead of PowerShell.

The remaining ones I found used various forms of obfuscation, from XORing with a long “random” string to various layouts of chr(number). They would be mixed case and would even use ChrW() for wide char / Unicode, even though the decimal values were in the ASCII range.

Let’s take a look at one and see what we are going to run into.


Here we have a GetObject call whose argument turns out to be the class ID (CLSID) of Shell.

Also in this screenshot we have a large number of “cHR(“ values with math functions.

The math functions could change drastically, so we cannot count on those.

At the bottom we have a few possible things we can use for a rule.


For this sample I’m going to go with this group of strings.


I’m choosing the CLSID because it is distinctive and the sleep as an extra value, while multiple occurrences of “&cHR(“ tell me they are trying to hide something.

So let’s take a look at the Yara Builder.


As you can see here, it is just a simple fill in the blank and click a button.

So let’s fill in the blanks and see what we get.


You may notice an extra empty $s3 = “” in there too.

With the exception of the strings section, each text box takes its input and places that value directly into the formatted output. If you leave a box empty, it will put an empty string in, as with the $s3 value.

For the strings, it splits the input on CRLF line breaks, numbers each string, and then writes them to the formatted strings section.
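The split-and-number step can be sketched in a few lines of Python. This is not the code behind the Yara Builder tool; the function name, the field names, the placeholder string values and the choice to add `nocase` and skip blank lines are all assumptions made for illustration.

```python
def build_rule(name, description, author, strings_text, condition):
    """Format a YARA rule from form-style inputs (hypothetical field names)."""
    # Split the strings box on CRLF (or bare LF) and drop blank lines so an
    # empty entry such as $s3 = "" is never emitted.
    values = [s for s in strings_text.replace("\r\n", "\n").split("\n") if s]
    lines = [f"rule {name}", "{", "    meta:"]
    lines.append(f'        description = "{description}"')
    lines.append(f'        author = "{author}"')
    lines.append("    strings:")
    for i, value in enumerate(values, start=1):
        # Escape backslashes and double quotes for a YARA text string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}" nocase')
    lines.append("    condition:")
    lines.append(f"        {condition}")
    lines.append("}")
    return "\n".join(lines)


# Placeholder values only; the real rule uses the strings from the sample.
rule_text = build_rule(
    name="vbs_dropper_example",
    description="example only",
    author="analyst",
    strings_text="WScript.Sleep\r\n&cHR(\r\n",
    condition="all of them",
)
print(rule_text)
```

Dropping blank lines before numbering is one way to avoid the stray empty string seen in the first output.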

And just in case everyone was wondering what that large group of char codes decode to we have this.


More char codes and PowerShell, go figure.

So our yara output now looks like this.


Now that is a decent start on our formatting, but what can we do to improve it with the limited amount of usable code available?


On a test on Hybrid Analysis this version throws an error. Can you see it ?

I left a space between “with” and “CLSID”, so now we know HA doesn’t like spaces in the rule name either.

The space has been fixed in the final version.

And what does it return ?


Two of the files already on our target list.

After looking at several of the other downloaded files, we see the Chr( calls are spaced differently. I’m not sure yet if there is an easier way to do this, so we have the 4 different versions.

If we wanted to catch the ChrW versions we would also need to add that to our rule.

One thing that kept tripping me up was trying to put a dash in the rule name. YARA does not like that, but underscores are OK.
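One possible way to cover the spacing variants and ChrW with a single pattern is a case-insensitive regular expression. The sketch below tests the idea in Python; the pattern and function name are my own, and a YARA regex string along the lines of `/&\s*chrw?\s*\(/ nocase` would be the equivalent to try in the rule itself.

```python
import re

# "&", optional whitespace, "Chr" or "ChrW" in any case, optional whitespace,
# then the opening parenthesis.
CHR_CALL = re.compile(r"&\s*chrw?\s*\(", re.IGNORECASE)


def count_chr_calls(script_text):
    """Count how many concatenated Chr()/ChrW() calls appear in the script."""
    return len(CHR_CALL.findall(script_text))


sample = 'x = "a"&cHR(65)& Chr (66) &CHRW( 67)& chr(68)'
print(count_chr_calls(sample))
```

A count threshold (for example, more than a handful of hits) is what signals that the script is hiding something, since a single Chr() call is common in benign scripts.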

Another thing is the lower case section names and the keywords.

Every time I mistakenly uppercased them, it would throw an error.

That is it for this one.
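The naming problems above (spaces and dashes in the rule name) can be caught before a rule is ever submitted. YARA identifiers follow C-style rules: letters, digits and underscores, not starting with a digit, and at most 128 characters. A small Python check, with hypothetical helper names, might look like this. It does not check for reserved keywords.

```python
import re

# C-style identifier, limited to 128 characters as YARA requires.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")


def is_valid_rule_name(name):
    """Return True if the name is usable as a YARA rule identifier."""
    return IDENTIFIER.fullmatch(name) is not None


def sanitize_rule_name(name):
    """Replace anything YARA would reject (spaces, dashes, ...) with '_'."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned[:128]


print(is_valid_rule_name("vbs-dropper with CLSID"))
print(sanitize_rule_name("vbs-dropper with CLSID"))
```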

I hope it is helpful to someone.


Link to Twitter thread

Link to the blog post

Link to my GitHub with the tool.


SunCrypt, PowerShell obfuscation, shellcode and more yara

This didn’t start as a blog post. It started as a conversation with Hari Charan @grep_security about something they were looking at called SunCrypt ransomware.

Looking up the name I ran across a couple of interesting blog posts, one by Sapphire here and one by Acronis here. Seeing that this was obfuscated PowerShell piqued my interest.

Searching for some samples to work with also revealed that you can do a tag search on tria.ge for “family: suncrypt” (without the space).


The PowerShell loader we are going to use here is the one from the Acronis blog post with a hash of MD5: d87fcd8d2bf450b0056a151e9a116f72. There are multiple copies available for that hash, including 3 copies on tria.ge here.

Hari Charan @grep_security also pointed me to a couple of  open source yara rules to search for the PowerShell loaders.

This one appears as though it will search for the ransomware binary here and this one will search for the PowerShell script here .

Let’s take a look at some of the encoding.



If we look at this part, it takes 3 values, assembles them, and then base64 decodes the result to bytes.

But it will also do something to the strings before it reassembles them.


We can see the first string is passed to a function that reads it right to left, basically just reversing the string.


If we look at the second string, it is getting a substring of what is there, starting at index 16 and taking 2000 characters.


The encoded string is actually 2032 characters long before we get the substring.

The final string is just another reversed string.

Then we just have a long base64 string after reassembling the pieces.


Remember we still have to convert this to bytes, and it will get loaded into memory using VirtualAlloc.



Looking at the bytes in a hex editor, we cannot see anything that makes any sense.

The next step is to drop this into CyberChef here and view the assembly.


This is also where I hinted on Twitter of a “Somewhat useful tool” which will be on my Github.

If we look down further we see more API calls.


And even further down we see a different type of string building using a “push pop”. I have not made a tool for that yet.


Although we cannot tell for sure how this is used when doing this statically, the API calls can give some clues as to what it will be doing.

What started all of this was when I was trying to write a yara rule to find more samples to test this tool with and look for any outliers that would break it or not be what I was looking for.


I’m still learning yara and this version just looked for the format of the “MOV BYTE PTR”.

I ended up with over 552 hits for this and many false positives. I knew I needed to find something to rule out some of the values that did not return strings or would return either encoded or garbage-looking strings.

After several hours of trial and error I ended up with this.


That reduced it down to 214 hits. It ended up being shellcode and binary samples that used that format. I’m sure there are a few more samples in that mix that would be false positives but it was good enough for what I wanted.

After going through that exercise I wanted to find a way to let the obfuscated PowerShell self-decode. So I started by looking for a way to just let it reassemble the base64 string and then write that to a file.


The template part is the path variable and the pipe out to file. But you have to remember to remove the “[Byte[]]” part and the “[System.Convert]::FromBase64String” from each one you wanted to rebuild and just dump to a text file for further processing of the base64 string.

So I then went back and searched for how to just output to a binary file, since that is what we ultimately wanted anyway.


The variable for the path can be the same but instead of pipe to write file / text we add the line with the System IO and make sure we have the variable name the same as in the extracted PowerShell.

Moving on to the large base64 string.


Using Notepad++ we notice the highlighted area is all 1 section. You may also notice the extra parameter name right after the join.


Searching for that value we find it all the way up right after the code for the shellcode reassembling.

So when we go to use the self decode trick we need from here all of the way to the end of the highlighted area to be sure we have all of the needed parameters to rebuild the base64 string before it gets decoded to hex/binary data.

Once we drop this into our wrapper and verify we have the proper output name set we can then just input it into the PowerShell ISE and run it and it will output our binary file for the next step.



Now the first four bytes of this output appear to be the length of the remaining bytes in the output. These will need to be removed for the next step.


Here we see it is a 32 bit binary with a timestamp of 9/18/2020, although the created date shows the file was assembled today.
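Removing the length prefix and confirming it matches the remaining data can be sketched as follows. The little-endian byte order is an assumption on my part (it is the usual layout on Windows), and the function name is hypothetical.

```python
def strip_length_prefix(blob):
    """Split off a 4-byte length prefix and report whether it matches."""
    # Assumption: the length is stored as a little-endian 32-bit integer.
    declared = int.from_bytes(blob[:4], "little")
    body = blob[4:]
    return body, declared == len(body)


# Tiny stand-in for the dumped file: a 4-byte length followed by 12 bytes.
fake_body = b"MZ" + bytes(10)
fake_dump = len(fake_body).to_bytes(4, "little") + fake_body
body, length_ok = strip_length_prefix(fake_dump)
print(body[:2], length_ok)
```

If the flag comes back False, the first four bytes are probably something other than a length and should be looked at before they are thrown away.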

If we look at the Unicode strings we can see that file extension strings are not obfuscated or hashed like the other blog post showed.


One of the next things I was looking for is how to extract the ransom note.

The other blog post gives us clues about what we are looking for, so let’s look at the file in a hex editor.


There is a very distinctive string that begins with “11”. As it turns out, “0x11” is the xor key.

One of the other samples used 0x13 for the xor key.
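The long runs of “11” are what you get when zero bytes are XORed with the key, so the most frequent byte in the encoded region is a reasonable first guess for a single-byte key. A short Python sketch of that heuristic follows; the function name is mine, and the guess only holds when the plaintext really does contain more zero bytes than anything else.

```python
from collections import Counter


def guess_xor_key(data):
    """Guess a single-byte XOR key as the most common byte in the data."""
    # 0x00 ^ key == key, so zero padding shows up as runs of the key itself.
    value, _count = Counter(data).most_common(1)[0]
    return value


# Fake encoded region: text surrounded by zero padding, XORed with 0x11.
plain = b"note" + bytes(40) + b"text"
encoded = bytes(b ^ 0x11 for b in plain)
print(hex(guess_xor_key(encoded)))
```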

If we scroll down to the end we can see clearly where this section will end.



If we keep scrolling down while we still have multiple “11” values we get to this.


If we xor that by 0x11 we get this.
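The decode itself is a one-liner. A minimal Python version, with a hypothetical function name and a made-up four-byte input rather than bytes from the sample:

```python
def xor_bytes(data, key=0x11):
    """XOR every byte with a single-byte key (0x11 in this sample)."""
    return bytes(b ^ key for b in data)


# Illustrative input only: these four bytes decode to ASCII text.
print(xor_bytes(bytes.fromhex("79656561")))
```

Because XOR is its own inverse, the same function both encodes and decodes, and a sample that uses 0x13 only needs a different `key` argument.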


Next I upped this to Anyrun here because I could not figure out at the time where the ip was coming from.


One of the last pieces of this puzzle is that it does a post request with some encoded data.


If we look at the data that gets dumped from the packet we see this.


So as a guess I checked to see if it had a single byte xor key and to my surprise it did.


The same one as the rest to decode with, 0x11.


Does this passed hex value look familiar? It is from the section where the IP was extracted.

What is it? I do not know. If someone does please let me know.

One other thing while I was not initially able to find the IP, I dropped this into IDA to see if I could figure out how it worked.

Seeing this ..


And this..


Was still no help to figure out what was passed.

I’m sure the IDA experts could tease out the information quickly, but that is something else I still need to learn.

While working on this and needing more samples to compare I also wrote a yara rule to detect the obfuscation format. The open source one  will detect the base 64 encoding method.


This first version will search for substring as a string and only has to be found once since the value is “11” in the string.


This version will search for the “Substring” string  as bytes but allow for multiple possible values in the start point for the substring.

Well, that is pretty much as far as I can go on this.

Possible future research.

Set up a vm with Sysmon and PowerShell logging enabled as suggested by Lee Holmes here and run the sample to see what the logs will show me.

Take a closer look and learn how the encryption works.



Link to Acronis Blog post
Link to Sapphire Blog post

Link to Anyrun for the extracted ransomware
Link to Anyrun for PowerShell sample
Link to tria.ge search

Link to my Github for Files

Link for open source  yara rule for the binary
Link for open source  yara rule for finding the PowerShell script

Link for working with CyberChef Assembly


Ursa Loader and the many rabbit holes

On August 4th 2020 JAMESWT @JAMESWT_MHT posted on Twitter here about malware spam hitting Italy using ursa loader.

I mainly look at the obfuscation and this vbscript looked rather interesting. Little did I know what I was in for.

So I start by downloading the vbscript bypassing the extraction from the msi file and find this.


Let’s format this a little to be easier to read.


I also renamed some parameters to make it easier to follow, since the names were too similar.

We have 3 values that get calculated and used in the decoding function. It will also take the first letter and subtract the value of val-1 from it to use later in the calculations.

After a bit of trial and error I was able to work out how these strings decoded.


After formatting the output and removing the extra “:”  which I believe are being used as a new line split point we see this.



Next we need to locate a sample we can continue to follow along with, so we search ANY.RUN @anyrun_app for one.

Mikhail Kasimov @500mk500 posted a link here on anyrun where you can see the request of q=1.


Then you get a response of another large encoded string.


Once decoded you get this.


Those with eagle eyes will notice the encoded string is different in those two screenshots, but the decoded output is almost exactly the same.

You may notice the encoded section at the bottom of the screenshot labeled “wCnfg”.

After decoding it we see this.


If we format it a little better we can see the same names get repeated several times.


One other thing we will see as we scroll down the script is that it uses WMI to get various bits of system information and uses that to test whether it is running in a VM of some sort.


At this point I stopped looking into it. I got the script decoded and passed on the information.

On August 7th 2020 JAMESWT @JAMESWT_MHT posted again here that it was again hitting Italy .

Then I started digging in deeper to try and understand it better. The trip down the rabbit hole was about to get rough. We have to find a sample and start from the beginning.

NOTE: There are a few sample runs that have a very explicit NSFW picture, which will be in the “no threats detected” ones because they mainly just return the picture.

So we have to start with a sample that has the MSI that gets downloaded here .

After downloading and extracting the sections using 7Zip, we see the original vbscript that gets run, which we have already seen.


Next we have to try and find something that picks up where the first one left off. I chose this one here.


To get an idea of what order things get done in we need to download the pcap and follow along in order.


If we just do a filter of “http.request or http.response” we still have a lot of background noise. So let’s build a new filter just for this.
(http.request or http.response) and ip.addr ==


Now we can see which urls are getting called from the first large script.


In packet 34 we see the request and in packet 38 we see the response with a short encoded string.


Here are my decoding notes. We have the encoded, the decoded, and some notes about the use.


Our next request is packet 504 for a file named “lp1a1.bd2” and the response is in packet 7016. Looking at it, we have a PK / Zip file being downloaded.


Once we decompress this we see this.


The file appears to be encoded in some way. Back to the main script.

After doing some string replacements and following the trail of the url that was called we end up with this decoding function.


So after careful study of the function I built a Windows application with 2 textboxes.

After fixing an off-by-one bug (I was adding 255 instead of 256), I finally got something besides an error or garbage.


If you look closely at the output you can see the first 2 bytes decoded to 0x4D5A (MZ).

The size of the encoded file is 0x6A6801 (6,973,441) bytes, a very large file. Due to the size it can take over 15 minutes to decode a file because of all of the string manipulation that needs to be done. So I wrote a new one that will take a file as input and write the decoded file to the same folder. This one works almost instantly.
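The off-by-one fix mentioned above is a common trap in byte-wise decoders. The actual decoding function is only visible in the screenshot, so the following Python is a generic illustration of the wrap-around step and not a reconstruction of the malware's routine; the function name and the subtraction scheme are assumptions.

```python
def unshift_byte(encoded, key):
    """Undo a byte-wise additive shift, wrapping negative results."""
    value = encoded - key
    if value < 0:
        # A byte has 256 possible values (0..255), so the wrap is +256.
        # Adding 255 here would leave every wrapped byte one too low.
        value += 256
    return value


print(hex(unshift_byte(0x10, 0x20)))
```

The same result can be written as `(encoded - key) % 256`, which avoids the branch entirely.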


It looks like this extracted file is a Delphi C++ with 2 exe embedded in the resources.


I’m not sure what it does.

Sha1 : 06D2E4EC20053ABDBE76E94F71966235BB9FAA56
Sha 256 : 58EA17C1572275B930A56FE1EBBF4156B84932C7F89E883994B941A6B6F7DD44
MD5 : 77ACA543DBD3D3C32A2A335975A5FB1E

Our next request is in packet 7271 and the response in packet 7273.

Another short encoded string.


Our next request is at packet 10045 for  /lp1asq.bd2 and the response is in packet 12594.


This one is encoded also. After decoding we get this.


Another PK / Zip file. And after unzipping it we get.


It appears to be an OpenSSL library.


Our next request is in packet 15083 for /lp1asl.bd2 and the response is in packet 15945


Another encoded file.


Decoded is another PK / Zip file.



Looks like another OpenSSL library.

Our next request is in packet 16013 for /lp1ass.bd2 and the response is in packet 16270.


Another Encoded file.


Another PK / Zip file after decoding.



And yet another OpenSSL library.

Our last request we have is in packet 16402 for /lp1aai.bd2 and the response is in packet 16996.


Another Encoded file.


Another Compressed file.



Now this last one in the pcap is really interesting. As you can see, it is an AutoIt executable for running AutoIt scripts and compiled binaries.

This pcap seemed to be the one with the most packets. I can only assume the reason there are not more is because the sandbox ran out of time to process everything.

Although we have extracted everything we can from the pcap, we still have not gone back to map the requests and responses to the script that called them. I’ll leave that for a later exercise.

And for those interested, here is the list of unique UA strings found in this pcap.

UA = Index Location: 0xE31
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko  UA End

UA = Index Location: 0xC7DD
Microsoft-CryptoAPI/6.1  UA End

UA = Index Location: 0x4D4EA7
Mozilla/5.0 (Windows NT 6.1; rv:68.0) Gecko/20100101 Firefox/68.0  UA End
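A list like the one above can be produced by scanning the raw capture bytes for the `User-Agent:` header and recording where each unique value first appears. This Python sketch is not the tool that produced the list; it assumes the traffic is unencrypted HTTP and that a byte offset into the file is what “Index Location” refers to.

```python
def find_user_agents(raw):
    """Map each unique User-Agent value to the offset of its first header."""
    marker = b"User-Agent: "
    seen = {}
    pos = raw.find(marker)
    while pos != -1:
        end = raw.find(b"\r\n", pos)
        if end == -1:
            end = len(raw)
        value = raw[pos + len(marker):end].decode("latin-1")
        seen.setdefault(value, pos)   # keep only the first location
        pos = raw.find(marker, end)
    return seen


# Stand-in for bytes read from a capture file.
capture = (b"GET / HTTP/1.1\r\nUser-Agent: A\r\n\r\n"
           b"xxUser-Agent: B\r\nUser-Agent: A\r\n")
for agent, offset in find_user_agents(capture).items():
    print(f"UA = Index Location: 0x{offset:X}\n{agent}")
```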

Now that that is done let’s crawl out of this rabbit hole and look into another.

On August 7th 2020 lc4m @luc4m had created their own decoder for the scripts and had extracted the urls in the vb script.

A search of “hxxp://104.44.143.]28/m/” resulted in finding an open directory. See the Twitter thread here and another part of it here (that will give away what I find).

They were kind enough to send me a copy.

Looking at one of the many files in the directory, we see this.


This appears to be encoded in some way.  So on a guess I tried the tool to decode these like I did from the packets with the same extension.


And it did indeed decode to a PK / Zip file.


At first I thought that this might be encrypted in some other way and posted a screenshot on Twitter, and marc ochsenmeier @ochsenmeier suggested here that it looked like an AutoIt file. I didn’t even notice the AU3 on the second line.

I then tried the AutoIt decompilers I had and they would not work. So back to Google search for a few hours.

By doing a search for the first bytes of the file, I finally stumbled on to this post that listed the AutoIt header bytes, located here.

It turns out this file is of type A3X binary format. It is a standalone compiled file that can be run with the AutoIt3.exe binary. We saw it downloaded last in the section above.

So the next question is how do we decompile this. After more time searching I found this project called “myaut_contrib” located here on GitHub.

After downloading and extracting this to my VM and then installing it for a right-click menu (which only works with an exe file extension), I opened up myAutToExe.exe, dragged and dropped the .A3X file into it, clicked on the scan file at the top, and then ran it in auto mode.

Here is the output.


We can see several files it dropped but most importantly it dropped a .au3 decompiled Script file.



After cleaning the hex string and dropping it into a hex editor, we can now save the file.
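The clean-up step can also be scripted. The sketch below, in Python with a hypothetical function name, strips `0x` prefixes and every non-hex character and converts what is left to bytes. It assumes you paste in only the hex literal lines; any stray letters a to f from variable names would be kept as data.

```python
import re


def hex_text_to_bytes(text):
    """Strip 0x prefixes, quotes, '&', '_' and whitespace, then unhexlify."""
    text = re.sub(r"0[xX]", "", text)
    digits = re.sub(r"[^0-9A-Fa-f]", "", text)
    if len(digits) % 2:
        raise ValueError("odd number of hex digits, something was cut off")
    return bytes.fromhex(digits)


# Made-up fragment in the style of a hex string split across script lines.
fragment = '"0x4D5A" & _\n"9000"'
print(hex_text_to_bytes(fragment))
```

Writing the returned bytes to disk with `open(path, "wb")` replaces the manual paste into a hex editor.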


Here is what we end up with. There was no detection on VT for this since it gets loaded to memory so I uploaded a copy here .  For those that don’t have VTI access I also uploaded it to Malshare here .

That is about as far as we can go with this rabbit hole without digging into the file itself.

Let’s climb out of this one and find another one.

Googling for this loader I’m not finding a lot of information about it.

Going back to anyrun, we do a search using the tag “ursa” and we find there are 3 pages of files to look at.


My interest here is to see how far back it goes and what differences there are between the older version and the new version.


So we go to the last page and scroll to the bottom and see that it was run on August 31st of 2018. So this has been around at least that long. We will go to that here and see what we have.


This one appears to start out with an executable so let’s download this and take a closer look.


We can see here this is a dot net program. So we can decompile it. I did, and dumped it as a project to be able to do better searches for information.


If we look at form1 we see encoded strings like in the first part from the vbscript files. Decoding this we see that the vbscript was encoded and put into the dot net binary to be run from there. Strange but ok, it works.

So we next need to go to the pcap and see what comes next.

Like the first one we will set a filter of

“(http.request or http.response) and ip.addr ==” to narrow down on just the information we want.


As we can see there is not a whole lot going on in this one.

Let’s start with the first packet at 90 which is a post.


In the first example it posted q = 1 and here it is c = 55 . Is this a possible campaign ID ?

So we get the response in packet  151 and get this large encoded blob.



After decoding this we see something similar to what we have already decoded in the first part but all of the string data here was encoded also.

If we continue on to packet 157 we see a request for /njy4rs33/ny3a.php but there was no response so we move to the next request in packet 160 for /njy4rs33/m/ny337.aj6 .

Notice here the file extension is different than the more recent first one we looked at.


In this case it is a PK / Zip file.


Once we unzip it we have an executable.



I’m going to stop at this point due to the length already.

One more observation I made was that the file extension does not tell you whether the file is encoded or compressed.

Going through various samples and decoding them from the pcaps I was able to get, a file with the same extension may also be a plain executable.

So don’t trust a file extension.
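Checking the first bytes is a more reliable test than the extension. A minimal Python sketch covering the three types seen in this post follows; the function name and the labels are mine, and looking for `AU3!` within the first 64 bytes is an assumption based on where the marker showed up in the hex view.

```python
def sniff_type(data):
    """Classify a downloaded blob by its leading bytes, not its extension."""
    if data[:2] == b"MZ":
        return "executable (MZ)"
    if data[:4] == b"PK\x03\x04":
        return "zip (PK)"
    if b"AU3!" in data[:64]:
        return "AutoIt compiled script (AU3!)"
    return "unknown or encoded"


for blob in (b"MZ\x90\x00", b"PK\x03\x04\x14\x00", b"\x8e\x11\x42\x07"):
    print(sniff_type(blob))
```

Anything that falls through to the last case is a candidate for the decoder before it is checked again.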

That’s it for this one.

I hope you learned as much as I have.


PowerShell Steganography

Any programming language that can have access to the pixels of a picture file can do a form of byte and pixel modification to hide data within the pixel bytes.

The less you modify the pixel data, the less chance that the modified file will be noticed as hiding some form of data.

To me this is more true steganography than the types that just append an exe to the end of the picture data, because it is modifying the pixel data.

The downside is you have to have some program or script to decode and extract the data, which will point directly to the picture file used.

These types of picture files do not automatically run the data within, whereas those with embedded shellcode or exe files can be run by certain programs when viewed.

There are many ways that the hidden data can be obfuscated and stored in the picture file but at some point it still has to be extracted and that leaves a trail of instructions how it is done.

The first time I ran into this was in November of 2018 in this Twitter thread

So let’s just take a closer look at the part that decodes the picture file.


Here on the first and second lines we see it is creating a new object for working with bitmaps and then opening the file from the internet instead of downloading and then opening it.

On the next line it is getting each pixel from 0 to 427.

If we look at the properties of the downloaded picture we see it is 428 pixels wide.


It will next extract the RGB values from the pixels and then do the math.

The “B” would be the “B” value and the “G” would be the “G” value of the RGB in this case.

If we take a look at the “screenshot” of the picture file, it is nothing special and there is no real indication that it is hiding anything. (I didn’t want to add the real encoded file here.)


So we need to open the file, extract each pixel, decode them using the function in the PowerShell, and then output the decoded string. I have seen several different ways of encoding the pixel data; this is only 1 of them.

As usual I have built a tool to do this the easy way.

One more thing we need is the string length from the output so we are also not outputting the extra garbage data. We can get that from the get string with a length of 0 to 1907 .
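The exact math is only visible in the screenshot, so the following Python is a sketch of one well-known scheme that uses the B and G channels (the low four bits of blue become the high nibble of the output byte and the low four bits of green become the low nibble). Treat that formula, the row-by-row pixel order and the function name as assumptions. Pixels are passed in as plain (R, G, B) tuples so that no imaging library is needed to follow the logic.

```python
def extract_bytes(rows, length):
    """Rebuild hidden bytes from the low nibbles of each pixel's B and G."""
    out = bytearray()
    for row in rows:                      # top to bottom
        for _r, g, b in row:              # left to right, x = 0 .. width-1
            out.append(((b & 0x0F) << 4) | (g & 0x0F))
    # Cut at the known payload length so trailing image data is not emitted.
    return bytes(out[:length])


# Hide three bytes in a one-row "image", then pull them back out.
message = b"hi!"
row = [(0, 0xA0 | (m & 0x0F), 0x50 | (m >> 4)) for m in message]
row.append((1, 2, 3))                     # an unmodified trailing pixel
print(extract_bytes([row], len(message)))
```

The `length` argument plays the same role as the 1907 value above: without it the output would carry garbage from pixels that hold no data.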

Select the file, Input the output length and click a button.


Dealing with the output is another matter.

This sample uses a function that will reverse a string , then it will do several char replace before the final decoding.

Here it is after the reverse string.


This is what most of the samples I’ve looked at do. They have more layers of encoding usually from Invoke-Obfuscation or a similar tool.

The next question is where did this picture encoding come from ?

It came from here. We also find an entry in the MITRE | ATT&CK framework here.

Although the code to decode the picture file remains mostly the same the variables are usually all different including the height and width of the picture file and the variable names for the function calls.

The tool to extract the data can be found on my Github here

That’s it for this one.


Extracting Shellcode from VBA to PowerShell

This post revolves around using my tools to extract the VBA code, clean a base64 string that is split across multiple lines, decode it to a PowerShell script, extract the shellcode from that script, and get the IP/URL from the shellcode.

The Twitter link where this came from can be found Here . The file we will be looking at is found Here.

The first thing we need to do is get a copy of the vba from the site.


We can click on the copy content button in the upper right-hand corner to copy it to the clipboard, then we can paste it into our favorite text editor.


Just by this it appears to build a PowerShell script.


Here at the bottom of the script we can see that “stringFinal” is the rebuilt PowerShell script that will base64 decode to “Something”. It will run the PowerShell with Shell.

The next question is how do we easily rebuild this base64 string.

In this Link to twitter I was asking people about a Reg-X solution. There were several replies and even a method to make some changes and let it extract itself.

This post is aimed at statically decoding with  just my tools. It is just a way to demonstrate how the tools work.

So since the strings are reassembled in order , rather than reassembling by hand we can use Reg-X to clean the base64 string to be able to decode it without having to run it.

If you view the link above you can see part of the thread where Malwrologist @DissectMalware has some screenshots on how to reassemble the base64 string using Notepad ++ and 2 different regular expressions to do the job which was my original goal.


There are also several different suggestions in that thread.

Recently I had built a new tool that does Reg-X replace for a script. The twitter link for that is Here.

There they used Reg-X to decode strings.

So let’s try this new tool using the 2 step process.


Using a combination of Reg-X patterns we start with 

string[0-9]{1,} = \”

It will clear the name with the number thru the first “


So now we take the output from the first Reg-X replace and put it into the input for the next round using this pattern.


That will clean the end “ and the newlines, thus reassembling the base64 string. It will leave the final “ in the string, so that will need to be removed before inputting this into the base64 decoder.
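The two-pass clean-up can be sketched in Python as well. The first pattern is the one quoted above; the second pass is only shown in the screenshot, so the `"` plus newline pattern used here is my assumption, as are the function name and the tiny stand-in input.

```python
import base64
import re


def rebuild_base64(vba_text):
    """Strip the 'stringN = "' prefixes and closing quotes, join the pieces."""
    # Pass 1: remove the variable name, its number and the opening quote.
    step1 = re.sub(r'string[0-9]{1,} = "', "", vba_text)
    # Pass 2 (assumed pattern): remove each closing quote and its line break.
    step2 = re.sub(r'"\r?\n', "", step1)
    # The last line has no line break after it, so its quote is still there.
    return step2.rstrip().rstrip('"')


vba = 'string1 = "aGVs"\r\nstring2 = "bG8="'
joined = rebuild_base64(vba)
print(joined, base64.b64decode(joined))
```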


Here they are still using PowerShell to load the Hex encoded shellcode.


Finally we can highlight and copy-paste just the hex encoded shellcode, click a button, and if nothing goes wrong we get the IP/URL it is calling out to.

Note: This tool does not work on the types that call and load calc.exe or another executable; those are a different format.

We can also check to see what APIs are found in the shellcode.
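As a rough idea of what such an extraction involves, the Python sketch below converts the hex text to bytes and searches for ASCII URLs and dotted IPv4 strings. It is not how my tool works internally; the patterns and function name are assumptions, and it will miss shellcode that stores the address as four raw bytes rather than as text.

```python
import re

IPV4 = re.compile(rb"(?<![0-9.])(?:[0-9]{1,3}\.){3}[0-9]{1,3}(?![0-9.])")
URL = re.compile(rb"https?://[\x21-\x7e]+")


def find_indicators(hex_text):
    """Return ASCII URLs and IPv4 strings embedded in hex-encoded shellcode."""
    data = bytes.fromhex(re.sub(r"[^0-9A-Fa-f]", "", hex_text))
    hits = [m.group().decode("ascii") for m in URL.finditer(data)]
    hits += [m.group().decode("ascii") for m in IPV4.finditer(data)]
    return hits


# Made-up bytes with a documentation-range address embedded as text.
fake_shellcode = b"\xfc\xe8\x89\x00" + b"192.0.2.10" + b"\x00\x68"
print(find_indicators(fake_shellcode.hex()))
```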

Notice the checkboxes up top, those can be unchecked to clean up the not found output.


With some practice this can be extracted and decoded within a few minutes.

That’s it for this one.

Thanks for reading if you got this far.



Link to original Twitter message.
Link to file.

Link to Twitter thread about Reg-X.

Link to Twitter about script the Reg-X tool was built for.

Link to Github for the Reg-X tool.
Link to Github for Remaining tools for getting the IP and API’s used.


More adventures with shell code and the Shikata Ga Nai Encoder

The other day I was given a sample vbscript file by Paul Melson  @pmelson  so I could take a look at the odd shell code in it.

Here is the original script.


This starts out as a normal script running PowerShell to do a base64 decode. The next level was Gzip base64 encoded then we get to the 3rd layer.


This looks like a standard shell code loader. But when I tried to run it through my tools we discovered that it is Shikata Ga Nai encoded.


In a previous blog post Here I go in a little deeper on how this Shikata Ga Nai encoding works. I had to modify my decoder for this sample to extract the final shell code.

While trying to decode this the normal way (using CyberChef to find the start key, inputting all of the shell code bytes into my tool, and removing 1 byte / 2 chars at a time by hand until I get the decoded shell code), I was not finding the expected output.



The highlighted bytes are the key in byte order. My tool will reverse them to be used.

My next step is to fire up a VM and run the extracted shell code thru SCDbg and see if it returns anything. SCDbg only works with 32 bit shell code.

Here is what I saw.


If all I wanted to do was extract an IP then I could stop here.

This tells me it does decode but not how it decodes.

I next tried to run it thru psxray located Here but it didn’t fully decode the shell code.

The next thing I tried was blobrunner, located Here, in conjunction with x32dbg.

I then began to suspect that this may have been encoded multiple times or by more than 1 encoder.

After a Google search I found This blog post that tells me there is an option in  Msfvenom to encode the shell code with multiple rounds.

This verified my theory of multiple layers. But how does it work? How do you peel off the layers?

Let’s start by comparing the encoded to the decoded file in the hex editor.


Here we can see that the decoded and encoded files are the same up to offset 0x25.

That tells me that the encoded bytes start there.

The other question is how do I know whether I have it properly decoded. With this type of encoding it is not easy to tell just by looking at the decoded value. So I added a search on the return value of the decoder: if the result contains the bytes for “FNSTENV” (0xD97424F4), then I have hopefully found the decoded layer.
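That check is a plain byte search. A minimal Python equivalent, with hypothetical names and a made-up test buffer:

```python
# FNSTENV [ESP-0xC], the instruction the decoder stub uses to find itself.
FNSTENV_SIG = bytes.fromhex("D97424F4")


def find_fnstenv(data):
    """Return the offset of the FNSTENV bytes, or -1 if they are absent."""
    return data.find(FNSTENV_SIG)


print(find_fnstenv(bytes.fromhex("DBC0D97424F45B")))
```

A hit only says another decoder stub (or the final payload's stub) is visible. It is a hint and can fire by coincidence on a wrong alignment.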

Testing this by only inputting the bytes 0x25 onward and the start key from CyberChef we see this.


Once we click ok it will write to the output to keep from over running the decode.


Here we can see that for every 4 decoded bytes a new key is generated for the next 4 bytes.
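The rolling key can be sketched in Python. This assumes the commonly described additive-feedback form of Shikata Ga Nai, where each 4-byte block is XORed with the key and the decoded value is then added to the key for the next block. Variants exist, so treat the feedback step and both function names as assumptions rather than a description of my tool.

```python
def key_from_bytes(key_bytes):
    """Turn the 4 key bytes as they appear in the stub into an integer."""
    # The bytes are stored little-endian, which is why they get reversed.
    return int.from_bytes(key_bytes, "little")


def sgn_decode(body, key):
    """Decode one layer: XOR each dword, then add the result to the key."""
    out = bytearray()
    usable = len(body) - len(body) % 4    # ignore a trailing partial block
    for offset in range(0, usable, 4):
        enc = int.from_bytes(body[offset:offset + 4], "little")
        dec = enc ^ key
        out += dec.to_bytes(4, "little")
        key = (key + dec) & 0xFFFFFFFF    # new key for the next 4 bytes
    return bytes(out)


print(hex(key_from_bytes(bytes.fromhex("78563412"))))
```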


So we take a large enough sample from the decoded section to compare to the known decoded shell code from SCDbg and we see that it matches up with the starting offset at 0x25.

So the next theory is to take the ouput from the decoder drop it into CyberChef to get the next key and then move the output of the decoder to the input.

At this point I’m not doing any pointer math to try and figure out where the next part starts. We could also copy the bytes at offset 0x25 to a new file, then compare the output to it and see where the difference is, but I’ll do it the easier way of just adding a shift button to the tool.

It will shift / remove 1 byte (2 chars) from the left giving it a new point to start decoding at.
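
Here is a rough sketch of the decode loop and the shift idea together. It is not the code from my tool. It assumes the key feedback is additive (the new key is the old key plus the 4 bytes just decoded, truncated to 32 bits), which is one common Shikata Ga Nai layout and should be checked against the decoder stub in CyberChef. The function names are made up for illustration.

```python
import struct

FNSTENV_MARKER = bytes.fromhex("D97424F4")


def decode_layer(encoded: bytes, key: int) -> bytes:
    """XOR each little-endian dword with the key, then derive the next key.

    Assumption: next key = (key + decoded dword) truncated to 32 bits.
    """
    out = bytearray()
    usable = len(encoded) - (len(encoded) % 4)  # ignore a trailing partial dword
    for offset in range(0, usable, 4):
        (enc_dword,) = struct.unpack_from("<I", encoded, offset)
        dec_dword = enc_dword ^ key
        out += struct.pack("<I", dec_dword)
        key = (key + dec_dword) & 0xFFFFFFFF
    return bytes(out)


def try_shifts(data: bytes, key: int, max_shifts: int = 64):
    """Drop 0..max_shifts-1 leading bytes and keep every shift whose output holds the marker."""
    hits = []
    for shift in range(max_shifts):
        decoded = decode_layer(data[shift:], key)
        if FNSTENV_MARKER in decoded:
            hits.append((shift, decoded))
    return hits
```

The sketch returns every shift that produces the marker instead of stopping at the first one, because a short shift can match by accident.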


While testing this theory of inputting the output from the previous run I decided to keep a log of each layer.

The enc len was a guess based on the first 2 outputs that I never went back to verify against the numbers from the CyberChef results.

Each layer needed to be shifted 20+ times before it found the decoded output when you just input the output.

One odd thing I found around level 5-6 was it pops up the found message at 12 shifts but the CyberChef looks a little strange compared to the others and the first decoded bytes were not in the known decoded file like before.


After some trial and error I decided to try and continue shifting and see if the message popped up again. It did at around 28 shifts.


In the short shifts we don’t have a “MOV” to get the key but we do in the 28 shift one.

Continuing on, this process was repeated for 10 rounds: dropping each round of decoded output bytes into CyberChef to retrieve the next key and then shifting until we get the message that there is another round or the decoded shell code was found.


We can verify that it decoded correctly by dropping it in the other tools.

So if we look up the term “Shikata Ga Nai” we see it means “it cannot be helped” or “nothing can be done about it”.

Well, something can be done about it now that we have a better understanding of how it works.

After writing this post I decided to move the code that did the checking and pop up a message box after the code that writes the output to the output text box. That way it will already be present when the message box pops up so a screenshot can be taken then.

I also considered adding a counter to keep track of the shifts but decided not to for now.

That’s it for this one, thanks for following along. I hope you found it as interesting to read as I did to research it.

Hash for Script
Sha1 : 94659C6520CFBDCA3CFECDA7781CED15659B0687
Sha 256 : 8B5366D58D00CBA37DB8D1E1CCDD1C767F730EA197476A736EB8FEED43B8FCBC
MD5 : 2B0324C016BD023EC1405007A7DCD6A1

Hash for Shell code

Sha1 : 8E1B4CFC4B1146C332AE6D4C5F9C86C242574370
Sha 256 : F8BDB9D9CE545075F483F4F1F919560EEE0108E313121DD2B891BFDF31A65DCE
MD5 : 9F88A4BBAFF1B8F530EE29F7226B3338



Link to VT for Shell Code
Link to Previous blog post on this type of encoding.
Link to SCDbg
Link to PSXray
Link to BlobRunner
Link to blog post on Msfvenom
Link to my Github with tools


A quick look at the current emotet encoding

I have gone thru several samples of this type of encoding today, but today’s sample will be from ExecuteMalware @executemalware located here and the Twitter reference is here.

Here we can see that only 3 of the urls are displayed.


Emotet usually has 5 urls, so where are the others?

When we check the system they are now dropping a .jse file instead of PowerShell.

Going thru my samples we find that they have used this style before. Here is 1 reference from Twitter here.

If we are in a hurry we can get the script from anyrun.

We click on the winword.exe and see this.


Then click on the more info to see this.


Find and click on the jse file to see this. There could be several files to scroll thru before you see the jse file.


Although this file is labeled as a .jse file it is not “JScript Encoded” as it should be with that extension.

From previous research on the JSE and VBE encoding if the script engine does not find the header values for the encoding then it will attempt to run it as a normal script.

The script is unformatted to start with so let’s pretty it up and take a look at the parts we need.

The way this works is that it takes the array at the top, rotates it a set number of positions, and builds a new array. In the screenshot we see the value that it will use highlighted.


This has looked pretty much the same in every version of this I have worked with.

The difference is that sometimes they will also “\x” hex encode everything to make it more difficult to tell what it is doing.

The next thing we need to look for is the index and function that will be replaced when the script is run.


If we just see the function name and an index value then the array will usually just get base64 decoded. Then from the new array that was built it will use this index in the array to do the replacement.

In this case there are 2 values, and from experience I know the second is an RC4 key for decoding the base64 and RC4 encoded values.

Here is the decode function, which lets us verify whether it uses a Mod ( % 0x100 ) or an AND in the decoding.


If we scroll down they were nice enough to give us 3 of the urls in plain text.


So in order to extract the last 2 urls you need to:

1: Reorder the array using the provided value.
2: Get a list of indexes and the key values.
3: Use a for-each loop of some kind to base64 decode -> RC4 decode the value at each index, and output the results to a decoded, indexed list.
4: Locate and extract the last 2 urls.
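
The four steps above can be sketched roughly as follows. This is a minimal sketch and not the code from my tools. It assumes the array uses the standard base64 alphabet, the RC4 routine wraps with % 0x100, the key is plain ASCII, and the list of (index, key) pairs has already been pulled out of the script. All function names are made up for illustration.

```python
import base64


def rotate_left(items, count):
    """Move the first `count` entries to the end, like repeated push(shift())."""
    if not items:
        return list(items)
    count %= len(items)
    return list(items[count:]) + list(items[:count])


def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling, then XOR the data with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 0x100
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 0x100
        j = (j + s[i]) % 0x100
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 0x100])
    return bytes(out)


def decode_calls(array, rotate_by, calls):
    """Steps 1-3: rotate the array, then base64 -> RC4 decode each (index, key) pair."""
    rotated = rotate_left(array, rotate_by)
    decoded = {}
    for index, key in calls:
        raw = base64.b64decode(rotated[index])
        decoded[index] = rc4(key.encode("ascii"), raw)
    return decoded
```

Step 4 is then just a search of the decoded values for anything that starts with http.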

Using my tools let’s see how this works.

First of all, the function name for decoding is just “b(”. They are usually like “_x4E349(” or something similar.

So let’s rename this to make sure my simple tool does not get the wrong thing.


That is simple enough to make sure I don’t get something else with a lowercase b.


Now we have a list with the index number and the key value.


This tool here will do the reorder just before it tries to decode each value.

Now we can get the last 2 urls from the file.

Now that we have the decoded list, if you are really motivated you can do the replacements and get a better understanding of what the script is doing.

One old trick to keep scripts from running was to change the program associated with the script extension.

Due to the limited use of the JSE and VBE file extensions, it “may” be safe to set the default program for these extensions to Notepad.

This should stop these from running and also alert the user that something is not right.

That’s it for this one.

If you have any questions you can get me on Twitter at @Ledtech3.


Chasing malware down the rabbit hole to see where it goes.

Let’s start this journey with the blog post by Pondurance titled “777 RANSOMWARE COMBINES WITH TRICKBOT” located here.

There is not a whole lot here but it describes 2 layers of shellcode and some indicators, the first being the URL “hxxps://fearlesslyhuman[.]org”.

This URL seemed familiar but upon looking it up I was having difficulty finding very much information on it. The first search led me to Hybrid Analysis where we find this calling out to a /boot URL.


It is also only labeled as “no specific threat” .

If we scroll down we see there is a PowerShell script but it is split up in 2 areas of the strings section and no download is available for this script. So let’s just extract it from the strings section.


Here is the extracted script. If we look closely, it is GZip-compressed data that has been base64 encoded.

Let’s extract that and see what we have.
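
For anyone without a tool handy, the unpacking step looks roughly like this. A minimal sketch, assuming the blob is standard base64 wrapped around a GZip stream that holds UTF-8 text (the function name is hypothetical):

```python
import base64
import gzip


def unpack_gzip_b64(blob: str) -> str:
    """Base64 decode the blob, then GZip decompress it into script text."""
    compressed = base64.b64decode(blob)
    return gzip.decompress(compressed).decode("utf-8", errors="replace")
```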


As with the original blog post that started this run we have base64 encoded data that is also Xor’d with decimal 35.

You could probably make up a CyberChef recipe to do all of the steps I’m going to do with my tools.

Base64 decode to byte –> Xor bytes by Decimal 35 to get Clean shellcode.
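
The same two steps as a short sketch, assuming standard base64 and a single-byte XOR key of decimal 35 as described above (the helper names are made up):

```python
import base64


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same single-byte key."""
    return bytes(b ^ key for b in data)


def shellcode_from_b64(blob: str, key: int = 35) -> bytes:
    """Base64 decode to bytes, then XOR by the key to get the clean shellcode."""
    return xor_bytes(base64.b64decode(blob), key)
```

Writing the result to a file gives something that can be opened in a hex editor or loaded into SCDbg.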


One thing my tool does not extract is the latter part of the url highlighted by the red box. It can also be seen in the hex editor.


Let’s also take a look at what APIs are found in this shellcode.


So our full url that this is calling out to is fearlesslyhuman[.]org/FSkX .

Unfortunately this does not contain the next level of download. The hunt continues.

Our next stop is  Here to see what it can tell us.


We see 8 hits and as of 10 days ago it appears to be down. It also appears to be the Same IP in the ones that connected.

Here on Virus Total it gives us a little more information but not a lot.


Searching for our URL we see several hits.


If we go thru this list we are not finding anything special. Thanks to @James_inthe_box for locating this sample for me that is different than what we see in this list. Sample Here .

I’m not sure why it is not showing up in this list.


Looking at the traffic we can see there is more going on here than in the other sample in the list.

Let’s download the PowerShell script and extract the shellcode from this.


As we can see here this script is the same as the last one we pulled apart, so let’s extract the shellcode and see what it tells us.


Here we see it is using a “/HLnZ” path instead. Where did we see that?


As we can see here it is tagged as binary, so let’s download that and see what we have.


And the Hash Information from Anyrun

MD5     7E8AF84B1CB9E43F1A66D385A63C9EAB
SHA1     C7651DD95BA7D82D1B8593E5EB5F0454AFF8373A
SHA256     17CB1CCC53B52E0EE31514673F4962E673280E505324739873542D541200120C
SSDEEP     6144:5RAY+7omj6nn5QEj+8vnLDckHWgvgV8Cm:92oo6n5va8PMS9C8Cm

Here we find it On VirusTotal by the hash with no detections.

Let’s look at this binary data in a hex editor.


As we can see here it starts with an “FC” which is what shellcode normally starts with.

Let’s drop this into CyberChef and see what it tells us.


That is not good, it throws an error. Let’s just try the first part of it then.


Well that worked but still does not tell me much.

Next stop, fire up the VM and load this into SCDbg.


I’m a GUI person so I used that instead of the command line.


As in the blog post referenced in the beginning we can see many calls to API functions and SCDbg also drops 2 files for us.



We can see this is a decoded PE file from the shellcode, but it appears that parts of the PE header got stomped after it was loaded into memory.


This version that was dropped still has the “MZ” in it.


Let’s remove the decoding shellcode from the start and then we have a decoded version of the binary.

Note: this is designed to be loaded and run from the original PowerShell shellcode.

Looking at this shellcode and the resulting executable got me wondering how it gets decoded.

What can we use? Possibly a shellcode to exe utility and then load and/or run it in IDA or in X86Dbg.

After chatting with @herrcore about loading shellcode to be able to view it in a debugger, he pointed me to a tool called BlobRunner, Here. There are prebuilt binaries and the source code so you can build it yourself.

There is also a video that goes with it Here but they are using IDA Pro (The hard way) to view the blob.

At first I had trouble figuring out how to use it, so used a smaller sample file to get a feel for it.

The steps to run it.

Copy the binary for blobrunner and the shellcode into a folder on your vm.

Open a command prompt from the folder (so you don’t have to use full paths)

Type the name of the blobrunner binary, a space, and the name of the shellcode file into cmd.

In this case it will be blobrunner.exe “HLnZ.bin” (I used double quotes in case a filename has a space).

After entering the command and hitting enter we see this.


Next we open X86 Dbg and attach to the blobrunner process.


Next we go back to the command window and look for the Entry value (or you can do it first).


Then back to X86Dbg and open the Memory Map section and look for that address.


I double clicked on that address and went to the place in the CPU tab where it is.


Here we are at the beginning of the shellcode. Set a break point here then go back to the cmd window and hit a key to start it running again.

Next hit run in the debugger and it will break at that break point. You can then start stepping thru from there to see what it is doing.

Note: If you hit run in the debugger without setting the breakpoint first it will get away from you. (I’ve done it)

Going thru this there are a couple of jumps to set things up, Notice the second “Call” at “B”

Now look what happens as we step thru and make some jumps.



Notice that the assembly changed after making the jumps, and the real odd part is if you scroll back up it returns back to normal and the same as what CyberChef shows.


If we set a breakpoint at the pop ebp (0x33) and run it till there (after the loop is complete) it will self decode. You can then just select and copy all of the bytes here to a hex editor and clean out the beginning shellcode, or you can use the follow in memory map option and then dump it to file from there.


That is how to do it that way.

But I’m still curious on how the algorithm works to decode.

I then stepped thru it enough times until I was sure I had it down on what it does.


It is a little slow because it is dealing with a lot of data and also formatting the input string /hex from a text box. Large amounts of data in a text box is always slower than importing the data straight from a file and working with it that way.

And finally here is how this Works.


That 34200 is a counter that gets reduced by 4 every round.

I am assuming that it is also a length value.

And 1 final thing. Does it run in Anyrun  Here ?



That is it for this one I hope you learned as much as I did.


Anyrun Links:
Link to Good run
Link to Extracted File

Link to Download
Link to Video

Hybrid Analysis:
Link to Report

Link to the report for URL

Virus Total:
Link to Data on Url
Link to shellcode


A deeper look inside one of the new Emotet Malware Docs

The sample here comes from a quick search supplied by ANY.RUN @anyrun_app  of #emotet-doc to filter quickly on documents you want to look at. Twitter reference Here and the link to the file we are going to use Here.

One of the first things I always do is look at the file in a hex editor to verify what type of file I am dealing with. Never trust a file extension for the type of file.


As we can see here this is a “Zip” style so we can simply decompress it to get the contents out to look at them.
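
The same header check can be scripted. A small sketch, with a signature list that only covers the file types that come up in this post (the function names are hypothetical):

```python
SIGNATURES = [
    (b"PK\x03\x04", "ZIP container (for example a .docx or .docm)"),
    (bytes.fromhex("D0CF11E0A1B11AE1"), "OLE compound file (for example vbaProject.bin)"),
    (b"MZ", "Windows executable"),
]


def identify(header: bytes) -> str:
    """Match the first bytes of a file against a few well known signatures."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 8 bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))
```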

One other thing we notice when we just scroll down in the hex editor is that the script we are after can be seen clearly.


If we just search for “var” in the hex editor it will take us right to the beginning of the script and we can just copy it out without even having to run or unzip the file.

On a side note, “var” is a JavaScript keyword and would not normally be found in a vba file.

Looking at Any.Run we see this.


If the run looks like this with the “jse” then it may be this type. Unless they change it after seeing this blog post. (It wouldn’t be the first time)

If we left click on the first process Winword.exe then click on the More Info we see this.



Here we see the Jse file. Let’s click on that.

One note about the “JSE” file here. This is not a JScript Encoded file, which is what the scripting environment produces when it encodes a script and what is usually associated with this file extension.

A JSE file looks like this.



This is just plain JavaScript that is obfuscated by a tool.


Here is our script that we will be working with. So let’s download that.

Now let’s look at 1 more way to extract the file.

After we unzip the file we see this.


Select the word folder


Now let’s look at the vbaProject.bin file in a hex editor.


Here it looks like an “OLE” header, so this file is in that format.

If we try our search for “var” again we see.


Now we can stop here and just copy the entire text section to your text editor and then clean this up to leave just the script.

Let’s go 1 step further in case this gets more obfuscated later.

We can use 7Zip to extract the contents of the vbaProject.bin.


Let’s look in the UserForm1 folder.


Looking at the size of the “o” object this looks promising. If it is not very large then there will not be much in the files but we need to check them anyway to verify there is not anything useful in it.

This is a binary file so we need to look at it in a hex editor.


This is the lowest level we can go to get the script out. We could also use Office and extract it from the textbox or properties box in the VBA tools.

There are also the Decalage @decalage2 python tools Here and a few others.

Here is what the script looks like Normally.


That is too difficult to see what it is doing, so let’s do some JavaScript formatting on a “Copy” to get a better view.


This is a very distinct format that has been used for some time now, and if you understand the basics of these then they can be very easy to decode if you want to take the time to build tools to help with the boring parts.

There are basically 3 things we are looking for when we see this format. We look at the array at the top. It is usually either base64 or \x encoded, and I’ve even seen a modified base64 encoding in past samples that were not Emotet.

The next thing we are looking for is this function just below the array.


As you can see it has a push shift function and a value of 0xd3. What this will do is rotate this array 1 place that many times in a circular fashion. I’m not doing the math to be sure, but if that number happened to be the same as the count of the array it should just go back to where it started from, just as an example.
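
To settle the question about the rotation count, here is a small sketch that copies the push/shift loop. It assumes the loop runs exactly the given number of times, which is worth checking against the script since some versions adjust the counter before looping. The function name is made up.

```python
from collections import deque


def rotate_like_script(strings, times):
    """Repeat arr.push(arr.shift()) the given number of times."""
    items = deque(strings)
    for _ in range(times):
        items.append(items.popleft())
    return list(items)


# Rotating by the array length puts everything back where it started,
# so only (times % len(array)) really matters.
sample = ["a", "b", "c", "d", "e"]
print(rotate_like_script(sample, len(sample)) == sample)
```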

The last piece we need is the part with the index numbers.


This “b(‘0x0’, ‘ILb*’)” is the index and a key to decode the base64 string in the array.

If it only has the index number then it either does not need another level of decoding or  the same key is used for everything and you will have to verify /locate it.


Here is the function where the index value and the key are passed to.


Here we see the base64 decode with the atob() function and then the RC4 decode below that. This version uses “Mod 256”. You might run across some code that uses a bitwise AND with 255 (0xFF) instead, which wraps the value the same way, so just be aware.

So now that we have a pretty good understanding of how the decoding works, let’s decode this.
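
A quick sketch to show why the two forms of the RC4 index wrap are interchangeable, and why a mask of 256 (0x100) would not be. The function names are made up for illustration.

```python
def wrap_mod(value: int) -> int:
    """Wrap an RC4 index with modulo 256."""
    return value % 0x100


def wrap_and(value: int) -> int:
    """Wrap an RC4 index with a bitwise AND of 255."""
    return value & 0xFF


def wrong_mask(value: int) -> int:
    """AND with 256 keeps only bit 8, so it is not a valid wrap."""
    return value & 0x100


# Every non-negative value wraps the same way with either form.
mismatches = [v for v in range(0x300) if wrap_mod(v) != wrap_and(v)]
print(mismatches)
```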



Using this tool I had written for the Neutrino EK we extract the whole command and just the index and key and save both to separate files.


This is where that value of 0xd3 comes in at.

This tool will take the base64 string array, split it into its values, and rotate them to the proper place by that value. It will then use the list of Indexes/Keys from the last step to do the decoding of the array.


If all we want to do is extract the list of urls we could just stop here.

But let’s see about doing the replacements in the rest of the script.


Here we are lined up with the decoded and encoded.

Although we have done the replacements of the encoded file there are actually more layers.


Notice the “\x” encoded characters. Let’s fix that.
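
Undoing the “\x” encoding is a simple find and replace. A minimal sketch, assuming only two-digit \xNN escapes are present (it leaves \u escapes and anything else alone; the function name is hypothetical):

```python
import re


def decode_hex_escapes(text: str) -> str:
    """Replace every \\xNN escape with the character it stands for."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )
```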


Does that string look familiar now? The error box in Anyrun? (This screenshot was borrowed from the text report section and saved as png)


There is still 1 more trick used here that I have previously seen in the pages that led to Angler EK.


We have 1 more layer of separated values. As we can see here we have a key value pair where the “eb” is an array of variables, the key is “imKcX” and the value is “Not Supported File Format”.

We can verify that by looking at the screenshot above.

There are several places it will do this.

Anyone that has tried to step thru one of these in a debugger knows how much of a pain and long winded these can be before they finally spit out the decoded page.

And that is if there are no debugger checks to throw you a curve.

One more note. I have seen some samples in the past that used this style that have been run thru this style of encoder twice. So you may need to look close and repeat the process to get it decoded as fully as possible.

That is as far as I’m going on this.  My challenge to you is to build your tools to be able to quickly decode these also.

Mine are too fragile to release for this post.

In conclusion, once you understand the layout of how these decode you can apply that knowledge to the various “Types” you may run across.

Learn it and help make this type of encoding obsolete.


If you have any questions please contact me on Twitter at @Ledtech3.


Another Look at the Rig Exploit Kit

It has been a while since I have written up anything on this exploit kit since it had moved to the background more and I have not seen as many samples as I used to.

It has gone thru many changes since I first saw it and started learning how to disassemble it to learn how it works.

What sparked the interest this time is a series of reversing videos released by Vitali Kremez @VK_Intel and 0verfl0w @0verfl0w_ of SentinelOne . You can find the tweet reference here.

This post will focus on the .saz file from video #3 on Rig EK. @VK_Intel was kind enough to give me a copy of the saz file so I could do this write-up of an alternate method of extracting the exploit code. The method used in the video was using the debugger in Google Chrome. I personally dislike that debugger so I use a copy of IE9 in a VM when I actually need to run html/script in a debugger.

So let’s first take a look at the file in Fiddler.


There is limited traffic here to work with. We start with a redirect to the Rig EK landing page which we can see on the right. Let’s extract this page. I generally dump everything as raw to a folder and then look thru it for what I want, but Vitali demonstrates another way in the video that is more precise and only extracts what you want/need.

So here is what the landing page looks like. If you zoom in and out it appears as if everything is running together and difficult to understand.


Here it is after doing some JS Pretty / Formatting.


Now that this is formatted, you should be able to zoom in on the picture and see that there are 4 script blocks on this page. Each has its own code and the exploit it is targeting, and each attempts to download an encoded file to decode and run on the system. The downloaded file/malware will vary with the campaign.

So let’s take a closer look at each section to see what they look like.



If you look close you can tell this is base64 encoded.


In past versions of this there were string replacements in the base64 string itself, but now they seem to just put it in the string of letters for the base64 alphabet in the decoder, for whatever reason.

So all we are going to do is extract the base64 string from this section and use a different tool to do the decoding.


This section has no tricks so we can just copy and paste the base64 string into the decoder and decode it as UTF-8.

Here is what the whole thing looks like normally after base64 decoding.


Um, we can do better so let’s format it. Unfortunately the formatted version takes forever to save a screenshot of, so let’s just look at the most interesting parts of this one and I will include the decoded files and my tools on my Github.

In previous samples I have pulled apart over the years I have taken the many hours it takes to decode these by hand to see what the functions do and how they work.

This code also looks a lot like what you would have seen in Angler EK.

In this part we see the Shellcode.


Let’s copy this shellcode over to another page and then we can look at it in a hex editor.


Notice the first part of this shellcode here.


Now let’s drop this into the CyberChef x86 disassembler Here.


We can see here that there is an “Xor” by 0x84 being used. So let’s try that.


Above you can see I just decoded the entire encoded shellcode by 0x84, and where it is highlighted you can see the decoded script. So let’s extract and format it for a closer look.
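
Outside of the screenshot, the same step can be sketched as an XOR followed by a search for readable text. This is only a sketch: it assumes the whole buffer is XORed with the single byte 0x84 and that the script shows up as runs of printable ASCII. The function name and the minimum length are made up.

```python
import re


def xor_and_find_strings(data: bytes, key: int = 0x84, min_len: int = 6):
    """XOR the buffer with one byte, then list runs of printable ASCII."""
    decoded = bytes(b ^ key for b in data)
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [match.group().decode("ascii") for match in pattern.finditer(decoded)]
```

The decoder stub at the front will come out as junk because it was never encoded, so only the longer strings are worth a look.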


Does this Look Familiar to anyone ?

Basically all of this trouble to hide what is going on is boiled down to “It just passes parameters and downloads an encoded file , decodes and runs some malware”.

But what are those parameters? In some older versions they were hidden inside the insane encoding, but now it is as simple as looking at the bottom of the page.


This is the url that gets passed and the RC4 decoding key that is used with the exploit downloader.


Now let’s take a look at section-2.


As we can see here this section is shorter than the first, but there is one thing they do here that they didn’t do in the last section. They split the base64 string by using double quotes and the plus symbol. So let’s clean this up and base64 decode it to see what is happening.
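
The clean up can be done by hand, or with a few lines of code. A minimal sketch, assuming the string is split only with double quotes and plus signs and uses the standard base64 alphabet (the function name is hypothetical):

```python
import base64
import re


def join_and_decode(js_expression: str) -> str:
    """Glue the quoted pieces back together, then base64 decode as UTF-8."""
    pieces = re.findall(r'"([^"]*)"', js_expression)
    return base64.b64decode("".join(pieces)).decode("utf-8", errors="replace")
```

For example, passing in `"aGVs" + "bG8="` joins the pieces to `aGVsbG8=` and decodes to `hello`.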


Here we can see this one is using Flash to download the sample. There have been several different exploits used with Flash in order to run the final payload. So let’s take a closer look at this one.


This has several scripts in it that do various things, but what catches my interest is this familiar looking shellcode that I have not seen previously in a Rig EK flash file.



Notice anything here ? It looks pretty much like the first one.


So in this case instead of a flash exploit they are using this one ? I’m not sure but the people that can ID exploits used will have to determine that.

So this section will pass what parameters ?


If we look at the traffic, the only thing we have is the traffic associated with the flash download.

In this case the first part of this will be where it will call out to download the flash file from.

The second part after the highlighted function name is where it will download the encoded file/malware from and the last part is the key that will be used to decode the downloaded file.

There are 2 more sections that will decode pretty much like the first, and I will include them in their own sub folder at each stage of the decoding.

But this is as far as we can go with this sample.
Since this does not have the frame with the encoded malware let’s try another sample from “”  located Here.

New Sample

So if we open the pcap and set a filter of “http.request or http.response” we see the same familiar pattern as in the .saz file.


But in this case we have an extra download after the flash file download.

So let’s set a filter for the landing page and see what frame/packet number it is in and extract it.


As it turns out it is packet-112.

Let’s take a quick look at the flash file since we can see that it gets downloaded.


As we can see here, the flash in this one is more “old style” and it will do some decoding inside of the flash code.

One thing to note is that if you attempt to just use the “Export all parts” then it does not properly export with this version of FFDec and you have to painstakingly copy paste each of the scripts to a folder.

Notice the auto generated message vs the generated code in the decompiler.


As we extract and look at each script this may be using multiple exploits ?


Going thru this source I’m not seeing how this would even trigger the download based on past work with these. Perhaps I’m just forgetting something or missed what I was looking for?

So the final question is, how do we know which of these sections/scripts downloads the encoded binary ?

The answer is in frame 196.


It tells us the download is the result of the request in frame 132.


Now after extracting all of the requests from the sections we can compare which ones match, and that will tell us what exploit was used.


In this case it was section 4, and here is the script.


Frame 196 contains the encoded binary and after RC4 decoding  we discover that the data in this PCAP was truncated and we didn’t get the full binary.


One other thing to note is the RC4 routine here uses “And 255” where others may use “Mod 256”.


In conclusion we see that they are using a scatter approach using multiple exploits and scripts on the landing page. Perhaps just one was meant to work while the others are experiments or possibly they are used as fill to waste time of researchers or to send “false flags” on what exploit was used ? Looking at some of the code you can see it “does not work as expected” so only they can answer that question for sure.

That is it for this one. I hope I was able to show another viable way to see what is happening rather than trying to step thru in a debugger.


Link to Twitter post for the Tutorial that started this HERE.

Link to Malware Traffic page HERE

Link to my Tools used here and the files on Github HERE
