More adventures with shell code and the Shikata Ga Nai Encoder

The other day Paul Melson (@pmelson) gave me a sample VBScript file so I could take a look at the odd shell code in it.

Here is the original script.

Script-1

This starts out as a normal script running PowerShell to do a base64 decode. The next level is Gzip compressed and base64 encoded, and then we get to the 3rd layer.
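For anyone following along without my tools, the Gzip layer can be peeled in a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes the layer is plain base64 wrapped around a Gzip stream.

```python
import base64
import gzip


def peel_gzip_base64(blob: str) -> str:
    """Base64-decode a string, gunzip the result, and return it as text."""
    raw = base64.b64decode(blob)
    return gzip.decompress(raw).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Build a harmless stand-in for the inner layer, then peel it again.
    sample = base64.b64encode(gzip.compress(b"Write-Output 'layer 3'")).decode()
    print(peel_gzip_base64(sample))
```

If the output is not readable text, the layer is probably something other than Gzip and needs a closer look by hand.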

Script-2 

This looks like a standard shell code loader, but when I tried to run it thru my tools I discovered that it is Shikata Ga Nai encoded.

Script-3

In a previous blog post (Here) I go a little deeper into how this Shikata Ga Nai encoding works. I had to modify my decoder for this sample to extract the final shell code.

My normal way to decode this is to use CyberChef to find the start key, input all of the shell code bytes into my tool, and then remove 1 byte (2 hex chars) at a time by hand until I get the decoded shell code. This time I was not finding the expected decoded output.

cc-1

CC-Notes

The highlighted bytes are the key in byte order. My tool will reverse them to be used.

My next step is to fire up a VM and run the extracted shell code thru SCDbg and see if it returns anything. SCDbg only works with 32 bit shell code.

Here is what I saw.

Scdbg

If all I wanted to do was extract an IP then I could stop here.

This tells me it does decode but not how it decodes.

I next tried to run it thru PSXray (located Here), but it didn’t fully decode the shell code.

The next thing I tried was BlobRunner (located Here) in conjunction with x32dbg.

I then began to suspect that this may have been encoded multiple times or by more than 1 encoder.

After a Google search I found This blog post, which tells me there is an option in Msfvenom to encode the shell code with multiple rounds.

This verified my theory of multiple layers. But how does it work? How do you peel off the layers?

Let’s start by comparing the encoded to the decoded file in the hex editor.

shellcomp-b

Here we can see that the decoded and encoded files are the same up to offset 0x25.

That tells me that the encoded bytes start there.

The other question is, how do I know if I have it properly decoded or not? With this type of encoding it is not easy to tell by just looking at the decoded values, so I added a search on the return value of the decoder. If the result contains the bytes for “FNSTENV” (0xD97424F4) then I have hopefully found the next decoded layer.
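The check itself is tiny. Here is a rough Python equivalent (my tool is not written in Python, and the function name is only for illustration). The four bytes D9 74 24 F4 disassemble to `fnstenv [esp-0xc]`.

```python
FNSTENV = bytes.fromhex("D97424F4")  # fnstenv [esp-0xc]


def find_fnstenv(data: bytes) -> int:
    """Return the offset of the first FNSTENV byte pattern, or -1 if absent."""
    return data.find(FNSTENV)


if __name__ == "__main__":
    decoded = bytes.fromhex("90D97424F45B")
    print(find_fnstenv(decoded))
```

A hit only says the byte pattern is present, so it is a hint that a decoder stub was uncovered rather than proof.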

Testing this by only inputting the bytes 0x25 onward and the start key from CyberChef we see this.

FoundNewLayer

Once we click OK, it writes the result to the output box and stops there, to keep from overrunning the decode.

newkeys

Here we can see that for every 4 decoded bytes a new key is generated for the next 4 bytes.
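To make the rolling key concrete, here is a minimal Python sketch of one decode pass. It rests on assumptions: the data is treated as little-endian dwords, and the next key is the current key plus the dword that was just decoded. Real stubs can differ (for example, subtracting instead of adding), so read the stub in the disassembler before trusting it. `sgn_encode` is a toy encoder that exists only to exercise the decoder.

```python
import struct

MASK = 0xFFFFFFFF


def sgn_decode(data: bytes, key: int) -> bytes:
    """Decode 4 bytes at a time, deriving a new key after every dword."""
    out = bytearray()
    for offset in range(0, len(data) - len(data) % 4, 4):
        (enc,) = struct.unpack_from("<I", data, offset)
        dec = enc ^ key
        out += struct.pack("<I", dec)
        key = (key + dec) & MASK  # assumed feedback: key += decoded dword
    return bytes(out)


def sgn_encode(data: bytes, key: int) -> bytes:
    """Toy inverse of sgn_decode, used only for the round-trip check below."""
    out = bytearray()
    for offset in range(0, len(data) - len(data) % 4, 4):
        (plain,) = struct.unpack_from("<I", data, offset)
        out += struct.pack("<I", plain ^ key)
        key = (key + plain) & MASK
    return bytes(out)


if __name__ == "__main__":
    plain = bytes(range(16))
    encoded = sgn_encode(plain, 0x11223344)
    assert sgn_decode(encoded, 0x11223344) == plain
    print(encoded.hex())
```

Because each key depends on the previous decoded dword, starting even one byte off gives garbage for the rest of the buffer, which is why the start offset matters so much.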

CompBytes-2

So we take a large enough sample from the decoded section to compare to the known decoded shell code from SCDbg and we see that it matches up with the starting offset at 0x25.

So the next theory is to take the output from the decoder, drop it into CyberChef to get the next key, and then move the output of the decoder to the input.

At this point I’m not doing any pointer math to try and figure out where the next part starts. We could also copy the bytes at offset 0x25 to a new file, then compare the output to it and see where the difference is, but I’ll do it the easier way of just adding a shift button to the tool.

It will shift / remove 1 byte (2 chars) from the left giving it a new point to start decoding at.
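The shift button can be automated as a loop. The sketch below is an illustration only, not my tool: it assumes the same additive key feedback on little-endian dwords, and it uses the FNSTENV bytes as the stop condition. The function names are made up.

```python
import struct

MARKER = bytes.fromhex("D97424F4")  # FNSTENV bytes used as the "found it" check


def decode_dwords(data: bytes, key: int) -> bytes:
    """XOR each little-endian dword with the key (assumed additive feedback)."""
    out = bytearray()
    for offset in range(0, len(data) - len(data) % 4, 4):
        (enc,) = struct.unpack_from("<I", data, offset)
        dec = enc ^ key
        out += struct.pack("<I", dec)
        key = (key + dec) & 0xFFFFFFFF
    return bytes(out)


def shift_until_marker(data: bytes, key: int, max_shifts: int = 64):
    """Drop one leading byte per attempt until the decoded output holds MARKER."""
    for shift in range(max_shifts + 1):
        decoded = decode_dwords(data[shift:], key)
        if MARKER in decoded:
            return shift, decoded
    return None
```

A call such as `shift_until_marker(layer_bytes, start_key)` returns the number of shifts and the decoded bytes, or `None` if nothing was found within `max_shifts`. The first hit is not guaranteed to be the right one, so it is still worth checking the disassembly before moving on.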

Layers

While testing this theory of inputting the output from the previous run I decided to keep a log of each layer.

The enc len was a guess based on the first 2 outputs. I never went back to verify it against the numbers in the CyberChef results.

Each layer needed to be shifted 20+ times before it found the decoded output when you just input the output.

One odd thing I found around level 5-6 was it pops up the found message at 12 shifts but the CyberChef looks a little strange compared to the others and the first decoded bytes were not in the known decoded file like before.

CC-12

After some trial and error I decided to continue shifting and see if the message popped up again. It did at around 28 shifts.

cc-28

In the short shifts we don’t have a “MOV” to get the key but we do in the 28 shift one.

Continuing on, this process was repeated for 10 rounds: dropping each round of decoded bytes into CyberChef to retrieve the next key, and then shifting until we get the message that there is another round or the decoded shell code is found.

FinalDecode

We can verify that it decoded correctly by dropping it in the other tools.

If we look up the term “Shikata Ga Nai” we see it means “it cannot be helped” or “nothing can be done about it”.

Something can be done about it now that we have a better understanding of how it works.

After writing this post I decided to move the code that did the checking and pop up a message box after the code that writes the output to the output text box. That way it will already be present when the message box pops up so a screenshot can be taken then.

I also considered adding a counter to keep track of the shifts but decided not to for now.

That’s it for this one, thanks for following along. I hope you found it as interesting to read as I did to research it.

Hash for Script

SHA1: 94659C6520CFBDCA3CFECDA7781CED15659B0687
SHA256: 8B5366D58D00CBA37DB8D1E1CCDD1C767F730EA197476A736EB8FEED43B8FCBC
MD5: 2B0324C016BD023EC1405007A7DCD6A1

Hash for Shell code

SHA1: 8E1B4CFC4B1146C332AE6D4C5F9C86C242574370
SHA256: F8BDB9D9CE545075F483F4F1F919560EEE0108E313121DD2B891BFDF31A65DCE
MD5: 9F88A4BBAFF1B8F530EE29F7226B3338

 

Links:

Link to VT for Shell Code
Link to Previous blog post on this type of encoding.
Link to SCDbg
Link to PSXray
Link to BlobRunner
Link to blog post on Msfvenom
Link to my Github with tools


A quick look at the current emotet encoding

I have gone thru several samples of this type of encoding today, but today’s sample is from ExecuteMalware (@executemalware), located here, and the Twitter reference is here.

Here we can see that only 3 of the urls are displayed.

Anyrun-1

Emotet usually has 5 urls, so where are the others?

When we check the system, they are now dropping a .jse file instead of PowerShell.

Going thru my samples we find that they have used this style before. Here is 1 reference from Twitter here.

If we are in a hurry we can get the script from anyrun.

We click on the winword.exe and see this.

Anyrun-2

Then click on the more info to see this.

Anyrun-3

Find and click on the jse file to see this. There could be several files to scroll thru before you see the jse file.

anyrun-4

Although this file is labeled as a .jse file, it is not “JScript Encoded” as it should be with that extension.

From previous research on the JSE and VBE encoding if the script engine does not find the header values for the encoding then it will attempt to run it as a normal script.

The script is unformatted to start with, so let’s pretty it up and take a look at the parts we need.

The way this works is it will take the array at the top, rotate the array so many positions, and build a new array. In the screenshot we see the value that it will use highlighted.

Script-1

This has looked pretty much the same in every version of this I have worked with.

The difference is that sometimes they will also “\x” hex encode everything to make it more difficult to tell what it is doing.

The next thing we need to look for is the index and function that will be replaced when the script is run.

Script-2

If we just see the function name and an index value then the array will usually just get base64 decoded. Then from the new array that was built it will use this index in the array to do the replacement.

In this case there are 2 values, and from experience I know the second is an RC4 key for decoding the base64 and RC4 encoded values.

Here is the decode function, so we can verify whether it uses Mod ( % 0x100 ) or an AND in the decoding.

Script-3

If we scroll down they were nice enough to give us 3 of the urls plain.

Script-4

So in order to extract the last 2 urls you need to:

1: Reorder the array using the provided value.
2: Get a list of indexes and the key values.
3: Use a for each loop of some kind to base64 decode -> RC4 decode each index value, and output this to a decoded indexed list.
4: Locate and extract the last 2 urls.
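The steps above can be sketched in Python roughly as follows. This is not the tool shown in the screenshots. It assumes the rotation moves the first item to the end that many times, that the base64 is the standard alphabet, and that the RC4 is plain textbook RC4. Some variants of this obfuscator add an extra text-decoding step between the base64 and the RC4, which this sketch leaves out. All names are made up for illustration.

```python
import base64


def rotate_left(items: list, count: int) -> list:
    """Move the first item to the end `count` times."""
    if not items:
        return []
    count %= len(items)
    return items[count:] + items[:count]


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4 (the same routine encrypts and decrypts)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def decode_calls(array: list, rotation: int, calls: list) -> dict:
    """Steps 1-3: reorder, then base64 -> RC4 decode each (index, key) pair."""
    rotated = rotate_left(array, rotation)
    decoded = {}
    for index, key in calls:
        raw = base64.b64decode(rotated[index])
        decoded[index] = rc4(key.encode(), raw).decode("utf-8", errors="replace")
    return decoded
```

Step 4 is then just a search of the decoded values for anything starting with "http".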

Using my tools, let’s see how this works.

First of all, the function name for decoding  is just “b(“. They are usually like “_x4E349(” or something similar.

So let’s rename this to make sure my simple tool does not get the wrong thing.

Script-5

That is simple enough, and it makes sure I don’t pick up something else with a lower case b.

Script-6

Now we have a list with the index number and the key value.

Script-7

This tool here will do the reorder just before it tries to decode each value.

Now we can get the last 2 urls from the file.

Now that we have the decoded list, if you are really motivated you can do the replacements and get a better understanding of what the script is doing.

One of the old tricks used to keep scripts from running was to change the program that is associated with the script extension.

Due to the limited use of the JSE and VBE file extensions, it “may” be safe to set the default program for these extensions to Notepad.

This should stop these from running and also alert the user that something is not right.

That’s it for this one.

If you have any questions you can get me on Twitter at @Ledtech3.

Chasing malware down the rabbit hole to see where it goes.

Let’s start this journey with the blog post by Pondurance titled “777 RANSOMWARE COMBINES WITH TRICKBOT”, located here.

There is not a whole lot here, but it describes 2 layers of shellcode and some indicators, the first of which is the URL “hxxps://fearlesslyhuman[.]org”.

This URL seemed familiar but upon looking it up I was having difficulty finding very much information on it. The first search led me to Hybrid Analysis where we find this calling out to a /boot URL.

HA-1

It is also only labeled as “no specific threat” .

If we scroll down we see there is a PowerShell script but it is split up in 2 areas of the strings section and no download is available for this script. So let’s just extract it from the strings section.

HA-3

Here is the extracted script. If we look closely, it is Gzip compressed and base64 encoded.

Let’s extract that and see what we have.

HA-4

As with the original blog post that started this run we have base64 encoded data that is also Xor’d with decimal 35.

You could probably make up a CyberChef recipe to do all of the steps I’m going to do with my tools.

Base64 decode to bytes -> XOR the bytes with decimal 35 to get the clean shellcode.
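Those two steps also fit in a few lines of Python. This is a sketch and not my tool; the function name is made up, and the key of 35 comes from the script above.

```python
import base64


def b64_xor(blob: str, key: int = 35) -> bytes:
    """Base64-decode the string, then XOR every byte with a single-byte key."""
    return bytes(b ^ key for b in base64.b64decode(blob))


if __name__ == "__main__":
    # Round trip on a harmless stand-in: encode three bytes, then decode them.
    encoded = base64.b64encode(bytes(b ^ 35 for b in b"\xfc\xe8\x89")).decode()
    print(b64_xor(encoded).hex())
```

The result can be written to a file and opened in a hex editor or handed to SCDbg.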

HA-5

One thing my tool does not extract is the latter part of the url highlighted by the red box. Also as seen in the hex editor.

HA-6

Let’s also take a look at what APIs are found in this shellcode.

HA-7

So our full url that this is calling out to is fearlesslyhuman[.]org/FSkX .

Unfortunately this does not contain the next level of download. The hunt continues.

Our next stop is urlscan.io (Here) to see what it can tell us.

URLScan-1

We see 8 hits, and as of 10 days ago it appears to be down. It also appears to be the same IP in the ones that connected.

Here on Virus Total it gives us a little more information but not a lot.

VT-1

Searching on https://app.any.run for our URL we see several hits.

AnyRun-1

If we go thru this list we are not finding anything special. Thanks to @James_inthe_box for locating a sample for me that is different from what we see in this list. Sample Here.

I’m not sure why it is not showing up in this list.

AnyRun-2

Looking at the traffic we can see there is more going on here than in the other sample in the list.

Let’s download the PowerShell script and extract the shellcode from this.

AnyRun-3

As we can see here, this script is the same as the last one we pulled apart, so let’s extract the shellcode and see what it tells us.

AnyRun-4

Here we see it is using a “/HLnZ” path instead. Where did we see that?

AnyRun-5

As we can see here, it is tagged as binary, so let’s download that and see what we have.

AnyRun-6

And the Hash Information from Anyrun

Hashes
MD5     7E8AF84B1CB9E43F1A66D385A63C9EAB
SHA1     C7651DD95BA7D82D1B8593E5EB5F0454AFF8373A
SHA256     17CB1CCC53B52E0EE31514673F4962E673280E505324739873542D541200120C
SSDEEP     6144:5RAY+7omj6nn5QEj+8vnLDckHWgvgV8Cm:92oo6n5va8PMS9C8Cm

Here we find it on VirusTotal by the hash with no detections.

Let’s look at this binary data in a hex editor.

AnyRun-7

As we can see here it starts with an “FC”, which is what shellcode normally starts with.

Let’s drop this into CyberChef and see what it tells us.

CCError

That is not good, it throws an error. Let’s just try the first part of it then.

CC-2

Well that worked but still does not tell me much.

Next stop, fire up the VM and load this into SCDbg.

SCDBG-1

I’m a GUI person, so I used that instead of the command line.

SCDBG-2

As in the blog post referenced in the beginning we can see many calls to API functions and SCDbg also drops 2 files for us.

SCDBG-3

SCDBG-4

We can see this is a decoded PE file from the shellcode, but it appears that parts of the PE header got stomped after it was loaded into memory.

SCDBG-5 

This version that was dropped still has the “MZ” in it.

SCDBG-6

Let’s remove the decoding shellcode from the start, and then we have a decoded version of the binary.

Note: this is designed to be loaded and run from the original PowerShell shellcode.

Looking at this shellcode and the resulting executable got me wondering how it gets decoded.

What can we use? Possibly a shellcode-to-exe utility, and then load and/or run the result in IDA or in X86Dbg.

After chatting with @herrcore about loading shellcode so it can be viewed in a debugger, he pointed me to a tool called BlobRunner (Here). There are prebuilt binaries and the source code, so you can build it yourself.

There is also a video that goes with it (Here), but they are using IDA Pro (the hard way) to view the blob.

At first I had trouble figuring out how to use it, so I used a smaller sample file to get a feel for it.

The steps to run it.

Copy the BlobRunner binary and the shellcode into a folder on your VM.

Open a command prompt from the folder (so you don’t have to use full paths).

Type the name of the BlobRunner binary, a space, and the name of the shellcode file into cmd.

In this case it will be blobrunner.exe “HLnZ.bin” (I used double quotes in case a filename has a space).

After entering the command and hitting enter we see this.

BR-1

Next we open X86Dbg and attach to the blobrunner process.

BR-2

Next we go back to the command window and look for the Entry value (or you can do it first).

br-3-b

Then back to X86Dbg and open the Memory Map section and look for that address.

BR-4

I double clicked on that address and went to the place in the CPU tab where it is.

BR-7

Here we are at the beginning of the shellcode. Set a break point here then go back to the cmd window and hit a key to start it running again.

Next hit run in the debugger and it will break at that break point. You can then start stepping thru from there to see what it is doing.

Note: If you hit run in the debugger without setting the breakpoint first, it will get away from you. (I’ve done it.)

Going thru this there are a couple of jumps to set things up. Notice the second “Call” at “B”.

Now look what happens as we step thru and make some jumps.

BR-5

BR-6

Notice that the assembly changed after making the jumps. The really odd part is that if you scroll back up, it returns to normal and is the same as what CyberChef shows.

BR-8

If we set a breakpoint at the pop ebp (0x33) and run until there (after the loop is complete), it will self decode. You can then just select and copy all of the bytes here to a hex editor and clean out the beginning shellcode, or you can use the follow in memory map option and dump it to a file from there.

BR-8A

That is how to do it that way.

But I’m still curious about how the decoding algorithm works.

I then stepped thru it enough times until I was sure I had down what it does.

Tool-1 

It is a little slow because it is dealing with a lot of data and is also formatting the input hex string from a text box. Working with large amounts of data in a text box is always slower than importing the data straight from a file.

And finally, here is how this works.

HowToDecode

That 34200 is a counter that gets reduced by 4 every round.

I am assuming that it is also a length value.

And 1 final thing: does it run in Anyrun (Here)?

ExtracyedBinAnyrun

Nope.

That is it for this one. I hope you learned as much as I did.

Links

Anyrun Links:
Link to Good run
Link to Extracted File

BlobRunner:
Link to Download
Link to Video

Hybrid Analysis:
Link to Report

UrlScan.IO:
Link to the report for URL

Virus Total:
Link to Data on Url
Link to shellcode


A deeper look inside one of the new Emotet Malware Docs

The sample here comes from a quick search supplied by ANY.RUN @anyrun_app  of #emotet-doc to filter quickly on documents you want to look at. Twitter reference Here and the link to the file we are going to use Here.

One of the first things I always do is look at the file in a hex editor to verify what type of file I am dealing with. Never trust a file extension for the type of file.

Format

As we can see here this is a “Zip” style file, so we can simply decompress it to get the contents out and look at them.
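The same check can be scripted. Below is a small Python sketch that compares the first bytes of a file against a few well-known signatures. The table is deliberately short and the names are made up for illustration.

```python
SIGNATURES = {
    b"PK\x03\x04": "ZIP container (for example .docx / .docm)",
    bytes.fromhex("D0CF11E0A1B11AE1"): "OLE2 compound file (for example legacy .doc)",
    b"MZ": "PE executable",
}


def identify(header: bytes) -> str:
    """Return a label for the first matching magic value, else 'unknown'."""
    for magic, label in SIGNATURES.items():
        if header.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    print(identify(b"PK\x03\x04\x14\x00\x06\x00"))
```

In practice you would pass in the first 8 or so bytes read from the sample, for example `identify(open(path, "rb").read(8))`.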

ScriptinFile

If we just search for “var” in the hex editor it will take us right to the beginning of the script and we can just copy it out without even having to run or unzip the file.

On a side note, “var” is a JavaScript keyword and would not normally be found in a VBA file.

Looking at Any.Run we see this.

Anyrun-1

If the run looks like this with the “jse” then it may be this type. Unless they change it after seeing this blog post. (It wouldn’t be the first time)

If we left click on the first process Winword.exe then click on the More Info we see this.

Anyrun-2

AnyRun-3

Here we see the JSE file. Let’s click on that.

One note about the “JSE” file here: this is not a JScript Encoded file, which is what the scripting environment produces when it encodes a script and what is usually associated with this file extension.

A JSE file looks like this.

JSEFile

 

This is just plain JavaScript that is obfuscated by a tool.

Anyrun-4

Here is the script that we will be working with, so let’s download that.

Now let’s look at 1 more way to extract the file.

After we unzip the file we see this.

Folder-1

Select the word folder

Folder-2

Now let’s look at the vbaProject.bin file in a hex editor.
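If you would rather not unzip by hand, Python's standard `zipfile` module can pull the member out directly. This is a sketch: the member path `word/vbaProject.bin` matches the folder layout shown above, and the function names are made up for illustration.

```python
import io
import zipfile


def read_vba_project(doc_bytes: bytes, member: str = "word/vbaProject.bin") -> bytes:
    """Return the raw bytes of one member of a zip-style Office document."""
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        return archive.read(member)


def find_keyword(data: bytes, keyword: bytes = b"var") -> int:
    """Return the offset of the first occurrence of keyword, or -1."""
    return data.find(keyword)
```

Usage would be along the lines of `blob = read_vba_project(open(path, "rb").read())` followed by `find_keyword(blob)`, which gives the offset to jump to in the hex editor. Nothing in the document is executed by doing this.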

Vba-1

Here it looks like an “OLE” header, so this is compressed in that format.

If we try our search for “var” again we see.

Vba-2

Now we can stop here and just copy the entire text section to your text editor and then clean this up to leave just the script.

Let’s go 1 step further in case this gets more obfuscated later.

We can use 7Zip to extract the contents of the vbaProject.bin.

Folder-3

Let’s look in the UserForm1 folder.

Folder-4

Looking at the size of the “o” object this looks promising. If it is not very large then there will not be much in the files but we need to check them anyway to verify there is not anything useful in it.

This is a binary file so we need to look at it in a hex editor.

vba-3

This is the lowest level we can go to get the script out. We could also use Office and extract it from the textbox or properties box in the VBA tools.

There are also the Decalage @decalage2 python tools Here and a few others.

Here is what the script looks like normally.

Script-1

That is too difficult to read to see what it is doing, so let’s do some JavaScript formatting on a “Copy” to get a better view.

Script-2

This is a very distinct format that has been used for some time now, and if you understand the basics of these then they can be very easy to decode, if you want to take the time to build tools to help with the boring parts.

There are basically 3 things we are looking for when we see this format. First we look at the array at the top. It is usually either base64 or \x encoded, and I’ve even seen a modified base64 encoding in past samples that were not Emotet.

The next thing we are looking for is this function just below the array.

Script-3

As you can see it has a push/shift function and a value of 0xd3. What this will do is rotate the array 1 place at a time, that many times, in a circular fashion. I have not done the math to be sure, but if that number happened to be the same as the count of items in the array, it should just go back to where it started from.
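A quick way to check that idea is to model the rotation in Python. The sketch assumes the push/shift loop moves the first item to the end exactly that many times, which is worth confirming against the script, since an off-by-one in the loop changes the result.

```python
from collections import deque


def rotate(items: list, count: int) -> list:
    """Move the first item to the end `count` times (a left rotation)."""
    ring = deque(items)
    ring.rotate(-count)  # a negative value rotates to the left
    return list(ring)


if __name__ == "__main__":
    sample = ["a", "b", "c", "d", "e"]
    print(rotate(sample, 2))
    # Rotating by the array length is a full circle, so only the
    # remainder of count / length actually matters.
    assert rotate(sample, len(sample)) == sample
    assert rotate(sample, 0xD3) == rotate(sample, 0xD3 % len(sample))
```

So the guess holds: a count equal to the array length (or any multiple of it) leaves the array unchanged.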

The last piece we need is the part with the index numbers.

Script-4

This “b(‘0x0’, ‘ILb*’)” call holds the index and a key to decode the base64 string in the array.

If it only has the index number, then either it does not need another level of decoding or the same key is used for everything and you will have to locate and verify it.

Script-5 

Here is the function where the index value and the key are passed to.

Script-6

Here we see the base64 decode with the atob() function and then the RC4 decode below that. This version uses “Mod 256” ( % 0x100 ). You might run across some code that uses “AND 255” ( & 0xFF ) instead, which gives the same result, so just be aware of it.

So now that we have a pretty good understanding of how the decoding works, let’s decode this.
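It is easy to mix these numbers up, so here is a small Python check of which forms are interchangeable. For the non-negative values that RC4 works with, `% 256` and `& 255` always agree, while `% 255` and `& 256` do something else entirely. The function names are only for illustration.

```python
def wrap_mod(value: int) -> int:
    """Keep a value in the range 0-255 using modulo."""
    return value % 0x100


def wrap_and(value: int) -> int:
    """Keep a value in the range 0-255 using a bit mask."""
    return value & 0xFF


if __name__ == "__main__":
    assert all(wrap_mod(v) == wrap_and(v) for v in range(4096))
    # These two look similar but are not equivalent to the above.
    print(300 % 256, 300 & 255, 300 % 255, 300 & 256)
```

When porting a decoder, copy whichever form the script actually uses. A routine that really does use 255 as the modulus is not standard RC4 and has to be reproduced exactly.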

Script-7

Script-8

Using this tool I had written for the Neutrino EK we extract the whole command and just the index and key and save both to separate files.

Script-9

This is where that value of 0xd3 comes in.

This tool will take the base64 string array, split it, and rotate the items to the proper place by that value. It will then use the list of indexes/keys from the last step to do the decoding of the array.

Script-10

If all we want to do is extract the list of urls, we could just stop here.

But let’s see about doing the replacements in the rest of the script.

Script-11

Here we have the decoded and encoded versions lined up.

Although we have done the replacements in the encoded file, there are actually more layers.

Script-12

Notice the “\x” encoded characters. Let’s fix that.
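Undoing the “\x” encoding is a simple search and replace. Here is a Python sketch using a regular expression; it only handles the two-digit `\xNN` form, and the function name is made up.

```python
import re

_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def unescape_hex(text: str) -> str:
    """Replace every \\xNN escape with the character it stands for."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    print(unescape_hex(r"\x4e\x6f\x74 Supported"))
```

Run it on a copy of the script so the original is kept for comparison.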

Script-13

Does that string look familiar now? It is the error box in Anyrun. (This screenshot was borrowed from the text report section and saved as png.)

5c62c466-84cd-4aad-a881-a72f91d4d319

There is still 1 more trick used here that I have previously seen in the pages that led to Angler EK.

Script-14

We have 1 more layer of separated values. As we can see here we have a key value pair where the “eb” is an array of variables, the key is “imKcX” and the value is “Not Supported File Format”.
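Once the key value pairs are known, putting the values back inline is another search and replace. The Python sketch below assumes the script reads the values with bracket syntax such as `eb['imKcX']`; other access styles would need their own pattern. The function name and the sample line are made up for illustration.

```python
import json
import re


def inline_lookups(script: str, table_name: str, table: dict) -> str:
    """Replace table_name['key'] with the quoted value when the key is known."""
    pattern = re.compile(re.escape(table_name) + r"""\[\s*(['"])(\w+)\1\s*\]""")

    def replace(match):
        key = match.group(2)
        # Unknown keys are left untouched so nothing is silently lost.
        return json.dumps(table[key]) if key in table else match.group(0)

    return pattern.sub(replace, script)


if __name__ == "__main__":
    line = "WScript.Echo(eb['imKcX']);"
    print(inline_lookups(line, "eb", {"imKcX": "Not Supported File Format"}))
```

The table itself still has to be collected from the script first, by hand or with another pattern.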

We can verify that by looking at the screenshot above.

There are several places it will do this.

Anyone who has tried to step thru one of these in a debugger knows how much of a pain it is and how long winded these can be before they finally spit out the decoded page.

And that is if there are no debugger checks to throw you a curve.

One more note. I have seen some samples in the past that have been run thru this style of encoder twice, so you may need to look closely and repeat the process to get it decoded as fully as possible.

That is as far as I’m going on this. My challenge to you is to build your own tools to be able to quickly decode these also.

Mine are too fragile to release for this post.

In conclusion, once you understand the layout of how these decode you can apply that knowledge to the various “Types” you may run across.

Learn it and help make this type of encoding obsolete.

 

If you have any questions please contact me on Twitter at @Ledtech3.

Another Look at the Rig Exploit Kit

It has been a while since I have written up anything on this exploit kit, since it had moved more to the background and I have not seen as many samples as I used to.

It has gone thru many changes since I first saw it and started learning how to disassemble it to learn how it works.

What sparked the interest this time is a series of reversing videos released by Vitali Kremez @VK_Intel and 0verfl0w @0verfl0w_ of SentinelOne . You can find the tweet reference here. https://twitter.com/VK_Intel/status/1172345642796011521

This post will focus on the .saz file from video #3 on Rig EK. @VK_Intel was kind enough to give me a copy of the saz file so I could do this write-up of an alternate method of extracting the exploit code. The method used in the video was the debugger in Google Chrome. I personally dislike that debugger, so I use a copy of IE9 in a VM when I actually need to run html/script in a debugger.

So let’s first take a look at the file in Fiddler.

Fiddler

There is limited traffic here to work with. We start with a redirect to the Rig EK landing page, which we can see on the right. Let’s extract this page. I generally dump everything as raw to a folder and then look thru it for what I want, but Vitali demonstrates another way in the video that is more precise and only extracts what you want or need.

So here is what the landing page looks like. Whether you zoom in or out, everything appears to run together and is difficult to understand.

LandingPageNormal

Here it is after doing some JS Pretty / Formatting.

LandingPageFormatted

Now that this is formatted, you should be able to zoom in on the picture and see that there are 4 script blocks on this page. Each has its own code and its own exploit that it is targeting in an attempt to download an encoded file to decode and run on the system. The downloaded file / malware will vary with the campaign.

So let’s take a closer look at each section to see what they look like.

Section:1

Section-1-Entire

If you look closely you can tell this is base64 encoded.

Section-1-End

In past versions of this there were string replacements in the base64 string itself, but now they seem to just put it in the string of letters for the base64 alphabet in the decoder, for whatever reason.

So all we are going to do is extract the base64 string from this section and use a different tool to do the decoding.

Section-1-B64Decode

This section has no tricks, so we can just copy and paste the base64 string into the decoder and decode as UTF-8.

Here is what the whole thing looks like normally after base64 decoding.

Section-1-B64Decoded-Normal

Um, we can do better, so let’s format it. Unfortunately the formatted version takes forever to save a screenshot of, so let’s just look at the most interesting parts of this one, and I will include the decoded files and my tools on my Github.

In previous samples I have pulled apart over the years, I have taken the many hours it takes to decode these by hand to see what the functions do and how they work.

This code also looks a lot like what you would have seen in Angler EK.

In this part we see the Shellcode.

Section-1Shellcode-1

Let’s copy this shellcode over to another page, and then we can look at it in a hex editor.

Section-1-Shellcode-2

Notice the first part of this shellcode here.

Section-1-Shellcode-3

Now let’s drop this into the CyberChef x86 disassembler (Here).

CyberChef

We can see here that there is an “Xor” by 0x84 being used, so let’s try that.

Section-1-Shellcode-4

Above you can see I just XOR decoded the entire encoded shellcode with 0x84, and where it is highlighted you can see the decoded script. So let’s extract and format it for a closer look.
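For reference, the same single-byte XOR plus a quick scan for readable text can be sketched in Python. The key of 0x84 comes from the disassembly above. The helper names and the minimum run length of 6 are my own choices for illustration.

```python
import re


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same single-byte key."""
    return bytes(b ^ key for b in data)


def printable_runs(data: bytes, min_len: int = 6) -> list:
    """Return runs of printable ASCII that are at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Harmless stand-in: hide a string with 0x84, then recover it.
    hidden = xor_bytes(b"\x00\x01var a = 'demo';\x00", 0x84)
    print(printable_runs(xor_bytes(hidden, 0x84)))
```

XORing the whole buffer also scrambles the decoder stub at the front, which does not matter here since only the script at the end is wanted.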

Section-1-Shellcode-5

Does this look familiar to anyone?

Basically all of this trouble to hide what is going on boils down to “It just passes parameters and downloads an encoded file, decodes and runs some malware”.

But what are those parameters? In some older versions they were hidden inside the insane encoding, but now it is as simple as looking at the bottom of the page.

Section-1-Values

This is the url that gets passed and the RC4 decoding key that is used with the exploit downloader.

Section:2

Now let’s take a look at section 2.

Section-2

As we can see here this section is shorter than the first, but there is one thing they do here that they didn’t do in the last section. They split the base64 string by using double quotes and the plus symbol. So let’s clean this up and base64 decode it to see what is happening.
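The clean-up is just removing the quote, plus, quote joints before decoding. Here is a Python sketch, with a made-up function name, that assumes the pieces are joined only by quotes, optional whitespace, and a plus sign.

```python
import base64
import re


def join_and_decode(fragment: str) -> str:
    """Remove the quote-plus-quote joints, then base64 decode the result."""
    joined = re.sub(r"""["']\s*\+\s*["']""", "", fragment)
    joined = joined.strip().strip("\"'")  # drop the outer quotes
    return base64.b64decode(joined).decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(join_and_decode('"aGVs" + "bG8="'))
```

If the page mixes in other junk characters between the pieces, those need to be stripped first in the same way.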

Section-2-B64Decoded 

Here we can see this one is using Flash to download the sample. There have been several different exploits used with Flash in order to run the final payload, so let’s take a closer look at this one.

Section-2-Flash

This has several scripts in it that do various things, but what catches my interest is this familiar looking shellcode that I have not seen previously in a Rig EK flash file.

Section-2-Flash-2

Section-2-Flash-3

Notice anything here? It looks pretty much like the first one.

Section-2-Flash-5

So in this case, instead of a Flash exploit, are they using this one? I’m not sure; the people who can ID the exploits used will have to determine that.

So what parameters will this section pass?

Section-2-Flash-6

If we look at the traffic, the only thing we have is the traffic associated with the flash download.

In this case the first part of this will be where it will call out to download the flash file from.

The second part after the highlighted function name is where it will download the encoded file/malware from and the last part is the key that will be used to decode the downloaded file.

There are 2 more sections that decode pretty much like the first, and I will include them in their own sub folder at each stage of the decoding.

But this is as far as we can go with this sample. Since it does not have the frame with the encoded malware, let’s try another sample from “malware-traffic-analysis.net”, located Here.

New Sample

So if we open the pcap and set a filter of “http.request or http.response”, we see the same familiar pattern as in the .saz file.

X-Pcap

But in this case we have an extra download after the flash file download.

So let’s set a filter for the landing page, see what frame/packet number it is in, and extract it.

X-LandingFilter

As it turns out it is packet 112.

Let’s take a quick look at the flash file, since we can see that it gets downloaded.

X-Flash-1

As we can see here, the flash in this one is more “old style” and it will do some decoding inside of the flash code.

One thing to note is that “Export all parts” does not export properly with this version of FFDec, so you have to painstakingly copy and paste each of the scripts to a folder.

Notice the auto generated message vs the generated code in the decompiler.

X-Flash-2

As we extract and look at each script, it looks like this may be using multiple exploits.

X-Flash-3

Going thru this source, I’m not seeing how this would even trigger the download, based on past work with these. Perhaps I’m just forgetting something or missed what I was looking for?

So the final question is, how do we know which of these sections/scripts downloads the encoded binary?

The answer is in frame 196.

X-Frame-196

It tells us the download is the result of the request in frame 132.

X-Frame-132

Now, after extracting all of the requests from the sections, we can compare which ones match, and that will tell us what exploit was used.

X-Frames-Request 

In this case it was section 4, and here is the script.

X-Script-4

Frame 196 contains the encoded binary and after RC4 decoding  we discover that the data in this PCAP was truncated and we didn’t get the full binary.

X-DecodedBinary

One other thing to note is that the RC4 routine here uses “And 255” where others may use “Mod 256”. For the non-negative index values RC4 works with, the two give the same result.
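To show what that masking does, here is a minimal sketch of the standard RC4 algorithm in Python. It is not the routine from the landing page (that one is in the screenshot), just the textbook version, written with `& 255` so you can see it sits exactly where a `% 256` would normally go.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling, then XOR the data with the keystream."""
    s = list(range(256))
    j = 0
    # Key-scheduling algorithm (KSA): shuffle the state using the key.
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 255  # same result as % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA): one keystream byte per data byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 255
        j = (j + s[i]) & 255
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 255])
    return bytes(out)
```

RC4 is symmetric, so calling `rc4(key, encoded)` with the key taken from the landing page is the same operation for encoding and decoding.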

X-RC4

In conclusion we see that they are using a scatter approach using multiple exploits and scripts on the landing page. Perhaps just one was meant to work while the others are experiments or possibly they are used as fill to waste time of researchers or to send “false flags” on what exploit was used ? Looking at some of the code you can see it “does not work as expected” so only they can answer that question for sure.

That is it for this one. I hope I was able to show a viable other way to see what is happening rather than trying to step thru in a debugger.

Links:

Link to Twitter post for the Tutorial that started this HERE.

Link to Malware Traffic page HERE

Link to my Tools used here and the files on Github HERE

Posted in Malware

Those Pesky Powershell Shellcode’s And How To Understand Them

Shellcode comes in various forms for different operating systems. Some can just be dropped into a hex editor to get the needed understanding of what it is doing, some may require looking at the assembly code generated by a disassembler, and some require a specialized tool that understands the type of shellcode you are working with.

The one constant across the various samples I’ve looked at is that the shellcode is used as a form of obfuscation to download the final malware.

Here we will just be concentrating on the Windows PowerShell versions.

Sample 1:

Let's start by taking a look at a “Daily Script” from December of 2017. Here is the Twitter reference for this sample.

Step1

First we need to convert the char Codes to Chars.

Tesp2

After that we get a base64 string.

Step-3

After decoding that we get a PowerShell script with a Gzip stream. After we decompress that we see this.

Step-4

Here we see a base64 encoded string. This is our encoded shellcode. It will get loaded into virtual memory and run. The exact implementation may vary a little but this is what I mostly see.
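The layers up to this point can be peeled by hand in any tool, but as a rough sketch here is how the individual steps could look in Python. The function names are my own for illustration, and I am assuming the char codes are plain decimal numbers separated by commas or spaces. Pulling the base64 text out of each intermediate script is still a manual copy/paste step.

```python
import base64
import gzip
import re


def charcodes_to_text(blob: str) -> str:
    """Turn a run of decimal char codes (e.g. '72,101,108') into text."""
    return "".join(chr(int(n)) for n in re.findall(r"\d+", blob))


def b64_gunzip(b64_text: str) -> bytes:
    """Base64 decode a string, then decompress the Gzip stream inside."""
    return gzip.decompress(base64.b64decode(b64_text))


def b64_to_hex(b64_text: str) -> str:
    """Base64 decode the shellcode string and return it as clean hex."""
    return base64.b64decode(b64_text).hex()
```

The hex from `b64_to_hex` is what gets dropped into the hex editor in the next step.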

That brings us to the shellcode, which is what we are after.

Now we can base64 decode it to hex.

Step-5

So now what do we do with it?

Step-6

So now we drop the hex into a hex editor and we can now see the url it was calling out to and if we look higher we can also see a User Agent string.

Sample 2:

Next we look at this sample found here on Virus Total from November of 2018.

S2-S1

Here we only start with a base64 encoded script.

S2-S2

Now we have a Base64 encoded GZip script.

S2-S3

Now we see the familiar base64 encoded shellcode, so let's decode that to hex and drop it into a hex editor like last time.

S2-S4

Well, that's not too helpful. Now what?

Let's try CyberChef here and look at the assembly.

S2-S5

Well, that doesn’t look like much help either.

What else can we do ? We have John Lambert’s “PyPowerShellXray” here . Or we have SCDBG found here

After working with these, “PSXray” requires the PowerShell script with the shellcode in order to work, while SCDBG requires only the cleaned hex of the shellcode, so you still have to base64 decode to hex to use it in scdbg. Let's see what those 2 show us.

PSXray-32

Here we can see some Windows API calls using psxray, but something doesn’t look quite right. The ws2_32 string, which gets pushed backwards, is not shown in full. If we modify the Python script to use the 64 bit version of the backend API for this tool we get the full API name, but the rest of the values don’t look the same.

PSXray-64

So what about scdbg then?

Scdbg

It didn’t find anything because scdbg only works on 32 bit shellcode and this is 64 bit.

So now what?

New tools.

S2-S6

In order to save a step we can also just input the base64 string.

S2-S7

Looking at the way John Lambert’s tool parsed the hashed API calls, I wanted to be able to do the same thing but as a copy paste instead of having to run it thru the VM/Python process.

Another new tool.

S2-S8

But how do we find these hashes?

ApiHashes

As it turns out psxray had a prebuilt list of hashes for the function calls. I had to convert those to individual dictionary items for each API to be able to use them in this new program, but due to the sheer number of them I first had to build a program to do the conversion and generate the VB.NET code for me. Then I could use the generated code to do the search for the API calls.

HashValue 

If we take a closer look at the output of my tool we see “found at index”. This is the string index, not the byte index, so you would have to divide it by 2 if you were searching in a hex editor for the byte offset. Another thing you will notice is that the byte order found in the file is reversed (little-endian) compared to what you will find in the assembly or in the database used by the tool.

That is why I put both the normal order found in the file and the “ASM Order” in the output.

Another odd thing I ran across in a sample was a hash value was found but at an “ODD” offset and closer inspection of the assembly and the found value showed it was a false positive. All of the normal offsets are divisible by 2 so any odd value may be false.
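As a rough illustration of those three points (string index vs. byte offset, reversed byte order, and odd offsets), here is a small Python sketch. It is not the code from my VB.NET tool, and the function name is made up for this example. The hash 0x0726774C is the value commonly cited for LoadLibraryA in Metasploit style shellcode and is only used here as sample data.

```python
def find_hash(hex_shellcode: str, api_hash: int) -> list:
    """Search a hex string for a 32 bit hash stored in little-endian order."""
    hex_shellcode = hex_shellcode.lower()
    asm_order = f"{api_hash:08x}"                      # as shown in the assembly
    file_order = api_hash.to_bytes(4, "little").hex()  # as stored in the file
    hits = []
    pos = hex_shellcode.find(file_order)
    while pos != -1:
        hits.append({
            "string_index": pos,
            "byte_offset": pos // 2,   # two hex chars per byte
            "aligned": pos % 2 == 0,   # an odd index straddles two bytes
            "file_order": file_order,
            "asm_order": asm_order,
        })
        pos = hex_shellcode.find(file_order, pos + 1)
    return hits
```

Any hit where `aligned` is False is the kind of false positive described above and can be thrown away.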

While investigating how the hashed API names worked for my Office Equation Blog post here I found a FireEye post from 2012 here about using precalculated string hashes and instructions on how to generate your own Sqlite database of  known hashing algorithms and values. I will include the ones I generated for reference as a lookup  database for looking up unknown hashes.

I was able to use this database to generate the remaining code for the tool above, covering hashes I had run across that the list from John Lambert’s tool didn’t include.

Sample 3:

This sample, found on Virus Total here, was a strange one. It was originally found on Pastebin by Paul Melson’s (PaulM @pmelson) ScumBots @ScumBots bot and uploaded to Virus Total.

When we first look at this script one thing we will notice is that it starts with a very large base64 string. The second thing is that it is broken up with the string ‘+’ to mess with automated base64 decoders that can’t put the string back together by removing those first.
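A minimal sketch of that clean up step in Python is below. I am assuming the breaks look like quote, plus sign, quote (with optional spaces), which is how PowerShell string concatenation would appear in the script. The function name is only for illustration. Note that a bare `+` is a valid base64 character, so only a plus sign sitting between two quotes is removed.

```python
import base64
import re


def join_split_base64(text: str) -> str:
    """Remove quote-plus-quote breaks so the base64 becomes one run."""
    # Matches '+' or "+" with optional whitespace around the plus sign.
    joined = re.sub(r"""['"]\s*\+\s*['"]""", "", text)
    # Drop the outer quotes and any leftover whitespace or line breaks.
    return re.sub(r"""['"\s]""", "", joined)
```

After this the string can be handed to any normal base64 decoder, for example `base64.b64decode(join_split_base64(script_text))`.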

S3-S1

After we clean up the base64 string and base64 decode we see this.

S3-S2

NOTE: I have tested this in psxray and it will fail to parse this type.

If you zoom in on this picture you can see that this has a base64 encoded executable file embedded in it. Let’s extract and take a quick look at that first.

It looks like the script will load this DLL, which is an AMSI bypass method, which will then load the shellcode.

S3-S3

Now let’s take a closer look at this shellcode. It doesn’t start with the normal “0xFC” .

S3-S4

That’s hard to read, so let's format it a little bit to better view what is happening.

S3-S5

Looking where the blue dot is we can see that this shellcode has been split apart into arrays and will get reassembled at run time.

So let's reassemble it. (New Tool)

S3-S6

Now that it is reassembled we can input it into our tool to get the IP/URL.

S3-S8

And also the API calls. I created this tool so it would give more insight into what gets called, which may help to get a better understanding of what the shellcode is doing, not just the IP or URL that may show up by just running it in a sandbox.

One other thing to note is that I have a checkbox for each API that gets parsed so the ones that show up as “No Hashes Found” can be unchecked and then you can rerun it to get a cleaner output.

Sample 4:

This is another strange sample. As of this writing it is still on Pastebin here, and it is another sample found by Paul Melson (PaulM @pmelson).

S4-S1

We start out like normal with Powershell and a large base64 string .

S4-S2

After base64 Decoding we now have a Base64 GZip string.

S4-S3

Now that we have decompressed this level we can just take the base64 encoded shellcode and drop it in our tool to extract the IP/URL.

S4-S4

Ok, so what is “Shikata Ga Nai encoded shellcode”? This one had me stumped for a bit because there were no really “clear” explanations of how this decodes at the byte level without using other tools.

Note: psxray has the function to decode this type of shellcode. scdbg does not work for this type.

This article here was the closest one that helped me work this encoding out. It is found in the “metasploit-framework” found here.

The description of it is a “polymorphic XOR additive feedback encoder”. Yeah, that description really helps.

After reviewing the article and anything else you can find online about it, let's drop the hex cleaned shellcode into our friend CyberChef. You will also notice a difference between the CyberChef output and what psxray outputs.

Diff

(This screenshot is from my original research.)

The cyberchef is before and the psxray is after it is decoded.

S4-S5

Here are my decoding notes for how this decodes. It starts out with an XOR key, which will change from sample to sample, and an addition value that gets added each round.

You add the decoded bytes to the current key to get a 32 bit value for the next key.

The next thing that needs to be figured out is where the encoded data starts. In this case, if you look at the difference screenshot it will tell you where it starts by the difference.
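Here is a minimal Python sketch of that XOR plus additive feedback loop, based only on my notes above. I am assuming the data is handled 4 bytes at a time as little-endian values and that the decoded value is what gets added to the key. Real samples can differ in those details, so treat this as a starting point and not as the Metasploit implementation. The encode function is only there so the round trip can be checked.

```python
import struct

MASK32 = 0xFFFFFFFF


def xor_feedback_decode(encoded: bytes, key: int) -> bytes:
    """Decode 4 bytes at a time; the key changes after every block."""
    out = bytearray()
    # Pad to a multiple of 4 so the last block can still be unpacked.
    padded = encoded + b"\x00" * (-len(encoded) % 4)
    for (block,) in struct.iter_unpack("<I", padded):
        decoded = block ^ key
        out += struct.pack("<I", decoded)
        key = (key + decoded) & MASK32  # feedback: add decoded value, keep 32 bits
    return bytes(out[:len(encoded)])


def xor_feedback_encode(plain: bytes, key: int) -> bytes:
    """Inverse of the decoder, included only to test the round trip."""
    out = bytearray()
    padded = plain + b"\x00" * (-len(plain) % 4)
    for (block,) in struct.iter_unpack("<I", padded):
        out += struct.pack("<I", block ^ key)
        key = (key + block) & MASK32
    return bytes(out[:len(plain)])
```

If the output looks like garbage, the usual suspects are the key byte order and the offset where the encoded data starts.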

Another way is to look for odd/ messed up assembly instructions at the beginning of the CyberChef assembly.

S4-s6

S4-S7

Now we can just drop the decoded shellcode back into our IP/URL parser tool.

One other thing to note: if you cannot figure out where the encoded shellcode starts, just drop the entire shellcode after the key into the decoder, decode, and remove 1 byte (2 chars) at a time from the beginning until you see this value or more plain text show up in the output.

S4-S8

That is the string representation of “LoadLibraryA”

S4-S9

There are some more strange types I would like to go thru but this is starting to get long.

Here is a list of the tools I am including in the release.

ToolList

All of these tools have been used in the decoding and extraction of the shellcode.

In the base64 decode tool there are 2 buttons on the left, decode as UTF-8 and decode as Unicode. Most of the base64 encoded PowerShell scripts will need the Unicode button.

To extract as hex you have to check the box and select the encoding type to extract as. Most of the time it will be 1252 from the dropdown list. This list is filled by a function that gets the supported encodings for the system it is run on.

If there are any questions or problems just contact me on Twitter @Ledtech3 .
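For anyone not using my tool, the same choice between the two buttons can be sketched in Python. What Windows tools call “Unicode” is UTF-16 little-endian, which is what I am assuming here. The function name is just for this example.

```python
import base64


def decode_b64_text(b64_text: str, encoding: str = "utf-16-le") -> str:
    """Base64 decode and interpret the bytes as text in the given encoding."""
    raw = base64.b64decode(b64_text)
    return raw.decode(encoding, errors="replace")
```

If the output has a null character between every letter, the wrong encoding was picked and you want the other one.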

Links:

Sample 1:
Twitter Link

Sample 2:
VT Link for sample
CyberChef Link for X86 assembly.
PSXray Link
ScDbg Link to site
FireEye post on precompiled  hashes Link
My Blog post on Equation Editor Shellcode Link

Sample 3:
VT Link for this sample

Sample 4:
Pastebin Link to sample

 

Github Link to the tools and files used here.

Again there was a lot more that I would like to have gone thru.

I hope you learned as much as I did.

Posted in Malware, Networking, PowerShell

A deeper look at Equation Editor CVE-2017-11882 with encoded Shellcode

Our sample today comes from My Online Security @dvk01uk from this Twitter thread Here.  The First one I had started to work on comes from this Twitter thread  here from April 26 of 2019.

The encoding on the shellcode uses a method similar to Shikata Ga Nai encoding.

I would also like to thank  Denis O’Brien @Malwageddon for pointing me to this video on how to set up the vm to use X86Dbg to load when the equation editor loaded.

I would also like to thank him for giving me the tip of setting a break point at 0x00411874 on the return instruction for the font record. This can get you close to where you need to be but then you have to step thru from there.

Also this blog post had some helpful information on breakpoints that helped while trying to run this with the debugger attached.

Before we jump into the debugger let's take a look at the file and extract the shellcode.

When we first open the file we see

File-1

Here we can see it is a zip file, so let's just unzip it to get the file structure.

File-2

Let’s look in the “xl” folder.

file-3

Now we need embeddings.

File-4

Let's take a look at this file in a hex editor.

File-5

This is an OLE file so we can just use 7Zip to extract the contents.

file-6

This is an Ole10Native binary file, so let's see what is in this.

File-7

By the looks of this, it does not appear to have any other headers, so this is our shellcode that gets run.

Let's copy all of the data here and drop it into CyberChef.

CyberChef

Let’s take a closer look at this in Notepad++ with the colors for assembly.

CyberChefColor-1

Now let's do some math at the beginning.

blogpost-bp-b

After doing the math here we can refer back to the blog post and see that the result matches GlobalLock.

So let's jump into the debugger. After getting to the fonts and finding the corrupted one we step thru and find what we are looking for: the beginning of our shellcode.

InShellCode

The values just above notepad are the ones we see in our debugger. Finally on the right track.

After loading GlobalLock and returning from Kernel32 we end up in a series of jumps.

MultipleJumps

Here I was able to get a graph of the function calls.

StartOfJumpGraph

The part we need to understand is at the top where it goes into the loop.

Loop

Here we can see the value that will be used as a multiplier.

Multiplyer

You can also see, in my notes from a previous run, the values we need to find.

So after running thru all of this we can find the decoded shellcode in ECX.

DecodedPayload

While stepping thru this I also copied the assembly and the current values to a text document. Let's take a closer look at the flow.

AsmValues

Now we have a better understanding of how the decoding works. Here is a simpler version.

DecodingNotes-Final-A

Now we can build a decoder for this.

We need 3 values that we can get from the CyberChef output: the multiply value, the addition value and the length value.

We will look at the length value first.

ShekkCodelen

So now we need to find where this is in the Shellcode we extracted.

As it turns out, if we just extract the amount given here (0x2A7 bytes) from the end of the shellcode data, then that is the data we will be decoding.
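Selecting those bytes by hand in a hex editor works fine, but as a small sketch the same selection in Python is just a slice from the end. The function name is my own, and the length (0x2A7 for this sample) still has to be read out of the assembly first.

```python
def tail_bytes(data: bytes, length: int) -> bytes:
    """Return the last `length` bytes of the extracted stream."""
    if not 0 < length <= len(data):
        raise ValueError("length must be between 1 and len(data)")
    return data[-length:]
```

For this sample that would be `tail_bytes(stream, 0x2A7)`, where `stream` holds the bytes of the extracted native stream.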

ShellCodeSelect

This is now our encoded shell code. We can copy paste this to the new tool, get the other two values and click a button.

If we are right then you should see the decoded values clearly.

Decoded-Payload-Tool

DecodedClose

So now we can decode these by extracting the shellcode from the file, copy pasting it to CyberChef to get the assembly, looking for the required parameters and finally inputting them into the tool and clicking a button for the result, all without having to run the file.

Although you can just run them and get the URL it calls out to, this will show you what else is in the shellcode and what APIs are run.

The few different ones I have done have all worked just a bit different under the hood even though they have the same effect of just calling out to some site somewhere and downloading a file.

If you click on the Twitter link to this sample and then click on the CVE- tag at the top, it will present you with, at the time of research, 116 pages of files that potentially use this type of encoding.

That’s it for this one. I hope you learned something too.

 

Links to URL’s in this post:

Twitter Link for this sample

AnyRun Task Link for this sample.

Link to blog post with the different values

Link to the video on how to set up X86Dbg to attach to the Equation editor.

Link to My Github with tool and decoding notes.

Posted in Malware, security

A look at Stomped VBA code and the P-Code in a Word Document

This sample comes from a Twitter discussion here and a second part of the thread here on April 22 2019.

This discussion was started by “My Online Security @dvk01uk “.

Although it appears to have a VBA file in it, it didn’t work in a few different sandboxes, as mentioned by @dvk01uk.

Let's take a closer look at the sample found here on ANY.RUN @anyrun_app .

If we look at the document in a hex editor we can see that it starts with a “PK” so this is a ZIP File version and we can just decompress it and take a closer look.

DocHex

After unzipping the document we see this folder layout.

1

Let's look at the word folder.

2

We can see here we do have a vbaProject.bin file. Let's look at that.

3

This is an OLE file so we can decompress this with 7Zip.

3A

Let's take a closer look at Module1.

4

If we scroll down to the bottom of this file we can see that it appears to be Zeroed out.

If we look at the “ThisDocument” we can see the “Attribut” string which tells us it contains compressed VBA Code.

5

If you don’t have that string in the file then it does not have compressed VBA Code in it.

So how does this work then?

If we go back to the Twitter discussion, “Vess @VessOnSecurity” has a Python tool called pcodedmp to extract the “P-Code” from the document, which can be found here .

This tool currently only requires the “Decalage @decalage2” oletools.

The command I ran was this.
"C:\Python27\python.exe" "C:\Users\Joe User\Desktop\pcodedmp\pcodedmp.py" "Opticsense New Order.doc"

In order to dump it to a file just add “ > DumpedPcode.txt” (or whatever name you want) to the end of the above command.

I have both versions of Python installed on this VM so I have to use the full path to it. I also discovered the hard way that you have to put it in double quotes in order for it to work.

Since I didn’t use the pip install for the pcodedmp tool, I just downloaded it and used the full path to the script, and I also put double quotes around that path. The final parameter was the file name in double quotes since it has a space in the name.

I just opened a cmd window in the folder where the document was and ran that command.

Here is what we see when we run  the command and dump it to a file.

P-CodeDump

This is the part we are most interested in at the moment.

Pcode-2

If you can zoom in on that you see a bunch of “Line #:” entries, so let's clean those out and format this a bit better to be readable.

Here we find the AutoOpen Function.

Autoopen

The “Ld F_WH” appears to load the function above.

Func1

Although it is not real clear looking at this for the first time we can take an educated guess on what the names mean like “st” I would assume it means string, “ld” would be load ?

So here it appears to take the string in “E_MO” and pass it to the function “B_RA”, and when it returns it will set the value of “F_DC” as an object.

FuncB_ra

So what this does is take the string of numbers, use 3 digits at a time, subtract 0x1A (26) from the value, then convert that number to a character.
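As a small sketch, here is that decoding step rewritten in Python. The function names are mine (the macro calls it B_RA), and the encoder is only included so the result can be checked against a known string.

```python
def decode_b_ra(digits: str, shift: int = 0x1A) -> str:
    """Take 3 digits at a time, subtract the shift, convert to a character."""
    if len(digits) % 3:
        raise ValueError("expected a multiple of 3 digits")
    return "".join(
        chr(int(digits[i:i + 3]) - shift) for i in range(0, len(digits), 3)
    )


def encode_b_ra(text: str, shift: int = 0x1A) -> str:
    """Reverse of the decoder, used here only for checking."""
    return "".join(f"{ord(c) + shift:03d}" for c in text)
```

For example “W” is character code 87, so it is stored as 113 (87 + 26), and `decode_b_ra("113")` gives it back.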

So after decoding the first string we see.

E_MO-Decoded

So the object that gets passed is “Wscript.Shell”.

The rest of the longer strings appear to be junk code until you get down to here.

SecondString

Here we see it is getting the string “SP_LL” from the active document.  When we search for it we find it in the “settings.xml” .

SP_LL-1

So now we need to take this string and run it thru the same B_RA function and see what is output. It will then get executed after passing back to the AutoOpen function.

DecodedPSScript

Now that the strings are decoded, we can go back to the AutoOpen function and continue on.

WinMgmt

It will use WMI’s Win32_Process to load “Cmd.exe” and the rest of the script.

Let's take a closer look at the decoded PowerShell script.

FormatPs

If we look at the highlighted area in the screenshot we can see above it that there were 3 variables “set”. This will rebuild the string “powershell”.

As we can see here this just downloads an exe from a site and runs it.

For anyone interested in getting a better understanding of the P-Code, I would suggest looking at the source code of pcodedmp and this file to get a better handle on how it works.

I would have liked to have gone deeper into how the P-Code works at the byte level but I’m still learning that myself.

That’s it for this one.

Here is the full list of resources and a few extra not covered in the post.

Twitter threads for this sample:
Main thread Here , Second thread Here , Third Thread Here

Didier Stevens @DidierStevens ISC Diaries:
Here and Here

Vess @VessOnSecurity pcodedmp tool:
Here

Decalage @decalage2 oletools:
Here

Derbycon 2018 talk “VBA Stomping – Advanced Malware Techniques”
Here

Posted in Malware

A look at a bmp file with embedded shellcode

The sample today is from PaulM @pmelson

While watching his BSIDES Augusta talk from 2018 Here, at the end he shows a picture file that gets downloaded from a layered PowerShell script. He was kind enough to send me a copy of a similar one to take a closer look.

I originally thought it was one of the PowerShell only decoder scripts for picture files but here is what we first see. This is the first layer .

StartScript

After Base64 Decoding this we get.

Layer-2

Here we can see this is base64 –> decompress to get the next level. But they have one more trick.

Layer-2-A

Before we can Base64 Decode –> Decompress this we first have to do a string replacement of “!” with “A” in order to get a proper Base64 encoded string.
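A minimal Python sketch of this layer is below. The replacement characters come from the script, but the compression type is my assumption: I check for the Gzip magic bytes and otherwise try a raw Deflate stream, since those are the two I normally see in these loaders. The function name is just for illustration.

```python
import base64
import gzip
import zlib


def decode_layer(blob: str, placeholder: str = "!", real: str = "A") -> bytes:
    """Undo the character swap, base64 decode, then decompress."""
    fixed = blob.replace(placeholder, real)
    raw = base64.b64decode(fixed)
    if raw[:2] == b"\x1f\x8b":        # Gzip magic bytes
        return gzip.decompress(raw)
    return zlib.decompress(raw, -15)  # raw Deflate stream, no header
```

If a sample swaps a different character, pass it in as `placeholder` and `real` instead of editing the function.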

After Decoding we get this.

Layer3

This appears to be a normal Meterpreter PowerShell Shellcode loader but in this case it is only downloading a bmp file.

The other ones I have looked into have either had the Shellcode on this page base64 encoded or hex encoded or downloaded it as this has with the picture file.

After a discussion with Paul he was able to locate the pdf of the presentation of the builder for this here and I found the video for the presentation  here and the Github for the project is  here.

Here is what we see when we open the downloaded file.

header

The first 2 bytes are normal for the bmp file format. If we open the file as a picture it is indeed the default picture of a cat from the builder “flipping you off”. (Which I won’t show)

So let's dig into the pdf to see how this works.

Note: I’m still learning how to read assembly. But we learn by doing.

On this page we see we have the 2 byte header “BM” 0x424D then a Jump instruction of 0xE9 then a 3 byte offset. According to This page there are more possible “jmp”  instructions that could possibly be used.

PDF-Img-1

In our file we have the offset in little-endian byte order of 0x30C403, and if we reverse that to 0x03C430 that is our offset to jump to.
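Here is a small Python sketch of reading that header. One caveat from the x86 side: 0xE9 is the near jump with a 32 bit relative displacement, so the sketch reads 4 bytes (the fourth byte being 0x00 in the test data is my assumption, since only three are shown above), and the CPU counts the displacement from the end of the 5 byte jump instruction. That puts the real landing point 7 bytes past the raw value. The function name is made up for this example.

```python
import struct


def bmp_jump(data: bytes) -> dict:
    """Read the jump that follows the 'BM' header of the bmp file."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP header")
    if len(data) < 7 or data[2] != 0xE9:
        raise ValueError("no E9 jump after the BM header")
    (rel32,) = struct.unpack_from("<i", data, 3)  # signed little-endian value
    return {
        "raw_bytes": data[3:7].hex(),
        "displacement": rel32,
        "target_offset": 7 + rel32,  # 2 header bytes + 5 byte jump + displacement
    }
```

Going to the raw value in a hex editor still gets you to the right area at the end of the file, which is all that is needed for the steps that follow.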

If we jump to that offset we can see it is at the end of the file.

Offset

Now scrolling down the pdf a little bit more we see that they also attempted to obfuscate the decoding key.

pdfimg-2

What this is doing is setting ebx to Zero and then looping a counter until it matches the “Magic” value that was randomly generated on build.

After it matches, it reverses that hex value and will use that value to xor the first 4 bytes of the encoded data to produce a decoding key which will get reversed again for decoding the remainder of the bytes.

I first wrote a brute forcer to work like the function here but after looking at this longer and getting a better understanding of what was in the registers I finally realized that this entire brute force routine was a waste of time and CPU power. No matter what the Random “Magic value” turns out to be the index value will always end up equal to the “Magic value”.

So when building an offline decoder we can just bypass this and use that found value for the “Magic” in our calculations, saving a lot of time and CPU cycles.

In order to figure this out I also had to take a closer look at the builder.

If we look in the source file of gen.py we can see the layout of the decoder bytes.

Source

So let's just use this CyberChef recipe Here to get the assembly for the bytes starting at the offset we jumped to in our downloaded file.

And we get this.

Assembly-1

For me this is a little harder to understand, so let's go back and just put the data from the start of the decoding routine to the end of the data into CyberChef and see what we have.

This looks a little different.

Assembly-2

In order to get a better handle on what was in what registers I ran it thru Scdbg.

scdbg

If we look closely at this report we see it fails at the op code 0x0FC9, the “BSWAP ECX”.

It was still enough to help me understand the values in the registers at the time.

I may not fully understand all of what the assembly is doing but I’m able to understand enough to work out how to decode it.

If you look at the above screenshot of the assembly you can see my notes on what I think I understand about how it works.

If we look back at the source code we can see it lines up where I have commented as random.

Here are my notes on how the function works to decode the bytes.

DecodeNotes

Here I am just reversing the first 4 bytes of the encoded data instead of the “Magic” value as it appears in the assembly.

The next step is to build a tool to extract the shell code.

I first start by importing the entire bmp file into the tool. I then extract the offset and jump to it.

Next I extract the data from the offset to the end of the file. We no longer need the bytes before the offset.

Since I write all of my tools in VB.NET and have not found a good way to search for a byte array inside a byte array, I will convert these remaining bytes to a hex string and work with the data as a hex string.

Just a note: it is very resource intensive to convert a file that size to a hex string and try to parse it that way. (I tried.)

Since I am now working with strings of hex I can search for the unique byte sequence as a string instead of a byte array, comparing against the byte code that comes before the “Magic value” in order to find and extract it.

Magic-2 

Since this sequence will be in every file we can do a search for it and then locate the Magic value in the hex string. Once we find that sequence before the “Magic” we can then extract the next  4 bytes (8 Chars) for the “Magic”.
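As a sketch of that search in Python: the default marker below (31D981F9) is borrowed from the second hex string of the Yara rule further down, and I am assuming that is the sequence sitting directly in front of the “Magic”. The function name is only for illustration.

```python
def extract_magic(hex_data: str, marker: str = "31d981f9") -> str:
    """Find the marker in a hex string and return the next 4 bytes (8 chars)."""
    hex_data = hex_data.lower()
    marker = marker.lower()
    pos = hex_data.find(marker)
    while pos != -1 and pos % 2:  # skip hits that straddle two bytes
        pos = hex_data.find(marker, pos + 1)
    if pos == -1:
        raise ValueError("marker not found")
    start = pos + len(marker)
    magic = hex_data[start:start + 8]
    if len(magic) < 8:
        raise ValueError("data ends before the Magic value")
    return magic
```

In Python the hex string detour is not strictly needed, since `bytes.find` searches a byte array directly, but the string version mirrors what my tool does.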

Next we have to locate the start of the encoded data. For that we can find what this function ends with.

EncodedData

You may also notice another value we could extract: the size of the encoded data. We could use that so there is no extra nonsense data in the decoded shellcode.

So after we put all of this together we end up with the new tool.

Tool

If we load the hex string shellcode into another tool I’m working on we get.

Tool-2

One thing to note. For this type of shellcode the first byte is always 0xFC and the second byte will vary depending on if it is a 32 bit or 64 bit shellcode.
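A quick check along those lines could look like the Python sketch below. The second byte values are my assumption from what stock Metasploit stagers usually start with (FC E8 for 32 bit and FC 48 for 64 bit), so anything else is reported as unknown instead of guessed.

```python
def guess_arch(shellcode: bytes) -> str:
    """Guess 32 or 64 bit from the first two bytes of decoded shellcode."""
    if len(shellcode) < 2 or shellcode[0] != 0xFC:
        return "unknown"
    if shellcode[1] == 0xE8:  # assumed 32 bit start: FC E8
        return "x86"
    if shellcode[1] == 0x48:  # assumed 64 bit start: FC 48
        return "x64"
    return "unknown"
```

This is handy for deciding up front whether scdbg is even worth trying, since it only handles 32 bit shellcode.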

So the question would be, how do you find a file encoded with this?

With a few pointers from Florian Roth @cyb3rops I was able to create this Yara rule.

rule DKMC_Picture_File {
meta:
  description = "Detects DKMC encoded bmp file with shell code"
  author = "David Ledbetter @Ledtech3"
  reference = "https://github.com/Mr-Un1k0d3r/DKMC"
  date = "2019-27-02"
strings:
  // "BM" header followed by the E9 jump
  $my_hex_string1 = { 424DE9 }
  // xor ecx, ebx / cmp ecx, <value>
  $my_hex_string2 = { 31D981F9 }
  // call with a negative relative offset
  $my_hex_string3 = { E8B7FFFFFF }
condition:
  $my_hex_string1 at 0 and $my_hex_string2 and $my_hex_string3
}

After sending this to him he modified it to do the first 3 byte check with the uint functions.

Here is the modified version.

rule DKMC_Picture_File {
   meta:
      description = "Detects DKMC encoded bmp file with shell code"
      author = "David Ledbetter @Ledtech3"
      author = "Florian Roth @cyb3rops" // modified first 3 bytes to be detected as Uint.
      reference = "http://github.com/Mr-Un1k0d3r/DK …"
      date = "2019-27-02"
   strings:
      $my_hex_string2 = { 31D981F9 }
      $my_hex_string3 = { E8B7FFFFFF }
   condition:
      uint16(0) == 0x4d42 and uint8(2) == 0xE9 and
      $my_hex_string2 and $my_hex_string3
}

I’m not sure if it is faster or not but both do find the sample I have.

A search on Hybrid Analysis didn’t find anything using the Yara rules.

A retro hunt by Florian Roth @cyb3rops on VirusTotal resulted in several hits for this rule.

Here is the Pastebin of the found hashes here .

Well, that is it for this time. I hope you learned as much as I did.

Posted in Malware, PowerShell, security

A deeper look into a wild VBA Macro

This sample comes from Brad Duncan @malware_traffic from his SANS ISC Diary located Here and the files on his blog Here.

For this session I will be using “2019-01-23-example-of-attached-Word-doc-1-of-7” word document.

I ended up looking at this from different directions so that is what I want to try and show here.

The first thing I always do is to look at the file in a hex editor to verify what type of file I am dealing with. Never trust a file extension.

FileHeader

As we can see by the 8 byte file header we are dealing with an OLE file vs. say the XML or the Zipped style or RTF form of a document.

My next step is usually to drop the file into LibreOffice to see if it will even open.
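The same “never trust the extension” check can be sketched in a few lines of Python. The table below only covers the document types mentioned here and is not meant to be a complete list. The names are my own.

```python
MAGICS = [
    (bytes.fromhex("d0cf11e0a1b11ae1"), "OLE compound file"),
    (b"PK\x03\x04", "ZIP container (zipped style document)"),
    (b"{\\rtf", "RTF"),
    (b"<?xml", "XML"),
]


def identify(header: bytes) -> str:
    """Compare the first bytes of a file against known file headers."""
    for magic, name in MAGICS:
        if header.startswith(magic):
            return name
    return "unknown"
```

To use it, read the first 8 bytes or so of the file in binary mode and pass them in, for example `identify(open(path, "rb").read(8))`.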

Here is what the Document looks like.

Document-1

Next let’s look and see if there are any macros available. Sometimes no macros are detected using this program, so alternate methods / programs need to be tried to verify there are no macros.

So when this first loads even before the “AutoOpen()” Sub, it does a “GetTickCount” call to the Windows API.

VBA-2

Since we are here let's take a closer look at this function.

TickCount

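The “#If VBA7 Then” is what caught my eye. According to This question on StackOverflow it is a compatibility check used to support 64 bit Office. Strictly speaking, VBA7 is true for Office 2010 and later (32 or 64 bit), and the separate Win64 constant is the one that indicates 64 bit Office.

Another odd thing I noticed was that when you click from the Module1 tab at the bottom to the ThisDocument tab and then back, the function name changes to the AutoOpen one.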

NameChange

So now we can use the “Save Basic” button to save this Module1 as a “.bas” file to take a closer look.

But let's go further. Now that I have the Decalage @decalage2 and Didier Stevens @DidierStevens tools installed, let's see what they tell us.

We start with olevba.

OleVBA

As we can see here it outputs the macro for us and also gives us more information about what happened when it was checking it including the decoded IP Address.

Not all of the information in the box is “always” correct, so you may need to verify it.

Now let's take a look with oledump.py. We start with the basic command to see what streams are in here.

OleDump-1

We can see in stream 7 there is an upper case “M”. That lets us know that there is code in the macro. So let's look at that.

oleDump-Stream-7

That looks like the data is compressed, so let's add the -v switch to decompress this stream.

OleDump-Stream-7-Decompress

Now that is much better. We can now output that to a text file and take a closer look in our favorite text/ code editor tool.

Let's look at 1 more method before we dig in deeper to how the rest of the code works.

I’ll use 7Zip to decompress the document and we see the folder/ file system.

Unzip-1

Let's dig into the Macros folder and see what we have.

Unzip-2

We have files and a folder. In the VBA folder we have this.

unzip-3

Now here is what I’m looking for. Let's take a look at module1 in a hex editor.

Unzip-4

We can see here that there is some plain text but this “Stream” is compressed.

Before I learned how to use oledump.py I had wondered how you extract the data in this file/stream.

I had read This article in the past but didn’t understand everything it was telling me.

But using the code provided there, with some modifications, I was able to build a tool to decompress the single stream. I wrote the tool mainly to “try” and understand how the encoding/compression worked.

tool-1

So that now gives me one more way to extract the macro(s) from the document.
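To give a rough idea of what such a decompressor does, here is a minimal Python sketch of the compression scheme as I understand it from the MS-OVBA documentation. This is not the code from my tool or from the article; the function name `decompress_vba` is just something I made up for illustration. It also assumes you hand it only the compressed part: in a real module stream the compressed source starts at an offset recorded in the dir stream, and that lookup is not shown here.

```python
def decompress_vba(container: bytes) -> bytes:
    """Sketch of MS-OVBA decompression for one compressed container."""
    if not container or container[0] != 0x01:
        raise ValueError("missing 0x01 signature byte")
    out = bytearray()
    pos = 1
    while pos + 2 <= len(container):
        # 2-byte little-endian chunk header: low 12 bits = size - 3, top bit = compressed
        header = int.from_bytes(container[pos:pos + 2], "little")
        chunk_size = (header & 0x0FFF) + 3          # total size, header included
        compressed = bool(header & 0x8000)
        chunk_end = min(pos + chunk_size, len(container))
        pos += 2
        chunk_start = len(out)
        if not compressed:
            out += container[pos:chunk_end]         # raw chunk, copied as-is
            pos = chunk_end
            continue
        while pos < chunk_end:
            flags = container[pos]                  # one flag bit per following token
            pos += 1
            for bit in range(8):
                if pos >= chunk_end:
                    break
                if not (flags >> bit) & 1:
                    out.append(container[pos])      # literal byte
                    pos += 1
                    continue
                token = int.from_bytes(container[pos:pos + 2], "little")
                pos += 2
                # The offset/length bit split depends on how much of this
                # chunk has been decompressed so far.
                diff = len(out) - chunk_start
                bit_count = max((diff - 1).bit_length(), 4)
                length = (token & (0xFFFF >> bit_count)) + 3
                offset = (token >> (16 - bit_count)) + 1
                for _ in range(length):
                    out.append(out[-offset])        # byte-by-byte so overlaps work
    return bytes(out)
```

The hand-built container `01 03 B0 02 61 03 00` (one literal “a” followed by a copy token for 6 more bytes) should come back as seven “a” characters, which is a quick way to sanity check the token handling.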

I also installed and ran ViperMonkey today to see how it works, since I have never tried it before.

Viper-1

Viper-2

As we can see here it also extracted the script but seemed to have a problem with the VBA7 code.

Here is a list of the commands I used for olevba and oledump.

All commands are run from a CMD prompt opened in the folder where the document is located. (Shift + right click on the folder, then select “Open command window here”.)

CmdsUsed

Let’s dig into this code some more because it is crazy.

The first part of the code you can see in the screenshot of my tool above is just a large block of junk comment data.

If we start checking for references to the variables declared before the “AutoOpen()”, we find that several are never used, so they are most likely just junk filler to make the code harder to read.

CodeStart-b

This code uses a series of “Val” and “Len” calls all the way through. Even once we convert those calls to values, we still have to do the math for each line.

So I wrote a tool to understand how “Val” works. This Link will give you an idea.

Val-1

As we can see, it takes that string and returns the numerical value. It reads the number from the start of the string and stops at the first character that is not part of a number. The string could also have started with “&H” for hex or “&O” for octal.
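Here is a small Python sketch of how I understand “Val” to behave, based on the documentation rather than on my tool. The name `vba_val` is made up, and this is a simplification: it skips spaces, tabs and linefeeds, handles the “&H” and “&O” prefixes, and reads a plain decimal number. It does not handle exponents, and it does not model the signed integer rules real VBA applies to some hex values.

```python
import re

# Val ignores blanks, tabs and linefeeds anywhere in the argument.
_STRIP = str.maketrans("", "", " \t\n")

def vba_val(text: str) -> float:
    """Simplified sketch of VBA's Val(): numeric prefix of a string, else 0."""
    s = text.translate(_STRIP)
    m = re.match(r"&[Hh]([0-9A-Fa-f]+)", s)
    if m:
        return float(int(m.group(1), 16))   # hex prefix (treated as unsigned here)
    m = re.match(r"&[Oo]([0-7]+)", s)
    if m:
        return float(int(m.group(1), 8))    # octal prefix
    m = re.match(r"[+-]?(\d+\.?\d*|\.\d+)", s)
    return float(m.group(0)) if m else 0.0  # stop at the first non-numeric character

print(vba_val("2457abc"))   # digits up to the first letter
print(vba_val("&H1F"))      # hex form
print(vba_val("junk"))      # no leading number
```

So a string like “2457abc” comes back as 2457, and a string that does not start with anything numeric comes back as 0.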

We know “Len” is the length of the string, so that one is fairly easy. The hard part is parsing this code and replacing each call with its numeric value.

My tool still has a bug or two but will parse this well enough for us to get a better idea of what this is doing.

Tool-2
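The replacement step can be sketched in a few lines of Python. This is not my tool, just a rough illustration: `fold_calls` is a made-up name, the sample line is invented, and it only handles calls with a single plain string literal (no doubled quotes inside the string, no “&H”/“&O” forms).

```python
import re

# Matches Val("...") or Len("...") with one plain string literal as the argument.
CALL = re.compile(r'\b(Val|Len)\(\s*"([^"]*)"\s*\)', re.IGNORECASE)

def _leading_number(s: str) -> int:
    """Very reduced Val(): leading decimal digits only, 0 if there are none."""
    m = re.match(r"\s*(\d+)", s)
    return int(m.group(1)) if m else 0

def fold_calls(line: str) -> str:
    """Replace each Val("...") / Len("...") call with its numeric result."""
    def repl(m: re.Match) -> str:
        name, arg = m.group(1).lower(), m.group(2)
        return str(len(arg) if name == "len" else _leading_number(arg))
    return CALL.sub(repl, line)

# Invented example line, not taken from the sample:
print(fold_calls('x = Val("12abc") + Len("hello") - 3'))
```

After this pass each line is plain arithmetic, which still has to be evaluated to get the final values.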

 

Now that some of the extra obfuscation is out of the way we can look closer at what we have.

After going through and doing the math by hand we see this. The lines with “****” next to them are where the two main values are reset to a new value.

Math-1

At the end of the lines I also calculated the values for the “Left”, “Mid”, and “Right” calls. These are used to take the substring in each of those functions, and the output gets appended to the final string that gets run in the “Shell” command at the end.

If we zoom in on these values we can see they are only taking a few characters from each string.

The first number (green text) is the position to start taking from, and the second is the length to take.

SubStr
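For anyone more used to Python slicing, here is a small sketch of what those three functions do. The helper names and the junk strings are made up for illustration and are not from the sample. The thing to remember is that “Mid” counts positions starting from 1.

```python
def vba_left(s: str, n: int) -> str:
    """Left(s, n): the first n characters."""
    return s[:n]

def vba_right(s: str, n: int) -> str:
    """Right(s, n): the last n characters."""
    return s[-n:] if n > 0 else ""

def vba_mid(s: str, start: int, length: int) -> str:
    """Mid(s, start, length): 'length' characters from 1-based position 'start'."""
    return s[start - 1:start - 1 + length]

# Invented junk strings: only a few characters of each are kept and appended.
built = ""
built += vba_left("cmqzxw", 2)      # keeps "cm"
built += vba_mid("xxd.eyy", 3, 3)   # keeps "d.e"
built += vba_right("zzzzxe", 2)     # keeps "xe"
print(built)
```

Each line contributes only a small piece, which is why the full command is never visible anywhere in the macro source.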

If we keep scrolling down we can see the IP that gets called out to.

ip

We also see towards the bottom this interesting code.

bottom-b

We can see where it will possibly insert a break or clear formatting.

The GetTickCount call seems to me like it might be some type of anti-debugging check, or just another time waster. (Without verifying, you would think the tick count would always be greater. Tick count Explanation)

If it is less than 1.2, it will change the output value to a garbage string that will get run by the “Shell” and fail.

Now the “Shell”, which will run what got put back together.

Shell

The first part of that, before the “+”, is just junk code. It doesn't do anything that I could find. The Shell call is passed the rebuilt string and a numeric value. (I didn't do the math all of the way through.)

Now that we have a really good idea of how this works, how do we output this so we can see what the final string is before it gets executed?

I tried to open it in LibreOffice and modify the macro code, but that didn't work.

After building a new clean VM, I installed a copy of Office Personal in there.

Let's see what it looks like in the real Office.

Office-1

We already have a pretty good idea of how this macro works, so let's open it up, make some changes, and save them.

I'm not sure if it would make any difference, but let's comment out the section that checks for “is wow64”.

Off-Mac-1

Let's also make a change to the GetTickCount part to make sure it is not an issue.

Off-Mac-2

We change the value to greater than the “1.2008” that gets checked later on.

And the final change is to the “Shell”. Let's replace that with a MsgBox call instead.

Off-Mac-3

And after saving the changes and clicking “Enable Content” we get this.

Off-Mac-4

We can then left click on the MsgBox and hit Ctrl+C to copy the data and then paste it into Notepad.

FinalCmd

One strange thing that happened: when I clicked “OK”, the document looked like this afterwards.

Off-Mac-5

What happened here?

ClearFormatting

It looks like there is code here for clearing the formatting and the image.

Up higher it looks like this would work for an Excel sheet also.

And when we go to close it, it just asks us if we want to save the changes.

Off-Mac-6

So the macro calls out to the IP with a random 7-character string and “.jpg”.

The function picks a value between 97 and 122, which is the ASCII code range for lower case letters. Each random number is converted to its lower case letter and appended to the result, until the final length is 7.
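The same idea in Python looks something like this. It is only a sketch of the logic described above, not the macro's code, and the function name `random_jpg_name` is made up.

```python
import random

def random_jpg_name(length: int = 7, rng=random) -> str:
    """Build a random lower case name from ASCII codes 97..122, then add '.jpg'."""
    letters = [chr(rng.randint(97, 122)) for _ in range(length)]  # 97 = 'a', 122 = 'z'
    return "".join(letters) + ".jpg"

print(random_jpg_name())  # different on every run
```

Because the name is random on every run, the file name itself is not much use as an indicator; the IP address is the stable part.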

So that is that for the Decoding part.

The next problem was that after enabling the content, it would not “Un-Enable” no matter what the settings were.

So what is the problem with that?

After enabling the content once, the file becomes a “Trusted Document”. The problem is, how do you un-trust it again?

We have to go to File –> Options –> Trust Center –> Trust Center Settings (Button) –> Trusted Documents –> Clear all Trusted Documents …… (Button “Clear”)

TrustedDoc

I'm not finding a way to see a list of what is trusted, or even whether there are any trusted documents. Perhaps there is a screen I'm not seeing somewhere.

Also, I don't know if this is a bug or not, but “Allow documents on a network to be trusted” seems to automatically recheck itself after I uncheck it, close the document, and reopen it.

So I used the Mantra “When in doubt run Procmon” to locate where these are.

Procmon 

I first set a filter for “Category is write”, then looked for the string “Trust”. Once I found this registry key, I added a “Begins with” filter on the key, removed the other filter, and got the above view.

And If we look at it in the registry.

RegEdit

And if we Dump the Key.

RegDump

I did a hash calculation of the document after it was saved, so I have another question: what is the hash they are using?

CompHash

I'll also have to figure out the time format.

Once we clear these (By clicking the Clear Button)  this key will be deleted and the Documents will no longer be trusted.

Well that’s it for this one. I hope you learned as much as I did.
