Extracting Shellcode from VBA to PowerShell

This post walks through using my tools to extract the VBA code, clean up a base64 string that is split across multiple lines, decode it to a PowerShell script, extract the shellcode from that script, and finally get the IP/URL from the shellcode.

The Twitter link where this came from can be found Here . The file we will be looking at is found Here.

The first thing we need to do is get a copy of the vba from the site.

Site-1

We can click on the copy content button in the upper right-hand corner to copy it to the clipboard, then paste it into our favorite text editor.

Script-1

Just from looking at this, it appears to build a PowerShell script.

Script-2

Here at the bottom of the script we can see that “stringFinal” is the rebuilt PowerShell script that will base64 decode to “Something”. It will then run the PowerShell with Shell.

The next question is how do we easily rebuild this base64 string.

In this Link to twitter I was asking people about a Reg-X solution. There were several replies and even a method to make some changes and let it extract itself.

This post is aimed at statically decoding with  just my tools. It is just a way to demonstrate how the tools work.

Since the strings are reassembled in order, rather than reassembling by hand we can use Reg-X to clean the base64 string so we can decode it without having to run it.

If you view the link above you can see part of the thread where Malwrologist @DissectMalware has some screenshots on how to reassemble the base64 string using Notepad++ and 2 different regular expressions, which was my original goal.

twitterRegX1

There are also several different suggestions in that thread.

Recently I had built a new tool that does Reg-X replace for a script. The twitter link for that is Here.

There they used Reg-X to decode strings.

So lets try this new tool using the 2 step process.

Reg-x-1

Using a combination of Reg-X patterns, we start with

string[0-9]{1,} = \"

It will clear the variable name with its number, up through the first ".

Reg-x-2

So now we take the output from the first Reg-X replace and put it into the input for the next round using this pattern.

\"\r\n

That will clean the closing " and the newlines, thus reassembling the base64 string. It will leave the final " in the string, so that will need to be removed before inputting this into the base64 decoder.
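For anyone without the tool, the same two passes can be sketched in Python. This is only a rough sketch under a few assumptions: the variable names follow the string1, string2, ... pattern matched above, the sample input is made up, and the final decode assumes the payload is a PowerShell encoded command (UTF-16LE text).

```python
import base64
import re


def rebuild_base64(vba_text: str) -> str:
    """Reassemble a base64 string that was split across VBA assignments."""
    # Pass 1: drop the leading 'stringN = "' from every assignment line.
    step1 = re.sub(r'string[0-9]{1,} = "', "", vba_text)
    # Pass 2: drop each closing quote together with the line break after it.
    step2 = re.sub(r'"\r?\n', "", step1)
    # The last line has no line break after it, so one quote is left behind.
    return step2.rstrip('"\r\n ')


def decode_powershell(b64: str) -> str:
    # Assumption: the text was encoded the way PowerShell -EncodedCommand
    # expects it, i.e. UTF-16LE.
    return base64.b64decode(b64).decode("utf-16-le")


# Made-up sample in the same shape as the VBA in the screenshots.
sample = 'string1 = "ZQBj"\r\nstring2 = "AGgA"\r\nstring3 = "bwA="'
rebuilt = rebuild_base64(sample)
print(rebuilt)
print(decode_powershell(rebuilt))
```

If the decoded text comes out as garbage, the payload is probably not UTF-16LE and the raw bytes from base64.b64decode should be inspected instead.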

Sc-1

Here they are still using PowerShell to load the Hex encoded shellcode.

sc-2

Finally we can highlight and copy/paste just the hex encoded shellcode, click a button, and if nothing goes wrong we get the IP/URL it is calling out to.

Note: This tool does not work on the types of shellcode that load calc.exe or another executable; those are a different format.

We can also check to see what APIs are found in the shellcode.

sc-3

Notice the checkboxes up top, those can be unchecked to clean up the not found output.

sc-4

With some practice this can be extracted and decoded within a few minutes.

That’s it for this one.

Thanks for reading if you got this far.

 

Links:

Link to original Twitter message.
Link to file.

Link to Twitter thread about Reg-X.

Link to Twitter about script the Reg-X tool was built for.

Link to Github for the Reg-X tool.
Link to Github for Remaining tools for getting the IP and API’s used.


More adventures with shell code and the Shikata Ga Nai Encoder

The other day I was given a sample vbscript file by Paul Melson  @pmelson  so I could take a look at the odd shell code in it.

Here is the original script.

Script-1

This starts out as a normal script running PowerShell to do a base64 decode. The next level was Gzip base64 encoded then we get to the 3rd layer.
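Peeling the second layer can be sketched in a couple of lines of Python. This is a minimal sketch that assumes the layer is a plain gzip stream wrapped in standard base64; the sample input below is made up.

```python
import base64
import gzip


def peel_gzip_layer(b64_text: str) -> bytes:
    """Base64 decode the text, then gunzip the resulting bytes."""
    return gzip.decompress(base64.b64decode(b64_text))


# Made-up stand-in for the string found inside the script.
layer = base64.b64encode(gzip.compress(b"$next = 'third layer'")).decode("ascii")
print(peel_gzip_layer(layer))
```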

Script-2 

This looks like a standard shell code loader. But when I tried to run it thru my tools I discovered that it is Shikata Ga Nai encoded.

Script-3

In a previous blog post Here I go in a little deeper on how this Shikata Ga Nai encoding works. I had to modify my decoder for this sample to extract the final shell code.

My normal way of decoding this is to use CyberChef to find the start key, input all of the shell code bytes into my tool, and remove 1 byte / 2 chars at a time by hand until I get the decoded shell code. Doing that here, I was not finding the expected decoded output.

cc-1

CC-Notes

The highlighted bytes are the key in byte order. My tool will reverse them to be used.

My next step is to fire up a VM and run the extracted shell code thru SCDbg and see if it returns anything. SCDbg only works with 32 bit shell code.

Here is what I saw.

Scdbg

If all I wanted to do was extract an IP then I could stop here.

This tells me it does decode but not how it decodes.

I next tried to run it thru psxray located Here but it didn’t fully decode the shell code.

The next thing I tried was BlobRunner, located Here, in conjunction with x32dbg.

I then began to suspect that this may have been encoded multiple times or by more than 1 encoder.

After a Google search I found This blog post that tells me there is an option in  Msfvenom to encode the shell code with multiple rounds.

This verified my theory of multiple layers. But how does it work? How do you peel off the layers?

Let’s start by comparing the encoded to the decoded file in the hex editor.

shellcomp-b

Here we can see that the decoded and encoded file are the same to offset 0x25.

That tells me that the encoded bytes start there.

The other question is, how do I know whether I have it properly decoded? With this type of encoding it is not easy to tell just by looking at the decoded value. So I added a search on the return value of the decoder. If the result contains the bytes for “FNSTENV” (0xD97424F4), then I have hopefully found the next decoded layer.
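That check is simple to sketch in Python. The function name here is made up for illustration; the marker bytes are the ones given above.

```python
# The four bytes searched for in the decoder output (D9 74 24 F4).
FNSTENV = bytes.fromhex("D97424F4")


def looks_like_next_layer(decoded: bytes) -> bool:
    """Return True when the decoded buffer contains the FNSTENV bytes."""
    return FNSTENV in decoded


print(looks_like_next_layer(bytes.fromhex("90D97424F45B")))
print(looks_like_next_layer(bytes.fromhex("9090909090")))
```

A hit is only a hint that a decoder stub was uncovered, not proof, so the result still needs to be checked by eye.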

Testing this by only inputting the bytes 0x25 onward and the start key from CyberChef we see this.

FoundNewLayer

Once we click ok it will write to the output to keep from over running the decode.

newkeys

Here we can see that for every 4 decoded bytes a new key is generated for the next 4 bytes.

CompBytes-2

So we take a large enough sample from the decoded section to compare to the known decoded shell code from SCDbg and we see that it matches up with the starting offset at 0x25.

So the next theory is to take the output from the decoder, drop it into CyberChef to get the next key, and then move the output of the decoder to the input.

At this point I’m not doing any pointer math to try and figure out where the next part starts. We could also copy the bytes at offset 0x25 to a new file, then compare the output to it and see where the difference is, but I’ll do it the easier way of just adding a shift button to the tool.

It will shift / remove 1 byte (2 chars) from the left giving it a new point to start decoding at.
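The decode-then-shift loop can be sketched in Python as follows. This is not the code from my tool, only a rough sketch under assumptions: it assumes the common Shikata Ga Nai feedback where each 4 byte block is XORed with the key and the decoded dword is then added to the key, and all function names are made up for illustration.

```python
import struct

FNSTENV = bytes.fromhex("D97424F4")  # marker bytes named in the post


def key_from_bytes(key_bytes: bytes) -> int:
    # The key bytes appear in byte order in the stub, so read them little-endian.
    return int.from_bytes(key_bytes, "little")


def sgn_decode(data: bytes, key: int) -> bytes:
    """XOR each dword with the key, then derive the next key from the result."""
    out = bytearray()
    usable = len(data) - (len(data) % 4)  # ignore a trailing partial block
    for offset in range(0, usable, 4):
        (encoded,) = struct.unpack_from("<I", data, offset)
        decoded = encoded ^ key
        out += struct.pack("<I", decoded)
        # Assumed feedback step: new key = old key + decoded dword (32-bit wrap).
        key = (key + decoded) & 0xFFFFFFFF
    return bytes(out)


def find_layer(data: bytes, key: int, max_shift: int = 64):
    """Drop one byte at a time from the left until the marker shows up."""
    for shift in range(max_shift + 1):
        decoded = sgn_decode(data[shift:], key)
        if FNSTENV in decoded:
            return shift, decoded
    return None
```

find_layer returns the number of shifts and the decoded bytes for the first hit, or None if nothing was found within max_shift. The first hit is not guaranteed to be the right one, so compare the result against known decoded bytes before trusting it.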

Layers

While testing this theory of inputting the output from the previous run I decided to keep a log of each layer.

The enc len was a guess based on the first 2 outputs; I never went back to verify it against the numbers in the CyberChef results.

Each layer needed to be shifted 20+ times before it found the decoded output when you just input the output.

One odd thing I found around level 5-6 was it pops up the found message at 12 shifts but the CyberChef looks a little strange compared to the others and the first decoded bytes were not in the known decoded file like before.

CC-12

After some trial and error I decided to continue shifting to see if the message popped up again. It did at around 28 shifts.

cc-28

In the short shifts we don’t have a “MOV” to get the key but we do in the 28 shift one.

Continuing on, this process was repeated for 10 rounds: dropping each round's decoded bytes into CyberChef to retrieve the next key, and then shifting until we get the message that there is another round or that the decoded shell code was found.

FinalDecode

We can verify that it decoded correctly by dropping it in the other tools.

So if we look up the term “Shikata Ga Nai” we see it means “it cannot be helped” or “nothing can be done about it”.

Something can be done about it, now that we have a better understanding of how it works.

After writing this post I decided to move the code that did the checking and pop up a message box after the code that writes the output to the output text box. That way it will already be present when the message box pops up so a screenshot can be taken then.

I also considered adding a counter to keep track of the shifts but decided not to for now.

That’s it for this one, thanks for following along. I hope you found it as interesting to read as I did to research it.

Hash for Script
Sha1 : 94659C6520CFBDCA3CFECDA7781CED15659B0687
Sha 256 : 8B5366D58D00CBA37DB8D1E1CCDD1C767F730EA197476A736EB8FEED43B8FCBC
MD5 : 2B0324C016BD023EC1405007A7DCD6A1

Hash for Shell code

Sha1 : 8E1B4CFC4B1146C332AE6D4C5F9C86C242574370
Sha 256 : F8BDB9D9CE545075F483F4F1F919560EEE0108E313121DD2B891BFDF31A65DCE
MD5 : 9F88A4BBAFF1B8F530EE29F7226B3338

 

Links:

Link to VT for Shell Code
Link to Previous blog post on this type of encoding.
Link to SCDbg
Link to PSXray
Link to BlobRunner
Link to blog post on Msfvenom
Link to my Github with tools


A quick look at the current emotet encoding

I have gone thru several samples of this type of encoding today, but today's sample is from ExecuteMalware @executemalware, located here, and the Twitter reference is here.

Here we can see that only 3 of the urls are displayed.

Anyrun-1

Emotet usually has 5 urls, so where are the others?

When we check the system, they are now dropping a .jse file instead of PowerShell.

Going thru my samples, we find that they have used this style before. Here is 1 reference from Twitter here.

If we are in a hurry we can get the script from anyrun.

We click on the winword.exe and see this.

Anyrun-2

Then click on the more info to see this.

Anyrun-3

Find and click on the jse file to see this. There could be several files to scroll thru before you see the jse file.

anyrun-4

Although this file is labeled as a .jse file it is not “Java Script Encoded” as it should be with that extension.

From previous research on the JSE and VBE encoding if the script engine does not find the header values for the encoding then it will attempt to run it as a normal script.

The script is unformatted to start with so lets pretty it up and take a look at the parts we need.

The way this works is that it will take the array at the top, rotate it so many positions, and build a new array. In the screenshot we see the value that it will use highlighted.

Script-1

This has looked pretty much the same in every version of this I have worked with.

The difference is that sometimes they will also “\x” hex encode everything to make it more difficult to tell what it is doing.

The next thing we need to look for is the index and function that will be replaced when the script is run.

Script-2

If we just see the function name and an index value then the array will usually just get base64 decoded. Then from the new array that was built it will use this index in the array to do the replacement.

In this case there are 2 values, and from experience I know this is a RC4 key for decoding the base64 – RC4 encoded values.

Here is the decode function, so we can verify whether it is using Mod ( % 0x100 ) or an AND in the decoding.

Script-3

If we scroll down they were nice enough to give us 3 of the urls plain.

Script-4

So in order to extract the last 2 urls you need to:

1: Reorder the array using the provided value.
2: Get a list of indexes and the key values.
3: Use a for each loop of some kind to base64 decode -> RC4 decode the entry for each index value, and output this to a decoded indexed list.
4: Locate and extract the last 2 urls.
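The steps above can be sketched in Python. This is a rough sketch and not the code from my tools: the function names are made up, the rotation is assumed to be a simple left rotation (the count can be off by one depending on how the script's loop is written), and the decoded text is assumed to be plain ASCII.

```python
import base64


def rotate(items, count):
    """Step 1: move the first element to the end, `count` times."""
    if not items:
        return []
    n = count % len(items)
    return items[n:] + items[:n]


def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling followed by the keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def decode_entries(array, rotation, calls):
    """Steps 2-3: for every (index, key) pair, base64 decode then RC4 decode."""
    rotated = rotate(array, rotation)
    decoded = {}
    for index, key in calls:
        raw = base64.b64decode(rotated[index])
        decoded[index] = rc4(key.encode("latin-1"), raw).decode("latin-1")
    return decoded


def find_urls(decoded):
    """Step 4: keep only the entries that look like urls."""
    return [text for text in decoded.values() if text.startswith("http")]
```

decode_entries takes the raw array from the top of the script, the rotation value, and the list of (index, key) pairs, and returns a dictionary of index to decoded text.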

Using my tools lets see how this works.

First of all, the function name for decoding  is just “b(“. They are usually like “_x4E349(” or something similar.

So lets rename this to make sure my simple tool does not get the wrong thing.

Script-5

That is simple enough to make sure I don’t get something else with a small case b.

Script-6

Now we have a list with the index number and the key value.

Script-7

This tool will do the reorder just before it tries to decode each value.

Now we can get the last 2 urls from the file.

Now that we have the decoded list, if you are really motivated you can do the replacements and get a better understanding of what the script is doing.

One of the old tricks used to keep scripts from running was to change the program associated with the script extension.

Due to the limited use of the JSE and VBE file extensions, it “may” be safe to set the default program for these extensions to Notepad.

This should stop these from running and also alert the user that something is not right.

That’s it for this one.

If you have any questions you can reach me on Twitter at @Ledtech3.


Chasing malware down the rabbit hole to see where it goes.

Lets start this journey with the blog post by Pondurance  titled “777 RANSOMWARE COMBINES WITH TRICKBOT” located here.

There is not a whole lot here, but it describes 2 layers of shellcode and some indicators, the first of which is the URL “hxxps://fearlesslyhuman[.]org”.

This URL seemed familiar but upon looking it up I was having difficulty finding very much information on it. The first search led me to Hybrid Analysis where we find this calling out to a /boot URL.

HA-1

It is also only labeled as “no specific threat” .

If we scroll down we see there is a PowerShell script but it is split up in 2 areas of the strings section and no download is available for this script. So let’s just extract it from the strings section.

HA-3

Here is the extracted script. If we look close it is base64 encoded GZipped.

Lets extract that and see what we have.

HA-4

As with the original blog post that started this run we have base64 encoded data that is also Xor’d with decimal 35.

You could probably make up a CyberChef recipe to do all of the steps I’m going to do with my tools.

Base64 decode to byte –> Xor bytes by Decimal 35 to get Clean shellcode.
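The same two steps can be sketched in Python for anyone not using my tools or CyberChef. The function name and the sample input are made up for illustration; the key of decimal 35 is the one from the script.

```python
import base64


def decode_shellcode(b64_text: str, xor_key: int = 35) -> bytes:
    """Base64 decode to bytes, then XOR every byte with the single-byte key."""
    raw = base64.b64decode(b64_text)
    return bytes(b ^ xor_key for b in raw)


# Made-up sample: the bytes FC E8 XORed with 35 and then base64 encoded.
sample = base64.b64encode(bytes(b ^ 35 for b in b"\xfc\xe8")).decode("ascii")
print(decode_shellcode(sample).hex())
```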

HA-5

One thing my tool does not extract is the latter part of the url highlighted by the red box. Also as seen in the hex editor.

HA-6

Let’s also take a look at what API’s are found in this Shellcode.

HA-7

So our full url that this is calling out to is fearlesslyhuman[.]org/FSkX .

Unfortunately this does not contain the next level of download. The hunt continues.

Our next stop is urlscan.io  Here to see what it can tell us.

URLScan-1

We see 8 hits and as of 10 days ago it appears to be down. It also appears to be the Same IP in the ones that connected.

Here on Virus Total it gives us a little more information but not a lot.

VT-1

Searching on https://app.any.run for our URL we see several hits.

AnyRun-1

If we go thru this list we are not finding anything special. Thanks to @James_inthe_box for locating this sample for me that is different than what we see in this list. Sample Here .

I’m not sure why it is not showing up in this list.

AnyRun-2

Looking at the traffic we can see there is more going on here than in the other sample in the list.

Lets download the PowerShell script and extract the shellcode from this.

AnyRun-3

As we can see here this script is the same as the last one we pulled apart so lets extract the shellcode and see what it tells us.

AnyRun-4

Here we see it is using a “/HLnZ” path instead. Where did we see that?

AnyRun-5

As we can see here, it is tagged as binary, so lets download that and see what we have.

AnyRun-6

And the Hash Information from Anyrun

Hashes
MD5     7E8AF84B1CB9E43F1A66D385A63C9EAB
SHA1     C7651DD95BA7D82D1B8593E5EB5F0454AFF8373A
SHA256     17CB1CCC53B52E0EE31514673F4962E673280E505324739873542D541200120C
SSDEEP     6144:5RAY+7omj6nn5QEj+8vnLDckHWgvgV8Cm:92oo6n5va8PMS9C8Cm

Here we find it on VirusTotal by the hash, with no detections.

Lets Look at this binary data in a hex editor.

AnyRun-7

As we can see here it starts with an “FC”, which is what shellcode normally starts with.

Lets drop this into CyberChef and see what it tells us.

CCError

That is not good, it throws an error. Lets just try the first part of it then.

CC-2

Well that worked but still does not tell me much.

Next stop, fire up the VM and load this into SCDbg.

SCDBG-1

I’m a GUI person, so I used that instead of the command line.

SCDBG-2

As in the blog post referenced in the beginning we can see many calls to API functions and SCDbg also drops 2 files for us.

SCDBG-3

SCDBG-4

We can see this is a decoded PE file from the shellcode, but it appears that parts of the PE header got stomped after it was loaded into memory.

SCDBG-5 

This version that was dropped still has the “MZ” in it.

SCDBG-6

Lets remove the decoding shellcode from the start and then we have a decoded version of the binary.

Note: this is designed to be loaded and run from the original PowerShell shellcode.

Looking at this shellcode and the resulting executable got me wondering how it gets decoded.

What can we use? Possibly a shellcode-to-exe utility, and then load and/or run it in IDA or in x86dbg.

After chatting with @herrcore about loading shellcode to be able to view it in a debugger, he pointed me to a tool called BlobRunner, Here. There are prebuilt binaries and the source code, so you can build it yourself.

There is also a video that goes with it Here but they are using IDA Pro (The hard way) to view the blob.

At first I had trouble figuring out how to use it, so used a smaller sample file to get a feel for it.

The steps to run it.

Copy the binary for blobrunner and the shellcode into a folder on your vm.

Open a command prompt from the folder (so you don’t have to use full paths)

At the command prompt, type the name of the blobrunner executable, a space, and the name of the shellcode file.

In this case it will be  blobrunner.exe "HLnZ.bin" (I used double quotes in case a filename has a space)

After entering the command and hitting enter we see this.

BR-1

Next we Open X86 Dbg and attach to the blobrunner process.

BR-2

Next we go back to the command window and look for the Entry value.(or you can do it first)

br-3-b

Then back to X86Dbg and open the Memory Map section and look for that address.

BR-4

I double clicked on that address and went to the place in the CPU tab where it is.

BR-7

Here we are at the beginning of the shellcode. Set a break point here then go back to the cmd window and hit a key to start it running again.

Next hit run in the debugger and it will break at that break point. You can then start stepping thru from there to see what it is doing.

Note: If you hit run in the debugger first without setting the breakpoint first it will get away from you. (I’ve Done it)

Going thru this there are a couple of jumps to set things up, Notice the second “Call” at “B”

Now look what happens as we step thru and make some jumps.

BR-5

BR-6

Notice that the assembly changed after making the jumps, and the real odd part is if you scroll back up it returns back to normal and the same as what CyberChef shows.

BR-8

If we set a breakpoint at the pop ebp (0x33) and run it until there (after the loop is complete), it will self decode. You can then just select and copy all of the bytes here to a hex editor and clean out the beginning shellcode, or you can use the follow in memory map option and dump it to a file from there.

BR-8A

That is how to do it that way.

But I’m still curious on how the algorithm works to decode.

I then stepped thru it enough times until I was sure I had it down on what it does.

Tool-1 

It is a little slow because it is dealing with a lot of data and also formatting the input string /hex from a text box. Large amounts of data in a text box is always slower than importing the data straight from a file and working with it that way.

And finally here is how this Works.

HowToDecode

That 34200 is a counter that gets reduced by 4 every round.

I am assuming that it is also a length value.

And 1 final thing. Does it run in Anyrun  Here ?

ExtracyedBinAnyrun

Nope.

That is it for this one I hope you learned as much as I did.

Links

Anyrun Links:
Link to Good run
Link to Extracted File

BlobRunner:
Link to Download
Link to Video

Hybrid Analysis:
Link to Report

UrlScan.IO:
Link to the report for URL

Virus Total:
Link to Data on Url
Link to shellcode


A deeper look inside one of the new Emotet Malware Docs

The sample here comes from a quick search supplied by ANY.RUN @anyrun_app  of #emotet-doc to filter quickly on documents you want to look at. Twitter reference Here and the link to the file we are going to use Here.

One of the first things I always do is look at the file in a hex editor to verify what type of file I am dealing with. Never trust a file extension for the type of file.

Format

As we can see here this is a “Zip” style so we can simply decompress it to get the contents out to look at them.

One other thing we notice when we just scroll down in the hex editor is that the script we are after can be seen clearly.

ScriptinFile

If we just search for “var” in the hex editor it will take us right to the beginning of the script and we can just copy it out without even having to run or unzip the file.
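Both checks so far, the file signature and the search for “var”, can be sketched in Python. The function names are made up for illustration, and the only signatures covered are the zip header seen here and the OLE compound file header.

```python
ZIP_MAGIC = b"PK\x03\x04"                        # zip local file header
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")    # OLE compound file header


def sniff_container(data: bytes) -> str:
    """Identify the container by its first bytes, not by its extension."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"


def find_keyword(data: bytes, keyword: bytes = b"var") -> int:
    """Return the offset of the first keyword hit, or -1 if it is absent."""
    return data.find(keyword)


# Usage idea: read the sample from disk (the path is up to you), e.g.
#   data = open(path, "rb").read()
#   print(sniff_container(data), hex(find_keyword(data)))
```

A hit on “var” only gives a starting offset; the end of the script still has to be found by eye.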

On a side note, “var” is a JavaScript keyword and would not normally be found in a VBA file.

Looking at Any.Run we see this.

Anyrun-1

If the run looks like this with the “jse” then it may be this type. Unless they change it after seeing this blog post. (It wouldn’t be the first time)

If we left click on the first process Winword.exe then click on the More Info we see this.

Anyrun-2

AnyRun-3

Here we see the Jse File. Lets click on that.

One note about the “JSE” file here. This is not a JScript Encoded file, which is what the scripting environment produces when it encodes a script and what is usually associated with this file extension.

A JSE file looks like this.

JSEFile

 

This is just plain JavaScript that is obfuscated by a tool.

Anyrun-4

Here is our script that we will be working with. So Lets download that.

Now lets look at 1 more way to extract the file.

After we Unzip the file we see

Folder-1

Select the word folder

Folder-2

Now lets look at the vbaProject.bin file in a hex editor.

Vba-1

Here it looks like an “OLE” header, so this is stored in that format.

If we try our search for “var” again we see.

Vba-2

Now we can stop here and just copy the entire text section to your text editor and then clean this up to leave just the script.

Lets go 1 step further in case this gets more obfuscated later.

We can use 7Zip to extract the contents of the vbaProject.bin .

Folder-3

Lets look in UserForm1 folder.

Folder-4

Looking at the size of the “o” object this looks promising. If it is not very large then there will not be much in the files but we need to check them anyway to verify there is not anything useful in it.

This is a binary file so we need to look at it in a hex editor.

vba-3

This is the lowest level we can go to get the script out. We could also use Office and extract it from the textbox or properties box in the VBA tools.

There are also the Decalage @decalage2 python tools Here and a few others.

Here is what the script looks like Normally.

Script-1

That makes it too difficult to see what it is doing, so lets do some JavaScript formatting on a “Copy” to get a better view.

Script-2

This is a very distinct format that has been used for some time now, and if you understand the basics of these then they can be very easy to decode if you want to take the time to build tools to help with the boring parts.

There are basically 3 things we are looking for when we see this format. First we look at the array at the top. It is usually either base64 or \x encoded, and I’ve even seen a modified base64 encoding in past samples that were not Emotet.

The next thing we are looking for is this function just below the array.

Script-3

As you can see it has a push shift function and a value of 0xd3. What this will do is rotate the array 1 place, that many times, in a circular fashion. I have not done the math to be sure, but as an example, if that number happened to be the same as the number of items in the array, it should just go back to where it started from.

The last piece we need is the part with the index numbers.

Script-4

This “b(‘0x0’, ‘ILb*’)” is the index and a key used to decode the base64 string in the array.

If it only has the index number then it either does not need another level of decoding or  the same key is used for everything and you will have to verify /locate it.
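Pulling the index and key out of every call can be sketched with one regular expression in Python. This is a rough sketch that assumes the decoder function is named b and that both arguments are single-quoted, as in this sample; other samples will need a different function name.

```python
import re

# Matches b('0x0', 'ILb*') style calls: a hex index and a quoted key.
CALL_RE = re.compile(
    r"""\bb\(\s*'(0x[0-9a-fA-F]+)'\s*,\s*'((?:[^'\\]|\\.)*)'\s*\)"""
)


def extract_calls(script: str):
    """Return a list of (index, key) pairs in the order they appear."""
    return [(int(index, 16), key) for index, key in CALL_RE.findall(script)]


snippet = "x = b('0x0', 'ILb*') + b('0x1a', 'k)Y2');"
print(extract_calls(snippet))
```

Because b is such a short name, it is worth checking the matches by eye, or renaming the function to something unique first.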

Script-5 

Here is the function where the index value and the key are passed to.

Script-6

Here we see the base64 decode with the atob() function and then the RC4 decode below that. This version uses “Mod 256”; you might run across some code that uses a bitwise “AND” instead (255 is the mask used in some scripts), so just be aware.
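For the non-negative values that show up in the RC4 loop, Mod 256 and a bitwise AND with 255 give the same result, which a short Python check makes clear. The function name is made up for illustration.

```python
def mod_and_mask_agree(limit: int = 0x300) -> bool:
    """Compare x % 0x100 with x & 0xFF for every x below `limit`."""
    return all((x % 0x100) == (x & 0xFF) for x in range(limit))


print(mod_and_mask_agree())
# A mask of 256 (0x100) is not the same thing: it keeps only one bit.
print(300 % 0x100, 300 & 0xFF, 300 & 0x100)
```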

Now that we have a pretty good understanding of how the decoding works, lets decode this.

Script-7

Script-8

Using this tool I had written for the Neutrino EK we extract the whole command and just the index and key and save both to separate files.

Script-9

This is where that value of 0xd3 comes in.

This tool will take the base64 string array, split it, and rotate the entries to the proper place by that value. It will then use the list of Indexes/Keys from the last step to decode the array.

Script-10

If all we want to do is extract the list of urls we could just stop here.

But lets see about doing the replacements in the rest of the script.

Script-11

Here we are lined up with the decoded and encoded.

Although we have done the replacements of the encoded file there are actually more layers.

Script-12

Notice the “\x” encoded characters. Lets fix that.
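Undoing the “\x” encoding can be sketched with a single regular expression in Python. The function name and the sample string are made up for illustration.

```python
import re


def decode_hex_escapes(text: str) -> str:
    """Replace every \\xNN escape with the character it stands for."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


print(decode_hex_escapes(r"\x4e\x6f\x74 Supported"))
```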

Script-13

Does that string look familiar now? It is the error box in Anyrun. (This screenshot was borrowed from the text report section and saved as png)

5c62c466-84cd-4aad-a881-a72f91d4d319

There is still 1 more trick used here that I have previously seen in the pages that led to Angler EK.

Script-14

We have 1 more layer of separated values. As we can see here we have a key value pair where the “eb” is an array of variables, the key is “imKcX” and the value is “Not Supported File Format”.
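Once the key value pairs are known, putting the values back inline can be sketched in Python. This is a rough sketch under assumptions: it assumes the script reads the values with bracket notation such as eb['imKcX'], and the function name is made up for illustration.

```python
import json
import re


def inline_lookups(script: str, table_name: str, table: dict) -> str:
    """Replace table['key'] lookups with the quoted value they stand for."""
    pattern = re.compile(
        re.escape(table_name) + r"""\[\s*(['"])(\w+)\1\s*\]"""
    )

    def replace(match):
        key = match.group(2)
        if key not in table:
            return match.group(0)  # leave unknown keys untouched
        return json.dumps(table[key])  # double-quoted string literal

    return pattern.sub(replace, script)


pairs = {"imKcX": "Not Supported File Format"}
print(inline_lookups("alert(eb['imKcX']);", "eb", pairs))
```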

We can verify that by looking at the screenshot above.

There are several places it will do this.

Anyone that has tried to step thru one of these in a debugger knows how much of a pain and long winded these can be before they finally spit out the decoded page.

And that is if there are no debugger checks to throw you a curve.

One more note. I have seen some samples in the past that used this style that have been run thru this style of encoder twice. So you may need to look close and repeat the process to get it decoded as fully as possible.

That is as far as I’m going on this.  My challenge to you is to build your tools to be able to quickly decode these also.

Mine are too fragile to release for this post.

In conclusion, once you understand the layout of how these decode you can apply that knowledge to the various “Types” you may run across.

Learn it and help make this type of encoding obsolete.

 

If you have any questions please contact me on Twitter at @Ledtech3.


Another Look at the Rig Exploit Kit

It has been a while since I have written up anything on this exploit kit, since it had moved more into the background and I have not seen as many samples as I used to.

It has gone thru many changes since I first saw it and started learning how to disassemble it to learn how it works.

What sparked the interest this time is a series of reversing videos released by Vitali Kremez @VK_Intel and 0verfl0w @0verfl0w_ of SentinelOne . You can find the tweet reference here. https://twitter.com/VK_Intel/status/1172345642796011521

This post will focus on the .saz file from video #3 on Rig EK. @VK_Intel was kind enough to give me a copy of the saz file so I could do this write-up of an alternate method of extracting the exploit code. The method used in the video was the debugger in Google Chrome. I personally dislike that debugger, so I use a copy of IE9 in a VM when I actually need to run html/script in a debugger.

So lets first take a look at the file in Fiddler.

Fiddler

There is limited traffic here to work with. We start with a redirect to the Rig EK landing page which we can see on the right.  Lets extract this page. I generally dump everything as raw to a folder and then look thru it for what I want but Vitali demonstrates another way in the video that is more precise to only extract what you want/ need.

So here is what the landing page looks like. If you zoom in and out it appears as if everything is running together and difficult to understand.

LandingPageNormal

Here it is after doing some JS Pretty / Formatting.

LandingPageFormatted

Now that this is formatted, you should be able to zoom in on the picture and see that there are 4 script blocks on this page. Each has its own code and the exploit it is targeting, and each attempts to download an encoded file to decode and run on the system. The downloaded file / malware will vary with the campaign.

So lets take a closer look at each section to see what they look like.

Section:1

Section-1-Entire

If you look close you can tell this is base64 encoded.

Section-1-End

In past versions of this there were string replacements in the base64 string itself, but now they seem to just put it in the string of letters for the base64 alphabet in the decoder, for whatever reason.

So all we are going to do is extract the base64 string from this section and use a different tool to do the decoding.

Section-1-B64Decode

This section has no tricks, so we can just copy/paste the base64 string into the decoder and decode as UTF-8.

Here is what the whole thing looks like normally after base64 decoding.

Section-1-B64Decoded-Normal

Um, we can do better, so lets format it. Unfortunately the formatted version takes forever to save a screenshot of, so lets just look at the most interesting parts of this one. I will include the decoded files and my tools on my Github.

In previous samples I have pulled apart over the years I have taken the many hours it takes to decode these by hand to see what the functions do and how they work.

This code also looks a lot like what you would have seen in Angler EK.

In this part we see the Shellcode.

Section-1Shellcode-1

Lets copy this shellcode over to another page and then we can look at it in a hex editor.

Section-1-Shellcode-2

Notice the first part of this shellcode here.

Section-1-Shellcode-3

Now lets drop this into the CyberChef x86 disassembler Here.

CyberChef

We can see here that there is an “Xor” by 0x84 being used. So lets try that.

Section-1-Shellcode-4

Above you can see I just decoded the entire encoded shellcode by 0x84 and where it is highlighted you can see the decoded script. So lets extract and format it for a closer look.
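The same brute decode can be sketched in Python: XOR every byte with 0x84, then pull out the readable runs so the script stands out. The function names and the sample bytes are made up for illustration.

```python
import re


def xor_bytes(data: bytes, key: int = 0x84) -> bytes:
    """XOR every byte of the buffer with a single-byte key."""
    return bytes(b ^ key for b in data)


def printable_runs(data: bytes, min_len: int = 6):
    """Return the runs of printable ASCII that are at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up stand-in for the encoded shellcode.
encoded = xor_bytes(b"\x00\x01function go(){}\xff")
print(printable_runs(xor_bytes(encoded)))
```

XORing the whole buffer also mangles the decoder stub at the front, which is fine when the only goal is to read the embedded script.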

Section-1-Shellcode-5

Does this Look Familiar to anyone ?

Basically all of this trouble to hide what is going on is boiled down to “It just passes parameters and downloads an encoded file , decodes and runs some malware”.

But what are those parameters? In some older versions they were hidden inside the insane encoding, but now it is as simple as looking at the bottom of the page.

Section-1-Values

This is the url that gets passed and the RC4 decoding key that is used with the exploit downloader.

Section:2

Now lets take a look at section-2.

Section-2

As we can see here this section is shorter than the first but there is one thing they do here that they didn’t do in the last section. They split the base64 string by using double quotes and the plus symbol. So lets clean this up and base64 decode it to see what is happening.
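The cleanup can be sketched in Python by collecting every double-quoted piece and joining them before decoding. This is a rough sketch: the function names and the sample string are made up, and it assumes the pieces contain no escaped quotes.

```python
import base64
import re


def join_split_string(js_expression: str) -> str:
    """Glue "piece" + "piece" + ... back into one string."""
    return "".join(re.findall(r'"([^"]*)"', js_expression))


def decode_section(js_expression: str) -> str:
    """Join the pieces, then base64 decode the result as UTF-8."""
    return base64.b64decode(join_split_string(js_expression)).decode("utf-8")


# Made-up sample in the same shape as the split string in this section.
sample = '"dmFy" + "IGE9" +\n "MTs="'
print(join_split_string(sample))
print(decode_section(sample))
```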

Section-2-B64Decoded 

Here we can see this one is using Flash to download the sample. There have been several different exploits used with Flash in order to run the final payload. So let’s take a closer look at this one.

Section-2-Flash

This has several scripts in it that do various things, but what catches my interest is this familiar looking shellcode that I have not previously seen in a Rig EK Flash file.

Section-2-Flash-2

Section-2-Flash-3

Notice anything here? It looks pretty much like the first one.

Section-2-Flash-5

So in this case, instead of a Flash exploit, they are using this one? I’m not sure, and the people who can ID the exploits used will have to determine that.

So what parameters will this section pass?

Section-2-Flash-6

If we look at the traffic, the only thing we have is the traffic associated with the flash download.

In this case the first part is where it will call out to download the Flash file from.

The second part, after the highlighted function name, is where it will download the encoded file/malware from, and the last part is the key that will be used to decode the downloaded file.

There are 2 more sections that decode pretty much like the first, and I will include them in their own sub folder at each stage of the decoding.

But this is as far as we can go with this sample.
Since this does not have the frame with the encoded malware, let’s try another sample from “malware-traffic-analysis.net” located Here.

New Sample

So if we open the pcap and set a filter of “http.request or http.response” we see the same familiar pattern as in the last .saz file.

X-Pcap

But in this case we have an extra download after the flash file download.

So let’s set a filter for the landing page and see what frame/packet number it is in and extract it.

X-LandingFilter

As it turns out, it is packet 112.

Let’s take a quick look at the Flash file since we can see that it gets downloaded.

X-Flash-1

As we can see here, the Flash in this one is more “old style” and it does some decoding inside of the Flash code.

One thing to note is that if you attempt to just use “Export all parts”, it does not export properly with this version of FFDec, and you have to painstakingly copy and paste each of the scripts to a folder.

Notice the auto generated message vs the generated code in the decompiler.

X-Flash-2

As we extract and look at each script, it looks like this may be using multiple exploits?

X-Flash-3

Going thru this source I’m not seeing how this would even trigger the download based on past work with these. Perhaps I’m just forgetting something or missed what I was looking for?

So the final question is, how do we know which of these sections/scripts downloads the encoded binary?

The answer is in frame 196.

X-Frame-196

It tells us the download is the result of the request in frame 132.

X-Frame-132

Now, after extracting all of the requests from the sections, we can compare which ones match, and that will tell us what exploit was used.

X-Frames-Request 

In this case it was section 4, and here is the script.

X-Script-4

Frame 196 contains the encoded binary, and after RC4 decoding we discover that the data in this PCAP was truncated and we didn’t get the full binary.

X-DecodedBinary

One other thing to note is that the RC4 routine here uses “And 255” where others may use “Mod 256”. Both wrap the index into the 0 to 255 range.

X-RC4
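For reference, here is a minimal textbook RC4 sketch in Python that masks with `& 255` the way the routine in the screenshot does. It is not a transcription of the landing page script, and the key and data below are the usual public test values, not the ones from this sample.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encrypting and decrypting are the same operation."""
    # Key scheduling: shuffle the 256-entry state with the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 255
        s[i], s[j] = s[j], s[i]

    # Keystream generation, XORed over the data one byte at a time.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 255
        j = (j + s[i]) & 255
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 255])
    return bytes(out)


# Widely published test vector: key "Key", plaintext "Plaintext".
print(rc4(b"Key", b"Plaintext").hex())
```

For non-negative integers `n & 255` and `n % 256` always give the same result, so either form works for the index arithmetic.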

In conclusion, we see that they are using a scatter approach with multiple exploits and scripts on the landing page. Perhaps just one was meant to work while the others are experiments, or possibly they are used as filler to waste researchers’ time or to send “false flags” about which exploit was used? Looking at some of the code you can see it “does not work as expected”, so only they can answer that question for sure.

That is it for this one. I hope I was able to show a viable alternative way to see what is happening rather than trying to step thru it in a debugger.

Links:

Link to Twitter post for the Tutorial that started this HERE.

Link to Malware Traffic page HERE

Link to my Tools used here and the files on Github HERE


Those Pesky Powershell Shellcode’s And How To Understand Them

Shellcode comes in various forms for different operating systems. Some can just be dropped into a hex editor to get the needed understanding of what it is doing, some may require looking at the assembly code generated by a disassembler, and some require a specialized tool that understands the type of shellcode you are working with.

The one constant across the various samples I’ve looked at is that the shellcode is used as a form of obfuscation to download the final malware.

Here we will just be concentrating on the Windows PowerShell versions.

Sample 1:

Let’s start by taking a look at a “Daily Script” from December of 2017. Here is the Twitter reference for this sample.

Step1

First we need to convert the char codes to chars.

Tesp2

After that we get a base64 string.

Step-3

After decoding that we get a PowerShell script with a Gzip stream. After we decompress that we see this.

Step-4

Here we see a base64 encoded string. This is our encoded shellcode. It will get loaded into virtual memory and run. The exact implementation may vary a little but this is what I mostly see.

That brings us to the shellcode which is what we are after.
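The layers peeled off to reach this point (char codes, base64, Gzip) can be sketched in Python roughly as follows. This is a simplified illustration, not my tools: it assumes the char codes decode straight to a base64 string whose decoded bytes are the Gzip stream, and the sample is synthetic.

```python
import base64
import gzip
import re


def char_codes_to_text(codes: str) -> str:
    """Turn a list of decimal char codes ("80,111,119") into text."""
    return "".join(chr(int(n)) for n in re.findall(r"\d+", codes))


def b64_gunzip(b64_text: str) -> bytes:
    """Base64 decode, then decompress the resulting Gzip stream."""
    return gzip.decompress(base64.b64decode(b64_text))


# Synthetic sample built in reverse so the example is self-contained.
inner = b"Write-Host 'next stage'"
packed = base64.b64encode(gzip.compress(inner)).decode("ascii")
codes = ",".join(str(ord(c)) for c in packed)

print(b64_gunzip(char_codes_to_text(codes)))
```

In a real script the Gzip blob is usually embedded inside the decoded PowerShell, so it has to be cut out of the script text before the second function is applied.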

Now we can base64 decode it to hex.

Step-5

So now what do we do with it?

Step-6

So now we drop the hex into a hex editor and we can see the URL it was calling out to, and if we look higher we can also see a User Agent string.
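If you would rather not eyeball the hex editor, the same two steps (base64 to hex, then looking for readable text) can be sketched like this. The byte string is a made-up stand-in, not the real shellcode, and the 6 character minimum is just an assumption to cut down on noise.

```python
import base64
import re
from typing import List


def b64_to_hex(b64_text: str) -> str:
    """Base64 decode and show the bytes as an upper-case hex string."""
    return base64.b64decode(b64_text).hex().upper()


def printable_strings(data: bytes, min_len: int = 6) -> List[str]:
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Made-up stand-in for decoded shellcode with a User Agent and a host in it.
fake = b"\xfc\xe8\x82\x00Mozilla/5.0 (compatible)\x00\x01example.test/path\x00"
b64 = base64.b64encode(fake).decode("ascii")

print(b64_to_hex(b64)[:8])
print(printable_strings(base64.b64decode(b64)))
```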

Sample 2:

Next we look at this sample found here on VirusTotal from November of 2018.

S2-S1

Here we only start with a base64 encoded script.

S2-S2

Now we have a Base64 encoded GZip script.

S2-S3

Now we see the familiar base64 encoded shellcode, so let’s decode that to hex and drop it into a hex editor like last time.

S2-S4

Well, that’s not too helpful. Now what?

Let’s try CyberChef here and look at the assembly.

S2-S5

Well, that doesn’t look like much help either.

What else can we do? We have John Lambert’s “PyPowerShellXray” here, or we have scdbg found here.

After working with these: psxray requires the PowerShell script containing the shellcode in order to work, while scdbg requires only the cleaned hex of the shellcode, so you still have to base64 decode to hex to use it in scdbg. Let’s see what those two show us.

PSXray-32

Here we can see some Windows API calls using psxray, but something doesn’t look quite right. The ws2_32 string, which gets pushed backwards, is not showing completely. If we modify the Python script to use the 64 bit version of the backend API for this tool we get the full API name, but the rest of the values don’t look the same.

PSXray-64

So what about scdbg then?

Scdbg

It didn’t find anything because scdbg only works on 32 bit shellcode and this is 64 bit.

So now what?

New tools.

S2-S6

In order to save a step we can also just input the base64 string.

S2-S7

Looking at the way John Lambert’s tool parsed the hashed API calls, I wanted to be able to do the same thing but as a copy and paste instead of having to run it thru the VM/Python process.

Another new tool.

S2-S8

But how do we find these hashes?

ApiHashes

As it turns out, psxray had a prebuilt list of hashes for the function calls. I had to convert those to individual dictionary items for each API to be able to use them in this new program, but due to the sheer number of them I first had to build a program to do the conversion and generate the VB.NET code for me. Then I could use the generated code to do the search for the API calls.

HashValue 

If we take a closer look at the output of my tool we see “found at index”. This is the string index, not the byte index, so you would have to divide it by 2 if you were searching in a hex editor for the byte offset. Another thing you will notice is that the byte order found in the file is reversed from what you will find in the assembly or in the database that comes with the tool.

That is why I put both the normal order found in the file and the “ASM Order” in the output.

Another odd thing I ran across in a sample was a hash value that was found at an “ODD” offset, and closer inspection of the assembly and the found value showed it was a false positive. All of the normal string offsets are divisible by 2, so any odd value may be a false hit.
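The index and byte order points above can be sketched in Python like this. This is not the VB.NET code from my tool. The two hash values are the ones I recall from the comments in Metasploit’s published block_api stagers, so treat the table as illustrative and verify the values before relying on them.

```python
from typing import Dict, List, Tuple

# Illustrative table only ("ASM order" values); assumed, verify before use.
KNOWN_HASHES: Dict[str, int] = {
    "kernel32.dll!LoadLibraryA": 0x0726774C,
    "ws2_32.dll!WSAStartup": 0x006B8029,
}


def find_api_hashes(hex_text: str,
                    table: Dict[str, int] = KNOWN_HASHES
                    ) -> List[Tuple[str, int, int, bool]]:
    """Return (api, string index, byte offset, suspect) for every hit."""
    hex_text = hex_text.upper()
    hits = []
    for name, value in table.items():
        # The file stores the 4 bytes little-endian, the reverse of the ASM order.
        file_order = value.to_bytes(4, "little").hex().upper()
        pos = hex_text.find(file_order)
        while pos != -1:
            # Two hex chars per byte; an odd string index straddles two bytes.
            hits.append((name, pos, pos // 2, pos % 2 == 1))
            pos = hex_text.find(file_order, pos + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Made-up fragment: a push opcode (68) followed by the hash in file order.
print(find_api_hashes("684C772607FFD5"))
```

A hit flagged as suspect is not automatically wrong, but it lines up with the false positive described above and deserves a look at the assembly.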

While investigating how the hashed API names worked for my Office Equation blog post here, I found a FireEye post from 2012 here about using precalculated string hashes, with instructions on how to generate your own SQLite database of known hashing algorithms and values. I will include the ones I generated for reference as a database for looking up unknown hashes.

I was able to use this database to generate the remaining code for the tool above, covering hashes I had run across that the list from John Lambert’s tool didn’t include.

Sample 3:

This sample, found on VirusTotal here, was a strange one. It was originally found on Pastebin by Paul Melson’s (PaulM @pmelson) ScumBots @ScumBots bot and uploaded to VirusTotal.

When we first look at this script, one thing we notice is that it starts with a very large base64 string. The second thing is that it is broken up with strings of ‘+’ to mess with automated base64 decoders that can’t deal with putting the string back together, so we remove those first.

S3-S1

After we clean up the base64 string and base64 decode we see this.

S3-S2

NOTE: I have tested this in psxray and it will fail to parse this type.

If you zoom in on this picture you can see that this has a base64 encoded executable file embedded in it. Let’s extract it and take a quick look at that first.

It looks like the script will load this DLL, which is an AMSI bypass method, and it will then load the shellcode.

S3-S3

Now let’s take a closer look at this shellcode. It doesn’t start with the normal “0xFC”.

S3-S4

That’s hard to read, so let’s format it a little bit to better view what is happening.

S3-S5

Looking where the blue dot is we can see that this shellcode has been split apart into arrays and will get reassembled at run time.

So let’s reassemble it. (New Tool)

S3-S6
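Outside of the tool, the reassembly idea can be sketched in Python. The script layout below is entirely hypothetical (the real one is only visible in the screenshot): it assumes each piece is assigned as a comma separated list of `0x..` values, and that you read the join order off the script yourself.

```python
import re
from typing import Dict, List

# Hypothetical layout: "$name = 0x41,0x42,..." (assumed, not from the sample).
ASSIGN = re.compile(r"\$(\w+)\s*=\s*((?:0x[0-9A-Fa-f]{1,2}\s*,?\s*)+)")


def parse_byte_arrays(script: str) -> Dict[str, bytes]:
    """Collect every variable that is assigned a list of 0x.. byte values."""
    arrays = {}
    for name, body in ASSIGN.findall(script):
        tokens = re.findall(r"0x[0-9A-Fa-f]{1,2}", body)
        arrays[name] = bytes(int(tok, 16) for tok in tokens)
    return arrays


def reassemble(arrays: Dict[str, bytes], order: List[str]) -> bytes:
    """Concatenate the pieces in the order the script joins them."""
    return b"".join(arrays[name] for name in order)


# Made-up script fragment with two pieces and the line that joins them.
script = """
[Byte[]] $part1 = 0x41,0x42
[Byte[]] $part2 = 0x43,0x44,0x45
$all = $part1 + $part2
"""
pieces = parse_byte_arrays(script)
print(reassemble(pieces, ["part1", "part2"]).hex())
```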

Now that it is reassembled we can input it into our tool to get the IP/URL.

S3-S8

And also the API calls. I created this tool to give more insight into what gets called, so it may help to get a better understanding of what the shellcode is doing, not just the IP or URL that may show up by just running it in a sandbox.

One other thing to note is that I have a checkbox for each API that gets parsed so the ones that show up as “No Hashes Found” can be unchecked and then you can rerun it to get a cleaner output.

Sample 4:

This is another strange sample. As of this writing it is still on Pastebin here, and it is another sample found by Paul Melson’s (PaulM @pmelson) bot.

S4-S1

We start out like normal with PowerShell and a large base64 string.

S4-S2

After base64 decoding we now have a base64 Gzip string.

S4-S3

Now that we have decompressed this level we can just take the base64 encoded shellcode and drop it in our tool to extract the IP/URL.

S4-S4

OK, so what is “Shikata Ga Nai encoded shellcode”? This one had me stumped for a bit because there were no really “clear” explanations of how this decodes at the byte level without using other tools.

Note: psxray has the function to decode this type of shellcode. scdbg does not work for this type.

This article here was the closest one that helped me work this encoding out. It is part of the “metasploit-framework” found here.

Its description is a “polymorphic XOR additive feedback encoder”. Yeah, that description really helps.

After reviewing the article and anything else you can find online about it, let’s drop the hex cleaned shellcode into our friend CyberChef. You will also notice a difference between the CyberChef output and what psxray outputs.

Diff

(This screenshot is from my original research.)

The CyberChef output is from before decoding and the psxray output is from after it is decoded.

S4-S5

Here are my notes on how this decodes. It starts out with an XOR key, which changes from sample to sample, and an addition value that gets added in each round.

You add the decoded value to the current key to get the 32 bit value for the next key.
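A minimal sketch of that decode loop in Python, under a few stated assumptions: the data is processed as 4 byte little-endian blocks, the decoded block is what gets added to the key, and the key and start offset have already been found by hand. Real samples vary (some variants feed back the encoded value or subtract instead of add), so this is one reading of the notes above and not a full Shikata Ga Nai decoder. The `sgn_encode` helper is hypothetical and only exists to build a test input.

```python
import struct

MASK32 = 0xFFFFFFFF


def sgn_decode(encoded: bytes, key: int) -> bytes:
    """XOR each 4-byte block with the key, then add the decoded block to the key."""
    out = bytearray()
    usable = len(encoded) - len(encoded) % 4  # ignore a trailing partial block
    for offset in range(0, usable, 4):
        (block,) = struct.unpack_from("<I", encoded, offset)
        plain = block ^ key
        out += struct.pack("<I", plain)
        key = (key + plain) & MASK32  # additive feedback, kept to 32 bits
    return bytes(out)


def sgn_encode(plain: bytes, key: int) -> bytes:
    """Hypothetical inverse of sgn_decode, used here only to make sample data."""
    out = bytearray()
    usable = len(plain) - len(plain) % 4
    for offset in range(0, usable, 4):
        (block,) = struct.unpack_from("<I", plain, offset)
        out += struct.pack("<I", block ^ key)
        key = (key + block) & MASK32
    return bytes(out)


# Made-up key and payload for a round trip.
key = 0x12345678
blob = sgn_encode(b"LoadLibraryA", key)
print(sgn_decode(blob, key))
```

Because each key depends on everything decoded before it, starting at the wrong offset or with the wrong key turns the whole output to garbage, which is why finding the start of the encoded data matters so much.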

The next thing that needs to be figured out is where the encoded data starts. In this case, if you look at the difference screenshot, it shows where it starts by where the two outputs begin to differ.

Another way is to look for odd or messed up assembly instructions at the beginning of the CyberChef assembly.

S4-s6

S4-S7

Now we can just drop the decoded shellcode back into our IP/URL parser tool.

One other thing to note: if you cannot figure out where the encoded shellcode starts, just drop the entire shellcode after the key into the decoder, decode it, and remove 1 byte (2 chars) at a time from the beginning until you see this value, or more plain text, show up in the output.

S4-S8

That is the string representation of “LoadLibraryA”.

S4-S9

There are some more strange types I would like to go thru but this is starting to get long.

Here is a list of the tools I am including in the release.

ToolList

All of these tools have been used in the decoding and extraction of the shellcode.

In the base64 decode tool there are 2 buttons on the left: decode as utf8 and decode as unicode. Most of the base64 encoded PowerShell scripts will need the unicode button.
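The difference between the two buttons can be sketched in Python. Here “unicode” is taken to mean UTF-16 little-endian, which is what PowerShell’s encoded commands use; the sample string is just “dir” encoded that way.

```python
import base64


def decode_b64_text(b64_text: str, encoding: str = "utf-16-le") -> str:
    """Base64 decode, then interpret the bytes with the chosen text encoding."""
    return base64.b64decode(b64_text).decode(encoding, errors="replace")


sample = "ZABpAHIA"  # "dir" as UTF-16LE, then base64
print(repr(decode_b64_text(sample)))            # unicode button
print(repr(decode_b64_text(sample, "utf-8")))   # utf8 button
```

If the utf8 result has a null character after every letter, that is the hint that the unicode option was the right one.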

To extract as hex you have to check the box and select the encoding type to extract as. Most of the time it will be 1252 from the dropdown list. This list is filled by a function that gets the supported encodings for the system it is run on.

If there are any questions or problems just contact me on Twitter @Ledtech3.

Links:

Sample 1:
Twitter Link

Sample 2:
VT Link for sample
CyberChef Link for X86 assembly.
PSXray Link
ScDbg Link to site
FireEye post on precompiled hashes Link
My Blog post on Equation Editor Shellcode Link to

Sample 3:
VT Link for this sample

Sample 4:
Pastebin Link to sample

 

Github Link to the tools and files used here.

Again there was a lot more that I would like to have gone thru.

I hope you learned as much as I did.
