Those Pesky PowerShell Shellcodes and How to Understand Them

Shellcode comes in various forms for different operating systems. Some can simply be dropped into a hex editor to understand what it is doing; some may require looking at the assembly produced by a disassembler, or a specialized tool that understands the type of shellcode you are working with.

The one constant across the various samples I’ve looked at is that the shellcode is used as a form of obfuscation to download the final malware.

Here we will just be concentrating on the Windows PowerShell versions.

Sample 1:

Let’s start by taking a look at a “Daily Script” from December 2017. Here is the Twitter reference for this sample.

Step1

First we need to convert the character codes to characters.

Step2

After that we get a base64 string.

Step-3

After that we get a Gzip-compressed PowerShell script. After we decompress that we see this.

Step-4
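The first three layers above (character codes, then base64, then Gzip) can be peeled without running the script. Here is a minimal Python sketch of that idea; the function names and the stand-in data are my own assumptions, not taken from the sample.

```python
import base64
import gzip

def chars_from_codes(codes):
    """Turn a list of character codes into the text they spell out."""
    return "".join(chr(c) for c in codes)

def b64_gzip_to_text(b64_text, encoding="utf-8"):
    """Base64 decode, gunzip, then decode the resulting bytes as text."""
    compressed = base64.b64decode(b64_text)
    return gzip.decompress(compressed).decode(encoding, errors="replace")

# Synthetic stand-in for a real sample: build the layers, then peel them.
inner_script = "Write-Host 'hello'"
layer = base64.b64encode(gzip.compress(inner_script.encode())).decode()
codes = [ord(ch) for ch in layer]          # what the outer script would hold

recovered = b64_gzip_to_text(chars_from_codes(codes))
print(recovered)
```

A real sample may use a different text encoding for the inner script, so the `encoding` argument is left adjustable.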

Here we see a base64 encoded string. This is our encoded shellcode. It will get loaded into virtual memory and run. The exact implementation may vary a little but this is what I mostly see.

That brings us to the shellcode, which is what we are after.

Now we can base64 decode it to hex.

Step-5

So now what do we do with it?

Step-6

So now we drop the hex into a hex editor and we can see the URL it was calling out to, and if we look higher we can also see a User-Agent string.
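What the hex editor shows us here is simply the runs of readable text inside the decoded bytes. A rough Python sketch of the same step is below; the shellcode bytes, URL and User-Agent are made-up placeholders, and the minimum length of 6 is an arbitrary choice.

```python
import base64
import re

def printable_strings(data, min_len=6):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Placeholder bytes standing in for real shellcode.
shellcode = (b"\xfc\xe8\x82\x00\x00"
             + b"Mozilla/5.0 (compatible)" + b"\x00\x01"
             + b"http://example.invalid/a" + b"\x00")
b64_text = base64.b64encode(shellcode).decode()

raw = base64.b64decode(b64_text)
print(raw.hex())                 # the "cleaned hex" you would paste into a hex editor
found = printable_strings(raw)
print(found)
```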

Sample 2:

Next we look at this sample, found here on VirusTotal, from November 2018.

S2-S1

Here we only start with a base64 encoded script.

S2-S2

Now we have a Base64 encoded GZip script.

S2-S3

Now we see the familiar base64 encoded shellcode, so let’s decode that to hex and drop it into a hex editor like last time.

S2-S4

Well, that’s not too helpful. Now what?

Let’s try CyberChef here and look at the assembly.

S2-S5

Well, that doesn’t look like much help either.

What else can we do? We have John Lambert’s “PyPowerShellXray” here, or we have scdbg, found here.

After working with these: PSXray requires the PowerShell script that contains the shellcode, while scdbg requires only the cleaned hex of the shellcode, so you still have to base64 decode to hex to use it in scdbg. Let’s see what those two show us.

PSXray-32

Here we can see some Windows API calls using PSXray, but something doesn’t look quite right. The ws2_32 string, which gets pushed backwards, is not shown in full. If we modify the Python script to use the 64 bit version of the backend API for this tool we get the full API name, but the rest of the values don’t look the same.

PSXray-64

So what about scdbg then?

Scdbg

It didn’t find anything because scdbg only works on 32 bit shellcode and this is 64 bit.

So now what?

New tools.

S2-S6

In order to save a step we can also just input the base64 string.

S2-S7

Looking at the way John Lambert’s tool parsed the hashed API calls, I wanted to be able to do the same thing but as a copy and paste, instead of having to run it thru the VM/Python process.

Another new tool.

S2-S8

But how do we find these hashes?

ApiHashes

As it turns out, PSXray had a prebuilt list of hashes for the function calls. I had to convert those to individual dictionary items for each API to be able to use them in this new program, but due to the sheer number of them I first had to build a program to do the conversion and generate the VB.NET code for me. Then I could use the generated code to do the search for the API calls.

HashValue 

If we take a closer look at the output of my tool we see “found at index”. This is the string index, not the byte index, so you would have to divide it by 2 if you were searching in a hex editor for the byte offset. Another thing you will notice is that the byte order found in the file is reversed compared to what you will find in the assembly or in the tool’s database.

That is why I put both the normal order found in the file and the “ASM Order” in the output.

Another odd thing I ran across in a sample was a hash value that was found at an odd offset; closer inspection of the assembly and the found value showed it was a false positive. All of the normal offsets are divisible by 2, so any odd value may be false.
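The three points above (string index versus byte offset, reversed byte order, and odd offsets as likely false positives) can be sketched in a few lines of Python. This is only an illustration of the search, not the code from my tool, and the hash value used is just an example of the kind of value you would look up.

```python
def find_hash(hex_text, hash_hex):
    """Search a hex string for a 32-bit hash as it is stored in the file.

    hash_hex is the value in "ASM order", e.g. "0726774C".
    Returns a list of (string_index, byte_offset, suspicious) tuples.
    """
    hex_text = hex_text.upper()
    # In the file the four bytes sit in the reverse order of the ASM operand.
    file_order = bytes.fromhex(hash_hex)[::-1].hex().upper()
    hits = []
    start = hex_text.find(file_order)
    while start != -1:
        # An odd string index cannot fall on a byte boundary, so flag it.
        hits.append((start, start // 2, start % 2 == 1))
        start = hex_text.find(file_order, start + 1)
    return hits

aligned = find_hash("FCE8" + "4C772607" + "FFD5", "0726774C")
misaligned = find_hash("A4C7726070", "0726774C")
print(aligned, misaligned)
```

The third element of each tuple is the hint to go and check the assembly by hand before trusting the match.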

While investigating how the hashed API names worked for my Office Equation blog post here, I found a FireEye post from 2012 here about using precalculated string hashes, with instructions on how to generate your own SQLite database of known hashing algorithms and values. I will include the ones I generated for reference as a lookup database for unknown hashes.

I was able to use this database to generate the remaining code for the tool above, covering the hashes I had run across that the list from John Lambert’s tool didn’t include.

Sample 3:

This sample, found on VirusTotal here, was a strange one. It was originally found on Pastebin by Paul Melson’s (PaulM @pmelson) ScumBots (@ScumBots) bot and uploaded to VirusTotal.

When we first look at this script, one thing we will notice is that it starts with a very large base64 string. The second thing is that it is broken up with ‘+’ string concatenation to mess with automated base64 decoders that can’t put the string back together, so those have to be removed first.

S3-S1
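Cleaning up the split string is mostly a matter of removing each quote, plus sign, quote sequence while leaving alone any ‘+’ that is a genuine base64 character. A small Python sketch under that assumption (the quoting style in a given sample may differ, and the helper name is mine):

```python
import base64
import re

def join_split_literal(ps_expr):
    """Collapse 'abc'+'def' style concatenation into a single string body."""
    # Remove: closing quote, optional spaces, '+', optional spaces, same opening quote.
    joined = re.sub(r"""(['"])\s*\+\s*\1""", "", ps_expr.strip())
    return joined.strip("'\"")

# Build a stand-in for the obfuscated string, then put it back together.
payload = b"split me up + put me back together"
b64_text = base64.b64encode(payload).decode()
chunks = [b64_text[i:i + 8] for i in range(0, len(b64_text), 8)]
obfuscated = "+".join("'" + c + "'" for c in chunks)

cleaned = join_split_literal(obfuscated)
print(base64.b64decode(cleaned))
```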

After we clean up the base64 string and base64 decode we see this.

S3-S2

NOTE: I have tested this in psxray and it will fail to parse this type.

If you zoom in on this picture you can see that this has a base64 encoded executable file embedded in it. Let’s extract that and take a quick look at it first.

It looks like the script will load this DLL, which is an AMSI bypass method, and which will then load the shellcode.

S3-S3

Now let’s take a closer look at this shellcode. It doesn’t start with the normal “0xFC” .

S3-S4

That’s hard to read, so let’s format it a little bit to better view what is happening.

S3-S5

Looking where the blue dot is we can see that this shellcode has been split apart into arrays and will get reassembled at run time.

So let’s reassemble it. (New tool)

S3-S6

Now that it is reassembled we can input it into our tool to get the IP/URL.

S3-S8

And also the API calls. I created this tool so it would give more insight into what gets called, which may help to get a better understanding of what the shellcode is doing, not just the IP or URL that may show up by running it in a sandbox.

One other thing to note is that I have a checkbox for each API that gets parsed so the ones that show up as “No Hashes Found” can be unchecked and then you can rerun it to get a cleaner output.

Sample 4:

This is another strange sample. As of this writing it is still on Pastebin here, and it is another sample credited to Paul Melson (PaulM @pmelson).

S4-S1

We start out like normal with Powershell and a large base64 string .

S4-S2

After base64 Decoding we now have a Base64 GZip string.

S4-S3

Now that we have decompressed this level we can just take the base64 encoded shellcode and drop it in our tool to extract the IP/URL.

S4-S4

OK, so what is “Shikata Ga Nai encoded shellcode”? This one had me stumped for a bit because there were no really clear explanations of how this decodes at the byte level without using other tools.

Note: psxray has the function to decode this type of shellcode. scdbg does not work for this type.

This article here was the closest one that helped me work this encoding out. The encoder itself is in the “metasploit-framework”, found here.

Its description is “polymorphic XOR additive feedback encoder”. Yeah, that description really helps.

After reviewing the article and anything else you can find online about it, let’s drop the hex cleaned shellcode into our friend CyberChef. You will also notice a difference between the CyberChef output and what PSXray outputs.

Diff

(This screenshot is from my original research.)

The CyberChef output is before decoding and the PSXray output is after it is decoded.

S4-S5

Here are my notes on how this decodes. It starts out with an XOR key, which will change from sample to sample, and an addition step that is applied each round.

You add the decoded value to the current key to get the 32 bit value for the next key.
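To make the key update concrete, here is a minimal Python sketch of that XOR plus additive feedback loop. It assumes the data is processed as little-endian 32 bit values and that the decoded value is what gets added to the key; other builds of the encoder may differ, so treat it as a sketch. The encoder function exists only to produce test input.

```python
import struct

MASK32 = 0xFFFFFFFF

def sgn_style_decode(encoded, key):
    """XOR each little-endian dword with the key, then add the decoded dword to the key."""
    out = bytearray()
    usable = len(encoded) - len(encoded) % 4      # ignore a trailing partial dword
    for i in range(0, usable, 4):
        (enc,) = struct.unpack_from("<I", encoded, i)
        dec = enc ^ key
        out += struct.pack("<I", dec)
        key = (key + dec) & MASK32                # additive feedback, kept to 32 bits
    return bytes(out)

def sgn_style_encode(plain, key):
    """Inverse of the decoder above; used here only to build a test input."""
    out = bytearray()
    plain = plain + b"\x90" * (-len(plain) % 4)   # pad to whole dwords
    for i in range(0, len(plain), 4):
        (dec,) = struct.unpack_from("<I", plain, i)
        out += struct.pack("<I", dec ^ key)
        key = (key + dec) & MASK32
    return bytes(out)

start_key = 0x1A2B3C4D                            # arbitrary example key
blob = sgn_style_encode(b"LoadLibraryA", start_key)
decoded = sgn_style_decode(blob, start_key)
print(decoded)
```

Because every key depends on the previous decoded value, starting at the wrong offset or with the wrong key produces garbage from the first dword onward.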

The next thing that needs to be figured out is where the encoded data starts. In this case, if you look at the difference screenshot, the point where the two outputs start to differ tells you where it starts.

Another way is to look for odd or messed up assembly instructions in the CyberChef assembly.

S4-s6

S4-S7

Now we can just drop the decoded shellcode back into our IP/URL parser tool.

One other thing to note: if you cannot figure out where the encoded shellcode starts, just drop the entire shellcode after the key into the decoder, decode it, and remove 1 byte (2 chars) at a time from the beginning until you see this value, or more plain text, show up in the output.

S4-S8

That is the string representation of “LoadLibraryA”

S4-S9

There are some more strange types I would like to go thru but this is starting to get long.

Here is a list of the tools I am including in the release.

ToolList

All of these tools have been used in the decoding and extraction of the shellcode.

In the base64 decode tool there are 2 buttons on the left: decode as UTF-8 and decode as Unicode. Most of the PowerShell scripts that are base64 encoded will need the Unicode button.
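The reason for the two buttons is that PowerShell’s encoded commands are base64 over UTF-16LE text (what Windows calls “Unicode”), so decoding the bytes as UTF-8 leaves a NUL after every character. A short Python sketch of the difference, using a made-up command string:

```python
import base64

def decode_ps_base64(b64_text):
    """Return the decoded bytes read as UTF-16LE and as UTF-8, for comparison."""
    raw = base64.b64decode(b64_text)
    as_utf16 = raw.decode("utf-16-le", errors="replace")
    as_utf8 = raw.decode("utf-8", errors="replace")
    return as_utf16, as_utf8

command = "IEX (New-Object Net.WebClient)"          # example text only
encoded = base64.b64encode(command.encode("utf-16-le")).decode()

unicode_text, utf8_text = decode_ps_base64(encoded)
print(repr(unicode_text))
print(repr(utf8_text))      # same letters, but with \x00 between them
```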

To extract as hex you have to check the box and select the encoding type to extract as. Most of the time it will be 1252 from the dropdown list. This list is filled by a function that gets the supported encodings for the system it is run on.

If there are any questions or problems just contact me on Twitter @Ledtech3.

Links:

Sample 1:
Twitter Link

Sample 2:
VT Link for sample
CyberChef Link for X86 assembly.
PSXray Link
ScDbg Link to site
FireEye post on precompiled  hashes Link
My Blog post on Equation Editor Shellcode Link to

Sample 3:
VT Link for this sample

Sample 4:
Pastebin Link to sample

 

Github Link to the tools and files used here.

Again there was a lot more that I would like to have gone thru.

I hope you learned as much as I did.


A deeper look at Equation Editor CVE-2017-11882 with encoded Shellcode

Our sample today comes from My Online Security @dvk01uk, from this Twitter thread here. The first one I had started to work on comes from this Twitter thread here from April 26, 2019.

The encoding on the shellcode uses a method similar to Shikata Ga Nai encoding.

I would also like to thank Denis O’Brien @Malwageddon for pointing me to this video on how to set up the VM so that X86Dbg loads when the Equation Editor is loaded.

I would also like to thank him for giving me the tip of setting a break point at 0x00411874 on the return instruction for the font record. This can get you close to where you need to be but then you have to step thru from there.

Also this blog post had some helpful information on breakpoints that helped while trying to run this with the debugger attached.

Before we jump into the debugger let’s take a look at the file and extract the shellcode.

When we first open the file we see

File-1

Here we can see it is a zip file, so let’s just unzip it to get the file structure.

File-2

Let’s look in the “xl” folder.

file-3

Now we need embeddings.

File-4

Let’s take a look at this file in a hex editor.

File-5

This is an OLE file so we can just use 7Zip to extract the contents.

file-6

This is an Ole10Native binary file, so let’s see what is in it.

File-7

By the looks of this, it does not appear to have any other headers, so this is our shellcode that gets run.

Let’s copy all of the data here and drop it into CyberChef.

CyberChef

Let’s take a closer look at this in Notepad++ with the colors for assembly.

CyberChefColor-1

Now let’s do some math at the beginning.

blogpost-bp-b

After doing the math here we can refer back to the blog post and see that the result matches GlobalLock.

So let’s jump into the debugger. After getting to the fonts and finding the corrupted one, we step thru and find what we are looking for: the beginning of our shellcode.

InShellCode

The values just above notepad are the ones we see in our debugger. Finally on the right track.

After loading GlobalLock and returning from kernel32 we end up in a series of jumps.

MultipleJumps

Here I was able to get a graph of the function calls.

StartOfJumpGraph

The part we need to understand is at the top, where it goes into the loop.

Loop

Here we can see the value that will be used as a multiplier.

Multiplyer

You can also see, in my notes from a previous run, the values we need to find.

So after running thru all of this we can find the decoded shellcode in ECX.

DecodedPayload

While stepping thru this I also copied the assembly and the current values to a text document. Let’s take a closer look at the flow.

AsmValues

Now we have a better understanding of how the decoding works. Here is a simpler version.

DecodingNotes-Final-A

Now we can build a decoder for this.

We need 3 values that we can get from the CyberChef output: the multiply value, the addition value and the length value.

We will look at the length value first.

ShekkCodelen

So now we need to find where this is in the Shellcode we extracted.

As it turns out, if we just extract the amount shown here, 0x2A7 bytes, from the end of the shellcode data, then that is the data we will be decoding.

ShellCodeSelect
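Selecting that block is just a slice counted from the end of the extracted data. A tiny Python sketch of the selection step only; the blob here is filler and the helper name is mine:

```python
def tail_bytes(blob, length):
    """Return the last `length` bytes of blob (the encoded payload region)."""
    if length > len(blob):
        raise ValueError("length is larger than the extracted data")
    return blob[len(blob) - length:]

blob = bytes(range(256)) * 4          # 1024 bytes of filler, not real shellcode
encoded = tail_bytes(blob, 0x2A7)     # 0x2A7 = 679 bytes, as read from the assembly
print(len(encoded))
```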

This is now our encoded shellcode. We can copy and paste this into the new tool, get the other two values and click a button.

If we are right then you should see the decoded values clearly.

Decoded-Payload-Tool

DecodedClose

So now we can decode these by extracting the shellcode from the file, pasting it into CyberChef to get the assembly, looking for the required parameters, and finally entering them into the tool and clicking a button for the result, all without having to run the file.

Although you can just run them and get the URL it calls out to, this will show you what else is in the shellcode and which APIs are run.

The few different ones I have done have all worked just a bit different under the hood even though they have the same effect of just calling out to some site somewhere and downloading a file.

If you click on the Twitter link to this sample and then click on the CVE tag at the top, it will present you with, at the time of research, 116 pages of files that potentially use this type of encoding.

That’s it for this one. I hope you learned something too.

 

Links to URL’s in this post:

Twitter Link for this sample

AnyRun Task Link for this sample.

Link to blog post with the different values

Link to the video on how to set up X86Dbg to attach to the Equation editor.

Link to My Github with tool and decoding notes.


A look at Stomped VBA code and the P-Code in a Word Document

This sample comes from a Twitter discussion here and a second part of the thread here on April 22 2019.

This discussion was started by “My Online Security @dvk01uk “.

Although it appears to have a VBA file in it, it didn’t work in a few different sandboxes, as mentioned by @dvk01uk.

Let’s take a closer look at the sample found here on ANY.RUN @anyrun_app.

If we look at the document in a hex editor we can see that it starts with “PK”, so this is the ZIP-based file format and we can just decompress it and take a closer look.

DocHex

After unzipping the document we see this folder layout.

1

Let’s look at the word folder.

2

We can see here that we do have a vbaProject.bin file. Let’s look at that.

3
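The checks done by hand above (the “PK” header, unzipping, and looking for vbaProject.bin) could also be scripted. Here is a rough Python sketch using only the standard library; the in-memory archive is a stand-in for the real document.

```python
import io
import zipfile

def list_zip_members(data):
    """Check for the 'PK' signature and return the archive's member names."""
    if data[:2] != b"PK":
        raise ValueError("file does not start with PK, not a ZIP container")
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()

# Build a tiny stand-in archive with the same layout idea as the document.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")
    zf.writestr("word/document.xml", "<doc/>")
    zf.writestr("word/vbaProject.bin", b"\xd0\xcf\x11\xe0")

members = list_zip_members(buf.getvalue())
has_macros = any(name.lower().endswith("vbaproject.bin") for name in members)
print(members, has_macros)
```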

This is an OLE file, so we can decompress it with 7Zip.

3A

Let’s take a closer look at Module1.

4

If we scroll down to the bottom of this file we can see that it appears to be Zeroed out.

If we look at the “ThisDocument” we can see the “Attribut” string which tells us it contains compressed VBA Code.

5

If you don’t have that string in the file then it does not have compressed VBA Code in it.

So how does this work then?

If we go back to the Twitter discussion, “Vess @VessOnSecurity” has a Python tool called pcodedmp to extract the “P-Code” from the document, which can be found here.

This tool currently only requires the oletools from “Decalage @decalage2”.

The command I ran was this.
"C:\Python27\python.exe" "C:\Users\Joe User\Desktop\pcodedmp\pcodedmp.py" "Opticsense New Order.doc"

In order to dump it to a file, just append “ > DumpedPcode.txt”, or whatever name you want, to the above command.

I have both versions of Python installed on this VM so I have to use the full path to it. I also discovered the hard way that you have to put it in double quotes in order for it to work.

Since I didn’t use the pip install for the pcodedmp tool, I just downloaded it and used the full path to the script, again in double quotes. The final parameter was the file name, in double quotes since it has a space in the name.

I just opened a cmd window in the folder where the document was and ran that command.

Here is what we see when we run  the command and dump it to a file.

P-CodeDump

This is the part we are most interested in at the moment.

Pcode-2

If you zoom in on that you see a bunch of “Line #:” entries, so let’s clean those out and format this a bit better to be readable.

Here we find the AutoOpen Function.

Autoopen

The “Ld F_WH” appears to load the function above.

Func1

Although it is not very clear when looking at this for the first time, we can take an educated guess at what the names mean: “st” I would assume means string, and “ld” would be load?

So here it appears to take the string in “E_MO” and pass it to the function “B_RA”, and when it returns it will set the value of “F_DC” as an object.

FuncB_ra

So what this does is take the string of numbers, 3 digits at a time, subtract 0x1A (26) from each value, and then convert that number to a character.
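That routine is easy to reproduce outside of Office. Below is a minimal Python sketch of it; it assumes every character is stored as exactly three digits (zero padded if needed), and the encoder is only there to create a test string.

```python
def decode_triplets(digits, offset=26):
    """Take three digits at a time, subtract the offset, convert to a character."""
    if len(digits) % 3:
        raise ValueError("expected a multiple of three digits")
    return "".join(chr(int(digits[i:i + 3]) - offset)
                   for i in range(0, len(digits), 3))

def encode_triplets(text, offset=26):
    """Inverse of the decoder; used here only to build a test input."""
    return "".join("%03d" % (ord(ch) + offset) for ch in text)

encoded = encode_triplets("Wscript.Shell")
decoded = decode_triplets(encoded)
print(encoded)
print(decoded)
```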

So after decoding the first string we see.

E_MO-Decoded

So the object that gets passed is “Wscript.Shell”.

The rest of the longer strings appear to be junk code until you get down to here.

SecondString

Here we see it is getting the string “SP_LL” from the active document.  When we search for it we find it in the “settings.xml” .

SP_LL-1

So now we need to take this string and run it thru the same B_RA function and see what is output. It will then get executed after passing back to the AutoOpen function.

DecodedPSScript

If we go back to the AutoOpen function, we can continue on now that the strings are decoded.

WinMgmt

It will use WMI’s Win32_Process to load “Cmd.exe” and the rest of the script.

Let’s take a closer look at the decoded PowerShell script.

FormatPs

If we look at the highlighted area in the screenshot we can see above it that there were 3 variables “set”. This will rebuild the string “powershell”.

As we can see here this just downloads an exe from a site and runs it.

For anyone interested in getting a better understanding of the P-Code, I would suggest looking at the source code of pcodedmp and this file to get a better handle on how it works.

I would have liked to have gone deeper into how the P-Code works at the byte level, but I’m still learning that myself.

That’s it for this one.

Here is the full list of resources, plus a few extras not covered in the post.

Twitter threads for this sample:
Main thread Here , Second thread Here , Third Thread Here

Didier Stevens @DidierStevens ISC Diaries:
Here and Here

Vess @VessOnSecurity pcodedmp tool:
Here

Decalage @decalage2 oletools:
Here

Derbycon 2018 talk “VBA Stomping – Advanced Malware Techniques”
Here


A look at a bmp file with embedded shellcode

The sample today is from PaulM @pmelson.

While I was watching his BSides Augusta talk from 2018 here, at the end he showed a picture file that gets downloaded from a layered PowerShell script. He was kind enough to send me a copy of a similar one to take a closer look.

I originally thought it was one of the PowerShell only decoder scripts for picture files but here is what we first see. This is the first layer .

StartScript

After Base64 Decoding this we get.

Layer-2

Here we can see this is base64 –> decompress to get the next level. But they have one more trick.

Layer-2-A

Before we can Base64 decode –> decompress this, we first have to do a string replacement of “!” with “A” in order to get a proper Base64 encoded string.
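That extra trick is a one line fix once you spot it. Here is a hedged Python sketch of the whole layer: undo the replacement, base64 decode, then decompress. Which compression the sample uses is an assumption on my part, so the sketch tries Gzip/zlib first and falls back to raw Deflate.

```python
import base64
import gzip
import zlib

def decode_layer(text, bad="!", good="A"):
    """Undo the character swap, base64 decode, then decompress the result."""
    raw = base64.b64decode(text.replace(bad, good))
    try:
        return zlib.decompress(raw, wbits=47)    # 32 + 15: auto-detect gzip or zlib header
    except zlib.error:
        return zlib.decompress(raw, wbits=-15)   # raw Deflate stream, no header

# Build a stand-in for the obfuscated layer.
script = b"Write-Output 'layer 3'"
b64_text = base64.b64encode(gzip.compress(script)).decode()
obfuscated = b64_text.replace("A", "!")

result = decode_layer(obfuscated)
print(result)
```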

After Decoding we get this.

Layer3

This appears to be a normal Meterpreter PowerShell Shellcode loader but in this case it is only downloading a bmp file.

The other ones I have looked into have either had the Shellcode on this page base64 encoded or hex encoded or downloaded it as this has with the picture file.

After a discussion with Paul he was able to locate the pdf of the presentation of the builder for this here and I found the video for the presentation  here and the Github for the project is  here.

Here is what we see when we open the downloaded file.

header

The first 2 bytes are normal for the bmp file format. If we open the file as a picture it is indeed the default picture from the builder of a cat “flipping you off”. (Which I won’t show)

So let’s dig into the pdf to see how this works.

Note: I’m still learning how to read assembly. But we learn by doing.

On this page we see we have the 2 byte header “BM” (0x424D), then a jump instruction, 0xE9, then a 3 byte offset. According to this page there are more “jmp” instructions that could possibly be used.

PDF-Img-1

In our file the offset is stored in little-endian byte order as 0x30C403, and if we reverse that to 0x03C430, that is our offset to jump to.
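Reading that header in code looks something like the Python sketch below. It only reproduces the byte-order reversal described above on a made-up header. As a side note, on x86 the 0xE9 jump is relative to the end of the jump instruction, so the exact landing point may differ by a few bytes from the raw value read here.

```python
def parse_bmp_jump(data):
    """Check the 'BM' magic and the 0xE9 opcode, then read the 3 byte offset."""
    if data[:2] != b"BM":
        raise ValueError("missing BM magic")
    if data[2] != 0xE9:
        raise ValueError("no 0xE9 jump opcode after the magic")
    # Bytes are stored least significant first, so 30 C4 03 reads as 0x03C430.
    return int.from_bytes(data[3:6], "little")

header = bytes.fromhex("424DE930C403") + b"\x00" * 8   # stand-in for the real file
offset = parse_bmp_jump(header)
print(hex(offset))
```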

If we jump to that offset we can see it is at the end of the file.

Offset

Now scrolling down the pdf a little bit more we see that they also attempted to obfuscate the decoding key.

pdfimg-2

What this is doing is setting ebx to Zero and then looping a counter until it matches the “Magic” value that was randomly generated on build.

After it matches, it reverses that hex value and will use that value to xor the first 4 bytes of the encoded data to produce a decoding key which will get reversed again for decoding the remainder of the bytes.

I first wrote a brute forcer to work like the function here but after looking at this longer and getting a better understanding of what was in the registers I finally realized that this entire brute force routine was a waste of time and CPU power. No matter what the Random “Magic value” turns out to be the index value will always end up equal to the “Magic value”.

So when building an offline decoder we can just bypass this and use that found value for the “Magic” in our calculations, saving a lot of time and CPU cycles.

In order to figure this out I also had to take a closer look at the builder.

If we look in the source file of gen.py we can see the layout of the decoder bytes.

Source

So let’s just use this CyberChef recipe here to get the assembly for the bytes starting at the offset we jumped to in our downloaded file.

And we get this.

Assembly-1

For me this is a little harder to understand, so let’s go back and put just the data from the start of the decoding routine to the end of the data into CyberChef and see what we have.

This looks a little different.

Assembly-2

In order to get a better handle on what was in what registers I ran it thru Scdbg.

scdbg

If we look closely at this report we see it fails at the opcode 0x0FC9, the “BSWAP ECX”.

It was still enough to help me understand the values in the registers at the time.

I may not fully understand all of what the assembly is doing but I’m able to understand enough to work out how to decode it.

If you look at the above screenshot of the assembly you can see my notes on what I think I understand about how it works.

If we look back at the source code we can see it lines up with where I have commented as random.

Here are my notes on how the function works to decode the bytes.

DecodeNotes

Here I am just reversing the first 4 bytes of the encoded data instead of the “Magic” value as it appears in the assembly.

The next step is to build a tool to extract the shell code.

I first start by importing the entire bmp file into the tool. I then extract the offset. Next Jump to the offset.

Next I extract the data from the offset to the end of the file. We no longer need the bytes before the offset.

I write all of my tools in VB.NET and I have not found a good way to search for a byte array within a byte array, so I convert these remaining bytes to a hex string and work with the data as a hex string.

Just a note: it is very resource intensive to convert a whole file of that size to a hex string and try to parse it that way. (I tried)

Since I am now working with strings of hex I can now search for the unique byte sequence as a string instead of a byte array to do the compare with the byte code before the “Magic value” in order to find and extract it.

Magic-2 

Since this sequence will be in every file we can do a search for it and then locate the Magic value in the hex string. Once we find that sequence before the “Magic” we can then extract the next  4 bytes (8 Chars) for the “Magic”.
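For comparison, in a language with a built-in byte search the same step needs no hex string at all. The Python sketch below assumes the marker is the `31D981F9` sequence that also appears in the Yara rule in this post (my reading is that it is `xor ecx, ebx` followed by `cmp ecx, imm32`), with the “Magic” being the 4 bytes right after it. The blob and the magic value are made up.

```python
MARKER = bytes.fromhex("31D981F9")   # assumed marker; the compared value follows it

def find_magic(data, marker=MARKER):
    """Return (marker_offset, magic_bytes) or None if the marker is not found."""
    pos = data.find(marker)
    if pos == -1:
        return None
    start = pos + len(marker)
    magic = data[start:start + 4]
    if len(magic) < 4:
        return None                  # marker found too close to the end of the data
    return pos, magic

blob = b"\x90" * 10 + MARKER + bytes.fromhex("DEADBEEF") + b"\x75\xf2"
result = find_magic(blob)
print(result)
```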

Next we have to locate the start of the encoded data. For that we can find what this function ends with.

EncodedData

You may also notice another value we could extract. The size of the encoded data. We could get that so there is not extra nonsense data in the decoded shellcode.

So after we put all of this together we end up with the new tool.

Tool

If we load the hex string shellcode into another tool I’m working on we get.

Tool-2

One thing to note. For this type of shellcode the first byte is always 0xFC and the second byte will vary depending on if it is a 32 bit or 64 bit shellcode.
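A quick check along those lines is sketched below in Python. The specific second-byte values are an assumption based on what commonly generated Metasploit-style stagers start with (`FC E8` for 32 bit, `FC 48` for 64 bit); other shellcode will not follow this pattern.

```python
def guess_arch(shellcode):
    """Guess 32 vs 64 bit from the first two bytes; 'unknown' if neither pattern fits."""
    if len(shellcode) < 2 or shellcode[0] != 0xFC:
        return "unknown"
    second = shellcode[1]
    if second == 0xE8:       # assumed: cld followed by a call, typical of 32 bit stagers
        return "32 bit"
    if second == 0x48:       # assumed: cld followed by a REX.W prefix, typical of 64 bit
        return "64 bit"
    return "unknown"

print(guess_arch(bytes.fromhex("fce8820000")))
print(guess_arch(bytes.fromhex("fc4883e4f0")))
```

This kind of check is also a cheap way to decide up front whether a 32 bit only tool such as scdbg is worth trying.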

So the question is: how do you find a file encoded with this?

With a few pointers from Florian Roth @cyb3rops I was able to create this Yara rule.

rule DKMC_Picture_File {
   meta:
      description = "Detects DKMC encoded bmp file with shell code"
      author = "David Ledbetter @Ledtech3"
      reference = "https://github.com/Mr-Un1k0d3r/DKMC"
      date = "2019-27-02"
   strings:
      $my_hex_string1 = { 424DE9 }
      $my_hex_string2 = { 31D981F9 }
      $my_hex_string3 = { E8B7FFFFFF }
   condition:
      $my_hex_string1 at 0 and $my_hex_string2 and $my_hex_string3
}

After I sent this to him, he modified it to check the first 3 bytes as unsigned integers.

Here is the modified version.

rule DKMC_Picture_File {
   meta:
      description = "Detects DKMC encoded bmp file with shell code"
      author = "David Ledbetter @Ledtech3"
      author = "Florian Roth @cyb3rops" // modified first 3 bytes to be detected as Uint.
      reference = "http://github.com/Mr-Un1k0d3r/DK …"
      date = "2019-27-02"
   strings:
      $my_hex_string2 = { 31D981F9 }
      $my_hex_string3 = { E8B7FFFFFF }
   condition:
      uint16(0) == 0x4d42 and uint8(2) == 0xE9 and
      $my_hex_string2 and $my_hex_string3
}

I’m not sure if it is faster or not but both do find the sample I have.

A search on Hybrid Analysis didn’t find anything using the Yara rules.

A retro hunt by Florian Roth @cyb3rops on VirusTotal resulted in several hits for this rule.

Here is the Pastebin of the found hashes here .

Well, that is it for this time. I hope you learned as much as I did.


A deeper look into a wild VBA Macro

This sample comes from Brad Duncan @malware_traffic, from his SANS ISC Diary located here and the files on his blog here.

For this session I will be using “2019-01-23-example-of-attached-Word-doc-1-of-7” word document.

I ended up looking at this from different directions so that is what I want to try and show here.

The first thing I always do is to look at the file in a hex editor to verify what type of file I am dealing with. Never trust a file extension.

FileHeader

As we can see by the 8 byte file header, we are dealing with an OLE file rather than, say, the XML, zipped, or RTF form of a document.
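The “never trust a file extension” check can be sketched in a few lines of Python. The signatures below are the usual ones for these container types; the function name and the short list of types are my own choices and far from complete.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")   # the 8 byte OLE compound file header

def sniff_document(data):
    """Guess the container type of a document from its first bytes."""
    if data.startswith(OLE_MAGIC):
        return "ole"
    if data.startswith(b"PK\x03\x04"):
        return "zip"
    if data.startswith(b"{\\rt"):
        return "rtf"
    if data.lstrip().startswith(b"<?xml"):
        return "xml"
    return "unknown"

print(sniff_document(OLE_MAGIC + b"\x00" * 8))
print(sniff_document(b"PK\x03\x04rest of archive"))
```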

My next step is usually to drop the file into LibreOffice to see if it will even open.

Here is what the Document looks like.

Document-1

Next let’s look and see if there are any macros available. Sometimes no macros are detected using this program, so alternate methods or programs need to be tried to verify there are no macros.

So when this first loads even before the “AutoOpen()” Sub, it does a “GetTickCount” call to the Windows API.

VBA-2

Since we are here let’s take a closer look at this function.

TickCount

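The “#If VBA7 Then” is what caught my eye. According to this question on StackOverflow it is part of handling 64 bit Office on a 64 bit system. Strictly speaking, VBA7 is true for Office 2010 and later regardless of bitness, and the separate Win64 constant is the one that indicates 64 bit Office.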

Another Odd thing I noticed was when you click from the Module1 tab at the bottom to the ThisDocument tab then back the function name changes to the AutoOpen one.

NameChange

So now we can use the “Save Basic” button to save this Module1 as a “.bas” file to take a closer look.

But let’s go further. Now that I have the Decalage @decalage2 and Didier Stevens @DidierStevens tools installed, let’s see what they tell us.

We start with olevba.

OleVBA

As we can see here it outputs the macro for us and also gives us more information about what happened when it was checking it including the decoded IP Address.

Not all of the Information in the box is “Always” correct. So you may need to verify.

Now let’s take a look with oledump.py. We start with the basic command to see what streams are in here.

OleDump-1

We can see in stream 7 there is an upper case “M”. That lets us know that the stream contains macro code. So let’s look at that.

oleDump-Stream-7

That looks like the data is compressed, so let’s add the -v switch to decompress this stream.

OleDump-Stream-7-Decompress

Now that is much better. We can now output that to a text file and take a closer look in our favorite text/ code editor tool.

Let’s look at 1 more method before we dig deeper into how the rest of the code works.

I’ll use 7Zip to decompress the document and we see the folder/ file system.

Unzip-1

Let’s dig into the Macros folder and see what we have.

Unzip-2

We have files and a folder. In the VBA folder we have this.

unzip-3

Now here is what I’m looking for. Let’s take a look at module1 in a hex editor.

Unzip-4

We can see here that there is some plain text but this “Stream” is compressed.

Before I learned how to use oledump.py I had wondered how you extract the data in this file /stream.

I had read this article in the past but didn’t understand everything it was telling me.

But using the code provided there and with some modifications I was able to build a tool to decompress the single stream. I wrote the tool mainly to “Try” and understand how the encoding/ compression worked.

tool-1

So that now gives me 1 more way to extract the macro(s) from the document.

I also Installed and Ran Vipermonkey today to see how that worked since I have never tried it before.

Viper-1

Viper-2

As we can see here it also extracted the script but seemed to have a problem with the VBA7 code.

Here is a list of the commands I used for olevba and oledump.

All commands are run from opening a CMD prompt in the folder where the document was located. (Shift + right click on folder , select Open command window here)

CmdsUsed

Let’s dig into this code some more because it is crazy.

The first part of the code you can see in the screenshot of my tool above is just a large block of junk comment data.

If we start checking for references to the variables declared before the “AutoOpen()”, we find that there are several that are never used, so they are most likely just junk filler to make it harder to read.

CodeStart-b

This code does a series of converting the “Val” and “Len” values all of the way thru this code.  Even once we convert those values we still have to do the math for each line.

So I wrote a tool to understand how the “Val” works. This link will give you an idea.

Val-1

As we can see, it takes that string and returns the numerical value. It reads the leading number, ignoring spaces, and stops at the first character that cannot be part of a number. The value could also have started with “&H” for hex or “&O” for octal.

We know “Len” is the length of the string, so that one is fairly easy. The hard part is parsing this code and replacing each call with its numeric value.
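As a rough illustration of the two functions, here is a small Python sketch. The name `vba_val` is mine, and it is a simplification: it only handles plain decimal numbers plus the “&H” and “&O” prefixes, and it does not reproduce every corner case of the real VBA function (exponents and signed hex values, for example).

```python
import re

def vba_val(text: str) -> float:
    """Rough emulation of VBA Val(): read the leading number, stop at the first other character."""
    s = re.sub(r"[ \t\n]", "", text)           # Val skips spaces, tabs and line feeds
    m = re.match(r"&[Hh]([0-9A-Fa-f]+)", s)    # &H prefix means hexadecimal
    if m:
        return float(int(m.group(1), 16))
    m = re.match(r"&[Oo]([0-7]+)", s)          # &O prefix means octal
    if m:
        return float(int(m.group(1), 8))
    m = re.match(r"[+-]?(\d+\.?\d*|\.\d+)", s) # leading decimal number, if any
    return float(m.group(0)) if m else 0.0

print(vba_val("24 and 57"))   # leading number only
print(vba_val("&H1F"))        # hex literal
print(len("some junk text"))  # Len is just the string length
```

With helpers like these, each “Val(...)” and “Len(...)” call in the macro can be swapped for a plain number before doing the math on the line.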

My tool still has a bug or two but will parse this well enough for us to get a better idea of what this is doing.

Tool-2

 

Now that some of the extra obfuscation is out of the way we can look closer at what we have.

After going through and doing the math by hand we see this. The lines with “****” next to them are where the two main values are reset to a new value.

Math-1

At the end of the lines I also calculated the arguments for the “Left”, “Mid”, and “Right” calls. These functions return a substring, and that output gets appended onto the final string that gets run in the “Shell” command at the end.

If we zoom in on these values we can see they are only taking a few characters from each string.

The first number (green text) is the position to start taking from, and the second is the length to take.
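To make the substring step concrete, here is a small Python sketch of the three functions. The helper names and the junk strings are made up for the example; the one thing to keep in mind is that “Mid” counts positions from 1, not 0.

```python
def vba_left(s: str, n: int) -> str:
    """Left(s, n): the first n characters."""
    return s[:n]

def vba_right(s: str, n: int) -> str:
    """Right(s, n): the last n characters."""
    return s[-n:] if n > 0 else ""

def vba_mid(s: str, start: int, length=None) -> str:
    """Mid(s, start, length): start is 1-based, length is optional."""
    begin = start - 1
    return s[begin:] if length is None else s[begin:begin + length]

# Hypothetical junk strings: only a few characters of each one are kept.
pieces = [
    vba_mid("xxcmyy", 3, 2),   # start at position 3, take 2 characters
    vba_left("d.zzz", 2),
    vba_right("qqexe", 3),
]
print("".join(pieces))
```

Each line of the macro contributes a few characters this way, and the pieces only make sense once they are joined together.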

SubStr

If we keep scrolling down we can see the IP that gets called out to.

ip

We also see towards the bottom this interesting code.

bottom-b

We can see where it will possibly insert a break or clear formatting.

To me the GetTickCount call looks like some type of anti-debugging check, or just another time waster. (Without verifying, you would think the tick count would always be greater. Tick count explanation)

If it is less than 1.2, it changes the output value to a garbage string that will get run by the “Shell” call and fail.

Now the “Shell” call, which will run what got put back together.

Shell

The first part of that, before the “+”, is just junk code. It doesn’t do anything that I could find. The Shell call is passed the rebuilt string and a numeric value. (I didn’t do the math all the way through.)

Now that we have a really good idea of how this works, how do we output the final string so we can see it before it gets executed?

I tried to open the document in LibreOffice and modify the macro code, but that didn’t work.

After building a new clean VM I installed a copy of Office Personal in there.

Let’s see what it looks like in the real Office.

Office-1

We already have a pretty good idea of how this macro works, so let’s open it up, make some changes, and then save them.

I’m not sure if it would make any difference, but let’s comment out the section that looks for IsWow64.

Off-Mac-1

Let’s also make a change to the GetTickCount value to make sure it is not an issue.

Off-Mac-2

We change the value to something greater than the “1.2008” that gets checked later on.

And the final change: let’s replace the “Shell” call with a MsgBox call instead.

Off-Mac-3

And after saving the changes and clicking “Enable Content” we get this.

Off-Mac-4

We can then left-click on the MsgBox and hit Ctrl+C to copy the data, then paste it into Notepad.

FinalCmd

One strange thing that happened: when I clicked “OK”, the document looked like this afterwards.

Off-Mac-5

What happened here?

ClearFormatting

It looks like there is code here for clearing the formatting and the image.

Higher up, it looks like this would also work for an Excel sheet.

And when we go to close it, it just asks us if we want to save the changes.

Off-Mac-6

So the macro calls out to the IP with a random 7-character string followed by “.jpg”.

The function chooses a value between 97 and 122, which is the ASCII code range for lowercase letters. Each random number is converted to its lowercase letter (ASCII char code) and added to the final value, until it reaches a length of 7.
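The same logic is easy to reproduce outside of VBA. Here is a short Python sketch; the function name `random_jpg_name` is my own and the macro’s actual code is structured differently, but the range (97 to 122) and the length (7) are the ones described above.

```python
import random

def random_jpg_name(length: int = 7, rng=random) -> str:
    """Build a random lowercase name the way the macro does, then add the extension."""
    # 97..122 are the ASCII codes for 'a'..'z'; randint includes both ends.
    letters = [chr(rng.randint(97, 122)) for _ in range(length)]
    return "".join(letters) + ".jpg"

print(random_jpg_name())
```

Because the name changes on every run, the file name itself is not much use as an indicator; the IP and the “7 lowercase letters plus .jpg” pattern are the parts worth noting.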

So that’s it for the decoding part.

The next problem was that, after enabling the content, it would not “un-enable” no matter what the settings were.

So what is the problem with that?

After enabling the content once, the file becomes a “Trusted Document”. The problem is, how do you un-trust it again?

We have to go to File –> Options –> Trust Center –-> Trust Center Settings (Button) –> Trusted Documents –> Clear all Trusted Documents …… (Button “Clear”)

TrustedDoc

I’m not finding a way to see a list of what is trusted, or even whether there are any trusted documents. Perhaps there is a screen I’m not seeing somewhere.

Also, I don’t know if this is a bug or not, but the “Allow documents on a network to be trusted” option seems to automatically recheck itself after I uncheck it, close the document and reopen it.

So I used the mantra “When in doubt, run Procmon” to locate where these are stored.

Procmon 

I first set a filter for “Category is Write”, then looked for the string “Trust”. Once I found this registry key, I added a “Begins with” filter on the key, removed the other filter, and got the above view.

And if we look at it in the registry:

RegEdit

And if we dump the key:

RegDump

I did a hash calculation of the document after it was saved, so I have another question: what is the hash they are using?

CompHash

I’ll also have to figure out the time format.
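One guess worth testing for the time format: Windows commonly stores timestamps as a FILETIME, a 64-bit little-endian count of 100-nanosecond intervals since January 1, 1601 (UTC). I have not confirmed that this is what the value data here uses, so treat the sketch below as an experiment to run against the dumped bytes, not an answer. The function name and the idea of reading the first 8 bytes are my assumptions.

```python
from datetime import datetime, timedelta, timezone

def filetime_to_datetime(raw: bytes) -> datetime:
    """Interpret the first 8 bytes as a little-endian Windows FILETIME (assumption)."""
    ticks = int.from_bytes(raw[:8], "little")        # 100-nanosecond intervals
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ticks // 10)

# Sanity check with a known value: this FILETIME is the Unix epoch.
print(filetime_to_datetime((116444736000000000).to_bytes(8, "little")))
```

If the decoded date lands close to when I clicked “Enable Content”, the guess is probably right; if it comes out as a nonsense year, the format is something else.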

Once we clear these (by clicking the Clear button), this key will be deleted and the documents will no longer be trusted.

Well that’s it for this one. I hope you learned as much as I did.


A Look under the hood of a batch encrypted file

The sample in question today comes from a Twitter thread by Nick Carr @ItsReallyNick and Daniel Bohannon @danielhbohannon of FireEye, located Here, about this builder being used to encode batch scripts.

After downloading the sample from VirusBay @virusbay_io that Nick linked to, and after removing the first 2 bytes (the byte order mark) from the file, I was able to open it in Notepad++.

Here is what we are greeted with.

Script-1

That is a lot to deal with, so let’s take a closer look.

Script-2-a

Looking at this we can see several things that stand out. It is using environment variables in the form of “%os:~-4,1%”.

This is actually a two-part operation. The left part, “%os%”, expands the OS environment variable. The right part, “~-4,1”, separated by the “:”, gives the position to start taking characters from and the length. Notice that the first value is “-4”, which means we start from the end of the expanded value, work back 4 characters, and then take 1 character.
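Here is how I would sketch that substring rule in Python. The function names are mine, the environment is passed in as a plain dictionary instead of being read from the system, and it only covers the “%name:~start,length%” form seen in this sample. The value “Windows_NT” is what the OS variable normally holds on Windows.

```python
import re

def batch_substring(value: str, start: int, length=None) -> str:
    """Emulate %var:~start,length% : a negative start counts back from the end."""
    n = len(value)
    begin = max(n + start, 0) if start < 0 else min(start, n)
    if length is None:
        end = n
    elif length < 0:
        end = max(n + length, begin)   # negative length: stop that far from the end
    else:
        end = min(begin + length, n)
    return value[begin:end]

_SUBSTR = re.compile(r"%([^%:]+):~(-?\d+)(?:,(-?\d+))?%")

def expand_substrings(text: str, env: dict) -> str:
    """Replace every %name:~start,length% in text; anything else passes through untouched."""
    lookup = {k.lower(): v for k, v in env.items()}   # variable names are case-insensitive
    def repl(m):
        value = lookup.get(m.group(1).lower())
        if value is None:
            return m.group(0)                         # unknown variable: leave it alone
        length = None if m.group(3) is None else int(m.group(3))
        return batch_substring(value, int(m.group(2)), length)
    return _SUBSTR.sub(repl, text)

print(expand_substrings("%os:~-4,1%et", {"OS": "Windows_NT"}))
```

The plain “et” after the variable passes through unchanged, which is the same mix of encoded characters and plain text that shows up in the script.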

Let’s see this in action on the command line.

Cmd-1

So here we can see that the 4th character back is an “s”. We do the same for the others.

The other thing you may notice from the boxes in the screenshot above is some plain text in between the environment variables. When plain text is encountered, it is passed on to the output as-is and needs no other processing.

One final thing you may notice is a very large block at the bottom of strings that look like the environment variable above, but instead of a name like “os” we have just a single quote (‘). The only problem is that we have to decode the top part to see if it tells us how to decode the bottom part.

Script-3 

So now we have enough information to build a tool to decode this based on the observations so far.

So after a day of building, testing and bug hunting we end up with this.

Decode-1

Here we can see that the top part seems to decode to something, but the bottom part is just gibberish. So what is the problem, and how do we figure it out?

Thankfully, Michael Bailey @mykill of the FireEye FLARE Team came out with a tool called “De-DOSfuscator” that works for this type, along with a blog post Here. After studying the blog post several times, I noticed that the output of my tool was similar to what his tool output in Figure 7 of the blog post. So I guess I’m on the right track, but how do we get the rest of this to decode?

If we take a closer look at the first part that gets decoded, we can see a set command in 2 places that sets ‘ (single quote) = [long string of characters], followed by an “&” at the end.

We can see this better if we split all of the strings at a single “&”.
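A quick way to do that split is a regular expression that only matches a lone “&”. This is a rough Python sketch with a made-up function name; it skips “&&” and an “&” escaped with “^”, but it does not try to track quoted strings the way cmd.exe does.

```python
import re

def split_commands(line: str) -> list:
    """Split a batch line at single '&' separators, leaving '&&' and '^&' alone."""
    # Not preceded by '&' or '^', and not followed by another '&'.
    return re.split(r"(?<![&^])&(?!&)", line)

for part in split_commands("set a=1&echo hi&&echo ok"):
    print(part)
```

I deliberately do not trim the pieces: in a “set” command any space before the “&” becomes part of the value, and that matters when the value is later used as a lookup string.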

Script-4

Script-5

Now we can see how the value for working with the bottom part is set.

After trying several variations of this string, with no luck decoding any of the values in the bottom section, I finally broke down and installed “De-DOSfuscator” on a VM. After several false starts and some help from @mykill, I got it set up and running the way it is shown in the blog post.

By using this tool you don’t have to understand how cmd.exe parses the files as it lets the cmd.exe interpreter do the parsing and just logs the results.

You may notice that one of the commands in this decoded part of the script was a shutdown command. Upon running the tool and the batch file I was not disappointed: the VM started to shut down, but not before “De-DOSfuscator” saved a log file of the commands it had run up to then. Here is what I saw upon restarting the VM.

Output-1

Although the output is very similar in my tool as it is here, something is different.

output-2

If you are able to zoom in, my tool output a much longer string. So what is the difference?

By the time “De-DOSfuscator” intercepted the parsed values, cmd had already peeled off the extra “^” characters.

If we download the “DOSfuscation” white paper from Here, we can see some information about the use of the “^” character on page 13, and on page 18 there is a screenshot of a script similar to what we are working on here.

So the next step is to hard code this value into my tool instead of extracting it and using the raw string and see if it will decode the remaining values correctly.

Decode-2

Great, it looks like it decoded part of it, but the rest is still a mess, so I built another new tool to work with just this part.

We now take the key/string value from this tool and load it into the new tool along with the full section of remaining index values that start with “%’:” (percent, single quote, colon).

Decode-4

This tool will extract the decoded string through to the final “echo ” and double quote, and then also return the remaining unused variable indexes.

Oh, this looks like there is another layer with a different index string/key.

While comparing the output from “De-DOSfuscator” to the decoded value from my first tool, I discovered that I needed to do a string replace of “^^^” with “^” to get the correct index string/key. I did it by hand the first time through, then added an option to the tool to do this automatically.
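One pass of that process can be sketched in a few lines of Python. The function names and the tiny key in the example are made up; the real key is the long string assigned to the single-quote variable, and my tool does more bookkeeping than this (finding the end of the layer and returning the unused indexes).

```python
import re

# Matches references such as %':~12,1% (variable name is a single quote).
_INDEX = re.compile(r"%':~(-?\d+),(\d+)%")

def normalize_key(raw_key: str) -> str:
    """Undo the extra caret escaping left in the raw script text."""
    return raw_key.replace("^^^", "^")

def decode_layer(key: str, encoded: str) -> str:
    """Replace each %':~start,length% with the matching slice of the key."""
    def repl(m):
        start, length = int(m.group(1)), int(m.group(2))
        if start < 0:
            start = max(len(key) + start, 0)   # negative start counts from the end
        return key[start:start + length]
    return _INDEX.sub(repl, encoded)

# Hypothetical 4-character key, just to show the lookup.
print(decode_layer("ohce", "%':~3,1%%':~2,1%%':~1,1%%':~0,1% hi"))
```

Each decoded layer contains a new “set” with the next key, so the same two steps (normalize the key, then decode) are repeated until no index references are left.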

So after multiple passes we get to level 11 and we can see that we have whittled down on the array values quite a bit.

Level-11

And finally pass 12.

Decode-5

Further testing to figure out what the last remaining values were revealed a bug in the way the tool extracts the remaining values for output. The program had reached the end and wrapped back around to zero, so it output the entire input string instead of returning nothing. I’ll have to fix that.

A closer view.

Decode-6

This took 1 pass to get the original “key” and 11 passes to get the final decoded string.

So thanks Nick Carr @ItsReallyNick  for trolling Daniel Bohannon @danielhbohannon .

This was a very interesting learning experience.

That’s it for this one. I hope you learned as much as I did.

Thanks for reading.


Understanding Invoke- “X” Special Character Encoding

I say Invoke-“X” because it can be found in both Invoke-Obfuscation and Invoke-DOSfuscation.

We can find a reference to the encoding scheme in this Twitter thread Here, where @danielhbohannon references the blog post from 2010 by @mutaguchi, which demonstrates a “Hello World” encoded string. I had to translate the post to view it. You can find the post here.

We can also find the link to the site in the Invoke-Obfuscation master folder in the script “Out-EncodedSpecialCharOnlyCommand.ps1”.

The script we are going to be working with today is from another Twitter thread on September 12 2018 located Here . It is a pastebin link from @James_inthe_box.

Here is what this script looks like.

FullScript

And a smaller sample view.

Top

Just looking at this it looks like total junk code.

After reading the other blog post we have a few ideas of how to work with this, so let’s clean it up a bit. The first thing to remember is that the “;” character is used as a command separator, so let’s put each command on a new line to make it easier to read.

1 

Now that we have the commands on their own lines, we need to understand what the first one is doing.

2

This first command creates a hash table: each value on the left side of the “= ++” is assigned from the hash table named “${‘].}” on the right-hand side.

As it goes down the list, it sets the index position in the hash table equal to the value inside of “{ }” on the left.

Once the values are set, each one acts as a lookup: a string like “${$}” effectively gets replaced with the number 0 in the rest of the script.
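The replacement step itself is plain string substitution. Here is a minimal Python sketch; apart from “${$}”, the variable names in the example are placeholders I made up, and the real list has to be read off the first command of the script in order.

```python
def substitute_indexes(script: str, names: list) -> str:
    """Replace each special-character variable with its position in the list."""
    lookup = {name: str(index) for index, name in enumerate(names)}
    # Longest names first, so a short name never clobbers part of a longer one.
    for name in sorted(lookup, key=len, reverse=True):
        script = script.replace(name, lookup[name])
    return script

# "${$}" maps to 0 as in the text; "${!}" and "${@}" are hypothetical.
names_in_order = ["${$}", "${!}", "${@}"]
print(substitute_indexes("[${!}${$}${@}]", names_in_order))
```

As noted further down, it is safest to run this on a copy of the script so the original assignments stay readable.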

Here we see what happens when we replace each value with the index number.

4

(I’ve restored our left hand values after doing the replacements. Always do replacements on a copy)

Now let’s take a look at the next command and see what it is doing.

5

As we can see, we now have a number inside the “[]”, like this: “$(@{})”[ 7 ].

As best I understand it, this takes the hash table’s type name, “System.Collections.Hashtable”, and in this case takes the character at index 7 to build a string.

So if we make an indexed list of that string (counting from 0) and take the character at index 7, we end up with “C”.

6

So we go thru and replace the 3 characters and then get to this one.

7

In short, “$?” evaluates to true or false depending on whether the last operation succeeded or failed. In this case it gives us the string “True”, and then we take the character at index 1 of that string, which is “r”.

So now that gives us “${*@} = “[Char]” ;” and we can do the replacements for that.
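These lookups are ordinary zero-based string indexing, which is easy to check in Python. In the sketch below, index 7 into the type name and index 1 into “True” come from the walkthrough above; the indexes 22 and 20 for “h” and “a” are ones I picked for the illustration and may not be the ones the script uses.

```python
TYPE_NAME = "System.Collections.Hashtable"   # what "$(@{})" turns into as a string
SUCCESS = "True"                             # what "$?" turns into after a success

# Print an indexed list of the type name to read characters off by position.
for index, char in enumerate(TYPE_NAME):
    print(index, char)

pieces = [
    TYPE_NAME[7],    # "C"
    TYPE_NAME[22],   # "h" (index chosen for this example)
    TYPE_NAME[20],   # "a" (index chosen for this example)
    SUCCESS[1],      # "r"
]
word = "".join(pieces)
print(word)
```

Four lookups are enough to spell the type name needed for the char-code conversions later on, without a single letter appearing in the script.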

8

Our next line does replacements similar to the one we just did, so let’s do those.

9

So now we can see that it decoded to the string “insert”. But the way it is called, it sets the value of “${‘].}” to the signature of the function, in the form “string Insert(int startIndex, string value)”. You can find a list here.

Now we have 2 “+” on this next line. The first 2 parts are like the last 2 lines, so let’s do those replacements.

10

Now for the last value: we set “${‘].}” to “ie” plus the character at index 27 of the Insert signature.

11

Index 27 is “x”, so our string is now “iex”.
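The index can be double-checked in Python. This assumes the signature string is rendered with no space between “Insert” and the opening parenthesis, which is the form where index 27 lands on “x”; the constant and function names are mine.

```python
# Assumed rendering of the method signature (no space before the parenthesis).
INSERT_SIGNATURE = "string Insert(int startIndex, string value)"

def build_iex(signature: str = INSERT_SIGNATURE) -> str:
    """Rebuild the command name: the literal "ie" plus one borrowed character."""
    return "ie" + signature[27]

print(INSERT_SIGNATURE[27])
print(build_iex())
```

The “x” comes from the end of “startIndex”, so the script never has to contain the string “iex” anywhere in its text.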

12

So the last step for this level of encoding is to do the char code replacements.

There are multiple ways to decode the char codes from this point, but I will go through and format it so I can just run it through my tool.
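If you don’t have a tool for this step, one of those other ways is a short regular expression in Python. This sketch assumes the script has already been reduced to a chain of “[Char]NN” values joined by “+”, and the function name is my own.

```python
import re

def decode_char_codes(expression: str) -> str:
    """Turn a chain like [Char]105+[Char]101+[Char]120 back into text."""
    codes = re.findall(r"\[char\]\s*(\d+)", expression, flags=re.IGNORECASE)
    return "".join(chr(int(code)) for code in codes)

print(decode_char_codes("[Char]105+[Char]101+[Char]120"))
```

Anything that is not a “[Char]NN” token is simply ignored, so it is worth comparing the decoded length against the number of tokens to make sure nothing was skipped.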

13

You may also notice that there is a “|” followed by the variable name for “iex” at the end here.

14

The Final Decode.

15

In the usual fashion, after going through this by hand I like to build a program so I can just copy and paste the encoded string, click a button, and get the decoded value back.

16

As you can see from the output, the decoding for this piece of malware is far from complete, but this is as far as we will go with it in this post.

Thanks for reading.
