Pulling apart Rig Exploit Kit

In the last post, "A look at a cross bred Neutrino EK–Rig EK Flash file", we saw where the two exploit kits were merged into one.

This one is pure Rig and looks the same on the surface as other samples I’ve looked at.

The sample used here is from Brad @malware_traffic 

http://www.malware-traffic-analysis.net/2016/08/17/index.html 

The main difference between Rig EK and Neutrino EK is that Neutrino EK uses a layered approach with multiple flash files and embedded binary files that get decoded as needed. Rig EK uses more of a scattergun approach: 1 flash file and multiple embedded binary resource files.

They also split the functions into many classes, functions and parameters to make it more difficult to follow along. Some parameters/vars just get passed to new ones and are not used right away.

Note: My hat's off to the person(s) that made the obfuscation tool that exploded this thing into 20 different files. That said, I hope whoever it was is turned into a babbling idiot after writing it. I almost was, trying to follow along with this crazy maze. I feel better now.

As mentioned above, they are splitting everything into 20 ActionScript files and are using 5 different embedded resource files.

They are using 2 different Xor functions to calculate the index value for the extracted replacement values.

They are also using RC4 encryption for one set of replacements and AES-128 ECB for the other set, for a total of 25 different values decoded for use as replacements throughout various parts of the code.

If we look at what the files look like in the folder normally after decompiling we see this.

FileNmaes1

But if we let it do a rename for us we see this.

FileNames2

That’s much better. Now if we look at the second file that it calls, we see this.

Script1

And after renaming it looks like.

Script2

If this looks familiar, that is because it is the same function from the last post. The difference is that instead of leading to the RC4 decrypter it leads to a very well exploded, self-contained version of the AES decryption routine. It works the same way, passing in the values to get decrypted.

I believe they may have done it this way either so there would be no call to the built-in crypto provider to raise a flag on what was used, or because what they wanted was not available.

This version is only extracting 6 byte arrays to decrypt and 1 key. The last byte array is 2528 (0x09E0) bytes long and is listed as the shell code.

They do several things to this “Shell Code” byte array before it gets used anywhere. It gets sent to a string, has another string tacked onto the end of it, gets turned back into a byte array, gets reversed (littleEndian), and that’s all I can remember offhand.

In this view of class_12 (file) we can see there is a lot going on.

class_12-view-b

If you look closely at the two “Calculate replacement index value” sections, you will notice that they call two separate classes. Although they are two separate classes, the number that those values get XOR’ed with is the same. class_14 gives you the value used to calculate the index number in the RC4 decode. class_2 is the extracted and reversed binary value, converted to an Integer, that gives the value to XOR with for the AES decode.

Here we see the start of the RC4 decode routine with the value (var_78) that is used to do the XOR to get the index value.

StartRc4

And here is the XOR routine.

rc4Xor

In the screenshot above I was comparing the values from the RC4 and the AES to verify they were the same.

Using a tool I created for working with Angler EK to calculate 32-bit XOR values, we can verify what the results are. Using Calc.exe on a 64-bit system gives a different result.
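One plausible reason for the difference, offered here as an assumption rather than something shown in the sample, is integer width: ActionScript's int is a signed 32-bit value, while a calculator working in 64 bits keeps the upper bits. A minimal Python sketch of a 32-bit XOR (the input values are made up for illustration):

```python
MASK32 = 0xFFFFFFFF

def to_int32(value):
    """Wrap any Python int into the signed 32-bit range."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def xor32(a, b):
    """XOR two values the way a signed 32-bit integer type would."""
    return to_int32((a & MASK32) ^ (b & MASK32))

# Hypothetical inputs, not taken from the sample.
a = -1234567890
b = 0x5A5A5A5A

print("32-bit result:", xor32(a, b))
# Same operands viewed as an unsigned 64-bit quantity.
print("64-bit result:", (a & 0xFFFFFFFFFFFFFFFF) ^ b)
```

The two printed numbers differ whenever one operand is negative, which is the kind of mismatch to watch for when checking index values by hand.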

CalcIndex

So here is the list of RC4 replacements.

RC4Replacements

Here are the AES replacements.

AesReplacements

Well, that's it for this one.

There is a lot more that could be said about this one, but tracing the code can be difficult at times. I traced the AES function across 4 class files and multiple functions and variables.

Thank you for reading.


A look at a cross bred Neutrino EK–Rig EK Flash file

This sample comes from a recent post by Jérôme Segura of Malwarebytes: https://blog.malwarebytes.com/threat-analysis/exploits-threat-analysis/2016/08/neutrino-ek-more-flash-trickery/

That the post showed the flash file being sent from the compromised site rather than a “Gate” is interesting. What is more interesting is what is inside this flash file.

After decompiling the second level of this flash file, the first thing I noticed was the file structure.

Folderdiff-b

To me this folder structure and naming convention looked more like the Rig EK than Neutrino EK.

If we take a look inside this to find where the configuration file gets decoded we see this.

Config-Decode

Interestingly enough, the Rig EK sample from Broad Analysis at http://www.broadanalysis.com/2016/08/15/rig-exploit-kit-from-185-158-152-195-delivers-zbot-banking-malware/ shows the following.

Config-Decode-Rig

Besides the naming of the variables, these two functions are the same. What they do after that is somewhat different, though.

After beating my head against the wall for a couple of days trying to figure out these weird loops, I finally broke down and took a crash course on how to crash a flash debugger.

Ryan Chapman @rj_chap, in his BSides presentation, was using FlashDevelop from http://www.flashdevelop.org/, so I decided to try installing it. After several tries and some Googling I finally figured out how to get the decompiled file to recompile with that program. After a few more hours trying to solve a problem where it threw an error on readint(), I finally gave up and abandoned that program. The breakpoints didn’t break, and a few other things about it make me never want to try it again.

So I went back to the Free flash decompiler and figured out how to set it up to run as a debugger. That finally helped get me the last clues even though I was not able to step thru it like a normal debugger and see what all of the values were.

One thing this version did was, instead of using a normal RC4 Function like this.

RC4

They separated the steps inside it into three different functions.

Exploded1

Our decode function from above sends the key to this part, which builds the SBox. The SBox is then used in the section below for the final Xor.
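For readers who want to follow along without the screenshots, here is a minimal Python sketch of standard RC4 split into three steps: build the SBox from the key, generate the keystream, and do the final Xor. The function names and the exact split are illustrative assumptions; the sample's own functions are obfuscated and may divide the work differently.

```python
def build_sbox(key):
    """Key-scheduling step: permute 0..255 using the key bytes."""
    sbox = list(range(256))
    j = 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) % 256
        sbox[i], sbox[j] = sbox[j], sbox[i]
    return sbox

def keystream(sbox, length):
    """Generation step: produce `length` keystream bytes from the SBox."""
    s = sbox[:]  # work on a copy so the caller's SBox is untouched
    i = j = 0
    out = []
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return out

def xor_bytes(data, stream):
    """Final step: Xor each data byte with the matching keystream byte."""
    return bytes(b ^ k for b, k in zip(data, stream))

def rc4(key, data):
    """RC4 is symmetric, so the same call encrypts and decrypts."""
    return xor_bytes(data, keystream(build_sbox(key), len(data)))

# Made-up key and message, purely to show the round trip.
secret = rc4(b"example-key", b"replacement%values%here")
print(rc4(b"example-key", secret))
```

Because the three steps only pass a key, an SBox and a byte array between them, spreading them across separate functions or classes changes nothing about the output, only how hard it is to recognise.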

Exploded2

It will then be returned to the original function and pushed as a string, basically to a list.

It will cycle like this several times, in this case 45 times.

A list of what? Well, of replacements.

Decoder-b

And where are these used?

UseConfig

And if we compare that to the last one I pulled apart.

UseConfig2

We can see here it is pretty much the same, but what are all of these values?

Values1 

As it turns out, each of these gets Xor’ed with a value from the embedded binary data, which has its bytes reversed and is converted to an integer.
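As a rough sketch of that step in Python: reverse the raw bytes, read them as an integer, and Xor the result with the obfuscated number to get the index. The byte values and function names below are invented for illustration and are not taken from the sample.

```python
def reversed_bytes_to_int(raw):
    """Reverse the byte order, then read the bytes as one integer.

    This is the same as reading the original bytes as little-endian.
    """
    return int.from_bytes(raw[::-1], "big")

def replacement_index(obfuscated_value, raw_key_bytes):
    """Xor the obfuscated number with the integer built from the binary data."""
    return obfuscated_value ^ reversed_bytes_to_int(raw_key_bytes)

# Hypothetical 4 bytes as they might sit in an embedded binary resource.
raw = bytes.fromhex("78563412")
xor_value = reversed_bytes_to_int(raw)
print(hex(xor_value))

# A made-up obfuscated constant that decodes to index 7.
print(replacement_index(xor_value ^ 7, raw))
```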

Values2

So if we Xor these numbers we end up with something like this (see the values in the comments).

Values3

Which as it turns out does match up to the index values shown above in the decoder.

So how does this decode function work? Let's look at the binary files and let them tell us.

BinFile-1-b

We skip the first 2 bytes, take the next 2 bytes and convert them to an Integer and that is how many times we will loop.

Next we skip 2 more bytes, then take the next 2 bytes, and that is how many “Data Bytes” we will take and add to the “Array Of Bytes” for later decryption.

Then skip 2 bytes again and repeat until you have looped thru the given number of times.
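The layout described above can be sketched as a small Python parser. The byte order of the 2-byte fields is not stated here, so big-endian (the ActionScript ByteArray default) is assumed and exposed as a parameter; the test blob is made up.

```python
import struct

def extract_chunks(blob, byteorder=">"):
    """Pull the length-prefixed byte arrays out of a resource blob.

    Assumed layout: 2 bytes skipped, 2-byte loop count, then for each
    entry 2 bytes skipped, 2-byte length, and that many data bytes.
    """
    offset = 2                                   # skip the first 2 bytes
    (count,) = struct.unpack_from(byteorder + "H", blob, offset)
    offset += 2
    chunks = []
    for _ in range(count):
        offset += 2                              # skip 2 bytes before each length
        (length,) = struct.unpack_from(byteorder + "H", blob, offset)
        offset += 2
        chunks.append(blob[offset:offset + length])
        offset += length
    return chunks

# Invented blob holding two entries: b"abc" and b"de".
blob = (b"\x00\x00" + (2).to_bytes(2, "big")
        + b"\x00\x00" + (3).to_bytes(2, "big") + b"abc"
        + b"\x00\x00" + (2).to_bytes(2, "big") + b"de")
print(extract_chunks(blob))
```

Each returned chunk would then be handed to the decryption routine along with the key.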

In the Rig sample it was only 6 times.

I’m pretty sure I’ve seen this method used somewhere before.

Let's look at the Key file.

BinFile-2-b

And the number of bytes to get is called from this function.

Get16Bytes

Well that pretty much covers the bulk of this one.

So the question remains: which EK will this get identified with? This sample has both.

Is this just a trial run, or a new way they will be built?

As it turned out, my shiny new decoder did not work on the Rig EK sample even though they used the same method to pull the bytes.

The difference is that the Neutrino EK was running those bytes thru an RC4 decoder and the Rig EK sample was running them thru an AES decrypter.

So I guess I’ll have to build a new decoder for that one. Bummer.

This one is kind of short compared to the others but before I go 1 more thing.

This is what I got while trying to update a new test system.

Windows7Update

Enough said.


Pulling apart Neutrino EK

I’ve spent the last few days going from top to bottom of 3 different Neutrino EK infections.

The one I will show here is from Broad Analysis @BroadAnalysis  from their site http://www.broadanalysis.com/2016/08/08/new-c2-neutrino-exploit-kit-via-pseudodarkleech-hopto-org-gate-delivers-crypmic-ransomware/

You can download the pcap of the traffic from there.

Anyone that has read any of my posts before knows that I prefer to do static analysis whenever possible. Facing a new piece of malware for the first time is like finding a piece of cipher text that needs decrypting.

Neutrino EK is similar in complexity to the Angler EK, except that it uses many flash files to hide what it is doing rather than just scripting and encoding/encryption.

There are many great articles out there that have already covered most of what I could write, so I will concentrate mostly on understanding the code and how to follow it. At the end of this post I will list a few links to some good articles that have helped me put the pieces of the puzzle together.

By doing static analysis and creating most of my own tools, I can change and adapt them myself for whatever changes the malware authors may make.

The Neutrino EK is comprised of what I call 4 levels, 5 if you count the original flash file.

We will first need the JPEXS Free Flash Decompiler, which for me worked flawlessly in a VMware virtual machine using a never-before-used copy of 32-bit Windows XP Pro that I have been saving for a rainy day since the SP2 version was still new. It kept crashing when I tried it in 64-bit Windows 7.

Once we extract/decompile the flash file we have a folder structure like this.

Level1Folders

If we export everything and then remove what is empty, this is what is left.

The first thing we will want to look at here is the scripts folder. The “binaryData” folder is the binary files we will have to work with later.

Scripts

Here we see we have two action scripts outside of the other 2 folders.

FirstFile

I’m no ActionScript guru, but this appears to be the entry point into this rabbit hole. It is creating a child of, or running, the object referred to here, which is the other file outside of the other 2 folders. So what is in that?

Second-1

Here we see that this is importing several items to use in the script. Three of them start with “flash”, so those are the native Flash functions being imported. But what are the ones starting with “r”? To answer that question we go back to the symbols file and look at it.

symbols

If we look at this and then look in the binary resource folder, the names after the “r” match those files, so we figure the “r” is for resource. The last one on the list is the first file we looked at.

second-2-a

In the screenshot above we see in the red there is a variable but if we trace it down it is not used anywhere. They add a lot of code that does not do anything in order to help hide the small parts that do. In the blue we see more variables that are declared and set but never get used.

In the green, though, it is setting “e” to the result of the function “m”, which is the RC4 decoding routine, and passing in 2 of the embedded binary resource files. After decryption, the result is sent to the “to string” method and then split on the character “%” into a “String Array” to be used later for replacements.

As you can see at the bottom, this.e[3] would equal “stage”, counting from index 0.

(I missed separating the two at that one point in the screenshot above.)

second-3-a

Here in the red box is yet another variable that is declared but not used.

In the blue box we see where it is assembling the byte arrays from the embedded resources. It is next run thru the RC4 decryption with the key and it returns another flash file as a byte array.

It then passes on to the “s” function  where it pulls another byte array from the binary resources and passes it to the function “et” in the launched second level flash file.

What does this last part do? It is the most important part of the whole operation: once it gets decrypted, this is the configuration file.

Fast forward to extracting the second level and locating the function “et”.

Level2-et

This passes the last byte array from the first level as param1 here and sets “s=s” to “param 1”. It will then call the function “s9s” if it has an event listener, or it will add the result of that function call.

Level2-s9s

The main part we are concerned with in this one is that it sets the value “s0s” to a string and does some checks. If they don’t pass, it will bail and not decrypt the config file.

configdecode-a

We can see that the name of this function gets called from the last one if it is not found.

And what does the config file look like for this one after getting decrypted?

ConfigFile

Or using a JSON Viewer it looks like this.

ConfigFile2

Here we can see that they can call out to several locations depending on what the result of the fingerprinting of the system is. It can give you nothing or the malware payload.

WorkingFolderLevel2

This is what my binary resource folder looks like after doing the decryption and decompressing.

The 3 new flash files only get RC4 decrypted, and again they get served up depending on what the fingerprint is. The 3 html files and the JavaScript file each get RC4 decrypted and then have to be decompressed. This is a case where I made my own tools to work with these files in order to get the plain text.

The main fingerprint file to the best of my knowledge is the JS file. The only problem is it looks like this.

JS1

In order to decode it we have to take this function and the var “v” and decode this into an array of strings, which is what was shown by @BroadAnalysis in a Twitter post earlier today.

The very first time I worked with this I did all 70+ var replacements by hand. I’ve since created a new tool to help with that.

JS3

Here is what it looks like after the replacements. This code may not run but at least you get an idea of what is going on.

At this point we have a pretty good idea of how to decode this for each new level that is presented but what if we wanted to search around to see what connects to what.

I was having a problem where a tool I created some time back to look for strings in files was not finding the values as displayed by Notepad++.

codeproblem-a

As you can see here, they are using some strange characters for the names of the functions and classes. If you try to copy and paste one of these into anything else and search for that string, it does not find it.

I finally opened one of these up in a hex editor, paid closer attention to the variable names, and saw this.

CodeProblem2

As you can see, that looks different, so what is up with that? As it turns out, the answer came from the way I have been extracting the html pages. I had discovered on the first file I ever worked with that you cannot just open the decoded/decompressed byte array with a hex editor and copy the string part to a text file, because it can leave extra escape chars in it. That gets really confusing and is a pain to clean.

This file above is actually one of the Action Script files that the flash decompiler extracted.

So, based on the idea from the html files, I used another tool I built a while back that converts hex bytes to a string using any encoding supported on the system. Normally that was UTF-8, but after discovering this problem I found that UTF-7 would display what was in the hex file correctly. So I opened the ActionScript file in a hex editor and copied the bytes to this program to be converted using UTF-7.
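The same kind of conversion can be sketched in a few lines of Python, which ships with both codecs. The hex bytes below are an invented example and not the obfuscated names from the sample; the function name is also hypothetical.

```python
def hex_to_text(hex_bytes, encoding):
    """Decode hex copied from a hex editor using the chosen text encoding."""
    raw = bytes.fromhex(hex_bytes)  # spaces between byte pairs are ignored
    return raw.decode(encoding, errors="replace")

# Made-up bytes: the ASCII text "caf+AOk-".
sample = "63 61 66 2b 41 4f 6b 2d"

for encoding in ("utf-8", "utf-7"):
    # The same bytes produce different text depending on the codec chosen.
    print(encoding, "->", hex_to_text(sample, encoding))
```

Trying each candidate encoding on the same bytes and comparing against what the editor shows is a quick way to find the one that matches.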

Convertutf7

CodeProblem3

Now I can copy paste those names into my search tool and can find them anywhere in the sub folders to see what connects to what.

Before we move on to the level 3 code, let's take a quick look at the script to decode a binary resource file.

resourcehtml-a

As you can see there is a lot going on here to just load this iframe.

This is already getting long, so let's move on to the last thing I want to show you, which I have not seen mentioned before.

References to “MAC”

PlayerType

From this, it appears to be checking whether it is running on a “MAC”, and there is one more flash file for a level 4 that can be decrypted. In one other sample I had worked with I also found references to “MAC” in the extracted files, so I’m not sure whether this will work on a “MAC” or not. I’ve never used one and have no way to test it.

As we have seen here, it can get quite confusing and difficult to travel thru these extracted files to see what does what.

After you learn the tricks and have the tools you can completely decode one of these in a matter of a few hours or less.

The analysis of all of the extracted code could take considerably longer though.

Well that’s it for this one , if you made it this far thanks for reading.

 

References:

This helped with the last piece of the puzzle, the config file.

http://malwageddon.blogspot.com/2015/03/data-obfuscation-now-you-see-me-now-you.html

And this one.

https://www.sans.org/reading-room/whitepapers/detection/neutrino-exploit-kit-analysis-threat-indicators-36892

Fingerprinting

https://blog.malwarebytes.com/cybercrime/exploits/2016/06/neutrino-ek-fingerprinting-in-a-flash/

Talks about an exploit added after someone published a POC

http://securityaffairs.co/wordpress/49383/cyber-crime/neutrino-ek-ie-flaw.html

I’m sure there are plenty more good ones out there .

Let me know and I’ll try and add them.


My first deep look at KRYPTOS K4

My first exposure to KRYPTOS was most likely when I saw it used in the TV series Alias.

I most likely looked it up, got an idea of what it was about, then forgot about it for the next several years.

A few years ago I started working on the ciphers for Ghost in the Wires, where I have still only completed the ciphers for chapters 1 to 35, leaving the remaining 3 to finish. I’m still missing something about chapter 36. (Any clues?)

During that process I created several tools, one of them being a Quagmire 3 cipher tool, and while searching for possible solutions for the Ghost in the Wires ciphers I kept running across references to the Kryptos ciphers. Dropping the first two ciphers into the Quagmire 3 tool showed that they could be solved using that tool and the known keys.

If we look at Quagmire 3 it uses two keys and an indicator letter. If we test it with “A” we get this.

Queag3-1

If we keep scrolling down the indicator letters, when we get to “K” we see this.

Quag3-2

Or we can cheat and test all of them at the same time.

Quag3-3

The question is, if we have some cipher text, plain text and the main key (KRYPTOS), can we figure out the Indicator key? And of course the answer is yes.

Quag3Decode1

From the repeating pattern here we can guess what the keyword is.

In the process of creating this tool I discovered that the indicator letter was irrelevant to finding the indicator key. Given the way that Quagmire 3 works we can make a list of all 26 alphabets and compare if the index positions of the plain and the cipher text agree. Then the first letter of that alphabet found is the current letter for the Indicator key.
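The lookup described above can be sketched in Python as follows. This is not the tool from the screenshots; it assumes the usual Kryptos tableau convention (the KRYPTOS-keyed alphabet across the top and each row a rotation of it), and the sample text is the publicly known opening of K1.

```python
import string

def keyed_alphabet(keyword):
    """Keyword letters first (no repeats), then the rest of A-Z in order."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def recover_indicator_key(cipher, plain, keyword="KRYPTOS"):
    """For each plain/cipher pair, find the row alphabet that maps one to the other."""
    alphabet = keyed_alphabet(keyword)
    # All 26 row alphabets: every rotation of the keyed alphabet.
    rows = [alphabet[i:] + alphabet[:i] for i in range(26)]
    key = []
    for c, p in zip(cipher.upper(), plain.upper()):
        column = alphabet.index(p)       # position of the plain letter
        for row in rows:
            if row[column] == c:         # this row produces the cipher letter
                key.append(row[0])       # its first letter is the key letter
                break
    return "".join(key)

# Opening of K1: cipher text and its known plain text ("BETWEEN SUB...").
print(recover_indicator_key("EMUFPHZLRF", "BETWEENSUB"))
```

Run on a longer stretch of K1, the output repeats, which is the pattern the keyword is guessed from. Note that no indicator letter appears anywhere in the function.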

This tool also works with K2.

Given this information, let's try it out on K4.

K4-1

A few things to notice here: we only have 1 repeating letter, and since our search is from the middle of the cipher text, the letters may need to be rotated forwards or backwards to be used as a key due to the way the cipher works on key lengths.

Together this implies that this is “Not” a Quagmire 3 cipher like the first 2 were.

So let's go back to the statistics and see what we can learn from them.

If we do a letter count using http://rumkin.com/tools/cipher/frequency.php we can see that every letter is used, thus eliminating any cipher that does not use all 26 letters. We also see that it gives an index of coincidence score of 0.0361. Anything above 0.0500 could be getting into some form of mono-alphabetic substitution cipher like an Atbash cipher, or plain text.
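For reference, the index of coincidence can be computed with the standard formula, the sum of n(n-1) over each letter count divided by N(N-1) for the whole text. The sketch below is a generic Python version and may be normalized differently from the web tool linked above.

```python
from collections import Counter

def index_of_coincidence(text):
    """Chance that two letters picked at random from the text are the same."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total < 2:
        return 0.0
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))

# Any cipher text can be pasted in here; this string is just a placeholder.
print(round(index_of_coincidence("THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"), 4))
```

Typical English text scores around 0.067 and evenly random letters around 0.0385 (1/26), so a score near the random value points toward a poly-alphabetic or otherwise well-mixed cipher.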

I spent the last 3+ weeks looking at everything from Atbash to Quantum Cryptography, Morse code, Fractionated Morse and binary manipulation. I was even trying to find a way to use a calculation for “SINE Wave” but couldn’t get the math to work out.

I’ve come to the conclusion that this may be some form of a home grown poly-alphabetic substitution cipher along the lines of a Quagmire, using many alphabets to encode with. But what?

While staring at this I finally started seeing some patterns show up.

Text1

What is the first thing you notice here?

We have 3 sets of letters that encode to or from different letters, suggesting that there are at least 3 different alphabets used to encode this.

Text2

What is the next thing we notice?

Text3

Text4

Looking at this, it suggests that for each of the plain text letters the cipher letter gets bumped up 1 for every space in between each letter. It works for these two, but will it continue forwards and backwards for the rest of the cipher text?

The problem is, if you were creating this cipher it would be fairly easy to count what the cipher text letters would be if you know the spacing on the plain text and what alphabet was used. You would also need to know what the starting point was, or which letter you start encoding with. Here it appears to be a normal A-Z alphabet for these two samples.

In reality having mostly only Cipher text it is a real pain to calculate backwards or to decrypt it. I kept messing up my count, or was not sure if I messed it up or not so I wrote a program to do the counting for me.

LetCalc-1

Based on the theories from above, I created this program to calculate forwards and backwards from our known plain text letter and our cipher text letters at their position in the cipher text.

We start with the known alphabet index location of the cipher letter and start counting backwards with the given alphabet and the initial position in the cipher text. If the current letter of the alphabet equals the current letter in the cipher text, we replace the lower case “x” in our “Test String” with the upper case letter we are testing for, in this case “B”. We then do the same going forwards.

After some trial and error and some bug hunting I also came up with some rules to narrow down what was a potential match.

The first test is to see if the alphabet we used will change the wrong letters in our known text.

The second test: once we get an output using one of the current 4 alphabets, we change the location to the first letter replaced and reset the letters and index position to match where we start this test from.

letcalc-2-b

As we can see here, using a normal A-Z alphabet messed up the third letter, which tells me that a different alphabet was used.

LetCalc-3b

Here using the first “L” we can see that it properly replaces the second “L” also.

It works the same for the 2 “C’s” in “Clock” also.

After going thru all of the letters in this manner we end up with this wild looking matrix.

Decode1

This represents the best results by following the rules. I stopped bringing down the letters after I noticed a problem with this.

decode1-cr-b

The “B” and “C” both land at the same position. This suggests to me that either there are other alphabets used, or it resets to the original position at a certain count in the cipher text or after a set number of plain text letters, which could change my second rule some.

Decode-2

Here we see the output of the “C’s” compared to the selected “B”. The “B” is the result of using a Reverse KRYPTOS alphabet which was the only one not to break the original rules.

After doing some more calculations on potential letters to fill in around our known letters I ended up with this.

FillinTest2

You may be able to see in this that I was able to extrapolate the word “Crafty”.

So if this is correct then we would now have “Crafty Berlin Clock”

I have not cracked this yet. It will require further investigation to see if this method holds true for smaller groups, for example resetting the alphabets every 24 or 26 letters.

I also will need to try every letter in each position that seems correct.

Perhaps this was the way he encoded it or perhaps I’m just jumping down the rabbit hole.

This could also be like a Hill Cipher that was encoded with a matrix that has no “Inverse” matrix, thus not being able to be decoded.

That’s it for this one, and time to turn back to other tasks for now, like reversing malware.

I will have to try more when I can come up with a way to automate this process.

Thanks for reading if you made it this far.


De-obfuscating Cerber Malspam file

On July 1st 2016 I saw a tweet by Herbie Zimmerman @HerbieZimmerman where he had gotten a zip file from some malspam containing an obfuscated JavaScript file.

The infection chain is documented on his site here https://www.herbiez.com/?p=550

He had posted that he had trouble reversing the script, so that is what I will cover here.

As we scroll down the script we first see this.

Script1

We see it starts by setting a variable, assigning a value, then tacking more onto the end of it. We next see another variable name and a value assigned to it.

If we look closely at the second variable, it is the same variable name and value assigned many times.

So we scroll down until we see this.

Script2

Tacking more onto our initial value. Scroll down some more.

Script3

Hmm a regular expression and new array.

Script4

Looks like this is only supposed to work this year.

Script5

Hmm, if the date doesn’t match it returns an empty string, else ……

Script6

Let's take a closer look at this.

Script7

This looks like it takes our first variable as a hex array and Xor’s it with decimal 68.

A close look at the second variable name shows it is never used, so let's clean this up and see what is left.

Script8

Now that we have this cleaned up, what do we have?

At the top we see our initial variable as a string of hex chars.

Next our variable is reassigned as a hex array by using the regex to split the string into pieces of 2 hex chars each.

Then we start the while loop.

Next is a check for the year. The interesting thing about this is that the function used is deprecated, and in an html page it returns 116 instead of 2016. If you try to drop this script into an html page the check will always be false.

If it succeeds in matching the date, it will run thru the array of bytes and Xor each one with decimal 68 (that tripped me up for a few minutes), then finally output the string, which is the decoded script.

The final 2 lines take our decoded script, drop it into a new function, then call it to run the decoded script.
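The decode step itself is small enough to sketch in Python: split the hex string into 2-char pieces, Xor each byte with decimal 68, and join the characters. The sample string below is made up with a helper that applies the same Xor; it is not the real payload, and the function names are hypothetical.

```python
import re

XOR_KEY = 68  # decimal 68, not hex 0x68

def decode_hex_xor(hex_string, key=XOR_KEY):
    """Split into 2-char hex pairs, Xor each byte with the key, return text."""
    pairs = re.findall(r"[0-9a-fA-F]{2}", hex_string)
    return "".join(chr(int(pair, 16) ^ key) for pair in pairs)

def encode_hex_xor(text, key=XOR_KEY):
    """Helper used only to build a demo string for this example."""
    return "".join(f"{ord(ch) ^ key:02x}" for ch in text)

obfuscated = encode_hex_xor("var x = 1;")   # invented stand-in for the payload
print(obfuscated)
print(decode_hex_xor(obfuscated))
```

Since the year check only decides whether the decode runs at all, it can be skipped entirely when decoding offline like this.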

And here is what it looks like.

Script9

The script itself calls out to “http[:]//220.181.87[.]80/ok.jpg” to download the file, uses a random name generator to create a 1 char name from the alphabet “1234567890abcdef”, saves the file to the temp folder, then launches the resulting .exe file.

That’s pretty much it for this one.

Ok, so I rushed over the part about getting the variable values into a hex array.

We first copy all of the variable parts over to a new text window.

Copied

Next, in this case using Notepad++, select everything from the left single quote on the second variable to the right single quote on the first variable, then hit the find button.

Find

Next choose the replace tab.

ReplaceStr

Be sure the “Replace with” box is empty, then select replace all and we get this.

ReplaceResult

Now take everything in between the 2 single quotes and run it thru your favorite Xor tool.

XorTool

Take the result of the Xor, drop it back into a new Notepad++ window and use the JavaScript format.

Format

And there we go.

Script9 

I hope it helps.


Unknown Exploit Kit

When I first saw a screenshot of this one, that’s what it was: unknown.

Here is the twitter message that Jérôme Segura from Malwarebytes posted.

TwitterConv

In response, William Metcalf @node5 replied that it was Sundown/Xer and that they steal from everyone for their Exploit Kit.

While researching other reports of Sundown, the code and the domains used in this version do not appear to be the same as those reported in several other posts on the Sundown EK. Is this a new version of Sundown? I don’t know, this is my first real look at it.

This version appears to be trying to look like Angler EK, it uses 5 sections that get decoded and each section has 1 or more levels to decode to get down to the final decoded code.

On June 15th 2016 Brad Duncan @malware_traffic posted his captured  run  here

http://www.malware-traffic-analysis.net/2016/06/15/index.html

that Jérôme Segura Mentioned in his Post.

If we look at what was posted for the first redirect from the infected site we see this.

(Screenshot borrowed from malware-traffic-analysis.net)

2016-06-15-Sundown-EK-image-01

Here they are sending the person to 5 different URLs, but a closer look tells you they are actually hosted on the same IP.

SundownIps

Here is a closer look at what Jérôme Segura posted.

InitialmalwareScreenshot

In the traffic from the pcap from malware-traffic-analysis.net we can see there were 2 landing pages. At first they appeared to be exactly the same, but a binary compare shows that they are different.

I next went thru and decoded every section as far as it would decode.

Viewing the decoded sections, “Most” of the sections contained code that would not run on its best day. In one of the decoded sections I even found a known Angler EK decryption key and some of the code from an Angler EK section. Reviewing the code in this section for what would have been the exploit section in Angler there were 2 separate decoding functions with the same name for what should decode some of the encoded strings.

Although there appears to be some advanced functionality in this, um, Kit, it does not appear to be properly implemented at the moment. Just because it is ugly, don’t totally dismiss this thing yet.

So if this thing is basically broken, how is it calling out to download the flash and Silverlight?

They are using embedded links in the code in several places.

In the first section we see this.

Section-1

The top part of this is a hex encoded base 64 alphabet, followed by the base 64 decoding function, and finally the string to decode.

Once decoded we see this.

Section-1.var1

If you look closely, this is also a base 64 string, but the string was reversed.

ReverseString

In the screenshot above we can see the eval that kicks off this part: it takes the reversed base 64 string, uses this reverse function to reverse it, then finally base 64 decodes it, and we end up with this.
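Outside the browser, the same two steps can be sketched with the Python standard library. This assumes the hex encoded alphabet decodes to the standard base 64 alphabet; the sample strings are invented, and the function names are hypothetical.

```python
import base64

def decode_hex_string(hex_text):
    """Turn a run of hex digits (such as the encoded alphabet) into text."""
    return bytes.fromhex(hex_text).decode("ascii")

def decode_reversed_base64(reversed_text):
    """Undo the string reversal, then base 64 decode the result."""
    return base64.b64decode(reversed_text[::-1])

# Hex for the first letters of the standard base 64 alphabet.
print(decode_hex_string("4142434445"))

# Invented payload, encoded and reversed the way the landing page does it.
sample = base64.b64encode(b"document.write('x')").decode("ascii")[::-1]
print(sample)
print(decode_reversed_base64(sample))
```

If the decoded alphabet turns out to be shuffled rather than standard, the input would need translating back to the standard alphabet before calling `b64decode`.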

Packed

If you take a close look there are a few “p,a,c,k,e,d” sections in it. When you try to decode them, the unpacker just returns what you input.

Moving down on the same code from the first decoded section we see this.

FlashBin

If we drop the hex code for the “FlashVars value” into a hex editor we see this.

Section-1-Flashvar

Here we can see there is no call for a “.exe” file but a link to

“http://trasergsgfsdx[.]xyz/z.php?id=8” which shows up in packet 602.
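Instead of a hex editor, the same lookup can be done with a short script. The sketch below assumes the value is plain hex pairs (no “0x” prefixes) and simply searches the decoded bytes for printable http(s) URLs. The blob is a made-up example using example.com, not the real FlashVars value.

```python
import re

URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def urls_in_hex(hex_text: str) -> list[str]:
    """Decode a hex string and return any printable http(s) URLs found in the bytes."""
    cleaned = re.sub(r"[^0-9a-fA-F]", "", hex_text)  # drop spaces, commas, newlines
    data = bytes.fromhex(cleaned)
    return [m.decode("ascii") for m in URL_PATTERN.findall(data)]


# Hypothetical blob: a few non-text bytes, an embedded URL, then null padding.
blob = b"\x90\x90\xeb\x10" + b"http://example.com/z.php?id=8" + b"\x00\x00"
print(urls_in_hex(blob.hex()))
```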

If we look at this section for the second landing page we see this.

ShellCodeBin

It has “z.exe” and a different site name and id number.

If we move on down on the first landing page we see this.

VBScript

We have a call at the top using the same URL that was found in the hex above.

If we clean up the top of this and zoom in we see this.

VB-Top

If we look at the bottom we see this.

VB-Bottom

The section in between is percent encoded hex, so let’s decode that and see what we get.

Section-4
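For reference, percent decoding does not need a browser. A minimal Python sketch is below. Note that it only handles %XX escapes, so the %uXXXX form that JavaScript's unescape also accepts would need extra handling. The sample input is invented for the example.

```python
from urllib.parse import unquote_to_bytes


def percent_decode(text: str) -> str:
    """Decode %XX escapes; bytes that are not valid UTF-8 become replacement marks."""
    return unquote_to_bytes(text).decode("utf-8", errors="replace")


# Hypothetical sample, not taken from the landing page.
print(percent_decode("%76%61%72%20%61%3D%31%3B"))
```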

Above they are using an array of variables and an index number to build the code.

Also notice the number on the special folder it is calling for.

It also has the appearance of having the option to save a file as a dll or an exe.

Another interesting thing: if we scroll down to the bottom of the page we see this.

Section-4-bottom

Looking at this bottom function it looked familiar.

DecodeLikeAnglerEK

The left is from this exploit kit, the right is from Angler EK. The differences are the variable names, and that the left is using “&” where the right is using “%” for some items.

Also, the left is pushing to a char array and then reassembling the string, where the right just goes straight to a string.

When trying to decompile the .exe artifact, the decompiler said it may be “packed”, but looking at the file in a hex editor it appears to be corrupted or encrypted in certain sections rather than packed.

Here is a quick shot of the Silverlight after decompiling it.

SilverExploit

Even after de-obfuscating this it is still large and a lot to navigate thru.

From a static analysis point of view it is difficult to tell what would and would not work in this.

It would need dynamic analysis.

Before I finish with this first pcap, let’s take a look at the “Whois” for this site.

WhoIs2

Notice the dates at the top. This URL was not up long before it was found.

WhoIs1

Hmm.

And the Scumware report.

ScumwareSearch2

 

Pcap 2

Now to the second pcap. There were some changes from the first one.

On June 20th 2016 Brad Duncan posted another pcap with an exploit kit dump.

http://www.malware-traffic-analysis.net/2016/06/20/index.html

Let’s start this one by looking at the traffic and using filters for the streams to see what goes with what. The first filter is just “http.request or http.response”.

Traffic-1

Here we have multiple GETs, and the first 2 were no doubt from the original infected page.

In this view we see at least 2 different landing pages, 1 flash not found, 1 flash that was found, and 2 Silverlight files downloaded.

If we set a filter of “tcp.stream eq 0 and (http.request or http.response)” we see which packets go with this stream.

Stream0

A filter of “tcp.stream eq 1 and (http.request or http.response)”

Stream1

A filter of “tcp.stream eq 3 and (http.request or http.response)” (stream two was empty)

Stream3

And a filter of “tcp.stream eq 4 and (http.request or http.response)”

Stream4

What does this do for us? Since all of these point to the same IP, filtering by stream makes it a little easier to see what is related.
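If you do this often, the per-stream filters can be generated rather than typed. A tiny Python sketch follows; the helper name is made up, and the filter text is the same one used above.

```python
def stream_filter(stream: int) -> str:
    """Build the Wireshark display filter used above for one TCP stream."""
    return f"tcp.stream eq {stream} and (http.request or http.response)"


for n in (0, 1, 3, 4):  # stream 2 was empty in this pcap
    print(stream_filter(n))
```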

If we look at the first one it tried to get a flash file but could not find the file.

The second one has a landing page and tried to get a Silverlight file but the traffic appears as though it did not get it.

The third one also contains a landing page and this time it did download a Silverlight application.

In the fourth one, instead of a “normal” landing page we get something else.

But let’s look at the third stream first.

If we base 64 decode the first section we see this.

146-sect-1

We can see it wants “carolinamovie.swf”, but the traffic shows that it was not returned, while the Silverlight file was returned in this stream. So let’s look at it.

146-sect-2

Here we can (almost) see that it is looking for the Silverlight. And here is the hex in the hex editor.

shellforsilver-b

A binary compare shows that the Silverlight files from the first pcap and this one are the same.

Now on to stream number 4.

When we look at packet 174 in stream number 4 we don’t see a “normal” landing page but this.

Packet-174

And at the bottom we see this.

Packet-174-Bottom

So what is this? Perhaps this will give it away.

Packet-174-Mid

If we do a string search on the internet this code appears to be borrowed from this site.

http://fossil.kd2.org/garradin/vinfo/92c8bdfeaa2c5f37ddf25197d54baed89dd398ac?sbs=0

With a description of “ A lightweight Javascript Libray for OpenSSL compatible AES CBC encryption.”

So they appear to be using OpenSSL compatible AES CBC to decrypt the bottom 2 sections.

At the time of this writing I didn’t have time to build a decoder.
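For anyone who wants to try, a decoder for OpenSSL compatible AES CBC data usually starts with two steps: splitting off the “Salted__” header, then deriving the key and IV from a passphrase. The Python below is a minimal sketch of just those two steps, under the assumption that the page follows the classic OpenSSL layout with MD5-based derivation and a 256-bit key. None of that is confirmed for this kit. The AES decryption itself needs a third-party library and is not shown.

```python
import base64
import hashlib

MAGIC = b"Salted__"


def split_openssl_blob(b64_text: str) -> tuple[bytes, bytes]:
    """Return (salt, ciphertext) from a base64 'Salted__' blob."""
    raw = base64.b64decode(b64_text)
    if raw[:8] != MAGIC:
        raise ValueError("missing Salted__ header")
    return raw[8:16], raw[16:]


def evp_bytes_to_key(password: bytes, salt: bytes,
                     key_len: int = 32, iv_len: int = 16) -> tuple[bytes, bytes]:
    """MD5-based key/IV derivation in the style of OpenSSL's EVP_BytesToKey (count=1)."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each round hashes the previous digest plus the password and salt.
        block = hashlib.md5(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


# Hypothetical blob and passphrase, only to show the call sequence.
demo = base64.b64encode(MAGIC + b"12345678" + b"\x00" * 16).decode("ascii")
salt, ciphertext = split_openssl_blob(demo)
key, iv = evp_bytes_to_key(b"passphrase", salt)
print(len(salt), len(ciphertext), len(key), len(iv))
```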

Since it does appear to download a Flash file before we run out of traffic I can only assume that it does work.

One last surprise before I close.

name-b

Good luck with that borrowed name.

Conclusion:

In conclusion, I still can’t decide if the person(s) who wrote this don’t have a clue, are just testing out certain parts of the exploit kit, are trying to see what they can get out of researchers, or if this is possibly even some college project. There is just too much non-working code in here for it to be a streamlined exploit kit.

If you made it this far thanks for sticking with me.

That’s it for now.


Decoding Angler Exploit Kit

After my last post, Some data on Angler Exploit Kit, I received a request to write up a tutorial on decoding the Angler EK. The question is where to start?

Since they seem to be on vacation or are in the middle of a new version development I’ve decided to write this up. The basics presented here could be used for any exploit kit.

In order to even start reversing this or any exploit kit you have to have some basic understanding of JavaScript / HTML and how they relate to each other. I always hated trying to write a web page, but I find myself going back to relearn how things work in order to find out how the malware is working.

Most of these exploit kits and redirect pages use some form of obfuscation for the decoding functions, whether it is just adding a lot of white space, converting to escaped characters, or even adding in random comments like “/*  this is a comment*/” in order to mess with the beautify tools used to get the code back to some readable form.
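A first cleanup pass for this kind of noise can be sketched in Python as below. It is deliberately naive: it removes every /* ... */ block and collapses white space without understanding JavaScript string literals, so check the result against the original before trusting it. The sample input is made up.

```python
import re

BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_noise(script: str) -> str:
    """Remove /* ... */ comments and collapse runs of white space to one space."""
    without_comments = BLOCK_COMMENT.sub("", script)
    return re.sub(r"\s+", " ", without_comments).strip()


# Hypothetical obfuscated line.
print(strip_noise("var a  =  1; /*  this is a comment*/   var b = /* x */ 2;"))
```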

In this post we will be working with the Pcap from http://www.malware-traffic-analysis.net/2016/06/01/index2.html and using the pcap 2016-06-01-pseudoDarkleech-Angler-EK-after-hideandseek.leadconcept[.]net .

Before we can start decoding the landing page we must first find and extract the landing page using Wireshark.

The fastest way I have found to identify the landing page is by using the filter “http.response”. The latest Angler EK pages were always just before a “404 Not found” in the info section. When viewing the traffic and the code you find that there are 2 identical requests, and the first seems to almost always fail with the 404. On occasion you will find that you have 2 Flash requests instead of just 1 and no 404.

WS1PNG

If you see a large gap between the packet/frame number of the 404 and the previous one, then the landing page will most likely not be shown. It has somehow gotten corrupted and is possibly missing some bytes, like some of the ones I encountered in the last post.
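The rule of thumb above (take the response just before the first 404, unless the frame gap is large) can be written down as a small Python sketch. The record type, the sample rows and the gap threshold of 50 frames are all assumptions for illustration. The post does not give a specific number for a “large” gap.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    frame: int    # packet/frame number from Wireshark
    status: int   # HTTP status code


def landing_page_candidate(responses: list[Response],
                           max_gap: int = 50) -> Optional[Response]:
    """Return the response just before the first 404, if the frame gap is small."""
    for previous, current in zip(responses, responses[1:]):
        if current.status == 404:
            if current.frame - previous.frame > max_gap:
                return None  # large gap: the page is probably missing or corrupted
            return previous
    return None


# Hypothetical rows, as they might be read off the "http.response" view.
rows = [Response(12, 200), Response(150, 200), Response(163, 404), Response(420, 200)]
print(landing_page_candidate(rows))
```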

After locating and verifying that you do have the Angler EK landing page, you need to extract it. Personally I always extract it as a text file so when I’m tired I don’t accidentally run it. We need to go to File -> Export Objects -> HTTP

WS2

Then we see this.

WS3

Next we click the Save As button and pick the location where we want to save it (I won’t show mine here : ) ). Another personal preference of mine is to save it with the packet number. Here I would save it as “Packet-150.txt” in whatever folder I chose. This way, if I need to find it again I can just jump to that packet, or in the case of multiple landing pages per pcap I can be sure I have the correct packet.

Also take note of the name it suggests for the save. I have seen it suggest several things: “\”, or a name with a php, png, jpg or even html extension. They try to hide the landing page behind several extensions.

Once we extract the landing page and open it up in our favorite text editor we see this. My personal choice at the moment is Notepad++.

Text1

and this

Code1

If we keep on scrolling down we then see this.

Code2

By now you have lots of questions but the first is always, what the heck is this stuff ?

In this last screen shot the code is in script tags, so it must be used somewhere, right ?

So now what? Let’s start by extracting the entire script from the center of the page and get it into some sort of readable state. When you are first de-obfuscating the script you will want to work with a copy so you can compare if something went wrong in the process.

Code3

Here we see that even after using a JavaScript format function it is still hard to read.

I wrote a tool to fix this but it still has a few bugs so it is just as easy to fix it by hand.

Also notice that at the bottom there are 2 types of comments. So let’s clean this up and see what we end up with.

Now that we have it somewhat de-obfuscated, let’s take a look around and get our bearings as to what the script might be doing and where the starting point is.

FullScript

Hmm what is this ?

Code4

It looks like this concatenates to an “eval”, and look, there is a regular expression below it.

Hmm what is this item “kTPjPVb = ‘QnJNUUxhQW5PWUdWT3RS’;” it is used in a function above, could this be a variable name or a decoding key ?  We will have to follow along with the code to find out. But what is the “biwi” in the function above ? We will have to trace that too.
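One quick check worth doing on a value like this is to see whether it is itself valid base 64. The Python sketch below does that for the string quoted above, which decodes cleanly to a printable 15-character string. Whether the kit actually decodes it before use, or passes it along as-is, still has to be confirmed by following the code.

```python
import base64
import binascii


def try_base64(text: str):
    """Return the decoded bytes if text is strictly valid base64, else None."""
    try:
        return base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return None


print(try_base64("QnJNUUxhQW5PWUdWT3RS"))
```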

I’m not showing the part where you take several different strings / variable names and see where they are used in the code.

If we have a slight understanding of JavaScript we notice that these are nested functions that work together, so where is it actually called from?

If we do a string search for the variable at the top , “uTGlITcQsYrl” then we end up at the bottom of the script section.

Code5

So the last 2 lines of this script are where it actually starts.

When a page is run it starts evaluating from the top down and it will concatenate the variables and get them ready to use.

So now what is that value in the last line ? We do a string search in our extracted script and don’t find it, so we go to the original (copy) of the extracted page and find it here .

Code6

Looking back at the script, it is extracting the inner html of this ID for the first string to decode.

So now what ? We now have enough information that we can now start building our Html/Java Script decoder.

But what needs to go into it? First we need an HTML page with a document.write call to write the output to the page. (borrowed from http://www.w3schools.com/js/js_output.asp and modified slightly)

Code7

We know we need the string from the variable shown above, the key value, the string replace function, and finally the decode functions. The rest of the code is left out. It does the eval on each page / decoded section, and we don’t want to run that code, we just want to decode what is in the sections.

Which is also where those small sections of code we showed earlier come in. They identify each of the remaining “Sections” to decode.

After some trial and error, and a few choice words for the left-off semicolons, we end up with our first decoded section.

Code8

I say trial and error because unless you are very good at Html/Java script it will probably take several tries to get together everything you need for it to run properly. There may be a replacement variable that was not close to the rest of the code that you need to go back and find and then include.

Here we see the start of the code needed.

Code9

In this view we set the variables for the string replacements, the string to get decoded, the regex string replace, the key used for decoding, and finally the variable that calls the function to do the decode.

We next find the end of the decode function by the end braces and the semi colons and copy paste that into our new decoder in between the variables we just inserted and the document.write.

Code10

The “ };;;” tells us it is the end of the function.

Above you can see I added “var result = J6em”. The “J6em” was the variable name used to start the decode process, the one the encoded string and the key were passed to. It also gives me one more place to set a break point.

So we try to run it as is and it fails with a variable of zx not defined.

Code11

We can see plainly that it is defined (below), obfuscated, but defined and in several places in the rest of the script, so what gives ?

Code12

Since we extracted just part of the code, the definition is no longer in the same larger global code block, so the function below cannot use it.

Code13

So first we try and reduce the function so it is readable. That doesn’t work.

We finally realize that the problem is that the function below can not see this variable. So we move it into the function and it works, the code runs all of the way thru and we get the result we wanted.

Now we have a working decoder for this “Type” of encoding.

If you read my last article, you know I found 7 different encodings (string replacements) with 3 different decoding functions in the samples I looked at.

Now that we have a working decoder we can do one of several things, we can copy it for each section and just replace the string to decode and our title at the top of the html or use the same decoder and just replace the string to decode.

Note: There seems to be a string length limit in what you can view in the F-12 debugging tools and display on the page. Some of the strings to decode can be very long.

Now as long as the landing page is using the same encoding type all you have to do is replace the string to decode and the key used for every new one you encounter. This method bypasses the problem of dynamic analysis where it is checking for User Agent strings and for other running programs.

Now the real problem.

Once you get each “Section” decoded, the decoded section may have more variables that need to be decoded within that “Section”. To see what they are doing you will have to start the process all over again for each different decoding function. That includes the section you see at the top of the screenshot, which I call the “LowerSection”. It used to always be found at the bottom of the decoding script, but now you will sometimes find it towards the top, as seen here.

The decode function for the “LowerSection” variables is not found until you decode section 3, and even then not all of the strings get decoded before they are used in other sections.

In this example I did not reduce all of the variables to their full string values. If this was the first time you worked with this code you may want to do that, then comment out the others not needed anymore or just use a comment at the end of the final variable name that will be used after they are combined as I did for the “String[‘fromCharCode’] “.

Once you understand how the code works you could always use your favorite programming language and create the decoder that way, like I did. (It is much easier for high volume decoding.)

I would suggest that the very first time you run one of these you make sure you are running it in a VM, just in case there are some surprises laid in for the analyst.

If you are not sure what a function does or what the value is supposed to be, launch it in the F-12 tools (in a VM)  and step thru it to see it as it changes.

String Length Limits

I thought I was done writing this post until I went back to verify that the code would work for Section 4, knowing from experience that it is a very long string.

As I had mentioned above there is a string length limit on what a page will display.

If we check the length of the string for section 4 before it gets decoded we find that it is 66,220 characters long, which is no problem for the input.

CharCount1

When we decode it using a VB.net version of this decoder we see that the decoded string is 40,907 characters long.

CharCount2

But if we use the Html/Java script version we just made.

Section-4-Test

Notice the end of the string appears to be truncated (above).

CharCount3

As we see in the string length test it is only 40,546 characters long, a difference of 361 characters. So it appears that this is roughly where the string length limit is for outputting to a page.

So how do we work around this limit ?

We change our code and  split the string. If we change the top to this.

NewDecode1

And the end to this.

NewDecode2 

So what will this do for us? Since we have a string length limit, I just use 40,000 as the number to do the split at, and then we end up with this.

NewDecode3
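The split itself is nothing special. Outside the browser it could be sketched like this in Python, using a stand-in string of the same length as the decoded section 4 (the helper name and the default size are just for this example):

```python
def split_for_display(text: str, size: int = 40_000) -> list[str]:
    """Cut text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


decoded = "x" * 40_907  # stand-in with the same length as the decoded section 4
parts = split_for_display(decoded)
print([len(p) for p in parts])
```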

I know, you have seen that it is still truncated in this version as well, but here is where the workaround comes in.

If we launch this again in the F-12 tools setting a break point on “var result” then we can step a few more times until the values are filled out.

NewCode5

Extract the values and clean them up.

NewCode6

NewCode7

Clean off the Var name and double quotes on each end plus the word “String” on the end, or just copy paste everything in-between the double quotes .

Join them together and try to beautify them.

NewCode9

But we still have a problem.

CharCount4

Here we see that our character count is now 40,964 instead of the expected 40,907. That is more, so what is going on?

If we take a close look at the string output to the Html page and what we extracted from the variable in the debugger we see the problem.

NewCode10

NewCode11

Do you see it yet ?

There is an escape character “\” before every single and double quote and even “\” is escaped.

The workaround for this is to copy the first 40,000 characters from the output window, so you don’t have to deal with the changes added by the debugger, then get the remaining characters from the debugger value and clean up the “\” by hand, or write a script / program to do it for you, like I did here for both.

ReplaceEscapes
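The screenshot shows my own tool. As a separate illustration, the same cleanup can be sketched in Python with one regular expression that drops the backslash in front of a single quote, a double quote or another backslash. It assumes those three are the only escapes the debugger added. The sample string is made up.

```python
import re

# A backslash followed by a backslash, single quote or double quote.
DEBUGGER_ESCAPE = re.compile(r"\\([\\'\"])")


def strip_debugger_escapes(text: str) -> str:
    """Drop the extra backslash before each escaped quote or backslash."""
    return DEBUGGER_ESCAPE.sub(r"\1", text)


# Hypothetical value as copied from the debugger (raw string keeps the backslashes).
escaped = r'var s = \"a\\b\" + \'c\';'
cleaned = strip_debugger_escapes(escaped)
print(len(escaped), len(cleaned))
```

Comparing the lengths before and after is a cheap sanity check against the expected character count.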

Now once we clean and join these we end up with.

CharCount5

Once we get rid of the extras we end up with the expected 40,907 characters. It will now also beautify.

FinalTop

FinalBottom

We can now see the code to create a decoder for this encoded section.

Once we decode this section (var b) we see this. (Top)

VarBTop

(Bottom)

VarB-Bottom

My point in showing this section is that there seems to be some more code missing from the bottom here. Not that I want to help the writers of this debug it, but if we look closer at the end we see this.

Varb-bottom-2

The output length before beautifying it is 29,976 characters, well within the string length limit.

So it appears that either they truncated the script before encoding it, or picked up an extra character from another part of the code this came from. Without seeing the original it would be difficult to tell for sure. Checking a few other samples, I find this same “char” at the end of some and 1 different one at the end of others.

This was not the only potential mistake I found while going thru the code.

As has been said before, even the malware authors can run into problems.

Well, that’s it for this one. I hope I answered any questions and didn’t create a lot more.
