Pulling apart Neutrino EK

I’ve spent the last few days going from top to bottom through 3 different Neutrino EK infections.

The one I will show here is from Broad Analysis @BroadAnalysis  from their site http://www.broadanalysis.com/2016/08/08/new-c2-neutrino-exploit-kit-via-pseudodarkleech-hopto-org-gate-delivers-crypmic-ransomware/

You can download the pcap of the traffic from there.

Anyone who has read any of my posts before knows that I prefer to do static analysis whenever possible. Facing a new piece of malware for the first time is like finding a piece of cipher text that needs decrypting.

Neutrino EK is about as complicated as the Angler EK, except that it hides what it is doing behind many flash files rather than just scripting and encoding/encryption.

There are many great articles out there that have already covered most of what I could write, so I will concentrate mostly on understanding the code and how to follow it. At the end of this post I will link a few good articles that have helped me put the pieces of the puzzle together.

By doing static analysis and creating most of my own tools, I can change and adapt them myself for whatever changes the malware authors may make.

The Neutrino EK is comprised of what I call 4 levels, 5 if you count the original flash file.

We will first need the JPEXS Free Flash Decompiler, which for me worked flawlessly in a VMware virtual machine running a never-before-used copy of 32-bit Windows XP Pro that I have been saving for a rainy day since the SP2 version was still new. On 64-bit Windows 7 it kept crashing.

Once we extract/decompile the flash file we have a folder structure like this.


If we export everything and then remove what is empty, this is what is left.

The first thing we will want to look at here is the scripts folder. The “binaryData” folder holds the binary files we will have to work with later.


Here we see we have two ActionScript files outside of the other 2 folders.


I’m no ActionScript guru, but this appears to be the entry point into this rabbit hole. It is creating a child, or running the object referred to here, which is the other file outside of the other 2 folders. So what is in that?


Here we see that this is importing several items to use in the script. We have 3 that start with “flash”, so those are the native flash functions that are getting imported, but what are the ones starting with the “r”? To answer that question we go back to the symbols file and look at it.


If we look at this and then look in the binary resource folder, the names after the “r” match those files, so we figure the “r” is for resource. The last one on the list is the first file we looked at.


In the screenshot above we see in the red there is a variable but if we trace it down it is not used anywhere. They add a lot of code that does not do anything in order to help hide the small parts that do. In the blue we see more variables that are declared and set but never get used.

In the green though, it is setting “e” to the result of the function “m”, which is the RC4 decoding routine, passing in 2 of the embedded binary resource files. After decryption, the result goes through the “to string” method and is then split on the “%” character into a “String Array” to be used later for replacements.

As you can see at the bottom, this.e[3] would equal “stage”, counting from index 0.

(I missed separating the two at that one point in the screenshot above.)
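To make the decrypt-then-split step concrete, here is a minimal Python sketch of textbook RC4 followed by the split on “%”. The function names are my own, and which resource is the key and which is the data is an assumption; the kit’s “m” function may differ in details.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling, then XOR the data with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def decode_string_table(key: bytes, blob: bytes) -> list[str]:
    # Decrypt, turn the bytes into text, then split on "%" like the script does.
    # Assumption: "key" and "blob" are the two embedded binary resources.
    return rc4(key, blob).decode("utf-8", errors="replace").split("%")
```

RC4 is symmetric, so the same function both encrypts and decrypts, which makes it easy to check a tool against a made-up string table before pointing it at the real resources.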


Here in the red box is yet another variable that is declared but not used.

In the blue box we see where it is assembling the byte arrays from the embedded resources. The result is next run through the RC4 decryption with the key, and it returns another flash file as a byte array.
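A quick sanity check after a decryption like this is to look at the first three bytes of the output. This small Python sketch (the function name is mine) tests for the standard SWF signatures, which tells you whether the key and the order of the assembled parts were right.

```python
# Standard SWF file signatures and what they mean.
SWF_SIGNATURES = {
    b"FWS": "uncompressed",
    b"CWS": "zlib-compressed",
    b"ZWS": "LZMA-compressed",
}


def swf_kind(data: bytes):
    """Return the SWF flavour for decrypted bytes, or None if it is not a SWF."""
    return SWF_SIGNATURES.get(data[:3])
```

If this returns None, the usual suspects are the wrong key or the resources joined in the wrong order.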

It then passes on to the “s” function  where it pulls another byte array from the binary resources and passes it to the function “et” in the launched second level flash file.

What does this last part do? It is the most important part of the whole operation: once it gets decrypted, this is the configuration file.

Fast forward to extracting the second level and locating the function “et”.


This takes the last byte array from the first level as param1 and sets “s=s” to param1. It will then either call the function “s9s” if it has an event listener, or add the result of that function call.


The main part we are concerned with in this one is that it sets the value “s0s” to a string and does some checks. If they don’t pass it will bail and not decrypt the config file.


We can see that the name of this function gets called from the last one if it is not found.

And what does the config file look like for this one after getting decrypted?


Or using a JSON Viewer it looks like this.


Here we can see that they can call out to several locations depending on what the result of the fingerprinting of the system is. It can give you nothing or the malware payload.


This is what my binary resource folder looks like after doing the decryption and decompressing.

The 3 new flash files only get RC4 decrypted, and again depending on what the fingerprint is they get served up. The 3 html files and the JavaScript file each get RC4 decrypted and then have to be decompressed. This is a case where I made my own tools to work with these files in order to get the plain text.
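For the decompression step, here is a hedged Python sketch. It assumes the data is zlib or raw deflate, which are the formats Flash’s ByteArray can uncompress natively; I have not confirmed here which one the kit actually uses, so the helper simply tries both. The input is expected to be the bytes that already came out of the RC4 decryption.

```python
import zlib


def inflate(data: bytes) -> bytes:
    """Try zlib-wrapped data first, then fall back to raw deflate."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        # A negative wbits value tells zlib there is no header or checksum.
        return zlib.decompress(data, -zlib.MAX_WBITS)
```

Writing the result straight to a file as bytes, rather than copying text out of a hex editor, avoids picking up stray escape characters.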

The main fingerprint file to the best of my knowledge is the JS file. The only problem is it looks like this.


In order to decode it we have to take this function and the var “v” and decode this into an array of strings, which is what was shown by @BroadAnalysis in a Twitter post earlier today.

The very first time I worked with this I did all 70+ var replacements by hand. I’ve created a new tool to help with that now.


Here is what it looks like after the replacements. This code may not run but at least you get an idea of what is going on.

At this point we have a pretty good idea of how to decode this for each new level that is presented, but what if we wanted to search around to see what connects to what?

I was having a problem where a tool I created some time back to look for strings in files was not finding the values as displayed by Notepad++.


As you can see here they are using some strange characters for the names of the functions and classes. If you try to copy and paste one of these into anything else and search for that string, it does not find it.

I finally opened one of these up in a hex editor, paid closer attention to the variable names, and I saw this.


As you can see that looks different, so what is up with that? As it turns out, the answer came from the way I have been extracting the html pages. I had discovered on the first file I ever worked with that you cannot just open the decoded/decompressed byte array with a hex editor and copy the string part to a text file, because it can leave extra escape chars in it. That gets really confusing and is a pain to clean.

This file above is actually one of the Action Script files that the flash decompiler extracted.

So based on the idea from the html files, I used another tool I built a while back that converts hex bytes to a string using any encoding supported on the system. Normally that was UTF-8, but after discovering this problem I found that UTF-7 would display what was in the hex file correctly. So I opened the ActionScript file in a hex editor and copied the bytes to this program to be converted using UTF-7.
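The same kind of conversion can be sketched in a few lines of Python. This is not my tool, just a minimal stand-in with a made-up function name, showing how the same bytes read differently depending on the encoding you pick.

```python
def hex_to_text(hex_bytes: str, encoding: str) -> str:
    """Convert hex copied from a hex editor into text using the given encoding."""
    cleaned = "".join(hex_bytes.split())  # drop spaces and line breaks
    return bytes.fromhex(cleaned).decode(encoding, errors="replace")


sample = "2b 41 4f 6b 2d"
as_utf8 = hex_to_text(sample, "utf-8")  # the raw characters
as_utf7 = hex_to_text(sample, "utf-7")  # the character they encode in UTF-7
```

Trying two or three encodings side by side on a short name is usually enough to see which one gives a string that your search tool can actually match.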



Now I can copy paste those names into my search tool and can find them anywhere in the sub folders to see what connects to what.

Before we move on to the level 3 code, let’s take a quick look at the script to decode a binary resource file.


As you can see there is a lot going on here to just load this iframe.

This is already getting long, so let’s move on to the last thing I want to show you, which I have not seen mentioned before.

References to “MAC”


From this it appears to be checking whether the system is a “MAC”, and there is one more flash file for a level 4 that can be decrypted. In one other sample I had worked with I also found references to “MAC” in the extracted files, so I’m not sure whether this will work on a “MAC” or not. I’ve never used one and have no way to test it.

As we have seen here, it can get quite confusing and difficult to travel through these extracted files to see what does what.

After you learn the tricks and have the tools you can completely decode one of these in a matter of a few hours or less.

The analysis of all of the extracted code could take considerably longer though.

Well that’s it for this one. If you made it this far, thanks for reading.



This helped with the last piece of the puzzle, the config file.


And this one.




Talks about an exploit added after someone published a POC


I’m sure there are plenty more good ones out there .

Let me know and I’ll try and add them.


My first deep look at KRYPTOS K4

My first exposure to KRYPTOS was most likely when I saw it used in the TV series Alias.

I most likely looked it up, got an idea of what it was about, then forgot about it for the next several years.

A few years ago I started working on the ciphers for Ghost in the Wires, where I have still only completed the ciphers for chapters 1 to 35, leaving the remaining 3 to finish. I’m still missing something about chapter 36. (Any clues?)

During that process I created several tools, one of them being a Quagmire 3 cipher tool, and while searching for possible solutions for the Ghost in the Wires ciphers I kept running across references to the Kryptos ciphers. Dropping the first two ciphers into the Quagmire 3 tool showed that they could be solved using that tool and the known keys.

If we look at Quagmire 3 it uses two keys and an indicator letter. If we test it with “A” we get this.


If we keep scrolling down the indicator letters, when we get to “K” we see this.


Or we can cheat and test all of them at the same time.


The question is, if we have some cipher text, plain text and the main key (KRYPTOS) can we figure out the Indicator key?  And of course the answer is yes. 


From the repeating pattern here we can guess what the keyword is.

In the process of creating this tool I discovered that the indicator letter was irrelevant to finding the indicator key. Given the way that Quagmire 3 works we can make a list of all 26 alphabets and compare if the index positions of the plain and the cipher text agree. Then the first letter of that alphabet found is the current letter for the Indicator key.
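The search described above can be sketched in Python like this. It is a minimal illustration and not my actual tool: the function names are made up, it assumes the Kryptos-style layout where the plain text is read along the KRYPTOS-keyed alphabet and each row is a rotation of that same alphabet, and the sample strings are the publicly known start of K1.

```python
import string


def keyed_alphabet(keyword: str) -> str:
    """Keyword letters first (no repeats), then the rest of A-Z."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in seen:
            seen.append(ch)
    return "".join(seen)


def recover_indicator_key(plain: str, cipher: str, keyword: str = "KRYPTOS") -> str:
    base = keyed_alphabet(keyword)
    # All 26 cipher alphabets: every rotation of the keyed alphabet.
    rotations = [base[i:] + base[:i] for i in range(26)]
    key = []
    for p, c in zip(plain.upper(), cipher.upper()):
        col = base.index(p)
        for row in rotations:
            # The alphabet whose letter in the plain text column is the
            # cipher letter gives us the key letter: its first letter.
            if row[col] == c:
                key.append(row[0])
                break
    return "".join(key)


# Start of K1: known plain text against known cipher text.
k1_key = recover_indicator_key("BETWEENSUB", "EMUFPHZLRF")
```

With enough known plain text the output starts repeating, and the repeat length gives away the keyword.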

This tool also works with K2.

Given this information, let’s try it out on K4.


A few things to notice about this: we only have 1 repeating letter, and since our search is from the middle of the cipher text the letters may need to be rotated forwards or backwards to be used as a key, due to the way the cipher works on key lengths.

Together this implies that this is “Not” a Quagmire 3 cipher like the first 2 were.

So let’s go back to the statistics and see what we can learn from this.

If we do a letter count using http://rumkin.com/tools/cipher/frequency.php we can see that every letter is used, thus eliminating any cipher that does not use all 26 letters. We also see that it gives an index of coincidence score of 0.0361. Anything above 0.0500 could be getting into some form of mono-alphabetic substitution cipher like an Atbash cipher, or plain text.
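For anyone who wants to check the number without the web tool, the index of coincidence is easy to compute. This is a short Python sketch of the standard formula, the sum of n(n-1) over all letter counts divided by N(N-1); the function name is mine. For reference, typical English text scores around 0.067 and evenly random letters around 0.038, so 0.0361 sits at the random end.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Chance that two letters picked at random from the text are the same."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
```

Pasting the K4 cipher text into this function should land close to the value the web tool reports, allowing for rounding.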

I spent the last 3+ weeks looking at everything from Atbash to Quantum Cryptography, Morse code, Fractionated Morse and binary manipulation. I was even trying to find a way to use a calculation for a “SINE Wave” but couldn’t get the math to work out.

I’ve come to the conclusion that this may be some form of a home-grown polyalphabetic substitution cipher along the lines of a Quagmire, using many alphabets to encode with. But what?

While staring at this I finally started seeing some patterns show up.


What is the first thing you notice here?

We have 3 sets of letters that encode to or from different letters, suggesting that there are at least 3 different alphabets used to encode this.


What is the next thing we notice ?



Looking at this, it suggests that for each of the plain text letters the cipher letter gets bumped up 1 for every space in between each letter. It works for these two, but will it continue forwards and backwards for the rest of the cipher text?

The problem is, if you were creating this cipher it would be fairly easy to count what the cipher text letters would be if you know the spacing on the plain text and what alphabet was used. You would also need to know what the starting point was, or which letter you start encoding with. Here it appears to be a normal A-Z alphabet for these two samples.

In reality having mostly only Cipher text it is a real pain to calculate backwards or to decrypt it. I kept messing up my count, or was not sure if I messed it up or not so I wrote a program to do the counting for me.


Based on the theories from above I created this program to calculate forwards and backwards from our known plain text letter and our cipher text letters, given their position in the cipher text.

We start with the known alphabet index location of the cipher letter and start counting backwards with the given alphabet from the initial position in the cipher text. If the current letter of the alphabet equals the current letter in the cipher text, we replace the lower case “x” in our “Test String” with the upper case letter we are testing for, in this case “B”. We then do the same going forwards.
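The counting described above can be sketched in Python as follows. This is a simplified stand-in for my program, with made-up names, and it assumes the “bump up 1 per position” theory: from the known cipher letter we step through the chosen alphabet one letter per position in both directions and mark every spot where the cipher text agrees.

```python
import string


def mark_progressive_matches(cipher, known_pos, plain_letter,
                             alphabet=string.ascii_uppercase):
    """Mark positions where the same plain letter could have been encoded."""
    # Index in the alphabet of the cipher letter at the known position.
    start = alphabet.index(cipher[known_pos])
    marked = []
    for pos, ch in enumerate(cipher):
        # One step through the alphabet per position, backwards and forwards.
        predicted = alphabet[(start + pos - known_pos) % len(alphabet)]
        marked.append(plain_letter if ch == predicted else "x")
    return "".join(marked)


# Toy cipher text, not K4: "B" is known to sit at position 2.
test_string = mark_progressive_matches("AXCXE", 2, "B")
```

Passing a different alphabet string, such as a KRYPTOS-keyed or reversed alphabet, lets the same routine test the other candidate alphabets.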

After some trial and error and some bug hunting I also came up with some rules to narrow down on what was a potential match.

The first test is to see if the alphabet we used will change the wrong letters in our known text.

The second test: once we get an output using one of the current 4 alphabets, we change the location to the first letter replaced and reset the letters and index position to match where we start this test from.


As we can see here, using a normal A-Z alphabet messed up the third letter, so that would tell me that a different alphabet was used.


Here using the first “L” we can see that it properly replaces the second “L” also.

It works the same for the 2 “C”s in “Clock” also.

After going thru all of the letters in this manner we end up with this wild looking matrix.


This represents the best results by following the rules. I stopped bringing down the letters after I noticed a problem with this.


The “B” and “C” both land at the same position. This suggests to me that there may either be other alphabets used, or that it resets to the original position either at a certain count in the cipher text or after a limited number of plain text letters, which could change my second rule some.


Here we see the output of the “C’s” compared to the selected “B”. The “B” is the result of using a Reverse KRYPTOS alphabet which was the only one not to break the original rules.

After doing some more calculations on potential letters to fill in around our known letters I ended up with this.


You may be able to see in this that I was able to extrapolate the word “Crafty”.

So if this is correct then we would now have “Crafty Berlin Clock”

I have not cracked this yet; it will require further investigation to see if this method holds true with smaller groups to work with.

For example, resetting the alphabets every 24 or 26 letters.

I also will need to try every letter in each position that seems correct.

Perhaps this was the way he encoded it or perhaps I’m just jumping down the rabbit hole.

This could also be like a Hill Cipher that was encoded with a matrix that has no “Inverse” matrix, thus not being able to be decoded.

That’s it for this one, and time to turn back to other tasks for now, like reversing malware.

I will have to try more when I can come up with a way to automate this process.

Thanks for reading if you made it this far.


De-obfuscating Cerber Malspam file

On July 1st 2016 I saw a tweet by Herbie Zimmerman (@HerbieZimmerman) where he had gotten a zip file from some malspam containing an obfuscated JavaScript file.

The infection chain is documented on his site here https://www.herbiez.com/?p=550

He had posted that he had trouble reversing the script, so that is what I will cover here.

As we scroll down the script we first see this.


We see it starts by setting a variable, assigning a value, then tacking more on to the end of it. We next see another variable name and a value assigned to it.

If we look closely at the second variable name, it is the same variable name and value assigned many times.

So we scroll down until we see this.


Tacking more onto our initial value. Scroll down some more.


Hmm a regular expression and new array.


Looks like this is only supposed to work this year.


Hmm, if the date doesn’t match it returns an empty string, else ……


Let’s take a closer look at this.


This looks like it takes our first variable as a hex array and XORs it with decimal 68.

A close look at the second variable name shows it is never used, so let’s clean this up and see what is left.


Now that we have this cleaned up what do we have ?

At the top we see our initial variable as a string of hex chars.

Next our variable is reassigned as a hex array by using the RegEx to split the string into lengths of 2 hex chars.

Then we start the while loop.

Next is a check for the year. The interesting thing about this is that the function used is deprecated for html and returns 116 (the year minus 1900) instead of 2016. If you try and drop this script into an html page it will always be false.

If it succeeds in matching the date then it will run through the array of bytes and XOR them with “decimal” 68 (that tripped me up for a few minutes), then finally output the string, which is the decoded script.

The final 2 lines take our decoded script, drop it into a new function then call it to run the decoded script.
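The decoding half of this can be reproduced outside the browser so nothing gets run. Here is a minimal Python sketch under that idea; the function name is mine, and it simply skips the date check and the final call that runs the decoded script.

```python
import re


def decode_hex_xor(hex_string: str, key: int = 68) -> str:
    """Split the string into 2-char hex pairs and XOR each byte with the key."""
    pairs = re.findall(r"[0-9a-fA-F]{2}", hex_string)
    # The key is decimal 68 (0x44), not hex 68.
    return "".join(chr(int(pair, 16) ^ key) for pair in pairs)
```

Feed it the cleaned-up hex string from the first variable and print the result to a text file rather than evaluating it.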

And here is what it looks like.


The script itself calls out to “http[:]//220.181.87[.]80/ok.jpg” to download the file, uses a random name generator to create a 1 char name from the alphabet “1234567890abcdef”, saves the file to the temp folder, then launches the resulting .exe file.

That’s pretty much it for this one.

Ok, so I rushed over the part about getting the variable values into a hex array.

We first copy all of the variable parts over to a new text window.


Next, in this case using Notepad++, select everything from the left single quote on the second variable to the right single quote on the first variable and then hit the find button.


Next choose the replace tab.


Be sure the “Replace with” is empty, then select replace all and we get.


Now take everything in between the 2 single quotes and run it through your favorite XOR tool.


Take the result of the XOR, drop it into a new Notepad++ window and use the JavaScript format.


And there we go.


I hope it helps.


Unknown Exploit Kit

When I first saw a screenshot of this one that’s what this was, unknown.

Here is the twitter message that Jérôme Segura from Malwarebytes posted.


The response from William Metcalf (@node5) was that it was Sundown/Xer and that they steal from everyone for their Exploit Kit.

While researching other reports of Sundown, the code and the domains used in this version do not appear to be the same as what was reported in several other posts on the Sundown EK. Is this a new version of Sundown? I don’t know, this is my first real look at it.

This version appears to be trying to look like Angler EK. It uses 5 sections that get decoded, and each section has 1 or more levels to decode to get down to the final decoded code.

On June 15th 2016 Brad Duncan @malware_traffic posted his captured  run  here


that Jérôme Segura mentioned in his post.

If we look at what was posted for the first redirect from the infected site we see this.

(Screenshot borrowed from malware-traffic-analysis.net)


Here they are sending the person to 5 different URLs, but a closer look tells you they are actually hosted on the same IP.


Here is a closer look at what Jérôme Segura posted.


In the traffic from the pcap from malware-traffic-analysis.net we can see there were 2 landing pages. At first they appeared to be exactly the same, but a binary compare shows that they are in fact different.
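If you don’t have a binary compare tool handy, comparing hashes of the extracted files does the same job. This is a small Python sketch with made-up function names; the file names you pass in would be whatever you saved the landing pages as.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """SHA-256 of a file's raw bytes, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_content(first, second) -> bool:
    # Two files are byte-for-byte identical exactly when their hashes match
    # (ignoring the practically impossible case of a SHA-256 collision).
    return sha256_of(first) == sha256_of(second)
```

The hashes are also handy to keep in your notes, since they let you recognize the same artifact when it shows up in a later pcap.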

I next went through and decoded every section as far as it would decode.

Viewing the decoded sections, “Most” of the sections contained code that would not run on its best day. In one of the decoded sections I even found a known Angler EK decryption key and some of the code from an Angler EK section. Reviewing the code in this section for what would have been the exploit section in Angler there were 2 separate decoding functions with the same name for what should decode some of the encoded strings.

Although there appears to be some advanced functionality in this, um, Kit, it does not appear to be properly implemented at the moment. Just because it is ugly, don’t totally dismiss this thing yet.

So if this thing is basically broken, how is it calling out to download the Flash and Silverlight?

They are using embedded links in the code in several places.

In the first section we see this.


The top part of this is a hex encoded base 64 alphabet, then the base 64 decoding function, then finally the string to decode.

Once decoded we see this.


If you look closely this is also a base 64 string, but the string was reversed.


In the screenshot above we can see the eval that kicks off this part. It takes the reversed base 64 string, uses this reverse function to reverse it, then finally base 64 decodes it, and we end up with this.
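The reverse-then-decode step is a one-liner in Python. This sketch uses a made-up function name and assumes the standard base 64 alphabet, which is what the hex encoded alphabet at the top of the section decodes to in this sample as far as I can tell.

```python
import base64


def decode_reversed_base64(text: str) -> str:
    """Reverse the string, then base 64 decode it to text."""
    return base64.b64decode(text[::-1]).decode("utf-8", errors="replace")
```

A reversed base 64 string often gives itself away by starting with one or two “=” padding characters instead of ending with them.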


If you take a close look there are a few “p,a,c,k,e,d” sections in it. When you try and decode those, the decoder just returns what you input.

Moving down on the same code from the first decoded section we see this.


If we drop the hex code for the “FlashVars value” into a hex editor we see this.


Here we can see there is no call for a “.exe” file but a link to

“http://trasergsgfsdx[.]xyz/z.php?id=8” which shows up in packet 602.

If we look at this section for the second landing page we see this.


It has “z.exe” and a different site name and id number.

If we move on down on the first landing page we see this.


We have a call at the top using the same URL that was found in the hex above.

If we clean up the top of this and zoom in we see this.


If we look at the bottom we see this.


The section in between is percent encoded hex, so let’s decode that and see what we get.
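Percent encoded hex can be decoded with Python’s standard library. A minimal sketch, with my own function name; latin-1 is used so every byte value maps to exactly one character, whatever the script author put in there.

```python
from urllib.parse import unquote


def decode_percent_hex(text: str) -> str:
    """Turn %xx sequences back into characters, one character per byte."""
    return unquote(text, encoding="latin-1")
```

For example, `decode_percent_hex("%76%61%72%20%61")` gives back `var a`.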


Above they are using an array of variables and an index number to build the code with.

Also notice the number on the special folder it is calling for.

It also has the appearance of having the option to save a file as a dll or an exe.

Another interesting thing: if we scroll down to the bottom of the page we see this.


Looking at this bottom function it looked familiar.


The left is from this exploit kit, the right is from Angler EK. The differences are the variable names, and that the left is using “&” where the right is using “%” for some items.

Also the left is pushing to a char array then reassembling the string where the right just goes to string.

When trying to decompile the .exe artifact the decompiler said it may be “packed”, but looking at the file in a hex editor it appears to be corrupted or encrypted in certain sections rather than packed.

Here is a quick shot of the Silverlight after decompiling it.


Even after de-obfuscating this it is still large and a lot to navigate through.

From a static analysis point it is difficult to tell what would and would not work in this.

It would need dynamic analysis.

Before I finish with this first pcap, let’s take a look at some “Who Is” for this site.


Notice the dates at the top. This URL was not up long before it was found.



And the Scumware report.



Pcap 2

Now to the second Pcap, there were some changes from the first one.

On June 20th 2016 Brad Duncan posted another pcap along with an exploit kit dump.


Let’s start this one by looking at the traffic and using filters for the streams to see what goes with what. The first filter is just “http.request or http.response”.


Here we have multiple gets and the first 2 were no doubt from the original infected page.

In this view we see at least 2 different landing pages, 1 flash not found, 1 flash that was found, and 2 Silverlight files downloaded.

If we set a filter of “tcp.stream eq 0 and (http.request or http.response)” we see which packets go with this stream.


A filter of “tcp.stream eq 1 and (http.request or http.response)”


A filter of “tcp.stream eq 3 and (http.request or http.response)” (stream two was empty)


And a filter of “tcp.stream eq 4 and (http.request or http.response)”


What does this do for us? Since all of these point to the same IP, it can help us see what is related a little easier.

If we look at the first one it tried to get a flash file but could not find the file.

The second one has a landing page and tried to get a Silverlight file but the traffic appears as though it did not get it.

The third one also contains a landing page and this time it did download a Silverlight application.

In the fourth one, instead of a “normal” landing page we get something else.

But let’s look at the third stream first.

If we base 64 decode the first section we see this.


We can see it wants “carolinamovie.swf”, but the traffic shows that it was not returned, while the Silverlight one was in this stream. So let’s look at it.


Here we can (almost) see that it is looking for the Silverlight.  And the hex in the hex editor.


Doing a binary compare, the 2 Silverlight files from the first pcap and this one are the same.

Now on to stream number 4.

When we look at packet 174 in stream number 4 we don’t see a “normal” landing page but this.


And at the bottom we see this.


So what is this? Perhaps this will give it away.


If we do a string search on the internet this code appears to be borrowed from this site.


With a description of “ A lightweight Javascript Libray for OpenSSL compatible AES CBC encryption.”

So they appear to be using OpenSSL compatible AES CBC to decrypt the bottom 2 sections.
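I did not build a decoder, but for anyone who wants to try, here is a hedged Python sketch of the first half of the job. It assumes the classic OpenSSL layout of a “Salted__” header followed by an 8 byte salt, and the MD5 based key derivation OpenSSL traditionally uses for password based AES-256-CBC. Whether the kit follows that layout, and what the password is, are things I have not confirmed. The actual AES decryption would need a crypto library and is left out.

```python
import hashlib


def split_salted(blob: bytes):
    """Split an OpenSSL-style blob into (salt, ciphertext)."""
    if blob[:8] != b"Salted__":
        raise ValueError("missing OpenSSL 'Salted__' header")
    return blob[8:16], blob[16:]


def evp_bytes_to_key(password: bytes, salt: bytes,
                     key_len: int = 32, iv_len: int = 16):
    """Derive key and IV the way OpenSSL's EVP_BytesToKey does with MD5."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each round hashes the previous digest plus password plus salt.
        block = hashlib.md5(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]
```

The sections in the page would most likely need to be base 64 decoded first before being passed to `split_salted`.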

At the time of this writing I didn’t have time to build a decoder.

Since it does appear to download a Flash file before we run out of traffic I can only assume that it does work.

One last surprise before I close.


Good luck with that borrowed name.


In conclusion, I still can’t decide if the person(s) who wrote this don’t have a clue, are just testing out certain parts of the exploit kit, are trying to see what they can get out of researchers, or if this is possibly even some college project. There is just too much non-working code in here for this to be a streamlined Exploit kit.

If you made it this far thanks for sticking with me.

That’s it for now.


Decoding Angler Exploit Kit

After my last post, “Some data on Angler Exploit Kit”, I received a request to write up a tutorial on decoding the Angler EK. The question is, where to start?

Since they seem to be on vacation or are in the middle of a new version development I’ve decided to write this up. The basics presented here could be used for any exploit kit.

In order to even start reversing this or any exploit kit you have to have some basic understanding of JavaScript / Html and how they relate to each other. I always hated trying to write a web page, but I find myself going back to relearn how things work in order to find out how the malware is working.

Most of these exploit kits and redirect pages use some form of obfuscation technique for the decoding functions, whether it is just adding a lot of white space, converting to escaped characters, or even adding in random comments like “/*  this is a comment*/” in order to mess with the beautify tools that get the code back to some type of readable form.
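A first cleanup pass for this kind of noise can be sketched in a few lines of Python. This is a rough illustration with a made-up function name, not the tool I use. It is deliberately naive: it will also eat anything that looks like a comment inside a string literal, so always work on a copy and compare.

```python
import re


def strip_js_noise(source: str) -> str:
    """Remove /* ... */ comments and collapse long runs of spaces and tabs."""
    no_comments = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"[ \t]{2,}", " ", no_comments)
```

After a pass like this the beautify tools tend to do a much better job on what is left.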

In this post we will be working with the Pcap from http://www.malware-traffic-analysis.net/2016/06/01/index2.html and using the pcap 2016-06-01-pseudoDarkleech-Angler-EK-after-hideandseek.leadconcept[.]net .

Before we can start decoding the landing page we must first find and extract the landing page using Wireshark.

The fastest way I have found to identify the landing page is by using the filter “http.response”. The latest Angler EK pages were always just before a “404 Not found” in the info section. When viewing the traffic and the code you find that there are 2 identical requests and the first seems to almost always fail with the 404. On occasion you will find that you have 2 Flash requests instead of just 1 and no 404.


If you see a large gap between the packet/frame of the 404 and the previous one, then the landing page will most likely not be shown; it has somehow gotten corrupted and is possibly missing some bytes, like some of the ones I encountered in the last post.

After locating and verifying that you do have the Angler EK landing page, you need to extract it. Personally I always extract it as a text file so when I’m tired I don’t accidentally run it. We need to go to File –> Export Objects –> HTTP


Then we see this.


Next we click the save as button to pick the location where we want to save it (I won’t show mine here : ) ). Another personal preference of mine is to save it with the packet number, so here I would save it as “Packet-150.txt” in whatever folder I chose. This way if I need to find it again I can just jump to that packet, or in the case of multiple landing pages per pcap I can be sure I have the correct packet.

Also take note of the name it suggests saving as. I have seen it want to save with several things: “\”, a name and php, png, jpg or even html. They try to hide the landing page behind several extensions.

Once we extract the landing page and open it up in our favorite text editor we see this. My personal choice at the moment is Notepad++.


and this


If we keep on scrolling down we then see this.


By now you have lots of questions but the first is always, what the heck is this stuff ?

In this last screen shot the code is in script tags, so it must be used somewhere, right ?

So now what? Let’s start by extracting the entire script from the center of the page and getting it into some sort of readable state. When you are first de-obfuscating the script you will want to work with a copy so you can compare if something went wrong in the process.


Here we see that even after using a JavaScript format function it is still hard to read.

I wrote a tool to fix this but it still has a few bugs so it is just as easy to fix it by hand.

Also notice that at the bottom there are 2 types of comments. So let’s clean this up and see what we end up with.

Now that we have it somewhat de-obfuscated, let’s take a look around and get our bearings as to what the script might be doing and where the starting point is.


Hmm what is this ?


It looks like this concatenates to an “Eval”, and look, there is a RegEx expression below it.

Hmm, what is this item “kTPjPVb = ‘QnJNUUxhQW5PWUdWT3RS’;”? It is used in a function above. Could this be a variable name or a decoding key? We will have to follow along with the code to find out. But what is the “biwi” in the function above? We will have to trace that too.

I’m not showing the part where you take several different strings / variable names and see where they are used in the code.

If we have a slight understanding of JavaScript we notice that these are nested functions that work together, so where is it actually called from?

If we do a string search for the variable at the top, “uTGlITcQsYrl”, then we end up at the bottom of the script section.


So the last 2 lines of this script are where it actually starts.

When a page is run it starts evaluating from the top down and it will concatenate the variables and get them ready to use.

So now what is that value in the last line ? We do a string search in our extracted script and don’t find it, so we go to the original (copy) of the extracted page and find it here .


Looking back at the script, it is extracting the inner html of this ID for the first string to decode.
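In a browser this lookup is a single call, document.getElementById(id).innerHTML. For static work on the saved file, something like the sketch below pulls the same text out with a regular expression. The page text and the id are invented stand-ins, and the pattern assumes a simple element with a plain alphanumeric id and no nested tags of the same name.

```javascript
// Stand-in for the saved landing page; the id and contents are invented.
var pageText = '<html><body>' +
  '<div id="hypotheticalId" style="display:none">ENCODED#STRING</div>' +
  '</body></html>';

// Return the text between the opening and closing tag of the element
// carrying the given id, or null when no such element is found.
function extractById(html, id) {
  var pattern = new RegExp(
    '<(\\w+)[^>]*\\bid=["\']' + id + '["\'][^>]*>([\\s\\S]*?)</\\1>'
  );
  var match = pattern.exec(html);
  return match ? match[2] : null;
}

var encodedSection = extractById(pageText, 'hypotheticalId');
console.log(encodedSection);
```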

So now what? We now have enough information to start building our HTML/JavaScript decoder.

But what needs to go into it? First we need an HTML page with a document.write call to write the output to the page (borrowed from http://www.w3schools.com/js/js_output.asp and modified slightly).


We know we need the string from the variable shown above, the key value, the string replace function, and finally the decode functions, leaving the rest of the code out. The rest of the code does the eval on each decoded section. We don’t want to run that code; we just want to decode what is in the sections.
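The overall shape of such a decoder can be sketched as below. Everything in it is a placeholder: the encoded string, the key, the filler pattern and above all the decode routine, which is a simple hex-and-XOR stand-in and not the kit’s own function. In practice the decode function lifted from the landing page goes in its place.

```javascript
// Placeholder inputs: in practice these are copied from the landing page.
var encodedSection = '23#22';   // string taken from the hidden element
var key = 'K';                  // key value found in the script
var fillerPattern = /#/g;       // the page's string replacement step

// Stand-in decode routine (hex pairs XORed with the key), NOT the kit's.
// Paste the decode function extracted from the landing page here instead.
function standInDecode(text, keyText) {
  var out = '';
  for (var i = 0; i + 1 < text.length; i += 2) {
    var byteValue = parseInt(text.substring(i, i + 2), 16);
    var keyValue = keyText.charCodeAt((i / 2) % keyText.length);
    out += String.fromCharCode(byteValue ^ keyValue);
  }
  return out;
}

var cleaned = encodedSection.replace(fillerPattern, '');
var result = standInDecode(cleaned, key);   // handy spot for a breakpoint

// Display the decoded text; it is never handed to eval.
if (typeof document !== 'undefined') {
  var safe = result.replace(/&/g, '&amp;').replace(/</g, '&lt;');
  document.write('<pre>' + safe + '</pre>');
} else {
  console.log(result);
}
```

Escaping “<” before writing keeps the browser from treating decoded script tags as live markup, so the output is shown rather than run.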

This is also where those small sections of code we showed earlier come in. They identify each of the remaining “Sections” to decode.

After some trial and error and a few choice words for the missing semicolons, we end up with our first decoded section.


I say trial and error because unless you are very good at HTML/JavaScript it will probably take several tries to get together everything you need for it to run properly. There may be a replacement variable that was not close to the rest of the code, which you need to go back and find and then include.

Here we see the start of the code needed.


In this view we set the variables for the string replacements, the string to be decoded, the RegEx string replace, the key used for decoding, and finally the variable that calls the function to do the decode.

We next find the end of the decode function by its closing braces and semicolons, and copy and paste it into our new decoder between the variables we just inserted and the document.write.


The “ };;;” tells us it is the end of the function.

Above you can see I added “var result = J6em”. “J6em” was the variable name used to start the decode process, where the encoded string and the key were passed in. It also gives me one more place to set a breakpoint.

So we try to run it as is, and it fails with an error saying the variable zx is not defined.


We can see plainly that it is defined (below). It is obfuscated, but it is defined, and it is used in several places in the rest of the script, so what gives?


The code no longer sits inside the bigger global code block, since we extracted only part of it, so the function below cannot use the variable.


So first we try to reduce the function so it is readable. That doesn’t work.

We finally realize that the problem is that the function below cannot see this variable. So we move it into the function and it works: the code runs all the way through and we get the result we wanted.
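The scope problem can be reproduced in a few lines. This is a simplified sketch with hypothetical function names, and the value given to zx here is only a guess for illustration; the point is where the variable is declared, not what it holds.

```javascript
// Original layout (simplified): zx is declared in an outer block,
// so the nested function can see it.
function outerBlock() {
  var zx = 'from' + 'CharCode';
  function decodeChar(code) {
    return String[zx](code);
  }
  return decodeChar(65);
}

// Extracted on its own, the inner function has lost sight of zx.
function decodeCharBroken(code) {
  return String[zx](code);      // throws: zx is not defined
}

// Fix: move the declaration into the function that uses it.
function decodeCharFixed(code) {
  var zx = 'from' + 'CharCode';
  return String[zx](code);
}

var errorName = '';
try {
  decodeCharBroken(65);
} catch (e) {
  errorName = e.name;
}
console.log(outerBlock(), errorName, decodeCharFixed(66));
```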

Now we have a working decoder for this “Type” of encoding.

If you read my last article, you know I found 7 different encodings (string replacements) with 3 different decoding functions in the samples I looked at.

Now that we have a working decoder we can do one of two things: copy it for each section, replacing the string to decode and the title at the top of the HTML, or use the same decoder and just replace the string to decode.

Note: There seems to be a string length limit on what you can view in the F12 debugging tools and display on the page. Some of the strings to decode can be very long.

Now as long as the landing page is using the same encoding type all you have to do is replace the string to decode and the key used for every new one you encounter. This method bypasses the problem of dynamic analysis where it is checking for User Agent strings and for other running programs.

Now the real problem.

Once you get each “Section” decoded, the decoded section may contain more variables that need to be decoded. To see what they are doing you will have to start the process all over again for each different decoding function. That includes the section you see at the top of the screenshot, which I call the “LowerSection”. It used to always be found at the bottom of the decoding script, but you will now sometimes find it towards the top, as seen here.

The decode function for the “LowerSection” variables is not found until you decode section 3, and even then not all of the strings get decoded before they are used in other sections.

In this example I did not reduce all of the variables to their full string values. If this is the first time you have worked with this code you may want to do that, and then comment out the ones that are no longer needed. Or just add a comment at the end of the final variable name that will be used after they are combined, as I did for “String[‘fromCharCode’]”.

Once you understand how the code works you could always use your favorite programming language and create the decoder that way, like I did. (It is much easier for high volume decoding.)

I would suggest that the very first time you run one of these you make sure you are running it in a VM, just in case there are some surprises laid in for the analyst.

If you are not sure what a function does or what a value is supposed to be, launch it in the F12 tools (in a VM) and step through it to watch the values change.

String Length Limits

I thought I was done writing this post until I went back to verify that the code would work for Section 4, knowing from experience that it is a very long string.

As I mentioned above, there is a string length limit on what a page will display.

If we check the length of the string for section 4 before it gets decoded we find that it is 66,220 characters long, which is no problem for the input.


When we decode it using a VB.Net version of this decoder we see that the decoded string is 40,907 characters long.


But if we use the HTML/JavaScript version we just made, we get this.


Notice the end of the string appears to be truncated (above).


As we see in the string length test it is only 40,546 characters long, a difference of 361 characters. So roughly 40,546 characters appears to be the string length limit for output to a page.

So how do we work around this limit?

We change our code and split the string. We change the top to this.


And the end to this.


So what will this do for us? Since we have a string length limit, I just use 40,000 as the position to split at, and then we end up with this.
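The split itself takes only a couple of lines. The sketch below uses a dummy string of the same length as the decoded section; splitResult and the variable names are hypothetical, and only the 40,000 split point comes from the text.

```javascript
var SPLIT_AT = 40000;   // split point chosen to stay under the limit

// Cut the decoded text into a first chunk and the remainder.
function splitResult(text, splitAt) {
  return {
    first: text.substring(0, splitAt),
    rest: text.substring(splitAt)
  };
}

// Dummy stand-in for the decoded section (same length as in the post).
var result = 'x'.repeat(40907);
var parts = splitResult(result, SPLIT_AT);

console.log(parts.first.length, parts.rest.length);
```

Each chunk can then be written out or inspected on its own, and joining parts.first and parts.rest gives back the full string.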


I know, you can see that it is still truncated in this version as well, but here is where the workaround comes in.

If we launch this again in the F12 tools and set a breakpoint on “var result”, we can step a few more times until the values are filled in.


Extract the values and clean them up.



Clean off the var name and the double quotes on each end, plus the word “String” on the end, or just copy and paste everything in between the double quotes.

Join them together and try to beautify them.


But we still have a problem.


Here we see that our character count is now 40,964 instead of the expected 40,907. That is more than we expected, so what is going on?

If we take a close look at the string output to the Html page and what we extracted from the variable in the debugger we see the problem.



Do you see it yet?

There is an escape character “\” before every single and double quote, and even “\” itself is escaped.

The workaround is to copy the first 40,000 characters from the output window, so you don’t have to deal with the changes added by the debugger. Then get the remaining characters from the debugger value and clean up the “\” by hand, or write a script or program to do it for you, like I did here for both.
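A small script for the cleanup could look like the sketch below. It assumes the debugger escaped only single quotes, double quotes and backslashes, as described above; the function name and the sample text are made up.

```javascript
// Remove the backslash the debugger put in front of each quote or
// backslash, keeping the character that follows it.
function stripDebuggerEscapes(text) {
  return text.replace(/\\(["'\\])/g, '$1');
}

// String.raw keeps the backslashes exactly as typed, which mimics text
// copied out of the debugger's value window.
var fromDebugger = String.raw`if (a == \"x\") { b = \'C:\\temp\'; }`;
var cleaned = stripDebuggerEscapes(fromDebugger);

console.log(cleaned);
console.log(fromDebugger.length - cleaned.length, 'escape characters removed');
```

Comparing the length before and after is a quick check that the extra characters are accounted for.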


Now once we clean and join these we end up with.


Once we get rid of the extras we end up with the expected 40,907 characters. It will now also beautify.



We can now see the code to create a decoder for this encoded section.

Once we decode this section (var b) we see this (top).




My point in showing this section is that there seems to be some code missing from the bottom. Not that I want to help the writers debug it, but if we look closer at the end we see this.


The output length before beautifying is 29,976 characters, well within the string length limit.

So it appears that either they truncated the script before encoding it, or it picked up an extra character from another part of the code it came from. Without seeing the original it would be difficult to tell for sure. Checking a few other samples, I find this same “char” at the end of some and 1 different one at the end of others.

This was not the only potential mistake I found while going through the code.

As has been said before, even the malware authors can run into problems.

Well, that’s it for this one. I hope I answered any questions and didn’t create a lot more.


Some data on Angler Exploit Kit

Here is some data assembled from multiple Pcaps.

First I would like to thank Brad @malware_traffic for all of the Pcaps and write-ups posted on http://www.malware-traffic-analysis.net/.

I have downloaded all of the Pcap files (almost all, I’m sure I missed a couple) and extracted every readable Angler EK landing page from them.

My final count for the date range of July 10, 2014 21:23:51 to June 02, 2016 12:02:25 was 149 landing pages. These include multiple runs against the same site, and in some cases multiple redirects to different landing pages from the same site in 1 Pcap file. This leads me to think that some sort of automated process is used to infect the sites.

Encoding Types:

I encountered 7 top level encoding types. The types were determined first by the string replacement function and then by the Decoding function.

This is my type 7


Although there were 7 string encoding types, only 3 decoding functions were used after doing the string replacements. The first 2 were different from what I am calling types 3-7.

Types 3-7 use XTEA encryption with the right shift zero fill operator (“>>>”) rather than the plain right shift (“>>”) of the earlier version.
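For reference, here is a minimal sketch of the standard XTEA block routines in JavaScript, showing where “>>>” comes in. This is the textbook algorithm, not code lifted from the landing pages: how the kit packs its strings into 32-bit words, the key it uses and the number of rounds are not covered here, and the key and block values below are arbitrary.

```javascript
var DELTA = 0x9E3779B9;

// Encrypt one block (two 32-bit words) with a key of four 32-bit words.
function xteaEncryptBlock(v, key, rounds) {
  var v0 = v[0] >>> 0, v1 = v[1] >>> 0, sum = 0;
  for (var i = 0; i < rounds; i++) {
    v0 = (v0 + ((((v1 << 4) ^ (v1 >>> 5)) + v1) ^ (sum + key[sum & 3]))) >>> 0;
    sum = (sum + DELTA) >>> 0;
    v1 = (v1 + ((((v0 << 4) ^ (v0 >>> 5)) + v0) ^ (sum + key[(sum >>> 11) & 3]))) >>> 0;
  }
  return [v0, v1];
}

// Decrypt one block by running the same steps in reverse order.
function xteaDecryptBlock(v, key, rounds) {
  var v0 = v[0] >>> 0, v1 = v[1] >>> 0;
  var sum = Math.imul(DELTA, rounds) >>> 0;
  for (var i = 0; i < rounds; i++) {
    v1 = (v1 - ((((v0 << 4) ^ (v0 >>> 5)) + v0) ^ (sum + key[(sum >>> 11) & 3]))) >>> 0;
    sum = (sum - DELTA) >>> 0;
    v0 = (v0 - ((((v1 << 4) ^ (v1 >>> 5)) + v1) ^ (sum + key[sum & 3]))) >>> 0;
  }
  return [v0, v1];
}

// Arbitrary sample key and block, round-tripped with 32 rounds.
var key = [0x00010203, 0x04050607, 0x08090A0B, 0x0C0D0E0F];
var block = [0x41424344, 0x45464748];
var encrypted = xteaEncryptBlock(block, key, 32);
var decrypted = xteaDecryptBlock(encrypted, key, 32);
console.log(encrypted, decrypted);
```

The “>>> 5” is the part that matters: with a plain “>> 5”, a word with its top bit set would pull 1 bits in from the left, and the result would no longer match an unsigned shift in C.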


Here we see the total count for each type in the 149 landing pages that were extracted.


Although you can easily decode all 7 types with HTML/JavaScript, I only decoded all of the type 1 and type 2 pages to the first level. I decoded just some of the types 3-7.

With so few of types 3-7, it makes me wonder whether those are a third party using Angler EK for a targeted attack, or whether I just don’t have enough data to show more of those types.

Those who have been following along with all of the posts on Angler EK have seen some of the changes that have taken place over time. Going through all of these Pcaps you can see even more than what has been reported, in the way code sections have been moved around and the code changed.


As Kafeine reported in April of 2014, there was a reference to their site that showed up in the Angler EK.


If we take a look at the big picture on this one we see there is not a lot in this section.


We last see this reference in the packet captures on 2016-03-03. After that they rearranged the scripts and sections; on 2016-03-07 the page looks like this, and the reference to malware.dontneedcoffee.com is no longer found.


I guess they needed the real estate to move things around.

I will call the parts that get decoded, with the code in them, “Sections”. I number them from top to bottom, the same way the page gets evaluated as it runs.

The number of sections that get decoded is between 4 and 6. It was down to 4 after the Java and the Silverlight were removed.

There is also a section that I will call the “Lower Section”. It is included in every landing page, with changing encoded variable names associated with the site name.


This section has appeared in a few different ways. Some use the same name with just an index number, some use a different name and an index number, and those like the one above use different names. Not all of those values get decoded before being used in building the pages; some get used as is. I have also seen different numbers of variables in this section. Most of the time, once decoded, the host name is listed under two different index values. In this one you can see that index 1 and index 7 are the same string (the index starts at 0).

You used to always find this section at the bottom of the decoding script for the sections, but it has moved to the top in some of the landing pages.

Section 1 code:

In the older versions when we decode “Section 1” we see this.


Then they changed to this.


In the Pcap from 2016-01-29 they added a new section 1, shoving the rest of the sections back. This is what the new section 1 looks like after that date.


The variable name in the window section changes, and the number “1” will change on occasion, but other than that it stays pretty much the same.


Exploit Type:

The first 3 Pcaps contained all 3 types of exploits: Flash, Silverlight, and Java applet.


Unless I’m missing something, Java was no longer used after November 02, 2014 15:50:34.


From April 01, 2015 18:44:24 through February 15, 2016 14:06:51 the Silverlight code was missing from most of the Pcaps.

On July 03, 2015 11:14:44 we see it show up in section 4.


On 2016-02-19 we find that it was moved to a new Section 6.



Payload Section ?

What I believe is the payload section comes in 2 flavors: scripts, or what I’m calling, for lack of a better name, K33N, which has 2 encoding types. If someone has a better name please let me know.

The scripts were used exclusively up to July 23, 2015 13:50:21, when my K33N Type 1 showed up.

The script looks like this.


I believe this first part is the payload that gets decoded later.

The second part consists of 2 hex encoded scripts, 1 JavaScript and 1 VBScript.


This is what the JavaScript looks like encoded (above) and after decoding (below).


Even after decoding, the VBScript one is still rather obfuscated and hard to follow at first, so I won’t show it here.
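Hex encoded scripts like the two mentioned above can be turned back into text with a short helper. This sketch assumes the encoding is plain pairs of hex digits with no separators or prefixes, which may not match every sample; the function name and the sample string are made up.

```javascript
// Convert a string of two-digit hex pairs back into text.
function hexToText(hex) {
  var out = '';
  for (var i = 0; i + 1 < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.substring(i, i + 2), 16));
  }
  return out;
}

// Harmless sample input, not taken from a landing page.
var decodedScript = hexToText('7661722078203d20313b');
console.log(decodedScript);
```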

Here we see my K33N Type 1 encoding.


I only logged 3 of that type, and then they moved to Type 2 on August 13, 2015 08:00:06.


Once you decode this section you see this.


Even once you decode down to this level there are still (at least) 3 more decode functions needed to fully decode this section. I believe the payload is contained in this section but have not yet had the time to verify it.

Above we see that this is variable “a” (section 3). Later on it is moved to variable “b” (section 4) and the script around it changes some.

In the Pcap from April 22, 2016 08:00:47 we see yet another major change in this section.

The Encoded part looks similar to the older Type 2.


We get a surprise once it is decoded.


Although more of the code is now readable, there are still more parts that need to be decoded to get the full picture, including the section shown below.

It appears that more of the code was left out of the Base64 encoded section.


In this newer type this section is smaller, which tells me that part of this code was inside the Base64 section in the older versions, possibly even the decoding key that other researchers have found using dynamic analysis and posted about.

Well, that’s it for now. There is still a lot of data here to go through and lots more sections to decode to get a full view of the code. Attempting to decode this is not for the easily deterred; you just keep finding layer after layer of encoded sections.

I hope someone gets something useful out of all of this.




How Does JavaScript Right Shift Zero Fill Work

I have converted several online classic cipher tools from JavaScript, Python, C, and C++ to VB.Net for some of my projects.

At times I will create small projects to get a better understanding of how a certain function works given different input.

In this project I needed to understand how the JavaScript “>>>” right shift zero fill operator worked, for a function I’m trying to convert.

The closest I got to an explanation was this: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Bitwise_Operators

If the input number was positive, then it was just a normal right shift operation that should be found in just about any programming language.

If the number was negative, then it starts getting more complicated.

If we look at the simple example they have there.


In my test program we see.


We start by converting the negative number to a positive.

Get the binary bits for the number.

Flip the bits: what was a “1” is now a “0” and what was a “0” is now a “1”. (Strictly speaking, two’s complement also adds 1 after the flip, giving “0111” for 9, but the bit that changes is shifted off in this example, so the flip alone gives the same result here.)

Next we trim 2 bits off the right side for the shift amount of 2. That leaves us with only 2 bits.

If we convert that back to an integer then we end up with “1”.

As we’ve seen above in the borrowed example and my test program, the expected result is not “1” but “1073741821”. So what else do we have to do?

Well, this operator works on 32 bits, so:

We subtract the shift amount from 32 and get 30.

Since there are 2 bits left we subtract 2 from 30 and get 28.

Next we pad the left side of the remaining bits with 28 “1”s.

Finally we convert the binary string back to an Integer or Long.

The number of bits to fill is (32 - shift amount) - remaining bits, or in this case
(32 - 2 = 30) - 2 bits left = 28.
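When porting to a language that has no “>>>” operator, the same result can also be reached with plain arithmetic instead of bit strings. The sketch below is one way to do it, assuming whole-number input; unsignedShiftRight is a made-up name, and the check against the built-in operator is only there to show the two agree.

```javascript
// Emulate value >>> shift using arithmetic only.
function unsignedShiftRight(value, shift) {
  var TWO_32 = 4294967296;                      // 2 to the power 32
  // Map the number onto 0 .. 2^32 - 1; a negative number lands on the
  // same bit pattern as its 32-bit two's complement form.
  var unsigned = ((value % TWO_32) + TWO_32) % TWO_32;
  // JavaScript only uses the low 5 bits of the shift amount.
  var count = shift & 31;
  // Dividing by 2^count and dropping the fraction is a right shift.
  return Math.floor(unsigned / Math.pow(2, count));
}

console.log(unsignedShiftRight(-9, 2), -9 >>> 2);   // both 1073741821
console.log(unsignedShiftRight(-9, 5), -9 >>> 5);   // both 134217727
```

Adding 2 to the power 32 to a negative number is what produces all the leading “1” bits in one step, so no separate padding is needed.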

Now what if we want to shift by more than 4, which is the number of bits for “9”:
“1001” for the positive value and “0110” for the flipped bit version?

Then we pad the left side with enough bits to equal the shift length. For instance, if we start with 4 bits and want to shift 5, we pad 1 bit to the left.

Now when we get done shifting there are no bits left. Or we could just say that if the shift amount is greater than the available bits, the number of remaining bits is “0”.

Following the math above, in this case it would be (32 - 5 = 27) - 0, since there are no bits left.

So we end up with 27 “1”s, “111111111111111111111111111”, convert back to Integer or Long, and end up with 134217727, which is the result we were expecting.
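Printing the full 32 bits before and after the shift makes the pattern easy to see. This is a small sketch with a made-up helper name, bits32; it only relies on the built-in operator and string padding.

```javascript
// Show a number as its 32-bit pattern, left padded with zeros.
function bits32(n) {
  return (n >>> 0).toString(2).padStart(32, '0');
}

console.log(bits32(-9));        // 28 ones followed by 0111
console.log(bits32(-9 >>> 2));  // 2 zeros, 28 ones, then 01
console.log(bits32(-9 >>> 5));  // 5 zeros followed by 27 ones
```

Seen this way, the zeros filled in on the left are the shift amount, and the run of ones is what remains of the sign bits of the negative number.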


While researching this, I created some test data from HTML/JavaScript to test my program with. After compiling this data and viewing the bits, it was easier to see the relationship between the length and the number of bits we are working with.


I chose those numbers at random to give a wider variety of values and lengths to work with.

After testing every number and shift on the list, my program agreed with the results shown above.

The bottom set is the last set I tested, once I started asking myself what happens when the shift is greater than the bits available.

In conclusion, the name of this function is a little misleading. When I first started testing I was trying to “Zero Fill” the left side, taking it literally. That was not working.

After several tries and building the test data it became clear how it worked.

I hope this saves someone else several days of testing.
