Excel 4 macro code obfuscation

This sample comes from a Twitter thread located Here by Frost @fr0s7_ and appears to be  “BazarLoader”

Since this is a Xlsb file I usually just open it up in my Office 2010 Pro sandbox and then convert to Xlsm and unzip it so I can just view as xml.

The first thing I always do is take a quick look with a hex editor looking for anything of interest.


As we can see from the first 2 bytes we have a “PK” or zip file format.

Once we “UnZip” the file and navigate to the xl folder we can verify this is a binary file and it also contains a Excel 4 macro folder named “macrosheets”.



If we look at the SharedStrings.bin file we can see that strings are in a Unicode format and not that easy to see where they split up at.


Looking at sheet1.bin in the macrosheets folder we can see it is not human readable. 

This is the point where I usually convert the file.


Here we can see we still have a “PK” file but you can clearly see the data is presented a little differently.


Once we unzip and navigate to the xl folder here it now looks a little different.



And now if we look at the SharedStrings.xml file it is a little different.

By the counts there are 34 indexed shared strings. Each appears to be randomly generated strings.


I wrote a tool to aid in extracting and indexing the shared string from the xml file.

When I first parsed the shared strings I ended up with 0-37 index values instead of 0-33.
Turns out the tool stumbled on a rare random Char value I was using to split on.


Here we see the xml version of the macro code. Like the shared strings it is hard to see thru all of the xml tags what is there so I wrote a parser for those too.


This tool is designed to extract values to aid in better viewing what is happening without all of the xml tags. In this case some are left.


Here we see what the values are.


If we look at the highlighted values in green  we see that it is looking for the string in cell ‘E11’ then we are taking the char at the index and taking so many chars. “MID(E11,12,1)” . In vbs the index start at 1 but in this the index starts at 0.

So now we know the first char code was converted to “S” and now we see the first extracted letter is “h” and the next to letter is “e” and then the next 2 are at the same index and is “l”.

Now we have the word “Shell” extracted.

This would be a pain to do by hand, but now that we understand how it works what else is available to extract this data.

The Answer is “XLMMacroDeobfuscator” located here .


As we can see here this tool does a great job of presenting us with the deobfuscated strings.

The version I’m using here is from October 3rd 2021 before it was updated several more times. The version number stayed the same so you need to verify by the install/ file date.

Using the latest version as of November 12th 2021 it only returned the eval result. Also notice in the screen shot that showed the data it is a “Partial Evaluation” where in the updated version it is a “Full Evaluation”.

I have not looked at the byte format for the Macro sheet data but I have looked at the shared strings in the binary format.

Do to the lack of information that I can find on the file format let’s take a quick look at the data in this file as shown below. Notice the patterns.


In the original sample I wrote an extraction tool for we can see how it is laid out slightly different.



Although the file in my original sample was labeled qut.xml it was not an xml file at all. So you can not count on a file name or extension for searches.


And here is what it looks like in the Hex editor.



Lets take a look at format for this sample then we will go back and look at the one from the beginning.

We can see the first 3 bytes of the data appear to be a fixed Header value.

The next 4 bytes are the “Count”. If I understand correctly, it is the total times the string/chars are referenced.

The next 4 bytes are the “Unique Count”. These should be the total number of strings shown in the cells.

Next it gets interesting.

The first byte is always 0x13 Next we have 1 or 2 bytes (Unknown). Perhaps it is a data type ? It appears that it could be 1 or 2 bytes then a null byte depending on the string.

Next we have the length of the string as displayed in the cell. It uses at least 2 bytes.
So the first is only 1 char then value is 0x0100 or in reverse order 0x0001.

After that we have 2 null bytes. Then finally the Unicode bytes for the string.

Now lets go back to our first file that we extracted from this sample.


Notice how everything is aligned but the area in the red box.


If we look at the string under index 16 we see it is 531 characters long.

531 = 0x0213 and our value in the data is 0x1302.


Now everything lines up.

Here we see the first byte  0X13 then 2 unknown bytes then a null byte then 2 bytes for the length and then a double null and finally the start of out Unicode string values.

So in this sample we have extra 0x13 in a place that will break the tool.

At this point the tool will work on a few but will need a total rewrite based on this new information.

There have been plenty of samples that I have looked at where you did not even need to look at the VBA or macro code. All you needed to do was extract the shared strings to get the urls or paths used.

That is it for this one I hope you learned from this as much as I did.



Link to Twitter thread
Link to Sample on InQuest Labs
Link to Sample on Iris-H

Link to XLMMacroDeobfuscator

Link to my tools on GitHub


About pcsxcetrasupport3

My part time Business, I mainly do system building and system repair. Over the last several years I have been building system utility's in vb script , HTA applications and VB.Net to be able to better find the information I need to better understand the systems problems in order to get the systems repaired and back to my customers quicker.
This entry was posted in Uncategorized and tagged , , . Bookmark the permalink.

1 Response to Excel 4 macro code obfuscation

  1. Pingback: Week 47 – 2021 – This Week In 4n6

Comments are closed.