Solving a simple Windows Crackme

Introduction

Welcome back! Today we are tackling another Windows binary from crackmes.one. Unlike the previous binaries, this was compiled with C/C++ so we will not be able to use dnSpy or a similar tool to reverse this binary. There aren't any surprises in this binary just some tricky disassembly that I'll do my best to walk you through! If you want you can feel free to watch the YouTube video below or continue reading!

Find The Password Video Walkthrough

Optional Materials to Follow Along

I normally use my Windows VM to analyze Windows binaries but my Windows VM was acting a little funky so I used my Kali VM. Luckily we did not need to do any debugging to solve this challenge. If you're using a Kali VM you'll need to install wine to run the binary on Kali. If you're using my VM, you'll find this binary in the /home/kali/reverse_engineering/crackmes/find_the_pass directory. If not you can download the binary here. It comes in a password protected zip file. The password is "crackmes.one" without the quotes. With all of that out of the way let's get reversing!

Initial Triage

Just like in the previous challenges we start by running file. However, you'll notice that on a Windows binary file doesn't provide much information. Also we can't look at the symbols either with the nm command. Again fortunately, this is a simple challenge so we do not need to worry too much about.

So, not much information there but let's go ahead and run strings and see if there is anything of interest in there.

We see what looks like a possible password (highlighted in red). There's no way it's this easy right? Well you know me, I am a try everything kind of guy so let's try to supply this password to the binary! We also see, we are to supply the password as an argument to the binary.

And just as we expected. Well this string could have some significance later (spoiler it does), so let's just jot this down in our notes for now. Now, before we move on, not that this binary has reference to mingw32. If you didn't know, mingw is a cross compiler. This little detail will be important later on.

Static Analysis with Ghidra

Alright let's go ahead and open this up in Ghidra!

We see something that's a little strange at address 0x401556. We see a call to ____main. We don't have to worry about this. From what I can tell this is a mingw artifact that does some extra setup. After that we see it compares the argument count with two, ensuring we passed in a parameter. Assuming we pass in a parameter we can move to address 0x401565. We see the code grabs argv[1] which would be the password. Since this is a 32-bit binary, arguments are passed on the stack. At address 0x40156d, we see our user input being placed at the top of the stack followed by a call to strlen. The result of this is stored in a variable local_14. Recall that return values are placed in the EAX register. So, we can rename local_14 to password_len. We then see the password_len is placed at the top of the stack which sets up the malloc function call. The return is then placed in a variable local_10. Let's rename this to buffer. Ghidra is doing something that makes it a little difficult to figure out what's going on. You see at address 0x401591 that EAX gets stored to ESP+local_2c. This is a little bit misleading. What's actually happening is the value stored in the EAX register which happens to be argv[1] is being placed on the stack, specifically, ESP+4. We can tell prevent Ghidra from doing this by making the following changes: Edit -> Tool options -> Listing Fields -> Operands Fields and then uncheck the box Markup Stack Variable References.

This will clean up the disassembly display as you can see below. I've also renamed the variables.

With this notation we can see clearly that our user supplied password and the newly created buffer are placed on the stack ESP+4 and ESP respectively. This makes it easier to determine that the arguments for the strcpy function. So far, our decompiled output will look something like this:

if (argc != 2)
{
	printf("Usage:...\n");
    return 0;
}

password_len = strlen(argv[1]);
buffer = malloc(len_user_input);
strcpy(buffer, argv[1]);

We then see we call a function called _hash.

Similar to the previous function, we see the arguments to _hash are buffer and 0x1c. The first argument is buffer and the second is 0x1c. This will be important in just a moment! Let's take a deeper dive into the hash function.

Analyzing a Simple Hash Function

Right off the bat we see param_2 which is 0x1c gets passed to the srand function. This means the author is using a static seed (giggity). This is good for us because it means every time rand() is called, the same number will be generated every time we run the program. Next we see param_1 which holds our user_input is passed to the strlen function. Let's go ahead and rename these variables so it's a little easier to follow.

We see that we are grabbing the creating another buffer with the malloc function. Notice the len_user_input goes through an SHL instruction before being passed to the malloc function. This will multiply len_user_input by 2 raised to the 2 power or simply 4. Finally, we see a variable being set to 0 followed by an unconditional jump which we know by experience is normally how a loop begins so I renamed this variable to i. Let's go ahead and take a look at what's going on in this loop.

We see the loop continues as long as i is less than len_user_input. The first few instructions in LAB_0040149A look a little confusing but let's go through it step by step. First, i is placed in the EAX register. The next instruction, we see EAX*4 get stored in the EDX register. We can ignore the + 0x0 since that does not affect anything. So EDX holds EAX*4 or we can i*4. Next, buffer gets placed in the EAX register. Lastly, before the rand() function call, we see EDX+EAX*1 get stored in EBX. We can substitute the registers for our variables to make it easier to understand. So, EBX holds (i*4 + buffer*1). We can actually ignore the *1 since that will not affect anything in this case. If this notation looks weird you're not alone. This was a little confusing to me as well. What's happening here is buffer is being indexed and stored in the EBX register. By the name, we expected buffer to hold characters but it appears that buffer will hold integer values. Integers are 4 bytes which explains why i gets multiplied by 4. This will ensure that we get to the next element in the array. We can simplify the indexing by simply translating the craziness to buffer[i]. Alright so buffer[i] is placed in the EBX register. Let's keep that in mind. Next, we see rand() is called. Recall, since the seed was static, this will not be a truly random number. This means if we were to create a function, initialize it with 0x1c and call random as many times as this hash function does we will get the exact same "random" numbers. After rand() is called, we can see a lot of craziness that I cannot decompile myself. The instructions from 0x4014AF to 0x4014CF take the result of rand() and mod it by 10. That is EDX will hold rand() % 10. Then we see EDX gets stored in EBX and lastly we see i is incremented. We can translate this loop to the following C code:

while(i < len_user_input)
{
	buffer[i] = rand() % 10;
    i++
}

Now, buffer does not seem to be a proper name for this variable. Let's rename it to key. This will become apparent later. Now, after this loop finishes, we go into yet another loop lol.

Luckily, we don't have any crazy modulus to worry about in this loop but we it is still a little confusing. Just like the previous loop we will go through this step-by-step. First, we see user_input[j] gets stored in the EAX register. This in turn gets stored in a local variable local_20. We then see our key get indexed by j as well. Next we see key[j] is added to local_20 which holds user_input[j]. So local_20 holds user_input[j] + key[j]. This is beginning to look similar to a shift cipher. The result of this is stored in the ECX register at address 0x401510. We then see another new stack variable local_52. This new variable gets indexed by j and then user_input[j] + key[j] is stored. So, let's rename this to modified_user_input. We then see j is incremented by 1. The above loop can be roughly translated to the following C code:

while(j < len_user_input)
{
	letter = user_input[j];
    letter += key[j];
    modified_user_input[j] = letter;
    j++
}

I went ahead and changed renamed local_20 to letter. Now, let's analyze the rest of this function.

The end of the hash function looks fairly straight forward. First, we see modified_user_input is indexed by len_user_input and then we see 0 is placed at this location. This is essentially, NULL terminating the modified_user_input. This looks like modified_user_input[len_user_input] = '\0'. Finally, we see hshpwd and modified_user_input are placed on the stack and strcpy is called. hshpwd is a global variable the author created. You'll see this variable is used in the main function shortly. Let's take a look!

Recall, that hshpwd contains our modified user_input. We also see a string that we saw earlier. This is the value that hshpwd will get compared against. What's strange is this comparison occurs a few times in succession. I'm not sure why the author did that but if we look at the rest of the main function we can see the same comparison occurs multiple times before printing out the "Success" message.

So, now we have a greater idea of what this binary is doing. We can solve it. It simply takes our user input and modifies it using the hash function and then it compares it against the string pcx|xpzlhgu}. So all we have to do is implement the hash function and pass pcx|xpzlhgu} as a parameter. Of course, we would change it slightly such that instead of adding key[i] we would substract key[i] from pcx|xpzlhgu}. Now, remember when I said that this was compiled with mingw and that would be important later on? Well this is the time. In order to replicate the values produced by rand() we have to write our solution in C and compile it with mingw. Here is the code I used to solve this challenge. You can also find this code on my Github.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main()
{
    char *hash = "pcx|xpzlhgu}";
    int random_number;
    int i = 0;
    char password[12];
    srand(0x1c);
    while(i < strlen(hash))
    {
        random_number = rand() % 10;
        printf("random_number: %d\n", random_number);
        password[i] = (int)hash[i] - random_number;
        i++;
    }
    printf("%s\n", password);

    return 0;
}

The above code must be compiled like so:

i686-w64-mingw32-gcc-win32 keygen.c -o keygen

This will create a keygen.exe file. Now run it with wine keygen.exe.

I printed out the random values produced by rand(). This of course is extremely unnecessary lol. Now, we test our input with the binary we are interested in.

And we solved it!

A Buffer Overflow?

Before I conclude I want to point something out which will be a little easier if I write out all of my decompiled output for this challenge. A friend of mine pointed this issue out to me. Let's take another look at the hash function.

hash(user_input, is_0x1c)
{
        srand(0x1c)
        len_user_input = strlen(user_input)
        key = malloc(len_user_input (2*2))
        i = 0
        while(i < len_user_input)
        {
                key[i] = rand() % 10
                i++
        }
        j = 0
        while(j < len_user_input)
        {
                letter = user_input[j]
                letter += key[j]
                new_user_input[j] = letter
                j++
        }

        modified_user_input[len_user_input] = '\0';
        strcpy(_hshpwd, new_user_input);

}

Now, I didn't include it in my decompiled output but the size of new_user_input is static. In fact, according to Ghidra it is 50 bytes long.

Why is this an issue? Well, the size of key is determined by the size of our user_input. This presents a potential buffer overflow vulnerability. If we provide input that is greater than 50 bytes, we will overflow the buffer and possible spill into the return address. Let's verify this by supplying a bunch of A's to the program and see if we can get a crash.

Alright we crashed the program. We see we got an unhandled read access. Wine was nice enough to start a debugger and print a stack trace. The stack dump doesn't have a bunch of A's that we expected but remember, the user_input is modified so it is possible that the values in the stack are indeed part of our buffer. Furthermore, take a look at the address where the crash happened (highlighted in red). This occurs in the second while loop where the bytes are being copied to the modified_user_input. I didn't have time to play around with this very much but there is a potential for a buffer overflow here. If I make any progress and exploit this vulnerability I'll post an update. Similarly, if you figure out how to exploit the vulnerability please share the details!

Conclusion

Alright that's all there is to it for this challenge. It was a little on the easier side but I did enjoy solving this challenge and I hope you enjoyed reading about it. Furthermore, I hope you learned something from this blog post or at least increased your understanding about the topics presented here. If you have any questions feel free to hit me up on Twitter, Instagram, or Discord: jaybailey216#6540. If you have a challenge you want me to try next, let me know and I'll give it a shot! I'll see you all next time!

Peace out! ✌🏾

Reverse Engineering Challenge - Find the Pass

Joshua Bailey