# Reversing Crackme Challenges

You may have noticed that I like to program many things in Ruby. I really do like many aspects of this language, and it’s usually the main language that I use at work as well. Lately I’ve been getting the feeling that a programmer can get judged and immediately pigeon holed based on using a language a lot at work. For example if you use Ruby you must be “only” a web programmer, if you use a garbage collected language, you mustn’t be able to manage memory, or handle programming in languages like C or C++, let alone assembly.

So I like to take some of my free time and play hacker wargames like smashthestack.org, or reverse engineer crackmes, so I decided to describe my process for solving those types of challanges here.

## Crackmes

“Crackmes” are essentially a compiled binary without source code, which asks you for a password or key, you enter the wrong password you lose, you enter the right one and you win and move onto the next. You need to reverse engineer the binary in order to discover the password, and these range from simple to tedious and difficult as the creator of the crackme piles on layers of obfuscation, misdirection, encryption, and anti-debugging techniques to stop you, not to mention you need a good handle on assembly and keeping track of what is in various memory locations during runtime.

## Crackme0x00a

We start out simple, and so I start with my simplest techniques. crackme0x00a wants a password:

So first we think that perhaps the password is simply in the binary, and we can find it by running the program strings on it to list every string in the program. I think it’s safe to say that will work on this one, but I have a more direct approach that I usually take, and that is run the program through ltrace. ltrace outputs all calls to library functions that the program makes, so here we’re looking for something like a call to strcmp

Simple, we found the call to strcmp the password is g00dJ0B!. You see how we can see each call, with parameters and return value. This was all that was needed to beat this challenge.

## Crackme0x00b

Let’s move onto the next one, with the same approach

So the difference here is the password is being compared with wcscmp, which has the prototype int wcscmp (const wchar_t* wcs1, const wchar_t* wcs2); It compares wide strings where each char is 16-bit.

On Linux most of the time the code and data sections get loaded around the address 0x08040000, and local variables are stored in the stack, which begins at the top of memory and grows upwards from 0xffffffff.

The arguments to wcscmp are 0x804a040, which I’m guessing is in the initialized data section, and is the secret password, and 0xff82eb4c which ltrace shows is the location on the stack that it wrote our input to.

The trick to this challenge, is that if we were simply to run the strings program on the binary, it would have worked in the previous challenge, but not this one because each character in a wchar_t string is 32-bits long with the most significant bits zeroed out, and terminated by a a 32-bit 0x0000 value.

The quickest way I know to find out the bytes at 0x804a040 is to load the binary into my debugger radare2 and just print it out.

The password is w0wgreat.

## Crackme0x01

Ok, let’s see what’s new in the next one, and hopefully get to use radare2 some more. First let’s see what happens with ltrace.

So usually I just write in blah for a password, but we can see here that scanf has returned 0 in response to my string. scanf returns the number of things it has parsed according to its format string. So I’m guessing its format string was not "%s" this time as it was before. Maybe it was looking for a number. Let’s load it into r2, dissassemble the main function and find out.

So, woo, radare is the best. It has added helpful identifiers for us rather then make us look up a bunch of addresses. For example str.Invalid_Password__n is an identifier standing in for the address of the string “Invalid Password\n”.

The assembly listing is easy to follow thanks to the analysis radare2 has done, and the symbols it has added for us, it even graphically shows branches and loops.

The first thing to know when you look at the assembly listing of a function created by a compiler, is that there is going to be 3 main sections: the function prologue, the body, and the function epilogue. If you already know this, you will probably want to skip down a little ways.

## The Function Prologue

Like I mentioned before, the stack is an area of memory which in Linux begins at 0xffffffff, and grows upwards in memory, and the CPU register esp points to the top of the stack. To get this analogy you have to think of memory as a something like a container that holds plates. The bottom of the container is 0xffffff, the top of it is 0x000000. When you place plates onto the stack over and over, the pile grows upwards. That is pushing things onto the stack.

To remove things from the stack, you “pop” them off of it. You can’t pop a plate out of the middle or the bottom, you can only pop a plate off the top. It is a LIFO stack, last in is first out.

An x86 CPU has push and pop instructions for this, and guess what, compilers don’t use them very much, prefering to instead just do arithmetic on esp or ebp in order to get values from the stack or manipulate its size.

In a running program, we use the stack as a “function call stack”, which is a sequence of “stack frames”, one per function called.

So let’s explain how that’s laid out so we can get back to the crackme. Take this example nested function calls:

The calling convention used here is called cdecl, to call a function we put its arguments on the stack in reverse order, and use the call instruction. call automatically pushes the address of the next instruction after it onto the stack, so we will know where to return to after the function call ends.

Once inside the function, space is made to hold the local variables, usually by subtracting the local variables size in bytes from esp, raising the stack up higher, sometimes this size is aligned on a 16 byte boundry by using the and instruction. This is why there is junk data in uninitialized local variables, they just hold whatever garbage happened to be on the stack.

main() like any other function has a prologue.

So we save the old base pointer (base pointer, frame pointer, same thing) then start subtracting from esp to increase the space for local variables.

It makes space twice, using two different methods for aligning it to a 16-byte boundry. You could ask, why doesn’t it just do this all in one subraction. I have no idea why it does it this way, but I do know that when you compile with optimizations off, the compiler often does redundant things. You might also ask why it reserves so much stack space for the one 32-bit local variable that actually exists in this main function.

Your guess is as good as mine. In the assembly listing we can see that the only local variable used is called local_4h, which is the value at ebp - 0x4, the int sized local variable that scanf parses our input into.

## The Function Epilogue

The leave instruction is the same as mov esp, ebp pop ebp. This effectively undoes all the stack resizing the prologue did, and then restores the base pointer to what it was before main was called. Yep, there is another hidden function that calls main()

There is a register called eip that holds the address of the current instruction, but you can’t directly assign a value to it except to use a branching instruction or ret. After leave is executed the next thing on the stack is the return address of the caller, ret pops that address into the eip register and execution continues from there.

A function’s return value is always stored in the eax register, so mov eax, 0 is the same as return 0 at the end of main().

## Back to scanf

The first instruction, lea stands for load effective address, and is good at doing pointer arithmetic, calculating the address of elements inside arrays, and things like that. [ebp - 4] is the address of the local int sized value on the stack we want scanf to write to.

local_4h is just a symbol radare uses to remind us what is in a memory location, and it allows us to rename it if we want, it just equals 4 in this case.

0x804854c is the address of the constant string “%d” located in the data section. So this code is just moving the value 0x804854c to the top of the stack, and the address of our local variable 4 bytes after that in reverse order that scanf takes them, then we call to scanf.

At this point whatever we wrote, if it was a number, will be located at [ebp - 4], and this value is compared with 0x149a, if they are equal we jump to the “you win” screen.

So that solves this crackme, all we need to type for a password is, in decimal 5274.

## Crackme0x02

Ok, let’s hope they get a bit harder from here on, and this should go a lot faster without needing to explain the stack, etc.

A quick run shows that ltrace is not going to help us this time, so back into radare we go. I’ve renamed some of the local variable so it will be clearer what is happening after the scanf.

So we have 3 local variables, number_entered, a, and b. Let’s translate this to C psuedocode and check what the condition for winning is.

After doing that arithmetic and register shuffling, it turns out b = 338724, which is the password.

## Crackme0x03

In crackme0x03 we have all the same stuff as in the previous, but now we’re calling a function test(number_entered, b) which will test the password.

So let’s have a look at this test function.

Ok so what’s going on in here. First thing I see is 8 bytes allocated on the stack and then never used (for some reason), and I see our two arguments coming in at epb + 0x8 and ebp + 0xc which I’ve renamed to match the calling code.

We then compare number_entered and b for equality, which chooses one or another string containing giberish to be passed to the function shift(). In either case the return value of test() is not checked, and we return from the test() function.

Let’s check the actual contents of these gibberish strings out in radare with the ps command to print strings.

Let’s remember that, and have a look at the shift() function which I’m going to guess is going to decrypt these strings somehow.

I’ve renamed the local stack variables to be what I believe their purpose is. This is a bigger function than we’ve seen so far, this loop pattern we see here is what a for loop looks like in assembly. You can see how counter is initialized to 0, then a condition is checked which either processes the body and increments counter, or it jumps out of the loop.

I guess is going to iterate over the string that was passed in, and probably “decrypt” it. Maybe it will help if we translate this back into C.

All it did was subtract 3 from the ascii code of the gibberish string, which is why we say it is “decrypting” in quotes :) Remember we still have not run this binary yet, so let’s just write up a quick Ruby script to see what the giberish decodes to.

Ok so we’ve learned that if the password you enter is equal to the variable calculated to b, we will select and decode the “Password Ok” string, and win, that was pretty obvious from before even looking at this function, but we got to make sure we got to the bottom of each function. So if I remember right, b was 338724`, the same password as last time, real sneaky guys.