Irken Kitties

An Integer Overflow Puzzle

I just came across this cute puzzle and decided to solve it. Like a lot of simple CTF puzzles, we’re just asked to pass some program arguments which when correct, guide the control flow to giving us a shell.

We’re given this C source code:

1
2
3
4
5
6
7
8
#include <unistd.h>

int main(int argc, long **argv) {
  if(*argv[1] * 0x1064deadbeef4601u == 0xd1038d2e07b42569u){
    execl("/bin/sh", "sh", 0);
  }
  return 0;
}

If we can provide the correct number in *argv[1], passing this conditional, we’ll execute a /bin/sh shell on this suid binary and win. Continue on to see how it was solved.

Something is different in argv

Normally the main function in a C program will look something like int main(int argc, char **argv) where argc is the number of arguments provided, including the program name as argument 0, and char **argv or similarly char *argv[] which is an array of pointers to character arrays (C strings) representing each argument.

The program’s environment variables char **envp, (the third argument to main, which has been left out in this case), and the commandline arguments are loaded into the beginning of the stack area when the program loads.

Let’s say we run this program like this ./level02 one two three four, and have a look at how that area of the stack looks in radare2.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
$ r2 -d ./level02 one two three four
-- I accidentally the kernel with radare2.

Process with PID 27842 started...
= attach 27842 27842

#  Short binary analysis
[0x564684b24050]> aa
[x] Analyze all flags starting with sym. and entry0 (aa)

# Debug Continue Until main
[0x564684b24050]> dcu sym.main

hit breakpoint at: 564684b24135

#  Analyse Register rdi, first argument to main, normally named argc
[0x564684b24135]> ar rdi
0x00000005

#  Analyse Register rsi, second argument to main, normally named argv
[0x564684b24135]> ar rsi
0x7ffe26db3678

This is the X86_64 calling convention of putting function arguments in order in the registers: rdi, rsi, rdx, rcx, r8, r9 which means main was called essentially like this: main(rdi, rsi), or in this case literally main(5, 0x7ffe26db3678).

Somewhat confusingly this C program types argv differently, this will be important later. So the four arguments we gave, plus the program’s name gives us 5 here, and the stack address 0x7ffe26db3678 will be a pointer, to a pointer, to those argument strings, so let’s look at that now.

1
2
3
4
5
6
7
8
9
#  Print Hex, 0x50 bytes at the address pointed to by rsi
[0x564684b24135]> px 0x50 @ [rsi]

- offset -       0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x7ffe26db5785  2e2f 6c65 7665 6c30 3200 6f6e 6500 7477  ./level02.one.tw
0x7ffe26db5795  6f00 7468 7265 6500 666f 7572 004c 535f  o.three.four.LS_
0x7ffe26db57a5  434f 4c4f 5253 3d72 733d 303a 6469 3d30  COLORS=rs=0:di=0
0x7ffe26db57b5  313b 3334 3a6c 6e3d 3031 3b33 363a 6d68  1;34:ln=01;36:mh
0x7ffe26db57c5  3d30 303a 7069 3d34 303b 3333 3a73 6f3d  =00:pi=40;33:so=

We can see here the following arguments, ./level02, one, two, three, and four, each separated by a 0x00 null character to terminate the string. After that we have a similar situation for environment variables, LS_COLORS=... shown here, which we don’t care about.

An Argument String is Not an Integer

Remembering that char **argv and char *argv[] mean essentially the same thing in C, let’s find out what happens when that is changed to long **argv.

In the normal course of things char *argv[] holds an array of char * which are C strings as shown above. In this program we have long **argv which essentially means a pointer to an array of pointers to long. Confusing?

Basically what this is going to do is force a string entered on the commandline to be interpreted as a type of 64-bit signed int called a long. It’s going to force the bytes in a string to be interpreted as a 64-bit number, which we’ll subsequently do some math on. Computers don’t care about data types, and in C we can take any pointer to some bytes and say, consider the following bytes as this type.

Let’s have some fun, and convert an 8 character string into a 64-bit integer, a long as this program is doing.

1
2
3
4
5
6
7
8
9
10
11
12
#include <stdio.h>

int main(int argc, char **argv){

  char *string = "ABCDEFGH";
  long *integer = (long *)string;

  printf("sizeof(long) = %d\n", sizeof(long));
  printf("%p\n", *integer);

  return 0;
}

Giving the output:

1
2
3
$ ./string_to_int
sizeof(long) = 8
0x4847464544434241

We can see, because X86_64 is little endian, A = 0x41, B = 0x42, and so on, is now backwards, and represents a fairly large 64-bit number.

Finding the Correct Number for argv[1]

Since we now understand how a commandline string can represent a number, let’s find that number so we can pass the following condition:

1
  if(*argv[1] * 0x1064deadbeef4601u == 0xd1038d2e07b42569u)

This should be as simple as taking this equation and rearranging it terms of x:

1
2
x * 0x1064deadbeef4601 = 0xd1038d2e07b42569
x = 0xd1038d2e07b42569 / 0x1064deadbeef4601

This works out to x = 12, and it’s wrong. This is because we’re mixing regular algebra with integer division and disregarding that integers on the computer have a finite range due to bit depth, and wrap around when they exceed the resolution of bits used.

Now let’s say we have, for example 8-bit unsigned integers and multiply 99 * 5 = 495. The maximum value of of an 8-bit unsigned integer is 2^8 - 1 = 255, and this is too high. Counting 0, we have a total of 256 distinct values.

In order to find out how this expression will actually work out on the CPU we can say instead 99 * 5 mod 2^8 = 239, which is the answer you would get for 8-bits after it wraps around. We’ll use that next for our 64-bit values.

I’m going to use Wolfram Alpha to solve for x, because I don’t know offhand how to do algebra involving modulus. First we’ll convert the large 64-bit hex into decimal. I very much like hex, but Wolfram Alpha doesn’t. Boo to that.

1
2
0x1064deadbeef4601u = 1181313840091973121
0xd1038d2e07b42569u = 15061036807694329193

Then entering this equation into Wolfram to solve for x:

1
2
3
4
mod(1181313840091973121 * x, 2^64) = 15061036807694329193

Solution is this set of linear equations:
x = 18446744073709551616 * n + 8319100071223652201  forall n

This will give a valid solution for any n, because of the cyclic nature of integer overflows, so let’s just pick n = 1.

1
2
3
n = 1
x = 18446744073709551616 * n + 8319100071223652201
=> 26765844144933203817

We now have one of an infinite amount of solutions, but we have a problem, this number is too large to fit into 64-bits. How can we tell how many bits a number needs to be represented? The base 2 logarithm of a number will let us know exactly how many bits are needed.

1
2
3
4
5
6
7
8
9
10
11
12
x = 26765844144933203817
Math.log2(x)
=> 64.53702695614811

#  Requires just slightly more than 64-bits.
#  Wrap it back into 64-bit range, gives us a 63 bit number, which is fine
wrapped = x % (2**64)
=> 8319100071223652201

#  Now fits into 62-ish bits, good.
Math.log2(wrapped)
=> 62.85113317948744

Looks like we should have just picked n = 0 and saved ourselves some time. I figured as much, but we got to show off the log2 thing, so whatever. I’m interested in knowing if we can solve this with values of n < 0, but not that interested at the moment :)

Next we need to check that 8319100071223652201 is the number we’re looking for by multiplying it out, this looks like it’s going to be a HUGE number, so we’ll again need to use modulus to wrap it back within 64-bits.

1
2
3
4
5
6
7
8
9
10
8319100071223652201 * 1181313840091973121
=> 9827468051246619677849523421944489321

#  Again wrap it back into 64-bit
(8319100071223652201 * 1181313840091973121) % (2**64)
=> 15061036807694329193

#  Exactly the number that will pass the conditional
0xd1038d2e07b42569
=> 15061036807694329193

Convert our found number to hex

Our number is 8319100071223652201, and in hexadecimal that is 0x7373617034366f69. This looks suspiciously like each byte is an ASCII character value to me, forming an 8 character string, so let’s convert that in radare.

1
2
3
4
5
6
> ? 0x7373617034366f69
hex     0x7373617034366f69
octal   0715633027006415467551
int64   8319100071223652201
string  "io64pass"
binary  0b0111001101110011011000010111000000110100001101100110111101101001

Looks like we have found the commandline argument that will get us our suid shell, io64pass so let’s try it out.

1
2
3
4
5
6
7
8
$ ls -lah level02
-rwsr-xr-x 1 root root 17K Jan  7 01:02 level02

$ ./level02 io64pass
# whoami
root
# id
uid=0(root) gid=0(root) groups=0(root)