This article provides an overview of the instructions that can be used to control the flow of a program. See the previous article in this series, How to diagnose and locate segmentation faults in x86 assembly.

Using comparison instructions to control applications at the x86 level

The x86 instruction set provides two commonly used comparison instructions: CMP and TEST. Let us explore the following program to understand how these two instructions work.

    section .text
    global _start

    _start:
        mov eax, 101
        mov ebx, 100
        mov ecx, 100
        cmp eax, ebx
        cmp ebx, ecx
        xor eax, eax
        test eax, eax

First, let us assemble and link this program. Next, load the program in GDB, set up a breakpoint at the entry point, and run it.

    gef➤ run

The first three instructions move the values into the registers they name.

    [...]
    0x804900a <_start+10>      mov    ecx, 0x64

Following is the state of the registers after executing these three instructions.

    [...]
    $ecx   : 0x64
    $edx   : 0x0
    $esp   : 0xffffd210  →  0x00000001
    $ebp   : 0x0
    $esi   : 0x0
    $edi   : 0x0
    $eip   : 0x0804900f  →  <_start+15> cmp eax, ebx

Now, let us run the first cmp instruction by typing si and observe the eflags register. The flags are unchanged after executing the first CMP instruction. CMP subtracts the second operand from the first, discards the result, and updates the flags: the Zero Flag (ZF) is set if the difference is 0, and the Sign Flag (SF), Parity Flag (PF), Carry Flag (CF), Overflow Flag (OF) and Adjust Flag (AF) are set or cleared depending on the result. In this case, EAX (101) and EBX (100) are compared; the difference is 1, so none of these flags is set.

The next cmp instruction compares EBX and ECX, which both hold 100, so the zero flag and the parity flag are set. When a specific flag is set, GEF shows its name in upper case letters.

The next instruction, xor eax, eax, sets eax to 0. Following is the state of the registers after executing this instruction.

    [...]
    $ecx   : 0x64
    $edx   : 0x0
    $esp   : 0xffffd210  →  0x00000001
    $ebp   : 0x0
    $esi   : 0x0
    $edi   : 0x0
    $eip   : 0x08049015  →  <_start+21> test eax, eax

The next instruction, test eax, eax, checks whether the register eax contains the value 0. TEST performs a bitwise AND of its two operands, discards the result, and updates the flags, so the zero flag is set if eax is 0. The parity flag is set if the least significant byte of the result has an even number of set bits.

These instructions can be used to control the flow of the program. As an example, a block of code can be executed only if a specific register has the value 0 (checked with TEST), or only if a comparison made with the CMP instruction produces a zero result. In both cases, the CMP or TEST instruction is typically followed by a conditional jump that acts on the resulting flags.
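The use cases described above can be sketched as a small standalone program. This is a minimal illustration rather than the original article's sample: it assumes a 32-bit Linux target and NASM syntax, the labels _exit_fail and _exit are hypothetical names, and it relies on the conditional jumps JNE and JNZ that are covered in the next section. The exit status reports whether both checks behaved as expected.

```nasm
; Minimal sketch: CMP and TEST each followed by a conditional jump.
; Assumes 32-bit Linux and NASM syntax; label names are illustrative only.
section .text
global _start

_start:
    mov eax, 100
    mov ebx, 100
    cmp eax, ebx        ; computes eax - ebx and discards it; result is 0, so ZF = 1
    jne _exit_fail      ; not taken, because ZF is set

    xor eax, eax        ; eax = 0
    test eax, eax       ; computes eax AND eax and discards it; result is 0, so ZF = 1
    jnz _exit_fail      ; not taken, because ZF is set

    mov ebx, 0          ; exit status 0: both checks passed
    jmp _exit

_exit_fail:
    mov ebx, 1          ; exit status 1: one of the checks failed

_exit:
    mov eax, 1          ; sys_exit
    int 0x80
```

Assuming the file is saved as flags.asm (a hypothetical name), it can be built with nasm -f elf32 flags.asm -o flags.o followed by ld -m elf_i386 flags.o -o flags. Running ./flags and then echo $? should print 0.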

Using jump instructions to control applications at the x86 level

The next set of instructions are jump instructions. Jump instructions are of two types: unconditional jumps and conditional jumps. The instruction JMP is an unconditional jump, as it does not rely on any condition being met. All other jump instructions are conditional: whether they are taken depends on flags that were set by earlier instructions in the program. Following is an example with both unconditional and conditional jump instructions.

    section .data
        ; The opening lines of this listing are missing from the source. The section
        ; header and the equal string are reconstructed from the code that uses them
        ; (mov ecx, equal and mov edx, 21).
        equal db "eax and ebx are equal"
        notequal db "eax and ebx are not equal"

    section .text
    global _start

    _start:
        mov eax, 100
        mov ebx, 101
        cmp eax, ebx
        jz  _printequal
        jmp _printnotequal

    _exitprogram:
        mov eax, 1
        mov ebx, 0
        int 0x80

    _printequal:
        mov eax, 4
        mov ebx, 1
        mov ecx, equal
        mov edx, 21
        int 0x80
        jmp _exitprogram

    _printnotequal:
        mov eax, 4
        mov ebx, 1
        mov ecx, notequal
        mov edx, 25
        int 0x80
        jmp _exitprogram

As we can notice in the preceding program, the entry point of the program is labeled _start. When the program starts its execution, the registers eax and ebx are loaded with the values 100 and 101. Next, a comparison is done using the CMP instruction. Since the values in eax and ebx are not equal, the ZERO flag will not be set. The jz _printequal instruction is executed next. This instruction checks whether the ZERO flag is set and jumps to the label _printequal if it is. Clearly, this instruction relies on the outcome of other instructions such as CMP. In this case, the jump will not be taken. Following is an excerpt taken from GDB at this instruction.

    →  0x804900c <_start+12>      je     0x804901c <_printequal> NOT taken [Reason: !(Z)]
       0x804900e <_start+14>      jmp    0x8049034 <_printnotequal>

GEF clearly shows that the jump is not taken because the ZERO flag is not set. GEF displays the jz instruction as je because JZ and JE are two mnemonics for the same instruction. Since the jump is not taken, control passes to the next instruction, which is an unconditional jump to _printnotequal. Once the code within _printnotequal is executed, another unconditional jump transfers control to the label _exitprogram, which gracefully exits the program.

Following is a list of conditional jump instructions.

JE (Jump if Equal): This instruction usually follows a CMP instruction and loads the EIP register with the specified address if the operands of the previous cmp instruction are equal. Example:

    mov eax, 10
    mov ebx, 10
    cmp eax, ebx
    je _loc
    _loc:

JNE (Jump if Not Equal): This instruction usually follows a CMP instruction and loads the EIP register with the specified address if the operands of the previous cmp instruction are not equal. Example:

    mov eax, 10
    mov ebx, 11
    cmp eax, ebx
    jne _loc
    _loc:

JG (Jump if Greater): This instruction usually follows a CMP instruction and loads the EIP register with the specified address if the first operand is greater than the second operand in the previous cmp instruction. A signed comparison is performed. Example:

    mov eax, 11
    mov ebx, 10
    cmp eax, ebx
    jg _loc
    _loc:

JGE (Jump if Greater or Equal): This instruction usually follows a CMP instruction and loads the EIP register with the specified address if the first operand is greater than or equal to the second operand in the previous cmp instruction. A signed comparison is performed. Example:

    mov eax, 11
    mov ebx, 10
    cmp eax, ebx
    jge _loc
    _loc:

JA (Jump if Above): This instruction is the same as JG except that it performs an unsigned comparison.

JAE (Jump if Above or Equal): This instruction is the same as JGE except that it performs an unsigned comparison.

JO (Jump if Overflow): This instruction loads the EIP register with the specified address if the overflow flag is set.

JNO (Jump if Not Overflow): This instruction loads the EIP register with the specified address if the overflow flag is not set.

JZ (Jump if Zero): This instruction loads the EIP register with the specified address if a previous arithmetic or logical instruction left the zero flag set.

JNZ (Jump if Not Zero): This instruction loads the EIP register with the specified address if the zero flag is not set.

JS (Jump if Sign): This instruction loads the EIP register with the specified address if a previous arithmetic or logical instruction left the sign flag set.

JNS (Jump if Not Sign): This instruction loads the EIP register with the specified address if the sign flag is not set.
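The difference between the signed jumps (JG, JGE) and the unsigned jumps (JA, JAE) is easiest to see with a value whose top bit is set. The following is a minimal sketch under the same assumptions as the article's programs (32-bit Linux, NASM syntax); the label names and the use of edi as a result register are illustrative choices, not part of the original article. The value 0xFFFFFFFF is -1 when read as signed and 4294967295 when read as unsigned, so the same cmp makes JA jump while JG falls through.

```nasm
; Minimal sketch: the same comparison seen by a signed and an unsigned jump.
; Assumes 32-bit Linux and NASM syntax; label names are illustrative only.
section .text
global _start

_start:
    xor edi, edi            ; edi collects one result bit per jump taken
    mov eax, 0xFFFFFFFF     ; -1 as signed, 4294967295 as unsigned
    mov ebx, 1

    cmp eax, ebx
    jg  _signed_greater     ; signed: -1 > 1 is false, so the jump is not taken
    jmp _check_unsigned

_signed_greater:
    or  edi, 1              ; bit 0 = JG was taken

_check_unsigned:
    cmp eax, ebx            ; repeat the comparison so the flags are fresh
    ja  _unsigned_above     ; unsigned: 4294967295 > 1 is true, so the jump is taken
    jmp _done

_unsigned_above:
    or  edi, 2              ; bit 1 = JA was taken

_done:
    mov ebx, edi            ; exit status reports which jumps were taken
    mov eax, 1              ; sys_exit
    int 0x80
```

When assembled and linked as a 32-bit ELF binary, the program should exit with status 2: only the unsigned jump is taken. JG is taken when ZF is clear and SF equals OF, while JA is taken when both CF and ZF are clear, which is why the two instructions can disagree after the same cmp.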

Using function calls to control applications at the x86 level

In x86, the call instruction is used to call a function. The function can then return using the ret instruction. When a function is called using the call instruction, the return address (the address of the instruction immediately after the call instruction) is pushed onto the stack and control transfers to the function. The call instruction itself does not create a stack frame; a function that needs one sets it up in its own prologue. After the function has executed, the ret instruction returns to the address saved on the stack. Let us consider the following example.

    section .text
    global _start

    _start:
        call _print
        mov eax, 1
        mov ebx, 0
        int 0x80

    _print:
        mov edx,len
        mov ecx,msg
        mov ebx,1
        mov eax,4
        int 0x80
        ret

    section .rodata
        msg db  'Hello, world!',0xa
        len equ $ - msg

The first instruction under the _start label is a call to _print. After the body of the _print function is executed, the ret instruction returns control to the exit code written immediately after the call _print instruction. Let us see how this looks using GDB. First, let us assemble and link the program and load the binary in GDB. Set up a breakpoint at the entry point and run the program.

    gef➤  run

Following are the instructions to be executed.

    [...]
    0x8049016 <_print+5>       mov    ecx, 0x804a000
    0x804901b <_print+10>      mov    ebx, 0x1
    0x8049020 <_print+15>      mov    eax, 0x4
    0x8049025 <_print+20>      int    0x80
    0x8049027 <_print+22>      ret

Following is the stack before running the first instruction.

    [...]
    0xffffd218│+0x0008: 0x00000000
    0xffffd21c│+0x000c: 0xffffd3df  →  "SHELL=/bin/bash"
    0xffffd220│+0x0010: 0xffffd3ef  →  "SESSION_MANAGER=local/x86-64:@/tmp/.ICE-unix/1760,[...]"
    0xffffd224│+0x0014: 0xffffd441  →  "QT_ACCESSIBILITY=1"
    0xffffd228│+0x0018: 0xffffd454  →  "COLORTERM=truecolor"
    0xffffd22c│+0x001c: 0xffffd468  →  "XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg"

Now, run the call instruction by typing si and observe the top of the stack.

    [...]
    0xffffd214│+0x0008: 0xffffd3c7  →  "/home/dev/x86/functions"
    0xffffd218│+0x000c: 0x00000000
    0xffffd21c│+0x0010: 0xffffd3df  →  "SHELL=/bin/bash"
    0xffffd220│+0x0014: 0xffffd3ef  →  "SESSION_MANAGER=local/x86-64:@/tmp/.ICE-unix/1760,[...]"
    0xffffd224│+0x0018: 0xffffd441  →  "QT_ACCESSIBILITY=1"
    0xffffd228│+0x001c: 0xffffd454  →  "COLORTERM=truecolor"

Executing the call instruction places a new address on the top of the stack, and the offset of every existing entry grows by 4 bytes accordingly. What address is this? Let us view the disassembly of _start, which looks as shown below.

    [...]
    0x08049000 <+0>: call   0x8049011 <_print>
    0x08049005 <+5>: mov    eax,0x1
    0x0804900a <+10>: mov    ebx,0x0
    0x0804900f <+15>: int    0x80
    End of assembler dump.
    gef➤

The address placed on the stack is the address of the instruction immediately following the call instruction, which is 0x08049005 in the preceding excerpt. Let us continue execution until the ret instruction and observe what happens when we are about to execute it.

    [...]
    0x804900a <_start+10>      mov    ebx, 0x0
    0x804900f <_start+15>      int    0x80

The address of the next instruction to be executed after the ret instruction is the same address that was placed on the stack earlier. So, when the ret instruction is executed, the address is popped from the stack and placed in the EIP register.
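The behaviour observed in GDB can also be checked from within a program. The following minimal sketch, again assuming 32-bit Linux and NASM syntax, uses a hypothetical helper named _get_return_addr that reads the value at the top of the stack on entry and hands it back in eax. The caller then compares that value with the address of the instruction after the call. All label names are illustrative and do not come from the original article.

```nasm
; Minimal sketch: the value on top of the stack inside a called function
; is the address of the instruction that follows the call.
; Assumes 32-bit Linux and NASM syntax; label names are illustrative only.
section .text
global _start

_start:
    call _get_return_addr   ; pushes the address of _after_call, then jumps
_after_call:
    cmp eax, _after_call    ; does the saved address match the next instruction?
    jne _mismatch
    mov ebx, 0              ; exit status 0: the addresses match
    jmp _exit

_mismatch:
    mov ebx, 1              ; exit status 1: unexpected value on the stack

_exit:
    mov eax, 1              ; sys_exit
    int 0x80

_get_return_addr:
    mov eax, [esp]          ; top of the stack = return address pushed by call
    ret                     ; pops that address into EIP
```

When assembled and linked as a 32-bit ELF binary, the program should exit with status 0. Because ret simply pops whatever is on top of the stack into EIP, a function must leave esp pointing at the return address before it executes ret.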

Using loop instructions to control applications at the x86 level

The x86 instruction set provides the loop instruction, which decrements ECX and jumps to the target address given as its operand unless the decrement causes ECX to become zero. So, the loop continues to run until the value of ECX becomes zero. Let us examine the following program.

    _start:
        mov eax, 0
        mov ecx, 5

    _addtoeax:
        inc eax
        loop _addtoeax

The preceding program sets the registers eax and ecx to 0 and 5 respectively. When control first reaches _addtoeax, the value of eax is incremented and the loop _addtoeax instruction is executed. This instruction decrements ECX by 1 and, since ECX is not yet zero, jumps back to _addtoeax, where eax is incremented once again. The loop continues until ECX becomes 0. On the fifth pass, ECX is 1 and the increment brings EAX to 5. So, when the loop instruction executes this time, ECX becomes 0, the jump is not taken, and the loop terminates there.
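To observe the final value of EAX without a debugger, the loop can be followed by an exit system call that reports EAX as the exit status. This is a minimal sketch under the same assumptions as the other programs in this article (32-bit Linux, NASM syntax); the exit code at the end is an addition for illustration.

```nasm
; Minimal sketch: count with the loop instruction and report the result.
; Assumes 32-bit Linux and NASM syntax.
section .text
global _start

_start:
    mov eax, 0
    mov ecx, 5          ; number of iterations

_addtoeax:
    inc eax
    loop _addtoeax      ; ecx = ecx - 1, jump back while ecx is not 0

    mov ebx, eax        ; exit status = final value of eax
    mov eax, 1          ; sys_exit
    int 0x80
```

When assembled and linked as a 32-bit ELF binary, the program should exit with status 5. Note that loop decrements ECX before testing it, so entering the loop with ECX already at 0 wraps the counter around to 0xFFFFFFFF and the body runs roughly four billion times instead of not at all.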

Conclusion

As discussed in this article, the x86 instruction set provides several different instructions to control the flow of a program. Depending on the requirement, we can choose among these instructions as appropriate. See also another article in this series, How to implement common logic constructs such as if/else/loops in x86 assembly.

Sources

https://en.wikibooks.org/wiki/X86_Assembly/Control_Flow
https://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames
https://www.cs.uaf.edu/2015/fall/cs301/lecture/09_14_call.html