Link collection: 16

Posted on Nov 11, 2022

Mateusz

JITs, how do they work?

If you search for “How do JITs work?”, you’ll get the same information presented with different CSS. What if you understand that JITs take source code or bytecode and at runtime compile it to machine code but still have no clue how is that achieved? If you think about it it’s actually somewhat of an interesting problem. There appears to be no easy solution to running arbitrary code that “appeared” during the programs lifetime. Do they create small .exe files that are running and returning result? It was so weird to me that I actually thought that this solution had some merit to itself.

The reason why this use-case seems so out of place is because it’s very rare and at the same point guarded against. If the program you were running had it’s code modified the results would be disastrous for security. Any program could just write into other processes memory where the code is stored and have them execute it. Stealing password would be so easy! But you know you can’t write to the memory of other processes, that’s what virtual memory is for. Okay, so I’ll just write into my own processes memory the code I want to execute and somehow tell my program to execute those instructions. Right? Well, almost. You’ll have to allocate memory and specifically mark this part of the memory as executable. Otherwise, the CPU will throw an exception the moment it tries to execute the first instruction. Now that we allocated memory that can be executed we somehow need to tell our program to run this piece of code. This point is the most trivial to achieve but might be hard to figure out if you’ve never done something like this. Function pointers are our solution. It’s nothing more than an address of the memory where the CPU should set it’s instruction pointer to. The code you’re compiling is stored in .text section in the ELF file. That data it’s copied into memory that can be executed and read but not written since we want to protect our program from unwanted modifications. If you’ve programmed in assembly this will make more sense, but if you’ve only programmed sparsely in low-level languages then it’ll take some time to understand.

We’ll create a simple proof of concept that will run arbitrary bytes as code in C. Starting with the code we actually want to run

int fused(int a, int b, int c) {
    return a * b + c;
}

We’ll compile this into an object file to see what does this code map to assembly wise. Using clang code.c -O2 -c && objdump -d code.o -M intel For the code above we get the following output:

Disassembly of section .text:

0000000000000000 <fused>:
   0:   0f af fe                imul   edi,esi
   3:   8d 04 17                lea    eax,[rdi+rdx*1]
   6:   c3                      ret

The hex values after the colon are the actual bytes that make up the instruction. Now we need to load these values into allocated memory region, set an instruction pointer equal to the address of that memory region and call it. Starting with allocating memory that has the executable flag set. The easiest way to achieve that is using mmap on Linux.

int main()
{
    int prot = PROT_EXEC | PROT_READ | PROT_WRITE; // Make the memory writable, readable and executable
    uint8_t *memory = mmap(NULL,          // Let the OS decide the base address
                           4096,          // Allocate 4096 bytes; one page
                           prot,          // See above
                           MAP_PRIVATE |  // Do not share this memory with other processes
                           MAP_ANONYMOUS, // Allocate memory instead of mapping a file
                           0,             // No file descriptor
                           0);            // No offset
}

And now we are going to copy the values we got from objdump above.

{
    // main continuation...
    uint8_t *write_head = memory;
    uint8_t instr01[] = { 0x0f, 0xaf, 0xfe }; // imul   %esi,%edi
    uint8_t instr02[] = { 0x8d, 0x04, 0x17 }; // lea    (%rdi,%rdx,1),%eax
    uint8_t instr03[] = { 0xc3             }; // retq
    copy_instruction(&write_head, instr01, ARRAY_LEN(instr01));
    copy_instruction(&write_head, instr02, ARRAY_LEN(instr02));
    copy_instruction(&write_head, instr03, ARRAY_LEN(instr03));
}

And we define these helpers

#define ARRAY_LEN(x) (sizeof(x)/sizeof(x[0]))

void copy_instruction(uint8_t **memory, uint8_t *instr, uint32_t count)
{
    memcpy(*memory, instr, count);
    *memory = (*memory) + count;
}

All is left to do is setting the function pointer to the address we got from mmap and calling the function.

{
    // main continuation...
    int (*fused)(int, int, int) = (int (*)(int, int, int))memory;
    printf("func(11, 11, 2022) = %u\n", fused(11, 11, 2022));
}

After compiling and running

$ clang file.c -o prog && ./prog
func(11, 11, 2022) = 2143

Go play and see what happens if you try and executing garbage data or write a simple JIT which compiles simple WASM. The full file is here.