concretization, or how 10 lines of rust got me a dangling pointer and undefined behavior

something i did not know until recently is that creating dangling pointers is perfectly legal in safe rust. this matters because a pointer is not always dereferenced inside the rust program (where dereferencing a raw pointer at least requires an unsafe block), but wherever else the pointer goes - and the target does not care about rust’s type safety, so it just dereferences the pointer, causing undefined behavior. this is what happened when i tried to call execve through some (generated) rust code. it went unnoticed for a while, because the code ran perfectly fine in debug mode, but in release mode it blew up.
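to make the first claim concrete, here is a minimal sketch (not the generated code from this article) of safe rust handing out a dangling raw pointer. the function name dangling is made up for illustration.

```rust
// safe rust: returns a raw pointer to a local that dies when the function returns.
// raw pointers carry no lifetime, so the borrow checker has nothing to check here.
fn dangling() -> *const u8 {
    let local = [1u8, 2, 3];
    local.as_ptr()
}

fn main() {
    let p = dangling();
    // the pointer can be copied, compared and passed around in safe code.
    println!("is null: {}", p.is_null());
    // reading through it would need `unsafe { *p }` and would be undefined behavior,
    // so this sketch deliberately never dereferences it.
}
```

newer compilers may warn about this pattern, but as far as i know it is a lint and not an error, so the code still builds.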

a note on safety and abstraction

now we’ve all heard about rust’s safety guarantees, and a running joke is that rust developers are like vegans: they will tell you about it at every opportunity they get. so before i go deeper into the details, a meme about rust:

now jokes aside, rust’s safety guarantees are very real, and i like the language - it’s much like a lower level ocaml/f# with better codegen and a larger ecosystem - but i don’t really care much about the (compiler-enforced) safety aspect of it. if i’m going to pay a big efficiency cost for the safety through .clone() or by rewriting every single transmute as a match expression, i might as well just write f# and be done with it. the reason to write rust is the performance; the safety is a nice bonus, but i don’t want to pay for it if it comes at a large cost in verbosity and/or efficiency (if the code is generated it’s ok, otherwise verbosity is a big problem). also, for embedded targets where the rust standard library is not available, there is no option other than to write no_std code, which means we have to break some rules, and that’s what this article is about.

another thing i’m reluctant to pay for is overabstraction - if i have a concrete idea of what i want to make, i don’t want to have to jump through hoops to fit it into some abstract interface that the language provides and pray that the compiler will give me what i want. (this goes especially for DFAs and jump tables: the code required to “create the conditions” for these optimizations is much more complex than writing them in assembly, where you have control over the output. keep in mind LLVM is 20 million lines of unadulterated C and C++.) if i’m writing a simple program for an embedded linux target i don’t want to even think about macos or windows. but to get anywhere near what really happens under the hood, you have to lose some safety, and that’s where the fun part starts.

when you start stripping away the interface std rust provides and start working with raw pointers and syscalls (or just libc in this case), you quickly realize that you need a lot of unsafe blocks. the optimizer is also very aggressive, so it may produce some surprising results if you’re not careful. (the same is true of c and c++, of course.)

the code

now let’s get to the code (below): the code i wrote has an unsafe block around the call to execve, but the rest of the code is perfectly safe. also, the code is very simple (well, not that simple, python’s print(‘hello world!’) is simpler): it just prepares the standard argv and envp arrays and calls execve to execute a simple shell command that prints “hello world!” to the console.

use libc::execve;
pub fn main() {
    let argv = [
        ("sh\0".as_ptr()),
        ("-c\0".as_ptr()),
        ("echo hello world!\0".as_ptr()),
        0 as *const u8,
    ].as_ptr() as *const *const i8;
    let envp = 0 as *const *const i8;
    unsafe { execve("/bin/sh\0".as_ptr() as _, argv, envp) };
}

so what’s wrong with this code? as mentioned in the beginning, creating dangling pointers is perfectly legal in safe rust, and that’s exactly what happens here. the argv array literal is a temporary - it lives on the stack only until the end of the let statement, and then it’s gone. calling .as_ptr() on it gives you a pointer to memory that the compiler is free to reuse as soon as that statement ends. so by the time execve reads argv, the array contents may already be overwritten, which is exactly what happened in release mode.
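the lifetime of that temporary can be made visible with a type that logs when it is dropped. this is a small sketch of my own, not part of the original program: Tracer, addr, log and run are hypothetical names. it only shows the language-level rule (when the value is considered dead); whether the stack slot actually gets overwritten afterwards is up to the optimizer.

```rust
use std::cell::RefCell;

thread_local! {
    // event log, so the order of drops can be inspected afterwards
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn log(event: &'static str) {
    LOG.with(|l| l.borrow_mut().push(event));
}

// hypothetical helper type: records a message when it is dropped
struct Tracer(&'static str);

impl Tracer {
    // like `as_ptr`, this returns a raw pointer that does not keep `self` alive
    fn addr(&self) -> *const Tracer {
        self as *const Tracer
    }
}

impl Drop for Tracer {
    fn drop(&mut self) {
        log(self.0);
    }
}

fn run() -> Vec<&'static str> {
    LOG.with(|l| l.borrow_mut().clear());
    {
        // same shape as the buggy argv line: the temporary dies at the end of this statement
        let _dangling = Tracer("temporary dropped").addr();
        log("after first let");

        // same shape as the fix: the value is bound, so it lives until the end of the scope
        let bound = Tracer("bound value dropped");
        let _valid = bound.addr();
        log("end of scope");
    }
    LOG.with(|l| l.borrow().clone())
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

the temporary is dropped before "after first let" is logged, while the bound value is only dropped once the enclosing block ends.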

the fix (the lines marked + below) is simple: bind the array to a variable, so it lives until the end of the scope. argv keeps the array alive on the stack, and the pointer is still valid when execve runs.

 use libc::execve;
 pub fn main() {
     let argv = [
         ("sh\0".as_ptr()),
         ("-c\0".as_ptr()),
         ("echo hello world!\0".as_ptr()),
         0 as *const u8,
-    ].as_ptr() as *const *const i8;
+    ];
     let envp = 0 as *const *const i8;
-    unsafe { execve("/bin/sh\0".as_ptr() as _, argv, envp) };
+    unsafe { execve("/bin/sh\0".as_ptr() as _, argv.as_ptr() as _, envp) };
 }
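if std is available, another way to avoid this class of bug is to let one value own both the strings and the pointer array. this is a different technique from the fix above, sketched under the assumption that std::ffi::CString is acceptable for the target. Argv and read_back are hypothetical names, and read_back only stands in for what the callee (execve, in this article) would do with the pointer.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

// hypothetical helper: owns the strings and the null-terminated pointer array together
struct Argv {
    _owned: Vec<CString>,     // keeps the string bytes alive
    ptrs: Vec<*const c_char>, // pointers into `_owned`, followed by a trailing null
}

impl Argv {
    fn new(args: &[&str]) -> Argv {
        let owned: Vec<CString> = args
            .iter()
            .map(|a| CString::new(*a).expect("argument contains an interior NUL"))
            .collect();
        // each CString keeps its bytes on the heap, so these pointers stay valid
        // when `owned` is moved into the struct below
        let mut ptrs: Vec<*const c_char> = owned.iter().map(|c| c.as_ptr()).collect();
        ptrs.push(ptr::null());
        Argv { _owned: owned, ptrs }
    }

    // valid only while `self` is alive
    fn as_ptr(&self) -> *const *const c_char {
        self.ptrs.as_ptr()
    }
}

// stand-in for the callee: walks the array until the null terminator
fn read_back(argv: *const *const c_char) -> Vec<String> {
    let mut out = Vec::new();
    let mut i = 0;
    // safety: the caller must pass a live, null-terminated array of NUL-terminated strings
    unsafe {
        while !(*argv.add(i)).is_null() {
            out.push(CStr::from_ptr(*argv.add(i)).to_string_lossy().into_owned());
            i += 1;
        }
    }
    out
}

fn main() {
    // bound to a variable, so it outlives the call that reads the pointers
    let argv = Argv::new(&["sh", "-c", "echo hello world!"]);
    println!("{:?}", read_back(argv.as_ptr()));
}
```

the same trap is still there if you write let p = Argv::new(...).as_ptr(); so the binding remains the important part. the wrapper just removes the hand-written \0 terminators and the casts.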

conclusion

the bug itself is trivial once you see it, but what’s interesting is where it sits: entirely in safe rust, invisible to the borrow checker, and only triggered by the optimizer. the unsafe block around execve is a red herring - by the time you get there, the damage is already done. debug mode happens to leave the stack intact, so the bug hides until you ship a release build, which is the worst kind of surprise.

this is what i mean by concretization: i had a concrete intent to call the real execve with some real argv arguments and an (empty) environment, but the safety rules are designed for an abstracted view of the world.

by rust’s rules, the temporary lives until the end of the statement that creates it, not until the end of the function, so the compiler is free to reuse its stack slot right after that. the compiler is right, your mental model is wrong, and the kernel doesn’t care either way.