Rust Generator yield: What the Compiler Actually Builds Under async/await

Every async fn you’ve ever written in Rust compiles down to something you probably never asked to see. The Rust generator yield mechanism isn’t an exotic nightly toy — it’s the actual substrate that async/await desugars into, and the compiler generates state machines from it that can quietly balloon your binary by hundreds of kilobytes. Most devs ship this without knowing it exists.


TL;DR: Quick Takeaways

  • Every Rust async fn desugars into a generator-backed state machine — a compiler-generated enum with one variant per suspension point.
  • The state machine size grows with nesting depth: a real-world async chain can produce a single future exceeding 400 KB (Tweede golf, 2024 measurement).
  • Generators behind #![feature(coroutines)] are still nightly-only in 2025; stable workarounds exist but carry trade-offs.
  • Python’s yield and Rust’s yield look identical syntactically — the execution contracts are completely different at the machine level.

What the Rust Compiler Does with yield: A State Machine You Never See

When rustc compiles an async function, it runs a desugaring pass that converts suspension points into enum variants. Each .await call becomes a potential yield point; the compiler assigns each one a unique state index and generates a match arm in the polling loop. What you get is not a thread, not a closure, not a callback — it’s a resumable enum that the executor pokes with poll() until it returns Poll::Ready.

The struct that wraps this enum implements Future. Inside sits the compiler-generated state machine with fields for every local variable that’s alive across a suspension point. That’s the part nobody warns you about: variables don’t get dropped at the .await boundary — they get saved into the enum variant. If you have a fat buffer alive when you hit .await, that buffer now lives in the enum. Permanently. Until the future resolves.
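This is directly measurable on stable Rust with std::mem::size_of_val. A minimal sketch — tick, holds_buffer, drops_buffer, and the 4 KB buffer are illustrative names, not from the article:

```rust
use std::hint::black_box;
use std::mem::size_of_val;

async fn tick() {} // trivial await point

// `buf` is alive across the .await, so it is stored inside the future's state.
async fn holds_buffer() {
    let buf = [0u8; 4096];
    tick().await;
    black_box(&buf); // keep buf live past the suspension point
}

// `buf` is dropped before the .await, so it never enters the state machine.
async fn drops_buffer() {
    {
        let buf = [0u8; 4096];
        black_box(&buf);
    }
    tick().await;
}

fn main() {
    println!("holds: {} bytes", size_of_val(&holds_buffer())); // at least 4096
    println!("drops: {} bytes", size_of_val(&drops_buffer())); // a handful of bytes
}
```

The held buffer shows up byte-for-byte in the future's size; ending its scope before the .await keeps it out of the state machine entirely.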

// What you write (`client` is assumed to be an HTTP client already in scope):
async fn fetch_data(url: &str) -> Vec<u8> {
    let response = client.get(url).await; // suspension point 1
    let body = response.bytes().await;    // suspension point 2
    body.to_vec()
}

// What the compiler roughly generates (simplified MIR view):
enum FetchDataStateMachine<'a> {
    State0 { url: &'a str },                     // before first await
    State1 { url: &'a str, response: Response }, // between awaits
    State2 { body: Bytes },                      // after second await
    Complete,
}

The mini-analysis here is straightforward but the implications aren’t. State1 holds both url and response simultaneously because both are still live at that suspension point. The enum is sized to its largest variant — so if Response is 1 KB, the entire future reserves that space even on code paths that never reach State1. This is where Rust async function memory layout becomes a production concern, not a trivia question.
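The largest-variant rule is ordinary enum layout, not anything async-specific. A quick stable-Rust check (Demo is an illustrative name):

```rust
use std::mem::size_of;

// A Rust enum is sized to its largest variant (plus discriminant and padding),
// even if the large variant is rarely or never constructed.
#[allow(dead_code)]
enum Demo {
    Small(u8),
    Big([u8; 1024]),
}

fn main() {
    // At least 1024 bytes, even for values that are always Demo::Small.
    println!("{} bytes", size_of::<Demo>());
}
```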

The Jump Table and MIR Output

The state machine dispatches via a jump table internally — the executor calls poll(), the match expression on the state index branches to the right arm, execution resumes from the exact suspension point. This is cooperative scheduling at zero OS overhead: no context switch, no kernel involvement, no stack swap. The Waker mechanism handles re-scheduling by registering a callback that the executor fires when the awaited I/O is ready.
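The whole dispatch shape can be hand-rolled on stable Rust. A sketch of a two-state future and the match-on-state polling it implies — TwoStep and noop_waker are illustrative, not actual compiler output:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A future that advances one state per poll() — the same shape as a
// compiler-generated async state machine with one suspension point.
enum TwoStep {
    Start,
    Waiting,
    Done,
}

impl Future for TwoStep {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        match *self {
            TwoStep::Start => {
                *self = TwoStep::Waiting;
                cx.waker().wake_by_ref(); // ask the executor to poll again
                Poll::Pending
            }
            TwoStep::Waiting => {
                *self = TwoStep::Done;
                Poll::Ready(42)
            }
            TwoStep::Done => panic!("polled after completion"),
        }
    }
}

// Minimal no-op waker so we can drive the future without an executor.
fn noop_waker() -> Waker {
    const VTABLE: RawWakerVTable = RawWakerVTable::new(
        |_| RawWaker::new(std::ptr::null(), &VTABLE), // clone
        |_| {},                                       // wake
        |_| {},                                       // wake_by_ref
        |_| {},                                       // drop
    );
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = TwoStep::Start;
    assert_eq!(Pin::new(&mut fut).poll(&mut cx), Poll::Pending);
    assert_eq!(Pin::new(&mut fut).poll(&mut cx), Poll::Ready(42));
}
```

The match on the state discriminant is exactly the jump table described above — each arm resumes from one suspension point.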

If you want to actually see what rustc generates, run cargo rustc -- --emit=mir on a small async function. The MIR output is verbose but legible — you’ll see the coroutine drop glue, the state discriminant, and the resume-argument plumbing. It’s ugly in the good way: no magic, just a lot of autogenerated match arms.


Rust Generator vs Python Generator: Same Word, Different Contract

Python developers picking up Rust often assume yield works the same way. It doesn’t — and the difference matters enough to trip up experienced engineers. In Python, yield suspends a function and returns a value to the caller; the caller can push a value back in via gen.send(value). The generator maintains its own implicit stack frame that Python manages for you. You don’t think about lifetimes. You don’t think about pinning. You don’t think about drop order.

# Python mental model:
def counter(start):
    n = start
    while True:
        received = yield n  # yields n, receives next value
        n = received or n + 1

gen = counter(0)
next(gen)     # → 0
gen.send(10)  # → 10

// Rust nightly equivalent (coroutines feature):
#![feature(coroutines, coroutine_trait)]
use std::ops::{Coroutine, CoroutineState};
use std::pin::Pin;

let mut gen = #[coroutine] |initial: i32| {
    let mut n = initial;
    loop {
        let received = yield n; // the next resume argument becomes `received`
        n = received;
    }
};

match Pin::new(&mut gen).resume(0) {
    CoroutineState::Yielded(v) => println!("{v}"), // prints 0
    CoroutineState::Complete(_) => unreachable!(), // the loop never returns
}

The structural difference: Python generators are heap-allocated, garbage-collected, and implicitly pinned. Rust coroutines are stack-allocated by default, borrow-checker-validated, and require explicit Pin<&mut Self> to resume. The StopIteration exception in Python maps to CoroutineState::Complete in Rust — same conceptual role, completely different mechanism. Python devs learning Rust generators shouldn’t expect a soft landing here.

The send() Asymmetry

Python’s gen.send(value) is two operations fused: it resumes the generator AND injects a value that becomes the result of the yield expression. Rust’s resume(arg) works the same way, but the type of the resume argument is part of the coroutine’s type signature — the compiler enforces it. You can’t accidentally send a string to a coroutine that expects an integer. Python discovers that at runtime; Rust refuses to compile it. Whether that’s “better” depends on how much you enjoy type errors at 2 AM.

Why async/await in Rust Is a Generator in Disguise

This isn’t metaphorical. Before Rust stabilized async/await syntax, the internal implementation literally used std::future::from_generator() — a function that wrapped a generator closure into something that implements Future. The wrapper was called GenFuture. You can find both in the pre-stabilization nightly sources. The async keyword was syntactic sugar over this machinery from day one.

// Pre-stabilization desugaring (historical, ca. Rust 1.36 nightly):
// async fn example() -> i32 { ... }
// compiled roughly to:
fn example() -> impl Future<Output = i32> {
    from_generator(|| {
        // body with yields instead of awaits
        let x = yield some_future;
        x + 1
    })
}

// The modern compiler lowers async blocks to coroutines directly in MIR;
// the GenFuture wrapper has since been removed from the standard library.

The from_generator wrapper implemented Future::poll by calling Coroutine::resume. When the coroutine yielded, that mapped to Poll::Pending; when it completed, Poll::Ready. In the early implementation the Waker was threaded through via thread-local storage in the poll_with_tls_context helper; later nightlies passed the Context through the coroutine’s resume argument instead. This is the async state machine — not an abstraction over it, not analogous to it. It is it.
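That mapping is small enough to reconstruct on stable Rust with a hand-rolled stand-in for the nightly Coroutine trait — a sketch, with MiniCoroutine, CoroState, and CountDown as illustrative names:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Stand-ins for the nightly Coroutine trait and CoroutineState enum.
enum CoroState<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait MiniCoroutine {
    type Return;
    fn resume(&mut self) -> CoroState<(), Self::Return>;
}

// The historical GenFuture shape: yield maps to Pending, return to Ready.
struct GenFuture<C>(C);

impl<C: MiniCoroutine + Unpin> Future for GenFuture<C> {
    type Output = C::Return;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        match self.0.resume() {
            CoroState::Yielded(()) => Poll::Pending,
            CoroState::Complete(r) => Poll::Ready(r),
        }
    }
}

// A coroutine that "yields" twice, then completes with 42.
struct CountDown(u32);

impl MiniCoroutine for CountDown {
    type Return = u32;
    fn resume(&mut self) -> CoroState<(), u32> {
        if self.0 > 0 {
            self.0 -= 1;
            CoroState::Yielded(())
        } else {
            CoroState::Complete(42)
        }
    }
}

// Minimal no-op waker so the future can be polled without an executor.
fn noop_waker() -> Waker {
    const VTABLE: RawWakerVTable = RawWakerVTable::new(
        |_| RawWaker::new(std::ptr::null(), &VTABLE),
        |_| {},
        |_| {},
        |_| {},
    );
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = GenFuture(CountDown(2));
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Pending));
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Pending));
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Ready(42)));
}
```

The real wrapper also threaded the Waker into the coroutine; this sketch keeps only the yield/Pending and return/Ready correspondence.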

The Stackless Architecture Consequence

Because Rust async uses stackless coroutines — meaning each suspended future stores only the variables it actually needs, not an entire call stack — you can have thousands of concurrent futures in a single thread without proportional memory cost. A Tokio application handling 10,000 simultaneous connections doesn’t maintain 10,000 stacks. Each spawned task is a single heap allocation, sized exactly to its future’s state machine. This is the zero-cost abstraction claim made concrete: you pay for what you suspend, nothing else.


Rust Generator on Stable: Patterns That Don’t Need Nightly

Here’s the uncomfortable reality: the #![feature(coroutines)] gate has been open since eRFC 2033 landed in 2017, and as of 2025 it’s still not stable. The feature got renamed (generators became coroutines), was partially redesigned, and its iterator-shaped subset was split out into the gen block proposal under RFC 3513 — yet the stabilization timeline remains “when it’s ready.” If you’re shipping production Rust and need generator-like behavior today, you work around the gate.

// Stable pattern 1: std::iter::from_fn
// Generates an infinite Fibonacci sequence without nightly
let mut state = (0u64, 1u64);
let fibs = std::iter::from_fn(move || {
 let next = state.0 + state.1;
 state = (state.1, next);
 Some(state.0)
});

// Stable pattern 2: genawaiter crate (0.99.x)
// Provides gen!() macro that emulates generator syntax on stable
use genawaiter::{sync::gen, yield_};

let mut generator = gen!({
 yield_!(1u32);
 yield_!(2u32);
 yield_!(3u32);
});

// Stable pattern 3: gen blocks (RFC 3513)
// Available in nightly as of late 2024; the `gen` keyword is reserved in the
// 2024 edition, with stabilization still pending as of 2025.
// gen { yield 1; yield 2; } → implements Iterator directly

std::iter::from_fn is the zero-dependency stable answer for simple lazy sequences — you manually maintain state in a captured variable. It works, it’s readable, and it compiles on stable going back years. The genawaiter crate goes further: it provides yield_!() macro syntax that feels like real generators and works on stable Rust by implementing the state machine explicitly. The gen {} block from RFC 3513 is the official future — it targets the Iterator trait directly and sidesteps the coroutine/generator naming debate entirely.
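For comparison, here is what the hand-written equivalent of gen { yield 1; yield 2; yield 3; } looks like as an explicit state machine on stable today — a sketch, with Steps as an illustrative name:

```rust
// A hand-written state machine implementing Iterator on stable Rust,
// morally equivalent to `gen { yield 1; yield 2; yield 3; }`.
enum Steps {
    One,
    Two,
    Three,
    Done,
}

impl Iterator for Steps {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        // Each arm "resumes" at one yield point and advances the state.
        match *self {
            Steps::One => { *self = Steps::Two; Some(1) }
            Steps::Two => { *self = Steps::Three; Some(2) }
            Steps::Three => { *self = Steps::Done; Some(3) }
            Steps::Done => None,
        }
    }
}

fn main() {
    let v: Vec<u32> = Steps::One.collect();
    assert_eq!(v, vec![1, 2, 3]);
}
```

This is the verbose end of the trade-off spectrum: zero dependencies and zero magic, at the cost of writing the enum and match yourself.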

The Hidden Cost: How Generator State Size Grows with Nesting

The state machine size problem is real and it bites in production. Tweede golf (a Dutch embedded systems consultancy) published measurements in 2024 showing that a moderately complex async function chain in an embedded Rust application generated a future struct exceeding 400 KB. That’s a single future. On a microcontroller with 512 KB of RAM total, that’s not a performance problem — it’s a won’t-compile problem.

The mechanism is straightforward once you understand enum layout. A Rust enum is sized to its largest variant. If you have an async function that calls three other async functions sequentially and holds local state between each call, the compiler generates an enum whose largest variant contains the embedded child future plus all the live locals. Those nested futures are themselves enums sized to their largest variants. The sizes compound through every level of nesting, and known rustc layout inefficiencies can make deep chains grow even faster than the sum of their parts. Deeply nested async call chains can produce state machines that dwarf the actual code doing work.

The fix is Box::pin(): boxing a future puts it on the heap and stores only a pointer in the parent’s state machine. A Pin<Box<dyn Future<Output = T>>> costs one heap allocation and a pointer indirection per boxed future, but the parent’s state machine size becomes constant regardless of what’s inside. For embedded targets or size-sensitive code, this trade-off is often worth it. For throughput-sensitive server code, it’s sometimes a regression. Profile before deciding.
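The boxing trade-off is easy to measure on stable Rust — a sketch, with leaf and its 1 KB buffer as illustrative stand-ins for a large child future:

```rust
use std::hint::black_box;
use std::mem::size_of_val;

// A child future carrying ~1 KB of live state across an await point.
async fn leaf() {
    let buf = [0u8; 1024];
    std::future::ready(()).await;
    black_box(&buf);
}

// The child's entire state machine is embedded in the parent.
async fn inline_child() {
    leaf().await;
}

// Only a heap pointer to the child is stored in the parent.
async fn boxed_child() {
    Box::pin(leaf()).await;
}

fn main() {
    println!("inline: {} bytes", size_of_val(&inline_child())); // at least 1024
    println!("boxed:  {} bytes", size_of_val(&boxed_child()));  // roughly pointer-sized
}
```

One allocation buys a constant-size parent; whether that beats the embedded layout depends on how hot the code path is.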

Why Rust Generators Are Still Behind a Feature Gate in 2025

The RFC 2033 rationale document is public and worth reading if you want the full history. The short version: generators landed in nightly fast, then ran into the self-referential struct problem almost immediately. A generator that holds a reference to its own state — which happens naturally when you write let x = vec![1,2,3]; yield &x[0]; — creates a struct that references itself. This is undefined behavior in safe Rust unless you use Pin.

The Pin API was designed specifically to solve this. It guarantees that a pinned value won’t be moved in memory after pinning, which makes self-referential structs safe. But adding Pin to the generator/coroutine API surface introduced complexity that took years to work through in terms of ergonomics, interaction with async/await, and the semantics of what it means to move a not-yet-started generator vs a suspended one. The gen block proposal (RFC 3513) sidesteps some of this by targeting Iterator directly — iterators can’t self-reference in the same way because they don’t need to borrow across yield points. That’s why gen blocks are closer to stabilization than raw coroutines.

FAQ

What exactly is a rust generator yield and how does it differ from return?

A yield expression suspends a coroutine and produces an intermediate value without terminating it — the function retains its state and resumes from that exact point on the next resume() call. A return terminates the function entirely and discards all local state. The practical difference is that generators can produce an arbitrary number of values over their lifetime, while regular functions produce exactly one. In Rust’s coroutine model, yield also transfers control back to the executor rather than blocking the current thread.


Can I use rust generator syntax on stable Rust without nightly?

Not with native yield syntax — that requires #![feature(coroutines)], which is nightly-only. On stable, you have three options: std::iter::from_fn for simple lazy iterators, the genawaiter crate for generator-style syntax via macros, or structuring your logic as explicit state machines using enums and match. The gen {} block syntax from RFC 3513 already has its keyword reserved in the 2024 edition and is the closest of these features to stabilizing, so the situation is improving.

Why does my rust async function take so much memory?

Because the async state machine is an enum sized to its largest variant, and every local variable alive across an .await point is stored inside that enum. If you have nested async calls, those nested futures are embedded in the parent’s state machine — and their sizes compound. The standard fix is to Box::pin() large intermediate futures, which caps the state machine size at pointer width for that slot. Tweede golf’s 2024 embedded measurements showed single futures exceeding 400 KB in pathological cases; if you’re hitting stack overflows in async code, this is likely why.

What is CoroutineState and when does Complete fire?

CoroutineState is a two-variant enum returned by Coroutine::resume(). Yielded(value) means the coroutine hit a yield expression and produced an intermediate value — it can be resumed again. Complete(value) means the coroutine’s body returned and all local state has been dropped — resuming after Complete is a logic error and will panic. The mapping to async/await: Yielded corresponds to Poll::Pending and Complete corresponds to Poll::Ready.

How is a rust coroutine different from a Go goroutine?

Completely different model. Go goroutines are stackful and preemptively scheduled by the Go runtime — the runtime can interrupt a goroutine mid-execution and swap to another. Rust coroutines are stackless and cooperatively scheduled: they only suspend at explicit yield points, and the executor decides when to resume them. Goroutines consume a minimum of ~2 KB stack even if idle. Rust futures consume exactly the size of their state machine enum — which can be less than 100 bytes for simple cases. Goroutines are simpler to reason about; Rust coroutines are lower-overhead and give the borrow checker full visibility into suspension points.
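The sub-100-byte claim for simple futures is checkable on stable — tiny is an illustrative example:

```rust
use std::mem::size_of_val;

// A minimal future: one await point, no state held across it.
async fn tiny() {
    std::future::ready(()).await;
}

fn main() {
    // The whole state machine is a discriminant plus the near-zero-sized
    // ready() future — a handful of bytes, versus a goroutine's ~2 KB stack.
    println!("{} bytes", size_of_val(&tiny()));
}
```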

What is the genawaiter crate and is it production-ready?

Genawaiter is a stable-Rust library that provides generator and async generator functionality without nightly. It builds its state machines on top of stable async/await machinery and provides gen!() macro syntax for synchronous generators, with async variants as well. The crate is actively maintained as of 2025 and used in production by several projects in the Rust ecosystem. The main trade-off versus native generators is the macro overhead and slightly less readable desugaring in compiler error messages. For projects that can’t use nightly, it’s the most ergonomic option currently available.


krun.pro — opinionated systems writing, no filler

Source Category: Rust Engineering