Understanding Rust Memory Layout
Memory layout is a fundamental concept that affects both the performance and correctness of your Rust programs. To truly understand Rust’s memory layout, we first need to grasp the concept of data alignment and how modern processors access memory.
Part 1: How Processors Actually Access Memory
The Reality of Memory Access
While programmers often think of memory as a simple array of bytes that can be accessed one at a time, modern processors tell a different story. Processors read and write memory in chunks - typically 2, 4, 8, 16, or even 32 bytes at a time. This chunk size is called the processor’s memory access granularity.
Consider this example on a processor with 4-byte memory access granularity:
// Reading 4 bytes from an aligned address (0x0000)
// Requires: 1 memory access
// Reading 4 bytes from an unaligned address (0x0001)
// Requires: 2 memory accesses + bit shifting operations
When data isn’t aligned to the processor’s memory access boundaries, the processor must:
- Read the first chunk containing part of the data
- Read the second chunk containing the rest
- Extract and combine the relevant bytes
This extra work can cause:
- Performance degradation (up to 4,610% slower for unaligned 64-bit floating-point access on some PowerPC processors!)
- Exceptions on processors that don’t support unaligned access (like the original 68000)
- Loss of atomicity: an unaligned atomic operation may silently stop being atomic
- Faults or incorrect results with SIMD instructions that expect aligned operands
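The access counts in the example above can be reproduced with a little arithmetic. This is a minimal sketch rather than a model of any real processor: chunks_touched is a hypothetical helper, and it assumes the granularity is a power of two.

```rust
/// Counts how many `granularity`-sized chunks a `len`-byte access at `addr` overlaps.
/// Hypothetical helper for illustration; real hardware behavior is more nuanced.
fn chunks_touched(addr: usize, len: usize, granularity: usize) -> usize {
    assert!(len > 0 && granularity.is_power_of_two());
    let first_chunk = addr / granularity;
    let last_chunk = (addr + len - 1) / granularity;
    last_chunk - first_chunk + 1
}

fn main() {
    // Aligned 4-byte read at 0x0000: stays inside a single chunk.
    println!("aligned:   {} access(es)", chunks_touched(0x0000, 4, 4));
    // Unaligned 4-byte read at 0x0001: straddles two chunks.
    println!("unaligned: {} access(es)", chunks_touched(0x0001, 4, 4));
}
```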
Part 2: Alignment and Size in Rust
Key Concepts
Every value in Rust has two critical properties:
- Size: The number of bytes the value occupies, including any padding. Always a multiple of the alignment.
- Alignment: Specifies which memory addresses are valid for storing the value. Must be a power of 2 (1, 2, 4, 8, 16, etc.).
A value with alignment n can only be stored at addresses that are multiples of n. For example:
- u8 has alignment 1: can be stored at any address
- u16 has alignment 2: can only be stored at even addresses
- u32 has alignment 4: can only be stored at addresses divisible by 4
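The "multiple of n" rule can be checked directly against real addresses. The sketch below uses std::mem::align_of; is_aligned_for is a hypothetical helper name introduced only for this illustration.

```rust
use std::mem::align_of;

/// Returns true if `addr` is a valid address for a value of type `T`.
/// Hypothetical helper, not part of the standard library.
fn is_aligned_for<T>(addr: usize) -> bool {
    addr % align_of::<T>() == 0
}

fn main() {
    let value: u32 = 7;
    let addr = &value as *const u32 as usize;
    // A reference handed out by safe Rust always satisfies the alignment rule.
    assert!(is_aligned_for::<u32>(addr));
    println!("u32 lives at {:#x}, align_of::<u32>() = {}", addr, align_of::<u32>());

    // Alignment 1 accepts every address; larger alignments reject some.
    println!("u8 at address 3:  {}", is_aligned_for::<u8>(3));
    println!("u16 at address 3: {}", is_aligned_for::<u16>(3));
}
```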
Primitive Type Layouts
Here are the guaranteed sizes for Rust’s primitive types:
Type | Size (bytes) | Typical Alignment |
---|---|---|
bool, u8, i8 | 1 | 1 |
u16, i16 | 2 | 2 |
u32, i32, f32 | 4 | 4 |
u64, i64, f64 | 8 | 8* |
u128, i128 | 16 | 8 or 16* |
usize , isize | platform-dependent | platform-dependent |
char | 4 | 4 |
*Note: On 32-bit platforms, 64-bit types may only have 4-byte alignment. On x86 targets, older Rust compilers (before the 1.77/1.78 releases) gave u128/i128 8-byte alignment despite their 16-byte size; newer compilers use 16-byte alignment there.
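Because the alignment column is only "typical", it is worth printing the real values for the target you build for. The following sketch does that with size_of and align_of; report is a hypothetical helper, and the alignments it prints depend on your platform and compiler version.

```rust
use std::mem::{align_of, size_of};

/// Prints one table row for type `T` and returns (size, alignment).
/// Hypothetical helper for illustration.
fn report<T>(name: &str) -> (usize, usize) {
    let (size, align) = (size_of::<T>(), align_of::<T>());
    println!("{:<6} size = {:>2}, align = {:>2}", name, size, align);
    (size, align)
}

fn main() {
    report::<bool>("bool");
    report::<u16>("u16");
    report::<u32>("u32");
    report::<u64>("u64");
    report::<u128>("u128");
    report::<usize>("usize");
    report::<char>("char");
}
```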
Part 3: Struct Layout in Rust
The Default: #[repr(Rust)]
By default, Rust structs use the Rust representation, which provides minimal guarantees:
- Fields are properly aligned
- Fields don’t overlap
- The struct’s alignment ≥ maximum alignment of its fields
Important: The Rust representation does NOT guarantee:
- Fields will be laid out in declaration order
- Consistent layout between compilations
- Specific padding patterns
Example: Understanding Padding
Let’s analyze a struct layout step by step:
// Using default #[repr(Rust)]
struct Example {
    a: u8,  // size: 1, align: 1
    b: u64, // size: 8, align: 8
    c: u16, // size: 2, align: 2
}
While we can’t predict the exact layout with #[repr(Rust)], one possible layout is:
Offset | Field | Size | Notes
-------|-------|------|-------
0 | a | 1 | u8 field
1-7 | pad | 7 | Padding for u64 alignment
8 | b | 8 | u64 field (must start at multiple of 8)
16 | c | 2 | u16 field
18-23 | pad | 6 | Padding to make total size multiple of alignment
Total size: 24 bytes (multiple of 8, the struct's alignment)
But Rust might reorder fields for optimization:
Offset | Field | Size | Notes
-------|-------|------|-------
0 | b | 8 | u64 field
8 | c | 2 | u16 field
10 | a | 1 | u8 field
11-15 | pad | 5 | Final padding
Total size: 16 bytes (more efficient!)
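You can observe which layout your compiler actually picked by measuring field addresses. This is a sketch only: field_offsets is a hypothetical helper, and the printed numbers are whatever the compiler chose for this build, so treat them as observations rather than guarantees.

```rust
use std::mem::{align_of, size_of};

// Default representation: the compiler is free to reorder these fields.
#[derive(Default)]
struct Example {
    a: u8,
    b: u64,
    c: u16,
}

/// Byte offsets of (a, b, c) inside one `Example`, measured from field addresses.
fn field_offsets(e: &Example) -> (usize, usize, usize) {
    let base = e as *const Example as usize;
    (
        &e.a as *const u8 as usize - base,
        &e.b as *const u64 as usize - base,
        &e.c as *const u16 as usize - base,
    )
}

fn main() {
    let e = Example::default();
    let (a, b, c) = field_offsets(&e);
    // The concrete numbers are up to the compiler and may change between versions.
    println!("offsets: a = {}, b = {}, c = {}", a, b, c);
    println!("size = {}, align = {}", size_of::<Example>(), align_of::<Example>());
}
```

Whatever order is chosen, the guarantees listed earlier still hold: each field sits at a multiple of its own alignment, and no two fields overlap.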
Guaranteed Layouts: #[repr(C)]
For predictable layouts, use #[repr(C)]:
#[repr(C)]
struct Predictable {
    a: u8,  // offset: 0
    b: u64, // offset: 8 (after 7 bytes padding)
    c: u16, // offset: 16
}
// Total size: 24 bytes (includes final padding)
With #[repr(C)]:
- Fields are laid out in declaration order
- Alignment follows C ABI rules
- Layout is stable and predictable
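Since the #[repr(C)] layout is fixed, the offsets annotated above can be asserted in code. The sketch below assumes a target where u64 has 8-byte alignment (typical for 64-bit platforms); offsets is a hypothetical helper that measures field addresses.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[derive(Default)]
struct Predictable {
    a: u8,
    b: u64,
    c: u16,
}

/// Byte offsets of a, b and c, measured from field addresses.
fn offsets(p: &Predictable) -> [usize; 3] {
    let base = p as *const Predictable as usize;
    [
        &p.a as *const u8 as usize - base,
        &p.b as *const u64 as usize - base,
        &p.c as *const u16 as usize - base,
    ]
}

fn main() {
    let p = Predictable::default();
    // Declaration order is preserved, so these values follow from the C rules.
    assert_eq!(offsets(&p), [0, 8, 16]);
    assert_eq!(size_of::<Predictable>(), 24);
    assert_eq!(align_of::<Predictable>(), 8);
    println!("offsets = {:?}", offsets(&p));
}
```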
Layout Calculation Algorithm
Here’s how a #[repr(C)] layout is calculated:
// Pseudocode for #[repr(C)] layout
struct_alignment = max(field_alignments);
current_offset = 0;
for field in fields_in_declaration_order {
    // Add padding if necessary
    padding = (field.align - (current_offset % field.align)) % field.align;
    current_offset += padding;
    field.offset = current_offset;
    current_offset += field.size;
}
// Add final padding
final_padding = (struct_alignment - (current_offset % struct_alignment)) % struct_alignment;
struct_size = current_offset + final_padding;
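The pseudocode translates into runnable Rust fairly directly. This is a sketch of the same steps, not the compiler's implementation: Field, StructLayout, round_up and repr_c_layout are hypothetical names, and alignments are assumed to be powers of two.

```rust
/// Size and alignment of one field, in bytes.
#[derive(Clone, Copy)]
struct Field {
    size: usize,
    align: usize,
}

/// Per-field offsets plus the overall size and alignment of the struct.
#[derive(Debug, PartialEq)]
struct StructLayout {
    offsets: Vec<usize>,
    size: usize,
    align: usize,
}

/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Lays out fields in declaration order, following the steps of the pseudocode.
fn repr_c_layout(fields: &[Field]) -> StructLayout {
    // An empty struct falls back to alignment 1.
    let align = fields.iter().map(|f| f.align).max().unwrap_or(1);
    let mut offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for f in fields {
        offset = round_up(offset, f.align); // padding before the field
        offsets.push(offset);
        offset += f.size;
    }
    // Final padding makes the size a multiple of the struct's alignment.
    StructLayout { offsets, size: round_up(offset, align), align }
}

fn main() {
    // u8, u64, u16 in declaration order, as in the Predictable struct.
    let fields = [
        Field { size: 1, align: 1 },
        Field { size: 8, align: 8 },
        Field { size: 2, align: 2 },
    ];
    println!("{:?}", repr_c_layout(&fields));
}
```

Rounding up with a bit mask is equivalent to the modulo form in the pseudocode as long as the alignment is a power of two, which Rust guarantees.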
Part 4: Optimization Strategies
1. Field Ordering
Order fields by decreasing alignment to minimize padding:
// Poor layout: 24 bytes
#[repr(C)]
struct Inefficient {
    a: u8,  // 1 byte + 7 padding
    b: u64, // 8 bytes
    c: u8,  // 1 byte + 7 padding
}
// Better layout: 16 bytes
#[repr(C)]
struct Efficient {
    b: u64, // 8 bytes
    a: u8,  // 1 byte
    c: u8,  // 1 byte + 6 padding
}
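The savings are easy to confirm with size_of. The sketch below assumes a target where u64 is 8-byte aligned; padding_bytes is a hypothetical helper that subtracts the summed field sizes from the struct size.

```rust
use std::mem::size_of;

#[repr(C)]
#[allow(dead_code)]
struct Inefficient {
    a: u8,
    b: u64,
    c: u8,
}

#[repr(C)]
#[allow(dead_code)]
struct Efficient {
    b: u64,
    a: u8,
    c: u8,
}

/// Bytes of padding in `T`, given the summed size of its fields.
fn padding_bytes<T>(field_bytes: usize) -> usize {
    size_of::<T>() - field_bytes
}

fn main() {
    // Both structs hold the same fields: u8 + u64 + u8.
    let field_bytes = size_of::<u8>() + size_of::<u64>() + size_of::<u8>();
    println!(
        "Inefficient: {} bytes, {} of them padding",
        size_of::<Inefficient>(),
        padding_bytes::<Inefficient>(field_bytes)
    );
    println!(
        "Efficient:   {} bytes, {} of them padding",
        size_of::<Efficient>(),
        padding_bytes::<Efficient>(field_bytes)
    );
}
```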
2. Alignment Modifiers
Use #[repr(packed)] to eliminate padding (but beware of performance costs):
#[repr(C, packed)]
struct Packed {
    a: u8,  // offset: 0
    b: u64, // offset: 1 (unaligned!)
    c: u16, // offset: 9
}
// Size: 11 bytes, but b is misaligned: reads can be slow, and current compilers reject references to it (&packed.b)
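Reading a packed field safely means never forming a reference to it. One common pattern, sketched here under the assumption that a plain by-value read is all you need, is to take a raw pointer with ptr::addr_of! and read through it with read_unaligned; read_b is a hypothetical helper.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

#[repr(C, packed)]
#[allow(dead_code)]
struct Packed {
    a: u8,
    b: u64,
    c: u16,
}

/// Reads `b` without ever creating a `&u64` to the misaligned field.
fn read_b(p: &Packed) -> u64 {
    // SAFETY: the pointer comes from a live `Packed`, and `read_unaligned`
    // places no alignment requirement on the pointer.
    unsafe { ptr::addr_of!(p.b).read_unaligned() }
}

fn main() {
    let p = Packed { a: 1, b: 0x1122_3344_5566_7788, c: 2 };
    // Copying a packed field by value is also fine; no reference is created.
    let b_copy = p.b;
    assert_eq!(read_b(&p), b_copy);
    println!(
        "size = {}, align = {}, b = {:#x}",
        size_of::<Packed>(),
        align_of::<Packed>(),
        b_copy
    );
}
```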
Use #[repr(align(n))] to increase alignment:
#[repr(C, align(16))]
struct CacheAligned {
    data: u64,
}
// Size: 16 bytes (size is rounded up to the 16-byte alignment; matching a real cache line usually means align(64))
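Raising the alignment also raises the size, because size must stay a multiple of alignment. The sketch below checks this for two values of n. The 64-byte figure is an assumption about common cache-line sizes, not something every processor shares, and Aligned16 and CacheLine are illustrative names.

```rust
use std::mem::{align_of, size_of};

#[repr(C, align(16))]
#[allow(dead_code)]
struct Aligned16 {
    data: u64,
}

// Assumption: 64-byte cache lines, which is common but not universal.
#[repr(C, align(64))]
#[allow(dead_code)]
struct CacheLine {
    data: u64,
}

fn main() {
    // 8 bytes of data, padded up to the requested alignment.
    println!("Aligned16: size = {}, align = {}", size_of::<Aligned16>(), align_of::<Aligned16>());
    println!("CacheLine: size = {}, align = {}", size_of::<CacheLine>(), align_of::<CacheLine>());

    // Array elements are spaced by size, so each one starts on its own 64-byte boundary.
    let lines = [CacheLine { data: 0 }, CacheLine { data: 1 }];
    let first = &lines[0] as *const CacheLine as usize;
    let second = &lines[1] as *const CacheLine as usize;
    assert_eq!(second - first, 64);
    assert_eq!(first % 64, 0);
}
```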
Part 5: Special Considerations
Zero-Sized Types (ZSTs)
Types with no data have zero size but maintain valid alignment:
struct Empty;
assert_eq!(std::mem::size_of::<Empty>(), 0);
assert_eq!(std::mem::align_of::<Empty>(), 1);
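A zero-sized type does not have to have alignment 1. As a small sketch, a struct whose only field is a zero-length array inherits that array's element alignment while still occupying no bytes; AlignMarker is a hypothetical name.

```rust
use std::mem::{align_of, size_of};

/// Zero-sized, but aligned like a u64 because of its zero-length array field.
#[allow(dead_code)]
struct AlignMarker {
    _align: [u64; 0],
}

fn main() {
    // The unit type is the simplest ZST.
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(align_of::<()>(), 1);

    // No bytes of storage, yet the alignment requirement is still there.
    assert_eq!(size_of::<AlignMarker>(), 0);
    assert_eq!(align_of::<AlignMarker>(), align_of::<u64>());

    // A Vec of ZSTs tracks only its length; the elements occupy no memory.
    let units = vec![(); 1_000];
    println!("{} unit values stored", units.len());
}
```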
Enums
Enum layout depends on the representation:
// Size depends on discriminant + largest variant
enum Option<T> {
    Some(T),
    None,
}
// Niche optimization: Option<&T> same size as &T
assert_eq!(
    std::mem::size_of::<Option<&u32>>(),
    std::mem::size_of::<&u32>()
);
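The niche optimization is not limited to references. The following sketch contrasts a few cases using only standard library types plus Color, a hypothetical enum added for illustration. The exact size of Option<u32> is left unasserted because it is chosen by the compiler.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

// An explicit representation fixes the discriminant to a single byte.
#[repr(u8)]
#[allow(dead_code)]
enum Color {
    Red,
    Green,
    Blue,
}

fn main() {
    assert_eq!(size_of::<Color>(), 1);

    // NonZeroU32 can never be 0, so None can reuse that bit pattern.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());

    // Box is never null, so Option<Box<T>> is also pointer-sized.
    assert_eq!(size_of::<Option<Box<u64>>>(), size_of::<Box<u64>>());

    // Every bit pattern of u32 is a valid value, so Option<u32> needs extra room.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());
    println!("Option<u32> takes {} bytes", size_of::<Option<u32>>());
}
```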
Platform Dependencies
Remember that usize/isize and pointer sizes vary:
- 32-bit platforms: 4 bytes
- 64-bit platforms: 8 bytes
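Code that depends on the pointer width can query it instead of assuming it. This sketch uses usize::BITS and the target_pointer_width configuration option; pointer_bits is a hypothetical helper. It also shows that references to slices are "fat" pointers spanning two machine words.

```rust
use std::mem::size_of;

/// Pointer width in bits for the target this program was compiled for.
fn pointer_bits() -> u32 {
    usize::BITS
}

fn main() {
    // usize and thin pointers always share the same size.
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
    assert_eq!(size_of::<usize>(), size_of::<&u8>());

    // A slice reference carries a pointer plus a length.
    assert_eq!(size_of::<&[u8]>(), 2 * size_of::<usize>());

    if cfg!(target_pointer_width = "64") {
        println!("64-bit target: usize is {} bytes", size_of::<usize>());
    } else {
        println!("{}-bit target: usize is {} bytes", pointer_bits(), size_of::<usize>());
    }
}
```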
Part 6: Practical Example
Let’s see everything in action:
use std::mem::{size_of, align_of};
#[repr(Rust)] // Default
struct RustLayout {
    a: u8,
    b: u64,
    c: u16,
}
#[repr(C)]
struct CLayout {
    a: u8,
    b: u64,
    c: u16,
}
#[repr(C, packed)]
struct PackedLayout {
    a: u8,
    b: u64,
    c: u16,
}
fn main() {
    println!("RustLayout: size = {}, align = {}",
        size_of::<RustLayout>(),  // Compiler-dependent
        align_of::<RustLayout>()); // 8
    println!("CLayout: size = {}, align = {}",
        size_of::<CLayout>(),  // 24
        align_of::<CLayout>()); // 8
    println!("PackedLayout: size = {}, align = {}",
        size_of::<PackedLayout>(),  // 11
        align_of::<PackedLayout>()); // 1
}
Key Takeaways
- Alignment matters for performance: Unaligned access can be orders of magnitude slower
- Default Rust layout is optimized but unpredictable: Use #[repr(C)] when you need guarantees
- Size includes padding: A struct’s size is always a multiple of its alignment
- Field order affects memory usage: Place larger-aligned fields first in #[repr(C)] structs
- Different representations serve different purposes:
  - #[repr(Rust)]: Best performance, compiler can optimize
  - #[repr(C)]: FFI compatibility, predictable layout
  - #[repr(packed)]: Minimal size, but potentially slow
  - #[repr(align(n))]: Cache-line alignment, SIMD operations
Understanding memory layout isn’t just academic—it directly impacts your program’s performance, correctness, and ability to interface with other languages. Whether you’re optimizing hot code paths, working with FFI, or building embedded systems, these concepts are essential tools in your Rust toolbox.