Memory Consistency Models | ChipFoundryServices

Memory consistency models are formal specifications that define the order in which memory operations (loads and stores) performed by one processor become visible to other processors in a shared-memory multiprocessor system. Choosing the right consistency model is critical because it determines both the correctness guarantees available to programmers and the optimization opportunities open to hardware and compilers.

Sequential Consistency (SC):

Definition: the result of any execution is the same as if operations of all processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program — the strongest and most intuitive model
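Implications: all processors observe stores in the same total order, and no load or store can appear to be reordered before a prior load or store from the same processor; this severely limits hardware optimization
Performance Cost: a straightforward SC implementation cannot let loads bypass buffered stores, combine writes, or reorder memory accesses in any visible way; by a rough estimate, modern processors would lose 30-50% of their performance under strict SC, although the actual cost depends on the workload and on how aggressively the hardware speculates to hide the ordering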
Historical Significance: defined by Lamport (1979), serves as the reference model against which all relaxed models are compared
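
The SC guarantee can be illustrated with the classic store-buffering litmus test. The following is a minimal sketch, assuming C++11 atomics with memory_order_seq_cst as a stand-in for an SC machine; the function names sb_both_zero_seq_cst and count_forbidden are hypothetical and chosen only for this example. Each thread stores to one variable and then loads the other. Under SC at least one thread must observe the other's store, so the outcome r1 == 0 and r2 == 0 should never appear.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering litmus test once.
// Returns true if both loads observed 0, which SC forbids.
bool sb_both_zero_seq_cst() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();

    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome occurred.
int count_forbidden(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        if (sb_both_zero_seq_cst()) ++forbidden;
    }
    return forbidden;
}
```

A caller could check the guarantee with assert(count_forbidden(200) == 0). Replacing memory_order_seq_cst with memory_order_relaxed removes the guarantee, and the forbidden outcome may then show up on real hardware, including x86.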

Total Store Order (TSO):

Relaxation: allows a processor's own stores to be buffered and read by subsequent loads before becoming globally visible — store-to-load reordering is permitted (FIFO store buffer)
x86 Implementation: Intel and AMD processors implement TSO (with minor exceptions) — stores are ordered with respect to each other and loads see the most recent store from the local store buffer
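Store Buffer Forwarding: a load can read a value from the local store buffer before it is written to cache, so a processor sees its own stores early; together with loads bypassing buffered stores to other addresses, this store-to-load relaxation is the only reordering permitted under TSO
Programming Impact: most intuitive algorithms work correctly under TSO without explicit fences; only algorithms that rely on store-to-load ordering (like Dekker's algorithm) need an MFENCE or a LOCK-prefixed instruction between the store and the following load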
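
The store-to-load case that still needs a fence under TSO can be sketched as the entry handshake of a Dekker-style protocol. This is a simplified illustration, not a complete lock; the names try_enter and count_double_entries are hypothetical. It assumes that std::atomic_thread_fence(std::memory_order_seq_cst) is lowered to MFENCE or a locked instruction on x86, which is typical for current compilers but is an implementation detail.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Raise my flag, then check the other thread's flag.
// Returns true if this thread may enter (the other flag was still clear).
bool try_enter(std::atomic<int>& mine, std::atomic<int>& other) {
    mine.store(1, std::memory_order_relaxed);
    // Full fence: the load below may not be satisfied before the store
    // above is globally visible. Without it, a store buffer could let
    // both threads read 0.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return other.load(std::memory_order_relaxed) == 0;
}

// Races two threads through the handshake `rounds` times and counts the
// rounds in which both threads were allowed in.
int count_double_entries(int rounds) {
    int doubles = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> a{0}, b{0};
        bool in0 = false, in1 = false;
        std::thread t0([&] { in0 = try_enter(a, b); });
        std::thread t1([&] { in1 = try_enter(b, a); });
        t0.join();
        t1.join();
        if (in0 && in1) ++doubles;
    }
    return doubles;
}
```

With the fence in place, assert(count_double_entries(200) == 0) should hold. Both threads may still be refused in the same round, which is why a full Dekker implementation adds a turn variable and a retry loop.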

Relaxed Consistency Models:

Weak Ordering: divides memory operations into ordinary and synchronization operations — ordinary operations can be freely reordered, synchronization operations enforce ordering barriers
Release Consistency (RC): refines weak ordering by distinguishing acquire (lock) and release (unlock) operations — acquires prevent subsequent operations from moving before them, releases prevent prior operations from moving after them
ARM and POWER Models: extremely relaxed; they allow store-to-store, load-to-load, and load-to-store reordering in addition to store-to-load, and require explicit barrier instructions (dmb on ARM; lwsync or sync on POWER) wherever ordering is needed
Alpha Model: historically the most relaxed; it even allowed dependent loads to be reordered (for example through value speculation), requiring explicit memory barriers between a pointer load and its dereference
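
The acquire (lock) and release (unlock) pairing from release consistency can be sketched as a small spinlock. This is an illustrative example assuming C++11 atomics; SpinLock and locked_sum are hypothetical names, and a production lock would add backoff and fairness. The acquire on lock() keeps the critical section from moving above the lock, and the release on unlock() keeps it from moving below.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
    std::atomic<bool> locked_{false};

public:
    void lock() {
        // Acquire: accesses after a successful exchange cannot move above it.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: accesses before this store cannot move below it.
        locked_.store(false, std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads,
// relying only on the spinlock for mutual exclusion and visibility.
long locked_sum(int threads, int per_thread) {
    SpinLock lock;
    long total = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                lock.lock();
                ++total;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return total;
}
```

For example, assert(locked_sum(4, 10000) == 40000) should hold on any conforming implementation, including weakly ordered ARM and POWER machines, because the compiler emits whatever barriers the target needs for the acquire and release operations.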

Memory Fences and Barriers:

Full Fence (MFENCE on x86): prevents all reordering across the fence — loads and stores before the fence complete before any loads or stores after the fence begin
Store Fence (SFENCE): ensures all prior stores are globally visible before subsequent stores — used with non-temporal stores that bypass cache
Load Fence (LFENCE): ensures all prior loads complete before subsequent loads execute; rarely needed for ordering on x86 (TSO already orders loads), whereas the equivalent load barriers on ARM and POWER are critical
Acquire/Release Semantics: one-directional barriers; an acquire keeps later operations from moving up above it, and a release keeps earlier operations from moving down below it; sufficient for most synchronization patterns and cheaper than full fences
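
A common pattern that needs only acquire/release rather than a full fence is message passing: one thread writes ordinary data and then sets a flag, and another waits for the flag and then reads the data. The following is a minimal sketch assuming C++11 atomics; message_passing_once and the value 42 are arbitrary choices for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Passes one value from a producer thread to a consumer thread.
// Returns the value the consumer observed.
int message_passing_once() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // synchronization flag
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release: the write to payload cannot move below this store.
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // Acquire: the read of payload cannot move above this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });
    producer.join();
    consumer.join();
    return seen;
}
```

Here assert(message_passing_once() == 42) should always hold. If both flag accesses used memory_order_relaxed instead, the access to payload would become a data race and the result would be undefined, even though it would often appear to work on x86.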

Language-Level Memory Models:

C++11/C11 Memory Model: defines memory_order_seq_cst (default), memory_order_acquire, memory_order_release, memory_order_relaxed, and memory_order_acq_rel — portable across architectures
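Java Memory Model (JMM): a volatile read acts as an acquire and a volatile write as a release, and volatile accesses are additionally sequentially consistent with one another; final fields are safely published once the constructor completes, provided the this reference does not escape during construction; the happens-before relation defines visibility guarantees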
Compiler Barriers: prevent compiler reordering without emitting hardware fence instructions — asm volatile("" ::: "memory") in GCC, std::atomic_signal_fence in C++
Data Race Freedom (DRF): if a program is correctly synchronized (no data races), it behaves as if executed under sequential consistency — the DRF guarantee is the foundation of modern language memory models
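
At the weak end of the C++ ordering options, memory_order_relaxed guarantees atomicity but no ordering with respect to other memory locations. A typical use is a statistics counter, sketched below under the assumption that only the final total matters; relaxed_count is a hypothetical name. The program is still free of data races because every access to the counter is atomic, and joining the threads makes all increments visible before the final read.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts events from several threads with relaxed atomic increments.
long relaxed_count(int threads, int per_thread) {
    std::atomic<long> hits{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic read-modify-write; imposes no ordering on other data.
                hits.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with the end of each thread, so every
    // increment is visible to the load below.
    for (auto& th : pool) th.join();
    return hits.load(std::memory_order_relaxed);
}
```

For example, assert(relaxed_count(4, 10000) == 40000) should hold. Relaxed ordering would not be appropriate if another thread used the counter's value to decide when other data is safe to read; that pattern needs release and acquire.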

Correctly understanding memory consistency is essential for writing portable parallel code — a program that works on x86 (TSO) may fail on ARM (relaxed) if it relies on implicit ordering guarantees that don't exist on weaker architectures.
