NUMA-Aware Programming

NUMA-Aware Programming is the practice of allocating and accessing memory in ways that minimize cross-NUMA-node memory accesses — exploiting the topology of Non-Uniform Memory Access systems to reduce memory latency and increase bandwidth.

NUMA Topology

Detecting NUMA Topology

numactl --hardware     # Show nodes, CPUs per node, memory
lscpu | grep NUMA      # NUMA node count
numastat               # Per-node allocation hit/miss counters; -p <pid> shows one process's memory per node
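
The same information can also be read programmatically. The following is a minimal sketch, assuming a Linux system that exposes NUMA nodes under /sys/devices/system/node and numbers them contiguously from 0 (common, but not guaranteed). The function names count_cpus_in_list, count_numa_nodes, and print_numa_topology are hypothetical helpers introduced here for illustration, not part of any library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Count the CPUs named in a sysfs cpulist string such as "0-3,8-11". */
int count_cpus_in_list(const char *list)
{
    int count = 0;
    const char *p = list;
    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p)
            break;                      /* no digits found: stop parsing */
        long last = first;
        if (*end == '-') {              /* range of the form "first-last" */
            p = end + 1;
            last = strtol(p, &end, 10);
            if (end == p)
                break;
        }
        if (last >= first)
            count += (int)(last - first + 1);
        p = (*end == ',') ? end + 1 : end;
    }
    return count;
}

/* Count NUMA nodes by probing nodeN/cpulist for N = 0, 1, 2, ...
 * Assumption: node IDs are contiguous from 0. Returns 0 if sysfs
 * exposes no NUMA information (for example, in some containers). */
int count_numa_nodes(void)
{
    int node = 0;
    for (;;) {
        char path[96];
        snprintf(path, sizeof path,
                 "/sys/devices/system/node/node%d/cpulist", node);
        FILE *f = fopen(path, "r");
        if (f == NULL)
            break;
        fclose(f);
        node++;
    }
    return node;
}

/* Print one line per node with its CPU count and CPU list. */
void print_numa_topology(void)
{
    int nodes = count_numa_nodes();
    for (int n = 0; n < nodes; n++) {
        char path[96], line[1024];
        snprintf(path, sizeof path,
                 "/sys/devices/system/node/node%d/cpulist", n);
        FILE *f = fopen(path, "r");
        if (f == NULL)
            continue;
        if (fgets(line, sizeof line, f) != NULL) {
            line[strcspn(line, "\n")] = '\0';   /* strip trailing newline */
            printf("node %d: %d CPUs (%s)\n",
                   n, count_cpus_in_list(line), line);
        }
        fclose(f);
    }
}
```

Calling print_numa_topology() from main prints one line per detected node. On a machine without NUMA information in sysfs it prints nothing, so the output of numactl --hardware remains the more reliable reference.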

Memory Allocation Policies

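#include <numa.h>      // libnuma; link with -lnuma
#include <stdlib.h>

// Check numa_available() != -1 before calling any other libnuma function.

// Default policy (first-touch): malloc typically only reserves virtual memory;
// each page is placed on the node of the thread that first writes to it
void* a = malloc(size);

// Explicit node allocation
void* b = numa_alloc_onnode(size, node_id);

// Interleave pages across all nodes (good for data shared by all threads)
void* c = numa_alloc_interleaved(size);

// Run the calling thread only on the CPUs of node_id
numa_run_on_node(node_id);

// Release: free() for malloc, numa_free() for numa_alloc_* memory
free(a);
numa_free(b, size);
numa_free(c, size);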

First-Touch Policy
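
Under the default first-touch policy described above, a page is placed on the node of the thread that first writes to it, so an array initialized by a single thread tends to end up entirely on that thread's node. The sketch below shows one common way to work with this: each worker thread initializes the slice it will later process. The names slice_t, touch_slice, and alloc_first_touch are hypothetical, introduced only for this illustration, and the sketch assumes the same threads (ideally pinned to fixed cores) later work on the same slices.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type: one contiguous slice of the array per worker. */
typedef struct {
    double *data;   /* start of this worker's slice */
    size_t  count;  /* number of elements in the slice */
} slice_t;

/* Each worker writes its own slice first, so under first-touch the backing
 * pages are placed on the node where that worker is running at the time. */
static void *touch_slice(void *arg)
{
    slice_t *s = arg;
    for (size_t i = 0; i < s->count; i++)
        s->data[i] = 0.0;
    return NULL;
}

/* Allocate n doubles and zero them using nthreads workers.
 * Returns NULL on failure; the caller releases the result with free(). */
double *alloc_first_touch(size_t n, int nthreads)
{
    if (n == 0 || nthreads <= 0)
        return NULL;

    size_t nt = (size_t)nthreads;
    /* For large blocks malloc typically only reserves address space;
     * physical pages are assigned when they are first written. */
    double    *data   = malloc(n * sizeof *data);
    pthread_t *tids   = malloc(nt * sizeof *tids);
    slice_t   *slices = malloc(nt * sizeof *slices);
    if (data == NULL || tids == NULL || slices == NULL) {
        free(data); free(tids); free(slices);
        return NULL;
    }

    size_t chunk = (n + nt - 1) / nt;   /* elements per worker, rounded up */
    size_t started = 0;
    for (size_t t = 0; t < nt; t++) {
        size_t begin = t * chunk;
        if (begin > n)
            begin = n;
        size_t end = begin + chunk;
        if (end > n)
            end = n;
        slices[t].data  = data + begin;
        slices[t].count = end - begin;
        if (pthread_create(&tids[t], NULL, touch_slice, &slices[t]) != 0)
            break;
        started++;
    }
    for (size_t t = 0; t < started; t++)
        pthread_join(tids[t], NULL);

    free(tids);
    free(slices);
    if (started < nt) {                 /* a worker failed to start */
        free(data);
        return NULL;
    }
    return data;
}
```

Placement happens per page, so slices that do not start on a page boundary share a page with their neighbor. For arrays much larger than a page this affects only the slice edges.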

Thread Pinning (CPU Affinity)

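#define _GNU_SOURCE    // must precede all includes; enables the CPU_* macros
#include <pthread.h>
#include <sched.h>

cpu_set_t cpuset;
CPU_ZERO(&cpuset);                 // start with an empty CPU mask
CPU_SET(core_id, &cpuset);         // allow only core_id
pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);  // returns 0 on success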
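
The pinning call above can be wrapped with basic error handling. This is a minimal sketch for Linux with glibc; pin_current_thread and first_allowed_core are hypothetical helper names introduced here, while pthread_setaffinity_np and pthread_getaffinity_np are the actual glibc calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* must be defined before any include */
#endif
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single core.
 * Returns 0 on success or an errno-style error code on failure. */
int pin_current_thread(int core_id)
{
    if (core_id < 0 || core_id >= CPU_SETSIZE)
        return EINVAL;

    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(core_id, &cpuset);
    return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
}

/* Return the lowest-numbered core the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
int first_allowed_core(void)
{
    cpu_set_t cpuset;
    if (pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset) != 0)
        return -1;
    for (int c = 0; c < CPU_SETSIZE; c++) {
        if (CPU_ISSET(c, &cpuset))
            return c;
    }
    return -1;
}
```

Pinning to a core that the process is not allowed to use (for example, because of a cgroup or container CPU limit) fails with an error code, so the return value is worth checking rather than assuming the pin took effect.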

NUMA Impact on MPI

NUMA-aware programming is an important optimization for multi-socket server workloads. Database servers, HPC simulations, and in-memory analytics can see performance improvements of 2-3x when memory allocation is aligned with memory access patterns.
