Disaggregation

Keywords: disaggregation, design

Disaggregation is the semiconductor design strategy of decomposing a monolithic system-on-chip (SoC) into multiple smaller, independently designed and manufactured chiplets. Compute, memory, I/O, and specialized functions are separated into distinct dies that can be fabricated on different process nodes, sourced from different vendors, and assembled into a single package through advanced packaging — enabling better yield, lower cost, faster time-to-market, and more flexible product families than monolithic integration.

What Is Disaggregation?

- Definition: The architectural decision to split a single large die into multiple smaller dies (chiplets) along functional boundaries — each chiplet handles a specific function (CPU cores, GPU cores, memory controller, I/O, SerDes) and is connected to other chiplets through die-to-die interconnects within an advanced package.
- Opposite of Integration: For decades, the semiconductor industry pursued monolithic integration — putting more functions on a single die. Disaggregation reverses this trend by splitting functions back into separate dies, but reconnecting them at much finer granularity than board-level integration.
- Partitioning Decisions: The key design challenge is deciding where to "cut" the monolithic die — boundaries should minimize die-to-die bandwidth requirements, align with natural functional boundaries, and separate functions that benefit from different process nodes.
- Economic Driver: Disaggregation is driven by the exponential cost increase of advanced nodes — a 3nm wafer costs 3-4× more than a 7nm wafer, so functions that don't benefit from 3nm (I/O, analog, memory controllers) should remain on cheaper nodes.
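The economics in the last bullet can be made concrete with a little arithmetic. The sketch below uses illustrative wafer prices (assumptions for this example, not foundry quotes) to show why a block that gains nothing from 3nm — such as I/O — is cheaper to keep on an older node:

```python
# Illustrative 300 mm wafer prices in USD (assumed for this example,
# not actual foundry quotes):
WAFER_COST = {"7nm": 9_000, "3nm": 20_000}

# Usable wafer area, ignoring edge loss and scribe lines for simplicity.
WAFER_AREA_MM2 = 3.14159 * (300 / 2) ** 2  # ~70,686 mm²

def silicon_cost_per_mm2(node: str) -> float:
    """Raw silicon cost per mm² at a given node (no yield adjustment)."""
    return WAFER_COST[node] / WAFER_AREA_MM2

# A 100 mm² I/O block that does not benefit from the denser node:
io_area_mm2 = 100
for node in ("7nm", "3nm"):
    cost = silicon_cost_per_mm2(node) * io_area_mm2
    print(f"{node}: ${cost:.2f} for {io_area_mm2} mm² of I/O silicon")
```

With these assumed prices the same 100 mm² of I/O silicon costs more than twice as much on 3nm as on 7nm, with no performance gain to show for it — the core argument for leaving I/O, analog, and memory controllers on mature nodes.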

Why Disaggregation Matters

- Yield Improvement: A monolithic 800 mm² die on 3nm has ~30% yield — disaggregating into four 200 mm² chiplets improves per-chiplet yield to ~70%, dramatically reducing the cost of working silicon.
- Node Optimization: Disaggregation enables each function to use its optimal process — compute cores on 3nm for density, I/O on 6nm for analog performance, SerDes on 7nm for proven reliability — impossible with monolithic integration.
- Product Family Scaling: The same chiplet building blocks create multiple products — AMD uses 1-12 compute chiplets with a common I/O die to span from desktop (8 cores) to server (96 cores), amortizing design cost across the entire product line.
- Design Reuse: A proven I/O chiplet can be reused across multiple product generations — when the compute chiplet moves to the next node, the I/O chiplet remains unchanged, reducing design effort by 30-50%.
- Time-to-Market: Designing a new compute chiplet (1.5-2 years) while reusing proven I/O and memory chiplets is faster than designing a complete new monolithic SoC (3-4 years).
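The yield figures quoted above can be reproduced with the classic Poisson yield model, Y = exp(-A · D0), where A is die area and D0 is defect density. The defect density of 0.15 defects/cm² below is an illustrative assumption chosen to match the article's numbers:

```python
import math

def poisson_yield(die_area_mm2: float, d0_per_cm2: float = 0.15) -> float:
    """Poisson yield model: Y = exp(-A * D0), with area converted to cm²."""
    return math.exp(-(die_area_mm2 / 100.0) * d0_per_cm2)

mono = poisson_yield(800)     # one large monolithic die
chiplet = poisson_yield(200)  # one of four smaller chiplets

print(f"800 mm² monolithic yield: {mono:.0%}")   # ~30%
print(f"200 mm² chiplet yield:    {chiplet:.0%}")  # ~74%
```

Because yield falls off exponentially with die area, cutting the die into quarters more than doubles the fraction of good dies — the single largest cost lever behind disaggregation.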

Disaggregation Examples

- AMD EPYC/Ryzen: Disaggregated CPU into compute chiplets (CCD, 8 cores each on 5nm) and I/O die (IOD on 6nm) — the pioneering commercial disaggregation that proved the concept at scale.
- Intel Ponte Vecchio: Disaggregated GPU into 47 tiles spanning 5 process technologies — compute, base, RAMBO cache, Xe Link, and HBM tiles from both Intel and TSMC fabs, assembled with EMIB bridges and Foveros 3D stacking.
- NVIDIA Blackwell: Disaggregated GPU into two compute dies connected by the NV-HBI die-to-die interface — NVIDIA's first multi-die GPU architecture.
- Apple M1 Ultra: Disaggregated by connecting two M1 Max dies via UltraFusion — doubling compute without designing a new monolithic chip.

| Aspect | Monolithic | Disaggregated |
|--------|-----------|--------------|
| Die Size | Large (400-800 mm²) | Small (100-300 mm² each) |
| Yield | Low (30-50%) | High (70-85% per chiplet) |
| Process Nodes | Single node for all | Optimal node per function |
| Design Cost | $500M-1B (one die) | $200M-400M per chiplet |
| Product Family | One design = one product | Chiplets mix-and-match |
| Time-to-Market | 3-4 years | 1.5-2 years (derivative) |
| D2D Overhead | None | 2-5% area, < 2 ns latency |
| Package Cost | Simple ($10-50) | Complex ($100-500) |
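The table's yield and package-cost rows can be combined into a rough per-unit cost comparison. All dollar figures below are illustrative assumptions picked from the ranges in the table, and the Poisson yield model with an assumed 0.15 defects/cm² stands in for real foundry yield data:

```python
import math

def poisson_yield(area_mm2: float, d0_per_cm2: float = 0.15) -> float:
    """Poisson yield model: Y = exp(-A * D0), with area in cm²."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)

# Illustrative per-die silicon costs (assumptions, not foundry quotes):
mono_die_cost = 200.0     # USD for one 800 mm² monolithic die
chiplet_die_cost = 50.0   # USD for one 200 mm² chiplet
mono_pkg, adv_pkg = 30.0, 200.0  # simple vs advanced package, from table ranges

# Bad dies are discarded, so divide die cost by yield to get
# the cost of one *good* die, then add the package.
mono_total = mono_die_cost / poisson_yield(800) + mono_pkg
chiplet_total = 4 * chiplet_die_cost / poisson_yield(200) + adv_pkg

print(f"Monolithic:    ${mono_total:.0f} per good unit")
print(f"Disaggregated: ${chiplet_total:.0f} per good unit")
```

Under these assumptions the disaggregated part comes out cheaper despite its far more expensive package, because much less silicon is discarded — though a full model would also account for assembly yield and known-good-die test cost.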

Disaggregation is the architectural paradigm shift redefining semiconductor product design — decomposing monolithic chips into modular chiplets that improve yield, optimize process node usage, enable product family scaling, and accelerate time-to-market, establishing the dominant design methodology for high-performance processors, AI accelerators, and data center chips.
