MLIR (Multi-Level Intermediate Representation) is an extensible compiler infrastructure for building domain-specific compilers. Developed at Google and now part of the LLVM project, it lets ML frameworks define custom optimizations and target diverse hardware through a flexible, composable IR system.
What Is MLIR?
- Definition: Framework for building and composing compiler IRs.
- Origin: Google, now LLVM project.
- Purpose: Simplify compiler construction for ML and beyond.
- Key Feature: Multiple abstraction levels in one framework.
Why MLIR Matters
- Fragmentation: Each framework had its own compiler stack.
- Reuse: Share optimizations across frameworks/targets.
- Flexibility: Custom dialects for domain-specific needs.
- Hardware Diversity: Single path to many accelerators.
- Performance: Systematic optimization opportunities.
MLIR Architecture
Dialect System:
High Level:
┌──────────────────────────────────────────────────────────┐
│ Framework Dialect (tf, torch, stablehlo)                 │
│ - High-level ops (conv2d, matmul, attention)             │
└──────────────────────────────────────────────────────────┘
                             │ Lowering
                             ▼
┌──────────────────────────────────────────────────────────┐
│ Mid-Level Dialect (linalg, tensor)                       │
│ - Generic linear algebra ops                             │
└──────────────────────────────────────────────────────────┘
                             │ Lowering
                             ▼
┌──────────────────────────────────────────────────────────┐
│ Low-Level Dialect (scf, memref, arith)                   │
│ - Loops, memory, arithmetic                              │
└──────────────────────────────────────────────────────────┘
                             │ Lowering
                             ▼
┌──────────────────────────────────────────────────────────┐
│ Target Dialect (llvm, gpu, spirv)                        │
│ - Hardware-specific representation                       │
└──────────────────────────────────────────────────────────┘
Key Dialects:
Dialect      | Purpose
-------------|----------------------------------
tf           | TensorFlow operations
torch        | PyTorch operations
stablehlo    | Stable HLO (cross-framework)
linalg       | Generic linear algebra
tensor       | Tensor operations
scf          | Structured control flow
memref       | Memory references
arith        | Arithmetic operations
gpu          | GPU abstractions
llvm         | LLVM IR target
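The dialect stack above can be illustrated with a small toy model of progressive lowering. This is only a sketch in plain Python, not the MLIR API: the rule table, the `lower` function, and the flat op lists are hypothetical names invented for illustration. The idea it borrows from MLIR is that a conversion declares which dialects are legal in the output, and rewrite rules are applied until no illegal ops remain.

```python
# Toy model of progressive lowering between dialects (not the MLIR API).
# Ops are plain strings of the form "<dialect>.<op>"; nesting is ignored.

def dialect_of(op):
    """Return the dialect prefix of an op name, e.g. 'linalg' for 'linalg.matmul'."""
    return op.split(".", 1)[0]

# Hypothetical rewrite rules: one higher-level op becomes a list of lower-level ops.
# torch-mlir names the matmul op torch.aten.matmul.
RULES = {
    "torch.aten.matmul": ["linalg.matmul"],
    "linalg.matmul": [
        "scf.for", "scf.for", "scf.for",
        "memref.load", "memref.load", "memref.load",
        "arith.mulf", "arith.addf",
        "memref.store",
    ],
}

def lower(ops, rules, legal_dialects):
    """Apply rules until every op belongs to a legal dialect.

    Assumes the rule table is acyclic; raises if an illegal op has no rule.
    """
    ops = list(ops)
    while any(dialect_of(op) not in legal_dialects for op in ops):
        lowered = []
        for op in ops:
            if dialect_of(op) in legal_dialects:
                lowered.append(op)          # already legal, keep as-is
            elif op in rules:
                lowered.extend(rules[op])   # rewrite one level down
            else:
                raise ValueError(f"no lowering rule for {op}")
        ops = lowered
    return ops

# Stop at the mid-level dialect:
print(lower(["torch.aten.matmul"], RULES, {"linalg"}))
# Continue down to loops, memory, and arithmetic:
print(lower(["torch.aten.matmul"], RULES, {"scf", "memref", "arith"}))
```

Choosing a different set of legal dialects stops the same pipeline at a different abstraction level, which is the property that lets optimizations run at whichever level suits them best.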
How MLIR Works
Example Lowering:
Input (PyTorch):
y = torch.matmul(A, B)
↓ torch dialect
%y = torch.aten.matmul %A, %B
↓ linalg dialect
%y = linalg.matmul ins(%A, %B) outs(%C)
↓ scf/memref (types omitted; %c0 and %c1 are the index constants 0 and 1)
scf.for %i = %c0 to %M step %c1 {
  scf.for %j = %c0 to %N step %c1 {
    scf.for %k = %c0 to %K step %c1 {
      %a = memref.load %A[%i, %k]
      %b = memref.load %B[%k, %j]
      %c = memref.load %C[%i, %j]
      %prod = arith.mulf %a, %b
      %sum = arith.addf %c, %prod
      memref.store %sum, %C[%i, %j]
    }
  }
}
↓ Target (LLVM or GPU)
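To make the semantics of the lowered loop nest concrete, here is a plain-Python transcription of it. This is a reference sketch rather than anything MLIR generates: the function name `matmul_accumulate` and the nested-list representation of the buffers are assumptions made for illustration. Each statement in the innermost loop corresponds to one op in the scf/memref listing.

```python
def matmul_accumulate(A, B, C):
    """Compute C += A @ B in place using the same loop order as the scf listing.

    A is M x K, B is K x N, and C is M x N, all as nested lists of floats.
    """
    M, K, N = len(A), len(B), len(B[0])
    for i in range(M):                 # scf.for %i
        for j in range(N):             # scf.for %j
            for k in range(K):         # scf.for %k
                a = A[i][k]            # memref.load %A[%i, %k]
                b = B[k][j]            # memref.load %B[%k, %j]
                c = C[i][j]            # memref.load %C[%i, %j]
                prod = a * b           # arith.mulf
                C[i][j] = c + prod     # arith.addf + memref.store
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]   # must start at zero to get a plain product
result = matmul_accumulate(A, B, C)
print(result)  # [[19.0, 22.0], [43.0, 50.0]]
```

Note that the loop nest accumulates into `C` rather than overwriting it, so the output buffer has to be zero-initialized when a plain matrix product is wanted.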
MLIR in ML Ecosystem
Framework Integration:
Framework        | MLIR Usage
-----------------|----------------------------------
TensorFlow       | MLIR-based bridge to XLA (tf dialect, StableHLO)
PyTorch          | torch-mlir; torch.compile indirectly (Triton is MLIR-based)
JAX              | Lowers to StableHLO (an MLIR dialect)
IREE             | End-to-end MLIR-based compiler and runtime
OpenXLA          | Cross-framework compilation (XLA, StableHLO)
torch-mlir Example:
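import torch
import torch_mlir

class MyModel(torch.nn.Module):
    def forward(self, x, y):
        return torch.matmul(x, y)

model = MyModel()
example_inputs = (torch.randn(4, 8), torch.randn(8, 16))

# Export to MLIR in the StableHLO dialect.
# Note: torch_mlir.compile is the entry point in older torch-mlir releases;
# newer releases replace it with an FX-based importer.
mlir_module = torch_mlir.compile(
    model,
    example_inputs,
    output_type="stablehlo"
)
print(mlir_module)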
Advantages of MLIR
For Compiler Developers:
Benefit              | Description
---------------------|----------------------------------
Reusable passes      | Share optimizations across dialects
Type system          | Rich, extensible type support
Verification         | Built-in IR validation
Debugging            | Tooling such as mlir-opt and mlir-translate
Documentation        | Op definitions (ODS) generate reference docs
For Hardware Vendors:
Benefit              | Description
---------------------|----------------------------------
Single entry point   | Support TF, PyTorch, JAX via MLIR
Focus on backend     | Framework integration handled
Community            | Leverage ecosystem work
Portability          | Standard representation
Common Passes
Pass                   | Purpose
-----------------------|----------------------------------
Canonicalization       | Rewrite ops into a simpler canonical form
CSE                    | Common subexpression elimination
Inlining               | Inline function calls
Loop fusion            | Merge adjacent loops to improve locality
Tiling                 | Partition loops into blocks for locality and parallelism
Bufferization          | Convert tensors to memrefs
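As an illustration of what one of these passes does, the following is a minimal sketch of common subexpression elimination over a toy SSA-style op list. It is not MLIR's implementation: the `Op` tuple and the `cse` function are hypothetical, and the sketch assumes a single straight-line block in which every op is free of side effects.

```python
from typing import NamedTuple

class Op(NamedTuple):
    result: str      # SSA value defined by this op, e.g. "%0"
    name: str        # op name, e.g. "arith.mulf"
    operands: tuple  # SSA values consumed by this op

def cse(ops):
    """Remove ops that recompute an already available (name, operands) pair."""
    seen = {}     # (name, operands) -> result of the first occurrence
    replace = {}  # eliminated result -> surviving result
    out = []
    for op in ops:
        # Rewrite operands that referred to eliminated values.
        operands = tuple(replace.get(v, v) for v in op.operands)
        key = (op.name, operands)
        if key in seen:
            replace[op.result] = seen[key]   # duplicate: reuse earlier value
        else:
            seen[key] = op.result
            out.append(Op(op.result, op.name, operands))
    return out

program = [
    Op("%0", "arith.mulf", ("%a", "%b")),
    Op("%1", "arith.mulf", ("%a", "%b")),   # same computation as %0
    Op("%2", "arith.addf", ("%0", "%1")),
]
optimized = cse(program)
for op in optimized:
    print(op.result, "=", op.name, ", ".join(op.operands))
```

After the pass, the second multiply is gone and the add reads `%0` for both operands. A production pass also has to account for side effects, regions, and dominance, which this sketch deliberately ignores.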
MLIR is the foundation of many modern ML compiler stacks. By providing a flexible, extensible framework for building domain-specific compilers, it enables the systematic optimization needed to get high performance from diverse AI hardware.