MLIR (Multi-Level Intermediate Representation)

MLIR (Multi-Level Intermediate Representation) is an extensible compiler infrastructure for building domain-specific compilers. Originally developed at Google and now part of the LLVM project, it lets ML frameworks define custom optimizations and target diverse hardware through a flexible, composable IR system.

What Is MLIR?

Why MLIR Matters

MLIR Architecture

Dialect System:

High Level:
┌──────────────────────────────────────────────────────────┐
│ Framework Dialect (tf, torch, stablehlo)                │
│ - High-level ops (conv2d, matmul, attention)           │
└──────────────────────────────────────────────────────────┘
                           │ Lowering
                           ▼
┌──────────────────────────────────────────────────────────┐
│ Mid-Level Dialect (linalg, tensor)                       │
│ - Generic linear algebra ops                             │
└──────────────────────────────────────────────────────────┘
                           │ Lowering
                           ▼
┌──────────────────────────────────────────────────────────┐
│ Low-Level Dialect (scf, memref, arith)                   │
│ - Loops, memory, arithmetic                              │
└──────────────────────────────────────────────────────────┘
                           │ Lowering
                           ▼
┌──────────────────────────────────────────────────────────┐
│ Target Dialect (llvm, gpu, spirv)                        │
│ - Hardware-specific representation                       │
└──────────────────────────────────────────────────────────┘
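
The layered lowering in the diagram can be sketched as a toy rewrite loop in plain Python. This is only an illustration of the idea: the Op class, the LOWERINGS table, and both helper functions are hypothetical names invented here, not MLIR APIs, and a real lowering pass rewrites operands, types, and regions rather than just op names.

```python
# Toy model of progressive lowering. All names are illustrative assumptions,
# not part of MLIR or its Python bindings.
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    dialect: str  # e.g. "torch", "linalg", "scf"
    name: str     # e.g. "matmul"


# Hypothetical lowering rules: each (dialect, op) expands to lower-level ops.
LOWERINGS = {
    ("torch", "matmul"): [Op("linalg", "matmul")],
    ("linalg", "matmul"): [
        Op("scf", "for"),
        Op("memref", "load"),
        Op("arith", "mulf"),
        Op("arith", "addf"),
        Op("memref", "store"),
    ],
}


def lower_once(ops):
    """Apply one lowering step; ops without a rule are kept unchanged."""
    out = []
    for op in ops:
        out.extend(LOWERINGS.get((op.dialect, op.name), [op]))
    return out


def lower_to_fixpoint(ops):
    """Repeat lowering until no rule changes the program."""
    while True:
        lowered = lower_once(ops)
        if lowered == ops:
            return ops
        ops = lowered


program = [Op("torch", "matmul")]
final = lower_to_fixpoint(program)
print([f"{op.dialect}.{op.name}" for op in final])
```

Each step only needs to know about the level directly below it, which is the property that lets dialects be mixed and lowered gradually.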

Key Dialects:

Dialect      | Purpose
-------------|----------------------------------
tf           | TensorFlow operations
torch        | PyTorch operations
stablehlo    | Stable HLO (cross-framework)
linalg       | Generic linear algebra
tensor       | Tensor operations
scf          | Structured control flow
memref       | Memory references
arith        | Arithmetic operations
gpu          | GPU abstractions
llvm         | LLVM IR target

How MLIR Works

Example Lowering:

Input (PyTorch):
  y = torch.matmul(A, B)

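↓ torch dialect (types omitted)
  %y = torch.aten.matmul %A, %B

↓ linalg dialect
  %y = linalg.matmul ins(%A, %B : tensor<?x?xf32>, tensor<?x?xf32>)
                     outs(%C : tensor<?x?xf32>) -> tensor<?x?xf32>

↓ scf/memref (after bufferization; %c0 and %c1 are index constants 0 and 1)
  scf.for %i = %c0 to %M step %c1 {
    scf.for %j = %c0 to %N step %c1 {
      scf.for %k = %c0 to %K step %c1 {
        %a = memref.load %A[%i, %k] : memref<?x?xf32>
        %b = memref.load %B[%k, %j] : memref<?x?xf32>
        %c = memref.load %C[%i, %j] : memref<?x?xf32>
        %prod = arith.mulf %a, %b : f32
        %sum = arith.addf %c, %prod : f32
        memref.store %sum, %C[%i, %j] : memref<?x?xf32>
      }
    }
  }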

↓ Target (LLVM or GPU)
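
For readers less familiar with MLIR syntax, the scf/memref loop nest above corresponds to the following plain Python, given here as a reference sketch. The function name matmul_accumulate and the nested-list layout are assumptions made for illustration. Note that the loop accumulates into C, so C must be zero-initialized to obtain a plain product.

```python
def matmul_accumulate(A, B, C):
    """Compute C += A @ B with the same loop order as the scf.for nest."""
    M, K, N = len(A), len(B), len(B[0])
    for i in range(M):
        for j in range(N):
            for k in range(K):
                a = A[i][k]          # memref.load %A[%i, %k]
                b = B[k][j]          # memref.load %B[%k, %j]
                c = C[i][j]          # memref.load %C[%i, %j]
                prod = a * b         # arith.mulf
                C[i][j] = c + prod   # arith.addf + memref.store
    return C


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]  # zero-initialized accumulator
result = matmul_accumulate(A, B, C)
print(result)  # [[19.0, 22.0], [43.0, 50.0]]
```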

MLIR in ML Ecosystem

Framework Integration:

Framework        | MLIR Usage
-----------------|----------------------------------
TensorFlow       | XLA uses MLIR (StableHLO)
PyTorch          | torch-mlir; Triton (a torch.compile backend) is built on MLIR
JAX              | Lowers to StableHLO (an MLIR dialect)
IREE             | End-to-end MLIR compiler
OpenXLA          | Cross-framework compilation

torch-mlir Example:

import torch
import torch_mlir

class MyModel(torch.nn.Module):
    def forward(self, x, y):
        return torch.matmul(x, y)

model = MyModel()
example_inputs = (torch.randn(4, 8), torch.randn(8, 16))

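# Export to MLIR (StableHLO).
# Note: torch_mlir.compile is an older entry point and the torch-mlir API has
# changed across releases; check the installed version for the current equivalent.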
mlir_module = torch_mlir.compile(
    model,
    example_inputs,
    output_type="stablehlo"
)

print(mlir_module)

Advantages of MLIR

For Compiler Developers:

Benefit              | Description
---------------------|----------------------------------
Reusable passes      | Share optimizations across dialects
Type system          | Rich, extensible type support
Verification         | Built-in IR validation
Debugging            | Command-line tooling (mlir-opt, mlir-translate)
Documentation        | Op definitions (ODS/TableGen) generate the docs

For Hardware Vendors:

Benefit              | Description
---------------------|----------------------------------
Single entry point   | Support TF, PyTorch, JAX via MLIR
Focus on backend     | Framework integration handled
Community            | Leverage ecosystem work
Portability          | Standard representation

Common Passes

Pass                   | Purpose
-----------------------|----------------------------------
Canonicalization       | Rewrite ops into a simpler canonical form
CSE                    | Common subexpression elimination
Inlining               | Inline function calls
Loop fusion            | Merge adjacent loops to improve locality
Tiling                 | Split loops into blocks for locality and parallelism
Bufferization          | Convert tensors to memrefs
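
To make one of these passes concrete, here is a minimal sketch of common subexpression elimination over a toy straight-line IR in Python. It assumes every op is pure (no side effects) and lives in a single block. The tuple encoding and the cse function are hypothetical and unrelated to MLIR's actual CSE implementation, which also has to account for regions, dominance, and side effects.

```python
def cse(ops):
    """Remove duplicate pure ops from a straight-line program.

    ops is a list of (result, op_name, operands) tuples in program order.
    """
    seen = {}     # (op_name, operands) -> result of the first occurrence
    replace = {}  # eliminated result -> surviving result
    out = []
    for result, name, operands in ops:
        # Rewrite operands that referred to an eliminated duplicate.
        operands = tuple(replace.get(o, o) for o in operands)
        key = (name, operands)
        if key in seen:
            replace[result] = seen[key]  # drop the op, remember the alias
        else:
            seen[key] = result
            out.append((result, name, operands))
    return out


ops = [
    ("%0", "arith.mulf", ("%a", "%b")),
    ("%1", "arith.mulf", ("%a", "%b")),  # duplicate of %0
    ("%2", "arith.addf", ("%0", "%1")),  # rewritten to use %0 twice
]
optimized = cse(ops)
for op in optimized:
    print(op)
```

In this example the second multiply is dropped and the add is rewritten to read %0 for both operands.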

MLIR underpins many modern ML compiler stacks. By providing a flexible, extensible framework for building domain-specific compilers, it enables the systematic optimization needed to get high performance from diverse AI hardware.
