CUDA and Compute Capability
What is CUDA?
CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform and programming model that lets NVIDIA GPUs be used for general-purpose computing. It underpins most GPU-accelerated AI/ML workloads that run on NVIDIA hardware.
Compute Capability Explained
Compute Capability (CC) is a version number of the form major.minor that identifies which hardware features a GPU supports. Higher versions generally add newer instructions and hardware features, such as additional Tensor Core data types.
Compute Capability by Architecture
| CC | Architecture | Year | Example GPUs | Key AI Features |
|---|---|---|---|---|
| 7.0 | Volta | 2017 | V100 | 1st gen Tensor Cores |
| 7.5 | Turing | 2018 | RTX 2080, T4 | INT8 inference |
| 8.0 | Ampere | 2020 | A100 | 3rd gen Tensor Cores, TF32 |
| 8.6 | Ampere | 2020 | RTX 3090 | Consumer Ampere |
| 8.9 | Ada Lovelace | 2022 | RTX 4090, L40S | FP8, Transformer Engine |
| 9.0 | Hopper | 2022 | H100, H200 | 4th gen Tensor Cores |
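The table can be turned into a small lookup, which is handy when logging which architecture a job landed on. This is a minimal sketch: `CC_TO_ARCH` and `architecture_for` are illustrative names, not part of any library, and the mapping covers only the rows listed above.

```python
# Lookup built from the table rows: (major, minor) -> architecture name.
CC_TO_ARCH = {
    (7, 0): "Volta",
    (7, 5): "Turing",
    (8, 0): "Ampere",
    (8, 6): "Ampere",
    (8, 9): "Ada Lovelace",
    (9, 0): "Hopper",
}


def architecture_for(cc):
    """Return the architecture for a (major, minor) pair, or "unknown"."""
    return CC_TO_ARCH.get(tuple(cc), "unknown")


print(architecture_for((8, 6)))  # a CC listed in the table
print(architecture_for((6, 1)))  # a CC not listed in the table
```

Keys are tuples rather than strings or floats so that they match the `(major, minor)` pair PyTorch returns from `torch.cuda.get_device_capability`.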
Why CC Matters for AI
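- Framework requirements: Prebuilt PyTorch and TensorFlow binaries ship kernels for a specific range of CC levels, so GPUs below the minimum are unsupported
- Precision support: FP8 requires CC 8.9+, BF16 requires CC 8.0+
- Performance features: Optimized kernels such as Flash Attention target specific CC levels and may not run on older GPUs
- Driver compatibility: Newer CUDA toolkit and driver releases may drop support for older CC levels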
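The precision thresholds above can be checked in code before choosing a dtype. The sketch below assumes only the two minimums stated in the list (BF16 at CC 8.0, FP8 at CC 8.9). `MIN_CC` and `supported_precisions` are hypothetical helper names, not a framework API.

```python
# Minimum compute capability per precision, taken from the list above.
MIN_CC = {"BF16": (8, 0), "FP8": (8, 9)}


def supported_precisions(cc):
    """Return the precisions in MIN_CC whose minimum CC is met by cc."""
    cc = tuple(cc)
    # Tuples compare element by element, so (8, 6) >= (8, 0) but (8, 6) < (8, 9).
    return [name for name, minimum in MIN_CC.items() if cc >= minimum]


for cc in [(7, 5), (8, 6), (9, 0)]:
    print(cc, supported_precisions(cc))
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings, where `"10.0" < "9.0"` evaluates to true.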
Checking Your Compute Capability
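```python
import torch
if torch.cuda.is_available():
    device = torch.cuda.current_device()
    major, minor = torch.cuda.get_device_capability(device)  # (major, minor) tuple
    print(f"Compute Capability: {major}.{minor}")
else:
    print("No CUDA-capable GPU detected")
```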
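On multi-GPU machines it can help to report every visible device, not only the current one. The following sketch assumes PyTorch may or may not be installed and returns an empty list in that case. `list_device_capabilities` is an illustrative helper name, while the `torch.cuda` calls it uses are standard PyTorch functions.

```python
def list_device_capabilities():
    """Return [(index, name, (major, minor)), ...] for visible CUDA devices.

    Returns an empty list when PyTorch is not installed or no GPU is visible.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        (i, torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
        for i in range(torch.cuda.device_count())
    ]


for index, name, (major, minor) in list_device_capabilities():
    print(f"GPU {index}: {name} (CC {major}.{minor})")
```

Mixed-GPU hosts can expose devices with different CC levels, so checking each index separately is safer than assuming they all match device 0.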