Neural Architecture Search (NAS) is an automated machine learning technique that algorithmically discovers high-performing neural network architectures. It searches over the space of layer types, connections, depths, widths, and activation functions to find architectures that match or outperform manually designed networks on a given task, often surfacing design patterns that human engineers would not have considered.
Why Automate Architecture Design
Manual architecture design (ResNet, Inception, Transformer) requires deep expertise and extensive experimentation. The search space of possible architectures is astronomically large — a 20-layer network with 10 choices per layer has 10²⁰ possible architectures. NAS automates this search using optimization algorithms that systematically evaluate candidates and converge on high-performing designs.
Search Strategies
- Reinforcement Learning NAS (Zoph & Le, 2017): A controller RNN generates architecture descriptions (layer types, filter sizes, skip connections). Candidate architectures are trained and evaluated; the evaluation accuracy is the reward signal for training the controller via REINFORCE. The original NAS paper used 800 GPUs for 28 days — effective but prohibitively expensive.
- Evolutionary NAS: Maintain a population of architectures. Mutate the best-performing individuals (add or remove layers, change hyperparameters) and select survivors based on fitness (validation accuracy). AmoebaNet, found with regularized evolution, matched NASNet-quality architectures while reaching them faster than RL-based search.
- Differentiable NAS (DARTS): Instead of sampling discrete architectures, construct a supernetwork containing all candidate operations at each layer. Use a continuous relaxation (a softmax over per-operation weights) and optimize the architecture weights by gradient descent alongside the network weights, so the search completes in GPU-hours instead of GPU-months. It is among the most widely used approaches; a minimal sketch of the relaxation follows this list.
- One-Shot NAS: Train a single supernetwork once. Evaluate sub-networks by inheriting weights from the supernetwork (weight sharing). Rank candidate architectures by their inherited performance without retraining. Dramatically reduces search cost.
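The continuous relaxation at the heart of DARTS is compact enough to show directly. Below is a minimal sketch in PyTorch of a single mixed edge in a supernetwork, with an illustrative set of candidate operations: every candidate is applied and their outputs are blended by a softmax over learnable architecture parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One edge of a DARTS-style supernetwork: a softmax-weighted sum of candidate ops."""

    def __init__(self, channels: int):
        super().__init__()
        # Illustrative candidate operations for this edge.
        self.ops = nn.ModuleList([
            nn.Identity(),                                # skip connection
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 convolution
            nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 convolution
            nn.MaxPool2d(3, stride=1, padding=1),         # 3x3 max pooling
        ])
        # One learnable architecture parameter (alpha) per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Continuous relaxation: blend every candidate instead of picking one.
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def discretize(self) -> nn.Module:
        # After the search, keep only the highest-weighted operation.
        return self.ops[int(self.alpha.argmax())]
```

In the full algorithm, the network weights and the architecture parameters are updated in alternation (weights on the training split, alphas on a validation split), and the final architecture is read off by taking the argmax operation on each edge.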
Search Space Design
The search space definition is as important as the search algorithm:
- Cell-based: Search for a repeating cell (a normal cell plus a reduction cell) that is stacked to form the full network. This shrinks the search space from roughly 10²⁰ to roughly 10⁹ candidates while producing building blocks that transfer across datasets (a back-of-the-envelope count follows this list).
- Macro-search: Search over the entire network topology including depth, width, and skip connections. More flexible but harder to optimize.
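To make the size reduction concrete, here is a rough count for a NASNet/DARTS-style cell space; the operation menu and node count below are illustrative, not any particular paper's exact space.

```python
# Hypothetical cell-based search space: each intermediate node in the cell picks
# two inputs from the earlier nodes (plus the two cell inputs) and applies one
# operation from a fixed menu to each incoming edge.
CANDIDATE_OPS = [
    "identity", "conv_3x3", "conv_5x5", "sep_conv_3x3",
    "max_pool_3x3", "avg_pool_3x3", "zero",
]

def cell_space_size(num_nodes: int = 4, ops=CANDIDATE_OPS) -> int:
    """Rough (over-)count of distinct cells in this toy space."""
    size = 1
    for i in range(num_nodes):
        predecessors = i + 2                      # two cell inputs + earlier nodes
        choices_per_edge = predecessors * len(ops)
        size *= choices_per_edge ** 2             # two incoming edges per node
    return size

print(f"~{cell_space_size():.1e} candidate cells")  # ~8e10 for this toy cell
```

Because the same cell is stacked to build the full network, a cell discovered on a small dataset such as CIFAR-10 can be reused, stacked deeper and wider, for larger datasets such as ImageNet.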
Hardware-Aware NAS
Modern NAS co-optimizes accuracy and hardware efficiency (latency, energy, memory). The search incorporates a hardware cost model (measured or predicted inference latency on target hardware). MnasNet, EfficientNet, and Once-for-All networks were discovered by hardware-aware NAS targeting mobile devices.
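One common way to fold the hardware cost model into the search is a latency-weighted reward in the spirit of MnasNet; the constants below (target latency, exponent) are illustrative, not any paper's exact settings.

```python
def hardware_aware_reward(accuracy: float, latency_ms: float,
                          target_ms: float = 80.0, w: float = -0.07) -> float:
    """Latency-weighted objective in the spirit of MnasNet:
    accuracy * (latency / target) ** w, with w < 0 penalizing slow models.
    target_ms and w are illustrative constants, not a paper's exact settings."""
    return accuracy * (latency_ms / target_ms) ** w

# A more accurate but slower candidate can score below a faster, slightly
# less accurate one once latency enters the objective:
print(hardware_aware_reward(accuracy=0.76, latency_ms=160.0))  # ~0.72
print(hardware_aware_reward(accuracy=0.74, latency_ms=70.0))   # ~0.75
```

Differentiable hardware-aware variants achieve the same trade-off by adding a predicted-latency term directly to the training loss instead of to a reward.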
Neural Architecture Search is, in effect, machines designing machines: it automates the creative work of architecture design, shifting the human contribution to defining good search spaces while algorithms discover the architectures within them.