Neural Radiance Fields (NeRF) is a neural network technique that represents a 3D scene as a continuous volumetric function learned from 2D photographs. The function maps every 3D coordinate (x, y, z) and viewing direction (θ, φ) to a color (r, g, b) and a volume density σ. Because the volume rendering step is differentiable, the model can be trained end to end from posed 2D images alone, and it can then synthesize photorealistic views from camera positions that were never photographed.
Core Architecture
The NeRF model is a simple MLP (8 layers, 256 channels) that takes as input a 5D coordinate (x, y, z, θ, φ) and outputs (r, g, b, σ):
- Positional Encoding: Each raw coordinate p of (x, y, z) is mapped through sinusoidal functions at L different frequencies: γ(p) = [sin(2⁰πp), cos(2⁰πp), ..., sin(2^(L-1)πp), cos(2^(L-1)πp)]. This enables the MLP to represent high-frequency geometric and appearance details that a raw-coordinate MLP would smooth over.
- View-Dependent Color: Density σ depends only on position (geometry is view-independent). Color depends on both position and viewing direction, capturing specular reflections and other view-dependent effects.
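The positional encoding γ(p) can be sketched in a few lines. This is a minimal illustration in pure Python, not the reference implementation: the function names `positional_encoding` and `encode_point` are hypothetical, and the default of 10 frequencies is an assumption taken from the setting commonly cited for positions in the original NeRF paper. Real implementations apply the same mapping to whole batches of points with tensor operations.

```python
import math

def positional_encoding(p, num_freqs):
    """Map a scalar p to [sin(2^0 pi p), cos(2^0 pi p), ..., sin(2^(L-1) pi p), cos(2^(L-1) pi p)]."""
    features = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * p
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features

def encode_point(xyz, num_freqs=10):
    """Encode each coordinate separately and concatenate the results."""
    encoded = []
    for coord in xyz:
        encoded.extend(positional_encoding(coord, num_freqs))
    return encoded

point = (0.5, -0.25, 0.0)  # arbitrary example point
features = encode_point(point, num_freqs=10)
print(len(features))  # 3 coordinates * 2 functions * 10 frequencies = 60
```

The encoded vector, rather than the raw (x, y, z), is what the MLP receives as its position input.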
Volume Rendering
To render a pixel, cast a ray from the camera through that pixel into the scene:
1. Sample N points along the ray (t₁, t₂, ..., tN).
2. Query the MLP at each sample point to get a color and a density (cᵢ, σᵢ).
3. Alpha-composite front-to-back: C(r) = Σᵢ Tᵢ × (1 - exp(-σᵢ × δᵢ)) × cᵢ, where Tᵢ = exp(-Σⱼ<ᵢ σⱼ × δⱼ) is the accumulated transmittance and δᵢ = tᵢ₊₁ - tᵢ is the distance between adjacent samples.
This rendering is fully differentiable — gradients flow from the rendered pixel color back through the volume rendering equation to the MLP weights.
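The compositing sum C(r) can be made concrete with a small sketch. It is written in pure Python for readability and handles a single ray. The function name `composite_ray` and the `far` argument (used to close the last interval) are assumptions for illustration. An autodiff framework would run the same arithmetic on tensors so that gradients reach the MLP weights.

```python
import math

def composite_ray(densities, colors, t_values, far):
    """Alpha-composite samples front to back along one ray.

    densities: sigma_i for each sample
    colors:    (r, g, b) for each sample
    t_values:  sorted sample distances t_i along the ray
    far:       far bound, used as the end of the last interval (an assumption here)
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # T_1 = 1: nothing lies in front of the first sample
    weights = []
    for i, (sigma, color) in enumerate(zip(densities, colors)):
        t_next = t_values[i + 1] if i + 1 < len(t_values) else far
        delta = t_next - t_values[i]            # delta_i
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # T_i * (1 - exp(-sigma_i * delta_i))
        weights.append(weight)
        for c in range(3):
            pixel[c] += weight * color[c]
        transmittance *= 1.0 - alpha            # light that survives to the next sample
    return pixel, weights

# Empty space, then a dense red sample that hides the blue sample behind it.
densities = [0.0, 50.0, 50.0]
colors = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
t_values = [2.0, 3.0, 4.0]
pixel, weights = composite_ray(densities, colors, t_values, far=5.0)
print(pixel)
```

In this toy ray the first sample has zero density and contributes nothing, while the dense red sample absorbs almost all the remaining transmittance, so the pixel comes out essentially red.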
Training
- Input: 50-200 posed photographs (camera position and orientation known).
- Loss: L2 (squared error) between rendered pixel color and ground-truth pixel color.
- Optimizer: Adam, applied to the MLP weights.
- Cost: training takes 12-48 hours on a single GPU for the original NeRF.
- Each iteration: sample random rays from random training images, render them through the MLP, compute the loss, backpropagate.
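The data side of one training iteration, sampling random rays and scoring them with an L2 loss, might look like the sketch below. It is deliberately incomplete: `render_stub` is a hypothetical stand-in for the NeRF renderer, the images are toy 2x2 arrays, and the gradient step (backpropagation plus an Adam update) is omitted because it would normally be handled by an autodiff framework.

```python
import random

def sample_ray_batch(images, batch_size, rng):
    """Pick random (image index, row, col) pixels; each pixel defines one training ray."""
    batch = []
    for _ in range(batch_size):
        img_idx = rng.randrange(len(images))
        image = images[img_idx]
        row = rng.randrange(len(image))
        col = rng.randrange(len(image[0]))
        batch.append((img_idx, row, col))
    return batch

def l2_loss(rendered, targets):
    """Mean squared error over all rays and color channels."""
    total = 0.0
    count = 0
    for pred, target in zip(rendered, targets):
        for p, t in zip(pred, target):
            total += (p - t) ** 2
            count += 1
    return total / count

# Toy "photographs": two 2x2 images whose pixels are pure red, blue, or green.
images = [
    [[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]],
    [[(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)], [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]],
]

def render_stub(img_idx, row, col):
    """Hypothetical placeholder for the renderer: always predicts mid-gray."""
    return (0.5, 0.5, 0.5)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
batch = sample_ray_batch(images, batch_size=4, rng=rng)
targets = [images[i][r][c] for i, r, c in batch]
rendered = [render_stub(i, r, c) for i, r, c in batch]
loss = l2_loss(rendered, targets)
print(loss)  # 0.25 for this toy data: every channel is off by exactly 0.5
```

In a real pipeline each sampled pixel would be converted to a ray using the known camera pose, and the loss would drive the optimizer update.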
Major Advances
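- Instant-NGP (NVIDIA, 2022): A multi-resolution hash encoding replaces the sinusoidal positional encoding with trainable feature vectors stored in compact hash tables, so a much smaller MLP suffices. Training drops to seconds or minutes and rendering runs in real time, a speedup on the order of 1000× over the original NeRF.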
- 3D Gaussian Splatting (2023): Replaces the implicit volume with explicit 3D Gaussian primitives. Each Gaussian has a position, covariance, opacity, and spherical-harmonics color. Rasterization-based rendering runs at 100+ FPS, far faster than ray marching, and training takes minutes.
- Mip-NeRF: Anti-aliased NeRF that reasons about the volume of each pixel's ray cone (not just its center line), which greatly reduces aliasing artifacts when the scene is viewed at different scales.
- Block-NeRF / Mega-NeRF: City-scale reconstruction by dividing the scene into blocks, each with its own NeRF, composited at render time.
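To give a feel for the multi-resolution hash encoding mentioned under Instant-NGP, here is a heavily simplified sketch of the index lookup only. Everything in it is illustrative: the function names, the base resolution, the growth factor, and the table size are assumptions, and the per-axis constants follow the values reported in the Instant-NGP paper as best recalled. The sketch hashes only the lower corner of the voxel containing the point. A full implementation would hash all eight corners, fetch trainable feature vectors from the tables, and interpolate them trilinearly.

```python
import math

# Assumed per-axis hash constants (see the lead-in above).
PRIMES = (1, 2654435761, 805459861)

def level_resolution(level, base_res=16, growth=2.0):
    """Grid resolution at a given level; base_res and growth are illustrative values."""
    return int(math.floor(base_res * growth ** level))

def hash_corner(ix, iy, iz, table_size):
    """Spatial hash: XOR the scaled integer coordinates, then wrap into the table."""
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size

def corner_indices(point, num_levels=4, table_size=2 ** 14):
    """For a point in [0, 1)^3, return one table index per level (lower voxel corner only)."""
    indices = []
    for level in range(num_levels):
        res = level_resolution(level)
        ix, iy, iz = (int(math.floor(c * res)) for c in point)
        indices.append(hash_corner(ix, iy, iz, table_size))
    return indices

print(corner_indices((0.3, 0.7, 0.1)))
```

Each level yields a different table index for the same point, which is how coarse and fine grids contribute separate features to the small MLP.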
Neural Radiance Fields are the breakthrough that brought neural scene representation to photorealistic quality — demonstrating that a simple MLP can memorize the complete appearance of a 3D scene from photographs, and spawning a revolution in 3D reconstruction, virtual reality, and visual effects.