Intrinsic image decomposition

Intrinsic image decomposition is the task of separating an image into intrinsic components — decomposing appearance into reflectance (albedo) and shading (illumination), enabling material editing, relighting, and understanding of scene properties independent of lighting conditions.

What Is Intrinsic Image Decomposition?

- Definition: Decompose image into reflectance and shading.
- Input: Single RGB image.
- Output:
- Reflectance (Albedo): Surface color/texture independent of lighting.
- Shading (Illumination): Lighting effects (shadows, highlights).
- Relationship: Image = Reflectance × Shading (in linear space).

Why Intrinsic Decomposition?

- Material Editing: Change surface colors without affecting lighting.
- Relighting: Change lighting while preserving materials.
- Object Recognition: Recognize objects independent of lighting.
- Augmented Reality: Realistic insertion of virtual objects.
- Computational Photography: Advanced photo editing.

Intrinsic Components

Reflectance (Albedo):
- Definition: Intrinsic surface color/texture.
- Properties: Independent of lighting, viewpoint.
- Example: Red ball has red reflectance regardless of lighting.

Shading (Illumination):
- Definition: Lighting effects on surface.
- Components: Direct illumination, shadows, inter-reflections.
- Properties: Depends on lighting, geometry, viewpoint.

Image Formation:
``I(x) = R(x) · S(x)

Where: - I(x): Observed image intensity at pixel x - R(x): Reflectance (albedo) - S(x): Shading (illumination)``

Intrinsic Decomposition Approaches

Optimization-Based:
- Method: Formulate as energy minimization.
- Energy: Data term + priors (smoothness, sparsity).
- Priors:
- Reflectance is piecewise constant.
- Shading is smooth.
- Reflectance changes at texture edges, shading at geometry edges.
- Examples: Retinex, Intrinsic Images in the Wild.

Learning-Based:
- Method: Neural networks learn decomposition.
- Training: Supervised on synthetic or real data with ground truth.
- Examples: CGIntrinsics, IIW, ShapeNet Intrinsics.
- Benefit: Handle complex real-world images.

Physics-Based:
- Method: Model light transport, inverse rendering.
- Benefit: Physically accurate decomposition.
- Challenge: Requires scene geometry, material properties.

Challenges

Ill-Posed Problem:
- Ambiguity: Infinite (reflectance, shading) pairs can produce same image.
- Example: Dark reflectance + bright shading = bright reflectance + dark shading.
- Solution: Priors, constraints, learning from data.

Texture vs. Shading:
- Problem: Distinguish texture (reflectance) from shading.
- Example: Polka dots (texture) vs. shadows (shading).
- Solution: Multi-scale analysis, learned features.

Complex Lighting:
- Problem: Inter-reflections, subsurface scattering, transparency.
- Challenge: Simple reflectance × shading model insufficient.

Ground Truth:
- Problem: Difficult to obtain ground truth for real images.
- Solution: Synthetic data, multi-illumination capture, crowdsourcing.

Intrinsic Decomposition Methods

Retinex:
- Classic: Separate reflectance and illumination based on gradients.
- Assumption: Reflectance has sharp edges, illumination is smooth.
- Limitation: Oversimplified, doesn't handle complex scenes.

Intrinsic Images in the Wild (IIW):
- Method: Learn from sparse human annotations.
- Annotations: Relative reflectance judgments (same/different material).
- Benefit: Scalable annotation, real-world data.

CGIntrinsics:
- Training: Synthetic data from 3D scenes.
- Network: CNN predicts reflectance and shading.
- Benefit: Large-scale training data.

ShapeNet Intrinsics:
- Training: Rendered 3D objects with known reflectance/shading.
- Benefit: Perfect ground truth for training.

Applications

Material Editing:
- Use: Change surface colors independently of lighting.
- Example: Recolor walls, furniture, clothing.
- Benefit: Realistic edits respecting lighting.

Relighting:
- Use: Change lighting while preserving materials.
- Process: Decompose → modify shading → recompose.
- Example: Change time of day, add/remove lights.

Object Recognition:
- Use: Recognize objects from reflectance (lighting-invariant).
- Benefit: Robust to lighting variations.

Augmented Reality:
- Use: Understand scene lighting for realistic AR.
- Benefit: Virtual objects match real lighting.

Computational Photography:
- Use: Advanced photo editing (selective relighting, material transfer).
- Benefit: Physically plausible edits.

Intrinsic Decomposition Techniques

Multi-Illumination:
- Method: Capture scene under multiple lighting conditions.
- Benefit: Resolve ambiguities, accurate decomposition.
- Challenge: Requires controlled capture.

Multi-View:
- Method: Use multiple viewpoints.
- Benefit: Geometric constraints aid decomposition.

Video:
- Method: Temporal consistency across frames.
- Benefit: More constraints, better decomposition.

Semantic Guidance:
- Method: Use semantic segmentation to guide decomposition.
- Benefit: Material boundaries align with semantic boundaries.

Quality Metrics

MSE (Mean Squared Error):
- Definition: Pixel-wise error in reflectance and shading.
- Limitation: Doesn't account for perceptual quality.

LMSE (Local MSE):
- Definition: MSE after local scaling (handles scale ambiguity).
- Benefit: More robust to global intensity shifts.

DSSIM (Structural Dissimilarity):
- Definition: 1 - SSIM (structural similarity).
- Benefit: Perceptually motivated.

Intrinsic Decomposition Datasets

MIT Intrinsic Images:
- Data: Real objects with ground truth from multi-illumination capture.
- Size: Small but high-quality.

IIW (Intrinsic Images in the Wild):
- Data: Real images with sparse human annotations.
- Size: Large-scale, diverse scenes.

ShapeNet Intrinsics:
- Data: Rendered 3D objects with perfect ground truth.
- Size: Large-scale synthetic data.

MPI Sintel:
- Data: Animated movie frames with ground truth.
- Use: Evaluation on complex scenes.

Future of Intrinsic Decomposition

- Single-Image: Accurate decomposition from single image.
- Real-Time: Fast decomposition for interactive applications.
- Video: Temporally consistent decomposition.
- Semantic: Integrate semantic understanding.
- Physics-Based: Incorporate physical light transport models.
- Generalization: Models that work across diverse scenes.

Intrinsic image decomposition is fundamental to computational photography and computer vision — it enables understanding and manipulating images at the level of materials and lighting, supporting applications from photo editing to augmented reality to object recognition.

Want to learn more?