Intrinsic image decomposition is the task of separating an image into intrinsic components โ decomposing appearance into reflectance (albedo) and shading (illumination), enabling material editing, relighting, and understanding of scene properties independent of lighting conditions.
What Is Intrinsic Image Decomposition?
- Definition: Decompose image into reflectance and shading.
- Input: Single RGB image.
- Output:
- Reflectance (Albedo): Surface color/texture independent of lighting.
- Shading (Illumination): Lighting effects (shadows, highlights).
- Relationship: Image = Reflectance ร Shading (in linear space).
Why Intrinsic Decomposition?
- Material Editing: Change surface colors without affecting lighting.
- Relighting: Change lighting while preserving materials.
- Object Recognition: Recognize objects independent of lighting.
- Augmented Reality: Realistic insertion of virtual objects.
- Computational Photography: Advanced photo editing.
Intrinsic Components
Reflectance (Albedo):
- Definition: Intrinsic surface color/texture.
- Properties: Independent of lighting, viewpoint.
- Example: Red ball has red reflectance regardless of lighting.
Shading (Illumination):
- Definition: Lighting effects on surface.
- Components: Direct illumination, shadows, inter-reflections.
- Properties: Depends on lighting, geometry, viewpoint.
Image Formation:
``
I(x) = R(x) ยท S(x)
Where:
- I(x): Observed image intensity at pixel x
- R(x): Reflectance (albedo)
- S(x): Shading (illumination)
``
Intrinsic Decomposition Approaches
Optimization-Based:
- Method: Formulate as energy minimization.
- Energy: Data term + priors (smoothness, sparsity).
- Priors:
- Reflectance is piecewise constant.
- Shading is smooth.
- Reflectance changes at texture edges, shading at geometry edges.
- Examples: Retinex, Intrinsic Images in the Wild.
Learning-Based:
- Method: Neural networks learn decomposition.
- Training: Supervised on synthetic or real data with ground truth.
- Examples: CGIntrinsics, IIW, ShapeNet Intrinsics.
- Benefit: Handle complex real-world images.
Physics-Based:
- Method: Model light transport, inverse rendering.
- Benefit: Physically accurate decomposition.
- Challenge: Requires scene geometry, material properties.
Challenges
Ill-Posed Problem:
- Ambiguity: Infinite (reflectance, shading) pairs can produce same image.
- Example: Dark reflectance + bright shading = bright reflectance + dark shading.
- Solution: Priors, constraints, learning from data.
Texture vs. Shading:
- Problem: Distinguish texture (reflectance) from shading.
- Example: Polka dots (texture) vs. shadows (shading).
- Solution: Multi-scale analysis, learned features.
Complex Lighting:
- Problem: Inter-reflections, subsurface scattering, transparency.
- Challenge: Simple reflectance ร shading model insufficient.
Ground Truth:
- Problem: Difficult to obtain ground truth for real images.
- Solution: Synthetic data, multi-illumination capture, crowdsourcing.
Intrinsic Decomposition Methods
Retinex:
- Classic: Separate reflectance and illumination based on gradients.
- Assumption: Reflectance has sharp edges, illumination is smooth.
- Limitation: Oversimplified, doesn't handle complex scenes.
Intrinsic Images in the Wild (IIW):
- Method: Learn from sparse human annotations.
- Annotations: Relative reflectance judgments (same/different material).
- Benefit: Scalable annotation, real-world data.
CGIntrinsics:
- Training: Synthetic data from 3D scenes.
- Network: CNN predicts reflectance and shading.
- Benefit: Large-scale training data.
ShapeNet Intrinsics:
- Training: Rendered 3D objects with known reflectance/shading.
- Benefit: Perfect ground truth for training.
Applications
Material Editing:
- Use: Change surface colors independently of lighting.
- Example: Recolor walls, furniture, clothing.
- Benefit: Realistic edits respecting lighting.
Relighting:
- Use: Change lighting while preserving materials.
- Process: Decompose โ modify shading โ recompose.
- Example: Change time of day, add/remove lights.
Object Recognition:
- Use: Recognize objects from reflectance (lighting-invariant).
- Benefit: Robust to lighting variations.
Augmented Reality:
- Use: Understand scene lighting for realistic AR.
- Benefit: Virtual objects match real lighting.
Computational Photography:
- Use: Advanced photo editing (selective relighting, material transfer).
- Benefit: Physically plausible edits.
Intrinsic Decomposition Techniques
Multi-Illumination:
- Method: Capture scene under multiple lighting conditions.
- Benefit: Resolve ambiguities, accurate decomposition.
- Challenge: Requires controlled capture.
Multi-View:
- Method: Use multiple viewpoints.
- Benefit: Geometric constraints aid decomposition.
Video:
- Method: Temporal consistency across frames.
- Benefit: More constraints, better decomposition.
Semantic Guidance:
- Method: Use semantic segmentation to guide decomposition.
- Benefit: Material boundaries align with semantic boundaries.
Quality Metrics
MSE (Mean Squared Error):
- Definition: Pixel-wise error in reflectance and shading.
- Limitation: Doesn't account for perceptual quality.
LMSE (Local MSE):
- Definition: MSE after local scaling (handles scale ambiguity).
- Benefit: More robust to global intensity shifts.
DSSIM (Structural Dissimilarity):
- Definition: 1 - SSIM (structural similarity).
- Benefit: Perceptually motivated.
Intrinsic Decomposition Datasets
MIT Intrinsic Images:
- Data: Real objects with ground truth from multi-illumination capture.
- Size: Small but high-quality.
IIW (Intrinsic Images in the Wild):
- Data: Real images with sparse human annotations.
- Size: Large-scale, diverse scenes.
ShapeNet Intrinsics:
- Data: Rendered 3D objects with perfect ground truth.
- Size: Large-scale synthetic data.
MPI Sintel:
- Data: Animated movie frames with ground truth.
- Use: Evaluation on complex scenes.
Future of Intrinsic Decomposition
- Single-Image: Accurate decomposition from single image.
- Real-Time: Fast decomposition for interactive applications.
- Video: Temporally consistent decomposition.
- Semantic: Integrate semantic understanding.
- Physics-Based: Incorporate physical light transport models.
- Generalization: Models that work across diverse scenes.
Intrinsic image decomposition is fundamental to computational photography and computer vision โ it enables understanding and manipulating images at the level of materials and lighting, supporting applications from photo editing to augmented reality to object recognition.