  • 1 Cornell University
  • 2 Zhejiang University
  • 3 Adobe Research
  • 4 University of Georgia

Neural Gaffer is an end-to-end 2D relighting diffusion model that accurately relights any object in a single image under various lighting conditions.



Abstract


Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting. Many prior methods either support only specific categories of images, such as portraits, or require special capture conditions, like using a flashlight. Alternatively, some methods explicitly decompose a scene into intrinsic components, such as normals and BRDFs, which can be inaccurate or under-expressive. In this work, we propose a novel end-to-end 2D relighting diffusion model, called Neural Gaffer, that takes a single image of any object and can synthesize an accurate, high-quality relit image under any novel environmental lighting condition, simply by conditioning an image generator on a target environment map, without an explicit scene decomposition. Our method builds on a pre-trained diffusion model and fine-tunes it on a synthetic relighting dataset, revealing and harnessing the inherent understanding of lighting present in the diffusion model. We evaluate our model on both synthetic and in-the-wild Internet imagery and demonstrate its advantages in terms of generalization and accuracy. Moreover, by combining it with other generative methods, our model enables many downstream 2D tasks, such as text-based relighting and object insertion. Our model can also operate as a strong relighting prior for 3D tasks, such as relighting a radiance field.
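To make the conditioning described above concrete, below is a minimal PyTorch sketch of one plausible way to condition a pre-trained diffusion U-Net on both the input image and a target environment map; the module structure and names (RelitImageDenoiser, env_encoder, fuse) are illustrative assumptions, not the released implementation.

import torch
import torch.nn as nn

class RelitImageDenoiser(nn.Module):
    # Fine-tunable wrapper around a pre-trained diffusion U-Net (hypothetical interface):
    # it denoises the relit-image latent conditioned on the input-image latent and an
    # encoding of the target environment map.
    def __init__(self, pretrained_unet: nn.Module, latent_channels: int = 4):
        super().__init__()
        self.unet = pretrained_unet
        self.env_encoder = nn.Sequential(          # small CNN encoding the HDR env map
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, latent_channels, 3, stride=2, padding=1),
        )
        # Fuse noisy relit latent + input-image latent + env features into the U-Net input
        self.fuse = nn.Conv2d(3 * latent_channels, latent_channels, 1)

    def forward(self, noisy_latent, input_latent, env_map, t):
        env_feat = self.env_encoder(env_map)
        env_feat = nn.functional.interpolate(env_feat, size=noisy_latent.shape[-2:])
        x = self.fuse(torch.cat([noisy_latent, input_latent, env_feat], dim=1))
        return self.unet(x, t)                     # predicted noise on the relit latent

Training would then follow the standard diffusion denoising objective on the synthetic relighting dataset, with the input-image latent and environment map held fixed as conditions.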


Single-image Relighting Comparisons with IC-Light


We compare with IC-Light on single-image relighting under unseen lighting conditions. Our method produces more consistent and stable relighting results as the lighting changes and rotates.
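The rotating-lighting sweep used in this comparison can be produced for any environment map by rotating it about the vertical axis; for an equirectangular map this amounts to a horizontal roll. The helper below is an illustrative assumption (not code from the paper):

import numpy as np

def rotate_env_map(env: np.ndarray, azimuth_deg: float) -> np.ndarray:
    # Rotate an HxWx3 equirectangular environment map about the up axis by
    # shifting it along the longitude (width) dimension.
    h, w, _ = env.shape
    shift = int(round(azimuth_deg / 360.0 * w))
    return np.roll(env, shift, axis=1)

# Example: 12 lighting conditions spanning one full rotation
# frames = [rotate_env_map(env, a) for a in np.linspace(0, 360, 12, endpoint=False)]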





Relighting 3D Objects Comparisons


Given an input radiance field of a 3D object, our method can directly relight its appearance field in just minutes, without the inverse-rendering reconstruction process that conventional methods require. We compare against the conventional inverse-rendering methods TensoIR and NVDIFFREC-MC on 3D relighting under unseen lighting conditions, achieving better visual quality, as shown below. (The following videos are raw results without channel alignment.)
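As a rough sketch of how such an application could be wired up (hypothetical interfaces; field.render, field.appearance_parameters, and diffusion_relighter are assumed names, not the authors' API): render views from the input radiance field, relight each view with the 2D diffusion model, then optimize only the appearance parameters against the relit views.

import torch

def relight_radiance_field(field, diffusion_relighter, cameras, env_map, steps=2000):
    # 1) Relight each rendered view of the original object with the 2D model.
    relit_targets = []
    with torch.no_grad():
        for cam in cameras:
            rgb = field.render(cam)
            relit_targets.append(diffusion_relighter(rgb, env_map))

    # 2) Fine-tune the appearance branch only; geometry (density) stays fixed.
    opt = torch.optim.Adam(field.appearance_parameters(), lr=1e-3)
    for step in range(steps):
        i = step % len(cameras)
        pred = field.render(cameras[i])
        loss = torch.nn.functional.mse_loss(pred, relit_targets[i])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return field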



Related Works


  • DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation: A recent work that enables fine-grained lighting control for diffusion-based image generation, which can be used for single-image relighting.
  • IllumiNeRF: 3D Relighting without Inverse Rendering: A concurrent work that enables 3D relighting without inverse rendering, similar to our 3D relighting application.
Citation


    If you find our work useful in your research, please consider citing:

    @inproceedings{jin2024neural_gaffer,
      title     = {Neural Gaffer: Relighting Any Object via Diffusion},
      author    = {Haian Jin and Yuan Li and Fujun Luan and Yuanbo Xiangli and Sai Bi and Kai Zhang and Zexiang Xu and Jin Sun and Noah Snavely},
      booktitle = {Advances in Neural Information Processing Systems},
      year      = {2024},
    }
                

    Acknowledgements


  • This work was done while Haian Jin was a full-time student at Cornell. The selection of data and the generation of all figures and results were led by Cornell University.
  • The website template was borrowed from TensoIR and ClimateNeRF.