Neural Video Portrait Relighting in Real-time via Consistency Modeling

Longwen Zhang1,2   Qixuan Zhang1,2   Minye Wu1,3   Jingyi Yu1   Lan Xu1
1ShanghaiTech University   2Deemos Technology   3University of Chinese Academy of Sciences


Abstract

Video portrait relighting is critical in user-facing human photography, especially for immersive VR/AR experiences. Recent advances still fail to recover consistent relit results under dynamic illumination from a monocular RGB stream, suffering from the lack of video consistency supervision. In this paper, we propose a neural approach for real-time, high-quality and coherent video portrait relighting, which jointly models semantic, temporal and lighting consistency using a new dynamic OLAT dataset. We propose a hybrid structure and lighting disentanglement in an encoder-decoder architecture, combined with a multi-task and adversarial training strategy for semantic-aware consistency modeling. We adopt a temporal modeling scheme via flow-based supervision to encode the conjugated temporal consistency in a cross manner. We also propose a lighting sampling strategy to model illumination consistency and mutation for natural portrait light manipulation in the real world. Extensive experiments demonstrate the effectiveness of our approach for consistent video portrait light-editing and relighting, even on mobile computing devices.


Pipeline

The training pipeline of our approach. It consists of structure and lighting disentanglement (Sec. 4.1), temporal consistency modeling (Sec. 4.2) and lighting sampling (Sec. 4.3), so as to generate consistent relit video results from an RGB stream in real time.
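To make the idea of flow-based temporal supervision more concrete, the following is a minimal NumPy sketch of a generic temporal consistency loss: the relit frame at time t+1 is warped back to time t with optical flow and compared with the relit frame at time t. This is an illustration under stated assumptions, not the loss used in the paper. The function names `warp_nearest` and `temporal_consistency_loss`, the nearest-neighbour sampling, and the L1 penalty are all hypothetical choices made here for brevity.

```python
import numpy as np


def warp_nearest(frame, flow):
    """Backward-warp `frame` (H, W, C) with `flow` (H, W, 2), nearest-neighbour.

    Assumption: flow[y, x] = (dx, dy) maps pixel (x, y) of the reference
    frame to its corresponding location in `frame`.
    Returns the warped frame and a boolean mask of in-bounds pixels.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    # Pixels whose correspondence falls outside the image carry no supervision.
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    src_x = np.clip(src_x, 0, w - 1)
    src_y = np.clip(src_y, 0, h - 1)
    return frame[src_y, src_x], valid


def temporal_consistency_loss(relit_t, relit_next, flow_t_to_next):
    """Mean L1 difference between relit_t and the flow-warped relit_next."""
    warped, valid = warp_nearest(relit_next, flow_t_to_next)
    diff = np.abs(relit_t - warped).mean(axis=-1)  # per-pixel error, (H, W)
    n_valid = valid.sum()
    if n_valid == 0:
        return 0.0
    return float((diff * valid).sum() / n_valid)


# Usage: a frame shifted right by one pixel, with a matching flow field.
rng = np.random.default_rng(0)
relit_t = rng.random((4, 6, 3))
relit_next = np.roll(relit_t, 1, axis=1)
flow = np.zeros((4, 6, 2))
flow[..., 0] = 1.0  # every pixel moves one pixel to the right
loss_matched = temporal_consistency_loss(relit_t, relit_next, flow)
loss_no_flow = temporal_consistency_loss(relit_t, relit_next, np.zeros((4, 6, 2)))
```

In this toy example the loss vanishes when the flow matches the true motion and is positive when the motion is ignored, which is the behaviour such a penalty is meant to encourage. A training implementation would typically use differentiable bilinear sampling and an occlusion mask instead of the nearest-neighbour lookup assumed here.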


Gallery

Our relighting results under dynamic illuminations. Each triplet shows the input frame and two example relit results.


Results

YouTube video




Dataset



Code

We will publish the code and data for training (coming soon).


Downloads

Paper (thecvf)
link
arXiv
link


Citation

@InProceedings{Zhang_2021_ICCV,
    author    = {Zhang, Longwen and Zhang, Qixuan and Wu, Minye and Yu, Jingyi and Xu, Lan},
    title     = {Neural Video Portrait Relighting in Real-Time via Consistency Modeling},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {802-812}
}


Acknowledgments

The authors would like to thank all participants of the Light Stage recordings. We also thank the authors of Wang et al. [2020] for providing the results of their method for comparison.


Contact