Self-supervision versus synthetic datasets: which is the lesser evil in the context of video denoising?
Valéry Dewil
Arnaud Barral
Gabriele Facciolo
Pablo Arias
[Paper]
[Supp]
[GitHub]
Project developed at Centre Borelli, ENS Paris-Saclay, and accepted at the Vision Datasets Understanding workshop at CVPR 2022.

Abstract

Supervised training has led to state-of-the-art results in image and video denoising. However, its application to real data is limited since it requires large datasets of noisy-clean pairs that are difficult to obtain. For this reason, networks are often trained on realistic synthetic data. More recently, some self-supervised frameworks have been proposed for training such denoising networks directly on the noisy data, without requiring ground truth. On synthetic denoising problems, supervised training outperforms self-supervised approaches; in recent years, however, the gap has narrowed, especially for video. In this paper, we propose a study aiming to determine which is the best approach to train denoising networks for real raw videos: supervision on realistic synthetic data or self-supervision on real data. A complete study with quantitative results in the case of natural videos with real motion is impossible, since no dataset with clean-noisy pairs exists. We address this issue by considering three independent experiments in which we compare the two frameworks. We found that self-supervision on the real data outperforms supervision on synthetic data, and that in normal illumination conditions the drop in performance is due to the synthetic ground truth generation, not the noise model.



Comparison protocol

Experimental protocols. The surrogate real dataset is shown as the green cylinder; the synthetic dataset is shown in red. (a) The model-supervised network is trained with supervision on the synthetic dataset with synthetic noise (either with or without blind-spot). (b) The model-supervised networks from (a) are then fine-tuned on the surrogate dataset. The steps are: (1) fine-tuning on real data; (2) when possible, fine-tuning on real clean data with synthetic noise; (3) self-supervised fine-tuning directly on the noisy data (UDVD and MF2F).
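The difference between the supervised and blind-spot self-supervised training signals mentioned in the protocol can be illustrated with a toy sketch. It is not the UDVD or MF2F implementation and does not use the paper's noise model: the image, the Gaussian noise, the neighbour-averaging "denoiser" and the names `blind_spot_mean` and `compare_losses` are all assumptions made for illustration. The point it shows is that when a predictor never reads the pixel it is estimating and the noise is zero-mean and independent across pixels, the loss measured against the noisy target differs from the supervised loss only by (approximately) the noise variance, which is why training without ground truth is possible.

```python
import numpy as np


def blind_spot_mean(img):
    """Toy blind-spot predictor: average of the 4 neighbours of each pixel.

    The centre pixel is never read, so the prediction at a pixel is
    independent of the noise at that pixel (assuming pixel-wise
    independent noise).
    """
    padded = np.pad(img, 1, mode="reflect")
    return (
        padded[:-2, 1:-1]    # neighbour above
        + padded[2:, 1:-1]   # neighbour below
        + padded[1:-1, :-2]  # neighbour to the left
        + padded[1:-1, 2:]   # neighbour to the right
    ) / 4.0


def compare_losses(sigma=0.1, size=256, seed=0):
    """Return (supervised MSE, self-supervised MSE) for the toy predictor."""
    rng = np.random.default_rng(seed)

    # Hypothetical smooth "clean" image, standing in for ground truth.
    yy, xx = np.mgrid[0:size, 0:size] / size
    clean = 0.5 + 0.25 * np.sin(2 * np.pi * xx) * np.cos(2 * np.pi * yy)

    # Assumed noise: additive, zero-mean Gaussian, independent per pixel.
    noisy = clean + rng.normal(0.0, sigma, clean.shape)

    pred = blind_spot_mean(noisy)

    supervised = np.mean((pred - clean) ** 2)       # needs ground truth
    self_supervised = np.mean((pred - noisy) ** 2)  # needs only noisy data
    return supervised, self_supervised


if __name__ == "__main__":
    sigma = 0.1
    sup, self_sup = compare_losses(sigma=sigma)
    print(f"supervised MSE:        {sup:.5f}")
    print(f"self-supervised MSE:   {self_sup:.5f}")
    print(f"difference vs sigma^2: {self_sup - sup:.5f} vs {sigma ** 2:.5f}")
```

Because the two losses differ by a near-constant offset in this setting, minimising one also approximately minimises the other. With real raw video noise the independence assumption may only hold approximately.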



Paper and Supplementary Material

V. Dewil, A. Barral, G. Facciolo, P. Arias
Self-supervision versus synthetic datasets: which is the lesser evil in the context of video denoising?
In CVPR Workshops, 2022.
(hosted on ArXiv)


[Bibtex]


Acknowledgements

This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.