Core Concepts
This work presents a large-scale real-world dataset, Real-HDRV, to facilitate the development of HDR video reconstruction techniques, and proposes a two-stage alignment network to effectively handle complex motion and reconstruct high-quality HDR video.
Abstract
This paper addresses the problem of HDR video reconstruction from sequences with alternating exposures. The key contributions are:
- Real-HDRV Dataset:
  - Constructed a large-scale real-world dataset for HDR video reconstruction, featuring various scenes, diverse motion patterns, and high-quality labels.
  - The dataset contains 500 LDRs-HDRs video pairs, comprising about 28,000 LDR frames and 4,000 HDR labels, covering diverse indoor/outdoor, daytime/nighttime scenes.
  - Compared to existing datasets, Real-HDRV provides more diverse scenes and motion patterns, enabling better generalization of trained models.
- Two-Stage Alignment Network:
  - Proposed an end-to-end network for HDR video reconstruction, with a novel two-stage strategy that performs alignment sequentially.
  - The global alignment module (GAM) handles global motion by adaptively estimating global offsets.
  - The local alignment module (LAM) implicitly performs local alignment in a coarse-to-fine manner at the feature level using adaptive separable convolution.
  - Together, the two stages allow the network to handle complex motion and reconstruct high-quality HDR video.
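The paper does not include code here, but the adaptive-separable-convolution idea behind the LAM can be sketched roughly: for each output pixel, a pair of per-pixel 1D kernels (one vertical, one horizontal) is applied to the local patch of the neighboring frame's features, so alignment is performed implicitly without explicit flow. The function name, tensor shapes, and single-channel input below are assumptions for illustration; in the actual method the kernels are predicted by the network and applied to multi-channel features in a coarse-to-fine pyramid.

```python
import numpy as np

def separable_conv_align(feat, kv, kh):
    """Illustrative (not the paper's) adaptive separable convolution.

    feat: (H, W) single-channel feature map from a neighboring frame.
    kv, kh: (H, W, n) per-pixel vertical/horizontal 1D kernels
            (in the real network these would be predicted, not given).
    Returns the implicitly aligned feature map, shape (H, W).
    """
    H, W = feat.shape
    n = kv.shape[-1]
    r = n // 2
    padded = np.pad(feat, r, mode="edge")  # replicate borders
    out = np.empty_like(feat)
    for y in range(H):
        for x in range(W):
            patch = padded[y:y + n, x:x + n]
            # Outer product of the two 1D kernels weights the n x n patch.
            out[y, x] = kv[y, x] @ patch @ kh[y, x]
    return out
```

With kernels that place all their weight one pixel to the left, the output shifts the feature map accordingly, which is how per-pixel kernels can express local motion.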
Extensive experiments demonstrate that models trained on the Real-HDRV dataset outperform those trained on synthetic datasets when evaluated on real-world scenes. The proposed two-stage alignment network also outperforms previous state-of-the-art methods.
Stats
The proposed Real-HDRV dataset contains 500 LDRs-HDRs video pairs, comprising about 28,000 LDR frames and 4,000 HDR labels.
The dataset covers diverse indoor/outdoor, daytime/nighttime scenes with various motion patterns, including global motion, local motion, and full motion.
Quotes
"Models trained on our dataset can achieve better performance on real scenes than those trained on synthetic datasets."
"Our two-stage alignment network can effectively handle complex motion and reconstruct high-quality HDR video."