
NTIRE 2023 Image Shadow Removal Challenge Technical Report by Team IIM TTI

Core Concepts
Improving shadow removal methods for the NTIRE 2023 Challenge.
The technical report discusses the improvements Team IIM TTI made for the NTIRE 2023 Image Shadow Removal Challenge, covering image alignment, perceptual quality loss functions, shadow detection, joint learning, and data augmentation. The team identified five problems: misalignment caused by external camera parameters changing between shots, the limited perceptual quality of pixel-wise losses, the difficulty of applying existing shadow detectors, the independent optimization of detector and remover, and insufficient data augmentation. They addressed these with homography-based image alignment, new loss functions for structure preservation and SSIM, semi-automatic shadow mask annotation, joint learning of the detector and remover, and new data augmentation for shadow removal. The resulting method achieved competitive scores in LPIPS and Mean Opinion Score, showing promising results in refining both shadow detection and shadow removal.
Our method achieved an LPIPS score of 0.196 (3rd out of 19) and a Mean Opinion Score (MOS) of 7.44 (3rd out of 19). Training took 60 hours, the model uses 55 million parameters, and runtime is around 1010 ms on an A100 GPU.
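As a hedged illustration of the SSIM-based loss direction mentioned above, here is a minimal sketch of an SSIM term that can complement pixel-wise losses. This computes a single global SSIM value, whereas practical implementations use local windows; the constants follow the standard SSIM formulation for 8-bit images, and the names `ssim_global` and `ssim_loss` are chosen here for illustration, not taken from the report:

```python
import numpy as np

def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Global SSIM between two 8-bit images.

    Practical SSIM averages the same formula over local windows;
    this global variant keeps the sketch short.
    """
    x, y = x.astype(float), y.astype(float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_loss(pred, target):
    """Loss to minimize: 1 - SSIM, usable alongside L1/L2 terms."""
    return 1.0 - ssim_global(pred, target)
```

Identical images give an SSIM of exactly 1 (loss 0), and the loss grows as structural similarity degrades.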

Key Insights Distilled From

by Yuki Kondo, R... at 03-15-2024
NTIRE 2023 Image Shadow Removal Challenge Technical Report

Deeper Inquiries

How can the proposed improvements impact real-world applications beyond this challenge?

The proposed improvements, such as image alignment, a perceptual quality loss function, semi-automatic annotation for shadow detection, joint learning of shadow detection and removal, and new data augmentation techniques, can have significant implications in real-world applications:

1. Image Alignment: Aligning shadow and shadow-free image pairs with a homography transformation helps wherever misalignment arises from external factors such as camera parameters changing between shots, ensuring accurate processing despite such variations.
2. Perceptual Quality Loss Function: Incorporating a structure preservation loss instead of only traditional pixel-wise losses such as L1 or L2 lets the model focus on maintaining high-quality visual results rather than just minimizing pixel-level differences, which matters for image restoration and enhancement tasks.
3. Semi-Automatic Annotation: Generating shadow masks with human input only where necessary makes labeling faster and more cost-effective, an approach that could streamline data annotation in many computer vision applications beyond shadow removal.
4. Joint Learning: Connecting the shadow detector to the shadow remover allows end-to-end training of both networks simultaneously, leveraging shared information between the tasks; the same strategy applies to any domain with multiple interconnected models.
5. Data Augmentation Techniques: New schemes like CutShadow create pseudo-training data from existing pairs of images with and without shadows, improving model generalization and robustness across datasets and unseen scenarios.
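The homography-based alignment in item 1 can be sketched with the classic direct linear transform (DLT). This is a minimal illustration assuming known, exact point correspondences; a real pipeline would first match image features and use robust estimation such as RANSAC, and the function names here are chosen for the sketch, not taken from the report:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src via DLT.

    src, dst: (N, 2) arrays of corresponding points, N >= 4.
    Each correspondence contributes two linear constraints on the
    nine entries of H; the solution is the null vector of the
    stacked system, i.e. the smallest right singular vector.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]          # fix scale (and sign) of the solution

def warp_points(H, pts):
    """Apply homography H to (N, 2) points (homogeneous divide)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]
```

With the homography estimated, the shadow-free image can be warped onto the shadow image's coordinate frame before computing training losses.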

What are potential drawbacks or limitations to the team's approach to shadow removal?

While the team's approach shows promising results in improving ShadowFormer for image shadow removal, it has some potential drawbacks or limitations:

1. Complexity: Advanced techniques such as joint learning and sophisticated loss functions may increase computational complexity and training time.
2. Dependency on Annotations: The semi-automatic annotation method still relies on human input for certain aspects, which might introduce subjectivity or errors into the dataset.
3. Generalization: The effectiveness of these enhancements may vary in diverse real-world scenarios not fully represented in the training dataset.
4. Scalability: Some of the approaches may not scale well to large datasets or to deployment in resource-constrained environments.

How can the concept of joint learning be applied to other image processing tasks outside of shadow removal?

Joint learning is a versatile technique that can benefit many image processing tasks beyond shadow removal:

1. Image Restoration: In denoising, deblurring, or super-resolution, jointly optimizing pre-processing steps (such as noise reduction) with the main restoration network can enhance overall performance.
2. Object Detection: Training detection networks together with segmentation models improves accuracy through shared feature representations and mutual feedback mechanisms.
3. Semantic Segmentation: Combining semantic segmentation with instance segmentation in a joint learning framework supports better understanding of complex scenes while distinguishing individual objects within them.
4. Style Transfer: Incorporating style recognition modules alongside generative networks helps achieve more faithful rendering of the desired style onto content images.
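The core mechanic of joint learning, one final loss backpropagated through connected networks, can be illustrated with a deliberately tiny sketch. The two linear "stages" below are hypothetical toy stand-ins for a detector and a remover (not the team's ShadowFormer pipeline), and the gradients are written out by hand to make the chain rule explicit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a "detector" stage (W1) feeding a "remover" stage (W2),
# both linear for clarity. Joint learning means the final loss is
# backpropagated through BOTH stages, so the first stage is optimized
# for what the second stage actually needs, not in isolation.
W1 = 0.1 * rng.normal(size=(4, 8))   # detector weights (toy)
W2 = 0.1 * rng.normal(size=(1, 4))   # remover weights (toy)

X = rng.normal(size=(8, 64))         # 64 toy training samples
Y = rng.normal(size=(1, 8)) @ X      # regression target

lr, losses = 0.05, []
for _ in range(2000):
    H = W1 @ X                       # detector output (intermediate)
    P = W2 @ H                       # remover output (final prediction)
    E = P - Y
    losses.append((E ** 2).mean())   # single end-to-end loss
    gP = 2 * E / E.size              # dLoss/dP
    gW2 = gP @ H.T                   # gradient reaching the remover
    gW1 = (W2.T @ gP) @ X.T          # chain rule: loss reaches the
    W2 -= lr * gW2                   # detector THROUGH the remover
    W1 -= lr * gW1
```

In a real system each stage is a deep network and an autograd framework handles the chain rule, but the coupling is the same: the detector's update direction depends on the remover's weights, which is what makes the training "joint".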