We present an alternative latent noise space for denoising diffusion probabilistic models (DDPMs) that enables a wide range of editing operations via simple means. Our inversion method extracts noise maps that are distributed differently from those used in regular sampling, and are more edit-friendly. This allows diverse editing of real images without fine-tuning the model or modifying its attention maps.
We propose a method that maps input volumetric measurements to a latent space in which overlapping signal components are disentangled, so that they can be isolated and quantified by applying bandpass filters.
This paper presents an automated method for determining a threshold value that efficiently retrieves relevant images from a dataset for perception testing of automated driving systems while balancing false positives against false negatives.
By reformulating the diffusion process as a deterministic mapping between input images and output prediction distributions, and by fine-tuning pre-trained text-to-image diffusion models with low-rank adaptation, the proposed DMP approach leverages the inherent generalizability of diffusion models to perform dense prediction tasks such as 3D property estimation, semantic segmentation, and intrinsic image decomposition, even with limited training data in a specific domain.
The state-of-the-art ISNs image geolocation estimation model exhibits significant biases towards high-income regions and the Western world, leading to poor performance on the underrepresented SenseCity Africa dataset.
ADDP, an Alternating Denoising Diffusion Process, bridges pixel and token spaces, enabling the learning of general representations applicable to both image recognition and generation tasks.
A novel parallel proportional fusion architecture combining spiking neural networks and variational quantum circuits outperforms existing classical and hybrid models in image classification tasks, exhibiting superior accuracy, robustness, and noise immunity.
This paper proposes IRSS, a framework that gradually separates style distribution and spurious features from images by introducing adversarial neural networks and multi-environment optimization, achieving out-of-distribution generalization without additional supervision such as domain labels.
A feature space not explicitly trained for real-vs-fake classification can generalize significantly better than deep-learning-based methods when detecting fake images from unseen generative models.
FireANTs, a novel multi-scale Adaptive Riemannian Optimization algorithm, achieves state-of-the-art performance on diffeomorphic image registration tasks across various modalities and anatomies, while providing significant speedups of up to 2000x over existing methods.