The core message of this work is that a clean, class-balanced subset can be effectively extracted from a noisy, long-tailed training dataset and used to train a robust classification model.
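As a rough illustration of the selection idea (not the paper's exact criterion), the sketch below keeps, for every class, the samples with the lowest loss under a warm-up model — the common small-loss heuristic — so the resulting subset is both approximately clean and class-balanced; the function name and arguments are hypothetical.

```python
import torch
from collections import defaultdict

def select_clean_balanced_subset(losses, labels, per_class):
    """Small-loss heuristic sketch: for each class, keep the `per_class` samples
    with the lowest loss under a warm-up model, yielding a subset that is
    approximately clean and class-balanced. Illustrative, not the paper's method."""
    indices_by_class = defaultdict(list)
    for idx, y in enumerate(labels.tolist()):
        indices_by_class[y].append(idx)
    keep = []
    for y, idxs in indices_by_class.items():
        idxs = sorted(idxs, key=lambda i: losses[i].item())  # low loss ~ likely clean
        keep.extend(idxs[:per_class])
    return torch.tensor(sorted(keep))
```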
The core message of this paper is that uninformative attention layers in vision transformers can be effectively integrated into their subsequent MLP layers, reducing computational load without compromising performance.
PEFTSmoothing leverages Parameter-Efficient Fine-Tuning (PEFT) methods to efficiently guide large-scale vision models like ViT to learn the noise-augmented data distribution, enabling the conversion of base models into certifiably robust classifiers.
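A minimal sketch of the training loop this implies, assuming a frozen torchvision ViT whose only trainable parameters are a small classification head standing in for a PEFT module; the noise level `sigma` and all names are illustrative rather than PEFTSmoothing's actual implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

sigma = 0.25                        # Gaussian noise level (assumed; used at train and certification time)
model = vit_b_16(weights="IMAGENET1K_V1")
for p in model.parameters():        # freeze the large backbone
    p.requires_grad = False
model.heads = nn.Linear(768, 10)    # small trainable head stands in for a PEFT module
optimizer = torch.optim.AdamW(model.heads.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One step on Gaussian-noise-augmented inputs so the frozen ViT plus the
    small trainable module learns the smoothed data distribution."""
    noisy = images + sigma * torch.randn_like(images)
    loss = criterion(model(noisy), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```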
The proposed Graph-based Vision Transformer (GvT) utilizes graph convolutional projection and talking-heads attention to effectively train on small datasets, outperforming convolutional neural networks and other vision transformer variants.
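For readers unfamiliar with the attention variant, the module below sketches talking-heads attention (learned mixing across heads before and after the softmax); the graph convolutional projection is omitted, and the class is illustrative rather than the GvT implementation.

```python
import torch
import torch.nn as nn

class TalkingHeadsAttention(nn.Module):
    """Multi-head self-attention with learned mixing across the head axis
    before and after the softmax. Sketch of the attention variant only."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # head-mixing projections applied over the num_heads dimension
        self.pre_softmax = nn.Linear(num_heads, num_heads, bias=False)
        self.post_softmax = nn.Linear(num_heads, num_heads, bias=False)

    def forward(self, x):                                   # x: (B, N, dim)
        B, N, _ = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)                # each: (B, H, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5   # (B, H, N, N)
        # mix information across heads before and after the softmax
        attn = self.pre_softmax(attn.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        attn = attn.softmax(dim=-1)
        attn = self.post_softmax(attn.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        out = (attn @ v).transpose(1, 2).reshape(B, N, -1)
        return self.proj(out)
```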
The authors propose a novel framework, Latent-based Diffusion Model for Long-tailed Recognition (LDMLR), that leverages the powerful generative capabilities of diffusion models to augment feature representations and address the challenge of long-tailed recognition in computer vision.
Deep neural networks used for image classification are vulnerable to adversarial attacks, which involve subtle manipulations of input data to cause misclassification. This study investigates the impact of FGSM and Carlini-Wagner attacks on three pre-trained CNN models, and examines the effectiveness of defensive distillation as a countermeasure.
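FGSM itself is compact enough to sketch: the function below shows the single-step sign-gradient perturbation, with an illustrative epsilon (the Carlini-Wagner attack and defensive distillation are not shown).

```python
import torch

def fgsm_attack(model, images, labels, epsilon=8 / 255):
    """Fast Gradient Sign Method: move each pixel one epsilon-sized step in
    the direction that increases the classification loss."""
    images = images.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0, 1).detach()
```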
The authors propose a novel framework for building Concept Bottleneck Models (CBMs) from pre-trained multi-modal encoders like CLIP. Their approach leverages Gumbel tricks and contrastive learning to create sparse and interpretable inner representations in the CBM, leading to significant improvements in accuracy compared to prior CBM methods.
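A plausible minimal sketch of such a bottleneck layer, assuming frozen CLIP image features as input: each concept is gated by a hard Gumbel-softmax sample, giving a sparse, discrete inner representation that a linear layer then classifies. The contrastive training objective is omitted, and the module is hypothetical rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelConceptBottleneck(nn.Module):
    """Sketch: frozen CLIP image features -> discrete on/off concepts sampled
    with straight-through Gumbel-softmax -> linear classifier over concepts."""
    def __init__(self, feat_dim=512, num_concepts=128, num_classes=10, tau=1.0):
        super().__init__()
        self.concept_logits = nn.Linear(feat_dim, num_concepts * 2)
        self.classifier = nn.Linear(num_concepts, num_classes)
        self.tau = tau

    def forward(self, clip_features):                       # (B, feat_dim)
        logits = self.concept_logits(clip_features)         # (B, num_concepts * 2)
        logits = logits.view(-1, logits.shape[-1] // 2, 2)  # (B, C, 2): on/off per concept
        # hard samples with usable gradients via the straight-through estimator
        concepts = F.gumbel_softmax(logits, tau=self.tau, hard=True)[..., 0]
        return self.classifier(concepts), concepts
```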
Using synthetic images of future classes generated by pre-trained text-to-image diffusion models can significantly improve the performance of exemplar-free class incremental learning methods relying on a frozen feature extractor.
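One way this could look in code, using the Hugging Face diffusers pipeline to synthesize images for classes that have not yet arrived; the model id, prompts, and class names are placeholders, not the paper's setup.

```python
import torch
from diffusers import StableDiffusionPipeline

# Generate placeholder images for not-yet-seen classes; these would later be
# passed through the frozen feature extractor alongside real data.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

future_classes = ["zebra", "canoe", "lighthouse"]          # hypothetical class names
synthetic_images = {
    name: pipe(f"a photo of a {name}", num_inference_steps=30).images[0]
    for name in future_classes
}
```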
Large Language Models (LLMs) can provide valuable visual descriptions and knowledge to enhance the performance of pre-trained vision-language models like CLIP in low-shot image classification tasks.
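A small sketch of the usual description-ensembling recipe this enables, using OpenAI's CLIP package: each class embedding is the average of several LLM-generated descriptions, and images are classified by cosine similarity. The descriptions below are made up for illustration and would normally come from prompting an LLM.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical LLM-generated descriptions per class.
descriptions = {
    "hen": ["a photo of a hen", "a bird with a small red comb and brown feathers"],
    "goldfish": ["a photo of a goldfish", "a small orange fish with shiny scales"],
}

with torch.no_grad():
    class_embeds = []
    for texts in descriptions.values():
        emb = model.encode_text(clip.tokenize(texts).to(device))
        emb = emb / emb.norm(dim=-1, keepdim=True)
        class_embeds.append(emb.mean(dim=0))      # average the descriptions per class
    class_embeds = torch.stack(class_embeds)
    class_embeds = class_embeds / class_embeds.norm(dim=-1, keepdim=True)

def classify(image_pil):
    """Zero-shot prediction against the description-averaged class embeddings."""
    with torch.no_grad():
        feat = model.encode_image(preprocess(image_pil).unsqueeze(0).to(device))
        feat = feat / feat.norm(dim=-1, keepdim=True)
        return (feat @ class_embeds.T).argmax(dim=-1).item()
```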
The core message of this paper is that state-of-the-art image classification models carry inherent geographical biases, and that analyzing and mitigating these biases makes the models more robust and fair across geographical regions and income levels.