AI is being used to further exploit and marginalize workers in the Global Majority, while benefiting a privileged few in the tech industry.
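This work proposes a new deep learning framework that uses Fourier neural operators (FNOs) to train on images of different sizes simultaneously, so classification can be performed regardless of input image size.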
A novel deep learning framework based on Fourier neural operators can effectively classify 3D digital porous media of varying sizes, outperforming the intuitive approach.
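To illustrate why a Fourier-based layer can accept inputs of varying sizes, here is a minimal NumPy sketch. It is not the authors' implementation: it works in 2D rather than 3D, keeps only one low-frequency corner of the spectrum, uses random rather than trained weights, and assumes global average pooling as the size-independent readout. The names `spectral_conv2d`, `classify`, `spectral_weights`, and `head` are hypothetical.

```python
import numpy as np

def spectral_conv2d(x, weights):
    """Simplified Fourier layer: scale the lowest retained modes, zero the rest.

    The weight shape depends only on the number of retained modes,
    never on the image size, so one set of weights fits any input
    that has at least that many modes along each axis.
    """
    m1, m2 = weights.shape
    h, w = x.shape
    x_hat = np.fft.rfft2(x)                        # shape (h, w // 2 + 1)
    out_hat = np.zeros_like(x_hat)
    out_hat[:m1, :m2] = x_hat[:m1, :m2] * weights  # truncate and reweight
    return np.fft.irfft2(out_hat, s=(h, w))        # back to the input size

def classify(x, spectral_weights, head):
    """One channel per weight matrix, ReLU, global average pool, linear head."""
    feats = np.array([
        np.maximum(spectral_conv2d(x, w), 0.0).mean()  # pooling removes the size
        for w in spectral_weights
    ])
    return feats @ head

rng = np.random.default_rng(0)
# Assumed sizes for illustration: 8 channels, 4 x 4 retained modes, 3 classes.
spectral_weights = rng.standard_normal((8, 4, 4)) + 1j * rng.standard_normal((8, 4, 4))
head = rng.standard_normal((8, 3))

small = rng.standard_normal((16, 16))
large = rng.standard_normal((32, 48))
logits_small = classify(small, spectral_weights, head)
logits_large = classify(large, spectral_weights, head)
```

Both inputs pass through the same weights and yield a logit vector of the same length, which is the property the summary relies on.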
A simple U-Net segmentation-based model outperforms the latest change detection models.
Leveraging comments together with a novel contrastive pre-training strategy aligns the video and language modalities more effectively, improving short-form video humor detection.
A simple U-Net segmentation baseline without specialized architectural changes or training tricks outperforms the latest state-of-the-art change detection models on standard benchmark datasets.
LimSim++ is an open-source platform that enables the evaluation and continuous learning of multimodal large language models (MLLMs) in autonomous driving scenarios.
The proposed method, MoXI, efficiently and accurately identifies a group of image patches that collectively have a high impact on the prediction confidence of an image classifier. MoXI leverages game-theoretic concepts of Shapley values and interactions to capture both the individual and cooperative contributions of image patches.
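The game-theoretic quantities mentioned above can be illustrated on a toy example. The sketch below computes exact Shapley values and the standard pairwise Shapley interaction index by brute-force enumeration. It is not MoXI's algorithm, which the summary describes as efficient, and the value function `toy_confidence` is a made-up stand-in for a classifier's confidence when only a subset of patches is visible.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley value of each player by enumerating all coalitions."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

def shapley_interaction(i, j, players, value):
    """Pairwise Shapley interaction index between players i and j."""
    n = len(players)
    others = [q for q in players if q not in (i, j)]
    total = 0.0
    for k in range(n - 1):
        weight = factorial(k) * factorial(n - k - 2) / factorial(n - 1)
        for subset in combinations(others, k):
            s = frozenset(subset)
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total

def toy_confidence(visible):
    """Hypothetical confidence: patch 0 helps alone, patches 1 and 2 cooperate."""
    score = 0.0
    if 0 in visible:
        score += 0.2
    if 1 in visible:
        score += 0.1
    if 1 in visible and 2 in visible:
        score += 0.4  # cooperative contribution of the pair
    return score

patches = [0, 1, 2]
phi = shapley_values(patches, toy_confidence)
pair_12 = shapley_interaction(1, 2, patches, toy_confidence)
pair_01 = shapley_interaction(0, 1, patches, toy_confidence)
```

In this toy game the cooperative bonus of 0.4 is split evenly between patches 1 and 2 in their Shapley values, while the interaction index attributes it to the pair as a whole. Patch 0 contributes independently, so its interaction with patch 1 is zero.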
This dataset aims to improve the performance of face recognition models at extreme head poses by providing a large, diverse, high-resolution collection of faces in extreme poses.
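Existing face datasets consist mostly of frontal-pose images, which degrades the performance of deep learning models on faces in extreme poses. To address this, this work proposes the EFHQ dataset, consisting of 450,000 high-quality face images in extreme poses.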