NegLabel, a novel post hoc OOD detection method, uses a large set of negative labels to distinguish in-distribution (ID) from out-of-distribution (OOD) samples by comparing each sample's affinity to the ID labels against its affinity to the negative labels.
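The affinity comparison behind NegLabel can be illustrated with a minimal sketch. It assumes that image and label embeddings come from some shared vision-language embedding space, that affinity is cosine similarity, and that the score is the share of softmax mass falling on the ID labels. The function names, the temperature value, and the toy 3-dimensional embeddings are all hypothetical and chosen for illustration; how the negative labels are selected is not covered here.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.clip(norm, 1e-12, None)


def negative_label_score(image_emb, id_label_embs, neg_label_embs, temperature=0.01):
    """Return the fraction of softmax mass assigned to ID labels.

    Values near 1 suggest an ID sample; values near 0 suggest an OOD sample.
    The temperature is an assumed hyperparameter, not a value from the source.
    """
    img = l2_normalize(np.asarray(image_emb, dtype=float))
    id_embs = np.asarray(id_label_embs, dtype=float)
    neg_embs = np.asarray(neg_label_embs, dtype=float)
    labels = l2_normalize(np.vstack([id_embs, neg_embs]))

    logits = labels @ img / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum()
    return float(probs[: len(id_embs)].sum())


# Toy embeddings (hypothetical): two ID labels and two negative labels.
id_labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
neg_labels = np.array([[0.0, 0.0, 1.0], [0.0, 0.6, 0.8]])

id_score = negative_label_score([0.9, 0.1, 0.0], id_labels, neg_labels)
ood_score = negative_label_score([0.0, 0.1, 0.9], id_labels, neg_labels)
print(f"ID-like sample score:  {id_score:.4f}")
print(f"OOD-like sample score: {ood_score:.4f}")
```

A sample would then be flagged as OOD when its score falls below a threshold chosen on held-out ID data. Because the score is computed on top of frozen embeddings, no retraining is involved, which is what makes an approach like this post hoc.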
A unified framework that uses both visual and verbal references to improve target perception and discrimination in natural language tracking.
Leveraging emerging multimodal, vision-language foundation models as a lens to reason about and formally verify vision-based deep neural networks in terms of human-understandable concepts.