Large Language Models (LLMs) enhance video understanding through reasoning and self-refinement.
The Video Understanding and Reasoning Framework (VURF) applies LLMs to video tasks, using reasoning and self-refinement to improve results.