LLMs often rely on semantic shortcuts, matching surface-level patterns rather than reasoning systematically, which leads to hallucinations and failures on complex question-answering (QA) tasks.