Iteratively training language models to generate shorter yet accurate reasoning paths lets them learn to skip steps like human experts, potentially yielding more efficient problem solving and improved generalization; a rough sketch of such a loop follows.
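A minimal sketch of what such an iterative shortening loop could look like, assuming hypothetical `generate_paths`, `is_correct`, and `finetune` helpers (illustrative names, not the paper's API):

```python
# Hypothetical sketch of iterative reasoning-path shortening.
# generate_paths, is_correct, and finetune are placeholders, not a real API.

def shorten_reasoning(model, problems, rounds=3, samples_per_problem=8):
    for _ in range(rounds):
        training_set = []
        for problem in problems:
            # Sample several candidate chains of thought per problem.
            paths = generate_paths(model, problem, n=samples_per_problem)
            # Keep only paths that reach the correct answer.
            correct = [p for p in paths if is_correct(problem, p)]
            if correct:
                # Retain the shortest correct path as the new training
                # target, nudging the model toward skipping redundant steps.
                training_set.append((problem, min(correct, key=len)))
        # Fine-tune on the shortened targets before the next round.
        model = finetune(model, training_set)
    return model
```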
Language models demonstrate sensitivity to both taxonomic relations and categorical similarity when performing property inheritance, suggesting that these mechanisms are not mutually exclusive and may be fundamentally entangled in model representations.
We introduce SMART (Self-learning Meta-strategy Agent for Reasoning Tasks), a novel framework that lets large language models (LLMs) autonomously learn and select strategies suited to complex reasoning tasks, improving both accuracy and efficiency.
Language models (LMs) can significantly improve their reasoning abilities and strategy selection for complex tasks through a self-learning approach called SMART, which leverages reinforcement learning to optimize strategy choice without relying on multiple refinement steps.
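SMART's exact reinforcement-learning setup is not detailed here; as a rough illustration of learned strategy selection, the epsilon-greedy sketch below treats strategy choice as a bandit problem (the strategy names and update rule are assumptions for illustration only):

```python
import random
from collections import defaultdict

# Illustrative epsilon-greedy strategy selector, not SMART's actual method.
STRATEGIES = ["direct_answer", "chain_of_thought", "decompose_subgoals"]

class StrategySelector:
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.value = defaultdict(float)  # running value estimate per strategy
        self.count = defaultdict(int)

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(STRATEGIES)                  # explore
        return max(STRATEGIES, key=lambda s: self.value[s])   # exploit

    def update(self, strategy, reward):
        # Incremental mean update toward the observed reward
        # (e.g., 1.0 if the strategy produced a correct answer in one pass).
        self.count[strategy] += 1
        step = 1.0 / self.count[strategy]
        self.value[strategy] += step * (reward - self.value[strategy])
```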
While experimental contexts like in-context examples and instructions can improve the ability of large language models (LLMs) to perform semantic property inheritance, this ability remains inconsistent and prone to relying on shallow heuristics, particularly when the task format creates a direct link between the output and positional cues.
PRefLexOR is a novel framework that enhances the reasoning capabilities of language models by combining preference optimization with recursive learning, enabling them to generate more coherent, accurate, and insightful responses.
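For context, preference-optimization frameworks of this kind typically build on a pairwise objective; the DPO-style loss below is a generic illustration of that family, not necessarily PRefLexOR's exact formulation:

```python
import torch
import torch.nn.functional as F

# Generic DPO-style preference loss, shown only to illustrate the kind of
# pairwise objective such frameworks build on.

def preference_loss(logp_chosen, logp_rejected,
                    ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """All inputs are summed token log-probabilities of full responses."""
    # Log-ratio of the policy vs. a frozen reference model per response.
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    # Widen the margin between preferred and dispreferred responses.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```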
To improve the reasoning ability of large language models, we propose a method that analyzes the semantic consistency of reasoning paths within the self-consistency framework and weights them accordingly.
Incorporating weighted reasoning paths into the self-consistency framework enhances the reasoning capabilities of large language models (LLMs) by leveraging semantic similarity to identify and prioritize more reliable reasoning paths, leading to improved accuracy in various reasoning tasks.
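A minimal sketch of similarity-weighted voting, assuming a sentence-embedding model (`sentence-transformers` here) and using mean pairwise cosine similarity as each path's weight, an illustrative choice rather than the paper's exact scheme:

```python
from collections import defaultdict
from sentence_transformers import SentenceTransformer  # assumed dependency

def weighted_self_consistency(paths, answers):
    """paths: sampled reasoning chains; answers: their final answers."""
    n = len(paths)
    if n == 1:
        return answers[0]
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(paths, normalize_embeddings=True)
    sim = emb @ emb.T  # cosine similarities of normalized embeddings
    # A path that agrees semantically with many others gets more weight;
    # subtract the self-similarity of 1.0 before averaging.
    weights = (sim.sum(axis=1) - 1.0) / (n - 1)
    votes = defaultdict(float)
    for answer, w in zip(answers, weights):
        votes[answer] += w  # weighted vote instead of a plain count
    return max(votes, key=votes.get)
```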
Training language models for reasoning tasks by sampling synthetic data from smaller, weaker language models (instead of larger, more expensive ones) proves to be more compute-optimal, leading to improved performance and generalization capabilities.
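The budget argument can be made concrete with back-of-the-envelope arithmetic: inference cost per token scales roughly with parameter count, so at a fixed FLOPs budget a smaller model buys proportionally more samples. The model sizes and token count below are illustrative placeholders, not the paper's setup:

```python
# Rule of thumb: inference cost per token is roughly 2 * params FLOPs.

def samples_at_fixed_budget(budget_flops, params, tokens_per_sample):
    return budget_flops // (2 * params * tokens_per_sample)

BUDGET = 1e18   # arbitrary fixed FLOPs budget
TOKENS = 512    # assumed tokens per sampled solution

strong = samples_at_fixed_budget(BUDGET, 27e9, TOKENS)  # e.g. a 27B model
weak = samples_at_fixed_budget(BUDGET, 9e9, TOKENS)     # e.g. a 9B model

# The weaker model buys ~3x the samples, hence higher coverage and
# diversity, which is the lever behind the compute-optimality claim.
print(f"strong: {strong:.0f} samples, weak: {weak:.0f} samples")
```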
This research explores fine-tuning techniques to enhance causal reasoning in language models by leveraging counterfactual feedback, demonstrating that directly targeting causal consistency leads to significant improvements in reasoning performance.
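One way such a pipeline could assemble counterfactual training data, with `make_counterfactual`, `ask_model`, and `causally_consistent` as hypothetical placeholders rather than the paper's components:

```python
# Hypothetical sketch of building counterfactual-feedback training data.
# make_counterfactual, ask_model, and causally_consistent are placeholders.

def build_counterfactual_dataset(model, examples):
    dataset = []
    for question, cause, effect in examples:
        # e.g. "The sprinkler was on, so why is the grass wet?"
        factual = ask_model(model, question)
        # Flip the stated cause: "If the sprinkler had been off, ..."
        cf_question = make_counterfactual(question, cause)
        counterfactual = ask_model(model, cf_question)
        # Keep only pairs whose answers respect the dependence of the
        # effect on the cause; fine-tuning on these targets consistency.
        if causally_consistent(factual, counterfactual, cause, effect):
            dataset.append((question, factual))
            dataset.append((cf_question, counterfactual))
    return dataset
```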