The Perseus Digital Library introduces its sixth generation, featuring the ATLAS workflow, a system designed to integrate and present a wide range of open and born-digital philological data, moving beyond traditional print-based limitations.
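This paper proposes Any2Any, a new multimodal retrieval framework that enables retrieval even when instances have incomplete modalities. Rather than training generative models to fill in missing modalities, Any2Any processes the available modalities with cross-modal encoders and calibrates the similarity scores with conformal prediction, enabling effective retrieval across varied modalities.
This paper proposes Any2Any, a new multimodal retrieval framework that enables efficient retrieval even when some modalities are missing from both the query and the reference data, for example because of sensor failures or data access restrictions.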
The Any2Any framework effectively addresses the challenge of retrieving multimodal data with missing modalities by employing cross-modal encoders and a two-stage conformal prediction process to enable accurate comparisons and retrieval across diverse datasets.
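To illustrate why calibration helps when modalities are missing, here is a minimal sketch. It is not the Any2Any procedure itself: it assumes a simple rank-based calibration in the spirit of conformal prediction, and the names `make_calibrator`, `fused_score`, and the modality-pair keys are hypothetical. Each raw similarity score is mapped to its smoothed rank among held-out calibration scores for the same modality pair, so scores produced by different encoders land on a common [0, 1] scale and can be averaged over whichever pairs are present.

```python
from bisect import bisect_right


def make_calibrator(calibration_scores):
    """Build a function mapping a raw score to its smoothed empirical rank.

    Assumption: calibration_scores are raw similarities collected on a
    held-out set for a single modality pair.
    """
    ordered = sorted(calibration_scores)
    n = len(ordered)

    def calibrate(score):
        # Count of calibration scores <= score, with +1 smoothing so the
        # result always lies strictly between 0 and 1.
        return (bisect_right(ordered, score) + 1) / (n + 1)

    return calibrate


def fused_score(raw_scores, calibrators):
    """Average calibrated scores over the modality pairs that are available.

    raw_scores maps a pair name to a float, or to None when either side of
    the pair is missing. Missing pairs are skipped, not imputed.
    """
    calibrated = [
        calibrators[pair](score)
        for pair, score in raw_scores.items()
        if score is not None
    ]
    return sum(calibrated) / len(calibrated) if calibrated else 0.0


# Hypothetical calibration sets on very different raw scales.
calibrators = {
    "text-text": make_calibrator([0.1, 0.2, 0.3, 0.4]),
    "image-text": make_calibrator([10, 20, 30, 40]),
}
complete = fused_score({"text-text": 0.35, "image-text": 35}, calibrators)
partial = fused_score({"text-text": 0.35, "image-text": None}, calibrators)
```

In this toy setup a candidate with a missing image is scored from its text pair alone, yet remains comparable to a candidate scored from both pairs, because both are expressed as ranks on the same scale.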
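Effective retrieval for large language models requires a retrieval method that satisfies similarity and diversity at the same time; the VRSD algorithm presented in this paper overcomes the limitations of the existing MMR algorithm and is an effective alternative that improves retrieval performance.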
This paper proposes a novel approach to vector retrieval in large language models (LLMs) that leverages the concept of sum vectors to simultaneously optimize for similarity and diversity, addressing the limitations of traditional methods like Maximal Marginal Relevance (MMR).
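The contrast between MMR and a sum-vector objective can be sketched as follows. The MMR function follows the standard formulation (relevance weighted against maximum redundancy with already selected items). The sum-vector function is only an assumed greedy reading of the idea, picking at each step the candidate whose addition makes the sum of selected vectors most similar to the query; it should not be taken as the paper's exact algorithm. The function names and the toy vectors are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, candidates, k, lam=0.5):
    """Standard MMR: lam * relevance - (1 - lam) * max redundancy."""
    selected, remaining = [], list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def sum_vector_select(query, candidates, k):
    """Assumed greedy variant: maximize cosine(query, sum of selected)."""
    selected, remaining = [], list(range(len(candidates)))
    total = [0.0] * len(query)
    while remaining and len(selected) < k:
        def score(i):
            return cosine(query, [t + c for t, c in zip(total, candidates[i])])

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
        total = [t + c for t, c in zip(total, candidates[best])]
    return selected


query = [1.0, 1.0]
candidates = [[1.0, 0.1], [1.0, 0.12], [0.1, 1.0]]
mmr_pick = mmr_select(query, candidates, k=2)
sum_pick = sum_vector_select(query, candidates, k=2)
```

Note that the sum-vector variant has no trade-off parameter to tune: a near-duplicate of an already selected vector barely changes the direction of the sum, so it is passed over in favor of a complementary vector, whereas MMR relies on `lam` to express the same preference.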
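Based on initial results from the TREC 2024 RAG Track, the LLM-based automatic nugget evaluation framework (AutoNuggetizer) correlates strongly with human evaluation, suggesting that it may be effective for automating the evaluation of RAG systems.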
This paper presents a novel, fully automated evaluation framework for Retrieval-Augmented Generation (RAG) systems, called AutoNuggetizer, which leverages large language models (LLMs) to automatically create and assign "nuggets" of information to assess the quality of system-generated answers. Initial results from the TREC 2024 RAG Track demonstrate a strong correlation between this automated approach and manual evaluation by human assessors, suggesting its potential as a reliable and efficient alternative for evaluating and iterating on RAG systems.
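Once nuggets have been created and assigned to an answer, the assignments have to be reduced to a score. The sketch below shows one plausible way to do that; it covers only the final aggregation step, not the LLM-driven creation or assignment. The label names, the `partial_weight` value, and the `nugget_score` function are assumptions made for illustration and are not taken from the summary above.

```python
def nugget_score(assignments, partial_weight=0.5):
    """Average per-nugget credit for one system answer.

    assignments: one label per nugget. The label names and the
    partial-credit weight are illustrative assumptions.
    """
    weights = {
        "support": 1.0,
        "partial_support": partial_weight,
        "not_support": 0.0,
    }
    if not assignments:
        return 0.0
    return sum(weights[label] for label in assignments) / len(assignments)


# Hypothetical assignments for an answer judged against four nuggets.
labels = ["support", "partial_support", "not_support", "support"]
score = nugget_score(labels)
strict = nugget_score(labels, partial_weight=0.0)
```

Setting `partial_weight=0.0` gives a strict variant in which only fully supported nuggets earn credit.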
MARM leverages caching to overcome computational limitations in recommendation systems, enabling multi-layer attention modeling of user history for improved accuracy without significant performance degradation.
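This study uses multiple large language models and retrieval-augmented generation to automatically extract and process information on deep learning methods from biodiversity publications, with the aim of improving research reproducibility and knowledge transfer.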