Ranking Large Language Models without Ground Truth: A Novel Perspective
The authors propose an approach to ranking large language models that requires no ground truth or reference responses: it examines triplets of models and identifies the worst performer in each triplet with high probability.
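The triplet idea can be illustrated with a minimal sketch. The summary above does not specify how the worst model is identified, so the voting rule here is an assumption, not the authors' procedure: each model acts as a judge, treats its own responses as the reference, and votes against whichever of the other two agrees with it less. The function name `worst_in_triplet` and the `agreement` score table are hypothetical and stand in for any pairwise similarity signal.

```python
def worst_in_triplet(triplet, agreement):
    """Return the model that receives the most "worse" votes, or None if no
    single model stands out.

    `agreement` is a hypothetical dict mapping frozenset({x, y}) to a
    symmetric score of how closely the responses of models x and y match.
    """
    votes = {model: 0 for model in triplet}
    for judge in triplet:
        first, second = [m for m in triplet if m != judge]
        # The judge uses its own responses as the reference and votes
        # against the model that agrees with it less (assumed rule).
        score_first = agreement[frozenset((judge, first))]
        score_second = agreement[frozenset((judge, second))]
        if score_first < score_second:
            votes[first] += 1
        elif score_second < score_first:
            votes[second] += 1
        # Equal scores: the judge abstains.
    top = max(votes.values())
    candidates = [m for m, v in votes.items() if v == top]
    # Require a unique model with at least one vote against it.
    return candidates[0] if top > 0 and len(candidates) == 1 else None


# Illustrative scores only; these are made-up numbers, not results.
example_agreement = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.4,
    frozenset(("B", "C")): 0.5,
}
example_worst = worst_in_triplet(("A", "B", "C"), example_agreement)
```

In this toy example, models A and B agree closely with each other and both vote against C, so C is flagged as the worst of the three. Repeating such a test over many triplets is one plausible way to build up a ranking, although the summary does not say how the triplet results are aggregated.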