Given a dataset of prompts and a set of LLMs, the LLMs can be ranked without access to ground truth by considering triplets of models.
Models can be ranked without ground truth using a novel triplet approach.
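
The summary above does not spell out how triplets are scored, so the following is only a minimal sketch of one way the idea could be instantiated, not the method itself. It assumes that, within each triplet, every model in turn acts as a judge, its own answers serve as a proxy reference, and whichever of the other two models agrees with it less receives a "loss". The names `agreement` and `rank_by_triplets`, the exact-match agreement measure, and the loss-counting rule are all hypothetical choices made for illustration.

```python
from itertools import combinations


def agreement(xs, ys):
    """Fraction of prompts on which two models give the same answer.

    Assumes both lists are non-empty and aligned by prompt (illustrative choice).
    """
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)


def rank_by_triplets(answers):
    """Rank models by how rarely they are judged the weaker peer in a triplet.

    `answers` maps a model name to its list of answers, one per prompt.
    This scoring rule is an assumption for illustration only.
    """
    losses = {model: 0 for model in answers}
    for triplet in combinations(answers, 3):
        for judge in triplet:
            # The judge compares the two remaining models against its own answers.
            a, b = (m for m in triplet if m != judge)
            score_a = agreement(answers[judge], answers[a])
            score_b = agreement(answers[judge], answers[b])
            if score_a < score_b:
                losses[a] += 1
            elif score_b < score_a:
                losses[b] += 1
            # On a tie, neither model receives a loss.
    # Fewest losses first; no ground-truth labels are consulted anywhere.
    return sorted(answers, key=lambda m: losses[m])


# Toy usage with made-up answers to four prompts.
toy_answers = {
    "model_a": ["4", "Paris", "blue", "7"],
    "model_b": ["4", "Paris", "blue", "9"],
    "model_c": ["5", "Rome", "blue", "9"],
}
print(rank_by_triplets(toy_answers))
```

Exact-match agreement only makes sense for short, closed-form answers; for free-form generations a similarity measure would have to be swapped in, which is a further assumption beyond what the summary states.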