ACES introduces a novel metric for evaluating automated audio captioning systems, one that scores captions based on the semantics of sounds.