Dual-encoder models can outperform state-of-the-art (SOTA) methods on extreme multi-label classification tasks when trained with a decoupled softmax loss and a soft top-k operator-based loss.
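
As a rough illustration of the first of these losses, the sketch below assumes one common reading of "decoupled softmax": each positive label is contrasted only against the negative labels, so that other positives for the same query do not appear in its softmax denominator. The function name `decoupled_softmax_loss`, its parameters, and the temperature scaling are hypothetical choices made for this example and are not taken from the source. The soft top-k operator-based loss is not shown.

```python
import math


def decoupled_softmax_loss(scores, positives, temperature=1.0):
    """Illustrative decoupled softmax loss for a single query.

    ASSUMPTION (not from the source): for each positive label p the loss term is
        -log( exp(s_p) / (exp(s_p) + sum over negatives n of exp(s_n)) ),
    so other positive labels are left out of p's denominator.

    scores:    list of query-label similarity scores, one per label
    positives: indices of the labels that are relevant to the query
    """
    pos = set(positives)
    if not pos:
        return 0.0

    scaled = [s / temperature for s in scores]
    negatives = [s for i, s in enumerate(scaled) if i not in pos]

    loss = 0.0
    for p in pos:
        # Denominator terms: this positive plus all negatives only.
        terms = [scaled[p]] + negatives
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(terms)
        log_denominator = m + math.log(sum(math.exp(t - m) for t in terms))
        loss += log_denominator - scaled[p]
    return loss


if __name__ == "__main__":
    # Three labels with equal scores; labels 0 and 1 are positive.
    print(decoupled_softmax_loss([0.0, 0.0, 0.0], [0, 1]))
```

Under this reading, a query with several relevant labels does not force its positives to compete with one another for probability mass, which a standard softmax over all labels would do.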