Tree Cross Attention: Efficient Token Retrieval for Inference
Tree Cross Attention (TCA) retrieves information for inference efficiently: it organizes the tokens in a tree and selects only a subset of nodes, logarithmic in the number of tokens, outperforming traditional methods.
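The tree-based selection can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' method: the names `Node`, `build_tree`, `select_nodes` and `cross_attend` are hypothetical, parent nodes are taken to be the average of their two children, the descent greedily follows the child with the higher dot-product score, and attention uses the node vectors directly as keys and values with no learned projections.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class Node:
    def __init__(self, vec, left=None, right=None):
        self.vec = vec      # token vector (leaf) or summary vector (internal)
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None


def build_tree(tokens):
    """Build a binary tree bottom-up over the token vectors.

    Assumption: a parent is the element-wise average of its two children.
    """
    level = [Node(list(t)) for t in tokens]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level) - 1, 2):
            l, r = level[i], level[i + 1]
            avg = [(x + y) / 2 for x, y in zip(l.vec, r.vec)]
            parents.append(Node(avg, l, r))
        if len(level) % 2 == 1:
            parents.append(level[-1])  # odd node out moves up unchanged
        level = parents
    return level[0]


def select_nodes(root, query):
    """Walk from the root to one leaf, keeping the sibling skipped at each step.

    The result holds one node per level, so its size grows with the tree
    depth (about log2 of the token count) rather than with the token count.
    """
    selected = []
    node = root
    while not node.is_leaf():
        l, r = node.left, node.right
        # Assumed scoring rule: greedy dot product with the query.
        if dot(query, l.vec) >= dot(query, r.vec):
            keep, skip = l, r
        else:
            keep, skip = r, l
        selected.append(skip)  # summary of the subtree we do not descend into
        node = keep
    selected.append(node)      # the leaf that was reached
    return selected


def cross_attend(query, nodes):
    """Scaled dot-product attention of one query over the selected nodes."""
    scale = math.sqrt(len(query))
    scores = [dot(query, n.vec) / scale for n in nodes]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    return [
        sum(w * n.vec[d] for w, n in zip(weights, nodes)) / total
        for d in range(len(query))
    ]


tokens = [[math.cos(i), math.sin(i)] for i in range(16)]
query = [1.0, 0.0]
selected = select_nodes(build_tree(tokens), query)
output = cross_attend(query, selected)
print(len(selected))  # 5 nodes attended over instead of 16 tokens
```

In this sketch the selected nodes together still cover every token (each skipped sibling summarizes its whole subtree), which is why attending over them can stand in for attending over the full set.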