Core concept
Sentence embeddings produced by pretrained language models (PLMs) can be effectively reduced in dimensionality using unsupervised methods such as PCA, often while preserving performance in downstream tasks.
Summary
Sentence embeddings produced by PLMs have high dimensionality, which causes memory and computation issues.
Unsupervised dimensionality reduction methods like PCA can reduce dimensions by almost 50% without significant loss in performance.
Other methods evaluated include SVD, KPCA, GRP, and Autoencoders.
PCA proves most effective for compressing sentence embeddings across various tasks.
Experimental results show that reducing dimensionality can even improve accuracy for some sentence encoders on specific tasks.
Training and inference times vary among the different dimensionality reduction methods.
Statistics
Results show that PCA is the most effective of the evaluated methods (SVD, KPCA, GRP, and Autoencoders) for compressing sentence embeddings across various tasks.
Quotes
"Reducing the dimensionality further improves performance over the original high dimensional versions for the sentence embeddings produced by some PLMs in some tasks."