Core Concepts
Sentence embeddings produced by Pretrained Language Models (PLMs) can be effectively reduced in dimensionality using unsupervised methods such as PCA, with little to no loss of performance in downstream tasks.
Summary
Sentence embeddings produced by PLMs have high dimensionality, which causes memory and computation issues.
Unsupervised dimensionality reduction methods like PCA can reduce dimensions by almost 50% without significant loss in performance.
Other methods evaluated include SVD, KPCA, GRP, and Autoencoders.
PCA proves most effective for compressing sentence embeddings across various tasks.
Experimental results show that reducing dimensionality improves accuracy for some sentence encoders in specific tasks.
Training and inference times vary among the different dimensionality reduction methods.
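The core pipeline described above can be sketched with scikit-learn. This is a minimal illustration, not the paper's exact setup: the random matrix stands in for real PLM sentence embeddings, and the specific dimensions (768 in, 384 out, i.e. roughly the ~50% reduction mentioned above) are assumed for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for sentence embeddings from a PLM:
# 1000 sentences, 768 dimensions (a common PLM output size).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 768))

# Fit PCA on the unlabeled embeddings and keep half the dimensions,
# mirroring the finding that dimensionality can be roughly halved
# without significant loss in downstream performance.
pca = PCA(n_components=384)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)  # (1000, 384)
```

The same `fit_transform` interface applies to the other methods evaluated (e.g. `TruncatedSVD`, `KernelPCA`, `GaussianRandomProjection` in scikit-learn), which makes swapping reduction methods for comparison straightforward.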
Statistics
Results show that PCA is the most effective of the evaluated methods (SVD, KPCA, GRP, and Autoencoders) for compressing sentence embeddings across a variety of tasks.
Quotations
"Reducing the dimensionality further improves performance over the original high dimensional versions for the sentence embeddings produced by some PLMs in some tasks."