Key Concepts
Transformer-based models can learn sequence length as a shortcut, so that classification comes to rely on this non-textual feature rather than on the content of the text.
Statistics
The models achieve high accuracy on the original training set.
The models show low accuracy on training sets with imbalanced sequence lengths.
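One way to check whether a training set is length-imbalanced is to compare the per-class sequence length distributions. The sketch below is illustrative only: the function name `length_overlap`, the `bin_width` parameter, and the choice of the histogram overlap coefficient as the measure are assumptions, not the method used in the source.

```python
from collections import Counter


def length_overlap(lengths_a, lengths_b, bin_width=10):
    """Overlap coefficient in [0, 1] between two sequence-length histograms.

    Hypothetical helper: 1.0 means the binned length distributions are
    identical, 0.0 means the two classes share no length bin at all.
    """
    if not lengths_a or not lengths_b:
        raise ValueError("both classes need at least one sequence length")

    def normalized_hist(lengths):
        # Bucket lengths into fixed-width bins and convert counts to fractions.
        counts = Counter(n // bin_width for n in lengths)
        total = len(lengths)
        return {b: c / total for b, c in counts.items()}

    hist_a = normalized_hist(lengths_a)
    hist_b = normalized_hist(lengths_b)
    # Sum the shared probability mass over every bin seen in either class.
    bins = hist_a.keys() | hist_b.keys()
    return sum(min(hist_a.get(b, 0.0), hist_b.get(b, 0.0)) for b in bins)
```

A value near 0 would suggest that length alone could separate the classes, which is the situation in which a model might pick up length as a spurious feature; a value near 1 would suggest length carries little class information.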
Quotes
"Models seem to capture sequence length as a classification spurious feature."
"The more the distributions overlap, the lesser the problem."