Enhancing Out-of-Distribution Text Classification with Greedy Layer-Wise Sparse Representation Learning for Pre-trained Models
We propose IMO, a greedy layer-wise sparse representation learning method that selects domain-invariant features and key token representations from pre-trained deep transformer encoders, in order to mitigate spurious correlations and improve out-of-distribution text classification performance.
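
To make the idea of sparse selection concrete, the following toy sketch builds one binary mask per encoder layer that keeps only the features whose mean activation changes least across training domains, and a second mask that keeps the highest-scoring tokens. It is an illustration under stated assumptions, not the IMO training procedure: the cross-domain variance heuristic stands in for learned masks, the layers are handled one at a time without modelling any interaction between them, and all names (`invariance_scores`, `top_k_mask`, `select_layer_masks`) and numbers are hypothetical.

```python
from statistics import pvariance
from typing import Dict, List


def invariance_scores(domain_means: Dict[str, List[float]]) -> List[float]:
    """Score each feature by the negative variance of its mean across domains.

    Assumption: a feature whose mean barely moves between domains is treated
    as "domain-invariant". This is a simple stand-in for a learned mask.
    """
    vectors = list(domain_means.values())
    n_features = len(vectors[0])
    return [-pvariance([v[j] for v in vectors]) for j in range(n_features)]


def top_k_mask(scores: List[float], k: int) -> List[int]:
    """Return a binary mask that keeps the k highest-scoring positions."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = set(ranked[:k])
    return [1 if j in keep else 0 for j in range(len(scores))]


def select_layer_masks(layers: List[Dict[str, List[float]]], k: int) -> List[List[int]]:
    """Build one sparse feature mask per layer, processing layers one at a time."""
    return [top_k_mask(invariance_scores(layer), k) for layer in layers]


def apply_mask(vector: List[float], mask: List[int]) -> List[float]:
    """Zero out every position that the mask does not keep."""
    return [v * m for v, m in zip(vector, mask)]


if __name__ == "__main__":
    # Toy per-layer, per-domain mean activations (made-up numbers).
    layers = [
        {"books": [0.9, 0.1, 0.5], "movies": [0.1, 0.1, 0.5]},
        {"books": [0.2, 0.8, 0.3], "movies": [0.2, 0.1, 0.9]},
    ]
    print(select_layer_masks(layers, k=2))

    # Toy token importance scores for a four-token input (made-up numbers).
    token_scores = [0.05, 0.6, 0.1, 0.25]
    print(top_k_mask(token_scores, k=2))
```

In a real setting the per-domain statistics would come from the hidden states of a pre-trained encoder and the masks would be learned jointly with the classifier; the sketch only shows what "sparse" and "domain-invariant" mean operationally.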