Efficient compression of large language models via global pruning with low memory consumption.
AdaGP is a novel framework for global pruning of large language models that achieves significant performance improvements in high-sparsity regimes.
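The summary above does not detail AdaGP's algorithm, so the sketch below is not AdaGP itself; it is a minimal PyTorch illustration of what "global" pruning means in contrast to layer-wise pruning: a single magnitude threshold is computed jointly over all weight matrices, so sparsity can concentrate in the layers that tolerate it best. The function name `global_magnitude_prune` and the 70% sparsity target are illustrative choices, not from the source.

```python
import torch
import torch.nn as nn

def global_magnitude_prune(model: nn.Module, sparsity: float) -> None:
    """Zero out the smallest-magnitude weights across the WHOLE model.

    Unlike layer-wise (local) pruning, the cutoff is computed over the
    concatenated magnitudes of every linear layer, so the per-layer
    sparsity is decided globally rather than fixed in advance.
    """
    weights = [m.weight for m in model.modules() if isinstance(m, nn.Linear)]
    # One global threshold over all layers' weight magnitudes.
    all_mags = torch.cat([w.detach().abs().flatten() for w in weights])
    k = int(sparsity * all_mags.numel())
    if k == 0:
        return  # nothing to prune at this sparsity level
    threshold = torch.kthvalue(all_mags, k).values
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() > threshold).float())  # keep only above-threshold weights

# Example: prune a toy MLP to 70% global sparsity.
model = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 512))
global_magnitude_prune(model, sparsity=0.70)
zeros = sum((m.weight == 0).sum().item()
            for m in model.modules() if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in model.modules() if isinstance(m, nn.Linear))
print(f"global sparsity: {zeros / total:.2%}")
```

Note the bottleneck this naive version exposes: scoring all weights jointly requires materializing statistics for the entire model at once, which is prohibitive at LLM scale. That memory cost is exactly the problem a low-memory global-pruning framework such as AdaGP is designed to address.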