Constrained Generation based Data Augmentation for Low-Resource Natural Language Processing
CoDa, a controllable and effective data augmentation technique for low-resource NLP, generates synthetic training instances by prompting off-the-shelf instruction-following Large Language Models to produce text that satisfies a set of simple constraints extracted from the original low-resource dataset.
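As a rough illustration of the idea, the Python sketch below derives a few simple constraints from one training instance and turns them into an instruction-style prompt. The constraint types (label, keywords, length bound), the function names `extract_constraints` and `build_prompt`, the stop-word list, and the prompt wording are all assumptions made for this example and are not taken from CoDa itself; the call to an actual LLM is deliberately left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "is", "was", "of", "and", "to", "in",
             "it", "but", "for", "on", "this", "that"}


def extract_constraints(text, label, max_keywords=3, length_slack=5):
    """Derive simple constraints from one labeled training instance.

    The three constraint types used here are illustrative choices only.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    content_words = [t for t in tokens if t not in STOPWORDS]
    # Most frequent content words; ties keep first-occurrence order.
    keywords = [w for w, _ in Counter(content_words).most_common(max_keywords)]
    return {
        "label": label,
        "keywords": keywords,
        "max_words": len(tokens) + length_slack,
    }


def build_prompt(constraints):
    """Verbalize the constraints as an instruction for an instruction-following LLM."""
    lines = [
        "Write one new sentence that satisfies all of the following constraints:",
        f"1. It expresses the label: {constraints['label']}.",
        f"2. It contains the keywords: {', '.join(constraints['keywords'])}.",
        f"3. It is at most {constraints['max_words']} words long.",
    ]
    return "\n".join(lines)


# Example: one instance from a hypothetical low-resource sentiment dataset.
constraints = extract_constraints(
    "The battery life of this phone is terrible", "negative"
)
prompt = build_prompt(constraints)
print(prompt)
```

The resulting prompt string would then be sent to whichever instruction-following model is available, and the generated sentence added to the training set under the same label. Keeping constraint extraction separate from prompt construction makes it easy to swap in other constraint types without touching the prompt template.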