Improving Semantic Parsing and Generation Evaluation with Challenging Benchmarks
Neural models for semantic parsing and text generation with Discourse Representation Structures (DRS) perform strongly on standard test sets but struggle on more challenging benchmarks that require handling longer texts and compositional generalization.