Reproducibility and Generalizability Issues in Using Large Language Models for Boolean Query Generation in Systematic Reviews
Large language models (LLMs) show promise for generating Boolean queries in systematic reviews, but current research suffers from reproducibility and generalizability issues, which highlights the need for more transparent and robust evaluation methods.