Key Concepts
This study introduces a novel approach using multiple large language models (LLMs) and Retrieval-Augmented Generation (RAG) to automatically extract and categorize deep learning (DL) methodological information from biodiversity publications, addressing the challenge of limited transparency and reproducibility in scientific literature.
Statistics
The multi-LLM, RAG-assisted pipeline achieved an accuracy of 69.5% (417 out of 600 comparisons) in retrieving DL methodological information.
The Llama 3 70B model achieved the highest inter-annotator agreement (0.7708) with human annotations.
Filtering publications to include only those with detailed DL pipelines increased the positive response rate to the competency questions (CQs) by 8.65 percentage points.
Before filtering, the pipeline provided positive responses to 27.12% of the total queries (3,524 out of 12,992).
After filtering, the percentage of positive responses increased to 35.77% (2,574 out of 7,196).
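The percentages above follow directly from the reported counts. The short sketch below recomputes them as a sanity check; the helper name `rate` is hypothetical and only illustrates the arithmetic, it is not part of the study's pipeline. It also shows that the 8.65 figure is a difference in percentage points rather than a relative increase.

```python
def rate(positive: int, total: int) -> float:
    """Return the share of positive outcomes as a percentage."""
    return 100.0 * positive / total


# Counts as reported in the statistics above.
accuracy = rate(417, 600)        # correct retrievals out of all comparisons
before = rate(3524, 12992)       # positive responses before filtering
after = rate(2574, 7196)         # positive responses after filtering

# Absolute change in percentage points versus relative change in percent.
gain_points = after - before
relative_gain = 100.0 * gain_points / before

print(f"accuracy:        {accuracy:.2f}%")
print(f"before filtering: {before:.2f}%")
print(f"after filtering:  {after:.2f}%")
print(f"gain:             {gain_points:.2f} percentage points")
print(f"relative gain:    {relative_gain:.1f}%")
```

Read this way, the filtering step raises the positive response rate by 8.65 percentage points, which corresponds to a relative increase of roughly a third over the unfiltered rate.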