The study investigates the ability of Large Language Models (LLMs), specifically ChatGPT, to process and restructure structured and semi-structured documents. The research takes a qualitative approach based on two case studies that cover different document formats.
In the first experiment series, ChatGPT was tasked with editing a LaTeX-formatted table by performing operations such as deleting columns, swapping columns, merging rows, and formatting text. The results show that ChatGPT was able to make all the desired changes, generating syntactically correct LaTeX output that could be further processed without issues.
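To make the table operations concrete, the following minimal Python sketch shows what deleting and swapping columns amount to for a simple LaTeX table body. It is an illustration only, not the study's method (the study asked ChatGPT to perform the edits from prompts). The function names and the sample table are invented for this example, and the sketch assumes a plain table with no `\hline`, `\multicolumn`, or escaped `\&`.

```python
def parse_rows(tabular_body: str) -> list[list[str]]:
    """Split the body of a tabular environment into rows of cell strings.

    Assumes rows end with a double backslash and cells are separated by '&'.
    """
    rows = []
    for line in tabular_body.split(r"\\"):
        line = line.strip()
        if line:
            rows.append([cell.strip() for cell in line.split("&")])
    return rows


def delete_column(rows: list[list[str]], index: int) -> list[list[str]]:
    """Return a copy of the table without the column at `index`."""
    return [row[:index] + row[index + 1:] for row in rows]


def swap_columns(rows: list[list[str]], i: int, j: int) -> list[list[str]]:
    """Return a copy of the table with columns `i` and `j` exchanged."""
    swapped = []
    for row in rows:
        row = list(row)
        row[i], row[j] = row[j], row[i]
        swapped.append(row)
    return swapped


def render_rows(rows: list[list[str]]) -> str:
    """Serialize rows back into LaTeX table-body syntax."""
    return "\n".join(" & ".join(row) + r" \\" for row in rows)


# Invented sample data, not taken from the study.
body = r"""
Name & Year & Format \\
Alpha & 2021 & PDF \\
Beta & 2022 & XML \\
"""

rows = parse_rows(body)
rows = swap_columns(delete_column(rows, 1), 0, 1)  # drop "Year", then swap the rest
print(render_rows(rows))
```

A deterministic helper like this could also serve as a reference when checking whether an LLM-edited table contains the expected cells in the expected order.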
The second experiment series focused on converting RIS bibliographic records into the OPUS XML format. ChatGPT was provided with an example RIS record and an example OPUS XML document, as well as additional RIS files to be converted. The LLM generated syntactically correct OPUS XML documents, constructing the appropriate XML fields and values by pattern matching against the provided example.
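The example-driven setup described above can be sketched in Python as follows. This is a hedged illustration, not the prompt or tooling used in the study: `build_prompt` and `is_well_formed` are hypothetical helper names, the prompt wording is an assumption, and the XML in the usage lines is a placeholder rather than the real OPUS schema. The check covers only XML well-formedness (syntactic correctness), not validity against any schema.

```python
import xml.etree.ElementTree as ET


def build_prompt(example_ris: str, example_xml: str, new_ris: str) -> str:
    """Assemble a one-shot prompt: one worked example plus a new record.

    The wording is an assumption made for illustration.
    """
    return (
        "Here is a RIS record and its OPUS XML equivalent.\n\n"
        f"RIS:\n{example_ris}\n\n"
        f"OPUS XML:\n{example_xml}\n\n"
        "Convert the following RIS record to OPUS XML in the same way.\n\n"
        f"RIS:\n{new_ris}\n"
    )


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Placeholder inputs; the element names are NOT the real OPUS schema.
prompt = build_prompt(
    example_ris="TY  - JOUR\nTI  - An example title\nER  - ",
    example_xml="<record><title>An example title</title></record>",
    new_ris="TY  - JOUR\nTI  - Another title\nER  - ",
)
print(is_well_formed("<record><title>Another title</title></record>"))
print(is_well_formed("<record><title>Another title</record>"))
```

Checking the model's output with a parser in this way separates syntactic correctness from the question of whether the generated fields and values are actually right, which still needs comparison against the source record.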
The study's findings suggest that LLMs can be applied effectively to editing structured and semi-structured documents with minimal effort, as long as the prompts are straightforward and provide the necessary context. The experiments also indicate that explicit structural annotations in the input data, such as LaTeX commands, may improve an LLM's understanding and its ability to follow instructions. Additionally, the pattern-matching behavior observed in the RIS-to-XML conversion task deserves further investigation, as it may help explain the processes that lead to hallucinations in LLMs.
Overall, the study offers insight into how well LLMs can process structured documents, with practical applications in areas such as document authoring, data conversion, and software development.