The paper evaluates the usability of ChatGPT as a tool for generating R programming code. The key findings are:
Overall, ChatGPT performed very well on the usability metrics, with high scores on accuracy, completeness, structuredness, logic clarity, parameter coverage, readability, and depth of explanation. The weakest aspect was conciseness, with an average score of 3.8 out of 5.
On objective metrics, ChatGPT required an average of only 1.61 attempts to complete the tasks, with 72% of tasks completed in a single attempt. The average time to complete a task was 47.02 seconds, with 90% of tasks completed within 100 seconds.
ChatGPT performed best on general programming tasks, scoring 95.2% on average, and somewhat lower on visualization tasks (91.1%) and exploratory tasks (91.6%).
Attempt counts and completion times followed the same pattern: lowest for general programming tasks and highest for visualization tasks.
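To make the three task categories concrete, the sketch below gives one hypothetical R prompt outcome per category. These snippets are illustrative assumptions only; they are not drawn from the paper's actual task set.

```r
# Hypothetical examples of the three task categories (illustrative only,
# not taken from the paper's benchmark)

# General programming task: a function returning the n-th Fibonacci number
fib <- function(n) {
  if (n <= 1) return(n)
  fib(n - 1) + fib(n - 2)
}

# Exploratory task: summarise a built-in data set
summary(mtcars)
aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Visualization task: scatter plot of weight vs. fuel efficiency
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel efficiency vs. weight")
```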
The experiment also found that developers did not become noticeably more effective at using ChatGPT through repeated use alone, suggesting a need for better user guidance and training.
Overall, the results demonstrate that ChatGPT has high usability as a code generation tool for the R programming language, though it may struggle on more complex or specialized tasks.
Key insights from the paper by Tanha Miah, H... on arxiv.org, 2024-04-10: https://arxiv.org/pdf/2402.03130.pdf