The study analyzed how the ChatGPT and Gemini language models handle linguistic ambiguity in Brazilian Portuguese. It used four tasks to evaluate the models, including detecting ambiguity, disambiguating sentences, and generating ambiguous sentences.
The results showed that both models struggled significantly in these tasks. ChatGPT achieved an accuracy of only 28.75% in detecting ambiguity, while Gemini performed better at 49.58%. However, both models tended to over-interpret non-ambiguous sentences as ambiguous.
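As a rough illustration of how detection accuracy and over-interpretation can be measured separately, the sketch below computes both from gold labels and model predictions. The function name `detection_metrics` and the four-sentence toy data are hypothetical and are not taken from the study; the paper's actual scoring procedure may differ.

```python
def detection_metrics(gold, predicted):
    """Return (accuracy, over_interpretation_rate).

    gold, predicted: equal-length lists of bool, True = "sentence is ambiguous".
    The over-interpretation rate is the share of non-ambiguous sentences
    that the model nevertheless labeled as ambiguous.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal in length")

    # Accuracy: fraction of sentences where the prediction matches the gold label.
    correct = sum(g == p for g, p in zip(gold, predicted))
    accuracy = correct / len(gold)

    # Predictions made on sentences whose gold label is "not ambiguous".
    on_non_ambiguous = [p for g, p in zip(gold, predicted) if not g]
    if on_non_ambiguous:
        over_rate = sum(on_non_ambiguous) / len(on_non_ambiguous)
    else:
        over_rate = 0.0

    return accuracy, over_rate


# Toy data (hypothetical, for illustration only).
gold = [True, True, False, False]
predicted = [True, False, True, True]
accuracy, over_rate = detection_metrics(gold, predicted)
print(f"accuracy={accuracy:.2%}, over-interpretation={over_rate:.2%}")
```

Reporting the two numbers side by side helps distinguish a model that misses real ambiguity from one that flags ambiguity where none exists.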
In the disambiguation task, the models often provided incorrect or incomplete explanations and failed to identify the source of the ambiguity accurately. They handled lexical ambiguity better than semantic or syntactic ambiguity.
When generating ambiguous sentences, the models had the most difficulty creating lexical ambiguity, often producing sentences without any perceivable ambiguity. They had relatively better success in generating syntactic ambiguity, but still struggled to accurately explain the cause of the ambiguity.
The study highlights the need for further research and development to improve large language models' understanding and handling of complex linguistic phenomena like ambiguity, especially in low-resource languages like Brazilian Portuguese.