The PoTeC corpus contains data from 75 participants reading 12 scientific texts, offering insights into expert and non-expert reading strategies. It includes annotations for linguistic features and aims to facilitate diverse research studies.
The article argues that naturalistic reading corpora are an important complement to controlled experiments because they allow language processing to be studied in ecologically valid settings. It emphasizes that examining complex phenomena in naturally occurring text is valuable for theory.
Furthermore, the article explores how eye-tracking data can be leveraged for Natural Language Processing tasks and computational language models. It also introduces a new standard for data publication that follows the FAIR principles to enhance transparency and reusability.
Overall, the PoTeC corpus provides a valuable resource for studying cognitive processes involved in everyday reading across different disciplines and levels of expertise.