Core Concepts
Evaluating monolingual and multilingual Transformer models for named entity recognition (NER) in Portuguese financial texts.
Abstract
The paper evaluates named entity recognition using monolingual and multilingual Transformer models on transcriptions of Brazilian corporate earnings calls. It covers dataset collection, weak-supervision annotation, model fine-tuning, and performance analysis. Key highlights include framing NER as a text-generation task, a comparison of BERT- and T5-based models, macro F1-scores ranging from 98.52% to 98.99%, differences in memory and time consumption between the models, and insights into entity-recognition approaches.
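The generative framing mentioned above treats NER as sequence-to-sequence learning: the input is the raw sentence and the target is a textual listing of the entities it contains. The paper's exact target linearization is not given here, so the `entity [TYPE]` format below is a hypothetical illustration of the general idea:

```python
def to_generation_example(tokens, entities):
    """Convert an annotated sentence into a (source, target) pair for a
    seq2seq model such as T5.

    `entities` is a list of (start, end, label) token spans (end exclusive).
    The linearization format is illustrative, not the one from the paper.
    """
    source = " ".join(tokens)
    parts = [f"{' '.join(tokens[s:e])} [{label}]" for s, e, label in entities]
    # An empty-entity sentinel keeps the target non-empty for training.
    target = "; ".join(parts) if parts else "none"
    return source, target

src, tgt = to_generation_example(
    ["Receita", "da", "Petrobras", "cresceu", "5%"],
    [(2, 3, "ORG"), (4, 5, "PERCENT")],
)
# src → "Receita da Petrobras cresceu 5%"
# tgt → "Petrobras [ORG]; 5% [PERCENT]"
```

At inference time, the generated target string would be parsed back into entity spans, which is one source of extra cost for the generative approach compared with direct token classification.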
Stats
The macro F1-scores achieved by the models ranged from 98.52% to 98.99%.
BERTimbau consumes 4.5 GB of memory and needs 2 minutes for inference on the test dataset, whereas PTT5 requires 13.2 GB and 27 minutes.
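For context on the F1 figures above: macro F1 averages per-class F1 with equal weight, so rare entity types count as much as frequent ones. A minimal sketch of the computation from per-class confusion counts (the counts themselves are made-up illustration, not results from the paper):

```python
def macro_f1(per_class):
    """Compute macro F1 from per-class (tp, fp, fn) confusion counts.

    `per_class` maps each entity label to its (true positive,
    false positive, false negative) counts; classes are weighted
    equally regardless of support.
    """
    scores = []
    for tp, fp, fn in per_class.values():
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return sum(scores) / len(scores)

# Hypothetical counts for two entity classes:
score = macro_f1({"ORG": (9, 1, 1), "PER": (8, 2, 2)})
# → 0.85 (mean of per-class F1s 0.9 and 0.8)
```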
Quotes
"Framing NER as text generation with T5, surpassing prior methods."
"BERT-based models consistently outperform T5-based models."