Core Concepts
Applying LSTM and BERT models to text classification in the retail sector enables accurate multi-category product predictions. Data augmentation and focal loss techniques significantly improve model accuracy.
Abstract
This study explores using advanced machine learning models, LSTM and BERT, for text classification in the retail sector. By applying data augmentation and focal loss techniques to a Brazilian retail dataset, the study demonstrates significant improvements in classifying products into multiple categories. The results show that while BERT outperformed LSTM in detailed categories, both models achieved high performance after optimization. The research highlights the importance of NLP techniques in retail and emphasizes the need for careful selection of modeling strategies.
The study is structured with an introduction discussing text classification's importance in retail, followed by a literature review on NLP approaches. Data preprocessing methods are detailed, including data augmentation through web scraping. The implementation of LSTM and BERT models is explained, along with their respective architectures and configurations. Results from both models are analyzed, focusing on F1 scores across different product categories. The impact of focal loss and data augmentation on model performance is discussed extensively.
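The paper's exact focal loss configuration is not given here, but the standard formulation (a cross-entropy variant that down-weights easy examples so training focuses on hard, often minority-class, ones) can be sketched in plain Python. The function name and parameter defaults below are illustrative, not taken from the study:

```python
import math

def focal_loss(p_true, alpha=0.25, gamma=2.0):
    """Focal loss for a single example.

    p_true: the model's predicted probability for the true class.
    alpha:  class-balancing weight.
    gamma:  focusing parameter; gamma=0 recovers weighted cross-entropy.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

# A confident correct prediction contributes far less loss than a poor one,
# which is why focal loss helps on imbalanced product categories.
easy = focal_loss(0.9)  # well-classified example: tiny loss
hard = focal_loss(0.1)  # misclassified example: much larger loss
print(easy < hard)  # True
```

With gamma set to 0 and alpha to 1, the function reduces to ordinary cross-entropy, which makes the down-weighting effect of gamma easy to verify in isolation.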
The study concludes by suggesting future research directions to explore larger datasets, newer NLP models, and handling extreme class imbalances. Limitations of the study are acknowledged, emphasizing the importance of diverse datasets for model generalization. Overall, the research contributes valuable insights to NLP applications in the retail sector.
Stats
BERT model achieved an F1 Macro Score of up to 99% for segments.
LSTM model reached an F1 Macro Score of 93% for product names.
Data augmentation added approximately 30,000 records to the dataset.
Focal loss improved model performance significantly.
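The F1 Macro Scores above average the per-category F1 with equal weight, so rare product categories count as much as common ones. A minimal sketch of the metric, using illustrative category labels rather than the study's data:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: mean of per-class F1, each class weighted equally."""
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Hypothetical labels for two product categories.
y_true = ["food", "food", "food", "electronics"]
y_pred = ["food", "food", "electronics", "electronics"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.733
```

Because every class contributes equally to the average, a model that ignores small categories is penalized, which is why macro F1 is a natural fit for the imbalanced retail dataset described above.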
Quotes
"Results showed that BERT outperformed LSTM in more detailed categories."
"LSTM achieved high performance after applying data augmentation and focal loss techniques."