The authors present a system that combines natural language processing (NLP) and machine learning (ML) techniques to automatically classify Spanish legal judgments into jurisdiction-specific law categories and provide natural language explanations for the classification decisions.
The key highlights and insights are:
The system uses a data preprocessing module to transform the original data source into a proper input format for the ML classifiers. This includes stop word removal, text lemmatization, and jurisdiction selection.
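As a rough illustration of these preprocessing steps, the sketch below uses only the Python standard library. The stop word set, the lemma table, the record layout, and the jurisdiction names are all hypothetical placeholders, not the authors' resources; a real system would rely on a full Spanish stop word list and a proper lemmatizer.

```python
import re

# Hypothetical, tiny resources for illustration only (assumptions, not the
# authors' data): a handful of Spanish stop words and a toy lemma table.
STOP_WORDS = {"el", "la", "los", "las", "de", "del", "en", "y", "que", "a"}
LEMMAS = {
    "contratos": "contrato",
    "trabajadores": "trabajador",
    "despidos": "despido",
}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and map tokens to lemmas."""
    tokens = re.findall(r"[a-záéíóúüñ]+", text.lower())
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]


def select_jurisdiction(records, jurisdiction):
    """Keep only the judgments that belong to the requested jurisdiction."""
    return [r for r in records if r["jurisdiction"] == jurisdiction]


# Hypothetical records; the field names are assumptions.
records = [
    {"jurisdiction": "social", "text": "El despido de los trabajadores"},
    {"jurisdiction": "civil", "text": "Los contratos de la sociedad"},
]
social = select_jurisdiction(records, "social")
cleaned = [preprocess(r["text"]) for r in social]
```

Here `cleaned` holds one token list per selected judgment, which is the kind of input a downstream feature extractor could consume.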
The main module performs feature engineering using character n-grams (char-grams) and word n-grams (word-grams), and then classifies the judgments with a separate classifier for each jurisdiction, run in parallel. The authors experiment with several ML algorithms, including support vector machines (SVM), decision trees (DT), random forests (RF), and gradient boosting (GB).
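To make the two feature types concrete, here is a minimal sketch of char-gram and word-gram counting in plain Python. The n-gram sizes (3 for characters, 2 for words) and the `c:`/`w:` prefixes are assumptions chosen for illustration; the summary does not state the ranges the authors use.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(tokens, n):
    """Return all overlapping word n-grams, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def featurize(tokens, char_n=3, word_n=2):
    """Count char-grams and word-grams, prefixed to keep the two kinds apart."""
    features = Counter()
    features.update("c:" + g for g in char_ngrams(" ".join(tokens), char_n))
    features.update("w:" + g for g in word_ngrams(tokens, word_n))
    return features


features = featurize(["despido", "trabajador"])
```

Count vectors of this kind could then be passed to one classifier per jurisdiction, for example by keeping the fitted models in a dictionary keyed by jurisdiction name.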
The explicability module explains the classification decisions in natural language. It extracts the relevant features from the decision paths of the tree-based models, reconstructs any char-gram features into more interpretable terms, and generates natural language templates to describe the key factors behind the classification.
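The idea of reading features off a decision path and turning them into a sentence can be sketched as follows. Everything here is hypothetical: the hand-built tree, the thresholds, the category labels, the rule of mapping a char-gram to the first token that contains it, and the wording of the template are assumptions for illustration, not the authors' implementation.

```python
# A hand-built toy tree (hypothetical). Internal nodes test whether a feature
# count exceeds a threshold; leaves carry a law category label.
TREE = {
    "feature": "c:desp", "threshold": 0.5,
    "left": {"label": "other"},
    "right": {
        "feature": "w:contrato", "threshold": 0.5,
        "left": {"label": "dismissal"},
        "right": {"label": "contract termination"},
    },
}


def decision_path(tree, features):
    """Walk the tree and record the features that were present on the path."""
    used = []
    node = tree
    while "label" not in node:
        value = features.get(node["feature"], 0)
        if value > node["threshold"]:
            used.append(node["feature"])
            node = node["right"]
        else:
            node = node["left"]
    return node["label"], used


def reconstruct(feature, tokens):
    """Map a char-gram to the first token containing it; keep word-grams as-is."""
    kind, gram = feature.split(":", 1)
    if kind == "w":
        return gram
    # Fall back to the raw char-gram if no single token contains it.
    return next((t for t in tokens if gram in t), gram)


def explain(tree, features, tokens):
    """Fill a simple natural language template from the decision path."""
    label, used = decision_path(tree, features)
    terms = [reconstruct(f, tokens) for f in used]
    if not terms:
        return f"The judgment was classified as '{label}'."
    return (f"The judgment was classified as '{label}' "
            f"because it contains the terms: {', '.join(terms)}.")


tokens = ["despido", "trabajador"]
features = {"c:desp": 1, "c:trab": 1}
message = explain(TREE, features, tokens)
```

In this toy run the char-gram `desp` is reconstructed into the full word `despido` before it appears in the explanation, which mirrors the goal of showing readers whole terms instead of fragments.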
The authors validate the explanations with input from legal experts, who provide "expert-in-the-loop" dictionaries of relevant terms for each jurisdiction and law category. This helps ensure the explanations are meaningful and accurate.
Experimental results on a large dataset of Spanish legal judgments show that the system achieves high classification accuracy, with RF and GB models performing particularly well. The natural language explanations are found to be easily understandable even to non-expert users.
Overall, this work presents a novel approach to combining NLP, ML, and explainable AI techniques to automatically classify and explain Spanish legal judgments, which can improve the transparency and trustworthiness of such systems.