Key Concepts
Predictive models with nearly identical performance metrics can nevertheless provide very different explanations of the underlying relationships in the data.
Abstract
The article introduces the "Rashomon Quartet": a set of four predictive models (linear regression, decision tree, random forest, and neural network) that achieve identical predictive performance (R^2 = 0.729, RMSE = 0.354) on a synthetic dataset yet provide markedly different explanations of the relationships between the predictor variables and the target variable.
The authors first provide background on the Rashomon effect, where multiple models can achieve similar performance but tell different "stories" about the data. They then describe how they engineered the synthetic dataset to exhibit this phenomenon, with a data generation function that allows for nonlinear and correlated relationships between the predictors.
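As a rough illustration, the sketch below generates data of this form in Python, following the formula quoted in the Statistics section further down. The seed and sample size are arbitrary choices, and N(0, 1/3) is read here as mean 0 with standard deviation 1/3; that reading is an assumption, not something this summary pins down.

```python
import numpy as np

# Sketch of the data-generating process described in the article.
# Seed and sample size are illustrative; N(0, 1/3) is interpreted
# as mean 0, standard deviation 1/3 (an assumption).
rng = np.random.default_rng(seed=0)
n = 1_000

cov = np.full((3, 3), 0.9)   # 0.9 off the diagonal
np.fill_diagonal(cov, 1.0)   # 1 on the diagonal

X = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=n)
x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]

eps = rng.normal(loc=0.0, scale=1 / 3, size=n)
y = np.sin((3 * x1 + x2) / 5) + eps
```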
The key insights from analyzing the Rashomon Quartet (see the model-fitting sketch after this list) are:
The linear regression model finds x1 to be the most important predictor, with a smaller negative contribution from x3.
The decision tree model only uses x1, ignoring the other predictors.
The random forest model uses all three predictors, with x1 being the strongest.
The neural network model finds a non-monotonic relationship between the target and x3.
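A minimal sketch of fitting the four model classes, continuing from the data-generation snippet above. The scikit-learn estimators and every hyperparameter here are placeholder stand-ins, since this summary does not specify the article's exact configurations, so the printed metrics will not reproduce R^2 = 0.729 / RMSE = 0.354 exactly.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# All hyperparameters below are illustrative, not the article's settings.
models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(max_depth=3, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "neural network": MLPRegressor(
        hidden_layer_sizes=(8, 4), max_iter=5000, random_state=0
    ),
}

# Fit each model and report test-set performance.
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    print(f"{name:17s}  R^2 = {r2_score(y_test, pred):.3f}  RMSE = {rmse:.3f}")
```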
The authors also analyze the model residuals, showing that they are highly correlated across the four models, and suggest further questions for exploring the differences in the models' perspectives on the data. They conclude by emphasizing that performance metrics alone are not enough to fully understand predictive models and that techniques for model visualization and comparison are essential.
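The residual analysis could be reproduced along these lines, continuing from the sketches above; the pandas-based correlation table is one plausible way to do it, not the authors' own code.

```python
import pandas as pd

# Pairwise correlation of test-set residuals across the four models,
# mirroring the residual analysis described above. High values mean
# the models tend to err on the same observations.
residuals = pd.DataFrame(
    {name: y_test - model.predict(X_test) for name, model in models.items()}
)
print(residuals.corr().round(2))
```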
Statistics
y = sin((3x1 + x2)/5) + ε, where ε ~ N(0, 1/3) and (x1, x2, x3) ~ N(0, Σ), with Σ a 3×3 covariance matrix that has 1 on the diagonal and 0.9 off the diagonal.
Quotes
"Similar performance of the best-fitted models does not mean that they encode similar stories about data."
"Today, performance is not enough."