Core Concept
A machine learning model that performs well during development does not necessarily translate into a useful solution that can be deployed on real-world datasets in structural engineering applications.
Summary
This paper discusses the challenges in deploying machine learning (ML) models for structural engineering applications. Despite their promising performance, ML-based solutions are usually demonstrated only as proofs of concept and are rarely deployed in real-world applications.
The key challenges highlighted in the paper include:
- Generalizability beyond the training set:
  - Overfitting can lead to poor performance on new data because the model captures pseudo-relationships or irrelevant patterns in the training data.
  - Inadequate training datasets that do not represent the diversity of real-world data can result in poor deployment performance.
- Explainability through feature importance:
  - Feature importance metrics may not accurately reflect the true importance of variables, as they often measure the consequence of randomly permuting a feature rather than removing it entirely.
  - Underspecification, where multiple distinct sets of features may satisfy the evaluation criteria equally well, can lead to misleading interpretations of feature importance.
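The gap between permuting a feature and removing it can be illustrated with a small synthetic sketch. It does not use the paper's datasets or models: the data, the linear least-squares model, and the helper names (`fit`, `mse`, `permutation_importance`, `drop_column_importance`) are all assumptions made for illustration. Feature `x2` is constructed as a near copy of `x1`, so removing `x1` and refitting costs little accuracy, while permuting `x1` in an already fitted model looks very damaging.

```python
import numpy as np

# Synthetic data (assumption, not from the paper): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
y = 2.0 * x1 + 0.5 * x3 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

train, test = slice(0, 1000), slice(1000, None)

def fit(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    """Mean squared error of a fitted linear model."""
    A = np.column_stack([X, np.ones(len(X))])
    return float(np.mean((A @ coef - y) ** 2))

coef = fit(X[train], y[train])
base = mse(coef, X[test], y[test])

def permutation_importance(j):
    """Increase in test MSE after shuffling column j; the model is NOT refitted."""
    Xp = X[test].copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    return mse(coef, Xp, y[test]) - base

def drop_column_importance(j):
    """Increase in test MSE after removing column j and refitting the model."""
    keep = [k for k in range(X.shape[1]) if k != j]
    c = fit(X[train][:, keep], y[train])
    return mse(c, X[test][:, keep], y[test]) - base

perm = [permutation_importance(j) for j in range(3)]
drop = [drop_column_importance(j) for j in range(3)]
for name, p, d in zip(["x1", "x2", "x3"], perm, drop):
    print(f"{name}: permutation={p:.3f}  drop-column={d:.3f}")
```

In this setup the two metrics disagree sharply for `x1`: the permutation score is large, yet a model without `x1` performs almost as well because `x2` carries nearly the same information. A model that leaned on `x2` instead would fit the data about equally well, which is one simple way that underspecification can make importance rankings misleading.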
The paper presents two illustrative examples using datasets from finite element simulations of cold-formed steel channels and experimental studies on reinforced concrete walls. These examples demonstrate the issues of overfitting, variable omission bias, and underspecification, highlighting the importance of implementing rigorous model validation techniques, careful physics-informed feature selection, and considerations of both model complexity and generalizability for successful deployment of ML models in structural engineering.
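As a rough illustration of why validation has to look beyond training accuracy, the sketch below compares training error with 5-fold cross-validated error for polynomial models of increasing complexity. Everything here is an assumption for illustration: the data are synthetic (a noisy sine curve, not the cold-formed steel or reinforced concrete wall datasets), and the polynomial degrees and helper names (`train_mse`, `cv_mse`) are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Small synthetic dataset (assumption): noisy samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(42)
n = 30
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(3.0 * x) + 0.2 * rng.normal(size=n)

# One fixed 5-fold split, reused for every model so the comparison is fair.
folds = np.array_split(rng.permutation(n), 5)

def train_mse(deg):
    """Error measured on the same data used for fitting."""
    p = Polynomial.fit(x, y, deg)
    return float(np.mean((p(x) - y) ** 2))

def cv_mse(deg):
    """Error measured only on points held out from fitting."""
    errs = []
    for held_out in folds:
        mask = np.ones(n, dtype=bool)
        mask[held_out] = False
        p = Polynomial.fit(x[mask], y[mask], deg)
        errs.append((p(x[held_out]) - y[held_out]) ** 2)
    return float(np.mean(np.concatenate(errs)))

# Map each degree to (training MSE, cross-validated MSE).
results = {deg: (train_mse(deg), cv_mse(deg)) for deg in (1, 5, 15)}
for deg, (tr, cv) in results.items():
    print(f"degree {deg:2d}: train MSE={tr:.4f}  CV MSE={cv:.4f}")
```

Training error can only decrease as the degree grows, so judged on training accuracy alone the most complex model always looks best. The held-out error is the quantity that indicates how the model might behave on unseen data, and it is what exposes the overfitted high-degree fit.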
Statistics
The cold-formed steel channel dataset contains information on 14 features, including channel depth, flange width, stiffener length, thickness, slot dimensions, and material properties.
The reinforced concrete wall dataset contains information on 15 features, including slenderness ratio, shear stress demand, reinforcement details, and axial load ratio.
Quotes
"Fundamentally, almost all ML models (due to their statistical nature, and similar to other well-established approaches such as empirical analysis) capture data association rather than causal relationships."
"Even when ignoring the limitations of ML models in identifying causal relationships, many ML models make it difficult to determine the nature of the association between variables. Such a 'black-box' aspect negatively affects user confidence when applying the ML model in deployment."
"The over-reliance on accuracy metrics during model development might be inadequate, or even problematic, for deployment under specific circumstances."