
A Priori Uncertainty Quantification of Reacting Turbulence Closure Models using Bayesian Neural Networks


Core Concepts
The authors employ Bayesian neural networks to quantify both epistemic and aleatoric uncertainties in a reacting flow model, focusing on the sub-filter progress variable dissipation rate. The approach provides unique insights into the structure of uncertainty in data-driven closure models.
Abstract
This summary covers the application of Bayesian neural networks (BNNs) for uncertainty quantification in reacting turbulence closure models. It highlights the importance of data-driven closure strategies, the challenges posed by epistemic and aleatoric uncertainties, and methods for incorporating out-of-distribution (OOD) information. The study demonstrates how synthetic data can be used to enforce desired extrapolation behavior and improve model performance outside the training distribution.
Stats
Increased adoption of data-driven models requires reliable uncertainty estimates in both the data-informed and out-of-distribution regimes.
BNN models can provide unique insights into the structure of uncertainty in data-driven closure models.
The dataset contains 7.88 × 10^6 data points for training and 2.63 × 10^6 data points for testing.
The primary drawback of Gaussian processes is the O(n^3) computational expense to train and O(n^2) cost to evaluate, due to matrix inversion and multiplication.
BNNs emulate an ensemble of neural networks by assigning a probability distribution to each network parameter, in contrast to deterministic models, which act as point estimators.
Quotes
"In this work, we employ Bayesian neural networks (BNNs) to capture both epistemic and aleatoric uncertainties in a reacting flow model." "BNNs reformulate deterministic deep learning models as point estimators and emulate an ensemble of neural nets by assigning a probability distribution to each network parameter." "The efficacy of the model is demonstrated by a priori evaluation on a dataset consisting of a variety of flame conditions and fuels."

Deeper Inquiries

How can epistemic uncertainty guide future data collection efforts?

Epistemic uncertainty, stemming from a lack of knowledge due to insufficient training data, can provide valuable insights for guiding future data collection efforts. By identifying regions in the dataset where the model lacks confidence or exhibits high variability (high epistemic uncertainty), researchers can prioritize collecting additional data in those specific areas. This targeted approach helps improve the model's predictive quality by filling gaps in the dataset and reducing uncertainties associated with limited information.
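As a hedged illustration of this targeting strategy, the snippet below ranks a pool of candidate points by epistemic uncertainty so that data collection can focus on the most uncertain regions. The function `bnn_predict` is a hypothetical stand-in for any model that returns one Monte Carlo predictive sample per call (such as the layer sketched above):

```python
# Rank candidate inputs by epistemic uncertainty for targeted data collection.
import numpy as np

def rank_by_epistemic_uncertainty(bnn_predict, candidates, n_samples=128):
    """Return candidate indices sorted from most to least uncertain."""
    # Each call draws a fresh weight sample, emulating one ensemble member.
    samples = np.stack([bnn_predict(candidates) for _ in range(n_samples)])
    epistemic = samples.std(axis=0).ravel()   # spread over weight samples
    return np.argsort(epistemic)[::-1]        # collect data where the model is least sure

# Usage (hypothetical): top_idx = rank_by_epistemic_uncertainty(model, pool)[:1000]
```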

What are the implications of using synthetic data for enforcing extrapolation behavior?

Using synthetic data to enforce extrapolation behavior has significant implications for model performance outside the original dataset distribution. Synthetic data generation methods such as the soft Brownian offset (SBO) and normalizing flows (NF) produce out-of-distribution (OOD) samples that can be included during training, so the desired extrapolation behavior (for example, inflated predictive uncertainty away from the data) is enforced rather than left unspecified. The choice between SBO and NF affects how well the model generalizes to OOD scenarios: NF tends to generate more uniformly distributed, higher-quality OOD datasets than SBO, leading to better behavior on unseen data points. A rough sketch of the SBO procedure follows.
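The sketch below follows the general SBO idea from the literature (random-walking a copy of a training point away from the dataset until its nearest-neighbor distance exceeds a target offset); it is an assumption-laden illustration, not necessarily the exact variant used in the paper:

```python
# Rough soft Brownian offset (SBO) sketch: drift a point away from the
# dataset via accepted Brownian steps until it is at least d_min away.
import numpy as np

rng = np.random.default_rng(1)

def soft_brownian_offset(data, d_min=1.0, step=0.1, max_iter=500):
    """Generate one OOD sample roughly d_min away from `data` (n_points, n_dims)."""
    x = data[rng.integers(len(data))].copy()  # start from a random in-distribution point
    for _ in range(max_iter):
        nn_dist = np.linalg.norm(data - x, axis=1).min()
        if nn_dist >= d_min:                  # far enough from the dataset: done
            break
        proposal = x + step * rng.normal(size=x.shape)  # Brownian perturbation
        # Keep the move only if it drifts further away from the dataset.
        if np.linalg.norm(data - proposal, axis=1).min() > nn_dist:
            x = proposal
    return x  # may still be closer than d_min if max_iter is exhausted

# Usage (hypothetical): ood = np.stack([soft_brownian_offset(train_X, d_min=2.0) for _ in range(100)])
```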

How do different methods for generating out-of-distribution synthetic data impact model performance?

The method used for generating out-of-distribution (OOD) synthetic data plays a crucial role in model performance. NF consistently outperforms SBO at creating uniformly distributed OOD datasets, which enhances the model's ability to generalize. As more synthetic data is added, both methods improve OOD behavior, though returns diminish once the amount of synthetic data approaches the size of the original dataset. The distance between the synthetic and original datasets also influences accuracy when predicting beyond the known distribution; a simple diagnostic along these lines is sketched below.
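One hedged way to compare OOD sets from different generators (e.g., SBO vs. NF) is to examine the distribution of distances from each synthetic point to the original dataset; the variable names `ood_sbo` and `ood_nf` in the usage comment are hypothetical placeholders:

```python
# Diagnostic sketch: nearest-neighbor distance profile of a synthetic OOD set.
# A well-controlled distance profile suggests a more uniform OOD "shell".
import numpy as np

def ood_distance_profile(train, synthetic):
    """Nearest-neighbor distance of each synthetic point to the training set."""
    # Pairwise distances (m, n); for very large sets, chunk this computation.
    d = np.linalg.norm(synthetic[:, None, :] - train[None, :, :], axis=-1)
    return d.min(axis=1)

# Usage (hypothetical):
# for name, ood in {"SBO": ood_sbo, "NF": ood_nf}.items():
#     prof = ood_distance_profile(train_X, ood)
#     print(name, prof.mean(), prof.std())
```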