Core Concepts
A novel approach estimates the exceedance probability of significant wave height by applying a cumulative distribution function to the forecasts of a regression model.
Abstract
The paper presents a novel approach for estimating the exceedance probability of significant wave height (SWH) time series. The key idea is to take the numeric forecasts produced by a regression model and convert them into exceedance probabilities using the cumulative distribution function (CDF) of a distribution centered on each forecast.
The authors first formalize the problem of SWH forecasting as a time series prediction task, where the goal is to predict the future values of the SWH time series. They experiment with several regression models, including random forest regression, LASSO, a heterogeneous regression ensemble, and a deep neural network.
To estimate the exceedance probability, the authors propose a method that uses the CDF of the predicted SWH values. Specifically, they model each prediction as a Normal distribution whose mean is the forecast and whose standard deviation is computed from the training data. The exceedance probability is then the complement of the CDF evaluated at the predefined threshold, that is, one minus the probability of staying at or below the threshold.
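The CDF-based step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `exceedance_probability`, the example forecast, and the example standard deviation are hypothetical. Only the Normal assumption and the 3.17 m threshold come from the summary.

```python
from statistics import NormalDist


def exceedance_probability(forecast, sigma, threshold):
    """Return P(SWH > threshold), assuming SWH ~ Normal(forecast, sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Complement of the Normal CDF evaluated at the threshold.
    return 1.0 - NormalDist(mu=forecast, sigma=sigma).cdf(threshold)


# Hypothetical inputs: a 2.9 m point forecast and a 0.6 m standard deviation.
p = exceedance_probability(forecast=2.9, sigma=0.6, threshold=3.17)
print(f"Estimated exceedance probability: {p:.3f}")
```

A forecast exactly at the threshold yields a probability of 0.5, and the probability grows as the forecast moves above the threshold or as the standard deviation widens around a forecast below it.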
The authors compare the proposed CDF-based approach with two alternative strategies for estimating exceedance probability: binary classification models and ensemble-based direct methods. The experiments are conducted on a real-world SWH dataset collected from a buoy near Halifax, Canada.
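For contrast, the two alternative strategies can be sketched as follows. The summary does not spell out their details, so this is one plausible reading rather than the authors' implementation: the ensemble-based direct estimate is taken to be the fraction of ensemble members that forecast a value above the threshold, and the binary classification target is taken to be an indicator of whether the observed value exceeds the threshold. The function names and example numbers are hypothetical.

```python
def ensemble_exceedance_probability(member_forecasts, threshold):
    """Fraction of ensemble members whose forecast exceeds the threshold."""
    if not member_forecasts:
        raise ValueError("at least one ensemble member forecast is required")
    exceeding = sum(1 for f in member_forecasts if f > threshold)
    return exceeding / len(member_forecasts)


def exceedance_labels(observed_values, threshold):
    """Binary targets for a classifier: 1 if the value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in observed_values]


# Hypothetical forecasts from four ensemble members, in meters.
print(ensemble_exceedance_probability([2.0, 3.5, 4.0, 3.0], threshold=3.17))
# Hypothetical observed SWH values, in meters.
print(exceedance_labels([1.2, 3.4, 2.8, 5.1], threshold=3.17))
```

Under this reading, the ensemble estimate can only take values in steps of one over the number of members, whereas the CDF-based estimate varies continuously with the forecast.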
The results show that the proposed CDF-based method, when coupled with a strong regression model such as a deep neural network, outperforms the alternative approaches in terms of the area under the ROC curve (AUC). The authors also analyze the impact of the forecasting horizon and the sensitivity of the results to the choice of probability distribution.
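To make the evaluation metric concrete, the sketch below computes AUC from estimated exceedance probabilities and binary exceedance labels. It uses the standard pairwise definition (the probability that a randomly chosen exceedance event receives a higher score than a randomly chosen non-event, with ties counted as one half). The function name `roc_auc` and the example data are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(probabilities, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [p for p, y in zip(probabilities, labels) if y == 1]
    negatives = [p for p, y in zip(probabilities, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # event ranked above non-event
            elif pos == neg:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical exceedance probabilities and observed outcomes.
scores = [0.1, 0.4, 0.35, 0.8]
events = [0, 0, 1, 1]
print(roc_auc(scores, events))
```

An AUC of 1.0 means every exceedance event was scored above every non-event, while 0.5 corresponds to ranking no better than chance.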
Overall, the paper presents a novel and effective approach for estimating the exceedance probability of SWH, which can be valuable for managing maritime operations and renewable energy production.
Stats
The significant wave height (SWH) time series has an hourly granularity and spans from 11-02-2000 15:00:00 to 01-04-2020 11:00:00.
The average threshold for exceedance, computed as the 95th percentile of the SWH data, is 3.17 meters.
Quotes
"Forecasting the ocean wave conditions is valuable for multiple operations. The main motivation is related to renewable energy, where forecasts are used to estimate energy production. Moreover, these forecasts are also useful for managing the safety of maritime operations."
"We frame the prediction of impending large values of a time series as an exceedance probability forecasting problem. Exceedance probability forecasting denotes the process of estimating the probability that a time series will exceed a predefined threshold in a predefined future period."