Core Concept
The symbolic regression (SR) problem is NP-hard: it is shown to be equivalent to the NP-hard degree-constrained Steiner Arborescence problem (DCSAP) posed on a symbol graph.
Summary
The paper introduces the symbol graph, a representation of the entire space of mathematical expressions. It then connects the SR problem to the task of finding an optimally fitted degree-constrained Steiner arborescence within this graph.
The key insights are:
DCSAP is proven to be NP-hard in Lemma 1.
The SR problem is equivalent to solving DCSAP on the symbol graph, with the root vertex '⋄' and one variable vertex set as terminals.
Since DCSAP is NP-hard and the SR problem is equivalent to it, the SR problem is also NP-hard.
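To make the correspondence concrete, the following is a minimal Python sketch of how a single candidate expression can be viewed as a degree-constrained arborescence rooted at '⋄'. It is an illustration under stated assumptions, not the paper's construction: the symbol set, the `ARITY` table, the node-id encoding of the tree, and the helper names are all hypothetical, and the degree constraint is modelled simply as "out-degree equals the symbol's arity".

```python
from math import sin

# Hypothetical symbol table: each symbol's arity is its required out-degree.
ARITY = {"⋄": 1, "+": 2, "*": 2, "sin": 1, "x": 0, "c": 0}

# Candidate expression sin(x) + c as a rooted tree:
# node id -> (symbol, list of child node ids). Node 0 is the root '⋄'.
TREE = {
    0: ("⋄", [1]),
    1: ("+", [2, 4]),
    2: ("sin", [3]),
    3: ("x", []),
    4: ("c", []),
}


def is_degree_constrained_arborescence(tree, root=0, terminals=("⋄", "x")):
    """Check that every node is reached exactly once from the root, that each
    node's out-degree matches its arity, and that all terminals appear."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen or node not in tree:
            return False  # shared child, cycle, or dangling edge
        seen.add(node)
        symbol, children = tree[node]
        if len(children) != ARITY[symbol]:
            return False  # degree constraint violated
        stack.extend(children)
    if seen != set(tree):
        return False  # some node is unreachable from the root
    symbols = {tree[n][0] for n in seen}
    return all(t in symbols for t in terminals)


def evaluate(tree, node, x, c):
    """Evaluate the expression encoded by the subtree at `node`."""
    symbol, children = tree[node]
    args = [evaluate(tree, child, x, c) for child in children]
    if symbol == "⋄":
        return args[0]
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "*":
        return args[0] * args[1]
    if symbol == "sin":
        return sin(args[0])
    if symbol == "x":
        return x
    if symbol == "c":
        return c
    raise ValueError(f"unknown symbol: {symbol}")


def mean_squared_error(tree, data, c):
    """Fitting error of the tree on (x, y) pairs; lower means a better fit."""
    return sum((evaluate(tree, 0, x, c) - y) ** 2 for x, y in data) / len(data)
```

In this picture, searching for the best expression amounts to searching over all valid arborescences for the one with the lowest fitting error, which is where the combinatorial hardness enters.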
The proof is more general than previous attempts: it covers a broad range of mathematical expressions, whereas earlier proofs considered only simple linear sums. This establishes the NP-hardness of the real-world SR problem more conclusively.