Key Concept
A new methodology for applying machine learning to symbolic computation: a well-known human-designed heuristic is represented as a constrained neural network, and machine learning is then used to optimize it further, yielding new networks of similar size and complexity to the original.
Abstract
The paper presents a new approach to applying machine learning in symbolic computation research, specifically to optimizing computer algebra systems (CASs). The authors explain how a well-known human-designed heuristic for choosing the variable ordering in cylindrical algebraic decomposition (CAD) can be represented as a constrained neural network. They then use machine learning methods to optimize the heuristic further, producing new networks of similar size and complexity to the original human-designed one.
The key steps are:
- Formalizing the Brown heuristic for variable ordering in CAD as a set of three metrics based on the input polynomials.
- Interpreting the Brown heuristic as a dense 2-layer neural network with summation activation functions, where the weights are selected to ensure the network orders the variables in the same way as the original heuristic.
- Performing feature selection to identify a new set of three features that outperform the Brown heuristic on a dataset of 3-variable polynomial problems.
- Tuning the weights of the neural network using the new features, leading to further improvements in the computing time for CAD on the test dataset.
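The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the three Brown metrics are, per variable, (1) its maximum degree in the input, (2) the maximum total degree of any term containing it, and (3) the number of terms containing it, with lower values eliminated first. The polynomial encoding, the function names, and the weight choice `(base**2, base, 1)` are all hypothetical, chosen only to show how a weighted sum can reproduce a lexicographic tie-breaking rule.

```python
from typing import Dict, List, Tuple

Monomial = Tuple[int, ...]        # exponent of each variable, e.g. (2, 1, 0) for x^2*y
Polynomial = Dict[Monomial, int]  # monomial -> coefficient (hypothetical encoding)


def brown_metrics(polys: List[Polynomial], var: int) -> Tuple[int, int, int]:
    """Return the three assumed Brown metrics for the variable at index `var`."""
    max_degree = 0        # highest degree of the variable in any term
    max_total_degree = 0  # highest total degree of a term containing the variable
    num_terms = 0         # number of terms containing the variable
    for poly in polys:
        for monomial in poly:
            if monomial[var] > 0:
                max_degree = max(max_degree, monomial[var])
                max_total_degree = max(max_total_degree, sum(monomial))
                num_terms += 1
    return max_degree, max_total_degree, num_terms


def lexicographic_order(polys: List[Polynomial], num_vars: int) -> List[int]:
    """Order variables by comparing metric triples lexicographically."""
    return sorted(range(num_vars), key=lambda v: brown_metrics(polys, v))


def weighted_sum_order(polys: List[Polynomial], num_vars: int) -> List[int]:
    """Order variables by one weighted sum of the metrics per variable.

    With base larger than every metric value, the weights (base^2, base, 1)
    make the sum rank triples exactly as lexicographic comparison does.
    """
    metrics = [brown_metrics(polys, v) for v in range(num_vars)]
    base = max(max(m) for m in metrics) + 1
    weights = (base ** 2, base, 1)
    scores = [sum(w * m for w, m in zip(weights, ms)) for ms in metrics]
    return sorted(range(num_vars), key=lambda v: scores[v])


# Example input in variables (x, y, z): x^2*y + z and y^3 + x*z.
example_polys: List[Polynomial] = [
    {(2, 1, 0): 1, (0, 0, 1): 1},
    {(0, 3, 0): 1, (1, 0, 1): 1},
]
print(lexicographic_order(example_polys, 3))  # [2, 0, 1], i.e. z, then x, then y
print(weighted_sum_order(example_polys, 3))   # same ordering
```

Once the heuristic is written as a weighted sum, the weights become parameters that can be tuned on data, which is the role the later steps assign to machine learning.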
The authors present this approach as a form of ante-hoc explainability: the machine learning outputs are comparable in complexity to a human-designed heuristic, which may allow new mathematical insights. They suggest the methodology could be applied to other variable ordering choices in symbolic computation, and potentially adapted to other kinds of heuristic choices as well.
Statistics
The computing time for the Brown heuristic on the NLSAT dataset of 3-variable polynomials was 10,580 seconds.
The computing time for the neural network with the new feature triplet was 10,181 seconds, which is 399 seconds (about 3.8%) shorter than the Brown heuristic.
After 3 epochs of weight tuning, the computing time decreased further to 9,908 seconds, which is 672 seconds (about 6.4%) shorter than the Brown heuristic.
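The savings implied by these timings can be checked with a short calculation. The sketch below uses only the three figures reported above. The helper name `reduction` is hypothetical.

```python
def reduction(baseline: float, new: float) -> tuple:
    """Return (seconds saved, percentage saved) relative to the baseline."""
    saved = baseline - new
    return saved, 100.0 * saved / baseline


brown_seconds = 10580         # Brown heuristic
new_features_seconds = 10181  # network with the new feature triplet
tuned_seconds = 9908          # after 3 epochs of weight tuning

for label, seconds in [("new features", new_features_seconds),
                       ("tuned weights", tuned_seconds)]:
    saved, percent = reduction(brown_seconds, seconds)
    print(f"{label}: {saved} s saved ({percent:.1f}% of the Brown time)")
```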
Quotes
"We present this as a form of ante-hoc explainability for use in computer algebra development."
"It remains to be shown whether these more interpretable ML outputs can lead to new mathematical understanding."