Core Concepts
The author presents parallel algorithms for exact enumeration of neural network activation regions, focusing on the organization and formation of these regions.
Abstract
The paper discusses the importance of understanding neural network activation regions and introduces parallel algorithms for their exact enumeration. It highlights that parallelism is essential for efficiently processing networks beyond toy examples.
A feedforward neural network with rectified linear units maps inputs to outputs by partitioning its input space into convex regions, with all points in a region sharing the same affine transformation. The study designs and implements algorithms for exact region enumeration in networks beyond toy examples. It presents a novel algorithmic framework and parallel algorithms for region enumeration, and shows how input dimension affects the further partitioning of regions by deeper layers. The implementation runs on larger networks than those reported in the existing literature, underscoring the importance of parallelism for efficient region enumeration.
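To make the region picture concrete, the following sketch (hypothetical code, not taken from the paper) computes, for a small fully connected ReLU network with arbitrary weights, the activation pattern that identifies the convex region containing a given input, together with the affine map the network applies on that entire region.

```python
import numpy as np

def region_affine_map(weights, biases, x):
    """Return (pattern, A, c): the activation pattern identifying the convex
    region that contains x, and the affine map f(x') = A @ x' + c that the
    network applies to every point of that region (final layer is linear)."""
    x = np.asarray(x, dtype=float)
    A = np.eye(len(x))                      # running linear part of the composition
    c = np.zeros(len(x))                    # running offset of the composition
    pattern = []
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ h + b                     # pre-activations of this hidden layer
        active = pre > 0                    # which ReLUs fire; this fixes the region
        pattern.append(active)
        D = np.diag(active.astype(float))   # zero out rows of inactive units
        A, c = D @ W @ A, D @ (W @ c + b)
        h = np.maximum(pre, 0.0)
    W_out, b_out = weights[-1], biases[-1]  # output layer applies no ReLU
    return pattern, W_out @ A, W_out @ c + b_out

# Toy example with hypothetical random weights: 2 inputs -> 3 hidden -> 1 output.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((3, 2)), rng.standard_normal((1, 3))]
bs = [rng.standard_normal(3), rng.standard_normal(1)]
pattern, A, c = region_affine_map(Ws, bs, [0.5, -1.0])
# On the whole region containing [0.5, -1.0], the network is just x -> A @ x + c.
```

Every input in the same region yields the same activation pattern and hence the same affine map, so enumerating regions amounts to enumerating the realizable activation patterns.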
Artificial neural networks dominate artificial intelligence, yet a fundamental theoretical understanding of how they work is still lacking. This gap forces heuristic decisions in network design and hinders tailoring networks optimally to specific problems. Deep neural networks with rectified linear activation functions, however, can be described relatively straightforwardly, offering insight into their operation through the polytope structures instantiated by the network parameters.
The paper addresses how to design and use parallel algorithms to enumerate the polyhedral activation regions of realistically sized networks. It introduces the LayerWise-NNCE-Framework for designing serial cell-enumeration algorithms from computational-geometry subroutines, and presents parallel algorithms for common problem settings whose running time is linear in the number of cells.
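The layer-wise idea can be illustrated with a brute-force sketch: each cell carried forward from layer ℓ is refined by the hyperplanes of layer ℓ+1 restricted to that cell, and an LP feasibility test plays the role of the computational-geometry subroutine. This is a hypothetical illustration under those assumptions, not the LayerWise-NNCE-Framework itself, which plugs in more efficient subroutines; the sketch enumerates all sign patterns and is only usable for tiny networks.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def nonempty(G, h, box=100.0, eps=1e-7):
    """LP feasibility test: does {x : G x <= h - eps, |x_i| <= box} contain a point?
    Serves as the geometric subroutine deciding whether a candidate cell is realised."""
    res = linprog(c=np.zeros(G.shape[1]), A_ub=G, b_ub=h - eps,
                  bounds=[(-box, box)] * G.shape[1], method="highs")
    return res.status == 0

def enumerate_cells(weights, biases, in_dim):
    """Layer-wise enumeration sketch over the hidden layers: each cell is kept
    as (A, c, G, h), where f(x) = A x + c is the affine map of the network-so-far
    on the cell and G x <= h are the inequalities defining the cell."""
    cells = [(np.eye(in_dim), np.zeros(in_dim), np.zeros((0, in_dim)), np.zeros(0))]
    for W, b in zip(weights, biases):
        refined = []
        for A, c, G, h in cells:
            # Hyperplanes of this layer restricted to the cell: (W A) x + (W c + b) = 0.
            M, m = W @ A, W @ c + b
            for signs in itertools.product([1, -1], repeat=W.shape[0]):
                S = np.diag(signs).astype(float)
                # sign +1: neuron active (M_i x + m_i > 0); sign -1: clamped to zero.
                G_new = np.vstack([G, -S @ M])
                h_new = np.concatenate([h, S @ m])
                if not nonempty(G_new, h_new):
                    continue                # this sign pattern cuts out no region here
                D = np.diag([(s + 1) // 2 for s in signs]).astype(float)
                refined.append((D @ M, D @ m, G_new, h_new))
        cells = refined
    return cells

# Hypothetical toy run: two hidden layers acting on a 2-D input space.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((3, 2)), rng.standard_normal((2, 3))]
bs = [rng.standard_normal(3), rng.standard_normal(2)]
print(len(enumerate_cells(Ws, bs, in_dim=2)))  # number of activation regions
```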
Stats
To our knowledge, the implemented algorithm runs on networks larger than any used in the existing literature.
Running time is linear in the number of cells.
|C1| ≫ p (the number of first-layer cells is much larger than the number of available processors).
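The |C1| ≫ p condition can be read as a load-balancing requirement: first-layer cells are distributed across processors and each is refined through the deeper layers independently, so total work grows linearly with the number of cells. The following is a hypothetical sketch of that scheduling pattern; count_subcells is a placeholder for the per-cell serial enumeration, and none of these names come from the paper.

```python
import os
from concurrent.futures import ProcessPoolExecutor

def count_subcells(cell):
    """Placeholder for the per-cell work: refine one first-layer cell through
    the deeper layers and count the resulting sub-cells. (Hypothetical; in
    practice this is the serial layer-wise enumeration restricted to the cell.)"""
    return 1

def count_regions_parallel(first_layer_cells, workers=None):
    """Distribute first-layer cells across a process pool. Because the number of
    first-layer cells far exceeds the number of processors, each worker handles
    many cells and the load stays balanced."""
    with ProcessPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        counts = pool.map(count_subcells, first_layer_cells, chunksize=64)
    return sum(counts)
```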