This paper addresses the problem of selecting the most important parameters in a neural network under a fixed budget, a question central to model compression, pruning, and efficient architecture design. The authors establish a theoretical framework for analyzing importance-score approximation under cardinality constraints, derive approximation bounds for greedy selection, and propose novel algorithms with provable guarantees.
Key findings
Formulates importance score approximation as a submodular maximization problem under cardinality constraints.
Derives a (1 - 1/e)-approximation guarantee for greedy importance-score selection.
Establishes connections between neural network pruning theory and compressed sensing, submodular optimization, and the Lottery Ticket Hypothesis.
Proposes novel algorithms that adaptively allocate the parameter budget across layers based on curvature information.
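The greedy scheme behind the (1 - 1/e) guarantee can be made concrete. The sketch below is illustrative, not the paper's algorithm: `greedy_select` and the toy coverage objective are hypothetical stand-ins for any monotone submodular importance objective evaluated under a cardinality budget.

```python
def greedy_select(ground_set, objective, budget):
    """Greedily maximize a monotone submodular `objective`
    subject to the cardinality constraint |S| <= budget."""
    selected = set()
    for _ in range(budget):
        # Pick the element with the largest marginal gain.
        best, best_gain = None, 0.0
        for e in ground_set - selected:
            gain = objective(selected | {e}) - objective(selected)
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:  # no remaining element adds positive value
            break
        selected.add(best)
    return selected

# Illustrative objective: weighted coverage, a classic monotone
# submodular function (each element "covers" a set of items).
groups = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d"}}
coverage = lambda S: len(set().union(*(groups[i] for i in S))) if S else 0

print(greedy_select(set(groups), coverage, budget=2))
```

For monotone submodular objectives, the classical Nemhauser-Wolsey-Fisher result guarantees that this loop attains at least (1 - 1/e) of the optimal value; here the greedy pick covers 3 of the 4 items, which is in fact optimal for a budget of 2.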
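The curvature-adaptive budget allocation can likewise be sketched. This is a minimal illustration, not the paper's method: it assumes OBD-style scores s_i = 0.5 * H_ii * w_i^2 from a diagonal Hessian, and the layer names, sizes, and `allocate_budget` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-layer weights and diagonal Hessian (curvature) estimates.
layers = {
    "conv1": (rng.normal(size=100), rng.uniform(0.1, 1.0, size=100)),
    "conv2": (rng.normal(size=400), rng.uniform(0.1, 2.0, size=400)),
    "fc":    (rng.normal(size=500), rng.uniform(0.1, 0.5, size=500)),
}

def allocate_budget(layers, budget):
    """Keep the `budget` parameters with the largest scores
    s_i = 0.5 * H_ii * w_i**2, pooled across layers; per-layer
    budgets then follow from where the curvature mass lies."""
    scores = []
    for name, (w, h) in layers.items():
        s = 0.5 * h * w ** 2
        scores.extend((float(si), name) for si in s)
    scores.sort(reverse=True)  # global top-k across all layers
    alloc = {name: 0 for name in layers}
    for _, name in scores[:budget]:
        alloc[name] += 1
    return alloc

print(allocate_budget(layers, budget=200))
```

Pooling scores globally, rather than fixing a per-layer ratio in advance, is what lets the budget concentrate in layers where curvature indicates pruning would be most damaging.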
Limitations & open questions
The theoretical framework assumes a local quadratic approximation of the loss near the optimal parameters, which may not hold for all neural network architectures.
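To make the assumption concrete: the standard second-order expansion behind curvature-based importance scores (in the style of Optimal Brain Damage; the notation here is illustrative, not taken from the paper) expands the loss around a local minimum $\theta^*$, where the gradient term vanishes:

```latex
\Delta \mathcal{L}(\delta\theta)
  \approx \tfrac{1}{2}\,\delta\theta^{\top} H \,\delta\theta,
\qquad
H = \nabla^2_{\theta}\,\mathcal{L}(\theta^*),
```

so pruning parameter $i$ (setting $\delta\theta_i = -\theta_i^*$) under a diagonal approximation of $H$ costs roughly $s_i = \tfrac{1}{2} H_{ii} (\theta_i^*)^2$. Away from a local minimum, or when the loss is strongly non-quadratic, the neglected gradient and higher-order terms can dominate, which is the source of the limitation noted above.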