Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Adityanarayanan Radhakrishnan; Daniel Beaglehole; Parthe Pandit; Mikhail Belkin

doi:10.1126/science.adi5639

Science. 2024 Mar 29;383(6690):1461-1467. doi: 10.1126/science.adi5639. Epub 2024 Mar 7.

Authors

Adityanarayanan Radhakrishnan^#¹ ², Daniel Beaglehole^#³, Parthe Pandit⁴ ⁵, Mikhail Belkin³ ⁵

Affiliations

¹ Harvard School of Engineering and Applied Sciences, Cambridge, MA 02138, USA.
² Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
³ Computer Science and Engineering, UC San Diego, La Jolla, CA 92093, USA.
⁴ Center for Machine Intelligence and Data Science, IIT Bombay, Mumbai 400076, India.
⁵ Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA 92093, USA.

^# Contributed equally.

PMID: 38452048
DOI: 10.1126/science.adi5639

Abstract

Understanding how neural networks learn features, or relevant patterns in data, for prediction is necessary for their reliable use in technological and scientific applications. In this work, we presented a unifying mathematical mechanism, known as average gradient outer product (AGOP), that characterized feature learning in neural networks. We provided empirical evidence that AGOP captured features learned by various neural network architectures, including transformer-based language models, convolutional networks, multilayer perceptrons, and recurrent neural networks. Moreover, we demonstrated that AGOP, which is backpropagation-free, enabled feature learning in machine learning models, such as kernel machines, that a priori could not identify task-specific features. Overall, we established a fundamental mechanism that captured feature learning in neural networks and enabled feature learning in general machine learning models.
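
The abstract names the average gradient outer product without writing it out. Read literally, it is the average over data points of the outer product of a predictor's input gradient with itself. The following is a minimal sketch under that reading, not the authors' implementation: `toy_predictor`, the synthetic data, and the finite-difference gradient are all assumptions made for illustration.

```python
import math
import random


def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of a scalar function f at x."""
    grad = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad


def agop(f, points):
    """Average over points of the outer product grad f(x) grad f(x)^T."""
    d = len(points[0])
    m = [[0.0] * d for _ in range(d)]
    for x in points:
        g = numerical_gradient(f, x)
        for a in range(d):
            for b in range(d):
                m[a][b] += g[a] * g[b] / len(points)
    return m


def toy_predictor(x):
    # Hypothetical predictor (an assumption for this sketch):
    # it depends only on the first two of four input coordinates.
    return math.sin(x[0]) + 0.5 * x[1] ** 2


# Synthetic inputs, also an assumption: 200 points drawn uniformly from [-1, 1]^4.
rng = random.Random(0)
data = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(200)]

M = agop(toy_predictor, data)
for row in M:
    print(["%.3f" % v for v in row])
```

In this toy case the diagonal of `M` is nonzero only for the two coordinates the predictor actually uses, which is the sense in which such a matrix can highlight task-relevant input directions. The gradients here come from finite differences rather than backpropagation; any method of obtaining input gradients would serve the same role.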