| Field | Value |
|---|---|
| Funder | European Commission |
| Recipient Organization | Scuola Internazionale Superiore Di Studi Avanzati Di Trieste |
| Country | Italy |
| Start Date | Jan 01, 2025 |
| End Date | Dec 31, 2029 |
| Duration | 1,825 days |
| Number of Grantees | 1 |
| Roles | Coordinator |
| Data Source | European Commission |
| Grant ID | 101166056 |
Deep neural networks (DNNs) have revolutionised how we learn from data.
Rather than requiring careful engineering and domain knowledge to extract features from raw data, DNNs learn the relevant features for a task automatically from data.
In particular, high-order correlations (HOCs) of the data are crucial for both the performance of DNNs and the type of features they learn.
However, existing theoretical frameworks cannot capture the impact of HOCs: they either study “lazy” regimes where DNNs do not learn data-specific features, or they rely on the unrealistic assumption of Gaussian inputs devoid of non-trivial HOCs.

beyond2 will develop a theory for how and what neural networks learn from the high-order correlations of their data. We break the problem into two parts:

(i) *How?* We analyse the learning dynamics of neural networks trained by stochastic gradient descent to unveil the mechanism by which they learn from HOCs efficiently, i.e. with the minimum amount of training data and learning time required to attain satisfactory predictive performance.

(ii) *What?* Our preliminary research suggests that neural filters are primarily determined by the “principal components” of HOCs. We investigate how these principal components relate to fundamental data properties, such as symmetries of the inputs.

We attack these problems by extending methods from statistical physics and high-dimensional statistics to handle non-Gaussian input distributions. Studying the interplay between data structure and learning dynamics will allow us to understand how specific learning mechanisms, like attention or recursion, are able to unwrap HOCs.

By shifting the focus from unstructured to non-Gaussian data models, beyond2 will yield new insights into the inner workings of neural networks.
These insights will bring theory closer to practice and might facilitate the safe deployment of neural networks in high-stakes applications.
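To make the central distinction concrete: a Gaussian input distribution is fully described by its first two moments, so its higher-order cumulants vanish, whereas non-Gaussian data carries extra information in those cumulants. The minimal sketch below (not from the grant text; the Laplace distribution and sample size are illustrative choices) estimates the simplest such cumulant, the excess kurtosis, for Gaussian versus Laplace samples with matched mean and variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Gaussian inputs: all structure lives in the first two moments,
# so higher-order cumulants such as excess kurtosis are ~0.
gaussian = rng.standard_normal(n)

# A simple non-Gaussian input with the same mean (0) and variance (1):
# Laplace with scale 1/sqrt(2), whose variance is 2 * scale**2 = 1.
laplace = rng.laplace(scale=1 / np.sqrt(2), size=n)

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

print(excess_kurtosis(gaussian))  # close to 0
print(excess_kurtosis(laplace))   # close to 3, the Laplace value
```

Both samples look identical to any method that only uses means and covariances; a theory that captures feature learning from HOCs has to account for exactly the kind of signal the second print statement exposes.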