MECO47-ForUkraine

Abstracts > Monday 13

Jean Barbier

Statistical limits of dictionary learning: the spectral replica method

Abstract. We consider increasingly complex models of matrix denoising and dictionary learning in the Bayes-optimal setting, in the challenging regime where the matrices to infer have a rank growing linearly with the system size. This is in contrast with most existing literature, which is concerned with the low-rank (i.e., constant-rank) regime. We first consider a class of rotationally invariant matrix denoising problems whose mutual information and minimum mean-square error are computable using techniques from random matrix theory. Next, we analyze the more challenging models of dictionary learning. To do so, we introduce a novel combination of the replica method from statistical mechanics with random matrix theory, which we call the spectral replica method. This allows us to derive variational formulas for the mutual information between hidden representations and the noisy data of the dictionary learning problem, as well as for the overlaps quantifying the optimal reconstruction error. The proposed method reduces the number of degrees of freedom from O(N²) matrix entries to O(N) eigenvalues (or singular values), and yields Coulomb gas representations of the mutual information that are reminiscent of matrix models in physics. The main ingredients are large deviation results for random matrices, a new replica symmetric decoupling ansatz at the level of the probability distributions of eigenvalues (or singular values) of certain overlap matrices, and the use of Harish-Chandra-Itzykson-Zuber spherical integrals.

Tomaso Aste

Network causality representation for complex systems modeling

Abstract. Uncovering causal relations between observables is crucial to scientific discovery. However, establishing causality from observations is hard. In complex systems, where a large number of interrelated variables contribute to the system's emergent behavior, the problem becomes harder still. There are many challenges, the main one being the intrinsically multivariate nature of the problem combined with the absence of a unique, clear definition and quantification of information flow beyond pairwise interactions. I will present a novel approach to construct sparse network representations of causality relations in systems with many variables. I will show how such network representations can be used directly to estimate multivariate probabilities from data, even in cases with a large number of variables and few observations. I will discuss how the network structure conveys meaningful information about the system's organization and functioning. Examples from financial and biological systems will be presented.

Carlo Lucibello

Deep learning via message passing algorithms based on belief propagation

Abstract. Message-passing algorithms based on the Belief Propagation (BP) equations constitute a well-known distributed computational scheme. It is exact on tree-like graphical models and has also proven to be effective in many problems defined on graphs with loops (from inference to optimization, from signal processing to clustering). The BP-based scheme is fundamentally different from stochastic gradient descent (SGD), on which the current success of deep networks is based. In this paper, we present and adapt to mini-batch training on GPUs a family of BP-based message-passing algorithms with a reinforcement field that biases distributions towards locally entropic solutions. These algorithms are capable of training multi-layer neural networks with discrete weights and activations with performance comparable to SGD-inspired heuristics (BinaryNet) and are naturally well-adapted to continual learning. Furthermore, using these algorithms to estimate the marginals of the weights allows us to make approximate Bayesian predictions that have higher accuracy than point-wise solutions.
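
The statement that BP is exact on tree-like graphical models can be checked numerically on the simplest tree, a chain. The sketch below is only an illustration of that one property: it is plain sum-product on a small chain with random positive factors, compared against brute-force enumeration. It is not the reinforced, mini-batch algorithm described in the abstract, and all names (`chain_bp_marginals`, `brute_force_marginals`, `unary`, `pairwise`) are hypothetical.

```python
import itertools
import numpy as np

def chain_bp_marginals(unary, pairwise):
    """Sum-product BP on a chain x_0 - x_1 - ... - x_{n-1}.

    unary:    (n, q) array of non-negative single-variable factors.
    pairwise: (n-1, q, q) array; pairwise[i][a, b] couples x_i = a and x_{i+1} = b.
    Returns an (n, q) array of single-variable marginals.
    """
    n, q = unary.shape
    fwd = np.ones((n, q))  # fwd[i]: message arriving at node i from the left
    bwd = np.ones((n, q))  # bwd[i]: message arriving at node i from the right
    for i in range(1, n):
        m = (unary[i - 1] * fwd[i - 1]) @ pairwise[i - 1]
        fwd[i] = m / m.sum()  # normalise for numerical stability
    for i in range(n - 2, -1, -1):
        m = pairwise[i] @ (unary[i + 1] * bwd[i + 1])
        bwd[i] = m / m.sum()
    beliefs = unary * fwd * bwd
    return beliefs / beliefs.sum(axis=1, keepdims=True)

def brute_force_marginals(unary, pairwise):
    """Exact marginals by summing over all q**n configurations."""
    n, q = unary.shape
    marg = np.zeros((n, q))
    for x in itertools.product(range(q), repeat=n):
        w = np.prod([unary[i, x[i]] for i in range(n)])
        w *= np.prod([pairwise[i, x[i], x[i + 1]] for i in range(n - 1)])
        for i in range(n):
            marg[i, x[i]] += w
    return marg / marg.sum(axis=1, keepdims=True)

# Small random instance: 5 binary variables with strictly positive factors.
rng = np.random.default_rng(0)
unary = rng.random((5, 2)) + 0.1
pairwise = rng.random((4, 2, 2)) + 0.1
bp = chain_bp_marginals(unary, pairwise)
exact = brute_force_marginals(unary, pairwise)
print(np.max(np.abs(bp - exact)))  # difference at floating-point level
```

On graphs with loops the same update rules are only approximate, which is the regime the abstract addresses.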

Bruno Loureiro

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

Abstract. Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks, which tend to get stuck in badly generalizing local minima. Here we investigate the cross-over between these two regimes in the high-dimensional setting, and in particular the connection between the so-called mean-field/hydrodynamic regime and the seminal approach of Saad & Solla. Focusing on the case of Gaussian data, we study the interplay between the learning rate, the time scale, and the number of hidden units in the high-dimensional dynamics of stochastic gradient descent (SGD). Our work builds on a deterministic description of SGD in high dimensions from statistical physics, which we extend and for which we provide rigorous convergence rates.

Aurélien Decelle

Phase transition in the Restricted Boltzmann Machines

Abstract. The restricted Boltzmann machine (RBM) is a generative model, typically used in the field of unsupervised machine learning, that has recently attracted a lot of interest in the physics community. Among the various reasons, RBMs can be defined as a bipartite Ising model, and the learned features can be easily investigated, in contrast to "deeper" learning models. In recent works, we have shown that the RBM undergoes multiple phase transitions during learning. In one of these works, we show that the training of RBMs on real data displays critical behavior during the learning process. This phenomenon had been predicted at a mean-field level before, and it is related to the alignment of the weight (or coupling) matrix with the first principal directions of the dataset. In this talk, I will describe this mechanism by deriving the mean-field theory and showing our latest numerical results characterizing the phase transitions encountered at the beginning of the learning. We will discuss how this phenomenon affects the learning procedure, in particular the effect of the MC mixing time during training, and we will discuss the values of the critical exponents on real datasets.
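
For readers unfamiliar with the bipartite Ising form mentioned above, the RBM is usually written with the energy function below. The notation (visible units v_i, hidden units h_a, weights w_{ia}, biases b_i and c_a, partition function Z) is the common textbook convention and is assumed here; the abstract itself does not fix any symbols.

```latex
E(\mathbf{v}, \mathbf{h})
  = -\sum_{i,a} v_i \, w_{ia} \, h_a
    - \sum_i b_i v_i
    - \sum_a c_a h_a ,
\qquad
p(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z} .
```

Couplings exist only between a visible and a hidden unit, never within a layer, which is what makes the model bipartite.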

Luca Leuzzi

Random lasers as complex disordered systems and a way to experimentally measure Replica Symmetry Breaking

Abstract. The experimental measurement of the complete equilibrium distribution of the overlap in a replica symmetry breaking thermodynamic phase has been a challenging objective since the introduction of the Parisi solution of the Sherrington-Kirkpatrick model. We tackle the problem in a spin-glass-related model in which the spins are actually light modes, established and coupled in an optically random medium by multiple light scattering. In the presence of external power pumping, this model reproduces the behavior of a particular kind of so-called random lasers, which we will term glassy random lasers. We first introduce a theory of multimode light amplification in random media. The leading model, derived from fundamental light-matter interaction, is a phasor spin-glass model with multi-mode mode-locking couplings, subject to an overall intensity constraint induced by gain saturation, i.e., a spherical complex multi-p-spin model. Through analytic theoretical approaches, numerical simulations, and experimental measurements, we investigate this class of random laser models, which display properties such as a lasing phase transition, ergodicity breaking, glassiness at high power pumping, energy condensation, and nonlinear mode-locking. Replica symmetry breaking theory allows us to identify a laser critical point and a glassy regime at high pumping. An intensity fluctuation overlap (IFO) parameter is introduced, measuring the correlation between intensity fluctuations of light waves. In mean-field fully connected spherical models, the IFO can be proved to be in one-to-one correspondence with the Parisi overlap, and it allows one to identify the laser transition and the high-pumping glassy phase purely in terms of emission spectra data, the only data so far accessible in random laser experimental measurements. Although phasor configurations are not accessible, intensity configurations can thus be observed by means of emission spectra.
Indeed, by investigating pulse-to-pulse fluctuations in organic solid random lasers, one can construct the distribution of intensity fluctuation overlaps, which yields evidence of a transition to a glassy light phase compatible with replica symmetry breaking. To bridge exact analytic results and coarse-grained experimental results, numerical simulations of random laser models are presented. Going beyond the fully connected approximation, in a diluted interaction network, a breakdown of energy equipartition among light modes is observed right at the glass transition point.

Ohad Shpielberg

A universal power law of entanglement witnesses in biased quantum many body systems

Abstract. The ground-state properties of quantum systems are unique. Here, we consider a bipartite quantum system, open or closed, that conserves charge. The charge ratio R is defined as the mean charge in subsystem A divided by the mean charge in subsystem B. At large R values, we show that the entanglement entropy follows a universal power-law decay. This universal power law is shown to hold also for the Rényi entropy as well as for other entanglement witnesses. The universal power law implies that one can entangle a small set of particles in a highly occupied system, suggesting the plausibility of manipulating or controlling such systems through a few degrees of freedom.

Ferenc Iglói

Griffiths singularities in random quantum magnets: Geometry of rare regions and extreme statistics of the excitations

Abstract. In many-body systems with quenched disorder, dynamical observables can be singular not only at the critical point but in an extended region of the paramagnetic phase as well. These Griffiths singularities are due to rare regions, which are locally in the ordered phase and contribute to a large susceptibility. Here, we study the geometrical properties of rare regions in the transverse Ising model with dilution or with random couplings and transverse fields. In diluted models, the rare regions are percolation clusters, while in random models the ground state consists of a set of spin clusters, which are calculated by the strong disorder renormalization method. We consider the so-called energy cluster, which has the smallest excitation energy, and calculate its mass and linear extension in one, two, and three dimensions. We find that the rare regions are not compact: for the diluted model, they are isotropic and tree-like, while for the random model they are quasi-one-dimensional. We also calculate the low-energy excitations in the one-dimensional random model with high numerical precision by the strong disorder renormalization group method and, for shorter chains, by free-fermion techniques. Asymptotically, the two methods give identical results, which are well fitted by the Fréchet limit law of the extremes of independent and identically distributed random numbers. Considering the finite-size corrections, the two numerical methods give very similar results, but they differ from the correction term for uncorrelated random variables, indicating that the weak correlations between low-energy excitations in random quantum magnets are relevant.
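
As a reminder of what the Fréchet limit law looks like, its cumulative distribution function in the standard three-parameter form is given below. The symbols (shape α, scale s, location x_0) are generic textbook notation assumed here; which variable and which parameter values are fitted is not specified in the abstract.

```latex
F(x) = \exp\!\left[ -\left( \frac{x - x_0}{s} \right)^{-\alpha} \right],
\qquad x > x_0, \quad \alpha > 0, \quad s > 0 .
```

This is the limiting distribution of suitably rescaled maxima of independent and identically distributed random variables whose distribution has a power-law tail.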

Gesualdo Delfino

Universality in nonequilibrium quantum dynamics

Abstract. A quantum quench dynamically generates a nonequilibrium state that, in the proximity of a critical point, yields universal time evolution. We show how to solve the key problem of determining the nonequilibrium state for the different universality classes, and then analytically determine the behavior of local observables at large times. One result of the theory is that, for systems with interacting excitation modes, the order parameter can exhibit oscillations that stay undamped in time. In particular, this is predicted to occur for a quench of the transverse field within the ferromagnetic phase of the Ising model in more than one spatial dimension, a case previously inaccessible to analytic treatment. If the quench is performed only in a subregion of the whole d-dimensional space occupied by the system, the time evolution occurs inside a light cone spreading away from the boundary of the quenched region. In this case, the additional condition for undamped oscillations is that the volume of the quenched region is extensive in all dimensions.
