# Physics in ML

# Agenda

## Follow the live stream on the BIDS YouTube channel.

## Schedule

**Time & Location:** May 29th - Berkeley Institute for Data Science (BIDS), Doe Library, UC Berkeley Campus

**Wi-Fi:** *eduroam*, *AirBears2* (UCB) or *CalVisitor* (for visitors)

8:15-8:30 Arrival and Registration

8:35-8:40 Logistics & Introduction

Josh Bloom, UC Berkeley Astronomy

8:40-8:50 Welcome and Introductory Remarks

UC Berkeley Provost Paul Alivisatos, Chemistry

### Producing & Discovering Dynamical Models

**Moderator, Laura Waller**

8:50-9:06 Data-driven methods for the discovery of governing equations

J. Nathan Kutz, UW

9:06-9:22 AI Feynman: a Physics-Inspired Method for Symbolic Regression

Silviu-Marian Udrescu, MIT

9:22-9:38 Learning physical interaction in many ways

Jiajun Wu, MIT

9:38-9:54 Data Driven Discretization for Partial Differential Equations

Stephan Hoyer, Google Research

9:54-10:10 Solving Astrophysical PDEs with Deep Neural Networks and TensorFlow

Milos Milosavljevic, The University of Texas at Austin

10:10 - 10:25 **Moderated Discussion**

10:25 - 10:35 *Break*

### Incorporating Physics directly into the Models

**Moderator, Fernando Pérez**

10:35-10:51 Machine learning for lattice gauge theory

Phiala Shanahan, MIT

10:51-11:07 Physics informed Machine Learning

Guofei Pang, Brown

11:07-11:23 Machine learning in high-energy particle physics experiments, from simulation, through reconstruction to physics analysis

Heather Gray, UC Berkeley/LBNL

11:23-11:39 FPGA-accelerated machine learning inference as a service for particle physics computing

Miaoyuan Liu, Fermilab

11:39-11:55 Physics Constrained Fluid Flow Prediction using Lyapunov's Method

Ben Erichson, UC Berkeley

11:55 - 12:11 Cosmology for Machine Learning

Uros Seljak, UC Berkeley/LBNL

12:11 - 12:25 **Moderated Discussion**

12:25 - 1 *Lunch*

### Generative Models

**Moderator, Eric Jonas**

1 - 1:16 Generative models as priors for signal denoising

Soledad Villar, NYU

1:16-1:32 Flow-based generative models for lattice field theory

Tej Kanwar, MIT

1:32-1:48 Putting Non-Euclidean Geometry to Work in ML: Hyperbolic and Product Manifold Embeddings

Frederic Sala, Stanford

1:48-2:04 Deducing Inference from Hyperspectral Imaging of Materials Using Deep Recurrent Neural Networks

Joshua Agar, Lehigh University

2:04-2:20 Improved learning for materials and chemical structures through symmetry, hierarchy and similarity

Bert de Jong, LBNL

2:20 - 2:36 CosmoGAN: Towards a cosmology emulator using Generative Adversarial Networks

Mustafa Mustafa, LBNL

2:36 - 2:52 Hybrid Physical - Deep Learning Models for Astronomical Inverse Problems

François Lanusse, UC Berkeley

2:52 - 3:07 **Moderated Discussion**

3:07 - 3:15 *Break*

### Learning with Physical Systems

**Moderator, Federica Bianco**

3:15-3:31 Physics-constrained Computational Imaging

Laura Waller, UC Berkeley

3:31-3:47 Spectral Inference Networks: Unifying Deep and Spectral Learning

David Pfau, DeepMind

3:47-4:03 Noise2Self: Blind Denoising by Self-Supervision

Joshua Batson, CZ Biohub

4:03-4:19 Reinforcement Learning for Materials Synthesis

Rama Vasudevan, Oak Ridge National Laboratory

4:19-4:35 Reinforcement Learning, Control, and Inference

Sergey Levine, UC Berkeley

4:35-4:51 Reducing simulation dependence with deep learning

Benjamin Nachman, LBNL

4:51-5:10 **Moderated Discussion & Wrap-up**

Group Walk to the Reception - Gather Restaurant (Downtown Berkeley)

5:30 - 7:30 Joint Day 1/Day 2 Reception

## Abstracts

Josh Agar, Lehigh University

**Deducing Inference from Hyperspectral Imaging of Materials Using Deep Recurrent Neural Networks**

Characterization of materials relies on measuring their stimuli-driven response after perturbation by an external energy source. These measurements generally involve continuously changing either the magnitude of the perturbation or the bandwidth/energy of the measured response, resulting in data with sequential or temporal dependence. Recent advances in high-speed sensors have allowed spectroscopic measurements to be conducted using a multitude of techniques (e.g., electron microscopy, atomic force microscopy, etc.) which also have high spatial resolution. Coupling spectroscopic characterization with imaging allows researchers to directly probe structure-property relations at relevant length and time scales. Despite a boom in these multidimensional spectroscopic imaging techniques, the size and complexity of the data being collected, coupled with the dearth of downstream analysis approaches, have limited the ultimate scientific contributions of these powerful experimental techniques. Here, we show how deep recurrent neural networks can be used to automate the extraction of physically important features concealed within “big” multichannel hyperspectral data, bringing them into focus for interpretation. Specifically, we will discuss the broad applicability of this approach to experimental techniques including piezoresponse measurements of ferroelectrics, discovery of new conduction mechanisms at charged domain walls, and atomically resolved electron energy loss spectroscopy of functional interfaces. The methodology developed paves the way for spectroscopic techniques wherein the conventional scientific method of designing targeted experiments aimed at a specific hypothesis is supplanted by approaches which collect all seemingly relevant data, which can then be automatically interpreted to identify a hypothesis for empirical testing.

Joshua Batson, Chan Zuckerberg Biohub

**Noise2Self: Blind Denoising by Self-Supervision**

We present a general framework for denoising high-dimensional measurements which requires no prior on the signal, no estimate of the noise, and no clean training data. The only assumption is that the noise exhibits statistical independence across different dimensions of the measurement, while the true signal exhibits some correlation. For a broad class of functions ("J-invariant"), it is then possible to estimate the performance of a denoiser from noisy data alone. This allows us to calibrate J-invariant versions of any parameterised denoising algorithm, from the single hyperparameter of a median filter to the millions of weights of a deep neural network. We demonstrate this on natural image and microscopy data, where we exploit noise independence between pixels, and on single-cell gene expression data, where we exploit independence between detections of individual molecules. This framework generalizes recent work on training neural nets from noisy images and on cross-validation for matrix factorization.
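
The calibration idea in this abstract can be illustrated with a small NumPy sketch. It is not the speaker's code: the 1D sine signal, the noise level, the candidate radii and the `donut_mean` helper are all assumptions made for the example, and a mean filter with its centre weight set to zero stands in for the median filter mentioned above. Because each output sample ignores the corresponding noisy input sample, the filter is J-invariant (with single samples as the partition), so the loss measured against the noisy data should differ from the true loss only by roughly the noise variance.

```python
import numpy as np

def donut_mean(x, r):
    """Average the 2*r neighbours of each sample, excluding the sample itself."""
    kernel = np.ones(2 * r + 1)
    kernel[r] = 0.0          # zero centre weight: output i never sees input i
    kernel /= kernel.sum()
    return np.convolve(x, kernel, mode="same")

# Synthetic data (assumed): a slow sine wave plus independent Gaussian noise.
rng = np.random.default_rng(0)
n, sigma = 100_000, 0.5
clean = np.sin(2 * np.pi * np.arange(n) / 2000)
noisy = clean + sigma * rng.normal(size=n)

radii = [5, 10, 20, 50, 100, 200, 400]
interior = slice(max(radii), n - max(radii))  # skip zero-padded edges

self_loss, true_loss = [], []
for r in radii:
    denoised = donut_mean(noisy, r)
    # Self-supervised loss: compares against the noisy data only.
    self_loss.append(np.mean((denoised - noisy)[interior] ** 2))
    # Ground-truth loss: needs the clean signal, unavailable in practice.
    true_loss.append(np.mean((denoised - clean)[interior] ** 2))

best_self = radii[int(np.argmin(self_loss))]
best_true = radii[int(np.argmin(true_loss))]
print("radius chosen from noisy data:", best_self)
print("radius chosen with ground truth:", best_true)
```

In this toy setting the filter radius is the single hyperparameter being calibrated; the same comparison could in principle be repeated for any denoiser whose output at a sample does not depend on that sample's own noisy value.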

Bert de Jong, LBNL

**Improved learning for materials and chemical structures through integration of physical constraints**

We will discuss efforts to develop generative machine learning approaches that can predict properties from structural information, but more importantly can also tackle the ‘inverse problem’ of deducing structural information given desired properties. To do this, we need to develop information-rich encoding and decoding techniques for three-dimensional and hierarchical structures. Our efforts are centered around marginalized graph kernel approaches and autoencoders, and utilizing tensor field networks to discover new scientific knowledge about structure-function relationships in chemical sciences.

Ben Erichson, UC Berkeley

**Physics Constrained Fluid Flow Prediction using Lyapunov's Method**

Dynamical systems are ubiquitous in science and technology. In many situations it is of interest to model the evolution of such a system over time. Deep learning is an emerging framework for this task, yet it often ignores physical insights into the system under consideration. This talk discusses a possible stability-promoting mechanism to improve the generalization performance of deep flow prediction models. Motivated by the idea of Lyapunov stability, our findings show that a stabilized model not only improves generalization performance, but also reduces sensitivity to the choice of tuning parameters.

Heather Gray, UC Berkeley/LBNL

**Machine learning in high-energy particle physics experiments, from simulation, through reconstruction to physics analysis**

Machine learning has become ubiquitous in high-energy experimental physics, transforming almost every aspect of the software. I will provide an overview of how machine learning is used in the ATLAS experiment at the Large Hadron Collider and illustrate this with selected examples including the simulation of the detector response, the reconstruction of the raw data and physics analysis. Perspectives about where machine learning might be used in the future in ATLAS will also be provided.

Stephan Hoyer, Google Research

**Data Driven Discretization for Partial Differential Equations**

Machine learning and differentiable programming offer a new paradigm for scientific computing, with algorithms tuned by machines instead of by people. I’ll show how these approaches can be used to improve the heuristics underlying numerical methods, particularly for discretizing partial differential equations. We use high resolution simulations to create training data, which we train convolutional neural nets to emulate on much coarser grids. By building on top of traditional approaches such as finite volume schemes, we can incorporate physical constraints, ensure stability and allow for extracting physical insights. Our approach allows us to integrate in time a collection of nonlinear equations in one spatial dimension at resolutions 4-8x coarser than is possible with standard finite difference methods.

Tej Kanwar, MIT

**Flow-based generative models for lattice field theory**

Flow-based generative models offer a means of sampling from complex, high-dimensional distributions. I will discuss recent work on applying these models in the context of MCMC sampling of lattice field theory distributions. Estimating observables over these (Boltzmann) distributions gives access to information about correlation functions and the spectra of quantum field theories. I will show results from flow models trained to sample the distribution for a two-dimensional scalar lattice field theory, and compare against standard simulation methods. Several developments are required to move towards lattice gauge theories, and I will discuss preliminary work in that direction.

Nathan Kutz, University of Washington

**Data-driven methods for the discovery of governing equations**

A major challenge in the study of dynamical systems is that of model discovery: turning data into models that are not just predictive, but provide insight into the nature of the underlying dynamical system that generated the data. This problem is made more difficult by the fact that many systems of interest exhibit parametric dependencies and diverse behaviors across multiple time scales. We introduce a number of data-driven strategies for discovering nonlinear dynamical systems, their coordinates and their control laws from data. We consider two canonical cases: (i) systems for which we have full measurements of the governing variables, and (ii) systems for which we have incomplete measurements. For systems with full state measurements, we show that the recent sparse identification of nonlinear dynamical systems (SINDy) method can discover governing equations with relatively little data and introduce a sampling method that allows SINDy to scale efficiently to problems with multiple time scales and parametric dependencies. We can also regress to data-driven control laws that are capable of learning how to control a given system. Together, our approaches provide a suite of mathematical strategies for reducing the data required to discover, model and control nonlinear systems. The methods are demonstrated on optical fiber lasers and meta-material antennas.
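
The sparse-regression idea behind SINDy can be sketched in a few lines of NumPy. This is a minimal illustration rather than the speaker's implementation: the toy damped oscillator, the polynomial candidate library, the `stlsq` helper name and the threshold value are all assumptions chosen for the example. The sketch fits the time derivatives against a library of candidate terms and repeatedly zeroes out small coefficients (sequentially thresholded least squares).

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: sparse xi with theta @ xi ~ dxdt."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        # Refit each state variable using only its surviving library terms.
        for k in range(dxdt.shape[1]):
            big = ~small[:, k]
            if big.any():
                xi[big, k] = np.linalg.lstsq(
                    theta[:, big], dxdt[:, k], rcond=None
                )[0]
    return xi

# Toy system (assumed): dx/dt = -0.1 x + 2 y, dy/dt = -2 x - 0.1 y.
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
t = np.linspace(0.0, 10.0, 2001)
x = 2.0 * np.exp(-0.1 * t) * np.cos(2.0 * t)   # analytic solution, x(0) = 2
y = -2.0 * np.exp(-0.1 * t) * np.sin(2.0 * t)  # analytic solution, y(0) = 0
X = np.column_stack([x, y])

# Derivatives are estimated from the sampled states by finite differences.
dXdt = np.gradient(X, t[1] - t[0], axis=0)

# Candidate library: [1, x, y, x^2, x*y, y^2].
theta = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
xi = stlsq(theta, dXdt)
print(np.round(xi, 3))  # rows: library terms, columns: dx/dt and dy/dt
```

Only the rows for `x` and `y` should remain non-zero here, which is the sense in which the regression "discovers" the governing equations from data.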

Miaoyuan Liu, Fermilab

**FPGA-accelerated machine learning inference as a service for particle physics computing**

Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. The growing exploration of machine learning algorithms in particle physics offers new solutions for simulation, reconstruction, and analysis. These new machine learning solutions often lead to increased parallelization and faster reconstruction times on dedicated hardware, specifically Field Programmable Gate Arrays. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that requires minimal modification to the current computing model. As examples, we apply weight retuning to the ResNet-50 image classifier to demonstrate state-of-the-art performance for top jet tagging at the LHC, and transfer learning is applied to neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge) service, representing an improvement of ∼30× (175×) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600-700 inferences per second using an image batch of one, comparable to the throughput achieved using a GPU with a large batch size. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.

Milos Milosavljevic, The University of Texas at Austin

**Solving Astrophysical PDEs with Deep Neural Networks and TensorFlow**

Much of astrophysical modeling reduces to solving partial differential equations expressing conservation laws. Very recently, proofs of concept have been published that nonlinear PDEs can be solved by harnessing the universal functional approximation capacity of artificial neural networks and large-scale numerical optimization accelerated with GPUs. The neural network approach uses least-squares residual minimization to find approximate global solutions implicitly defined by the PDEs and the boundary conditions over an entire spacetime and parameter space domain. The approach is mesh-free and can thus solve high-dimensional PDEs. We discuss how the approach differs from the standard supervised machine learning. We present experiments carried out in the TensorFlow framework that test the limits of the neural network approach to solving PDEs.
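
The least-squares residual formulation can be shown on a deliberately tiny problem. The sketch below is an assumption-laden stand-in, not the method presented in the talk: it swaps the neural network for a polynomial basis (so the minimization becomes a linear least-squares solve in NumPy rather than a TensorFlow optimization), and it uses the toy ODE u'(t) = -u(t) with u(0) = 1 instead of an astrophysical PDE. What it keeps is the structure: the unknown solution is a parametrized function, and its parameters are chosen to minimize the squared residual of the equation and of the boundary condition at scattered points, with no mesh or time stepping.

```python
import numpy as np

# Toy problem (assumed): u'(t) = -u(t) on [0, 1], u(0) = 1; exact u = exp(-t).
degree = 8
t = np.linspace(0.0, 1.0, 50)   # collocation points
k = np.arange(degree + 1)

basis = t[:, None] ** k         # phi_k(t) = t**k
dbasis = np.zeros_like(basis)
dbasis[:, 1:] = k[1:] * t[:, None] ** (k[1:] - 1)  # phi_k'(t)

# Least-squares system: one row per collocation point for the residual
# u' + u = 0, plus one weighted row for the boundary condition u(0) = 1.
bc_weight = 10.0
A = np.vstack([dbasis + basis, bc_weight * (0.0 ** k)[None, :]])
b = np.concatenate([np.zeros(len(t)), [bc_weight]])

coef = np.linalg.lstsq(A, b, rcond=None)[0]
u = basis @ coef
max_err = np.max(np.abs(u - np.exp(-t)))
print("max abs error vs exp(-t):", max_err)
```

Unlike standard supervised learning, no solution values are supplied as training targets; the only "data" are the equation and its boundary condition.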

Mustafa Mustafa, Berkeley Lab

**CosmoGAN: Towards a cosmology emulator using Generative Adversarial Networks**

Inferring model parameters from experimental data is a grand challenge in cosmology. This often relies critically on high fidelity numerical simulations, which are prohibitively computationally expensive. The application of deep learning techniques to generative modeling is renewing interest in using high dimensional density estimators as computationally inexpensive emulators of fully-fledged simulations. These generative models have the potential to make a dramatic shift in the field of scientific simulations, but for that shift to happen we need to study the performance of such generators in the precision regime needed for science applications. To this end, in this work we apply Generative Adversarial Networks to the problem of generating weak lensing convergence maps. We show that our generator network produces maps that are described by, with high statistical confidence, the same summary statistics as the fully simulated maps.

Benjamin Nachman, Lawrence Berkeley National Laboratory

**Reducing simulation dependence with deep learning**

The analysis of data from fundamental physics experiments such as the Large Hadron Collider often relies heavily on simulation. While state-of-the-art simulations are excellent and can describe a wide range of physical processes, they are often approximations to nature and have components that are empirical models. With a growing interest in using deep learning to extract the most information from our data, there is a need to ensure that our techniques are robust to mismodeling. This is particularly important for high-dimensional learning where the key information can be distributed in subtle correlations across many dimensions of the feature space. I will give examples of how modern machine learning can be combined with physical insight to render classification, regression, generation, and anomaly detection models robust to mismodeling. I will use examples from LHC physics, but many of the methods have much broader applicability.

Guofei Pang, Brown University

**Physics informed Machine Learning**

TBD

David Pfau, DeepMind

**Spectral Inference Networks: Unifying Deep and Spectral Learning**

Spectral analysis is a foundational tool in physics, and underpins many machine learning methods. For finite-dimensional vector spaces, linear operators can be diagonalized by standard methods. However, diagonalizing linear operators on high-dimensional function spaces is a challenging numerical problem. In this talk I will describe Spectral Inference Networks (SpIN), which are a scalable method to approximate eigenfunctions of linear operators by stochastic gradient descent. In a machine learning context, Spectral Inference Networks can be seen as a generalization of Slow Feature Analysis (SFA), but without many of the shortcomings of classic SFA. In computational physics, Spectral Inference Networks are closely related to Variational Monte Carlo methods. I will show applications of SpIN to learning excited states of small quantum systems, interpretable features from video and eigenoptions for reinforcement learning.

Frederic Sala, Stanford

**Putting Non-Euclidean Geometry to Work in ML: Hyperbolic and Product Manifold Embeddings**

The quality of the representations achieved by embeddings is determined by how well the geometry of the embedding space matches the structure of the data. Euclidean space has been the workhorse for embeddings; recently, non-Euclidean spaces, often used in other scientific fields, have gained attention in ML due to their ability to better embed various types of structured data. In particular, hyperbolic embeddings offer excellent quality with few dimensions when embedding hierarchical data structures. We discuss several new approaches to producing hyperbolic embeddings. When data is not structured as uniformly, we propose learning embeddings in a product manifold combining multiple copies of the canonical model spaces (spherical, hyperbolic, Euclidean), providing a space of heterogeneous curvature suitable for a wide variety of structures.

Uros Seljak, UC Berkeley/LBNL

**Cosmology for machine learning**

I will give a broad overview of data analysis in cosmology and how it relates to ML, arguing that many of the analysis methods in cosmology can be related to ML. I will relate it to ML topics such as unsupervised and supervised learning, generative models, optimal data analysis, uncertainty quantification using Bayesian posterior analysis, inpainting, denoising and super-resolution reconstruction.

Phiala Shanahan, MIT

**Machine learning for lattice gauge theory**

I will describe several applications of machine learning to accelerate numerical studies of field theories, which are important in contexts from statistical mechanics and condensed matter physics to nuclear and particle physics. In particular, lattice field theory calculations require the evaluation of integrals over field configurations; typically this is done via importance sampling, with correctly-distributed samples of field configurations generated via a Markov chain Monte Carlo approach (MCMC). I will outline different approaches to this task based on machine learning, including using neural networks for the matching of field configurations at different scales in multi-scale approaches to MCMC, and alternatively using normalising flows for the direct sampling of field configurations. I will also outline the challenges in these applications, which include in particular scaling the algorithms to the systems of 10^12 variables which correspond to state-of-the-art calculations.

Silviu-Marian Udrescu, Massachusetts Institute of Technology

**AI Feynman: a Physics-Inspired Method for Symbolic Regression**

A core challenge for both physics and artificial intelligence (AI) is symbolic regression: finding a symbolic expression that matches data from an unknown function. Although this problem is likely to be NP-hard in principle, functions of practical interest often exhibit symmetries, separability and other simplifying properties. In this spirit, we develop a recursive, multidimensional symbolic regression algorithm, that combines neural network fitting with a suite of physics-inspired techniques. We apply it to 100 equations from the Feynman Lectures on Physics, and it is able to discover all of them, while previous publicly available software cracks at most 71.

Rama Vasudevan, Oak Ridge National Laboratory

**Reinforcement Learning for Materials Synthesis**

Reinforcement Learning (RL) has garnered renewed attention due to demonstrations of super-human performance in video and board games, which have arisen largely due to capabilities afforded by the marriage of deep neural networks with traditional reinforcement learning (‘deep RL’). However, the use of deep RL within most practical control situations is almost non-existent, due to difficulties with sample efficiency and the associated need for enormous volumes of training data. The situation is even more pronounced within materials science, where basic ML approaches are only now becoming adopted. In this talk, we will explore the landscape of using deep RL within materials synthesis, focusing on the task of growing thin films with desired morphologies by pulsed laser deposition. The problem is presented as a Markov Decision Process with incomplete information and delayed feedback, and is therefore a candidate for deep RL methods. Challenges associated with the appropriate state and reward functions are presented. Results on the use of deep Q learning for morphological control with a kinetic Monte-Carlo simulation are discussed. Success will require not only improvements in current policy learning methods but also advances in accelerating simulations of film growth on high performance computing environments, and automated synthesis platforms.

Soledad Villar, NYU

**Generative models as priors for signal denoising**

Deep neural networks are currently being used to produce good generative models for real world data. Such generative models have been successfully exploited to solve classical inverse problems like compressed sensing and super resolution, improving the state-of-the-art signal processing performance. In this talk we focus on the classical signal processing problem of image denoising. We analyze a simple toy model of feed-forward neural networks and propose a theoretical setting that uses spherical harmonics to identify what mathematical properties of the activation functions will allow signal denoising with local methods. Ongoing work applies these ideas to stellar spectral modelling.

Laura Waller, UC Berkeley

**Physics-constrained Computational Imaging**

Computational imaging involves the joint design of imaging system hardware and software, optimizing across the entire pipeline from acquisition to reconstruction. Computers can replace bulky and expensive optics by solving computational inverse problems. This talk will describe new microscopes that use computational imaging to enable 3D fluorescence and phase imaging using image reconstruction algorithms that are based on large-scale nonlinear non-convex optimization combined with unrolled neural networks.

Jiajun Wu, MIT

**Learning physical interaction in many ways**

The ability to understand physical interaction among objects lies at the core of human cognition; it is also essential in building intelligent machines that see and manipulate objects in the real world. In this talk, I'll present our recent work on using deep learning to approximate physical interaction, with a focus on graph networks. Our recent findings suggest that (i) learning systems can approximate physical interaction at various granularities, ranging from rigid bodies to deformable shapes to fluids, (ii) the learned physical model implicitly encodes the physical object properties that govern the interaction, and (iii) incorporating physics explicitly into learning systems leads to improvement in both performance and data-efficiency for robot manipulation.

## Poster Abstracts

Wenlei Chen, University of Minnesota

**Searching for Highly Magnified Stars at Cosmological Distances in Archival Hubble Galaxy-Cluster Imaging**

The detection of gravitational waves emitted from ~10-100 solar-mass black-hole mergers has revived the search for dark matter formed by primordial black holes (PBHs). If present in foreground galaxy-cluster lenses, PBHs should cause microlensing peaks that extremely magnify well-aligned background stars, allowing the high-redshift stars to be detected with the Hubble Space Telescope (HST). Since the microlensing event rate should be proportional to the density of microlenses times the magnification of the macromodel, searching for lensed stars behind clusters provides one of the few avenues to probe stellar-mass PBHs. Only four highly magnified stars have been discovered to date, and all were found behind powerful Hubble Frontier Field (HFF) galaxy clusters. Here we report the most recently discovered highly magnified star at redshift z = 0.94 in a strongly lensed arc behind an HFF cluster, MACS J0416.1-2403, discovered as part of a systematic archival search. In this ongoing search, we have found >30 bright transient candidates in the HFF fields that include known lensed stars and supernovae. Twenty-four of them, however, have not yet been publicly reported and need to be further identified. This search for transients is being extended to lower signal-to-noise microlensing candidates using machine-learning methods. We expect the study will improve constraints on the abundance of PBHs in the ~10-100 solar-mass regime.