Aims and Scope:
This is the annual workshop of the GAMM Activity Group on "Computational and Mathematical Methods in Data Science (COMinDS)". The goal is to bring together mathematicians and other scientists working (or interested) in the theory and applications of data science. The workshop will feature four keynotes, contributed talks and a poster session.
In case of Corona restrictions in the fall, we will have an online event.
Invited Speaker:
Gaussian Process (GP) regression is a popular nonparametric and probabilistic machine learning method. Notably, GPs have favorable characteristics for addressing some fundamental challenges that arise when combining learning algorithms with control. After a discussion of these challenges and a short tutorial on GP regression, I will present some of our recent results in GP-based learning control. In particular, I plan to talk about (i) dynamics model learning that also incorporates physical knowledge, (ii) controller optimization that combines simulation and real experiments, and (iii) new GP uncertainty bounds for safe learning. Some of the developed theory will be illustrated through experimental results on robotic hardware.
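To fix notation for the tutorial part, here is a minimal sketch of exact GP regression with a squared-exponential kernel; the data and hyperparameters are illustrative, not taken from the talk.

import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    # squared-exponential kernel k(x, x') = s^2 exp(-|x - x'|^2 / (2 l^2))
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

# hypothetical 1-D training data
X = np.linspace(0, 5, 8)
y = np.sin(X) + 0.1 * np.random.randn(8)
Xs = np.linspace(0, 5, 100)                 # test inputs
noise = 1e-2                                # observation noise variance

K = rbf(X, X) + noise * np.eye(len(X))
Ks = rbf(Xs, X)
alpha = np.linalg.solve(K, y)
mean = Ks @ alpha                           # posterior mean
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0, None))  # pointwise uncertainty, the basis of bounds for safe learning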
Dynamical modeling of a process is essential to study its dynamical behavior and to perform engineering studies such as control and optimization. With data becoming readily accessible, learning models directly from data has recently drawn much attention. Constructing simple and compact models that describe complex nonlinear dynamics is also desirable for efficient engineering studies on modest computer hardware. To achieve this goal, we build on the lifting principle: sufficiently smooth nonlinear systems can be rewritten as quadratic models in an appropriate coordinate system. We therefore focus on identifying coordinate systems in which a quadratic model can describe the dynamics. To determine such a coordinate system, we leverage the powerful expressive capabilities of deep learning, in particular autoencoders. Moreover, for several physical systems in which energy is preserved, we focus on identifying coordinate systems that also preserve energy. We demonstrate the methodologies for learning the desired coordinate systems for nonlinear dynamical models on illustrative examples.
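A minimal sketch of the autoencoder-based lifting idea, assuming discrete-time snapshot pairs (x_k, x_{k+1}) and an explicit Euler step for the latent quadratic model; the architecture sizes and dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class QuadLiftAE(nn.Module):
    def __init__(self, n, r):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n, 64), nn.Tanh(), nn.Linear(64, r))
        self.dec = nn.Sequential(nn.Linear(r, 64), nn.Tanh(), nn.Linear(64, n))
        # quadratic dynamics in the learned coordinates: z' = A z + H (z kron z)
        self.A = nn.Linear(r, r, bias=False)
        self.H = nn.Linear(r * r, r, bias=False)

    def latent_step(self, z, dt):
        kron = (z[:, :, None] * z[:, None, :]).flatten(1)   # z kron z
        return z + dt * (self.A(z) + self.H(kron))          # explicit Euler step

def loss(model, x, x_next, dt):
    z = model.enc(x)
    rec = ((model.dec(z) - x) ** 2).mean()                  # reconstruction error
    dyn = ((model.enc(x_next) - model.latent_step(z, dt)) ** 2).mean()  # quadratic-dynamics fit
    return rec + dyn

model = QuadLiftAE(n=3, r=6)                                # e.g. lift a 3-state system to 6 coordinates
x = torch.randn(16, 3); x_next = x + 0.01 * torch.randn(16, 3)
print(loss(model, x, x_next, dt=0.01))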
Convolutional neural networks (CNNs) are frequently used for image generation, see for instance [1,2]. In this context, it has been observed in practice that CNNs have a smoothing effect on images. While this is desirable for denoising, it also leads to unwanted blurring of edges. In this talk, we formalize this observation by rigorously showing that, under mild conditions, images generated from CNNs are continuous and in some cases even continuously differentiable. In particular, this implies that CNNs cannot generate sharp edges, which are a key feature of natural images. To prove these results, we first consider CNNs in function space, for which regularity results can be proven, and afterwards show that practically used CNNs are, indeed, proper discretizations of these function-space CNNs. Furthermore, we provide numerical experiments supporting our theoretical findings and suggest modeling approaches to avoid the issue. See [4] for a preprint of our work.
[1] V. Lempitsky, A. Vedaldi, and D. Ulyanov. Deep image prior. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9446–9454, 2018.
[2] V. Jain and S. Seung. Natural image denoising with convolutional networks. In Advances in Neural Information Processing Systems, volume 21, 2008.
[4] A. Habring and M. Holler. A note on the regularity of images generated by convolutional neural networks. arXiv preprint arXiv:2204.10588, 2022.
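As a toy illustration of the smoothing effect discussed above (our own example, not an experiment from [4]): convolving a sharp step with a small averaging filter, as a single CNN layer would, replaces the jump by a gradual ramp.

import numpy as np

step = np.r_[np.zeros(50), np.ones(50)]        # image row with a sharp edge
kernel = np.ones(5) / 5.0                      # simple averaging filter, one convolutional layer
smoothed = np.convolve(step, kernel, mode="same")
print(smoothed[46:54])                         # values ramp gradually instead of jumping from 0 to 1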
In the context of Koopman operator based analysis of dynamical systems, the generator of the Koopman semigroup is of central importance. Models for the Koopman generator can be used, among other things, for system identification, coarse graining, and control of the system at hand.
Bounds for the approximation and estimation error in this context are paramount to a better understanding of the method. In this talk, I will first discuss recent results on estimating the finite-data estimation error for Koopman generator models based on ergodic simulations. I will then present recent advances allowing for the approximation of the generator on tensor-structured subspaces by means of low-rank representations. This approach allows modelers to employ high-dimensional approximation spaces, while controlling the computational effort at the same time. Model applications to molecular dynamics simulation datasets will conclude the talk.
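For context, a schematic of the standard generator-EDMD estimator underlying such Koopman generator models (notation ours; sign and transposition conventions vary). For an SDE $dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t$ and a dictionary $\psi = (\psi_1,\dots,\psi_n)$, the generator acts as
\[
(\mathcal{L}\psi_i)(x) \;=\; b(x)\cdot\nabla\psi_i(x) \;+\; \tfrac12\,\sigma(x)\sigma(x)^\top : \nabla^2\psi_i(x),
\]
and from ergodic samples $x_1,\dots,x_m$ its matrix representation is estimated as
\[
\hat L \;=\; \hat C^{-1}\hat A,\qquad
\hat C = \frac1m\sum_{k=1}^m \psi(x_k)\psi(x_k)^\top,\qquad
\hat A = \frac1m\sum_{k=1}^m \psi(x_k)\,(\mathcal{L}\psi)(x_k)^\top .
\]
The finite-data estimation error referred to above concerns the deviation of $\hat C$ and $\hat A$ from their ergodic limits.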
In this presentation, we discuss new machine learning techniques suitable for solving ill-posed inverse problems. In particular, we deal with the task of reconstructing data from a collection of noisy measurements that are typically not sufficient to recover the ground truth uniquely. In this context, we propose a machine learning approach, based on adversarial training with optimal transport methods, that uses an unpaired set of data and measurements to learn an end-to-end reconstruction of the data given the measurements. Additionally, we show how to leverage the adversarial formulation of the approach to learn, as a byproduct, a regularizer for the inverse problem that encodes prior knowledge of the data. This regularizer is used to refine the end-to-end reconstruction by solving a variational problem that jointly penalizes the regularizer and the reconstruction error for a given observation. We showcase the potential of our approach by applying it to image reconstruction tasks in computed tomography.
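Schematically, the variational refinement step described above can be written as (notation ours)
\[
\hat x \;\in\; \operatorname*{arg\,min}_x \;\tfrac12\,\|Ax - y\|_2^2 \;+\; \lambda\, R_\theta(x),
\]
where $A$ is the forward operator, $y$ the noisy measurements, $R_\theta$ the learned adversarial regularizer, and $\lambda > 0$ weights the learned prior against the data fit.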
Breast cancer is the deadliest and most commonly diagnosed cancer in women globally. Early diagnosis and treatment of breast cancer increases the chance of five-year survival to 99%. Recent technological and computational advances have led to machine learning algorithms for the analysis of complex data, and such algorithms have been widely applied to breast cancer data. In this paper, we propose to implement machine learning algorithms, using a deep learning approach, for the automatic detection and prediction of breast cancer from mammogram images. To achieve this, we apply transfer learning to a deep learning architecture, the convolutional neural network (CNN). Two datasets of breast cancer images are analyzed using three CNN models in existing deep learning frameworks. The models perform a binary and a multiclass classification task on the images. Experimental results show that CNN models can accurately identify and predict breast cancer when provided with a large and balanced dataset.
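A minimal sketch of the kind of transfer-learning setup described above; the ResNet-50 backbone, the frozen feature layers, and the class counts are our illustrative assumptions, not the paper's exact models.

import torch.nn as nn
from torchvision import models

def build_model(num_classes):
    backbone = models.resnet50(weights="IMAGENET1K_V1")   # CNN pretrained on ImageNet
    for p in backbone.parameters():
        p.requires_grad = False                           # freeze the convolutional features
    # replace the classifier head; only this layer is trained on mammograms
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
    return backbone

binary_model = build_model(2)     # e.g. benign vs. malignant
multi_model = build_model(4)      # hypothetical multiclass task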
Given a nonnegative matrix X and a factorization rank r, nonnegative matrix factorization (NMF) approximates the matrix X as the product of a nonnegative matrix W with r columns and a nonnegative matrix H with r rows. NMF has become a standard linear dimensionality reduction technique in data mining and machine learning. In this talk, we first introduce NMF and show how it can be used in various applications, including image feature extraction and document classification. Then, we address the issue of non-uniqueness of NMF decompositions, also known as the identifiability issue, which is crucial in many applications. We finally discuss how the factors (W,H) can be computed. We illustrate these results in applications coming from hyperspectral imaging and analytical chemistry.
This is joint work with Maryam Abdolali and Robert Rajko.
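For concreteness, one standard way to compute the factors (W, H) is via Lee-Seung multiplicative updates; the sketch below is illustrative and not necessarily the algorithm discussed in the talk.

import numpy as np

def nmf(X, r, iters=200, eps=1e-10):
    m, n = X.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # multiplicative update keeps H nonnegative
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # multiplicative update keeps W nonnegative
    return W, H

X = np.abs(np.random.default_rng(1).random((50, 40)))
W, H = nmf(X, r=5)
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))  # relative approximation error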
Trigonometric functions can be evaluated efficiently based on the Fast Fourier Transform and related techniques. The computational cost is $\mathcal O(N \log N)$, where $N$ is the number of given nodes. Feature maps based on such functions are therefore well suited for big data analysis, where the number of data points is typically very large. However, the size of a full grid of Fourier coefficients grows exponentially with the number of features $d$; hence, classical FFT-based methods are only efficient in small dimensions. Recently, the use of truncated ANOVA (analysis of variance) decompositions has been proposed. Using small superposition dimensions helps to circumvent the curse of dimensionality. The corresponding feature maps can be applied in various machine learning algorithms, such as least-squares regression or support vector machines. The ANOVA idea makes the obtained model interpretable and helps to identify relevant features and connections between them, since Sobol indices and Shapley values are easily determined.
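A minimal sketch of such a feature map with superposition dimension one (univariate ANOVA terms only), used in least-squares regression; the frequencies and data are illustrative assumptions.

import numpy as np

def feature_map(X, freqs=(1, 2, 3)):
    # map each feature x_j to trigonometric terms cos(2 pi k x_j), sin(2 pi k x_j)
    feats = [np.ones((X.shape[0], 1))]                 # constant (ANOVA mean) term
    for j in range(X.shape[1]):                        # one block per feature -> interpretable model
        for k in freqs:
            feats.append(np.cos(2 * np.pi * k * X[:, j:j+1]))
            feats.append(np.sin(2 * np.pi * k * X[:, j:j+1]))
    return np.hstack(feats)

X = np.random.rand(200, 6)                             # 200 points, 6 features
y = np.sin(2 * np.pi * X[:, 0]) + 0.5 * np.cos(4 * np.pi * X[:, 2])
Phi = feature_map(X)
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)         # least-squares regression on the features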
In recent decades, non-intrusive model reduction has developed into a promising approach to forecasting system dynamics, especially in cases where data are collected from experimental campaigns or proprietary software simulations. Hence, the use of non-intrusive modelling methods in combination with physics-based considerations could form a building block towards predictive Digital Twins in critical engineering applications. In this work, we present a method for non-intrusive model reduction, applied to fluid dynamics. The approach is based on the a priori known sparsity of the full-order system operators (e.g. of the discretized Navier-Stokes equations), which is dictated by grid adjacency information. In order to enforce this type of sparsity, we solve a "local", regularized least-squares problem for each degree of freedom on the grid, considering only the training data from adjacent nodes, thus making the computation and storage of the inferred full-order operators feasible. After constructing the non-intrusive, sparse full-order model, the Proper Orthogonal Decomposition is used to project it onto a reduced-dimension subspace. This approach differs from methods where data are first projected onto a low-dimensional manifold, since here the inference problem is solved for the original, full-order system. As an example, we consider the construction of a quadratic reduced-order model for predicting the flow field over a cylinder at a low Reynolds number. Results concerning the accuracy and predictive capabilities of the inferred reduced model are discussed in detail.
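A sketch of the "local" regularized least-squares step described above, for the linear part of the operator; the variable names, shapes, and ridge regularization are illustrative assumptions.

import numpy as np

def infer_sparse_operator(X, Xdot, adjacency, lam=1e-6):
    """X: (n_dof, n_snap) state snapshots; Xdot: their time derivatives;
    adjacency[i]: indices of grid nodes adjacent to node i (including i)."""
    n = X.shape[0]
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = adjacency[i]
        D = X[nbrs, :].T                                  # training data restricted to the stencil of node i
        coeffs = np.linalg.solve(D.T @ D + lam * np.eye(len(nbrs)),
                                 D.T @ Xdot[i, :])        # ridge-regularized local least squares
        A[i, nbrs] = coeffs                               # only adjacent entries are nonzero
    return A                                              # sparse full-order operator

# Afterwards, a POD basis U (n_dof x r) yields the reduced operator A_r = U.T @ A @ U.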
Gaussian processes (GPs) are a crucial tool in machine learning, and their use across different areas of science and engineering has increased given their ability to quantify the uncertainty in the model. The covariance matrices of GPs arise from kernel functions, which are crucial in many learning tasks, and these matrices are typically dense and large-scale. Depending on the dimension, even computing all their entries is challenging, and the cost of matrix-vector products scales quadratically with the dimension if no customized methods are applied. We present a matrix-free approach that exploits the computational power of the non-equispaced fast Fourier transform (NFFT) and is of linear complexity for fixed accuracy. With this, we can not only speed up matrix-vector multiplications with the covariance matrix, but also handle the derivatives needed for the gradient method, avoiding Hadamard products of the Euclidean distance matrix and the kernel matrix. Such products arise when differentiating kernels such as the squared-exponential kernel with respect to the length-scale parameter in the denominator of the exponential expression. Our method introduces a derivative kernel that is well suited for fast multiplication in place of the Hadamard product. Applying our NFFT-based fast summation technique to both the kernel and the derivative kernel allows for fast tuning of the hyperparameters.
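For the squared-exponential kernel, the length-scale derivative that gives rise to the Hadamard product reads
\[
k_\ell(x,y) = \exp\!\Big(-\frac{\|x-y\|^2}{2\ell^2}\Big),
\qquad
\frac{\partial k_\ell}{\partial \ell}(x,y) = \frac{\|x-y\|^2}{\ell^3}\, k_\ell(x,y),
\]
so that, entrywise, $\partial K/\partial\ell = \ell^{-3}\, D \circ K$ with $D$ the matrix of squared Euclidean distances. The derivative kernel on the right-hand side is the object to which the NFFT-based fast summation can be applied directly, avoiding the explicit Hadamard product.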
In recent years, machine learning has become a prevalent tool to provide predictive models in many applications. In this talk, we are interested in using such predictors to model relationships between variables of an optimization model in Gurobi. For example, a regression model may predict the demand for certain products as a function of their prices and marketing budgets, among other features. We are interested in building optimization models that embed the regression, so that the inputs of the regression are decision variables and the predicted demand can be satisfied.

We propose a Python package that aims at making it easy to insert regression models trained by popular frameworks (e.g., scikit-learn, Keras, PyTorch) into a Gurobi model. The regression model may be a linear or logistic regression, a neural network, or based on decision trees. The resulting optimization models are often hard to solve with current technology. We also present computational results on improvements that are specifically targeted at these types of models. In particular, we consider optimization models with embedded neural networks.
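A hypothetical usage sketch of such a package; the names below follow the open-source gurobi-machinelearning interface (add_predictor_constr), but the exact API and the toy data are assumptions for illustration only.

import numpy as np
import gurobipy as gp
from sklearn.linear_model import LinearRegression
from gurobi_ml import add_predictor_constr   # assumed package/function name

# synthetic historical data: (price, marketing budget) -> demand
rng = np.random.default_rng(0)
X_hist = rng.random((100, 2))
d_hist = 5.0 - 3.0 * X_hist[:, 0] + 2.0 * X_hist[:, 1] + 0.1 * rng.standard_normal(100)
reg = LinearRegression().fit(X_hist, d_hist)

m = gp.Model("pricing")
x = m.addMVar(2, lb=0.0, ub=1.0, name="inputs")        # x[0] = price, x[1] = budget: decision variables
d = m.addMVar(1, lb=-gp.GRB.INFINITY, name="demand")   # regression output as a model variable
add_predictor_constr(m, reg, x, d)                     # embeds d = reg(x) into the optimization model
m.addConstr(d >= 4.0)                                  # require that predicted demand is met (illustrative)
m.setObjective(x[0] - 0.5 * x[1], gp.GRB.MAXIMIZE)     # e.g. maximize price minus marketing cost
m.optimize()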
Spiking neural networks, also often referred to as the third generation of neural networks, carry the potential for a massive reduction in memory and energy consumption over traditional, second-generation neural networks. Inspired by the undisputed efficiency of the human brain, they introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware. To open the pathway toward engineering applications, we introduce this exciting technology in the context of continuum mechanics. However, the nature of spiking neural networks poses a challenge for regression problems, which frequently arise in the modeling of engineering sciences. To overcome this problem, a framework for regression using spiking neural networks is proposed. In particular, a network topology for decoding binary spike trains to real numbers is introduced, utilizing the membrane potential of spiking neurons. As the aim of this contribution is a concise introduction to this new methodology, several different spiking neural architectures, ranging from simple spiking feed-forward to complex spiking long short-term memory neural networks, are derived. Several numerical experiments directed towards regression of linear and nonlinear, history-dependent material models are carried out. A direct comparison with counterparts of traditional neural networks shows that the proposed framework is much more efficient while retaining precision and generalizability. All code has been made publicly available in the interest of reproducibility and to promote continued enhancement in this new domain.
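A toy sketch of the decoding idea mentioned above: a non-spiking leaky integrator reads a binary spike train into a real-valued membrane potential, which serves as the regression output; the parameters are illustrative, not those of the paper.

import numpy as np

def decode(spikes, beta=0.9, w=1.0):
    """spikes: (T,) binary spike train; returns the final membrane potential."""
    u = 0.0
    for s in spikes:
        u = beta * u + w * s        # leak plus weighted spike input; no reset, so u stays real-valued
    return u

spikes = np.array([0, 1, 1, 0, 1, 0, 0, 1])
print(decode(spikes))               # real-valued output read from the membrane state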
This work is at the interface of reduced-order modeling, data assimilation, and machine learning: we present a model that merges the Proper Orthogonal Decomposition (POD)-Galerkin reduction methodology with Physics Informed Neural Networks (PINNs) in order to solve inverse problems involving the Navier-Stokes equations (NSE).
The model constructs a POD-Galerkin ROM for the NSE (or the modified turbulent NSE modeled by the RANS or LES approaches), resulting in a reduced dynamical system. PINNs are then employed to solve the reduced-order system produced by the POD-Galerkin ROM. The inputs of the PINNs are time and the parameters, while their output is the vector of all reduced quantities, namely the reduced velocity, pressure, turbulence (if applicable), and convective terms.
The PINNs are trained by minimizing a loss function which corresponds to the (weighted) sum of the data loss and the reduced-equation losses. The features of PINNs allow for the inference of unknown physical quantities that appear in the reduced equations. The proposed model, named the POD-Galerkin PINN ROM, is also able to perform accurate forward computations for given input parameters despite the uncertainty in the problem.
The model is tested on two benchmark cases: the steady flow past a backward-facing step and the flow around a surface-mounted cubic obstacle. In both tests, the model is employed to approximate unknown parameters such as the physical viscosity. The POD-Galerkin PINN ROM proves accurate in solving these inverse problems.
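Schematically, the training loss described above takes the form (notation ours)
\[
\mathcal{L}(\theta)
= \omega_{\mathrm{data}} \sum_i \big\| a_\theta(t_i,\mu_i) - a_i \big\|^2
+ \omega_{\mathrm{eq}} \sum_j \big\| \dot a_\theta(t_j,\mu_j) - f\big(a_\theta(t_j,\mu_j);\mu_j\big) \big\|^2 ,
\]
where $a_\theta$ denotes the PINN output (the vector of reduced quantities), $f$ the POD-Galerkin right-hand side, and unknown physical quantities such as the viscosity enter $f$ as additional parameters trained jointly with $\theta$.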
In recent years, reinforcement learning (RL) has been identified as a potentially powerful optimization method for stochastic control problems. The mathematical underpinning of RL is the Markov Decision Process (MDP), which provides a formal framework for devising policies for optimal decision making under uncertainty. While RL is just one method for finding policies that solve the MDP, it is particularly useful when the reward signals are sampled, evaluative, and sequential. It is the machine learning method of choice for learning strategies in the context of dynamical systems. One early application of RL in fluid mechanics is to flow control problems, but research here is still in its infancy.
In this talk, I will give a brief introduction to RL, in particular policy gradient methods and their features. I will then discuss the problem of turbulence modelling, in particular for the discretization-filtered Navier-Stokes equations and highlight the mathematical difficulties. As a possible remedy, I will show how to formulate the task of finding an optimal closure model as an MDP, which we can solve via RL once we define the reward, state and action spaces. The environment and its transition function are given by the Discontinuous Galerkin Spectral Element solver FLEXI for the compressible Navier-Stokes equations. Since such optimization problems are quite resource-intensive and classical PDE schemes and RL methods pose disparate demands on hardware and HPC-aware design, I will briefly discuss the parallelization on hybrid architectures (on HAWK + AI extension at HLRS) and its potential for training at scale. For this coupled RL-DG framework, I will present how the RL optimization yields discretization-aware model approaches for the LES equations that outperform the current state of the art. While this is not a classical example from flow control, it shows the great potential of the “solver in the loop” optimization based on RL.
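For reference, the policy-gradient estimator at the core of such methods reads, in its basic REINFORCE form (notation ours),
\[
\nabla_\theta J(\pi_\theta)
= \mathbb{E}_{\tau \sim \pi_\theta}\Big[\sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R(\tau)\Big],
\]
where, in the closure-modeling MDP sketched above, the state $s_t$ is the discretization-filtered flow solution, the action $a_t$ sets the closure model parameters, and the reward $R$ rates the quality of the resulting LES solution (the precise definitions of these spaces are given in the talk).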