Considerable progress has been made in recent years in the development of machine learning (ML) potentials for atomistic simulations [1]. Neural network potentials (NNPs), which were introduced more than two decades ago [2], are an important class of ML potentials. While the first generation of NNPs was restricted to small molecules with only a few degrees of freedom, the...
Traditionally the “best” observations are those with the largest signal from the most tightly controlled systems. In a wide range of phenomena – the dance of proteins in function, femtosecond breaking of molecular bonds, the gestation of fetuses – tight control is neither feasible, nor desirable. Modern machine-learning techniques extract far more information from sparse random sightings...
Gaussian process regression (GPR) is a kernel-based regression tool with intrinsic uncertainty estimation, which makes it well-suited to natural science datasets. In Bayesian optimization, GPR is coupled with acquisition functions for an active learning approach, where models are iteratively refined by addition of new data points with high information content. This tutorial will use the BOSS...
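The loop described above (fit a GPR model, evaluate an acquisition function, add the most informative point, refit) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions and does not use or reproduce the BOSS API: the squared-exponential kernel, the lower-confidence-bound acquisition rule, the toy objective, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def next_point(x_train, y_train, x_grid, kappa=2.0):
    """Acquisition step: minimize the lower confidence bound on a grid."""
    mean, std = gp_posterior(x_train, y_train, x_grid)
    return x_grid[np.argmin(mean - kappa * std)]

def toy_objective(x):
    """Hypothetical stand-in for an expensive simulation or experiment."""
    return np.sin(3.0 * x) + 0.5 * x

def run_loop(n_iter=10):
    """Active-learning loop: refit the GP after each newly acquired point."""
    x_grid = np.linspace(0.0, 3.0, 301)
    x_train = np.array([0.2, 1.0, 2.8])
    y_train = toy_objective(x_train)
    for _ in range(n_iter):
        x_new = next_point(x_train, y_train, x_grid)
        x_train = np.append(x_train, x_new)
        y_train = np.append(y_train, toy_objective(x_new))
    return x_train, y_train
```

The parameter `kappa` trades exploration (large predicted uncertainty) against exploitation (low predicted mean); the intrinsic uncertainty estimate of GPR is what makes this trade-off possible.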
In this tutorial, we introduce the AI technique of symbolic regression, combined with compressed sensing for the identification of compact, interpretable models.
Specifically, we introduce the Sure-Independence Screening and Sparsifying Operator (SISSO), together with its recent variants.
The methodology starts from a set of candidate features, provided by the user, and it builds a tree of...
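The two ingredients named above, screening of a large candidate-feature space followed by a sparsifying search, can be illustrated with a small NumPy sketch. It is not the SISSO code: the operator set, the single level of feature construction, the exhaustive least-squares search, and every function name are assumptions made for this example.

```python
import itertools
import numpy as np

def expand_features(primary):
    """Build one level of candidate features from named primary features."""
    feats = dict(primary)
    for (na, a), (nb, b) in itertools.combinations(primary.items(), 2):
        feats[f"({na}*{nb})"] = a * b
        feats[f"({na}+{nb})"] = a + b
    for name, a in primary.items():
        feats[f"({name}^2)"] = a ** 2
    return feats

def screen(feats, response, keep):
    """Screening step: keep the features most correlated with the response."""
    def score(name):
        return abs(np.corrcoef(feats[name], response)[0, 1])
    return sorted(feats, key=score, reverse=True)[:keep]

def design(feats, combo, target):
    """Design matrix for the chosen features plus an intercept column."""
    return np.column_stack([feats[n] for n in combo] + [np.ones_like(target)])

def best_sparse_model(feats, names, target, dim):
    """Sparsifying step: exhaustive search over all `dim`-feature models."""
    best = None
    for combo in itertools.combinations(names, dim):
        X = design(feats, combo, target)
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        rmse = np.sqrt(np.mean((X @ coef - target) ** 2))
        if best is None or rmse < best[0]:
            best = (rmse, combo, coef)
    return best

def sisso_sketch(feats, target, max_dim=2, keep=3):
    """Alternate screening on the current residual with the sparse search."""
    selected = []
    residual = target - target.mean()
    model = None
    for dim in range(1, max_dim + 1):
        remaining = {n: f for n, f in feats.items() if n not in selected}
        selected += screen(remaining, residual, keep)
        model = best_sparse_model(feats, selected, target, dim)
        _, combo, coef = model
        residual = target - design(feats, combo, target) @ coef
    return model
```

Screening against the residual of the previous model, rather than against the target itself, is what lets a weakly correlated but complementary feature enter the candidate pool at the next dimension.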
In this tutorial, we discuss a neural network application in materials discovery. More specifically, we showcase how to accelerate the discovery of functional high-entropy alloys using a neural-network-based generative model together with an ensemble model for the regression task. The tutorial therefore consists of two parts. First, we discuss in detail how to construct an alloy generation scheme based on...
Atom probe tomography (APT) is a materials analysis technique that provides sub-nanometer resolution compositional mapping. The data take the form of a point cloud, often containing millions of atoms, in which each point is associated with an elemental identity. By interrogating the point cloud, the local composition of a material or a phase of a specific microstructural feature can be...
I will discuss recent progress in automated experiments in electron microscopy, ranging from feature discovery to physics discovery via active learning. The applications of classical deep learning methods in streaming image analysis are strongly affected by out-of-distribution drift effects, and approaches to minimize these are discussed. We further present invariant variational autoencoders as...
To quantify chemical segregation at multiple length scales in APT in a semi-automatic way, we propose a multi-stage strategy. First, we collect composition statistics from APT datasets for 2x2x2 nm voxels. These voxel compositions are then clustered in compositional space using Gaussian mixture models to automatically identify key phases. Next, based on this compositional classification we...
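The first two stages of this strategy (voxel composition statistics, then clustering in compositional space) can be sketched as follows. This is an illustration only, assuming a synthetic two-phase point cloud and a hand-written one-dimensional EM fit standing in for whichever Gaussian mixture implementation the authors use; all names and parameter values other than the 2 nm voxel size are hypothetical.

```python
import numpy as np

def voxel_compositions(positions, species, n_species, voxel_size=2.0):
    """Per-voxel atomic fractions from an (N, 3) point cloud given in nm."""
    idx = np.floor((positions - positions.min(axis=0)) / voxel_size).astype(int)
    shape = tuple(idx.max(axis=0) + 1)
    flat = np.ravel_multi_index(tuple(idx.T), shape)
    voxel_ids, inverse = np.unique(flat, return_inverse=True)
    counts = np.zeros((len(voxel_ids), n_species))
    np.add.at(counts, (inverse, species), 1.0)  # count atoms per voxel and species
    return counts / counts.sum(axis=1, keepdims=True)

def fit_gmm_1d(x, n_components=2, n_iter=100):
    """EM for a 1-D Gaussian mixture: weights, means, variances, labels."""
    means = np.quantile(x, np.linspace(0.1, 0.9, n_components))
    variances = np.full(n_components, x.var() + 1e-6)
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability
        log_p = (np.log(weights) - 0.5 * np.log(2.0 * np.pi * variances)
                 - 0.5 * (x[:, None] - means) ** 2 / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixture parameters from the responsibilities
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances, resp.argmax(axis=1)

def synthetic_two_phase(n_atoms=40000, box=20.0, seed=0):
    """Hypothetical test data: solute fraction 0.05 for x < 10 nm, else 0.40."""
    rng = np.random.default_rng(seed)
    positions = rng.uniform(0.0, box, size=(n_atoms, 3))
    p_solute = np.where(positions[:, 0] < box / 2, 0.05, 0.40)
    species = (rng.random(n_atoms) < p_solute).astype(int)
    return positions, species
```

Calling `fit_gmm_1d` on the solute column of `voxel_compositions(...)` assigns each voxel a phase label, which is the compositional classification on which the later stages build.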
Dual Phase (DP) steels are an important family of steel grades used widely in the automotive industry because of their beneficial properties such as high ultimate tensile strength (GPa range), low initial yield stress, and high early-stage strain hardening. The DP steel microstructure consists of soft ferritic grains, which are mainly responsible for ductility, and hard martensitic zones,...
In this poster, we present the research goals of a recently funded BiGmax project on learning the dynamics of scanning transmission electron microscopy (STEM) by incorporating physical consistency with phase-field models. The primary idea of this project is to develop machine learning (ML)-based modeling of an interpretable coarse-grained dynamic model utilizing in situ STEM video sequences...
The complexity of photoemission data is rapidly increasing, as new technological breakthroughs have enabled multidimensional parallel acquisition of multiple observables. Most of the community is currently using heterogeneous data formats and workflows.
We propose a new data format based on NeXus, a hierarchically organized HDF5 structure. The aim is to immediately enable preprocessed data...
Due to their ability to recognize complex patterns, neural networks can drive a paradigm shift in the analysis of materials science data. Here, we introduce ARISE, a crystal-structure identification method based on Bayesian deep learning. As a major step forward, ARISE is robust to structural noise and can treat more than 100 crystal structures, a number that can be extended on demand. While...
I will present our latest progress in using machine learning for solving non-linear solid mechanics. The presentation will be based on our published work here: https://www.nature.com/articles/s41524-021-00571-z
The ability to replicate results is a key characteristic of quality science, and is growing ever more important in light of the replication crisis [1, 2].
A study can rarely be repeated using only the minimalistic descriptions provided in the "Materials and methods" section in a paper.
It is therefore important to properly document the entire knowledge generation...
The increasing availability of data from the first three paradigms of science (experiments, theory, and simulations), along with advances in artificial intelligence and machine learning (AI/ML) techniques has offered unprecedented opportunities for data-driven science and discovery, which is the fourth paradigm of science. Within the arena of AI/ML, deep learning (DL) has emerged as a...
Variational methods are powerful tools in image processing.
Essentially, we seek a suitable mathematical model (a function) consisting of a data term and a prior, whose minimizer provides a solution to the task at hand and can be computed efficiently and reliably.
Typically this leads to non-smooth, high-dimensional optimization problems.
This talk deals with recent results...
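For orientation, the generic form of such a variational model can be written as below. The symbols are assumptions introduced here, not taken from the talk: f is the observed image, u the unknown, D the data term, R the prior (regularizer), and λ a positive weight. The second line gives a standard instance, the Rudin-Osher-Fatemi (total variation) denoising model, as one example of a non-smooth problem; it is not necessarily the model treated in the talk.

```latex
\begin{align}
  \hat{u} &\in \operatorname*{arg\,min}_{u} \; \mathcal{D}(u; f) + \lambda\, \mathcal{R}(u),
  \qquad \lambda > 0, \\
  \hat{u}_{\mathrm{ROF}} &= \operatorname*{arg\,min}_{u} \; \tfrac{1}{2} \lVert u - f \rVert_2^2
  + \lambda\, \lVert \nabla u \rVert_1 .
\end{align}
```

The total variation term is not differentiable wherever the gradient of u vanishes, and u has one unknown per pixel, which is why such problems are both non-smooth and high-dimensional.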
HyperSpy (https://hyperspy.org/) is an open-source Python package for the analysis of multi-dimensional datasets. In its fourteen years of existence, its community of developers has taken it from being a simple collection of scripts for electron energy-loss data analysis to become the core of a multidisciplinary software ecosystem. In this talk, I will present its evolution, ecosystem, main features...
Every day, experimental materials science data is being collected in thousands of laboratories around the world. However, the diversity of instruments, vendor software packages and (proprietary) data formats, lab cultures, and the focus mostly on new discoveries cause most of this data to end up in a black hole in terms of accessibility to the scientific community (including in many cases the...
The main focus of this tutorial is the FAIR sharing of materials science data and how to do it with NOMAD. We will cover the publication of new data as well as the exploration and download of NOMAD’s existing data, both through our browser-based interface and through the APIs.