Publications

This work introduces a novel framework to evaluate the quality of AI explanations in Graph Neural Networks (GNNs), ensuring that the materials discovered by AI are selected based on reliable and physically meaningful data patterns.

Authors: Ding Zhang, Siddharth Betala, Chirag Agarwal

Read Paper

LeMat-GenBench provides the first standardized framework to rigorously evaluate and compare how well different AI models generate new, stable chemical structures.

Authors: Siddharth Betala, Samuel P. Gleason, Félix Therrien, Rocío Mercado, Alexandre Duval

Read Paper

This critical research demonstrates that while adsorption energy is a key metric, AI discovery must also consider reaction kinetics and stability to truly identify viable catalysts.

Authors: Shahana Chatterjee, Alexander Davis, Yoshua Bengio, Alexandre Duval, Félix Therrien

Read Paper

LeMat-Bulk is a massive, cleaned database that merges multiple quantum chemistry sources to provide a high-quality foundation for training large-scale AI models for materials.

Authors: Martin Siron, Inel Djafar, Ali Ramlaoui, Félix Therrien, Alexandre Duval

Read Paper

This paper introduces and analyzes batching algorithms for Graph Neural Networks (GNNs), demonstrating that optimized dynamic batching can achieve up to a 12.5x speedup in training time, significantly accelerating AI-driven materials discovery.

Authors: Daniel T. Speckhard, Tim Bechtel, Sebastian Kehl, Jonathan Godwin, Claudia Draxl

Read Paper

LeMat-Synth is a toolbox that uses AI to automatically extract and standardize chemical synthesis protocols from millions of scientific papers to build comprehensive discovery databases.

Authors: Magdalena Lederbauer, Siddharth Betala, Ayush Jain, Alexandre Duval, Samuel P. Gleason

Read Paper

This work reviews state-of-the-art atomistic workflows and demonstrates how advanced data management enables the interoperability of experimental and computational research. 

Authors: Daniel T. Speckhard, Martin Kuban, Christoph T. Koch, Joseph F. Rudzinski, Claudia Draxl

Read Paper

LeMat-Traj provides a massive, standardized dataset of atomic trajectories to benchmark and improve the accuracy of machine learning models in predicting material dynamics.

Authors: Ali Ramlaoui, Martin Siron, Inel Djafar, Joseph Musielewicz, Alexandre Duval

This work introduces Catalyst GFlowNet, an AI framework that autonomously discovers high-performance catalysts for the hydrogen evolution reaction by navigating complex chemical spaces.

Authors: Lena Podina, Christina Humer, Alexandre Duval, Victor Schmidt, Yoshua Bengio

Read Paper

This work uses neural architecture search to optimize message-passing neural networks for predicting the physical properties of solids, achieving superior accuracy in band-gap and formation-energy regression.

Authors: Tim Bechtel, Daniel T. Speckhard, Jonathan Godwin, Claudia Draxl

Read Paper

This work introduces machine-learning models to extrapolate DFT calculations to the complete basis-set limit, enabling high-precision material property predictions while significantly reducing computational costs.

Authors: Daniel T. Speckhard, Christian Carbogno, Luca M. Ghiringhelli, Sven Lubeck, Matthias Scheffler, Claudia Draxl

Read Paper

This paper introduces a method to compute accurate energy Hessians using pretrained GNNs, enabling faster free energy corrections and transition state searches.

Authors: Brook Wander, Joseph Musielewicz, Raffaele Cheula, John R. Kitchin

Read Paper

This paper builds the largest validated dataset of gold nanoparticle syntheses using a hybrid LLM-based approach, uncovering how shape depends on precursor and protocol choices.

Authors: Sanghoon Lee, Kevin Cruse, Samuel P. Gleason, A. Paul Alivisatos, Gerbrand Ceder, Anubhav Jain

Read Paper

This paper presents a physics-based tool that extracts size and shape distributions of gold nanorods directly from UV-Vis spectra, enabling automated, high-throughput synthesis analysis and predictive modeling without relying on electron microscopy.

Authors: Samuel P. Gleason, Jakob C. Dahl, Mahmoud Elzouka, Xingzhi Wang, Dana O. Byrne, Hannah Cho, Mumtaz Gababa, Ravi S. Prasher, Sean Lubner, Emory M. Chan, A. Paul Alivisatos

Read Paper

This paper presents CuXASNet, a neural network trained on FEFF9 simulations that rapidly predicts Cu L-edge X-ray absorption spectra from atomic structures with near-DFT accuracy, enabling high-throughput screening and experimental analysis across diverse materials.

Authors: Samuel P. Gleason, Matthew R. Carbone, Deyu Lu, Jim Ciston

Read Paper

This paper benchmarks uncertainty quantification methods for GNN-predicted relaxed energies and shows that latent space distances—especially when engineered for rotational invariance—offer the most reliable and efficient uncertainty estimates.

Authors: Joseph Musielewicz, Janice Lan, Matt Uyttendaele, John R. Kitchin

Read Paper

This paper presents a random forest-based model trained on simulated dynamical electron diffraction patterns to predict crystal systems, space groups, and lattice constants, enabling fast, uncertainty-aware structure identification from 2D data in both simulated and experimental 4D-STEM settings.

Authors: Samuel P. Gleason, Alexander Rakowski, Stephanie M. Ribet, Steven E. Zeltmann, Benjamin H. Savitzky, Matthew Henderson, Jim Ciston, Colin Ophus

Read Paper

This paper presents a random forest model trained on simulated Cu L-edge XAS spectra to predict oxidation states directly from XAS and EELS data, enabling accurate, high-throughput analysis of mixed-valence copper materials across experimental and in situ settings.

Authors: Samuel P. Gleason, Deyu Lu, Jim Ciston

Read Paper

This paper offers a structured and in-depth guide to the field of Geometric Graph Neural Networks (GNNs) for 3D atomic systems, introducing a taxonomy of invariant, equivariant (Cartesian and spherical), and unconstrained models to help newcomers and practitioners better understand and navigate the landscape of geometric GNN architectures, applications, and future directions.

Authors: Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein

Read Paper

This paper introduces PhAST, a physics-aware, scalable, and task-specific GNN framework that significantly boosts accuracy and efficiency for catalyst discovery on OC20, enabling up to 40× speedups and CPU-based training.

Authors: Alexandre Duval, Victor Schmidt, Santiago Miret, Yoshua Bengio, Alex Hernandez-Garcia, David Rolnick

Read Paper

This paper presents a new way to generate stable bulk crystal structures using generative AI, building them step-by-step under chemical and physical constraints.

Authors: Mila AI4Science, Alex Hernandez-Garcia, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Yasmine Benabed, Michał Koziarski, Victor Schmidt

Read Paper

This paper presents a fine-tuned GPT-3 model that extracts complete, structured seed-mediated gold nanorod synthesis procedures from unstructured literature, enabling the creation of a high-quality, reusable dataset for downstream synthesis modeling and analysis.

Authors: Nicholas Walker, Sanghoon Lee, John Dagdelen, Kevin Cruse, Samuel Gleason, Alexander Dunn, Gerbrand Ceder, A. Paul Alivisatos, Kristin A. Persson, Anubhav Jain

Read Paper

This paper reviews how recent progress in generative AI models and the development of specialized scientific datasets are being used to design and discover new materials with specific properties.

Authors: Rana A Barghout, Zhiqing Xu, Siddharth Betala, Radhakrishnan Mahadevan

Read Paper

This paper presents GFlowNets as a general framework for AI-driven scientific discovery, enabling diverse, uncertainty-aware hypothesis generation, experimental design, and causal inference in domains with large, complex search spaces.

Authors: Moksh Jain, Tristan Deleu, Jason Hartford, Cheng-Hao Liu, Alex Hernandez-Garcia, Yoshua Bengio

Read Paper

This paper presents a high-throughput DFT workflow for modeling amorphous material surface reactions, using site clustering and automated NEB generation to predict etching barriers in systems like a-Si and a-C with minimal computation.

Authors: Martin Siron, Nita Chandrasekhar, Kristin A. Persson

Read Paper

This paper introduces FAENet, a lightweight and highly expressive GNN that achieves symmetry preservation through stochastic frame averaging rather than architectural constraints, enabling state-of-the-art performance and scalability in 3D materials modeling.

Authors: Alexandre Duval, Victor Schmidt, Alex Hernandez-Garcia, Santiago Miret, Fragkiskos D. Malliaros, Yoshua Bengio, David Rolnick

Read Paper

This paper introduces GFlowNets as a framework for sampling complex structured objects proportionally to reward, enabling diverse generation in tasks like molecule design and probabilistic inference.

Authors: Yoshua Bengio, Salem Lahlou, Tristan Deleu, Edward J. Hu, Mo Tiwari, Emmanuel Bengio

Read Paper

This paper presents a large-scale DFT study of tellurium-containing semiconductors, uncovering chemisorption trends and unique scaling relationships that suggest their potential as selective photocatalysts for CO₂ reduction.

Authors: Martin Siron, Oxana Andriuc, Kristin A. Persson

Read Paper

This paper presents an automated DFT workflow for systematically evaluating adsorption on semiconductor surfaces, demonstrated on zinc telluride to enable high-throughput screening for photocatalytic CO₂ reduction.

Authors: Oxana Andriuc, Martin Siron, Joseph H. Montoya, Matthew Horton, Kristin A. Persson

Read Paper