Cartesian Encoding Graph Neural Network for Crystal Structures Property Prediction: Application to Thermal Ellipsoid Estimation

1 Image Processing Group - Signal Theory and Communications Department,
Universitat Politècnica de Catalunya, Barcelona, Spain
2 Inorganic and Organic Chemistry Department and Institute of Theoretical and Computational Chemistry,
Universitat de Barcelona, Barcelona, Spain
3 Materials Chemistry Department, Faculty of Chemistry and Biology,
Universidad de Santiago de Chile, Santiago, Chile

CartNet efficiently predicts Anisotropic Displacement Parameters (ADPs) for crystal structures, significantly reducing computational costs and outperforming the tested theoretical methods while achieving state-of-the-art accuracy across diverse datasets.

Abstract

In the resolution of crystal structures by diffraction, thermal ellipsoids are a critical parameter that is usually more difficult to determine than atomic positions. These ellipsoids are quantified through the Anisotropic Displacement Parameters (ADPs), which provide critical insights into atomic vibrations within crystalline structures. ADPs reflect the thermal behaviour and structural properties of crystal structures. However, traditional methods to compute ADPs are computationally intensive. This paper presents CartNet, a novel graph neural network (GNN) architecture designed to predict properties of crystal structures efficiently by encoding the atomic structural geometry along the Cartesian axes together with the temperature of the crystal structure. Additionally, CartNet employs a neighbour equalization technique for message passing to help emphasise the covalent and contact interactions, and a novel Cholesky-based head to ensure valid ADP predictions. Furthermore, a rotational SO(3) data augmentation technique is applied during the training phase so that the model generalizes to unseen rotations. To validate this approach, an ADP dataset with over 200,000 experimental crystal structures from the Cambridge Structural Database (CSD) has been curated. The model significantly reduces computational costs and outperforms previously reported methods in ADP prediction by 10.87%, while demonstrating a 34.77% improvement over the tested theoretical computation methods. Moreover, we have applied CartNet to other established datasets covering different material properties, such as formation energy, band gap, total energy, energy above the convex hull, bulk moduli, and shear moduli. The proposed architecture outperformed previously reported methods by up to 7.71% on the Jarvis Dataset and up to 13.16% on the Materials Project Dataset, demonstrating CartNet's capability to achieve state-of-the-art results in several tasks.
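The Cholesky-based head and the SO(3) augmentation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the network emits six unconstrained numbers per atom, that a softplus keeps the diagonal of the lower-triangular factor positive, and that rotating a crystal by R maps positions to R x and ADP tensors to R U Rᵀ. All function names are hypothetical and this is not the authors' implementation.

```python
import math
import random

def softplus(x):
    # Numerically stable log(1 + exp(x)); positive for any moderate input.
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def cholesky_head(raw):
    """Map 6 unconstrained outputs to a symmetric positive-definite 3x3 matrix.

    Assumed layout (hypothetical): three diagonal terms, then three
    below-diagonal terms of the lower-triangular factor L.
    """
    d0, d1, d2, l10, l20, l21 = raw
    lower = [[softplus(d0), 0.0, 0.0],
             [l10, softplus(d1), 0.0],
             [l20, l21, softplus(d2)]]
    # U = L L^T is symmetric, and positive definite when diag(L) > 0.
    return matmul(lower, transpose(lower))

def random_rotation(rng):
    """Draw a rotation matrix uniformly from SO(3) via a random unit quaternion."""
    w, x, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [[1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
            [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
            [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)]]

def rotate_sample(rot, positions, adps):
    """Apply one rotation to Cartesian positions (R x) and ADP tensors (R U R^T)."""
    rot_t = transpose(rot)
    new_positions = [[sum(rot[i][k] * p[k] for k in range(3)) for i in range(3)]
                     for p in positions]
    new_adps = [matmul(matmul(rot, u), rot_t) for u in adps]
    return new_positions, new_adps

# Example: build one valid ADP tensor and rotate it together with its atom.
rng = random.Random(0)
U = cholesky_head([0.1, -0.3, 0.2, 0.5, -0.4, 0.05])
R = random_rotation(rng)
positions, adps = rotate_sample(R, [[0.0, 0.0, 1.0]], [U])
```

Because the output is built as L Lᵀ, it is symmetric and positive definite by construction, so every prediction corresponds to a physically meaningful ellipsoid. Rotating inputs and targets together keeps each training pair consistent under the augmentation.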

Results

ADP Dataset

We have curated a comprehensive dataset of over 200,000 experimental crystal structures from the Cambridge Structural Database (CSD) to validate our approach. The dataset focuses on high-quality structures with R-factors below 5%, no disorder, and a single type of molecule in the unit cell, ensuring reliable experimental ADPs for training and evaluation.

Results for the different tested architectures in the test split from the ADP Dataset. Best result in bold and second best underlined. Arrows indicate the direction of improvement for each metric.
| Method | MAE (Ų) ↓ | S₁₂ (%) ↓ | IoU (%) ↑ | #Params ↓ |
|---|---|---|---|---|
| eComFormer | 6.22 × 10⁻³ ± 0.01 × 10⁻³ | 2.46 ± 0.01 | 74.22 ± 0.06 | 5.55M |
| iComFormer | 3.22 × 10⁻³ ± 0.02 × 10⁻³ | 0.91 ± 0.01 | 81.92 ± 0.18 | 4.9M |
| CartNet | 2.87 × 10⁻³ ± 0.01 × 10⁻³ | 0.75 ± 0.01 | 83.56 ± 0.01 | 2.5M |
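As a quick check on the table above, the relative MAE reduction of CartNet over the second-best model, iComFormer, can be worked out from the mean values alone (the reported standard deviations are ignored here):

```latex
\[
\frac{\mathrm{MAE}_{\text{iComFormer}} - \mathrm{MAE}_{\text{CartNet}}}
     {\mathrm{MAE}_{\text{iComFormer}}}
= \frac{3.22 \times 10^{-3} - 2.87 \times 10^{-3}}{3.22 \times 10^{-3}}
\approx 0.1087 = 10.87\%
\]
```

This agrees with the 10.87% improvement in ADP prediction quoted in the abstract.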

CartNet vs DFT comparison


Thermal ellipsoid representations of the experimental ADPs for the 5,5'-dimethyl-2,2'-bipyrazine crystal structure (CSD refcode: ETIDEQ), compared with those predicted using DFT (PBE D3-BJ) and CartNet, respectively. Green regions represent the experimental values, red regions the predicted values, and grey regions the intersection between them. The number on each atom is the IoU between the experimental and the calculated ADP. A sample ellipsoid predicted with DFT and the same ellipsoid predicted with CartNet are highlighted. The parallelepiped represents the unit cell, and the red, green, and blue lines correspond to the a, b, and c unit cell axes.
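One way to picture the per-atom IoU is a Monte Carlo estimate of the overlap between two ellipsoids. The sketch below assumes each ADP matrix U defines the solid xᵀU⁻¹x ≤ 1 centred on the atom and that IoU is the ratio of intersection volume to union volume. The probability level and the volume computation used in the paper may differ, and the function names are hypothetical.

```python
import random

def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def inside(inv_u, p):
    # True when p^T U^{-1} p <= 1, i.e. p lies within the ellipsoid.
    return sum(p[i] * inv_u[i][j] * p[j]
               for i in range(3) for j in range(3)) <= 1.0

def ellipsoid_iou(u_a, u_b, n_samples=100_000, seed=0):
    """Estimate the IoU of two concentric ellipsoids by uniform sampling."""
    rng = random.Random(seed)
    inv_a, inv_b = inverse3(u_a), inverse3(u_b)
    # The half-extent of x^T U^{-1} x <= 1 along axis k is sqrt(U_kk),
    # so this box contains both ellipsoids.
    half = [max(u_a[k][k], u_b[k][k]) ** 0.5 for k in range(3)]
    inter = union = 0
    for _ in range(n_samples):
        p = [rng.uniform(-h, h) for h in half]
        in_a, in_b = inside(inv_a, p), inside(inv_b, p)
        inter += in_a and in_b
        union += in_a or in_b
    return inter / union if union else 0.0

# Two spheres with radii 0.1 and 0.2: the exact IoU is (0.1 / 0.2)**3 = 0.125.
small = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
large = [[0.04, 0.0, 0.0], [0.0, 0.04, 0.0], [0.0, 0.0, 0.04]]
iou = ellipsoid_iou(small, large)
```

Sampling is simple but noisy. A closed-form or voxel-based volume computation would be preferable when the metric has to be evaluated over a whole dataset.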

Comparative ADP results between CartNet and DFT for the 5,5'-dimethyl-2,2'-bipyrazine crystal structure (CSD refcode: ETIDEQ). DFT calculations were performed using three configurations: atomic relaxation with fixed volume (Vinet), atomic relaxation with fixed lattice, and full geometry optimization. DFT calculations used 56 CPU cores on MareNostrum 5, while CartNet used 1 GPU and 1 CPU core. Best results in bold. Arrows indicate direction of improvement.
| Method | MAE (Ų) ↓ | S₁₂ (%) ↓ | IoU (%) ↑ | Time (s) ↓ |
|---|---|---|---|---|
| DFT (Vinet) | 1.32 × 10⁻² | 3.09 | 57.33 | ~2.88 × 10⁶ |
| DFT (Fix Latt.) | 1.43 × 10⁻² | 4.12 | 70.75 | ~1.44 × 10⁶ |
| DFT (Full Opt.) | 3.25 × 10⁻³ | 0.49 | 86.27 | ~2.88 × 10⁶ |
| CartNet | 2.12 × 10⁻³ | 0.17 | 92.31 | ~10⁻² |
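To put the table in perspective, the short script below derives the relative MAE reduction and the wall-time ratio of CartNet against each DFT configuration. The numbers are copied from the table. The helper name `relative_improvement` is introduced here for illustration, and the timings are the approximate values reported above.

```python
# Mean MAE (Å^2) and approximate wall time (s), copied from the table above.
results = {
    "DFT (Vinet)": (1.32e-2, 2.88e6),
    "DFT (Fix Latt.)": (1.43e-2, 1.44e6),
    "DFT (Full Opt.)": (3.25e-3, 2.88e6),
    "CartNet": (2.12e-3, 1e-2),
}

def relative_improvement(baseline, new):
    """Percentage reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline - new) / baseline

cart_mae, cart_time = results["CartNet"]
summary = {}
for name, (mae, seconds) in results.items():
    if name == "CartNet":
        continue
    summary[name] = {
        "mae_gain_pct": relative_improvement(mae, cart_mae),
        "speedup": seconds / cart_time,
        "dft_days": seconds / 86_400,  # seconds in one day
    }
    print(f"{name}: MAE reduced by {summary[name]['mae_gain_pct']:.2f}%, "
          f"about {summary[name]['speedup']:.2e}x faster "
          f"({summary[name]['dft_days']:.1f} days of DFT wall time)")
```

For the full geometry optimization this gives a MAE reduction of about 34.77%, the figure quoted in the abstract, and a wall-time ratio on the order of 10⁸. The two timings were measured on different hardware (56 CPU cores versus 1 GPU and 1 CPU core), so the ratio is indicative rather than like-for-like.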

Results on other Crystal Structures Datasets

Jarvis Dataset

MAE results for the different tested architectures on the test split of the Jarvis dataset. Best result in bold and second best underlined. Arrows indicate the direction of improvement for each metric.
| Method | Form. Energy (meV/atom) ↓ | Band Gap (OPT) (meV) ↓ | Total Energy (meV/atom) ↓ | Band Gap (MBJ) (meV) ↓ | Ehull (meV) ↓ |
|---|---|---|---|---|---|
| Matformer | 32.5 | 137 | 35 | 300 | 64 |
| PotNet | 29.4 | 127 | 32 | 270 | 55 |
| eComFormer | 28.4 | 124 | 32 | 280 | 44 |
| iComFormer | 27.2 | 122 | 28.8 | 260 | 47 |
| CartNet | 27.05 ± 0.07 | 115.31 ± 3.36 | 26.58 ± 0.28 | 253.03 ± 5.20 | 43.90 ± 0.36 |

The Materials Project Dataset

MAE results for the different tested architectures on the test split of the Materials Project Dataset. Best result in bold and second best underlined. Arrows indicate the direction of improvement for each metric.
| Method | Form. Energy (meV/atom) ↓ | Band Gap (meV) ↓ | Bulk Moduli (log(GPa)) ↓ | Shear Moduli (log(GPa)) ↓ |
|---|---|---|---|---|
| Matformer | 21 | 211 | 0.043 | 0.073 |
| PotNet | 18.8 | 204 | 0.04 | 0.065 |
| eComFormer | 18.16 | 202 | 0.0417 | 0.0729 |
| iComFormer | 18.26 | 193 | 0.038 | 0.0637 |
| CartNet | 17.47 ± 0.38 | 190.79 ± 3.14 | 0.033 ± 0.00094 | 0.0637 ± 0.0008 |
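The headline percentages for these two benchmarks can be related to the tables with a few lines of arithmetic. The sketch below compares CartNet's mean MAE with the best previously reported value for each property. That choice of baseline is an assumption made here, not something the page states, and the helper names are illustrative.

```python
def relative_improvement(baseline, new):
    """Percentage reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline - new) / baseline

# (best previously reported MAE, CartNet mean MAE), copied from the two tables.
jarvis = {
    "Form. Energy": (27.2, 27.05),
    "Band Gap (OPT)": (122, 115.31),
    "Total Energy": (28.8, 26.58),
    "Band Gap (MBJ)": (260, 253.03),
    "Ehull": (44, 43.90),
}
materials_project = {
    "Form. Energy": (18.16, 17.47),
    "Band Gap": (193, 190.79),
    "Bulk Moduli": (0.038, 0.033),
    "Shear Moduli": (0.0637, 0.0637),
}

def gains(table):
    """Per-property relative improvement of CartNet over the best baseline."""
    return {name: relative_improvement(base, cart)
            for name, (base, cart) in table.items()}

for label, table in (("Jarvis", jarvis), ("Materials Project", materials_project)):
    per_property = gains(table)
    top = max(per_property, key=per_property.get)
    print(label)
    for name, pct in per_property.items():
        print(f"  {name}: {pct:.2f}%")
    print(f"  largest gain: {top} ({per_property[top]:.2f}%)")
```

Under this assumption the largest gains are about 7.71% (total energy, Jarvis) and 13.16% (bulk moduli, Materials Project), which coincide with the percentages quoted in the abstract. The gains on the remaining properties are smaller, and the shear moduli result ties with iComFormer.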

BibTeX

@Article{D4DD00352G,
      author ="Solé, Àlex and Mosella-Montoro, Albert and Cardona, Joan and Gómez-Coca, Silvia and Aravena, Daniel and Ruiz, Eliseo and Ruiz-Hidalgo, Javier",
      title  ="A Cartesian encoding graph neural network for crystal structure property prediction: application to thermal ellipsoid estimation",
      journal  ="Digital Discovery",
      year  ="2025",
      pages  ="-",
      publisher  ="RSC",
      doi  ="10.1039/D4DD00352G",
      url  ="http://dx.doi.org/10.1039/D4DD00352G",
      }