PRISM: Periodic Representation with multIscale and Similarity graph Modelling for enhanced crystal structure property prediction

1 Image Processing Group - Signal Theory and Communications Department,
Universitat Politècnica de Catalunya, Barcelona, Spain
2 Inorganic and Organic Chemistry Department and Institute of Theoretical and Computational Chemistry,
Universitat de Barcelona, Barcelona, Spain
3 Materials Chemistry Department, Faculty of Chemistry and Biology,
Universidad de Santiago de Chile, Santiago, Chile

Abstract

Crystal structures are characterised by repeating atomic patterns within unit cells across three-dimensional space, posing unique challenges for graph-based representation learning. Current methods often overlook essential periodic boundary conditions and multiscale interactions inherent to crystalline structures. In this paper, we introduce PRISM, a graph neural network framework that explicitly integrates multiscale representations and periodic feature encoding by employing a set of expert modules, each specialised in encoding distinct structural and chemical aspects of periodic systems. Extensive experiments across crystal structure-based benchmarks demonstrate that PRISM improves on state-of-the-art predictive accuracy for crystal property prediction.

Methodology

PRISM employs a collection of expert modules, each specialised in encoding complementary structural and chemical features of periodic systems. Every expert operates on a distinct graph topology to capture interactions at different spatial scales, thereby enabling PRISM to jointly model both local atomic environments and global lattice periodicity. Our framework constructs a dual-scale, multi-graph representation that is iteratively refined through an ensemble of experts.

PRISM Architecture Overview
A Overview of the PRISM architecture. Atomic and superatom embeddings are initialised via dedicated encoders and then updated by successive PRISM layers; the final representation is used to predict the target property. B The PRISM layer architecture. The representations are aggregated and fused using four expert modules: Cell (global lattice periodicity), Multiscale (atom-superatom bidirectional interactions), Atomistic (radius graph with periodic boundaries), and Similarity (feature-space graph with periodic boundaries).

Atomistic Expert

To accurately model the fundamental chemical bonds and local physical forces that govern crystalline stability and emergent properties, it is essential to capture interactions at the atomistic scale. This expert captures short-range atomic interactions by constructing a graph that connects two atoms whenever their minimum-image distance under periodic boundary conditions lies within a cutoff radius.

Atomistic Expert Graph
A Three reference atoms are highlighted, and a radius graph is created around each of them. The resulting lateral links connect adjacent layers of the material, allowing message passing to propagate across layers and correctly encode the geometry. B In this material, the cutoff does not reach lateral neighbours, so edges form mainly along the vertical direction. Message passing is then propagated only vertically, and the model cannot capture the lateral geometry; stacking additional message-passing layers does not restore the missing lateral pathways.
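The radius-graph construction described above can be illustrated with a minimal pure-Python sketch. It assumes fractional coordinates, a lattice given as three row vectors, and a search restricted to the 27 nearest image cells, which is adequate only when the cutoff is small compared with the cell and the cell is not strongly skewed. The function names `min_image_distance` and `radius_graph_pbc` are hypothetical; this is not the PRISM implementation.

```python
import itertools
import math

ORIGIN = (0.0, 0.0, 0.0)

def to_cartesian(frac, lattice):
    """Convert a fractional vector to Cartesian using row-vector lattice."""
    return tuple(sum(frac[k] * lattice[k][d] for k in range(3)) for d in range(3))

def min_image_distance(fi, fj, lattice, n_images=1):
    """Smallest distance between atom i and any periodic image of atom j.

    Assumption: integer shifts in [-n_images, n_images]^3 are enough to
    contain the closest image.
    """
    best = math.inf
    for shift in itertools.product(range(-n_images, n_images + 1), repeat=3):
        delta = tuple(fj[k] + shift[k] - fi[k] for k in range(3))
        best = min(best, math.dist(ORIGIN, to_cartesian(delta, lattice)))
    return best

def radius_graph_pbc(frac_coords, lattice, cutoff):
    """Directed edges (i, j, d) whose minimum-image distance d is <= cutoff.

    Simplification: edges from an atom to its own periodic images are skipped.
    """
    edges = []
    for i, j in itertools.permutations(range(len(frac_coords)), 2):
        d = min_image_distance(frac_coords[i], frac_coords[j], lattice)
        if d <= cutoff:
            edges.append((i, j, d))
    return edges

# Toy example: cubic cell of edge 4 with two atoms near opposite faces.
lattice = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
frac_coords = [(0.05, 0.5, 0.5), (0.95, 0.5, 0.5)]
edges = radius_graph_pbc(frac_coords, lattice, cutoff=1.0)
```

In this toy cell the two atoms are 3.6 apart inside the cell but only about 0.4 apart across the periodic boundary, so they are connected only because the minimum image is used.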

Similarity Expert

To address the propagation bottlenecks that the atomistic radius graph can induce, we introduce the Similarity expert. This expert propagates semantically relevant chemical and structural cues across distant yet similar atoms, capturing global correlations that spatial proximity alone cannot reveal. It explicitly models similarity by constructing a feature-space graph, where atoms are connected if the Euclidean distance between their learned feature embeddings is below a feature cutoff.

Similarity Expert Graph
A Two equivalent unit-cell choices lead to different similarity edges if periodicity is ignored, producing inconsistent graphs across equivalent cells. B Proposed periodic-invariant construction. Candidate neighbours (magenta) are first identified in feature space; for each candidate, we then select the minimal periodic image and add the corresponding edge (red), yielding consistent and invariant graphs. C Corner case in which no feature-similar atoms lie inside the reference unit cell. Similar atoms exist only as periodic replicas, resulting in an unconnected graph.
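The two-step construction in panel B (feature-space candidates first, minimal periodic image second) can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the helper names, the toy features and the wrapping of fractional offsets into [-0.5, 0.5) are assumptions, and the image search again covers only the 27 nearest cells.

```python
import itertools
import math

ORIGIN = (0.0, 0.0, 0.0)

def to_cartesian(frac, lattice):
    """Convert a fractional vector to Cartesian using row-vector lattice."""
    return tuple(sum(frac[k] * lattice[k][d] for k in range(3)) for d in range(3))

def nearest_image_vector(fi, fj, lattice, n_images=1):
    """Cartesian vector from atom i to the closest periodic image of atom j."""
    # Wrap the fractional offset into [-0.5, 0.5) so the result does not
    # depend on which periodic copy of each atom is listed in the cell.
    wrapped = [(fj[k] - fi[k]) - math.floor(fj[k] - fi[k] + 0.5) for k in range(3)]
    best = None
    for shift in itertools.product(range(-n_images, n_images + 1), repeat=3):
        vec = to_cartesian([wrapped[k] + shift[k] for k in range(3)], lattice)
        if best is None or math.dist(vec, ORIGIN) < math.dist(best, ORIGIN):
            best = vec
    return best

def similarity_graph(features, frac_coords, lattice, feat_cutoff):
    """Edges (i, j, vector) between atoms whose embeddings are closer than feat_cutoff."""
    edges = []
    for i, j in itertools.permutations(range(len(features)), 2):
        if math.dist(features[i], features[j]) < feat_cutoff:  # step 1: feature space
            vec = nearest_image_vector(frac_coords[i], frac_coords[j], lattice)  # step 2
            edges.append((i, j, vec))
    return edges

# Two equivalent descriptions of the same crystal: atom 1 is listed in
# neighbouring cells. The features are made-up placeholder embeddings.
lattice = [(5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 5.0)]
features = [(1.0, 0.0), (1.1, 0.0), (5.0, 5.0)]
frac_a = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.5, 0.5, 0.5)]
frac_b = [(0.1, 0.1, 0.1), (-0.1, 0.1, 0.1), (0.5, 0.5, 0.5)]
edges_a = similarity_graph(features, frac_a, lattice, feat_cutoff=0.5)
edges_b = similarity_graph(features, frac_b, lattice, feat_cutoff=0.5)
```

Both cell choices produce the same edges with the same edge vectors (up to floating-point rounding). Atom 2 has no feature-similar partner and stays unconnected, which corresponds to the corner case in panel C.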

Cell-Space Expert

The Cell-Space expert captures long-range cell-to-cell interactions, such as delocalised electronic coupling, collective lattice correlations, and boundary-driven surface effects, which atom-level graphs cannot represent. The design encodes global periodic repetitions by introducing a single superatom node. We construct a radius-based graph around the superatom node with a cutoff radius significantly larger than the atomistic radius to explicitly capture lattice repetitions at the superatom scale.

Cell-Space Expert Graph
A–B Two equivalent unit-cell choices produce the same set of replica connections because cell transformations preserve the relative distances and directions between the superatom and its periodic images. The only change comes from global rotations.
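A small sketch can make the invariance claim concrete. Assuming the superatom is linked to each of its own periodic replicas within a cell-level cutoff, the edge vectors are simply lattice translations, and two equivalent choices of basis vectors generate the same set. The function name `superatom_replica_vectors`, the cutoff value and the search range `n_max` are illustrative assumptions.

```python
import itertools
import math

ORIGIN = (0.0, 0.0, 0.0)

def superatom_replica_vectors(lattice, cutoff, n_max=2):
    """Lattice translations (excluding zero) no longer than cutoff.

    Each vector is an edge from the superatom to one of its periodic
    replicas. Assumption: n_max is large enough to reach every replica
    inside the cutoff for the given cell.
    """
    vectors = set()
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        if n == (0, 0, 0):
            continue
        vec = tuple(sum(n[k] * lattice[k][d] for k in range(3)) for d in range(3))
        if math.dist(vec, ORIGIN) <= cutoff:
            # Round so that tiny floating-point differences do not split entries.
            vectors.add(tuple(round(x, 8) for x in vec))
    return vectors

# The same cubic lattice described by two equivalent sets of basis vectors.
cubic = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 3.0)]
sheared = [(3.0, 3.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 3.0)]
replicas_cubic = superatom_replica_vectors(cubic, cutoff=3.5)
replicas_sheared = superatom_replica_vectors(sheared, cutoff=3.5)
```

Both descriptions yield the same six replica vectors, because the set of lattice translations does not depend on which basis is used to generate it.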

Multiscale Expert

To capture interactions across distinct structural scales, we introduce the Multiscale expert, which explicitly connects the global superatom representation with atomic-level embeddings without breaking the invariance between equivalent unit-cell representations. The primary objective is to enable efficient bidirectional information flow: local atomic features are aggregated into a global representation while global contextual information is distributed back to the atomic nodes.

Multiscale Expert Graph
A and B show two equivalent unit-cell representations of the same material. Because the connectivity depends only on the set of atomic embeddings and not on their coordinates, both representations yield the same graph and updates, making this expert invariant to unit-cell transformations.
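The bidirectional flow can be sketched with plain lists. The version below is a deliberately simplified stand-in: it assumes mean pooling for the atom-to-superatom direction and an additive broadcast for the superatom-to-atom direction, with no learned weights. The actual message functions used by PRISM are not specified in this section, and `multiscale_step` is a hypothetical name.

```python
def multiscale_step(atom_feats, super_feat):
    """One simplified atom <-> superatom exchange.

    Assumptions: mean pooling (atoms -> superatom) and additive broadcast
    (superatom -> atoms); no coordinates are used anywhere.
    """
    n_atoms = len(atom_feats)
    dim = len(super_feat)
    # Atoms -> superatom: pool all atomic embeddings into a global summary.
    pooled = [sum(h[d] for h in atom_feats) / n_atoms for d in range(dim)]
    new_super = [super_feat[d] + pooled[d] for d in range(dim)]
    # Superatom -> atoms: hand the global context back to every atom.
    new_atoms = [[h[d] + super_feat[d] for d in range(dim)] for h in atom_feats]
    return new_atoms, new_super

atoms = [[1.0, 0.0], [3.0, 2.0]]
superatom = [0.5, 0.5]
new_atoms, new_super = multiscale_step(atoms, superatom)
# Listing the atoms in a different order leaves the superatom update unchanged.
_, new_super_permuted = multiscale_step(atoms[::-1], superatom)
```

Because the update reads only the set of atomic embeddings, any change of unit cell that leaves those embeddings unchanged also leaves the result unchanged.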

Fusion of Experts

To integrate complementary information captured by the different expert modules, we introduce a learned fusion mechanism at each aggregation step. The fusion strategy separately handles the superatom and atomic node embeddings, reflecting their distinct roles within the model.

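For the superatom representation, we employ a gating mechanism that blends the Cell and Multiscale expert outputs, where \(\sigma\) denotes the sigmoid function and \(\alpha\) is a learnable gate parameter:
$$ \mathbf{h}_s^{(l+1)} = \sigma(\alpha) \mathbf{h}_s^{(\mathrm{cell})} + \left(1 - \sigma(\alpha)\right) \mathbf{h}_s^{(\mathrm{multiscale})} $$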

For atomic node representations, we employ a separate fusion scheme. We define learnable parameters \(\beta', \gamma', \delta'\), one per expert module (Atomistic, Similarity, and Multiscale). A softmax normalises them into weights that sum to one, so the fused embedding is a convex combination of the expert outputs. Here \(\mathbf{h}_i^{(\mathrm{feat})}\) denotes the output of the Similarity (feature-space) expert:
$$ [\beta, \gamma, \delta] = \text{Softmax}([\beta', \gamma', \delta']) $$
$$ \mathbf{h}_i^{(l+1)} = \beta \mathbf{h}_i^{(\mathrm{atomistic})} + \gamma \mathbf{h}_i^{(\mathrm{feat})} + \delta \mathbf{h}_i^{(\mathrm{multiscale})} $$

This weighted fusion lets each atomic embedding combine the geometric, chemical, and scale-dependent information provided by the distinct expert modules, with the balance between experts learned during training.
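The two fusion rules can be written out directly. The sketch below treats \(\alpha\) and \(\beta', \gamma', \delta'\) as plain floats rather than trainable parameters and represents embeddings as Python lists; the function names are illustrative only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_superatom(alpha, h_cell, h_multiscale):
    """Gated blend: sigma(alpha) * h_cell + (1 - sigma(alpha)) * h_multiscale."""
    gate = sigmoid(alpha)
    return [gate * c + (1.0 - gate) * m for c, m in zip(h_cell, h_multiscale)]

def fuse_atom(raw_weights, h_atomistic, h_feat, h_multiscale):
    """Convex combination of the three atom-level expert outputs.

    raw_weights holds (beta', gamma', delta') before normalisation.
    """
    beta, gamma, delta = softmax(raw_weights)
    return [
        beta * a + gamma * f + delta * m
        for a, f, m in zip(h_atomistic, h_feat, h_multiscale)
    ]

# alpha = 0 gives an even split between the Cell and Multiscale outputs.
fused_super = fuse_superatom(0.0, [2.0, 0.0], [0.0, 4.0])
# Equal raw weights give each atom-level expert a weight of one third.
fused_atom = fuse_atom([0.0, 0.0, 0.0], [3.0], [6.0], [9.0])
```

Since the softmax weights are positive and sum to one, the fused atomic embedding always lies inside the convex hull of the three expert outputs, and the sigmoid gate plays the same role for the two superatom-level outputs.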

Results

JARVIS Dataset

For tasks derived from the JARVIS dataset, we followed the methodology of Choudhary et al. in ALIGNN, using the same training, validation, and test splits.

MAE results for the different tested architectures in the test split from the JARVIS dataset. Best result in bold and second-best underlined.
| Method | Form. Energy (meV/atom) ↓ | Band Gap (OPT) (meV) ↓ | Total Energy (meV/atom) ↓ | Band Gap (MBJ) (meV) ↓ | Ehull (meV) ↓ |
|---|---|---|---|---|---|
| Matformer | 32.5 | 137 | 35 | 300 | 64 |
| PotNet | 29.4 | 127 | 32 | 270 | 55 |
| eComFormer | 28.4 | 124 | 32 | 280 | 44 |
| iComFormer | 27.2 | 122 | 28.8 | 260 | 47 |
| CartNet | 27.05 ± 0.07 | 115.31 ± 3.36 | 26.58 ± 0.28 | 253.03 ± 5.20 | 43.90 ± 0.36 |
| PRISM | 25.87 ± 0.36 | 109.26 ± 2.546 | 26.34 ± 0.38 | 236.49 ± 5.56 | 23.07 ± 0.62 |

The Materials Project Dataset

For tasks derived from The Materials Project, we followed the methodology of Yan et al. in Matformer.

MAE results for the different tested architectures on the test split of the Materials Project dataset. Best result in bold and second-best underlined.
| Method | Form. Energy (meV/atom) ↓ | Band Gap (meV) ↓ | Bulk Moduli (log(GPa)) ↓ | Shear Moduli (log(GPa)) ↓ |
|---|---|---|---|---|
| Matformer | 21 | 211 | 0.043 | 0.073 |
| PotNet | 18.8 | 204 | 0.040 | 0.065 |
| eComFormer | 18.16 | 202 | 0.0417 | 0.0729 |
| iComFormer | 18.26 | 193 | 0.038 | 0.0637 |
| CartNet | 17.47 ± 0.38 | 190.79 ± 3.14 | 0.033 ± 0.00094 | 0.0637 ± 0.0008 |
| PRISM | 16.59 ± 0.1 | 179.71 ± 1.58 | 0.033 ± 0.00094 | 0.0655 ± 0.0008 |

Matbench Dataset

MAE and RMSE results for Matbench dataset. Best result in bold and second-best underlined.
| Method | e_form MAE (meV/atom) ↓ | e_form RMSE (meV/atom) ↓ | jdft2d MAE (meV/atom) ↓ | jdft2d RMSE (meV/atom) ↓ |
|---|---|---|---|---|
| MODNet | 44.8 ± 3.9 | 88.8 ± 7.5 | 33.2 ± 7.3 | 96.7 ± 40.4 |
| ALIGNN | 21.5 ± 0.5 | 55.4 ± 5.5 | 43.4 ± 8.9 | 117.4 ± 42.9 |
| coGN | 17.0 ± 0.3 | 48.3 ± 5.9 | 37.2 ± 13.7 | 101.2 ± 55.0 |
| M3GNet | 19.5 ± 0.2 | - | 50.1 ± 11.9 | - |
| eComFormer | 16.5 ± 0.3 | 45.4 ± 4.7 | 37.8 ± 9.0 | 102.2 ± 46.4 |
| iComFormer | 16.5 ± 0.3 | 43.8 ± 3.7 | 34.8 ± 9.9 | 96.1 ± 46.3 |
| PRISM | 15.20 ± 0.31 | 30.43 ± 1.38 | 38.41 ± 12.44 | 97.90 ± 38.25 |

Fusion-weight analysis

We examine how PRISM distributes responsibility across experts and whether these distributions reflect the chemistry and physics of each target. To do so, we summarise the learned fusion weights by averaging across layers and random seeds, and then analyse the resulting allocations to the Atomistic, Similarity, Multiscale and Cell experts. The figure below reports the mean fusion weights for each property, separated into atom-level and cell-level contributions.

Two robust patterns are evident. First, for properties with a strong global or band-structure character, the cell-level pathway dominates: Band Gap (OPT) and Ehull place most weight on the Cell expert. Second, within atom-level fusion, energy-like targets favour the Atomistic and Multiscale experts, whereas the Similarity expert becomes more prominent for electronic and stability descriptors.

Although both targets are reported per atom in JARVIS, the model allocates weights differently to formation energy and total energy. For formation energy, the atom-level fusion is dominated by the Atomistic and Multiscale experts. Formation energy is defined as the difference between the energy of the compound and that of its elemental references; this reference normalisation cancels much of the atomic contribution, leaving local bonding, coordination, hybridisation and short interatomic distances as the primary determinants. By contrast, total energy per atom is an absolute quantity that still aggregates electrostatic and dispersion effects into the energy density experienced by each site. Consequently, PRISM assigns a larger share to the cell-level Multiscale pathway for total energy, where the superatom aggregator pools information across multiple neighbourhood radii to capture these extended interactions.

Fusion-weight analysis

Average fusion weights over layers and seeds on the JARVIS test set. Each subpanel shows atom-level (Multiscale, Atomistic, Similarity) and cell-level (Multiscale, Cell) contributions for a single target property.
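The averaging behind these summaries is a plain mean over every (seed, layer) pair. A minimal sketch, assuming the normalised atom-level weights are stored as one list per seed containing one \([\beta, \gamma, \delta]\) triple per layer; the numbers below are made-up placeholders, not values from the paper.

```python
def mean_fusion_weights(weights_by_seed):
    """Average per-layer weight vectors over all layers and all seeds."""
    rows = [layer for seed in weights_by_seed for layer in seed]
    n_experts = len(rows[0])
    return [sum(row[e] for row in rows) / len(rows) for e in range(n_experts)]

# Placeholder data: 2 seeds x 2 layers x 3 experts (Atomistic, Similarity, Multiscale).
weights_by_seed = [
    [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]],
    [[0.25, 0.25, 0.5], [0.5, 0.25, 0.25]],
]
mean_weights = mean_fusion_weights(weights_by_seed)
```

Because every individual triple sums to one, the averaged weights also sum to one and can be read as shares of the atom-level fusion.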

BibTeX

@article{sole2025prism,
  title={PRISM: Periodic Representation with multIscale and Similarity graph Modelling for enhanced crystal structure property prediction},
  author={Solé, Àlex and Mosella-Montoro, Albert and Cardona, Joan and Aravena, Daniel and Gómez-Coca, Silvia and Ruiz, Eliseo and Ruiz-Hidalgo, Javier},
  journal={arXiv preprint arXiv:2511.20362},
  year={2025}
}