Predicting Protein-Nucleic Acid Flexibility Using Persistent Sheaf Laplacians
Abstract
Understanding the flexibility of protein-nucleic acid complexes, often characterized by atomic B-factors, is essential for elucidating their structure, dynamics, and functions, such as reactivity and allosteric pathways. Traditional models such as Gaussian Network Models (GNM) and Elastic Network Models (ENM) often fall short in capturing multiscale interactions, especially in large or complex biomolecular systems. In this work, we apply the Persistent Sheaf Laplacian (PSL) framework for the B-factor prediction of protein-nucleic acid complexes. The PSL model integrates multiscale analysis, algebraic topology, combinatoric Laplacians, and sheaf theory for data representation. It reveals topological invariants in its harmonic spectra and captures the homotopic shape evolution of data with its non-harmonic spectra. Its localization enables accurate B-factor predictions. We benchmark our method on three diverse datasets, including protein-RNA and nucleic-acid-only structures, and demonstrate that PSL consistently outperforms existing models such as GNM and multiscale FRI (mFRI), achieving up to a 21% improvement in Pearson correlation coefficient for B-factor prediction. These results highlight the robustness and adaptability of PSL in modeling complex biomolecular interactions and suggest its potential utility in broader applications such as mutation impact analysis and drug design.