Transfer entropy and O-information to detect grokking in tensor network multi-class classification problems
Abstract
Quantum-enhanced machine learning, encompassing both quantum algorithms and quantum-inspired classical methods such as tensor networks, offers promising tools for extracting structure from complex, high-dimensional data. In this work, we study the training dynamics of Matrix Product State (MPS) classifiers applied to three-class problems, using both fashion MNIST and hyper-spectral satellite imagery as representative datasets. We investigate the phenomenon of grokking, where generalization emerges suddenly after memorization, by tracking entanglement entropy, local magnetization, and model performance across training sweeps. Additionally, we employ information theory tools to gain deeper insights: transfer entropy is used to reveal causal dependencies between label-specific quantum masks, while O-information captures the shift from synergistic to redundant correlations among class outputs. Our results show that grokking in the fashion MNIST task coincides with a sharp entanglement transition and a peak in redundant information, whereas the overfitted hyper-spectral model retains synergistic, disordered behavior. These findings highlight the relevance of high-order information dynamics in quantum-inspired learning and emphasize the distinct learning behaviors that emerge in multi-class classification, offering a principled framework to interpret generalization in quantum machine learning architectures.