Lifelong and Universal Machine Learning Potentials for Chemical Reaction Network Explorations
Abstract
Recent developments in computational chemistry facilitate the automated quantum chemical exploration of chemical reaction networks for the in silico prediction of synthesis pathways, yield, and selectivity. However, the underlying quantum chemical energy calculations require vast computational resources, which severely limits these explorations in practice. Machine learning potentials (MLPs) offer a way to increase computational efficiency while retaining the accuracy of the reliable first-principles data used for their training. Unfortunately, MLPs are limited in their ability to generalize within chemical (reaction) space if the underlying training data are not representative of a given application. Within the framework of automated reaction network exploration, where new reactants or reagents composed of any elements from the periodic table can be introduced, this lack of generalizability will be the rule rather than the exception. Here, we therefore study the benefits and drawbacks of two MLP concepts in this context. Whereas universal MLPs are designed to cover most of the relevant chemical space in their training, lifelong MLPs extend their adaptability through efficient continual learning of additional data. While the accuracy of the universal MLPs is not yet sufficient for reaction search trials without any fine-tuning, lifelong MLPs can reach chemical accuracy. We propose an improved learning algorithm for lifelong adaptive data selection that integrates new data efficiently while preserving previous expertise.