New Catalysis Descriptor Discovered by Interpretable Machine Learning
Symbolic regression, an interpretable machine learning approach, was used to derive a catalysis descriptor that predicts new oxide perovskites with improved oxygen evolution activity, as corroborated by experimental validation.
There has been a long-standing interest in the field of catalysis in identifying descriptors. Conventional descriptors derived from human knowledge, typically showing volcano scaling, have been influential in the field (Figure 1). For example, the seminal work on eg occupancy as a descriptor [Science 334, 1383 (2011)] for oxide perovskite catalysts has stimulated subsequent work, such as the O p band level descriptor [Nat. Comm. 4, 2439 (2013)] and the combined descriptors of t2g and eg occupancies and pd interaction [Nat. Comm. 11, 652 (2020)].
Descriptors are concise relationships between structure (or composition) and properties. We note that conventional descriptors (t2g and eg occupancies, O p band level and pd interaction) are based on human knowledge of physics and chemistry: the interaction between the adsorbate and the catalyst should be neither too strong nor too weak (the Sabatier principle), which leads to volcano scaling. In our paper in Nature Communications, "Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts", a machine learning approach challenges both human-knowledge-based descriptors and volcano scaling.
In this work, we synthesized and characterized over thirty oxides (23 perovskites and 11 non-perovskites), which were then studied by symbolic regression, an interpretable, glass-box machine learning approach (Figure 2). We derived an unprecedentedly simple descriptor, μ/t, where μ and t are the octahedral and tolerance factors, respectively. The performance of μ/t is comparable to that of the conventional eg occupancy descriptor (Figure 3). Since both μ and t are functions of ionic radii only, the descriptor lets catalyst design proceed without DFT calculations, making it much more efficient and simpler. The descriptor was then used to screen 3,545 candidates and identify four oxide perovskites with high oxygen evolution reaction (OER) activity. Among them, the Cs-containing oxide perovskites had never been reported in the literature but were successfully synthesized under the guidance of the new descriptor.
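To illustrate why the descriptor needs only ionic radii, here is a minimal Python sketch that evaluates μ/t for a single ABO3 composition. It assumes the standard textbook definitions, μ = r_B / r_O for the octahedral factor and t = (r_A + r_O) / (√2 (r_B + r_O)) for the Goldschmidt tolerance factor; the paper may treat radii differently (for example, by averaging over mixed A or B sites). The function names are hypothetical, and the SrTiO3 radii are approximate Shannon values used purely for illustration, not data from the study.

```python
import math

# Assumed O2- ionic radius in angstroms (approximate Shannon value).
R_OXYGEN = 1.40


def octahedral_factor(r_b: float, r_o: float = R_OXYGEN) -> float:
    """Octahedral factor mu = r_B / r_O."""
    return r_b / r_o


def tolerance_factor(r_a: float, r_b: float, r_o: float = R_OXYGEN) -> float:
    """Goldschmidt tolerance factor t = (r_A + r_O) / (sqrt(2) * (r_B + r_O))."""
    return (r_a + r_o) / (math.sqrt(2) * (r_b + r_o))


def mu_over_t(r_a: float, r_b: float, r_o: float = R_OXYGEN) -> float:
    """Descriptor mu/t, computed from ionic radii alone."""
    return octahedral_factor(r_b, r_o) / tolerance_factor(r_a, r_b, r_o)


if __name__ == "__main__":
    # Illustrative radii in angstroms for SrTiO3 (assumed values):
    # Sr2+ in 12-fold coordination, Ti4+ in 6-fold coordination.
    r_sr, r_ti = 1.44, 0.605
    print(f"mu   = {octahedral_factor(r_ti):.3f}")
    print(f"t    = {tolerance_factor(r_sr, r_ti):.3f}")
    print(f"mu/t = {mu_over_t(r_sr, r_ti):.3f}")
```

Because each evaluation is a handful of arithmetic operations, applying such a function to thousands of candidate compositions is essentially instantaneous, in contrast to descriptors that require an electronic-structure calculation for every candidate.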
The work, titled “Simple Descriptor Derived from Symbolic Regression Accelerating the Discovery of New Perovskite Catalysts”, appeared on July 14, 2020 in the journal Nature Communications (https://rdcu.be/b5BRM).