Traditionally, discovery in catalysis is an empirical process that relies mostly on human knowledge and intuition and often involves extensive, iterative trial-and-error screening rather than quantitative prediction. This is a consequence of the very large, highly multidimensional search space, in which even minor changes in catalyst structure can have dramatic effects on activity, selectivity and stability. There is therefore high potential for improvement in moving to more data-driven workflows in which all relevant data are used and machine learning tools guide, predict and explain experiments.
We have demonstrated this potential in several applications using a database that represents the chemical space of phosphorus-based ligands with high-level physicochemical descriptors. Even in the absence of experimental data, this database can be used to design ligand sets that maximize information on catalyst effects in the fewest reactions and that facilitate the autonomous optimization of reactions, as we have shown for several palladium-catalyzed cross-couplings. Regression or classification modelling with interpretable descriptors can offer insight into reaction mechanisms and precise ligand effects in catalysis, for example, a surprisingly general ligand effect on catalyst speciation in cross-coupling. Combined with virtual libraries, such models can provide specific predictions for improved ligand structures. We demonstrated this by suggesting ligands for enantiospecific alkyl-Suzuki coupling from a virtual library of 300 000 phosphorus ligands.
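To make the idea of designing a ligand set from descriptors alone more concrete, here is a minimal sketch of one generic way to pick a small, spread-out subset from a descriptor table: greedy max-min (farthest-point) selection on standardized descriptors. This is an illustrative assumption, not the method used in the work described above, which may rely on different sampling or clustering techniques. The function names and the random toy data are hypothetical and do not correspond to any real ligand database.

```python
import math
import random


def standardize(rows):
    """Scale each descriptor column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead of 0.0.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def select_diverse_subset(descriptors, n_select):
    """Greedy max-min selection: each pick is the point farthest from all picks so far."""
    if not 0 < n_select <= len(descriptors):
        raise ValueError("n_select must be between 1 and the number of ligands")
    z = standardize(descriptors)
    origin = [0.0] * len(z[0])
    # Start from the ligand closest to the centroid (the origin after scaling).
    first = min(range(len(z)), key=lambda i: math.dist(z[i], origin))
    selected = [first]
    # min_dist[i] is the distance from ligand i to its nearest selected ligand.
    min_dist = [math.dist(row, z[first]) for row in z]
    while len(selected) < n_select:
        nxt = max(range(len(z)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, math.dist(row, z[nxt])) for d, row in zip(min_dist, z)]
    return selected


# Hypothetical toy data: 200 "ligands" with 5 made-up descriptors each.
random.seed(0)
toy_descriptors = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
subset = select_diverse_subset(toy_descriptors, 8)
print(subset)
```

The returned indices identify ligands that cover the descriptor space broadly, so a small screening set samples very different steric and electronic regions. In practice, the choice of descriptors and of distance metric would matter at least as much as the selection algorithm.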
Tobias Gensch is a junior group leader at TU Berlin, where he leads a catalysis reaction discovery program that integrates computational chemistry, data science and experimental chemistry to guide and explain experiments with chemical space analysis and mechanistically interpretable models. During his PhD with Frank Glorius at WWU Münster, Tobias studied C–H activation with Rh- and Co-based catalysts. As a postdoc with Matt Sigman at the University of Utah, he worked on computational and statistical modelling of catalysts.
Watch a recording of the presentation below: