EXPLORING CHEMICAL REACTION SPACE: fast, automated, accurate....... choose two

Publication: Book/anthology/thesis/report, Ph.D. thesis, Research

The ability to predict the outcome of chemical reactions under a specific set of conditions is a central part of chemical research. Historically, this has primarily been the domain of expert chemists, who through education and experience have gained the ability to plan and perform syntheses. Quantum chemical calculations have proven useful for uncovering mechanisms by validating or discarding the mechanisms hypothesized by chemists, but they are generally not used to predict the outcome of chemical reactions in an unbiased manner. In this thesis, I present automated methods for discovering chemical reaction networks. Because the complexity of chemical reaction networks is high and grows with the size of the molecular system, these methods must be fast in addition to being accurate and automated. Therefore, as much of the automated search as possible is done using fast semi-empirical methods.
The need to combine speed and accuracy makes machine learning a natural candidate. Machine learning has quickly become an established part of the theoretical chemist's toolbox. Since machine learning models depend heavily on the data they were trained on, there is some concern that the high accuracy observed for these models is only achievable within a limited part of chemical space. Explainable AI is an area of research that tries to uncover the reasoning behind the predictions made by machine learning models. Another area of research concerns estimating the uncertainties of those predictions. Here, the idea is to get the model itself to indicate when it is unsure of a prediction because the training set lacks the relevant information.
Methods for both explainable AI and uncertainty estimation are difficult to evaluate, since the true answer is rarely available. As part of this thesis, I present a quantitative benchmark for evaluating explainable AI methods on regression tasks. I also demonstrate statistical methods for evaluating the quality of the uncertainty estimates produced by machine learning methods.
The development of methods for the automatic discovery of reaction networks is still in its early stages, and a method that works robustly and generally remains a long way off. Important steps have been taken in recent years, some of which are presented herein.
Original language: English
Number of pages: 157
Status: Published - 2022

ID: 332933854