Abstract
Although the origins of computer-aided synthesis planning (CASP) date back more than 40 years, the field has gained popularity only in the last decade, driven by increased computational power and the development of novel neural network architectures. Forward synthesis aims to find novel products with desired properties, whereas retrosynthesis seeks ways to produce these product molecules. Reaction templates are rules that describe how to transform a product molecule into a reactant molecule. After decades in which these templates were formulated mostly from hand-coded expert knowledge, recent publications allow templates for retrosynthesis to be extracted automatically. Template-based methods lack the capability to generalize to unseen templates, and template-free models can produce chemically invalid reactions and/or reactants. This work therefore proposes a template-based retrosynthesis framework intended to mitigate the disadvantages of template-based methods by using a modern Hopfield network, which has shown promising results in zero- and few-shot settings [38]. By using local templates, the model learns to predict reaction centers in addition to the template class. The goal is to show how different representations of reaction templates affect the predictive performance of the model and to identify the limiting factors. Although the proposed model does not outperform the baseline model, the modern Hopfield network was observed to improve performance for template classes that were rarely seen during training and to be capable of generalizing to unseen template classes. The best results were achieved with a template encoder that uses molecular fingerprints and a one-layer feed-forward neural network. For graph-based template encoders, the complexity of the encoded template graphs proved decisive for the ability to predict the correct template class.
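The retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the random binary vectors stand in for real molecular fingerprints, the weights are untrained, and the function names, the ReLU activation, the L2 normalisation and the inverse temperature `beta` are all assumptions made for illustration. The sketch encodes molecules and templates with one-layer feed-forward encoders and scores each template by a softmax over scaled dot products, in the style of a modern Hopfield association.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def encode(fingerprints, weights):
    """One-layer feed-forward encoder (ReLU) followed by L2 normalisation."""
    h = np.maximum(fingerprints @ weights, 0.0)
    norm = np.linalg.norm(h, axis=-1, keepdims=True)
    return h / np.maximum(norm, 1e-12)


def hopfield_template_scores(mol_fp, template_fp, w_mol, w_tmpl, beta=10.0):
    """Score every template for every molecule.

    Encoded molecules act as queries and encoded templates as the
    stored patterns; the result is a probability distribution over
    templates for each molecule.
    """
    queries = encode(mol_fp, w_mol)
    memory = encode(template_fp, w_tmpl)
    return softmax(beta * queries @ memory.T, axis=-1)


# Hypothetical toy data: random bit vectors instead of real fingerprints.
rng = np.random.default_rng(0)
n_bits, dim, n_templates = 64, 16, 5
mol_fp = rng.integers(0, 2, size=(3, n_bits)).astype(float)
template_fp = rng.integers(0, 2, size=(n_templates, n_bits)).astype(float)
w_mol = rng.normal(size=(n_bits, dim))    # untrained encoder weights
w_tmpl = rng.normal(size=(n_bits, dim))   # untrained encoder weights

probs = hopfield_template_scores(mol_fp, template_fp, w_mol, w_tmpl)
print(probs.shape)  # one row per molecule, one column per template
```

Because templates enter the model as encoded inputs rather than as fixed output classes, a template that was never seen during training can in principle be scored by appending its fingerprint to `template_fp`, which is one way to read the zero- and few-shot behaviour mentioned in the abstract.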
| Original language | English |
|---|---|
| Qualification | Master |
| Awarding Institution | |
| Supervisors/Reviewers | |
| Publication status | Published - Oct 2022 |
Fields of science
- 102019 Machine learning
- 102018 Artificial neural networks
- 102032 Computational intelligence
- 102004 Bioinformatics
- 104022 Theoretical chemistry
- 101016 Optimisation
- 101028 Mathematical modelling
- 101031 Approximation theory
- 101019 Stochastics
- 102003 Image processing
- 103029 Statistical physics
- 101018 Statistics
- 101017 Game theory
- 102001 Artificial intelligence
- 202017 Embedded systems
- 101015 Operations research
- 101014 Numerical mathematics
- 101029 Mathematical statistics
- 101026 Time series analysis
- 101024 Probability theory
- 102013 Human-computer interaction
- 101027 Dynamical systems
- 305907 Medical statistics
- 101004 Biomathematics
- 305905 Medical informatics
- 102033 Data mining
- 102 Computer Sciences
- 305901 Computer-aided diagnosis and therapy
- 106007 Biostatistics
- 106005 Bioinformatics
- 202037 Signal processing
- 202036 Sensor systems
- 202035 Robotics
JKU Focus areas
- Digital Transformation