Abstract
Identifying regions within or around proteins to which ligands can potentially bind is an essential step in developing new drugs. Binding site identification methods can now profit from the large number of 3D structures available in protein structure databases or from AlphaFold predictions. Current binding site identification methods rely on geometric deep learning, which takes geometric invariances and equivariances into account. Such methods have proven very beneficial for physics-related tasks such as binding energy or motion trajectory prediction. However, their performance in binding site identification is still limited, which might be due to the limited expressivity of E(n)-Equivariant Graph Neural Networks (EGNNs) or to oversquashing effects. Here, we extend EGNNs by adding virtual nodes and applying an extended message passing scheme, yielding VN-EGNN. The virtual nodes in these graphs both improve predictive performance and learn to represent binding sites. In our experiments, we show that VN-EGNN sets a new state of the art in binding site identification on three common benchmarks: COACH420, HOLO4K, and PDBbind2020.
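The idea of adding virtual nodes to an EGNN can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The three-phase ordering (atom to atom, atom to virtual, virtual to atom), the fully connected graph, the fixed random weights standing in for learned networks, and all names such as `egnn_update` and `vn_egnn_layer` are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(d_in, d_hidden, d_out):
    """Tiny two-layer MLP with fixed random weights (stand-in for a learned network)."""
    w1 = rng.normal(scale=0.1, size=(d_in, d_hidden))
    w2 = rng.normal(scale=0.1, size=(d_hidden, d_out))
    return lambda z: np.tanh(z @ w1) @ w2

def make_nets(f, m, hidden=16):
    """Message, coordinate, and feature networks for one message passing phase."""
    return (make_mlp(2 * f + 1, hidden, m),   # phi_e: (h_i, h_j, squared distance) -> message
            make_mlp(m, hidden, 1),           # phi_x: message -> scalar coordinate weight
            make_mlp(f + m, hidden, f))       # phi_h: (h_i, aggregated messages) -> feature update

def egnn_update(h_recv, x_recv, h_send, x_send, phi_e, phi_x, phi_h):
    """EGNN-style update of receiver nodes from all sender nodes (fully connected)."""
    diff = x_recv[:, None, :] - x_send[None, :, :]        # (R, S, 3), rotates with the input
    dist2 = (diff ** 2).sum(axis=-1, keepdims=True)       # (R, S, 1), invariant
    r, s = diff.shape[:2]
    pair = np.concatenate([
        np.broadcast_to(h_recv[:, None, :], (r, s, h_recv.shape[1])),
        np.broadcast_to(h_send[None, :, :], (r, s, h_send.shape[1])),
        dist2,
    ], axis=-1)
    msg = phi_e(pair)                                     # (R, S, M), invariant messages
    # Coordinates move along relative position vectors, which keeps the update equivariant.
    x_new = x_recv + (diff * phi_x(msg)).mean(axis=1)
    # Features are updated from summed invariant messages (self-pairs are included here).
    h_new = h_recv + phi_h(np.concatenate([h_recv, msg.sum(axis=1)], axis=-1))
    return h_new, x_new

def vn_egnn_layer(h, x, hv, xv, nets):
    """One layer with virtual nodes; the phase ordering is an assumption of this sketch."""
    h, x = egnn_update(h, x, h, x, *nets["atom_atom"])
    hv, xv = egnn_update(hv, xv, h, x, *nets["atom_virtual"])
    h, x = egnn_update(h, x, hv, xv, *nets["virtual_atom"])
    return h, x, hv, xv

# Toy input: 10 atoms and 3 virtual nodes with 4 features each (hypothetical sizes).
F, M = 4, 8
nets = {name: make_nets(F, M) for name in ("atom_atom", "atom_virtual", "virtual_atom")}
h, x = rng.normal(size=(10, F)), rng.normal(size=(10, 3))
hv, xv = rng.normal(size=(3, F)), rng.normal(size=(3, 3))
h_out, x_out, hv_out, xv_out = vn_egnn_layer(h, x, hv, xv, nets)
```

In this sketch the virtual nodes carry their own coordinates, so after several layers `xv_out` could in principle be read as candidate site positions. How the actual model trains and interprets them is not specified in the abstract.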
Original language | English |
---|---|
Title of host publication | ELLIS Workshop, Advancing Molecular Machine Learning - Overcoming Limitations, virtual, December 2023 |
Number of pages | 16 |
Publication status | Published - 2023 |
Fields of science
- 305907 Medical statistics
- 202017 Embedded systems
- 202036 Sensor systems
- 101004 Biomathematics
- 101014 Numerical mathematics
- 101015 Operations research
- 101016 Optimisation
- 101017 Game theory
- 101018 Statistics
- 101019 Stochastics
- 101024 Probability theory
- 101026 Time series analysis
- 101027 Dynamical systems
- 101028 Mathematical modelling
- 101029 Mathematical statistics
- 101031 Approximation theory
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102004 Bioinformatics
- 102013 Human-computer interaction
- 102018 Artificial neural networks
- 102019 Machine learning
- 102032 Computational intelligence
- 102033 Data mining
- 305901 Computer-aided diagnosis and therapy
- 305905 Medical informatics
- 202035 Robotics
- 202037 Signal processing
- 103029 Statistical physics
- 106005 Bioinformatics
- 106007 Biostatistics
JKU Focus areas
- Digital Transformation