TY - UNPB
T1 - In silico proof of principle of machine learning-based antibody design at unconstrained scale
AU - Akbar, Rahmad
AU - Robert, Philippe A.
AU - Weber, Cedric R.
AU - Widrich, Michael
AU - Frank, Robert
AU - Pavlovic, Milena
AU - Scheffer, Lonneke
AU - Chernigovskaya, Maria
AU - Snapkov, Igor
AU - Slabodkin, Andrei
AU - Mehta, Brij Bhushan
AU - Miho, Enkelejda
AU - Lund-Johansen, Fridtjof
AU - Andersen, Jan Terje
AU - Hochreiter, Sepp
AU - Hobæk Haff, Ingrid
AU - Klambauer, Günter
AU - Sandve, Geir Kjetil
AU - Greiff, Victor
PY - 2021
Y1 - 2021
N2 - Generative machine learning (ML) has been postulated to be a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody binding parameters. The simulation framework both enables the computation of antibody-antigen 3D structures and functions as an oracle for unrestricted prospective evaluation of the antigen specificity of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (1D) data, can be used to design native-like conformational (3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and developability variety. Furthermore, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Finally, we validated that the antibody design insight gained from simulated antibody-antigen binding data is applicable to experimental real-world data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.
U2 - 10.1101/2021.07.08.451480
DO - 10.1101/2021.07.08.451480
M3 - Preprint
T3 - bioRxiv
BT - In silico proof of principle of machine learning-based antibody design at unconstrained scale
ER -