In silico proof of principle of machine learning-based antibody design at unconstrained scale

  • Rahmad Akbar
  • Philippe A. Robert
  • Cedric R. Weber
  • Michael Widrich
  • Robert Frank
  • Milena Pavlovic
  • Lonneke Scheffer
  • Maria Chernigovskaya
  • Igor Snapkov
  • Andrei Slabodkin
  • Brij Bhushan Mehta
  • Enkelejda Miho
  • Fridtjof Lund-Johansen
  • Jan Terje Andersen
  • Sepp Hochreiter
  • Ingrid Hobæk Haff
  • Günter Klambauer
  • Geir Kjetil Sandve
  • Victor Greiff

Publication: Preprints, Working Papers and Research Reports › Preprint

Abstract

Generative machine learning (ML) has been postulated to be a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody binding parameters. The simulation framework both enables the computation of antibody-antigen 3D structures and functions as an oracle for unrestricted prospective evaluation of the antigen specificity of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (1D) data, can be used to design native-like conformational (3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and developability variety. Furthermore, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Finally, we validated that the antibody design insight gained from simulated antibody-antigen binding data is applicable to experimental real-world data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.
Original language: English
Number of pages: 36
DOIs
Publication status: Published - 2021

Publication series

Name: bioRxiv
ISSN (Print): 2692-8205

Fields of science

  • 305907 Medical statistics
  • 202017 Embedded systems
  • 202036 Sensor systems
  • 101004 Biomathematics
  • 101014 Numerical mathematics
  • 101015 Operations research
  • 101016 Optimisation
  • 101017 Game theory
  • 101018 Statistics
  • 101019 Stochastics
  • 101024 Probability theory
  • 101026 Time series analysis
  • 101027 Dynamical systems
  • 101028 Mathematical modelling
  • 101029 Mathematical statistics
  • 101031 Approximation theory
  • 102 Computer sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 102018 Artificial neural networks
  • 102019 Machine learning
  • 102032 Computational intelligence
  • 102033 Data mining
  • 305901 Computer-aided diagnosis and therapy
  • 305905 Medical informatics
  • 202035 Robotics
  • 202037 Signal processing
  • 103029 Statistical physics
  • 106005 Bioinformatics
  • 106007 Biostatistics

JKU focus areas

  • Digital Transformation
