MHNfs: Prompting In-Context Bioactivity Predictions for Low-Data Drug Discovery.

Johannes Schimunek, Sohvi Luukkonen, Günter Klambauer*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Today’s drug discovery increasingly relies on computational and machine learning approaches to identify novel candidates, yet data scarcity remains a significant challenge. To address this limitation, we present MHNfs, an application specifically designed to predict molecular activity in low-data scenarios. At its core, MHNfs leverages a state-of-the-art few-shot activity prediction model, named MHNfs, which has demonstrated strong performance across a large set of prediction tasks in the benchmark data set FS-Mol. The application features an intuitive interface that enables users to prompt the model for precise activity predictions based on a small number of known active and inactive molecules, akin to interactive interfaces for large language models. To evaluate its efficacy, we simulate real-world scenarios by recasting PubChem bioassays as few-shot prediction tasks. MHNfs offers a streamlined and accessible solution for deploying advanced few-shot learning models, providing a valuable tool for accelerating drug discovery.
Original language: English
Pages (from-to): 4243-4250
Number of pages: 8
Journal: Journal of Chemical Information and Modeling
Volume: 65
Issue number: 9
Publication status: Published - 30 Apr 2025

Fields of science

  • 101019 Stochastics
  • 102003 Image processing
  • 103029 Statistical physics
  • 101018 Statistics
  • 101017 Game theory
  • 102001 Artificial intelligence
  • 202017 Embedded systems
  • 101016 Optimisation
  • 101015 Operations research
  • 101014 Numerical mathematics
  • 101029 Mathematical statistics
  • 101028 Mathematical modelling
  • 101026 Time series analysis
  • 101024 Probability theory
  • 102032 Computational intelligence
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 101027 Dynamical systems
  • 305907 Medical statistics
  • 101004 Biomathematics
  • 305905 Medical informatics
  • 101031 Approximation theory
  • 102033 Data mining
  • 102 Computer Sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102019 Machine learning
  • 106007 Biostatistics
  • 102018 Artificial neural networks
  • 106005 Bioinformatics
  • 202037 Signal processing
  • 202036 Sensor systems
  • 202035 Robotics

JKU Focus areas

  • Digital Transformation
