Project Details
Description
Scalable machine learning of complex models on extreme data will be an important industrial application of exascale computers. In this project, we take the example of predicting compound bioactivity for the pharmaceutical industry, a sector that matters to Europe for employment, income, and solving the problems of an ageing society. Small-scale approaches to machine learning have already been trialed and show great promise for reducing empirical testing costs by acting as a virtual screen that filters out tests unlikely to work. However, it is not yet possible to use all available data to build the best possible models, because the algorithms (and their implementations) capable of learning those models do not scale to input data of such size and heterogeneity. Further challenges include imbalanced data, confidence estimation, data standards, model quality, and feature diversity. The ExCAPE project aims to solve these problems by producing state-of-the-art scalable algorithms, and implementations thereof, suitable for running on future exascale machines. These approaches will scale programs for complex pharmaceutical workloads to input data sets at industry scale. The programs will be targeted at exascale platforms by using a mix of HPC programming techniques, advanced platform simulation for tuning, and suitable accelerators.
Status | Finished |
---|---|
Effective start/end date | 01.09.2015 → 31.08.2018 |
Fields of science
- 305 Other Human Medicine, Health Sciences
- 304 Medical Biotechnology
- 102019 Machine learning
- 303 Health Sciences
- 302 Clinical Medicine
- 301 Medical-Theoretical Sciences, Pharmacy
- 102 Computer Sciences
- 106005 Bioinformatics
- 106007 Biostatistics
- 304003 Genetic engineering
- 106041 Structural biology
- 101018 Statistics
- 102010 Database systems
- 106023 Molecular biology
- 102001 Artificial intelligence
- 106002 Biochemistry
- 101004 Biomathematics
- 102004 Bioinformatics
- 102015 Information systems
- 101019 Stochastics
- 102003 Image processing
- 103029 Statistical physics
- 101017 Game theory
- 101016 Optimisation
- 202017 Embedded systems
- 101015 Operations research
- 101014 Numerical mathematics
- 101029 Mathematical statistics
- 101028 Mathematical modelling
- 101026 Time series analysis
- 101024 Probability theory
- 102032 Computational intelligence
- 101027 Dynamical systems
- 102013 Human-computer interaction
- 305907 Medical statistics
- 305905 Medical informatics
- 101031 Approximation theory
- 102033 Data mining
- 305901 Computer-aided diagnosis and therapy
- 102018 Artificial neural networks
- 202037 Signal processing
- 202036 Sensor systems
- 202035 Robotics
JKU Focus areas
- Digital Transformation