Abstract
Localizing a specific protein in a human cell is essential for understanding cellular functions and biological processes of underlying diseases. A promising, low-cost,and time-efficient biotechnology for localizing proteins is high-throughput fluorescence microscopy imaging (HTI). This imaging technique stains the protein of interest in a cell with fluorescent antibodies and subsequently takes a microscopic image. Together with images of other stained proteins or cell organelles and the annotation by the Human Protein Atlas project, these images provide a rich source of information on the protein location which can be utilized by computational methods. It is yet unclear how precise such methods are and whether they can compete with human experts. We here focus on deep learning image analysis methods and, in particular, on Convolutional Neural Networks (CNNs) since they showed overwhelming success across different imaging tasks. We pro-pose a novel CNN architecture “GapNet-PL” that has been designed to tackle the characteristics of HTI data and uses global averages of filters at different abstraction levels. We present the largest comparison of CNN architectures including GapNet-PL for protein localization in HTI images of human cells. GapNet-PL outperforms all other competing methods and reaches close to perfect localization in all 13 tasks with an average AUC of 98% and F1 score of 78%. On a separate test set the performance of GapNet-PL was compared with three human experts and 25 scholars. GapNet-PL achieved an accuracy of 91%, significantly (p-value 1.1e−6) outperforming the best human expert with an accuracy of 72%.
Original language | English |
---|---|
Title of host publication | Proceedings of the International Conference on Learning Representations (ICLR 2019) |
Number of pages | 18 |
Publication status | Published - 2019 |
Fields of science
- 305907 Medical statistics
- 202017 Embedded systems
- 202036 Sensor systems
- 101004 Biomathematics
- 101014 Numerical mathematics
- 101015 Operations research
- 101016 Optimisation
- 101017 Game theory
- 101018 Statistics
- 101019 Stochastics
- 101024 Probability theory
- 101026 Time series analysis
- 101027 Dynamical systems
- 101028 Mathematical modelling
- 101029 Mathematical statistics
- 101031 Approximation theory
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102004 Bioinformatics
- 102013 Human-computer interaction
- 102018 Artificial neural networks
- 102019 Machine learning
- 102032 Computational intelligence
- 102033 Data mining
- 305901 Computer-aided diagnosis and therapy
- 305905 Medical informatics
- 202035 Robotics
- 202037 Signal processing
- 103029 Statistical physics
- 106005 Bioinformatics
- 106007 Biostatistics
JKU Focus areas
- Digital Transformation