Abstract
Chromatography is one of the most versatile unit operations in the biotechnological industry. Regulatory initiatives such as Process Analytical Technology and Quality by Design have led to the implementation of new chromatographic devices. These devices represent an almost inexhaustible source of data. However, the analysis of large datasets is complicated, and significant amounts of information remain hidden in big data.
Here we present a new, top-down approach for the systematic analysis of chromatographic datasets. The goal of this approach is to analyze the dataset as a whole, starting with the most important, global information. The workflow should highlight interesting regions (outliers, drifts, data inconsistencies) and help to localize those regions within a multidimensional space in a straightforward way.
Moving-window factor models were used to extract the most important information, focusing on the differences between samples. The prototype was implemented as an interactive visualization tool for the exploratory analysis of complex datasets. We found that the tool makes it convenient to localize variance in a multidimensional dataset and makes it possible to differentiate between explainable and unexplainable variance. Starting with one global difference descriptor per sample, the analysis ends with highly resolved, time-dependent difference descriptor values, intended as a starting point for the detailed analysis of the underlying raw data.
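The abstract does not specify the factor model or how the difference descriptors are computed, so the following is only a minimal sketch under stated assumptions: chromatograms are rows of a matrix on a shared time axis, each window is modelled by PCA (via SVD), the explained part of a sample is the norm of its scores, and the unexplained part is the norm of its reconstruction residual. The function names, the window parameters, and the aggregation into one global value per sample are hypothetical and are not taken from the paper.

```python
import numpy as np

def moving_window_descriptors(X, window=20, step=10, n_factors=1):
    """Per-sample, per-window deviation from the window mean (assumed PCA model).

    X is an (n_samples, n_timepoints) array of chromatograms.
    Returns (starts, explained, residual), where explained[i, k] and
    residual[i, k] describe sample i in the window starting at starts[k].
    """
    n_samples, n_points = X.shape
    starts = np.arange(0, n_points - window + 1, step)
    explained = np.empty((n_samples, len(starts)))
    residual = np.empty((n_samples, len(starts)))
    for k, s in enumerate(starts):
        W = X[:, s:s + window]
        Wc = W - W.mean(axis=0)  # center each time point across samples
        U, S, Vt = np.linalg.svd(Wc, full_matrices=False)
        scores = U[:, :n_factors] * S[:n_factors]  # sample coordinates in the model
        recon = scores @ Vt[:n_factors]            # part captured by the factors
        explained[:, k] = np.linalg.norm(scores, axis=1)
        residual[:, k] = np.linalg.norm(Wc - recon, axis=1)
    return starts, explained, residual

def global_descriptor(explained, residual):
    """Collapse all windows into one value per sample (assumed aggregation)."""
    return np.sqrt((explained ** 2 + residual ** 2).sum(axis=1))
```

Read top-down, `global_descriptor` would flag which samples differ most from the rest, and the per-window arrays would then indicate where along the time axis the difference sits and whether the factors capture it. Because overlapping windows count some time points twice, the global value in this sketch is a ranking aid rather than a calibrated distance.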
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 179-190 |
Number of pages | 12 |
Journal | Journal of Chromatography B |
Volume | 1092 |
DOIs | |
Publication status | Published - 2018 |
Fields of science
- 303 Health Sciences
- 304 Medical Biotechnology
- 304003 Genetic engineering
- 305 Other Human Medicine, Health Sciences
- 101004 Biomathematics
- 101018 Statistics
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102004 Bioinformatics
- 102010 Database systems
- 102015 Information systems
- 102019 Machine learning
- 106023 Molecular biology
- 106002 Biochemistry
- 106005 Bioinformatics
- 106007 Biostatistics
- 106041 Structural biology
- 301 Medical-Theoretical Sciences, Pharmacy
- 302 Clinical Medicine
JKU Focus areas
- Computation in Informatics and Mathematics
- Nano-, Bio- and Polymer-Systems: From Structure to Function
- Medical Sciences (in general)
- Health System Research
- Clinical Research on Aging