Abstract
Modern data acquisition techniques produce huge amounts of data. Because parts of this information are related to one another, it is desirable to reduce the dimensionality while losing as little information as possible, in order to decrease computation time and complexity when any subsequent data mining technique is applied. Genetic Algorithms (GA) offer the possibility of selecting the variables that contain the most relevant information to represent all the original ones. The traditional genetic operators seem to be too general, leading to results that could be improved by means of purpose-designed genetic operators that exploit available problem-specific information. In particular, in calibration with NIR spectral data, which usually contain thousands of variables, it is known that wavebands rather than isolated wavelengths allow a more robust model design. This aspect should be taken into account when crossing individuals.
We propose three crossover operators specifically designed for calibration with NIR spectral data, based on a pseudo-random 2-point crossover in which the first point is chosen randomly and the selection of the second point is guided by problem-specific information, and we compare their performance against state-of-the-art operators. Our benchmark consists of two real-world high-dimensional data sets, corresponding to polyetheracrylat (PEA), where hydroxyl number, viscosity and acidity are monitored on-line, and to melamine resin production, where the chilling point is considered in order to regulate the condensation. We show that the designed operators promote waveband selection, achieve better-quality solutions, and converge to them faster and more smoothly than state-of-the-art operators.
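The general shape of such a crossover can be illustrated with a minimal sketch. It is not one of the three proposed operators: the abstract does not say which problem-specific information guides the second point, so the per-wavelength `weights` argument, the function name `guided_two_point_crossover`, and the binary-mask encoding of individuals are all assumptions made here for illustration.

```python
import random


def guided_two_point_crossover(parent_a, parent_b, weights, rng=random):
    """Swap one contiguous segment between two binary wavelength masks.

    parent_a, parent_b: equal-length lists of 0/1 (1 = wavelength selected).
    weights: HYPOTHETICAL non-negative score per position that biases the
             choice of the second cut point (assumption, not from the paper).
    Returns the two children and the inclusive segment bounds (lo, hi).
    """
    n = len(parent_a)
    if len(parent_b) != n or len(weights) != n:
        raise ValueError("parents and weights must have the same length")
    if n < 2:
        raise ValueError("need at least two positions to cross over")

    # First cut point: chosen uniformly at random.
    first = rng.randrange(n)

    # Second cut point: drawn from the remaining positions, biased by weights.
    candidates = [i for i in range(n) if i != first]
    cand_weights = [weights[i] for i in candidates]
    if sum(cand_weights) > 0:
        second = rng.choices(candidates, weights=cand_weights, k=1)[0]
    else:
        # No guidance available: fall back to a uniform choice.
        second = rng.choice(candidates)

    lo, hi = sorted((first, second))

    # Exchange the contiguous block [lo, hi] so whole bands move together.
    child_a = parent_a[:lo] + parent_b[lo:hi + 1] + parent_a[hi + 1:]
    child_b = parent_b[:lo] + parent_a[lo:hi + 1] + parent_b[hi + 1:]
    return child_a, child_b, (lo, hi)
```

Because the exchanged block is contiguous, any run of selected wavelengths inside it is passed to the offspring intact, which is one plausible way a crossover could favour wavebands over isolated wavelengths. With uniform `weights` the sketch reduces to an ordinary 2-point crossover.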
Original language | English |
---|---|
Pages (from-to) | 123-136 |
Number of pages | 14 |
Journal | Journal of Chemometrics |
Volume | 28 |
Issue number | 3 |
Publication status | Published - 2014 |
Fields of science
- 211913 Quality assurance
- 101 Mathematics
- 101001 Algebra
- 101013 Mathematical logic
- 101019 Stochastics
- 101020 Technical mathematics
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 202027 Mechatronics
JKU Focus areas
- Computation in Informatics and Mathematics
- Mechatronics and Information Processing
- Nano-, Bio- and Polymer-Systems: From Structure to Function