I/NI-calls for the exclusion of non-informative genes: A highly effective filtering tool for microarray data

Willem Talloen, Djork-Arné Clevert, Sepp Hochreiter, Hinrich W.H. Göhlmann, Luc Bijnens, Stefan Kass, Dhammika Amaratunga

Research output: Contribution to journal › Article › peer-review

Abstract

DNA microarray technology typically generates many measurements, of which only a relatively small subset is informative for the interpretation of the experiment. To avoid false-positive results, it is therefore critical to select the informative genes from the large, noisy data set before the actual analysis. Most currently available filtering techniques are supervised and therefore carry a risk of overfitting. Unsupervised filtering techniques, on the other hand, are either not very efficient or too stringent, as they may mix up signal with noise. We propose using the multiple probes that measure the same target mRNA as repeated measures to quantify the signal-to-noise ratio of that specific probe set. A Bayesian factor analysis with specifically chosen prior settings, which models this probe-level information, provides an objective feature-filtering technique named I/NI calls.
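
The idea of treating the probes of one probe set as repeated measures can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain one-factor Gaussian model (x = loadings * z + noise, with z ~ N(0, 1)), takes the loadings and noise variances as already estimated, and leaves out the Bayesian prior settings the abstract mentions. The function names, the probe-set size of 11, the example values, and the cutoff of 0.5 are all hypothetical choices made for this illustration.

```python
import numpy as np


def factor_posterior_variance(loadings, noise_var):
    """Posterior variance of the latent factor z in a one-factor Gaussian model.

    Assumed model (not taken from the abstract): x = loadings * z + eps,
    with z ~ N(0, 1) and eps ~ N(0, diag(noise_var)).
    """
    loadings = np.asarray(loadings, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    # Each probe contributes loading^2 / noise variance, so the
    # signal-to-noise ratio accumulates across the probes of a probe set.
    snr = np.sum(loadings ** 2 / noise_var)
    # Prior precision of z is 1; the data add `snr` to that precision.
    return 1.0 / (1.0 + snr)


def is_informative(loadings, noise_var, cutoff=0.5):
    """Call a probe set informative when the data pin down z tightly.

    `cutoff` is a hypothetical value chosen for this sketch.
    """
    return bool(factor_posterior_variance(loadings, noise_var) < cutoff)


n_probes = 11  # hypothetical probe-set size

# Probes that share a strong common signal (high loadings, low noise).
signal_loadings = np.full(n_probes, 0.9)
signal_noise = np.full(n_probes, 0.19)

# Probes dominated by independent noise (tiny loadings, high noise).
noise_loadings = np.full(n_probes, 0.05)
noise_noise = np.full(n_probes, 0.9975)

signal_var = factor_posterior_variance(signal_loadings, signal_noise)
noise_var_z = factor_posterior_variance(noise_loadings, noise_noise)

print("informative-like probe set:", signal_var, is_informative(signal_loadings, signal_noise))
print("noise-like probe set:", noise_var_z, is_informative(noise_loadings, noise_noise))
```

In this sketch a probe set whose probes agree on a common signal leaves little posterior uncertainty about the latent factor, while a probe set of unrelated noisy probes leaves the posterior close to the prior. No class labels enter the calculation, which is what makes a filter of this kind unsupervised.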
Original language: English
Pages (from-to): 2897–2902
Number of pages: 6
Journal: Bioinformatics
Volume: 23
Issue number: 21
Publication status: Published - Nov 2007

Fields of science

  • 101004 Biomathematics
  • 101027 Dynamical systems
  • 101028 Mathematical modelling
  • 101029 Mathematical statistics
  • 101014 Numerical mathematics
  • 101015 Operations research
  • 101016 Optimisation
  • 101017 Game theory
  • 101018 Statistics
  • 101019 Stochastics
  • 101024 Probability theory
  • 101026 Time series analysis
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 102018 Artificial neural networks
  • 102019 Machine learning
  • 103029 Statistical physics
  • 106005 Bioinformatics
  • 106007 Biostatistics
  • 202017 Embedded systems
  • 202035 Robotics
  • 202036 Sensor systems
  • 202037 Signal processing
  • 305901 Computer-aided diagnosis and therapy
  • 305905 Medical informatics
  • 305907 Medical statistics
  • 102032 Computational intelligence
  • 102033 Data mining
  • 101031 Approximation theory
