
A low false discovery rate at detection of copy-number aberrations in microarray data

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

A low false discovery rate (FDR) in the detection of copy-number aberrations (CNAs) in microarray data ensures sufficient detection power and prevents failures in CNA-disease association studies. A high FDR means many falsely discovered aberrations that are not associated with the disease but must still be taken into account when correcting for multiple testing. Thus, a high FDR decreases not only the discovery power of studies but also the significance level of the remaining discoveries after correction for multiple testing.

Methods: We obtain a low FDR in the detection of CNAs in microarray data with a probabilistic latent variable model called 'cn.FARMS'. The model is optimized by a Bayesian maximum a posteriori approach, in which a Laplace prior prefers models that represent the null hypothesis of observing a constant copy number of 2 for all samples. The posterior can deviate from this prior only through strong (intensities deviating from those of copy number 2) and consistent signals in the data, which hint at a CNA, the alternative hypothesis. The information gain of the posterior over the prior gives the informative/non-informative (I/NI) call, which serves as a filter for CNA candidate regions. I/NI call filtering reduces the FDR because a region with a large I/NI call is unlikely to be a falsely detected CNA, which would have neither strong nor consistent measurements. It can be shown that the I/NI call filter applied to null hypotheses of the association study is independent of the test statistic, which in turn guarantees that type I error rate control by correction for multiple testing is still possible after filtering. I/NI calls perform well for the usually rare CNAs that are seen in only a few samples, where variance-based filtering approaches fail.

Results: cn.FARMS clearly outperformed prevalent methods for CNA detection with respect to sensitivity and especially with respect to FDR on different HapMap benchmark data sets.
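The filter-then-correct idea described in the abstract can be illustrated with a toy sketch. It is not the cn.FARMS implementation: the abstract does not give the formula for the I/NI call, so the `INI_CALLS` scores, the `P_VALUES`, and the threshold of 0.5 are invented for illustration, and the Benjamini-Hochberg procedure is assumed as one possible correction for multiple testing. The sketch only shows why discarding regions with a low I/NI call before correction can recover power, provided the filter is independent of the test statistic under the null hypothesis, as the abstract states.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected."""
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def filter_then_correct(pvalues, ini_calls, ini_threshold, alpha=0.05):
    """Keep regions whose (hypothetical) I/NI call reaches the threshold,
    then apply Benjamini-Hochberg to the surviving regions only."""
    keep = [i for i, score in enumerate(ini_calls) if score >= ini_threshold]
    kept_rejected = benjamini_hochberg([pvalues[i] for i in keep], alpha)
    rejected = [False] * len(pvalues)
    for i, is_rejected in zip(keep, kept_rejected):
        rejected[i] = is_rejected
    return rejected


# Invented association-test p-values for ten candidate regions.
P_VALUES = [0.004, 0.012, 0.02, 0.3, 0.5, 0.7, 0.9, 0.45, 0.6, 0.8]
# Invented I/NI calls; larger means more informative (assumption).
INI_CALLS = [0.9, 0.8, 0.7, 0.6, 0.1, 0.1, 0.05, 0.2, 0.1, 0.0]

unfiltered = benjamini_hochberg(P_VALUES)
filtered = filter_then_correct(P_VALUES, INI_CALLS, ini_threshold=0.5)
```

In this toy data, correcting over all ten regions rejects one hypothesis, while correcting over the four regions that pass the filter rejects three, because the correction is spread over fewer tests.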
Original language: English
Title of host publication: HGV 2011 Proceedings
Number of pages: 1
Publication status: Published - 2011

Fields of science

  • 106013 Genetics
  • 106041 Structural biology
  • 102 Computer Sciences
  • 101029 Mathematical statistics
  • 102001 Artificial intelligence
  • 101004 Biomathematics
  • 102015 Information systems
  • 102018 Artificial neural networks
  • 106002 Biochemistry
  • 106023 Molecular biology
  • 305 Other Human Medicine, Health Sciences
  • 106005 Bioinformatics

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Nano-, Bio- and Polymer-Systems: From Structure to Function
