Abstract
High-density oligonucleotide genotyping microarrays, especially Affymetrix SNP6 chips, are widely used for high-resolution copy number analysis. To identify copy number variations (CNVs) more reliably, we have proposed a maximum a posteriori factor analysis model called cn.FARMS. The latent variable, the factor, captures the simultaneous increase or decrease of DNA amount at neighboring chromosome locations, as measured by the intensity of oligonucleotide probes. Such an increase or decrease indicates amplification or deletion of a DNA region, that is, a CNV. cn.FARMS considerably reduces the false discovery rate (FDR) by combining adjacent chromosome locations into an ensemble vote (agreement of multiple measurements) instead of relying on a single measurement as other methods do. Standard factor analysis assumes a Gaussian factor distribution, which is not an appropriate assumption for CNVs. Redon et al. (2006) showed that most CNVs affect fewer than three individuals among the 270 HapMap samples. Such rare events are hard for cn.FARMS to detect because it would interpret them as noise. We therefore propose a factor analysis model with a Laplacian prior, which leads to a sparse factor distribution. We applied the Laplacian cn.FARMS model to the HapMap dataset to detect CNVs. We verified most of the published copy number variable regions and found new ones. However, many known CNVs appear to be false positives.
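The effect of replacing the Gaussian factor prior with a Laplacian one can be illustrated with a minimal sketch. It is not the cn.FARMS implementation; it assumes a single-factor model x = λz + ε with independent Gaussian noise, and it assumes the loadings and noise variances (`lam` and `psi`, hypothetical names) are already known. Under these assumptions the maximum a posteriori factor has a closed form: a unit-variance Gaussian prior shrinks the estimate proportionally, whereas a unit-variance Laplacian prior soft-thresholds it, so weak evidence across neighboring probes yields a factor of exactly zero.

```python
import math


def _evidence(x, lam, psi):
    # a: precision-weighted squared loadings; b: precision-weighted projection
    # of the probe intensities x onto the loadings.
    a = sum(l * l / p for l, p in zip(lam, psi))
    b = sum(l * xi / p for l, xi, p in zip(lam, x, psi))
    return a, b


def map_factor_gaussian(x, lam, psi):
    # MAP factor under a standard normal prior (assumed): linear shrinkage,
    # which is never exactly zero unless b is zero.
    a, b = _evidence(x, lam, psi)
    return b / (1.0 + a)


def map_factor_laplace(x, lam, psi):
    # MAP factor under a unit-variance Laplacian prior (assumed),
    # p(z) proportional to exp(-sqrt(2) * |z|): soft-thresholding of b.
    a, b = _evidence(x, lam, psi)
    shrunk = max(abs(b) - math.sqrt(2.0), 0.0)
    return math.copysign(shrunk, b) / a


# Four neighboring probes with equal (hypothetical) loadings and noise.
lam = [1.0, 1.0, 1.0, 1.0]
psi = [1.0, 1.0, 1.0, 1.0]
weak = [0.2, -0.1, 0.3, 0.1]    # probes disagree: looks like noise
strong = [2.0, 2.0, 2.0, 2.0]   # probes agree: looks like an amplification

print(map_factor_gaussian(weak, lam, psi), map_factor_laplace(weak, lam, psi))
print(map_factor_gaussian(strong, lam, psi), map_factor_laplace(strong, lam, psi))
```

In this toy setting the Laplacian estimate is zero for the noise-like sample and clearly nonzero when all probes agree, which is the sparse behavior the abstract attributes to the Laplacian prior.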
| Original language | English |
|---|---|
| Title of host publication | ISMB 2013 Proceedings |
| Number of pages | 1 |
| Publication status | Published - Jul 2013 |
Fields of science
- 303 Health Sciences
- 304 Medical Biotechnology
- 305 Other Human Medicine, Health Sciences
- 106013 Genetics
- 106041 Structural biology
- 102 Computer Sciences
- 101029 Mathematical statistics
- 102001 Artificial intelligence
- 101004 Biomathematics
- 102015 Information systems
- 102018 Artificial neural networks
- 106002 Biochemistry
- 106023 Molecular biology
- 301 Medical-Theoretical Sciences, Pharmacy
- 302 Clinical Medicine
- 106005 Bioinformatics
JKU Focus areas
- Computation in Informatics and Mathematics
- Nano-, Bio- and Polymer-Systems: From Structure to Function