Abstract
Detection of differential expression in RNA-Seq data is currently limited to studies in which two or more sample conditions are known a priori. However, these biological conditions are typically unknown in cohort, cross-sectional and nonrandomized controlled studies such as the HapMap, the ENCODE or the 1000 Genomes project. We present DEXUS for detecting differential expression in RNA-Seq data for which the sample conditions are unknown. DEXUS models read counts as a finite mixture of negative binomial distributions in which each mixture component corresponds to a condition. A transcript is considered differentially expressed if modeling of its read counts requires more than one condition. DEXUS decomposes read count variation into variation due to noise and variation due to differential expression. Evidence of differential expression is measured by the informative/noninformative (I/NI) value, which allows differentially expressed transcripts to be extracted at a desired specificity (significance level) or sensitivity (power). DEXUS performed excellently in identifying differentially expressed transcripts in data with unknown conditions. On 2400 simulated data sets, I/NI value thresholds of 0.025, 0.05 and 0.1 yielded average specificities of 92, 97 and 99% at sensitivities of 76, 61 and 38%, respectively. On real-world data sets, DEXUS was able to detect differentially expressed transcripts related to sex, species, tissue, structural variants or quantitative trait loci.
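The mixture idea in the abstract can be illustrated with a minimal sketch. It is not the DEXUS implementation: it fits a negative binomial mixture to one transcript's read counts by plain expectation-maximization, assuming a fixed, shared dispersion (`size=10.0`) and made-up counts. The function names `nb_logpmf` and `fit_nb_mixture` are hypothetical, and the I/NI value is not computed because the abstract does not define it.

```python
import math

def nb_logpmf(x, mu, size):
    """Log-pmf of a negative binomial with mean `mu` and dispersion `size`."""
    return (math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
            + size * math.log(size / (size + mu))
            + x * math.log(mu / (size + mu)))

def fit_nb_mixture(counts, n_components=2, size=10.0, n_iter=200):
    """EM for a negative binomial mixture; `size` is fixed (an assumption)."""
    lo, hi = min(counts), max(counts)
    # Spread the initial means evenly over the observed range, kept positive.
    means = [max(lo + (k + 0.5) * (hi - lo) / n_components, 0.5)
             for k in range(n_components)]
    weights = [1.0 / n_components] * n_components
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each condition for each sample.
        resp = []
        for x in counts:
            logp = [math.log(w) + nb_logpmf(x, m, size)
                    for w, m in zip(weights, means)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: with `size` fixed, the weighted mean is the ML estimate of mu.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = max(nk / len(counts), 1e-12)
            if nk > 1e-12:
                means[k] = max(sum(r[k] * x for r, x in zip(resp, counts)) / nk,
                               1e-6)
    return weights, means, resp

# Illustrative counts for one transcript across 12 samples (invented data).
counts = [8, 11, 9, 12, 10, 7, 13, 10, 95, 102, 110, 98]
weights, means, resp = fit_nb_mixture(counts)

# Assign each sample to its most probable condition.
labels = [max(range(len(r)), key=r.__getitem__) for r in resp]
print("weights:", weights)
print("means:", means)
print("labels:", labels)
```

In this toy setting, a transcript whose samples split across more than one component with non-negligible weight would be a candidate for differential expression. Turning that into a calibrated decision is the role of the I/NI value described above.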
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | ISMB 2014 Proceedings |
| Number of pages | 1 |
| Publication status | Published - 2014 |
Fields of science
- 303 Health Sciences
- 304 Medical Biotechnology
- 304003 Genetic engineering
- 305 Other Human Medicine, Health Sciences
- 101004 Biomathematics
- 101018 Statistics
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102004 Bioinformatics
- 102010 Database systems
- 102015 Information systems
- 102019 Machine learning
- 106023 Molecular biology
- 106002 Biochemistry
- 106005 Bioinformatics
- 106007 Biostatistics
- 106041 Structural biology
- 301 Medical-Theoretical Sciences, Pharmacy
- 302 Clinical Medicine
JKU Focus areas
- Computation in Informatics and Mathematics
- Nano-, Bio- and Polymer-Systems: From Structure to Function