DEXUS: Identifying Differential Expression in RNA-Seq Studies with Unknown Conditions

Günter Klambauer, Thomas Unterthiner, Sepp Hochreiter

Research output: Contribution to journal › Article › peer-review

Abstract

Detection of differential expression in RNA-Seq data is currently limited to studies in which two or more sample conditions are known a priori. However, these biological conditions are typically unknown in cohort, cross-sectional and nonrandomized controlled studies such as the HapMap, the ENCODE or the 1000 Genomes project. We present DEXUS for detecting differential expression in RNA-Seq data for which the sample conditions are unknown. DEXUS models read counts as a finite mixture of negative binomial distributions in which each mixture component corresponds to a condition. A transcript is considered differentially expressed if modeling of its read counts requires more than one condition. DEXUS decomposes read count variation into variation due to noise and variation due to differential expression. Evidence of differential expression is measured by the informative/noninformative (I/NI) value, which allows differentially expressed transcripts to be extracted at a desired specificity (significance level) or sensitivity (power). DEXUS performed excellently in identifying differentially expressed transcripts in data with unknown conditions. On 2400 simulated data sets, I/NI value thresholds of 0.025, 0.05 and 0.1 yielded average specificities of 92, 97 and 99% at sensitivities of 76, 61 and 38%, respectively. On real-world data sets, DEXUS was able to detect differentially expressed transcripts related to sex, species, tissue, structural variants or quantitative trait loci. The DEXUS R package is publicly available from Bioconductor and the scripts for all experiments are available at http://www.bioinf.jku.at/software/dexus/.
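The mixture model described in the abstract can be illustrated with a minimal Python sketch. It assumes a mean/size parameterization of the negative binomial and uses fixed, made-up mixture weights, means and size parameters for a single transcript. It does not reproduce DEXUS's parameter estimation or the I/NI calculation, and the function names are hypothetical; it only shows how read counts could be assigned to mixture components (conditions) once parameters are given.

```python
import math


def nb_log_pmf(x, mu, r):
    """Log-probability of count x under a negative binomial with mean mu and size r."""
    return (
        math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(r / (r + mu))
        + x * math.log(mu / (r + mu))
    )


def responsibilities(x, weights, mus, sizes):
    """Posterior probability of each mixture component (condition) given count x."""
    log_terms = [
        math.log(w) + nb_log_pmf(x, mu, r)
        for w, mu, r in zip(weights, mus, sizes)
    ]
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - top) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Assumed (illustrative) parameters for one transcript with two conditions.
weights = [0.7, 0.3]   # mixture proportions
mus = [20.0, 200.0]    # mean read count per condition
sizes = [10.0, 10.0]   # negative binomial size (inverse dispersion)

# Made-up read counts for five samples with unknown conditions.
counts = [18, 25, 190, 22, 210]

# Assign each sample to its most probable condition.
labels = []
for x in counts:
    post = responsibilities(x, weights, mus, sizes)
    labels.append(post.index(max(post)))

print(labels)
```

In this toy setting, a transcript whose counts are explained by a single component would correspond to "not differentially expressed", while counts that need more than one component would correspond to differential expression in the sense used in the abstract.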
Original language: English
Pages (from-to): e198
Number of pages: 11
Journal: Nucleic Acids Research
Volume: 41
Issue number: 21
Publication status: Published - Sept 2013

Fields of science

  • 303 Health Sciences
  • 304 Medical Biotechnology
  • 305 Other Human Medicine, Health Sciences
  • 106013 Genetics
  • 106041 Structural biology
  • 102 Computer Sciences
  • 101029 Mathematical statistics
  • 102001 Artificial intelligence
  • 101004 Biomathematics
  • 102015 Information systems
  • 102018 Artificial neural networks
  • 106002 Biochemistry
  • 106023 Molecular biology
  • 301 Medical-Theoretical Sciences, Pharmacy
  • 302 Clinical Medicine
  • 106005 Bioinformatics

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Nano-, Bio- and Polymer-Systems: From Structure to Function
