Making Sense of Very Rare and Private Single-Nucleotide Variants

Ulrich Bodenhofer, Sepp Hochreiter

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

High-throughput sequencing technologies have facilitated the identification of large numbers of single-nucleotide variations (SNVs), many of which have already been proven to be associated with diseases or other complex traits. Several large sequencing studies, such as the 1000 Genomes Project, the UK10K project, and the NHLBI Exome Sequencing Project, have consistently reported a large proportion of private SNVs, that is, variants that are unique to a family or even a single individual. The role of private SNVs in diseases is poorly understood, largely because it is statistically very challenging to consider private SNVs in association testing. It is generally impossible to make use of private SNVs in single-marker tests or in correlation-based tests like the popular SNP-set (Sequence) Kernel Association Test (SKAT), and burden tests also face serious statistical issues. We have proposed the Position-Dependent Kernel Association Test (PODKAT), which is designed for detecting associations of very rare and private SNVs with the trait under consideration even if the burden scores are not correlated with the trait. The test assumes that the closer two SNVs are on the genome, the more likely they are to have similar effects on the trait under consideration. This assumption is fulfilled as long as deleterious, neutral, and protective variants are grouped sufficiently well along the genome.
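
The proximity assumption stated in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the triangular similarity function, the `width` parameter, the function names, and the toy genotypes. It is not the authors' published formulation of PODKAT. It only shows why a position-dependent kernel lets private SNVs contribute to the similarity between samples, whereas a plain linear kernel does not.

```python
def position_kernel(positions, width):
    """Hypothetical triangular similarity between variant positions:
    1 for identical positions, falling linearly to 0 at distance >= width."""
    n = len(positions)
    return [
        [max(0.0, 1.0 - abs(positions[i] - positions[j]) / width) for j in range(n)]
        for i in range(n)
    ]


def linear_kernel(genotypes):
    """Plain linear kernel Z Z^T: samples are similar only if they share variants."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in genotypes]
        for row_i in genotypes
    ]


def position_dependent_kernel(genotypes, positions, width):
    """Sample-by-sample kernel Z P Z^T, where P is the position kernel."""
    p = position_kernel(positions, width)
    n_var = len(positions)
    # First compute Z P (samples x variants), then multiply by Z^T.
    zp = [
        [sum(row[k] * p[k][j] for k in range(n_var)) for j in range(n_var)]
        for row in genotypes
    ]
    return [
        [sum(zp_row[j] * other[j] for j in range(n_var)) for other in genotypes]
        for zp_row in zp
    ]


# Toy data (assumed): three samples, each carrying one private SNV.
positions = [100, 150, 5000]   # genomic coordinates of the three variants
genotypes = [
    [1, 0, 0],                 # sample A carries only the variant at 100
    [0, 1, 0],                 # sample B carries only the variant at 150
    [0, 0, 1],                 # sample C carries only the variant at 5000
]

k_linear = linear_kernel(genotypes)
k_position = position_dependent_kernel(genotypes, positions, width=1000)

print("linear kernel, A vs B:", k_linear[0][1])
print("position-dependent kernel, A vs B:", k_position[0][1])
print("position-dependent kernel, A vs C:", k_position[0][2])
```

In this toy setting the linear kernel treats samples A and B as unrelated because they share no variant, while the position-dependent kernel assigns them a positive similarity because their private SNVs lie close together. Sample C, whose variant is far away, stays dissimilar to both.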
Original language: English
Title of host publication: Proceedings MAQC Society Second Annual Meeting
Number of pages: 1
Publication status: Published - 2018

Fields of science

  • 303 Health Sciences
  • 304 Medical Biotechnology
  • 304003 Genetic engineering
  • 305 Other Human Medicine, Health Sciences
  • 101004 Biomathematics
  • 101018 Statistics
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102004 Bioinformatics
  • 102010 Database systems
  • 102015 Information systems
  • 102019 Machine learning
  • 106023 Molecular biology
  • 106002 Biochemistry
  • 106005 Bioinformatics
  • 106007 Biostatistics
  • 106041 Structural biology
  • 301 Medical-Theoretical Sciences, Pharmacy
  • 302 Clinical Medicine

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Nano-, Bio- and Polymer-Systems: From Structure to Function
  • Medical Sciences (in general)
  • Health System Research
  • Clinical Research on Aging
