HapFABIA: Identification of very short segments of identity by descent (IBD) via biclustering

Sepp Hochreiter, Gundula Povysil

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

Identity by descent (IBD) can be detected reliably for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals that share only short IBD segments. New sequencing technologies facilitate the identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to utilize rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing, and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies a biclustering technique to identify very short IBD segments characterized by rare variants. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. In data for chromosome 1 from the 1000 Genomes Project, HapFABIA identified short IBD segments characterized by rare variants, with a median length of 25 kbp. IBD segments that match the Denisovan or the Neandertal genomes (archaic genomes) are shared by either a very low or a very high proportion of Africans. (Whole abstract at http://www.bioinf.jku.at/publications/2013/ASHG2013_Hochreiter.pdf)
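
The claim that rare variants tag short IBD segments can be made concrete with a toy sketch. The code below is not the HapFABIA algorithm and does not use biclustering; it is a simplified pairwise stand-in that only illustrates the intuition. It assumes haplotypes are given as 0/1 lists (1 = minor allele) and that positions are sorted in ascending order. The function name, parameters, and thresholds are all hypothetical.

```python
from itertools import combinations

def rare_variant_segments(haplotypes, positions,
                          max_freq=0.05, max_gap=10_000, min_shared=3):
    """Toy illustration (not HapFABIA): report pairs of haplotypes that
    share a run of rare minor alleles lying close together."""
    n = len(haplotypes)
    # Keep variants carried by at least 2 and at most max_freq * n haplotypes.
    rare = [j for j in range(len(positions))
            if 2 <= sum(h[j] for h in haplotypes) <= max_freq * n]
    segments = []
    for a, b in combinations(range(n), 2):
        # Positions of rare minor alleles carried by both haplotypes.
        shared = [positions[j] for j in rare
                  if haplotypes[a][j] and haplotypes[b][j]]
        run = []
        for pos in shared:
            # A large gap closes the current run of shared rare variants.
            if run and pos - run[-1] > max_gap:
                if len(run) >= min_shared:
                    segments.append((a, b, run[0], run[-1]))
                run = []
            run.append(pos)
        if len(run) >= min_shared:
            segments.append((a, b, run[0], run[-1]))
    return segments
```

Each returned tuple is (haplotype index, haplotype index, start position, end position). A biclustering approach such as the one the abstract describes would instead group many individuals and many variants simultaneously, rather than scanning pairs.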
Original language: English
Title of host publication: ASHG 2013 Proceedings
Number of pages: 1
Publication status: Published - 2013

Fields of science

  • 303 Health Sciences
  • 304 Medical Biotechnology
  • 304003 Genetic engineering
  • 305 Other Human Medicine, Health Sciences
  • 101004 Biomathematics
  • 101018 Statistics
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102004 Bioinformatics
  • 102010 Database systems
  • 102015 Information systems
  • 102019 Machine learning
  • 106023 Molecular biology
  • 106002 Biochemistry
  • 106005 Bioinformatics
  • 106007 Biostatistics
  • 106041 Structural biology
  • 301 Medical-Theoretical Sciences, Pharmacy
  • 302 Clinical Medicine

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Nano-, Bio- and Polymer-Systems: From Structure to Function
  • Medical Sciences (in general)
  • Health System Research
  • Clinical Research on Aging
