HapFABIA: Identification of very short segments of identity by descent (IBD) via biclustering

Activity: Talk or presentation, Contributed talk

Description

Identity by descent (IBD) can be detected reliably for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals who share only short IBD segments. New sequencing technologies facilitate the identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing, and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies a biclustering technique to identify very short IBD segments characterized by rare variants. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. In data for chromosome 1 from the 1000 Genomes Project, HapFABIA identified short IBD segments characterized by rare variants with a median length of 25 kbp. IBD segments that match the Denisovan or the Neandertal genome (archaic genomes) are shared by either a very low or a very high proportion of Africans. IBD segments that match archaic genomes are enriched at lengths in the ranges of 0 to 12 kbp (about 130 kyr in the past) and 38 to 60 kbp (13 to 20 kyr in the past). IBD segments that match an archaic genome and have a length of 0 to 12 kbp are overrepresented in Africans, while those with a length of 38 to 60 kbp are mainly found in Asians or Europeans.
Period: 25 Oct 2013
Event title: 2013 Annual Meeting of the American Society of Human Genetics
Event type: Conference
Location: United States

Fields of science

  • 106005 Bioinformatics
  • 305 Other Human Medicine, Health Sciences
  • 102018 Artificial neural networks
  • 102 Computer Sciences
  • 106041 Structural biology
  • 101029 Mathematical statistics
  • 106023 Molecular biology
  • 106013 Genetics
  • 106002 Biochemistry
  • 102001 Artificial intelligence
  • 101004 Biomathematics
  • 102015 Information systems

JKU Focus areas

  • Computation in Informatics and Mathematics
  • Nano-, Bio- and Polymer-Systems: From Structure to Function