Multiscale DNA partitioning: statistical evidence for segments

  • Thomas Hotz (Speaker)
  • Futschik, A. (Speaker)
  • Axel Munk (Speaker)
  • Hannes Sieling (Speaker)

Activity: Talk or presentation, Contributed talk

Description

DNA segmentation, i.e. the partitioning of DNA into compositionally homogeneous segments, is a basic task in bioinformatics. Different algorithms have been proposed for various partitioning criteria such as GC content, local ancestry in population genetics, or copy number variation. A critical component of any such method is the choice of an appropriate number of segments. Some methods use model selection criteria and do not provide suitable error control. Other methods, which are based on simulating a statistic under a null model, provide suitable error control only if the correct null model is chosen. Here, we focus on partitioning with respect to GC content and propose a new approach that provides statistical error control: it guarantees, with a user-specified probability, that the number of identified segments does not exceed the number of actually present segments. The method is based on a statistical multiscale criterion, rendering this a segmentation method that searches for segments of any length (on all scales) simultaneously. It is also very accurate in localizing segments: under benchmark scenarios, our approach leads to a segmentation that is more accurate than the approaches discussed in the comparative review of Elhaik et al. (2010). In our real data examples, we find segments that often correspond well to the available genome annotation.
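
The idea of checking segments "on all scales" can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formula, so the local Bernoulli likelihood-ratio statistic, the scale penalty and all function names below are assumptions chosen for illustration. The sketch tests whether a single constant GC probability p is compatible with a sequence on every interval at once; a large value suggests that at least one more segment is needed.

```python
import math

def gc_indicators(seq):
    """Map a DNA string to 1 (G or C) and 0 (any other base)."""
    return [1 if base in "GCgc" else 0 for base in seq]

def bernoulli_lr(k, m, p):
    """Log-likelihood ratio of the local estimate k/m against a fixed p (0 < p < 1)."""
    def term(x, q):
        # Contribution of x observations; 0 * log(0) is taken as 0.
        return 0.0 if x == 0 else x * math.log((x / m) / q)
    return term(k, p) + term(m - k, 1.0 - p)

def multiscale_stat(y, p):
    """Maximum over all intervals of sqrt(2 * LR) minus a scale penalty.

    The penalty sqrt(2 * log(e * n / m)) is an assumed form; it keeps
    short intervals from dominating the maximum.
    """
    n = len(y)
    prefix = [0]
    for v in y:
        prefix.append(prefix[-1] + v)  # prefix sums give interval counts in O(1)
    best = -math.inf
    for i in range(n):
        for j in range(i + 1, n + 1):
            m = j - i
            k = prefix[j] - prefix[i]
            lr = max(bernoulli_lr(k, m, p), 0.0)  # guard against rounding below zero
            penalty = math.sqrt(2.0 * math.log(math.e * n / m))
            best = max(best, math.sqrt(2.0 * lr) - penalty)
    return best

# A sequence with constant GC content versus one with an AT-rich and a GC-rich half.
homogeneous = gc_indicators("AG" * 40)
two_blocks = gc_indicators("AT" * 20 + "GC" * 20)
stat_homogeneous = multiscale_stat(homogeneous, 0.5)
stat_two_blocks = multiscale_stat(two_blocks, 0.5)
```

In this sketch a constant fit would be rejected when the statistic exceeds a threshold q. How q is tied to the user-specified error probability is not described in the abstract and is left open here. The double loop is quadratic in the sequence length, which is fine for illustration but not for genome-scale data.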
Period: 08 Jul 2014
Event title: XXVII International Biometric Conference
Event type: Conference
Location: Italy

Fields of science

  • 106007 Biostatistics
  • 305907 Medical statistics
  • 509 Other Social Sciences
  • 101018 Statistics

JKU Focus areas

  • Computation in Informatics and Mathematics