Sparse Bayesian Unbounded Linear Models with Unknown Design over Finite Alphabets

  • Yuexuan Wang (Speaker)
  • Futschik, A. (Speaker)
  • Ritabrata Dutta (Speaker)

Activity: Talk or presentation, Poster presentation, science-to-science

Description

In population genetics, haplotype structure can provide crucial information. For many sequencing methods, such as pool sequencing, however, only allele frequency data are available rather than haplotype information. A method for reconstructing the unknown haplotype structure $S$ and the haplotype frequencies $\omega$ from the observed allele frequency matrix $Y = S\omega+\varepsilon$ was proposed in \cite{pelizzola2021multiple}, where $Y\in [0,1]^{N\times T}$ contains the relative allele frequencies of $N$ SNPs in $T$ samples. Since that approach yields only point estimates, we develop a Bayesian approach to the problem. More specifically, we propose a hierarchical Bayesian model with carefully calibrated hyperparameters and hyper-priors that also provides credible intervals. Without constraints on the reconstruction, the joint estimate of $S$ and $\omega$ is not unique. To ensure identifiability in the Bayesian inference, we introduce a shrinkage prior. When the number of haplotypes is unknown, we perform model selection within our Bayesian framework to choose the number of haplotypes adaptively.
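As an illustration of the observation model $Y = S\omega+\varepsilon$ and of why the joint reconstruction is not unique without constraints, the following is a minimal sketch and not the authors' method. It assumes a binary alphabet for $S$, a hypothetical number of haplotypes K, frequency columns of $\omega$ that sum to one, Gaussian noise, and clipping of $Y$ to $[0,1]$; none of these details are stated in the abstract. The function names simulate, matmul and permute_haplotypes are likewise hypothetical.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of nested lists A (n x k) and B (k x m)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def simulate(n_snps, n_haps, n_samples, noise_sd=0.01, seed=0):
    """Draw S (N x K), omega (K x T) and Y = S omega + eps (N x T).

    Assumptions for illustration only: binary S, Gaussian noise,
    and clipping of Y to [0, 1].
    """
    rng = random.Random(seed)
    # S: haplotype structure, one column per haplotype, entries in {0, 1}.
    S = [[rng.randint(0, 1) for _ in range(n_haps)] for _ in range(n_snps)]
    # omega: haplotype frequencies; each sample's column sums to 1
    # (normalized exponential draws, i.e. uniform on the simplex).
    columns = []
    for _ in range(n_samples):
        raw = [rng.expovariate(1.0) for _ in range(n_haps)]
        total = sum(raw)
        columns.append([r / total for r in raw])
    omega = [[columns[t][k] for t in range(n_samples)] for k in range(n_haps)]
    # Y: noisy allele frequencies, clipped so they stay in [0, 1].
    signal = matmul(S, omega)
    Y = [[min(1.0, max(0.0, value + rng.gauss(0.0, noise_sd)))
          for value in row]
         for row in signal]
    return S, omega, Y


def permute_haplotypes(S, omega, perm):
    """Reorder the columns of S and the rows of omega by the same permutation."""
    S_perm = [[row[k] for k in perm] for row in S]
    omega_perm = [omega[k] for k in perm]
    return S_perm, omega_perm


if __name__ == "__main__":
    S, omega, Y = simulate(n_snps=6, n_haps=3, n_samples=4)
    S_perm, omega_perm = permute_haplotypes(S, omega, [2, 0, 1])
    # Both factorizations give the same noise-free product S omega.
    diff = max(abs(a - b)
               for row_a, row_b in zip(matmul(S, omega), matmul(S_perm, omega_perm))
               for a, b in zip(row_a, row_b))
    print("max difference after relabelling:", diff)
```

Relabelling the haplotypes leaves the product $S\omega$ unchanged up to floating-point rounding, which is one simple source of non-uniqueness. The abstract does not say which ambiguities the shrinkage prior is meant to resolve, so this sketch should not be read as a description of that prior.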
Period: 29 Mar 2022
Event title: Probabilistic Modelling in Genomics 2022
Event type: Conference
Location: United Kingdom

Fields of science

  • 106007 Biostatistics
  • 305907 Medical statistics
  • 102009 Computer simulation
  • 509 Other Social Sciences
  • 101018 Statistics
  • 101029 Mathematical statistics
  • 102035 Data science