Maximal Perfect Haplotype Blocks with Wildcards

Date

2020-05

Abstract

Recent work provides the first method to measure the relative fitness of genomic variants within a population that scales to large numbers of genomes. A key component of the computation involves finding maximal perfect haplotype blocks from a set of genomic samples for which SNPs (single-nucleotide polymorphisms) have been called. Often, owing to low read coverage and imperfect assemblies, some of the SNP calls can be missing from some of the samples. In this work, we consider the problem of finding maximal perfect haplotype blocks where some missing values may be present. Missing values are treated as wildcards, and the definition of maximal perfect haplotype blocks is extended in a natural way. We provide an output-linear time algorithm to identify all such blocks and demonstrate the algorithm on a large population SNP dataset. Our software is publicly available.
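To make the wildcard idea concrete, the sketch below checks whether a chosen set of samples forms a perfect block over a column range. It assumes one plausible reading of the extended definition: within every column of the block, all non-missing SNP calls must agree. The `*` wildcard symbol, the function name `is_perfect_block`, and the toy data are illustrative assumptions. This is a brute-force check of the block property only, not the output-linear algorithm described in the paper, and it does not test maximality.

```python
WILDCARD = "*"  # assumed symbol for a missing SNP call


def is_perfect_block(haplotypes, rows, start, end):
    """Return True if the given rows agree on every column in [start, end).

    Wildcard positions are ignored, so a column is consistent when at most
    one distinct non-wildcard symbol appears among the selected rows.
    """
    for col in range(start, end):
        # Collect the distinct called alleles in this column.
        called = {haplotypes[r][col] for r in rows} - {WILDCARD}
        if len(called) > 1:
            return False
    return True


# Toy data: four samples, four SNP sites, two missing calls.
haplotypes = ["0110", "01*0", "1100", "0*10"]

block_ok = is_perfect_block(haplotypes, [0, 1, 3], 0, 4)
block_bad = is_perfect_block(haplotypes, [0, 2], 0, 4)
```

Here samples 0, 1, and 3 are compatible across all four sites because their only disagreements fall on wildcard positions, while samples 0 and 2 conflict at the first site.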

Citation

Williams, Lucia, and Brendan Mumey. “Maximal Perfect Haplotype Blocks with Wildcards.” iScience 23, no. 6 (June 2020): 101149. doi:10.1016/j.isci.2020.101149.

Creative Commons license

Except where otherwise noted, this item's license is described as follows: This final published version is made available under the CC-BY-NC-ND 4.0 license.