Maximum Likelihood Pedigree Reconstruction using Integer Linear Programming

Research output: Contribution to journal > Article > peer-review



Publication details

Journal: Genetic Epidemiology
Date: E-pub ahead of print - 3 Oct 2012
Date: Published (current) - Jan 2013
Issue number: 1
Number of pages: 15
Pages (from-to): 69-83
Early online date: 3 Oct 2012
Original language: English


Large population biobanks of unrelated individuals have been highly successful in detecting common genetic variants affecting diseases of public health concern. However, they lack the statistical power to detect more modest gene-gene and gene-environment interaction effects or the effects of rare variants for which related individuals are ideally required. In reality, most large population studies will undoubtedly contain sets of undeclared relatives, or pedigrees. Although a crude measure of relatedness might sometimes suffice, having a good estimate of the true pedigree would be much more informative if this could be obtained efficiently. Relatives are more likely to share longer haplotypes around disease susceptibility loci and are hence biologically more informative for rare variants than unrelated cases and controls. Distant relatives are arguably more useful for detecting variants with small effects because they are less likely to share masking environmental effects. Moreover, the identification of relatives enables appropriate adjustments of statistical analyses that typically assume unrelatedness. We propose to exploit an integer linear programming optimisation approach to pedigree learning, which is adapted to find valid pedigrees by imposing appropriate constraints. Our method is not restricted to small pedigrees and is guaranteed to return a maximum likelihood pedigree. With additional constraints, we can also search for multiple high-probability pedigrees and thus account for the inherent uncertainty in any particular pedigree reconstruction. The true pedigree is found very quickly by comparison with other methods when all individuals are observed. Extensions to more complex problems seem feasible.
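To make the optimisation idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes each individual picks exactly one candidate parent set with a local log-likelihood score, that a valid pedigree allows at most two parents of opposite sex, and that the resulting graph must be acyclic. Exhaustive enumeration stands in for an integer linear programming solver so that the example needs only the standard library. The individuals, scores and function names are all hypothetical.

```python
from itertools import product

# Hypothetical toy data: sex of each observed individual.
sex = {"a": "F", "b": "M", "c": "F", "d": "F"}

# Hypothetical local log-likelihood scores per candidate parent set.
scores = {
    "a": {(): -2.0, ("c",): -0.5},
    "b": {(): -2.0},
    "c": {(): -5.0, ("a",): -3.0, ("b",): -3.5,
          ("a", "b"): -1.0, ("a", "d"): -0.2},
    "d": {(): -2.0},
}


def is_valid_parent_set(parents, sex):
    """Pedigree constraint: at most two parents, and two parents differ in sex."""
    if len(parents) > 2:
        return False
    if len(parents) == 2 and sex[parents[0]] == sex[parents[1]]:
        return False
    return True


def is_acyclic(choice):
    """Check that nobody is their own ancestor by peeling off founders."""
    remaining = {child: set(parents) for child, parents in choice.items()}
    while remaining:
        founders = [v for v, ps in remaining.items() if not ps]
        if not founders:
            return False  # everyone left still has a parent: a cycle exists
        for v in founders:
            del remaining[v]
        for ps in remaining.values():
            ps.difference_update(founders)
    return True


def best_pedigree(scores, sex):
    """Return the highest-scoring valid pedigree and its total score.

    Brute force over one parent-set choice per individual; an ILP solver
    would search the same space with binary variables and linear constraints.
    """
    individuals = sorted(scores)
    options = [
        [p for p in scores[v] if is_valid_parent_set(p, sex)]
        for v in individuals
    ]
    best, best_score = None, float("-inf")
    for combo in product(*options):
        choice = dict(zip(individuals, combo))
        if not is_acyclic(choice):
            continue
        total = sum(scores[v][choice[v]] for v in individuals)
        if total > best_score:
            best, best_score = choice, total
    return best, best_score


pedigree, total = best_pedigree(scores, sex)
print(pedigree, total)
```

In this toy instance the highest-scoring parent set for c is ruled out because both candidate parents are female, and giving a the parent c alongside c's best valid parent set would create a cycle, so the constraints, not the raw scores alone, determine the result. Enumeration grows exponentially with the number of individuals, which is the cost a dedicated solver is meant to avoid.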

    Research areas

  • constraint-based optimisation; genetic marker data; Bayesian networks; model uncertainty

