Developing non-response weights to account for attrition-related bias in a longitudinal pregnancy cohort

Tona Pitt, Erin Hetherington, Kamala Adhikari, Shainur Premji, Nicole Racine, Suzanne Tough, Sheila McDonald*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Background: Prospective cohorts may be vulnerable to bias due to attrition. Inverse probability weights have been proposed as a method to help mitigate this bias. The current study used the “All Our Families” longitudinal pregnancy cohort of 3351 maternal-infant pairs and aimed to develop inverse probability weights using logistic regression models to predict study continuation versus drop-out from baseline to the three-year data collection wave.

Methods: Two methods of variable selection were used. The first was a knowledge-based a priori variable selection approach; the second used the Least Absolute Shrinkage and Selection Operator (LASSO). For both approaches, the ability of each model to predict continuing participation was evaluated in terms of discrimination and calibration, by examining the area under the receiver operating characteristic curve (AUROC) and calibration plots, respectively. Stabilized inverse probability weights were generated using predicted probabilities. Weight performance was assessed using standardized differences in baseline characteristics between those who continued in the study and those who did not, with weights and without weights (unadjusted estimates).

Results: The prediction models based on a priori and LASSO variable selection had good and fair discrimination, with AUROCs of 0.69 (95% Confidence Interval [CI]: 0.67–0.71) and 0.73 (95% CI: 0.71–0.75), respectively. Calibration plots and non-significant Hosmer-Lemeshow Goodness of Fit Tests indicated that both the a priori model (p = 0.329) and the LASSO model (p = 0.242) were well-calibrated. Unweighted results indicated large (> 10%) standardized differences in 15 demographic variables (range: 11–29%) when comparing those who continued in the study with those who did not. Weights derived from the a priori and LASSO models reduced standardized differences relative to unadjusted estimates, with the largest remaining differences being 13% and 5%, respectively. Additionally, when the same LASSO variable selection method was applied to develop weights in future data collection waves, standardized differences remained below 10% for each demographic variable.

Conclusion: The LASSO variable selection approach produced robust weights that addressed non-response bias more effectively than the knowledge-driven approach. These weights can be applied to analyses across multiple longitudinal waves of data collection to reduce bias.
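The weighting and balance check described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `stabilized_ipw` and `standardized_difference`, the toy data, and the choice to weight both continuers and drop-outs (so that the two groups can be compared after weighting) are assumptions made for illustration. The predicted probabilities are supplied directly here; in practice they would come from a fitted logistic regression model.

```python
import numpy as np


def stabilized_ipw(p_continue, continued):
    """Stabilized inverse probability weights (hypothetical helper).

    Continuers get P(continue) / P(continue | covariates); drop-outs get
    the mirror-image weight so the two groups can be compared for balance.
    """
    p = np.asarray(p_continue, dtype=float)
    c = np.asarray(continued, dtype=bool)
    marginal = c.mean()  # overall proportion who continued
    return np.where(c, marginal / p, (1.0 - marginal) / (1.0 - p))


def standardized_difference(x, continued, weights=None):
    """Standardized difference of covariate x, continuers minus drop-outs."""
    x = np.asarray(x, dtype=float)
    c = np.asarray(continued, dtype=bool)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)

    def mean_var(mask):
        m = np.average(x[mask], weights=w[mask])
        v = np.average((x[mask] - m) ** 2, weights=w[mask])
        return m, v

    m1, v1 = mean_var(c)
    m0, v0 = mean_var(~c)
    return (m1 - m0) / np.sqrt((v1 + v0) / 2.0)


# Toy data (assumed, not from the study): one binary baseline covariate.
# x = 1: 80 continue, 20 drop out; x = 0: 40 continue, 60 drop out.
x = np.repeat([1, 1, 0, 0], [80, 20, 40, 60])
continued = np.repeat([True, False, True, False], [80, 20, 40, 60])
# Assume the model recovers the true continuation probabilities.
p_hat = np.where(x == 1, 0.8, 0.4)

weights = stabilized_ipw(p_hat, continued)
d_unweighted = standardized_difference(x, continued)
d_weighted = standardized_difference(x, continued, weights)
print(f"unweighted: {d_unweighted:.3f}, weighted: {d_weighted:.3f}")
```

In this toy example the predicted probabilities are exactly right, so the weighted standardized difference is essentially zero; with an estimated model some residual imbalance would be expected, which is what the 10% threshold in the abstract is used to judge. In a downstream analysis restricted to participants who continued, only the continuers' weights would be applied.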

Original language: English
Article number: 295
Number of pages: 9
Journal: BMC Medical Research Methodology
Issue number: 1
Publication status: Published - 14 Dec 2023

Bibliographical note

Funding Information:
The authors acknowledge the contribution and support of All Our Families participants and All Our Families team members.

Funding Information:
All Our Families was funded through Alberta Innovates Interdisciplinary Team Grant #200700595 and the Alberta Children’s Hospital Foundation. TMP is supported by a Canadian Institutes of Health Research Doctoral Award (#187531). The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Publisher Copyright:
© 2023, The Author(s).


  • All our families
  • Attrition
  • Cohort studies
  • Inverse probability weights
  • Non-response weights
