Use of a large dataset to develop new models for estimating the sorption of active pharmaceutical ingredients in soils and sediments

Research output: Contribution to journal › Article › peer-review


Information on the sorption of active pharmaceutical ingredients (APIs) in soils and sediments is needed to assess the environmental risks of these substances, yet these data are unavailable for many APIs in use. Predictive models for estimating sorption could provide a solution. The performance of existing models is, however, often poor, and most models do not account for the effects of soil/sediment properties, which are known to significantly affect API sorption. We therefore use a high-quality dataset on the sorption behavior of 54 APIs in 13 soils and sediments to develop new models for estimating API sorption coefficients, using three machine learning approaches (artificial neural network, random forest and support vector machine) and linear regression. A random forest-based model, with chemical and solid descriptors as the input, performed best. Evaluation of this model against an independent sorption dataset from the literature showed that it predicted the sorption coefficients of 90% of the test set to within a factor of 10 of the experimental values. This new model could be invaluable in assessing the sorption behavior of molecules that have yet to be tested and in landscape-level risk assessments.
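
The "within a factor of 10" criterion in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function name `fraction_within_factor` and the sorption coefficients below are hypothetical, invented purely for illustration, and the sketch assumes that all coefficients are positive and that agreement is judged on the ratio of predicted to experimental values.

```python
import math


def fraction_within_factor(predicted, observed, factor=10.0):
    """Return the fraction of predictions within `factor` of the observed values.

    A prediction counts as a hit when observed / factor <= predicted <= observed * factor,
    which is checked on a log10 scale. All values must be strictly positive.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one pair of values is required")
    limit = math.log10(factor)
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if abs(math.log10(p) - math.log10(o)) <= limit
    )
    return hits / len(predicted)


# Hypothetical sorption coefficients (L/kg); NOT data from the study.
observed = [1.2, 15.0, 230.0, 4.8, 0.9]
predicted = [2.0, 9.0, 3000.0, 5.5, 0.5]

share = fraction_within_factor(predicted, observed)
print(f"{share:.0%} of predictions fall within a factor of 10")
```

In this made-up example, only the third pair (3000 against 230, a ratio of about 13) misses the threshold. Working on a log scale treats over- and under-prediction symmetrically, which is a common choice for coefficients that span several orders of magnitude.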

Original language: English
Article number: 125688
Pages (from-to): 125688
Journal: Journal of Hazardous Materials
Early online date: 24 Mar 2021
Publication status: Published - 5 Aug 2021

Bibliographical note

Copyright © 2021 Elsevier B.V. All rights reserved.


  • Adsorption
  • Geologic Sediments
  • Pharmaceutical Preparations
  • Soil
  • Soil Pollutants/analysis
