Tracking and modelling prices using web-scraped price microdata: towards automated daily consumer price index forecasting

Benedict James Powell, Guy Nason, Duncan Elliott, Matthew Mayhew, Jennifer Davies, Joe Winton

Research output: Contribution to journal › Article › peer-review

Abstract

With the increasing relevance and availability of on-line prices that we see today, it is natural to ask whether the prediction of the consumer price index (CPI), or related statistics, may usefully be computed more frequently than existing monthly schedules allow for. The simple answer is ‘yes’, but there are challenges to be overcome first. A key challenge, addressed by our work, is that web-scraped price data are extremely messy and it is not obvious, a priori, how to reconcile them with standard CPI statistics. Our research focuses on average prices and disaggregated CPI at the level of product categories (lager, potatoes, etc.) and develops a new model that describes the joint time evolution of latent daily log-inflation rates driving prices seen on the Internet and prices recorded in official surveys, with the model adapting to various product categories. Our model reveals the differing levels of dynamic behaviour across product categories and, correspondingly, differing levels of predictability. Our methodology enables good prediction of product-category-specific CPIs immediately before their release. In due course, with increasingly complete web-scraped data, combined with the best survey data, more frequent intermonth aggregated CPI prediction is an achievable goal.

Original language: English
Pages (from-to): 737-756
Number of pages: 20
Journal: Journal of the Royal Statistical Society: Series A (Statistics in Society)
Volume: 181
Issue number: 3
Early online date: 15 Sept 2017
Publication status: Published - 24 May 2018

Bibliographical note

© 2017 The Authors Journal of the Royal Statistical Society: Series A.

Keywords

  • Dynamic inflation model
  • High frequency inflation prediction
  • Inflation estimation
  • State space model
