The concentration of ozone at the Earth's surface is measured at many locations across the globe for air quality monitoring and atmospheric chemistry research. We have brought together all publicly available surface ozone observations of the modern era from online databases to build a consistent data set for evaluating chemical transport and chemistry-climate (Earth System) models in projects such as the Chemistry-Climate Model Initiative and AerChemMIP. From a total data set of approximately 6600 sites and 500 million hourly observations from 1971 to 2015, approximately 2200 sites and 200 million hourly observations pass screening as high-quality sites in regionally representative locations that are appropriate for use in global model evaluation. Data volume is generally good from the start of air quality monitoring networks in 1990 through 2013. Ozone observations are heavily biased toward North America and Europe, with sparse coverage over the rest of the globe. The data set is made available for model evaluation as a set of gridded metrics that describe the distribution of ozone concentrations on monthly and annual timescales. Metrics include the moments of the distribution, percentiles, the maximum daily 8-hour average (MDA8), the sum of means over 35 ppb (computed from the daily maximum 8-hour average; SOMO35), the accumulated ozone exposure above a threshold of 40 ppb (AOT40), and metrics related to air quality regulatory thresholds. Gridded data sets are stored as netCDF-4 files and are available to download from the British Atmospheric Data Centre (doi:10.5285/08fbe63d-fa6d-4a7a-b952-5932e3ab0452). We provide recommendations to the ozone measurement community on improving metadata reporting, to simplify ongoing and future efforts to work with ozone data from disparate networks in a consistent manner.
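To make the exposure metrics named above concrete, the following is a minimal Python sketch of how MDA8, SOMO35, and AOT40 might be computed from hourly mixing ratios in ppb. It is an illustration under simplifying assumptions, not the processing code used to build the data set: the function names are hypothetical, each day is assumed to hold exactly 24 valid hourly values, 8-hour windows are taken only within a single calendar day, and the AOT40 daylight window (hours 8 to 19 inclusive) is an assumed convention. Missing-data handling and growing-season selection are omitted.

```python
from typing import Sequence


def mda8(hourly: Sequence[float]) -> float:
    """Maximum daily 8-hour average (ppb) for one day of 24 hourly values.

    Assumption: only the 17 windows lying fully inside the day are used.
    """
    if len(hourly) != 24:
        raise ValueError("expected exactly 24 hourly values")
    window_means = [sum(hourly[i:i + 8]) / 8.0 for i in range(24 - 8 + 1)]
    return max(window_means)


def somo35(days: Sequence[Sequence[float]]) -> float:
    """Sum over days of the MDA8 excess above 35 ppb (ppb days)."""
    return sum(max(0.0, mda8(day) - 35.0) for day in days)


def aot40(days: Sequence[Sequence[float]],
          first_hour: int = 8, last_hour: int = 20) -> float:
    """Accumulated hourly excess above 40 ppb (ppb hours).

    Assumption: daylight is approximated by hours first_hour..last_hour-1.
    """
    return sum(
        max(0.0, value - 40.0)
        for day in days
        for value in day[first_hour:last_hour]
    )


# Synthetic example: one day with an 8-hour plateau at 50 ppb, one clean day.
polluted_day = [30.0] * 8 + [50.0] * 8 + [30.0] * 8
clean_day = [20.0] * 24

print(mda8(polluted_day))                 # 50.0
print(somo35([polluted_day, clean_day]))  # 15.0
print(aot40([polluted_day, clean_day]))   # 80.0
```

In the synthetic example, the polluted day contributes 50 - 35 = 15 ppb days to SOMO35 and 8 hours of 10 ppb excess (80 ppb hours) to AOT40, while the clean day contributes nothing to either metric.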