Discretisation of Data in a Binary Neural k-Nearest Neighbour Algorithm

Research output: Other contribution

Abstract

This paper evaluates several discretisation (binning) methods within a k-Nearest Neighbour (k-NN) predictor. Our k-NN is built from binary neural networks, so continuous-valued data must be discretised before they can be mapped onto the binary neural framework. Our approach couples discretisation with robust encoding to map data sets onto the binary neural network. We compare seven unsupervised discretisation methods for retrieval accuracy (prediction accuracy) across a range of well-known prediction data sets comprising time-series data, and we analyse whether there is an optimal discretisation configuration for our k-NN. The analyses demonstrate that the best configuration is data specific. Hence, we recommend evaluating a number of configurations on a test data set, varying both the discretisation method and the number of discretisation bins. This evaluation identifies the best configuration for each new data set.
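The recommended evaluation can be illustrated with a minimal sketch. It is not the report's implementation. The abstract does not name the seven methods, so equal-width and equal-frequency binning are assumed here as two common unsupervised choices. A plain NumPy k-NN over bin indices stands in for the binary neural k-NN, the robust encoding step is omitted, and the data are synthetic. All function names are hypothetical.

```python
import numpy as np

def equal_width_edges(x, n_bins):
    """Interior cut points splitting [min, max] into n_bins equal intervals."""
    return np.linspace(x.min(), x.max(), n_bins + 1)[1:-1]

def equal_frequency_edges(x, n_bins):
    """Interior cut points at evenly spaced quantiles of x."""
    return np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])

def discretise(X, edges_per_column):
    """Map each column of X to integer bin indices using its own cut points."""
    return np.column_stack(
        [np.digitize(X[:, j], edges) for j, edges in enumerate(edges_per_column)]
    )

def knn_predict(train_X, train_y, query_X, k):
    """Plain k-NN regression on bin indices (assumed stand-in, not the binary neural k-NN)."""
    preds = []
    for q in query_X:
        dist = np.abs(train_X - q).sum(axis=1)          # Manhattan distance in bin space
        nearest = np.argsort(dist, kind="stable")[:k]   # indices of the k closest records
        preds.append(train_y[nearest].mean())
    return np.array(preds)

def evaluate_configurations(train_X, train_y, test_X, test_y, methods, bin_counts, k=3):
    """Return the test RMSE for every (method, number of bins) pair."""
    results = {}
    for name, edge_fn in methods.items():
        for n_bins in bin_counts:
            # Cut points are learned from the training data only.
            edges = [edge_fn(train_X[:, j], n_bins) for j in range(train_X.shape[1])]
            pred = knn_predict(discretise(train_X, edges), train_y,
                               discretise(test_X, edges), k)
            results[(name, n_bins)] = float(np.sqrt(np.mean((pred - test_y) ** 2)))
    return results

METHODS = {"equal_width": equal_width_edges, "equal_frequency": equal_frequency_edges}

# Synthetic data for illustration only; not one of the report's data sets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + 0.5 * X[:, 1]

results = evaluate_configurations(X[:150], y[:150], X[150:], y[150:], METHODS, [4, 8, 16])
best = min(results, key=results.get)
print(best, results[best])
```

The configuration with the lowest error on the held-out portion is selected. Because the report finds the best configuration to be data specific, the grid would need to be rerun for each new data set rather than reused.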
Original language: English
Type: Technical Report
Place of Publication: Department of Computer Science, University of York, UK
Volume: Technical Report YCS-2012-473
Publication status: Published - 1 Jun 2012

Keywords

  • k-Nearest Neighbour
  • binary neural network
  • discretisation
  • binning
  • quantisation
