Abstract
This paper evaluates several discretisation (binning) methods within a k-Nearest Neighbour (k-NN) predictor. Our k-NN is built from binary neural networks, which require continuous-valued data to be discretised before they can be mapped onto the binary neural framework. Our approach couples discretisation with robust encoding to perform this mapping. We compare seven unsupervised discretisation methods for retrieval accuracy (prediction accuracy) across a range of well-known prediction data sets comprising time-series data, and analyse whether there is an optimal discretisation configuration for our k-NN. The analyses demonstrate that the best configuration is data specific. Hence, for each new data set we recommend using a test data set to evaluate a number of configurations, varying both the discretisation method and the number of discretisation bins, in order to identify the optimum configuration for that data set.
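The recommended evaluation over discretisation methods and bin counts can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the seven methods, so equal-width and equal-frequency binning are used here as two common unsupervised examples; a plain k-NN over bin indices stands in for the binary neural network k-NN and its robust encoding; and the synthetic series, the window length, `k`, the bin counts, and all function names are hypothetical.

```python
import math

def equal_width_edges(values, n_bins):
    """Interior cut points splitting [min, max] into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]

def equal_frequency_edges(values, n_bins):
    """Interior cut points placing roughly equal numbers of values in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def to_bin(x, edges):
    """Index of the bin containing x, in the range 0 .. len(edges)."""
    return sum(1 for e in edges if x >= e)

def knn_predict(train_X, train_y, query, k):
    """Mean target of the k training rows closest to query
    (squared Euclidean distance on bin indices)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), y)
        for row, y in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

def evaluate(series, window, k, edge_fn, n_bins, split):
    """RMSE of one-step-ahead prediction on the held-out tail of the series."""
    edges = edge_fn(series[:split], n_bins)  # cut points fitted on training data only
    binned = [to_bin(x, edges) for x in series]
    X = [binned[i:i + window] for i in range(len(series) - window)]
    y = [series[i + window] for i in range(len(series) - window)]
    n_train = split - window  # rows whose target lies inside the training part
    errs = [
        (knn_predict(X[:n_train], y[:n_train], X[i], k) - y[i]) ** 2
        for i in range(n_train, len(X))
    ]
    return math.sqrt(sum(errs) / len(errs))

# Hypothetical synthetic time series, used only to make the sketch runnable.
series = [math.sin(0.2 * t) + 0.5 * math.sin(0.05 * t) for t in range(300)]

methods = {
    "equal_width": equal_width_edges,
    "equal_frequency": equal_frequency_edges,
}

# Evaluate every (method, bin count) configuration on the held-out data.
results = {
    (name, n_bins): evaluate(series, window=4, k=3, edge_fn=fn,
                             n_bins=n_bins, split=200)
    for name, fn in methods.items()
    for n_bins in (4, 8, 16)
}

best = min(results, key=results.get)
print("best configuration:", best, "RMSE:", round(results[best], 4))
```

Because the best configuration is reported to be data specific, the configuration selected on this synthetic series says nothing about other data sets; the loop would need to be re-run for each new data set.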
Original language | English |
---|---|
Type | Technical Report |
Place of Publication | Department of Computer Science, University of York, UK |
Volume | Technical Report YCS-2012-473 |
Publication status | Published - 1 Jun 2012 |
Keywords
- k-Nearest Neighbour
- binary neural network
- discretisation
- binning
- quantisation