State-of-the-art Convolutional Neural Networks (CNNs) have become increasingly accurate, but storing them can require hundreds or thousands of megabytes, which also makes them computationally expensive. For applications such as the Internet of Things (IoT), where CNNs must be deployed on resource- and memory-constrained platforms, including Field-Programmable Gate Arrays (FPGAs) and embedded devices, CNN architectures and parameters have to be small and efficient. In this paper, an evolutionary algorithm (EA) based adaptive integer quantisation method is proposed to reduce network size. The proposed method uses a single-objective, rank-based evolution strategy to find the best quantisation bin boundaries for a fixed quantised bit width. Its performance is evaluated on a small CNN, the LeNet-5 architecture, using the CIFAR-10 dataset. The aim is to devise a methodology that adaptively quantises both the weights and biases of LeNet-5 from 32-bit floating-point to 8-bit integer representation while retaining accuracy. The experiments compare straightforward (linear) quantisation from 32 bits to 8 bits with the proposed adaptive quantisation method. The results show that the proposed method can quantise CNNs to a lower bit-width representation with only a slight loss in classification accuracy.
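For context, the linear (uniform) quantisation baseline the abstract mentions can be sketched as below. This is a minimal illustration of mapping 32-bit floats to signed 8-bit integers with a single per-tensor scale; the function names and the symmetric max-absolute scaling scheme are assumptions for illustration, not the authors' exact implementation.

```python
def linear_quantise(weights, bits=8):
    """Map float weights to signed integers in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8-bit quantisation
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / qmax                  # one scale factor for the whole tensor
    quantised = [round(w / scale) for w in weights]
    return quantised, scale

def dequantise(quantised, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantised]

weights = [0.5, -1.0, 0.25, 0.75]
q, s = linear_quantise(weights)
recovered = dequantise(q, s)                # close to the original floats
```

The adaptive method proposed in the paper differs in that the quantisation bin boundaries are not fixed by a single linear scale but are searched for by an evolution strategy.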
Title of host publication: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Publication status: Published - 5 Dec 2021