Despite the effectiveness of convolutional neural networks (CNNs), especially for image classification tasks, the effect of convolutional features on learned representations is still limited: the representations focus mainly on an image's salient object and ignore the variation information carried by clutter and local objects. The authors propose a multiple vector of locally aggregated descriptors (VLAD) encoding method that operates on CNN features for image classification. To improve the performance of the VLAD coding method, they explore the multiplicity of VLAD encoding by extending it with three encoding algorithms. Moreover, they equip the VLAD encoding with the spatial pyramid patch (SPM) to add spatial information to the CNN features. The addition of SPM, in particular, allows the proposed framework to yield better performance than the traditional method.
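
For readers unfamiliar with VLAD, the following is a minimal sketch of the standard hard-assignment VLAD encoding applied to a grid of local CNN features, followed by a simple grid-based spatial pooling. It is not the authors' implementation: the three encoding extensions are not specified here and are not reproduced, and the function names (`vlad_encode`, `spatial_pyramid_vlad`), the pyramid levels, the normalisation steps, and the assumption that a codebook of cluster centres is already available (for example from k-means) are all illustrative choices.

```python
import numpy as np


def vlad_encode(descriptors, centers):
    """Standard VLAD: sum the residuals of each descriptor to its nearest centre.

    descriptors: (N, D) array of local features (assumed: CNN activations).
    centers:     (K, D) codebook (assumed: learned beforehand, e.g. by k-means).
    Returns a (K * D,) vector, power-normalised and L2-normalised.
    """
    # Squared Euclidean distance between every descriptor and every centre: (N, K)
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    k, d = centers.shape
    v = np.zeros((k, d))
    for idx in range(k):
        members = descriptors[nearest == idx]
        if len(members):
            # Accumulate residuals of the descriptors assigned to this centre
            v[idx] = (members - centers[idx]).sum(axis=0)

    v = v.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))  # signed square-root (power) normalisation
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def spatial_pyramid_vlad(feature_map, centers, levels=(1, 2)):
    """Concatenate VLAD vectors computed over each cell of a spatial grid.

    feature_map: (H, W, D) array of local features laid out spatially.
    levels:      grid sizes per pyramid level (hypothetical default: 1x1 and 2x2).
    """
    h, w, d = feature_map.shape
    parts = []
    for g in levels:
        ys = np.linspace(0, h, g + 1).astype(int)
        xs = np.linspace(0, w, g + 1).astype(int)
        for i in range(g):
            for j in range(g):
                cell = feature_map[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].reshape(-1, d)
                parts.append(vlad_encode(cell, centers))
    return np.concatenate(parts)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 4, 8))   # stand-in for a conv feature map
    codebook = rng.normal(size=(3, 8))      # stand-in for learned centres
    print(spatial_pyramid_vlad(features, codebook).shape)
```

With the default levels, the output concatenates one VLAD vector for the whole map and one for each of the four quadrants, so a 3-centre, 8-dimensional codebook gives a 5 × 3 × 8 = 120-dimensional representation. Concatenating per-cell encodings is what lets the final vector retain coarse spatial layout that a single global VLAD vector discards.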