Background
Convolutional neural networks (CNNs) have recently achieved great success in image classification and object detection tasks. This success has led researchers to explore deeper models, such as ResNet (152 layers). These models reach high recognition accuracy by stacking repeated layers and increasing the number of model parameters. This practice is feasible for applications running in large data centers or on infrastructure with high-performance processing capabilities. However, such complex models are poorly suited to real-time and embedded systems, which have tight energy budgets and limited computing resources.
The constraints of embedded systems have prompted various approaches: memory alignment and single instruction, multiple data (SIMD) operations to speed up matrix operations (93% Top-5 accuracy), dedicated hardware such as field-programmable gate arrays (FPGAs) (86.66% Top-5 accuracy), network compression (89.10% Top-5 accuracy), and offloading to cloud computing (where network latency must be considered). While these approaches can reduce energy consumption, many fail to retain recognition accuracy in critical situations. That is, the reduced computational load comes at the expense of recognition accuracy, which falls below the more than 96% Top-5 accuracy currently attainable.
Invention Description
Researchers at Arizona State University have developed a novel architecture for CNN image classification in which a gate decides whether invoking the deeper model is beneficial. Because FPGA resources are limited, partial reconfiguration is used to accommodate deep CNNs. In experiments with ResNet CNNs, on average only 69.8%, 71.8%, and 43.8% of the deepest network's computation was needed on the CIFAR-10, CIFAR-100, and SVHN benchmark datasets, respectively, to maintain comparably high recognition performance.
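The gating idea can be illustrated with a minimal software sketch. The summary does not describe how the gate is implemented, so everything here is an assumption for illustration rather than the researchers' design: the confidence-threshold rule, the `cascade_predict` and `average_cost` functions, the toy classifiers, and the relative cost figures are all hypothetical. FPGA partial reconfiguration is not modeled.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps an input to a list of class probabilities.
Classifier = Callable[[Sequence[float]], List[float]]
Stage = Tuple[Classifier, float]  # (model, cost relative to the deepest model)


def cascade_predict(
    x: Sequence[float], stages: List[Stage], threshold: float = 0.9
) -> Tuple[int, float]:
    """Run stages from shallowest to deepest; return (label, cost spent)."""
    if not stages:
        raise ValueError("at least one stage is required")
    cost_spent = 0.0
    probs: List[float] = []
    for index, (model, cost) in enumerate(stages):
        probs = model(x)
        cost_spent += cost
        is_last = index == len(stages) - 1
        # Hypothetical gate: stop once the current stage is confident enough.
        if is_last or max(probs) >= threshold:
            break
    label = max(range(len(probs)), key=lambda i: probs[i])
    return label, cost_spent


def average_cost(
    inputs: List[Sequence[float]], stages: List[Stage], threshold: float = 0.9
) -> float:
    """Mean cost per input, as a fraction of the deepest model's cost."""
    total = sum(cascade_predict(x, stages, threshold)[1] for x in inputs)
    return total / len(inputs)


# Toy stand-ins for a shallow and a deep network (assumed, not real models).
def toy_shallow(x: Sequence[float]) -> List[float]:
    return [0.95, 0.05] if x[0] > 0 else [0.55, 0.45]


def toy_deep(x: Sequence[float]) -> List[float]:
    return [0.9, 0.1] if x[0] > 0 else [0.1, 0.9]


if __name__ == "__main__":
    stages = [(toy_shallow, 0.25), (toy_deep, 1.0)]
    for sample in ([1.0], [-1.0]):
        print(sample, cascade_predict(sample, stages))
    print("average cost:", average_cost([[1.0], [-1.0]], stages))
```

In this toy accounting, an input that falls through to the deep stage pays for both stages. Whether a real design reuses the shallow network's computation is not stated in the summary. Lowering `threshold` trades accuracy for less computation, which is the balance the reported computation percentages reflect.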
Potential Applications
• Edge computer vision
• IoT systems
• Drones, autonomous vehicles, surveillance cameras
Benefits and Advantages
• Lightweight mechanism suits IoT image classification on FPGA accelerators
• Enables running heavy CNNs while balancing performance against computational load
• Flexible hierarchical method adapts to dynamically changing priorities among object categories
Related Publication: A novel design of adaptive and hierarchical convolutional neural networks using partial reconfiguration on FPGA