CNNs (Convolutional Neural Networks) have taken the world by storm and enable many AI-enhanced applications, such as image recognition. The problem, however, is that implementing state-of-the-art CNNs on low-power IoT (Internet-of-Things) edge devices remains challenging, because these models demand large amounts of compute, memory, and energy.
Researchers may now have removed that hurdle with a newly developed, highly efficient sparse CNN processor architecture and accompanying training algorithms that allow CNN models to be deployed at the edge without such obstacles. Because the new design is both efficient and accurate, it could boost the neural networks market and eventually trigger a paradigm shift across a range of AI technologies.
In their study, the team proposed a 40-nm sparse CNN chip that achieves both accuracy and efficiency through Cartesian-product MAC (multiply-and-accumulate) arrays and pipelined activation aligners that spatially shift activations onto a regular Cartesian MAC array.
The researchers note that dense, regular computation on a parallel computational array is more efficient than sparse, irregular computation. Their new architecture, which combines the MAC array with activation aligners, allows sparse convolution to be executed as dense computation. The study also found that zero weights can be eliminated from both storage and computation, improving storage utilization.
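The idea can be illustrated with a small sketch. The Python snippet below is our own illustration, not the chip's actual dataflow or the authors' code: it shows how dropping zero weights from storage and gathering, or "aligning", the matching activations lets a dense multiply-accumulate loop compute a sparse 1-D convolution.

```python
import numpy as np

# Illustrative sketch only: zero weights are removed from storage, and the
# activations are "aligned" (sliced at each surviving tap offset) so the
# inner MAC loop stays dense and does no wasted work on zeros.

def compress_weights(w):
    """Keep only the nonzero taps and remember their offsets."""
    idx = np.flatnonzero(w)
    return w[idx], idx

def sparse_conv1d(x, w_nz, offsets, kernel_len):
    """Valid 1-D correlation computed as a dense MAC loop over nonzero taps."""
    out_len = len(x) - kernel_len + 1
    y = np.zeros(out_len)
    for w_val, off in zip(w_nz, offsets):
        y += w_val * x[off:off + out_len]   # aligned activation slice
    return y

x = np.arange(10, dtype=float)
w = np.array([0.5, 0.0, 0.0, -1.0, 0.0])    # 60% of the taps are zero
w_nz, offsets = compress_weights(w)          # only 2 taps stored and computed
print(sparse_conv1d(x, w_nz, offsets, len(w)))
print(np.correlate(x, w, mode="valid"))      # matches the dense reference
```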
Of particular note is the mechanism's "tunable sparsity." Sparsity reduces computational complexity and therefore improves efficiency, but the degree of sparsity also strongly influences prediction accuracy. Being able to tune the sparsity to the desired balance of accuracy and efficiency is therefore essential, and it helps unravel the sparsity-accuracy relationship.
To obtain highly efficient quantized and sparse models, the team applied DQ (dynamic quantization) and gradual pruning to CNN models trained on standard image datasets such as ImageNet and CIFAR-100. Gradual pruning removes the smallest-magnitude weights within each channel in incremental steps. Dynamic quantization maps the network's weights to low bit-length numbers, with activations quantized on the fly during inference.
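As a rough illustration of these two techniques, the sketch below is our assumption of a typical recipe rather than the authors' published code: it performs gradual per-channel magnitude pruning over an increasing sparsity schedule and then applies a simple uniform quantizer to the surviving weights.

```python
import numpy as np

# Illustrative sketch only (names, schedule, and bit-width are assumptions):
# gradual magnitude pruning zeroes the smallest weights per channel in steps,
# and a uniform symmetric quantizer maps the rest to low bit-length integers.

def prune_step(weights, target_sparsity):
    """Zero out the smallest-magnitude weights in each output channel."""
    pruned = weights.copy()
    for c in range(pruned.shape[0]):
        flat = np.abs(pruned[c]).ravel()
        k = int(target_sparsity * flat.size)
        if k == 0:
            continue
        thresh = np.partition(flat, k - 1)[k - 1]
        pruned[c][np.abs(pruned[c]) <= thresh] = 0.0
    return pruned

def quantize(weights, bits=4):
    """Uniform symmetric quantization of weights to `bits`-bit integers."""
    max_abs = np.abs(weights).max()
    scale = max_abs / (2 ** (bits - 1) - 1) if max_abs > 0 else 1.0
    return np.round(weights / scale).astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))            # (out_ch, in_ch, kh, kw)
for sparsity in (0.3, 0.6, 0.8):             # gradual, incremental schedule
    w = prune_step(w, sparsity)
q, scale = quantize(w, bits=4)
print("final sparsity:", (w == 0).mean(), "| scale:", scale)
```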
The team evaluated the pruned and quantized model on a prototype CNN chip. They measured 5.30 dense tera operations per second per watt (TOPS/W), a standard metric of performance efficiency, which is equivalent to 26.5 sparse TOPS/W relative to the base model.
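The two figures are consistent under a simple assumption about weight density, as the back-of-the-envelope check below shows; the roughly 80% weight sparsity used here is our assumption, not a number reported above.

```python
# Assumption: "sparse" TOPS/W counts the operations of the original dense
# model, so sparse_tops_w = dense_tops_w / density, where density is the
# fraction of weights kept. A density of ~20% makes the two figures match.
dense_tops_w = 5.30
density = 0.20
print(dense_tops_w / density)   # 26.5 sparse TOPS/W
```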
The architecture proposed in the new study, together with its efficient sparse CNN training algorithm, should advance CNN models by enabling them to be integrated into low-power edge devices. The innovation would benefit a wide range of applications, including industrial IoT and smartphones.