The evolutionary phase of technologies is a familiar place, and AI, the prime technology of recent years, is no different. In AI, new generations add more MACs, multiple layers of quantization, this feature, that feature, all to chase improved TOPS/Watt. We have been successfully introducing AI-at-the-edge solutions to various markets through our CEVA NeuPro platform within this environment. Now we are seeing that our users want more; sometimes evolution alone is not enough.
At first, our users emphasized ease of use to help them bring up this new AI-at-the-edge technology. But they too have evolved. They’re more expert in advanced AI techniques and want access to all possible ways to build differentiation into their products. They want to blow past current state-of-the-art approaches by at least an order of magnitude. Getting there quickly isn’t possible through evolution; revolutionary improvements are necessary. What they want has switched from ease of use to maximal algorithmic flexibility at maximum throughput and minimal power.
(Source: CEVA)
Measuring up
TOPS/W is a nice marketing number, but it’s too crude to be useful in real applications. In visual inference, for example, frames per second per watt (FPS/W) is a much more meaningful metric. The value of a good score in this context is easy to understand. Detecting a pedestrian or car ahead, or a car passing from behind, requires a quick response. There’s little time to brake or steer away, and neither action is instantaneous. An inference engine must be able to manage a minimum of 100 FPS, and do so at the lowest possible power because it is only one of many sensor/AI systems around the car. That demands much higher FPS/W for competitive power.
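To make the metric concrete, here is a minimal sketch of the arithmetic behind FPS/W, together with the distance a vehicle covers between two consecutive frames. The function names, the two engines and every number except the 100 FPS floor are hypothetical, chosen only for illustration; they are not measurements of any real product.

```python
def fps_per_watt(frames_per_second: float, power_watts: float) -> float:
    """Frames per second delivered per watt of power drawn."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return frames_per_second / power_watts


def metres_per_frame(speed_kmh: float, frames_per_second: float) -> float:
    """Distance a vehicle travels between two consecutive frames."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    speed_ms = speed_kmh * 1000.0 / 3600.0  # km/h to m/s
    return speed_ms / frames_per_second


if __name__ == "__main__":
    # Two hypothetical engines that both meet the 100 FPS floor.
    engine_a = fps_per_watt(100.0, 2.0)   # draws an assumed 2.0 W
    engine_b = fps_per_watt(100.0, 0.5)   # draws an assumed 0.5 W
    print(f"Engine A: {engine_a:.0f} FPS/W, Engine B: {engine_b:.0f} FPS/W")

    # At an assumed 108 km/h (30 m/s), how far does the car move per frame?
    print(f"{metres_per_frame(108.0, 100.0):.2f} m travelled between frames")
```

Both hypothetical engines deliver the same frame rate, so a throughput-only figure cannot separate them; dividing by power does.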
The market opportunity is unquestionable. Automotive and Telecom applications are expected to be the biggest contributors to growth in this market, and in Automotive, intelligent imaging continues to be strong. Incidentally, so does the “many cameras” trend in mobile phones. In fact, the imaging pipeline in such cameras has started to replace conventional algorithms with neural nets for de-noising, image stabilization, super-resolution and other novel functions, all running at 60 FPS in a very constrained energy envelope.
What a major advance requires
There are some interesting things happening around analog AI and spiking neural nets, but product makers don’t want to jump too far away from what they are sure can scale to volume today. That constraint still leaves a lot of algorithm potential, but now product builders want access to all those algorithms with much more flexibility to squeeze out maximum performance at minimum power.
The list of optimization possibilities is long. A wide range of quantization options. Winograd support. Sparsity optimization to skip multiplications by zero. Data type diversity in activations and weights across a range of bit-sizes. Vector processing capability in parallel with neural multiplies. Data compression to reduce loading time for weights and activations. Matrix decomposition support, delivering up to a 50:1 acceleration over a reference network. And support for next-generation NN architectures, such as transformers and 3D convolution.
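Two items on this list lend themselves to a short illustration in plain Python: skipping multiplications by zero, and counting the multiply-accumulates (MACs) saved by a matrix decomposition. This is a sketch under stated assumptions, not a description of how any particular hardware implements either feature. The article does not say which decomposition is meant, so a low-rank factorization (W ≈ U·V) is assumed here. All names, sizes and values are hypothetical, and the resulting ratio is plain arithmetic, not a reproduction of the 50:1 figure.

```python
def sparse_dot(weights, activations):
    """Dot product that skips every term whose weight is zero.

    Returns (result, multiplies_performed).
    """
    total = 0
    multiplies = 0
    for w, a in zip(weights, activations):
        if w == 0:
            continue  # zero weight: the product is zero, so skip the multiply
        total += w * a
        multiplies += 1
    return total, multiplies


def dense_macs(rows, cols):
    """MACs for y = W @ x, where W is rows x cols."""
    return rows * cols


def low_rank_macs(rows, cols, rank):
    """MACs for y = U @ (V @ x), where U is rows x rank and V is rank x cols.

    Assumes W is approximated by the product U @ V (an assumption, see above).
    """
    return rank * (rows + cols)


if __name__ == "__main__":
    # Hypothetical weight vector in which 5 of 8 entries are zero.
    weights = [0, 3, 0, 0, -2, 0, 1, 0]
    activations = [5, 1, 7, 2, 4, 9, 6, 8]
    result, used = sparse_dot(weights, activations)
    print(f"result={result}, multiplies={used} of {len(weights)}")

    # Hypothetical 1024 x 1024 layer factorized at rank 16.
    dense = dense_macs(1024, 1024)
    factored = low_rank_macs(1024, 1024, 16)
    print(f"dense={dense} MACs, low-rank={factored} MACs, "
          f"ratio={dense / factored:.0f}:1")
```

In practice the achievable rank, and therefore the speedup, depends on how much accuracy loss a given network tolerates, which is why flexibility in applying such techniques matters to product builders.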
A call to action
Product builders, now with more experience in AI, know what they want to build and how to build it. What they need is a platform offering all the neural net component algorithms they already understand, to construct that optimal solution for their product.
This is a dream list of algorithms and optimizations to deliver truly breakthrough capability, throughput and low power for advanced Edge AI needs. But why only a dream? Advanced product builders are no longer satisfied with incremental improvements in AI. They now expect platforms aligned with their greatly improved understanding of possibilities. Stay tuned! If you want to learn more about CEVA’s work in edge AI, click HERE.
Published on Embedded Computing Design.