A new era of computer vision is taking cutting-edge technologies like augmented reality and holographic imaging from the desktop to mobile and embedded platforms, testing the limits of existing hardware, which typically comprises CPUs and GPUs.
So far, CPUs and GPUs have borne the brunt of processing-intensive imaging algorithms as developers create new multimedia hooks in smartphones, tablets, wearables, surveillance gear and connected cars. However, innovations such as dual-camera designs, low-light shots and fast autofocus are not without compromises on the part of both device OEMs and app developers.
These features push CPUs and GPUs beyond what they were designed for, since neither is built for processing-intensive imaging algorithms. As a result, imaging apps consume too much power and drain the battery. At the same time, both the CPU and the GPU need to remain extremely low power and highly optimized for their primary tasks. The solution to this design conundrum is a specialized processor IP core that fits into the same basic architecture and seamlessly handles imaging and intelligent vision functions.
The New Camera Order
Two use cases demonstrate how fast the number of cameras, and consequently the amount of image processing, is increasing in consumer electronics products. The first is Microsoft HoloLens, the augmented reality headset, which is known to have at least six cameras.
The second is the Huawei Honor 6 Plus smartphone, which raises the number of CMOS sensors to enable new camera features. The handset uses an 8MP camera on the front and two 8MP cameras on the back for dual-camera features.
Figure 1: Increase in Number of Cameras Makes the Case for a Specialized Vision Processor
Anatomy of a Vision Processor
The new imaging processor from CEVA Inc. offers such a platform. It can be used in system-on-chip (SoC) designs to perform intelligent vision processing, relieving the CPUs and GPUs of processing-intensive algorithms for image enhancement, computational photography and computer vision. The CEVA-XM4 implements human-like vision and visual perception capabilities for a broad range of imaging applications in smartphones, tablets, automotive safety and infotainment, robotics, security and surveillance, augmented reality, drones and signage.
Figure 2: CEVA-XM4 target markets
The CEVA-XM4 is the company’s fourth-generation imaging and vision processor IP, built on the expertise gained from working with dozens of CEVA-MM3101 licensees and partners on computer vision applications. It is a fully programmable processor designed from the ground up to accelerate the most demanding image processing and computer vision applications. CEVA has incorporated into the XM4 a programmable wide-vector architecture with fixed- and floating-point processing, multiple simultaneous scalar units, and a vision-oriented low-power instruction set. This allows the XM4 to achieve up to an 8x performance improvement and 2.5x greater power efficiency compared to the previous generation, the CEVA-MM3101.
Vision Processor vs. Mobile GPU
The XM4 intelligent vision processor stands out in both performance per mW and performance per mm² when compared to competing solutions, including the leading mobile GPUs. Take one of the most advanced mobile GPUs on the market running an object detection and tracking use case: the CEVA-XM4 core will complete the same task while consuming approximately 10 percent of the power and requiring approximately 5 percent of the die area. Furthermore, comparing specific algorithms against the mobile GPU (see Figure 3) shows an energy-efficiency gap of 20x on average in favor of the XM4.
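To make the power and area figures concrete, here is a minimal arithmetic sketch in Python. It assumes both processors finish the same task in the same time, so the quoted fractions translate directly into efficiency ratios; that equal-throughput assumption and the helper name relative_efficiency are illustrative and do not come from CEVA.

```python
# Illustrative arithmetic only. The 10% and 5% figures are the approximate
# values quoted in the text; normalizing the GPU baseline to 1.0 and assuming
# equal task time on both processors are assumptions made for this sketch.

def relative_efficiency(resource_fraction: float) -> float:
    """Work per unit of a resource, relative to a baseline that spends
    1.0 units of that resource on the same task."""
    if resource_fraction <= 0:
        raise ValueError("resource_fraction must be positive")
    return 1.0 / resource_fraction

power_fraction = 0.10  # XM4 power as a fraction of the mobile GPU's power
area_fraction = 0.05   # XM4 die area as a fraction of the mobile GPU's area

perf_per_mw_gain = relative_efficiency(power_fraction)
perf_per_mm2_gain = relative_efficiency(area_fraction)

print(f"Performance per mW:  about {perf_per_mw_gain:.0f}x the GPU baseline")
print(f"Performance per mm2: about {perf_per_mm2_gain:.0f}x the GPU baseline")
```

Under those assumptions, the object detection and tracking example works out to roughly 10x the performance per mW and 20x the performance per mm² of the GPU baseline. The separate 20x average quoted for specific algorithms is an energy-efficiency figure and should not be confused with the area ratio computed here.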
Figure 3: CEVA-XM4 vs. Mobile GPU Energy Efficiency Ratio—Higher is Better
The CEVA-XM4 imaging IP broadly caters to three vision areas: image capture, image manipulation and visual perception. Image capture encompasses functions like 3D vision, noise reduction and depth map generation. Image manipulation relates to computational photography functions such as image stabilization, low-light image enhancement, zoom, multi-frame and multi-sensor super-resolution composition. Finally, visual perception carries out functions like object recognition and tracking, video analytics, augmented reality and facial, gesture and emotion recognition for natural user interfaces.
Figure 4: CEVA-XM4 vision domains supported
XM4 in the CNN Era
The algorithms supported by the XM4 vision core include real-time 3D depth map generation, point cloud processing for 3D scanning, and object and image recognition algorithms ranging from feature-based methods such as Haar, LBP and ORB all the way to deep learning algorithms based on neural networks, including convolutional and deep neural networks (CNN, DNN). Likewise, the XM4’s support for computational photography algorithms encompasses refocus, background replacement, zoom, image stabilization, HDR, noise reduction and improved low-light capabilities.
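For readers unfamiliar with LBP (local binary patterns), one of the feature descriptors named above, the following is a minimal pure-Python sketch of the basic 3x3 operator. It is a generic textbook formulation, not CEVA's implementation or any XM4 API; the function name lbp_code, the clockwise bit ordering and the sample patch are assumptions chosen for illustration.

```python
# Basic 3x3 local binary pattern (LBP) for a single pixel.
# Generic textbook form; the bit order below is one common convention.

# Neighbor offsets, clockwise starting from the top-left pixel.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col).

    Each neighbor contributes one bit: 1 if the neighbor is at least as
    bright as the center pixel, 0 otherwise. The pixel must not lie on
    the image border.
    """
    center = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(NEIGHBOR_OFFSETS):
        if image[row + d_row][col + d_col] >= center:
            code |= 1 << bit
    return code

# Hypothetical 3x3 grayscale patch, used only to exercise the function.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]

print(lbp_code(patch, 1, 1))
```

Every pixel reduces to eight independent comparisons against its neighbors, which is the kind of regular, data-parallel work that a wide-vector architecture is meant to speed up.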
For more information, download:
- CEVA-XM4 web page
- CEVA-XM4 white paper
- “Enabling Intelligent Vision Processing in Embedded Systems”, presentation from Linley Mobile Conference, April 22-23, 2015