The Cadence Tensilica Vision DSP


Tensilica Vision DSPs for Imaging, Computer Vision, CNN

Built for Next-Generation Imaging/Vision/CNN Requirements

IP for next generation image/video processing

Today’s applications processors are not equipped to handle the complex vision/imaging digital signal processing (DSP) functions in mobile handsets, tablets, DTVs, drones, automotive systems, video game consoles, and high-end wearables. The Cadence® Tensilica® Vision DSP family offers a much-needed breakthrough in energy efficiency and performance that enables applications never before possible in a programmable device.

The Tensilica Vision DSP family was designed for the complex algorithms in imaging and computer vision, including innovative multi-frame noise reduction, video stabilization, high dynamic range (HDR) processing, object and face recognition and tracking, low-light image enhancement, digital zoom, gesture recognition, plus many more. The Tensilica Vision DSP family also offers outstanding performance while running neural networks.

The Tensilica Vision DSP family offers two Vision products: the Vision P5 DSP and the Vision P6 DSP. The Vision P5 DSP was introduced in 2015 and has been highly successful in the mobile market. The newly announced Vision P6 DSP sets a new standard in neural network performance by offering 4X the peak performance compared to the Vision P5 DSP.

Our Tensilica Vision P5 DSP offers 4X to 100X the performance of traditional mobile CPU+GPU systems at a fraction of the power/energy. Compared to commercially available GPUs, the Tensilica Vision P6 DSP will achieve twice the frame rate at much lower power consumption on a typical neural network implementation.


Tensilica Vision DSP

For complex algorithms in imaging and vision

Offload the Host CPU for Intensive Imaging and Vision Apps

The Tensilica Vision DSP family offloads the host CPU for lower energy consumption running intensive imaging and vision apps. Multi-core host CPUs can’t handle these power-hungry, bandwidth-demanding applications, hardwired accelerators are restricted to a fixed set of functions, and GPUs offer pipelines that are not required or not efficient in image and video processing applications. 

Now, the Tensilica Vision DSP family provides an imaging-specific programmable solution that is an ideal complement to the CPU/GPU. Imaging and vision algorithms can run on a DSP that’s specifically optimized for the imaging and vision functions required.

Related Topics

Learn more about Convolutional Neural Networks (CNN) and download presentations from our Embedded Neural Network Summit.

Image and Computer Vision Processing

Programmable and Customizable

The Tensilica Vision DSPs are synthesizable processors, with the configurability and extensibility that users have come to value from Cadence. The instruction set, memory system, and data types have all been optimized for high-throughput 8-, 16-, and 32-bit pixel processing.

Highly Energy Efficient

The Vision DSP family is highly energy efficient compared to CPUs or GPUs for all kinds of pixel operations.

High Performance

The Vision P5 and Vision P6 DSPs offer a 5-way VLIW architecture, where each VLIW slot can perform 64-way SIMD 8-bit operations. With all five slots issuing, the Vision family delivers a peak of 320 (5 x 64) 8-bit operations per clock cycle.
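The peak figure follows directly from the slot and lane counts quoted above. A minimal sketch of that arithmetic is shown here; the function names and the 1 GHz clock are illustrative assumptions, not Cadence specifications.

```python
# Figures quoted in the text above.
VLIW_SLOTS = 5          # issue slots per VLIW instruction
SIMD_LANES_8BIT = 64    # 8-bit SIMD lanes per slot


def peak_ops_per_cycle(slots: int = VLIW_SLOTS, lanes: int = SIMD_LANES_8BIT) -> int:
    """Peak 8-bit operations per clock when every slot issues a full-width vector op."""
    return slots * lanes


def peak_gops(clock_hz: float) -> float:
    """Peak giga-operations per second at a given clock (the clock is an assumption)."""
    return peak_ops_per_cycle() * clock_hz / 1e9


print(peak_ops_per_cycle())   # 320
print(peak_gops(1.0e9))       # 320.0 at a hypothetical 1 GHz clock
```

These are peak numbers: sustained throughput depends on how well a given algorithm fills all five slots each cycle.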

The Vision P6 DSP can achieve even higher efficiency with its wide SIMD multiply-accumulates, offering significantly enhanced performance for the pixel filtering and image analysis features common in computer vision applications.


The Vision DSP Family Architecture

Vision DSP Family for Fixed-Point Vision/Imaging

The Tensilica Vision DSP family is available as licensable, synthesizable IP with rich libraries and advanced software tools allowing you to write your code in C/C++ -- no assembly code required. The instruction set, memory system, and data types have all been optimized for high-throughput 8-, 16-, and 32-bit pixel processing. The Vision DSP family was also architected to be used in solutions requiring multiple Vision DSPs to provide higher performance if required.

Wide Vector SIMD Data Processing for Superior Performance

VLIW issue of vector operations allows an almost arbitrary mix of loads, stores, multiplies, and ALU operations in each cycle, supporting a rich set of pixel computations. Up to 320 operations can be issued per cycle, and 256 of these can be ALU operations.


The Vision DSP family also integrates a highly sophisticated SuperGather™ unit, which provides the ability to quickly and efficiently read/write from non-contiguous local memory locations. The SuperGather unit enables the full utilization of the available SIMD capabilities for algorithms such as warping, lens distortion correction, and Canny edge tracing.
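To illustrate why gather access matters for warping, the sketch below models a gather as one indexed read per SIMD lane from a row-major pixel buffer. It is a plain-Python stand-in for the access pattern only; `gather` and `warp_row` are hypothetical names and do not represent the SuperGather interface.

```python
def gather(memory, addresses):
    """Read one element per lane from arbitrary (non-contiguous) addresses."""
    return [memory[a] for a in addresses]


def warp_row(image, width, map_x, map_y):
    """Build one output row: out[i] = image[map_y[i]][map_x[i]] (nearest neighbour)."""
    # Flatten 2-D source coordinates into linear addresses in a row-major buffer.
    addresses = [y * width + x for x, y in zip(map_x, map_y)]
    return gather(image, addresses)


# A 4x4 test image whose pixel values equal their linear addresses (0..15).
WIDTH = 4
image = list(range(16))

# Horizontal flip of row 1: reads addresses 7, 6, 5, 4.
print(warp_row(image, WIDTH, map_x=[3, 2, 1, 0], map_y=[1, 1, 1, 1]))  # [7, 6, 5, 4]

# Reading down column 2: a stride-4 pattern that contiguous vector loads cannot serve.
print(warp_row(image, WIDTH, map_x=[2, 2, 2, 2], map_y=[0, 1, 2, 3]))  # [2, 6, 10, 14]
```

Without a gather, each of these scattered reads would need its own scalar load, leaving most SIMD lanes idle.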

Imaging Instructions

The Tensilica Vision DSP family includes many imaging-specific operations that accelerate 8-, 16-, and 32-bit pixel data types and video operation patterns. Some examples of these instructions are arithmetic operations (ADD, SUB, COMPARE, MUL, DIVIDE), bit manipulation operations, and data reorganization operations.
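As a rough picture of what lane-wise pixel arithmetic looks like, the sketch below applies ADD, SUB, and COMPARE element by element to 8-bit pixel vectors. The function names are hypothetical, and the saturating behaviour is an assumption made for illustration; the text above does not say whether the DSP's instructions saturate or wrap.

```python
U8_MAX = 255  # largest value of an unsigned 8-bit pixel


def sat_add_u8(a, b):
    """Lane-wise add, clamped at 255 (saturation is an assumption, see lead-in)."""
    return [min(x + y, U8_MAX) for x, y in zip(a, b)]


def sat_sub_u8(a, b):
    """Lane-wise subtract, clamped at 0."""
    return [max(x - y, 0) for x, y in zip(a, b)]


def compare_gt(a, b):
    """Lane-wise compare, producing one boolean mask bit per lane."""
    return [x > y for x, y in zip(a, b)]


a = [10, 200, 255, 0]
b = [20, 100, 1, 0]

print(sat_add_u8(a, b))   # [30, 255, 255, 0]
print(sat_sub_u8(a, b))   # [0, 100, 254, 0]
print(compare_gt(a, b))   # [False, True, True, False]
```

On a 64-way SIMD slot, each of these functions would correspond to a single vector operation covering 64 pixels at once.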

Vision P5 DSP Features and Benefits

  • Offers up to 13X vision-processing performance improvement over the previous generation Vision DSP
  • Processes 7168 bits per cycle
  • Optional vector floating-point unit (VFPU) with single-precision 32-bit floating-point support offers flexibility to provide high-precision math at a minimal area penalty

Vision P5 DSP Block Diagram

Vision P6 DSP Features and Benefits

With new instructions, increased math throughput, and other enhancements, the Vision P6 DSP sets a new standard in imaging and computer vision benchmarks, increasing performance by up to 4X compared to the highly successful Vision P5 DSP. For Convolutional Neural Network (CNN) applications, the Vision P6 DSP boosts performance by up to 4X with quadruple the available multiply-accumulate (MAC) horsepower; MAC operations are the major computation block in CNN applications. Compared to commercially available GPUs, the Vision P6 DSP will achieve twice the frame rate at much lower power consumption on a typical neural network implementation. For a wide range of other key vision functions, such as convolution, FIR filters, and matrix multiplies, the Vision P6 DSP increases performance by up to 2X with its improved 8-bit and 16-bit arithmetic.

  • Processes 9728 bits per cycle 
  • Offers 256 MACs: 4X compared to Vision P5 DSP
  • Enhanced instruction set and instruction slotting
  • Fully software compatible with Vision P5 DSP
  • Optional vector floating-point unit (VFPU) with single-precision 32-bit and/or half-precision 16-bit floating-point support offers performance and flexibility for porting existing GPU code
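The link between MAC count and CNN performance can be sketched with a simple lower-bound estimate. The layer dimensions below are hypothetical, and the estimate assumes every MAC unit is busy on every cycle, which real code rarely achieves; only the 64 and 256 MAC figures come from the text.

```python
P5_MACS_PER_CYCLE = 64    # from the text
P6_MACS_PER_CYCLE = 256   # from the text


def conv_macs(out_h, out_w, out_ch, in_ch, k_h, k_w):
    """Multiply-accumulates needed by one convolution layer."""
    return out_h * out_w * out_ch * in_ch * k_h * k_w


def min_cycles(macs, macs_per_cycle):
    """Lower bound on cycles, assuming perfect MAC utilization (ceiling division)."""
    return -(-macs // macs_per_cycle)


# Hypothetical layer: 56x56 output, 64 input and 64 output channels, 3x3 kernel.
macs = conv_macs(out_h=56, out_w=56, out_ch=64, in_ch=64, k_h=3, k_w=3)

p5_cycles = min_cycles(macs, P5_MACS_PER_CYCLE)
p6_cycles = min_cycles(macs, P6_MACS_PER_CYCLE)

print(macs)                    # 115605504
print(p5_cycles / p6_cycles)   # 4.0
```

The 4X ratio here is a best case for MAC-bound layers; layers limited by memory bandwidth or data reorganization would see a smaller gain.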


Vision P6 DSP Block Diagram

Vector Floating Point

The Vision DSP family also provides an optional vector floating-point unit for applications that need this precision or as a quick way to port existing code. The vector floating-point unit offers a significant performance improvement with very little area increase. The Vision P6 DSP offers optional support for a 32-way vector floating-point unit with half-precision (FP16) format.
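The trade-off between FP16 and FP32 can be illustrated on any host using Python's standard `struct` module, which supports IEEE 754 half precision through the `e` format code. This is a generic IEEE 754 sketch, not a model of the VFPU; the helper names are hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a value to the nearest IEEE 754 half-precision (16-bit) number."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a value to the nearest IEEE 754 single-precision (32-bit) number."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 32 FP16 lanes occupy the same 512 bits as 64 lanes of 8-bit pixels.
print(32 * 16, 64 * 8)     # 512 512

# FP16 has an 11-bit significand, so odd integers above 2048 are not representable.
print(to_fp16(2049.0))     # 2048.0
print(to_fp32(2049.0))     # 2049.0
```

Half precision doubles the number of lanes per vector relative to single precision, at the cost of range and precision, which is often acceptable for neural network inference and GPU-style code.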

Processor Optimization

Because the Vision DSP family is built on our proven Tensilica Optimization Platform, further optimizations can be made to target your specific application. Please see the Xtensa section for all of the options available. All processors come with a complete hardware design and matching software tools, including a mature, world-class auto-vectorizing compiler, a cycle-accurate SystemC-compatible instruction set simulator (ISS), and the full industry-standard GNU toolchain.

                                        Vision P5 DSP    Vision P6 DSP
Number of bits processed per cycle      7168             9728
MACs                                    64               256
16-bit (FP16) VFPU support (optional)   No               Yes
32-bit (FP32) VFPU support (optional)   Yes              Yes


Library and Third-Party Support

OpenCV/VX Library Support

The Tensilica Vision P5 and P6 DSPs come with over 1000 OpenCV-like functions, highly optimized to achieve the best performance on these DSPs. OpenCV has over 2500 functions, and Cadence has chosen the 1000 most common ones to optimize. Cadence continues to add more functions with quarterly library updates.

OpenVX has ~40 library functions. All of these functions are already available on the Vision P5 and Vision P6 DSPs.

Rich Third-Party Application Software Support

Along with math library support, Cadence also supports a very rich set of third-party applications targeting the Vision DSP family. Some of these third parties offer video WDR, image stabilization, super resolution, CNN, and various ADAS applications. These applications are ported and optimized on our DSPs for fast time to market. 

See our list on our Partners page

Comprehensive Hardware and Software Design Tools

Our Proven, Comprehensive Hardware and Software Design Environment

Processor design process


For Processor Designers

Cadence delivers patented, proven tools that automate the process of generating a custom processor or DSP along with matching software tools. These tools have been proven in hundreds of designs. Whether your design is for a simple controller or a complex multi-core DSP design, Cadence has the tools you need to create successful products.

View the complete set of tools for processor designers.

Software development process

For Software Developers

When you need to develop application code for a Tensilica processor, the Xtensa Software Developer's Toolkit provides a comprehensive collection of code generation and analysis tools that speed the development process. Cadence's Eclipse-based Xtensa Xplorer Integrated Development Environment (IDE) serves as the cockpit for the entire development experience.

View the complete set of tools for software developers.

FPGA Platform

Cadence has developed a complete camera system, display system, and Vision P5 DSP on an FPGA platform. The FPGA platform can be used to develop various vision and imaging applications. It has a CMOS-sensor-based camera connected over a MIPI interface and an LCD panel connected over another MIPI interface. It also has an HDMI input and output, which make it a highly flexible platform for developing imaging and vision applications. Cadence has already developed various applications on this FPGA platform, including face detection and people detection.

Vision Demo System

Vision DSP Family Literature and Other Resources

Documentation and Literature

Product Literature

Vision DSP Family Product Brief

White Paper

Choosing the Right DSP for High-Resolution Imaging in Mobile and Wearable Applications

Please contact us for datasheets and more relevant documentation.

Hardware/Software Design Tools

Xtensa Processor Developer's Toolkit

Xtensa Software Developer's Toolkit


Read Blogs on Vision DSP

 The Road Ahead for Neural Networks in Embedded Systems

 Q&A: Drones, Robots, and the New Tensilica Imaging/Vision DSP


BDTI: Next-Gen Cadence Tensilica Processor Core Claims Big Performance, Energy Consumption Gains

EEJournal Chalk Talk: Cadence Tensilica Vision P5

