Benefits of Customization

How You Will Benefit by Customizing Your Processors

More than 1000 different processor designs have been put into production using the Cadence® Tensilica® automated processor-generation system. Here's what our customers have told us are the major benefits of customizing their processors:


You make the changes. The processor production process is totally automated, so no one else (not even Cadence employees) sees your option choices or the TIE instructions you add. You get a processor that's yours and yours alone. It's not the same CPU or DSP your competitor just licensed. And it will be virtually impossible for anyone to copy, making your design very secure.

You also get a full software tool chain, totally matched to all of the optimizations you made to your processor. No one else can get the matching software tool chain unless you provide it to them, so no one can program the processors in your SoC unless you allow it. This gives you both differentiation and product control.

Reduced Time to Market

You get to market much faster using Tensilica processors no matter how you compare them to other solutions.

Compared to RTL design: It takes only minutes to make simple option choices for your Tensilica processor. Or you can spend more time checking the effects of different changes on your required performance, power, and area. Either way, customizing a Tensilica processor is much faster than designing a new RTL block for the same function. On top of the shorter initial design time, the verification time is cut even more. Every Tensilica processor comes with pre-verified RTL, so you only need to confirm that the functionality matches your specification.

Compared to standard CPUs and DSPs: If your standard processor is not customized, you're probably not getting the best possible performance, power, and area, so you need to offload certain functions to RTL blocks. To design those blocks, you run into the same time challenges described above for RTL design.


Instead of a hard-wired block, you have a programmable processor-based solution, so you can make changes in software, even after tapeout.

Tensilica processors can be used instead of RTL blocks by adding the same datapath elements as implemented in RTL accelerator blocks. These datapath elements include deep pipelines, parallel execution units, task-specific state registers, and wide data buses to local and global memories. This allows Tensilica processors to sustain the same high computational throughput and support the same data interfaces as RTL hardware accelerator blocks.

The big difference is in the control of the datapaths. With RTL, you freeze the control logic in a hardware finite state machine (FSM). In a Tensilica processor, the FSM is implemented in firmware running on the processor, giving you maximum flexibility to add new features or make necessary adjustments.

Best Performance, Power, and Area

Optimizing your processor enables a much more efficient implementation than standard CPUs and DSPs, often by 10X or more. Designers can add precisely the computing resources they need to achieve the desired algorithmic performance: nothing more, nothing less. Cadence's Tensilica processors were designed for the fastest possible data processing, and the performance gains can be dramatic because data can bypass the main system bus and stream directly into the processor's execution units.

Performance improvements have very beneficial effects on overall power consumption and area. A designer can add a few custom instructions that marginally increase the core's size, which in turn marginally increases the average power dissipation per clock cycle. However, if those custom instructions dramatically cut the total clock cycles required to perform a given workload, then the total energy consumed (energy per cycle multiplied by the total number of cycles) can be substantially reduced.

Example: A 20% increase in power dissipated per clock cycle, offset by a 3X speed-up in task execution, actually reduces energy consumption by 60%. The reduction in required task-execution cycles allows the system either to spend much more time in a low-power sleep state or to reduce the processor’s clock frequency and core operating voltage, leading to further reductions in both dynamic and leakage power.
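The arithmetic behind this example can be checked with a short calculation. The sketch below is illustrative only: the function name `relative_energy` and its parameters are hypothetical, and it assumes the clock frequency and voltage stay the same, so that a 3X speed-up means one third as many cycles for the task.

```python
def relative_energy(power_increase, speedup):
    """Return task energy relative to the baseline core (1.0 = unchanged).

    power_increase: fractional rise in power per clock cycle (0.20 = 20%).
    speedup: factor by which the task's cycle count shrinks (3.0 = 3X).
    Assumes the same clock frequency and voltage as the baseline.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    relative_power_per_cycle = 1.0 + power_increase
    relative_cycles = 1.0 / speedup
    # Energy scales with (power per cycle) x (number of cycles).
    return relative_power_per_cycle * relative_cycles


ratio = relative_energy(power_increase=0.20, speedup=3.0)
savings = 1.0 - ratio
print(f"relative energy: {ratio:.2f}, savings: {savings:.0%}")
```

With the numbers from the example, the customized core uses 1.2 x 1/3 = 0.4 of the baseline energy, which is the 60% reduction quoted above. Any further savings from sleep states or from lowering frequency and voltage are not modeled here.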