Customizable processors that scale from small efficient controllers up to compute-intensive data processing engines


Tensilica Customizable Processors

Tensilica processors excel in high-performance, low-power processing

Make a Processor Uniquely Your Own

What are the WOW factors you need in your SoC design? For next-generation mobile devices and home entertainment products, you need efficient, high-performance functional blocks that are programmable to keep up with the latest standards.

Use our proven, automated processor generator to customize a Cadence® Tensilica® processor, and create more competitive and differentiated features with the lowest possible power.

  • Create a single product for multiple markets
  • Reduce development time and cost with pre-verified processors 
  • Extend product life cycles by making software changes without a re-spin

See How Customizable Processors Can Help Offload Your Application Processor


New Xtensa LX7

The Xtensa® LX7 processor release delivers the latest version of the Tensilica processor platform and introduces the new Vision P6 DSP for image and CNN processing, and the new Fusion G3 DSP for general purpose fixed and floating point applications. Xtensa LX7 delivers enhancements to the industry-leading ConnX BBE DSPs for baseband and radar applications, with a new vector floating-point option that features patent-pending innovations for improved area and power efficiency. All Tensilica DSPs are built on top of the Xtensa LX7 processor platform.

Additionally, the Xtensa LX7 increases common controller performance benchmarks by over 15%, while easing SoC design challenges with numerous architectural enhancements. These include a new integrated DMA (iDMA) controller and broader support for the AMBA AXI4 protocol, which simplify the integration of Tensilica processors with application processors, interface IP, and the associated complex interconnect fabrics.

For more information on the Xtensa LX7, view the features and configuration options below.

Xtensa LX7 Features and Configuration Options

Xtensa LX7 Features and Configuration Table


Feature                                                      Option
MAC 16 DSP                                                   Yes
MUL16/MUL32                                                  Yes
IEEE 754-compliant floating point (Half, Single and Double)  Yes
Fusion DSP Family                                            Yes
HiFi DSP Family                                              Yes
Vision Px DSP Family                                         Yes
ConnX BBE-EP DSP Family                                      Yes
Integrated DMA Controller (iDMA)                             Yes
Memory Protection Unit (MPU)                                 Yes
Linux MMU                                                    Yes
Pipeline Stages                                              5/7
Instruction and Data L1 Memories                             Yes
Instruction and Data Cache Memories                          Yes
ECC/Parity Support (L1 and Caches)                           Yes
FLIX Technology                                              Yes
GPIO32 option (two 32-wire ports)                            Yes
QIF32 option (two 32-bit queue interfaces)                   Yes
AXI, AHB-Lite, and PIF                                       Yes
ACE-Lite Support                                             Yes
Load/store units                                             One or Two
Designer-defined Ports and Queues                            Yes
3-way 64-bit VLIW configuration                              Yes
Configurable Interrupts and Levels                           Yes
Second load/store unit option                                Yes
Lookup interfaces to connect to RAMs                         Yes
Virtually unlimited bandwidth I/O options                    Yes
ARM® CoreSight™-compatible debug                             Yes
Memory bank RAM support                                      Yes
Dual load/store with caches                                  Yes
Multiple FLIX widths supported                               Yes
Performance counters                                         Yes
Dynamic and leakage power reduction features                 Yes

Features of a Great Architecture with the Widest Range of Customization Options

Processor Architecture With the Widest Range of Customization Options

Tensilica processors are extremely flexible by design. You can use a Tensilica processor for anything from a small, low-power, cache-less controller to a DSP with high-performance 16-way SIMD and 3-issue VLIW. Take advantage of these approaches to make Tensilica processors uniquely your own:



Choose from a menu of checkbox and drop-down options so you can pick just the features you need. Once you've determined the best implementation, our automated Xtensa Processor Generator creates, in a matter of minutes, pre-verified RTL and a complete matching software toolchain, including models for system integration and EDA scripts for production.



Add your own instructions, registers, register files, and much more using the Tensilica Instruction Extension (TIE) methodology. You can specify the functional behavior of the new data path elements in our Verilog-like TIE language, and the RTL and tool chain will be generated for you automatically.

The Most Flexible and Easy-to-Use Customization Options

Automatically Generated Hardware with Matching Software Tool Chain

Use our Eclipse-based integrated development environment (IDE) to create and test out your customizations. When you’re ready for production, the Xtensa Processor Generator automatically creates pre-verified RTL and a complete software tool chain, including a compiler, debugger, instruction set simulator, profiler, power estimator, system models, EDA tool scripts, and more. Your complete development toolchain is automatically adapted to all options and any custom extensions.

Xtensa ISA—Optimized for the Dataplane

The Xtensa instruction set architecture (ISA) is designed to meet the diverse requirements of dataplane processing. This 32-bit architecture features a compact 16- and 24-bit instruction set with modeless switching for maximum power efficiency and performance. The base architecture has 80 RISC instructions and includes a 32-bit ALU, up to 64 general-purpose 32-bit registers, and six special-purpose registers. Using this architecture, you can expect significant code size reductions, which translate into higher code density and lower power dissipation.

Customize and Differentiate Your Design

You can start with a base Tensilica LX processor, less than 20K gates, and add what you need to customize and differentiate your design. Many high-level building blocks such as HiFi DSPs, ConnX DSPs, floating point unit options, as well as different memory protection schemes including an MMU that supports Linux, are available as pre-designed blocks. Just click to add an option to your processor design. You can fine-tune performance, power, and area by simply selecting the size, type, width, and access latency of memories. You can also set load/store unit characteristics, select the number of general-purpose registers and the number and priority level of interrupts, and much more.

Our automated tools help you make smart decisions about what to change and what not to change in your design in order to meet all of your performance, power, and area requirements. Your changes can easily and immediately be tested so you can see the results—without all of the guesswork.

Customization Using a Simple Verilog-Like Language

Using our Tensilica Instruction Extension (TIE) language, you can improve the performance of your application by creating TIE instructions of your own definition that can do the work of multiple general-purpose processor instructions. Several techniques can be used to combine multiple operations into one. Using TIE, you can add inputs and outputs, scratchpad memories, simple single- or multi-cycle instructions, SIMD for vectorization, or use our Flexible Length Instruction Extensions (FLIX) for parallel operations.
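To make the idea of combining multiple operations into one more concrete, here is a minimal behavioral sketch in Python. It is not TIE code and does not describe any real Tensilica instruction: the function `sad4`, the lane count of 4, and the way operations are counted are all assumptions chosen purely for illustration. It computes a sum of absolute differences twice, once as separate subtract, absolute-value, and accumulate operations, and once with a hypothetical fused 4-lane operation.

```python
def sad_scalar(a, b):
    """Sum of absolute differences, one element at a time.

    Returns (result, op_count), counting a subtract, an absolute value,
    and an accumulate per element as three separate operations.
    """
    total = 0
    ops = 0
    for x, y in zip(a, b):
        diff = x - y      # operation 1: subtract
        mag = abs(diff)   # operation 2: absolute value
        total += mag      # operation 3: accumulate
        ops += 3
    return total, ops


LANES = 4  # hypothetical SIMD width, chosen only for this illustration


def sad4(acc, a4, b4):
    """Model of one hypothetical fused operation: 4-lane SAD plus accumulate."""
    return acc + sum(abs(x - y) for x, y in zip(a4, b4))


def sad_fused(a, b):
    """Same computation, issuing one fused operation per group of LANES elements."""
    if len(a) != len(b) or len(a) % LANES != 0:
        raise ValueError("inputs must have equal length, a multiple of LANES")
    total = 0
    ops = 0
    for i in range(0, len(a), LANES):
        total = sad4(total, a[i:i + LANES], b[i:i + LANES])
        ops += 1
    return total, ops


a = [10, 20, 30, 40, 50, 60, 70, 80]
b = [12, 18, 33, 37, 50, 65, 60, 90]
print(sad_scalar(a, b))  # (35, 24)
print(sad_fused(a, b))   # (35, 2)
```

Both versions produce the same result; only the number of issued operations differs. Real cycle counts depend on the configured processor and on the compiler, so the operation counts here should be read as a rough model only.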

Accelerate Hot Spots in Applications

You don't have to go to higher MHz to improve performance. By adding instructions in our Verilog-like language (TIE), it's possible to accelerate hot spots in your applications. You can pump data through our cores with up to two 512-bit-wide data load/stores per cycle, or bypass the system bus entirely with our unique GPIO and FIFO queues.

Reduce Verification Time and Effort in the Dataplane

You can significantly reduce verification time and effort by using a Tensilica processor to map the control finite state machine (FSM) to software on the processor instead of writing your own RTL for new blocks. A Tensilica processor delivers automatic RTL generation with fine-grained clock gating, saving you months of RTL design effort. The processors can also be reprogrammed to accommodate algorithm upgrades and bug fixes, with no hardware change required. You can also create datapaths similar to hardwired logic using multi-cycle, complex functional units, and build custom, high-bandwidth data/control connections to other blocks with predictable latencies.

Preserve Backwards Compatibility

All Tensilica processors use a common base architecture that assures you of backward compatibility. This highly efficient 32-bit RISC/DSP architecture has a base configuration of under 20K gates. Our base instruction set includes powerful branch instructions, including compare-and-branch and zero-overhead loops. For bit manipulation, funnel-shift, bit-test-and-branch, and field-extract operations are available.
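For readers less familiar with these bit-manipulation operations, the following Python sketch models what a funnel shift, a field extract, and a bit test generally compute on 32-bit values. The function names and argument order are assumptions made for this illustration; they are generic behavioral models, not definitions of the corresponding Xtensa instructions or their encodings.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift_right(hi, lo, shift):
    """Concatenate hi:lo into a 64-bit value, shift right, keep the low 32 bits."""
    if not 0 <= shift <= 32:
        raise ValueError("shift must be in the range 0..32")
    combined = ((hi & MASK32) << 32) | (lo & MASK32)
    return (combined >> shift) & MASK32


def extract_field(value, shift, width):
    """Return the unsigned bit field of `width` bits starting at bit `shift`."""
    if not (0 <= shift < 32 and 1 <= width <= 32):
        raise ValueError("shift must be 0..31 and width 1..32")
    return ((value & MASK32) >> shift) & ((1 << width) - 1)


def bit_is_set(value, bit):
    """Condition a bit-test-and-branch would evaluate before branching."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    return (value >> bit) & 1 == 1


print(hex(funnel_shift_right(0x12345678, 0x9ABCDEF0, 8)))  # 0x789abcde
print(hex(extract_field(0xABCD1234, 8, 8)))                # 0x12
print(bit_is_set(0b1000, 3))                               # True
```

On a general-purpose core without such operations, each of these typically takes several shift, mask, and OR instructions, which is why having them in the base instruction set helps both code size and cycle count.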

The Xtensa architecture is flexible by design. You can use Tensilica processors for anything from a small, low-power cache-less controller to a high-performance 256-MAC DSP. Configurability of a Tensilica processor core never compromises the underlying base Xtensa instruction set, thereby ensuring availability of a robust ecosystem of third-party application software and development tools. All configurable, extensible Tensilica processors are always compatible with major operating systems, debug probes, and ICE solutions, and always come with an automatically generated, complete software development toolchain.

Complete with Matching Software Tool Chain

To ensure the availability of a robust ecosystem of third-party application software and development tools, Tensilica processors are always compatible with major operating systems, debug probes, and ICE solutions, and always come with an automatically generated, complete software development toolchain that matches all configuration options and added instructions.

Highest Code Density

Code for our 24/16-bit ISA is often 25-50% smaller than code for 32/16-bit architectures, giving you an immediate head start with code density. When you decide to use our VLIW capabilities, you can use up to 128-bit-wide instructions without the code bloat of conventional VLIW processors, because only the instructions you specify are that wide.

Low Power

Tensilica processors consistently consume less power than other licensable embedded CPUs at equivalent gate counts. To reduce power consumption, we employ techniques that are either built into the base hardware or into the configuration options, giving you more control over your system and memory resources.

Innovative I/O Bypasses the Bus for Maximum Speed

With Tensilica processors, you are no longer limited to the processing that can go through the system bus. A Tensilica processor can quickly communicate control and status information or transfer streaming data without buffering. No load/store required. Our unique RTL-like ports act like GPIO and are wires that directly connect two Tensilica processors or a Tensilica processor to external RTL. Input and output queues act like FIFOs. With their high bandwidth and low control overhead, queues allow the processor to be used in applications with extreme data rates. If you need a high-speed interface to memory, check out our Lookup interfaces for connecting RAMs for table lookups or connecting long-latency hardware computation units without going through the bus.

FLIX for Parallel Execution

The FLIX architecture adds VLIW capability to the Tensilica processor, allowing it to execute 2 to 30 operations in parallel when needed. Wide 32/64/128-bit FLIX instruction formats are seamlessly intermixed with the base Xtensa 16/24-bit instructions, so there is no mode-switch penalty.

With FLIX, the Tensilica processor can deliver the ultra-high performance characteristics of an ultra-wide instruction word processor without the negative code size implications typically found in VLIW or UVLIW processors. In fact, Tensilica processors with FLIX can deliver higher performance and smaller code size at the same time. This performance comes with very little overhead, adding only 2,000 gates to processor size for instruction decode and control.
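The code-size argument can be illustrated with a small back-of-the-envelope calculation. The sketch below is only a rough model: the instruction mix (300 narrow, 600 standard, 100 wide instructions) is entirely hypothetical, and the comparison assumes a conventional VLIW machine that spends one full 64-bit word on every instruction.

```python
def code_size_bytes(instruction_counts):
    """Total code size for a mapping of {instruction width in bits: count}."""
    for width_bits in instruction_counts:
        if width_bits % 8 != 0:
            raise ValueError("instruction widths must be whole bytes")
    return sum(width_bits // 8 * count
               for width_bits, count in instruction_counts.items())


# Hypothetical program of 1000 instructions (assumed mix, not measured data):
# wide 64-bit formats are used only for the 100 instructions that need them.
mixed_width = {16: 300, 24: 600, 64: 100}

# The same 1000 instructions if every one occupied a fixed 64-bit word.
fixed_wide = {64: 1000}

mixed_size = code_size_bytes(mixed_width)
fixed_size = code_size_bytes(fixed_wide)
print(mixed_size, fixed_size)                     # 3200 8000
print(f"{mixed_size / fixed_size:.0%} of the fixed-width size")  # 40% of the fixed-width size
```

The actual ratio for a real application depends on how much of the code benefits from wide formats, so the figures above should not be read as a measured result.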

Customize for Highest Performance, and Lowest Power

Discover How Easy It Is to Customize Xtensa Processors

You can use Xtensa processors as 32-bit RISC controllers with minimal customization for memories and interfaces. Or you can join other designers who are taking advantage of the incredible possibilities beyond simple customizations. See our Features page to explore the many options to unleash the power of Xtensa processors as DSPs or to enable other functions to match your requirements.

By selecting and configuring pre-defined elements of the architecture and by inventing completely new instructions and hardware execution units, your Xtensa processor can deliver performance levels that are orders of magnitude more efficient than other 32-bit processors. And you can do this in a fraction of the time it takes to develop and verify an RTL-based solution.

The Cadence family of Tensilica processors is designed from the start to be basic building blocks in system-on-a-chip (SoC) designs.

Consume Less Power by Adding Custom Instructions

Tensilica processors can deliver performance comparable to an RTL accelerator block while running at low operating frequencies, thus consuming less power.

A focus on total energy consumption is key. When a designer adds a few custom instructions, the extension increases the processor’s size, which in turn increases the power dissipated per clock cycle (an increase in mW/MHz). However, if the custom instructions dramatically cut the total clock cycles required to perform a given workload (the target C-code application), then the total energy consumed (energy per cycle multiplied by the total number of cycles) can be substantially reduced.

Example: a 20% increase in power dissipated per clock cycle, offset by a 3X speed up in task execution, actually reduces energy consumed by 60%.
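The arithmetic behind this example can be checked with a few lines of Python. This is a minimal sketch that assumes the 3X speed-up means the task completes in one third of the clock cycles at an unchanged clock frequency; the function name `relative_energy` is introduced here only for illustration.

```python
def relative_energy(per_cycle_increase, speedup):
    """Energy of the customized processor relative to the baseline.

    Energy = (energy per cycle) x (number of cycles). The per-cycle figure
    grows by `per_cycle_increase`, while the cycle count shrinks by `speedup`.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 + per_cycle_increase) / speedup


ratio = relative_energy(0.20, 3.0)   # 1.2 / 3 = 0.4
saving = 1.0 - ratio                 # 0.6
print(f"relative energy: {ratio:.2f}, saving: {saving:.0%}")
# relative energy: 0.40, saving: 60%
```

The same function shows where the break-even point lies: with a 20% per-cycle increase, any speed-up above 1.2X reduces total energy.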

Many Benefits of Using a Unique Processor Design

By using a unique processor, you make it much harder for competitors to copy your ideas. You get a version of a processor that no one else can buy. No one else can get the matching software tool chain unless you provide it to them so no one can program the processors in your ASIC unless you allow it. In addition, your optimized processor will deliver better performance, operate at lower clock rates, and consume less energy than the industry-standard, fixed-ISA microprocessor cores.

Processors for DSP

Many designs use a standard 32-bit processor coupled with a separate core to accelerate digital signal processing (DSP). However, using two processors means that data must transfer between the processor and DSP core over some sort of interconnect, usually a standard bus, which slows performance.

Xtensa processors don’t require a separate DSP core because DSP functions can be built into the processor itself, eliminating inter-processor data transfers over a slow processor bus. See our Audio, Voice and Speech section and our Baseband and RF Signal Processing section for examples of how we've customized our Xtensa processors for these intensive DSP functions.

Processors as RTL Alternatives

Processors can be used as alternatives to hand-coded RTL blocks by adding the same datapath elements as implemented in RTL accelerator blocks. These datapath elements include deep pipelines, parallel execution units, task-specific state registers, and wide data buses to local and global memories. This allows Tensilica processors to sustain the same high computation throughput and to support the same data interfaces as RTL hardware accelerators.

However, control of processor datapaths is very different from their RTL counterparts. Cycle-by-cycle control of a processor’s datapaths is not frozen in the hardware FSM’s state transitions. Instead, the FSM is implemented in firmware, which greatly reduces the effort needed to fix an algorithm bug or add new features. In a firmware-controlled FSM, control-flow decisions occur in branches, load and store operations implement memory accesses, and computations become explicit sequences of general-purpose and application-specific instructions.
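The following Python sketch shows, in a simplified form, what a firmware-controlled FSM looks like. It is an illustrative model only: the states, the `run_fsm` function, and the sum-of-squares task are hypothetical and were chosen to make the three ingredients visible, namely branches for control flow, loads and stores for memory accesses, and an explicit computation step.

```python
from enum import Enum, auto


class State(Enum):
    LOAD = auto()
    COMPUTE = auto()
    STORE = auto()
    DONE = auto()


def run_fsm(memory, src, length, dst):
    """Sum the squares of `length` words at memory[src:] into memory[dst].

    Returns the list of visited state names so the control flow can be inspected.
    """
    state = State.LOAD
    index = 0
    value = 0
    acc = 0
    trace = []
    while state is not State.DONE:
        trace.append(state.name)
        if state is State.LOAD:
            if index == length:              # control-flow decision: a branch
                state = State.STORE
            else:
                value = memory[src + index]  # memory access: a load
                index += 1
                state = State.COMPUTE
        elif state is State.COMPUTE:
            acc += value * value             # computation: explicit instructions
            state = State.LOAD
        elif state is State.STORE:
            memory[dst] = acc                # memory access: a store
            state = State.DONE
    return trace


memory = [1, 2, 3, 0]
trace = run_fsm(memory, src=0, length=3, dst=3)
print(memory[3])   # 14
print(trace)
```

Changing the algorithm, for example replacing the square with a different computation, means editing one line of firmware rather than redesigning and re-verifying hardware state transitions.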

An Automated Development Process Speeds Customization

Cadence has fine-tuned the patented Tensilica processor customization process, making it as foolproof and secure as possible.


Automated Customization Processor Overview

Fully Automated Hardware and Software Tools Generation

Use our Eclipse-based integrated development environment (IDE) to create and test out your customizations. When you’re ready for production, our Xtensa Processor Generator automatically creates pre-verified RTL and a complete software tool chain, including a compiler, debugger, instruction set simulator, profiler, power estimator, system models, EDA tool scripts, and more.

Your complete development toolchain is automatically adapted to all options and any custom extensions.

Highly Automated Design Tools Speed Your Adoption and Integration Processes

The Industry's Most Powerful and Complete Design Environment

For Processor Designers

Cadence Tensilica tools automate the process of generating a custom Xtensa processor along with matching software tools. These patented tools have been proven in hundreds of designs. Whether your design is for a simple controller or a complex multi-core DSP design, we have the tools you need to create successful products.

View the complete set of tools for processor designers.

For Software Developers

When you're ready to develop application code for an Xtensa processor, the Xtensa Software Developer's Toolkit provides a comprehensive collection of code generation and analysis tools that speed the development process. The Eclipse-based Xtensa Xplorer Integrated Development Environment (IDE) serves as the cockpit for the entire development experience.

View the complete set of tools for software developers.