ARM Cortex-R4(F)

The Cortex™-R4 and Cortex-R4F processors are the first deeply embedded processors based on the ARMv7 architecture. They target very high-volume, deeply embedded applications such as hard disk drives, inkjet printers, automotive safety systems and wireless modems.
The Cortex-R4 processor provides key savings in cost and power consumption for system developers, offering substantially higher performance than any other processor with similar die size. Along with the ARM1156T2-S™ and Cortex-M3 processors, the Cortex-R4 processor completes comprehensive coverage of the diverse needs of the embedded microprocessor market. Furthermore, the Cortex-R4 processor supports substantial synthesis-time configurability, which lets designers match the processor precisely to the application requirements.

The Cortex-R4 processor can run at clock speeds of up to 400 MHz on typical 90 nm processes, and the design focuses throughout on efficiency and configurability.

In collaboration with Intrinsity, ARM has also developed a high-performance implementation of the Cortex-R4 processor. The Cortex-R4X processor incorporates Intrinsity’s Fast14® 1-of-N Domino Logic (NDL) technology, which enables faster circuit speeds while minimizing power consumption and area.

The Cortex-R4 processor introduces a number of technical innovations:

  • Thumb®-2 technology. An innovation that has enabled partners to combine the minimal memory footprint of 16-bit Thumb code with the high performance of 32-bit ARM code
  • AMBA® 3 AXI™ protocol. A set of major enhancements to AMBA for high-performance on-chip interconnect. The Cortex-R4 processor integrates a 64-bit master port as well as a synthesis-optional 64-bit slave (DMA) port for direct access to the Tightly Coupled Memories (TCM)
  • A selective superscalar eight-stage pipeline that provides more than 1.6 DMIPS/MHz in an efficient, low gate count implementation
  • Non-maskable Interrupts (NMI). Many real-time applications demand this, and the Cortex-R4 supports a configurable NMI pin
  • CoreSight™ technology. A framework for complete system debug and trace. This includes the ETM-R4 embedded trace macrocell and the many other CoreSight components
  • A significantly improved local memory architecture for TCM and DMA. TCM can now be unified into a single logical address space and can run as fast as cache memory
  • ARMv6 architecture enhancements. These benefit interrupt handling and provide an improved memory protection scheme. New instructions for managing interrupts shorten the critical early interrupt handler code, and the worst-case interrupt latency is reduced to only 20 clock cycles
  • A synthesis-optional Floating Point Unit (FPU) in Cortex-R4F. Fully IEEE 754 compatible and optimised for typical embedded applications
  • Performance monitoring support. Very useful for refining and tuning a system through advanced profiling of the system performance
  • Architected support for parity and ECC in the caches and the TCMs. Soft errors are an increasing concern in embedded systems, and either parity or ECC is now essential in many systems
  • A very efficient branch prediction and prefetch unit that provides a branch prediction accuracy of more than 90% for typical C code
  • The overall aim of the Cortex-R4 processor is to provide around 40% more efficiency than the ARM9™ family whilst increasing the maximum clock speed, supporting the use of low-power, dense RAMs for cache and TCMs, and delivering an efficient Thumb-2 engine
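
To put the headline figures in perspective, the short Python sketch below converts them into absolute numbers. It uses only values quoted on this page (1.6 DMIPS/MHz, a 400 MHz clock and a 20-cycle worst-case interrupt latency) and treats them as nominal; the function names are illustrative and not part of any ARM tool.

```python
# Figures quoted in the text; treated here as nominal values.
DMIPS_PER_MHZ = 1.6          # "more than 1.6 DMIPS/MHz"
MAX_CLOCK_MHZ = 400          # "up to 400MHz on typical 90nm processes"
WORST_CASE_IRQ_CYCLES = 20   # worst-case interrupt latency, in clock cycles


def dmips(clock_mhz, dmips_per_mhz=DMIPS_PER_MHZ):
    """Dhrystone throughput at a given clock frequency."""
    return clock_mhz * dmips_per_mhz


def cycles_to_ns(cycles, clock_mhz):
    """Convert a cycle count to nanoseconds at a given clock frequency."""
    return cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    print(f"{dmips(MAX_CLOCK_MHZ):.0f} DMIPS at {MAX_CLOCK_MHZ} MHz")
    latency_ns = cycles_to_ns(WORST_CASE_IRQ_CYCLES, MAX_CLOCK_MHZ)
    print(f"{latency_ns:.0f} ns worst-case interrupt latency")
```

At 400 MHz this works out to roughly 640 DMIPS and a worst-case interrupt latency of about 50 ns. Real figures depend on the process, the memory system and the chosen configuration.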

Architectural Features
The ARM Cortex-R4 processor's pipeline is a low-cost, dual-issue, eight-stage design with advanced dynamic branch prediction, achieving 1.6 DMIPS/MHz. The Cortex-R4 processor is fully ARMv7 architecture compliant and includes:

  • Thumb-2 technology for greater performance, energy efficiency, and code density
  • Hardware divide instructions for control applications
  • Synthesis-optional FPU (Cortex-R4F)
  • Optimized Level 1 caches and TCM
  • Synthesis-optional cache controllers (with optional cache parity) and TCM ports for flexibility
  • Full wait and error support on TCM interfaces
  • Flexible configuration at synthesis time of major level-one features
    - The Memory Protection Unit (MPU) can be removed, or an eight- or twelve-region MPU selected
    - One, two or three TCM ports can be included
    - The number of breakpoints and watchpoints can be selected
  • Dynamic Branch Prediction
    - Enabled by branch target and global history buffers and a function call return stack
    - Achieves 90% accuracy across industry benchmarks
  • Single-cycle load-use penalty for access to the L1 cache and TCM
  • A single 64-bit AMBA 3 AXI master port for easy integration into the SoC interconnect
  • A synthesis-optional AMBA 3 AXI slave port to allow direct access to TCMs by DMA controllers and other processors in the system
  • Vectored Interrupt Controller (VIC) port for fast connection to interrupt management peripherals
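
The idea behind a global history buffer can be illustrated with a toy software model. The Python sketch below is a generic gshare-style predictor (2-bit saturating counters indexed by the branch address XORed with recent branch outcomes). It is an assumption-laden illustration of the general technique, not a description of the Cortex-R4 predictor, whose internal organisation is not given here; the class and function names are hypothetical.

```python
class GlobalHistoryPredictor:
    """Toy gshare-style predictor: 2-bit counters indexed by PC xor history."""

    def __init__(self, history_bits=8):
        self.mask = (1 << history_bits) - 1
        self.history = 0                            # recent outcomes, 1 = taken
        self.counters = [1] * (1 << history_bits)   # 0..3, start weakly not-taken

    def _index(self, pc):
        # Drop the two low address bits, then mix in the global history.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        """Return True if the branch at pc is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the counter for this branch, then shift in the real outcome."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly for a single branch address."""
    hits = 0
    for taken in outcomes:
        hits += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return hits / len(outcomes)


if __name__ == "__main__":
    # A loop branch taken seven times and then not taken, repeated 100 times.
    pattern = ([True] * 7 + [False]) * 100
    print(f"accuracy: {accuracy(GlobalHistoryPredictor(), 0x1000, pattern):.3f}")
```

Because the history register records the last few outcomes, the model learns the loop-exit pattern after a short warm-up, which a single per-branch counter cannot do.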

Advanced features for reliability and fault tolerance
ECC technology monitors memory accesses to detect and correct errors. If a memory error occurs, the ECC logic corrects it rather than just reporting the error and stopping the system. With embedded error correction in the Cortex-R4 processor, licensees do not need to design external ECC logic, which simplifies implementation and aids IEC 61508 certification. Careful integration of ECC within the processor pipeline achieves this without the performance penalty normally associated with this level of protection.
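
The difference between detecting and correcting an error can be shown with a textbook Hamming(7,4) code. The Python sketch below is illustrative only: this page does not say which ECC code or word width the Cortex-R4 uses, so the code choice and the function names are assumptions.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (parity at positions 1, 2, 4)."""
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the least significant bit
    bits = [0] * 8                              # index 0 unused; positions 1..7
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    return sum(bits[pos] << (pos - 1) for pos in range(1, 8))


def hamming74_decode(codeword):
    """Return (nibble, corrected_position); position 0 means no error was seen."""
    bits = [0] + [(codeword >> i) & 1 for i in range(7)]
    # The syndrome is the XOR of the positions of all set bits. It is zero
    # for a valid codeword and equals the flipped position after one bit flip.
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos
    if syndrome:
        bits[syndrome] ^= 1                     # correct the single-bit error
    nibble = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    stored = hamming74_encode(0b1011)
    corrupted = stored ^ (1 << 4)               # simulate a soft error in one bit
    value, position = hamming74_decode(corrupted)
    print(f"recovered {value:#06b}, corrected bit position {position}")
```

A parity bit alone would only flag that something went wrong; the syndrome here also identifies which bit flipped, so the read can complete with the correct data.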

Optimised floating point support
The Cortex-R4F processor's FPU performs floating-point calculations, which allow a greater dynamic range and accuracy than fixed-point calculations. The FPU is IEEE 754 compatible and is backward compatible with earlier ARM FPUs (VFP9/10/11). The implementation is optimized for the single-precision processing most commonly used in automotive and control applications, without sacrificing double-precision support. The FPU is particularly useful in sophisticated control applications, where algorithms are often modelled in an environment such as Simulink or ASCET-SD, and code is auto-generated using tools such as Real-Time Workshop Embedded Coder, ASCET-SE or dSPACE TargetLink.
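
The dynamic-range argument can be made concrete by comparing two 32-bit formats. The Python sketch below rounds values to IEEE 754 single precision and to a signed Q16.16 fixed-point format. Q16.16 is an assumed example format chosen for illustration, not something specified for the Cortex-R4F, and the helper names are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_q16_16(x):
    """Round to signed 32-bit Q16.16 fixed point, saturating at the limits."""
    raw = round(x * 65536)
    raw = max(-2**31, min(2**31 - 1, raw))
    return raw / 65536


def relative_error(approx, exact):
    """Relative error of an approximation to a non-zero exact value."""
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    for value in (1e-6, 1.5, 1e6):
        f32 = relative_error(to_float32(value), value)
        q16 = relative_error(to_q16_16(value), value)
        print(f"{value:>10g}  float32 error {f32:.2e}  Q16.16 error {q16:.2e}")
```

Single precision keeps a small relative error across this range, whereas the fixed-point format underflows to zero for the small value and saturates for the large one. This is why auto-generated control code often benefits from hardware floating point.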

Application applicability
The Cortex-R4 processor brings a strong focus on safety, with high-resolution memory protection facilities that allow tight control over independent software tasks. This is critical to automotive applications based on the OSEK standard for an open-ended architecture, the JasPar automotive software platform architecture, and the AUTOSAR runtime environment.
In addition to its suitability for the automotive market, the Cortex-R4 processor also provides significant benefits for other applications. In networking, for example, it is critical to minimize unplanned outages, as they can contribute to lost sales, increased overtime and loss of employee productivity. The Cortex-R4 processor's embedded ECC support reduces the possible causes of system failure, increasing network resiliency and helping to avoid these effects. In addition, the FPU enables imaging applications such as laser printers to take advantage of an area-optimised embedded processor.

Performance Characteristics

90 nm process                 Speed Opt       Area Opt
Standard Cells                Advantage-HS    Metro
Memories                      Advantage       Metro
Frequency* (MHz)              475             210
Area with cache (mm²)         1.74            1.00
Area without cache (mm²)      1.30            0.73
Cache Size                    8KB/8KB         8KB/8KB
Power with cache (mW/MHz)     0.32            0.21
Power w/o cache (mW/MHz)      0.26            0.16
FPU Area (mm²)                0.51**          0.28**

* Worst case conditions: 90 nm process, 0.9 V, 125 °C, slow silicon
† Typical case conditions: 90 nm process, 1 V, 25 °C, typical silicon
** If added
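
The per-MHz and per-block figures in the table can be combined into totals for each implementation. The Python sketch below does that arithmetic using only the table values; the dictionary layout and function name are illustrative. Note the assumption involved: the frequency is quoted at worst-case conditions, so multiplying it by the power figure gives only a rough estimate if the power figures were measured under different conditions.

```python
# Values copied from the 90 nm performance table.
CONFIGS = {
    "speed": {"mhz": 475, "area_cache": 1.74, "area_nocache": 1.30,
              "mw_per_mhz_cache": 0.32, "fpu_area": 0.51},
    "area":  {"mhz": 210, "area_cache": 1.00, "area_nocache": 0.73,
              "mw_per_mhz_cache": 0.21, "fpu_area": 0.28},
}


def summarize(name):
    """Derive rough totals for one implementation ('speed' or 'area')."""
    c = CONFIGS[name]
    return {
        # Power with cache at the quoted frequency, in mW.
        "power_mw": c["mw_per_mhz_cache"] * c["mhz"],
        # Area attributable to the 8KB/8KB caches, in mm².
        "cache_area_mm2": c["area_cache"] - c["area_nocache"],
        # Area of a cached Cortex-R4F (FPU added), in mm².
        "area_with_fpu_mm2": c["area_cache"] + c["fpu_area"],
    }


if __name__ == "__main__":
    for name in CONFIGS:
        s = summarize(name)
        print(f"{name}-optimized: {s['power_mw']:.1f} mW, "
              f"caches {s['cache_area_mm2']:.2f} mm², "
              f"with FPU {s['area_with_fpu_mm2']:.2f} mm²")
```

Under these assumptions the speed-optimized implementation comes out at about 152 mW and 2.25 mm² with caches and FPU, and the area-optimized one at about 44 mW and 1.28 mm².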
