Cortex-R4 was designed for implementation on advanced silicon processes from 90 nm down to 28 nm and beyond, with an emphasis on improved energy efficiency, real-time responsiveness, advanced features and ease of system design. On a 40 nm G process, the Cortex-R4 can be implemented to run at almost 1 GHz, where it delivers over 1,500 Dhrystone MIPS. The processor provides a highly flexible and efficient two-cycle local memory interface, enabling SoC designers to minimize system cost and energy consumption.
The figure below compares the Dhrystone benchmark performance of Cortex-R4 with that of classic ARM processors implemented on a 90 nm G process. Cortex-R4's configuration options can be chosen to minimize the processor's die area, which, importantly, also minimizes leakage power.
[Figure: Cortex-R4, More Performance, More Power Efficient]

Cortex-R4 has many other significant advantages over previous ARM9 and ARM11 processors:

| Feature | ARM946E-S | ARM1156T2-S | Cortex-R4 |
|---|---|---|---|
| Architecture | ARMv5TE | ARMv6T2 | ARMv7-R |
| Pre-fetch unit | No | Instruction pre-fetch and branch prediction | |
| Super-scalar execution | No | Dual-issue instructions | |
| Thumb-2 instructions | No | Yes | |
| Floating point support | VFP9 | VFP11 | Integrated (Cortex-R4F) |
| Bus interface | AMBA AHB | AMBA3 AXI | |
| Tightly-Coupled Memory (TCM) | Basic | Code and data separate | Completely flexible |
| Interrupts | ARMv5 | ARMv6 enhancements, NMI | |
| Soft error management | No | Optional parity and ECC on all RAMs | |
| Memory Protection Unit (MPU) | 8 regions | 16 regions | 12 regions |
| Minimum region size | 4 KB | 32 bytes, overlapping regions | |
| Synthesis configurability | No | I and D caches. 0 or 2 TCMs. Soft error handling. MPU | I and D caches. 0, 1, 2 or 3 TCMs. FPU. Soft error handling. MPU. AXI slave |
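
To make the MPU rows more concrete, the following is a minimal sketch of how a 32-byte minimum region size constrains region setup. It rests on assumptions that come from the general ARMv7-R protected memory model rather than from this page: region sizes are powers of two, a region's base address is aligned to its size, and the size is encoded as a field N with size = 2^(N+1). The function names are hypothetical and used only for illustration.

```python
# Illustrative sketch only; names and limits below are assumptions,
# not an API provided by ARM.
MIN_REGION_BYTES = 32        # minimum region size listed in the table
MAX_REGION_BYTES = 1 << 32   # assumed 4 GB upper bound (ARMv7-R PMSA)


def is_power_of_two(n: int) -> bool:
    """True when n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def region_size_field(size_bytes: int) -> int:
    """Return N such that size_bytes == 2 ** (N + 1) (assumed encoding)."""
    if not is_power_of_two(size_bytes):
        raise ValueError("region size must be a power of two")
    if not MIN_REGION_BYTES <= size_bytes <= MAX_REGION_BYTES:
        raise ValueError("region size out of range")
    return size_bytes.bit_length() - 2


def is_valid_region(base: int, size_bytes: int) -> bool:
    """Check the size constraints and that base is aligned to the size."""
    region_size_field(size_bytes)  # raises ValueError on a bad size
    return base % size_bytes == 0
```

For example, under these assumptions `region_size_field(32)` gives 4 and `region_size_field(4096)` gives 11, while a 4 KB region based at `0x1010` is rejected because the base is not aligned to the region size.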












