ARM The Architecture For The Digital World  

Cortex-M4 Processor

The ARM Cortex™-M4 processor is the latest embedded processor by ARM specifically developed to address digital signal control markets that demand an efficient, easy-to-use blend of control and signal processing capabilities.

The combination of high-efficiency signal processing functionality with the low-power, low-cost and ease-of-use benefits of the Cortex-M family of processors is designed to satisfy the emerging category of flexible solutions that specifically target the motor control, automotive, power management, embedded audio and industrial automation markets.

 


Energy efficient digital signal control

The Cortex-M4 offers unparalleled capability to integrate 32-bit control with leading digital signal processing techniques for markets that require very high levels of energy efficiency.

Easy-to-use technology

The Cortex-M4 makes signal processing algorithm development easy through an excellent ecosystem of software tools and the Cortex Microcontroller Software Interface Standard (CMSIS).

 

 


Cortex-M4 Features
Architecture: ARMv7E-M (Harvard)

ISA Support: Thumb® / Thumb-2

DSP Extensions:

  • Single cycle 16, 32-bit MAC
  • Single cycle dual 16-bit MAC
  • 8, 16-bit SIMD arithmetic
  • Hardware Divide (2-12 Cycles)

Floating Point Unit:

  • Single precision floating point unit
  • IEEE 754 compliant

Pipeline: 3-stage + branch speculation

Dhrystone: 1.25 DMIPS/MHz

Memory Protection: Optional 8 region MPU with sub regions and background region

Interrupts: Non-maskable Interrupt (NMI) + 1 to 240 physical interrupts

Interrupt Latency: 12 cycles

Inter-Interrupt Latency: 6 cycles

Interrupt Priority Levels: 8 to 256 priority levels

Wake-up Interrupt Controller: Up to 240 Wake-up Interrupts

Sleep Modes:

  • Integrated WFI and WFE Instructions and Sleep On Exit capability
  • Sleep & Deep Sleep Signals
  • Optional Retention Mode with ARM Power Management Kit

Bit Manipulation: Integrated Instructions & Bit Banding

Debug: Optional JTAG & Serial-Wire Debug Ports. Up to 8 Breakpoints and 4 Watchpoints.

Trace: Optional Instruction Trace (ETM), Data Trace (DWT), and Instrumentation Trace (ITM)
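
The bit banding listed under Bit Manipulation maps each bit of a 1 MB region onto its own 32-bit word in an alias region, so a single bit can be read or written with an ordinary word access. The sketch below shows only the address arithmetic. The SRAM base addresses are the usual Cortex-M3/M4 memory map values and are an assumption here (this page does not state them), and the helper name bitband_alias is hypothetical.

```c
#include <stdint.h>

/* Assumed memory map (not given on this page): SRAM bit-band region
   and the alias region that mirrors it one word per bit. */
#define SRAM_BITBAND_BASE 0x20000000u
#define SRAM_ALIAS_BASE   0x22000000u

/* Hypothetical helper: address of the alias word that controls
   bit `bit` (0-7) of the byte at `byte_addr`. */
uint32_t bitband_alias(uint32_t byte_addr, uint32_t bit)
{
    uint32_t byte_offset = byte_addr - SRAM_BITBAND_BASE;

    /* Each byte covers 8 alias words (32 bytes); each bit covers 4 bytes. */
    return SRAM_ALIAS_BASE + byte_offset * 32u + bit * 4u;
}
```

On a target device, writing 1 or 0 to the returned address would set or clear the single bit without a read-modify-write sequence in software.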

 

Cortex-M4 Performance, Power & Area

Process: 65nm low power process

Optimization Type               Speed Optimized   Area Optimized
Standard Cell Library           ARM SC12          ARM SC9
Performance (Total DMIPS)       375               185
Frequency (MHz)                 300               150
Power Efficiency (DMIPS/mW)     24                38
Area (mm²)                      0.21              0.11
FPU Area, if included (mm²)     0.08              0.06

Core area, frequency range and power consumption are dependent on process, libraries and optimizations. The numbers quoted above are illustrative of synthesized cores using low power process technologies and ARM Physical IP standard cell libraries and RAMs. Area numbers include the central core including DSP extensions, the Nested Vectored Interrupt Controller (NVIC) and Bus Matrix but not the optional components including the Memory Protection Unit, Embedded Trace Macrocell, Breakpoint Unit, Data Watchpoint Unit and Trace Port Interface Unit.

The speed optimized implementations refer to the library choices and synthesis flow decisions and tradeoffs made in order to achieve the target frequency performance. The area optimized implementations refer to the library choices and synthesis flow decisions and tradeoffs made in order to achieve a target area density.

Frequency and area measured for worst-case conditions: 65nm low power process, 1.08V, 125°C, slow silicon.

Power measured for typical-case conditions: 65nm low power process, 1.2V, 25°C, typical silicon.
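
The quoted figures can be related with simple arithmetic: total DMIPS is the Dhrystone rating multiplied by the clock frequency, and dividing total DMIPS by the power efficiency gives an implied core power. The sketch below is only a worked calculation on the published numbers, with hypothetical function names. Note that 1.25 x 150 gives 187.5, slightly above the 185 quoted for the area optimized core, and the page does not explain the difference.

```c
/* Total DMIPS = Dhrystone rating (DMIPS/MHz) x clock frequency (MHz). */
double total_dmips(double dmips_per_mhz, double freq_mhz)
{
    return dmips_per_mhz * freq_mhz;
}

/* Implied core power (mW) = total DMIPS / power efficiency (DMIPS/mW). */
double implied_power_mw(double dmips, double dmips_per_mw)
{
    return dmips / dmips_per_mw;
}
```

For the speed optimized column, total_dmips(1.25, 300.0) reproduces the quoted 375 DMIPS, and implied_power_mw(375.0, 24.0) works out to 15.625 mW. For the area optimized column, implied_power_mw(185.0, 38.0) is a little under 5 mW.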

 


Cortex-M4 signal processing technologies

The Cortex-M4 processor has been designed with a large variety of highly efficient signal processing features applicable to digital signal control markets. The Cortex-M4 processor features extended single-cycle multiply-accumulate (MAC) instructions, optimized SIMD arithmetic, saturating arithmetic instructions and an optional single precision Floating Point Unit (FPU). These features build upon the innovative technology that characterizes the ARM Cortex-M series processors.
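
To illustrate what saturating arithmetic means in practice, the following is a portable C model of a signed 32-bit saturating add: the result clamps at the largest or smallest representable value instead of wrapping around. It is a sketch for reading on a host machine, not processor code. The function name sat_add32 is hypothetical, and the model does not track the sticky saturation flag that the hardware instruction also updates.

```c
#include <stdint.h>

/* Portable model of a signed 32-bit saturating add.
   The sum is formed in 64 bits so the overflow can be detected,
   then clamped to the int32_t range. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX) {
        return INT32_MAX;   /* clamp instead of wrapping to negative */
    }
    if (sum < INT32_MIN) {
        return INT32_MIN;   /* clamp instead of wrapping to positive */
    }
    return (int32_t)sum;
}
```

In a control loop or audio filter, clamping of this kind keeps an overflowing sample at full scale rather than flipping its sign, which is why the operation is useful for signal processing.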

Harvard architecture

  • 32-bit AHB-Lite interface for instruction fetches
  • 32-bit AHB-Lite interface for data and debug accesses

Single cycle 16, 32-bit MAC

  • Wide range of MAC instructions
  • Choice of 32 or 64 bit accumulate
  • Instructions execute in a single cycle

Single cycle SIMD arithmetic

  • 4 parallel 8-bit adds or subtracts
  • 2 parallel 16-bit adds or subtracts
  • Instructions execute in a single cycle

Single cycle dual 16-bit MAC

  • 2 parallel 16 bit MAC operations
  • Choice of 32 or 64 bit accumulate
  • Instructions execute in a single cycle

Floating point unit

  • IEEE 754 standard compliant
  • Single precision floating point unit
  • Fused MAC for higher precision

Others

  • Saturating math
  • Barrel shifters
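
The SIMD and dual MAC features above treat one 32-bit register as several packed lanes. As a rough illustration, the host-side C models below show a dual 16-bit multiply-accumulate with a 32-bit accumulator and four independent 8-bit adds. All function names are hypothetical, the code is a behavioural sketch rather than the processor's instructions, and overflow flags are not modelled.

```c
#include <stdint.h>

/* Pack two signed 16-bit values into one 32-bit word (lo in bits 15:0). */
uint32_t pack16(int16_t lo, int16_t hi)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Model of a dual 16-bit MAC with 32-bit accumulate:
   acc + lo(x)*lo(y) + hi(x)*hi(y), halfwords treated as signed. */
int32_t dual_mac16(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t x_lo = (int16_t)(x & 0xFFFFu);
    int16_t x_hi = (int16_t)(x >> 16);
    int16_t y_lo = (int16_t)(y & 0xFFFFu);
    int16_t y_hi = (int16_t)(y >> 16);

    /* Sum in 64 bits, then truncate to 32 bits like a wrapping accumulator. */
    int64_t sum = (int64_t)acc + (int64_t)x_lo * y_lo + (int64_t)x_hi * y_hi;
    return (int32_t)(uint32_t)sum;
}

/* Model of 4 parallel 8-bit adds: each byte lane wraps on its own,
   with no carry into the neighbouring lane. */
uint32_t add8x4(uint32_t a, uint32_t b)
{
    uint32_t result = 0;

    for (int lane = 0; lane < 4; lane++) {
        uint32_t shift = 8u * (uint32_t)lane;
        uint32_t s = ((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu);
        result |= (s & 0xFFu) << shift;
    }
    return result;
}
```

A typical use of the dual MAC is a dot product or FIR filter over 16-bit samples, where each call consumes two samples and two coefficients.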

 

 

 

ARM Cortex-M microcontroller technologies

Each Cortex-M series processor delivers specific benefits, but all are underpinned by fundamental technologies that make Cortex-M processors ideal for a broad range of embedded applications.

 

RISC processor core

  • High performance 32-bit CPU
  • Deterministic operation
  • Low latency 3-stage pipeline

Thumb-2® technology

  • Optimal blend of 16/32-bit instructions
  • 3x smaller code size than 8-bit devices
  • No compromise on performance

Low power modes

  • Integrated sleep state support
  • Multiple power domains
  • Architected software control
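
As one illustration of architected software control, sleep behaviour on Cortex-M processors is selected through bits in the System Control Register (SCR). The sketch below models only the bit manipulation on a plain integer so it can be read and run on a host. The bit positions are the ARMv7-M ones, which this page does not list, and the helper name scr_configure is hypothetical.

```c
#include <stdint.h>

/* System Control Register bit positions (ARMv7-M; assumed, not listed here). */
#define SCR_SLEEPONEXIT (1u << 1)  /* re-enter sleep when the last ISR returns */
#define SCR_SLEEPDEEP   (1u << 2)  /* select deep sleep instead of sleep */

/* Hypothetical helper: returns the SCR value with the two sleep
   options set as requested and all other bits left unchanged. */
uint32_t scr_configure(uint32_t scr, int deep_sleep, int sleep_on_exit)
{
    scr &= ~(SCR_SLEEPONEXIT | SCR_SLEEPDEEP);
    if (deep_sleep) {
        scr |= SCR_SLEEPDEEP;
    }
    if (sleep_on_exit) {
        scr |= SCR_SLEEPONEXIT;
    }
    return scr;
}

/* On a target with CMSIS headers the equivalent would look roughly like:
     SCB->SCR |= SCB_SCR_SLEEPONEXIT_Msk;
     __WFI();
   (shown as a comment because it cannot run on a host). */
```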

Nested Vectored Interrupt Controller (NVIC)

  • Low latency, low jitter interrupt response
  • No need for assembly programming
  • Interrupt service routines in pure C

Tools and RTOS support

CoreSight debug and trace

CMSIS

The ARM Cortex Microcontroller Software Interface Standard (CMSIS) is a vendor-independent hardware abstraction layer for the Cortex-M processor series. The CMSIS enables consistent and simple software interfaces to the processor for interface peripherals, real-time operating systems, and middleware, simplifying software re-use. With CMSIS the learning curve for new microcontroller developers is reduced, shortening time to market for new products.

In-depth: Nested Vectored Interrupt Controller (NVIC)

The NVIC is an integral part of Cortex-M processors and provides the processors' outstanding interrupt handling abilities.

The Cortex-M processor uses a vector table that contains the address of the function to be executed for a particular interrupt handler. On accepting an interrupt, the processor fetches the address from the vector table.
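
The vector table idea can be pictured as an array of function addresses indexed by interrupt number. The following host-side sketch is a simplified model under stated assumptions: the handler names and the dispatch function are hypothetical, and a real Cortex-M table also holds the initial stack pointer and the system exception entries ahead of the device interrupts.

```c
#include <stddef.h>

typedef void (*handler_t)(void);

/* Hypothetical handlers, written as ordinary C functions. */
int timer_ticks = 0;
int uart_bytes = 0;

void timer_isr(void)   { timer_ticks++; }
void uart_isr(void)    { uart_bytes++; }
void default_isr(void) { /* unused interrupt: do nothing */ }

/* Simplified stand-in for the vector table: one address per interrupt. */
handler_t vector_table[] = { timer_isr, uart_isr, default_isr };

/* Models the hardware step on interrupt entry: fetch the handler
   address for `irq` from the table and branch to it. */
void dispatch(size_t irq)
{
    if (irq < sizeof vector_table / sizeof vector_table[0]) {
        vector_table[irq]();
    }
}
```

On the processor this lookup is done by hardware, so no software dispatcher like the one above is needed.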

To reduce gate count and enhance system flexibility, the Cortex-M processor uses a stack-based exception model. When an exception takes place, critical general-purpose registers are pushed onto the stack. Once the stacking and instruction fetch are completed, the interrupt service routine or fault handler is executed, followed by the automatic restoration of the registers to enable the interrupted program to resume normal execution. This approach removes the need to write the assembler wrappers that traditional C-based interrupt service routines require for stack manipulation, making application development significantly easier. The NVIC supports nesting (stacking) of interrupts, so a higher-priority interrupt can pre-empt a lower-priority one and be serviced earlier.
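
Nesting depends on comparing priorities. On Cortex-M processors a lower numeric priority value means a more urgent interrupt, and a device implements between 3 and 8 priority bits (the 8 to 256 levels in the feature table) in the most significant bits of an 8-bit field. The sketch below models that encoding and the pre-emption test. Both function names are hypothetical, and priority grouping into sub-priorities is deliberately ignored.

```c
#include <stdint.h>

/* Value stored in the 8-bit priority field for logical level `level`
   when the device implements `bits` priority bits (3 to 8).
   The implemented bits sit at the top of the byte. */
uint8_t encode_priority(unsigned level, unsigned bits)
{
    return (uint8_t)((level << (8u - bits)) & 0xFFu);
}

/* Lower value means higher urgency. An incoming interrupt pre-empts
   the active one only if it is strictly more urgent. */
int can_preempt(uint8_t incoming, uint8_t active)
{
    return incoming < active;
}
```

With 3 implemented bits, level 1 encodes as 0x20 and level 5 as 0xA0, so an interrupt at level 1 would pre-empt a handler running at level 5 but not one running at level 1.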

Complete response to interrupts in hardware

The interrupt response of a Cortex-M series processor is the number of cycles from the interrupt signal to execution of the interrupt service routine. It includes:

  • Detecting the interrupt
  • Optimal handling of back-to-back or late arriving interrupts (see below)
  • Fetching the vector address
  • Stacking corruptible registers
  • Branching to the interrupt handler

These are tasks that are performed in hardware and included in the interrupt response cycle time quoted for Cortex-M processors. In many other architectures these tasks must be performed in software in the interrupt handler, introducing latency and complexity.
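
The "corruptible registers" stacked by hardware can be described as a C structure. The layout below is the basic ARMv7-M exception frame without floating point context. It is taken from the architecture rather than from this page, so treat it as an assumption here. The type and helper names are hypothetical.

```c
#include <stdint.h>

/* Basic exception stack frame, lowest address first.
   These are the registers a C function may freely overwrite,
   plus the return address and program status. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* link register of the interrupted code */
    uint32_t pc;    /* address to resume at after the handler */
    uint32_t xpsr;  /* program status register */
} exception_frame_t;

/* Hypothetical helper, e.g. for a fault handler that wants to
   report where the program was interrupted. */
uint32_t frame_return_address(const exception_frame_t *frame)
{
    return frame->pc;
}
```

Because the hardware saves exactly the registers that the C calling convention allows a function to clobber, an interrupt handler can be an ordinary C function.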

 

Tail chaining in the NVIC

[Figure: back-to-back interrupt timing diagram]

In the case of back-to-back interrupts, traditional systems would repeat the complete state save and restore cycle twice, resulting in higher latency. The Cortex-M processors simplify moving between active and pending interrupts by implementing tail-chaining technology in the NVIC hardware. The processor state is automatically saved on interrupt entry, and restored on interrupt exit, in fewer cycles than a software implementation, significantly enhancing performance in low MHz systems.
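
A rough cycle count shows the effect of tail-chaining. The sketch below uses the 12 cycle interrupt latency and 6 cycle inter-interrupt latency from the feature table. The cost of the final register restore is not quoted on this page, so it is left as a parameter and any value passed in is an assumption. The function name is hypothetical.

```c
/* Figures from the feature table: interrupt entry and tail-chain cost. */
enum { ENTRY_CYCLES = 12, TAIL_CHAIN_CYCLES = 6 };

/* Total stacking/unstacking overhead, in cycles, for `n` interrupts
   serviced back to back. `exit_cycles` is the assumed cost of the
   register restore on exit (not quoted on this page). */
unsigned isr_overhead(unsigned n, unsigned exit_cycles, int tail_chaining)
{
    unsigned between;

    if (n == 0u) {
        return 0u;
    }
    /* Without tail-chaining, every handover is a full restore + save. */
    between = tail_chaining ? (unsigned)TAIL_CHAIN_CYCLES
                            : exit_cycles + (unsigned)ENTRY_CYCLES;

    return (unsigned)ENTRY_CYCLES + (n - 1u) * between + exit_cycles;
}
```

Assuming a 12 cycle exit, three back-to-back interrupts cost 36 overhead cycles with tail-chaining and 72 without, under this simplified model.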

 

Response of the NVIC to late arrival of higher priority interrupts

[Figure: late interrupt arrival timing diagram]

In case of the late arrival of a higher priority interrupt during the execution of the stack Push for a previous interrupt, the NVIC immediately fetches a new vector address to service the pending interrupt, as shown above. The Cortex-M NVIC provides deterministic response to these possibilities with support for late arrival and pre-emption.
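
One way to picture late arrival is that the choice of vector is made when stacking finishes, not when it starts, so whichever pending interrupt is most urgent at that moment is serviced first. The sketch below models only that selection and is not a description of the NVIC's internal logic. The function name is hypothetical, lower values are treated as more urgent, and ties go to the lower interrupt number.

```c
#include <stdint.h>

/* Returns the index of the pending interrupt with the most urgent
   (numerically lowest) priority, or -1 if nothing is pending.
   On equal priority the lower index wins. */
int select_pending(const uint8_t priority[], const uint8_t pending[], int count)
{
    int best = -1;

    for (int i = 0; i < count; i++) {
        if (pending[i] && (best < 0 || priority[i] < priority[best])) {
            best = i;
        }
    }
    return best;
}
```

If interrupt 0 triggers the stacking and a more urgent interrupt 2 becomes pending before stacking completes, calling select_pending at the end of stacking returns 2, which matches the late-arrival behaviour described above.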

 

Stack pop pre-emption by the NVIC

[Figure: pre-emption timing diagram]

Similarly, the NVIC abandons a stack Pop if an exception arrives and services the new interrupt immediately as shown above. By pre-empting and switching to the second interrupt without completing the state restore and save, the NVIC achieves lower latency in a deterministic manner.

 


System IP

System IP components are essential for building complex systems-on-chip. By using System IP components, developers can significantly reduce development and validation cycles, saving cost and reducing time to market.

Description              AMBA Bus   System IP Components
AMBA Design Kit (ADK)    AHB        ADK
AMBA DMA Controllers     AHB        DMA Controller

 

Physical IP

ARM® Physical IP Platforms deliver process optimized IP, for best-in-class implementations of the Cortex-M4 processor.

Standard Cell Logic Libraries: Available in a variety of different architectures, ARM Standard Cell Libraries support a wide performance range for all types of designs. Designers can choose between different libraries and optimize their designs for speed, power and/or area.

Memory Compilers and Registers: A broad array of silicon proven SRAM, Register File and ROM memory compilers for all types of designs ranging from performance critical to cost sensitive and low power applications.

Interface Libraries: A broad portfolio of silicon-proven Interface IP designed to meet varying system architectures and standards. General Purpose I/O, Specialty I/O, High Speed DDR and Serial Interfaces are optimized to deliver high data throughput performance with low pin counts.

 

Tools Support

All ARM processors are supported by the ARM RealView® portfolio of development tools, as well as a wide range of third party tools, operating system and EDA vendors. ARM RealView tools are unique in their ability to provide solutions that span the complete development process from concept to final product deployment.

Details of microcontroller development tools are available at the Keil website.


Resources

In this section you will find useful documentation, white papers and tutorials on ARM Cortex-M processors and related technologies.

  

Documentation for Cortex-M device users

Software development tools for Cortex-M device users

Find Cortex-M based microcontrollers

Universities

