The ARM Processor Architecture

The ARM architecture forms the basis around which every ARM processor is built. Over time the ARM architecture has evolved to include architectural features to meet the growing demand for new functionality, high performance and the needs of new and emerging markets.

The ARM architecture supports implementations across a wide range of performance points and is established as the leading architecture in many market segments. This range runs from very small implementations of ARM processors to very efficient implementations of advanced designs that use state-of-the-art micro-architecture techniques. Implementation size, performance, and low power consumption are key attributes of the ARM architecture.

Architecture extensions were developed to provide support for Java acceleration (Jazelle®), security (TrustZone®), SIMD, and Advanced SIMD (NEON™) technologies. The ARMv8-A architecture adds a Cryptographic extension as an optional feature.

The ARM architecture is generally described as a Reduced Instruction Set Computer (RISC) architecture, as it incorporates these typical RISC architecture features:

  • A uniform register file and a load/store architecture, where data-processing operations act only on register contents, not directly on memory contents.
  • Simple addressing modes, with all load/store addresses determined from register contents and instruction fields only.

Enhancements to a basic RISC architecture enable ARM processors to achieve a good balance of high performance, small code size, low power consumption and small silicon area.

The ARMv8 Architecture

ARMv8-A introduces 64-bit architecture support to the ARM architecture and includes:

  • 64-bit general purpose registers, SP (stack pointer) and PC (program counter)
  • 64-bit data processing and extended virtual addressing
  • Two main execution states:
    • AArch64 - The 64-bit execution state including exception model, memory model, programmers' model and instruction set support for that state
    • AArch32 - The 32-bit execution state including exception model, memory model, programmers' model and instruction set support for that state

The execution states support three key instruction sets:

  • A32 (or ARM): a 32-bit fixed-length instruction set, enhanced through the different architecture variants. Part of the 32-bit architecture execution environment now referred to as AArch32.
  • T32 (Thumb): introduced as a 16-bit fixed-length instruction set and subsequently enhanced to a mixed-length 16- and 32-bit instruction set with the introduction of Thumb-2 technology. Part of the 32-bit architecture execution environment now referred to as AArch32.
  • A64: a 32-bit fixed-length instruction set that offers similar functionality to the ARM and Thumb instruction sets. Introduced with ARMv8-A, it is the AArch64 instruction set.

ARM ISAs are constantly improving to meet the increasing demands of leading-edge application developers, while retaining the backwards compatibility necessary to protect investment in software development. ARMv8-A makes some additions to A32 and T32 to maintain alignment with the A64 instruction set.

 
 


ARM, generically known as A32, is a fixed-length (32-bit) instruction set. It is the base 32-bit ISA used in the ARMv4T, ARMv5TEJ and ARMv6 architectures.  In these architectures it is used in applications requiring high performance, or for handling hardware exceptions such as interrupts and processor start-up.

The ARM ISA is also supported in the Cortex™-A and Cortex-R profiles of the Cortex architecture for performance critical applications, and for legacy code.  Most of its functionality is subsumed into the Thumb instruction set with the introduction of Thumb-2 technology. Thumb (T32) benefits from improved code density.

ARM instructions are 32 bits wide and are aligned on 4-byte boundaries.

Most ARM instructions can be "conditionalised" to only execute when previous instructions have set a particular condition code. This means that instructions only have their normal effect on the programmers’ model operation, memory and coprocessors if the N, Z, C and V flags in the Application Program Status Register satisfy a condition specified in the instruction. If the flags do not satisfy this condition, the instruction acts as a NOP, that is, execution advances to the next instruction as normal, including any relevant checks for exceptions being taken, but has no other effect.  This conditionalisation of instructions allows small sections of if- and while-statements to be encoded without the use of branch instructions.  
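The conditional execution described above can be modelled in a few lines. This is only an illustrative sketch in Python, not ARM's reference pseudocode: the mnemonics (EQ, NE, GE and so on) are the standard A32 condition names, which the text above does not list, and the names `Flags`, `condition_passed` and `execute_conditionally` are hypothetical helpers chosen for this example.

```python
from typing import NamedTuple


class Flags(NamedTuple):
    """The four condition flags held in the APSR."""
    n: bool  # Negative
    z: bool  # Zero
    c: bool  # Carry
    v: bool  # Overflow


# Each condition mnemonic maps to a predicate over the flags.
CONDITIONS = {
    "EQ": lambda f: f.z,                       # equal
    "NE": lambda f: not f.z,                   # not equal
    "CS": lambda f: f.c,                       # carry set
    "CC": lambda f: not f.c,                   # carry clear
    "MI": lambda f: f.n,                       # minus (negative)
    "PL": lambda f: not f.n,                   # plus (positive or zero)
    "VS": lambda f: f.v,                       # overflow set
    "VC": lambda f: not f.v,                   # overflow clear
    "HI": lambda f: f.c and not f.z,           # unsigned higher
    "LS": lambda f: (not f.c) or f.z,          # unsigned lower or same
    "GE": lambda f: f.n == f.v,                # signed greater or equal
    "LT": lambda f: f.n != f.v,                # signed less than
    "GT": lambda f: (not f.z) and f.n == f.v,  # signed greater than
    "LE": lambda f: f.z or f.n != f.v,         # signed less or equal
    "AL": lambda f: True,                      # always
}


def condition_passed(cond, flags):
    """Return True if the flags satisfy the named condition."""
    return bool(CONDITIONS[cond](flags))


def execute_conditionally(cond, flags, operation, state):
    """Apply operation to state if cond holds; otherwise act as a NOP."""
    if condition_passed(cond, flags):
        return operation(state)
    return state


# Flags as they would stand after comparing two equal values (Z and C set).
flags = Flags(n=False, z=True, c=True, v=False)
print(execute_conditionally("EQ", flags, lambda r: r + 1, 5))  # 6
print(execute_conditionally("NE", flags, lambda r: r + 1, 5))  # 5
```

The two calls at the end behave like the two arms of a small if-statement: one takes effect and the other falls through as a NOP, with no branch involved.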

The condition codes are:

  Condition Code   Meaning
  N                Negative condition code, set to 1 if the result is negative
  Z                Zero condition code, set to 1 if the result of the instruction is 0
  C                Carry condition code, set to 1 if the instruction results in a carry condition
  V                Overflow condition code, set to 1 if the instruction results in an overflow condition
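As a rough illustration of how these four flags arise, the sketch below models a 32-bit flag-setting addition in Python. The function name `add_with_flags` is hypothetical, and the code assumes the usual two's-complement rules (carry means the unsigned sum did not fit in 32 bits, overflow means the signed sum did not fit); it is not taken from ARM's own pseudocode.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """Add two 32-bit values and return (result, flags) for the addition."""
    a &= MASK32
    b &= MASK32
    full = a + b              # unbounded Python integer sum
    result = full & MASK32    # what a 32-bit register would hold

    flags = {
        "N": bool(result >> 31),   # top bit of the result is set
        "Z": result == 0,          # result is exactly zero
        "C": full > MASK32,        # unsigned sum carried out of bit 31
        # Signed overflow: both operands differ in sign from the result.
        "V": bool(((a ^ result) & (b ^ result)) >> 31),
    }
    return result, flags


# Largest positive signed value plus one wraps to a negative value.
print(add_with_flags(0x7FFFFFFF, 1))
# All ones plus one wraps to zero with a carry out.
print(add_with_flags(0xFFFFFFFF, 1))
```

The first call sets N and V (signed overflow into the sign bit), while the second sets Z and C (unsigned wrap to zero).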

 

 


Cost-sensitive embedded control applications such as cell phones, disk drives, modems and pagers need 32-bit performance and address space at minimal cost in memory footprint.

The Thumb (T32) instruction set provides a subset of the most commonly used 32-bit ARM instructions which have been compressed into 16-bit wide opcodes. On execution, these 16-bit instructions are decompressed transparently to full 32-bit ARM instructions in real time without performance loss.

Thumb offers the designer:

  • Excellent code density for minimal system memory size and cost
  • 32-bit performance from 8- or 16-bit memory on an 8- or 16-bit bus for low system cost
  • Plus the established ARM features:
    • Industry-leading MIPS/Watt for maximum battery life and RISC performance
    • Small die size for integration and minimum chip cost
    • Global multi-partner sourcing for secure supply.

Designers can use both the 16-bit Thumb and 32-bit ARM instruction sets, and therefore have the flexibility to emphasize performance or code size at the subroutine level as their applications require.

The Thumb ISA is widely supported by the ARM ecosystem, including a complete Windows software development environment as well as development and evaluation cards.

Improved Code Density with Performance and Power Efficiency

Thumb-2 technology made Thumb a mixed (32- and 16-bit) length instruction set, and is the instruction set common to all ARMv7 compliant ARM Cortex implementations. Thumb-2 provides enhanced levels of performance, energy efficiency, and code density for a wide range of embedded applications.
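Because Thumb-2 mixes 16-bit and 32-bit encodings, a decoder has to tell the two apart from the first halfword alone. The sketch below shows one way to model that rule in Python: a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction, and anything else is a complete 16-bit instruction. The function names are hypothetical, and the code only splits a stream into instructions; it does not decode them.

```python
def t32_instruction_length(first_halfword):
    """Return the length in bytes (2 or 4) of a Thumb instruction."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_t32_stream(halfwords):
    """Group a list of 16-bit halfwords into per-instruction tuples.

    A truncated stream (a 32-bit first halfword with nothing after it)
    yields a one-element tuple for the final entry.
    """
    instructions = []
    i = 0
    while i < len(halfwords):
        if t32_instruction_length(halfwords[i]) == 4:
            instructions.append(tuple(halfwords[i:i + 2]))
            i += 2
        else:
            instructions.append((halfwords[i],))
            i += 1
    return instructions


# 0x4770 and 0xE7FE are 16-bit encodings; 0xF000 starts a 32-bit one.
stream = [0x4770, 0xF000, 0xF800, 0xE7FE]
for instruction in split_t32_stream(stream):
    print([hex(h) for h in instruction])
```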

The technology is backwards compatible with existing ARM and Thumb solutions, while significantly extending the features available in the Thumb instruction set, allowing more of the application to benefit from the best-in-class code density of Thumb. For performance-optimised code, Thumb-2 technology uses 31 percent less memory to reduce system cost, while providing up to 38 percent higher performance than existing high-density code, which can be used to prolong battery life or to enrich the product feature set.

 


A64 is a new 32-bit fixed-length instruction set to support the AArch64 execution state. The following is a summary of the A64 ISA features:
  • Clean decode table based on 5-bit register specifiers
  • Instruction semantics broadly the same as in AArch32
  • 31 general purpose 64-bit registers accessible at all times
    • No modal banking of GP registers - Improved performance and energy
  • Program counter (PC) and Stack pointer (SP) not general purpose registers
  • Dedicated zero register available for most instructions
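The 5-bit register specifiers can be seen by pulling the fields out of an instruction word. The Python sketch below is illustrative only: it assumes the field positions used by data-processing (register) encodings such as ADD (shifted register), where Rd is bits [4:0], Rn is bits [9:5], Rm is bits [20:16] and bit 31 selects 64-bit operation. Not every A64 instruction has all of these fields, and the function name is hypothetical.

```python
def decode_register_fields(word):
    """Extract the register specifier fields from a 32-bit A64 word.

    Assumes a data-processing (register) style encoding; other
    instruction classes place or omit these fields differently.
    """
    return {
        "sf": (word >> 31) & 1,     # 1 selects 64-bit operands
        "Rm": (word >> 16) & 0x1F,  # second source register
        "Rn": (word >> 5) & 0x1F,   # first source register
        "Rd": word & 0x1F,          # destination register
    }


# 0x8B020020 is the encoding of ADD X0, X1, X2.
print(decode_register_fields(0x8B020020))
# {'sf': 1, 'Rm': 2, 'Rn': 1, 'Rd': 0}
```

Five bits give 32 possible values: 0 to 30 name the 31 general purpose registers, and the value 31 selects either the zero register or the stack pointer, depending on the instruction.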

Key differences from A32 are:

  • New instructions to support 64-bit operands
    • Most instructions can have 32-bit or 64-bit arguments
  • Addresses assumed to be 64 bits in size
    • LP64 and LLP64 are the primary data models targeted
  • Far fewer conditional instructions than in AArch32
    • Conditional {branches, compares, selects}
  • No arbitrary-length load/store multiple instructions
    • Load/store pair (LDP/STP) instructions added for handling pairs of registers
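The two data models named in the list differ only in the size of the C type `long`. The short Python table below spells this out as a sketch; the sizes are the conventional ones for LP64 and LLP64, while the names `DATA_MODELS` and `differing_types` are hypothetical and exist only for this example.

```python
# Conventional C type sizes in bytes under each data model.
DATA_MODELS = {
    "LP64":  {"int": 4, "long": 8, "long long": 8, "pointer": 8},
    "LLP64": {"int": 4, "long": 4, "long long": 8, "pointer": 8},
}


def differing_types(model_a, model_b):
    """Return the type names whose sizes differ between two models."""
    a, b = DATA_MODELS[model_a], DATA_MODELS[model_b]
    return sorted(name for name in a if a[name] != b[name])


print(differing_types("LP64", "LLP64"))  # ['long']
```

In both models pointers are 64 bits wide, which matches the assumption that addresses are 64 bits in size.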

A64 Advanced SIMD and scalar floating-point support are semantically similar to the A32 support; they share a floating-point/vector register file, V0 to V31. A64 provides three major functional enhancements:

  • More 128-bit registers: 32 x 128-bit wide registers
    • Can be viewed as 64-bit wide registers
  • Advanced SIMD supports DP floating-point execution
  • Advanced SIMD supports full IEEE 754 execution
    • Rounding modes, denorms, NaN handling

There are some additional floating-point instructions for IEEE 754-2008:

  • MaxNum/MinNum instructions
  • Float to Integer conversions with RoundTiesAway

The register packing model in A64 is also different from A32:

  • All vector registers 128 bits wide, Vx[127:0]
    • Double precision scalar floating point uses Vx[63:0]
    • Single precision scalar floating point uses Vx[31:0]
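The packing of scalar values into the low bits of a 128-bit vector register can be sketched in Python with the standard `struct` module. This is a simplified model under stated assumptions: the register is represented as 16 little-endian bytes, so byte 0 holds bits [7:0], and the function name `scalar_views` is hypothetical.

```python
import struct


def scalar_views(v_register):
    """Read the scalar views of a 128-bit register given as 16 bytes.

    Returns (double_view, single_view), taken from bits [63:0] and
    bits [31:0] of the register respectively.
    """
    if len(v_register) != 16:
        raise ValueError("a vector register is 16 bytes wide")
    (double_view,) = struct.unpack_from("<d", v_register, 0)
    (single_view,) = struct.unpack_from("<f", v_register, 0)
    return double_view, single_view


# A double placed in bits [63:0], upper half of the register zero.
reg_with_double = struct.pack("<d", 1.5) + bytes(8)
print(scalar_views(reg_with_double)[0])  # 1.5

# A single placed in bits [31:0], remaining bits zero.
reg_with_single = struct.pack("<f", 2.5) + bytes(12)
print(scalar_views(reg_with_single)[1])  # 2.5
```

Both views start at bit 0 of the same register, so the single precision view overlaps the low half of the double precision view rather than occupying a separate slot.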
