armcc/tcc: Placing C variables at specific addresses - memory-mapped peripherals
Applies to: Compilers, Software Development Toolkit (SDT)

Description
In most ARM embedded systems, peripherals are located at specific addresses in memory. It is often convenient to map a C variable onto each register of a memory-mapped peripheral, and then read/write the register via a pointer. In your code, you will need to consider not only the size and address of the register, but also its alignment in memory.

Basic Concepts
The simplest way to implement memory-mapped variables is to use pointers to fixed addresses. If the memory can be changed by 'external factors' (e.g. by some hardware), it must be declared 'volatile'.

Consider a simple example:

    #define PORTBASE 0x40000000
    unsigned int volatile * const port = (unsigned int *) PORTBASE;

Here 'port' is a constant pointer to a volatile unsigned integer, so we can access the memory-mapped register using:

    *port = value;    /* write to port */
    value = *port;    /* read from port */

The use of 'volatile' ensures that the compiler always carries out the memory accesses, rather than optimizing them out (for example if the access is in a loop).
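The effect of 'volatile' is easiest to see in a polling loop. The following is a minimal sketch, assuming a hypothetical status register with a 'ready' bit; the names wait_ready and READY_BIT and the poll limit are illustrative only and do not come from any ARM header. The register is passed in as a pointer so that the function can also be exercised on a host against an ordinary variable.

```c
#include <assert.h>
#include <stddef.h>

#define READY_BIT 0x1u   /* hypothetical bit position, for illustration only */

/* Poll *status until READY_BIT is set or max_polls reads have been made.
 * Returns 1 if the bit was seen set, 0 on timeout.
 * Because status points to a volatile object, the compiler must perform
 * a real load on every iteration instead of hoisting it out of the loop. */
int wait_ready(volatile unsigned int *status, unsigned int max_polls)
{
    unsigned int i;

    assert(status != NULL);
    for (i = 0; i < max_polls; i++) {
        if (*status & READY_BIT) {
            return 1;
        }
    }
    return 0;
}
```

On the target, the argument would be a fixed-address pointer such as 'port' above. Without 'volatile', an optimizing compiler would be free to read the register once and then spin on the cached value.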

This approach can be used to access 8, 16 or 32 bit registers, but be sure to declare the variable with the appropriate type for its size, i.e. unsigned int for 32-bit registers, unsigned short for 16-bit, and unsigned char for 8-bit. The compiler will then generate the correct single load/store instructions, i.e. LDR/STR, LDRH/STRH or LDRB/STRB. See the FAQ 'What type of memory access does armcc/tcc use for different C constructs?'

You should also ensure that the memory-mapped registers lie on appropriate address boundaries, e.g. either all word-aligned, or aligned on their natural size boundaries, i.e. 16-bit registers must be aligned on half-word addresses (but note that ARM recommends that all registers, whatever their size, be aligned on word boundaries - see later).

You can also use #define to simplify your code, e.g.:

    #define PORTBASE  0x40000000    /* Counter/Timer Base */
    #define PortLoad  ((volatile unsigned int *) PORTBASE)             /* 32 bits */
    #define PortValue ((volatile unsigned short *)(PORTBASE + 0x04))   /* 16 bits */
    #define PortClear ((volatile unsigned char *)(PORTBASE + 0x08))    /* 8 bits */

    void init_regs(void)
    {
        unsigned int int_val;
        unsigned short short_val;
        unsigned char char_val;

        *PortLoad = (unsigned int) 0xF00FF00F;
        int_val = *PortLoad;
        *PortValue = (unsigned short) 0x0000;
        short_val = *PortValue;
        *PortClear = (unsigned char) 0x1F;
        char_val = *PortClear;
    }

results in the following (interleaved) code:

    void init_regs(void)
    {
        unsigned int int_val;
        unsigned short short_val;
        unsigned char char_val;

        *PortLoad = (unsigned int) 0xF00FF00F;
    0x00008054 ldr r1,0x00008080 ; = #0xf00ff00f
    0x00008058 mov r0,#0x40000000
    0x0000805c str r1,[r0,#0]
        int_val = *PortLoad;
    0x00008060 ldr r1,[r0,#0]
        *PortValue = (unsigned short) 0x0000;
    0x00008064 mov r1,#0
    0x00008068 strh r1,[r0,#4]
        short_val = *PortValue;
    0x0000806c ldrh r1,[r0,#4]
        *PortClear = (unsigned char) 0x1F;
    0x00008070 mov r1,#0x1f
    0x00008074 strb r1,[r0,#8]
        char_val = *PortClear;
    0x00008078 ldrb r0,[r0,#8]
    }
    0x0000807c mov pc,r14

ARM recommendations
ARM recommends word alignment of peripheral registers even if they are 16-bit or 8-bit peripherals. In a little-endian system, the peripheral databus can connect directly to the least significant bits of the ARM databus and there is no need to multiplex (or duplicate) the peripheral databus onto high bits of the ARM databus. In a big-endian system, the peripheral databus can connect directly to the most significant bits of the ARM databus and there is no need to multiplex (or duplicate) the peripheral databus onto low bits of the ARM databus.

ARM's AMBA APB bridge uses the above technique to simplify the bridge design. The result of this is that only word-aligned addresses should be used (whether for byte, halfword or word transfers), and a read will return garbage on any bits which are not connected to the peripheral (in fact you will get the old values on the bus, because bus keepers hold the level). So, if a 32-bit word is read from a 16-bit peripheral, the top 16 bits of the register value must be cleared before use.

For example, to access some 16-bit peripheral registers on 16-bit alignment, you might write:

    volatile unsigned short u16_IORegs[20]; 

This is fine providing your peripheral controller has the logic to route the peripheral databus to the high part (D31..D16) of the ARM databus as well as the low part (D15..D0) depending upon which address you are accessing. You should check if this multiplexing logic exists or not in your design (the standard ARM APB bridge does not support this).
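Where that multiplexing logic is absent and a 16-bit register sits on a word-aligned address, the masking step described above can be sketched as follows. This is a minimal illustration, assuming a little-endian system in which the peripheral drives D15..D0; read_reg16 is a hypothetical helper name, not part of any ARM library.

```c
#include <assert.h>
#include <stddef.h>

/* Read a 16-bit peripheral register that sits on a word-aligned address,
 * using a 32-bit access, and discard the undriven upper bits.
 * Assumes a little-endian system where the peripheral drives D15..D0. */
unsigned short read_reg16(const volatile unsigned int *reg)
{
    unsigned int raw;

    assert(reg != NULL);
    raw = *reg;                              /* single 32-bit load */
    return (unsigned short)(raw & 0xFFFFu);  /* keep D15..D0 only */
}
```

Keeping the mask in one helper means callers cannot forget to clear the top half before using the value.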

Alignment of registers
If you wish to map 16-bit registers on 32-bit alignment as recommended, then you could use:

1) volatile unsigned short u16_IORegs[40];
...and only access even-numbered elements - you will need to double the register number, e.g. to access reg 4 you could use:

        x = u16_IORegs[8];
        u16_IORegs[8] = newval;

2) volatile unsigned int u32_IORegs[20];
...where the registers are accessed as 32-bit full-width. But a simple peripheral controller such as ARM's AMBA APB bridge will read garbage into the top bits of the ARM register from the signals that are not connected to the peripheral (D31..D16 for a little-endian system). So, when such a peripheral is read, the value must be cast to an 'unsigned short' to get the compiler to discard the upper 16 bits,

e.g. access reg 4 using:

        x = (unsigned short)u32_IORegs[4];
        u32_IORegs[4] = newval;

3) use a struct

allows descriptive names to be used (more maintainable and legible)
allows different register widths to be accommodated

Note: padding should be made explicit rather than relying on automatic padding added by the compiler, e.g.

    volatile struct PortRegs {
        unsigned short ctrlreg;   /* offset 0 */
        unsigned short dummy1;    /* explicit padding */
        unsigned short datareg;   /* offset 4 */
        unsigned short dummy2;    /* explicit padding */
        unsigned int data32reg;   /* offset 8 */
    } iospace;

    x = iospace.ctrlreg;
    iospace.ctrlreg = newval;

Please note that peripheral locations should *not* be accessed using __packed structs (where unaligned members are allowed and there is no internal padding), or using C bitfields. This is because it is not possible to control the number and type of memory access that is being performed by the compiler. The result is code which is non-portable, has undesirable side-effects, and will not work as intended. The recommended way of accessing peripherals is through explicit use of architecturally-defined types such as int, short, char on their natural alignment.
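Because the struct approach depends on every member landing at the intended offset, it can be worth having the compiler verify the layout. The following is a minimal sketch using offsetof and a negative-array-size typedef, which works even on older C compilers; LAYOUT_CHECK is a hypothetical helper macro, and the expected offsets assume a 16-bit short and a 32-bit int, as on ARM.

```c
#include <stddef.h>   /* offsetof */

struct PortRegs {
    unsigned short ctrlreg;    /* offset 0 */
    unsigned short dummy1;     /* explicit padding */
    unsigned short datareg;    /* offset 4 */
    unsigned short dummy2;     /* explicit padding */
    unsigned int   data32reg;  /* offset 8 */
};

/* Compile-time check: the array size becomes -1, which is illegal,
 * if the condition is false. LAYOUT_CHECK is a hypothetical helper. */
#define LAYOUT_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

LAYOUT_CHECK(ctrlreg_at_0,   offsetof(struct PortRegs, ctrlreg)   == 0);
LAYOUT_CHECK(datareg_at_4,   offsetof(struct PortRegs, datareg)   == 4);
LAYOUT_CHECK(data32reg_at_8, offsetof(struct PortRegs, data32reg) == 8);
LAYOUT_CHECK(size_is_12,     sizeof(struct PortRegs) == 12);
```

If a member is later added or resized without adjusting the padding, the build fails at the offending check instead of the code silently addressing the wrong register.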

Mapping variables to specific addresses
Memory-mapped registers can be accessed from C in two ways: either by forcing an array or struct variable to a specific address, or by using a pointer to an array or struct (see below for details). Both generate efficient code - which one to use is a matter of personal preference.

1) Forcing struct/array to a specific address

The 'variable' should be declared in a file on its own. When it is compiled, the object code for this file will contain only data. This data can be placed at a specified address using the ARM SDT 'scatter loading' mechanism. This is the recommended method for placing all AREAs (code, data, etc.) at required locations in the memory map.

a) Create a file, e.g. 'iovar.c' which contains a declaration of the variable/array/struct, e.g.

volatile unsigned short u16_IORegs[20]; 

or

struct {
    volatile unsigned reg1;
    volatile unsigned reg2;
} mem_mapped_reg;

b) Create a scatter load description file (called "scatter.txt") containing the following:

ALL 0x8000
{
    ALL 0x8000
    {
        * (+RO,+RW,+ZI)
    }
}
IO 0x40000000
{
    IO 0x40000000
    {
        iovar.o (+ZI)
    }
}

The scatter load file must be specified at link time to the linker using the "-scatter scatter.txt" command line option. This creates 2 different regions in your image: 'ALL' and 'IO'. The zero-init area from iovar.o (containing your array) goes into the IO area located at 0x40000000. All code (RO) and data areas (RW and ZI) from other object files go into the 'ALL' region which starts at 0x8000.

If you have more than one group of variables (i.e. more than one set of memory-mapped registers), you would need to define each group of variables as a separate execution region (though they could all lie within a single load region). To do this, each group of variables would need to be defined in a separate module.

The benefit of using a scatter description file is that all the (target-specific) absolute addresses chosen for your devices, code and data are located in one file, making maintenance easy. Furthermore, if you decide to change your memory map (e.g. if peripherals are moved), you do not need to rebuild your entire project - you only need to re-link the existing objects.
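The rest of the program would then reach the scatter-loaded array through an ordinary extern declaration. The following is a minimal sketch; the header name iovar.h and the accessors io_write16 and io_read16 are hypothetical. Here the declaration and the definition appear in one translation unit so that the example can be built and run on a host, whereas on the target the definition would live alone in iovar.c.

```c
#include <assert.h>

#define IO_REG_COUNT 20   /* matches the array size used in iovar.c */

/* What a shared header (hypothetical name: iovar.h) would declare. */
extern volatile unsigned short u16_IORegs[IO_REG_COUNT];

/* What iovar.c alone would contain on the target. Being uninitialised,
 * the array lands in the ZI area that the scatter file places at
 * 0x40000000. */
volatile unsigned short u16_IORegs[IO_REG_COUNT];

/* Hypothetical accessors used by the rest of the program. */
void io_write16(unsigned int index, unsigned short value)
{
    assert(index < IO_REG_COUNT);
    u16_IORegs[index] = value;   /* one 16-bit store (STRH on ARM) */
}

unsigned short io_read16(unsigned int index)
{
    assert(index < IO_REG_COUNT);
    return u16_IORegs[index];    /* one 16-bit load (LDRH on ARM) */
}
```

No address appears anywhere in the C source; the placement at 0x40000000 comes entirely from the scatter description file.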

For a description of scatterloading, SDT 2.11/2.11a users should refer to the SDT 2.11 User Guide, section 14, 'Placing Code and Data in Memory' and Application Note 48, Scatter Loading, in the ARM Application Notes.

SDT 2.50/2.51 users should refer to SDT 2.50 Reference Guide, Chapter 6, 'Linker' and SDT 2.50 User Guide, Chapter 10, 'Writing Code for ROM'.

2) Using a pointer to struct/array

    struct PortRegs {
        unsigned short ctrlreg;   /* offset 0 */
        unsigned short dummy1;
        unsigned short datareg;   /* offset 4 */
        unsigned short dummy2;
        unsigned int data32reg;   /* offset 8 */
    };

    volatile struct PortRegs *iospace = (struct PortRegs *)0x40000000;
    x = iospace->ctrlreg;
    iospace->ctrlreg = newval;

The pointer could be either local or global. If global, to avoid the base pointer being reloaded after function calls, make iospace a constant pointer to the struct by changing its definition to:

    volatile struct PortRegs * const iospace = (struct PortRegs *)0x40000000; 

Code efficiency
The ARM compiler will normally use a 'base register' plus the immediate offset field available in the load/store instruction to compile struct member or specific array element access.

In the ARM instruction set, LDR/STR word/byte have a 4Kbyte range, but LDRH/STRH has a smaller immediate offset of 256 bytes. The Thumb instruction set is much more restricted - LDR/STR has a range of 32 words, LDRH/STRH has a range of 32 halfwords, LDRB/STRB has a range of 32 bytes. Hence, it is important to group related peripheral registers near to each other if possible. The compiler will generally do a good job of minimising the number of instructions required to access the array elements or structure members by using base registers.

There is a choice between one big C struct/array for the whole I/O space and smaller per-peripheral structs. In fact there isn't much difference in efficiency - the big struct might be a benefit if you are using ARM code where a base pointer can have a 4Kbyte range (for word/byte access) and the entire I/O space is <4Kbyte - but arguably it is more elegant to have one struct per peripheral. Smaller per-peripheral structs are more maintainable.
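A per-peripheral struct might look like the following minimal sketch. It assumes a hypothetical counter/timer block laid out like the earlier PortLoad/PortValue/PortClear example; the names TimerRegs and timer_start are illustrative only. The base pointer is passed as a parameter, so the compiler can keep it in a base register and reach every member with a small immediate offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter/timer block laid out like the PortLoad/PortValue/
 * PortClear example: 32-bit, 16-bit and 8-bit registers, each on a word
 * boundary, with the padding written out explicitly. */
struct TimerRegs {
    unsigned int   load;      /* offset 0, 32 bits */
    unsigned short value;     /* offset 4, 16 bits */
    unsigned short pad1;
    unsigned char  clear;     /* offset 8, 8 bits  */
    unsigned char  pad2[3];
};

/* Load the counter, then write the clear register.
 * The base pointer is a parameter so the same code serves several
 * instances of the peripheral, each with its own base address. */
void timer_start(volatile struct TimerRegs *timer, unsigned int count)
{
    assert(timer != NULL);
    timer->load  = count;   /* word store */
    timer->clear = 0x1F;    /* byte store */
}
```

On the target, the argument might be a cast fixed address such as (volatile struct TimerRegs *)0x40000000; on a host, an ordinary struct instance can stand in for the hardware during testing.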





