On ARM11 and Cortex-A8 cores you can prefetch instructions into cache and then lock them down so that they will not be evicted. The cache that is used for prefetch and lockdown is:
- For ARM11s, the L1 I-cache
- For Cortex-A8s, the unified L2 cache

Note that the minimum granularity that can be locked down is a single cache way (usually a quarter of the total size of the relevant cache level). Cache lockdown allows one or more ways to behave in a similar way to a block of Tightly Coupled Memory (TCM).

This FAQ details the pros and cons of prefetch and lockdown and explains how to carry out the prefetch and lockdown procedure. Some example code is also provided.

Why would I want to lock instructions into cache?

Prefetching a block of instructions and then locking down the relevant cache way means that you can be sure the block is always cached, which gives very consistent performance when executing it. Cache lockdown may be beneficial if:
- You have a particular block of code that takes up a large proportion of the overall execution time, and you therefore want to ensure that these instructions are always cached.
- You have a particular block of code for which you need to be able to guarantee a consistent and/or fast execution time.

Note that if your core has TCMs then these should be used in preference to cache lockdown.

What are the downsides?

Cache ways that have been locked down are no longer available for automated caching. For example, if you lock down one of 4 cache ways, then each time a new cache linefill takes place, any given line already cached in the matching cache set has a 1 in 3 chance of being evicted, rather than 1 in 4 (assuming all lines in that set are already filled). You should therefore always benchmark your code with and without cache lockdown to make sure that performance is not negatively affected.

You should also ensure that the amount of space wasted within locked down cache ways is minimised. For example:
- You have a 16KB cache
- You have 15KB of code in a region marked as cacheable
=> During execution, all of the 15KB of code will fit into the cache. After executing the code once it will be in the cache for future executions.

Now imagine:
- You prefetch 2KB of that cacheable code into a cache way and lock down that cache way
- You are left with 12KB of cache and 13KB of cacheable code that has not been prefetched and locked down
=> During execution, if all 13KB of code is used then instructions will constantly be swapped in and out of the cache, because there is now less usable space in the cache than there is cacheable code. Solution: prefetch a full 4KB of code into the cache way that will be locked down, so that no locked space is wasted and only 11KB of code competes for the remaining 12KB of cache.

You should consider the size of your code and the size of your cache before using cache lockdown. It is a good idea to benchmark your application to check that locking instructions into the cache does give a performance benefit.

How do I lock instructions into the cache?

Carry out the following steps in order:
- Lock down all cache ways except for the way that you want to prefetch to
- Prefetch the instructions that you want locked into cache
- Lock down the cache way that you have prefetched to, and unlock all other cache ways
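
The order of the lockdown writes in the steps above can be illustrated with a small host-side sketch. It is not the code from the download and it does not touch CP15. It only computes the two values that would be written to a lockdown register, assuming a register with one bit per cache way where a set bit means "locked, no new allocations". The names `plan_lockdown` and `struct lockdown_plan` are hypothetical, and the real register layout and encoding should be taken from the Technical Reference Manual for your core.

```c
#include <assert.h>
#include <stdint.h>

/* Values for the two lockdown register writes in the procedure.
 * ASSUMPTION: one bit per cache way, bit n set = way n is locked.
 * Check the Technical Reference Manual for the actual format. */
struct lockdown_plan {
    uint32_t before_prefetch; /* step 1: every way locked except the target */
    uint32_t after_prefetch;  /* step 3: only the target way locked */
};

/* Hypothetical helper: builds both masks for a given target way. */
struct lockdown_plan plan_lockdown(unsigned target_way, unsigned num_ways)
{
    struct lockdown_plan plan;
    uint32_t all_ways;

    assert(num_ways > 0 && num_ways < 32);
    assert(target_way < num_ways);

    all_ways = (1u << num_ways) - 1u;

    /* Step 1: leave only the target way open, so that every linefill
     * caused by the prefetch in step 2 must allocate into it. */
    plan.before_prefetch = all_ways & ~(1u << target_way);

    /* Step 2 happens between the two writes: prefetch the code using
     * the CP15 operations or data loads appropriate to the core. */

    /* Step 3: lock the freshly filled way and release all the others
     * for normal caching. */
    plan.after_prefetch = 1u << target_way;

    return plan;
}
```

Under these assumptions, `plan_lockdown(0, 4)` gives `0xE` for the write before the prefetch and `0x1` for the write after it. On a target, each value would be written to the lockdown register with the appropriate CP15 instruction.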

These operations are all carried out using CP15 instructions and data loads. Remember that the code that is being prefetched into the cache must be in an area of memory marked as cacheable.

NB: cache lockdown is not possible with Cortex-R4 cores.

Example prefetch and lockdown code

Example code is provided in a zip file below. Please see readme.txt within the download for information on the files it contains.
Download reference source code
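
Before using the prefetch and lockdown code, it can help to repeat the sizing arithmetic from the downsides section with your own cache and code sizes. The following is a minimal host-side sketch of that arithmetic, not part of the download. The names `check_lockdown_fit` and `struct lockdown_fit` are hypothetical, and the `fits` flag is only a rough capacity comparison that ignores set conflicts, so it does not replace benchmarking.

```c
#include <assert.h>
#include <stdint.h>

/* Result of the capacity check (all sizes in bytes). */
struct lockdown_fit {
    uint32_t way_bytes;      /* size of a single cache way */
    uint32_t unlocked_cache; /* cache still available for normal caching */
    uint32_t unlocked_code;  /* cacheable code that was not locked down */
    uint32_t wasted;         /* locked cache space holding no code */
    int      fits;           /* 1 if unlocked code <= unlocked cache */
};

/* Hypothetical helper: reproduces the arithmetic of the 16KB example. */
struct lockdown_fit check_lockdown_fit(uint32_t cache_bytes,
                                       unsigned num_ways,
                                       unsigned locked_ways,
                                       uint32_t code_bytes,
                                       uint32_t locked_code_bytes)
{
    struct lockdown_fit r;

    assert(num_ways > 0 && locked_ways <= num_ways);
    r.way_bytes = cache_bytes / num_ways;

    /* Locked code must fit in the locked ways and be part of the code. */
    assert(locked_code_bytes <= locked_ways * r.way_bytes);
    assert(locked_code_bytes <= code_bytes);

    r.unlocked_cache = cache_bytes - locked_ways * r.way_bytes;
    r.unlocked_code  = code_bytes - locked_code_bytes;
    r.wasted         = locked_ways * r.way_bytes - locked_code_bytes;
    r.fits           = r.unlocked_code <= r.unlocked_cache;

    return r;
}
```

For the example figures, `check_lockdown_fit(16 * 1024, 4, 1, 15 * 1024, 2 * 1024)` reports 12KB of unlocked cache against 13KB of unlocked code with 2KB wasted, so `fits` is 0. Locking a full 4KB instead leaves 11KB of code for 12KB of cache with nothing wasted, so `fits` is 1.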