
big.LITTLE Processing

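big.LITTLE processing addresses one of today's industry challenges: how to create a System on Chip (SoC) that delivers high performance together with the very high power efficiency needed for longer battery life. big.LITTLE pairs the performance of the ARM Cortex-A15 MPCore processor with the energy efficiency of the Cortex-A7 processor, and application software can move seamlessly between the two. By selecting the best processor for each task, big.LITTLE can extend battery life by up to 70%.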
 


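big.LITTLE processing is designed to deliver the right processor for the right job. The Cortex-A15 processor is the highest-performance low-power ARM processor developed to date, and the Cortex-A7 processor is the most energy-efficient ARM application processor designed to date. The performance of the Cortex-A15 processor is available for demanding workloads, while the Cortex-A7 processor takes over to handle most smartphone workloads as efficiently as possible. These include operating system activity, the user interface, and other always-on, always-connected tasks.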


Hardware Requirements

For big.LITTLE processing to be invisible to software and fast enough to migrate execution opportunistically to the right-sized core, the big and LITTLE processors being paired must be fully architecturally compatible: they must run all the same instructions and support the same extensions, such as virtualization and large physical addressing.

The first such pairing is between the Cortex-A15 and the Cortex-A7 processors. The big cluster and the LITTLE cluster can each contain one to four CPUs, enabling eight-core big.LITTLE designs, smart quad-core designs with two of each processor type, or an asymmetric mix such as four LITTLE cores and two big cores.
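
The cluster limits above can be captured in a few lines. This is an illustrative sketch only: `valid_configs` and the two constants are hypothetical names, and the only rule encoded is the one-to-four CPUs per cluster stated in the text.

```python
from itertools import product

# Limits taken from the text: each cluster holds one to four CPUs.
MIN_CPUS_PER_CLUSTER = 1
MAX_CPUS_PER_CLUSTER = 4


def valid_configs():
    """Return every (big, little) CPU-count pair allowed by the cluster limits."""
    sizes = range(MIN_CPUS_PER_CLUSTER, MAX_CPUS_PER_CLUSTER + 1)
    return [(big, little) for big, little in product(sizes, repeat=2)]


if __name__ == "__main__":
    for big, little in valid_configs():
        print(f"{big} big + {little} LITTLE = {big + little} cores")
```

The three designs named in the text correspond to the pairs `(4, 4)`, `(2, 2)`, and `(2, 4)`.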



[Figure: big.LITTLE system diagram]

Both the Cortex-A15 and Cortex-A7 processors are available to partners now and are each in production separately, with the first big.LITTLE silicon now being demonstrated by lead licensees. The second big.LITTLE pairing is between the Cortex-A57 and the Cortex-A53 processors, the successors to the Cortex-A15 and Cortex-A7 processors respectively. These cores, announced in 2012, will be available to ARM lead licensees in mid-2013 and can be combined over the ARM CoreLink™ CCI-400 or another cache coherent interconnect in the same way. Both increase performance while retaining the power efficiency of their predecessors, and both introduce 64-bit support via the ARMv8 architecture, in addition to full backward compatibility with the 32-bit ARMv7 architecture, including the virtualization and large addressability extensions of the latest version of ARMv7.

Future ARM cores will also be capable of combining with these first four in big.LITTLE processor SoCs.

 

 


Software

Software can control the allocation of threads of execution to the appropriate core or, in some versions of the software, simply move the whole processor context up to big or down to LITTLE based on measured load. There are two software approaches to the CPU selection decision, described below. Both require cache coherence so that software can quickly move execution from LITTLE to big and from big to LITTLE as appropriate. Cache coherence allows one CPU cluster to look up data in the caches of the other CPU cluster, and full hardware cache coherence between the two clusters is key to making big.LITTLE software fast and transparent. Cache coherence can be provided by the ARM CCI-400 cache coherent interconnect or by any interconnect that follows the AMBA 4 ACE protocol.

In a big.LITTLE SoC, the OS kernel dynamically and seamlessly moves tasks between the 'big' and 'LITTLE' CPUs. In practice this is an extension of the operating system power management software in wide use today on mobile phone SoCs.

Most OS kernels already support symmetric multiprocessing (SMP), and those techniques can easily be extended to support big.LITTLE systems. There are two main variants of big.LITTLE software scheduling.

big.LITTLE CPU Migration 
In CPU migration, the whole workload of a CPU is moved to a different CPU once the OS detects that it requires more or less performance. This builds on generic OS techniques for waking up CPUs and putting them to sleep in an SMP system. The key extension is detecting that a CPU is running at maximum frequency while still requesting further performance, which means the workload needs to be moved to a ‘bigger’ CPU. Once the workload has reduced, it can be moved back to a ‘smaller’ CPU.
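
The decision described above can be sketched as a small state machine. This is a minimal illustration, not the Linaro implementation: the `LogicalCpu` class, the `step` function, and both frequency tables are hypothetical names and values chosen for the example. The sketch also assumes that performance requests move the frequency one step at a time, that a workload arrives on the big core at its lowest frequency, and that it returns to the LITTLE core at its highest frequency.

```python
from dataclasses import dataclass

# Hypothetical frequency tables in MHz; real values are SoC-specific.
LITTLE_FREQS = [400, 600, 800, 1000]
BIG_FREQS = [800, 1200, 1600]


@dataclass
class LogicalCpu:
    """One OS-visible CPU whose workload runs on either a LITTLE or a big core."""
    on_big: bool = False
    freq_index: int = 0  # index into the active core's frequency table


def step(cpu: LogicalCpu, demand: int) -> LogicalCpu:
    """Apply one performance request: demand > 0 asks for more, demand < 0 for less."""
    table = BIG_FREQS if cpu.on_big else LITTLE_FREQS
    at_max = cpu.freq_index == len(table) - 1
    at_min = cpu.freq_index == 0

    if demand > 0:
        if not at_max:
            cpu.freq_index += 1          # ordinary frequency scaling
        elif not cpu.on_big:
            cpu.on_big = True            # at LITTLE maximum and still asking: go big
            cpu.freq_index = 0
    elif demand < 0:
        if not at_min:
            cpu.freq_index -= 1
        elif cpu.on_big:
            cpu.on_big = False           # workload has reduced: return to LITTLE
            cpu.freq_index = len(LITTLE_FREQS) - 1
    return cpu


if __name__ == "__main__":
    cpu = LogicalCpu()
    for _ in range(4):
        step(cpu, +1)
        core = "big" if cpu.on_big else "LITTLE"
        table = BIG_FREQS if cpu.on_big else LITTLE_FREQS
        print(core, table[cpu.freq_index], "MHz")
```

Starting on a LITTLE core at its lowest frequency, the first three requests walk up the LITTLE frequency table and the fourth triggers the switch to the big core.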

This CPU migration software is available today from Linaro, and is being actively developed by multiple ARM partners.  

big.LITTLE MP 
Task migration (also known as big.LITTLE MP) detects a high-intensity task and schedules it onto a ‘big’ CPU. Similarly, it detects a low-intensity task and moves it back to a ‘LITTLE’ CPU.
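
A per-task placement rule of this kind might look like the following sketch. The `place_task` function and its two thresholds are hypothetical. The text does not say how task intensity is measured, so the sketch assumes a normalized load between 0 and 1 and leaves a task where it is when its load falls between the thresholds.

```python
def place_task(load: float, current: str = "LITTLE",
               up_threshold: float = 0.8, down_threshold: float = 0.3) -> str:
    """Choose the cluster for one task from its load (0.0 to 1.0).

    The thresholds are illustrative assumptions, not values from any kernel.
    """
    if load >= up_threshold:
        return "big"        # high-intensity task: schedule on a big CPU
    if load <= down_threshold:
        return "LITTLE"     # low-intensity task: move back to a LITTLE CPU
    return current          # in between: stay put to avoid bouncing


if __name__ == "__main__":
    for load, current in [(0.9, "LITTLE"), (0.5, "big"), (0.1, "big")]:
        print(f"load={load} on {current} -> {place_task(load, current)}")
```

The gap between the two thresholds is a design choice of this sketch: without it, a task hovering around a single threshold would migrate back and forth on every evaluation.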

The advantage of task migration over CPU migration is that a system can use all of its CPUs at the same time when processing demands are extremely high. For example, in a 2x ‘big’ + 2x ‘LITTLE’ system, all four CPUs can be used at times of peak demand, whereas CPU migration would only be able to use two CPUs.
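
The 2 + 2 example above reduces to simple arithmetic. The sketch below assumes that CPU migration pairs each big CPU with one LITTLE CPU so that only one CPU of each pair runs at a time. The function name and the `model` strings are hypothetical, and taking the smaller count for unequal clusters is a simplification of this sketch, not something the text specifies.

```python
def peak_usable_cpus(n_big: int, n_little: int, model: str) -> int:
    """Return how many CPUs can run at once under each migration model."""
    if model == "task":
        # Task migration (big.LITTLE MP): every CPU is visible to the scheduler.
        return n_big + n_little
    if model == "cpu":
        # CPU migration: assumed one big/LITTLE pair per logical CPU,
        # with only one member of each pair active at a time.
        return min(n_big, n_little)
    raise ValueError(f"unknown model: {model!r}")


if __name__ == "__main__":
    print("task migration:", peak_usable_cpus(2, 2, "task"))
    print("CPU migration: ", peak_usable_cpus(2, 2, "cpu"))
```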

ARM and Linaro have been developing Linux support for both migration models. For more information go to:

big.LITTLE CPU migration: https://wiki.linaro.org/WorkingGroups/KernelArchived/Big.Little.Switcher
big.LITTLE task migration: https://wiki.linaro.org/WorkingGroups/PowerManagement/Big.Little.MP

 


Related Technology

CoreLink Cache Coherent Interconnect (CCI-400)

The ARM CoreLink™ CCI-400 Cache Coherent Interconnect provides full cache coherency between two clusters of multi-core CPUs, such as the ARM Cortex-A15 and Cortex-A7 processors, enabling big.LITTLE processing.

The CoreLink CCI-400 enables system coherency in heterogeneous multicore and multi-cluster CPU/GPU systems, such as those required for the networking and high-performance computation markets, by enabling each processor in the system to access the other processor caches. This reduces the need to access off-chip memory, saving time and energy, which is a key enabler in systems based on ARM big.LITTLE™ processing.

CoreLink Cache Coherent Network (CCN-504)

The ARM CoreLink CCN-504 Cache Coherent Network offers scaling to 16 processor cores to give system architects an optimal solution for enterprise applications including servers and network infrastructure.

CoreLink CCN-504 can deliver up to one Terabit of usable system bandwidth per second. It will enable designers to provide high-performance, cache coherent interconnect for ‘many-core’ enterprise solutions built using the ARM Cortex-A15 MPCore processor and the latest ARM Cortex-A50 series processors with 64-bit support.

ARM Development Studio 5 (DS-5)

The ARM Development Studio 5 (DS-5™) toolchain is a suite of professional software development tools for ARM processors, and it extends its capabilities to big.LITTLE performance analysis and debug.

The DS-5™ toolchain enables engineers to develop robust and highly optimized embedded software for ARM application processors. It comprises tools such as the ARM C/C++ Compiler, a Linux/Android™/RTOS-aware debugger, the ARM Streamline™ system-wide performance analyzer, and real-time system model simulators.

ARM Fast Models  

ARM Fast Models provide the necessary models for constructing virtual platforms of ARM big.LITTLE processing-based systems along with templates of popular configurations. Customization of model content and configuration of items such as memory map and interrupt map, and the ability to export the platform to SystemC/TLM environments are supported.

Fast Models are available for the Cortex-A15 and Cortex-A7 processors and the CoreLink CCI-400.

  

