Reposted from: https://en.wikipedia.org/wiki/List_of_ARM_microarchitectures
This is a list of microarchitectures based on the ARM family of instruction sets, designed by ARM Holdings and third parties and sorted by version of the ARM instruction set, release, and name. ARM provides a summary of the numerous vendors who implement ARM cores in their designs.[1] Keil also provides a somewhat newer summary of vendors of ARM-based processors.[2] ARM further provides a chart[3] giving an overview of the ARM processor lineup, comparing performance and functionality against capabilities for the more recent ARM core families.
ARM family | ARM architecture | ARM core | Feature | Cache (I / D), MMU | Typical MIPS @ MHz |
---|---|---|---|---|---|
ARM1 | ARMv1 | ARM1 | First implementation | None | |
ARM2 | ARMv2 | ARM2 | ARMv2 added the MUL (multiply) instruction | None | 4 MIPS @ 8 MHz 0.33 DMIPS/MHz |
ARM2 | ARMv2a | ARM250 | Integrated MEMC (MMU), graphics and I/O processor. ARMv2a added the SWP and SWPB (swap) instructions | None, MEMC1a | 7 MIPS @ 12 MHz |
ARM3 | ARMv2a | ARM3 | First integrated memory cache | 4 KB unified | 12 MIPS @ 25 MHz 0.50 DMIPS/MHz |
ARM6 | ARMv3 | ARM60 | ARMv3 first to support 32-bit memory address space (previously 26-bit) | None | 10 MIPS @ 12 MHz |
ARM6 | ARMv3 | ARM600 | As ARM60, cache and coprocessor bus (for FPA10 floating-point unit) | 4 KB unified | 28 MIPS @ 33 MHz |
ARM6 | ARMv3 | ARM610 | As ARM60, cache, no coprocessor bus | 4 KB unified | 17 MIPS @ 20 MHz 0.65 DMIPS/MHz |
ARM7 | ARMv3 | ARM700 | | 8 KB unified | 40 MHz |
ARM7 | ARMv3 | ARM710 | As ARM700, no coprocessor bus | 8 KB unified | 40 MHz |
ARM7 | ARMv3 | ARM710a | As ARM710 | 8 KB unified | 40 MHz 0.68 DMIPS/MHz |
ARM7T | ARMv4T | ARM7TDMI(-S) | 3-stage pipeline, Thumb, ARMv4 first to drop legacy ARM 26-bit addressing | None | 15 MIPS @ 16.8 MHz 63 DMIPS @ 70 MHz |
ARM7T | ARMv4T | ARM710T | As ARM7TDMI, cache | 8 KB unified, MMU | 36 MIPS @ 40 MHz |
ARM7T | ARMv4T | ARM720T | As ARM7TDMI, cache | 8 KB unified, MMU with FCSE (Fast Context Switch Extension) | 60 MIPS @ 59.8 MHz |
ARM7T | ARMv4T | ARM740T | As ARM7TDMI, cache | MPU | |
ARM7EJ | ARMv5TEJ | ARM7EJ-S | 5-stage pipeline, Thumb, Jazelle DBX, Enhanced DSP instructions | None | |
ARM8 | ARMv4 | ARM810[4][5] | 5-stage pipeline, static branch prediction, double-bandwidth memory | 8 KB unified, MMU | 84 MIPS @ 72 MHz 1.16 DMIPS/MHz |
ARM9T | ARMv4T | ARM9TDMI | 5-stage pipeline, Thumb | None | |
ARM9T | ARMv4T | ARM920T | As ARM9TDMI, cache | 16 KB / 16 KB, MMU with FCSE (Fast Context Switch Extension)[6] | 200 MIPS @ 180 MHz |
ARM9T | ARMv4T | ARM922T | As ARM9TDMI, caches | 8 KB / 8 KB, MMU | |
ARM9T | ARMv4T | ARM940T | As ARM9TDMI, caches | 4 KB / 4 KB, MPU | |
ARM9E | ARMv5TE | ARM946E-S | Thumb, Enhanced DSP instructions, caches | Variable, tightly coupled memories, MPU | |
ARM9E | ARMv5TE | ARM966E-S | Thumb, Enhanced DSP instructions | No cache, TCMs | |
ARM9E | ARMv5TE | ARM968E-S | As ARM966E-S | No cache, TCMs | |
ARM9E | ARMv5TEJ | ARM926EJ-S | Thumb, Jazelle DBX, Enhanced DSP instructions | Variable, TCMs, MMU | 220 MIPS @ 200 MHz |
ARM9E | ARMv5TE | ARM996HS | Clockless processor, as ARM966E-S | No caches, TCMs, MPU | |
ARM10E | ARMv5TE | ARM1020E | 6-stage pipeline, Thumb, Enhanced DSP instructions, (VFP) | 32 KB / 32 KB, MMU | |
ARM10E | ARMv5TE | ARM1022E | As ARM1020E | 16 KB / 16 KB, MMU | |
ARM10E | ARMv5TEJ | ARM1026EJ-S | Thumb, Jazelle DBX, Enhanced DSP instructions, (VFP) | Variable, MMU or MPU | |
ARM11 | ARMv6 | ARM1136J(F)-S[7] | 8-stage pipeline, SIMD, Thumb, Jazelle DBX, (VFP), Enhanced DSP instructions | Variable, MMU | 740 @ 532–665 MHz (i.MX31 SoC), 400–528 MHz |
ARM11 | ARMv6T2 | ARM1156T2(F)-S | 9-stage pipeline,[8] SIMD, Thumb-2, (VFP), Enhanced DSP instructions | Variable, MPU | |
ARM11 | ARMv6Z | ARM1176JZ(F)-S | As ARM1136EJ(F)-S | Variable, MMU + TrustZone | 965 DMIPS @ 772 MHz, up to 2,600 DMIPS with four processors[9] |
ARM11 | ARMv6K | ARM11MPCore | As ARM1136EJ(F)-S, 1–4 core SMP | Variable, MMU | |
SecurCore | ARMv6-M | SC000 | | | 0.9 DMIPS/MHz |
SecurCore | ARMv4T | SC100 | | | |
SecurCore | ARMv7-M | SC300 | | | 1.25 DMIPS/MHz |
Cortex-M | ARMv6-M | Cortex-M0[10] | Microcontroller profile, most Thumb + some Thumb-2,[11] hardware multiply instruction (optional small), optional system timer, optional bit-banding memory | Optional cache, no TCM, no MPU | 0.84 DMIPS/MHz |
Cortex-M | ARMv6-M | Cortex-M0+[12] | Microcontroller profile, most Thumb + some Thumb-2,[11] hardware multiply instruction (optional small), optional system timer, optional bit-banding memory | Optional cache, no TCM, optional MPU with 8 regions | 0.93 DMIPS/MHz |
Cortex-M | ARMv6-M | Cortex-M1[13] | Microcontroller profile, most Thumb + some Thumb-2,[11] hardware multiply instruction (optional small), OS option adds SVC / banked stack pointer, optional system timer, no bit-banding memory | Optional cache, 0-1024 KB I-TCM, 0-1024 KB D-TCM, no MPU | 136 DMIPS @ 170 MHz,[14] (0.8 DMIPS/MHz FPGA-dependent)[15] |
Cortex-M | ARMv7-M | Cortex-M3[16] | Microcontroller profile, Thumb / Thumb-2, hardware multiply and divide instructions, optional bit-banding memory | Optional cache, no TCM, optional MPU with 8 regions | 1.25 DMIPS/MHz |
Cortex-M | ARMv7E-M | Cortex-M4[17] | Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv4-SP single-precision FPU, hardware multiply and divide instructions, optional bit-banding memory | Optional cache, no TCM, optional MPU with 8 regions | 1.25 DMIPS/MHz (1.27 w/FPU) |
Cortex-M | ARMv7E-M | Cortex-M7[18] | Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv5 single- and double-precision FPU, hardware multiply and divide instructions | 0-64 KB I-cache, 0-64 KB D-cache, 0-16 MB I-TCM, 0-16 MB D-TCM (all these w/optional ECC), optional MPU with 8 or 16 regions | 2.14 DMIPS/MHz |
Cortex-R | ARMv7-R | Cortex-R4[19] | Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 8-stage pipeline dual-core running lockstep with fault logic | 0–64 KB / 0–64 KB, 0–2 of 0–8 MB TCM, opt MPU with 8/12 regions | |
Cortex-R | ARMv7-R | Cortex-R5[20] | Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU and precision, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 8-stage pipeline dual-core running lock-step with fault logic / optional as 2 independent cores, low-latency peripheral port (LLPP), accelerator coherency port (ACP)[21] | 0–64 KB / 0–64 KB, 0–2 of 0–8 MB TCM, opt MPU with 12/16 regions | |
Cortex-R | ARMv7-R | Cortex-R7[22] | Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU and precision, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 11-stage pipeline dual-core running lock-step with fault logic / out-of-order execution / dynamic register renaming / optional as 2 independent cores, low-latency peripheral port (LLPP), ACP[21] | 0–64 KB / 0–64 KB, ? of 0–128 KB TCM, opt MPU with 16 regions | |
Cortex-R | ARMv7-R | Cortex-R8 | TBD | TBD | |
Cortex-A (32-bit) | ARMv7-A | Cortex-A5[23] | Application profile, ARM / Thumb / Thumb-2 / DSP / SIMD / Optional VFPv4-D16 FPU / Optional NEON / Jazelle RCT and DBX, 1–4 cores / optional MPCore, snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) | 4-64 KB / 4-64 KB L1, MMU + TrustZone | 1.57 DMIPS/MHz per core |
Cortex-A (32-bit) | ARMv7-A | Cortex-A7[24] | Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4-D16 FPU / NEON / Jazelle RCT and DBX / Hardware virtualization, in-order execution, superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP, architecture and feature set are identical to A15, 8-10 stage pipeline, low-power design[25] | 8-64 KB / 8-64 KB L1, 0–1 MB L2, MMU + TrustZone | 1.9 DMIPS/MHz per core |
Cortex-A (32-bit) | ARMv7-A | Cortex-A8[26] | Application profile, ARM / Thumb / Thumb-2 / VFPv3 FPU / NEON / Jazelle RCT and DAC, 13-stage superscalar pipeline | 16-32 KB / 16–32 KB L1, 0–1 MB L2 opt ECC, MMU + TrustZone | Up to 2000 (2.0 DMIPS/MHz in speed from 600 MHz to greater than 1 GHz) |
Cortex-A (32-bit) | ARMv7-A | Cortex-A9[27] | Application profile, ARM / Thumb / Thumb-2 / DSP / Optional VFPv3 FPU / Optional NEON / Jazelle RCT and DBX, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) | 16–64 KB / 16–64 KB L1, 0–8 MB L2 opt parity, MMU + TrustZone | 2.5 DMIPS/MHz per core, 10,000 DMIPS @ 2 GHz on Performance Optimized TSMC 40G (dual-core) |
Cortex-A (32-bit) | ARMv7-A | Cortex-A12[28] | Application profile, ARM / Thumb-2 / DSP / VFPv4 FPU / NEON / Hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) | 32-64 KB / 32 KB L1, 256 KB-8 MB L2 | 3.0 DMIPS/MHz per core |
Cortex-A (32-bit) | ARMv7-A | Cortex-A15[29] | Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / integer divide / fused MAC / Jazelle RCT / hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP, 15-24 stage pipeline[25] | 32 KB w/parity / 32 KB w/ECC L1, 0–4 MB L2, L2 has ECC, MMU + TrustZone | At least 3.5 DMIPS/MHz per core (up to 4.01 DMIPS/MHz depending on implementation)[30] |
Cortex-A (32-bit) | ARMv7-A | Cortex-A17 | Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / integer divide / fused MAC / Jazelle RCT / hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP | MMU + TrustZone | |
Cortex-A (32-bit) | ARMv8-A | Cortex-A32 | TBD | TBD | |
Cortex-A (64-bit) | ARMv8-A | Cortex-A35 | TBD | TBD | |
Cortex-A (64-bit) | ARMv8-A | Cortex-A53[31] | Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, dual issue, in-order pipeline | 8-64 KB w/parity / 8-64 KB w/ECC L1 per core, 128 KB-2 MB L2 shared, 40-bit physical addresses | 2.3 DMIPS/MHz |
Cortex-A (64-bit) | ARMv8-A | Cortex-A57[32] | Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, multi-issue, deeply out-of-order pipeline | 48 KB w/DED parity / 32 KB w/ECC L1 per core, 512 KB-2 MB L2 shared, 44-bit physical addresses | At least 4.1 DMIPS/MHz per core (up to 4.76 DMIPS/MHz depending on implementation) |
Cortex-A (64-bit) | ARMv8-A | Cortex-A72[33] | Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, multi-issue, deeply out-of-order pipeline | 48 KB w/DED parity / 32 KB w/ECC L1 per core, 512 KB-4 MB L2 shared, 44-bit physical addresses | At least 4.7 DMIPS/MHz per core (up to 5.0 DMIPS/MHz depending on implementation) |
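
The last column of the table mixes two kinds of figures: absolute throughput at a given clock (for example "965 DMIPS @ 772 MHz") and clock-normalized throughput (for example "2.5 DMIPS/MHz per core"). The following is a minimal sketch of how the two relate, using only figures quoted in the table. The function names `dmips_per_mhz` and `total_dmips` are hypothetical helpers introduced for illustration, and the multi-core calculation assumes ideal linear scaling across cores, which real workloads rarely achieve.

```python
def dmips_per_mhz(dmips: float, clock_mhz: float) -> float:
    """Normalize an absolute DMIPS figure by the clock it was measured at."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return dmips / clock_mhz


def total_dmips(per_mhz: float, clock_mhz: float, cores: int = 1) -> float:
    """Scale a DMIPS/MHz figure to a clock speed and core count.

    Assumption: throughput scales linearly with both clock and core count.
    """
    if clock_mhz <= 0 or cores < 1:
        raise ValueError("clock_mhz must be positive and cores >= 1")
    return per_mhz * clock_mhz * cores


if __name__ == "__main__":
    # ARM1176JZ(F)-S row: 965 DMIPS @ 772 MHz
    print(round(dmips_per_mhz(965, 772), 2))   # 1.25
    # ARM7TDMI(-S) row: 63 DMIPS @ 70 MHz
    print(round(dmips_per_mhz(63, 70), 2))     # 0.9
    # Cortex-A9 row: 2.5 DMIPS/MHz per core, dual-core at 2 GHz
    print(total_dmips(2.5, 2000, cores=2))     # 10000.0
```

Under these assumptions the Cortex-A9 result matches the 10,000 DMIPS @ 2 GHz dual-core figure quoted in its row.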