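General-purpose registers: r0-r30 (register number 31 encodes the zero register or the stack pointer, depending on the instruction). The 32-bit names are w0-w30 and the 64-bit names are x0-x30. Of these:
SIMD registers: v0-v31. Of these:
About instruction prefixes and suffixes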
http://shell-storm.org/armv8-a/ISA_v85A_A64_xml_00bet8/xhtml/fpsimdindex.html
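Perform a bitwise logical operation between two registers and place the result in the destination register.
AND (vector): Bitwise AND (vector).
BIC (vector, register): Bitwise bit Clear (vector, register).
EOR (vector): Bitwise Exclusive OR (vector).
ORN (vector): Bitwise inclusive OR NOT (vector).
ORR (vector, register): Bitwise inclusive OR (vector, register).
BIC (vector, immediate): Bitwise bit Clear (vector, immediate). Takes each element of the destination vector, ANDs it with the complement of an immediate, and writes the result back to the destination vector.
ORR (vector, immediate): Bitwise inclusive OR (vector, immediate). Takes each element of the destination vector, ORs it with an immediate, and writes the result back to the destination vector.
BIT (Bitwise Insert if True): inserts each bit of the first operand into the destination if the corresponding bit of the second operand is 1; otherwise the destination bit is left unchanged.
BIF (Bitwise Insert if False): inserts each bit of the first operand into the destination if the corresponding bit of the second operand is 0; otherwise the destination bit is left unchanged.
BSL (Bitwise Select): takes each destination bit from the first operand if the corresponding bit of the destination is 1, or from the second operand if the corresponding bit of the destination is 0.
BIF (vector): Bitwise Insert if False.
BIT (vector): Bitwise Insert if True.
BSL (vector): Bitwise Select.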
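The three insert/select instructions differ only in which register plays the role of the mask. A minimal sketch in plain Python (a model of the semantics, not NEON code), treating a 64-bit register as an integer. The function names and the argument order (old destination value first) are illustrative assumptions:

```python
MASK64 = (1 << 64) - 1  # model a 64-bit D register as a Python int


def bsl(vd, vn, vm):
    """BSL: the old destination is the selector; 1 bits pick vn, 0 bits pick vm."""
    return ((vd & vn) | (~vd & vm)) & MASK64


def bit(vd, vn, vm):
    """BIT: copy a bit of vn into vd wherever vm has a 1."""
    return ((vd & ~vm) | (vn & vm)) & MASK64


def bif(vd, vn, vm):
    """BIF: copy a bit of vn into vd wherever vm has a 0."""
    return ((vd & vm) | (vn & ~vm)) & MASK64
```

For example, `bsl(0xF0, 0xAA, 0x55)` gives `0xA5`: the high nibble comes from `0xAA` and the low nibble from `0x55`.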
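A vector compare takes the value of each element in a vector and compares it with the value of the corresponding element of a second vector, or with zero. If the condition is true, the corresponding element of the destination vector is set to all ones; otherwise it is set to all zeros.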
CMEQ (register): Compare bitwise Equal (vector).
CMEQ (zero): Compare bitwise Equal to zero (vector).
CMGE (register): Compare signed Greater than or Equal (vector).
CMGE (zero): Compare signed Greater than or Equal to zero (vector).
CMGT (register): Compare signed Greater than (vector).
CMGT (zero): Compare signed Greater than zero (vector).
CMHI (register): Compare unsigned Higher (vector).
CMHS (register): Compare unsigned Higher or Same (vector).
CMLE (zero): Compare signed Less than or Equal to zero (vector).
CMLT (zero): Compare signed Less than zero (vector).
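CMTST (vector test bits) takes each element of a vector and performs a bitwise AND with the corresponding element of a second vector. If the result is not zero, the corresponding element of the destination vector is set to all ones; otherwise it is set to all zeros.
CMTST: Compare bitwise Test bits nonzero (vector).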
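The compare instructions above produce a per-lane mask rather than a flag. A small sketch in plain Python, assuming each lane is passed as an unsigned bit pattern of `bits` width. The helper names are hypothetical and only model the semantics:

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def lane_mask(cond, bits):
    """All ones when the condition holds, all zeros otherwise."""
    return (1 << bits) - 1 if cond else 0


def cmeq(a, b, bits=8):
    return [lane_mask(x == y, bits) for x, y in zip(a, b)]


def cmgt(a, b, bits=8):
    # Signed comparison: the lanes are reinterpreted as two's complement.
    return [lane_mask(to_signed(x, bits) > to_signed(y, bits), bits)
            for x, y in zip(a, b)]


def cmhi(a, b, bits=8):
    # Unsigned comparison on the raw bit patterns.
    return [lane_mask(x > y, bits) for x, y in zip(a, b)]


def cmtst(a, b, bits=8):
    return [lane_mask((x & y) != 0, bits) for x, y in zip(a, b)]
```

The lane value `0xFF` shows the signed/unsigned difference: `cmgt([0xFF], [0])` yields `[0]` because the lane is read as -1, while `cmhi([0xFF], [0])` yields `[0xFF]`.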
RBIT (vector): Reverse Bit order (vector).
CVT (vector convert) converts each element of a vector in one of the following ways and places the result in the destination vector:
Rounding
SCVTF (scalar, fixed-point): Signed fixed-point Convert to Floating-point (scalar).
SCVTF (scalar, integer): Signed integer Convert to Floating-point (scalar).
SCVTF (vector, fixed-point): Signed fixed-point Convert to Floating-point (vector).
SCVTF (vector, integer): Signed integer Convert to Floating-point (vector).
UCVTF (scalar, fixed-point): Unsigned fixed-point Convert to Floating-point (scalar).
UCVTF (scalar, integer): Unsigned integer Convert to Floating-point (scalar).
UCVTF (vector, fixed-point): Unsigned fixed-point Convert to Floating-point (vector).
UCVTF (vector, integer): Unsigned integer Convert to Floating-point (vector).
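DUP (vector duplicate) copies a scalar into every element of the destination vector. The source can be a NEON scalar (a vector element) or a general-purpose register.
To fill a SIMD register with an immediate value:
mov w0, #imm        // w0 = the immediate value
dup v0.8h, w0       // copy the low 16 bits of w0 into all eight halfword lanes of v0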
DUP (element): Duplicate vector element to vector or scalar.
DUP (general): Duplicate general-purpose register to vector.
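EXT (vector extract) extracts 8-bit elements from the low end of the second operand vector and the high end of the first operand vector, concatenates them, and places the result in the destination vector.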
EXT: Extract vector from pair of vectors.
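EXT can be pictured as a sliding window over the two source vectors laid end to end. A minimal sketch in plain Python, assuming vectors are lists of bytes with lane 0 first. The function name `ext` and its argument order mirror the operand order but are otherwise illustrative:

```python
def ext(vn, vm, index):
    """Model of EXT Vd, Vn, Vm, #index on byte lanes.

    The result starts at byte `index` of vn and continues into the
    low bytes of vm until the destination is full.
    """
    assert len(vn) == len(vm) and 0 <= index < len(vn)
    return (vn + vm)[index:index + len(vn)]
```

With `vn = [0..7]` and `vm = [8..15]`, an index of 3 yields bytes 3 through 10.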
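MOV (vector move) and MVN (vector move NOT) in their immediate forms (MOVI and MVNI in A64) generate an immediate constant and place it in the destination register.
Vector move (register) copies the value of the source register into the destination register.
Vector move NOT (register) inverts every bit of the source register and places the result in the destination register.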
MOV (element): Move vector element to another vector element: an alias of INS (element).
MOV (from general): Move general-purpose register to a vector element: an alias of INS (general).
MOV (scalar): Move vector element to scalar: an alias of DUP (element).
MOV (to general): Move vector element to general-purpose register: an alias of UMOV.
MOV (vector): Move vector: an alias of ORR (vector, register).
MOVI: Move Immediate (vector).
MVN: Bitwise NOT (vector): an alias of NOT.
NOT: Bitwise NOT (vector).
MVNI: Move inverted Immediate (vector).
SXTL, SXTL2: Signed extend Long: an alias of SSHLL, SSHLL2.
UXTL, UXTL2: Unsigned extend Long: an alias of USHLL, USHLL2.
XTN, XTN2: Extract Narrow.
SQXTN, SQXTN2: Signed saturating extract Narrow.
SQXTUN, SQXTUN2: Signed saturating extract Unsigned Narrow.
UQXTN, UQXTN2: Unsigned saturating extract Narrow.
INS (element): Insert vector element from another vector element.
INS (general): Insert vector element from general-purpose register.
SMOV: Signed Move vector element to general-purpose register.
UMOV: Unsigned Move vector element to general-purpose register.
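REV16 (vector reverse in halfwords) reverses the order of the 8-bit elements within each halfword of the vector and places the result in the corresponding destination vector.
REV32 (vector reverse in words) reverses the order of the 8-bit or 16-bit elements within each word of the vector and places the result in the corresponding destination vector.
REV64 (vector reverse in doublewords) reverses the order of the 8-bit, 16-bit, or 32-bit elements within each doubleword of the vector and places the result in the corresponding destination vector.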
REV16 (vector): Reverse elements in 16-bit halfwords (vector).
REV32 (vector): Reverse elements in 32-bit words (vector).
REV64 (vector): Reverse elements in 64-bit doublewords (vector).
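All three REV variants follow one pattern: split the vector into containers, then reverse the elements inside each container. A sketch in plain Python on a list of bytes; the single helper `rev` and its parameters are an illustrative assumption, not an instruction name:

```python
def rev(lanes, elem_bytes, container_bytes):
    """Reverse elem_bytes-sized elements inside each container_bytes group.

    REV16 on bytes -> rev(v, 1, 2); REV32 on halfwords -> rev(v, 2, 4);
    REV64 on bytes -> rev(v, 1, 8).
    """
    out = []
    for start in range(0, len(lanes), container_bytes):
        group = lanes[start:start + container_bytes]
        elems = [group[i:i + elem_bytes]
                 for i in range(0, container_bytes, elem_bytes)]
        for elem in reversed(elems):
            out.extend(elem)  # bytes inside one element keep their order
    return out
```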
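TBL (vector): Table vector Lookup. Uses the byte indices in a control vector to look up byte values in a table and builds a new vector from them. An out-of-range index returns 0.
TBX (vector): Table vector lookup extension. Works like TBL, except that an out-of-range index leaves the destination element unchanged.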
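The only difference between the two lookups is what happens to an out-of-range index. A minimal sketch in plain Python, assuming the table is the concatenated bytes of the table registers; the function names are illustrative:

```python
def tbl(table, indices):
    """TBL: out-of-range indices produce 0."""
    return [table[i] if i < len(table) else 0 for i in indices]


def tbx(dest, table, indices):
    """TBX: out-of-range indices keep the old destination byte."""
    return [table[i] if i < len(table) else d
            for d, i in zip(dest, indices)]
```

With a 16-byte table holding 100..115, the indices `[0, 15, 16, 255]` give `[100, 115, 0, 0]` for `tbl`, while `tbx` keeps the last two destination bytes.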
TRN1 (vector) Transpose vectors (primary)
TRN2 (vector) Transpose vectors (secondary)
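TRN (vector transpose) treats the elements of its operand vectors as elements of 2 x 2 matrices and transposes those matrices.
UZP (vector unzip) de-interleaves the elements of two vectors.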
UZP1 (vector) Unzip vectors (primary)
UZP2 (vector) Unzip vectors (secondary)
ZIP1 (vector) Zip vectors (primary)
ZIP2 (vector) Zip vectors (secondary)
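The permute instructions above are easiest to compare side by side. A sketch in plain Python, assuming vectors are equal-length lists with lane 0 first and `vn` as the first source operand; the function names simply mirror the mnemonics:

```python
def trn1(vn, vm):
    """Even-numbered lanes of both sources, interleaved."""
    out = []
    for i in range(0, len(vn), 2):
        out += [vn[i], vm[i]]
    return out


def trn2(vn, vm):
    """Odd-numbered lanes of both sources, interleaved."""
    out = []
    for i in range(0, len(vn), 2):
        out += [vn[i + 1], vm[i + 1]]
    return out


def uzp1(vn, vm):
    """Even-numbered lanes of vn followed by those of vm."""
    return (vn + vm)[0::2]


def uzp2(vn, vm):
    """Odd-numbered lanes of vn followed by those of vm."""
    return (vn + vm)[1::2]


def zip1(vn, vm):
    """Interleave the low halves of the two sources."""
    half = len(vn) // 2
    return [x for pair in zip(vn[:half], vm[:half]) for x in pair]


def zip2(vn, vm):
    """Interleave the high halves of the two sources."""
    half = len(vn) // 2
    return [x for pair in zip(vn[half:], vm[half:]) for x in pair]
```

ZIP and UZP undo each other: applying `uzp1` to the outputs of `zip1` and `zip2` returns the original first source.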
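The vector shift left (by immediate) instructions take each element of an integer vector, shift it left by an immediate value, and place the result in the destination vector.
For SHL (vector shift left), bits shifted out of the left of each element are lost.
For QSHL (vector saturating shift left) and QSHLU (vector unsigned saturating shift left), the sticky QC flag is set if saturation occurs.
For SHLL (vector shift left long), values are sign-extended or zero-extended.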
SHL (vector) Shift left (immediate)
SQSHL (vector, immediate) Signed saturating shift left (immediate)
SQSHL (vector, register) Signed saturating shift left (register)
UQSHL (vector, immediate) Unsigned saturating shift left (immediate)
UQSHL (vector, register) Unsigned saturating shift left (register)
SQSHLU (vector) Signed saturating shift left unsigned (immediate)
SSHLL, SSHLL2 (vector) Signed shift left long (immediate)
USHLL, USHLL2 (vector) Unsigned shift left long (immediate)
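{Q}{R}SHL (by signed variable)
SHL (vector shift left by signed variable) takes each element of a vector, shifts it by the value in the least significant byte of the corresponding element of a second vector, and places the result in the destination vector. If the shift value is positive, the operation is a left shift; otherwise it is a right shift.
The result can optionally be saturated, rounded, or both. If saturation occurs, the sticky QC flag is set.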
SSHL (vector) Signed shift left (register)
USHL (vector) Unsigned shift left (register)
SQSHL (vector, immediate) Signed saturating shift left (immediate)
SQSHL (vector, register) Signed saturating shift left (register)
UQSHL (vector, immediate) Unsigned saturating shift left (immediate)
UQSHL (vector, register) Unsigned saturating shift left (register)
SRSHL (vector) Signed rounding shift left (register)
URSHL (vector) Unsigned rounding shift left (register)
SQRSHL (vector) Signed saturating rounding shift left (register)
UQRSHL (vector) Unsigned saturating rounding shift left (register)
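The signed shift-by-register forms can be modelled with one function and two switches. A minimal sketch in plain Python for one signed lane, assuming `x` is a Python int within the lane's range and `shift_elem` is the raw element of the shift vector. The function name and keyword arguments are hypothetical, and the QC flag is not modelled:

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def shift_by_register(x, shift_elem, bits=16, saturate=False, rounding=False):
    """SSHL by default; saturate=True ~ SQSHL, rounding=True ~ SRSHL."""
    # Only the least significant byte of the shift element is used, signed.
    shift = to_signed(shift_elem, 8)
    if shift >= 0:
        result = x << shift
    else:
        n = -shift
        # Rounding adds half of the last discarded bit position first.
        result = (x + (1 << (n - 1)) if rounding else x) >> n
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if saturate:
        return max(lo, min(hi, result))
    return to_signed(result, bits)  # plain forms wrap around
```

A shift element of `0xFE` is read as -2, so `shift_by_register(-16, 0xFE)` is an arithmetic right shift by two and returns -4.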
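{R}SHR{N}, {R}SRA (by immediate)
{R}SHR{N} (vector shift right by immediate) takes each element of a vector, shifts it right by an immediate value, and places the result in the destination vector. The result can optionally be rounded, narrowed, or both.
{R}SRA (vector shift right by immediate and accumulate) takes each element of a vector, shifts it right by an immediate value, and accumulates the result into the destination vector. The result can optionally be rounded.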
SSHR (vector) Signed shift right (immediate)
USHR (vector) Unsigned shift right (immediate)
SHRN, SHRN2 (vector) Shift right narrow (immediate)
SRSHR (vector) Signed rounding shift right (immediate)
URSHR (vector) Unsigned rounding shift right (immediate)
RSHRN, RSHRN2 (vector) Rounding shift right narrow (immediate)
SSRA (vector) Signed shift right and accumulate (immediate)
USRA (vector) Unsigned shift right and accumulate (immediate)
SRSRA (vector) Signed rounding shift right and accumulate (immediate)
URSRA (vector) Unsigned rounding shift right and accumulate (immediate)
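The rounding, narrowing, and accumulating variants are small changes to the same right shift. A sketch in plain Python for one unsigned lane, assuming the shift amount `n` is at least 1; the function names mirror the mnemonics but are only a model:

```python
def ushr(x, n):
    """USHR: plain logical shift right (truncates)."""
    return x >> n


def urshr(x, n):
    """URSHR: add half of the discarded range before shifting (rounds)."""
    return (x + (1 << (n - 1))) >> n


def usra(acc, x, n, bits=8):
    """USRA: shift right, then add into the destination lane (wraps)."""
    return (acc + (x >> n)) & ((1 << bits) - 1)


def shrn(x, n, out_bits=8):
    """SHRN: shift right and keep only the low half-width bits."""
    return (x >> n) & ((1 << out_bits) - 1)
```

For instance, 7 shifted right by one is 3 with `ushr` but 4 with `urshr`, and `shrn(0x1234, 8)` extracts the high byte `0x12`.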
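Q{R}SHR{U}N (by immediate)
Q{R}SHR{U}N (vector saturating shift right narrow by immediate, with optional rounding) takes each element of a quadword integer vector, shifts it right by an immediate value, and places the narrowed result in a doubleword vector.
If saturation occurs, the sticky QC flag is set.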
SQSHRN, SQSHRN2 (vector) Signed saturating shift right narrow (immediate)
UQSHRN, UQSHRN2 (vector) Unsigned saturating shift right narrow (immediate)
SQRSHRN, SQRSHRN2 (vector) Signed saturating rounded shift right narrow (immediate)
UQRSHRN, UQRSHRN2 (vector) Unsigned saturating rounded shift right narrow (immediate)
SQRSHRUN, SQRSHRUN2 (vector) Signed saturating rounded shift right unsigned narrow (immediate)
SQSHRUN, SQSHRUN2 (vector) Signed saturating shift right unsigned narrow (immediate)
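Saturating narrowing is a shift followed by a clamp to the narrower range. A minimal sketch in plain Python for one signed source lane, assuming `n` is at least 1. The function names follow the mnemonics, and returning a `(value, saturated)` pair is an illustrative stand-in for the sticky QC flag:

```python
def clamp(value, lo, hi):
    """Return the clamped value and whether clamping was needed."""
    return max(lo, min(hi, value)), (value < lo or value > hi)


def sqshrn(x, n, out_bits=8, rounding=False):
    """SQSHRN (or SQRSHRN with rounding=True): signed result."""
    if rounding:
        x += 1 << (n - 1)
    return clamp(x >> n, -(1 << (out_bits - 1)), (1 << (out_bits - 1)) - 1)


def sqshrun(x, n, out_bits=8, rounding=False):
    """SQSHRUN (or SQRSHRUN): signed input, unsigned result."""
    if rounding:
        x += 1 << (n - 1)
    return clamp(x >> n, 0, (1 << out_bits) - 1)
```

The value 1000 shifted right by two is 250: that saturates to 127 as a signed byte but fits as an unsigned byte, so `sqshrn(1000, 2)` returns `(127, True)` and `sqshrun(1000, 2)` returns `(250, False)`.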
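SLI (vector shift left and insert) takes each element of a vector, shifts it left by an immediate value, and inserts the result into the destination vector. Bits shifted out of the left of each element are lost.
SRI (vector shift right and insert) takes each element of a vector, shifts it right by an immediate value, and inserts the result into the destination vector. Bits shifted out of the right of each element are lost.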
SLI (vector) Shift left and insert (immediate)
SRI (vector) Shift right and insert (immediate)
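"Insert" here means that the destination bits not covered by the shifted value are preserved. A sketch in plain Python for one unsigned lane; the function names follow the mnemonics and the argument order (old destination first) is an assumption made for illustration:

```python
def sli(dest, src, n, bits=8):
    """SLI: shifted src fills the top; the low n bits of dest survive."""
    mask = (1 << bits) - 1
    keep = (1 << n) - 1
    return ((src << n) & mask) | (dest & keep)


def sri(dest, src, n, bits=8):
    """SRI: shifted src fills the bottom; the high n bits of dest survive."""
    mask = (1 << bits) - 1
    keep = mask & ~(mask >> n)
    return (src >> n) | (dest & keep)
```

With a destination of `0xFF`, `sli(0xFF, 0x01, 4)` gives `0x1F` and `sri(0xFF, 0x80, 4)` gives `0xF8`.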
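ABA (vector absolute difference and accumulate) subtracts the elements of one vector from the corresponding elements of another and accumulates the absolute values of the results into the elements of the destination vector.
ABD (vector absolute difference) subtracts the elements of one vector from the corresponding elements of another and places the absolute values of the results in the elements of the destination vector.
Long forms of both instructions are available.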
SABA (vector) Signed absolute difference and accumulate
SABAL, SABAL2 (vector) Signed absolute difference and accumulate long
UABA (vector) Unsigned absolute difference and accumulate
UABAL, UABAL2 (vector) Unsigned absolute difference and accumulate long
SABD (vector) Signed absolute difference
SABDL, SABDL2 (vector) Signed absolute difference long
UABD (vector) Unsigned absolute difference
UABDL, UABDL2 (vector) Unsigned absolute difference long
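ABS (vector absolute value) takes the absolute value of each element of a vector and places the result in a second vector. (The floating-point form only clears the sign bit.)
NEG (vector negate) negates each element of a vector and places the result in a second vector. (The floating-point form only inverts the sign bit.)
Saturating forms of both instructions are available. If saturation occurs, the sticky QC flag (FPSR bit [27]) is set.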
ABS (vector) Absolute value
SQABS (vector) Signed saturating absolute value
NEG (vector) Negate
SQNEG (vector) Signed saturating negate
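ADD (vector add) adds corresponding elements of two vectors and places the results in the destination vector.
SUB (vector subtract) subtracts the elements of one vector from the corresponding elements of another and places the results in the destination vector.
Saturating, long, and wide forms are available. If saturation occurs, the sticky QC flag (FPSR bit [27]) is set.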
ADD (vector) Add
SQADD (vector) Signed saturating add
UQADD (vector) Unsigned saturating add
SADDL, SADDL2 (vector) Signed add long
UADDL, UADDL2 (vector) Unsigned add long
SADDW, SADDW2 (vector) Signed add wide
UADDW, UADDW2 (vector) Unsigned add wide
SUB (vector) Subtract
SQSUB (vector) Signed saturating subtract
UQSUB (vector) Unsigned saturating subtract
SSUBL, SSUBL2 (vector) Signed subtract long
USUBL, USUBL2 (vector) Unsigned subtract long
SSUBW, SSUBW2 (vector) Signed subtract wide
USUBW, USUBW2 (vector) Unsigned subtract wide
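{R}ADDHN (vector add narrow, returning high half) adds corresponding elements of two vectors, takes the most significant half of each sum, and places the result in the destination vector. The result can be rounded or truncated.
{R}SUBHN (vector subtract narrow, returning high half) subtracts the elements of one vector from the corresponding elements of another, takes the most significant half of each difference, and places the result in the destination vector. The result can be rounded or truncated.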
ADDHN, ADDHN2 (vector) Add returning high narrow
RADDHN, RADDHN2 (vector) Rounding add returning high narrow
SUBHN, SUBHN2 (vector) Subtract returning high narrow
RSUBHN, RSUBHN2 (vector) Rounding subtract returning high narrow
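HADD (vector halving add) adds corresponding elements of two vectors, shifts each result right by one bit, and places the results in the destination vector. The result can be rounded or truncated.
HSUB (vector halving subtract) subtracts the elements of one vector from the corresponding elements of another, shifts each result right by one bit, and places the results in the destination vector. The result is always truncated.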
SHADD (vector) Signed halving add
UHADD (vector) Unsigned halving add
SRHADD (vector) Signed rounding halving add
URHADD (vector) Unsigned rounding halving add
SHSUB (vector) Signed halving subtract
UHSUB (vector) Unsigned halving subtract
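The halving forms compute the sum or difference at full precision before the shift, so the intermediate value cannot overflow. A minimal sketch in plain Python for one lane, using Python's unbounded integers to stand in for the wider intermediate; the function names are illustrative:

```python
def hadd(a, b, rounding=False):
    """SHADD/UHADD, or SRHADD/URHADD when rounding=True."""
    return (a + b + (1 if rounding else 0)) >> 1


def hsub(a, b):
    """SHSUB/UHSUB: the discarded bit is dropped (rounds toward -infinity)."""
    return (a - b) >> 1
```

`hadd(127, 127)` is 127 even for signed 8-bit lanes, where a plain ADD would have wrapped.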
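ADDP (vector add pairwise) adds adjacent pairs of elements of two vectors and places the results in the destination vector.
ADDLP (vector add long pairwise) adds adjacent pairs of elements of a vector, sign-extends or zero-extends the results to twice the original width, and places them in the destination vector.
ADALP (vector add and accumulate long pairwise) adds adjacent pairs of elements of a vector and accumulates the widened sums into the elements of the destination vector.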
ADDP (vector) Add pairwise
SADDLP (vector) Signed add long pairwise
UADDLP (vector) Unsigned add long pairwise
SADALP (vector) Signed add and accumulate long pairwise
UADALP (vector) Unsigned add and accumulate long pairwise
SUQADD (vector) Signed saturating accumulate of unsigned value
USQADD (vector) Unsigned saturating accumulate of signed value
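The pairwise add instructions earlier in this list combine neighbours inside a vector rather than matching lanes of two vectors. A sketch in plain Python on unsigned lanes, assuming lists with lane 0 first; the function names mirror the mnemonics but are only a model:

```python
def addp(vn, vm, bits=8):
    """ADDP: pair sums of vn fill the low half, those of vm the high half."""
    mask = (1 << bits) - 1
    both = vn + vm
    return [(both[i] + both[i + 1]) & mask for i in range(0, len(both), 2)]


def uaddlp(vn):
    """UADDLP: pair sums within one vector, kept at double width."""
    return [vn[i] + vn[i + 1] for i in range(0, len(vn), 2)]


def uadalp(acc, vn, bits=16):
    """UADALP: add the widened pair sums into the destination lanes."""
    mask = (1 << bits) - 1
    return [(a + s) & mask for a, s in zip(acc, uaddlp(vn))]
```

Because the long forms widen the result, `uaddlp([255, 255, 1, 2])` yields `[510, 3]` without wrapping.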
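MAX (vector maximum) compares corresponding elements of two vectors and copies the larger of each pair into the corresponding element of the destination vector.
MIN (vector minimum) compares corresponding elements of two vectors and copies the smaller of each pair into the corresponding element of the destination vector.
MAXP (vector pairwise maximum) compares adjacent pairs of elements of two vectors and copies the larger of each pair into the corresponding element of the destination vector.
MINP (vector pairwise minimum) compares adjacent pairs of elements of two vectors and copies the smaller of each pair into the corresponding element of the destination vector.
In AArch32 the operands and result of the pairwise forms must be doubleword vectors; the A64 forms listed below also accept quadword vectors.
For a diagram of pairwise operations, see Figure 5-5 on page 5-63.
Floating-point maximum and minimum: max(+0.0, -0.0) = +0.0, min(+0.0, -0.0) = -0.0.
If any input is a NaN, the corresponding result element is the default NaN.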
SMAX (vector) Signed maximum
UMAX (vector) Unsigned maximum
SMIN (vector) Signed minimum
UMIN (vector) Unsigned minimum
SMAXP (vector) Signed maximum pairwise
UMAXP (vector) Unsigned maximum pairwise
SMINP (vector) Signed minimum pairwise
UMINP (vector) Unsigned minimum pairwise
Compute the sum, maximum, or minimum across a vector.
ADDV (vector) Add across vector
SADDLV (vector) Signed add long across vector
UADDLV (vector) Unsigned sum long across vector
SMAXV (vector) Signed maximum across vector
UMAXV (vector) Unsigned maximum across vector
SMINV (vector) Signed minimum across vector
UMINV (vector) Unsigned minimum across vector
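CLS (vector count leading sign bits) counts, in each element of a vector, the number of consecutive bits following the most significant bit that are the same as it, and places the results in a second vector.
CLZ (vector count leading zeros) counts, in each element of a vector, the number of consecutive zero bits starting from the most significant bit, and places the results in a second vector.
CNT (vector count set bits) counts the number of bits that are 1 in each element of a vector and places the results in a second vector.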
CLS (vector) Count leading sign bits
CLZ (vector) Count leading zero bits
CNT (vector) Population count per byte
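The three counting instructions are easy to confuse, especially CLS, which does not count the sign bit itself. A minimal sketch in plain Python for one lane, assuming the lane is given as an unsigned bit pattern; the function names follow the mnemonics:

```python
def clz(x, bits=8):
    """Number of zero bits above the highest set bit."""
    return bits - x.bit_length()


def cls(x, bits=8):
    """Number of bits after the top bit that are equal to the top bit."""
    if x >> (bits - 1):               # negative: count leading ones instead
        x = ~x & ((1 << bits) - 1)
    return clz(x, bits) - 1           # the sign bit itself is not counted


def cnt(x):
    """Number of set bits (CNT operates on byte lanes)."""
    return bin(x).count("1")
```

For 8-bit lanes `cls` is at most 7: both `0x00` and `0xFF` return 7.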
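RECPE (vector reciprocal estimate) finds an approximate reciprocal of each element of a vector and places the results in a second vector.
RSQRTE (vector reciprocal square root estimate) finds an approximate reciprocal square root of each element of a vector and places the results in a second vector.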
URECPE (vector) Unsigned reciprocal estimate
URSQRTE (vector) Unsigned reciprocal square root estimate
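MUL (vector multiply) multiplies corresponding elements of two vectors and places the results in the destination vector.
MLA (vector multiply-add) multiplies corresponding elements of two vectors and accumulates the results into the elements of the destination vector.
MLS (vector multiply-subtract) multiplies corresponding elements of two vectors, subtracts the products from the corresponding elements of the destination vector, and places the final results in the destination vector.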
MUL (vector): Multiply (vector).
SMULL, SMULL2 (vector): Signed Multiply Long (vector).
UMULL, UMULL2 (vector): Unsigned Multiply long (vector).
MLA (vector): Multiply-Add to accumulator (vector).
SMLAL, SMLAL2 (vector): Signed Multiply-Add Long (vector).
UMLAL, UMLAL2 (vector): Unsigned Multiply-Add Long (vector).
MLS (vector): Multiply-Subtract from accumulator (vector).
SMLSL, SMLSL2 (vector): Signed Multiply-Subtract Long (vector).
UMLSL, UMLSL2 (vector): Unsigned Multiply-Subtract Long (vector).
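MUL (vector multiply by scalar) multiplies each element of a vector by a scalar and places the results in the destination vector.
MLA (vector multiply-add by scalar) multiplies each element of a vector by a scalar and accumulates the results into the corresponding elements of the destination vector.
MLS (vector multiply-subtract by scalar) multiplies each element of a vector by a scalar, subtracts the products from the corresponding elements of the destination vector, and places the final results in the destination vector.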
MUL (by element): Multiply (vector, by element).
SMULL, SMULL2 (by element): Signed Multiply Long (vector, by element).
UMULL, UMULL2 (by element): Unsigned Multiply Long (vector, by element).
MLA (by element): Multiply-Add to accumulator (vector, by element).
SMLAL, SMLAL2 (by element): Signed Multiply-Add Long (vector, by element).
UMLAL, UMLAL2 (by element): Unsigned Multiply-Add Long (vector, by element).
MLS (by element): Multiply-Subtract from accumulator (vector, by element).
SMLSL, SMLSL2 (by element): Signed Multiply-Subtract Long (vector, by element).
UMLSL, UMLSL2 (by element): Unsigned Multiply-Subtract Long (vector, by element).
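The vector saturating doubling multiply long instructions multiply their operands and double the result. SQDMULL places the result in the destination register. SQDMLAL adds the result to the value in the destination register. SQDMLSL subtracts the result from the value in the destination register.
If any result overflows, it is saturated. If saturation occurs, the sticky QC flag (FPSR bit [27]) is set.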
SQDMULL, SQDMULL2 (by element): Signed saturating Doubling Multiply Long (by element).
SQDMULL, SQDMULL2 (vector): Signed saturating Doubling Multiply Long.
SQDMLAL, SQDMLAL2 (by element): Signed saturating Doubling Multiply-Add Long (by element).
SQDMLAL, SQDMLAL2 (vector): Signed saturating Doubling Multiply-Add Long.
SQDMLSL, SQDMLSL2 (by element): Signed saturating Doubling Multiply-Subtract Long (by element).
SQDMLSL, SQDMLSL2 (vector): Signed saturating Doubling Multiply-Subtract Long.
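The vector saturating doubling multiply returning high half instructions multiply their operands and double the result, returning only the high half of the result.
If any result overflows, it is saturated. If saturation occurs, the sticky QC flag (FPSR bit [27]) is set.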
SQDMULH (by element): Signed saturating Doubling Multiply returning High half (by element).
SQDMULH (vector): Signed saturating Doubling Multiply returning High half.
SQRDMULH (by element): Signed saturating Rounding Doubling Multiply returning High half (by element).
SQRDMULH (vector): Signed saturating Rounding Doubling Multiply returning High half.
SQRDMLAH (by element): Signed Saturating Rounding Doubling Multiply Accumulate returning High Half (by element).
SQRDMLAH (vector): Signed Saturating Rounding Doubling Multiply Accumulate returning High Half (vector).
SQRDMLSH (by element): Signed Saturating Rounding Doubling Multiply Subtract returning High Half (by element).
SQRDMLSH (vector): Signed Saturating Rounding Doubling Multiply Subtract returning High Half (vector).
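Doubling and taking the high half is what makes SQDMULH a fixed-point multiply: for 16-bit lanes it multiplies two Q15 fractions and returns a Q15 result. A sketch in plain Python for one signed lane; the function name, the `rounding` switch (standing in for SQRDMULH), and the returned saturation flag are illustrative assumptions:

```python
def sqdmulh(a, b, bits=16, rounding=False):
    """Return (high half of 2*a*b, saturated?) for signed `bits`-wide lanes."""
    product = 2 * a * b
    if rounding:
        product += 1 << (bits - 1)    # SQRDMULH rounds before truncating
    high = product >> bits
    top = (1 << (bits - 1)) - 1
    # The only overflow case is (most negative) * (most negative).
    return min(high, top), high > top
```

In Q15, 16384 represents 0.5, and `sqdmulh(16384, 16384)` returns `(8192, False)`, which is 0.25. The pair of most negative inputs saturates: `sqdmulh(-32768, -32768)` returns `(32767, True)`.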
PMUL: Polynomial Multiply.
PMULL, PMULL2: Polynomial Multiply Long.
SDOT (by element): Dot Product signed arithmetic (vector, by element).
SDOT (vector): Dot Product signed arithmetic (vector).
UDOT (by element): Dot Product unsigned arithmetic (vector, by element).
UDOT (vector): Dot Product unsigned arithmetic (vector).
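Vector load single n-element structure to one lane. Loads one n-element structure from memory into one or more NEON registers. Register elements that are not loaded are left unchanged.
Vector store single n-element structure from one lane. Stores one n-element structure from one or more NEON registers to memory.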
LD1 (single structure): Load one single-element structure to one lane of one register.
LD2 (single structure): Load single 2-element structure to one lane of two registers.
LD3 (single structure): Load single 3-element structure to one lane of three registers.
LD4 (single structure): Load single 4-element structure to one lane of four registers.
ST1 (single structure): Store a single-element structure from one lane of one register.
ST2 (single structure): Store single 2-element structure from one lane of two registers.
ST3 (single structure): Store single 3-element structure from one lane of three registers.
ST4 (single structure): Store single 4-element structure from one lane of four registers.
Vector load single n-element structure to all lanes. Loads multiple copies of one n-element structure from memory into one or more NEON registers.
LD1R: Load one single-element structure and Replicate to all lanes (of one register).
LD2R: Load single 2-element structure and Replicate to all lanes of two registers.
LD3R: Load single 3-element structure and Replicate to all lanes of three registers.
LD4R: Load single 4-element structure and Replicate to all lanes of four registers.
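Vector load multiple n-element structures. Loads multiple n-element structures from memory into one or more NEON registers, de-interleaving them (unless n == 1). Every element of every register is loaded.
Vector store multiple n-element structures. Stores multiple n-element structures from one or more NEON registers to memory, interleaving them (unless n == 1). Every element of every register is stored.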
LD1 (multiple structures): Load multiple single-element structures to one, two, three, or four registers.
LD2 (multiple structures): Load multiple 2-element structures to two registers.
LD3 (multiple structures): Load multiple 3-element structures to three registers.
LD4 (multiple structures): Load multiple 4-element structures to four registers.
ST1 (multiple structures): Store multiple single-element structures from one, two, three, or four registers.
ST2 (multiple structures): Store multiple 2-element structures from two registers.
ST3 (multiple structures): Store multiple 3-element structures from three registers.
ST4 (multiple structures): Store multiple 4-element structures from four registers.
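De-interleaving on load is what makes LD2/LD3/LD4 useful for data such as packed RGB pixels. A minimal sketch in plain Python, assuming memory is a flat list of elements and each register is a list of lanes; the function names `ld_multi` and `st_multi` are hypothetical:

```python
def ld_multi(memory, n):
    """LDn (multiple structures): register r receives elements r, r+n, r+2n, ..."""
    return [memory[r::n] for r in range(n)]


def st_multi(registers):
    """STn (multiple structures): interleave the registers back into memory order."""
    out = []
    for lanes in zip(*registers):
        out.extend(lanes)
    return out
```

Loading `[1, 2, 3, 4, 5, 6]` with `n = 3` gives three registers `[1, 4]`, `[2, 5]`, and `[3, 6]`, and storing them restores the original order.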
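The VLDR pseudo-instruction loads a constant value into every element of a 64-bit NEON vector, or into a VFP single-precision or double-precision register.
If an instruction such as VMOV can generate the constant directly in the register, the assembler uses it. Otherwise, the assembler creates a doubleword literal pool entry containing the constant and loads it with a VLDR instruction.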
LDR (literal, SIMD&FP): Load SIMD&FP Register (PC-relative literal).
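Pseudo-instructions that load or store extension registers with post-increment and pre-decrement.
For VLDR and VSTR instructions without post-increment or pre-decrement, see VLDR and VSTR on page 5-23.
The post-increment instruction increments the base address in the register by the offset value after the transfer. The pre-decrement instruction decrements the base address in the register by the offset value and then performs the transfer using the new address in the register. These pseudo-instructions assemble to VLDM or VSTM instructions (see VLDM, VSTM, VPOP, and VPUSH on page 5-24).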
LDR (immediate, SIMD&FP): Load SIMD&FP Register (immediate offset).
LDR (register, SIMD&FP): Load SIMD&FP Register (register offset).
STR (immediate, SIMD&FP): Store SIMD&FP register (immediate offset).
STR (register, SIMD&FP): Store SIMD&FP register (register offset).
FABD: Floating-point Absolute Difference (vector).
FABS (scalar): Floating-point Absolute value (scalar).
FABS (vector): Floating-point Absolute value (vector).
FACGE: Floating-point Absolute Compare Greater than or Equal (vector).
FACGT: Floating-point Absolute Compare Greater than (vector).
FADD (scalar): Floating-point Add (scalar).
FADD (vector): Floating-point Add (vector).
FADDP (scalar): Floating-point Add Pair of elements (scalar).
FADDP (vector): Floating-point Add Pairwise (vector).
FCADD: Floating-point Complex Add.
FCCMP: Floating-point Conditional quiet Compare (scalar).
FCCMPE: Floating-point Conditional signaling Compare (scalar).
FCMEQ (register): Floating-point Compare Equal (vector).
FCMEQ (zero): Floating-point Compare Equal to zero (vector).
FCMGE (register): Floating-point Compare Greater than or Equal (vector).
FCMGE (zero): Floating-point Compare Greater than or Equal to zero (vector).
FCMGT (register): Floating-point Compare Greater than (vector).
FCMGT (zero): Floating-point Compare Greater than zero (vector).
FCMLA: Floating-point Complex Multiply Accumulate.
FCMLA (by element): Floating-point Complex Multiply Accumulate (by element).
FCMLE (zero): Floating-point Compare Less than or Equal to zero (vector).
FCMLT (zero): Floating-point Compare Less than zero (vector).
FCMP: Floating-point quiet Compare (scalar).
FCMPE: Floating-point signaling Compare (scalar).
FCSEL: Floating-point Conditional Select (scalar).
FCVT: Floating-point Convert precision (scalar).
FCVTAS (scalar): Floating-point Convert to Signed integer, rounding to nearest with ties to Away (scalar).
FCVTAS (vector): Floating-point Convert to Signed integer, rounding to nearest with ties to Away (vector).
FCVTAU (scalar): Floating-point Convert to Unsigned integer, rounding to nearest with ties to Away (scalar).
FCVTAU (vector): Floating-point Convert to Unsigned integer, rounding to nearest with ties to Away (vector).
FCVTL, FCVTL2: Floating-point Convert to higher precision Long (vector).
FCVTMS (scalar): Floating-point Convert to Signed integer, rounding toward Minus infinity (scalar).
FCVTMS (vector): Floating-point Convert to Signed integer, rounding toward Minus infinity (vector).
FCVTMU (scalar): Floating-point Convert to Unsigned integer, rounding toward Minus infinity (scalar).
FCVTMU (vector): Floating-point Convert to Unsigned integer, rounding toward Minus infinity (vector).
FCVTN, FCVTN2: Floating-point Convert to lower precision Narrow (vector).
FCVTNS (scalar): Floating-point Convert to Signed integer, rounding to nearest with ties to even (scalar).
FCVTNS (vector): Floating-point Convert to Signed integer, rounding to nearest with ties to even (vector).
FCVTNU (scalar): Floating-point Convert to Unsigned integer, rounding to nearest with ties to even (scalar).
FCVTNU (vector): Floating-point Convert to Unsigned integer, rounding to nearest with ties to even (vector).
FCVTPS (scalar): Floating-point Convert to Signed integer, rounding toward Plus infinity (scalar).
FCVTPS (vector): Floating-point Convert to Signed integer, rounding toward Plus infinity (vector).
FCVTPU (scalar): Floating-point Convert to Unsigned integer, rounding toward Plus infinity (scalar).
FCVTPU (vector): Floating-point Convert to Unsigned integer, rounding toward Plus infinity (vector).
FCVTXN, FCVTXN2: Floating-point Convert to lower precision Narrow, rounding to odd (vector).
FCVTZS (scalar, fixed-point): Floating-point Convert to Signed fixed-point, rounding toward Zero (scalar).
FCVTZS (scalar, integer): Floating-point Convert to Signed integer, rounding toward Zero (scalar).
FCVTZS (vector, fixed-point): Floating-point Convert to Signed fixed-point, rounding toward Zero (vector).
FCVTZS (vector, integer): Floating-point Convert to Signed integer, rounding toward Zero (vector).
FCVTZU (scalar, fixed-point): Floating-point Convert to Unsigned fixed-point, rounding toward Zero (scalar).
FCVTZU (scalar, integer): Floating-point Convert to Unsigned integer, rounding toward Zero (scalar).
FCVTZU (vector, fixed-point): Floating-point Convert to Unsigned fixed-point, rounding toward Zero (vector).
FCVTZU (vector, integer): Floating-point Convert to Unsigned integer, rounding toward Zero (vector).
FDIV (scalar): Floating-point Divide (scalar).
FDIV (vector): Floating-point Divide (vector).
FJCVTZS: Floating-point Javascript Convert to Signed fixed-point, rounding toward Zero.
FMADD: Floating-point fused Multiply-Add (scalar).
FMAX (scalar): Floating-point Maximum (scalar).
FMAX (vector): Floating-point Maximum (vector).
FMAXNM (scalar): Floating-point Maximum Number (scalar).
FMAXNM (vector): Floating-point Maximum Number (vector).
FMAXNMP (scalar): Floating-point Maximum Number of Pair of elements (scalar).
FMAXNMP (vector): Floating-point Maximum Number Pairwise (vector).
FMAXNMV: Floating-point Maximum Number across Vector.
FMAXP (scalar): Floating-point Maximum of Pair of elements (scalar).
FMAXP (vector): Floating-point Maximum Pairwise (vector).
FMAXV: Floating-point Maximum across Vector.
FMIN (scalar): Floating-point Minimum (scalar).
FMIN (vector): Floating-point minimum (vector).
FMINNM (scalar): Floating-point Minimum Number (scalar).
FMINNM (vector): Floating-point Minimum Number (vector).
FMINNMP (scalar): Floating-point Minimum Number of Pair of elements (scalar).
FMINNMP (vector): Floating-point Minimum Number Pairwise (vector).
FMINNMV: Floating-point Minimum Number across Vector.
FMINP (scalar): Floating-point Minimum of Pair of elements (scalar).
FMINP (vector): Floating-point Minimum Pairwise (vector).
FMINV: Floating-point Minimum across Vector.
FMLA (by element): Floating-point fused Multiply-Add to accumulator (by element).
FMLA (vector): Floating-point fused Multiply-Add to accumulator (vector).
FMLAL, FMLAL2 (by element): Floating-point fused Multiply-Add Long to accumulator (by element).
FMLAL, FMLAL2 (vector): Floating-point fused Multiply-Add Long to accumulator (vector).
FMLS (by element): Floating-point fused Multiply-Subtract from accumulator (by element).
FMLS (vector): Floating-point fused Multiply-Subtract from accumulator (vector).
FMLSL, FMLSL2 (by element): Floating-point fused Multiply-Subtract Long from accumulator (by element).
FMLSL, FMLSL2 (vector): Floating-point fused Multiply-Subtract Long from accumulator (vector).
FMOV (general): Floating-point Move to or from general-purpose register without conversion.
FMOV (register): Floating-point Move register without conversion.
FMOV (scalar, immediate): Floating-point move immediate (scalar).
FMOV (vector, immediate): Floating-point move immediate (vector).
FMSUB: Floating-point Fused Multiply-Subtract (scalar).
FMUL (by element): Floating-point Multiply (by element).
FMUL (scalar): Floating-point Multiply (scalar).
FMUL (vector): Floating-point Multiply (vector).
FMULX: Floating-point Multiply extended.
FMULX (by element): Floating-point Multiply extended (by element).
FNEG (scalar): Floating-point Negate (scalar).
FNEG (vector): Floating-point Negate (vector).
FNMADD: Floating-point Negated fused Multiply-Add (scalar).
FNMSUB: Floating-point Negated fused Multiply-Subtract (scalar).
FNMUL (scalar): Floating-point Multiply-Negate (scalar).
FRECPE: Floating-point Reciprocal Estimate.
FRECPS: Floating-point Reciprocal Step.
FRECPX: Floating-point Reciprocal exponent (scalar).
FRINTA (scalar): Floating-point Round to Integral, to nearest with ties to Away (scalar).
FRINTA (vector): Floating-point Round to Integral, to nearest with ties to Away (vector).
FRINTI (scalar): Floating-point Round to Integral, using current rounding mode (scalar).
FRINTI (vector): Floating-point Round to Integral, using current rounding mode (vector).
FRINTM (scalar): Floating-point Round to Integral, toward Minus infinity (scalar).
FRINTM (vector): Floating-point Round to Integral, toward Minus infinity (vector).
FRINTN (scalar): Floating-point Round to Integral, to nearest with ties to even (scalar).
FRINTN (vector): Floating-point Round to Integral, to nearest with ties to even (vector).
FRINTP (scalar): Floating-point Round to Integral, toward Plus infinity (scalar).
FRINTP (vector): Floating-point Round to Integral, toward Plus infinity (vector).
FRINTX (scalar): Floating-point Round to Integral exact, using current rounding mode (scalar).
FRINTX (vector): Floating-point Round to Integral exact, using current rounding mode (vector).
FRINTZ (scalar): Floating-point Round to Integral, toward Zero (scalar).
FRINTZ (vector): Floating-point Round to Integral, toward Zero (vector).
FRSQRTE: Floating-point Reciprocal Square Root Estimate.
FRSQRTS: Floating-point Reciprocal Square Root Step.
FSQRT (scalar): Floating-point Square Root (scalar).
FSQRT (vector): Floating-point Square Root (vector).
FSUB (scalar): Floating-point Subtract (scalar).
FSUB (vector): Floating-point Subtract (vector).
AESD: AES single round decryption.
AESE: AES single round encryption.
AESIMC: AES inverse mix columns.
AESMC: AES mix columns.
SHA1C: SHA1 hash update (choose).
SHA1H: SHA1 fixed rotate.
SHA1M: SHA1 hash update (majority).
SHA1P: SHA1 hash update (parity).
SHA1SU0: SHA1 schedule update 0.
SHA1SU1: SHA1 schedule update 1.
SHA256H: SHA256 hash update (part 1).
SHA256H2: SHA256 hash update (part 2).
SHA256SU0: SHA256 schedule update 0.
SHA256SU1: SHA256 schedule update 1.
SHA512H: SHA512 Hash update part 1.
SHA512H2: SHA512 Hash update part 2.
SHA512SU0: SHA512 Schedule Update 0.
SHA512SU1: SHA512 Schedule Update 1.
SM3PARTW1: SM3PARTW1.
SM3PARTW2: SM3PARTW2.
SM3SS1: SM3SS1.
SM3TT1A: SM3TT1A.
SM3TT1B: SM3TT1B.
SM3TT2A: SM3TT2A.
SM3TT2B: SM3TT2B.
SM4E: SM4 Encode.
SM4EKEY: SM4 Key.
LDNP (SIMD&FP): Load Pair of SIMD&FP registers, with Non-temporal hint.
LDP (SIMD&FP): Load Pair of SIMD&FP registers.
LDUR (SIMD&FP): Load SIMD&FP Register (unscaled offset).
STNP (SIMD&FP): Store Pair of SIMD&FP registers, with Non-temporal hint.
STP (SIMD&FP): Store Pair of SIMD&FP registers.
STUR (SIMD&FP): Store SIMD&FP register (unscaled offset).