zhashung0920

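AVX Instruction Set Function List, Translated

AVX Instruction Set Function List

Based on Intel Intrinsics Guide 3.62. Functions in AVX and AVX2 whose names begin with _mm_ (rather than _mm256_) are not included. This document is intended for beginners; see the official documentation for full details. Each entry gives the function signature, a short summary, and the description from the Intel guide.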

Arithmetic

__m256i _mm256_add_epi16 (__m256i a, __m256i b)

Adds the 16-bit integer vectors a and b.

Add packed 16-bit integers in a and b, and store the results in dst.

__m256i _mm256_add_epi32 (__m256i a, __m256i b)

Adds the 32-bit integer vectors a and b.

Add packed 32-bit integers in a and b, and store the results in dst.

__m256i _mm256_add_epi64 (__m256i a, __m256i b)

Adds the 64-bit integer vectors a and b.

Add packed 64-bit integers in a and b, and store the results in dst.

__m256i _mm256_add_epi8 (__m256i a, __m256i b)

Adds the 8-bit integer vectors a and b.

Add packed 8-bit integers in a and b, and store the results in dst.

__m256d _mm256_add_pd (__m256d a, __m256d b)

Adds the double-precision (64-bit) floating-point vectors a and b.

Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

__m256 _mm256_add_ps (__m256 a, __m256 b)

Adds the single-precision (32-bit) floating-point vectors a and b.

Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

__m256i _mm256_adds_epi16 (__m256i a, __m256i b)

Adds the signed 16-bit integer vectors a and b with saturation.

Add packed 16-bit integers in a and b using saturation, and store the results in dst.

__m256i _mm256_adds_epi8 (__m256i a, __m256i b)

Adds the signed 8-bit integer vectors a and b with saturation.

Add packed 8-bit integers in a and b using saturation, and store the results in dst.

__m256i _mm256_adds_epu16 (__m256i a, __m256i b)

Adds the unsigned 16-bit integer vectors a and b with saturation.

Add packed unsigned 16-bit integers in a and b using saturation, and store the results in dst.

__m256i _mm256_adds_epu8 (__m256i a, __m256i b)

Adds the unsigned 8-bit integer vectors a and b with saturation.

Add packed unsigned 8-bit integers in a and b using saturation, and store the results in dst.
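To make the difference between wrapping and saturating addition concrete, here is a minimal sketch. The helper name add_wrap_vs_sat is hypothetical, and the target attribute is a GCC/Clang-specific assumption that lets the function compile without -mavx2; the CPU must still support AVX2 at run time.

```c
#include <immintrin.h>
#include <stdint.h>

/* Adds x and y in every 8-bit lane twice: once with wrap-around
   (_mm256_add_epi8) and once with unsigned saturation (_mm256_adds_epu8).
   Lane 0 of each result is returned through the output pointers. */
__attribute__((target("avx2")))
void add_wrap_vs_sat(uint8_t x, uint8_t y, uint8_t *wrapped, uint8_t *saturated)
{
    __m256i a = _mm256_set1_epi8((char)x);
    __m256i b = _mm256_set1_epi8((char)y);
    uint8_t buf[32];

    _mm256_storeu_si256((__m256i *)buf, _mm256_add_epi8(a, b));
    *wrapped = buf[0];      /* result modulo 256 */

    _mm256_storeu_si256((__m256i *)buf, _mm256_adds_epu8(a, b));
    *saturated = buf[0];    /* result clamped to 255 */
}
```

With x = 200 and y = 100, the wrapping form yields 44 (300 modulo 256) while the saturating form clamps to 255.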

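__m256d _mm256_addsub_pd (__m256d a, __m256d b)

Alternately subtracts and adds the double-precision (64-bit) floating-point vectors a and b: even-indexed lanes compute a - b, odd-indexed lanes compute a + b.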

Alternatively add and subtract packed double-precision (64-bit) floating-point elements in a to/from packed elements in b, and store the results in dst.

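__m256 _mm256_addsub_ps (__m256 a, __m256 b)

Alternately subtracts and adds the single-precision (32-bit) floating-point vectors a and b: even-indexed lanes compute a - b, odd-indexed lanes compute a + b.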

Alternatively add and subtract packed single-precision (32-bit) floating-point elements in a to/from packed elements in b, and store the results in dst.
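The lane pattern of addsub is easy to get backwards, so a small sketch may help. The function name addsub8 is hypothetical, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>

/* out[i] = a[i] - b[i] for even i, a[i] + b[i] for odd i (8 floats). */
__attribute__((target("avx")))
void addsub8(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_addsub_ps(va, vb));
}
```

For a filled with 10 and b filled with 1, the output alternates 9, 11, 9, 11, and so on.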

__m256d _mm256_div_pd (__m256d a, __m256d b)

Divides the double-precision (64-bit) floating-point vector a by b.

Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.

__m256 _mm256_div_ps (__m256 a, __m256 b)

Divides the single-precision (32-bit) floating-point vector a by b.

Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.

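__m256 _mm256_dp_ps (__m256 a, __m256 b, const int imm8)

Bits 4-7 of imm8 select which lanes of the single-precision (32-bit) floating-point vectors a and b are multiplied. The selected products are summed, and the low four bits of imm8 select which lanes of dst receive the sum (the other lanes are set to zero). The upper and lower 128-bit halves are computed separately.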

Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sum the four products, and conditionally store the sum in dst using the low 4 bits of imm8.
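As an illustration of the imm8 encoding, the sketch below computes an 8-element dot product. The name dot8 is hypothetical; 0xF1 multiplies all four lanes of each 128-bit half and writes each half's sum to its lane 0, after which the two halves are added. The target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>

/* Dot product of two 8-float arrays using _mm256_dp_ps. */
__attribute__((target("avx")))
float dot8(const float *a, const float *b)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    /* High nibble 0xF: use all four products; low nibble 0x1: store in lane 0. */
    __m256 dp = _mm256_dp_ps(va, vb, 0xF1);
    __m128 lo = _mm256_castps256_ps128(dp);     /* sum of elements 0..3 */
    __m128 hi = _mm256_extractf128_ps(dp, 1);   /* sum of elements 4..7 */
    return _mm_cvtss_f32(_mm_add_ss(lo, hi));
}
```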

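__m256i _mm256_hadd_epi16 (__m256i a, __m256i b)

Adds each pair of adjacent lanes in the 16-bit integer vector a and in b separately; within each 128-bit half of dst, the sums from a are stored first, followed by the sums from b.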

Horizontally add adjacent pairs of 16-bit integers in a and b, and pack the signed 16-bit results in dst.

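__m256i _mm256_hadd_epi32 (__m256i a, __m256i b)

Adds each pair of adjacent lanes in the 32-bit integer vector a and in b separately; within each 128-bit half of dst, the sums from a are stored first, followed by the sums from b.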

Horizontally add adjacent pairs of 32-bit integers in a and b, and pack the signed 32-bit results in dst.

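__m256d _mm256_hadd_pd (__m256d a, __m256d b)

Adds each pair of adjacent lanes in the double-precision (64-bit) floating-point vector a and in b separately; within each 128-bit half of dst, the sum from a is stored first, followed by the sum from b.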

Horizontally add adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

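__m256 _mm256_hadd_ps (__m256 a, __m256 b)

Adds each pair of adjacent lanes in the single-precision (32-bit) floating-point vector a and in b separately; within each 128-bit half of dst, the sums from a are stored first, followed by the sums from b.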

Horizontally add adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.
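The output layout of the horizontal adds is the part that most often surprises, so here is a small sketch that exposes it. The name hadd8 is hypothetical, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>

/* Stores _mm256_hadd_ps(a, b) for two 8-float arrays into out.
   Resulting order:
     out[0..1] = a0+a1, a2+a3    out[2..3] = b0+b1, b2+b3   (low 128 bits)
     out[4..5] = a4+a5, a6+a7    out[6..7] = b4+b5, b6+b7   (high 128 bits) */
__attribute__((target("avx")))
void hadd8(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_hadd_ps(va, vb));
}
```

With a = 0..7 and b = 10..17 the result is 1, 5, 21, 25, 9, 13, 29, 33, which shows that a and b alternate per 128-bit half rather than per element.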

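__m256i _mm256_hadds_epi16 (__m256i a, __m256i b)

With saturation, adds each pair of adjacent lanes in the signed 16-bit integer vector a and in b separately; within each 128-bit half of dst, the sums from a are stored first, followed by the sums from b.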

Horizontally add adjacent pairs of signed 16-bit integers in a and b using saturation, and pack the signed 16-bit results in dst.

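__m256i _mm256_hsub_epi16 (__m256i a, __m256i b)

Subtracts each pair of adjacent lanes in the 16-bit integer vector a and in b separately; within each 128-bit half of dst, the differences from a are stored first, followed by the differences from b.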

Horizontally subtract adjacent pairs of 16-bit integers in a and b, and pack the signed 16-bit results in dst.

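__m256i _mm256_hsub_epi32 (__m256i a, __m256i b)

Subtracts each pair of adjacent lanes in the 32-bit integer vector a and in b separately; within each 128-bit half of dst, the differences from a are stored first, followed by the differences from b.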

Horizontally subtract adjacent pairs of 32-bit integers in a and b, and pack the signed 32-bit results in dst.

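__m256d _mm256_hsub_pd (__m256d a, __m256d b)

Subtracts each pair of adjacent lanes in the double-precision (64-bit) floating-point vector a and in b separately; within each 128-bit half of dst, the difference from a is stored first, followed by the difference from b.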

Horizontally subtract adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

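__m256 _mm256_hsub_ps (__m256 a, __m256 b)

Subtracts each pair of adjacent lanes in the single-precision (32-bit) floating-point vector a and in b separately; within each 128-bit half of dst, the differences from a are stored first, followed by the differences from b.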

Horizontally subtract adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.

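__m256i _mm256_hsubs_epi16 (__m256i a, __m256i b)

With saturation, subtracts each pair of adjacent lanes in the signed 16-bit integer vector a and in b separately; within each 128-bit half of dst, the differences from a are stored first, followed by the differences from b.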

Horizontally subtract adjacent pairs of signed 16-bit integers in a and b using saturation, and pack the signed 16-bit results in dst.

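__m256i _mm256_madd_epi16 (__m256i a, __m256i b)

Multiplies the signed 16-bit integer vectors a and b into 32-bit intermediate products, adds adjacent pairs of those products, and stores the 32-bit sums in dst.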

Multiply packed signed 16-bit integers in a and b, producing intermediate signed 32-bit integers. Horizontally add adjacent pairs of intermediate 32-bit integers, and pack the results in dst.
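A common use of the multiply-and-add-pairs pattern is an integer dot product. The sketch below is one way to write it for 16 elements; the name dot16 is hypothetical, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>
#include <stdint.h>

/* Dot product of two arrays of 16 signed 16-bit integers.
   _mm256_madd_epi16 yields 8 partial sums of 32 bits each,
   which are then added with a plain scalar loop. */
__attribute__((target("avx2")))
int32_t dot16(const int16_t *a, const int16_t *b)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    int32_t partial[8];
    _mm256_storeu_si256((__m256i *)partial, _mm256_madd_epi16(va, vb));

    int32_t sum = 0;
    for (int i = 0; i < 8; ++i)
        sum += partial[i];
    return sum;
}
```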

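__m256i _mm256_maddubs_epi16 (__m256i a, __m256i b)

Multiplies the unsigned 8-bit integer vector a by the signed 8-bit integer vector b into signed 16-bit intermediate products, adds adjacent pairs of those products with saturation, and stores the results in dst.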

Vertically multiply each unsigned 8-bit integer from a with the corresponding signed 8-bit integer from b, producing intermediate signed 16-bit integers. Horizontally add adjacent pairs of intermediate signed 16-bit integers, and pack the saturated results in dst.

__m256i _mm256_mul_epi32 (__m256i a, __m256i b)

Multiplies the signed low 32 bits of each 64-bit element of a and b, producing a vector of signed 64-bit results.

Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst.

__m256i _mm256_mul_epu32 (__m256i a, __m256i b)

Multiplies the unsigned low 32 bits of each 64-bit element of a and b, producing a vector of unsigned 64-bit results.

Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst.

__m256d _mm256_mul_pd (__m256d a, __m256d b)

Multiplies the double-precision (64-bit) floating-point vectors a and b.

Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

__m256 _mm256_mul_ps (__m256 a, __m256 b)

Multiplies the single-precision (32-bit) floating-point vectors a and b.

Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

__m256i _mm256_mulhi_epi16 (__m256i a, __m256i b)

Multiplies the signed 16-bit integer vectors a and b into 32-bit intermediate products and stores the high 16 bits of each product in dst.

Multiply the packed signed 16-bit integers in a and b, producing intermediate 32-bit integers, and store the high 16 bits of the intermediate integers in dst.

__m256i _mm256_mulhi_epu16 (__m256i a, __m256i b)

Multiplies the unsigned 16-bit integer vectors a and b into 32-bit intermediate products and stores the high 16 bits of each product in dst.

Multiply the packed unsigned 16-bit integers in a and b, producing intermediate 32-bit integers, and store the high 16 bits of the intermediate integers in dst.

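__m256i _mm256_mulhrs_epi16 (__m256i a, __m256i b)

Multiplies the signed 16-bit integer vectors a and b into 32-bit intermediate products, keeps the 18 most significant bits of each product, adds 1 to round, and stores bits [16:1] of that value in dst.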

Multiply packed signed 16-bit integers in a and b, producing intermediate signed 32-bit integers. Truncate each intermediate integer to the 18 most significant bits, round by adding 1, and store bits [16:1] to dst.
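The truncate, round, and shift sequence amounts to a rounded multiply of Q15 fixed-point numbers, that is, per lane ((a * b >> 14) + 1) >> 1. The sketch below illustrates this reading; the name q15_mul is hypothetical, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>
#include <stdint.h>

/* Multiplies two Q15 fixed-point values (32768 represents 1.0)
   with rounding, using lane 0 of _mm256_mulhrs_epi16. */
__attribute__((target("avx2")))
int16_t q15_mul(int16_t x, int16_t y)
{
    __m256i a = _mm256_set1_epi16(x);
    __m256i b = _mm256_set1_epi16(y);
    int16_t buf[16];
    _mm256_storeu_si256((__m256i *)buf, _mm256_mulhrs_epi16(a, b));
    return buf[0];
}
```

For example, 16384 represents 0.5 in Q15, and q15_mul(16384, 16384) gives 8192, which represents 0.25.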

__m256i _mm256_mullo_epi16 (__m256i a, __m256i b)

Multiplies the signed 16-bit integer vectors a and b into 32-bit intermediate products and stores the low 16 bits of each product in dst.

Multiply the packed signed 16-bit integers in a and b, producing intermediate 32-bit integers, and store the low 16 bits of the intermediate integers in dst.

__m256i _mm256_mullo_epi32 (__m256i a, __m256i b)

Multiplies the signed 32-bit integer vectors a and b into 64-bit intermediate products and stores the low 32 bits of each product in dst.

Multiply the packed signed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst.

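__m256i _mm256_sad_epu8 (__m256i a, __m256i b)

Computes the absolute differences of the unsigned 8-bit integer vectors a and b, sums each group of 8 consecutive differences into an unsigned 16-bit value (4 sums in total), and stores each sum in the low 16 bits of the corresponding 64-bit element of dst.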

Compute the absolute differences of packed unsigned 8-bit integers in a and b, then horizontally sum each consecutive 8 differences to produce four unsigned 16-bit integers, and pack these unsigned 16-bit integers in the low 16 bits of 64-bit elements in dst.
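As a usage sketch, the four partial sums can be combined into a single sum of absolute differences over 32 bytes. The name sad32 is hypothetical, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>
#include <stdint.h>

/* Sum of |a[i] - b[i]| over 32 unsigned bytes. */
__attribute__((target("avx2")))
uint32_t sad32(const uint8_t *a, const uint8_t *b)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    uint64_t partial[4];   /* one 16-bit sum in the low bits of each 64-bit element */
    _mm256_storeu_si256((__m256i *)partial, _mm256_sad_epu8(va, vb));
    return (uint32_t)(partial[0] + partial[1] + partial[2] + partial[3]);
}
```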

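__m256i _mm256_sign_epi16 (__m256i a, __m256i b)

Where a lane of the signed 16-bit integer vector b is negative, the corresponding lane of a is negated and stored in dst; where the lane of b is zero, the lane of dst is set to zero; where it is positive, the lane of a is copied unchanged.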

Negate packed signed 16-bit integers in a when the corresponding signed 16-bit integer in b is negative, and store the results in dst. Element in dst are zeroed out when the corresponding element in b is zero.

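__m256i _mm256_sign_epi32 (__m256i a, __m256i b)

Where a lane of the signed 32-bit integer vector b is negative, the corresponding lane of a is negated and stored in dst; where the lane of b is zero, the lane of dst is set to zero; where it is positive, the lane of a is copied unchanged.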

Negate packed signed 32-bit integers in a when the corresponding signed 32-bit integer in b is negative, and store the results in dst. Element in dst are zeroed out when the corresponding element in b is zero.

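__m256i _mm256_sign_epi8 (__m256i a, __m256i b)

Where a lane of the signed 8-bit integer vector b is negative, the corresponding lane of a is negated and stored in dst; where the lane of b is zero, the lane of dst is set to zero; where it is positive, the lane of a is copied unchanged.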

Negate packed signed 8-bit integers in a when the corresponding signed 8-bit integer in b is negative, and store the results in dst. Element in dst are zeroed out when the corresponding element in b is zero.

__m256i _mm256_sub_epi16 (__m256i a, __m256i b)

Subtracts the 16-bit integer vector b from a.

Subtract packed 16-bit integers in b from packed 16-bit integers in a, and store the results in dst.

__m256i _mm256_sub_epi32 (__m256i a, __m256i b)

Subtracts the 32-bit integer vector b from a.

Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst.

__m256i _mm256_sub_epi64 (__m256i a, __m256i b)

Subtracts the 64-bit integer vector b from a.

Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst.

__m256i _mm256_sub_epi8 (__m256i a, __m256i b)

Subtracts the 8-bit integer vector b from a.

Subtract packed 8-bit integers in b from packed 8-bit integers in a, and store the results in dst.

__m256d _mm256_sub_pd (__m256d a, __m256d b)

Subtracts the double-precision (64-bit) floating-point vector b from a.

Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

__m256 _mm256_sub_ps (__m256 a, __m256 b)

Subtracts the single-precision (32-bit) floating-point vector b from a.

Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

__m256i _mm256_subs_epi16 (__m256i a, __m256i b)

Subtracts the signed 16-bit integer vector b from a with saturation.

Subtract packed signed 16-bit integers in b from packed 16-bit integers in a using saturation, and store the results in dst.

__m256i _mm256_subs_epi8 (__m256i a, __m256i b)

Subtracts the signed 8-bit integer vector b from a with saturation.

Subtract packed signed 8-bit integers in b from packed 8-bit integers in a using saturation, and store the results in dst.

__m256i _mm256_subs_epu16 (__m256i a, __m256i b)

Subtracts the unsigned 16-bit integer vector b from a with saturation.

Subtract packed unsigned 16-bit integers in b from packed unsigned 16-bit integers in a using saturation, and store the results in dst.

__m256i _mm256_subs_epu8 (__m256i a, __m256i b)

Subtracts the unsigned 8-bit integer vector b from a with saturation.

Subtract packed unsigned 8-bit integers in b from packed unsigned 8-bit integers in a using saturation, and store the results in dst.

Compare

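__m256d _mm256_cmp_pd (__m256d a, __m256d b, const int imm8)

Compares the double-precision (64-bit) floating-point vectors a and b using the predicate selected by imm8; each lane of dst is all ones (64 bits) when the condition holds and all zeros otherwise.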

Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

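__m256 _mm256_cmp_ps (__m256 a, __m256 b, const int imm8)

Compares the single-precision (32-bit) floating-point vectors a and b using the predicate selected by imm8; each lane of dst is all ones (32 bits) when the condition holds and all zeros otherwise.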

Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.
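Because the comparison result is a per-lane bit mask rather than a boolean, it is usually consumed by another instruction. The sketch below counts how many of 8 floats fall below a threshold. The name count_less_than is hypothetical; _CMP_LT_OQ (ordered, non-signaling less-than) and _mm256_movemask_ps are AVX features not listed in this section, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>

/* Counts the elements of v[0..7] that are strictly less than threshold. */
__attribute__((target("avx")))
int count_less_than(const float *v, float threshold)
{
    __m256 values = _mm256_loadu_ps(v);
    __m256 limit  = _mm256_set1_ps(threshold);
    /* Each lane becomes all ones where values < limit, all zeros otherwise. */
    __m256 mask = _mm256_cmp_ps(values, limit, _CMP_LT_OQ);
    /* Collect the sign bit of every lane into the low 8 bits of an int. */
    int bits = _mm256_movemask_ps(mask);

    int count = 0;
    for (int i = 0; i < 8; ++i)
        count += (bits >> i) & 1;
    return count;
}
```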

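__m256i _mm256_cmpeq_epi16 (__m256i a, __m256i b)

Tests the 16-bit integer vectors a and b for equality; each lane of dst is all ones (16 bits) when the lanes are equal and all zeros otherwise.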

Compare packed 16-bit integers in a and b for equality, and store the results in dst.

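__m256i _mm256_cmpeq_epi32 (__m256i a, __m256i b)

Tests the 32-bit integer vectors a and b for equality; each lane of dst is all ones (32 bits) when the lanes are equal and all zeros otherwise.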

Compare packed 32-bit integers in a and b for equality, and store the results in dst.

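__m256i _mm256_cmpeq_epi64 (__m256i a, __m256i b)

Tests the 64-bit integer vectors a and b for equality; each lane of dst is all ones (64 bits) when the lanes are equal and all zeros otherwise.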

Compare packed 64-bit integers in a and b for equality, and store the results in dst.

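__m256i _mm256_cmpeq_epi8 (__m256i a, __m256i b)

Tests the 8-bit integer vectors a and b for equality; each lane of dst is all ones (8 bits) when the lanes are equal and all zeros otherwise.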

Compare packed 8-bit integers in a and b for equality, and store the results in dst.

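__m256i _mm256_cmpgt_epi16 (__m256i a, __m256i b)

Tests whether each lane of the signed 16-bit integer vector a is strictly greater than the corresponding lane of b; each lane of dst is all ones (16 bits) when it is and all zeros otherwise.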

Compare packed signed 16-bit integers in a and b for greater-than, and store the results in dst.

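__m256i _mm256_cmpgt_epi32 (__m256i a, __m256i b)

Tests whether each lane of the signed 32-bit integer vector a is strictly greater than the corresponding lane of b; each lane of dst is all ones (32 bits) when it is and all zeros otherwise.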

Compare packed signed 32-bit integers in a and b for greater-than, and store the results in dst.

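__m256i _mm256_cmpgt_epi64 (__m256i a, __m256i b)

Tests whether each lane of the signed 64-bit integer vector a is strictly greater than the corresponding lane of b; each lane of dst is all ones (64 bits) when it is and all zeros otherwise.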

Compare packed signed 64-bit integers in a and b for greater-than, and store the results in dst.

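__m256i _mm256_cmpgt_epi8 (__m256i a, __m256i b)

Tests whether each lane of the signed 8-bit integer vector a is strictly greater than the corresponding lane of b; each lane of dst is all ones (8 bits) when it is and all zeros otherwise.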

Compare packed signed 8-bit integers in a and b for greater-than, and store the results in dst.

Convert

__m256i _mm256_cvtepi16_epi32 (__m128i a)

Sign-extends the signed 16-bit integer vector a to the signed 32-bit integer vector dst.

Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst.

__m256i _mm256_cvtepi16_epi64 (__m128i a)

Sign-extends the low 4 lanes of the signed 16-bit integer vector a to the signed 64-bit integer vector dst.

Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst.

__m256i _mm256_cvtepi32_epi64 (__m128i a)

Sign-extends the signed 32-bit integer vector a to the signed 64-bit integer vector dst.

Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst.

__m256d _mm256_cvtepi32_pd (__m128i a)

Converts the signed 32-bit integer vector a to the double-precision (64-bit) floating-point vector dst.

Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

__m256 _mm256_cvtepi32_ps (__m256i a)

Converts the signed 32-bit integer vector a to the single-precision (32-bit) floating-point vector dst.

Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

__m256i _mm256_cvtepi8_epi16 (__m128i a)

Sign-extends the signed 8-bit integer vector a to the signed 16-bit integer vector dst.

Sign extend packed 8-bit integers in a to packed 16-bit integers, and store the results in dst.

__m256i _mm256_cvtepi8_epi32 (__m128i a)

Sign-extends the low 8 lanes of the signed 8-bit integer vector a to the signed 32-bit integer vector dst.

Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst.

__m256i _mm256_cvtepi8_epi64 (__m128i a)

Sign-extends the low 4 lanes of the signed 8-bit integer vector a to the signed 64-bit integer vector dst.

Sign extend packed 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst.

__m256i _mm256_cvtepu16_epi32 (__m128i a)

Zero-extends the unsigned 16-bit integer vector a to the 32-bit integer vector dst.

Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst.

__m256i _mm256_cvtepu16_epi64 (__m128i a)

Zero-extends the low 4 lanes of the unsigned 16-bit integer vector a to the 64-bit integer vector dst.

Zero extend packed unsigned 16-bit integers in a to packed 64-bit integers, and store the results in dst.

__m256i _mm256_cvtepu32_epi64 (__m128i a)

Zero-extends the unsigned 32-bit integer vector a to the 64-bit integer vector dst.

Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst.

__m256i _mm256_cvtepu8_epi16 (__m128i a)

Zero-extends the 16 lanes of the unsigned 8-bit integer vector a to the 16-bit integer vector dst.

Zero extend packed unsigned 8-bit integers in a to packed 16-bit integers, and store the results in dst.

__m256i _mm256_cvtepu8_epi32 (__m128i a)

Zero-extends the low 8 lanes of the unsigned 8-bit integer vector a to the 32-bit integer vector dst.

Zero extend packed unsigned 8-bit integers in a to packed 32-bit integers, and store the results in dst.

__m256i _mm256_cvtepu8_epi64 (__m128i a)

Zero-extends the low 4 lanes of the unsigned 8-bit integer vector a to the 64-bit integer vector dst.

Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst.
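The difference between the cvtepi (sign-extending) and cvtepu (zero-extending) families only shows up for bytes with the top bit set. A minimal sketch, with the hypothetical helper name widen_byte and a GCC/Clang-specific target attribute as assumptions:

```c
#include <immintrin.h>
#include <stdint.h>

/* Widens the same byte to 16 bits twice: once treating it as signed
   (_mm256_cvtepi8_epi16) and once as unsigned (_mm256_cvtepu8_epi16). */
__attribute__((target("avx2")))
void widen_byte(uint8_t byte, int16_t *sign_ext, int16_t *zero_ext)
{
    __m128i src = _mm_set1_epi8((char)byte);
    int16_t buf[16];

    _mm256_storeu_si256((__m256i *)buf, _mm256_cvtepi8_epi16(src));
    *sign_ext = buf[0];

    _mm256_storeu_si256((__m256i *)buf, _mm256_cvtepu8_epi16(src));
    *zero_ext = buf[0];
}
```

The byte 0xFF widens to -1 under sign extension and to 255 under zero extension; a byte such as 0x7F gives 127 in both cases.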

__m128i _mm256_cvtpd_epi32 (__m256d a)

Converts the double-precision (64-bit) floating-point vector a to the 32-bit integer vector dst.

Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

__m128 _mm256_cvtpd_ps (__m256d a)

Converts the double-precision (64-bit) floating-point vector a to the single-precision (32-bit) floating-point vector dst.

Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

__m256i _mm256_cvtps_epi32 (__m256 a)

Converts the single-precision (32-bit) floating-point vector a to the 32-bit integer vector dst.

Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

__m256d _mm256_cvtps_pd (__m128 a)

Converts the single-precision (32-bit) floating-point vector a to the double-precision (64-bit) floating-point vector dst.

Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

double _mm256_cvtsd_f64 (__m256d a)

Copies the lowest lane of the double-precision (64-bit) floating-point vector a to dst.

Copy the lower double-precision (64-bit) floating-point element of a to dst.

int _mm256_cvtsi256_si32 (__m256i a)

Copies the lowest lane of the 32-bit integer vector a to dst.

Copy the lower 32-bit integer in a to dst.

float _mm256_cvtss_f32 (__m256 a)

Copies the lowest lane of the single-precision (32-bit) floating-point vector a to dst.

Copy the lower single-precision (32-bit) floating-point element of a to dst.

__m128i _mm256_cvttpd_epi32 (__m256d a)

Converts the double-precision (64-bit) floating-point vector a to the 32-bit integer vector dst with truncation.

Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

__m256i _mm256_cvttps_epi32 (__m256 a)

Converts the single-precision (32-bit) floating-point vector a to the 32-bit integer vector dst with truncation.

Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.
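To contrast the cvt and cvtt forms, the sketch below converts the same value both ways. The name round_vs_trunc is hypothetical; the non-truncating form follows the current rounding mode, which this sketch assumes is the default round-to-nearest. The target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>
#include <stdint.h>

/* Converts x to an integer with _mm256_cvtps_epi32 (current rounding mode)
   and with _mm256_cvttps_epi32 (truncation toward zero). */
__attribute__((target("avx")))
void round_vs_trunc(float x, int32_t *rounded, int32_t *truncated)
{
    __m256 v = _mm256_set1_ps(x);
    int32_t buf[8];

    _mm256_storeu_si256((__m256i *)buf, _mm256_cvtps_epi32(v));
    *rounded = buf[0];

    _mm256_storeu_si256((__m256i *)buf, _mm256_cvttps_epi32(v));
    *truncated = buf[0];
}
```

Under the default rounding mode, 2.7 converts to 3 with cvtps and to 2 with cvttps, and -2.7 converts to -3 and -2 respectively.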

Elementary Math Functions

__m256 _mm256_rcp_ps (__m256 a)

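Computes an approximate reciprocal of the single-precision (32-bit) floating-point vector a; the maximum relative error is less than 1.5*2^-12.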

Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

__m256 _mm256_rsqrt_ps (__m256 a)

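Computes an approximate reciprocal square root of the single-precision (32-bit) floating-point vector a; the maximum relative error is less than 1.5*2^-12.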

Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.
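The error bound means the result should be treated as an estimate, not an exact value. A minimal sketch, with the hypothetical helper name rsqrt_approx and a GCC/Clang-specific target attribute as assumptions:

```c
#include <immintrin.h>

/* Returns the hardware estimate of 1/sqrt(x) taken from lane 0
   of _mm256_rsqrt_ps. The result is approximate by design. */
__attribute__((target("avx")))
float rsqrt_approx(float x)
{
    float buf[8];
    _mm256_storeu_ps(buf, _mm256_rsqrt_ps(_mm256_set1_ps(x)));
    return buf[0];
}
```

For x = 4 the exact answer is 0.5, and the stated relative error bound of 1.5*2^-12 (about 0.00037) places the estimate within roughly 0.0002 of it, so comparisons should use a tolerance instead of equality.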

__m256d _mm256_sqrt_pd (__m256d a)

Computes the square root of the double-precision (64-bit) floating-point vector a.

Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

__m256 _mm256_sqrt_ps (__m256 a)

Computes the square root of the single-precision (32-bit) floating-point vector a.

Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

General Support

__m256d _mm256_undefined_pd (void)

Returns a __m256d value with undefined contents.

Return vector of type __m256d with undefined elements.

__m256 _mm256_undefined_ps (void)

Returns a __m256 value with undefined contents.

Return vector of type __m256 with undefined elements.

__m256i _mm256_undefined_si256 (void)

Returns a __m256i value with undefined contents.

Return vector of type __m256i with undefined elements.

void _mm256_zeroall (void)

Zeroes all XMM or YMM registers.

Zero the contents of all XMM or YMM registers.

void _mm256_zeroupper (void)

Zeroes the upper 128 bits of all YMM registers; the lower 128 bits are left unchanged.

Zero the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.

Load

__m256d _mm256_broadcast_pd (__m128d const * mem_addr)

Reads 128 bits (2 double-precision (64-bit) floating-point values) from memory and broadcasts them to both 128-bit halves of dst.

Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of dst.

__m256 _mm256_broadcast_ps (__m128 const * mem_addr)

Reads 128 bits (4 single-precision (32-bit) floating-point values) from memory and broadcasts them to both 128-bit halves of dst.

Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of dst.

__m256d _mm256_broadcast_sd (double const * mem_addr)

Reads one double-precision (64-bit) floating-point value from memory and broadcasts it to every lane of dst.

Broadcast a double-precision (64-bit) floating-point element from memory to all elements of dst.

__m256 _mm256_broadcast_ss (float const * mem_addr)

Reads one single-precision (32-bit) floating-point value from memory and broadcasts it to every lane of dst.

Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

__m256i _mm256_i32gather_epi32 (int const* base_addr, __m256i vindex, const int scale)

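Gathers a vector of 32-bit integers from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.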

Gather 32-bit integers from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
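A gather is effectively a vectorized table lookup. The sketch below reads 8 table entries selected by 8 indices; the name gather8 is hypothetical, scale is set to 4 because the indices count int elements (assumed to be 4 bytes), and the target attribute is a GCC/Clang-specific assumption. All indices are assumed to be in range for the table.

```c
#include <immintrin.h>

/* out[i] = table[indices[i]] for i in 0..7. */
__attribute__((target("avx2")))
void gather8(const int *table, const int *indices, int *out)
{
    __m256i vindex = _mm256_loadu_si256((const __m256i *)indices);
    /* scale = 4: each index is multiplied by 4 bytes, the assumed sizeof(int). */
    __m256i gathered = _mm256_i32gather_epi32(table, vindex, 4);
    _mm256_storeu_si256((__m256i *)out, gathered);
}
```

Note that scale must be a compile-time constant, so it cannot be passed in as an ordinary run-time argument.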

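__m256i _mm256_mask_i32gather_epi32 (__m256i src, int const* base_addr, __m256i vindex, __m256i mask, const int scale)

Gathers a vector of 32-bit integers from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 32-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.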

__m256i _mm256_i32gather_epi64 (__int64 const* base_addr, __m128i vindex, const int scale)

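Gathers a vector of 64-bit integers from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.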

Gather 64-bit integers from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

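__m256i _mm256_mask_i32gather_epi64 (__m256i src, __int64 const* base_addr, __m128i vindex, __m256i mask, const int scale)

Gathers a vector of 64-bit integers from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 64-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.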

__m256d _mm256_i32gather_pd (double const* base_addr, __m128i vindex, const int scale)

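Gathers a vector of double-precision (64-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.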

Gather double-precision (64-bit) floating-point elements from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

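__m256d _mm256_mask_i32gather_pd (__m256d src, double const* base_addr, __m128i vindex, __m256d mask, const int scale)

Gathers a vector of double-precision (64-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 64-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.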

__m256 _mm256_i32gather_ps (float const* base_addr, __m256i vindex, const int scale)

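Gathers a vector of single-precision (32-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.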

Gather single-precision (32-bit) floating-point elements from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

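__m256 _mm256_mask_i32gather_ps (__m256 src, float const* base_addr, __m256i vindex, __m256 mask, const int scale)

Gathers a vector of single-precision (32-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 32-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 32-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.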

__m128i _mm256_i64gather_epi32 (int const* base_addr, __m256i vindex, const int scale)

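Gathers a vector of 32-bit integers from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.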

Gather 32-bit integers from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

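__m128i _mm256_mask_i64gather_epi32 (__m128i src, int const* base_addr, __m256i vindex, __m128i mask, const int scale)

Gathers a vector of 32-bit integers from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 32-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.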

__m256i _mm256_i64gather_epi64 (__int64 const* base_addr, __m256i vindex, const int scale)

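Gathers a vector of 64-bit integers from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.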

Gather 64-bit integers from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

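__m256i _mm256_mask_i64gather_epi64 (__m256i src, __int64 const* base_addr, __m256i vindex, __m256i mask, const int scale)

Gathers a vector of 64-bit integers from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 64-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.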

__m256d _mm256_i64gather_pd (double const* base_addr, __m256i vindex, const int scale)

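Gathers a vector of double-precision (64-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.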

Gather double-precision (64-bit) floating-point elements from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

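__m256d _mm256_mask_i64gather_pd (__m256d src, double const* base_addr, __m256i vindex, __m256d mask, const int scale)

Gathers a vector of double-precision (64-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 64-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.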

__m128 _mm256_i64gather_ps (float const* base_addr, __m256i vindex, const int scale)

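Gathers a vector of single-precision (32-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.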

Gather single-precision (32-bit) floating-point elements from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.

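__m128 _mm256_mask_i64gather_ps (__m128 src, float const* base_addr, __m256i vindex, __m128 mask, const int scale)

Gathers a vector of single-precision (32-bit) floating-point values from memory. Each element is read from base_addr plus the corresponding 64-bit index in vindex multiplied by scale, in bytes. scale must be 1, 2, 4, or 8.
Where the highest bit of the corresponding 32-bit lane of mask is 1, dst takes the gathered value; otherwise it takes the corresponding lane of src.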

__m256i _mm256_lddqu_si256 (__m256i const * mem_addr)

Loads 256 bits of integer data from unaligned memory. When the data crosses a cache line, this intrinsic may perform better than _mm256_loadu_si256.

Load 256-bits of integer data from unaligned memory into dst. This intrinsic may perform better than _mm256_loadu_si256 when the data crosses a cache line boundary.

__m256d _mm256_load_pd (double const * mem_addr)

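Loads 256 bits (4 double-precision (64-bit) floating-point values) from memory. mem_addr must be 32-byte aligned; otherwise a general-protection exception may be raised.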

Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into dst. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

__m256 _mm256_load_ps (float const * mem_addr)

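Loads 256 bits (8 single-precision (32-bit) floating-point values) from memory. mem_addr must be 32-byte aligned; otherwise a general-protection exception may be raised.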

Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into dst. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

__m256i _mm256_load_si256 (__m256i const * mem_addr)

Loads 256 bits of integer data from memory. mem_addr must be 32-byte aligned; otherwise a general-protection exception may be raised.

Load 256-bits of integer data from memory into dst. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

__m256d _mm256_loadu_pd (double const * mem_addr)

Loads 256 bits (4 double-precision (64-bit) floating-point values) from memory. mem_addr does not need to be aligned.

Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into dst. mem_addr does not need to be aligned on any particular boundary.

__m256 _mm256_loadu_ps (float const * mem_addr)

Loads 256 bits (8 single-precision (32-bit) floating-point values) from memory. mem_addr does not need to be aligned.

Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into dst. mem_addr does not need to be aligned on any particular boundary.

__m256i _mm256_loadu_si256 (__m256i const * mem_addr)

Loads 256 bits of integer data from memory. mem_addr does not need to be aligned.

Load 256-bits of integer data from memory into dst. mem_addr does not need to be aligned on any particular boundary.

__m256 _mm256_loadu2_m128 (float const* hiaddr, float const* loaddr)

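Loads two 128-bit values (each made up of 4 single-precision (32-bit) floating-point values) from memory and joins them into one 256-bit value. hiaddr and loaddr do not need to be aligned.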

Load two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

__m256d _mm256_loadu2_m128d (double const* hiaddr, double const* loaddr)

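Loads two 128-bit values (each made up of 2 double-precision (64-bit) floating-point values) from memory and joins them into one 256-bit value. hiaddr and loaddr do not need to be aligned.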

Load two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

__m256i _mm256_loadu2_m128i (__m128i const* hiaddr, __m128i const* loaddr)

Loads two 128-bit values (each made up of integer data) from memory and joins them into one 256-bit value. hiaddr and loaddr do not need to be aligned.

Load two 128-bit values (composed of integer data) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

__m256i _mm256_maskload_epi32 (int const* mem_addr, __m256i mask)

Loads a vector of 32-bit integers from memory; where the highest bit of the corresponding lane of mask is 0, the lane of dst is set to zero.

Load packed 32-bit integers from memory into dst using mask (elements are zeroed out when the highest bit is not set in the corresponding element).

__m256i _mm256_maskload_epi64 (__int64 const* mem_addr, __m256i mask)

Loads a vector of 64-bit integers from memory; where the highest bit of the corresponding lane of mask is 0, the lane of dst is set to zero.

Load packed 64-bit integers from memory into dst using mask (elements are zeroed out when the highest bit is not set in the corresponding element).

__m256d _mm256_maskload_pd (double const * mem_addr, __m256i mask)

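Loads a vector of double-precision (64-bit) floating-point values from memory; where the highest bit of the corresponding lane of mask is 0, the lane of dst is set to zero.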

Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

__m256 _mm256_maskload_ps (float const * mem_addr, __m256i mask)

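Loads a vector of single-precision (32-bit) floating-point values from memory; where the highest bit of the corresponding lane of mask is 0, the lane of dst is set to zero.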

Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).
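Masked loads are handy when only the first n lanes should be read. The sketch below builds the mask by comparing lane numbers against n; the name sum_first_n is hypothetical, the mask construction uses AVX2 integer intrinsics, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>

/* Sums the first n floats (0 <= n <= 8) at p, loading them with a mask
   so that lanes at index n and above read as zero. */
__attribute__((target("avx2")))
float sum_first_n(const float *p, int n)
{
    __m256i lane = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
    /* Lane i is all ones (highest bit set) when n > i, else all zeros. */
    __m256i mask = _mm256_cmpgt_epi32(_mm256_set1_epi32(n), lane);
    __m256 v = _mm256_maskload_ps(p, mask);

    float buf[8];
    _mm256_storeu_ps(buf, v);
    float sum = 0.0f;
    for (int i = 0; i < 8; ++i)
        sum += buf[i];
    return sum;
}
```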

__m256i _mm256_stream_load_si256 (__m256i const* mem_addr)

Loads 256 bits of data from memory using a non-temporal memory hint. mem_addr must be 32-byte aligned; otherwise a general-protection exception may be raised.

Load 256-bits of integer data from memory into dst using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

Logical

__m256d _mm256_and_pd (__m256d a, __m256d b)

Bitwise AND of the double-precision (64-bit) floating-point vectors a and b.

Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

__m256 _mm256_and_ps (__m256 a, __m256 b)

Bitwise AND of the single-precision (32-bit) floating-point vectors a and b.

Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

__m256i _mm256_and_si256 (__m256i a, __m256i b)

Bitwise AND of the 256-bit vectors a and b.

Compute the bitwise AND of 256 bits (representing integer data) in a and b, and store the result in dst.

__m256d _mm256_andnot_pd (__m256d a, __m256d b)

Computes the bitwise NOT of the double-precision (64-bit) floating-point vector a, then ANDs the result with b.

Compute the bitwise NOT of packed double-precision (64-bit) floating-point elements in a and then AND with b, and store the results in dst.

__m256 _mm256_andnot_ps (__m256 a, __m256 b)

Computes the bitwise NOT of the single-precision (32-bit) floating-point vector a, then ANDs the result with b.

Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b, and store the results in dst.

__m256i _mm256_andnot_si256 (__m256i a, __m256i b)

Computes the bitwise NOT of the 256-bit vector a, then ANDs the result with b.

Compute the bitwise NOT of 256 bits (representing integer data) in a and then AND with b, and store the result in dst.
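The operand order of andnot matters: it is the first argument that gets inverted. One illustrative use is clearing the sign bit of floats to get absolute values. The name abs8 is hypothetical, and the target attribute is a GCC/Clang-specific assumption.

```c
#include <immintrin.h>

/* out[i] = |in[i]| for 8 floats, computed as (~sign_mask) & in. */
__attribute__((target("avx")))
void abs8(const float *in, float *out)
{
    /* -0.0f has only the sign bit set in each 32-bit lane. */
    __m256 sign_mask = _mm256_set1_ps(-0.0f);
    __m256 v = _mm256_loadu_ps(in);
    /* The FIRST operand is inverted, so the mask goes first. */
    _mm256_storeu_ps(out, _mm256_andnot_ps(sign_mask, v));
}
```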

__m256d _mm256_or_pd (__m256d a, __m256d b)

Bitwise OR of the double-precision (64-bit) floating-point vectors a and b.

Compute the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

__m256 _mm256_or_ps (__m256 a, __m256 b)

Bitwise OR of the single-precision (32-bit) floating-point vectors a and b.

Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

__m256i _mm256_or_si256 (__m256i a, __m256i b)

Bitwise OR of the 256-bit vectors a and b.

Compute the bitwise OR of 256 bits (representing integer data) in a and b, and store the result in dst.

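int _mm256_testc_pd (__m256d a, __m256d b)

First computes the bitwise AND of the double-precision (64-bit) floating-point vectors a and b; ZF is set to 1 if the sign bits of all 4 lanes of the result are 0, otherwise ZF is 0. Then computes the bitwise NOT of a ANDed with b; CF is set to 1 if the sign bits of all 4 lanes of that result are 0, otherwise CF is 0. Returns CF.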

Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

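int _mm256_testc_ps (__m256 a, __m256 b)

First compute the bitwise AND of the 32-bit single-precision floating-point vectors a and b; if the sign bits of all 8 elements of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if the sign bits of all 8 elements of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return CF.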

Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

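int _mm256_testc_si256 (__m256i a, __m256i b)

First compute the bitwise AND of the 256-bit values a and b; if all bits of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if all bits of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return CF.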

Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the CF value.

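int _mm256_testnzc_pd (__m256d a, __m256d b)

First compute the bitwise AND of the 64-bit double-precision floating-point vectors a and b; if the sign bits of all 4 elements of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if the sign bits of all 4 elements of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return 1 if both ZF and CF are 0, otherwise return 0.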

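int _mm256_testnzc_ps (__m256 a, __m256 b)

First compute the bitwise AND of the 32-bit single-precision floating-point vectors a and b; if the sign bits of all 8 elements of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if the sign bits of all 8 elements of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return 1 if both ZF and CF are 0, otherwise return 0.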

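int _mm256_testnzc_si256 (__m256i a, __m256i b)

First compute the bitwise AND of the 256-bit vectors a and b; if all bits of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if all bits of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return 1 if both ZF and CF are 0, otherwise return 0.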

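int _mm256_testz_pd (__m256d a, __m256d b)

First compute the bitwise AND of the 64-bit double-precision floating-point vectors a and b; if the sign bits of all 4 elements of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if the sign bits of all 4 elements of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return ZF.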

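int _mm256_testz_ps (__m256 a, __m256 b)

First compute the bitwise AND of the 32-bit single-precision floating-point vectors a and b; if the sign bits of all 8 elements of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if the sign bits of all 8 elements of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return ZF.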

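int _mm256_testz_si256 (__m256i a, __m256i b)

First compute the bitwise AND of the 256-bit vectors a and b; if all bits of this intermediate result are 0, set ZF to 1, otherwise set ZF to 0. Then compute the bitwise NOT of a and AND it with b; if all bits of that intermediate result are 0, set CF to 1, otherwise set CF to 0. Return ZF.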
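The ZF and CF rules of the test family are easier to remember through two typical uses: testz(v, v) answers "is v entirely zero?", and testc(a, b) answers "is every bit set in b also set in a?". The sketch below illustrates both, assuming GCC or Clang on an AVX-capable CPU; the helper names is_all_zero and mask_covers and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: returns 1 when all 32 bytes at p are zero. */
AVX2_FN
int is_all_zero(const uint8_t *p)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)p);
    /* testz returns ZF, which is 1 exactly when (v & v) == 0. */
    return _mm256_testz_si256(v, v);
}

/* Hypothetical helper: returns 1 when every bit set in b is also set in a. */
AVX2_FN
int mask_covers(const uint8_t *a, const uint8_t *b)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    /* testc returns CF, which is 1 exactly when ((~a) & b) == 0. */
    return _mm256_testc_si256(va, vb);
}
```

Both helpers read 32 bytes through an unaligned load, so the input buffers need no particular alignment.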

__m256d _mm256_xor_pd (__m256d a, __m256d b)

Bitwise XOR of the 64-bit double-precision floating-point vectors a and b.

Compute the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

__m256 _mm256_xor_ps (__m256 a, __m256 b)

Bitwise XOR of the 32-bit single-precision floating-point vectors a and b.

Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

__m256i _mm256_xor_si256 (__m256i a, __m256i b)

Bitwise XOR of the 256-bit vectors a and b.

Compute the bitwise XOR of 256 bits (representing integer data) in a and b, and store the result in dst.

Miscellaneous

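__m256i _mm256_alignr_epi8 (__m256i a, __m256i b, const int imm8)

Within each 128-bit lane, concatenate the 16-byte blocks of a and b into a 32-byte temporary, shift it right by imm8 bytes, and keep the low 16 bytes.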

Concatenate pairs of 16-byte blocks in a and b into a 32-byte temporary result, shift the result right by imm8 bytes, and store the low 16 bytes in dst.

int _mm256_movemask_epi8 (__m256i a)

Collect the most significant bit of each 8-bit element of a into a 32-bit mask dst.

Create mask from the most significant bit of each 8-bit element in a, and store the result in dst.

int _mm256_movemask_pd (__m256d a)

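Collect the most significant (sign) bit of each 64-bit double-precision floating-point element of a into the low 4 bits of the integer dst.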

Set each bit of mask dst based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in a.

int _mm256_movemask_ps (__m256 a)

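Collect the most significant (sign) bit of each 32-bit single-precision floating-point element of a into the low 8 bits of the integer dst.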

Set each bit of mask dst based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.
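A common use of the movemask intrinsics is turning a vector comparison into an ordinary integer that scalar code can inspect. The sketch below searches 32 bytes for a value; it assumes GCC or Clang on an AVX2-capable CPU and also uses _mm256_cmpeq_epi8 and _mm256_set1_epi8 from the same header. The helper name find_byte32 and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: index of the first byte equal to needle among the
   32 bytes at p, or -1 when there is none. */
AVX2_FN
int find_byte32(const uint8_t *p, uint8_t needle)
{
    __m256i v  = _mm256_loadu_si256((const __m256i *)p);
    /* Each byte of eq becomes 0xFF where p[i] == needle, else 0x00. */
    __m256i eq = _mm256_cmpeq_epi8(v, _mm256_set1_epi8((char)needle));
    /* Bit i of the mask is the top bit of byte i of eq. */
    unsigned mask = (unsigned)_mm256_movemask_epi8(eq);
    if (mask == 0)
        return -1;
    int idx = 0;
    while ((mask & 1u) == 0) {   /* portable scan for the lowest set bit */
        mask >>= 1;
        ++idx;
    }
    return idx;
}
```

In real code the final loop is usually replaced by a count-trailing-zeros builtin; it is written out here only to avoid a compiler-specific call.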

__m256i _mm256_mpsadbw_epu8 (__m256i a, __m256i b, const int imm8)

Compute the sum of absolute differences (SADs) of quadruplets of unsigned 8-bit integers in a compared to those in b, and store the 16-bit results in dst. Eight SADs are performed for each 128-bit lane using one quadruplet from b and eight quadruplets from a. One quadruplet is selected from b starting at the offset specified in imm8. Eight quadruplets are formed from sequential 8-bit integers selected from a starting at the offset specified in imm8.

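__m256i _mm256_packs_epi16 (__m256i a, __m256i b)

Using signed saturation, narrow the signed 16-bit integers of a and b to 8-bit integers and store them in dst, with the results from a and b interleaved per 128-bit lane.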

Convert packed signed 16-bit integers from a and b to packed 8-bit integers using signed saturation, and store the results in dst.

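__m256i _mm256_packs_epi32 (__m256i a, __m256i b)

Using signed saturation, narrow the signed 32-bit integers of a and b to 16-bit integers and store them in dst, with the results from a and b interleaved per 128-bit lane.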

Convert packed signed 32-bit integers from a and b to packed 16-bit integers using signed saturation, and store the results in dst.

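__m256i _mm256_packus_epi16 (__m256i a, __m256i b)

Using unsigned saturation, narrow the signed 16-bit integers of a and b to unsigned 8-bit integers and store them in dst, with the results from a and b interleaved per 128-bit lane.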

Convert packed signed 16-bit integers from a and b to packed 8-bit integers using unsigned saturation, and store the results in dst.

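__m256i _mm256_packus_epi32 (__m256i a, __m256i b)

Using unsigned saturation, narrow the signed 32-bit integers of a and b to unsigned 16-bit integers and store them in dst, with the results from a and b interleaved per 128-bit lane.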

Convert packed signed 32-bit integers from a and b to packed 16-bit integers using unsigned saturation, and store the results in dst.
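The pack intrinsics work on each 128-bit lane separately, so the output is not simply "all of a followed by all of b". The sketch below makes the resulting element order and the saturation visible; it assumes GCC or Clang on an AVX2-capable CPU, and the helper name pack_i16_to_i8 and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: narrows 16 + 16 signed 16-bit values to 32 signed
   8-bit values with signed saturation. Output order:
     out[ 0.. 7] = a[0..7]    out[ 8..15] = b[0..7]
     out[16..23] = a[8..15]   out[24..31] = b[8..15] */
AVX2_FN
void pack_i16_to_i8(const int16_t *a, const int16_t *b, int8_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i r  = _mm256_packs_epi16(va, vb);
    _mm256_storeu_si256((__m256i *)out, r);
}
```

When a plain a-then-b order is needed, the 64-bit blocks of the result are usually rearranged afterwards with a cross-lane permute such as _mm256_permute4x64_epi64.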

Move

__m256d _mm256_movedup_pd (__m256d a)

Copy each even-indexed element of the 64-bit double-precision floating-point vector into the adjacent odd-indexed element.

Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst.

__m256 _mm256_movehdup_ps (__m256 a)

Copy each odd-indexed element of the 32-bit single-precision floating-point vector into the adjacent even-indexed element.

Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

__m256 _mm256_moveldup_ps (__m256 a)

Copy each even-indexed element of the 32-bit single-precision floating-point vector into the adjacent odd-indexed element.

Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

Probability/Statistics

__m256i _mm256_avg_epu16 (__m256i a, __m256i b)

Average of the unsigned 16-bit integer vectors a and b.

Average packed unsigned 16-bit integers in a and b, and store the results in dst.

__m256i _mm256_avg_epu8 (__m256i a, __m256i b)

Average of the unsigned 8-bit integer vectors a and b.

Average packed unsigned 8-bit integers in a and b, and store the results in dst.

Set

__m256i _mm256_set_epi16 (short e15, short e14, short e13, short e12, short e11, short e10, short e9, short e8, short e7, short e6, short e5, short e4, short e3, short e2, short e1, short e0)

Set each element of a 16-bit integer vector from the supplied values.

Set packed 16-bit integers in dst with the supplied values.

__m256i _mm256_set_epi32 (int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)

Set each element of a 32-bit integer vector from the supplied values.

Set packed 32-bit integers in dst with the supplied values.

__m256i _mm256_set_epi64x (__int64 e3, __int64 e2, __int64 e1, __int64 e0)

Set each element of a 64-bit integer vector from the supplied values.

Set packed 64-bit integers in dst with the supplied values.

__m256i _mm256_set_epi8 (char e31, char e30, char e29, char e28, char e27, char e26, char e25, char e24, char e23, char e22, char e21, char e20, char e19, char e18, char e17, char e16, char e15, char e14, char e13, char e12, char e11, char e10, char e9, char e8, char e7, char e6, char e5, char e4, char e3, char e2, char e1, char e0)

Set each element of an 8-bit integer vector from the supplied values.

Set packed 8-bit integers in dst with the supplied values.

__m256 _mm256_set_m128 (__m128 hi, __m128 lo)

Build a __m256 from two __m128 values.

Set packed __m256 vector dst with the supplied values.

__m256d _mm256_set_m128d (__m128d hi, __m128d lo)

Build a __m256d from two __m128d values.

Set packed __m256d vector dst with the supplied values.

__m256i _mm256_set_m128i (__m128i hi, __m128i lo)

Build a __m256i from two __m128i values.

Set packed __m256i vector dst with the supplied values.

__m256d _mm256_set_pd (double e3, double e2, double e1, double e0)

Set each element of a 64-bit double-precision floating-point vector from the supplied values.

Set packed double-precision (64-bit) floating-point elements in dst with the supplied values.

__m256 _mm256_set_ps (float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0)

Set each element of a 32-bit single-precision floating-point vector from the supplied values.

Set packed single-precision (32-bit) floating-point elements in dst with the supplied values.

__m256i _mm256_set1_epi16 (short a)

Broadcast one 16-bit integer to every element of the vector.

Broadcast 16-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastw.

__m256i _mm256_set1_epi32 (int a)

Broadcast one 32-bit integer to every element of the vector.

Broadcast 32-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastd.

__m256i _mm256_set1_epi64x (long long a)

Broadcast one 64-bit integer to every element of the vector.

Broadcast 64-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastq.

__m256i _mm256_set1_epi8 (char a)

Broadcast one 8-bit integer to every element of the vector.

Broadcast 8-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastb.

__m256d _mm256_set1_pd (double a)

Broadcast one 64-bit double-precision floating-point value to every element of the vector.

Broadcast double-precision (64-bit) floating-point value a to all elements of dst.

__m256 _mm256_set1_ps (float a)

Broadcast one 32-bit single-precision floating-point value to every element of the vector.

Broadcast single-precision (32-bit) floating-point value a to all elements of dst.

__m256i _mm256_setr_epi16 (short e15, short e14, short e13, short e12, short e11, short e10, short e9, short e8, short e7, short e6, short e5, short e4, short e3, short e2, short e1, short e0)

Set each element of a 16-bit integer vector from the supplied values in reverse order.

Set packed 16-bit integers in dst with the supplied values in reverse order.

__m256i _mm256_setr_epi32 (int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)

Set each element of a 32-bit integer vector from the supplied values in reverse order.

Set packed 32-bit integers in dst with the supplied values in reverse order.

__m256i _mm256_setr_epi64x (__int64 e3, __int64 e2, __int64 e1, __int64 e0)

Set each element of a 64-bit integer vector from the supplied values in reverse order.

Set packed 64-bit integers in dst with the supplied values in reverse order.

__m256i _mm256_setr_epi8 (char e31, char e30, char e29, char e28, char e27, char e26, char e25, char e24, char e23, char e22, char e21, char e20, char e19, char e18, char e17, char e16, char e15, char e14, char e13, char e12, char e11, char e10, char e9, char e8, char e7, char e6, char e5, char e4, char e3, char e2, char e1, char e0)

Set each element of an 8-bit integer vector from the supplied values in reverse order.

Set packed 8-bit integers in dst with the supplied values in reverse order.

__m256 _mm256_setr_m128 (__m128 lo, __m128 hi)

Build a __m256 from two __m128 values given in reverse order (low half first).

Set packed __m256 vector dst with the supplied values.

__m256d _mm256_setr_m128d (__m128d lo, __m128d hi)

Build a __m256d from two __m128d values given in reverse order (low half first).

Set packed __m256d vector dst with the supplied values.

__m256i _mm256_setr_m128i (__m128i lo, __m128i hi)

Build a __m256i from two __m128i values given in reverse order (low half first).

Set packed __m256i vector dst with the supplied values.

__m256d _mm256_setr_pd (double e3, double e2, double e1, double e0)

Set each element of a 64-bit double-precision floating-point vector from the supplied values in reverse order.

Set packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.

__m256 _mm256_setr_ps (float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0)

Set each element of a 32-bit single-precision floating-point vector from the supplied values in reverse order.

Set packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.
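The difference between the set and setr families is only the argument order: set takes the highest element first, while setr takes element 0 first (memory order). The sketch below stores both results to memory so the order can be inspected; it assumes GCC or Clang on an AVX-capable CPU, and the helper name set_orders and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: writes 8 ints to each output array. */
AVX2_FN
void set_orders(int *via_set, int *via_setr)
{
    /* set: the LAST argument lands in element 0 (lowest address). */
    __m256i s = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
    /* setr: the FIRST argument lands in element 0. */
    __m256i r = _mm256_setr_epi32(7, 6, 5, 4, 3, 2, 1, 0);
    _mm256_storeu_si256((__m256i *)via_set, s);
    _mm256_storeu_si256((__m256i *)via_setr, r);
}
```

With the same argument list, via_set ends up as 0, 1, ..., 7 in memory and via_setr as 7, 6, ..., 0.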

__m256d _mm256_setzero_pd (void)

Return a __m256d with all elements set to zero.

Return vector of type __m256d with all elements set to zero.

__m256 _mm256_setzero_ps (void)

Return a __m256 with all elements set to zero.

Return vector of type __m256 with all elements set to zero.

__m256i _mm256_setzero_si256 (void)

Return a __m256i with all elements set to zero.

Return vector of type __m256i with all elements set to zero.

Shift

__m256i _mm256_bslli_epi128 (__m256i a, const int imm8)

Shift each 128-bit lane of a left by imm8 bytes, filling with zeros.

Shift 128-bit lanes in a left by imm8 bytes while shifting in zeros, and store the results in dst.

__m256i _mm256_bsrli_epi128 (__m256i a, const int imm8)

Shift each 128-bit lane of a right by imm8 bytes, filling with zeros.

Shift 128-bit lanes in a right by imm8 bytes while shifting in zeros, and store the results in dst.

__m256i _mm256_sll_epi16 (__m256i a, __m128i count)

Shift the 16-bit integer elements left by count bits, filling with zeros.

Shift packed 16-bit integers in a left by count while shifting in zeros, and store the results in dst.

__m256i _mm256_sll_epi32 (__m256i a, __m128i count)

Shift the 32-bit integer elements left by count bits, filling with zeros.

Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst.

__m256i _mm256_sll_epi64 (__m256i a, __m128i count)

Shift the 64-bit integer elements left by count bits, filling with zeros.

Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst.

__m256i _mm256_slli_epi16 (__m256i a, int imm8)

Shift the 16-bit integer elements left by imm8 bits, filling with zeros.

Shift packed 16-bit integers in a left by imm8 while shifting in zeros, and store the results in dst.

__m256i _mm256_slli_epi32 (__m256i a, int imm8)

Shift the 32-bit integer elements left by imm8 bits, filling with zeros.

Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst.

__m256i _mm256_slli_epi64 (__m256i a, int imm8)

Shift the 64-bit integer elements left by imm8 bits, filling with zeros.

Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst.

__m256i _mm256_slli_si256 (__m256i a, const int imm8)

Shift each 128-bit lane of a left by imm8 bytes, filling with zeros.

Shift 128-bit lanes in a left by imm8 bytes while shifting in zeros, and store the results in dst.

__m256i _mm256_sllv_epi32 (__m256i a, __m256i count)

Shift each 32-bit integer element left by the amount in the corresponding element of count, filling with zeros.

Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.

__m256i _mm256_sllv_epi64 (__m256i a, __m256i count)

Shift each 64-bit integer element left by the amount in the corresponding element of count, filling with zeros.

Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.

__m256i _mm256_sra_epi16 (__m256i a, __m128i count)

Arithmetic right shift of the 16-bit integer elements by count bits.

Shift packed 16-bit integers in a right by count while shifting in sign bits, and store the results in dst.

__m256i _mm256_sra_epi32 (__m256i a, __m128i count)

Arithmetic right shift of the 32-bit integer elements by count bits.

Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst.

__m256i _mm256_srai_epi16 (__m256i a, int imm8)

Arithmetic right shift of the 16-bit integer elements by imm8 bits.

Shift packed 16-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst.

__m256i _mm256_srai_epi32 (__m256i a, int imm8)

Arithmetic right shift of the 32-bit integer elements by imm8 bits.

Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst.

__m256i _mm256_srav_epi32 (__m256i a, __m256i count)

Arithmetic right shift of each 32-bit integer element by the amount in the corresponding element of count.

Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst.

__m256i _mm256_srl_epi16 (__m256i a, __m128i count)

Logical right shift of the 16-bit integer elements by count bits.

Shift packed 16-bit integers in a right by count while shifting in zeros, and store the results in dst.

__m256i _mm256_srl_epi32 (__m256i a, __m128i count)

Logical right shift of the 32-bit integer elements by count bits.

Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst.

__m256i _mm256_srl_epi64 (__m256i a, __m128i count)

Logical right shift of the 64-bit integer elements by count bits.

Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst.

__m256i _mm256_srli_epi16 (__m256i a, int imm8)

Logical right shift of the 16-bit integer elements by imm8 bits.

Shift packed 16-bit integers in a right by imm8 while shifting in zeros, and store the results in dst.

__m256i _mm256_srli_epi32 (__m256i a, int imm8)

Logical right shift of the 32-bit integer elements by imm8 bits.

Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst.

__m256i _mm256_srli_epi64 (__m256i a, int imm8)

Logical right shift of the 64-bit integer elements by imm8 bits.

Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst.

__m256i _mm256_srli_si256 (__m256i a, const int imm8)

Logical right shift of each 128-bit lane of a by imm8 bytes (not bits).

Shift 128-bit lanes in a right by imm8 bytes while shifting in zeros, and store the results in dst.

__m256i _mm256_srlv_epi32 (__m256i a, __m256i count)

Logical right shift of each 32-bit integer element by the amount in the corresponding element of count.

Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.

__m256i _mm256_srlv_epi64 (__m256i a, __m256i count)

Logical right shift of each 64-bit integer element by the amount in the corresponding element of count.

Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.
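The three shift flavours in this section differ in what is shifted in (zeros or sign bits) and in whether every element uses the same count. The sketch below contrasts an arithmetic and a logical right shift by 4 and shows a per-element variable left shift; it assumes GCC or Clang on an AVX2-capable CPU, and the helper names shift_right_both and powers_of_two and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: shifts 8 signed 32-bit values right by 4 bits,
   once arithmetically (sign bits shifted in) and once logically (zeros). */
AVX2_FN
void shift_right_both(const int32_t *src, int32_t *arith, uint32_t *logical)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)arith,   _mm256_srai_epi32(v, 4));
    _mm256_storeu_si256((__m256i *)logical, _mm256_srli_epi32(v, 4));
}

/* Hypothetical helper: out[i] = 1 << i for i = 0..7, using one shift
   count per element. */
AVX2_FN
void powers_of_two(uint32_t *out)
{
    __m256i ones   = _mm256_set1_epi32(1);
    __m256i counts = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
    _mm256_storeu_si256((__m256i *)out, _mm256_sllv_epi32(ones, counts));
}
```

For a negative input such as -16 the arithmetic shift yields -1, while the logical shift yields the large positive value 0x0FFFFFFF.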

Special Math Functions

__m256i _mm256_abs_epi16 (__m256i a)

Absolute value of the signed 16-bit integer vector a.

Compute the absolute value of packed signed 16-bit integers in a, and store the unsigned results in dst.

__m256i _mm256_abs_epi32 (__m256i a)

Absolute value of the signed 32-bit integer vector a.

Compute the absolute value of packed signed 32-bit integers in a, and store the unsigned results in dst.

__m256i _mm256_abs_epi8 (__m256i a)

Absolute value of the signed 8-bit integer vector a.

Compute the absolute value of packed signed 8-bit integers in a, and store the unsigned results in dst.

__m256d _mm256_ceil_pd (__m256d a)

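Round the 64-bit double-precision floating-point vector a up to integer values and return a 64-bit double-precision floating-point vector.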

Round the packed double-precision (64-bit) floating-point elements in a up to an integer value, and store the results as packed double-precision floating-point elements in dst.

__m256 _mm256_ceil_ps (__m256 a)

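Round the 32-bit single-precision floating-point vector a up to integer values and return a 32-bit single-precision floating-point vector.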

Round the packed single-precision (32-bit) floating-point elements in a up to an integer value, and store the results as packed single-precision floating-point elements in dst.

__m256d _mm256_floor_pd (__m256d a)

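Round the 64-bit double-precision floating-point vector a down to integer values and return a 64-bit double-precision floating-point vector.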

Round the packed double-precision (64-bit) floating-point elements in a down to an integer value, and store the results as packed double-precision floating-point elements in dst.

__m256 _mm256_floor_ps (__m256 a)

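Round the 32-bit single-precision floating-point vector a down to integer values and return a 32-bit single-precision floating-point vector.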

Round the packed single-precision (32-bit) floating-point elements in a down to an integer value, and store the results as packed single-precision floating-point elements in dst.

__m256i _mm256_max_epi16 (__m256i a, __m256i b)

Element-wise maximum of the signed 16-bit integer vectors a and b.

Compare packed signed 16-bit integers in a and b, and store packed maximum values in dst.

__m256i _mm256_max_epi32 (__m256i a, __m256i b)

Element-wise maximum of the signed 32-bit integer vectors a and b.

Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst.

__m256i _mm256_max_epi8 (__m256i a, __m256i b)

Element-wise maximum of the signed 8-bit integer vectors a and b.

Compare packed signed 8-bit integers in a and b, and store packed maximum values in dst.

__m256i _mm256_max_epu16 (__m256i a, __m256i b)

Element-wise maximum of the unsigned 16-bit integer vectors a and b.

Compare packed unsigned 16-bit integers in a and b, and store packed maximum values in dst.

__m256i _mm256_max_epu32 (__m256i a, __m256i b)

Element-wise maximum of the unsigned 32-bit integer vectors a and b.

Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst.

__m256i _mm256_max_epu8 (__m256i a, __m256i b)

Element-wise maximum of the unsigned 8-bit integer vectors a and b.

Compare packed unsigned 8-bit integers in a and b, and store packed maximum values in dst.

__m256d _mm256_max_pd (__m256d a, __m256d b)

Element-wise maximum of the 64-bit double-precision floating-point vectors a and b. The result does not follow the IEEE 754 maximum when an input is NaN or a signed zero (+0 or -0).

Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst. dst does not follow the IEEE Standard for Floating-Point Arithmetic (IEEE 754) maximum value when inputs are NaN or signed-zero values.

__m256 _mm256_max_ps (__m256 a, __m256 b)

Element-wise maximum of the 32-bit single-precision floating-point vectors a and b. The result does not follow the IEEE 754 maximum when an input is NaN or a signed zero (+0 or -0).

Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst. dst does not follow the IEEE Standard for Floating-Point Arithmetic (IEEE 754) maximum value when inputs are NaN or signed-zero values.
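The NaN caveat has a practical consequence: the result depends on the operand order. On x86 the underlying max instruction returns its second source operand whenever either input is NaN, so max(NaN, 1) and max(1, NaN) differ. The sketch below exposes this; it assumes GCC or Clang on an AVX-capable CPU and a build without fast-math options (which may let the compiler reorder operands). The helper name max_both_orders and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>
#include <math.h>   /* NAN macro only */

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: computes max(a, b) and max(b, a) for 8 floats. */
AVX2_FN
void max_both_orders(const float *a, const float *b, float *ab, float *ba)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    /* If either element is NaN, the element of the SECOND operand is returned. */
    _mm256_storeu_ps(ab, _mm256_max_ps(va, vb));
    _mm256_storeu_ps(ba, _mm256_max_ps(vb, va));
}
```

A loop that wants NaN inputs to be ignored can therefore put the running maximum in the second operand; a loop that wants NaN to propagate needs an explicit check.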

__m256i _mm256_min_epi16 (__m256i a, __m256i b)

Element-wise minimum of the signed 16-bit integer vectors a and b.

Compare packed signed 16-bit integers in a and b, and store packed minimum values in dst.

__m256i _mm256_min_epi32 (__m256i a, __m256i b)

Element-wise minimum of the signed 32-bit integer vectors a and b.

Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst.

__m256i _mm256_min_epi8 (__m256i a, __m256i b)

Element-wise minimum of the signed 8-bit integer vectors a and b.

Compare packed signed 8-bit integers in a and b, and store packed minimum values in dst.

__m256i _mm256_min_epu16 (__m256i a, __m256i b)

Element-wise minimum of the unsigned 16-bit integer vectors a and b.

Compare packed unsigned 16-bit integers in a and b, and store packed minimum values in dst.

__m256i _mm256_min_epu32 (__m256i a, __m256i b)

Element-wise minimum of the unsigned 32-bit integer vectors a and b.

Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst.

__m256i _mm256_min_epu8 (__m256i a, __m256i b)

Element-wise minimum of the unsigned 8-bit integer vectors a and b.

Compare packed unsigned 8-bit integers in a and b, and store packed minimum values in dst.

__m256d _mm256_min_pd (__m256d a, __m256d b)

Element-wise minimum of the 64-bit double-precision floating-point vectors a and b. The result does not follow the IEEE 754 minimum when an input is NaN or a signed zero (+0 or -0).

Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst. dst does not follow the IEEE Standard for Floating-Point Arithmetic (IEEE 754) minimum value when inputs are NaN or signed-zero values.

__m256 _mm256_min_ps (__m256 a, __m256 b)

Element-wise minimum of the 32-bit single-precision floating-point vectors a and b. The result does not follow the IEEE 754 minimum when an input is NaN or a signed zero (+0 or -0).

Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst. dst does not follow the IEEE Standard for Floating-Point Arithmetic (IEEE 754) minimum value when inputs are NaN or signed-zero values.

__m256d _mm256_round_pd (__m256d a, int rounding)

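Round the 64-bit double-precision floating-point vector a to integer values and return a 64-bit double-precision floating-point vector; the rounding method is chosen by the rounding parameter.

Round the packed double-precision (64-bit) floating-point elements in a using the rounding parameter, and store the results as packed double-precision floating-point elements in dst. Rounding is done according to the rounding[3:0] parameter, which can be one of: _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC (round to nearest), _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC (round down), _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC (round up), _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC (truncate), each of these suppressing exceptions, or _MM_FROUND_CUR_DIRECTION (use MXCSR.RC).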

__m256 _mm256_round_ps (__m256 a, int rounding)

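Round the 32-bit single-precision floating-point vector a to integer values and return a 32-bit single-precision floating-point vector; the rounding method is chosen by the rounding parameter.

Round the packed single-precision (32-bit) floating-point elements in a using the rounding parameter, and store the results as packed single-precision floating-point elements in dst. Rounding is done according to the rounding[3:0] parameter, which can be one of: _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC (round to nearest), _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC (round down), _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC (round up), _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC (truncate), each of these suppressing exceptions, or _MM_FROUND_CUR_DIRECTION (use MXCSR.RC).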
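The effect of the rounding parameter is easiest to see by applying the four explicit modes to the same input. The sketch below does that for four doubles; it assumes GCC or Clang on an AVX-capable CPU and uses the _MM_FROUND_* macros from the same header. The helper name round_four_ways and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: rounds the 4 doubles at src in four different modes.
   The rounding argument must be a compile-time constant. */
AVX2_FN
void round_four_ways(const double *src, double *nearest, double *down,
                     double *up, double *toward_zero)
{
    __m256d v = _mm256_loadu_pd(src);
    /* Round to nearest; ties go to the even integer (2.5 -> 2.0). */
    _mm256_storeu_pd(nearest,
        _mm256_round_pd(v, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC));
    /* Round toward negative infinity (same result as _mm256_floor_pd). */
    _mm256_storeu_pd(down,
        _mm256_round_pd(v, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC));
    /* Round toward positive infinity (same result as _mm256_ceil_pd). */
    _mm256_storeu_pd(up,
        _mm256_round_pd(v, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC));
    /* Truncate toward zero. */
    _mm256_storeu_pd(toward_zero,
        _mm256_round_pd(v, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC));
}
```

Note that round-to-nearest is ties-to-even, which differs from the C library round() function that rounds halfway cases away from zero.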

Store

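void _mm256_maskstore_epi32 (int* mem_addr, __m256i mask, __m256i a)

Store the 32-bit integer vector to memory; an element is not stored when the most significant bit of the corresponding mask element is 0.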

Store packed 32-bit integers from a into memory using mask (elements are not stored when the highest bit is not set in the corresponding element).

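void _mm256_maskstore_epi64 (__int64* mem_addr, __m256i mask, __m256i a)

Store the 64-bit integer vector to memory; an element is not stored when the most significant bit of the corresponding mask element is 0.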

Store packed 64-bit integers from a into memory using mask (elements are not stored when the highest bit is not set in the corresponding element).

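void _mm256_maskstore_pd (double * mem_addr, __m256i mask, __m256d a)

Store the 64-bit double-precision floating-point vector to memory; an element is not stored when the most significant bit of the corresponding mask element is 0.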

Store packed double-precision (64-bit) floating-point elements from a into memory using mask.

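void _mm256_maskstore_ps (float * mem_addr, __m256i mask, __m256 a)

Store the 32-bit single-precision floating-point vector to memory; an element is not stored when the most significant bit of the corresponding mask element is 0.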

Store packed single-precision (32-bit) floating-point elements from a into memory using mask.
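Masked stores are typically used for the tail of an array whose length is not a multiple of the vector width. The sketch below copies only the first n of 8 integers and leaves the rest of the destination untouched; it assumes GCC or Clang on an AVX2-capable CPU and also uses _mm256_cmpgt_epi32 and _mm256_maskload_epi32 from the same header. The helper name store_first_n and the AVX2_FN macro are illustrative only.

```c
#include <assert.h>
#include <immintrin.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: copies src[0..n-1] to dst[0..n-1] for 0 <= n <= 8
   without touching memory beyond element n-1 on either side. */
AVX2_FN
void store_first_n(int *dst, const int *src, int n)
{
    assert(n >= 0 && n <= 8);
    __m256i idx  = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
    /* Element i of mask is all ones (top bit set) when i < n, else zero. */
    __m256i mask = _mm256_cmpgt_epi32(_mm256_set1_epi32(n), idx);
    /* Masked-off elements are neither read nor written. */
    __m256i v = _mm256_maskload_epi32(src, mask);
    _mm256_maskstore_epi32(dst, mask, v);
}
```

Only the most significant bit of each mask element matters, so any per-element comparison result can serve directly as the mask.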

void _mm256_store_pd (double * mem_addr, __m256d a)

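Store 256 bits (four 64-bit double-precision floating-point values) to memory; mem_addr must be 32-byte aligned, otherwise a general-protection exception may be raised.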

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

void _mm256_store_ps (float * mem_addr, __m256 a)

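Store 256 bits (eight 32-bit single-precision floating-point values) to memory; mem_addr must be 32-byte aligned, otherwise a general-protection exception may be raised.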

Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

void _mm256_store_si256 (__m256i * mem_addr, __m256i a)

Store 256 bits of integer data to memory; mem_addr must be 32-byte aligned, otherwise a general-protection exception may be raised.

Store 256-bits of integer data from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

void _mm256_storeu_pd (double * mem_addr, __m256d a)

Store 256 bits (four 64-bit double-precision floating-point values) to memory; mem_addr does not need to be 32-byte aligned.

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.

void _mm256_storeu_ps (float * mem_addr, __m256 a)

Store 256 bits (eight 32-bit single-precision floating-point values) to memory; mem_addr does not need to be 32-byte aligned.

Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.

void _mm256_storeu_si256 (__m256i * mem_addr, __m256i a)

Store 256 bits of integer data to memory; mem_addr does not need to be 32-byte aligned.

Store 256-bits of integer data from a into memory. mem_addr does not need to be aligned on any particular boundary.

void _mm256_storeu2_m128 (float* hiaddr, float* loaddr, __m256 a)

Split a into two 128-bit halves (each composed of four 32-bit single-precision floating-point values) and store them to hiaddr and loaddr respectively; hiaddr and loaddr do not need to be aligned.

Store the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

void _mm256_storeu2_m128d (double* hiaddr, double* loaddr, __m256d a)

Split a into two 128-bit halves (each composed of two 64-bit double-precision floating-point values) and store them to hiaddr and loaddr respectively; hiaddr and loaddr do not need to be aligned.

Store the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

void _mm256_storeu2_m128i (__m128i* hiaddr, __m128i* loaddr, __m256i a)

Split a into two 128-bit halves (each composed of 128 bits of integer data) and store them to hiaddr and loaddr respectively; hiaddr and loaddr do not need to be aligned.

Store the high and low 128-bit halves (each composed of integer data) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

void _mm256_stream_pd (double * mem_addr, __m256d a)

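Store 256 bits (four 64-bit double-precision floating-point values) to memory using a non-temporal memory hint; mem_addr must be 32-byte aligned, otherwise a general-protection exception may be raised.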

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

void _mm256_stream_ps (float * mem_addr, __m256 a)

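Store 256 bits (eight 32-bit single-precision floating-point values) to memory using a non-temporal memory hint; mem_addr must be 32-byte aligned, otherwise a general-protection exception may be raised.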

Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

void _mm256_stream_si256 (__m256i * mem_addr, __m256i a)

Store 256 bits of integer data to memory using a non-temporal memory hint; mem_addr must be 32-byte aligned, otherwise a general-protection exception may be raised.

Store 256-bits of integer data from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

Swizzle

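__m256i _mm256_blend_epi16 (__m256i a, __m256i b, const int imm8)

Blend (choose one of two) the 16-bit integer elements of a and b according to the low 8 bits of imm8; the same 8 control bits are applied to both the low and the high 128-bit lane (so 16-bit elements 0 and 8 follow the same rule).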

Blend packed 16-bit integers from a and b within 128-bit lanes using control mask imm8, and store the results in dst.

__m256i _mm256_blend_epi32 (__m256i a, __m256i b, const int imm8)

Blend (choose one of two) the 32-bit integer elements of a and b according to the low 8 bits of imm8.

Blend packed 32-bit integers from a and b using control mask imm8, and store the results in dst.

__m256d _mm256_blend_pd (__m256d a, __m256d b, const int imm8)

Blend (choose one of two) the 64-bit double-precision floating-point elements of a and b according to the low 4 bits of imm8.

Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

__m256 _mm256_blend_ps (__m256 a, __m256 b, const int imm8)

Blend (choose one of two) the 32-bit single-precision floating-point elements of a and b according to the low 8 bits of imm8.

Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

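__m256i _mm256_blendv_epi8 (__m256i a, __m256i b, __m256i mask)

Blend (choose one of two) the 8-bit integer elements of a and b according to the most significant bit of the corresponding element of mask.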

Blend packed 8-bit integers from a and b using mask, and store the results in dst.

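__m256d _mm256_blendv_pd (__m256d a, __m256d b, __m256d mask)

Blend (choose one of two) the 64-bit double-precision floating-point elements of a and b according to the most significant bit of the corresponding element of mask.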

Blend packed double-precision (64-bit) floating-point elements from a and b using mask, and store the results in dst.

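__m256 _mm256_blendv_ps (__m256 a, __m256 b, __m256 mask)

Blend (choose one of two) the 32-bit single-precision floating-point elements of a and b according to the most significant bit of the corresponding element of mask.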

Blend packed single-precision (32-bit) floating-point elements from a and b using mask, and store the results in dst.
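Because blendv looks only at the top bit of each mask element, a floating-point vector can serve as its own mask: the top bit is the sign bit. The sketch below uses this to replace every negative element with zero; it assumes GCC or Clang on an AVX-capable CPU, and the helper name clamp_negative_to_zero and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: dst[i] = src[i] < 0 ? 0.0f : src[i] for 8 floats. */
AVX2_FN
void clamp_negative_to_zero(const float *src, float *dst)
{
    __m256 x = _mm256_loadu_ps(src);
    /* blendv(a, b, mask) takes the element from b where the mask's top
       bit is set and from a otherwise. With x as the mask, elements whose
       sign bit is set are replaced by 0.0f. */
    __m256 r = _mm256_blendv_ps(x, _mm256_setzero_ps(), x);
    _mm256_storeu_ps(dst, r);
}
```

The same selection pattern works with the result of a vector comparison as the mask, since comparisons produce all-ones or all-zeros elements.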

__m256d _mm256_broadcast_pd (__m128d const * mem_addr)

Broadcast 128 bits of data from memory (two 64-bit double-precision floating-point values).

Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of dst.

__m256 _mm256_broadcast_ps (__m128 const * mem_addr)

Broadcast 128 bits of data from memory (four 32-bit single-precision floating-point values).

Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of dst.

__m256d _mm256_broadcast_sd (double const * mem_addr)

Broadcast one 64-bit double-precision floating-point value from memory to all elements of dst.

Broadcast a double-precision (64-bit) floating-point element from memory to all elements of dst.

__m256i _mm256_broadcastb_epi8 (__m128i a)

Broadcast the lowest element of the 8-bit integer vector to all elements of dst.

Broadcast the low packed 8-bit integer from a to all elements of dst.

__m256i _mm256_broadcastd_epi32 (__m128i a)

Broadcast the lowest element of the 32-bit integer vector to all elements of dst.

Broadcast the low packed 32-bit integer from a to all elements of dst.

__m256i _mm256_broadcastq_epi64 (__m128i a)

Broadcast the lowest element of the 64-bit integer vector to all elements of dst.

Broadcast the low packed 64-bit integer from a to all elements of dst.

__m256d _mm256_broadcastsd_pd (__m128d a)

Broadcast the lowest element of the 64-bit double-precision floating-point vector to all elements of dst.

Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst.

__m256i _mm256_broadcastsi128_si256 (__m128i a)

Broadcast 128 bits of integer data to all 128-bit lanes of dst.

Broadcast 128 bits of integer data from a to all 128-bit lanes in dst.

__m256 _mm256_broadcastss_ps (__m128 a)

Broadcast the lowest element of the 32-bit single-precision floating-point vector to all elements of dst.

Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst.

__m256i _mm256_broadcastw_epi16 (__m128i a)

Broadcast the lowest element of the 16-bit integer vector to all elements of dst.

Broadcast the low packed 16-bit integer from a to all elements of dst.

int _mm256_extract_epi16 (__m256i a, const int index)

Extract the 16-bit integer selected by index from a.

Extract a 16-bit integer from a, selected with index, and store the result in dst.

__int32 _mm256_extract_epi32 (__m256i a, const int index)

Extract the 32-bit integer selected by index from a.

Extract a 32-bit integer from a, selected with index, and store the result in dst.

__int64 _mm256_extract_epi64 (__m256i a, const int index)

Extract the 64-bit integer selected by index from a.

Extract a 64-bit integer from a, selected with index, and store the result in dst.

int _mm256_extract_epi8 (__m256i a, const int index)

Extract the 8-bit integer selected by index from a.

Extract an 8-bit integer from a, selected with index, and store the result in dst.

__m128d _mm256_extractf128_pd (__m256d a, const int imm8)

Extract the 128 bits (two 64-bit double-precision floating-point values) selected by imm8.

Extract 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

__m128 _mm256_extractf128_ps (__m256 a, const int imm8)

Extract the 128 bits (four 32-bit single-precision floating-point values) selected by imm8.

Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

__m128i _mm256_extractf128_si256 (__m256i a, const int imm8)

Extract the 128 bits (integer data) selected by imm8.

Extract 128 bits (composed of integer data) from a, selected with imm8, and store the result in dst.

__m128i _mm256_extracti128_si256 (__m256i a, const int imm8)

Extract the 128 bits (integer data) selected by imm8.

Extract 128 bits (composed of integer data) from a, selected with imm8, and store the result in dst.
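Extracting the two 128-bit halves is the usual first step of a horizontal reduction, because most 256-bit operations do not cross the lane boundary. The sketch below splits a vector of eight 32-bit integers into its halves and sums all elements; it assumes GCC or Clang on an AVX2-capable CPU and also uses the SSE2 intrinsics _mm_add_epi32 and _mm_storeu_si128 from the same header. The helper name split_and_sum and the AVX2_FN macro are illustrative only.

```c
#include <immintrin.h>

/* Assumption: GCC or Clang; enables AVX2 per function without -mavx2. */
#if defined(__GNUC__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: writes elements 0..3 of p to lo4 and elements 4..7
   to hi4, and returns the sum of all 8 elements. */
AVX2_FN
int split_and_sum(const int *p, int *lo4, int *hi4)
{
    __m256i v  = _mm256_loadu_si256((const __m256i *)p);
    __m128i lo = _mm256_extracti128_si256(v, 0);   /* imm8 = 0: low half  */
    __m128i hi = _mm256_extracti128_si256(v, 1);   /* imm8 = 1: high half */
    _mm_storeu_si128((__m128i *)lo4, lo);
    _mm_storeu_si128((__m128i *)hi4, hi);

    int part[4];
    _mm_storeu_si128((__m128i *)part, _mm_add_epi32(lo, hi));
    return part[0] + part[1] + part[2] + part[3];
}
```

The final four-element sum is done in scalar code here for clarity; it could equally be reduced further with 128-bit shuffles and adds.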

__m256i _mm256_insert_epi16 (__m256i a, __int16 i, const int index)

Copy a to dst, then insert the 16-bit integer i at the position selected by index.

Copy a to dst, and insert the 16-bit integer i into dst at the location specified by index.

__m256i _mm256_insert_epi32 (__m256i a, __int32 i, const int index)

Copy a to dst, then insert the 32-bit integer i at the position selected by index.

Copy a to dst, and insert the 32-bit integer i into dst at the location specified by index.

__m256i _mm256_insert_epi64 (__m256i a, __int64 i, const int index)

Copy a to dst, then insert the 64-bit integer i at the position selected by index.

Copy a to dst, and insert the 64-bit integer i into dst at the location specified by index.

__m256i _mm256_insert_epi8 (__m256i a, __int8 i, const int index)

Copy a to dst, then insert the 8-bit integer i at the position selected by index.

Copy a to dst, and insert the 8-bit integer i into dst at the location specified by index.

__m256d _mm256_insertf128_pd (__m256d a, __m128d b, int imm8)

Copy a to dst, then insert 128 bits (two 64-bit double-precision floating-point values) at the position selected by imm8.

Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by imm8.

__m256 _mm256_insertf128_ps (__m256 a, __m128 b, int imm8)

Copy a to dst, then insert 128 bits (four 32-bit single-precision floating-point values) at the position selected by imm8.

Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.

__m256i _mm256_insertf128_si256 (__m256i a, __m128i b, int imm8)

Copy a to dst, then insert 128 bits of data at the position selected by imm8.

Copy a to dst, then insert 128 bits from b into dst at the location specified by imm8.

__m256i _mm256_inserti128_si256 (__m256i a, __m128i b, const int imm8)

Copy a to dst, then insert 128 bits (integer data) at the position selected by imm8.

Copy a to dst, then insert 128 bits (composed of integer data) from b into dst at the location specified by imm8.

__m256d _mm256_permute_pd (__m256d a, int imm8)

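Reorders the 64-bit double-precision elements of a according to imm8 (each of the low 4 bits controls one result element); elements can only move within their own 128-bit lane.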

Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

__m256 _mm256_permute_ps (__m256 a, int imm8)

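Reorders the 32-bit single-precision elements of a according to imm8 (each 2-bit field of the 8 bits selects the source for one result position, and the same four fields are applied to both lanes); elements can only move within their own 128-bit lane.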

Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
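
To make the imm8 encoding of the in-lane permute concrete, here is a small sketch that reverses the four floats inside each 128-bit lane. The helper name reverse_floats_in_lanes and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>

/* Assumption of this sketch: enable AVX2 (which implies AVX) per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* Reverses the element order inside each 128-bit lane of in[0..7]. */
AVX2_TARGET void reverse_floats_in_lanes(const float in[8], float out[8])
{
    __m256 v = _mm256_loadu_ps(in);
    /* 0x1B = 0b00011011: result positions 0..3 take source elements 3, 2, 1, 0. */
    __m256 r = _mm256_permute_ps(v, 0x1B);
    _mm256_storeu_ps(out, r);
}
```

Because the control is applied per lane, an input of 0..7 yields 3, 2, 1, 0, 7, 6, 5, 4 rather than a full 8-element reversal.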

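__m256d _mm256_permute2f128_pd (__m256d a, __m256d b, int imm8)

Builds dst from 128-bit halves (2 packed 64-bit double-precision elements each) of a and b according to imm8: each 4-bit field of the low 8 bits controls one half of dst, and setting the top bit of a field zeroes that half instead.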

Shuffle 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

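__m256 _mm256_permute2f128_ps (__m256 a, __m256 b, int imm8)

Builds dst from 128-bit halves (4 packed 32-bit single-precision elements each) of a and b according to imm8: each 4-bit field of the low 8 bits controls one half of dst, and setting the top bit of a field zeroes that half instead.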

Shuffle 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

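__m256i _mm256_permute2f128_si256 (__m256i a, __m256i b, int imm8)

Builds dst from 128-bit halves (integer data) of a and b according to imm8: each 4-bit field of the low 8 bits controls one half of dst, and setting the top bit of a field zeroes that half instead.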

Shuffle 128-bits (composed of integer data) selected by imm8 from a and b, and store the results in dst.

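__m256i _mm256_permute2x128_si256 (__m256i a, __m256i b, const int imm8)

Builds dst from 128-bit halves (integer data) of a and b according to imm8: each 4-bit field of the low 8 bits controls one half of dst, and setting the top bit of a field zeroes that half instead.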

Shuffle 128-bits (composed of integer data) selected by imm8 from a and b, and store the results in dst.
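
The two helpers below sketch the imm8 encoding of the 128-bit permutes: within each 4-bit field, the low two bits pick the source half (0 = low half of a, 1 = high half of a, 2 = low half of b, 3 = high half of b) and the top bit zeroes the result half. The helper names and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption of this sketch: enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* Swaps the two 128-bit halves of in[0..7]. */
AVX2_TARGET void swap_halves(const int32_t in[8], int32_t out[8])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* 0x01: low field 1 -> dst low = high half of a; high field 0 -> dst high = low half of a. */
    __m256i r = _mm256_permute2x128_si256(v, v, 0x01);
    _mm256_storeu_si256((__m256i *)out, r);
}

/* Keeps the low half of in[0..7] and zeroes the high half. */
AVX2_TARGET void keep_low_zero_high(const int32_t in[8], int32_t out[8])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* 0x80: low field 0 -> dst low = low half of a; bit 7 set -> dst high = 0. */
    __m256i r = _mm256_permute2x128_si256(v, v, 0x80);
    _mm256_storeu_si256((__m256i *)out, r);
}
```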

__m256i _mm256_permute4x64_epi64 (__m256i a, const int imm8)

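Reorders the 64-bit integers of a according to imm8 (each 2-bit field of the 8 bits selects the source for one result element); elements may cross the 128-bit lane boundary.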

Shuffle 64-bit integers in a across lanes using the control in imm8, and store the results in dst.

__m256d _mm256_permute4x64_pd (__m256d a, const int imm8)

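Reorders the 64-bit double-precision elements of a according to imm8 (each 2-bit field of the 8 bits selects the source for one result element); elements may cross the 128-bit lane boundary.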

Shuffle double-precision (64-bit) floating-point elements in a across lanes using the control in imm8, and store the results in dst.
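
Unlike the in-lane permutes, the 4x64 variants can move elements across the lane boundary. A minimal sketch that reverses all four 64-bit integers follows; the helper name reverse_qwords and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption of this sketch: enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* out[i] = in[3 - i] for the four 64-bit elements. */
AVX2_TARGET void reverse_qwords(const int64_t in[4], int64_t out[4])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* 0x1B = 0b00011011: result positions 0..3 take source elements 3, 2, 1, 0. */
    __m256i r = _mm256_permute4x64_epi64(v, 0x1B);
    _mm256_storeu_si256((__m256i *)out, r);
}
```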

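__m256d _mm256_permutevar_pd (__m256d a, __m256i b)

Reorders the 64-bit double-precision elements of a according to b (bit 1, the second-lowest bit, of each 64-bit element of b controls one result element); elements can only move within their own 128-bit lane.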

Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.

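__m256 _mm256_permutevar_ps (__m256 a, __m256i b)

Reorders the 32-bit single-precision elements of a according to b (the low 2 bits of each 32-bit element of b control one result element); elements can only move within their own 128-bit lane.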

Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.
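
The control encoding of _mm256_permutevar_pd is easy to get wrong because the selector is bit 1 of each 64-bit control element, not bit 0. The sketch below swaps the two doubles in each lane; the helper name swap_pairs and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>

/* Assumption of this sketch: enable AVX2 (which implies AVX) per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* Swaps the two doubles inside each 128-bit lane: {x0, x1, x2, x3} -> {x1, x0, x3, x2}. */
AVX2_TARGET void swap_pairs(const double in[4], double out[4])
{
    __m256d v = _mm256_loadu_pd(in);
    /* Only bit 1 of each control element is read: 2 picks the upper double
       of the lane, 0 the lower one. A control value of 1 would also pick the lower one. */
    __m256i ctrl = _mm256_setr_epi64x(2, 0, 2, 0);
    __m256d r = _mm256_permutevar_pd(v, ctrl);
    _mm256_storeu_pd(out, r);
}
```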

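__m256i _mm256_permutevar8x32_epi32 (__m256i a, __m256i idx)

Reorders the 32-bit integers of a according to idx (the low 3 bits of each 32-bit element of idx select the source for one result element); elements may cross the 128-bit lane boundary.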

Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.

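__m256 _mm256_permutevar8x32_ps (__m256 a, __m256i idx)

Reorders the 32-bit single-precision elements of a according to idx (the low 3 bits of each 32-bit element of idx select the source for one result element); elements may cross the 128-bit lane boundary.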

Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx.
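
Since each index in idx addresses any of the eight source elements, a full reversal takes a single instruction. A minimal sketch with the integer variant follows; the helper name reverse_ints and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption of this sketch: enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* out[i] = in[7 - i] for the eight 32-bit elements. */
AVX2_TARGET void reverse_ints(const int32_t in[8], int32_t out[8])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* Result element i is read from source element idx[i]. */
    __m256i idx = _mm256_setr_epi32(7, 6, 5, 4, 3, 2, 1, 0);
    __m256i r = _mm256_permutevar8x32_epi32(v, idx);
    _mm256_storeu_si256((__m256i *)out, r);
}
```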

__m256i _mm256_shuffle_epi32 (__m256i a, const int imm8)

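Reorders the 32-bit integers of a according to imm8 (each 2-bit field of the 8 bits selects the source for one result position, and the same four fields are applied to both lanes); elements can only move within their own 128-bit lane.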

Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst.

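__m256i _mm256_shuffle_epi8 (__m256i a, __m256i b)

Reorders the 8-bit integers of a according to b: the low 4 bits of each byte of b select the source byte within the same 128-bit lane, and if the top bit (bit 7) of that byte is set, the result byte is zeroed instead.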

Shuffle 8-bit integers in a within 128-bit lanes according to shuffle control mask in the corresponding 8-bit element of b, and store the results in dst.
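
The byte shuffle combines an in-lane index with a zeroing flag, which the following sketch exercises: it reverses the bytes inside each 128-bit lane and forces the first result byte to zero. The helper name reverse_bytes_zero_first and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption of this sketch: enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* Reverses the 16 bytes inside each 128-bit lane; out[0] is zeroed. */
AVX2_TARGET void reverse_bytes_zero_first(const uint8_t in[32], uint8_t out[32])
{
    uint8_t mask[32];
    for (int i = 0; i < 32; ++i) {
        mask[i] = (uint8_t)(15 - (i % 16)); /* index is relative to the byte's own lane */
    }
    mask[0] = 0x80; /* bit 7 set: result byte 0 becomes 0 whatever the low bits say */

    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    __m256i m = _mm256_loadu_si256((const __m256i *)mask);
    _mm256_storeu_si256((__m256i *)out, _mm256_shuffle_epi8(v, m));
}
```

Note that index 15 in the upper lane refers to byte 31 of the source, not byte 15, because the selection never leaves the lane.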

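__m256d _mm256_shuffle_pd (__m256d a, __m256d b, const int imm8)

Mixes the 64-bit double-precision elements of a and b according to imm8 (each of the low 4 bits controls one result element): even-indexed results are taken from a and odd-indexed results from b, always within the same 128-bit lane.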

Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst.

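__m256 _mm256_shuffle_ps (__m256 a, __m256 b, const int imm8)

Mixes the 32-bit single-precision elements of a and b according to imm8 (each 2-bit field of the 8 bits controls one result position): in each 128-bit lane, the two low results are selected from a and the two high results from b.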

Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
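
Although the official description mentions only a, the two-operand shuffle reads from both inputs. The sketch below keeps the two low floats of each lane from a and takes the two high floats from b; the helper name low_from_a_high_from_b and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>

/* Assumption of this sketch: enable AVX2 (which implies AVX) per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* Per 128-bit lane: results 0 and 1 come from a, results 2 and 3 come from b. */
AVX2_TARGET void low_from_a_high_from_b(const float a[8], const float b[8], float out[8])
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    /* _MM_SHUFFLE(3, 2, 1, 0) == 0xE4: positions 0, 1 <- a[0], a[1];
       positions 2, 3 <- b[2], b[3], repeated for the upper lane. */
    __m256 r = _mm256_shuffle_ps(va, vb, _MM_SHUFFLE(3, 2, 1, 0));
    _mm256_storeu_ps(out, r);
}
```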

__m256i _mm256_shufflehi_epi16 (__m256i a, const int imm8)

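Reorders the four 16-bit integers in the high 64 bits of each 128-bit lane of a according to imm8 (each 2-bit field of the 8 bits controls one result element) and stores them in the high 64 bits of the corresponding lane of dst; the low 64 bits of each lane are copied from a unchanged.

Shuffle 16-bit integers in the high 64 bits of 128-bit lanes of a using the control in imm8. Store the results in the high 64 bits of 128-bit lanes of dst, with the low 64 bits of 128-bit lanes being copied from a to dst.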

__m256i _mm256_shufflelo_epi16 (__m256i a, const int imm8)

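Reorders the four 16-bit integers in the low 64 bits of each 128-bit lane of a according to imm8 (each 2-bit field of the 8 bits controls one result element) and stores them in the low 64 bits of the corresponding lane of dst; the high 64 bits of each lane are copied from a unchanged.

Shuffle 16-bit integers in the low 64 bits of 128-bit lanes of a using the control in imm8. Store the results in the low 64 bits of 128-bit lanes of dst, with the high 64 bits of 128-bit lanes being copied from a to dst.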
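
A brief sketch of the low-half word shuffle: it reverses the four low 16-bit integers of each lane and leaves the four high ones in place. The helper name reverse_low_words and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption of this sketch: enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* In each 128-bit lane, words 0..3 are reversed and words 4..7 are copied. */
AVX2_TARGET void reverse_low_words(const int16_t in[16], int16_t out[16])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* 0x1B = 0b00011011: result words 0..3 take source words 3, 2, 1, 0. */
    __m256i r = _mm256_shufflelo_epi16(v, 0x1B);
    _mm256_storeu_si256((__m256i *)out, r);
}
```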

__m256i _mm256_unpackhi_epi16 (__m256i a, __m256i b)

Interleaves the 16-bit integers from the high 64 bits of each 128-bit lane of a and b.

Unpack and interleave 16-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpackhi_epi32 (__m256i a, __m256i b)

Interleaves the 32-bit integers from the high 64 bits of each 128-bit lane of a and b.

Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpackhi_epi64 (__m256i a, __m256i b)

Interleaves the 64-bit integers from the high 64 bits of each 128-bit lane of a and b.

Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpackhi_epi8 (__m256i a, __m256i b)

Interleaves the 8-bit integers from the high 64 bits of each 128-bit lane of a and b.

Unpack and interleave 8-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst.

__m256d _mm256_unpackhi_pd (__m256d a, __m256d b)

Interleaves the 64-bit double-precision elements from the high 64 bits of each 128-bit lane of a and b.

Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

__m256 _mm256_unpackhi_ps (__m256 a, __m256 b)

Interleaves the 32-bit single-precision elements from the high 64 bits of each 128-bit lane of a and b.

Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpacklo_epi16 (__m256i a, __m256i b)

Interleaves the 16-bit integers from the low 64 bits of each 128-bit lane of a and b.

Unpack and interleave 16-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpacklo_epi32 (__m256i a, __m256i b)

Interleaves the 32-bit integers from the low 64 bits of each 128-bit lane of a and b.

Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpacklo_epi64 (__m256i a, __m256i b)

Interleaves the 64-bit integers from the low 64 bits of each 128-bit lane of a and b.

Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst.

__m256i _mm256_unpacklo_epi8 (__m256i a, __m256i b)

Interleaves the 8-bit integers from the low 64 bits of each 128-bit lane of a and b.

Unpack and interleave 8-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst.

__m256d _mm256_unpacklo_pd (__m256d a, __m256d b)

Interleaves the 64-bit double-precision elements from the low 64 bits of each 128-bit lane of a and b.

Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.

__m256 _mm256_unpacklo_ps (__m256 a, __m256 b)

Interleaves the 32-bit single-precision elements from the low 64 bits of each 128-bit lane of a and b.

Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.
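
Because the unpack intrinsics work per 128-bit lane, their result is not a plain interleave of the two 256-bit inputs. The sketch below shows this for 32-bit integers; the helper name interleave_epi32 and the AVX2_TARGET macro are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption of this sketch: enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

/* lo = {a0, b0, a1, b1, a4, b4, a5, b5}, hi = {a2, b2, a3, b3, a6, b6, a7, b7}. */
AVX2_TARGET void interleave_epi32(const int32_t a[8], const int32_t b[8],
                                  int32_t lo[8], int32_t hi[8])
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)lo, _mm256_unpacklo_epi32(va, vb));
    _mm256_storeu_si256((__m256i *)hi, _mm256_unpackhi_epi32(va, vb));
}
```

To obtain a contiguous interleave of all eight pairs, the two results would still have to be recombined across lanes, for example with one of the 128-bit permutes listed earlier.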

