_mm256_mask_shuffle_f32x4
Classification
AVX-512, Miscellaneous, CPUID Test: AVX512F and AVX512VL
Header File
immintrin.h
Instruction
VSHUFF32X4 ymm {k}, ymm, ymm, imm8
Synopsis
__m256 _mm256_mask_shuffle_f32x4(__m256 src, __mmask8 k, __m256 a, __m256 b, const int imm8);
Description
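Shuffle 128-bit lanes (each composed of 4 single-precision (32-bit) floating-point elements) selected by "imm8" from "a" and "b", and store the results in "dst" using writemask "k" (elements are copied from "src" when the corresponding mask bit is not set). Bit 0 of "imm8" selects which 128-bit lane of "a" becomes the lower half of the result, and bit 1 selects which 128-bit lane of "b" becomes the upper half; the remaining bits of "imm8" do not appear in the operation.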
Operation
tmp_dst.m128[0] := a.m128[imm8[0]]
tmp_dst.m128[1] := b.m128[imm8[1]]
FOR j := 0 to 7
    i := j*32
    IF k[j]
        dst[i+31:i] := tmp_dst[i+31:i]
    ELSE
        dst[i+31:i] := src[i+31:i]
    FI
ENDFOR
dst[MAX:256] := 0
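
The operation above can be modeled in portable scalar C, which may help when checking expected results on a machine without AVX-512 support. This is only a sketch: the function name ref_mask_shuffle_f32x4 is hypothetical and not part of any Intel header, plain 8-element float arrays stand in for __m256, and a uint8_t stands in for __mmask8. The final zeroing of dst[MAX:256] has no counterpart here because the arrays are exactly 256 bits wide.

```c
#include <stdint.h>

/* Hypothetical scalar model of the Operation pseudocode (not an Intel API).
   Element 0 of each array corresponds to bits 31:0 of the vector. */
static void ref_mask_shuffle_f32x4(float dst[8], const float src[8],
                                   uint8_t k, const float a[8],
                                   const float b[8], int imm8)
{
    float tmp[8];
    int lane_a = imm8 & 1;         /* imm8[0]: 128-bit lane taken from a */
    int lane_b = (imm8 >> 1) & 1;  /* imm8[1]: 128-bit lane taken from b */

    /* Build the unmasked shuffle result: low half from a, high half from b. */
    for (int e = 0; e < 4; e++) {
        tmp[e]     = a[4 * lane_a + e];
        tmp[4 + e] = b[4 * lane_b + e];
    }

    /* Apply the writemask: keep src where the mask bit is clear. */
    for (int j = 0; j < 8; j++)
        dst[j] = ((k >> j) & 1) ? tmp[j] : src[j];
}
```

For example, with a = {0,...,7}, b = {10,...,17}, src filled with -1, imm8 = 1 and k = 0x3C, the model selects the upper lane of a and the lower lane of b, then writes only elements 2 through 5, giving {-1, -1, 6, 7, 10, 11, -1, -1}. The real intrinsic would be called as _mm256_mask_shuffle_f32x4(src, k, a, b, 1), with imm8 supplied as a compile-time constant.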