_mm256_mask_i64gather_ps
Classification
AVX2
Header File
immintrin.h
Instruction
VGATHERQPS xmm, vm64y, xmm
Synopsis
__m128 _mm256_mask_i64gather_ps(__m128 src, float const* base_addr, __m256i vindex, __m128 mask, const int scale);
Description
Gather single-precision (32-bit) floating-point elements from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at "base_addr" and offset by each 64-bit element in "vindex" (each index is scaled by the factor in "scale"). Gathered elements are merged into "dst" using "mask" (elements are copied from "src" when the highest bit is not set in the corresponding element). "scale" should be 1, 2, 4 or 8.
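As a rough illustration of the description above, the following sketch gathers two of four lanes from a small table and keeps the other two from "src". The helper name gather4_masked, the index values, and the table layout are assumptions made up for this example. The target attribute is a GCC/Clang extension and is also an assumption about the toolchain; with other compilers, remove it and enable AVX2 in the build settings.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not part of any Intel API.
   Expects `table` to hold at least 8 floats. */
__attribute__((target("avx2")))
void gather4_masked(float out[4], const float *table)
{
    /* Four 64-bit element indices into `table`, lane 0 first. */
    __m256i vindex = _mm256_setr_epi64x(7, 0, 3, 5);

    /* Values kept for lanes whose mask sign bit is clear. */
    __m128 src = _mm_setr_ps(-1.0f, -2.0f, -3.0f, -4.0f);

    /* Only bit 31 of each 32-bit mask lane is inspected:
       lanes 0 and 2 have it set, lanes 1 and 3 do not. */
    __m128 mask = _mm_castsi128_ps(
        _mm_setr_epi32(INT32_MIN, 0, -1, 0x7FFFFFFF));

    /* scale = 4 = sizeof(float), so each index counts floats, not bytes.
       The scale argument must be a compile-time constant. */
    __m128 r = _mm256_mask_i64gather_ps(src, table, vindex, mask, 4);

    _mm_storeu_ps(out, r);
}
```

With a table holding 0.5, 1.5, ..., 7.5, this should leave out = {7.5, -2.0, 3.5, -4.0}: lanes 0 and 2 come from table[7] and table[3], lanes 1 and 3 from "src". The code must only be run on a CPU that supports AVX2.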
Operation
FOR j := 0 to 3
	i := j*32
	m := j*64
	IF mask[i+31]
		addr := base_addr + vindex[m+63:m] * ZeroExtend64(scale) * 8
		dst[i+31:i] := MEM[addr+31:addr]
	ELSE
		dst[i+31:i] := src[i+31:i]
	FI
ENDFOR
mask[MAX:128] := 0
dst[MAX:128] := 0
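The pseudocode above can be mirrored by a plain scalar loop, which may help when checking results on a machine without AVX2. This is a minimal sketch, not Intel's implementation: the function name ref_mask_i64gather_ps is hypothetical, and the lanes are passed as ordinary arrays instead of __m128 and __m256i values. It assumes the trailing "* 8" in the pseudocode converts a byte offset into a bit offset for the MEM[addr+31:addr] notation, so the C code works with byte addresses and omits it. The final zeroing of "mask" and of the upper bits of "dst" has no counterpart here because those are register-level effects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the Operation pseudocode; not an Intel API.
   mask[j] holds the raw 32 bits of mask lane j. */
void ref_mask_i64gather_ps(float dst[4], const float src[4],
                           const float *base_addr,
                           const int64_t vindex[4],
                           const uint32_t mask[4], int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    for (int j = 0; j < 4; j++) {
        if (mask[j] & 0x80000000u) {
            /* Byte address = base_addr + index * scale. */
            const char *addr =
                (const char *)base_addr + vindex[j] * (int64_t)scale;
            /* memcpy avoids alignment assumptions on the computed address. */
            memcpy(&dst[j], addr, sizeof(float));
        } else {
            /* Sign bit clear: keep the element from src, no memory access. */
            dst[j] = src[j];
        }
    }
}
```

Calling it with scale 4, indices {7, 0, 3, 5}, and mask lanes {0x80000000, 0, 0xFFFFFFFF, 0x7FFFFFFF} gathers lanes 0 and 2 from elements 7 and 3 of the table and copies lanes 1 and 3 from "src".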