_mm_mask_4fmadd_ss
Classification
AVX-512, Arithmetic, CPUID Test: AVX512_4FMAPS
Header File
#include <immintrin.h>
Instruction
V4FMADDSS xmm {k}, xmm, m128
Synopsis
__m128 _mm_mask_4fmadd_ss(__m128 src, __mmask8 k, __m128 a0, __m128 a1, __m128 a2, __m128 a3, __m128 * b);
Description
Multiply the lower single-precision (32-bit) floating-point element of each of the 4 consecutive operands "a0" through "a3" by the corresponding element of the 4 packed elements in memory at "b", accumulate the products with the lower element in "src", and store the result in the lower element of "dst" using writemask "k" (the element is copied from "src" when mask bit 0 is not set). The upper 3 elements of "dst" are copied from "src".
Operation
dst[127:0] := src[127:0]
IF k[0]
    FOR m := 0 to 3
        addr := b + m * 32
        dst.fp32[0] := dst.fp32[0] + a{m}.fp32[0] * Cast_FP32(MEM[addr+31:addr])
    ENDFOR
FI
dst[MAX:128] := 0
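
The pseudocode above can be sketched as a portable scalar model in C, which may help when checking expected results on hardware without AVX512_4FMAPS. This is an illustrative sketch only: the `vec4f` type and the `ref_mask_4fmadd_ss` function are hypothetical names standing in for `__m128` and the intrinsic, and the model uses a plain multiply and add, so its rounding may differ from the instruction's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __m128: four packed 32-bit floats. */
typedef struct { float f[4]; } vec4f;

/* Scalar reference model of the Operation pseudocode.
   Hypothetical helper, not part of any Intel header. */
static vec4f ref_mask_4fmadd_ss(vec4f src, uint8_t k,
                                vec4f a0, vec4f a1, vec4f a2, vec4f a3,
                                const vec4f *b)
{
    assert(b != NULL);

    const vec4f a[4] = { a0, a1, a2, a3 };
    vec4f dst = src;                  /* dst[127:0] := src[127:0] */

    if (k & 1u) {                     /* only mask bit 0 is consulted */
        for (int m = 0; m < 4; ++m) {
            /* b->f[m] is the 32-bit element at bit offset m * 32 */
            dst.f[0] = dst.f[0] + a[m].f[0] * b->f[m];
        }
    }
    return dst;                       /* elements 1..3 still come from src */
}
```

With mask bit 0 set, element 0 of the result is the lower element of "src" plus the four products; with it clear, the result is "src" unchanged.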