_mm512_mask_4fnmadd_ps
Classification
AVX-512, Arithmetic, CPUID Test: AVX512_4FMAPS
Header File
#include <immintrin.h>
Instruction
V4FNMADDPS zmm {k}, zmm, m128
Synopsis
__m512 _mm512_mask_4fnmadd_ps(__m512 src, __mmask16 k, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b);
Description
Multiply packed single-precision (32-bit) floating-point elements in the 4 consecutive operands "a0" through "a3" by the 4 corresponding single-precision elements loaded from the 128-bit memory location "b", subtract each product from the corresponding element of "src" (a negated multiply-accumulate), and store the results in "dst" using writemask "k" (elements are copied from "src" when the corresponding mask bit is not set).
Operation
dst[511:0] := src[511:0]
FOR i := 0 to 15
    FOR m := 0 to 3
        addr := b + m * 32
        IF k[i]
            dst.fp32[i] := dst.fp32[i] - a{m}.fp32[i] * Cast_FP32(MEM[addr+31:addr])
        FI
    ENDFOR
ENDFOR
dst[MAX:512] := 0
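
Example
The operation above can be sketched as a portable scalar reference model in C, which may help when checking results on machines without AVX512_4FMAPS. This is only an illustrative sketch: the helper name ref_mask_4fnmadd_ps is hypothetical and not part of any Intel header, plain float arrays stand in for the __m512 and __m128 operands, and the separate multiply and subtract may round differently from the fused hardware instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference model (hypothetical helper, not an Intel API).
 * src, a0..a3, dst: 16 floats each, standing in for __m512 values.
 * b: 4 floats, standing in for the 128-bit memory operand.
 * k: 16-bit writemask; bit i controls lane i.
 * Assumption: unfused multiply and subtract, so rounding may differ
 * from the hardware instruction for inexact inputs. */
static void ref_mask_4fnmadd_ps(float dst[16], const float src[16], uint16_t k,
                                const float a0[16], const float a1[16],
                                const float a2[16], const float a3[16],
                                const float b[4])
{
    const float *a[4] = { a0, a1, a2, a3 };

    assert(b != NULL);

    for (int i = 0; i < 16; ++i) {
        float acc = src[i];              /* dst starts as a copy of src */
        if ((k >> i) & 1u) {             /* lane i is selected by the mask */
            for (int m = 0; m < 4; ++m)
                acc -= a[m][i] * b[m];   /* subtract a{m}[i] * b[m] */
        }
        dst[i] = acc;                    /* unselected lanes keep src[i] */
    }
}
```

With every lane of "src" set to 10, "a0" through "a3" set to 1, 2, 3 and 4, "b" set to four ones, and k = 0x0001, lane 0 of the result is 10 - (1 + 2 + 3 + 4) = 0, while lanes 1 through 15 keep the value 10 from "src".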