_mm512_mask_4fmadd_ps
Classification
AVX-512, Arithmetic, CPUID Test: AVX512_4FMAPS
Header File
immintrin.h
Instruction
V4FMADDPS zmm {k}, zmm, m128
Synopsis
__m512 _mm512_mask_4fmadd_ps(__m512 src, __mmask16 k, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b);
Description
Multiply packed single-precision (32-bit) floating-point elements specified in 4 consecutive operands "a0" through "a3" by the 4 corresponding packed elements in "b", accumulate with the corresponding elements in "src", and store the results in "dst" using writemask "k" (elements are copied from "src" when the corresponding mask bit is not set).
Operation
dst[511:0] := src[511:0]
FOR i := 0 to 15
	FOR m := 0 to 3
		addr := b + m * 32
		IF k[i]
			dst.fp32[i] := dst.fp32[i] + a{m}.fp32[i] * Cast_FP32(MEM[addr+31:addr])
		FI
	ENDFOR
ENDFOR
dst[MAX:512] := 0
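Example
The operation above can be modelled in portable scalar C, which may help when checking results on hardware without AVX512_4FMAPS. This is a minimal sketch under stated assumptions, not Intel's implementation: the function name ref_mask_4fmadd_ps and its array-based parameters are hypothetical stand-ins for the __m512 and __m128 operands, and each step uses a separate multiply and add, so results may differ in the last bit from the fused hardware instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference model of the Operation pseudocode.
 * Not part of immintrinsics; names and parameter layout are assumptions.
 *   dst, src, a0..a3 : 16 floats each (one 512-bit vector)
 *   k                : 16-bit writemask, bit i controls element i
 *   b                : 4 floats (the 128-bit memory operand)            */
void ref_mask_4fmadd_ps(float *dst, const float *src, uint16_t k,
                        const float *a0, const float *a1,
                        const float *a2, const float *a3,
                        const float *b)
{
    assert(dst != NULL && src != NULL && b != NULL);
    assert(a0 != NULL && a1 != NULL && a2 != NULL && a3 != NULL);

    const float *a[4] = { a0, a1, a2, a3 };

    for (size_t i = 0; i < 16; ++i) {
        float acc = src[i];              /* dst starts as a copy of src */
        if ((k >> i) & 1u) {
            /* Accumulate a{m}[i] * b[m] for m = 0..3, in order. */
            for (size_t m = 0; m < 4; ++m)
                acc += a[m][i] * b[m];
        }
        dst[i] = acc;                    /* masked-off lanes keep src[i] */
    }
}
```

With src filled with 10.0f, a0 through a2 filled with 1.0f, a3[i] = i, b = {1, 2, 3, 4}, and k = 0x00FF, the lower eight lanes receive 10 + 1 + 2 + 3 + 4*i, and the upper eight lanes keep the value from src.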