_mm512_maskz_4dpwssd_epi32
Classification
AVX-512, Arithmetic, CPUID Test: AVX512_4VNNIW
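The CPUID test named above can be sketched as a runtime check. This is a minimal sketch, assuming a GCC- or Clang-style toolchain that provides `<cpuid.h>` and `__get_cpuid_count`; the helper names `edx_has_avx512_4vnniw` and `cpu_has_avx512_4vnniw` are hypothetical and not part of any Intel header. AVX512_4VNNIW is reported in CPUID leaf 7, subleaf 0, EDX bit 2. The sketch checks only that feature bit; production code would normally also confirm AVX512F and operating-system support for ZMM state.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain) */
#endif

/* Hypothetical helper: test EDX bit 2 of CPUID leaf 7, subleaf 0. */
static int edx_has_avx512_4vnniw(uint32_t edx)
{
    return (int)((edx >> 2) & 1u);
}

/* Hypothetical helper: returns 1 if the CPU reports AVX512_4VNNIW, else 0.
 * Does not check AVX512F or OS support for ZMM state. */
static int cpu_has_avx512_4vnniw(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid_count returns 0 when leaf 7 is not supported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return edx_has_avx512_4vnniw(edx);
#else
    return 0;   /* not an x86 target, so the instruction cannot exist */
#endif
}
```

Code that calls the intrinsic would typically be guarded by a check like this one and fall back to another path when it returns 0.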
Header File
immintrin.h
Instruction
VP4DPWSSD zmm {z}, zmm, m128
Synopsis
__m512i _mm512_maskz_4dpwssd_epi32(__mmask16 k, __m512i src, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b);
Description
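Compute four sequential dot-product steps, one for each source block "a0" through "a3". In step m, each pair of adjacent signed 16-bit elements in source block "a{m}" is multiplied by the pair of signed 16-bit elements in the m-th 32-bit word of the 128-bit memory operand at "b", and the two 32-bit products are added to the corresponding 32-bit accumulator element, which is initialized from "src". Store the results in "dst" using zeromask "k" (elements are zeroed out when the corresponding mask bit is not set).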
Operation
dst[511:0] := src[511:0]
FOR i := 0 to 15
	IF k[i]
		FOR m := 0 to 3
			lim_base := b + m*32
			t.dword  := MEM[lim_base+31:lim_base]
			p1.dword := SignExtend32(a{m}.word[2*i+0]) * SignExtend32(Cast_Int16(t.word[0]))
			p2.dword := SignExtend32(a{m}.word[2*i+1]) * SignExtend32(Cast_Int16(t.word[1]))
			dst.dword[i] := dst.dword[i] + p1.dword + p2.dword
		ENDFOR
	ELSE
		dst.dword[i] := 0
	FI
ENDFOR
dst[MAX:512] := 0
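The operation above can be modeled in portable scalar C, which is useful for checking results on machines without AVX512_4VNNIW. This is a minimal sketch, not Intel's implementation: the types `vec512_i32` and `vec512_i16` and the function `ref_maskz_4dpwssd_epi32` are hypothetical names introduced for illustration, the memory operand is passed as eight `int16_t` values, and the accumulation is assumed to wrap modulo 2^32, since this is the non-saturating form of the instruction.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the 512-bit register views used above. */
typedef struct { int32_t dword[16]; } vec512_i32;  /* 16 x 32-bit elements */
typedef struct { int16_t word[32];  } vec512_i16;  /* 32 x 16-bit elements */

/* Scalar model of the pseudocode: a[0..3] play the role of a0..a3 and
 * b[0..7] holds the 128-bit memory operand as eight signed words. */
static vec512_i32 ref_maskz_4dpwssd_epi32(uint16_t k, vec512_i32 src,
                                          const vec512_i16 a[4],
                                          const int16_t b[8])
{
    vec512_i32 dst = src;                        /* dst[511:0] := src[511:0] */
    for (int i = 0; i < 16; i++) {
        if ((k >> i) & 1u) {
            /* Accumulate in unsigned arithmetic so overflow wraps
             * (assumed behavior) instead of being undefined in C. */
            uint32_t acc;
            memcpy(&acc, &dst.dword[i], sizeof acc);
            for (int m = 0; m < 4; m++) {
                /* Step m uses the m-th 32-bit word of b: words 2m, 2m+1. */
                int32_t p1 = (int32_t)a[m].word[2 * i]     * (int32_t)b[2 * m];
                int32_t p2 = (int32_t)a[m].word[2 * i + 1] * (int32_t)b[2 * m + 1];
                acc += (uint32_t)p1 + (uint32_t)p2;
            }
            memcpy(&dst.dword[i], &acc, sizeof acc);
        } else {
            dst.dword[i] = 0;                    /* zeromask: lane cleared */
        }
    }
    return dst;
}
```

For example, with every word of a[m] set to m+1, b = {1, 2, 3, 4, 5, 6, 7, 8}, every element of src set to 10, and k = 0x0001, lane 0 becomes 10 + 1*(1+2) + 2*(3+4) + 3*(5+6) + 4*(7+8) = 120, and every other lane is zeroed because its mask bit is clear.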