_mm_maskz_dpbusds_epi32
Classification
AVX-512, Arithmetic, CPUID Test: AVX512_VNNI + AVX512VL
Header File
immintrin.h
Instruction
VPDPBUSDS xmm {z}, xmm, xmm
Synopsis
__m128i _mm_maskz_dpbusds_epi32(__mmask8 k, __m128i src, __m128i a, __m128i b);
Description
Multiply groups of 4 adjacent pairs of unsigned 8-bit integers in "a" with corresponding signed 8-bit integers in "b", producing 4 intermediate signed 16-bit results. Sum these 4 results with the corresponding 32-bit integer in "src" using signed saturation, and store the packed 32-bit results in "dst" using zeromask "k" (elements are zeroed out when the corresponding mask bit is not set).
Operation
FOR j := 0 to 3
	IF k[j]
		tmp1.word := Signed(ZeroExtend16(a.byte[4*j]) * SignExtend16(b.byte[4*j]))
		tmp2.word := Signed(ZeroExtend16(a.byte[4*j+1]) * SignExtend16(b.byte[4*j+1]))
		tmp3.word := Signed(ZeroExtend16(a.byte[4*j+2]) * SignExtend16(b.byte[4*j+2]))
		tmp4.word := Signed(ZeroExtend16(a.byte[4*j+3]) * SignExtend16(b.byte[4*j+3]))
		dst.dword[j] := Saturate32(src.dword[j] + tmp1 + tmp2 + tmp3 + tmp4)
	ELSE
		dst.dword[j] := 0
	FI
ENDFOR
dst[MAX:128] := 0
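Example
The pseudocode above can be sketched as a portable scalar model, which may help when checking results on a machine without AVX512_VNNI. This is a minimal sketch, not the intrinsic itself: the names saturate32 and maskz_dpbusds_epi32_ref are hypothetical, and the 128-bit operands are assumed to be passed as plain arrays (four 32-bit lanes for "src" and "dst", sixteen bytes for "a" and "b").

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp a wide sum to the signed 32-bit range. */
static int32_t saturate32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Hypothetical scalar model of the zero-masked operation.
 * a holds unsigned bytes, b holds signed bytes, k supplies one mask
 * bit per 32-bit lane (only bits 0..3 are used). */
static void maskz_dpbusds_epi32_ref(int32_t dst[4], uint8_t k,
                                    const int32_t src[4],
                                    const uint8_t a[16],
                                    const int8_t b[16]) {
    assert(dst && src && a && b);
    for (int j = 0; j < 4; j++) {
        if ((k >> j) & 1) {
            /* Accumulate in 64 bits so the sum cannot wrap
             * before it is saturated. */
            int64_t sum = src[j];
            for (int i = 0; i < 4; i++) {
                /* Each product fits in a signed 16-bit value:
                 * the range is 255 * -128 .. 255 * 127. */
                sum += (int32_t)a[4 * j + i] * (int32_t)b[4 * j + i];
            }
            dst[j] = saturate32(sum);
        } else {
            dst[j] = 0; /* zeromask: lane cleared when k[j] is 0 */
        }
    }
}
```

Saturation is applied once, to the final 32-bit sum. The four byte products never overflow their 16-bit intermediates, so no clamping is needed at that step.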