_mm_dpbssds_epi32
Classification
AVX_ALL, Arithmetic, CPUID Test: AVX_VNNI_INT8
Header File
immintrin.h
Instruction
VPDPBSSDS xmm, xmm, xmm
Synopsis
__m128i _mm_dpbssds_epi32(__m128i __W, __m128i __A, __m128i __B);
Description
For each 32-bit lane, multiply the 4 adjacent signed 8-bit integers in "__A" by the corresponding signed 8-bit integers in "__B", producing 4 intermediate signed 16-bit products. Add these 4 products to the corresponding 32-bit integer in "__W" using signed saturation, and store the packed 32-bit results in "dst" (the return value).
Operation
FOR j := 0 to 3
    tmp1.word := SignExtend16(__A.byte[4*j]) * SignExtend16(__B.byte[4*j])
    tmp2.word := SignExtend16(__A.byte[4*j+1]) * SignExtend16(__B.byte[4*j+1])
    tmp3.word := SignExtend16(__A.byte[4*j+2]) * SignExtend16(__B.byte[4*j+2])
    tmp4.word := SignExtend16(__A.byte[4*j+3]) * SignExtend16(__B.byte[4*j+3])
    dst.dword[j] := SIGNED_DWORD_SATURATE(__W.dword[j] + tmp1 + tmp2 + tmp3 + tmp4)
ENDFOR
dst[MAX:128] := 0
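The pseudocode above can be sketched as a portable scalar model in C, which may help when checking results on hardware without AVX_VNNI_INT8. This is an illustration only, not the intrinsic itself: the names saturate_i32 and dpbssds_ref are hypothetical, and the sketch assumes the four products and the accumulator are summed at full precision and saturated once at the end, as the Operation section reads.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 64-bit value into the signed 32-bit range. */
int32_t saturate_i32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Hypothetical scalar model of one 128-bit operation.
   w:   4 signed 32-bit accumulators (the role of __W)
   a,b: 16 signed 8-bit values each (the roles of __A and __B)
   dst: 4 signed 32-bit results */
void dpbssds_ref(const int32_t w[4], const int8_t a[16],
                 const int8_t b[16], int32_t dst[4])
{
    for (int j = 0; j < 4; ++j) {
        /* Accumulate in 64 bits so the sum cannot overflow before clamping. */
        int64_t sum = w[j];
        for (int k = 0; k < 4; ++k) {
            /* Each product of two signed bytes fits in 16 bits. */
            sum += (int32_t)a[4 * j + k] * (int32_t)b[4 * j + k];
        }
        dst[j] = saturate_i32(sum);
    }
}
```

For example, with an accumulator of 10 and byte groups {1, 2, 3, 4} and {5, 6, 7, 8}, the lane result is 10 + 5 + 12 + 21 + 32 = 80. An accumulator already at INT32_MAX with positive products stays clamped at INT32_MAX.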