_mm_dp_pd
Classification
SSE_ALL, Arithmetic, CPUID Test: SSE4.1
Header File
smmintrin.h
Instruction
DPPD xmm, xmm, imm8
Synopsis
__m128d _mm_dp_pd(__m128d a, __m128d b, const int imm8);
Description
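Conditionally multiply the packed double-precision (64-bit) floating-point elements in "a" and "b" using bits [5:4] of "imm8", sum the two products, and conditionally store the sum in each element of "dst" using bits [1:0] of "imm8". A product whose mask bit is clear contributes 0.0 to the sum, and an element of "dst" whose mask bit is clear is set to 0.0.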
Operation
DEFINE DP(a[127:0], b[127:0], imm8[7:0]) {
	FOR j := 0 to 1
		i := j*64
		IF imm8[(4+j)%8]
			temp[i+63:i] := a[i+63:i] * b[i+63:i]
		ELSE
			temp[i+63:i] := 0.0
		FI
	ENDFOR
	
	sum[63:0] := temp[127:64] + temp[63:0]
	
	FOR j := 0 to 1
		i := j*64
		IF imm8[j%8]
			tmpdst[i+63:i] := sum[63:0]
		ELSE
			tmpdst[i+63:i] := 0.0
		FI
	ENDFOR
	RETURN tmpdst[127:0]
}
dst[127:0] := DP(a[127:0], b[127:0], imm8[7:0])
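The Operation pseudocode above can be sketched as a portable scalar model in C. This is only an illustration of the masking logic, not the intrinsic itself: the type vec2d and the function dp_pd_model are hypothetical names introduced here, with lane[0] standing for bits [63:0] and lane[1] for bits [127:64].

```c
#include <stdint.h>

/* Hypothetical stand-in for __m128d: lane[0] = bits [63:0], lane[1] = bits [127:64]. */
typedef struct { double lane[2]; } vec2d;

/* Hypothetical scalar model of the DP() pseudocode; not part of smmintrin.h. */
static vec2d dp_pd_model(vec2d a, vec2d b, uint8_t imm8)
{
    double temp[2];
    vec2d dst;

    /* Bits 4 and 5 of imm8 select which products enter the sum. */
    for (int j = 0; j < 2; j++) {
        temp[j] = ((imm8 >> (4 + j)) & 1) ? a.lane[j] * b.lane[j] : 0.0;
    }

    double sum = temp[1] + temp[0];

    /* Bits 0 and 1 of imm8 select which destination lanes receive the sum. */
    for (int j = 0; j < 2; j++) {
        dst.lane[j] = ((imm8 >> j) & 1) ? sum : 0.0;
    }
    return dst;
}
```

Under this model, imm8 = 0x31 multiplies both lanes and writes the sum only to the low lane, while imm8 = 0x33 broadcasts the same sum to both lanes. Bits 2, 3, 6 and 7 of imm8 are not read by the pseudocode and are likewise ignored here.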