_mm_dp_pd
Classification
SSE_ALL, Arithmetic, CPUID Test: SSE4.1
Header File
#include <smmintrin.h>
Instruction
DPPD xmm, xmm, imm8
Synopsis
__m128d _mm_dp_pd(__m128d a, __m128d b, const int imm8);
Description
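Conditionally multiply the packed double-precision (64-bit) floating-point elements in "a" and "b" using bits [5:4] of "imm8", sum the two products, and conditionally store the sum in each element of "dst" using bits [1:0] of "imm8". An element pair whose multiply bit is clear contributes 0.0 to the sum, and an element of "dst" whose store bit is clear is set to 0.0.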
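As an illustration of the mask, the following is a minimal sketch of a two-element dot product built on this intrinsic. The helper name dot2 is hypothetical, and the target attribute assumes GCC or Clang on x86; other compilers would need SSE4.1 enabled through their own options instead.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_dp_pd */

/* Hypothetical helper: dot product of two 2-element double vectors.
   The target attribute (GCC/Clang assumption) enables SSE4.1 for this
   function only, so no global -msse4.1 flag is needed. */
__attribute__((target("sse4.1")))
static double dot2(const double *x, const double *y)
{
    __m128d a = _mm_loadu_pd(x);  /* a = { x[0], x[1] } */
    __m128d b = _mm_loadu_pd(y);  /* b = { y[0], y[1] } */

    /* imm8 = 0x31:
       bits 5:4 = 11 -> multiply both element pairs
       bits 1:0 = 01 -> write the sum to element 0, zero element 1 */
    __m128d r = _mm_dp_pd(a, b, 0x31);

    return _mm_cvtsd_f64(r);      /* extract element 0 */
}
```

Calling dot2 on {1.0, 2.0} and {3.0, 4.0} should give 1*3 + 2*4 = 11.0. The immediate must be a compile-time constant, which is why it appears here as a literal.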
Operation
DEFINE DP(a[127:0], b[127:0], imm8[7:0]) {
	FOR j := 0 to 1
		i := j*64
		IF imm8[(4+j)%8]
			temp[i+63:i] := a[i+63:i] * b[i+63:i]
		ELSE
			temp[i+63:i] := 0.0
		FI
	ENDFOR
	sum[63:0] := temp[127:64] + temp[63:0]
	FOR j := 0 to 1
		i := j*64
		IF imm8[j%8]
			tmpdst[i+63:i] := sum[63:0]
		ELSE
			tmpdst[i+63:i] := 0.0
		FI
	ENDFOR
	RETURN tmpdst[127:0]
}
dst[127:0] := DP(a[127:0], b[127:0], imm8[7:0])
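The pseudocode above can be mirrored by a portable scalar model, which may help when checking a mask by hand or testing on a machine without SSE4.1. This is only a sketch: dp_pd_ref is a hypothetical name, arrays stand in for the two 64-bit lanes (index 0 is bits 63:0), and it models ordinary finite inputs only, not the instruction's exact NaN, denormal, or exception behavior.

```c
/* Hypothetical scalar model of DP(); not part of any Intel header.
   a, b, dst each hold two lanes: index 0 = bits 63:0, index 1 = bits 127:64. */
static void dp_pd_ref(const double a[2], const double b[2],
                      unsigned imm8, double dst[2])
{
    double temp[2];

    /* Bits 5:4 select which element pairs are multiplied. */
    for (int j = 0; j < 2; j++)
        temp[j] = ((imm8 >> (4 + j)) & 1u) ? a[j] * b[j] : 0.0;

    /* Sum in the same order as the pseudocode: high lane + low lane. */
    double sum = temp[1] + temp[0];

    /* Bits 1:0 select which destination elements receive the sum. */
    for (int j = 0; j < 2; j++)
        dst[j] = ((imm8 >> j) & 1u) ? sum : 0.0;
}
```

With a = {1.0, 2.0} and b = {3.0, 4.0}, imm8 = 0x31 yields dst = {11.0, 0.0}, imm8 = 0x13 yields {3.0, 3.0} (only the low pair is multiplied, and the sum is broadcast), and imm8 = 0x22 yields {0.0, 8.0}.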