_mm_maskz_cmul_round_sch
Classification
AVX-512, Arithmetic, CPUID Test: AVX512_FP16
Header File
#include <immintrin.h>
Instruction
VFCMULCSH xmm {z}, xmm, xmm {er}
Synopsis
__m128h _mm_maskz_cmul_round_sch(__mmask8 k, __m128h a, __m128h b, const int rounding);
Description
Multiply the lower complex number in "a" by the complex conjugate of the lower complex number in "b", and store the result in the lower two elements of "dst" using zeromask "k" (both elements are zeroed out when mask bit 0 is not set). Copy the upper 6 packed elements from "a" to the upper elements of "dst". Each complex number is composed of two adjacent half-precision (16-bit) floating-point elements, which define the complex number "complex = vec.fp16[0] + i * vec.fp16[1]" or its complex conjugate "conjugate = vec.fp16[0] - i * vec.fp16[1]".
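As a worked expansion of the product described above, the following writes out the conjugate multiplication term by term. The shorthand $a_0, a_1, b_0, b_1$ for "a.fp16[0]", "a.fp16[1]", "b.fp16[0]", "b.fp16[1]" is introduced here for brevity and is not part of the original notation.

```latex
a \cdot \overline{b}
  = (a_0 + i\,a_1)(b_0 - i\,b_1)
  = a_0 b_0 - i\,a_0 b_1 + i\,a_1 b_0 - i^2\,a_1 b_1
  = (a_0 b_0 + a_1 b_1) + i\,(a_1 b_0 - a_0 b_1)
```

The real part lands in element 0 of "dst" and the imaginary part in element 1.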
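Rounding is done according to the "rounding" parameter, which can be one of:
    (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) // round to nearest, and suppress exceptions
    (_MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC)     // round down, and suppress exceptions
    (_MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC)     // round up, and suppress exceptions
    (_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC)        // truncate, and suppress exceptions
    _MM_FROUND_CUR_DIRECTION                        // use MXCSR.RC; see _MM_SET_ROUNDING_MODE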
Operation
IF k[0]
    dst.fp16[0] := (a.fp16[0] * b.fp16[0]) + (a.fp16[1] * b.fp16[1])
    dst.fp16[1] := (a.fp16[1] * b.fp16[0]) - (a.fp16[0] * b.fp16[1])
ELSE
    dst.fp16[0] := 0
    dst.fp16[1] := 0
FI
dst[127:32] := a[127:32]
dst[MAX:128] := 0
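The Operation pseudocode can be mirrored by a small scalar model, shown below as a sketch only. It does not call the intrinsic: the type "vec8h_model" and the functions "maskz_cmul_sch_model" and "example_usage" are hypothetical names introduced for illustration. Lanes are held as float rather than fp16 so that the code builds without AVX512-FP16 support, and the "rounding" argument is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a __m128h: eight lanes, stored as float
   (not fp16) so this sketch compiles on any target. */
typedef struct { float fp16[8]; } vec8h_model;

/* Follows the Operation pseudocode: the lower complex result is
   a * conj(b) when mask bit 0 is set, otherwise zero; the upper six
   lanes are copied from a. Rounding control is not modeled here. */
static vec8h_model maskz_cmul_sch_model(uint8_t k, vec8h_model a, vec8h_model b)
{
    vec8h_model dst;
    if (k & 1u) {
        dst.fp16[0] = a.fp16[0] * b.fp16[0] + a.fp16[1] * b.fp16[1]; /* real part */
        dst.fp16[1] = a.fp16[1] * b.fp16[0] - a.fp16[0] * b.fp16[1]; /* imaginary part */
    } else {
        dst.fp16[0] = 0.0f;
        dst.fp16[1] = 0.0f;
    }
    for (int i = 2; i < 8; ++i) {
        dst.fp16[i] = a.fp16[i]; /* dst[127:32] := a[127:32] */
    }
    return dst;
}

/* Usage sketch: (1 + 2i) * conj(3 + 4i) = (1 + 2i)(3 - 4i) = 11 + 2i. */
static void example_usage(void)
{
    vec8h_model a = {{1.0f, 2.0f, 10.0f, 11.0f, 12.0f, 13.0f, 14.0f, 15.0f}};
    vec8h_model b = {{3.0f, 4.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f}};

    vec8h_model r = maskz_cmul_sch_model(0x1, a, b);
    assert(r.fp16[0] == 11.0f && r.fp16[1] == 2.0f);
    assert(r.fp16[2] == 10.0f && r.fp16[7] == 15.0f);

    /* With mask bit 0 clear, the lower pair is zeroed; upper lanes still come from a. */
    vec8h_model z = maskz_cmul_sch_model(0x0, a, b);
    assert(z.fp16[0] == 0.0f && z.fp16[1] == 0.0f && z.fp16[2] == 10.0f);
}
```

Because the model computes in single precision, its results can differ from the instruction for inputs whose products or sums are not exactly representable in fp16; the small integer values used here avoid that.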