_mm_reduce_round_sh
Classification
AVX-512, Special Math Functions, CPUID Test: AVX512_FP16
Header File
immintrin.h
Instruction
VREDUCESH xmm, xmm, xmm {sae}, imm8
Synopsis
__m128h _mm_reduce_round_sh(__m128h a, __m128h b, int imm8, const int sae);
Description
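Extract the reduced argument of the lower half-precision (16-bit) floating-point element in "b", store the result in the lower element of "dst", and copy the upper 7 packed elements from "a" to the upper elements of "dst". The reduced argument is the element minus its value rounded to the number of fraction bits given in "imm8[7:4]", using the rounding control in "imm8[3:0]". Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the "sae" parameter.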
Operation
DEFINE ReduceArgumentFP16(src[15:0], imm8[7:0]) {
	m[15:0] := FP16(imm8[7:4]) // number of fraction bits after the binary point to be preserved
	tmp[15:0] := POW(2.0, FP16(-m)) * ROUND(POW(2.0, FP16(m)) * src[15:0], imm8[3:0])
	tmp[15:0] := src[15:0] - tmp[15:0]
	IF IsInf(tmp[15:0])
		tmp[15:0] := FP16(0.0)
	FI
	RETURN tmp[15:0]
}
dst.fp16[0] := ReduceArgumentFP16(b.fp16[0], imm8)
dst[127:16] := a[127:16]
dst[MAX:128] := 0