FMLAL, FMLAL2 (by element)

Floating-point fused Multiply-Add Long to accumulator (by element). This instruction multiplies each half-precision vector element in the selected half of the first source SIMD&FP register by the indexed half-precision element of the second source SIMD&FP register, and accumulates the product into the corresponding single-precision vector element of the destination SIMD&FP register. FMLAL uses the lower half of the first source elements and FMLAL2 uses the upper half. The instruction does not round the result of the multiply before the accumulation.

A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see Floating-point exception traps.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state and Exception level, an attempt to execute the instruction might be trapped.

From Armv8.2, this is an optional instruction.

ID_AA64ISAR0_EL1.FHM indicates whether this instruction is supported.

It has encodings from 2 classes: FMLAL and FMLAL2

FMLAL
(FEAT_FHM)

Bit    31  30  29  28-24  23  22  21  20  19-16  15  14  13-12  11  10  9-5  4-0
Value  0   Q   0   01111  1   0   L   M   Rm     0   0   00     H   0   Rn   Rd
Field                         sz                     S

FMLAL <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.H[<index>]

if !HaveFP16MulNoRoundingToFP32Ext() then UNDEFINED;
integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt('0':Rm);    // Vm can only be in bottom 16 registers.
if sz == '1' then UNDEFINED;
integer index = UInt(H:L:M);
integer esize = 32;
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean sub_op = (S == '1');
integer part = 0;

FMLAL2
(FEAT_FHM)

Bit    31  30  29  28-24  23  22  21  20  19-16  15  14  13-12  11  10  9-5  4-0
Value  0   Q   1   01111  1   0   L   M   Rm     1   0   00     H   0   Rn   Rd
Field                         sz                     S

FMLAL2 <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.H[<index>]

if !HaveFP16MulNoRoundingToFP32Ext() then UNDEFINED;
integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt('0':Rm);    // Vm can only be in bottom 16 registers.
if sz == '1' then UNDEFINED;
integer index = UInt(H:L:M);
integer esize = 32;
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean sub_op = (S == '1');
integer part = 1;
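The decode steps for both classes can be sketched in Python as follows. This is an illustrative sketch only, not part of the architecture specification: the function name `decode_fmlal_by_element` and the dictionary layout are hypothetical, the feature check `HaveFP16MulNoRoundingToFP32Ext()` is omitted, and the bit positions are those given in the encoding diagrams (sz at bit 22, S at bit 14, with bits 29 and 15 distinguishing FMLAL from FMLAL2).

```python
def decode_fmlal_by_element(word):
    """Decode a 32-bit word as FMLAL or FMLAL2 (by element); hypothetical helper."""

    def field(hi, lo):
        # Extract bits hi..lo (inclusive) of the instruction word.
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    # Fixed bits common to both classes:
    # bit 31 = 0, bits 28..24 = 01111, bit 23 = 1, bits 13..12 = 00, bit 10 = 0.
    fixed_ok = (
        field(31, 31) == 0
        and field(28, 24) == 0b01111
        and field(23, 23) == 1
        and field(13, 12) == 0
        and field(10, 10) == 0
    )
    # Bits 29 and 15 are both 0 for FMLAL and both 1 for FMLAL2.
    if not fixed_ok or field(29, 29) != field(15, 15):
        raise ValueError("not an FMLAL/FMLAL2 (by element) encoding")
    if field(22, 22) == 1:
        raise ValueError("UNDEFINED: sz == '1'")

    q = field(30, 30)
    esize = 32
    datasize = 128 if q == 1 else 64
    return {
        "d": field(4, 0),
        "n": field(9, 5),
        "m": field(19, 16),  # UInt('0':Rm), so only V0 to V15 are reachable
        "index": (field(11, 11) << 2) | (field(21, 21) << 1) | field(20, 20),  # H:L:M
        "esize": esize,
        "datasize": datasize,
        "elements": datasize // esize,
        "sub_op": field(14, 14) == 1,  # S; the encodings on this page have S = 0
        "part": field(29, 29),  # 0 for FMLAL, 1 for FMLAL2
    }
```

Under these assumptions, `decode_fmlal_by_element(0x4FB20020)` gives d = 0, n = 1, m = 2, index = 3, datasize = 128, elements = 4 and part = 0, which corresponds to FMLAL V0.4S, V1.4H, V2.H[3].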

Assembler Symbols

<Vd>

Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Ta>

Is an arrangement specifier, encoded in the "Q" field:

Q   <Ta>
0   2S
1   4S

<Vn>

Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Tb>

Is an arrangement specifier, encoded in the "Q" field:

Q   <Tb>
0   2H
1   4H

<Vm>

Is the name of the second SIMD&FP source register, in the range V0 to V15, encoded in the "Rm" field.

<index>

Is the element index, in the range 0 to 7, encoded in the "H:L:M" fields.
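The mapping from assembler symbols to encoding fields can be sketched in Python as follows. This is an illustrative sketch, not an assembler: the function name `encode_fmlal_by_element` and its parameters are hypothetical, registers are passed as plain numbers, and the arrangement specifiers <Ta> and <Tb> are represented only by the Q value that selects them.

```python
def encode_fmlal_by_element(vd, vn, vm, index, q, second_half=False):
    """Build the instruction word from assembler-level operands (hypothetical helper).

    vd, vn: register numbers 0..31; vm: 0..15; index: 0..7;
    q: 0 for 2S/2H, 1 for 4S/4H; second_half: False for FMLAL, True for FMLAL2.
    """
    if not (0 <= vd <= 31 and 0 <= vn <= 31):
        raise ValueError("Vd and Vn must be in the range V0 to V31")
    if not 0 <= vm <= 15:
        raise ValueError("Vm must be in the range V0 to V15 (Rm is 4 bits)")
    if not 0 <= index <= 7:
        raise ValueError("index must be in the range 0 to 7 (H:L:M is 3 bits)")
    if q not in (0, 1):
        raise ValueError("Q must be 0 or 1")

    # Split the index into the H, L and M bits.
    h, l, m = (index >> 2) & 1, (index >> 1) & 1, index & 1
    part = 1 if second_half else 0

    word = 0x0F800000      # fixed bits: 28..24 = 01111, bit 23 = 1, sz = 0, S = 0
    word |= q << 30        # Q selects 2S/2H (0) or 4S/4H (1)
    word |= part << 29     # set for FMLAL2
    word |= l << 21
    word |= m << 20
    word |= vm << 16       # Rm
    word |= part << 15     # set for FMLAL2
    word |= h << 11
    word |= vn << 5        # Rn
    word |= vd             # Rd
    return word
```

For example, under these assumptions FMLAL V0.4S, V1.4H, V2.H[3] would be `encode_fmlal_by_element(0, 1, 2, 3, q=1)`, giving 0x4FB20020.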

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize DIV 2) operand1 = Vpart[n,part];
bits(128) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize DIV 2) element1;
bits(esize DIV 2) element2 = Elem[operand2, index, esize DIV 2];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize DIV 2];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAddH(Elem[operand3, e, esize], element1, element2, FPCR);
V[d] = result;
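The data flow of the operation can be approximated in Python as follows. This is a simplified sketch under several assumptions, not a model of FPMulAddH: registers are represented as lists of Python floats rather than bit vectors, the names `fmlal_by_element`, `as_half` and `as_single` are hypothetical, and Vpart[n, part] is read as selecting the lower (part 0) or upper (part 1) half of the source elements. Exact rational arithmetic stands in for the unrounded product. NaNs, infinities, signed zeros, FPCR rounding modes, flush-to-zero behaviour and floating-point exceptions are not modelled, and the final conversion goes through double precision, so rare double-rounding differences are possible.

```python
import struct
from fractions import Fraction


def as_half(x):
    # Round a finite Python float to the nearest half-precision value.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def as_single(x):
    # Round a finite Python float to the nearest single-precision value.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fmlal_by_element(vd, vn, vm, index, q=1, part=0, sub_op=False):
    """Approximate FMLAL (part=0) or FMLAL2 (part=1) on lists of finite floats.

    vd: single-precision accumulators (at least 2 or 4 entries, depending on q)
    vn: 8 half-precision elements of the first source register
    vm: 8 half-precision elements of the second source register
    """
    elements = 4 if q == 1 else 2            # datasize DIV esize, with esize = 32
    vn = [as_half(x) for x in vn]
    vm = [as_half(x) for x in vm]
    vd = [as_single(x) for x in vd]

    # Assumed reading of Vpart[n, part]: lower or upper half of the source elements.
    operand1 = vn[part * elements:(part + 1) * elements]
    element2 = vm[index]

    result = []
    for e in range(elements):
        element1 = -operand1[e] if sub_op else operand1[e]
        # The product is kept exact; only the final sum is rounded.
        exact = Fraction(vd[e]) + Fraction(element1) * Fraction(element2)
        result.append(as_single(float(exact)))
    return result
```

As an illustration of why the unrounded product matters: with an accumulator of -(1 + 2^-9) and both multiplicands equal to 1 + 2^-10, the exact product is 1 + 2^-9 + 2^-20, so the sketch returns 2^-20. If the product were first rounded to half precision it would become 1 + 2^-9 and the sum would be 0.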


Internal version only: isa v32.13, AdvSIMD v29.04, pseudocode morello-2022-01_rc2, capabilities morello-2022-01_rc2 ; Build timestamp: 2022-01-11T11:23

Copyright © 2010-2022 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.