FMA instruction set
Wikibooks has a book on the topic of: X86 Assembly/AVX, AVX2, FMA3, FMA4
The FMA instruction set is an extension to the 128- and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set to perform fused multiply–add (FMA) operations. There are two variants:
- FMA4 is supported in AMD processors starting with the Bulldozer architecture. FMA4 was performed in hardware before FMA3 was. Support for FMA4 has been removed since Zen 1.
- FMA3 is supported in AMD processors starting with the Piledriver architecture and Intel starting with Haswell processors and Broadwell processors since 2014.