Zen 3 architecture
7/2/2023

AMD Zen 3 CPUs vulnerable to Spectre-like attacks via PSF feature

US chipmaker AMD advised customers last week to disable a new performance feature if they plan to use CPUs for sensitive operations, as this feature is vulnerable to Spectre-like side-channel attacks.

Called Predictive Store Forwarding (PSF), this feature was added to AMD CPUs as part of the company's Zen 3 core architecture, a processor series dedicated to gaming and high-performance computing, which launched in November 2020.

The feature implements a technique called speculative execution, which works by running multiple alternative CPU operations in advance to make results available faster, and then discarding “predicted” data once deemed unneeded.

Below is AMD's high-level explanation for how PSF works under the hood:

It is common for a CPU to execute a load instruction to an address that was recently written by a store. Many modern processors implement a technique known as Store-To-Load-Forwarding (STLF) to improve performance in such cases. With STLF, data from the store is forwarded directly to the load without having to wait for it to be written to memory. In a typical CPU, STLF occurs after the address of both the load and store are calculated and determined to match.

PSF expands on this by speculating on the relationship between loads and stores without waiting for the address calculation to complete. With PSF, the CPU learns over time the relationship between loads and stores. If STLF typically occurs between a particular store and load, the CPU will remember this. When the CPU sees the store/load pair again, it may predict that STLF will occur and speculatively forward the data from the store to the load. This is done before confirming that the store and load are in fact to the same address.
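To make the difference between STLF and PSF more concrete, here is a small toy model in Python. It is only a sketch of the idea in AMD's description, not how the hardware is built: the `ToyPredictor` class, the `run_pair` function, the training threshold of 2 and the string labels are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPredictor:
    """Hypothetical predictor: counts how often a store/load pair really forwarded."""
    threshold: int = 2  # assumed number of matches before predicting (not an AMD figure)
    hits: dict = field(default_factory=dict)

    def predicts_forward(self, store_pc, load_pc):
        # Predict forwarding once the pair has matched `threshold` times in a row.
        return self.hits.get((store_pc, load_pc), 0) >= self.threshold

    def train(self, store_pc, load_pc, forwarded):
        key = (store_pc, load_pc)
        # A real match strengthens the prediction; a mismatch resets it.
        self.hits[key] = self.hits.get(key, 0) + 1 if forwarded else 0


def run_pair(pred, store_pc, load_pc, store_addr, load_addr):
    """Classify one store/load pair as seen by the toy model."""
    # PSF: the decision to forward is made BEFORE the addresses are compared.
    speculated = pred.predicts_forward(store_pc, load_pc)
    # STLF: the comparison that only becomes available after address calculation.
    match = store_addr == load_addr
    pred.train(store_pc, load_pc, match)
    if speculated:
        return "psf-correct" if match else "psf-mispredict"
    return "stlf" if match else "no-forward"


pred = ToyPredictor()
# Same store/load pair ("S", "L"): three times to the same address, then to different ones.
trace = [(0x1000, 0x1000)] * 3 + [(0x1000, 0x2000)]
outcomes = [run_pair(pred, "S", "L", s, l) for s, l in trace]
print(outcomes)
```

In this model the first two runs forward only after the addresses are known to match (plain STLF), the third is forwarded on prediction, and the fourth is forwarded on prediction even though the addresses differ. That last case, where data is forwarded before the mismatch is noticed, is the kind of wrong speculation the Spectre-like concern is about.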
I am trying to find information about the integer and floating-point functional units of AMD's Zen 3 architecture, as well as the issue time (minimum time between two operations) and latency of integer and floating-point (single and double precision) addition and multiplication.

I was using this link for the architecture, which starts at page 241 for Zen 3 details, and this link for the instruction info. I have also read up further on the instruction latencies.

The processor I have in mind is a Ryzen 7 5700X. I am not 100% sure the information I gathered is correct; here is what I gathered:

1. Functional units:
   4 integer ALU FUs (multiply/divide use only 1 of the 4), 2 branch units and 3 address generation units (can execute 6 integer instructions per clock cycle on average, as long as they are all of different types)
   6 floating-point FUs (including 2 multiply/addition and 2 further addition), 2 address generation units
2. Issue time and latency:
   Issue/latency of FADD: 元 or L6 (from uops, not sure where to get issue time)
   Issue/latency of FMUL: 元 or L6 (from uops, not sure where to get issue time)
   Unsure whether the data I gathered for floating point is single or double precision
   Simple integer instruction has throughput 4

I think my information in point 1 is correct. However, I am unable to confirm the latency in part 2, and I am also unable to find the issue time for these instructions.

I would like some help in verifying the information I gathered, and in how and where I can find the data I need for part 2. I have tried reading through both PDFs (Zen 3 section) and gathering data, but I am not confident that what I understand is correct and would like to request assistance in clearing up my misunderstandings.
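As a rough aid for part 2, the relationship between throughput, issue time and latency can be written out as a short Python sketch. It assumes the usual convention that issue time is the reciprocal of throughput, so "throughput 4" means one instruction every 1/4 cycle. The function names are made up for illustration, the model is idealized, and the latency of 1 cycle used in the example is only a placeholder, not a confirmed Zen 3 figure.

```python
from fractions import Fraction


def issue_time(throughput_per_cycle):
    """Issue time (reciprocal throughput) in cycles per instruction."""
    return 1 / Fraction(throughput_per_cycle)


def dependent_chain_cycles(n, latency):
    """n operations where each one needs the result of the previous one."""
    return n * latency


def independent_cycles(n, latency, issue):
    """n independent operations: a new one starts every `issue` cycles,
    and the last one still needs `latency` cycles to finish (idealized)."""
    return (n - 1) * issue + latency


issue = issue_time(4)   # "throughput 4" from the list above
latency = 1             # placeholder latency in cycles (assumption)

print("issue time:", issue)
print("8 dependent ops:", dependent_chain_cycles(8, latency))
print("8 independent ops:", independent_cycles(8, latency, issue))
```

The point of the sketch is that latency limits a chain of dependent operations, while issue time limits a stream of independent ones, which is why instruction tables usually list the two numbers separately.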