Replace uses of brw_builder::at() with various more descriptive
variants. Use block pointer from instruction when possible.
A couple of special cases remain and will be handled in separate patches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
Since brw_inst now has the block it belongs to, and the block can
reach the shader, the only information needed to create a
builder is the brw_inst itself.
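As a rough illustration of the idea, here is a minimal sketch of a builder
constructed from an instruction alone. All of the *_sketch types and their
fields are hypothetical stand-ins, not the real brw definitions, which carry
far more state.

```cpp
// Hypothetical stand-ins for the shader, block, and instruction types.
struct shader_sketch { int dispatch_width; };
struct block_sketch  { shader_sketch *shader; };
struct inst_sketch   { block_sketch *block; int exec_size; };

struct builder_sketch {
   shader_sketch *shader;
   block_sketch  *block;
   inst_sketch   *cursor;   // insertion point for newly built instructions

   // The instruction alone is enough: inst -> block -> shader.
   explicit builder_sketch(inst_sketch *inst)
      : shader(inst->block->shader), block(inst->block), cursor(inst) {}
};
```

A caller in this sketch would write builder_sketch bld(inst); without
passing the shader or the block separately.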
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
Our name for this enum was brw_message_target, but it's better known as
shared function ID or SFID. Call it brw_sfid to make it easier to find.
Now that brw only supports Gfx9+, we don't particularly care whether
SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5. Also, the LSC SFIDs
were confusingly tagged "GFX12" but aren't available on Gfx12.0; they
were introduced with Alchemist/Meteorlake.
GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing. It sounds
like the SFID to use for the sampler on Gfx6+, however it has nothing to
do with the sampler at all. BRW_SFID_SAMPLER remains the sampler SFID.
On Haswell, the main data cache data port ran out of room for new
messages, so two additional data ports were introduced. The modern
Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2. I think
the "sampler" name came from some idea about reorganizing messages that
never materialized (instead, the LSC came as a much larger cleanup).
Recently we've adopted the term "HDC" for the legacy data cluster, as
opposed to "LSC" for the modern Load/Store Cache. To make clear which
SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2.
We were also citing the G45, Sandybridge, and Ivybridge PRMs for a
compiler that supports none of those platforms. Cite modern docs.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>
Memory fences do not refer to an element of a binding table. Rather,
the reason we had "BTI" in these opcodes was to distinguish what in
modern terms are called UGM (untyped memory data cache) vs. SLM
(cross-thread shared local memory) fences.
Icelake and older platforms used the "data cache" SFID for both
purposes, distinguishing them by having a special binding table
index, 254, meaning "this is actually SLM access". This is where
the notion that fences had BTIs came in. (In fact, prior to Icelake,
separate SLM fences were not a thing, so BTI wasn't used there either.)
To avoid confusion about BTI being involved, we choose a simpler lie: we
have Icelake SLM fences target GFX12_SFID_SLM (like modern platforms
would), even though it didn't really exist back then. Later lowering
code sets it back to the correct Data Cache SFID with magic SLM binding
table index. This eliminates BTI everywhere and an unnecessary source.
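The lowering described above could look roughly like the following sketch.
The enum, its values, the struct, and the function name are all hypothetical
placeholders rather than the real brw identifiers or hardware encodings;
only the magic index 254 and the Icelake (Gfx11) special case come from the
description above.

```cpp
// Placeholder SFID names and values, not the real hardware encodings.
enum sfid_sketch { SFID_DATA_CACHE, SFID_SLM, SFID_UGM };

// Magic binding table index meaning "this is actually SLM access".
constexpr int SLM_BTI = 254;

struct fence_target {
   sfid_sketch sfid;
   bool use_slm_bti;   // true when the message must carry SLM_BTI
};

// ver is the graphics generation number; Icelake is Gfx11.
inline fence_target lower_fence_target(int ver, sfid_sketch ir_sfid)
{
   // The IR pretends Icelake has an SLM SFID. Undo that here by going
   // back to the data cache SFID with the magic SLM index.
   if (ver == 11 && ir_sfid == SFID_SLM)
      return { SFID_DATA_CACHE, true };

   // Everything else keeps the SFID chosen in the IR.
   return { ir_sfid, false };
}
```

Keeping the special case in one late lowering step is what lets the rest of
the IR ignore BTI for fences.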
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
brw_memory_fence() overrides the instructions generated by the
MEMORY_FENCE or INTERLOCK opcodes to be force_writemask_all with
exec_size == 1. But the IR was emitting it in SIMD8 (regardless
of dispatch width). Instead, just emit the IR as SIMD1/NoMask so
the IR matches what we actually generate. Have size_written indicate
that the entire destination is written, however, as it is ultimately
going to be a SEND that writes a whole register.
We were also using a UD register for the source of
FS_OPCODE_SCHEDULING_FENCE when the generator overrides it to UW,
so just specify UW in the IR as well so that they line up.
Also add validation that MEMORY_FENCE/INTERLOCK instructions have the
correct exec_size and masking in the IR.
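A minimal sketch of the shape of such a check follows. The struct, the
helper names, and the 32-byte register size are assumptions made for
illustration; the real validator operates on brw_inst, and the register
size depends on the platform.

```cpp
// Hypothetical, simplified view of a fence instruction in the IR.
struct fence_inst_sketch {
   unsigned exec_size;
   bool force_writemask_all;
   unsigned size_written;   // in bytes
};

// Assumption: one register is 32 bytes.
constexpr unsigned REG_SIZE_SKETCH = 32;

// Emit the fence as SIMD1/NoMask, but report a whole register as written,
// since the eventual SEND writes a full register.
inline fence_inst_sketch make_fence()
{
   return { 1, true, REG_SIZE_SKETCH };
}

// Validation covers only exec_size and masking, as described above.
inline bool fence_is_valid(const fence_inst_sketch &inst)
{
   return inst.exec_size == 1 && inst.force_writemask_all;
}
```

With this sketch, the old SIMD8 form (exec_size == 8) would fail
validation, while the value returned by make_fence() would pass.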
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>