The current code walks the instructions, and when needed,
it will scan to find the next "end of scope" and sometimes
the next "end of block". It also has a separate patching
logic for HALTs.
The new code collects the necessary scope information up front,
then walks the instruction backwards, making avoiding the need
to scan for the end of scope. It will also walk only the
relevant instructions that were previously collected. It also
replaces the previous HALT-specific patching logic.
With this new change, many cases that were jumping to
intermediate HALTs, will now jump straight to the end of
scope (or the "end of the program" section). E.g. in
```
if
...
(...) HALT
...
(...) HALT
endif
```
both HALTs now will jump to the end of the scope, instead of the
first HALT jumping into the second one.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38914>
Most of instructions follow the basic formats (1, 2 and 3 src), so
consolidate their emission code in generator.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
This is unused at the moment but the backend incorrectly assumes
immediate handles are for the binding table (therefore not bindless).
Some new CTS tests are using an immediate bindless handle which is
broken.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38359>
Even though some platforms support int64 they don't support indirect
movs with 64-bit values. Effectively this is only supported for non-LP
Gfx9.
This fixes various tests in dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.*.push_constant.*64*
on BMG.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
This works around a long-standing synchronization issue consequence of
the HALT instruction used to implement FS discard not being considered
a control flow instruction by the back-end -- The fact that it doesn't
cause the CFG pass to introduce an edge in the graph means that the
software scoreboard pass is completely blind to the effect of discard
jumps on control flow, so it doesn't introduce the required
annotations to avoid data hazards when the discard path of the CFG is
taken. Note that because of the very limited set of instructions that
can follow the HALT target in a fragment shader this was very unlikely
to lead to issues in practice, but starting on xe3 it appears to have
become far more likely due to the use of SENDG, since SENDG requires
the scalar register to be set prior to the submission of the render
target write payloads, which can easily lead to a WaR hazard if there
was another SENDG before the HALT jump that wasn't done reading out
its payload from the GRF.
In an ideal world this would be avoided by having HALT be a normal
control flow instruction represented as an edge in the control flow
graph -- But unfortunately that would prevent the optimizations we
currently do that take advantage of the ability of reordering code
past the HALT instruction, so it would have a pretty large performance
cost. Instead this simply adds a SYNC.ALLWR instruction after the
HALT target to guarantee that all pending SEND messages have finished
execution -- That may also seem costly, however its cost in practice
appears to be minimal since at the point of the program when the
target HALT is executed there is almost nothing left to do other than
send out the render target write payloads, so any pending operations
had to be waited on at roughly this point of the program regardless.
There appear to be no statistically significant regressions in Traci
on neither BMG nor PTL. Fixes hangs observed on Dying Light 2 and
Cyberpunk on PTL.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13896
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13965
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14092
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37674>
v2 (idr): So much rebasing. Deleted a bunch of code that we're not
going to need yet.
v3 (Ken): bfn inst encoding fix
v4 (idr): Add BFN to brw_get_lowered_simd_width.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
The Intel EU fusion feature needs to be disabled on SEND messages
where either the texture handle, sampler handle, sampler header is not
identical on fused threads.
This is the case in particular with accesses on non-uniform
texture/sampler handles but could also strike with dynamic
programmable offsets (currently disabled).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>
This keeps the directory structure a bit more organized:
- brw specific code
- elk specific code
- common NIR passes that could be used in both places
It also means that you can now 'git grep' in the brw directory without
finding a bunch of elk code, or having to "grep thing b*".
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
2025-10-09 07:01:47 +00:00
Renamed from src/intel/compiler/brw_generator.cpp (Browse further)