sfid is another field that is not preserved across brw_transform_inst_to_send(),
so we need to save it before the transform and restore it afterwards.
Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
Opcodes SHADER_OPCODE_INTERLOCK and SHADER_OPCODE_MEMORY_FENCE are emitted as
brw_send_inst and at nir to brw conversion the desc field is set with scope and
flush type of the instruction.
But when brw_inst is converted to brw_send_inst all special fields of
brw_send_inst are set to 0, causing scope and flush type to always be 0.
So call lower_lsc_memory_fence_and_interlock() with a brw_send_inst
parameter and save the desc before brw_transform_inst_to_send().
I still have not figured out why we need to do brw_transform_inst_to_send()
even if it is already a brw_send_inst, but not doing so causes a segfault in
foreach_block_and_inst_safe(block, brw_inst, inst, s.cfg) of
brw_lower_logical_sends(). Other opcodes in that function do something
similar, so I don't think that is wrong.
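As a rough illustration of the save/restore pattern described above, here is a
minimal standalone sketch. The toy_send_inst struct and the toy_* functions are
hypothetical stand-ins invented for this example; they are not the real
brw_send_inst or brw_transform_inst_to_send(), and only model the behaviour the
message describes (the transform zeroing the send-specific fields).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the send-specific fields of an instruction.
// The names mirror the commit message only; the real layout differs.
struct toy_send_inst {
   uint32_t desc = 0;  // message descriptor (e.g. scope and flush type)
   uint32_t sfid = 0;  // shared function ID
};

// Toy model of a transform that resets every send-specific field to 0,
// which is the behaviour the commit message attributes to the real one.
static void toy_transform_to_send(toy_send_inst &inst)
{
   inst = toy_send_inst{};
}

// Save the fields that must survive, run the transform, then restore them.
static void toy_lower_fence(toy_send_inst &inst)
{
   const uint32_t saved_desc = inst.desc;
   const uint32_t saved_sfid = inst.sfid;

   toy_transform_to_send(inst);

   inst.desc = saved_desc;
   inst.sfid = saved_sfid;
}
```

Without the save and restore around the transform, desc and sfid would both
read back as 0 in this toy model.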
Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
This works around a long-standing synchronization issue that is a
consequence of the HALT instruction used to implement FS discard not
being considered a control flow instruction by the back-end -- The
fact that it doesn't
cause the CFG pass to introduce an edge in the graph means that the
software scoreboard pass is completely blind to the effect of discard
jumps on control flow, so it doesn't introduce the required
annotations to avoid data hazards when the discard path of the CFG is
taken. Note that because of the very limited set of instructions that
can follow the HALT target in a fragment shader this was very unlikely
to lead to issues in practice, but starting on xe3 it appears to have
become far more likely due to the use of SENDG, since SENDG requires
the scalar register to be set prior to the submission of the render
target write payloads, which can easily lead to a WaR hazard if there
was another SENDG before the HALT jump that wasn't done reading out
its payload from the GRF.
In an ideal world this would be avoided by having HALT be a normal
control flow instruction represented as an edge in the control flow
graph -- But unfortunately that would prevent the optimizations we
currently do that take advantage of the ability of reordering code
past the HALT instruction, so it would have a pretty large performance
cost. Instead this simply adds a SYNC.ALLWR instruction after the
HALT target to guarantee that all pending SEND messages have finished
execution -- That may also seem costly, however its cost in practice
appears to be minimal since at the point of the program when the
target HALT is executed there is almost nothing left to do other than
send out the render target write payloads, so any pending operations
had to be waited on at roughly this point of the program regardless.
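The shape of the workaround can be sketched on a toy instruction list. The
toy_op enum and add_sync_after_halt_target() are hypothetical names made up for
this illustration; the real pass operates on the brw IR and is not shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical opcodes for illustration; not the real brw opcode enum.
enum class toy_op { SEND, HALT, HALT_TARGET, SYNC_ALLWR, RT_WRITE };

// Emit a SYNC.ALLWR right after every HALT target, so that all pending
// SEND messages are waited on before the instructions that follow it.
static std::vector<toy_op>
add_sync_after_halt_target(const std::vector<toy_op> &prog)
{
   std::vector<toy_op> out;
   out.reserve(prog.size() + 1);

   for (toy_op op : prog) {
      out.push_back(op);
      if (op == toy_op::HALT_TARGET)
         out.push_back(toy_op::SYNC_ALLWR);
   }
   return out;
}
```

Because the sync sits after the HALT target rather than at each HALT, code can
still be reordered past the HALT itself, which is the property the message
wants to keep.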
There appear to be no statistically significant regressions in Traci
on either BMG or PTL. Fixes hangs observed on Dying Light 2 and
Cyberpunk on PTL.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13896
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13965
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14092
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37674>
By dynamically setting num_channel we can share more code in
lower_lsc_varying_pull_constant_logical_send().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37853>
cmod propagation needs more work. Since the result type is always UD,
BRW_CONDITION_G should be able to substitute for NZ. Either that or
users of the condition could be rewritten to use an inverted condition.
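The reasoning behind the G/NZ substitution is plain unsigned arithmetic, which
the sketch below spells out. The function names are hypothetical and only model
the comparisons against zero; they are not part of the cmod propagation pass.

```cpp
#include <cassert>
#include <cstdint>

// For an unsigned 32-bit (UD) value, "greater than zero" and "not zero"
// accept exactly the same set of values.
static bool cond_g_ud(uint32_t x)  { return x > 0u; }
static bool cond_nz_ud(uint32_t x) { return x != 0u; }

// The same does not hold for a signed (D) value: a negative value is
// NZ but not G, so the substitution is only valid for unsigned results.
static bool cond_g_d(int32_t x)  { return x > 0; }
static bool cond_nz_d(int32_t x) { return x != 0; }
```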
v2: Add a couple more unit tests. Suggested by Matt.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
v2 (idr): So much rebasing. Deleted a bunch of code that we're not
going to need yet.
v3 (Ken): bfn inst encoding fix
v4 (idr): Add BFN to brw_get_lowered_simd_width.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
The Intel EU fusion feature needs to be disabled on SEND messages
where the texture handle, sampler handle, or sampler header is not
identical across fused threads.
This is the case in particular with accesses on non-uniform
texture/sampler handles but could also strike with dynamic
programmable offsets (currently disabled).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>
mesh doesn't use brw_vue_prog_data. Also, I had been catching TCS
shaders here, and shouldn't.
Fixes: bf76e86bc8 ("brw: Refactor clip/cull distance mask setting into a helper")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37809>
This is nearly identical, except for bindless sampler/texture/image
handling. But we only use it for inputs/outputs, not uniforms, where
there are no bindless handles to worry about.
Deletes a lot of mostly-duplicated code.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37784>
There are no 64-bit renderable formats so we can't have FS outputs that
are dvecs. This dates back to 2016 and a ton of the backend has been
rewritten, so I think whatever this was trying to solve is no longer a
problem.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37784>
replace is preferred when appropriate & should be faster. after is when
you use the result in your lowering itself.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37753>
This keeps the directory structure a bit more organized:
- brw specific code
- elk specific code
- common NIR passes that could be used in both places
It also means that you can now 'git grep' in the brw directory without
finding a bunch of elk code, or having to "grep thing b*".
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
These are identical and are just hardware enum values, not related to
the structure of the backend compiler.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
We were compiling these twice, one for brw, one for elk. There's no
reason to do that, just compile the common code once and link against it
in both backends.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
This is from the pre-NIR era where we used GLSL IR expression opcodes
directly. We haven't done that in years.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
We want to be able to emit load_reloc_const_intel intrinsics from common
NIR passes (such as printf lowering). In order to do that, we need to
have the enum with the meaning of values in common code. Once you have
that, it's easy to see the (identical) data structures as a way for the
driver to communicate about relocations, rather than a compiler backend
specific thing. So we move it all up to common code, and re-unify.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
Using the '--outdir' argument in Python scripts makes it very
complicated for tools like ninja-to-soong to generate the Android
equivalent build file.
This is because the option makes it less clear what will be generated.
Instead, change it to '--out', where we give the full path of the file
to generate. This has the added benefit of keeping the file name in
only one place, 'meson.build'.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37741>
The bindless sampler heap was introduced in Gfx11 which ELK doesn't
support.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37692>
It's always set to a fixed value and not used in many places. Use the
value directly where it's needed.
Suggested-by: Lucas Fryzek <lfryzek@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37648>
With EXT_shader_object, it became possible to compile shaders
independently and then use them together later, so we cannot rely on the
lack of task shader data to decide that no task shader will be used. The
flag VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT exists for that purpose,
but it doesn't really make any difference for us. Always assume that if
the mesh shader is reading the task payload, it's going to be used with
one, as otherwise the application is doing it wrong.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13983
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37648>