The Intel Gfx12.x generation of GPUs has an architecture feature called
EU fusion, in which 2 subgroups run in lock step. A typical case where
this happens is a compute shader with a 1x1x1 local workgroup size and a
dispatch command of 2x1x1. In that case 2 threads will be run in lock
step, one for each of the workgroups.
This has been a source of trouble in the backend because one
subgroup can run with all lanes disabled, requiring care for SEND
messages using the NoMask flag (execution regardless of the lane mask).
We found out that other things happen when 2 subgroups run
together:
- the HW will use the surface/sampler handle from only one subgroup
- the HW will use the sampler header from only one subgroup
So one of the fused subgroups can access the wrong surface/sampler if
the value differs between the 2 subgroups, and that can happen
even with subgroup-uniform values.
Fortunately we can flag SEND instructions to disable the fusion
behavior (most likely at a performance cost).
This change introduces a new divergence mode that tries to compute
which values are divergent between subgroups so that we can flag
instructions accordingly.
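
To illustrate why a subgroup-uniform value is not enough, here is a small
Python model of the situation described above. It is only a sketch: the
function names are hypothetical and it does not reflect the actual
divergence analysis code, only the property the analysis has to detect.

```python
def is_subgroup_uniform(lanes):
    """True if every lane of a single subgroup holds the same value."""
    return len(set(lanes)) <= 1


def is_fused_uniform(subgroup_a, subgroup_b):
    """True only if both fused subgroups agree on one single value."""
    return len(set(subgroup_a) | set(subgroup_b)) <= 1


def needs_fusion_disabled(subgroup_a, subgroup_b):
    """Hypothetical predicate: the SEND must be flagged when the
    handle differs anywhere across the fused pair."""
    return not is_fused_uniform(subgroup_a, subgroup_b)


# Each subgroup is uniform on its own, but the pair disagrees, so the
# hardware would pick the handle of only one of them.
handles_a = [7] * 8
handles_b = [9] * 8
print(is_subgroup_uniform(handles_a), is_subgroup_uniform(handles_b))
print(needs_fusion_disabled(handles_a, handles_b))
```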
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>
nir_recompute_io_bases will modify i/o intrinsics, which is not the
expected behaviour when the keep_intrinsics flag is set.
Fixes: 83aecc8f3f ("mesa/st, nir: commonize unlower_io_to_vars pass")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37725>
mesh doesn't use brw_vue_prog_data. Also, I had been catching TCS
shaders here, and shouldn't.
Fixes: bf76e86bc8 ("brw: Refactor clip/cull distance mask setting into a helper")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37809>
When comparing a vec3 and a vec4 array, the scalar type is the same for
both (float). Instead, use the array element type to compare (that is,
vec3 vs vec4).
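
The difference between the two comparisons can be modelled in a few lines
of Python. This is a sketch with hypothetical types (`VecType`,
`ArrayType`), not the GLSL compiler's own type classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VecType:
    base: str        # e.g. "float"
    components: int  # 3 for vec3, 4 for vec4


@dataclass(frozen=True)
class ArrayType:
    element: VecType
    length: int


def scalar_type(t):
    """Strip arrays and vector width, leaving only the base scalar."""
    elem = t.element if isinstance(t, ArrayType) else t
    return VecType(elem.base, 1)


def element_type(t):
    """Type of one array element, vector width included."""
    return t.element


vec3_array = ArrayType(VecType("float", 3), 2)
vec4_array = ArrayType(VecType("float", 4), 2)

# Scalar types match (float vs float), so this check misses the mismatch.
print(scalar_type(vec3_array) == scalar_type(vec4_array))
# Element types differ (vec3 vs vec4), so the conversion is rejected.
print(element_type(vec3_array) == element_type(vec4_array))
```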
Fixes the
spec@glsl-1.20@compiler@invalid-vec4-array-to-vec3-array-conversion.vert
piglit test.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37783>
Sometimes we need to select between ishr/ushr based on some condition;
this builder makes this less verbose.
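
As a reminder of what is being selected between, here is a Python model
of the two 32-bit shifts and of a helper that picks one. The name
`shr_select` is hypothetical; the real builder emits NIR instructions
rather than computing values.

```python
MASK32 = 0xFFFFFFFF


def ushr(x, n):
    """Logical shift right: zeros are shifted in from the top."""
    return (x & MASK32) >> n


def ishr(x, n):
    """Arithmetic shift right: the sign bit is replicated."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32  # reinterpret the 32-bit pattern as signed
    return (x >> n) & MASK32


def shr_select(x, n, is_signed):
    """Hypothetical helper: pick ishr or ushr from a single condition."""
    return ishr(x, n) if is_signed else ushr(x, n)


print(hex(shr_select(0x80000000, 4, True)))
print(hex(shr_select(0x80000000, 4, False)))
```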
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37793>
Coercing the argument to a bool when we have __builtin_expect but
leaving it unmodified otherwise is a recipe for really subtle bugs. I
don't know if any bugs like that exist currently, but I almost
introduced one in panfrost.
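
The hazard can be modelled in Python. This sketch assumes the two macro
variants look like `__builtin_expect(!!(x), 1)` and a plain `(x)`
fallback; those exact definitions are an assumption and are not quoted
from the change.

```python
def likely_with_builtin(x):
    """Models `__builtin_expect(!!(x), 1)`: the value collapses to 0 or 1."""
    return int(bool(x))


def likely_fallback_unmodified(x):
    """Models a fallback that expands to plain `(x)`."""
    return x


def likely_fallback_coerced(x):
    """Models a fallback that also coerces, matching the builtin path."""
    return int(bool(x))


flags = 0x4

# Code that uses the result as a value behaves differently per compiler.
print(likely_with_builtin(flags & 0x4))
print(likely_fallback_unmodified(flags & 0x4))
print(likely_fallback_coerced(flags & 0x4))
```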
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37801>
We need to return true if we need the companion batch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e60416b4e4 ("anv: use companion batch for operations with HIZ/STC_CCS destination")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37797>
This is nearly identical, except for bindless sampler/texture/image
handling. But we only use it for inputs/outputs, not uniforms, where
there are no bindless handles to worry about.
Deletes a lot of mostly-duplicated code.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37784>
There are no 64-bit renderable formats so we can't have FS outputs that
are dvecs. This dates back to 2016 and a ton of the backend has been
rewritten, so I think whatever this was trying to solve is no longer a
problem.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37784>
All vtn_bindgen2-generated files use the same 'vtn_bindgen_dummy' struct
name. When linking more than one file (like in panfrost), the
constructor and destructor symbols collide and every instance ends up
running the same initialization. In panfrost, this results in us
dropping any printf format strings that don't occur in v6.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: b7447a94c8 ("vtn: add vtn_bindgen2 tool")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37798>
When you're trying to figure out what shader some NIR pass broke, use
nir_shader_bisect_select() to decide between NIR pass behaviors, and then
nir_shader_bisect.py will help you automatically bisect down to which
source_blake3 is at fault. Once it's identified, it prints a C call
that selects that shader specifically, so you can continue debugging
from there.
On a test I was looking at, this took 10 steps to bisect 134 shaders down
to the source_blake3 of the NIR shader in question.
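
The core of such a bisection can be sketched as follows. This is not the
code of nir_shader_bisect.py: `bisect_shaders` and `fails_with` are
hypothetical names, and the sketch assumes exactly one shader is at fault
and that the failure reproduces whenever that shader gets the new
behavior.

```python
def bisect_shaders(hashes, fails_with):
    """Return (culprit, steps).

    fails_with(subset) must return True when the failure reproduces
    with the new pass behavior applied only to the hashes in subset.
    """
    candidates = sorted(hashes)
    steps = 0
    while len(candidates) > 1:
        mid = len(candidates) // 2
        lower = candidates[:mid]
        steps += 1
        if fails_with(set(lower)):
            candidates = lower            # culprit is in the lower half
        else:
            candidates = candidates[mid:]  # otherwise it is in the rest
    return candidates[0], steps


# Stand-in hashes; a real run would use the shaders' source_blake3 values.
hashes = [f"{i:08x}" for i in range(134)]
bad = hashes[77]
culprit, steps = bisect_shaders(hashes, lambda subset: bad in subset)
print(culprit == bad, steps)
```

Each step halves the candidate set, which is why a few runs are enough
even for a large number of shaders.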
This idea is heavily lifted from Job Noorman's ir3_shader_bisect.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37468>
When there are only blend mode updates (e.g. CB_BLEND_EQUATIONS not
covered by the fs_user_dirty check), we have to set dcd0_dirty for the
relevant CB updates. Otherwise, we might fail to clear FPK. On the
other hand, this also optimizes the reverse mutation by setting FPK, so
that new draws that no longer depend on the previous tile buffer can
benefit from FPK.
Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37760>
Adding vk_enum_defines.h to idep_vulkan_util_headers to help
ninja-to-soong generate correct rules for the Android build system.
Without it, ninja-to-soong is not able to figure out that this file is
needed by targets depending on idep_vulkan_util_headers, leading to
build errors with the file missing.
Ref #14072
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37789>
replace is preferred when appropriate and should be faster. after is for
when you use the result in your lowering itself.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37753>
This state combination wedges something in the GPU, causing a hang.
Forcing A6XX_LATE_Z prevents it. The proprietary driver does the same.
CC: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37765>
This script didn't have tests defined. Now the objects in this tool have
tests to verify their functionality, as does the core method that runs the
whole process to produce a sorted list of the marge queue.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
As proposed in !37454, we can benefit from the `rich` Python module to
simplify output formatting. Here, only the link printed to the console
needs to be adjusted. There is a side effect, which looks OK: some
coloring in the timestamps and the MR ids listed.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
This marge_tool wasn't yet described in the documentation. It has links to the
resource utilization, and it is a satellite tool for crnm.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
The output line with the MR link only works in some consoles; it can be
interesting for some developers to have visibility of the MR id.
It can be useful, too, to have some sort of a header showing the fields
printed from each merge request in the queue.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
For testing and an eventual use of tenacity, it is practical to encapsulate
calls to the GitLab module in methods.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
Use the parameter retry_transient_errors on the GitLab object creation to
protect the script from transient errors that can be well handled.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
Rebuild the information gathering about the marge queue and how the
information is later presented to the user.
The queue provided to the user is sorted, so the user knows what will be
merged first (when the corresponding merge request pipeline succeeds).
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
Enhancement of the module with two structures that can encapsulate
functionalities and establish links between data collected.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
Instead of printing an exception on the screen when the process is interrupted
from the keyboard, handle it and print a more friendly message.
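
A minimal sketch of this kind of handling, assuming a hypothetical
`run_cli` wrapper around the tool's entry point (the message text and
the exit code 130 are illustrative choices, not taken from the script):

```python
import sys


def run_cli(main):
    """Run main() and turn Ctrl-C into a short message and an exit code."""
    try:
        main()
    except KeyboardInterrupt:
        # No traceback: just tell the user what happened.
        print("Interrupted by the user, exiting.", file=sys.stderr)
        return 130  # conventional exit status for SIGINT (128 + 2)
    return 0


def interrupted():
    """Stand-in for a main() that gets interrupted from the keyboard."""
    raise KeyboardInterrupt


print(run_cli(interrupted))
```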
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
Encapsulate the procedure in a method that can be imported from another tool
or even a python console.
Also include a type hint fix.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37395>
This keeps the directory structure a bit more organized:
- brw specific code
- elk specific code
- common NIR passes that could be used in both places
It also means that you can now 'git grep' in the brw directory without
finding a bunch of elk code, or having to "grep thing b*".
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>