This will save us the trouble of faking constant folding for the BVH level and
trace ray control values when we lower this intrinsic in the new backends.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42006>
Add a script to run whatever you want under drm-shim given only a driver name,
CI job name or GPU model, plus the option to dump assembly with a common option.
This lets people debugging common code easily run shader-db or whatever with
whatever they want without needing to look up a million driver specific
options/paths/etc.
Must run inside a meson devenv. Example usage (path symlinked):
    drm-shim --disasm glk ./run shaders/glmark/1-1.shader_test
    drm-shim --disasm asahi ./run shaders/glmark/1-1.shader_test
    drm-shim --disasm panfrost-t860 ./run shaders/glmark/1-1.shader_test
    drm-shim --disasm zink-radv-navi31-valve ./run shaders/glmark/1-1.shader_test
Makes for a fun Compiler Explorer-like tool too.
Reduces amount of docs needed for https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41959
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42055>
We don't use anything from that header. We call
nir_format_pack_r9g9b9e5(), which comes from nir_format_convert.h,
which we already include.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41976>
The key is only used inside that file. Make it like we do with the
keys in blorp_clear.c.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41976>
When you use designated initializers, anything that is not explicitly
set is set to zero. When you do something like:
    struct blorp_blit_prog_key key = {
       .base = BLORP_BASE_KEY_INIT(BLORP_SHADER_TYPE_BLIT),
       .base.shader_pipeline = BLORP_SHADER_PIPELINE_RENDER,
    };
the second initialization is the only one that does something: it sets
shader_pipeline to the desired value, and all the other fields in
"base" are set to 0. This is easily verifiable by just examining the
contents of all the blorp keys we initialize this way: name and
shader_type are always zero.
This means that if two blorp shaders of different types have the
same key size, the shader cache could confuse them. Still, I don't
think this is happening in the real world.
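One way to sidestep the override described above, sketched with hypothetical stand-in types (`base_key`, `blit_key`, `BASE_KEY_INIT` are illustrative names, not the real blorp definitions, and this is not necessarily the fix applied here): initialize the base key exactly once, then set the extra field with a plain assignment.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the blorp key structs; not the real layout. */
enum shader_type { SHADER_TYPE_NONE = 0, SHADER_TYPE_BLIT = 1 };
enum shader_pipeline { PIPELINE_NONE = 0, PIPELINE_RENDER = 1 };

struct base_key {
   char name[8];
   enum shader_type shader_type;
   enum shader_pipeline shader_pipeline;
};

struct blit_key {
   struct base_key base;
   int other_field;
};

/* Illustrative equivalent of a "base key init" macro. */
#define BASE_KEY_INIT(type) \
   (struct base_key) { .name = "blorp", .shader_type = (type) }

static struct blit_key
make_blit_key(void)
{
   /* Initialize .base exactly once, with no second designator into it. */
   struct blit_key key = {
      .base = BASE_KEY_INIT(SHADER_TYPE_BLIT),
   };

   /* Set the remaining field afterwards with a plain assignment, so the
    * name and shader_type written by the macro are kept. */
   key.base.shader_pipeline = PIPELINE_RENDER;
   return key;
}
```

With this shape, name, shader_type and shader_pipeline are all non-zero in the resulting key, and fields that were never mentioned (other_field here) are still zero-initialized.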
Fixes: 22ecb4a10f ("intel/blorp: Support compute for slow clears")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11690
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41976>
If we fail to compile a kernel, don't fail silently: call mesa_loge()
so we can at least know it happened. On debug builds, just assert(),
so if it ever happens in CI, we'll know.
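The log-then-assert pattern can be sketched roughly as follows. Everything here is a stand-in: `log_error()` plays the role of a logging helper such as mesa_loge(), and `compile_kernel()` is a hypothetical compile step, not the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for a logging helper; this one just writes to stderr. */
static void
log_error(const char *msg)
{
   fprintf(stderr, "error: %s\n", msg);
}

/* Hypothetical compile step: returns false on failure. */
static bool
compile_kernel(const char *source)
{
   return source != NULL && source[0] != '\0';
}

static bool
compile_kernel_checked(const char *source)
{
   bool ok = compile_kernel(source);
   if (!ok) {
      log_error("failed to compile kernel");
      /* assert() is compiled out when NDEBUG is defined, so release
       * builds only log while debug builds stop right here. */
      assert(!"kernel compilation failed");
   }
   return ok;
}
```

In a release build the caller still gets `false` back and can handle the failure; in a debug build the assert makes the failure impossible to miss.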
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41976>
mutex.get_mut() says:
    Since this call borrows the Mutex mutably, no actual locking
    needs to take place – the mutable borrow statically guarantees
    no new locks can be acquired while this reference exists.
However, the borrow checker does not really apply inside unsafe FFI
functions, which can result in unintended concurrent access.
Bug: b/519657682
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42061>
A future patch will add more parameters to fill_inline_param(), so let's reduce
the number of parameters by passing a struct to this function instead.
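As a generic illustration of the refactor (the struct and field names below are hypothetical and do not reflect the real fill_inline_param() signature), bundling arguments into a struct lets later patches add fields without touching every call site:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter bundle; the field names are illustrative only. */
struct fill_params {
   uint32_t *dst;
   uint32_t dst_count;
   uint32_t value;
   uint32_t offset;
};

/* Writes `value` into dst[offset..dst_count) and returns the number of
 * words written. New options become new struct fields. */
static uint32_t
fill_words(const struct fill_params *p)
{
   uint32_t written = 0;
   for (uint32_t i = p->offset; i < p->dst_count; i++) {
      p->dst[i] = p->value;
      written++;
   }
   return written;
}
```

Callers use designated initializers, so any field they do not mention defaults to zero.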
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41351>
The push constant size limit is only valid in stages that don't use inline
params, so I had to add stage_has_inline_param() and call it first.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41351>
The QPU ALUs operate internally on 32-bit data, and V3D already asks
nir_lower_int64 to lower several 64-bit integer operations before they
reach the backend.
Extend that set to cover bit count, div/mod, abs, and min/max, so these
operations are expanded into 32-bit sequences instead of being left for
backend codegen.
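To give a feel for what "expanded into 32-bit sequences" means, here is a hand-written sketch of two such expansions on a value split into low and high halves. It is an illustration of the idea only and does not claim to match the instruction sequences nir_lower_int64 actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Population count of a 32-bit value using only 32-bit operations. */
static uint32_t
bit_count32(uint32_t v)
{
   uint32_t count = 0;
   while (v) {
      v &= v - 1; /* clear the lowest set bit */
      count++;
   }
   return count;
}

/* 64-bit bit count expressed as the sum of two 32-bit bit counts. */
static uint32_t
bit_count64_lowered(uint32_t lo, uint32_t hi)
{
   return bit_count32(lo) + bit_count32(hi);
}

/* Unsigned 64-bit min: compare the high halves first and fall back to
 * the low halves on a tie, then select both halves of the winner. */
static void
umin64_lowered(uint32_t a_lo, uint32_t a_hi,
               uint32_t b_lo, uint32_t b_hi,
               uint32_t *out_lo, uint32_t *out_hi)
{
   int a_is_less = (a_hi < b_hi) || (a_hi == b_hi && a_lo < b_lo);
   *out_lo = a_is_less ? a_lo : b_lo;
   *out_hi = a_is_less ? a_hi : b_hi;
}
```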
Signed-off-by: yserrr <dlwognsdc610@naver.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42052>
Compiling with clang produces a -Wimplicit-fallthrough warning:
    src/gallium/drivers/ethosu/ethosu_cmd.c:1032:7: warning: unannotated
    fall-through between switch labels [-Wimplicit-fallthrough]
The plain "/* fall-through */" comment is not recognized by clang as a
fall-through annotation, so the intentional fall-through from the
ETHOSU_OPERATION_TYPE_CONVOLUTION case into the default case is flagged.
Replace the comment with the FALLTHROUGH macro, which expands to the
appropriate attribute and documents the intent for both GCC and clang.
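A minimal sketch of the idea, assuming a made-up macro name (`EXAMPLE_FALLTHROUGH`) and a made-up switch rather than the real FALLTHROUGH definition or the ethosu code: the macro expands to the statement attribute where the compiler supports it and to a no-op otherwise.

```c
#include <assert.h>

/* Illustrative macro: use the fallthrough statement attribute when the
 * compiler advertises it, otherwise expand to an empty statement. */
#if defined(__has_attribute)
#  if __has_attribute(fallthrough)
#    define EXAMPLE_FALLTHROUGH __attribute__((fallthrough))
#  endif
#endif
#ifndef EXAMPLE_FALLTHROUGH
#  define EXAMPLE_FALLTHROUGH do { } while (0)
#endif

enum op_type { OP_CONVOLUTION, OP_OTHER };

static int
emit_op(enum op_type type)
{
   int commands = 0;
   switch (type) {
   case OP_CONVOLUTION:
      commands += 10; /* extra work only this case needs */
      EXAMPLE_FALLTHROUGH;
   default:
      commands += 1; /* shared tail for every operation */
      break;
   }
   return commands;
}
```

Unlike a comment, the attribute is part of the program, so both GCC and clang can check it against -Wimplicit-fallthrough.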
Fixes: dce4b0313a ("ethosu: Add reshape operation")
Assisted-by: Claude Code (Claude Opus 4.8)
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42069>
Fixes many failing CTS tests in the following set:
KHR-Single-GL46.enhanced_layouts.ssb_member_invalid_offset*
See commit e58dcc47c3 that made the same change for radeonsi.
Fixes: 1eb4a2f5cd ("iris: Limit max_shader_buffer_size to INT32_MAX")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41958>
these must be dynamically uniform but can be GPR. fixes validation on
dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.dynamically_uniform_tessellation_evaluation,
and probably real bugs doing indirect loads in divergent control flow
(when lane 0 is masked off).
no fossil-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42056>
This optimization mostly helped BRW because 3-src instructions can't take
immediates, and BRW can't allocate scalars without wasting an entire GRF unit
per scalar. Jay has a better RA that can pack many scalars into a single GRF
unit, so allocating temporary registers for the immediates is far less likely
to lead to as much spilling as it does on BRW.
SIMD16:
Totals from 1331 (50.28% of 2647) affected shaders:
Instrs: 1665848 -> 1665514 (-0.02%); split: -0.16%, +0.14%
CodeSize: 23192072 -> 23215672 (+0.10%); split: -0.30%, +0.40%
SIMD32:
Totals from 1114 (42.09% of 2647) affected shaders:
Instrs: 1959968 -> 1960548 (+0.03%); split: -0.30%, +0.33%
CodeSize: 28004460 -> 28023468 (+0.07%); split: -0.39%, +0.46%
Number of spill instructions: 31157 -> 31161 (+0.01%); split: -0.01%, +0.03%
Number of fill instructions: 32138 -> 32130 (-0.02%); split: -0.05%, +0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42056>
The idea here is to eliminate the flag used for the select condition,
not eliminate other flag sources.
Previously, if we had an instruction like:
gpr = SEL <not in flag> 0 <already in flag>
we would process source 0 and try to rewrite_without_flags(). Because
it's not in a flag, we think eliminating flags would be useful, so we
rewrite it. But this only eliminates the source 2 selection flag, not
the source 0 flag. It's valid to do so (but debatably useful).
However, we thought we were done, and skipped the setup that ensures
source 0's value was actually loaded into a flag.
Instead, we should just perform this optimization when processing the
selection flag (source 2). By that point, we will have properly set
up any flags for sources 0 and 1. And if source 2 is not in a flag,
we can decide to rewrite without it. Or, if it's already in a flag,
we can keep it as-is.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42056>
delete_solo_discard was removing unconditional discards in the case
where the entire program had been optimized away. However, we can
do better: unconditional discards in the end block can be removed if
1. All render target writes after the discard have been eliminated
2. No intrinsics with side-effects (e.g. image stores) come after
See
dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_depth
where there's a discard at the end of the program which can be removed.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42056>
opt_unconditional_discards may eliminate all render target stores
due to all pixels being discarded. In that case, it tries to add
one back with a Null RT and no colour/depth/stencil outputs, just
to end the thread. We don't want to predicate that message on
helper invocations - we just need a basic message to end the thread.
In particular, we already lowered nir_intrinsic_is_helper_invocation
so we don't want to emit it again, as nothing would lower it afterwards.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42056>
Since Metal only has 16 and 32 bit types, if 8 bit indices were used, we
would run into asserts when trying to fetch the size from the util call.
Reviewed-by: squidbus <squidbus@proton.me>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42024>
The standard gfx-ci/linux kernel can be used for all Imagination jobs.
Signed-off-by: Robert Mazur <robert.mazur@imgtec.com>
Co-authored-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41869>
Picks up drm/imagination bug fixes and enables DRM_POWERVR for TI AM62/AM68.
Signed-off-by: Robert Mazur <robert.mazur@imgtec.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41869>
I detected many leaks with them, so I think running them with ASAN
is really useful. They take at most 6 minutes.
I added a suffix to make it more obvious.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42022>