This gives us the infrastructure that allows us to slowly migrate
pieces of blorp shaders from NIR to OpenCL, which, IMHO, are much
easier to read. We can't fully migrate everything due to all the
conditional building we do with these shaders, but I'm sure we'll find
opportunities to replace some NIR with OpenCL eventually.
The conversion of blorp_check_in_bounds() serves as the first example.
I also plan to have the shaders from the new indirect copy extension
be OpenCL shaders (mixed with some NIR as well), so having this patch
merged now will reduce the diff for the extension later.
Thanks to Alyssa Rosenzweig for her help here.
v2:
- Use SPDX (Alyssa).
- Use nir_trim_vector() (Alyssa).
- Adjust CL variable declaration (Alyssa).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
We have two distinct code paths sharing blorp_params->wm_inputs for
different purposes: the code from blorp_blit.c and the code from
blorp_clear.c. While blorp_blit.c uses most of the parameters (all
except clear_color), blorp_clear.c only uses clear_color and
bounds_rect. Split the parameters in two structs: one for blits and
the other for clears.
This not only helps save some space in the shader inputs, but it also
organizes things so it's more clear which parameters are used by what.
In addition, my plan is to later add struct blorp_wm_inputs_indirect,
which won't share anything that the others use, and would otherwise
grow the struct even more.
This change would reduce the size of struct blorp_wm_inputs from 96 to
80, but we have to add padding due to the assertion that compares it
to cs_prog_data->push.cross_thread.size. Still good, though.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
Because blorp_params_get_clear_kernel() calls
blorp_params_get_clear_kernel_cs(), which reads params->num_samples,
which we have not properly set yet at this point.
I am also planning to have the functions that create the shader to
rely on params.op, which we have not set yet either.
I found this by inspection (when writing another patch), I'm not sure
if this fixes something relevant, but it may be relevant to ver >= 30
multi-sampled cases.
Fixes: de0c547448 ("blorp: Handle 2D MSAA array image copies on compute shader")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
When I first looked at this struct, my tiny little brain felt
overwhelmed.
- Add some white spaces in order to group the parameters into
"logical" groups so it's easier to reason about everything.
- Change the parameter order just a little bit - without breaking the
logical groups - so the struct size decreases by 1.7% to 1864 bytes.
- Add a comment explaining what the void * pointers point to.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
If we ever add more entries, things won't explode.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
I'm sorry, but I have OCD and the rest of the file is nicely aligned.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
On Broadwell, using the debug mode, you can't create even a single
VkImage:
createimage: ../../src/util/u_math.h:829: util_is_aligned: Assertion `(a != 0) && ((a & (a - 1)) == 0)' failed.
Thread 1 "createimage" received signal SIGABRT, Aborted.
Download failed: Invalid argument. Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007ffff789573f in __pthread_kill_internal (threadid=<optimized out>, signo=6) at ./nptl/pthread_kill.c:89
#2 0x00007ffff7840462 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff78284ac in __GI_abort () at ./stdlib/abort.c:77
#4 0x00007ffff7828420 in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, file=<optimized out>, line=829, function=<optimized out>) at ./assert/assert.c:118
#5 0x00007ffff5a5fb0c in util_is_aligned (n=0, a=0) at ../../src/util/u_math.h:829
#6 0x00007ffff5a6060d in memory_range_end (memory_range=...) at ../../src/intel/vulkan_hasvk/anv_image.c:51
#7 0x00007ffff5a61c52 in check_memory_range_s (p=0x7fffffffd800) at ../../src/intel/vulkan_hasvk/anv_image.c:779
#8 0x00007ffff5a61ef3 in check_memory_bindings (device=0x555555654d50, image=0x55555566e050) at ../../src/intel/vulkan_hasvk/anv_image.c:830
#9 0x00007ffff5a62ea3 in anv_image_init (device=0x555555654d50, image=0x55555566e050, create_info=0x7fffffffd9d0) at ../../src/intel/vulkan_hasvk/anv_image.c:1263
#10 0x00007ffff5a63147 in anv_image_init_from_create_info (device=0x555555654d50, image=0x55555566e050, pCreateInfo=0x7fffffffda80) at ../../src/intel/vulkan_hasvk/anv_image.c:1333
#11 0x00007ffff5a63211 in anv_CreateImage (_device=0x555555654d50, pCreateInfo=0x7fffffffda80, pAllocator=0x0, pImage=0x7fffffffdd20) at ../../src/intel/vulkan_hasvk/anv_image.c:1356
#12 0x00007ffff44ff376 in vvl::dispatch::Device::CreateImage (this=0x55555562c480, device=0x555555654d50, pCreateInfo=0x7fffffffdcb8, pAllocator=0x0, pImage=0x7fffffffdd20)
at ./layers/vulkan/generated/dispatch_object.cpp:1160
#13 0x00007ffff43e8214 in vulkan_layer_chassis::CreateImage (device=0x555555654d50, pCreateInfo=0x7fffffffdcb8, pAllocator=0x0, pImage=0x7fffffffdd20) at ./layers/vulkan/generated/chassis.cpp:2181
#14 0x0000555555560af4 in vks::Image::init (this=0x7fffffffdcb0) at /home/przanoni/git/random-stuff/vk/vks/libvulkanscript.hpp:1298
#15 0x000055555556557d in main () at createimage.cpp:36
Since we haven't noticed this issue as quickly as I imagined we would,
let's opt for what's mostly a revert of the behavior change in the
original commit.
Fixes: 7be63ef956 ("intel: do not NIH util_is_aligned")
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39045>
when executing a renderpass which is only a clear, an emulated alpha
image will have the correct value applied from having the clear value
rewritten. this still can't handle draws, but it's a small expansion of
functionality which fixes some edge cases
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39318>
Elk still uses param array, so it depends on this behavior. This fixes
an assertion failure on Broadwell in gfxbench4/carchase/339.shader_test.
src/intel/compiler/elk/elk_fs_nir.cpp:148: void fs_nir_setup_uniforms(elk_fs_visitor&): Assertion `s.uniforms == s.prog_data->nr_params' failed.
Fixes: f4a0e05970 ("anv/brw/iris: get rid of param array on prog_data")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39280>
ACO got a lot better at forming VOPD instructions, and testing
feedback seems to point in a slightly positive direction for this.
gfx12 will also start requiring wave32 for dynamic VGPR allocation at
some point.
Measurements on navi31:
Cyberpunk 2077:
Difference at 95.0% confidence
1.12333 +/- 0.42876
1.88216% +/- 0.718391%
(Student's t, pooled s = 0.189165)
Black Myth Wukong benchmark:
Difference at 95.0% confidence
4 +/- 1.30862
13.9535% +/- 4.56495%
(Student's t, pooled s = 0.57735)
Portal with RTX:
66.2ms->61.5ms (~7.64% improvement)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39275>
Add nir_io_compact_to_higher_16 flag so that the pass knows if it can
compact 16-bit varyings into the higher 16 bits of a 32-bit varying.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38994>
There are some situations where we don't get a binary. In those cases
it is not relevant to report the statistics.
This was detected while using shader-db, with a shader that has a xfb
vertex shader that was not writing to gl_Position. It that case
IDVS_POSITION binary was zero, so later compiling IDVS_VARYING was
skipped. Due the skip, the pan_stat structure was not initialized, so
the debug used the wrong isa, that used different measurements. This
lead to shader-db report tool failing, due having a mix-up of
measurements.
Although an alternative would be to try to ensure that the structure
is always initialized, seems just more natural to just not report on
shaders with empty binaries (as the stats would be zero for all
measurements).
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39310>
We just... didn't do this at all??? I have no idea how this didn't blow
up before, given that plenty of apps should generate a traversal shader
that spills (and thus has a large stack size), but it did finally blow
up in function-call related work.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
G720 isn't wired up in CI, and nobody is running these regularly. So we
have missed that these CTS issues were fixed upstream.
I haven't actually verified that this works, because I don't have a G720
around at the moment. But the fixes fixed both v10 and v13 GPUs, so it
would be really, really strange if this wasn't effective on v11 as well.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39034>
These arguments misrenders, and gets rendered using a long hyphen
instead of double dashes. Let's fix this by using the Sphinx
option-directive, which is meant for exactly this purpose instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39286>
I'm not convinced these really should be separate opcodes at all in NIR, but
that's not what this patch is about. Here we just infer the opcodes in the
texture builder to allow simplified usage.
This lets us drop nir_txl() & nir_txb() helpers in favour of nir_tex(.lod/bias)
which is more normalized. We could also drop nir_txf_ms in favour of nir_txf but
that affects more callsites and is not obviously a win (unlike nir_txl which is
used once and nir_txb which is unused).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39271>
It is not allowed in SDS. This matters on gen8.
Possibly it wasn't enforced, or worked by luck, on older gens. But at
least for gen7 and gen8 it is documented as not allowed in SDS (so
working or not may depend on fw version, phase of the moon, etc).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39254>