Commit graph

186540 commits

Author SHA1 Message Date
Francisco Jerez
189422de1b intel/brw/xe2+: Update encoding of FB write descriptor message control.
Ref: bspec: 65209, 63908
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>
2024-03-20 15:46:44 -07:00
Francisco Jerez
7b0fbc22dd intel/brw/xe2: Render target reads have been removed from the hardware.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>
2024-03-20 15:46:44 -07:00
Paulo Zanoni
6ec1e322f0 anv: don't leak device->vma_samplers
The vma_samplers vma heap is initialized unconditionally. Don't use
device->physical->indirect_descriptors as a condition on whether to
free it or not.

From my TGL machine:

==373617== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1
==373617==    at 0x48459F3: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==373617==    by 0x6926DC0: util_vma_heap_free (vma.c:339)
==373617==    by 0x6925ED3: util_vma_heap_init (vma.c:53)
==373617==    by 0x5334EDA: anv_CreateDevice (anv_device.c:3404)
==373617==    by 0x685593A: vk_tramp_CreateDevice (vk_dispatch_trampolines.c:78)
==373617==    by 0x48A6D56: terminator_CreateDevice (loader.c:5833)
==373617==    by 0x9C2293F: vulkan_layer_chassis::CreateDevice(VkPhysicalDevice_T*, VkDeviceCreateInfo const*, VkAllocationCallbacks const*, VkDevice_T**) (chassis.cpp:497)
==373617==    by 0x48B0690: loader_create_device_chain (loader.c:4937)
==373617==    by 0x48B1327: loader_layer_create_device (loader.c:4317)
==373617==    by 0x48B8D79: vkCreateDevice (trampoline.c:1004)
==373617==    by 0x10CC7A: MyApp::MyApp(int, bool) (sparse.cpp:608)
==373617==    by 0x1201E8: main (sparse.cpp:6025)

Fixes: 7c76125db2 ("anv: use 2 different buffers for surfaces/samplers in descriptor sets")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28303>
2024-03-20 21:55:55 +00:00
Rob Clark
5ee8fd6b49 freedreno/a6xx: Fix z/s preserving sysmem clear blit
Need to ignore color attachments in that logic.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10466
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28249>
2024-03-20 21:28:40 +00:00
Mary Strodl
42ad4c6e6e rusticl: set OCL_ICD_VENDORS as directory, not file
Looks like `OCL_ICD_VENDORS` is meant to be a directory containing
`.icd` files, but we were giving it a path to a driver. This meant that
rusticl wouldn't get picked up when using devenv.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28231>
2024-03-20 16:12:22 -04:00
Lionel Landwerlin
4fbdfdce9c anv: allocate pipeline bindings tables dynamically on the heap
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28290>
2024-03-20 19:29:05 +00:00
Lionel Landwerlin
7730fa5683 anv: track embedded sampler counts in layouts
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28290>
2024-03-20 19:29:05 +00:00
Juston Li
dc1069b167 venus: extend device format prop cache with VkFormatProperties3
Extend the vkGetPhysicalDeviceFormatProperties2 cache to include
VkFormatProperties3 from the pNext chain. VkFormatProperties3 was
observed being always attached for DXVK and thus skipping the cache
if not handled.

Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28194>
2024-03-20 19:12:00 +00:00
Joshua Ashton
aecd46182d lavapipe: Enable EXT_swapchain_colorspace
No-op.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
2024-03-20 18:24:26 +00:00
Joshua Ashton
fc263e0308 v3dv: Enable EXT_swapchain_colorspace
No-op.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
2024-03-20 18:24:26 +00:00
Joshua Ashton
5c49f3c1aa lavapipe: Enable EXT_swapchain_maintenance1
This was missing, this is implemented in common code.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
2024-03-20 18:24:26 +00:00
Joshua Ashton
f977e4d4f5 v3dv: Enable EXT_swapchain_maintenance1
This was missing, this is implemented in common code.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
2024-03-20 18:24:25 +00:00
Joshua Ashton
145ab5b853 anv: Enable EXT_swapchain_maintenance1
This was missing, this is implemented in common code.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
2024-03-20 18:24:25 +00:00
Rhys Perry
76e089ea48 aco/cssa: update comments
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28248>
2024-03-20 17:50:27 +00:00
Rhys Perry
0c0819f0da aco/cssa: reset equal_anc_out if merging fails
try_merge_merge_set() expects equal_anc_out to be Temp() in the beginning,
so we need to reset it in case it's used again.

Fixes compilation of metro_exodus/163b3b895730d37b with
VK_PIPELINE_CREATE_2_DISABLE_OPTIMIZATION_BIT_KHR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 18ba93e673 ("aco/cssa: rewrite lower_to_cssa pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28248>
2024-03-20 17:50:27 +00:00
Mark Collins
f72cd2eae7 fd/decode: Fix "OPTSIONS" typo in help messages
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
8b4b252674 fd/replay: Use generate_rd as default CS generator
The generate_rd from the same directory as the replay executable is
used as the default CS generator.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
69d347e42f fd/decode: Build generate_rd executable rather
A seperate meson project with mesa as a subproject was required to
compile the generate_rd executable for replaying rddecompiler
generated source. That approach was overly complicated, now the
generate_rd executable is now compiled as a part of building the
freedreno tools with a blank generate-rd.cc file that can be replaced
with the source generated by rddecompiler.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
bdd89dad1c fd/rddecompiler: Disable IR3 cache for replay context
The cache being enabled results in some warning logs, it's not
necessary for rddecompiler either way.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
fc9e718a86 fd/replay+rddecompiler: Add option to clear wrbufs at start
It's useful to clear buffers at the start of sequences to view the
delta, this adds that functionality to wrbufs with a fixed clear
value of 0xDEADBEEF.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
694ed34673 fd/replay: Error when VMA AS allocation fails
It's possible for large allocations to hit the maximum address
space size especially with a fake AS, these failures are silent
and can cause a hard to debug segfault later down the line.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
e0a680162d fd/replay: Add wrbuf support for KGSL/DXG
The vector for wrbufs wasn't being initialized for KGSL and DXG
leading to UB when they were used with it.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
0fad4e547b fd/replay: Clear wrbufs after submitting cmdstreams for DRM
Retaining them across submissions was a bug, the wrbuf should only
be dumped for the submission it originates from.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
011cacd982 fd/replay: Clamp dumped wrbuf to buffer size
We should be careful to not read past the end of any buffers when
dumping wrbufs, this clamps the size to the size of the buffer with
a warning.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
e10202fdf4 fd/replay: Dump wrbuf into cwd rather than exe directory
It didn't make any sense to output into the bin directory, it has
been replaced with the working directory instead.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:08 +00:00
Mark Collins
d043ebc941 fd/replay: Fix wrbuffer name extraction
The offset for reading and length calculation logic was incorrect
leading to the name string being entirely incorrect.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
2024-03-20 17:34:07 +00:00
Zan Dobersek
1e27138588 freedreno/fdl: avoid overflow in layout size computations
When computing layout for large extents and array size, the computations
can overflow the 32-bit variables holding these values. This can end up
underreporting memory requirements for a given resource, making the
application think that such a resource is allocatable.

The size-holding variables in the fdl_layout struct have their type
upgraded to uint64_t, as does the total size variable in tu_image. This
avoids problems in corner-case tests in the Vulkan CTS, but this code
should be improved further to avoid overflows and narrowing conversions.

Fixes a quartet of tests under
dEQP-VK.pipeline.monolithic.render_to_image.core.2d_array.huge.width_height_layers.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28050>
2024-03-20 17:05:55 +00:00
Samuel Pitoiset
be4a6b946a radv: add a workaround for null IBO on GFX6
Based on PAL.

Fixes dEQP-VK.draw.*nulldescriptor_maintenance_5_maintenance6 on GFX6.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28263>
2024-03-20 16:27:58 +00:00
Juan A. Suarez Romero
d87ccf0632 broadcom/ci: add new expected failures
Add more expected failures that should have been included in
74be42d9a4.

Fixes: 74be42d9a4 ("broadcom/ci: add new expected test failures")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28298>
2024-03-20 16:06:35 +00:00
Mike Blumenkrantz
f79557dd38 zink: do io fixup on patch variables too
fixes spec@arb_separate_shader_objects@rendezvous by location (5 stages)

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28296>
2024-03-20 15:09:12 +00:00
Rhys Perry
f88922e816 radv: use dual_color_blend_by_location with Half-Life Alyx
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Ethan Lee <flibitijibibo@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10462
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28269>
2024-03-20 11:40:18 +00:00
Kenneth Graunke
a075b44493 intel/brw: Eliminate top-level FIND_LIVE_CHANNEL & BROADCAST once
brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL
outside of control flow.  None of our optimization passes generate
additional cases of that instruction, so once it's gone, we shouldn't
ever have to run the pass again.  Moving it out of the loop should
save a bit of CPU time.

While we're at it, also clean adjacent BROADCAST instructions that
consume the result of our FIND_LIVE_CHANNEL.  Without this, we have
to perform copy propagation to get the MOV 0 immediate into the
BROADCAST, then algebraic to turn it into a MOV, which enables more
copy propagation...not to mention CSE gets involved.  Since this
FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is
really common, and it's trivial to clean up, we can do that.  This
lets the initial copy prop in the loop see MOV instead of BROADCAST.

Zero impact on fossil-db, but less work in the optimization loop.

Together with the previous patches, this cuts compile time in
Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 01:04:22 -07:00
Kenneth Graunke
5814534de5 intel/brw: Don't consider UNIFORM_PULL_CONSTANT_LOAD a send-from-GRF
It's a logical opcode which is lowered to a send-from-GRF later.  That
lowering code is responsible for ensuring the sources are set up in a
proper SEND payload.

This was preventing copy propagation of surface handles which started
out as scalars, were splatted out to full-SIMD values with NoMask, then
actually consumed as only component 0 (scalar again), because we thought
that scalar values were not allowed.

fossil-db on Alchemist shows improvements in q2rtx but no other titles:

   Totals:
   Instrs: 161310436 -> 161310152 (-0.00%)
   Cycles: 14370605159 -> 14370601066 (-0.00%)

   Totals from 17 (0.00% of 652298) affected shaders:
   Instrs: 16097 -> 15813 (-1.76%)
   Cycles: 185508 -> 181415 (-2.21%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 01:04:22 -07:00
Kenneth Graunke
ea423aba1b intel/brw: Split out 64-bit lowering from algebraic optimizations
We don't necessarily want to split up MOVs for 64-bit addresses into
2x 32-bit MOVs right away, as this makes things like copy propagating
the whole address around harder.  We should do this late, once, while
still doing other algebraic optimizations earlier.

fossil-db results for Alchemist show tiny improvements:

   Totals:
   Instrs: 161310502 -> 161310436 (-0.00%); split: -0.00%, +0.00%
   Cycles: 14370605606 -> 14370605159 (-0.00%); split: -0.00%, +0.00%

   Totals from 33 (0.01% of 652298) affected shaders:
   Instrs: 15053 -> 14987 (-0.44%); split: -0.64%, +0.20%
   Cycles: 196947 -> 196500 (-0.23%); split: -0.25%, +0.02%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 01:04:17 -07:00
Nanley Chery
831703157e iris: Use resource_get_param in resource_get_handle
Refactor iris_resource_get_handle to use iris_resource_get_param to pick
up the fix from the previous patch.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9994
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28258>
2024-03-19 23:12:06 +00:00
Nanley Chery
bf1008ac28 iris: Report the correct modifier for Tile4 images
In iris_resource_get_param, report the Tile4 modifier for Tile4 images
instead of reporting the linear modifier.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28258>
2024-03-19 23:12:06 +00:00
Mark Janes
345c918a76 intel/dev: remove pci revision from shader cache key
Pci revision was included in the shader cache key because it can
enable platform workarounds.  While some platform workarounds exist in
the compiler, none are dependent on the silicon stepping.

Many platforms differ only in the pci revision id, causing needless
duplication in cache entries between platforms.

When a platform ships publicly with stepping-specific compiler
workarounds, pci id must be incorporated into the shader cache key.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28085>
2024-03-19 15:11:19 -07:00
Timur Kristóf
58e3b1f930 aco: Allow passing constant operand to is_overwritten_since.
This is to make it more intuitive and also consistent
with last_writer_idx which does allow constant operands.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28046>
2024-03-19 20:50:12 +00:00
Gert Wollny
d1cac5ed05 zink: acquire - maybe clear timeout after waiting for presentation fence
If the presentation fence was signalled and we still hold
max_acquires or more images, then clear the timeout to avoid
a possible deadlock.

With that we avoid the validation error

  VUID-vkAcquireNextImageKHR-surface-07783

triggered by piglit

   spec@!opengl 1.0@gl-1.0-drawbuffer-modes

and others.

v2: clear timeout only if we have acquired more images than the
    reported max and add some comment why the timeout is cleared
    (Mike).

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28245>
2024-03-19 20:12:52 +00:00
Mary Guillemard
9e133c4000 nouveau: Add support for TERT opcodes in vk_push_print
Those opcodes are vestige of the old command format.

This implement handling of them and fix issues when analysing command
buffers that use thoses.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28277>
2024-03-19 19:56:07 +00:00
Kenneth Graunke
d473004576 intel/fs: Avoid generating useless UNDEFs for every SSA def
Emitting UNDEF is only necessary when the instructions we generate to
produce the NIR def are considered partial writes.  By adding a simple
check (adapted from fs_inst::is_partial_write()), we can avoid creating
loads of unnecessary UNDEFs that we have to clean up later.

Our first dead code elimination pass does get rid of them pretty
quickly, but this should save memory and time during our first
split_virtual_grfs and dead_code_elimination passes.

This generates roughly 30% fewer instructions at the beginning.

Improves compilation time of shaders:
- Rise of the Tomb Raider: -3.51563% +/- 0.103951% (n=7)
- Borderlands 3: -3.64422% +/- 0.300951% (n=7).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28169>
2024-03-19 19:32:18 +00:00
Konstantin Seurer
a6b93c50d0 radv/printf: Use fprintf instead of printf
For using other destinations than stdout.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28228>
2024-03-19 19:05:25 +00:00
Konstantin Seurer
d902b6d805 radv: Skip more acceleration structure build markers
We should skip even more stuff when using updates only.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28228>
2024-03-19 19:05:25 +00:00
Caio Oliveira
b58b6d2d32 anv: Enable VK_KHR_shader_quad_control
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 18:41:15 +00:00
Caio Oliveira
b22879e753 intel/brw: Use predicates for quad_vote_any and quad_vote_all when available
Up until Xe2, we can use the predicates ANY4H and ALL4H to achieve the
same result with less instructions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 18:41:15 +00:00
Caio Oliveira
857e62e6ac intel/brw: Implement quad_vote_any and quad_vote_all
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 18:41:15 +00:00
Ian Romanick
671745b616 intel/fs: Don't allow 0 stride on MOV destination
Outside SIMD1 instructions, a destination stride of zero doesn't make
any sense. When such strides exist, they would be fixed by the FS
generator. Currently the only place that intentionally generates such a
stride is setup_barrier_message_payload_gfx125, and this commit changes
that.

The existence of a zero stride that won't really be a zero stride causes
a variety of problems with other optimization passes. Those passes don't
know that 0 actually means 1, and they make incorrect assumptions about
sizes written, etc.

The assertion helped catch many bugs in some other work in progress that
tries to store convergent values in SIMD8 registers regardless of the
dispatch width. That code would accidentally generate destination
strides of zero.

v2: Check stride differently depending on register file. Suggested by
Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28256>
2024-03-19 18:17:59 +00:00
Danylo Piliaiev
d10b546776 freedreno/replay: Use real queueid for submissions and waits
Otherwise it failed when expected queueid is not 0.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27123>
2024-03-19 17:56:33 +00:00
Samuel Pitoiset
6f18f39208 zink/ci: enable RADV_PERFTEST=shader_object for polaris10
It's passing in CI now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28273>
2024-03-19 17:33:11 +00:00
Konstantin Seurer
6095b70f85 radv/rt: Use 32-bit offsets for load_sbt_entry
Totals from 82 (18.06% of 454) affected shaders:
MaxWaves: 820 -> 821 (+0.12%)
Instrs: 2765694 -> 2766338 (+0.02%); split: -0.08%, +0.10%
CodeSize: 14751988 -> 14735464 (-0.11%); split: -0.13%, +0.01%
VGPRs: 8464 -> 8448 (-0.19%)
SpillSGPRs: 454 -> 512 (+12.78%)
Latency: 19368679 -> 19344967 (-0.12%); split: -0.21%, +0.09%
InvThroughput: 5354427 -> 5346317 (-0.15%); split: -0.24%, +0.08%
VClause: 100183 -> 100331 (+0.15%); split: -0.02%, +0.17%
SClause: 66584 -> 66590 (+0.01%); split: -0.02%, +0.03%
Copies: 237008 -> 238684 (+0.71%); split: -0.53%, +1.23%
Branches: 113344 -> 113386 (+0.04%); split: -0.00%, +0.04%
PreSGPRs: 6141 -> 6194 (+0.86%)
PreVGPRs: 7916 -> 7880 (-0.45%)

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27725>
2024-03-19 17:03:28 +00:00