Commit graph

189224 commits

Author SHA1 Message Date
Samuel Pitoiset
7f6e28db26 radv: fix re-emitting fragment output state when resetting gfx pipeline state
When switching from pipeline to shader objects.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33840>
2025-03-03 19:19:33 +00:00
Xaver Hugl
779c8d1669 vulkan/wsi: don't use sRGB if the compositor doesn't support it
This could realistically happen if the compositor doesn't support parametric image
descriptions at all, in which case we'd get a protocol error for trying to use it.

Signed-off-by: Xaver Hugl <xaver.hugl@kde.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33804>
2025-03-03 18:55:29 +00:00
Gert Wollny
6da19eafd5 r600/sfn: gather info and set lowering 64 bit after nir_lower_io
After nir_lower_io we need to gather the info about 64 bit usage
to be up-to-date when deciding whether the remaining 64 bit IO ops
be lowered.

Before 89dad5618d ("gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER")
the info was eventually updated to include the use of 64 bit values
also if only some IO was using this so that SFN was handling the code
correctly. As it seems with above patch this is not always the case
anymore, and we have to take care of it.

Fixes: 89dad5618d ("gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32774>
2025-03-03 18:35:45 +00:00
Sasha Finkelstein
1601668155 vtn_bindgen2: Fix memory corruption
This sometimes causes memory corruption, specifically on 32-bit x86
where it can result in a build failure

Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33847>
2025-03-03 17:25:58 +00:00
Mary Guillemard
7c12df63de pan/bi: Add unit tests for FAU special page 3 and WARP_ID
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33843>
2025-03-03 17:04:04 +00:00
Mary Guillemard
ef0c7382c7 pan/bi: Disallow FAU special page 3 and WARP_ID on message instructions
This is a constraint that apply on Valhall and later, instructions
should not use FAU special page 3 or WARP_ID if running
on the message unit.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: fd1906afea ("pan/va: Add FAU validation")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33843>
2025-03-03 17:04:04 +00:00
Rob Clark
6df9591905 tu: Don't emit SP_PS_2D_WINDOW_OFFSET on a6xx
This register isn't there.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33823>
2025-03-03 16:44:24 +00:00
Konstantin Seurer
4348253db5 llvmpipe: Skip draw_mesh if the ms did not write gl_Position
There is nothing to be done and the code will hit "assert(pos != -1);"
otherwise.

cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12684
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33812>
2025-03-03 15:18:28 +00:00
Patrick Lerda
ee1cb894d6 r600: fix evergreen_emit_vertex_buffers() related cl regression
For instance, this issue is triggered with "piglit/bin/cl-custom-buffer-flags":
Segmentation fault

Fixes: 81889f4d5c ("r600: ensure that the last vertex is always processed on evergreen")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33351>
2025-03-03 14:54:32 +00:00
Danylo Piliaiev
264d8a6766 ir3: Set need_full_quad depending on info.fs.require_full_quads
The info from NIR is more granular, so that we don't have to
enable full quad for coarse derivatives.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33801>
2025-03-03 13:42:33 +00:00
Konstantin Seurer
6e3fc37d47 radv: Implement multidimensional ray query arrays
This is technically a bug fix, but no sane developer would use this.
It's still nice to implement all corner cases.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32334>
2025-03-03 12:07:47 +00:00
Konstantin Seurer
febc923a46 radv: Lower ray query vars to structs
This is much cleaner than passing an index around it will allow
implementing multidimensional ray query arrays.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32334>
2025-03-03 12:07:47 +00:00
Zan Dobersek
86456cf0e6 tu: support exposing derived counters through VK_KHR_performance_query
Turnip's current VK_KHR_performance_query implementation only exposed raw
perfcounters. These aren't exactly trivial to evaluate on their own.

This mode can still be used with the new TU_DEBUG_PERFCRAW flag. Existing
TU_DEBUG_PERFC now enables performance query mode where Freedreno's derived
counters are exposed instead. These default to using the command scope,
making them usable with RenderDoc's performance counter capture.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
2025-03-03 11:38:28 +00:00
Zan Dobersek
27fd2d1ad1 freedreno: add common implementation of perfcntr-based derived counters
Freedreno's derived counters combine multiple perfcntrs into a more
sensible, human-friendly metric. This change picks up the counters
currently used in Freedreno's Perfetto producer and rolls them into a
more genericallly usable form.

First place of their use will be through VK_KHR_performance_query, but
the Perfetto producer should also be able to use this interface instead
of having the logic duplicated. For now the counters are available only
for a7xx devices.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
2025-03-03 11:38:28 +00:00
Zan Dobersek
b8338dee39 tu/a7xx: disable preemption during performance query measurement
Use CP_SCOPE_CNTL to disable preemption when beginning performance query
and enable it back when that performance query is ended. This way the
collected perfcounter measurements will only cover work that's encompassed
by the query.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
2025-03-03 11:38:28 +00:00
Zan Dobersek
c964a96ab2 tu: performance query result writes must use dedicated union type
When using vkGetQueryPoolResults for performance queries, the result of
each query is the VkPerformanceCounterResultKHR union, and the result of
each query should be written into that union according to the storage type
of the corresponding counter.

This fixes current behavior of using the generic write procedure, where
either 32-bit or 64-bit value is written depending on the specified query
result flags.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
2025-03-03 11:38:28 +00:00
Zan Dobersek
1cf5cf6da3 tu: use query index when retrieving performance query iovas
Query index should be used to calculate the correct iova for various
performance query fields used to store counter values at the beginning
and ending of performance queries.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
2025-03-03 11:38:28 +00:00
Emmanuel Gil Peyrot
b4a82110ce panvk: Initialize out array with the correct length
This avoids reading past the buffer’s end in the client afterward, because the
drmFormatModifierCount hasn’t been changed from what the client passed, if it
wasn’t zero at first.

GTK triggers that bug by setting it to the length of the static array (see this
bug[0] though), but other Vulkan programs might have the same issue if they
don’t first query the count before allocating the array.

This has been tested on a Radxa ROCK 5B board running a Mali-G610 GPU.

[0] https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/8222

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 252ddaf51b ("panvk: fix VkDrmFormatModifierPropertiesListEXT query")
Fixes: https://gitlab.freedesktop.org/mstoeckl/waypipe/-/issues/127
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33657>
2025-03-03 11:09:14 +01:00
Julia Zhang
79bb8e3455 radv: advertise VK_EXT_device_memory_report
Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088>
2025-03-03 08:26:51 +00:00
Julia Zhang
f504ed9e73 radv: emit device memory report for device memory events
Emit device memory report when radv create memory or free memory.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088>
2025-03-03 08:26:51 +00:00
Julia Zhang
313aa44bf1 radv: add obj_id to radeon_winsys_bo
mem->bo->obj_id will be used by device memory report.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088>
2025-03-03 08:26:51 +00:00
Julia Zhang
900be035c8 radv: add import and export handle_type in radv_alloc_memory
The import_handle_type and export_handle_type will be used to set the
memoryObjectId for memory report.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088>
2025-03-03 08:26:51 +00:00
Julia Zhang
273e00bffe vulkan: handle device memory report requests
Add memory_report to vk_device and init it when create vk
device to handle device memory report requests.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088>
2025-03-03 08:26:51 +00:00
Job Noorman
c24373d907 ir3/sched: unblock a0.x/a1.x after last use
We currently keep a0.x/a1.x unnecessarily blocked until we have to spill
it, even if all its uses have been scheduled already. This causes
unnecessary spills and potentially schedules address writes later than
we'd like.

Fix this by keeping track of the number of uses that are still
unscheduled, unblocking address writes when it drops to zero.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33816>
2025-03-03 07:48:16 +00:00
Job Noorman
9728b31e65 ir3: clear instruction uses when cloned
Clones do not share the uses of the cloned instruction.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33816>
2025-03-03 07:48:15 +00:00
Hyunjun Ko
f80e5c2ce5 anv/ci: remove some expected failures of dEQP-VK.video.formats.*
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33784>
2025-03-03 04:36:21 +00:00
Hyunjun Ko
f7ff9b240d anv: Do not support the tiling of DRM modifier if DECODE_DST
Fixes: 04709e4f ("anv: fix video profile lists");

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33784>
2025-03-03 04:36:21 +00:00
Yiwei Zhang
b6b8193534 venus: fix an obsolete protocol sync earlier
luckily it doesn't hurt anything and won't break anything

Fixes: 785f44adc8 ("venus: sync protocol for the passthrough extensions")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33834>
2025-03-02 16:58:21 +00:00
Yiwei Zhang
c35b52638c venus: relax the requirement for sync2
The current requirement for sync is only to support WSI, and it is not
necessarily needed at all per the comment added. Will drop it later.

Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33829>
2025-03-02 16:40:14 +00:00
Christian Gmeiner
3f95531d39 etnaviv/ci: Start using the revision number for GPU_VERSION
This step is needed to support the same GPU model, but with a
different revision. A good example is the gc7000.

 - imx8mp: model gc7000 with revision 6204 (uses RS)
 - imx8mq: model gc7000 with revision 6214 (uses BLT)

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33811>
2025-03-01 22:14:28 +00:00
Alyssa Rosenzweig
d2edb15454 radv: use VK_COPY_STR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
2025-03-01 20:27:26 +00:00
Alyssa Rosenzweig
aa2f1138e6 v3dv: switch to common VK_COPY/PRINT_STR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
2025-03-01 20:27:26 +00:00
Alyssa Rosenzweig
0b4ccac83e anv,hasvk: switch to common VK_COPY/PRINT_STR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
2025-03-01 20:27:26 +00:00
Alyssa Rosenzweig
0bca84c3a1 hk: switch to common VK_COPY/PRINT_STR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
2025-03-01 20:27:26 +00:00
Alyssa Rosenzweig
e331efd4fe vulkan: add common VK_PRINT_STR/VK_COPY_STR macros
every vk driver wants these macros for executable statistics, so make them
common. there are two variants floating in-tree, a pure copy (radv) and a
formatted print (everyone else). we add both variants and then convert most
prints to copies where formatting isn't actually used. that has the benefit of
cleaning up trivial "%s" format strings in a bunch of places.

I didn't bother cleaning up the formatting in non-automatic-formatted drivers
because it's tedious and I'm planning to delete a lot of this driver code with
upcoming runtime work anyway. This is a step towards those runtime improvements.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
2025-03-01 20:27:26 +00:00
Alyssa Rosenzweig
7ce8f1ebb0 asahi: clang-format
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
2025-03-01 20:27:26 +00:00
Georg Lehmann
975be7ac5d ac/nir/mem_access_bit_sizes: split unaligned vec3 lds access to allow more read2/write2
Foz-DB Navi21:
Totals from 77 (0.10% of 79377) affected shaders:
Instrs: 69787 -> 68745 (-1.49%); split: -1.51%, +0.02%
CodeSize: 367256 -> 360060 (-1.96%); split: -1.97%, +0.01%
VGPRs: 3896 -> 3880 (-0.41%)
Latency: 335403 -> 335297 (-0.03%); split: -0.11%, +0.08%
InvThroughput: 102766 -> 102931 (+0.16%); split: -0.09%, +0.25%
VClause: 1645 -> 1643 (-0.12%); split: -0.18%, +0.06%
SClause: 1434 -> 1433 (-0.07%)
Copies: 4280 -> 4283 (+0.07%); split: -0.56%, +0.63%
PreVGPRs: 2408 -> 2421 (+0.54%); split: -0.08%, +0.62%
VALU: 45557 -> 45646 (+0.20%); split: -0.10%, +0.29%
SALU: 6458 -> 6474 (+0.25%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33448>
2025-03-01 18:26:54 +00:00
Georg Lehmann
8b2b3e5704 radv: remove outdated vectorize TODO
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33448>
2025-03-01 18:26:54 +00:00
Faith Ekstrand
47ca264dc2 nak: Set .NODEP on tex ops based on nir_opt_tex_skip_helpers()
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>
2025-03-01 08:44:15 +00:00
Faith Ekstrand
a65009e808 nir: Add a nir_opt_tex_skip_helpers optimization
Arm and NVIDIA hardware both have this as a bit you can set on the
texture instruction so we may as well have a shared pass for it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>
2025-03-01 08:44:15 +00:00
Faith Ekstrand
7ac6ec2ceb nir: Add a get_io_index_src() helper
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>
2025-03-01 08:44:15 +00:00
Faith Ekstrand
61108eb1b5 compiler/rust: Add u_printf_info to the rust bindings
This both adds it to the compiler allowlist and adds it to the user
denylist.  This gets rid of a Rust compiler warning in NAK.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>
2025-03-01 08:44:15 +00:00
Georg Lehmann
7eb43c3b1c aco/optimizer: delete combine_and_subbrev
This is now done in NIR. No Foz-DB changes on Navi21.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:28 +00:00
Georg Lehmann
d272a6e261 nir/opt_algebraic: optimize d3d a ? b : 0
Foz-DB Navi21:
Totals from 3466 (4.34% of 79789) affected shaders:
MaxWaves: 73163 -> 73161 (-0.00%); split: +0.02%, -0.02%
Instrs: 3993862 -> 3987633 (-0.16%); split: -0.19%, +0.04%
CodeSize: 21747420 -> 21725620 (-0.10%); split: -0.15%, +0.05%
VGPRs: 190736 -> 190728 (-0.00%); split: -0.04%, +0.03%
SpillSGPRs: 489 -> 478 (-2.25%); split: -2.86%, +0.61%
Latency: 48169718 -> 48159068 (-0.02%); split: -0.05%, +0.02%
InvThroughput: 12132999 -> 12128721 (-0.04%); split: -0.05%, +0.01%
VClause: 78063 -> 78052 (-0.01%); split: -0.09%, +0.08%
SClause: 109095 -> 108996 (-0.09%); split: -0.13%, +0.04%
Copies: 265784 -> 264530 (-0.47%); split: -0.72%, +0.25%
Branches: 84533 -> 84553 (+0.02%)
PreSGPRs: 172577 -> 172531 (-0.03%); split: -0.19%, +0.16%
PreVGPRs: 165776 -> 165825 (+0.03%); split: -0.06%, +0.09%
VALU: 2851544 -> 2850426 (-0.04%); split: -0.08%, +0.04%
SALU: 413543 -> 408408 (-1.24%); split: -1.45%, +0.21%
VMEM: 139890 -> 139887 (-0.00%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:28 +00:00
Georg Lehmann
2e7f34af6b nir/opt_algebraic: optimize more ine/ieq(umin(b2i, ), 0)
Foz-DB Navi21:
Totals from 76 (0.10% of 79789) affected shaders:
MaxWaves: 1050 -> 1062 (+1.14%)
Instrs: 113754 -> 113691 (-0.06%); split: -0.11%, +0.06%
CodeSize: 605096 -> 605216 (+0.02%); split: -0.03%, +0.05%
VGPRs: 6024 -> 5976 (-0.80%)
Latency: 1776501 -> 1777519 (+0.06%); split: -0.06%, +0.12%
InvThroughput: 379644 -> 376751 (-0.76%)
SClause: 2132 -> 2134 (+0.09%)
Copies: 4131 -> 4128 (-0.07%); split: -1.77%, +1.69%
PreSGPRs: 4275 -> 4270 (-0.12%)
PreVGPRs: 5568 -> 5526 (-0.75%)
VALU: 86732 -> 86581 (-0.17%); split: -0.24%, +0.07%
SALU: 7112 -> 7198 (+1.21%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:28 +00:00
Georg Lehmann
7bc3062a3b nir/opt_algebraic: push comparisons with constants into bcsel with constant
Foz-DB Navi21:
Totals from 1657 (2.08% of 79789) affected shaders:
MaxWaves: 30275 -> 30261 (-0.05%); split: +0.01%, -0.05%
Instrs: 3316251 -> 3315701 (-0.02%); split: -0.04%, +0.02%
CodeSize: 17831924 -> 17832020 (+0.00%); split: -0.06%, +0.06%
SpillSGPRs: 815 -> 859 (+5.40%)
SpillVGPRs: 3335 -> 3293 (-1.26%)
Scratch: 231424 -> 230400 (-0.44%)
Latency: 33413310 -> 33402751 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 9116062 -> 9112904 (-0.03%); split: -0.04%, +0.00%
VClause: 65587 -> 65560 (-0.04%); split: -0.05%, +0.01%
SClause: 86208 -> 86261 (+0.06%); split: -0.02%, +0.08%
Copies: 356158 -> 356439 (+0.08%); split: -0.07%, +0.15%
PreSGPRs: 101710 -> 101806 (+0.09%); split: -0.01%, +0.11%
PreVGPRs: 89293 -> 89286 (-0.01%); split: -0.04%, +0.04%
VALU: 2220900 -> 2218839 (-0.09%); split: -0.11%, +0.01%
SALU: 472988 -> 474567 (+0.33%); split: -0.08%, +0.42%
VMEM: 118401 -> 118347 (-0.05%)
SMEM: 123597 -> 123592 (-0.00%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:27 +00:00
Georg Lehmann
3837bc6d16 nir/opt_algebraic: optimize ~a == ~b and ~a == #b
Foz-DB Navi21:
Totals from 2 (0.00% of 79789) affected shaders:
Instrs: 8343 -> 8323 (-0.24%)
CodeSize: 43884 -> 43764 (-0.27%)
Latency: 19390 -> 19363 (-0.14%)
InvThroughput: 3380 -> 3356 (-0.71%)
VALU: 5413 -> 5393 (-0.37%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:27 +00:00
Georg Lehmann
8759223498 nir/opt_algebraic: optimize b2i/b2f comparision with non 0/1 constants
Foz-DB Navi21:
Totals from 28 (0.04% of 79789) affected shaders:
MaxWaves: 732 -> 728 (-0.55%)
Instrs: 23425 -> 22559 (-3.70%)
CodeSize: 137740 -> 132292 (-3.96%)
VGPRs: 1128 -> 1144 (+1.42%)
Latency: 94604 -> 92423 (-2.31%)
InvThroughput: 19166 -> 18814 (-1.84%); split: -2.38%, +0.54%
VClause: 429 -> 423 (-1.40%)
SClause: 937 -> 926 (-1.17%)
Copies: 1199 -> 914 (-23.77%); split: -24.52%, +0.75%
Branches: 451 -> 421 (-6.65%)
PreSGPRs: 1043 -> 996 (-4.51%)
PreVGPRs: 992 -> 973 (-1.92%); split: -3.53%, +1.61%
VALU: 17566 -> 16865 (-3.99%)
SALU: 1254 -> 1157 (-7.74%)
VMEM: 619 -> 609 (-1.62%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:27 +00:00
Georg Lehmann
2bfcfef5da nir/opt_algebraic: optimize bcsel of b2f and constants
Foz-DB Navi21:
Totals from 212 (0.27% of 79789) affected shaders:
MaxWaves: 4024 -> 4030 (+0.15%)
Instrs: 1314134 -> 1313894 (-0.02%); split: -0.03%, +0.02%
CodeSize: 7033216 -> 7026888 (-0.09%); split: -0.10%, +0.01%
VGPRs: 14224 -> 14176 (-0.34%)
Latency: 7402062 -> 7399180 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 1724879 -> 1723773 (-0.06%); split: -0.07%, +0.00%
VClause: 37741 -> 37711 (-0.08%); split: -0.11%, +0.03%
SClause: 29266 -> 29268 (+0.01%); split: -0.01%, +0.01%
Copies: 123810 -> 123786 (-0.02%); split: -0.19%, +0.17%
Branches: 42370 -> 42407 (+0.09%); split: -0.03%, +0.11%
PreSGPRs: 13149 -> 13196 (+0.36%); split: -0.05%, +0.40%
PreVGPRs: 12407 -> 12395 (-0.10%)
VALU: 884471 -> 883475 (-0.11%); split: -0.12%, +0.01%
SALU: 177671 -> 178408 (+0.41%); split: -0.03%, +0.45%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>
2025-03-01 07:49:27 +00:00
Sil Vilerino
4a62f8bc75 d3d12: Cache the texture array cap requirement in encoder creation for calls to d3d12_video_create_dpb_buffer
Previously, the caps were not queried until after the first DPB buffer
was created. That causes the AOT path to be triggered even when the
device does report a requirement for texture array.

Tested-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33824>
2025-02-28 20:35:13 +00:00