Commit graph

201327 commits

Author SHA1 Message Date
Alyssa Rosenzweig
af2a796b13 util/ralloc: add total_size helper
It's useful to determine how much memory a nir_shader consumes for debugging
memory bloat, particularly for persistent NIR library. Add a helper that lets us
compute this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
2024-10-30 12:59:10 +00:00
Alyssa Rosenzweig
33299354e0 nir/opt_algebraic: optimize patterns hit with OpenCL
This patterns were all found in the AGX quads tessellator, a medium-sized OpenCL
kernel. LLVM generates a lot of garbage around booleans which we need to chew
through. Though there's nothing AGX or really OpenCL specific here, so some of
this could help graphics shaders too.

Together, their effect is significant for that kernel instr count & occupancy:

before: 2966 inst, 2310 alu, 2310 fscib, 1216 ic, 23148 bytes, 239 regs, 384 threads
after:  2848 inst, 2246 alu, 2246 fscib, 1000 ic, 22260 bytes, 231 regs, 448 threads

No significant changes on GL shaderdb (a single godot shader regressed 1
instruction, 1344->1345).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
2024-10-30 12:59:10 +00:00
Samuel Pitoiset
fc0545e6a7 radv: fix wrong index in radv_skip_graphics_pipeline_compile()
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12089
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31901>
2024-10-30 11:25:59 +00:00
Boris Brezillon
32ccec7450 pan/cs: Fix lazy allocation support
Commit 0e6aaab00a ("pan/cs: add block to handle registers backup in
exception handler") broke the lazy allocation case by checking the
current chunk capacity too early.

Fixes: 0e6aaab00a ("pan/cs: add block to handle registers backup in exception handler")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Tested-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Tested-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Tested-by: Benjamin Re <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31884>
2024-10-30 10:19:21 +00:00
Benjamin Lee
ffe68eb225 panvk: flush sync point before executing secondary cmdbufs
Secondary command buffers need seqno registers to hold the current value
at the start of command buffer execution in order to calculate correct
wait values.

Fixes: c2299b6642 ("panvk/csf: Implement vkCmdExecuteCommands")
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31813>
2024-10-30 09:57:53 +00:00
Benjamin Lee
2b1ec1c35d panvk: allow resuming secondary cmdbufs with dynamic rendering
The removed assertion was originally added to enforce

> VUID-vkCmdExecuteCommands-pCommandBuffers-00100:
>  If vkCmdExecuteCommands is not being called within a render pass
>  instance, each element of pCommandBuffers must not have been recorded
>  with the VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT

However, if a render pass instance is entered with vkCmdBeginRendering,
vk_command_buffer::render_pass is unset.

Code change was done by Boris, only commit description was added.

Fixes: c2299b6642 ("panvk/csf: Implement vkCmdExecuteCommands")
Co-authored-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31813>
2024-10-30 09:57:53 +00:00
Daniel Schürmann
62715984f8 aco/README: add descriptions of recently added passes
... and less recent ones.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Daniel Schürmann
21ceeb22ed aco: move jump threading optimization into separate pass
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Daniel Schürmann
87a3c08df1 aco/ssa_elimination: remove some redundant checks during jump threading
Since phis got already lowered to parallelcopies by this point,
there is no need to cross-check.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Daniel Schürmann
a6c38f706d aco/ssa_elimination: perform jump threading after parallelcopy insertion
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Erik Faye-Lund
b63dab29f0 panvk: expose EXT_depth_clip_enable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31886>
2024-10-30 09:55:56 +01:00
Erik Faye-Lund
e6174e6139 panvk/csf: respect depth-clip state
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31886>
2024-10-30 09:55:50 +01:00
Erik Faye-Lund
117283cdf8 panvk/jm: respect depth-clip state
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31886>
2024-10-30 09:55:43 +01:00
Erik Faye-Lund
0ebb1b737c panvk: drop duplicate dirty-test
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31886>
2024-10-30 09:55:17 +01:00
David Rosca
84bce1af41 radeonsi: Support HEVC features and block sizes for UVD
Features are the same as VCN 1.0, block sizes are different.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31872>
2024-10-30 07:13:30 +00:00
David Rosca
4f31625aa6 radeonsi/uvd_enc: Allocate session buffer in VRAM
Improves encoding performance on dGPUs.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31872>
2024-10-30 07:13:30 +00:00
David Rosca
079ff0a9df radeonsi: Enable VIDEO_CAP_ENC_SUPPORTS_ASYNC_OPERATION on VCE/UVD
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31872>
2024-10-30 07:13:30 +00:00
David Rosca
1921473f1f radeonsi/vce: Implement fence_wait
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31872>
2024-10-30 07:13:30 +00:00
David Rosca
375ecea7b5 radeonsi/uvd_enc: Implement fence_wait
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31872>
2024-10-30 07:13:29 +00:00
Eric Engestrom
e69aba2cde freedreno/ci: add nightly freedreno gl testing on a750
Not very stable (got a hang 7/20 times while stress-testing), but it's
probably still useful in nightly, especially given how quick it is.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31849>
2024-10-29 20:31:48 +00:00
Eric Engestrom
1bfbc3abf6 freedreno/ci: abort a750 testing when a hang is detected
There's no point continuing just to get a massive number of fails.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31849>
2024-10-29 20:31:48 +00:00
Samuel Pitoiset
4459a1d210 radv: resize the SPM bo when it's too small
This used to abort (see the previous commit) when the hardware wasn't
able to sample all SPM counters because the BO was too small. The SPM
BO can now be resized like the SQTT BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883>
2024-10-29 18:33:17 +00:00
Samuel Pitoiset
e14511f77d ac/spm: do not abort when the SPM BO is too small
It needs to be resized instead, like the SQTT BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883>
2024-10-29 18:33:17 +00:00
Chia-I Wu
cc1c663152 panvk: disable depth write when depth test is disabled
The spec says

  depthWriteEnable controls whether depth writes are enabled when
  depthTestEnable is VK_TRUE. Depth writes are always disabled when
  depthTestEnable is VK_FALSE.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-By: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31878>
2024-10-29 18:08:03 +00:00
Eric Engestrom
09a2de2a51 egl: error out during setup if the configuration is invalid
If EGL is built, it needs a driver; if we don't abort setup here,
the compilation fails with:

    ld: error: undefined symbol: _eglDriver

See #11397 or #11956 for instance.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31870>
2024-10-29 17:25:08 +00:00
Marek Olšák
16aec27515 radeonsi: simplify util_rast_prim_is_lines_or_triangles
PATCHES can't occur here.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
73abbf1175 radeonsi: rewrite how small prim precision is passed to culling code
Instead of passing 2 different 4-bit precision values via the SGPR, pass
the quant mode enum + log_samples as 3 bits, and 2-bit log_samples
separately. This saves 3 bits in the SGPR, which we'll need for culling
states.

This completely changes how the small prim precision is computed from
the state bits.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
4f096b994d ac/nir,radeonsi: use load_cull_line_viewport_xy_scale_and_offset_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
0f39d44f1b ac/nir,radeonsi: use load_cull_small_line_precision_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
10c6f87adb ac/nir,radeonsi: use load_cull_small_lines_enabled_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
ee452129c6 nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
2227f5be9d nir: rename load_cull_small_primitive_precision -> triangle, add line_precision
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
0914e0d02f nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Lu Yao
0442a6c292 ac/radeonsi: compute htile for tile mode RADEON_SURF_MODE_1D on GFX6-8
Computing 'htile_size/meta_size' is allowed for RADEON_SURF_MODE_1D when
RADEON_SURF_TC_COMPATIBLE_HTILE isn't set.
Lacking of computing causes performance degradation in some scenarios.

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31617>
2024-10-29 16:23:51 +00:00
Sagar Ghuge
17096f87c1 intel: Switch to COMPUTE_WALKER_BODY
Stuff COMPUTE_WALKER_BODY in COMPUTER_WALKER in both iris and anv.

This also fixes the tracepoint for ray dispatches. Stuffing
COMPUTE_WALKER_BODY allow us to set the
cmd_buffer->state.last_compute_walker.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31822>
2024-10-29 15:54:43 +00:00
Georg Lehmann
938f5ec7ce radv: use nir_opt_fragdepth
Cyberpunk 2077 writes unmodified depth.

Foz-DB Navi21:
Totals from 28 (0.04% of 79395) affected shaders:
Instrs: 6484 -> 6448 (-0.56%)
CodeSize: 36016 -> 35784 (-0.64%)
Latency: 58517 -> 58400 (-0.20%)
InvThroughput: 7719 -> 7717 (-0.03%)
Branches: 129 -> 119 (-7.75%)
PreVGPRs: 394 -> 372 (-5.58%)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31874>
2024-10-29 15:15:24 +00:00
Benjamin Lee
e52a599eba panvk: fix combined image/sampler descriptor arrays
For combined image/sampler descriptors, each user-facing descriptor gets
two entries in the descriptor table. Indexes must be strided to account
for this.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31777>
2024-10-29 13:07:56 +00:00
cheyang
122fd46b15 Android15 support gralloc IMapper5
In Android15 libui.so the vendor partition can access.
so use GraphicBufferMapper load mapper4 or mapper5.
still using U_GRALLOC_TYPE_GRALLOC4 because GraphicBufferMapper
load mapper5 fail will rollback loading mapper4

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11091

Signed-off-by: cheyang <cheyang@bytedance.com>
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31766>
2024-10-29 12:32:04 +00:00
YaoBing Xiao
b63dfcc172 vulkan/x11: use xcb_connection_has_error to check for failue
xcb_connectxx() always returns a non-NULL pointer to a
xcb_connection_t, even on failure.

cc: mesa-stable

Signed-off-by: YaoBing Xiao <xiaoyaobing@uniontech.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31812>
2024-10-29 11:52:37 +00:00
Georg Lehmann
d6535f2602 nir/opt_algebraic: create ubfe with non constant mask
Foz-DB Navi21:
Totals from 278 (0.35% of 79395) affected shaders:
MaxWaves: 7444 -> 7448 (+0.05%)
Instrs: 316069 -> 314584 (-0.47%); split: -0.47%, +0.00%
CodeSize: 1608064 -> 1593204 (-0.92%)
VGPRs: 11128 -> 11120 (-0.07%)
Latency: 796599 -> 797786 (+0.15%); split: -0.19%, +0.34%
InvThroughput: 141195 -> 139472 (-1.22%); split: -1.22%, +0.00%
Copies: 28565 -> 29796 (+4.31%); split: -0.15%, +4.46%
PreSGPRs: 14335 -> 14336 (+0.01%)
VALU: 161342 -> 159426 (-1.19%)
SALU: 87794 -> 88305 (+0.58%); split: -0.03%, +0.61%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>
2024-10-29 10:51:10 +00:00
Timur Kristóf
be68aeafdc nir/opt_algebraic: Add various bitfield extract patterns.
v2 (Georg Lehmann):
- fixed incorrect imin in ubfe_ubfe
- simplied outer_bits of ushr((ubfe, ...), ...) opt
- added is_used_once to iand(ushr(), ...) opt to improve stats

For-DB Navi21:
Totals from 3309 (4.18% of 79206) affected shaders:
Instrs: 5295291 -> 5282128 (-0.25%); split: -0.28%, +0.03%
CodeSize: 28299320 -> 28298456 (-0.00%); split: -0.07%, +0.06%
Latency: 51566173 -> 51521923 (-0.09%); split: -0.09%, +0.01%
InvThroughput: 13222050 -> 13204557 (-0.13%); split: -0.14%, +0.01%
VClause: 116451 -> 116458 (+0.01%); split: -0.02%, +0.02%
SClause: 160356 -> 160324 (-0.02%); split: -0.03%, +0.01%
Copies: 424152 -> 423670 (-0.11%); split: -0.20%, +0.09%
Branches: 156701 -> 156192 (-0.32%); split: -0.33%, +0.01%
PreSGPRs: 168507 -> 168500 (-0.00%); split: -0.02%, +0.01%
PreVGPRs: 151477 -> 151474 (-0.00%)
VALU: 3486077 -> 3476675 (-0.27%); split: -0.31%, +0.04%
SALU: 786467 -> 783109 (-0.43%); split: -0.45%, +0.03%
VMEM: 188035 -> 188060 (+0.01%)
SMEM: 259632 -> 259630 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>
2024-10-29 10:51:09 +00:00
Erik Faye-Lund
78f23bf295 panfrost: add an assert in render-target setup
This code isn't really wrong, but it makes some assumptions that are a
bit hard to grok. Let's thread a bit more carefully, by adding an assert
that hopefully clears things up a tad.

We area after all choosing in the range of RAW8 to RAW128.

CID: 1605056
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:16 +00:00
Erik Faye-Lund
a62e80ce11 panfrost: drop needless assign
We are overwriting l right after the conditional block, so let's just
drop this for simplicity.

CID: 1529404
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:16 +00:00
Erik Faye-Lund
8b619b2360 panvk/csf: only look at fs if it's required
If the FS isn't used, there's no reason to consult it. This was inspired
by a coverity report, which was technically wrong, but made me look
closer at the code.

CID: 1620442
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:16 +00:00
Erik Faye-Lund
712c11fc17 panvk: assert on missing vs
A vs is always required, and we already dereference it in this function
unconditionally. Let's add an assert to be sure, and drop the run-time
check here.

CID: 1620449
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:16 +00:00
Erik Faye-Lund
9ad9d9ac68 panvk: put conditional outside of define
While this is perfectly valid, stuffing the conditional into the define
makes us evaluate it over and over again, causing some warnings about
nonsensical compares with Coverity.

CID: 1618771
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:16 +00:00
Erik Faye-Lund
19fdfd6429 panvk: drop needless assert
The value can't be larger than 31 here anyway, due to the bitfield
width. So the assert is completely needless.

CID: 1633082
Fixes: b8bfbbdf66 ("panvk: check against texfeat_bit")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:16 +00:00
Erik Faye-Lund
103ad15ece panvk: avoid signed integer underflow
This is undefined behavior, let's use unsigned integer underflow
instead.

CID: 1605124
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31767>
2024-10-29 10:13:15 +00:00
Georg Lehmann
695d2414cd nir,radv: optimize shared atomic offsets
Foz-DB Navi21:
Totals from 87 (0.11% of 79395) affected shaders:
Instrs: 140877 -> 140873 (-0.00%)
CodeSize: 747760 -> 747164 (-0.08%); split: -0.09%, +0.01%
Latency: 4528171 -> 4528162 (-0.00%)
InvThroughput: 826358 -> 826349 (-0.00%)
Copies: 10888 -> 10884 (-0.04%)
VALU: 84634 -> 84630 (-0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080>
2024-10-29 09:31:08 +00:00
Georg Lehmann
a2baff4810 ac/llvm: handle shared atomic base offset
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080>
2024-10-29 09:31:08 +00:00