Samuel Pitoiset
4428541c54
radv/meta: fix HTILE fixup after copying depth/stencil image copies
...
Typo, it should be false because it's after the copy.
Fixes: 4f41818194 ("radv/meta: add a function to fixup HTILE metadata for copies on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40270 >
2026-03-09 09:07:09 +00:00
Samuel Pitoiset
fff16a9748
radv: replace radv_sdma_surf by ac_sdma_surf
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:47 +00:00
Samuel Pitoiset
c40225e490
radv: tidy up radv_sdma_surf
...
Adjust a few things before replacing it by ac_sdma_surf.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:47 +00:00
Samuel Pitoiset
0616fd22a5
radv: simplify getting bpe for SDMA surfaces
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:46 +00:00
Samuel Pitoiset
9893ac3674
radv: remove unnecessary radv_sdma_surf::{blk_w,blk_h}
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:46 +00:00
Samuel Pitoiset
94acb7edd5
radv: simplify computing offset/extent of SDMA surfaces
...
By computing in elements earlier.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:45 +00:00
Samuel Pitoiset
5923a7b8c6
radv: use vk_image_buffer_copy_layout() for SDMA buf layout
...
For consistency with non-SDMA paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:45 +00:00
Samuel Pitoiset
02d047099e
radv: simplify 96-bit copies with SDMA
...
By adjusting offset/extent earlier.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:45 +00:00
Samuel Pitoiset
6f3b9a62b3
radv: remove redundant radv_sdma_surf::is_linear
...
is_linear is never used for buffers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:44 +00:00
Samuel Pitoiset
dba9809e0c
radv: remove redundant radv_sdma_surf::is_3d
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40186 >
2026-03-09 08:40:44 +00:00
Samuel Pitoiset
1a00587c44
radv: fix a GPU hang with PS epilogs and secondary command buffers
...
If the secondary changes the fragment output state and if the same
PS epilog used before ExecuteCommands() is re-bound immediately after
that call, the PS epilog state wouldn't be re-emitted.
Apply the same change for VS prologs, although the logic is slightly
different and the bug shouldn't occur. The whole logic of secondaries
should be completely rewritten because it's definitely not robust.
This fixes a GPU hang in Where Winds Meet, see
https://github.com/doitsujin/dxvk/issues/5436 .
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40221 >
2026-03-09 08:16:49 +00:00
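The failure mode described in the commit above can be illustrated with a minimal sketch. This is not RADV code: the `CmdBuffer` class, its fields, and the `invalidate` switch are hypothetical names chosen for illustration, and the assumption that the fix amounts to dropping the cached epilog after executing a secondary is ours, not something the commit message states.

```python
class CmdBuffer:
    """Toy model of a primary command buffer that skips redundant state emits."""

    def __init__(self):
        self.emitted_epilog = None  # epilog the tracker believes is programmed
        self.emits = []             # every epilog state actually emitted, in order

    def bind_epilog(self, epilog):
        # Redundancy filter: skip the emit if the tracker thinks it is current.
        if epilog == self.emitted_epilog:
            return
        self.emits.append(epilog)
        self.emitted_epilog = epilog

    def execute_secondary(self, secondary_epilog, invalidate=True):
        # The secondary programs its own fragment output state.
        self.emits.append(secondary_epilog)
        if invalidate:
            # Assumed fix: forget the cached epilog so the next bind re-emits.
            self.emitted_epilog = None


def run(invalidate):
    cb = CmdBuffer()
    cb.bind_epilog("A")
    cb.execute_secondary("B", invalidate=invalidate)
    cb.bind_epilog("A")  # same epilog re-bound right after the secondary
    return cb.emits


stale = run(invalidate=False)  # last emitted state is the secondary's
fixed = run(invalidate=True)   # epilog "A" is emitted again
```

Without the invalidation, the last state emitted is the one from the secondary even though the primary believes its own epilog is still active, which is the kind of mismatch the commit describes.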
Samuel Pitoiset
ac3fd06987
radv: always enable DISABLE_CONSERVATIVE_ZPASS_COUNTS on GFX11
...
This might cause incorrect occlusion query counts.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40235 >
2026-03-09 07:26:25 +00:00
Kenneth Graunke
952bf55483
nir: Fix divergence of Intel URB input/output handle intrinsics
...
Tessellation evaluation shaders have a single convergent URB handle
(for the common patch data) used by all lanes. Every other stage's
IO handles have separate handles in each lane.
Thanks to Alyssa Rosenzweig for catching this bug.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40280 >
2026-03-09 02:38:59 +00:00
Connor Abbott
6e3d805735
freedreno: Rename afuc to QRisc
...
In [1] the AQE is called the "Application QRisc Engine." Thus the real
name of afuc is QRisc. Rename everything.
[1] a698ebd321
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40287 >
2026-03-08 22:32:39 +00:00
Connor Abbott
554eec159b
freedreno/afuc: Update cread/cwrite syntax in README
...
We now print actual modifiers instead of mysterious flags. Remove the
remaining ones.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40287 >
2026-03-08 22:32:39 +00:00
Mel Henning
1371c53e6a
nvk: Expose VK_KHR_depth_clamp_zero_one
...
Promoted from EXT
Reviewed-By: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39812 >
2026-03-08 17:16:26 -04:00
Mel Henning
8e2707950b
nvk: Use the MME for cond rendering on Turing+
...
We can avoid the stalls from subc switches by avoiding using the copy engine
during vkCmdBeginConditionalRenderingEXT. Implement this by loading the
cond render value using the MME, since the hardware doesn't have a
suitable 32-bit comparison itself.
This brings the Sascha Willems conditionalrender demo from
1661 to 8334 fps on my Blackwell system with all meshes disabled.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40277 >
2026-03-08 17:31:32 +00:00
Mel Henning
905557ab31
nvk: Use SET_GLOBAL_RENDER_ENABLE
...
This brings the Sascha Willems conditionalrender demo from
927 to 1661 fps on my Blackwell system with all meshes disabled.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40277 >
2026-03-08 17:31:32 +00:00
Eric Engestrom
1c14a7f283
etnaviv/ci: fix expectation
...
Fixes: 3c0aa7c633b397ee8055 ("etnaviv/ci: update expectations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40286 >
2026-03-07 23:35:55 +00:00
Karol Herbst
65c5c4e1a1
nvk: run nir_opt_large_constants before nir_lower_load_const_to_scalar
...
nir_opt_large_constants isn't able to deal with complex derefs, and
nir_lower_load_const_to_scalar splits up vectors into scalars, for example.
This prevented nir_opt_large_constants from extracting large constants
in shaders that use, e.g., a constant table that is an array of vectors.
Totals:
CodeSize: 9460341008 -> 9443435056 (-0.18%); split: -0.20%, +0.02%
Number of GPRs: 47363466 -> 47300498 (-0.13%); split: -0.13%, +0.00%
SLM Size: 5409320 -> 1202912 (-77.76%)
Static cycle count: 6130972462 -> 6121193466 (-0.16%); split: -0.20%, +0.04%
Spills to reg: 184840 -> 184828 (-0.01%); split: -0.01%, +0.01%
Fills from reg: 223889 -> 223874 (-0.01%); split: -0.01%, +0.00%
Max warps/SM: 50637796 -> 50641540 (+0.01%); split: +0.01%, -0.00%
Totals from 32429 (2.79% of 1163204) affected shaders:
CodeSize: 824883920 -> 807977968 (-2.05%); split: -2.25%, +0.20%
Number of GPRs: 2413077 -> 2350109 (-2.61%); split: -2.61%, +0.00%
SLM Size: 4437016 -> 230608 (-94.80%)
Static cycle count: 1208715713 -> 1198936717 (-0.81%); split: -1.02%, +0.21%
Spills to reg: 11934 -> 11922 (-0.10%); split: -0.20%, +0.10%
Fills from reg: 14118 -> 14103 (-0.11%); split: -0.14%, +0.04%
Max warps/SM: 1035736 -> 1039480 (+0.36%); split: +0.37%, -0.01%
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14993
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40282 >
2026-03-07 23:21:40 +00:00
Karol Herbst
faea742c3a
nouveau/drm-shim: implement get_zcull_info
...
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40282 >
2026-03-07 23:21:40 +00:00
Lucas Fryzek
844c5a1ae6
lvp: Export -1 as sync fd
...
If the gallium context does not support `native_fence_fd`, we can still
support sync fd export/import by exporting -1 as sync_fd in vulkan.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211 >
2026-03-07 22:21:13 +00:00
Lucas Fryzek
420b934494
lvp: Mark opaque FD and dmabuf as compatible if supported
...
If dmabuf export is supported we can now mark them as compatible handle
types. Additionally we can always store the backed_fd for export.
v2 (zzyiwei): hide opaque fd compat with dmabuf export behind udmabuf
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211 >
2026-03-07 22:21:13 +00:00
Yiwei Zhang
848da336dd
lvp: hide import-only dmabuf support from zink
...
Zink assumes both import and export when the dmabuf extension is
advertised, so lavapipe has to hide the extension from zink when it
doesn't support both.
Together with the prior commit, now zink-on-lvp in the CI env without
udmabuf will no longer test against fake dmabuf support.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211 >
2026-03-07 22:21:13 +00:00
Yiwei Zhang
5ab8c8a439
lvp: avoid advertising dmabuf support for kms_swrast
...
Lavapipe relies on true udmabuf support for dmabuf export allocation.
This change aligns the behavior with both llvmpipe_allocate_memory_fd
and llvmpipe_import_memory_fd.
Fixes: 7d0a631f20 ("llvmpipe: export dmabuf caps for kms_swrast")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211 >
2026-03-07 22:21:12 +00:00
Mel Henning
bfde63e4d8
driconf: force_vk_vendor on No Man's Sky + NVK
...
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40278 >
2026-03-07 15:55:08 +00:00
Georg Lehmann
406935c6fe
radv: use nir_opt_fp_math_ctrl
...
Foz-DB Navi21:
Totals from 10833 (9.63% of 112497) affected shaders:
Instrs: 10090308 -> 10043030 (-0.47%); split: -0.48%, +0.01%
CodeSize: 53681564 -> 53556756 (-0.23%); split: -0.25%, +0.01%
VGPRs: 511568 -> 511296 (-0.05%); split: -0.08%, +0.03%
SpillSGPRs: 2442 -> 2438 (-0.16%); split: -0.20%, +0.04%
Latency: 58989785 -> 58935280 (-0.09%); split: -0.18%, +0.09%
InvThroughput: 15142587 -> 15067217 (-0.50%); split: -0.52%, +0.02%
VClause: 200588 -> 200410 (-0.09%); split: -0.20%, +0.11%
SClause: 257273 -> 257262 (-0.00%); split: -0.20%, +0.19%
Copies: 741430 -> 741397 (-0.00%); split: -0.22%, +0.22%
Branches: 211023 -> 211020 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 491752 -> 491663 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 418558 -> 418089 (-0.11%); split: -0.12%, +0.01%
VALU: 7064149 -> 7017847 (-0.66%); split: -0.66%, +0.01%
SALU: 1227287 -> 1226639 (-0.05%); split: -0.13%, +0.07%
SMEM: 449268 -> 449343 (+0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40098 >
2026-03-07 08:16:29 +01:00
Georg Lehmann
7c217e540c
nir: add a pass to optimize fp_math_ctrl
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40098 >
2026-03-07 08:16:27 +01:00
Georg Lehmann
042ee8dafc
panvk/ci: document new crashes on bifrost
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
f474e9853e
nir: add fp class analysis tests
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
4885e5cf3a
nir: remove more fsat using range analysis
...
Foz-DB Navi48:
Totals from 3018 (3.65% of 82636) affected shaders:
MaxWaves: 69274 -> 69280 (+0.01%)
Instrs: 7165414 -> 7157581 (-0.11%); split: -0.12%, +0.01%
CodeSize: 38890212 -> 38823132 (-0.17%); split: -0.18%, +0.00%
VGPRs: 228672 -> 228624 (-0.02%)
Latency: 64789026 -> 64784877 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 11805156 -> 11802642 (-0.02%); split: -0.02%, +0.00%
VClause: 136900 -> 136886 (-0.01%); split: -0.03%, +0.02%
SClause: 150135 -> 150130 (-0.00%); split: -0.01%, +0.01%
Copies: 574690 -> 574894 (+0.04%); split: -0.03%, +0.06%
Branches: 187169 -> 187086 (-0.04%); split: -0.04%, +0.00%
PreSGPRs: 190074 -> 190067 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 189564 -> 189538 (-0.01%); split: -0.02%, +0.00%
VALU: 3955188 -> 3949411 (-0.15%); split: -0.15%, +0.00%
SALU: 1114659 -> 1114729 (+0.01%); split: -0.02%, +0.03%
SMEM: 231080 -> 231077 (-0.00%); split: -0.00%, +0.00%
VOPD: 116150 -> 116180 (+0.03%); split: +0.04%, -0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
506bb5a609
nir/search_helpers: use fp class analysis more
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
a9e75d8ee4
nir: remove nir_analyze_fp_range
...
Use fp class analysis instead.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
eb431efc19
nir/search_helpers: switch to fp class analysis
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
58799c4e7c
nir/gather_tcs_info: use nir_analyze_fp_class directly
...
The information around positive one helps in theory.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
08cac48170
aco/isel: skip min/max for SALU fsat if possible
...
Foz-DB Navi48:
Totals from 789 (0.95% of 82636) affected shaders:
Instrs: 4144156 -> 4141345 (-0.07%); split: -0.07%, +0.00%
CodeSize: 23345212 -> 23333960 (-0.05%); split: -0.05%, +0.00%
Latency: 22988205 -> 22986666 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 4378321 -> 4377874 (-0.01%); split: -0.01%, +0.00%
Copies: 302311 -> 302313 (+0.00%); split: -0.00%, +0.00%
SALU: 647622 -> 645901 (-0.27%); split: -0.27%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
0ecf2c322e
nir: add fp class analysis for fround_even
...
Foz-DB Navi48:
Totals from 383 (0.33% of 114655) affected shaders:
MaxWaves: 9806 -> 9808 (+0.02%)
Instrs: 502508 -> 501762 (-0.15%); split: -0.16%, +0.01%
CodeSize: 2711404 -> 2707604 (-0.14%); split: -0.15%, +0.01%
VGPRs: 24360 -> 24348 (-0.05%)
Latency: 2068105 -> 2066817 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 370962 -> 370081 (-0.24%)
VClause: 7045 -> 7041 (-0.06%)
SClause: 10551 -> 10559 (+0.08%); split: -0.08%, +0.15%
Copies: 29135 -> 29117 (-0.06%); split: -0.12%, +0.05%
Branches: 17333 -> 17328 (-0.03%)
PreSGPRs: 21511 -> 21510 (-0.00%)
PreVGPRs: 18555 -> 18545 (-0.05%)
VALU: 274445 -> 273874 (-0.21%); split: -0.21%, +0.00%
SALU: 78819 -> 78779 (-0.05%); split: -0.07%, +0.02%
VMEM: 10918 -> 10913 (-0.05%)
SMEM: 17662 -> 17656 (-0.03%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
7509b4a199
nir: add fp class analysis for fsub
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
d8734e5453
nir: add fp class analysis for shadow compare
...
Foz-DB Navi48:
Totals from 145 (0.18% of 82636) affected shaders:
Instrs: 280871 -> 280729 (-0.05%)
CodeSize: 1545724 -> 1545488 (-0.02%); split: -0.02%, +0.00%
Latency: 10840265 -> 10840216 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2093707 -> 2093646 (-0.00%)
SClause: 4483 -> 4481 (-0.04%)
VALU: 188142 -> 188039 (-0.05%)
SALU: 22238 -> 22236 (-0.01%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
6d3a279a3b
nir: add fp class analysis for some intrinsics
...
I also tried ddx/ddy, but that was not worth it.
Foz-DB Navi48:
Totals from 1019 (1.23% of 82636) affected shaders:
Instrs: 516459 -> 515700 (-0.15%); split: -0.17%, +0.02%
CodeSize: 2712428 -> 2707008 (-0.20%); split: -0.21%, +0.01%
VGPRs: 70152 -> 70140 (-0.02%)
Latency: 1799198 -> 1795926 (-0.18%); split: -0.19%, +0.00%
InvThroughput: 233497 -> 232628 (-0.37%); split: -0.37%, +0.00%
VClause: 15315 -> 15346 (+0.20%); split: -0.11%, +0.31%
Copies: 30009 -> 30035 (+0.09%); split: -0.06%, +0.14%
VALU: 305519 -> 304727 (-0.26%); split: -0.27%, +0.01%
SALU: 45855 -> 45854 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
73bce23f65
nir: add fp class analysis for flog2
...
Foz-DB Navi48:
Totals from 230 (0.28% of 82636) affected shaders:
Instrs: 599005 -> 598615 (-0.07%); split: -0.09%, +0.02%
CodeSize: 3110528 -> 3103136 (-0.24%); split: -0.24%, +0.00%
Latency: 3661526 -> 3663241 (+0.05%); split: -0.01%, +0.05%
InvThroughput: 526561 -> 526487 (-0.01%); split: -0.01%, +0.00%
Copies: 33735 -> 33820 (+0.25%); split: -0.06%, +0.31%
VALU: 378034 -> 377904 (-0.03%); split: -0.03%, +0.00%
SALU: 65156 -> 65045 (-0.17%); split: -0.19%, +0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
81e272aa1d
nir: add fp class analysis for sin/cos
...
Foz-DB Navi48:
Totals from 264 (0.32% of 82636) affected shaders:
CodeSize: 1688676 -> 1688672 (-0.00%)
Latency: 510773 -> 510772 (-0.00%)
InvThroughput: 138569 -> 138568 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
5a298f3560
nir: rewrite fp range analysis as a fp class analysis
...
Knowing if a value is not larger than one helps proving finite
results of fmul/fadd and will allow skipping/creating more fsat.
Knowing that a value is larger than one helps proving non zero
results of fmul.
Separating positive and negative zero also has advantages when
signed zero correctness is required.
Foz-DB Navi48:
Totals from 1344 (1.63% of 82636) affected shaders:
Instrs: 5319389 -> 5312280 (-0.13%); split: -0.14%, +0.01%
CodeSize: 29702516 -> 29665684 (-0.12%); split: -0.13%, +0.01%
Latency: 40694344 -> 40694545 (+0.00%); split: -0.01%, +0.02%
InvThroughput: 7481192 -> 7480403 (-0.01%); split: -0.02%, +0.01%
VClause: 121947 -> 121946 (-0.00%); split: -0.00%, +0.00%
SClause: 104972 -> 104923 (-0.05%); split: -0.05%, +0.00%
Copies: 371098 -> 371092 (-0.00%); split: -0.02%, +0.02%
Branches: 122929 -> 122919 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 82506 -> 82510 (+0.00%); split: -0.00%, +0.01%
PreVGPRs: 79175 -> 79168 (-0.01%)
VALU: 2906718 -> 2904777 (-0.07%); split: -0.07%, +0.00%
SALU: 726256 -> 723454 (-0.39%); split: -0.39%, +0.00%
VMEM: 205021 -> 205016 (-0.00%)
SMEM: 163972 -> 163916 (-0.03%)
VOPD: 303354 -> 303298 (-0.02%); split: +0.02%, -0.04%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
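The reasoning in the commit above (a bound of one on the magnitude helps prove finite fmul results, a magnitude above one helps prove non-zero results) can be sketched as a small class lattice. This is a simplified illustration, not Mesa's implementation: the `FpClass` enum and `fmul_classes` helper are hypothetical, signs are ignored (so the positive/negative zero distinction the commit mentions is not modelled), and denormals are assumed to be preserved.

```python
from enum import Flag, auto


class FpClass(Flag):
    """Hypothetical, sign-agnostic value classes (not Mesa's real enum)."""
    NAN = auto()
    INF = auto()     # +/- infinity
    GT_ONE = auto()  # finite, |x| > 1
    LE_ONE = auto()  # finite, 0 < |x| <= 1
    ZERO = auto()    # +/- 0


NONE = FpClass(0)


def _mul_single(a, b):
    """Classes a product can land in when each operand has exactly one class."""
    if FpClass.NAN in (a, b):
        return FpClass.NAN
    pair = {a, b}
    if FpClass.INF in pair:
        # inf * 0 is NaN; inf times anything else stays infinite.
        return FpClass.NAN if FpClass.ZERO in pair else FpClass.INF
    if FpClass.ZERO in pair:
        return FpClass.ZERO
    if pair == {FpClass.LE_ONE}:
        # Magnitude can only shrink: finite, may underflow to zero.
        return FpClass.LE_ONE | FpClass.ZERO
    if pair == {FpClass.GT_ONE}:
        # Magnitude can only grow: non-zero, may overflow to infinity.
        return FpClass.GT_ONE | FpClass.INF
    # One operand above one, the other at or below one: finite and non-zero
    # (assuming denormals are not flushed).
    return FpClass.LE_ONE | FpClass.GT_ONE


def fmul_classes(a, b):
    """Union of possible result classes for a * b, given operand class sets."""
    result = NONE
    for x in FpClass:
        for y in FpClass:
            if x in a and y in b:
                result |= _mul_single(x, y)
    return result


# Both operands are known to be finite with magnitude at most one.
unit = FpClass.LE_ONE | FpClass.ZERO
bounded = fmul_classes(unit, unit)

# One operand has magnitude above one, the other is finite and non-zero.
nonzero = fmul_classes(FpClass.GT_ONE, FpClass.GT_ONE | FpClass.LE_ONE)
```

In this model `bounded` contains neither `INF` nor `NAN`, and `nonzero` does not contain `ZERO`, which are the two kinds of facts the commit says the new classes make provable.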
Georg Lehmann
32b5719a9f
nir/opt_algebraic: add is_not_uint_zero for b2i16(uge) pattern
...
More fallout from f2a59fdea6.
is_not_zero now always returns whether the result is a floating point zero.
When combined with the fp denorm handling that will be added to
floating point range analysis, this is false for many sensible integer values.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
ab773fc5d4
nir/opt_algebraic: fix frsq clamp pattern
...
The pattern is not NaN-correct.
Also make it 32-bit only because the constant is hard-coded to
FLT_MAX.
Fixes: 780b5c1037 ("nir/algebraic: Simplify some Inf and NaN avoidance code")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:42 +00:00
Georg Lehmann
ba30de1f97
nir/opt_algebraic: remove pattern that skips iabs with range analysis
...
Fixes: f2a59fdea6 ("nir: remove non float nir_analyse_range support")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:41 +00:00
Danylo Piliaiev
81a76be861
tu: Don't read .patch_input_gmem of unused attachment
...
There was duplicated code to set unscaled_input_fragcoord and a read
from VK_ATTACHMENT_UNUSED attachment, which incorrectly updated
builder->unscaled_input_fragcoord.
ubsan:
tu_pipeline.cc:4734:44: runtime error: load of value 127, which is not a valid value for type 'bool'
Seen in:
dEQP-VK.renderpasses.renderpass1.custom_resolve.monolithic.stencil_only_s8
Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264 >
2026-03-07 01:58:43 +00:00
Danylo Piliaiev
3bf3c1eb03
tu: Fix stomping of D/S test for custom resolve with D/S
...
D/S tests are disabled if the subpass doesn't declare D/S as used, but
when resolving D/S via a draw call, test/write has to be enabled.
Fixes D/S tests from:
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.*
Fixes: 5a3b0ce461 ("tu: avoid incorrect pipeline draw state for disabled depth/stencil attachments")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264 >
2026-03-07 01:58:43 +00:00
Danylo Piliaiev
67c54c4465
tu: Store gmem attachments after custom resolve in dyn RP
...
For a dynamic renderpass we create a fake second subpass,
which is used by CmdBeginCustomResolveEXT. However,
CmdBeginCustomResolveEXT doesn't trigger tile stores, and
attachments didn't know they should be stored after the fake
custom resolve subpass.
Fixes: 520e3f3a47 ("tu: Implement VK_EXT_custom_resolve")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264 >
2026-03-07 01:58:43 +00:00
Danylo Piliaiev
7eb79f0740
tu: vk_dont_care_as_load should not affect internal DONT_CARE cases
...
It shouldn't affect attachments created for VK_EXT_custom_resolve and
VK_EXT_multisampled_render_to_single_sampled.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264 >
2026-03-07 01:58:43 +00:00