Commit graph

218579 commits

Author SHA1 Message Date
Georg Lehmann
2355b63cb5 nir/opt_algebraic: use better float control for some fcmp patterns
Foz-DB Navi48:
Totals from 1084 (1.32% of 82405) affected shaders:
Instrs: 1969973 -> 1968947 (-0.05%); split: -0.08%, +0.02%
CodeSize: 11349704 -> 11344884 (-0.04%); split: -0.06%, +0.02%
VGPRs: 59076 -> 59064 (-0.02%); split: -0.06%, +0.04%
Latency: 20766031 -> 20755032 (-0.05%); split: -0.07%, +0.01%
InvThroughput: 2849402 -> 2846733 (-0.09%); split: -0.10%, +0.01%
VClause: 40736 -> 40740 (+0.01%)
SClause: 91835 -> 91832 (-0.00%)
Copies: 217961 -> 217868 (-0.04%); split: -0.07%, +0.02%
Branches: 60045 -> 60031 (-0.02%)
PreSGPRs: 50639 -> 50618 (-0.04%); split: -0.06%, +0.02%
PreVGPRs: 39593 -> 39590 (-0.01%); split: -0.01%, +0.01%
VALU: 960270 -> 959524 (-0.08%); split: -0.10%, +0.02%
SALU: 326638 -> 326680 (+0.01%); split: -0.04%, +0.06%
VOPD: 23963 -> 23929 (-0.14%); split: +0.04%, -0.18%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
7238888d93 nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
We already have patterns to move the negation to the constant.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
03c497f236 nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
Foz-DB Navi48:
Totals from 50 (0.06% of 82405) affected shaders:
CodeSize: 137072 -> 137456 (+0.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
79e4530a9b nir/opt_algebraic: make pattern pushing fmul into bcsel exact
The only special case here is d == -0.0.

Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
a3bc94a3d0 nir/opt_algebraic: remove inexact from floor->trunc pattern
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
da7abb1337 nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
ea87f1f9bc nir/opt_algebraic: add a - a with nnan
Foz-DB Navi48:
Totals from 576 (0.70% of 82405) affected shaders:
MaxWaves: 16706 -> 16726 (+0.12%)
Instrs: 618677 -> 580965 (-6.10%); split: -6.10%, +0.00%
CodeSize: 3022552 -> 2861612 (-5.32%); split: -5.33%, +0.00%
VGPRs: 28008 -> 28860 (+3.04%); split: -0.51%, +3.56%
Latency: 2689318 -> 2655887 (-1.24%); split: -1.25%, +0.01%
InvThroughput: 403512 -> 393404 (-2.51%); split: -2.51%, +0.00%
VClause: 7584 -> 7577 (-0.09%); split: -0.17%, +0.08%
SClause: 19974 -> 19086 (-4.45%); split: -4.48%, +0.03%
Copies: 43862 -> 40888 (-6.78%); split: -6.87%, +0.09%
Branches: 12457 -> 11407 (-8.43%)
PreSGPRs: 28315 -> 27046 (-4.48%); split: -4.53%, +0.05%
PreVGPRs: 20751 -> 19397 (-6.52%)
VALU: 317224 -> 290151 (-8.53%); split: -8.53%, +0.00%
SALU: 124297 -> 121347 (-2.37%); split: -2.39%, +0.02%
VMEM: 11918 -> 11907 (-0.09%)
SMEM: 27582 -> 26241 (-4.86%)
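
For illustration only (not part of the commit): a minimal Python sketch of why `a - a` can only be folded to `0.0` when NaN can be ruled out. It assumes `nnan` lets the optimizer assume that no NaN shows up as an operand or result, and it relies on Python floats being IEEE 754 doubles. `sub_self` is a hypothetical helper name.

```python
import math


def sub_self(a: float) -> float:
    """Evaluate a - a the way the unoptimized shader would."""
    return a - a


# Finite inputs always give +0.0, so folding to 0.0 is safe for them.
# Inf and NaN inputs give NaN, which is why the fold needs NaN to be ruled out.
for value in (1.5, -0.0, math.inf, math.nan):
    print(f"{value!r} - {value!r} = {sub_self(value)!r}")
```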

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
16db9f79d1 nir/opt_algebraic: remove inexact a * 0.0 patterns
We already have some with nnan,nsz.
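
For illustration only (not part of the commit): a small Python sketch of why folding `a * 0.0` to `0.0` plausibly needs both flags. It assumes `nnan` means NaN can be ignored and `nsz` means the sign of zero can be ignored. `mul_zero` is a hypothetical helper name.

```python
import math


def mul_zero(a: float) -> float:
    """Evaluate a * 0.0 under IEEE 754 rules."""
    return a * 0.0


def is_negative_zero(x: float) -> bool:
    """True only for -0.0."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


# A negative finite input yields -0.0 (a signed-zero difference),
# and an infinite or NaN input yields NaN.
for value in (2.0, -2.0, math.inf, math.nan):
    print(f"{value!r} * 0.0 = {mul_zero(value)!r}")
```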

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
63d199a01e nir: remove special fp_math_ctrl rules
All opcodes should now respect the nan/inf/sz preserving flags.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
e443229644 nir/opt_algebraic: mark newly created fmulz nan/inf preserving
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
b678899ef8 nir/opt_algebraic: use nan/inf/sz preserve flags instead of exact for cmp/min/max replacement
And remove some, because they should be covered by the search pattern anyway.

Foz-DB Navi48:
Totals from 560 (0.68% of 82405) affected shaders:
MaxWaves: 11279 -> 11291 (+0.11%)
Instrs: 5214229 -> 5214386 (+0.00%); split: -0.02%, +0.02%
CodeSize: 29613884 -> 29616740 (+0.01%); split: -0.01%, +0.02%
VGPRs: 50400 -> 50328 (-0.14%)
Latency: 36481700 -> 36481157 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7309905 -> 7307905 (-0.03%); split: -0.05%, +0.02%
VClause: 131423 -> 131424 (+0.00%); split: -0.00%, +0.00%
SClause: 111485 -> 111499 (+0.01%); split: -0.00%, +0.01%
Copies: 441899 -> 442029 (+0.03%); split: -0.02%, +0.05%
Branches: 165599 -> 165597 (-0.00%)
PreVGPRs: 43558 -> 43525 (-0.08%)
VALU: 2573609 -> 2573324 (-0.01%); split: -0.03%, +0.02%
SALU: 851172 -> 851271 (+0.01%); split: -0.01%, +0.02%
VOPD: 366409 -> 366934 (+0.14%); split: +0.23%, -0.08%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
a8ad72b912 nir/search: add option to set nan/inf/sz preserve on replacement patterns
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
7e22c43f65 glsl: make fmin/fmax/fsat nan/inf preserving
Probably not needed, but it makes piglit happy, so whatever.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
ef5e58e513 glsl: make fp (not) equal always nan/inf preserving
Outside of isnan/isinf this shouldn't be needed, but at this point
they were already lowered.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
c5bf979d71 glsl: preserve inf/nan for precise/invariant
Not sure if this is actually required, but it matches previous behavior in NIR,
and some piglit tests expect this.
Notably GL-CTS does not need this, so maybe piglit is just broken.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
52eab085e6 nir/lower_uniform_subgroup: use nan/inf preserve instead of exact for feq
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
30da75e8b1 nir/lower_double_ops: don't create more exact ops than the input requires
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
e2301164c7 nir/format_convert: use nan/inf preserve flag for fmax instead of exact
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
a87cdfc6b7 radv/nir/rt: preserve inf/nan for emulated RT intersect
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
7a23ff9cf8 gallium/ttn: use nan/inf preserve instead of exact for kill's flt
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
c94d666943 mesa/prog_to_nir: use nan/inf preserve instead of exact for kill's flt
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
1fe4d799e7 spirv: use nan/inf preserve for glsl.std.450 min/max instead of exact
Foz-DB Navi48:
Totals from 135 (0.16% of 82405) affected shaders:
Instrs: 546831 -> 546552 (-0.05%); split: -0.05%, +0.00%
CodeSize: 3038664 -> 3037392 (-0.04%); split: -0.05%, +0.00%
Latency: 4360757 -> 4357294 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 753593 -> 752997 (-0.08%)
Copies: 57180 -> 57207 (+0.05%)
VALU: 300705 -> 300513 (-0.06%)
SALU: 71339 -> 71364 (+0.04%)
VOPD: 30002 -> 29999 (-0.01%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
7c5a5755e2 spirv: use nan/inf preserve instead of exact for fp compare
Foz-DB Navi48:
Totals from 438 (0.53% of 82405) affected shaders:
MaxWaves: 13164 -> 13076 (-0.67%)
Instrs: 259008 -> 257978 (-0.40%); split: -0.82%, +0.42%
CodeSize: 1415756 -> 1416404 (+0.05%); split: -0.22%, +0.27%
VGPRs: 21732 -> 21852 (+0.55%); split: -0.11%, +0.66%
Latency: 911833 -> 916968 (+0.56%); split: -0.20%, +0.76%
InvThroughput: 149739 -> 148995 (-0.50%); split: -0.99%, +0.49%
VClause: 4512 -> 4517 (+0.11%); split: -0.04%, +0.16%
SClause: 5429 -> 5452 (+0.42%); split: -0.31%, +0.74%
Copies: 11953 -> 11995 (+0.35%); split: -0.51%, +0.86%
PreSGPRs: 16326 -> 16321 (-0.03%); split: -0.04%, +0.01%
PreVGPRs: 14929 -> 14930 (+0.01%); split: -0.45%, +0.46%
VALU: 158092 -> 156926 (-0.74%); split: -1.31%, +0.57%
SALU: 25711 -> 25559 (-0.59%); split: -0.82%, +0.23%
VOPD: 76 -> 74 (-2.63%)

The regressions are in d3d9 shaders where fmulz is no longer reassociated,
because it now has the nan/inf preserve flags. This will be fixed later in the series.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
e873b8764a aco/optimizer: use nan preserve flag to prevent incorrect med3
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Eric Engestrom
88a1887cc2 Revert "add VK CTS validation report for a0 interpolation fix"
This reverts commit 4eeda739c4.

Looks like @ajax has been playing with claude "ai" and accidentally
committed and pushed things to main.

1. That doesn't belong here.
2. We shouldn't bypass MR review unless in emergency situations.

This seems like a good time to remind people to not have a push-able
upstream remote, to avoid an accidental `git push` or something else
doing that.
One way to do this is to change the push url like this, assuming your
upstream remote is called `origin` (default if you didn't pick
something else):

    git remote set-url --push origin invalid-url

`invalid-url` will fail if you try to push to it, catching your mistakes :)

Spotted-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39810>
2026-02-10 18:07:18 +00:00
Rob Clark
4a654aee7c freedreno/decode: Keep interactive for query mode
I'm not sure I remember the original rationale for disabling this in
query mode.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
2026-02-10 16:17:40 +00:00
Rob Clark
4e3da8e56b freedreno/decode: Filter redundant _HI regs
If querying based on a regexp pattern, both FOO_BASE and the synthesized
FOO_BASE_HI could match the regex.  But we don't really want to see both
separately, we just want the combined 64b value.
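
For illustration only (not the actual decoder code): a rough Python sketch of the filtering idea, assuming a `_HI` register is dropped whenever its base register also matches the query. `matching_regs` and the sample register list are hypothetical.

```python
import re
from typing import List


def matching_regs(pattern: str, names: List[str]) -> List[str]:
    """Return register names matching pattern, without redundant _HI halves."""
    rx = re.compile(pattern)
    matched = [name for name in names if rx.search(name)]
    matched_set = set(matched)
    result = []
    for name in matched:
        # Skip FOO_BASE_HI when FOO_BASE itself matched: the combined
        # 64b value is what the user wants to see.
        if name.endswith("_HI") and name[: -len("_HI")] in matched_set:
            continue
        result.append(name)
    return result


print(matching_regs(r"FOO_BASE", ["FOO_BASE", "FOO_BASE_HI", "BAR"]))
```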

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
2026-02-10 16:17:40 +00:00
Rob Clark
07feb1ca4c freedreno/decode: Split out endswith() helper
Rename the original, and split the string suffix checking into its own
helper so we can re-use it.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
2026-02-10 16:17:39 +00:00
Rob Clark
45cccf6e42 freedreno/decode: Shorten query string
The output in query mode can take many columns.  Tighten it up a bit,
going from:

  12: DI_PT_TRILIST(0,0-1381,1381):98304:RM6_BIN_VISIBILITY:CP_SET_THREAD_BR:   00000040!+      RB_BIN_FOVEAT: { BINSCALEEN }

to

  12: TRILIST(0,0-1381,1381):98304:BIN_VISIBILITY:BR:   00000040!+      RB_BIN_FOVEAT: { BINSCALEEN }

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
2026-02-10 16:17:39 +00:00
Karol Herbst
faf3a93e8f vtn: set default fp_math_ctrl values for kernels
The kernel capability has the `FPFastMathMode` decoration, but not the
`FPFastMathDefault` execution mode, so a SPIR-V module not using
`SPV_KHR_float_controls2` has no way of setting any defaults.

Fixes: 9da2d21804 ("vtn: implement default fp_math_ctrl without using execution mode")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39790>
2026-02-10 15:14:57 +00:00
José Roberto de Souza
4df142acb6 iris: Fix invalid reads when uploading blend state
blend_entries is already a uint32_t array, so its length should only be
rt_dwords. This fixes invalid reads in the line below, where the memcpy
length is sizeof(blend_entries).

Valgrind warning:
==176211== Invalid read of size 2
==176211==    at 0x485A7CF: memmove (vg_replace_strmem.c:1415)
==176211==    by 0x810035C: iris_upload_dirty_render_state (iris_state.c:7643)
==176211==    by 0x8141B5B: iris_upload_render_state (iris_state.c:9055)
==176211==    by 0x8D19941: iris_simple_draw_vbo (iris_draw.c:195)
==176211==    by 0x8D1A034: iris_draw_vbo (iris_draw.c:346)
==176211==    by 0x7290116: tc_call_draw_single (u_threaded_context.c:3826)
==176211==    by 0x7290116: batch_execute (u_threaded_context_calls.h:11)
==176211==    by 0x7290116: tc_batch_execute (u_threaded_context.c:5344)
==176211==    by 0x7281583: _tc_sync (u_threaded_context.c:744)
==176211==    by 0x72868A2: tc_texture_map (u_threaded_context.c:2847)
==176211==    by 0x6975C23: pipe_texture_map_3d (u_inlines.h:680)
==176211==    by 0x697701C: st_ReadPixels (st_cb_readpixels.c:537)
==176211==    by 0x68E97C4: read_pixels (readpix.c:1216)
==176211==    by 0x68E97C4: _mesa_ReadnPixelsARB (readpix.c:1233)
==176211==    by 0x68E9889: _mesa_ReadPixels (readpix.c:1248)
==176211==  Address 0x1a792240 is 0 bytes after a block of size 144 alloc'd
==176211==    at 0x484C7A8: malloc (vg_replace_malloc.c:446)
==176211==    by 0x80A6829: iris_create_blend_state (iris_state.c:1815)
==176211==    by 0x7282E2F: tc_create_blend_state (u_threaded_context.c:1408)
==176211==    by 0x71F4FB0: cso_set_blend (cso_context.c:541)
==176211==    by 0x695CE93: st_update_blend (st_atom_blend.c:351)
==176211==    by 0x64C6C0B: st_validate_state (st_util.h:129)
==176211==    by 0x64C6D91: st_prepare_draw (st_draw.c:88)
==176211==    by 0x67F1D16: _mesa_draw_arrays (draw.c:1176)
==176211==    by 0x67F2795: _mesa_DrawArrays (draw.c:1386)
==176211==    by 0x4973339: stub_glDrawArrays (piglit-dispatch-gen.c:12483)
==176211==    by 0x49ECF1C: piglit_draw_rect_from_arrays (piglit-util-gl.c:746)
==176211==    by 0x49ED3A6: piglit_draw_rect_custom (piglit-util-gl.c:868)
==176211==
==176211== Invalid read of size 2
==176211==    at 0x485A7C0: memmove (vg_replace_strmem.c:1415)
==176211==    by 0x810035C: iris_upload_dirty_render_state (iris_state.c:7643)
==176211==    by 0x8141B5B: iris_upload_render_state (iris_state.c:9055)
==176211==    by 0x8D19941: iris_simple_draw_vbo (iris_draw.c:195)
==176211==    by 0x8D1A034: iris_draw_vbo (iris_draw.c:346)
==176211==    by 0x7290116: tc_call_draw_single (u_threaded_context.c:3826)
==176211==    by 0x7290116: batch_execute (u_threaded_context_calls.h:11)
==176211==    by 0x7290116: tc_batch_execute (u_threaded_context.c:5344)
==176211==    by 0x7281583: _tc_sync (u_threaded_context.c:744)
==176211==    by 0x72868A2: tc_texture_map (u_threaded_context.c:2847)
==176211==    by 0x6975C23: pipe_texture_map_3d (u_inlines.h:680)
==176211==    by 0x697701C: st_ReadPixels (st_cb_readpixels.c:537)
==176211==    by 0x68E97C4: read_pixels (readpix.c:1216)
==176211==    by 0x68E97C4: _mesa_ReadnPixelsARB (readpix.c:1233)
==176211==    by 0x68E9889: _mesa_ReadPixels (readpix.c:1248)
==176211==  Address 0x1a792242 is 2 bytes after a block of size 144 alloc'd
==176211==    at 0x484C7A8: malloc (vg_replace_malloc.c:446)
==176211==    by 0x80A6829: iris_create_blend_state (iris_state.c:1815)
==176211==    by 0x7282E2F: tc_create_blend_state (u_threaded_context.c:1408)
==176211==    by 0x71F4FB0: cso_set_blend (cso_context.c:541)
==176211==    by 0x695CE93: st_update_blend (st_atom_blend.c:351)
==176211==    by 0x64C6C0B: st_validate_state (st_util.h:129)
==176211==    by 0x64C6D91: st_prepare_draw (st_draw.c:88)
==176211==    by 0x67F1D16: _mesa_draw_arrays (draw.c:1176)
==176211==    by 0x67F2795: _mesa_DrawArrays (draw.c:1386)
==176211==    by 0x4973339: stub_glDrawArrays (piglit-dispatch-gen.c:12483)
==176211==    by 0x49ECF1C: piglit_draw_rect_from_arrays (piglit-util-gl.c:746)
==176211==    by 0x49ED3A6: piglit_draw_rect_custom (piglit-util-gl.c:868)
==176211==

Cc: stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39661>
2026-02-10 14:27:56 +00:00
Utku Iseri
c954aaa842 panvk: increase mappable VA range to 48 bits
This is still going to be limited by the actual range supported
by panthor. Some internal allocations have to be within the
lower 32-bit AS, so this creates a private heap and puts all
priv bo and gpu queue allocations in that.

If panthor doesn't let us use >32bit addresses, we go back
to the old behavior of having a single heap.

The lowest 32MB range is still reserved for device usage.

Also adds a debug option to limit it back to 32 bits for debugging
any potential future issues.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39383>
2026-02-10 13:53:51 +00:00
Utku Iseri
9e3c5fccf9 panvk: pass heap explicitly to as_alloc/free
Convenience change for the upcoming multiple-heap case.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39383>
2026-02-10 13:53:51 +00:00
Utku Iseri
192fca11a3 pan/genxml: make pandecode comparisons return -1,1
Returning the offset overflows integers when the extended
VA range is active.
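
For illustration only (not the pandecode source): a Python sketch of the overflow, with the 32-bit truncation of a C `int` simulated by hand. The helper names and sample addresses are hypothetical, and returning 0 for equal keys is an assumption.

```python
def to_int32(x: int) -> int:
    """Truncate x to a signed 32-bit value, like a C int return would."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def cmp_by_offset(a: int, b: int) -> int:
    """Old style: return the address difference, truncated to 32 bits."""
    return to_int32(a - b)


def cmp_by_sign(a: int, b: int) -> int:
    """New style: only report the ordering, never the distance."""
    if a < b:
        return -1
    if a > b:
        return 1
    return 0


high = 0x100_0000_0000  # an address above the 32-bit range
low = 0x1000
print(cmp_by_offset(high, low), cmp_by_sign(high, low))
```

With these sample values the truncated difference comes out negative even though `high > low`, so a tree lookup based on it would descend the wrong way.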

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39383>
2026-02-10 13:53:51 +00:00
Karol Herbst
af954427bf vtn/opencl: flush denorms for cbrt()
libclc doesn't, so we have to. Fixes math_brute_force cbrt on Iris.

Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39794>
2026-02-10 13:24:53 +00:00
Eric R. Smith
fa418f1e73 mesa: do not unbind general point when different indexed points are deleted
When a buffer is deleted, we have to remove it from all binding points.
We were re-using the code for BindBufferRange for this; however, this
caused the general binding point to be unbound (bound to NULL)
unconditionally, even if a different buffer is bound there. Fix this by
inlining the various bind calls into the delete buffers code.
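
For illustration only (not the Mesa implementation): a Python sketch of the intended behavior, where deleting a buffer clears only the binding points that actually reference it. `BindingTarget` and `delete_buffer` are hypothetical names, and buffers are modeled as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BindingTarget:
    """One buffer target with a general binding point and indexed points."""
    general: Optional[str] = None
    indexed: List[Optional[str]] = field(default_factory=lambda: [None, None])


def delete_buffer(target: BindingTarget, buf: str) -> None:
    """Unbind buf from every point that references it, and nothing else."""
    for i, bound in enumerate(target.indexed):
        if bound == buf:
            target.indexed[i] = None
    # The general point is only cleared if it holds the deleted buffer.
    if target.general == buf:
        target.general = None


target = BindingTarget(general="A", indexed=["B", "A"])
delete_buffer(target, "B")
print(target)
```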

cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14755
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39659>
2026-02-10 12:57:12 +00:00
Caterina Shablia
4b21a3db05 pan/lib: use tiled AFBC
For sufficiently big images, tiled AFBC offers perf advantages
over linear AFBC. Keep using linear AFBC for images that are thin
and fall through to U-interleaved for even thinner images. Note
that indeed, interleaved 64k will be skipped in this case as it
won't meet the minimum size criteria set out by interleaved 64k's
test_props.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39737>
2026-02-10 11:20:10 +00:00
Samuel Pitoiset
2cd9693a31 radv/meta: remove a useless barrier when fixing up HTILE for copies on compute
The copy operation doesn't use HTILE of the destination image, so the
clear can run in parallel.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
5663ebffc4 radv/meta: skip some HTILE operations when it's decompressed on image stores
Only GFX11-GFX11.5 are affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
0996b4c527 radv/meta: do not disable compression for depth/stencil expand on compute
This doesn't make sense for the destination image and this would
prevent COMPRESSION_EN=1 from working correctly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
452304897f radv: set COMPRESSION_EN=1 for depth or stencil storage images when supported
On GFX10+, the hardware can write decompressed DWORDS to HTILE when
COMPRESSION_EN=1, which means some HTILE decompression/initialization
operations can be avoided because it automatically marks the tiles that
are touched as uncompressed.

Though according to PAL, there are issues with that on GFX10-10.3, so
it's only enabled on GFX11-11.5.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
6f2b048f84 radv/meta: stop fixing up HTILE after a partial copy
The decompression pass already resets HTILE to its uncompressed state,
so this is just redundant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:21 +00:00
Samuel Pitoiset
4f41818194 radv/meta: add a function to fixup HTILE metadata for copies on compute queue
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:21 +00:00
Samuel Pitoiset
9f5a20abde radv/meta: fix CmdCopyBufferToImage2() on compute queue with compressed HTILE
Only for partial copies because image stores don't decompress on writes
(i.e. HTILE isn't updated by image stores).

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>
2026-02-10 10:42:21 +00:00
Daniel Schürmann
e362011cca nir/loop_analyze: also set force_unroll if the array_size is larger than max_trip_count
Loop peeling can reduce the trip_count. It is also not
necessary that the array_size exactly matches the trip_count.

Totals from 54 (0.06% of 84383) affected shaders: (Navi48)

MaxWaves: 758 -> 884 (+16.62%)
Instrs: 284511 -> 343292 (+20.66%)
CodeSize: 1524940 -> 1837996 (+20.53%)
VGPRs: 5904 -> 5544 (-6.10%)
Scratch: 18432 -> 0 (-inf%)
Latency: 7317179 -> 7186789 (-1.78%); split: -1.80%, +0.02%
InvThroughput: 1646024 -> 1545357 (-6.12%); split: -6.19%, +0.08%
VClause: 5840 -> 6867 (+17.59%); split: -1.92%, +19.50%
SClause: 6959 -> 7935 (+14.03%)
Copies: 25516 -> 31310 (+22.71%); split: -4.87%, +27.58%
Branches: 9205 -> 10571 (+14.84%); split: -3.25%, +18.09%
PreSGPRs: 5586 -> 5394 (-3.44%); split: -3.67%, +0.23%
PreVGPRs: 5087 -> 4674 (-8.12%); split: -8.18%, +0.06%
VALU: 145243 -> 174719 (+20.29%)
SALU: 53128 -> 67594 (+27.23%); split: -0.00%, +27.23%
VMEM: 8911 -> 10221 (+14.70%); split: -1.41%, +16.11%
SMEM: 8519 -> 9509 (+11.62%)
VOPD: 419 -> 796 (+89.98%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39778>
2026-02-10 09:24:23 +00:00
Daniel Schürmann
b5439c4fbf nir/opt_loop_unroll: Always unroll loops with a known trip-count of 0
Loop peeling decrements the calculated trip count, which might
result in a known trip-count of 0 for single-iteration loops.
Thus, also unroll loops if max_trip_count == 0 and exact_trip_count_known.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39778>
2026-02-10 09:24:23 +00:00
Eric Engestrom
3197e79276 mr-label-maker: label wsi files that have a label
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39798>
2026-02-10 09:47:37 +01:00
Lakshman Chandu Kondreddy
fb2646e527 freedreno/layout, tu: Fix UBWC block sizes for PIPE_FORMAT_R8_G8B8_420_UNORM
The Y and UV planes of PIPE_FORMAT_R8_G8B8_420_UNORM have different
UBWC block sizes. Add support to use the correct block sizes for
this format based on the plane.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39576>
2026-02-10 07:16:49 +00:00
Daivik Bhatia
026fa1799b broadcom/compiler: Update comment clarifying OpTerminate implementation
Explain why the driver uses demote instead of an immediate jump to the
end of the shader for OpTerminate, noting that the jump approach showed
no performance gains.

Reference: !38381

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39703>
2026-02-10 06:20:25 +00:00
Felix DeGrood
0966743943 intel/tools: intel_measure.py avoid early exit on corrupted data
When data corruption is detected, try to parse anyway, hoping
the corruption didn't impact something important.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39749>
2026-02-10 04:23:05 +00:00