Commit graph

220236 commits

Author SHA1 Message Date
Ian Romanick
d038bfa1c1 util: Use same method to clear bits in u_foreach_bit as util_bitcount
Saves about 2k text size.

Before:

   text	   data	    bss	    dec	    hex	filename
24817485	 456164	  27080	25300729	1820ef9	./lib64/libvulkan_intel.so

After:

   text	   data	    bss	    dec	    hex	filename
24815381	 456164	  27080	25298625	18206c1	./lib64/libvulkan_intel.so

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40230>
2026-03-19 17:30:25 +00:00
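As an illustration of d038bfa1c1, here is a minimal sketch of a bit-iteration loop that clears the lowest set bit with `mask &= mask - 1` on each step. It assumes this is the bit-clearing idiom the commit subject refers to. `sketch_foreach_bit`, `lowest_bit_index` and `collect_bits` are hypothetical names for illustration, not Mesa's actual macros.

```c
#include <stdint.h>

/* Hypothetical helper: index of the lowest set bit. v must be non-zero. */
static inline unsigned lowest_bit_index(uint32_t v)
{
   unsigned i = 0;
   while (!(v & 1u)) {
      v >>= 1;
      i++;
   }
   return i;
}

/* Hypothetical stand-in for a foreach-bit macro (not Mesa's definition).
 * Each step drops the lowest set bit with mask &= mask - 1, so no shift
 * by the bit index is needed to clear it. */
#define sketch_foreach_bit(b, dword)                              \
   for (uint32_t mask__ = (dword), b;                             \
        mask__ && ((b) = lowest_bit_index(mask__), 1);            \
        mask__ &= mask__ - 1)

/* Collects the indices of all set bits in ascending order and returns
 * how many were found. */
static inline unsigned collect_bits(uint32_t dword, unsigned *out)
{
   unsigned n = 0;
   sketch_foreach_bit(b, dword)
      out[n++] = b;
   return n;
}
```

With this sketch, `collect_bits(0x29, out)` visits bits 0, 3 and 5 in that order.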
Ian Romanick
e13565acf4 anv: Use u_foreach_bit
Suggested-by: Lionel
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40230>
2026-03-19 17:30:25 +00:00
Ian Romanick
4cbf2ee3f0 anv: Use different logic to isolate lowest flag in anv_foreach_vk_stage
Silences many ubsan errors like:

src/intel/vulkan/anv_shader_compile.c:609:4: runtime error: shift exponent -1 is negative

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40230>
2026-03-19 17:30:25 +00:00
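For 4cbf2ee3f0, a "shift exponent -1" report is what one would expect if a mask were built by shifting by a find-first-set result minus one while the input is zero. That reading is an assumption, and the commit does not say which replacement logic it uses. The sketch below shows one well-known shift-free way to isolate the lowest set flag; `isolate_lowest_flag` is a hypothetical name.

```c
#include <stdint.h>

/* Returns a mask containing only the lowest set bit of flags, or 0 when
 * flags is 0. (~flags + 1u) is the unsigned two's-complement negation,
 * which is well defined for every input, so no shift is involved. */
static inline uint32_t isolate_lowest_flag(uint32_t flags)
{
   return flags & (~flags + 1u);
}
```

A loop over stage flags can then peel off one flag per iteration with `flags &= ~isolate_lowest_flag(flags)` and terminate cleanly once `flags` reaches zero.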
Anders Roxell
ea731cda12 ethosu: fix blockdep to check for data dependencies
calc_blockdep always returned MAX_BLOCKDEP without checking if the
previous op writes to a buffer the current op reads from. This let
the NPU start reading before the previous write was done.

Add overlap check between previous OFM and current IFM so we set
blockdep to 0 when they share the same buffer.

Update ethos-imx93-fails.txt to remove the tests that now pass.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:13 +00:00
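The dependency check described in ea731cda12 can be sketched roughly as follows. The struct layout, the `sketch_` names and the value 3 for the maximum block dependency are assumptions made for illustration. Only the behaviour (blockdep 0 when the previous OFM overlaps the current IFM, the maximum otherwise) comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BLOCKDEP 3 /* assumed value, for illustration only */

/* Hypothetical description of where a feature map lives in memory. */
struct sketch_feature_map {
   unsigned region;  /* which buffer the tensor is placed in */
   uint64_t address; /* byte offset inside that buffer */
   uint64_t size;    /* size in bytes */
};

/* True when both maps are in the same buffer and their byte ranges
 * [address, address + size) intersect. */
static bool sketch_overlaps(const struct sketch_feature_map *a,
                            const struct sketch_feature_map *b)
{
   if (a->region != b->region)
      return false;
   return a->address < b->address + b->size &&
          b->address < a->address + a->size;
}

/* Returns 0 (wait for the previous op to finish) when the current IFM
 * reads what the previous op wrote, and the maximum otherwise. */
static unsigned sketch_calc_blockdep(const struct sketch_feature_map *prev_ofm,
                                     const struct sketch_feature_map *cur_ifm)
{
   if (prev_ofm == NULL)
      return SKETCH_MAX_BLOCKDEP;
   return sketch_overlaps(prev_ofm, cur_ifm) ? 0 : SKETCH_MAX_BLOCKDEP;
}
```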
Anders Roxell
17435b6a58 teflon/tests: add micronet_large anomaly detection model
Downloaded from the Arm ML Zoo [1]. Per-channel quantized INT8 model
with 14 operators: CONV_2D (7x), DEPTHWISE_CONV_2D (5x),
AVERAGE_POOL_2D, RESHAPE. All per-op tests pass but the full model
fails due to a bug in synchronization of operations.

[1] https://github.com/Arm-Examples/ML-zoo/tree/master/models/anomaly_detection/micronet_large/tflite_int8

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:13 +00:00
Anders Roxell
7c1ec56427 ethosu: clean up ADD elementwise scaling
Replace the two functions simplified_elementwise_add_sub_scale and
eltwise_emit_ofm_scaling with a single advanced_elementwise_add_sub_scale
that follows the ethos-u-vela naming. Remove the large block of
commented out Vela Python code.

No functional change.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:13 +00:00
Anders Roxell
69d3f080be ethosu: fix RESIZE upscale mode
The upscale field was a bool, which happened to work since true maps
to 1, which is NEAREST in the hardware. Change from bool to an enum
ethosu_upscale_mode so the intent is clear and we don't rely on the
bool-to-int mapping.

Also add a check in operation_supported so RESIZE only accepts 2x
upscaling since that's what the NPU can do with IFM_UPSCALE. Other
sizes fall back to CPU.

Keep the original zero_points from tensors in RESIZE and STRIDED_SLICE
instead of forcing them to 0 since the requantization needs them.

Fixes the RESIZE_NEAREST_NEIGHBOR operations in EfficientDet-Lite
models that use BiFPN with 2x nearest neighbor upsampling.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:12 +00:00
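The two changes in 69d3f080be can be sketched as below. The enumerator names, the `NONE = 0` entry and the function signature are hypothetical. The commit message only confirms that NEAREST corresponds to the hardware value 1 and that RESIZE is limited to 2x upscaling.

```c
#include <stdbool.h>

/* Hypothetical enum replacing a bool: the hardware encoding is spelled
 * out instead of relying on true converting to 1. */
enum sketch_upscale_mode {
   SKETCH_UPSCALE_NONE = 0,    /* assumed encoding */
   SKETCH_UPSCALE_NEAREST = 1, /* value stated in the commit message */
};

/* Accept RESIZE only when both dimensions are exactly doubled; anything
 * else would be left to the CPU fallback. */
static bool sketch_resize_supported(unsigned in_w, unsigned in_h,
                                    unsigned out_w, unsigned out_h)
{
   return out_w == 2 * in_w && out_h == 2 * in_h;
}
```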
Anders Roxell
e27ba5b437 ethosu: Handle per-channel zero_points
fill_weights subtracted a single zero_point from all weights which
did not handle models with per-channel zero_points. Use the
per-channel zero_point for each output channel when available.

Also decouple the zero_points copy from the scales copy in the lower
pass so they are handled independently.

Suggested-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:12 +00:00
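A minimal sketch of the per-channel subtraction described in e27ba5b437, assuming INT8 weights stored contiguously per output channel. The function name, parameter list and memory layout are hypothetical and not taken from fill_weights.

```c
#include <stddef.h>
#include <stdint.h>

/* Subtracts a zero point from every weight. When more than one zero
 * point is supplied, output channel oc uses zero_points[oc]; otherwise
 * the single zero point is applied to all channels.
 * Assumed layout: weights[oc * per_channel + i]. */
static void sketch_subtract_zero_points(const int8_t *weights, int16_t *out,
                                        size_t out_channels, size_t per_channel,
                                        const int32_t *zero_points,
                                        size_t num_zero_points)
{
   for (size_t oc = 0; oc < out_channels; oc++) {
      int32_t zp = num_zero_points > 1 ? zero_points[oc] : zero_points[0];
      for (size_t i = 0; i < per_channel; i++) {
         size_t idx = oc * per_channel + i;
         out[idx] = (int16_t)(weights[idx] - zp);
      }
   }
}
```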
Anders Roxell
63c028b5e0 ethosu: Add support for per-channel quantization
For those models with coefficients that have different quantization
parameters for each channel.

The NPU can handle per-channel scales as can be seen in
fill_scale_and_biases(), which already iterates per output channel.

Activation tensors (input/output) don't have per-channel quantization.

- Add scales/zero_points arrays to ethosu_kernel struct
- Copy per-channel scales from weight tensor in lower pass
- Use per-channel scale when computing conv_scale in coefs
- Allow per-channel quantization in operation_supported check

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:12 +00:00
Anders Roxell
0887e6d89f teflon: Add support for symmetric per-channel quantization
The old code would assert when a model has multiple scales but only
one zero_point. This is common for symmetric quantization where all
channels share the same zero_point (typically 0).

Handle this by replicating the single zero_point for all channels
instead of crashing.

Fixes MoveNet models using per-channel quantization.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:11 +00:00
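The handling described in 0887e6d89f (several scales, one shared zero_point) amounts to replicating that zero_point once per channel. A rough sketch under assumed names and types, not the teflon code itself:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns a newly allocated array with one zero point per scale.
 * A single input zero point is replicated for every channel; a full
 * per-channel array is copied as-is. Any other combination is rejected
 * by returning NULL. The caller frees the result. */
static int32_t *sketch_expand_zero_points(const int32_t *zero_points,
                                          size_t num_zero_points,
                                          size_t num_scales)
{
   if (num_zero_points != 1 && num_zero_points != num_scales)
      return NULL;

   int32_t *out = malloc(num_scales * sizeof(*out));
   if (out == NULL)
      return NULL;

   for (size_t i = 0; i < num_scales; i++)
      out[i] = num_zero_points == 1 ? zero_points[0] : zero_points[i];
   return out;
}
```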
Lars-Ivar Hesselberg Simonsen
292fffac1b pan/va/disasm: Move src discard marker behind reg
Purely a visual change, but aligns with DDK disassembly.

For example:
-   FMA.f32 r1, ^r1, u1, ^r4
+   FMA.f32 r1, r1^, u1, r4^

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:13 +00:00
Lars-Ivar Hesselberg Simonsen
43c6f51a29 pan/va/disasm: Clean up hardcoded values
A lot of masks and shifts were hard-coded in the disassembler. This
commit tries to move them to shared logic.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:13 +00:00
Lars-Ivar Hesselberg Simonsen
614d07c986 pan/va: Generalize opcode/opcode2
Rather than opcode/opcode2 hardcoded, treat the opcode as a list of
one or more subcodes.

This implies modifying the disassembler to hold an arbitrary depth dict
of dicts and recursively build the switch statements used to look up
each level.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:13 +00:00
Lars-Ivar Hesselberg Simonsen
11f243205c pan/va/disasm: Move instr print to function
This splits the printing logic from the iteration logic, making it
easier to reason about either.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:12 +00:00
Lars-Ivar Hesselberg Simonsen
adffad6adb pan/va: XMLify opcode2
Opcode2 was a bit all over the place, so utilize the new opcode modifier
to gather opcode2 information in a single place.

This cleans up the implicit va_mods "left", "descriptor_type" and
"memory_width".

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:12 +00:00
Lars-Ivar Hesselberg Simonsen
5b24568c87 pan/va: Add opcode modifier to ISA.xml
Rather than having the opcode as an attribute and the offset/mask being
implicit, make all of this information explicit in the xml.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:12 +00:00
Lars-Ivar Hesselberg Simonsen
9bd4a40233 pan/va: Clean up unused/removed instructions
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:12 +00:00
Lars-Ivar Hesselberg Simonsen
1b1f4bd35e pan/va: Remove non-existent unused CLPERs
These instructions were not generated as they do not exist.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
2026-03-19 15:11:12 +00:00
Eric Engestrom
384d128164 ci: fix scheduled pipelines
Fixes: 32a818d11d ("ci: drop workaround for jobs not being created in fork pipelines")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40505>
2026-03-19 13:23:00 +00:00
Eric Engestrom
d38916d673 ci: fix rebase mistake
Fixes: 32a818d11d ("ci: drop workaround for jobs not being created in fork pipelines")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40505>
2026-03-19 13:23:00 +00:00
Erik Faye-Lund
982f567b19 pan/lib: drop redundant assign
This is already the default value, so there's no point in overriding it
to itself.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
2026-03-19 12:00:47 +00:00
Erik Faye-Lund
5280b80281 pan/lib: divide extent by tile-extent, not itself
Dividing this by itself is nonsensical, and just always gives us one.
That's obviously not what we want here.

But in this case we also know that the extent is divisible by the tile
extent, so there's no need for DIV_ROUND_UP, we can just divide.

Fixes: e6f8cab698 ("pan/layout: Split the logic per modifier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
2026-03-19 12:00:47 +00:00
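The arithmetic behind 5280b80281 in a small sketch: dividing an extent by itself always yields 1, whereas dividing by the tile extent gives the tile count, and a plain division suffices when the extent is known to be a multiple of the tile extent. The function name and the use of assert are illustrative assumptions.

```c
#include <assert.h>

/* Number of tiles along one axis. The caller guarantees that extent is
 * a multiple of tile_extent, so no rounding up is required. */
static unsigned sketch_tiles_across(unsigned extent, unsigned tile_extent)
{
   assert(tile_extent != 0 && extent % tile_extent == 0);
   return extent / tile_extent;
}
```

For example, a 64-pixel extent with 16-pixel tiles gives 4 tiles, while the self-division 64 / 64 would have reported 1.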
Erik Faye-Lund
b0c32fcc66 pan/lib: set srgb-flag for afrc render-targets
Without this, sRGB rendering to AFRC is broken.

Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
2026-03-19 12:00:47 +00:00
Erik Faye-Lund
322aaa88c6 pan/lib: do not try to use stencil-aspect of color attachment
We can't use the stencil-aspect of a color-attachment. That's going to
fail, so let's use the color-aspect instead. We already have it around
anyway.

Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
2026-03-19 12:00:46 +00:00
Erik Faye-Lund
15e0ac0731 pan/genxml: remove non-existent YUV Enable for AFRC
This is controlled by the writeback-mode when using AFRC, not by a YUV
Enable field. This field doesn't exist in these, and according to the
spec should be zero.

Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
2026-03-19 12:00:46 +00:00
Faith Ekstrand
3418525a82 pan/bi: Lower VS outputs in NIR
Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:32 +00:00
Lorenzo Rossi
8127f5a88a pan/bi: Resize varyings IO early
In preparation for IO lowering in NIR. The varying size does not change
between variants and we'll need the real store width in NIR if we want
to lower it correctly.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Lorenzo Rossi
636aba5811 panfrost: Lower indirect derefs before lower_io
This will surely lose performance in some cases. This is a temporary fix
to align ourselves with how the Vulkan compiler works.  We might be able
to use indirect varyings directly in the future depending on how we
handle their memory layout.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Lorenzo Rossi
538c3ee6c7 Revert "pan/bi: Model pos/vary segments in STORE instructions"
This reverts commit 039bb4e68c.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Lorenzo Rossi
43ffcf06f4 pan/bi,nir: Divide memory_access from segments
Valhall removed Bifrost's memory segments and added in its place memory
access.  Those were bolted on reserved bits as "pseudo-segments" and the
emitter would catch these and emit the right memory access.  This commit
cleans it up a bit by making memory_access available directly and
exposing it to NIR (this will be useful later).

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Lorenzo Rossi
c730e41ed5 pan/bi: Add is_psiz_store flag in bi_instr
This removes the previous hack that searched for the psiz write by
looking for 16-bit stores with the correct pseudo segment.  We also add
a new intrinsic that mimics global stores but tags psiz writes; this
will be used later in the series.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Faith Ekstrand
de338dc908 pan,nir: Rework converted_mem_pan intrinsics
First, rename them to make them a bit more clear.  They act on global
memory so they should be _global and they map to ld/st_cvt so _cvt is
nice and obvious.  Second, they don't need IO semantics as they're not
IO.  But they do need ACCESS so that we can better control things like
CAN_REORDER.  Third, add a src_type to store_global_cvt even though it
won't be used just yet because we'll want it for lowering VS stores.

Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:29 +00:00
Faith Ekstrand
8541dca8ed pan/bi: Lower FS input loads in NIR
Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:28 +00:00
Faith Ekstrand
d2f430bea9 pan/bi: Add new FS input load intrinsics
Unlike load[_interpolated]_input, which has to deal with all sorts of
ABI nonsense between driver and compiler, these new intrinsics are
dumber than bricks.  They're literally just the HW ops as NIR
intrinsics.  These will allow us to do the lowering in NIR and put the
driver in total control over what goes down what path.  Among other
things, a driver could choose to lower some things to ld_var and others
to ld_var_buf.

Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:28 +00:00
Valentine Burley
00fe22aa6a ci: Capture weston logs
Save the weston logs in the artifacts, similar to Xorg.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34323>
2026-03-19 09:58:59 +01:00
Georg Lehmann
1e77a2218a radv/ci: update restricted trace checksum
Annoying because these will never be caught in the MR that regresses them.

Looking at the diff, this is fallout from the clipping/guardband changes.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40499>
2026-03-19 07:41:30 +00:00
Georg Lehmann
57c05f72f9 nir/opt_large_constants: only use 16bit float alu when supported
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
5f37788ae9 nir/opt_large_constants: handle floating point power of two fractions
Foz-DB Navi48:
Totals from 365 (0.32% of 114655) affected shaders:
MaxWaves: 10020 -> 10016 (-0.04%)
Instrs: 486252 -> 486097 (-0.03%); split: -0.21%, +0.18%
CodeSize: 2629536 -> 2628452 (-0.04%); split: -0.19%, +0.14%
VGPRs: 19884 -> 19896 (+0.06%); split: -0.06%, +0.12%
SpillSGPRs: 210 -> 212 (+0.95%)
Latency: 3818610 -> 3765549 (-1.39%); split: -1.50%, +0.11%
InvThroughput: 598445 -> 596281 (-0.36%); split: -0.58%, +0.22%
VClause: 10053 -> 9698 (-3.53%); split: -3.54%, +0.01%
SClause: 17548 -> 17334 (-1.22%); split: -1.24%, +0.02%
Copies: 43196 -> 42249 (-2.19%); split: -2.34%, +0.14%
Branches: 16695 -> 16628 (-0.40%); split: -0.47%, +0.07%
PreSGPRs: 17988 -> 17971 (-0.09%)
PreVGPRs: 13552 -> 13520 (-0.24%)
VALU: 244842 -> 246611 (+0.72%); split: -0.02%, +0.74%
SALU: 79163 -> 77778 (-1.75%); split: -2.05%, +0.30%
VMEM: 13468 -> 13084 (-2.85%)
SMEM: 23571 -> 23393 (-0.76%)
VOPD: 8384 -> 8372 (-0.14%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
372c1a23dc nir/opt_large_constants: support negative small constants
Foz-DB Navi48:
Totals from 511 (0.45% of 114655) affected shaders:
MaxWaves: 14554 -> 14552 (-0.01%)
Instrs: 767577 -> 768334 (+0.10%); split: -0.17%, +0.27%
CodeSize: 4171036 -> 4181400 (+0.25%); split: -0.10%, +0.35%
VGPRs: 27676 -> 27724 (+0.17%)
SpillSGPRs: 144 -> 183 (+27.08%)
Latency: 4053919 -> 4027092 (-0.66%); split: -0.88%, +0.22%
InvThroughput: 817990 -> 819490 (+0.18%); split: -0.21%, +0.39%
VClause: 11573 -> 11172 (-3.46%); split: -3.47%, +0.01%
SClause: 14418 -> 14579 (+1.12%); split: -0.46%, +1.57%
Copies: 71638 -> 71365 (-0.38%); split: -1.54%, +1.16%
Branches: 20212 -> 20425 (+1.05%); split: -0.39%, +1.44%
PreSGPRs: 21765 -> 21743 (-0.10%); split: -0.23%, +0.12%
PreVGPRs: 19475 -> 19307 (-0.86%); split: -0.91%, +0.05%
VALU: 411365 -> 413642 (+0.55%); split: -0.02%, +0.57%
SALU: 126940 -> 125411 (-1.20%); split: -1.53%, +0.32%
VMEM: 20574 -> 20062 (-2.49%)
SMEM: 23724 -> 23677 (-0.20%); split: -0.25%, +0.05%
VOPD: 19838 -> 19847 (+0.05%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
a9f3efcae0 nir/opt_large_constants: optimize small vector constant arrays
Foz-DB Navi48:
Totals from 2956 (2.58% of 114655) affected shaders:
MaxWaves: 85080 -> 85110 (+0.04%)
Instrs: 5167735 -> 5170572 (+0.05%); split: -0.12%, +0.17%
CodeSize: 28882716 -> 28867340 (-0.05%); split: -0.14%, +0.08%
VGPRs: 164484 -> 164616 (+0.08%); split: -0.09%, +0.18%
SpillSGPRs: 612 -> 611 (-0.16%)
Latency: 35017837 -> 34391146 (-1.79%); split: -1.80%, +0.01%
InvThroughput: 6336245 -> 6323807 (-0.20%); split: -0.49%, +0.29%
VClause: 112504 -> 111117 (-1.23%); split: -1.32%, +0.09%
SClause: 121125 -> 117618 (-2.90%); split: -3.04%, +0.15%
Copies: 392203 -> 384977 (-1.84%); split: -1.88%, +0.04%
Branches: 155578 -> 155376 (-0.13%); split: -0.13%, +0.01%
PreSGPRs: 127654 -> 127205 (-0.35%); split: -0.39%, +0.04%
PreVGPRs: 112486 -> 112449 (-0.03%); split: -0.04%, +0.00%
VALU: 2577362 -> 2586379 (+0.35%); split: -0.00%, +0.35%
SALU: 889569 -> 888472 (-0.12%); split: -1.01%, +0.89%
VMEM: 167203 -> 165750 (-0.87%)
SMEM: 190438 -> 187313 (-1.64%)
VOPD: 194411 -> 194344 (-0.03%); split: +0.01%, -0.04%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
f782524c36 nir/opt_large_constants: enable small constant optimization for non trivial strides
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:17 +00:00
Georg Lehmann
568b96f8b2 nir/opt_large_constants: set fp_math_ctrl for bit exact results
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:17 +00:00
Georg Lehmann
e810382a1e nir/opt_large_constants: don't add constants implemented with ALU to the constant data
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:16 +00:00
Konstantin Seurer
581df90a89 nir/tests: Test nir_opt_large_constants
Tests a whole bunch of cases that can be turned into literals.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:16 +00:00
Georg Lehmann
023e3554e9 ir3: set progress for nir_opt_large_constants
I guess the original intention was that ir3_nir_lower_load_constant will
always make progress if nir_opt_large_constants made progress,
but this is not the case with the small constant arrays optimizations.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:16 +00:00
Eric Engestrom
402bd37f9d docs: add sha sum for 26.0.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40494>
2026-03-18 17:01:37 +00:00
Eric Engestrom
d2e3b4b4fb docs: add release notes for 26.0.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40494>
2026-03-18 17:01:37 +00:00
Eric Engestrom
da214b0fce docs: update calendar for 26.0.3
Includes an additional 26.0.x release to make sure we still have the
expected overlap with 26.1.1.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40494>
2026-03-18 17:01:36 +00:00
Icenowy Zheng
af8923bb01 zink: skip all post-process when importing and resource_create fails
When the pipe_resource pointer returned by resource_create is NULL, the
process importing the handle into the underlying Vulkan driver is known
to have failed, and the handle importing process shouldn't continue.

Just return NULL in this case to prevent further check of pres being
non-NULL.

This also fixes the issue that the renderonly code lacks a check for
non-NULL pres, and the conversion of pipe_resource to zink_resource in
the renderonly codepath is now gone because a converted zink_resource is
available above.

Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40490>
2026-03-18 16:34:10 +00:00
Samuel Pitoiset
79ac5fd4c2 radv/amdgpu: remove dead code in radv_amdgpu_winsys_bo_create()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40461>
2026-03-18 16:03:39 +00:00