212972 commits

Author SHA1 Message Date
Eric Engestrom
7d56f83875 zink+turnip/ci: document fixed tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>
2025-09-07 22:25:59 +02:00
Eric Engestrom
0cfc3429fc zink+nvk/ci: document fixed tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>
2025-09-07 22:25:58 +02:00
Eric Engestrom
a5fd6fce4c nvk/ci: document fixed tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>
2025-09-07 22:25:58 +02:00
Eric Engestrom
e0adaae78a r300/ci: document fixed tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>
2025-09-07 22:24:31 +02:00
Eric Engestrom
ff791ab7a9 etnaviv/ci: document fixed tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>
2025-09-07 22:24:22 +02:00
Christoph Neuhauser
2f8b8649f0 iris: Increase max_shader_buffer_size to max_buffer_size
This commit increases max_shader_buffer_size to max_buffer_size for Iris.

Signed-off-by: Christoph Neuhauser <christoph.neuhauser@intel.com>
Co-authored-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37182>
2025-09-07 16:17:10 +00:00
Caio Oliveira
62815cc91f util: Avoid invalid access in ralloc_print_info()
Check if allocation is large enough to hold the
linear and gc contexts before probing for them.

Fixes: 7b5b164281 ("util: Add function print information about a ralloc tree")
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37017>
2025-09-06 20:28:34 +00:00
Caio Oliveira
f37c9c873c brw: Fix printing of blocks in disassembly when BRW is available
When disassembling and BRW IR is available (which happens in the
generator), there will be pointers to the BRW's basic block structures
that are used to print the block numbers and predecessor/successors
in the output.

There are two challenges:

- Because DO and FLOW instructions are not real instructions, they are
  not emitted in the output but would still cause the output to contain
  empty blocks.  Previous code accounted for DO but still had problems.

- DO blocks have special physical links that don't make sense when the
  DO is not emitted at the end, but they would be shown even if that
  block was omitted.

These issues can be seen here (edited to remove non-essential bits)

```
   START B0 (2 cycles)
mov(8)          g126<1>UD       0x3f800000UD
   END B0 ->B1
   START B2 <-B1 <-B4 (0 cycles)
   END B2 ->B3
   START B3 <-B2 (260 cycles)

LABEL1:
mov(8)          g1<1>D          0D
cmp.ge.f0.0(8)  null<1>D        g2<0,1,0>D      10D
sync nop(1)                     null<0,1,0>UB
send(1)         g0UD            g1UD            nullUD
(+f0.0) break(8) JIP:  LABEL0         UIP:  LABEL0
   END B3 ->B1 ->B5 ->B4
   START B4 <-B3 (1000 cycles)
sync nop(1)                     null<0,1,0>UB
mov(8)          g126<1>UD       g0<0,1,0>UD

LABEL0:
while(8)        JIP:  LABEL1
   END B4 ->B2
   START B5 <-B1 <-B3 (20 cycles)
```

For example:
- Block 1 is missing (a skipped DO block)
- Block 2 is empty (it was a FLOW block)
- Block 3 ends with a link to Block 1 (the special links involving DO
  blocks).

Three changes were made to fix this.  First, skip the DO and FLOW
blocks completely.  The use_tail ensures that the instruction group is
reused to avoid empty blocks.  Second, when printing the successors and
predecessors, walk through the skipped blocks.  Finally, don't print
the special blocks.
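
The "walk through the skipped blocks" step can be sketched as follows. This is an illustrative Python model, not the BRW code: the `successors` table, the `skipped` set and the `visible_successors` helper are hypothetical names, the CFG is transcribed from the listing above, and the special DO link (B3 to B1) is assumed to be filtered out already.

```python
# Hypothetical CFG mirroring the "before" listing: B1 is a DO block and
# B2 is a FLOW block, and both are skipped in the output.
successors = {
    0: [1],
    1: [2],
    2: [3],
    3: [5, 4],
    4: [2],
    5: [],
}
skipped = {1, 2}


def visible_successors(block, successors, skipped):
    """Return the successors of `block`, walking through skipped blocks."""
    result = []
    seen = set()
    stack = list(reversed(successors[block]))
    while stack:
        b = stack.pop()
        if b in seen:
            continue
        seen.add(b)
        if b in skipped:
            # Look past the skipped block to whatever follows it.
            stack.extend(reversed(successors[b]))
        else:
            result.append(b)
    return result


for b in sorted(successors):
    if b not in skipped:
        links = " ".join(f"->B{s}" for s in visible_successors(b, successors, skipped))
        print(f"END B{b} {links}".rstrip())
```

Under these assumptions the sketch prints the same END links as the fixed output: B0 goes to B3, B3 goes to B5 and B4, and B4 goes back to B3.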

With the fix, here's the output.  Note the blocks retain their original
BRW IR number.

```
   START B0 (2 cycles)
mov(8)          g127<1>UD       0x3f800000UD
   END B0 ->B3
   START B3 <-B0 <-B4 (260 cycles)

LABEL1:
mov(8)          g1<1>D          0D
cmp.ge.f0.0(8)  null<1>D        g2<0,1,0>D      10D
sync nop(1)                     null<0,1,0>UB
send(1)         g0UD            g1UD            nullUD
(+f0.0) break(8) JIP:  LABEL0         UIP:  LABEL0
   END B3 ->B5 ->B4
   START B4 <-B3 (1000 cycles)
sync nop(1)                     null<0,1,0>UB
mov(8)          g127<1>UD       g0<0,1,0>UD

LABEL0:
while(8)        JIP:  LABEL1
   END B4 ->B3
   START B5 <-B3 (20 cycles)
```

Issue was spotted by Ken.

Fixes: d2c39b1779 ("intel/brw: Always have a (non-DO) block after a DO in the CFG")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36226>
2025-09-06 16:42:05 +00:00
Daniel Schürmann
c78f1d516c nir/algebraic: add pattern for (a << #b) * #c => a * (#c << #b)
Totals from 2545 (3.19% of 79839) affected shaders: (Navi48)

Instrs: 6371003 -> 6364130 (-0.11%); split: -0.12%, +0.01%
CodeSize: 33827548 -> 33812244 (-0.05%); split: -0.06%, +0.01%
Latency: 47451755 -> 47430108 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 10442450 -> 10437159 (-0.05%); split: -0.05%, +0.00%
SClause: 159829 -> 159874 (+0.03%); split: -0.01%, +0.04%
Copies: 500725 -> 500721 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 110482 -> 110478 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 147289 -> 147287 (-0.00%); split: -0.00%, +0.00%
VALU: 3456135 -> 3454241 (-0.05%); split: -0.06%, +0.01%
SALU: 925982 -> 923616 (-0.26%)
VOPD: 1243 -> 1212 (-2.49%)
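
The pattern relies on integer multiplication wrapping modulo 2^n, so the shift can be folded into the constant. A minimal Python check of that identity for 32-bit values, with hypothetical helper names (this is not the nir_opt_algebraic syntax):

```python
MASK32 = 0xFFFFFFFF


def shl_then_mul(a, b, c):
    """(a << #b) * #c with 32-bit wraparound."""
    return (((a << b) & MASK32) * c) & MASK32


def mul_by_shifted_const(a, b, c):
    """a * (#c << #b): the shift is folded into the constant."""
    folded = (c << b) & MASK32  # a constant, so it can be computed up front
    return (a * folded) & MASK32


# Spot-check a few values, including ones that overflow 32 bits.
for a in (0, 1, 7, 0x12345678, 0xFFFFFFFF):
    for b in (0, 1, 5, 31):
        for c in (0, 3, 0x10001, 0xFFFFFFFF):
            assert shl_then_mul(a, b, c) == mul_by_shifted_const(a, b, c)
```

The rewritten form saves the shift instruction because `#c << #b` involves only constants.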
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37173>
2025-09-06 10:18:42 +00:00
Georg Lehmann
87f451aa39 intel/ci: update restricted trace checksums
Caused by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37113

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37211>
2025-09-06 11:59:16 +02:00
Georg Lehmann
f47e4fee4c mesa: clamp fog scale to -FLT_MAX instead of FLT_MIN
FLT_MIN is the smallest positive float, not the smallest negative float.
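
A small Python illustration of why the lower bound matters. The `clamp` helper and the sample values are hypothetical and are not the Mesa code; the two constants are the standard single-precision limits written out exactly.

```python
FLT_MIN = 2.0 ** -126                      # smallest positive normal float32
FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127  # largest finite float32


def clamp(x, lo, hi):
    return max(lo, min(x, hi))


# With FLT_MIN as the lower bound, every negative scale collapses to a
# tiny positive number; with -FLT_MAX only the infinities are affected.
for scale in (-2.0, float("-inf")):
    wrong = clamp(scale, FLT_MIN, FLT_MAX)
    right = clamp(scale, -FLT_MAX, FLT_MAX)
    print(scale, wrong, right)
```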

Fixes: 35ae5dce39 ("mesa: don't pass Infs to the shader via gl_Fog.scale")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11412

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37204>
2025-09-06 07:20:31 +00:00
Yonggang Luo
885323ea3a tgsi/nir: Handling TGSI_OPCODE_RET in tgsi_to_nir
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11554

The nir_push_if is needed as more instructions will be added after `RET`.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37170>
2025-09-06 01:34:44 +00:00
Faith Ekstrand
c2a9a33f75 nvk: Use Vulkan formats for SET_ZT_FORMAT instead of NIL
The Vulkan format is actually depth/stencil while the NIL format
sometimes has the stencil swapped for X.

Fixes: 89110b8d1d ("nvk: Use the image format for depth views")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37208>
2025-09-06 00:26:48 +00:00
Emma Anholt
29fb897c0a ir3: Enable nir_opt_shrink_shrink_vec_array_vars.
The effect is surprisingly big, though it does seem to be concentrated in
just a few apps (Batman: Arkham Origins, Metro 2033 Redux, Shadow
Warrior):

Totals:
MaxWaves: 19680240 -> 19788620 (+0.55%); split: +0.55%, -0.00%
Instrs: 369291159 -> 367831500 (-0.40%); split: -0.40%, +0.01%
CodeSize: 936669580 -> 933798912 (-0.31%); split: -0.31%, +0.00%

...

Totals from 16918 (1.21% of 1402199) affected shaders:
MaxWaves: 125724 -> 234104 (+86.20%); split: +86.83%, -0.63%
Instrs: 11328230 -> 9868571 (-12.89%); split: -13.13%, +0.25%
CodeSize: 23684238 -> 20813570 (-12.12%); split: -12.24%, +0.12%
NOPs: 1633346 -> 1640119 (+0.41%); split: -2.09%, +2.50%
MOVs: 1940036 -> 510016 (-73.71%); split: -75.07%, +1.36%
COVs: 188107 -> 188546 (+0.23%); split: -0.32%, +0.56%
Full: 454239 -> 263078 (-42.08%); split: -42.80%, +0.71%
(ss): 251004 -> 231443 (-7.79%); split: -9.81%, +2.01%
(sy): 116086 -> 115153 (-0.80%); split: -2.38%, +1.58%
(ss)-stall: 738920 -> 794215 (+7.48%); split: -7.13%, +14.62%
(sy)-stall: 3321071 -> 3193717 (-3.83%); split: -5.58%, +1.74%
STPs: 101880 -> 71523 (-29.80%)
LDPs: 17406 -> 14411 (-17.21%)
Preamble Instrs: 2519390 -> 2548205 (+1.14%); split: -0.31%, +1.46%
Subgroup size: 1097472 -> 1097920 (+0.04%)

Cat0: 1833041 -> 1839613 (+0.36%); split: -1.91%, +2.27%
Cat1: 2128393 -> 698894 (-67.16%); split: -68.42%, +1.26%
Cat2: 3602449 -> 3595086 (-0.20%); split: -0.24%, +0.03%
Cat3: 2817384 -> 2815410 (-0.07%); split: -0.08%, +0.01%
Cat4: 273682 -> 273655 (-0.01%)
Cat5: 304630 -> 304398 (-0.08%)
Cat6: 207434 -> 179648 (-13.40%); split: -13.70%, +0.31%
Cat7: 161217 -> 161867 (+0.40%); split: -1.25%, +1.65%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>
2025-09-06 00:03:12 +00:00
Emma Anholt
b353f868dc ir3: Enable nir_opt_shrink_stores.
This pass strips trailing components that are not in the writemask of
store intrinsics, or trailing components that aren't part of an image's
format.

Totals from 11641 (0.83% of 1402199) affected shaders:
MaxWaves: 159402 -> 159422 (+0.01%); split: +0.08%, -0.07%
Instrs: 3073536 -> 3064117 (-0.31%); split: -0.59%, +0.28%
CodeSize: 7529906 -> 7417398 (-1.49%); split: -1.54%, +0.04%
NOPs: 286665 -> 289623 (+1.03%); split: -2.71%, +3.74%
MOVs: 85466 -> 74849 (-12.42%); split: -14.28%, +1.86%
Full: 116869 -> 116557 (-0.27%); split: -0.35%, +0.09%
(ss): 68245 -> 65758 (-3.64%); split: -5.23%, +1.59%
(sy): 31673 -> 31812 (+0.44%); split: -0.75%, +1.19%
(ss)-stall: 160473 -> 161653 (+0.74%); split: -3.63%, +4.37%
(sy)-stall: 668624 -> 668566 (-0.01%); split: -2.82%, +2.81%
Preamble Instrs: 1059243 -> 1033109 (-2.47%); split: -2.47%, +0.00%
Early Preamble: 10550 -> 10530 (-0.19%)
Subgroup size: 1172672 -> 1172416 (-0.02%); split: +0.01%, -0.03%

Cat0: 323161 -> 326364 (+0.99%); split: -2.50%, +3.49%
Cat1: 156177 -> 145280 (-6.98%); split: -7.92%, +0.95%
Cat2: 1448974 -> 1448964 (-0.00%)
Cat3: 874169 -> 874175 (+0.00%)
Cat5: 75743 -> 75742 (-0.00%)
Cat7: 38702 -> 36982 (-4.44%); split: -5.80%, +1.35%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>
2025-09-06 00:03:12 +00:00
Faith Ekstrand
baeb070a94 nvk: Stop adding Vulkan image usage flags
The sampled and color attachment bits don't actually affect image layout
in any meaningful way.  They just cause us to create extra descriptors
in cases where we may not need them.  However, now that meta always sets
view usage, we always create the usages meta needs, even if the client
doesn't request them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:15 +00:00
Faith Ekstrand
446d5ef103 vulkan: Drop the driver_internal from vk_image_view_init/create()
It always comes in through the create flags now.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:14 +00:00
Faith Ekstrand
d1ef8647ac v3dv: Use VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:13 +00:00
Faith Ekstrand
1897d5d9c9 radv: Use VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA
This does mean having to set the flag everywhere, which is a bit
annoying, but I don't think I missed any.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:12 +00:00
Faith Ekstrand
4eb098a6f1 nvk: Use VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA
Instead of having our own nvk_image_view_init() which passes through a
boolean, just set the create flag.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:12 +00:00
Faith Ekstrand
42abf00f2b vulkan: Handle VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA automatically
This moves the bit into vk_image.h and handles it automatically in
vk_image_view_init() so drivers don't have to.

This also means that Meta is now hitting the driver_internal path for
all its images so we need to do the same format fixups there that we
would normally do on the !driver_internal path.  We don't want to do
them unconditionally because v3dv and other drivers override
depth/stencil color formats and we don't want to break that.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:11 +00:00
Faith Ekstrand
e7b0cbdf40 vulkan/meta: Always set VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:11 +00:00
Faith Ekstrand
89110b8d1d nvk: Use the image format for depth views
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:11 +00:00
Connor Abbott
7527ad001a tu: Lower ViewIndex to 0 when multiview is disabled
This is an optimization, but it also seems to be required because the HW
sometimes fails to set ViewIndex to 0. This fixes flakes with
dEQP-VK.renderpass2.fragment_density_map.*multiviewport where the VS for
the main renderpass is reused for the copy renderpass afterwards and it
copies ViewIndex to ViewportIndex expecting it to be 0 since multiview
is disabled for the copy renderpass.

Closes: #13534
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37206>
2025-09-05 22:17:39 +00:00
Karol Herbst
5bb463bb48 nak/qmd: properly set target shared mem size
The shared memory settings in the QMD affect occupancy as the hardware
needs to manage the available shared memory across all workgroups.

We should set the target to the amount of shared memory used by all the
blocks that can run concurrently taking GPR usage and the local size into
account.

E.g. a shader using 88 gprs, 256 threads and a shared memory size of 18944
can have 2 blocks running concurrently, therefore on an Ampere we need to
set the target to 64kB to properly utilize the hardware.
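
The arithmetic behind this example can be sketched as below. The hardware numbers (registers and threads per SM, and the list of legal shared memory sizes) are assumptions chosen for illustration and are not taken from the commit; the function names are hypothetical, and register allocation granularity is ignored.

```python
# Assumed, illustrative hardware numbers (not taken from the commit).
REGS_PER_SM = 64 * 1024
MAX_THREADS_PER_SM = 1536
SMEM_CONFIGS_KIB = [0, 8, 16, 32, 64, 100]


def concurrent_blocks(gprs, threads):
    """Blocks that fit on one SM, limited by registers and thread count."""
    by_regs = REGS_PER_SM // (gprs * threads)
    by_threads = MAX_THREADS_PER_SM // threads
    return max(1, min(by_regs, by_threads))


def smem_target_kib(gprs, threads, smem_bytes):
    """Smallest assumed config that covers all concurrently running blocks."""
    needed = concurrent_blocks(gprs, threads) * smem_bytes
    for kib in SMEM_CONFIGS_KIB:
        if kib * 1024 >= needed:
            return kib
    return SMEM_CONFIGS_KIB[-1]


print(concurrent_blocks(88, 256), smem_target_kib(88, 256, 18944))
```

With these assumed limits, two blocks need 37888 bytes in total, which no longer fits in 32 KiB, so the next size up (64 KiB) is selected.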

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>
2025-09-05 20:01:02 +00:00
Karol Herbst
a0131b53ad nvk: use hardware limits for maxComputeSharedMemorySize
It doesn't change the reported values, but it will allow us to easily
advertise real hardware limits in the future.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>
2025-09-05 20:01:02 +00:00
Karol Herbst
1d5a1b11db nak/qmd: base shared mem size allocation on hardware limits
We can allocate more than 48k of shared memory, but the limits differ
across hardware, so we need to take it all into account to create the
shared memory splits the hardware can accept.

This does change behavior on Turing, but the assumption is that the
hardware has simply rounded up. Performance testing on Turing might be
needed to verify nothing regresses here.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>
2025-09-05 20:01:02 +00:00
Karol Herbst
b09deba713 nouveau/winsys: add shared memory size tables
It's a bit of a disaster, but each generation supports a different set of
shared memory configurations.

Knowing the maximum is important for compute shader performance; knowing
all the legal sizes is needed for QMD generation.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>
2025-09-05 20:01:01 +00:00
Karol Herbst
3c9fa18069 nvk: prepare for higher shared memory sizes
Hardware has up to 228k of available shared memory, so a 16-bit int isn't
enough for that.
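
A quick check of the overflow, reading "228k" as 228 KiB (an assumption about the unit); the variable names are illustrative only:

```python
UINT16_MAX = 0xFFFF

smem_bytes = 228 * 1024                 # 233472 bytes
fits_u16 = smem_bytes <= UINT16_MAX     # does the size fit in 16 bits?
truncated = smem_bytes & UINT16_MAX     # what a 16-bit field would keep

print(smem_bytes, fits_u16, truncated)
```

The size needs 18 bits, so storing it in a 16-bit field would silently keep only the low 16 bits.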

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>
2025-09-05 20:01:01 +00:00
Karol Herbst
083a3dc545 util: move typed_memcpy into macros.h
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>
2025-09-05 20:01:00 +00:00
Mel Henning
1c764357e8 nvk: Only copy 32-bits for cond render operand A
Now that we're guaranteed the upper 32 bits are zero initialized,
there's no reason we need to do a 64-bit write here.

This is a 0.3% performance improvement on the Sascha Willems
conditionalrender demo with all rendering disabled (638 fps -> 640 fps)

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>
2025-09-05 18:57:33 +00:00
Mel Henning
4d8e2f7768 nvk: Don't re-initialize cond rendering operand B
We can initialize this just once from the CPU side instead of
overwriting it each time using the copy engine.

This is a 5% performance improvement on the Sascha Willems
conditionalrender demo with all rendering disabled (607 fps -> 638 fps)

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>
2025-09-05 18:57:33 +00:00
Mel Henning
966a1b5380 nvk: Reuse the same cond render temp in a cmd_buf
Within a single command buffer, we know that our operations will happen
sequentially so we don't need to allocate a unique address per
vkCmdBeginConditionalRenderingEXT - we can re-use the same address
instead.

Improves perf on the Sascha Willems conditionalrender demo with all
rendering disabled by about 2% (595 fps -> 607 fps)

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>
2025-09-05 18:57:33 +00:00
Mel Henning
64b4e52755 nvk: Move cond rendering memory out of gart
This is a 41% performance improvement on the Sascha Willems
conditionalrender demo with all rendering disabled (422 fps -> 595 fps)

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>
2025-09-05 18:57:32 +00:00
Mel Henning
0b43a625f4 nvk: Remove gart from the name of cond_render_mem
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>
2025-09-05 18:57:32 +00:00
Connor Abbott
a89f897870 freedreno/ci: Add a750 sparse skips
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
de60f2ff68 tu: Advertise shaderResourceMinLod
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
e72fed3faa ir3: Support min_lod tex source
Use the .clp modifier. In order to fix
dEQP-VK.glsl.texture_functions.textureoffsetclamp.* we need to add a
workaround for an empirically-discovered problem.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
63959bb716 ir3: Assemble and disassemble .clp modifier
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
655934eef7 tu: Expose shaderResourceResidency
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
70cf40086c ir3: Implement sparse residency check
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
120f755bdb ir3: Assemble and disassemble rck modifier
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
918e25e158 tu: Support sparse residency for images
The tricky thing here is that we have to emulate the 64k "standard"
tile sizes in terms of the native 4k macrotiles. We do this by
manipulating which 4k pages get mapped, dividing the 64k tile into 4k
macrotiles and mapping each tile in such a way that, when viewed in
terms of the final swizzled image coordinates, the 4k tiles linearly
tile the image region that's supposed to be mapped to the 64k "tile".
Supporting the standard block sizes allows emulation layers to claim D3D
Tiled Resources Tier 2, which is required for the 12.0 feature level.
It's also required for ARB_sparse_texture2.
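
The size relationship can be sketched as simple grid arithmetic. This only counts and enumerates the 4k pieces of one 64k block in final image coordinates; it does not model the swizzle or page mapping that the commit actually handles. The 128x128 block is the Vulkan standard shape for 2D images with 32-bit texels, while the 32x32 macrotile shape and the helper name are hypothetical.

```python
BYTES_PER_TEXEL = 4
BLOCK_W, BLOCK_H = 128, 128   # 64 KiB standard block for 32-bit texels
TILE_W, TILE_H = 32, 32       # hypothetical 4 KiB macrotile shape


def macrotile_origins(block_w, block_h, tile_w, tile_h):
    """Texel origins of each macrotile inside one block, in row-major order."""
    return [(x, y)
            for y in range(0, block_h, tile_h)
            for x in range(0, block_w, tile_w)]


origins = macrotile_origins(BLOCK_W, BLOCK_H, TILE_W, TILE_H)
print(len(origins), origins[:5])
```

Under these assumptions, one 64k block is covered by 16 macrotiles, each of which corresponds to one 4k page to map.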

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
ae53234414 freedreno/fdl: Add sparse layout support
Compute the Vulkan "sparse miptail," add support for padding the array
stride in order to make sure that the sparse miptail is large enough as
mandated by the Vulkan spec, and add a function to compute the standard
sparse block size.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
166bda02aa freedreno/fdl: Handle layout differences for r8g8 images
We don't handle copying r8g8 tiled images yet, but at least return the
correct tile size and bank swizzle so that r8g8 sparse textures work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
7225334589 freedreno/fdl: Handle cpp=32 and cpp=64 when getting macrotile size
These can only happen with multisampled images, which aren't supported
by fdl_tiled_memcpy. However these cases can be hit by multisampled
sparse textures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
8ef64f2042 freedreno/fdl: Refactor and expose bank swizzling logic
For sparse, we will need to handle bank swizzling and macrotiles when
mapping sparse textures. However the functions for handling this were
leaking internal tiled_memcpy implementation details, like the concept
of a 256-byte "block" that doesn't really exist in the tiling (instead
everyone else deals with UBWC blocks, which may be 256 bytes or smaller,
and 4K macrotiles). Rewrite them to work in terms of macrotiles, and
take an fdl_layout.

In order to avoid having to pass an fdl_layout everywhere, pass around
the computed bank_mask and bank_swizzle instead. This also means that
we don't recompute them several times.

Finally, expose a function to compute the macrotile size, which will
also be needed to work with bank swizzling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
348ffdc996 freedreno/fdl: Expose fdl6_is_r8g8_layout() publicly
We will need to use this in other places in fdl.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Mike Blumenkrantz
6596bf69c6 zink: add another flag to determine whether linked program compile is done
it's otherwise possible for this to race and hit the draw before
precompile finishes without ever waiting on the fence

I guess this just worked coincidentally before?

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37197>
2025-09-05 16:29:15 +00:00
Mike Blumenkrantz
0b586d546d zink: remove rebar requirement for descriptor buffer support
this is not really relevant; if db is supported, use it

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37197>
2025-09-05 16:29:15 +00:00