Commit graph

6090 commits

Author SHA1 Message Date
Job Noorman
af8105d085 ir3/ra: ignore phis handled by shared RA
If shared RA is used, it may have handled some phis. These are already
ignored by regular RA in handle_phi but were used before that in
potentially dangerous ways. More specifically, the interval of such phis
was accessed which may cause an out-of-bounds read since it was never
created. Fix this by skipping such phis earlier.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: c6a932d4b3 ("ir3/ra: handle phis with preferred regs first")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34503>
2025-04-15 06:04:04 +00:00
Job Noorman
d8033ba173 ir3/ra: add helper for getting a dst interval
There have been multiple issues related to accessing intervals through
invalid register names. This usually results in a (difficult to
diagnose) out-of-bounds access. Wrap all the interval accesses in a
helper where we can assert that the name is in-bounds.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34503>
2025-04-15 06:04:04 +00:00
Connor Abbott
74531094cb ir3: Vectorize shared memory loads/stores
This drastically helps a Path of Exile 2 compute dispatch, going from
4.6ms to 2.7ms.
Totals from 969 (0.59% of 164134) affected shaders:
MaxWaves: 9586 -> 9560 (-0.27%); split: +0.02%, -0.29%
Instrs: 1252433 -> 1234724 (-1.41%); split: -1.47%, +0.05%
CodeSize: 2237424 -> 2195238 (-1.89%); split: -1.91%, +0.03%
NOPs: 362213 -> 360913 (-0.36%); split: -0.92%, +0.56%
MOVs: 58879 -> 59591 (+1.21%); split: -0.62%, +1.83%
Full: 15817 -> 15867 (+0.32%); split: -0.04%, +0.36%
(ss): 35671 -> 35434 (-0.66%); split: -1.80%, +1.14%
(sy): 23953 -> 23964 (+0.05%); split: -0.38%, +0.43%
(ss)-stall: 127807 -> 124930 (-2.25%); split: -3.43%, +1.18%
(sy)-stall: 583947 -> 585886 (+0.33%); split: -0.61%, +0.94%

Early-preamble: 317 -> 316 (-0.32%)
Cat0: 394577 -> 393316 (-0.32%); split: -0.85%, +0.53%
Cat1: 100335 -> 101057 (+0.72%); split: -0.36%, +1.08%
Cat2: 415880 -> 415835 (-0.01%); split: -0.05%, +0.04%
Cat3: 187928 -> 187929 (+0.00%); split: -0.00%, +0.00%
Cat5: 19143 -> 19148 (+0.03%)
Cat6: 69630 -> 52523 (-24.57%)
Cat7: 47160 -> 47136 (-0.05%); split: -0.56%, +0.51%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34441>
2025-04-14 17:22:47 +00:00
Connor Abbott
9977c4d682 ir3: Move load/store vectorization to finalize
Some frontends such as rusticl and turnip call the optimization loop
before choosing the shared memory layout, in order to be able to delete
variables that turn out to be unused. This means that we can't vectorize
them until after the first run of the optimization loop. Other drivers
also seem to do something similar.

This also has the benefit that by delaying vectorization of UBOs until
after they are lowered from derefs, we don't insert casts which remove
the ability of nir_lower_explicit_io to insert a range, which was
blocking the pushing of vectorized indirect UBO loads. This has a
significant positive impact on fossil-db:

Only doing vectorization later exposes a bug where vectorization could
change the bitsize after we used it to determine which descriptor to
use. It happened to work before because vectorization was usually done
early. To fix it, move adjusting the descriptor to a new pass that
happens after finalizing.

Totals:
MaxWaves: 2249140 -> 2281068 (+1.42%); split: +1.43%, -0.01%
Instrs: 49624230 -> 49143117 (-0.97%); split: -1.14%, +0.17%
CodeSize: 103796862 -> 104143744 (+0.33%); split: -0.98%, +1.31%
NOPs: 8489860 -> 8512218 (+0.26%); split: -1.55%, +1.81%
MOVs: 1531650 -> 1574911 (+2.82%); split: -1.37%, +4.20%
Full: 1814334 -> 1748906 (-3.61%); split: -3.64%, +0.03%
(ss): 1155395 -> 1128249 (-2.35%); split: -3.48%, +1.13%
(sy): 608650 -> 567972 (-6.68%); split: -7.32%, +0.64%
(ss)-stall: 4352550 -> 4340473 (-0.28%); split: -2.08%, +1.80%
(sy)-stall: 17852259 -> 16943647 (-5.09%); split: -6.25%, +1.16%
STPs: 24568 -> 24215 (-1.44%)
LDPs: 37799 -> 37468 (-0.88%)
Early-preamble: 115698 -> 113694 (-1.73%); split: +0.17%, -1.90%
Cat0: 9345228 -> 9367782 (+0.24%); split: -1.41%, +1.65%
Cat1: 2445265 -> 2549122 (+4.25%); split: -0.81%, +5.06%
Cat2: 18704736 -> 18377519 (-1.75%); split: -1.76%, +0.01%
Cat3: 14210303 -> 14130558 (-0.56%); split: -0.56%, +0.00%
Cat4: 1346895 -> 1346462 (-0.03%); split: -0.03%, +0.00%
Cat5: 1420418 -> 1420417 (-0.00%); split: -0.07%, +0.07%
Cat6: 745590 -> 549358 (-26.32%); split: -26.66%, +0.34%
Cat7: 1405795 -> 1401899 (-0.28%); split: -0.96%, +0.68%

Totals from 79089 (48.19% of 164134) affected shaders:
MaxWaves: 947648 -> 979576 (+3.37%); split: +3.40%, -0.03%
Instrs: 38664140 -> 38183027 (-1.24%); split: -1.47%, +0.22%
CodeSize: 80179110 -> 80525992 (+0.43%); split: -1.27%, +1.70%
NOPs: 6880907 -> 6903265 (+0.32%); split: -1.91%, +2.23%
MOVs: 1183855 -> 1227116 (+3.65%); split: -1.78%, +5.43%
Full: 1107056 -> 1041628 (-5.91%); split: -5.96%, +0.05%
(ss): 939342 -> 912196 (-2.89%); split: -4.28%, +1.39%
(sy): 457959 -> 417281 (-8.88%); split: -9.73%, +0.85%
(ss)-stall: 3664495 -> 3652418 (-0.33%); split: -2.47%, +2.14%
(sy)-stall: 12266805 -> 11358193 (-7.41%); split: -9.10%, +1.69%

STPs: 7494 -> 7141 (-4.71%)
LDPs: 7050 -> 6719 (-4.70%)
Early-preamble: 46339 -> 44335 (-4.32%); split: +0.43%, -4.75%
Cat0: 7548630 -> 7571184 (+0.30%); split: -1.75%, +2.05%
Cat1: 1823872 -> 1927729 (+5.69%); split: -1.09%, +6.78%
Cat2: 14767716 -> 14440499 (-2.22%); split: -2.22%, +0.01%
Cat3: 10630582 -> 10550837 (-0.75%); split: -0.75%, +0.00%
Cat4: 1150090 -> 1149657 (-0.04%); split: -0.04%, +0.00%
Cat5: 1068913 -> 1068912 (-0.00%); split: -0.09%, +0.09%
Cat6: 554910 -> 358678 (-35.36%); split: -35.82%, +0.45%
Cat7: 1119427 -> 1115531 (-0.35%); split: -1.20%, +0.86%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34441>
2025-04-14 17:22:46 +00:00
Connor Abbott
ec780eb0e7 ir3: Pass through access flags when lowering global accesses
This will let us do optimizations such as moving loads to a preamble.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483>
2025-04-14 16:53:34 +00:00
Job Noorman
35ec960f6f ir3: run cp after ir3_imm_const_to_preamble
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Now that ir3_cp has an option to not lower immediates to const
registers, we can use it after ir3_imm_const_to_preamble instead of
manually propagating immediates.

This fixes a lot of missed opportunities for early-preamble as we didn't
propagate the mova1 immediate which a caused a GPR to be used in many
preambles.

Totals:
Instrs: 49704517 -> 49703700 (-0.00%); split: -0.16%, +0.16%
CodeSize: 103917968 -> 103187072 (-0.70%); split: -0.82%, +0.11%
NOPs: 8516944 -> 8511764 (-0.06%); split: -0.78%, +0.72%
MOVs: 1534023 -> 1536385 (+0.15%); split: -1.12%, +1.27%
Full: 1816517 -> 1816548 (+0.00%); split: -0.05%, +0.06%
(ss): 1162108 -> 1161490 (-0.05%); split: -1.03%, +0.98%
(sy): 611398 -> 610311 (-0.18%); split: -0.80%, +0.62%
(ss)-stall: 4384529 -> 4388096 (+0.08%); split: -1.22%, +1.30%
(sy)-stall: 17858701 -> 17837101 (-0.12%); split: -0.87%, +0.74%
STPs: 25096 -> 25491 (+1.57%); split: -0.05%, +1.63%
LDPs: 37635 -> 38030 (+1.05%); split: -0.03%, +1.08%
Preamble Instrs: 12589113 -> 11391946 (-9.51%); split: -9.75%, +0.24%
Early Preamble: 115946 -> 122893 (+5.99%); split: +6.05%, -0.06%
Cat0: 9374513 -> 9370393 (-0.04%); split: -0.71%, +0.67%
Cat1: 2443348 -> 2446546 (+0.13%); split: -0.82%, +0.95%
Cat2: 18731502 -> 18731478 (-0.00%); split: -0.00%, +0.00%
Cat7: 1410092 -> 1410221 (+0.01%); split: -0.61%, +0.62%

Totals from 39189 (23.81% of 164575) affected shaders:
Instrs: 30656115 -> 30655298 (-0.00%); split: -0.26%, +0.26%
CodeSize: 61714230 -> 60983334 (-1.18%); split: -1.37%, +0.19%
NOPs: 6074700 -> 6069520 (-0.09%); split: -1.10%, +1.01%
MOVs: 1010392 -> 1012754 (+0.23%); split: -1.70%, +1.93%
Full: 617108 -> 617139 (+0.01%); split: -0.16%, +0.16%
(ss): 778842 -> 778224 (-0.08%); split: -1.54%, +1.46%
(sy): 362803 -> 361716 (-0.30%); split: -1.35%, +1.05%
(ss)-stall: 3203827 -> 3207394 (+0.11%); split: -1.67%, +1.78%
(sy)-stall: 9507680 -> 9486080 (-0.23%); split: -1.63%, +1.40%
STPs: 23004 -> 23399 (+1.72%); split: -0.06%, +1.77%
LDPs: 33942 -> 34337 (+1.16%); split: -0.04%, +1.20%
Preamble Instrs: 8090918 -> 6893751 (-14.80%); split: -15.18%, +0.38%
Early Preamble: 12246 -> 19193 (+56.73%); split: +57.25%, -0.52%
Cat0: 6656706 -> 6652586 (-0.06%); split: -1.00%, +0.94%
Cat1: 1546399 -> 1549597 (+0.21%); split: -1.30%, +1.50%
Cat2: 11642214 -> 11642190 (-0.00%); split: -0.00%, +0.00%
Cat7: 943911 -> 944040 (+0.01%); split: -0.91%, +0.92%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>
2025-04-14 04:37:28 +00:00
Job Noorman
226ec669d8 ir3/cp: ignore alias sources for sam.s2en
ir3_cp asserts that the first source of a sam.s2en is a collect which
isn't necessarily true after creating alias registers.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>
2025-04-14 04:37:28 +00:00
Job Noorman
1618c2495b ir3/cp: add option to disable immediate to const lowering
This will allow it to be used after ir3_imm_const_to_preamble so that we
don't have to do the propagation of immediates manually there.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>
2025-04-14 04:37:27 +00:00
Job Noorman
6546a40225 ir3: remove spaces in shader stats
The shaderdb scripts don't like them.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>
2025-04-14 04:37:27 +00:00
Valentine Burley
b49eaf0966 ci/lava: Consolidate piglit trace job definitions
Clean up LAVA job definitions.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:07 +00:00
Valentine Burley
87d58ea57a ci/piglit: Consolidate HWCI_TEST_SCRIPT for piglit traces
The HWCI_TEST_SCRIPT variable was always getting overwritten for these
definitions.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:06 +00:00
Valentine Burley
1aeedddbb6 ci/piglit: Drop redundant PIGLIT_PROFILES variable
PIGLIT_PROFILES was only used with the piglit-runner.sh script, which no
jobs were using anymore.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:06 +00:00
Renato Pereyra
7190949927 perfetto/android: align datasource names with tooling expectations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
A few Android tools are based on/assume the datasource names
gpu.renderstages and gpu.counters. It is less effort to align with that
naming for Android builds than to chase down those tools and fix them,
not to mention account for new tools that may be created in the future.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34330>
2025-04-08 18:29:10 +00:00
Rob Clark
ea6e69e9d3 tu: vdrm vtest support
In a few places, we need to deal with not having direct access to the
rendernode device.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:39 +00:00
Rob Clark
bf0e3d6274 virtio/vdrm: Add vtest backend
This allows for testing drm native ctx support without spinning up a VM.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:39 +00:00
Rob Clark
28ad8fd5b1 tu: Add some func traces
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:38 +00:00
Rob Clark
db88a490b8 tu: Avoid extraneous set_iova
The GEM_NEW ccmd already passes the iova, so we don't need an extra
SET_IOVA for newly created BOs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:38 +00:00
Rob Clark
081869e591 tu/vdrm: Fix userspace fence cmds
Somehow the update of the fence value to write was dropped, so the
cmdstream that wrote the fence value would simply write zero over and
over again.

Fixes: 84d6eedd5e ("tu: Refactor the submit path")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:38 +00:00
Rob Clark
471961d0ca ir3: Comment re-indent
To make this more readable.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:38 +00:00
Mike Blumenkrantz
b14c8128bf tu: check for valid descriptor set when binding descriptors
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
these pointers can be null, and they are checked as null in
pipeline layout creation, but here if the pointer is null it will crash

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34412>
2025-04-07 18:49:10 +00:00
Collabora's Gfx CI Team
fcf19bf335 Uprev ANGLE to 3818d37d5e94317f01810053b8f28c1f1e8b98e6
1b34d2a18a...3818d37d5e

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34378>
2025-04-07 18:16:00 +00:00
Eric Engestrom
90844640a1 freedreno/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34386>
2025-04-04 23:49:23 +00:00
Connor Abbott
536b2b13c8 tu: Implement VK_EXT_fragment_density_map_offset
Implement support for dynamic rendering, including suspending and
resuming render passes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34159>
2025-04-04 22:35:20 +00:00
Danylo Piliaiev
0e9854a894 tu: Implement VK_KHR_shader_clock
There is a special address defined in kernel from which ALWAYSON
counter could be read.

Blob uses this sequence to read it:
  getone #l15
  mov.s32s32 r2.y, -4096
  mov.s32s32 r2.z, 131071
  (rpt5)nop
  ldg.u32 r2.w, g[r2.y], 1
  ldg.u32 r2.y, g[r2.y+4], 1
  (sy)(ss)mov.s32s32 r48.x, (last)r2.w
  mov.s32s32 r48.y, (last)r2.y
  l15:

Passes:
 dEQP-VK.glsl.shader_clock.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29860>
2025-04-04 18:22:49 +00:00
Danylo Piliaiev
4b1b4ee10c freedreno,tu: Read and pass to compiler uche_trap_base
KGSL always exposed uche_trap_base, and MSM only recently
got support for it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29860>
2025-04-04 18:22:49 +00:00
Mark Collins
e4359cc49c tu/kgsl: Fix KGSL syncobj lifetime in no CB submit
The temporary syncobj created in the fast path of kgsl_queue_submit was
not being destroyed, and potentially being assigned to multiple syncobjs
without being properly duplicated. This could lead to a use-after-free
or double-free since multiple syncobjs could be assigned the same FD.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34328>
2025-04-04 16:54:17 +00:00
Mark Collins
cf4bd2e412 tu/kgsl: Revert "Remove zero CB queue submission fast path"
This reverts commit 0342d34bdb which
introduced a regression in the Turnip's KGSL backend, causing various
sync issues since KGSL doesn't advance the GPU timeline when a submit
without cmdbufs is made. A comment explaining the issue was added to the
code, and the fast path is reintroduced.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34328>
2025-04-04 16:54:17 +00:00
Job Noorman
78ef51aa04 ir3/opt_preamble: take alias.rt into account for rewrite cost
FS outputs can use const registers in alias.rt without a mov so take
this into account when calculating the rewrite cost of instructions.

Totals:
MaxWaves: 2765084 -> 2765130 (+0.00%); split: +0.00%, -0.00%
Instrs: 56289002 -> 56285073 (-0.01%); split: -0.01%, +0.00%
CodeSize: 118071672 -> 118076808 (+0.00%); split: -0.00%, +0.01%
NOPs: 9491112 -> 9492474 (+0.01%); split: -0.00%, +0.02%
MOVs: 1790085 -> 1786768 (-0.19%); split: -0.19%, +0.00%
Full: 2156693 -> 2156607 (-0.00%); split: -0.00%, +0.00%
(ss): 1329812 -> 1329546 (-0.02%); split: -0.03%, +0.01%
(sy): 686396 -> 686386 (-0.00%); split: -0.00%, +0.00%
(ss)-stall: 4995295 -> 4995185 (-0.00%); split: -0.02%, +0.01%
(sy)-stall: 19828966 -> 19828624 (-0.00%); split: -0.01%, +0.01%
Cat0: 10450369 -> 10451731 (+0.01%); split: -0.00%, +0.02%
Cat1: 2787946 -> 2784566 (-0.12%); split: -0.12%, +0.00%
Cat2: 21265787 -> 21264447 (-0.01%)
Cat3: 16207098 -> 16206536 (-0.00%)
Cat7: 1597849 -> 1597840 (-0.00%); split: -0.00%, +0.00%

Totals from 730 (0.36% of 200220) affected shaders:
MaxWaves: 6308 -> 6354 (+0.73%); split: +0.79%, -0.06%
Instrs: 258235 -> 254306 (-1.52%); split: -1.59%, +0.07%
CodeSize: 698806 -> 703942 (+0.73%); split: -0.28%, +1.02%
NOPs: 21040 -> 22402 (+6.47%); split: -1.85%, +8.33%
MOVs: 9426 -> 6109 (-35.19%); split: -35.52%, +0.33%
Full: 8914 -> 8828 (-0.96%); split: -1.03%, +0.07%
(ss): 5118 -> 4852 (-5.20%); split: -6.58%, +1.39%
(sy): 2118 -> 2108 (-0.47%); split: -1.18%, +0.71%
(ss)-stall: 17360 -> 17250 (-0.63%); split: -4.57%, +3.94%
(sy)-stall: 34921 -> 34579 (-0.98%); split: -5.90%, +4.92%
Cat0: 24734 -> 26096 (+5.51%); split: -1.58%, +7.09%
Cat1: 12311 -> 8931 (-27.46%); split: -27.70%, +0.24%
Cat2: 106329 -> 104989 (-1.26%)
Cat3: 100547 -> 99985 (-0.56%)
Cat7: 3646 -> 3637 (-0.25%); split: -0.91%, +0.66%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34279>
2025-04-04 14:17:10 +00:00
Zan Dobersek
248edb43c3 tu: allow D3D-compatible texture coordinate rounding
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When running under DXVK or vkd3d, the texture coordinate rounding behavior
should match D3D expectations. On Adreno, this behavior can be toggled
through the SP_TP_MODE_CNTL register.

A driconf-based option is introduced to help set the relevant register flag
that enables this behavior.

This fixes the cause of test_sampler_rounding test case failure in vkd3d on
Turnip's side, but a small change in vkd3d is also required, so the test
failure expectation isn't removed yet.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33987>
2025-04-04 10:09:47 +00:00
Zan Dobersek
3b1ca55b40 freedreno/registers: add useful A6XX_SP_TP_MODE_CNTL bitfields
Add additional bitfields for the A6XX_SP_TP_MODE_CNTL registers, ones that
we already use and the texcoord rounding mode bitfield that we'll need for
D3D-over-Vulkan implementations.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33987>
2025-04-04 10:09:47 +00:00
Zan Dobersek
335cc96069 tu: disable logic operations for float and sRGB formats
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Per spec, logic operations between fragment values and color attachments
should be disabled when attachments are using float or sRGB formats.
Regardless of attachment's format, enabled logic operations should keep
blending disabled.

Fixes: dEQP-VK.pipeline.*.logic_op_na_formats.*

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34212>
2025-04-03 15:48:19 +00:00
Job Noorman
45a5ccbf07 ir3/ra: create merge sets for splits/collects inserted for shared RA
Since shared RA happens after creating merge sets, newly inserted
splits/collects did not have merge sets created for them. Fix this by
creating merge sets for new instructions after shared RA.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
2025-04-03 12:06:18 +00:00
Job Noorman
0cafd07b0c ir3: add ir3_aggressive_coalesce helper
To allow us to create merge sets outside of ir3_merge_regs.c.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
2025-04-03 12:06:18 +00:00
Job Noorman
a0db2f9737 ir3/ra: assign interval offsets to new defs after shared RA
Shared RA might insert new defs to be handled by regular RA (e.g.,
shared spills). However, their interval offsets were not initialized
which caused their intervals to sometimes be mistakenly matched with
those containing offset 0. Fix this by calling index_merge_sets after
shared RA and modifying that function to only index new defs in that
case.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
2025-04-03 12:06:18 +00:00
Job Noorman
dd1ba74777 ir3: make shpe a terminator
shpe is a bit of a special instruction: it's not really a terminator
(i.e., it does not perform a jump) but it does have to stay at the end
of its block. Up to now, we tried to enforce this by creating const
write barriers on shpe; the assumption being that everything that
happens in the preamble ends in a write to the const file so shpe stays
at the end. Alas, it turns out this is not true: things like sampler
prefetches do not write the const file and nothing was preventing those
from being scheduled after shpe.

Instead of trying to create even more barrier dependencies, fix this by
making shpe a terminator. Both sched and postsched treat terminators
specially to make sure they always stay at the end of their block.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34290>
2025-04-03 08:16:59 +00:00
Danylo Piliaiev
f5019ee0d4 ir3: Fix shaders that write only color classified as empty
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Shader may have zero instructions and no prefetches but have inputs
that without modifications are used as output.

Fixed vkd3d test:
 test_depth_bias_behaviour

Fixes: b0a98d3b13
("ir3: Detect empty fragment shaders")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34348>
2025-04-03 06:47:43 +00:00
Connor Abbott
75178c4655 tu: Implement VK_QCOM_fragment_density_map_offset
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>
2025-04-03 05:37:56 +00:00
Connor Abbott
7351f8d587 tu/fdm: Skip some patchpoints when binning
In order to implement FDM offset, we will have to offset the viewport
and scissor in the binning pass. In order to do this, we have to pass a
bin with nonsensical negative offsets to the patchpoint function, which
would result in asserts when patching the load/store sequences. But we
don't really need to patch these anyways as they are unused during
binning, so add the ability to skip them when binning. FS params and
some implementations of CmdClearAttachments (that don't contribute to
visibility) can similarly be skipped.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>
2025-04-03 05:37:56 +00:00
Connor Abbott
df0c17f76e tu: Fix CmdClearAttachments with fragment density map
The clear may be a partial clear, in which case we need to make sure
that the clear rectangle is transformed into GMEM space so that it is
clipped correctly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>
2025-04-03 05:37:56 +00:00
Connor Abbott
0d4eed0e46 tu: Split out part of tiling config to vsc config
For FDM offset, we will need to expand the number of bins by 1, which
can change how pipes are allocated. We don't necessarily know whether
FDM offset will be used when creating the VkFramebuffer, so we'll have
to create two different configs when FDM is enabled. Split out the parts
that are affected by the number of bins into a separate "VSC config"
struct that will be duplicated with FDM offset.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>
2025-04-03 05:37:56 +00:00
Connor Abbott
304af47ba2 tu: Only allow power-of-two fragment areas
Non-power-of-two fragment areas can result in precision loss and missed
fragments, which was seen in an upcoming CTS test.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>
2025-04-03 05:37:56 +00:00
Job Noorman
02ff26be38 ir3: run opt_if after opt_vectorize
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nir_opt_vectorize could replace swizzled movs with vectorized movs in a
different block. If this happens with swizzled movs in a then block, it
could leave this block empty. ir3 assumes only the else block can be
empty (e.g., when lowering predicates) so make sure ifs are in that
canonical form again.

This fixes empty predication blocks in some shaders, for example:

predt
predf
...
prede

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34272>
2025-04-03 00:19:31 +00:00
Job Noorman
ee0ee2a317 ir3: don't sync every TCS/GEOM block
TCS/GEOM shaders need (sy)(ss) on their first instruction but we
accidentally set it on the first instruction of every block.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34257>
2025-04-02 23:37:35 +00:00
Connor Abbott
3ba315f205 ir3: Split mad with scalar ALU
At least on all a6xx/a7xx, mad.f32 and mad.f16 are not fused. This means
that when the sources of a NIR ffma are all uniform we can split it in
two to execute it on the scalar ALU. This is important to reduce
register pressure and make more preambles executed early.

On fossil-db the statistics are mostly a wash as expected, but with
early preambles increasing dramatically:

Totals:
MaxWaves: 2249180 -> 2249230 (+0.00%); split: +0.01%, -0.01%
Instrs: 49668884 -> 49662951 (-0.01%); split: -0.12%, +0.11%
CodeSize: 103662656 -> 103831154 (+0.16%); split: -0.22%, +0.38%
NOPs: 8502571 -> 8495568 (-0.08%); split: -0.61%, +0.53%
MOVs: 1554442 -> 1538804 (-1.01%); split: -2.01%, +1.01%
Full: 1820906 -> 1814292 (-0.36%); split: -0.39%, +0.03%
(ss): 1168628 -> 1165868 (-0.24%); split: -1.01%, +0.78%
(sy): 616751 -> 616521 (-0.04%); split: -0.52%, +0.49%
(ss)-stall: 4384397 -> 4361662 (-0.52%); split: -1.44%, +0.93%
(sy)-stall: 17850227 -> 17858949 (+0.05%); split: -0.58%, +0.63%

Early-preamble: 102262 -> 115702 (+13.14%)
Cat0: 9375820 -> 9367978 (-0.08%); split: -0.57%, +0.48%
Cat1: 2470212 -> 2454318 (-0.64%); split: -1.28%, +0.64%
Cat2: 18673655 -> 18707106 (+0.18%)
Cat3: 14227810 -> 14211106 (-0.12%)
Cat5: 1424184 -> 1424150 (-0.00%)
Cat7: 1404718 -> 1405808 (+0.08%); split: -0.39%, +0.47%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34115>
2025-04-02 23:08:39 +00:00
Connor Abbott
15660caa90 tu: Fix layer_count with dynamic rendering + multiview
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
With "classic" renderpasses, the VkFramebuffer's layerCount must be 1 if
multiview is enabled. We accidentally rely on this to not disable GMEM
for multiview, and possibly for other things too. Apparently the dynamic
rendering equivalent, VkRenderingInfo::layerCount, can be anything when
multiview is enabled, and some CTS tests set it to the number of views.
Sanitize it when constructing the internal framebuffer for dynamic
rendering.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34080>
2025-04-02 15:47:47 +00:00
Danylo Piliaiev
c538a9ec6e tu: Use EARLY_Z also for stencil tests
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
EARLY tests can test and write out stencil values.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33851>
2025-04-02 12:03:30 +00:00
Danylo Piliaiev
534cf4feeb tu/lrz: Improve LRZ around stencil tests and reads_dest cases
We were a bit too conservative and fully disabled LRZ for when stencil
or blending were involved. There is no need to fully disable LRZ
in those cases, only LRZ writes should be disabled.

The final rules are:
 LRZ is DISABLED until depth attachment is cleared when:
   - Depth Write + changing direction of depth test
      e.g. from OP_GREATER to OP_LESS;
   - Depth Write + OP_ALWAYS or OP_NOT_EQUAL;
   - Clearing depth with vkCmdClearAttachments;
   - Depth image is a target of blit commands.
   - (pre-a650) Not clearing depth attachment with LOAD_OP_CLEAR;
   - (pre-a650) Using secondary command buffers;
 LRZ WRITE is DISABLED until depth attachment is cleared when:
   - Depth Write + blending (color blend, logic ops, partial color mask, etc.);
   - Fragment may be killed by stencil;
 LRZ is disabled for CURRENT draw when:
   - Fragment shader side-effects (writing to SSBOs, atomic operations, etc);
   - Fragment shader writes depth or stencil;
 LRZ WRITE is DISABLED (via LATE_Z) for CURRENT draw when:
   - Fragment may be via killed alpha-to-coverage, discard, sample coverage;

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33851>
2025-04-02 12:03:30 +00:00
Aaron Ruby
8513bcbd2f virtio: Remove virglrenderer_hw.h entirely
Capset definitions replaced by those in virtgpu_drm.h

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34300>
2025-04-01 22:11:10 +00:00
Collabora's Gfx CI Team
1ce0cef6bf Uprev ANGLE to 1b34d2a18af12cc55a3bc74dd679c2937d10cc5c
6abdc11741...1b34d2a18a

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34277>
2025-04-01 12:51:06 +00:00
Danylo Piliaiev
be481e6615 tu: Disable FS in certain cases even if FS is not empty
If FS doesn't have side-effects and color write mask is zero.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33735>
2025-03-31 12:15:56 +02:00