Commit graph

199086 commits

Author SHA1 Message Date
Ian Romanick
f0bf68dd25 brw/const: Remove TODO that isn't allowed by the hardware
There are a lot of restrictions for bfloat16. The one that prevents this
very useful optimization from being possible is, "Broadcast of bfloat16
scalar is not supported."

Part of the reason this MR exists is to build up to implementing BF
support, and there are a couple more commits that implement
this. However, it fails on both real hardware and simulation:

    Instruction is: mad (8|M0) r6.0<1>:f 0xBF80:bf r2.0<8;1>:f r64.0<0>:f

    In bfloat/float mixed mode, bfloat src must be packed.

Alas.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
99d3755bdd brw/const: Allow HF constants in MAD on Gfx11
These can't mix with F values, but if the non-constant sources are
already HF, this is allowed in src0.

No shader-db changes on any Intel platform.

fossil-db:

Ice Lake
Totals:
Instrs: 236027458 -> 236027442 (-0.00%)
Cycle count: 24515944704 -> 24515945379 (+0.00%)

Totals from 8 (0.00% of 798454) affected shaders:
Instrs: 10226 -> 10210 (-0.16%)
Cycle count: 58567 -> 59242 (+1.15%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
4c462b6b32 brw/const: Allow constants in integer MAD
Nothing can generate this currently, but a future commit will.

The Bspec and experimentation support the following limitations:

- Gfx11: Either src0 or src2 can be W or UW.
- Gfx12: Either src0 or src2 can be W or UW.
- Gfx12.5: Both src0 and src2 can be W or UW.
- Gfx20: Both src0 and src2 can be W or UW.

v2: Add missing break statement.

v3: Leave the MAD handling in the case with the other 3 source
instructions. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
9fa6b68f9e brw/const: Refactor checking whether an immediate source is allowed
Should be no functional change here. This simplifies some later changes.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
69d74739fd brw/algebraic: Don't restrict MAD(a, b, 1) optimization to float32
This is very unlikely for floating point MAD. At some point I intend
to add internal integer MAD uses, and this could occur there.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
b605f76b2a brw/algebraic: Constant fold multiplicands of MAD
v2: Move the full constant folding part to
brw_constant_fold_instruction. Suggested by Caio. I did this by
extracting the core part of the folding to a helper function.

v3: Delete stale comment. Noticed by Caio.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 18090847 -> 18090843 (<.01%)
instructions in affected programs: 150 -> 146 (-2.67%)
helped: 1 / HURT: 0

total cycles in shared programs: 919664648 -> 919663210 (<.01%)
cycles in affected programs: 3426 -> 1988 (-41.97%)
helped: 1 / HURT: 0

LOST:   1
GAINED: 0

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 220496486 -> 220496403 (-0.00%)
Cycle count: 31610880908 -> 31610879044 (-0.00%); split: -0.00%, +0.00%

Totals from 70 (0.01% of 702439) affected shaders:
Instrs: 47018 -> 46935 (-0.18%)
Cycle count: 6335504 -> 6333640 (-0.03%); split: -0.11%, +0.09%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
3a16ad71b7 brw/copy: Commute immediates for MAD multiplicands
This enables constant combining to do its job.

v2: Restore accidentally deleted line from a comment. Noticed by Caio.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total cycles in shared programs: 919668392 -> 919669310 (<.01%)
cycles in affected programs: 10125264 -> 10126182 (<.01%)
helped: 348 / HURT: 194

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Cycle count: 31610720660 -> 31610692748 (-0.00%); split: -0.00%, +0.00%

Totals from 9066 (1.29% of 702433) affected shaders:
Cycle count: 810411934 -> 810384022 (-0.00%); split: -0.01%, +0.00%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
e3e58d6f48 brw: Emit immediate value for MAD in canonical position
No shader-db changes on any Intel platform.

fossil-db:

Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Meteor Lake shown)
Totals:
Cycle count: 25096109024 -> 25096108722 (-0.00%); split: -0.00%, +0.00%

Totals from 4106 (0.51% of 797610) affected shaders:
Cycle count: 63266176 -> 63265874 (-0.00%); split: -0.01%, +0.01%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
d9b019b683 brw/copy: Don't try to be clever about ADD3 constant propagation
Always propagate into any source. Let commute_immedates and constant
combining sort out the mess. It's literally their job.

No shader-db changes on any Intel platform. The fossil-db changes just
appear to be subtle changes in register allocation if the immediate
source changes from src0 to src2.

v2: Update the comment in commute_immediates. Suggested by Caio.

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Cycle count: 31610720510 -> 31610720660 (+0.00%); split: -0.00%, +0.00%

Totals from 8 (0.00% of 702433) affected shaders:
Cycle count: 5522382 -> 5522532 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
a84e3a0f55 brw/const: Allow mixing signed and unsigned immediate sources
No shader-db or fossil-db changes on any Intel platform. This commit
just prevents issues with a later commit, "brw/copy: Don't try to be
clever about ADD3 constant propagation."

v2: Use 'can_promote = true; break;' instead of 'return
true;'. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
a738c55d7b brw/algebraic: Partial constant folding of ADD3
Fold the cases where one of the sources is zero or two of the sources
are constants. Both case will result in a regular ADD.

No shader-db or fossil-db changes on any Intel platform. This commit
just prevents issues with a later commit, "brw/copy: Don't try to be
clever about ADD3 constant propagation."

v2: Move the full constant folding part to
brw_constant_fold_instruction. Suggested by Caio.

v3: Eliminate the impossible src.file == BAD_FILE case in
brw_fs_opt_algebraic. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
c52ce6157f brw/emit: Fix typo in recently added ADD3 assertion
The current assertion fails as soon as a MAD with src0 and src2 being
immediate is detected.

The assertion was supposted to catch, "If it's ADD3, only one of src0
and src2 can be immediate." The detect this, the opcode test should have
been !=.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: c1c09e3c4a ("brw/emit: Add correct 3-source instruction assertions for each platform")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
25de9dcd76 brw/algebraic: Fix MUL constant folding
Some callers of brw_constant_fold_instruction depend on the result being
a MOV of immediate when progress is made. Previously `MUL dst:D src0:D
1:D` would be converted to `MOV dst:D src0:D`. There was also no
handling for `MUL dst:D imm0:D imm1:D`.

This could cause problems if one of the immedate values was -1. The
existing code would convert this to a `MOV dst:D imm0:D` and set the
negate flag on src0. That is not correct.

v2: Fix the is_negative_one case handling of the non-negative-one
source. Add a comment explaining the assertion. Both suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2cc1575a31 ("brw/algebraic: Refactor constant folding out of brw_fs_opt_algebraic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
Ian Romanick
086e83ccd9 brw/algebraic: Fix ADD constant folding
Some callers of brw_constant_fold_instruction depend on the result being
a MOV of immediate when progress is made. Previously `ADD dst:D src0:D
0:D` would be converted to `MOV dst:D src0:D`. There was also no
handling for `ADD dst:D imm0:D imm1:D`.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2cc1575a31 ("brw/algebraic: Refactor constant folding out of brw_fs_opt_algebraic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>
2024-12-13 01:24:26 +00:00
duncan.hopkins
191d7c6cb6 kopper: Add '#if' guard around loader_dri3_get_pixmap_buffer to stop missing symbol on MacOS.
MacOS does not support DRI3.

Reviewed-By: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32568>
2024-12-13 00:39:16 +00:00
duncan.hopkins
568a4ca899 glx: ignore zink check for has_explicit_modifiers and DRI3 on MacOS.
MacOS has neither of these so always fails to start up zink.

Reviewed-By: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32568>
2024-12-13 00:39:16 +00:00
duncan.hopkins
e89eba0796 glx: change #if guard around dri_common.h to stop missing 'driDestroyConfigs' symbol on MacOS builds.
Reviewed-By: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32568>
2024-12-13 00:39:16 +00:00
Caio Oliveira
c8f6d8154f intel/brw: Remove overloads for brw_print_instruction/s functions
Almost all cases now handled with default arguments.  The only real
extra work that was being done was pushed to the client code in
debug_optimizer().

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32596>
2024-12-12 22:01:48 +00:00
Alyssa Rosenzweig
41076b2a55 radeonsi: use mesa_prim_has_adjacency
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
b7f2d480da agx: optimize scratch access
so we can use designated initializers and other fun features.

all affected shaders are in gfxbench:

total instructions in shared programs: 2750549 -> 2750497 (<.01%)
instructions in affected programs: 10832 -> 10780 (-0.48%)
helped: 4
HURT: 2
Inconclusive result (value mean confidence interval includes 0).

total alu in shared programs: 2278478 -> 2278760 (0.01%)
alu in affected programs: 7040 -> 7322 (4.01%)
helped: 2
HURT: 4
Alu are HURT.

total fscib in shared programs: 2276985 -> 2277267 (0.01%)
fscib in affected programs: 7040 -> 7322 (4.01%)
helped: 2
HURT: 4
Fscib are HURT.

total bytes in shared programs: 19922466 -> 19922734 (<.01%)
bytes in affected programs: 71412 -> 71680 (0.38%)
helped: 4
HURT: 2
Inconclusive result (value mean confidence interval includes 0).

total regs in shared programs: 865070 -> 865086 (<.01%)
regs in affected programs: 142 -> 158 (11.27%)
helped: 0
HURT: 2

total uniforms in shared programs: 2120930 -> 2121034 (<.01%)
uniforms in affected programs: 244 -> 348 (42.62%)
helped: 0
HURT: 2

total scratch in shared programs: 11576 -> 11600 (0.21%)
scratch in affected programs: 2744 -> 2768 (0.87%)
helped: 0
HURT: 2

total spills in shared programs: 958 -> 868 (-9.39%)
spills in affected programs: 958 -> 868 (-9.39%)
helped: 6
HURT: 0

total fills in shared programs: 732 -> 626 (-14.48%)
fills in affected programs: 732 -> 626 (-14.48%)
helped: 4
HURT: 2

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
923e6361d1 compiler/glsl_types: add glsl_get_word_size_align_bytes
this alignment matches what nir_lower_scratch_to_var wants. this is not
correctness bearing but it mitigates stats regressions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
bd89279dd4 nir: add lower_scratch_to_var pass
to ease opencl pain.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
d5a4aa756f asahi: use mesa_prim_has_adjacency
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
8abb043c19 compiler: add mesa_prim_has_adjacency helper
hk will use this, it's a pretty obvious thing to want.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
e4f61771d8 compiler: use libcl.h for CL
instead of redefining BITFIELD_BIT.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
d695c84829 libagx: port to common libcl.h
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
a0694fd5c3 libagx: drop pointless helper
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
c34635c58d agx: implement halts
just translate to a stop. seems to work fine.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
21c16fe343 asahi,hk: wire up printf, abort
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Lionel Landwerlin
36623697d1 hk: fix timeline value type
Signed-off-by: Lionel Landwerlin <llandwerlin@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
dd4805fcc8 asahi/clc: remap __FILE__
important for reproducability. wondering if we can do this in common code but
not sure how yet.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
bfe1fd737b asahi: allow c23 extensions
hk already does. this quiesches warnings with single argument static_assert
which we want for CL parity.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
13a4186c96 util/bitpack_helpers: make partially CL safe
add enough preprocessor guards that we can include this from CL and get basic
implementations of things. FIXED packs are missing due to llroundf (probably
fixable).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
d64caf4161 libcl: add VkDraw(Indexed)IndirectCommand definitions
this is helpful to indirect draw munging code, which applies to at least 3
stacks using driver CL stuff (current Intel, shortterm Asahi, mediumterm
Panfrost)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
12e27497b3 libcl: add a common header for CPU/GPU stuff
In an attempt to make OpenCL shaders more "batteries included", start building
up a standard library. Based on libagx.h.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
13b8af95fb clc: plumb cl_khr_subgroup_ballot
although rusticl isn't lighting it up yet, it's helpful to get
sub_group_ballot for driver CL, which is all standard Vulkan-compatible spirv.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Paulo Zanoni
d4a54d4f92 brw: don't read past the end of old_src buffer in resize_sources()
In this case, num_sources is bigger than this->sources, so if we loop
up to num_sources (instead of this->sources) we'll end up reading past
the end of old_src[]. Only copy up to what we originally had.

This was found by code inspection, I'm not aware of any applications
failing due to the lack of this patch.

Fixes: d9e737212d ("intel/brw: Add a src array for the common case in fs_inst")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32600>
2024-12-12 20:33:13 +00:00
Samuel Pitoiset
c7a7f0244f radv: add radv_lower_terminate_to_discard and enable for Indiana Jones
To workaround game bug.

This fixes the rendering issue with eyes.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32606>
2024-12-12 19:54:39 +00:00
Samuel Pitoiset
4d4418dbb3 spirv: add an options to lower SpvOpTerminateInvocation to OpKill
To workaround game bugs like Indiana Jones.

Original workaround found by Hans-Kristian.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32606>
2024-12-12 19:54:39 +00:00
Erik Faye-Lund
976eb6825e panvk: do not require opt-in for panvk on v10
As of writing, PanVK on v10 HW is in pretty good shape. It's not yet
conformant, but we were passing over 99.9% of the CTS last time I
checked. That's probably good enough to drop the opt-in here.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32561>
2024-12-12 19:32:06 +00:00
Erik Faye-Lund
12067727fa panvk: soften the language around opt-in
We already have and use vk_warn_non_conformant_implementation(), so
we're already being clear that PanVK is not yet conformant. Let's not
repeat that information here, and instead focus on it not being
well-tested.

This brings the wording more or less in-line with NVK.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32561>
2024-12-12 19:32:06 +00:00
Timur Kristóf
deab81fb0d radv: Configure implicit VS primitive ID to be per-primitive.
This is beneficial to applications that rely on
the implicit primitive ID from VS.

- We don't have to disable provoking vertex reuse,
  which results in more efficient vertex processing.
- There is no LDS access needed to export the primitive ID,
  because it is already available to GS threads.
- As a consequence of not needing LDS, we can use this
  together with NGG passthrough mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:47 +00:00
Timur Kristóf
95ac0f8d76 radv: Reorder FS primitive ID input after layer and viewport.
We want to make the implicit VS primitive ID a per-primitive
output attribute, which means that this has to be last.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:47 +00:00
Timur Kristóf
9224b9a752 ac/nir/ngg: Add ability to store primitive ID as per-primitive.
This configuration will be enabled in RADV in a subsequent commit.

On GFX10.3:
Do this together with the primitive export, to avoid adding extra
CF, and to ensure optimal access of the export space.

On GFX11:
It's not an export but a memory store instruction, so always do
it earlier and ensure the optimal attribute ring access pattern.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:45 +00:00
Timur Kristóf
d670dc0c0b radv: Only set NGG_DISABLE_PROVOK_REUSE for VS.
It doesn't do anything useful for other stages.

In VS, we use this when the implicit primitive ID is needed,
so that we can export that as a per-vertex attribute of the
provoking vertex.

In TES, the patch ID (which is used as the primitive ID) is
already a per-vertex input VGPR, so it doesn't make sense to
configure this.

In GS, the primitive ID is explicitly written by the shader,
so it makes no sense to disable provoking vertex reuse in the
input.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:45 +00:00
Rhys Perry
9fe92689cc radv: increase maxComputeWorkGroupCount[0]
Match AMDVLK and radeonsi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32577>
2024-12-12 17:38:47 +00:00
Rhys Perry
53d0187bab aco: decrease max_workgroup_size
Match the limit of radeonsi and RADV.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32577>
2024-12-12 17:38:46 +00:00
Rhys Perry
87f2f77960 aco: fix max_workgroup_count[0]
This is necessary for radeonsi.

fossil-db (navi21):
Totals from 292 (0.37% of 79395) affected shaders:
Instrs: 305965 -> 306182 (+0.07%); split: -0.00%, +0.07%
CodeSize: 1624816 -> 1627212 (+0.15%); split: -0.00%, +0.15%
Latency: 5244652 -> 5243587 (-0.02%); split: -0.07%, +0.05%
InvThroughput: 1221089 -> 1225285 (+0.34%); split: -0.04%, +0.38%
Copies: 22712 -> 22702 (-0.04%)
PreSGPRs: 10713 -> 10712 (-0.01%)
PreVGPRs: 10918 -> 10920 (+0.02%)
VALU: 178613 -> 178836 (+0.12%)
SALU: 43490 -> 43493 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32577>
2024-12-12 17:38:46 +00:00
Lionel Landwerlin
e0b5179869 blorp: use 2D dimension for 1D tiled images
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31eeb72e45 ("blorp: Add support for blorp_copy via XY_BLOCK_COPY_BLT")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32608>
2024-12-12 17:10:45 +00:00
Erik Faye-Lund
cfb5687cb3 panvk: disable imageCubeArray on bifrost
We haven't wired this up correctly on Bifrost, so let's make this V10
only for now.

Fixes: 605c173fbd ("panvk: update feature support")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32610>
2024-12-12 15:10:26 +00:00