Commit graph

205571 commits

Author SHA1 Message Date
Kenneth Graunke
20222cd956 anv: Use the new nir_opt_acquire_release_barriers pass
Improves performance of Phasmophobia with the "Eye Adaptation" video
setting enabled on Arc B570 by about 9.5%.

fossil-db results on Battlemage:

   Totals:
   Instrs: 148797922 -> 148797865 (-0.00%)
   Send messages: 7066341 -> 7066317 (-0.00%)
   Cycle count: 21459978352 -> 21459975048 (-0.00%)

   Totals from 8 (0.00% of 574410) affected shaders:
   Instrs: 4633 -> 4576 (-1.23%)
   Send messages: 479 -> 455 (-5.01%)
   Cycle count: 611886 -> 608582 (-0.54%)

Observed to cut 15% of sends in a Phasmophobia shader, 8.3% in a Far Cry
New Dawn shader, 7% in a Borderlands 3 DX11 shader, and 3.4-3.7% of
sends in a few Witcher 3 and Dark Souls 3 shaders.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33504>
2025-05-16 00:29:13 +00:00
Kenneth Graunke
deb1d47155 nir: Add a new optimization for acquire/release atomics & barriers
Some shaders contain back-to-back atomic accesses in SPIR-V with
AcquireRelease semantics.  In NIR, we translate these to a release
memory barrier, the atomic, then an acquire memory barrier.

This results in a lot of unnecessary memory barriers in the middle
of the sequence of atomics:

   0. Release barrier
   1. Atomic
   2. Acquire barrier
   3. Release barrier
   4. Atomic
   5. Acquire barrier
   6. Release barrier
   7. Atomic
   8. Acquire barrier

In the absence of loads/stores, and when the atomic destinations are
unused, these barriers in-between atomics shouldn't be required.

This optimization pass would drop them (lines 2-3 and 5-6 above) while
leaving the first and last barriers (0 and 8), so the sequence remains
synchronized against other access elsewhere in the program.

One common example where this occurs is a sequence of min and max
atomics to clamp a certain memory location's value within a range.
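
As an illustration only, the barrier-dropping idea can be sketched on a toy instruction list. This is not the actual NIR pass: the `Instr` enum and `drop_inner_barriers` function are hypothetical names, and the model assumes a flat instruction sequence in which an acquire/release barrier pair is redundant only when it sits directly between two atomics and the preceding atomic's destination is unused.

```rust
// Toy model (assumption): a flat list of instructions, not real NIR.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Instr {
    ReleaseBarrier,
    AcquireBarrier,
    Atomic { dest_used: bool },
    LoadStore,
}

// Drop every "acquire, release" pair that sits directly between two atomics,
// provided the atomic before the pair has an unused destination.
// The first release and the last acquire barrier are always kept.
fn drop_inner_barriers(instrs: &[Instr]) -> Vec<Instr> {
    let mut out = Vec::with_capacity(instrs.len());
    let mut i = 0;
    while i < instrs.len() {
        let redundant_pair = i >= 1
            && i + 2 < instrs.len()
            && matches!(instrs[i - 1], Instr::Atomic { dest_used: false })
            && instrs[i] == Instr::AcquireBarrier
            && instrs[i + 1] == Instr::ReleaseBarrier
            && matches!(instrs[i + 2], Instr::Atomic { .. });
        if redundant_pair {
            i += 2; // skip the acquire and the release
        } else {
            out.push(instrs[i]);
            i += 1;
        }
    }
    out
}
```

Applied to the nine-entry sequence above, this sketch keeps entries 0, 1, 4, 7 and 8. A load/store between the barriers, or an atomic whose result is used, leaves the sequence untouched.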

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33504>
2025-05-16 00:29:13 +00:00
Rob Clark
65e18a8494 freedreno: Fix shader-clock when kernel exposes UCHE_TRAP_BASE
Fixes: 4b1b4ee10c ("freedreno,tu: Read and pass to compiler uche_trap_base")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35010>
2025-05-15 22:27:17 +00:00
Yinjie Yao
089e2cb6f9 radeonsi: Disable av1 cdef_channel_strength for VCN4
VCN4 hardware doesn't support this feature; it can only be supported on VCN5.

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35004>
2025-05-15 18:52:08 +00:00
Seán de Búrca
10fad5081d nouveau: implement Default for Push
By convention, a struct with a `new()` method which has no parameters
should have a `Default` impl which calls `new()`.
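
A minimal sketch of the convention, assuming a simplified stand-in for the type: the `words` field and `is_empty()` method here are hypothetical and do not reflect the real layout of nouveau's `Push`.

```rust
// Hypothetical, simplified stand-in for a push buffer type.
pub struct Push {
    words: Vec<u32>,
}

impl Push {
    // A parameterless constructor...
    pub fn new() -> Self {
        Push { words: Vec::new() }
    }

    pub fn is_empty(&self) -> bool {
        self.words.is_empty()
    }
}

// ...is conventionally paired with a `Default` impl that forwards to it.
impl Default for Push {
    fn default() -> Self {
        Self::new()
    }
}
```

With this in place, `Push::default()` and `Push::new()` are interchangeable, and the type can be used wherever a `Default` bound is required.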

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
f4f4b25d25 nak,nil: style cleanup
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
adecea4af9 nak,nouveau: adjust function/method signatures to better match convention
v2: restore `to_cssa()` naming

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
e559c63fd8 nak,nil: elide lifetimes where possible
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
e4f045df58 nak,nil: avoid explicit returns at the end of functions
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
e32c82d0f5 nak: use standard methods and macros to improve readability
v2: Leave `Op::is_branch()` and `Op::no_scoreboard()` matches alone
v3: Revert additional changes with unclear readability

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
ba2b9345e8 nak: use Option propagation instead of explicit let-else clauses
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
f2cc77dca8 nak: collapse extraneous conditional branches
v2: Revert collapsing of branches per review

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
451b37820d nak: remove unnecessary casts and conversions
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
2025-05-15 17:52:32 +00:00
Seán de Búrca
e4d895f0e1 rusticl: fix build with clippy driver
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35002>
2025-05-15 09:35:17 -07:00
Seán de Búrca
35af55a2a7 rusticl: replace map_or(false, f) with is_some_and(f)
A new clippy lint fails on this pattern, causing build errors on
versions >= 1.84.0. `is_some_and()` is stable from 1.70.0.
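
For illustration, the two forms are equivalent. The function names and the predicate below are made up for this sketch and are not taken from rusticl.

```rust
// Old pattern: flagged by the newer clippy lint.
fn is_large_old(value: Option<u32>) -> bool {
    value.map_or(false, |v| v > 10)
}

// Replacement: same result, stated more directly.
fn is_large_new(value: Option<u32>) -> bool {
    value.is_some_and(|v| v > 10)
}
```

Both return `false` for `None` and otherwise apply the predicate to the contained value.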

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35002>
2025-05-15 09:35:08 -07:00
José Roberto de Souza
cb6f96a1e8 anv: Remove a '#if GFX_VER >= 30' block inside of a else of '#if GFX_VERx10 >= 125'
Removing dead code.

Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
2025-05-15 15:25:12 +00:00
José Roberto de Souza
37b42ef648 anv: Drop '#if GFX_VERx10 >= 125' inside of '#if GFX_VERx10 >= 125'
This is just redundant.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
2025-05-15 15:25:12 +00:00
José Roberto de Souza
bca12800aa iris: Restrict platforms that need Wa_1604061319
It was being applied even to platforms that don't require it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
2025-05-15 15:25:12 +00:00
José Roberto de Souza
3cd972a2d3 anv: Enable preemption due 3DPRIMITIVE in GFX 12
The issues preventing it from being enabled were fixed, so we can now
enable it, but we also need to re-enable workaround 16013994831.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
2025-05-15 15:25:12 +00:00
José Roberto de Souza
2432d6677e anv: Implement missing part of Wa_1604061319
The description of this workaround is not clear, but judging by the Iris
implementation we need to emit all 3DSTATE_PUSH_CONSTANT_ALLOC_XS if
any 3DSTATE_PUSH_CONSTANT_ALLOC_XS is emitted.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
2025-05-15 15:25:12 +00:00
Ashley Smith
a1376449c8 panvk: Expose support for multiview on v7
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34832>
2025-05-15 14:04:29 +00:00
Ashley Smith
4171917210 panvk: Add support for VK_KHR_multiview on v7
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34832>
2025-05-15 14:04:29 +00:00
Rob Clark
d8ed4f14e6 freedreno/ir3: Fix tess/geom asan error
Fixes: ee0ee2a317 ("ir3: don't sync every TCS/GEOM block")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34989>
2025-05-15 12:46:16 +00:00
Georg Lehmann
3f70433ff0 aco: add type information for operands/definitions
More information available for use in the optimizer.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29695>
2025-05-15 12:17:17 +00:00
Corentin Noël
6c1c116a0f virgl: Avoid possible double free when destroying the hw resource
When a resource is un-referenced, the reference count is decremented,
and intentionally no lock is acquired. This can result in the following
race condition when a resource is created from a handle:

```
[Thread] Operation
[0] Create resource from handle for the first time, refcount set to 1
[0] resource is unreferenced, refcount is decremented to 0 (intentionally
    no mutex is locked)
[0] before entering virgl_hw_res_destroy to lock
    virgl_drm_winsys::bo_handles_mutex the thread yields
[1] Create resource from handle pulls the resource from
     virgl_drm_winsys::bo_handles, refcount is incremented to 1
[1] resource is unreferenced, refcount is decremented to 0
[1] Enter virgl_hw_res_destroy,
[1] acquire the lock on virgl_drm_winsys::bo_handles_mutex
[1] check reference count to be 0, yes -> the resource is destroyed
[1] release the lock on virgl_drm_winsys::bo_handles_mutex
[0] Enter virgl_hw_res_destroy,
[0] acquire the lock on virgl_drm_winsys::bo_handles_mutex
[0] Here the res pointer already points to freed memory
[0] check reference count to be 0, yes -> the resource is destroyed (again!)
double free or corruption (!prev)
```

To work around this race condition, keep track of the number of times
the resource was pulled from virgl_drm_winsys::bo_handles to see whether
it has to be kept alive despite the reference count being zero.

This can be reproduced with the `spec@ext_image_dma_buf_import@ext_image_dma_buf_import-refcount-multithread`
piglit test.

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34809>
2025-05-15 10:38:13 +00:00
Mary Guillemard
1c57581856 pan/lib: Make pan_shader.c not GENX
We move pan_raw_format_mask_midgard to pan_format.c instead, so that
pan_shader.c no longer depends on any GENX code.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
2025-05-15 10:41:07 +02:00
Mary Guillemard
0bb9df9d33 pan/lib: Make pan_shader_get_compiler_options not GENX
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
2025-05-15 10:40:57 +02:00
Mary Guillemard
7158f2eb8b pan/lib: Make pan_shader_compile not GENX
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
2025-05-15 10:40:47 +02:00
Mary Guillemard
1fa13ceb74 pan/lib: Move pan_fixup_blend_type to pan_blend.c
Also move bifrost_blend_type_from_nir to pan_blend.c, rename it, and
make it not GENX.

This part is related to blend so it makes more sense to have it there
and this will allow us to make pan_shader.c not GENX.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
2025-05-15 10:40:36 +02:00
Mary Guillemard
b3f8c955a7 pan/genxml: Add Register File Format to common.xml
This was added in v6+ and never changed.
This will allow us to remove GENX code logic that is identical.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
2025-05-15 10:40:23 +02:00
Mary Guillemard
60b131a712 pan/bi: Lower ffract in bifrost_nir_algebraic on v11+
On v11+, because FROUND.v2f16 is gone we end up with precision issues.
We now lower ffract in bifrost_nir_algebraic instead of during common
algebraic to ensure lower_bit_size has been performed.

This fixes
"dEQP-GLES3.functional.shaders.builtin_functions.common.fract.vec2_lowp_vertex"
and
"dEQP-GLES31.functional.shaders.builtin_functions.common.fract.vec2_lowp_compute".

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34970>
2025-05-15 07:12:44 +00:00
Mary Guillemard
5588ff49a7 pan/bi: Flush subnormals to zero for FROUND on v11+
FROUND on v11+ does not flush subnormals to zero even when configured in
the shader program header.

We now use FLUSH.ftz on the input of FROUND to ensure proper
behavior when rounding up and down with FTZ enabled.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34970>
2025-05-15 07:12:44 +00:00
Hans-Kristian Arntzen
e674823d55 radv: Consider that DGC might need shader reads of predicated data.
As with the indirect draw barrier, conditional rendering access needs
similar fixups.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34956>
2025-05-15 06:14:46 +00:00
Samuel Pitoiset
b79f1a3af3 ac/gpu_info: allow 32-bit predicate on GFX11+
This is natively supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
3ca2f71f3d radv: fix conditional rendering with DGC and non native 32-bit predicate
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.

However, consider the case where conditional rendering is enabled in the
stateCommandBuffer used by preprocess(), and execute() is also recorded
in that stateCommandBuffer. If preprocess() is recorded in a different
cmdbuf which is submitted before the cmdbuf that contains execute(),
the fallback (i.e. alloc + COPY_DATA) will be performed afterwards. This
causes the predicate value to always be 0.

To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.

This fixes a very weird corner case with vkd3d-proton.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
e2625fa9ca radv: fix fetching conditional rendering state for DGC preprocess
This state must be fetched from the stateCommandBuffer, not from the
current cmdbuf which executes the preprocess().

Partial fix for https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Faith Ekstrand
d808870d49 nvk: Implement VK_EXT_zero_initialize_device_memory
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13159
Reviewed-By: Thomas H.P. Andersen <phomes@gmail.com>
Reviewed-By: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34968>
2025-05-15 03:20:12 +00:00
Faith Ekstrand
f542a60686 nak: Add a helper to reduce OpPrmt sel immediates
Only the bottom 16 bits of the select source matter, so we can
throw away the top 16 bits and avoid any i20 encoding issues.  All of
the back-ends were already doing this except SM70, which has 32-bit
immediates anyway.  However, doing it in a common place where it's
documented is better than scattering it everywhere.  Also, doing it as
part of legalization ensures that we see the same thing in the
post-legalize IR as gets encoded.
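
A rough sketch of the masking described above. Both helper names are hypothetical rather than the ones added to NAK, and `fits_i20` assumes that "i20" means a sign-extended 20-bit immediate.

```rust
// Hypothetical helper: keep only the 16 selector bits that matter.
fn reduce_prmt_sel(sel: u32) -> u32 {
    sel & 0xffff
}

// Assumption: an i20 immediate is a sign-extended 20-bit value,
// i.e. it must lie in the range [-2^19, 2^19).
fn fits_i20(imm: u32) -> bool {
    let v = imm as i32;
    (-(1 << 19)..(1 << 19)).contains(&v)
}
```

Under those assumptions, a selector such as `0xdead_3210` does not fit in 20 bits as written, but its reduced form `0x3210` always does, since any 16-bit value is below 2^19.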

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:27 +00:00
Faith Ekstrand
212f99d39d nak: Add a helper for reducing OpShfl lane and c immediates
Every back-end has code to mask these because the hardware only has
limited encoding space.  However, this can be done as a common
legalization operation and doing so means that our post-legalize IR
matches what actually gets encoded.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:25 +00:00
Faith Ekstrand
9890110856 nak: Reduce shift immediates instead of adding copies
SM20 was smart enough to reduce shift immediates instead of just
detecting i20 overflow and adding copies.  This adds helpers to make
this easier and propagates the improvement out to all the back-ends.
Even though it isn't necessary on Volta+, we might as well do it there
for consistency and because smaller shift values are easier to read in
the final assembly.
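
To illustrate why a shift immediate can be reduced without changing the result, here is a sketch for 32-bit left shifts. The function names and the exact wrap/clamp semantics are assumptions made for this example, not NAK's actual helpers: with `wrap` set only the low 5 bits of the shift count are used, and without it any count of 32 or more produces 0.

```rust
// Reference semantics assumed for this sketch.
fn shl32(x: u32, shift: u32, wrap: bool) -> u32 {
    let s = if wrap { shift & 31 } else { shift };
    // checked_shl returns None when the shift count is >= 32.
    x.checked_shl(s).unwrap_or(0)
}

// Hypothetical reduction: produce a small immediate with the same effect.
fn reduce_shift_imm(shift: u32, wrap: bool) -> u32 {
    if wrap {
        shift & 31
    } else {
        shift.min(32)
    }
}
```

In this model the reduced immediate never exceeds 32, so it is easy to encode and to read, and `shl32` gives the same result for the original and the reduced count.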

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:24 +00:00
Faith Ekstrand
87a90a0e6a nak: Add HW tests for OpShr and OpShl
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:23 +00:00
Faith Ekstrand
d3e917ea03 nak: Fix OpShf folding for shift >= 64
The checked_shr wasn't returning the correct value if .wrap was not set.
We also weren't checking this case in the unit tests so we missed it.
While we're here, get rid of a bunch of pointless `as u64` as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:22 +00:00
Faith Ekstrand
fa58199166 nak/sm20: Remove some unnecessary Option<>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:22 +00:00
Hyunjun Ko
7ddf51dc99 anv: Fix to set CDEF filter flag correctly.
This fixes playback of av1_intel_broken2.ivf.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
2025-05-15 01:02:05 +00:00
Hyunjun Ko
2e256a3cee anv: Allocate MV buffers enough for AV1 decoding.
Other video memory for AV1 is already allocated at the maximum sizes,
so do the same for MV buffers.

This fixes a number of artifacts during AV1 playback.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
2025-05-15 01:02:05 +00:00
Hyunjun Ko
f4d480f808 anv: Always allocate cdf tables when independent profiles provided
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
2025-05-15 01:02:05 +00:00
Faith Ekstrand
b5e657da48 nak/sm70: Don't set a predicate destination on redg
Reduction ops don't return anything, including predicates.  On Turing
through Hopper, this doesn't matter because these bits are ignored.
However, Blackwell uses those bits to adjust address calculations for
reduction ops.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
e2b7a736a4 nak/nir/lower_tex: Use nir_tex_instr_add_src()
This is slightly less efficient but way safer than trying to mangle the
sources array that's already in the tex instruction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Dave Airlie
8a39a1502f nak: Use TexOffsetMode for all texture ops
We had a bool for most of them and an enum for OpTld4. Now we have an
enum for all of them and we just reserve PerPx for OpTld4.  While we're
here, rework printing to put the "." in the enum display method.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
4c6010df64 nak/sm70: imnmx takes and returns more predicates on Blackwell+
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00