Commit graph

205537 commits

Author SHA1 Message Date
Samuel Pitoiset
3ca2f71f3d radv: fix conditional rendering with DGC and non native 32-bit predicate
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.

However, consider the case where conditional rendering is enabled in the
stateCommandBuffer used by preprocess(), and execute() is also recorded
in the stateCommandBuffer. If preprocess() is recorded in a different
cmdbuf which is submitted before the cmdbuf that contains execute(),
the fallback (i.e. alloc + COPY_DATA) will be performed afterwards. This
would cause the predicate value to always be 0.

To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.

This fixes a very weird corner case with vkd3d-proton.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
e2625fa9ca radv: fix fetching conditional rendering state for DGC preprocess
This state must be fetched from the stateCommandBuffer, not from the
current cmdbuf which executes the preprocess().

Partial fix for https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Faith Ekstrand
d808870d49 nvk: Implement VK_EXT_zero_initialize_device_memory
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13159
Reviewed-By: Thomas H.P. Andersen <phomes@gmail.com>
Reviewed-By: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34968>
2025-05-15 03:20:12 +00:00
Faith Ekstrand
f542a60686 nak: Add a helper to reduce OpPrmt sel immediates
Only the bottom 16 bits of the select source matter so we can
throw away the top 16 bits and avoid any i20 encoding issues.  All of
the back-ends were already doing this except SM70 which has 32-bit
immediates anyway.  However, doing it in a common place where it's
documented is better than scattering it everywhere.  Also, doing it as
part of legalization ensures that we see the same thing in the
post-legalize IR as gets encoded.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:27 +00:00
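
The reduction described in f542a60686 can be sketched as a single mask. This is a minimal illustration, assuming the select immediate is a plain `u32`; the function name `reduce_prmt_sel` is hypothetical and is not taken from the NAK sources.

```rust
/// Hypothetical helper (not NAK's actual code): keep only the low 16 bits
/// of a PRMT select immediate, since the commit message states that only
/// those bits matter. The masked value is then small enough to avoid
/// overflowing a narrow immediate encoding.
fn reduce_prmt_sel(sel: u32) -> u32 {
    sel & 0xffff
}
```

Doing the masking in one legalization step, rather than in each back-end, means the immediate printed in the post-legalize IR is the same value that ends up encoded.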
Faith Ekstrand
212f99d39d nak: Add a helper for reducing OpShfl lane and c immediates
Every back-end has code to mask these because the hardware only has
limited encoding space.  However, this can be done as a common
legalization operation and doing so means that our post-legalize IR
matches what actually gets encoded.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:25 +00:00
Faith Ekstrand
9890110856 nak: Reduce shift immediates instead of adding copies
SM20 was smart enough to reduce shift immediates instead of just
detecting i20 overflow and adding copies.  This adds helpers to make
this easier and propagates the improvement out to all the back-ends.
Even though it isn't necessary on Volta+, we might as well do it there
for consistency and because smaller shift values are easier to read in
the final assembly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:24 +00:00
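
A rough sketch of the idea in 9890110856, reducing a shift immediate rather than copying it to a register when it is too large to encode. The semantics here are assumptions for illustration only: a 32-bit shift, a `wrap` flag meaning the shift amount is taken modulo 32, and a non-wrapping mode in which any amount of 32 or more behaves like 32. The name `reduce_shift_imm` is hypothetical.

```rust
/// Hypothetical sketch (assumed semantics, not NAK's actual code):
/// replace a 32-bit shift immediate with a small equivalent value.
fn reduce_shift_imm(shift: u32, wrap: bool) -> u32 {
    if wrap {
        // Wrapping shifts only look at the low 5 bits of the amount.
        shift & 31
    } else {
        // Assumed clamping shifts: every amount >= 32 gives the same
        // result as exactly 32, so larger values can be clamped.
        shift.min(32)
    }
}
```

Under these assumptions the reduced value never exceeds 32, so it always fits in a small immediate field and reads more clearly in the final assembly.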
Faith Ekstrand
87a90a0e6a nak: Add HW tests for OpShr and OpShl
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:23 +00:00
Faith Ekstrand
d3e917ea03 nak: Fix OpShf folding for shift >= 64
The checked_shr wasn't returning the correct value if .wrap was not set.
We also weren't checking this case in the unit tests so we missed it.
While we're here, get rid of a bunch of pointless `as u64` as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:22 +00:00
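
The pitfall behind d3e917ea03 can be illustrated with Rust's standard `u64::checked_shr`, which returns `None` when the shift amount is 64 or more. The sketch below is an assumption about the general shape of such a fold, not the actual NAK code; `fold_shr_u64` and its `wrap` parameter are hypothetical names.

```rust
/// Hypothetical constant fold of a 64-bit logical right shift.
/// `wrap` is assumed to mean "take the shift amount modulo 64".
fn fold_shr_u64(x: u64, shift: u32, wrap: bool) -> u64 {
    if wrap {
        // Only the low 6 bits of the shift amount are used.
        x >> (shift & 63)
    } else {
        // checked_shr yields None for shift >= 64; a non-wrapping
        // logical shift by that much shifts every bit out, giving 0.
        x.checked_shr(shift).unwrap_or(0)
    }
}
```

A unit test that exercises `shift >= 64` with `wrap` both set and unset covers the case the commit message says was previously missed.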
Faith Ekstrand
fa58199166 nak/sm20: Remove some unnecessary Option<>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
2025-05-15 02:32:22 +00:00
Hyunjun Ko
7ddf51dc99 anv: Fix to set CDEF filter flag correctly.
This fixes playback of av1_intel_broken2.ivf.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
2025-05-15 01:02:05 +00:00
Hyunjun Ko
2e256a3cee anv: Allocate MV buffers enough for AV1 decoding.
As other video memories for AV1 are already allocated for the maximum
sizes, now it does the same for MV buffers too.

This fixes a number of artifacts during AV1 playback.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
2025-05-15 01:02:05 +00:00
Hyunjun Ko
f4d480f808 anv: Always allocate cdf tables when independent profiles provided
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
2025-05-15 01:02:05 +00:00
Faith Ekstrand
b5e657da48 nak/sm70: Don't set a predicate destination on redg
Reduction ops don't return anything, including predicates.  On Turing
through Hopper, this doesn't matter because these bits are ignored.
However, Blackwell uses those bits to adjust address calculations for
reduction ops.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
e2b7a736a4 nak/nir/lower_tex: Use nir_tex_instr_add_src()
This is slightly less efficient but way safer than trying to mangle the
sources array that's already in the tex instruction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Dave Airlie
8a39a1502f nak: Use TexOffsetMode for all texture ops
We had a bool for most of them and an enum for OpTld4. Now we have an
enum for all of them and we just reserve PerPx for OpTld4.  While we're
here, rework printing to put the "." in the enum display method.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
4c6010df64 nak/sm70: imnmx takes and returns more predicates on Blackwell+
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
9d89214a69 nak/sm70: Use rZ for the 3rd source of lea when .hi is not set
Pre-Blackwell, it's ignored so we can set whatever.  On Blackwell+, it
seems to be taken into account somehow (more RE needed?) so we need to
set it to rZ to get the old behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
32f78eff80 nak/sm70: Fix bra offset encoding for Hopper+
They split the field to add 8 more bits on Hopper.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
046f90ad56 nak/copy_prop: Don't propagate cbufs into ALU on Blackwell+
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:08 +00:00
Faith Ekstrand
9604896c70 nak/lower_copy: Implement copy from CBuf as ldc on Blackwell+
Constant buffer sources for ALU instructions are removed on Blackwell so
we have to use ldc instead of mov.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Faith Ekstrand
994035908d nak/hw_tests: Copy data stride and invocations to avoid cbuf sources
CBuf sources are gone on Blackwell+.  Let's not introduce them directly
in the unit tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Faith Ekstrand
8c3ebddba3 nak/sm70: Properly encode ldc on Blackwell+
Also add nvdisasm tests for ldc because it's pretty important and has
lots of subtle per-SM differences.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Faith Ekstrand
0b142182cb nak/sm70: Increase the number of UGPRs on Blackwell+ to 80
This also affects encodings as rZ is now 255 instead of 63.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
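
The encoding consequence noted in 0b142182cb (rZ moving from 63 to 255) could be expressed as a small per-SM lookup. This is only a sketch: the function name is hypothetical, and the assumption that "Blackwell+" corresponds to an SM number of 100 or higher is mine, not stated in the commit.

```rust
/// Hypothetical sketch: index used to encode the zero register in the
/// uniform register file. The `sm >= 100` test for Blackwell+ is an
/// assumption made for illustration.
fn uniform_zero_reg_index(sm: u8) -> u8 {
    if sm >= 100 {
        255
    } else {
        63
    }
}
```

Keeping the value behind one function means encoders do not hard-code 63, which is the kind of constant that silently breaks when the register file grows.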
Dave Airlie
da16e8aff7 nvk: Add hopper priv registers
The priv registers moved. I've confirmed hopper and above.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Dave Airlie
1c77a6f049 nvk: Don't emit MME FIFO config on Blackwell+
I can't see this in traces on Blackwell and it causes hangs.

These regs are in the hopper class headers so should be fine there.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Dave Airlie
bd7777aee6 nvk: Fix compute class comparison in dispatch indirect
This works by coincidence rather than design.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Dave Airlie
693b55a4af nouveau/headers: Add stub blackwell class headers
These just have the class define.  We'll replace them with the actual
headers from NVIDIA once we have them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
2025-05-15 00:11:07 +00:00
Eric Engestrom
2bc7130808 r300/ci: switch radeon.ko jobs to common kernel (6.13.7)
Now that it is possible to have more than one initrd, let's switch to
the common b2c kernel which requires two additional initrds:

 * The GPU initrd which contains amdgpu, i915, nouveau, radeon, and xe,
   along with their necessary firmware
 * The depmod initrd which contains what's necessary to modprobe the
   modules of the GPU initrd

Since the GPU initrd is huge (73 MB), let's reduce the size by dropping
all the firmware that is not related to AMD.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34881>
2025-05-14 23:06:42 +00:00
Emma Anholt
e4790143a5 tu: Disable Z reads for always/never.
This saves a bit of bandwidth when we're not going to use the value.
Improves renderpass times across 4 affected traces I tested (bioshock,
stranded deep, transport fever, and godot material testers) on sysmem by
.3% +/- .1%.

A similar change for avoiding stencil reads showed no change on the one
app affected among all of our renderdoc traces, so that's left out.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34964>
2025-05-14 22:34:08 +00:00
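
The reasoning in e4790143a5 is that an always or never depth compare gives the same outcome regardless of the stored depth, so the stored value does not have to be fetched. A minimal sketch of that decision follows; the enum and function names are hypothetical and do not come from the turnip sources.

```rust
/// Hypothetical depth compare operations, mirroring the usual set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CompareOp {
    Never,
    Less,
    Equal,
    LessOrEqual,
    Greater,
    NotEqual,
    GreaterOrEqual,
    Always,
}

/// Returns true when the depth test has to read the existing Z value.
/// Never and Always produce a fixed result, so no read is needed.
fn depth_test_needs_z_read(op: CompareOp) -> bool {
    !matches!(op, CompareOp::Never | CompareOp::Always)
}
```

Every other compare op depends on the stored depth, so the read stays enabled for them.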
Olivia Lee
c053bc2213 panvk: fix driconf memory leak
The driconf options were leaked when the panvk instance was destroyed.

Fixes: aa8fec638f ("panvk: add basic driconf infrastructure")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34986>
2025-05-14 21:55:26 +00:00
Marek Olšák
3fd2bdd285 radeonsi: move si_gs_output_info into si_temp_shader_variant_info
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
97357e721d radeonsi: add struct si_temp_shader_variant_info
This contains all shader info that's used during compilation,
but is never used after compilation.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
53cd29d946 radeonsi: move shaders args initialization into its own file
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
af8c4f19ab radeonsi: move shader variant info and spi_ps_input_ena code into its own file
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
2e8cac328a radeonsi: move si_nir_mark_divergent_texture_non_uniform to its own file
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
deda05e2b7 nir: move nir_lower_color_inputs into radeonsi
it's the only user

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
70aa58cc95 radeonsi: move shader info structures into new file si_shader_info.h
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
5389a3736f radeonsi: move NIR passes from si_shader.c into their own files
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
e478410466 radeonsi: inline shader_info in si_shader_info, keep only what's used
This reduces the si_shader_info size by 244 B.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
dc5e0e2b73 radeonsi: rename num_stream_output_components -> num_gs_stream_components
it's not for streamout

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
54cc89f7c2 radeonsi: use a simpler way to gather enabled_streamout_buffer_mask
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
180f320e69 radeonsi: use info.num_streamout_vec4s instead of si_shader_uses_streamout
It's identical now.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
759de230de radeonsi: don't declare GDS size for LLVM
We don't use GDS memory.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:17 +00:00
Marek Olšák
32274ab50e radeonsi: implement remove_streamout in si_nir_kill_outputs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00
Marek Olšák
100f9a1624 radeonsi: move xfb fields from si_shader_info to shader variant info
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00
Marek Olšák
9edcf19f7d radeonsi: remove si_shader_info::writes_position
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00
Marek Olšák
c761da42ce radeonsi: don't use si_shader_info in si_parse_next_shader_property
just use NIR info.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00
Marek Olšák
20e5c35cfe radeonsi: gather uses_discard from shader variants
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00
Marek Olšák
de6ca8c7ec radeonsi: gather writes_z/stencil/sample_mask as shader variant info
si_get_shader_variant_info doesn't need to check the kill flags because
killed stores are removed from NIR before that.

Only shader variants need to clear the writes_* flags if the epilog kills
them.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00
Marek Olšák
a9ac95fc0a radeonsi: gather uses_gs_state_provoking_vtx_first/outprim from the shader
Just use nir_def_bits_used() instead of the manually written conditions.

These are gathered from bits of load_scalar_arg(vs_state_bits):
- uses_vs_state_indexed
- uses_gs_state_provoking_vtx_first
- uses_gs_state_outprim

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
2025-05-14 20:19:16 +00:00