Commit graph

157764 commits

Author SHA1 Message Date
Mike Blumenkrantz
29908aa493 aux/tc: fix address calc for segmented texture subdata
this fixes all dimension/array uses for the rp tracking path

Fixes: 51ad269198 ("aux/tc: handle stride mismatch during rp-optimized subdata")

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25180>
(cherry picked from commit 5f73b8976b)
2023-09-20 10:49:16 +01:00
Mike Blumenkrantz
414f6b3af6 aux/tc: fix staging buffer sizing for texture_subdata
this is the size of the src data, not the dst data

Fixes: 51ad269198 ("aux/tc: handle stride mismatch during rp-optimized subdata")

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25180>
(cherry picked from commit b6bc1f85f4)
2023-09-20 10:48:57 +01:00
Erico Nunes
98032b406b lima: fix plbu block stride calculation
For some specific texture sizes, notably some texture sizes with width
4096, block stride calculation could end up calculating stride 256 which
is an invalid value.
In those specific cases, this could cause rendering artifacts or
application/driver crashes.

Cc: mesa-stable

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25084>
(cherry picked from commit cb1c88d41f)
2023-09-18 11:13:16 +01:00
Mike Blumenkrantz
b5ccb6901c nir/inline_uniforms: fix oob access with nir_find_inlinable_uniforms
the array dimensionality needs to match nir_add_inlinable_uniforms even if
only the first member is used

Fixes: 0c0fb216dd ("nir/inline_uniforms: Allow possibility of more than one UBO")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25063>
(cherry picked from commit 39fca243bb)
2023-09-12 14:26:47 +01:00
Danylo Piliaiev
cbfb7e930d ir3/lower_tex_prefetch: Fix crash with lowered load_barycentric_at_offset
ir3_nir_lower_tex_prefetch expects src0 of load_interpolated_input to
be intrinsic, however this assumption broke when src0 is
load_barycentric_at_offset and is lowered in series of alu instructions.

 32x2  %1121 = @load_barycentric_at_offset (%1120) (interp_mode=0)
 32x4  %1118 = @load_interpolated_input (%1121, %1116 (0x0)) ...
 32x2    %32 = vec2 %1118.x, %1118.y
 32x4    %37 = (float32)tex %36 (texture_handle), %34 (sampler_handle), %32 (coord), 0 (texture), 0 (sampler)

is lowered into:

 [...]
 32      %54 = ffma %46.y, %52, %50
 32      %55 = ffma %46.y, %53, %51
 32x2    %56 = vec2 %54, %55
 32x4    %57 = @load_interpolated_input (%56, %25 (0x0))
 [...]

Crash backtrace:

 #5  in __GI___assert_fail (assertion=0x7ff6692328 "parent && parent->type == nir_instr_type_intrinsic",
     file=0x7ff66921c8 "nir.h", line=2536, function=0x7ff6692630 <__PRETTY_FUNCTION__.13> "nir_instr_as_intrinsic")
     at assert.c:101
 #6  in nir_instr_as_intrinsic (parent=0x7fd4b648e8) at nir.h:2536
 #7  in coord_offset (ssa=0x7fd4b649d0) at ir3_nir_lower_tex_prefetch.c:77
 #8  in coord_offset (ssa=0x7fd4b64a90) at ir3_nir_lower_tex_prefetch.c:48
 #9  in ir3_nir_coord_offset (ssa=0x7fd4b64a90) at ir3_nir_lower_tex_prefetch.c:104
 #10 in lower_tex_prefetch_block (block=0x7fd482c100) at ir3_nir_lower_tex_prefetch.c:185
 #11 in lower_tex_prefetch_func (impl=0x7fd4aa0890) at ir3_nir_lower_tex_prefetch.c:218
 #12 in ir3_nir_lower_tex_prefetch (shader=0x7fd4942b10) at ir3_nir_lower_tex_prefetch.c:242

Cc: mesa-stable

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25096>
(cherry picked from commit b16472fc97)
2023-09-12 14:26:47 +01:00
Mike Blumenkrantz
e7cf3b3d3e aux/tc: handle stride mismatch during rp-optimized subdata
to avoid splitting renderpasses, this subdata optimization handles the usual
driver dance of staging buffer -> gpu copy

if the pbo stride doesn't match the image format's stride, however, then
a direct copy will yield broken pixels and the image will misrender. to avoid this,
detect stride mismatch and translate the single subdata call into a sequence
of non-overlapping subdata calls that the driver can magically figure out
while continuing to not split renderpasses

fixes #9589

cc: mesa-stable

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24849>
(cherry picked from commit 51ad269198)
2023-09-12 14:26:47 +01:00
Mike Blumenkrantz
35791e0352 zink: set is_xfb=false for all i/o variables
this can affect streamout generation, even though it so far hasn't

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24950>
(cherry picked from commit e87b24719f)
2023-09-12 14:26:47 +01:00
David Rosca
1fb0c702c4 frontends/va: Flush after unmapping VAImageBufferType
If application changed image data we need to flush on unmap to make the
changes visible. This will also flush if the mapping was used only for
reading, but we can't know that as vaMapBuffer doesn't have a parameter
to specify if read or write is requested.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9774

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25102>
(cherry picked from commit d6299ec258)
2023-09-12 14:26:47 +01:00
Georg Lehmann
3adf614dad nir/opt_algebraic: remove broken fddx/fddy patterns
These patterns are broken in the following scenario:

%1 = f2fmp %0
%2 = fddx %1
%3 = ... // non quad uniform
if %3 {
   %4 = f2f32 %2
   ...
}

Which would turn into

%3 = ...
if %3 {
   %4 = fddx %0
   ...
}

Yet another example that shows why derivative instructions should be
be intrinsics, not alu.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25014>
(cherry picked from commit 136a698251)
2023-09-12 14:26:46 +01:00
Lionel Landwerlin
540dc4d6b5 hasvk: add state cache invalidation back before fast clears
Prior to 87149cc545, blorp added a state cache invalidation prior to
fast clears. This got dropped on Hasvk.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 87149cc545 ("blorp: update and move fast clear PIPE_CONTROLs to drivers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24719>
(cherry picked from commit 9231f24be1)
2023-09-10 19:14:09 +01:00
Rohan Garg
9fd2a458c5 iris: migrate preemption streamwout wa to WA infra
Fixes: db6c374 ('iris: disable preemption for 3DPRIMITIVE during streamout')
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25080>
(cherry picked from commit a57faf5037)
2023-09-06 18:53:28 +01:00
antonino
4c71800efb drirc: enable vk_wsi_force_swapchain_to_current_extent for "Serious Sam Fusion"
This game handles swapchain size incorrecly and can crash because of
it.

Enable this driconf as a workaround.

Fixes: 6139493ae3 ("vulkan/wsi: return VK_SUBOPTIMAL_KHR for sw/x11 on window resize")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24818>
(cherry picked from commit 1456cb9c0b)
2023-09-06 18:53:10 +01:00
antonino
f9dcca0525 drirc: enable vk_wsi_force_swapchain_to_current_extent for "The Talos Principle"
This game handles swapchain size incorrecly and can crash because of
it.

Enable this driconf as a workaround.

Fixes: 6139493ae3 ("vulkan/wsi: return VK_SUBOPTIMAL_KHR for sw/x11 on window resize")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24818>
(cherry picked from commit 142e317024)
2023-09-06 18:48:29 +01:00
antonino
c93039d661 vulkan/wsi: add vk_wsi_force_swapchain_to_current_extent driconf
Add a driconf to force the swapchain size to match
`VkSurfaceCapabilities2KHR::currentExtent` as a workaround for
misbehaved games

Fixes: 6139493ae3 ("vulkan/wsi: return VK_SUBOPTIMAL_KHR for sw/x11 on window resize")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24818>
(cherry picked from commit aa657247ce)
2023-09-06 18:43:04 +01:00
Lionel Landwerlin
e055eef4bc intel/nir: rerun lower_tex if it lowers something
nir_lower_tex can lower tg4 coords into tg4 offset which on DG2+ we
also need to lower into constant offsets.

Unfortunately the nir_lower_tex pass is not able to lower the
instructions it itself generates, so the easy fix for when
nir_lower_tex lowers tg4 coords into tg4 offsets is to rerun the pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9735
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25015>
(cherry picked from commit 10e75aae1b)
2023-09-06 18:42:13 +01:00
Leo Liu
0621ca00e9 radeonsi/vcn: fix the incorrect dt_size
Issue: For texture with multiple planes, the planes will point to the
same BO with the total size, so current vcn dt_size is incorrect.

(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[0]))
...
  buf = 0x5555558daa30,
  gpu_address = 0xffff800101000000,
  bo_size = 0xa2000,
...
}
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[1]))
...
  buf = 0x5555558daa30,
  gpu_address = 0xffff800101000000,
  bo_size = 0xa2000,
...
}

This is because: in function static struct si_texture *si_texture_create_object(),
   if (plane0) {
      /* The buffer is shared with the first plane. */
      resource->bo_size = plane0->buffer.bo_size;
      ...
      radeon_bo_reference(sscreen->ws, &resource->buf, plane0->buffer.buf);
      resource->gpu_address = plane0->buffer.gpu_address;
   }

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9728
Cc: mesa-stable

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25013>
(cherry picked from commit 7876a2f685)
2023-09-06 18:38:21 +01:00
Timur Kristóf
4ddc9267d6 ac/nir/ngg: Wait for attribute ring stores in mesh shaders.
Make sure that both per-vertex and per-primitive attribute
ring stores are finished before position or primitive export
instructions are executed.

This is necessary because we need to ensure that mesh shader
waves work correctly when they have either vertex-only or
primitive-only waves.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit 93b4f200de)
2023-09-06 18:37:48 +01:00
Timur Kristóf
b86865c0b7 ac/nir/ngg: Refactor mesh shader primitive export.
Cleanup the code that generates the two channels of the
primitive export instruction, and move storing the built-in
per-primitive outputs out to match how vertex attributes work.

Prepares the mesh shader lowering for a workaround that
affect export instructions.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit 0721784b78)
2023-09-06 18:34:58 +01:00
Timur Kristóf
b4954af896 ac/nir/ngg: Wait for attribute stores before VS/TES/GS pos0 export.
This is a HW bug workaround for some (all?) GFX11 chips.

On these chips, rasterization can start before the attribute ring
stores are finished, which can cause issues.
As a workaround, wait for attribute ring stores to finish
before doing the position export.

Mesh shaders will be taken care of in another commit.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit edd51655f0)
2023-09-06 18:29:56 +01:00
Timur Kristóf
06f66ff242 ac/nir: Slightly refactor how pos0 exports are added when missing.
Prepares for a workaround. Makes it possible for this function
to not emit the pos0 export at all so that it can be emitted
by a subsequent call to the function later.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit 9c096e4ace)
2023-09-06 18:26:18 +01:00
Timur Kristóf
a848887a99 ac/nir: Add done arg to ac_nir_export_position.
This prepares for a workaround where we won't need to add
the done flag to the last export in this function, because
it will be added in a subsequent call to the same function.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit 838d886d90)
2023-09-06 18:15:58 +01:00
David Rosca
ec388feab6 Revert "radeonsi/vcn: add an exception of field case for h264 decoding"
This change causes page faults when playing corrupted video from the
bugreport. The original issue have now been resolved in firmware.

This reverts commit bfce57c7a5.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9210

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24836>
(cherry picked from commit ee1132bd79)
2023-09-06 18:12:38 +01:00
Rohan Garg
6c6ac51d7c blorp: drop undefined macro
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 97d6ceaf04 ("intel: Remove GEN_IS_HASWELL macro")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25011>
(cherry picked from commit ca7ae1a53f)
2023-09-06 16:23:08 +01:00
Rohan Garg
0ccdcb1a38 crocus: fix GFX_VERx10 macro
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25011>
(cherry picked from commit 99a88ca4a2)
2023-09-06 16:23:07 +01:00
Lionel Landwerlin
2d35e451a9 anv: add missing ISL storage usage
ISL makes a bunch of decision on programming (MOCS,
RENDER_SURFACE_STATE values) based on this flag. It's important to set
it if we're going to use an image as storage.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23620>
(cherry picked from commit 34d5db0583)
2023-09-06 16:23:04 +01:00
Karol Herbst
56249130f4 rusticl/memory: do not verify pitch for IMAGE1D_BUFFER
Devices might report an image_pitch_alignment of 0 leading to a division
by 0 trap.

Fixes: 06daa03c5c ("rusticl: Implement spec for cl_khr_image2d_from_buffer")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24993>
(cherry picked from commit 5263802618)
2023-09-06 16:23:01 +01:00
Friedrich Vock
3c40cff398 radv/rt: Pre-initialize instance address
It's not disallowed by spec to load instance-related data in case of a
miss where no instance was ever visited. Such loads make no sense, so we
can return garbage, but it mustn't hang the GPU. Initialize the instance
addresses to the TLAS base to make sure we always have valid memory to load from.

Partially fixes GPU hangs in RTX Remix games.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24971>
(cherry picked from commit 728f6c0b70)
2023-09-06 16:22:59 +01:00
Rhys Perry
ec01ebd1f8 aco/spill: add all live-in to merge block spill candidates
Previously, only already spilled live-in or phis were added to the spill
candidates. Because of branch definitions, this might not be enough.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9722
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24949>
(cherry picked from commit c7bb2f7bb9)
2023-09-06 16:22:00 +01:00
Corentin Noël
3f60927f9d virgl: Do not expose EXT_texture_mirror_clamp when using a GLES host
The GL_MIRROR_CLAMP_EXT wrap parameter is never available in GLES.

This fixes the `spec@!opengl 1.1@texwrap 2d proj` piglit test when using a GLES
host.

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24935>
(cherry picked from commit 9c39ea796c)
2023-09-06 16:21:59 +01:00
Lionel Landwerlin
6a7023cd36 intel/fs: implement dynamic interpolation mode for dynamic persample shaders
There is no restriction for query per sample positions from the
interpolator when in non-per-sample dispatch mode. But apparently
that's not giving us the expected values for fragment shaders compiled
without per-sample dispatch knowledge (graphics pipeline libraries).

So when per-sample dispatch is dynamic and we're doing at_sample
interpolation, turn the interpolation back into at_offset at runtime
when we detect that the fragment shader is not run per sample.

Fixes a bunch of dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>
(cherry picked from commit 68027bd38e)
2023-09-06 16:21:49 +01:00
Lionel Landwerlin
fec6eacb99 intel/compiler: fix dynamic alpha-to-coverage handling
Got the wrong logic operation. Let's reuse the nicer NIR builder
helper.

Fixes a bunch of KHR-GL46.sample_variables.mask.rgba8.*.samples*.mask*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: fd7debc8bb ("intel/fs: make alpha_to_coverage a tristate")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9568
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>
(cherry picked from commit 9bf2a89127)
2023-09-06 16:21:42 +01:00
Lionel Landwerlin
ccdec6f92e intel/compiler: disable per-sample interpolation modes with non-per-sample dispatch
Fixes hangs in dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5644011f06 ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>
(cherry picked from commit d74c301026)
2023-09-06 16:19:35 +01:00
Rhys Perry
b5d66e01d8 aco/spill: skip p_branch in process_block
Fixes compilation of a Dead by Daylight shader.

fossil-db (gfx1100):
Totals from 58 (0.04% of 133461) affected shaders:
Instrs: 319824 -> 319421 (-0.13%); split: -0.13%, +0.00%
CodeSize: 1711260 -> 1708744 (-0.15%); split: -0.15%, +0.00%
SpillSGPRs: 2567 -> 2459 (-4.21%)
Latency: 3274930 -> 3274921 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 924106 -> 924105 (-0.00%); split: -0.00%, +0.00%
Copies: 41883 -> 41757 (-0.30%); split: -0.31%, +0.00%
Branches: 9144 -> 9146 (+0.02%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9599
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24896>
(cherry picked from commit cb096b85ff)
2023-09-06 16:17:15 +01:00
Timothy Arceri
ddf78b2469 util: add radeonsi workaround for Nowhere Patrol
Cc: mesa-stable

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24919>
(cherry picked from commit ddac37a8b3)
2023-09-06 16:17:10 +01:00
Paul Gofman
dfe1857e9d driconf: add a workaround for Rainbow Six Extraction
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24784>
(cherry picked from commit 3e66eba59e)
2023-09-06 16:17:07 +01:00
Kenneth Graunke
059ae38c11 iris: Check prog[] instead of uncompiled[] for BLORP state skipping
Huge thanks to Tapani Pälli for debugging this issue, figuring out
what was going wrong, proposing fixes, and walking me through where
things were going off the rails.

BLORP always disables tessellation and geometry shaders.  Our handling
tried to look at ice->shaders.uncompiled[] to determine whether the next
draw needed those shaders.  If not, we can leave BLORP's residual state
that disabled those stages in place, and skip looking at it.

Unfortunately, predicting the future is a bit fraught, in part due to
the uncompiled[] and prog[] arrays being slightly out of sync at times.

Consider the following case:

1. Draw with tessellation shaders in place

   => uncompiled[TES] and prog[TES] will both point at valid shaders.

2. Gallium calls pipe->bind_tes_state(NULL).

   => This makes uncompiled[TES] point at NULL, and flags
      IRIS_STAGE_DIRTY_UNCOMPILED_TES.

      Because iris_update_compiled_shaders() hasn't happened yet,
      uncompiled[TES] is NULL but prog[TES] has the stale TES from
      the previous draw still.

3. BLORP operations happen

   => BLORP sees uncompiled[TES] == NULL and decides that tessellation
      is off for the upcoming draws.  So it skips flagging tess state.

4. Gallium calls pipe->bind_tes_state(shader from step #1).

   => uncompiled[TES] points at the original shader.
      IRIS_STAGE_DIRTY_UNCOMPILED_TES gets flagged again.

5. Draw again

   => This calls iris_update_compiled_shaders(), which sees that
      a TES is bound, and calls iris_update_compiled_tes().  But
      because the same shader was bound as before, the program it
      comes up with is identical to the one already bound at
      ice->shaders.prog[TES].  So, it thinks it doesn't have to
      flag any tessellation state dirty because it was already
      set up for the last draw.

This random unbind and rebind between draws leads to a situation
where, at step #3, BLORP thinks it can skip flagging tessellation
state (nothing is bound), and at step #5, normal state handling
thinks it can skip flagging tessellation state (nothing changed
since last time).  So nobody does, and things break.

This unbind appears to be happening when st_release_variants()
decides it wants to free some shaders.  Then a rebind happens to
put back the actual shader for the draw.  So, it's not theoretical.

To fix this, we change BLORP to look at ice->shaders.prog[] rather
than uncompiled[].  This is equivalent to thinking about the previous
draw, rather than the next.  If the last draw had tessellation off,
then BLORP's disabling was a no-op, and the GPU is still in the same
state as the previous draw.  This is more reliable than predicting
the future.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8308
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9678
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24880>
(cherry picked from commit d693027a00)
2023-09-04 11:30:26 +01:00
Dmitry Baryshkov
117130bee7 tu: Pass real size of prime buffers to allocator
The msm driver reserves the actual DMABUF size in the memory map, while
TU can request smaller memory chunk to be allocated. This potentially
can lead to a situation when next allocation IOVA will be in the middle
of the address space which is reserved for the DMABUF. Pass the
`real_size' to TU allocator instead, so that kernel and userspace have
the same picture of memory allocations.

Cc: mesa-stable
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24861>
(cherry picked from commit 2fdcc00b01)
2023-09-04 11:30:25 +01:00
Sviatoslav Peleshko
8389e5daae intel/isl: Don't over-allocate CLEAR_COLOR size to use whole cache line
At the time this was added to fix some test failures. But it seems that
the failures were happening due to missing cache flushes, so
this extra space is no longer neccessary.

Fixes: 37b4eacc ("intel/isl: Resize clear color buffer to full cacheline")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24768>
(cherry picked from commit caa5c23e48)
2023-09-04 11:30:25 +01:00
Tapani Pälli
b15f54d091 mesa: fix some TexParameter and SamplerParameter cases
EXT extension was added without tests so these functions did
not work properly.

Fixes: 799710be88 ("mesa: Add EXT_texture_mirror_clamp_to_edge to extension table")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24845>
(cherry picked from commit d65fe6eff1)
2023-09-04 11:30:25 +01:00
Georg Lehmann
e4f5c03f4c aco: fix u2f16 with 32bit input
The vulkan spec says all conversions are correctly rounded, so if the input
is larger than the largest fp16 value, we need to return MAX_FLOAT/inf
instead of cutting off the msbs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24826>
(cherry picked from commit 6d949e18fd)
2023-09-04 11:30:25 +01:00
Rhys Perry
7c043d6868 aco: fix p_bpermute_gfx6 with input at non-zero byte
Same as the other bpermute pseudo instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>
(cherry picked from commit 85957dd6e5)
2023-09-04 11:30:25 +01:00
Mike Blumenkrantz
9dd3670658 zink: don't start multiple cache jobs for the same program
if there's already a cache job in flight then starting a second one
is illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24831>
(cherry picked from commit d5157356ce)
2023-09-04 11:30:25 +01:00
Daniel Schürmann
c1dae7d2ba nir/opt_move: fix handling of if-condition
By accident, this used the parent of the nir_src which is a nir_if
instead of the parent of the SSA value.

Totals from 10814 (8.10% of 133461) affected shaders: (GFX11)
Instrs: 21759185 -> 21757190 (-0.01%); split: -0.02%, +0.01%
CodeSize: 112320272 -> 112316008 (-0.00%); split: -0.02%, +0.01%
SpillSGPRs: 11220 -> 11212 (-0.07%)
SpillVGPRs: 911 -> 903 (-0.88%); split: -1.54%, +0.66%
Latency: 258334759 -> 258316073 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 31428650 -> 31426394 (-0.01%); split: -0.02%, +0.01%
VClause: 309119 -> 309090 (-0.01%); split: -0.01%, +0.01%
SClause: 657028 -> 657150 (+0.02%); split: -0.03%, +0.04%
Copies: 1434209 -> 1432420 (-0.12%); split: -0.28%, +0.15%
Branches: 481804 -> 481801 (-0.00%)
PreSGPRs: 829995 -> 829966 (-0.00%)
PreVGPRs: 758249 -> 758253 (+0.00%)

Fixes: 8a78706643 ('nir: refactor nir_opt_move')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24695>
(cherry picked from commit 7e246f7f2b)
2023-09-04 11:30:25 +01:00
Friedrich Vock
59a25fc82c nir/load_store_vectorize: Handle intrinsics with constant base
This includes nir_load_stack and nir_store_stack, which are vectorized
in nir_lower_shader_calls. If not adjusted, we end up loading from
the wrong base.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9596
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9587
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24789>
(cherry picked from commit a28ff7f240)
2023-09-04 11:30:24 +01:00
Marek Vasut
8d9aeaaac9 etnaviv: Fully replicate back stencil config
The blob replicates both the value mask as well as the stencil reference
of the back-facing stencil to the front-facing stencil. This fixes the
remaining failures in the following dEQPs:

   dEQP-GLES2.functional.fbo.render.*_stencil_index8

Fixes: c8ccd63911 ("etnaviv: Fix depth stencil ops on GC880/GC2000")
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4867>
(cherry picked from commit ef4cb2431d)
2023-09-04 11:30:24 +01:00
Chia-I Wu
81ad2e6186 ac/surface: limit RADEON_SURF_NO_TEXTURE to color surfaces
For z surfaces, flags.texture should be based on
RADEON_SURF_TC_COMPATIBLE_HTILE alone.  Otherwise, addrlib could pick a
_X/_T swizzle mode for a MSAA depth texture, which is said to be broken:

  When _X/_T swizzle mode was used for MSAA depth texture, TC will get zplane
  equation from wrong address within memory range a tile covered and use the
  garbage data for compressed Z reading which finally leads to corruption.

Fixes: de0885cdb8 ("amd/surface: add RADEON_SURF_NO_TEXTURE flag")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24767>
(cherry picked from commit e74c3dbb70)
2023-09-04 11:30:24 +01:00
Mike Blumenkrantz
23e098db63 zink: wait on async fence during ctx program removal
removed=true implies that no async jobs are outstanding

fixes #9580

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24811>
(cherry picked from commit ca987c0dfb)
2023-09-04 11:30:24 +01:00
Tatsuyuki Ishi
4e203b4070 radv/amdgpu: Do not pass in a BO handle when clearing PRT VA region.
This field is invalid to access for virtual BOs.

Fixes: a931d5a4a4 ("radv/winsys: clear the PRT VA range when destroying a virtual BO")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24805>
(cherry picked from commit 6c5512568b)
2023-09-04 11:30:24 +01:00
Samuel Pitoiset
fd6bcc5303 Revert "radv/amdgpu: skip adding per VM BOs for sparse during CS BO list build"
This reverts commit 51caece74c.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24774>
(cherry picked from commit e3fae01730)
2023-09-04 11:30:24 +01:00
Samuel Pitoiset
f141107afb Revert "radv/amdgpu: workaround a kernel bug when replacing sparse mappings"
This workaround was added temporarily but it can actually cause
stuttering in some games like Forza Horizon 5.

The kernel fix
(https://lists.freedesktop.org/archives/amd-gfx/2023-June/094648.html)
landed in some stable kernels (5.15.121+, 6.1.40+ and 6.4.5+). Sadly,
older stable kernels don't have it, so you might experiment random GPU
hangs in games that use sparse mapping. Please ensure your kernel is
up-to-date for the best experience.

This reverts commit 9b00867327.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9443
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24774>
(cherry picked from commit f67eb9ce07)
2023-09-04 11:30:24 +01:00