It's undefined to not export a position but some applications rely
on that. The position is always initialized to 0,0,0,1 everywhere else
if not exported.
Fixes KHR-GL46.shader_image_load_store.multiple-uniforms with Zink.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13470>
Hear that it matters for RGP. This is the most likely scenario where
we would hit this workaround, given the tooling for profiling on the
deck will set profile_peak as workaround for hangs.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13534>
RGP only supports GFX8+. RADV doesn't allow SQTT on < GFX8 and
RadeonSI only allows it on GFX9+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13451>
When the determinant that we use for calculating triangle area
is NaN, it's not possible to decide the facing of the triangle.
This can happen when a coordinate of one of the triangle's vertices
is INFINITY. It's better to just accept these triangles in the shader
and let the PA deal with them.
Let's do the same for +/- Infinity too.
Though we haven't seen this yet, it may be troublesome as well.
Fixes: 651a3da1b5Closes: #5470
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13299>
Traces with clock = 0 are totally useless due to RGP getting very
confused.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13301>
These also have a higher compressed block size.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>
Currently, we aren't checking if the modifier supports the extent of the image.
DCN only works with !64B && 128B on extents < 4K.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>
Make u_vector_init a wrapper to u_vector_init_pot. Let both take
(element_count, element_size) as parameters.
Motivated by eed0fc4caf ("vulkan/wsi/wayland: fix an invalid
u_vector_init call")
v2: rename u_vector_init_pot to u_vector_init_pow2
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>
DCC_IND_BLK is not hooked up for this to work in the kernel in any released version, and it's unsafe to do so even if it was because it doesn't check the modifiers.
There's no reason to change the legacy non-modifier path to be more performant at the expense of breaking backwards compatibility with older versions of Mesa.
Fixes: 0f6251b3 ("ac/surface: use DCC compatible with image stores for < 4K resolutions")
Closes: #5422
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13122>
We need to keep RADV and RadeonSI on the same page about this due to modifiers.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13153>
We don't have to use the special DCC settings for lower resolutions.
This will cause corruption if X and an windowed app use different Mesa
versions. The fix is to restart the X server. I expect to get false bug
reports due to this.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13013>
With NGG culling, the shaders are split into two parts:
the top part that computes just the position output,
and the bottom part which produces the other outputs.
To reduce redundancy between the two, I added some code
to reuse uniform variables between them. However, there is
an edge case I didn't think about: because of vertex repacking,
it is possible for the bottom part to process a different vertex.
Therefore it can take a different divergent code path
(though it must still take the same uniform code path).
Due to this, when a uniform value comes from divergent control
flow, this may be undefined in the bottom part.
This commit stops reusing uniform variables from
divergent control flow, to fix issues that arise from this.
Fossil DB stats on Sienna Cichlid with NGGC on:
Totals from 1723 (1.34% of 128647) affected shaders:
VGPRs: 89312 -> 89184 (-0.14%); split: -0.15%, +0.01%
SpillSGPRs: 4575 -> 120 (-97.38%)
CodeSize: 10846424 -> 10873836 (+0.25%); split: -0.68%, +0.93%
MaxWaves: 34582 -> 34602 (+0.06%); split: +0.06%, -0.01%
Instrs: 2124471 -> 2128835 (+0.21%); split: -0.51%, +0.72%
Latency: 7274569 -> 7293899 (+0.27%); split: -0.22%, +0.48%
InvThroughput: 1637130 -> 1635490 (-0.10%); split: -0.17%, +0.07%
VClause: 25141 -> 25414 (+1.09%); split: -0.02%, +1.10%
SClause: 56367 -> 59503 (+5.56%); split: -1.36%, +6.93%
Copies: 230704 -> 219313 (-4.94%); split: -5.49%, +0.55%
Branches: 72781 -> 72681 (-0.14%); split: -0.21%, +0.07%
PreSGPRs: 118766 -> 100176 (-15.65%); split: -15.70%, +0.05%
PreVGPRs: 76876 -> 76833 (-0.06%)
Fixes: 0bb543bb60
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13001>
match_mask checks the intrinsic type and decides whether it's
per-patch or not. VS don't have per-patch outputs,
so this causes wrong behaviour there.
Found using the GCC undefined behavior sanitizer.
Fixes the following error:
runtime error:
shift exponent 18446744073709551584 is too large
for 64-bit type 'long unsigned int'
Closes: #5319
Fixes: bf966d1c1d
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>
The byte-permute instruction v_perm_b32 is not exposed by older
LLVM releases (only available on LLVM 13 and later), therefore a new
sequence is needed which we can use with these LLVM versions too.
The prefix sum is replaced by two alternatives:
1. For GPUs that support v_dot, we shift 0x01 to the wanted byte
positions and then use v_dot to sum the results.
2. For older GPUs (Navi 10), we simply shift out the unwanted bytes
and use v_sad_u8 to produce the sum.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>
Helper function to check if a modifier supports DCC image stores.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12862>
Fills the "Wave mode" in "Pipelines" for GPUs that supports Wave32.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>
Fills the "Scatch Mem" with "Yes/No" in "Pipelines", this requires
instruction timing to be enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>
This is the first part of a refactor to make vertex compaction optional.
Additionally, it may yield a very small benefit to allocate the PC
space sligtly sooner.
Fossil DB stats on Sienna Cichlid with NGGC on:
Totals from 58239 (45.27% of 128647) affected shaders:
CodeSize: 160502348 -> 160502340 (-0.00%)
Instrs: 30722664 -> 30722662 (-0.00%)
Latency: 137627419 -> 137782218 (+0.11%); split: -0.00%, +0.11%
InvThroughput: 21698587 -> 21699068 (+0.00%); split: -0.00%, +0.00%
Copies: 3288263 -> 3288261 (-0.00%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>
Previously, indices 0, 2, 4 were used.
This worked, but it was somewhat unintuitive.
This commit changes it to use indices 0, 1, 2 instead, which
makes the code easier to understand.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12511>