Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40318>
We sometimes use this with non-pointer keys.
This removes a footgun at the cost of a larger entry size on 32-bit.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40318>
We can drop the RT flush and PS Scoreboard stall if the "state cache perf
fix disabled" bit is set to 1. If the bit is set, RCC uses the sum of
Binding Table Pointer and Binding Table Index as the tag in the state cache
instead of just Binding Table Index.
On DX12 this is a performance win on all workloads we've tested.
On DX11 there are a number of performance regressions. We think this
is due to the fact that to avoid trashing the RCC, we need to remove
all but render targets from the binding table, meaning all shader
resource accesses have to go through the bindless HW heap. This leads
to additional register usage due to the need to push the base offset
of descriptor sets. Improvements in the compiler would likely mitigate
this.
This change introduces a DRIRC key we only turn on for DX12.
Also, platforms prior to DG2/LSC have a really small bindless heap that
leads to additional register usage, so this optimization is completely
disabled there.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10872
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10873
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14075
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>
This trivial change improves the readability of this header:
1. replace stray tabs with spaces
2. use a 3-space indent consistently across the header
3. minor clang-format fixes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40523>
_mesa_sha1_format has few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
Saves about 2k text size.
Before:
    text    data   bss      dec     hex filename
24817485  456164 27080 25300729 1820ef9 ./lib64/libvulkan_intel.so
After:
    text    data   bss      dec     hex filename
24815381  456164 27080 25298625 18206c1 ./lib64/libvulkan_intel.so
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40230>
The game uses glGetUniformLocation() but specifies the wrong program id
for one of the uniforms. The shader programs both contain shaders with
a uniform of the same name but because they have a different number of
uniforms the returned uniform location does not match the expected uniform.
Here we add a workaround to force the uniform with the wrong get location
params to always have the location 0 so that it doesn't matter which
shader the application checks for the location.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14864
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>
Allows a uniform name to be passed to force_explicit_uniform_loc_zero
allowing us to set that uniform to an explicit location of zero.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>
This adds some reusable math to set up YCbCr to RGB color transforms. It
covers ITU BT.601, ITU BT.709 and ITU BT.2020 YUV <-> RGB conversion, as
well as "narrow" and "full" range.
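For reference, the underlying math can be sketched as below. This is only an illustration, not the code added by this commit: the enum and function names are hypothetical, the sample is assumed to be 8-bit, and the luma coefficients are the standard (Kr, Kb) pairs from the three ITU-R recommendations.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the API added by this commit. */
enum ycbcr_standard { YCBCR_BT601, YCBCR_BT709, YCBCR_BT2020 };

/* Convert one 8-bit Y'CbCr sample to normalized R'G'B' (nominally 0..1). */
static void
ycbcr8_to_rgb(enum ycbcr_standard std, bool full_range,
              int y, int cb, int cr, double rgb[3])
{
   /* Luma coefficients (Kr, Kb) for each standard. */
   static const double k[][2] = {
      [YCBCR_BT601]  = { 0.299,  0.114  },
      [YCBCR_BT709]  = { 0.2126, 0.0722 },
      [YCBCR_BT2020] = { 0.2627, 0.0593 },
   };
   double kr = k[std][0], kb = k[std][1], kg = 1.0 - kr - kb;

   /* Range expansion: narrow is Y' in 16..235 and chroma in 16..240. */
   double yn, cbn, crn;
   if (full_range) {
      yn = y / 255.0;
      cbn = (cb - 128) / 255.0;
      crn = (cr - 128) / 255.0;
   } else {
      yn = (y - 16) / 219.0;
      cbn = (cb - 128) / 224.0;
      crn = (cr - 128) / 224.0;
   }

   /* Invert Y' = Kr*R' + Kg*G' + Kb*B' with the chroma definitions. */
   rgb[0] = yn + 2.0 * (1.0 - kr) * crn;
   rgb[2] = yn + 2.0 * (1.0 - kb) * cbn;
   rgb[1] = (yn - kr * rgb[0] - kb * rgb[2]) / kg;
}
```

With narrow range, the sample (235, 128, 128) maps to white and (16, 128, 128) maps to black for all three standards, which is a convenient sanity check.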
This code is intended to replace three different implementations of
YUV-transforms already present in Mesa, all of them with different
parameterizations and differences in data-formats. These implementations
are: nir_lower_tex.c, vk_nir_convert_ycbcr.c and vl_csc.c.
None of the existing implementations seems to fully cover all of the needs
of the others. The one that comes the closest is the one in vl_csc.c, but
it has a few issues:
1. It doesn't differentiate between per-channel bit-sizes, which the
Vulkan code needs.
2. It uses enums from p_video_enums.h in Gallium to parameterize the
behavior.
3. It's written in a monolithic way, handling up to two
range-remappings, which the other implementations don't need.
While it could be possible to untangle all of that, that would likely
end up being more or less a new implementation anyway. So let's instead
try to pick the best of all three implementations into one new one,
that's broken into smaller pieces that can be assembled into any of
the three.
In addition, this implementation has a bunch of unit-tests, to make sure
we don't introduce subtle breakages down the line.
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40175>
Having deleted_key be a reserved key probably wasn't useful, because it's
not a constant: it's (uintptr_t)&deleted_key_value.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40196>
Add a method for determining which MESA_MAP_ACCESS_* flag would be
appropriate for a given OwnedDescriptor, based on both access flags and
write seals (since access mode can be RDWR despite the seals!)
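The decision described above can be sketched in C as a pure function over the values that fcntl(F_GETFL) and fcntl(F_GET_SEALS) return. This is only an illustration under stated assumptions: the enum and function names are hypothetical stand-ins for the MESA_MAP_ACCESS_* flags and the new method, and treating F_SEAL_FUTURE_WRITE the same as F_SEAL_WRITE is an assumption of this sketch.

```c
#include <fcntl.h>

/* Linux seal bits; defined here in case <fcntl.h> does not expose them. */
#ifndef F_SEAL_WRITE
#define F_SEAL_WRITE 0x0008
#endif
#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010
#endif

/* Hypothetical stand-ins for the MESA_MAP_ACCESS_* flags. */
enum map_access {
   MAP_ACCESS_NONE  = 0,
   MAP_ACCESS_READ  = 1 << 0,
   MAP_ACCESS_WRITE = 1 << 1,
   MAP_ACCESS_RW    = MAP_ACCESS_READ | MAP_ACCESS_WRITE,
};

/* open_flags: result of fcntl(fd, F_GETFL)
 * seals:      result of fcntl(fd, F_GET_SEALS), or 0 if unsupported
 */
static unsigned
map_access_for(int open_flags, int seals)
{
   unsigned access;

   switch (open_flags & O_ACCMODE) {
   case O_RDONLY: access = MAP_ACCESS_READ;  break;
   case O_WRONLY: access = MAP_ACCESS_WRITE; break;
   case O_RDWR:   access = MAP_ACCESS_RW;    break;
   default:       access = MAP_ACCESS_NONE;  break;
   }

   /* A write seal forbids writable mappings even if the fd is O_RDWR. */
   if (seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))
      access &= ~(unsigned)MAP_ACCESS_WRITE;

   return access;
}
```

A memfd opened O_RDWR but sealed with F_SEAL_WRITE therefore resolves to read-only, which is the case the access mode alone would get wrong.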
This is useful for virtgpu implementations when mapping incoming buffers
from host software into the guest's address space. Previously Rutabaga
relied on basic heuristics like "SHM is always R/O", but with upcoming
extra protocols to be forwarded over virtgpu channels (like PipeWire)
those assumptions no longer hold true.
Signed-off-by: Val Packett <val@packett.cool>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40132>
fp16 has quite a limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.
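As an illustration of the expected behavior (not the NIR code itself), the sketch below shows what an integer beyond the fp16 range should convert to under each directed rounding mode, following IEEE 754 overflow rules. The names are hypothetical; 65504 is the largest finite fp16 value.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

#define HALF_MAX 65504.0f /* largest finite fp16 value */

/* Hypothetical names for the rtz/rtp/rtn rounding modes. */
enum round_mode { ROUND_RTZ, ROUND_RTP, ROUND_RTN };

/* Expected result of converting an integer with |v| > HALF_MAX to fp16.
 * Directed rounding only overflows to infinity when rounding away from
 * zero, so most combinations must saturate to +/-HALF_MAX instead.
 */
static float
half_overflow_result(int64_t v, enum round_mode mode)
{
   bool neg = v < 0;

   switch (mode) {
   case ROUND_RTP: return neg ? -HALF_MAX : INFINITY;
   case ROUND_RTN: return neg ? -INFINITY : HALF_MAX;
   default:        return neg ? -HALF_MAX : HALF_MAX; /* ROUND_RTZ */
   }
}
```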
Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
A fragment shader in Creed: Rise to Glory has `depth_less` set,
but comparing the gl_FragDepth it writes with gl_FragCoord.z
shows that in some cases the depth is greater.
This fixes graphical artifacts on the character's skin.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40075>
Add support for the use of vertex input registers as additional general
purpose registers which previously was restricted to temporary
registers. Use of vertex input registers as additional general purpose
registers is not available for fragment shaders.
Vertex input registers are similar to temporary registers. The only
difference is that vertex input registers can contain pre-initialised
data when the shader starts.
By default, the number of vertex input registers used for register
allocation is the number of vertex input registers used for their
pre-initialised data rounded up to the nearest multiple of 4, as vertex
input registers are allocated in blocks of 4.
If PCO_DEBUG=alloc_extra_vtxins is used, a minimum of 12 vertex input
registers is available for register allocation.
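The sizing rule above amounts to the following arithmetic. This is a minimal sketch with a hypothetical function name, and it assumes that "a minimum of 12" means the rounded-up count is raised to 12 when it would otherwise be smaller.

```c
#include <stdbool.h>

/* Hypothetical helper: number of vertex input registers handed to the
 * register allocator.
 */
static unsigned
num_alloc_vtxins(unsigned preinit_vtxins, bool alloc_extra_vtxins)
{
   /* Vertex inputs are allocated in blocks of 4, so round up. */
   unsigned n = (preinit_vtxins + 3u) & ~3u;

   /* Assumed reading of PCO_DEBUG=alloc_extra_vtxins: at least 12. */
   if (alloc_extra_vtxins && n < 12u)
      n = 12u;

   return n;
}
```

For example, a shader with 5 pre-initialised vertex inputs gets 8 registers by default and 12 with the debug option set.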
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39886>
Without this, GDB will say PIPE_FORMAT_ZS_START instead of
PIPE_FORMAT_S8_UINT. This makes pipe_format more debuggable.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
This helper is generally useful when trying to prettyprint a 32-bit value, so
make it available to the rest of the tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
The issue that caused us to add a switch to disable (Xe2) DRM modifiers
in 2418c91537 is fixed in GTK 4.20.3,
so we can enable the modifiers with this and newer GTK releases.
GTK https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/9164:
b2a42d5a6e Revert "vulkan: Wait for device to be idle before
create/recreating swapchain"
270735a151 vulkan: Rework swapchain present implementation
The hex values represent the GTK version range: [4.0.0, 4.20.2] for
VK_MAKE_VERSION(), refer to:
f493f5c88d
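For readers decoding the hex values by hand, VK_MAKE_VERSION() packs the major version at bit 22, the minor version at bit 12 and the patch in the low bits. The helper below is a stand-alone sketch with a hypothetical name that reproduces that layout without pulling in the Vulkan headers.

```c
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION(): major << 22 | minor << 12 | patch.
 * Hypothetical stand-alone helper so the sketch needs no Vulkan headers.
 */
static uint32_t
make_version(uint32_t major, uint32_t minor, uint32_t patch)
{
   return (major << 22) | (minor << 12) | patch;
}
```

Under that layout, 4.0.0 encodes as 0x1000000 and 4.20.2 as 0x1014002.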
Cc: mesa-stable
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39223>
This just bit me. Add an assert to catch the next person who doesn't
read the function signature, tries to extract 64 bits, and wonders
why things are silently broken.
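The kind of guard meant here can be sketched as follows. The helper is hypothetical (the message does not name the function being changed); it only shows why a bit-extract routine with a 32-bit return type should reject wider requests instead of silently truncating them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract `count` bits starting at bit `start` from
 * an array of 32-bit words. The caller must ensure the range is in bounds.
 */
static uint32_t
extract_bits(const uint32_t *words, unsigned start, unsigned count)
{
   /* The return type is 32 bits wide, so wider requests can't be honoured. */
   assert(count >= 1 && count <= 32);

   unsigned word = start / 32, shift = start % 32;
   uint64_t v = words[word] >> shift;

   /* Pull in the low bits of the next word if the range straddles two. */
   if (shift + count > 32)
      v |= (uint64_t)words[word + 1] << (32 - shift);

   return (uint32_t)(v & ((1ull << count) - 1));
}
```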
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39892>
This adopts the device internal app workaround layer from radv.
The layer allows fixing up game input in the layer instead of
adding workarounds within the driver.
Initially this only includes the workaround for Metro Exodus as
I have verified that it fixes a crash on NVK. Follow up commits
can add the other relevant workarounds when the fixes are verified
to be needed for NVK.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870>
Linux v6.4+ provides a syscall called riscv_hwprobe that can detect
multiple characteristics of the running CPU on RISC-V platforms.
Implement a real check_os_riscv_support() with it and support the
extensions detectable by it on Linux v6.5.
When the toolchain has no riscv_hwprobe definition or the kernel at
runtime does not support it, the fallback code still assumes GC.
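The probe-with-fallback flow described above could look roughly like this. It is a sketch only, not the check_os_riscv_support() implementation: the CPU_RISCV_* bits and the function name are hypothetical, and only a few of the extensions reported through RISCV_HWPROBE_KEY_IMA_EXT_0 are checked.

```c
#include <stdint.h>

#if defined(__linux__) && defined(__riscv) && defined(__has_include)
#if __has_include(<asm/hwprobe.h>)
#include <asm/hwprobe.h>
#include <sys/syscall.h>
#include <unistd.h>
#define HAVE_RISCV_HWPROBE 1
#endif
#endif

/* Hypothetical feature bits for this sketch. */
#define CPU_RISCV_FD (1u << 0)
#define CPU_RISCV_C  (1u << 1)
#define CPU_RISCV_V  (1u << 2)

static uint32_t
riscv_detect_extensions(void)
{
#if defined(HAVE_RISCV_HWPROBE) && defined(__NR_riscv_hwprobe)
   struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 };

   /* cpusetsize == 0 and cpus == NULL query all online CPUs. The kernel
    * sets the key to -1 if it does not know it.
    */
   if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) == 0 &&
       pair.key >= 0) {
      uint32_t caps = 0;
      if (pair.value & RISCV_HWPROBE_IMA_FD)
         caps |= CPU_RISCV_FD;
      if (pair.value & RISCV_HWPROBE_IMA_C)
         caps |= CPU_RISCV_C;
#ifdef RISCV_HWPROBE_IMA_V
      if (pair.value & RISCV_HWPROBE_IMA_V)
         caps |= CPU_RISCV_V;
#endif
      return caps;
   }
#endif
   /* No definition at build time or no syscall at runtime: assume GC. */
   return CPU_RISCV_FD | CPU_RISCV_C;
}
```

The compile-time guards cover a toolchain without riscv_hwprobe definitions, and the runtime check covers an older kernel, so both cases end up on the same GC fallback.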
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39154>