Saves about 2k text size.
Before:
    text    data   bss      dec     hex filename
24817485  456164 27080 25300729 1820ef9 ./lib64/libvulkan_intel.so
After:
    text    data   bss      dec     hex filename
24815381  456164 27080 25298625 18206c1 ./lib64/libvulkan_intel.so
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40230>
The game uses glGetUniformLocation() but specifies the wrong program id
for one of the uniforms. The shader programs both contain shaders with
a uniform of the same name but because they have a different number of
uniforms the returned uniform location does not match the expected uniform.
Here we add a workaround to force the uniform with the wrong get location
params to always have the location 0 so that it doesn't matter which
shader the application checks for the location.
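
As a rough illustration of why pinning the uniform to location 0 helps,
here is a toy model in C. It is only a sketch: toy_program and
toy_uniform_location() are hypothetical names, and the "linker" simply
numbers uniforms in declaration order, which is far simpler than what a
real GL linker does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a linked program: just an ordered list of uniform names. */
struct toy_program {
   const char *const *uniforms;
   size_t num_uniforms;
};

/*
 * Return the location the toy linker hands out for `name`, or -1 if the
 * program has no such uniform. If `forced_zero` names a uniform that the
 * program contains, that uniform is pinned to location 0 and the others
 * are numbered from 1 in declaration order.
 */
static int
toy_uniform_location(const struct toy_program *prog, const char *name,
                     const char *forced_zero)
{
   assert(prog != NULL && name != NULL);

   bool has_forced = false;
   if (forced_zero != NULL) {
      for (size_t i = 0; i < prog->num_uniforms; i++) {
         if (strcmp(prog->uniforms[i], forced_zero) == 0)
            has_forced = true;
      }
   }

   int next = 0;
   if (has_forced) {
      if (strcmp(name, forced_zero) == 0)
         return 0;
      next = 1; /* location 0 is taken by the pinned uniform */
   }

   for (size_t i = 0; i < prog->num_uniforms; i++) {
      const char *u = prog->uniforms[i];
      if (has_forced && strcmp(u, forced_zero) == 0)
         continue;
      if (strcmp(u, name) == 0)
         return next;
      next++;
   }
   return -1;
}
```

With two programs that declare a different number of uniforms before the
shared one, the unpinned locations differ, while the pinned location is 0
in both, so querying the wrong program id no longer matters.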
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14864
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>
Allows a uniform name to be passed to force_explicit_uniform_loc_zero
allowing us to set that uniform to an explicit location of zero.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>
This adds some reusable math to set up YCbCr to RGB color transforms. It
covers ITU BT.601, ITU BT.709 and ITU BT.2020 YUV <-> RGB conversion, as
well as "narrow" and "full" range.
This code is intended to replace three different implementations of
YUV-transforms already present in Mesa, all of them with different
parameterizations and differences in data-formats. These implementations
are: nir_lower_tex.c, vk_nir_convert_ycbcr.c and vl_csc.c.
None of the existing implementations seems to fully cover all of the needs
of the others. The one that comes the closest is the one in vl_csc.c, but
it has a few issues:
1. It doesn't differentiate between per-channel bit-sizes, which the
Vulkan code needs.
2. It uses enums from p_video_enums.h in Gallium to parameterize the
behavior.
3. It's written in a monolithic way, handling up to two
range-remappings, which the other implementations don't need.
While it could be possible to untangle all of that, that would likely
end up being more or less a new implementation anyway. So let's instead
try to pick the best of all three implementations into one new one
that's broken into smaller pieces that can be assembled into any of
the three.
In addition, this implementation has a bunch of unit-tests, to make sure
we don't introduce subtle breakages down the line.
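
For readers unfamiliar with the math involved, the following is a minimal
sketch of an 8-bit YCbCr to RGB conversion split into a range-expansion
step and a model-conversion step. It is not the API added by this commit:
ycbcr_coeffs and ycbcr8_to_rgb() are hypothetical names, only 8-bit
channels are handled, and the Kr/Kb values are the standard luma weights
for each ITU recommendation.

```c
#include <assert.h>
#include <stdbool.h>

/* Luma weights for red and blue; green follows as 1 - kr - kb. */
struct ycbcr_coeffs {
   float kr, kb;
};

static const struct ycbcr_coeffs BT601  = { 0.299f,  0.114f  };
static const struct ycbcr_coeffs BT709  = { 0.2126f, 0.0722f };
static const struct ycbcr_coeffs BT2020 = { 0.2627f, 0.0593f };

/* Convert one 8-bit YCbCr sample to (unclamped) normalized RGB. */
static void
ycbcr8_to_rgb(struct ycbcr_coeffs k, bool narrow_range,
              unsigned y, unsigned cb, unsigned cr, float rgb[3])
{
   assert(y <= 255 && cb <= 255 && cr <= 255);

   /* Step 1: range expansion. Luma ends up in roughly [0, 1] and
    * chroma in roughly [-0.5, 0.5].
    */
   float yf, cbf, crf;
   if (narrow_range) {
      yf  = ((float)y  - 16.0f)  / 219.0f;
      cbf = ((float)cb - 128.0f) / 224.0f;
      crf = ((float)cr - 128.0f) / 224.0f;
   } else {
      yf  = (float)y / 255.0f;
      cbf = ((float)cb - 128.0f) / 255.0f;
      crf = ((float)cr - 128.0f) / 255.0f;
   }

   /* Step 2: model conversion derived from the luma weights. */
   float kg = 1.0f - k.kr - k.kb;
   rgb[0] = yf + 2.0f * (1.0f - k.kr) * crf;
   rgb[2] = yf + 2.0f * (1.0f - k.kb) * cbf;
   rgb[1] = yf - (2.0f * k.kb * (1.0f - k.kb) / kg) * cbf
               - (2.0f * k.kr * (1.0f - k.kr) / kg) * crf;
}
```

Keeping the two steps separate is what lets the same model conversion be
reused with different ranges and bit depths; the sketch hard-codes 8 bits
only to stay short.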
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40175>
Having deleted_key be a reserved key probably wasn't useful, because it's
not a constant: it's (uintptr_t)&deleted_key_value.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40196>
Add a method for determining which MESA_MAP_ACCESS_* flag would be
appropriate for a given OwnedDescriptor, based on both access flags and
write seals (since access mode can be RDWR despite the seals!)
This is useful for virtgpu implementations when mapping incoming buffers
from host software into the guest's address space. Previously Rutabaga
relied on basic heuristics like "SHM is always R/O", but with upcoming
extra protocols to be forwarded over virtgpu channels (like PipeWire)
those assumptions no longer hold true.
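
The decision described above can be sketched in C as follows. This is
only an illustration, not the method added here (which operates on an
OwnedDescriptor): the toy_map_access enum and its values are hypothetical
stand-ins for the MESA_MAP_ACCESS_* flags, and the caller is assumed to
have already obtained the open flags (for example via fcntl(F_GETFL)) and
whether a write seal is present (for example via fcntl(F_GET_SEALS) on
Linux).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical stand-ins for MESA_MAP_ACCESS_*; values are illustrative. */
enum toy_map_access {
   TOY_MAP_ACCESS_READ  = 1 << 0,
   TOY_MAP_ACCESS_WRITE = 1 << 1,
};

/* Pick the widest mapping access that both the open mode and seals allow. */
static unsigned
toy_map_access_for(int open_flags, bool write_sealed)
{
   unsigned access = 0;

   switch (open_flags & O_ACCMODE) {
   case O_RDONLY:
      access = TOY_MAP_ACCESS_READ;
      break;
   case O_WRONLY:
      access = TOY_MAP_ACCESS_WRITE;
      break;
   case O_RDWR:
      access = TOY_MAP_ACCESS_READ | TOY_MAP_ACCESS_WRITE;
      break;
   default:
      assert(!"unexpected access mode");
      break;
   }

   /* A write seal wins over the access mode: an O_RDWR descriptor that
    * is write-sealed can only be mapped read-only.
    */
   if (write_sealed)
      access &= ~(unsigned)TOY_MAP_ACCESS_WRITE;

   return access;
}
```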
Signed-off-by: Val Packett <val@packett.cool>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40132>
fp16 has quite the limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.
Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.
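
To make the expected behavior concrete: the largest finite fp16 value is
65504, and IEEE 754 says what an overflowing conversion must produce under
each directed rounding mode. The sketch below encodes just that rule. It
is not the NIR lowering from this commit, and the toy_round_mode enum and
function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

#define FP16_MAX 65504.0f /* largest finite half-precision value */

enum toy_round_mode {
   TOY_RTZ, /* round toward zero */
   TOY_RTP, /* round toward positive infinity */
   TOY_RTN, /* round toward negative infinity */
};

/* True if the integer's magnitude exceeds the largest finite fp16. */
static bool
int_overflows_fp16(int64_t value)
{
   return value > 65504 || value < -65504;
}

/*
 * Expected fp16 result (returned as float) for an integer that overflows
 * fp16. Directed rounding only yields an infinity when rounding away from
 * zero in that direction; otherwise the result saturates to +/-65504.
 */
static float
fp16_overflow_result(int64_t value, enum toy_round_mode mode)
{
   assert(int_overflows_fp16(value));

   bool negative = value < 0;
   switch (mode) {
   case TOY_RTZ:
      return negative ? -FP16_MAX : FP16_MAX;
   case TOY_RTP:
      return negative ? -FP16_MAX : INFINITY;
   case TOY_RTN:
      return negative ? -INFINITY : FP16_MAX;
   }
   return 0.0f; /* not reached for valid modes */
}
```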
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
A fragment shader in Creed: Rise to Glory has `depth_less` set;
however, comparing the gl_FragDepth written with gl_FragCoord.z
shows that in some cases the depth is greater.
This fixes graphical artifacts on the character's skin.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40075>
Add support for the use of vertex input registers as additional general
purpose registers which previously was restricted to temporary
registers. Use of vertex input registers as additional general purpose
registers is not available for fragment shaders.
Vertex input registers are similar to temporary registers. The only
difference is that vertex input registers can contain pre-initialised
data when the shader starts.
By default, the number of vertex input registers used for register
allocation is the number of vertex input registers used for their
pre-initialised data rounded up to the nearest multiple of 4, as vertex
input registers are allocated in blocks of 4.
If PCO_DEBUG=alloc_extra_vtxins is used, a minimum of 12 vertex input
registers is available for register allocation.
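
The sizing rule above amounts to the following arithmetic. This is a
sketch under assumptions: the function and macro names are hypothetical,
and it assumes the debug option acts as a lower bound applied after the
round-up to a block of 4.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_VTXIN_BLOCK_SIZE 4  /* vertex inputs are allocated in blocks of 4 */
#define TOY_VTXIN_DEBUG_MIN  12 /* floor used with alloc_extra_vtxins */

/* Number of vertex input registers offered to the register allocator. */
static unsigned
toy_vtxin_regs_for_alloc(unsigned preinit_regs, bool alloc_extra_vtxins)
{
   /* Round up to the next multiple of the block size. */
   unsigned n = (preinit_regs + TOY_VTXIN_BLOCK_SIZE - 1) /
                TOY_VTXIN_BLOCK_SIZE * TOY_VTXIN_BLOCK_SIZE;

   if (alloc_extra_vtxins && n < TOY_VTXIN_DEBUG_MIN)
      n = TOY_VTXIN_DEBUG_MIN;

   assert(n % TOY_VTXIN_BLOCK_SIZE == 0);
   return n;
}
```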
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39886>
Without this, GDB will say PIPE_FORMAT_ZS_START instead of
PIPE_FORMAT_S8_UINT. This makes pipe_format more debuggable.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
This helper is generally useful when trying to prettyprint a 32-bit value, so
make it available to the rest of the tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
The issue that caused us to add a switch to disable (Xe2) DRM modifiers
in 2418c91537 is fixed in GTK 4.20.3,
so we can enable the modifiers with this and newer GTK releases.
GTK https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/9164:
b2a42d5a6e Revert "vulkan: Wait for device to be idle before
create/recreating swapchain"
270735a151 vulkan: Rework swapchain present implementation
The hex values encode the GTK version range [4.0.0, 4.20.2] as packed by
VK_MAKE_VERSION(); refer to:
f493f5c88d
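
For reference, VK_MAKE_VERSION() packs the major version into bits 22 and
up, the minor version into bits 12..21 and the patch into bits 0..11. The
sketch below re-creates that packing with a local macro so it compiles
without the Vulkan headers; TOY_MAKE_VERSION and the range-check helper
are hypothetical names, not the code from this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as Vulkan's VK_MAKE_VERSION(major, minor, patch). */
#define TOY_MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | \
    ((uint32_t)(patch)))

/* True if the packed GTK version lies in [4.0.0, 4.20.2]. */
static bool
toy_gtk_version_in_affected_range(uint32_t version)
{
   const uint32_t first = TOY_MAKE_VERSION(4, 0, 0);  /* 0x01000000 */
   const uint32_t last  = TOY_MAKE_VERSION(4, 20, 2); /* 0x01014002 */

   assert(first <= last);
   return version >= first && version <= last;
}
```

Because the fields are packed most-significant first, a plain integer
comparison orders versions correctly, which is why a pair of hex bounds
is enough to describe the range.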
Cc: mesa-stable
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39223>
This just bit me. Add an assert to catch the next person who doesn't
read the function signature, tries to extract 64 bits and wonders
why things are silently broken.
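
The kind of guard meant here looks roughly like the sketch below. The
helper is hypothetical (toy_extract_bits() is not the function touched by
this commit); it only illustrates a bit-extraction routine whose 32-bit
return type silently truncates wider requests unless an assert catches
them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract `count` bits starting at bit `start` from an array of 32-bit
 * words. The result is 32 bits wide, so wider reads must be split by the
 * caller; the assert turns silent truncation into a loud failure.
 */
static uint32_t
toy_extract_bits(const uint32_t *words, unsigned start, unsigned count)
{
   assert(count >= 1 && count <= 32);

   unsigned word = start / 32;
   unsigned shift = start % 32;

   /* Build a 64-bit window so a field may straddle two words. */
   uint64_t window = words[word];
   if (shift + count > 32)
      window |= (uint64_t)words[word + 1] << 32;

   uint64_t mask = (UINT64_C(1) << count) - 1;
   return (uint32_t)((window >> shift) & mask);
}
```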
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39892>
This adopts the device internal app workaround layer from radv.
The layer allows fixing up game input in the layer instead of
adding workarounds within the driver.
Initially this only includes the workaround for Metro Exodus, as
I have verified that it fixes a crash on NVK. Follow-up commits
can add the other relevant workarounds when the fixes are verified
to be needed for NVK.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870>
Linux v6.4+ provides a syscall called riscv_hwprobe that can detect
multiple characteristics of the running CPU on RISC-V platforms.
Implement a real check_os_riscv_support() with it and support the
extensions it can detect on Linux v6.5.
When the toolchain has no riscv_hwprobe definition or the kernel at
runtime does not support it, the fallback code still assumes GC.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39154>
Add a few RISC-V extensions that could be detected by the riscv_hwprobe
interface of Linux v6.5+, and add caps for FD/C extensions.
The real probe code will come in the following commit; for now, only a
stub that still assumes GC is added.
Adding these bits also changed the size of non-cache-related CPU
information from 4 dwords to 5, so the code hashing it for shader cache
in llvmpipe is also updated.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39154>
These are a handful of errors that pop up in UBSAN. A lot of them
depend on compiler-specific behavior such as zero-sized VLAs being
valid, while others are potentially bug-prone code such as
nullptr derefs.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39662>
Add a 'cond' argument to the _MESA_TRACE_SCOPE(),
_MESA_TRACE_SCOPE_NAME() and _MESA_TRACE_SCOPE_FLOW() macros, fix up
the MESA_TRACE_SCOPE(), MESA_TRACE_SCOPE_FLOW(), MESA_TRACE_FUNC() and
MESA_TRACE_FUNC_FLOW() macros depending on it and add the new
MESA_TRACE_SCOPE_IF(), MESA_TRACE_SCOPE_FLOW_IF(),
MESA_TRACE_FUNC_IF() and MESA_TRACE_FUNC_FLOW_IF() conditional macros.
The trace macros are now based on the conditional ones. Code gen stays
the same for all the current traces though, since compilers optimize
away the always-true condition. See the compiler explorer link.
Conditional CPU scope traces are meant to allow builds with either
Perfetto, Gpuvis or sysprof tracing enabled to filter traces at
run-time.
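
The layering of unconditional macros on top of conditional ones can be
sketched as follows. All names are hypothetical (these are not the
MESA_TRACE_* macros), the begin/end hooks just bump counters, and the
scope handling relies on the GCC/Clang cleanup attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters standing in for a real tracing backend. */
static int toy_trace_begins, toy_trace_ends;

static inline bool
toy_trace_begin_if(bool cond, const char *name)
{
   assert(name != NULL);
   if (cond)
      toy_trace_begins++;
   return cond; /* remember whether this scope is active */
}

static inline void
toy_trace_end(bool *active)
{
   if (*active)
      toy_trace_ends++;
}

/* Conditional scope: emits begin/end only when `cond` is true. */
#define TOY_TRACE_SCOPE_IF(cond, name)                                   \
   bool _toy_scope __attribute__((cleanup(toy_trace_end), unused)) =     \
      toy_trace_begin_if((cond), (name))

/* Unconditional scope, built on the conditional one. With a constant
 * `true` the compiler can fold the condition away.
 */
#define TOY_TRACE_SCOPE(name) TOY_TRACE_SCOPE_IF(true, name)

static void
toy_traced_work(bool tracing_enabled)
{
   TOY_TRACE_SCOPE_IF(tracing_enabled, "toy_traced_work");
   /* ... work to be traced ... */
}

static void
toy_always_traced_work(void)
{
   TOY_TRACE_SCOPE("toy_always_traced_work");
   /* ... work to be traced ... */
}
```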
Link: https://godbolt.org/z/886PKWEqf
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39407>
Reorder trace calls in _mesa_trace_scope_end() to match the order in
the _mesa_trace_scope_*begin*() functions: Perfetto, Gpuvis then
Sysprof.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39407>
This brings what ANV reports closer to what Iris reports, and is mostly dropping
redundancies.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>
This is for parity with what we do in the current GL shader-db path.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>
Traditionally we don't print these for GL and tooling doesn't know about this.
Just drop them. Note that neither AMD nor Intel uses the common GL print path
yet which is why this hadn't been hit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>
When a tracepoint is not queued, its memory is allocated on the stack
and no memory is allocated for variable-sized strings. So we shouldn't
copy or print them in the non-queued case.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39128>
I noticed we disable the prefetch only on Gfx12.5. But surely that
recommendation carries over to later platforms.
It seems other drivers just disable it all the time and only have an
option to force the prefetch. So implement the same thing here.
Blorp path is left untouched.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39424>
Mesa now has a statistics framework. This adds support for emitting
additional statistics about PowerVR shaders for the Rogue architecture.
Add support for emitting the following statistics: Code size, scratch
size, spill count, temp count, loop count, number of inst groups, number
of main inst groups, number of bitwise inst groups and number of control
inst groups.
Add support for new PCO_DEBUG_PRINT option "stats" to emit shader stats.
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39523>
MHW has a long-running shader compile step on first
launch that is significantly sped up by disabling
Link Time Optimization in the ANV driver.
Shader compile times with LTO disabled are 50% of
baseline measurements and the benchmark shows no
statistically significant change to performance
(tested on LNL-M OOB).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39544>