These are a handful of errors that pop up in UBSAN, a lot of them
depend on compiler-specific behavior such as zero-sized VLAs being
valid, while others plugged some potential bug prone code such as
nullptr derefs.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39662>
Add a 'cond' argument to the _MESA_TRACE_SCOPE(),
_MESA_TRACE_SCOPE_NAME() and _MESA_TRACE_SCOPE_FLOW() macros, fix up
the MESA_TRACE_SCOPE(), MESA_TRACE_SCOPE_FLOW(), MESA_TRACE_FUNC() and
MESA_TRACE_FUNC_FLOW() macros depending on it and add the new
MESA_TRACE_SCOPE_IF(), MESA_TRACE_SCOPE_FLOW_IF(),
MESA_TRACE_FUNC_IF() and MESA_TRACE_FUNC_FLOW_IF() conditional macros.
The trace macros are now based on the conditional ones. Code gen stays
the same for all the current traces though since compilers optimize
out the condition to always taken. See the compiler explorer link.
Conditional CPU scope traces are meant to allow builds with either
Perfetto, Gpuvis or sysprof tracing enabled to filter traces at
run-time.
Link: https://godbolt.org/z/886PKWEqf
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39407>
Reorder trace calls in _mesa_trace_scope_end() to match the order in
the _mesa_trace_scope_*begin*() functions: Perfetto, Gpuvis then
Sysprof.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39407>
This brings what ANV reports closer to what Iris reports, and is mostly dropping
redundancies.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>
This is for parity with what we do in the current GL shader-db path.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>
Traditionally we don't print these for GL and tooling doesn't know about this.
Just drop them. Note that neither AMD nor Intel uses the common GL print path
yet which is why this hadn't been hit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>
When tracepoint is not queued, the memory for it is allocated on stack
and no memory is allocated for variable-sized strings. So we shouldn't
copy or print them in non-queued case.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39128>
I noticed we disable the prefetch only on Gfx12.5. But surely that
recommendation carries on on later platforms.
It seems other drivers just disable it all the time and only have an
option to force the prefetch. So implementing the same thing here.
Blorp path is left untouched.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39424>
Mesa now has a statistics framework. This adds support for emitting
additional statistics about PowerVR shaders for the Rogue architecture.
Add support for emitting the following statistics: Code size, scratch
size, spill count, temp count, loop count, number of inst groups, number
of main inst groups, number of bitwise inst groups and number of control
inst groups.
Add support for new PCO_DEBUG_PRINT option "stats" to emit shader stats.
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39523>
MHW has a long-running shader compile step on first
launch that is significantly sped up by disabling
Link Time Optimization in the ANV driver.
Shader compile times with LTO disabled are 50% of
baseline measurements and the benchmark shows no
stastically significant change to performance
(tested on LNL-M OOB)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39544>
Add shared enum for blend overlap modes used by both
VK_EXT_blend_operation_advanced and GL_NV_blend_equation_advanced:
- UNCORRELATED: Default, no coverage assumptions
- CONJOINT: Maximal overlap, primitives are correlated
- DISJOINT: Minimal overlap, primitives don't overlap
This enum is shared between Vulkan's VkBlendOverlapEXT and OpenGL's
GL_BLEND_OVERLAP_NV parameter.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38929>
VK_EXT_blend_operation_advanced and GL_NV_blend_equation_advanced
defines additional blend operations beyond what OpenGL KHR_blend_equation_advanced
provides. Add these modes to pipe_advanced_blend_mode.
Also add a default case to gl_nir_lower_blend_equation_advanced.c
to handle unsupported modes gracefully.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38929>
Rename gl_advanced_blend_mode to pipe_advanced_blend_mode and move it
to src/util/blend.h so it can be shared between OpenGL and Vulkan
drivers.
This prepares for implementing VK_EXT_blend_operation_advanced by
providing a common enum for advanced blend modes across APIs.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38929>
Disable RHWO by default for singlesample draws and for MSAA
draws if a drirc key is set (avoid perf hit if not needed).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>
Currently panfrost has some bug at least on Midgard T-860, which causes
an assertion failure with these 16 bpc displayable framebuffer configs:
".../drivers/panfrost/pan_fb_preload.c:341: pan_preload_get_blend_shaders:
Assertion `b->work_reg_count <= 4' failed."
See discussion in:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588
Disable these rgb16 configs on panfrost by default for now,
as suggested by Eric Smith.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Suggested-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588>
These are useful for displaying very high color precision images with
more than 10 bpc color depth, and also more precision than what fp16
can do on a standard dynamic range (SDR) display, where fp16 for values
in the unorm 0.0 - 1.0 range is about equivalent to at most ~11 bpc
linear color depth. This is especially useful for and aimed at scientific
applications, e.g., neuroscience and other bio-medical research cases.
At least current generation AMD gpu's released during the last 10 years
and supported by amdgpu-kms + atomic modesetting do allow for scanout of
such 16 bpc framebuffers and of up to 12 bpc output to suitable HDMI or
DisplayPort high precision displays.
We gate the format behind a new driconf option 'allow_rgb16_configs',
which defaults to true, but allows to disable the formats if any issues
should arise.
Most regular applications won't need the high display precision of
these new 16 bpc 64 bpp formats which have higher memory and bandwidth
requirements, and therefore a potential undesired performance impact
for regular apps. Followup per-platform enablement commits will use
the EGL_EXT_config_select_group extension to put these 16 bpc unorm
formats into a lower priority config select group 1, so they don't get
preferably chosen by default by eglChooseConfig(), but must be explicitely
requested by client applications which really need the high color
precision of these 64 bpp formats and are happy to pay the potential
performance impact. Thanks to Adam Jackson for pointing me to the
EGL_EXT_config_select_group extension.
If the format would be put into the default config select group 0, a
simple EGL eglChooseConfig() call would end up choosing these formats,
which is not what such regular apps would want.
Tested to not cause any change on native X11/EGL and X11/GLX, which only
supports at most 30 bpc / 32 bpp formats.
Followup commits will enable these formats for the EGL/Wayland backend,
and on the EGL/DRM backend.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588>
Detects 16 bpc unorm formats. Used by following RGB[A]16
UNORM display enablement commits.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588>
Otherwise, the following warning is observed with musl
headers, which translates to an error on Android.
external/musl/include/sys/poll.h:1:2: error:
redirecting incorrect #include <sys/poll.h> to <poll.h> [-Werror,-W#warnings]
warning redirecting incorrect #include <sys/poll.h> to <poll.h>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39279>
if set, creates a file in /tmp folder with mesa_<process_name>_<pid>_XXXXXX.log
logging all errors, warnings, etc., rather than stderr. The XXXXXX will be replaced
with alpha numeric character so for each run of the app a new log file will be
created guaranteed.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39206>
Those are variants of f2f16 that always round up/down. Constant folding
requires nextafter that supports half floats (util_nextafter).
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
This is the least invasive way to switch all of Mesa to BLAKE3. It's done
this way for transitional reasons.
SHA1_DIGEST_LENGTH is already equal to BLAKE3_KEY_LEN. This just switches
the SHA1 implementation to BLAKE3.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>
The last 12 bytes are always 0 for now. With this, all SHA1 functions
can be internally implemented as BLAKE3, so that we can switch everything
to BLAKE3 by only changing the implementation of the sha1 utility.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>
fmin/fmax are not signed zero correct.
The name is taken from RDNA4's ISA, because the full IEEE name is a bit long.
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39137>
When converting between snorm and between unorm values, the following
patch introduced 64-bit division on all platforms.
util: Add and use functions to calculate min and max int for a size
Commit hash: 72259a870f
The following unit math
// local MAX_UINT macro based on UINT32_MAX
(uint64_t * uint32_t + uint32_t) / uint32_t
changed to
// macros.h inlined 'uint64_t u_uintN_max()' functions.
(uint64_t * uint64_t + uint32_t) / uint64_t
This can significantly impact performance on 32-bit platforms.
Address this by type-casting the return values from the inlined
functions to avoid the 64-bit divide on 32-bit platforms.
Signed-off-by: Micah Shennum <micah.shennum@garmin.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39001>
This reverts commit dcd9b90aff.
The game falls back to software rendering if any of the enumerated GPUs
have invalid vendorID. This fix causes a regression for systems with an
AMD GPU where the game works perfectly fine, that also happen to have an
Intel iGPU.
The Intel iGPU will be enumerated with vendorID=-1 and this will cause
the game to not use the AMD GPU either. Clearly an issue with the game
but since this workaround is causing more issues than it is solving,
remove the config.
From what I can tell, this fix _did_ work at the time of committing, but
as this is a live service game it gets regularly updated and this buggy
behavior with choosing the device has been introduced recently.
Fixes: dcd9b90aff ("drirc/anv: force_vk_vendor=-1 for Wuthering Waves")
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Taras Pisetskyi <taras.pisetskyi@globallogic.com>
Signed-off-by: llyyr <llyyr.public@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39089>
Fix compiler error:
../src/util/blob.c:344:8: error: assigning to 'uint8_t *' from
'const void *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
glibc now provides C23-style type-generic string functions. memchr
returns const void * when passed a const void * argument. Update nul
declaration to const since it's only used to find the null terminator
position and calculate a size.
Fixes: 1c9877327e ("glsl: Add blob.c---a simple interface for serializing data")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39048>
This feature causes piles of vertex shader timeouts in the CTS which makes CTS
testing extremely unreliable on my min-spec M1. Since we only really have it as
a tickbox for Proton, hide it except for Proton - at least for now.
This eliminates our known flakes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39111>
Add the actual registers used (including uniforms used) by the shader
to stats. This is calculated by the stats gathering code, because the
scheduler and scoreboard passes run after register allocation and can
sometimes change the results.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38961>