Both OpenGL and Vulkan drivers share the same performance counters.
Let's move them to a common place instead of duplicating.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21420>
For the platforms that call it, it a function in the hot path so
moving it to kmd backend.
After this patch i915/iris_bufmgr.c is empty but not removing it
as next patch will add functions to it.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21494>
bo_madvise() is on hot path, so moving it to kmd backend.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21494>
Swizzle of 8-bit stencil format is defined as _x__ but the hw expects
BC_SWIZZLE_XYZW.
Fixes dEQP-VK.pipeline.monolithic.sampler.border_swizzle.*s8_uint*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21482>
Modifying pitch for all LINEAR surface isn't correct;
the original change that modified surf_pitch was only
intended for YUV textures.
This fixes vkGetImageSubresourceLayout rowPitch return value
for VK_FORMAT_BC3_UNORM_BLOCK + VK_IMAGE_TILING_LINEAR.
Fixes: fcc499d5 (ac/surface: adjust gfx9.pitch[*] based on surf->blk_w)
v2: add check for UYVY format (Pierre-Eric)
v3: move blk_w division to above if check (Pierre-Eric)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21595>
It's just v_fma with fixed DPP8 and builtin s_waitcnt_expcnt, so it can mostly
be handled as a pure VALU instruction.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
Both GLSL & SPIRV have undefined values for shift > bitsize. But SM5
says :
"This instruction performs a component-wise shift of each 32-bit
value in src0 left by an unsigned integer bit count provided by
the LSB 5 bits (0-31 range) in src1, inserting 0."
Better to not hard code the wrong behavior in NIR.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e227bb9fd5 ("nir/builder: add ishl_imm helper")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21720>
It's unnecessary and already checked elsewhere like in
vkCmdBindDescriptorSets(). This improves performance of vkoverhead
test #76 (descriptor_1image) by +18%. It's the same performance as
PRO on my Threadripper 1950X now. This should also slightly improve
texel and buffer descriptors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20699>
We can't add a 64-bit immediate with the hardware iadd, that won't work. What we
can do is add a 32-bit immediate, derived as the low 32-bits of a 64-bit
nir_ssa_def.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
If we manage to fold in a left shift that's bigger than the hardware can do, we
should at least avoid generating a useless right shift to feed the hardware
rather bailing completely.
For motivation, this form of address arithmetic is encountered when indexing
into arrays with large power-of-two element sizes (array-of-structs).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
Optimize address arithmetic of the form
base + u2u64((index << shift) + const)
into hardware operands
base, index << (shift - format_shift) + const'
which (if format_shift = shift) can be simply
base, index + const'
rather than the current naive translation
base, ((index << shift) + const) >> format_shift
This saves at least one pointless shift. We can't do this optimization with
nir_opt_algebraic, because explicitly optimizing "(a << #b) >> #b" to "a" isn't
sound due to overflow. But there's no overflow issue here, which is what this
whole pass is designed around.
For motivation, this address arithmetic implements "dynamically indexing into an
array inside of a C structure", where the const is the offset of the array
relative to the structure.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>