Emitting the instructions one by one results in two MOV instructions
that won't be propagated. By handling both instructions at once, a
single MOV is emitted. For example, on Ice Lake this helps
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends
Without "intel/fs: Allow copy propagation between MOVs of mixed sizes,"
the improvement is still 8 instructions, but there are more instructions
to begin with:
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
This eliminates some spurious, size-converting moves. For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 56 instructions. 1 loops. 4444 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
v2: Condition two of the patterns on !options->lower_extract_byte.
Suggested by Lionel.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
In the vec4 compiler, 8-bit types should never exist.
In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.
Some instructions are handled in non-obvious ways.
Hopefully this will save the next person some time.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
GLX_ARB_create_context_profile has some clever language that sets the
default to core profile but silently degrades back to compat for pre-3.2
GLs. We can just do that, rather than track whether the user specified a
profile.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>
We only end up with one DRI provider per screen, so the only way the
context vtable can differ is if they're not the same directness. Rewrite
the test in those terms to help us unify some of this code away in the
future. Also apply the same logic to the indirect context creation path.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>
Beside show_frame and error_resilient_mode, also need to check if frame size
changes. FRAME_HDR_INFO_VP9_USE_PREV_IN_FIND_MV_REFS flag should be OFF if
frame size changes.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12368>
Adding last width/height to keep tracking the size of the last frame.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12368>
Instead of making LMEM the special case, unify the two paths by setting
up a fake drm_i915_query_memory_regions struct and filling it out based
on OS queries. The important functional change here is that we now pass
system memory through the same GTT size and 3/4 filter that we were
using with the OS queries. This should make behavior consistent on
integrated GPUs regardless of whether or not we have the memory region
query API.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>
16bit integer support also implies using 16-bit results when sampling
textures.
Because we're returning the results in float SSA values instead of int,
we need to bitcast back to integers before truncating the values.
Fixes: 00ff60f799 ("gallivm: add 16-bit integer support")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12413>
Otherwise with some compilers/environments (Android) padding
may contain garbage and memcmp of the key will fail.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12438>
This is what I see happening in
dEQP-VK.spirv_assembly.instruction.compute.opatomic_storage_buffer.load on
pixel 2 (also where I found a buffer big enough to show how to encode the
size).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>
According to https://mesonbuild.com/Reference-manual.html, the check
parameter is supported since meson 0.47.0.
This could have helped to catch the issue fixed by:
221871fb6d ("meson: Search for python3 before python for bin/meson_get_version.py")
as it would have caused the build to fail immediately.
It's still a good idea to check the result even though that issue is
now fixed.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12404>
Even though we can't really do the parsing on behalf of the driver (it's
too complicated), storing it in the vk_image lets us provide a common
implementation of vkGetImageDrmFormatModifierPropertiesEXT(). It'll
also be useful in the next few commits for swapchain images.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12023>