The check in check_multiview_texture_target() for numViews <= 0 (as
required by the OVR_multiview spec) is never triggered, since the function
is only called by frame_buffer_texture() when numviews > 1, and the
non-multiview FramebufferTexture functions pass in numviews of 0. As a
result, multiview calls with numviews of 0 are incorrectly treated as
non-multiview attachments.
Tweak frame_buffer_texture() to take an extra bool argument "multiview"
to distinguish between a multiview call with numviews=0, and a
non-multiview call.
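
The distinction can be sketched as follows. This is a minimal
illustration, not the Mesa code: validate_num_views(), the fb_tex_result
enum and the max_views parameter are hypothetical names, and the upper
bound only stands in for whatever view limit the implementation uses.

```c
#include <stdbool.h>

/* Hypothetical result codes standing in for GL error reporting. */
enum fb_tex_result { FB_TEX_OK, FB_TEX_INVALID_VALUE };

/* Hypothetical helper: validate num_views only for multiview callers. */
static enum fb_tex_result
validate_num_views(bool multiview, int num_views, int max_views)
{
   /* Non-multiview entry points pass num_views == 0; nothing to check. */
   if (!multiview)
      return FB_TEX_OK;

   /* A multiview call with num_views <= 0 must be rejected, not treated
    * as a non-multiview attachment. */
   if (num_views <= 0 || num_views > max_views)
      return FB_TEX_INVALID_VALUE;

   return FB_TEX_OK;
}
```

With an explicit flag, num_views == 0 is accepted on the non-multiview
path and rejected on the multiview path.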
Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33346>
Fix the FBO attachment completeness test to ensure that multiview
attachments have all views referring to layers in range of the
underlying texture.
The OVR_multiview spec states:
Add the following to the list of conditions required for framebuffer
attachment completeness in section 9.4.1 (Framebuffer Attachment
Completeness):
"If <image> is a two-dimensional array and the attachment
is multiview, all the selected layers, [<baseViewIndex>,
<baseViewIndex> + <numViews>), are less than the layer count of the
texture."
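
A minimal sketch of the range condition quoted above, assuming unsigned
inputs. The function and parameter names are hypothetical and do not
match the Mesa implementation.

```c
#include <stdbool.h>

/* Hypothetical helper: true if every layer in
 * [base_view_index, base_view_index + num_views) is below layer_count. */
static bool
multiview_layers_in_range(unsigned base_view_index, unsigned num_views,
                          unsigned layer_count)
{
   /* Written without computing base + num_views, so it cannot overflow. */
   return num_views <= layer_count &&
          base_view_index <= layer_count - num_views;
}
```

For example, a 2-layer texture is complete for base 0 with 2 views, but
not for base 1 with 2 views.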
Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33346>
OVR_multiview requires OpenGL 3.0, so expose gl_ViewID_OVR builtin back
to GLSL 1.30 on OpenGL.
v2: Minor whitespace fix
Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33346>
v1. Makes special_event a member of struct dri_drawable to be re-used.
(Michel Dänzer @daenzer)
v2. Guard with VK_USE_PLATFORM_XCB_KHR and clean-up.
(Mike Blumenkrantz @zmike)
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31474>
Add a new CALL_STACK_HANDLER struct which has the offsets of the start
and end of the RegistersPerThread field. This spec change is described
in Wa_22019854901 (see HSD 22019967134).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33342>
This is complicated by two things: mediump varyings, and the lack of u16
regfmt support in LD_VAR.
With mediump, a load(_interpolated)_input with a 16-bit dest size may
either be an explicit 16-bit type or a mediump type lowered by
nir_lower_mediump_io. With explicit 16-bit types, we write 16-bit values
in the VS, but with mediump we write 32-bit in the VS (for messy
reasons). bi_emit_load_vary needs to distinguish these cases by checking
for a mediump type, and set the appropriate source_format to convert the
type on the LD_VAR_BUF path. Types like 'mediump uint16' are luckily not
allowed.
The missing u16 regfmt for LD_VAR means that we can't take the obvious
approach for 16-bit int varyings of emitting 16-bit int formats in the
attribute descriptor and loading them to u16. Instead, we just
write/read all 16-bit varyings as f16 regardless of type. Unlike with
mediump, we don't need to do any 32bit->16bit conversion when loading in
the FS, so as long as we use the same type between the attribute
descriptor and LD_VAR, the conversion is a no-op and the mismatch
doesn't matter.
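
The two cases can be summarized with a small sketch. All names here
(vary_fmt, vary_load, pick_varying_formats) are hypothetical and only
model the decision described above; they are not the compiler's actual
enums or functions.

```c
#include <stdbool.h>

/* Hypothetical format tags; the real driver uses its own enums. */
enum vary_fmt { VARY_FMT_F16, VARY_FMT_F32 };

struct vary_load {
   enum vary_fmt source_format;   /* what the VS wrote */
   enum vary_fmt register_format; /* what the FS load produces */
};

static struct vary_load
pick_varying_formats(unsigned dest_bits, bool is_mediump)
{
   struct vary_load load = { VARY_FMT_F32, VARY_FMT_F32 };

   if (dest_bits == 16) {
      load.register_format = VARY_FMT_F16;
      /* mediump: the VS wrote 32-bit, so the load converts 32 -> 16.
       * Explicit 16-bit: the VS wrote 16-bit and f16 is used on both
       * sides regardless of int/float type, so nothing is converted. */
      load.source_format = is_mediump ? VARY_FMT_F32 : VARY_FMT_F16;
   }

   return load;
}
```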
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33078>
There is no auto32 equivalent for 16-bit types, so we need to select
specific register formats.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33078>
For Bifrost and newer, we always write mediump varyings from a 32-bit
source in the VS. This is needed because the FS does not unconditionally
lower mediump to 16-bit.
Previously we worked around this in panvk by replacing 16-bit formats
with 32-bit in emit_varying_descs, but once we support
storageInputOutput16, we will need to preserve 16-bit formats for
explicit 16-bit varyings.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33078>
Neither of these changes is a behavior difference. The change to
emitting uint16 formats from pan_collect_varyings for PSIZ is
inconsequential because neither panvk nor the gallium driver emit
attribute descriptors for special varyings.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33078>
In a 3-plane uncompressed YUV surface, only the chroma planes should use
MALI_PLANE_TYPE_CHROMA_2P plane_type or set secondary_pointer.
Fixes: 144f9324a3 ("panfrost: prepare v9+ to support YUV sampling")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33104>
On v10, only YUV 420 formats support center_y or center siting.
On previous HW versions, YUV 422 formats support center_y siting but not
center_x or center siting.
Fixes: 83c76cceaf ("panfrost: advertise YUV formats for valhall")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33104>
The mesa screenshot layer attempts to use VK_FORMAT_R8G8B8_UNORM by
default. Using this, we can directly & efficiently write out to a
PNG file without further modifications. However, some GPUs don't
support the given format, so for those that don't, we'll attempt to
use VK_FORMAT_R8G8B8A8_UNORM, which will require some work to ensure
the alpha values are set to opaque to make RGB comparisons easier.
If both surface formats fail, a more descriptive failure
will be shown.
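
As a rough sketch of the alpha handling on the RGBA fallback path,
assuming tightly packed 8-bit RGBA pixels. force_opaque_rgba8() is a
hypothetical helper, not the layer's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: set the alpha byte of every RGBA8 pixel to
 * opaque, leaving the RGB bytes untouched. */
static void
force_opaque_rgba8(uint8_t *pixels, size_t pixel_count)
{
   for (size_t i = 0; i < pixel_count; i++)
      pixels[i * 4 + 3] = 0xFF;
}
```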
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Reviewed-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33295>
For Xe3+ the registers are tightly packed to make better use of GRF
space, so add a statistic to keep track of how many registers were used.
For previous versions this is not useful since the code is spreading
the registers among the whole space.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33311>
This enables them for task and mesh shaders, which for lvp are just
fancy compute shaders, and it's not like gallivm has any real awareness
of the stage it's emitting code for anyway.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32632>
We do not have the TEX semaphore there anyway so the benefits are not as
high as with R500 and the chances of running out of TEX indirections are
just too high.
This will increase the register pressure in some shaders, but I believe
the gained shaders are worth it and there is also some cycle reduction
in some cases. I'm not sure how to optimize this further without
actually cloning the shader before the pair scheduling and then doing
trial and error to see if there is some compromise where we can just hit
the indirection limit to not group it too much...
Shader-db RV410:
total instructions in shared programs: 112800 -> 112825 (0.02%)
instructions in affected programs: 5024 -> 5049 (0.50%)
helped: 23
HURT: 19
total temps in shared programs: 18170 -> 18244 (0.41%)
temps in affected programs: 1365 -> 1439 (5.42%)
helped: 39
HURT: 34
total cycles in shared programs: 169535 -> 166806 (-1.61%)
cycles in affected programs: 14229 -> 11500 (-19.18%)
helped: 84
HURT: 4
LOST: 0
GAINED: 8
GAINED: shaders/godot3.4/34-59.shader_test FS
GAINED: shaders/lightsmark/25.shader_test FS
GAINED: shaders/lightsmark/28.shader_test FS
GAINED: shaders/lightsmark/34.shader_test FS
GAINED: shaders/this-war-of-mine/144.shader_test FS
GAINED: shaders/this-war-of-mine/145.shader_test FS
GAINED: shaders/tropics/432.shader_test FS
GAINED: shaders/tropics/462.shader_test FS
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33275>
If there is no heap that is both device-local and host-visible, then
rebar cannot exist. The previous detection did not account for
the rebar heap using the device-local fallback, which of course
would have the same size as the device-local heap and pass the threshold
check.
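
A minimal sketch of the precondition, assuming each heap is described
by a property mask. The MEM_* bits and the function name are
hypothetical stand-ins, not the Vulkan or driver definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property bits, for illustration only. */
#define MEM_DEVICE_LOCAL (1u << 0)
#define MEM_HOST_VISIBLE (1u << 1)

/* Hypothetical helper: rebar is only possible if some heap has both
 * properties; otherwise any size threshold check should be skipped. */
static bool
has_device_local_host_visible(const uint32_t *heap_props, size_t count)
{
   const uint32_t both = MEM_DEVICE_LOCAL | MEM_HOST_VISIBLE;

   for (size_t i = 0; i < count; i++) {
      if ((heap_props[i] & both) == both)
         return true;
   }

   return false;
}
```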
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33359>
Rather than assert (and otherwise write past the array size), guard against
this (and miscompile the shader), to make the code more robust.
This mimics the behavior of exceeding the cond_stack size (and other similar
stacks). Since the if_stack is only used together with the cond_stack, the
behavior should be the same.
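
A sketch of the guarded push/pop pattern, under the assumption that the
stack keeps counting depth when it overflows so pushes and pops stay
balanced. The struct, the limit and the function names are hypothetical
and not the gallivm code.

```c
#include <stdbool.h>

/* Hypothetical nesting limit, kept small for illustration. */
#define MAX_NESTING 4

struct if_stack {
   int entries[MAX_NESTING];
   unsigned size; /* may exceed MAX_NESTING when nesting is too deep */
};

static void
if_stack_push(struct if_stack *s, int value)
{
   /* Too deep: track the depth only, never write past the array. */
   if (s->size >= MAX_NESTING) {
      s->size++;
      return;
   }
   s->entries[s->size++] = value;
}

/* Returns false if the matching push was dropped or the stack is empty. */
static bool
if_stack_pop(struct if_stack *s, int *value)
{
   if (s->size == 0)
      return false;

   s->size--;
   if (s->size >= MAX_NESTING)
      return false;

   *value = s->entries[s->size];
   return true;
}
```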
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33338>
The gen_header.py script fails with older versions of Python 3 such
as Python 3.5. Two issues observed with Python 3.5 are:
1. Python 3 versions prior to 3.6 do not support the f-string format.
2. Early Python 3 versions do not support the 'required' argument for
the argparse add_subparsers().
Fix both of the above so that older versions of python 3 still work.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28780>
We don't need to run the validation of the XML files if we are just
compiling the kernel. Skip the validation unless the user enables the
corresponding Kconfig option. This removes a warning from gen_header.py
about lxml not being installed.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28780>
This reverts commit 3290222a1a, which was
introduced to fix a regression that only happens in v3d.
As this was moved to the v3d driver, it no longer makes sense to do it
here.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33310>
It is possible that a shader comes with output stores executed before
the inputs are loaded. As the memory used to read the inputs and store
the outputs is the same, this means the stores could overwrite the
inputs before they are read.
This move avoids this situation.
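
The hazard can be illustrated in plain C with a buffer that holds the
inputs and later the outputs. This is only an analogy for the ordering
problem, and run_stage() is a hypothetical function, not driver code.

```c
/* Hypothetical stage: io[0] and io[1] hold the inputs on entry and the
 * outputs on exit, so the same slots are read and written. */
static void
run_stage(float *io)
{
   /* Load every input first, while the shared memory still holds them. */
   const float in0 = io[0];
   const float in1 = io[1];

   /* Only then store the outputs, which reuse the same slots. */
   io[0] = in0 + in1;
   io[1] = in0 - in1;
}
```

If the first store were issued before the second load, io[1] would be
computed from the already overwritten io[0].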
This partially improves
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33053.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33310>