We always support the extension. Wether we support any formats or not
depends on one of two conditions:
1. If Mesa is built with shader-cache support or not, which is not a
driver decision.
2. If GL_ARB_gl_spirv is supported or not, which is covered elsewhere.
So there's no reason to list individual drivers here, as that doesn't
really change anything.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32300>
Update panfrost to add support for NV16 and for the 10 bit
NV15 and NV20 formats.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31854>
Although there are no DRI formats directly corresponding to 10bpp
planes (as used in e.g. NV15), some hardware can emulate NV15 with
R10_G10B10_420. Check for this in dri2_yuv_dma_buf_supported, so
that we can advertise support for these formats if available.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31854>
Support external images with 10 bit YUV in NV15 and NV20 formats.
These are produced by some hardware decoders, so this will be
useful.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31854>
Support for NV16 was kind of half done, by declaring it to be
NV12. That didn't actually work though, so add some more stuff
to make it work.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31854>
Y8_U8V8_422_UNORM is more commonly known as NV16. There has been
a fourcc for NV16 for a while now, so let's rename it to be in
line with NV12 and similar formats.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31854>
It's also possible to use ALL_GRAPHICS and PRE_RASTERIZATION as
alternatives.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32323>
This was obviously broken because demote results in more helper invocations.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b447f5049b ("nir: Add a discard optimization pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32310>
When simul_use=true, the tiler descriptors are allocated from
the descriptor ringbuf. We set state.gfx.render.tiler to a
non-NULL value to satisfy the is_tiler_desc_allocated() tests,
but we want it to point to a faulty address so we can easily
detect if it's used in the command stream/framebuffer
descriptors.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32213>
Fix cross command buffer render pass suspend/resume by
emitting a render context (tiler+framebuffer descriptors)
on suspend that we can re-use on resume.
This involves splitting the issue_fragment_jobs() logic to
decouple the framebuffer descriptor initialization and the
run_fragment emission. This also requires patching a few
places where we were testing the tiler/fbd values to
determine if we are in a render pass, which no longer works
when a render pass is resumed.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32213>
This allows us to start from the HW reg file state instead of a
zero-initialized reg file.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32213>
We don't need to check anymore if this was already applied and it turned
out the check was not working properly in the first place.
The check for vs is kept in place, because that one still detects that
few wine shaders already have the sin/cos input in correct range.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32160>
finalize_nir requires that calling it multiple times on the same shader
doesn't break it.
RV530 shader-db:
total instructions in shared programs: 132915 -> 132851 (-0.05%)
instructions in affected programs: 2016 -> 1952 (-3.17%)
helped: 16
HURT: 0
total temps in shared programs: 18238 -> 18232 (-0.03%)
temps in affected programs: 42 -> 36 (-14.29%)
helped: 6
HURT: 0
total cycles in shared programs: 197510 -> 197446 (-0.03%)
cycles in affected programs: 2102 -> 2038 (-3.04%)
helped: 16
HURT: 0
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32160>
No effect in shader-db right now, but without it the next commit
leads to small regression in instruction numbers (0.03%) instead
of the small win we have now (-0.05%).
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32160>
This makes instruction selection simpler and fixes potential issues with
allocated_vec or the optimizer moving SGPR uses out of the loop.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
The VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID is handled
by the common code.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32314>
Some MTK display controller drivers support only this AFBC modifier.
Give it a chance to use AFBC for scanout resources.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
When dealing with AFBC render targets using wide blocks, the GPU needs to
keep rendering tiles that are a multiple of 16x16. This is described
as AFBC render block size, and adds extra constraints:
- render target buffers need to be aligned on 16 pixels in the vertical
direction, even if the AFBC super block size is 4 or 8 pixels.
- if the effective tile size is smaller than the render block size, we
should force a clean write and discard+ignore the CRC
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
On all previous GPUs, the effective tile size was limited to 16x16, but
it got increased on v10. Add an helper to query this maximum effective
tile size.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
This allows using the tile size to make decisions not related to the
framebuffer descriptor. Mainly, for the near future, to decide
whether some tiling hierarchy levels should be disabled.
The color buffer allocation size is also calculated at the same time
as it's using common data underneath.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
This is mostly useful so that we can set the hierarchy level mask
using information from the `pan_fb_info` structure that isn't filled
yet when the tiler descriptor is first allocated.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
AFBC body is required to be aligned on 128 bytes on v6+ hardware.
Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
It's an experimental feature that we may enable later.
Instead of exporting NULL primitives, perform a compaction
on primitives after culling.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>
Implement two workgroup scans over two boolean values in parallel,
so that they can be done with very minimal ALU overhead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>
In certain cases, the hardware fails to properly process a mipmap level
of these special stencil and depth formats. This happens at width=16.
This change adds a software workaround.
Modifying the corresponding mipmap nblk_x, and the other related
values, could make the tests below to work. Anyway, this method
generates regressions.
This change was tested on palm and cayman and fixes the following tests:
spec/arb_framebuffer_object/framebuffer-blit-levels read stencil: fail pass
spec/arb_depth_buffer_float/fbo-clear-formats stencil/gl_depth32f_stencil8: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31957>
This situation is happening, for instance, when the hardware is
using the type FMT_8_8_8_8 (4 bytes) while the software was
requesting a 3 bytes type. The width should be adjusted to the
expected hardware size; otherwise, the last vertex is lost.
Note: The rv770 didn't behave like this. This is definitely
a hardware change between these gpus.
This change was tested on palm and cayman. Here are the tests fixed:
spec/!opengl 2.0/gl-2.0-vertexattribpointer-size-3: fail pass
deqp-gles2/functional/draw/random/62: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_byte3_vec4_dynamic_draw_quads_1: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_short3_vec4_dynamic_draw_quads_1: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_short3_vec4_dynamic_draw_quads_256: fail pass
deqp-gles3/functional/draw/random/117: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/byte/buffer_stride32_components3_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/short/buffer_stride32_components3_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/short/buffer_stride32_components3_quads256: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32184>
This change fixes the evergreen nonconformity issue on non-mipmap
textures when the minification and the magnification are not in
the same state.
This modification disables 5278436d67 when the minification and
the magnification are different. This fixes the nonconformity
without new regressions. Anyway, I was unable to reproduce
the issue described by 5278436d67 on palm and cayman.
This change was tested on cayman and palm. It fixes 84 deqp-gles2
tests and 128 deqp-gles3 tests:
deqp-gles2/functional/texture/filtering/2d/linear_nearest_*
deqp-gles2/functional/texture/filtering/2d/nearest_linear_*
deqp-gles2/functional/texture/filtering/cube/linear_nearest_*
deqp-gles2/functional/texture/filtering/cube/nearest_linear_*
deqp-gles2/functional/texture/vertex/2d/filtering/linear_nearest_*
deqp-gles2/functional/texture/vertex/2d/filtering/nearest_linear_*
deqp-gles2/functional/texture/vertex/cube/filtering/linear_nearest_*
deqp-gles2/functional/texture/vertex/cube/filtering/nearest_linear_*
deqp-gles3/functional/texture/filtering/2d/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/2d/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/2d_array/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/2d_array/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/3d/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/3d/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/cube/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/cube/combinations/nearest_linear_*
deqp-gles3/functional/texture/vertex/2d/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/2d/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/2d_array/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/2d_array/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/3d/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/3d/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/cube/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/cube/filtering/nearest_linear_*
Fixes: 5278436d67 ("r600: force LOD range to be only one value when mip.min filter is NONE")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32185>
On evergreen depth-stencil textures are allocated as two objects, and
when using the eg_surface_init_1d_miptrees code path the size evaluation
uses the generalized surf_minify function. Here when allocating the
depth texture the alignment takes the depth bpe value into account, and
uses bpe=1 for the stencil texture. As a result the texture pair may
consist of textures with two different nblk_x sizes and this seems to
be a problem with some textures, namely npot and small (width < 32), but
not for mipmapped textures. In the problematic cases, if the so allocated
depth texture is larger than the stencil texture, then the kernel may reject
sent data with an error message like:
evergreen_cs_track_validate_stencil:622 stencil read bo too
small (layer size 131072, offset 524288, max layer 1, bo size 606208)
- because apparently the expected layer size is evaluated from the depth
texture size, but the actual bo size is evaluated based on the true texture
size values. If, on the other hand, the stencil texture is larger than the
depth texture, then the data is send with a wrong alignment, and certain
dEQP-GLES31 tests fail.
In order to obtain equal texture sizes in the problematic cases magnify
the depth texture alignment requirement by its bpe, so that the relative
alignment is the same for depth and stencil texture.
Fixes:
dEQP-GLES31.functional.stencil_texturing.format
.depth32f_stencil8_2d
.depth32f_stencil8_2d_array
.depth24_stencil8_2d
.depth24_stencil8_2d_array
.stencil_index8_2d
.stencil_index8_2d_array
.depth32f_stencil8_draw
.depth24_stencil8_draw
dEQP-GLES31.functional.texture.border_clamp.formats
.stencil_index8.nearest_size_npot
.depth24_stencil8_sample_stencil.nearest_size_npot
.depth32f_stencil8_sample_stencil.nearest_size_npot
dEQP-GLES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d
.uint_stencil.nearest.s_clamp_to_edge_t_clamp_to_border_npot
.uint_stencil.nearest.s_repeat_t_clamp_to_border_npot
.uint_stencil.nearest.s_mirrored_repeat_t_clamp_to_border_npot
piglits:
arb_framebuffer_object-depth-stencil-blit *stencil*
framebuffer-blit-levels draw stencil
arb_texture_stencil8/
texwrap formats offset/gl_stencil_index8, npot
texwrap formats/gl_stencil_index8, npot
ext_framebuffer_multisample
accuracy all_samples stencil_resolve small depthstencil
unaligned-blit * stencil downsample
ext_texture_array/fbo-depth-array *stencil
egl_khr_gl_renderbuffer_image-clear-shared-image gl_depth_component24
v2: use util_is_power_of_two_or_zero (Marek)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32169>
This requirement is currently satisfied by the usage in panfrost and
lima.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32084>
Because we are doing perspective division before clipping, small
gl_Position.w values will give Inf for positions and interpolated
varyings. Before this change, primitives containing a vertex with w=0
were invisible.
This is only used in panfrost and lima.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32084>
I don't know of any case of Apple's driver using this, but it seems to work. The
stream link bit is identical to VDM so that was easy, the tricky part was the
return but I bruteforced the encoding space and this is the (only) thing that
worked. So add the XML.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>