current `st_ChooseTextureFormat(..., internalFormat=GL_BGRA8, ...)`
returns `PIPE_FORMAT_R8G8B8A8_UNORM`.
this causes significant performance loss in apps that use BGRA texture
format(e.g. firefox) when transferring textures because of format
conversions, if the driver doesn't support PIPE_TEXTURE_TRANSFER_BLIT.
fix this by modifying the texture format mapping.
See Also: https://community.mnt.re/t/poor-browser-performance/2042/30
Signed-off-by: mojyack <mojyack@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35678>
(cherry picked from commit db383ceb64)
The new layout affects the whole buffer so it needs to be done
on a full clear.
This fixes this piglit test on a RX 6800 XT:
ext_framebuffer_multisample-accuracy 6 depth_resolve small depthstencil
Fixes: 75a03d733a ("radeonsi: simplify and fix enable_tc_compatible_htile_next_clear logic")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35582>
(cherry picked from commit 04d283c628)
Otherwise, outputs_read/outputs_written might not be up-to-date
(mostly after nir_remove_dead_variables) and remove_point_size() might
reach an assertion later because the output variable isn't found.
It seems better to run nir_shader_gather_info() at the very end of
radv_optimize_nir() which can change a lot of things anyways.
No fossils-db changes.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35707>
(cherry picked from commit 30ccd97cd2)
The register footprint could limit occupancy. We need to take this into
account to avoid deadlocks when a kernel is using barriers.
Fixes: 6d85cd6a3b ("freedreno: Implement get_compute_state_info for Adreno 6xx/7xx")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35745>
(cherry picked from commit 2e00925c81)
`cc: mesa-stable` instead of `fixes:` because several commits have
modified this but keeping this bug:
- 06e57e3231 ("virtio: Add vdrm native-context helper") made
an unconditional copy of subdir(virtio)
- cede4e7ac3 ("meson: Only include virtio when DRM available")
introduced a new condition, which doesn't cover everything that was
needed
- other commits made more changes
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35723>
(cherry picked from commit d0c7bea727)
It was absent when initialising a panfrost_resource from a winsys handle.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 7da251fc72 ("panfrost: Check in sources for command stream")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34224>
(cherry picked from commit cf4a137459)
When panfrost_resource_init_afbc_headers() fails, freeing the newly
created resource is not enough, because we need to unreference its BOs.
This will also take care of freeing its resource label.
Also replace instances of FREE() in error-handling paths with
panfrost_resource_destroy(), as it is capable of handling partially
initialised resources.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: e3f2bc7963 ("panfrost: handle mmap failures")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34224>
(cherry picked from commit 32b128be01)
bi_emit_alu needs to fully set the vector of bi_index pass to
bi_make_vec_to, as it is expected by the callee.
Fixes: 3cc6a4c5 ("pan/bi: Handle swizzles in i2i8")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35642>
(cherry picked from commit bc973d687c)
The snorm formats are not compatible with the srf flag
which was set by the emit_image_load_or_atomic() function.
In this specific case, "use_const_fields" is not set which
implies that the format definition is local. The other
supported formats do not require the srf flag as well.
This change was tested on cypress, barts and cayman. Here are the tests fixed:
khr-gl4[2-6]/shader_image_load_store/basic-allformats-load: fail pass
khr-gl4[2-6]/shader_image_load_store/basic-alltargets-loadstorecs: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/basic-allformats-loadstorecomputestage: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/basic-alltargets-loadstorecs: fail pass
khr-gles31/core/shader_image_load_store/basic-allformats-loadstorecomputestage: fail pass
khr-gles31/core/shader_image_load_store/basic-alltargets-loadstorecs: fail pass
deqp-gles31/functional/image_load_store/2d/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/2d/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/2d_array/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/2d_array/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/3d/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/buffer/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/buffer/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/cube/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/cube/format_reinterpret/rgba8_rgba8_snorm: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35548>
(cherry picked from commit d27ed38d1a)
The mode r10g10b10a2_snorm processed as vertex on palm at the
hardware level doesn't follow the current standard. Indeed, the .w
component (2-bits) is not calculated as expected. The table below
describes the situation.
This change fixes this issue by adding three gpu instructions at
the vertex fetch shader stage. An equivalent C representation and
a gpu asm dump of the generated sequence are available below.
.w(2-bits) expected palm
0 0.0 0.000000
1 1.0 0.333333
2 -1.0 0.666667
3 -1.0 1.000000
w_out = (4.*w_in > 1. ? 1. : 4.*w_in) - (w_in > 0.5 ? 2. : 0.);
0002 00000008 A0080000 ALU 3 @16
0016 00000C02 A0000CC0 1 y: MOV*4_sat __.y, R2.w
0018 801F8C02 600004A0 w: SETGT*2 __.w, R2.w, 0.5
0020 839FC4FE 60400010 2 w: ADD R2.w, PV.y, -PV.w
Note: The rv770 and cypress don't need this correction. This is
definitely a hardware change between these gpus.
This change was tested on palm, barts and cayman. Here are the tests fixed:
spec/arb_vertex_type_2_10_10_10_rev/arb_vertex_type_2_10_10_10_rev-array_types: fail pass
deqp-gles3/functional/draw/random/124: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/normalize/int2_10_10_10/components4_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/normalize/int2_10_10_10/components4_quads256: fail pass
khr-gl43/vertex_attrib_binding/basic-input-case5: fail pass
khr-gl44/vertex_attrib_binding/basic-input-case5: fail pass
khr-gl45/vertex_attrib_binding/basic-input-case5: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32427>
(cherry picked from commit e8fa3b4950)
`getopt_long()` returns an `int`, not a `char`; putting the value in
a `char` before comparing it to `-1` was making the comparison always
fail, resulting in the invalid codepath taken that then fails with:
option `-' is invalid: ignored
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
(cherry picked from commit 99e8d804bf)
`subprocess.Popen()` returns immediately, and the subprocess might not
have finished by the time `stdout` is read on the next line, spuriously
failing the tests.
`subprocess.check_output()` makes sure the output is available before
returning, solving this issue; it additionally raises an error if the
subprocess failed, giving a better error than a failed diff later in the
script.
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
(cherry picked from commit de6ab1beda)
Clients are expecting the color info to be fully filled when the api
exists. Give proper defaults for the metadata to stay aligned with
legacy backends.
Also amend the missing ChromaSiting cases.
Fixes: ee42e2166d ("android: Introduce the Android buffer info abstraction")
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35613>
(cherry picked from commit 64d18f84b0)
Clients are expecting the color info to be fully filled when the api
exists. Give proper defaults for the metadata to stay aligned with
legacy backends.
Fixes: 122fd46b15 ("Android15 support gralloc IMapper5")
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35613>
(cherry picked from commit 0ac1e05f65)
For modern gralloc impl, usually there's other fds appended in the
native_handle_t for gralloc buffer metadata sharing. So numFds can be
greater than 1, while the common agreement is still that the format
plane handles being placed at the beginning in the lack of a standard
format plane fd metadata query mapper api.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13163
Cc: mesa-stable
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35038>
(cherry picked from commit 41d241bf6e)
The MEMORY_BARRIER instruction has some issues, where we end up
dead-code eliminating it before it gets to do what it's supposed to do.
But even if we fix that, we have issues where we can end up inserting
flow control into it, which isn't going to work because we have nothing
to emit here either.
So let's rework this to a special-cased NOP instruction, which is marked
as a scheduling barrier. The beneft here is that NOPs are already properly
handled when it comes to flow control.
Note that this isn't perfect either; this only prevents memory operations
from crossing the scheduling barrier. We should really prevent any
operation with observable side effects from crossing the barrier. This
includes things like reading clocks etc.
But that's a larger change, and it's a step in the right direction to get
this to no longer be dead-code eliminated. So let's put this band-aid on
for now.
Fixes: f77a50e45e ("pan/bi: add a MEMORY_BARRIER pseudo-instruction")
Reviewed-by: Caterina Shablia <caterina.shablia@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35502>
(cherry picked from commit 18893a250f)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 252cac1e5c ("anv: avoid memory type changes with INTEL_DEBUG=noccs")
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35615>
(cherry picked from commit bfee389f0c)
It might be the case that both the branch and exec mask write in a
divergent branch block are removed. try_remove_simple_block() might then
try to remove it, but fail because it has multiple logical successors.
Instead, just skip these blocks.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.1
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>
(cherry picked from commit 5344abbc56)
Previous code crashed with an assertion failure in this case.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 211aa20194 ("panvk: Move away from panfrost_{bo,device}")
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35528>
(cherry picked from commit b31dee9b7e)
u_blitter sets a viewport transform with depth range [-1,1], which is
outside the [0,1] range that is allowed by opengl.
The mali hardware docs state that setting the LOW_DEPTH_CLAMP register
outside of [0,1] is undefined behavior. We haven't observed any problems
with this so far, but better to fix it.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 810135fb42 ("gallium/u_blitter: Fix depth.")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35225>
(cherry picked from commit b8c7fcda27)
The HW specifications require the size of shader resource tables to be a
multiple of 4, otherwise correct behaviour is not guaranteed.
Fixes: 713f5c3600 ("panvk: Prepare the cmd_desc_state logic for Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35553>
(cherry picked from commit 48e8d6d207)
The SPIR-V spec is pretty clear that coordinates on subpass attachments
are relative to the current pixel. They're required to be zero but we
should stay consistent with ourselves (we already do this for image
intrinsics) and with the spec.
Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
(cherry picked from commit 2c13e1e655)
There's nothing in NIR which guarantees that the deref is the first
source or that the coordinate is the second. Use
nir_tex_instr_src_index() to get the actual indices.
Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
(cherry picked from commit 9a52b9372c)
Turnip when cross-compiled for i386 needs to be built with SSE2 as a
minimum spec, as it uses clflush unconditionally. Make sure to pass in
the sse2_args, which will be empty on Arm64 targets.
Fixes: 7231eef630
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35621>
(cherry picked from commit f872cbea37)
We are reading accel header parameter those are updated by CS, so we
need to apply flushes to make L3 coherent with CS.
This fixes ray query tests on MTL:
- dEQP-VK.ray_query.*.serialization.*
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35590>
(cherry picked from commit a676ba9294)