We need to expose this, as we support it.
Otherwise 1x1 is assumed and we fail some CTS.
Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
When we are using compute resolve, we can get
values the CTS does not expect due to the value
we end up writing for UNORM in
`nir_image_deref_store`.
Make the compute resolve rounding path match with
the output of the fragment shader resolve path,
by going through the same FP16 RTZ conversion as
we do for UNORM/SNORM formats.
This is why VK_EXT_sample_locations CTS was
failing on > GFX9.
On <= GFX9, I am assuming we are falling back to
RESOLVE_FRAGMENT, due to DCC stuff, which is why
it works there.
I tested a handful of images from the Vulkan CTS
for the sample locations and resolve tests for
diff UNORM formats from the qpa file forcing
FRAGMENT and with this change.
With this change, we now match on the compute
resolve path the same sha for the ones I compared
with ImageMagick `identify`.
CTS passes for: *resolve*, *image_clearing* and
*sample_locations* on RX 7900XTX.
Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
Taken from OpenGL-Registry commit ca62982097eb
("Remove plural bindings in GL_ARB_shader_texture_image_samples (#637)")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33356>
This fixes some upcoming CTS tests that attempt bias usage when
it is not valid per spec.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34285>
This is based on Timur Kristof's code, but there are a lot of differences.
The idea is that it doesn't just compute an intersection between a point
and a triangle. It computes the *distance* between a point and a triangle
and it does so in screen space. It accurately takes the subpixel precision
of the rasterizer into account, so that it works optimally at all
resolutions, all MSAA modes, and all quant modes.
The distance computation is only approximated because it only considers
the infinite lines going through triangle edges. However, it seems to be
more than sufficient in practice because the existing rounding-based small
prim culling compensates for it.
The performance improvement is up to 10% in some geometry-bound tests,
though targeted microbenchmarks can show a lot more than that.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33361>
Migrate the panfrost-g52-vk job to mt8186-corsola-steelix-sku131072, a
new device in LAVA. This DUT is faster than the Khadas VIM3 device it
replaces, and since more of these devices are available in LAVA, reduce
the DEQP_FRACTION and increase the parallelism for the pre-merge job.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33996>
This condition accidentally got inverted when cleaning up code, whoops.
Fixes: 3251f321b8 ("mesa: some cleanups for texparam extension checks")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34248>
We currently just assume that textureCompressionETC2 and
textureCompressionASTC_LDR are always supported. And while that's true
for all the G52s, G610s abd G310s we've seen out in the wild, it's not
guaranteed to be true. An SoC vendor might disable support for one of
these formats.
So let's check properly, just for good measure.
Fixes: d970fe2e9d ("panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34206>
The panfrost_emit_plane function expects an array, and Coverity
complains about passing a pointer instead. Yeah, that's a bit nit-picky,
but it's easy enough to use an actual array here instead of trying to
fudge it.
This should be a non-functional change.
CID: 1636773, 1636744
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34156>
We were using an unsigned instead of a size_t for the size calculation,
which would break large allocations. So let's switch to size_t here,
for good measure.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34156>
We already have a variable call "alignment" here, and aliasing it
breaks things. Whoops, let's rename the variable to page_size to
avoid this.
Fixes: 22985caf3f ("panfrost: sanity-check alignment")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34156>
quick_shader is for some reason now excruciatingly slow on the Microsoft
runners, but fine on the GStreamer runners. Until we can figure out why
this is happening - 27min runtimes instead of 3min - just keep them over
there.
Dozen is miraculously unaffected.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34280>
These jobs all pretty comfortably complete in 5-7min. Add a timeout for
roughly double that to make sure that jobs don't get stuck.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34280>
Follow what everyone else did to have gitlab-ci.yml for concrete job
definitions, and gitlab-ci-inc.yml for things which get inherited.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34280>
This job is currently broken due to the lack of git in the Vulkan
container; it should really be pulling the fossils from S3 like the
traces anyway.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34280>
Buffer resources are quite special as they are only one dimensional,
always linear, don't have miplevels or array slices, never have a
texture or render compatible sibling, don't ever use TS.
The gallium context interface acknowledges this fact by providing
separate entry points for buffer maps/unmaps/flushes.
Provide a specialized etna_buffer_resource as a much more lightweight
alternative to the fullblown etna_resource and implement buffer
maps/unmaps in the same straight forward, direct map manner that is
hidden inside all the tiling, TS and resource sibling handling in
etna_transfer_map/unmap. It is expected that further map optimizations
can be added on top of this simple implementation much more easily
than in the merged buffer/texture transfer code.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34061>
This aligns the prototype with etna_resource_used and allows to
track different resource specializations in the same datastructure.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34061>
We are already applying the .bagr swizzle in bifrost_preprocess_nir(), so
remove lower_tg4_broadcom_swizzle from nir_lower_tex_options in
panvk_preprocess_nir to avoid applying the swizzle twice.
Fixes: 4050697a8f ("panvk: So more nir_lower_tex before descriptor lowering")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34033>
This has been tested again with vkoverhead on 4 different CPUs and
using 32-bit counters is the fastest combination overall.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34229>
A begin/end sequence is something like (it's all macros based):
radeon_begin(cs);
radeon_emit(PKT3(PKT3_DRAW_INDEX_AUTO, 1, cmd_buffer->state.predicating));
radeon_emit(vertex_count);
radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX | use_opaque);
radeon_end();
This is loosely based on RadeonSI (see !8653 (a0978fff)) and it seems
indeed faster overall.
The main goal of this rework is to re-use the same logic as RadeonSI
for paired packets on GFX12 (also GFX11 dGPUs) because it's supposed
to be way faster, especially on GFX12 where the CP is slow. The other
goal is to share more cmdbuf emission between both drivers in the near
future.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34229>
This is a bug fix but Volta is currently disabled by default and it only
affects fp64 and fp16.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34302>
This is probably a little more code but we're about to add real data for
Turing+ so it's better to have things contained like this. Since Volta
and earlier will always remain hacks, we might as well have those hacks
in the per-SM files rather than pretending we have a general thing in
sched_common.rs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34302>
It's split cleanly along the SM70 boundary anyway so there's no code
duplication happening here.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34302>
We're about to add instruction latencies which are going to be in their
own files because they're also massive. This makes things follow a bit
more of a module structure where sm70.rs is the thing that ties it all
together, sm70_encode.rs is the encoder, and smXX_instr_latencies.rs
will be the individual latency files.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34302>
For now, these all just call into sched_common.rs but this gives us the
interface we really want going forward.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34302>