The alpha channel seems to be returned internally as f16 (up-converted
to f32 if that is the dest type of the sam instruction). This expresses
1/3 and 2/3 with less precision than the CL CTS expects (f32).
This may be a test bug, but the format is not required.
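As a rough illustration of the precision gap (a sketch that uses
Python's half-float packing, not the hardware's actual conversion
path), round-tripping 1/3 and 2/3 through f16 loses far more precision
than round-tripping them through f32:

```python
import struct

def round_trip(value, fmt):
    """Round a Python float through a struct format ('<e' = f16, '<f' = f32)."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

def rounding_error(value, fmt):
    """Absolute error introduced by storing value in the given format."""
    return abs(round_trip(value, fmt) - value)

for v in (1 / 3, 2 / 3):
    # f16 keeps 10 mantissa bits, f32 keeps 23.
    print(v, rounding_error(v, "<e"), rounding_error(v, "<f"))
```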
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40028>
In some cases CL has higher precision requirements for format support.
Add a PIPE_BIND_x flag so that drivers can expose formats in GL(ES) that
they cannot expose in CL.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40028>
The original MR switched to use a float raw_timestamp_period to scale
the raw timestamp outside of the gallium driver. This better matched
how vulkan works.
But unlike vulkan, gallium has timestamp related queries/APIs that
return already scaled time, resulting in small errors if the way the
scaling is done differs between driver scaling and frontend scaling.
The important thing is that any error introduced by scaling must be
the same error across APIs.
(In particular, an f64 value cannot precisely represent an arbitrary
u64 value. OTOH the driver's scaling could simply be a multiply by an
integer. But the differing precision errors of the two approaches cause
problems when comparing timestamps that are converted in different
ways.)
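A minimal sketch of the mismatch, assuming a hypothetical counter that
ticks every 52 ns and a made-up raw value (neither comes from real
hardware). The integer path is exact, while the f64 path drops the low
bits of the raw value before scaling:

```python
def scale_integer(ticks, ns_per_tick):
    """Driver-style scaling: exact integer multiply."""
    return ticks * ns_per_tick

def scale_float(ticks, period):
    """Frontend-style scaling through an f64, which has a 53-bit mantissa."""
    return int(float(ticks) * period)

ticks = 2**60 + 1  # hypothetical raw counter value
exact = scale_integer(ticks, 52)
approx = scale_float(ticks, 52.0)

# The two paths disagree, so timestamps converted in different ways
# cannot be compared reliably.
print(exact - approx)
```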
In some, but not all, cases this could be addressed by changing the
driver to use the same scaling function, but this is not always possible
(if, for example, the scaling is done on the GPU CP). So switch back to
the original approach from !39995, using a pscreen->convert_timestamp()
callback, to put the control back in the hands of the driver.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40051>
The Vulkan spec says about VkFormatFeatureFlagBits:
If a format does not incorporate chroma downsampling (it is
not a “422” or “420” format) but the implementation supports
sampler Y′CBCR conversion for this format, the implementation
must set VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT.
Fixes: af062126ae
Signed-off-by: Benjamin Otte <otte@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39820>
In some situations we can have multiple presents queued
with the same target_msc, and in this case we might get
the last one signaled before the previous ones. Here's
an example with some debug logs added to the relevant
functions:
loader_dri3_swap_buffers_msc: new send_sbc=2323 - target_msc=337
dri3_handle_present_event: old recv_sbc=2322 msc=338 new_recv_sbc=2323
loader_dri3_swap_buffers_msc: new send_sbc=2324 - target_msc=338
dri3_handle_present_event: old recv_sbc=2323 msc=338 new_recv_sbc=2324
loader_dri3_swap_buffers_msc: new send_sbc=2325 - target_msc=338
dri3_handle_present_event: old recv_sbc=2324 msc=338 new_recv_sbc=2325
loader_dri3_swap_buffers_msc: new send_sbc=2326 - target_msc=338
loader_dri3_swap_buffers_msc: new send_sbc=2327 - target_msc=338
loader_dri3_swap_buffers_msc: new send_sbc=2328 - target_msc=338
dri3_handle_present_event: old recv_sbc=2325 msc=338 new_recv_sbc=2327
loader_dri3_swap_buffers_msc: new send_sbc=2329 - target_msc=338
dri3_handle_present_event: old recv_sbc=2327 msc=338 new_recv_sbc=2328
loader_dri3_swap_buffers_msc: new send_sbc=2330 - target_msc=338
dri3_handle_present_event: old recv_sbc=2328 msc=338 new_recv_sbc=2329
loader_dri3_swap_buffers_msc: new send_sbc=2331 - target_msc=338
dri3_handle_present_event: old recv_sbc=2329 msc=338 new_recv_sbc=2330
dri3_handle_present_event: old recv_sbc=2330 msc=338 new_recv_sbc=2326 # oops
dri3_handle_present_event: old recv_sbc=2326 msc=339 new_recv_sbc=2331
It's usually harmless, except if Mesa ends up using
loader_dri3_swapbuffer_barrier right after the out-of-order event.
In this example it's ok because more swaps are executed after 2330, so
waiting for recv_sbc>=2330 would work anyway.
But if this wasn't the case, loader_dri3_swapbuffer_barrier would never
return, waiting for recv_sbc to become >= 2330 while it's stuck at 2326
because the later swaps were processed earlier.
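The hazard can be replayed with the sbc values from the log above. This
is only a simulation of the failure mode. The monotonic variant is one
possible way to keep recv_sbc from moving backwards and is an
assumption, not necessarily what this change implements:

```python
# sbc carried by each present event, in arrival order, copied from the log.
events = [2323, 2324, 2325, 2327, 2328, 2329, 2330, 2326, 2331]

def replay_overwrite(events, recv_sbc=2322):
    """Every event overwrites recv_sbc, so it can move backwards."""
    trace = []
    for sbc in events:
        recv_sbc = sbc
        trace.append(recv_sbc)
    return trace

def replay_monotonic(events, recv_sbc=2322):
    """Hypothetical alternative: never let recv_sbc decrease."""
    trace = []
    for sbc in events:
        recv_sbc = max(recv_sbc, sbc)
        trace.append(recv_sbc)
    return trace

print(replay_overwrite(events))
print(replay_monotonic(events))
```

If no swap followed the out-of-order event, a barrier waiting for
recv_sbc >= 2330 would see 2326 in the first replay and 2330 in the
second.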
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39857>
Otherwise, the following error is observed:
lvp_pipeline.c:422:28:
error: variable 'progress' is used uninitialized whenever
'if' condition is false [-Werror, -Wsometimes-uninitialized]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40046>
The way I understand the HW docs, the polygon offset is always applied
in the 24-bit depth domain (there are no polygon offset depth format
control registers like r600 has), so we need to manually rescale for
16-bit buffers.
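A sketch of the arithmetic, assuming that only the constant "units"
term depends on the depth resolution and that the rescale is a plain
power-of-two factor (the function name and the exact factor used by the
driver are assumptions):

```python
def rescale_offset_units(units, depth_bits, hw_bits=24):
    """Scale polygon offset units so that one unit equals one step of a
    depth_bits buffer when the hardware applies it in hw_bits precision."""
    return units * float(1 << (hw_bits - depth_bits))

# One unit in a 16-bit buffer spans 2^(24-16) steps of the 24-bit domain.
print(rescale_offset_units(1.0, 16))
print(rescale_offset_units(1.0, 24))
```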
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39196>
The main change here is adding an architecture argument to
pan_blend_can_fixed_function, so that we can take into
account fixed function hardware limitations in particular
generations.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
If the output format has no alpha channel then DST_ALPHA is the same
as CONST_ONE, and hence the blend operation becomes trivial (opaque).
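To illustrate the substitution, here is a sketch that assumes a blend
equation of src * DST_ALPHA + dst * (1 - DST_ALPHA). The equation and
helper names are chosen for illustration and are not taken from the
driver:

```python
def dst_alpha_factor(dst_alpha, format_has_alpha):
    """Value of the DST_ALPHA factor; a missing alpha channel reads as 1.0."""
    return dst_alpha if format_has_alpha else 1.0

def blend(src, dst, factor):
    """Hypothetical blend: src * DST_ALPHA + dst * (1 - DST_ALPHA)."""
    return src * factor + dst * (1.0 - factor)

# Without an alpha channel the factor is 1.0 and the result is just src.
print(blend(0.25, 0.75, dst_alpha_factor(0.5, False)))
```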
This also fixes some piglit test failures, possibly because the
fixed function blending hardware isn't really set up to handle RGB1.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
This is intended to enable rusticl to use get_query_result_resource()
for timestamp queries, for hw which cannot convert ticks to us on the
GPU (or for which doing the conversion on the GPU is expensive). In
this case, the query result buffer is not exposed to the app, so we
can still do the necessary conversion on the CPU.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39995>
arb_point_sprite-mipmap renders polygons with polygon mode set to POINT.
However, r300 point-sprite setup only treated MESA_PRIM_POINTS as point
draws, so sprite coord replacement was disabled for polygon primitives
that were rasterized as points. This produced wrong texcoord orientation
and failed the piglit test.
Detect point rasterization from the primitive plus rasterizer fill/cull
state and use that in both HWTCL and SWTCL draw paths when updating
is_point flag.
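A rough sketch of such a check. The constants are stand-ins for the
MESA_PRIM_* and fill/cull enums, TRIANGLES stands for any polygon-type
primitive, and the exact conditions in the driver may differ:

```python
POINTS, LINES, TRIANGLES = "points", "lines", "triangles"
FILL, LINE, POINT = "fill", "line", "point"
CULL_NONE, CULL_FRONT, CULL_BACK = 0, 1, 2

def rasterizes_as_points(prim, fill_front, fill_back, cull_face):
    """True if the draw ends up rasterizing points."""
    if prim == POINTS:
        return True
    if prim != TRIANGLES:
        return False  # polygon mode does not apply to lines
    front_visible = not (cull_face & CULL_FRONT)
    back_visible = not (cull_face & CULL_BACK)
    # A face only matters if it is not culled.
    return (front_visible and fill_front == POINT) or \
           (back_visible and fill_back == POINT)

print(rasterizes_as_points(TRIANGLES, POINT, POINT, CULL_NONE))
```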
The test now passes on RV370 and fails on the rest of the CI HW, but the
remaining issues seem to be an LOD boundary mismatch at point size 22:
the hardware samples level 0 where the test expects level 1. In total
only 4 cases now fail instead of 82 before.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39850>
Add support for selecting which geometry shader output stream feeds
the rasterizer, via VkPipelineRasterizationStateStreamCreateInfoEXT.
In both the LLVM and fallback draw pipelines, select the rasterization
stream after stream output emit so that SO still receives all streams.
Wire the Vulkan state through and advertise the feature.
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39984>
draw_geometry_shader_run allocates vertex buffers for all active
streams, but the non-LLVM pipeline cleanup only freed stream 0.
Free all GS stream allocations unconditionally.
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39984>
Add a 2-bit field to select which geometry shader output stream
feeds the rasterizer. Only meaningful when a geometry shader
with multiple output streams is active.
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39984>
Replace the shader-based R/B swap with the blob driver's approach:
use A8R8G8B8 as the texture format so the sampler correctly interprets
the BGRA bytes the PE writes, and perform R<->B conversion at the
CPU boundary during transfer blits (tiled<->linear copies).
The R/B swap is gated by an in_transfer_blit context flag so that
GPU-internal blits (e.g. glBlitFramebuffer) operating on data already
in BGRA byte order are not affected.
For RB_SWAP formats, skip the texture shadow shortcut to ensure the
blit engine path is used, which handles the R/B swap correctly for
both reads and writes.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Lang <dalang@gmx.at>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38710>
Add a helper that returns the BGRA pipe format for a given RGBA pipe
format when the PE uses RB_SWAP. This is needed to pack clear colors
in the byte order the hardware actually stores.
Also fix translate_pe_format_rb_swap() to return 0 for formats with
PE_FORMAT_NONE, avoiding false positives on texture-only formats.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Lang <dalang@gmx.at>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38710>
Pass the per-image swizzle array through to the BLT CONFIG register
SWIZ fields instead of hardcoding the identity swizzle. This allows
the BLT engine to perform channel swizzling during copies, matching
what the blob driver does.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Lang <dalang@gmx.at>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38710>
It was already computed in brw_shader::assign_curb_setup(), so we can use
it in brw_assign_urb_setup().
There was a mismatch between assign_curb_setup() and brw_assign_urb_setup()
when push_sizes were not a multiple of REG_SIZE: the former aligned each
push_size before summing them, while brw_assign_urb_setup() only aligned
the sum of all push_sizes.
By luck, the only places that had a push_size not aligned to REG_SIZE had
just one push_size, so this was not an issue.
So also fix this mismatch here and add an assert to catch any future
mismatch.
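A small sketch of the two computations, with REG_SIZE assumed to be
32 bytes and made-up push sizes purely for illustration:

```python
REG_SIZE = 32  # assumed register size in bytes, for illustration only

def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def total_align_each(push_sizes):
    """Align every push size, then sum."""
    return sum(align(size, REG_SIZE) for size in push_sizes)

def total_align_sum(push_sizes):
    """Sum the push sizes, then align the total."""
    return align(sum(push_sizes), REG_SIZE)

# Two unaligned sizes disagree; a single unaligned size does not.
print(total_align_each([40, 24]), total_align_sum([40, 24]))
print(total_align_each([40]), total_align_sum([40]))
```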
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>
VC4 hardware doesn't have a dispatch mask for the VS, so divergent
loops can have undefined/garbage contents in some execution channels,
potentially causing infinite loops and GPU hangs.
Fail shader linking instead of hanging the GPU when a divergent loop is
detected in a vertex shader.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
On V3D 4.2 (Raspberry Pi 4), there is a hardware bug where the binner
can trigger a GPU reset in some situations where primitives are
discarded, such as due to primitive restarts.
The way to avoid this is to force the binner to always do something, by
emitting the proper CL. In this case we decided to always set point
size, as it is a very simple and fast operation.
This fixes resets caused by
dEQP-VK.pipeline.monolithic.input_assembly.primitive_restart.*.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39826>
We just read this from the NIR and store it in iris_compiled_shader,
so there's no reason for the backend compiler to be involved.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
These days, our system value concept is just about iris_program
communicating to iris_state which values to upload into a UBO.
Nowhere in that process is the backend compiler involved, so it
doesn't make sense for there to be brw/elk mechanisms.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
iris needs this, but anv does not, and it's just a small wrapper around
common NIR lowering anyway. This also removes some brw/elk splitting.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
When these are correctly mapped to draw usage, we can't rely on them
being globally initialized in tu_init_hw(). They need to be
re-initialized after rp stomping.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39819>