The alpha channel seems to be internally returned as f16 (up-converted
to f32 is that is the dest type of the sam instruction). This expresses
1/3 and 2/3 with less precision than cl cts expects (f32).
This may be a test bug. But the format is not required.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40028>
In some cases CL has higher precision requirements for format support.
Add a PIPE_BIND_x flag so that drivers can expose formats in GL(ES) that
they cannot expose in CL.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40028>
The previous commit enable different command buffers to program the
same 3DSTATE_BINDING_TABLE_POOL_ALLOC instruction even though they
allocated different chunks of binding tables.
Now we can just predicate this programming and skip the stalling,
flushing & invalidation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39527>
We currently allocate 64KiB chunks of binding table pools for each
command buffers and program the 3DSTATE_BINDING_TABLE_POOL_ALLOC
instruction accordingly.
But 3DSTATE_BINDING_TABLE_POINTERS_* instructions can address 2^20
bytes. So it's possible to have 2 command buffers share the same
programming if they just add some offsets to their
3DSTATE_BINDING_TABLE_POINTERS_* programming and round down
3DSTATE_BINDING_TABLE_POOL_ALLOC addresses to 2^20.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39527>
The original MR switched to use a float raw_timestamp_period to scale
the raw timestamp outside of the gallium driver. This better matched
how vulkan works.
But unlike vulkan, gallium has timestamp related queries/APIs that
return already scaled time, resulting in small errors if the way the
scaling is done differs between driver scaling and frontend scaling.
The important thing is that any error introduced by scaling must be
the same error across APIs.
(In particular, a f64 value cannot preciesly represent an arbitrary
u64 value. OTOH the driver's scaling could be simply multiply be an
integer. But differing precision errors of the two approachs causes
problems when comparing between timestamps that are converted in
different ways.)
In some, but not all, cases this could be addressed by changing the
driver to use the same scaling function, but this is not always possible
(if, for ex, the scaling is done on the GPU CP). So switch back to
the original approach from !39995, using a pscreen->convert_timestamp()
callback, to put the control back in the hands of the driver.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40051>
This should definitely be an OR operation if MRT0 and MRT1 don't write
the same channels. This also requires to set the writemask manually
because when it's 0 (in case a dual-source output is missing), the
intrinsic computes the mask itself with the number of components.
No fossils-db changes on NAVI33.
Fixes: 45d8cd037a ("ac/nir: rewrite ac_nir_lower_ps epilog to fix dual src blending with mono PS")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14878
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39996>
Noticed while playing with pixel coord things.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40056>
The Vulkan spec says about VkFormatFeatureFlagBits:
If a format does not incorporate chroma downsampling (it is
not a “422” or “420” format) but the implementation supports
sampler Y′CBCR conversion for this format, the implementation
must set VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT.
Fixes: af062126ae
Signed-off-by: Benjamin Otte <otte@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39820>
There's no reason to have these checks be smeared between
radv_image_need_retile and radv_retile_transition.
Make radv_image_need_retile verify that the image might ever
need to have its displayable DCC updated.
Also, radv_image_need_retile should not care about the command
buffer. We should never try to do retile transition on a
command buffer that can't do compute to begin with.
Make radv_retile_transition only check whether the layout
we're transitioning to might involve reading the displayable
DCC, and perform retiling if so.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39990>
On the Rogue architecture add support for using a fragment passthrough
shader when there is no fragment shader present in a graphics
pipeline but the sample mask is required.
fix:
dEQP-VK.pipeline.monolithic.empty_fs.masked_samples
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Co-authored-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40048>
We were inconsistently handling MESA_SHADER_KERNEL, which for the most
part should be treated identically to MESA_SHADER_COMPUTE.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40027>
Configure the EOT setup for SPM EOT programs so that the generated
programs load the tile buffer into the output buffer before doing
the emit
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40002>
In subpasses preserve attachments are not used by the subpass but
their contents must be preserved throughout the subpass.
Add a list for the preserve attachments info specified by a subpass
and when determining a subpass attachments total uses check the
preserve attachments list and add it uses to the total.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40002>
The struct will also be used for preserve attachments in the next
commit.
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40002>
When calculating the dwords per pixel the output registers should
always be taken into account in addition to the number of tile buffers.
Fixes incorrect scratch buffer space calculation when both output
registers and tile buffers are emitted by a render.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Fixes: 3457f8083a ("pvr: Acquire scratch buffer on framebuffer creation.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40002>
The subpass merging optimisation check for when subpasses are using
tile buffers was in the incorrect location.
The current check is in a function called from two places but only
the first of these should have been doing the optimisation check.
This was incorrectly affecting the number of renders that subpass
merging could avoid.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Fixes: 10b6a0d567 ("pvr: Add support for generating render pass hw setup data.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40002>
Add support for the use of vertex input registers as additional general
purpose registers which previously was restricted to temporary
registers. Use of vertex input registers as additional general purpose
registers is not available for fragment shaders.
Vertex input registers are similar to temporary registers. The only
difference is that vertex input registers can contain pre-initialised
data when the shader starts.
By default, the number of vertex input registers used for register
allocation is the number of vertex input registers used for their
pre-initialised data rounded up to the nearest multiple of 4, as vertex
input registers are allocated in blocks of 4.
If PCO_DEBUG=alloc_extra_vtxins is used, a mimimum of 12 vertex input
registers are available for register allocation.
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39886>
Refactor pipeline creation path to use the vk_graphics_pipeline_state
structures provided by runtime instead of raw Vulkan CreateInfo structs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39834>
Empirically, TCS outputs have to be aligned to 64 bytes,
otherwise stale data may be read in rare cases. The exact
reason is not clear, but tests and proprietary driver behavior
strongly point at the need for 64 byte alignment.
Fixes tesselation issues in at least "Conan Exiles" but likely in many
more cases.
CC: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39890>
Venus uses vkImportSemaphoreFdKHR() with FD == -1 to signal a binary
semaphore and vkGetSemaphoreFdKHR() to wait on a binary semaphore.
Both KK and Dzn uses vk_sync_binary so supporting this special case for
import/export sync file will enable Venus host support.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39067>
Pack/unpack should be a lot faster than duplicating the subgroup op.
No fossil-db changes, but multiple people complained about this to me.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40024>
This probably doesn't do anything because sgpr_read_by_valu are all set
already for raytracing shaders.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825>
Wire up the existing Panfrost dithering infrastructure to the Vulkan
extension. The library already supports dithered formats via
pan_dithered_format_from_pipe_format(), so this plumbs the
VK_RENDERING_ENABLE_LEGACY_DITHERING_BIT_EXT flag through to the blend
descriptor emission and color attachment internal conversion paths.
Dithering is only applied to color attachments, not depth or framebuffer
preloads.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39781>
If a set layout is missing the driver can't compute the dynamic buffer
start offsets correctly. The only solution is to load these offsets from
an user SGPR.
To avoid adding more complexity, these offsets are re-emitted every
time dynamic buffers are dirty. That shouldn't matter because the
combination of dynamic buffers and independent sets is just super rare.
This fixes new VKCTS coverage
dEQP-VK.pipeline.pipeline_library.graphics_library.independent_sets_random.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39988>