This relies too much on the properties of the SPIRV-LLVM-Translator and is
required to load SPIR-Vs found in the OpenCL CTS.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16479>
The value should be in the bottom 24 bits, not in the top ones.
dEQP-VK.pipeline.sampler.* still passes. This fixes most of
dEQP-GLES31.functional.texture_border_clamp.formats.*depth* on angle.
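
For illustration, a minimal sketch of the two layouts, assuming the value is a
24-bit unorm depth stored in a 32-bit word. The helper names are hypothetical
and do not come from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a depth value in [0, 1] to a 24-bit unorm
 * and store it in bits 23:0 of a 32-bit word (bits 31:24 stay zero). */
static uint32_t
pack_z24_low(float depth)
{
   assert(depth >= 0.0f && depth <= 1.0f);
   /* Do the scaling in double so that 1.0 maps exactly to 0xffffff. */
   uint32_t z24 = (uint32_t)((double)depth * 16777215.0 + 0.5);
   return z24 & 0x00ffffffu;
}

/* The layout this change moves away from: value in bits 31:8. */
static uint32_t
pack_z24_high(float depth)
{
   return pack_z24_low(depth) << 8;
}
```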
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16570>
The struct is returned from a function, so in debug builds the address
may change after returning, and pointers to patched_s will be broken.
Pass the pointer to the patched stencil view as a parameter to
pan_preload_get_views to avoid this.
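
A minimal sketch of the pattern, using hypothetical stand-in types rather than
the real panfrost structures. The caller owns the storage for the patched
view, so the pointer stored in the returned struct stays valid no matter how
the struct itself is copied on return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real view types. */
struct view {
   int format;
};

struct views {
   const struct view *z;
   const struct view *s;
};

/* The patched stencil view lives in caller-provided storage instead of
 * inside the returned struct, so views.s never points at a temporary. */
static struct views
get_views(const struct view *zs, struct view *patched_s)
{
   assert(zs != NULL && patched_s != NULL);

   *patched_s = *zs;
   patched_s->format = zs->format + 1; /* placeholder for the real patching */

   return (struct views){ .z = zs, .s = patched_s };
}
```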
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16343>
This is a simplified version of the renderpass infrastructure which
tracks rendering info on the context and updates it incrementally to
try to reduce CPU overhead.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16476>
This would be needed by some NIR pass during linking. Given that NGG
settings are currently dispatched in many places, I don't think this
should hurt but it should be refactored at some point.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16404>
If the sample mask is being written, it means we want to discard some of the
samples generated, so we should not promote the fragment shader to early
tests, since that would not take into account the sample mask written from
the shader.
Fixes:
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_samples_4
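
A rough sketch of the resulting check. The struct and its fields are
hypothetical; only the sample mask condition comes from this change, the other
conditions are assumed here as typical reasons to keep late tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a fragment shader. */
struct fs_info {
   bool writes_depth;
   bool uses_discard;
   bool has_side_effects;
   bool writes_sample_mask;
};

/* Early tests run before the shader, so they cannot see anything the shader
 * writes. A shader-written sample mask therefore rules out the promotion. */
static bool
can_promote_to_early_tests(const struct fs_info *fs)
{
   assert(fs != NULL);
   return !fs->writes_depth &&
          !fs->uses_discard &&
          !fs->has_side_effects &&
          !fs->writes_sample_mask;
}
```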
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16626>
GPUs with the LINEAR_PE feature bit have the ability to render into linear
buffers. While this decreases PE cache effectiveness and is thus slower than
rendering into a (super-)tiled buffer, it's still preferable for cases where
we would need a blit to get into linear otherwise, i.e. when importing a
linear buffer or when linear is forced on allocation by usage flags or
modifiers.
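
The decision can be sketched roughly as follows. The struct and field names
are hypothetical and only mirror the cases listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a render target allocation. */
struct rt_request {
   bool gpu_has_linear_pe; /* LINEAR_PE feature bit */
   bool imported_linear;   /* buffer was imported with a linear layout */
   bool linear_forced;     /* usage flags or modifiers demand linear */
};

/* Render directly into the linear buffer only when the alternative would be
 * rendering tiled and blitting to linear afterwards. */
static bool
should_render_linear(const struct rt_request *req)
{
   assert(req != NULL);
   bool wants_linear = req->imported_linear || req->linear_forced;
   return req->gpu_has_linear_pe && wants_linear;
}
```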
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16615>
The blob only switches to single buffer state 3 when required, which seems
to be the case when any color or ZS target is <= 16bpp. Using 2 as the single
buffer state gives a very small 1-2% performance improvement on
fillrate-constrained rendering, so it likely affects some PE cache setting.
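
A minimal sketch of the selection, assuming the 16bpp threshold observed from
the blob holds. The function name and the way target sizes are passed in are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return 3 if any color or ZS target is <= 16bpp, otherwise 2.
 * target_bpp holds the bits per pixel of every bound target. */
static unsigned
pick_single_buffer_state(const unsigned *target_bpp, size_t count)
{
   assert(target_bpp != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (target_bpp[i] <= 16)
         return 3;
   }
   return 2;
}
```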
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16615>
Some glcts tests implement tons of tests because they verify
every possible combination of format/swizzle/target/...
They take a long time to execute and aren't possible to run
using multiple processes.
The proper way to fix it would be to split them in vk-gl-cts,
as is already done for some of them (e.g. es31fTextureGatherTests.cpp).
In the meantime, not running them makes glcts run almost
10 times faster.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16580>
This is required by lower_tg4_offsets, which splits one
sparseTextureGatherOffsetsARB call into four sparseTextureGatherOffsetARB
calls and merges their residency results into one.
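
Conceptually, the merge looks like the sketch below. It assumes each of the
four fetches reports residency as a plain boolean, which is a simplification
and not necessarily how the hardware or NIR encodes the residency code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The combined gather is resident only if all four single-offset gathers
 * touched resident texels. */
static bool
merge_residency(const bool resident[4])
{
   assert(resident != NULL);

   bool all_resident = true;
   for (size_t i = 0; i < 4; i++)
      all_resident = all_resident && resident[i];
   return all_resident;
}
```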
Fixes: ee040a6b63 ("radeonsi: enable ARB_sparse_texture2")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16599>