No drivers are using this anymore so we can delete it and not keep
maintaining this legacy code-path. If any drivers want this in the
future, they should use nir_lower_varst_to_explicit_types followed by
nir_lower_explicit_io.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
As we do not support stream output buffers we only count the primitives
processed by the pipeline. Use the correct query type.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5754>
For some reason, in order to get all tests to pass, pretty much all
hardware (across vendors) has to program in offset_units * 2. This fixes
dEQP-GLES3.functional.polygon_offset.float32_displacement_with_units.
While we're at it, add polygon offset clamp support.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5763>
For turnip, we use the "bindless" model on a6xx. Loads and stores with
the bindless model require a bindless base, which is an immediate field
in the instruction that selects between 5 different 64-bit "bindless
base registers", a 32-bit descriptor index that's added to the base, and
the usual 32-bit offset. The bindless base usually, but not always,
corresponds to the Vulkan descriptor set. We can handle the case where
the base is non-constant by using a bunch of if-statements, to make it a
little easier in core NIR, and this seems to be what Qualcomm's driver
does too. Therefore, the pointer format we need to use in NIR has a vec2
index, for the bindless base and descriptor index. Plumb this format
through core NIR.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5683>
Add the possibility to specify the source components. This is necessary
to let the UBO/SSBO index have more than one component, and it also lets
us remove a few hand-rolled load intrinsic definitions.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5683>
Previously we would match the start of the compatible string, in
a couple of cases, in order to match compatible strings like
"qcom,adreno-630.2". But these cases would always list a more
generic compatible (ie. "qcom,adreno") as a later choice. So if
we parse all the compatible strings, we can do a more precise
exact match.
This avoids us accidentially matching on "qcom,adreno-smmu" and
the hilarity that ensues.
Fixes: 5a13507164 ("freedreno/perfcntrs: add fdperf")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5762>
This was causing vkAcquireNextImageKHR to not signal the fences and
semaphores. In the case where the semaphore was brand new, this could
cause an unsignalled syncobj to be passed into execbuffer2 which it will
reject with -EINVAL leading to VK_ERROR_DEVICE_LOST. Thanks to Henrik
Rydgård who works on the PPSSPP project for helping me figure this out.
Fixes: ca3cfbf6f1 "vk: Add an initial implementation of the actual..."
Fixes: 778b51f491 "vulkan/wsi: Add a hooks for signaling semaphores..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5672>
We have an issue with early depth testing and discard, where
non-perfect counts count the tile if the early depth test succeeds.
We could spend a lot of effort to set this conditionally based
on the presence of the two conditions, but in the presence of
inherited queries let's try this first.
Changing PERFECT_ZPASS_COUNTS since I'm pretty sure this has a lower
performance impact than always using late depth testing.
CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3218
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5757>
The previous version assumes tess level outputs will only be written once
in the shader, however its not possible to guarantee that.
It also assumes all invocations will write all the levels, which is also
not guaranteed.
This is required to fix the "tesselation" and "terraintessellation" demos
with turnip.
The comment about nir_lower_io_to_temporaries in lower_tess_ctrl_block is
removed because nir_lower_io_to_temporaries specifically skips TESS_CTRL
shaders so the comment doesn't make sense.
The split load for tess levels workaround is removed, the new version only
has scalar access unless if ever gets vectorized.
This sets NIR_COMPACT_ARRAYS cap to avoid the glsl tess vec lowering with
gallium. It seems this will also disable "LowerCombinedClipCullDistance",
which I'm not sure was needed or not.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5744>
Accidentally broke this when rebasing the offending commit.
My use case with non-zero explicit offset is UV plane of UBWC NV12, and
only the UBWC slice offset is used for the UBWC sampler, so I didn't catch
it immediately.
Fixes: d53dc6c376 ("freedreno/fdl6: rework layout code a bit (reduce linear align to 64 bytes)")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5761>
glGetPerfMonitorCounterInfoAMD(..., ..., GL_COUNTER_RANGE_AMD, ...)
returned NAN (binary representation of uint64_t(-1) as float) as
a max value.
Fixes: 0fd4359733 ("iris/perf: implement routines to return counter info")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5473>
Some Piglit tests (rightfully) fail because of min >= max when exposed
to perf counters that do not explicitly define their max value.
Failing tests:
spec/amd_performance_monitor/api/test_counter_info
spec/amd_performance_monitor/vc4/test_counter_info
u32/u64 changes are no-ops.
Fixes: 4cd1cfb983 ("st/mesa: implement GL_AMD_performance_monitor")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5473>
I haven't tested all the limits, but these two should be enough
for driver writers to realise.
I've also submitted a minmax test for piglit to test this.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5727>
A3xx GPUs support RG8 and RGBA8, but not R8 for rendering. Add RG8 as
fallbacks for integer formats, and require a renderable format to be
picked for all R8 variants.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5748>
Let's consistently use the following code format instead of relying on
falling through to `default`:
if (!req)
return GL_INVALID_OPERATION;
break;
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5729>
The compton compositor is unmaintained, with a new fork named picom taking
its place. As with the other compositors (including compton), adaptive
sync should not be enabled.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5740>
ir3 already calculates the stride in the tess param bo, so use that instead
of a incorrect calculation. The calculation of per_vertex_output_size /
per_patch_output_size is wrong because it counts dwords instead of bytes,
and what it counts for per_vertex_output_size is a per-patch size because
the glsl type is already an array of # vertex/patch elements.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5743>
XXH32 is doing access through u32 *, and with strict aliasing the compiler
gets to assume that those are independent of the u16 writes we did in
fd6_texture_key setup, and based on various tweaks to the code, would
result in bad hashes computed after inlining. The failure was:
../src/util/hash_table.c:326:_mesa_hash_table_search_pre_hashed: Assertion
`ht->key_hash_function == ((void *)0) || hash == ht->key_hash_function(key)'
failed.)
By setting these two flags, we always take the unaligned,
memcpy-the-32-bit-data path. I believe this should be same perf on x86
(which will happily unaligned load 32 bits in the end), while it will be
slower on arm (where you have to a special unaligned load operation iirc).
This should still be far faster than our old hash.
Fixes: edd62619a1 ("freedreno: replace fnv1a hash function with xxhash")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5271>
The old code updated the viewport index on the first vertex in
a primitive, however it was picking the first vertex wrong
when used with geometry shaders.
This code has access to the prim info with the primitive lengths
so instead keep track of when a new primitive starts by tracking
the lengths and updating the viewport index then. The prim info
is only valid after a GS or prim assembly, so enable prim assembly
if a vertex shader ever uses viewport index.
This fixes:
piglit arb_viewport_array-render-viewport-2
KHR-GLES31.core.viewport_array.draw_to_single_layer_with_multiple_viewports,Fail
KHR-GLES31.core.viewport_array.draw_mulitple_viewports_with_single_invocation,Fail
KHR-GLES31.core.viewport_array.draw_multiple_layers,Fail
KHR-GLES31.core.viewport_array.depth_range,Fail
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5489>
* Remove scratch_bo from cmdbuffer, use a device-global bo instead, which
also includes border color (and eventually shaders for 3D blit path)
* Use CP_SET_BIN_DATA5_OFFSET to allow setting VSC buffer addresses only
once at the start of the cmdstream
* Use scratch bo mechanism for a resizable VSC buffer
* Use feedback from "vsc_draw_overflow" and "vsc_prim_overflow" values to
increase the size of VSC buffer when beginning to record a new cmdbuffer
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5570>