This fixes:
dEQP-GLES3.functional.fbo.completeness.renderable.texture.depth.depth_component_unsigned_short
as gallium will try an allocate a DS usage, but fallback to just
a sampling usage if that fails.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11646>
In addition to register pressure benefits from getting more fp16/int16,
this avoids i2imp's from standing in the way of loop unrolling.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11545>
Compiling with clang warns about an unused variable in
nir_lower_packing.
Tracking progress was added to nir_lower_packing in
adb157ddfd but the function
will ignore the progress from impl calls and always return
false.
This patch changes it to return the progress. It fixes the
warning and should enable validation calls in NIR_PASS when
progress is made.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: adb157ddfd "nir: Return progress from nir_lower_64bit_pack()"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11615>
color/zs are stored in a union so testing for zs.stencil_offset
isn't the correct way to test for stencil.
Fixes: 988f148db3 ("ac/surface: overlap color and Z/S fields using a union in gfx9_surf_layout")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11625>
With discrete GPUs, we aren't able to change mapping modes after we've
created a buffer. This is a limitation of TTM. However, we already
have a buffer cache and it's pretty likely that stuff in any given
memzone will end up with just the one mapping type anyway so this
shouldn't have much of a cost.
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11583>
Surely the compiler would sort that out, you would think. But no, my
debugoptimized build improves
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime by 25%
on my SKL from this change.
This was the slowest test in the GLES31 tests on APL in CI, at 22s. And
yes, we were spending around half of our runtime in this function.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11631>
The texture compression can also be used for 2D arrays and
3D textures.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11634>
The Vulkan specification says: "If VkDescriptorSetAllocateInfo::pSetLayouts[i]
does not include a variable count descriptor binding, then
pDescriptorCounts[i] is ignored". The previous code triggered an assertion
in such cases, and this patch fixes it.
v2: removed the offending assertion that is now always satisfied and
reworded the commit message with a reference to Vulkan spec.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4992
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11576>
Assert in dest_regs() that dst_count == 1, since most users of it will
blow up if they encounter multiple destinations, and split out the core
of writes_gpr() so that we can easily make code using it multi-dst
aware.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11565>
These were a holdover from before the src/dst split and are no longer
necessary. Just don't create any dest registers for instructions that
never have a destination.
This has the side-effect that it becomes easier to replace uses of
dest_regs() with a per-register thing, once we start adding support for
multiple destinations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11565>
When using subgroups there are additional restrictions to consider,
so for now we keep it simple and disable supergroup packing in that
scenario.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11620>
For some reason, this was implemented with the bulk of the compute
shader enablement, but this intrinsic is specific to subgroups and
thus was not really used. Also, its implementation was not correct,
since it was returning the element index within the subgroup, not
the subgroup index itself, which is the index of the batch in the
dispatch.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11620>
We're currently using the 32bit version which is dropping half the
bits of the 64bits values.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11607>
If a shader has no defined version force_glsl_version was
previous ignored and the shader would default to 110. This updates
the code so that those shaders are forced to a new level also.
We reused the existing code to make sure a sensible value is set
for the version.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11602>