"most" times it isn't necessary to insert any pipeline barriers when binding
descriptors, as GL requires explicit barrier usage which comes through a different
codepath
the exception here is when the following scenario occurs:
* have buffer A
* buffer_subdata is called on A
* discard path is taken || A is not host-visible
* stream uploader is used for host write
* CmdCopyBuffer is used to copy the data back to A
buffer A now has a pending TRANSFER write that must complete before the buffer is
used in a shader, so synchronization is required any time TRANSFER usage is detected
in a bind
there's also going to be more exceptions going forward as more internal usage is added,
so just remove the whole fake-barrier mechanism since it'll become more problematic
going forward
Cc: 21.3 mesa-stable
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14496>
ensure that vertex key data is always zeroed when changing last stage since it will
be updated before draw anyway and can only cause problems if left alone here
fixes the following caselist:
dEQP-GLES31.functional.shaders.builtin_constants.tessellation_shader.max_tess_evaluation_texture_image_units
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_quads_geometry_output_points
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.25
cc: mesa-stable
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14482>
The idea is to offer the driver a way to execute on a different queue
than the one the app is using for Present.
For instance, this could be used to make the DRI_PRIME blit asynchronous,
by using a transfer queue.
So instead of creating a command buffer to be executed on present using
the supplied queue, this commit uses an internal transfer queue to perform
the blit.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>
Fixes a leak:
==47470== 60 bytes in 1 blocks are definitely lost in loss record 1,790 of 1,904
==47470== at 0x484186F: malloc (vg_replace_malloc.c:381)
==47470== by 0x58EBA6A: compile_vertex_list (vbo_save_api.c:535)
==47470== by 0x58EDABF: wrap_buffers (vbo_save_api.c:1021)
==47470== by 0x58EDF97: upgrade_vertex (vbo_save_api.c:1134)
==47470== by 0x58EE52F: fixup_vertex (vbo_save_api.c:1251)
==47470== by 0x58EFE9E: _save_Normal3f (vbo_attrib_tmp.h:315)
Fixes: 69615d92a0 ("vbo/dlist: realloc prims array instead of free/malloc")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14474>
Fixes a leak:
==46154== 48 bytes in 1 blocks are definitely lost in loss record 1,571 of 1,905
==46154== at 0x48466AF: realloc (vg_replace_malloc.c:1437)
==46154== by 0x5FC98EC: util_idalloc_resize (u_idalloc.c:43)
==46154== by 0x5FC9C16: util_idalloc_alloc_range (u_idalloc.c:125)
==46154== by 0x56FDB9F: _mesa_EndList (dlist.c:13681)
Fixes: b703d7c15f ("dlist: store all dlist in a continuous memory block")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14474>
The coarser 32x32 cross-slice hashing mode seems to lead to better L1
and L2 utilization due to the improved execution locality, however it
can also lead to a bottleneck in a single slice, especially in
workloads that concentrate heavy rendering in small areas of the
screen (e.g. SynMark2 OglGeomPoint, OglTerrain*) -- This effect is
mitigated here by performing a permutation of the pixel pipe hashing
tables that ensures that adjacent rows map to pixel pipes as far away
as possible in the caching hierarchy.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
Note that this has an effect even for unfused native die platforms,
since the pixel pipe hashing tables we intend to program aren't
equivalent to the hardware's defaults on such configs.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
Unlike the Gen11 code, this requires us to allocate a pipe_resource
for the pixel pipe hashing tables and hold a reference to it from the
context, since we need to add it to the validation list of every
batch, the tables may be accessed by the hardware at any time after
they're specified via 3DSTATE_SLICE_TABLE_STATE_POINTERS.
Note that this has an effect even for unfused native die platforms,
since the pixel pipe hashing tables we intend to program aren't
equivalent to the hardware's defaults on such configs.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
This starts off with the simplest possible pixel hashing table
calculation that just assigns consecutive indices (modulo N) to
adjacent entries of the table, along the lines of the existing
intel_compute_pixel_hash_table(). The same function will be improved
in a future commit with a more optimal calculation.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
In order to avoid some duplication between the GL and Vulkan driver,
which will get worse as we introduce additional code in order to
handle more recent generations.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
It's now an array with 7 tables, each table is intended to specify the
pixel pipe hashing behavior for every possible slice count between 2
and 8, however that doesn't actually work, among other reasons due to
hardware bugs that will cause the GPU to erroneously access the table
at the wrong index in some cases, so in practice all 7 tables need to
be initialized to the same value.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
The state cache invalidation shouldn't be necessary on recent
platforms. On ICL it *seems* to be required to get the hardware to
pick up an updated indirect clear color, so this change is only
applied to TGL platforms and later for the moment.
On some DG2 configs this seems to improve SynMark2/OglDrvRes by 16.0%
±0.1%, n=8.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
The current packed dispatch assumptions for fragment shaders seem to
be the reason that the fs-readFirstInvocation-uint-loop Piglit
test-case for the ARB_shader_ballot extension fails on DG2 in
combination with the patches in this series that enable pixel pipe
hashing (thanks Jordan for reporting the regression). I've confirmed
that the brw_fs_test_dispatch_packing() test fails on DG2 hardware for
fragment shaders, while it succeeds for other shader stages,
indicating that the PSD hardware no longer guarantees packed dispatch.
Disable it.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
Some more refactoring in d3d12_draw.cpp to re-use a bunch of state
and descriptor management, and some refactoring of the dirty states.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>
We don't key off of it to try to figure out if we need to produce
a new shader variant, so there's no need to set it when changing
properties that feed into variants. If we do have a new shader or
variant at draw time, we'll produce a new PSO without this.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>
It doesn't depend on the clc data being provided externally, so no
need to tie it there, we can re-use it for GL and Vulkan compute.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>
TGSI reserves 2 components for the coord in the first operand vector, even
for 1D. Fixes r600 failure with shadow1d.
Fixes: 390a3fcdc4 ("nir_to_tgsi: Add support for TXP.")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14322>
Actually, no, there's no need to do anything, just update some
comments for the record. An earlier revision of this change that
implemented the workaround text to the letter required no less than 8
new PIPE_CONTROLs throughout the tree. However Felix Degrood noticed
that the cost of some of the PIPE_CONTROLs was showing up in workloads
like Shadow of the Tomb Raider. The Windows driver wasn't emitting
many of those pipe controls, contrary to the W/A instructions, so we
engaged in a back and forth with the hardware team, who concluded that
the original suggested workaround was unnecessarily strict, and the
Windows driver's behavior acceptable. It turns out that Wa_1408224581
we had already implemented for TGL is roughly equivalent to the
Windows behavior, so no need to do anything new after all.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>
XeHP platforms require the invalidation of the instruction cache after
a STATE_BASE_ADDRESS change due to a hardware bug potentially leading
to instruction cache pollution. Note that the workaround text says
it's applicable "DG2 128/256/512-A/B", however it's also marked as
permanent and not confirmed to be fixed in any specific steping, so we
apply it to all Gfx12HP platforms.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>
Eliminates a ETC1 special case. In fact this unit conversion applies to
all formats; the original code path works since ETC1 is the only format
with blocks bigger than 1x1 supported by vc4 (I assume).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
Different parts of our codebase disagree on whether spatial
coordinates/dimensions are given in pixels or blocks, which differ by a
constant factor for block-compressed formats. This disagreement
manifests as incorrect results accessing block-compressed formats.
To resolve this, define the public tiling routines to take their
coordinates in pixels, and align the relevant code in Panfrost
accordingly.
Fixes rendering glitches in Factorio, as well as a pile of piglits on
Panfrost. It should also fix glTexSubImage() with ETC1 on Lima, but
there are no tests for this in dEQP/Piglit.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> [dEQP/Lima]
Tested-by: Erico Nunes <nunes.erico@gmail.com> [Piglit/Lima]
Reported-by: Icecream95 <ixn@disroot.org>
Closes: #5560
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
There is a lot of unit confusion in Gallium due to pixels versus blocks
matching only with uncompressed textures. Add a helper to do a common
pixels->blocks unit conversion required in multiple drivers.
v2: Rename dst->blocks, src->pixels to avoid confusion about the units
to casual readers (Mike).
Note to mesa-stable maintainers: this is marked as Cc: mesa-stable so
the next patch (a set of bug fixes for Lima and Panfrost) can be
backported. It's not a bug fix in its own right, of course.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
This updates many places where 0 is used as NULL pointer.
There are a few warnings left when I build the default
configuration but they either relate to code
outside of mesa or where "None" is used instead.
Found with static analysis (smatch)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12174>