There's no need to separate them except that it was easier before; no
one will enable the second without also enabling the first. Now that
Mesa will merge the states for us, we can go ahead and merge them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25076>
We only need to emit MSAA state once per subpass at most, unless the
pipeline switches primitive types or for framebuffer-less subpasses
(which always use sysmem anyway). Therefore it seems like draw state
skipping isn't going to bring much benefit here, and having it as a draw
state in the first place is a remnant of how this used to be part of the
pipeline state.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25076>
While we don't need to emit all of the unused mesh/task states when mesh
is disabled, if we don't have them we fail some assertions in the
difference checks due to the corresponding state being empty.
This may happen when going from a mesh pipeline to a non-mesh one, or
one that uses task shaders to one that doesn't.
It may be possible to avoid having to do this, but I'd rather start from
a working state and optimize it later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
Any partially packed instructions should always be pre-packed by
genX_pipeline.c.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
3DSTATE_VFG was moved into a section that only gets emitted for legacy
pipelines, not mesh pipelines.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0ce772bd19 ("anv: split 3DSTATE_VFG emission")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
This is to inform you of some planned downtime in the LAVA lab as follows:
* Start: 2023-09-11 08:00 BST (UTC+1)
* End: 2023-09-11 12:00 BST (UTC+1)
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25141>
For some specific texture sizes, notably some texture sizes with width
4096, block stride calculation could end up calculating stride 256 which
is an invalid value.
In those specific cases, this could cause rendering artifacts or
application/driver crashes.
Cc: mesa-stable
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25084>
In order to compile monolithic shaders with pipeline libraries, we need
to keep the NIR around for inlining recursive stages.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21929>
the array dimensionality needs to match nir_add_inlinable_uniforms even if
only the first member is used
Fixes: 0c0fb216dd ("nir/inline_uniforms: Allow possibility of more than one UBO")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25063>
this has a lot of caveats:
* extension must be supported
* resource must have usage bit set
* resource must not have any pending batch usage
* resource must be in supported layout
if all of these conditions pass, then HIC can be used for direct image subdata
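The checks above could be sketched as a single predicate. This is only an illustration: the struct and field names below are hypothetical stand-ins, not the driver's actual types, and the real code may order or express the checks differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only. */
struct example_screen {
   bool has_host_image_copy;      /* the extension is supported */
};

struct example_resource {
   bool host_transfer_usage;      /* created with the required usage bit */
   unsigned pending_batch_uses;   /* batches that still reference the resource */
   bool layout_supports_hic;      /* current layout is usable for host copies */
};

/* Returns true only when every caveat in the list is satisfied. */
static bool
example_can_use_hic(const struct example_screen *screen,
                    const struct example_resource *res)
{
   assert(screen != NULL && res != NULL);
   return screen->has_host_image_copy &&
          res->host_transfer_usage &&
          res->pending_batch_uses == 0 &&
          res->layout_supports_hic;
}
```

If any single check fails, the caller would fall back to whatever upload path it used before.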
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24775>
Flushing and invalidating caches isn't necessary for workgroup scope
fences. In fact, the DP_FLUSH_TYPE docs (BSpec 54041) say:
"If the fence scope is Local or Threadgroup, HW ignores the flush
type and operates as if it was set to None(no flush)"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
With the new nir_opt_barrier_modes() pass, we may encounter control
barriers with no memory modes set, such as:
@barrier () (execution_scope=WORKGROUP, memory_scope=WORKGROUP, mem_semantics=ACQ|REL, mem_modes=0)
The DXIL validator documentation [1] mentions an
INSTR.BARRIERMODENOMEMORY validation rule:
"sync must include some form of memory barrier - _u (UAV) and/or
_g (Thread Group Shared Memory). Only _t (thread group sync) is
optional."
We were generating a dx.op.barrier instruction with only one flag,
DXIL_BARRIER_MODE_SYNC_THREAD_GROUP. This seems to run afoul of the
above validator rule. So, this patch adjusts the code generator to
set DXIL_BARRIER_MODE_UAV_FENCE_THREAD_GROUP too, whenever
UAV_FENCE_GLOBAL isn't required.
[1] https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst
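A rough sketch of the adjusted flag selection follows. The enum names and bit values here are made up for the example and are not the DXIL encoding; the three boolean inputs are likewise assumptions about what the code generator knows at this point.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; not the real DXIL values. */
enum example_barrier_mode {
   EX_SYNC_THREAD_GROUP      = 1 << 0,
   EX_UAV_FENCE_GLOBAL       = 1 << 1,
   EX_UAV_FENCE_THREAD_GROUP = 1 << 2,
   EX_GROUPSHARED_MEM_FENCE  = 1 << 3,
};

static unsigned
example_barrier_flags(bool is_control_barrier, bool needs_global_uav,
                      bool needs_shared)
{
   unsigned flags = 0;

   if (is_control_barrier)
      flags |= EX_SYNC_THREAD_GROUP;

   /* Always include some UAV fence: the global one if required,
    * otherwise the thread-group one, so a memory flag is never missing. */
   if (needs_global_uav)
      flags |= EX_UAV_FENCE_GLOBAL;
   else
      flags |= EX_UAV_FENCE_THREAD_GROUP;

   if (needs_shared)
      flags |= EX_GROUPSHARED_MEM_FENCE;

   /* The validator rule quoted above: at least one memory flag is set. */
   assert(flags & (EX_UAV_FENCE_GLOBAL | EX_UAV_FENCE_THREAD_GROUP |
                   EX_GROUPSHARED_MEM_FENCE));
   return flags;
}
```

With this shape, a control barrier that arrives with mem_modes=0 yields the sync flag plus the thread-group UAV fence rather than the sync flag alone.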
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
Most drivers will want nir_opt_barrier_modes() to optimize out
unnecessary memory barrier modes. However, virgl has to translate
back to GLSL, which means it can really only handle partial memory
barriers in compute shaders today, because there isn't a proper
way to express them otherwise. Just ask nir_to_tgsi to promote
these back to full barriers as a workaround.
See KHR-GL43.shader_storage_buffer_object.advanced-readWrite-case1
on virpipe-on-gl as a case where this hack is needed.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
Originally written by Ian Romanick for the Intel backend, but ported
to the new nir_opt_barrier_modes() common optimization pass. Ian's
original explanation and commit message follow:
Shared memory only exists within a workgroup, so synchronizing it beyond
workgroup scope is nonsense.
Basically every SPIR-V compiler generates operations like
    OpMemoryBarrier(/*Memory*/Device,
                    /*Semantics*/AcquireRelease | WorkgroupMemory)
This is suggested in numerous places, including
https://github.com/KhronosGroup/GLSL/blob/master/extensions/khr/GL_KHR_vulkan_glsl.txt.
Even Mesa's glsl_to_nir pass does this. This advice, which has been
copy-and-pasted everywhere, is contrary to issue 13 in the original
GL_ARB_compute_shader spec:
"Since shared memory is only accessible to threads within a single
work group, memoryBarrierShared() also only requires synchronization
with other threads in the same work group."
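As a minimal sketch of the idea, assuming a barrier carries a scope and a bitmask of memory modes: when the only mode is shared memory, any scope wider than workgroup can be narrowed to workgroup. The enums below are simplified, hypothetical stand-ins and not NIR's own definitions.

```c
#include <assert.h>

/* Hypothetical scopes, ordered from narrowest to widest. */
enum example_scope {
   EX_SCOPE_INVOCATION,
   EX_SCOPE_SUBGROUP,
   EX_SCOPE_WORKGROUP,
   EX_SCOPE_QUEUE_FAMILY,
   EX_SCOPE_DEVICE,
};

/* Hypothetical memory mode bits. */
enum example_mode {
   EX_MODE_SHARED = 1 << 0,
   EX_MODE_SSBO   = 1 << 1,
   EX_MODE_IMAGE  = 1 << 2,
   EX_MODE_GLOBAL = 1 << 3,
};

/* Narrow the memory scope of a barrier that only affects shared memory. */
static enum example_scope
example_narrow_scope(enum example_scope scope, unsigned modes)
{
   assert(scope <= EX_SCOPE_DEVICE);

   /* Shared memory is invisible outside the workgroup, so a wider
    * scope adds nothing when no other mode is involved. */
   if (modes == EX_MODE_SHARED && scope > EX_SCOPE_WORKGROUP)
      return EX_SCOPE_WORKGROUP;

   return scope;
}
```

Barriers that also cover SSBOs, images, or global memory keep their original scope in this sketch.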
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
Many shaders issue full memory barriers, which may need to synchronize
access to images, SSBOs, shared local memory, or global memory.
However, many of them only use a subset of those memory types - say,
only SSBOs.
Shaders may also have patterns such as:
1. shared local memory access
2. barrier with full variable modes
3. more shared local memory access
4. image access
In this case, the barrier is needed to ensure synchronization between
the various shared memory operations. Image reads and writes do also
exist, but they are all on one side of the barrier, so it is a no-op for
image access. We can drop the image mode from the barrier here too.
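The reasoning above can be illustrated on straight-line code: a mode only needs to stay on the barrier if it is accessed on both sides. The sketch below is a simplification under that assumption; the names are hypothetical, and it ignores control flow and loops, which a real pass would have to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memory mode bits for this sketch. */
enum example_mem_mode {
   EX_MEM_SHARED = 1 << 0,
   EX_MEM_SSBO   = 1 << 1,
   EX_MEM_IMAGE  = 1 << 2,
   EX_MEM_GLOBAL = 1 << 3,
};

/* access_modes[i] holds the mode bits touched by the i-th memory access,
 * in program order. The barrier sits between access barrier_pos - 1 and
 * access barrier_pos. Returns the barrier modes worth keeping. */
static unsigned
example_prune_barrier_modes(const unsigned *access_modes, size_t count,
                            size_t barrier_pos, unsigned barrier_modes)
{
   unsigned before = 0, after = 0;

   assert(barrier_pos <= count);

   for (size_t i = 0; i < barrier_pos; i++)
      before |= access_modes[i];
   for (size_t i = barrier_pos; i < count; i++)
      after |= access_modes[i];

   /* A mode seen on only one side has nothing to order against. */
   return barrier_modes & before & after;
}
```

For the four-step pattern listed above (shared access, full barrier, shared access, image access), only the shared mode survives in this sketch, since image access appears on one side only.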
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>