Since the dirty range started out as 0..0, you would have 0..VBend as the
new dirty range on the first draw, and if your VB was >32b then you'd
flush every time you used it. Instead, if there's no existing dirty range
then just set it to our new VB's range.
Perf results with zink+anv on my CFL:
sauerbraten: +24.8182% +/- 0.602077% (n=5)
portal-2-v2.trace: +4.64289% +/- 0.285285% (n=5)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21370>
"Use 3DSTATE_CONST command for individual shaders instead of
3DSTATE_CONST_ALL COMMAND"
On gen 12.0 platforms, 3DSTATE_CONSTANT_ALL command is not processed
correctly in certain cases.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21301>
Patch changes comment to refer to the lineage 14014097488, this
workaround applies for ICL as well.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20952>
When writing descriptor with a null buffer/image we expect that
writing 0 will point to the null surface. For that to work the null
surface has to be in the bindless surface heap.
This fixes some new failures in dEQP-VK.robustness.* tests once
rewritten from the NV_ray_tracing to KHR_ray_tracing extension.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4ceaed7839 ("anv: split internal surface states from descriptors")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7762
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20953>
See the comments in emit_apply_pipe_flushes(). Flushing HDC is not
sufficient in GPGPU mode, and we need to set the untyped data port flush
bit as well.
Fixes many dEQP-VK failures with INTEL_COMPUTE_CLASS=1 on Alchemist.
Fixes: 1067ec90a5 ("anv: Update PIPELINE_CONTROL flush when switching pipeline mode in TGL+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20774>
Field must be disabled if any render targets have integer
format, additionally for Gfx12+ field must be disabled when
num multisamples > 1 or forced multisample count > 1.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20671>
Some binding table entries have been identify as unused in the shaders
by the push constant analysis pass. We can just put the null entry in
there.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b49b18f0b7 ("anv: reduce BT emissions & surface state writes with push descriptors")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20555>
This 2 PIPELINE_CONTROL flushes are not necessary for TGL and newer
and also it have different requirements of flush, so here doing
this two changes at the same time.
As no ANV_PIPE_INVALIDATE_BITS is set as parameter of
anv_add_pending_pipe_bits(),
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer) will only emit one
PIPELINE_CONTROL.
BSpec: 44505
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20501>
For INTEL_MEASURE, ensure all prior instructions completed before
timestamp taken. Continue to support no CS flush case for Perfetto.
CS stall was dropped from pipecontrol when adding u_trace support.
Fixes: cc5843a573 ("anv: implement u_trace support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20502>
After using task shader, we need to emit a zero URB state and a
nullprim (empty pipe control) before rendering with primitives.
After this, a normal URB state needs to be returned, this will
happen when pipeline batch is emitted during pipeline switch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20334>
To avoid the replication of code to properly emit PIPELINE_SELECT.
init_compute_queue_state() had a different emit of PIPELINE_SELECT but
as there is no compute engine in GFX VER 11 we are safe with the
differences.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20444>
The loop going from 0 to max_draw_count multiplies the value which
could potentially overflow.
Fixes Coverity CID 1517852
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3596a8ea7a ("anv: factor out some indirect draw count entry points")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20436>
This reverts commit b1126abb38.
This breaks all hell at least on DG2, as there are several cases left
where current_pipeline gets checked against GPGPU to decide what to do,
and the value doesn't match that of ANV_HW_PIPELINE_STATE_COMPUTE.
On top of that, it also misses checking for
ANV_HW_PIPELINE_STATE_RAYTRACING.
Then there's the fact that in some cases, current_pipeline will be
UINT32_MAX, because it's the original undefined state and also used
after executing a secondary command buffer because we are not tracking
on which pipeline did the secondary left us.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7910
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20349>
Gen11 added a nifty feature where we have three custom system-generated
values called extended parameters that we can set to any 32-bit values
we want. These work just like vertex and instance ID and are controlled
in the pipeline by the 3DSTATE_SGVS_2 packet. They are provided to the
draw call either by extra DWORDs on the end of 3DSTATE_PRIMITIVE or by
storing values to more state registers.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>
Delete anv_CmdDispatch, anv_CmdSetDeviceMask, and
anv_GetDeviceGroupPeerMemoryFeatures so that the vk_common_*
versions will be used instead. This will avoid repeated code.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20218>
On cmd_buffer_emit_scissor(), if VkViewport height or width are set to
a value lower than 1.0, y_max or x_max can be attributed negative values,
causing an overflow. That leads to ScissorRectangleYMax or
ScissorRectangleXMax to be set to values on an unsupported range.
Clamping x_max and y_max in the valid range solves the problem.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7471
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20200>
For Tiger Lake and onward, we generally don't need to ambiguate the CCS
before accessing it. This is safe for two reasons:
- Tiger Lake and onward treat all CCS values as legal.
- We enable compression on all writable image layouts. The CCS will
receive all writes and will therefore always be valid.
When dealing with modifiers, we continue to allow ambiguates in some
instances.
Before this patch, I found ~19.5k ambiguates in Wolfenstein:
Youngblood's Riverside benchmark (note that this includes manually
entering the benchmark and exiting the app). With this patch, the number
of ambiguates goes down to zero.
Improves performance of Fallout 4 at 1080p/High settings on Arc A380 by
around 22%.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20118>
The main goal is to be able to generate genX_bits.h for those
structures so we can get generated field offsets.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20011>