When enabling the caching of index,vertex data in the L3 RO Cache
(L3BypassDisable), we need to use L3ReadOnlyCacheInvalidationEnable
to invalidate cache when buffer is modified by CPU/GPU.
Ref: bspec 46314
Fixes: ed8f2c4cbe ("iris: Cache VB/IB in L3$ for Gen12")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14815>
To flush the blitter, we need to use MI_FLUSH_DW rather than the usual
PIPE_CONTROL we use on the 3D engine. Most of our code is set up to
suggest flushes via PIPE_CONTROL commands, however, so we hackily just
emit MI_FLUSH_DW when they ask for any kind of PIPE_CONTROL flush.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14912>
The format on this platform is slightly different from the one used on
TGL. Also it's part of the surface state instead of an aux-map.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14355>
Implements a workaround for HSDES#14014414195. Note that this change
deviates heavily from the workaround suggested in the HSDES, since all
of the suggestions are either costly at runtime or outright
non-compliant, so they would require us to apply the workaround
selectively for affected applications.
Instead of adding hacks to the compiler that manually implement the
LOD computation of 3D texturing operations in the shader, initialize
an extra sampler state structure in the driver that has anisotropic
filtering forced off, and use it instead of the normal sampler state
structure whenever a 3D texture is bound to the same sampler unit.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14489>
Broadwell introduced new fields in 3DSTATE_SBE which allow us to ask
the hardware to override Primitive ID for us, rather than requiring us
to turn on attribute swizzling and specify per-attribute overrides in
3DSTATE_SBE_SWIZ. We unconditionally enable attribute swizzling today,
but this is a step toward letting us think about disabling it in the
future.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14210>
v2: Fixup gpu_id computation, use minor of /dev/dri/* % 128 since we
don't know whether we get card0 or renderD128 for instance.
(Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
Acked-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13996>
The coarser 32x32 cross-slice hashing mode seems to lead to better L1
and L2 utilization due to the improved execution locality, however it
can also lead to a bottleneck in a single slice, especially in
workloads that concentrate heavy rendering in small areas of the
screen (e.g. SynMark2 OglGeomPoint, OglTerrain*) -- This effect is
mitigated here by performing a permutation of the pixel pipe hashing
tables that ensures that adjacent rows map to pixel pipes as far away
as possible in the caching hierarchy.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
Unlike the Gen11 code, this requires us to allocate a pipe_resource
for the pixel pipe hashing tables and hold a reference to it from the
context, since we need to add it to the validation list of every
batch, the tables may be accessed by the hardware at any time after
they're specified via 3DSTATE_SLICE_TABLE_STATE_POINTERS.
Note that this has an effect even for unfused native die platforms,
since the pixel pipe hashing tables we intend to program aren't
equivalent to the hardware's defaults on such configs.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
In order to avoid some duplication between the GL and Vulkan driver,
which will get worse as we introduce additional code in order to
handle more recent generations.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
Actually, no, there's no need to do anything, just update some
comments for the record. An earlier revision of this change that
implemented the workaround text to the letter required no less than 8
new PIPE_CONTROLs throughout the tree. However Felix Degrood noticed
that the cost of some of the PIPE_CONTROLs was showing up in workloads
like Shadow of the Tomb Raider. The Windows driver wasn't emitting
many of those pipe controls, contrary to the W/A instructions, so we
engaged in a back and forth with the hardware team, who concluded that
the original suggested workaround was unnecessarily strict, and the
Windows driver's behavior acceptable. It turns out that Wa_1408224581
we had already implemented for TGL is roughly equivalent to the
Windows behavior, so no need to do anything new after all.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>
XeHP platforms require the invalidation of the instruction cache after
a STATE_BASE_ADDRESS change due to a hardware bug potentially leading
to instruction cache pollution. Note that the workaround text says
it's applicable "DG2 128/256/512-A/B", however it's also marked as
permanent and not confirmed to be fixed in any specific steping, so we
apply it to all Gfx12HP platforms.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>
On XeHP+, Binding Table Pointers are an offset relative to the Surface
State Base Address anymore. Instead, they are relative to the State
Binding Table Pool Address, which is set by the command above.
We emit that command (pointing to the same address as the Surface
State Base Addresss), and everything should stay working as before.
Reworks:
* Jordan: Add iris
* Jordan: Drop i965
* Ken: Set MOCS to avoid a major perf impact. (Found by Felix DeGrood.)
* Jordan: Shrink size from 2MiB to actual iris, anv usage
* Lionel: Add BINDING_TABLE_POOL_BLOCK_SIZE
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4995
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Add Iris, adjust sizes]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13992>
Rework:
* Jordan: Set MOCS after
7b78b2fcac ("intel/genxml: Assert that all MOCS fields are non-zero on Gfx7+")
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14212>
We don't current enable post sync operations, but it is probably
better to set them to "internal" MOCS than to remove the non-zero
checking for this genxml field.
Reworks:
* Fix COMPUTE_WALKER in cmd_buffer_trace_rays (s-b Jason)
Fixes: 7b78b2fcac ("intel/genxml: Assert that all MOCS fields are non-zero on Gfx7+")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13624>
For example, this allows us to take advantage of command-streamer
based register offsets in mi_builder.
Ref: 06cf838cbd ("intel/mi_builder: Support gen11 command-streamer based register offsets")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13652>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is disabled stream output targets, MOCS shouldn't matter, as
there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for these null stream
output buffers; we can just assume a MOCS for generic internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is 3DSTATE_VERTEX_BUFFERS where we set NullVertexBuffer.
It shouldn't matter here, as there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for null vertex buffers.
We can assume an internal buffer and request isl's vertex buffer MOCS.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we missed
setting a non-zero MOCS was in 3DSTATE_CONSTANT_ALL packets which fully
disable all constant buffers. (If any constant buffer was present, we
would set an actual MOCS value.)
MOCS really shouldn't matter here, as there are no actual constant
buffers to be cached. That said, it should be harmless to do so, and
we can just assume a generic MOCS for internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We were leaving this blank due to a Broadwell restriction, causing our
constant buffers to be uncached. We later fixed this for Gfx12+, but
left Gfx9-11 without a fix. We should specify one.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
isl now uses info->mocs regardless of whether there's any actual
depth/stencil/HiZ buffers involved, so pass it a legitimate one,
rather than zero. When we have entirely NULL surfaces, we just
default to isl's MOCS value for an internal depth buffer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We don't use bindless sampler states today, but when we do, we'll want
them to have proper MOCS values. This also avoids asserts in upcoming
patches which enforce that MOCS isn't zero.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
Tigerlake revision 0 is an early stepping that should not be used in
production anywhere, so this code was only used for hardware bringup.
We can drop the unnecessary workarounds. This also keeps them from
triggering on early steppings of other Gfx12 parts.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13266>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
Scratch patching code in iris_upload_dirty_render_state (see MERGE_SCRATCH_ADDR
calls) assumes that in all shader stages derived_data field stores 3DSTATE_XS
packet first.
This is not true for TESS_EVAL (DS), so we end up patching 3DSTATE_TE
instead of 3DSTATE_DS leading to DWordLength becoming 11 instead of 9
(9 == 3DSTATE_DS.DWordLength, 2 == 3DSTATE_TE.DWordLength, and 9|2 == 11),
and hardware hanging on the next instruction.
Fix this by reversing the order of packets for TESS_EVAL stage.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5499
Fixes: 4256f7ed58 ("iris: Fill out scratch base address dynamically")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13358>
This patch adds Tessellation Distribution on top of Geometry
Distribution. Using recommended values based on performance studies
across a range of workloads.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
Using recommended values based on performance studies across a range
of workloads.
Rework:
* Always enable geometry distribution
* Set ListCutIndexEnable if primitive restart is enabled
* Set distribution mode based on TEEnable
v2:
- Flag missing IRIS_DIRTY_VFG bit (Ken)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
Clover likes to do this to clear our a bunch of samplers without
actually passing an array of NULL pointers. It's easy enough to
handle in iris.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13072>
This allows us to delete iris_resource_unfinished_aux_import, which
incorrectly assumed that a CCS-enabled resource needs an aux BO.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12795>