Commit graph

809 commits

Author SHA1 Message Date
Jason Ekstrand
dfedeccc13 intel: Only set VectorMaskEnable when needed
For cases with lots of very small primitives, this may improve
performance because we're not executing those dead channels all the
time.

Shader-db reports no instruction or cycle-count changes.  However, by
hacking up the driver to report when this optimization triggers, it
appears to affect about 10% of shader-db.

v2 (Kenneth Graunke): Always enable VMask prior to XeHP for now,
because using VMask on those platforms allows us to perform the
eliminate_find_live_channel() optimization.  However, XeHP doesn't
seem to have packed fragment shader dispatch, so we lose that
optimization regardless, and there's no reason not to avoid vmask.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1054>
2022-05-27 21:52:48 +00:00
Kenneth Graunke
27314718a3 intel: Drop Wa_1409226450 (stall before instruction cache invalidation)
Production Tigerlake and DG1 hardware shouldn't need this workaround.
It was only needed on the very first steppings which never went public.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16575>
2022-05-19 21:31:45 +00:00
Lionel Landwerlin
1c077ca9c0 u_trace/anv/iris: drop cs argument for recording traces
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16605>
2022-05-19 19:04:28 +00:00
Jason Ekstrand
62f0677223 iris: Set BindingTableEntryCount for compute shaders
This may slightly increase perf somewhere because the hardware can now
pre-cache binding tables.  The real feature is that INTEL_DEBUG=bat now
dumps out surface states for compute.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15759>
2022-05-11 23:47:08 +00:00
Karol Herbst
d98b82a103 iris/cs: take buffer offsets into account for CL
Sadly we pass in an offset, which the driver can't ignore

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16348>
2022-05-10 03:37:44 +00:00
Rohan Garg
581035b3a9 iris: set a default EDSC flag
anv sets the default EDSC flag, do the same for iris too

Fixes: 5ae278da18 ("iris: use vtbl to avoid multiple symbols, fix state base address")

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15905>
2022-04-13 12:36:01 +00:00
Kenneth Graunke
a969ad1ddf iris: Demote DC flush to HDC flush in cache tracker
FLUSH_HDC is sufficient to flush things out to L3, so we'd rather
use that where possible.  It's also emulated via DATA_CACHE_FLUSH
on platforms where it isn't supported, so we can use it unconditionally.

We still use DATA_CACHE_FLUSH for invalidating the data cache, and to
flush the DC-tagged cachelines in L3 to be globally-observable.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
1c8b4940eb iris: Emit flushes for push constant source buffers
Push constant loading is not coherent with L3 according to the document
that describes the hardware change for the vertex buffer L3 Bypass
Disable field.

If we've updated a push constant buffer with say, a blorp_buffer_copy,
we may need to flush both the render cache and the tile cache.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
bbd5714a7e iris: Use cache-tracker for draw count flushing
We should be using the cache tracker for this.  We can consider
this access IRIS_DOMAIN_OTHER_READ now that it's the catch-all
non-L3-coherent read-only access domain.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
43e3747eea iris: Extend the cache tracker to handle L3 flushes and invalidates
Most clients are L3-coherent these days.  However, there are some
notable exceptions, such as push constants, stream output, and command
streamer memory reads and writes.

With the advent of the tile cache, flushing the render or depth caches
alone are no longer sufficient for memory to become globally-observable.
For those, we need to flush the tile cache as well.  However, we'd like
to avoid that for L3-coherent clients, as it shouldn't be necessary,
and is expensive.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
8cd7e94eca iris: Add a separate PIPE_CONTROL_L3_READ_ONLY_CACHE_INVALIDATE bit
This will let us use it without performing a VF cache invalidation,
should we want to do that.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
536eee31d0 iris: Fix UBO cache tracking for the !indirect_ubos_use_sampler case
On Tigerlake, we use the data cache for reading indirect UBOs instead
of the sampler.  But we still use the constant cache for direct UBO
access, so unfortunately we may access it through two different domains.

To work around this, we add a new domain for pull constants (UBOs),
which will be either constant+texture or constant+data.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
d39bd7ba70 iris: Split out an IRIS_DOMAIN_SAMPLER_READ domain from OTHER_READ
The bulk of IRIS_DOMAIN_OTHER_READ domain usage was the 3D sampler, but
there were also a few oddball cases like command streamer reads, blitter
access, and so on.  The sampler is definitely L3 coherent, but some off
the more esoteric reads may not be, so I'd like to separate them, so
that OTHER_READ can become a non-L3-coherent kitchen-sink domain.

The sampler cases only need TEXTURE_CACHE_INVALIDATE, and can skip the
CONSTANT_CACHE_INVALIDATE we had on IRIS_DOMAIN_OTHER_READ.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Kenneth Graunke
8e0ff0275d iris: Use IRIS_DOMAIN_DEPTH_WRITE for read only depth/stencil.
We were using IRIS_DOMAIN_OTHER_READ for read-only depth/stencil access
in an attempt to avoid unnecessary flushing; IRIS_DOMAIN_DEPTH_WRITE
could indicate read-write access.

However, IRIS_DOMAIN_OTHER_READ is clearly the wrong domain.  Depth and
stencil data is read via the depth cache, while IRIS_DOMAIN_OTHER_READ
currently corresponds to the sampler cache and constant cache together
(although this will change in future patches).

It's unclear whether this hack was useful.  For now, just drop it and
use the correct depth cache domain, even if it's marked as read-write.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
2022-04-13 09:07:35 +00:00
Jason Ekstrand
cde1be0b84 iris: Handle range tracking for global bindings
The moment something is bound globally, the whole thing is valid.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15777>
2022-04-06 20:16:55 +00:00
Jason Ekstrand
187923c2eb iris: Account for BO offsets in iris_set_global_binding()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15777>
2022-04-06 20:16:55 +00:00
Francisco Jerez
6cc09699cd iris: Remove remaining history flushes.
This removes a couple of remaining history flushes which were
open-coded instead of using the iris_flush_and_dirty_for_history()
helper.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15738>
2022-04-04 10:32:31 -07:00
Mike Blumenkrantz
e219351457 iris: assert that samplerview base_array_layer is zero for hw < skl
this codepath is broken for older hardware

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15610>
2022-03-29 22:12:30 +00:00
Anuj Phogat
5cc4075f95 anv, iris: Add Wa_16011411144 for DG2
v2: Use CS_STALL instead of FLUSH_ENABLE in Iris (Lionel)
    Add missing CS_STALL after SO_BUFFER change in Anv (Lionel)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>
2022-03-17 14:18:02 +00:00
Jason Ekstrand
12d815bcac intel/guardband: Take min/max instead of total size
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 13:13:45 -05:00
Kenneth Graunke
8b9045e7a4 intel: Use 3DSTATE_BINDING_TABLE_POOL_ALLOC exclusively on Gfx11+
On Icelake and later, we can use a new 3DSTATE_BINDING_TABLE_POOL_ALLOC
command to update the location of the binder (buffer containing binding
table entries), rather than having to move Surface State Base Address
via a STATE_BASE_ADDRESS command.  This has less stalling and also means
our surface addresses can remain relative to a fixed 4GB address range,
meaning we don't have to re-stream them any time the binder changes.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke
e3a0e97300 intel: Limit Wa_1607854226 to Gfx12.0 only
This workaround is needed on all Gfx12.0 parts, but doesn't appear to be
necessary on XeHP.  The other drivers do not appear to be applying this
workaround on those parts.  As further evidence, we accidentally added
the 3DSTATE_BINDING_TABLE_POOL_ALLOC commands after switching back to
GPGPU mode, which would be an incorrect way to implement the workaround,
and things seem to be working.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke
ab47cad4fb iris: Rename surface_base_address to binder_address in a few places
On Gfx11+, we're going to stop changing Surface State Base Address
and instead start changing the Binding Table Pool address instead.

So, rename a few things to track the last binder address, which is
what we're actually changing, regardless of how we program it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke
db34c71513 iris: Use more efficient binding table pointer formats on Icelake+.
Skylake and older use a 15:5 binding table pointer format, which means
our binder can be at most 64kB in size.  Each binding table within the
binder must be aligned to 32B.

XeHP uses a new 20:5 binding table format, which allows us to increase
the binder size to 1MB while retaining the nice 32B alignment.  Larger
binders mean fewer stalls as we update the base address for the binder.

Icelake and Tigerlake can either use the 15:5 format or an 18:8 format.
18:8 mode requires the base of each binding table to be aligned to 256B
instead of 32B, but it gives us a maximum binder size of 512kB.

We can store 64 binding table entries in a 256B chunk (256B / 4B = 64),
but only 8 entries in a 32B chunk (32B / 4B = 8).  Assuming that most
binding tables have fewer than 64 entries, this means that with the 18:8
format, we're likely to be able to fit 2048 (512KB / 256B) tables into a
a buffer before needing to allocate a new one and stall.

Technically, the old format could also store 2048 binding tables per
buffer as well (64KB / 32B = 2048).  However, tables that needed more
than 8 entries would need multiple 32B chunks.  A single table would
take multiple aligned chunks, while with the larger 256B format, it
could fit in a single one.

This cuts binder resets by 6.3% on a Shadow of Mordor benchmark trace.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke
e6b7e74308 iris: Set MI_FLUSH_DW::PostSyncOperation correctly
The MI_FLUSH_DW post-sync operation uses the same encoding as the
PIPE_CONTROL one so we can use the same helper.  Write PS Depth Count
is not supported, of course, as the blitter has no depth pipeline.

This means that we can write the timestamp register from the blitter.

Fixes: 604d97671b ("iris: Add support for flushing the blitter (hackily)")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15157>
2022-02-24 21:42:16 +00:00
Paulo Zanoni
d10fd5b7c9 iris: fix register spilling on compute shaders on XeHP
XeHP scratch space is handled differently. Commit ae18e1e707
implemented support for it, but handled it differently between render
and compute shaders: it calculates scratch_addr differently and
doesn't pin the buffer on compute. Make it work on compute shaders by
calling pin_scratch_space() from iris_compute_walker(), which fixes
both the address and the pinning.

This commit can be verified by the two-year-old-but-still-unreviewed
Piglit MR 234. You can also verify this by running a very simple
compute shader with INTEL_DEBUG=spill_fs.

References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234
Fixes: ae18e1e707 ("iris: Add support for scratch on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070>
2022-02-22 22:16:57 +00:00
Tapani Pälli
ecc0041030 iris: fix a leak on surface states
Cc: mesa-stable
Closes:https://gitlab.freedesktop.org/mesa/mesa/-/issues/6013

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15010>
2022-02-15 07:06:50 +02:00
Paulo Zanoni
782efa29e6 iris: have a single border color pool per bufmgr
Have a single border color pool per bufmgr instead of per context.

We want to have a single VM shared among every context and the border
color pool is the last feature preventing us from having that.

Previously we had 1024 colors per context but once the buffer was full
we just waited for the buffer to be unused and restarted it. After
this patch we have 4096 colors for every single context and we can't
just flush buffers if they are full, so we simply return black.

There are many strategies we could try to implement to help alleviate
this new 4096 limit, none of which are implemented by this patch:

 - We could just expand the buffer to the full 16MB we can use,
   allowing 262144 colors.
 - We could use multiple buffers and make the contexts refcount them,
   so eventually older buffers would reach zero references and be
   recycled, moving us to a working set maximum from a lifetime
   maximum.
 - We could also make the border color pool be a standard memzone and
   then give smaller buffers to each context when they need, so the
   limit would be in the number of contexts that can use border color
   pools. This was my first implementation but Ken suggested I switch
   to the one provided by this patch, which is simpler.

Keep it like this since border colors don't seem to be used very much
and other Mesa drivers such as radeonsi also seem to employ the
"return black once we reach the limit" strategy.

As a last note, we could also move the contents of iris_border_color.c
to iris_bufmgr.c in order to avoid breaking some abstractions we have
in Iris, like we do with iris_bufmgr_get_border_color_pool(). I can do
this in case we want it.

v2: Switch from standard memzone to a per-screen thing (see above).
v3: Actually make it per bufmgr. Just making it per screen is not
    enough, since screens can share the same VM, an example being the
    gputest benchmark suite.
v4: Rebase.
v5: Remove dead code, lock around hash table lookup (Ken).
v6: Simple rebase.
v7: Another rebase (for_each_batch).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12028>
2022-02-11 01:42:45 +00:00
Nanley Chery
987bc44954 iris: Drop the iris_resource aux usage bit fields
A big reason we had these fields was to help create a set of surface
states for a resource. That's largely being handled through other means
now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
d905018a2c iris: Use iris_sample_with_depth_aux more often
We're going to remove res->aux.sampler_usages. This will simplify the
commit in which we do so.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
05b8b08ef4 iris: Avoid making some invalid CCS surface states
Although a resource may support CCS with its original format, a texture
view of that resource may have a format that doesn't support
compression. Don't create CCS surface states for such texture views.

This change affects iris' behavior when running piglit's
arb_texture_view-rendering-formats_gles3 test on SKL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
a9beb87dce iris: Inline some surface_state.cpu references
Now that we're using fill_surface_states, these aren't needed anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
d705faad6c iris: Add and use fill_surface_states
This helper simplifies some repeated logic.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
eb51fd0414 iris: Add and use use_surface_state
This helper simplifies some repeated logic.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
89ebdd67c4 iris: Add and use iris_surface_state::aux_usages
An iris_surface_state can have a different set of possible aux usages
than its iris_resource.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Nanley Chery
b60af618a0 iris: Drop res param from surf_state_offset_for_aux
This has been unused since commit 117a0368b0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
2022-02-10 04:47:14 +00:00
Tapani Pälli
562f7eef5b iris: invalidate L3 read only cache when VF cache is invalidated
When enabling the caching of index,vertex data in the L3 RO Cache
(L3BypassDisable), we need to use L3ReadOnlyCacheInvalidationEnable
to invalidate cache when buffer is modified by CPU/GPU.

Ref: bspec 46314
Fixes: ed8f2c4cbe ("iris: Cache VB/IB in L3$ for Gen12")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14815>
2022-02-09 10:05:10 +00:00
Kenneth Graunke
604d97671b iris: Add support for flushing the blitter (hackily)
To flush the blitter, we need to use MI_FLUSH_DW rather than the usual
PIPE_CONTROL we use on the 3D engine.  Most of our code is set up to
suggest flushes via PIPE_CONTROL commands, however, so we hackily just
emit MI_FLUSH_DW when they ask for any kind of PIPE_CONTROL flush.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14912>
2022-02-07 09:50:01 -08:00
Nanley Chery
dc70dd8c7d iris: Support the XeHP media compression format
The format on this platform is slightly different from the one used on
TGL. Also it's part of the surface state instead of an aux-map.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14355>
2022-01-28 00:30:55 +00:00
Francisco Jerez
4198ca4b3f iris/xehp: Implement workaround for 3D texturing+anisotropic filtering.
Implements a workaround for HSDES#14014414195.  Note that this change
deviates heavily from the workaround suggested in the HSDES, since all
of the suggestions are either costly at runtime or outright
non-compliant, so they would require us to apply the workaround
selectively for affected applications.

Instead of adding hacks to the compiler that manually implement the
LOD computation of 3D texturing operations in the shader, initialize
an extra sampler state structure in the driver that has anisotropic
filtering forced off, and use it instead of the normal sampler state
structure whenever a 3D texture is bound to the same sampler unit.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14489>
2022-01-21 23:24:33 +00:00
Kenneth Graunke
0bc7562466 iris: Do primitive ID overrides in 3DSTATE_SBE not SBE_SWIZ
Broadwell introduced new fields in 3DSTATE_SBE which allow us to ask
the hardware to override Primitive ID for us, rather than requiring us
to turn on attribute swizzling and specify per-attribute overrides in
3DSTATE_SBE_SWIZ.  We unconditionally enable attribute swizzling today,
but this is a step toward letting us think about disabling it in the
future.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14210>
2022-01-19 01:31:47 +00:00
Kenneth Graunke
223edb1ec1 iris: Use prog_data->inputs rather than shader info in SBE code.
This should be the same thing, but requires looking up less data.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14210>
2022-01-19 01:31:47 +00:00
Jordan Justen
f0692365a2 anv,blorp,crocus,i965,iris: Use devinfo->max_threads_per_psd for gfx8+
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13866>
2022-01-19 00:29:35 +00:00
Jordan Justen
a11dfc11cf iris: Use mi_builder for load/store reg/mem/imm functions
Ref: 06cf838cbd ("intel/mi_builder: Support gen11 command-streamer based register offsets")
Ref: 6ffdcc335e ("iris: Use mi_builder in iris_load_indirect_location()")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14340>
2022-01-18 23:11:38 +00:00
Jordan Justen
e29ed39d63 iris: Use mi_builder to set 3DPRIM registers for draws
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14340>
2022-01-18 23:11:38 +00:00
Lionel Landwerlin
2e3490dd0f iris: utrace/perfetto support
v2: Fixup gpu_id computation, use minor of /dev/dri/* % 128 since we
    don't know whether we get card0 or renderD128 for instance.
    (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
Acked-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13996>
2022-01-14 20:17:44 +00:00
Nanley Chery
f3c629733f anv,iris: PSS Stall Sync around color fast clears
Needed for XeHP (see Bspec 47704).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14024>
2022-01-12 01:30:34 +00:00
Francisco Jerez
074bde9989 intel/xehp: Switch to coarser cross-slice pixel hashing with table permutation.
The coarser 32x32 cross-slice hashing mode seems to lead to better L1
and L2 utilization due to the improved execution locality, however it
can also lead to a bottleneck in a single slice, especially in
workloads that concentrate heavy rendering in small areas of the
screen (e.g. SynMark2 OglGeomPoint, OglTerrain*) -- This effect is
mitigated here by performing a permutation of the pixel pipe hashing
tables that ensures that adjacent rows map to pixel pipes as far away
as possible in the caching hierarchy.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
2022-01-10 18:28:35 -08:00
Francisco Jerez
d149c5e6e0 iris: Program pixel hashing tables on XeHP.
Unlike the Gen11 code, this requires us to allocate a pipe_resource
for the pixel pipe hashing tables and hold a reference to it from the
context, since we need to add it to the validation list of every
batch, the tables may be accessed by the hardware at any time after
they're specified via 3DSTATE_SLICE_TABLE_STATE_POINTERS.

Note that this has an effect even for unfused native die platforms,
since the pixel pipe hashing tables we intend to program aren't
equivalent to the hardware's defaults on such configs.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
2022-01-10 18:28:35 -08:00
Francisco Jerez
283d5bff4e intel: Rename intel_compute_pixel_hash_table() to intel_compute_pixel_hash_table_3way().
For consistency with intel_compute_pixel_hash_table_nway().

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
2022-01-10 18:28:35 -08:00