Commit graph

15202 commits

Author SHA1 Message Date
Jason Ekstrand
63e80d441a intel/genxml: Remove old scratch fields on GFX version 12.5
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
eeeea5cb87 anv: Add support for scratch on XeHP
Rework:
 * Jordan: Handle per_thread_scratch==0 in anv_scratch_pool_get_surf
 * Jordan: Update subslices in anv_scratch_pool_alloc
 * Jason: Clean up the patch a bit

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
8ca0513eca intel/genxml: Add new ScratchSpaceBuffer fields on GFX version 12.5
Rework:
 * Jordan: Fix MEMZONE_BINDER_START detection
 * Jordan: Bump the IRIS_BINDLESS_SIZE to 8M

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
1e242785c3 intel/fs: Implement load/store_scratch on XeHP
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
c38812be1d intel/fs: Implement spilling on XeHP
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
96ee78778b intel/isl: Add support for scratch buffers
XeHP adds support for a new surface type for scratch.  It's similar to
SURFTYPE_STRBUF in that it's a 2D array-of-struct format but the one
key difference is that the U coordinate is computed automatically based
on the thread ID and only the V coordinate is provided in the dataport
message.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
443627fcc0 intel/genxml: Add SURFTYPE_SCRATCH on GFX version 12.5
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
2021-06-25 00:18:29 +00:00
Jason Ekstrand
d31dd81292 anv: Claim to be a discrete GPU if has_lmem
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Jordan Justen
b6a1063c2e intel/dev: Set has_local_mem for DG1
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Sagar Ghuge
e505c221fa anv: Allocate scratch and workaround BO in local memory
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Sagar Ghuge
6352371ff6 anv: Allocate BO in appropriate region
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Sagar Ghuge
3f8eca7f82 anv: Wrapper around I915_GEM_CREATE_EXT_MEMORY_REGIONS
v2 (Jordan Justin):
 - add anv_gem_stubs.c impl

v3 (Jason Ekstrand):
 - Use the upstream uAPI
 - Rework the interface a bit

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Sagar Ghuge
65e8d72bc1 anv: Query memory region info
Create additional memory type with DEVICE_LOCAL_BIT if we have local
memory region aviable.

v2 (Jason Ekstrand):
 - Don't leak mem_regions if the second ioctl fails

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Jordan Justen
cb6feae0b5 intel/devinfo: Add has_local_mem
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
2021-06-24 16:14:38 +00:00
Jason Ekstrand
b8030ab1ea isl,docs: Add a chapter on AUX state tracking
We also update and improve the docs in isl.h which get pulled into this
new chapter.

Acked-by: Luis Strano <luis.strano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
2021-06-24 13:57:40 +00:00
Jason Ekstrand
94a52bc85c isl,iris: Move the extra_aux_surf logic into iris
This gets rid of the awkward interface for isl_surf_get_ccs_surf where
we passed it two aux surfaces and it was supposed to fill out the second
one based on whether or not the first one already had stuff in it.
Instead, we now pass it three well-labled surfaces: surf,
hiz_or_mcs_surf, and ccs_surf which have obvious meanings.  This does
mean that iris has to carry a bit of logic and we have to flip
parameters around in all the callers.  But the resulting interface is
much cleaner.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
2021-06-24 13:57:40 +00:00
Jason Ekstrand
37f76aab1c isl: Take a hiz_or_mcs_surf in isl_surf_supports_ccs
Whether or not a surface supports CCS on Tigerlake and later is
dependent not only on the main surface but also on the MCS or HiZ
surface, if any.  We were doing some of these checks in
isl_get_ccs_surf based on the extra_aux parameter but not as many as we
probably should.  In particular, we were really only checking HiZ
conditions and nothing for MCS.  It also meant that, in spite of the
symmetry in names, the checks in isl_surf_get_ccs_surf were more
complete than in isl_surf_supports_ccs.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
2021-06-24 13:57:40 +00:00
Jason Ekstrand
2d2590a879 isl: Assert some iris invariants in isl_surf_get_ccs_surf
The only driver which calls isl_surf_get_ccs_surf with extra_aux != NULL
is iris and it always calls it with two aux surfaces and never calls it
for CCS twice.  We can turn those checks into asserts.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
2021-06-24 13:57:40 +00:00
Dave Airlie
0acd202858 intel/genxml: fix gfx6 GS SVB_INDEX encoding
This seems to match what the docs + 965 traces say

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11567>
2021-06-24 06:50:49 +00:00
Tapani Pälli
55951ac28e anv: fix emitting dynamic primitive topology
Initial implementation missed various fields that derive from the
primitive topology. This patch fixes 3DSTATE_RASTER/3DSTATE_SF,
3DSTATE_CLIP and 3DSTATE_WM (gen7.x) emission in the dynamic case.

Fixes: f6fa4a8000 ("anv: add support for dynamic primitive topology change")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4924
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11379>
2021-06-23 12:01:45 +00:00
Lionel Landwerlin
9b7cba7724 anv: bound checks buffer memory binding in debug builds
Validation layers should warn you about this
(VUID-VkBindBufferMemoryInfo-size-01037) but this would be useful for
zink debugging.

Requested by Zmike.

v2: Also check memoryOffset (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11542>
2021-06-23 08:16:57 +00:00
Francisco Jerez
4dc4284342 intel/fs: Implement Wa_14013745556 on TGL+.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Francisco Jerez
c19cfa9dc2 intel/fs: Fix synchronization of accumulator-clearing W/A move on TGL+.
Right now the accumulator-clearing move emitted by the generator for
Wa_14010017096 inherits the SWSB field from the previous instruction.
This can lead to redundant synchronization, or possibly more serious
issues if the previous instruction had a TGL_SBID_SET SWSB
synchronization mode.  Take the SWSB synchronization information from
the IR.

Fixes: a27542c5dd ("intel/compiler: Clear accumulator register before EOT")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Francisco Jerez
63abc083ce intel/fs: Teach IR about EOT instruction writing the accumulator implicitly on TGL+.
This is unlikely to have had any negative side effect on the original
TGL, but will lead to issues on XeHP+ if the software scoreboard pass
isn't able to synchronize the accumulator writes.

Fixes: a27542c5dd ("intel/compiler: Clear accumulator register before EOT")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Francisco Jerez
5e7f443de0 intel/fs: Add SWSB dependency annotations for cross-pipeline WaR data hazards on XeHP+.
In cases where an in-order instruction is overwriting a register
previously read by another in-order instruction, drop the dependency
iff the previous read is guaranteed to have occurred from the same
in-order pipeline.  This should only have an effect on XeHP+ since
previous Xe platforms only had one in-order FPU pipeline.

The previous workaround we were using for this treated all ordered
read dependencies as write dependencies to avoid noise from our
simulation environment.  Relative to our previous workaround this
improves performance of GFXBench5 gl_tess by ~7% on a DG2 system
among other single-digit percentual FPS improvements.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Francisco Jerez
d46bb14d14 intel/fs: Implement Wa_22012725308 for cross-pipe accumulator data hazard.
The hardware fails to provide the expected data coherency guarantees
for accumulator registers when accessed from multiple FPU pipelines.
Fix this by tracking implicit accumulator accesses just like we do for
regular GRF registers, but instead of adding synchronization
annotations for any dependency we only do it for dependencies with a
pipeline mismatch, since the hardware should be able to guarantee
proper synchronization for matching pipelines.

Note that this workaround handles RaW and WaW dependencies in addition
to the WaR dependencies described in the hardware bug report even
though cross-pipeline RaW accumulator dependencies should be extremely
rare, since chances are the hardware will also hang if we ever hit
such a condition.  This only affects XeHP+, since all FPU instructions
are executed as a single in-order pipeline on earlier Xe platforms.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Francisco Jerez
385da1fe36 intel/fs: Track single accumulator in scoreboard lowering pass.
This change reduces the precision of the scoreboard data structure for
accumulator registers, because the rules determining the aliasing of
accumulator registers are non-trivial and poorly documented (e.g. acc0
overlaps the storage of acc1 when the former is accessed with an
integer type).  We could implement those rules but it wouldn't have
any practical benefit since we currently only use acc0-1, and for the
most part we can rely on the hardware's accumulator dependency
tracking.  Instead make our lives easier by representing it as a
single register.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Francisco Jerez
231337a13a intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.
As required by HSDES:14013363432.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>
2021-06-23 07:34:22 +00:00
Lionel Landwerlin
7ed0aaced7 nir: use a more fitting index for btd_stack_push_intel
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
423c47de99 nir: drop the btd_resume_intel intrinsic
This is now 100% equivalent to the new rt_resume intrinsic.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
4d9fcf2799 intel/rt: switch to common pass for shader calls lowering
v2: rename for new indices

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
1dacea10f3 anv: implement caching for ray tracing pipelines
v2: Turn a bunch of pointer checks into checks against NULL (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
fed7ac932f anv: move trivial return shader to device
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
7c852f78c4 anv: store more RT shader data in pipeline_stage object
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
045f4600b1 anv: cache raytracing trampoline shader
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Lionel Landwerlin
ab77aeb488 blorp: add blorp string in shader keys
Upon looking at caching the raytracing shader (in particular the
trampoline one) I kind of got afraid that some of the keys used for
blorp would end up matching other keys. This is because blorp keys are
fairly simple. There is no SPIRV module hash included.

This change includes a "blorp" string at the beginning of the queue to
ensure we don't collide with other keys.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
7479fe6ae0 anv: Implement vkCmdTraceRays and vkCmdTraceRaysIndirect
v2: Fix anv_cmd_state::binding_tables array size (Lionel)

v2: Fix anv_cmd_state::samplers array size (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
ac6d7a1758 anv: Make anv_address::offset 64-bit
This allows us to convert a 64-bit address to an anv_address which is
useful for working with device addresses.

v2: switch to int64_t to keep state pool relative relocation working
    on non-softpin platforms

v3: Update assert to reflect relative offsets (Jason)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
a67d7c9dee anv: Allow _anv_combine_address with a NULL batch
This is required in order to be able to use GenXML pack functions for
structs with addresses when you're not packing into a batch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
f68d64dac0 anv: Add support for vkCmdSetRayTracingPipelineStackSizeKHR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
02f7964a13 anv: Compute scratch sizes for ray-tracing pipelines and shader groups
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
c3ac9afca3 anv: Create and return ray-tracing pipeline SBT handles
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
b66d3e627a intel/fs: Don't pull CS push constants if uses_inline_data
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
79dc25d867 anv: Compile trivial return and trampoline shaders
These don't necessarily go in any group but are required for dispatch to
work properly.  The trampoline is a compute shader that is the initial
start point for the trace.  It's in charge of invoking the actual
ray-gen shader.  The trivial return shader is used whenever another
shader is missing and it does no work except the minimum required to do
a stack return.

v2: Rebase on upstream changes (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
e104555851 anv: Compile ray-tracing shaders
This doesn't look too different from other compile functions we have in
anv_pipeline.c.  The primary difference is that ray-tracing pipelines
have this weird two-stage thing where you have "stages" which are
individual shaders and "groups" which are sort of mini pipelines that
are used to handle hits.  For any given ray intersection, only the hit
and intersection shaders from the same group get used together.  You
can't have an intersection shader from group A used with an any-hit from
group B.  This results in a weird two-step compile.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
379b9bb7b0 anv: Support fetching descriptor addresses from push constants
Bindless shaders don't have binding tables so they have to get at the
descriptor sets via a different mechanism.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
937ffb1af0 nir/apply_pipeline_layout: Handle bindless shaders
They don't have binding tables so they have to use A64 descriptor set
access and everything has to be bindless all the time.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
770d331285 anv: Disallow UBO pushing for bindless shaders
They don't really have push constants.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
c92fd35848 intel/rt: Use reloc constants for the resume SBT
It's going to be attached to the end of the shader binary, not an
arbitrary table somewhere in memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
705395344d intel/fs: Add support for compiling bindless shaders with resume shaders
Instead of depending on the driver to compile each resume shader
separately, we compile them all in one go in the back-end and build an
SBT as part of the shader program.  Shader relocs are used to make the
entries in the SBT point point to the correct resume shader.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00