Commit graph

10187 commits

Author SHA1 Message Date
Lionel Landwerlin
80feff8559 anv: emit 3DSTATE_URB_ALLOC_(MESH|TASK) only when mesh shaders are enabled
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
2023-09-11 10:08:44 +00:00
Lionel Landwerlin
ef8f28403f anv: fix 3DSTATE_VFG emission
3DSTATE_VFG was moved into a section that only gets emitted for legacy
pipelines, not mesh pipelines.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0ce772bd19 ("anv: split 3DSTATE_VFG emission")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
2023-09-11 10:08:44 +00:00
Ian Romanick
8ce4d7a08d intel/compiler: Don't evict for workgroup-scope fences
Flushing and invalidating caches isn't necessary for workgroup scope
fences.  In fact, the DP_FLUSH_TYPE docs (BSpec 54041) say:

   "If the fence scope is Local or Threadgroup, HW ignores the flush
    type and operates as if it was set to None(no flush)"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
2023-09-09 04:41:25 +00:00
Ian Romanick
5eddf60e56 intel/compiler: Combine control barriers with identical memory semantics
This prevents the second barrier generating a spurious, identical fence
message as the first barrier.

fossil-db stats on Alchemist:

   Totals:
   Instrs: 196513342 -> 196512777 (-0.00%); split: -0.00%, +0.00%
   Cycles: 14271426028 -> 14271404569 (-0.00%); split: -0.00%, +0.00%
   Send messages: 8021892 -> 8021770 (-0.00%)

   Totals from 46 (0.01% of 653252) affected shaders:
   Instrs: 76761 -> 76196 (-0.74%); split: -0.75%, +0.01%
   Cycles: 2027946 -> 2006487 (-1.06%); split: -1.45%, +0.39%
   Send messages: 7589 -> 7467 (-1.61%)

Nothing in shader-db was affected.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
2023-09-09 04:41:25 +00:00
Kenneth Graunke
9f98f20c58 anv: Use nir_opt_barrier_modes() to drop unnecessary barriers
fossil-db stats on Alchemist:

   Totals:
   Instrs: 196514947 -> 196513342 (-0.00%); split: -0.00%, +0.00%
   Cycles: 14271450761 -> 14271426028 (-0.00%); split: -0.00%, +0.00%
   Send messages: 8022316 -> 8021892 (-0.01%)

   Totals from 43 (0.01% of 653252) affected shaders:
   Instrs: 98558 -> 96953 (-1.63%); split: -1.63%, +0.00%
   Cycles: 15867801 -> 15843068 (-0.16%); split: -0.17%, +0.02%
   Send messages: 8997 -> 8573 (-4.71%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
2023-09-09 04:41:24 +00:00
Sagar Ghuge
6a89507be8 anv: Program and emit STATE_COMPUTE_MODE
Don't rely on the HW to set values correctly so just emit
STATE_COMPUTE_MODE with default values set to zero.

Also, this change includes workaround changes:-
   - 14015808183 (Parent HSD 14015782607)  - Need to emit pipe control
     with HDC flush and untyped cache flush set to 1 when CCS has
     non-pipelined state update with STATE_COMPUTE_MODE.
   - 14014427904 (Parent HSD 22013045878) - We need additional
     invalidate/flush when emitting non-pipelined state commands with
     multiple CCS enabled.

v2: (Tapani)
- Use lineage HSD numbers for check
- Don't use poisoned WA directly
- Use intel_needs_workaround helper

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24508>
2023-09-08 23:08:26 +00:00
Sagar Ghuge
f0d5c7848a intel/genxml: Add STATE_COMPUTE_MODE instruction
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24508>
2023-09-08 23:08:26 +00:00
Lionel Landwerlin
eb0c197090 anv: bound image usages to the associated queue family
When applying barriers for image transitions, we're currently
considering all possible usages of an image. But when running on a
compute only queue for example, the usage of an image will never be
one of those :
   - VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT
   - VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT
   - VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT
   - VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT
   - VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_KHR

Removing unused usages for the compute queue allows us to reduce the
scope of the VK_IMAGE_LAYOUT_GENERAL for example. This a bunch of
transition operation that are completely useless when dealing with
barriers on the compute queue.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25092>
2023-09-08 18:05:08 +00:00
Lionel Landwerlin
80a352c87c anv: remove aux checking asserts
Zink is running into those asserts on CI. The problem is that with non
auxilary modifiers like I915_FORMAT_MOD_Y_TILED, we might still
allocate larger buffers with IMPLICIT_CCS.

This isn't a complete fix, the real fix with come with
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25003 where
we stop overallocating and those assert will match the private binding
allocation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 569f80f2df ("anv: Reduce accesses of isl_mod_info->aux_usage")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25099>
2023-09-08 16:57:53 +00:00
Timothy Arceri
84e0f5ce75 nir: remove unused param from nir_alu_src_copy()
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24986>
2023-09-08 03:01:39 +00:00
Timothy Arceri
af1528cc15 nir: replace use of nir_src_copy()
Since 03b2c34793 nir_src_copy() no longer does anything useful,
it will be removed in the following patch.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24986>
2023-09-08 03:01:39 +00:00
Lionel Landwerlin
6f4fe3f81b anv: Copy/Clear MSAA images over companion RCS while we are on compute
When we have MSAA copy/clear operation on the compute queue, use the
companion RCS command buffer to carry out copy/clear operations.

v2: (Sagar)
- Flush cache according to command buffer
- Invalidate AUX when we create new companion RCS command buffer if
  platform support AUX TT.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
5b8bef8650 anv: Extract batch print code to anv_print_batch helper
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
9866c4e32b anv: Skip layout transition on the compute queue
v2: (Nanley)
- Make sure we skip layout transition during queue ownership transfer

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
46d203c0ab anv: Add secondary companion RCS cmd buffer to primary
Add secondary buffer's companion RCS command buffer to primary buffer's
companion RCS command buffer for execution if secondary RCS is valid.

v2: (Lionel)
- Fix the primary companion RCS check
- Set batch error

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
4d79c2d280 anv: Execute an empty batch to sync main and companion RCS batch
We need to synchronize main (CCS/BCS) and companion rcs batch, so let's
create an empty batch and make both the batches (CCS/BCS) and companion
RCS batch wait on empty sync batch and signal the fence.

Reason to execute the empty batch is we need to make sure the companion
RCS batch finish as soon as the CCS/BCS batch finish. Preemption could
prevent the companion RCS batch execution and we might end up destroying
the CCS/BCS batch before companion RCS finishes.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
0c49d3cf97 anv: Setup companion RCS command buffer submission
Add all the wait fences from the main (CCS/BCS) command buffer to the
companion RCS command buffer so that the companion RCS batch starts at
the same time as the main (CCS/BCS) batch.

v2:
- Drop unncessary flush (Jose)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
a63277ec36 anv: Execute RCS init batch on companion RCS context/engine
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
103512ef3b anv: Move compute specfic bits under compute queue init
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
b375302576 anv: Create companion RCS engine
We need to create companion RCS engine when there is CCS/BCS engine
creation requested.

v2:
- Factor out anv_xe_create_engine code in create_engine (Jose)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Lionel Landwerlin
a5f2c8c845 anv: create individual logical engines on i915 when possible
This enables us to create more logical engines than HW engines are
available. This also brings the uAPI usage closer to what is happening
on Xe.

Rework: (Sagar)
- Correct exec_flag at the time of submission
- Handle device status check
- Set queue parameters

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
a5e4be45c0 intel: Pass virtual memory address space ID while creating context
In future patches, we will be creating a separate companion RCS engine
and each engine is created with it's own address space, and we really
don't want. CCS and RCS engine writes should be visible to each other in
order to get the wait/signal mechanism working.

v2:
- Move drm_i915_gem_context_create_ext_setparam out of if block (Lionel)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
b73960fc40 intel: Add helper to create/destroy i915 VM
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
13b3d7f741 anv: Handle companion RCS in end/destory/reset code path
If we have valid companion RCS command buffer, we should
end/destroy/reset in the same fashion as of main command buffer.

v2:
- Add lock around anv_cmd_buffer_destroy (Sagar)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:06 +00:00
Sagar Ghuge
801523f03d anv: Split out End/Destroy/Reset cmd buffer code into helper
Since we are going to have companion RCS command buffer, we need to
end/destroy/reset companion RCS command buffer similar to main (CCS/BCS)
command buffer.

It's better to split out common code into helper function so that we can
use it later in this series.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:05 +00:00
Sagar Ghuge
edcde0679c anv: Add helper to create companion RCS command buffer
This helper takes the main command buffer as input and then create a
companion RCS command buffer.

v2:
- Rename anv_get_render_queue_index helper to
  anv_get_first_render_queue_index (Jose)
- Rename RCS command buffer to companion RCS command buffer (Lionel)
- Add early return in anv_get_first_render_queue_index (Lionel)
- Add lock around the function (Jose)
- Move companion rcs command pool creation in device create (Sagar)
- Reset companion RCS cmd buffer (Sagar)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
2023-09-07 06:39:05 +00:00
Lionel Landwerlin
ceb1c6033b anv: split BLEND_STATE packing from BLEND_STATE_POINTERS emit
This way when blorp changes the 3DSTATE_BLEND_STATE_POINTERS, we can
just reemit the prior Vulkan state without repacking any of the values
in the BLEND_STATE structure.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
2b5f9cc30a anv: remove unused state emission
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
50f6903bd9 anv: add new low level emission & dirty state tracking
A single Vulkan state can map to multiple fields in different GPU
instructions. This change introduces the bottom half of a simplified
emission mechanism where we do the following :
          Vulkan runtime state
                   |
                   V
        Intermediate driver state
                   |
                   V
         Instruction programming

This way we can detect that the intermediate state didn't change and
avoid HW instruction emission.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
44656f98d5 anv: split pipeline programming into instructions
The goal of this change it to move away from a single batch buffer
containing all kind of pipeline instructions to a list of instructions
we can emit separately.

We will later implement pipeline diffing and finer state tracking that
will allow fewer instructions to be emitted.

This changes the following things :

   * instead of having a batch & partially packed instructions, move
     everything into the batch

   * add a set of pointer in the batch that allows us to point to each
     instruction (almost... we group some like URB instructions,
     etc...).

At pipeline emission time, we just go through all of those pointers
and emit the instruction into the batch. No additional packing is
involved.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
758540d741 anv: add a flag tracking occlusion query count change
We'll use this later to know when to reemit
3DSTATE_STREAMOUT::ForceRendering

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
0ce772bd19 anv: split 3DSTATE_VFG emission
Leave the static part in genX_pipeline.c and only repack the dynamic
part in genX_gfx_state.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
1e081bd680 anv: split 3DSTATE_TE packing between static & dynamic parts
We can reduce the amount of packing we do by only packing the dynamic
part.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
19c3f3ede4 anv: categorize partial/final pipeline instruction
The old gfx8 field doesn't apply anymore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:02 +00:00
Lionel Landwerlin
b1614c4e22 anv: rename files to represent their usage
gfx8_cmd_buffer.c does not apply to gfx8 anymore for instance, it can
also be included in all builds.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:01 +00:00
Lionel Landwerlin
a1f7e7d93e anv: move all dynamic state emission to cmd_buffer_flush_dynamic_state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:01 +00:00
Lionel Landwerlin
047c0ba44b intel/decoder: implement accumulated prints
Useful when you want to compare 2 batches with different ordering in
instruction emission. Also when the driver tries to avoid re-emitting
state.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:01 +00:00
Lionel Landwerlin
2c3a51573a intel/anv: batch stats util
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:01 +00:00
Lionel Landwerlin
1fdc089e9c anv: change anv_batch_emit_merge to also do packing
Instead of having that function do only merging of 2 sets of dwords,
it can also do the packing of the new dynamic values. This saves us a
bunch of local structures to declare and calling the packing functions
ourselves.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:01 +00:00
Lionel Landwerlin
5c287385c2 anv: remove ReorderMode from pipeline 3DSTATE_GS emission
This bit is set in the dynamic state emission. This is currently not
breaking anything because LEADING=0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 71ebd9b9d7 ("anv,hasvk: respect provoking vertex setting on geometry shaders")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
2023-09-06 20:07:01 +00:00
Lionel Landwerlin
adfa4f0453 blorp: remove unused variable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24719>
2023-09-06 19:34:28 +00:00
Lionel Landwerlin
9231f24be1 hasvk: add state cache invalidation back before fast clears
Prior to 87149cc545, blorp added a state cache invalidation prior to
fast clears. This got dropped on Hasvk.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 87149cc545 ("blorp: update and move fast clear PIPE_CONTROLs to drivers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24719>
2023-09-06 19:34:28 +00:00
Lionel Landwerlin
c9739e8912 intel/fs: limit register flag interaction of FIND_*LIVE_CHANNEL
Those instructions do not access the flag registers on Gfx8+. Removing
the interaction enables CSE to remove more of those instructions.

Results are a bit mixed (DG2 vulkan fossils):

  ACO:
  Totals from 127 (5.97% of 2128) affected shaders:
  Instrs: 139966 -> 138972 (-0.71%); split: -0.85%, +0.14%
  Cycles: 1685747 -> 1667480 (-1.08%); split: -2.35%, +1.26%
  Max live registers: 10582 -> 10544 (-0.36%)
  Max dispatch width: 1048 -> 1040 (-0.76%)

  Cyberpunk 2077:
  Totals from 2879 (27.95% of 10301) affected shaders:
  Instrs: 4264789 -> 4225666 (-0.92%); split: -1.01%, +0.09%
  Cycles: 72380209 -> 71619521 (-1.05%); split: -1.63%, +0.58%
  Subgroup size: 30624 -> 30632 (+0.03%)
  Spill count: 98 -> 101 (+3.06%)
  Fill count: 90 -> 93 (+3.33%)
  Scratch Memory Size: 8192 -> 9216 (+12.50%)
  Max live registers: 217807 -> 217098 (-0.33%); split: -0.59%, +0.26%
  Max dispatch width: 23792 -> 24112 (+1.34%)

  Gaining 40 SIMD16 shaders

  Rise Of The Tomb Raider:
  Totals from 622 (5.06% of 12289) affected shaders:
  Instrs: 437380 -> 434760 (-0.60%); split: -0.72%, +0.12%
  Cycles: 261843085 -> 261580703 (-0.10%); split: -0.73%, +0.63%
  Max live registers: 27731 -> 27766 (+0.13%); split: -1.01%, +1.14%
  Max dispatch width: 5832 -> 5432 (-6.86%); split: +0.27%, -7.13%

  Loosing 26 SIMD32 shaders

  Strange Brigade:
  Totals from 1298 (31.48% of 4123) affected shaders:
  Instrs: 1504408 -> 1487968 (-1.09%); split: -1.17%, +0.08%
  Cycles: 20735976 -> 20443216 (-1.41%); split: -1.60%, +0.19%
  Max live registers: 89911 -> 89957 (+0.05%)

DG2 shader-db run:

  total instructions in shared programs: 23130895 -> 23130036 (<.01%)
  instructions in affected programs: 260956 -> 260097 (-0.33%)
  helped: 234
  HURT: 101
  helped stats (abs) min: 1 max: 54 x̄: 6.36 x̃: 4
  helped stats (rel) min: 0.05% max: 8.16% x̄: 2.01% x̃: 1.90%
  HURT stats (abs)   min: 1 max: 37 x̄: 6.23 x̃: 3
  HURT stats (rel)   min: 0.02% max: 5.67% x̄: 0.89% x̃: 0.55%
  95% mean confidence interval for instructions value: -3.62 -1.51
  95% mean confidence interval for instructions %-change: -1.33% -0.94%
  Instructions are helped.

  total loops in shared programs: 6071 -> 6071 (0.00%)
  loops in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total cycles in shared programs: 898610645 -> 898557166 (<.01%)
  cycles in affected programs: 18308201 -> 18254722 (-0.29%)
  helped: 315
  HURT: 48
  helped stats (abs) min: 1 max: 19312 x̄: 404.23 x̃: 128
  helped stats (rel) min: 0.02% max: 28.98% x̄: 3.92% x̃: 2.65%
  HURT stats (abs)   min: 2 max: 14478 x̄: 1538.60 x̃: 409
  HURT stats (rel)   min: <.01% max: 23.24% x̄: 3.34% x̃: 0.41%
  95% mean confidence interval for cycles value: -333.68 39.03
  95% mean confidence interval for cycles %-change: -3.51% -2.41%
  Inconclusive result (value mean confidence interval includes 0).

  total spills in shared programs: 5964 -> 5964 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 6909 -> 6909 (0.00%)
  fills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total sends in shared programs: 1040266 -> 1040266 (0.00%)
  sends in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  LOST:   3
  GAINED: 1

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24553>
2023-09-06 14:47:40 +00:00
Jordan Justen
8c8fca53fd intel/genxml: Fix comparing xml when node counts differ
This fix is more relevant to MR !20593. Normally when sorting the
number of nodes will be equivalent today, so this bug will not be
encountered. But in !20593, we can shrink (--import) or grow the
number of elements (--flatten) when the genxml_import.py tool is used.

Fixes: e60a0b1616 ("intel/genxml: Move sorting & writing into GenXml class")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24902>
2023-09-06 07:18:47 +00:00
Jordan Justen
d8038c8d09 intel/genxml: Ignore tail leading/trailing whitespace in node_validator()
When importing or flattening genxml with the genxml_import.py script
in MR !20593, it can lead to the tail portion of xml items differing
in whitespace.

If we strip the trailing and leading whitespace from the tail string,
and the strings are equivalent, then we can consider the xml items to
be equivalent.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24903>
2023-09-06 06:51:48 +00:00
Jordan Justen
5d37359f32 intel/dev/xe: Move placeholder subslice info into XEHP_FEATURES
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24418>
2023-09-05 23:02:42 -07:00
Chris Spencer
c29e3d5205 anv/video: use correct enum value for max level IDC
Signed-off-by: Chris Spencer <spencercw@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24649>
2023-09-06 05:10:33 +00:00
Jordan Justen
2b128c570b intel/clflush: Add support for clflushopt instruction
Rework:
 * Split clflushopt into a separate file as recommended by Ken.
   If we enable -mclflush on all driver source compilation, then
   gcc may insert uses of it on processors that don't support it.
 * Add uintptr_t casting to cpu_caps->cacheline usage

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22379>
2023-09-06 01:39:53 +00:00
Jordan Justen
e111d3241a anvil,hasvk: Use intel_flush_range_no_fence to flush command buffers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22379>
2023-09-06 01:39:53 +00:00
Jordan Justen
9f20be64e6 intel/common: Add intel_flush_range_no_fence
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22379>
2023-09-06 01:39:53 +00:00