The bo_flags are i915 specific and should not be handled in common
code, so here adding it to backend as it is in the hot-path.
There still i915 bo_flags handling in anv_device_import_bo() that
will be handled in the next patch.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25044>
The compaction introduced in a252123363 ("intel/compiler/mesh: compactify MUE layout")
is not suitable for the case where graphics pipeline libraries are fast
linked, as the fragment shader won't receive the mue_map to know where
to locate its inputs.
For that case, keep doing what we did before and lay things down in the
order varyings are defined, which is also how it works for the non-mesh
case.
Fixes dEQP-VK.fragment_shading_rate.*fast_linked_library*.ms
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>
We don't fill batch_len later when chaining batches. But that doesn't
seem to be a problem, I checked i915.ko and nothing Gen8+ seems to use
batch_len. The new xe.ko exec ioctl doesn't even ask for batch_len.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24681>
Don't hide failures, we have xe.ko bugs related to that, such as:
https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/496
The bind ioctl may fail if the application does something wrong, but
the wait really should never fail.
v2: Don't print an error message (Lionel).
Reviewed-by: José Roberto de Souza <jose.souza@intel.com> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24681>
Our sparse implementation will require us to issue partial unbinds,
but partial unbinds are not supported in synchronous vm_bind ioctls,
requiring us to to have our VM be marked with the ASYNC flag. This is
not properly documented and is subject to change in the next
iterations of the API.
Error handling with async binds is also not documented anywhere and is
being actively discussed in the mailing lists, so whatever we decide
to implement here is likely to end up changing in a few weeks. Also, I
haven't seen these errors happening in the real world, so for now
they're a very corner case. So for now just foward errors to
user-space and hope things work.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24681>
For Sparse Resources we need to be able to specify the address, size
and offsets and we also want to be able to issue multiple binds at the
same time. Extend xe_vm_bind_op() to handle those cases and add
the new vfunc.
v2:
- use STACK_ARRAY() (Lionel)
- no more need to work around xe.ko bug that was fixed (José)
Reviewed-by: José Roberto de Souza <jose.souza@intel.com> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24681>
The only driver that has a vm_bind ioctl is xe.ko, and its vm_bind
ioctl is not called GEM vm_bind, it's just DRM_IOCTL_XE_VM_BIND
(without GEM anywhere). Back when i915.ko was going to have a vm_bind
ioctl it had GEM on its name, so I guess that's how "gem" appeared in
the naming here, but now nothing does, so let's get rid of it.
Also, these vfuncs we have are specifically made to bind and unbind
whole BOs, so rename them to vm_bind_bo() and vm_unbind_bo() in order
to try to clarify what they mean. The goal is to add a more generic
vm_bind() later that can do anything.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24681>
While we don't need to emit all of the unused mesh/task states when mesh
is disabled, if we don't have them we fail some assertions in the
difference checks due to the corresponding state being empty.
This may happen when going from a mesh pipeline to a non-mesh one, or
one that uses task shaders to one that doesn't.
It may be possible to avoid having to do this, but I'd rather start from
a working state and optimize it later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
Any partially packed instructions should always be pre-packed by
genX_pipeline.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
3DSTATE_VFG was moved into a section that only gets emitted for legacy
pipelines, not mesh pipelines.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0ce772bd19 ("anv: split 3DSTATE_VFG emission")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
Don't rely on the HW to set values correctly so just emit
STATE_COMPUTE_MODE with default values set to zero.
Also, this change includes workaround changes:-
- 14015808183 (Parent HSD 14015782607) - Need to emit pipe control
with HDC flush and untyped cache flush set to 1 when CCS has
non-pipelined state update with STATE_COMPUTE_MODE.
- 14014427904 (Parent HSD 22013045878) - We need additional
invalidate/flush when emitting non-pipelined state commands with
multiple CCS enabled.
v2: (Tapani)
- Use lineage HSD numbers for check
- Don't use poisoned WA directly
- Use intel_needs_workaround helper
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24508>
When applying barriers for image transitions, we're currently
considering all possible usages of an image. But when running on a
compute only queue for example, the usage of an image will never be
one of those :
- VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT
- VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT
- VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT
- VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT
- VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_KHR
Removing unused usages for the compute queue allows us to reduce the
scope of the VK_IMAGE_LAYOUT_GENERAL for example. This a bunch of
transition operation that are completely useless when dealing with
barriers on the compute queue.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25092>
Zink is running into those asserts on CI. The problem is that with non
auxilary modifiers like I915_FORMAT_MOD_Y_TILED, we might still
allocate larger buffers with IMPLICIT_CCS.
This isn't a complete fix, the real fix with come with
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25003 where
we stop overallocating and those assert will match the private binding
allocation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 569f80f2df ("anv: Reduce accesses of isl_mod_info->aux_usage")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25099>
When we have MSAA copy/clear operation on the compute queue, use the
companion RCS command buffer to carry out copy/clear operations.
v2: (Sagar)
- Flush cache according to command buffer
- Invalidate AUX when we create new companion RCS command buffer if
platform support AUX TT.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
v2: (Nanley)
- Make sure we skip layout transition during queue ownership transfer
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
We need to synchronize main (CCS/BCS) and companion rcs batch, so let's
create an empty batch and make both the batches (CCS/BCS) and companion
RCS batch wait on empty sync batch and signal the fence.
Reason to execute the empty batch is we need to make sure the companion
RCS batch finish as soon as the CCS/BCS batch finish. Preemption could
prevent the companion RCS batch execution and we might end up destroying
the CCS/BCS batch before companion RCS finishes.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
Add all the wait fences from the main (CCS/BCS) command buffer to the
companion RCS command buffer so that the companion RCS batch starts at
the same time as the main (CCS/BCS) batch.
v2:
- Drop unncessary flush (Jose)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
We need to create companion RCS engine when there is CCS/BCS engine
creation requested.
v2:
- Factor out anv_xe_create_engine code in create_engine (Jose)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
This enables us to create more logical engines than HW engines are
available. This also brings the uAPI usage closer to what is happening
on Xe.
Rework: (Sagar)
- Correct exec_flag at the time of submission
- Handle device status check
- Set queue parameters
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
In future patches, we will be creating a separate companion RCS engine
and each engine is created with it's own address space, and we really
don't want. CCS and RCS engine writes should be visible to each other in
order to get the wait/signal mechanism working.
v2:
- Move drm_i915_gem_context_create_ext_setparam out of if block (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
If we have valid companion RCS command buffer, we should
end/destroy/reset in the same fashion as of main command buffer.
v2:
- Add lock around anv_cmd_buffer_destroy (Sagar)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
Since we are going to have companion RCS command buffer, we need to
end/destroy/reset companion RCS command buffer similar to main (CCS/BCS)
command buffer.
It's better to split out common code into helper function so that we can
use it later in this series.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
This helper takes the main command buffer as input and then create a
companion RCS command buffer.
v2:
- Rename anv_get_render_queue_index helper to
anv_get_first_render_queue_index (Jose)
- Rename RCS command buffer to companion RCS command buffer (Lionel)
- Add early return in anv_get_first_render_queue_index (Lionel)
- Add lock around the function (Jose)
- Move companion rcs command pool creation in device create (Sagar)
- Reset companion RCS cmd buffer (Sagar)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
This way when blorp changes the 3DSTATE_BLEND_STATE_POINTERS, we can
just reemit the prior Vulkan state without repacking any of the values
in the BLEND_STATE structure.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
A single Vulkan state can map to multiple fields in different GPU
instructions. This change introduces the bottom half of a simplified
emission mechanism where we do the following :
Vulkan runtime state
|
V
Intermediate driver state
|
V
Instruction programming
This way we can detect that the intermediate state didn't change and
avoid HW instruction emission.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
The goal of this change it to move away from a single batch buffer
containing all kind of pipeline instructions to a list of instructions
we can emit separately.
We will later implement pipeline diffing and finer state tracking that
will allow fewer instructions to be emitted.
This changes the following things :
* instead of having a batch & partially packed instructions, move
everything into the batch
* add a set of pointer in the batch that allows us to point to each
instruction (almost... we group some like URB instructions,
etc...).
At pipeline emission time, we just go through all of those pointers
and emit the instruction into the batch. No additional packing is
involved.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
We'll use this later to know when to reemit
3DSTATE_STREAMOUT::ForceRendering
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
Leave the static part in genX_pipeline.c and only repack the dynamic
part in genX_gfx_state.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
We can reduce the amount of packing we do by only packing the dynamic
part.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
gfx8_cmd_buffer.c does not apply to gfx8 anymore for instance, it can
also be included in all builds.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>