Now that we're using 3DSTATE_BINDING_TABLE_POOL_ALLOC to set the base
address for the binding table pool separately from surface states, we
don't actually need to update surface state base address anymore.
Instead, we can just set STATE_BASE_ADDRESS once at context creation,
and never bother updating it again, saving some heavyweight flushes
and freeing us from the need for address offsetting trickery.
This patch was originally written by Jason Ekstrand, with fixes from
Lionel Landwerlin, but was targeting Icelake. Doing it there requires
additional changes (15:5 -> 18:8 binding table pointer formats) which
also involve some trade-offs, whereas the XeHP change is purely a win,
so we'll do it here first.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15616>
We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED()
so people don't get tempted to use it and make things incompatible with
MSVC (which doesn't support typeof()).
Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>
A new extension allowing the application to use the OpenGL depth
range in NDC, i.e. with depth in range [-1, 1], as opposed to
Vulkan’s default of [0, 1].
v2:
- call gfx8_cmd_buffer_emit_viewport on ANV_CMD_DIRTY_PIPELINE (Jason)
- remove redundant !! operator since negativeOneToOne must be true or
false (Tapani)
- coding style changes (Lionel)
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6070
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15304>
This raises our vertex input bindings limit from 28 to 31, and our
vertex input attribute limit from 28 to 29. We could theoretically
go higher, but it will take additional work.
The 3DSTATE_VERTEX_BUFFERS and 3DSTATE_VERTEX_ELEMENTS limits are 33
vertex buffers, and 34 vertex elements. But we need up to two vertex
elements for system values (FirstVertex, BaseVertex, BaseInstance,
DrawID), and we currently use two vertex bindings for those.
There is another hidden limit: our compiler backend only supports the
push model for VS inputs currently. 3DSTATE_VS only allows URB Read
Lengths between [0, 15], which is measured in pairs of inputs, which
means we can theoretically push no more than 32 vertex elements. This
is no artifical limit either, as a vec4 element takes up 4 registers
in the payload, and 32 * 4 = 128, the entire size of our register file.
Plus, the VS Thread payload needs at least g0 and g1 for other things,
so we can really only push 31.
We can theoretically support one additional binding, by combining our
two SGV bindings into a single upload. In order to support additional
vertex elements, we would need to add support to the backend compiler
for the pull model for VS inputs.
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5917
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14991>
The Vulkan 1.3 spec says:
"The implementation-dependent limit bufferImageGranularity specifies
a page-like granularity at which linear and non-linear resources
must be placed in adjacent memory locations to avoid aliasing. Two
resources which do not satisfy this granularity requirement are said
to alias. bufferImageGranularity is specified in bytes, and must be
a power of two. Implementations which do not impose a granularity
restriction may report a bufferImageGranularity value of one.
Note: Despite its name, bufferImageGranularity is really a
granularity between "linear" and "non-linear" resources."
We set this limit to 64 bytes (a cacheline) at the dawn of time, without
any real rationale attached. There shouldn't be any restrictions here.
Our tile sizes are typically 4K, and tiled resource addresses are
aligned to the tile size, and the extent is also a multiple of the tile
sized. So if a linear resource occurs before a tiled one, there will
naturally be some space due to the alignment of the tiled resource's
starting address. If a linear resource occurs after a tiled one, the
tiled resource's ending address is already 4K aligned, which is already
guaranteeing that they won't share a cacheline.
So I think it should be fine to reduce this to 1. The other Vulkan
driver for our hardware seems to advertise 1 here as well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15066>
Only on platforms that support it.
v3: Split out code setting up ray query shadow buffer (Caio)
Don't forget to setup ray query globals even when no shadow buffer
is used (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
The limit here is from the RENDER_SURFACE_STATE height/width/depth
fields - it's 2^30 for ISL_FORMAT_RAW buffers, and 2^27 otherwise.
anv_isl_format_for_descriptor_type() uses ISL_FORMAT_R32G32B32A32_FLOAT
for uniform buffers when compiler->indirect_ubos_use_sampler is set
(Icelake and earlier), but ISL_FORMAT_RAW when it isn't (Tigerlake+).
So we can increase the limit on Tigerlake and later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14892>
The Mesh pipeline is implemented as a variant of the
regular (primitive) Graphics Pipeline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13662>
Per primitive & attachment shading rate support added.
v2: Rebase on KHR_dynamic_rendering
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13739>
On XeHP, CCS doesn't require VMA on XeHP. The HW provides anything
allocated in LMEM a mapping to a CCS memory range for free. So, we:
1) use the implicit CCS framework to avoid adding an image memory
binding for the CCS surface.
2) leave each BO sized as-is instead of adding on space for the CCS.
Thankfully the framework only adds on space if an aux-map is present.
XeHP has no aux-map, so this patch doesn't explicitly do anything for
this.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14431>
this is triggered by mesa's own wsi handling, so stop printing nonsense
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14743>
This change introduces the anv_descriptor_size_for_mutable_type and
anv_descriptor_data_for_mutable_type helpers to compute the size and
data flags respectively for mutable descriptor types.
In order to make handling these types easier we now store a precomputed
descriptor stride for all types and use in the in appropriate places.
We also need to adjust the compiler to take into account this descriptor
stride. To that extent, we now pack the stride into the upper 16 bits
alongside the index and the dynamic offset index and use it later to
compute the correct offset.
Closes: #4250
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14633>
AUX and clear state is stored in the VkDevice private binding
Signed-off-by: Renato Pereyra <renatopereyra@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14416>
With the proper version checking in the common vulkan instance code
(commit 88b9b68) it is now possible to bring the reported interface
version up to v5.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14563>
The chipset_id should be named after i915 ioctl that's called
to get the device id. In user space this field holds pci device
id in reality. We now have a pci_device_id queried from drm
instead using the ioctl, so there is no much reason to keep
the chipset_id for the same purpose.
Signed-off-by: Jianxun Zhang <jianxun.zhang@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13936>
With the new input from PCI bus and device fields, we can compute
device uuids in a multi-gpu system.
Signed-off-by: Jianxun Zhang <jianxun.zhang@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13936>
Engines based contexts operate somewhat different for executing
batches. Previously, we would specify a bitmask value such as
I915_EXEC_RENDER to specify to run the batch on the render ring.
With engines contexts, instead this becomes an array of "engines", and
when the context is created we specify the class and instance of the
engine.
Each index in the array has a separate hardware-context. Previously we
had to create separate kernel level contexts to create multiple
hardware contexts, but now a single kernel context can own multiple
hardware contexts.
Another forward looking advantage to using the engines based contexts
is that the kernel does not plan to add new supported I915_EXEC_FOO
masks, whereas they instead plan to add new I915_ENGINE_CLASS_FOO
engine classes. Therefore some rings may only be usable with an engine
based class.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12692>
Instead of having a bunch of const vk_sync_type for each permutation of
vk_drm_syncobj capabilities, have a vk_drm_syncobj_get_type helper which
auto-detects features. If a driver can't support a feature for some
reason (i915 got timeline support very late, for instance), they can
always mask off feature bits they don't want.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are entirely unused except for the syncobj in submit_simple_batch.
We can use the libdrm wrappers for that as they're basically equivalent
and the core Vulkan sync code already depends on them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This is, unfortunately, a large flag-day mega-commit. However, any
other approach would likely be fragile and involve a lot more churn as
we try to plumb the new vk_fence and vk_semaphore primitives into ANV's
submit code before we delete it all. Instead, we do it all in one go
and accept the consequences.
While this should be mostly functionally equivalent to the previous
code, there is one potential perf-affecting change. The command buffer
chaining optimization no longer works across VkSubmitInfo structs.
Within a single VkSubmitInfo, we will attempt to chain all the command
buffers together but we no longer try to chain across a VkSubmitInfo
boundary. Hopefully, this isn't a significant perf problem. If it ever
is, we'll have to teach the core runtime code how to combine two or more
VkSubmitInfos into a single vk_queue_submit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This effectively partially reverts 13fe43714c ("anv: Add helpers in
anv_allocator for mapping BOs") where we both added helpers and reworked
memory mapping to stash the maps on the BO. The problem comes with
external memory. Due to GEM rules, if a memory object is exported and
then imported or imported twice, we have to deduplicate the anv_bo
struct but, according to Vulkan rules, they are separate VkDeviceMemory
objects. This means we either need to always map whole objects and
reference-count the map or we need to handle maps separately for
separate VkDeviceMemory objects. For now, take the later path.
Fixes: 13fe43714c ("anv: Add helpers in anv_allocator for mapping BOs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5612
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13795>
If we ever want to stop depending on the EXEC_OBJECT_PINNED to detect
when something is pinned (like for VM_BIND), having a helper will reduce
the code churn. This also gives us the opportunity to make it compile
away to true/false when we can figure it out just based on compile-time
GFX_VERx10.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>