On Intel HW we use the same mechanism for internal operations surfaces
as well as application surfaces (VkDescriptor).
This change splits the surface pool in 2, one part dedicated to
internal allocations, the other to application VkDescriptors.
To do so, the STATE_BASE_ADDRESS::SurfaceStateBaseAddress points to a
4Gb area, with the following layout :
- 1Gb of binding table pool
- 2Gb of internal surface states
- 1Gb of bindless surface states
That way any entry from the binding table can refer to both internal &
bindless surface states but none of the driver allocations interfere
with the allocation of the application.
Based off a change from Sviatoslav Peleshko.
v2: Allocate image view null surface state from bindless heap (Sviatoslav)
Removed debug stuff (Sviatoslav)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7110
Cc: mesa-stable
Tested-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19275>
These are only used for storage-compatible compressed surfaces on
Broadwell and earlier and Stencil on Gfx7 where there isn't proper
stencil sampling support.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18402>
Implement Wa_1508744258:
Disable RHWO by setting 0x7010[14] by default except during resolve
pass.
Disable the RCC RHWO optimization at all times except when resolving
single sampled color surfaces.
v2: Move stalling to genX(cmd_buffer_apply_pipe_flushes) for clarity (Mark)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19450>
Zink on Anv running Gfxbench gl_driver2 is significantly slower than
Iris.
The reason is simple, whereas Iris implements uniform updates using
push constants and only has to emit 3DSTATE_CONSTANT_* packets, Zink
uses push descriptors with a uniform buffer, which on our
implementation use both push constants & binding tables.
Anv ends up doing the following for each uniform update :
- allocate 2 surface states :
- one for the uniform buffer as the offset specify by zink
- one for the descriptor set buffer
- pack the 2 RENDER_SURFACE_STATE
- re-emit binding tables
- re-emit push constants
Of all of those operations, only the last one ends up being useful in
this benchmark because all the uniforms have been promoted to push
constants.
This change defers the 3 first operations at draw time and executes
them only if the pipeline needs them.
Vkoverhead before / after :
descriptor_template_1ubo_push: 40670 / 85786
descriptor_template_12ubo_push: 4050 / 13820
descriptor_template_1combined_sampler_push, 34410 / 34043
descriptor_template_16combined_sampler_push, 2746 / 2711
descriptor_template_1sampled_image_push, 34765 / 34089
descriptor_template_16sampled_image_push, 2794 / 2649
descriptor_template_1texelbuffer_push, 108537 / 111342
descriptor_template_16texelbuffer_push, 20619 / 20166
descriptor_template_1ssbo_push, 41506 / 85976
descriptor_template_8ssbo_push, 6036 / 18703
descriptor_template_1image_push, 88932 / 89610
descriptor_template_16image_push, 20937 / 20959
descriptor_template_1imagebuffer_push, 108407 / 113240
descriptor_template_16imagebuffer_push, 32661 / 34651
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
We can avoid reemitting this when the index buffer index type doesn't
change.
Also we don't need to update this when the pipeline changes as we do
not pull any value from the pipeline. Instead rely on the dynamic
state to tell if dyn->ia.primitive_restart_enable changed.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
Avoids a bunch of checks if we can.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
This code shows up a little on profiling on Gfx12 and since it's only
a gfx8/9 workaround we might as well ifdef it out.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
This change implements 3 states in one go:
- depth clamp enable
- depth clip enable
- depth clip negative one to one
This affects following packets:
3DSTATE_CLIP
3DSTATE_VIEWPORT_STATE_POINTERS_CC
3DSTATE_RASTER
v2: remove clip enable bit check from viewport emit (Lionel)
v3: use helper function from runtime to get depth clip (Lionel)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18879>
Remove 'polygon_mode' from pipeline and read it from
dynamic state instead.
This affects following packets:
3DSTATE_CLIP
3DSTATE_RASTER
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18879>
On DG2 the HW will fetch the binding entries into the cache
for every single thread when a compute walker is dispatched,
wiping out the advantages of the cache prefetch.
The spec also advises to not do a cache prefetch when we have more than
31 binding table entries, but most real world applications will never
hit that limit.
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18498>
Things are a bit confusing because we use the term "scratch" for 2
different things :
* the buffer for register allocation spilling
* the buffer for storing live values between splitted shaders around shader calls
Here we're fixing the missing register allocation spilling buffer.
v2: update comments (Caio)
fix scratch bo size computation with pipeline libraries (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>
After each 3DPRIMITIVE, we need to send a dummy post sync op if point or
line list was used or if had only 1 or 2 vertices per primitive.
v2: add missing _3DPRIM_POINTLIST_BF (Lionel)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18746>
Field is ignored on BDW+, 3DSTATE_VF_TOPOLOGY is used to set topology.
We still want to preserve topology information in state because
of other upcoming changes that require it.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18698>
Intel has different Z interpolation float point rounding
than other mesa gpus
For example gl_Position.z = 0.0 will be interpolated to
gl_FragCoord.z = 0.5 for all gpus
gl_FragCoord = -0.00000001 will be interpolated to
gl_FragCoord.z = 0.4999999702 for Intel
and rounded to gl_FragCoord.z = 0.5 for other gpus
Games with LEQUAL depth func will fail depth test on Intel
and will pass it on other gpus in such case
This workaround lowers translated depth range
and several gl_FragCoord.z coords with extra small difference
will be translated to the same UINT16\UINT24\UINT32
value of an integer depth buffer
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7199
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18412>
We don't need the relocation offsets anymore, and just want to pin the
BO, and combine the address into a uint64_t. We can just open code
those two things; it's actually less code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18208>
We don't need the offset to write a relocation at any longer, so all
it does is call anv_reloc_list_add_bo() at this point. Just use that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18208>
We now always softpin and use the load_global_constant case, so there's
no need to set up a UBO for NIR constants.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18208>
This has basically identical semantics to the pseudo-ext enum we were
using before. Also, now that it's in the actual Vulkan enum, we can get
rid of all the #pragma garbage to avoid compiler warnings.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18084>
Each logical device can point to its physical device intel_device_info
saving at least one intel_device_info.
This also allow us to set 'const' to avoid values in intel_device_info
being changed by mistake.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17897>
With the switch to common dynamic state tracking, something got lost
that made the scissors not always be emitted when they are not dynamic
and the pipeline is marked dirty.
Since both viewport and scissors make use of each other to calculate
their values, just stick the scissor emit in the same if block as
viewport for now.
I'd rather have them decoupled, and at least the Vulkan CTS didn't
complain when I tried it, but I don't know what other effects that
may have, especially when it comes to the guardband.
Fixes a bunch of tests under
dEQP-VK.pipeline.*.multisample.misc.*
Fixes: 7d25c04236 ("anv: Switch to using common dynamic state tracking")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17964>
v2: Drop ANV_PIPE_UNTYPED_DATAPORT_CACHE_FLUSH_BIT from invalidate bits (Lionel)
Add utrace support
Expand on comment about PIPE_CONTROL::UntypedDataPortCache
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
Set write back L1 cache policy in STATE_BASE_ADDRESS instruction for A64
messages.
v2: Also set the value in genX_state.c (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
We're missing a condition that is currently papered over by having
ANV_PIPE_HDC_PIPELINE_FLUSH_BIT in the invalidate bits.
v2: rework with simplication (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
Wa_14010455700 is dependent on the format and sample count, but our
code to track whether or not it had been applied was only dependent on
the format.
As a result, we failed to enable the workaround when an app used a D16
2xMSAA buffer, then a D16 1xMSAA buffer right afterwards.
Make the workaround tracking code sample-dependent to fix this.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17859>