So that we can provide that information to WSI if it asks for it
immediately.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19224>
We can't have streamout and mesh enabled at the same time.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ef04caea9b ("anv: Implement Mesh Shading pipeline")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19323>
On Q2RTX shaders :
Instructions in all programs: 31039 -> 26150 (-15.8%)
SENDs in all programs: 1587 -> 1148 (-27.7%)
Loops in all programs: 4 -> 4 (+0.0%)
Cycles in all programs: 420218 -> 392179 (-6.7%)
Spills in all programs: 157 -> 132 (-15.9%)
Fills in all programs: 337 -> 262 (-22.3%)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556>
None of the platforms supported by this driver supports local memory.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19240>
Integer div lowering can potentially create a lot of code that is
not removed later on. Running const lowering pass first can be used
to eliminate that code.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19157>
This file is autogenerated by python's ply package and is used during
compilation to cache tables built by it for quicker parsing.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19153>
That was previously listed on the getopt_long struct but not actually
being used. This makes intel_clc argument processing easier as now
all of its arguments are handled with getopt and anything after the
special argument '--' is passed along to clang to form the final build
command.
Thanks to Dylan Baker for help with changes to the meson file.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19153>
intel_clc relies on the special argument '--' for getopt to be given so
it knows when to start expecting purely input files or clang arguments.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19153>
The info keyword was using the same short description that
was listed for input files on the struct for long_options.
Rewording it to 'v' and 'verbose' to be more in line with
expectations.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19153>
VK_PIPELINE_CREATE_RAY_TRACING_SKIP_AABBS_BIT_KHR and
VK_PIPELINE_CREATE_RAY_TRACING_SKIP_TRIANGLES_BIT_KHR, when specified,
make TraceRay behave as if the corresponding shader flags were set, but
without affecting the value of IncomingRayFlags in shaders.
v2 (Lionel): Improve comments
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19152>
This title is trying to create a buffer view with usage not supported by Anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17972>
It seems that's what the reference renderer in the CTS expects for
Vulkan. This mostly matters if the edges of a point primitive fall
exactly on a pixel sampling point.
Fixes some upcoming tests under
dEQP-VK.pipeline.monolithic.depth.format.*.point_list*
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19026>
It seems that's what the reference renderer in the CTS expects for
Vulkan. This mostly matters if the edges of a point primitive fall
exactly on a pixel sampling point.
Fixes some upcoming tests under
dEQP-VK.pipeline.monolithic.depth.format.*.point_list*
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19026>
Fixes crashes on almost every CTS test that uses raytracing pipelines.
Fixes: ff91c5ca42 ("anv: add analysis for push descriptor uses and store it in shader cache")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19130>
This ones were left to be done after initial conversion.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18975>
This convertions were missed due to bad rebased in my end, sorry.
Fixes: 03b959286e ("intel: Make engine related functions and types not i915 dependent")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18975>
Zink on Anv running Gfxbench gl_driver2 is significantly slower than
Iris.
The reason is simple, whereas Iris implements uniform updates using
push constants and only has to emit 3DSTATE_CONSTANT_* packets, Zink
uses push descriptors with a uniform buffer, which on our
implementation use both push constants & binding tables.
Anv ends up doing the following for each uniform update :
- allocate 2 surface states :
- one for the uniform buffer as the offset specify by zink
- one for the descriptor set buffer
- pack the 2 RENDER_SURFACE_STATE
- re-emit binding tables
- re-emit push constants
Of all of those operations, only the last one ends up being useful in
this benchmark because all the uniforms have been promoted to push
constants.
This change defers the 3 first operations at draw time and executes
them only if the pipeline needs them.
Vkoverhead before / after :
descriptor_template_1ubo_push: 40670 / 85786
descriptor_template_12ubo_push: 4050 / 13820
descriptor_template_1combined_sampler_push, 34410 / 34043
descriptor_template_16combined_sampler_push, 2746 / 2711
descriptor_template_1sampled_image_push, 34765 / 34089
descriptor_template_16sampled_image_push, 2794 / 2649
descriptor_template_1texelbuffer_push, 108537 / 111342
descriptor_template_16texelbuffer_push, 20619 / 20166
descriptor_template_1ssbo_push, 41506 / 85976
descriptor_template_8ssbo_push, 6036 / 18703
descriptor_template_1image_push, 88932 / 89610
descriptor_template_16image_push, 20937 / 20959
descriptor_template_1imagebuffer_push, 108407 / 113240
descriptor_template_16imagebuffer_push, 32661 / 34651
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
We'll use this information to avoid :
- binding table emission
- allocation of surface states
v2: Fix anv_nir_push_desc_ubo_fully_promoted()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
Some of the surface state packing functions are called from the hot
path in Anv. We can use function pointers to avoid repeatedly going
through switch/case.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
VUID-VkPipelineLayoutCreateInfo-pSetLayouts-00293
pSetLayouts must not contain more than one descriptor set layout
that was created with
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR set
There is only one push descriptor set with all the descriptor sets, so
no need to have an array.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
We can avoid reemitting this when the index buffer index type doesn't
change.
Also we don't need to update this when the pipeline changes as we do
not pull any value from the pipeline. Instead rely on the dynamic
state to tell if dyn->ia.primitive_restart_enable changed.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
Avoids a bunch of checks if we can.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
This code shows up a little on profiling on Gfx12 and since it's only
a gfx8/9 workaround we might as well ifdef it out.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
Setting the NIR options takes care of iris thanks to the common st/mesa
linking code, and updating brw_nir_link_shaders should handle anv.
The main effort here is updating remap_tess_levels, which needs to
handle vector stores, writemasking, and swizzling. Unfortunately,
we also need to continue handling the existing single-component
access because it's used for TES inputs, which we don't vectorize.
We could try to vectorize TES inputs too, but they're all pushed
anyway, so it wouldn't buy us much other than deleting this code.
Also, we do have opt_combine_stores, but not one for loads.
One limitation of using nir_vectorize_tess_levels is that it works
on variables, and so isn't able to combine outer/inner writes that
happen to live in the same vec4 slot (for triangle domains). That
said, it's still better than before.
For writes, we allow the intrinsics to supply up to the full size
of the variable (vec4 for outer, vec2 for inner) even if the domain
only requires a subset of those components (i.e. triangles needs 3).
shader-db results on Icelake:
total instructions in shared programs: 19600314 -> 19597528 (-0.01%)
instructions in affected programs: 65338 -> 62552 (-4.26%)
helped: 271 / HURT: 0
helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12
helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59%
95% mean confidence interval for instructions value: -10.71 -9.85
95% mean confidence interval for instructions %-change: -6.17% -5.43%
Instructions are helped.
total cycles in shared programs: 851842332 -> 851808165 (<.01%)
cycles in affected programs: 618577 -> 584410 (-5.52%)
helped: 271 / HURT: 0
helped stats (abs) min: 64 max: 540 x̄: 126.08 x̃: 111
helped stats (rel) min: 2.57% max: 37.97% x̄: 6.12% x̃: 5.06%
95% mean confidence interval for cycles value: -135.35 -116.80
95% mean confidence interval for cycles %-change: -6.67% -5.57%
Cycles are helped.
total sends in shared programs: 1025238 -> 1024308 (-0.09%)
sends in affected programs: 6454 -> 5524 (-14.41%)
helped: 271 / HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4
helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39%
95% mean confidence interval for sends value: -3.57 -3.29
95% mean confidence interval for sends %-change: -15.42% -14.54%
Sends are helped.
According to Felix DeGrood, this results in a 10% improvement in
the draw call time for certain draw calls from Strange Brigade.
v2: Fix assertions about number of components and add more of them.
Combine the quads and triangles handling as it's nearly identical.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19061>
So we have less stuff in the global namespace
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>