Rob added these new helpers a while back, which freedreno and radeonsi
both share. We should use them too. The new helpers use variables and
system value intrinsics, so we can drop the explicit binding table
creation and just use the normal paths.
Because we have to rewrite the system value uploading anyway, we drop
the scrambling of the default tessellation levels on upload, and instead
let the compiler go ahead and remap components like any normal shader.
In theory, this results in more shuffling in the shader. In practice,
we already do MOVs for message setup. In the passthrough shaders I
looked at, this resulted in no extra instructions on Icelake (SIMD8
SINGLE_PATCH) and Tigerlake (8_PATCH). On Haswell, one shader grew by
a single instruction for a pittance of cycles in a stage that isn't a
performance bottleneck anyway. Avoiding remapping wasn't so much of an
optimization as just the way that I originally wrote it. Not worth it.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20809>
Since the dirty range started out as 0..0, you would have 0..VBend as the
new dirty range on the first draw, and if your VB was >32b then you'd
flush every time you used it. Instead, if there's no existing dirty range
then just set it to our new VB's range.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21370>
Since the dirty range started out as 0..0, you would have 0..VBend as the
new dirty range on the first draw, and if your VB was >32b then you'd
flush every time you used it. Instead, if there's no existing dirty range
then just set it to our new VB's range.
Perf results with zink+anv on my CFL:
sauerbraten: +24.8182% +/- 0.602077% (n=5)
portal-2-v2.trace: +4.64289% +/- 0.285285% (n=5)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21370>
VK_LAYER_EXPORT is going away in the next Vulkan header update. We
already have a PUBLIC macro in util/macros.h which does the same thing.
Unlike VK_LAYER_EXPORT, it should work in Windows too.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
These are manual since they're on a runner in my basement that sometimes
can go down, but it'll be nice to have this for throwing the rare hasvk MR
at.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>
Following the organization done in intel/dev and intel/vulkan.
Probably due to some rebase issue we had a duplicated copyright header
in intel_gem_i915.h that is being removed in here too.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21256>
Only code that cares about Vulkan WSI should get the corresponding
arguments passed. Otherwise, the Vulkan headers might end up including
other headers that we don't have the correct dependencies passed for.
So let's give those a dedicated variable, and only pass that where it's
actually needed.
Fixes: b39958a3a1 ("anv,nir: Move the ANV YCbCr lowering pass to common code")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8193
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21185>
"Use 3DSTATE_CONST command for individual shaders instead of
3DSTATE_CONST_ALL COMMAND"
On gen 12.0 platforms, 3DSTATE_CONSTANT_ALL command is not processed
correctly in certain cases.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21301>
This is to avoid out of bound register accesses (potentially leading
to hangs) when the dispatch size is smaller than when is reported in
the NIR subgroup_size.
v2: Implement bounding with a mask (since workgroup sizes are powers of 2) (Faith)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 530de844ef ("intel,anv,iris,crocus: Drop subgroup size from the shader key")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21282>
When mesh shader is spawned on a different slice than the originating
task shader, then input task urb handle can come from a different
slice, so masking this information off will load data from the current
slice, instead of the one where real data are.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21007>
Now it's possible to overflow anv_physical_device.queue.families
and anv_device.decoder.
CID: 1520852
Fixes: 056b0cb87f ("anv: add video engine support in various places")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21278>
From the Haswell PRM Vol. 7, "IEEE Floating Point Mode":
"Single precision (F, Float) denorms are flushed to sign-preserved
zero on input and output of any floating-point mathematical
operation."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
The rounding mode only needs to be set once, because 16-bit floats or
preserving denorms aren't supported for the platforms where vec4 is
used.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21210>
The size in src[2] is in byte and needs to cover any possible data
accessed in src[0] by the indirection. That way the register
allocation is aware of what cannot be spilled for the instruction to
execute on valid data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 70ace2bbcd ("intel/compiler: Implement Task Output and Mesh Input")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21188>