With perfetto, that string is processed later, leading to a
use-after-free.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
Removes the need for emitting 3DSTATE_BINDING_TABLE_POINTER* commands
to make the HW gather push constants.
According to internal pointers, this has been the default behavior on
Gfx11+.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
Blorp emits 3DSTATE_BINDING_TABLE_POINTER_* instructions in 3D mode.
At the moment we're saved by the push constants reemitting the btp but
we'll drop that in the next commit.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
The dEQP optimization in [1] for 1:1 ASTC copies exposed a race
condition where the internal decompression shader reads old data
from the texture cache before the copy finishes.
This patch adds a cache flush to ensure the shader sees the newly
copied ASTC blocks. It also fixes the block extent calculation
to use the destination image metadata.
[1] https://gerrit.khronos.org/c/vk-gl-cts/+/17514
Fixes: dEQP-GLES31.functional.copy_image.compressed.viewclass_astc*
v2: Drop CS_STALL and update the bits order (Lionel).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40060>
The previous commit enabled different command buffers to program the
same 3DSTATE_BINDING_TABLE_POOL_ALLOC instruction even though they
allocated different chunks of binding tables.
Now we can just predicate this programming and skip the stalling,
flushing & invalidation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39527>
We currently allocate 64KiB chunks of binding table pools for each
command buffer and program the 3DSTATE_BINDING_TABLE_POOL_ALLOC
instruction accordingly.
But 3DSTATE_BINDING_TABLE_POINTERS_* instructions can address 2^20
bytes. So it's possible to have 2 command buffers share the same
programming if they just add some offsets to their
3DSTATE_BINDING_TABLE_POINTERS_* programming and round down
3DSTATE_BINDING_TABLE_POOL_ALLOC addresses to 2^20.
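The address split described above can be sketched roughly as follows. This is only an illustration: bt_pool_base(), bt_pool_offset() and the two constants are hypothetical names, not the driver's, and the sketch assumes chunks are 64KiB-aligned so a chunk never straddles a 2^20 boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Range addressable by 3DSTATE_BINDING_TABLE_POINTERS_* (2^20 bytes). */
#define BT_POOL_RANGE (UINT64_C(1) << 20)
/* Per-command-buffer chunk size (64KiB). */
#define BT_CHUNK_SIZE (UINT64_C(64) * 1024)

/* Hypothetical helper: address to program in
 * 3DSTATE_BINDING_TABLE_POOL_ALLOC, rounded down to 2^20. */
static inline uint64_t
bt_pool_base(uint64_t chunk_addr)
{
   return chunk_addr & ~(BT_POOL_RANGE - 1);
}

/* Hypothetical helper: offset a command buffer would add to its
 * 3DSTATE_BINDING_TABLE_POINTERS_* values. */
static inline uint32_t
bt_pool_offset(uint64_t chunk_addr)
{
   uint64_t offset = chunk_addr - bt_pool_base(chunk_addr);

   /* Assumption: the whole chunk fits inside the shared 2^20 window. */
   assert(offset + BT_CHUNK_SIZE <= BT_POOL_RANGE);
   return (uint32_t)offset;
}
```

Two chunks at 0x1230000 and 0x1250000 would both yield a base of 0x1200000 and differ only in their offsets (0x30000 and 0x50000), so they could share the same pool programming.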
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39527>
This helper is generally useful when trying to prettyprint a 32-bit value, so
make it available to the rest of the tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
The new compression scheme introduced in Xe2 also applies to Xe3, so
we're liable for the same bugs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39953>
Instead of using whatever group was set by the previous
instruction. No behavior change, just normalizes what
we generate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39843>
The `group()` helper creates the new builder "relative" to the existing
one, so this was resulting in some uniform instructions having
a non-zero channel offset ("group") -- which was surprising and had no
practical effect.
Normalize to always use group = 0. No change in behavior expected.
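As a rough illustration of why a "relative" helper can leak a non-zero group, here is a simplified model. struct example_builder and both functions are hypothetical stand-ins, not the real builder API, and the width of 1 for uniform instructions is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical, simplified builder: only the fields needed to show how
 * a channel offset ("group") can be inherited from a parent. */
struct example_builder {
   unsigned dispatch_width;
   unsigned group;
};

/* Relative: the new offset is added to whatever the parent already had,
 * so index 0 still inherits the parent's group. */
static struct example_builder
example_group_relative(struct example_builder parent,
                       unsigned width, unsigned index)
{
   struct example_builder b = parent;

   assert(width > 0);
   b.dispatch_width = width;
   b.group = parent.group + index * width;
   return b;
}

/* Normalized: uniform instructions always get group 0, regardless of
 * the parent's offset. */
static struct example_builder
example_uniform(struct example_builder parent)
{
   struct example_builder b = parent;

   b.dispatch_width = 1;
   b.group = 0;
   return b;
}
```

With a parent builder at group 8, example_group_relative(parent, 1, 0) keeps group 8, while example_uniform(parent) yields group 0.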
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39842>
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.
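A scalar model of the hazard, written as plain C rather than compiler IR. The function names are made up for illustration, and floats are used only to keep the arithmetic simple.

```c
#include <assert.h>
#include <stdbool.h>

/* ADD a, -b followed directly by CMP.l a, b: the flag the ADD would
 * produce (result < 0) matches the CMP's result. */
static bool
flags_agree(float a, float b)
{
   bool from_add = (a - b) < 0.0f;   /* conditional modifier on the ADD */
   bool from_cmp = a < b;            /* the original CMP */

   return from_add == from_cmp;
}

/* Same sequence, but a source of the CMP is overwritten between the
 * ADD and the CMP. The ADD saw the old value, the CMP sees the new one. */
static bool
flags_agree_after_write(float a, float b, float new_b)
{
   bool from_add = (a - b) < 0.0f;

   b = new_b;                        /* intervening write to a CMP source */

   bool from_cmp = a < b;

   return from_add == from_cmp;
}
```

For a = 1, b = 2 the two flags agree, but if b is rewritten to 0 before the CMP they no longer do, which is why the propagation has to stop at such a write.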
No shader-db changes on any ELK platform. I suspect the problematic
cases only occur after scheduling has rearranged instructions. This is
likely the reason BRW didn't experience this problem until 09450faf.
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
This is a backport of BRW e26270249b.
shader-db:
All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18623918 -> 18624594 (<.01%)
instructions in affected programs: 125179 -> 125855 (0.54%)
helped: 0 / HURT: 139
total cycles in shared programs: 957073100 -> 957072484 (<.01%)
cycles in affected programs: 16534168 -> 16533552 (<.01%)
helped: 42 / HURT: 68
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.
shader-db:
Lunar Lake
total instructions in shared programs: 17098815 -> 17098818 (<.01%)
instructions in affected programs: 1187 -> 1190 (0.25%)
helped: 0 / HURT: 3
total cycles in shared programs: 876858960 -> 876858968 (<.01%)
cycles in affected programs: 6878 -> 6886 (0.12%)
helped: 0 / HURT: 1
Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20034973 -> 20034984 (<.01%)
instructions in affected programs: 4599 -> 4610 (0.24%)
helped: 0 / HURT: 11
total cycles in shared programs: 881033088 -> 881033108 (<.01%)
cycles in affected programs: 57872 -> 57892 (0.03%)
helped: 0 / HURT: 5
fossil-db:
All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 918873064 -> 918873269 (+0.00%)
CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00%
Totals from 205 (0.01% of 2011421) affected shaders:
Instrs: 290415 -> 290620 (+0.07%)
CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03%
Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02%
Closes: #14874
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
It was already computed in brw_shader::assign_curb_setup() so we can use it
in brw_assign_urb_setup().
There was a mismatch between assign_curb_setup() and brw_assign_urb_setup() when
push_sizes were not a multiple of REG_SIZE: the first one was aligning every
push_size before summing them, while brw_assign_urb_setup() was only aligning
the sum of all push_sizes.
By luck, the only places that did not have a push_size aligned to REG_SIZE only
had one push_size, so this was not an issue.
So this also fixes the mismatch and adds an assert to catch any future
mismatch.
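The difference between the two computations can be shown with a small sketch. The helper names are hypothetical, and REG_SIZE is assumed to be 32 bytes here purely for the example.

```c
#include <assert.h>
#include <stdint.h>

#define REG_SIZE 32u /* assumption for this sketch */

/* Round v up to a power-of-two alignment. */
static inline uint32_t
align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Align every push size, then sum (the assign_curb_setup() behavior). */
static uint32_t
sum_of_aligned(const uint32_t *push_sizes, unsigned count)
{
   uint32_t total = 0;

   for (unsigned i = 0; i < count; i++)
      total += align_u32(push_sizes[i], REG_SIZE);
   return total;
}

/* Sum every push size, then align once (the old brw_assign_urb_setup()
 * behavior). */
static uint32_t
aligned_sum(const uint32_t *push_sizes, unsigned count)
{
   uint32_t total = 0;

   for (unsigned i = 0; i < count; i++)
      total += push_sizes[i];
   return align_u32(total, REG_SIZE);
}
```

With a single unaligned range of 16 bytes both give 32, which is why the mismatch went unnoticed. With two 16-byte ranges the first gives 64 and the second gives 32.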
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>
Apparently this is a performance regression on our CI, as opposed to what
the HW documentation recommends.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39991>
We just read this from the NIR and store it in iris_compiled_shader;
there's no reason for the backend compiler to be involved.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
These days, our system value concept is just about iris_program
communicating to iris_state which values to upload into a UBO.
Nowhere in that process is the backend compiler involved, so it
doesn't make sense for there to be brw/elk mechanisms.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
iris needs this, but anv does not, and it's just a small wrapper around
common NIR lowering anyway. This also removes some brw/elk splitting.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
nir_create_passthrough_tcs already validates the result, so we don't need
to validate a second time.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
VK_FORMAT_{R8G8B8,B8G8R8}_{UNORM,SRGB} describe a 3-component, 8bpc,
24bpp format. This is mapped to that type for Android, and implemented
as such by panvk. radv maps these to 4-component/32bpp formats, but only
supports these formats for buffers rather than images. The outlier is
ANV, which relies on the 24->32bpp mapping to happen.
The Wayland WSI was mapping this to the 32bpp R8G8B8A8/B8G8R8A8 formats
instead. This would cause a failure to import the dmabuf into the
compositor on panvk, as it would send a buffer which was too small. (Or,
if it did import: garbage.)
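To make the size mismatch concrete, here is a back-of-the-envelope sketch. linear_image_size() is a hypothetical helper that assumes a tightly packed linear layout with no stride alignment, which real allocations may not have.

```c
#include <assert.h>
#include <stdint.h>

/* Minimum size in bytes of a tightly packed linear image.
 * Assumption: no row padding or stride alignment is applied. */
static inline uint64_t
linear_image_size(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel > 0);
   return (uint64_t)width * height * bytes_per_pixel;
}
```

For a 1920x1080 image, a 24bpp allocation needs 6220800 bytes, while a compositor told the buffer is 32bpp would expect 8294400 bytes, so the import sees a buffer that is too small.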
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39552>
This commit extracts the third and final variant of function
anv_get_image_format_features2(). It is still a 296-line function, but
that is already significantly smaller than the 444-line behemoth that
anv_get_image_format_features2() was at the start of this patch
series.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
Function anv_get_image_format_features2() has 3 clear subvariants that
take paths independent of each other: one for compressed_emulated
formats, another for depth/stencil formats, and a third one for color
formats. Extract the first 2 subvariants to their own sub-functions.
We'll extract the color variant in the next commit in order to make
the diff easier to review.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
This is a 76-line chunk of code just to decide if the format is supported;
let's move it to its own function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
It's redundant information, as it's already part of struct anv_format.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
In elk, we tried to store our own "driver" enum values after Mesa's
VARYING_SLOT_MAX. In brw, we eliminated all of these except for an
unnecessary "BRW_VARYING_SLOT_PAD" value. This was used for empty
slots, so vue_map::slot_to_varying[] could store something. This
patch replaces BRW_VARYING_SLOT_PAD with -1.
Our "driver" enum values overlapped with VARYING_SLOT_PATCH0, leading
to unnecessary headaches. Now gl_varying_slot_name_for_stage will do
the right thing for both regular and patch varyings.
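A minimal sketch of the idea of marking empty slots with -1. struct example_vue_map and its helpers are hypothetical and far simpler than the real vue_map; only the slot_to_varying[] name comes from the text above.

```c
#include <assert.h>

#define EXAMPLE_MAX_SLOTS 8      /* hypothetical size */
#define EXAMPLE_SLOT_EMPTY (-1)  /* replaces a dedicated "PAD" enum value */

struct example_vue_map {
   signed char slot_to_varying[EXAMPLE_MAX_SLOTS];
   int num_slots;
};

/* Start with every slot marked empty. */
static void
example_vue_map_init(struct example_vue_map *map)
{
   for (int i = 0; i < EXAMPLE_MAX_SLOTS; i++)
      map->slot_to_varying[i] = EXAMPLE_SLOT_EMPTY;
   map->num_slots = 0;
}

/* Assign the next slot to a real varying; returns the slot index. */
static int
example_vue_map_add(struct example_vue_map *map, int varying)
{
   assert(map->num_slots < EXAMPLE_MAX_SLOTS);
   assert(varying >= 0 && varying <= 127);
   map->slot_to_varying[map->num_slots] = (signed char)varying;
   return map->num_slots++;
}

/* Reserve the next slot without a varying; it stays at -1. */
static int
example_vue_map_pad(struct example_vue_map *map)
{
   assert(map->num_slots < EXAMPLE_MAX_SLOTS);
   return map->num_slots++;
}

/* Count allocated slots that hold a real varying. */
static int
example_vue_map_used(const struct example_vue_map *map)
{
   int used = 0;

   for (int i = 0; i < map->num_slots; i++) {
      if (map->slot_to_varying[i] != EXAMPLE_SLOT_EMPTY)
         used++;
   }
   return used;
}
```

Because -1 can never collide with a valid varying slot value, no extra enum entries are needed past the end of the shared enum.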
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
This drops native support for legacy GL's two-sided color feature
in favor of lowering it via nir_lower_two_sided_color(). Instead
of having a whole bunch of state management hassle to set up the
SBE unit to swizzle between the COL and BFC VUE slots, and have it
transparently deliver one or the other to the fragment shader, we
simply deliver both and insert a conditional select there:
(is-front-facing ? front color : back color)
This also works even for > 16 varyings, where swizzling via the SBE
unit isn't viable.
zink, asahi, freedreno, lima, panfrost, r600, v3d, and vc4 all use
this lowering rather than having native support. Only four games in
our shader-db even use this feature.
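The conditional select mentioned above amounts to something like the following, written as plain C rather than NIR. select_two_sided_color() is an illustrative name only, not a function from the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Both the front (COL) and back (BFC) colors are delivered to the
 * fragment shader, which picks one based on facing. */
static void
select_two_sided_color(bool is_front_facing,
                       const float front[4], const float back[4],
                       float out[4])
{
   assert(front != NULL && back != NULL && out != NULL);

   const float *src = is_front_facing ? front : back;

   for (int i = 0; i < 4; i++)
      out[i] = src[i];
}
```

Since the choice happens in the shader, nothing depends on the SBE unit's swizzling limits.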
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
If we have extended bindless surface offset (ExBSO) support, we want to
use it. Consolidate the anv_physical_device and brw_compiler bits into
a single static inline that takes devinfo.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
The infrastructure was built up, and this was updated...a while ago.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
Shorter to use, and also clearer where something more than devinfo
is used from brw_compiler.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>