This helper is generally useful when trying to prettyprint a 32-bit value, so
make it available to the rest of the tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
The new compression scheme introduced in Xe2 also applies to Xe3, so
we're liable for the same bugs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39953>
Instead of using whatever group was set by the previous
instruction. No behavior change, just normalizes what
we generate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39843>
The `group()` helper creates the new builder "relative" to the existing
one, so this was resulting in some uniform instructions having
a non-zero channel offset ("group") -- which was surprising and had no
practical effect.
Normalize to always use group = 0. No change in behavior expected.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39842>
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.
No shader-db changes on any ELK platform. I suspect the problematic
cases only occur after scheduling has rearranged instructions. This is
likely the reason BRW didn't experience this problem until 09450faf.
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
This is a backport of BRW e26270249b.
shader-db:
All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18623918 -> 18624594 (<.01%)
instructions in affected programs: 125179 -> 125855 (0.54%)
helped: 0 / HURT: 139
total cycles in shared programs: 957073100 -> 957072484 (<.01%)
cycles in affected programs: 16534168 -> 16533552 (<.01%)
helped: 42 / HURT: 68
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.
shader-db:
Lunar Lake
total instructions in shared programs: 17098815 -> 17098818 (<.01%)
instructions in affected programs: 1187 -> 1190 (0.25%)
helped: 0 / HURT: 3
total cycles in shared programs: 876858960 -> 876858968 (<.01%)
cycles in affected programs: 6878 -> 6886 (0.12%)
helped: 0 / HURT: 1
Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20034973 -> 20034984 (<.01%)
instructions in affected programs: 4599 -> 4610 (0.24%)
helped: 0 / HURT: 11
total cycles in shared programs: 881033088 -> 881033108 (<.01%)
cycles in affected programs: 57872 -> 57892 (0.03%)
helped: 0 / HURT: 5
fossil-db:
All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 918873064 -> 918873269 (+0.00%)
CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00%
Totals from 205 (0.01% of 2011421) affected shaders:
Instrs: 290415 -> 290620 (+0.07%)
CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03%
Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02%
Closes: #14874
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
It was already computed in brw_shader::assign_curb_setup() so we can use it
in brw_assign_urb_setup().
There was a mismatch between assign_curb_setup() and brw_assign_urb_setup() when
push_sizes were not multiple of REG_SIZE, the first one was aligning every
push_sizes before sum it, while brw_assign_urb_setup() was only aligning the sum
of all push_size.
By luck the only places that did not had a push_size aligned to REG_SIZE only
had one push_size, so this was not an issue.
So here also fixing this mismatch and adding an assert to caught any future
mismatch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>
Apparently this a performance regression on our CI as opposed to what
the HW documentation recommends.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39991>
We just read this from the NIR and store it in iris_compiled_shader,
there's no reason for the backend compiler to be involved.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
These days, our system value concept is just about iris_program
communicating to iris_state which values to upload into a UBO.
Nowhere in that process is the backend compiler involved, so it
doesn't make sense for there to be brw/elk mechanisms.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
iris needs this, but anv does not, and it's just a small wrapper around
common NIR lowering anyway. This also removes some brw/elk splitting.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
nir_create_passthrough_tcs already validates the result, we don't need
to validate a second time.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
VK_FORMAT_{R8G8B8,B8G8R8}_{UNORM,SRGB} describe a 3-component, 8bpc,
24bpp, format. This is mapped to that type for Android, and implemented
as such by panvk. radv maps these to 4-component/32bpp formats, but only
support these formats for buffers rather than images. The outlier is
ANV, which relies on the 24->32bpp mapping to happen.
The Wayland WSI was mapping this to the 32bpp R8G8B8A8/B8G8R8A8 formats
instead. This would cause a failure to import the dmabuf into the
compositor on panvk, as it would send a buffer which was too small. (Or,
if it did import: garbage.)
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39552>
This commit extracts the third and final variant of function
anv_get_image_format_features2(). It is still a 296-line function, but
that is already significantly smaller than the 444-line behemoth that
anv_get_image_format_features2() was at the start of this patch
series.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
Function anv_get_image_format_features2() has 3 clear subvariants that
take paths independent of each other: one for compressed_emulated
formats, another for depth/stencil formats, and a third one for color
formats. Extract the 2 first subvariatns to their own sub-functions.
We'll extract the color variant in the next commit in order to make
the diff easier to review.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
A 76-line chunk of code just to decide if the format is supported,
let's move it to its own function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
It's redundant information, as it's already part of struct anv_format.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
In elk, we tried to store our own "driver" enum values after Mesa's
VARYING_SLOT_MAX. In brw, we eliminated all of these except for an
unnecessary "BRW_VARYING_SLOT_PAD" value. This was used for empty
slots, so vue_map::slot_to_varying[] could store something. This
patch replaces BRW_VARYING_SLOT_PAD with -1.
Our "driver" enum values overlapped with VARYING_SLOT_PATCH0, leading
to unnecessary headaches. Now gl_varying_slot_name_for_stage will do
the right thing for both regular and patch varyings.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
This drops native support for legacy GL's two-sided color feature
in favor of lowering it via nir_lower_two_sided_color(). Instead
of having a whole bunch of state management hassle to set up the
SBE unit to swizzle between the COL and BFC VUE slots, and have it
transparently deliver one or the other to the fragment shader, we
simply deliver both and insert a conditional select there:
(is-front-facing ? front color : back color)
This also works even for > 16 varyings, where swizzling via the SBE
unit isn't viable.
zink, asahi, freedreno, lima, panfrost, r600, v3d, and vc4 all use
this lowering rather than having native support. Only four games in
our shader-db even use this feature.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
If we have extended bindless surface offset (ExBSO) support, we want to
use it. Consolidate the anv_physical_device and brw_compiler bits into
a single static inline that take devinfo.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
The infrastructure was built-up, and this was updated...a while ago.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
Shorter to use, and also clearer where something more than devinfo
is used from brw_compiler.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
iris decides to do recompiles or not based on its own program keys,
not the brw or elk keys. So, it makes sense to handle the "why did
we have to recompile a new variant" debugging based on those keys as
well. It also unifies the code, eliminating a brw/elk split, so it's
actually less code.
Additionally, this was the only remaining user of the brw code, so we
can delete that, resulting in even larger cleanups.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
This simplifies some iris wrapping for multiple compilers and also
saves some space in the brw_compiler singleton.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
Having the named field allowed us to indicate that our code conditions
are referring to the specific decision about how we handle indirect
UBOs, rather than some other arbitrary hardware change.
Still, there's no need to store this in a singleton struct - we can
easily have a static inline bool that does the devinfo check for us.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
This is always set to true for elk platforms. No need for the option.
crocus also assumes that we take the sampler path. hasvk had support
for both paths (leftover from when the driver still supported Gfx12).
We started using HDC messages for indirect UBO access on Tigerlake
(Gfx12.x) because of cache reworks that made it more viable. On all
prior platforms, we used the sampler because it has additional L1/L2
caches that the dataport lacks. Additionally, Ivybridge and nearby
platforms had notoriously slow L3 access in some very common cases.
Note that we do use the dataport for constant-offset UBO access,
since we can combine many reads into larger block loads.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
The previous code was copying RefPicList0 to RefPicList1 but not updating
num_ref_idx_l1_active_minus1, leaving it potentially uninitialized or zero.
This caused the hardware to see an inconsistent L1 list state.
Accordingly it sets num_ref_idx_active_override_flag if necessary.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884>