It's a 76-line chunk of code just to decide whether the format is
supported, so let's move it to its own function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
It's redundant information, as it's already part of struct anv_format.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
In elk, we tried to store our own "driver" enum values after Mesa's
VARYING_SLOT_MAX. In brw, we eliminated all of these except for an
unnecessary "BRW_VARYING_SLOT_PAD" value. This was used for empty
slots, so vue_map::slot_to_varying[] could store something. This
patch replaces BRW_VARYING_SLOT_PAD with -1.
Our "driver" enum values overlapped with VARYING_SLOT_PATCH0, leading
to unnecessary headaches. Now gl_varying_slot_name_for_stage will do
the right thing for both regular and patch varyings.
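As a rough illustration of the -1 convention, here is a minimal sketch
of a slot map where empty slots hold -1 instead of a driver-specific
enum value. The struct, the sizes, and every sketch_* name are
hypothetical stand-ins, not the real vue_map definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sizes; the real limits live in Mesa's headers. */
#define SKETCH_VARYING_SLOT_MAX 64
#define SKETCH_MAX_VUE_SLOTS 16

/* Simplified stand-in for a VUE map (assumption, not the real struct). */
struct sketch_vue_map {
   signed char varying_to_slot[SKETCH_VARYING_SLOT_MAX]; /* -1 = not present */
   signed char slot_to_varying[SKETCH_MAX_VUE_SLOTS];    /* -1 = empty slot */
   int num_slots;
};

static void
sketch_vue_map_init(struct sketch_vue_map *map)
{
   /* Filling signed chars with 0xFF bytes makes every entry read back as -1. */
   memset(map->varying_to_slot, -1, sizeof(map->varying_to_slot));
   memset(map->slot_to_varying, -1, sizeof(map->slot_to_varying));
   map->num_slots = 0;
}

static void
sketch_assign_varying(struct sketch_vue_map *map, int varying, int slot)
{
   assert(varying >= 0 && varying < SKETCH_VARYING_SLOT_MAX);
   assert(slot >= 0 && slot < SKETCH_MAX_VUE_SLOTS);

   map->varying_to_slot[varying] = (signed char)slot;
   map->slot_to_varying[slot] = (signed char)varying;
   if (slot + 1 > map->num_slots)
      map->num_slots = slot + 1;
}

/* An empty slot is any slot with a negative entry; no dedicated
 * PAD enum value is needed, so nothing can collide with real varyings.
 */
static int
sketch_slot_is_empty(const struct sketch_vue_map *map, int slot)
{
   return map->slot_to_varying[slot] < 0;
}
```

Since -1 can never be a valid varying index, a name lookup only has to
deal with real enum values, including the patch ones.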
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
This drops native support for legacy GL's two-sided color feature
in favor of lowering it via nir_lower_two_sided_color(). Instead
of having a whole bunch of state management hassle to set up the
SBE unit to swizzle between the COL and BFC VUE slots, and have it
transparently deliver one or the other to the fragment shader, we
simply deliver both and insert a conditional select there:
   (is-front-facing ? front color : back color)
This also works even for > 16 varyings, where swizzling via the SBE
unit isn't viable.
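To make the select concrete, here is a minimal CPU-side sketch of what
the lowered fragment shader effectively computes. The sketch_vec4 type
and the function name are hypothetical; the real lowering operates on
NIR instructions, not C structs.

```c
#include <stdbool.h>

/* Hypothetical four-component color type used only for this sketch. */
struct sketch_vec4 {
   float x, y, z, w;
};

/* Both colors are delivered to the fragment shader; the shader itself
 * picks one based on the facing of the primitive.
 */
static struct sketch_vec4
sketch_select_two_sided_color(bool is_front_facing,
                              struct sketch_vec4 front_color,
                              struct sketch_vec4 back_color)
{
   return is_front_facing ? front_color : back_color;
}
```

The cost is one extra input and one select per color in the shader, in
exchange for not needing any per-draw SBE state for the swizzle.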
zink, asahi, freedreno, lima, panfrost, r600, v3d, and vc4 all use
this lowering rather than having native support. Only four games in
our shader-db even use this feature.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
Writing a back-face color but not a front-face color is undefined
behavior. We were trying to politely work around potential application
bugs, but this is not required to work, and other drivers don't do it.
Drop the extra complexity.
If we do find a broken application that needs this hack, then a better
way to handle it is to have brw_compute_vue_map set the slot for
VARYING_SLOT_BFC(n) to the slot for VARYING_SLOT_COL(n) when COL(n) is
unwritten. That way, this override is handled at shader compilation
time, and the run-time code can remain simple.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
The "override with a constant" handling appears to take precedence over
the "override with point sprite coordinates" handling. Because we were
overriding undefined inputs to <0, 0, 0, 1>, we needed to avoid this for
sprite coordinates, as they aren't written by a previous stage, but
shouldn't be overridden to zero.
Now that we've dropped that in the previous patch, there's no need to
special case sprite coordinates any longer.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
iris (and i965 before it) tried to politely return <0, 0, 0, 1.0>
as the value of undefined FS inputs. anv, however, just returns the
value of the first FS input attribute. This makes iris match anv's
behavior, eliminating some overrides and simplifying the code.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
We no longer read the VUE header values in the fragment shader, instead
relying on the payload fields. So there's no need to do anything with
them here. (Note that OpenGL's rules for preserving exact values of
layer/viewport built-ins were relaxed a while back, allowing us to use
the payload fields directly. So this code might've been necessary in
the past, but it isn't now.)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
Apple M1 and M2 GPUs are similar enough to use the same deqp-runner
suite. Use "agx2" as suffix to cover GPUs implementing the AGX2 ISA.
This covers at least the GPUs in all M1 and M2 SoCs.
Extend the `renderer_check` to match M2 (G14x) GPUs as well. The
original check already included M1 Pro/Max/Ultra (G13S, G13C and G13D)
erroneously.
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39763>
If we have extended bindless surface offset (ExBSO) support, we want to
use it. Consolidate the anv_physical_device and brw_compiler bits into
a single static inline that takes devinfo.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
The infrastructure was built up, and this was updated...a while ago.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
Shorter to use, and also clearer where something more than devinfo
is used from brw_compiler.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
iris decides to do recompiles or not based on its own program keys,
not the brw or elk keys. So, it makes sense to handle the "why did
we have to recompile a new variant" debugging based on those keys as
well. It also unifies the code, eliminating a brw/elk split, so it's
actually less code.
Additionally, this was the only remaining user of the brw code, so we
can delete that, resulting in even larger cleanups.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
This simplifies some iris wrapping for multiple compilers and also
saves some space in the brw_compiler singleton.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
Having the named field allowed us to indicate that our code conditions
are referring to the specific decision about how we handle indirect
UBOs, rather than some other arbitrary hardware change.
Still, there's no need to store this in a singleton struct - we can
easily have a static inline bool that does the devinfo check for us.
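A helper of that shape could look roughly like the sketch below. The
struct and function names are hypothetical, and the Gfx12 cut-off is an
assumption made for illustration; the real check may look at other
devinfo fields.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the device info struct. */
struct sketch_device_info {
   int ver; /* major hardware generation, e.g. 9, 11, 12 */
};

/* Named helper: call sites still say *why* they check the generation,
 * but nothing has to be stored in a compiler singleton.
 * Assumption: sampler-based indirect UBO loads before Gfx12.
 */
static inline bool
sketch_indirect_ubos_use_sampler(const struct sketch_device_info *devinfo)
{
   return devinfo->ver < 12;
}
```

Call sites keep the self-documenting name that the struct field used to
provide, and the decision is derived from devinfo on demand.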
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
This is always set to true for elk platforms. No need for the option.
crocus also assumes that we take the sampler path. hasvk had support
for both paths (leftover from when the driver still supported Gfx12).
We started using HDC messages for indirect UBO access on Tigerlake
(Gfx12.x) because of cache reworks that made it more viable. On all
prior platforms, we used the sampler because it has additional L1/L2
caches that the dataport lacks. Additionally, Ivybridge and nearby
platforms had notoriously slow L3 access in some very common cases.
Note that we do use the dataport for constant-offset UBO access,
since we can combine many reads into larger block loads.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
This improves scheduling with one side of a divergent branch writing to a
VGPR using VMEM/DS, and the other writing using VALU. At the merge block,
it will properly consider that the VGPR was written by a VMEM/DS.
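One way to picture the merge is a per-register "last written by
VMEM/DS" bit that is OR-ed together from both predecessors. This is
only a sketch of the idea with hypothetical names and a made-up
register count; it is not how the scheduler actually stores its state.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file size for the sketch. */
#define SKETCH_NUM_VGPRS 256
#define SKETCH_WORDS (SKETCH_NUM_VGPRS / 64)

/* One bit per VGPR: set if the last write came from a VMEM/DS instruction. */
struct sketch_reg_state {
   uint64_t mem_written[SKETCH_WORDS];
};

static void
sketch_reset(struct sketch_reg_state *s)
{
   memset(s, 0, sizeof(*s));
}

static void
sketch_write_mem(struct sketch_reg_state *s, unsigned reg)
{
   s->mem_written[reg / 64] |= UINT64_C(1) << (reg % 64);
}

static void
sketch_write_valu(struct sketch_reg_state *s, unsigned reg)
{
   s->mem_written[reg / 64] &= ~(UINT64_C(1) << (reg % 64));
}

static bool
sketch_is_mem_written(const struct sketch_reg_state *s, unsigned reg)
{
   return (s->mem_written[reg / 64] >> (reg % 64)) & 1;
}

/* At a merge block, a register counts as VMEM/DS-written if either
 * predecessor left it in that state (conservative union).
 */
static void
sketch_merge(struct sketch_reg_state *dst,
             const struct sketch_reg_state *a,
             const struct sketch_reg_state *b)
{
   for (unsigned i = 0; i < SKETCH_WORDS; i++)
      dst->mem_written[i] = a->mem_written[i] | b->mem_written[i];
}
```

With a VMEM write on one side and a VALU write on the other, the merged
state still reports the register as VMEM-written, which is the case
described above.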
fossil-db (navi31):
Totals from 1224 (1.53% of 79825) affected shaders:
Instrs: 5264815 -> 5267604 (+0.05%); split: -0.00%, +0.06%
CodeSize: 27406404 -> 27422132 (+0.06%); split: -0.00%, +0.06%
Latency: 48325204 -> 48293975 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8923880 -> 8919191 (-0.05%); split: -0.07%, +0.02%
fossil-db (navi21):
Totals from 1267 (1.59% of 79825) affected shaders:
Instrs: 4628583 -> 4629190 (+0.01%); split: -0.00%, +0.01%
CodeSize: 24974672 -> 24977188 (+0.01%); split: -0.00%, +0.01%
Latency: 45080476 -> 44998120 (-0.18%); split: -0.20%, +0.02%
InvThroughput: 12288202 -> 12269634 (-0.15%); split: -0.16%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
We don't run this code before waitcnt insertion, so this isn't necessary.
This change improves accuracy in these two situations, because the waitcnt
insertion pass is more aware of divergent control flow:
   v0 = valu
   if (divergent) {
      v0 = vmem
   } else {
      use(v0)
   }

   v0 = vmem
   if (divergent) {
      wait vmcnt(0)
   } else {
      wait vmcnt(0)
   }
   use(v0)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>