Before this patch, when updating the indirect clear color, BLORP only
invalidated the texture cache on gfx11. The hardware docs state that
the texture cache invalidation is also needed on gfx12 however. Add
this invalidation for gfx12 and move the fast-clear related cache
invalidations to the drivers for clarity and performance.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5850
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23588>
ISL's state-machine of CCS_D describes full resolves as leaving the aux
buffer in the pass-through state. Hardware doesn't behave this way on
gfx8 however. On that platform, full resolves transition the aux buffer
to the resolved state. This was verified by dumping the CCS before and
after a full resolve on BDW (gfx7 is simply assumed to behave the same).
Ambiguate after resolving to match driver expectations.
Prevents iris from failing piglit's fcc-write-after-clear on BDW with a
future patch which relies on fast-clear encodings being removed after a
resolve. The avoided failure is:
Testing implicit read of partial block UNORM -> SNORM
Probe color at (0,1,0)
Expected: 1.000000 1.000000 1.000000 1.000000
Observed: 0.000000 0.000000 0.000000 0.000000
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
Fixes: 90a39cac87 ("intel/blorp: Emit compute program based on BLORP_BATCH_USE_COMPUTE")
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23934>
Use a struct for various common parameters rather than per stage
structure or arguments to stage specific entrypoints.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Set the default value for "Pixel Position Offset Enable" when emitting
3DSTATE_MULTISAMPLE in the genxml so that we can drop it from blorp
and genX_state.
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23936>
For 32bpc formats, the ICL+ sampler fetches the raw clear color dwords
used for rendering instead of the converted pixel dwords typically used
for sampling. The CLEAR_COLOR struct page documents this for 128bpp
formats, but not for 32bpp and 64bpp formats.
In blorp_copy, map R11G11B10_FLOAT to R8G8B8A8_UINT instead of R32_UINT.
This will cause the sampler to fetch the clear color pixel, allowing
drivers to keep clear color support enabled during copies.
If iris is forced to convert blits to copies, this patch fixes the
following test on gfx12:
dEQP-GLES3.functional.fbo.color.repeated_clear.blit.rbo.r11f_g11f_b10f
At the moment, both iris and anv won't hit this issue outside of
blorp_copy. This is due to the read/write access restrictions they
currently place on texture views that reinterpret the surface format.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8964
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23604>
Rename the isl_aux_usage enum to clarify that it is optional on gfx125.
The new name comes from the Alchemist docs, where the feature is
referred to as "Fast Clear Optimization (FCV)".
The rename was done with this command:
git grep -l "GFX12_CCS_E" | xargs sed -ie "s/GFX12_CCS_E/FCV_CCS_E/g"
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23220>
Via Coccinelle patches
@@
expression a, b, c;
@@
-nir_channels(b, a, (1 << c) - 1)
+nir_trim_vector(b, a, c)
@@
expression a, b, c;
@@
-nir_channels(b, a, BITFIELD_MASK(c))
+nir_trim_vector(b, a, c)
@@
expression a, b;
@@
-nir_channels(b, a, 3)
+nir_trim_vector(b, a, 2)
@@
expression a, b;
@@
-nir_channels(b, a, 7)
+nir_trim_vector(b, a, 3)
Plus a fixup for pointless trimming an immediate in RADV and radeonsi.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>
Via Coccinelle patch:
@@
expression a, b, c;
@@
-a.src = nir_src_for_ssa(b);
-a.src_type = c;
+a = nir_tex_src_for_ssa(c, b);
@@
expression a, b, c;
@@
-a.src_type = c;
-a.src = nir_src_for_ssa(b);
+a = nir_tex_src_for_ssa(c, b);
Plus manual fixups, including...
* a few identity swizzles changed to nir_trim_vector in TTN and prog-to-nir to
fix the Coccinelle-botched formatting, and similarly a pointless nir_channels
* collapsing a now-pointless temp in vtn
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>
This simplifies things a bit. Note that in some cases, the arguments are
swapped, because multiplications are commutative, and nir_fmul_imm only
allows the second operand to be an immediate.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>
This is useful for iris to know what formats will be used for copy
operations.
The new function introduces a couple refactors. It makes use of the
ISL_GFX_VER() macro and it also makes more use of the
isl_surf_usage_is_depth() function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
In blorp_copy, instead of checking if the surface's aux-usage is CCS_E,
check if its format supports CCS_E.
ISL won't report that a surface supports CCS_E if its format doesn't, so
this should strictly widen the scope of surfaces included in this path.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
We will soon update the CCS_E aux-usage check to a CCS_E format check.
Since depth formats support CCS_E on gfx12+, add another check for the
depth usage to prevent depth surfaces from falling into the CCS_E copy
format case.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
Sampling with HiZ is introduced on BDW+. For BLORP copies, instead of
using the depth format when the source uses HiZ, use it for all depth
sampling on BDW+.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:
@@
expression A, B, C, D, E;
@@
-nir_ssa_dest_init(A, B, C, D, E);
+nir_ssa_dest_init(A, B, C, D);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
We can't do fast clear operations on some LODs of 8bpp surfaces. Add an
assertion to BLORP to protect against drivers attempting to do this.
This assertion was successfully hit with some local modifications to
iris and with the piglit test case, "generatemipmap-base-change format".
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7301
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22857>
Implement the ambiguate operation for MCS. This clears MCS layers with a
sample-dependent "uncompressed" value that tells the sampler to go look
at the main surface.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22545>
This ensures that users of libintel_dev.a won't be compiled until
include files are generated, and that they are recompiled when the
header changes.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20825>
This allows us to communicate to the back-end that we don't actually
know if the framebuffer is multisampled or not. No drivers set anything
but ALWAYS/NEVER and we still have a few ALWAYS/NEVER assumptions but
those should be asserted.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
Whenever one of them is BRW_SOMETIMES, we depend on dynamic flag pushed
in as a push constant. In this case, we have to often have to do the
calculation both ways and SEL the result. It's a bit more code but
decouples MSAA from the shader key.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
Patch changes comment to refer to the lineage 14014097488, this
workaround applies for ICL as well.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20952>
We don't use a base workgroup ID for BLOCS. It needs to be lowered, or
else we'll assert fail when compiling the compute shader.
(Note for stable: this patch doesn't fix a bug in 4abdecce22
specifically, but rather is a missing patch that needed to go along with
the rest of MR 20068, on whichever branches it exists on.)
Fixes: 4abdecce22 ("iris: Lower load_base_workgroup_id to zero")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20750>
Because commit b9403b1c47 moved dispatch enable handling away from the
compiler, the drivers must ensure correct dispatch enable values. This
is handled by the intel_set_ps_dispatch_state function.
v2: Fix gfx6 build and use brw_fs_get_dispatch_enables for gfx6 in
crocus
v3: Rebase, use intel_set_ps_dispatch_state, drop gfx6 handling
Fixes: b9403b1c47 ("intel: factor out dispatch PS enabling logic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20267>
This replaces brw_fs_get_dispatch_enables(), which was added in
b9403b1c47 ("intel: factor out dispatch PS enabling logic"), but this
function will not work well for future changes to 3DSTATE_PS.
So, instead, this moves the related code into a "genX" file which can
directly update 3DSTATE_PS for the given platform.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20329>
None of the drivers have used this since we dropped i965, and BLORP
no longer uses it as of the previous commit. We can also drop the
former compressed_multisample_tex_mask (now padding) field so that
things remain 64-bit aligned.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
This will result in us using the TXF_CMS_W message rather than the
TXF_CMS message on Skylake through Tigerlake for 2/4/8x MSAA blits,
which is technically slightly worse. However, it shouldn't be that
much worse: the TXF_CMS message was removed altogether on Alchemist.
iris and anv set key->msaa_16 unconditionally, to avoid paying the
cost of shader recompiles for a miniscule gain. crocus and hasvk
don't need to set it as they don't support 16x MSAA. BLORP already
recompiles based on the sample count, so it could easily keep doing
this for the minor benefit. But avoiding it will let us drop the
entire msaa_16 key field out of the compiler, which is nice.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
The compiler looks at this key field to determine whether to perform
an MCS fetch for a txf_ms or samples_identical texture message, if a
nir_tex_src_ms_mcs_intel source wasn't provided. If it isn't set,
it instead uses constant 0 (nothing is compressed).
All of the drivers (iris, crocus, anv, hasvk) unconditionally set this
to ~0 because we don't want to pay for costly shader recompiles (which
can cause nasty stuttering). Most textures are compressed anyway, and
the hardware ignores the l2dms MCS parameter if MCS is disabled.
The only user was BLORP, which sets the key field based on whether the
texture's aux usage has MCS. But if it has MCS, it also does the MCS
fetch itself and supplies it directly. Otherwise, it relies on the
compiler to fill in the 0 value. But it could easily just provide the
0 value itself in that case and not rely on the compiler at all.
With that fixed, we can just drop the key fields entirely. We leave
them as padding for now to avoid repacking structures; we won't need
to after the next commits anyway.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>