These were removed with Icelake. While they technically still exist on
Skylake, which this compiler supports, we have never used these opcodes
in the 14 years we could have done so. So just scrap them.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29665>
This is a performance tuning value, recommended value is 512 on DG2.
On DG2 this was in the privileged register RT_CTRL.
Minor CFE_STATE defintion fixes from Jose.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>
So we can enable each mask individually when programming registers.
Also setting Mask2/mask of the second double word so all registers in
it are also zeored during state init.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>
GFX versions older than GFX 20 have 'Thread Preemption disable' while
GFX 20 has 'Thread Preemption' with value flipped in compute walker
instruction.
So here by default enabling thread preemption, only disabling it
when BTD mode is enabled as instructed in Wa_14017794102.
Similar for 3DSTATE_BTD, enabling preemption by default and
only disabling when platform is affected by Wa_14017794102.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>
This will let us reuse the bulk of this code in a new copy propagation
pass without replicating it. We retain a wrapper function for dealing
with ACP entries, which the new pass won't have.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
This was for Sandybridge's IF with embedded comparison, which only
existed for a single generation of hardware. Since the compiler fork,
we no longer support Sandybridge here.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
We were missing the following "newer" fields:
- ex_desc
- predicate_trivial
- sdepth
- rcount
- writes_accumulator
- no_dd_clear
- no_dd_check
- check_tdr
- send_is_volatile
- send_ex_desc_scratch
- send_ex_bso
- last_rt
- keep_payload_trailing_zeroes
- has_packed_lod_ai_src
We can actually just check ex_desc and the new "bits" union to handle
most of them with fewer checks.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
I want to be able to hash an fs_reg, including all the brw_reg fields.
It's easiest to do this if I can use the "bits" union field that
incorporates many of the other ones.
We also move the using declaration for "nr" down because that field was
moved to the second section a while back.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
Pass in the nir_src and check if it's constant, handling it via CPU-side
arithmetic instead of emitting instructions. While we can constant fold
these via our optimization passes, we have to do opt_algebraic to fold
the binary operation with constant sources into a MOV of an immediate,
then opt_copy_propagation to put it in the next expression, and so on,
until the entire expression is folded. This can take several iterations
of the optimization loop, which is inefficient.
For example, gfxbench5/aztec-ruins/normal/7 has load/store_scratch
intrinsics with constant sources, and this patch removes a number of
optimization passes according to INTEL_DEBUG=optimizer.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
Geometry shaders write outputs multiple times, with EmitVertex()
between them. The value of output variables becomes undefined after
calling EmitVertex(), so we don't need to preserve those. This lets
us recreate new registers after each EmitVertex(), assuming we aren't
in control flow, allowing them to have separate live ranges. It also
means that those registers are more likely to be written once, rather
than having multiple writes, which can make optimization easier.
This is pretty much a total hack, but it's helpful.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
The restriction on depth aux mode is gone on Xe2 in spec.
Fix: piglit
arb_post_depth_coverage-multisampling -auto -fbo
isl_surface_state.c:723: isl_gfx20_surf_fill_state_s:
Assertion `info->surf->samples == 1' failed.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29274>
There doesn't seem to be a restriction on the mentioned data types on
LNL anymore. Default to a maximum exec size of SIMD16.
This patch fixes dEQP-VK.subgroups.shuffle.framebuffer.* on LNL
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29504>
In order to support CCS, ISL may upgrade a main surface from Tile4 to
Tile64 with miptails disabled. To avoid using this space consuming
layout when not needed, inform ISL as soon as possible that compression
won't be used.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>
Prevents the next patch from causing the following assert failure:
Test case 'dEQP-VK.ycbcr.copy.g8_b8_r8_3plane_420_unorm.g8_b8_r8_3plane_444_unorm.linear_linear_disjoint'..
deqp-vk: ../../src/intel/vulkan/anv_private.h:4962: anv_aspect_to_plane: Assertion `!(aspect & ~all_aspects)' failed.
We still disable CCS for multiplane formats elsewhere. I've attempted
enabling CCS for those cases but end up with failures in CI that I
cannot reproduce locally. Hopefully this change gets the next person a
step closer towards enabling this feature.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>
Since 3beaaa9ae8 ("anv: drop lowered storage images code"), this
function has not used the VkImageUsageFlags parameter. So, we can drop
it and simplify its callers.
This function isn't used in hasvk.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>
Instead of passing isl_extra_usage_flags to
add_aux_surface_if_supported, use the isl_surf::usage field of the
primary surface to check for ISL_SURF_USAGE_DISABLE_AUX_BIT.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>
In the import paths in iris, there are several cases where surface VMAs
are created without relying on the calculated surface alignment.
Asserting the alignments of surface addresses, should help catch any
cases where we end up with the wrong alignment.
This found a couple issues during development. One which required a
change to existing code is that when creating uncompressed surfaces from
compressed ones, ISL will sometimes increase the image alignment as a
result of the new format supporting CCS. This patch adds the usage flag
to disable that behavior.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>
Add and use two new surf usage bits:
* ISL_SURF_USAGE_MULTI_ENGINE_SEQ_BIT: the surface may be accessed by
multiple engines, but not in parallel.
* ISL_SURF_USAGE_MULTI_ENGINE_PAR_BIT: the surface may be accessed by
multiple engines in parallel.
Both usages are not concerned with read-after-read access patterns.
Using these bits allows ISL to conditionally use Tile64 or a 64KB
alignment to account for the gfx12.5 CCS WA from HSD 22015614752. Apart
from the potential space savings, there are three benefits of this
approach:
1) CCS can now be used with miptails (though nothing makes use of this
today).
2) CCS can now be used with 3D depth/stencil surfaces in GL.
3) CCS can now be used with 3D depth/stencil surfaces in Vulkan when
apps only use a single queue.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11111
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11117
Tested-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>
In iris, use the CCS scale down factor to calculate the impact of CCS on
TBIMR tile sizes. Even though we fall back to a seemingly less accurate
method to calculate the impact of CCS, it ends up giving the same
answer, 1bpp. Anv already uses this factor, so this patch replaces the
constant with this macro.
There are two benefits to doing this:
1) Consistency between anv and iris.
2) Preparation for a future where we no longer use ISL surfaces to
describe CCS on Xe+. In fact, in iris, we already don't create such
surfaces on ACM.
I considered using INTEL_AUX_MAP_MAIN_SIZE_SCALEDOWN for the calculation
in both drivers, but the naming is aux-map specific and the scaledown
actually exists on flat-ccs platforms as well.
So, we introduce a new macro for all Xe platforms, currently only used
for the specific use case of TBIMR calculations. We can add more such
macros for future platforms, as needed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>
Introduce a macro so that drivers don't need to rely on the isl_surf
struct to determine the size of the CCS buffer on gfx12.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>
Introduce a macro defining the alignment which aux data start addresses
should have. This alignment is for the worst case of the CCS buffer
being included in a dmabuf. Although a smaller alignment is possible for
non-dmabuf cases on TGL, no drivers would make use of that today as they
place CCS surfaces directly after tiled surfaces.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>
Introduce a macro so that drivers don't need to rely on the isl_surf
struct to determine the pitch of the CCS buffer on gfx12. This is useful
during layout queries of dmabufs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>
Before this patch, we special-cased the clear color plane for layout
queries. This was because that plane lacks an ISL surface whereas all
others have one. We plan to drop the ISL surface for CCS buffers on
gfx12 in a future commit. So, in preparation, generalize the clear color
plane code to work for every plane queried on a surface that uses
modifiers.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>
At the interfaces which query the pitch of the clear color plane in GL
and Vulkan, we've been returning 64B for various reasons. Unify the
rationale under a macro.
The documentation for the macro is picked from anv, which reflects the
most recently synchronized copy of drm_fourcc.h. See the notable changes
at 8cd8f3d697.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>
Our implementation is a no-op for the following reasons :
- ISL always tries to go for the smallest tiling mode (see
isl_surf_choose_tiling())
- In the few cases where we need to use Tile64 for compression
workarounds, VK_MESA_image_alignment_control doesn't require use
to disable compression
- vkd3d-proton has the ability to disable compression using
VK_EXT_image_compression_control, disabling Tile64 requirements
and ensuring ISL can select a 4k tiling mode
So vkd3d-proton should always be able to get a 4k tiling mode if it
wants to.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29175>
Since a42a5bf87e, we've been closing the file descriptor immediately
after loading the devinfo struct.
intel_get_and_print_hwconfig_table() re-queries the hwconfig info from
the device to print out all the entries, so we need to leave the fd
open for this use. I moved the close() call to all paths which exit
the for loop's current iteration.
Ref: a42a5bf87e ("intel/devinfo: add an option to pick platform to print")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29549>
For 16-bit data type, we are padding 16-bit and using 32-bit data type,
so we need to account for the padded portion while calculating the
size_written.
Rework: (Rohan)
- Drop unnecessary fs_builder instance
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29271>
For 16-bit data type, we are padding 16-bit and using 32-bit data type,
so we need to account for the padded portion while calculating the
size_written.
Rework: (Rohan)
- Drop unnecessary fs_builder instance
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29271>
Removed mention of Wa_* when referencing an intended harware behavior
since version 12. This will prevent the erroneous usage of the
`intel_needs_workaround` in the future.
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29559>
Using the wrong type truncate the top bits of the pipeline flags.
Currently we don't have any bit in the top bits so not fixing any bug,
but in the future new extension could add some.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 688bb37552 ("anv: deal with new pipeline flags")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29553>
I'd like to phase out the ISL surface representation of CCS on gfx120 in
order to enable CCS without a 512B-aligned main surface pitch. Remove
the dependency on CCS ISL surfaces when fast-clearing to move drivers
one step towards that goal.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28536>
The vertical alignment of the fast-clear rectangle shrinks as the
bits-per-block of the CCS format increases.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28536>
WA 14018283232 indicates that we need to emit the resource barrier
when the following expression toggles value :
STATE_DEPTH_BOUNDS::depthboundstestenable & 3DSTATE_PS_EXTRA:: Pixel Shader Kills Pixel & 3DSTATE_PS_EXTRA:: Pixel Shader Valid
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29297>