Select a horizontal alignment value that matches the main MSAA surface.
We need a valid horizontal alignment to perform MCS ambiguates. The
halign value doesn't actually affect test behavior, but it is validated
by isl_surf_fill_state. We currently have an invalid halign for gfx125.
This patch fixes that.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22545>
With the shader cache enabled, intel_clc attempts to write to ~/.cache.
Many distributions' build systems limit file-system access, and will
kill the process thus causing the build to fail.
Fixes: 639665053f ("anv/grl: Build OpenCL kernels")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22968>
anv_ahb_format_for_vk_format needs to know the format at least. There
is no guarantee that AHardwareBuffer_allocate will succeed, but we are
reluctant to check with AHardwareBuffer_isSupported which may
test-allocate internally and is expensive.
v2: add anv_ahb_format_for_vk_format to anv_android_stubs.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22619>
When allocating a VkDeviceMemory exportable as AHB, it seems incorrect
to fall back to AHARDWAREBUFFER_FORMAT_BLOB when the image has no known
AHB format. We should fail the allocation instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22619>
Due the lack of APIs to set mmap modes, Xe KMD can't support the same
memory types as i915.
So here adding a i915 and Xe function to set memory types supported
by each KMD.
Iris function iris_xe_bo_flags_to_mmap_mode() has a table with all the
mmaps modes of each type of placement.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22906>
We're missing it for the memcpy with streamout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5cc4075f95 ("anv, iris: Add Wa_16011411144 for DG2")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22930>
This reverts commit 5489033fa8.
The problem I was trying to address is that we were programming the
3DSTATE_PS::PositionXYOffsetSelect bit differently with GPL (CENTROID)
than without (NONE).
I failed to understand that this bit also impacts the thread payload
layout. GPL fragment shaders don't know ahead of time if pos_offset is
going to be used. It'll be choosen at runtime base on push constant
bits. So we need to program this bit different just to have a payload
matching the compiled shader code.
This fixes the freedoom replay with GPL FS shader in SIMD32.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22938>
This is necessary to make anv work with clients like VA-API which relies
on implicit fencing only. The bahavior matches iris i915_batch_submit.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22937>
Since we are disabling mesh, which has issues with gpl, enable gpl by
default now, leaving the renamed environment variable as a way to
disable it for debug purposes.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22910>
We are seeing frequent hangs in other workloads when something using
mesh shaders runs at the same time, so gate the feature behind an
environment variable until we figure out what's going on.
v2: (Sagar)
- Give the mesh enabled variable a more descriptive name
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22910>
This can generate things like fneg! of load_const, which is silly.
Fold those away into an actual constant. Only do so on the scalar
backend because there's a comment above that the vec4 backend doesn't
want any new constants this late, and I'm inclined to believe it.
fossil-db stats show a very minor improvement:
Totals:
Instrs: 203091223 -> 203091099 (-0.00%); split: -0.00%, +0.00%
Cycles: 14410638075 -> 14410577067 (-0.00%); split: -0.00%, +0.00%
Totals from 20 (0.00% of 665070) affected shaders:
Instrs: 27067 -> 26943 (-0.46%); split: -0.47%, +0.01%
Cycles: 2687958 -> 2626950 (-2.27%); split: -2.27%, +0.00%
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22881>
Matching the original %016lx, and the "0x" prefix which is still
in the format string.
Fixes: 53b77a8102 ("anv: remove 48bit address space checks")
Signed-off-by: Christopher Snowhill <kode54@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22882>
Commit bb8e31b7ed ("anv: avoid hardcoding instruction VA constant in
shaders") had a slight negative impact on shaders (Red Dead Redemption
2 in particular). Dropping a few shaders from SIMD32 to SIMD16.
With this change, it brings back all the dropped SIMD32 shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22872>
We're about to manipulate these pools and dealing with the fix address
ranges is painful.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22847>
We want to add more heaps in the future and so not having to do
address checks to find out in what heap to release a BO is convinient.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22847>
All the supported platforms should have 36+ bits of virtual address
space.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22847>
These asserts were checking isl_format_layout against itself, change
to compare surface format layout against view format layout.
Fixes: 628bfaf1c6 ("intel/isl: Add some sanity checks for compressed surfaces")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22790>
With the following test :
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_out_of_bounds_load
There is a :
shader_start:
... <- no control flow
g0 = some_alu
g1 = fbl
g2 = broadcast g3, g1
g4 = get_buffer_size g2
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
eliminate_find_live_channel will remove the fbl/broadcast because it
assumes lane0 is active at get_buffer_size :
shader_start:
... <- no control flow
g0 = some_alu
g4 = get_buffer_size g0
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
But then the instruction scheduler will move the get_buffer_size after
the halt :
shader_start:
... <- no control flow
halt <- on some lanes
g0 = some_alu
g4 = get_buffer_size g0
g5 = send <surface>, g4
get_buffer_size pulls the surface index from lane0 in g0 which could
have been turned off by the halt and we end up accessing an invalid
surface handle.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20765>
This value takes a few instructions to create, involving expanding
V-immediates, adding 8 for SIMD16, and so on. We can mark it UNDEF
so that it's clear that although these are partial writes, we are
actually defining the entire value.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22835>
Comparisons which produce 32-bit boolean results (0 or 0xFFFFFFFF)
but operate on 16-bit types would first generate a CMP instruction
with W or HF types, before expanding it out. This CMP is a partial
write, which leads us to think the register may contain some prior
contents still. When placed in a loop, this causes its live range
to extend beyond its real life time.
Mark the register with UNDEF first so that we know that no prior
contents exist and need to be preserved.
This affects:
flt32, fge32, feq32, fneu32, ilt32, ult32, ige32, uge32, ieq32, ine32
On one of Cyberpunk 2077's most complex compute shaders, this reduces
the maximum live registers from 696 to 537 (22.8%). Together with the
next patch, Cyberpunk's spills and fills are cut by 10.23% and 9.19%,
respectively.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22835>
This value depends on the per-sample value which can be unknown at
compile time with graphics pipeline libraries. So we need to have this
dynamic has well and pick the right value when generating the
3DSTATE_PS/3DSTATE_WM packet.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22728>
When using INTEL_DEBUG=bat, INTEL_DEBUG_BATCH_FRAME_START and
INTEL_DEBUG_BATCH_FRAME_STOP can limit dumping of batches for
particular frame ranges. Batch dumps are huge. Smart filtering
allows debugging of single frames during game play. Initial
commit to debug infrastructure.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22564>
Only apply the clamp in multi patch mode (where the input vertices
vary between [1, 32]).
The clamp NIR pass operates on lowered intrinsics so we need to call
it after the inputs have been lowered.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e25e17dd0c ("intel/fs: clamp per vertex input accesses to patchControlPoints")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8912
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22701>