Commit graph

194063 commits

Author SHA1 Message Date
Kenneth Graunke
437bda3013 intel/brw: Get rid of the lsc_msg_desc_wcmask helper
The LOAD/STORE opcodes take a vector size, while the LOAD/STORE_CMASK
opcodes take a channel mask.  The two are mutually exclusive.  So we
can just have the lsc_msg_desc() helper take one or the other in the
same parameter.  This more closely matches the actual descriptor.

We couldn't do this until the previous commit, since we were previously
relying on the lsc_msg_desc() function to calculate a cmask out of the
number of vector components.  But now we don't need it to do that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>
2024-08-27 09:25:59 +00:00
Kenneth Graunke
55f193a105 intel/brw: Switch from LSC CMASK opcodes to regular LOAD/STORE
The LOAD/STORE opcodes take a vector size (number of components), while
the LOAD/STORE_CMASK opcodes take a channel mask.  For some reason, we
were passing a number of channels to lsc_msg_desc(), then using it to
construct a channel mask with all channels enabled, and always using the
CMASK message variants.

Considering we don't actually want to mask off any channels, we should
probably just use the regular LOAD/STORE opcodes, as they're more
flexible anyway.

One exception is that typed messages on Xe2 apparently only support
LOAD_CMASK/STORE_CMASK and not regular LOAD/STORE.  So we keep using
those there.  (Thanks to Sagar Ghuge for catching this!)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>
2024-08-27 09:25:58 +00:00
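The mask construction described in 55f193a105 (turning a component count into a channel mask with every channel enabled) can be sketched as follows. This is only an illustration, not Mesa's code: the helper name `all_channels_mask` and the limit of four channels are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): build a mask with the low
 * num_channels bits set, i.e. every channel enabled. The 1..4 range is an
 * assumption for this sketch. */
static inline uint32_t all_channels_mask(unsigned num_channels)
{
   assert(num_channels >= 1 && num_channels <= 4);
   return (1u << num_channels) - 1u;
}
```

Because such a mask never disables a channel, it carries no more information than the component count itself, which is the motivation the commit gives for using the plain LOAD/STORE opcodes.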
Sviatoslav Peleshko
7e52b67801 anv: Add full subgroups WA for the shaders with barriers in Breaking Limit
When barriers are used in invalid shaders with non-uniform control flow,
we might get a hang. Forcing a 32-wide group can help by making it more
probable that the barrier instruction is executed by at least one channel
in each thread, thus avoiding the hang. This shouldn't affect
Xe2+, where active-thread-only barriers are used anyway.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11497
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
2024-08-27 08:26:08 +00:00
Sviatoslav Peleshko
1904fe1186 anv: Release correct BO in anv_cmd_buffer_set_ray_query_buffer
If p_atomic_cmpxchg doesn't set the ray_query_shadow_bos[bucket] to new_bo
allocated by this thread, it returns the bucket BO allocated by the other
thread and we use it. But due to a mistake, we also release that BO, not
the candidate just allocated by this thread and never used again.

Fixes: 5d3e4193 ("anv: enable ray queries")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
2024-08-27 08:26:08 +00:00
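The pattern fixed in 1904fe1186 (publish a freshly allocated object with a compare-and-swap, and release the unused candidate when another thread wins) might look roughly like this. It is a sketch using C11 atomics rather than Mesa's `p_atomic_cmpxchg`; `struct bo`, `bo_alloc`, `bo_release` and `get_or_publish` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a driver buffer object (hypothetical). */
struct bo { int id; };

static struct bo *bo_alloc(int id)
{
   struct bo *b = malloc(sizeof(*b));
   if (b)
      b->id = id;
   return b;
}

static void bo_release(struct bo *b)
{
   free(b);
}

/* Try to publish candidate into *slot. Returns the BO that ends up in the
 * slot, which is the one the caller must use from now on. */
static struct bo *get_or_publish(_Atomic(struct bo *) *slot,
                                 struct bo *candidate)
{
   assert(candidate != NULL);

   struct bo *expected = NULL;
   if (atomic_compare_exchange_strong(slot, &expected, candidate))
      return candidate;   /* we won: our BO is now the shared one */

   /* We lost the race: expected now holds the winner's BO. Release OUR
    * candidate, which nobody else can see, never the shared one. */
   bo_release(candidate);
   return expected;
}
```

The bug described in the commit corresponds to calling the release function on the returned shared object instead of on the local candidate.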
Sviatoslav Peleshko
09122e2be0 brw,elk: Fix opening flags on dumping shader binaries
Truncation is needed for overwriting correctly in cases when the old file is
bigger than the one we want to dump (e.g. when the old one was edited
in place). Also, creation permissions are way too broad.

Fixes: 4f41c44d ("intel/compiler: Add variable to dump binaries of all compiled shaders")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
2024-08-27 08:26:08 +00:00
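To illustrate the point of 09122e2be0, here is a minimal POSIX sketch of overwriting a dump file with `O_TRUNC` so that no stale bytes from a longer previous file survive. The helper name `dump_binary` and the `0644` mode are assumptions for the example; the commit does not state which permissions it settled on.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: replace the contents of path with exactly len bytes.
 * Returns 0 on success, -1 on failure. */
static int dump_binary(const char *path, const void *data, size_t len)
{
   assert(path != NULL);

   /* O_TRUNC drops any old contents; 0644 is an assumed, narrower mode. */
   int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
   if (fd < 0)
      return -1;

   const char *p = data;
   while (len > 0) {
      ssize_t n = write(fd, p, len);
      if (n < 0) {
         close(fd);
         return -1;
      }
      p += n;
      len -= (size_t)n;
   }
   return close(fd);
}
```

Without `O_TRUNC`, writing 3 bytes over an existing 10-byte file would leave a 10-byte file whose tail still holds the old data.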
Sviatoslav Peleshko
442cc7996e anv: Assert ray query BO actually exists
A crash will happen if the client tries to use ray queries without
enabling the KHR_ray_query extension. Add an assert to be able to catch
this sooner.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
2024-08-27 08:26:08 +00:00
Samuel Pitoiset
4c1a912372 radv: remove RADV_DEBUG=nogsfastlaunch2
It's been two Mesa releases since fast-launch mode2 was fixed
on GFX11, and everything works as expected. The option is no longer
needed. Note that GFX12 apparently only has mode2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30815>
2024-08-27 07:51:33 +00:00
Nanley Chery
4a8f3181ba intel: Support any depth fast-clear value on Xe2
Remove the restriction that a depth fast-clear must have a clear value
which matches an image-dependent heuristic.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>
2024-08-27 06:15:36 +00:00
Nanley Chery
4a9e45061a anv: Add and use anv_image_hiz_clear_value()
The benchmarks we're tracking tend to prefer clearing depth buffers to
0.0f when the depth buffers are part of images with multiple aspects.
Otherwise, they tend to prefer clearing depth buffers to 1.0f.

Replace the ANV_HZ_FC_VAL constant with a function which implements this
heuristic.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>
2024-08-27 06:15:36 +00:00
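The heuristic described in 4a9e45061a could be sketched as below. This is not the actual `anv_image_hiz_clear_value()`; the aspect bits and the function name `hiz_clear_value` are stand-ins invented for the example, and only the 0.0f / 1.0f rule comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical aspect bits, standing in for the real image aspect flags. */
enum {
   ASPECT_DEPTH   = 1 << 0,
   ASPECT_STENCIL = 1 << 1,
};

/* Pick the preferred depth fast-clear value: 0.0f when the depth buffer is
 * part of an image with multiple aspects, 1.0f otherwise. */
static float hiz_clear_value(unsigned aspects)
{
   assert(aspects & ASPECT_DEPTH);

   /* More than one bit set means more than one aspect. */
   bool multiple_aspects = (aspects & (aspects - 1)) != 0;
   return multiple_aspects ? 0.0f : 1.0f;
}
```

Wrapping the choice in a function rather than a constant lets callers pass the image's aspects and keeps the heuristic in one place.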
Nanley Chery
9fd79dc49e anv: Pass the VkClearDepthStencilValue for clears
Xe2 can easily support fast-clearing depth buffers to multiple clear
values. Instead of assuming a hard-coded value in various parts of the
driver, pass the clear value down the expected paths.

For consistency, also adjust the slow depth clear function to have a
matching parameter.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>
2024-08-27 06:15:36 +00:00
Paulo Zanoni
f3c7e14f09 isl: don't assert(num_elements > (1ull << 27))
Some games such as Marvel's Spider-Man Remastered and Assassin's
Creed: Valhalla don't work in debug mode because they hit this
assertion. In Release mode, they appear to work (although in some
platforms there may be visual corruption or GPU hangs). There's
nothing we can do about this error (see below), so in this patch we
replace the assertion with an error message, because it allows us to
(i) test the rest of the game in debug mode so we may catch other
issues; and (ii) warn users of release mode that the issue is
happening.

The unsupported num_elements comes from vkGetDescriptorEXT() and
appears to be violating VUID-VkDescriptorGetInfoEXT-type-09427. This
function cannot return errors, but we can disable
VK_EXT_descriptor_buffer.

If we do disable the extension, then vkCreateBufferView() will start
triggering the assertion, and we can see that
VkBufferViewCreateInfo-range-00930 is being violated. If we change Anv
to return errors on these vkCreateBufferView() cases, then the games
won't work at all.

I reported this to vkd3d-proton, but according to the vkd3d-proton
developer Philip Rebohle:

 "There's also the problematic case of games using typed descriptors
  but passing non-typed buffer descriptors, which is an extremely
  common app bug that works on all D3D12 drivers that we need to work
  around by creating typed views. If that's what's happening here then
  the best we can do is to just not create the typed view and have the
  game be broken entirely, or create a smaller view and most likely
  still completely break the game, but at least that way it wouldn't
  trigger Vulkan validation. Emulating larger views via multiple
  smaller views is not possible for us."

 "Confirmed that it's the app itself creating these views."

 "D3D12 does not have runtime validation for this or any sort of query
  for the app, so we really can't do much here."

Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9963
Link: https://github.com/HansKristian-Work/vkd3d-proton/issues/2071
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30775>
2024-08-27 05:47:50 +00:00
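The change in f3c7e14f09 (report an oversized element count instead of aborting on it) follows a common pattern that can be sketched like this. The function name `check_num_elements`, the macro name and the message text are assumptions; only the `1ull << 27` limit comes from the commit title.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ELEMENTS (1ull << 27)

static_assert(MAX_ELEMENTS == 134217728ull, "limit is 2^27 elements");

/* Hypothetical check: returns true when num_elements is within the limit.
 * Out-of-range values are reported on stderr instead of aborting, so debug
 * builds keep running and release builds still warn the user. */
static bool check_num_elements(uint64_t num_elements)
{
   if (num_elements > MAX_ELEMENTS) {
      fprintf(stderr, "num_elements is too big: %llu (limit %llu)\n",
              (unsigned long long)num_elements,
              (unsigned long long)MAX_ELEMENTS);
      return false;
   }
   return true;
}
```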
Faith Ekstrand
b78a691ce2 nil,nvk: Disable modifiers for B10G11R11_UFLOAT and E5B9G9R9_UFLOAT
The CTS tests fail due to precision issues (arguably a CTS bug) but it
also doesn't make a lot of sense to advertise modifiers on them at all.

Fixes: cd428e01d7 ("nvk: Advertise VK_EXT_image_drm_format_modifier")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30861>
2024-08-27 05:33:10 +00:00
Lionel Landwerlin
2158fe2ae2 nir/divergence: add missing load_constant_base_ptr
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30712>
2024-08-27 01:33:52 +00:00
Lionel Landwerlin
6336e0fe7f anv: order data in wa_bo to leave wa_addr last
We want to make sure the workaround_address is the last item in the BO
so that we don't have to care about the size of the writes going
there, we'll be sure they won't overwrite other items in that BO.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7b9400b7f7 ("intel/blorp: Don't use clear color conversion on gfx12")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11775
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30844>
2024-08-27 00:51:03 +00:00
Lionel Landwerlin
d8ec8acede anv: always use workaround_address, not workaround_bo
The workaround BO has some debug information at the beginning. The
workaround address is placed after that.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30844>
2024-08-27 00:51:03 +00:00
Nanley Chery
9b98cebe9a intel: Drop BLORP_BATCH_NO_UPDATE_CLEAR_COLOR
All drivers update the clear color themselves. So, drop the
functionality from BLORP as well as the flag controlling it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>
2024-08-26 23:57:12 +00:00
Nanley Chery
64d861b700 iris: Skip some fast-clears even on color changes
Previously, we only skipped fast-clearing if the aux state was CLEAR and
the clear color hadn't changed. That was because we relied on
blorp_fast_clear() to update the clear color for us. Now that we update
the clear color outside of blorp_fast_clear(), also skip fast-clearing
when the clear color changes while in the CLEAR state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>
2024-08-26 23:57:12 +00:00
Nanley Chery
2886851a8e iris: Always use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR
Update the clear color with iris rather than with BLORP. This enables an
optimization in the next patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>
2024-08-26 23:57:12 +00:00
Nanley Chery
721d0c3e77 anv,hasvk: Always use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR
Store the clear color from within the drivers, rather than from BLORP.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>
2024-08-26 23:57:11 +00:00
Nanley Chery
5fd42500cf anv,hasvk: Add and use set_image_clear_color()
We're going to be storing clear colors from the drivers rather than
BLORP. Add a function for this purpose.

For now, the first use replaces init_fast_clear_color(). One change in
behavior is that the clear color initialization is now done without
write-checking on gfx12. This actually matches what anv does to other
writes to the image's fast-clear tracking state. We can fix this later
if and when we address the larger issue.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>
2024-08-26 23:57:11 +00:00
Yunhyeok "Yune" Choi
27014df366 glx: Getting rid of the double assignment in __glXWireToEvent.
Previously the field `event_type` in `GLXPbufferClobberEvent`
was assigned twice in succession with different values.
Removing the first assignment and retaining only the second one.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30836>
2024-08-26 23:19:56 +00:00
Dave Airlie
4bf257a18f llvmpipe: make sure to duplicate the fd handle before giving out
This handle is given to the user to close, so make sure to dup it
first.

Fixes: d74ea2c117 ("llvmpipe: Implement dmabuf handling")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30839>
2024-08-26 22:53:12 +00:00
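The ownership rule behind 4bf257a18f (the caller will close the handle it receives, so it must get its own descriptor) can be illustrated with a small POSIX sketch. `struct resource` and `export_handle` are hypothetical names, not llvmpipe's.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical resource that keeps its own file descriptor. */
struct resource {
   int dmabuf_fd;
};

/* Hand out a duplicate of the internal fd. The caller owns the returned
 * descriptor and may close it without affecting the resource's copy.
 * Returns -1 on failure, as dup() does. */
static int export_handle(const struct resource *res)
{
   assert(res->dmabuf_fd >= 0);
   return dup(res->dmabuf_fd);
}
```

Returning `res->dmabuf_fd` directly would let the caller's `close()` invalidate the descriptor the resource still relies on.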
Dave Airlie
521dc42e6c llvmpipe: handle stride properly on lvp udmabuf imports
The import data comes in via the fd import, but we need to make
sure to store the row stride value here.

Fixes: c44d65a467 ("lp: only map dt buffer on import from dmabuf")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30839>
2024-08-26 22:53:12 +00:00
Dave Airlie
7db16e7cdd radv: turn video decode/encode on for VCN4 with latest fw
With the latest fw in the linux-firmware repo, navi3x passes
all the CTS tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Dave Airlie
4255bbd958 radv: move video decode enable test into a flag
This makes it easier to start conditionalising this on fw releases.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Benjamin Cheng
95a980b61f radv/video: add event support for VCN4
This was the main missing piece for passing vulkan video CTS
as the video firmwares couldn't do proper vulkan events.

With new enough firmware this is now possible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Víctor Manuel Jáquez Leal
c340862555 frontends/va: Don't return P010/P016 as surface formats when encoding
This is almost a complete revert of 0eccd158 (!3285), since it was a
driver fix for a client bug. vaapih265enc should be fixed rather than adding
a workaround that breaks the logic of the API, since vaQuerySurfaceAttributes
depends only on the config parameter, which defines the rt format.

You can verify it with vadumpcaps https://github.com/fhvwy/vadumpcaps

Signed-off-by: Victor Jaquez <vjaquez@igalia.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19443>
2024-08-26 22:01:46 +00:00
Mike Blumenkrantz
786be05df3 dril: add zink stub
ironically this was the only driver left out

Fixes: 3de62b2f9a ("gallium/dril: Compatibility stub for the legacy DRI loader interface")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30851>
2024-08-26 21:08:58 +00:00
Mike Blumenkrantz
7255c5e108 ci: add a660 flake
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/62739168

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30857>
2024-08-26 20:42:22 +00:00
Assadian, Navid
cb32bcd3fe amd/vpelib: Add 420 semi-planar 12bit handling
Add semi-planar 420 12-bit formats.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:15 +00:00
Brendan
fcad791d07 amd/vpelib: Create virtual stream concept
[Why]
Need to create streams that don't come from input params (ex. for bg
gen) to prepare for future concepts.

[How]
Add enum for stream type, create helper functions to populate virtual
streams, and add custom functions where virtual stream function varies
from input stream function.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Brendan Leder <brendansteve.leder@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky
b670701b65 amd/vpelib: Increase the CD field in vpe descriptor programming
Introduce the vpe desc writer hook.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Shih, Jude
cb9175a7af amd/vpelib: Update Plane Descriptor Writer
Refactor to support new plane descriptor hook, and update enum
vpe_scan_direction.

Co-authored-by: Jesse Agate <jesse.agate@amd.com>
Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Patel, Utpal
18dae30b17 amd/vpelib: Add resource function hooks for checking support
Add function hooks for checking support including rotation, background
color, DCC capability and input/output support check.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Utpal Patel <utpal.patel@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Alan Liu
06097ad64d amd/vpelib: Remove unused structs
Remove the definition of unused structs:
- struct x_axis_config
- struct point_config
- struct curve_points32
- struct lut_point
- struct pwl_parameter2

Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Chang, Tomson
6483c2c786 amd/vpelib: Add and fix collaborate sync data
[Why&How]
The original implementation always has sync data == 1.
Make it increase, with 4 random bits, to help debug
collaborate sync issues across multiple contexts.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Tomson Chang <tomson.chang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky
015b1b52c8 amd/vpelib: Remove extra collaborate sync commands in IB
Remove extra collaborate sync commands and fix coding format.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky
e9e2fe389f amd/vpelib: Use VPE_IP_LEVEL_1_0 for VPE IP 6.1.3
Use VPE_IP_LEVEL_1_0 for VPE IP version 6.1.0 and 6.1.3.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Patel, Utpal
73d112f372 amd/vpelib: Add input pixel format support
Add input pixel format support for VPE.

Signed-off-by: Utpal Patel <utpal.patel@amd.com>
Reviewed-by: Jesse Agate <jesse.agate@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Hsieh, Mike
0164bfda65 amd/vpelib: Add cache mechanism for 3D Lut command
[WHY & HOW]
Converting 3D Lut parameters into a vpe command takes time.
The 3D Lut does not change every frame, so adding a cache mechanism can improve efficiency.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Mike Hsieh <mike.hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Kovac, Krunoslav
9817793cd9 amd/vpelib: Reuse existing float to reg format conversion
Remove vpe_fixpt_from_float and use existing conversion
for double(float)->reg custom 1.6.12 format.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Eric Engestrom
f79c80e6d6 turnip/ci: document all the a750 flakes seen in the last week
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30856>
2024-08-26 19:41:12 +00:00
Eric Engestrom
22bd67a16d zink+nvk/ci: document all the flakes seen in the last week
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30856>
2024-08-26 19:41:12 +00:00
Eric Engestrom
6ab8e089bd zink+nvk/ci: document new variant of test failing
Failing since a commit in the fef77e1d...7b32df69 range

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30856>
2024-08-26 19:41:12 +00:00
Rhys Perry
dea1fedf51 aco/tests: add more VALUMaskWriteHazard tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
11262a01ce aco: preserve bitsets after a lane mask is written
fossil-db (navi31):
Totals from 4840 (6.10% of 79395) affected shaders:
Instrs: 13733449 -> 13761177 (+0.20%); split: -0.00%, +0.21%
CodeSize: 71997868 -> 72102520 (+0.15%); split: -0.00%, +0.15%
Latency: 128385177 -> 128408780 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 21105847 -> 21109475 (+0.02%); split: -0.00%, +0.02%
VALU: 7741209 -> 7741210 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
61e73c2323 aco: check SALU writing lanemask later for VALUMaskWriteHazard
This should be done after reads are checked and
sgpr_read_by_valu_as_lanemask_then_wr_by_salu is reset. The old version
also skipped checking the reads if the write check passed.

fossil-db (navi31):
Totals from 193 (0.24% of 79395) affected shaders:
Instrs: 3212435 -> 3212735 (+0.01%)
CodeSize: 16462868 -> 16463848 (+0.01%); split: -0.00%, +0.01%
Latency: 19492377 -> 19492462 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 4419705 -> 4419718 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
b1ba7d1b99 aco: don't consider sa_sdst=0 before SALU write to fix VALUMaskWriteHazard
LLVM does but that's probably a bug.

fossil-db (navi31):
Totals from 311 (0.39% of 79395) affected shaders:
Instrs: 380453 -> 381075 (+0.16%)
CodeSize: 1961012 -> 1964744 (+0.19%)
Latency: 4799095 -> 4800313 (+0.03%)
InvThroughput: 958358 -> 958904 (+0.06%)
VALU: 242322 -> 242633 (+0.13%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
8f5ee70d85 aco: also consider VALU reads for VALUMaskWriteHazard
fossil-db (navi31):
Totals from 9776 (12.31% of 79395) affected shaders:
Instrs: 19348258 -> 19383680 (+0.18%); split: -0.00%, +0.19%
CodeSize: 101223460 -> 101366964 (+0.14%); split: -0.01%, +0.15%
Latency: 172853115 -> 172866070 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 27590468 -> 27592390 (+0.01%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11550
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11436
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11337
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11738
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11741
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
ee648326d9 aco: ignore exec and literals when mitigating VALUMaskWriteHazard
LLVM ignores exec and literals don't seem to work in some cases.

fossil-db (navi31):
Totals from 2676 (3.37% of 79395) affected shaders:
Instrs: 10638979 -> 10646019 (+0.07%); split: -0.00%, +0.07%
CodeSize: 55929640 -> 55959416 (+0.05%); split: -0.00%, +0.06%
Latency: 107707408 -> 107712893 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 18119843 -> 18120442 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00