Commit graph

193290 commits

Author SHA1 Message Date
David Heidelberg
34753cefd8 ci/alpine: use llvm variables
Fixes: da391650f5 ("ci: build a host version of mesa for cross builds")

Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30482>
2024-08-10 15:13:33 +09:00
David Heidelberg
bda1a0596e meson/addrlib: allow unintialized callbacks
Resolves:
[328/4125] Compiling C++ object src/amd/addrlib/libaddrlib.a.p/src_core_addrlib1.cpp.o
In static member function 'static VOID Addr::Object::ClientFree(VOID*, const Addr::Client*)',
    inlined from 'static VOID Addr::Object::operator delete(VOID*)' at ../src/amd/addrlib/src/core/addrobject.cpp:190:15,
    inlined from 'virtual Addr::Object::~Object()' at ../src/amd/addrlib/src/core/addrobject.cpp:71:1:
../src/amd/addrlib/src/core/addrobject.cpp:129:28: error: '*(const Addr::Client*)((char*)this + 8).Addr::Client::callbacks._ADDR_CALLBACKS::freeSysMem' is used uninitialized [-Werror=uninitialized]
  129 |     if (pClient->callbacks.freeSysMem != NULL)
      |         ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11634

Suggested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30482>
2024-08-10 15:13:33 +09:00
David Heidelberg
9c8e75e256 llvmpipe: Silence "possibly uninitialized value" warning for ssbo_limit (cont)
Fixes: ce611935df ("llvmpipe: Silence "possibly uninitialized value" warning for ssbo_limit.")

Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30482>
2024-08-10 15:13:32 +09:00
Marek Olšák
07554d32db ac/nir: adjust gfx11 tuning for the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
db7823e8b9 ac/nir: adjust performance-related decisions for clear/copy_buffer shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
361266fec7 ac/nir: import the clear/copy_buffer compute shader from radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
e41fec7812 radeonsi: align waves to 256B clear/copy area for the clear/copy_buffer shader
This is about 10% faster in certain unaligned cases.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
2f9201e91b radeonsi: implement optimized unaligned clear/copy_buffer compute shader
This totally beats CP DMA on Navi31.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
fa85b4b49e radeonsi: minor changes at the beginning of si_compute_clear_copy_buffer
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
4d78052321 radeonsi: add correctness tests for the clear/copy_buffer compute shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
a48a376bc5 radeonsi: test more alignment cases in si_test_dma_perf
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
fa53a23031 radeonsi: reject insert/extract opcodes in si_vectorize_callback
The vector variants are not implemented by ac_nir_to_llvm.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
d34a450098 util: move util_lower_clearsize_to_dword here
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
1d66acf993 nir: add ACCESS_KEEP_SCALAR, preventing vectorization
The comment explains the reason.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Faith Ekstrand
3f1c3f04be nvk: Advertise VK_EXT_descriptor_buffer
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
0f8f407e57 zink: Align descriptor buffers to descriptorBufferOffsetAlignment
Instead of aligning offsets, we just align the size every time we query
it.  This simplifies our offset and size calculations later since we can
always just add up descriptor buffer sizes and know that we'll be okay.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: 7ab5c5d36d ("zink: use EXT_descriptor_buffer with ZINK_DESCRIPTORS=db")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
fdf580bf74 nvk: Add support for embedded immutable samplers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
832f67e187 nvk: Implement descriptor buffer binding
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
b6c862bed7 nvk: Rework descriptor set bindings
This switches from struct-of-array to array-of-struct and also addes a
new nvk_descriptor_set_type enum to make it more clear exactly what, if
anything, is bound at any give time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
6c54344f5b nvk: Properly indent a comment
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
f7638ff1dc nvk: Implement descriptor capture/replay
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
ef9d9b70a6 nvk/descriptor_table: Add support for requesting a specific index
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
77db71db7d nvk: Implement GetDescriptorEXT
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
237c5d505a nvk: Refactor some descriptor set helpers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
ad9a13a163 nvk: Implement GetDescriptorLayoutSize/BindingOffsetEXT()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
fc0f0725a4 nvk: Use the EDB buffer view path with NVK_DEBUG=edb_bview
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
677f40383d nvk: Use nvk_edb_buffer_view_descriptor for EDB descriptor set layouts
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
3b94c5c22a nvk: Lower descriptors for VK_EXT_descriptor_buffer buffer views
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
8cafd2667f nvk: Refactor image intrinsic lowering a bit
This makes lower_msaa_image_intrin do all of the lowering for MSAA
images.  It's a tiny bit of duplicated code but it'll make more sense
when we work buffer views in the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
93b30bb353 nvk: Add a VK_EXT_descriptor_buffer buffer view cache
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
0f65011157 nvk/nvkmd: Advertise the usable VA range
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
6db3609eaf nvk: s/device/dev/ in nvk_buffer_view.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
1940c8e543 nvk: Move descrptor structs into a separate header
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
8244b87822 nvk: Support STORAGE_READ_WITHOUT_FORMAT on buffers
Fixes: fc19173014 ("nvk: Rework format features queries")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Faith Ekstrand
08f6066e87 nvk: Require color or depth/stencil attachment support for input attachments
Fixes: 20d8d1e239 ("nvk: Add a more competent GetPhysicalDeviceImageFormatProperties")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
2024-08-10 00:42:59 +00:00
Ian Romanick
119801e647 intel/brw: Move fsat instructions closer to the source
Intel GPUs have a saturate destination modifier, and
brw_fs_opt_saturate_propagation tries to replace explicit saturate
operations with this destination modifier. That pass is limited in
several ways. If the source of the explicit saturate is in a different
block or if the source of the explicit saturate is live after the
explicit saturate, brw_fs_opt_saturate_propagation will be unable to
make progress.

This optimization exists to help brw_fs_opt_saturate_propagation make
more progress. It tries to move NIR fsat instructions to the same block
that contains the definition of its source. It does this only in cases
where it will not create additional live values. It also attempts to do
this only in cases where the explicit saturate will ultimiately be
converted to a destination modifier.

v2: Fix metadata_preserve when theres no progress and use
nir_metadata_control_flow when there is progress. All suggested by
Alyssa.

v3: Fix a typo in the file header comment. Noticed by Ken.  Don't
require nir_metadata_instr_index. Use nir_def_rewrite_uses_after instead
of open-coding something slightly more specific. Both suggested by Ken.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19733645 -> 19733028 (<.01%)
instructions in affected programs: 193300 -> 192683 (-0.32%)
helped: 246
HURT: 1
helped stats (abs) min: 2 max: 48 x̄: 2.51 x̃: 2
helped stats (rel) min: 0.18% max: 0.39% x̄: 0.33% x̃: 0.34%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.31% max: 0.31% x̄: 0.31% x̃: 0.31%
95% mean confidence interval for instructions value: -2.87 -2.13
95% mean confidence interval for instructions %-change: -0.34% -0.32%
Instructions are helped.

total cycles in shared programs: 916180971 -> 916264656 (<.01%)
cycles in affected programs: 30197180 -> 30280865 (0.28%)
helped: 194
HURT: 142
helped stats (abs) min: 1 max: 21251 x̄: 872.75 x̃: 19
helped stats (rel) min: <.01% max: 23.17% x̄: 2.59% x̃: 0.23%
HURT stats (abs)   min: 1 max: 28058 x̄: 1781.68 x̃: 399
HURT stats (rel)   min: <.01% max: 37.21% x̄: 4.85% x̃: 1.63%
95% mean confidence interval for cycles value: -196.84 694.97
95% mean confidence interval for cycles %-change: -0.17% 1.27%
Inconclusive result (value mean confidence interval includes 0).

fossil-db:

Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
Totals:
Instrs: 151512021 -> 151511351 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17209013596 -> 17209840995 (+0.00%); split: -0.02%, +0.02%
Max live registers: 32013312 -> 32013549 (+0.00%)
Max dispatch width: 5512304 -> 5512136 (-0.00%)

Totals from 774 (0.12% of 630172) affected shaders:
Instrs: 1559285 -> 1558615 (-0.04%); split: -0.05%, +0.01%
Cycle count: 1312656268 -> 1313483667 (+0.06%); split: -0.24%, +0.30%
Max live registers: 82195 -> 82432 (+0.29%)
Max dispatch width: 6664 -> 6496 (-2.52%)

Ice Lake
Totals:
Instrs: 151416791 -> 151416137 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15162468885 -> 15163298824 (+0.01%); split: -0.00%, +0.01%
Max live registers: 32471367 -> 32471603 (+0.00%)
Max dispatch width: 5623752 -> 5623712 (-0.00%)

Totals from 733 (0.12% of 635598) affected shaders:
Instrs: 877965 -> 877311 (-0.07%); split: -0.09%, +0.01%
Cycle count: 190763628 -> 191593567 (+0.44%); split: -0.21%, +0.64%
Max live registers: 72067 -> 72303 (+0.33%)
Max dispatch width: 6216 -> 6176 (-0.64%)

Skylake
Totals:
Instrs: 140794845 -> 140794075 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14665159301 -> 14665320514 (+0.00%); split: -0.00%, +0.01%
Max live registers: 31783341 -> 31783662 (+0.00%); split: -0.00%, +0.00%

Totals from 659 (0.11% of 625670) affected shaders:
Instrs: 829061 -> 828291 (-0.09%); split: -0.09%, +0.00%
Cycle count: 185478478 -> 185639691 (+0.09%); split: -0.33%, +0.41%
Max live registers: 67491 -> 67812 (+0.48%); split: -0.01%, +0.48%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
2024-08-09 14:26:10 -07:00
Ian Romanick
f5815a003e intel/brw: Use def analysis for simple cases of saturate propagation
I had hoped this would improve compilation performance too. I tried
several different long running fossils, and there was no difference.

Fossil-db results are all over the place from platform to platform.

All of the Tiger Lake shaders hurt for spills and fills are fragment
shaders in rdr2.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19734088 -> 19733645 (<.01%)
instructions in affected programs: 71200 -> 70757 (-0.62%)
helped: 186
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 1
helped stats (rel) min: 0.06% max: 2.79% x̄: 0.83% x̃: 0.48%
95% mean confidence interval for instructions value: -2.69 -2.07
95% mean confidence interval for instructions %-change: -0.93% -0.72%
Instructions are helped.

total cycles in shared programs: 916290473 -> 916180971 (-0.01%)
cycles in affected programs: 3403719 -> 3294217 (-3.22%)
helped: 89
HURT: 88
helped stats (abs) min: 1 max: 36685 x̄: 1424.13 x̃: 10
helped stats (rel) min: <.01% max: 26.75% x̄: 1.66% x̃: 0.46%
HURT stats (abs)   min: 1 max: 8750 x̄: 195.98 x̃: 7
HURT stats (rel)   min: <.01% max: 17.12% x̄: 1.57% x̃: 0.19%
95% mean confidence interval for cycles value: -1199.88 -37.43
95% mean confidence interval for cycles %-change: -0.66% 0.56%
Inconclusive result (%-change mean confidence interval includes 0).

fossil-db:

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 151458346 -> 151457413 (-0.00%)
Cycle count: 17202426472 -> 17202406469 (-0.00%); split: -0.00%, +0.00%
Max live registers: 31989626 -> 31989959 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5500560 -> 5500384 (-0.00%)

Totals from 479 (0.08% of 628970) affected shaders:
Instrs: 398836 -> 397903 (-0.23%)
Cycle count: 18064565 -> 18044562 (-0.11%); split: -0.40%, +0.29%
Max live registers: 36663 -> 36996 (+0.91%); split: -0.02%, +0.92%
Max dispatch width: 4392 -> 4216 (-4.01%)

Tiger Lake
Totals:
Instrs: 149913036 -> 149912182 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15560086488 -> 15560135139 (+0.00%); split: -0.00%, +0.00%
Spill count: 61241 -> 61251 (+0.02%)
Fill count: 107304 -> 107314 (+0.01%)
Max live registers: 31964752 -> 31965119 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5517568 -> 5517248 (-0.01%)

Totals from 486 (0.08% of 628673) affected shaders:
Instrs: 396065 -> 395211 (-0.22%); split: -0.23%, +0.01%
Cycle count: 17677691 -> 17726342 (+0.28%); split: -0.23%, +0.51%
Spill count: 1302 -> 1312 (+0.77%)
Fill count: 3746 -> 3756 (+0.27%)
Max live registers: 37538 -> 37905 (+0.98%); split: -0.02%, +0.99%
Max dispatch width: 4576 -> 4256 (-6.99%)

Ice Lake
Totals:
Instrs: 151348422 -> 151347463 (-0.00%)
Cycle count: 15155678386 -> 15155691726 (+0.00%); split: -0.00%, +0.00%
Fill count: 108114 -> 108111 (-0.00%)
Max live registers: 32444479 -> 32444814 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5611288 -> 5611256 (-0.00%)

Totals from 483 (0.08% of 634352) affected shaders:
Instrs: 393333 -> 392374 (-0.24%)
Cycle count: 16706439 -> 16719779 (+0.08%); split: -0.14%, +0.22%
Fill count: 3654 -> 3651 (-0.08%)
Max live registers: 37246 -> 37581 (+0.90%); split: -0.02%, +0.92%
Max dispatch width: 4312 -> 4280 (-0.74%)

Skylake
Totals:
Instrs: 140741190 -> 140734481 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14659096516 -> 14659116346 (+0.00%); split: -0.00%, +0.00%
Max live registers: 31757558 -> 31757725 (+0.00%)
Max dispatch width: 5470040 -> 5469920 (-0.00%)

Totals from 3542 (0.57% of 624449) affected shaders:
Instrs: 3081309 -> 3074600 (-0.22%); split: -0.22%, +0.00%
Cycle count: 228843073 -> 228862903 (+0.01%); split: -0.11%, +0.12%
Max live registers: 304531 -> 304698 (+0.05%)
Max dispatch width: 31016 -> 30896 (-0.39%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
2024-08-09 14:26:05 -07:00
Ian Romanick
adcce2bba4 intel/brw: Small code refactor in brw_fs_opt_saturate_propagation
This bit of code will have a second use in the next commit.

v2: Fix some broken indentation. Noticed by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
2024-08-09 14:26:03 -07:00
Ian Romanick
9125b7c1b4 intel/elk: Don't propagate saturate to an instruction that writes flags
There are two problems.

1. This is not NaN safe. 'add.le.sat dst F, Inf F, -Inf F' has a
   different result than 'add dst F, Inf F, -Inf F; cmp.le null, dst F, 0F'.

2. Ignoring the first problem, this only produces the desired flags
   for LE and G. All other cases can produce the wrong result.

shader-db:

All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18282314 -> 18282316 (<.01%)
instructions in affected programs: 78 -> 80 (2.56%)
helped: 0
HURT: 2

total cycles in shared programs: 952924234 -> 952924252 (<.01%)
cycles in affected programs: 584 -> 602 (3.08%)
helped: 0
HURT: 2

Fixes: e6022281f2 ("intel/elk: Rename files to use elk prefix")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
2024-08-09 14:26:01 -07:00
Ian Romanick
3d8fea0e09 intel/brw: Don't propagate saturate to an instruction that writes flags
There are two problems.

1. This is not NaN safe. 'add.le.sat dst F, Inf F, -Inf F' has a
   different result than 'add dst F, Inf F, -Inf F; cmp.le null, dst F, 0F'.

2. Ignoring the first problem, this only produces the desired flags
   for LE and G. All other cases can produce the wrong result.

For example, batman_arkham_city_goty.foz 6a63c4caacaa0dae has the
following code:

    mad.ge.f0.0(8)  g51<1>F         g50<8,8,1>F     g46<8,8,1>F     g11<1,1,1>F
    mov.sat(8)      g52<1>F         g51<1,1,0>F
    ...
    (+f0.0) sel(8)  g54<1>UD        g53<8,8,1>UD    0x3f000000UD

Without this commit, the saturate is incorrectly propagated to the MAD.

A similar case exists in witcher_3_dxvk_g2.foz 5b03243be667a275.

There are even worse cases like total_war_warhammer3.dx12vk-g6.foz
78328466761ef7ab and ee920491573860fc. The former has the following
code (and the latter has very similar code):

    mad.l.f0.0(16)  g95<1>F         g93<8,8,1>F     g62<8,8,1>F     g68<1,1,1>F
    ...
    mov.sat(16)     g109<1>F        -g95<1,1,0>F
    ...
    (+f0.0) sel(16) g68<1>UD        g111<1,1,0>UD   g54<1,1,0>UD
    (+f0.0) sel(16) g70<1>UD        g113<1,1,0>UD   g56<1,1,0>UD
    (+f0.0) sel(16) g72<1>UD        g115<1,1,0>UD   g58<1,1,0>UD

Saturate propagation makes a hash of this code:

    mad.sat.l.f0.0(16) g106<1>F     -g93<8,8,1>F    -g62<8,8,1>F    g68<1,1,1>F
    ...
    (+f0.0) sel(16) g70<1>UD        g110<1,1,0>UD   g56<1,1,0>UD
    (+f0.0) sel(16) g72<1>UD        g112<1,1,0>UD   g58<1,1,0>UD
    (+f0.0) sel(16) g68<1>UD        g108<1,1,0>UD   g54<1,1,0>UD

Not only is the saturate incorrectly applied to the MAD, but the MAD
result is negated without changing the conditional modifier to G!

NOTE: Backports of this commit to stable branches may need to be more
like the following commit to elk.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19729375 -> 19729377 (<.01%)
instructions in affected programs: 112 -> 114 (1.79%)
helped: 0
HURT: 2

total cycles in shared programs: 916234266 -> 916234288 (<.01%)
cycles in affected programs: 636 -> 658 (3.46%)
helped: 0
HURT: 2

fossil-db:

All Intel platforms had similar results. (Meteor Lake shown)
Totals:
Instrs: 151531594 -> 151531601 (+0.00%)
Cycle count: 17209107419 -> 17209107474 (+0.00%); split: -0.00%, +0.00%

Totals from 6 (0.00% of 630198) affected shaders:
Instrs: 4550 -> 4557 (+0.15%)
Cycle count: 194629 -> 194684 (+0.03%); split: -0.00%, +0.03%

Fixes: 947c828d5c ("i965/fs: Add a saturation propagation optimization pass.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
2024-08-09 14:25:57 -07:00
Ian Romanick
6da4649191 intel/brw: Eliminate dead flag writes
This prevents a couple small regressions in the next commit.

The only changes in shader-db or fossil-db were on Skylake. This seems
to eliminate an unused flags write that doesn't exist on other platforms.
With that flag write eliminated, a later CMP can be scheduled better.

I did not investigate this further.

v2: Clean up some unnecessary bits and add some comments to
can_elminate_conditional_mod. Suggested by Ken and Matt.

Skylake
Totals:
Cycle count: 14665454524 -> 14665454444 (-0.00%)

Totals from 10 (0.00% of 625685) affected shaders:
Cycle count: 38630 -> 38550 (-0.21%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
2024-08-09 14:25:54 -07:00
Jesse Natalie
169f8ec227 meson: Add an error message for llvmpipe without llvm draw support
Fixes: 010b2f9497 ("gallium/meson: Deconflate swrast/softpipe/llvmpipe")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30577>
2024-08-09 18:45:11 +00:00
Friedrich Vock
70fc5987d4 radv/rt: Don't atomicAdd local prefix sums
This only gets written to by one invocation at a time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30483>
2024-08-09 18:12:52 +00:00
Friedrich Vock
a3df3ebab4 radv/rt: Only do ploc atomicCompSwap once per workgroup
There is no need to do this for every invocation in the wave.

Improves GravityMark scores by 5%.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30483>
2024-08-09 18:12:52 +00:00
Alyssa Rosenzweig
7983c6c14d ir3: switch to derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30569>
2024-08-09 17:36:57 +00:00
Alyssa Rosenzweig
bf9a17e2d5 elk: switch to derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30566>
2024-08-09 17:07:59 +00:00
Alyssa Rosenzweig
eec02246f8 brw: switch to derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30566>
2024-08-09 17:07:59 +00:00
Rob Clark
19ff16387a gallium: Add option to not add version to libgallium filename
This is unneeded in some environments, like ChromeOS and Android.  And
for CrOS it specifically causes problems with the gpu sandbox rules.. we
don't want to have to update the sandbox rules for each new mesa
version.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30579>
2024-08-09 16:14:43 +00:00
Rob Clark
19d44313a4 egl: Fix surfaceless + modifiers
Modifiers are not restricted to dri3.

Fixes: d86f39e7cf ("egl: swap DRI_IMAGE checks for dmabuf/modifier support for driver check")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30579>
2024-08-09 16:14:43 +00:00
Georg Lehmann
48acf9d358 nir/lower_int64: replace uadd_sat with ior for find_lsb64 and ufind_msb64
Using ior here is equivalent to using uadd_sat, but works for every driver
and shouldn't hurt anywhere.

I forgot to fix this up when fixing up some vvl errors with zink.

Fixes crashes with the integer_ctz CL CTS tests in zink.

Fixes: 39ec184db6 ("zink: lower 64 bit find_lsb, ufind_msb and bit_count")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30535>
2024-08-09 15:09:57 +00:00