Francisco Jerez
8102500b95
intel/brw/xe3+: Mask subgroup shuffle index to be within valid range to avoid VRT hangs.
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664 >
2025-01-29 23:39:32 +00:00
Francisco Jerez
d2af77aa6b
intel/brw: Use urb_read_length instead of nr_attribute_slots to calculate VS first_non_payload_grf.
...
Makes sure the number of registers reserved for the payload matches
the size of the URB read, which prevents the VS shared function from
writing past the end of the register file on Xe3 with VRT enabled.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664 >
2025-01-29 23:39:32 +00:00
Francisco Jerez
7f59708422
intel/brw: Saturate shifted subgroup index to avoid reading past the end of register file.
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664 >
2025-01-29 23:39:32 +00:00
Caio Oliveira
2b6437a3f4
intel/brw: Remove unused enum
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33251 >
2025-01-28 02:17:17 +00:00
Caio Oliveira
0e1bb2f70e
intel/brw: Use brw prefix instead of namespace in dynamic_msaa_flags()
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33227 >
2025-01-28 00:48:38 +00:00
Caio Oliveira
a4afb81729
intel/brw: Use brw prefix for some schedule instructions identifiers
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33145 >
2025-01-27 18:32:41 +00:00
Lionel Landwerlin
6768eb31e5
intel: rework CL pre-compile
...
Stolen from asahi_clc :)
We drop the nasty LLVM17+ workaround code (Thanks Alyssa!)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33014 >
2025-01-25 03:28:07 +00:00
Lionel Landwerlin
5adac011b8
meson: rework mesa-clc=system handling
...
In theory you can build a driver using OpenCL kernels with a
-Dmesa-clc=system. That shouldn't require any LLVM/Clang/etc...
But the checks to find the pre-compiled mesa_clc & vtn_bindgen
binaries are in meson files or conditions only triggered if you build
with LLVM (:
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33014 >
2025-01-25 03:28:07 +00:00
Lionel Landwerlin
db11165c07
intel/cl: switch to SPIRV as shader storage
...
Effectively making intel-clc not needed.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33014 >
2025-01-25 03:28:07 +00:00
Lionel Landwerlin
bf8a1e1e71
brw/elk: move internal kernel parsing out of intel_clc
...
So it can be called internally.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33014 >
2025-01-25 03:28:07 +00:00
Lionel Landwerlin
7ddb49653d
anv/brw: rework primitive count writing
...
Instead of the complicated logic we currently have, do this :
We start with this shader :
int main() {
   ...
   if (...) {
      SetMeshOutputsEXT(0, 0);
      return;
   } else {
      SetMeshOutputsEXT(...);
   }
   ...
}
We turn it into this :
int main() {
   uint __temp_prim_count = 0;
   ...
   if (...) {
      __temp_prim_count = 0;
      return;
   } else {
      __temp_prim_count = ...;
   }
   ...
   if (is_first_group_lane()) {
      SetMeshOutputsEXT(..., __temp_prim_count);
   }
}
This works because the SPIRV spec says this :
"The arguments are taken from the first invocation in each
workgroup. Any invocation must execute this instruction no more
than once and under uniform control flow."
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12388
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33038 >
2025-01-24 10:19:28 +00:00
Caio Oliveira
ee625f44d5
intel/elk: Fix wrong destination to memset
...
Conversion to use rzalloc_array missed these.
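The affected call sites are not shown in this log, so the following is only a general sketch of how a memset can go wrong when a stack array is converted to a heap allocation. The helper name `clear_counts` is hypothetical and plain `malloc` stands in for Mesa's `rzalloc_array`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: clear a heap-allocated array of `count` ints. */
static void
clear_counts(int *counts, size_t count)
{
   /* With a stack array `int counts[N]`, `sizeof(counts)` covers all N
    * elements. Once `counts` is a pointer, `sizeof(counts)` is only the
    * size of the pointer and `&counts` is the address of the pointer
    * itself, so the destination and the length are both spelled out. */
   memset(counts, 0, count * sizeof(*counts));
}
```

Passing the element count explicitly keeps the call correct regardless of how the storage was allocated.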
Fixes: c9e667b7ad ("intel/elk: Remove uses of VLAs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12513
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33192 >
2025-01-24 01:09:26 +00:00
Rhys Perry
0eb5f66660
nir/validate: validate ssa dominance by default
...
This no longer modifies dominance metadata, so enable it by default.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32005 >
2025-01-23 23:35:44 +00:00
Caio Oliveira
563631cdd8
intel/brw: Rely on existing helper for dispatch width of geometry stages
...
The helper already exists and is used in these functions, so just save
the value so it can be reused.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33175 >
2025-01-23 20:29:31 +00:00
Lionel Landwerlin
2e4dcf72c6
brw: fix CSE with negation
...
The pass is currently turning this :
mul(16) %17:F, %1:F, 0.5f
mul(16) %19:F, %1:F, -0.5f
(+f0.0) sel(16) %27:UD, %19:UD, %17:UD
into this :
{ 12} mul(16) %17:F, %1:F, 0.5f
{ 14} (+f0.0) sel(16) %27:UD, -%17:F, %17:UD
The type change in the SEL instruction incurs a type conversion that
produces invalid values.
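As a rough C analogy (not the brw IR itself), the difference between reading the raw bits of a negated float and converting the negated float to an integer can be sketched as below. The function names are hypothetical, and the second function is only meant to be called with a non-positive input so that the C conversion is well defined.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* Analogue of the original code: the second MUL result is read as UD,
 * i.e. the raw bit pattern of the negated value is selected. */
static uint32_t
negated_raw_bits(float a)
{
   return float_bits(-a);
}

/* Analogue of the miscompiled code: a float-typed source feeding an
 * integer-typed destination goes through a value conversion instead.
 * Call only with a <= 0 so that -a is representable as unsigned. */
static uint32_t
negated_converted(float a)
{
   return (uint32_t)(-a);
}
```

For an input of -1.5f the first function yields the bit pattern of 1.5f (0x3FC00000) while the second yields the integer 1, which is the kind of mismatch the commit describes.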
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 234c45c929 ("intel/brw: Write a new global CSE pass that works on defs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12477
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33070 >
2025-01-23 12:45:34 +00:00
Daniel Schürmann
f3be7ce01b
nir/from_ssa: only consider divergence if requested
...
This pass used to unconditionally use divergence information
which forced the caller to either call divergence_analysis or
ensure that the divergence is properly reset.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33009 >
2025-01-23 01:31:23 +00:00
Marek Olšák
02516ff0f9
nir: remove dead code due to IO being always lowered in st/mesa
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33146 >
2025-01-22 02:15:04 +00:00
Matt Turner
c9007999f6
elk: Pass number and sizeof separately to calloc
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:56 +00:00
Matt Turner
82330eca3c
elk: Bounds check access to p->store
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:56 +00:00
Matt Turner
262546eb0b
elk: Pass brw_codegen to next_offset
...
In the next commit we will use this to assert that we are not reading
past the end of `p->store`.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:56 +00:00
Matt Turner
7c6f4a6041
elk: Avoid reading past the end of p->store
...
On the last iteration of the loop, `offset` will point to the location
just beyond the last instruction in the program. If the program exactly
fills `p->store` then calling `next_offset()` will read out of bounds.
Instead just let the inner while loop call `next_offset()` one
additional time.
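A minimal sketch of the hazard, assuming a toy encoding in which the low bit of an instruction's first byte marks an 8-byte compacted instruction and everything else is 16 bytes. The names `toy_next_offset` and `toy_count_instructions` are hypothetical and do not match the real helpers. The point is that the offset just past the last instruction must never be dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for next_offset(): it has to read the instruction at
 * `offset` to learn its size, so `offset` must be inside the store. */
static size_t
toy_next_offset(const uint8_t *store, size_t store_size, size_t offset)
{
   assert(offset < store_size);
   return offset + ((store[offset] & 1) ? 8 : 16);
}

/* Walk the program and count instructions. The loop condition is checked
 * before every call, so when the program exactly fills the store the
 * one-past-the-end offset is compared but never read. */
static unsigned
toy_count_instructions(const uint8_t *store, size_t store_size,
                       size_t end_offset)
{
   unsigned count = 0;
   for (size_t offset = 0; offset < end_offset;
        offset = toy_next_offset(store, store_size, offset))
      count++;
   return count;
}
```

With a 32-byte store holding one 16-byte and two 8-byte instructions, the walk visits offsets 0, 16 and 24 and stops at 32 without touching it.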
Fixes: a35b9cb625 ("i965: Add annotation data structure and support code.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12486
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:56 +00:00
Matt Turner
88fd100f97
brw: Pass number and sizeof separately to calloc
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:56 +00:00
Matt Turner
21bb7785bb
brw: Bounds check access to p->store
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:55 +00:00
Matt Turner
ab037b5daf
brw: Pass brw_codegen to next_offset
...
In the next commit we will use this to assert that we are not reading
past the end of `p->store`.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:55 +00:00
Matt Turner
a4f0a96dda
brw: Avoid reading past the end of p->store
...
On the last iteration of the loop, `offset` will point to the location
just beyond the last instruction in the program. If the program exactly
fills `p->store` then calling `next_offset()` will read out of bounds.
Instead just let the inner while loop call `next_offset()` one
additional time.
Fixes: a35b9cb625 ("i965: Add annotation data structure and support code.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12486
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101 >
2025-01-21 22:58:55 +00:00
Caio Oliveira
fb09dac988
intel/brw: Remove 'fs' prefix from reg alloc code
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33112 >
2025-01-21 07:33:49 -08:00
Caio Oliveira
62dd470d0a
intel/brw: Rename brw_fs_reg_allocate.cpp to brw_reg_allocate.cpp
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33112 >
2025-01-21 07:33:49 -08:00
Caio Oliveira
793cba0e6f
intel/brw: Apply conventions to lower_src_modifiers helper
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33110 >
2025-01-19 08:24:09 -08:00
Caio Oliveira
d7d210fed4
intel/brw: Move shuffle_from_32bit_read implementation to brw_builder
...
Make it a member function for convenience -- since another
member function uses it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33108 >
2025-01-18 20:48:57 +00:00
Caio Oliveira
b3001e4946
intel/brw: Move a few builder helpers to brw_builder.h/cpp
...
Add brw prefix when necessary.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33108 >
2025-01-18 20:48:57 +00:00
Caio Oliveira
1043187ec6
intel/brw: Stop using namespace for brw_builder
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33076 >
2025-01-18 16:12:56 +00:00
Caio Oliveira
5ac82efd35
intel/brw: Rename fs_builder to brw_builder
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33076 >
2025-01-18 16:12:55 +00:00
Caio Oliveira
f2d4c9db92
intel/brw: Rename brw_fs_builder.h to brw_builder.h
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33076 >
2025-01-18 16:12:54 +00:00
Caio Oliveira
f0fe0026c0
intel/brw: Remove extra wrapping around fs_visitor in tests
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33100 >
2025-01-18 07:41:35 -08:00
Caio Oliveira
94fa449318
intel/brw: Add missing cases to flags_written()
...
These virtual opcodes will write the whole flag set, either directly
(via brw_fill_flag()) or indirectly by using LOAD_LIVE_CHANNELS.
Issue was found when analysing a hang that would disappear
if the lowering of those opcodes was pulled all the way up
right before brw_opt_cmod_propagation (which uses the
flags_written).
Fixes: 019770f026 ("intel/brw: Add SHADER_OPCODE_VOTE_*")
Fixes: 2bd7592b0b ("intel/brw: Add SHADER_OPCODE_BALLOT")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12347
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12479
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33085 >
2025-01-18 05:30:23 +00:00
Lionel Landwerlin
d63b5fc8c5
brw: handle load_printf_buffer_size intrinsic
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067 >
2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig
e7a1d704d0
intel: set max_buffer_size to nir_lower_printf
...
instead of relying on an implicit value which doesn't make much sense.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067 >
2025-01-17 18:09:45 +00:00
Caio Oliveira
0b310ae4d8
intel/brw: Rename fs_generator to brw_generator
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32844 >
2025-01-17 00:04:41 +00:00
Caio Oliveira
3659934862
intel/brw: Add brw_generator.h header
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32844 >
2025-01-17 00:04:41 +00:00
Caio Oliveira
a5a9f42a39
intel/brw: Rename brw_fs_generator.cpp to brw_generator.cpp
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32844 >
2025-01-17 00:04:41 +00:00
Lionel Landwerlin
2774fb32e6
brw: fix coarse_z computation on Xe2+
...
The payload format changed and we forgot to update this path.
Putting a Fixes: commit that is kind of related but probably not the
source of the issue.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12031
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11871
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12042
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12339
Fixes: 4672fcbc76 ("intel/fs: Fix PS thread payload setup for depth_w_coef_reg.")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33029 >
2025-01-16 07:19:57 +00:00
Caio Oliveira
634daf2827
intel/brw: Rename brw_fs_validate to brw_validate
...
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32843 >
2025-01-13 23:56:22 +00:00
Kenneth Graunke
894393470a
brw: Fix Xe2 spilling code to limit to SIMD32 rather than SIMD16
...
LSC can do native SIMD32 messages on Xe2.
Cuts spill/fills on Lunarlake:
- q2rtx-rt-pipeline: -20.83% / -16.85%
- Borderlands 3 DX12: -18.26% / -2.09%
- Cyberpunk 2077: -2.18% / -0.11%
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32986 >
2025-01-11 09:33:09 +00:00
Lionel Landwerlin
8ac7802ac8
brw: move final send lowering up into the IR
...
Because we do emit the final send message form in code generation, a
lot of emissions look like this :
add(8) vgrf0, u0, 0x100
mov(1) a0.1, vgrf0 # emitted by the generator
send(8) ..., a0.1
By moving address register manipulation into the IR, we can get this
down to :
add(1) a0.1, u0, 0x100
send(8) ..., a0.1
This reduces register pressure around some send messages by 1 vgrf.
All lost shaders in the below results are fragment SIMD32, due to the
throughput estimator. If turned off, we lose no SIMD32 shaders with
this change.
DG2 results:
Assassin's Creed Valhalla:
Totals from 2044 (96.87% of 2110) affected shaders:
Instrs: 852879 -> 832044 (-2.44%); split: -2.45%, +0.00%
Subgroup size: 23832 -> 23824 (-0.03%)
Cycle count: 53345742 -> 52144277 (-2.25%); split: -5.08%, +2.82%
Spill count: 729 -> 554 (-24.01%); split: -28.40%, +4.39%
Fill count: 2005 -> 1256 (-37.36%)
Scratch Memory Size: 25600 -> 19456 (-24.00%); split: -32.00%, +8.00%
Max live registers: 116765 -> 115058 (-1.46%)
Max dispatch width: 19152 -> 18872 (-1.46%); split: +0.21%, -1.67%
Cyberpunk 2077:
Totals from 1181 (93.43% of 1264) affected shaders:
Instrs: 667192 -> 663615 (-0.54%); split: -0.55%, +0.01%
Subgroup size: 13016 -> 13032 (+0.12%)
Cycle count: 17383539 -> 17986073 (+3.47%); split: -0.93%, +4.39%
Spill count: 12 -> 8 (-33.33%)
Fill count: 9 -> 6 (-33.33%)
Dota2:
Totals from 173 (11.59% of 1493) affected shaders:
Cycle count: 274403 -> 280817 (+2.34%); split: -0.01%, +2.34%
Max live registers: 5787 -> 5779 (-0.14%)
Max dispatch width: 1344 -> 1152 (-14.29%)
Hitman3:
Totals from 5072 (95.39% of 5317) affected shaders:
Instrs: 2879952 -> 2841804 (-1.32%); split: -1.32%, +0.00%
Cycle count: 153208505 -> 165860401 (+8.26%); split: -2.22%, +10.48%
Spill count: 3942 -> 3200 (-18.82%)
Fill count: 10158 -> 8846 (-12.92%)
Scratch Memory Size: 257024 -> 223232 (-13.15%)
Max live registers: 328467 -> 324631 (-1.17%)
Max dispatch width: 43928 -> 42768 (-2.64%); split: +0.09%, -2.73%
Fortnite:
Totals from 360 (4.82% of 7472) affected shaders:
Instrs: 778068 -> 777925 (-0.02%)
Subgroup size: 3128 -> 3136 (+0.26%)
Cycle count: 38684183 -> 38734579 (+0.13%); split: -0.06%, +0.19%
Max live registers: 50689 -> 50658 (-0.06%)
Hogwarts Legacy:
Totals from 1376 (84.00% of 1638) affected shaders:
Instrs: 758810 -> 749727 (-1.20%); split: -1.23%, +0.03%
Cycle count: 27778983 -> 28805469 (+3.70%); split: -1.42%, +5.12%
Spill count: 2475 -> 2299 (-7.11%); split: -7.47%, +0.36%
Fill count: 2677 -> 2445 (-8.67%); split: -9.90%, +1.23%
Scratch Memory Size: 99328 -> 89088 (-10.31%)
Max live registers: 84969 -> 84671 (-0.35%); split: -0.58%, +0.23%
Max dispatch width: 11848 -> 11920 (+0.61%)
Metro Exodus:
Totals from 92 (0.21% of 43072) affected shaders:
Instrs: 262995 -> 262968 (-0.01%)
Cycle count: 13818007 -> 13851266 (+0.24%); split: -0.01%, +0.25%
Max live registers: 11152 -> 11140 (-0.11%)
Red Dead Redemption 2:
Totals from 451 (7.71% of 5847) affected shaders:
Instrs: 754178 -> 753811 (-0.05%); split: -0.05%, +0.00%
Cycle count: 3484078523 -> 3484111965 (+0.00%); split: -0.00%, +0.00%
Max live registers: 42294 -> 42185 (-0.26%)
Spiderman Remastered:
Totals from 6820 (98.02% of 6958) affected shaders:
Instrs: 6921500 -> 6747933 (-2.51%); split: -4.16%, +1.65%
Cycle count: 234400692460 -> 236846720707 (+1.04%); split: -0.20%, +1.25%
Spill count: 72971 -> 72622 (-0.48%); split: -8.08%, +7.61%
Fill count: 212921 -> 198483 (-6.78%); split: -12.37%, +5.58%
Scratch Memory Size: 3491840 -> 3410944 (-2.32%); split: -12.05%, +9.74%
Max live registers: 493149 -> 487458 (-1.15%)
Max dispatch width: 56936 -> 56856 (-0.14%); split: +0.06%, -0.20%
Strange Brigade:
Totals from 3769 (91.21% of 4132) affected shaders:
Instrs: 1354476 -> 1321474 (-2.44%)
Cycle count: 25351530 -> 25339190 (-0.05%); split: -1.64%, +1.59%
Max live registers: 199057 -> 193656 (-2.71%)
Max dispatch width: 30272 -> 30240 (-0.11%)
Witcher 3:
Totals from 25 (2.40% of 1041) affected shaders:
Instrs: 24621 -> 24606 (-0.06%)
Cycle count: 2218793 -> 2217503 (-0.06%); split: -0.11%, +0.05%
Max live registers: 1963 -> 1955 (-0.41%)
LNL results:
Assassin's Creed Valhalla:
Totals from 1928 (98.02% of 1967) affected shaders:
Instrs: 856107 -> 835756 (-2.38%); split: -2.48%, +0.11%
Subgroup size: 41264 -> 41280 (+0.04%)
Cycle count: 64606590 -> 62371700 (-3.46%); split: -5.57%, +2.11%
Spill count: 915 -> 669 (-26.89%); split: -32.79%, +5.90%
Fill count: 2414 -> 1617 (-33.02%); split: -36.62%, +3.60%
Scratch Memory Size: 62464 -> 44032 (-29.51%); split: -36.07%, +6.56%
Max live registers: 205483 -> 202192 (-1.60%)
Cyberpunk 2077:
Totals from 1177 (96.40% of 1221) affected shaders:
Instrs: 682237 -> 678931 (-0.48%); split: -0.51%, +0.03%
Subgroup size: 24912 -> 24944 (+0.13%)
Cycle count: 24355928 -> 25089292 (+3.01%); split: -0.80%, +3.81%
Spill count: 8 -> 3 (-62.50%)
Fill count: 6 -> 3 (-50.00%)
Max live registers: 126922 -> 125472 (-1.14%)
Dota2:
Totals from 428 (32.47% of 1318) affected shaders:
Instrs: 89355 -> 89740 (+0.43%)
Cycle count: 1152412 -> 1152706 (+0.03%); split: -0.52%, +0.55%
Max live registers: 32863 -> 32847 (-0.05%)
Fortnite:
Totals from 5354 (81.72% of 6552) affected shaders:
Instrs: 4135059 -> 4239015 (+2.51%); split: -0.01%, +2.53%
Cycle count: 132557506 -> 132427302 (-0.10%); split: -0.75%, +0.65%
Spill count: 7144 -> 7234 (+1.26%); split: -0.46%, +1.72%
Fill count: 12086 -> 12403 (+2.62%); split: -0.73%, +3.35%
Scratch Memory Size: 600064 -> 604160 (+0.68%); split: -1.02%, +1.71%
Hitman3:
Totals from 4912 (97.09% of 5059) affected shaders:
Instrs: 2952124 -> 2916824 (-1.20%); split: -1.20%, +0.00%
Cycle count: 179985656 -> 189175250 (+5.11%); split: -2.44%, +7.55%
Spill count: 3739 -> 3136 (-16.13%)
Fill count: 10657 -> 9564 (-10.26%)
Scratch Memory Size: 373760 -> 318464 (-14.79%)
Max live registers: 597566 -> 589460 (-1.36%)
Hogwarts Legacy:
Totals from 1471 (96.33% of 1527) affected shaders:
Instrs: 748749 -> 766214 (+2.33%); split: -0.71%, +3.05%
Cycle count: 33301528 -> 34426308 (+3.38%); split: -1.30%, +4.68%
Spill count: 3278 -> 3070 (-6.35%); split: -8.30%, +1.95%
Fill count: 4553 -> 4097 (-10.02%); split: -10.85%, +0.83%
Scratch Memory Size: 251904 -> 217088 (-13.82%)
Max live registers: 168911 -> 168106 (-0.48%); split: -0.59%, +0.12%
Metro Exodus:
Totals from 18356 (49.81% of 36854) affected shaders:
Instrs: 7559386 -> 7621591 (+0.82%); split: -0.01%, +0.83%
Cycle count: 195240612 -> 196455186 (+0.62%); split: -1.22%, +1.84%
Spill count: 595 -> 546 (-8.24%)
Fill count: 1604 -> 1408 (-12.22%)
Max live registers: 2086937 -> 2086933 (-0.00%)
Red Dead Redemption 2:
Totals from 4171 (79.31% of 5259) affected shaders:
Instrs: 2619392 -> 2719587 (+3.83%); split: -0.00%, +3.83%
Subgroup size: 86416 -> 86432 (+0.02%)
Cycle count: 8542836160 -> 8531976886 (-0.13%); split: -0.65%, +0.53%
Fill count: 12949 -> 12970 (+0.16%); split: -0.43%, +0.59%
Scratch Memory Size: 401408 -> 385024 (-4.08%)
Spiderman Remastered:
Totals from 6639 (98.94% of 6710) affected shaders:
Instrs: 6877980 -> 6800592 (-1.13%); split: -3.11%, +1.98%
Cycle count: 282183352210 -> 282100051824 (-0.03%); split: -0.62%, +0.59%
Spill count: 63147 -> 64218 (+1.70%); split: -7.12%, +8.82%
Fill count: 184931 -> 175591 (-5.05%); split: -10.81%, +5.76%
Scratch Memory Size: 5318656 -> 5970944 (+12.26%); split: -5.91%, +18.17%
Max live registers: 918240 -> 906604 (-1.27%)
Strange Brigade:
Totals from 3675 (92.24% of 3984) affected shaders:
Instrs: 1462231 -> 1429345 (-2.25%); split: -2.25%, +0.00%
Cycle count: 37404050 -> 37345292 (-0.16%); split: -1.25%, +1.09%
Max live registers: 361849 -> 351265 (-2.92%)
Witcher 3:
Totals from 13 (46.43% of 28) affected shaders:
Instrs: 593 -> 660 (+11.30%)
Cycle count: 28302 -> 28714 (+1.46%)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00
Lionel Landwerlin
a27d98e933
brw: avoid having the scratch surface handle partially written
...
Allows it to be visible through the def_analysis.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00
Lionel Landwerlin
aac906c16c
brw: add scheduler support for address registers
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00
Lionel Landwerlin
0a5bdf1199
brw: add infra to make use of the address register in the IR
...
This limits the address register to simple cases inside a block.
Validation ensures that the address register is only written once and
read once.
Instruction scheduling makes sure that instructions using the address
register in the generator are not scheduled while there is a use of
the register in the IR.
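One possible reading of the write-once/read-once rule, sketched over a toy instruction record rather than the real brw IR. The struct, its fields and `toy_validate_block` are hypothetical, and the interpretation (every write inside a block is consumed by exactly one later read in the same block) is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record. */
struct toy_inst {
   bool writes_addr; /* instruction writes the address register */
   bool reads_addr;  /* instruction reads the address register */
};

/* Returns true when every address-register write in the block is followed
 * by exactly one read before the next write or the end of the block, and
 * no read happens without a pending write. */
static bool
toy_validate_block(const struct toy_inst *insts, size_t count)
{
   bool pending = false; /* written but not yet read */

   for (size_t i = 0; i < count; i++) {
      if (insts[i].reads_addr) {
         if (!pending)
            return false; /* read without a write, or a second read */
         pending = false;
      }
      if (insts[i].writes_addr) {
         if (pending)
            return false; /* overwritten before being read */
         pending = true;
      }
   }
   return !pending; /* a value must not stay live past the block */
}
```

Keeping the live range confined to a single block is what lets a check like this run per block without any cross-block dataflow.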
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00
Lionel Landwerlin
c9fa235c28
brw: split validation iteration into blocks
...
No functional change.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00
Lionel Landwerlin
9b73a73a6e
brw: use phys_nr() more in generation
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00
Lionel Landwerlin
b110b06447
brw: introduce a new register type for the address register
...
We want to reuse the brw::nr field as a virtual address register
identifier. So we can't use brw::file=ARF brw::nr=ADDRESS.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199 >
2025-01-11 08:41:42 +00:00