Francisco Jerez
53d1d793cb
intel/fs: Delete manual 'inst->mlen' calculations from all uses of logical URB writes.
...
Rework:
* Marcin: update emit_urb_indirect_vec4_write
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195 >
2023-09-27 23:57:25 +00:00
Francisco Jerez
34a2c9ce35
intel/fs: Specify number of data components of logical URB writes via control immediate.
...
This is what most logical SEND messages do when they take a variable
number of components. 'inst->mlen' is expected to be zero for logical
SEND opcodes, which are expected to behave like plain arithmetic
operations, so certain automated transformations (like SIMD lowering)
can manipulate them without opcode-specific special-casing.
Guessing the number of components from 'inst->mlen' has other
disadvantages, because it requires duplicating the logic that infers
the message payload size in every use of the instruction -- Instead we
can just do the computation once during logical send lowering. In
addition on LNL platform this causes the 'inst->mlen' field of URB
writes to have units inconsistent with every other SEND instruction,
which is likely to lead to confusion and bugs down the road.
Rework:
* Marcin: update emit_urb_indirect_vec4_write
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195 >
2023-09-27 23:57:25 +00:00
Caio Oliveira
c89597085a
intel/compiler/xe2: Update TCS ICP handle code to support SIMD16
...
Rework:
* Use ffs(grf_size_bytes) (s-b Ken)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195 >
2023-09-27 23:57:25 +00:00
Caio Oliveira
f0fcb778b4
intel/compiler/xe2: Fix URB writes in TCS
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195 >
2023-09-27 23:57:25 +00:00
Jordan Justen
f1b9b7f955
intel/fs: Update SSBO & shared uniform block loads for Xe2
...
Note: lower_lsc_block_logical_send() most likely stills needs some
related updates.
Ref: a358b97c58 ("intel/fs: optimize uniform SSBO & shared loads")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 23:06:16 -07:00
Jordan Justen
9fb2b12c99
intel/compiler: Update RT stack_id access for Xe2
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 23:06:16 -07:00
Jordan Justen
9e43fa09a6
intel/compiler: Update emit_rt_lsc_fence() for Xe2
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 23:06:16 -07:00
Francisco Jerez
791d040104
intel/fs/xe2+: Fix execution width of SHADER_OPCODE_GET_BUFFER_SIZE for SIMD16 EU.
...
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 17:19:36 -07:00
Caio Oliveira
28744c8954
intel/compiler/xe2: Account for reg_unit() in TES intrinsics
...
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 17:19:36 -07:00
Caio Oliveira
9859f5b4d2
intel/compiler/xe2: Account for reg_unit() in TCS intrinsics
...
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 17:19:36 -07:00
Ian Romanick
ef817650c9
intel/compiler/xe2: Use SIMD16 for nir_intrinsic_image_size
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020 >
2023-09-20 17:19:36 -07:00
Sviatoslav Peleshko
b1a63d5418
intel/fs: Check if the whole ubo load range is in the push const range
...
Before this, we were checking only the beginning of the ubo range, so
partially overlapping loads were trying to load undefined data.
Fixes: b2da1238 ("i965: Use pushed UBO data in the scalar backend.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9748
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25111 >
2023-09-15 10:55:24 +00:00
Iván Briano
17d7f7a292
intel/fs: read viewport and layer from the FS payload
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047 >
2023-09-12 02:51:31 +00:00
Ian Romanick
8ce4d7a08d
intel/compiler: Don't evict for workgroup-scope fences
...
Flushing and invalidating caches isn't necessary for workgroup scope
fences. In fact, the DP_FLUSH_TYPE docs (BSpec 54041) say:
"If the fence scope is Local or Threadgroup, HW ignores the flush
type and operates as if it was set to None(no flush)"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842 >
2023-09-09 04:41:25 +00:00
Lionel Landwerlin
74a40cc4b6
intel/fs: move lower of non-uniform at_sample barycentric to NIR
...
We use a non-uniform lowering loop in the backend which we can do
better in NIR because we can also use divergence analysis there.
This change also limits VGRF usage to a single VGRF to hold the sample
ID in the backend.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716 >
2023-08-29 23:19:13 +00:00
Yonggang Luo
0b84e38684
intel/brw: use 4 instead of MAX_VERTEX_STREAMS to avoid #include "mesa/main/config.h"
...
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24824 >
2023-08-24 02:54:08 +00:00
Faith Ekstrand
65b6ac8aa4
nir: Rename nir_instr_type_ssa_undef to nir_instr_type_undef
...
We already renamed the type, we just need to rename the enum and the
casting helper functions.
Generated with sed:
sed -i -e 's/nir_instr_type_ssa_undef/nir_instr_type_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
sed -i -e 's/nir_instr_as_ssa_undef/nir_instr_as_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
and two tiny whitespace fixups in lima.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703 >
2023-08-15 17:44:27 +00:00
Faith Ekstrand
4695bebc79
nir: Drop nir_dest
...
Instead, we replace every use of it with nir_def. Most of this commit
was generated by sed:
sed -i -e 's/dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
A few manual fixups were required in lima and the nir_legacy code.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674 >
2023-08-14 21:22:53 +00:00
Faith Ekstrand
6c1d32581a
nir: Drop nir_alu_dest
...
Instead, we replace it directly with nir_def. We could replace it with
nir_dest but the next commit gets rid of that so this avoids unnecessary
churn. Most of this commit was generated by sed:
sed -i -e 's/dest.dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
There were a few manual fixups required in the nir_legacy.c and
nir_from_ssa.c as nir_legacy_reg and nir_parallel_copy_entry both have a
similar pattern.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674 >
2023-08-14 21:22:53 +00:00
Faith Ekstrand
9d81f13a75
nir: Get rid of nir_dest_num_components()
...
We could add a nir_def_num_components() helper but we use
ssa.num_components about 3x as often as nir_dest_num_components() today
so that's a major Coccinelle refactor anyway and this doesn't make it
much worse. Most of this commit was generated byt the following
semantic patch:
@@
expression D;
@@
<...
-nir_dest_num_components(D)
+D.ssa.num_components
...
Some manual fixup was needed, especially in cpp files where Coccinelle
tends to give up the moment it sees any interesting C++.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674 >
2023-08-14 21:22:53 +00:00
Faith Ekstrand
80a1836d8b
nir: Get rid of nir_dest_bit_size()
...
We could add a nir_def_bit_size() helper but we use ->bit_size about 3x
as often as nir_dest_bit_size() today so that's a major Coccinelle
refactor anyway and this doesn't make it much worse. Most of this
commit was generated byt the following semantic patch:
@@
expression D;
@@
<...
-nir_dest_bit_size(D)
+D.ssa.bit_size
...
Some manual fixup was needed, especially in cpp files where Coccinelle
tends to give up the moment it sees any interesting C++.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674 >
2023-08-14 21:22:53 +00:00
Faith Ekstrand
ce8b157b94
intel/fs: Stop passing around nir_dest and nir_alu_dest
...
We want to get rid of nir_dest so back-ends need to stop storing it
in structs and passing it through helpers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674 >
2023-08-14 21:22:53 +00:00
Alyssa Rosenzweig
09d31922de
nir: Drop "SSA" from NIR language
...
Everything is SSA now.
sed -e 's/nir_ssa_def/nir_def/g' \
-e 's/nir_ssa_undef/nir_undef/g' \
-e 's/nir_ssa_scalar/nir_scalar/g' \
-e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \
-e 's/nir_gather_ssa_types/nir_gather_types/g' \
-i $(git grep -l nir | grep -v relnotes)
git mv src/compiler/nir/nir_gather_ssa_types.c \
src/compiler/nir/nir_gather_types.c
ninja -C build/ clang-format
cd src/compiler/nir && find *.c *.h -type f -exec clang-format -i \{} \;
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585 >
2023-08-12 16:44:41 -04:00
Alyssa Rosenzweig
42ee8a55dd
nir: Remove nir_alu_dest::write_mask
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:30 +00:00
Alyssa Rosenzweig
11fc4f969c
intel: Collapse is_ssa checks
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:29 +00:00
Alyssa Rosenzweig
95e3df39c0
treewide: sed out more is_ssa
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
a8013644a1
nir: Drop nir_alu_src::{negate,abs}
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
ab0d878932
treewide: Remove more is_ssa asserts
...
Stuff Coccinelle missed.
sed -i -e '/assert(.*\.is_ssa)/d' $(git grep -l is_ssa)
sed -i -e '/ASSERT.*\.is_ssa)/d' $(git grep -l is_ssa)
+ a manual fixup to restore the assert for parallel copy lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
5fead24365
treewide: Drop is_ssa asserts
...
We only see SSA now.
Via Coccinelle patch:
@@
expression x;
@@
-assert(x.is_ssa);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
d559764e7c
nir: Remove nir_alu_dest::saturate
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 >
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
51db19f7a2
nir: Rename scoped_barrier -> barrier
...
sed + ninja clang-format + fix up spacing for common code.
If you are unhappy that I did not manually change the whitespace of your driver,
you need to enable clang-format for it so the formatting would happen
automatically.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428 >
2023-08-01 23:18:29 +00:00
Jordan Justen
5df97c27dc
intel/compiler: Use nir SUBGROUP_INVOCATION for RT TOPOLOGY_ID
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21774 >
2023-07-28 22:54:59 +00:00
Mike Blumenkrantz
e68e612826
nir: add a helper for calculating variable slots
...
this will maybe avoid future bugs, but probably not
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163 >
2023-07-28 13:14:35 +00:00
Mike Blumenkrantz
59396eefe6
nir: fix slot calculations for compact variables with location_frac
...
a variable with a component offset may span multiple slots, and this cannot
be inferred from its type alone (e.g., compacted clip+cull distances)
cc: mesa-stable
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163 >
2023-07-28 13:14:35 +00:00
Lionel Landwerlin
d33aff783d
intel/fs: add support for sparse accesses
...
Purely from the backend point of view it's just an additional
parameter to sampler messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882 >
2023-07-27 02:02:30 +03:00
Faith Ekstrand
94f36cfaa3
intel/fs: Assume NIR is in SSA form
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310 >
2023-07-25 16:25:11 +00:00
Faith Ekstrand
965bbe5286
intel/fs: Rework the overlapping mov/vec case
...
Now that we're using load/store_reg intrinsics, the previous checks for
registers aren't what we want. Instead, we need to be looking for a mov
or vec where both the destination and a source are load/store_reg with
matching decl_reg.
Fixes: b8209d69ff ("intel/fs: Add support for new-style registers")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310 >
2023-07-25 16:25:11 +00:00
Faith Ekstrand
45ee952efb
intel/fs: Use write masks from store_reg intrinsics
...
Fixes: b8209d69ff ("intel/fs: Add support for new-style registers")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310 >
2023-07-25 16:25:10 +00:00
Lionel Landwerlin
8cbf730145
intel/fs: don't try to rebuild sequences of non ssa values
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 04777171e0 ("intel/fs: try to rematerialize surface computation code")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9378
Reviewed-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228 >
2023-07-24 20:04:24 +00:00
Marcin Ślusarz
a252123363
intel/compiler/mesh: compactify MUE layout
...
Instead of using 4 dwords for each output slot, use only the amount
of memory actually needed by each variable.
There are some complications from this "obvious" idea:
- flat and non-flat variables can't be merged into the same vec4 slot,
because flat inputs mask has vec4 stride
- multi-slot variables can have different layout:
float[N] requires N 1-dword slots, but
i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot
- some output variables occur both in single-channel/component split
and combined variants
- crossing vec4 boundary requires generating more writes, so avoiding them
if possible is beneficial
This patch fixes some issues with arrays in per-vertex and per-primitive data
(func.mesh.ext.outputs.*.indirect_array.q0 in crucible)
and by reduction in single MUE size it allows spawning more threads at
the same time.
Note: this patch doesn't improve vk_meshlet_cadscene performance because
default layout is already optimal enough.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407 >
2023-07-24 07:55:29 +00:00
Alyssa Rosenzweig
a08286f993
intel/fs: Don't read reg.base_offset
...
It's not set in the new intrinsics path.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253 >
2023-07-21 11:25:48 +00:00
Faith Ekstrand
39b5bb0809
intel/fs: Drop support for nir_register
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104 >
2023-07-19 02:11:57 +00:00
Faith Ekstrand
ce75c3c3fe
intel: Switch to intrinsic-based registers
...
Results on HSW (vec4 only):
total instructions in shared programs: 2978400 -> 2974135 (-0.14%)
instructions in affected programs: 77870 -> 73605 (-5.48%)
helped: 143
HURT: 48
helped stats (abs) min: 1 max: 100 x̄: 30.22 x̃: 9
helped stats (rel) min: 0.03% max: 30.49% x̄: 8.02% x̃: 6.39%
HURT stats (abs) min: 1 max: 4 x̄: 1.19 x̃: 1
HURT stats (rel) min: 0.08% max: 16.67% x̄: 3.71% x̃: 3.23%
95% mean confidence interval for instructions value: -26.69 -17.97
95% mean confidence interval for instructions %-change: -6.24% -3.90%
Instructions are helped.
total cycles in shared programs: 45345924 -> 44742666 (-1.33%)
cycles in affected programs: 29083466 -> 28480208 (-2.07%)
helped: 4785
HURT: 3879
helped stats (abs) min: 2 max: 8072 x̄: 276.00 x̃: 24
helped stats (rel) min: 0.02% max: 54.43% x̄: 7.78% x̃: 1.95%
HURT stats (abs) min: 2 max: 14736 x̄: 184.95 x̃: 20
HURT stats (rel) min: 0.02% max: 97.00% x̄: 7.69% x̃: 1.53%
95% mean confidence interval for cycles value: -83.49 -55.77
95% mean confidence interval for cycles %-change: -1.16% -0.55%
Cycles are helped.
total spills in shared programs: 1093 -> 539 (-50.69%)
spills in affected programs: 772 -> 218 (-71.76%)
helped: 74
HURT: 0
total fills in shared programs: 760 -> 757 (-0.39%)
fills in affected programs: 66 -> 63 (-4.55%)
helped: 3
HURT: 0
Results on TGL (all stages):
total instructions in shared programs: 21486982 -> 21488266 (<.01%)
instructions in affected programs: 2245938 -> 2247222 (0.06%)
helped: 1288
HURT: 1385
helped stats (abs) min: 1 max: 93 x̄: 4.05 x̃: 2
helped stats (rel) min: 0.02% max: 3.82% x̄: 0.61% x̃: 0.46%
HURT stats (abs) min: 1 max: 134 x̄: 4.69 x̃: 2
HURT stats (rel) min: <.01% max: 5.59% x̄: 0.65% x̃: 0.44%
95% mean confidence interval for instructions value: 0.13 0.83
95% mean confidence interval for instructions %-change: <.01% 0.08%
Instructions are HURT.
total cycles in shared programs: 809326677 -> 809475669 (0.02%)
cycles in affected programs: 447781659 -> 447930651 (0.03%)
helped: 1924
HURT: 1994
helped stats (abs) min: 1 max: 74567 x̄: 1217.49 x̃: 10
helped stats (rel) min: <.01% max: 38.44% x̄: 1.09% x̃: 0.17%
HURT stats (abs) min: 1 max: 76426 x̄: 1249.47 x̃: 8
HURT stats (rel) min: <.01% max: 137.11% x̄: 1.64% x̃: 0.17%
95% mean confidence interval for cycles value: -125.61 201.67
95% mean confidence interval for cycles %-change: 0.12% 0.48%
Inconclusive result (value mean confidence interval includes 0).
LOST: 4
GAINED: 4
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104 >
2023-07-19 02:11:57 +00:00
Faith Ekstrand
b8209d69ff
intel/fs: Add support for new-style registers
...
The old ones still work for now.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104 >
2023-07-19 02:11:57 +00:00
Rohan Garg
c3110ef1e9
intel/compiler: reuse previously computed bitsize
...
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23933 >
2023-06-30 09:19:57 +00:00
Rohan Garg
7f48c70bab
intel/compiler: construct masks instead of using magic values
...
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23933 >
2023-06-30 09:19:57 +00:00
Caio Oliveira
59cc77f0fa
compiler: Move from nir_scope to mesa_scope
...
Just moving the enum and performing renames, no behavior change.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328 >
2023-06-19 23:29:26 +00:00
Ian Romanick
96cde9cc01
intel/fs: Emit better code for bfi(..., 0)
...
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
total instructions in shared programs: 20570141 -> 20570063 (<.01%)
instructions in affected programs: 30679 -> 30601 (-0.25%)
helped: 77 / HURT: 0
total cycles in shared programs: 902113977 -> 902118723 (<.01%)
cycles in affected programs: 3255958 -> 3260704 (0.15%)
helped: 60 / HURT: 19
Broadwell
total instructions in shared programs: 18524633 -> 18524547 (<.01%)
instructions in affected programs: 34095 -> 34009 (-0.25%)
helped: 75 / HURT: 2
total cycles in shared programs: 949532394 -> 949543761 (<.01%)
cycles in affected programs: 3419107 -> 3430474 (0.33%)
helped: 57 / HURT: 24
total spills in shared programs: 22484 -> 22484 (0.00%)
spills in affected programs: 516 -> 516 (0.00%)
helped: 2 / HURT: 2
total fills in shared programs: 29346 -> 29338 (-0.03%)
fills in affected programs: 572 -> 564 (-1.40%)
helped: 4 / HURT: 0
Haswell
total instructions in shared programs: 17331356 -> 17331523 (<.01%)
instructions in affected programs: 27920 -> 28087 (0.60%)
helped: 41 / HURT: 4
total cycles in shared programs: 936603192 -> 936574664 (<.01%)
cycles in affected programs: 3417695 -> 3389167 (-0.83%)
helped: 28 / HURT: 21
total spills in shared programs: 19718 -> 19756 (0.19%)
spills in affected programs: 436 -> 474 (8.72%)
helped: 0 / HURT: 4
total fills in shared programs: 22547 -> 22607 (0.27%)
fills in affected programs: 444 -> 504 (13.51%)
helped: 0 / HURT: 4
Ivy Bridge
total cycles in shared programs: 463451277 -> 463451273 (<.01%)
cycles in affected programs: 95870 -> 95866 (<.01%)
helped: 3 / HURT: 2
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152825278 -> 152819969 (-0.00%); split: -0.00%, +0.00%
Cycles: 15014075626 -> 15014628652 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 8528536 -> 8528560 (+0.00%)
Send messages: 7711431 -> 7711464 (+0.00%)
Spill count: 99907 -> 99509 (-0.40%); split: -0.40%, +0.00%
Fill count: 202459 -> 201598 (-0.43%); split: -0.43%, +0.00%
Scratch Memory Size: 4376576 -> 4371456 (-0.12%)
Totals from 2915 (0.44% of 662497) affected shaders:
Instrs: 2288842 -> 2283533 (-0.23%); split: -0.24%, +0.01%
Cycles: 471633295 -> 472186321 (+0.12%); split: -0.27%, +0.39%
Subgroup size: 27488 -> 27512 (+0.09%)
Send messages: 151344 -> 151377 (+0.02%)
Spill count: 48091 -> 47693 (-0.83%); split: -0.83%, +0.00%
Fill count: 59053 -> 58192 (-1.46%); split: -1.46%, +0.00%
Scratch Memory Size: 1827840 -> 1822720 (-0.28%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Lionel Landwerlin
6b9f838d62
intel/fs: handle load_global_constant_uniform_block_intel
...
Again, load the data just once in GRF, share it across lanes.
Shader-db on dg2:
total instructions in shared programs: 23214555 -> 23215400 (<.01%)
instructions in affected programs: 199977 -> 200822 (0.42%)
helped: 3
HURT: 38
helped stats (abs) min: 5 max: 670 x̄: 283.67 x̃: 176
helped stats (rel) min: 1.34% max: 49.41% x̄: 22.15% x̃: 15.70%
HURT stats (abs) min: 1 max: 185 x̄: 44.63 x̃: 32
HURT stats (rel) min: 0.13% max: 42.86% x̄: 10.25% x̃: 9.30%
95% mean confidence interval for instructions value: -18.65 59.87
95% mean confidence interval for instructions %-change: 3.29% 12.47%
Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 5928 -> 5928 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 851137495 -> 851152449 (<.01%)
cycles in affected programs: 16406137 -> 16421091 (0.09%)
helped: 9
HURT: 32
helped stats (abs) min: 10 max: 13498 x̄: 6443.22 x̃: 5581
helped stats (rel) min: 0.11% max: 4.75% x̄: 1.45% x̃: 0.34%
HURT stats (abs) min: 3 max: 15056 x̄: 2279.47 x̃: 735
HURT stats (rel) min: 0.10% max: 23.71% x̄: 4.58% x̃: 4.65%
95% mean confidence interval for cycles value: -1315.40 2044.87
95% mean confidence interval for cycles %-change: 1.71% 4.80%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 11856 -> 11825 (-0.26%)
spills in affected programs: 2368 -> 2337 (-1.31%)
helped: 4
HURT: 0
total fills in shared programs: 16258 -> 16207 (-0.31%)
fills in affected programs: 2930 -> 2879 (-1.74%)
helped: 4
HURT: 0
total sends in shared programs: 1038194 -> 1038185 (<.01%)
sends in affected programs: 40 -> 31 (-22.50%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.25 x̃: 2
helped stats (rel) min: 10.00% max: 33.33% x̄: 21.46% x̃: 21.25%
95% mean confidence interval for sends value: -4.64 0.14
95% mean confidence interval for sends %-change: -40.41% -2.51%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 0
Some VK/DX titles result (on DG2 only), it's mostly additional
instruction counts except for the unity spaceship demo where a CS
shader gets additional SIMDness. The reason for additional
instructions is that since we're doing block loads, we need to find
the live channels in control flow to select a single lane value that
is valid.
aztec_ruins_high:
Totals from 3 (1.12% of 269) affected shaders:
Instrs: 17732 -> 17896 (+0.92%)
Cycles: 796518 -> 819302 (+2.86%)
cyberpunk_2077:
Totals from 17 (0.17% of 10301) affected shaders:
Instrs: 10848 -> 11658 (+7.47%)
Cycles: 248243 -> 259168 (+4.40%); split: -0.57%, +4.97%
fallout_4_dxvk_g2:
Totals from 2 (0.12% of 1638) affected shaders:
Instrs: 3157 -> 3368 (+6.68%)
Cycles: 487807 -> 490426 (+0.54%); split: -0.26%, +0.79%
Max live registers: 139 -> 141 (+1.44%)
red_dead_redemption2:
Totals from 68 (1.14% of 5970) affected shaders:
Instrs: 34871 -> 36486 (+4.63%)
Cycles: 551430 -> 565211 (+2.50%)
Send messages: 2074 -> 2072 (-0.10%)
Max live registers: 5078 -> 5077 (-0.02%)
total_war_warhammer2:
Totals from 5 (1.05% of 478) affected shaders:
Instrs: 6905 -> 6971 (+0.96%); split: -0.16%, +1.12%
Cycles: 97035 -> 97989 (+0.98%); split: -0.07%, +1.05%
unity spaceship demo (instruction count going up due to a CS shader
bump from SIMD8->16):
Totals from 53 (9.71% of 546) affected shaders:
Instrs: 223748 -> 233223 (+4.23%); split: -0.01%, +4.25%
Cycles: 23134697 -> 25207080 (+8.96%); split: -0.17%, +9.13%
Subgroup size: 480 -> 488 (+1.67%)
Spill count: 2156 -> 2242 (+3.99%); split: -0.19%, +4.17%
Fill count: 4617 -> 4845 (+4.94%); split: -0.09%, +5.02%
Max live registers: 5991 -> 6050 (+0.98%); split: -0.40%, +1.39%
Max dispatch width: 480 -> 488 (+1.67%)
witcher_3_dxvk_g2:
Totals from 27 (2.51% of 1074) affected shaders:
Instrs: 57067 -> 57677 (+1.07%); split: -0.03%, +1.10%
Cycles: 1397871 -> 1436704 (+2.78%); split: -0.35%, +3.13%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
5ae8a78d8c
intel/fs: make use of load_ubo_uniform_block_intel
...
The principle is the same as the load_ssbo_uniform_block_intel.
Whenever we see a uniform offset, load the data only once in GRFs to
reduce register pressure.
Iris shader-db run on DG2 :
total instructions in shared programs: 23001325 -> 23094969 (0.41%)
instructions in affected programs: 1775989 -> 1869633 (5.27%)
helped: 764
HURT: 2097
helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2
helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63%
HURT stats (abs) min: 1 max: 2461 x̄: 47.19 x̃: 7
HURT stats (rel) min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60%
95% mean confidence interval for instructions value: 25.43 40.03
95% mean confidence interval for instructions %-change: 3.60% 4.33%
Instructions are HURT.
total loops in shared programs: 5847 -> 5847 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 839329852 -> 845491482 (0.73%)
cycles in affected programs: 130229434 -> 136391064 (4.73%)
helped: 1098
HURT: 2228
helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22
helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71%
HURT stats (abs) min: 1 max: 185309 x̄: 3426.24 x̃: 87
HURT stats (rel) min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82%
95% mean confidence interval for cycles value: 1342.16 2362.97
95% mean confidence interval for cycles %-change: 3.70% 4.52%
Cycles are HURT.
total spills in shared programs: 10768 -> 11856 (10.10%)
spills in affected programs: 9717 -> 10805 (11.20%)
helped: 25
HURT: 28
total fills in shared programs: 13720 -> 16258 (18.50%)
fills in affected programs: 12016 -> 14554 (21.12%)
helped: 25
HURT: 28
total sends in shared programs: 1034790 -> 1031266 (-0.34%)
sends in affected programs: 33416 -> 29892 (-10.55%)
helped: 1005
HURT: 0
helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3
helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08%
95% mean confidence interval for sends value: -3.72 -3.29
95% mean confidence interval for sends %-change: -15.82% -14.57%
Sends are helped.
LOST: 26
GAINED: 183
shader-db on a number of VK/DX titles on DG2 :
PERCENTAGE DELTAS Shaders Instrs Cycles
age_of_wonders_III 1928 +0.02% -0.19%
PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width
assassins_creed_odyssey 2119 +1.12% -0.42% -0.03% -0.29% -9.10% -4.26% -0.64% +0.65%
PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Max live registers
aztec_ruins_high 269 -0.05% -0.45% -0.29% -7.27% -0.33%
PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width
dark_souls_3_dxvk_g2 1420 +0.09% +0.24% +0.21% +0.12%
(stats look bad, but it's just one shader affected)
PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Scratch Memory Size Max live registers
fallout_4_dxvk_g2 1638 +0.67% +8.32% +16.02% +7.17% +100.00% +0.48%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Spill count Fill count Max live registers Max dispatch width
red_dead_redemption2 5969 +0.16% -0.04% -0.04% +0.01% +0.05% -0.20% +0.04%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
rise_of_the_tomb_raider_g2 12129 +2.19% +1.36% -1.23% -0.36% +2.04%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers
shooter-game 693 +0.07% -0.89% -0.09% -0.09%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
talos_g2 1140 +0.37% +3.80% -0.86% -0.67% +0.19%
PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width
total_war_warhammer2 477 +0.25% +0.66% -0.17% +0.10%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
witcher_3_dxvk_g2 1074 +0.75% -10.45% -0.15% -0.16% -0.16%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers
wolfenstein_youngblood 1111 +0.52% +0.66% -0.59% -0.03%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00