Commit graph

3754 commits

Author SHA1 Message Date
Marcin Ślusarz
a252123363 intel/compiler/mesh: compactify MUE layout
Instead of using 4 dwords for each output slot, use only the amount
of memory actually needed by each variable.

There are some complications from this "obvious" idea:
- flat and non-flat variables can't be merged into the same vec4 slot,
  because flat inputs mask has vec4 stride
- multi-slot variables can have different layout:
   float[N] requires N 1-dword slots, but
   i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot
- some output variables occur both in single-channel/component split
  and combined variants
- crossing vec4 boundary requires generating more writes, so avoiding them
  if possible is beneficial

This patch fixes some issues with arrays in per-vertex and per-primitive data
(func.mesh.ext.outputs.*.indirect_array.q0 in crucible)
and by reduction in single MUE size it allows spawning more threads at
the same time.

Note: this patch doesn't improve vk_meshlet_cadscene performance because
default layout is already optimal enough.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
Alyssa Rosenzweig
1466014184 nir: Rename lower_locals_to_reg_intrinsics back
The short name is freed up.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
2023-07-21 11:25:49 +00:00
Alyssa Rosenzweig
a08286f993 intel/fs: Don't read reg.base_offset
It's not set in the new intrinsics path.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
2023-07-21 11:25:48 +00:00
Marcin Ślusarz
3c83ac8002 intel/compiler: remove redundant code
has_lsc is checked few lines above, so this code doesn't matter.

Added in: a358b97c58 ("intel/fs: optimize uniform SSBO & shared loads")

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24260>
2023-07-21 07:22:22 +00:00
Felix DeGrood
d04be9770b intel/compiler: use shader source hash in shader dump code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
2023-07-20 09:08:08 +00:00
Lionel Landwerlin
3384f029be intel/compiler: rework input parameters
Use a struct for various common parameters rather than per stage
structure or arguments to stage specific entrypoints.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
2023-07-20 09:08:08 +00:00
Lionel Landwerlin
46958bcb74 intel/fs: fix missing predicate on SEL instruction
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9381
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24236>
2023-07-19 21:57:25 +00:00
Faith Ekstrand
61a90c2968 intel/vec4: Drop support for nir_register
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Faith Ekstrand
39b5bb0809 intel/fs: Drop support for nir_register
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Faith Ekstrand
ce75c3c3fe intel: Switch to intrinsic-based registers
Results on HSW (vec4 only):

    total instructions in shared programs: 2978400 -> 2974135 (-0.14%)
    instructions in affected programs: 77870 -> 73605 (-5.48%)
    helped: 143
    HURT: 48
    helped stats (abs) min: 1 max: 100 x̄: 30.22 x̃: 9
    helped stats (rel) min: 0.03% max: 30.49% x̄: 8.02% x̃: 6.39%
    HURT stats (abs)   min: 1 max: 4 x̄: 1.19 x̃: 1
    HURT stats (rel)   min: 0.08% max: 16.67% x̄: 3.71% x̃: 3.23%
    95% mean confidence interval for instructions value: -26.69 -17.97
    95% mean confidence interval for instructions %-change: -6.24% -3.90%
    Instructions are helped.

    total cycles in shared programs: 45345924 -> 44742666 (-1.33%)
    cycles in affected programs: 29083466 -> 28480208 (-2.07%)
    helped: 4785
    HURT: 3879
    helped stats (abs) min: 2 max: 8072 x̄: 276.00 x̃: 24
    helped stats (rel) min: 0.02% max: 54.43% x̄: 7.78% x̃: 1.95%
    HURT stats (abs)   min: 2 max: 14736 x̄: 184.95 x̃: 20
    HURT stats (rel)   min: 0.02% max: 97.00% x̄: 7.69% x̃: 1.53%
    95% mean confidence interval for cycles value: -83.49 -55.77
    95% mean confidence interval for cycles %-change: -1.16% -0.55%
    Cycles are helped.

    total spills in shared programs: 1093 -> 539 (-50.69%)
    spills in affected programs: 772 -> 218 (-71.76%)
    helped: 74
    HURT: 0

    total fills in shared programs: 760 -> 757 (-0.39%)
    fills in affected programs: 66 -> 63 (-4.55%)
    helped: 3
    HURT: 0

Results on TGL (all stages):

    total instructions in shared programs: 21486982 -> 21488266 (<.01%)
    instructions in affected programs: 2245938 -> 2247222 (0.06%)
    helped: 1288
    HURT: 1385
    helped stats (abs) min: 1 max: 93 x̄: 4.05 x̃: 2
    helped stats (rel) min: 0.02% max: 3.82% x̄: 0.61% x̃: 0.46%
    HURT stats (abs)   min: 1 max: 134 x̄: 4.69 x̃: 2
    HURT stats (rel)   min: <.01% max: 5.59% x̄: 0.65% x̃: 0.44%
    95% mean confidence interval for instructions value: 0.13 0.83
    95% mean confidence interval for instructions %-change: <.01% 0.08%
    Instructions are HURT.

    total cycles in shared programs: 809326677 -> 809475669 (0.02%)
    cycles in affected programs: 447781659 -> 447930651 (0.03%)
    helped: 1924
    HURT: 1994
    helped stats (abs) min: 1 max: 74567 x̄: 1217.49 x̃: 10
    helped stats (rel) min: <.01% max: 38.44% x̄: 1.09% x̃: 0.17%
    HURT stats (abs)   min: 1 max: 76426 x̄: 1249.47 x̃: 8
    HURT stats (rel)   min: <.01% max: 137.11% x̄: 1.64% x̃: 0.17%
    95% mean confidence interval for cycles value: -125.61 201.67
    95% mean confidence interval for cycles %-change: 0.12% 0.48%
    Inconclusive result (value mean confidence interval includes 0).

    LOST:   4
    GAINED: 4

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Faith Ekstrand
abb6188a04 intel/vec4: Add support for new-style registers
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Faith Ekstrand
f783eb9ebd intel/vec4: Assume get_nir_dest() provides a sane write-mask
It should be providing a write mask that is all the channels.  Drop the
one case for load_input where we stomp this for no good reason.  Also,
make ALU write-masking AND with the existing mask.  This prepares us for
the next patch where we convert to new-style registers.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Faith Ekstrand
b8209d69ff intel/fs: Add support for new-style registers
The old ones still work for now.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Sagar Ghuge
efa6594536 intel/compiler: Look at 2 register worth of data instead of 4
Sampler always writes 4/8 register worth of data but for ld_mcs only
valid data is in first two register. So with 16-bit payload, we need to
split 2-32bit registers into 4-16-bit payload.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22802>
2023-07-14 21:17:19 +00:00
Marcin Ślusarz
36ff6c0004 intel/compiler: remove NV_mesh_shader support
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24071>
2023-07-14 08:27:14 +00:00
Christian Gmeiner
9383009809 nir: rename has_txs to has_texture_scaling
Convert it to an opt-in for backends to prefer and use nir_load_texture_scale
instead of txs for nir lowerings.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
2023-07-12 10:03:06 +00:00
Faith Ekstrand
73e191924c nir: Add a reg_intrinsics flag to nir_convert_from_ssa
It doesn't do anything yet. We leave that to the subsequent patches so we can
keep the tree-wide refactor as simple as possible.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Yonggang Luo
48a25ef700 treewide: Remove all usage of nir_builder_init with nir_builder_create and nir_builder_at
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24038>
2023-07-10 19:20:17 +00:00
Karol Herbst
1e655b2f25 clc: rework optional subgroup feature
OpenCL 3.0 core requires __opencl_c_subgroups to be set, the OpenCL
cl_khr_subgroups extenions can only be enabled if and only if the driver
guarentees independent forward progress between subgroups.

See CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS for more information.

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22893>
2023-07-07 12:27:35 +00:00
Lionel Landwerlin
c26c0a36d3 intel/fs: disable coarse pixel shader with interpolater messages at sample
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9292
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23962>
2023-07-06 12:48:52 +00:00
Marcin Ślusarz
7ed9ec70c0 intel/compiler: simplify reading of gl_NumWorkGroups in task/mesh
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22334>
2023-07-04 09:15:08 +00:00
Marcin Ślusarz
1ac1d5d62e anv,intel/compiler: enable shortcut in wg id to wg idx lowering on >= gfx12.5
This speeds up vk_meshlet_cadscene in "VK mesh ext" renderer by 1.4%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22334>
2023-07-04 09:15:08 +00:00
Marcin Ślusarz
7ec1ef75d3 intel/compiler: pass num_workgroups from task to mesh shaders
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22334>
2023-07-04 09:15:08 +00:00
Konstantin Seurer
05269047d3 intel: Use nir_builder_at
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23883>
2023-07-03 15:21:38 +00:00
Rohan Garg
c3110ef1e9 intel/compiler: reuse previously computed bitsize
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23933>
2023-06-30 09:19:57 +00:00
Rohan Garg
7f48c70bab intel/compiler: construct masks instead of using magic values
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23933>
2023-06-30 09:19:57 +00:00
Yonggang Luo
68b8aa788d intel/compiler: Switch to use nir_foreach_function_impl
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23920>
2023-06-29 11:29:54 +00:00
Erik Faye-Lund
c4b6b0d949 intel: use imm-helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23855>
2023-06-29 07:08:19 +00:00
Alyssa Rosenzweig
173b9ee69a treewide: Use nir_builder_create more
perl -p0e 's/nir_builder_init\(&([^,]*), /\1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig
815efcdf7e nir: Use nir_builder_create
perl -p0e 's/nir_builder ([^;]*);\s*nir_builder_init\(&\1, /nir_builder \1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Konstantin Seurer
8f3db26d14 intel: Use nir_ instead of nir_build_ helpers
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23858>
2023-06-27 17:37:54 +00:00
Alyssa Rosenzweig
6689c678fe nir/lower_locals_to_regs: Add bool bitsize knob
GLSL booleans (and hence bool derefs) may be translated either as 1-bit or
32-bit NIR registers, depending whether the backend uses nir_lower_bool_to_int32
or not. Add a knob for this and choose the right type for different backends.

Fixes nir_validate failure on
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_bvec3 run under
lavapipe. That test indexes into a bvec3 array, and gallivm first lowers bools
and then lowers derefs to registers, resulting in random 1-bit booleans mixed in
with 32-bit bools.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>
2023-06-26 08:22:06 -04:00
Karol Herbst
570c263ea3 nir/load_libclc: run some opt passes for everybody
Cuts down serialized size from 2850288 to 1377780 bytes.

Reduces clinfo with Rusticl time by 40% for debug builds.

(Old data, but the point stands)

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15996>
2023-06-22 21:02:57 +00:00
Michel Zou
badb85edb8 util: reinstate ENUM_PACKED
gets rid of warning: 'gcc_struct' attribute ignored [-Wattributes] introduced by !23338

Fixes: 86532fa21d ("util: Use the gcc_struct attribute for packed structures in mingw")
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23478>
2023-06-21 21:51:59 +00:00
Ian Romanick
ed5d346868 intel/fs: Add missing newline
Emacs will add a newline to the end of this file whether I've edited
that line or not. It was driving me up the wall, so... yeah.

Trivial.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23777>
2023-06-21 19:57:58 +00:00
Ian Romanick
5336cbff3b intel/fs: Constant propagate into SHADER_OPCODE_SHUFFLE
Code already exists to convert SHADER_OPCODE_SHUFFLE into a simple MOV
when either source is constant. However... the constants have to
actually get into those sources!

On a shader that I'm working on that multiplies very large matrices using
lots of subgroup operations,

-SIMD8 shader: 1378 instructions. 3 loops. 793896 cycles. 0:0 spills:fills, 23 sends, scheduled with mode non-lifo. Promoted 0 constants. Compacted 22048 to 21664 bytes (2%)
+SIMD8 shader: 346 instructions. 3 loops. 61742 cycles. 0:0 spills:fills, 23 sends, scheduled with mode top-down. Promoted 0 constants. Compacted 5536 to 5216 bytes (6%)

No changes in shader-db or fossil-db on any Intel platform.

v2: Merge a bunch of identical cases. Suggested by Ken.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23609>
2023-06-21 17:16:57 +00:00
Tapani Pälli
12d7aaf2b8 intel/compiler: add more validation for acc register usage
This is described in Wa_14014617373 and a programming note has
been added to specification.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23682>
2023-06-21 08:15:59 +00:00
Caio Oliveira
fde8bf7b7f intel/compiler: Respect NIR_DEBUG_PRINT_INTERNAL flag
If flag is not set, don't print debugging
information for internal shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23756>
2023-06-21 00:01:10 +00:00
Caio Oliveira
59a72570b6 compiler: Move spirv into a module of its own
For historical reasons, nir and vtn were compiled together,
and a bunch of vtn specific targets were defined in
src/compiler/meson.build.

Now that we can, make src/compiler/spirv produce an internal
library that depends on NIR, and is used by the drivers/tools.
Also move the vtn specific targets into that directory's
meson.build.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23668>
2023-06-20 16:18:08 +00:00
Caio Oliveira
cb588d5d6e compiler/clc: Move related NIR passes to the common mesa clc
These were historically in the spirv+nir combo, but the common mesa clc
is a better home for them.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Nora Allen <blackcatgames@protonmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23667>
2023-06-20 03:43:41 +00:00
Caio Oliveira
be3e4c8aaf compiler/clc: Rename the internal library from libclc to libmesaclc
There is an actual external libclc and we do use it, so rename the
internal common library to avoid confusion.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Nora Allen <blackcatgames@protonmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23667>
2023-06-20 03:43:41 +00:00
Caio Oliveira
0c387249e1 intel/compiler: Move brw_kernel.c to the intel_clc target
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23667>
2023-06-20 03:43:40 +00:00
Caio Oliveira
59cc77f0fa compiler: Move from nir_scope to mesa_scope
Just moving the enum and performing renames, no behavior change.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>
2023-06-19 23:29:26 +00:00
Erik Faye-Lund
a593de7cf3 nir: add missed nir_cmp_imm-helpers
Seems I missed these in my previous round, let's fix them up now!

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461>
2023-06-15 13:34:49 +00:00
Erik Faye-Lund
2a71e332aa nir: use new immediate comparison helpers
There's plenty of places we can use these new and shiny helpers, so
let's clean up the code a bit.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23460>
2023-06-15 13:33:58 +02:00
Ian Romanick
96cde9cc01 intel/fs: Emit better code for bfi(..., 0)
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
total instructions in shared programs: 20570141 -> 20570063 (<.01%)
instructions in affected programs: 30679 -> 30601 (-0.25%)
helped: 77 / HURT: 0

total cycles in shared programs: 902113977 -> 902118723 (<.01%)
cycles in affected programs: 3255958 -> 3260704 (0.15%)
helped: 60 / HURT: 19

Broadwell
total instructions in shared programs: 18524633 -> 18524547 (<.01%)
instructions in affected programs: 34095 -> 34009 (-0.25%)
helped: 75 / HURT: 2

total cycles in shared programs: 949532394 -> 949543761 (<.01%)
cycles in affected programs: 3419107 -> 3430474 (0.33%)
helped: 57 / HURT: 24

total spills in shared programs: 22484 -> 22484 (0.00%)
spills in affected programs: 516 -> 516 (0.00%)
helped: 2 / HURT: 2

total fills in shared programs: 29346 -> 29338 (-0.03%)
fills in affected programs: 572 -> 564 (-1.40%)
helped: 4 / HURT: 0

Haswell
total instructions in shared programs: 17331356 -> 17331523 (<.01%)
instructions in affected programs: 27920 -> 28087 (0.60%)
helped: 41 / HURT: 4

total cycles in shared programs: 936603192 -> 936574664 (<.01%)
cycles in affected programs: 3417695 -> 3389167 (-0.83%)
helped: 28 / HURT: 21

total spills in shared programs: 19718 -> 19756 (0.19%)
spills in affected programs: 436 -> 474 (8.72%)
helped: 0 / HURT: 4

total fills in shared programs: 22547 -> 22607 (0.27%)
fills in affected programs: 444 -> 504 (13.51%)
helped: 0 / HURT: 4

Ivy Bridge
total cycles in shared programs: 463451277 -> 463451273 (<.01%)
cycles in affected programs: 95870 -> 95866 (<.01%)
helped: 3 / HURT: 2

DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152825278 -> 152819969 (-0.00%); split: -0.00%, +0.00%
Cycles: 15014075626 -> 15014628652 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 8528536 -> 8528560 (+0.00%)
Send messages: 7711431 -> 7711464 (+0.00%)
Spill count: 99907 -> 99509 (-0.40%); split: -0.40%, +0.00%
Fill count: 202459 -> 201598 (-0.43%); split: -0.43%, +0.00%
Scratch Memory Size: 4376576 -> 4371456 (-0.12%)

Totals from 2915 (0.44% of 662497) affected shaders:
Instrs: 2288842 -> 2283533 (-0.23%); split: -0.24%, +0.01%
Cycles: 471633295 -> 472186321 (+0.12%); split: -0.27%, +0.39%
Subgroup size: 27488 -> 27512 (+0.09%)
Send messages: 151344 -> 151377 (+0.02%)
Spill count: 48091 -> 47693 (-0.83%); split: -0.83%, +0.00%
Fill count: 59053 -> 58192 (-1.46%); split: -1.46%, +0.00%
Scratch Memory Size: 1827840 -> 1822720 (-0.28%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
e419eefd34 intel/fs: Use nir_opt_reassociate_bfi
All Skylake and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19907072 -> 19907054 (<.01%)
instructions in affected programs: 8859 -> 8841 (-0.20%)
helped: 9 / HURT: 0

total cycles in shared programs: 855791238 -> 855779334 (<.01%)
cycles in affected programs: 3308294 -> 3296390 (-0.36%)
helped: 12 / HURT: 13

Broadwell
total instructions in shared programs: 17818231 -> 17817440 (<.01%)
instructions in affected programs: 9887 -> 9096 (-8.00%)
helped: 9 / HURT: 0

total cycles in shared programs: 902970035 -> 902941221 (<.01%)
cycles in affected programs: 2767243 -> 2738429 (-1.04%)
helped: 14 / HURT: 5

total spills in shared programs: 17784 -> 17718 (-0.37%)
spills in affected programs: 318 -> 252 (-20.75%)
helped: 1 / HURT: 0

total fills in shared programs: 25458 -> 24949 (-2.00%)
fills in affected programs: 1346 -> 837 (-37.82%)
helped: 1 / HURT: 0

Haswell
total instructions in shared programs: 16707799 -> 16707586 (<.01%)
instructions in affected programs: 24049 -> 23836 (-0.89%)
helped: 41 / HURT: 0

total cycles in shared programs: 882730648 -> 882723174 (<.01%)
cycles in affected programs: 5096737 -> 5089263 (-0.15%)
helped: 25 / HURT: 12

total spills in shared programs: 14937 -> 14909 (-0.19%)
spills in affected programs: 436 -> 408 (-6.42%)
helped: 4 / HURT: 0

total fills in shared programs: 17569 -> 17529 (-0.23%)
fills in affected programs: 444 -> 404 (-9.01%)
helped: 4 / HURT: 0

No shader-db changes on any older Intel platforms.

All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 153118594 -> 153117340 (-0.00%); split: -0.00%, +0.00%
Cycles: 15011967556 -> 15011904351 (-0.00%); split: -0.00%, +0.00%
Fill count: 203692 -> 203684 (-0.00%)

Totals from 703 (0.11% of 662496) affected shaders:
Instrs: 192826 -> 191572 (-0.65%); split: -0.65%, +0.00%
Cycles: 29937640 -> 29874435 (-0.21%); split: -0.25%, +0.04%
Fill count: 4146 -> 4138 (-0.19%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Emma Anholt
10b94772d2 intel: Reduce cost of resetting last_grf_write.
In zink-on-anv fs-mod-dvec3-dvec3.shader_test, we were memsetting 2MB of
last_grf_write 2400 times, multiple times through the scheduler.  Just
resetting for the processed instructions reduces runtime from 21s to 16s.
No change on steam shader-db runtime across several runs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>
2023-06-14 16:16:56 +00:00
Emma Anholt
7d4769e802 intel: Allocate the last_grf_write once per scheduler.
No need to re-calloc it per block when we're going to use it again.  Also,
this fixes the vec4 backend to avoid allocating giant grf_count-sized
arrays on the stack.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>
2023-06-14 16:16:56 +00:00
Emma Anholt
2ad865b219 intel: Count reads_remaining across all blocks.
We were zeroing it out per block, but it doesn't actually help to count
per block, since the question is "will scheduling this instruction free
the reg?".  Saves some memsetting, which was showing up high in the
profile (but not from this source).

No change on iris SKL shader-db.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>
2023-06-14 16:16:55 +00:00