Commit graph

151 commits

Author SHA1 Message Date
Benjamin Lee
74ccf6cbdc nir: add option to use compact view indices
In panvk we pass absolute view indices to the hardware, so we need to do
the conversion from compacted to absolute at some point. Emitting
absolute indices from nir_lower_multiview initially looks like the
simplest option, but nir_lower_io_to_temporaries will emit a write for
every element of array varyings. This results in unnecessary writes to
disabled views.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
2024-12-09 20:31:49 +00:00
Benjamin Lee
becb014d27 nir: treat per-view outputs as arrayed IO
This is needed for implementing multiview in panvk, where the address
calculation for multiview outputs is not well-represented by lowering to
nir_intrinsic_store_output with a single offset.

The case where a variable is both per-view and per-{vertex,primitive} is
now unsupported. This would come up with drivers implementing
NV_mesh_shader or using nir_lower_multiview on geometry, tessellation,
or mesh shaders. No drivers currently do either of these. There was some
code that attempted to handle the nested per-view case by unwrapping
per-view/arrayed types twice, but it's unclear to what extent this
actually worked.

ANV and Turnip both rely on per-view outputs being assigned a unique
driver location for each view, so I've added on option to configure that
behavior rather than removing it.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
2024-12-09 20:31:49 +00:00
Marek Olšák
7f4e36ff7d gallium: replace PIPE_SHADER_CAP_INDIRECT_INPUT/OUTPUT_ADDR with NIR options
This is a prerequisite for enabling nir_opt_varyings for all gallium
drivers.

nir_lower_io_passes (called by the GLSL linker) only uses NIR options
to lower indirect IO access before lowering IO and calling
nir_opt_varyings.

Most drivers report full support for indirect IO and lower it themselves,
which prevents compaction of lowered indirectly accessed varyings because
nir_opt_varyings doesn't touch indirect varyings.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (Rb for asahi)
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> (for r300)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32423>
2024-12-03 12:57:36 +00:00
Marek Olšák
25d4943481 nir: make use_interpolated_input_intrinsics a nir_lower_io parameter
This will need to be set to true when the GLSL linker lowers IO, which
can later be unlowered by st/mesa, and then drivers can lower it again
without load_interpolated_input. Therefore, it can't be a global
immutable option.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32229>
2024-11-20 02:45:37 +00:00
Georg Lehmann
cba575f4df nir: always emit ddx intrinsics
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Ian Romanick
c96ceb50d0 intel/brw/xe2: Allow int64 conversions
As far as I can tell from looking at the Bspec, MOV between integers
of all sizes appears to be supported.

shader-db:

total instructions in shared programs: 17480631 -> 17480535 (<.01%)
instructions in affected programs: 26284 -> 26188 (-0.37%)
helped: 21 / HURT: 13

total cycles in shared programs: 897601907 -> 897664293 (<.01%)
cycles in affected programs: 10929664 -> 10992050 (0.57%)
helped: 48 / HURT: 45

fossil-db:

Totals:
Instrs: 140686824 -> 140686155 (-0.00%); split: -0.00%, +0.00%
Cycle count: 21525129188 -> 21524717729 (-0.00%); split: -0.01%, +0.00%
Spill count: 70778 -> 70776 (-0.00%)
Fill count: 139172 -> 139168 (-0.00%)
Max live registers: 47513859 -> 47513795 (-0.00%)

Totals from 612 (0.11% of 549272) affected shaders:
Instrs: 964441 -> 963772 (-0.07%); split: -0.09%, +0.02%
Cycle count: 1215564312 -> 1215152853 (-0.03%); split: -0.09%, +0.06%
Spill count: 16172 -> 16170 (-0.01%)
Fill count: 37962 -> 37958 (-0.01%)
Max live registers: 70749 -> 70685 (-0.09%)

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30700>
2024-08-21 20:16:00 +00:00
Alyssa Rosenzweig
eec02246f8 brw: switch to derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30566>
2024-08-09 17:07:59 +00:00
Caio Oliveira
52be72e676 intel: Let compiler set indirect_ubos_use_sampler
This option is used for Gfx < 12, elk already set it to true,
so set it in brw and change the drivers to not set it anymore.

Because the dual-compiler support in Iris, the helper function
there had to change to consult the right compiler value instead.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30393>
2024-07-31 19:26:20 +00:00
Daniel Schürmann
f3d8bd18dd nir: introduce discard_is_demote compiler option
This new option indicates that the driver emits the same
code for nir_intrinsic_discard and nir_intrinsic_demote.
Otherwise, it is assumed that discard is implemented as
terminate.

spirv_to_nir uses this option in order to directly emit
nir_demote in case of OpKill.

RADV GFX11:
Totals from 3965 (4.99% of 79439) affected shaders:
MaxWaves: 119418 -> 119424 (+0.01%); split: +0.03%, -0.03%
Instrs: 1608753 -> 1620830 (+0.75%); split: -0.18%, +0.93%
CodeSize: 8759152 -> 8785152 (+0.30%); split: -0.18%, +0.48%
VGPRs: 152292 -> 149232 (-2.01%); split: -2.37%, +0.36%
Latency: 9162314 -> 10033923 (+9.51%); split: -0.46%, +9.97%
InvThroughput: 1491656 -> 1493408 (+0.12%); split: -0.10%, +0.22%
VClause: 21424 -> 21452 (+0.13%); split: -0.31%, +0.44%
SClause: 53598 -> 55871 (+4.24%); split: -2.15%, +6.39%
Copies: 90553 -> 90462 (-0.10%); split: -2.91%, +2.81%
Branches: 16283 -> 16311 (+0.17%)
PreSGPRs: 113993 -> 113254 (-0.65%); split: -1.84%, +1.19%
PreVGPRs: 110951 -> 108914 (-1.84%); split: -2.08%, +0.24%
VALU: 963192 -> 963167 (-0.00%); split: -0.01%, +0.01%
SALU: 87926 -> 90795 (+3.26%); split: -2.92%, +6.18%
VMEM: 25937 -> 25936 (-0.00%)
SMEM: 110012 -> 109799 (-0.19%); split: -0.20%, +0.01%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>
2024-06-17 19:37:15 +00:00
Ian Romanick
22095c60bc nir/algebraic: Add nir_lower_int64_options::nir_lower_iadd3_64
This allows us to not generate 64-bit iadd3 on Intel but continue
generating it for NVIDIA.

No shader-db or fossil-db changes.

v2: Add nir_lower_iadd3_64 flag so we can continue to generate 64-bit
iadd3 on NVIDIA platforms.

v3: s/bit_size == 64/s == 64/. This cut-and-paste bug prevented any of
the optimizations from ever occuring.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
2024-05-31 09:13:23 -07:00
Francisco Jerez
50daf161f4 intel/brw/xe2+: Lower 64-bit integer uadd_sat.
Fixes failures of CTS tests that currently end up emitting 64-bit
integer ADDs with saturation, which isn't supported by the hardware.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>
2024-05-15 17:16:52 +00:00
Francisco Jerez
4bb5b25e53 intel/xe2+: Enable native 64-bit integer arithmetic.
Note that some previously-supported 64-bit integer operations have
been removed from the hardware, so we need to instruct NIR to lower
them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>
2024-05-15 17:16:51 +00:00
Lionel Landwerlin
6a8ff3b550 intel/compiler: store u_printf_info in prog_data
So that the driver can decode the printf buffer.

We're not going to use the NIR data directly from the driver
(Iris/Anv) because the late compile steps might want to add more
printfs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
2024-05-15 13:13:38 +00:00
Ian Romanick
ded8690336 intel/brw: Remove dsign optimization
This bit from the comment should have been a big red flag:

    There are currently zero instances of fsign(double(x))*IMM in
    shader-db or any test suite, so it is hard to care at this time.

The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.

Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
2024-05-14 01:28:20 +00:00
Mike Blumenkrantz
39b66f9c84 intel: set compact_arrays in compiler options
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28601>
2024-04-12 18:43:48 +00:00
Caio Oliveira
865ef36609 intel/brw: Remove brw_shader.h
Find a better home for its existing content.  Some functions are
now just static functions at the usage sites.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27861>
2024-02-29 19:28:06 +00:00
Caio Oliveira
0a637dce05 intel/brw: Remove Gfx8- code from NIR options
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:38 +00:00
Caio Oliveira
a641aa294e intel/brw: Remove vec4 backend
It still exists as part of ELK for older gfx versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Caio Oliveira
7c23b90537 intel/brw: Always use scalar shaders
Remove scalar_stage[] array, since now it is always scalar.  This
removes any usage of vec4 shaders in brw.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Caio Oliveira
10230d2eec intel/brw: Assert Gfx9+
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27768>
2024-02-24 02:10:56 +00:00
Mark Janes
c4ce1ca847 intel/compiler: generate a hash function to use with the shader cache
Currently, Intel's shader cache incorporates PCI ID into shader cache
keys.

Many devices with different PCI IDs have identical shader compilation
functionality.  Using PCI ID as a component of the shader cache hash
means that a multi-platform shader cache will have redundant,
identical entries for similar platforms.

All Intel compiler functionality is selected based on device
configuration in `struct intel_device_info`.  intel_device_info.py
flags all fields accessed by intel/compiler.

This commit generates a hash function incorporating intel/compiler
device info fields.  Using this hash function in place of PCI ID will
produce a multiplatform cache with no duplicated content.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26844>
2024-02-15 16:58:15 -08:00
Lionel Landwerlin
2a1ff08376 intel/compiler: make default NIR compiler options visible
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:45 +00:00
Jordan Justen
c3a0483f5b intel/compiler: Lower DPAS instructions on ARL except ARL-H
Ref: bspec 55414
Ref: 951e08fc18 ("intel/compiler: Disable DPAS instructions on MTL")
Suggested-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27352>
2024-02-06 21:23:19 +00:00
Karol Herbst
f2b7c4ce29 nir: rework and fix rotate lowering
No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16
and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8,
16 and 32 bit.

Also turn the `lower` into a `has` option as only two drivers actually
support `uror` and `urol` at this momemt.

Fixes crashes with CL integer_rotate on iris and nouveau since we emit
urol for `rotate`.

v2: always lower 64 bit

Fixes: fe0965afa6 ("spirv: Don't use libclc for rotate")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com>

Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>
2024-01-22 10:27:44 +00:00
Ian Romanick
7481d61a5d intel/compiler: Track mue_compaction and mue_header_packing flags in brw_get_compiler_config_value
v2: Use u_foreach_bit64. Suggested by Lionel.

Fixes: 48885c7fe3 ("intel/compiler: load debug mesh compaction options once")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Ian Romanick
6f237a23c7 intel/compiler: Track lower_dpas flag in brw_get_compiler_config_value
This user-settable flag affects compiler output, so it should be tracked
in the cache hash.

Fixes: 3756f60558 ("intel/fs: DPAS lowering")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Lionel Landwerlin
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Ian Romanick
2741c6464c intel/compiler: Use u_foreach_bit64 in brw_get_compiler_config_value
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Lionel Landwerlin
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Ian Romanick
951e08fc18 intel/compiler: Disable DPAS instructions on MTL
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3756f60558 ("intel/fs: DPAS lowering")
Closes: #10376
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Ian Romanick
3756f60558 intel/fs: DPAS lowering
Implements integer dot product lowering both with and without
DP4A. Implements half-float dot product lowering.

There are a couple FINISHME comments describing future optimizations.

v2: Add a brw_compiler::lower_dpas flag to track when the lowering
should be applied.

v3: Use is_null() instead of checking file != ARF. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:27:15 -08:00
Francisco Jerez
8f92baa5d3 intel/fs/gfx12+: Don't set nir_divergence_single_prim_per_subgroup option for fragment shaders.
Flat-shaded inputs and other per-primitive values can no longer be
considered to be uniform across fragment shader subgroups due to
multipolygon dispatch.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Faith Ekstrand
09fc5e1c4d nir: Split has_[su]dot_4x8 bits into regular and _sat versions
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26533>
2023-12-06 23:15:33 +00:00
Georg Lehmann
9cf6984200 nir: unify lower_find_msb with has_{find_msb_rev,uclz}
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662>
2023-08-22 12:08:37 +00:00
Georg Lehmann
2ac7e6614a nir: unify lower_bitfield_extract with has_bfe
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662>
2023-08-22 12:08:37 +00:00
Georg Lehmann
34c3f81614 nir: unify lower_bitfield_insert with has_{bfm,bfi,bitfield_select}
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662>
2023-08-22 12:08:37 +00:00
Yonggang Luo
86bcc90c0e intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24438>
2023-08-03 22:00:15 +00:00
Marcin Ślusarz
48885c7fe3 intel/compiler: load debug mesh compaction options once
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
Christian Gmeiner
9383009809 nir: rename has_txs to has_texture_scaling
Convert it to an opt-in for backends to prefer and use nir_load_texture_scale
instead of txs for nir lowerings.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
2023-07-12 10:03:06 +00:00
Alyssa Rosenzweig
1d4a59448c treewide: Remove use_scoped_barrier
It is now set by all relevant drivers and not checked anywhere.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Kenneth Graunke
a2d384a5c0 intel/compiler: Fix 64-bit ufind_msb, find_lsb, and bit_count
We only support 32-bit versions of ufind_msb, find_lsb, and bit_count,
so we need to lower them via nir_lower_int64.

Previously, we were failing to do so on platforms older than Icelake
and let those operations fall through to nir_lower_bit_size, which
used a callback to determine it should lower them for bit_size != 32.
However, that pass only emulates small bit-size operations by promoting
them to supported, larger bit-sizes (i.e. 16-bit using 32-bit).  It
doesn't support emulating larger operations (i.e. 64-bit using 32-bit).

So nir_lower_bit_size would just u2u32 the 64-bit source, causing us to
flat ignore half of the bits.

Commit 78a195f252 (intel/compiler: Postpone most int64 lowering to
brw_postprocess_nir) provoked this bug on Icelake and later as well,
by moving the nir_lower_int64 handling for ufind_msb until late in
compilation, allowing it to reach nir_lower_bit_size which broke it.

To fix this, we always set int64 lowering for these opcodes, and also
correct the nir_lower_bit_size callback to ignore 64-bit operations.

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>
2023-05-19 22:44:37 +00:00
Ian Romanick
28311f9d02 nir: intel/compiler: Move ufind_msb lowering to NIR
Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Cycles in all programs: 9098346105 -> 9098333765 (-0.0%)
Cycles helped: 6

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
0cc7bf63b7 nir: intel/compiler: Move ifind_msb lowering to NIR
Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so
no @32 annotation is required.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
15c6c859cf intel/compiler: Lower find_lsb in NIR
No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Marcin Ślusarz
432e263284 intel/compiler: fine-grained control of dispatch widths
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20854>
2023-01-27 11:00:41 +00:00
Nico Cortes
29adbb132f Revert "intel/compiler: fine-grained control of dispatch widths"
This reverts commit bed18ab3e2.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8063
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20654>
2023-01-12 00:33:25 +00:00
Marcin Ślusarz
bed18ab3e2 intel/compiler: fine-grained control of dispatch widths
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20535>
2023-01-11 08:17:12 +00:00
Ian Romanick
8ab7ec0129 intel/compiler: Enable lower_bitfield_extract_to_shifts and lower_bitfield_insert_to_shifts for pre-Gfx7
GLSL IR opcodes generated for bitfieldExtract and bitfieldInsert are
lowered by lower_instructions.  4dff3ff005 ("nir/opt_algebraic:
Optimize open coded bfm.") adds an optimization that can rematerialize
nir_op_bfm that was prevented by the GLSL IR lowering.

It appears that every piece of hardware, except older Intel GPUS, that
has real integers (i.e., lower_bitops is not set) also sets
lower_bitfield_extract_to_shifts and lower_bitfield_insert_to_shifts.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4dff3ff005 ("nir/opt_algebraic: Optimize open coded bfm.")
Closes: #7874
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Lionel Landwerlin
25608659a0 intel/compiler: mark shader_record_ptr as uniform
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20413>
2022-12-23 09:22:13 +00:00
Illia Abernikhin
aa4ac5ff8b utils: Merge util/debug.* into util/u_debug.* and remove util/debug.*
Rename env_var_as_unsigned() -> debug_get_num_option(), because duplicate
Rename env_var_as_bool() -> debug_get_bool_option(), because duplicate

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177

Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>
2022-11-02 07:25:39 +00:00
Kenneth Graunke
2dfab687ec intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes [v2]
Setting the NIR options takes care of iris thanks to the common st/mesa
linking code, and updating brw_nir_link_shaders should handle anv.

The main effort here is updating remap_tess_levels, which needs to
handle vector stores, writemasking, and swizzling.  Unfortunately,
we also need to continue handling the existing single-component
access because it's used for TES inputs, which we don't vectorize.

We could try to vectorize TES inputs too, but they're all pushed
anyway, so it wouldn't buy us much other than deleting this code.
Also, we do have opt_combine_stores, but not one for loads.

One limitation of using nir_vectorize_tess_levels is that it works
on variables, and so isn't able to combine outer/inner writes that
happen to live in the same vec4 slot (for triangle domains).  That
said, it's still better than before.

For writes, we allow the intrinsics to supply up to the full size
of the variable (vec4 for outer, vec2 for inner) even if the domain
only requires a subset of those components (i.e. triangles needs 3).

shader-db results on Icelake:

   total instructions in shared programs: 19600314 -> 19597528 (-0.01%)
   instructions in affected programs: 65338 -> 62552 (-4.26%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12
   helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59%
   95% mean confidence interval for instructions value: -10.71 -9.85
   95% mean confidence interval for instructions %-change: -6.17% -5.43%
   Instructions are helped.

   total cycles in shared programs: 851842332 -> 851808165 (<.01%)
   cycles in affected programs: 618577 -> 584410 (-5.52%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 64 max: 540 x̄: 126.08 x̃: 111
   helped stats (rel) min: 2.57% max: 37.97% x̄: 6.12% x̃: 5.06%
   95% mean confidence interval for cycles value: -135.35 -116.80
   95% mean confidence interval for cycles %-change: -6.67% -5.57%
   Cycles are helped.

   total sends in shared programs: 1025238 -> 1024308 (-0.09%)
   sends in affected programs: 6454 -> 5524 (-14.41%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4
   helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39%
   95% mean confidence interval for sends value: -3.57 -3.29
   95% mean confidence interval for sends %-change: -15.42% -14.54%
   Sends are helped.

According to Felix DeGrood, this results in a 10% improvement in
the draw call time for certain draw calls from Strange Brigade.

v2: Fix assertions about number of components and add more of them.
    Combine the quads and triangles handling as it's nearly identical.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19061>
2022-10-13 11:38:21 -07:00
Kenneth Graunke
b61b1d5a4c Revert "intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes"
This reverts commit abba55382f.
The assertions I added late in the process broke shader-db, and my
quick fix broke CI, so let's just revert it for now and I'll resubmit
this later when it's working better.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7385
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18895>
2022-09-29 17:39:18 -07:00