Commit graph

9350 commits

Author SHA1 Message Date
Faith Ekstrand
8188842fdc nir: Add a range to most I/O intrinsics
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:18 +00:00
Faith Ekstrand
a2b799c53c nir: Add an load_barycentric_at_offset_nv intrinsic
NVIDIA hardware takes the offset as two 4.12 fixed-point values packed
into a single 32-bit value.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:18 +00:00
Faith Ekstrand
1a2e8290ab nir: Add NV-specific texture opcodes
These are for implementing various texture queries.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:18 +00:00
Faith Ekstrand
5984265d45 nir: Add a load_sysval_nv intrinsic
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:18 +00:00
Faith Ekstrand
abf3175161 nir/lower_tex: Add a lower_txd_clamp option
Some of us want to lower all TXD with min_lod regardless of whether or
not it's shadow or cube or whatever.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:17 +00:00
Faith Ekstrand
d3d5122f7c nir: Add convert_alu_types to divergence analysis
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:17 +00:00
Faith Ekstrand
0680330cf7 nir: Add a nir_ssa_def_all_uses_are_fsat() helper
Extracted from nir_lower_to_source_mods, this is useful for back-ends
which don't want to rely on NIR source and destination modifiers.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25000>
2023-10-24 22:21:17 +00:00
Rhys Perry
4c3677094e aco,nir: add export_row_amd intrinsic
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25040>
2023-10-24 21:36:06 +00:00
Rhys Perry
4bd4ff5d9b nir: improve ms_cross_invocation_output_access with local_invocation_id
Since GFX11, RADV doesn't need to lower local_invocation_id.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25040>
2023-10-24 21:36:06 +00:00
Bas Nieuwenhuizen
a29cd20d17 nir: Add AMD cooperative matrix intrinsics.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24683>
2023-10-24 13:24:18 +00:00
Eric Anholt
aed6a39c10 glsl: Retire dround lowering.
We have competent lowering in NIR already available.

Drivers exposing CAP_DOUBLES but not SHADER_CAP_DROUND:
- d3d12 (NIR lowers ~0 if the underlying impl doesn't do floats)
- svga (Now sets the NIR lowering options)
- softpipe (Doesn't do GL4 so you can't use doubles anyway)
- llvmpipe (Lowers dround_even in NIR and passees the rest through
            successfully)
- zink (NIR lowers ~0 if the underlying impl doesn't do floats,
        otherwise passes things through successfully, except needed
        dround_even lowering to avoid lavapipe regression with
        native doubles)
- r600 (sets NIR rounding lowering flags, and lowers all fsign)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25777>
2023-10-24 00:16:30 +00:00
Eric Anholt
b416248cb5 nir: Add nir_lower_dsign as 64-bit fsign lowering.
Right now some drivers are doing dsign lowering in GLSL and haven't had to
have a NIR path due to there not being a corresponding vulkan driver.  We
want this in NIR now so that we can retire that batch of GLSL lowering
code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25777>
2023-10-24 00:16:30 +00:00
Eric Anholt
b1b0ebba1e glsl: Remove int64 div/mod lowering.
Most drivers that can expose GL4 were claiming the cap anyway (llvmpipe,
softpipe, zink, iris, nvc0, radeonsi, r600, freedreno, d3d12), and just
doing lowering in NIR if nessary.

crocus was only claiming the cap for gen8, but the backend compiler
enables NIR lowering regardless.

svga is the only other GL4 driver that didn't set it, and we can just set
the NIR lowering flag.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25777>
2023-10-24 00:16:30 +00:00
Caio Oliveira
9dcc74e557 spirv: Let spirv2nir find out the shader to use
Previous behavior was to default to "main"/fs, this wasn't nice
when the SPIR-V module had a single shader that didn't match that spec.

New behavior is to look at the available shaders:

- if there's only one, use just use it
- if multiple, narrow down using --stage and --entry as criteria
- if still multiple after narrowing down, print the list and fail

Note you can use just one --stage or --entry if that already narrows
down to a single match.  Note that in SPIR-V it is valid to have a
shader module with two shaders sharing the same entry-point name but
different stages.  Because of that in rare cases both --stage and
--entry will be needed.

This patch should remove the need of using --stage and --entry for most
of the uses of spirv2nir.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25461>
2023-10-23 22:42:45 +00:00
Caio Oliveira
c4734714ce spirv: List entry-points in spirv2nir when unsure what to use
When there's ambiguity about what shader to use, list the shaders.
Conveniently print in the command line argument for, so can be copied
pasted.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25461>
2023-10-23 22:42:45 +00:00
Caio Oliveira
45df1fd239 spirv: Change spirv2nir to use the shorter shader name abbreviations
This are the abbreviations we use elsewhere in Mesa.  For convenience we
make them case insensitive.  Old names still work for compatibility.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25461>
2023-10-23 22:42:45 +00:00
Caio Oliveira
4d3703c12d spirv: Expose stage enum conversion in vtn_private.h
Refactor it to not fail, just return MESA_SHADER_NONE.  Caller
takes care of handling error.  Exposed so can be used by spirv2nir
program.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25461>
2023-10-23 22:42:45 +00:00
Mike Blumenkrantz
ad72772d93 nir/lower_fragcolor: preserve location_frac
this otherwise breaks component-based outputs

cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25794>
2023-10-18 23:02:45 +00:00
Emma Anholt
91da6be8ce glsl: Remove lower_discard().
Replaced by the new NIR pass.

i915g results:
total instructions in shared programs: 510678 -> 510714 (<.01%)
total temps in shared programs: 30429 -> 30426 (<.01%)

rv370 results:
total instructions in shared programs: 737649 -> 737656 (<.01%)
instructions in affected programs: 82 -> 89 (8.54%)
total temps in shared programs: 112093 -> 112094 (<.01%)
temps in affected programs: 6 -> 7 (16.67%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24763>
2023-10-18 01:27:04 +00:00
Emma Anholt
c5712410ec nir: Flatten ifs with discards in nir_opt_peephole_select for HW without CF.
i915g and r300-r400 don't have if statements, and discards are all
nir_intrinsic_discard_if.  We can flatten those discards here, saving a
separate GLSL pass to try to do so.

i915g:
GAINED: shaders/closed/xcom-enemy-unknown/413.shader_test FS

rv370:
GAINED: shaders/closed/xcom-enemy-unknown/12.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/122.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/132.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/145.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/146.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/19.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/413.shader_test FS
GAINED: shaders/closed/xcom-enemy-unknown/415.shader_test FS

Closes: #9918
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24763>
2023-10-18 01:27:04 +00:00
Marek Olšák
8ff4847b64 nir/algebraic: use only signed_zero_preserve_* for addition by 0 patterns, etc.
Some GLSL versions will set inf_preserve but not the other flags.
Additions by 0 only affect signed zeros.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25392>
2023-10-17 17:27:12 +00:00
Marek Olšák
f3886e9c02 nir: split FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE_FP* flags
GLSL doesn't preserve NaNs, but it optionally preserves Infs.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25392>
2023-10-17 17:27:12 +00:00
Karol Herbst
832efd097c nir/lower_mem_access_bit_sizes: fix invalid shift bit_size
Shifts always need 32 bit for their second source.

Fixes: c70d94a889 ("nir_lower_mem_access_bit_sizes: Support unaligned stores via a pair of atomics")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25740>
2023-10-15 16:19:36 +00:00
cheyang
cf22954973 isaspec : fix isaspec build error in aosp
in Android 12 build have error "ninja:
'external/mesa/src/compiler/isaspec/README.rst', needed by
'out/target/product/s/obj/MESON_MESA3D_GEN/.timestamp', missing and
no known rule to make it" because commit:
d48d8aefdf  (docs: Move isaspec out of
drivers/freedreno) modify isaspec.rst Location.

Signed-off-by: cheyang <cheyang@bytedance.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25697>
2023-10-13 11:09:40 +00:00
Alyssa Rosenzweig
be0ab37bac nir/opt_algebraic: Optimize LLVM booleans
Helps OpenCL kernels.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25687>
2023-10-13 02:55:48 +00:00
Emma Anholt
40f22f8ecd nir/print: Decode system values in the variable declarations.
decl_var system INTERP_MODE_NONE none vec4 #0
decl_var system INTERP_MODE_FLAT none mediump uint #1

turns into:

decl_var system INTERP_MODE_NONE none vec4 #0 (SYSTEM_VALUE_FRAG_COORD)
decl_var system INTERP_MODE_FLAT none mediump uint #1 (SYSTEM_VALUE_SUBGROUP_INVOCATION)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25647>
2023-10-12 22:52:42 +00:00
Alyssa Rosenzweig
8df8d1e2f2 nir/opt_algebraic: Reduce int64
If we just want the bottom 32-bits we don't need a full 64-bit operation.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
2023-10-12 21:03:31 +00:00
Alyssa Rosenzweig
8b5b362be6 nir/lower_io: Use load_global_constant for OpenCL
Map __constant with a 64-bit address format to load_global_constant instead of
load_global. This notably allows nir_opt_preamble to hoist the load.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
2023-10-12 21:03:31 +00:00
Alyssa Rosenzweig
569d44eff4 nir/print: Handle KERNEL
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
2023-10-12 21:03:31 +00:00
Alyssa Rosenzweig
6d0efa8701 nir/legalize_16bit_sampler_srcs: Use instr_pass
Fixes the pass with multiple functions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
2023-10-12 21:03:31 +00:00
Alyssa Rosenzweig
b1b7616418 nir/opt_phi_precision: Work with libraries
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
2023-10-12 21:03:31 +00:00
Yogesh Mohan Marimuthu
09b574aa6c vulkan add 3D texture support for compute astc decoder
v2: use correct 2D/3D for view type (Chia-I Wu)

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24672>
2023-10-11 19:28:40 +00:00
Yogesh Mohan Marimuthu
ff4d658fd5 vulkan/runtime: add compute astc decoder helper functions
The astc compute decode and lut creation code is copied
from https://github.com/Themaister/Granite/

Always set DECODE_8BIT idea is copied from
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19886

v2: use astc glsl shader code (Chia-I Wu)
v3: fix 32bit compilation error (Christopher Snowhill)
v4: use pitch to copy in vk_create_fill_image_visible_mem() function
    pass correct layer to decode_astc()
v5: use existing ASTCLutHolder (Chia-I Wu)
v6: use only staging buffer (Chia-I Wu)
    use texel buffer for partition table (Chia-I Wu)
v7: use 2DArray for input and output
v8: check for == mem_property (Chia-I Wu)
    do not use vk_common* functions (Chia-I Wu)
    squash single buffer patch (Chia-I Wu)
    fix for minTexelBufferOffsetAlignment (Chia-I Wu)
    avoid wasting 4 slots (Chia-I Wu)
    remove partition_tbl_mask (Chia-I Wu)
    remove wrong bindings count (Chia-I Wu)
    use binding names from glsl code (Chia-I Wu)
    use ARRAY_SIZE (Chia-I Wu)
    use VkFormat for getting partition table index (Chia-I Wu)
    fix mutex lock (Chia-I Wu)
    image layout should be based on function call (Chia-I Wu)
    VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE is wrong (Chia-I Wu)
    add vk_texcompress_astc tag to helpder functions (Chia-I Wu)
    remove write_desc_set_count (Chia-I Wu)
    use desc_i++ (Chia-I Wu)
    add assert for desc_i count at end (Chia-I Wu)
    remove unused vk_create_map_texel_buffer() function (Chia-I Wu)
    dynamically create the lut offset (Chia-I Wu)
    offset not to pass as push contant (Chia-I Wu)
v9: use correct stoage and sampled flags (Chia-I Wu)
    always pass single_buf_size (Chia-I Wu)
    query drivers for minTexelBufferOffsetAlignment (Chia-I Wu)
    remove blank lines (Chia-I Wu)
    remove unnecessary if check in destroy (Chia-I Wu)
    name label as unlock instead of fail and pass (Chia-I Wu)
    use prog_glslang.found() (Chia-I Wu)
    add offset,extent check to astc shader (Chia-I Wu)
v10: prog_glslang can be undefined in meson.build (Chia-I Wu)
v11: remove with_texcompress_astc and use required in find_program (Chia-I Wu)
v12: offset are aligned to blk size (Chia-I Wu)
v13: texel_blk_start should be under vulkan if check (Chia-I Wu)
     dst image layout is always VK_IMAGE_LAYOUT_GENERAL (Chia-I Wu)

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24672>
2023-10-11 19:28:40 +00:00
Chia-I Wu
0236b7f18a mesa: make astc_decoder.glsl vk-compatible
glslangValidator -V -S comp astc_decoder.glsl

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24672>
2023-10-11 19:28:39 +00:00
Christian Gmeiner
fe0965afa6 spirv: Don't use libclc for rotate
We have a nir lowering for drivers that do not support urol.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25660>
2023-10-11 17:39:08 +00:00
Karol Herbst
7afdbd5f6d nir/load_libclc: fix libclc memory leak
Fixes: ef453f5439 ("spirv: Add a shared libclc loader")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25649>
2023-10-11 03:05:23 +00:00
Alyssa Rosenzweig
b3da29ae58 nir/opt_preamble: Respect ACCESS_CAN_SPECULATE
In general, it is unsafe to speculatively hoist conditionally executed loads
into the preamble. For example, if the shader does:

   if (ptr is valid) {
      foo(*ptr)
   }

we cannot dereference ptr in the preamble without knowing that the pointer is
valid (which may not be determinable, since it might not be uniform).
nir_opt_preamble needs to stop speculating in this case, or otherwise using
preambles can cause faults on legal shaders.

However, some platforms may be able to speculate loads safely. For example,
Apple hardware is able to suppress MMU faults, making speculation safe.  This is
controlled global register to control this behaviour, set at boot-time by the
kernel.  (macOS suppresses these faults unconditionally, this feature may be
used in their implementation of sparse textures. Currently Linux does not
suppress any faults but this may change later.)

Since nir_opt_preamble should work soundly and optimally on a variety of
platforms, we need to respect the ACCESS flag.

Thanks to the if-else hoisting implemented earlier in the series, this isn't too
terrible of a band-aid on Asahi:

    total instructions in shared programs: 1499674 -> 1507699 (0.54%)
    instructions in affected programs: 78865 -> 86890 (10.18%)
    helped: 0
    HURT: 337
    Instructions are HURT.

    total bytes in shared programs: 10238284 -> 10279308 (0.40%)
    bytes in affected programs: 554504 -> 595528 (7.40%)
    helped: 3
    HURT: 334
    Bytes are HURT.

    total halfregs in shared programs: 452049 -> 454015 (0.43%)
    halfregs in affected programs: 7569 -> 9535 (25.97%)
    helped: 7
    HURT: 150
    Halfregs are HURT.

There are no shader-db changes on ir3 as expected, since ir3 can safely
speculate all instructions in my shader-db.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
2023-10-10 13:51:00 +00:00
Alyssa Rosenzweig
8d037d943d nir/opt_preamble: Move phis for movable if's
Add infrastructure to reconstruct if's. Later in the series, this will let us
hoist loads from inside uniform if's without speculating. For now, it lets us
handle phi's in nir_opt_preamble in a straightforward way.

Results on AGX are good:

   total instructions in shared programs: 1504730 -> 1499674 (-0.34%)
   instructions in affected programs: 153673 -> 148617 (-3.29%)
   helped: 496
   HURT: 0
   Instructions are helped.

   total bytes in shared programs: 10287768 -> 10238284 (-0.48%)
   bytes in affected programs: 1113724 -> 1064240 (-4.44%)
   helped: 496
   HURT: 0
   Bytes are helped.

   total halfregs in shared programs: 452669 -> 452049 (-0.14%)
   halfregs in affected programs: 14825 -> 14205 (-4.18%)
   helped: 152
   HURT: 99
   Halfregs are helped.

   total threads in shared programs: 16469504 -> 16470784 (<.01%)
   threads in affected programs: 8960 -> 10240 (14.29%)
   helped: 10
   HURT: 0
   Threads are helped.

Results on ir3 is a bit more of a wash but still should be a win overall: The
regression in moves seems scary, but the cost model already accounts for them as
evidenced by instruction count coming out ahead.

   total instructions in shared programs: 3108750 -> 3105993 (-0.09%)
   instructions in affected programs: 317367 -> 314610 (-0.87%)
   helped: 675
   HURT: 242
   Instructions are helped.

   total nops in shared programs: 673152 -> 675048 (0.28%)
   nops in affected programs: 74551 -> 76447 (2.54%)
   helped: 353
   HURT: 347
   Inconclusive result (%-change mean confidence interval includes 0).

   total non-nops in shared programs: 2435598 -> 2430945 (-0.19%)
   non-nops in affected programs: 232664 -> 228011 (-2.00%)
   helped: 816
   HURT: 38
   Non-nops are helped.

   total mov in shared programs: 78201 -> 84011 (7.43%)
   mov in affected programs: 10726 -> 16536 (54.17%)
   helped: 60
   HURT: 781
   Mov are HURT.

   total cov in shared programs: 74964 -> 74906 (-0.08%)
   cov in affected programs: 273 -> 215 (-21.25%)
   helped: 17
   HURT: 0
   Cov are helped.

   total dwords in shared programs: 6716814 -> 6748726 (0.48%)
   dwords in affected programs: 879778 -> 911690 (3.63%)
   helped: 12
   HURT: 948
   Dwords are HURT.

   total full in shared programs: 193210 -> 193212 (<.01%)
   full in affected programs: 278 -> 280 (0.72%)
   helped: 12
   HURT: 22
   Inconclusive result (value mean confidence interval includes 0).

   total constlen in shared programs: 493632 -> 494816 (0.24%)
   constlen in affected programs: 19904 -> 21088 (5.95%)
   helped: 9
   HURT: 306
   Constlen are HURT.

   total cat0 in shared programs: 742476 -> 745046 (0.35%)
   cat0 in affected programs: 84455 -> 87025 (3.04%)
   helped: 277
   HURT: 489
   Cat0 are HURT.

   total cat1 in shared programs: 153303 -> 159059 (3.75%)
   cat1 in affected programs: 17810 -> 23566 (32.32%)
   helped: 69
   HURT: 780
   Cat1 are HURT.

   total cat2 in shared programs: 1144508 -> 1140731 (-0.33%)
   cat2 in affected programs: 121284 -> 117507 (-3.11%)
   helped: 841
   HURT: 0
   Cat2 are helped.

   total cat3 in shared programs: 942098 -> 934804 (-0.77%)
   cat3 in affected programs: 87140 -> 79846 (-8.37%)
   helped: 855
   HURT: 1
   Cat3 are helped.

   total cat4 in shared programs: 65261 -> 65249 (-0.02%)
   cat4 in affected programs: 42 -> 30 (-28.57%)
   helped: 12
   HURT: 0
   Cat4 are helped.

   total sstall in shared programs: 237311 -> 241281 (1.67%)
   sstall in affected programs: 33755 -> 37725 (11.76%)
   helped: 179
   HURT: 493
   Sstall are HURT.

   total (ss) in shared programs: 58166 -> 58795 (1.08%)
   (ss) in affected programs: 4535 -> 5164 (13.87%)
   helped: 35
   HURT: 664
   (ss) are HURT.

   total systall in shared programs: 503784 -> 503805 (<.01%)
   systall in affected programs: 3170 -> 3191 (0.66%)
   helped: 16
   HURT: 13
   Inconclusive result (value mean confidence interval includes 0).

   total (sy) in shared programs: 27261 -> 27259 (<.01%)
   (sy) in affected programs: 76 -> 74 (-2.63%)
   helped: 8
   HURT: 5
   Inconclusive result (value mean confidence interval includes 0).

   total waves in shared programs: 439848 -> 439872 (<.01%)
   waves in affected programs: 160 -> 184 (15.00%)
   helped: 12
   HURT: 0
   Waves are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
2023-10-10 13:51:00 +00:00
Alyssa Rosenzweig
802fb8f7f3 nir/opt_preamble: Unify foreach_use logic
Deduplication in prep for reconstructing if's.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
2023-10-10 13:51:00 +00:00
Alyssa Rosenzweig
3fda1d9691 nir/opt_preamble: Preserve IR when replacing phis
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
2023-10-10 13:51:00 +00:00
Alyssa Rosenzweig
e065983dce nir/opt_preamble: Walk cf_list manually
The way backends walk NIR when translating. This will make it easy to filter
can_move based on the parent control flow.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
2023-10-10 13:51:00 +00:00
Alyssa Rosenzweig
3325b4778a nir: Add ACCESS_CAN_SPECULATE
Determining whether it is safe to hoist a load instruction out of control flow
depends on complex hardware and driver details. Rather than encoding this as
knobs in every NIR pass that wants to do so (notably nir_opt_preamble and
nir_opt_peephole_select), add a per-load ACCESS flag for backends to set.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
2023-10-10 13:51:00 +00:00
Alyssa Rosenzweig
335cf5f22f nir: Use a tagged pointer for nir_src parents
This allows us to pack the is_if boolean into the bottom bit of the parent
pointer, eliminating the boolean and hence shrinking the nir_src by 8 bytes (due
to the extra 63 bits of padding incurred in the old layout).

Because all access is forced through helpers now, this is a local change.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig
316af8c965 nir: Assert the nir_src union is used safely
It is undefined behaviour in C to read a different member of a union than was
written. Nothing in-tree should be using this behaviour with the nir_src union:
nir_if should never be read as nir_instr and vice versa. Assert this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig
c39896b17b nir: Use getters for nir_src::parent_*
First, we need to give the parent_instr field a unique name to be able to
replace with a helper.  We have parent_instr fields for both nir_src and
nir_def, so let's rename nir_src::parent_instr in preparation for rework.

This was done with a combination of sed and manual fix-ups.

Then we use semantic patches plus manual fixups:

    @@
    expression s;
    @@

    -s->renamed_parent_instr
    +nir_src_parent_instr(s)

    @@
    expression s;
    @@

    -s.renamed_parent_instr
    +nir_src_parent_instr(&s)

    @@
    expression s;
    @@

    -s->parent_if
    +nir_src_parent_if(s)

    @@
    expression s;
    @@

    -s.renamed_parent_if
    +nir_src_parent_if(&s)

    @@
    expression s;
    @@

    -s->is_if
    +nir_src_is_if(s)

    @@
    expression s;
    @@

    -s.is_if
    +nir_src_is_if(&s)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig
ad619da3bc nir: Use set_parent_instr internally
This properly clears is_if.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:04 -04:00
Alyssa Rosenzweig
19f8e0e3aa nir: Add trivial nir_src_* getters
These will become nontrivial later in the series. For now these have no smarts
in them, in order to make the conversion completely mechanical.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:04 -04:00
Iván Briano
987749430d nir: round f2f16{_rtne/_rtz} correctly for constant expressions
As noted in the previous commit, the intermediate cast to float from
double can produce wrong results.

Fixes upcoming Vulkan CTS tests:
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_vert
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage_vert
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_frag
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage_frag

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>
2023-10-09 23:37:52 +00:00
Iván Briano
c8a8b09c15 nir/lower_int64: respect rounding mode when casting to float
Appendix A: Vulkan environemtn for SPIR-V says:
  Operations described as “correctly rounded” will return the infinitely
  precise result, x, rounded so as to be representable in
  floating-point. The rounding mode is not specified, unless the entry
  point is declared with the RoundingModeRTE or the RoundingModeRTZ
  Execution Mode.

Conversion between types are classified as correctly rounded, so let's
do rounding correctly.

v2: check rounding mode for destination bit size (Georg)

Fixes upcoming Vulkan CTS tests:
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up_vert
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up_vert
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up_frag
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up_frag

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>
2023-10-09 23:37:52 +00:00
antonino
a2e96a86e1 nir: fix several crashes in nir_lower_tex
This patch fixes the following issues that lead to crashes in some cases:

* an instruction is inserted to get texture lod that depends on a
  texture instruction that hasn't been inserted yet.
* this code tries to read channel 1 of the lod, but lod is scalar
* the code assumed there would only be 2 srcs, this isn't the case when
  bindless is used.

Fixes: b154a4154b ("nir/lower_tex: rewrite tex/txb -> txd/txl before saturating srcs")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25621>
2023-10-09 17:31:34 +00:00