185397 commits

Author SHA1 Message Date
Caio Oliveira
a641aa294e intel/brw: Remove vec4 backend
It still exists as part of ELK for older gfx versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Caio Oliveira
7c23b90537 intel/brw: Always use scalar shaders
Remove scalar_stage[] array, since now it is always scalar.  This
removes any usage of vec4 shaders in brw.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Caio Oliveira
303fd4e935 intel/brw: Move type_size_* functions out of vec4-specific file
This will make it easier to delete the vec4 files later.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Caio Oliveira
9bfccc1935 intel/brw: Move brw_compile_* functions out of vec4-specific files
These contain code that is shared by both fs and vec4.  Moving them will
make it easier to delete the vec4 files later.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Caio Oliveira
c11d7743b3 intel/blorp: Remove Gfx8- references in BRW code
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:37 +00:00
Vinson Lee
6c190bdfe9 intel/clc: Fix file descriptor leak
Fix defect reported by Coverity Scan.

Resource leak (RESOURCE_LEAK)
leaked_storage: Variable fp going out of scope leaks the storage it points to.

Fixes: 4fd7495c69 ("intel/clc: add ability to output NIR")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27778>
2024-02-28 04:30:33 +00:00
Faith Ekstrand
41722c6137 nak: Add support for imad on Volta+ and enable it in simple cases
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>
2024-02-27 21:51:30 -06:00
Faith Ekstrand
a747cd1bd5 nak: Move NAK_FS_OUT_COLOR next to the enum
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>
2024-02-27 21:51:30 -06:00
Faith Ekstrand
f4fb5277c3 nir: Add an imad opcode
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>
2024-02-27 21:51:30 -06:00
Faith Ekstrand
1881d97c27 nak: Implement nir_op_iadd3 on SM70+
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>
2024-02-27 21:51:29 -06:00
Mike Blumenkrantz
0c95d39309 zink: add nvk baseline
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27843>
2024-02-27 21:50:47 -05:00
Mike Blumenkrantz
9ffb7e0179 zink: update nv blob baseline
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27843>
2024-02-27 21:50:47 -05:00
Daniel Almeida
efc4ac0d27 nak/sm50: sprinkle OpAnnotate in optimization passes
Not only do we want to know where an Op originated from, but also how it got
transformed along the way if possible. Preferably all the way to the final
machine code emitted.

This commit inserts OpAnnotates in some of the optimization passes when
map_instr() or Instr::new_boxed is used.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27158>
2024-02-28 01:12:03 +00:00
Daniel Almeida
feb2d3e1da nak/sm50: support annotations through OpAnnotate
Add a new op to annotate the IR. This will help debugging and is only
in effect when NAK_DEBUG=annotate is set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27158>
2024-02-28 01:12:03 +00:00
Daniel Almeida
a69bd9a70a nak/sm50: add an annotate debug flag
Add a flag so that users can enable debug annotations when printing the IR.
This does nothing for now. A follow-up commit will actually implement
annotations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27158>
2024-02-28 01:12:03 +00:00
Daniel Almeida
02774be708 nak/sm50: add a memstream abstraction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27158>
2024-02-28 01:12:03 +00:00
Juston Li
e57cf175e2 venus: move feedback on empty last batch to prior batch
For submissions with an empty last batch containing no cmd buffers but
with semaphores as zink does, adding feedback to that batch would make
it no longer empty and increase submission overhead on some drivers.

Since feedback order is enforced by barriers, the feedback cmds can
instead be appended to the previous batch (if it exists) so that the
last batch remains empty.

Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27830>
2024-02-28 00:56:26 +00:00
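The batch selection described in the venus commit above can be modeled roughly as follows. This is only an illustrative sketch, not the driver's code: the `place_feedback` name and the dict-based batch layout are hypothetical, and barriers are not modeled.

```python
def place_feedback(batches, feedback_cmds):
    """Append feedback commands to a batch and return that batch's index.

    Each batch is modeled as a dict with 'cmds' and 'semaphores' lists
    (a hypothetical layout chosen for this sketch).
    """
    last = len(batches) - 1
    target = last
    # An empty last batch (no command buffers, semaphores only) stays
    # empty: the feedback goes to the previous batch, if there is one.
    if not batches[last]["cmds"] and last > 0:
        target = last - 1
    batches[target]["cmds"].extend(feedback_cmds)
    return target


batches = [
    {"cmds": ["draw"], "semaphores": []},
    {"cmds": [], "semaphores": ["signal"]},
]
place_feedback(batches, ["feedback"])
```

With a single empty batch there is no prior batch, so in this model the feedback still lands in that batch.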
Thong Thai
0586a3fb22 frontends/va/postproc: do not use efc if image is to be translated
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10658

Signed-off-by: Thong Thai <thong.thai@amd.com>
Tested-by: Andrej Benz <hello@benz.dev>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27802>
2024-02-27 22:56:04 +00:00
Faith Ekstrand
b8c3d18fba nvk: Advertise VK_EXT_shader_object
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9648
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
fb564040a7 nvk: Advertise VK_KHR_graphics_pipeline_library
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9635
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
813b253939 nvk: Switch to shader objects
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
4001658c18 nvk: Use vk_render_pass_state::attachments for write masks
This pulls everything into nvk_cmd_draw.c where it's a bit easier to
manage.  When the time comes for switching to EXT_shader_object, this
will let us handle VK_EXT_dynamic_rendering_unused_attachments via the
common vk_pipeline code.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
839629634f nvk: Move nir_lower_patch_vertices to nvk_lower_nir()
As long as it happens after we merge tess info between the two stages
(it does) then there's no need to have it in the pipeline code.  It's
just an optimization anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
bd76444257 nvk: Pass an array of descriptor sets to nvk_lower_nir
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
a4f519d72d nvk: Move populate_fs_key to nvk_shader.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
045741ac30 nvk/shader: Refactor some helpers
This puts them in the form we need for vk_shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
626f38e25e nvk: Populate vk_descriptor_set_layout::blake3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
9308e8d90d vulkan: Add generic graphics and compute VkPipeline implementations
These implementations are built on top of vk_shader.  For the most part,
the driver shouldn't notice a difference between draws consuming
pipelines vs. draws consuming shaders.  The only real difference is
that, when vk_driver_shader_ops::compile() is called for pipelines, a
struct vk_graphics_pipeline_state is provided.  For shader objects, the
state object will be NULL indicating that all state is unknown.  Besides
that, all the rest of the differences between Vulkan 1.0 pipelines,
VK_EXT_graphics_pipeline_library, and VK_EXT_shader_object are handled
by the Vulkan runtime code.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
c488dc9f50 vulkan: Add a BLAKE3 hash to vk_descriptor_set_layout
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
682b99a63f vulkan: Add push constant ranges to vk_pipeline_layout
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
e2cb395a1f vulkan: Add a vk_pipeline base struct
We need to be able to thunk through a destroy callback if we want to
have different kinds of pipelines implemented in different parts of the
stack.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
5e71e6f3f6 vulkan: Add a new dynamic state for render pass attachments
This is useful for implementing VK_EXT_dynamic_rendering_unused_attachments

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
6ec177b116 vulkan: Rework vk_render_pass_state::attachments
The new bitfield has a separate flag for each of the color attachments.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
c09c086c12 vulkan: Add a vk_render_pass_state_has_attachment_info() helper
We already have a helper like this internally.  Give it a better name
and expose it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Faith Ekstrand
9f62008bff vulkan: Add runtime code for VK_EXT_shader_object
This adds a new base vk_shader object along with vtables for creating,
binding, and working with shader objects.

Unlike other parts of the runtime, the new shader object code is a bit
more sanitized and opinionated than just handing you the Vulkan
entrypoints.  For one thing, the create_shaders() callback takes a NIR
shader, not SPIR-V.  Conversion of SPIR-V into NIR, handling of magic
meta NIR shaders, etc. is all done in common code.  [De]serialization is
done via `struct blob` and the common code does a checksum of the binary
and handles rejecting invalid binaries based on shaderBinaryUUID and
shaderBinaryVersion.  This should make life a bit easier for driver
authors and provide a somewhat nicer interface for building the
common pipeline implementation on top of shader objects.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
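The binary validation mentioned in the commit above (checksum plus rejection based on shaderBinaryUUID and shaderBinaryVersion) could look roughly like the sketch below. The header layout, the use of CRC32, and the function names are all assumptions made for illustration; the commit message only says that a checksum is computed, not which one or how the blob is laid out.

```python
import struct
import zlib

# Hypothetical header: 16-byte UUID, 32-bit version, 32-bit CRC32 of payload.
HEADER = struct.Struct("<16sII")


def serialize(uuid, version, payload):
    """Prefix the driver payload with the identifying header."""
    return HEADER.pack(uuid, version, zlib.crc32(payload)) + payload


def deserialize(blob, uuid, version):
    """Return the payload, or None if the binary must be rejected."""
    if len(blob) < HEADER.size:
        return None
    blob_uuid, blob_version, blob_crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:]
    if blob_uuid != uuid or blob_version != version:
        return None  # produced by a different driver or binary version
    if zlib.crc32(payload) != blob_crc:
        return None  # corrupted or truncated payload
    return payload


uuid = bytes(range(16))
blob = serialize(uuid, 1, b"shader")
payload = deserialize(blob, uuid, 1)
```

Doing these checks once in shared code means a driver in this model only ever sees payloads that it produced itself.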
Connor Abbott
0d225c9e43 vk/graphics_state: Add stubs required by VK_EXT_shader_objects
Because these functions were introduced by VK_EXT_shader_object, we
technically have to expose them even though they have to do with NV
extensions that no one else supports.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:09 +00:00
Connor Abbott
657b8e5264 vk/graphics_state: Remove bogus assert in CmdSetSampleMaskEXT
We're supposed to just ignore samples above what we support, and there's
no VU matching this assert. Fixes a crash in
dEQP-VK.pipeline.shader_object_unlinked_spirv.extended_dynamic_state.misc.sample_shading_dynamic_sample_count.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:08 +00:00
Faith Ekstrand
6ad294202e vulkan: Move the descriptor set limit to vk_limits.h
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:08 +00:00
Faith Ekstrand
498d58a5f8 vulkan: Add a vk_get_subgroup_size() helper
No reason to duplicate this logic between pipelines and shader objects.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
2024-02-27 22:17:08 +00:00
M Henning
af2cea8f84 nak: Remove assert on nir->info.outputs_written
nir->info.outputs_written isn't used for fragment shaders except as an
early out a few lines above this, so we don't rely on this property.

My best guess is that this was intended to check if the information
from nir_gather_info is stale, but dead variables fail the assert
even if the info is up to date.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27595>
2024-02-27 21:58:08 +00:00
David Rosca
82ff9204ab frontends/va: Only set VP9 segmentation fields when segmentation is enabled
Workaround for ffmpeg setting segmentation_update_map to 1 with
segmentation_enabled == 0.

Fixes decoding sample from https://github.com/mpv-player/mpv/issues/13533

Cc: mesa-stable

Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27816>
2024-02-27 20:54:48 +00:00
Ruijing Dong
eb74aa8515 frontends/va: get av1 encoding ref frame infos for L0.
The reference frame list is formed from each of the provided
recon_frame entries. The assumption here is that the application
uses the existing VAAPI interface: when a frame is marked as
"long term reference" by

av1->picture_flags.bits.long_term_reference

its recon_frame is kept in the DPB, with the recon_frame itself
serving as the signature. When a future input requests
a reference to it, it goes this way:

1. Set av1->ref_frame_ctrl_l0.field.search_idx2 to indicate
   which ref_frame_idx slot will be used:
   x = av1->ref_frame_ctrl_l0.field.search_idx2;
2. n = av1->ref_frame_idx[x-1];
   av1->reference_frames[n] is the signature to compare with.
   If av1->reference_frames[n] points to the
   same video buffer (signature) as the one marked as
   "long term reference", then the new input refers to
   it only.
3. In the SVC case, long term references are used for temporal_id 0
   only, because using a long term reference means a scene change
   could potentially happen.
4. The "long term reference" recon_frame should be kept,
   rather than reused, until it is no longer needed, to
   avoid signature duplication.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
2024-02-27 20:20:46 +00:00
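Steps 1 and 2 of the commit above amount to a small index lookup followed by a buffer comparison. The sketch below models only that lookup with plain Python lists; the function name is hypothetical, and treating search_idx2 == 0 as "no reference selected" is an assumption of this sketch rather than something stated in the commit message.

```python
def refers_to_long_term(search_idx2, ref_frame_idx, reference_frames,
                        long_term_signature):
    """Return True if the input frame refers to the long term reference.

    search_idx2         -- value of ref_frame_ctrl_l0.field.search_idx2
    ref_frame_idx       -- list modeling av1->ref_frame_idx[]
    reference_frames    -- list modeling av1->reference_frames[]
    long_term_signature -- buffer previously marked "long term reference"
    """
    if search_idx2 == 0:
        # Assumption: 0 means the slot is not used.
        return False
    n = ref_frame_idx[search_idx2 - 1]
    # The video buffer itself is the signature to compare with.
    return reference_frames[n] == long_term_signature


ref_frame_idx = [3, 5, 1, 0, 0, 0, 0]
reference_frames = ["buf%d" % i for i in range(8)]
hit = refers_to_long_term(2, ref_frame_idx, reference_frames, "buf5")
```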
Ruijing Dong
4b92fa9e10 radeonsi/vcn: vcn4 av1 long term ref support
Add vcn4 av1 long term reference support.

This lets the application control which frames
refer to the identified reference, which
can usually provide better coding efficiency when
the scene changes back and forth. The application
only needs to identify and mark these frames before
using them.

We assume 2 long term reference frames should be
enough within a key frame period, and these long term
references can be overwritten by marking new ones.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
2024-02-27 20:20:46 +00:00
Ruijing Dong
5663221bdb radeonsi/vcn: data structure av1 enc long term reference.
The term "long term reference" is borrowed here to represent
a customized reference frame rather than the default ones.

To enable that, the application needs to use the existing VAAPI
interface to mark a frame as "long term reference"; the frame
will then be preserved in the DPB for later use. This preserved
frame can later be referred to by using its signature in
the ref_frame_idx[] list, and the index is indicated by
RefFrameCtrl index2, which is not used for any other purpose.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
2024-02-27 20:20:45 +00:00
Hans-Kristian Arntzen
2d3e7b6e9a wsi/wl: Fix deadlock in dispatch_queue_timeout.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: a00f9c401b ("loader/wayland: Add fallback wl_display_dispatch_queue_timeout")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27828>
2024-02-27 19:09:28 +00:00
Ian Romanick
a2292f53b5 nir: Optimize uniform vote_all and vote_any
No shader-db changes on any Intel platform.

fossil-db:

All Ice Lake and newer platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165513303 -> 165511820 (-0.00%)
Cycles: 15125314947 -> 15125211500 (-0.00%); split: -0.00%, +0.00%

Totals from 82 (0.01% of 656120) affected shaders:
Instrs: 544627 -> 543144 (-0.27%)
Cycles: 22616493 -> 22513046 (-0.46%); split: -0.46%, +0.00%

No fossil-db changes on Gfx9.

Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 09:44:32 -08:00
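The idea behind the vote_all / vote_any commit above can be checked with a tiny subgroup model: when every active invocation holds the same boolean, both votes simply return that boolean. This is an illustrative sketch with hypothetical helper names, not the NIR pass; it assumes at least one invocation is active, which holds for any invocation that actually executes the vote.

```python
def vote_all(values, active):
    """True if the value is true in every active invocation."""
    return all(v for v, a in zip(values, active) if a)


def vote_any(values, active):
    """True if the value is true in at least one active invocation."""
    return any(v for v, a in zip(values, active) if a)


def check_uniform_votes(lanes=4):
    """Compare both votes against the uniform value for every mask."""
    for mask in range(1, 1 << lanes):          # at least one lane active
        active = [bool((mask >> i) & 1) for i in range(lanes)]
        for x in (False, True):
            values = [x] * lanes               # uniform across the subgroup
            if vote_all(values, active) != x or vote_any(values, active) != x:
                return False
    return True


ok = check_uniform_votes()
```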
Ian Romanick
535caaf3e0 nir: Optimize uniform iadd, fadd, and ixor reduction operations
This adds optimizations for iadd, fadd, and ixor with reduce,
inclusive scan, and exclusive scan.

NOTE: The fadd and ixor optimizations had no shader-db or fossil-db
changes on any Intel platform.

NOTE 2: This change "fixes" arb_compute_variable_group_size-local-size
and base-local-size.shader_test on DG2 and MTL. This is just changing
the code path taken to not use whatever path was not working properly
before.

This is a subset of the things optimized by ACO. See also
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. The
min, max, iand, and ior exclusive_scan optimizations are not
implemented.

Broadwell on shader-db is not happy. I have not investigated.

v2: Silence some warnings about discarding const.

v3: Rename mbcnt to count_active_invocations. Add a big comment
explaining the differences between the two paths. Suggested by Rhys.

shader-db:

All Gfx9 and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20300384 -> 20299545 (<.01%)
instructions in affected programs: 19167 -> 18328 (-4.38%)
helped: 35 / HURT: 0

total cycles in shared programs: 842809750 -> 842766381 (<.01%)
cycles in affected programs: 2160249 -> 2116880 (-2.01%)
helped: 33 / HURT: 2

total spills in shared programs: 4632 -> 4626 (-0.13%)
spills in affected programs: 206 -> 200 (-2.91%)
helped: 3 / HURT: 0

total fills in shared programs: 5594 -> 5581 (-0.23%)
fills in affected programs: 664 -> 651 (-1.96%)
helped: 3 / HURT: 1

fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165551893 -> 165513303 (-0.02%)
Cycles: 15132539132 -> 15125314947 (-0.05%); split: -0.05%, +0.00%
Spill count: 45258 -> 45204 (-0.12%)
Fill count: 74286 -> 74157 (-0.17%)
Scratch Memory Size: 2467840 -> 2451456 (-0.66%)

Totals from 712 (0.11% of 656120) affected shaders:
Instrs: 598931 -> 560341 (-6.44%)
Cycles: 184650167 -> 177425982 (-3.91%); split: -3.95%, +0.04%
Spill count: 983 -> 929 (-5.49%)
Fill count: 2274 -> 2145 (-5.67%)
Scratch Memory Size: 52224 -> 35840 (-31.37%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 09:44:11 -08:00
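For a uniform source x, the iadd and ixor cases in the commit above reduce to closed forms based on a count of active invocations: x * n for iadd and x * (n & 1) for ixor, where n is the number of active invocations for a reduce, or the number of active invocations at or below the current one for a scan. The sketch below checks those identities against a straightforward lane-by-lane model. The helper names are hypothetical, unbounded Python integers are used instead of fixed-width ones, and fadd is left out because floating-point rounding would need separate care.

```python
from functools import reduce as fold
from operator import add, xor


def scan(op, identity, values, active, inclusive):
    """Reference scan over active lanes; inactive lanes yield None."""
    out, acc = [], identity
    for v, a in zip(values, active):
        if not a:
            out.append(None)
        elif inclusive:
            acc = op(acc, v)
            out.append(acc)
        else:
            out.append(acc)
            acc = op(acc, v)
    return out


def closed_form(op, x, n):
    """Combine n copies of x with iadd or ixor without a loop."""
    return x * n if op is add else x * (n & 1)


def uniform_reduce(op, x, active):
    """Reduce of a uniform x: only the active lane count matters."""
    return closed_form(op, x, sum(active))


def uniform_scan(op, x, active, inclusive):
    """Scan of a uniform x: count the active lanes below each lane."""
    out, below = [], 0
    for a in active:
        if not a:
            out.append(None)
            continue
        out.append(closed_form(op, x, below + 1 if inclusive else below))
        below += 1
    return out


active = [True, False, True, True]
total = uniform_reduce(add, 5, active)
prefix = uniform_scan(xor, 5, active, inclusive=False)
```

In this model the per-lane counts play the role that the commit message assigns to count_active_invocations.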
Ian Romanick
c63ea755fe intel/fs: Use nir_opt_uniform_subgroup
shader-db:

All Skylake and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20300435 -> 20300384 (<.01%)
instructions in affected programs: 303 -> 252 (-16.83%)
helped: 2 / HURT: 0

total cycles in shared programs: 842810326 -> 842809750 (<.01%)
cycles in affected programs: 8374 -> 7798 (-6.88%)
helped: 2 / HURT: 0

fossil-db:

All Intel platforms (note below) had similar results. (Ice Lake shown)
Instrs: 165559735 -> 165551893 (-0.00%)
Cycles: 15133083961 -> 15132539132 (-0.00%); split: -0.00%, +0.00%
Spill count: 45262 -> 45258 (-0.01%)
Fill count: 74293 -> 74286 (-0.01%)

Totals from 854 (0.13% of 656120) affected shaders:
Instrs: 3461998 -> 3454156 (-0.23%)
Cycles: 154252729 -> 153707900 (-0.35%); split: -0.36%, +0.01%
Spill count: 2655 -> 2651 (-0.15%)
Fill count: 3881 -> 3874 (-0.18%)

DG2 did not see changes in spills or fills.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 08:38:45 -08:00
Ian Romanick
f10d1ef372 nir: Initial framework for optimizing uniform subgroup operations
This first commit just optimizes operations where the result of the
subgroup operation is the same as each of the individual channel
results.

This is a subset of the things optimized by ACO. See also
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 08:38:31 -08:00
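One family of operations that fits the description in the commit above is a reduction with an idempotent operator over a uniform value: combining x with itself any number of times with min, max, and, or or gives x back. The commit message does not list the operations it covers, so the operators below are assumed examples, and the helper name is hypothetical.

```python
from functools import reduce as fold
from operator import and_, or_


def subgroup_reduce(op, values, active):
    """Reference reduction over the active lanes of a subgroup."""
    return fold(op, [v for v, a in zip(values, active) if a])


def reduction_is_identity(op, x, lanes=4):
    """Check that reducing a uniform x returns x for every active mask."""
    for mask in range(1, 1 << lanes):          # at least one lane active
        active = [bool((mask >> i) & 1) for i in range(lanes)]
        if subgroup_reduce(op, [x] * lanes, active) != x:
            return False
    return True


results = {op.__name__: reduction_is_identity(op, 0b1010)
           for op in (min, max, and_, or_)}
```

An exclusive scan would not qualify in the same way, since the first active lane receives the operator's identity value rather than x.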
Ian Romanick
8fb37ef985 intel/fs: Add fast path for ballot(true)
This doesn't help very much now. A later commit adds a NIR optimization
pass, tentatively called nir_opt_uniform_subgroup, that converts many
kinds of subgroup operations to things involving
bitCount(ballot(true)). This commit makes a huge difference in the
results of that later commit.

No shader-db changes on any Intel platform.

Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165558033 -> 165557519 (-0.00%)
Cycles: 15156188362 -> 15156178922 (-0.00%); split: -0.00%, +0.00%

Totals from 299 (0.05% of 656117) affected shaders:
Instrs: 88293 -> 87779 (-0.58%)
Cycles: 3709498 -> 3700058 (-0.25%); split: -0.28%, +0.03%

v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 08:37:46 -08:00
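The reason ballot(true) deserves a fast path can be seen in a small model: with a constant true predicate, the ballot is just the set of active lanes, so bitCount(ballot(true)) is the number of active invocations. The sketch below only demonstrates that equivalence; how the backend actually produces the value is not described in the commit message, and the helper names are hypothetical.

```python
def ballot(predicate, active):
    """Bitmask with one bit per active lane whose predicate is true."""
    mask = 0
    for lane, (p, a) in enumerate(zip(predicate, active)):
        if a and p:
            mask |= 1 << lane
    return mask


def active_mask(active):
    """Bitmask of the active lanes, independent of any predicate."""
    return sum(1 << lane for lane, a in enumerate(active) if a)


def bit_count(mask):
    """Number of set bits in a non-negative integer."""
    return bin(mask).count("1")


active = [True, False, True, True]
all_true = [True] * len(active)
same = ballot(all_true, active) == active_mask(active)
count = bit_count(ballot(all_true, active))
```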