Alyssa Rosenzweig
35d6f4a394
agx: fix spilling inside sample loop
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
bdd200a202
agx: handle subgroup barriers
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
d183b76fd4
agx: fix frag sidefx with sample shading
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
6269a1474d
agx: fix load_helper_invocation with sample shading
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
94f0209fb2
agx: fix phi translation corruption
...
we can't stomp over srcs[], where we allocated our space for sources. unclear
how this worked before but it definitely breaks once you have a phi with 7
sources.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
f21dbfe5ae
agx: allow 8-bit bcsel
...
can be generated from our lowerings but it just works with the implicit
conversion semantics we have.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
a948244058
agx: handle cross-workgroup memory barriers
...
there's no prior art for this, but experimentally this seems to do the right
thing.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
c22ce3cab9
agx: fix some ms texture packing
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
ec47f325f8
agx: fix query LOD of array
...
need to ignore the layer
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
8df39ac49b
agx: enable more lowering
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
69d7063ec0
agx: optimize and/or with booleans
...
Beneficial so we can fuse the comparison.
total instructions in shared programs: 2188179 -> 2185535 (-0.12%)
instructions in affected programs: 392512 -> 389868 (-0.67%)
helped: 894
HURT: 9
Instructions are helped.
total alu in shared programs: 1706063 -> 1703445 (-0.15%)
alu in affected programs: 275063 -> 272445 (-0.95%)
helped: 880
HURT: 9
Alu are helped.
total fscib in shared programs: 1702385 -> 1699743 (-0.16%)
fscib in affected programs: 276199 -> 273557 (-0.96%)
helped: 894
HURT: 9
Fscib are helped.
total ic in shared programs: 462494 -> 462490 (<.01%)
ic in affected programs: 124 -> 120 (-3.23%)
helped: 1
HURT: 0
total bytes in shared programs: 14476964 -> 14464512 (-0.09%)
bytes in affected programs: 2870824 -> 2858372 (-0.43%)
helped: 888
HURT: 155
Bytes are helped.
total regs in shared programs: 662444 -> 662461 (<.01%)
regs in affected programs: 1025 -> 1042 (1.66%)
helped: 14
HURT: 12
Inconclusive result (value mean confidence interval includes 0).
total uniforms in shared programs: 1638301 -> 1638374 (<.01%)
uniforms in affected programs: 17778 -> 17851 (0.41%)
helped: 22
HURT: 54
Uniforms are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
c43413f729
compiler: add ACCESS_IN_BOUNDS_AGX
...
useful for internal shaders on agx.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
eb5f82d221
nir,agx: fix load_active_subgroup_index
...
It can't be reordered globally, since its value is control-flow dependent.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
7fb60c4c81
nir,agx: add depth=never workaround
...
There seems to be a hardware issue where fragment shaders with side effects get
skipped if depth testing with NEVER. Add a workaround for this case where we
discard programmatically instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:25 +00:00
Alyssa Rosenzweig
9d824bd123
nir: add quad_ballot_agx intrinsic
...
to lower quad votes in nir.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:24 +00:00
Alyssa Rosenzweig
2912f531a7
nir: add texops for AGX border colour emulation
...
AGX has limited border colour hardware. To support full
customBorderColorWithoutFormat semantics, we're forced to emulate in shaders at
a substantial performance penalty. Actually, that's needed just to pass CTS
because of other hardware issues stacking on top of each others... Hooray!
Add the texops we need to facilitate efficient custom border colour lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:24 +00:00
Alyssa Rosenzweig
8b9ed851ec
nir: add is_first_fan_agx sysval
...
needed for correct flatshading with fans, without falling back on software input
assembly.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179 >
2024-05-14 04:57:24 +00:00
Faith Ekstrand
8bc694223e
zink: Set workarounds.can_do_invalid_linear_modifier for NVK
...
This fixes most of the egl_image_dma_buf* piglit tests. The remaining
fails are YCbCr tests which are likely unrelated to core dma-buf
import/export.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
e6f77defec
nvk/wsi: Advertise modifier support
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
28342a581f
vulkan/wsi: Bind memory planes, not YCbCr planes.
...
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Fixes: f5433e4d6c ("vulkan/wsi: Add modifiers support to wsi_create_native_image")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10176
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
cd428e01d7
nvk: Advertise VK_EXT_image_drm_format_modifier
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9636
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9480
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
d8e200c0d9
nvk: Advertise VK_EXT_queue_family_foreign
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Mohamed Ahmed
bca2f13dd8
nvk: enable rendering to DRM_FORMAT_MOD_LINEAR images
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
224d9a514a
nvk: Implement DRM format modifier queries
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
4ad79bfef4
nvk: Set tile mode and PTE kind on dedicated dma-buf BOs
...
This is our compromise to make NVK and nouveau GL play nice when it
comes to modifiers. The old GL driver depends heavily on the PTE kind
and tile mode, even for images with modifiers. While it correctly
encodes the PTE kind and tile mode in the modifiers it advertises, it
may ignore the modifier and just trust what's set on the BO when it
imports a dma-buf image. This is partly because it doesn't support
VM_BIND and partly because of preexisting bugs in the modifiers
implementation. In either case, we can't fix it retroactively.
To work around this, NVK also sets the PTE kind and tile mode on the BO
when it's a dedicated allocation created for a DRM format modifiers
image. If DRM format modifiers are used without dedicated allocations,
things may still break but that's getting into vanishingly unlikely
scenarios.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
f1fdffa1b2
nvk: Support image creation with modifiers
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
3bb531d245
nouveau/winsys: Add back nouveau_ws_bo_new_tiled()
...
This reverts commit ce1cccea98 . In this
new version, we also add a query for whether or not tiled BOs are
supported by nouveau.ko.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
03c4a46fe5
drm-uapi: Sync nouveau_drm.h
...
Taken from drm-misc-next-fixes:
commit 959314c438caf1b62d787f02d54a193efda38880
Author: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Date: Thu May 9 23:43:52 2024 +0300
drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
8cce121da4
nvk: Allow VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Mohamed Ahmed
6063f96c61
nil: Support creating images with DRM modifiers
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Mohamed Ahmed
e1bd4127f3
nil: Add some helpers for DRM format modifiers
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
b7773f96f9
nil: Default to NV_MMU_PTE_KIND_GENERIC_MEMORY on Turing+
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
603389f7a3
nvk: Set color/Z compression based on nil_image::compressed
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Mohamed Ahmed
873a044cb3
nil: Add a nil_image::compressed bit
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
73c87dbc0c
nil: Use the right PTE kind for Z32 pre-Turing
...
This got lost in the Rust rewrite.
Fixes: 426553d61d ("nil: Re-implement nil_image in Rust")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
71d1fa129a
nvk: Allow GART for dma-bufs
...
We also allow dma-bufs to be imported into arbitrary heaps because we
relly don't know where they'll come from.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
6cd58de4eb
nouveau/winsys: Make BO_LOCAL and BO_GART separate flags
...
It's sometimes useful to specify both to allow migration.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:34 +00:00
Faith Ekstrand
19b143b7bc
nouveau/winsys: Take a reference to BOs found in the cache
...
Fixes: c370260a8f ("nouveau/winsys: Add dma-buf import support")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:33 +00:00
Faith Ekstrand
d63f015d0b
nvk: Improve the GetMemoryFdKHR error
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 >
2024-05-14 04:04:33 +00:00
Faith Ekstrand
756cbb41a2
nvk: Use the upload queue for NVK_DEBUG=zero_memory
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10800
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29183 >
2024-05-14 03:40:24 +00:00
Faith Ekstrand
22e44d54fd
nvk/upload_queue: Add a _fill method
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29183 >
2024-05-14 03:40:24 +00:00
Faith Ekstrand
3132a49eb0
nvk/upload_queue: Add some useful asserts
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29183 >
2024-05-14 03:40:24 +00:00
Faith Ekstrand
9b098209b9
nvk/upload_queue: Only upload one line of data
...
This only doesn't blow up beause we set multi_line_enable = FALSE.
Fixes: 2074e28a0d ("nvk: Add an upload queue")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29183 >
2024-05-14 03:40:24 +00:00
Mike Blumenkrantz
ac78076cd2
zink: hook up VK_EXT_legacy_vertex_attributes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29069 >
2024-05-14 03:11:22 +00:00
Ian Romanick
97e3c6a12a
intel/brw: Use range analysis to optimize fsign
...
shader-db:
Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
total instructions in shared programs: 19674784 -> 19665960 (-0.04%)
instructions in affected programs: 933425 -> 924601 (-0.95%)
helped: 3656 / HURT: 0
total cycles in shared programs: 810343919 -> 810241030 (-0.01%)
cycles in affected programs: 56752034 -> 56649145 (-0.18%)
helped: 3032 / HURT: 434
LOST: 11
GAINED: 0
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20315795 -> 20305856 (-0.05%)
instructions in affected programs: 979698 -> 969759 (-1.01%)
helped: 3845 / HURT: 0
total cycles in shared programs: 830600281 -> 830534694 (<.01%)
cycles in affected programs: 45675615 -> 45610028 (-0.14%)
helped: 3250 / HURT: 325
total spills in shared programs: 4583 -> 4565 (-0.39%)
spills in affected programs: 180 -> 162 (-10.00%)
helped: 3 / HURT: 0
total fills in shared programs: 5245 -> 5219 (-0.50%)
fills in affected programs: 379 -> 353 (-6.86%)
helped: 3 / HURT: 0
LOST: 14
GAINED: 8
fossil-db:
All Intel platforms except Tiger Lake had similar results. (Meteor Lake shown)
Totals:
Instrs: 154024263 -> 154023814 (-0.00%)
Cycle count: 17463341602 -> 17461726239 (-0.01%); split: -0.01%, +0.00%
Totals from 322 (0.05% of 631440) affected shaders:
Instrs: 199933 -> 199484 (-0.22%)
Cycle count: 168492537 -> 166877174 (-0.96%); split: -0.96%, +0.00%
Tiger Lake
Instrs: 149984723 -> 149984287 (-0.00%)
Cycle count: 15238596937 -> 15239260415 (+0.00%); split: -0.00%, +0.01%
Max dispatch width: 5553408 -> 5553424 (+0.00%)
Totals from 318 (0.05% of 631414) affected shaders:
Instrs: 179624 -> 179188 (-0.24%)
Cycle count: 160724533 -> 161388011 (+0.41%); split: -0.06%, +0.48%
Max dispatch width: 3296 -> 3312 (+0.49%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:21 +00:00
Ian Romanick
e578657313
intel/brw: Implement more strictly correct fsign lowering
...
The huge amount of helped shaders is due to the "~" versions of the
patterns.
shader-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19672345 -> 19662605 (-0.05%)
instructions in affected programs: 1147766 -> 1138026 (-0.85%)
helped: 2691 / HURT: 1650
total cycles in shared programs: 810323688 -> 810145191 (-0.02%)
cycles in affected programs: 68918312 -> 68739815 (-0.26%)
helped: 3651 / HURT: 1832
LOST: 29
GAINED: 38
Tiger Lake
total instructions in shared programs: 19489619 -> 19479909 (-0.05%)
instructions in affected programs: 1124564 -> 1114854 (-0.86%)
helped: 2682 / HURT: 1643
total cycles in shared programs: 811468406 -> 811706747 (0.03%)
cycles in affected programs: 66397690 -> 66636031 (0.36%)
helped: 3692 / HURT: 1775
total spills in shared programs: 3906 -> 3907 (0.03%)
spills in affected programs: 16 -> 17 (6.25%)
helped: 0 / HURT: 1
total fills in shared programs: 3220 -> 3222 (0.06%)
fills in affected programs: 50 -> 52 (4.00%)
helped: 0 / HURT: 1
LOST: 33
GAINED: 36
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20317882 -> 20307495 (-0.05%)
instructions in affected programs: 1199651 -> 1189264 (-0.87%)
helped: 2863 / HURT: 1680
total cycles in shared programs: 830880024 -> 830457927 (-0.05%)
cycles in affected programs: 63347102 -> 62925005 (-0.67%)
helped: 4118 / HURT: 1622
total spills in shared programs: 4593 -> 4583 (-0.22%)
spills in affected programs: 205 -> 195 (-4.88%)
helped: 4 / HURT: 0
total fills in shared programs: 5284 -> 5245 (-0.74%)
fills in affected programs: 464 -> 425 (-8.41%)
helped: 4 / HURT: 0
LOST: 70
GAINED: 33
fossil-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 154025275 -> 154022035 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17472869499 -> 17463289530 (-0.05%); split: -0.06%, +0.00%
Spill count: 141269 -> 141246 (-0.02%); split: -0.02%, +0.00%
Fill count: 265342 -> 265159 (-0.07%); split: -0.11%, +0.04%
Max live registers: 32597829 -> 32597986 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5536776 -> 5537048 (+0.00%)
Totals from 1590 (0.25% of 631423) affected shaders:
Instrs: 1146532 -> 1143292 (-0.28%); split: -0.44%, +0.16%
Cycle count: 1230843330 -> 1221263361 (-0.78%); split: -0.83%, +0.05%
Spill count: 15832 -> 15809 (-0.15%); split: -0.19%, +0.04%
Fill count: 36071 -> 35888 (-0.51%); split: -0.79%, +0.29%
Max live registers: 93529 -> 93686 (+0.17%); split: -0.00%, +0.17%
Max dispatch width: 15168 -> 15440 (+1.79%)
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149564084 -> 149562467 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15151701515 -> 15158290114 (+0.04%); split: -0.00%, +0.04%
Max live registers: 32249443 -> 32249620 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5540536 -> 5540488 (-0.00%)
Totals from 1605 (0.25% of 630303) affected shaders:
Instrs: 584950 -> 583333 (-0.28%); split: -0.49%, +0.21%
Cycle count: 160926321 -> 167514920 (+4.09%); split: -0.05%, +4.14%
Max live registers: 90851 -> 91028 (+0.19%); split: -0.00%, +0.20%
Max dispatch width: 15440 -> 15392 (-0.31%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
864268ff0d
intel/brw: Algebraic optimizations for CSEL
...
No shader-db or fossil-db changes on any Intel platform. In this MR, the
only benefit of these changes is to convert some "-a > 0" CSEL
comparisons to "a < 0" for improved readability.
v2: Add integer CSEL support
v3: Use fs_inst::resize_sources and brw_type_is_sint. Both suggested by
Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
033405cd4b
intel/brw: Combine constants and constant propagation for CSEL
...
No shader-db or fossil-db changes on any Intel platform. This ends up
begin helpful in "intel/brw: Use range analysis to optimize fsign."
v2: Add integer CSEL support
v3: Massive simplification (-20 lines!) of constant propagation
logic. Suggested by Ken. Add missing CSEL case in supports_src_as_imm.
Noticed by Ken.
v4: While MAD can mix F and HF sources on some platforms, CSEL
cannot. Found by skqp on TGL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
504b742b83
intel/brw: Update CSEL source type validation
...
Gfx9 can only have F, but newer GPUs can have F, HF, *D, or *W. The
source and destination types must still match in size.
v2: Simplify the float vs integer logic. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
3f151c03af
intel/brw: Handle fsign optimization in a NIR algebraic pass
...
This is a lot less code, and it makes it easier to experiment with other
pattern-based optimizations in the future.
The results here are nearly identical to the results I got from Ken's
"intel/brw: Make fsign (for 16/32-bit) in SSA form"... which are not
particularly good.
In this commit and in Ken's, all of the shader-db shaders hurt for
spills and fills are from Deus Ex Mankind Divided. Each shader has a
bunch of texture instructions with a single fsign between the
blocks. With the dependency on the flag removed, the scheduler puts all
of the texture instructions at the start... and there are a LOT of them.
shader-db:
All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19647060 -> 19650207 (0.02%)
instructions in affected programs: 734718 -> 737865 (0.43%)
helped: 382 / HURT: 1984
total cycles in shared programs: 823238442 -> 822785913 (-0.05%)
cycles in affected programs: 426901157 -> 426448628 (-0.11%)
helped: 3408 / HURT: 3671
total spills in shared programs: 3887 -> 3891 (0.10%)
spills in affected programs: 256 -> 260 (1.56%)
helped: 0 / HURT: 4
total fills in shared programs: 3236 -> 3306 (2.16%)
fills in affected programs: 882 -> 952 (7.94%)
helped: 0 / HURT: 12
LOST: 37
GAINED: 34
fossil-db:
DG2 and Meteor Lake had similar results. (Meteor Lake shown)
Totals:
Instrs: 154005469 -> 154008294 (+0.00%); split: -0.00%, +0.00%
Cycle count: 17551859277 -> 17554293955 (+0.01%); split: -0.02%, +0.04%
Spill count: 142078 -> 142090 (+0.01%)
Fill count: 266761 -> 266729 (-0.01%); split: -0.02%, +0.01%
Max live registers: 32593578 -> 32593858 (+0.00%)
Max dispatch width: 5535944 -> 5536816 (+0.02%); split: +0.02%, -0.01%
Totals from 5867 (0.93% of 631350) affected shaders:
Instrs: 5475544 -> 5478369 (+0.05%); split: -0.04%, +0.09%
Cycle count: 1649032029 -> 1651466707 (+0.15%); split: -0.24%, +0.39%
Spill count: 26411 -> 26423 (+0.05%)
Fill count: 57364 -> 57332 (-0.06%); split: -0.10%, +0.04%
Max live registers: 431561 -> 431841 (+0.06%)
Max dispatch width: 49784 -> 50656 (+1.75%); split: +2.38%, -0.63%
Tiger Lake
Totals:
Instrs: 149530671 -> 149533588 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15261418953 -> 15264764921 (+0.02%); split: -0.00%, +0.03%
Spill count: 60317 -> 60316 (-0.00%); split: -0.02%, +0.01%
Max live registers: 32249201 -> 32249464 (+0.00%)
Max dispatch width: 5540608 -> 5540584 (-0.00%)
Totals from 5862 (0.93% of 630309) affected shaders:
Instrs: 4740800 -> 4743717 (+0.06%); split: -0.04%, +0.10%
Cycle count: 566531248 -> 569877216 (+0.59%); split: -0.13%, +0.72%
Spill count: 11709 -> 11708 (-0.01%); split: -0.09%, +0.08%
Max live registers: 424560 -> 424823 (+0.06%)
Max dispatch width: 50304 -> 50280 (-0.05%)
Ice Lake
Totals:
Instrs: 150499705 -> 150502608 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15105629116 -> 15105425880 (-0.00%); split: -0.00%, +0.00%
Spill count: 60087 -> 60090 (+0.00%)
Fill count: 100542 -> 100541 (-0.00%); split: -0.00%, +0.00%
Max live registers: 32605215 -> 32605495 (+0.00%)
Max dispatch width: 5617752 -> 5617792 (+0.00%); split: +0.00%, -0.00%
Totals from 5882 (0.93% of 634934) affected shaders:
Instrs: 4737206 -> 4740109 (+0.06%); split: -0.04%, +0.10%
Cycle count: 598882104 -> 598678868 (-0.03%); split: -0.08%, +0.05%
Spill count: 10278 -> 10281 (+0.03%)
Fill count: 22504 -> 22503 (-0.00%); split: -0.01%, +0.01%
Max live registers: 424184 -> 424464 (+0.07%)
Max dispatch width: 50216 -> 50256 (+0.08%); split: +0.25%, -0.18%
Skylake
Totals:
Instrs: 139092612 -> 139095257 (+0.00%); split: -0.00%, +0.00%
Cycle count: 14533550285 -> 14533544716 (-0.00%); split: -0.00%, +0.00%
Spill count: 58176 -> 58172 (-0.01%)
Fill count: 95877 -> 95796 (-0.08%)
Max live registers: 31924594 -> 31924874 (+0.00%)
Max dispatch width: 5484568 -> 5484552 (-0.00%); split: +0.00%, -0.00%
Totals from 5789 (0.93% of 625512) affected shaders:
Instrs: 4481987 -> 4484632 (+0.06%); split: -0.04%, +0.10%
Cycle count: 578310124 -> 578304555 (-0.00%); split: -0.05%, +0.05%
Spill count: 9248 -> 9244 (-0.04%)
Fill count: 19677 -> 19596 (-0.41%)
Max live registers: 415340 -> 415620 (+0.07%)
Max dispatch width: 49720 -> 49704 (-0.03%); split: +0.10%, -0.13%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00