fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 15:58:06 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	c8348139fd	nir: change signature of nir_src_is_divergent() Now, it takes nir_src * instead of nir_src. Also move the implementation to nir_divergence_analysis.c. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	c25c63ebc0	nir/divergence: separately indicate whether loops have divergent continues or breaks bool nir_loop_is_divergent(nir_loop *) replaces the previous loop->divergent indicator. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Marek Olšák	65ace5649b	nir: reject unsupported component counts from all vectorize callbacks If you allow an unsupported component count in the callback for loads, nir_opt_load_store_vectorize will align num_components to the next supported vector size, essentially overfetching. This changes all callbacks to reject it. AMD will enable it in a later commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
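The overfetch described in the commit above can be illustrated with a small C sketch. It assumes the supported vector sizes are 1 to 5, 8 and 16 components; the helper names and the 128-bit limit in the callback are hypothetical and are not Mesa's actual API.

```c
#include <stdbool.h>

/* Assumption: supported component counts are 1..5, 8 and 16. */
static inline bool sketch_num_components_valid(unsigned n)
{
    return (n >= 1 && n <= 5) || n == 8 || n == 16;
}

/* Round a component count up to the next supported size. Asking for 6
 * components yields 8, so two components are fetched but never used. */
static inline unsigned sketch_round_up_components(unsigned n)
{
    while (n < 16 && !sketch_num_components_valid(n))
        n++;
    return n;
}

/* Hypothetical vectorize callback: reject unsupported counts outright
 * instead of letting the caller round them up. The 128-bit limit is an
 * illustrative hardware constraint, not taken from any driver. */
static inline bool sketch_vectorize_cb(unsigned num_components, unsigned bit_size)
{
    if (!sketch_num_components_valid(num_components))
        return false;
    return num_components * bit_size <= 128;
}
```

With this shape, a request for 6 components is refused by the callback rather than silently widened to 8.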
Marek Olšák	02923e237d	nir: add hole_size parameter into the vectorize callback It will be used to allow merging loads with a hole between them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Christian Gmeiner	cf939334e6	v3d: Add a few function traces Sprinkle around a few traces that were useful in locating submit and fence waits. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31575>	2024-10-14 12:21:51 +00:00
Iago Toral Quiroga	4d1971f17f	broadcom: fix pairing tmu lookup with previous ldtmu There are some restrictions when pairing a new TMU lookup with a previous LDTMU and we had code to handle this but we were not limiting the restriction only to TMU lookups. total instructions in shared programs: 10856992 -> 10823967 (-0.30%) instructions in affected programs: 1823670 -> 1790645 (-1.81%) helped: 10212 HURT: 110 Instructions are helped. total max-temps in shared programs: 2234069 -> 2233153 (-0.04%) max-temps in affected programs: 15100 -> 14184 (-6.07%) helped: 660 HURT: 3 Max-temps are helped. total sfu-stalls in shared programs: 15935 -> 15967 (0.20%) sfu-stalls in affected programs: 317 -> 349 (10.09%) helped: 31 HURT: 57 Inconclusive result (%-change mean confidence interval includes 0). total inst-and-stalls in shared programs: 10872927 -> 10839934 (-0.30%) inst-and-stalls in affected programs: 1824656 -> 1791663 (-1.81%) helped: 10199 HURT: 111 Inst-and-stalls are helped. total nops in shared programs: 185612 -> 185767 (0.08%) nops in affected programs: 4865 -> 5020 (3.19%) helped: 164 HURT: 256 Nops are HURT. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31574>	2024-10-10 06:58:15 +00:00
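The percentages in the shader-db summaries in this log appear to be plain relative changes between the before and after totals. A minimal C sketch, assuming that interpretation (the function name is hypothetical and this is not shader-db's own code):

```c
/* Relative change in percent between two totals. */
static inline double sketch_pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the instruction totals in the commit above, 10856992 -> 10823967 gives roughly -0.30%, and the affected-program totals 1823670 -> 1790645 give roughly -1.81%, matching the reported figures.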
Iago Toral Quiroga	c58bfb355a	broadcom/compiler: generate mali opcodes for clamping on Pi5 Models C0 and D0 support these opcodes too. total instructions in shared programs: 10869461 -> 10856992 (-0.11%) instructions in affected programs: 1467666 -> 1455197 (-0.85%) helped: 6012 HURT: 1413 Instructions are helped. total threads in shared programs: 431014 -> 431010 (<.01%) threads in affected programs: 8 -> 4 (-50.00%) helped: 0 HURT: 2 total uniforms in shared programs: 5432771 -> 5430909 (-0.03%) uniforms in affected programs: 183047 -> 181185 (-1.02%) helped: 976 HURT: 128 Uniforms are helped. total max-temps in shared programs: 2235272 -> 2234069 (-0.05%) max-temps in affected programs: 38163 -> 36960 (-3.15%) helped: 1262 HURT: 168 Max-temps are helped. total spills in shared programs: 4331 -> 4363 (0.74%) spills in affected programs: 964 -> 996 (3.32%) helped: 6 HURT: 47 total fills in shared programs: 6527 -> 6622 (1.46%) fills in affected programs: 2047 -> 2142 (4.64%) helped: 6 HURT: 47 total sfu-stalls in shared programs: 15807 -> 15935 (0.81%) sfu-stalls in affected programs: 787 -> 915 (16.26%) helped: 71 HURT: 172 Sfu-stalls are HURT. total inst-and-stalls in shared programs: 10885268 -> 10872927 (-0.11%) inst-and-stalls in affected programs: 1469423 -> 1457082 (-0.84%) helped: 5998 HURT: 1417 Inst-and-stalls are helped. total nops in shared programs: 184280 -> 185612 (0.72%) nops in affected programs: 10000 -> 11332 (13.32%) helped: 311 HURT: 1193 Nops are HURT. The results show a reduction in register pressure, but an increase in spills, which looks contradictory. This is because for some reason, this optimization makes the NIR scheduler produce code for some shaders in Godot that cause additional spilling, but the problem seems to be exclusive to Godot shaders and not really related to the optimization itself but to how the NIR scheduler works. 
Excluding Godot shaders we actually see a decrease in spills and a slightly larger improvement in instruction counts: total instructions in shared programs: 10720106 -> 10707621 (-0.12%) instructions in affected programs: 1375316 -> 1362831 (-0.91%) helped: 5948 HURT: 1364 Instructions are helped. total threads in shared programs: 428248 -> 428244 (<.01%) threads in affected programs: 8 -> 4 (-50.00%) helped: 0 HURT: 2 total spills in shared programs: 3729 -> 3712 (-0.46%) spills in affected programs: 451 -> 434 (-3.77%) helped: 6 HURT: 0 total fills in shared programs: 4738 -> 4714 (-0.51%) fills in affected programs: 564 -> 540 (-4.26%) helped: 6 HURT: 0 Comparing only shaders from Godot: total instructions in shared programs: 149355 -> 149371 (0.01%) instructions in affected programs: 92350 -> 92366 (0.02%) helped: 64 HURT: 49 Inconclusive result (value mean confidence interval includes 0). total max-temps in shared programs: 16477 -> 16472 (-0.03%) max-temps in affected programs: 180 -> 175 (-2.78%) helped: 5 HURT: 0 Max-temps are helped. total spills in shared programs: 602 -> 651 (8.14%) spills in affected programs: 513 -> 562 (9.55%) helped: 0 HURT: 47 total fills in shared programs: 1789 -> 1908 (6.65%) fills in affected programs: 1483 -> 1602 (8.02%) helped: 0 HURT: 47 Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>	2024-10-03 09:02:08 +00:00
Iago Toral Quiroga	c57be33d96	broadcom/compiler: implement NIR mali opcodes for clamping These translate directly to new unpack modifiers on V3D 7.x. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>	2024-10-03 09:02:08 +00:00
Iago Toral Quiroga	5a62d47762	broadcom/compiler: don't use small immediates in geometry stages Shader-db shows this is beneficial, even if it comes with a small increase in register pressure. total instructions in shared programs: 10889197 -> 10869857 (-0.18%) instructions in affected programs: 3625014 -> 3605674 (-0.53%) helped: 14911 HURT: 8324 Instructions are helped. total threads in shared programs: 431034 -> 431014 (<.01%) threads in affected programs: 40 -> 20 (-50.00%) helped: 0 HURT: 10 Threads are HURT. total uniforms in shared programs: 5308006 -> 5432767 (2.35%) uniforms in affected programs: 2204951 -> 2329712 (5.66%) helped: 9 HURT: 30766 Uniforms are HURT. total max-temps in shared programs: 2226471 -> 2235269 (0.40%) max-temps in affected programs: 272670 -> 281468 (3.23%) helped: 2372 HURT: 8479 Max-temps are HURT. total spills in shared programs: 4318 -> 4331 (0.30%) spills in affected programs: 39 -> 52 (33.33%) helped: 2 HURT: 7 total fills in shared programs: 6514 -> 6527 (0.20%) fills in affected programs: 42 -> 55 (30.95%) helped: 2 HURT: 7 total sfu-stalls in shared programs: 15166 -> 15808 (4.23%) sfu-stalls in affected programs: 2389 -> 3031 (26.87%) helped: 513 HURT: 944 Inconclusive result (%-change mean confidence interval includes 0). total inst-and-stalls in shared programs: 10904363 -> 10885665 (-0.17%) inst-and-stalls in affected programs: 3660930 -> 3642232 (-0.51%) helped: 14878 HURT: 8450 Inst-and-stalls are helped. total nops in shared programs: 183672 -> 184256 (0.32%) nops in affected programs: 12532 -> 13116 (4.66%) helped: 1841 HURT: 2251 Nops are HURT. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31355>	2024-09-25 14:21:46 +00:00
Iago Toral Quiroga	390849f6a2	broadcom/compiler: don't add const offset to unifa if it is 0 Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31355>	2024-09-25 14:21:46 +00:00
Iago Toral Quiroga	09e0e53a3b	broadcom/compiler: avoid register conflict with ldunif(a) and ldvary ldvary instructions have implicit writes to rf0 (r5 in Pi4) that are read in follow-up instructions to complete the interpolation calculations so we would rather not allocate ldunif(a)'s dst to rf0/r5 during these sequences either, to facilitate pairing. This gives us -0.25% of instructions for fragment shaders in shader-db for Pi5 and -0.64% on Pi4. Shader-db Pi5: total instructions in shared programs: 10890641 -> 10889197 (-0.01%) instructions in affected programs: 575506 -> 574062 (-0.25%) helped: 2506 HURT: 1378 Instructions are helped. total max-temps in shared programs: 2226555 -> 2226471 (<.01%) max-temps in affected programs: 5061 -> 4977 (-1.66%) helped: 139 HURT: 78 Max-temps are helped. total sfu-stalls in shared programs: 15143 -> 15166 (0.15%) sfu-stalls in affected programs: 310 -> 333 (7.42%) helped: 134 HURT: 195 Inconclusive result (value mean confidence interval includes 0). total inst-and-stalls in shared programs: 10905784 -> 10904363 (-0.01%) inst-and-stalls in affected programs: 577053 -> 575632 (-0.25%) helped: 2497 HURT: 1415 Inst-and-stalls are helped. total nops in shared programs: 183945 -> 183672 (-0.15%) nops in affected programs: 3862 -> 3589 (-7.07%) helped: 478 HURT: 234 Nops are helped. Shader-db Pi4: total instructions in shared programs: 12842116 -> 12835720 (-0.05%) instructions in affected programs: 996970 -> 990574 (-0.64%) helped: 6027 HURT: 367 Instructions are helped. total max-temps in shared programs: 2251877 -> 2251707 (<.01%) max-temps in affected programs: 2670 -> 2500 (-6.37%) helped: 167 HURT: 9 Max-temps are helped. total sfu-stalls in shared programs: 21132 -> 21093 (-0.18%) sfu-stalls in affected programs: 114 -> 75 (-34.21%) helped: 92 HURT: 55 Sfu-stalls are helped. 
total inst-and-stalls in shared programs: 12863248 -> 12856813 (-0.05%) inst-and-stalls in affected programs: 1008237 -> 1001802 (-0.64%) helped: 6070 HURT: 359 Inst-and-stalls are helped. total nops in shared programs: 281645 -> 281200 (-0.16%) nops in affected programs: 2241 -> 1796 (-19.86%) helped: 501 HURT: 88 Nops are helped. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31355>	2024-09-25 14:21:46 +00:00
Iago Toral Quiroga	917e8e5439	broadcom/compiler: rename is_ldunif_dst to try_rf0 We flag nodes used as ldunif dst so we can try to favor allocating rf0 to them, so be more explicit about its purpose. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31355>	2024-09-25 14:21:46 +00:00
Iago Toral Quiroga	68014b0d9b	broadcom/compiler: skip small immediates optimization on vpm instructions total instructions in shared programs: 11164938 -> 10890641 (-2.46%) instructions in affected programs: 6557250 -> 6282953 (-4.18%) helped: 59134 HURT: 9752 Instructions are helped. total threads in shared programs: 431068 -> 431034 (<.01%) threads in affected programs: 68 -> 34 (-50.00%) helped: 0 Threads are HURT. total uniforms in shared programs: 3880437 -> 5308006 (36.79%) uniforms in affected programs: 2669367 -> 4096936 (53.48%) helped: 2 HURT: 74046 Uniforms are HURT. total max-temps in shared programs: 2244298 -> 2226555 (-0.79%) max-temps in affected programs: 463611 -> 445868 (-3.83%) helped: 17473 HURT: 8040 Max-temps are helped. total spills in shared programs: 4312 -> 4318 (0.14%) spills in affected programs: 0 -> 6 helped: 0 HURT: 2 total fills in shared programs: 6508 -> 6514 (0.09%) fills in affected programs: 0 -> 6 helped: 0 HURT: 2 total sfu-stalls in shared programs: 14794 -> 15143 (2.36%) sfu-stalls in affected programs: 1261 -> 1610 (27.68%) helped: 238 HURT: 586 Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total inst-and-stalls in shared programs: 11179732 -> 10905784 (-2.45%) inst-and-stalls in affected programs: 6570407 -> 6296459 (-4.17%) helped: 59126 HURT: 9786 Inst-and-stalls are helped. total nops in shared programs: 273422 -> 183945 (-32.72%) nops in affected programs: 139446 -> 49969 (-64.17%) helped: 60679 HURT: 2277 Nops are helped. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31259>	2024-09-23 07:45:46 +00:00
Konstantin Seurer	ce24486ee4	nir: Introduce nir_debug_info_instr Adds a new instruction type that stores metadata that might be useful for debugging purposes. Passes must ignore these instructions when making decisions. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18903>	2024-08-25 10:26:33 +00:00
Iago Toral Quiroga	ad9ff707ce	broadcom: drop backend implementation of nir_op_ufind_msb We can have NIR do this for us now that we have uclz. Suggested by Georg Lehmann. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30614>	2024-08-13 13:16:18 +02:00
Iago Toral Quiroga	35a10f5d5a	broadcom: implement nir_op_uclz This enables some algebraic optimizations. No changes in shader-db, but it does cause some CTS tests to produce fewer instructions. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30614>	2024-08-13 13:16:11 +02:00
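The two commits above rely on find-MSB being expressible through a count-leading-zeros operation. A minimal C sketch of that relationship, assuming 32-bit values, a clz that returns 32 for zero, and a find-MSB that returns -1 for zero; the helper names are hypothetical stand-ins, not Mesa functions:

```c
#include <stdint.h>

/* Stand-in for a uclz-style op: number of leading zero bits in a
 * 32-bit value, with 32 returned for an input of 0. */
static inline uint32_t sketch_uclz(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Index of the most significant set bit, or -1 when no bit is set.
 * For x == 0 this evaluates to 31 - 32 == -1 without a special case. */
static inline int32_t sketch_ufind_msb(uint32_t x)
{
    return 31 - (int32_t)sketch_uclz(x);
}
```

Because the zero case falls out of the subtraction, a backend that provides clz does not need a separate find-MSB implementation.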
Alyssa Rosenzweig	c3d999dec9	broadcom: switch to derivative intrinsics Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30570>	2024-08-09 13:54:11 +00:00
Zan Dobersek	7fd5f76393	nir/lower_vars_to_scratch: calculate threshold-limited variable size separately ir3's lowering of variables to scratch memory has to treat 8-bit values as 16-bit ones when comparing such value's size against the given threshold since those values are handled through 16-bit half-registers. But those values can still use natural 8-bit size and alignment for storing inside scratch memory. nir_lower_vars_to_scratch now accepts two size-and-alignment functions, one used for calculating the variable size and the other for calculating the size and alignment needed for storing inside scratch memory. Non-ir3 uses of this pass can just duplicate the currently-used function. ir3 provides a separate variable-size function that special-cases 8-bit types. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Iago Toral Quiroga	086ed1e54b	broadcom/compiler: emit instructions producing flags earlier We usually emit flags right before consuming them but this is suboptimal from the point of view of register pressure: if an instruction is only used to generate flags then waiting to emit it right before reading the flags extends the liveness of the sources used to generate the flags for no gain. This pass will check for such instructions and try to move them as early as possible. Shader-db results below show this is effective to reduce register pressure, allowing a few shaders to increase thread counts and/or reduce spilling: total instructions in shared programs: 11057173 -> 11057076 (<.01%) instructions in affected programs: 1955543 -> 1955446 (<.01%) helped: 4214 HURT: 3905 Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 425096 -> 425170 (0.02%) threads in affected programs: 74 -> 148 (100.00%) helped: 37 HURT: 0 Threads are helped. total uniforms in shared programs: 3846275 -> 3845674 (-0.02%) uniforms in affected programs: 23574 -> 22973 (-2.55%) helped: 217 HURT: 30 Uniforms are helped. total max-temps in shared programs: 2222910 -> 2220488 (-0.11%) max-temps in affected programs: 61904 -> 59482 (-3.91%) helped: 2145 HURT: 113 Max-temps are helped. total spills in shared programs: 4294 -> 4280 (-0.33%) spills in affected programs: 148 -> 134 (-9.46%) helped: 8 HURT: 0 total fills in shared programs: 6497 -> 6468 (-0.45%) fills in affected programs: 291 -> 262 (-9.97%) helped: 8 HURT: 0 total sfu-stalls in shared programs: 14344 -> 14611 (1.86%) sfu-stalls in affected programs: 1308 -> 1575 (20.41%) helped: 217 HURT: 335 Inconclusive result (%-change mean confidence interval includes 0). total inst-and-stalls in shared programs: 11071517 -> 11071687 (<.01%) inst-and-stalls in affected programs: 1946767 -> 1946937 (<.01%) helped: 4191 HURT: 3909 Inconclusive result (value mean confidence interval includes 0). 
total nops in shared programs: 270628 -> 269829 (-0.30%) nops in affected programs: 22032 -> 21233 (-3.63%) helped: 1213 HURT: 571 Inconclusive result (%-change mean confidence interval includes 0). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30511>	2024-08-07 09:28:39 +02:00
Daniel Stone	e05415a82e	format: Generate endian-independent format aliases Instead of having a hardcoded list of endian-independent format aliases in the header, generate them from the format definitions. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29649>	2024-07-19 13:50:42 +00:00
Iago Toral Quiroga	33187012ab	broadcom/compiler: implement nir_op_fsat Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30086>	2024-07-10 08:29:59 +02:00
Iago Toral Quiroga	d62082a131	broadcom/compiler: disallow copy propagation of FMOV exclusive modifiers Since .sat, .nsat and .max0 are only supported with FMOV we can't copy propagate an FMOV with any of these unpack modifiers into a different opcode. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30086>	2024-07-10 08:29:50 +02:00
Juan A. Suarez Romero	4581bf595b	broadcom: follow version naming convention We usually name the functions that depend on hardware version as v3d<version>_foo. Keep the same convention in QPU and lower_image_load_store, so it is easier to search for versioned functions. Acked-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30000>	2024-07-08 11:19:31 +00:00
Juan A. Suarez Romero	7dc6b8df11	broadcom/compiler: use unsigned types when performing bitshifting Ensure unsigned integers are used instead of signed ones when performing left bit shifts. This has been detected by the Undefined Behaviour Sanitizer (UBSan). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29911>	2024-07-01 08:02:07 +00:00
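The kind of issue the commit above addresses can be sketched in C. Left-shifting a signed int so that a bit lands in or past the sign bit (for example `1 << 31` with a 32-bit int) is undefined behaviour, and UBSan reports it; doing the shift on an unsigned operand is well defined. The helper names below are hypothetical and are not taken from the V3D compiler:

```c
#include <assert.h>
#include <stdint.h>

/* Build a single-bit mask. UINT32_C(1) keeps the shift in unsigned
 * arithmetic, so bit 31 is safe to set. */
static inline uint32_t sketch_bit(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(1) << bit;
}

/* Place an 8-bit field in the top byte of a word. A uint8_t operand is
 * promoted to (signed) int before shifting, so cast to uint32_t first. */
static inline uint32_t sketch_pack_top_byte(uint8_t value)
{
    return (uint32_t)value << 24;
}
```

The cast in `sketch_pack_top_byte` matters because integer promotion would otherwise make `value << 24` a signed shift even though `value` itself is unsigned.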
David Heidelberg	68215332a8	build: pass licensing information in SPDX form Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Signed-off-by: David Heidelberg <david@ixit.cz> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29972>	2024-06-29 12:42:49 -07:00
Iago Toral Quiroga	4e6b675974	broadcom/compiler: drop multop if we dce umul24 We always emit multop+umul24 to implement integer multiply and this is the only scenario in which we use multop, so if we decide to DCE umul24 we should also DCE the previous multop. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29909>	2024-06-27 06:43:09 +00:00
Iago Toral Quiroga	0a7a36372f	broadcom/compiler: validate rtop + thrsw hazard Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29909>	2024-06-27 06:43:09 +00:00
Iago Toral Quiroga	d1f8351f3c	broadcom/compiler: fix per-quad spilling This is not safe when we have conditional spills since we could be spilling disabled lanes with undefined values that could overwrite valid data for those lanes from a previous spill of the same temp that was unconditional (or that conditionally enabled those same lanes). Fixes some Piglit OpenCL tests as well as the following OpenCL tests: integer_divideAssign integer_moduloAssign integer_mad_sat integer_ops integer_divideAssign integer_ops integer_mad_sat integer_ops integer_moduloAssign integer_ops quick_char_math integer_ops quick_short_math math_brute_force half_powr math_brute_force pow math_brute_force pown math_brute_force powr math_brute_force rootn Fixes: `597560e27c` ('broadcom/compiler: always enable per-quad on spill operations') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29909>	2024-06-27 06:43:09 +00:00
Iago Toral Quiroga	38b7f411a1	broadcom/compiler: don't spill in between multop and umul24 The multop instruction implicitly writes rtop which is not preserved across thread switches. We can spill the sources of the multop (since these would happen before multop) and the destination of umul24 (since that would happen after umul24). Fixes some OpenCL tests when V3D_DEBUG=opt_compile_time is used to choose a different compile configuration. cc: mesa-stable Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29909>	2024-06-27 06:43:09 +00:00
Karol Herbst	742984a325	broadcom/compiler: handle variable shared memory Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:03 +00:00
Karol Herbst	9bf0b3a112	broadcom/compiler: call nir_lower_64bit_phis Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:03 +00:00
Karol Herbst	4a169a518e	broadcom/compiler: implement load_kernel_input Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:03 +00:00
Karol Herbst	caa3872f76	broadcom/compiler: abort on unknown intrinsics Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:03 +00:00
Karol Herbst	f8ab9c0e93	broadcom/compiler: handle up to vec16 load_uniforms Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:03 +00:00
Karol Herbst	e050b13777	broadcom/compiler: try handling 8/16 bit alu operations Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:02 +00:00
Karol Herbst	c7f9cca985	broadcom/compiler: fix iu2f32 for 8 and 16 bit inputs Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:02 +00:00
Karol Herbst	214121e9b0	broadcom/compiler: handle fp16 conversion ops As long as fp16 isn't advertized it's not doing much, but it also doesn't hurt to add them. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:02 +00:00
Karol Herbst	c2ec65eeda	broadcom/compiler: add generated v3d_nir_lower_algebraic Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>	2024-06-26 10:04:02 +00:00
Alyssa Rosenzweig	da752ed7c1	treewide: use nir_def_replace sometimes Two Coccinelle patches here. Didn't catch nearly as much as I would've liked but it's a start. Coccinelle patch: @@ expression intr, repl; @@ -nir_def_rewrite_uses(&intr->def, repl); -nir_instr_remove(&intr->instr); +nir_def_replace(&intr->def, repl); Coccinelle patch: @@ identifier intr; expression instr, repl; @@ nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr); ... -nir_def_rewrite_uses(&intr->def, repl); -nir_instr_remove(instr); +nir_def_replace(&intr->def, repl); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom] Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima] Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> [etna] Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> [r300] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29817>	2024-06-21 15:36:56 +00:00
Iago Toral Quiroga	02f33b7d92	broadcom/compiler: initialize payload_conflict for all initial nodes Fixes: `cb83f25b39` ('broadcom/compiler: don't assign payload registers to spilling setup temps') cc: mesa-stable Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29759>	2024-06-18 07:19:07 +00:00
Alyssa Rosenzweig	15257b65c6	treewide: use nir_metadata_control_flow Via Coccinelle patch: @@ @@ -nir_metadata_block_index \| nir_metadata_dominance +nir_metadata_control_flow ...plus some manual fixups for call sites missed by coccinelle. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom] Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:14 -04:00
Daniel Schürmann	7af16e9f1e	nir/shader_info: remove uses_demote This flag is mostly redundant with uses_discard and was only introduced to implement demote with LLVM when it didn't have that intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	9b1a748b5e	nir: remove nir_intrinsic_discard The semantics of discard differ between GLSL and HLSL and their various implementations. Subsequently, numerous application bugs occurred and SPV_EXT_demote_to_helper_invocation was written in order to clarify the behavior. In NIR, we now have 3 different intrinsics for 2 things, and while demote and terminate have clear semantics, discard still doesn't and can mean either of the two. This patch entirely removes nir_intrinsic_discard and nir_intrinsic_discard_if and replaces all occurrences either with nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the case that the NIR option 'discard_is_demote' is being set. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Karol Herbst	05b9705ae0	broadcom/compiler: rework scratch lowering Let's rely on nir_lower_mem_access_bit_sizes doing all the heavy work, so v3d_nir_lower_scratch can be cleaned up quite a lot. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Karol Herbst	75196e86f1	broadcom/compiler: only handle load_uniform explicitly in v3d_nir_lower_load_store_bitsize Also use nir_get_io_offset_src_number while at it. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Karol Herbst	a2eff2b9f9	broadcom/compiler: convert 2x32 global operations to scalar variants Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Karol Herbst	9827cfe49e	broadcom/compiler: use nir_lower_mem_access_bit_sizes for memory lowering It does everything we need and allows us to remove a lot of code. It also helps with supporting vec8/16 and unaligned load/stores for OpenCL. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Karol Herbst	66b58e8a0e	broadcom/compiler: support global load/store intrinsics It's the same as global_2x32 as there the 2nd component is ignored anyway Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Karol Herbst	83883a6cc2	broadcom/compiler: handle load_workgroup_size Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29554>	2024-06-06 12:01:00 +00:00
Iago Toral Quiroga	c30833f233	broadcom/compiler: check if vertex shader writes point size This is the same check we already do for geometry shaders. We will use this shortly. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29413>	2024-05-28 05:31:13 +00:00

909 commits