fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Rhys Perry	75a68eee28	aco: optimize swizzled SALU 8/16-bit conversions We only need one s_bfe for a conversion with a swizzled source. shader-db (parallel-rdp, Navi): Totals from 487 (71.30% of 683) affected shaders: SpillSGPRs: 3284 -> 3233 (-1.55%); split: -2.71%, +1.16% SpillVGPRs: 2174 -> 2150 (-1.10%); split: -1.24%, +0.14% CodeSize: 2497864 -> 2445544 (-2.09%); split: -2.11%, +0.01% Instrs: 450613 -> 445104 (-1.22%); split: -1.27%, +0.05% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5259>	2020-07-30 17:34:51 +00:00
Rhys Perry	9a49d4c2db	aco: remove isel for GLSL-style barriers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5980>	2020-07-29 17:57:13 +00:00
Rhys Perry	ccfe9813fb	aco: create acq+rel barriers instead of acq/rel NIR doesn't have atomic loads/stores, so we have to workaround that with this for dEQP-VK.memory_model.* to pass. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	3af2b9e3de	aco: improve sync_info for TCS output stores Stop scheduling them as SSBO stores. No fossil-db changes on Navi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	8a16498cc6	aco: use storage_scratch fossil-db (Navi): Totals from 9 (0.01% of 114665) affected shaders: VMEM: 14456 -> 15312 (+5.92%) VClause: 336 -> 327 (-2.68%) Helps 9 Dark Souls 3 shaders a little. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	7a61480613	aco: consider intrinsic access in visit_{load,store}_image radv_nir_lower_memory_model will use this. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	cd392a10d0	radv/aco,aco: use scoped barriers fossil-db (Navi): Totals from 109 (0.08% of 132058) affected shaders: SGPRs: 5416 -> 5424 (+0.15%) CodeSize: 460500 -> 460508 (+0.00%); split: -0.07%, +0.07% Instrs: 87278 -> 87272 (-0.01%); split: -0.09%, +0.09% Cycles: 2241996 -> 2241852 (-0.01%); split: -0.04%, +0.04% VMEM: 33868 -> 35539 (+4.93%); split: +5.14%, -0.20% SMEM: 7183 -> 7184 (+0.01%); split: +0.36%, -0.35% VClause: 1857 -> 1882 (+1.35%) SClause: 2052 -> 2055 (+0.15%); split: -0.05%, +0.19% Copies: 6377 -> 6380 (+0.05%); split: -0.02%, +0.06% PreSGPRs: 3391 -> 3392 (+0.03%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	d1f992f3c2	aco: rework barriers and replace can_reorder fossil-db (Navi): Totals from 273 (0.21% of 132058) affected shaders: CodeSize: 937472 -> 936556 (-0.10%) Instrs: 158874 -> 158648 (-0.14%) Cycles: 13563516 -> 13562612 (-0.01%) VMEM: 85246 -> 85244 (-0.00%) SMEM: 21407 -> 21310 (-0.45%); split: +0.05%, -0.50% VClause: 9321 -> 9317 (-0.04%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Daniel Schürmann	626081fe4b	aco: don't split store data if it was already split into more elements Cc: 20.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6024>	2020-07-23 18:18:35 +00:00
Daniel Schürmann	bd75e99233	aco: ensure to not extract more components than have been fetched Fixes: `7015d2c249` ('aco: fix scratch loads which cross element_size boundaries') Cc: 20.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6024>	2020-07-23 18:18:35 +00:00
Daniel Schürmann	7015d2c249	aco: fix scratch loads which cross element_size boundaries Previously, we've set element_size == 16 which causes loads from packed vec3 arrays to cross the boundary and return wrong data. This patch sets element_size = 4 and splits loads into single channel. Fixes all of dEQP-VK.subgroups.ballot_broadcast.* Cc: 20.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5977>	2020-07-22 13:12:25 +00:00
Samuel Pitoiset	7615f2d690	aco: add support for nir_intrinsic_shared_atomic_fadd Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6000>	2020-07-22 10:01:59 +02:00
Rhys Perry	e75946cfef	aco: move some setup code into helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6013>	2020-07-21 19:38:43 +00:00
Rhys Perry	2694a34aa2	aco: add NUW flag This (combined with a pass to actually set the corresponding NIR flags) should help fix a lot of the regressions from the SMEM addition combining change. fossil-db (Navi): Totals from 12 (0.01% of 135946) affected shaders: CodeSize: 12376 -> 12304 (-0.58%) Instrs: 2436 -> 2422 (-0.57%) VMEM: 1105 -> 1096 (-0.81%) SClause: 133 -> 130 (-2.26%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	3a4847179b	aco: allow overflow for some SMEM instructions fossil-db (Navi): Totals from 10184 (7.49% of 135946) affected shaders: CodeSize: 83419748 -> 82430824 (-1.19%); split: -1.19%, +0.01% Instrs: 16054612 -> 15908523 (-0.91%); split: -0.93%, +0.02% VMEM: 1608018 -> 1581829 (-1.63%); split: +0.20%, -1.83% SMEM: 577031 -> 563492 (-2.35%); split: +0.10%, -2.45% VClause: 242643 -> 242512 (-0.05%); split: -0.06%, +0.00% SClause: 640966 -> 569897 (-11.09%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	d169f09e37	aco: be more careful combining additions that could wrap into loads/stores SMEM does the addition with 64-bits, not 32. So if the original code relied on wrapping around (for example, for subtraction), it would break. Apparently swizzled MUBUF accesses also have issues with combining additions that could overflow. Normal MUBUF accesses seem fine. fossil-db (Navi): Totals from 27219 (20.02% of 135946) affected shaders: CodeSize: 128303256 -> 131062756 (+2.15%); split: -0.00%, +2.15% Instrs: 24818911 -> 25280558 (+1.86%); split: -0.01%, +1.87% VMEM: 162311926 -> 177226874 (+9.19%); split: +9.36%, -0.17% SMEM: 18182559 -> 20218734 (+11.20%); split: +11.53%, -0.34% VClause: 423635 -> 424398 (+0.18%); split: -0.02%, +0.20% SClause: 865384 -> 1104986 (+27.69%); split: -0.00%, +27.69% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2748 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	04ea4f1ce4	aco: implement b2i8/b2i16 Fixes lots of tests under dEQP-VK.spirv_assembly.type.* Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5993>	2020-07-21 12:27:30 +00:00
Rhys Perry	b36950ad2c	aco: fix nir_op_f2f16_rtne with non-default rounding modes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>	2020-07-17 16:40:47 +00:00
Rhys Perry	d14f4faa13	aco: flush denormals before fp16 fabs/fneg if needed Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>	2020-07-17 16:40:47 +00:00
Rhys Perry	a6a731bea5	aco: implement <32-bit masked_swizzle_amd This is needed since we will be lowering some 8/16-bit shuffles to masked_swizzle_amd. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>	2020-07-13 14:11:50 +00:00
Rhys Perry	d377fbf95d	aco: optimize some masked swizzles to DPP Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>	2020-07-13 14:11:50 +00:00
Rhys Perry	f622e80494	aco: create better code for boolean phis with constant operands fossil-db (Navi): Totals from 6394 (4.70% of 135946) affected shaders: SGPRs: 651408 -> 651344 (-0.01%) SpillSGPRs: 52102 -> 52019 (-0.16%) CodeSize: 68369664 -> 68229180 (-0.21%); split: -0.21%, +0.00% Instrs: 13236611 -> 13202126 (-0.26%); split: -0.26%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3388>	2020-07-10 22:36:14 +00:00
Rhys Perry	ec4d3def16	aco: use VOP2 version of v_mbcnt_hi_u32_b32 on GFX6/7 fossil-db (Pitcairn): Totals from 2172 (1.58% of 137414) affected shaders: CodeSize: 7109080 -> 7100100 (-0.13%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5623>	2020-07-07 18:48:15 +00:00
Bas Nieuwenhuizen	c5d8961b0b	Revert "radv: add support for MRTs compaction to avoid holes" This reverts commit `7a5e6fd25f`. Since we have two different users bisecting issues to this commit, let's revert. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `7a5e6fd25f` "radv: add support for MRTs compaction to avoid holes" Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3202 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3228 (Other report in https://gitlab.freedesktop.org/mesa/mesa/-/issues/3151#note_558589) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5758>	2020-07-06 14:06:37 +00:00
Samuel Pitoiset	7a5e6fd25f	radv: add support for MRTs compaction to avoid holes SPI_SHADER_COL_FORMAT allocates export memory and CB_SHADER_MASK map them to higher MRTs if necessary. The hardware allows to remap MRTs to avoid holes somehow. For example, if we have a scenario where MRT0 is unused and only MRT1 and MRT2 are used, SPI_SHADER_COL_FORMAT is 0x77 and CB_SHADER_MASK/CB_TARGET_MASK are 0x770 (this assumes SPI_SHADER_UINT16_ABGR is set). This allows us to remove one workaround that was added for fixing GPU hangs with DXVK. I think this is because SPI_SHADER_COL_FORMAT expects contiguous MRTs to be allocated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5434>	2020-06-29 08:43:14 +00:00
Samuel Pitoiset	a102896cff	radv: lower 64-bit dfloor on GFX6 for fixing precision issues GFX6 doesn't support v_floor_f64 and the precision of v_fract_f64 which is used to implement 64-bit floor is less than what Vulkan requires. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5609>	2020-06-25 12:09:08 +00:00
Samuel Pitoiset	c84f11e7b6	radv: lower 64-bit drcp/dsqrt/drsq for fixing precision issues The hardware precision of v_rcp_f64, v_sqrt_f64 and v_rsq_f64 is less than what Vulkan requires. This lowers using the Goldschmidt's algorithm to improve precision. Fixes dEQP-VK.glsl.builtin.precision_double.* on both compiler backends. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5609>	2020-06-25 12:09:08 +00:00
Rhys Perry	91d7e40176	aco: don't create byte-aligned short loads The ISA docs don't seem to say if this is allowed, so just assume short loads require short alignment. In practice, the only situation this should affect are byte-aligned u8vec2 loads. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:28 +00:00
Rhys Perry	c3259b6e6a	aco: add missing bld.scc() in byte_align_scalar() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:28 +00:00
Rhys Perry	a0f6ca4393	aco: don't store byte-aligned short stores The ISA docs don't seem to say if this is allowed, so just assume short stores require short alignment. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:28 +00:00
Rhys Perry	a18da83d18	aco: fix copy+paste error in split_buffer_store Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:28 +00:00
Rhys Perry	841fdfcd45	radv/aco,aco: allow SMEM SSBO loads on GFX6/7 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:28 +00:00
Rhys Perry	35b5e1fc7c	aco: allow SMEM for some sub-dword accesses Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:27 +00:00
Rhys Perry	c702f8ed15	aco: only use SMEM if we can prove it's safe Totals from 26 (0.02% of 127638) affected shaders: SGPRs: 1680 -> 1664 (-0.95%) VGPRs: 1492 -> 1504 (+0.80%) CodeSize: 233140 -> 233016 (-0.05%); split: -0.09%, +0.04% Instrs: 47121 -> 47114 (-0.01%); split: -0.08%, +0.06% VMEM: 4930 -> 4655 (-5.58%); split: +0.12%, -5.70% SMEM: 2030 -> 2001 (-1.43%); split: +3.79%, -5.22% VClause: 891 -> 947 (+6.29%) SClause: 876 -> 816 (-6.85%) Copies: 4734 -> 4716 (-0.38%); split: -0.40%, +0.02% Branches: 2048 -> 2047 (-0.05%) PreSGPRs: 1400 -> 1396 (-0.29%) PreVGPRs: 1440 -> 1443 (+0.21%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>	2020-06-24 10:52:27 +00:00
Daniel Schürmann	f03a5f6cac	radv/aco: implement logic64 instead of lowering to make use of the scalar ALU Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5527>	2020-06-22 10:59:45 +00:00
Rhys Perry	f7cc7079b0	aco: use the same regclass as the definition for undef phi operands Subdword phis can't have SGPR operands on GFX6-8. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5544>	2020-06-18 17:29:33 +00:00
Rhys Perry	3d6f67950d	aco: improve 8/16-bit constants fossil-db (Navi, fp16 enabled): Totals from 1 (0.00% of 127638) affected shaders: CodeSize: 4540 -> 4388 (-3.35%) Instrs: 861 -> 830 (-3.60%) Cycles: 3444 -> 3320 (-3.60%) VMEM: 489 -> 465 (-4.91%) SMEM: 107 -> 110 (+2.80%) SClause: 31 -> 30 (-3.23%) Copies: 58 -> 54 (-6.90%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	dd23345567	aco: fix half_pi constant for 16-bit fsin/fcos This worked because the optimizer didn't consider that the 16-bit instruction would interpret the inline constant differently. This will change in the next commit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	f5a5674178	aco: update comment about preserving fp16/fp64 denormals Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	1b6a319c15	aco: add and set precise flag No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	a8f800a836	aco: use p_as_uniform in emit_vop1_instruction No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	b6d9e45f47	aco: improve code for f2{i,u}{8,16} Use sub-dword definitions so that the RA can use SDWA No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Daniel Schürmann	1f98d8c804	aco: fix shared subdword loads Shared subdword loads don't need byte alignment as they are split into multiple loads if necessary. Fixes: `5cde4989d3` ('aco: remove unnecessary split- and create_vector instructions for subdword loads') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5441>	2020-06-12 13:56:12 +00:00
Samuel Pitoiset	7b44f549b3	aco: implement radv_enable_mrt_output_nan_fixup workaround Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>	2020-06-12 14:43:57 +02:00
Rhys Perry	56345b8c61	aco: allow reading/writing upper halves/bytes when possible Use SDWA, opsel or a different opcode to achieve this. shader-db (Navi, fp16 enabled): Totals from 42 (0.03% of 127638) affected shaders: VGPRs: 3424 -> 3416 (-0.23%) CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23% Instrs: 156638 -> 155733 (-0.58%) Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00% VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05% SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11% VClause: 1477 -> 1475 (-0.14%) Copies: 13216 -> 12406 (-6.13%) Branches: 5942 -> 5901 (-0.69%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>	2020-06-10 15:05:11 +00:00
Rhys Perry	98060ba0f0	aco: p_extract_vector in 64-bit u2f16/i2f16 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>	2020-06-10 15:05:11 +00:00
Daniel Schürmann	5cde4989d3	aco: remove unnecessary split- and create_vector instructions for subdword loads This helps GFX6/7 by removing unnecessary shuffle code. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Samuel Pitoiset	5446e3cf2e	aco: fix alignment of vectors with 4 elements I think this case was just missing. This fixes a bunch of 16-bit storage related CTS failures like dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Samuel Pitoiset	c7bd0f8cd5	aco: implement 8-bit/16-bit conversions on GFX6-GFX7 Use v_bfe to implement small bitsize conversions because the compiler probably optimizes this better. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Samuel Pitoiset	6391f9ab4c	aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>	2020-06-05 16:04:06 +02:00

1 2 3 4 5 ...

328 commits