fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 22:20:14 +01:00

Author	SHA1	Message	Date
Tony Wasserka	47de553283	aco/isel: Move context initialization code to a dedicated file aco_instruction_selection_setup.cpp (previously used as a header) has been split into a header and an implementation file. The latter "only" implements init_context and setup_isel_context, but since these files carry a long trail of helper functions, this cleans up the isel header a lot. Reduces library size by 3.1% due to more functions being compiled with static linkage. Makes aco_instruction_selection.cpp compile 3% faster. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	150de6358d	aco/isel: Consistently use references for input parameters in emit_load Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	dab0af0616	aco/isel: Simplify nested branching code Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	757de68a43	aco/isel: Turn the function template emit_load into a proper function Statically known values were encoded using template parameters previously, causing specializations for each of the 5 sets of template arguments to be generated. Since emit_load is not performance critical (the inner loop never runs more often than twice), it's better for build time to use runtime arguments everywhere. Reduces build time of this file by 9% (17.3s -> 15.7s on my machine) and reduces libaco's size by 2.6%. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Daniel Schürmann	0b6448bbe7	aco/isel: refactor emit_vop3a_instruction() to handle 2 operand instructions Only AC:O has been affected. Totals from 4 (0.00% of 136546) affected shaders (RAVEN): CodeSize: 16428 -> 16420 (-0.05%) Instrs: 3294 -> 3292 (-0.06%) Cycles: 14208 -> 14200 (-0.06%) VMEM: 936 -> 978 (+4.49%) VClause: 80 -> 77 (-3.75%) Copies: 211 -> 209 (-0.95%) PreVGPRs: 127 -> 126 (-0.79%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6635>	2020-09-08 16:20:44 +00:00
Daniel Schürmann	5b31056257	aco/isel: refactor code and remove unnecessary v_mov Changes mainly due to avoided v_movs for fmin/fmax/fadd/fmul. Totals from 12783 (9.36% of 136546) affected shaders (RAVEN): SGPRs: 1097752 -> 1098264 (+0.05%); split: -0.09%, +0.14% VGPRs: 856920 -> 850800 (-0.71%); split: -0.82%, +0.11% SpillSGPRs: 49494 -> 49496 (+0.00%); split: -0.00%, +0.01% CodeSize: 99997916 -> 99989948 (-0.01%); split: -0.04%, +0.03% MaxWaves: 53895 -> 54448 (+1.03%) Instrs: 19634960 -> 19632626 (-0.01%); split: -0.05%, +0.04% Cycles: 1620601696 -> 1620900712 (+0.02%); split: -0.02%, +0.04% VMEM: 3334181 -> 3299626 (-1.04%); split: +1.62%, -2.66% SMEM: 865573 -> 865876 (+0.04%); split: +0.84%, -0.81% VClause: 337100 -> 335224 (-0.56%); split: -0.88%, +0.32% SClause: 696813 -> 697267 (+0.07%); split: -0.14%, +0.21% Copies: 1549897 -> 1548023 (-0.12%); split: -0.52%, +0.40% Branches: 682118 -> 682108 (-0.00%); split: -0.01%, +0.00% PreSGPRs: 893524 -> 895129 (+0.18%); split: -0.00%, +0.18% PreVGPRs: 790180 -> 783036 (-0.90%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6635>	2020-09-08 16:20:44 +00:00
Rhys Perry	6049dc1a9d	aco: improve fsign selection Idea from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284 fossil-db (Navi): Totals from 4053 (2.95% of 137413) affected shaders: SGPRs: 305810 -> 305906 (+0.03%); split: -0.01%, +0.04% VGPRs: 249000 -> 249144 (+0.06%); split: -0.01%, +0.07% CodeSize: 29967092 -> 29885768 (-0.27%); split: -0.27%, +0.00% Instrs: 5749494 -> 5737971 (-0.20%); split: -0.20%, +0.00% Cycles: 255028584 -> 254955444 (-0.03%); split: -0.04%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6583>	2020-09-08 12:17:43 +00:00
Samuel Pitoiset	73eb24ab31	aco: handle unaligned loads on GFX10.3 Same as GFX10. Cc: 20.2 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6594>	2020-09-04 13:19:45 +00:00
Rhys Perry	8faf85f687	aco: fix byte_align_scalar for 3 dword vectors Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `fe08f0ccf9` ('aco: add byte_align_scalar() & trim_subdword_vector() helper functions') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4710>	2020-09-04 13:03:50 +00:00
Samuel Pitoiset	8076c7596d	aco: fix wrong source position for constant with nir_op_cube_face_coord Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6480>	2020-08-28 08:03:55 +02:00
Rhys Perry	d2cf6a8399	aco: sink get_alu_src() in bfe lowering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
Rhys Perry	14d748eb28	aco: fix sgpr ubfe/ibfe if the offset is too large If the offset is large enough, it could affect the width. I'm also not sure if the hardware masks the offset by 0x1f. Found by inspection. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
Rhys Perry	454bc595d1	aco: remove 64-bit SGPR ubfe/ibfe ubfe/ibfe is always 32-bit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
Rhys Perry	156fd58cda	aco: reserve 2 sgprs for each branch We'll need two sgprs for the possibility of a long jump. fossil-db (Navi): Totals from 10197 (7.50% of 135946) affected shaders: SGPRs: 946268 -> 946468 (+0.02%) VGPRs: 705884 -> 707956 (+0.29%); split: -0.00%, +0.30% SpillSGPRs: 31485 -> 36212 (+15.01%); split: -0.04%, +15.05% CodeSize: 88296484 -> 88384604 (+0.10%); split: -0.01%, +0.11% MaxWaves: 81379 -> 81171 (-0.26%) Instrs: 17219111 -> 17231682 (+0.07%); split: -0.03%, +0.10% Cycles: 1594875900 -> 1596450136 (+0.10%); split: -0.05%, +0.15% VMEM: 1687263 -> 1689080 (+0.11%); split: +0.14%, -0.03% SMEM: 657726 -> 660262 (+0.39%); split: +0.61%, -0.22% VClause: 294806 -> 294638 (-0.06%); split: -0.08%, +0.02% SClause: 556702 -> 556210 (-0.09%); split: -0.12%, +0.03% Copies: 1466323 -> 1469349 (+0.21%); split: -0.57%, +0.78% Branches: 619793 -> 618556 (-0.20%); split: -0.28%, +0.08% PreSGPRs: 806364 -> 811477 (+0.63%); split: -0.14%, +0.77% PreVGPRs: 655845 -> 657174 (+0.20%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Daniel Schürmann	a79dad950b	nir,amd: remove trinary_minmax opcodes These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are either lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>	2020-08-24 20:56:11 +00:00
Samuel Pitoiset	9c46e6fca3	aco: add a helper for building a trap handler shader It's way easier to write a trap handler shader using ACO IR instead of writing disassembly by hand + clrxasm + copy&paste. This trap handler is quite simple for now, it just loads a buffer descriptor from the TMA BO, it saves ttmp0-1 which contain various info about the faulty instruction, and it stores some hw registers about the wave/trap status. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>	2020-08-24 11:08:24 +00:00
Karol Herbst	e5899c1e88	nir: rename nir_op_fne to nir_op_fneu It was always fneu but naming it fne causes confusion from time to time. So let's rename it. Later we also want to add other unordered and fne, this is a smaller preparation for that. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>	2020-08-21 17:26:21 +00:00
Rhys Perry	9c1e0d86a8	aco: fix non-rtz pack_half_2x16 We were using the wrong conversion opcode. The high bits are also not zero'd on GFX10, which can cause v_cvt_pk_u16_u32 to clamp. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `df645fa369` ('aco: implement VK_KHR_shader_float_controls') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6346>	2020-08-21 16:30:26 +00:00
Jason Ekstrand	1ccd681109	nir: Add an LOD parameter to image_*_size The OpenCL image_width/height/depth functions have variants which can take an LOD parameter. More importantly, LLVM-SPIRV-Translator always generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero. Given that over half the hardware out there has an LOD field for image size queries (based on a rudimentary scan through their NIR -> whatever code), we may as well just add the source to the NIR intrinsic. If this is ever a problem for anyone, the lowering is pretty trivial. I've also added asserts to everyone's drivers that should alert them if they ever see an LOD other than zero. This will never happen with GL or Vulkan so there's no need for panic. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>	2020-08-20 20:48:10 +00:00
Samuel Pitoiset	c2b1978aa4	aco: rework the way various compilation/validation errors are reported The upcoming change will allow to report all ACO errors (or warnings) directly to the app via VK_EXT_debug_report. This is similar to what we already do for reporting various SPIRV->NIR errors. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6318>	2020-08-20 08:15:06 +02:00
Samuel Pitoiset	d452c04aa1	aco: do not set valid_mask for POS0 exports on GFX 10.3 This hardware issue seems only present on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6278>	2020-08-13 07:13:56 +00:00
Rhys Perry	fea3e498c3	aco: replace MADs in isel with FMA on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	41c901b7df	aco: disable SMEM stores on GFX10.3 These are removed in GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	07250a92da	aco: implement subgroup shader_clock on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	75a68eee28	aco: optimize swizzled SALU 8/16-bit conversions We only need one s_bfe for a conversion with a swizzled source. shader-db (parallel-rdp, Navi): Totals from 487 (71.30% of 683) affected shaders: SpillSGPRs: 3284 -> 3233 (-1.55%); split: -2.71%, +1.16% SpillVGPRs: 2174 -> 2150 (-1.10%); split: -1.24%, +0.14% CodeSize: 2497864 -> 2445544 (-2.09%); split: -2.11%, +0.01% Instrs: 450613 -> 445104 (-1.22%); split: -1.27%, +0.05% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5259>	2020-07-30 17:34:51 +00:00
Rhys Perry	9a49d4c2db	aco: remove isel for GLSL-style barriers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5980>	2020-07-29 17:57:13 +00:00
Rhys Perry	ccfe9813fb	aco: create acq+rel barriers instead of acq/rel NIR doesn't have atomic loads/stores, so we have to workaround that with this for dEQP-VK.memory_model.* to pass. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	3af2b9e3de	aco: improve sync_info for TCS output stores Stop scheduling them as SSBO stores. No fossil-db changes on Navi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	8a16498cc6	aco: use storage_scratch fossil-db (Navi): Totals from 9 (0.01% of 114665) affected shaders: VMEM: 14456 -> 15312 (+5.92%) VClause: 336 -> 327 (-2.68%) Helps 9 Dark Souls 3 shaders a little. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	7a61480613	aco: consider intrinsic access in visit_{load,store}_image radv_nir_lower_memory_model will use this. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	cd392a10d0	radv/aco,aco: use scoped barriers fossil-db (Navi): Totals from 109 (0.08% of 132058) affected shaders: SGPRs: 5416 -> 5424 (+0.15%) CodeSize: 460500 -> 460508 (+0.00%); split: -0.07%, +0.07% Instrs: 87278 -> 87272 (-0.01%); split: -0.09%, +0.09% Cycles: 2241996 -> 2241852 (-0.01%); split: -0.04%, +0.04% VMEM: 33868 -> 35539 (+4.93%); split: +5.14%, -0.20% SMEM: 7183 -> 7184 (+0.01%); split: +0.36%, -0.35% VClause: 1857 -> 1882 (+1.35%) SClause: 2052 -> 2055 (+0.15%); split: -0.05%, +0.19% Copies: 6377 -> 6380 (+0.05%); split: -0.02%, +0.06% PreSGPRs: 3391 -> 3392 (+0.03%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	d1f992f3c2	aco: rework barriers and replace can_reorder fossil-db (Navi): Totals from 273 (0.21% of 132058) affected shaders: CodeSize: 937472 -> 936556 (-0.10%) Instrs: 158874 -> 158648 (-0.14%) Cycles: 13563516 -> 13562612 (-0.01%) VMEM: 85246 -> 85244 (-0.00%) SMEM: 21407 -> 21310 (-0.45%); split: +0.05%, -0.50% VClause: 9321 -> 9317 (-0.04%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Daniel Schürmann	626081fe4b	aco: don't split store data if it was already split into more elements Cc: 20.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6024>	2020-07-23 18:18:35 +00:00
Daniel Schürmann	bd75e99233	aco: ensure to not extract more components than have been fetched Fixes: `7015d2c249` ('aco: fix scratch loads which cross element_size boundaries') Cc: 20.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6024>	2020-07-23 18:18:35 +00:00
Daniel Schürmann	7015d2c249	aco: fix scratch loads which cross element_size boundaries Previously, we've set element_size == 16 which causes loads from packed vec3 arrays to cross the boundary and return wrong data. This patch sets element_size = 4 and splits loads into single channel. Fixes all of dEQP-VK.subgroups.ballot_broadcast.* Cc: 20.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5977>	2020-07-22 13:12:25 +00:00
Samuel Pitoiset	7615f2d690	aco: add support for nir_intrinsic_shared_atomic_fadd Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6000>	2020-07-22 10:01:59 +02:00
Rhys Perry	e75946cfef	aco: move some setup code into helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6013>	2020-07-21 19:38:43 +00:00
Rhys Perry	2694a34aa2	aco: add NUW flag This (combined with a pass to actually set the corresponding NIR flags) should help fix a lot of the regressions from the SMEM addition combining change. fossil-db (Navi): Totals from 12 (0.01% of 135946) affected shaders: CodeSize: 12376 -> 12304 (-0.58%) Instrs: 2436 -> 2422 (-0.57%) VMEM: 1105 -> 1096 (-0.81%) SClause: 133 -> 130 (-2.26%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	3a4847179b	aco: allow overflow for some SMEM instructions fossil-db (Navi): Totals from 10184 (7.49% of 135946) affected shaders: CodeSize: 83419748 -> 82430824 (-1.19%); split: -1.19%, +0.01% Instrs: 16054612 -> 15908523 (-0.91%); split: -0.93%, +0.02% VMEM: 1608018 -> 1581829 (-1.63%); split: +0.20%, -1.83% SMEM: 577031 -> 563492 (-2.35%); split: +0.10%, -2.45% VClause: 242643 -> 242512 (-0.05%); split: -0.06%, +0.00% SClause: 640966 -> 569897 (-11.09%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	d169f09e37	aco: be more careful combining additions that could wrap into loads/stores SMEM does the addition with 64-bits, not 32. So if the original code relied on wrapping around (for example, for subtraction), it would break. Apparently swizzled MUBUF accesses also have issues with combining additions that could overflow. Normal MUBUF accesses seem fine. fossil-db (Navi): Totals from 27219 (20.02% of 135946) affected shaders: CodeSize: 128303256 -> 131062756 (+2.15%); split: -0.00%, +2.15% Instrs: 24818911 -> 25280558 (+1.86%); split: -0.01%, +1.87% VMEM: 162311926 -> 177226874 (+9.19%); split: +9.36%, -0.17% SMEM: 18182559 -> 20218734 (+11.20%); split: +11.53%, -0.34% VClause: 423635 -> 424398 (+0.18%); split: -0.02%, +0.20% SClause: 865384 -> 1104986 (+27.69%); split: -0.00%, +27.69% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2748 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	04ea4f1ce4	aco: implement b2i8/b2i16 Fixes lots of tests under dEQP-VK.spirv_assembly.type.* Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5993>	2020-07-21 12:27:30 +00:00
Rhys Perry	b36950ad2c	aco: fix nir_op_f2f16_rtne with non-default rounding modes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>	2020-07-17 16:40:47 +00:00
Rhys Perry	d14f4faa13	aco: flush denormals before fp16 fabs/fneg if needed Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>	2020-07-17 16:40:47 +00:00
Rhys Perry	a6a731bea5	aco: implement <32-bit masked_swizzle_amd This is needed since we will be lowering some 8/16-bit shuffles to masked_swizzle_amd. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>	2020-07-13 14:11:50 +00:00
Rhys Perry	d377fbf95d	aco: optimize some masked swizzles to DPP Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>	2020-07-13 14:11:50 +00:00
Rhys Perry	f622e80494	aco: create better code for boolean phis with constant operands fossil-db (Navi): Totals from 6394 (4.70% of 135946) affected shaders: SGPRs: 651408 -> 651344 (-0.01%) SpillSGPRs: 52102 -> 52019 (-0.16%) CodeSize: 68369664 -> 68229180 (-0.21%); split: -0.21%, +0.00% Instrs: 13236611 -> 13202126 (-0.26%); split: -0.26%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3388>	2020-07-10 22:36:14 +00:00
Rhys Perry	ec4d3def16	aco: use VOP2 version of v_mbcnt_hi_u32_b32 on GFX6/7 fossil-db (Pitcairn): Totals from 2172 (1.58% of 137414) affected shaders: CodeSize: 7109080 -> 7100100 (-0.13%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5623>	2020-07-07 18:48:15 +00:00
Bas Nieuwenhuizen	c5d8961b0b	Revert "radv: add support for MRTs compaction to avoid holes" This reverts commit `7a5e6fd25f`. Since we have two different users bisecting issues to this commit, let's revert. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `7a5e6fd25f` "radv: add support for MRTs compaction to avoid holes" Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3202 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3228 (Other report in https://gitlab.freedesktop.org/mesa/mesa/-/issues/3151#note_558589) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5758>	2020-07-06 14:06:37 +00:00
Samuel Pitoiset	7a5e6fd25f	radv: add support for MRTs compaction to avoid holes SPI_SHADER_COL_FORMAT allocates export memory and CB_SHADER_MASK map them to higher MRTs if necessary. The hardware allows to remap MRTs to avoid holes somehow. For example, if we have a scenario where MRT0 is unused and only MRT1 and MRT2 are used, SPI_SHADER_COL_FORMAT is 0x77 and CB_SHADER_MASK/CB_TARGET_MASK are 0x770 (this assumes SPI_SHADER_UINT16_ABGR is set). This allows us to remove one workaround that was added for fixing GPU hangs with DXVK. I think this is because SPI_SHADER_COL_FORMAT expects contiguous MRTs to be allocated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5434>	2020-06-29 08:43:14 +00:00
Samuel Pitoiset	a102896cff	radv: lower 64-bit dfloor on GFX6 for fixing precision issues GFX6 doesn't support v_floor_f64 and the precision of v_fract_f64 which is used to implement 64-bit floor is less than what Vulkan requires. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5609>	2020-06-25 12:09:08 +00:00

352 commits