fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 11:10:10 +01:00

Author	SHA1	Message	Date
Samuel Pitoiset	8076c7596d	aco: fix wrong source position for constant with nir_op_cube_face_coord Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6480>	2020-08-28 08:03:55 +02:00
Samuel Pitoiset	502b9daa7a	aco: add ACO_DEBUG=novn,noopt,nosched for debugging purposes To disable value numbering, optimizations and scheduling. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6470>	2020-08-27 10:23:51 +00:00
Rhys Perry	d2cf6a8399	aco: sink get_alu_src() in bfe lowering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
Rhys Perry	14d748eb28	aco: fix sgpr ubfe/ibfe if the offset is too large If the offset is large enough, it could affect the width. I'm also not sure if the hardware masks the offset by 0x1f. Found by inspection. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
Rhys Perry	454bc595d1	aco: remove 64-bit SGPR ubfe/ibfe ubfe/ibfe is always 32-bit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
Rhys Perry	eb3c16e1f8	aco/tests: add tests for long jumps Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	192b9f4303	aco: shorten disassembly for repeated instructions Future tests will do this. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	ae6330d955	aco/tests: add test for GFX10 0x3f bug Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	fe2dc41258	aco: create long jumps When the branch offset can't be encoded, we have to use s_setpc_b64. Fixes hang in RPCS3 vertex ubershader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3231 Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	156fd58cda	aco: reserve 2 sgprs for each branch We'll need two sgprs for the possibility of a long jump. fossil-db (Navi): Totals from 10197 (7.50% of 135946) affected shaders: SGPRs: 946268 -> 946468 (+0.02%) VGPRs: 705884 -> 707956 (+0.29%); split: -0.00%, +0.30% SpillSGPRs: 31485 -> 36212 (+15.01%); split: -0.04%, +15.05% CodeSize: 88296484 -> 88384604 (+0.10%); split: -0.01%, +0.11% MaxWaves: 81379 -> 81171 (-0.26%) Instrs: 17219111 -> 17231682 (+0.07%); split: -0.03%, +0.10% Cycles: 1594875900 -> 1596450136 (+0.10%); split: -0.05%, +0.15% VMEM: 1687263 -> 1689080 (+0.11%); split: +0.14%, -0.03% SMEM: 657726 -> 660262 (+0.39%); split: +0.61%, -0.22% VClause: 294806 -> 294638 (-0.06%); split: -0.08%, +0.02% SClause: 556702 -> 556210 (-0.09%); split: -0.12%, +0.03% Copies: 1466323 -> 1469349 (+0.21%); split: -0.57%, +0.78% Branches: 619793 -> 618556 (-0.20%); split: -0.28%, +0.08% PreSGPRs: 806364 -> 811477 (+0.63%); split: -0.14%, +0.77% PreVGPRs: 655845 -> 657174 (+0.20%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	e8ac14527a	aco: keep loop live-through variables spilled fossil-db (Navi): Totals from 3149 (2.32% of 135946) affected shaders: VGPRs: 280928 -> 280932 (+0.00%) SpillSGPRs: 51133 -> 30042 (-41.25%) CodeSize: 43063076 -> 41377252 (-3.91%); split: -3.92%, +0.00% Instrs: 8278435 -> 8037133 (-2.91%); split: -2.92%, +0.00% Cycles: 709575456 -> 683366172 (-3.69%); split: -3.69%, +0.00% VMEM: 542887 -> 542937 (+0.01%); split: +0.05%, -0.04% SMEM: 210255 -> 206368 (-1.85%); split: +0.12%, -1.97% SClause: 258847 -> 258019 (-0.32%); split: -0.52%, +0.20% Copies: 731836 -> 684784 (-6.43%); split: -6.44%, +0.01% Branches: 305422 -> 292844 (-4.12%); split: -4.12%, +0.00% PreSGPRs: 333103 -> 332701 (-0.12%) PreVGPRs: 280086 -> 280089 (+0.00%) Helps mostly Detroit: Become Human and the single spilling Doom Eternal shader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	75d6c30572	aco: fix spills_entry heuristic for branch blocks in init_live_in_vars() fossil-db (Navi): Totals from 222 (0.16% of 135946) affected shaders: SpillSGPRs: 9121 -> 9117 (-0.04%) SpillVGPRs: 2820 -> 1821 (-35.43%) CodeSize: 5134264 -> 5053336 (-1.58%); split: -1.63%, +0.05% Instrs: 953435 -> 938761 (-1.54%); split: -1.59%, +0.05% Cycles: 100567688 -> 97252432 (-3.30%); split: -3.34%, +0.04% VMEM: 40752 -> 39219 (-3.76%); split: +0.04%, -3.80% SMEM: 15416 -> 15509 (+0.60%); split: +0.64%, -0.03% VClause: 20120 -> 19091 (-5.11%) SClause: 23540 -> 23544 (+0.02%); split: -0.11%, +0.12% Copies: 125912 -> 122017 (-3.09%); split: -3.36%, +0.26% Branches: 31131 -> 30009 (-3.60%) Mostly affects parallel-rdp ubershaders. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	fc9f502a5b	aco: fix regclass checks when fixing to vcc/exec with Builder Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	a537c9e73f	aco: don't fix break condition for break+discard to exec This would move the old exec mask back into exec. This also fixes the live_out_exec. Issue found in dEQP-VK.graphicsfuzz.cosh-return-inf-unused Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	1a5444b900	aco: don't consider the first partial spill if it's the wrong type Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	8f6a900d5e	aco: consider branch definitions in spiller Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Daniel Schürmann	a79dad950b	nir,amd: remove trinary_minmax opcodes These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are either lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>	2020-08-24 20:56:11 +00:00
Timur Kristóf	f820dde201	aco: Fix convert_to_SDWA when instruction has 3 operands. Previously, when the instruction had 3 operands, this would cause possible corruption because of writing to sdwa->sel[2]. This was noticed thanks to GCC 10's -Wstringop-overflow warning. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6436>	2020-08-24 15:55:14 +02:00
Timur Kristóf	0d194a70c6	aco: Fix unused variable warning by adding ASSERTED. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6436>	2020-08-24 15:55:06 +02:00
Samuel Pitoiset	8fd2f5c16d	radv: add a small interface for creating the trap handler shader Similar to the GS copy shader except that NIR is unused because the shader is written directly using ACO IR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>	2020-08-24 11:08:24 +00:00
Samuel Pitoiset	a0814a873d	aco: skip unnecessary compiler pass for the trap handler program The shader is written by hand with assigned registers, so most of the passes are unnecessary. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>	2020-08-24 11:08:24 +00:00
Samuel Pitoiset	9c46e6fca3	aco: add a helper for building a trap handler shader It's way easier to write a trap handler shader using ACO IR instead of writing disassembly by hand + clrxasm + copy&paste. This trap handler is quite simple for now, it just loads a buffer descriptor from the TMA BO, it saves ttmp0-1 which contain various info about the faulty instruction, and it stores some hw registers about the wave/trap status. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>	2020-08-24 11:08:24 +00:00
Samuel Pitoiset	a6146aa598	aco: validate that SMEM operands can use fixed registers To fix a validation error when loading the scalar tma buffer descriptor because it's not a temp but a fixed reg (tma_lo/tma_hi). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>	2020-08-24 11:08:24 +00:00
Samuel Pitoiset	baa9268eb6	aco: add TBA/TMA/TTMP0-11 physical registers definitions The TBA/TMA scalar registers are only available on GFX6-GFX8. On GFX9+, TBA/TMA addr are stored in hardware registers and the number of TTMP scalar registers is thus increased by 4. Just keep in mind that tba_lo is actually ttmp0. Best would be to support ttmp registers in RA but that's more complicated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>	2020-08-24 11:08:24 +00:00
Karol Herbst	e5899c1e88	nir: rename nir_op_fne to nir_op_fneu It was always fneu but naming it fne causes confusion from time to time. So let's rename it. Later we also want to add other unordered and fne, this is a smaller preparation for that. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>	2020-08-21 17:26:21 +00:00
Rhys Perry	2133e64203	aco: use nir_intrinsic_has_access Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6402>	2020-08-21 16:47:00 +00:00
Rhys Perry	9c1e0d86a8	aco: fix non-rtz pack_half_2x16 We were using the wrong conversion opcode. The high bits are also not zero'd on GFX10, which can cause v_cvt_pk_u16_u32 to clamp. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `df645fa369` ('aco: implement VK_KHR_shader_float_controls') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6346>	2020-08-21 16:30:26 +00:00
Samuel Pitoiset	f153151730	aco: add ACO_DEBUG=force-waitcnt to emit wait-states Sounds useful for debugging missing wait-states and for improving detection of the faulty instruction in case of memory violations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6386>	2020-08-21 13:22:58 +02:00
Jason Ekstrand	1ccd681109	nir: Add an LOD parameter to image_*_size The OpenCL image_width/height/depth functions have variants which can take an LOD parameter. More importantly, LLVM-SPIRV-Translator always generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero. Given that over half the hardware out there has an LOD field for image size queries (based on a rudimentary scan through their NIR -> whatever code), we may as well just add the source to the NIR intrinsic. If this is ever a problem for anyone, the lowering is pretty trivial. I've also added asserts to everyone's drivers that should alert them if they ever see an LOD other than zero. This will never happen with GL or Vulkan so there's no need for panic. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>	2020-08-20 20:48:10 +00:00
Samuel Pitoiset	58817bda8b	aco: fix file leak in ra_fail() Fixes: `c2b1978aa4` ("aco: rework the way various compilation/validation errors are reported") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6399>	2020-08-20 14:42:07 +00:00
Samuel Pitoiset	e901b901cb	radv,aco: report ACO errors/warnings back via VK_EXT_debug_report To help developers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6318>	2020-08-20 08:15:08 +02:00
Samuel Pitoiset	c2b1978aa4	aco: rework the way various compilation/validation errors are reported The upcoming change will allow to report all ACO errors (or warnings) directly to the app via VK_EXT_debug_report. This is similar to what we already do for reporting various SPIRV->NIR errors. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6318>	2020-08-20 08:15:06 +02:00
Samuel Pitoiset	bc723dfda7	aco: rename DEBUG_VALIDATE to DEBUG_VALIDATE_IR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6318>	2020-08-20 08:15:04 +02:00
Samuel Pitoiset	d452c04aa1	aco: do not set valid_mask for POS0 exports on GFX 10.3 This hardware issue seems only present on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6278>	2020-08-13 07:13:56 +00:00
Daniel Schürmann	fdb97d3d29	aco: execute branch instructions in WQM if necessary It could happen that only the branch condition was computed in WQM and not the branch instruction. There is now some redundancy which should be cleaned up. Fixes: `3817fa7a4d` ('aco: fix WQM handling in nested loops') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6260>	2020-08-11 15:35:59 +00:00
Rhys Perry	7b4c24eb67	aco: don't move memory accesses to before control barriers Fixes random failures of dEQP-VK.image.qualifiers.volatile.cube_array.r32i and similar tests on Vega. fossil-db (Navi): Totals from 6 (0.00% of 135946) affected shaders: VMEM: 1218 -> 1110 (-8.87%); split: +2.46%, -11.33% SMEM: 174 -> 189 (+8.62%) Copies: 84 -> 87 (+3.57%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `cd392a10d0` ('radv/aco,aco: use scoped barriers') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6174>	2020-08-11 14:16:00 +01:00
Rhys Perry	6e70508151	aco: set constant_data_offset correctly in the case of merged shaders setup_nir() is done for all shaders before any of them are selected, so constant_data_offset could be incorrect for the first shader. Fixes incorrect geometry in Mafia III and Max Payne 3. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2768 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6205>	2020-08-10 18:21:47 +00:00
Rhys Perry	21b47cbd99	aco: fix C++11/C++14 compilation static_assert without a message is only available since C++17. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `d1f992f3c2` ('aco: rework barriers and replace can_reorder') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3374 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6216>	2020-08-06 23:51:14 +01:00
Rhys Perry	fea3e498c3	aco: replace MADs in isel with FMA on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	41c901b7df	aco: disable SMEM stores on GFX10.3 These are removed in GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	b811b1d083	aco: update aco_opcodes.py for GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	07250a92da	aco: implement subgroup shader_clock on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	a5303a3cbe	aco: update vgpr_alloc_granule for GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	37988b5b8e	aco: fix max_waves_per_simd on Polaris, VegaM and GFX10.3 fossil-db (Polaris): Totals from 20263 (14.75% of 137414) affected shaders: SGPRs: 871407 -> 871679 (+0.03%); split: -0.00%, +0.03% VGPRs: 513828 -> 550028 (+7.05%); split: -1.68%, +8.72% CodeSize: 18869680 -> 18828148 (-0.22%); split: -0.23%, +0.01% MaxWaves: 162012 -> 162030 (+0.01%); split: +0.01%, -0.00% Instrs: 3629172 -> 3618817 (-0.29%); split: -0.30%, +0.02% Cycles: 15682244 -> 15638244 (-0.28%); split: -0.30%, +0.02% VMEM: 10675942 -> 10673344 (-0.02%); split: +0.18%, -0.21% SMEM: 1209717 -> 1206088 (-0.30%); split: +0.03%, -0.33% VClause: 81780 -> 81227 (-0.68%); split: -0.73%, +0.06% SClause: 231724 -> 231561 (-0.07%); split: -0.07%, +0.00% Copies: 187126 -> 180831 (-3.36%); split: -3.62%, +0.26% Branches: 26841 -> 26837 (-0.01%); split: -0.03%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	c68fba9bba	aco: update bug workarounds for GFX10_3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	4f1242a4d8	aco: don't create v_mad_f32 on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	5718f7c8a7	aco: fix waitcnt insertion on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:32 +01:00
Eric Anholt	d8c2f896db	amd: Swap from nir_opt_shrink_load() to nir_opt_shrink_vectors(). This should do much more trimming than shrink_load, and is a win on i965's vec4 and nir-to-tgsi. For scalar backends like this that don't need ALU shrinking, it still gets more load intrinsics covered. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6050>	2020-08-03 21:26:45 +00:00
Rhys Perry	75a68eee28	aco: optimize swizzled SALU 8/16-bit conversions We only need one s_bfe for a conversion with a swizzled source. shader-db (parallel-rdp, Navi): Totals from 487 (71.30% of 683) affected shaders: SpillSGPRs: 3284 -> 3233 (-1.55%); split: -2.71%, +1.16% SpillVGPRs: 2174 -> 2150 (-1.10%); split: -1.24%, +0.14% CodeSize: 2497864 -> 2445544 (-2.09%); split: -2.11%, +0.01% Instrs: 450613 -> 445104 (-1.22%); split: -1.27%, +0.05% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5259>	2020-07-30 17:34:51 +00:00
Boris Brezillon	bfee35b45c	nir: Stop passing an options arg to nir_lower_int64() This information is exposed through shader->options->lower_int64_options. Removing the extra arg forces drivers to initialize this field correctly. This also allows us to check the int64 lowering options from each int64 lowering helper and decide if we should lower the instructions we introduce. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>	2020-07-30 16:54:24 +00:00

875 commits