fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 19:18:11 +02:00

Author	SHA1	Message	Date
Bas Nieuwenhuizen	d51a4b4c4b	radv: Add initial CPU BVH building. The algorithm used for the BVH: 1) first create 1 leaf per primitive (triangle/aabb/instance) 2) Then create internal layers from the bottom up until we are left with 1 node in the top layer. Node i in the layer will have children (i*4+0) ... (i*4+3) in the previous layer. This results in a very naive algorithm but it is also very simple to implement. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11078>	2021-06-18 22:16:27 +00:00
Bas Nieuwenhuizen	67e949a8f8	radv: Use the global BO list for acceleration structures. We have nested structures so tracking this from the descriptor set is going to be a mess. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11078>	2021-06-18 22:16:27 +00:00
Samuel Pitoiset	977355c6e5	radv: fix dynamic culling and depth/stencil related dynamic states To avoid overwriting previous dynamic state with default state from the pipeline. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4926 Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11375>	2021-06-18 16:27:57 +00:00
Mike Blumenkrantz	651c6b16ff	radv: move pipe_misaligned and l2_coherent image checks to flags set on init this should save 4-5% cpu in some cases Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11462>	2021-06-18 16:02:26 +00:00
Samuel Pitoiset	60348360a2	radv: create only one pipeline for decompressing depth/stencil images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11263>	2021-06-18 14:15:30 +02:00
Samuel Pitoiset	213c4c5f44	radv: always decompress both aspects of a depth/stencil image If compressed rendering is only used for the depth aspect of a depth/stencil image, stencil might also be compressed and it needs to be decompressed. This only happens for non-TC compatible images. As long as the driver needs to decompress the depth aspect, I don't think that decompressing the stencil aspect introduces extra cost. Fixes dEQP-VK.renderpasslate_fragment_tests.d32_sfloat_s8_uint for chips that don't support TC-compat HTILE. Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11263>	2021-06-18 14:15:30 +02:00
Samuel Pitoiset	50233d0daa	radv: reject binding buffer/image when the device memory is too small From the Vulkan spec 1.2.181: "The difference of the size of memory and memoryOffset must be greater than or equal to the size member of the VkMemoryRequirements structure returned from a call to vkGetImageMemoryRequirements with the same image" This is invalid usage but adding a check in the driver is safe and might avoid spurious failures. This is a workaround for the inventory GPU hang with Cyberpunk 2077 which is actually a game bug. Luckily the game handles this error gracefully. Since the addrlib change from March, addrlib now selects a better swizzle mode (4KB instead of 64KB) which reduces image size. Though, the game assumes that an image with 2 mips is always smaller than the same image but with 6 mips. This is not always true if the swizzle mode is different. Then, it creates a D3D12 heap that is too small for the 2 mips image and the GPU hangs with a memory violation, ugh... Note that next vkd3d-proton release should also reject this but fixing both sides is fine. Cc: 21.1 mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4823 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4593 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11448>	2021-06-18 08:04:29 +00:00
Yiwei Zhang	ec1968dcc9	radv: fix build errors after commit `8b7ff784` Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11373>	2021-06-16 19:55:48 +00:00
Daniel Stone	a8c1155209	ci/bare-metal: Set CPU and GPU governors to max, disable GPU runtime PM Give us a bit more predictable performance by making sure we always run at full tilt. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Martin Peres <martin.peres@mupuf.org> Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>	2021-06-15 14:02:44 +02:00
Daniel Stone	0fcb53e8f4	ci/lava: Use HWCI_KERNEL_MODULES to load modules One fewer difference to bare-metal. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Martin Peres <martin.peres@mupuf.org> Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>	2021-06-15 14:02:44 +02:00
Rhys Perry	bc1c527834	aco/lower_phis: don't allocate unused temporary ids The excessive number of temporary IDs caused #4872's live-out sets to be extremely large and expensive to iterate. With this change, #4872's shader is much faster to compile and uses much less memory. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4872 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11300>	2021-06-14 16:48:38 +00:00
Rhys Perry	ecc0353af7	aco/lower_phis: fix undef_operands initialization with >32 predecessors Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11300>	2021-06-14 16:48:38 +00:00
Samuel Pitoiset	16d5939ff5	radv: fix dynamic rasterizer discard enable state If a pipeline enables rasterizerDiscardEnable statically we have to properly initialize the value, otherwise it won't be updated when a new pipeline is bound. Fixes few dEQP-VK.pipeline.extended_dynamic_state.*disable_raster. Fixes: `dd19bf9d7d` ("radv: implement dynamic rasterizer discard enable") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11242>	2021-06-14 16:31:14 +00:00
Rhys Perry	d64f5a3f9d	aco: move VMEM instructions below descriptor loads This is to prevent sequences like: a = descriptor_load() vmem(a) b = descriptor_load() vmem(b) and instead create: a = descriptor_load() b = descriptor_load() vmem(a) vmem(b) fossil-db (GFX10.3): Totals from 114521 (78.30% of 146267) affected shaders: VGPRs: 4540352 -> 4540216 (-0.00%); split: -0.03%, +0.02% CodeSize: 289864228 -> 289114652 (-0.26%); split: -0.29%, +0.03% MaxWaves: 2940234 -> 2940338 (+0.00%); split: +0.00%, -0.00% Instrs: 55112418 -> 54919910 (-0.35%); split: -0.38%, +0.03% Latency: 956528393 -> 954682011 (-0.19%); split: -0.24%, +0.05% InvThroughput: 229280830 -> 229238107 (-0.02%); split: -0.04%, +0.02% VClause: 1141832 -> 1139002 (-0.25%); split: -0.63%, +0.38% SClause: 2357840 -> 2225008 (-5.63%); split: -6.01%, +0.38% Copies: 3316040 -> 3331519 (+0.47%); split: -0.31%, +0.77% Branches: 1187212 -> 1186919 (-0.02%); split: -0.03%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489>	2021-06-14 15:47:37 +00:00
Rhys Perry	bc71222cd9	aco: don't move descriptor loads below buffer loads fossil-db (GFX10.3): Totals from 52870 (36.15% of 146267) affected shaders: VGPRs: 2109936 -> 2110056 (+0.01%); split: -0.01%, +0.01% CodeSize: 134898056 -> 134812748 (-0.06%); split: -0.08%, +0.02% MaxWaves: 1347354 -> 1347346 (-0.00%) Instrs: 25598063 -> 25575415 (-0.09%); split: -0.11%, +0.02% Latency: 432491613 -> 432047723 (-0.10%); split: -0.12%, +0.02% InvThroughput: 90940977 -> 90927545 (-0.01%); split: -0.03%, +0.01% VClause: 570039 -> 570019 (-0.00%); split: -0.05%, +0.04% SClause: 1145076 -> 1139040 (-0.53%); split: -0.60%, +0.07% Copies: 1513949 -> 1513102 (-0.06%); split: -0.32%, +0.26% Branches: 524279 -> 524275 (-0.00%); split: -0.03%, +0.03% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489>	2021-06-14 15:47:37 +00:00
Rhys Perry	f8bf6b9e0a	aco/ra: use adjust_max_used_regs() in compact_relocate_vars() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489>	2021-06-14 15:47:37 +00:00
Samuel Pitoiset	44e7057304	radv/winsys: remove useless errno.h includes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11269>	2021-06-14 15:52:48 +02:00
Samuel Pitoiset	ec7f7a7e33	radv/winsys: adjust some error messages Report the return code from libdrm instead of errno. While we are at it, fix the function name in radv_amdgpu_wait_timeline_syncobj(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11269>	2021-06-14 15:52:45 +02:00
Bas Nieuwenhuizen	720ee494e5	radv: Allow DCC images to be compressed with foreign queues. Otherwise we would always decompress when transitioning to the foreign queue. Fixes: `8b9033ad0a` ("radv: Support DCC modifiers fully.") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10802>	2021-06-14 11:20:59 +00:00
Bas Nieuwenhuizen	f44a6c6a54	radv: Actually return correct value for read-only DCC compressedness. Most stuff that depends on the value wouldn't be triggered anyway but ... Fixes: `b5ecf0748a` ("radv: Ensure we never decompress or FCE read-only textures.") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10802>	2021-06-14 11:20:59 +00:00
Bas Nieuwenhuizen	f7c622307d	radv: Don't skip barriers that only change queues. We depend on the queue mask for some decisions ... CC: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10802>	2021-06-14 11:20:59 +00:00
Rhys Perry	1d50ef9ca6	aco: adjust the condition for expanding vertex fetch data format Instead of avoiding out-of-bounds access, avoid creating a load larger than the original attribute. This should work just as well, since the only situations expanding a load helped was because we shrunk it first. Also fixes a bug where a 3 component load (4 components with the first component skipped) would be incorrectly expanded to 4 components because the stride check would never be performed. Maybe we should avoid skipping the first component in some situations, but I'm not sure if it's worth the VGPR cost. fossil-db (vega10): Totals from 583 (0.39% of 149974) affected shaders: CodeSize: 1496848 -> 1500868 (+0.27%); split: -0.03%, +0.30% Instrs: 286155 -> 286575 (+0.15%); split: -0.07%, +0.22% Latency: 2947101 -> 2946865 (-0.01%); split: -0.23%, +0.22% InvThroughput: 797396 -> 797127 (-0.03%); split: -0.08%, +0.04% fossil-db (polaris10): Totals from 583 (0.39% of 151365) affected shaders: SGPRs: 38880 -> 39216 (+0.86%) VGPRs: 24440 -> 24356 (-0.34%) CodeSize: 1506808 -> 1510876 (+0.27%); split: -0.01%, +0.28% Instrs: 288735 -> 289167 (+0.15%); split: -0.06%, +0.21% Latency: 2963263 -> 2961884 (-0.05%); split: -0.24%, +0.19% InvThroughput: 802351 -> 801665 (-0.09%); split: -0.12%, +0.04% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007>	2021-06-14 09:48:32 +00:00
Rhys Perry	91f8f82806	radv,aco: use all attributes in a binding to obtain an alignment for fetch Instead of assuming scalar alignment for an attribute, we can use the required alignment of other attributes in a binding to expect a higher one. This uses the alignment of all attributes in the pipeline, not just the ones loaded. This can create slightly better code, but could break pipelines which relied on unused (and unaligned) attributes not being loaded. I don't think such pipelines are allowed by the spec. fossil-db (Sienna Cichlid): Totals from 44350 (30.32% of 146267) affected shaders: VGPRs: 1694464 -> 1700616 (+0.36%); split: -0.08%, +0.44% CodeSize: 60207184 -> 58093836 (-3.51%); split: -3.51%, +0.00% MaxWaves: 1175998 -> 1174948 (-0.09%); split: +0.02%, -0.11% Instrs: 11763444 -> 11458952 (-2.59%); split: -2.60%, +0.01% Latency: 70679612 -> 67062215 (-5.12%); split: -5.27%, +0.15% InvThroughput: 11482495 -> 11362911 (-1.04%); split: -1.20%, +0.16% VClause: 359459 -> 343248 (-4.51%); split: -6.36%, +1.85% SClause: 422404 -> 419229 (-0.75%); split: -1.17%, +0.42% Copies: 754384 -> 764368 (+1.32%); split: -1.74%, +3.06% Branches: 197472 -> 197474 (+0.00%); split: -0.03%, +0.03% PreVGPRs: 1215348 -> 1215503 (+0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007>	2021-06-14 09:48:32 +00:00
Daniel Stone	d0e5203855	ci/lava: Use per-job rootfs overlay for environment Trying to get arbitrary strings suitably quoted for shell, embedded in a YAML file, processed by Python templating, is like seven bad ideas all embedded into one big can of bees. Reuse the same script we use for bare-metal to generate the environment, tar that up into a per-job overlay which is added to the inter-pipeline-reusable rootfs built by the container jobs and the intra-pipeline-reusable overlay built by the build jobs. @anholt wrote a chunk of this - replacing the $ENV_VARS GitLab CI variable with a Python loop across the POSIX job environment - in !11192, but this still had YAML quoting nightmares, and was more needless duplication between LAVA and bare-metal. The diff is large and annoying, but is mostly a sed job to get ENV_VARS="FOO=bar BAZ=quux" into FOO: bar\nBAZ: quux. Signed-off-by: Daniel Stone <daniels@collabora.com> Co-authored-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>	2021-06-11 12:13:00 +00:00
Daniel Schürmann	bb1c06343d	aco/ra: refactor register assignment for vector operands No functional changes. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764>	2021-06-11 12:35:46 +02:00
Daniel Schürmann	09b99f1b7c	aco/ra: refactor affinity coalescing Also adds v_interp_p2_f32 to the list of affinity-related instructions. Totals from 68 (0.05% of 149839) affected shaders (GFX10.3): CodeSize: 792928 -> 792056 (-0.11%) Instrs: 152843 -> 152625 (-0.14%) Latency: 1235353 -> 1235278 (-0.01%) InvThroughput: 224087 -> 224049 (-0.02%) Copies: 9218 -> 9000 (-2.36%) Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764>	2021-06-11 12:35:31 +02:00
Daniel Schürmann	3a98f484d1	aco/ra: only create phi-affinities for killed operands If a phi-operand is not killed, it must be copied anyway. The additional affinity would only overwrite any potential better affinity that was already created Totals from 1067 (0.71% of 149839) affected shaders (GFX10.3): VGPRs: 68072 -> 68064 (-0.01%) CodeSize: 8252588 -> 8245220 (-0.09%); split: -0.12%, +0.03% Instrs: 1596146 -> 1593941 (-0.14%); split: -0.16%, +0.02% Latency: 18828176 -> 18823914 (-0.02%); split: -0.08%, +0.06% InvThroughput: 3575063 -> 3574787 (-0.01%); split: -0.05%, +0.04% VClause: 24345 -> 24325 (-0.08%); split: -0.16%, +0.07% Copies: 88712 -> 87398 (-1.48%); split: -1.77%, +0.29% Branches: 52067 -> 51364 (-1.35%); split: -1.38%, +0.03% Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764>	2021-06-11 12:35:12 +02:00
Georg Lehmann	d3f735a249	ac: Enable 32bit predication on gfx9 with fw feature version 52. Amdvlk does this as well and it passes the vulkan CTS on renoir. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11297>	2021-06-11 06:07:10 +00:00
Georg Lehmann	fc437ef944	ac: Enable 32bit predication on gfx10. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11297>	2021-06-11 06:07:10 +00:00
Georg Lehmann	a41ba20cbd	ac: Check me_fw_feature for 32bit predication on gfx10.3 Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11297>	2021-06-11 06:07:10 +00:00
Samuel Pitoiset	4026a07e74	radv: fix aligning the image offset by using align64() This doesn't fix anything known. Found by inspection. Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11302>	2021-06-11 07:35:32 +02:00
Rhys Perry	6204e17b44	radv: increase maxComputeSharedMemorySize Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11262>	2021-06-10 12:55:53 +00:00
Rhys Perry	9162963f0a	aco: fix emit_mbcnt() with a VGPR mask Found by inspection. Should be possible with nir_intrinsic_mbcnt_amd. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11295>	2021-06-10 11:21:47 +00:00
Timur Kristóf	18337fbcf2	aco: Use as_vgpr for the second source of mbcnt_amd. Fixes: `1e49018ced` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11292>	2021-06-10 10:13:02 +00:00
Samuel Pitoiset	3a643a9ce1	ci: add expected list of failures for Bonaire (RADV) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11239>	2021-06-10 09:36:55 +00:00
Samuel Pitoiset	cfe7e81214	radv/winsys: add a small comment explaining the CHAIN bit Without it the hardware launches an IB2 which might hang in some rare situations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11214>	2021-06-10 08:31:11 +02:00
Samuel Pitoiset	a234840e60	radv: do not launch an IB2 for secondary cmdbuf with INDIRECT_MULTI on GFX7 It's illegal to emit DRAW_{INDEX}_INDIRECT_MULTI from an IB2 on GFX7. PAL applies this workaround for indirect dispatches and also on GFX8-9 but it doesn't seem needed. This fixes various GPU hangs on Bonaire (GFX7). Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11214>	2021-06-10 08:31:08 +02:00
Timur Kristóf	1e49018ced	amd: Add extra source to the mbcnt_amd NIR intrinsic. The v_mbcnt instructions can take an extra source that they add to the result. This is not exposed in SPIR-V but we now expose it in NIR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	f6b2db298f	ac/nir: Refactor and optimize the repacking sequence. According to feedback, the terminology with "exclusive scan" and "reduction" is difficult. Change it to use "repack" instead, which better fits what this sequence is actually used for. The new sequence stores only 1 byte / wave to LDS, and uses packed instructions to produce the results. This has lower latency and fewer instructions than what we previously had. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	b4e22eb482	aco: Keep VGPR destinations for uniform shared loads when beneficial. When the result of these loads is only used by cross-lane instructions, it is beneficial to use a VGPR destination. This is because this allows to put the s_waitcnt further down, which decreases latency. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	ce141e4c5f	aco: Implement byte and lane permute intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	5713e059ea	aco: Add validation for v_permlane instructions. Previously there hasn't been any validation for these instructions, but after shooting myself in the leg with it a few times, I decided to add the validation now. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	fd6605367d	aco: Implement nir_op_sad_u8x4. Fix up the operand size for v_sad instructions, and implement the new NIR horizontal add. There is no viable way to do this in SALU, so let's always use a VGPR destination. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	228169c87c	aco: Add note about v_alignbyte in the ISA README. We tried to use this instruction for a more optimal sequence, but it turned out that it doesn't exactly work as it was supposed to. This note is to help others who want to use it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Rhys Perry	c129ede523	aco: use ds_read_{u8,u16}_d16 This allows partial writes and writes to the upper half of the destination. fossil-db (Sienna Cichlid): Totals from 135 (0.09% of 149839) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>	2021-06-09 12:06:50 +00:00
Rhys Perry	6334d73fc9	aco: don't ever widen 8/16-bit sgpr load_shared Doesn't seem to create incorrect code, but it is suboptimal. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>	2021-06-09 12:06:50 +00:00
Rhys Perry	d2b9c7e982	radv: improve LDS alignment check for load/store vectorization Previously, this could vectorize two scalar 16-bit loads into a u8vec4 load. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>	2021-06-09 12:06:50 +00:00
Rhys Perry	4870d7d829	aco: use v1b/v2b for ds_read_u8/ds_read_u16 The p_extract_vector isn't necessary. For ds_read_u8 and ds_read_u16, we used a 32-bit regclass, but didn't load 32 bits, and used dst_hint for vector loads when we shouldn't have. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4863 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>	2021-06-09 12:06:50 +00:00
Samuel Pitoiset	2fb436e92a	ci: update list of expected failures for Pitcairn/Oland (RADV) The robustness2 failures were a mistake because they are actually not supported (no VK_EXT_scalar_block_layout on GFX6). The sparse related failures are no longer supported since sparse is only enabled for Polaris10+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11243>	2021-06-09 11:27:44 +00:00
Samuel Pitoiset	d169dad393	aco: fix emitting literal offsets with SMEM on GFX7 When the offset is negative, reg() isn't 255. Fix this by splitting SGPR and literal emission. While we are at it, adjust a comment saying that literals are also accepted on GFX6 which is wrong. Fixes another batch of robustness tests. Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11247>	2021-06-09 11:10:38 +00:00
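The bottom-up layering described in commit d51a4b4c4b (one leaf per primitive, then internal layers in which node i covers four consecutive nodes of the layer below) can be sketched in C roughly as follows. This is only an illustration of the indexing scheme, not the radv implementation: the function names `child_range` and `count_internal_nodes` are hypothetical, and the sketch assumes that even a single leaf receives one internal root node.

```c
#include <assert.h>
#include <stdint.h>

#define BVH_WIDTH 4u /* children per internal node, as in the commit message */

/* Hypothetical helper: for internal node `i` of a layer built on top of a
 * layer with `prev_count` nodes, report the index of its first child and how
 * many children it has. The last node of a layer may have fewer than four. */
static void
child_range(uint32_t i, uint32_t prev_count, uint32_t *first, uint32_t *count)
{
   assert(i * BVH_WIDTH < prev_count); /* node must cover at least one child */

   *first = i * BVH_WIDTH;
   uint32_t remaining = prev_count - *first;
   *count = remaining < BVH_WIDTH ? remaining : BVH_WIDTH;
}

/* Hypothetical helper: total number of internal nodes created when layers
 * are stacked until a single node remains in the top layer.
 * Assumption: a lone leaf still gets one internal root above it. */
static uint32_t
count_internal_nodes(uint32_t leaf_count)
{
   uint32_t total = 0;
   uint32_t layer = leaf_count;

   do {
      layer = (layer + BVH_WIDTH - 1) / BVH_WIDTH; /* round up */
      total += layer;
   } while (layer > 1);

   return total;
}
```

For 10 leaves this gives layers of 3 nodes and then 1 node, so `count_internal_nodes(10)` is 4, and node 2 of the first internal layer covers only leaves 8 and 9.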

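The size rule quoted from the Vulkan spec in commit 50233d0daa (memory size minus memoryOffset must be at least the required size) can be expressed as a small check. The sketch below is not the driver's code: `binding_fits` and its parameters are hypothetical names, and it additionally assumes that an offset past the end of the allocation should be rejected so the unsigned subtraction cannot wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when a resource that needs
 * `required_size` bytes fits into an allocation of `memory_size` bytes
 * when bound at `memory_offset`. */
static bool
binding_fits(uint64_t memory_size, uint64_t memory_offset, uint64_t required_size)
{
   /* Assumption: an offset beyond the allocation never fits. Checking this
    * first keeps the subtraction below from underflowing. */
   if (memory_offset > memory_size)
      return false;

   return memory_size - memory_offset >= required_size;
}
```

A caller would compare against the `size` member of `VkMemoryRequirements` and fail the bind when the function returns false; for example, `binding_fits(4096, 1024, 4096)` is false because only 3072 bytes remain after the offset.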

7470 commits