fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 17:38:08 +02:00

Author	SHA1	Message	Date
Tony Wasserka	97c97781f6	aco: Fix vector::reserve() being called with the wrong size The container is moved from before and hence returns size 0. To get the correct value, the new instruction container must be used instead. This was flagged by clang-tidy. The fixed call still triggers the corresponding diagnostic, hence this change silences it by adding a redundant clear() after move. Fixes: `7f1b537304` ("aco: add new NOP insertion pass for GFX6-9") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9432>	2021-03-08 10:44:20 +01:00
Rhys Perry	9f8a0b797e	radv: cache pipeline statistics Applications rarely require them, but this improves fossil-db replay time. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Rhys Perry	7c7e8942f8	radv,aco: remove aco_compiler_statistics This removes a pointer from radv_shader_binary_legacy::data. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Pierre-Eric Pelloux-Prayer	7f5a8db96d	ac/rgp: move radv/sqtt functions to ac pso_correlation and code_object_loader don't depend on drivers specific logic so move them to the shared code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	b2ef94943f	ac/rtld: make ac_rtld_upload returns the code size This will be useful to keep a copy of the uploaded code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	e5b1e645e7	ac/rgp: make the max gap between shader code a warning For radeonsi the shaders don't live in the same BOs, so they're unlikely to be less than 0x1000 bytes apart. So this commit bumps the threshold to 0x10000 and warns once when hitting it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Rhys Perry	524848707b	radv: don't set sx_blend_opt_epsilon for V_028C70_COLOR_10_11_11 Matches radeonsi and PAL. From PAL: // 1 is recommended, but doesn't provide sufficient precision Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4394 Fixes: `ed94638156` ("radv: Enable RB+ where possible.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9427>	2021-03-05 11:16:40 +00:00
Samuel Pitoiset	2169c4f763	radv: re-enable TC-compat HTILE for MSAA D32S8 images on GFX9+ Should help MSAA games. Note that it's broken on GFX8 because the tiling doesn't match. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3868 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9284>	2021-03-05 08:44:40 +00:00
Samuel Pitoiset	367a93830b	radv: skip useless FCE when fast-clearing MSAA images with DCC enabled The clear code is 0xCC which means CMASK isn't fast-cleared. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9392>	2021-03-05 08:11:28 +00:00
Samuel Pitoiset	6102507a74	radv: remove useless check about mips+layers for TC-compat HTILE images radv_use_htile_for_image() prevents it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:10:19 +01:00
Samuel Pitoiset	438f65fb1e	radv: cleanup enabling TC-compat HTILE for depth surfaces It makes more sense to try to enable TC-compat if the image has HTILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:09:42 +01:00
Samuel Pitoiset	517600b4d5	Revert "radv: stop using VM_ALWAYS_VALID on APUs" After more testing, disabling VM_ALWAYS_VALID actually hurts more than it helps. Managing the global BO list in userspace is really costly and makes a bunch of games CPU bound. I think re-enabling VM_ALWAYS_VALID is a step in the right direction. This reverts commit `6ac6e2fbfb`. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9341>	2021-03-04 09:37:59 +00:00
Marek Olšák	a0cc0b3a15	ac/llvm: open code fpow on LLVM 12 using fmul.legacy A quick look at the asm shows that this enables source modifiers (neg, abs) for v_mul_legacy_f32. Totals from affected shaders: SGPRS: 110104 -> 110400 (0.27 %) VGPRS: 57632 -> 57636 (0.01 %) Spilled SGPRs: 66 -> 63 (-4.55 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3290412 -> 3283068 (-0.22 %) bytes Max Waves: 32141 -> 32141 (0.00 %) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	18c1c1404d	ac/llvm: add type parameter into ac_build_buffer_load to fix 16-bit TES inputs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	ed351b9a71	ac/llvm: fix visit_load_ubo_buffer to use SMEM for 16 bits instead of VMEM This has 3 advantages: - It's SMEM. - Multiple single component loads are merged into 1 multi-dword load by LLVM. - The result is always packed for packed instructions. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	46ce67a331	ac/llvm: implement 16-bit and 64-bit fpow correctly LLVM converts to 32 bits and back for llvm.pow, so we can't use it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	3475c79328	ac/llvm: add support for 16-bit source operands for samplers Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Samuel Pitoiset	578fc7dbbc	radv: fix RGP barrier layout transition for TC-compatible CMASK images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9390>	2021-03-03 16:49:29 +00:00
Rhys Perry	21697082ec	radv: don't shrink image stores for The Surge 2 The game seems to declare the wrong format. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e4d75c22` ("nir/opt_shrink_vectors: shrink image stores using the format") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4347 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>	2021-03-03 14:18:37 +00:00
Rhys Perry	cbb5ed476c	nir/opt_shrink_vectors: add option to skip shrinking image stores Some games declare the wrong format, so we might want to disable this optimization in that case. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e4d75c22` ("nir/opt_shrink_vectors: shrink image stores using the format") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>	2021-03-03 14:18:37 +00:00
Samuel Pitoiset	b33792b794	radv: bump the initial SQTT buffer size to 32MB per SE Most of the games need 32MB or more, but rarely less. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	6813b52290	radv: trigger a new SQTT capture automatically after resizing the buffer It's way better. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	0a1e3cc1cb	radv: double the SQTT buffer size when it is resized Computing the expected buffer size isn't reliable on GFX10+ because DROPPED_CNTR returns weird results. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	c0608bb083	ac/sqtt: fix determining if the trace is complete on GFX10+ DROPPED_CNTR isn't reliable and might still report non-zero if the SQTT buffer isn't full. Checking if the number of written bytes by the hw is equal to the SQTT buffer size seems reliable. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	f4c4c0f207	radv: do not trace inactive shader engines with SQTT This fixes a GPU hang on my Sienna because the number of SE is less than the maximum, and SE #1 is disabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9370>	2021-03-03 08:16:42 +01:00
Marek Olšák	f9e6c7a220	ac/llvm: fix ac_build_atomic_rmw with LLVM 13 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4383 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9361>	2021-03-03 01:19:24 +00:00
Rhys Perry	3a72044ece	aco: add missing usable_read2 check A Hitman 2 shader does: read64(local_invocation_index() * 4 - 4). This was likely emitting a ds_read2_b32 on GFX6. For local_invocation_index()=0, because the first dword was out-of-bounds, the second was likely also considered out-of-bounds (even though it's not, at offset 0). Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/3882 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `57e6886f98` ("aco: refactor load_lds to use new helpers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Rhys Perry	941739619e	Revert "radv,aco: allow unaligned LDS access on GFX9+" This reverts commit `1a0b0e8460`. The bounds checking behaviour of ds_read_b64, ds_read_b96 and ds_read_b128 make this feature very difficult to use safely. This fixes a blocking artifact in Hitman 2. Previously, it contained: ds_read_b64(local_invocation_index() * 4 - 4) For local_invocation_index()=0, the second dword would be considered out-of-bounds, even though it's at offset 0. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Samuel Pitoiset	97925cee8d	radv: remove useless decompression of the DS resolve attachment The DS resolve attachment is the destination attachment, it doesn't need to be decompressed before resolving the depth/stencil attachment. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9256>	2021-03-01 18:10:35 +00:00
Bas Nieuwenhuizen	ff99faf0cf	radv: Add nodisplaydcc option. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9318>	2021-03-01 14:42:41 +00:00
Bas Nieuwenhuizen	3c9452c3ae	radv: Add sam option. So that people without large BAR can try this out. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9318>	2021-03-01 14:42:41 +00:00
Bas Nieuwenhuizen	0360ecac73	radv: Enable linear sampling for depth textures. Turns out there are CTS tests. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4258 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9319>	2021-03-01 13:14:09 +00:00
Samuel Pitoiset	56bff270fe	radeonsi,radv: do not overallocate the SQTT buffer size The number of shader engines isn't always 4. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9307>	2021-03-01 13:13:36 +01:00
Samuel Pitoiset	24f015eddc	Revert "radv: do not overallocate the SQTT buffer" This fixes computing the thread trace data offset. This reverts commit `c7e6f4ff3d`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9307>	2021-03-01 13:09:15 +01:00
Samuel Pitoiset	6b53f7f969	radv: exclude perf counters for SQTT also on GFX10.3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9300>	2021-03-01 11:20:19 +00:00
Samuel Pitoiset	859dbf953d	radv: fix exporting SQTT pipelines with LLVM Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9300>	2021-03-01 11:20:19 +00:00
Samuel Pitoiset	d26bcc0f5c	radv: always select the first active CU when profiling with SQTT This probably fixes instruction tracing on many chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9300>	2021-03-01 11:20:19 +00:00
Samuel Pitoiset	b511bf262d	radv: remove duplicate REG_INCLUDE_CONTEXT setting for SQTT It was set twice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9300>	2021-03-01 11:20:19 +00:00
Bas Nieuwenhuizen	f67259d83b	radv: Expose robustBufferAccessUpdateAfterBind correctly. We do support it. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4351 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9281>	2021-02-26 23:53:52 +00:00
Bas Nieuwenhuizen	5acc115bd8	ac/rgp: Only report double the prims per clock on GFX10. Misinterpreted review comment. Fixes: `4ded99f99d` ("ac/rgp: report the number of primitives per clock") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9312>	2021-02-27 00:21:00 +01:00
Rob Clark	a9618e7c42	util: Add accessor for util_cpu_caps In release builds, there should be no change, but in debug builds the assert will help us catch undefined behavior resulting from using util_cpu_caps before it is initialized. With fix for u_half_test for MSVC from Jesse Natalie squashed in. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>	2021-02-26 18:31:19 +00:00
Samuel Pitoiset	4ded99f99d	ac/rgp: report the number of primitives per clock Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9303>	2021-02-26 18:05:47 +01:00
Samuel Pitoiset	435bff34e3	ac/rgp: report the number of memory operations per clock So that RGP reports the memory type and the memory throughput. Based on AMDVLK. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9303>	2021-02-26 18:05:45 +01:00
Samuel Pitoiset	c2271f66ea	ac/rgp: report LDS size in CU mode on GFX10+ RGP expects that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9303>	2021-02-26 18:05:43 +01:00
Samuel Pitoiset	ceded1d0a2	ac/rgp: recognize more memory types Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9303>	2021-02-26 18:05:42 +01:00
Rhys Perry	c3af0c2079	aco: use p_as_uniform for get_sampler_desc and convert_pointer_to_64_bit Since value-numbering no longer works across loops, we no longer need to use v_readfirstlane_b32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>	2021-02-26 13:33:56 +00:00
Rhys Perry	5f1b354472	aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM We should avoid a situation where a v_readfirstlane_b32 is in WQM but its source is calculated in Exact. Fixes hang when running Assassin's Creed: Valhalla benchmark. fossil-db (GFX10.3): Totals from 1021 (0.70% of 146267) affected shaders: CodeSize: 7835228 -> 7842992 (+0.10%); split: -0.00%, +0.10% Instrs: 1519208 -> 1521149 (+0.13%); split: -0.00%, +0.13% SClause: 78921 -> 78920 (-0.00%) Copies: 44456 -> 45421 (+2.17%); split: -0.05%, +2.22% Branches: 12987 -> 13933 (+7.28%) PreSGPRs: 47599 -> 47813 (+0.45%) Cycles: 10037540 -> 10045304 (+0.08%); split: -0.00%, +0.08% VMEM: 538381 -> 538777 (+0.07%); split: +0.11%, -0.03% SMEM: 84553 -> 84554 (+0.00%); split: +0.01%, -0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>	2021-02-26 13:33:56 +00:00
Daniel Schürmann	690ac7409a	aco/value_numbering: use can_eliminate() function to avoid unnecessary hashmap lookups No fossil-db changes. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9195>	2021-02-25 11:35:42 +01:00
Daniel Schürmann	fbf791e70c	aco: value number VOPC instructions with different exec masks This becomes possible as long as we do val = s_and_b32/64 exec, val before any subgroup operations. This precautional instruction can be removed by the optimizer if 'val' was computed by a VOPC instruction using the same exec mask. Totals from 59 (0.04% of 146267) affected shaders (Navi10): VGPRs: 2808 -> 2816 (+0.28%) CodeSize: 340888 -> 340852 (-0.01%); split: -0.20%, +0.19% Instrs: 61733 -> 61625 (-0.17%); split: -0.18%, +0.01% Cycles: 470636 -> 469112 (-0.32%); split: -0.33%, +0.01% VMEM: 8091 -> 7993 (-1.21%) SMEM: 2736 -> 2719 (-0.62%); split: +0.29%, -0.91% VClause: 1745 -> 1741 (-0.23%) SClause: 2394 -> 2392 (-0.08%); split: -0.25%, +0.17% Copies: 3249 -> 3253 (+0.12%); split: -0.62%, +0.74% Branches: 1210 -> 1206 (-0.33%) PreSGPRs: 3126 -> 3176 (+1.60%); split: -0.16%, +1.76% Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9195>	2021-02-25 11:35:42 +01:00
Samuel Pitoiset	8a47422d97	radv: do not scale the depth bias for D16_UNORM depth surfaces Scaling the depth bias doesn't seem correct with Vulkan. This is probably the root cause of the shadow artifacts differences between RADV and AMDVLK/AMDGPU-PRO. Fix dEQP-VK.rasterization.depth_bias.d16_unorm. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2217 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9249>	2021-02-25 08:17:27 +01:00
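
Note on 97c97781f6 (aco: Fix vector::reserve() being called with the wrong size): the pattern described in that commit message can be sketched in isolation as below. This is a minimal illustration, not the ACO code. The `Block` struct and the `rebuild` function are hypothetical names invented here; only the ordering of `std::move`, `clear()` and `reserve()` reflects what the message describes.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block of instructions; not ACO's real type.
struct Block {
    std::vector<int> instructions;
};

// Rebuilds block.instructions from its previous contents and returns the
// capacity that was reserved for the rebuilt container.
std::size_t rebuild(Block& block) {
    // The elements now live in old_instructions.
    std::vector<int> old_instructions = std::move(block.instructions);

    // A moved-from vector is only guaranteed to be in a valid but unspecified
    // state. The clear() is redundant in practice, but it makes the empty
    // state explicit and quiets use-after-move diagnostics.
    block.instructions.clear();

    // Size the reservation from the container that owns the elements.
    // Asking the moved-from container for its size would typically yield 0.
    block.instructions.reserve(old_instructions.size());
    const std::size_t reserved = block.instructions.capacity();

    // Stand-in for the real pass: copy every element back unchanged.
    for (int instr : old_instructions) {
        block.instructions.push_back(instr);
    }
    return reserved;
}
```

Calling `rebuild` on a block holding three elements reserves room for at least three before anything is pushed back, so the loop does not need to reallocate.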


6878 commits
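
Note on the SQTT buffer commits (b33792b794, 6813b52290, 0a1e3cc1cb, c0608bb083): taken together, their messages describe a retry policy. Start at 32MB per shader engine, treat a trace as incomplete when the hardware wrote exactly as many bytes as the buffer holds, double the buffer, and capture again. The sketch below illustrates that policy only. `trace_is_complete`, `capture_until_complete` and the `capture` callback are hypothetical names, and the attempt limit is an assumption that does not come from the commits.

```cpp
#include <cstdint>

// Initial buffer size per shader engine, as stated in b33792b794.
constexpr std::uint64_t kInitialBytesPerSE = 32ull * 1024 * 1024;

// A trace is treated as complete when the hardware wrote fewer bytes than
// the buffer can hold; a full buffer means data may have been dropped.
bool trace_is_complete(std::uint64_t bytes_written, std::uint64_t buffer_size) {
    return bytes_written < buffer_size;
}

// Runs `capture(buffer_size)` (a hypothetical callback returning the number
// of bytes written) until the trace is complete, doubling the buffer after
// each incomplete attempt. Returns the buffer size that succeeded, or 0 if
// max_attempts was exhausted (the limit is an assumption of this sketch).
template <typename CaptureFn>
std::uint64_t capture_until_complete(CaptureFn capture,
                                     std::uint64_t buffer_size,
                                     int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        const std::uint64_t written = capture(buffer_size);
        if (trace_is_complete(written, buffer_size)) {
            return buffer_size;
        }
        buffer_size *= 2;  // double rather than estimate the needed size
    }
    return 0;
}
```

Doubling avoids relying on an estimate of the required size, which the commit messages say is unreliable on GFX10+ because of DROPPED_CNTR.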