fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Daniel Schürmann	40a93e271c	aco: clang-format No changes, just formatting. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13087>	2021-09-28 19:48:00 +00:00
Timur Kristóf	9478901824	aco: Implement integer conversions using p_extract. Fossil DB stats on Sienna Cichlid: Totals from 563 (0.44% of 128647) affected shaders: SpillSGPRs: 1381 -> 1382 (+0.07%) SpillVGPRs: 1606 -> 1552 (-3.36%) CodeSize: 2474724 -> 2446612 (-1.14%); split: -1.15%, +0.02% Scratch: 181248 -> 180224 (-0.56%) Instrs: 440973 -> 435091 (-1.33%); split: -1.35%, +0.01% Latency: 9123609 -> 8517830 (-6.64%); split: -6.66%, +0.02% InvThroughput: 3685256 -> 3383293 (-8.19%); split: -8.22%, +0.02% VClause: 8425 -> 8372 (-0.63%) Copies: 66553 -> 66681 (+0.19%); split: -0.49%, +0.68% Branches: 13824 -> 13825 (+0.01%); split: -0.01%, +0.01% PreSGPRs: 21816 -> 21824 (+0.04%) Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 58802 (45.71% of 128647) affected shaders: SpillSGPRs: 6541 -> 6542 (+0.02%) SpillVGPRs: 1606 -> 1552 (-3.36%) CodeSize: 162976608 -> 162244340 (-0.45%); split: -0.45%, +0.00% Scratch: 181248 -> 180224 (-0.56%) Instrs: 31163521 -> 31098078 (-0.21%); split: -0.21%, +0.00% Latency: 146893569 -> 144920070 (-1.34%); split: -1.34%, +0.00% InvThroughput: 25384324 -> 25035940 (-1.37%); split: -1.38%, +0.00% VClause: 552310 -> 552257 (-0.01%) Copies: 3356856 -> 3356984 (+0.00%); split: -0.01%, +0.01% Branches: 1237314 -> 1237315 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 2185339 -> 2185347 (+0.00%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11560>	2021-09-28 17:59:27 +00:00
Samuel Pitoiset	deede6b03d	radv: pass the pipeline key to the backend compilers It exactly matches the shader keys now. Everything was copied from the pipeline key to the shader keys. There is still some work to completely remove radv_shader_variant_key. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13032>	2021-09-27 11:57:25 +02:00
Bas Nieuwenhuizen	8ca54b4d38	radv: Support nir_intrinsic_load_global_constant. SPIR-V parsing can result in some direct constant usage for shader records. Lower this early to a global based intrinsic so that it doesn't interfere with the later 32-bit offset based constants for scratch usage. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	c299968988	aco: Add support for ray launch size. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	817553c052	aco: Implement call scope. Since we do no repacking yet, just use invocation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	b6be96a2bd	radv: Modify load_sbt_amd intrinsic to get the descriptor. That way we can get the address to the entry, which is needed for some nir builtins because extra data in the entry can be used as shader input. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Timur Kristóf	966cff9cfa	aco/isel: Fix emit_vop2_instruction to apply 16/24-bit flags properly. Previously it used a builder function but didn't use the return value from that function, so the flags were not applied. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>	2021-09-20 12:39:03 +02:00
Rhys Perry	2a7fa132be	aco: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:28 +00:00
Rhys Perry	e0d232c2fc	aco: implement nir_op_pack_32_4x8 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:28 +00:00
Rhys Perry	4dd420f76d	radv,aco: implement iadd_sat Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:28 +00:00
Daniel Schürmann	0988f7b9ba	aco: remove explicit dst_preserve flag Instead, we can rely on the fact that subdword definitions must preserve the unused bits while dword definitions either pad or sign-extend. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>	2021-09-02 20:39:17 +02:00
Daniel Schürmann	9e3ff06c38	aco: rewrite SDWA selector This commit introduces a new struct SubdwordSel in order to ease and clean up the usage of SDWA selections. This includes removing the distinction between register-allocated and fixed SDWA selections. Instead, SDWA selections can now also access the high bits of subdword variables. Alignment and sizes are validated accordingly. Size, offset and sign_extend can be evaluated via helper methods. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>	2021-09-02 20:39:17 +02:00
Rhys Perry	9df9fe7dfa	aco: include utility in isel For std::exchange(). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `c1d11bb92c` ("aco: Add loop creation helpers.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5301 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12614>	2021-08-30 14:28:00 +00:00
Timur Kristóf	cfb0d931f2	aco: Emit zero for the derivatives of uniforms. Observed in a shader from Resident Evil Village. This also helps prevent emitting invalid IR. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12599>	2021-08-27 20:34:22 +00:00
Daniel Schürmann	23d5865f42	aco: refactor nir_op_imul selection Previously, the optimization to use v_mul_lo_u16 for 32bit multiplications was done in instruction_selection. This was moved to the optimizer to ease some case distinctions. The mixed results are due to increased use of SDWA. Totals from 2616 (1.74% of 150170) affected shaders: (GFX10.3) VGPRs: 143888 -> 143872 (-0.01%); split: -0.02%, +0.01% CodeSize: 5604032 -> 5604080 (+0.00%); split: -0.01%, +0.01% Instrs: 1086798 -> 1083915 (-0.27%); split: -0.27%, +0.01% Latency: 8215793 -> 8213023 (-0.03%); split: -0.10%, +0.07% InvThroughput: 20765157 -> 20773766 (+0.04%); split: -0.02%, +0.06% VClause: 35256 -> 35260 (+0.01%); split: -0.02%, +0.03% SClause: 29021 -> 29024 (+0.01%); split: -0.00%, +0.01% Copies: 74163 -> 74306 (+0.19%); split: -0.05%, +0.24% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>	2021-08-27 19:57:59 +00:00
Timur Kristóf	5b7446d74c	radv, ac, aco: Use indices 0-2 of gs_vtx_offset argument array on GFX9+. Previously, indices 0, 2, 4 were used. This worked, but it was somewhat unintuitive. This commit changes it to use indices 0, 1, 2 instead, which makes the code easier to understand. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12511>	2021-08-26 05:20:15 +00:00
Daniel Schürmann	cd489e5388	aco: remove redundant s_and exec after nir_op_inot Totals from 22585 (15.04% of 150170) affected shaders: (GFX10.3) VGPRs: 1474048 -> 1473904 (-0.01%) CodeSize: 155238876 -> 155187688 (-0.03%); split: -0.06%, +0.03% MaxWaves: 385086 -> 385122 (+0.01%) Instrs: 29297735 -> 29284442 (-0.05%); split: -0.08%, +0.04% Latency: 675841742 -> 675764151 (-0.01%); split: -0.02%, +0.01% InvThroughput: 174859037 -> 174854796 (-0.00%); split: -0.01%, +0.01% VClause: 479790 -> 479781 (-0.00%); split: -0.01%, +0.00% SClause: 1106900 -> 1106615 (-0.03%); split: -0.03%, +0.01% Copies: 1829037 -> 1828042 (-0.05%); split: -0.09%, +0.03% Branches: 859971 -> 859967 (-0.00%); split: -0.00%, +0.00% PreSGPRs: 1341850 -> 1342356 (+0.04%); split: -0.01%, +0.04% PreVGPRs: 1327322 -> 1327034 (-0.02%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>	2021-08-25 12:43:50 +00:00
Rhys Perry	8852c5448d	aco: fix vectorized 16-bit load_input/load_interpolated_input Seems we haven't encountered this before because nir_lower_io_to_scalar_early usually scalarizes this. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12486>	2021-08-23 10:11:36 +00:00
Timur Kristóf	448592b9ae	aco: Use Navi 10 empty NGG output workaround on NGG culling shaders. Navi 10 can hang when an NGG workgroup has no output, so we work around that by always exporting a single zero-area triangle with a single vertex that has all-NaN coordinates. Thus far, we only employed this for NGG GS, because on all other stages, the output can't be empty. However, with NGG culling, the output can be empty, so let's apply the same workaround there too. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12169>	2021-08-04 12:28:34 +00:00
Rhys Perry	566970f273	aco: use image_dim and image_array intrinsic indices Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12190>	2021-08-04 12:09:07 +00:00
Timur Kristóf	da9f4b2e67	nir, aco: Remove vertex and primitive count overwrite intrinsic. It's no longer needed. No Fossil DB changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	1bbea90f50	aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags. Instead of v_bfe + v_lshl_or for each vertex, get all 3 edge flags of every vertex at once. This takes fewer VALU instructions than previously. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: CodeSize: 161028288 -> 158751628 (-1.41%) Instrs: 30917985 -> 30519571 (-1.29%) Latency: 130617204 -> 129975532 (-0.49%); split: -0.50%, +0.01% InvThroughput: 21280238 -> 20927401 (-1.66%) Copies: 3011120 -> 3011125 (+0.00%); split: -0.00%, +0.00% No Fossil DB changes with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Samuel Pitoiset	6694c37ea0	aco: implement VK_EXT_shader_atomic_float2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12060>	2021-07-27 08:44:31 +02:00
Jason Ekstrand	e83fe65cd8	radv,radeonsi: Do cube size divide-by-6 lowering in NIR No point in carrying all this code around twice each in two back-ends. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12005>	2021-07-22 14:22:35 -05:00
Timur Kristóf	55d57b828f	aco: Fix how p_elect interacts with optimizations. Since p_elect doesn't have any operands, ACO's value numbering and/or the pre-RA optimizer could currently recognize two p_elect instructions in two different blocks as the same. This patch adds exec as an operand to p_elect in order to achieve correct behavior. Fixes: `e66f54e5c8` Closes: #5080 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11943>	2021-07-18 00:48:06 +02:00
Timur Kristóf	e66f54e5c8	aco: Allow elect to take advantage of knowing when all lanes are active. Implement elect using a pseudo-op which is lowered during the insert_exec_mask pass. This makes it possible to emit a more optimal sequence when the exec mask is constant. Fossil DB results on Sienna Cichlid: Totals from 211 (0.16% of 128647) affected shaders: CodeSize: 2254356 -> 2240468 (-0.62%); split: -0.62%, +0.00% Instrs: 438471 -> 434996 (-0.79%); split: -0.80%, +0.01% Latency: 2717082 -> 2709400 (-0.28%); split: -0.28%, +0.00% InvThroughput: 566987 -> 566342 (-0.11%); split: -0.11%, +0.00% Copies: 40058 -> 40162 (+0.26%) Branches: 31209 -> 31211 (+0.01%) PreSGPRs: 9927 -> 10125 (+1.99%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>	2021-07-16 14:31:54 +00:00
Timur Kristóf	b12318f26c	aco: Swap s_and operand order for ballot. This allows our optimizer to recognize this and eliminate it when it can prove that the s_and with exec is unneeded. Fossil DB changes on Sienna Cichlid: Totals from 1969 (1.53% of 128647) affected shaders: CodeSize: 9468228 -> 9469348 (+0.01%); split: -0.00%, +0.01% Instrs: 1773566 -> 1773581 (+0.00%); split: -0.01%, +0.01% Latency: 19504042 -> 19503385 (-0.00%); split: -0.00%, +0.00% InvThroughput: 3617406 -> 3617333 (-0.00%) Copies: 108998 -> 110592 (+1.46%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>	2021-07-16 14:31:54 +00:00
Daniel Schürmann	114d38e57d	aco/isel: avoid unnecessary calls to nir_unsigned_upper_bound() These were responsible for ~20% of the time spent in instruction selection. Reduces overall compile times by ~0.5%. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11879>	2021-07-14 18:10:40 +02:00
Timur Kristóf	182d9b1e60	aco: Implement NGG culling related intrinsics. These are very straightforward as they just copy data from the newly added shader arguments. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Tony Wasserka	cfd866ed42	aco: Clean up unneeded literal casts These were only needed to select the appropriate Operand constructor before. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>	2021-07-13 17:43:26 +00:00
Tony Wasserka	66e51dc474	aco: Remove use of deprecated Operand constructors This migration was done with libclang-based automatic tooling, which performed these replacements: * Operand(uint8_t) -> Operand::c8 * Operand(uint16_t) -> Operand::c16 * Operand(uint32_t, false) -> Operand::c32 * Operand(uint32_t, bool) -> Operand::c32_or_c64 * Operand(uint64_t) -> Operand::c64 * Operand(0) -> Operand::zero(num_bytes) Casts that were previously used for constructor selection have automatically been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)). Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>	2021-07-13 17:43:26 +00:00
Daniel Schürmann	b97cd93b35	aco: fix extract_vector optimization If the allocated_vec map contains a different RegType for the elements, ensure that the size matches exactly. Otherwise, it could happen that extracting a dword element matched with a subdword element. No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11823>	2021-07-13 09:14:43 +02:00
Daniel Schürmann	1e2639026f	aco: Format. Manually adjusted some comments for more intuitive line breaks. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>	2021-07-12 21:27:31 +00:00
Samuel Pitoiset	ee79b87c62	radv: lower primitive shading rate in NIR This allows more potential compiler optimizations if the value is a constant or from a scalar load. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11579>	2021-07-12 17:54:07 +00:00
Daniel Schürmann	0eea0e55ad	aco: add 'common/' and 'llvm/' prefix to #includes Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>	2021-07-12 12:09:31 +00:00
Daniel Schürmann	59fdaa1985	aco: reorder and cleanup #includes Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>	2021-07-12 12:09:31 +00:00
Samuel Pitoiset	543eb42c35	aco: use nir_ssa_def_is_unused() to determine if atomic dest is used Instead of duplicating this chunk everywhere. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11793>	2021-07-09 12:12:57 +00:00
Samuel Pitoiset	74a221bcfd	aco: fix shared_atomic_comp_swap if the second source isn't a VGPR Only VGPRs are valid with DS instructions. Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11777>	2021-07-08 10:41:14 +00:00
Rhys Perry	a9c4a31d8d	aco: handle NIR loops without breaks Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11626>	2021-07-01 10:01:52 +00:00
Rhys Perry	c094765a01	aco: remove resource flags After disabling SMEM stores, nir_opt_access() now does the same analysis and we don't need this anymore. Doing it in isel is also too late if we want to lower descriptor loads in NIR. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11652>	2021-06-30 19:07:12 +01:00
Timur Kristóf	e6bf5cfe59	aco/gfx10: Emit barrier at the start of NGG VS and TES. The Navi 1x NGG hardware can hang in certain conditions when not every wave launched before s_sendmsg(GS_ALLOC_REQ). As a workaround, to ensure this never happens, let's emit a workgroup barrier at the beginning of NGG VS and TES. Note that NGG GS already has a workgroup barrier so it doesn't need this. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10837>	2021-06-22 14:32:27 +00:00
Timur Kristóf	f9447abb36	aco/gfx10: NGG zero output workaround for conservative rasterization. Navi 1x GPUs have an issue: they can hang when the output vertex and primitive counts are zero. The workaround is exporting a dummy triangle. This commit changes the dummy triangle's vertex so its positions are all NaN. This should make sure the triangle is never rendered. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10837>	2021-06-22 14:32:27 +00:00
Jason Ekstrand	f0f713960b	nir,amd: Suffix nir_op_cube_face_coord/index with _amd Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:03:34 -05:00
Rhys Perry	1d50ef9ca6	aco: adjust the condition for expanding vertex fetch data format Instead of avoiding out-of-bounds access, avoid creating a load larger than the original attribute. This should work just as well, since the only situations where expanding a load helped were ones where we shrunk it first. Also fixes a bug where a 3 component load (4 components with the first component skipped) would be incorrectly expanded to 4 components because the stride check would never be performed. Maybe we should avoid skipping the first component in some situations, but I'm not sure if it's worth the VGPR cost. fossil-db (vega10): Totals from 583 (0.39% of 149974) affected shaders: CodeSize: 1496848 -> 1500868 (+0.27%); split: -0.03%, +0.30% Instrs: 286155 -> 286575 (+0.15%); split: -0.07%, +0.22% Latency: 2947101 -> 2946865 (-0.01%); split: -0.23%, +0.22% InvThroughput: 797396 -> 797127 (-0.03%); split: -0.08%, +0.04% fossil-db (polaris10): Totals from 583 (0.39% of 151365) affected shaders: SGPRs: 38880 -> 39216 (+0.86%) VGPRs: 24440 -> 24356 (-0.34%) CodeSize: 1506808 -> 1510876 (+0.27%); split: -0.01%, +0.28% Instrs: 288735 -> 289167 (+0.15%); split: -0.06%, +0.21% Latency: 2963263 -> 2961884 (-0.05%); split: -0.24%, +0.19% InvThroughput: 802351 -> 801665 (-0.09%); split: -0.12%, +0.04% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007>	2021-06-14 09:48:32 +00:00
Rhys Perry	91f8f82806	radv,aco: use all attributes in a binding to obtain an alignment for fetch Instead of assuming scalar alignment for an attribute, we can use the required alignment of other attributes in a binding to expect a higher one. This uses the alignment of all attributes in the pipeline, not just the ones loaded. This can create slightly better code, but could break pipelines which relied on unused (and unaligned) attributes not being loaded. I don't think such pipelines are allowed by the spec. fossil-db (Sienna Cichlid): Totals from 44350 (30.32% of 146267) affected shaders: VGPRs: 1694464 -> 1700616 (+0.36%); split: -0.08%, +0.44% CodeSize: 60207184 -> 58093836 (-3.51%); split: -3.51%, +0.00% MaxWaves: 1175998 -> 1174948 (-0.09%); split: +0.02%, -0.11% Instrs: 11763444 -> 11458952 (-2.59%); split: -2.60%, +0.01% Latency: 70679612 -> 67062215 (-5.12%); split: -5.27%, +0.15% InvThroughput: 11482495 -> 11362911 (-1.04%); split: -1.20%, +0.16% VClause: 359459 -> 343248 (-4.51%); split: -6.36%, +1.85% SClause: 422404 -> 419229 (-0.75%); split: -1.17%, +0.42% Copies: 754384 -> 764368 (+1.32%); split: -1.74%, +3.06% Branches: 197472 -> 197474 (+0.00%); split: -0.03%, +0.03% PreVGPRs: 1215348 -> 1215503 (+0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007>	2021-06-14 09:48:32 +00:00
Rhys Perry	9162963f0a	aco: fix emit_mbcnt() with a VGPR mask Found by inspection. Should be possible with nir_intrinsic_mbcnt_amd. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11295>	2021-06-10 11:21:47 +00:00
Timur Kristóf	18337fbcf2	aco: Use as_vgpr for the second source of mbcnt_amd. Fixes: `1e49018ced` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11292>	2021-06-10 10:13:02 +00:00
Timur Kristóf	1e49018ced	amd: Add extra source to the mbcnt_amd NIR intrinsic. The v_mbcnt instructions can take an extra source that they add to the result. This is not exposed in SPIR-V but we now expose it in NIR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	ce141e4c5f	aco: Implement byte and lane permute intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00

828 commits