fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
James Park	1351fcf3c3	amd: Fix warnings around variable sizes Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6162>	2021-04-23 10:37:22 +00:00
Timur Kristóf	74c467d988	aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7. On GFX6-7, the 8 and 16-bit integer add reductions use the 32-bit v_add instruction, which clobbers the VCC register. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10346>	2021-04-22 11:29:49 +00:00
Rhys Perry	0eaa5dfac0	aco: remove image parameter from get_sampler_desc() We can just check whether tex_instr is NULL instead. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Rhys Perry	3cbe9894f7	aco: set TRUNC_COORD=0 for nir_texop_tg4 Fixes black squares in Assassin's Creed: Valhalla and rendering of FidelityFX-CACAO demo. fossil-db (sienna cichlid): Totals from 3052 (2.09% of 146267) affected shaders: SpillSGPRs: 8437 -> 8646 (+2.48%) CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40% Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29% Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05% InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03% VClause: 92114 -> 92132 (+0.02%) SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01% Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61% Branches: 219629 -> 219635 (+0.00%) PreSGPRs: 248970 -> 249366 (+0.16%) fossil-db (polaris10): Totals from 3050 (2.06% of 147787) affected shaders: SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02% VGPRs: 242572 -> 242612 (+0.02%) SpillSGPRs: 10387 -> 10675 (+2.77%) CodeSize: 31872460 -> 31996128 (+0.39%) MaxWaves: 10924 -> 10925 (+0.01%) Instrs: 6222217 -> 6239072 (+0.27%) Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09% InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06% VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01% SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00% Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41% Branches: 219698 -> 219703 (+0.00%) PreSGPRs: 244251 -> 244644 (+0.16%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `58f25098a0` ("radv: Use TRUNC_COORD on samplers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3110 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Samuel Pitoiset	9434675d60	aco: fix opquantize2f16 on GFX6-7 Make sure to preserve signed zeroes. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero on GFX6 (Pitcairn). Untested on GFX7. Fixes: `54a09545ec` ("aco: optimize a*0.0") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10319>	2021-04-19 16:33:37 +00:00
Marek Olšák	ec1ddb976a	amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>	2021-04-17 02:37:49 +00:00
Timur Kristóf	5dbab03a80	aco: Emit fewer branches for NGG VS/TES with late primitive export. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	af7d5f5b86	aco: Set block_kind_export_end in create_vs/fs_exports. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	2b312a4fd7	aco: Extract ngg_nogs_export_prim_id to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	231ef14b3d	aco: Use s_setprio 3 at the beginning of every VS and TES. The user-set priority of shaders matters very little, but we hope this might still help speed up VS input loads especially. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	4c86c7aa15	aco: Remove useless s_setprio near gs_alloc_req. We learned that the gs_alloc_req is not actually when the export space allocation happens. So it makes no sense to prioritize it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	75cd43741a	aco: Align NGG scratch size to 16 so a single ds_read can always read it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Timur Kristóf	c1346e5c22	aco: Optimize workgroup exclusive scan to better avoid bank conflicts. Previously, every wave had multiple active lanes read the LDS, and the data was processed by VALU DPP instructions. Now, only the first lane reads the LDS in order to avoid bank conflicts, and the results are processed by SALU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Rhys Perry	d8f12fd421	aco: fix 16-bit f2{u8,i8} on GFX6/7 Not really tested. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Rhys Perry	d0e15b8c22	aco: fix 16-bit u2f32 This shouldn't sign-extend. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Samuel Pitoiset	1ad295ed6f	radv: allow to force VRS rates on GFX10.3 with RADV_FORCE_VRS This allows to force the VRS rates via RADV_FORCE_VRS, the supported values are 2x2, 1x2 and 2x1. This supports the primitive shading rate mode for non GUI elements. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7794>	2021-04-09 14:47:53 +02:00
Rhys Perry	835c5b7ebf	aco: fix integer tg4 workaround with unnormalized coordinates Same as LLVM from `2abf62d348`. fossil-db (GFX8): Totals from 15 (0.01% of 147787) affected shaders: VGPRs: 744 -> 748 (+0.54%) CodeSize: 100472 -> 100732 (+0.26%) Instrs: 19995 -> 20059 (+0.32%) Latency: 1001530 -> 1001859 (+0.03%) InvThroughput: 378508 -> 378747 (+0.06%) SClause: 676 -> 675 (-0.15%) Copies: 1655 -> 1654 (-0.06%) PreSGPRs: 735 -> 742 (+0.95%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10053>	2021-04-07 15:21:51 +00:00
Samuel Pitoiset	65bca137bd	aco: implement a workaround for the image load DCC hw bug on GFX10.3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919>	2021-04-05 08:54:55 +00:00
Samuel Pitoiset	3dfb453626	aco: fix get_sampler_desc() for image loads Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919>	2021-04-05 08:54:55 +00:00
Tony Wasserka	8557ac9a12	aco/isel: Add documentation for (u)int64->f16 conversion The upper 32 bits are truncated before converting, which still produces correct results since they never meaningfully contribute to the result. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	b5be03f39f	aco/isel: Fix large inputs being truncated in int32->f16 conversions The previous code produced incorrect results for inputs outside the range [INT16_MIN, INT16_MAX]. A problematic case is e.g. i2f16 32768, which previously would be converted to -32768.0 instead of returning the exactly representable floating point result. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	4ce8e422e3	aco/isel: Add documentation and asserts for convert_int This function has evolved to be a generic helper function used throughout the file, so having those assumptions written down explicitly and documenting unsupported edge cases should help prevent incorrect use. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	1e03796fa4	aco/isel: Don't request sign extension when truncating signed integers This doesn't change semantics but allows us to reject this potentially ambiguous configuration in convert_int in a later change. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	3a2b055726	aco/isel: Fix i64/u64->float32 conversion for large inputs Previously, inputs such as 0x100000000 would have their upper 32-bits ignored despite being representable by 32-bit floats. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	436922c84a	aco/isel: Don't emit unsupported i16<->f16 conversion opcodes on GFX6/7 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `b86305bb57` ("nir/algebraic: collapse conversion opcodes (many patterns)") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4357 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Rhys Perry	e3c283e0bc	aco: use -1.0*x and 1.0*|x| for fneg/fabs Besides -1.0*x being 1 dword smaller than x^0x80000000, this commit also improves generated code when the application requires that denormals are flushed. Future versions of DXVK will require that 32-bit denormals are flushed. fossil-db (GFX8): Totals from 21021 (14.22% of 147787) affected shaders: SGPRs: 1288960 -> 1288944 (-0.00%); split: -0.01%, +0.01% VGPRs: 792672 -> 792848 (+0.02%); split: -0.01%, +0.03% CodeSize: 62439228 -> 62403552 (-0.06%); split: -0.11%, +0.05% MaxWaves: 136182 -> 136181 (-0.00%); split: +0.00%, -0.00% Instrs: 12230882 -> 12239927 (+0.07%); split: -0.01%, +0.08% fossil-db (GFX10.3): Totals from 20191 (13.80% of 146267) affected shaders: VGPRs: 799992 -> 800032 (+0.01%) CodeSize: 59763656 -> 59715484 (-0.08%); split: -0.12%, +0.03% MaxWaves: 525378 -> 525376 (-0.00%) Instrs: 11511082 -> 11517419 (+0.06%); split: -0.00%, +0.06% fossil-db (GFX8, d3d float controls): Totals from 87160 (58.98% of 147787) affected shaders: SGPRs: 5395072 -> 5408480 (+0.25%); split: -0.06%, +0.31% VGPRs: 3596716 -> 3581592 (-0.42%); split: -0.55%, +0.13% CodeSize: 271347396 -> 266814460 (-1.67%); split: -1.67%, +0.00% MaxWaves: 539669 -> 540400 (+0.14%); split: +0.15%, -0.02% Instrs: 53395194 -> 52257505 (-2.13%); split: -2.13%, +0.00% fossil-db (GFX10.3, d3d float controls): Totals from 82306 (56.27% of 146267) affected shaders: VGPRs: 3572312 -> 3558848 (-0.38%); split: -0.44%, +0.06% CodeSize: 273494748 -> 269648968 (-1.41%); split: -1.41%, +0.00% MaxWaves: 2007156 -> 2009950 (+0.14%); split: +0.15%, -0.01% Instrs: 52251568 -> 51356424 (-1.71%); split: -1.71%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9079>	2021-03-24 14:02:41 +00:00
Rhys Perry	27e2f82f17	aco: implement image_deref_samples It used to be that this intrinsic was never created and texture instructions were always used. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `50881d59e6` ("compiler/spirv: fix image sample queries") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9686>	2021-03-19 10:31:46 +00:00
Timur Kristóf	89c8e22cc6	aco: Fix constant address offset calculation for ds_read2 instructions. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9678>	2021-03-18 10:43:41 +00:00
Rhys Perry	5bc100eb2d	aco: use a single instruction for uadd32_sat() on GFX8 fossil-db (GFX8): Totals from 8 (0.01% of 147787) affected shaders: SGPRs: 352 -> 368 (+4.55%) CodeSize: 49576 -> 48788 (-1.59%) Instrs: 9487 -> 9318 (-1.78%) Latency: 49935 -> 49607 (-0.66%) InvThroughput: 138493 -> 137443 (-0.76%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598>	2021-03-17 15:33:34 +00:00
Rhys Perry	3decb52c82	aco: use uadd32_sat() helper for nir_op_uadd_sat Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598>	2021-03-17 15:33:31 +00:00
Rhys Perry	590de30093	aco: implement 64-bit VGPR {u,i}find_msb This can be created by subgroupBallotFindMSB(). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4458 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598>	2021-03-17 15:33:22 +00:00
Timur Kristóf	ed7c6e46e7	aco: Delete superfluous tess and ESGS I/O code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	540168fd15	radv: Use new, NIR-based I/O lowering. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	b3a16c0e19	radv: Fill some tess shader info earlier. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	582229585b	aco: Implement new Geometry Shader intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	5c95b32c6e	aco: Implement the new tessellation I/O related NIR intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	e10e74a7af	aco: Implement new buffer load/store intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Rhys Perry	0af7ff49fd	aco: lower p_constaddr into separate instructions earlier This allows them to be scheduled properly and simplifies the assembler a little. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	7d5643c0fe	aco: track divergent and uniform branch depth Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:30 +00:00
Rhys Perry	8f71be0a7b	aco: simplify loop_nest_depth tracking in isel Keep track of the current loop depth in Program and set the depth inside Program::insert_block() instead of repeating it every time we insert one. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:24 +00:00
Rhys Perry	341dd9d834	aco: set compr for fp16 exports Obviously this didn't affect correctness. Not sure about performance. It also changes enabled_channels to match radeonsi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `f29c81f863` ("aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9459>	2021-03-11 13:54:18 +00:00
Rhys Perry	3a72044ece	aco: add missing usable_read2 check A Hitman 2 shader does: read64(local_invocation_index() * 4 - 4). This was likely emitting a ds_read2_b32 on GFX6. For local_invocation_index()=0, because the first dword was out-of-bounds, the second was likely also considered out-of-bounds (even though it's not, at offset 0). Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/3882 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `57e6886f98` ("aco: refactor load_lds to use new helpers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Rhys Perry	941739619e	Revert "radv,aco: allow unaligned LDS access on GFX9+" This reverts commit `1a0b0e8460`. The bounds checking behaviour of ds_read_b64, ds_read_b96 and ds_read_b128 make this feature very difficult to use safely. This fixes a blocking artifact in Hitman 2. Previously, it contained: ds_read_b64(local_invocation_index() * 4 - 4) For local_invocation_index()=0, the second dword would be considered out-of-bounds, even though it's at offset 0. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Rhys Perry	c3af0c2079	aco: use p_as_uniform for get_sampler_desc and convert_pointer_to_64_bit Since value-numbering no longer works across loops, we no longer need to use v_readfirstlane_b32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>	2021-02-26 13:33:56 +00:00
Rhys Perry	5f1b354472	aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM We should avoid a situation where a v_readfirstlane_b32 is in WQM but its source is calculated in Exact. Fixes hang when running Assassin's Creed: Valhalla benchmark. fossil-db (GFX10.3): Totals from 1021 (0.70% of 146267) affected shaders: CodeSize: 7835228 -> 7842992 (+0.10%); split: -0.00%, +0.10% Instrs: 1519208 -> 1521149 (+0.13%); split: -0.00%, +0.13% SClause: 78921 -> 78920 (-0.00%) Copies: 44456 -> 45421 (+2.17%); split: -0.05%, +2.22% Branches: 12987 -> 13933 (+7.28%) PreSGPRs: 47599 -> 47813 (+0.45%) Cycles: 10037540 -> 10045304 (+0.08%); split: -0.00%, +0.08% VMEM: 538381 -> 538777 (+0.07%); split: +0.11%, -0.03% SMEM: 84553 -> 84554 (+0.00%); split: +0.01%, -0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>	2021-02-26 13:33:56 +00:00
Daniel Schürmann	fbf791e70c	aco: value number VOPC instructions with different exec masks This becomes possible as long as we do val = s_and_b32/64 exec, val before any subgroup operations. This precautional instruction can be removed by the optimizer if 'val' was computed by a VOPC instruction using the same exec mask. Totals from 59 (0.04% of 146267) affected shaders (Navi10): VGPRs: 2808 -> 2816 (+0.28%) CodeSize: 340888 -> 340852 (-0.01%); split: -0.20%, +0.19% Instrs: 61733 -> 61625 (-0.17%); split: -0.18%, +0.01% Cycles: 470636 -> 469112 (-0.32%); split: -0.33%, +0.01% VMEM: 8091 -> 7993 (-1.21%) SMEM: 2736 -> 2719 (-0.62%); split: +0.29%, -0.91% VClause: 1745 -> 1741 (-0.23%) SClause: 2394 -> 2392 (-0.08%); split: -0.25%, +0.17% Copies: 3249 -> 3253 (+0.12%); split: -0.62%, +0.74% Branches: 1210 -> 1206 (-0.33%) PreSGPRs: 3126 -> 3176 (+1.60%); split: -0.16%, +1.76% Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9195>	2021-02-25 11:35:42 +01:00
Daniel Schürmann	29b866fef6	aco: remove special handling of load_helper_invocation These should now behave the same as is_helper_invocation. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>	2021-02-17 21:53:52 +00:00
Rhys Perry	1a0b0e8460	radv,aco: allow unaligned LDS access on GFX9+ fossil-db (GFX10.3): Totals from 223 (0.16% of 139391) affected shaders: SGPRs: 10032 -> 10096 (+0.64%) VGPRs: 7480 -> 7592 (+1.50%) CodeSize: 853960 -> 821920 (-3.75%); split: -3.76%, +0.01% MaxWaves: 5916 -> 5908 (-0.14%) Instrs: 154935 -> 150281 (-3.00%); split: -3.01%, +0.01% Cycles: 3202496 -> 3080680 (-3.80%); split: -3.81%, +0.00% VMEM: 48187 -> 46671 (-3.15%); split: +0.29%, -3.44% SMEM: 13869 -> 13850 (-0.14%); split: +1.52%, -1.66% VClause: 3110 -> 3085 (-0.80%); split: -1.03%, +0.23% SClause: 4376 -> 4381 (+0.11%) Copies: 12132 -> 12065 (-0.55%); split: -2.61%, +2.06% Branches: 5204 -> 5203 (-0.02%) PreVGPRs: 6304 -> 6359 (+0.87%); split: -0.10%, +0.97% See https://reviews.llvm.org/D82788 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8762>	2021-02-17 12:57:12 +00:00
Rhys Perry	3d4c13f3b8	aco: add DeviceInfo Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:44:22 +00:00
Rhys Perry	7ff805a19d	radv,aco: add radv_nir_compiler_options::wgp_mode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:35:36 +00:00


751 commits