fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 22:20:14 +01:00

Author	SHA1	Message	Date
Rhys Perry	dd23345567	aco: fix half_pi constant for 16-bit fsin/fcos This worked because the optimizer didn't consider that the 16-bit instruction would interpret the inline constant differently. This will change in the next commit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	f5a5674178	aco: update comment about preserving fp16/fp64 denormals Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	1b6a319c15	aco: add and set precise flag No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	a8f800a836	aco: use p_as_uniform in emit_vop1_instruction No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Rhys Perry	b6d9e45f47	aco: improve code for f2{i,u}{8,16} Use sub-dword definitions so that the RA can use SDWA No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>	2020-06-15 18:24:22 +00:00
Daniel Schürmann	1f98d8c804	aco: fix shared subdword loads Shared subdword loads don't need byte alignment as they are split into multiple loads if necessary. Fixes: `5cde4989d3` ('aco: remove unnecessary split- and create_vector instructions for subdword loads') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5441>	2020-06-12 13:56:12 +00:00
Samuel Pitoiset	7b44f549b3	aco: implement radv_enable_mrt_output_nan_fixup workaround Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>	2020-06-12 14:43:57 +02:00
Rhys Perry	56345b8c61	aco: allow reading/writing upper halves/bytes when possible Use SDWA, opsel or a different opcode to achieve this. shader-db (Navi, fp16 enabled): Totals from 42 (0.03% of 127638) affected shaders: VGPRs: 3424 -> 3416 (-0.23%) CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23% Instrs: 156638 -> 155733 (-0.58%) Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00% VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05% SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11% VClause: 1477 -> 1475 (-0.14%) Copies: 13216 -> 12406 (-6.13%) Branches: 5942 -> 5901 (-0.69%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>	2020-06-10 15:05:11 +00:00
Rhys Perry	98060ba0f0	aco: p_extract_vector in 64-bit u2f16/i2f16 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>	2020-06-10 15:05:11 +00:00
Daniel Schürmann	5cde4989d3	aco: remove unnecessary split- and create_vector instructions for subdword loads This helps GFX6/7 by removing unnecessary shuffle code. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Samuel Pitoiset	5446e3cf2e	aco: fix alignment of vectors with 4 elements I think this case was just missing. This fixes a bunch of 16-bit storage related CTS failures like dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Samuel Pitoiset	c7bd0f8cd5	aco: implement 8-bit/16-bit conversions on GFX6-GFX7 Use v_bfe to implement small bitsize conversions because the compiler probably optimizes this better. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Samuel Pitoiset	6391f9ab4c	aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>	2020-06-05 16:04:06 +02:00
Samuel Pitoiset	a521c67d22	aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>	2020-06-03 19:48:40 +02:00
Timur Kristóf	045c9ffa7d	aco: Implement subgroup shuffle on GFX6-7. GFX6 and GFX7 don't have the ds_bpermute (or permute) instruction, but we would like to support subgroup shuffle on these old GPUs. So we introduce a new pseudo instruction which will be lowered to an "unrolled loop" that emulates bpermute on GFX6 and GFX7 using readlane instructions, while also respecting the exec mask thanks to v_cmpx. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>	2020-06-02 21:12:12 +00:00
Timur Kristóf	14a5021aff	aco/gfx10: Refactor of GFX10 wave64 bpermute. The emulated GFX10 wave64 bpermute no longer needs a linear_vgpr, so we don't consider it a reduction anymore. Additionally, the code is slightly reorganized in preparation for the GFX6 emulated bpermute. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>	2020-06-02 21:12:12 +00:00
Rhys Perry	01ce7887bf	aco: fix 64-bit shared_atomic_exchange Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>	2020-05-28 10:34:03 +00:00
Samuel Pitoiset	94570e87bd	aco: add support for bias/lod with texture gather Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5147>	2020-05-25 08:51:10 +02:00
Samuel Pitoiset	cecd4aad46	aco: implement nir_intrinsic_shader_clock with device scope Use s_memrealtime instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>	2020-05-24 20:37:52 +02:00
Samuel Pitoiset	b3c87c52ea	aco: implement 8-bit/16-bit nir_intrinsic_quad_* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	dfa62d97a0	aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	f03e56eaf0	aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	86e2b03e3f	aco: implement 8-bit/16-bit reductions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	1647e098e9	aco: implement 16-bit interp For 16-bit bank LDS (ie. Kabini/Stoney) we need a slightly different path. It's completely untested though because I don't have these chips but according to vkpipeline-db the generated assembly seems fine. Note that 16-bit I/O is currently only exposed on GFX9+ for both compiler backends. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	3fba5bb9cc	aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	7ffd394605	aco: implement 8-bit/16-bit mov's with p_create_vector ACO doesn't lower 8-bit/16-bit mov's in NIR. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2997 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	860b4d16f4	aco: allow to load/store 16-bit values in VMEM for tess and geom We only have to adjust some assertions to allow storing/loading 16-bit values. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	9bd3b67163	aco: convert 16-bit values before exporting MRTs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	462a5fe6f4	aco: store 16-bit temporary outputs as v2b Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	aaf5706aa3	aco: add support for texturing with clamped LOD This is a requirement for the shaderResourceMinLod feature which allows to clamp LOD. This uses all image_sample_*_cl variants. All dEQP-VK.glsl.texture_functions.textureclamp.* pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>	2020-05-14 10:05:44 +00:00
Samuel Pitoiset	47a769143b	aco: remove useless check for nir_tex_src_bias I think only nir_texop_txb can have a bias operand anyways. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>	2020-05-14 10:05:44 +00:00
Jason Ekstrand	ca2d53f451	nir: Make "divergent" a property of an SSA value v2: fix usage in ACO (by Daniel Schürmann) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>	2020-05-13 18:49:22 +00:00
Samuel Pitoiset	3fba0a7a6f	aco: fix 64-bit trunc with negative exponents on GFX6 v_frexp_exp returns the exponent as an unsigned value. Also, v_ashr returns either 0 or -1 depending on the sign of the source operand, but what we want is only the sign bit. Fixes a bunch of recent dEQP-VK.glsl.builtin.precision_double.* tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4921>	2020-05-11 08:31:23 +02:00
Samuel Pitoiset	90d9f9a37e	aco: remove unnecessary p_split_vector with v2b reg class Should be fine now that RA takes full registers for v2b if it's not an SDWA instruction. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4879>	2020-05-05 08:50:10 +02:00
Timur Kristóf	fdbb296853	aco: Remember VS/TCS output driver locations. Instead of relying on calling shader_io_get_unique_index repeatedly, remember which output driver location corresponds to which varying slot. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>	2020-04-29 11:51:04 +00:00
Timur Kristóf	ab07c4ea70	aco: Use context variables instead of calculating TCS inputs/outputs. VS needs the number of TCS inputs, and TES needs the number of TCS outputs. It is error-prone to repeat those calculations in both instruction selection and setup. Just set them in one place instead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>	2020-04-29 11:51:04 +00:00
Rhys Perry	9392ddab43	aco: consider blocks unreachable if they are in the logical cfg unreachable was true if the last block is unreachable in the linear cfg, but it should also be true if it is unreachable in the logical cfg. Fixes dEQP-VK.graphicsfuzz.for-with-ifs-and-return Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `8d8c864beb` ('aco: improve check for unreachable loop continue blocks') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4764>	2020-04-29 11:07:09 +00:00
Samuel Pitoiset	60cc065c7d	aco: fix adjusting the sample index with FMASK if value is negative The SPIR-V spec doesn't say explicitly that the sample index must be an unsigned integer. This fixes crashes with some new VK_EXT_robustness2 tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4775>	2020-04-29 07:29:54 +00:00
Samuel Pitoiset	a112ec4c11	aco: fix nir_texop_texture_samples with NULL descriptors With VK_EXT_robustness2, descriptors can be NULL and the number of samples returned by nir_texop_texture_samples should be 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4775>	2020-04-29 07:29:54 +00:00
Rhys Perry	3ee3ad561a	aco: fix vgpr nir_op_vecn with sgpr operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>	2020-04-28 23:16:55 +00:00
Rhys Perry	bcd9467d5c	aco: improve sub-dword emit_split_vector() with sgprs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	a3dc1441f0	aco: clobber scc in s_bfe_u32 in get_alu_src() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	7db7206631	aco: allow 8/16-bit shared loads These should work now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	48b7beb7b0	aco: add and use get_buffer_store_op() helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	936b70c8cf	aco: refactor visit_store_scratch() to use new helpers Should support 8/16-bit stores now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	18817041f7	aco: refactor visit_store_global() to use new helpers Should support 8/16-bit stores now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	c7bd69b3ae	aco: refactor visit_store_ssbo() to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	f75c830433	aco: refactor store_vmem_mubuf() to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	98b4cc7110	aco: refactor store_lds() to use new helpers It should also work correctly for 8/16-bit stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	562353e1f1	aco: add helpers for splitting stores split_store_data() splits a vector and p_as_uniforms it if needed. scan_write_mask()/advance_write_mask() are similar to u_bit_scan_consecutive_range(), but makes it easier to only clear part of the range and will also give ranges for zero'd bits. split_buffer_store() is a helper for splitting VMEM/SMEM stores. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
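
The write-mask helpers described in commit 562353e1f1 can be pictured with a small sketch. This is not the ACO code (which is C++); the names `scan_mask`, `advance_mask` and `split_ranges`, their signatures, and the `max_count` limit are all hypothetical, chosen only to illustrate the two properties the commit message mentions: ranges are also reported for zeroed bits, and a caller may consume only part of a range.

```python
def scan_mask(todo, mask):
    """Return (start, count, written) for the lowest run of equal bits left in `todo`.

    `todo` marks the bit positions not yet processed; `mask` is the write mask.
    `written` tells whether the run consists of set (True) or zeroed (False) bits.
    Hypothetical helper, not the ACO signature.
    """
    if todo == 0:
        raise ValueError("nothing left to scan")
    start = (todo & -todo).bit_length() - 1  # index of the lowest set bit in todo
    written = bool((mask >> start) & 1)
    count = 0
    # Extend the run while the position is still pending and has the same mask value.
    while (todo >> (start + count)) & 1 and bool((mask >> (start + count)) & 1) == written:
        count += 1
    return start, count, written


def advance_mask(todo, start, count):
    """Clear `count` bits of `todo` from `start`; `count` may be less than the full run."""
    return todo & ~(((1 << count) - 1) << start)


def split_ranges(mask, num_bits, max_count):
    """Split `mask` into runs of at most `max_count` bits (an assumed store-size limit)."""
    todo = (1 << num_bits) - 1
    ranges = []
    while todo:
        start, count, written = scan_mask(todo, mask)
        count = min(count, max_count)  # consume only part of the run if it is too long
        ranges.append((start, count, written))
        todo = advance_mask(todo, start, count)
    return ranges
```

With a mask of `0b11001111` over 8 bits and `max_count=3`, the four-bit run at the bottom is consumed in two steps and the two zeroed bits show up as their own range, which a store-splitting routine could simply skip.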

291 commits