fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 15:30:14 +01:00

Author	SHA1	Message	Date
Rob Clark	356b93f102	freedreno: remove some obsolete debug options 'fraghalf' is unused (superseded by actually lowering output based on the precision information in nir). And glsl140 support in ir3 is long past the experimental stage, so the glsl120 option is no longer needed. So remove them and free up some bits for new things. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4366>	2020-03-30 23:20:12 +00:00
Jason Ekstrand	b113170559	nir/opt_loop_unroll: Fix has_nested_loop handling In `87839680c0`, a very subtle mistake was made with the CFG walking recursion. Instead of setting the local has_nested_loop variable when processing child loops, has_nested_loop_out was passed directly into the process_loop_in_block call. This broke nested loop detection heuristics and caused loop unrolling to run massively out of control. In particular, it makes the following CTS test compile virtually forever: dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.struct_mixed_types.uniform_buffer_block_geom Fixes: `87839680c0` "nir: Fix breakage of foreach_list_typed_safe..." Closes: #2710 Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4380> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4380>	2020-03-30 22:20:47 +00:00
Eric Anholt	92afe94d28	freedreno: Work around UBWC flakiness. In trying to track down the new failure in #2670, I found that I could get the flaky test set down to 4 tests, and dropping any remaining test wouldn't trigger the failure (a bad 8x4 block in the middle of dEQP-GLES3.functional.fbo.msaa.4_samples.r16f's render target). Disabling gmem or bypass didn't help, and adding lots of CCU flushing didn't help. What did help was disabling blitting, or this memset to initialize the UBWC area after we (presumably) pull a BO out of the BO cache. My guess is that the 2D blitter can't handle some rare set of state in the flags buffer and emits some garbage. I've run 8 gles3 and 7 gles31 runs with this branch now so hopefully I've got the right set of flakes marked for removal. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2670 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290>	2020-03-30 21:48:59 +00:00
Eric Anholt	d0b3ccb060	freedreno: Fix detection of being in a blit for acc queries. The batch might not have stage == FD_STAGE_BLIT set because fd_blitter_pipe_begin was sticking the stage on some random batch (or none at all) rather than the one that would be used in the meta operation. What we actually wanted to be looking at was set_active_query_state(), which is already called by util_blitter and whose state we just needed to track. Fixes piglit occlusion_query_meta_no_fragments. I haven't changed query_hw.c's stage handling to clean the rest up because I don't have a db410c/db820c at home to iterate over the piglit tests. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	57d54bcf99	freedreno: Rename "is_blit" to "is_discard_blit" It's about the special case of an overwrite of a level meaning we can discard old batch contents. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	8cdc6c1e4b	freedreno/a6xx: Fix timestamp queries. We were returning the same kind of result as time_elapsed (an end - start time in ns), which on a timestamp query is approximately zero since begin/end are at the same point in time. What we're supposed to return is a converted-to-ns timestamp based on the GPU clock. Remove the _pause() function for time_elapsed to reduce the command stream overhead, and just capture start (which is, unfortunately, going to happen on each tile and thus the final start value we read will be the last tile of the frame, not the first). Fixes piglit spec/arb_timer_query/query gl_timestamp Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	7ef61c1f10	freedreno: Count blits in GL_TIME_ELAPSED and perf counter queries. Fixes 0 gpu time reported for glBlitFramebuffer in apitrace replay --pgpu. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	4a07839948	freedreno: Associate the acc query bo with the batch. Otherwise, a result query with wait won't trigger flushing the batch, and we can end up with zeroed results. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	36612c96bd	freedreno: Fix acc query handling in the presence of batch reordering. When we switch batches and start a new draw, we need to cap the queries in the previous batch and start queries again in the new one. FD_STAGE_NULL got renamed to 0 so that it would naturally return !is_active and end the queries at the end of the batch. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	a99ff93374	freedreno: Remove the "active" member of queries. The state tracker only gets to begin/query/destroy when !active and end when active, so we have no need to try to track this ourselves. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Eric Anholt	b7fe793869	freedreno: Remove always-true return from per-gen begin_query. You should do failure-prone allocation in create_query, not begin, anyway. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>	2020-03-30 21:35:21 +00:00
Rhys Perry	1ef9658906	util/u_queue: fix race in total_jobs_size access Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> CC: <mesa-stable@lists.freedesktop.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>	2020-03-30 20:17:43 +00:00
Rhys Perry	d101ca3f5a	glsl: fix race in instance getters Insertions can modify entry->data. Seems to fix random Fossilize crashes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net> CC: <mesa-stable@lists.freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>	2020-03-30 20:17:43 +00:00
Jason Ekstrand	f5b14d983e	nir: Set UBO alignments in lower_uniforms_to_ubo Fixes: `fb64954d9d` "nir: Validate that memory load/store ops work on..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4378> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4378>	2020-03-30 19:18:17 +00:00
Rhys Perry	4a909068ad	aco: look at p_{extract,split}_vector's definitions in pred_by_exec_mask() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333>	2020-03-30 17:34:46 +00:00
Daniel Stone	9197fd59da	CI: Re-enable Windows VS2019 builds The failures are fixed, but I didn't notice this had been silently disabled in !4272. Re-enable the VS2019 build. Signed-off-by: Daniel Stone <daniels@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4374> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4374>	2020-03-30 16:22:20 +00:00
Jason Ekstrand	fb64954d9d	nir: Validate that memory load/store ops work on whole bytes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Jason Ekstrand	4e80151c5d	anv: Set alignments on descriptor and constant loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Jason Ekstrand	c217ee8d35	nir: Insert b2b1s around booleans in nir_lower_to By inserting a b2b1 around the load_ubo, load_input, etc. intrinsics generated by nir_lower_io, we can ensure that the intrinsic has the correct destination bit size. Not having the right size can mess up passes which try to optimize access. In particular, it was causing brw_nir_analyze_ubo_ranges to ignore load_ubo of booleans which meant that booleans uniforms weren't getting pushed as push constants. I don't think this is an actual functional bug anywhere hence no CC to stable but it may improve perf somewhere. Shader-db results on ICL with iris: total instructions in shared programs: 16076707 -> 16075246 (<.01%) instructions in affected programs: 129034 -> 127573 (-1.13%) helped: 487 HURT: 0 helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.45% max: 3.00% x̄: 1.33% x̃: 1.36% 95% mean confidence interval for instructions value: -3.00 -3.00 95% mean confidence interval for instructions %-change: -1.37% -1.29% Instructions are helped. total cycles in shared programs: 338015639 -> 337983311 (<.01%) cycles in affected programs: 971986 -> 939658 (-3.33%) helped: 362 HURT: 110 helped stats (abs) min: 1 max: 1664 x̄: 97.37 x̃: 43 helped stats (rel) min: 0.03% max: 36.22% x̄: 5.58% x̃: 2.60% HURT stats (abs) min: 1 max: 554 x̄: 26.55 x̃: 18 HURT stats (rel) min: 0.03% max: 10.99% x̄: 1.04% x̃: 0.96% 95% mean confidence interval for cycles value: -79.97 -57.01 95% mean confidence interval for cycles %-change: -4.60% -3.47% Cycles are helped. total sends in shared programs: 815037 -> 814550 (-0.06%) sends in affected programs: 5701 -> 5214 (-8.54%) helped: 487 HURT: 0 LOST: 2 GAINED: 0 The two lost programs were SIMD16 shaders in CS:GO. However, CS:GO was also one of the most helped programs where it shaves sends off of 134 programs. This seems to reduce GPU core clocks by about 4% on the first 1000 frames of the PTS benchmark. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Jason Ekstrand	d2dfcee7f7	nir: Use b2b opcodes for shared and constant memory No shader-db changes on ICL with iris Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Jason Ekstrand	16a80ff18a	aco: Implement b2b32 and b2b1 The implementations here just clone i2b32 and i2b1. This means that b2b32 doesn't technically generate true NIR 0/-1 booleans but it should be fine as it's only ever generated for shared variable writes which will always be consumed by something which will then run it through an i2b again. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Jason Ekstrand	b2db84153a	nir: Add b2b opcodes These exist to convert between different types of boolean values. In particular, we want to use these for uniform and shared memory operations where we need to convert to a reasonably sized boolean but we don't care what its format is so we don't want to make the back-end insert an actual i2b/b2i. In the case of uniforms, Mesa can tweak the format of the uniform boolean to whatever the driver wants. In the case of shared, every value in a shared variable comes from the shader so it's already in the right boolean format. The new boolean conversion opcodes get replaced with mov in lower_bool_to_int/float32 so the back-end will hopefully never see them. However, while we're in the middle of optimizing our NIR, they let us have sensible load_uniform/ubo intrinsics and also have the bit size conversion. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Jason Ekstrand	2cb9cc56d5	intel/nir: Run copy-prop and DCE after lower_bool_to_int32 No shader-db impact on ICL with iris. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Christian Gmeiner	5278e9dea7	etnaviv: compiled_framebuffer_state: get rid of SE_SCISSOR_* Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>	2020-03-30 15:30:15 +00:00
Christian Gmeiner	22ee3eabca	etnaviv: s/scissor_s/scissor Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>	2020-03-30 15:30:15 +00:00
Christian Gmeiner	43b4eb394c	etnaviv: get rid of struct compiled_scissor_state We can reuse pipe_scissor_state. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>	2020-03-30 15:30:15 +00:00
Christian Gmeiner	9491c1b04d	etnaviv: do the left shift by 16 at emit time Also round up the max bounds. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>	2020-03-30 15:30:15 +00:00
Christian Gmeiner	5ba2d398d8	etnaviv: rework clipping calculation to be a derived state This moves the whole clipping calculation out of the emit function. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>	2020-03-30 15:30:15 +00:00
Christian Gmeiner	95763e20ce	etnaviv: get rid of SE_CLIP_* The only difference between e.g. SE_SCISSOR_RIGHT and SE_CLIP_RIGHT is the used margin value. With that information we can remove SE_CLIP_* and apply the different margins during emit time. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>	2020-03-30 15:30:15 +00:00
Jose Fonseca	27d58a1c20	gitlab-ci: Prune all SCons jobs except scons-win64, and allows failures. Based on the discussion in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4352 Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4363> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4363>	2020-03-30 14:52:34 +00:00
Samuel Pitoiset	3935a729d9	nir/algebraic: add fexp2(fmul(flog2(a), 0.5)) -> fsqrt(a) optimization Helps some Wolfenstein II and Wolfenstein Youngblood shaders. pipeline-db (VEGA10/ACO): Totals from affected shaders: SGPRS: 17904 -> 17904 (0.00 %) VGPRS: 14492 -> 14492 (0.00 %) Spilled SGPRs: 20 -> 20 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 1753152 -> 1749708 (-0.20 %) bytes Max Waves: 2581 -> 2581 (0.00 %) pipeline-db (VEGA10/LLVM): Totals from affected shaders: SGPRS: 26656 -> 26656 (0.00 %) VGPRS: 23780 -> 23780 (0.00 %) Spilled SGPRs: 2112 -> 2112 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2552712 -> 2549236 (-0.14 %) bytes Max Waves: 3359 -> 3359 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4353> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4353>	2020-03-30 14:07:43 +00:00
Jose Fonseca	2e92d33819	scons: Prune out unnecessary targets. This prunes out all targets except libgl-gdi, libgl-xlib, and svga, as suggested by Marek Olšák. libgl-xlib will be removed once I have had time to confirm no automated tests we have rely upon it. There are also a bunch of Makefile.sources which become orphaned as a result, that are not taken care of in this change. v2: Prune remainders of swr support. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4348> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4348>	2020-03-30 13:38:01 +00:00
Timur Kristóf	0f847b18bc	aco: Don't store LS VS outputs to LDS when TCS doesn't need them. Totals: Code Size: 254764624 -> 254745104 (-0.01 %) bytes Totals from affected shaders: VGPRS: 12132 -> 12112 (-0.16 %) Code Size: 573364 -> 553844 (-3.40 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	798dd98d6e	aco: When LS and HS invocations are the same, pass LS outputs in temps. We know that in this case, the LS and HS invocations are working on the exact same vertex, so it's safe to skip the LDS. Totals: VGPRS: 3960744 -> 3961844 (0.03 %) Code Size: 254824300 -> 254764624 (-0.02 %) bytes Max Waves: 1053748 -> 1053574 (-0.02 %) Totals from affected shaders: VGPRS: 26152 -> 27252 (4.21 %) Code Size: 1496600 -> 1436924 (-3.99 %) bytes Max Waves: 4860 -> 4686 (-3.58 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	0a91c086b8	aco: Extract store_output_to_temps into a separate function. Will be used by LS output stores. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	0f35b3795d	aco: Fix workgroup size calculation. Clear the workgroup size for all supported shader stages. Also, unify the workgroup size calculation across various places. As a result, insert_waitcnt can use the proper workgroup size which means that some waits can be dropped from tessellation shaders. Also, in cases where the previous calculation was wrong, we now insert s_barrier instructions. Totals from affected shaders (GFX10): Code Size: 340116 -> 338484 (-0.48 %) bytes Fixes: `a8d15ab6da` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	99ad62ff27	aco: Extract setup_tcs_info to a separate function. Will be required by the workgroup size calculation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	0ad65f2c55	aco: Zero-fill undefined elements in create_vec_from_array. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	50634ad4a0	aco: Change isel inputs/outputs to a flat array. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e4a1b246a4	aco: Treat outputs of the previous stage as inputs of the next stage. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	f1dd81ae10	nir: Collect if shader uses cross-invocation or indirect I/O. The following new fields are added to tess shader info: * `tcs_cross_invocation_inputs_read` * `tcs_cross_invocation_outputs_read` These are I/O masks that are a subset of inputs_read and outputs_read and they contain which per-vertex inputs and outputs are read cross-invocation. Additionally, the following new fields are added to shader_info: * `inputs_read_indirectly` * `outputs_accessed_indirectly` * `patch_inputs_read_indirectly` * `patch_outputs_accessed_indirectly` These new fields can be used for optimizing TCS in a back-end compiler. If you can be sure that the TCS doesn't use cross-invocation inputs or outputs, you can choose a different strategy for storing VS and TCS outputs. However, such optimizations might need to be disabled when the inputs/outputs are accessed indirectly due to backend limitations, so this information is also collected. Example: RADV currently has to store all VS and TCS outputs in LDS, but for shaders when only inputs and/or outputs belonging to the current invocation ID are used, it could skip storing these in LDS entirely. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e7d733fdab	aco: Use more optimal sequence at the beginning of merged shaders. It can be further optimized in the future, but the new sequence already has a few advantages: * Uses fewer instructions * Uses even fewer instructions in wave32 mode * Doesn't use the VALU at all Totals from affected shaders (GFX10): VGPRS: 43504 -> 43496 (-0.02 %) Code Size: 2436000 -> 2423688 (-0.51 %) bytes Max Waves: 8704 -> 8705 (0.01 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	17c779ab9e	aco: Skip 2nd read of merged wave info when TCS in/out vertices are equal. When TCS has an equal number of input and output, it means that the number of VS and TCS invocations (LS and HS) are the same; and that the HS invocations operate on the same vertices as the LS. When this is the case, this commit removes the else-if between the merged VS and TCS halves, making it possible to schedule and optimize the code across the two halves. Totals: SGPRS: 5577367 -> 5581735 (0.08 %) VGPRS: 3958592 -> 3960752 (0.05 %) Code Size: 254867144 -> 254838244 (-0.01 %) bytes Max Waves: 1053887 -> 1053747 (-0.01 %) Totals from affected shaders: SGPRS: 29032 -> 33400 (15.05 %) VGPRS: 35664 -> 37824 (6.06 %) Code Size: 1979028 -> 1950128 (-1.46 %) bytes Max Waves: 7310 -> 7170 (-1.92 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	4ec48440a0	aco: Allow combining LDS loads when loading tess factors. Previously the tess factors were loaded individually, but now they can be loaded using a single LDS load instruction. Note that the inner and outer tess factors are not yet combined. Totals (GFX10): Code Size: 254896008 -> 254879212 (-0.01 %) bytes Totals from affected shaders (GFX10): Code Size: 2028352 -> 2011556 (-0.83 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	ace3833293	aco: Allow combining TCS output VMEM stores. Some copypasta may have stuck in the code. This was left on false by mistake. Totals (GFX10): Code Size: 254939248 -> 254896008 (-0.02 %) bytes Totals from affected shaders (GFX10): VGPRS: 16196 -> 16212 (0.10 %) Code Size: 1126332 -> 1083092 (-3.84 %) bytes Max Waves: 2336 -> 2334 (-0.09 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e2b1d749b1	aco: Fix handling of tess factors. There is no need to check whether they are written using indirect indices, because all tess factors should be written to VMEM only at the end of the shader. No pipeline db changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	d3f6adcaed	aco: Extract tcs_driver_location_matches_api_mask to separate function. Also clear up should_write_tcs_output_to_lds a little bit. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e0dff5fd86	aco: Create null exports in instruction selection instead of assembler. This allows the passes after isel to assume that the exports are always correct, and also allows to schedule these null exports later. Additionally, it ensures that the correct exec mask is used for these exports. Totals from affected shaders (GFX10): SGPRS: 84224 -> 84344 (0.14 %) VGPRS: 23088 -> 23076 (-0.05 %) Code Size: 882892 -> 894368 (1.30 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Danylo Piliaiev	87839680c0	nir: Fix breakage of foreach_list_typed_safe assumptions in loop unrolling foreach_list_typed_safe works with assumption that even if current node becomes invalid, the next will be still valid. However process_loops broke this assumption, because during iteration when immediate child is unrolled - not only current node could be removed but also the one after it. This doesn't cause issues now but it will cause issues when undefined behaviour in foreach* macros is fixed. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4189> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4189>	2020-03-30 14:41:30 +03:00
Pierre-Eric Pelloux-Prayer	716a065ac0	radeon: switch to 3-spaces style For clang-format config see the previous commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319>	2020-03-30 11:05:52 +00:00


123203 commits