mesa/src/intel/compiler
Kenneth Graunke be21d54aca intel/compiler: Use an existing URB write to end TCS threads when viable
VS, TCS, TES, and GS threads must end with a URB write message with the
EOT (end of thread) bit set.  For VS and TES, we shadow output variables
with temporaries and perform all stores at the end of the shader, giving
us an existing message to do the EOT.

In tessellation control shaders, we don't defer output stores until the
end of the thread like we do for vertex or evaluation shaders.  We just
process store_output and store_per_vertex_output intrinsics where they
occur, which may be in control flow.  So we can't guarantee that there's
a URB write being at the end of the shader.

Traditionally, we've just emitted a separate URB write to finish TCS
threads, doing a writemasked write to an single patch header DWord.
On Broadwell, we need to set a "TR DS Cache Disable" bit, so this is
a convenient spot to do so.  But on other platforms, there's no such
field, and this write is purely wasteful.

Insetad of emitting a separate write, we can just look for an existing
URB write at the end of the program and tag that with EOT, if possible.
We already had code to do this for geometry shaders, so just lift it
into a helper function and reuse it.

No changes in shader-db.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>
2022-09-27 18:17:42 -07:00
..
brw_cfg.cpp intel/fs: Add physical fall-through CFG edge for unconditional BREAK instruction. 2021-12-21 00:43:29 +00:00
brw_cfg.h intel/compiler: Add cfg_t::adjust_block_ips() method 2021-07-14 09:56:59 -07:00
brw_clip.h
brw_clip_line.c intel/compiler: Split 3DPRIM_* defines out to a separate header. 2022-06-30 23:46:35 +00:00
brw_clip_point.c
brw_clip_tri.c intel/compiler: Split 3DPRIM_* defines out to a separate header. 2022-06-30 23:46:35 +00:00
brw_clip_unfilled.c intel/compiler: Split 3DPRIM_* defines out to a separate header. 2022-06-30 23:46:35 +00:00
brw_clip_util.c intel: move away from booleans to identify platforms 2021-11-08 16:48:06 +00:00
brw_compile_clip.c intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_compile_ff_gs.c intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_compile_sf.c intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_compiler.c intel/compiler: Rename 8_PATCH to MULTI_PATCH 2022-08-24 00:39:57 +00:00
brw_compiler.h intel/compiler: Store the number of position slots in the VUE map 2022-08-31 02:00:18 +00:00
brw_dead_control_flow.cpp
brw_dead_control_flow.h
brw_debug_recompile.c intel/compiler: Stop including src/mesa/main/config.h 2022-06-30 23:46:35 +00:00
brw_disasm.c intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_disasm_info.c intel/eu: Handle compaction when inserting validation errors 2022-07-28 21:31:45 +00:00
brw_disasm_info.h intel/eu: Handle compaction when inserting validation errors 2022-07-28 21:31:45 +00:00
brw_eu.c intel/compiler: Convert brw_eu.cpp back to brw_eu.c 2022-06-30 23:46:35 +00:00
brw_eu.h intel/fs: switch register allocation spilling to use LSC on Gfx12.5+ 2022-08-24 17:51:40 +00:00
brw_eu_compact.c intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_eu_defines.h intel/fs: remove unused opcode 2022-08-24 17:51:40 +00:00
brw_eu_emit.c intel/fs: switch register allocation spilling to use LSC on Gfx12.5+ 2022-08-24 17:51:40 +00:00
brw_eu_util.c intel: Rename genx keyword to gfxx in source files 2021-04-02 18:33:07 +00:00
brw_eu_validate.c intel/fs: fixup SEND validation check on overlapping src0/src1 2022-08-24 17:51:40 +00:00
brw_fs.cpp intel/compiler: Use an existing URB write to end TCS threads when viable 2022-09-27 18:17:42 -07:00
brw_fs.h intel/compiler: Use an existing URB write to end TCS threads when viable 2022-09-27 18:17:42 -07:00
brw_fs_bank_conflicts.cpp intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_fs_builder.h intel/compiler: Fix instruction size written calculation 2021-11-22 21:27:30 -08:00
brw_fs_cmod_propagation.cpp intel: fix typos found by codespell 2022-06-27 10:20:55 +00:00
brw_fs_combine_constants.cpp intel/compiler: Fix missing break in switch 2021-07-22 23:38:04 +00:00
brw_fs_copy_propagation.cpp intel/compiler: Avoid copy propagating large registers into EOT messages 2022-07-07 20:20:01 +00:00
brw_fs_cse.cpp intel/compiler: Implement nir_intrinsic_last_invocation 2022-03-26 00:28:19 +00:00
brw_fs_dead_code_eliminate.cpp intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:20 -07:00
brw_fs_generator.cpp intel/fs: switch register allocation spilling to use LSC on Gfx12.5+ 2022-08-24 17:51:40 +00:00
brw_fs_live_variables.cpp intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:20 -07:00
brw_fs_live_variables.h intel: Rename gen_device prefix to intel_device 2021-04-20 20:06:33 +00:00
brw_fs_lower_pack.cpp
brw_fs_lower_regioning.cpp intel/compiler/fs: Fix compilation of shaders with SHADER_OPCODE_SHUFFLE of float64 type 2022-09-14 19:32:43 +00:00
brw_fs_nir.cpp intel/compiler/fs: Use DF to load constants when has_64bit_int is not supported 2022-09-14 19:32:43 +00:00
brw_fs_reg_allocate.cpp intel/fs: switch register allocation spilling to use LSC on Gfx12.5+ 2022-08-24 17:51:40 +00:00
brw_fs_register_coalesce.cpp intel/compiler: Update block IPs once in register_coalesce 2021-07-14 09:57:04 -07:00
brw_fs_saturate_propagation.cpp
brw_fs_scoreboard.cpp intel/compiler: Don't set SBID on EOT send messages 2022-07-09 05:26:25 +00:00
brw_fs_sel_peephole.cpp intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:20 -07:00
brw_fs_thread_payload.cpp intel/compiler: Use brw_ud* helpers in thread payload code 2022-09-13 01:44:24 +00:00
brw_fs_validate.cpp intel/compiler: Print more details when fs_visitor::validate() fails 2022-08-22 18:58:55 +00:00
brw_fs_visitor.cpp intel/compiler: Make thread_payload struct abstract 2022-09-13 01:44:24 +00:00
brw_gfx_ver_enum.h intel/compiler: Fix brw_gfx_ver_enum.h to be a proper header file 2022-06-30 23:46:35 +00:00
brw_inst.h intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_interpolation_map.c intel: Rename genx keyword to gfxx in source files 2021-04-02 18:33:07 +00:00
brw_ir.h intel/fs: switch register allocation spilling to use LSC on Gfx12.5+ 2022-08-24 17:51:40 +00:00
brw_ir_allocator.h
brw_ir_analysis.h intel/compiler: use C++ template instead of preprocessor 2020-11-03 10:42:29 +00:00
brw_ir_fs.h intel/compiler: Make component() work for FIXED_GRF/ARF 2022-08-23 19:52:38 +00:00
brw_ir_performance.cpp intel/fs: Remove non-_LOGICAL URB messages 2022-07-08 19:45:34 +00:00
brw_ir_performance.h
brw_ir_vec4.h intel: fix typos found by codespell 2022-06-27 10:20:55 +00:00
brw_isa_info.h intel/compiler: Remove use of thread_local for opcode tables 2022-06-30 23:46:35 +00:00
brw_kernel.c intel,anv,iris,crocus: Drop subgroup size from the shader key 2022-07-08 22:47:22 +00:00
brw_kernel.h intel/compiler: fix singleton pointer coverity warning 2022-04-19 12:36:10 +03:00
brw_lower_logical_sends.cpp intel/fs: fixup a64 messages 2022-09-23 08:29:17 +00:00
brw_mesh.cpp intel/compiler/task: use shared memory for small task payload loads & stores 2022-09-21 09:16:20 +00:00
brw_nir.c intel/compiler: print shader after successful brw_nir_lower_shading_rate_output 2022-09-20 17:23:45 +00:00
brw_nir.h driconf: Add a limit_trig_input_range option 2022-05-13 06:47:53 +00:00
brw_nir_analyze_boolean_resolves.c
brw_nir_analyze_ubo_ranges.c intel/nir,i965: Move HW generation check for UBO pushing to i965 2021-06-03 05:12:33 +00:00
brw_nir_attribute_workarounds.c intel/compiler: Use named NIR intrinsic const index accessors 2022-08-16 05:44:30 +00:00
brw_nir_clamp_image_1d_2d_array_sizes.c intel/compiler: use nir_shader_instructions_pass in brw_nir_clamp_image_1d_2d_array_sizes 2021-10-05 10:02:54 +00:00
brw_nir_lower_alpha_to_coverage.c
brw_nir_lower_conversions.c intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_conversions 2021-10-05 10:02:54 +00:00
brw_nir_lower_cs_intrinsics.c intel/compiler: Lower Task/Mesh local_invocation_{id,index} 2021-12-04 00:41:46 +00:00
brw_nir_lower_intersection_shader.c intel/rt: Handle halts in any-hit shaders properly 2022-08-05 11:51:31 +00:00
brw_nir_lower_mem_access_bit_sizes.c intel/compiler: add support for 8/16 bits task payload loads 2022-09-21 09:16:20 +00:00
brw_nir_lower_ray_queries.c intel/nir/rt: store ray query state in scratch 2022-09-23 08:29:17 +00:00
brw_nir_lower_rt_intrinsics.c intel/compiler: extract brw_nir_load_global_const out of rt code 2021-12-04 00:41:46 +00:00
brw_nir_lower_scoped_barriers.c intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_scoped_barriers 2021-10-05 10:02:54 +00:00
brw_nir_lower_shader_calls.c intel/nir/rt: store ray query state in scratch 2022-09-23 08:29:17 +00:00
brw_nir_lower_shading_rate_output.c intel: fix typos found by codespell 2022-06-27 10:20:55 +00:00
brw_nir_lower_storage_image.c nir/builder: Add a nir_trim_vector helper 2022-05-11 14:47:33 +00:00
brw_nir_opt_peephole_ffma.c Revert "nir: Drop the unused instr arg for src/dest copy functions." 2022-08-30 18:21:44 +00:00
brw_nir_rt.c intel: Use nir_test_mask instead of i2b(iand) 2022-06-30 18:00:32 +00:00
brw_nir_rt.h intel/fs: lower ray query intrinsics 2022-02-08 12:55:25 +00:00
brw_nir_rt_builder.h intel/nir/rt: fixup alignment of memcpy iterations 2022-09-23 08:29:17 +00:00
brw_nir_tcs_workarounds.c intel/compiler: use nir_metadata_none instead of its value 2021-10-05 10:02:54 +00:00
brw_nir_trig_workarounds.py driconf: Add a limit_trig_input_range option 2022-05-13 06:47:53 +00:00
brw_packed_float.c
brw_predicated_break.cpp intel/compiler: Don't predicate a WHILE if there is a CONT 2021-12-08 14:56:32 -08:00
brw_prim.h intel/compiler: Split 3DPRIM_* defines out to a separate header. 2022-06-30 23:46:35 +00:00
brw_private.h intel,anv,iris,crocus: Drop subgroup size from the shader key 2022-07-08 22:47:22 +00:00
brw_reg.h intel/compiler: Add a few more brw_ud* helpers 2022-09-13 01:44:24 +00:00
brw_reg_type.c intel: Rename gen_device prefix to intel_device 2021-04-20 20:06:33 +00:00
brw_reg_type.h intel/compiler: Move type_is_unsigned_int to brw_reg_type.h 2021-08-30 14:00:14 -07:00
brw_rt.h intel/fs: lower ray query intrinsics 2022-02-08 12:55:25 +00:00
brw_schedule_instructions.cpp intel/fs: Lower URB messages to SEND 2022-07-08 19:45:34 +00:00
brw_shader.cpp intel/compiler: Use FS thread payload only for FS 2022-09-13 01:44:24 +00:00
brw_shader.h intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_simd_selection.c intel/fs: fixup simd selection with shader calls 2022-08-05 11:51:31 +00:00
brw_vec4.cpp intel/compiler: Use FS thread payload only for FS 2022-09-13 01:44:24 +00:00
brw_vec4.h intel/compiler: remove gfx6 gather wa from backend. 2021-12-22 21:37:55 +00:00
brw_vec4_builder.h intel: Rename Genx keyword to Gfxx 2021-04-02 18:33:07 +00:00
brw_vec4_cmod_propagation.cpp intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:32 -07:00
brw_vec4_copy_propagation.cpp intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
brw_vec4_cse.cpp intel/compiler: Rename vec4 state URB opcodes to have VEC4_ prefix 2022-07-08 19:45:34 +00:00
brw_vec4_dead_code_eliminate.cpp intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:32 -07:00
brw_vec4_generator.cpp intel/compiler: Delete unused Gfx8+ code in brw_find_live_channel() 2022-08-02 08:41:43 +00:00
brw_vec4_gs_nir.cpp intel/compiler: Use named NIR intrinsic const index accessors 2022-08-16 05:44:30 +00:00
brw_vec4_gs_visitor.cpp intel/compiler: Use FS thread payload only for FS 2022-09-13 01:44:24 +00:00
brw_vec4_gs_visitor.h intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
brw_vec4_live_variables.cpp intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:32 -07:00
brw_vec4_live_variables.h intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5 2021-08-11 13:09:32 -07:00
brw_vec4_nir.cpp gallium,glsl: Delete PIPE_CAP_VERTEXID_NOBASE and lower_vertex_id. 2022-08-31 22:57:03 +00:00
brw_vec4_reg_allocate.cpp intel/compiler: Don't create vec4 reg-set for gen8+ 2022-07-14 17:49:01 +00:00
brw_vec4_surface_builder.cpp intel: move away from booleans to identify platforms 2021-11-08 16:48:06 +00:00
brw_vec4_surface_builder.h
brw_vec4_tcs.cpp intel/compiler: Use FS thread payload only for FS 2022-09-13 01:44:24 +00:00
brw_vec4_tcs.h intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
brw_vec4_tes.cpp intel/compiler: Rename vec4 state URB opcodes to have VEC4_ prefix 2022-07-08 19:45:34 +00:00
brw_vec4_tes.h intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
brw_vec4_visitor.cpp intel/vec4: Inline emit_texture and move helpers to brw_vec4_nir.cpp 2021-12-16 00:09:45 -08:00
brw_vec4_vs.h intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
brw_vec4_vs_visitor.cpp intel/compiler: Rename vec4 state URB opcodes to have VEC4_ prefix 2022-07-08 19:45:34 +00:00
brw_vue_map.c intel/compiler: Store the number of position slots in the VUE map 2022-08-31 02:00:18 +00:00
gfx6_gs_visitor.cpp intel/compiler: Rename vec4 state URB opcodes to have VEC4_ prefix 2022-07-08 19:45:34 +00:00
gfx6_gs_visitor.h intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
intel_clc.c intel/compiler: Introduce a new brw_isa_info structure 2022-06-30 23:46:35 +00:00
meson.build intel/compiler: Use FS thread payload only for FS 2022-09-13 01:44:24 +00:00
test_eu_compact.cpp intel/compiler: Fixes [-Wdeprecated-declarations] in test_eu_compact.cpp 2022-08-23 15:19:16 +00:00
test_eu_validate.cpp intel/compiler: Fixes [-Wdeprecated-declarations] in test_eu_validate.cpp 2022-08-23 15:19:16 +00:00
test_fs_cmod_propagation.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_fs_copy_propagation.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_fs_saturate_propagation.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_fs_scoreboard.cpp intel/fs/xehp: Add unit test for handling of RaR deps across multiple pipelines. 2022-01-25 22:40:44 +00:00
test_simd_selection.cpp intel/compiler: Initialize SIMDSelectionTest member error. 2021-11-03 04:22:35 +00:00
test_vec4_cmod_propagation.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_vec4_copy_propagation.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_vec4_dead_code_eliminate.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_vec4_register_coalesce.cpp intel/fs,vec4: Drop support for shader time 2021-12-10 21:20:47 +00:00
test_vf_float_conversions.cpp