fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 17:50:12 +01:00

Author	SHA1	Message	Date
Jason Ekstrand	272ab2823d	intel/eu: Include brw_compiler.h in brw_eu.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	91348d125d	intel/eu: Add some new helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	54ba0daa28	intel/compiler: Get rid of the global compaction table pointers With discrete GPUs, it's going to be possible to have GPUs from two different hardware generations in the machine at the same time. Global singletons like this aren't going to fly. Have a struct containing the pointers which gets initialized once per shader disassemble instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Danylo Piliaiev	bc4a127d6e	intel/disasm: Label support in shader disassembly for UIP/JIP Shader instructions which use UIP/JIP now get formatted with a label in addition with immediate value, labels have "LABEL%d" format. v2: - Consider brw_jump_scale when calculating label's offset From: "Lonnberg, Toni" <toni.lonnberg@intel.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>	2020-09-02 10:33:29 +00:00
Danylo Piliaiev	6cbd4764cd	intel/disasm: brw_label and support functions Pre-work for shader disassembly label support. Introduction of the structures and functions used by the shader disassembly jump target labeling. From: "Lonnberg, Toni" <toni.lonnberg@intel.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>	2020-09-02 10:33:29 +00:00
Danylo Piliaiev	afa39d07e4	intel/disasm: Change visibility of has_uip and has_jip Pre-work for shader disassembly label support. From: "Lonnberg, Toni" <toni.lonnberg@intel.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>	2020-09-02 10:33:29 +00:00
Jason Ekstrand	561aaeeb48	intel/eu: Add the RNDU opcode We don't want to use it on gen5 and earlier because only RNDD can be done with a single instruction and we can implement RNDU(x) as -RNDD(-x) so it's better to just do that when we have the instruction. On gen6 and above, we may as well just use the right instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:54 +00:00
Jason Ekstrand	8a0d772dca	intel/eu: Add a brw_urb_dest_msg_type helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:54 +00:00
Kenneth Graunke	2c762955d4	intel/eu: Add a brw_urb_desc helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:53 +00:00
Jason Ekstrand	c48f42e178	intel/fs: Emit HALT for discard on Gen4-5 Using HALT to immediately jump to the end of the shader is required to implement GL_EXT_gpu_shader4 and OpenGL 3.0. However, vanilla OpenGL 1.2 doesn't forbid it and it likely makes something somewhere faster. We should be consistent and implement the same discard behavior on all hardware if we can. The rules for HALT on Gen4-5 are a bit different from Gen6+. On the older hardware, there is no stack for HALT; instead it's up to software to save and restore mask registers. However, there's no real saving needed since we only use HALT to jump to the end of the program where we're about about to do our FB writes. All we need to do is reset AMask to DMask, the value it was initialized to at the start of the thread. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5244>	2020-05-30 06:21:15 +00:00
Jason Ekstrand	4985e380dd	intel/eu: Use non-coherent mode (BTI=253) for stateless A64 messages We don't care about full IA coherency since we always have the opportunity in GL or Vulkan to flush the data cache. Using IA-coherent mode is likely just making A64 access slower than it needs to be. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4819>	2020-04-30 14:45:50 +00:00
Caio Marcelo de Oliveira Filho	f858fa26b4	intel/fs,vec4: Pull stall logic for memory fences up into the IR Instead of emitting the stall MOV "inside" the SHADER_OPCODE_MEMORY_FENCE generation, use the scheduling fences when creating the IR. For IvyBridge, every (data cache) fence is accompained by a render cache fence, that now is explicit in the IR, two SHADER_OPCODE_MEMORY_FENCEs are emitted (with different SFIDs). Because Begin and End interlock intrinsics are effectively memory barriers, move its handling alongside the other memory barrier intrinsics. The SHADER_OPCODE_INTERLOCK is still used to distinguish if we are going to use a SENDC (for Begin) or regular SEND (for End). This change is a preparation to allow emitting both SENDs in Gen11+ before we can stall on them. Shader-db results for IVB (i965): total instructions in shared programs: 11971190 -> 11971200 (<.01%) instructions in affected programs: 11482 -> 11492 (0.09%) helped: 0 HURT: 8 HURT stats (abs) min: 1 max: 3 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.03% max: 0.50% x̄: 0.14% x̃: 0.10% 95% mean confidence interval for instructions value: 0.66 1.84 95% mean confidence interval for instructions %-change: 0.01% 0.27% Instructions are HURT. Unlike the previous code, that used the `mov g1 g2` trick to force both `g1` and `g2` to stall, the scheduling fence will generate `mov null g1` and `mov null g2`. During review it was decided it was not worth keeping the special codepath for the small effect will have. Shader-db results for HSW (i965), BDW and SKL don't have a change on instruction count, but do report changes in cycles count, showing SKL results below total cycles in shared programs: 341738444 -> 341710570 (<.01%) cycles in affected programs: 7240002 -> 7212128 (-0.38%) helped: 46 HURT: 5 helped stats (abs) min: 14 max: 1940 x̄: 676.22 x̃: 154 helped stats (rel) min: <.01% max: 2.62% x̄: 1.28% x̃: 0.95% HURT stats (abs) min: 2 max: 1768 x̄: 646.40 x̃: 362 HURT stats (rel) min: <.01% max: 0.83% x̄: 0.28% x̃: 0.08% 95% mean confidence interval for cycles value: -777.71 -315.38 95% mean confidence interval for cycles %-change: -1.42% -0.83% Cycles are helped. This seems to be the effect of allocating two registers separatedly instead of a single one with size 2, which causes different register allocation, affecting the cycle estimates. while ICL also has not change on instruction count but report changes negative changes in cycles total cycles in shared programs: 352665369 -> 352707484 (0.01%) cycles in affected programs: 9608288 -> 9650403 (0.44%) helped: 4 HURT: 104 helped stats (abs) min: 24 max: 128 x̄: 88.50 x̃: 101 helped stats (rel) min: <.01% max: 0.85% x̄: 0.46% x̃: 0.49% HURT stats (abs) min: 2 max: 2016 x̄: 408.36 x̃: 48 HURT stats (rel) min: <.01% max: 3.31% x̄: 0.88% x̃: 0.45% 95% mean confidence interval for cycles value: 256.67 523.24 95% mean confidence interval for cycles %-change: 0.63% 1.03% Cycles are HURT. AFAICT this is the result of the case above. Shader-db results for TGL have similar cycles result as ICL, but also affect instructions total instructions in shared programs: 17690586 -> 17690597 (<.01%) instructions in affected programs: 64617 -> 64628 (0.02%) helped: 55 HURT: 32 helped stats (abs) min: 1 max: 16 x̄: 4.13 x̃: 3 helped stats (rel) min: 0.05% max: 2.78% x̄: 0.86% x̃: 0.74% HURT stats (abs) min: 1 max: 65 x̄: 7.44 x̃: 2 HURT stats (rel) min: 0.05% max: 4.58% x̄: 1.13% x̃: 0.69% 95% mean confidence interval for instructions value: -2.03 2.28 95% mean confidence interval for instructions %-change: -0.41% 0.15% Inconclusive result (value mean confidence interval includes 0). Now that more is done in the IR, more dependencies are visible and more SWSB annotations are emitted. Mixed with different register allocation decisions like above, some shaders will see more `sync nops` while others able to avoid them. Most of the new `sync nops` are also redundant and could be dropped, which will be fixed in a separate change. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3278>	2020-04-29 07:17:27 +00:00
Caio Marcelo de Oliveira Filho	c76f2292b5	intel/fs,vec4: Properly account SENDs in IVB memory fence Change brw_memory_fence to return the number of messages emitted, and use that to update the send_count statistic in code generation. This will fix the book-keeping for IVB since the memory fences will result in two SEND messages. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4646>	2020-04-20 09:29:09 -07:00
Matt Turner	88a0523bd2	intel/compiler: Move Gen4/5 rounding to visitor Gen4/5's rounding instructions operate differently than later Gens'. They all return the floor of the input and the "Round-increment" conditional modifier answers whether the result should be incremented by 1.0 to get the appropriate result for the operation (and thus its behavior is determined by the round opcode; e.g., RNDZ vs RNDE). Since this requires a second instruciton (a predicated ADD) that consumes the result of the round instruction, the round instruction cannot write its result directly to the (write-only) message registers. By emitting the ADD in the generator, the backend thinks it's safe to store the round's result directly to the message register file. To avoid this, we move the emission of the ADD instruction to the NIR translator so that the backend has the information it needs. I suspect this also fixes code generated for RNDZ.SAT but since Gen4/5 don't support GLSL 1.30 which adds the trunc() function, I couldn't write a piglit test to confirm. My thinking is that if x=-0.5: sat(trunc(-0.5)) = 0.0 But on Gen4/5 where sat(trunc(x)) is implemented as rndz.r.f0 result, x // result = floor(x) // set f0 if increment needed (+f0) add result, result, 1.0 // fixup so result = trunc(x) then putting saturate on both instructions will give the wrong result. floor(-0.5) = -1.0 sat(floor(-0.5)) = 0.0 // +1 increment would be needed since floor(-0.5) != trunc(-0.5) sat(sat(floor(-0.5)) + 1.0) = 1.0 Fixes: `6f394343b1` ("nir/algebraic: i2f(f2i()) -> trunc()") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2355 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459>	2020-01-22 23:47:02 +00:00
Matt Turner	22462ba242	intel/compiler: Validate fuzzed instructions ... before giving them to the instruction compactor. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Jason Ekstrand	a0999bc049	intel/fs: Add DWord scattered read/write opcodes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Francisco Jerez	d3f3bdcd18	intel/eu/gen12: Add tracking of default SWSB state to the current brw_codegen instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	c22db5e188	intel/fs/gen12: Add codegen support for the SYNC instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	057902dcf8	intel/eu: Encode and decode native instruction opcodes from/to IR opcodes. Change brw_inst_set_opcode() and brw_inst_opcode() to call brw_opcode_encode/decode() transparently in order to translate between hardware and IR opcodes, and update the EU compaction code in order to do the same as needed, so we can eventually drop the one-to-one correspondence between hardware and IR opcodes. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	25dd67099d	intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode. This rewrites the current opcode description tables as a more compact flat data structure. The purpose is to allow efficient constant-time look-up by either HW or IR opcode, which will allow us to drop the hard-coded correspondence between HW and IR opcodes -- See the next commits for the rationale. brw_eu.c is now built as C++ source so we can take advantage of pointers to member in order to make the look-up function work regardless of the opcode_desc member used as look-up key. v2: Optimize devinfo struct comparison (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	51dc40cefb	intel/eu: Fix up various type conversions in brw_eu.c that are illegal C++. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-11 12:24:16 -07:00
Samuel Iglesias Gonsálvez	28da9558f5	i965/fs/generator: refactor rounding mode helper in preparation for float controls v2: - Fix bug in defining BRW_CR0_FP_MODE_MASK. v3: - Update comment (Caio). v4: - Split the patch into the helper (this one) and the new opcode (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Danylo Piliaiev	04a9951580	intel/compiler: add ability to override shader's assembly When dumping shader's assembly with INTEL_DEBUG=vs,tcs,... sha1 of the resulting assembly is also printed, having environment variable INTEL_SHADER_ASM_READ_PATH present driver will try to load a "%sha1%.bin" file from the path and substitute current assembly with the one from the file. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-05 17:19:09 +00:00
Eric Engestrom	abc226cf41	tree-wide: replace MAYBE_UNUSED with ASSERTED Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Caio Marcelo de Oliveira Filho	b390ff3517	intel/fs: Add support for SLM fence in Gen11 Gen11 SLM is not on L3 anymore, so now the hardware has two separate fences. Add a way to control which fence types to use. At this time, we don't have enough information in NIR to control the visibility of the memory being fenced, so for now be conservative and assume that fences will need a stall. With more information later we'll be able to reduce those. Fixes Vulkan CTS tests in ICL: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.buffer.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_nonlocal.workgroup.comp The whole set of supported tests in dEQP-VK.memory_model.* group should be passing in ICL now. v2: Pass BTI around instead of having an enum. (Jason) Emit two SHADER_OPCODE_MEMORY_FENCE instead of one that gets transformed into two. (Jason) List tests fixed. (Lionel) v3: For clarity, split the decision of which fences to emit from the emission code. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-11 08:29:32 -07:00
Sagar Ghuge	83fdec0f0d	intel/compiler: Enable the emission of ROR/ROL instructions v2: 1) Drop changes for vec4 backend as on Gen11+ we don't support align16 mode (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Jason Ekstrand	9e403dc56e	intel/fs: Do a stalling MFENCE in endInvocationInterlock() Fixes: `939312702e` "i965: Add ARB_fragment_shader_interlock support" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-30 14:00:26 +00:00
Jason Ekstrand	859de4a748	intel/fs,vec4: Use g0 as the header for MFENCE We set header_present but then pass it some random garbage. Give it g0 instead. I'm not actually sure this does anything but g0 is the usual header data and this is what the windows driver does so it seems like a good idea. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-30 14:00:26 +00:00
Ian Romanick	dea19138dd	intel/compiler: Silence many unused parameter warnings in brw_eu.h In file included from src/intel/compiler/brw_eu_util.c:34:0: src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’: src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_desc_header_present(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’: src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’: src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc_ex_mlen(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’: src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_binding_table_index(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’: src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_sampler(const struct gen_device_info devinfo, uint32_t desc) ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’: src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_return_format(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’: src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_dp_desc_binding_table_index(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’: src/intel/compiler/brw_eu.h:754:41: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, /< 0 for SIMD4x2 / ^~~~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’: src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, ^~~~~~~~~ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:31 -08:00
Jason Ekstrand	10b7d14c31	intel/vec4: Drop dead code for handling typed surface messages Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	c4fb6b0c81	intel/eu: Add an EOT parameter to send_indirect_[split]_message For split indirect sends we have to put the EOT parameter in the extended descriptor as well as the instruction itself so just calling brw_inst_set_eot is insufficient. Moving the EOT handling handling into the send_indirect_[split]_message helper lets us handle it properly.	2019-02-25 11:35:12 -06:00
Jason Ekstrand	e644ed468f	intel/fs: Implement nir_intrinsic_global_atomic_* eviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	1c25bf4373	intel/fs: Implement load/store_global with A64 untyped messages eviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	8babaa84e8	intel/eu: Add support for the SENDS[C] messages Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	13a6fabc62	intel/eu: Add more message descriptor helpers We want to be able to extract data from descriptors as well as unify a bit of the descriptor construction. One of the unifications we do is to unify the read/write and dataport descriptors. On gen4-5, read/write are substantially different and the read descriptors change between gen4 and gen4.x. On gen6, they unified layouts between read, write, and dataport. Then, on gen8, they added one bit to the message type field but left it reserved MBZ for read/write messages. This commit chooses to treat that as if they expanded the field everywhere and just didn't have enough enum values for read/write to bother with the extra bit. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d2d3e04501	intel/fs: Use SHADER_OPCODE_SEND for surface messages Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	ba3c5300f9	intel/eu: Rework surface descriptor helpers This commit pulls the surface descriptor helpers out into brw_eu.h and makes them no longer depend on the codegen infrastructure. This should allow us to use them directly from the IR code instead of the generator. This change is unfortunately less mechanical than perhaps one would like but it should be fairly straightforward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Ian Romanick	41399f4bc7	intel/compiler: Silence unused parameter warnings in brw_eu.h All of the other brw__desc functions take a devinfo parameter, and all of the others at least have an assert that uses it. Keep the parameter, but mark it as unused. Silences 37 warnings like: In file included from src/intel/common/gen_disasm.c:27:0: src/intel/compiler/brw_eu.h: In function ‘brw_pixel_interp_desc’: src/intel/compiler/brw_eu.h:377:53: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_pixel_interp_desc(const struct gen_device_info devinfo, ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-28 15:35:38 -07:00
Ian Romanick	d515c75463	intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages v2: Split changes to the message type field to another patch. Suggested by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	f347348f8a	intel/compiler: Expand untyped atomic message type field by a bit This is necessary for a new Gen9 message type that will be added in the next patch. There are also Gen8 message types that need the extra bit (mostly for bindless). v2: Split off from the next patch. Suggested by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Francisco Jerez	6f81e2b994	intel/eu: Get rid of the return value of brw_send_indirect_message(). The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	2a9605d610	intel/eu: Use descriptor constructors for dataport untyped surface messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b10b4e7c45	intel/eu: Use descriptor constructors for pixel interpolator messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8fa4bc4676	intel/eu: Use descriptor constructors for dataport write messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	2bac890bf5	intel/eu: Use descriptor constructors for dataport read messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	27c211e30f	intel/eu: Use descriptor constructors for sampler messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	1c90ae5acc	intel/eu: Provide desc immediate argument up front to brw_send_indirect_message(). The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	c3793d49e4	intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls. This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see `d2eecf0b0b` fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set__message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw__desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	d0f589a55b	intel/eu: Define helper to specify the descriptor immediates of a SEND instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	ed09e78023	intel/eu: Return new instruction to caller from brw_fb_WRITE(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00

1 2 3

118 commits