fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 09:10:11 +01:00

Author	SHA1	Message	Date
Jason Ekstrand	ba3c5300f9	intel/eu: Rework surface descriptor helpers This commit pulls the surface descriptor helpers out into brw_eu.h and makes them no longer depend on the codegen infrastructure. This should allow us to use them directly from the IR code instead of the generator. This change is unfortunately less mechanical than perhaps one would like but it should be fairly straightforward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	5b17379631	intel/eu: Add has_simd4x2 bools to surface_write functions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	2ce93b88c0	intel/fs: Take an explicit exec size in brw_surface_payload_size() Instead of magically falling back to SIMD8 for atomics and typed messages on Ivy Bridge, explicitly figure out the exec size and pass that into brw_surface_payload_size. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Matt Turner	e166003cb7	intel/compiler: Reset default flag register in brw_find_live_channel() emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its flag_subreg set, so that the IR knows which flag is accessed. However the flag is only used on Gen7 in Align1 mode. To avoid setting unnecessary bits in the instruction words, get the information we need and reset the default flag register. This allows round-tripping through the assembler/disassembler. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-01-23 22:48:29 -08:00
Jason Ekstrand	0a7ac6d543	intel/eu: Stop overriding exec sizes in send_indirect_message For a long time, we based exec sizes on destination register widths. We've not been doing that since `1ca3a94427` but a few remnants accidentally remained. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-01-18 10:18:52 -06:00
Matt Turner	3b967e1724	intel/compiler: Avoid false positive assertions A follow on patch will move the 'nr' field to the union containing the immediate field, so prepare by checking that we're only testing these assertions if the .file is correct. The assertions with != ARF were kind of silly to begin with because the <128 check is specifically only for things in the GRF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Francisco Jerez	464e79144f	intel/eu/gen7: Fix brw_MOV() with DF destination and strided source. I triggered this bug while prototyping code for a future platform on IVB. Could be a problem today though if a strided move is copy-propagated into a type-converting move with DF destination. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:08 -08:00
Sagar Ghuge	e7598c5a62	intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region When RepCtrl is set, the swizzle field is ignored by the hardware. In order to ensure a 1-to-1 correspondence between the human-readable disassembly and the binary instruction encoding always set the swizzle to XXXX (all zeros) when it is unused due to RepCtrl Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-12-10 10:06:55 -08:00
Sagar Ghuge	0a7664fe8c	intel/compiler: Change src1 reg type to unsigned doubleword To have uniform behavior while disassembling send(c) instruction use register type of unsigned doubleword for src1 when message descriptor is immediate value. Bspec does not specifiy anything for src1 immediate default type. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2018-10-23 12:44:24 -07:00
Ian Romanick	d515c75463	intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages v2: Split changes to the message type field to another patch. Suggested by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Francisco Jerez	6643143f6e	intel/eu: Assert that the instruction is send-like in brw_set_desc_ex(). Constructing a descriptor in-place as part of the immediate of an ALU instruction is no longer supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6f81e2b994	intel/eu: Get rid of the return value of brw_send_indirect_message(). The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b3cce4c130	intel/eu: Get rid of the return value of brw_send_indirect_surface_message(). All users of brw_send_indirect_surface_message() should be providing a full descriptor immediate up front by now, this isn't necessary anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	95b5367149	intel/eu: Use descriptor constructors for dataport typed surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	94166cef40	intel/eu: Use descriptor constructors for dataport scattered byte surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	2a9605d610	intel/eu: Use descriptor constructors for dataport untyped surface messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8e707fc2af	intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message(). Instead of the current message_len, response_len and header_present arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b10b4e7c45	intel/eu: Use descriptor constructors for pixel interpolator messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8fa4bc4676	intel/eu: Use descriptor constructors for dataport write messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	2bac890bf5	intel/eu: Use descriptor constructors for dataport read messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	27c211e30f	intel/eu: Use descriptor constructors for sampler messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	1c90ae5acc	intel/eu: Provide desc immediate argument up front to brw_send_indirect_message(). The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	b382bdde1d	TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	c3793d49e4	intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls. This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see `d2eecf0b0b` fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set__message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw__desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	d0f589a55b	intel/eu: Define helper to specify the descriptor immediates of a SEND instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	789d20df36	intel/eu: Fix pixel interpolator queries for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	ed09e78023	intel/eu: Return new instruction to caller from brw_fb_WRITE(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	6a9525bf67	intel/eu: Switch to a logical state stack Instead of the state stack that's based on copying a dummy instruction around, we start using a logical stack of brw_insn_states. This uses a bit less memory and is way less conceptually bogus. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	db9675f5a4	intel/eu: Set flag [sub]register number differently for 3src Prior to gen8, the flag [sub]register number is in a different spot on 3src instructions than on other instructions. Starting with Broadwell, they made it consistent. This commit fixes bugs that occur when a conditional modifier gets propagated into a 3src instruction such as a MAD. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	2d20303e18	intel/eu: Copy fields manually in brw_next_insn Instead of doing a memcpy, this moves us to start with a blank instruction (memset to zero) and copy the fields over one at a time. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	381fac2740	intel/eu: Add some brw_get_default_ helpers This is much cleaner than everything that wants a default value poking at the bits of p->current directly. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Plamena Manolova	939312702e	i965: Add ARB_fragment_shader_interlock support. Adds suppport for ARB_fragment_shader_interlock. We achieve the interlock and fragment ordering by issuing a memory fence via sendc. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-06-01 16:36:39 +01:00
Jason Ekstrand	417b9e5770	intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0 Fixes: `d6cd14f213` "i965/fs: Define new shader opcode to..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-22 09:53:23 -07:00
Jason Ekstrand	d2eecf0b0b	intel/compiler/icl: Clear "null render target" bit in extended message descriptor Otherwise all our render target writes go no where. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Kenneth Graunke	70de61594d	i965/fs: Add infrastructure for generating CSEL instructions. v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Anuj Phogat	56dc9f9f49	intel/compiler: Memory fence commit must always be enabled for gen10+ Commit bit in the message descriptor (Bit 13) must be always set to true in CNL+ for memory fence messages. It also fixes a piglit GPU hang on cnl+ in simulation environment. Piglit test: arb_shader_image_load_store-shader-mem-barrier See HSD ES # 1404612949 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-02 11:45:21 -08:00
Francisco Jerez	e7c9adca57	intel/eu: Plumb header present bit to codegen helpers for HDC messages. This makes sure that the header-present bit of the message descriptor is in sync with the IR instruction fields, which gives the optimizer more control to avoid the overhead of setting up a message header when it's possible to do so. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	6edb332b44	intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL. This shouldn't cause any functional change at this point, it changes SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at the IR level instead of the hard-coded f1.0, now that it can be represented in backend_instruction::flag_subreg. This will be necessary for scheduling to behave correctly once more things start making use of f1.0. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Matt Turner	432674ce93	intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+ The PLN instruction is no more. Its functionality is now implemented using two MAD instructions with the new native-float type. Instead of pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F we now have mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F ... and in the case of SIMD8 only the first pair of MAD instructions is used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	2cff324210	intel/compiler: Add Gen11+ native float type This new type exposes the additional precision offered by the accumulator register and will be used in the next patch to implement the functionality of the PLN instruction using a pair of MAD instructions. One weird thing to note: align1 ternary instructions may only have an accumulator in the dst or src1 normally, but when src0's type is :NF the accumulator is read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	01ebfbb67a	i965/fs: Add/use functions to convert to 3src_align1 vstride/hstride Some cases weren't handled, such as stride 4 which is needed for 64-bit operations. Presumably fixes the assertion failure mentioned in commit `2d04572038` (Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+") but who can really say since the commit neglected to list any of them! Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-01-11 10:11:59 -08:00
Jose Maria Casanova Crespo	c57a3f200d	i965/fs: Add byte scattered read message and fs support v2: Fix alignment style (Topi Pohjolainen) (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_read. v3: (Jason Ekstrand) - Use renamed brw_byte_scattered_data_element_from_bit_size method - Assert scattered read for Gen8+ and Haswell. - Use conditional expresion at components_read. - Include comment about params for scattered opcodes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	f1a9936ee1	i965/fs: Add byte scattered write message and fs support v2: (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_write. v3: - Remove leftover newline (Topi Pohjolainen) - Rename brw_data_size to brw_scattered_data_element and use defines instead of an enum (Jason Ekstrand) - Assert scattered write for Gen8+ and Haswell (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	d6cd14f213	i965/fs: Define new shader opcode to set rounding modes Although it is possible to emit them directly as AND/OR on brw_fs_nir, having a specific opcode makes it easier to remove duplicate settings later. v2: (Curro) - Set thread control to 'switch' when using the control register - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode. - Avoid magic numbers setting rounding mode field at control register. v3: (Curro) - Remove redundant and add missing whitespace lines. - Match printing instruction to IR opcode "rnd_mode" v4: (Topi Pohjolainen) - Fix code style. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jason Ekstrand	ab378734f5	intel/eu: Explicitly set EXECUTE_1 where needed Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	8280560705	intel/eu: Make automatic exec sizes a configurable option We have had a feature in codegen for some time that tries to automatically infer the execution size of an instruction from the width of its destination. For things such as fixed function GS, clipper, and SF programs, this is very useful because they tend to have lots of hand-rolled register setup and trying to specify the exec size all the time would be prohibitive. For things that come from a higher-level IR, however, it's easier to just set the right size all the time and the automatic exec sizes can, in fact, cause problems. This commit makes it optional while enabling it by default. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	6041a31e77	intel/eu: Fix broadcast instruction for 64-bit values on little-core We're not using broadcast for any 32-bit types right now since we mostly use it for emit_uniformize on 32-bit buffer indices. However, SPIR-V subgroups are going to need it for 64-bit so let's make it work. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	068beb41d8	intel/eu: Just modify the offset in brw_broadcast This means we have to drop const from a variable but it also means that 100% of the code which deals with the offset limit is in one place. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	e3bcc86133	intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST These restrictions effectively already existed due to the way we use indirect sources but weren't being directly enforced. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	562b8d458c	intel/eu: Use EXECUTE_1 for JMPI The PRM says "The execution size must be 1." In `73137997e2`, the execution size was set to 1 when it should have been BRW_EXECUTE_1 (which maps to 0). Later, in `dc2d3a7f5c`, JMPI was used for line AA on gen6 and earlier and we started manually stomping the exeution size to BRW_EXECUTE_1 in the generator. This commit fixes the original bug and makes brw_JMPI just do the right thing. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `73137997e2`	2017-10-25 16:14:09 -07:00

1 2 3

117 commits