Commit graph

79 commits

Author SHA1 Message Date
Sagar Ghuge
3016756398 intel/compiler: Fix assertions in brw_alu3
v2: Fix assertion for src1 (Ian Romanick)

Fixes: 3b967e17 (intel/compiler: Avoid false positive assertions)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-03 23:14:34 -07:00
Jason Ekstrand
9e403dc56e intel/fs: Do a stalling MFENCE in endInvocationInterlock()
Fixes: 939312702e "i965: Add ARB_fragment_shader_interlock support"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-30 14:00:26 +00:00
Jason Ekstrand
859de4a748 intel/fs,vec4: Use g0 as the header for MFENCE
We set header_present but then pass it some random garbage.  Give it g0
instead.  I'm not actually sure this does anything but g0 is the usual
header data and this is what the windows driver does so it seems like a
good idea.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-30 14:00:26 +00:00
Iago Toral Quiroga
fb990bd76e intel/eu: force stride of 2 on NULL register for Byte instructions
The hardware only allows a stride of 1 on a Byte destination for raw
byte MOV instructions. This is required even when the destination
is the NULL register.

Rather than making sure that we emit a proper NULL:B destination
every time we need one, just fix it at emission time.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-18 11:05:18 +02:00
Iago Toral Quiroga
8f40d392b9 intel/compiler: set correct precision fields for 3-source float instructions
Source0 and Destination extract the floating-point precision automatically
from the SrcType and DstType instruction fields respectively when they are
set to types :F or :HF. For Source1 and Source2 operands, we use the new
1-bit fields Src1Type and Src2Type, where 0 means normal precision and 1
means half-precision. Since we always use the type of the destination for
all operands when we emit 3-source instructions, we only need set Src1Type
and Src2Type to 1 when we are emitting a half-precision instruction.

v2:
 - Set the bit separately for each source based on its type so we can
   do mixed floating-point mode in the future (Topi).

v3:
 - Use regular citation style for the comment referencing the PRM (Matt).
 - Decided not to add asserts in the emission code to check that only
   mixed HF/F types are used since such checks would break negative tests
   for brw_eu_validate.c (Matt)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-18 11:05:18 +02:00
Iago Toral Quiroga
e6b7410187 intel/compiler: allow half-float on 3-source instructions since gen8
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-18 11:05:18 +02:00
Iago Toral Quiroga
4588f4a604 intel/compiler: handle extended math restrictions for half-float
Extended math with half-float operands is only supported since gen9,
but it is limited to SIMD8. In gen8 we lower it to 32-bit.

v2: quashed together the following patches (Jason):
  - intel/compiler: allow extended math functions with HF operands
  - intel/compiler: lower 16-bit extended math to 32-bit prior to gen9
  - intel/compiler: extended Math is limited to SIMD8 on half-float

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  (allow extended math functions with HF operands,
   extended Math is limited to SIMD8 on half-float)
2019-04-18 11:05:18 +02:00
Jason Ekstrand
10b7d14c31 intel/vec4: Drop dead code for handling typed surface messages
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-02-28 16:58:20 -06:00
Jason Ekstrand
c4fb6b0c81 intel/eu: Add an EOT parameter to send_indirect_[split]_message
For split indirect sends we have to put the EOT parameter in the
extended descriptor as well as the instruction itself so just calling
brw_inst_set_eot is insufficient.  Moving the EOT handling handling into
the send_indirect_[split]_message helper lets us handle it properly.
2019-02-25 11:35:12 -06:00
Jason Ekstrand
8babaa84e8 intel/eu: Add support for the SENDS[C] messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d6a6e10390 intel/inst: Indent some code
We're about to add some more if cases so let's have the giant re-indent
in it's own patch to make review easier.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d2d3e04501 intel/fs: Use SHADER_OPCODE_SEND for surface messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
ba3c5300f9 intel/eu: Rework surface descriptor helpers
This commit pulls the surface descriptor helpers out into brw_eu.h and
makes them no longer depend on the codegen infrastructure.  This should
allow us to use them directly from the IR code instead of the generator.
This change is unfortunately less mechanical than perhaps one would like
but it should be fairly straightforward.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
5b17379631 intel/eu: Add has_simd4x2 bools to surface_write functions
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
2ce93b88c0 intel/fs: Take an explicit exec size in brw_surface_payload_size()
Instead of magically falling back to SIMD8 for atomics and typed
messages on Ivy Bridge, explicitly figure out the exec size and pass
that into brw_surface_payload_size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Matt Turner
e166003cb7 intel/compiler: Reset default flag register in brw_find_live_channel()
emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its
flag_subreg set, so that the IR knows which flag is accessed. However
the flag is only used on Gen7 in Align1 mode.

To avoid setting unnecessary bits in the instruction words, get the
information we need and reset the default flag register. This allows
round-tripping through the assembler/disassembler.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-23 22:48:29 -08:00
Jason Ekstrand
0a7ac6d543 intel/eu: Stop overriding exec sizes in send_indirect_message
For a long time, we based exec sizes on destination register widths.
We've not been doing that since 1ca3a94427 but a few remnants
accidentally remained.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-01-18 10:18:52 -06:00
Matt Turner
3b967e1724 intel/compiler: Avoid false positive assertions
A follow on patch will move the 'nr' field to the union containing the
immediate field, so prepare by checking that we're only testing these
assertions if the .file is correct.

The assertions with != ARF were kind of silly to begin with because the
<128 check is specifically only for things in the GRF.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Francisco Jerez
464e79144f intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.
I triggered this bug while prototyping code for a future platform on
IVB.  Could be a problem today though if a strided move is
copy-propagated into a type-converting move with DF destination.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Sagar Ghuge
e7598c5a62 intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region
When RepCtrl is set, the swizzle field is ignored by the hardware. In
order to ensure a 1-to-1 correspondence between the human-readable
disassembly and the binary instruction encoding always set the swizzle
to XXXX (all zeros) when it is unused due to RepCtrl

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-12-10 10:06:55 -08:00
Sagar Ghuge
0a7664fe8c intel/compiler: Change src1 reg type to unsigned doubleword
To have uniform behavior while disassembling send(c) instruction use
register type of unsigned doubleword for src1 when message descriptor is
immediate value. Bspec does not specifiy anything for src1 immediate
default type.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-10-23 12:44:24 -07:00
Ian Romanick
d515c75463 intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages
v2: Split changes to the message type field to another patch.  Suggested
by Caio.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Francisco Jerez
6643143f6e intel/eu: Assert that the instruction is send-like in brw_set_desc_ex().
Constructing a descriptor in-place as part of the immediate of an ALU
instruction is no longer supported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
6f81e2b994 intel/eu: Get rid of the return value of brw_send_indirect_message().
The return value is not used anymore.  This allows simplifying the
code slightly, and in addition it should frustrate anybody's attempts
to continue using the obsolete piecemeal approach to construct a
message descriptor in combination with brw_send_indirect_message().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
b3cce4c130 intel/eu: Get rid of the return value of brw_send_indirect_surface_message().
All users of brw_send_indirect_surface_message() should be providing a
full descriptor immediate up front by now, this isn't necessary
anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
95b5367149 intel/eu: Use descriptor constructors for dataport typed surface messages.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
94166cef40 intel/eu: Use descriptor constructors for dataport scattered byte surface messages.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
2a9605d610 intel/eu: Use descriptor constructors for dataport untyped surface messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
8e707fc2af intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message().
Instead of the current message_len, response_len and header_present
arguments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
b10b4e7c45 intel/eu: Use descriptor constructors for pixel interpolator messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:58 -07:00
Francisco Jerez
8fa4bc4676 intel/eu: Use descriptor constructors for dataport write messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
2bac890bf5 intel/eu: Use descriptor constructors for dataport read messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
27c211e30f intel/eu: Use descriptor constructors for sampler messages.
v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
1c90ae5acc intel/eu: Provide desc immediate argument up front to brw_send_indirect_message().
The current approach of returning a setup instruction where additional
descriptor fields can be specified is still supported in order to keep
things working, but it will be removed later in this series.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
b382bdde1d TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
c3793d49e4 intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls.
This replaces brw_set_message_descriptor() with the composition of
brw_set_desc() and a new inline helper function that packs the common
message descriptor controls into an integer.  The goal is to represent
all message descriptors as a 32-bit integer which is written at once
into the instruction, which is more flexible (SENDS anyone?), robust
(see d2eecf0b0b fixing an issue
ultimately caused by some bits of the extended message descriptor
being left undefined) and future-proof than the current approach of
specifying the individual descriptor fields directly into the
instruction.

This approach also seems more self-documenting, since it will allow
removing calls to functions with way too many arguments like
brw_set_*_message() and brw_send_indirect_message(), and instead
provide a single descriptor argument constructed from an appropriate
combination of brw_*_desc() helpers.

Note that because brw_set_message_descriptor() was (conditionally?)
overriding fields of the instruction which strictly speaking weren't
part of the message descriptor, this involves calling
brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition
to brw_set_desc().

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
d0f589a55b intel/eu: Define helper to specify the descriptor immediates of a SEND instruction.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-09 23:46:57 -07:00
Francisco Jerez
789d20df36 intel/eu: Fix pixel interpolator queries for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
ed09e78023 intel/eu: Return new instruction to caller from brw_fb_WRITE().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
6a9525bf67 intel/eu: Switch to a logical state stack
Instead of the state stack that's based on copying a dummy instruction
around, we start using a logical stack of brw_insn_states.  This uses a
bit less memory and is way less conceptually bogus.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
db9675f5a4 intel/eu: Set flag [sub]register number differently for 3src
Prior to gen8, the flag [sub]register number is in a different spot on
3src instructions than on other instructions.  Starting with Broadwell,
they made it consistent.  This commit fixes bugs that occur when a
conditional modifier gets propagated into a 3src instruction such as a
MAD.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
2d20303e18 intel/eu: Copy fields manually in brw_next_insn
Instead of doing a memcpy, this moves us to start with a blank
instruction (memset to zero) and copy the fields over one at a time.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
381fac2740 intel/eu: Add some brw_get_default_ helpers
This is much cleaner than everything that wants a default value poking
at the bits of p->current directly.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Plamena Manolova
939312702e i965: Add ARB_fragment_shader_interlock support.
Adds suppport for ARB_fragment_shader_interlock. We achieve
the interlock and fragment ordering by issuing a memory fence
via sendc.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-06-01 16:36:39 +01:00
Jason Ekstrand
417b9e5770 intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0
Fixes: d6cd14f213 "i965/fs: Define new shader opcode to..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-05-22 09:53:23 -07:00
Jason Ekstrand
d2eecf0b0b intel/compiler/icl: Clear "null render target" bit in extended message descriptor
Otherwise all our render target writes go no where.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Kenneth Graunke
70de61594d i965/fs: Add infrastructure for generating CSEL instructions.
v2 (idr): Don't allow CSEL with a non-float src2.

v3 (idr): Add CSEL to fs_inst::flags_written.  Suggested by Matt.

v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt).  Don't
reset the access mode afterwards (suggested by Samuel and Matt).  Add
support for CSEL not modifying the flags to more places (requested by
Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Anuj Phogat
56dc9f9f49 intel/compiler: Memory fence commit must always be enabled for gen10+
Commit bit in the message descriptor (Bit 13) must be always set
to true in CNL+ for memory fence messages. It also fixes a piglit
GPU hang on cnl+ in simulation environment.
Piglit test: arb_shader_image_load_store-shader-mem-barrier
See HSD ES # 1404612949

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-02 11:45:21 -08:00
Francisco Jerez
e7c9adca57 intel/eu: Plumb header present bit to codegen helpers for HDC messages.
This makes sure that the header-present bit of the message descriptor
is in sync with the IR instruction fields, which gives the optimizer
more control to avoid the overhead of setting up a message header when
it's possible to do so.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
6edb332b44 intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.
This shouldn't cause any functional change at this point, it changes
SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
the IR level instead of the hard-coded f1.0, now that it can be
represented in backend_instruction::flag_subreg.  This will be
necessary for scheduling to behave correctly once more things start
making use of f1.0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00