Commit graph

116201 commits

Author SHA1 Message Date
Francisco Jerez
2c4c9aba30 intel/eu/gen12: Codegen pathological SEND source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
bafc9515db intel/eu/gen12: Codegen control flow instructions correctly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6e1daba3b4 intel/eu/gen12: Codegen three-source instruction source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
9fdb67aa09 intel/eu/gen12: Fix codegen of immediate source regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6cb764ae9c intel/eu/gen12: Add Gen12 opcode descriptions to the table.
Quite a lot of churn because the encoding of most hardware opcodes has
changed unfortunately.

v2: Split dot-product description fixes to separate patch (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
31182e7aa9 intel/eu/gen11+: Mark dot product opcodes as unsupported on opcode_descs table.
These instructions have been removed from the hardware.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c742be1437 intel/eu/gen12: Implement datatype binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Sagar Ghuge
a12533f2ce intel/eu/gen12: Implement immediate 64 bit constant encoding.
On Gen12, 64 bit immediate constants are loaded in reverse order. Lower
32 bit gets loaded from bit 96-127 and higher 32 bits from 64-95 in
instruction encoding.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Co-authored-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
5291283af0 intel/eu/gen12: Implement compact instruction binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
77d09d0d50 intel/eu/gen12: Implement indirect region binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
81400470be intel/eu/gen12: Implement SEND instruction binary encoding.
v2: Fix off-by-one upper GET_BITS() bound, combine 25-29 and 30-31
    descriptor fields (Ken).  Shorten name of GEN12_MD() macro, drop
    some removed TS message descriptor fields (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d24b8af23d intel/eu/gen12: Implement control flow instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
956c156dc4 intel/eu/gen12: Implement three-source instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
fa48281795 intel/eu/gen12: Implement basic instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
143176163d intel/eu/gen12: Add sanity-check asserts to brw_inst_bits() and brw_inst_set_bits().
These caught a few bugs during the development of this series.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
7e5a8638d3 intel/eu/gen12: Extend brw_inst.h macros for Gen12 support.
The encoding of almost every instruction field has changed in Gen12,
so this involves adding a Gen12+ bitfield spec to every brw_inst
macro.  In addition some new macros are required to handle certain
discontiguous and variable-length fields.

This commit doesn't actually include the Gen12 updated bitfield specs,
only the macros are extended here for reviewability.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Rename FDC() to FFDC() and FDC1() to FDC() for consistency with
    the existing F() and FF() macros.
2019-10-11 12:24:16 -07:00
Francisco Jerez
6965a02e09 intel/ir: Represent physical edge of unconditional CONTINUE instruction.
This edge doesn't exist in the original scalar program, but it
represents a potential control flow path the EU will take in cases
where control flow isn't uniform across channels of the same SIMD
thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
eeaad2992c intel/ir: Represent physical edge of ELSE instruction.
This edge doesn't exist in the original scalar program, but it
represents a potential control flow path the EU will take in cases
where the condition isn't uniform across channels of the same SIMD
thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
152754665a intel/ir: Represent logical edge of BREAK instruction.
Currently only the physical back-edge is represented, which
incidentally also leads to the exit block of the loop, but we need the
direct logical edge in addition for our logical CFG representation to
be complete.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c344c92b31 intel/ir: Add helper function to push block onto CFG analysis stack.
Requested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d6a9731d8f intel/ir: Represent physical and logical subsets of the CFG.
This represents two control flow graphs in the same cfg_t data
structure: The physical CFG that will include all possible control
flow paths the EU can physically take, and the logical CFG restricted
to the control flow paths that exist in the original scalar program.
The latter is a subset of the former because in case of divergence the
SIMD vectorized program will take control flow paths that aren't part
of the original scalar program.

The bblock_link constructor and bblock_t::add_successor() now take a
"kind" parameter that specifies whether the edge is purely physical or
whether it's part of both the logical and physical CFGs (a logical
edge is of course always guaranteed to be in the physical CFG as
well).  bblock_t::is_predecessor_of() and ::is_successor_of() also
take a kind parameter specifying which CFG is being queried.  The '~>'
notation will be used now in order to represent purely physical edges
in IR dumps.

This commit doesn't actually add nor remove any edges from the CFG
(the only edges marked as purely physical here are the two WHILE loop
ones that already existed).  Optimization passes should continue using
the same (incomplete) physical CFG they were using before until
they're fixed to do something smarter in a later commit, so this
shouldn't lead to any functional changes.

v2: Remove tabs from lines changed in this file (Caio).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
1b570456ca intel/ir: Drop hard-coded correspondence between IR and HW opcodes.
Having the IR opcodes locked to their hardware representation is risky
because it causes opcodes as different as BRC and IFF to compare equal
at the IR level (luckily the back-end only ever uses one opcode from
each group, right now), and it prevents us from supporting
instructions that change their hardware representation across
generations, which will become a problem on Gen12+ platforms.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
057902dcf8 intel/eu: Encode and decode native instruction opcodes from/to IR opcodes.
Change brw_inst_set_opcode() and brw_inst_opcode() to call
brw_opcode_encode/decode() transparently in order to translate between
hardware and IR opcodes, and update the EU compaction code in order to
do the same as needed, so we can eventually drop the one-to-one
correspondence between hardware and IR opcodes.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
25dd67099d intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode.
This rewrites the current opcode description tables as a more compact
flat data structure.  The purpose is to allow efficient constant-time
look-up by either HW or IR opcode, which will allow us to drop the
hard-coded correspondence between HW and IR opcodes -- See the next
commits for the rationale.

brw_eu.c is now built as C++ source so we can take advantage of
pointers to member in order to make the look-up function work
regardless of the opcode_desc member used as look-up key.

v2: Optimize devinfo struct comparison (Caio)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
51dc40cefb intel/eu: Fix up various type conversions in brw_eu.c that are illegal C++.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
35bcd08d61 intel/eu: Split brw_inst ex_desc accessors for SEND(C) vs. SENDS(C).
The brw_inst opcode accessors are going away in one of the following
commits.  We could potentially replace them with the new helpers that
do opcode remapping, but that would lead to a circular dependency
between brw_inst.h and brw_eu.h.  This way we also avoid ordering
issues that can cause the semantics of the ex_desc accessors to change
depending on whether the ex_desc field is set after or before the
opcode instruction field.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
b2ae65c7d9 intel/fs: Fix constness of implied_mrf_writes() argument.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6f275a863d intel/fs: Define is_send() convenience IR helper.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
f326d9d218 intel/fs: Define is_payload() method of the IR instruction class.
This is required because SEND message payload sources are fetched
asynchronously by the hardware, which can lead to WaR data corruption
on Gen12+ platforms if not handled specially by the compiler to
guarantee proper synchronization.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
a42581fa8f intel/fs: Teach fs_inst::is_send_from_grf() about some missing send-like instructions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Bas Nieuwenhuizen
6da3bf2600 nir/dead_cf: Remove dead control flow after infinite loops.
And after discard-only loops. Otherwise we end up with dead code
which confuses nir_repair_ssa into adding a whole bunch of uses
of undefined. However, for derefs, we sometimes always expect to
get a variable instead of undefined.

Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv.

Fixes: c832820ce9 "nir/dead_cf: Repair SSA if the pass makes progress"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-11 17:24:26 +02:00
Rhys Perry
f13ad839f1 aco: don't use p_as_uniform for vgpr sampler/image indices
p_as_uniform can get CSE'd, which can be incorrect and break some
dEQP-VK.descriptor_indexing.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
0c3fe323b6 aco: implement divergent vulkan_resource_index
Fixes the UBO/SSBO dEQP-VK.descriptor_indexing.* tests

v2: remove bld.copy() usage

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
5526a557ee aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit()
This can happen when bcsel is used between the results of two
vulkan_resource_index. It's also probably needed for non-uniform
descriptor indexing

Fixes dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.reads_opselect_two_buffers

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
45d6c69b99 aco: use can_accept_constant in valu_can_accept_literal
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
b37857bcea aco: don't apply sgprs/constants to read/write lane instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
599d634c2c nir/lower_input_attachments: pass on non-uniform access flag
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-11 14:26:58 +00:00
Rhys Perry
5ef04d7982 nir/lower_non_uniform: lower image/texture instructions taking derefs
v2: always assert on the texture/sampler handle's num_components
v3: replicate the deref inside the loop
v4: remove a case of useless line wrapping

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-11 14:26:58 +00:00
Jonathan Marek
7e3b900c80 etnaviv: rework etna_resource_create tiling choice
Now that the base resource is allowed to be incompatible with PE, we can
make a smarter choice of tiling mode to avoid allocating a PE compatible
base that is never used for regular textures. This affects GPUs like GC2000
where there is no tiling compatible with both PE and TE.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Jonathan Marek
b962776530 etnaviv: rework compatible render base
For PE-incompatible layouts, use a mechanism similar to what texture does
to create a compatible base resource.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Jonathan Marek
e7e02435a8 etnaviv: get addressing mode from tiling layout
Remove the "addressing_mode" state, which is currently set incorrectly, and
instead deduce the addressing mode from the tiling layout.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Jonathan Marek
5403b36653 etnaviv: clear texture cache and flush ts when texture is modified
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Christian Gmeiner
6dc650fe71 etnaviv: output the same shader-db format as freedreno, v3d and intel
This lets us reuse their report.py.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-11 12:35:15 +02:00
Christian Gmeiner
140bc0f040 etnaviv: nir: start to make use of compile_error(..)
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-11 11:37:03 +02:00
Michel Dänzer
d60b8679a4 gitlab-ci: Disable meson-mingw32-x86_64 job again for now
The wrapdb.mesonbuild.com SSL certificate expired, causing the job to
fail: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/731864

Switching to http:// doesn't avoid it:
https://gitlab.freedesktop.org/daenzer/mesa/-/jobs/732043
2019-10-11 11:10:01 +02:00
Michel Dänzer
eb86cbabe6 gitlab-ci: Add .use-debian-10 template
It simplifies the definitions of jobs using the Debian 10 image.

The needs: was previously missing from the llvmpipe/softpipe test jobs,
so they could spuriously run if the debian-10 job failed or was
cancelled.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-11 10:05:21 +02:00
Michel Dänzer
9691329727 gitlab-ci: Remove redundant .meson-cross template script
It was identical to the one inherited from the .meson-build template.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-11 10:04:45 +02:00
Dave Airlie
f59ff014b1 gallivm: fix coroutines on aarch64 with llvm 8
The coroutine split pass is missing a dependency before LLVM 9.0,
and fails to initialise properly if the CallGraphWrapperPass hasn't
be initialised earlier (x86 does it due to some of it's passes
requiring it).

This is a workaround for llvm 8 (coroutines are only supported in 8
and higher). It adds another pass that has a dependency on the pass
the coroutines split requires. This pass shouldn't have any raal
effects.

Fixes: d32690b43c (gallivm: add coroutine pass manager support)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 12:15:45 +10:00
Dave Airlie
05b008c961 llvmpipe: add support for tg4 component selection.
This is needed as part of GLES3.1 and helps for ARB_gpu_shader5.

Fixes: KHR-GLES31.core.texture_gather.* cases
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 00:32:15 +00:00
Dave Airlie
a70f0a8841 st/glsl: add support for alternate TG4 encoding.
This will encode the component selection value (0, 1, 2, 3) into
the X swizzle of the sampler, if the driver requests it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 00:32:15 +00:00