Commit graph

1694 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
5cc758558d intel/compiler: Add common function for CS dispatch info
We have this small calculations repeated in each Intel driver, so move
them to a single place to be reused.  Also includes "right_mask" since
is always used in the same context and depends on the dispatch info
values.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504>
2021-05-04 08:15:19 -07:00
Dave Airlie
52e426fd8b intel/compiler: add support for compiling fixed function gs
This is ported from i965, but the interface is cleaned up

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9721>
2021-05-04 03:39:45 +00:00
Dave Airlie
ac33e2b66b intel: move brw_ff_gs_prog_key/data to compiler.
Step one to moving the ff_gs emitter to compiler for sharing,
move BRW_MAX_SOL_BINDINGS up so the keys are in same area

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9721>
2021-05-04 03:39:45 +00:00
Jason Ekstrand
05a37e2422 intel/nir: Set lower txs with non-zero LOD
There's a recently discovered HW bug affecting hardware at least as far
back as Skylake where, if the LOD is out-of-bounds for any SIMD lane,
then garbage may be returned in all SIMD lanes.  The easy solution is to
set lower_txs_lod so that we always have a constant LOD of 0 which we
know a priori is always in-bounds.  Fortunately, not many shaders
actually use textureSize() with LOD.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19948537 -> 19948564 (<.01%)
    instructions in affected programs: 3859 -> 3886 (0.70%)
    helped: 0
    HURT: 7

One of the shaders is in Civilization: Beyond Earth, and the rest are
all in Civilization VI.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10538>
2021-05-04 00:02:43 +00:00
Jason Ekstrand
3f36e027d3 intel/fs: Don't use pixel_z for Gen4-5 source_depth_to_render_target
The source_depth_to_render_target flag can get set on old gen4-5 HW in a
few cases which are independent of the app writing gl_FragDepth.  It
should be safe to just use fetch_payload_reg in that case instead of
depending in interpolation setup.  This fixes a bug with certain very
simple shaders where we might end up not including the depth when we
should have.

While we're here, rework the logic around setting src_depth and add a
comment so it's more clear what's going on.

Fixes: 6d4070f3dd "intel/compiler: add support for fragment coordinate..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10596>
2021-05-03 23:51:51 +00:00
Jason Ekstrand
94c1e65de9 intel/eu: Set message subtype properly for SIMD8 FB fetch
There were two bugs which crep in here as part of 64551610d1:
forgetting that exec sizes in HW are in log2 space and having the
exec_size condition for the subtype backwards.

Fixes: 64551610d1 "intel/compiler: rework message descriptors..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10588>
2021-05-03 15:30:41 +00:00
Jason Ekstrand
34c560ae95 intel/fs: Stop using brw_dp_read/write_desc in Gen7+ only code
Those helpers exist primarily to sort out some of the weirdness around
Gen4-6 dataport access.  On Gen5 and earlier, everything was called
"dataport" and, instead of the SFID we have today there was a "target
cache" parameter in the descriptor.  There are also some bits that moved
around on various gens depending on read vs. write.  Starting with Gen6,
most things which target one of the data cache SFIDs should use
brw_dp_desc() instead.

v2: Drop backward comment (Ken)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Jason Ekstrand
2e7656ae2f intel/eu: SVB writes only happen on Gen6
It's a Gen6 XFB thing.  It's never used for anything else so there's no
point in having a target cache switch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
0421690f83 intel/compiler: add restrictions related to coarse pixel shading
v2: Update to BITSET_TEST()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
81f369c93b intel/compiler: add coarse pixel offset on Gfx12.5+
Gfx12.5 has a slightly different code path.

v2: Document the oddness

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
6d4070f3dd intel/compiler: add support for fragment coordinate with coarse pixels
v2: Drop new internal opcodes (Jason)
    Simplify code (Jason)

v3: Add Z computation for coarse pixels

v4: Document things a little

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
a297061524 intel/compiler: add support for fragment shading rate variable
v2: Drop old register type initializers (Jason)
    Simplify instruction snippet (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
b6332fc4a8 intel/compiler: handle coarse pixel in render target writes descriptors
v2: Use the new inst->ex_desc field (Jason)

v3: Drop CPS LoD compensation from sampler messages (Lionel)

v4: Drop useless uses_rate_shading (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
d665c2dcf0 intel/compiler: use existing helpers to pull bits of descriptors
v2: Use new RT descriptor helper

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
64551610d1 intel/compiler: rework message descriptors for render targets
Render target message descriptors are slightly different from the
dataport ones. In particular the msg_type field is on bits 14:17 for
RT while bits 14:18 for DP.

v2: Drop unused send_commit_msg field in brw_fb_write_desc() (Ken)

v3: Rebase on top renaming (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
dabaaaf6c7 intel/compiler: make sure we keep the lowest dispatch limit
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
2021-05-02 20:20:06 +00:00
Jordan Justen
3f04383521 intel/compiler: Fix INTEL_DEBUG=hex
With the missing else, this prints the compacted hex followed by hex
for an uncompacted version of the compacted instruction. It also
doesn't print hex for instructions that are not compacted.

Fixes: bc4a127d6e ("intel/disasm: Label support in shader disassembly for UIP/JIP")
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10535>
2021-04-30 01:51:23 -07:00
Jason Ekstrand
134af5ada2 intel/compiler: Don't insert barriers for NULL sources
Normally, we never see NULL in a source.  However, starting with
eab1c55590, we can with a SHADER_OPCODE_SEND if it only has the first
payload.  We were inserting barriers which adds unnecessary scheduling
dependencies and takes a lot of compile time because inserting a single
barrier is an O(n) operation.

All the extra O(n) can have a surprisingly large effect.  This cuts the
runtime of dEQP-VK.binding_model.buffer_device_address.set3.depth3.
basessbo.convertcheckuv2.store.single.std140.frag by a factor of 20x for
a debug build.

Shader-db results on ICL:

    total instructions in shared programs: 19918983 -> 19921610 (0.01%)
    instructions in affected programs: 884074 -> 886701 (0.30%)
    helped: 1688
    HURT: 817
    helped stats (abs) min: 1 max: 163 x̄: 4.23 x̃: 1
    helped stats (rel) min: 0.02% max: 12.50% x̄: 1.08% x̃: 0.61%
    HURT stats (abs)   min: 1 max: 2674 x̄: 11.95 x̃: 2
    HURT stats (rel)   min: 0.11% max: 70.22% x̄: 1.71% x̃: 1.03%
    95% mean confidence interval for instructions value: -1.97 4.06
    95% mean confidence interval for instructions %-change: -0.28% -0.06%
    Inconclusive result (value mean confidence interval includes 0).

    total cycles in shared programs: 976503324 -> 975884809 (-0.06%)
    cycles in affected programs: 82581703 -> 81963188 (-0.75%)
    helped: 4144
    HURT: 5010
    helped stats (abs) min: 1 max: 79294 x̄: 311.31 x̃: 8
    helped stats (rel) min: <.01% max: 53.69% x̄: 2.00% x̃: 0.51%
    HURT stats (abs)   min: 1 max: 92266 x̄: 134.04 x̃: 8
    HURT stats (rel)   min: <.01% max: 218.09% x̄: 3.25% x̃: 0.53%
    95% mean confidence interval for cycles value: -119.85 -15.29
    95% mean confidence interval for cycles %-change: 0.68% 1.07%
    Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

    total spills in shared programs: 10659 -> 12014 (12.71%)
    spills in affected programs: 441 -> 1796 (307.26%)
    helped: 7
    HURT: 12

    total fills in shared programs: 11551 -> 14429 (24.92%)
    fills in affected programs: 993 -> 3871 (289.83%)
    helped: 8
    HURT: 11

    total sends in shared programs: 1025832 -> 1025353 (-0.05%)
    sends in affected programs: 2241 -> 1762 (-21.37%)
    helped: 105
    HURT: 1
    helped stats (abs) min: 1 max: 87 x̄: 4.57 x̃: 2
    helped stats (rel) min: 5.56% max: 54.72% x̄: 11.37% x̃: 10.00%
    HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
    HURT stats (rel)   min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
    95% mean confidence interval for sends value: -7.39 -1.65
    95% mean confidence interval for sends %-change: -12.95% -7.70%
    Sends are helped.

    LOST:   93
    GAINED: 109

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4648
Fixes: eab1c55590 "intel/fs: Support SENDS in SHADER_OPCODE_SEND"
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10412>
2021-04-22 18:00:16 +00:00
Anuj Phogat
c144cc7889 intel: Rename calculate_gen_slm_size to intel_calculate_slm_size
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965
grep -E "calculate_gen_slm_size" -rIl $SEARCH_PATH | xargs sed -ie "s/calculate_gen_slm_size/intel_calculate_slm_size/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
07eec673fc intel: Rename eu compact instruction tests
grep -E "gen_[[:alnum:]_]{2,}" -rIl src/intel/compiler/test_eu_compact.cpp | xargs sed -ie "s/gen_\([[:alnum:]_]\{2,\}\)/test_\1/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
492da8b8c1 intel: Rename index_gen keyword to index_ver
grep -E "index_gen" -rIl src/intel/compiler | xargs sed -ie "s/index_gen/index_ver/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
0d66f0a2ee intel: Rename gens keyword to gfx_vers
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965
grep -E "gens" -rIl src/intel/compiler | xargs sed -ie "s/gens/gfx_vers/g"
Exclude changes to few comments.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
ea13901354 intel: Rename gen keyword in test_eu_validate.cpp
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
9e39e49e2c intel: Rename gen enum to gfx_ver
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965
grep -E "gen_from_devinfo" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_from_devinfo/gfx_ver_from_devinfo/g"

Few manual changes.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
47a32160eb intel: Rename brw_gen_enum.h to brw_gfx_ver_enum.h
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965
grep -E "brw_gen_enum" -rIl $SEARCH_PATH | xargs sed -ie "s/brw_gen_enum\.h/brw_gfx_ver_enum\.h/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
726d9696dd intel: Rename gen_get_device prefix to intel_get_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_get_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_get_device/intel_get_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Anuj Phogat
4c535cbf99 intel: Fix alignment and line wrapping due to gen_device renaming
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat
61e8636557 intel: Rename gen_device prefix to intel_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat
cd39d3b1ad intel: Rename gen_device prefix in filenames
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
find $SEARCH_PATH -type f -name "gen_device" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_device/intel_device}"' \;
grep -E "gen_device_info*\.[cph]" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device_info\(.*\.[cph]\)/intel_device_info\1/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat
926d343acf intel: Rename files with gen_debug prefix
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
find $SEARCH_PATH -type f -name "*gen_debug.*[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \;
grep -E "gen_debug" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_debug\./intel_debug\./g"
grep -E "GEN_DEBUG" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Matt Turner
566dc4d740 intel/eu: Add instruction compaction support on XeHP.
This patch includes a number of reworks and fixes squashed in by
Nanley Chery, Sagar Ghuge, Jordan Justen and Francisco Jerez.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
a2572a9da4 intel/fs: Add more efficient fragment coordinate calculation.
The PIXEL_X/Y opcodes used by the current implementation are broken on
XeHP due to the new regioning restrictions of the floating-point pipe.
We could have the regioning lowering pass fix it in theory by lowering
the conversions into separate MOV instructions, but that would be more
costly than this implementation that only needs a pair of pipelined
ADDs and a pair of pipelined MOVs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
a0e0dfe174 intel/fs: Introduce lowering pass to implement derivatives in terms of quad swizzles.
Unfortunately the funky Align1 regions used by the code generator in
order to implement derivatives efficiently aren't available to the
floating-point pipeline on XeHP.  We need to lower them into a number
of pipelined integer shuffle instructions followed by the
floating-point difference computation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Jordan Justen
635ed58e52 intel/compiler: Lower txd for 3D samplers on XeHP.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Jordan Justen
515ee73b4e intel/fs: End computer shader with message gateway on XeHP.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Jordan Justen
262cb08557 intel/fs: Disable 3-src immediates on XeHP.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

[ Francisco Jerez: Add TODO comment explaining why this is helpful and
  how we could better fix it. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Jordan Justen
02ce55d2b1 intel/eu: Allow 64-bit registers on XeHP.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
262b647b25 intel/compiler: Lower integer division on XeHP.
It has been removed from the hardware.

[jordan.l.justen@intel.com: Move to brw_postprocess_nir]

v2: Switch to nir_lower_idiv_precise (Rhys).
v3: Fix for interface changes of nir_lower_idiv.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Rafael Antognolli
49b2d9f428 intel/fs: Lower dword integer multiplies on XeHP.
From the BSpec:

 "When multiplying DW X DW, resulting dst can only be QW precision. If
 DW precision is required at output than MUL/MACH macro must be used."

So for now simply lower it. We might want to revisit it later.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
3f50dde8b3 intel/eu: Teach EU validator about FP/DP pipeline regioning restrictions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
f3e5cd813a intel/fs: Handle regioning restrictions of split FP/DP pipelines.
The floating-point and double-precision FPU pipelines of XeHP
platforms don't support arbitrary regioning modes, corresponding
channels of sources and destination are required to be aligned to the
same sub-register offset, similar to the restriction FP64 instructions
had on CHV/BXT platforms.

Most violations of this restriction can be fixed easily by teaching
has_dst_aligned_region_restriction() about the change so the regioning
lowering pass gets rid of any unsupported regioning.  For cases where
this is not sufficient (e.g. because a virtual instruction internally
uses some regioning mode not supported by the floating-point pipeline)
the regioning lowering pass is extended with an additional
lower_exec_type() codepath that bit-casts sources and destination to
an integer type whenever the execution type is not supported by the
instruction.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
0dc16965a9 intel/fs: Fix repclear assembly for XeHP+ regioning restrictions.
The regioning mode used here is no longer supported by the
floating-point pipeline.  We could run the regioning lowering pass in
order to fix it with some extra copies, but it's more efficient to
change the instruction to use integer types.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
05cce1f97d intel/fs: Use CHV/BXT implementation of 64-bit MOV_INDIRECT on XeHP+.
According to the hardware spec "Vx1 and VxH indirect addressing for
Float, Half-Float, Double-Float and Quad-Word data must not be used."

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
d57f3ced6c intel/fs: Calculate SWSB cross-pipeline synchronization information.
In combination with the previous changes we can just check whether an
instruction has any potentially unsatisfied dependencies on more than
one pipeline, and if so use TGL_PIPE_ALL synchronization with an
appropriate RegDist counter, otherwise synchronize with the single
pipeline it has a dependency on, if any.

Only minor difficulty is caused by the fact that the hardware doesn't
have any way to encode pipeline information when a RegDist and an SBID
dependency need to be provided simultaneously, in which case the
synchronization pipeline is inferred by the hardware.  We need to
verify that the hardware's inference will give the correct result
(which may not be the case if e.g. some data was bit-cast from a
different type), and if not emit separate SYNC instructions instead of
baking the RegDist dependency into the instruction (Note that SET SBID
dependencies must always be baked into the corresponding out-of-order
instruction).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
3f063334fc intel/fs: Represent SWSB in-order dependency addresses as vectors.
This extends the current ordered_address instruction counter to a
vector with one component per asynchronous ALU pipeline, allowing us
to track the last instruction that accessed a register separately for
each ALU pipeline of the XeHP EU, making it straightforward to
infer the right cross-pipeline synchronization annotations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

v2: Make unit tests happy (with ubsan as run by GitLab automation).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Jordan Justen
78b643fb7f Revert "intel/compiler: Silence unused parameter warning in update_inst_scoreboard"
This was a placeholder for the XeHP cross-pipeline synchronization
code, bring it back.

This reverts commit a80e44902f.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
d4537770bb intel/fs: Add helper functions inferring sync and exec pipeline of an instruction.
Define two helper functions local to the software scoreboard lowering
pass describing the behavior of the hardware and code generator:
inferred_sync_pipe() calculates the ALU pipeline the hardware will
implicitly synchronize with when a RegDist SWSB annotation is used
without providing explicit pipeline synchronization information,
inferred_exec_pipe() infers the ALU pipeline that will execute the
instruction.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez
12479abded intel/fs: Implement representation of SWSB cross-pipeline synchronization annotations.
The execution units of XeHP platforms have multiple asynchronous ALU
pipelines instead of (as far as software is concerned) the single
in-order pipeline that handled most ALU instructions except for
extended math in the original Xe.  It's now the compiler's
responsibility to identify cross-pipeline dependencies and insert
synchronization annotations whenever necessary, which are encoded as
some additional bits of the SWSB instruction field.

This commit represents the cross-pipeline synchronization annotations
as part of the existing tgl_swsb structure used for codegen.  The
existing tgl_swsb_*() helpers used by hand-crafted assembly are
extended to default to TGL_PIPE_ALL big-hammer synchronization in
order to ensure backwards compatibility with the existing assembly.
The following commits will extend the software scoreboard lowering
pass in order to keep track of cross-pipeline dependencies across IR
instructions, and insert more specific pipeline annotations in the
SWSB field.

The disassembler is also extended here to print out any existing
pipeline sync annotations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:34 +00:00
Michel Dänzer
d200f45875 Use explicit break instead of fall-through to break-only case
clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Michel Dänzer
2928c21eb7 Convert most remaining free-form fall-through comments to FALLTHROUGH
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00