Commit graph

2263 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
ce0b72a13a intel/fs: Don't emit_uniformize when getting a constant SSBO index
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7340>
2020-10-29 21:54:01 +00:00
Ian Romanick
67956689bb nir: Rename replicated-result dot-product instructions
All these instructions replicate the result of a N-component dot-product
to a vec4.  Naming them fdot_replicatedN gives the impression that are
some sort of abstract dot-product that replicates the result to a vecN.
They also deviate from fdph_replicated... which nobody would reasonably
consider naming fdot_replicatedh.

Naming these opcodes fdotN_replicated more closely matches what they
are, and it matches the pattern of fdph_replicated.

I believe that the only reason these opcodes were named this way was
because it simplified the implementation of the binop_reduce function in
nir_opcodes.py.  I made some fairly simple changes to that function, and
I think the end result is ok.

The bulk of the changes come from the sed rename:

    sed --in-place -e 's/fdot_replicated\([234]\)/fdot\1_replicated/g' \
        $(grep -r 'fdot_replicated[234]' src/)

v2: Use a named parameter to binop_reduce instead of using
isinstance(name, str).  Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5725>
2020-10-22 18:00:19 +00:00
Caio Marcelo de Oliveira Filho
e7e24d5039 intel/fs: Handle nir_intrinsic_terminate
For terminate operation, jump the invocation without predicating on
the rest of the quad being disabled -- which is what is done for
demote and discard.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7150>
2020-10-15 21:40:09 +00:00
Ian Romanick
262ca98b3a intel/compiler: Remove Gen10-specific code
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>
2020-10-15 09:29:53 -07:00
Kenneth Graunke
341f5bffb7 intel/compiler, anv: Delete cs_prog_data->slm_size
cs_prog_data->slm_size is basically redundant with
prog_data->total_shared, which is the field that we actually use for
controlling the shared local memory size in all drivers.  We were
still using it in one place for VK_EXT_pipeline_executable_properties,
but we should just fix that and delete the field.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>
2020-10-14 23:13:41 +00:00
Jason Ekstrand
f8117f7051 intel/fs: Allow constant-propagation into SAMPLEINFO and IMAGE_SIZE
Without this, we end up with indirect sampler messages all the time
because we don't propagate the texture/image BTI.  This makes debugging
shaders with imageSize or textureSamples in them a pain.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19720612 -> 19720564 (<.01%)
    instructions in affected programs: 4998 -> 4950 (-0.96%)
    helped: 12
    HURT: 0

All affected shaders were compute shaders in Deus Ex: Mankind Divided.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6794>
2020-10-14 21:35:30 +00:00
Jason Ekstrand
5abac85177 intel/fs: Rework scratch handling on Gen9+
The current scratch mechanism uses an MRF hack where we reserve a few
GRF registers to treat like the MRF and we collect the data into that
MRF region before doing a scratch write.  We also use that region for
the header for scratch reads.

This commit changes things and gets rid of the MRF hack.  Instead, we
reserve a single register (which RA is free to pick) for the scratch
header and uses split sends for scratch writes to avoid having to do
the copy.  This should provide RA with more freedom in the presence of
spilling as well as avoid some unnecessary data moves.  In future, the
new GEN9_SCRATCH_HEADER opcode gives us a place where we can do our own
per-thread scratch base address calculations rather than depending on
the scratch base address that gets pushed into g0.  Having an opcode for
this lets us do it once at the top of the shader rather than repeating
it at every read/write.

One other noticeable difference is the use of SHADER_OPCODE_SEND.  We
can get away with this thanks to the fact that we're now using a set to
track which instructions are generated by spills and don't rely on the
opcodes to find spill/fill instructions.  This allows us to avoid adding
more virtual opcodes and let the normal code paths handle things like
scoreboard dependencies between header setup and the SEND.  It also
means that post-RA scheduling may be able to space out the header setup
MOV and the SEND for better latency hiding.

Shader-db results on Skylake:

    total spills in shared programs: 12137 -> 10604 (-12.63%)
    spills in affected programs: 6685 -> 5152 (-22.93%)
    helped: 274
    HURT: 2

    total fills in shared programs: 13065 -> 11515 (-11.86%)
    fills in affected programs: 9007 -> 7457 (-17.21%)
    helped: 275
    HURT: 1

Shader-db results on Ice Lake:

    total spills in shared programs: 12482 -> 10953 (-12.25%)
    spills in affected programs: 6586 -> 5057 (-23.22%)
    helped: 275
    HURT: 0

    total fills in shared programs: 12819 -> 11234 (-12.36%)
    fills in affected programs: 7867 -> 6282 (-20.15%)
    helped: 274
    HURT: 0

Shader-db results on Tigerlake:

    total spills in shared programs: 11689 -> 10233 (-12.46%)
    spills in affected programs: 4740 -> 3284 (-30.72%)
    helped: 259
    HURT: 0

    total fills in shared programs: 10840 -> 9443 (-12.89%)
    fills in affected programs: 6244 -> 4847 (-22.37%)
    helped: 259
    HURT: 0

Fossil-db results on Ice Lake:

    Spills in all programs: 245249 -> 201633 (-17.8%)
    Fills in all programs: 366066 -> 314368 (-14.1%)

More practically, this seems to give about a 0.5-1% perf boost in
Witcher 3 (DXVK) and Shadow of the Tomb Raider (Vulkan native).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
e557af9781 intel/fs/ra: Use a set to track added spill/fill instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
f650c4c0c6 intel/fs/ra: Sanity-check our IP counts
Starting with e99081e76d, we don't re-construct liveness information
every time we spill a register.  Instead, we're very careful to track
which instructions are spill instructions and not contribute those to
the IP count so that we can continue to use the old liveness information
even though instructions have been added.  This commit adds an assert
that sanity-checks that we count the same number of instructions as our
liveness information is based on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
d80d0a6ced intel/fs/ra: Store the last non-spill VGRF node
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
2af6528c33 intel/fs/ra: Refactor handling of Gen7 scratch reads
The attempt at de-duplication with the gen7_read Boolean wasn't actually
saving us anything.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
74a1843ca0 intel/fs/ra: Increment spill_offset as part of the emit_spill loop
This makes it consistent with our handling of src.offset and with our
handling of spill_offset in emit_unspill.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
06ebf23283 intel/fs: Add a SCRATCH_HEADER opcode
This opcode is responsible for setting up the buffer base address and
per-thread scratch space fields of a scratch message header.  For the
most part, it's a copy of g0 but some messages need us to zero out g0.2
and the bottom bits of g0.5.

This may actually fix a bug when nir_load/store_scratch is used.  The
docs say that the DWORD scattered messages respect the per-thread
scratch size specified in gN.3[3:0] in the message header but we've been
leaving it zero.  This may mean that we've been ignoring any scratch
reads/writes from a load/store_scratch intrinsic above the 1KB mark.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Jason Ekstrand
24b64c8408 intel/fs: Copy the PTSS from g0 for scratch reads/writes
In theory, this fixes a bug where we were dropping the PTSS bound on the
floor.  The hardware docs claim that the A32 DWORD and BYTE scattered
read/write messages do a PTSS bounds check.   However, in practice, it
seems that the hardware ignores the bounds check so this doesn't
actually matter.  I verified this with the following couple of piglit
tests:

    https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/399

In practice, this prevents the next commit from making a subtle
behavioral change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
2020-10-13 21:59:27 +00:00
Rhys Perry
8850a63161 radv/aco,nir/lower_subgroups: don't lower elect
ACO can implement this better.

fossil-db (Navi):
Totals from 33 (0.02% of 135946) affected shaders:
SGPRs: 1736 -> 1744 (+0.46%)
VGPRs: 1680 -> 1656 (-1.43%)
CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04%
MaxWaves: 449 -> 461 (+2.67%)
Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05%
Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>
2020-10-13 12:47:20 +00:00
Timur Kristóf
f11f4a2a4d nir: Add ability to count primitives per stream.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>
2020-10-09 15:26:14 +02:00
Timur Kristóf
aac5adc3c2 nir: Count vertices per stream.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>
2020-10-09 15:26:14 +02:00
Timur Kristóf
2be99012e9 nir: Add ability to count emitted GS primitives.
Add an option to nir_lower_gs_intrinsics which tells it to track
the number of emitted primitives, not just vertices. Additionally,
also make it per-stream.

Also rename the set_vertex_count intrinsic to
set_vertex_and_primitive_count.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>
2020-10-09 15:26:14 +02:00
Jason Ekstrand
3d22de05ca intel/fs: Add an option to use dataport messages for UBOs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>
2020-10-08 01:17:06 -05:00
Jason Ekstrand
0d462dbee5 intel/fs: Add an alignment to VARYING_PULL_CONSTANT_LOAD_LOGICAL
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>
2020-10-08 01:14:46 -05:00
Jason Ekstrand
dd9c34a907 intel/nir: Lower load_global_constant in lower_mem_access_bit_sizes
It's identical to nir_intrinsic_load_global except that it works on data
that's guaranteed to be constant throughout the shader invocation.

Fixes: ff2f44d865 "intel/fs: Implement nir_intrinsic_load_global_constant"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872>
2020-10-08 03:56:01 +00:00
Jason Ekstrand
fd04f858b0 intel/nir: Don't try to emit vector load_scratch instructions
In 53bfcdeecf, we added load/store_scratch instructions which deviate
a little bit from most memory load/store instructions in that we can't
use the normal untyped read/write instructions which can read and write
up to a vec4 at a time.  Instead, we have to use the DWORD scattered
read/write instructions which are scalar.  To handle this, we added code
to brw_nir_lower_mem_access_bit_sizes to cause them to be scalarized.
However, one case was missing: the load-as-larger-vector case.  In this
case, we take small bit-sized constant-offset loads replace it with a
32-bit load and shuffle the result around as needed.

For scratch, this case is much trickier to get right because it often
emits vec2 or wider which we would then have to lower again.  We did
this for other load and store ops because, for lower bit-sizes we have
to scalarize thanks to the byte scattered read/write instructions being
scalar.  However, for scratch we're not losing as much because we can't
vectorize 32-bit loads and stores either.  It's easier to just disallow
it whenever we have to scalarize.

Fixes: 53bfcdeecf "intel/fs: Implement the new load/store_scratch..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872>
2020-10-08 03:56:01 +00:00
Jason Ekstrand
9df9f940f0 iris: Add support for load_work_dim as a system value
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>
2020-10-07 16:01:31 -05:00
Marcin Ślusarz
9c25689287 intel: drop likely/unlikely around INTEL_DEBUG
It's included in declaration of INTEL_DEBUG.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>
2020-10-06 18:43:07 +00:00
Vinson Lee
81cd4c8f59 intel/vec4: Remove leftover code from Gen8+ removal.
Remove code missed in commit 2a49007411 ("intel/vec4: Remove all
support for Gen8+ [v2]").

Fix defect reported by Coverity Scan.

Logically dead code (DEADCODE)
dead_error_begin: Execution cannot reach this statement:
mcs.swizzle = 80U;

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6927>
2020-10-03 03:53:46 +00:00
Jason Ekstrand
8427e56067 intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs
When I copied and pasted the code from MOV_INDIRECT for handling the
dependency controls, I missed a subtle difference between MOV_INDIRECT
and SHUFFLE.  Specifically, MOV_INDIRECT gets lowered to a narrow
instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to
split it in the generator.  Therefore, the check safety check for
whether or not we can use dependency control has to be based on the
lowered width rather than the width of the original instruction.

Fixes: a8ac61b0ee "intel/fs: NoMask initialize the address..."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989>
2020-10-02 19:53:56 +00:00
Jason Ekstrand
a8ac61b0ee intel/fs: NoMask initialize the address register for shuffles
Cc: mesa-stable@lists.freedesktop.org
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2979
Tested-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6825>
2020-10-02 00:42:56 +00:00
Eric Anholt
618556a8cb nir: Drop the high_offset argument to the load_store_vectorizer filter.
Nothing uses it, and it's not clear to me what it provides over
alignment/num_components/bit_size.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
2020-09-30 19:53:43 +00:00
Eric Anholt
5f757bb95c nir: Make the load_store_vectorizer provide align_mul + align_offset.
It was passing an encoding of the two that wasn't good for ensuring "Don't
combine loads that would make us straddle a vec4 boundary" for
nir_lower_ubo_vec4.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
2020-09-30 19:53:43 +00:00
Connor Abbott
b2ede6280c intel/nir: Use nir control flow helpers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866>
2020-09-30 15:47:51 +00:00
Ian Romanick
1d71b1a311 intel/vec4: Remove everything related to VS_OPCODE_SET_SIMD4X2_HEADER_GEN9
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:10 -07:00
Ian Romanick
2a49007411 intel/vec4: Remove all support for Gen8+ [v2]
v2: Restore the gen == 10 hunk in brw_compile_vs (around line 2940).
This function is also used for scalar VS compiles.  Squash in:

    intel/vec4: Reindent after removing Gen8+ support
    intel/vec4: Silence unused parameter warning in try_immediate_source

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:10 -07:00
Ian Romanick
60e1d0f028 intel/compiler: Remove INTEL_SCALAR_... env variables
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:10 -07:00
Ian Romanick
d0ce24c8ca intel/vec4: Remove inline lowering of LRP
Since dd7135d55d ("intel/compiler: Use the flrp lowering pass for all
stages on Gen4 and Gen5"), it's not possible to get to this function on
GPUs that don't have a LRP instruction.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:10 -07:00
Ian Romanick
86bab92aa4 intel/compiler: Don't fallback to vec4 when scalar GS compile fails [v2]
v2: Add missing error string handling.  Noticed by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:04 -07:00
Ian Romanick
92f08860c9 intel/compiler: Silence unused parameter warning in brw_surface_payload_size
src/intel/compiler/brw_eu_emit.c: In function ‘brw_surface_payload_size’:
src/intel/compiler/brw_eu_emit.c:3070:46: warning: unused parameter ‘p’ [-Wunused-parameter]
 3070 | brw_surface_payload_size(struct brw_codegen *p,
      |                          ~~~~~~~~~~~~~~~~~~~~^

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:04 -07:00
Ian Romanick
9bcdca2455 intel/vec4: Silence unused paramter warnings in brw_vec4_generator.cpp
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_gs_svb_write(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:488:49: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
  488 |                       struct brw_vue_prog_data *prog_data,
      |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_pull_constant_load(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1269:55: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
 1269 |                             struct brw_vue_prog_data *prog_data,
      |                             ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_get_buffer_size(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1331:52: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
 1331 |                          struct brw_vue_prog_data *prog_data,
      |                          ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_pull_constant_load_gen7(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1357:60: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
 1357 |                                  struct brw_vue_prog_data *prog_data,
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:04 -07:00
Danylo Piliaiev
77486db867 intel/fs: Disable sample mask predication for scratch stores
Scratch stores are being lowered to the instructions with side-effects,
however they should be enabled in fs helper invocations, since they
are produced from operations which don't imply side-effects.

To fix this - we move the decision of whether the sample mask predication
is enable to the point where logical brw instructions are created.

GLSL example of the issue:

 int tmp[1024];
 ...
 do {
   // changes to tmp
 } while (some_condition(tmp))

If `tmp` is lowered to scrach memory, `some_condition` would be
undefined if scratch write is predicated on sample mask, making
possible for the while loop to become infinite and hang the GPU.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3256
Fixes: 53bfcdeecf
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6056>
2020-09-25 09:48:06 +00:00
Kenneth Graunke
140f53e646 Revert "nir: replace lower_ffma and fuse_ffma with has_ffma"
This reverts commit 939ddf3f67.

Intel has a separate pass for fusing FFMAs selectively.  We split
these flags in commit 1b72c31e1f and
the reasoning still stands.  The patch being reverted was just a
cleanup, so there should be no issue with reverting it.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>
2020-09-24 13:11:50 -07:00
Marek Olšák
939ddf3f67 nir: replace lower_ffma and fuse_ffma with has_ffma
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>
2020-09-24 12:29:11 +00:00
Marek Olšák
771aad3027 nir: split lower_ffma into lower_ffma16/32/64
AMD wants different behavior for each bit size

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>
2020-09-24 12:29:11 +00:00
Jason Ekstrand
9750164c09 nir: Rename get_buffer_size to get_ssbo_size
This makes it explicit that this intrinsic is only for SSBOs.  For the
v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be
able to distinguish between the two.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812>
2020-09-22 13:34:12 +00:00
Lionel Landwerlin
cc3bf00cc2 intel/compiler: fixup Gen12 workaround for array sizes
We didn't handle the case of NULL images/textures for which we should
return 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 397ff2976b ("intel: Implement Gen12 workaround for array textures of size 1")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3522
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6729>
2020-09-21 21:20:09 +00:00
Jason Ekstrand
f63ffc18e7 intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP
It's not really unordered in the sense that it can still stall on
ordered things and we don't need a SYNC_NOP for that because it is a
SYNC_NOP.  However, it also doesn't count when computing instruction
distances.

Fixes: 18e72ee210 "intel/fs: Add FS_OPCODE_SCHEDULING_FENCE"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6781>
2020-09-20 14:43:40 +00:00
Gert Wollny
80cde3ad55 intel/compiler: Set lower_uniform_to_ubo compiler flag
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6316>
2020-09-16 10:07:42 +00:00
Marcin Ślusarz
18eb853ac8 intel/compiler: quiet Coverity warnings
Coverity complains about possible out-of-bounds write & read, because
it thinks that "loc + i" can be bigger than sizes of the 2 used arrays.

It's not obvious from the code it cannot happen, so add asserts here.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
2020-09-10 12:16:58 +00:00
Marcin Ślusarz
5ea0b6a9c6 intel/compiler: initialize remaining fields of various classes
These variables seem to be initialized before being used, so this
patch is not fixing any bug, but leaving them unitialized may become
a bug after some refactoring.

These classes were affected: fs_reg_alloc, fs_visitor, fs_generator,
instruction_scheduler.

Found by Coverity.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
2020-09-10 12:16:58 +00:00
Marcin Ślusarz
40b964dc8f intel/compiler: remove unused fs_validator::param_size
Found by Coverity as unitialized variable.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
2020-09-10 12:16:58 +00:00
Jason Ekstrand
3bd7c3c9db intel/nir: Call validate_ssa_dominance at both ends of the NIR compile
This invokes it before we go into the optimization/lowering pass and
then right before we go out of SSA.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
2020-09-08 19:44:01 +00:00
Marcin Ślusarz
64b0b7c274 intel/compiler: fix typo in a comment
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602>
2020-09-04 17:38:25 +00:00