Commit graph

7431 commits

Author SHA1 Message Date
Henry Goffin
fe617bcca0 intel/compiler/test: Fix build with GCC 7
Without this change, test_fs_scoreboard.cpp does not compile on GCC 7
due to the use of C99 initializers in a C++ source file.

Fixes: c847bfaaf5 ("intel/fs/gen12: Add tests for scoreboard pass")

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14349>
2021-12-30 19:59:52 +00:00
Dave Airlie
a2293e33fd intel/genxml/gen4-5: fix more Raster Operation in BLT to be a uint
This has been partly fixed twice before, but looks like some got missed.

Fixes arb_copy_image on gen4 with crocus

Fixes: de625dddee ("intel/genxml: fix raster operation field in blt genxml")

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14345>
2021-12-30 11:40:33 +10:00
Lionel Landwerlin
eca7b24e74 intel/devinfo: adjust subslice array size
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14324>
2021-12-28 14:22:53 -08:00
Dave Airlie
4392c24844 intel/compiler: drop unused decleration
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14202>
2021-12-22 21:37:55 +00:00
Dave Airlie
2692a5f8db intel/compiler: don't lower swizzles in backend.
These are lowered by crocus in the frontend, the key
entries are still used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14202>
2021-12-22 21:37:55 +00:00
Dave Airlie
e12b0d0d60 intel/compiler: remove gfx6 gather wa from backend.
Crocus lowers this in the frontend, they key member is still used
but reset prior to backend.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14202>
2021-12-22 21:37:55 +00:00
Marcin Ślusarz
a48f1d51e2 intel/compiler: disable workaround not applicable to gfx >= 11
There's nothing in bspec that would suggest this is still needed.
It only affected gfx 9 and 10.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14280>
2021-12-22 10:13:25 +00:00
Caio Oliveira
ac90519e35 anv: Simplify assertions related to graphics stages
In all three cases, COMPUTE was on the table but with an invalid
value (zero).  Drop it from the tables and the extra assertion, so if
a COMPUTE is passed it will just fail the ARRAY_SIZE assertion.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14274>
2021-12-21 18:25:05 +00:00
Caio Oliveira
de916d827f anv: Refactor dirty masking in cmd_buffer_flush_state
Instead of masking the dirty variable itself, use an appropriate mask
in the users of dirty.  This will avoid extra tracking when dealing
with Task/Mesh later.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14275>
2021-12-21 11:07:31 +00:00
Caio Oliveira
37fca614b8 anv/blorp: Split blorp_exec into a render and compute
And set the relevant push_constants_dirty for each case.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14275>
2021-12-21 11:07:31 +00:00
Francisco Jerez
e7470a40c5 intel/fs: Add physical fall-through CFG edge for unconditional BREAK instruction.
This adds a missing CFG edge that represents a possible physical
control flow path the EU might take under some conditions which isn't
part of the logical CFG of the program.  This possibility shouldn't
have led to problems on platforms prior to Gfx12, since the missing
control flow edge cannot possibly influence liveness intervals.
However on Gfx12+ it becomes the compiler's responsibility to resolve
data dependencies across instructions, and the missing physical
control flow paths may lead to a WaR data hazard currently not visible
to the software scoreboard pass, which could lead to data corruption.

Worse, the possibility for this path to be taken by the EU increases
on Gfx12+ due to a hardware bug affecting EU fusion -- However the
same physical path can be potentially taken on earlier platforms as
well, so this patch extends the CFG on all platforms for consistency,
even though the lack of this edge shouldn't lead to any functional
issues on platforms earlier than Gfx12.  There are no shader-db
changes on earlier platforms, so there seems to be no disadvantage
from using the same CFG representation as on later platforms.

This issue has ben reported on TGL with the following conformance
test, thanks to Ian for bringing the FULSIM dependency check warning
to my attention:

   dEQP-VK.graphicsfuzz.spv-stable-pillars-volatile-nontemporal-store

Fixes: 4d1959e693 ("intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4940
Reported-by: Tapani Pälli <tapani.palli@intel.com>
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14248>
2021-12-21 00:43:29 +00:00
Rafael Antognolli
e9b509755b intel: Emit 3DSTATE_BINDING_TABLE_POOL_ALLOC for XeHP
On XeHP+, Binding Table Pointers are an offset relative to the Surface
State Base Address anymore. Instead, they are relative to the State
Binding Table Pool Address, which is set by the command above.

We emit that command (pointing to the same address as the Surface
State Base Addresss), and everything should stay working as before.

Reworks:
 * Jordan: Add iris
 * Jordan: Drop i965
 * Ken: Set MOCS to avoid a major perf impact. (Found by Felix DeGrood.)
 * Jordan: Shrink size from 2MiB to actual iris, anv usage
 * Lionel: Add BINDING_TABLE_POOL_BLOCK_SIZE

Ref: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4995
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Add Iris, adjust sizes]

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13992>
2021-12-20 17:58:13 +00:00
Jordan Justen
e6fc231184 anv: Add BINDING_TABLE_POOL_BLOCK_SIZE
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13992>
2021-12-20 17:58:13 +00:00
Jordan Justen
1ed7a65e6d intel/genxml/12.5: Remove bt-pool enable from 3DSTATE_BINDING_TABLE_POOL_ALLOC
This was dropped in gfx12.5.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13992>
2021-12-20 17:58:13 +00:00
Jason Ekstrand
eebb2dedb2 intel/fs: Add a NONE scheduling mode
While our LIFO scheduling mode attempts to optimize for register
pressure, it's often hard for a scheduling algorithm to do better than
the instruction order provided by the shader author.  Shader authors
often do perfectly reasonable things like using texture results
immediately after fetching them or constructing texture coordinates
immediately before the texture op.  When we throw all the instruction
ordering information away, we loose any help the author may have given
us.  By attempting NONE before we fall back to the worst case LIFO mode.

And, yes, I tried this with NONE both before and after LIFO and doing
NONE before LIFO is substantially better, according to shader-db.

    total instructions in shared programs: 19673152 -> 19665202 (-0.04%)
    instructions in affected programs: 33669 -> 25719 (-23.61%)
    helped: 20
    HURT: 0
    helped stats (abs) min: 15 max: 4609 x̄: 397.50 x̃: 107
    helped stats (rel) min: 2.33% max: 67.50% x̄: 14.60% x̃: 9.12%
    95% mean confidence interval for instructions value: -867.61 72.61
    95% mean confidence interval for instructions %-change: -21.74% -7.46%
    Inconclusive result (value mean confidence interval includes 0).

    total cycles in shared programs: 935562500 -> 935020920 (-0.06%)
    cycles in affected programs: 18620349 -> 18078769 (-2.91%)
    helped: 104
    HURT: 48
    helped stats (abs) min: 88 max: 60986 x̄: 8031.48 x̃: 3680
    helped stats (rel) min: 0.61% max: 51.44% x̄: 14.95% x̃: 8.87%
    HURT stats (abs)   min: 10 max: 54724 x̄: 6118.62 x̃: 1530
    HURT stats (rel)   min: 0.13% max: 46.45% x̄: 10.28% x̃: 6.46%
    95% mean confidence interval for cycles value: -5724.34 -1401.71
    95% mean confidence interval for cycles %-change: -9.86% -4.10%
    Cycles are helped.

    total spills in shared programs: 12158 -> 10327 (-15.06%)
    spills in affected programs: 1831 -> 0
    helped: 20
    HURT: 0

    total fills in shared programs: 14749 -> 12635 (-14.33%)
    fills in affected programs: 2114 -> 0
    helped: 20
    HURT: 0

    LOST:   8
    GAINED: 649

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
e6ddee764e intel/fs: Reset instruction order before re-scheduling
The way the current scheduler loop is implemented, each scheduling pass
starts with what the previous pass had.  This means that, if PRE screwed
everything up majorly, PRE_NON_LIFO would have to try to fix it.  It
also meant that tiny changes to one pass would affect every later pass.
Instead, reset the order of the instructions before each scheduling
pass.  This makes the passes entirely independent of each other.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19670486 -> 19670648 (<.01%)
    instructions in affected programs: 25317 -> 25479 (0.64%)
    helped: 2
    HURT: 7
    helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
    helped stats (rel) min: 0.07% max: 0.07% x̄: 0.07% x̃: 0.07%
    HURT stats (abs)   min: 8 max: 70 x̄: 24.29 x̃: 12
    HURT stats (rel)   min: 0.41% max: 4.95% x̄: 1.47% x̃: 0.87%
    95% mean confidence interval for instructions value: -1.28 37.28
    95% mean confidence interval for instructions %-change: -0.04% 2.30%
    Inconclusive result (value mean confidence interval includes 0).

    total cycles in shared programs: 935535948 -> 935490243 (<.01%)
    cycles in affected programs: 421994824 -> 421949119 (-0.01%)
    helped: 1269
    HURT: 879
    helped stats (abs) min: 1 max: 12008 x̄: 259.38 x̃: 52
    helped stats (rel) min: <.01% max: 28.02% x̄: 1.12% x̃: 0.14%
    HURT stats (abs)   min: 1 max: 29931 x̄: 322.46 x̃: 20
    HURT stats (rel)   min: <.01% max: 32.17% x̄: 1.74% x̃: 0.22%
    95% mean confidence interval for cycles value: -71.37 28.81
    95% mean confidence interval for cycles %-change: -0.11% 0.21%
    Inconclusive result (value mean confidence interval includes 0).

    total spills in shared programs: 12403 -> 12430 (0.22%)
    spills in affected programs: 1355 -> 1382 (1.99%)
    helped: 2
    HURT: 7

    total fills in shared programs: 15128 -> 15182 (0.36%)
    fills in affected programs: 3294 -> 3348 (1.64%)
    helped: 2
    HURT: 7

    LOST:   21
    GAINED: 28

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
d49d092259 Revert "intel/fs: Do cmod prop again after scheduling"
This reverts commit ba2fa1ceaf.  Doing
optimizations after scheduling but before RA means doing them in the
middle of the scheduling loop which introduces additional dependencies
between one scheduling iteration and the next.  That won't work if we
want to make the scheduling modes independent, at least not unless we
have some way of fully cloning the IR.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
e6f0def97d intel/eu: Don't double-loop as often in brw_set_uip_jip
brw_find_next_block_end() scans through the instructions to find the end
of the block.  We were calling it for every instruction in the program
which is, if you have a single basic block, makes the whole mess a nice
clean O(n^2) when it really doesn't need to be.  Instead, only call
brw_find_next_block_end() as-needed.  This brings it back to O(n) like
it should have been.

This cuts the runtime of the following Vulkan CTS on my SKL box by 5%
from 1:51 to 1:45:  dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
cf98a3cc19 intel/fs: Use OPT() for split_virtual_grfs
Now that we're being conservative in the pass, it's easy to tell when it
makes progress and we can put it in the OPT() macro.  This way, we get
nice INTEL_DEBUG=optimizer dumps for it.  While we're here, fix the
header comment which is massively out-of-date.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
38fa18a7a3 intel/fs: Be more conservative in split_virtual_grfs
Instead of modifying every single instruction, keep track of which VGRFs
are actually split in a bit-set, and only modify the instructions that
actually touch split regs.

This cuts the runtime of the following Vulkan CTS on my SKL box by 45%
from 3:21 to 1:51:  dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
288a670f17 anv/pipeline: Get rid of sample_shading_enable
Putting it in the pipeline is a bit of a lie.  We no longer need it for
nir_lower_wpos_center. The only other user is pipeline_has_coarse_pixel
and that is used to build the shader key which we construct before we've
processed any NIR so we don't have accurate information at that time
anyway.  Instead, look at ms_info->sampleShadingEnable directly in
pipeline_has_coarse_pixel and trust the back-end to deal with disabling
coarse when we need per-sample dispatch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
deec7a590b anv,nir: Use sample_pos_or_center in lower_wpos_center
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
3c89dbdbfe intel/fs: Implement the sample_pos_or_center system value
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
a580fd55e1 intel/fs: Rework emit_samplepos_setup()
This rolls compute_sample_position into emit_samplepos_setup, its only
caller, by using a loop instead of calling it twice.  We also
early-return for the !persample_dispatch case instead of doing it as
part of the sample calculation.  This means that we don't call
fetch_payload_reg() to get sample_pos_reg unless we're actually going to
use it so the function is safe to call even if we haven't set up
sample_pos_reg.  This will be important for the next commit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
ac7255ed1e intel/fs: Return fs_reg directly from builtin setup helpers
There's no good reason why we're allocating them on the heap and
returning a pointer.  Return the fs_reg directly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
3878094eb1 anv: Drop anv_sync_create_for_bo
The older helper is unused so we can roll it all into
anv_create_sync_for_memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14237>
2021-12-17 00:55:31 +00:00
Lionel Landwerlin
b00086d393 anv,wsi: simplify WSI synchronization
Rather than using 2 vfuncs, use one since we've unified the
synchronization framework in the runtime with a single vk_sync object.

v2 (Jason Ekstrand):
 - create_sync_for_memory is now in vk_device

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14237>
2021-12-17 00:55:31 +00:00
Jason Ekstrand
9ae1e621e5 anv: Implement vk_device::create_sync_for_memory
Fixes: 36ea90a361 ("anv: Convert to the common sync and submit framework")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14237>
2021-12-17 00:55:31 +00:00
Ian Romanick
ff44547ea4 intel/stub: Implement shell versions of DRM_I915_GEM_GET_TILING and DRM_I915_SEM_GET_TILING
This is necessary to use intel_stub_gpu with Crocus.

v2: Remove unused i915_bo::swizzle_mode. Noticed by Emma.

Fixes: 953a4ca6fe ("intel: Add has_bit6_swizzle to devinfo")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14218>
2021-12-16 23:06:38 +00:00
Ian Romanick
2dc7c24b80 intel/stub: Silence "initialized field overwritten" warning
src/intel/tools/intel_noop_drm_shim.c:459:36: warning: initialized field overwritten [-Woverride-init]
  459 |    [DRM_I915_GEM_EXECBUFFER2_WR] = i915_ioctl_noop,
      |                                    ^~~~~~~~~~~~~~~

Fixes: 0f4f1d70bf ("intel: add stub_gpu tool")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14218>
2021-12-16 23:06:38 +00:00
Kenneth Graunke
7325179bcb intel/compiler: Use uppercase enum values in brw_ir_performance.cpp
This is by far the more common style in Mesa.  It also gives a cue that
e.g. num_dependency_ids is a fixed definition rather than some kind of
local variable maintaining a count.

While hre, we also rename the enums to have full prefixes to prepare for
a future where we use them in multiple files for future backend work.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14182>
2021-12-16 09:00:57 +00:00
Kenneth Graunke
d3f4f23ca3 intel/vec4: Inline emit_texture and move helpers to brw_vec4_nir.cpp
emit_texture() only has one caller, nir_emit_texture().  We may as well
inline that.  Move the associated helper functions for emitting sampler
messages there as well, to keep associated code nearby.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5183
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14191>
2021-12-16 00:09:45 -08:00
Kenneth Graunke
92d194427d intel/vec4: Use nir_texop in emit_texture instead of translating
We eliminated the GLSL IR -> vec4 backend ages ago, so the only caller
uses a nir_texop enum.  Drop a layer of translating.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14191>
2021-12-16 00:09:44 -08:00
Kenneth Graunke
2729a741fc intel/vec4: Use ir_texture_opcode less in emit_texture()
This replaces a bunch of uses of the GLSL IR ir_texture_opcode enum with
the backend opcode, in preparation for removing it altogether.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14191>
2021-12-16 00:09:36 -08:00
Sagar Ghuge
cd38b6e2e8 anv, iris: Implement Wa_14014890652 for DG2
Workaround is to set:

3DSTATE_VFG::GranularityThresholdDisable = 1
3DSTATE_VFG::DistributionGranularity = BATCH
3DSTATE_VF::GeometryDistributionEnable = 1

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14212>
2021-12-16 00:00:23 +00:00
Anuj Phogat
40b66a4499 anv, iris: Add Wa_22011440098 for DG2
Rework:
 * Jordan: Set MOCS after
   7b78b2fcac ("intel/genxml: Assert that all MOCS fields are non-zero on Gfx7+")

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14212>
2021-12-16 00:00:22 +00:00
Anuj Phogat
17a1df79ba anv, iris: Add Wa_16011773973 for DG2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14212>
2021-12-16 00:00:22 +00:00
Ian Romanick
2ca13abcce intel/fs: Use HF as destination type for F32TOF16 in fquantize2f16
Having an integer destination type instead of a float destination type
confuses the SWSB code.  This causes problems on some Intel GPUs.  Fix
this by using the correct type in the destination of the F32TOF16
opcode.

Gfx7 doesn't have the HF type, so continue to emit W on that platform.
The assertions in brw_F32TO16 (brw_eu_emit.c) are updated to reflect
this.  In scalar mode, UD is never emitted as a destination type for
this opcode, so remove it from the allowed types in the assertion.

I also condidered doing something like de55fd358f ("intel/fs/xehp:
Teach SWSB pass about the exec pipeline of
FS_OPCODE_PACK_HALF_2x16_SPLIT."), but Curro recommended that just using
the correct types is a better fix.  I agree.

v2: Add missing changes to fs_generator::generate_pack_half_2x16_split.
I'm not sure how I (and the Intel CI) missed that the first time. :(

v3: Fix copy-and-paste issue in the v2 fix. Noticed by Tapani.

Reviewed-by: Francisco Jerez <currojerez@riseup.net> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14181>
2021-12-15 20:03:51 +00:00
Jason Ekstrand
b05d228695 Revert "anv: Stop doing too much per-sample shading"
This reverts commit 1f559930b6.  Turns
out, this approach won't work.

Fixes: 1f559930b6 ("anv: Stop doing too much per-sample shading")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14196>
2021-12-14 18:09:03 +00:00
Rafael Antognolli
a026d2d11c intel/compiler: Assert that unsupported tg4 offsets were lowered for XeHP
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14142>
2021-12-13 16:59:44 -08:00
Jordan Justen
52a55f097f intel/compiler: Use nir_lower_tex_options::lower_offset_filter for tg4 on XeHP
Based on Rafael's:
 * "nir/lower_tex: Add option to lower offset for tg4 too."
 * "intel/compiler: Lower offsets for tg4 on gen9+."
 * "WIP: Do not lower basic offsets."
 * "WIP: intel/compiler: Enable lowering offsets restriction."

But, with these changes:
 * Fixed range checking to be signed 4 bits
 * Converted to filter
 * Apply only to gfx12.5+
 * Use nir_src_is_const / nir_src_comp_as_int (s-b Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14142>
2021-12-13 16:59:37 -08:00
Jordan Justen
c17e2216dd anv: Align buffer VMA to 2MiB for XeHP
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14155>
2021-12-13 22:29:18 +00:00
Caio Oliveira
2ad11b39bd intel/compiler: Use a struct for brw_compile_bs parameters
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14139>
2021-12-13 01:08:16 +00:00
Caio Oliveira
58c4a95320 intel/compiler: Use a struct for brw_compile_gs parameters
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14139>
2021-12-13 01:08:16 +00:00
Caio Oliveira
acf2d3c78b intel/compiler: Use a struct for brw_compile_tes parameters
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14139>
2021-12-13 01:08:16 +00:00
Caio Oliveira
7372a48a4a intel/compiler: Use a struct for brw_compile_tcs parameters
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14139>
2021-12-13 01:08:16 +00:00
Jason Ekstrand
88e97d75d0 intel/dev: Add gtt_size to devinfo
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13647>
2021-12-11 05:05:19 +00:00
Jason Ekstrand
1f559930b6 anv: Stop doing too much per-sample shading
We were setting anv_pipeline::sample_shading_enable based on
sampleShadingEnable without looking at minSampleShading.  We would then
pass this value into nir_lower_wpos_center which would add sample_pos to
frag_coord.  Then the back-end compiler picks up on the existence of
sample_pos and forces persample dispatch.  This leads to doing
per-sample dispatch whenever sampleShadingEnable = VK_TRUE regardless of
the value of minSampleShading.  This is almost certainly costing us
perf somewhere.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14022>
2021-12-11 04:40:20 +00:00
Nanley Chery
18231fc548 intel/blorp: Modify get_fast_clear_rect for XeHP
The alignment and scale down values have changed on this platform.

To support drivers that won't use a CCS surface on this platform, this
patch computes the CCS fast clear rectangle using the main surface.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13555>
2021-12-11 04:14:20 +00:00
Nanley Chery
c35b8a2889 intel/blorp: Modify the SKL+ CCS resolve rectangle
According to Bspec 2424, "Render Target Resolve":

   The Resolve Rectangle size is same as Clear Rectangle size from SKL+.

Use get_fast_clear_rect in blorp_ccs_resolve for SKL+.

Note that the Bspec differs from Vol7 of the Sky Lake PRM, which only
specifies aligning by the scaledown factors.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13555>
2021-12-11 04:14:20 +00:00