This is subtle, but here is the relevant IGC code:
// In case of shooting down of this instruction, we need to add sync to
// preserve the swsb id sync, so that it's safe to clear the dep
if (currInst.hasPredication() ||
(currInst.getExecSize() != dep.getInstruction()->getExecSize()) ||
(currInst.getChannelOffset() != dep.getInstruction()->getChannelOffset()))
needSyncForShootDownInst = true;
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40960>
We need to use these registers on another file and I don't want to add
another copy of their definition to our code base.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40937>
This check for ">= 125" is already inside a check for ">= 125". Also,
let's take this opportunity to comment the #else and #endif of the
relevant check to make the code easier to follow.
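
As a rough illustration of the resulting shape (not the actual code touched
here), the sketch below shows an outer version check with a commented #else
and #endif and no redundant inner check. The helper name is hypothetical, and
the fallback definition of GFX_VERx10 is an assumption so the snippet compiles
on its own.

```c
/* GFX_VERx10 is normally supplied by the build; this fallback value is an
 * assumption so the sketch compiles standalone. */
#ifndef GFX_VERx10
#define GFX_VERx10 125
#endif

/* Hypothetical helper, not a function from the tree. */
static int
example_uses_gfx125_path(void)
{
#if GFX_VERx10 >= 125
   /* No inner "#if GFX_VERx10 >= 125" here: we are already inside one. */
   return 1;
#else /* GFX_VERx10 >= 125 */
   return 0;
#endif /* GFX_VERx10 >= 125 */
}
```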
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40937>
Promote DBG() failure paths in enumerate_sysfs_metrics() to mesa_logw()
so users see why OA metrics are unavailable without needing INTEL_DEBUG.
Also log in PPS when no OA queries are available after initialization.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40898>
When observation_paranoid (xe) or perf_stream_paranoid (i915) prevents
unprivileged access to OA metrics, the existing code silently returns no
OA queries. PPS then fails with just a segfault.
This patch adds INTEL_PERF_FEATURE_OA_BLOCKED_BY_POLICY to
intel_perf_features, set by both KMD backends when the paranoid sysctl
exists but the caller lacks sufficient privilege. PPS checks this flag immediately
after initialising intel_perf and returns an error before attempting
metric-set selection.
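
The check described above could look roughly like the sketch below. Only the
flag name comes from this patch. Its bit position, the struct, the field
names, the helper and the messages are all hypothetical stand-ins, not the
real intel_perf or PPS code.

```c
#include <stdint.h>
#include <stdio.h>

/* Only the flag name is from the patch; the bit value is an assumption. */
enum example_perf_features {
   INTEL_PERF_FEATURE_OA_BLOCKED_BY_POLICY = 1u << 0,
};

/* Hypothetical stand-in for the state PPS would inspect after init. */
struct example_perf {
   uint32_t features;
   unsigned n_oa_queries;
};

/* Returns 0 if metric-set selection may proceed, -1 otherwise. */
static int
example_pps_check_perf(const struct example_perf *perf)
{
   if (perf->features & INTEL_PERF_FEATURE_OA_BLOCKED_BY_POLICY) {
      fprintf(stderr, "OA metrics are blocked by the paranoid sysctl\n");
      return -1;
   }

   if (perf->n_oa_queries == 0) {
      fprintf(stderr, "no OA queries available\n");
      return -1;
   }

   return 0;
}
```

Failing early with a message replaces the crash later on, when metric-set
selection would otherwise run against an empty query list.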
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40898>
Jay is a new SSA-based compiler for Intel GPUs. This is an early
work-in-progress. It isn't ready to ship, but we'd like to move development in
tree rather than rebasing the world every week. Please don't bother testing yet
- we know the status and we're working on it!
Jay's design is similar to other modern NIR backends, particularly ACO, NAK and
AGX. It is fully SSA, deconstructing phis after RA. We use a Colombet register
allocator similar to NAK, allowing us to handle Intel's complex register
regioning restrictions in a straightforward way. Spilling logical registers is
straightforward with Braun-Hack.
Thanks to the SSA-based design, the entire backend is essentially linear time,
regardless of register pressure. This addresses brw's excessive compile times,
especially when spilling.
In this current early draft, we support a limited subset of all three APIs on
Xe2. A lot works and a lot doesn't. The core compiler is there (spilling,
scoreboarding, SIMD32, etc. should more or less work), but there are details to
fill in for both performance and correctness. We essentially pass conformance on
OpenGL ES 3.0 and OpenCL 3.0, and we're busy iterating on Vulkan.
Likewise, additional hardware support will come down the line. There's nothing
fundamentally Xe2-specific here. I just have a Lunarlake laptop on my desk, Ken
has a Battlemage card, and we had to pick _something_ as the first target.
Co-authored-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
In the future this might even do something clever.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
While Jay supports subgroups, efficient reductions are TODO so it's probably
better not to run this pass yet.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
Jay will need more work to handle these payloads properly, especially in SIMD32.
For now, just disable the optimization for Jay for correctness.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
These lowered versions map to what Jay can deal with. The hardware is more
flexible than we are, due to restrictions in our data model. We choose to lower
to get us off the ground; we can revisit later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This ensures we don't see vec5 @load_ssbo_uniform_block_intel, which
requires special backend handling, instead rounding up in NIR to vec8,
which the LSC can do. Affects
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec3_lowp_compute.
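
A minimal sketch of the rounding, for illustration only. The helper name is
hypothetical, and the list of supported vector lengths is an assumption about
the LSC (1 to 4, then powers of two up to 64), not something taken from this
change.

```c
/* Assumed set of vector lengths a single message can load. */
static const unsigned example_vec_sizes[] = { 1, 2, 3, 4, 8, 16, 32, 64 };

/* Hypothetical helper: round a component count up to the next assumed
 * supported vector length, or return 0 if it does not fit in one message. */
static unsigned
example_round_up_vec_size(unsigned num_components)
{
   const unsigned n = sizeof(example_vec_sizes) / sizeof(example_vec_sizes[0]);

   for (unsigned i = 0; i < n; i++) {
      if (num_components <= example_vec_sizes[i])
         return example_vec_sizes[i];
   }

   return 0; /* too wide; the caller would have to split the load */
}
```

With this, a vec5 load becomes a vec8 load and the extra components are
simply ignored.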
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40877>
This updates 14024997852 with BMG and brings in media WA
16021867713.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40881>
This will let us share a common scratch swizzling between brw and jay.
Changes by Ken:
- Use an immediate SIMD width when known so we don't need to re-lower
- Switch to load_simd_width_intel because it may not match
info->api_subgroup_size on Vulkan without VK_EXT_subgroup_size_control
- Stop using DWord Scattered Write messages for scratch. These take an
offset in DWords, and our offsets are now always in bytes. This also
means that we no longer create MEMORY_OPCODE_* IR with inconsistent
units of either bytes or dwords. Yikes. We use byte scattered
messages now.
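
To illustrate why consistent units matter, here is a sketch of one way to
swizzle a per-lane scratch offset entirely in bytes. The interleaving formula
is an assumption about what the swizzle looks like (the same dword of every
lane kept contiguous), and both helper names are hypothetical. It is not the
lowering added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interleave a per-lane byte offset so that the same
 * dword of every lane is contiguous in memory. All units are bytes. */
static uint32_t
example_swizzle_scratch_addr(uint32_t offset_B, uint32_t lane,
                             uint32_t simd_width)
{
   /* The SIMD width must be a power of two and the lane must be in range. */
   assert(simd_width != 0 && (simd_width & (simd_width - 1)) == 0);
   assert(lane < simd_width);

   uint32_t dword = offset_B / 4;         /* dword index within the lane */
   uint32_t byte_in_dword = offset_B % 4; /* preserved unchanged */

   return (dword * simd_width + lane) * 4 + byte_in_dword;
}

/* A message that takes its offset in dwords would need this extra
 * conversion, which only makes sense for dword-aligned addresses. */
static uint32_t
example_bytes_to_dwords(uint32_t addr_B)
{
   assert(addr_B % 4 == 0);
   return addr_B / 4;
}
```

Passing simd_width as an immediate lets the multiply fold to a shift, and
keeping every address in bytes avoids the second helper altogether.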
fossil-db stats on Battlemage:
Instrs: 500477504 -> 500450056 (-0.01%); split: -0.01%, +0.00%
CodeSize: 7807432368 -> 7806786192 (-0.01%); split: -0.01%, +0.00%
Cycle count: 62404008370 -> 62398437734 (-0.01%); split: -0.01%, +0.00%
Fill count: 546690 -> 546695 (+0.00%); split: -0.00%, +0.00%
Max live registers: 141257956 -> 141258100 (+0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 72350283 -> 72336544 (-0.02%)
Totals from 99 (0.01% of 1581969) affected shaders:
Instrs: 366593 -> 339145 (-7.49%); split: -7.58%, +0.09%
CodeSize: 6425936 -> 5779760 (-10.06%); split: -10.06%, +0.00%
Cycle count: 2412009876 -> 2406439240 (-0.23%); split: -0.26%, +0.03%
Fill count: 19675 -> 19680 (+0.03%); split: -0.02%, +0.04%
Max live registers: 17600 -> 17744 (+0.82%); split: -0.09%, +0.91%
Non SSA regs after NIR: 37894 -> 24155 (-36.26%)
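
For reference, the percentages above are plain relative changes. A small
sketch of the arithmetic follows; the helper name is hypothetical and is not
part of fossil-db or its reporting scripts.

```c
/* Relative change from before to after, in percent. */
static double
example_percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}
```

For the affected shaders, example_percent_change(366593, 339145) comes out at
about -7.49, matching the Instrs line.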
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
brw_nir_apply_key typically knows the dispatch width (it's fixed for
geometry stages, and we clone the NIR for compute and mesh shaders).
For compute/mesh, this was the very next thing called. For the others,
if we know the width, there's no reason not to lower it.
Scratch lowering will start using load_simd_width_intel soon, so we
need it to work in all stages.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
This records the actual SIMD width we selected for the shader, in
all cases except fragment shaders, where we don't know it yet.
MR 37258 notes that "Backends can update [these fields] when they make
new decisions about the subgroup size" - which is what we now do.
Note that nir->info.api_subgroup_size may be different than min/max
subgroup size on Vulkan prior to SPV1.6/VK_EXT_subgroup_size_control,
so we do not alter that.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
This lets us emit NIR code based on the SIMD size. For non-fragment
stages, we'll replace it with a constant and optimize, but for FS,
we delay it until the backend.
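
The decision described above, sketched with stand-in types. None of these
names are Mesa's. The enum, the struct and the helper are hypothetical and
only model "fold to a constant when the width is known, otherwise leave it
for the backend".

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are not Mesa's types. */
enum example_stage {
   EXAMPLE_STAGE_VERTEX,
   EXAMPLE_STAGE_COMPUTE,
   EXAMPLE_STAGE_FRAGMENT,
};

struct example_value {
   bool is_constant;
   unsigned constant; /* valid only when is_constant is true */
};

/* dispatch_width == 0 means "not known yet". */
static struct example_value
example_lower_simd_width(enum example_stage stage, unsigned dispatch_width)
{
   struct example_value v = { false, 0 };

   /* Fragment shaders pick their width later, so keep the intrinsic. */
   if (stage != EXAMPLE_STAGE_FRAGMENT && dispatch_width != 0) {
      v.is_constant = true;
      v.constant = dispatch_width;
   }

   return v;
}
```

Once the value is a constant, ordinary constant folding can simplify any
arithmetic that depends on it.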
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>