The alignment and scale down values have changed on this platform.
To support drivers that won't use a CCS surface on this platform, this
patch computes the CCS fast clear rectangle using the main surface.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13555>
According to Bspec 2424, "Render Target Resolve":
The Resolve Rectangle size is same as Clear Rectangle size from SKL+.
Use get_fast_clear_rect in blorp_ccs_resolve for SKL+.
Note that the Bspec differs from Vol7 of the Sky Lake PRM, which only
specifies aligning by the scaledown factors.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13555>
The comment states that 64K alignment of surfaces is required when an
aux map is present on the platform. However, the code checks for GFX12
instead of dev->info->has_aux_map. Update the code to match the comment.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13555>
Previously a predicated BREAK that appeared immediately before the WHILE
would get merged into the WHILE. This doesn't work if other flow
control (e.g., a CONT) can transfer directly to the WHILE.
On Intel platforms, this fixes the CTS test
dEQP-VK.graphicsfuzz.stable-binarysearch-tree-nested-if-and-conditional.
No shader-db changes on any Intel platform.
When this commit was first created (over a month before it is going to
land), there were some regressions that were prevented by other commits
in MR !13095. That does not appear to be the case now, so I don't know
what changed. Basically, the treatment of discard as a combination of
demote and terminate causes additional continues in some loops, and
those continues trigger this bug. The other commits from that MR
prevent those continues from being generated in the first place.
All Intel platforms had simlar fossil-db results. (Ice Lake shown)
Instructions in all programs: 144419989 -> 144419995 (+0.0%)
SENDs in all programs: 6947332 -> 6947332 (+0.0%)
Loops in all programs: 38277 -> 38277 (+0.0%)
Spills in all programs: 204075 -> 204075 (+0.0%)
Fills in all programs: 319480 -> 319480 (+0.0%)
A few shaders in Doom 2016 were hurt by one instruction each. It seems
likely that these shaders would have experienced at least some
mis-rendering.
Closes: #4213
Fixes: d13bcdb3a9 ("i965/fs: Extend predicated break pass to predicate WHILE.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14128>
This virtual instruction is translated into multiple half float
physical instructions, even though its destination is typically of
integer type, which prevents the software scoreboard pass from
inferring the correct execution pipeline for the virtual instruction
on XeHP+ platforms. Teach the SWSB lowering pass about this
inconsistency between the IR and physical instruction types.
Fixes among other tests:
dEQP-GLES31.functional.shaders.builtin_functions.pack_unpack.packhalf2x16_compute
Fixes: d4537770bb ("intel/fs: Add helper functions inferring sync and exec pipeline of an instruction.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5685
Reported-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14002>
include it explicitly in the correct places
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14104>
This adds a bunch of other headers in, and adds mtypes.h to iris
for perf query object.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14104>
This file needs futexes so make an explicit include, so it doesn't
come via the compiler
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14104>
These checks are done in isl_format_supports_ccs_*. Since
isl_surf_supports_ccs calls these functions, it doesn't need to check
them itself.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14082>
These formats are used when creating surfaces with the
I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS modifier.
Makes iris pass the out-of-tree piglit test,
ext_image_dma_buf_import-intel-modifiers.
Fixes: 1433fe7860 ("intel/isl: Unify fmt checks in isl_surf_supports_ccs")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14082>
On XeHP, the XY_BLOCK_COPY_BLT command has a number of fields that
describe the layout of the surface, much like SURFACE_STATE does.
Several of them are encoded in such a similar manner that we really
would like to reuse the isl helpers for emitting those. This commit
moves them into a new isl_genX_helpers.h file which I can include
from the BLORP code. (The alternative would be to add XY_BLOCK_COPY_BLT
filling commands to isl, but that...seems more like a BLORP feature.)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14094>
Those defines exist in the packing headers too and some parts of the
code (like mi_builder.h) include both.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13954>
Those names are a bit too common and sometimes clash variables.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13954>
Use the same URB access helpers that were added for Task Output. The
Arrayed I/O (per-primitive and per-vertex) is handled by applying the
pitch from the MUE layout into the NIR intrinsics and including the
non-arrayed offset on top of it. After that, the index src can be
used directly for lowering.
Because we keep around the non-arrayed offset AND the pitch is
aligned, we can identify cases where the access is indirect but
guaranteed to be aligned, and dispatch a single message. Added a TODO
to explore that later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
Implement the output written by the task *workgroup* and available to
all the mesh *workgroups* dispatched from that task. We currently
ignore any layout annotations (since they are not really testable) and
produce a (packed) layout ourselves.
The URB messages are only SIMD8, so for larger SIMDs, the functions
will produce multiple messages. Making this lowering here instead of
the generic lower_simd_width() since it is not just a matter of
zip/unzip, e.g. the offset must be adjusted.
Indirect writes/reads are implemented by handling one component at a
time and using the PER_SLOT variant of the messages.
Note that VK_NV_mesh_shader allows reading outputs, so add support for
that as well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
Task/Mesh stages are CS-like stages, and include many
builtins (e.g. workgroup ID/index) and intrinsics (e.g. workgroup
memory primitives) originally present only in CS.
This commit add two new stages (task and mesh) that 'inherit' from CS
by embedding a brw_cs_prog_data in their own prog_data structure, so
that CS functionality can be easily reused. They also currently use
the same helpers to select the SIMD variant to use -- that was
recently added for CS.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
Allows to assert its existence for per-primitive variables and will
later be useful to implement the "more than 16 attributes" case for
Mesh.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
Since the outputs are shared among the whole workgroup, these can't be
staged in registers as they will not be always visible for all the
invocations (to read/flush). If they ever need to be staged, we
should use SLM for that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
In Fragment Shader, regular inputs are laid out in the thread payload
in a one dword per each half-GRF, that gives room for having the two
delta dwords needed for interpolation.
Per-primitive inputs are laid out before the regular inputs, and since
there's no need to have delta information, they are packed. So
half-GRF will be fully filled with 4 dwords of input.
When num_per_primitive_inputs is zero (the default case), behavior
should be the same as before.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10153>
From VK_KHR_synchronization2:
"Image memory barriers that do not perform an image layout
transition can be specified by setting oldLayout equal to
newLayout.
E.g. the old and new layout can both be set to
VK_IMAGE_LAYOUT_UNDEFINED, without discarding data in the
image."
v2: make assert more readable (Lionel Landwerlin)
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14008>
Engines based contexts operate somewhat different for executing
batches. Previously, we would specify a bitmask value such as
I915_EXEC_RENDER to specify to run the batch on the render ring.
With engines contexts, instead this becomes an array of "engines", and
when the context is created we specify the class and instance of the
engine.
Each index in the array has a separate hardware-context. Previously we
had to create separate kernel level contexts to create multiple
hardware contexts, but now a single kernel context can own multiple
hardware contexts.
Another forward looking advantage to using the engines based contexts
is that the kernel does not plan to add new supported I915_EXEC_FOO
masks, whereas they instead plan to add new I915_ENGINE_CLASS_FOO
engine classes. Therefore some rings may only be usable with an engine
based class.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12692>