Secondary command buffers don't add much information to the traces, but
add more noise and overhead. In order to disable their tracepoints by
default, a separate "secondary_cmd_buffer" tracepoint is created.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34588>
This brings back c9e33e5cbf ("intel/fs/cse: Make HALT instruction act
as CSE barrier."), from the old CSE pass into the new one.
Fixes new CTS test: dEQP-VK.subgroups.shader_quad_control.terminated_invocation
Fixes: 9690bd369d ("intel/brw: Delete old local common subexpression elimination pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34643>
and also stop legalizing away src 1 modifiers. Both of these are present,
they just move to a different place in the encoding.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34750>
This fixes a compilation issue in Marvel Rivals where the legalization
logic and the encoding logic don't line up, which results in an
assertion failure on this instruction:
r17 = hfma2 r17.xx -r18.xx 0x3c003c00
The fix here is a little overly restrictive because it turns out we
actually do have modifiers for all 3 sources. Those modifiers will
be added in later commits.
Fixes: 567cae69c3 ("nak: Add 16-bits float operations")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34750>
For Xe2+, from Bspec 64643, bit field "StackID": The maximum number of
StackIDs can be 2^12- 1.
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34709>
Re-organize the configuration lists to make easier to include BFloat16
only for the Gfx125+ that support it, while keeping MTL supporting the
"lowered" configurations from pre-Gfx125.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
Handle bfloat16 by converting sources to float, performing the
operation, and converting result back to bfloat16 if needed. This is
done because not all ALU ops have a `bf` version in NIR.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
Take the opportunity to add a comment about why the bit_size
comes from the NIR def and not the original type.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
When converting BFloat16 from/to non-Float32 type, use
the Float32 conversion as an intermediate step. Take the
opportunity to separate the unary_op/convert code-paths.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
This will allow including types that don't have a nir_alu_type
equivalent, like bfloat16.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
SPV_KHR_bfloat16 requires a small set of operations,
since it doesn't support all the arithmetic ops.
This patch adds conversions to/from Float32 and also
the necessary ops (bfdot, bffma, bfmul) to implement
SpvOpDot using the same lowering approach than the
Float32 counterpart.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
In float pointing rules adding +0.0f preserves all values except
for -0.0f, so what we want here is to add -0.0f. In the future
we should add proper support for float immediates in the assembler.
Fixes: fafdd24285 ("intel/executor: Update bfloat example")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
We were wrongly lowering all add/sub operations with saturation on 8-bit
values on v11+.
This fixes CTS failures on
"dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.*" and
likely more apps.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: d79a31bf81 ("pan/bi: Lower removed instructions in algebraic on v11+")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34743>
This probably doesn't fix anything in practice. I don't think this path is
ever taken for SGPRs except for pseudo-scalar transcendental instructions,
and those don't have any precolored operands/definitions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
Aligning these can create situations where register allocation is
impossible. Only pseudo-instructions can use these, and they don't require
any alignment.
I'm not sure if these temporaries actually happen in practice. This
probably only affects the get_reg()'s compact_relocate_vars fallback path,
which doesn't usually happen for SGPRs.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
D16 MIMG or pseudo-scalar transcendental instructions might give lower
limits to the maximum register available for their definitions, so just
try to place them earlier.
This is also part of fixing compact_relocate_vars with aligned NPOT
def/killed-op space (the second part is the later commit which changes
get_stride()).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
This gives more details on the hardware depth bias implementation details.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34745>
The necessary bits have been added to the bifrost/valhall backend,
so we can advertise support for VK_EXT_shader_demote_to_helper_invocation
now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34676>
Bifrost/Valhall implementation is actually close to the demote behavior,
where discarded threads will continue execution until explicitly
terminated.
Given we'll need to support the demote behavior for Vulkan, let's use
nir_lower_terminate_to_demote(), which solves the problem, and allows
for extra dead-code elimination when a termination point is guaranteed
to be taken by all threads.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34676>