Since Xe2, the registers are bigger and even the instruction
structures got updated to have 6 bits.
I detected this issue when I tried to use src/intel/executor to add
the following instruction:
    add(8) g6.8<1>UD g4<8,8,1>UD 0x00000008UD { align1 WE_all 1Q I@1 };
Executor would read this and end up emitting an add with dst being
g6<1>UD instead of what we wanted. It turns out that inside
brw_gram.y, at dstoperand and dstoperandex we do:
    $$.subnr = $$.subnr * brw_type_size_bytes($4);
which would overflow subnr back to 0.
The overflow doesn't seem to be a problem with code we emit directly
(unlike the code we parse, like above) due to the fact that we seem to
treat Xe2 registers as smaller all the way until we call phys_nr() and
phys_subnr() during code generation. The phys_subnr() function can
generate a value that would overflow reg.subnr, but this value is
never written back to reg.subnr; it is just returned as an unsigned
int.
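
The wrap-around described above can be reproduced in isolation. This is
only a sketch: the structs below are hypothetical stand-ins, not the real
struct brw_reg, and the 5-bit "before" width is an assumption made for
illustration (the text only states that the field needs 6 bits).

```c
#include <assert.h>

/* Hypothetical stand-ins for a register descriptor; NOT struct brw_reg.
 * The 5-bit width is an assumption used to show the wrap-around. */
struct narrow_reg {
   unsigned subnr : 5; /* can hold 0..31 */
};

struct wide_reg {
   unsigned subnr : 6; /* can hold 0..63 */
};

/* Mirrors the parser's "subnr = subnr * type size" step with a 5-bit field.
 * Storing into an unsigned bit-field reduces the value modulo 2^5. */
static unsigned
scale_subnr_narrow(unsigned subnr_elems, unsigned type_size_bytes)
{
   assert(subnr_elems < 32);
   struct narrow_reg reg = { .subnr = subnr_elems };
   reg.subnr = reg.subnr * type_size_bytes;
   return reg.subnr;
}

/* Same computation with a 6-bit field, which has room for byte offset 32. */
static unsigned
scale_subnr_wide(unsigned subnr_elems, unsigned type_size_bytes)
{
   assert(subnr_elems < 64);
   struct wide_reg reg = { .subnr = subnr_elems };
   reg.subnr = reg.subnr * type_size_bytes;
   return reg.subnr;
}
```

For g6.8<1>UD the element index is 8 and the UD type is 4 bytes, so
scale_subnr_narrow(8, 4) yields 0 while scale_subnr_wide(8, 4) yields 32.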
Fixes: e9f63df2f2 ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33539>
Otherwise replay of renderdoc captures doesn't work.
Instead, avoid passing the flag down to the allocator.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33583>
It seems like (ss) is not enough to resolve WAR hazards for
ray_intersection.
Fixes CTS tests:
- dEQP-VK.ray_query.stress.fragment_shader.aabbs
- dEQP-VK.ray_query.stress.fragment_shader.triangles
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33597>
For load_uav, for example, we would create an immediate 0 SRC2. Even
though this src would be dismissed in the final assembly, it would still
waste a register or alias.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33596>
The intent here was to check if the tile we're trying to merge
vertically (prev_y_tile) has already been merged horizontally into a
neighboring tile, but I used the slot_mask which also contains the tiles
that have been merged into the prev_y_tile, so the check was too
conservative and would fail whenever another tile had been merged into
prev_y_tile. This meant that we would never create 2x2 regions of
tiles. Fix this by just testing prev_y_tile's bit in the mask.
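
The difference between the two checks can be sketched as follows. The
struct, the field names other than slot_mask, and the merged_tiles mask
are hypothetical and only model the description above; this is not the
driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tile record; the layout is an assumption for illustration. */
struct tile {
   unsigned slot;      /* this tile's own slot index */
   uint32_t slot_mask; /* own bit plus the bits of tiles merged into it */
};

/* Too conservative: reports "already merged" if ANY tile covered by
 * slot_mask is in merged_tiles, including tiles merged into this one. */
static bool
already_merged_old(uint32_t merged_tiles, const struct tile *t)
{
   return (merged_tiles & t->slot_mask) != 0;
}

/* Tests only the tile's own bit, as the fix describes. */
static bool
already_merged_fixed(uint32_t merged_tiles, const struct tile *t)
{
   assert(t->slot < 32);
   return (merged_tiles & (UINT32_C(1) << t->slot)) != 0;
}
```

With tile 3 merged into tile 2, the old check rejects tile 2 as a
vertical merge candidate while the fixed check accepts it, which is what
allows a 2x2 region to form.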
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33534>
Even though we always try to merge a horizontally or vertically adjacent
tile, when we try to merge a vertically adjacent tile it may not
actually be adjacent because it was merged horizontally and the current
tile wasn't or vice versa. We have to detect this and reject merging it.
Fixes: 3fdaad0948 ("tu: Implement bin merging for fragment density map")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33534>
This custom builder implements fine-grained instance node bounds
calculation by looking at all AABBs at tree depth 2.
Shaves off 0.3ms in the start scene for Indiana Jones: The Great Circle
on Deck (roughly 29.1ms->28.7ms).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32797>
This allows drivers to inject custom functions to calculate the bounds
of instance nodes. For example, this can be used to determine instance
bounds by transforming the AABBs of all child nodes at some level in the
BVH. When instance transforms contain rotations of close to 45°, this
can yield a tighter AABB than just taking the instance's top-level AABB
and rotating it.
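
A simplified 2D sketch of why this helps. All types and helpers here are
hypothetical and unrelated to the actual BVH code; the rotation matrix is
passed in precomputed so the example needs no math library.

```c
#include <assert.h>

/* Hypothetical 2D types, for illustration only. */
struct aabb2 { float min[2], max[2]; };
struct mat2 { float m[2][2]; };

/* AABB of a box after a linear transform: transform the four corners
 * and take their extents. */
static struct aabb2
aabb2_transform(struct mat2 t, struct aabb2 b)
{
   struct aabb2 out = { { 1e30f, 1e30f }, { -1e30f, -1e30f } };
   for (int i = 0; i < 4; i++) {
      float x = (i & 1) ? b.max[0] : b.min[0];
      float y = (i & 2) ? b.max[1] : b.min[1];
      float p[2] = { t.m[0][0] * x + t.m[0][1] * y,
                     t.m[1][0] * x + t.m[1][1] * y };
      for (int a = 0; a < 2; a++) {
         if (p[a] < out.min[a]) out.min[a] = p[a];
         if (p[a] > out.max[a]) out.max[a] = p[a];
      }
   }
   return out;
}

/* Smallest AABB containing both inputs. */
static struct aabb2
aabb2_union(struct aabb2 a, struct aabb2 b)
{
   struct aabb2 out = a;
   for (int i = 0; i < 2; i++) {
      if (b.min[i] < out.min[i]) out.min[i] = b.min[i];
      if (b.max[i] > out.max[i]) out.max[i] = b.max[i];
   }
   return out;
}

static float
aabb2_area(struct aabb2 b)
{
   return (b.max[0] - b.min[0]) * (b.max[1] - b.min[1]);
}
```

Take two unit-sized children at [0,1]x[0,1] and [3,4]x[3,4] and a 45°
rotation. Rotating the combined top-level box gives an area of about 32,
while rotating each child and taking the union gives about 8.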
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32797>
For decode this is also done in decode_video.
This breaks if the app doesn't call vkCmdEncodeVideoKHR before end, e.g.:
    vkCmdBeginVideoCodingKHR
    vkCmdControlVideoCodingKHR
    vkCmdEndVideoCodingKHR
Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33582>
KGSL unconditionally supports preemption so we cannot ignore it.
On a6xx, we have to emit VSC addresses per-bin or make the amble include
these registers, because CP_SET_BIN_DATA5_OFFSET will use the
register instead of the pseudo register and its value won't survive
across preemptions. The blob seems to take the second approach and
emits the preamble lazily. We chose the per-bin approach, but the blob's
approach is probably the better one.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12627
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33580>
The assertion (or attempting the layout change) causes a crash when
launching Steel Rats. Tighten the condition for the change so that it
only takes effect when the runtime has made changes.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12602
Fixes: eed788213b ("anv: ensure consistent layout transitions in render passes")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33523>
The combiner unit runs after the fmul/smul/fadd/sadd units and can
consume the results that previous units wrote to the registers. So prefer
placing scalar mul into the combiner unit and its predecessors (if any)
into other units.
shader-db:
total instructions in shared programs: 29072 -> 27698 (-4.73%)
instructions in affected programs: 11237 -> 9863 (-12.23%)
helped: 163
HURT: 0
helped stats (abs) min: 1 max: 42 x̄: 8.43 x̃: 4
helped stats (rel) min: 0.64% max: 30.00% x̄: 13.03% x̃: 11.76%
95% mean confidence interval for instructions value: -9.89 -6.96
95% mean confidence interval for instructions %-change: -14.09% -11.97%
Instructions are helped.
total loops in shared programs: 2 -> 2 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 367 -> 372 (1.36%)
spills in affected programs: 16 -> 21 (31.25%)
helped: 1
HURT: 2
total fills in shared programs: 1208 -> 1224 (1.32%)
fills in affected programs: 51 -> 67 (31.37%)
helped: 2
HURT: 2
LOST: 0
GAINED: 0
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33568>
The combiner unit supports scalar-by-vector multiplication and scalar mov.
Implement it in codegen.
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33568>
Assert on unexpected pipeline dest for fmul and vmul to catch scheduler
bugs early
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Fix multiple issues with atan in disassembler:
- the arg1_en field in the combiner unit actually seems to be a bit
indicating that one of the sources is a vector (e.g. for atan_pt2, or
multiplication)
- atan2 has 2 arguments, not one
- properly handle all instruction variants
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Print index of the node that breaks node_to_instr to make debugging
easier
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Apparently the fast path cannot handle mismatched mutability, so we
should use CP_BLIT, which has SP_PS_2D_SRC_INFO.MUTABLEEN to signal
src mutability. Previously this was partially handled by
tu_attachment_store_mismatched_swap.
Fixes: a104a7ca1a ("tu: Handle non-identity GMEM swaps when resolving")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>