This fixes an issue with Hellblade: Senua's Sacrifice.
RADV_PERFTEST_RT_WAVE_64 is set using drirc, so if two devices are
created, their RADV_PERFTEST flags might differ.
The proposed solution is to filter out unused RADV_PERFTEST flags for
the winsys.
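
A minimal sketch of the filtering idea, not the actual RADV code. The
flag names, bit values and the mask below are hypothetical; the point is
only that two devices whose flags differ in bits the winsys never reads
end up passing identical flags to the winsys.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; the real RADV_PERFTEST_*
 * values are defined by the driver. */
enum {
   EXAMPLE_PERFTEST_RT_WAVE_64 = 1u << 0, /* assumed compiler-only */
   EXAMPLE_PERFTEST_WINSYS_A   = 1u << 1, /* assumed to be read by the winsys */
   EXAMPLE_PERFTEST_WINSYS_B   = 1u << 2, /* assumed to be read by the winsys */
};

/* Assumption: the set of flags the winsys actually consumes. */
#define EXAMPLE_WINSYS_PERFTEST_MASK \
   (EXAMPLE_PERFTEST_WINSYS_A | EXAMPLE_PERFTEST_WINSYS_B)

/* Drop every flag the winsys does not use before handing flags to it. */
static uint64_t
example_winsys_perftest_flags(uint64_t device_flags)
{
   return device_flags & EXAMPLE_WINSYS_PERFTEST_MASK;
}
```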
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36727>
GFX12 seems to behave slightly differently. Setting these bits to TRUE
causes zero-area triangles to not pass the primitive clipping stage.
So, the actual number of primitives output by the primitive clipping
stage was wrong.
After digging a lot, it seems PAL doesn't set these bits either on
GFX12.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36670>
We only do minimal checks to ensure that copy propagation doesn't break
the readport setup, but we don't update the group's readport setup. So
do this update before scheduling the group. Also check the readport
constellation when scheduling a group is finished.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36572>
- If a readport reservation is not successful then we have to reset the
readport reservation.
- Since the scheduler can add instructions in any order, we have to
update the readports in the same order the slots were filled when
re-evaluating.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36572>
Now that we track the free slots right away, we can make use of this
information when testing whether a vec instruction can be scheduled into
a trans slot.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36584>
Lowering u2f64 and i2f64 will create such instructions, and with that
the ALU groups are filled without the need to do scheduler trickery with
two-slot ops that also have two dest registers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36587>
It's always better if init/finish come in pairs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36646>
It was just a dummy wrapper around the runtime struct. We do, however,
have to keep at least the Create/Update entrypoints because RADV has to
do some patching for video encode. Since we're keeping Create, we keep
Destroy as well.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36646>
This also allows us to simplify the interface to
vk_video_session_parameters_create().
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36646>
These are never created on the stack or deep inside other objects so it
makes sense to use create/destroy instead of init/finish.
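
As a rough illustration of the create/destroy convention for objects
that only ever live on the heap, here is a sketch using a made-up
struct. The type and function names are hypothetical and are not the
vk_video runtime API.

```c
#include <stdlib.h>

/* Hypothetical heap-only object. */
struct example_params {
   unsigned id;
};

/* create allocates and initializes in one step; returns NULL on OOM. */
static struct example_params *
example_params_create(unsigned id)
{
   struct example_params *p = calloc(1, sizeof(*p));
   if (!p)
      return NULL;
   p->id = id;
   return p;
}

/* destroy tears down and frees; accepts NULL like free() does. */
static void
example_params_destroy(struct example_params *p)
{
   free(p);
}
```

With init/finish the caller would own the storage instead, which only
pays off when the object is embedded in another struct or placed on the
stack.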
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36646>
so that we can get rid of ac_get_ptr_args.
RADV uses AC_ARG_CONST_PTR for num_work_groups, which maps to i8, which
seems wrong.
No functional change.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36696>
There seems to be a hardware bug that sometimes causes a GPU hang when
an alias...sam sequence crosses an instruction cache line boundary. This
commit adds a workaround pass that inserts padding nops to ensure no
such sequence crosses a cache line. Until an alternative solution is
found, this is the best we can do.
While the number of nops we have to insert is fixed at this point, we
can try to minimize the number of nops executed at runtime by replacing
nops encoded in instructions by standalone nops. That is, if this pass
has to insert one nop, it will try to make one of the following
replacements:
- (rptN)nop -> (rptN-1)nop; nop
- (nopN)foo -> (nopN-1)foo; nop
It does so by keeping track of "insert points". Each insert point keeps
track of the instruction and the maximum number of nops that can be
inserted there without pushing any subsequent alias sequences over the
next cache line. Whenever we need to insert nops, we first try it at the
encountered insert points and only if that doesn't work, we insert them
right before the first alias. The pass makes sure the insert points are
only visited a bounded number of times in total to keep the whole pass
O(n).
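
The padding computation itself can be sketched as below. This is only
an illustration of the alignment arithmetic, not the pass from this
commit: the function name is hypothetical, offsets and lengths are
counted in instructions, and the cache line size is a parameter because
the real value is not stated here.

```c
#include <assert.h>

/* Number of nops to insert before a sequence that starts at instruction
 * offset `start` and is `len` instructions long, so that it does not
 * cross a cache line of `line` instructions. */
static unsigned
example_nops_needed(unsigned start, unsigned len, unsigned line)
{
   /* A sequence longer than one line can never fit. */
   assert(line > 0 && len <= line);

   if (len == 0)
      return 0;

   unsigned first_line = start / line;
   unsigned last_line = (start + len - 1) / line;

   /* Already contained in a single cache line. */
   if (first_line == last_line)
      return 0;

   /* Push the sequence to the start of the next cache line. */
   return line - (start % line);
}
```

For example, with a hypothetical line size of 16 instructions, a
4-instruction sequence starting at offset 14 needs 2 nops, while the
same sequence starting at offset 12 needs none.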
Totals:
Instrs: 48207402 -> 48278230 (+0.15%)
CodeSize: 101907026 -> 102294524 (+0.38%)
NOPs: 8386320 -> 8457148 (+0.84%)
(ss)-stall: 4013046 -> 4012931 (-0.00%)
(sy)-stall: 16741190 -> 16741033 (-0.00%)
Preamble Instrs: 11506988 -> 11520671 (+0.12%)
Last helper: 11686328 -> 11701615 (+0.13%)
Cat0: 9241457 -> 9312285 (+0.77%)
Totals from 25237 (15.32% of 164705) affected shaders:
Instrs: 22172360 -> 22243188 (+0.32%)
CodeSize: 44372164 -> 44759662 (+0.87%)
NOPs: 4201698 -> 4272526 (+1.69%)
(ss)-stall: 1982473 -> 1982358 (-0.01%)
(sy)-stall: 7379552 -> 7379395 (-0.00%)
Preamble Instrs: 4552074 -> 4565757 (+0.30%)
Last helper: 6260280 -> 6275567 (+0.24%)
Cat0: 4616677 -> 4687505 (+1.53%)
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36639>
We already have these two:
- dep_android comes from vulkan_lite_runtime_deps, through
idep_vulkan_runtime.
- VK_USE_PLATFORM_ANDROID_KHR comes from idep_vulkan_wsi_defines,
through idep_vulkan_runtime.
So let's remove these two, as they don't really add anything new.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36716>
There are two functions that use sysconf(), and they don't seem to agree
on what combination of platforms supports the function. Let's perform
proper function detection instead.
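
A sketch of what a call site could look like once the build system
detects the function. The HAVE_SYSCONF macro name and the 4096-byte
fallback are assumptions made for this example, not necessarily what
the change uses.

```c
#include <stdint.h>

/* Assumption: the build system defines HAVE_SYSCONF when sysconf() is
 * available, instead of the code guessing from platform macros. */
#if defined(HAVE_SYSCONF)
#include <unistd.h>
#endif

static int64_t
example_query_page_size(void)
{
#if defined(HAVE_SYSCONF)
   long size = sysconf(_SC_PAGESIZE);
   if (size > 0)
      return size;
#endif
   /* Fallback value chosen for this example only. */
   return 4096;
}
```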
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36621>
The POSIX spec says that _SC_PAGE_SIZE is a synonym for _SC_PAGESIZE, and
both will have the same value. Let's be consistent in which one we use,
and let's use the one that points directly to the authoritative
documentation.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36621>