a7xx introduced a new way to upload shared consts with old one
becoming unavailable, use fallback mechanism until we implement
the new shared consts.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
Fixes a lot of tests in dEQP-VK.memory_model.* e.g.:
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_local.buffer.comp
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
_Presumably_ invalidates workgroup-wide cache for image/buffer data access.
so while "fence" is enough to synchronize data access inside a workgroup,
for cross-workgroup synchronization we have to invalidate that cache.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
Though blob is not seen to even use mode1 on a740, it uses
S2EN variant instead.
Fixes:
dEQP-VK.binding_model.descriptor_buffer.multiple.*
dEQP-VK.binding_model.descriptor_buffer.embedded_imm_samplers.*
dEQP-VK.pipeline.monolithic.descriptor_limits.compute_shader.*
Adapted from Jonathan Marek's changes.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
The set of magic regs is different between generations and even
sub-gens. Adding a new one and/or emitting one on specific generation
takes much more code than necessary. Doing this in a single place is
much nicer.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
nir_lower_tex can lower tg4 coords into tg4 offset which on DG2+ we
also need to lower into constant offsets.
Unfortunately the nir_lower_tex pass is not able to lower the
instructions it itself generates, so the easy fix for when
nir_lower_tex lowers tg4 coords into tg4 offsets is to rerun the pass.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9735
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25015>
This commit will implement and set VK_KHR_format_feature_flags2
instead of the old ones.
Signed-off-by: Vlad Schiller <vlad-radu.schiller@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24929>
Otherwise, if a secondary CS is grown and then executed without IB2,
the INDIRECT_BUFFER packet would have been copied but it shouldn't.
This fixes a regression that introduced GPU hangs with
gl_vk_meshlet_cadscene on RDNA2.
Fixes: df0c742543 ("radv/amdgpu: rework growing a CS with the chained IB path slightly")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24891>
If a secondary cmdbuf has been grown and is executed without IB2
(eg. on compute queue or when it's not allowed), the ib size ptr
contains chaining info, which means the IB size was wrong.
This fixes CPU crashes when running gl_vk_meshlet_cadscene.
Fixes: 277b2afd70 ("radv/amdgpu: add support for executing DGC cmdbuf with RADV_DEBUG=noibs")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24891>
This was left unused after 624ac55721 ("anv: move total_batch_size to
anv_batch"). We're now going to use it to store the total amount of
commands written in a command buffer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24628>
This name is confusing, the real thing it represents is the allocated
amount of batch space.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24628>
The disk_cache_create() now always returns valid cache even when disk
cache is disabled. In a case of disabled cache, the disk cache is NO-OP.
Test whether get/put() work as expected for the disabled cache.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24985>
Previous commit decoupled EGL_ANDROID_blob_cache from the disk cache
and haven't updated the SHADER_CACHE_DISABLE_BY_DEFAULT test-case that
is failing because now cache is always created even if disk cache is
disabled, such cache is NO-OP in this case. Fix the failing test.
Fixes: 39f26642 ("util: Decouple disk cache from EGL_ANDROID_blob_cache")
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24985>
Test for disabled cache was removed when we decoupled
EGL_ANDROID_blob_cache from the disk cache because test was failing
since it became outdated. Add the updated test.
Fixes: 39f26642 ("util: Decouple disk cache from EGL_ANDROID_blob_cache")
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24985>
Dolphin ubershaders have a pattern:
if (x && y) {
} else {
discard;
}
The current code to simplify if's will bail on this pattern, since the condition
is not a comparison. However, if that check is dropped and we allow NIR to
invert this, we get:
if (!(x && y)) {
discard;
} else {
}
which is now in a form for nir_opt_conditional_discard to turn into it
discard_if(!(x && y))
which may be substantially cheaper than the original code.
In general, I see no reason to restrict to conditionals. Assuming the backend is
clever enough to delete empty else blocks (I think most are), then this patch is
a strict win as long as inot instructions are cheaper than empty else blocks.
This matches my intuition for typical GPUs, where simple ALU instructions are
cheaper than control flow. Furthermore, it may be possible in practice for
backends to fold the inot into a richer set of instructions. For example, most
GPUs have a NAND instructions which would fold in the inot in the above code.
So just drop the check, simplify the pass, get the win.
---
Also, to avoid inflating register pressure, make sure we put the inot right
before the if. Android shader-db on is uninspiring due to terrible
coalescing decisions in the current RA. But it does fix the Dolphin smell.
total instructions in shared programs: 1756571 -> 1756568 (<.01%)
instructions in affected programs: 1600 -> 1597 (-0.19%)
helped: 1
HURT: 4
Inconclusive result (value mean confidence interval includes 0).
total bytes in shared programs: 11521172 -> 11521156 (<.01%)
bytes in affected programs: 10080 -> 10064 (-0.16%)
helped: 1
HURT: 4
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24965>
the vbo index is used to set the stride, so it needs to be updated
affects dEQP-VK.pipeline.pipeline_library.bind_buffers_2.single.stride_0_4_offset_1_0.count_2
Fixes: 7672545223 ("gallium: move vertex stride to CSO")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24954>
Issue: For texture with multiple planes, the planes will point to the
same BO with the total size, so current vcn dt_size is incorrect.
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[0]))
...
buf = 0x5555558daa30,
gpu_address = 0xffff800101000000,
bo_size = 0xa2000,
...
}
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[1]))
...
buf = 0x5555558daa30,
gpu_address = 0xffff800101000000,
bo_size = 0xa2000,
...
}
This is because: in function static struct si_texture *si_texture_create_object(),
if (plane0) {
/* The buffer is shared with the first plane. */
resource->bo_size = plane0->buffer.bo_size;
...
radeon_bo_reference(sscreen->ws, &resource->buf, plane0->buffer.buf);
resource->gpu_address = plane0->buffer.gpu_address;
}
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9728
Cc: mesa-stable
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25013>
When MSAA is enabled, instead of using BLENDFACTOR_ZERO use CONST_COLOR,
CONST_ALPHA and supply zero by using blend constants.
We need info on blend state entries in the CSO so that we can set them
up properly.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24714>
When MSAA is enabled, instead of using BLENDFACTOR_ZERO use CONST_COLOR,
CONST_ALPHA and supply zero by using blend constants.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24714>
When the list of expected failures match, the job shouldn't fail.
This also adjusts the first error check to catch segfaults.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25025>
The opcodes write to w by default so using anything else means we can't
schedule anything in the rbg slot anyway becasue we have to replicate the
result from w. We already attempt to do this during the scheduling, but
at that point it is more tricky, so doing it early leads to much better
code. Performance++
RV530 benchmarks:
Lightsmark, 1280x800, fullscreen
before:
N Min Max Median Avg Stddev
x 5 27.32 27.36 27.34 27.34 0.015811388
after:
N Min Max Median Avg Stddev
x 5 27.53 27.61 27.59 27.576 0.034351128
Unigine Sanctuary, 1280x800, fullscreen, medium shaders
before:
N Min Max Median Avg Stddev
x 5 10.1211 10.1238 10.1214 10.12192 0.0011211601
after:
N Min Max Median Avg Stddev
x 5 10.4607 10.4637 10.4619 10.46206 0.0012441865
RV530 shader-db:
total instructions in shared programs: 129643 -> 128038 (-1.24%)
instructions in affected programs: 45415 -> 43810 (-3.53%)
helped: 514
HURT: 43
total presub in shared programs: 4912 -> 5201 (5.88%)
presub in affected programs: 752 -> 1041 (38.43%)
helped: 40
HURT: 30
total omod in shared programs: 381 -> 383 (0.52%)
omod in affected programs: 6 -> 8 (33.33%)
helped: 1
HURT: 3
total temps in shared programs: 16904 -> 16841 (-0.37%)
temps in affected programs: 1377 -> 1314 (-4.58%)
helped: 81
HURT: 52
total lits in shared programs: 3555 -> 3550 (-0.14%)
lits in affected programs: 294 -> 289 (-1.70%)
helped: 13
HURT: 11
total cycles in shared programs: 194771 -> 193734 (-0.53%)
cycles in affected programs: 79079 -> 78042 (-1.31%)
helped: 452
HURT: 84
GAINED: shaders/glamor/82.shader_test FS
RV370 shader-db:
total instructions in shared programs: 82116 -> 81600 (-0.63%)
instructions in affected programs: 11888 -> 11372 (-4.34%)
helped: 273
HURT: 40
total temps in shared programs: 12438 -> 12441 (0.02%)
temps in affected programs: 692 -> 695 (0.43%)
helped: 36
HURT: 39
total cycles in shared programs: 128140 -> 127630 (-0.40%)
cycles in affected programs: 25838 -> 25328 (-1.97%)
helped: 266
HURT: 41
GAINED: shaders/0ad/12.shader_test FS
GAINED: shaders/CC3-tiberium-wars/314.shader_test FS
GAINED: shaders/lightsmark/16.shader_test FS
GAINED: shaders/sanctuary/159.shader_test FS
GAINED: shaders/sanctuary/162.shader_test FS
GAINED: shaders/sanctuary/51.shader_test FS
GAINED: shaders/sanctuary/54.shader_test FS
GAINED: shaders/trine/fp-422.shader_test FS
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6661
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24889>
This commit will add a new PVR_DEBUG flag that, when used,
it will display information about the display and render
devices in the common code (without adding dependencies)
Signed-off-by: Vlad Schiller <vlad-radu.schiller@imgtec.com>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24931>
Arm hdlcd display units do exist on Juno SoC's. This is the
first time Mesa has had to deal with panfrost working on these SoC's,
thus have to add hdlcd support.
Signed-off-by: Carsten Haitzler <carsten.haitzler@foss.arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25027>
ANV functionality was used as a reference. As stated in anv_BindImageMemory2:
Ignore this struct on Android, we cannot access swapchain
structures there.
Fixes 2 failing VTS test:
dEQP-VK.wsi.android.swapchain.create#image_swapchain_create_info
dEQP-VK.wsi.android.swapchain.simulate_oom#image_swapchain_create_info
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25028>
Functionality ensures gralloc won't allocate compressed buffer
incompatible with shared presentable image support.
Broadcom does not support compressed buffers and we can just enable the
feature without additional logic. Despite that, we add the logic here
so it can be replaced with the generic code someday.
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25028>
Looks like indirect dispatches require an event marker instead of an
event marker with dims. That makes sense somehow given the blocks size
is not known at record time with indirect dispatches.
This allows RGP to report correct block sizes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24994>