Relax this assert based on x/y offsets for GFX_VERx10 >= 200.
This is getting hit when running gfxbench5 on LNL/BMG.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32128>
Since the shader parameters are passed as inline data, push constants
are no longer used and so, not actually set on dispatch. But the
nr_params = 4 was still making the shader emit the code to load them,
causing page faults on simulation, and would also on HW if we didn't
always have a scratch page set.
The uses_inline_data parameter will be set from brw_compile_cs(), called
shortly after this point, so we don't need it here.
The subgroup_size is misleading, as we don't actually require that size
and the code that checks for it isn't even running for this shader.
Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12152
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32150>
From the perspective of the gpu, host read or host write has the same
implication (gpu cache flush) in the dst access flags. We should
include host write in the dst access flags.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32102>
For the host-to-device domain operation, it is possible that
wait_sb_mask is empty but there is a cache invalidaton,
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
Fragments are processed in rasterization order within a fragment job.
The fragment subqueue self-wait is nop in most cases. The only
exception is when there is a feedback loop.
When there is a feedback loop, because we lower subpassLoad to
texelFetch, we have to split the render pass.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
IDVS jobs within a render pass use the same scoreboard slot. There is
no need to wait.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
We don't emit the fragment job until the end of a render pass. There is
nothing to wait.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
The fragment subqueue always waits for the tiler subqueue. There is no
need to emit additional waits for barriers.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
src_stages and dst_stages together define an execution dependency. Both
of them should be considered at the same time.
Add a low-level helper, add_execution_dependency, to translate pipeline
stages to subqueue wait masks. The subqueue wait masks only specify
which subqueues should wait for which. The callers will decide how the
waits are performed exactly.
Update collect_cs_deps to call add_execution_dependency and use the
subqueue wait masks to initialize panvk_cs_deps.
The main difference is that barriers such as
.srcStageMask = VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT,
.dstStageMask = VK_PIPELINE_STAGE_2_NONE,
are ignored.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
src_access defines the availability op and the host-to-device domain op.
dst_access defines the visibility op and the device-to-host domain op.
They should be treated separately.
Add a low-level helper, add_memory_dependency, to translate access flags
to panvk_cache_flush_info.
Update collect_cache_flush_info to use add_memory_dependency. Also
replace the custom subqueue access flag mappings by
vk_filter_{src,dst}_access_flags2.
The main difference is that barriers such as
.srcAccessMask = VK_ACCESS_2_MEMORY_WRITE_BIT,
.dstAccessMask = VK_ACCESS_2_NONE,
or
.srcAccessMask = VK_ACCESS_2_NONE,
.dstAccessMask = VK_ACCESS_2_MEMORY_READ_BIT,
are no longer ignored.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32074>
Add and use panvk_memory_mmap and panvk_memory_munmap.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32125>
Adds an optional region selection, based off percentages of the
starting/ending of an image's X & Y values.
This is intended as a performance enhancement tradeoff for smaller
images to be created.
With a smaller image size, the screenshotting layer will change the
region boundaries on the GPU side, which will decrease the amount of
time it takes to copy the image over to CPU-accessible memory.
Using vkcube as an example, the original image size is 500x500:
mesa-screenshot: DEBUG: Screenshot Authorized!
mesa-screenshot: DEBUG: Needs 2 steps
mesa-screenshot: DEBUG: Time to copy: 123530 nanoseconds
Then, by cropping the area to a 100x100 image, we get the following:
mesa-screenshot: DEBUG: Screenshot Authorized!
mesa-screenshot: DEBUG: Using region: startX = 40% (200), startY = 40% (200), endX = 60% (300), endY = 60% (300)
mesa-screenshot: DEBUG: Needs 2 steps
mesa-screenshot: DEBUG: Time to copy: 12679 nanoseconds
For this example, this is a ~90% time reduction improvement!
Overall, this option reduces the copy time to a point where it can
become negligible, relative to the frame time of the application.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood felix.j.degrood@intel.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32016>
Lots of tests are hitting the assert, one in particular :
dEQP-VK.binding_model.mutable_descriptor.single.switches.sampler_combined_image_sampler.update_copy.nonmutable_source.normal_source.pool_same_types.pre_update.no_array.comp
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b6d11ba5b4 ("anv: Protect memcpy/memset/qsort calls against NULL arguments")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32133>
When (off_const > max) there is a wrap around uint when calling
try_extract_const_addition.
Exit early since folding doesn't make sense in this case.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32118>
All of the spilling code should work with physical register units
because for example SEND messages will expect a physical register as
destination.
So always allocate a full physical register for the spilled/unspilled
values and adjust the offsets of the registers to physical sizes too.
Cc: mesa-stable
Fixes: aa494cba ("brw: align spilling offsets to physical register sizes")
Closes: mesa/mesa#11967
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Found-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32124>
The g610-vk jobs are just too unstable to be pre-merge jobs. Let's keep
them as post-merge so people can still execute them manually if they
care.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32129>
This needs to use aligned size, otherwise it will output two
slices when the size is not 16 aligned.
Fixes: 54d499818c ("radv/video: add initial support for encoding with h264.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
Avoids sanitizer errors like:
```
../src/intel/vulkan/anv_pipeline_cache.c:409:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_descriptor_set.c:696:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_descriptor_set.c:2709:10: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_descriptor_set.c:2709:10: runtime error: null pointer passed as argument 2, which is declared to never be null
```
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32098>
Avoids sanitizer errors like:
```
../src/intel/vulkan/anv_pipeline_cache.c:406:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:406:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:417:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:435:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:435:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:439:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:439:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:443:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:443:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:447:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:447:4: runtime error: null pointer passed as argument 2, which is declared to never be null
```
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32098>
Introduced in 2013 with prospect of being used in future.
... 11 years later.
Fixes: 4b45b61fef ("util: add avx2 and xop detection to cpu detection code") # 24.3
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31998>
We don't use it for anything.
Cc: mesa-stable # 24.3
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31998>
Currently pointless, Pentium II or Celeron and later has SSE.
Cc: mesa-stable # 24.3
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31998>
As per the code comment added in this commit the nir produced from
glsl to nir doesn't always keep function declarations before the
code that calls them e.g. calls from within other function
implementations. The change in this commit works around this problem by
first cloning all function declarations in a first pass, then cloning
the implementations in a second pass once we have filled the remap
table.
Fixes: cbfc225e2b ("glsl: switch to a full nir based linker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12115
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32100>