Imported linear images may have an arbitrary row pitch. As long as it is
aligned to 16, agx can support it. Initialize `.linear_stride_B` from the
supplied parameter and let ail verify it.
Fixes gtk dmabuf based tests with a pitch aligned to 256.
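
A minimal sketch of the kind of check described above, not the actual ail
code. The names `sketch_linear_stride_ok`, `SKETCH_STRIDE_ALIGN_B` and
`min_stride_B` are hypothetical; only the alignment of 16 comes from this
commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment taken from the commit message; the macro name is hypothetical. */
#define SKETCH_STRIDE_ALIGN_B 16

static_assert((SKETCH_STRIDE_ALIGN_B & (SKETCH_STRIDE_ALIGN_B - 1)) == 0,
              "alignment must be a power of two");

/* Accept an imported row pitch if it covers one row of pixels
 * (min_stride_B, assumed to be computed by the caller) and is a
 * multiple of the required alignment. */
static bool
sketch_linear_stride_ok(uint32_t stride_B, uint32_t min_stride_B)
{
   return stride_B >= min_stride_B &&
          (stride_B % SKETCH_STRIDE_ALIGN_B) == 0;
}
```

With this sketch a pitch of 256 passes for a 200-byte row, while 260 is
rejected because it is not a multiple of 16.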
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33682>
`command_count` is under control of the vulkan application and can
become quite large. At a command count around 30000 the size of the
alloca() allocated buffers exceeds the default stack size of 16MB.
Fixes segfaults in 'gtk:compare vulkan lots-of-offscreens-nogl*'
gtk 4 test cases which end up with a `command_count` around 32768.
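
A rough sketch of the general remedy, assuming the buffers move from
alloca() to the heap; this is not the driver code, and `sketch_scratch`
and its helpers are hypothetical names. The point is that a count
controlled by the application should not size a stack allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-submit scratch buffer sized by command_count. */
struct sketch_scratch {
   uint64_t *entries;
   size_t count;
};

/* Allocate on the heap instead of with alloca(). calloc() also guards
 * against overflow of count * size. Returns 0 on success, -1 on failure. */
static int
sketch_scratch_init(struct sketch_scratch *s, size_t command_count)
{
   assert(s != NULL);
   s->entries = calloc(command_count, sizeof(*s->entries));
   if (s->entries == NULL && command_count != 0)
      return -1;
   s->count = command_count;
   return 0;
}

static void
sketch_scratch_finish(struct sketch_scratch *s)
{
   assert(s != NULL);
   free(s->entries);
   s->entries = NULL;
   s->count = 0;
}
```

Unlike alloca(), the failure case is reportable here, so a caller could
return an out-of-memory error rather than crash.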
Fixes: https://gitlab.freedesktop.org/asahi/mesa/-/issues/47
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33682>
no, I don't know how this worked before.
fixes KHR-GL46.cull_distance.functional with nir_opt_varyings changes but
this seemed to be passing just by luck otherwise.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33682>
libwrap.dylib is helpful to trace control streams on macOS. When it was
originally implemented, we...
* supported macOS in our OpenGL driver and needed to actually exercise these
interfaces
* didn't have Linux support or hypervisor support or anything so needed the
traces to be utterly thorough
* only had a single macOS version to worry about
The landscape today is very different:
* no macOS support in our driver stack
* we can trace registers via the hypervisor - libwrap.dylib is no longer
"correctness" bearing, it's just a convenience tool
* what counts is the hardware side - tracing all the macOS software structs is
not actually useful, the hypervisor is the right place to grab control regs
* piles of macOS versions, this code only ever worked properly on 11.x and 12.x,
but with m4 r/e coming up soon we need a lot more versions working.
So... we keep around libwrap.dylib, but slim it down to only decode the bare
minimum of macOS versioned structures, just enough to grab the control stream
pointer and dump that. This is a loss of functionality around CRs (but we have the
hypervisor as a much better way to grab CRs). In exchange it makes the code much
more manageable and less likely to break every 6 months.
So in exchange for all this deletion we also get things working again, this time
on 13.x. But porting back to 12.x or 11.x would be a very small diffstat given
the reduced focus of the new code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33682>
I dumped assembly generated by our driver with INTEL_DEBUG=shaders,
copied and pasted it into a lua file, tried to run it with
src/intel/executor, but the disassembler started telling me some
instructions were invalid.
This happened because we print the "compacted" flag in our assembly
text, so when brw_gram.y parses our assembly text, it sees the
"compacted" flag and sets it to the instruction by calling
add_instruction_option(). But the executor tool never sets the
BRW_ASSEMBLE_COMPACT flag when it calls brw_assemble(), so when
brw_assemble() calls dump_assembly(), which calls brw_disassemble(),
the disassembler gets confused and prints misinterpreted instructions
and calls them invalid.
It is not the job of brw_gram.y (our text assembly parser) to mark
instructions as compacted. Whatever is later assembling the
instruction is the entity that should decide if the instructions are
compacted or not. So in this patch we just ignore this flag.
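
The idea can be sketched as follows. This is an illustration only, not
the brw_gram.y code: `sketch_instr`, `sketch_add_option` and the
"breakpoint" option are hypothetical, and only the "compacted" flag name
comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parsed-instruction state. */
struct sketch_instr {
   bool breakpoint;
   bool compacted; /* only ever set by the assembler, never the parser */
};

/* Apply one textual option to an instruction. Returns false for an
 * unknown option. "compacted" is accepted so that dumped assembly still
 * parses, but it is deliberately not recorded. */
static bool
sketch_add_option(struct sketch_instr *instr, const char *opt)
{
   assert(instr != NULL && opt != NULL);

   if (strcmp(opt, "compacted") == 0)
      return true; /* ignored: whoever assembles later decides */

   if (strcmp(opt, "breakpoint") == 0) {
      instr->breakpoint = true;
      return true;
   }

   return false;
}
```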
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33614>
If we don't have a region in the X no-MIT-SHM case, don't go using
the damage call to set the region.
Fixes: bbdf7e45b1 ("wsi/x11: Hook up KHR_incremental_present")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33592>
The only change since the previous kernel is that the new one includes
the device tree blobs for the mt8195-cherry-tomato-r2 and
mt8186-corsola-steelix-sku131072 devices.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33606>
The kernel+rootfs jobs previously downloaded the prebuilt kernel image,
but this was unnecessary as LAVA doesn't use them here, and the images
were never uploaded to S3. LAVA acquires the kernel in lava_submit.sh,
and baremetal downloads the required images and dtbs in baremetal_build.sh.
The kernel modules are still required for some devices.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33606>
Directly download the kernel instead of using the
download-prebuilt-kernel.sh script.
Save the kernel to /kernel for clarity, replacing the previous
/lava-files directory.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33606>
Turns out we were missing the glapi bits, making it impossible to get
the function pointers for this extension. Whoops?!
[daniels: Squashed in a618 SkQP fails, presumably caused by these not
being skipped anymore.]
Fixes: 9f5af68995 ("mesa/main: expose `EXT_multi_draw_indirect`")
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33546>
It's never been ported to DRI3, but nobody seems to care. Since DRI2 is
untested at this point, just drop the code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
My editor does this on save, so let's just apply it to EGL's python for
consistency. The only exception is that the genCommon import needs the
sys.path.insert, so that part of autopep8 was reverted.
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
Both instructions for building were the same, and there's not much sense
in calling out just xcb-dri2 out of all the deps there are.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
Add support for rewriting shared atomics into compare-and-swap loops.
Previously, the nir_lower_atomics pass only supported global and ssbo
atomics.
Only freedreno ir3 reuses nir_lower_atomics; this change does not
impact its usage since it does not support shared atomics.
Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33572>
Adding the b2i(a) == 1 and b2i(a) != 1 patterns also helps prevent
regressions when spurious negations are removed from integer equality
comparisons, as is done in !33498.
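
A small C model of why these patterns are safe to simplify, assuming the
rewrite replaces b2i(a) == 1 with a and b2i(a) != 1 with !a (the exact
replacements are not stated here). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of b2i: false -> 0, true -> 1. */
static int32_t
sketch_b2i(bool a)
{
   return a ? 1 : 0;
}

/* Assumed simplified forms of the two comparisons. */
static bool sketch_b2i_eq_1(bool a) { return a; }
static bool sketch_b2i_ne_1(bool a) { return !a; }

/* Exhaustively check both inputs; a bool has only two values. */
static void
sketch_check_b2i_patterns(void)
{
   for (int i = 0; i < 2; i++) {
      bool a = (i != 0);
      assert((sketch_b2i(a) == 1) == sketch_b2i_eq_1(a));
      assert((sketch_b2i(a) != 1) == sketch_b2i_ne_1(a));
   }
}
```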
v2: Make all variables part of the iteration instead of calculating some
of them. Suggested by Alyssa.
shader-db:
All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 16973331 -> 16973309 (<.01%)
instructions in affected programs: 266 -> 244 (-8.27%)
helped: 2 / HURT: 0
total cycles in shared programs: 915620774 -> 915620550 (<.01%)
cycles in affected programs: 4360 -> 4136 (-5.14%)
helped: 2 / HURT: 0
fossil-db:
All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209748011 -> 209748003 (-0.00%)
Cycle count: 30514920286 -> 30514920400 (+0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 237334726 -> 237334710 (-0.00%)
Totals from 8 (0.00% of 706651) affected shaders:
Instrs: 16956 -> 16948 (-0.05%)
Cycle count: 261052 -> 261166 (+0.04%); split: -0.92%, +0.96%
Non SSA regs after NIR: 20000 -> 19984 (-0.08%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33648>
At least some Total War: Warhammer3 vertex shaders associate the
comparisons differently, so the existing patterns were not triggered.
No shader-db changes on any Intel platform.
fossil-db:
All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209748654 -> 209748173 (-0.00%)
Cycle count: 30514333964 -> 30514361348 (+0.00%); split: -0.00%, +0.00%
Fill count: 622688 -> 622537 (-0.02%)
Max live registers: 65477039 -> 65477033 (-0.00%)
Non SSA regs after NIR: 237334768 -> 237334728 (-0.00%)
Totals from 512 (0.07% of 706651) affected shaders:
Instrs: 1000693 -> 1000212 (-0.05%)
Cycle count: 42174312 -> 42201696 (+0.06%); split: -0.15%, +0.21%
Fill count: 11456 -> 11305 (-1.32%)
Max live registers: 121599 -> 121593 (-0.00%)
Non SSA regs after NIR: 1253445 -> 1253405 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33648>
Use an accessor function to read opcode properties or to change the
opcode. This would allow for different instruction descriptions to
be used for different architectures. Not necessary now, but may
be useful groundwork.
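
A sketch of the accessor pattern, under the assumption that opcode
properties live in a lookup table. All names here (`sketch_instr`,
`sketch_op_info`, and so on) are hypothetical and do not come from the
driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_opcode { SKETCH_OP_MOV, SKETCH_OP_ADD, SKETCH_OP_COUNT };

struct sketch_op_info {
   const char *name;
   unsigned num_srcs;
};

/* One description table; a per-architecture table could be selected
 * inside the accessor without touching any caller. */
static const struct sketch_op_info sketch_op_infos[SKETCH_OP_COUNT] = {
   [SKETCH_OP_MOV] = { "mov", 1 },
   [SKETCH_OP_ADD] = { "add", 2 },
};

struct sketch_instr {
   enum sketch_opcode op; /* callers are expected to use the accessors */
};

static const struct sketch_op_info *
sketch_instr_info(const struct sketch_instr *instr)
{
   assert(instr != NULL);
   return &sketch_op_infos[instr->op];
}

static void
sketch_instr_set_op(struct sketch_instr *instr, enum sketch_opcode op)
{
   assert(instr != NULL && (unsigned)op < SKETCH_OP_COUNT);
   instr->op = op;
}
```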
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29765>
These are persistent objects that you can use to signal and wait on.
We need to import without VK_SEMAPHORE_IMPORT_TEMPORARY_BIT and we can't
throw away the Vulkan semaphore after each submit.
Fixes: 32597e116d ("zink: implement GL semaphores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33549>
When the size of the signals[] array was changed to 3, the
signal_values[] array was not updated accordingly. If we have a
signal_semaphore and are presenting at the same time, this can lead to
an array overflow and the driver will read some random stack value as
the signal value. This is causing chromium to lock up when running
WebGL.
Fixes: 7f56fd9655 ("zink: it's kopperin' time")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33549>
From the Vulkan documentation, the queueFamilyIndex value will be
created with VkDeviceQueueCreateInfo. So let's avoid counting the
index value and just refer to the already-created value.
This will resolve crashes on some GPUs for various workloads.
v2: Needed to use GetDeviceQueue() in order to map the queueFamilyIndex
values. These values can be different when obtaining the queue used
for presentation, so we need to ensure we update the mapped
queueFamilyIndex value for the associated queue_data struct.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33487>
Not sure if this was intentionally left when block_check_for_allowed_instrs's
param was changed from bool to int, but it certainly was broken without the
previous commit for discards. Now those should work, so the (unintentional?)
special case can be removed.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>