- The vkSetDebugMetadataAsyncGOOGLE command should
not have an entry in the function table, since that
leads to missing-prototype errors.
- Make gfxstream respect cpp_msvc_compat_args, since
it is a C++ project. -Wmissing-prototypes will be
made a C++ error *eventually*.
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39418>
Otherwise:
gallium/auxiliary/gallivm/lp_bld_nir_soa.c:2394:7:
error: variable 'opname' is used uninitialized whenever switch default is taken
is observed.
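A minimal sketch of the pattern behind this warning and one way to silence it, assuming a hypothetical enum and placeholder names (none of this is taken from lp_bld_nir_soa.c, and the actual fix may differ):

```c
#include <stddef.h>

/* Hypothetical reduce-op enum; the real code switches on NIR ops. */
enum reduce_op {
   REDUCE_OP_ADD,
   REDUCE_OP_MUL,
   REDUCE_OP_UNSUPPORTED,
};

/* Returns a placeholder name for the op, or NULL when it is not handled. */
static const char *
reduce_op_name(enum reduce_op op)
{
   /* Initialized up front so the default path never reads an
    * indeterminate value. */
   const char *opname = NULL;

   switch (op) {
   case REDUCE_OP_ADD:
      opname = "add";
      break;
   case REDUCE_OP_MUL:
      opname = "mul";
      break;
   default:
      break;
   }

   return opname;
}
```

Without the initializer, the default case would leave opname unset, which is what the compiler reports.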
Reviewed-by: @LingMan
Fixes: 12bceb228a ("gallivm: let reduce ops use llvm intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39418>
It turns out that it was intended to round down when dividing the
framebuffer size by FDM size and all other implementations of
VK_EXT_fragment_density_map did that. We followed the spec, which
doesn't say to round (which is equivalent to rounding up), but the spec
will be updated to reflect the intended behavior.
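To illustrate the difference between the two behaviors, here is a small sketch. The helper names and sample sizes are hypothetical and not taken from the driver:

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Round down: the intended behavior. */
static inline uint32_t
div_round_down(uint32_t fb_size, uint32_t fdm_size)
{
   return fb_size / fdm_size;
}

/* Round up: what you get if the quotient is treated as unrounded,
 * because the fractional remainder still has to be covered. */
static inline uint32_t
div_round_up(uint32_t fb_size, uint32_t fdm_size)
{
   return (fb_size + fdm_size - 1) / fdm_size;
}
```

The two only differ when the framebuffer size is not a multiple of the FDM size, e.g. 1000 / 32 gives 31 when rounding down and 32 when rounding up.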
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39434>
There were a few missing things here:
- The max_waves can be odd even when wavesize_granularity = 2, unlike
with registers, so we should not multiply by wavesize_granularity.
This means we have to double branchstack_size to compensate.
- The actual limit was half what it should be on a6xx-a7xx, because when
I originally calculated this, computerator was using the wrong
branchstack units. We need to double branchstack_size again.
- We should limit the branchstack based on max_branchstack and align it
to 2 on a5xx+, as we do when programming the HW.
- On a8xx the limit is doubled compared to a7xx to compensate for losing
wave128.
Fix all of these.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39468>
The branchstack starts one bit lower, and we have to round to the next
even value instead of dividing by 2. This matches the actual HW
definition and will make the next commits simpler.
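A rough sketch of the two encodings, assuming a made-up field position and assuming the old code did a plain divide by 2 (neither detail is taken from the register definitions):

```c
#include <stdint.h>

/* Hypothetical bit position of the field in the new definition. */
#define BRANCHSTACK_SHIFT 4u

/* Round up to the next even value. */
static inline uint32_t
align_to_even(uint32_t v)
{
   return (v + 1u) & ~1u;
}

/* Old style (assumed): field starts one bit higher, value is halved. */
static inline uint32_t
pack_branchstack_old(uint32_t branchstack)
{
   return (branchstack / 2u) << (BRANCHSTACK_SHIFT + 1u);
}

/* New style: field starts one bit lower, value is made even. */
static inline uint32_t
pack_branchstack_new(uint32_t branchstack)
{
   return align_to_even(branchstack) << BRANCHSTACK_SHIFT;
}
```

For even inputs both forms produce the same bits; the new form just makes the rounding explicit.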
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39468>
This moves the dispatching for each winsys function out to arch-specific
variants of the pvr_winsys_ops structure instead. This gets rid of some
needless complexity, and should make the code easier to maintain in the
long run.
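The general shape of this approach can be sketched as follows. All names here are hypothetical stand-ins, not the real pvr_winsys_ops layout:

```c
#include <stddef.h>

struct winsys;

/* Hypothetical ops table; one instance exists per architecture. */
struct winsys_ops {
   int (*ctx_create)(struct winsys *ws);
};

struct winsys {
   const struct winsys_ops *ops;
};

/* Placeholder arch-specific implementations. */
static int arch_a_ctx_create(struct winsys *ws) { (void)ws; return 1; }
static int arch_b_ctx_create(struct winsys *ws) { (void)ws; return 2; }

static const struct winsys_ops arch_a_ops = { .ctx_create = arch_a_ctx_create };
static const struct winsys_ops arch_b_ops = { .ctx_create = arch_b_ctx_create };

enum arch { ARCH_A, ARCH_B };

/* The arch is resolved once here; callers then go through ws->ops
 * without any per-call dispatching. */
static void
winsys_init(struct winsys *ws, enum arch arch)
{
   ws->ops = (arch == ARCH_A) ? &arch_a_ops : &arch_b_ops;
}
```

With this layout each call site is a plain indirect call, and adding an architecture means adding one more table.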
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39348>
All uses of PVR_ARCH_DISPATCH in the powervr winsys were due to needing
to reach the kmd_stream.xml definitions. However, this isn't quite
enough to do this multi-arch; we also need to widen the interface to
pass extra context-switching information for future GPUs.
But doing this with the per-arch infrastructure isn't a huge gain,
because all of this code runs during context-init. So let's walk things
back a bit, and drop the dispatching here.
This does mean we need to stop using kmd_stream.xml definitions, but I
don't think this is a huge loss: we're mostly open-coding the firmware
interface here anyway.
Unfortunately, the same is not the case in the pvrsrvkm winsys, because
the kernel driver used there doesn't abstract away the same HW details,
so we'll need to set up a bunch of things based on HW definitions. So
let's take a different approach there.
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39348>
Disable RHWO by default for singlesample draws and for MSAA
draws if a drirc key is set (avoid perf hit if not needed).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>
This commit changes the BVH layout a little so that we can load the BVH
offset as a constant rather than reading it from memory.
We have to force the instance leaves pointer to be at the end, since it
gets used in the copy.comp shader.
Totals:
Instrs: 54798 -> 54728 (-0.13%)
Send messages: 3854 -> 3847 (-0.18%)
Cycle count: 1915106 -> 1913954 (-0.06%); split: -0.07%, +0.01%
Non SSA regs after NIR: 18594 -> 18575 (-0.10%)
Totals from 7 (7.37% of 95) affected shaders:
Instrs: 5532 -> 5462 (-1.27%)
Send messages: 367 -> 360 (-1.91%)
Cycle count: 132592 -> 131440 (-0.87%); split: -1.01%, +0.14%
Non SSA regs after NIR: 1989 -> 1970 (-0.96%)
PERCENTAGE DELTAS    Shaders   Instrs    Send messages   Cycle count   Non SSA regs after NIR
q2rtx-rt-pipeline    95        -0.13%    -0.18%          -0.06%        -0.10%
---------------------------------------------------------------------------------------------
All affected         7         -1.27%    -1.91%          -0.87%        -0.96%
---------------------------------------------------------------------------------------------
Total                95        -0.13%    -0.18%          -0.06%        -0.10%
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39106>
Compute shaders are the fastest for all copies and some clears.
Note that this is a very different compute shader than the one in RADV.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>
now that transient images are a more complete mechanism, this should
in theory be okay and also accounts for the case where
a framebuffer contains mixed msrtt textures and plain multisampled textures
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39464>
Previously, we were allowing up to 1024 entries to be accumulated and
pushed. The Nouveau kernel side always reports 510 entries, but we are
going to increase this at some point.
This makes it so that we now dynamically allocate
nvkmd_nouveau_exec_ctx::req_push.
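A minimal sketch of the idea, assuming simplified stand-in structs and helper names (these are hypothetical and do not match the real nvkmd or kernel uAPI types):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical push entry; stands in for the kernel's push struct. */
struct push_entry {
   uint64_t addr;
   uint32_t len;
   uint32_t flags;
};

/* Hypothetical exec context holding a dynamically sized push array. */
struct exec_ctx {
   struct push_entry *req_push;
   uint32_t max_push;   /* limit reported by the kernel */
   uint32_t push_count; /* entries accumulated so far */
};

/* Size the array from the kernel-reported limit instead of a constant. */
static int
exec_ctx_init(struct exec_ctx *ctx, uint32_t kernel_max_push)
{
   if (kernel_max_push == 0)
      return -1;

   ctx->req_push = calloc(kernel_max_push, sizeof(*ctx->req_push));
   if (ctx->req_push == NULL)
      return -1;

   ctx->max_push = kernel_max_push;
   ctx->push_count = 0;
   return 0;
}

/* True when the caller has to flush before adding another entry. */
static int
exec_ctx_is_full(const struct exec_ctx *ctx)
{
   return ctx->push_count >= ctx->max_push;
}

static void
exec_ctx_finish(struct exec_ctx *ctx)
{
   free(ctx->req_push);
   ctx->req_push = NULL;
}
```

Sizing from the reported limit means nothing needs to change here once the kernel starts reporting a larger value.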
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39239>
Add a unique_id field to struct tu_bo, to ensure that every bo has a
unique identifier.
On KGSL, importing the same dma-buf multiple times produces distinct GEM
handles, which violates the VK_EXT_device_memory_report requirement that
the object ID uniquely identifies the underlying memory.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39406>
```
Traceback (most recent call last):
File "bin/pick-ui.py", line 31, in <module>
loop = urwid.MainLoop(u.render(), PALETTE, event_loop=evl, handle_mouse=False)
~~~~~~~~^^
File "bin/pick/ui.py", line 196, in render
asyncio.ensure_future(self.update())
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
File "/usr/lib64/python3.14/asyncio/tasks.py", line 730, in ensure_future
loop = events.get_event_loop()
File "/usr/lib64/python3.14/asyncio/events.py", line 715, in get_event_loop
raise RuntimeError('There is no current event loop in thread %r.'
% threading.current_thread().name)
RuntimeError: There is no current event loop in thread 'MainThread'.
```
Of the 3 dependencies, only urwid actually needs to be updated, but
while at it let's pick the latest of each.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39452>