loader_get_pci_driver calls os_read_file on linux to get the pci id, and
os_read_file uses open instead of fopen.
This allows loader_get_pci_driver to work rather than falling back to
loader_get_kernel_driver_name.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22951>
We cannot immidiately free VMA range when BO is freed, we have to
wait until kernel stops considered BO as busy and frees its internal
VMA range. Otherwise userspace and kernel VMA will get desynchronized.
To fix this and re-enable replaying of BDA we place BO's information
into a queue. The queue is drained:
- On BO allocation;
- When we cannot allocate an iova passed from the client.
For more information about this see:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/7106
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18254>
For zombie vma tracking, we'll need access to the queue at bo deletion
time. This simplest way to make that work is just move queue deletion
to late in device teardown.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18254>
Since last commit drm fd is being created on per logical device
granularity, which means each logical device has its own
address space. So VMA heap could be moved to logical device.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18254>
The main reason is to simplify BO managment when
bufferDeviceAddressCaptureReplay would be enabled.
Having to track some BO information in physical device and some
info in logical device gets challenging when BOs are shared
between logical devices.
Other benefits:
- Isolation from hangs in other logical devices;
- Each logical device limited only by its own address space size.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18254>
It is very common in games to read just the .x channel of a vec4 shadow
result (since GL defaults to either LUMINANCE or RED depth mode depending
on context). So, we can avoid shader recompiles to handle the other
components, in that case.
Fixes some recompiles in CS:GO.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22912>
Reduces fp variant recompiles on google's CS:GO trace on zink+anv from 115
to 31.
Fixes: 0843d4cbc3 ("nir: switch to a normal sampler for ARB program with not depth textures")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22928>
If we have multiple LRZ clears, emit them all at once. This also avoids
redundant LRZ clears if app does multiple clears in sequence.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22895>
Just the scaffolding for now, nothing actually creates multiple sub-
passes yet. For now, only planning to use this for a6xx, as other
gens are doing clears on 3d.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22895>
This will give better visibility in perfetto, and prepares for the next
commit where we could have per-subpass clears.
While we are at it, start adopting vulkan terms for tile load/store. No
need to be pointlessly different.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22895>
batch->fast_cleared will be per-subpass. But we can use the cleared
bitmask instead in the few places where we just need to know if there
was a clear in any subpass. For the conditional-ib it is even
preferable since we know a clear touched the contents of the tile so
we know what the result of the conditional would be.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22895>
Once we introduce subpass, it won't be just a single IB. But per
subpass clears + IB. So interoduce a sysmem counterpart for
emit_tile().
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22895>
This is necessary to make anv work with clients like VA-API which relies
on implicit fencing only. The bahavior matches iris i915_batch_submit.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22937>
This helps by reducing the number of branches with their corresponding
delay slots, at the expense of additional register pressure. It also helps
a lot with SFU stalls, probably because removing control-flow blocks
gives us more QPU scheduling flexibility to hide them.
Shader-db results below correspond to the "closed shaders" set, since the
full set is very dominated by the massive impact this change has on Skia's
shaders (for the better), so this is probably more representative of real
impact:
total instructions in shared programs: 11887255 -> 11854898 (-0.27%)
instructions in affected programs: 538170 -> 505813 (-6.01%)
helped: 1653
HURT: 43
Instructions are helped.
total threads in shared programs: 385924 -> 385872 (-0.01%)
threads in affected programs: 236 -> 184 (-22.03%)
helped: 22
HURT: 48
Inconclusive result (%-change mean confidence interval includes 0).
total uniforms in shared programs: 3552808 -> 3547894 (-0.14%)
uniforms in affected programs: 157486 -> 152572 (-3.12%)
helped: 1673
HURT: 35
Uniforms are helped.
total max-temps in shared programs: 2062403 -> 2064720 (0.11%)
max-temps in affected programs: 18209 -> 20526 (12.72%)
helped: 168
HURT: 369
Max-temps are HURT.
total spills in shared programs: 1937 -> 1994 (2.94%)
spills in affected programs: 79 -> 136 (72.15%)
helped: 0
HURT: 1
total fills in shared programs: 2652 -> 2717 (2.45%)
fills in affected programs: 115 -> 180 (56.52%)
helped: 0
HURT: 1
total sfu-stalls in shared programs: 19349 -> 18010 (-6.92%)
sfu-stalls in affected programs: 2321 -> 982 (-57.69%)
helped: 674
HURT: 74
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 11906604 -> 11872908 (-0.28%)
inst-and-stalls in affected programs: 541339 -> 507643 (-6.22%)
helped: 1656
HURT: 43
Inst-and-stalls are helped.
total nops in shared programs: 245740 -> 238085 (-3.12%)
nops in affected programs: 19282 -> 11627 (-39.70%)
helped: 1335
HURT: 76
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22922>
Similar to SpvDecorationRestrict, looks like it's also incorrectly
generated by glslang.
This will allow RADV/CI to leave MESA_SPIRV_LOG_LEVEL as default
(ie. only warnings).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22917>
Resolves ambient occlusion rendering in Replicant
Resolves grass and ocean animations in Automata, and maybe more.
Both of these games have shaders that expect trig values to work across
large ranges with good precision.
Closes#7656
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22894>
Ensure the render target values are in the proper range.
This fixes `spec@!opengl 3.0@render-integer`.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22733>
Some of the new features require at least V3D 4.2. And actually, 4.2 is
the version used by the Raspberry Pi 4 hardware.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22733>
Instead of hardcoding conditionals to know which hardwared-based version
of a function to call, just wrap them in a macro to use
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22733>