anv_state::offset in the context of anv_state_pool is equal to the offset from
the begining of block_pool + start_offset.
Like it is set in anv_state_pool_alloc_no_vg() in the path that allocs a new
block in anv_block_pool.
As anv_state_pool_return_chunk() expects only the offset from the begining of
anv_block_pool so here subtracting to make the path that grabs a larger chunk of
memory of the pool and split into smaler chunks to properly work.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37669>
Only 3 pools sets a value different than zero to start_offset so that might be
a issue that was being hidden by luck.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37669>
struct anv_state::offset and struct anv_block_pool::max_size are 64bits so these
parameters should also be 64bit or risk overflow.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37669>
While at it also renaming EMPTY to ANV_FREE_LIST_EMPTY_VAL to be more explicit.
No changes in behavior expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37669>
The HW bug was fixed on Xe2, report so accordingly.
Can test behavior with dEQP-VK.fragment_operations.early_fragment.*_maintenance5
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37730>
Add the common magic numbers in a header file and make the functions
use them.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
When the system is under memory pressure (which can happen, for
example, during CI runs), don't immediately give up the exec ioctl
(which, for Vulkan, will result in the device being declared lost).
Instead, retry a little bit just like we do for i915.ko.
This is a trade-off.
One of the reasons to *not* have unified behavior regarding ENOMEM
between i915.ko and xe.ko is the fact that xe.ko uses vm_bind, so if
the user tried to bind more memory than it is able to, we'll just keep
getting ENOMEM as long as we retry the ioctl. We now have a retry
limit, so we'll eventually return the error.
On the other hand, if the problem is other applications consuming all
the memory, having the retry loop may really help avoid unnecessarily
marking the device as lost, since one of our retries may eventually
succeed.
I believe the tradeoff of "we'll now eventually succeed in some cases
where it's possible to succeed, at the expense of retrying for a few
seconds until giving up in cases where we would never be able to
succeed" is an improvement.
If xe.ko ever gives us a way to differentiate between the two
different reasons for ENOMEM, we'll be able to make things much
better. We can also tune our timeouts if needed.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
By this point, the user may have noticed their game is not drawing any
new frames, so perhaps an error message might help.
This would also, of course, help identifying ENOMEM problems in our CI
runs.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
If nothing has freed memory until that point, return the error, which
may make the upper layers report the device as lost. It could be that
the system is under very very heavy swapping and that waiting a little
more would make it work, but let's try 16s for now.
v2: Bring down the timeout from ~60s to ~16s (José).
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
If the ioctl is returning ENOMEM, incessantly retrying does not seem
to be the best way to proceed. After the second retry, sleep 0.1ms,
then more each time, giving the CPU some time to run the other threads
and processes, in the hope that whatever is eating all the memory
might eventually return it.
If the problem is the current thread, then busy looping won't help
either, so here we at least save some power before the user kills the
app.
v2: Adjust the control flow and the sleep time.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
Unify the common code for i915.ko execbuf submission between Iris and
Anv. I plan to add more code to this function in the next patches.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
Don't check something that will never be there, just assert() it.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
The i915.ko backend calls vk_queue_set_lost(), let's unify behavior.
v2: Adjust patch due to series reordering.
Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
Move the handling to inside anv_gem_execbuffer(), and also rework
things in a way where we can tell from which caller the error is
coming from.
v2: properly return VkResult (José).
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
Move the handling to inside xe_exec_ioctl() and also rework things in
a way where we can tell from which caller the error is coming from.
v2:
- properly return VkResult (José).
- adjust the patch due to reordering the series.
Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
For now, this just simplifies checking for devinfo->no_hw, but we're
planning to add more code that everybody calling the XE_EXEC ioctl
will run.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
Every single caller does the same check.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
By dumping the contents of a HiZ buffer before and after fast-clearing,
I've observed that a zeroed HiZ block corresponds to the CLEAR state
until gfx12. The fast-clearing application was piglit's bin/hiz. I ran
this test on a couple bare metal platforms (ICL and BDW) and many
simulated ones (SKL, TGL, DG2, and LNL).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36383>
For CCS_E on gfx12+, this will cause us to perform full resolves when
transitioning from undefined to a layout which does not support
compression. We don't currently perform such transitions because
compression is always enabled.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36383>
isl_aux_get_initial_state() will soon be used for non-zeroed CCS on
gfx9+. Update the function to avoid hitting an unreachable().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36383>
The bindless sampler heap was introduced in Gfx11 which ELK doesn't
support.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37692>
It's always set to a fixed value and not used in many places. Use the
value directly where it's needed.
Suggested-by: Lucas Fryzek <lfryzek@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37648>
With EXT_shader_object, it became possible to compile shaders
independently and then use them together later, so we cannot rely on the
lack of task shader data to decide that no task shader will be used. The
flag VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT exists for that purpose,
but it doesn't really make any difference for us. Always assume that if
the mesh shader is reading the task payload, it's going to be used with
one, as otherwise the application is doing it wrong.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13983
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37648>
Sometimes the compute shader workgroup size requires a larger SIMD width
than the minimum in order to fit in the available threads. In that case
we'll skip the SIMD8 shader, and need to try SIMD16 regardless of how
the register pressure estimate looks.
Fixes: 3af4e63061 ("brw: Skip compilation of larger SIMDs when pressure is too high")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37649>
I didn't bother switching either iris or elk/hasvk but one could.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37517>
It remove a duplication and also it will be used in a future patch
from other file.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37670>
We will reuse LNL and BMG modifiers on newer platforms until
new modifiers show up.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35776>
This allows us to skip the entire backend compilation process for
large SIMD widths when register pressure is high enough that we'd
likely decide to prefer a smaller one in the end anyway. The hope
is to make the same decisions as before, but with less CPU overhead.
We are making mostly the same decisions as before:
| API / Platform | Total Shaders | Changed | % Identical
--------------------------------------------------
| VK / Arc A770 | 905,525 | 1,157 | 99.872% |
| VK / Arc B580 | 788,127 | 53 | 99.993% |
| VK / Panther | 786,333 | 13 | 99.998% |
| GL / Arc A770 | 308,618 | 269 | 99.913% |
| GL / Arc B580 | 264,066 | 13 | 99.995% |
| GL / Panther | 273,212 | 0 | 100.000% |
Improves compile times on my i7-12700K:
| Game | Arc B580 | Arc A770 |
---------------------------------------------------
| Assassins Creed: Odyssey | -13.47% | -10.98% |
| Borderlands 3 (DX12) | -10.05% | -11.31% |
| Dark Souls 3 | -21.06% | -21.08% |
| Oblivion Remastered | -11.10% | -9.82% |
| Phasmophobia | -32.73% | -31.00% |
| Red Dead Redemption 2 | -20.10% | -14.38% |
| Total War: Warhammer III | -10.11% | -14.44% |
| Wolfenstein Youngblood | -15.91% | -13.47% |
| Shadow of the Tomb Raider | -30.23% | -25.86% |
It seems to have nearly no effect on compile times on Xe3 unfortunately,
as only 1,014 shaders in fossil-db even fail SIMD32 compilation in the
first place, and we want to let most of the "might succeed" cases
through to the backend for throughput analysis.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>