Currently SAMPLER_CONFIG0 is always emitted to either update the active
configuration or disable the sampler. With NTE this always emits 32 state
dwords, while there are a lot of cases that only use a small number of
samplers and never change the other samplers from their disabled state.
Track the active samplers from the last emit, so we can skip the state
emission when the sampler is already disabled. Only emit the full state
after a context flush where we don't know the previous sampler state of
the GPU.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23579>
iterating all the stages like this ends up unnecessarily calling
through to geometry stage binds when no shader was bound and no shader
is being bound by the power of optimization, so instead only do the unbind
part for the stages that are being unbound
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23600>
It is now set by all relevant drivers and not checked anywhere.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
True for all backends supporting barriers. This lets us collapse lots of code,
since scoped_barriers are based on the SPIR-V definition.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
Both radeonsi and radv use scoped barriers, so this should not be possible to
hit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
In addition to bringing us one backend closer to the scoped-only future, this
improves the generated code in cases like:
memoryBarrierBuffer();
memoryBarrierShared();
controlBarrier();
With scoped_barriers + nir_opt_combine_barriers, we now emit only one MEMBAR
instruction (and a BARRIER) rather than two MEMBARs.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
GFX8+ only compares the bits according to the index type by default
(GFX9 can be changed by VGT_MULTI_PRIM_IB_RESET_EN.MATCH_ALL_BITS),
so we can always leave the programmed value at the maximum.
This reduces context rolls on GFX8+ when primitive restart is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23459>
Primitive reset has a corresponding dirty state which is already
included the used_states so it is not necessary to also check
the primitive reset index here.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23459>
These performance warnings should help to get a better understanding
where we doing non performance optimal things.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23615>
This makes sure we apply WA only when it is required, these issues
do not happen for later MTL steppings.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23596>
This makes sure we apply WA only when it is required, these issues
do not happen for later MTL steppings.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23596>
Specifically we would bail out previously when encountering any
control flow, now we would optimize it even when the second ARL/ARR
is inside a lower level if/else branch.
shader-db
RV530:
total instructions in shared programs: 132020 -> 131924 (-0.07%)
instructions in affected programs: 3374 -> 3278 (-2.85%)
helped: 4
HURT: 0
RV370:
no change (no control flow there)
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23560>
We already do this for ARL, so just generalize the pass.
shader-db
RV530:
total instructions in shared programs: 132235 -> 132020 (-0.16%)
instructions in affected programs: 8492 -> 8277 (-2.53%)
helped: 41
HURT: 1
total temps in shared programs: 16900 -> 16887 (-0.08%)
temps in affected programs: 83 -> 70 (-15.66%)
helped: 13
HURT: 0
RV370:
total instructions in shared programs: 82395 -> 82320 (-0.09%)
instructions in affected programs: 4715 -> 4640 (-1.59%)
helped: 33
HURT: 1
total temps in shared programs: 12316 -> 12305 (-0.09%)
temps in affected programs: 75 -> 64 (-14.67%)
helped: 11
HURT: 0
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23560>
Its particularly important to have the copy-propagate pass run first.
So that when the round is vectorized, we don't have to follow the MOVs
to find out if it leads to ARL or not (we don't vectorize ARR/ARL at the
moment).
No shader-db change.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23560>
Specifically after the first copy propagate run but before the
second one. Removal of ARLs will enable the copy propagate to be more
aggresive, as it is very carefull in such cases.
shader-db
RV530:
total instructions in shared programs: 131861 -> 131503 (-0.27%)
instructions in affected programs: 23949 -> 23591 (-1.49%)
helped: 199
HURT: 15
total temps in shared programs: 16997 -> 16903 (-0.55%)
temps in affected programs: 767 -> 673 (-12.26%)
helped: 69
HURT: 9
RV370:
total instructions in shared programs: 82360 -> 82027 (-0.40%)
instructions in affected programs: 19516 -> 19183 (-1.71%)
helped: 183
HURT: 15
total temps in shared programs: 12370 -> 12262 (-0.87%)
temps in affected programs: 664 -> 556 (-16.27%)
helped: 73
HURT: 0
The hurt programs are due to some constant load being copy propagated
which leads to bad interaction with source conflict resolve pass later.
v2: add missing shader type initialized to the tests. Previously we were
checking for has_omod which also practically means we have a fragment
shader, however its less readable.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23560>
Having a high value for the first activity timeout made sense back in
the days when the machine may not be associated with salad early... but
this isn't the case anymore!
So let's go with a very conservative value of 2 minutes to boot :)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21730>
Makes it more obvious that this function is actually initializing the dri_screen
and not some helper.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23054>
Avoid calling this function on screen creation failure as we will discard its
result right after.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23054>
Avoid inconsistent behavior on screen creation failures which might lead
to double free issues.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23054>
This function is actually used before the use of dri_init_screen_helper so
it is not exactly releasing the memory allocated by the screen helper.
Also clear the base.screen variable after destroy to make this function
reentrant.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23054>
The code to release the device was actually always used after the call
to this function.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23054>
On RDNA2, VRS rates are part of the HTILE buffer but if we disable
HTILE completely for eg. GENERAL, VRS rates aren't read by the hw.
Fix this by disabling HTILE compression which should have the same
effect without VRS.
Fixes recent
dEQP-VK.fragment_shading_rate.renderpass2.monolithic.attachment_rate.misc.*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23209>